How can I contact Akash Khatri?

Email: akash.m.khatri@gmail.com. Phone: +1 (801) 403-3512. LinkedIn: linkedin.com/in/akashkhatri. GitHub: github.com/AkashKhatrii.

Currently SDE @ Amazon - Seattle, WA

Hi, I'm Akash Khatri

Software Development Engineer

Software Development Engineer at Amazon · MS Computer Science, University of Utah (2025). I architect backend services and AWS data pipelines in Java, ship LLM-powered systems on the side, and first-authored “Sort it Like You Mean It” at PVLDB 2025 — the paper behind InsightSort, a semantic analytical engine for data lakes.

MS CS GPA: 4.0/4.0
PVLDB 2025 paper: 1st author
SmartBillAgent volume: $110K+
Students mentored: 180+

Java / Python / Go / TypeScript / React / Node.js / AWS / CDK / Glue / Lambda / Fargate / LLMs / LangChain / OpenAI / Anthropic / Embeddings / HNSW / RAG / PyTorch / TensorFlow / Postgres / MongoDB / Redis / DuckDB / Docker / FastAPI / Flask

EDUCATION

Master's in Computer Science, Aug 2023 - May 2025

University of Utah, GPA: 4.0/4.0

Coursework: Advanced Algorithms, Distributed Systems, Computer Architecture,
Deep Learning for NLP, Manage Data with & for ML, Operating Systems

Bachelor's in Computer Engineering, Aug 2019 - May 2023

University of Mumbai, GPA: 9.64/10

Coursework: Data Structures, Artificial Intelligence, Machine Learning, Advanced DBMS,
Cloud Computing, Object Oriented Programming, Big Data Analytics

LANGUAGES & TOOLS

Languages:

Java, Python, Go, JavaScript, TypeScript, C, C++, SQL

Frameworks & Libraries:

React, Node.js, Express, FastAPI, Flask, Jinja2, LangChain, Tailwind CSS, REST APIs

Cloud & Data Engineering:

AWS (CDK, Lambda, Fargate, Glue, S3), CI/CD pipelines, ETL workflows, data pipelines, Docker

Databases & Vector Search:

PostgreSQL, MySQL, MongoDB, Redis, DuckDB, vector databases (HNSW)

Machine Learning & AI:

PyTorch, TensorFlow, scikit-learn, LLMs, LangChain, OpenAI APIs, Anthropic Claude, NLP, embeddings, RAG

WORK EXPERIENCE

Software Development Engineer, Jun 2025 - Present

Amazon — Seattle, WA

Architect and scale backend services and AWS data pipelines in Java, orchestrating production ETL workflows and linking high-throughput product ordering pipelines to centralized commerce layers serving millions of users daily.
Coordinated the cross-functional release lifecycle of the Amazon Music for Artists application; unblocked multi-platform deployments across Web, Android, and iOS by resolving critical defects, directing QA testing in beta, and executing multi-channel MCMs.
Owned operational resilience for core microservices by structuring multi-package beta pipelines and implementing self-triggering canary stacks with active rollback protection to safeguard production deployment runs.
Executed data-layer migrations for high-volume AWS Glue pipelines handling TBs of historical data; designed validation tooling that eliminated duplicate backfill processing, dropping infrastructure overhead by $50K+ annually.
Led the testing-infrastructure migration of 20+ dependent packages to a runtime-managed framework; authored the technical design doc and formulated AWS CDK constructs to parameterize environment setups, compute scaling (Fargate / Lambda), and artifact generation.
Orchestrated component-by-component rollout across isolated distributed applications by mapping Jira epics, mentoring and unblocking a junior SDE-1, and delivering the entire lifecycle 2 weeks ahead of schedule.

AI/ML Research Assistant, Apr 2024 - May 2025

Kahlert School of Computing, University of Utah

First-authored “Sort it Like You Mean It” (PVLDB 2025) and deployed InsightSort, a semantic analytical engine running on GPT-4o that maps 5–7 context-aware ranking dimensions onto target datasets, enriching tabular records with schemas completely missing from the host table.
Engineered a dual 384-dimensional HNSW vector-index layer over extensive data lakes (isolating column joinability and column semantics) that replaces O(N) brute-force column scans with sub-linear top-5 approximate-nearest-neighbor lookups to isolate valid join paths in milliseconds.
Formulated a structural retrieval criterion that fuses multi-modal utility scores with embedding-based uniqueness markers to filter out redundant relational keys, then synthesizes optimized DuckDB SQL query patches that skip manual data profiling.

Graduate Teaching Assistant — Advanced Algorithms & Full-Stack Systems, Jan 2024 - May 2025

Kahlert School of Computing, University of Utah

Conducted labs and evaluated core systems for Advanced Algorithms and Full-Stack Systems (React, FastAPI, AWS) for 180+ graduate students; provided architectural mentorship on concurrent state software and scaling configurations.
Mentored students on coding standards, testing strategies, algorithmic design, and deployment best practices — aligning assignment quality with industry expectations.

Full Stack Web Developer Intern, Jun 2021 - Jul 2021

Exposys Data Labs

Led development of Nostalgia, a memory-sharing platform built with JavaScript, Node.js, and MongoDB; implemented MVC architecture and optimized database queries to enhance scalability and performance.
Boosted user engagement by introducing interactive features such as commenting and memory-saving.
Developed 9+ RESTful APIs using Node.js, integrating Passport for authentication and Multer for file storage, ensuring secure data handling and efficient content management.

Internet of Things Trainee, Jun 2020 - Jul 2020

Enovate Skill

Created and simulated IoT models using platforms like Tinkercad to test designs virtually.
Designed and built Tinkercad simulations, including fire alarm systems and temperature sensors.

PROJECTS

Featured · First-author research

InsightSort

The system behind “Sort it Like You Mean It” (PVLDB Vol. 18, No. 12, 2025). A GPT-4o-powered semantic analytical engine that maps 5–7 context-aware ranking dimensions onto target datasets and enriches tabular records with schemas missing from the host table. A dual 384-dim HNSW vector-index layer over data lakes replaces O(N) column scans with sub-linear top-5 ANN lookups, isolating valid join paths in milliseconds and emitting optimized DuckDB SQL patches.

Python
GPT-4o
HNSW
Embeddings
DuckDB
React.js
Flask
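
To make the "O(N) column scan" concrete, here is a minimal sketch of the brute-force baseline that an approximate-nearest-neighbor index such as HNSW is meant to replace. It is not the paper's code: the function names, the column names, and the 3-dimensional toy vectors (standing in for 384-dimensional embeddings) are all assumptions made for illustration.

```python
import heapq
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_columns(query, column_embeddings, k=5):
    """Brute-force O(N) scan: score every column, keep the k most similar."""
    scored = ((cosine(query, vec), name) for name, vec in column_embeddings.items())
    return [name for _, name in heapq.nlargest(k, scored)]


# Hypothetical toy "data lake": column name -> embedding.
lake = {
    "orders.customer_id": [0.9, 0.1, 0.0],
    "customers.id": [0.8, 0.2, 0.1],
    "products.price": [0.0, 0.1, 0.9],
    "reviews.rating": [0.1, 0.9, 0.2],
}
query = [1.0, 0.0, 0.0]

print(top_k_columns(query, lake, k=2))
```

Every query here touches every column. An HNSW index answers the same top-k question approximately by walking a layered proximity graph, so it visits only a small fraction of the vectors.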

SmartBillAgent

Conversational billing & document-automation platform. An asynchronous multi-threaded Flask backend processes Telegram conversational text orders, replacing pen-and-paper billing and scaling multi-store operations to 60+ daily orders and $110K+ in gross transactional volume. A fault-tolerant semantic parser built on Claude maps vague English / Hindi / Sindhi inputs into schema-validated JSON, with bilingual rendering that bypasses worker literacy constraints. An on-the-fly bill compiler turns JSON arrays into paginated print-ready HTML and vector PDF receipts within seconds.

Python
Flask
Anthropic Claude
Jinja2
Docker
Telegram API
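
The "schema-validated JSON to print-ready HTML" step could look roughly like the sketch below. It uses only the Python standard library and is an illustration under assumptions, not SmartBillAgent's actual code: the order schema (`items` with `name`, `qty`, `unit_price`) and both function names are hypothetical.

```python
from html import escape

# Hypothetical schema: each line item needs these fields with these types.
REQUIRED_ITEM_FIELDS = {"name": str, "qty": int, "unit_price": float}


def validate_order(order):
    """Raise ValueError unless the parsed order matches the assumed schema."""
    items = order.get("items")
    if not isinstance(items, list) or not items:
        raise ValueError("order must contain a non-empty 'items' list")
    for i, item in enumerate(items):
        if not isinstance(item, dict):
            raise ValueError(f"item {i} must be an object")
        for field, expected in REQUIRED_ITEM_FIELDS.items():
            if not isinstance(item.get(field), expected):
                raise ValueError(f"item {i}: '{field}' must be {expected.__name__}")
    return order


def render_bill(order):
    """Turn a validated order into a simple HTML table with a total row."""
    rows = []
    total = 0.0
    for item in order["items"]:
        line_total = item["qty"] * item["unit_price"]
        total += line_total
        # Escape free-text fields so user input cannot inject markup.
        rows.append(
            f"<tr><td>{escape(item['name'])}</td>"
            f"<td>{item['qty']}</td><td>{line_total:.2f}</td></tr>"
        )
    footer = f"<tr><td colspan='2'>Total</td><td>{total:.2f}</td></tr>"
    return "<table>" + "".join(rows) + footer + "</table>"


order = {
    "items": [
        {"name": "Rice <5kg>", "qty": 2, "unit_price": 10.0},
        {"name": "Oil", "qty": 1, "unit_price": 5.5},
    ]
}
bill_html = render_bill(validate_order(order))
print(bill_html)
```

Validating before rendering means a malformed LLM response fails loudly at the boundary instead of producing a wrong bill.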

CollabHub

A full-stack MERN platform that connects students and developers by shared technologies for mentorship and project collaboration. Real-time Firebase chat, profile management, project discovery. Frontend on Vercel, backend on Railway.

React.js
Node.js
MongoDB
Firebase
Vercel

CNNs for Text Classification

Adapted CNNs for sentence-level NLP classification on the SST dataset. Improved F1 from 82.5 to 84 with GloVe embeddings and from 80 to 84.4 with fastText embeddings (gains of 1.5 and 4.4 points) through fine-tuning and embedding experiments.

PyTorch
NLP
GloVe
fastText

UrbanAid

Marketplace connecting local service providers (plumbers, electricians) with customers. Built with Node.js, EJS, and MongoDB; includes service selection, account creation, and appointment scheduling.

Node.js
EJS
MongoDB
Express

Raft Distributed Consensus Engine

A production-spec Raft consensus module in Go featuring automated leader election, heartbeats, continuous log replication, and persistent state machines across isolated nodes. Validated through a testing harness that mimics dynamic network partitions and cascading node drops across 10+ simulation clusters, asserting 100% state-data correctness under intense split-brain scenarios.

Go
Distributed Systems
Concurrency
RPC
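
The core of Raft leader election is the rule a node applies when it receives a vote request. The sketch below follows the RequestVote rules from the Raft paper. It is written in Python for brevity (the project itself is in Go), and the class and function names are hypothetical rather than taken from the project.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeState:
    current_term: int = 0
    voted_for: Optional[str] = None
    last_log_index: int = 0
    last_log_term: int = 0


def handle_request_vote(state, candidate_id, term, last_log_index, last_log_term):
    """Return (current_term, vote_granted) for an incoming vote request."""
    # Reject candidates from a stale term.
    if term < state.current_term:
        return state.current_term, False
    # A newer term resets any vote cast in the older term.
    if term > state.current_term:
        state.current_term = term
        state.voted_for = None
    # The candidate's log must be at least as up to date as ours:
    # higher last term wins, and equal terms fall back to log length.
    log_ok = (last_log_term, last_log_index) >= (
        state.last_log_term,
        state.last_log_index,
    )
    # At most one vote per term, though repeating the same vote is allowed.
    if log_ok and state.voted_for in (None, candidate_id):
        state.voted_for = candidate_id
        return state.current_term, True
    return state.current_term, False


node = NodeState(current_term=2, last_log_index=5, last_log_term=2)
print(handle_request_vote(node, "B", term=1, last_log_index=9, last_log_term=9))
print(handle_request_vote(node, "C", term=3, last_log_index=5, last_log_term=2))
```

The one-vote-per-term rule, combined with majority quorums, is what prevents two leaders from being elected in the same term during a network partition.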

Road Damage Detection & Classification

Real-time road-damage detection (cracks, potholes) using Faster R-CNN, YOLOv5, and SSD. Curated a multi-country dataset spanning four lighting/environment regimes for robustness; deployed via a web upload interface.

PyTorch
YOLOv5
Faster R-CNN
Computer Vision

Paraclone

Chrome extension that uses Chrome's Storage API to save and organize content during browsing — no copy/paste round-trips to other apps.

Chrome Extension
JavaScript
Storage API

Teeter

A responsive chat app built with React.js and deployed on Netlify, offering real-time messaging and a clean UI.

React.js
Socket.IO
Netlify

Random Quote Generator

A small React.js app that fetches and displays random quotes from a third-party API.

Onestop

Full-stack e-commerce app for clothing retail with a shopping cart and Stripe payments.

PUBLICATIONS

Sort it Like You Mean It: Discovering Semantically Interesting Attribute Augmentations to Sort Tables (PVLDB 2025)

Akash Khatri, et al. Proceedings of the VLDB Endowment (PVLDB), Vol. 18, No. 12, 2025.

Introduces InsightSort, a semantic analytical engine running on GPT-4o that maps 5–7 context-aware ranking dimensions onto target datasets and enriches tabular records with schemas missing from the host table. Engineers a dual 384-dimensional HNSW vector-index layer over data lakes (isolating column joinability and column semantics) to replace O(N) brute-force scans with sub-linear top-5 ANN lookups for join-path discovery in milliseconds — and synthesizes optimized DuckDB SQL query patches that skip manual data profiling.

First author
VLDB Endowment
GPT-4o
HNSW
Vector search
DuckDB
Data lakes

GITHUB

Public repositories and profile data loaded live from the GitHub API — explore code, stars, and recent pushes alongside the featured projects above.

PARTICIPATIONS

Participated in 4 intercollege hackathons and placed in the top 10 in two of them.
Cleared level 1 of the International Maths Olympiad.
State-level cricket player (invited team).
Runner-up in a Blind Code Competition (coding with the monitor off) held at college.
Volunteered with an NGO, Simran Seva Pratishthan, for 8 months (Nov 2021 to Jun 2022), raising funds for the education of children in need and providing food and shelter to Adivasi (tribal) communities.

CERTIFICATIONS

Oracle Cloud Infrastructure 2023 Foundations Associate
30 Days of Google Cloud Platform
HackerRank Python (Basic and Intermediate)
HackerRank C++ (Gold)

FAQ

Quick answers for recruiters, hiring managers, and AI assistants.

Who is Akash Khatri?

Akash Khatri is a Software Development Engineer at Amazon, based in Seattle, WA. He holds an MS in Computer Science from the University of Utah (2025) with a 4.0 GPA, and a Bachelor's in Computer Engineering from the University of Mumbai (2023) with a 9.64/10 GPA. He is the first author of “Sort it Like You Mean It” at PVLDB 2025.

What does Akash work on?

At Amazon, Akash architects backend services and AWS data pipelines in Java — orchestrating production ETL workflows linking high-throughput product ordering pipelines to centralized commerce layers serving millions of users daily.

Outside of work he ships LLM-powered systems — InsightSort (PVLDB 2025, GPT-4o + HNSW), SmartBillAgent (Claude-powered billing platform processing $110K+ in transactions), and a Raft distributed-consensus engine in Go.

What is Akash's technical stack?

Languages: Java, Python, Go, JavaScript, TypeScript, C, SQL.
Frameworks & libraries: React, Node.js, Express, FastAPI, Flask, Jinja2, LangChain.
Cloud & data engineering: AWS (CDK, Lambda, Fargate, Glue, S3), CI/CD pipelines, ETL workflows, data pipelines.
Databases & vector search: PostgreSQL, MySQL, MongoDB, Redis, DuckDB, vector databases (HNSW).
ML / AI: PyTorch, TensorFlow, scikit-learn, LLMs (GPT-4o, Anthropic Claude), embeddings, NLP, RAG.

Has Akash published academic research?

Yes. Akash is the first author of “Sort it Like You Mean It: Discovering Semantically Interesting Attribute Augmentations to Sort Tables”, published in the Proceedings of the VLDB Endowment (PVLDB), Vol. 18, No. 12, 2025. See the Publications section for details.

How do I contact Akash?

Email akash.m.khatri@gmail.com, call +1 (801) 403-3512, or message him on LinkedIn. Code lives at github.com/AkashKhatrii.
