EDUCATION

Master's in Computer Science, Aug 2023 - May 2025

University of Utah, GPA: 4.0/4.0

Coursework: Advanced Algorithms, Distributed Systems, Computer Architecture,
Deep Learning for NLP, Manage Data with & for ML, Operating Systems

Bachelor's in Computer Engineering, Aug 2019 - May 2023

University of Mumbai, GPA: 9.64/10

Coursework: Data Structures, Artificial Intelligence, Machine Learning, Advanced DBMS,
Cloud Computing, Object Oriented Programming, Big Data Analytics

LANGUAGES & TOOLS

Languages:

Java, Python, Go, JavaScript, TypeScript, C, C++, SQL

Frameworks & Libraries:

React, Node.js, Express, FastAPI, Flask, Jinja2, LangChain, Tailwind CSS, REST APIs

Cloud & Data Engineering:

AWS (CDK, Lambda, Fargate, Glue, S3), CI/CD pipelines, ETL workflows, data pipelines, Docker

Databases & Vector Search:

PostgreSQL, MySQL, MongoDB, Redis, DuckDB, vector databases (HNSW)

Machine Learning & AI:

PyTorch, TensorFlow, scikit-learn, LLMs, LangChain, OpenAI APIs, Anthropic Claude, NLP, embeddings, RAG

WORK EXPERIENCE

Software Development Engineer, Jun 2025 - Present

Amazon — Seattle, WA
  • Architect and scale backend services and AWS data pipelines in Java, orchestrating production ETL workflows and linking high-throughput product ordering pipelines to centralized commerce layers serving millions of users daily.
  • Coordinated the cross-functional release lifecycle of the Amazon Music for Artists application; unblocked multi-platform deployments across Web, Android, and iOS by resolving critical defects, directing QA testing in beta, and executing multi-channel MCMs.
  • Owned operational resilience for core microservices by structuring multi-package beta pipelines and implementing self-triggering canary stacks with active rollback protection to safeguard production deployment runs.
  • Executed data-layer migrations for high-volume AWS Glue pipelines handling TBs of historical data; designed validation tooling that eliminated duplicate backfill processing, dropping infrastructure overhead by $50K+ annually.
  • Led the testing-infrastructure migration of 20+ dependent packages to a runtime-managed framework; authored the technical design doc and formulated AWS CDK constructs to parameterize environment setups, compute scaling (Fargate / Lambda), and artifact generation.
  • Orchestrated component-by-component rollout across isolated distributed applications by mapping Jira epics, mentoring and unblocking a junior SDE-1, and delivering the entire lifecycle 2 weeks ahead of schedule.

AI/ML Research Assistant, Apr 2024 - May 2025

Kahlert School of Computing, University of Utah
  • First-authored “Sort it Like You Mean It” (PVLDB 2025) and deployed InsightSort, a semantic analytical engine running on GPT-4o that maps 5–7 context-aware ranking dimensions onto target datasets, enriching tabular records with schemas completely missing from the host table.
  • Engineered a dual 384-dimensional HNSW vector-index layer over extensive data lakes (separate indexes for column joinability and column semantics) that replaces O(N) brute-force column scans with sub-linear top-5 approximate-nearest-neighbor lookups, finding valid join paths in milliseconds.
  • Formulated a structured retrieval criterion that fuses multi-modal utility scores with embedding-based uniqueness markers to filter out redundant relational keys, then synthesizes optimized DuckDB SQL query patches that skip manual data profiling.

Graduate Teaching Assistant — Advanced Algorithms & Full-Stack Systems, Jan 2024 - May 2025

Kahlert School of Computing, University of Utah
  • Conducted labs and evaluated core systems for Advanced Algorithms and Full-Stack Systems (React, FastAPI, AWS) for 180+ graduate students; provided architectural mentorship on concurrent, stateful software and scaling configurations.
  • Mentored students on coding standards, testing strategies, algorithmic design, and deployment best practices — aligning assignment quality with industry expectations.

Full Stack Web Developer Intern, Jun 2021 - Jul 2021

Exposys Data Labs
  • Led development of Nostalgia, a memory-sharing platform built with JavaScript, Node.js, and MongoDB; implemented MVC architecture and optimized database queries to enhance scalability and performance.
  • Boosted user engagement by introducing interactive features such as commenting and memory-saving.
  • Developed 9+ RESTful APIs using Node.js, integrating Passport for authentication and Multer for file storage, ensuring secure data handling and efficient content management.

Internet of Things Trainee, Jun 2020 - Jul 2020

Enovate Skill
  • Created and simulated IoT models using platforms like Tinkercad to test designs virtually.
  • Designed and built several Tinkercad simulations, including fire alarm systems and temperature sensors.

PROJECTS

SmartBillAgent

Conversational billing & document-automation platform. An asynchronous multi-threaded Flask backend processes Telegram conversational text orders, replacing pen-and-paper billing and scaling multi-store operations to 60+ daily orders and $110K+ in gross transactional volume. A fault-tolerant semantic parser built on Claude maps vague English / Hindi / Sindhi inputs into schema-validated JSON, with bilingual rendering that bypasses worker literacy constraints. An on-the-fly bill compiler turns JSON arrays into paginated print-ready HTML and vector PDF receipts within seconds.

  • Python
  • Flask
  • Anthropic Claude
  • Jinja2
  • Docker
  • Telegram API

CollabHub Demo

A full-stack MERN platform that connects students and developers by shared technologies for mentorship and project collaboration. Real-time Firebase chat, profile management, project discovery. Frontend on Vercel, backend on Railway.

  • React.js
  • Node.js
  • MongoDB
  • Firebase
  • Vercel

CNNs for Text Classification

Adapted CNNs for sentence-level NLP classification on the SST dataset. Improved F1 from 82.5 to 84 with GloVe and from 80 to 84.4 with fastText embeddings (gains of 1.5 and 4.4 points) through fine-tuning and embedding experiments.

  • PyTorch
  • NLP
  • GloVe
  • fastText

UrbanAid Demo

Marketplace connecting local service providers (plumbers, electricians) with customers. Built with Node.js, EJS, and MongoDB; includes service selection, account creation, and appointment scheduling.

  • Node.js
  • EJS
  • MongoDB
  • Express

Raft Distributed Consensus Engine

A production-spec Raft consensus module in Go featuring automated leader election, heartbeats, continuous log replication, and persistent state machines across isolated nodes. Validated through a testing harness that mimics dynamic network partitions and cascading node drops across 10+ simulation clusters, asserting 100% state-data correctness under intense split-brain scenarios.

  • Go
  • Distributed Systems
  • Concurrency
  • RPC

Road Damage Detection & Classification

Real-time road-damage detection (cracks, potholes) using Faster R-CNN, YOLOv5, and SSD. Curated a multi-country dataset spanning four lighting/environment regimes for robustness; deployed via a web upload interface.

  • PyTorch
  • YOLOv5
  • Faster R-CNN
  • Computer Vision

Paraclone

Chrome extension that uses Chrome's Storage API to save and organize content during browsing — no copy/paste round-trips to other apps.

  • Chrome Extension
  • JavaScript
  • Storage API

Teeter Demo

A responsive chat app built with React.js and deployed on Netlify, offering real-time messaging and a clean UI.

  • React.js
  • Socket.IO
  • Netlify

Onestop

Full-stack e-commerce app for clothing retail with a shopping cart and Stripe payments.

PUBLICATIONS

Sort it Like You Mean It: Discovering Semantically Interesting Attribute Augmentations to Sort Tables (PVLDB 2025)

Akash Khatri, et al. Proceedings of the VLDB Endowment (PVLDB), Vol. 18, No. 12, 2025.

Introduces InsightSort, a semantic analytical engine running on GPT-4o that maps 5–7 context-aware ranking dimensions onto target datasets and enriches tabular records with schemas missing from the host table. Engineers a dual 384-dimensional HNSW vector-index layer over data lakes (isolating column joinability and column semantics) to replace O(N) brute-force scans with sub-linear top-5 ANN lookups for join-path discovery in milliseconds — and synthesizes optimized DuckDB SQL query patches that skip manual data profiling.

  • First author
  • VLDB Endowment
  • GPT-4o
  • HNSW
  • Vector search
  • DuckDB
  • Data lakes

GITHUB

Public repositories and profile data loaded live from the GitHub API — explore code, stars, and recent pushes alongside the featured projects above.

PARTICIPATIONS

  1. Participated in 4 intercollege hackathons and placed in the top 10 in two of them.
  2. Cleared level 1 of the International Maths Olympiad.
  3. State-level cricket player (invited team).
  4. Runner-up in the Blind Code Competition (coding with the monitor screen off) held at college.
  5. Worked with the NGO Simran Seva Pratishthan for 8 months (Nov 2021 to Jun 2022), raising funds for the education of needy children and providing food and shelter to Adivasi (tribal) people.

CERTIFICATIONS

  1. Oracle Cloud Infrastructure 2023 Foundations Associate
  2. 30 Days of Google Cloud Platform
  3. HackerRank Python (Basic and Intermediate)
  4. HackerRank C++ (Gold)

FAQ

Quick answers for recruiters, hiring managers, and AI assistants.

Who is Akash Khatri?

Akash Khatri is a Software Development Engineer at Amazon, based in Seattle, WA. He holds an MS in Computer Science from the University of Utah (2025) with a 4.0 GPA, and a Bachelor's in Computer Engineering from the University of Mumbai (2023) with a 9.64/10 GPA. He is the first author of “Sort it Like You Mean It” at PVLDB 2025.

What does Akash work on?

At Amazon, Akash architects backend services and AWS data pipelines in Java — orchestrating production ETL workflows linking high-throughput product ordering pipelines to centralized commerce layers serving millions of users daily.

Outside of work he ships LLM-powered systems — InsightSort (PVLDB 2025, GPT-4o + HNSW), SmartBillAgent (Claude-powered billing platform processing $110K+ in transactions), and a Raft distributed-consensus engine in Go.

What is Akash's technical stack?

Languages: Java, Python, Go, JavaScript, TypeScript, C, C++, SQL.
Frameworks & libraries: React, Node.js, Express, FastAPI, Flask, Jinja2, LangChain.
Cloud & data engineering: AWS (CDK, Lambda, Fargate, Glue, S3), CI/CD pipelines, ETL workflows, data pipelines.
Databases & vector search: PostgreSQL, MySQL, MongoDB, Redis, DuckDB, vector databases (HNSW).
ML / AI: PyTorch, TensorFlow, scikit-learn, LLMs (GPT-4o, Anthropic Claude), embeddings, NLP, RAG.

Has Akash published academic research?

Yes. Akash is the first author of “Sort it Like You Mean It: Discovering Semantically Interesting Attribute Augmentations to Sort Tables”, published in the Proceedings of the VLDB Endowment (PVLDB), Vol. 18, No. 12, 2025. See the Publications section for details.

How do I contact Akash?

Email akash.m.khatri@gmail.com, call +1 (801) 403-3512, or message him on LinkedIn. Code lives at github.com/AkashKhatrii.
