2009–2013 Gujarat Technological University, B.S. Computer Science
2014–2016 Illinois Institute of Technology, M.S. Computer Science
2016 eContext.ai, Python Engineer · Internship
2017–2018 eContext.ai, Machine Learning Engineer
2019–2023 eContext.ai, Senior Machine Learning Engineer
2023–2024 84.51°, Senior Data Scientist
2024–now 84.51°, Lead Research Scientist

I’m a Lead Research Scientist with ~10 years of applied ML and NLP experience, currently building production GenAI systems at scale on the Foundation Models team at 84.51° (Kroger’s data science arm). I own the full GenAI stack, spanning LLM fine-tuning and alignment, semantic search and embedding infrastructure, multimodal AI, and synthetic data generation. I take systems from research artifact to shipped product end-to-end. Recent work includes domain-specific LLM fine-tuning, multi-agent orchestration with tool use, and production semantic search.

10+ years: production ML
+5%: conversion lift over keyword baseline
<200 ms: P99 latency at scale
E2E: research-to-production ownership

What I Do

Synthetic Data Generation

Generating domain-specific training data from scratch: structured hierarchical examples, safety-aligned samples, and task-specific corpora designed to minimize class imbalance and label bias. Used to fine-tune small language models on targeted tasks without relying on frontier APIs.

Fine-tuning & Alignment

PEFT/LoRA fine-tuning on domain-specific corpora, quantization (GPTQ, AWQ, GGUF), safety evaluation, red-teaming, and multi-intent SLM development. Hands-on from dataset curation through deployment.

Semantic Search & Embeddings

Bi-encoder and cross-encoder architectures, ANN indexing, hybrid sparse-dense retrieval pipelines, and embedding evaluation frameworks. Experience taking dense retrieval from prototype to tens of millions of queries per month.

Production Serving

vLLM, Triton Inference Server, OpenAI-compatible API deployment, latency optimization, model compression, and controlled A/B rollouts. Optimized GPU inference to hit P99 under 200ms at production scale.

Foundation Models

Pre-training transformer-based models from scratch: custom tokenization, training objective design, distributed training with DeepSpeed/PyTorch FSDP, and data curriculum scheduling. Applied to behavioral sequence modeling and representation learning.

Vision LLMs

Deploying open-source vision-language models on a vLLM serving stack, and fine-tuning vision models as custom domain classifiers. Covers the full arc from multimodal prototype to production-grade inference with latency and throughput constraints.

Tech Stack

LLMs & GenAI
Fine-tuning, PEFT / LoRA, Quantization, RAG, Semantic Search, Embedding Models, vLLM, Triton Inference Server, Vector Databases, Transformers
ML Systems
PyTorch, TensorFlow, DeepSpeed, Distributed Training, Model Serving at Scale, A/B Testing, Pipeline Automation
Infrastructure
Python, SQL, Docker, AWS, Azure, REST / gRPC, Elasticsearch, Redis, Git, CI/CD

Certifications

AWS Certified Machine Learning Specialty
TensorFlow Developer Certificate