5+ years building production AI systems at enterprise scale — GenAI, LLMs, RAG, RLHF, Computer Vision, Voice AI, Cloud Architecture.
I’m a Senior AI/ML Engineer with 5+ years of hands-on experience shipping production AI systems across healthcare, finance, retail, media, and enterprise operations. I’ve worked with companies ranging from 10-person startups to 10,000+ employee enterprises like MARS, solving problems that move business metrics.
What I do best: Take AI from research to production. I’ve fine-tuned LLMs (LLaMA, Mistral) with LoRA/QLoRA, built RLHF pipelines with PPO, architected RAG systems over 2TB+ corpora, deployed real-time voice infrastructure handling 500+ concurrent calls, and shipped fraud detection models that process applications in real time, all on AWS/GCP at scale.
What I’m looking for: Senior AI/ML Engineer, Staff ML Engineer, or Lead AI Engineer roles where I can build and ship production AI systems.
B.S. Computer Science (COMSATS University Islamabad, 2016–2020)
The work I’m most proud of — production systems processing real data, serving real users, driving real business impact.
Real-time concurrent voice processing with low-latency ingestion engines
- Built voice-to-data pipelines handling 500+ simultaneous calls using WebSockets, Apache Kafka, and streaming architectures
- Developed gRPC microservices with C++ modules (CUDA, Eigen), reducing inference latency by 25%
- Designed speech-to-text, sentiment analysis, and sales insights extraction from live audio streams
Ensemble ML + GenAI explainability for financial services
- Engineered 650+ predictive features from raw application data — behavioral anomalies, timing patterns, identity verification signals
- Built ensemble model (XGBoost + Isolation Forest) achieving 50% fraud detection on holdout test sets
- Discovered 3 applicant personas via unsupervised clustering (UMAP + HDBSCAN) — “Digital Ghost” persona has 70% fraud concentration
- Implemented GenAI-powered explainable PDF reports via Amazon Bedrock translating SHAP values into plain English
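
How the two models' outputs are combined is not stated above. One common option, shown here purely as an illustrative sketch, is to rescale the anomaly scores to [0, 1] and take a weighted average with the classifier's fraud probability. The function names, the `weight=0.7` value, and the sample numbers are all hypothetical. The sketch also assumes higher anomaly scores mean "more anomalous"; scikit-learn's `IsolationForest.score_samples` uses the opposite sign, so its output would need to be negated first.

```python
def min_max(values):
    """Rescale a list of numbers to the [0, 1] range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no ranking information, return zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def blend_scores(clf_probs, anomaly_scores, weight=0.7):
    """Weighted average of classifier probability and rescaled anomaly score.

    weight is the share given to the supervised classifier (hypothetical value).
    """
    if len(clf_probs) != len(anomaly_scores):
        raise ValueError("score lists must have the same length")
    scaled = min_max(anomaly_scores)
    return [weight * p + (1 - weight) * a for p, a in zip(clf_probs, scaled)]


# Made-up scores for three applications.
clf_probs = [0.10, 0.80, 0.40]       # supervised fraud probability
anomaly_scores = [0.2, 0.9, 1.6]     # higher = more anomalous (assumed)
blended = blend_scores(clf_probs, anomaly_scores, weight=0.7)
```

A blend like this lets the unsupervised model raise the score of applications the classifier has not seen labelled examples for; the weight would normally be tuned on a validation set.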
Knowledge retrieval across 2TB+ structured and unstructured data
- Architected multi-index retrieval (FAISS + ChromaDB + pgvector) with cross-encoder re-ranking
- Built hallucination detection and citation tracking for grounded LLM responses
- Deployed on AWS SageMaker with auto-scaling — 40% cost reduction vs hosted API models
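
Results from several indexes have to be merged into one candidate list before a re-ranker sees them. The fusion method used in this system is not described here; the sketch below uses reciprocal rank fusion (RRF) as one widely used option. The document IDs, the hit lists, and the `k=60` constant are assumptions for illustration only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical top-3 hits returned by three separate vector indexes.
index_a_hits = ["doc_a", "doc_b", "doc_c"]
index_b_hits = ["doc_b", "doc_a", "doc_d"]
index_c_hits = ["doc_b", "doc_c", "doc_a"]

fused = reciprocal_rank_fusion([index_a_hits, index_b_hits, index_c_hits])
```

RRF only needs ranks, not raw similarity scores, which avoids having to calibrate distances across indexes that use different metrics. The fused list would then be passed to the cross-encoder for final ordering.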
Parameter-efficient fine-tuning and human alignment for production LLMs
- Fine-tuned LLaMA-2 and Mistral with LoRA, QLoRA, and PEFT; served via vLLM with CUDA optimization
- Built full RLHF pipeline: SFT → Reward Modeling → PPO optimization with KL divergence constraints
- Achieved 68% win rate vs SFT baseline and 96% safety compliance
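
A common way to apply a KL constraint in the PPO stage is to subtract a per-token penalty, proportional to the log-probability gap between the policy and a frozen reference model, from the reward. The sketch below shows that reward shaping in isolation. It is not taken from the pipeline above: the function name, `beta=0.1`, and the sample log-probabilities are hypothetical.

```python
def kl_shaped_rewards(reward, policy_logprobs, ref_logprobs, beta=0.1):
    """Per-token rewards for PPO with a KL penalty against a reference model.

    Each token is penalised by beta * (log pi(token) - log pi_ref(token)),
    a per-sample estimate of the KL divergence. The scalar reward-model
    score is added on the final token of the response.
    """
    if len(policy_logprobs) != len(ref_logprobs):
        raise ValueError("log-prob sequences must have the same length")
    if not policy_logprobs:
        return []
    shaped = [-beta * (p - r) for p, r in zip(policy_logprobs, ref_logprobs)]
    shaped[-1] += reward
    return shaped


# Made-up log-probabilities for a three-token response.
policy_logprobs = [-1.0, -0.5, -2.0]   # current policy
ref_logprobs = [-1.0, -1.5, -1.0]      # frozen SFT reference
rewards = kl_shaped_rewards(1.0, policy_logprobs, ref_logprobs, beta=0.1)
```

Tokens where the policy has become more confident than the reference receive a negative reward, which discourages drift away from the SFT model while the reward-model score is being optimized.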
Multi-agent systems executing complex workflows without human intervention
- Built 8+ specialized agents for insurance underwriting, multilingual caregiving (100+ languages), content generation, and admissions automation
- LangChain Agent orchestration connecting LLMs to databases, APIs, and messaging platforms
- Reduced processing time by 50% for student admissions workflows
Object detection and digital avatar generation
- BiiView: Real-time object detection using Meta AI’s Segment Anything Model (SAM) — 90% accuracy across 11M+ images and 1.1B+ masks
- Digital People Platform: Hyper-realistic talking avatars with SadTalker + SpeechT5 TTS — 70% realism improvement, 30% user satisfaction increase
- KYC Platform: Identity verification with OpenCV + AI — 99.9% accuracy, 50% faster document processing
| Role | Company | Period | Highlights |
|---|---|---|---|
| Senior ML/AI Engineer | Verticiti | Mar 2024 – Present | RAG pipelines (2TB+), LLM fine-tuning (LoRA/QLoRA), agentic workflows, C++ inference optimization, SAM object detection at scale. |
| Senior Generative AI Engineer | MARS (10K+ employees) | Oct 2024 – Jan 2026 | Led $1M+ GenAI enterprise transformation. RAG architectures, LLM orchestration, multi-agent frameworks for regulated industries. |
| Cloud Solution Architect | Cloud Kinetics USA | Aug 2024 – Jan 2026 | Designed cloud-native AI solutions on AWS, Azure, GCP for enterprise clients. ETL/ELT, data migration, real-time pipelines. |
| Senior AI Engineer | Reallytics.ai | Oct 2022 – Jan 2026 | Voice AI infra (500+ calls), fraud detection, autonomous agents, RLHF frameworks, cloud architecture on AWS/GCP. |
| Senior ML Engineer | Afiniti | Oct 2022 – Nov 2023 | Production ETL at scale for $1M+ accounts, churn modeling, call routing optimization. |
| AI Product Engineer | Afiniti | Apr 2021 – Oct 2022 | ML pipelines for call-routing, feature engineering on millions of daily records, production monitoring. |
| Python Engineer | MeryCure | May 2020 – Apr 2021 | IoT data pipelines (1000+ devices), anomaly detection, predictive maintenance, Power BI dashboards. |
| Project | Description |
|---|---|
| Sentinel AI: Fraud Detection | Ensemble XGBoost + Isolation Forest with 650+ features and GenAI explainability via Amazon Bedrock. |
| Voice-AI-Platform | Real-time voice processing: 500+ concurrent calls, WebSockets, Kafka, gRPC/C++. |
| BiiView: Object Detection | Meta AI SAM for video object detection, 90% accuracy across 11M+ images. |
| RAG-Enterprise-Search | Enterprise RAG with multi-index fusion, re-ranking, and hallucination detection. |
| LLM-Fine-Tuning-LoRA | Fine-tuning LLaMA/Mistral with LoRA, QLoRA, PEFT; 40% cost reduction vs hosted APIs. |
| RLHF-LLM-Optimization | Full RLHF pipeline: SFT, reward modeling, PPO with KL constraints. |
| Digital People Platform | Talking avatars with SadTalker + SpeechT5 TTS, 70% realism improvement. |
| Agentic-AI-Workflows | Autonomous AI agents for enterprise automation with LangChain orchestration. |
Hands-on explorations, architecture deep-dives, and production-tested techniques — published on Kaggle.
| Notebook | Summary |
|---|---|
| 🤖 Agentic AI: Multi-Agent Orchestration from Scratch | Building a multi-agent system with tool registries, planning loops, and guardrails; framework-agnostic patterns from production. |
| 🔌 LLM Function Calling and Tool Use: Complete Guide | End-to-end function calling: schema design, validation, chaining, error recovery, and production deployment patterns. |
| 🔍 Advanced RAG: Production Retrieval Guide | Multi-query RAG, hybrid search, cross-encoder re-ranking, hallucination detection; beyond basic retrieve-and-generate. |
| 🎯 Prompt Engineering That Actually Works (2026) | Chain-of-thought, few-shot, self-consistency, structured output; real techniques with measured results. |
| 👁️ Multimodal AI: Vision-Language Pipeline | Vision encoders, cross-attention fusion, image captioning, visual QA; building multimodal systems from components. |
| 💳 Fraud Detection: XGBoost + Isolation Forest Ensemble | Ensemble anomaly detection with SHAP explainability, t-SNE visualization, and DBSCAN clustering on imbalanced data. |
| 💬 Sentiment Analysis: NLP Pipeline Comparison | TF-IDF vs BERT vs DistilBERT; benchmarking classical and transformer approaches on real text data. |
| 📚 RAG Pipeline: LangChain + FAISS for Document QA | End-to-end retrieval-augmented generation with chunk strategies, embedding models, and answer grounding. |
| 🧬 LLM Fine-Tuning: LoRA and QLoRA Guide | Parameter-efficient fine-tuning walkthrough: LoRA, QLoRA, PEFT with memory profiling and serving benchmarks. |
| 📈 Time Series: XGBoost Forecasting | Feature engineering for temporal data: lag features, rolling stats, calendar effects, walk-forward validation. |
| 🚢 Titanic: Stacking Ensemble Pipeline | Advanced stacking with cross-validated base learners, meta-learner optimization, and feature engineering. |
👉 View all notebooks on Kaggle →
Technical writeups published as Kaggle Datasets — production insights, benchmarks, and reference architectures.
| Writeup | What’s Inside |
|---|---|
| Agentic AI Tool Schemas: Production Patterns | 50+ tool/function schemas, 8 agent configs, benchmark data from 500 agent executions |
| RAG Evaluation Benchmark 2026 | 1,000 QA pairs with human-annotated relevance scores across 50 retrieval configs |
| LLM Prompt Engineering Templates | 100+ prompt templates with A/B test results from 200 production experiments |
| Fraud Detection: Feature Engineering Guide | 650+ feature catalog, interaction analysis, and 3 fraud persona profiles |
| ML System Design Patterns: Production | 40+ patterns, 25+ anti-patterns, decision frameworks for production ML |
Languages & Frameworks
Generative AI & LLMs
Cloud & Infrastructure
Data Engineering
| Credential | Institution / Provider |
|---|---|
| B.S. Computer Science | COMSATS University Islamabad, 2016–2020 |
| Foundations: Data, Data, Everywhere | |
| PostgreSQL: Advanced Queries | LinkedIn Learning |
| SQL Essential Training | LinkedIn Learning |
Auto-generated articles with AI-crafted images — published daily to AI-Engineering-Notes
LLM Fine-Tuning at Scale with LoRA
💬 Commented on Widen asset format supports in dottxt-ai/outlines (2026-04-06)
⭐ Starred oceanbase/pyseekdb (2026-04-06)
⭐ Starred AlayaDB-AI/AlayaLite (2026-04-06)
⭐ Starred Pometry/Raphtory (2026-04-06)
⭐ Starred unum-cloud/USearch (2026-04-06)
💬 Commented on Token Safety Tool for DeFi Multi-Agent Workflows in FoundationAgents/MetaGPT (2026-04-06)
📝 Opened issue [Feature] Add real-time streaming evaluation for production in vibrantlabsai/ragas (2026-04-06)
💬 Commented on [Feature Request] Saving GPU-Translated States for Fast CPU- in facebookresearch/faiss (2026-04-06)
Topics discovered daily by a multi-model AI research engine (GPT-4.1, Grok-3, DeepSeek R1, Llama-4)
📌 Prompt Version Control & A/B Testing Registry (Python) (2026-04-06)
📌 Webhook Event Processor for ML Model Alerts (Python) (2026-04-06)
🤖 Profile auto-updated on 2026-04-06 10:49 UTC
Currently open to Senior AI/ML Engineer, Staff ML Engineer, or Lead AI Engineer roles.
If you’re building production AI systems and need someone who ships — let’s talk.