I design, build, and deploy production-grade machine learning and AI systems that solve real business problems at scale.
My work spans classical ML, LLM-based systems, data pipelines, and MLOps, with a strong focus on robustness, performance, and real-world impact.
🎯 Currently focused on:
- LLM-powered applications (RAG, agents, evaluation frameworks)
- Scalable ML pipelines & feature engineering
- AI system design for data-intensive products
📍 United States | Open to Remote & Hybrid roles
- Applied Machine Learning: predictive modeling, NLP, feature engineering, experimentation
- Generative AI & LLMs: RAG pipelines, agentic systems, prompt engineering, evaluation
- Production ML Systems: model deployment, monitoring, CI/CD, data pipelines
- Data Engineering for ML: SQL-first design, ETL, analytics-ready datasets
I care deeply about clean abstractions, reproducibility, and measurable outcomes.
These repositories reflect how I think about real-world AI systems, not toy demos.
Tech: Python, FastAPI, OpenAI / OSS LLMs, Vector DB, Docker
- End-to-end Retrieval-Augmented Generation pipeline
- Document ingestion, chunking, embeddings, retrieval, and evaluation
- Production-ready API with monitoring hooks
🔗 Repo: llm-rag-platform
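As a rough illustration of the stages listed above (not the code in llm-rag-platform), here is a minimal, dependency-free sketch. The function names (`chunk`, `embed`, `build_index`, `retrieve`, `build_prompt`, `hit_rate`) are hypothetical, and the bag-of-words `embed` is an assumed stand-in for a real embedding model and vector database.

```python
import math
import re
from collections import Counter


def chunk(text, size=12, overlap=4):
    """Split text into overlapping word windows (illustrative chunking rule)."""
    words = text.split()
    step = size - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, stop, step)]


def embed(text):
    """Toy embedding: lowercase bag-of-words counts (assumption, not a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents):
    """Ingestion: chunk every document and store (chunk, vector) pairs."""
    return [(c, embed(c)) for doc in documents for c in chunk(doc)]


def retrieve(index, query, k=2):
    """Retrieval: return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(query, contexts):
    """Assemble the prompt that would be sent to an LLM (the call itself is omitted)."""
    joined = "\n".join(f"- {c}" for c in contexts)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


def hit_rate(index, labeled_queries, k=2):
    """Evaluation: share of queries whose expected keyword appears in the top-k chunks."""
    hits = sum(
        any(expected.lower() in c.lower() for c in retrieve(index, q, k))
        for q, expected in labeled_queries
    )
    return hits / len(labeled_queries)


docs = [
    "Docker packages the API into a container image for deployment.",
    "Airflow schedules the nightly feature engineering jobs.",
]
index = build_index(docs)
top = retrieve(index, "How is the API deployed with Docker?", k=1)
prompt = build_prompt("How is the API deployed with Docker?", top)
```

Keeping `hit_rate` next to `retrieve` is a deliberate choice in this sketch: retrieval quality can be checked on labeled queries before any LLM call is involved.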
Tech: Python, SQL, Pandas, Scikit-learn, Airflow-style orchestration
- Data ingestion → feature engineering → model training → evaluation
- Emphasis on reproducibility and modular design
- Clear separation of data, training, and inference layers
🔗 Repo: ml-production-pipeline
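The separation of data, training, and inference layers described above can be sketched roughly as follows. This is an illustrative, standard-library-only example, not the code in ml-production-pipeline: the synthetic data, the function names, and the nearest-centroid `CentroidModel` (standing in for a scikit-learn estimator) are all assumptions.

```python
import random
from dataclasses import dataclass


# --- Data layer: synthetic rows here; a real pipeline would read from SQL or files ---
def load_rows(n=200, seed=0):
    rng = random.Random(seed)  # fixed seed keeps the run reproducible
    rows = []
    for _ in range(n):
        visits = rng.randint(0, 20)
        spend = rng.uniform(0, 100)
        label = int(visits * 5 + spend > 100)  # synthetic ground truth
        rows.append({"visits": visits, "spend": spend, "label": label})
    return rows


def build_features(row):
    """Feature engineering shared by training and inference to avoid skew."""
    return [row["visits"] / 20.0, row["spend"] / 100.0]


# --- Training layer ---
@dataclass
class CentroidModel:
    centroids: dict  # label -> mean feature vector


def train(rows):
    sums, counts = {}, {}
    for row in rows:
        x, y = build_features(row), row["label"]
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return CentroidModel({y: [s / counts[y] for s in acc] for y, acc in sums.items()})


# --- Inference layer ---
def predict(model, row):
    """Return the label whose centroid is closest to the row's features."""
    x = build_features(row)

    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(x, model.centroids[label]))

    return min(model.centroids, key=squared_distance)


def evaluate(model, rows):
    """Accuracy on held-out rows."""
    return sum(predict(model, r) == r["label"] for r in rows) / len(rows)


rows = load_rows()
train_rows, test_rows = rows[:150], rows[150:]
model = train(train_rows)
accuracy = evaluate(model, test_rows)
```

Because `build_features` is the only place features are defined, the training and inference layers cannot drift apart, which is the main point of the layering.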
Tech: Python, SQL, ML, Analytics
- Translating ambiguous business problems into ML solutions
- KPI-driven modeling and post-deployment analysis
- Strong emphasis on decision-making, not just accuracy
🔗 Repo: applied-ml-case-studies
Languages
- Python, SQL
ML & AI
- Scikit-learn, PyTorch (working knowledge)
- LLMs (OpenAI, open-source)
- RAG, embeddings, vector search
Data & Analytics
- Pandas, NumPy
- Feature engineering & data validation
MLOps & Infra
- Docker
- CI/CD concepts
- Model versioning & monitoring (foundational)
Cloud (working experience)
- AWS / GCP (ML workflows & deployment concepts)
- 🎓 Master’s in Data Analytics – Northeastern University
- 💼 Experience across consulting, product, and data-driven teams
- 🔍 Strong bridge between business context and technical execution
I believe strong AI engineers:
- Start with problem framing, not models
- Treat data pipelines as first-class systems
- Optimize for maintainability over cleverness
- Measure success by business and user impact, not just model metrics
- 💼 LinkedIn: https://www.linkedin.com/in/pranshu-kumar
- 📧 Email: [email protected]
If you’re building data-intensive or AI-powered products, I’d love to collaborate or chat.
⭐ If you find something useful here, feel free to star a repo or start a discussion.

