pranshu1921/README.md

Hi, I'm Pranshu Kumar Premi 👋

I design, build, and deploy production-grade machine learning and AI systems that solve real business problems at scale.
My work spans classical ML, LLM-based systems, data pipelines, and MLOps, with a strong focus on robustness, performance, and real-world impact.

🎯 Currently focused on:

  • LLM-powered applications (RAG, agents, evaluation frameworks)
  • Scalable ML pipelines & feature engineering
  • AI system design for data-intensive products

📍 United States | Open to Remote & Hybrid roles


🧠 What I Work On

  • Applied Machine Learning: predictive modeling, NLP, feature engineering, experimentation
  • Generative AI & LLMs: RAG pipelines, agentic systems, prompt engineering, evaluation
  • Production ML Systems: model deployment, monitoring, CI/CD, data pipelines
  • Data Engineering for ML: SQL-first design, ETL, analytics-ready datasets

I care deeply about clean abstractions, reproducibility, and measurable outcomes.


🚀 Featured Projects

These repositories reflect how I think about real-world AI systems, not toy demos.

🔹 LLM-Powered Knowledge Assistant (RAG)

Tech: Python, FastAPI, OpenAI / OSS LLMs, Vector DB, Docker

  • End-to-end Retrieval-Augmented Generation pipeline
  • Document ingestion, chunking, embeddings, retrieval, and evaluation
  • Production-ready API with monitoring hooks

🔗 Repo: llm-rag-platform
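
As a rough illustration of the chunking and retrieval stages listed above, here is a minimal, dependency-free sketch. It is not the code from the repository: the function names (`chunk_text`, `embed`, `retrieve`) are hypothetical, and the bag-of-words `embed` is a toy stand-in for a real embedding model and vector database.

```python
import math
from collections import Counter


def chunk_text(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping windows; sizes are counted in words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding (assumption): lowercase bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

In a real deployment the retrieved chunks would be passed to an LLM as context; swapping `embed` for a learned embedding model changes the similarity quality but not the overall shape of the pipeline.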


🔹 Machine Learning Platform (End-to-End)

Tech: Python, SQL, Pandas, Scikit-learn, Airflow-style orchestration

  • Data ingestion → feature engineering → model training → evaluation
  • Emphasis on reproducibility and modular design
  • Clear separation of data, training, and inference layers

🔗 Repo: ml-production-pipeline
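
The separation of data, training, and inference layers described above can be sketched as follows. This is an illustrative toy under stated assumptions, not the repository's code: the function names, the hard-coded data, and the threshold "model" are all hypothetical and use only the standard library.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Model:
    """Trained artifact: the only thing the inference layer needs."""
    threshold: float


def load_data() -> list[tuple[float, int]]:
    """Data layer: returns (raw_value, label) rows. Hard-coded toy data."""
    return [(1.0, 0), (2.0, 0), (3.0, 0), (7.0, 1), (8.0, 1), (9.0, 1)]


def build_features(rows: list[tuple[float, int]]) -> list[tuple[float, int]]:
    """Feature layer: scale raw values to [0, 1] by the observed maximum."""
    max_value = max(value for value, _ in rows)
    return [(value / max_value, label) for value, label in rows]


def train(features: list[tuple[float, int]]) -> Model:
    """Training layer: threshold halfway between the two class means."""
    positive = mean(x for x, y in features if y == 1)
    negative = mean(x for x, y in features if y == 0)
    return Model(threshold=(positive + negative) / 2)


def predict(model: Model, x: float) -> int:
    """Inference layer: depends only on the trained Model and a feature."""
    return int(x >= model.threshold)


def evaluate(model: Model, features: list[tuple[float, int]]) -> float:
    """Accuracy on the given rows (a real pipeline would use a held-out split)."""
    correct = sum(predict(model, x) == y for x, y in features)
    return correct / len(features)


def run_pipeline() -> float:
    """Data ingestion -> feature engineering -> training -> evaluation."""
    features = build_features(load_data())
    model = train(features)
    return evaluate(model, features)
```

Because each stage only consumes the previous stage's output, any one of them can be replaced (for example, `train` with a Scikit-learn estimator) without touching the others.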


🔹 Business-Focused ML Case Study

Tech: Python, SQL, ML, Analytics

  • Translating ambiguous business problems into ML solutions
  • KPI-driven modeling and post-deployment analysis
  • Strong emphasis on decision-making, not just accuracy

🔗 Repo: applied-ml-case-studies


🛠️ Tech Stack

Languages

  • Python, SQL

ML & AI

  • Scikit-learn, PyTorch (working knowledge)
  • LLMs (OpenAI, open-source)
  • RAG, embeddings, vector search

Data & Analytics

  • Pandas, NumPy
  • Feature engineering & data validation

MLOps & Infra

  • Docker
  • CI/CD concepts
  • Model versioning & monitoring (foundational)

Cloud (working experience)

  • AWS / GCP (ML workflows & deployment concepts)

📚 Background

  • 🎓 Master’s in Data Analytics – Northeastern University
  • 💼 Experience across consulting, product, and data-driven teams
  • 🔍 Strong bridge between business context and technical execution

📈 How I Think About AI

I believe strong AI engineers:

  • Start with problem framing, not models
  • Treat data pipelines as first-class systems
  • Optimize for maintainability over cleverness
  • Measure success by business and user impact, not just offline model metrics

🤝 Let’s Connect

If you’re building data-intensive or AI-powered products, I’d love to collaborate or chat.


⭐ If you find something useful here, feel free to star a repo or start a discussion.

Pinned

  1. Personalised-Cancer-Diagnosis (Public)

    Predict the effect of genetic mutations in cancer tumors and classify them based on text from clinical literature.

    Jupyter Notebook

  2. Analyzing-International-Debt-Statistics (Public)

    SQL project to analyze global debt statistics collected by the World Bank.

    Jupyter Notebook

  3. Amazon-Fashion-Discovery-Engine (Public)

    Recommend apparel similar to items a user searched for on Amazon.

    Jupyter Notebook

  4. Predicting-Credit-Card-Approvals (Public)

    Build a machine learning model to predict whether a credit card application will be approved.

    Jupyter Notebook

  5. AINE-AI-Predicting-Customer-Churn-Telecom (Public)

    Data and files for a predictive analytics project to reduce monthly churn by identifying high-risk customers well in advance.

    Jupyter Notebook

  6. Gemini_Chatbot (Public)

    Python