Results-driven Data Scientist and Machine Learning Engineer with hands-on experience in designing, building, and deploying production-ready ML systems. I focus on turning data into reliable, scalable models and features that drive product value. My work spans research-grade experiments, production ML, and data engineering for analytics at scale.
- Practical experience with end-to-end ML workflows: data ingestion, feature engineering, model training, hyperparameter tuning, evaluation, and deployment.
- Strong Python-first stack (pandas, NumPy, scikit-learn) and deep learning using TensorFlow / PyTorch.
- Production experience with model serving, containerization (Docker), MLOps tooling (MLflow, DVC), and cloud platforms (AWS / GCP).
- Languages & Libraries: Python, SQL, pandas, NumPy, scikit-learn, TensorFlow, PyTorch
- Data Engineering: ETL pipelines, data validation, BigQuery / PostgreSQL, Airflow
- MLOps & Deployment: Docker, Kubernetes, MLflow, DVC, CI/CD, model monitoring
- Cloud & Infra: AWS (S3, EC2, SageMaker), GCP (BigQuery, Cloud Run), REST APIs
- Tools: Jupyter, Git, Docker, Bash, VS Code
- Other: Experiment tracking, A/B testing, model explainability, hyperparameter optimization
- GitHub profile: https://github.com/BlaiseMarvin
- LinkedIn: https://www.linkedin.com/in/blaiserusoke
Quick links to dive in — see my pinned repositories on my profile for full code and READMEs.
Curated highlights (click each title for the code and notebooks). The projects below are chosen to showcase my work across ML research, production ML, feature engineering at scale, and end-to-end data platforms.
- Machine-Learning-and-Big-Data-Analytics: Primary AI and big-data repo, where most of my ML and big-data work lives. Contains Jupyter notebooks and pipeline examples for large-scale data processing and analytics with Apache Spark: Spark-based ETL and feature pipelines, end-to-end model training (feature engineering, model selection, hyperparameter tuning), distributed experiments, and model evaluation on large datasets.
- FaceRecognitionPaymentSystem: Edge-AI face-recognition payment prototype. A deep-learning face-recognition model (Siamese network on Inception-ResNet v1) deployed for real-time inference on a Raspberry Pi with an Intel NCS2 via OpenVINO. This project powered the paper "Edge AI Face Recognition for Public Transport Fare Payment" and won an award (see announcement: https://x.com/UCC_Official/status/1539631099923546113). The published e-print reports model metrics (validation accuracy ~93.8%) and deployment details: https://www.techrxiv.org/users/685096/articles/679153-edge-ai-face-recognition-for-public-transport-fare-payment
- analytics_engineering: Analytics engineering and transformation pipelines (dbt + orchestration). Contains the dbt models and Dagster pipelines used to build and maintain analytics-ready data marts and transformation workflows.
- data_engineering_zoomcamp: My exercises and projects from the Data Engineering Zoomcamp. Includes hands-on work with ingestion, storage, Airflow orchestration, and analytics stacks (BigQuery/Postgres, Docker, and related tooling).
Other notable projects (selected):
- Sasanya — A barter trading app built with Flutter (mobile prototyping).
- socialMediaApp-fastapi — Full-stack prototype using FastAPI for backend APIs.
For more repositories and small utilities, see my GitHub profile: https://github.com/BlaiseMarvin
- Open the repo link and start with the top-level README or the `notebooks/` (or similarly named) folder to run example experiments.
- Many repos include Jupyter notebooks: use a Python 3.8+ environment, install requirements from `requirements.txt`, and open the notebooks in Jupyter or VS Code.
- For the FaceRecognitionPaymentSystem, see the `deployment/` folder and `README.md` for edge deployment notes (OpenVINO, Raspberry Pi + NCS2).
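
The getting-started steps above can be sanity-checked with a few lines of Python before opening anything. This is only a minimal sketch that uses the standard library: the function name `inspect_repo` and the keys it returns are illustrative assumptions, not something shipped in any of the repositories. It checks the Python 3.8+ requirement, looks for a top-level `requirements.txt`, and lists the notebooks found in a local clone.

```python
import sys
from pathlib import Path

MIN_PYTHON = (3, 8)  # minimum interpreter version mentioned in the steps above


def inspect_repo(repo_dir):
    """Summarize what a local clone offers for running its notebooks.

    Hypothetical helper for illustration; not part of any repository.
    """
    root = Path(repo_dir)
    # Collect notebooks anywhere in the clone, skipping Jupyter's autosave copies.
    notebooks = sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*.ipynb")
        if ".ipynb_checkpoints" not in path.parts
    )
    return {
        "python_ok": sys.version_info >= MIN_PYTHON,
        "has_requirements": (root / "requirements.txt").is_file(),
        "notebooks": notebooks,
    }
```

Calling `inspect_repo(".")` from the root of a clone returns a small dictionary. If `has_requirements` is true, installing with `pip install -r requirements.txt` inside a fresh virtual environment is the usual next step. Some repos may keep their dependencies elsewhere, so treat a false value as a prompt to read that repo's README rather than as an error.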
If you'd like to collaborate or discuss a role, email: [email protected] (replace with your preferred address). I'm open to consulting, short-term contracts, and full-time opportunities.
- Value-driven: I prioritize models and features that produce measurable product improvements.
- Reproducible experiments: I track experiments, keep data lineage, and automate retraining where appropriate.
- Production-first mindset: I design models with serving, latency, and monitoring in mind from day one.
- B.Sc. / M.Sc. in [Your Field] — [University Name] (add years)
- Certifications: (e.g., Coursera TensorFlow, AWS Certified ML Specialist) — add specifics here
- LinkedIn: https://www.linkedin.com/in/blaisemrusoke
- GitHub: https://github.com/BlaiseMarvin
- Email: (add preferred contact email)
