Vishvesh Trivedi NerdyVisky

👋 Hello devs, I'm Vishvesh (NerdyVisky)

👨‍🎓 About Me

I’m a Machine Learning Engineer + Researcher currently pursuing my M.S. in Computer Science at NYU Courant (GPA: 4.0).
My work sits at the intersection of:

LLM reasoning, retrieval & attention mechanisms
Healthcare ML & safety for deployed clinical AI
Document intelligence, VLMs, synthetic data generation
Production ML systems & model monitoring

🧪 Current Roles

🔬 Graduate Research Assistant — NYU CILVR Lab

Working with Prof. Eunsol Choi on multilingual LLM retrieval:

Improving in-context fact retrieval across 5 languages
Modifying attention mechanisms in LLaMa-3.2-8B, Qwen-2.5-7B, Phi-3.5
Achieved 15% retrieval gains with 30% lower KV-cache usage

🏥 Machine Learning Engineer — NYU Langone Health (BioDASH)

Building production-grade safety systems for ML models powering clinical workflows across 23 hospitals.

Designed drift detection pipelines (K–S, PSI, DeLong)
Real-time monitoring with Prometheus + Grafana
Extensive work with HIPAA-compliant datasets (Epic Cosmos, OMOP CDM, Caboodle, Clarity)
Co-authored NIH & PCORI grant proposals
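
As a rough illustration of two of the drift statistics named above, here is a minimal pure-Python sketch of the Population Stability Index (PSI) and the two-sample Kolmogorov-Smirnov (K-S) statistic. It is not the production pipeline: the function names, the equal-width binning, and the `eps` floor are assumptions made for the example, and DeLong's test is left out.

```python
import math
from bisect import bisect_right


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index of `actual` against the reference `expected`.

    Bins are equal-width over the reference range (an assumption; quantile
    bins are another common choice). Empty bins are floored at `eps`.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if the reference is constant

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values into edge bins
            counts[idx] += 1
        return [max(c / len(values), eps) for c in counts]

    e, a = bin_fractions(expected), bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def ks_statistic(x, y):
    """Two-sample K-S statistic: the largest gap between the two empirical CDFs."""
    xs, ys = sorted(x), sorted(y)
    d = 0.0
    for v in xs + ys:
        cdf_x = bisect_right(xs, v) / len(xs)
        cdf_y = bisect_right(ys, v) / len(ys)
        d = max(d, abs(cdf_x - cdf_y))
    return d


reference = list(range(100))          # stand-in for a training-time feature
live = [v + 50 for v in reference]    # stand-in for a shifted production feature
print(psi(reference, live), ks_statistic(reference, live))
```

A commonly cited rule of thumb treats PSI above roughly 0.25 as a significant shift, though the right threshold depends on the feature and the use case.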

📄 ML Researcher — CVIT, IIIT Hyderabad

Published at ICDAR 2025 (Oral, Top 2%).

Generated 18k synthetic slides using a novel LLM pipeline
Boosted VLM performance by 13% mAP and 10% Recall@K
HuggingFace model reached 500+ downloads
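
For readers unfamiliar with the retrieval metric quoted above, Recall@K can be sketched as follows. This is the generic definition, not the paper's evaluation code, and the function name and arguments are illustrative.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant items that appear among the top-k ranked results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0  # convention chosen for this sketch when nothing is relevant
    top_k = set(ranked_ids[:k])
    return len(top_k & relevant) / len(relevant)


# Two relevant slides; only one of them is ranked in the top 2.
print(recall_at_k(["s1", "s7", "s3", "s9"], {"s1", "s9"}, k=2))
```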

📚 Publication

ICDAR 2025 (Oral, Top 2%)

AI-Generated Lecture Slides for Improving Slide Element Detection and Retrieval
Maniyar, Trivedi et al.
🔗 Project: https://synslidegen.github.io
🔗 DOI: https://doi.org/10.1007/978-3-032-04614-7_11

🚀 Featured Projects

⚡ Adaptive & Warp-Cooperative GPU Hash Table

🔗 Code: https://github.com/NerdyVisky/adaptive-gpu-hashtable

Built a high-performance adaptive GPU hash table in C++/CUDA using cooperative groups and elected-lane atomics
Achieved 21× faster inserts and 20× faster lookups than naïve GPU hashing at scale
Implemented epoch-based dynamic resizing + compaction for non-blocking concurrency
Sustains stable throughput on 100M+ operations, even at 0.99 load factor
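
To give a feel for the warp-cooperative idea, here is a small CPU-only simulation in Python. It is a sketch under stated assumptions, not the project's CUDA code: `WarpProbedTable` and its methods are hypothetical names, each probe step inspects a window of 32 slots as if one lane handled each slot, list comprehensions stand in for a warp ballot, and there are no atomics, resizing, or deletions.

```python
WARP_SIZE = 32
EMPTY = object()  # sentinel that cannot collide with a user key


class WarpProbedTable:
    """Open-addressing table probed one WARP_SIZE-slot window at a time."""

    def __init__(self, capacity):
        # Round capacity up to a multiple of WARP_SIZE so windows tile the table.
        self.capacity = -(-capacity // WARP_SIZE) * WARP_SIZE
        self.keys = [EMPTY] * self.capacity
        self.values = [None] * self.capacity

    def _windows(self, key):
        """Yield window base offsets, starting at the key's home window."""
        n_windows = self.capacity // WARP_SIZE
        start = hash(key) % n_windows
        for i in range(n_windows):
            yield ((start + i) % n_windows) * WARP_SIZE

    def insert(self, key, value):
        for base in self._windows(key):
            window = self.keys[base:base + WARP_SIZE]
            # Simulated ballot: every lane reports a match or an empty slot.
            match = [lane for lane, k in enumerate(window) if k == key]
            empty = [lane for lane, k in enumerate(window) if k is EMPTY]
            if match:
                self.values[base + match[0]] = value  # update in place
                return True
            if empty:
                lane = empty[0]  # simulated elected lane: lowest lane with an empty slot
                self.keys[base + lane] = key
                self.values[base + lane] = value
                return True
        return False  # every window is full

    def lookup(self, key):
        for base in self._windows(key):
            window = self.keys[base:base + WARP_SIZE]
            for lane, k in enumerate(window):
                if k == key:
                    return self.values[base + lane]
            if any(k is EMPTY for k in window):
                return None  # without deletions, the key cannot live past a window with free slots
        return None


table = WarpProbedTable(64)
for key in range(63):  # fill 63 of 64 slots
    table.insert(key, key * 10)
print(table.lookup(42), table.lookup(1000))
```

On a GPU the window check would be a single ballot across the warp and the write would be an atomic issued by one elected lane; the simulation only mirrors the control flow.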

🎯 Attention-Aware DPO for Multi-Image VQA

Designed Attention-Aware DPO improving multi-image VQA accuracy by 8.5%
Applied AdaptVis at inference time for a 10% gain over the base model
Built LLM-as-a-judge with Gemini-2.5-Pro
🔗 Code: https://github.com/harsh-sutariya/AA-DPO
🔗 Website: https://nerdyvisky.github.io/projects/AttnDPO/
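
For context, the standard DPO objective that a variant like this builds on fits in a few lines. The sketch below is the vanilla loss for a single preference pair, not the attention-aware variant, whose details are not described here. The function name, argument names, and the default `beta` are illustrative assumptions.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Vanilla DPO loss for one (chosen, rejected) pair of sequence log-probs."""
    # Implicit rewards: how far the policy has moved from the frozen reference model.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)) computed as softplus(-margin) in a numerically stable form.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# The policy prefers the chosen answer more strongly than the reference does,
# so the loss falls below log(2), its value when policy and reference agree.
print(dpo_loss(-1.0, -5.0, -2.0, -2.0))
```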

🔍 Open-Source Contribution: Retrieval Heads (ICLR 2025 spotlight)

Refactored full codebase for faster, leaner execution
Built vectorized dataloaders, added flash-attention, integrated vLLM
Reduced inference time 2 hrs → 30 mins (4× faster)
🔗 Code: https://github.com/NerdyVisky/multilingual-retrieval-translation-heads

💻 Tech Stack

Languages

Python · C/C++ · R · SQL (Postgres, MySQL) · JavaScript · TypeScript · Bash/Zsh

Frameworks & Libraries

PyTorch · TensorFlow · HuggingFace · LangChain
NumPy · Pandas · scikit-learn · Matplotlib

DevOps & Tools

AWS · GCP · Azure · Databricks
Docker · Git · Redis · MongoDB
Prometheus · Grafana

📈 GitHub Stats

Thanks for stopping by! Feel free to reach out if you're working on LLMs, retrieval, ML safety, or healthcare AI. 🚀

[Nov 2025] - I am looking for full-time SDE/MLE and Applied Science roles based in the US, starting Summer 2026. I am a US Permanent Resident (Green Card holder) and require no visa sponsorship. If you're hiring and like my work, feel free to reach me by email: [email protected]
