📍 USA | 🤖 Googler | 🧠 AI Research Scientist
Building intelligent coding agents and pushing the boundaries of AI for software engineering, from LLM post-training to multi-agent systems that write real-world software.
- 🏗️ OpenDev - Open-source Coding Agent for the Terminal (470+ ⭐)
- 🧬 SWE-EVO - A Challenging Benchmark for Coding Agents in Software Evolution Scenarios
- 📚 CodeWiki - Open-source DeepWiki: holistic repo-level documentation across multilingual codebases (760+ ⭐)
- 📚 TheVault - Open-source Code-Comment Dataset for Instruction Tuning (109+ ⭐)
- 🕵️ HyperAgent - Generalist software agents to solve software engineering tasks (235 ⭐)
- 👥 AgileCoder - Agile methodology meets multi-agent systems for building real-world software (451 ⭐)
- 🦫 CodeCapybara - Open-source self-instruction tuning Code LLM (171 ⭐)
- 🖥️ XMainframe - Language model for mainframe modernization (COBOL → modern code) (68 ⭐)
- 🔧 CodeTF - One-stop Transformer library for state-of-the-art Code LLMs (1.5k+ ⭐)
- 🚀 CodeT5 - Open Code LLMs for code understanding and generation (3k+ ⭐)
- 🧠 InferCode - Self-supervised learning of code representations (89 ⭐)
- 🌳 AST Node Encoding - AST node vector embeddings for source code
- 🔀 Bi-TBCNN - Bilateral tree-based convolutional neural network for code
- 📊 Graph-AST - Graph representations of source code
- 🔗 GGNN.TensorFlow - Gated graph neural networks for code classification
- 📖 Awesome AI4Code - Curated list of AI4Code papers and datasets
- Coding Agents - Building AI agents that can autonomously understand, navigate, and modify codebases
- Code LLMs - Training and fine-tuning large language models specialized for code
- Multi-Agent Systems - Orchestrating multiple AI agents for complex software engineering tasks
- Code Representation - Learning deep representations of source code via graphs, trees, and transformers




