I am a software engineer and machine learning researcher with a background in mathematics, currently focused on large language models, program reasoning, and AI systems.
I enjoy building systems, studying the foundations behind intelligent models, and turning research ideas into practical implementations.
I am currently pursuing a master’s degree in Data Mining at Shahid Beheshti University, where I hold the first rank in my major with a GPA of 18.88/20.
My master’s thesis studies how reliably large language models reason about semantic relationships between programs. It develops a structured evaluation framework to assess whether LLMs can go beyond simple equivalence predictions and provide consistent, checkable reasoning about code changes.
I am conducting this research with Dr. Khashayar Etemadi, Postdoctoral Researcher at ETH Zurich.
I am currently working as a Software Engineer at Argoman, where I lead a squad focused on building HR-related products and solutions.
Our team turns business needs into practical, usable software products designed with Domain-Driven Design (DDD) principles. As part of this effort, we are exploring how AI agents can improve product workflows, reduce repetitive tasks, and create a more efficient and intuitive user experience.
Before Argoman, I worked at Yektanet, the largest and one of the most advanced online advertising networks in Iran. It was one of the strongest technical environments I have experienced, and it helped me grow significantly as a software engineer.
I hold a bachelor’s degree in Mathematics and Applications from Amirkabir University of Technology.
Before university, I studied at the National Organization for Development of Exceptional Talents, also known as Sampad.
Mathematics has had a major influence on the way I think. It helps me break down problems, reason carefully, organize ideas, and build structured solutions, whether in software engineering or machine learning research.
I have served as a lead teaching assistant at Amirkabir University of Technology for several courses, including:
- 📊 Data Mining, instructed by Dr. Fatemeh Shakeri
- 💻 Computational Data Mining, instructed by Dr. Fatemeh Shakeri
- 🤖 Deep Learning, instructed by Dr. Fatemeh Shakeri
- 📐 Linear Algebra, instructed by Dr. Behzad Najafi
- 🖥️ User Interface Design, instructed by Dr. Sajad Shirali Shahreza
Some of my projects are available on my GitHub.
This is the official implementation of our paper on accelerating diffusion model inference.
The paper introduces Cached Adaptive Token Merging, or CA-ToMe, a training-free method that reduces redundant self-attention computation by adaptively merging similar tokens and caching token-pair information across denoising steps. The goal is faster inference with preserved image generation quality.
This project was part of my early research experience, which I started near the end of my bachelor’s studies.
🔗 Cached Adaptive Token Merging
A friend and I created Mini Torch, an educational framework designed to show how PyTorch works behind the scenes.
The project is inspired by Andrej Karpathy’s micrograd, but it extends the idea toward a more general, PyTorch-like structure. The goal is to make concepts such as tensors, automatic differentiation, neural network modules, and optimization easier to understand through implementation.
You can find the main differences and design details in the project README.
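To give a flavor of the automatic differentiation idea mentioned above, here is a minimal scalar sketch in the spirit of micrograd. It is not Mini Torch's actual API: the `Value` class and its attributes are hypothetical names chosen for illustration, and only addition and multiplication are supported.

```python
class Value:
    """A scalar that remembers how it was computed so gradients can flow back."""

    def __init__(self, data, parents=()):
        self.data = float(data)
        self.grad = 0.0
        self._parents = parents
        self._backward = lambda: None  # leaf nodes have nothing to propagate

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Build a topological order so each node is processed after its consumers.
        order, seen = [], set()

        def visit(node):
            if node not in seen:
                seen.add(node)
                for parent in node._parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            node._backward()


x = Value(2.0)
y = Value(3.0)
z = x * y + x  # z = 2 * 3 + 2
z.backward()
print(z.data, x.grad, y.grad)  # 8.0 4.0 2.0
```

Gradients are accumulated with `+=` because `x` is used twice in the expression, so its two contributions (from `x * y` and from `+ x`) must be summed.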
EquiBench is a benchmark related to my master’s thesis.
The goal of this project is to evaluate how well different large language models reason about semantic relationships between programs. Instead of only checking whether a model predicts two programs as equivalent or non-equivalent, EquiBench aims to assess whether the model can provide reliable, structured, and checkable reasoning about code transformations.
The benchmark focuses on questions such as:
- Can an LLM correctly identify when two pieces of code are semantically equivalent?
- Can it explain why a transformation preserves or changes program behavior?
- Is the model’s reasoning consistent across similar examples?
- Can its explanation be verified through structured checks?
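As a rough illustration of the kind of question listed above, the sketch below compares a reference function against an equivalent rewrite and a subtly non-equivalent one. The function names and the bounded checker are hypothetical and are not EquiBench's actual data format or method. Bounded testing can only expose a counterexample; it cannot prove equivalence.

```python
def sum_loop(n):
    """Reference: sum of the integers 1..n, computed iteratively."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_closed_form(n):
    """Rewrite that is equivalent to sum_loop for n >= 0."""
    return n * (n + 1) // 2


def sum_off_by_one(n):
    """Rewrite that looks similar but drops the last term."""
    total = 0
    for i in range(1, n):
        total += i
    return total


def find_counterexample(f, g, limit=50):
    """Return the smallest n in [0, limit) where f and g disagree, else None."""
    for n in range(limit):
        if f(n) != g(n):
            return n
    return None


print(find_counterexample(sum_loop, sum_closed_form))  # None
print(find_counterexample(sum_loop, sum_off_by_one))   # 1
```

A check like this is one way an explanation could be verified: if a model claims a transformation changes behavior, a concrete input such as `n = 1` either confirms or contradicts that claim.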
In this article, I explain how to implement a decision tree from scratch and visualize its structure.
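The core step of such an implementation is choosing a split. The following is a minimal sketch of that step, assuming Gini impurity as the criterion and a single numeric feature; it is not the article's code, and `gini` and `best_threshold` are hypothetical helper names.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted impurity) for the best split `value <= threshold`."""
    best = (None, float("inf"))
    n = len(values)
    # Skip the largest value: splitting there would leave the right side empty.
    for t in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


values = [1.0, 2.0, 3.0, 4.0]
labels = ["a", "a", "b", "b"]
print(best_threshold(values, labels))  # (2.0, 0.0)
```

A full tree would apply this search to every feature, split the data on the winner, and recurse on each side until the nodes are pure or a depth limit is reached.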




