I like building things from scratch, especially when there's a paper involved. These are some of my favorite projects:
Deep Learning NLP Transformers
A from-scratch implementation of the decoder from "Attention Is All You Need", trained on Shakespeare's complete works. Built progressively — from a bigram model up to the full transformer — to actually understand each step rather than just run the code.
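The decoder's core operation, causal self-attention, fits in a few lines of NumPy. This is a minimal single-head sketch of the idea, not the trained model:

```python
import numpy as np

def causal_attention(Q, K, V):
    """Scaled dot-product attention with a causal mask, as in the
    decoder of "Attention Is All You Need"."""
    T, d_k = Q.shape
    scores = Q @ K.T / np.sqrt(d_k)      # (T, T) pairwise similarity
    mask = np.triu(np.ones((T, T), dtype=bool), k=1)
    scores[mask] = -np.inf               # block attention to future positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
out = causal_attention(Q, K, V)
```

Because of the mask, position 0 can only attend to itself, so its output is exactly `V[0]`; later positions mix in earlier values.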
NLP Embeddings NumPy
Word2Vec implemented in pure NumPy — no deep learning frameworks. Trained on the text8 Wikipedia dataset and validated with vector arithmetic and intruder tests. The model correctly identifies that "war" doesn't belong with "seven", "eight", "nine".
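The intruder test itself is simple: the odd one out is the word least similar, on average, to the rest of the group. A sketch with hypothetical 2-D vectors standing in for the trained embeddings:

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def intruder(words, vectors):
    """Return the word with the lowest average similarity to the others."""
    avg_sim = []
    for i, v in enumerate(vectors):
        sims = [cosine(v, u) for j, u in enumerate(vectors) if j != i]
        avg_sim.append(sum(sims) / len(sims))
    return words[int(np.argmin(avg_sim))]

# Toy embeddings (not the trained model): number words cluster,
# "war" points in a different direction.
words = ["seven", "eight", "nine", "war"]
vectors = [np.array([1.0, 0.1]), np.array([0.9, 0.2]),
           np.array([1.0, 0.0]), np.array([-0.2, 1.0])]
print(intruder(words, vectors))  # → war
```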
Deep Learning Generative Models PyTorch
A Wasserstein GAN implemented in PyTorch following the original papers. Trained on MNIST to generate handwritten digits — intentionally skipped the WGAN-GP gradient penalty, relying on the original weight-clipping constraint, to see how far the base formulation gets on a simple dataset.
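The critic-side idea is easy to show in isolation. A toy sketch with a linear critic f(x) = w·x standing in for the network (the actual project trains a PyTorch model): maximize the Wasserstein objective E[f(real)] − E[f(fake)], and enforce the Lipschitz constraint by clipping weights after each step, as in the original WGAN paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D data: "real" samples near +2, "fake" samples near -2.
real = rng.normal(loc=2.0, scale=0.5, size=256)
fake = rng.normal(loc=-2.0, scale=0.5, size=256)

w, c, lr = 0.0, 0.01, 0.1
for _ in range(100):
    grad = real.mean() - fake.mean()  # d/dw of E[w*real] - E[w*fake]
    w += lr * grad                    # gradient *ascent* on the critic
    w = np.clip(w, -c, c)             # weight clipping bounds the critic's slope

estimate = w * real.mean() - w * fake.mean()
```

The clipped critic's objective value is a (scaled) estimate of how far apart the two distributions sit, which is what the generator then trains against.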
C++ Compression Information Theory
A from-scratch implementation of arithmetic coding — one of the more elegant ideas in data compression. Encodes symbols by probability and hits compression ratios close to the theoretical entropy limit.
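The elegance is in the encoding loop: each symbol narrows a working interval in proportion to its probability, and any number inside the final interval identifies the whole message. A Fraction-based illustration (the C++ project would use fixed-precision integer arithmetic in practice):

```python
from fractions import Fraction

def encode(symbols, probs):
    """Narrow [low, high) once per symbol."""
    cum, total = {}, Fraction(0)
    for s, p in probs.items():        # cumulative distribution: symbol -> start of its slice
        cum[s] = total
        total += p
    low, high = Fraction(0), Fraction(1)
    for s in symbols:
        width = high - low
        high = low + width * (cum[s] + probs[s])
        low = low + width * cum[s]
    return low, high

probs = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
low, high = encode("aab", probs)
```

The final interval's width is the product of the symbol probabilities — here 1/2 · 1/2 · 1/4 = 1/16 — so the number of bits needed to name a point inside it is −log₂(width), which is exactly why arithmetic coding approaches the entropy limit.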
Python Math Floating-Point
A Python package for doing arithmetic with floating-point binary fractions — because sometimes you want to do math the way the CPU does it. It supports the standard operations, installs as a package, and is a nice way to see what's really going on beneath 0.1 + 0.2.

