rope-embeddings

Here are 2 public repositories matching this topic...

Zayer1 / nano-llama-engine

"A from-scratch implementation of a modern Large Language Model (LLaMA architecture). Features full backpropagation calculus in pure NumPy, followed by a GPU-accelerated PyTorch training loop."

machine-learning calculus numpy transformers attention-mechanism backpropagation layer-normalization self-attention deep-learning-from-scratch adamw-optimizer large-language-models swiglu rope-embeddings

Updated Jun 7, 2026
Python

ralolooafanxyaiml / frad

A from-scratch PyTorch LLM implementing Sparse Mixture-of-Experts (MoE) with Top-2 gating. Integrates modern Llama-3 components (RMSNorm, SwiGLU, RoPE, GQA) and a custom-coded Byte-Level BPE tokenizer. Pre-trained on a curated corpus of existential & dark philosophical literature.

python pytorch transformer moe from-scratch mixture-of-experts bpe gqa llm rmsnorm swiglu existential-ai llama-3-architecture rope-embeddings custom-tokenizer

Updated Jan 7, 2026
Python
