Stars
Master programming by recreating your favorite technologies from scratch.
Graph-structured Indices for Scalable, Fast, Fresh and Filtered Approximate Nearest Neighbor Search
Optimizing SGEMM kernel functions on NVIDIA GPUs to a close-to-cuBLAS performance.
The simplest but fast implementation of matrix multiplication in CUDA.
🐠 Babel is a compiler for writing next generation JavaScript.
A set of hands-on tutorials for CUDA programming
Samples for CUDA developers that demonstrate features in the CUDA Toolkit
Development repository for the Triton language and compiler
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
Powerful Android APK editor, independent of aapt/aapt2
Tools for handling firmware of DJI products, with a focus on quadcopters.
PyTorch implementation of convolutional neural network visualization techniques
The simplest, fastest repository for training/finetuning medium-sized GPTs.
adefossez / demucs
Forked from facebookresearch/demucs. Code for the paper Hybrid Spectrogram and Waveform Source Separation
You like pytorch? You like micrograd? You love tinygrad! ❤️
Real-time night sky rendering and planetarium in Swift 4 with native SceneKit and Metal.
Keep passwords and other sensitive information out of your inboxes and chat logs.