CUDA-kernels

A record of my personal CUDA kernel implementations. These implementations are not fully optimized and are intended mainly for learning purposes.

Kernels

- Softmax
- ReLU
- GEMM
- LayerNorm

More kernels are coming...