Starred repositories
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
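As a minimal sketch of how DeepSpeed typically wraps an existing PyTorch model (the toy model and the contents of ds_config below are illustrative assumptions, not taken from the repository):

```python
import torch
import deepspeed

# A toy PyTorch model; any nn.Module is wrapped the same way.
model = torch.nn.Linear(10, 2)

# Illustrative config; real configs also cover ZeRO, fp16, etc.
ds_config = {
    "train_batch_size": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-3}},
}

# deepspeed.initialize returns an engine that owns the distributed
# details; forward/backward/step then go through the engine.
model_engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```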
How to optimize some algorithms in CUDA.
A hands-on course on Huggingface Transformers; the course videos are updated in sync on Bilibili and YouTube.
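For context, the kind of Transformers usage such a course starts from is the pipeline API; a minimal sketch (the checkpoint is whatever the task's default resolves to):

```python
from transformers import pipeline

# pipeline() downloads a default checkpoint for the task and
# bundles tokenization, inference, and post-processing.
classifier = pipeline("sentiment-analysis")
print(classifier("This library makes NLP experiments painless."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```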
Development repository for the Triton language and compiler
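A minimal sketch of a Triton kernel, following the shape of the project's vector-addition tutorial (block size and tensor sizes are arbitrary choices here; a CUDA device is assumed):

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    # Each program instance handles one BLOCK_SIZE chunk.
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements  # guard the ragged tail
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

x = torch.rand(1024, device="cuda")
y = torch.rand(1024, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(x.numel(), 256),)
add_kernel[grid](x, y, out, x.numel(), BLOCK_SIZE=256)
```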
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
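A sketch of querying a served model over HTTP with the tritonclient package; the model name ("my_model") and the tensor names ("INPUT0"/"OUTPUT0") are placeholders that must match the deployed model's configuration:

```python
import numpy as np
import tritonclient.http as httpclient

# Connect to a Triton server assumed to be running locally.
client = httpclient.InferenceServerClient(url="localhost:8000")

# Build the request; names, shapes, and dtypes follow the model config.
data = np.random.rand(1, 4).astype(np.float32)
inp = httpclient.InferInput("INPUT0", list(data.shape), "FP32")
inp.set_data_from_numpy(data)

result = client.infer(model_name="my_model", inputs=[inp])
print(result.as_numpy("OUTPUT0"))
```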
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently.
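A sketch against TensorRT-LLM's high-level LLM API, assuming a recent release that ships it; the checkpoint path is a placeholder and parameter names may differ by version:

```python
from tensorrt_llm import LLM, SamplingParams

# Builds (or loads) a TensorRT engine for the given checkpoint.
llm = LLM(model="/path/to/hf_or_engine_checkpoint")  # placeholder path

params = SamplingParams(max_tokens=64, temperature=0.8)
for out in llm.generate(["Explain TensorRT engines briefly."], params):
    print(out.outputs[0].text)
```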
👑 Easy-to-use and powerful NLP and LLM library with 🤗 Awesome model zoo, supporting a wide range of NLP tasks from research to industrial applications, including 🗂Text Classification, 🔍 Neural Search…
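A sketch using PaddleNLP's Taskflow interface (assuming "sentiment_analysis" as the task name; available task names vary by release):

```python
from paddlenlp import Taskflow

# Taskflow wraps a pretrained model behind a one-line task API.
senta = Taskflow("sentiment_analysis")
print(senta("这个产品用起来真的很流畅"))
# e.g. [{'text': ..., 'label': 'positive', 'score': ...}]
```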
Implementations of SIMD instruction sets for systems which don't natively support them.
A Chinese translation of 《Effective Modern C++》 (translation complete).
The C++ Core Guidelines are a set of tried-and-true guidelines, rules, and best practices about coding in C++
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
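The prompt-based usage described there boils down to a predictor object; a minimal sketch (the checkpoint filename matches the repository's download links, while the image and click coordinates are stand-ins):

```python
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

# Load a ViT-B checkpoint downloaded from the repository's links.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

image = np.zeros((480, 640, 3), dtype=np.uint8)  # stand-in RGB image
predictor.set_image(image)

# One foreground click (label 1) as the prompt.
masks, scores, logits = predictor.predict(
    point_coords=np.array([[320, 240]]),
    point_labels=np.array([1]),
)
```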
PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (the 『飞桨』 PaddlePaddle core framework: high-performance single-machine and distributed training and cross-platform deployment for deep learning and machine learning)
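For a sense of the framework's basic API, a minimal sketch (shapes are arbitrary):

```python
import paddle

# A single dense layer, mirroring torch.nn.Linear.
linear = paddle.nn.Linear(in_features=4, out_features=2)
x = paddle.randn([3, 4])
y = linear(x)
print(y.shape)  # [3, 2]
```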
SGI STL source code analysis and notes from 《STL源码剖析》 (The Annotated STL Sources) by 侯捷 (Hou Jie); includes the e-book, annotated source code, and test code.
PaddlePaddle custom device implementation (custom hardware integration for 『飞桨』).
Warp is a modern, Rust-based terminal with AI built in so you and your team can build great software, faster.
A General-purpose Task-parallel Programming System using Modern C++
C++ image processing and machine learning library using SIMD: SSE, AVX, AVX-512, and AMX for x86/x64, NEON for ARM.
My Chinese translation of the Eigen documentation.
C++ examples for the Vulkan graphics API
ncnn is a high-performance neural network inference framework optimized for the mobile platform
microsoft / Megatron-DeepSpeed
Forked from NVIDIA/Megatron-LM. Ongoing research training transformer language models at scale, including: BERT & GPT-2
Ongoing research training transformer models at scale