- JD Explore Academy, JD.com Inc.
- Shanghai (CN) & Sydney (AU)
- liamding.cc
- @liangdingNLP
- https://scholar.google.com/citations?user=lFCLvOAAAAAJ
Stars
AnchorAttention: Improved attention for long-context LLM training
📰 Must-read papers and blogs on Speculative Decoding ⚡️
Offline package for converting latitude/longitude coordinates to Chinese province/city/district/county/township, using spatial-query algorithms; fast (50k lookups per second on a single thread) with 100% accuracy at the province/city/district/county level.
Finetune Llama 3.2, Mistral, Phi, Qwen 2.5 & Gemma LLMs 2-5x faster with 80% less memory
Official PyTorch implementation of IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact
A Collection on Large Language Models for Optimization
A simple screen parsing tool towards pure vision based GUI agent
[ACL 2024] Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding
[NeurIPS 2024] The official implementation of paper: Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs.
The official implementation of paper: SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction.
Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.
A bibliography and survey of the papers surrounding o1
📰 Must-read papers on KV Cache Compression (constantly updating 🤗).
O1 Replication Journey: A Strategic Progress Report – Part I
Re-organized code for developing LLM-based chatbots, including DataProc, SFT, DPO, Demo, etc.
Codebase for Aria - an Open Multimodal Native MoE
Unified KV Cache Compression Methods for Auto-Regressive Models
[MQM-APE] Toward High-Quality Error Annotation Predictors with Automatic Post-Editing in LLM Translation Evaluators.
Welcome to LLM-Dojo, an open-source playground for learning large language models, built with concise and readable code: a model training framework (supporting mainstream models such as Qwen, Llama, GLM, etc.), an RLHF framework (DPO/CPO/KTO/PPO), and more. 👩🎓👨🎓
Official code repo for the O'Reilly Book - "Hands-On Large Language Models"
Super-Efficient RLHF Training of LLMs with Parameter Reallocation
A framework to study AI models in Reasoning, Alignment, and use of Memory (RAM).
Code generation with "Uncertainty-Aware Selective Contrastive Decoding"
Contextual Position Encoding with custom CUDA kernels (https://arxiv.org/abs/2405.18719)