XinXU-USTC



Tencent-Hunyuan/Thinking-Free_Policy_Initialization (Public)

Official code for the ICLR 2026 paper "TFPI: Thinking-Free Policy Initialization Makes Distilled Reasoning Models More Effective and Efficient Reasoners"

Python · 103 stars · 12 forks
Composition-RL (Public)

Official repository for the paper "Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models"

Python · 134 stars · 18 forks
YangLabHKUST/UGPhysics (Public)

Official Repository of UGPhysics Benchmark [ICML 2025]

Python · 118 stars · 17 forks