WangXinglin/DORA
Code and data for the NeurIPS 2025 paper "Every Rollout Counts: Optimal Resource Allocation for Efficient Test-Time Scaling"

Environment

pip install -r requirements.txt

Sampling

bash scripts/run_${method}_qwen.sh

where ${method} is the name of the sampling strategy to run.
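For example, to run the DORA strategy with the Qwen scripts (the exact script name below is an assumption; check the scripts/ folder for the methods actually provided):

bash scripts/run_dora_qwen.sh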

Merge Sampling Result

bash scripts/merge_result.sh

Get Accuracy Result

bash scripts/eval_result.sh

Develop Your Own Sampling Strategy (optional)

Please add your code under scripts/sal/search; an illustrative skeleton follows below.
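The most reliable guide is to mirror one of the existing strategies in that folder. As a rough, hypothetical sketch of the shape a strategy can take (the function and object names below are assumptions for illustration, not the repository's actual interface):

def my_search(prompt, llm, reward_model, num_rollouts):
    # Hypothetical best-of-N placeholder: sample num_rollouts candidate
    # solutions and return the one the reward model scores highest.
    candidates = [llm.generate(prompt) for _ in range(num_rollouts)]
    scores = [reward_model.score(prompt, c) for c in candidates]
    return candidates[scores.index(max(scores))]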

Acknowledgement

We learned a lot and borrowed some code from the following projects when building DORA.

Abstract

Test-Time Scaling (TTS) improves the performance of Large Language Models (LLMs) by using additional inference-time computation to explore multiple reasoning paths through search. Yet how to allocate a fixed rollout budget most effectively during search remains underexplored, often resulting in inefficient use of compute at test time. To bridge this gap, we formulate test-time search as a resource allocation problem and derive the optimal allocation strategy that maximizes the probability of obtaining a correct solution under a fixed rollout budget. Within this formulation, we reveal a core limitation of existing search methods: solution-level allocation tends to favor reasoning directions with more candidates, leading to theoretically suboptimal and inefficient use of compute. To address this, we propose Direction-Oriented Resource Allocation (DORA), a provably optimal method that mitigates this bias by decoupling direction quality from candidate count and allocating resources at the direction level. To demonstrate DORA’s effectiveness, we conduct extensive experiments on challenging mathematical reasoning benchmarks including MATH500, AIME2024, and AIME2025. The empirical results show that DORA consistently outperforms strong baselines with comparable computational cost, achieving state-of-the-art accuracy. We hope our findings contribute to a broader understanding of optimal TTS for LLMs.
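The core idea above is that rollouts should be allocated at the level of reasoning directions, so that a direction's share of the budget depends on its estimated quality rather than on how many candidate solutions it happens to contain. The sketch below only illustrates that direction-level view in Python; the proportional rule, the function name, and the quality scores are assumptions, not the paper's derived optimal allocation or the repository's API.

def allocate_rollouts(direction_quality, budget):
    # Split a fixed rollout budget across reasoning directions in proportion
    # to estimated direction quality, independent of how many candidate
    # solutions each direction currently holds.
    total = sum(direction_quality.values())
    alloc = {d: int(budget * q / total) for d, q in direction_quality.items()}
    # Hand leftover rollouts (from rounding down) to the highest-quality directions.
    leftover = budget - sum(alloc.values())
    for d in sorted(direction_quality, key=direction_quality.get, reverse=True)[:leftover]:
        alloc[d] += 1
    return alloc

# Directions of equal quality receive (nearly) equal budget, even if one of
# them currently has far more candidate solutions than the others.
print(allocate_rollouts({"d1": 0.5, "d2": 0.5, "d3": 0.5}, 16))
# -> {'d1': 6, 'd2': 5, 'd3': 5}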
