OneEval: Open EvalScope evaluation artifacts for LLMs — subset breakdowns, pass@k curves, and reproducible evaluation protocols.
Updated Mar 4, 2026 - Python