Add normalized and group-boundary AutoQuant scoring #1878
Signed-off-by: weimingc <[email protected]>
Codecov Report
❌ Patch coverage is
Additional details and impacted files

| Coverage Diff | main | #1878 | +/- |
|---|---|---|---|
| Coverage | 61.17% | 61.26% | +0.09% |
| Files | 515 | 515 | |
| Lines | 57207 | 57357 | +150 |
| Hits | 34994 | 35141 | +147 |
| Misses | 22213 | 22216 | +3 |
What does this PR do?
Type of change: New feature
This PR adds two complementary AutoQuant scoring controls:
- `constraints["score_model"]="per_element"` normalizes selector coefficients by the number of represented weight elements while preserving the configured effective-bit budget.
- `method="group_recon"` with `score_boundary="group"` measures normalized reconstruction error at shared attention or MLP group outputs. This changes where sensitivity is measured without forcing the corresponding projection modules to share a quantization recipe.

It also keeps shared-expert gate/up/down projections in one deployable fused-MoE recipe decision, persists the new scoring metadata in AutoQuant checkpoints, validates restored-state compatibility, and exposes the controls through the HF PTQ CLI.
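To make the effect of per-element normalization concrete, here is a minimal toy sketch. It is not the implementation in this PR: the module names, format names, scores, and the brute-force `select_formats` search are all hypothetical, chosen only to show that dividing each coefficient by the module's weight count can change which module gets the higher-precision format while the effective-bit constraint itself stays the same.

```python
from itertools import product

# Hypothetical toy data (not from the PR): two modules of different size,
# two candidate formats, and a raw sensitivity score per (module, format).
NUMEL = {"small_proj": 1000, "large_proj": 2000}
BITS = {"w4": 4.0, "w8": 8.0}
RAW_SCORES = {
    "small_proj": {"w4": 1.0, "w8": 0.1},
    "large_proj": {"w4": 1.5, "w8": 0.1},
}


def selector_coefficients(raw_scores, numel, score_model="raw"):
    """Return the coefficients the selector minimizes."""
    if score_model == "raw":
        return {m: dict(s) for m, s in raw_scores.items()}
    if score_model == "per_element":
        # Normalize each score by the number of weight elements it represents.
        return {
            m: {f: v / numel[m] for f, v in s.items()}
            for m, s in raw_scores.items()
        }
    raise ValueError(f"unknown score_model: {score_model!r}")


def effective_bits(choice, numel, bits):
    """Weight-count-averaged bit width of a format assignment."""
    total = sum(numel.values())
    return sum(bits[choice[m]] * numel[m] for m in choice) / total


def select_formats(raw_scores, numel, bits, budget, score_model="raw"):
    """Brute-force search: lowest total coefficient within the bit budget."""
    coeffs = selector_coefficients(raw_scores, numel, score_model)
    modules = sorted(raw_scores)
    best_cost, best_choice = None, None
    for formats in product(sorted(bits), repeat=len(modules)):
        choice = dict(zip(modules, formats))
        # The budget check is identical for both score models.
        if effective_bits(choice, numel, bits) > budget:
            continue
        cost = sum(coeffs[m][f] for m, f in choice.items())
        if best_cost is None or cost < best_cost:
            best_cost, best_choice = cost, choice
    return best_choice


raw_choice = select_formats(RAW_SCORES, NUMEL, BITS, budget=7.0, score_model="raw")
per_element_choice = select_formats(
    RAW_SCORES, NUMEL, BITS, budget=7.0, score_model="per_element"
)
```

With this toy data the raw model spends the budget on `large_proj`, whose absolute score is larger, while the per-element model spends it on `small_proj`, whose score per weight is larger. Both assignments satisfy the same 7.0-bit budget.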
Existing behavior remains the default: `score_model="raw"`, `method="gradient"`, and `score_boundary="local"`.
Usage
Equivalent HF PTQ flags:
Testing
- 72 passed: focused non-distributed AutoQuant unit suite
- 1 passed: distributed AutoQuant checkpoint/search test
- 3 passed: HF PTQ argument tests
- `bash -n` passed for the modified HF PTQ shell entry points

Before your PR is "Ready for review"
- Make sure you read and follow Contributor guidelines and your commits are signed (`git commit -s -S`).
- Make sure you read and follow the Security Best Practices (e.g. avoiding hardcoded `trust_remote_code=True`, `torch.load(..., weights_only=False)`, `pickle`, etc.).
- CONTRIBUTING.md: N/A

Additional Information
The terms group scoring and recipe grouping are intentionally distinct: group scoring changes the output boundary used to measure a projection's perturbation, while recipe grouping constrains multiple modules to use the same quantization format.
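The distinction can be illustrated with a small pure-Python sketch. Everything here is an assumption for illustration rather than code from this PR: the toy gated MLP, the `fake_quant` rounding quantizer, and the `score` helper are hypothetical. The point is only that the boundary argument changes where the normalized reconstruction error is measured, while each projection is still perturbed and scored on its own, so nothing forces the projections to share a format.

```python
import random


def matvec(weight, x):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def fake_quant(weight, step=0.25):
    """Round-to-nearest uniform quantizer, a stand-in for a real format."""
    return [[round(w / step) * step for w in row] for row in weight]


def hidden_state(x, weights):
    """Input seen by down_proj in a toy gated MLP: relu(gate(x)) * up(x)."""
    gate = [max(0.0, g) for g in matvec(weights["gate_proj"], x)]
    up = matvec(weights["up_proj"], x)
    return [g * u for g, u in zip(gate, up)]


def mlp_group(x, weights):
    """Output of the whole MLP group, the shared 'group' boundary."""
    return matvec(weights["down_proj"], hidden_state(x, weights))


def normalized_error(ref, out):
    """Squared error relative to the squared norm of the reference."""
    num = sum((r - o) ** 2 for r, o in zip(ref, out))
    den = sum(r * r for r in ref)
    return num / den if den > 0.0 else 0.0


def score(name, weights, inputs, boundary="local"):
    """Quantize one projection only and measure error at the chosen boundary."""
    perturbed = {**weights, name: fake_quant(weights[name])}
    errors = []
    for x in inputs:
        if boundary == "local":
            # Error at the projection's own output.
            inp = hidden_state(x, weights) if name == "down_proj" else x
            ref, out = matvec(weights[name], inp), matvec(perturbed[name], inp)
        elif boundary == "group":
            # Error at the output of the group the projection belongs to.
            ref, out = mlp_group(x, weights), mlp_group(x, perturbed)
        else:
            raise ValueError(f"unknown boundary: {boundary!r}")
        errors.append(normalized_error(ref, out))
    return sum(errors) / len(errors)


rng = random.Random(0)
DIM, HIDDEN = 4, 6
WEIGHTS = {
    "gate_proj": [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(HIDDEN)],
    "up_proj": [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(HIDDEN)],
    "down_proj": [[rng.uniform(-1, 1) for _ in range(HIDDEN)] for _ in range(DIM)],
}
INPUTS = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(8)]

# One score per (projection, boundary); each projection keeps its own entry.
SCORES = {
    (name, boundary): score(name, WEIGHTS, INPUTS, boundary)
    for name in WEIGHTS
    for boundary in ("local", "group")
}
```

In this toy setup `down_proj` gets the same score at both boundaries because its own output already is the group output, whereas `gate_proj` and `up_proj` are measured after their perturbation has propagated through the rest of the group.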