Add weight norm rename in _load_state_dict_into_model #35123
What does this PR do?
#33275 fixed the old weight_norm parametrization renaming for _load_state_dict_into_meta_model, but the same issue also affects _load_state_dict_into_model when using an old version of torch, e.g. 1.13.
Namely, wav2vec2.encoder.pos_conv_embed.conv.weight_g and wav2vec2.encoder.pos_conv_embed.conv.weight_v are not loaded correctly for a model trained on a PyTorch version where weight_g and weight_v were renamed to original0 and original1.
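For context, the two torch weight_norm APIs register different parameter names for the same module, which is why keys saved under one API are not matched when loading under the other. A minimal sketch (the ">= 2.1" cutoff for the new API is an assumption, not something stated in this PR):

```python
import torch.nn as nn

# Old API (e.g. torch 1.13): registers weight_g / weight_v.
old_style = nn.utils.weight_norm(nn.Conv1d(16, 16, kernel_size=3), name="weight")
print(sorted(old_style.state_dict()))
# ['bias', 'weight_g', 'weight_v']

# New parametrization API (newer torch, assumed >= 2.1): registers
# parametrizations.weight.original0 / original1 instead.
new_style = nn.utils.parametrizations.weight_norm(nn.Conv1d(16, 16, kernel_size=3), name="weight")
print(sorted(new_style.state_dict()))
# ['bias', 'parametrizations.weight.original0', 'parametrizations.weight.original1']
```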
What happens is illustrated by these spots in src/transformers/modeling_utils.py at commit 7f95372: line 4712, line 666, and line 832.
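Below is a minimal sketch of the kind of key renaming this PR extends to _load_state_dict_into_model, mirroring what #33275 did for _load_state_dict_into_meta_model. rename_weight_norm_keys is a hypothetical helper for illustration, not the actual code added in modeling_utils.py:

```python
import torch.nn as nn

def rename_weight_norm_keys(state_dict):
    # Hypothetical helper illustrating the mapping; not the exact code in this PR.
    has_new_api = hasattr(nn.utils.parametrizations, "weight_norm")
    renamed = {}
    for key, value in state_dict.items():
        if has_new_api:
            # Running torch exposes the new parametrization API: map old checkpoint keys forward.
            key = key.replace("weight_g", "parametrizations.weight.original0")
            key = key.replace("weight_v", "parametrizations.weight.original1")
        else:
            # Old torch (e.g. 1.13): map new-style checkpoint keys back to weight_g / weight_v.
            key = key.replace("parametrizations.weight.original0", "weight_g")
            key = key.replace("parametrizations.weight.original1", "weight_v")
        renamed[key] = value
    return renamed
```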
Related to #31970.
Who can review?
cc @LysandreJik and @ArthurZucker