Add weight norm rename in _load_state_dict_into_model #35123
What does this PR do?
#33275 fixed the old weight_norm parametrization renaming for _load_state_dict_into_meta_model, but the same issue also affects _load_state_dict_into_model when using an old version of torch, e.g. 1.13.
Namely, wav2vec2.encoder.pos_conv_embed.conv.weight_g and wav2vec2.encoder.pos_conv_embed.conv.weight_v are not loaded correctly for a model trained on a PyTorch version where weight_g and weight_v were renamed to original0 and original1.
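For context, the two torch weight_norm APIs register different parameter names for the same module, which is why keys saved under one API are not matched when loading under the other. A minimal sketch (the ">= 2.1" cutoff for the new API is an assumption, not something stated in this PR):

```python
import torch.nn as nn

# Old API (e.g. torch 1.13): registers weight_g / weight_v.
old_style = nn.utils.weight_norm(nn.Conv1d(16, 16, kernel_size=3), name="weight")
print(sorted(old_style.state_dict()))
# ['bias', 'weight_g', 'weight_v']

# New parametrization API (newer torch, assumed >= 2.1): registers
# parametrizations.weight.original0 / original1 instead.
new_style = nn.utils.parametrizations.weight_norm(nn.Conv1d(16, 16, kernel_size=3), name="weight")
print(sorted(new_style.state_dict()))
# ['bias', 'parametrizations.weight.original0', 'parametrizations.weight.original1']
```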
What happens is illustrated by these spots in src/transformers/modeling_utils.py at commit 7f95372: line 4712, line 666, and line 832.
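Below is a minimal sketch of the kind of key renaming this PR extends to _load_state_dict_into_model, mirroring what #33275 did for _load_state_dict_into_meta_model. rename_weight_norm_keys is a hypothetical helper for illustration, not the actual code added in modeling_utils.py:

```python
import torch.nn as nn

def rename_weight_norm_keys(state_dict):
    # Hypothetical helper illustrating the mapping; not the exact code in this PR.
    has_new_api = hasattr(nn.utils.parametrizations, "weight_norm")
    renamed = {}
    for key, value in state_dict.items():
        if has_new_api:
            # Running torch exposes the new parametrization API: map old checkpoint keys forward.
            key = key.replace("weight_g", "parametrizations.weight.original0")
            key = key.replace("weight_v", "parametrizations.weight.original1")
        else:
            # Old torch (e.g. 1.13): map new-style checkpoint keys back to weight_g / weight_v.
            key = key.replace("parametrizations.weight.original0", "weight_g")
            key = key.replace("parametrizations.weight.original1", "weight_v")
        renamed[key] = value
    return renamed
```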
Related to #31970.
Who can review?
cc @LysandreJik and @ArthurZucker