[BUG][IMPORTANT] zero_to_fp32.py consolidated weights are all zero after this commit #6791

@npuichigo

Describe the bug
After commit dd40269, specifying max_shard_size when running zero_to_fp32.py generates empty weights.

To Reproduce
Use DeepSpeed v0.16.0 and run zero_to_fp32.py with max_shard_size specified.

Reason
According to the code at https://github.com/microsoft/DeepSpeed/blob/f743feca033515fdded50a98093da5a48eb41e74/deepspeed/utils/zero_to_fp32.py#L513-L529, the state_dict is overwritten in place instead of being copied, which is why the original state_dict becomes empty.
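
The pattern described above can be illustrated with a minimal sketch. This is not the DeepSpeed code: the function names (to_placeholders_in_place, to_placeholders_copy) and the plain Python lists standing in for tensors are assumptions made for illustration only. It shows how a helper that writes placeholder values back into the dict it was given destroys the caller's data, while a helper that builds a new dict does not.

```python
# Hypothetical stand-ins: lists of floats play the role of tensors.

def to_placeholders_in_place(state_dict):
    """Buggy pattern: overwrites the entries of the caller's dict."""
    for name, values in state_dict.items():
        # Reassigning an existing key while iterating is allowed in Python,
        # but it replaces the caller's real values with placeholders.
        state_dict[name] = [0.0] * len(values)
    return state_dict


def to_placeholders_copy(state_dict):
    """Safer pattern: builds a new dict and leaves the input untouched."""
    return {name: [0.0] * len(values) for name, values in state_dict.items()}


weights = {"layer.weight": [1.5, -2.0], "layer.bias": [0.25]}

# In-place version: the returned dict IS the input dict, so the
# original values are gone after the call.
buggy_input = {name: list(values) for name, values in weights.items()}
buggy_result = to_placeholders_in_place(buggy_input)

# Copying version: the input still holds the real values afterwards.
safe_input = {name: list(values) for name, values in weights.items()}
safe_result = to_placeholders_copy(safe_input)
```

In this sketch, anything that later reads buggy_input to write out shards would only see placeholder values, whereas safe_input still holds the real weights.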
