Incorrect hardcoded consolidated.pth path for Llama 3.2 11B Vision+Instruct Model #35049

Closed
strangiato opened this issue Dec 3, 2024 · 1 comment · Fixed by #35053

Comments

@strangiato
Contributor

When converting a model with the --num_shards 1 flag, the mllama convert_mllama_weights_to_hf.py script expects a file named consolidated.pth, which is hardcoded in the following line:

https://github.com/huggingface/transformers/blob/f9c7e6021e9a9a9fd3fc8bb291da9451066aeb8d/src/transformers/models/mllama/convert_mllama_weights_to_hf.py#L341C1-L342C1

loaded = [torch.load(os.path.join(input_base_path, "consolidated.pth"), map_location="cpu", mmap=True)]

However, the Llama 3.2 11B Vision Instruct model, although distributed as a single shard, uses consolidated.00.pth as the filename instead:

ll ~/.llama/checkpoints/Llama3.2-11B-Vision-Instruct
total 41524824
drwxr-xr-x@ 6 user  group   192B Dec  2 14:05 ./
drwxr-xr-x@ 5 user  group   160B Dec  2 14:05 ../
-rw-r--r--@ 1 user  group   156B Dec  2 14:05 checklist.chk
-rw-r--r--@ 1 user  group    20G Dec  2 14:14 consolidated.00.pth
-rw-r--r--@ 1 user  group   321B Dec  2 14:05 params.json
-rw-r--r--@ 1 user  group   2.1M Dec  2 14:05 tokenizer.model

This results in a file not found error:

FileNotFoundError: [Errno 2] No such file or directory: 'path-to-model/consolidated.pth'
@strangiato
Contributor Author

A quick fix would be to simply update the file reference to consolidated.00.pth, but I'm not sure whether that would break compatibility with models other than the 11B Vision Instruct model.

I'm happy to help contribute a fix if someone can weigh in on the possible compatibility issues.

Tagging @ArthurZucker since he seems to be the original author of this script.
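
One way to sidestep the compatibility question might be to try both filenames rather than hardcode either one. The following is only a rough sketch: `find_single_shard_checkpoint` and its `candidates` parameter are hypothetical names for illustration, not part of the existing script, and the actual fix in the linked PR may look different.

```python
import os


def find_single_shard_checkpoint(
    input_base_path,
    candidates=("consolidated.pth", "consolidated.00.pth"),
):
    """Return the path of the first candidate checkpoint file that exists.

    Hypothetical helper (an assumption, not code from transformers).
    The legacy name is tried first so directories that already work
    keep resolving to the same file.
    """
    for name in candidates:
        path = os.path.join(input_base_path, name)
        if os.path.isfile(path):
            return path
    # Neither filename was found: fail with a message listing what was tried.
    raise FileNotFoundError(
        f"No single-shard checkpoint found in {input_base_path!r}; "
        f"tried: {', '.join(candidates)}"
    )
```

The returned path could then be passed to torch.load in place of the hardcoded os.path.join(input_base_path, "consolidated.pth"), leaving the map_location="cpu" and mmap=True arguments unchanged.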

strangiato added a commit to strangiato/transformers that referenced this issue Dec 3, 2024
strangiato added a commit to strangiato/transformers that referenced this issue Dec 5, 2024
qubvel pushed a commit that referenced this issue Dec 10, 2024