
Conversation

@MekkCyber
Contributor

What does this PR do?

The model used to test the GGML conversion of falcon-7b in fp16 format is wrong:

[screenshot: tensor listing of the previous test model]

You can see that it contains some Q4 weights, which is unexpected in an `fp16` model, and its size is only 4GB when it should be around 7B parameters × 2 bytes ≈ 14GB. I did my own model conversion to GGUF to fix the issue:

[screenshot: tensor listing of the re-converted model]
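For anyone hitting the same problem, a quick way to catch it is to inspect the tensor dtypes recorded in the GGUF file itself. The sketch below uses the `gguf` Python package that ships with llama.cpp; the file name is a placeholder, and the snippet is an illustration rather than part of this PR:

```python
# Minimal sanity check for an "fp16" GGUF file, assuming the `gguf`
# package from llama.cpp (pip install gguf). The path is hypothetical.
from gguf import GGUFReader, GGMLQuantizationType

reader = GGUFReader("falcon-7b-f16.gguf")  # placeholder path

total_bytes = 0
for tensor in reader.tensors:
    total_bytes += int(tensor.n_bytes)
    # A genuine fp16 export should only contain F16 tensors (plus F32
    # for small tensors such as norms); any Q4_* entry means the file
    # is actually quantized despite its name.
    if tensor.tensor_type not in (GGMLQuantizationType.F16, GGMLQuantizationType.F32):
        print(f"unexpected dtype {tensor.tensor_type.name} in {tensor.name}")

# ~7B parameters at 2 bytes each should land near 14GB.
print(f"total tensor payload: {total_bytes / 1e9:.1f} GB")
```

Run against the old file, a check like this would flag the Q4 tensors and report a ~4GB payload, which is the mismatch described above showing up programmatically.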

Who can review?

@SunMarc

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@MekkCyber MekkCyber requested a review from SunMarc December 10, 2024 15:35
Member

@SunMarc SunMarc left a comment


Nice, thanks for uploading the right model! cc @Isotr0py

@MekkCyber MekkCyber merged commit 85eb339 into main Dec 16, 2024
12 checks passed
@MekkCyber MekkCyber deleted the fix_falcon_ggml branch December 16, 2024 12:21