Confusing error message #34658

Open
jpiabrantes opened this issue Nov 8, 2024 · 7 comments

@jpiabrantes

"A decoder-only architecture is being used, but right-padding was detected! For correct "

The error message says:

"A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set padding_side='left' when initializing the tokenizer."

But the code does not check whether tokenizer.padding_side == "left". Instead, it checks whether the last token id of the input is the padding token, which is often the case when people set tokenizer.pad_token_id = tokenizer.eos_token_id.
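For concreteness, here is a minimal sketch of the false positive (the model name and prompts are illustrative, and the exact condition inside generate() may differ slightly across transformers versions):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Padding is already on the left, as the warning recommends.
tok = AutoTokenizer.from_pretrained("gpt2", padding_side="left")
tok.pad_token_id = tok.eos_token_id  # common workaround for models without a pad token
model = AutoModelForCausalLM.from_pretrained("gpt2")

# A batch of prompts whose final token happens to be EOS (e.g. a turn terminator).
batch = tok(["Hello there", "Nice to meet you"], return_tensors="pt", padding=True)
eos_col = torch.full((batch.input_ids.shape[0], 1), tok.eos_token_id)
input_ids = torch.cat([batch.input_ids, eos_col], dim=-1)
attention_mask = torch.cat([batch.attention_mask, torch.ones_like(eos_col)], dim=-1)

# generate() only inspects the last column of input_ids; because it equals
# pad_token_id (== eos_token_id here), the right-padding warning fires even
# though padding_side is "left".
out = model.generate(input_ids, attention_mask=attention_mask, max_new_tokens=5)
```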

@LysandreJik
Member

Indeed! cc @gante, @zucchini-nlp

@Rocketknight1
Member

gentle ping @gante @zucchini-nlp

@zucchini-nlp
Member

@jpiabrantes sorry for the late reply. Do you mean there are cases where you want to generate from an input that ends with an EOS token?

The warning is mostly advice for beginners, since generating with right padding can produce gibberish or lower-quality text. So we point out the best practices to those who don't have much experience with generation. If you already have padding set on the correct side, you can ignore the warning.
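As a sketch of the practice the warning points to (model name and prompts are illustrative):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2", padding_side="left")
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

batch = tok(["The capital of France is", "Hi"], return_tensors="pt", padding=True)

# With left padding, the last position of every row is a real prompt token,
# so each sequence is continued from its own prompt rather than from pad tokens.
out = model.generate(**batch, max_new_tokens=10, pad_token_id=tok.eos_token_id)
print(tok.batch_decode(out, skip_special_tokens=True))
```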

@jpiabrantes
Author

jpiabrantes commented Dec 11, 2024 via email

@zucchini-nlp
Member

@jpiabrantes yep, we could check the tokenizer's attribute directly, but since the tokenizer is not a required kwarg when calling generate(), we opted to check the inputs instead. What I can think of now is to change the warning level so that the warning can be suppressed.
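Until then, one way to silence it is to lower the library's log verbosity; a sketch, assuming the message goes through the transformers logger (if your version emits it via Python's warnings module instead, a warnings filter applies):

```python
import warnings
from transformers.utils import logging

# If the message goes through the transformers logger: hide warning-level
# messages, keep errors.
logging.set_verbosity_error()

# If it is emitted via Python's warnings module instead:
warnings.filterwarnings("ignore", message="A decoder-only architecture is being used")
```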

some generations will reach eos faster than

True, but we don't try to generate from an EOS token, right? When we just pass inputs to the generate() method, the tokenizer doesn't add any EOS, so the prompt can be continued by an LLM.

@jpiabrantes
Author

jpiabrantes commented Dec 11, 2024 via email

@zucchini-nlp
Member

Yes, that's an option in case users decide to pass the tokenizer. The initial idea was to allow the tokenizer as an arg in special cases, like when we have stop_strings or assisted decoding. Would you like to open a PR for that? Feel free to tag gante and myself :)
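For reference, a sketch of the existing pattern where the tokenizer is already passed to generate() (here for stop_strings); a padding_side check could piggyback on the same kwarg whenever it is provided. Model name and prompt are illustrative:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2", padding_side="left")
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tok("Q: What is 2+2?\nA:", return_tensors="pt")
out = model.generate(
    **inputs,
    max_new_tokens=20,
    stop_strings=["\n"],   # stop_strings requires passing the tokenizer
    tokenizer=tok,         # when present, generate() could also verify padding_side
    pad_token_id=tok.eos_token_id,
)
print(tok.decode(out[0], skip_special_tokens=True))
```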
