Description
System Info
- transformers version: 4.44.2
- Platform: Linux-6.1.85+-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.24.7
- Safetensors version: 0.4.5
- Accelerate version: 0.34.2
- Accelerate config: not found
- PyTorch version (GPU?): 2.4.1+cu121 (False)
- Tensorflow version (GPU?): 2.17.0 (False)
- Flax version (CPU?/GPU?/TPU?): 0.8.5 (cpu)
- Jax version: 0.4.33
- JaxLib version: 0.4.33
- Using distributed or parallel set-up in script?: No
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Hey!
I noticed that top_p was silently failing, so I tested the rest of the generation parameters and found that no_repeat_ngram_size also silently fails for the same reason: the condition checks inside the _get_logits_processor() method prevent the respective processor classes from being instantiated, which is where the ValueErrors are raised. For instance, the raise ValueError(f"`ngram_size` has to be a strictly positive integer, but is {ngram_size}") error is never reached when we set no_repeat_ngram_size <= 0.
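To make the failure mode concrete, here is a minimal self-contained sketch of the gating pattern described above (this is illustrative plain Python, not the actual transformers source): the validity check lives inside the processor's constructor, but the guard that decides whether to build the processor already filters out invalid values, so the check never runs.

```python
# Sketch of the gating pattern, not the real transformers code.

class NoRepeatNGramProcessor:
    def __init__(self, ngram_size):
        # The validation exists here...
        if not isinstance(ngram_size, int) or ngram_size <= 0:
            raise ValueError(
                f"`ngram_size` has to be a strictly positive integer, but is {ngram_size}"
            )
        self.ngram_size = ngram_size

def get_logits_processors(no_repeat_ngram_size=None):
    processors = []
    # ...but this guard silently skips invalid values such as -1,
    # so the ValueError above is never raised for them.
    if no_repeat_ngram_size is not None and no_repeat_ngram_size > 0:
        processors.append(NoRepeatNGramProcessor(no_repeat_ngram_size))
    return processors

# An invalid value passes through without any error or warning:
print(get_logits_processors(no_repeat_ngram_size=-1))  # prints []
```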
Here is a simple example with invalid values where generation proceeds without notifying the user. Ideally, those should raise errors or warnings.
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "EleutherAI/pythia-14m"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
prompt = "hey there!"
inputs = tokenizer(prompt, return_tensors="pt")
generation_config = dict(do_sample=True, top_p=5, no_repeat_ngram_size=-1)
outputs = model.generate(input_ids=inputs['input_ids'], attention_mask=inputs['attention_mask'], **generation_config)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
Expected behavior
- Do not let things fail silently and proceed with a default value; instead, raise a ValueError or issue a warning to the user.
- It would be great if the generate method could fail early when invalid values are passed, maybe by checking for them upfront in _get_logits_processor before applying the generation parameters one by one and going through the entire process; this would help avoid wasting compute resources.
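As a rough illustration of the "fail early" idea, an upfront check could validate all parameters before any generation work starts. The validate_generation_args helper below is hypothetical (not an existing transformers function), and the accepted ranges are assumptions based on the errors the individual processors already raise:

```python
# Hypothetical upfront validation; the helper name and the exact
# accepted ranges are assumptions, not existing transformers behavior.

def validate_generation_args(top_p=None, no_repeat_ngram_size=None):
    if top_p is not None and not (0.0 <= top_p <= 1.0):
        raise ValueError(f"`top_p` has to be a float in [0, 1], but is {top_p}")
    if no_repeat_ngram_size is not None and (
        not isinstance(no_repeat_ngram_size, int) or no_repeat_ngram_size <= 0
    ):
        raise ValueError(
            f"`no_repeat_ngram_size` has to be a strictly positive integer, "
            f"but is {no_repeat_ngram_size}"
        )

# The invalid values from the repro above would now fail immediately:
try:
    validate_generation_args(top_p=5, no_repeat_ngram_size=-1)
except ValueError as e:
    print(e)
```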
I would be happy to open a PR to help address this issue if that’s possible, thank you for all your work!