Description
System Info
- transformers version: 4.44.2
- Platform: Linux-6.1.85+-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.24.7
- Safetensors version: 0.4.5
- Accelerate version: 0.34.2
- Accelerate config: not found
- PyTorch version (GPU?): 2.4.1+cu121 (False)
- Tensorflow version (GPU?): 2.17.0 (False)
- Flax version (CPU?/GPU?/TPU?): 0.8.5 (cpu)
- Jax version: 0.4.33
- JaxLib version: 0.4.33
- Using distributed or parallel set-up in script?: No
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Hey!
I noticed that top_p was silently failing, so I tested the rest of the generation parameters and found that no_repeat_ngram_size also silently fails for the same reason: the condition checks inside the _get_logits_processor() method prevent the respective processor classes from being instantiated, which is where the ValueErrors are raised. For instance, the raise ValueError(f"`ngram_size` has to be a strictly positive integer, but is {ngram_size}") error is never reached when we set no_repeat_ngram_size <= 0.
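To make the failure mode concrete, here is a minimal self-contained sketch of the gating pattern described above (this is illustrative plain Python, not the actual transformers source): the validity check lives inside the processor's constructor, but the guard that decides whether to build the processor already filters out invalid values, so the check never runs.

```python
# Sketch of the gating pattern, not the real transformers code.

class NoRepeatNGramProcessor:
    def __init__(self, ngram_size):
        # The validation exists here...
        if not isinstance(ngram_size, int) or ngram_size <= 0:
            raise ValueError(
                f"`ngram_size` has to be a strictly positive integer, but is {ngram_size}"
            )
        self.ngram_size = ngram_size

def get_logits_processors(no_repeat_ngram_size=None):
    processors = []
    # ...but this guard silently skips invalid values such as -1,
    # so the ValueError above is never raised for them.
    if no_repeat_ngram_size is not None and no_repeat_ngram_size > 0:
        processors.append(NoRepeatNGramProcessor(no_repeat_ngram_size))
    return processors

# An invalid value passes through without any error or warning:
print(get_logits_processors(no_repeat_ngram_size=-1))  # prints []
```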
Here is a simple example with invalid values where generation proceeds without notifying the user. Ideally, those should raise errors or warnings.
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "EleutherAI/pythia-14m"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
prompt = "hey there!"
inputs = tokenizer(prompt, return_tensors="pt")
generation_config = dict(do_sample=True, top_p=5, no_repeat_ngram_size=-1)
outputs = model.generate(input_ids=inputs['input_ids'], attention_mask=inputs['attention_mask'], **generation_config)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
Expected behavior
- Do not let things fail silently and proceed with a default value; instead, raise a ValueError or issue a warning to the user.
- It would be great if the generate method could fail early when invalid values are passed, maybe by checking for them upfront in _get_logits_processor before applying the generation parameters one by one and going through the entire process; this would help avoid wasting compute resources.
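As a rough illustration of the "fail early" idea, an upfront check could validate all parameters before any generation work starts. The validate_generation_args helper below is hypothetical (not an existing transformers function), and the accepted ranges are assumptions based on the errors the individual processors already raise:

```python
# Hypothetical upfront validation; the helper name and the exact
# accepted ranges are assumptions, not existing transformers behavior.

def validate_generation_args(top_p=None, no_repeat_ngram_size=None):
    if top_p is not None and not (0.0 <= top_p <= 1.0):
        raise ValueError(f"`top_p` has to be a float in [0, 1], but is {top_p}")
    if no_repeat_ngram_size is not None and (
        not isinstance(no_repeat_ngram_size, int) or no_repeat_ngram_size <= 0
    ):
        raise ValueError(
            f"`no_repeat_ngram_size` has to be a strictly positive integer, "
            f"but is {no_repeat_ngram_size}"
        )

# The invalid values from the repro above would now fail immediately:
try:
    validate_generation_args(top_p=5, no_repeat_ngram_size=-1)
except ValueError as e:
    print(e)
```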
I would be happy to open a PR to help address this issue if that’s possible, thank you for all your work!