Description
System Info
- transformers version: 4.43.3
- Platform: Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.23.2
- Safetensors version: 0.4.3
- Accelerate version: 0.30.1
- Accelerate config: not found
- PyTorch version (GPU?): 2.3.1+cu121 (True)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?: no
- Using GPU in script?: yes
- GPU type: NVIDIA GeForce RTX 4090 Laptop GPU
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
When generating short-form output (audio < 30 sec):
# inputs is from the processor
gen_kwargs = {
    "max_new_tokens": 400,
    "num_beams": 5,
    "temperature": None,
    "return_timestamps": False,
    "return_dict_in_generate": True,
    "num_return_sequences": 1,
    "output_scores": True,
    "language": "english",
}
pred_ids = self.model.generate(inputs, **gen_kwargs)
print(pred_ids.__class__)
print({k: type(v) for k, v in vars(pred_ids).items()})
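For completeness, here is a self-contained version of the repro. The checkpoint, device handling, and dummy audio are my assumptions (the kwargs suggest a Whisper-style speech model); substitute your own inputs:

import numpy as np
import torch
from transformers import WhisperForConditionalGeneration, WhisperProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"
processor = WhisperProcessor.from_pretrained("openai/whisper-tiny")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny").to(device)

# Dummy 5-second clip at 16 kHz; anything under 30 s takes the short-form path.
audio = np.zeros(16000 * 5, dtype=np.float32)
inputs = processor(audio, sampling_rate=16000, return_tensors="pt").input_features.to(device)

pred_ids = model.generate(inputs, **gen_kwargs)  # gen_kwargs as defined above
print(pred_ids.__class__)
print({k: type(v) for k, v in vars(pred_ids).items()})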
Expected behavior
GenerateBeamEncoderDecoderOutput seems to lose some fields in a recent version. (Maybe other output forms are also affected; I haven't checked.)
Bisecting transformers versions, in 4.42.4 the output looked like:
<class 'transformers.generation.utils.GenerateBeamEncoderDecoderOutput'>
{'sequences': <class 'torch.Tensor'>, 'sequences_scores': <class 'torch.Tensor'>, 'scores': <class 'tuple'>, 'logits': <class 'NoneType'>, 'beam_indices': <class 'torch.Tensor'>, 'encoder_attentions': <class 'NoneType'>, 'encoder_hidden_states': <class 'NoneType'>, 'decoder_attentions': <class 'NoneType'>, 'cross_attentions': <class 'NoneType'>, 'decoder_hidden_states': <class 'NoneType'>, 'past_key_values': <class 'tuple'>}
In 4.43.0 and later, sequences_scores and beam_indices became None:
<class 'transformers.generation.utils.GenerateBeamEncoderDecoderOutput'>
{'sequences': <class 'torch.Tensor'>, 'sequences_scores': <class 'NoneType'>, 'scores': <class 'tuple'>, 'logits': <class 'NoneType'>, 'beam_indices': <class 'NoneType'>, 'encoder_attentions': <class 'NoneType'>, 'encoder_hidden_states': <class 'NoneType'>, 'decoder_attentions': <class 'NoneType'>, 'cross_attentions': <class 'NoneType'>, 'decoder_hidden_states': <class 'NoneType'>, 'past_key_values': <class 'tuple'>}
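This also breaks downstream scoring. For example (my illustration, not part of the original repro), compute_transition_scores needs beam_indices to map per-step beam scores back onto the returned sequences, so on 4.43.x the call below no longer works as intended:

# Works on 4.42.4; on 4.43.x pred_ids.beam_indices is None.
transition_scores = model.compute_transition_scores(
    pred_ids.sequences,
    pred_ids.scores,
    beam_indices=pred_ids.beam_indices,
    normalize_logits=False,
)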
It looks like these get dropped in post-processing; a potential culprit is _stack_split_outputs, which looks like it changed in #30984.
Hacking in a check like if key in ["sequences", "beam_indices", "sequences_scores"]: fixes it, for example, although I'm not sure what's intended to be handled as tensors vs. tuples, so I'll defer on the best way to fix this.
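For reference, a sketch of the shape of that hack; the surrounding loop is my paraphrase, not the actual transformers source, so treat the structure as an assumption:

import torch

def stack_split_outputs_sketch(split_outputs, key):
    # Gather this field from each split output chunk.
    values = [getattr(out, key) for out in split_outputs]
    if any(v is None for v in values):
        return None
    if key in ["sequences", "beam_indices", "sequences_scores"]:
        # Per-sequence tensors: concatenate along the batch dimension.
        return torch.cat(values, dim=0)
    # Per-step fields (e.g. scores): tuples with one tensor per generated token.
    return tuple(torch.cat(step, dim=0) for step in zip(*values))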