
Some Whisper beam search output (sequences_scores, etc.) is lost in _stack_split_outputs #32373

Open
@drewhouston

Description

System Info

  • transformers version: 4.43.3
  • Platform: Linux-5.15.153.1-microsoft-standard-WSL2-x86_64-with-glibc2.35
  • Python version: 3.10.12
  • Huggingface_hub version: 0.23.2
  • Safetensors version: 0.4.3
  • Accelerate version: 0.30.1
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.3.1+cu121 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?: no
  • Using GPU in script?: yes
  • GPU type: NVIDIA GeForce RTX 4090 Laptop GPU

Who can help?

@sanchit-gandhi @kamilakesbi

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

When generating short-form output (audio under 30 seconds):

# `inputs` comes from the processor
gen_kwargs = {
    "max_new_tokens": 400,
    "num_beams": 5,
    "temperature": None,
    "return_timestamps": False,
    "return_dict_in_generate": True,
    "num_return_sequences": 1,
    "output_scores": True,
    "language": "english",
}
pred_ids = self.model.generate(inputs, **gen_kwargs)
# Print the output class and the type of each of its fields
print(pred_ids.__class__)
print({k: type(v) for k, v in vars(pred_ids).items()})

Expected behavior

GenerateBeamEncoderDecoderOutput should still populate sequences_scores and beam_indices, but in recent versions it returns them as None. (Other output types may also be affected; I haven't checked.)

Bisecting across transformers versions shows that in 4.42.4 the output looked like:

<class 'transformers.generation.utils.GenerateBeamEncoderDecoderOutput'>
{'sequences': <class 'torch.Tensor'>, 'sequences_scores': <class 'torch.Tensor'>, 'scores': <class 'tuple'>, 'logits': <class 'NoneType'>, 'beam_indices': <class 'torch.Tensor'>, 'encoder_attentions': <class 'NoneType'>, 'encoder_hidden_states': <class 'NoneType'>, 'decoder_attentions': <class 'NoneType'>, 'cross_attentions': <class 'NoneType'>, 'decoder_hidden_states': <class 'NoneType'>, 'past_key_values': <class 'tuple'>}

In 4.43.0 and later, sequences_scores and beam_indices became None:

<class 'transformers.generation.utils.GenerateBeamEncoderDecoderOutput'>
{'sequences': <class 'torch.Tensor'>, 'sequences_scores': <class 'NoneType'>, 'scores': <class 'tuple'>, 'logits': <class 'NoneType'>, 'beam_indices': <class 'NoneType'>, 'encoder_attentions': <class 'NoneType'>, 'encoder_hidden_states': <class 'NoneType'>, 'decoder_attentions': <class 'NoneType'>, 'cross_attentions': <class 'NoneType'>, 'decoder_hidden_states': <class 'NoneType'>, 'past_key_values': <class 'tuple'>}
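To compare versions without reading the full type dump, a small check along these lines could flag the dropped fields. This is only a sketch: `none_fields` and `FakeBeamOutput` are hypothetical names introduced here for illustration, and the stand-in class only mimics the attribute layout of the real output object.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Sequence


def none_fields(
    output: Any,
    expected: Sequence[str] = ("sequences_scores", "beam_indices"),
) -> List[str]:
    """Return the names in `expected` that are missing or None on `output`."""
    return [name for name in expected if getattr(output, name, None) is None]


# Hypothetical stand-in for the generate() output, so the check runs
# without loading a model. Lists stand in for tensors.
@dataclass
class FakeBeamOutput:
    sequences: Optional[Any] = None
    sequences_scores: Optional[Any] = None
    beam_indices: Optional[Any] = None


good = FakeBeamOutput(sequences=[[1, 2]], sequences_scores=[-0.1], beam_indices=[[0, 0]])
bad = FakeBeamOutput(sequences=[[1, 2]])

print(none_fields(good))
print(none_fields(bad))
```

In the reproduction above, the same helper could presumably be called as `none_fields(pred_ids)`; an empty list would correspond to the 4.42.4 behavior.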

These fields appear to be dropped in postprocessing; the likely culprit is _stack_split_outputs:

def _stack_split_outputs(self, seek_outputs, model_output_type, device, kwargs):

This function appears to have changed in #30984.

As a quick hack, using if key in ["sequences", "beam_indices", "sequences_scores"]: fixes it, for example. I'm not sure which fields are meant to be handled as tensors and which as tuples, so I'll defer on the best fix.
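The suspected failure mode can be illustrated with a toy model. This is not the transformers implementation: `toy_stack` and `ToyBeamOutput` are hypothetical names, plain lists stand in for tensors, and the sketch assumes that any key the stacking step does not explicitly handle is skipped and so falls back to the output type's default of None.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Sequence


# Hypothetical output type: every field defaults to None, so a key that
# is never copied over silently ends up as None.
@dataclass
class ToyBeamOutput:
    sequences: Optional[list] = None
    sequences_scores: Optional[list] = None
    beam_indices: Optional[list] = None


def toy_stack(segments: List[Dict[str, Any]], stacked_keys: Sequence[str]) -> ToyBeamOutput:
    """Merge per-segment dicts, keeping only the keys listed in `stacked_keys`."""
    merged: Dict[str, Any] = {}
    for key in segments[0]:
        if key in stacked_keys:
            # Collect this key's value from every segment (stand-in for stacking).
            merged[key] = [segment[key] for segment in segments]
        # Keys not listed are skipped (assumed behavior), leaving the default None.
    return ToyBeamOutput(**merged)


segments = [
    {"sequences": [1, 2, 3], "sequences_scores": -0.25, "beam_indices": [0, 0, 1]},
    {"sequences": [4, 5, 6], "sequences_scores": -0.5, "beam_indices": [0, 2, 2]},
]

before = toy_stack(segments, ["sequences"])
after = toy_stack(segments, ["sequences", "beam_indices", "sequences_scores"])

print(before.sequences_scores, before.beam_indices)
print(after.sequences_scores, after.beam_indices)
```

With only "sequences" handled, the other two fields come back as None; widening the key list restores them, which matches the effect of the hack described above.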
