Output dicts support in text generation pipeline #35092

jonasrohw · 2024-12-04T19:35:33Z

What does this PR do?

It is a minor fix to the text generation pipeline. When calling with the generation argument return_dict_in_generate=True, the code breaks because it does not expect a dict from the model.generate(...) call. If you want to view logits, for example, a dict is required. This fix can handle return_dict_in_generate=True as a pipeline parameter.
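As a rough illustration of the idea described above (not the code merged in this PR), the pipeline has to accept both shapes that a `generate`-style call can return: a bare batch of sequences, or a dict-like output whose `sequences` entry holds them alongside extras such as scores or logits. The helper name `split_generate_output` and the plain lists standing in for tensors are assumptions made for this sketch.

```python
from collections.abc import Mapping


def split_generate_output(output):
    """Separate generated sequences from any extra outputs.

    Hypothetical helper for illustration only. `output` is either a plain
    batch of sequences or a dict-like object with a "sequences" entry.
    """
    if isinstance(output, Mapping):
        # Dict-style output: pull out the sequences, keep everything else.
        sequences = output["sequences"]
        extras = {key: value for key, value in output.items() if key != "sequences"}
        return sequences, extras
    # Plain output: there are no extra fields to carry along.
    return output, {}


# Plain lists stand in for tensors here.
plain = [[1, 2, 3]]
as_dict = {"sequences": [[1, 2, 3]], "scores": [0.25]}

plain_sequences, plain_extras = split_generate_output(plain)
dict_sequences, dict_extras = split_generate_output(as_dict)
```

With this split, downstream post-processing can keep working on the sequences alone, while the extras are passed through to the caller if present.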

Fixes # (issue)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline, Pull Request section?
Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Did you write any new necessary tests?

Who can review?

@Rocketknight1

Rocketknight1

Hi - firstly, thanks for the PR, but I notice that in the code there's a lot more than just adding output dict support! It seems like you're also changing the handling for batched generation a lot.

I can understand the rationale there, but I think it would be better to split that into a separate PR. In other words, make this into a smaller PR that just adds output dict support, with a second PR focused on batching. WDYT?

Rocketknight1 · 2024-12-05T15:08:13Z

src/transformers/pipelines/text_generation.py

+            )
+
+        for key, value in other_outputs.items():
+            if isinstance(value, (torch.Tensor, tf.Tensor)) and value.shape[0] == out_b:


This line seems dangerous - remember that tf.Tensor may not be available or imported on most systems! You may have to make this a more complicated conditional that only checks tf.Tensor after checking is_tensorflow_available(), to ensure no issues on Torch-only systems (i.e. most of them)
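One way to address this concern, sketched under assumptions: build the tuple of tensor classes once, including only the frameworks that can actually be imported, and use that tuple in the `isinstance` check. The names `available_tensor_types`, `TENSOR_TYPES`, and `is_batched_tensor` are hypothetical. Inside transformers, availability helpers such as `is_torch_available()` and `is_tf_available()` would likely be used instead of the `try`/`except` imports shown here.

```python
def available_tensor_types():
    """Return a tuple of tensor classes for the frameworks that import cleanly."""
    types = []
    try:
        import torch

        types.append(torch.Tensor)
    except ImportError:
        pass  # Torch is not installed; skip it.
    try:
        import tensorflow as tf

        types.append(tf.Tensor)
    except ImportError:
        pass  # TensorFlow is not installed; skip it.
    return tuple(types)


# Computed once, so later checks never touch a missing module.
TENSOR_TYPES = available_tensor_types()


def is_batched_tensor(value, batch_size):
    """True if `value` is a known tensor whose first dimension equals `batch_size`."""
    # isinstance() with an empty tuple is always False, so this is safe
    # even when neither framework is installed.
    return isinstance(value, TENSOR_TYPES) and value.shape[0] == batch_size
```

The check `is_batched_tensor(value, out_b)` could then stand in for the inline `isinstance(value, (torch.Tensor, tf.Tensor))` condition without referencing `tf` on Torch-only systems.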

Rocketknight1 · 2024-12-05T15:10:21Z

src/transformers/pipelines/text_generation.py

+            for key, value in other_outputs.items():
+                if isinstance(value, (list, tuple)):
+                    record[key] = value[idx]
+                elif isinstance(value, (torch.Tensor, tf.Tensor)):


Same issue here

HuggingFaceDocBuilderDev · 2024-12-05T15:35:20Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

jonasrohw · 2024-12-07T10:17:23Z

@Rocketknight1 Hi. Thanks for looking at the PR and pointing that out. I agree with you; I got a bit ahead of myself. I've removed most of the changes and just made the changes to support dictionary output. I'd be happy if you could take a look. Thanks again!

jonasrohw · 2024-12-16T10:15:19Z

@Rocketknight1 Any update on this? Thanks!

Rocketknight1

I think this looks good to me now, and sorry for the delay! @gante since you're probably more familiar with return_dict_in_generate, can you take a look and make sure it's all okay before I ping a core maintainer?

jonasrohw · 2025-01-10T18:43:18Z

@gante Friendly reminder if you could take a look. Thank you very much!

jonasrohw · 2025-01-20T17:14:01Z

@Rocketknight1 Can't seem to get a hold of @gante. Any idea how to proceed?

Rocketknight1 · 2025-01-21T14:44:50Z

Hang on, I'll get him!

jonasrohw · 2025-01-28T17:51:26Z

@Rocketknight1 Any success in finding him?

gante · 2025-01-29T11:30:21Z

@jonasrohw I exist, but I'm being pinged in many places :D

gante

LGTM, thank you for expanding the capabilities of the pipeline :D

gante · 2025-01-29T11:33:22Z

(fixed conflicts, will merge as soon as CI gets green)

gante · 2025-01-29T11:49:15Z

(failing tests are unrelated to this PR, and are being fixed in other PRs. Waiting for them to be merged first)

* Support for generate_argument: return_dict_in_generate=True, instead of returning a error
* fix: call test with return_dict_in_generate=True
* fix: Only import torch if it is present
* update: Encapsulate output_dict changes
* fix: added back original comments

---------

Co-authored-by: Joao Gante <[email protected]>

jonasrohw and others added 4 commits December 4, 2024 20:23

Support for generate_argument: return_dict_in_generate=True, instead of returning a error

b6a6888

Merge branch 'huggingface:main' into output_dicts_support_in_text_generation_pipeline

5249dd3

fix: call test with return_dict_in_generate=True

30d595c

fix: Only import torch if it is present

d280c0b

Rocketknight1 reviewed Dec 5, 2024

jonasrohw added 2 commits December 7, 2024 10:05

update: Encapsulate output_dict changes

dc26319

fix: added back original comments

7f783ae

Rocketknight1 approved these changes Dec 16, 2024

gante approved these changes Jan 29, 2025

Merge branch 'main' into output_dicts_support_in_text_generation_pipeline

361910d

Merge branch 'main' into output_dicts_support_in_text_generation_pipeline

552299d

gante merged commit 23d782e into huggingface:main Jan 29, 2025
23 checks passed
