Conversation

@muellerzr
Contributor

What does this PR do?

This PR adds a @hub_retry decorator, which should be used on tests that raise any form of requests-like error when a download from the Hub fails for any reason. The decorator waits a beat and then reruns the test. I currently have it located only in the ModelTesterMixin class, but as time goes by and we see where else these failures pop up, we should use it in those mixins/classes as well.
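A minimal sketch of the idea, assuming a decorator-factory shape with the parameter names discussed later in the review (not necessarily the PR's exact code):

```python
# Illustrative sketch of a hub retry decorator; the PR's actual implementation may differ.
import functools
import time

import requests


def hub_retry(max_attempts: int = 5, wait_before_retry: float = 2.0):
    """Retry a test when a Hub download raises a requests-like error."""

    def decorator(test_func):
        @functools.wraps(test_func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return test_func(*args, **kwargs)
                except requests.exceptions.RequestException:
                    if attempt == max_attempts - 1:
                        raise
                    # Wait a beat, then rerun the test.
                    time.sleep(wait_before_retry)

        return wrapper

    return decorator


# Hypothetical usage on a single test:
# @hub_retry(max_attempts=5, wait_before_retry=2)
# def test_can_load_tiny_model(self):
#     ...
```

Catching requests.exceptions.RequestException here is just one reasonable reading of "requests-like error"; the real decorator may catch a broader set of Hub-related exceptions.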

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@ydshieh

@muellerzr muellerzr requested a review from ydshieh December 11, 2024 17:03
@BenjaminBossan
Member

This is a useful feature. I wonder if there is a possibility to use this for monkey patching from_pretrained. That way, it would be easier to use the retry mechanism in other packages. As an example, in PEFT we don't have a base class for all tests, so the trick with __init_subclass__ would not work.
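For readers outside transformers, the __init_subclass__ trick referred to here works roughly like the following hypothetical sketch (the import path and the exact hook in ModelTesterMixin are assumptions):

```python
# Hypothetical illustration of auto-wrapping test methods via __init_subclass__;
# not the PR's exact code.
from transformers.testing_utils import hub_retry  # assumed location of the decorator


class ModelTesterMixin:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Wrap every test_* method defined on the subclass with the retry decorator.
        for name, attr in list(cls.__dict__.items()):
            if name.startswith("test_") and callable(attr):
                setattr(cls, name, hub_retry()(attr))
```

Since the wrapping happens when a subclass of the mixin is created, a package without a shared test base class has no single place to hook this, which is what motivates the monkey-patching idea.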

@ydshieh
Collaborator

ydshieh commented Dec 11, 2024

@BenjaminBossan I was also thinking whether we could instead patch from_pretrained in conftest.py rather than making this change in __init__, but I can't judge whether that would be better than the current version.
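A rough sketch of what that conftest.py alternative could look like (hypothetical; the import path and the decorator-factory signature are assumptions):

```python
# conftest.py -- hypothetical sketch of patching from_pretrained globally for tests.
import pytest

from transformers import PreTrainedModel
from transformers.testing_utils import hub_retry  # assumed location of the decorator


@pytest.fixture(autouse=True)
def retry_hub_downloads(monkeypatch):
    # from_pretrained is a classmethod, so wrap the underlying function and
    # re-wrap it as a classmethod to keep subclass binding intact.
    original = PreTrainedModel.from_pretrained.__func__
    monkeypatch.setattr(
        PreTrainedModel, "from_pretrained", classmethod(hub_retry()(original))
    )
    yield
```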

@ydshieh
Collaborator

ydshieh commented Dec 11, 2024

Also, it looks like this could be applied to any function (a test, from_pretrained) directly, so

monkey patching from_pretrained

is just hub_retry(XXX_class.from_pretrained) maybe

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@BenjaminBossan
Member

is just hub_retry(XXX_class.from_pretrained) maybe

Right, this could work. Do you know if there is a single method that can be patched that would work for all transformers auto models?

@ydshieh
Collaborator

ydshieh commented Dec 11, 2024

Not a single method, though.

from_pretrained on _BaseAutoModelClass will work for all the auto models, but there are also the auto tokenizers/processors etc.
And there is also PreTrainedModel if auto is not used (and similarly for tokenizers/processors etc.).
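To make that concrete, the entry points that would each need the same wrapping look roughly like this (indicative, not exhaustive):

```python
# Indicative list of from_pretrained entry points that would each need patching
# (same classmethod-wrapping approach as in the conftest.py sketch above).
from transformers import AutoTokenizer, PreTrainedModel, PreTrainedTokenizerBase
from transformers.models.auto.auto_factory import _BaseAutoModelClass

classes_to_patch = [
    _BaseAutoModelClass,      # covers AutoModel, AutoModelForCausalLM, ...
    PreTrainedModel,          # direct model classes, e.g. BertModel
    AutoTokenizer,            # auto tokenizer entry point
    PreTrainedTokenizerBase,  # direct tokenizer classes
    # ... plus AutoProcessor, AutoImageProcessor, ProcessorMixin, etc.
]
```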

Args:
    max_attempts (`int`, *optional*, defaults to 5):
        The maximum number of attempts to retry the flaky test.
    wait_before_retry (`float`, *optional*, defaults to 2):
Member

Wondering if it's better to use exponential backoff instead of a fixed waiting time?
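For reference, an exponential backoff would only change the sleep inside the retry loop; a minimal hedged sketch:

```python
import time


def wait_before_next_attempt(attempt: int, base_wait: float = 2.0) -> None:
    # Exponential backoff: sleep 2s, 4s, 8s, ... for attempts 0, 1, 2, ...
    # instead of the same base_wait each time.
    time.sleep(base_wait * (2 ** attempt))
```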

Contributor Author

We can see if fixed gets the job done before overengineering? :)

Contributor Author

Also, @Wauplin may just put this in huggingface_hub directly, so this might (hopefully) just be temporary.

Contributor

Maybe at some point, yes, but not short-term, so please go ahead with this PR :) (cc @hanouticelina for viz')

@muellerzr
Contributor Author

For now going with what's in here (to just get it merged)

Member

@BenjaminBossan BenjaminBossan left a comment

Thanks for picking this back up. LGTM.

Collaborator

@ydshieh ydshieh left a comment

Thank you for finalizing this! Hope our life is easier with this 🙏

The test is written in a class that requires torch, but that test is super simple and doesn't need torch. We can move it later, no big deal.

Let's merge and see how it goes 🚀

@ydshieh ydshieh merged commit 41925e4 into main Feb 25, 2025
24 checks passed