Conversation

@muellerzr
Contributor

What does this PR do?

This PR adds a @hub_retry decorator, which should be used on tests that raise any form of requests-like error when a download from the Hub fails for any reason. The decorator waits a beat and then reruns the test. I currently have it located only in the ModelTesterMixin class, but as time goes by and we see where else these failures pop up, we should use it in those mixins/classes as well.
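A minimal sketch of the idea, assuming a decorator-factory shape with the parameter names discussed later in the review (not necessarily the PR's exact code):

```python
# Illustrative sketch of a hub retry decorator; the PR's actual implementation may differ.
import functools
import time

import requests


def hub_retry(max_attempts: int = 5, wait_before_retry: float = 2.0):
    """Retry a test when a Hub download raises a requests-like error."""

    def decorator(test_func):
        @functools.wraps(test_func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return test_func(*args, **kwargs)
                except requests.exceptions.RequestException:
                    if attempt == max_attempts - 1:
                        raise
                    # Wait a beat, then rerun the test.
                    time.sleep(wait_before_retry)

        return wrapper

    return decorator


# Hypothetical usage on a single test:
# @hub_retry(max_attempts=5, wait_before_retry=2)
# def test_can_load_tiny_model(self):
#     ...
```

Catching requests.exceptions.RequestException here is just one reasonable reading of "requests-like error"; the real decorator may catch a broader set of Hub-related exceptions.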

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@ydshieh

@muellerzr muellerzr requested a review from ydshieh December 11, 2024 17:03
@BenjaminBossan
Member

This is a useful feature. I wonder if there is a possibility to use this for monkey patching from_pretrained. That way, it would be easier to use the retry mechanism in other packages. As an example, in PEFT we don't have a base class for all tests, so the trick with __init_subclass__ would not work.
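For readers outside transformers, the __init_subclass__ trick referred to here works roughly like the following hypothetical sketch (the import path and the exact hook in ModelTesterMixin are assumptions):

```python
# Hypothetical illustration of auto-wrapping test methods via __init_subclass__;
# not the PR's exact code.
from transformers.testing_utils import hub_retry  # assumed location of the decorator


class ModelTesterMixin:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Wrap every test_* method defined on the subclass with the retry decorator.
        for name, attr in list(cls.__dict__.items()):
            if name.startswith("test_") and callable(attr):
                setattr(cls, name, hub_retry()(attr))
```

Since the wrapping happens when a subclass of the mixin is created, a package without a shared test base class has no single place to hook this, which is what motivates the monkey-patching idea.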

@ydshieh
Collaborator

ydshieh commented Dec 11, 2024

@BenjaminBossan I was also thinking whether we could instead patch from_pretrained in conftest.py rather than making this change in __init__, but I can't judge whether that would be better than the current version.
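A rough sketch of what that conftest.py alternative could look like (hypothetical; the import path and the decorator-factory signature are assumptions):

```python
# conftest.py -- hypothetical sketch of patching from_pretrained globally for tests.
import pytest

from transformers import PreTrainedModel
from transformers.testing_utils import hub_retry  # assumed location of the decorator


@pytest.fixture(autouse=True)
def retry_hub_downloads(monkeypatch):
    # from_pretrained is a classmethod, so wrap the underlying function and
    # re-wrap it as a classmethod to keep subclass binding intact.
    original = PreTrainedModel.from_pretrained.__func__
    monkeypatch.setattr(
        PreTrainedModel, "from_pretrained", classmethod(hub_retry()(original))
    )
    yield
```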

@ydshieh
Collaborator

ydshieh commented Dec 11, 2024

Also, it looks like this could be applied to any function (a test, from_pretrained) directly, so

monkey patching from_pretrained

is just hub_retry(XXX_class.from_pretrained) maybe

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@BenjaminBossan
Member

is just hub_retry(XXX_class.from_pretrained) maybe

Right, this could work. Do you know if there is a single method that can be patched that would work for all transformers auto models?

@ydshieh
Collaborator

ydshieh commented Dec 11, 2024

Not a single method, though.

from_pretrained on _BaseAutoModelClass will work for all the auto models, but there are also the auto tokenizers/processors etc.
And there is also PreTrainedModel if auto is not used (and similarly for tokenizers/processors etc.).
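To make that concrete, the entry points that would each need the same wrapping look roughly like this (indicative, not exhaustive):

```python
# Indicative list of from_pretrained entry points that would each need patching
# (same classmethod-wrapping approach as in the conftest.py sketch above).
from transformers import AutoTokenizer, PreTrainedModel, PreTrainedTokenizerBase
from transformers.models.auto.auto_factory import _BaseAutoModelClass

classes_to_patch = [
    _BaseAutoModelClass,      # covers AutoModel, AutoModelForCausalLM, ...
    PreTrainedModel,          # direct model classes, e.g. BertModel
    AutoTokenizer,            # auto tokenizer entry point
    PreTrainedTokenizerBase,  # direct tokenizer classes
    # ... plus AutoProcessor, AutoImageProcessor, ProcessorMixin, etc.
]
```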

Args:
    max_attempts (`int`, *optional*, defaults to 5):
        The maximum number of attempts to retry the flaky test.
    wait_before_retry (`float`, *optional*, defaults to 2):
Member

Wondering if it's better to use exponential backoff instead of a fixed waiting time?
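For reference, an exponential backoff would only change the sleep inside the retry loop; a minimal hedged sketch:

```python
import time


def wait_before_next_attempt(attempt: int, base_wait: float = 2.0) -> None:
    # Exponential backoff: sleep 2s, 4s, 8s, ... for attempts 0, 1, 2, ...
    # instead of the same base_wait each time.
    time.sleep(base_wait * (2 ** attempt))
```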

Contributor Author

We can see if fixed gets the job done before overengineering? :)

Contributor Author

Also, @Wauplin may just put this in huggingface_hub directly, so this might (hopefully) just be temporary.

Contributor

Maybe at some point, yes, but not short-term, so please go ahead with this PR :) (cc @hanouticelina for viz')

@muellerzr
Contributor Author

For now going with what's in here (to just get it merged)

Member

@BenjaminBossan BenjaminBossan left a comment

Thanks for picking this back up. LGTM.

Collaborator

@ydshieh ydshieh left a comment

Thank you for finalizing this! Hope our life is easier with this 🙏

The test is written in a class that requires torch, but that test is super simple and doesn't need torch. We can move it later, no big deal.

Let's merge and see how it goes 🚀

@ydshieh ydshieh merged commit 41925e4 into main Feb 25, 2025
24 checks passed