
Conversation

@winglian (Contributor) commented on Dec 6, 2024

What does this PR do?

A bug introduced in #34915 inverted the logic for passing loss kwargs to the forward pass.
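
For context, a minimal sketch of the kind of inversion being described; `model_accepts_loss_kwargs` and `num_items_in_batch` follow the Trainer naming, but the function below is an illustrative stand-in, not the actual diff:

```python
# Illustrative sketch only, not the actual Trainer code.
def compute_loss(model, inputs, num_items_in_batch=None, model_accepts_loss_kwargs=True):
    loss_kwargs = {}

    # Buggy version (inverted guard): the check on model_accepts_loss_kwargs had
    # the opposite sense, so the loss kwargs were forwarded in exactly the wrong case.
    # if not model_accepts_loss_kwargs and num_items_in_batch is not None:
    #     loss_kwargs["num_items_in_batch"] = num_items_in_batch

    # Fixed version: only attach the kwargs when the model's forward accepts them.
    if model_accepts_loss_kwargs and num_items_in_batch is not None:
        loss_kwargs["num_items_in_batch"] = num_items_in_batch

    return model(**inputs, **loss_kwargs)
```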

Fixes # (issue)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@winglian changed the title from "logic was inverted" to "logic was inverted for passing loss_kwargs to forward pass" on Dec 6, 2024
@winglian marked this pull request as a draft on December 6, 2024 at 21:02
@winglian force-pushed the fix-num_items_in_batch-check branch from 4dce79b to c605d68 on December 6, 2024 at 21:12
@ArthurZucker (Collaborator) left a comment

Linked to #35113, will do a patch for this

loss = None
if labels is not None:
-    loss = self.loss_function(logits=logits, labels=labels, vocab_size=self.config.vocab_size, **kwargs)
+    loss = self.loss_function(logits=logits, labels=labels, vocab_size=self.config.vocab_size, num_items_in_batch=num_items_in_batch, **kwargs)
Collaborator

no, the whole point of having kwargs here is to not introduce unwanted training-specific args 😓

@winglian (Contributor, Author)

right, the problem is I have to pop it from kwargs to keep from passing it to model(...) above
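
A hypothetical sketch of the pattern under discussion (not code from this PR): `num_items_in_batch` arrives inside `**kwargs`, is popped out before the inner `model(...)` call so the model never sees it, and is then handed to the loss function explicitly. The function name and the assumption that `outputs` exposes `.logits` are illustrative only.

```python
# Hypothetical illustration of the pop-then-forward pattern discussed above;
# forward_with_loss and the outputs.logits attribute are stand-ins, not code from this PR.
def forward_with_loss(model, loss_function, input_ids, labels=None, **kwargs):
    # Pop the training-specific kwarg so it is not passed on to model(...).
    num_items_in_batch = kwargs.pop("num_items_in_batch", None)

    outputs = model(input_ids, **kwargs)
    logits = outputs.logits

    loss = None
    if labels is not None:
        # Pass it explicitly to the loss function instead.
        loss = loss_function(
            logits=logits,
            labels=labels,
            num_items_in_batch=num_items_in_batch,
            **kwargs,
        )
    return loss, outputs
```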
