[WIP] Add flex attention for Qwen2VL #35112
Conversation
```python
score += causal_mask[b][0][q_idx][kv_idx]
return score

attn_output, attn_weights = flex_attention(
```
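For context, here is a minimal, self-contained sketch of the FlexAttention pattern quoted above: a `score_mod` that adds a precomputed additive mask bias, passed to `torch.nn.attention.flex_attention.flex_attention` (PyTorch >= 2.5). The tensor shapes, the `causal_mask` construction, and the use of `return_lse=True` are illustrative assumptions, not the PR's actual implementation.

```python
# Minimal sketch (not the PR's code): exercising torch's flex_attention with a
# score_mod that adds a precomputed additive mask, assuming torch >= 2.5.
import torch
from torch.nn.attention.flex_attention import flex_attention

batch, heads, q_len, kv_len, head_dim = 1, 8, 16, 16, 64
query = torch.randn(batch, heads, q_len, head_dim)
key = torch.randn(batch, heads, kv_len, head_dim)
value = torch.randn(batch, heads, kv_len, head_dim)

# Illustrative additive mask of shape [batch, 1, q_len, kv_len]:
# 0.0 where attention is allowed, a large negative value where it is masked.
causal_mask = torch.triu(
    torch.full((q_len, kv_len), -1e9), diagonal=1
)[None, None, :, :].expand(batch, 1, q_len, kv_len)

def score_mod(score, b, h, q_idx, kv_idx):
    # Add the mask bias for this (batch, query, key) position, as in the hunk above.
    return score + causal_mask[b][0][q_idx][kv_idx]

# return_lse=True additionally returns the log-sum-exp of the attention scores;
# whether the PR surfaces that tensor as "attn_weights" is an assumption here.
attn_output, lse = flex_attention(query, key, value, score_mod=score_mod, return_lse=True)
print(attn_output.shape, lse.shape)  # (1, 8, 16, 64), (1, 8, 16)
```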
I'm having a hard time getting tests to pass with flex attention.
Can I merge the refactor for now, and add flex attention as a follow-up?
ArthurZucker left a comment
Hey, sorry for the late review! We ended up refactoring the API a bit more, and I was off for a week! 🤗
Hope we did not deter you from contributing, and thanks for opening the PR! 🤗
What does this PR do?
Addresses #34809 (issue)
Who can review?
@ArthurZucker