Move unschedulable pods to the active queue if they are not retried for more than 1 minute #72558
Hi @denkensk. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/ok-to-test |
Force-pushed: ab21df1 to 1714a9f
|
/assign I will try to do a round of review. |
|
/kind bug |
Force-pushed: 95c0e4a to 40a9a5c
Force-pushed: 40a9a5c to de8cfdc
bsalamat
left a comment
/lgtm
/approve
        Namespace: "ns1",
        UID:       types.UID("tp-mid"),
    },
    Spec: v1.PodSpec{
Priority and NominatedNodeName are not really necessary here.
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: bsalamat, denkensk. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment. |
@Huang-Wei Yes, PR #73022 should also be included. I'm not sure if this change needs to be backported to other releases. |
|
Thanks.
It seems so, see #72925 (comment). But let's hold for now. |
@denkensk @Huang-Wei Yes, this PR should be backported to 1.11+. Could you please send cherrypick PRs? |
Yes, I will send cherrypick PRs. |
…58-upstream-release-1.13 Automated cherry pick of #72558: add goroutine to move unschedulablepods to activeq regularly 1.13
…58-upstream-release-1.11 Automated cherry pick of #72558: add goroutine to move unschedulablepods to activeq regularly 1.11
…58-upstream-release-1.12 Automated cherry pick of #72558: add goroutine to move unschedulablepods to activeq regularly 1.12
What type of PR is this?
What this PR does / why we need it:
The scheduler places unschedulable pods in the "unschedulable" queue and retries them only when certain events occur that could make them schedulable. This logic works well in almost all scenarios, but the race conditions that are inevitable in large distributed systems can cause some events to be seen before the pods are added to the unschedulable queue. If this happens, pods may be left in the unschedulable queue and not retried. Such scenarios should be rare, and even when they occur, other events usually trigger a retry and cover them. However, in smaller, low-churn clusters, other events may not be seen for a while, and pods may be stuck in the unschedulable queue for a long time.
Which issue(s) this PR fixes:
Fixes #72122
Special notes for your reviewer: