Should move all unscheduable pods when we received move request to active queue #73309
/sig scheduling
6dcc5de to c6cbd4c
pkg/scheduler/factory/factory.go
Outdated
I can see your point, but adding an interface method named "Cycle()" that exposes an implementation detail of the queue to the outside doesn't look good.
We may need to think twice about this.
/assign
bsalamat
left a comment
I have a few minor comments. I agree with @resouer's point that exposing scheduling cycle from the scheduling queue is not elegant. A different implementation could be to add the cycle to the scheduler and increment it at the beginning of every scheduling cycle. The scheduler could pass the cycle as an argument to AddUnschedulableIfNotPresent and MoveAllToActiveQueue. However, this approach is not great either and has its own problems. It may in fact be worse in that the scheduler would keep track of scheduling cycles while the only user of the cycle is the scheduling queue.
So, I am fine with the implementation done in this PR.
Please rename this to SchedulingCycle. One could think that this is the queue cycle.
Rename this to moveRequestCycle.
schedulingCycle
s/cycle/podSchedulingCycle/
I would recommend using p.lock.RLock() and p.lock.RUnlock() here. :)
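To illustrate the suggestion: the review does not show which method "here" refers to, so this sketch assumes it is the accessor that returns the scheduling cycle. The type `cycleQueue` and its `pop` method are hypothetical stand-ins, not code from the PR. A method that only reads shared state can take the shared read lock, while a method that mutates it still needs the exclusive lock.

```go
package main

import "sync"

// cycleQueue is a hypothetical stand-in for the PR's PriorityQueue,
// reduced to the fields needed to show the locking pattern.
type cycleQueue struct {
	lock            sync.RWMutex
	schedulingCycle int64
}

// SchedulingCycle only reads state, so a shared read lock is enough.
// Multiple readers may hold RLock at the same time.
func (q *cycleQueue) SchedulingCycle() int64 {
	q.lock.RLock()
	defer q.lock.RUnlock()
	return q.schedulingCycle
}

// pop mutates state, so it takes the exclusive write lock.
func (q *cycleQueue) pop() {
	q.lock.Lock()
	defer q.lock.Unlock()
	q.schedulingCycle++
}
```

With `sync.RWMutex`, concurrent callers of `SchedulingCycle` do not block each other; they only wait while a writer such as `pop` holds the lock.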
Thanks!
3dd2441 to 5328984
/test pull-kubernetes-e2e-gce-100-performance
/retest
We need to make this comment general and straightforward for upcoming developers to implement their own scheduling queue. For example, at interface level, there should be no concept of activeQ or unschedulableQ.
How about:
// AddUnschedulableIfNotPresent adds an unschedulable pod back to scheduling queue.
// The podSchedulingCycle represents the current scheduling cycle number which can be
// returned by calling SchedulingCycle().
You are right. We should not assume how interface is implemented.
ditto, let's make the comment straightforward and give tips for developers. My stupid example is like:
// SchedulingCycle returns the current number of scheduling cycle which is cached by scheduling queue.
// Normally, incrementing this number whenever a pod is popped (e.g. called Pop()) is enough.
Updated. I think this is good enough. PTAL.
Just left two comments to make sure the newly added interfaces are understandable for other contributors or people who try to implement their own scheduling queue. PTAL
- add incremental scheduling cycle
- instead of setting a flag on a move request, we cache the current scheduling cycle in moveRequestCycle
- when unschedulable pods are added back, compare their cycle with moveRequestCycle to decide whether they should be added into the active queue or not
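The three points above can be sketched as follows. This is a minimal illustration, not the PriorityQueue from the PR: the `Pod` type, `sketchQueue`, `newSketchQueue`, and `Add` are hypothetical names, backoff and priority ordering are left out, and the initial value of -1 for `moveRequestCycle` is an assumption of this sketch. The method names `Pop`, `SchedulingCycle`, `MoveAllToActiveQueue`, and `AddUnschedulableIfNotPresent` follow the discussion in this thread.

```go
package main

import "sync"

// Pod is a hypothetical stand-in for *v1.Pod; only the name matters here.
type Pod struct{ Name string }

// sketchQueue is an illustrative reduction of a scheduling queue, not the
// PriorityQueue from the PR. Backoff and priority ordering are omitted.
type sketchQueue struct {
	lock             sync.RWMutex
	activeQ          []*Pod
	unschedulableQ   map[string]*Pod
	schedulingCycle  int64 // incremented each time a pod is popped
	moveRequestCycle int64 // scheduling cycle when the last move request arrived
}

func newSketchQueue() *sketchQueue {
	// -1 means "no move request seen yet" (an assumption of this sketch).
	return &sketchQueue{unschedulableQ: map[string]*Pod{}, moveRequestCycle: -1}
}

// Add puts a new pod into the active queue.
func (q *sketchQueue) Add(p *Pod) {
	q.lock.Lock()
	defer q.lock.Unlock()
	q.activeQ = append(q.activeQ, p)
}

// Pop removes the head of the active queue and starts a new scheduling cycle.
// It returns nil when the active queue is empty.
func (q *sketchQueue) Pop() *Pod {
	q.lock.Lock()
	defer q.lock.Unlock()
	if len(q.activeQ) == 0 {
		return nil
	}
	p := q.activeQ[0]
	q.activeQ = q.activeQ[1:]
	q.schedulingCycle++
	return p
}

// SchedulingCycle returns the current scheduling cycle number.
func (q *sketchQueue) SchedulingCycle() int64 {
	q.lock.RLock()
	defer q.lock.RUnlock()
	return q.schedulingCycle
}

// MoveAllToActiveQueue moves every parked pod to the active queue and records
// the cycle in which the move request was received, instead of setting a flag.
func (q *sketchQueue) MoveAllToActiveQueue() {
	q.lock.Lock()
	defer q.lock.Unlock()
	for name, p := range q.unschedulableQ {
		q.activeQ = append(q.activeQ, p)
		delete(q.unschedulableQ, name)
	}
	q.moveRequestCycle = q.schedulingCycle
}

// AddUnschedulableIfNotPresent re-queues a pod that failed scheduling in
// podSchedulingCycle. If a move request arrived in that cycle or later, the
// pod is retried right away; otherwise it is parked as unschedulable.
func (q *sketchQueue) AddUnschedulableIfNotPresent(p *Pod, podSchedulingCycle int64) {
	q.lock.Lock()
	defer q.lock.Unlock()
	if _, ok := q.unschedulableQ[p.Name]; ok {
		return // already parked
	}
	for _, a := range q.activeQ {
		if a.Name == p.Name {
			return // already waiting to be scheduled
		}
	}
	if q.moveRequestCycle >= podSchedulingCycle {
		q.activeQ = append(q.activeQ, p)
		return
	}
	q.unschedulableQ[p.Name] = p
}
```

A caller would read `SchedulingCycle()` right after `Pop()` and pass that value back if the pod turns out to be unschedulable. Because the cycle number only grows, a move request from an earlier cycle can no longer affect pods scheduled later, which a single boolean flag cannot express.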
0e0d4cb to ba47bef
/retest
bsalamat
left a comment
/lgtm
/approve
Thanks, @cofyc!
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: bsalamat, cofyc
Talked with @Huang-Wei, I cherry picked this PR into 1.12 (#73567) and 1.13 (#73568)
// queue and clears p.receivedMoveRequest.
func (p *PriorityQueue) AddUnschedulableIfNotPresent(pod *v1.Pod) error {
// p.moveRequestCycle > podSchedulingCycle or to backoff queue if p.moveRequestCycle
// <= podSchedulingCycle but pod is subject to backoff. In other cases, it adds pod to
Looking at the code on line 414, it seems the condition should be (without =) :
podSchedulingCycle > p.moveRequestCycle
Greater than or equal is correct. In fact, it is very important to have =. Equal means that a move request is received in the same scheduling cycle as the pod was being scheduled. So, it is likely that the scheduler has not seen the changes that led to the move request yet and didn't use those changes when making a scheduling decision. As a result, the pod must be retried if it is unschedulable.
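The difference between the two comparisons can be made concrete with a small sketch. The function names `retryWithGE` and `retryWithGT` are hypothetical and exist only for this illustration; the first mirrors the "greater than or equal" check defended above, the second the strict variant proposed in the question.

```go
package main

// retryWithGE reports whether an unschedulable pod should be retried
// immediately, using the ">=" comparison: a move request received in the
// same scheduling cycle as the pod, or in a later one, triggers a retry.
func retryWithGE(moveRequestCycle, podSchedulingCycle int64) bool {
	return moveRequestCycle >= podSchedulingCycle
}

// retryWithGT is the strict variant. It misses the case where the move
// request arrived while the pod was still being scheduled (same cycle),
// so the pod would be parked even though the cluster has changed.
func retryWithGT(moveRequestCycle, podSchedulingCycle int64) bool {
	return moveRequestCycle > podSchedulingCycle
}
```

For a pod scheduled in cycle 5 and a move request also recorded in cycle 5, `retryWithGE(5, 5)` is true while `retryWithGT(5, 5)` is false. That equal case is exactly the one described above: the scheduler may not have seen the change that caused the move request when it made its decision.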
…upstream-release-1.12 Automated cherry pick of #73309: Should move all unscheduable pods when we received move request to active queue
…upstream-release-1.13 Automated cherry pick of #73309: Should move all unscheduable pods when we received move request to active queue
What type of PR is this?
What this PR does / why we need it:
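Fix weakness of current receivedMoveRequest:
- add incremental scheduling cycle
- instead of setting a flag on a move request, we cache the current scheduling cycle in moveRequestCycle
- when unschedulable pods are added back, compare their cycle with moveRequestCycle to decide whether they should be added into the active queue or not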
See #73216 (comment) and #73078.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?: