@cofyc
Member

@cofyc cofyc commented Jan 25, 2019

What type of PR is this?

/kind bug

What this PR does / why we need it:

Fix a weakness of the current receivedMoveRequest flag:

  • add an incrementing scheduling cycle
  • instead of setting a flag on a move request, cache the current scheduling
    cycle in moveRequestCycle
  • when unschedulable pods are added back, compare their scheduling cycle with
    moveRequestCycle to decide whether they should be added to the active queue
    or not
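
The three steps above can be sketched roughly as follows. This is a toy model, not the code in this PR: toyQueue, its string pods, the -1 initial value, and the missing backoff queue and presence check are all assumptions made to keep the example short.

```go
package main

import (
	"fmt"
	"sync"
)

// toyQueue is a hypothetical, heavily simplified stand-in for PriorityQueue.
// Pods are plain strings and the backoff queue is left out.
type toyQueue struct {
	lock             sync.Mutex
	activeQ          []string
	unschedulableQ   map[string]bool
	schedulingCycle  int64 // incremented each time a pod is popped
	moveRequestCycle int64 // scheduling cycle cached at the last move request
}

func newToyQueue() *toyQueue {
	// -1 means "no move request seen yet" in this sketch.
	return &toyQueue{unschedulableQ: map[string]bool{}, moveRequestCycle: -1}
}

// Add puts a new pod into the active queue.
func (q *toyQueue) Add(pod string) {
	q.lock.Lock()
	defer q.lock.Unlock()
	q.activeQ = append(q.activeQ, pod)
}

// Pop takes the next pod, starts a new scheduling cycle, and returns the pod
// together with the cycle in which it is being scheduled.
func (q *toyQueue) Pop() (string, int64, bool) {
	q.lock.Lock()
	defer q.lock.Unlock()
	if len(q.activeQ) == 0 {
		return "", 0, false
	}
	pod := q.activeQ[0]
	q.activeQ = q.activeQ[1:]
	q.schedulingCycle++
	return pod, q.schedulingCycle, true
}

// MoveAllToActiveQueue moves every parked pod to the active queue and caches
// the current scheduling cycle instead of setting a boolean flag.
func (q *toyQueue) MoveAllToActiveQueue() {
	q.lock.Lock()
	defer q.lock.Unlock()
	for pod := range q.unschedulableQ {
		q.activeQ = append(q.activeQ, pod)
		delete(q.unschedulableQ, pod)
	}
	q.moveRequestCycle = q.schedulingCycle
}

// AddUnschedulableIfNotPresent re-queues a pod that failed in podSchedulingCycle.
// If a move request arrived in that cycle or later, the pod is retried right away.
func (q *toyQueue) AddUnschedulableIfNotPresent(pod string, podSchedulingCycle int64) {
	q.lock.Lock()
	defer q.lock.Unlock()
	if q.moveRequestCycle >= podSchedulingCycle {
		q.activeQ = append(q.activeQ, pod)
		return
	}
	q.unschedulableQ[pod] = true
}

func main() {
	q := newToyQueue()
	q.Add("pod-a")
	pod, cycle, _ := q.Pop() // pod-a is now in flight in cycle 1
	q.MoveAllToActiveQueue() // a move request arrives while pod-a is in flight
	q.AddUnschedulableIfNotPresent(pod, cycle)
	fmt.Println(q.activeQ, len(q.unschedulableQ)) // prints "[pod-a] 0"
}
```

With a boolean flag, the move request in this scenario could be lost for a pod that was still in flight; comparing cycles lets the queue tell whether the request came during or after the pod's scheduling attempt.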

See #73216 (comment) and #73078.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

scheduler: use an incrementing scheduling cycle in PriorityQueue to put all in-flight unschedulable pods back into the active queue if a move request is received

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress, release-note-none, kind/bug, size/L, needs-sig, needs-priority, and cncf-cla: yes labels Jan 25, 2019
@cofyc
Member Author

cofyc commented Jan 25, 2019

/sig scheduling

@k8s-ci-robot k8s-ci-robot added the sig/scheduling label and removed the needs-sig label Jan 25, 2019
@cofyc cofyc changed the title WIP: Should move all pods when we received mod request to active queue WIP: Should move all unscheduable pods when we received mod request to active queue Jan 25, 2019
@cofyc cofyc force-pushed the fix73216-receivedMoveRequest branch from 6dcc5de to c6cbd4c Compare January 25, 2019 10:55
Contributor

I can see your point, but adding an interface method named "Cycle()" that exposes an implementation detail of the queue to the outside doesn't look good.

We may need to think twice about this.

@resouer
Contributor

resouer commented Jan 25, 2019

/assign

Member

@bsalamat bsalamat left a comment

I have a few minor comments. I agree with @resouer's point that exposing scheduling cycle from the scheduling queue is not elegant. A different implementation could be to add the cycle to the scheduler and increment it at the beginning of every scheduling cycle. The scheduler could pass the cycle as an argument to AddUnschedulableIfNotPresent and MoveAllToActiveQueue. However, this approach is not great either and has its own problems. It may in fact be worse in that the scheduler would keep track of scheduling cycles while the only user of the cycle is the scheduling queue.
So, I am fine with the implementation done in this PR.
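
For comparison, the alternative described above (the scheduler owns the counter and passes it in) might look roughly like this. Everything here is hypothetical: cycleQueue, recordingQueue, toyScheduler, and onClusterEvent are illustrative names, and the method signatures are simplified versions of what such a design could use.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// cycleQueue is a hypothetical queue interface for the alternative design:
// the caller supplies the scheduling cycle on every call.
type cycleQueue interface {
	AddUnschedulableIfNotPresent(pod string, cycle int64)
	MoveAllToActiveQueue(cycle int64)
}

// recordingQueue only records the cycles it is handed, to show the data flow.
type recordingQueue struct {
	addCycles  []int64
	moveCycles []int64
}

func (r *recordingQueue) AddUnschedulableIfNotPresent(pod string, cycle int64) {
	r.addCycles = append(r.addCycles, cycle)
}

func (r *recordingQueue) MoveAllToActiveQueue(cycle int64) {
	r.moveCycles = append(r.moveCycles, cycle)
}

// toyScheduler keeps the counter even though only the queue ever uses it.
type toyScheduler struct {
	cycle int64 // accessed atomically
	queue cycleQueue
}

// scheduleOne starts a new scheduling cycle; fits=false simulates a failed attempt.
func (s *toyScheduler) scheduleOne(pod string, fits bool) {
	cycle := atomic.AddInt64(&s.cycle, 1)
	if !fits {
		s.queue.AddUnschedulableIfNotPresent(pod, cycle)
	}
}

// onClusterEvent is what an event handler would call to issue a move request.
func (s *toyScheduler) onClusterEvent() {
	s.queue.MoveAllToActiveQueue(atomic.LoadInt64(&s.cycle))
}

func main() {
	rec := &recordingQueue{}
	s := &toyScheduler{queue: rec}
	s.scheduleOne("pod-a", false)
	s.onClusterEvent()
	fmt.Println(rec.addCycles, rec.moveCycles) // prints "[1] [1]"
}
```

The sketch makes the drawback visible: every caller that touches the queue has to thread the cycle through, while the scheduler itself never reads it for its own decisions.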

Member

Please rename this to SchedulingCycle. One could think that this is the queue cycle.

Member

Rename this to moveRequestCycle.

Member

schedulingCycle

Member

s/cycle/podSchedulingCycle/

Contributor

@wgliang wgliang Jan 26, 2019

I will recommend using p.lock.RLock() and p.lock.RUnlock() here. :)
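
A minimal sketch of that suggestion, assuming the lock is a sync.RWMutex and the accessor only reads the counter. cycleCounter and next are hypothetical names used for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// cycleCounter is a hypothetical holder for the scheduling cycle.
type cycleCounter struct {
	lock            sync.RWMutex
	schedulingCycle int64
}

// SchedulingCycle only reads state, so a shared read lock is sufficient and
// lets concurrent readers proceed without blocking each other.
func (c *cycleCounter) SchedulingCycle() int64 {
	c.lock.RLock()
	defer c.lock.RUnlock()
	return c.schedulingCycle
}

// next mutates state, so it takes the exclusive write lock.
func (c *cycleCounter) next() {
	c.lock.Lock()
	defer c.lock.Unlock()
	c.schedulingCycle++
}

func main() {
	c := &cycleCounter{}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.next()
			_ = c.SchedulingCycle()
		}()
	}
	wg.Wait()
	fmt.Println(c.SchedulingCycle()) // prints "4"
}
```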

Member Author

Thanks!

@cofyc cofyc force-pushed the fix73216-receivedMoveRequest branch 5 times, most recently from 3dd2441 to 5328984 Compare January 26, 2019 02:52
@cofyc cofyc changed the title WIP: Should move all unscheduable pods when we received mod request to active queue Should move all unscheduable pods when we received mod request to active queue Jan 26, 2019
@k8s-ci-robot k8s-ci-robot added the release-note label and removed the do-not-merge/work-in-progress and release-note-none labels Jan 26, 2019
@cofyc
Member Author

cofyc commented Jan 26, 2019

/test pull-kubernetes-e2e-gce-100-performance

@Huang-Wei
Member

/retest

Contributor

@resouer resouer Jan 29, 2019

We need to make this comment general and straightforward for upcoming developers to implement their own scheduling queue. For example, at interface level, there should be no concept of activeQ or unschedulableQ.

How about:

// AddUnschedulableIfNotPresent adds an unschedulable pod back to the scheduling queue.
// The podSchedulingCycle represents the current scheduling cycle number, which can be
// obtained by calling SchedulingCycle().

Member Author

You are right. We should not assume how the interface is implemented.

Contributor

ditto, let's make the comment straightforward and give tips for developers. My stupid example is like:

// SchedulingCycle returns the current scheduling cycle number, which is cached by the scheduling queue.
// Normally, incrementing this number whenever a pod is popped (e.g. by calling Pop()) is enough.
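
To illustrate why interface-level comments should avoid activeQ or unschedulableQ, here is a rough sketch of a custom queue that has neither. The schedulingQueue interface below is a hypothetical, trimmed-down subset with string pods, and fifoQueue is an illustrative implementation that is not thread-safe.

```go
package main

import "fmt"

// schedulingQueue is a hypothetical subset of the scheduling queue interface.
type schedulingQueue interface {
	// AddUnschedulableIfNotPresent adds an unschedulable pod back to the
	// scheduling queue. podSchedulingCycle is the scheduling cycle number
	// obtained from SchedulingCycle().
	AddUnschedulableIfNotPresent(pod string, podSchedulingCycle int64) error
	// SchedulingCycle returns the current scheduling cycle number cached by
	// the queue.
	SchedulingCycle() int64
	// Pop removes and returns the next pod to schedule.
	Pop() (string, error)
}

// fifoQueue has a single list, so it can ignore podSchedulingCycle entirely.
type fifoQueue struct {
	pods  []string
	cycle int64
}

func (f *fifoQueue) AddUnschedulableIfNotPresent(pod string, _ int64) error {
	for _, p := range f.pods {
		if p == pod {
			return fmt.Errorf("pod %q is already present", pod)
		}
	}
	f.pods = append(f.pods, pod)
	return nil
}

func (f *fifoQueue) SchedulingCycle() int64 { return f.cycle }

// Pop increments the cycle whenever a pod is popped, as the comment suggests.
func (f *fifoQueue) Pop() (string, error) {
	if len(f.pods) == 0 {
		return "", fmt.Errorf("queue is empty")
	}
	pod := f.pods[0]
	f.pods = f.pods[1:]
	f.cycle++
	return pod, nil
}

func main() {
	var q schedulingQueue = &fifoQueue{pods: []string{"pod-a"}}
	pod, _ := q.Pop()
	fmt.Println(pod, q.SchedulingCycle()) // prints "pod-a 1"
	fmt.Println(q.AddUnschedulableIfNotPresent(pod, q.SchedulingCycle()))
}
```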

Member Author

Updated. I think this is good enough. PTAL.

@resouer
Contributor

resouer commented Jan 29, 2019

Just left two comments to make sure the newly added interfaces are understandable for other contributors or people who try to implement their own scheduling queue. PTAL

@cofyc cofyc force-pushed the fix73216-receivedMoveRequest branch from 0e0d4cb to ba47bef Compare January 30, 2019 02:14
@Huang-Wei
Member

/retest

Member

@bsalamat bsalamat left a comment

/lgtm
/approve

Thanks, @cofyc!

@k8s-ci-robot k8s-ci-robot added the lgtm label Jan 30, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bsalamat, cofyc

@k8s-ci-robot k8s-ci-robot added the approved label Jan 30, 2019
@fejta-bot

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@cofyc
Member Author

cofyc commented Jan 31, 2019

/skip

@k8s-ci-robot
Contributor

k8s-ci-robot commented Jan 31, 2019

@cofyc: The following test failed, say /retest to rerun them all:

Test name                    | Commit  | Details | Rerun command
pull-kubernetes-e2e-kops-aws | ba47bef | link    | /test pull-kubernetes-e2e-kops-aws

@cofyc cofyc changed the title Should move all unscheduable pods when we received mod request to active queue Should move all unscheduable pods when we received move request to active queue Jan 31, 2019
@k8s-ci-robot k8s-ci-robot merged commit 9388a27 into kubernetes:master Jan 31, 2019
@cofyc
Member Author

cofyc commented Jan 31, 2019

After talking with @Huang-Wei, I cherry-picked this PR into 1.12 (#73567) and 1.13 (#73568).

// queue and clears p.receivedMoveRequest.
func (p *PriorityQueue) AddUnschedulableIfNotPresent(pod *v1.Pod) error {
// p.moveRequestCycle > podSchedulingCycle or to backoff queue if p.moveRequestCycle
// <= podSchedulingCycle but pod is subject to backoff. In other cases, it adds pod to

Contributor

Looking at the code on line 414, it seems the condition should be (without =):

podSchedulingCycle > p.moveRequestCycle

Member

@bsalamat bsalamat Feb 1, 2019

Greater than or equal is correct. In fact, it is very important to have =. Equal means that a move request is received in the same scheduling cycle as the pod was being scheduled. So, it is likely that the scheduler has not seen the changes that led to the move request yet and didn't use those changes when making a scheduling decision. As a result, the pod must be retried if it is unschedulable.
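
The boundary case can be isolated in a few lines. retryImmediately is a hypothetical helper written only to show the comparison; the cycle numbers are made up.

```go
package main

import "fmt"

// retryImmediately reports whether a pod that failed in podSchedulingCycle
// should be retried instead of being parked as unschedulable.
// The equal case covers a move request that arrived during the pod's own
// scheduling cycle, which the scheduling decision may not have observed.
func retryImmediately(moveRequestCycle, podSchedulingCycle int64) bool {
	return moveRequestCycle >= podSchedulingCycle
}

func main() {
	const podSchedulingCycle = 5
	for _, moveRequestCycle := range []int64{4, 5, 6} {
		fmt.Printf("moveRequestCycle=%d retry=%v\n",
			moveRequestCycle, retryImmediately(moveRequestCycle, podSchedulingCycle))
	}
}
```

With a strict comparison, the middle case (move request in cycle 5, pod scheduled in cycle 5) would park the pod even though the cluster changed while it was being scheduled.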

k8s-ci-robot added a commit that referenced this pull request Feb 1, 2019
…upstream-release-1.12

Automated cherry pick of #73309: Should move all unscheduable pods when we received move request to active queue
k8s-ci-robot added a commit that referenced this pull request Feb 21, 2019
…upstream-release-1.13

Automated cherry pick of #73309: Should move all unscheduable pods when we received move request to active queue
@cofyc cofyc deleted the fix73216-receivedMoveRequest branch May 4, 2019 07:20