Fix hpa scaling above max replicas w/ scaleUpLimit #53690

Contributor

@mattjmcnaughton mattjmcnaughton commented Oct 11, 2017

What this PR does / why we need it:

Fix a bug where `desiredReplicas` could end up greater than `maxReplicas`
when the computed `desiredReplicas` was greater than `scaleUpLimit` and
`scaleUpLimit` was itself greater than `maxReplicas`. Previously, in that
case, the autoscaler scaled up to `scaleUpLimit` and then, on the next
autoscaling run, scaled back down to `maxReplicas`. This PR fixes the
issue and adds a regression test.
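
To make the failure mode concrete, here is a minimal sketch of the ordering problem. The function names `buggyNormalize` and `fixedNormalize` are hypothetical and the logic is simplified for illustration; this is not the controller's actual code, only an assumption about the shape of the switch described above.

```go
package main

import "fmt"

// buggyNormalize mimics the ordering problem: the scale-up-limit case
// matches first, so maxReplicas is never consulted.
// Hypothetical helper, not the real HPA controller code.
func buggyNormalize(desired, scaleUpLimit, maxReplicas int32) int32 {
	switch {
	case desired > scaleUpLimit:
		return scaleUpLimit
	case desired > maxReplicas:
		return maxReplicas
	}
	return desired
}

// fixedNormalize additionally caps the scale-up limit at maxReplicas,
// which is the idea behind the fix in this PR (simplified).
func fixedNormalize(desired, scaleUpLimit, maxReplicas int32) int32 {
	switch {
	case desired > scaleUpLimit:
		if scaleUpLimit > maxReplicas {
			return maxReplicas
		}
		return scaleUpLimit
	case desired > maxReplicas:
		return maxReplicas
	}
	return desired
}

func main() {
	// Illustrative values: desired=20, scaleUpLimit=8, maxReplicas=5.
	fmt.Println(buggyNormalize(20, 8, 5)) // 8, which exceeds maxReplicas
	fmt.Println(fixedNormalize(20, 8, 5)) // 5
}
```

With the illustrative values above, the unfixed ordering returns 8 even though `maxReplicas` is 5, which matches the "scale up, then scale back down on the next run" behavior described in this PR.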

Which issue this PR fixes

fixes #53670

Release note:

Fix a bug that allowed the horizontal pod autoscaler to set `desiredReplicas` above `maxReplicas` in certain cases.

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 11, 2017
@k8s-ci-robot
Contributor

Hi @mattjmcnaughton. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Oct 11, 2017
@mattjmcnaughton
Contributor Author

/assign @DirectXMan12 (because you've already been involved in the issue) :)

@mwielgus
Contributor

mwielgus commented Oct 11, 2017

/lgtm
Please cherry-pick it to 1.7 and 1.8 branches.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 11, 2017
@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 11, 2017
@mwielgus
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Oct 11, 2017
@fejta-bot

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to @fejta).

Review the full test history for this PR.

@MaciekPytel
Contributor

@mattjmcnaughton It looks like this doesn't pass gofmt. Can you update the PR?

@fejta-bot

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to @fejta).

Review the full test history for this PR.

Fix kubernetes#53670

Fix a bug where `desiredReplicas` could be greater than `maxReplicas`
if the original value for `desiredReplicas > scaleUpLimit` and
`scaleUpLimit > maxReplicas`. Previously, when that happened, we would
scale up to `scaleUpLimit`, and then in the next auto-scaling run, scale
down to `maxReplicas`. Address this issue and introduce a regression
test.
@mattjmcnaughton mattjmcnaughton force-pushed the mattjmcnaughton/53670-fix-hpa-scaling-above-max-replicas branch from bbf7274 to 75c3877 Compare October 11, 2017 12:35
@k8s-github-robot k8s-github-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 11, 2017
@mattjmcnaughton
Contributor Author

Whoops! Fixed and ready for a final look @mwielgus @MaciekPytel

Re cherry-picking onto 1.7 and 1.8, is that something I do, or something a bot does, or something the OWNERs do?

Thanks for the quick eyes on this :)

@MaciekPytel
Contributor

/retest

@mwielgus
Contributor

There is a script in /hack for cherry-picking to a particular branch.

// Ensure that even if the scaleUpLimit is greater
// than the maximum number of replicas, we only
// set the max number of replicas as desired.
if scaleUpLimit > hpa.Spec.MaxReplicas {
Contributor

This is basically an exact copy-paste of one of the cases below. Having to add ifs that check whether we would trigger the condition of a different case, had we not already matched the one we're in, makes me think a switch-case is not the right solution here and that we should move to a sequence of ifs instead (perhaps contained in a separate function that can be unit-tested in isolation).

Especially since we already had a very similar bug caused by interaction of conditions in this switch-case: #48997.

I'm ok with keeping this bugfix as is, but we should revisit this part of the HPA, clean it up, and make sure it's reasonably unit-tested to avoid another such issue in the future. @mwielgus @DirectXMan12

Contributor

I've created an issue #53728 to do this.

@MaciekPytel
Contributor

/retest

@MaciekPytel
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 11, 2017
@k8s-github-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: MaciekPytel, mattjmcnaughton, mwielgus

Associated issue: 53670

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 93b3469 into kubernetes:master Oct 11, 2017
@k8s-ci-robot
Contributor

@mattjmcnaughton: The following test failed; say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| pull-kubernetes-bazel-test | 75c3877 | link | `/test pull-kubernetes-bazel-test` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@MaciekPytel
Contributor

@mattjmcnaughton can you create cherry-pick PRs for this into the 1.7 and 1.8 branches? This is done by running the hack/cherry_pick_pull.sh script. Alternatively, I can create them for you.

@mattjmcnaughton
Contributor Author

Yup, I can create the cherry-pick PRs - I'll do it rn :)

@mattjmcnaughton mattjmcnaughton deleted the mattjmcnaughton/53670-fix-hpa-scaling-above-max-replicas branch October 12, 2017 12:34
@wojtek-t wojtek-t added cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. cherrypick-candidate labels Oct 12, 2017
@wojtek-t wojtek-t added this to the v1.7 milestone Oct 12, 2017
k8s-github-robot pushed a commit that referenced this pull request Oct 13, 2017
…of-#53690-upstream-release-1.7

Automatic merge from submit-queue.

Automated cherry pick of #53690

Cherry pick of #53690 on release-1.7.

#53690: Fix hpa scaling above max replicas w/ scaleUpLimit
@k8s-cherrypick-bot

Commit found in the "release-1.7" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error find help to get your PR picked.

k8s-github-robot pushed a commit that referenced this pull request Oct 17, 2017
…of-#53690-upstream-release-1.8

Automatic merge from submit-queue.

Automated cherry pick of #53690

Cherry pick of #53690 on release-1.8.

#53690: Fix hpa scaling above max replicas w/ scaleUpLimit
Successfully merging this pull request may close these issues.

HPA scaling above spec.maxReplicas