Skip to content

Conversation

@logicalhan
Copy link
Member

What type of PR is this?

/kind feature

What this PR does / why we need it:

Currently, the healthz endpoint is inflexible in that any and all health checks are currently executed when the endpoint is hit. Since the kube-apiserver only currently supports a liveness check, this means all persistent health check failures will eventually result in a kubelet driven restart. However, this may not be desirable behavior if you have additional monitoring services layered on top of your control plane.

For instance, take the etcd failure from the perspective of the kube-apiserver. Currently, persistent failures during the etcd health check from the api-server healthz endpoint will cause the kube-apiserver to be restarted. This sometimes helps (in the case of a faulty etcd-client connection) but also can be detrimental (the kube-apiserver restarts until the etcd cluster is functioning, then the kube-apiserver bombards the etcd cluster with requests to update local caches, saturating the etcd cluster with requests, causing it to timeout on its own liveness probes and trigger a kubelet-driven restart).

Let's say that we could now disable etcd from our liveness health check. Then if we had an external monitoring service, we could ask independently ask etcd if it was healthy and we could also ask the kube-apiserver if it thought etcd was unhealthy. Based on the combination of those answers, we could enable more intelligent responses to certain situation, like restarting the kube-apiserver when it thought etcd was unhealthy but etcd was actually reporting as healthy but not doing so when etcd was actually unhealthy.

This feature is backwards-compatible and opt-in. There would be no impact to existing behavior unless these query params are actually used. This feature no-opts for strings which do not match the name of an existing health check.

Which issue(s) this PR fixes:
Fixes #70591

Does this PR introduce a user-facing change?:

The kube-apiserver's healthz now takes in an optional query parameter which allows you to disable health checks from causing healthz failures. 

/sig api-machinery

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. kind/feature Categorizes issue or PR as related to a new feature. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Nov 6, 2018
@k8s-ci-robot k8s-ci-robot requested review from deads2k and sttts November 6, 2018 00:43
@k8s-ci-robot k8s-ci-robot added needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. area/apiserver labels Nov 6, 2018
@logicalhan
Copy link
Member Author

/cc @cheftako @lavalamp @liggitt

@justinsb
Copy link
Member

justinsb commented Nov 7, 2018

/ok-to-test

Generally LGTM but I'd like to see the param used in https://github.com/kubernetes/kubernetes/blob/master/cluster/gce/manifests/kube-apiserver.manifest#L37 - and we can add a comment at the same time!

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Nov 7, 2018
@deads2k
Copy link
Contributor

deads2k commented Nov 7, 2018

I don't object per se , but are you sure that it's worth enough to avoid just crashlooping it? At least in the etcd example, it's not like this process is doing much for you.

@logicalhan
Copy link
Member Author

I don't object per se , but are you sure that it's worth enough to avoid just crashlooping it? At least in the etcd example, it's not like this process is doing much for you.

Crashlooping is also noisy. Yes, in the etcd example, not crashlooping is not going to fix the issue but if the etcd is unhealthy-ish due to a longish boot, then it will get to a correct state faster without crashlooping. There are trade-offs, for sure. Then again, this feature is opt-in and if you are in situation where it occurs to you to use this feature (i.e. happening a lot), then probably to you it seems worthwhile.

@roycaihw
Copy link
Member

roycaihw commented Nov 8, 2018

/assign @cheftako

@logicalhan logicalhan force-pushed the exclude-checks branch 4 times, most recently from 0923187 to dc6922b Compare November 10, 2018 03:45
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 10, 2018
@k8s-ci-robot k8s-ci-robot added sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Nov 13, 2018
@lavalamp
Copy link
Contributor

lavalamp commented Nov 13, 2018 via email

@logicalhan
Copy link
Member Author

/test pull-kubernetes-e2e-gke

1 similar comment
@logicalhan
Copy link
Member Author

/test pull-kubernetes-e2e-gke

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry to be nitpicky, but I'd call this a warning and not an error, since it doesn't make the check fail and the word "error" makes admins nervous :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I can remove the prefix. I added it after the fact because it made it slightly easier to parse the string out from the verbose output blob.

@lavalamp lavalamp added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Nov 13, 2018
@k8s-ci-robot k8s-ci-robot removed the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Nov 13, 2018
@lavalamp
Copy link
Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 13, 2018
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lavalamp, logicalhan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 13, 2018
@logicalhan
Copy link
Member Author

/test pull-kubernetes-verify

"host": "127.0.0.1",
"port": 8080,
"path": "/healthz"
"path": "/healthz?exclude=etcd"
Copy link
Member

@mikedanese mikedanese Nov 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a couple issues with this:

  • The API server main blocks (after listening, but before serving) on the etcd client opening a connection to the etcd server. Not all, but some etcd failures will cause the kube-apiserver to never start serving /healthz.
  • The etcd healthcheck is not the only healthcheck that depends on etcd. Spot checking [poststarthook/bootstrap-controller, poststarthook/rbac/bootstrap-roles, poststarthook/scheduling/bootstrap-system-priority-classes, poststarthook/ca-registration, autoregister-completion], all depend on etcd being up before they return an initial healthy status.

Is this change intended to help reduce restarts on apiserver startup or only in steady state?

Copy link
Member

@liggitt liggitt Nov 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change intended to help reduce restarts on apiserver startup or only in steady state?

The etcd check use case was to avoid useless restarts in steady state

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it was intended for steady state. Also, the observation that boot sequences are more problematic was a contributing factor for my thought process in #71054.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add the ability to kube-apiserver's healthz endpoint to disable health checks via query params

9 participants