@yujuhong
Contributor

If kubelet never gets past sandbox creation (i.e., never attempted to
create containers for a pod), it should retry the sandbox creation on
failure, regardless of the restart policy of the pod.
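The rule described above can be sketched as a small decision function. This is only an illustration under stated assumptions: `RestartPolicy`, `podStatus`, and `shouldRecreateSandbox` are simplified, hypothetical stand-ins rather than the kubelet's real types, and the body of `shouldRestartOnFailure` is assumed to mean "restart policy is not Never". Only the shape of the condition is taken from the diff in this PR.

```go
package main

import "fmt"

// RestartPolicy is a simplified stand-in for the pod restart policy
// (hypothetical type, for illustration only).
type RestartPolicy string

const (
	RestartAlways    RestartPolicy = "Always"
	RestartOnFailure RestartPolicy = "OnFailure"
	RestartNever     RestartPolicy = "Never"
)

// podStatus is a simplified stand-in for the kubelet's internal pod status.
// Each entry names a container the runtime has reported for the pod.
type podStatus struct {
	ContainerStatuses []string
}

// shouldRestartOnFailure borrows its name from the diff; the body here is an
// assumption: any policy other than Never allows a restart after a failure.
func shouldRestartOnFailure(policy RestartPolicy) bool {
	return policy != RestartNever
}

// shouldRecreateSandbox is a hypothetical helper. It refuses to recreate the
// sandbox only when the policy forbids restarts, a previous sandbox attempt
// exists, and at least one container was ever created for the pod.
func shouldRecreateSandbox(policy RestartPolicy, attempt int, status podStatus) bool {
	if !shouldRestartOnFailure(policy) && attempt != 0 && len(status.ContainerStatuses) != 0 {
		return false
	}
	return true
}

func main() {
	noContainers := podStatus{}
	ranContainer := podStatus{ContainerStatuses: []string{"app"}}

	// Policy Never, earlier sandbox attempt, no containers ever created.
	fmt.Println(shouldRecreateSandbox(RestartNever, 1, noContainers))
	// Policy Never, earlier sandbox attempt, a container was created.
	fmt.Println(shouldRecreateSandbox(RestartNever, 1, ranContainer))
	// Policy Always, earlier sandbox attempt, a container was created.
	fmt.Println(shouldRecreateSandbox(RestartAlways, 1, ranContainer))
}
```

In this sketch the extra `len(status.ContainerStatuses) != 0` term is what separates "the pod ran and failed" from "the pod never got past sandbox creation"; without it, a pod with policy Never would be left stuck after its first failed sandbox attempt.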

What type of PR is this?

/kind bug
What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #79398

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

Fix a bug where kubelet would not retry pod sandbox creation when the restart policy of the pod is Never

@yujuhong yujuhong added kind/bug Categorizes issue or PR as related to a bug. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. sig/node Categorizes an issue or PR as relevant to SIG Node. labels Jun 27, 2019
@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 27, 2019
@k8s-ci-robot k8s-ci-robot requested review from krmayankk and tmrts June 27, 2019 01:26
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: yujuhong

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 27, 2019
@yujuhong
Contributor Author

The regression was introduced in 1.13. We need to patch 1.15, 1.14, and 1.13.

@yujuhong
Contributor Author

The original change was introduced in #68980 /cc @derekwaynecarr

// killed and recreated, and init containers should be purged.
if createPodSandbox {
- if !shouldRestartOnFailure(pod) && attempt != 0 {
+ if !shouldRestartOnFailure(pod) && attempt != 0 && len(podStatus.ContainerStatuses) != 0 {
Contributor

Should we also add check len(podStatus.InitContainerStatuses) != 0?

Contributor Author

This is a kubelet-internal type; it has no InitContainerStatuses field.

Contributor Author

The unit tests added show that this would work for init containers too.
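One way to picture why a single length check can cover init containers is sketched below. It assumes, as the replies above suggest, that the kubelet's internal status keeps one list of container statuses and that init containers appear in that same list. The types `containerStatus` and `podStatus` and the helpers `findByName` and `anyContainerAttempted` are hypothetical simplifications, not the kubelet's actual definitions.

```go
package main

import "fmt"

// containerStatus is a simplified, hypothetical status record.
type containerStatus struct {
	Name  string
	State string
}

// podStatus is a simplified stand-in for the internal pod status.
// Assumption: init and regular containers share this one slice.
type podStatus struct {
	ContainerStatuses []containerStatus
}

// findByName looks up a status by container name, whether the name belongs
// to an init container or a regular container.
func (s podStatus) findByName(name string) (containerStatus, bool) {
	for _, cs := range s.ContainerStatuses {
		if cs.Name == name {
			return cs, true
		}
	}
	return containerStatus{}, false
}

// anyContainerAttempted reports whether the pod ever got past sandbox
// creation, using the same length check as the condition in the diff.
func anyContainerAttempted(s podStatus) bool {
	return len(s.ContainerStatuses) != 0
}

func main() {
	// Sandbox creation failed; nothing was ever started.
	sandboxOnly := podStatus{}
	// Only an init container has run so far.
	initRan := podStatus{ContainerStatuses: []containerStatus{
		{Name: "init-setup", State: "exited"},
	}}

	fmt.Println(anyContainerAttempted(sandboxOnly))
	fmt.Println(anyContainerAttempted(initRan))

	_, found := initRan.findByName("init-setup")
	fmt.Println(found)
}
```

Under that assumption, a pod whose init container has already run counts as "past sandbox creation", so no separate init-container check is needed.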

Contributor

@mattjmcnaughton mattjmcnaughton left a comment

/lgtm

Modulo the open question about init containers, this looks good to me! Thanks for the quick fix and for adding a test case.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 27, 2019
@yujuhong
Contributor Author

/retest

@yujuhong
Contributor Author

/cc @dashpole

@k8s-ci-robot k8s-ci-robot requested a review from dashpole June 27, 2019 16:45
@dashpole
Contributor

/lgtm
Thanks for the fix!

@yujuhong
Contributor Author

/retest

I don't think this change would affect pull-kubernetes-kubemark-e2e-gce-big.

@k8s-ci-robot k8s-ci-robot merged commit b51f621 into kubernetes:master Jun 27, 2019
k8s-ci-robot added a commit that referenced this pull request Jun 28, 2019
…51-upstream-release-1.14

Automated cherry pick of #79451: kubelet: retry pod sandbox creation when containers were
k8s-ci-robot added a commit that referenced this pull request Jun 28, 2019
…51-upstream-release-1.15

Automated cherry pick of #79451: kubelet: retry pod sandbox creation when containers were
@johnpipi

johnpipi commented Jan 6, 2020

Which version of Kubernetes is this fix part of?

@ashb

ashb commented Oct 12, 2020

Which version of Kubernetes is this fix part of?

1.16+


Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

kubelet won't retry PodSandbox creation for pods with restart policy "Never"

7 participants