Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement shared PID namespace in the dockershim #41583

Merged
merged 1 commit into from
Apr 29, 2017

Conversation

verb
Copy link
Contributor

@verb verb commented Feb 16, 2017

What this PR does / why we need it: Defaults the Docker CRI to using a shared PID namespace for pods. Implements proposal in kubernetes/community#207 tracked by #1615.

//cc @dchen1107 @vishh @timstclair

Special notes for your reviewer: none

Release note:

Some container runtimes share a process (PID) namespace for all containers in a pod. This will become the default for Docker in a future release of Kubernetes. You can preview this functionality if running with the CRI and Docker 1.13.1 by enabling the --experimental-docker-enable-shared-pid kubelet flag.

@k8s-ci-robot
Copy link
Contributor

Hi @verb. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with @k8s-bot ok to test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 16, 2017
@k8s-github-robot k8s-github-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API size/L Denotes a PR that changes 100-499 lines, ignoring generated files. release-note Denotes a PR that will be considered when it comes time to generate release notes. labels Feb 16, 2017
@k8s-reviewable
Copy link

This change is Reviewable

@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 16, 2017
@yujuhong yujuhong assigned yujuhong and unassigned errordeveloper Feb 17, 2017
@timstclair
Copy link

@k8s-bot ok to test

@@ -260,6 +260,7 @@ func (s *KubeletServer) AddFlags(fs *pflag.FlagSet) {

fs.StringVar(&s.RemoteRuntimeEndpoint, "container-runtime-endpoint", s.RemoteRuntimeEndpoint, "[Experimental] The unix socket endpoint of remote runtime service. The endpoint is used only when CRI integration is enabled (--enable-cri)")
fs.StringVar(&s.RemoteImageEndpoint, "image-service-endpoint", s.RemoteImageEndpoint, "[Experimental] The unix socket endpoint of remote image service. If not specified, it will be the same with container-runtime-endpoint by default. The endpoint is used only when CRI integration is enabled (--enable-cri)")
fs.BoolVar(&s.DockerDisableSharedPID, "docker-disable-shared-pid", s.DockerDisableSharedPID, "The Container Runtime Interface (CRI) defaults to using a shared PID namespace for pods. Setting this flag reverts the Docker CRI runtime to the old behavior of isolated pod PID namespaces.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be clear, s/shared PID namespace for pods/shared PID namespaces for containers in a pod
s/isolated pod PID namespaces/isolated PID namespaces for containers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@@ -485,6 +485,9 @@ type KubeletConfiguration struct {
// This flag, if set, instructs the kubelet to keep volumes from terminated pods mounted to the node.
// This can be useful for debugging volume related issues.
KeepTerminatedPodVolumes bool
// This flag, if set, disables use of a shared PID namespace for pods running in the docker CRI runtime.
// TODO: remove once this escape hatch is no longer required.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: be more clear when this should happen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@@ -522,6 +522,8 @@ type KubeletConfiguration struct {
// This flag, if set, instructs the kubelet to keep volumes from terminated pods mounted to the node.
// This can be useful for debugging volume related issues.
KeepTerminatedPodVolumes bool `json:"keepTerminatedPodVolumes,omitempty"`
// This flag, if set, disables use of a shared PID namespace for pods run by the docker CRI runtime.
DockerDisableSharedPID bool `json:"dockerDisableSharedPID,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just in case, make this a pointer (*bool) so that we can change the default it needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, though that's a bit of magic I haven't decoded so you may want to double check me.

// This option provides an escape hatch to override the new default behavior for Docker under
// the CRI to use a shared PID namespace for all pods. It is temporary and will be removed.
// See proposals/pod-pid-namespace.md for details.
// TODO: Remove once the escape hatch is no longer used (1.7?)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File a github issue for removing this knob and include the issue number here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filed #41938

if !strings.HasPrefix(string(hc.PidMode), "container:") {
return
}
if disableSharedPID || (version.Major <= 1 && version.Minor <= 23) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's much better, thanks.

@yujuhong
Copy link
Contributor

yujuhong commented Feb 23, 2017

@verb I posted a comment on the original proposal about issues with docker 1.12 + shared pid namespace. I think that's a serious and might block this from being enabled this release.

EDIT: kubernetes/community#207 (comment)

@yujuhong yujuhong added the do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. label Feb 23, 2017
@verb
Copy link
Contributor Author

verb commented Feb 23, 2017

@yujuhong agreed, but I think the safer move is to bump the minimum docker version for shared PID to 1.13. I don't expect the problems to be addressed in 1.12.

Given that we know of no problems in 1.13 I don't think it should block this from being merged. In fact, I think we should include it to get early feedback from anyone testing with the not-qualified-for-1.6 docker 1.13. It would give us a slower rollout.

@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 23, 2017
@yujuhong
Copy link
Contributor

@verb, we haven't even started qualifying docker 1.13 for kubernetes. If we don't even list docker 1.13 as a compatible version for k8s 1.6, we wouldn't get enough users to try out this feature. That makes this PR less meaningful (and the changelog/release note unclear). I think we can come back to this in the next release, when we have at least a node e2e image testing docker 1.13+. WDYT?

/cc @dchen1107

@verb
Copy link
Contributor Author

verb commented Feb 23, 2017

The release notes is a good point, but I think that could be worked around. It would probably help to list out the upsides and downsides of merging now:

Upsides of merging now:

  • Potentially get early feedback on the feature by people who are explicitly coloring outside of the lines and expect instability.
  • Slower rollout for change in default behavior
  • Is a no-op for earlier docker versions.
  • I don't have to carry forward a patch to an in-flux section of the CRI

Downsides of merging now:

  • Difficulty in wording a release note

Perhaps I'm missing something, but I don't understand why we wouldn't just merge this. It seems wasted effort to attempt to align it to releases when it's already aligned with docker versions, and there's only potential upside to including it.

Has 1.12 been qualified yet? I was always kind of expecting we'd skip 1.12 and go right to 1.13 anyway.

@verb verb force-pushed the sharedpid branch 2 times, most recently from 702bbd8 to 5e20b44 Compare February 23, 2017 21:33
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 23, 2017
@vishh
Copy link
Contributor

vishh commented Feb 26, 2017

We have users who use more recent versions of docker even though upstream recommends an older version. I agree with @verb's argument that having this PR in v1.6 would give us some early feedback. Perhaps we can make this an experimental opt-in feature for v1.6?

@yujuhong
Copy link
Contributor

I wrote a longer draft response last week but I couldn't find it anywhere :-\

Anyway, my main concern is that enabling this by default without some basic level of confidence on docker 1.13 compatibility may just cause too much noise to have any meaningful feedback. Also, this is a user-facing change, so opt-in may be better than opt-out.

I am against turning this on for docker 1.12 (since it doesn't work), but not so strongly against doing it for 1.13. @dchen1107 could you chime in since the code freeze is imminent?

@dchen1107
Copy link
Member

I am ok to take this PR to have early engagement with the users if we disable this by default no matter which docker version is since

  • We already know there is an issue in docker 1.12.6.
  • We didn't do any docker 1.13 validation tests. The existing automated validation pipeline is not sufficient. There is no performance, restart, etc. tests. We didn't run CRI tests against docker 1.13.x yet.
  • We didn't plan to announce docker 1.13 support in Kubernetes 1.6 release

WDYT? cc/ @yujuhong @verb @Random-Liu

if !strings.HasPrefix(string(hc.PidMode), "container:") {
return
}
if disableSharedPID || version.LT(semver.Version{Major: 1, Minor: 25}) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we change this one to API version 1.26 (docker 1.13.x)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ftr, API version 1.25 is docker 1.13.0, but yes, I will bump this to 1.26 which corresponds to docker 1.13.1.

ref: https://docs.docker.com/engine/api/v1.26/

@calebamiles
Copy link
Contributor

Moving to the 1.7 milestone @verb.

cc: @ethernetdan, @kubernetes/kubernetes-release-managers

@calebamiles calebamiles modified the milestones: v1.7, v1.6 Mar 2, 2017
@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 7, 2017
@k8s-github-robot k8s-github-robot removed lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Apr 19, 2017
@verb
Copy link
Contributor Author

verb commented Apr 19, 2017

squashed and rebased

@verb
Copy link
Contributor Author

verb commented Apr 20, 2017

@k8s-bot unit test this

@dchen1107
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 20, 2017
@verb
Copy link
Contributor Author

verb commented Apr 21, 2017

@ethernetdan @calebamiles @dchen1107 can we remove do-not-merge now?

@yujuhong yujuhong removed the do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. label Apr 27, 2017
@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 27, 2017
@k8s-github-robot k8s-github-robot removed lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Apr 28, 2017
@verb
Copy link
Contributor Author

verb commented Apr 28, 2017

rebased again. conflict was upstream removal of generated/openapi/zz_generated.openapi.go

@yujuhong
Copy link
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 28, 2017
@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dchen1107, verb, yujuhong

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@verb
Copy link
Contributor Author

verb commented Apr 28, 2017

@k8s-bot gce etcd3 e2e test this

@verb
Copy link
Contributor Author

verb commented Apr 29, 2017

opened #45133 for flake

@k8s-bot bazel test this

@k8s-github-robot
Copy link

@k8s-bot test this [submit-queue is verifying that this PR is safe to merge]

@k8s-ci-robot
Copy link
Contributor

k8s-ci-robot commented Apr 29, 2017

@verb: The following test(s) failed:

Test name Commit Details Rerun command
Jenkins verification d22dd0f link @k8s-bot verify test this

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 41583, 45117, 45123)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/node Categorizes an issue or PR as relevant to SIG Node. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.