-
Notifications
You must be signed in to change notification settings - Fork 39.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement shared PID namespace in the dockershim #41583
Conversation
Hi @verb. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
@k8s-bot ok to test |
cmd/kubelet/app/options/options.go
Outdated
@@ -260,6 +260,7 @@ func (s *KubeletServer) AddFlags(fs *pflag.FlagSet) { | |||
|
|||
fs.StringVar(&s.RemoteRuntimeEndpoint, "container-runtime-endpoint", s.RemoteRuntimeEndpoint, "[Experimental] The unix socket endpoint of remote runtime service. The endpoint is used only when CRI integration is enabled (--enable-cri)") | |||
fs.StringVar(&s.RemoteImageEndpoint, "image-service-endpoint", s.RemoteImageEndpoint, "[Experimental] The unix socket endpoint of remote image service. If not specified, it will be the same with container-runtime-endpoint by default. The endpoint is used only when CRI integration is enabled (--enable-cri)") | |||
fs.BoolVar(&s.DockerDisableSharedPID, "docker-disable-shared-pid", s.DockerDisableSharedPID, "The Container Runtime Interface (CRI) defaults to using a shared PID namespace for pods. Setting this flag reverts the Docker CRI runtime to the old behavior of isolated pod PID namespaces.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, s/shared PID namespace for pods/shared PID namespaces for containers in a pod
s/isolated pod PID namespaces/isolated PID namespaces for containers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
pkg/apis/componentconfig/types.go
Outdated
@@ -485,6 +485,9 @@ type KubeletConfiguration struct { | |||
// This flag, if set, instructs the kubelet to keep volumes from terminated pods mounted to the node. | |||
// This can be useful for debugging volume related issues. | |||
KeepTerminatedPodVolumes bool | |||
// This flag, if set, disables use of a shared PID namespace for pods running in the docker CRI runtime. | |||
// TODO: remove once this escape hatch is no longer required. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: be more clear when this should happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -522,6 +522,8 @@ type KubeletConfiguration struct { | |||
// This flag, if set, instructs the kubelet to keep volumes from terminated pods mounted to the node. | |||
// This can be useful for debugging volume related issues. | |||
KeepTerminatedPodVolumes bool `json:"keepTerminatedPodVolumes,omitempty"` | |||
// This flag, if set, disables use of a shared PID namespace for pods run by the docker CRI runtime. | |||
DockerDisableSharedPID bool `json:"dockerDisableSharedPID,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just in case, make this a pointer (*bool
) so that we can change the default it needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, though that's a bit of magic I haven't decoded so you may want to double check me.
// This option provides an escape hatch to override the new default behavior for Docker under | ||
// the CRI to use a shared PID namespace for all pods. It is temporary and will be removed. | ||
// See proposals/pod-pid-namespace.md for details. | ||
// TODO: Remove once the escape hatch is no longer used (1.7?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
File a github issue for removing this knob and include the issue number here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed #41938
if !strings.HasPrefix(string(hc.PidMode), "container:") { | ||
return | ||
} | ||
if disableSharedPID || (version.Major <= 1 && version.Minor <= 23) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the existing helper function for comparing the versions: https://github.com/kubernetes/kubernetes/blob/v1.6.0-beta.0/pkg/kubelet/dockershim/helpers.go#L314
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh that's much better, thanks.
@verb I posted a comment on the original proposal about issues with docker 1.12 + shared pid namespace. I think that's a serious and might block this from being enabled this release. |
@yujuhong agreed, but I think the safer move is to bump the minimum docker version for shared PID to 1.13. I don't expect the problems to be addressed in 1.12. Given that we know of no problems in 1.13 I don't think it should block this from being merged. In fact, I think we should include it to get early feedback from anyone testing with the not-qualified-for-1.6 docker 1.13. It would give us a slower rollout. |
@verb, we haven't even started qualifying docker 1.13 for kubernetes. If we don't even list docker 1.13 as a compatible version for k8s 1.6, we wouldn't get enough users to try out this feature. That makes this PR less meaningful (and the changelog/release note unclear). I think we can come back to this in the next release, when we have at least a node e2e image testing docker 1.13+. WDYT? /cc @dchen1107 |
The release notes is a good point, but I think that could be worked around. It would probably help to list out the upsides and downsides of merging now: Upsides of merging now:
Downsides of merging now:
Perhaps I'm missing something, but I don't understand why we wouldn't just merge this. It seems wasted effort to attempt to align it to releases when it's already aligned with docker versions, and there's only potential upside to including it. Has 1.12 been qualified yet? I was always kind of expecting we'd skip 1.12 and go right to 1.13 anyway. |
702bbd8
to
5e20b44
Compare
We have users who use more recent versions of docker even though upstream recommends an older version. I agree with @verb's argument that having this PR in v1.6 would give us some early feedback. Perhaps we can make this an |
I wrote a longer draft response last week but I couldn't find it anywhere :-\ Anyway, my main concern is that enabling this by default without some basic level of confidence on docker 1.13 compatibility may just cause too much noise to have any meaningful feedback. Also, this is a user-facing change, so opt-in may be better than opt-out. I am against turning this on for docker 1.12 (since it doesn't work), but not so strongly against doing it for 1.13. @dchen1107 could you chime in since the code freeze is imminent? |
I am ok to take this PR to have early engagement with the users if we disable this by default no matter which docker version is since
WDYT? cc/ @yujuhong @verb @Random-Liu |
if !strings.HasPrefix(string(hc.PidMode), "container:") { | ||
return | ||
} | ||
if disableSharedPID || version.LT(semver.Version{Major: 1, Minor: 25}) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we change this one to API version 1.26 (docker 1.13.x)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ftr, API version 1.25 is docker 1.13.0, but yes, I will bump this to 1.26 which corresponds to docker 1.13.1.
Moving to the 1.7 milestone @verb. cc: @ethernetdan, @kubernetes/kubernetes-release-managers |
squashed and rebased |
@k8s-bot unit test this |
/lgtm |
@ethernetdan @calebamiles @dchen1107 can we remove do-not-merge now? |
rebased again. conflict was upstream removal of |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dchen1107, verb, yujuhong
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing |
@k8s-bot gce etcd3 e2e test this |
@k8s-bot test this [submit-queue is verifying that this PR is safe to merge] |
@verb: The following test(s) failed:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
Automatic merge from submit-queue (batch tested with PRs 41583, 45117, 45123) |
What this PR does / why we need it: Defaults the Docker CRI to using a shared PID namespace for pods. Implements proposal in kubernetes/community#207 tracked by #1615.
//cc @dchen1107 @vishh @timstclair
Special notes for your reviewer: none
Release note: