
Remove e2e-rbac-bindings. #46750

Merged
merged 1 commit into from Jun 9, 2017

Conversation

cjcullen
Member

@cjcullen cjcullen commented Jun 1, 2017

Replace todo-grabbag binding w/ more specific heapster roles/bindings.
Move kubelet binding.

What this PR does / why we need it:
The "e2e-rbac-bindings" held 2 leftovers from the 1.6 RBAC rollout process:

  • One is the "kubelet-binding", which grants the "system:node" role to the kubelet. This is needed until we enable the node authorizer. I moved it to the folder with some other kubelet-related bindings.
  • The other is the "todo-remove-grabbag-cluster-admin" binding, which grants the cluster-admin role to the default service account in the kube-system namespace. This appears to be required only for heapster. Heapster will instead use a "heapster" service account, bound to a "system:heapster" role on the cluster (no write permissions) and a "system:pod-nanny" role in the kube-system namespace (roughly sketched below).
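
For orientation, a rough sketch of the shape this takes, assuming a dedicated heapster service account in kube-system; the binding names here are illustrative, and the authoritative manifests are the ones in the GCE addons directory:

apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: heapster-binding            # illustrative name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:heapster
subjects:
- kind: ServiceAccount
  name: heapster
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: RoleBinding
metadata:
  name: heapster-nanny-binding      # illustrative name
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: system:pod-nanny
subjects:
- kind: ServiceAccount
  name: heapster
  namespace: kube-system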

Which issue this PR fixes: Addresses part of #39990

Release Note:

New and upgraded 1.7 GCE/GKE clusters no longer have an RBAC ClusterRoleBinding that grants the `cluster-admin` ClusterRole to the `default` service account in the `kube-system` namespace.
If this permission is still desired, run the following command to explicitly grant it, either before or after upgrading to 1.7:
    kubectl create clusterrolebinding kube-system-default --serviceaccount=kube-system:default --clusterrole=cluster-admin

@cjcullen cjcullen added the release-note-none Denotes a PR that doesn't merit a release note. label Jun 1, 2017
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 1, 2017
@k8s-github-robot k8s-github-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Jun 1, 2017
@liggitt
Member

liggitt commented Jun 1, 2017

This is needed until we enable the node authorizer.

And until kubelets get credentials that work with the node authorizer

@liggitt
Member

liggitt commented Jun 1, 2017

The other is the "todo-remove-grabbag-cluster-admin" binding, which grants the cluster-admin role to the default service account in the kube-system namespace. This appears to only be required for heapster, which we can scope down quite a bit.

Isn't it needed for many of the add-ons that run as the default service account?

apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRole
metadata:
  name: system:heapster-cluster-role
Member

@deads2k thoughts on system: prefixes for roles defined in addons?

Contributor

@deads2k thoughts on system: prefixes for roles defined in addons?

I'm ok with doing it for addons which live in the kubernetes org (not kubernetes-incubator).

  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
rules:

Member Author

It doesn't. Didn't realize that one was there. I'll drop this one and refer to the existing one.

  resources:
  - pods
  verbs:
  - get
Member

heapster only gets pods in the kube-system namespace?

Member Author

I saw the "pod_nanny" (one of heapster's containers) getting the heapster pod from the API.

Contributor

the sidecar dynamically resizes the heapster deployment, so it'll also need to do a put/patch on deployments to resize itself.

@Q-Lee

Contributor

Ya, you need write on deployments.

Contributor

the sidecar dynamically resizes the heapster deployment, so it'll also need to do a put/patch on deployments to resize itself.

This seems like a questionable power to grant to a pod. If it's given the power to modify its own deployment, then a compromise of that pod can't be resolved by simply deleting it, since the pod has been allowed to modify its own config.

In addition, because this happens inside the kube-system namespace, compromising the heapster pod would compromise every deployment, rs, rc, secret, etc. in the cluster: the pod gains the power to change its own service account, which lets it mount any SA token in the kube-system namespace, where the controller SAs live.

Member

This seems like a questionable power to grant to a pod

Agree, and since not all installations run the pod nanny, it shouldn't be part of the standard heapster role.

Contributor

It would be nice if we could assign service accounts to individual pods. I wonder if there's a workaround.

Could we create a "heapster-nanny" role, bind it to a service account, and mount the service account's secret where SAs are usually mounted?
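
A rough sketch of that workaround, with every name here (the service account, the token secret, the image tag) purely hypothetical: the token secret for a dedicated account is mounted over the path where the default token would normally appear, so only the nanny container authenticates as that account.

apiVersion: v1
kind: Pod
metadata:
  name: heapster-example              # hypothetical
  namespace: kube-system
spec:
  serviceAccountName: heapster        # account used by the heapster container
  containers:
  - name: pod-nanny
    image: gcr.io/google_containers/addon-resizer:1.7   # illustrative image/tag
    volumeMounts:
    # Overrides the default token mount for this container only, so the
    # nanny authenticates as the hypothetical "heapster-nanny" account.
    - name: nanny-token
      mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      readOnly: true
  volumes:
  - name: nanny-token
    secret:
      secretName: heapster-nanny-token-xxxxx   # hypothetical token secret name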

Member Author

All 5 of the supported CLUSTER_MONITORING setups have the pod-nanny running with heapster, and we don't create this role/binding unless one of those is selected.

I renamed that role to system:pod-nanny.
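
For reference, a minimal sketch of what a kube-system-scoped system:pod-nanny role covering the accesses discussed here (get on pods, write on deployments) might look like; this is an approximation, not the exact manifest in the PR:

apiVersion: rbac.authorization.k8s.io/v1beta1
kind: Role
metadata:
  name: system:pod-nanny
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
rules:
# Read its own pod to discover the current resource settings.
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get"]
# Resize the heapster deployment (deployments were served from the
# extensions group in 1.6/1.7).
- apiGroups: ["extensions"]
  resources: ["deployments"]
  verbs: ["get", "update", "patch"]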

Member Author

I agree that it is not optimal, but it's better than what currently exists.

@roberthbailey roberthbailey added this to the v1.7 milestone Jun 1, 2017
@roberthbailey roberthbailey added sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. labels Jun 1, 2017
@k8s-github-robot k8s-github-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Jun 2, 2017
@fejta
Contributor

fejta commented Jun 2, 2017

@k8s-bot pull-kubernetes-federation-e2e-gce test this
ref: #46827

@roberthbailey
Contributor

/approve

@roberthbailey
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 2, 2017
@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 2, 2017
  name: system:heapster
subjects:
- kind: ServiceAccount
  name: default
Member

I would have expected a heapster service account here, defined by the addon and referenced inside the deployments... it's odd for one addon to grant a role to the default service account that lots of other addons use
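
As a sketch of the dedicated account being suggested (the labels are assumptions about how the addon manager would handle it):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: heapster
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile

The bindings would then list `kind: ServiceAccount, name: heapster, namespace: kube-system` as their subject instead of `default`, and the heapster deployment would set `serviceAccountName: heapster`.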

  name: system:pod-nanny
subjects:
- kind: ServiceAccount
  name: default
Member

same thing, heapster service account here

@liggitt
Member

liggitt commented Jun 2, 2017

definitely needs a release note, action required. I am certain there are things deploying to GCE that assume the default service account in the kube-system namespace is a cluster admin. The release note should indicate how to grant that permission on a new cluster if it is desired.

# the system:serviceaccount:kube-system:default identity. We need to subdivide
# those service accounts, figure out which ones we're going to make bootstrap roles for
# and bind those particular roles in the addon yaml itself. This just gets us started
apiVersion: rbac.authorization.k8s.io/v1beta1
Member

does removing this mean the add-on manager will delete the existing binding on an upgrade? we should not do that automatically. what do we need to do to avoid binding to the default service account in a new cluster, but avoid deleting the existing binding on an upgrade?

Member Author

We could create a binding that does nothing, and set it as an EnsureExists addon. New clusters would get the toothless policy created, but existing clusters would see that something by that name already exists. It does mean that we have a binding called "todo-remove-grabbag-cluster-admin" forever...
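
A sketch of that stub, assuming it keeps the existing binding's name; the roleRef must still point at a real role, but with no subjects the binding grants nothing:

apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  name: todo-remove-grabbag-cluster-admin
  labels:
    # EnsureExists: the addon manager creates this only if it is missing
    # and never overwrites an existing object with the same name.
    addonmanager.kubernetes.io/mode: EnsureExists
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects: []

New clusters would get this empty binding; upgraded clusters would keep their existing cluster-admin grant untouched.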

Member

tightening permissions on unknown workloads on upgrade isn't something we should do lightly. I'd take an ugly toothless binding for a release or two. Can release note it now, omit it in new clusters, and remove the stub in a couple releases.

Member Author

Creating the toothless binding would be the easiest for now, but I don't see a great path forward for removing it (without re-working how the addon manager works, which wouldn't be a bad thing, but...). I could have a cluster that I originally created at 1.6, and I've kept along with upgrades and now at 1.12 we remove the grabbag-binding, and it breaks me. Any thoughts? I know you aren't a huge fan of the addon manager in general...

Member Author

Bounced this around w/ @roberthbailey and @Q-Lee. I'm starting to lean towards just blowing away this binding, and including instructions in the release note on how to get it back if you really were doing something weird inside your kube-system namespace that needed it.

@liggitt liggitt added the do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. label Jun 2, 2017
@liggitt
Member

liggitt commented Jun 2, 2017

marking do-not-merge pending the question about upgrade behavior

@@ -1126,16 +1126,14 @@ function start-kube-addons {
  local -r src_dir="${KUBE_HOME}/kube-manifests/kubernetes/gci-trusty"
  local -r dst_dir="/etc/kubernetes/addons"

  # prep the additional bindings that are particular to e2e users and groups
  setup-addon-manifests "addons" "e2e-rbac-bindings"
Member
@liggitt liggitt Jun 4, 2017

does this script set up the rbac addon which now contains the replacement kubelet binding?

Member Author

It does now... :)

@cjcullen cjcullen added the release-note-action-required Denotes a PR that introduces potentially breaking changes that require user action. label Jun 5, 2017
@cjcullen cjcullen removed the release-note-none Denotes a PR that doesn't merit a release note. label Jun 5, 2017
@k8s-github-robot k8s-github-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed lgtm "Looks good to me", indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 5, 2017
@liggitt
Member

liggitt commented Jun 6, 2017

I would have preferred this to occur at the beginning of a release cycle, but I'll defer to the owners of the GCE/GKE deployments. At the very least, we should:

  1. Ensure no other in-tree add-ons depend on this binding
  2. Update the release note to indicate that an upgraded cluster will remove the automatically granted cluster-admin permissions from the default service account in the kube-system namespace (if I understand correctly, the binding is removed by the add-on manager after an upgrade)
  3. Include clear instructions in the release note for how to grant this permission if you need it for other things running in your cluster. Note that this can be done before or after upgrading from 1.6 to 1.7, or after deploying a new 1.7 cluster.

@zmerlynn / @roberthbailey feel free to remove the hold when those are done to your satisfaction

@cjcullen
Member Author

cjcullen commented Jun 6, 2017

Rebased and pushed to get an updated set of e2e runs after the code-freeze rush.

@cjcullen
Member Author

cjcullen commented Jun 6, 2017

Also added instructions on how to recreate the binding to the release notes.

@grodrigues3
Contributor

@roberthbailey
This needs to be re-lgtmed due to the rebase.

@cjcullen
Member Author

cjcullen commented Jun 6, 2017

/retest

@@ -1481,10 +1481,6 @@ function start-kube-addons {
  local -r src_dir="${KUBE_HOME}/kube-manifests/kubernetes/gci-trusty"
  local -r dst_dir="/etc/kubernetes/addons"

  # TODO(mikedanese): only enable these in e2e
  # prep the additional bindings that are particular to e2e users and groups
  setup-addon-manifests "addons" "e2e-rbac-bindings"
Contributor
@roberthbailey roberthbailey Jun 7, 2017

should this line be

setup-addon-manifests "addons" "rbac"

as above?

Contributor

Nevermind, it was already in this file.

@roberthbailey
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jun 7, 2017
@k8s-github-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cjcullen, roberthbailey

Associated issue: 39990

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@roberthbailey
Contributor

@liggitt - are you comfortable with the upgrade story now? If so, we can remove the do-not-merge label.

@liggitt
Member

liggitt commented Jun 8, 2017

yes, I'll defer to the GCE/GKE deployment owners

@liggitt liggitt removed the do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. label Jun 8, 2017
@cjcullen
Member Author

cjcullen commented Jun 9, 2017

/retest

@k8s-ci-robot
Contributor

k8s-ci-robot commented Jun 9, 2017

@cjcullen: The following test failed, say /retest to rerun them all:

Test name: pull-kubernetes-federation-e2e-gce
Commit: 91bc7dde9e1fe5a36e5d61237d89a4e2f211775d
Rerun command: @k8s-bot pull-kubernetes-federation-e2e-gce test this

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@cjcullen
Member Author

cjcullen commented Jun 9, 2017

/retest

@k8s-github-robot

Automatic merge from submit-queue

Labels
  • approved Indicates a PR has been approved by an approver from all required OWNERS files.
  • cncf-cla: yes Indicates the PR's author has signed the CNCF CLA.
  • lgtm "Looks good to me", indicates that a PR is ready to be merged.
  • release-note-action-required Denotes a PR that introduces potentially breaking changes that require user action.
  • sig/auth Categorizes an issue or PR as relevant to SIG Auth.
  • sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle.
  • size/L Denotes a PR that changes 100-499 lines, ignoring generated files.