storage: propagate TransformFromStorage errors from List #69399
Conversation
Like we do everywhere else we use TransformFromStorage. The current behavior is causing all service account tokens to be regenerated, invalidating old service account tokens and unrecoverably breaking apps that are using InClusterConfig or exported service account tokens. If we are going to break stuff, let's just break the Lists so that misconfiguration of the encryption config or checkpoint corruption is obvious.
Please let me know what you think.
/sig api-machinery
/sig auth
@kubernetes/sig-api-machinery-pr-reviews
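For illustration, here is a minimal, self-contained Go sketch of the behavioral difference being proposed. The identifiers and error strings are invented for the example (this is not the actual apiserver code): the point is only that a transformer failure now fails the whole List instead of the undecryptable item being silently dropped.

```go
package main

import (
	"errors"
	"fmt"
)

// transformFromStorage stands in for the storage transformer (for example an
// encryption-at-rest provider); it fails for a value that was written with a
// key the current encryption config no longer knows about.
func transformFromStorage(raw string) (string, error) {
	if raw == "written-with-old-key" {
		return "", errors.New("no matching encryption key found")
	}
	return "decoded:" + raw, nil
}

// listPropagatingErrors mirrors the behavior proposed here: a transform
// failure fails the whole List, instead of the item being silently skipped,
// which is what made consumers think the tokens were gone and regenerate them.
func listPropagatingErrors(raws []string) ([]string, error) {
	out := make([]string, 0, len(raws))
	for _, raw := range raws {
		v, err := transformFromStorage(raw)
		if err != nil {
			return nil, fmt.Errorf("unable to transform item %q: %v", raw, err)
		}
		out = append(out, v)
	}
	return out, nil
}

func main() {
	items := []string{"token-a", "written-with-old-key", "token-b"}
	if _, err := listPropagatingErrors(items); err != nil {
		// The misconfiguration now surfaces as a loud List failure rather
		// than as "missing" objects quietly being recreated.
		fmt.Println("List failed:", err)
	}
}
```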
If we merge this, will working but degraded clusters suddenly regress when upgraded?
The original justification was a concern about partial degradation of a list due to an external factor not under our control (the encryption service). But in practice, if your encryption config is broken, your cluster probably should grind to a halt.
Yes.
Speaking more generally than Secrets, this behavior breaks what I would expect to be the semantics of ListWatch. If an error returned by TransformFromStorage is transient, two fully synced informer caches could have different state.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: lavalamp, mikedanese. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing `/approve` in a comment.
I'd LGTM a cherrypick of this if I saw one.
Failures were caused by an accidental bump of Go to 1.11. That should be reverted now. /retest
/retest
That's a really good point. I think for that reason I agree. However, we need to figure out how we're going to roll this out to potentially broken users - i.e. how do they mitigate (probably by deleting keys in etcd).
@smarterclayton - We did encounter this issue today while upgrading to 1.13.x; luckily we caught it very early and rolled back to 1.12. I believe there were some errors when the cluster was first created and it was set up again, but etcd was not cleared, so some autogenerated service account tokens in the kube-system and default namespaces were still encrypted with the old key. Right now the plan is to delete these keys (service account tokens) from etcd and try again.
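For reference, a minimal sketch of that kind of cleanup using the etcd v3 Go client. Everything specific here is an assumption: the endpoint, the default "/registry/secrets/..." key prefix, and the placeholder secret name are illustrative, a real cluster also needs the etcd client TLS certificates, and deleting registry keys directly is only sensible when the apiserver can no longer decrypt them (the token controller recreates the deleted token Secrets afterwards).

```go
package main

import (
	"context"
	"fmt"
	"time"

	clientv3 "go.etcd.io/etcd/clientv3"
)

func main() {
	// Hypothetical endpoint; real clusters typically also need TLS config
	// for the etcd client, and the prefix depends on --etcd-prefix
	// (default "/registry").
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"https://127.0.0.1:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// List the secret keys first so you can confirm exactly which
	// undecryptable service account tokens you are about to remove.
	resp, err := cli.Get(ctx, "/registry/secrets/kube-system/",
		clientv3.WithPrefix(), clientv3.WithKeysOnly())
	if err != nil {
		panic(err)
	}
	for _, kv := range resp.Kvs {
		fmt.Println(string(kv.Key))
	}

	// Delete an individual token secret (placeholder name); the token
	// controller will recreate it, encrypted with the current config.
	if _, err := cli.Delete(ctx, "/registry/secrets/kube-system/default-token-xxxxx"); err != nil {
		panic(err)
	}
}
```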