federation: Add admission controller for policy-based placement #44786
Conversation
Hi @tsandall. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with @k8s-bot ok to test. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
@k8s-bot ok to test
Replaced assignee @irfanurrehman with @nikhiljindal as per Federation Engineering Roadmap. I will also take a look now.
I did a quick review and in general the code looks great. As I'm going to be OOTO for 2.5 weeks, I'll leave a detailed review up to others so as not to hold things up. As soon as one other has reviewed the code in some detail, I'm happy to approve. @nikhiljindal Can you?
policyConfigMapNamespace = "kube-federation-scheduling-policy"
minRetryBackoff          = time.Millisecond
maxRetryBackoff          = time.Second * 60
defaultRetryBackoff      = time.Millisecond * 100
It would be good to add a comment explaining what the associated backoff algorithm is. Is it standard exponential? If so, what's the purpose of defaultRetryBackoff? etc.
Will update with comment. The admission controller itself is not implementing the backoff; it's done by one of the utilities in kube.
*admission.Handler
policyEngineClient       *rest.RESTClient            // client to communicate with policy engine
policyEngineRetryBackoff time.Duration               // backoff for policy engine queries
client                   internalclientset.Interface // client to communicate with federation-apiserve
apiserve -> apiserver
Will fix.
}


func (c *admissionController) Admit(a admission.Attributes) (err error) {

nit: These extra blank lines are not needed and can be removed
Will fix.
const (
	pluginName               = "SchedulingPolicy"
	configKey                = "schedulingPolicy"
	policyConfigMapNamespace = "kube-federation-scheduling-policy"
Didn't we decide to keep "system" in the name?
"federation-system-scheduling-policy" maybe?
Let me know what you want to do regarding the name. In kubernetes/community#292 this one was suggested and specified in the proposal.
ok, reread the discussion there, thanks for the link.
lg.
	return
}

event := &api.Event{
Use versioned objects (v1.Event instead of api.Event)
Will fix.
It seems the internalclientset requires api rather than v1. Let me know how to proceed.
yes, see my comment above about using versionedclientset.
You will have to use internal API structs with internalclientset.
Consider this resolved.
	Type: "Warning",
}

if _, err := c.client.Core().Events(a.GetNamespace()).Create(event); err != nil {
In our controllers we use EventRecorder. Can we use the same here?
Not sure, I will look into this.
This would make the admission controller depend on the controller manager code. Is this desirable?
It seems the EventRecorder utility is not compatible with the internalclientset. Let me know how we should proceed.
sigh, ok.
	return fmt.Errorf("reason(s): %v", strings.Join(d.Errors, "; "))
}

func convertObject(obj runtime.Object, gv schema.GroupVersion) (interface{}, error) {
It's not clear to me what you are converting the object into and why.
Are you serializing the obj into json?
Also again, the code seems generic. Can we reuse existing code?
This was needed so that the admission controller could construct the policy query as follows:

{
  "input": <resource>
}

What I think would be better is if the <resource> value was not encapsulated inside {"input": ...}. It should be fine to remove this entirely.
// executing a policy query.
type policyEngineInput struct {
	Input struct {
		Resource interface{} `json:"resource"`
Resource can be runtime.Object?
Errors   []string `json:"errors,omitempty"`
Resource struct {
	Metadata struct {
		Annotations map[string]string `json:"annotations,omitempty"`
I think it's fine to keep Annotations as a top-level field. No need to nest it in Resource.Metadata.
Also, I am wondering if it should just return the updated object. That way we won't have to update the API when we move annotations to fields.
The concern I have with returning the entire object is it implies that the admission controller supports updates on arbitrary fields. Currently, this is not the case, it only supports annotations. Changing the admission controller to support arbitrary fields (e.g., annotations, spec.foo, status.bar[0].baz, etc.) would be tricky with the current admission controller framework.
I'm OK with changing the response to expect (i) errors and (ii) annotations for now. Let me know what you think.
Yes, on thinking more I agree with you. This way, we restrict what the policy engine can change: it cannot change anything apart from annotations on the object.
In fact, I think we can restrict it further. Instead of returning a whole map of annotations that we merge as is, it should only return cluster selector values.
This is fine as is for now.
	return nil
}

func (c *admissionController) SetInternalKubeClientSet(client internalclientset.Interface) {
Why internal clientset and not the versioned clientset?
As far as I know, this is the only option available: https://github.com/kubernetes/kubernetes/blob/master/pkg/kubeapiserver/admission/initializer.go#L31.
We had updated all our controllers to use versioned clientset instead of internal ones.
Looks like we still need to do that work for our admission controllers.
cc @caesarxuchao to confirm.
Thanks @nikhiljindal. It's ok for admission controllers to use the internal clientset. The admission controller runs in the same process as the apiserver, and iirc objects are supplied by the apiserver to the admission controller, and the objects are already converted to the internal version.
As discussed offline with @caesarxuchao, admission controllers (like this one) also talk to apiserver directly to fetch other resources (like configmaps and events in this case).
Admission controllers should be using versioned clientsets for that.
This PR is fine as is since we do not have a way to provide versioned clients today, but adding support for it should not be too difficult. We can add another method SetVersionedKubeClientSet similar to SetInternalKubeClientSet.
This will become critical as we move admission controllers out of repo without moving the internal API structs.
func (c *admissionController) Admit(a admission.Attributes) (err error) {

	failClosed, err := c.shouldFailClosed()
if there is no policy defined, then we can just return here. No need to query the policy engine.
For test purposes it's useful to have the fail-open option. Why don't we make this configurable as follows:
- By default, if no ConfigMap exists, return here. If a ConfigMap exists, fail closed.
- If the admission controller configuration includes some flag (e.g., failOpenOnNonExistentPolicy), then the current behaviour will be used.
Sorry, can you explain more why the fail-open option is required for testing?
By default, if no ConfigMap exists, return here. If ConfigMap exists, fail-closed.
This sgtm.
Thanks for the PR @tsandall
High level comments:
- Use versioned clientset and versioned objects.
- Share as much code as possible with existing controllers.
- Some discussion on the policy engine API. Returning the object instead of annotations is compatible with future changes when we move from annotations to fields.
cc @sig-federation-pr-reviews. Also, please update the description to include release notes.
(force-pushed from e1d0911 to 962b9c0)
@k8s-bot pull-kubernetes-federation-e2e-gce test this
}

if len(cfg.Kubeconfig) == 0 {
	return nil, fmt.Errorf("kubeconfig must specify path")
This is not meaningful at all
My bad, I'll improve this.
// backoff implementation is handled by k8s.io/apiserver/pkg/util/webhook.
// The defaultRetryBackoff is used if the admission controller
// configuration does not specify one.
minRetryBackoff = time.Millisecond
I assume you are following the pattern of other admission controllers, but it is not clear to me why we need these.
The admin should be allowed to configure any backoff they see fit.
Indeed, I was just following convention. I'll remove these.
-	client internalclientset.Interface // client to communicate with federation-apiserve
+	client internalclientset.Interface // client to communicate with federation-apiserver

// controls whether policy engine is queried when policy has not been
What's a scenario where an admin should set this to true?
The admin would set this flag to true if they were loading the policies into the engine out-of-band and wanted to exercise the policies for test purposes.
This flag would not be set in production.
In the initial version of the PR this flag was not necessary as the admission controller ALWAYS queried the policy engine and the fail-open vs fail-closed decision was determined by the existence of a ConfigMap in the built-in namespace.
I don't think we need this flag.
Even for testing, the admin should create the policies in a ConfigMap.
If there is no ConfigMap (i.e., policies are not defined), then the admission controller can just return.
If there is a ConfigMap, then the admission controller should call the policy engine and fail in case of an error.
@quinton-hoole or @csbell It would be great if one of you could also take a look at this, especially the API with the policy engine part (query.go).
@k8s-bot pull-kubernetes-federation-e2e-gce test this
(force-pushed from 60cf0fd to 5847b7d)
(force-pushed from 3532a88 to ab9669e)
Mostly looks good.
(force-pushed from ab9669e to 79bf550)
/release-note /lgtm
(force-pushed from 79bf550 to 8a3c637)
/lgtm
Fixed the bazel build failure. I forgot to run ./hack/update-bazel.sh after addressing @marun's suggestion about runtime.HandleError.
/approve (for hack/)
(force-pushed from 8a3c637 to 470e99c)
/lgtm
@k8s-bot pull-kubernetes-e2e-kops-aws test this
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: madhusudancs, nikhiljindal, tsandall. Associated issue: 39982. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS files:
You can indicate your approval by writing /approve in a comment.
Automatic merge from submit-queue (batch tested with PRs 46235, 44786, 46833, 46756, 46669)
@nikhiljindal
Here's the initial version of the scheduling policy admission controller. It's at the point where it would benefit from having another pair of eyes look at it. The main thing I'm unsure of is the serialization of Kube resources for the webhook/query call.
Release Note:
Ref #39982