Separate the pod template from replicationController #170
If we were to remove replicationController from the core apiserver into a separate service, I'd leave the pod template in the core. |
I'll take a shot at this. @bgrant0607 Brian mentioned cron. This makes me think he wants to use podTemplates for delegation. Will the delegation of power be part of the /podTemplate message, or will that be stored in some sideband ACL? |
PUT /pods should only take pods, IMO. We should potentially offer something that "fills out" a pod template, but I think that should work like our current replication controller, which is an external component. |
The point of this proposal is to further narrow the responsibility and API of the replicationController to its bare essentials. The replicationController should just spawn new replicas.

Right now, essentially a full copy of the pod API is embedded in the replicationController API. As an external, independently versioned API, it would be challenging to keep synchronized with the core API.

Additionally, with the replicationController creating pods by value rather than by reference, it needs to be delegated the authority to create ~arbitrary pods as ~arbitrary users (once we support multi-tenancy) -- this would mean it could do anything as anybody. This is even more of an issue once we introduce an auto-scaler layered on the replicationController. It's very difficult to develop a signature-based approach layered on a literal pod creation API that could be made both usable and secure. OTOH, if the replicationController could only spawn instances from templates owned by the core, then its power could be restricted. Think of this as the principle of least privilege and separation of concerns.

Including the template by reference in the replicationController API would also facilitate rollbacks to previous pod configurations. A standalone template could be used for cron and other forms of deferred execution.

Even if we were to add a more general templating/configuration-generation mechanism in the future, we can't have turtles all the way down. A pod template would be useful for spawning the config generator, among other things.

As with the current replicationController API, pods would have no relationship to the template from which they were generated other than their labels and any other provenance information we kept. Changes to the template would have no effect on pods created from it previously.

I'd be fine with a separate API endpoint for creating pods from a template, just as we have for replicationController today. |
Okay, putting together above comments and my own thoughts... For the initial PR, we just need to have a pod template type which unambiguously and completely defines a /pod. Later PRs can extend /podTemplate as needed to support authorization of delegated use. Here are a few examples with delegation and the types of expansion of templates that might occur: Considerations for podTemplate:
Therefore, a /podTemplate will look something like this:

    {
      "id": "awesomePodTemplate",
      "pod": "<object exactly following /pod schema>",
      "allowExtraEnvVars": {
        "MOTD": "Today's pod brought to you by a replication controller."
      },
      "allowModifiedResourcesRequestsAndLimits": 1,
      "delegatedPodMakers": ["[email protected]", "[email protected]"]
    }

Note the specific, capability-like descriptions of allowed modifications to the /pod object. However, the first PR will just have:

    {
      "id": "myPodTemplate",
      "pod": "<object exactly following /pod schema>"
    }

The /pod schema will get a new member "actAsUser". This affects which user the pod acts as. Initially, the authorization check for creating a /pod would be:

    if authenticatedUser == request.pod.actAsUser { return auth.Authorized }
    return auth.NotAuthorized

In a later PR, the /pod schema will be extended to have a "fromPodTemplateId" member which references the id of the /podTemplate that this /pod is modeled on. This adds an interesting twist: we can't use the user-provided name alone to identify the /podTemplate. We need to specify which user's namespace the name lies in. Maybe "actAsUser" identifies this or maybe we need a globally unique id for a podTemplate. With that member added, the authorization check for creating a /pod would look like this:

    if authenticatedUser == request.pod.actAsUser { return auth.Authorized }
    if auth.Can(authenticatedUser, auth.MakePodsFor, request.pod.actAsUser) {
      tpl := findPodTemplate(request.pod.fromPodTemplateId)
      if tpl != nil {
        if tpl.Generates(request.pod) {
          return auth.Authorized
        }
      }
    }
    return auth.NotAuthorized |
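The delegated authorization check proposed above can be sketched as runnable Go. This is only an illustration under stated assumptions: the `Pod` and `PodTemplate` types are heavily reduced stand-ins, `Generates` is one possible reading of "the template generates this pod" (fixed fields match, extra env vars must be explicitly allowed), and the `auth.Can(...)` call from the pseudocode is replaced here by a simple lookup in `DelegatedPodMakers`. None of these names come from an actual Kubernetes API.

```go
package main

import "fmt"

// Pod is a hypothetical, reduced stand-in for the /pod schema.
type Pod struct {
	ActAsUser         string
	FromPodTemplateID string
	Image             string
	Env               map[string]string
}

// PodTemplate is a hypothetical stand-in for /podTemplate.
type PodTemplate struct {
	ID                 string
	Pod                Pod
	AllowExtraEnvVars  map[string]bool // env var names a delegate may add
	DelegatedPodMakers []string        // users allowed to instantiate the template
}

// Generates reports whether pod could have been produced from the template:
// fixed fields must match, template env vars must be unchanged, and any
// additional env var must be listed in AllowExtraEnvVars.
func (t *PodTemplate) Generates(pod Pod) bool {
	if pod.ActAsUser != t.Pod.ActAsUser || pod.Image != t.Pod.Image {
		return false
	}
	for name, want := range t.Pod.Env {
		if got, ok := pod.Env[name]; !ok || got != want {
			return false
		}
	}
	for name := range pod.Env {
		if _, fixed := t.Pod.Env[name]; !fixed && !t.AllowExtraEnvVars[name] {
			return false
		}
	}
	return true
}

// delegatesTo stands in for auth.Can(user, auth.MakePodsFor, owner).
func (t *PodTemplate) delegatesTo(user string) bool {
	for _, u := range t.DelegatedPodMakers {
		if u == user {
			return true
		}
	}
	return false
}

// Authorize mirrors the pseudocode: owners may create their own pods;
// delegates may only create pods that a referenced template generates.
func Authorize(user string, pod Pod, templates map[string]*PodTemplate) bool {
	if user == pod.ActAsUser {
		return true
	}
	tpl, ok := templates[pod.FromPodTemplateID]
	if !ok {
		return false
	}
	return tpl.delegatesTo(user) && tpl.Generates(pod)
}

func main() {
	templates := map[string]*PodTemplate{
		"awesomePodTemplate": {
			ID:                 "awesomePodTemplate",
			Pod:                Pod{ActAsUser: "alice", Image: "web"},
			AllowExtraEnvVars:  map[string]bool{"MOTD": true},
			DelegatedPodMakers: []string{"replication-controller"},
		},
	}
	pod := Pod{ActAsUser: "alice", FromPodTemplateID: "awesomePodTemplate",
		Image: "web", Env: map[string]string{"MOTD": "hello"}}
	fmt.Println(Authorize("replication-controller", pod, templates))
}
```

The point of the design is that the delegate never gets a general "create any pod as alice" capability; it can only stamp out pods that the template owner has already described.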
Other use case: pod's port can come from a range, to allow duplicate pods on the same host. Would this go in the template? |
I wonder if we should maybe add an "owner" field to the JSONBase, so that all objects in the system could have an owning user. If so, no need to specifically add that field to the PodTemplate.
This could be done with a label, which is what our current replicationController does. I think a step that should come shortly after adding PodTemplate as a resource is changing the replication controller struct to take a podTemplateID instead of a hardcoded PodTemplate. Port shouldn't be dynamic. May want @brendanburns to take a look at this when he gets back. |
Thanks, @erictune.

First of all, while security is part of the motivation for this, I'd drop all user / identity / auth / delegation stuff until we figure out auth[nz] more generally. That said, we'll want to namespace label keys by project implicitly by default to prevent conflicts and overlap across users.

Second, we should leave out most/all forms of substitution and computation more generally. A more general config mechanism is a separate issue. I was thinking of taking what replicationController supports today and moving it to a separate object, which we might want to garbage collect after some amount of time in the case that it hasn't been used.

However, I think it's not too early to think about the override model, and whether we want one eventually, even though we wouldn't implement it initially. Env. vars. and resources are good examples.

It would be useful to think about how splitting out the template (and overrides) would interact with updates driven by declarative configuration. Does the replicationController change to a new template, or does one update its template? How does one update pods controlled by the replicationController? Some ideas were discussed in #492 .

Duplicate pods on the same host: We implement IP per pod, so no port allocation range is necessary.

fromPodTemplateId: The template must behave as a cookie cutter -- once a pod is created from a template, it has no relationship to the template. The template may be changed or deleted without affecting pods created from it, and pods created from it may be modified independently. We probably do want to record provenance information for debugging and/or auditing, though. It would include information like the template id, time, replication controller id (if created by one), user, etc. |
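The cookie-cutter behavior described above can be sketched in Go as a copy-on-create operation that records provenance but keeps no live link to the template. All type and field names here (`Provenance`, `Stamp`, and so on) are hypothetical, chosen only to illustrate the idea.

```go
package main

import (
	"fmt"
	"time"
)

// PodSpec is a hypothetical, reduced pod description.
type PodSpec struct {
	Image string
	Env   map[string]string
}

// Provenance records where a pod came from, for debugging and auditing only.
type Provenance struct {
	TemplateID   string
	ControllerID string // empty if not created by a replication controller
	User         string
	CreatedAt    time.Time
}

type Pod struct {
	ID         string
	Spec       PodSpec
	Provenance Provenance
}

type PodTemplate struct {
	ID   string
	Spec PodSpec
}

// copySpec deep-copies the spec so the pod shares no state with the template.
func copySpec(s PodSpec) PodSpec {
	env := make(map[string]string, len(s.Env))
	for k, v := range s.Env {
		env[k] = v
	}
	return PodSpec{Image: s.Image, Env: env}
}

// Stamp creates a pod from the template by value. Later edits to either
// side do not affect the other; only the provenance record remembers the link.
func Stamp(t *PodTemplate, podID, controllerID, user string) Pod {
	return Pod{
		ID:   podID,
		Spec: copySpec(t.Spec),
		Provenance: Provenance{
			TemplateID:   t.ID,
			ControllerID: controllerID,
			User:         user,
			CreatedAt:    time.Now(),
		},
	}
}

func main() {
	tpl := &PodTemplate{ID: "frontend", Spec: PodSpec{
		Image: "web:1", Env: map[string]string{"MODE": "prod"}}}
	pod := Stamp(tpl, "frontend-0", "frontend-rc", "alice")

	// Changing the template afterwards leaves the existing pod untouched.
	tpl.Spec.Image = "web:2"
	tpl.Spec.Env["MODE"] = "debug"
	fmt.Println(pod.Spec.Image, pod.Spec.Env["MODE"], pod.Provenance.TemplateID)
}
```

The deep copy of `Env` matters: copying the struct alone would leave the pod and the template sharing one map, which would violate the "no relationship" rule.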
@bgrant0607 Can you describe config generators a bit more as mentioned in 146? Haven't heard you mention that yet, but I suspect it matches use cases we are looking to solve as well. |
Created #1007 to start the broader config discussion. #503 contains another example that could use the pod template: job controller. I'd like a bulk-creation operation to go with the pod template, so that a replication controller could send one operation to create N pods. This will eventually be important for performance, gang scheduling, usage analytics, etc. |
@smarterclayton @erictune @lavalamp Trying to make this concrete.

Standalone Pod Template

From #1225:
It should also have a

There is the question of whether we want metadata like labels and annotations to come from the template or to be provided at pod instantiation time or both. Taking metadata from the template is easier to use and more secure. Providing metadata at instantiation time would allow more flexible template reuse. I'm going to declare that flexible template reuse is a problem for the config system, not the PodTemplate, so I recommend we take metadata from the PodTemplate.

The Metadata in the struct above is the PodTemplate's metadata. Typically, the pods created from the template will have the same labels and at least some of the same annotations, but may have additional annotations, such as to record the template from which they were created. However, for cleanliness (and flexibility, also), I recommend a separate field for pod metadata. I could also foresee us adding more fields in the future, such as authorization info, TTL, etc. Therefore, I propose a PodTemplateSpec, which includes PodMetadata, PodSpec, and whatever other desired state fields we need.
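One way the proposed layering could look in Go is sketched below. The struct definitions from #1225 are not reproduced in this thread, so everything here is an assumption: the field names, the reduced `ObjectMetadata` and `PodSpec` types, and the `createdFromTemplate` annotation key are all hypothetical. The sketch only shows the separation between the template's own metadata and the metadata given to pods stamped from it.

```go
package main

import "fmt"

// ObjectMetadata is a reduced, hypothetical metadata type.
type ObjectMetadata struct {
	Name        string
	Labels      map[string]string
	Annotations map[string]string
}

// PodSpec is a reduced, hypothetical pod spec.
type PodSpec struct {
	Image string
}

// PodTemplateSpec holds the desired state of the template: the metadata and
// spec that pods created from it will receive.
type PodTemplateSpec struct {
	PodMetadata ObjectMetadata
	Spec        PodSpec
}

// PodTemplate carries its own Metadata, distinct from Spec.PodMetadata.
type PodTemplate struct {
	Metadata ObjectMetadata
	Spec     PodTemplateSpec
}

type Pod struct {
	Metadata ObjectMetadata
	Spec     PodSpec
}

// templateAnnotation is a hypothetical key recording the source template.
const templateAnnotation = "createdFromTemplate"

func copyMap(m map[string]string) map[string]string {
	out := make(map[string]string, len(m))
	for k, v := range m {
		out[k] = v
	}
	return out
}

// NewPod takes labels and annotations from the template's PodMetadata and
// adds one annotation naming the template the pod was created from.
func NewPod(t PodTemplate, name string) Pod {
	md := ObjectMetadata{
		Name:        name,
		Labels:      copyMap(t.Spec.PodMetadata.Labels),
		Annotations: copyMap(t.Spec.PodMetadata.Annotations),
	}
	md.Annotations[templateAnnotation] = t.Metadata.Name
	return Pod{Metadata: md, Spec: t.Spec.Spec}
}

func main() {
	tpl := PodTemplate{
		Metadata: ObjectMetadata{Name: "frontend-template",
			Labels: map[string]string{"kind": "template"}},
		Spec: PodTemplateSpec{
			PodMetadata: ObjectMetadata{Labels: map[string]string{"app": "frontend"}},
			Spec:        PodSpec{Image: "web:1"},
		},
	}
	pod := NewPod(tpl, "frontend-0")
	fmt.Println(pod.Metadata.Labels, pod.Metadata.Annotations)
}
```

Keeping `PodMetadata` separate means labels that classify the template itself (here `kind=template`) do not leak onto the pods it produces.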
Bulk Pod Creation

By themselves, PodTemplates don't do anything. They are there to be used. I propose to extend POST /pods with 2 URL parameters:
More about the format of object references below.

Replication Controller

Currently (even in the v1beta3 proposal), ReplicationControllerSpec contains an inline pod template. There are 3 alternative approaches to using a PodTemplate in ReplicationController:
In order to produce the decoupling and security properties I was looking for when I proposed this issue, the replication controller service needs to be able to utilize the template at pod creation time rather than at the time the replication controller is created. Therefore, the ReplicationControllerSpec needs a reference to the PodTemplate. This has the disadvantage of creating a hard dependency between two objects -- the replication controller could fail if its pod template were deleted -- but we could disallow deletion of PodTemplates that were in use. We already have another creation-order dependency -- services must be created before their clients -- so that wouldn't be a new issue.

I'm tempted to recommend (3), so we could support both simple and more sophisticated use cases, but (A) I'm concerned that inline PodTemplateSpecs in ReplicationControllerSpec will create problems down the road for auth and for API refactoring and (B) kubecfg could paper over the complexity of dealing with multiple objects for now and a Real Config solution or higher-level API should be able to deal with it later. So, I recommend (2): PodTemplate by reference only.

Inter-object references

This could (and probably should) be forked into its own issue if there's a lot of debate. We don't currently have any cross-references between objects in our API. We just have indirect references via label selectors. Possible options:
Using label selectors would require adding a unique label to facilitate unique references, which is sort of contrary to what labels are for, or a non-label tie-breaking field to select the correct one from the set. Additionally, the consistency model would be more complex -- after adding a new template, users would want to ensure that the replication controller would use it before performing an action that would cause new pods to be created, such as killing pods or increasing the replica count. This seems overly complex for only a small benefit. UID has the problems that it isn't even indexed currently, would be hard for users to reason about, and couldn't be specified without additional communication with the apiserver and processing in the client. In particular, it would be hostile to configuration. JSON would require another encoding for URL parameters. Therefore, I suggest consistency with API object references from outside the system, so either (4) or (5). The reason to not use (5) is because the domain name and version are not necessarily stable (esp. if we replicate and/or self-host apiserver), so I recommend (4), path without API version. This form would be used both in URL parameters (pod creation from template) and in object fields (replication controller). This form also happens to be the most concise. |
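To make the recommended reference form concrete, here is a minimal Go sketch that normalizes a reference to a path without host or API version. The exact path layout is not spelled out in this thread, so the `<resource>/<name>` shape, the `/api/<version>/` prefix it strips, and the `ObjectRef` and `ParseRef` names are all assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// ObjectRef is a hypothetical version-free reference such as
// "podTemplates/frontend".
type ObjectRef struct {
	Resource string
	Name     string
}

// String renders the concise form usable in object fields or, after
// escaping, in URL parameters.
func (r ObjectRef) String() string { return r.Resource + "/" + r.Name }

// ParseRef accepts either the concise path form or a full URL that includes
// a host and an assumed /api/<version>/ prefix, and drops the unstable parts.
func ParseRef(s string) (ObjectRef, error) {
	u, err := url.Parse(s)
	if err != nil {
		return ObjectRef{}, err
	}
	parts := strings.Split(strings.Trim(u.Path, "/"), "/")
	if len(parts) == 4 && parts[0] == "api" {
		parts = parts[2:] // discard "api" and the version segment
	}
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return ObjectRef{}, errors.New("reference must look like <resource>/<name>")
	}
	return ObjectRef{Resource: parts[0], Name: parts[1]}, nil
}

func main() {
	for _, s := range []string{
		"podTemplates/frontend",
		"https://apiserver.example/api/v1beta3/podTemplates/frontend",
	} {
		ref, err := ParseRef(s)
		fmt.Println(ref, err)
	}
}
```

Both inputs normalize to the same reference, which is the stability property argued for above: the stored form does not change when the apiserver's domain name or API version does.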
Re: references, I agree with reasoning about 1, 2, 3, and 5. I would also agree with 4 as better than the alternatives. One problem with 4 is when we rename resources (minions -> nodes). Re: templates, Some potential problems that could crop up with referenced pods:
|
An alternative to fully moving the pod template out and replacing it with a reference to an external resource: Create a pod template subresource for all the controllers (and even for CRDs eventually). This would address the following use cases:
|
If we had API support for apply (#17333), PodPreset could mutate the template spec without changing the user's intent. |
Issues go stale after 90d of inactivity. |
/remove-lifecycle stale |
I'm not sure if admission plugins were already discussed as part of the motivation for potentially making use of the top-level PodTemplate resource, but here is a concrete use case where such a design would have been helpful: |
/remove-kind design
|
This was the original design (sunit prototype = pod template, replicate = replica set): But it can be addressed through a combination of type imports and duck typing. I don't think this will happen, so closing. |
@bgrant0607: Closing this issue. |
We should separate the pod template from replicationController, to make it possible to create pods from template without replicationController (e.g., for cron jobs, for deferred execution using hooks). This would also make updates cleaner.