
DNS #146

Closed · bgrant0607 opened this issue Jun 18, 2014 · 24 comments

Labels: area/downward-api, sig/network

Comments

@bgrant0607 (Member)

Provide DNS resolution for pod addresses.

@bgrant0607 (Member, Author)

We should also make it possible to plug in other naming/discovery mechanisms, such as etcd, Eureka, and Consul. One prerequisite is that it needs to be possible to get pod IP addresses (#385).

bgrant0607 changed the title from "Container DNS" to "Pod DNS" on Jul 11, 2014
@smarterclayton (Contributor)

It would also be nice if higher-level services could participate in DNS resolution and decorate lower-level services where necessary. Registration of DNS with external parties is also important in larger organizations, where important services must be registered in a corporate DNS. DNS also plays into a number of components, like protocol-aware load balancers where ports are shared and hosts determine routing; we'd want to be able to tie things together as CNAMEs as well.

@bgrant0607 (Member, Author)

@smarterclayton I'd be interested in hearing more about your requirements.

I'm currently thinking that, for the most part, pod DNS is not so useful, except as a shorthand for IPv6 addresses in debugging scenarios.

Except for possible future special cases, pods are relatively disposable and won't have stable addresses or even stable names. In particular, replicationControllers treat pods as fungible, and as discussed in #557 we don't currently plan to reschedule pods, in order to leave future flexibility for layers on top to manage rescheduling, migration, a variety of rolling update schemes, etc.

We could implement dynamic DNS, but we would need to hammer on DNS caching problems in Linux, in language platforms and libraries, in applications, etc.

So, I'm thinking we should support IP per service (I think I mentioned that in the new networking.md doc) and DNS for services.
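As a rough illustration of what "DNS for services" would buy application code, here is a minimal Go sketch of a client resolving a stable service name instead of tracking individual pod addresses; the name my-service and a working cluster resolver are assumptions for illustration, not part of this proposal.

```go
package main

import (
	"fmt"
	"net"
)

func main() {
	// With a per-service IP published in DNS, clients only ever see the
	// stable service name; pod churn behind it stays invisible to them.
	ips, err := net.LookupIP("my-service") // hypothetical service name
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	for _, ip := range ips {
		fmt.Println("service resolves to:", ip)
	}
}
```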

@smarterclayton (Contributor)

IP per service can get very expensive: today we run roughly one load balancer per container. So in some cases we may want services to be cheap (which might really be a different type of service). IP per service seems like it could be optional. I would argue that if you're doing local proxying, you'd be better off hiding the remote proxy port and IP from the containers anyway, to prevent people from baking in assumptions about the remote destination; and if you need a service at a known IP and port, you're typically talking about an edge service.

However, when you need it, it should be possible to do easily.

@smarterclayton (Contributor)

(The above is predicated on IPv4 and the current state of networking in a lot of environments.)

@smarterclayton (Contributor)

It's probably also worth distinguishing between details an API consumer is required to provide and those the infrastructure picks for her. For instance, when creating a service a consumer may omit the service port, but expect to get back a value for the port and a value for the IP, chosen by the infrastructure. My assumption is that software running in the infrastructure is generally flexible about remote ports (especially if they're injected), so not every consumer of a service needs an IP. However, if a specific port is requested, it may be at the infrastructure's discretion whether to satisfy that request or reject it.
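A hypothetical Go sketch of that defaulting behavior; the ServiceSpec type, the assignDefaults helper, and the port range are illustrative stand-ins, not the actual API:

```go
package main

import "fmt"

// ServiceSpec is an illustrative stand-in, not the real API object.
type ServiceSpec struct {
	Name string
	Port int // 0 means the consumer omitted it
}

// nextFreePort is an assumed infrastructure-managed range; purely illustrative.
var nextFreePort = 30000

// assignDefaults fills in anything the consumer omitted; the chosen values
// would be reported back in the creation response.
func assignDefaults(svc *ServiceSpec) {
	if svc.Port == 0 {
		svc.Port = nextFreePort
		nextFreePort++
	}
}

func main() {
	svc := ServiceSpec{Name: "frontend"} // port left unset by the consumer
	assignDefaults(&svc)
	fmt.Printf("service %q assigned port %d\n", svc.Name, svc.Port)
}
```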

@thockin (Member) commented Jul 25, 2014

I feel like there are decisions here I am not aware of and do not understand.

Pod DNS for non-replicated pods IS "IP per service". Maybe we don't want people to run non-replicated pods? Then we should not offer it.

Pods "won't have stable addresses or even stable names"? In identifiers.md we said:

4) Each pod instance on an apiserver has a PodID (a UUID) that is unique across space and time
   1) If not specified by the client, the apiserver will assign this identifier
   2) This identifier will persist for the lifetime of the pod, even if the pod is stopped and started or moved across hosts

"We don't currently plan to reschedule pods" - under what circumstances? If a minion dies, we won't trigger a reschedule? Or are we asserting that the "rescheduled" pod is a different pod, despite being the only one the user asked for?


@lavalamp (Member)

"we don't currently plan to reschedule pods" - under what circumstances?

+1; replication controllers start a new pod instead of waiting for the system to reschedule a pod. This is a nifty control-loop thingy, but I have not been convinced that the part of the system that makes it necessary (where we don't move pods off of damaged hosts) is a feature and not a bug. In fact, I consider it a bug at the moment.
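For context, a very rough Go sketch of the control-loop behavior being described; the helpers countRunningPods and createPod are hypothetical stand-ins for apiserver calls, and this is not the real controller code:

```go
package main

import "fmt"

type ReplicationController struct {
	Selector map[string]string
	Replicas int
}

// countRunningPods stands in for querying the apiserver for pods that match
// the selector; a pod lost with its host simply drops out of this count.
func countRunningPods(selector map[string]string) int {
	return 2
}

// createPod stands in for POSTing a brand-new pod built from the
// controller's template.
func createPod(labels map[string]string) {
	fmt.Println("creating replacement pod with labels", labels)
}

// reconcile creates replacements until the observed count matches the
// desired count; nothing is ever "moved" off a dead host.
func reconcile(rc ReplicationController) {
	for running := countRunningPods(rc.Selector); running < rc.Replicas; running++ {
		createPod(rc.Selector)
	}
}

func main() {
	reconcile(ReplicationController{
		Selector: map[string]string{"app": "frontend"},
		Replicas: 3,
	})
}
```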

@thockin (Member) commented Jul 25, 2014

As a naive user I would expect the Pod Create API to return me an ID that I can use forever and ever to address the Pod I just Created.


@lavalamp (Member)

As a naive user I would expect the Pod Create API to return me an ID that I can use forever and ever to address the Pod I just Created.

I believe that is the case: if you make a pod with a blank ID, the apiserver should fill in the ID field and it will appear in the response. Otherwise it's a bug.

@thockin (Member) commented Jul 25, 2014

So if the machine that pod was on disappears suddenly, and we re-schedule that pod, it should have the same ID. Thus it is the same pod.


@bgrant0607 (Member, Author)

@thockin @lavalamp

This is a question of the layer of abstraction provided by the apiserver vs. higher-level layers, which haven't been built yet. Single-use pods are a useful building block.

Currently a pod is scheduled only upon creation. If a user wants a pod or N pods to continue to exist indefinitely, they must create a replicationController. This mechanism is very robust and flexible. Pods may disappear for any reason, host death only being one, and they will be replaced by the replicationController.

This sort of gives us on-demand restarts and reschedules for free. And it allows for very flexible update policies, as discussed in #492. It also enables rescheduling policy, backup pods, migration, etc. to be built at a higher layer. And one could imagine it working even in the case where we want to move a pod from one apiserver "cell" to another (by decreasing the replica count in one cell and increasing it in another).

There are also cases where one wants to take a pod out of service for data recovery or debugging but needs to replace it in the serving set (aka zombies). And there are cases where we can't kill pods that we need to reschedule/replace (aka phantoms). These scenarios happen even with singletons.

Is a rescheduled pod the same pod? It would have a different address, given our current networking implementation. The service abstraction is the only thing we currently provide in k8s to deal with that. We could try to paper over that by updating DDNS, but we'd need to confront TTL/caching issues.

The pod would also lose its durable-ish local storage and everything else associated with its previous host. It would only be able to migrate its state in the case of remote storage, such as PD.

IMO, we need to be able to control dynamic reassignment of roles within the user's application via the API.

@bgrant0607 (Member, Author)

@smarterclayton Re: automatic port assignment for the existing service approach: agreed. See #390.

@lavalamp (Member)

@bgrant0607 OK I find that pretty convincing. Absolutely definitely do not want "backup" pods to ever be a thing in k8s.

It does mean that you basically always have to make a service & rep. controller even for a single pod, and we'll need to make service names DNS-y (it's no longer critical that pod names are DNS-y), which I think is a change from what @thockin and @smarterclayton were thinking when they hashed out the pod naming system.

(I think, in this case, apiserver should offer some DNS-like pod lookup for debugging purposes, but not for production purposes.)
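For reference, "DNS-y" here roughly means "usable as a DNS label"; a small Go sketch of one common reading (RFC 1035 labels), purely illustrative:

```go
package main

import (
	"fmt"
	"regexp"
)

// dnsLabel matches RFC 1035 labels: a letter, then lowercase alphanumerics
// and hyphens, ending with an alphanumeric; 63 characters at most.
var dnsLabel = regexp.MustCompile(`^[a-z]([-a-z0-9]*[a-z0-9])?$`)

func isDNSLabel(name string) bool {
	return len(name) <= 63 && dnsLabel.MatchString(name)
}

func main() {
	for _, name := range []string{"frontend", "my-service", "My_Service"} {
		fmt.Printf("%-12s DNS-y: %v\n", name, isDNSLabel(name))
	}
}
```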

@smarterclayton (Contributor)

To your last point, I think someone floated the idea of a "dns-service" that does nothing except expose the pod IPs matching a pod label query, which would be just as useful for availability as a proxy as long as rate_of_pod_failure * number_of_pods > average_latency_of_dns_propagate, especially for integrations with clients that already do some variation of this. But in that case the pod name still doesn't have to be DNS-y, and our previous pains to make names resolvable were in vain.
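A sketch of what the client side of such a "dns-service" might look like, assuming a hypothetical name frontend-pods that resolves directly to the matching pod IPs; the resolver behavior and the name are assumptions for illustration:

```go
package main

import (
	"fmt"
	"math/rand"
	"net"
)

func pickBackend(name string) (net.IP, error) {
	ips, err := net.LookupIP(name)
	if err != nil {
		return nil, err
	}
	if len(ips) == 0 {
		return nil, fmt.Errorf("no addresses for %s", name)
	}
	// Naive client-side balancing: pick a random A record. Stale records are
	// exactly the DNS-propagation concern raised above.
	return ips[rand.Intn(len(ips))], nil
}

func main() {
	ip, err := pickBackend("frontend-pods")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println("dialing pod at", ip)
}
```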

@erictune (Member)

While working on #170 I was thinking that it would be more consistent if we got rid of Pods as a create-able API object, and just had PodTemplates and ReplicationControllers. It sounds like that is in line with what @bgrant0607 said above.

So, then, if you can't create a pod, can you at least list them? Or is the Binding object that is being added in #592 sufficient to represent a (potentially) running pod?

@erictune (Member)

I guess we need to have a Pod object for the scheduler to find to Bind. But it would be nice to remove the duplication of data between a pod and a podTemplate (in the context of #170).

@smarterclayton (Contributor)

If the duplication is removed, doesn't that imply that a podTemplate cannot be removed from the system until all dependent pods are also deleted? And that a user cannot prevent templates from being used until that happens? Seems fraught with coupling issues to me (even with immutable templates, and if they're immutable how do you change who can create them?)

@erictune (Member)

Good points, Clayton. Also saw that Brian made the same point on #170 last night... just read it. Okay, suggestion withdrawn.


@thockin (Member) commented Jul 26, 2014

Hmm, I will have to digest this. It sounds good, but it does change some of how I pictured the whole stack.

It feels unfortunate that we say a replicationController becomes required for the vast majority of use cases. Do we need a simpler singleton controller that manages the "exactly one" semantic?

I don't think we should waive the name format requirements at lower levels, though. It is still useful to have a tight spec. Maybe we want to change them, e.g. to C tokens rather than DNS labels, but only if there is a good reason.

@lavalamp (Member)

... a simpler singleton controller that manages the "exactly one" semantic?

SGTM

@smarterclayton (Contributor)

Maybe I misread #170 and the discussion above, but I didn't see a change being proposed to the model after Eric's retraction. I got the following:

  • pod template contents can't be read by repl controller loop (eventually, for security)
  • repl controller loop only has permission to create pod from template id, and specify any overrides allowed by the template
  • binding represents tying a pod to a host - a pod gets bound iff a binding exists
  • repl controller with 1 replica functions as singleton controller as described in last two comments
  • a user can create a pod adhoc at any point without specifying a template
  • a template is required for a repl controller
  • pod template can be changed at any time - it's up to the user to decide whether doing so is a good idea (vs creating a new template and updating the controller).

Or did I miss something?
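As a rough illustration, the model summarized above could be sketched in types like the following; the type names and fields are purely illustrative, not the actual API objects:

```go
package main

// PodTemplate holds the pod definition a controller stamps copies out of.
type PodTemplate struct {
	ID     string
	Labels map[string]string
	// container spec and so on omitted
}

// ReplicationController references a template by ID (it cannot read the
// template contents directly) and says how many copies should exist.
type ReplicationController struct {
	TemplateID string
	Selector   map[string]string
	Replicas   int
}

// Binding ties a pod to a host; a pod is bound iff a Binding exists.
type Binding struct {
	PodID string
	Host  string
}

func main() {}
```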

@bgrant0607 (Member, Author)

@thockin @smarterclayton @erictune This is more the topic of #170, but, since the discussion is here...

I wouldn't be opposed to a simple singleton replicationController -- like a regenerating Pod. It would be nice if it could function anywhere a naked pod could.

If we wanted to get really slick, we could make replicationController capable of making replicas of anything. It just needs a URL to post to in order to create new replicas, a label selector to figure out how many replicas exist, and maybe some kind of health/liveness/readiness check for dynamic filtering of instances that aren't fully functional at the moment.

This would make our primitives more composable. For instance, if/when we were to add a MigratablePod, the replicationController could control a set of replicas of them. Maybe we could even have a replicationController control a set of replicationControllers. Once we added config generators, a replication controller could replicate replicationController/service pairs.
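A hedged sketch of that "replicate anything" idea in Go; the endpoint URL, the countMatching helper, and the empty payload are hypothetical, and a real implementation would also need the health/liveness/readiness filtering mentioned above:

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

type GenericReplicator struct {
	CreateURL string            // POST here to make one more replica
	Selector  map[string]string // used to count what already exists
	Replicas  int               // desired count
}

// countMatching stands in for a label-selector query against the apiserver,
// optionally filtered by a health/readiness check.
func countMatching(selector map[string]string) int {
	return 1
}

func (r GenericReplicator) reconcile() error {
	for existing := countMatching(r.Selector); existing < r.Replicas; existing++ {
		// The payload would be whatever object the URL creates: a pod, a
		// migratable pod, another replicationController, and so on.
		resp, err := http.Post(r.CreateURL, "application/json", bytes.NewBufferString(`{}`))
		if err != nil {
			return err
		}
		resp.Body.Close()
		fmt.Println("created replica, status:", resp.Status)
	}
	return nil
}

func main() {
	r := GenericReplicator{
		CreateURL: "http://apiserver.example/api/pods", // hypothetical endpoint
		Selector:  map[string]string{"app": "frontend"},
		Replicas:  3,
	}
	if err := r.reconcile(); err != nil {
		fmt.Println("reconcile failed:", err)
	}
}
```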

@thockin (Member) commented Oct 7, 2014

Closing in favor of #1261

thockin closed this as completed on Oct 7, 2014