Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Federation] Segregate DNS related code to separate controller #45034

Conversation

shashidharatd
Copy link

@shashidharatd shashidharatd commented Apr 27, 2017

What this PR does / why we need it:
This is the continuation of service controller re-factor work as outlined in #41253
This PR segregates DNS related code from service controller to another controller service-dns controller which manages the DNS records on the configured DNS provider.
service-dns controller monitors the federated services for the ingress annotations and create/update/delete DNS records accordingly.
service-dns controller can be optionally disabled and DNS record management could be done by third party components by monitoring the ingress annotations on federated services. (This would enable something like federation middleware for CoreDNS where federation api server could be used as a backend to CoreDNS eliminating the need for etcd storage.)

Special notes for your reviewer:

Release note:

Federation: A new controller for managing DNS records is introduced which can be optionally disabled to enable third party components to manage DNS records for federated services.

cc @kubernetes/sig-federation-pr-reviews

@k8s-github-robot k8s-github-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 27, 2017
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Apr 27, 2017
@k8s-github-robot k8s-github-robot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Apr 27, 2017
@k8s-reviewable
Copy link

This change is Reviewable

@shashidharatd
Copy link
Author

@k8s-bot bazel test this

@shashidharatd
Copy link
Author

@k8s-bot unit test this

@@ -790,17 +699,6 @@ func (s *ServiceController) updateFederatedService(fedService *v1.Service, newLB
}
}

// Ensure DNS records based on Annotations in federated service for all federated clusters
if needUpdate && wantsDNSRecords(fedService) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear to me where this block of code (and the two removed blocks above) have been moved to?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@quinton-hoole, We may not need this anymore. We look for service status and update the federated ingress annotation. In case of non-loadbalancer type of service, service status will have empty LoadBalancer field and so we don't update the ingress annotation. is this ok?

@nikhiljindal nikhiljindal removed their assignment May 2, 2017
@k8s-github-robot k8s-github-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 5, 2017
@marun marun self-requested a review May 5, 2017 21:30
@shashidharatd shashidharatd force-pushed the federation-service-controller-3 branch from 53b1a3a to ecee9ca Compare May 6, 2017 01:00
@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 6, 2017
@shashidharatd
Copy link
Author

PR rebased. @quinton-hoole, @marun request to PTAL

@shashidharatd
Copy link
Author

/assign @marun

@marun
Copy link
Contributor

marun commented May 10, 2017

@shashidharatd Have you considered using a shared informer to serve up services to both the service and dns controllers? My naive assumption is the data requirements are going to be the same between the 2 controllers.

@shashidharatd
Copy link
Author

@marun, its a good suggestion. i will check it out and update. Thanks

@shashidharatd shashidharatd force-pushed the federation-service-controller-3 branch from ecee9ca to 815fa66 Compare May 12, 2017 11:53
@k8s-github-robot k8s-github-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels May 12, 2017
@shashidharatd
Copy link
Author

@k8s-bot gce etcd3 e2e test this

@shashidharatd
Copy link
Author

shashidharatd commented May 12, 2017

@marun, i gave a shot at using shared informer. Currently shared informer cannot be used by federation controllers since it expects a clientset.Interface. here is the reference. Federation controllers use federation_clientset.Interface. There is incompatibility as of now.

So my suggestion is to go forward without using shared informer for this PR and optimize it in the future. Do you agree?

This is also related to this issue #41302 which we need to address in future.

@marun
Copy link
Contributor

marun commented May 12, 2017

@shashidharatd Ok, that's fine. I didn't realize that it wouldn't be compatible.

@marun
Copy link
Contributor

marun commented May 12, 2017

Modulo some inline concerns, this looks reasonable. I also think this extraction of the dns code is a good opportunity for further cleanup that will enable better unit test coverage. Would you be willing to commit to doing that work in a follow-on PR?


Reviewed 6 of 6 files at r1, 1 of 4 files at r2.
Review status: 4 of 7 files reviewed at latest revision, 13 unresolved discussions.


federation/cmd/federation-controller-manager/app/controllermanager.go, line 146 at r1 (raw file):

	}

	if controllerEnabled(s.Controllers, serverResources, servicecontroller.DNSControllerName, servicecontroller.RequiredResources, true) {

The previous behavior initialized the dns provider before the services controller. Is there a reason this PR doens't maintain that behavior (i.e. failing on dns initialization failure before starting services controller)?

(No action required) Not sure why controllerEnabled accepts the defaultValue parameter if it never varies.


federation/pkg/federation-controller/service/dns.go, line 1 at r1 (raw file):

/*

Is there a reason to keep this controller in the service package if the controller manager instantiates it separately? That also prompts the question of whether the dns controller should be initialized if the services controller is disabled (as is proposed in the this revision). When would that configuration make sense?


federation/pkg/federation-controller/service/dns.go, line 48 at r1 (raw file):

	DNSUserAgentName = "federation-service-dns-controller"

	// minDNSTtl is the minimum safe DNS TTL value to use (in seconds).  We use this as the TTL for all DNS records.

Why is DNS in all caps but TTL is not?


federation/pkg/federation-controller/service/dns.go, line 61 at r1 (raw file):

	zoneName string
	zoneID   string
	// each federation should be configured with a single zone (e.g. "mycompany.com")

Should this comment apply to dnsZone? I'm assuming dnsZones is caching zones from the dnsaas provider.


federation/pkg/federation-controller/service/dns.go, line 68 at r1 (raw file):

	// Informer controller for federated services
	serviceController cache.Controller
	// Client to federated api server

I'm assuming this comment should be applied to federationClient?


federation/pkg/federation-controller/service/dns.go, line 77 at r1 (raw file):

	dns, err := dnsprovider.InitDnsProvider(dnsProvider, dnsProviderConfig)
	if err != nil {
		glog.Fatalf("DNS provider could not be initialized: %v", err)

I think this log line should be removed or at the least downgraded from fatal, since the controller manager will fail if it receives an error from this method.


federation/pkg/federation-controller/service/dns.go, line 93 at r1 (raw file):

	// Check if configuration provided is valid
	if err := d.init(); err != nil {
		glog.Fatalf("DNS provider failed to initialize: %v", err)

ditto


federation/pkg/federation-controller/service/dns.go, line 133 at r1 (raw file):

		service := item.(*v1.Service)

		ingress, err := ParseFederatedServiceIngress(service)

(No action required) Ugh, ingress is such an overloaded term.


federation/pkg/federation-controller/service/dns.go, line 146 at r1 (raw file):

}

func (s *ServiceDNSController) init() error {

A catch-all function name like init is often a good indication that a function is doing too much, and that appears to be the case here. Consider breaking this function down into config validation and zone retrieval to simplify maintenance and testing. Ideally validation of class invariants would be tested separately from zone handling.


federation/pkg/federation-controller/service/dns.go, line 344 at r1 (raw file):

}

/* ensureDNSRrsets ensures (idempotently, and with minimum mutations) that all of the DNS resource record sets for dnsName are consistent with endpoints.

Similarly, why is DNS capitalized but ResourceRecordSets abreviated to Rrsets?


federation/pkg/federation-controller/service/dns.go, line 362 at r1 (raw file):

			if uplevelCname != "" {
				glog.V(4).Infof("Creating CNAME to %q for %q", uplevelCname, dnsName)
				newRrset := rrsets.New(dnsName, []string{uplevelCname}, minDNSTtl, rrstype.CNAME)

Ttl -> TTL (same comment where this appears elsewhere in the file)


federation/pkg/federation-controller/service/dns.go, line 491 at r1 (raw file):

	// dnsNames is the path up the DNS search tree, starting at the leaf
	dnsNames := []string{
		commonPrefix + "." + zoneNames[0] + "." + regionName + "." + serviceDNSSuffix, // zone level - TODO might need other zone names for multi-zone clusters

Consider using strings.Join() to concatenate with a . separator.


federation/pkg/federation-controller/service/servicecontroller.go, line 387 at r1 (raw file):

}

func wantsDNSRecords(service *v1.Service) bool {

How does the new controller avoid processing services that aren't of type load balancer? I can't seem to find a check for the service type. Should there be one in the worker() method?


Comments from Reviewable

@shashidharatd shashidharatd force-pushed the federation-service-controller-3 branch from 815fa66 to d014468 Compare May 12, 2017 19:25
@shashidharatd
Copy link
Author

Thanks for the review @marun, have addressed the comments. PTAL. I am for bettering unit test coverage. shall pick it up in follow-up PR.


Review status: 4 of 7 files reviewed at latest revision, 13 unresolved discussions.


federation/cmd/federation-controller-manager/app/controllermanager.go, line 146 at r1 (raw file):
Actually service controller can function without DNS, so i thought it should be ok to go ahead and initialize Service Controller before validating DNS config parameters. after this PR is merged. DNS controller can optionally be switched off. There is a dependency from DNS controller to service controller, however its not a kind of hard dependency. There will be no issue if DNS controller is enabled and service controller is not enabled. In that case the DNS controller will not receive any service watch events.

In any case i am starting the DNS controller before the service controller now.

(No action required) Not sure why controllerEnabled accepts the defaultValue parameter if it never varies.

Currently all controllers are by default enabled and i feel thats why we see defaultValue to controllerEnabled func may be not required. But in future may be there will be some controller which by default we want to disable.


federation/pkg/federation-controller/service/dns.go, line 1 at r1 (raw file):
It would be good to separate DNS controller to its own package, but currently i wanted to keep the changes to minimum and make the changes just necessary. We shall segregate the DNS related code to another package subsequently.

That also prompts the question of whether the dns controller should be initialized if the services controller is disabled (as is proposed in the this revision). When would that configuration make sense?

Yes you are right. DNS controller should not be enabled if service controller is disabled. I have update the code now.


federation/pkg/federation-controller/service/dns.go, line 48 at r1 (raw file):

Previously, marun (Maru Newby) wrote…

Why is DNS in all caps but TTL is not?

DNS is made all caps, because there were warnings from go lint and there were no warnings for Ttl. Anyway let me change Ttl to TTL


federation/pkg/federation-controller/service/dns.go, line 61 at r1 (raw file):

Previously, marun (Maru Newby) wrote…

Should this comment apply to dnsZone? I'm assuming dnsZones is caching zones from the dnsaas provider.

yes, its mistake carried over from previous code. updated.


federation/pkg/federation-controller/service/dns.go, line 68 at r1 (raw file):

Previously, marun (Maru Newby) wrote…

I'm assuming this comment should be applied to federationClient?

yes, another mistake. updated now


federation/pkg/federation-controller/service/dns.go, line 77 at r1 (raw file):

Previously, marun (Maru Newby) wrote…

I think this log line should be removed or at the least downgraded from fatal, since the controller manager will fail if it receives an error from this method.

yes, you are right. changed to error.


federation/pkg/federation-controller/service/dns.go, line 93 at r1 (raw file):

Previously, marun (Maru Newby) wrote…

ditto

Done.


federation/pkg/federation-controller/service/dns.go, line 133 at r1 (raw file):

Previously, marun (Maru Newby) wrote…

(No action required) Ugh, ingress is such an overloaded term.

yes agree. lack of better term. I am very bad at coining terms :). request you to suggest a better one.


federation/pkg/federation-controller/service/dns.go, line 146 at r1 (raw file):

Previously, marun (Maru Newby) wrote…

A catch-all function name like init is often a good indication that a function is doing too much, and that appears to be the case here. Consider breaking this function down into config validation and zone retrieval to simplify maintenance and testing. Ideally validation of class invariants would be tested separately from zone handling.

split the init function according to the suggestion.


federation/pkg/federation-controller/service/dns.go, line 344 at r1 (raw file):

Previously, marun (Maru Newby) wrote…

Similarly, why is DNS capitalized but ResourceRecordSets abreviated to Rrsets?

Again there were no warnings from golint. Rrset is used heavily in DNS provider. i think it qualifies to be a word now :).


federation/pkg/federation-controller/service/dns.go, line 362 at r1 (raw file):

Previously, marun (Maru Newby) wrote…

Ttl -> TTL (same comment where this appears elsewhere in the file)

Done.


federation/pkg/federation-controller/service/dns.go, line 491 at r1 (raw file):

Previously, marun (Maru Newby) wrote…

Consider using strings.Join() to concatenate with a . separator.

Done.


federation/pkg/federation-controller/service/servicecontroller.go, line 387 at r1 (raw file):

Previously, marun (Maru Newby) wrote…

How does the new controller avoid processing services that aren't of type load balancer? I can't seem to find a check for the service type. Should there be one in the worker() method?

Ok. i have added back an old function which used to do this in service controller, but at a different place.


Comments from Reviewable

@marun
Copy link
Contributor

marun commented May 12, 2017

Thank you for cleanup. There appears to be some strangeness introduced in the most recent revision around the Run method, unless I'm missing something this will break the controller.


Reviewed 1 of 4 files at r2, 3 of 3 files at r3.
Review status: all files reviewed at latest revision, 8 unresolved discussions, some commit checks failed.


federation/pkg/federation-controller/service/dns.go, line 77 at r1 (raw file):

Previously, shashidharatd (Shashidhara T D) wrote…

yes, you are right. changed to error.

(No action required) Is it desirable to handle any of the errors in this function vs just returning the fmt.Errorf() result?


federation/pkg/federation-controller/service/dns.go, line 133 at r1 (raw file):

Previously, shashidharatd (Shashidhara T D) wrote…

yes agree. lack of better term. I am very bad at coining terms :). request you to suggest a better one.

Can you describe what it is? I'm not familiar enough with the service controller to suggest a better name.


federation/pkg/federation-controller/service/dns.go, line 491 at r1 (raw file):

Previously, shashidharatd (Shashidhara T D) wrote…

Done.

(No action required) Consider joining the dns names separately from their components rather than inline.


federation/pkg/federation-controller/service/dns.go, line 75 at r3 (raw file):

	serviceDNSSuffix, zoneName, zoneID string) (*ServiceDNSController, error) {
	dns, err := dnsprovider.InitDnsProvider(dnsProvider, dnsProviderConfig)
	if err != nil || dns == nil {

Ah, I see that you've moved the check for dns to here. I don't think the error handling will be correct, though, since err may be nil and provide no detail to either HandleError or the caller.


federation/pkg/federation-controller/service/dns.go, line 88 at r3 (raw file):

		workQueue:        workqueue.New(),
	}
	if err := d.validateConfigParemeters(); err != nil {

Paremeters -> Parameters (could also just use validateConfig)


federation/pkg/federation-controller/service/dns.go, line 140 at r3 (raw file):

func (s *ServiceDNSController) worker() {
	for {
		func() {

Why is it desirable to wrap this in a function? Couldn't a similar result be achieved by using continue instead of return? I'm also not sure it makes sense to mix a for loop with wait.Until. Consider removing the for loop and decreasing the wait interval to achieve a similar result with less overhead.


federation/pkg/federation-controller/service/dns.go, line 211 at r3 (raw file):

		s.dnsZone = matchingZones[0]
	}
	return fmt.Errorf("Multiple matching DNS zones found for %q; please specify zoneID", s.zoneName)

Am I missing something, or will this statement always be executed (function will always return an error)?


federation/pkg/federation-controller/service/servicecontroller.go, line 253 at r3 (raw file):

	glog.Infof("Starting federation service controller")
	defer glog.Infof("Shutting down federation service controller")

What's to stop this function from starting and shutting down the controller in a single invocation now that the stop channel is not being checked before running the defer statements?


Comments from Reviewable

@shashidharatd shashidharatd force-pushed the federation-service-controller-3 branch 2 times, most recently from af3b312 to 57662d1 Compare May 13, 2017 04:16
@shashidharatd
Copy link
Author

I had caused an issue in previous commit which is fixed now and all tests are passing. Also i tried addressing all the comments. PTAL


Review status: 5 of 7 files reviewed at latest revision, 6 unresolved discussions.


federation/pkg/federation-controller/service/dns.go, line 77 at r1 (raw file):

Previously, marun (Maru Newby) wrote…

(No action required) Is it desirable to handle any of the errors in this function vs just returning the fmt.Errorf() result?

The error handling is at the source of the error and also caller of this function is not doing runtime.HandleError. so i think it should be Ok the way it is.


federation/pkg/federation-controller/service/dns.go, line 133 at r1 (raw file):

Previously, marun (Maru Newby) wrote…

Can you describe what it is? I'm not familiar enough with the service controller to suggest a better name.

The federated service holds information about how the service within a cluster can be accessed from outside the cluster in annotation with the type as described here


federation/pkg/federation-controller/service/dns.go, line 491 at r1 (raw file):

Previously, marun (Maru Newby) wrote…

(No action required) Consider joining the dns names separately from their components rather than inline.

This is done like this to provide similar readability as earlier code.


federation/pkg/federation-controller/service/dns.go, line 75 at r3 (raw file):

Previously, marun (Maru Newby) wrote…

Ah, I see that you've moved the check for dns to here. I don't think the error handling will be correct, though, since err may be nil and provide no detail to either HandleError or the caller.

moved it back.


federation/pkg/federation-controller/service/dns.go, line 88 at r3 (raw file):

Previously, marun (Maru Newby) wrote…

Paremeters -> Parameters (could also just use validateConfig)

mistake, :) fixed now.


federation/pkg/federation-controller/service/dns.go, line 140 at r3 (raw file):

Previously, marun (Maru Newby) wrote…

Why is it desirable to wrap this in a function? Couldn't a similar result be achieved by using continue instead of return? I'm also not sure it makes sense to mix a for loop with wait.Until. Consider removing the for loop and decreasing the wait interval to achieve a similar result with less overhead.

the wrapper function is for defer statement to do workQueue.Done. Also the worker will always be blocked at Get waiting for items. Only when the queue is shutdown the worker will quit the loop, and we had to use wait.Until to restart the worker.

This pattern for worker is quite common and we can find this in few other places within kubernetes code. I have done some update to make this more clear & also updated the serviceController worker to do the same.


federation/pkg/federation-controller/service/dns.go, line 211 at r3 (raw file):

Previously, marun (Maru Newby) wrote…

Am I missing something, or will this statement always be executed (function will always return an error)?

yeah, i introduced a bug mistakenly. Fixed it now.


federation/pkg/federation-controller/service/servicecontroller.go, line 253 at r3 (raw file):

Previously, marun (Maru Newby) wrote…

What's to stop this function from starting and shutting down the controller in a single invocation now that the stop channel is not being checked before running the defer statements?

there is <-stopCh at the end of this function.


Comments from Reviewable

@marun
Copy link
Contributor

marun commented May 15, 2017

Thank you for fixing. Code looks to be in good shape. My primary concern is avoiding non-trivial nested functions, which are really hard to test effectively.


Reviewed 2 of 2 files at r4.
Review status: all files reviewed at latest revision, 4 unresolved discussions.


federation/pkg/federation-controller/service/dns.go, line 133 at r1 (raw file):

Previously, shashidharatd (Shashidhara T D) wrote…

The federated service holds information about how the service within a cluster can be accessed from outside the cluster in annotation with the type as described here

Ah, so it stems from the LoadBalancerIngress collection. Not really something to be changed in federation then.


federation/pkg/federation-controller/service/dns.go, line 139 at r4 (raw file):

func (s *ServiceDNSController) worker() {
	workFunc := func() bool {

I think this should be a first-class method of ServiceDNSController to ensure that it can be unit tested.


federation/pkg/federation-controller/service/servicecontroller.go, line 248 at r4 (raw file):

// It's an error to call Run() more than once for a given ServiceController
// object.
func (s *ServiceController) Run(workers int, stopCh <-chan struct{}) {

I think localization of defer statements makes sense, so that addition or removal of a given controller component can be done in one place. I don't think this change needs to be included in this PR, though. Consider submitting it separately.


federation/pkg/federation-controller/service/servicecontroller.go, line 293 at r4 (raw file):

// fedServiceWorker runs a worker thread that just dequeues items, processes them, and marks them done.
func (s *ServiceController) fedServiceWorker() {
	workFunc := func() bool {

Same comment about making this a first-class function that can be unit tested.


Comments from Reviewable

@shashidharatd shashidharatd force-pushed the federation-service-controller-3 branch from 57662d1 to c9454f8 Compare May 16, 2017 03:51
@shashidharatd
Copy link
Author

@k8s-bot kops aws e2e test this

@shashidharatd
Copy link
Author

Review status: 5 of 7 files reviewed at latest revision, 4 unresolved discussions, some commit checks failed.


federation/pkg/federation-controller/service/dns.go, line 139 at r4 (raw file):

Previously, marun (Maru Newby) wrote…

I think this should be a first-class method of ServiceDNSController to ensure that it can be unit tested.

I have a separate function for workerFunction now which could be tested. is this ok?


federation/pkg/federation-controller/service/servicecontroller.go, line 248 at r4 (raw file):

Previously, marun (Maru Newby) wrote…

I think localization of defer statements makes sense, so that addition or removal of a given controller component can be done in one place. I don't think this change needs to be included in this PR, though. Consider submitting it separately.

Okie, will send them in separate PR. preserved old code.


federation/pkg/federation-controller/service/servicecontroller.go, line 293 at r4 (raw file):

Previously, marun (Maru Newby) wrote…

Same comment about making this a first-class function that can be unit tested.

ditto.


Comments from Reviewable

@shashidharatd
Copy link
Author

@marun, i have addressed your comments. PTAL. Thanks!

@marun
Copy link
Contributor

marun commented May 16, 2017

/lgtm

I think this is another good step in refactoring the service controller. I'm hoping future iterations can focus on ensuring good unit test coverage.


Reviewed 2 of 2 files at r5.
Review status: all files reviewed at latest revision, all discussions resolved.


Comments from Reviewable

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 16, 2017
@shashidharatd
Copy link
Author

Thanks @marun for lgtm.
As @quinton-hoole is away for a while, @csbell, can you PTAL and approve this PR. Thanks !

@shashidharatd
Copy link
Author

/assign @csbell

@csbell
Copy link
Contributor

csbell commented May 16, 2017

/lgtm

@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: csbell, marun, shashidharatd

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 16, 2017
@k8s-github-robot
Copy link

Automatic merge from submit-queue (batch tested with PRs 45247, 45810, 45034, 45898, 45899)

@k8s-github-robot k8s-github-robot merged commit b8f084a into kubernetes:master May 17, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants