Support Leader Election for argocd-application-controller #3073

Closed
d-kuro opened this issue Feb 2, 2020 · 10 comments
Labels
component:core Syncing, diffing, cluster state cache enhancement New feature or request type:supportability Enhancements that help operators to run Argo CD

Comments

@d-kuro
Contributor

d-kuro commented Feb 2, 2020

Summary

Add leader election support to argocd-application-controller.

Motivation

Running multiple replicas of a Kubernetes controller without coordination causes conflicts.
Supporting leader election allows multiple controllers to work at the same time, which makes the controller highly available.
It also makes it possible to use a rolling update as the controller's Deployment strategy.

In fact, kube-controller-manager, kube-scheduler, cluster-autoscaler, and others already support leader election.

https://kubernetes.io/docs/reference/command-line-tools-reference/kube-controller-manager/
https://kubernetes.io/docs/reference/command-line-tools-reference/kube-scheduler/
https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md#what-are-the-parameters-to-ca

Proposal

Add the following options to argocd-application-controller.

| name | description | default |
| --- | --- | --- |
| leader-elect | Start a leader election client and gain leadership before executing the main loop. Enable this when running replicated components for high availability. | true |
| leader-elect-lease-duration | The duration that non-leader candidates will wait after observing a leadership renewal until attempting to acquire leadership of a led but unrenewed leader slot. This is effectively the maximum duration that a leader can be stopped before it is replaced by another candidate. This is only applicable if leader election is enabled. | 15 seconds |
| leader-elect-renew-deadline | The interval between attempts by the acting master to renew a leadership slot before it stops leading. This must be less than or equal to the lease duration. This is only applicable if leader election is enabled. | 10 seconds |
| leader-elect-retry-period | The duration the clients should wait between attempting acquisition and renewal of a leadership. This is only applicable if leader election is enabled. | 2 seconds |
| leader-elect-resource-lock | The type of resource object that is used for locking during leader election. Supported options are leases (default), endpoints and configmaps. | "leases" |

All durations shown above are the default values.
https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/client-go/tools/leaderelection/leaderelection.go#L111

Also, change the number of replicas for argocd-application-controller in the HA install manifest to 2,
and change the Deployment strategy to a rolling update.

See below for a code sample of leader election.
https://github.com/kubernetes/kubernetes/tree/master/staging/src/k8s.io/client-go/examples/leader-election

I can create a Pull Request for this enhancement issue.

@d-kuro d-kuro added the enhancement New feature or request label Feb 2, 2020
@jessesuen
Member

Some considerations about this request:

  1. The coordination.k8s.io/LeaseLock resource kind has been around only since Kubernetes v1.13, which means K8s v1.13 will become a minimum requirement for Argo CD.

  2. Running multiple controllers will cause the application controller's prometheus metric endpoint to become inconsistent depending on which instance was hit during scraping. If we allow multiple controllers to run, then the metrics will likely need to be sent and cached to redis instead of in-memory like they are currently.

> Supporting leader election allows multiple controllers to work at the same time

  1. Leader election implies an active-passive relationship between the leader vs. followers, where passive controller instances are not actively in use. So we would not receive any scaling benefits from leader election alone. To truly benefit, we would need to additionally implement sharding of the application controller, so that we have active-active controllers which are operating on a different subset of applications.
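
To illustrate the active-active idea, here is a minimal sketch of one possible sharding scheme in which each controller replica only processes the applications whose hashed name maps to its shard index. The `shardFor` and `ownedBy` functions, the FNV hash, and the application names are all assumptions made for illustration. This is not how Argo CD assigns work.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor maps an application name to a shard index in [0, replicas).
// Hypothetical helper: hashing by application name is an assumption.
func shardFor(app string, replicas int) int {
	if replicas < 1 {
		return 0 // treat a misconfigured replica count as a single shard
	}
	h := fnv.New32a()
	h.Write([]byte(app)) // Write on a hash.Hash never returns an error
	return int(h.Sum32() % uint32(replicas))
}

// ownedBy returns the applications that the given shard should process.
func ownedBy(apps []string, shard, replicas int) []string {
	var owned []string
	for _, app := range apps {
		if shardFor(app, replicas) == shard {
			owned = append(owned, app)
		}
	}
	return owned
}

func main() {
	// Example application names, made up for this sketch.
	apps := []string{"guestbook", "billing", "frontend", "metrics"}
	const replicas = 2
	for shard := 0; shard < replicas; shard++ {
		fmt.Printf("shard %d handles %v\n", shard, ownedBy(apps, shard, replicas))
	}
}
```

With this kind of partitioning every replica does useful work, whereas leader election alone leaves the standby idle. The two can be combined, for example by running a leader and a standby per shard.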

@jannfis jannfis added component:core Syncing, diffing, cluster state cache type:supportability Enhancements that help operators to run Argo CD labels May 14, 2020
@raelga
Contributor

raelga commented Jul 21, 2020

+1

@somaliz

somaliz commented Jul 24, 2020

+1

@jannfis
Member

jannfis commented Dec 20, 2021

> which means K8s v1.13 will become a minimum requirement for Argo CD

I think we should be fine with that by now :)

@jullianow

+1

@Interaze

Interaze commented Aug 1, 2023

Hi. I believe this would be nice. There is currently a significant issue in maintaining an HA application set controller.

@Interaze

Interaze commented Aug 1, 2023

I see that the 3.0.0 advertises LEADER_ELECTION_IDENTITY environment variable. I'll look into this and see if this is a solution. If this is the case, perhaps we can close this

@rumstead
Member

rumstead commented Nov 8, 2024

@d-kuro the applicationset controller does support leader election. Are we able to close this issue?

@d-kuro
Contributor Author

d-kuro commented Nov 14, 2024

@rumstead Thanks!

@d-kuro d-kuro closed this as completed Nov 14, 2024

8 participants