-
Notifications
You must be signed in to change notification settings - Fork 40.6k
Add tolerations for Stackdriver Logging and Metadata Agents. #69737
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
/assign @MaciekPytel |
/assign @x13n |
/ok-to-test |
@@ -58,6 +58,11 @@ spec: | |||
restartPolicy: Always | |||
schedulerName: default-scheduler | |||
terminationGracePeriodSeconds: 30 | |||
tolerations: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussion topic: do we need similar tolerations in the cluster-level agent spec? I suspect the only reason we would is if all nodes are tainted. @x13n @MaciekPytel @kawych @piosz @bmoyles0117
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some of that conversation happened in #68920 (comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks for the pointer. I see the priority class now. Given that we'll only cherry-pick this into 1.12 and 1.11, this should work. If we ever need a 1.10 cherry-pick, we'd have to add stronger tolerations (e.g., CriticalAddonsOnly
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CriticalAddonsOnly
isn't really stronger - the ones added here are wildcard ones, so they include it already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant for the cluster-level agent, which currently doesn't have any tolerations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right. I think if someone is tainting all their nodes, something may break anyway, but yes, this could at least mitigate the problem in 1.10.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a lot of things won't schedule if you taint all your nodes (DNS, metrics-server, ...). I don't think cluster-level agent should have stronger tolerations than other cluster-level system components.
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: qingling128, x13n The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/kind bug |
Created cherry-pick PRs into |
…69737-upstream-release-1.12 Automated cherry pick of #69737: Add tolerations for Stackdriver Logging and Metadata Agents.
…69737-upstream-release-1.11 Automated cherry pick of #69737: Add tolerations for Stackdriver Logging and Metadata Agents.
…69737-upstream-release-1.10 Automated cherry pick of #69737: Add tolerations for Stackdriver Logging and Metadata Agents.
What this PR does / why we need it:
Followup of #68920. This makes sure taints are respected.
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #
Special notes for your reviewer:
Release note: