
kube-proxy: ratelimit runs of iptables by sync-period flags #46266

Merged · 5 commits · May 25, 2017

Conversation

@thockin (Member) commented May 23, 2017

This bounds how frequently iptables can be synced: by default, no more often than once every 10 seconds and no less often than once every minute.

@timothysc FYI

@dcbw @freehan FYI
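
Concretely, the bounds map onto the two kube-proxy flags this PR wires up (the values shown are the defaults described above, and the same values appear in the Salt config change further down in this PR):

    kube-proxy --iptables-sync-period=1m --iptables-min-sync-period=10s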

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 23, 2017
@k8s-github-robot k8s-github-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. release-note-label-needed labels May 23, 2017
@wojtek-t (Member) commented:

@thockin - it seems it already needs rebase :)

@wojtek-t (Member) left a comment:

Mostly minor stuff (except for making this work by default in GCI too).

//
// The function will be run no more often than burstRuns per minInterval, For
// example, this can mitigate the impact of expensive operations being called in
// response to user-initiated operations. Burts runs are "accumulated" over
Member: s/Burts/Bursts/

if minInterval == 0 {
	bfr.limiter = nullLimiter{}
} else {
	// minInterval is a duration, typically in seconds but could be fractional
Member:

I'm not sure I understand this comment. I mean I'm not sure how it relates to the line below (which I agree is correct).

	bfr.lastRun = bfr.timer.Now()
	bfr.fn()
	bfr.timer.Stop()
	bfr.timer.Reset(bfr.maxInterval)
Member:

I think it would be good to add a comment about the semantics here - they aren't obvious:

  • "lastRun" is the time when the last function call started.
  • "timer" starts ticking when the function actually ended. So if no one is calling Run() (or calls it so rarely that it doesn't really matter), but the function call takes significant time compared to "maxInterval", then we are not really calling it every "maxInterval".

To clarify - if calling the function takes 5s, and maxInterval is also 5s, we will be calling the function every 10s.

I guess I would do the same (I mean I wouldn't change the logic here), but I think it needs clarification in the comment for NewBoundedFrequencyRunner.

Member:

Also note that this semantic is different for user-initiated calls, where "minInterval" is measured between two consecutive starts of the function call.

Member Author:

That's an excellent point - this is busted. It really should be consistent. My hunch is that it should take the time after the run, so there's always some rest period after a run. This means the name "bounded frequency" is bad. Should it be "ThrottledRunner" or something? That doesn't quite convey that it also has a periodic run...

thoughts?

Member Author:

A possible rename is the only thing left, I think.

Member:

Now that I think about it, I think the semantic you are suggesting is the right one (I initially thought it should be measured between starts, but that has a bunch of drawbacks).

Regarding naming - I don't have an opinion. I was always extremely bad at naming :)
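
To make the agreed semantics concrete, here is a minimal sketch of the timer handling being discussed. It is illustrative only - not the actual k8s.io util/async code - and it omits the minInterval ratelimiter; the runner, loop, and tryRun names are hypothetical:

package async

import (
	"sync"
	"time"
)

type runner struct {
	mu          sync.Mutex
	fn          func()        // the expensive operation, e.g. an iptables sync
	maxInterval time.Duration // fn runs at least this often
	lastRun     time.Time
	timer       *time.Timer
	run         chan struct{} // user-initiated run requests
}

func (r *runner) loop(stop <-chan struct{}) {
	r.timer = time.NewTimer(r.maxInterval)
	for {
		select {
		case <-stop:
			r.timer.Stop()
			return
		case <-r.timer.C: // periodic fallback run
			r.tryRun()
		case <-r.run: // user-initiated run
			r.tryRun()
		}
	}
}

func (r *runner) tryRun() {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.lastRun = time.Now()
	r.fn()
	// Reset only after fn returns: the next periodic run fires
	// maxInterval after *completion*, so a slow fn always gets a
	// full rest period - if fn takes 5s and maxInterval is 5s,
	// periodic runs happen every ~10s, as noted above.
	r.timer.Stop()
	r.timer.Reset(r.maxInterval)
}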

# test_args should always go last to overwrite prior configuration
{% set params = log_level + " " + feature_gates + " " + test_args -%}
{% set params = log_level + " " + throttles + " " + feature_gates + " " + test_args -%}
}

func (proxier *Proxier) OnServiceAdd(service *api.Service) {
	namespacedName := types.NamespacedName{Namespace: service.Namespace, Name: service.Name}
	if proxier.serviceChanges.update(&namespacedName, nil, service) {
		// TODO(wojtek-t): If the initial sync of informer either for endpoints or
		// services is not finished, it doesn't make sense to call syncProxyRules
		// services is not finished, it doesn't make sense to sync right away
Member:

I guess those changes are the ones that conflict with my PR that was just merged.

@@ -29,8 +29,10 @@
{% set feature_gates = "--feature-gates=" + grains.feature_gates -%}
{% endif -%}

{% set throttles = "--iptables-sync-period=1m --iptables-min-sync-period=10s" -%}
Member:

Can we start with a smaller min-sync-period now? E.g. 5s? After thinking about it, I think this change may surprise people with small clusters where everything worked really fast.

If we decide to increase it later, it would be a trivial change to make.

Member Author:

In general the bursts should allow changes to happen very quickly most of the time, with throttling kicking in only in larger clusters. At least, that's what I expect...

Member:

The scenario I was afraid of is with bursts=2: if we create a replication controller with 3 replicas, and the endpoint controller is fast enough to add pods to the Endpoints object one by one, the first two will be applied quickly, but for the last one we will have to wait 10s more.

I'm not sure how big a problem that is, though, so maybe it's fine to leave for now. We can tune it later.

Member Author:

We can make this even more complex and add a short holdoff (e.g. 1 sec). I'd rather do that as a followup, if needed. Or, simpler, make bursts higher. Here's my thinking - if the iptables update is fast, a short burst isn't a big deal. If the update is slow, there will be a lot of time between runs anyway in which to accumulate new pods. So maybe setting burst to 4 or so is better - we could make that a flag as a follow-up...

Member:

I agree that bumping bursts sounds like a good option. I'm obviously fine with doing it in a followup - we will see what the impact of this change will be.
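
As a reference for the burst behavior being discussed, here is a minimal token-bucket sketch using the github.com/juju/ratelimit package already vendored in this repo. The bursts=2 / 10s values are taken from the scenario above; the snippet is illustrative, not the PR's actual code:

package main

import (
	"fmt"
	"time"

	"github.com/juju/ratelimit"
)

func main() {
	// One token refills every 10s (minInterval); capacity 2 (bursts).
	bucket := ratelimit.NewBucket(10*time.Second, 2)

	for i := 1; i <= 3; i++ {
		if bucket.TakeAvailable(1) == 1 {
			fmt.Printf("sync %d: allowed immediately\n", i)
		} else {
			fmt.Printf("sync %d: throttled until a token refills\n", i)
		}
	}
	// Syncs 1 and 2 consume the accumulated burst; sync 3 must wait
	// for a refill - the "last replica waits 10s" case above.
}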

@timothysc (Member) commented:

What's the current state of IPVS load-balancing, and doesn't that obviate this to some extent?

@thockin (Member Author) commented May 23, 2017:

@timothysc AFAIK IPVS is not done, much less ready to be the default.

@k8s-github-robot k8s-github-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 24, 2017
@thockin thockin force-pushed the proxy-periodic-runner-2 branch 3 times, most recently from 65c3806 to 5b3c9bd Compare May 24, 2017 05:33
@thockin (Member Author) commented May 24, 2017:

@k8s-bot pull-kubernetes-unit test this

@thockin (Member Author) commented May 24, 2017:

@caesarxuchao I have no idea what staging godeps wants. Help?

@wojtek-t (Member) commented:

Just two minor comments (not very actionable, though).

@thockin - feel free to self-lgtm after fixing the godeps issue.

@caesarxuchao (Member) commented:

Other staging repos also import juju/ratelimit:

staging/src/k8s.io/apiserver/Godeps/Godeps.json:482:                    "ImportPath": "github.com/juju/ratelimit",
staging/src/k8s.io/client-go/Godeps/Godeps.json:162:                    "ImportPath": "github.com/juju/ratelimit",
staging/src/k8s.io/kube-aggregator/Godeps/Godeps.json:238:                      "ImportPath": "github.com/juju/ratelimit",
staging/src/k8s.io/kube-apiextensions-server/Godeps/Godeps.json:230:                    "ImportPath": "github.com/juju/ratelimit",
staging/src/k8s.io/sample-apiserver/Godeps/Godeps.json:230:                     "ImportPath": "github.com/juju/ratelimit",

Manually updating them should make the verify script happy.

I'll improve the script after the 1.7 rush.

// completion to next start), except that up to bursts may be allowed. Burst
// runs are "accumulated" over time, one per minInterval up to burstRuns total.
// This can be used, for example, to mitigate the impact of expensive operations
// being called in response to user-initiated operations. Run requests s that
Contributor: s/requests\ss/request

@thockin (Member Author) commented May 25, 2017:

Manually fixed staging godeps. @caesarxuchao, that script is my least favorite thing in the whole project right now. It also decided it needed to delete staging/src/k8s.io/client-go/LICENSE - no idea why. And I had to make a clean GOPATH or it just failed in inscrutable ways.

@k8s-ci-robot (Contributor) commented May 25, 2017:

@thockin: The following test(s) failed:

Test name: pull-kubernetes-federation-e2e-gce
Commit: 2856fde
Rerun command: @k8s-bot pull-kubernetes-federation-e2e-gce test this

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@thockin thockin added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 25, 2017
@k8s-github-robot k8s-github-robot added the do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. label May 25, 2017
@thockin thockin added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge DEPRECATED. Indicates that a PR should not merge. Label can only be manually applied/removed. release-note-label-needed labels May 25, 2017
@thockin thockin changed the title kube-proxy: throttle runs of iptables kube-proxy: ratelimit runs of iptables by sync-period flags May 25, 2017
@thockin (Member Author) commented May 25, 2017:

self-lgtm

@wojtek-t (Member) commented:

/lgtm

@thockin - thanks a lot!

@k8s-github-robot commented:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: thockin, wojtek-t

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot commented:

Automatic merge from submit-queue (batch tested with PRs 44774, 46266, 46248, 46403, 46430)
