Skip to content

Conversation

@lbernail
Copy link
Contributor

@lbernail lbernail commented May 13, 2019

What type of PR is this?

/kind bug

What this PR does / why we need it:
Disable graceful termination for non-TCP flows. The rational behind this is that with a high number of UDP connections (DNS or syslog for instance), stale UDP connections (to terminated/terminating backends) will be reused and fail (and if the load is high enough never expire).

This patch disables graceful termination for UDP. It is not perfect, but I believe it is much better. What happens is the following:

  • UDP target pod becomes NotReady
  • IP is removed from RealServers immediately
  • Because of expire_nodest_conn any new packet reusing the port will generate a write error and the connection will be removed from the ipvs connection table (if the application retries it will be routed to an active pod)

Which issue(s) this PR fixes:
Fixes #76664

Special notes for your reviewer:
I think in addition to that we should set much more aggressive UDP timeouts (5s? instead of the default: 300s) and make it configurable with a flag for specific use-case (I haven't seen a use case for long lived udp "connections" with very low amount of traffic).

I'm not a big fan of strings.ToLower(string(v1.ProtocolTCP)). Any thoughts on this?

Does this PR introduce a user-facing change?:

IPVS: Disable graceful termination for UDP traffic to solve issues with high number of UDP connections (DNS / syslog in particular)

/assign @m1093782566
/sig network
/area IPVS

@k8s-ci-robot k8s-ci-robot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label May 13, 2019
@k8s-ci-robot k8s-ci-robot added kind/bug Categorizes issue or PR as related to a bug. sig/network Categorizes an issue or PR as relevant to SIG Network. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. area/ipvs needs-priority Indicates a PR lacks a `priority/foo` label and requires one. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels May 13, 2019
@k8s-ci-robot
Copy link
Contributor

Hi @lbernail. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 13, 2019
@lbernail
Copy link
Contributor Author

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 15, 2019
@m1093782566
Copy link
Contributor

LGTM overall.

cc @andrewsykim for review.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re: strings.ToLower(string(v1.ProtocolTCP)), v1.ProtocolTCP is mainly a type used by the Service API and is only applicable when comparing values in that resource. For comparing protocols from VirtualServer, we can probably use a separate constant for this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally think just rsToDelete.VirtualServer.Protocol == "tcp" is also fine, having a constant may not be necessary either :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I completely agree
I just changed it to "tcp"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this also apply for SCTP?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with SCTP so I chose to enable graceful termination for TCP only

I had a quick look at the protocol and it seems to include a handshake and a shutdown which should make connection tracking similar to TCP (so when a backend stops, the equivalent of a FIN is sent and the connection should be removed from the IPVS conntrack).

If you have a simple test scenario in mind I can look into it (the change is very simple we can simply do rsToDelete.VirtualServer.Protocol != "udp" instead)

Copy link
Member

@andrewsykim andrewsykim May 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not super familiar with it either, was more asking in case you knew :P. I think handling this on case-by-case basis makes sense. We can revisit if it's a problem for SCTP. Going to give another pass at this PR some time tomorrow to make sure we're not missing any edge cases :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I put more thought into this, and I think to be safe, we should only follow this scenarios for protocols we know for sure run into issue (UDP). So rsToDelete.VirtualServer.Protocol != "udp" like you mentioned. Otherwise this could have unintended outcomes for sctp that we aren't aware of.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i.e. don't change the behavior for sctp unless we have a good reason.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do that. Not sure how to test it but I'm pretty sure SCTP will have real connection tracking

@bowei
Copy link
Member

bowei commented May 18, 2019

Is there a way to add a test?

@andrewsykim
Copy link
Member

andrewsykim commented May 19, 2019

@bowei added some initial unit tests for graceful termination here #78088 to keep the scope of this PR smaller :)

@jsravn
Copy link
Contributor

jsravn commented May 22, 2019

Are you sure about expire_nodest_conn? I thought because we set conn_reuse_mode=0, that flag has no effect - it is effectively 0.

@lbernail
Copy link
Contributor Author

Are you sure about expire_nodest_conn? I thought because we set conn_reuse_mode=0, that flag has no effect - it is effectively 0.

Yes, I thought that too, but it turns out the doc is either wrong or incomplete. I tested it and it works

@lbernail
Copy link
Contributor Author

lbernail commented May 28, 2019

/test pull-kubernetes-kubemark-e2e-gce-big

@m1093782566
Copy link
Contributor

/lgtm

Please squash the commits @lbernail

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 28, 2019
@lbernail lbernail force-pushed the lbernail/no-graceful-udp branch from 4ed9a69 to 9ff0685 Compare May 28, 2019 11:52
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 28, 2019
@lbernail
Copy link
Contributor Author

lbernail commented May 28, 2019

Please squash the commits @lbernail

Sure, just did it

@m1093782566
Copy link
Contributor

/lgtm

/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 28, 2019
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: lbernail, m1093782566

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 28, 2019
@k8s-ci-robot k8s-ci-robot merged commit 944a7e2 into kubernetes:master May 29, 2019
k8s-ci-robot added a commit that referenced this pull request May 31, 2019
…2-upstream-release-1.12

Automated cherry pick of #77802 upstream release 1.12
k8s-ci-robot added a commit that referenced this pull request Jun 1, 2019
…2-upstream-release-1.14

Automated cherry pick of #77802 upstream release 1.14
k8s-ci-robot added a commit that referenced this pull request Jun 4, 2019
…2-upstream-release-1.13

Automated cherry pick of #77802 upstream release 1.13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/ipvs cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/network Categorizes an issue or PR as relevant to SIG Network. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ipvs UDP to garbage collect realServers

6 participants