Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

loadBalancerSourceRanges does not work on Windows #120033

Closed
AbelHu opened this issue Aug 18, 2023 · 22 comments
Closed

loadBalancerSourceRanges does not work on Windows #120033

AbelHu opened this issue Aug 18, 2023 · 22 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/windows Categorizes an issue or PR as relevant to SIG Windows.

Comments

@AbelHu
Copy link

AbelHu commented Aug 18, 2023

What happened?

Pod CIDR should be added to loadBalancerSourceRanges if there are Pods needing to access the service's LoadBalancer IP for clusters with version v1.25 or above. It will block Linux pods to access the load balancer if the pod CIDR is not added but it will not block Windows pods.

What did you expect to happen?

Block Windows pods to access the load balancer if the pod CIDR is not added.

How can we reproduce it (as minimally and precisely as possible)?

  1. Create a cluster with k8s v1.25.6
  2. Create a service with setting loadBalancerSourceRanges which do not include pod cidr
  3. Access the service from a Windows pod
  4. The access can succeed

Anything else we need to know?

From the context in Don't use KUBE-MARK-DROP for LoadBalancerSourceRanges by danwinship · Pull Request #110289 · kubernetes/kubernetes (github.com) and Service.Spec.LoadBalancerSourceRanges intent? · Issue #109575 · kubernetes/kubernetes (github.com), it seems like that it only be implemented on Linux.

Kubernetes version

1.25.6

Cloud provider

AKS

OS version

On Windows:

10.0.17763.4499

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

@AbelHu AbelHu added the kind/bug Categorizes issue or PR as related to a bug. label Aug 18, 2023
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Aug 18, 2023
@k8s-ci-robot
Copy link
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Aug 18, 2023
@AbelHu
Copy link
Author

AbelHu commented Aug 18, 2023

/sig Windows

@k8s-ci-robot k8s-ci-robot added sig/windows Categorizes an issue or PR as relevant to SIG Windows. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 18, 2023
@jsturtevant
Copy link
Contributor

/sig network
/cc @jayunit100 @sbangari @daschott

@k8s-ci-robot k8s-ci-robot added the sig/network Categorizes an issue or PR as relevant to SIG Network. label Aug 18, 2023
@daschott
Copy link
Contributor

Yes, I don't think this field is supported on Windows today. +@princepereira as well for confirmation.

@thockin
Copy link
Member

thockin commented Sep 14, 2023

Looking for an update on this, please.

@jayunit100
Copy link
Member

Pretty sure this isn't supported and also we don't test it at all .... I'll confirm

@jayunit100
Copy link
Member

jayunit100 commented Sep 14, 2023

Yeah, confirmed its not implemented in the windows kernel proxy, at least, not as far as i can tell in the kernelspace proxy ... but maybe its possible in other windows proxies that are able to do things other than HNS.....

In the iptables impl, they use CIDR matching in the rules to do this .

In the iptables proxy: you can generally grok how this all works... (inside of the syncProxyRules ....

  1. First
// make a firewall if there are LB Source Ranges (this cannot be done w/ HNS i dont think)
usesFWChain := ... (code that checks wether theres LBSourceRanges) ... 
  1. Then
...
// read through all the whitelisted LBs and make rules for them so they pass through
for _, src := range svcInfo.LoadBalancerSourceRanges() {
  // 1) allow traffic from IP's matching this CIDR
  natRules.Write(args, "-s", src, "-j", string(externalTrafficChain))
  /// other logic to check whitelist if its from node ip

)

I dont think we CAN do this in the windows, HNS based proxy?

To my knowledge the HNS API Doesnt allow any kind of firewall / packet filtering functionality to do this, so we cant implement it easily i think in the windows kernel proxy. Unless you did some magic thing like looking at the actual packets in userspace or whatever....

@daschott to confirm?

@jayunit100
Copy link
Member

cc @pramitagautam @tzifudzi for op-readiness we should doc this

@daschott
Copy link
Contributor

daschott commented Sep 16, 2023

Hi, the load balancer source ranges is not implemented for Windows... I have not looked at this in detail yet and am not familiar with the feature.

But HNS does have an ACL policy to allow/deny traffic based on CIDR, would that be suitable? This is used for example by npm & Calico...

@jayunit100
Copy link
Member

sgtm, at least for now we can leave this bug open. will discuss in sig-windows this wk.

@jayunit100
Copy link
Member

jayunit100 commented Sep 26, 2023

we'll leave this open. nobody on it yet though. we think we can impl it w/ the HNS ACLs....

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 29, 2024
@thockin
Copy link
Member

thockin commented Jan 31, 2024

@MikeZappa87

@MikeZappa87
Copy link

@jayunit100 did this ever get discussed at sig-windows? This issue might have fallen off the wagon. I didn't join sig-windows, do we triage at that meeting?

@jayunit100
Copy link
Member

jayunit100 commented Jan 31, 2024

  1. On one hand.... there is hope for this bc iirc we dug around and looked at the HNS parts and it does appear you could use windows firewall rules to implement some kind of lb source ranges functionality.

... but maybe

  1. On the other hand we should leave this as a wontfix.... bc lbsourceranges needs to silently just go away?..... I think some folks ( @thockin ?) warned us about 6 years ago tho that lbSourceRanges was approaching firewall territory and should
    Probably be reconsidered.

@thockin
Copy link
Member

thockin commented Jan 31, 2024 via email

@MikeZappa87
Copy link

Is this one of these cases where we have tests for this feature for Linux and they are not run on Windows nodes?

@jayunit100
Copy link
Member

jayunit100 commented Feb 4, 2024

  1. you can't really test LB source ranges easily at all because of the fact that I think we only test them for GCE

https://github.com/kubernetes/kubernetes/blob/master/test/e2e/framework/providers/gce/firewall.go ,

and I assume that test will go away once we impl the cloud provider migration?

Right now to test em I think it's a manual process where u grep your IP and then check if it's blocked properly..... not even sure there's a reliable cross platform test for this at all depending on snat and dnat behavior....?

  1. Previous issues discussions have concluded I believe that this feature it at cloud provider discretion and may not work, where we interpret cloud provider as "the person setting up the cluster".... which means

So I'd propose:

"The administrator may know that their windows nodes don't support lb src ranges.... and in that case... the fact that they are not supported is a normal scenario".

  1. To me that makes a case that this is a documentation issue - and maybe a feature request.... but It could go either way.

  2. Either way if this is cross platform we need a cross platform non cloud provider specific test for it as well since cloud provider tests going away.

@MikeZappa87
Copy link

@AbelHu what should we do with this issue? Not sure if I clear path forward exists.

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 16, 2024
@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

@k8s-ci-robot k8s-ci-robot closed this as not planned Won't fix, can't repro, duplicate, stale Apr 15, 2024
@k8s-ci-robot
Copy link
Contributor

@k8s-triage-robot: Closing this issue, marking it as "Not Planned".

In response to this:

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue with /reopen
  • Mark this issue as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close not-planned

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/network Categorizes an issue or PR as relevant to SIG Network. sig/windows Categorizes an issue or PR as relevant to SIG Windows.
Projects
Status: Done
Development

No branches or pull requests

8 participants