kube-proxy nftables tests are flaky #128829

@aojea

Which jobs are flaking?

https://testgrid.k8s.io/sig-network-kind#sig-network-kind,%20nftables,%20master
https://testgrid.k8s.io/sig-network-kind#sig-network-kind,%20nftables,%20IPv6,%20master

Which tests are flaking?

The flakes seem to affect tests at random rather than any specific test.

Since when has it been flaking?

15-11-2024

Testgrid link

No response

Reason for failure (if possible)

The kube-proxy log at https://storage.googleapis.com/kubernetes-ci-logs/logs/ci-kubernetes-kind-network-nftables/1858001043166597120/artifacts/kind-worker/pods/kube-system_kube-proxy-tbpmz_fdcd393e-47df-4afe-a88e-27eaa918f570/kube-proxy/0.log shows kube-proxy failing to delete stale chains with "Device or resource busy", so there seems to be some contention on the system:

2024-11-17T04:51:06.117022547Z stderr F E1117 04:51:06.116486       1 proxier.go:1210] "Unable to delete stale chains; will retry later" err=<
2024-11-17T04:51:06.117050539Z stderr F 	/dev/stdin:2:28-104: Error: Could not process rule: Device or resource busy
2024-11-17T04:51:06.117056524Z stderr F 	delete chain ip kube-proxy endpoint-LR2XJHKW-services-4884/service-proxy-toggled/tcp/__10.244.1.180/9376
2024-11-17T04:51:06.117062922Z stderr F 	                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-11-17T04:51:06.117070369Z stderr F 	/dev/stdin:51:28-109: Error: Could not process rule: Device or resource busy
2024-11-17T04:51:06.117075351Z stderr F 	delete chain ip kube-proxy endpoint-23XNSDHO-nettest-2983/session-affinity-service/udp/udp__10.244.1.119/8081
2024-11-17T04:51:06.117079767Z stderr F 	                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-11-17T04:51:06.117084441Z stderr F 	/dev/stdin:57:28-108: Error: Could not process rule: Device or resource busy
2024-11-17T04:51:06.117088698Z stderr F 	delete chain ip kube-proxy endpoint-CLRXJGC4-services-8162/affinity-clusterip-timeout/tcp/__10.244.1.97/9376
2024-11-17T04:51:06.117092772Z stderr F 	                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-11-17T04:51:06.117096629Z stderr F 	/dev/stdin:58:28-104: Error: Could not process rule: Device or resource busy
2024-11-17T04:51:06.117100766Z stderr F 	delete chain ip kube-proxy endpoint-O34ANCIM-services-4884/service-proxy-toggled/tcp/__10.244.1.181/9376
2024-11-17T04:51:06.117122875Z stderr F 

The same error seems to be present in multiple jobs, for example https://storage.googleapis.com/kubernetes-ci-logs/logs/ci-kubernetes-kind-network-nftables/1857638650775343104/artifacts/kind-worker/pods/kube-system_kube-proxy-mcfxw_41e1cb46-c2e5-440c-8174-246253f0def2/kube-proxy/0.log . In some of them, a later reconciliation most probably solves the problem:

2024-11-16T04:31:53.06971039Z stderr F I1116 04:31:53.068065       1 proxier.go:1174] "Syncing nftables rules" ipFamily="IPv4"
2024-11-16T04:31:53.069715193Z stderr F I1116 04:31:53.068092       1 proxier.go:1204] "Deleting stale nftables chains" ipFamily="IPv4" numChains=4
2024-11-16T04:31:53.126729378Z stderr F E1116 04:31:53.125738       1 proxier.go:1210] "Unable to delete stale chains; will retry later" err=<
2024-11-16T04:31:53.12702478Z stderr F 	/dev/stdin:2:28-100: Error: Could not process rule: Device or resource busy
2024-11-16T04:31:53.12703517Z stderr F 	delete chain ip kube-proxy endpoint-YHMEV5IX-services-5210/externalip-test/tcp/http__10.244.1.6/9376
2024-11-16T04:31:53.127040928Z stderr F 	                           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-11-16T04:31:53.127046499Z stderr F  > ipFamily="IPv4"
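
To check whether the same chains keep failing across sync attempts, or whether a later reconciliation clears them, the failures can be tallied per chain. The following is a minimal sketch in Python, assuming the log layout shown in the excerpts above (klog entries whose nft error output follows on continuation lines, with only the rejected `delete chain` commands echoed back). The function name `busy_chain_failures` and the command-line usage are illustrative and not part of kube-proxy.

```python
import re
import sys
from collections import Counter

# A new klog entry starts with a severity letter, the date, and a time,
# e.g. "E1117 04:51:06.116486". Continuation lines have no such header.
KLOG_HEADER = re.compile(r"\b[IWEF]\d{4} \d{2}:\d{2}:\d{2}\.\d+")
# The nft command echoed back in the error: family, table, chain name.
DELETE_CHAIN = re.compile(r"delete chain ip6? \S+ (\S+)")
SYNC_ERROR = "Unable to delete stale chains"


def busy_chain_failures(log_text):
    """Count, per chain, the number of failed sync attempts that list it."""
    failures = Counter()
    in_error_entry = False
    seen = set()  # chains already counted for the current attempt
    for line in log_text.splitlines():
        if KLOG_HEADER.search(line):
            # Every klog header closes the previous entry; only the
            # stale-chain error opens one we care about.
            in_error_entry = SYNC_ERROR in line
            seen = set()
            continue
        if not in_error_entry:
            continue
        match = DELETE_CHAIN.search(line)
        if match and match.group(1) not in seen:
            seen.add(match.group(1))
            failures[match.group(1)] += 1
    return failures


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (path is whatever local copy of the log you downloaded):
    #   python busy_chains.py 0.log
    with open(sys.argv[1], encoding="utf-8") as f:
        for chain, count in busy_chain_failures(f.read()).most_common():
            print(f"{count}\t{chain}")
```

A chain with a count of 1 was rejected once and presumably removed by a later sync, whereas a chain whose count keeps growing would suggest that retrying alone does not resolve the contention.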

Anything else we need to know?

No response

Relevant SIG(s)

/sig network

Metadata

Labels

kind/flake: Categorizes issue or PR as related to a flaky test.
needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
sig/network: Categorizes an issue or PR as relevant to SIG Network.
