Closed
Labels
kind/failing-test (categorizes issue or PR as related to a consistently or frequently failing test) · release-blocker · sig/scalability (categorizes an issue or PR as relevant to SIG Scalability)
Milestone
Description
Kubemark-scale is currently failing constantly. There are several different symptoms:
- metrics exceeding their thresholds
- some nodes becoming unready
- some pods not starting
- some requests getting 429 (Too Many Requests) errors
However, after looking into the logs, it seems all of these share one common underlying root cause:
millions of conflicts on Node objects
Those conflicts result in:
- some nodes being unable to heartbeat -> becoming unready
- more requests in the system overall -> higher latencies
- more requests in the system -> 429 errors
- a more overloaded system -> pod status updates not being delivered on time
We should understand why these conflicts became so frequent and fix the problem.
This looks like a regression to me, so I'm putting it into the 1.7 milestone.
@kubernetes/sig-scalability-bugs