@ffromani
Contributor

What type of PR is this?

/kind bug
/kind flake

What this PR does / why we need it:

The metrics tests, the memory manager ones first and foremost, expect to run in a controlled environment.
This is why they are labeled Serial. The key factor here is predictability: the tests expect
to be in full control of the node state; if any other actor mutates the node state, the result is an almost guaranteed
false negative.

To reduce the chance of flakes, we add an extra check before running the actual tests: detect pods pending deletion, and
wait for them to be gone.

Which issue(s) this PR fixes:

Fixes #128869

Special notes for your reviewer:

see slack thread: https://kubernetes.slack.com/archives/C0BP8PW9G/p1732023371237679

Does this PR introduce a user-facing change?

NONE

The metrics tests, the memory manager ones first and foremost,
expect to run in a controlled environment.
This is why they are labeled Serial.
The key factor here is predictability: the tests expect
to be in full control of the node state; if any other actor
mutates the node state, the result is an almost guaranteed
false negative.

To reduce the chance of flakes, we add an extra check before
running the actual tests: detect pods pending deletion, and
wait for them to be gone.

Signed-off-by: Francesco Romani <[email protected]>
@k8s-ci-robot added the release-note-none, size/S, kind/bug, kind/flake, cncf-cla: yes, do-not-merge/needs-sig, and needs-triage labels on Nov 20, 2024
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot added the needs-priority, area/test, sig/node, and sig/testing labels and removed the do-not-merge/needs-sig label on Nov 20, 2024
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ffromani
Once this PR has been reviewed and has the lgtm label, please assign sjenning for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ffromani
Contributor Author

/hold

if we like this direction, I want to update all the other metrics tests

@k8s-ci-robot added the do-not-merge/hold label on Nov 20, 2024
@ffromani
Contributor Author

/close

we don't really like this direction after all

@k8s-ci-robot
Contributor

@ffromani: Closed this PR.


Development

Successfully merging this pull request may close these issues.

Failure cluster [20b4f3ec...] Memory Manager Metrics [Serial] [Feature:MemoryManager] when querying /metrics should report zero pinning counters after a fresh restart
