Current configuration of normal jobs makes some undesired tests running #31666

@carlory

What happened:

I raised a PR to add quota support for PVCs with a VolumeAttributesClass. It's an Alpha feature, and I added some e2e tests.

The following jobs are always triggered, and they failed.

  • pull-kubernetes-e2e-gce
  • pull-kubernetes-e2e-kind
  • pull-kubernetes-e2e-kind-ipv6

The JUnit report shows that the test failed because of a timeout.

Kubernetes e2e suite: [It] [sig-api-machinery] ResourceQuota [FeatureGate:VolumeAttributesClass] [Alpha] should create a ResourceQuota and capture the life of a persistent volume claim with a volume attributes class 
{ failed [FAILED] timed out waiting for the condition
In [It] at: k8s.io/kubernetes/test/e2e/apimachinery/resource_quota.go:1256 @ 01/18/24 08:19:06.682
}

The root cause, however, is that my e2e test requires the VolumeAttributesClass feature gate, which is not enabled in the normal jobs.

I also found that we may be misusing the Feature and FeatureGate labels in e2e tests. KEP-3041 tracks this issue, but sadly it has been stuck for a while.

This PR follows the KEP's guidance and adds a FeatureGate label to the e2e test, but the normal jobs still run the test and then fail.

What you expected to happen:

Adding e2e tests for a feature is a hard requirement when promoting it to the Beta stage. For Alpha features, it is not a hard requirement.

There are at least three jobs for Alpha and Beta tests in test-infra.

  • pull-kubernetes-e2e-kind-alpha-features
  • pull-kubernetes-e2e-kind-beta-features
  • pull-kubernetes-e2e-kind-alpha-beta-features

These jobs are not in the presubmit list and require a manual trigger.

For a beta feature we have the luxury of detecting problems later, so these jobs won't block PRs from merging.

I talked with @aojea, @pohly and @pacoxu in the sig-testing Slack channel, and we roughly agreed on the following points.

  1. Change the jobs' configuration:

    • alpha: skip Feature:.*, focus: .
    • beta: skip Feature:.*|Alpha, focus: .
    • "normal": skip Feature:.*|Alpha|Beta, focus: .
  2. The pull-kubernetes-.*-beta jobs may be expected to always run, and they may become mandatory rather than optional. It's a good point and we can iterate on it.

  3. More work is needed to reconcile SIGs, KEPs, and PRR. @aojea expects the testing strategy to be defined in the KEP, and the SIG reviewers to make sure those tests are added, running, and maintained.
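
To illustrate point 1, here is a minimal sketch of how the proposed skip patterns would treat the failing test from this issue. It is not the real job configuration: the `skipByJob` map and the `wouldRun` helper are hypothetical names, and the sketch assumes the skip pattern is matched as an unanchored regular expression against the full test name, with focus `.` matching any non-empty name.

```go
package main

import (
	"fmt"
	"regexp"
)

// skipByJob holds the skip patterns proposed in point 1.
// The job kinds used as keys are illustrative, not real job names.
var skipByJob = map[string]*regexp.Regexp{
	"alpha":  regexp.MustCompile(`Feature:.*`),
	"beta":   regexp.MustCompile(`Feature:.*|Alpha`),
	"normal": regexp.MustCompile(`Feature:.*|Alpha|Beta`),
}

// wouldRun reports whether a job of the given kind would run the test,
// assuming focus "." (any non-empty name) and the job's skip pattern.
func wouldRun(job, testName string) bool {
	skip, ok := skipByJob[job]
	if !ok {
		return false // unknown job kind
	}
	return testName != "" && !skip.MatchString(testName)
}

func main() {
	// Test name taken from the JUnit report above (shortened).
	name := "[sig-api-machinery] ResourceQuota [FeatureGate:VolumeAttributesClass] [Alpha] should create a ResourceQuota"
	for _, job := range []string{"alpha", "beta", "normal"} {
		fmt.Printf("%-6s runs test: %v\n", job, wouldRun(job, name))
	}
}
```

Under these assumptions only the alpha job runs the test. Note that `Feature:.*` does not match `FeatureGate:`, because the colon must follow `Feature` directly, so it is the `[Alpha]` tag that keeps the test out of the beta and normal jobs.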

After these changes, if a feature is in the Alpha stage and some e2e tests are added for it, the normal jobs won't run those tests, so PRs can be merged without being blocked once the manually triggered alpha/beta jobs succeed.

If points 2 and 3 were implemented, it would make Kubernetes more stable and reliable.

How to reproduce it (as minimally and precisely as possible):

Write an e2e test for an Alpha feature and add a FeatureGate label, then raise a PR and wait for the normal jobs to run.

Code snippet:

import "k8s.io/kubernetes/pkg/features"
var _ = SIGDescribe("Something", framework.WithFeatureGate(features.Something), func() {})

Please provide links to example occurrences, if any:

Anything else we need to know?:

  • @pacoxu raised a PR to skip some tests for the kind alpha/beta features CI
    |\[Feature:.+\] ... In-tree.Volumes.\[Driver:.nfs\]|PersistentVolumes.NFS|Network.should.set.TCP.CLOSE_WAIT.timeout ...
    


Labels: kind/bug, sig/testing
