[PodLevelResources] Pod Level Resources Feature Alpha #128407

Merged
merged 15 commits into from
Nov 8, 2024

Conversation

ndixita
Contributor

@ndixita ndixita commented Oct 29, 2024

What type of PR is this?

What this PR does / why we need it:

This PR implements Pod Level Resources, which requires the following changes:

  1. API changes to support Resources in PodSpec
  2. Kubectl changes to display Resources at pod-level
  3. Defaulting logic when pod-level requests are missing but limits are set.
  4. Validation logic for pod-level resources
  5. Scheduler changes to consume pod-level requests
  6. QoS class determination changes
  7. cgroup changes to use pod-level resources
  8. OOM score adjustment calculation changes to account for pod-level requests.
  9. e2e tests
  10. Limit Range changes for type Pod to check against pod-level resources, if set.
  11. Resource Quota changes to check against pod-level resources, if set.

Which issue(s) this PR fixes:

Fixes #
xref: kubernetes/enhancements#2837

Special notes for your reviewer:

API changes: 62c5355

Scheduler changes: cb4f2ea

Kubelet changes: inside pkg/kubelet

Does this PR introduce a user-facing change?

- Changed the Pod API to support `resources` at `spec` level for pod-level resources.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- KEP: https://github.com/kubernetes/enhancements/blob/master/keps/sig-node/2837-pod-level-resource-spec/README.md 

@k8s-ci-robot
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress, size/XXL, do-not-merge/release-note-label-needed, cncf-cla: yes, do-not-merge/needs-kind, do-not-merge/needs-sig, needs-triage, needs-priority, area/code-generation, area/kubectl, kind/api-change, sig/api-machinery, sig/apps, sig/cli, sig/node, sig/scheduling labels and removed the do-not-merge/needs-kind, do-not-merge/needs-sig labels Oct 29, 2024
@kannon92
Contributor

Hey @ndixita, which PR are you working on? I see #128406 and this one.

@ndixita
Contributor Author

ndixita commented Oct 30, 2024

/test all

@ndixita
Contributor Author

ndixita commented Oct 30, 2024

Addressing the default test failures

@ndixita ndixita closed this Oct 30, 2024
@ndixita ndixita deleted the pod-level-resources branch October 30, 2024 22:26
@ndixita
Contributor Author

ndixita commented Nov 8, 2024

/test pull-kubernetes-node-kubelet-serial-podresources

1. Add Resources struct to PodSpec struct in both external and internal API packages
2. Adding feature gate and logic for dropping disabled fields for Pod Level Resources
KEP: enhancements/keps/sig-node/2837-pod-level-resource-spec
@yujuhong
Contributor

yujuhong commented Nov 8, 2024

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 8, 2024
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: c2e9c9f2c0ca6fa70f05d4791e9aa2ef752a3bf1

1. Add support for pod level resources in kubectl
2. Reuse the existing method to describe container resources and generalize it to describe both pod and container level resources
1. If pod-level limit is set, pod-level request is unset and container-level request is set: derive pod-level request from container-level requests
2. If pod-level limit is set, pod-level request is unset and container-level request is unset: set pod-level request equal to pod-level limit
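The two defaulting rules above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `ResourceList`, `Requirements`, and `defaultPodRequests` are hypothetical simplified names, quantities are plain integers, and "derive from container-level requests" is assumed here to mean a plain sum over containers (the real logic may treat init containers differently).

```go
package main

import "fmt"

// ResourceList is a simplified, hypothetical stand-in for the API's resource
// list: resource name -> quantity in integer units (milliCPU, bytes).
type ResourceList map[string]int64

// Requirements holds requests and limits for a pod or a container.
type Requirements struct {
	Requests ResourceList
	Limits   ResourceList
}

// defaultPodRequests fills in a pod-level request for every resource that has
// a pod-level limit but no pod-level request.
func defaultPodRequests(pod *Requirements, containers []Requirements) {
	if pod.Requests == nil {
		pod.Requests = ResourceList{}
	}
	for name, limit := range pod.Limits {
		if _, set := pod.Requests[name]; set {
			continue // an explicit pod-level request is left untouched
		}
		var sum int64
		anySet := false
		for _, c := range containers {
			if v, ok := c.Requests[name]; ok {
				sum += v
				anySet = true
			}
		}
		if anySet {
			// Rule 1: derive from container requests (assumed: plain sum).
			pod.Requests[name] = sum
		} else {
			// Rule 2: no container request at all, so request = pod-level limit.
			pod.Requests[name] = limit
		}
	}
}

func main() {
	pod := Requirements{Limits: ResourceList{"cpu": 1000, "memory": 512}}
	containers := []Requirements{
		{Requests: ResourceList{"cpu": 200}},
		{Requests: ResourceList{"cpu": 300}},
	}
	defaultPodRequests(&pod, containers)
	fmt.Println(pod.Requests["cpu"], pod.Requests["memory"]) // 500 512
}
```

In this example CPU follows rule 1 (200 + 300) and memory follows rule 2 (no container sets a memory request, so the pod-level limit is used).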
1. The effective container requests cannot be greater than pod-level requests
2. Individual container limits cannot be greater than pod-level limits
3. Only CPU and memory are supported at pod-level
4. In-place container resource updates are not supported if pod-level resources are set
Note: the rule that effective container requests cannot be greater than pod-level limits holds by transitivity: effective container requests <= pod-level requests and pod-level requests <= pod-level limits, therefore effective container requests <= pod-level limits.

Signed-off-by: ndixita <[email protected]>
1. Use pod-level resource when feature is enabled and resources are set at pod-level
2. Edge case handling: When a pod defines only CPU or memory limits at pod-level (but not both), and container-level requests/limits are unset, the pod-level requests stay empty for the resource without a pod-limit. The container's request for that resource is then set to the default request value from schedutil.
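The scheduler behaviour described above, including the edge case, might be sketched as follows. Everything here is an assumption made for illustration: `effectiveRequest` and `defaultRequest` are hypothetical names, and the default values are placeholders rather than values quoted from schedutil.

```go
package main

import "fmt"

// ResourceList is a simplified, hypothetical resource map:
// resource name -> quantity in integer units.
type ResourceList map[string]int64

// defaultRequest holds placeholder per-container defaults used when a
// container sets no request. These values are assumptions for illustration.
var defaultRequest = ResourceList{
	"cpu":    100,               // milliCPU
	"memory": 200 * 1024 * 1024, // bytes
}

// effectiveRequest returns the request the scheduler would consider for one
// resource. The pod-level request wins when the feature is enabled and the
// value is set; otherwise the container requests are aggregated, with unset
// container requests falling back to the default.
func effectiveRequest(enabled bool, pod ResourceList, containers []ResourceList, name string) int64 {
	if enabled {
		if v, ok := pod[name]; ok {
			return v
		}
	}
	var total int64
	for _, c := range containers {
		if v, ok := c[name]; ok {
			total += v
		} else {
			total += defaultRequest[name]
		}
	}
	return total
}

func main() {
	// Only a CPU request exists at pod level; the single container sets nothing.
	pod := ResourceList{"cpu": 500}
	containers := []ResourceList{{}}
	fmt.Println(effectiveRequest(true, pod, containers, "cpu"))    // 500
	fmt.Println(effectiveRequest(true, pod, containers, "memory")) // 209715200
}
```

Here CPU comes from the pod level, while memory has no pod-level value and no container request, so the placeholder default is used for the container.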
1. Pod cgroup configured to use resources from pod spec if feature is enabled and resources are set at pod-level
2. Container cgroup limits defaulted to pod-level limits if container limits are not set
Signed-off-by: ndixita <[email protected]>
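The container cgroup defaulting rule from the kubelet change above can be pictured with a small sketch. `containerCgroupLimit` and `ResourceList` are hypothetical names with integer quantities; this is not the kubelet's actual code.

```go
package main

import "fmt"

// ResourceList is a simplified, hypothetical resource map:
// resource name -> quantity in integer units.
type ResourceList map[string]int64

// containerCgroupLimit picks the limit to apply to a container's cgroup:
// the container's own limit if set, otherwise the pod-level limit.
// The boolean is false when neither is set, i.e. no limit applies.
func containerCgroupLimit(container, pod ResourceList, name string) (int64, bool) {
	if v, ok := container[name]; ok {
		return v, true
	}
	if v, ok := pod[name]; ok {
		return v, true
	}
	return 0, false
}

func main() {
	podLimits := ResourceList{"cpu": 1000, "memory": 1 << 30}
	containerLimits := ResourceList{"cpu": 250} // no memory limit set

	cpu, _ := containerCgroupLimit(containerLimits, podLimits, "cpu")
	mem, _ := containerCgroupLimit(containerLimits, podLimits, "memory")
	fmt.Println(cpu, mem) // 250 1073741824
}
```

The container keeps its own CPU limit, while its memory limit falls back to the pod-level value.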
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 8, 2024
@pacoxu
Member

pacoxu commented Nov 8, 2024

/lgtm
after a typo fix

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 8, 2024
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: fc17af58370a99e0823cef19139b071bb4be00e7

@ndixita
Contributor Author

ndixita commented Nov 8, 2024

/test pull-kubernetes-node-kubelet-serial-podresources

@k8s-ci-robot
Contributor

@ndixita: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| pull-kubernetes-e2e-kind-alpha-beta-features | e11ac5d | link | false | `/test pull-kubernetes-e2e-kind-alpha-beta-features` |
| pull-kubernetes-node-e2e-alpha-ec2 | e11ac5d | link | false | `/test pull-kubernetes-node-e2e-alpha-ec2` |
| pull-kubernetes-e2e-gce-cos-alpha-features | b30e6c8 | link | false | `/test pull-kubernetes-e2e-gce-cos-alpha-features` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@ndixita
Contributor Author

ndixita commented Nov 8, 2024

/test pull-kubernetes-e2e-gce-cos-alpha-features

@ndixita
Contributor Author

ndixita commented Nov 8, 2024

The tests timing out in pull-kubernetes-e2e-gce-cos-alpha-features are not related to this PR. They are related to the InPlaceVerticalScaling feature.


@k8s-ci-robot k8s-ci-robot merged commit c25f5ee into kubernetes:master Nov 8, 2024
14 of 18 checks passed
Labels
api-review, approved, area/code-generation, area/e2e-test-framework, area/kubeadm, area/kubectl, area/kubelet, area/release-eng, area/test, cncf-cla: yes, kind/api-change, lgtm, needs-priority, release-note, sig/api-machinery, sig/apps, sig/cli, sig/cluster-lifecycle, sig/node, sig/release, sig/scheduling, sig/testing, size/XXL, triage/accepted
Projects
Status: API review completed, 1.32
Status: Done