
[Cherry-pick rel-v1.30] Support extended resources and ephemeral-storage for scale-from-zero specified in MachineClass NodeTemplate #336

Merged — 1 commit into gardener:rel-v1.30 on Nov 21, 2024

Conversation

elankath

What this PR does / why we need it:

Currently, in scale-from-zero cases, the autoscaler does not respect ephemeral-storage or extended resources specified in MachineClass.NodeTemplate. Only the standard cpu, gpu, and memory are picked up; custom extended resources are presently ignored entirely.
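The gist of the change can be sketched as follows. This is an illustrative sketch only, using plain string maps instead of the real resource.Quantity types, and `splitCapacity` is a hypothetical helper, not the actual mcm_manager.go code:

```go
package main

import "fmt"

// knownResources lists the resource names the autoscaler already handled in
// scale-from-zero before this change.
var knownResources = map[string]bool{"cpu": true, "gpu": true, "memory": true}

// splitCapacity separates the standard resources from ephemeral-storage and
// extended resources, which this PR starts copying into the template node's
// Status.Capacity as well. Quantities are plain strings for illustration.
func splitCapacity(capacity map[string]string) (standard, extended map[string]string) {
	standard = map[string]string{}
	extended = map[string]string{}
	for name, qty := range capacity {
		if knownResources[name] {
			standard[name] = qty
		} else {
			extended[name] = qty
		}
	}
	return standard, extended
}

func main() {
	capacity := map[string]string{
		"cpu":                 "8",
		"memory":              "7Gi",
		"ephemeral-storage":   "50Gi",
		"resource.com/dongle": "4",
	}
	standard, extended := splitCapacity(capacity)
	fmt.Println(len(standard), len(extended)) // 2 2
}
```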

Which issue(s) this PR fixes:
Fixes #132

Special notes for your reviewer:

Release note:

Support extended resources and ephemeral-storage for scale-from-zero specified in MachineClass NodeTemplate

…specified in MachineClass NodeTemplate (gardener#334)

* support extended resources and ephemeral-storage in scale-from-zero

* corrected resource issues

* adjusted unit test TestBuildNodeFromTemplate for changes
@gardener-robot gardener-robot added needs/review Needs review size/m Size of pull request is medium (see gardener-robot robot/bots/size.py) labels Nov 20, 2024
@gardener-robot-ci-3 gardener-robot-ci-3 added reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Nov 20, 2024
@elankath
Author

Integration tests successful: ca-it.log

make test-integration 2>&1 | tee /tmp/ca-it.log
....
Ran 13 of 13 Specs in 1334.708 seconds
SUCCESS! -- 13 Passed | 0 Failed | 0 Pending | 0 Skipped
PASS

@elankath elankath self-assigned this Nov 21, 2024

elankath commented Nov 21, 2024

Test Case

Shoot Setup

Shoot with 3 worker pools; only the 2nd worker pool, whose minimum pool size is 0, has the special dongle resource, as shown in the shoot YAML snippet below

        providerConfig:
          apiVersion: aws.provider.extensions.gardener.cloud/v1alpha1
          kind: WorkerConfig
          nodeTemplate:
            capacity:
              cpu: 8
              ephemeral-storage: 50Gi
              gpu: 0
              hana.hc.sap.com/hcu/cpu: 20
              hana.hc.sap.com/hcu/memory: 10
              memory: 7Gi
              resource.com/dongle: 4

Pod with Extended Resource

As can be seen, due to Kubernetes restrictions, the extended resource is declared in both the requests and limits sections: extended resources cannot be overcommitted, so any request must equal the corresponding limit.

apiVersion: v1
kind: Pod
metadata:
  name: exrestest1
spec:
  containers:
    - name: example-container
      image: busybox
      command: ["sh", "-c", "echo Using extended resources && sleep 3600"]
      resources:
        limits:
          resource.com/dongle: "2"
        requests:
          resource.com/dongle: "2"

CA Scale Up

As can be seen below, the b node group, which declares the extended resource, was selected for scale-up (Final scale-up plan: [{shoot--i034796--aw-b-z1 0->1 (max: 3)}]); the other node groups were rejected.

CA Scale Up Log

I1121 11:49:33.105893   58938 mcm_manager.go:791] Generating node template only using nodeTemplate from MachineClass shoot--i034796--aw-b-z1-f8c99: template resources-> cpu: 8,memory: 7Gi
I1121 11:49:33.105972   58938 mcm_manager.go:981] Copying extended resources map[hana.hc.sap.com/hcu/cpu:{{20 0} {<nil>} 20 DecimalSI} hana.hc.sap.com/hcu/memory:{{10 0} {<nil>} 10 DecimalSI} resource.com/dongle:{{4 0} {<nil>} 4 DecimalSI}] to template node.Status.Capacity
I1121 11:49:33.108659   58938 orchestrator.go:566] Pod default/exrestest1 can't be scheduled on shoot--i034796--aw-c-z1, predicate checking error: Insufficient resource.com/dongle; predicateName=NodeResourcesFit; reasons: Insufficient resource.com/dongle; debugInfo=
I1121 11:49:33.108794   58938 orchestrator.go:566] Pod default/exrestest1 can't be scheduled on shoot--i034796--aw-a-z1, predicate checking error: Insufficient resource.com/dongle; predicateName=NodeResourcesFit; reasons: Insufficient resource.com/dongle; debugInfo=
I1121 11:49:33.108967   58938 orchestrator.go:153] No pod can fit to shoot--i034796--aw-c-z1
I1121 11:49:33.108980   58938 orchestrator.go:153] No pod can fit to shoot--i034796--aw-a-z1
I1121 11:49:33.109082   58938 compare_nodegroups.go:164] nodes template-node-for-shoot--i034796--aw-b-z1-6493561263961564664 and template-node-for-shoot--i034796--aw-c-z1-397814316375267006 are not similar, ephemeral-storage does not match
I1121 11:49:33.109113   58938 compare_nodegroups.go:164] nodes template-node-for-shoot--i034796--aw-b-z1-6493561263961564664 and template-node-for-shoot--i034796--aw-a-z1-5270490573304173618 are not similar, cpu does not match
I1121 11:49:33.109747   58938 waste.go:55] Expanding Node Group shoot--i034796--aw-b-z1 would waste 100.00% CPU, 100.00% Memory, 100.00% Blended
I1121 11:49:33.109783   58938 orchestrator.go:184] Best option to resize: shoot--i034796--aw-b-z1
I1121 11:49:33.109796   58938 orchestrator.go:188] Estimated 1 nodes needed in shoot--i034796--aw-b-z1
I1121 11:49:33.109832   58938 compare_nodegroups.go:164] nodes template-node-for-shoot--i034796--aw-b-z1-6493561263961564664 and template-node-for-shoot--i034796--aw-a-z1-5270490573304173618 are not similar, cpu does not match
I1121 11:49:33.109892   58938 compare_nodegroups.go:150] nodes template-node-for-shoot--i034796--aw-b-z1-6493561263961564664 and template-node-for-shoot--i034796--aw-c-z1-397814316375267006 are not similar, missing capacity hana.hc.sap.com/hcu/memory
I1121 11:49:33.109911   58938 orchestrator.go:215] No similar node groups found
I1121 11:49:33.109931   58938 orchestrator.go:257] Final scale-up plan: [{shoot--i034796--aw-b-z1 0->1 (max: 3)}]
I1121 11:49:33.110018   58938 executor.go:147] Scale-up: setting group shoot--i034796--aw-b-z1 size to 1
I1121 11:49:33.110128   58938 mcm_cloud_provider.go:351] Received request to increase size of machine deployment shoot--i034796--aw-b-z1 by 1
I1121 11:49:33.110321   58938 event_sink_logging_wrapper.go:48] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"kube-system", Name:"cluster-autoscaler-status", UID:"2b3bbc7e-927e-48fb-b3cd-1522eb3148e5", APIVersion:"v1", ResourceVersion:"7204541", FieldPath:""}): type: 'Normal' reason: 'ScaledUpGroup' Scale-up: setting group shoot--i034796--aw-b-z1 size to 1 instead of 0 (max: 3)

Pod Assignment

As can be seen, the Pod declaring the extended resource is still unassigned after scale-up. This is because a controller or policy is expected to explicitly add the extended resource to Node.Status.Capacity, as described in the k8s documentation: Advertise Extended Resources for a Node

k get po
NAME         READY   STATUS    RESTARTS   AGE
exrestest1   0/1     Pending   0          10m

Currently, stakeholders are doing this with Kyverno policies that update the scaled Node objects with extended resources, but this is inconvenient and somewhat error-prone due to possible divergence between the values in the CA node group and the Kyverno policy. On request, we created an additional feature enhancement ticket on the gardener MCM at gardener/machine-controller-manager#955, which will be delivered in the future.

Update Scaled Node with Extended Resources

NOTE: This step is normally performed by policy; it is replicated manually here for testing.

Proxy the API server

kubectl proxy

Patch extended resource

 curl --header "Content-Type: application/json-patch+json" --request PATCH --data '[{"op": "add", "path": "/status/capacity/resource.com~1dongle", "value": "4"}]' http://localhost:8001/api/v1/nodes/ip-10-180-31-22.eu-west-1.compute.internal/status
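The odd-looking `resource.com~1dongle` segment in the patch path comes from JSON Pointer escaping (RFC 6901), which the JSON Patch format uses: `~` becomes `~0` and `/` becomes `~1` inside a path token. A minimal sketch (`escapePointerToken` is a hypothetical helper for illustration):

```go
package main

import (
	"fmt"
	"strings"
)

// escapePointerToken applies RFC 6901 JSON Pointer escaping: "~" must be
// escaped first (to "~0"), then "/" becomes "~1".
func escapePointerToken(token string) string {
	token = strings.ReplaceAll(token, "~", "~0")
	return strings.ReplaceAll(token, "/", "~1")
}

func main() {
	// Reproduces the path segment used in the curl patch above.
	fmt.Println("/status/capacity/" + escapePointerToken("resource.com/dongle"))
	// /status/capacity/resource.com~1dongle
}
```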

Pod is now Scheduled.

49s         Normal    Scheduled                           pod/exrestest1                                    Successfully assigned default/exrestest1 to ip-10-180-31-22.eu-west-1.compute.internal

NAME         READY   STATUS    RESTARTS   AGE
exrestest1   1/1     Running   0          22m

@rishabh-11 rishabh-11 changed the title Support extended resources and ephemeral-storage for scale-from-zero specified in MachineClass NodeTemplate [Cherry-pick rel-v1.30] Support extended resources and ephemeral-storage for scale-from-zero specified in MachineClass NodeTemplate Nov 21, 2024

@rishabh-11 rishabh-11 left a comment


/lgtm

@gardener-robot gardener-robot added reviewed/lgtm Has approval for merging and removed needs/review Needs review labels Nov 21, 2024
@rishabh-11 rishabh-11 merged commit c257e96 into gardener:rel-v1.30 Nov 21, 2024
10 checks passed
@gardener-robot gardener-robot added the status/closed Issue is closed (either delivered or triaged) label Nov 21, 2024
@gardener-robot-ci-2 gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Nov 21, 2024