https://kubernetes.io/docs/tasks/administer-cluster/extended-resource-node/
https://kubernetes.io/docs/reference/kubectl/cheatsheet/
https://access.redhat.com/solutions/3986301
This may also be helpful if the object is part of an array:
https://stackoverflow.com/questions/64355902/is-there-a-way-in-kubectl-patch-to-delete-a-specific-object-in-an-array-withou
Find device paths
[dave@lenovo ~]$ kubectl get nodes -o json | jq -c 'paths|join(".")' | grep nvidia
"items.4.status.capacity.nvidia.com/GA102GL_A40"
"items.4.status.capacity.nvidia.com/GP104_GEFORCE_GTX_1070_TI"
"items.4.status.capacity.nvidia.com/GP104_GEFORCE_GTX_1080"
"items.4.status.capacity.nvidia.com/GP104_GeForce_GTX_1070_Ti"
"items.4.status.capacity.nvidia.com/GP104_GeForce_GTX_1080"
"items.4.status.capacity.nvidia.com/gpu"
Start a proxy to the cluster
[dave@lenovo ~]$ oc proxy
Starting to serve on 127.0.0.1:8001
Get the nodes in the cluster
[dave@lenovo ~]$ oc get nodes
NAME                   STATUS                     ROLES           AGE   VERSION
r730ocp3.localdomain   Ready                      master,worker   63d   v1.24.6+5157800
r730ocp4.localdomain   Ready                      master,worker   63d   v1.24.6+5157800
r730ocp5.localdomain   Ready                      master,worker   63d   v1.24.6+5157800
trt2ocp1.localdomain   Ready,SchedulingDisabled   worker          63d   v1.24.6+5157800
trt2ocp2.localdomain   Ready                      worker          63d   v1.24.6+5157800
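The path listing above identifies a node only by its list index (items.4). As a rough sketch, the same information can be printed by node name. The function name list_gpu_capacity is made up for this example, and it assumes jq is installed and that the resources of interest all start with nvidia.com/.

```shell
# Hypothetical helper: reads `kubectl get nodes -o json` on stdin and prints
# one tab-separated line per nvidia.com/* capacity entry: node, key, value.
list_gpu_capacity() {
  jq -r '.items[]
         | .metadata.name as $node
         | .status.capacity
         | to_entries[]
         | select(.key | startswith("nvidia.com/"))
         | "\($node)\t\(.key)\t\(.value)"'
}

# Usage (needs cluster access):
#   kubectl get nodes -o json | list_gpu_capacity
```

This only reads status.capacity; the same filter could be pointed at status.allocatable if that is what you want to inspect.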
Describe the node you would like to remove a device from
Note: in this case the device to remove is GP104_GEFORCE_GTX_1080, which has the wrong letter case and reports a capacity of 0. The output below is shortened.
[dave@lenovo ~]$ oc describe node trt2ocp1.localdomain
Name:               trt2ocp1.localdomain
Roles:              worker
Capacity:
  cpu:                                    48
  devices.kubevirt.io/kvm:                1k
  devices.kubevirt.io/sev:                1k
  devices.kubevirt.io/tun:                1k
  devices.kubevirt.io/vhost-net:          1k
  ephemeral-storage:                      243127276Ki
  hugepages-1Gi:                          0
  hugepages-2Mi:                          0
  memory:                                 197792288Ki
  nvidia.com/GP104_GEFORCE_GTX_1080:      0
  nvidia.com/GP104_GeForce_GTX_1070_Ti:   3
  nvidia.com/GP104_GeForce_GTX_1080:      6
  pods:                                   250
Allocatable:
  cpu:                                    47500m
  devices.kubevirt.io/kvm:                1k
  devices.kubevirt.io/sev:                0
  devices.kubevirt.io/tun:                1k
  devices.kubevirt.io/vhost-net:          1k
  ephemeral-storage:                      224066097191
  hugepages-1Gi:                          0
  hugepages-2Mi:                          0
  memory:                                 196641312Ki
  nvidia.com/GP104_GEFORCE_GTX_1080:      0
  nvidia.com/GP104_GeForce_GTX_1070_Ti:   3
  nvidia.com/GP104_GeForce_GTX_1080:      6
  pods:                                   250
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                                Requests     Limits
  --------                                --------     ------
  cpu                                     774m (1%)    0 (0%)
  memory                                  1988Mi (1%)  0 (0%)
  ephemeral-storage                       0 (0%)       0 (0%)
  hugepages-1Gi                           0 (0%)       0 (0%)
  hugepages-2Mi                           0 (0%)       0 (0%)
  devices.kubevirt.io/kvm                 0            0
  devices.kubevirt.io/sev                 0            0
  devices.kubevirt.io/tun                 0            0
  devices.kubevirt.io/vhost-net           0            0
  nvidia.com/GP104_GEFORCE_GTX_1080       0            0
  nvidia.com/GP104_GeForce_GTX_1070_Ti    0            0
  nvidia.com/GP104_GeForce_GTX_1080       0            0
Then issue the REST command
Note the ~1 in place of the / inside the resource name. The path in a JSON Patch is a JSON Pointer (RFC 6901), which uses / to separate path segments, so a / that is part of a key must be escaped as ~1 (and a literal ~ as ~0).
[dave@lenovo ~]$ curl --header "Content-Type: application/json-patch+json" --request PATCH --data '[{"op": "remove", "path": "/status/capacity/nvidia.com~1GP104_GEFORCE_GTX_1080"}]' http://localhost:8001/api/v1/nodes/trt2ocp1.localdomain/status
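The escaping used in the patch path can be done mechanically. The following is a minimal sketch, assuming a POSIX shell with sed; the function name json_pointer_escape is made up for this example and is not part of oc or kubectl.

```shell
# Escape a key so it can be used as one JSON Pointer segment (RFC 6901).
# "~" is replaced by "~0" first, then "/" is replaced by "~1".
json_pointer_escape() {
  printf '%s' "$1" | sed -e 's/~/~0/g' -e 's#/#~1#g'
}

json_pointer_escape 'nvidia.com/GP104_GEFORCE_GTX_1080'; echo
# prints: nvidia.com~1GP104_GEFORCE_GTX_1080
```

The order of the two substitutions matters: escaping / first would produce ~1, and the second pass would then turn that ~ into ~0.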
Check to make sure the device is removed
[dave@lenovo ~]$ oc describe node trt2ocp1.localdomain
Name:               trt2ocp1.localdomain
Roles:              worker
Capacity:
  cpu:                                    48
  devices.kubevirt.io/kvm:                1k
  devices.kubevirt.io/sev:                1k
  devices.kubevirt.io/tun:                1k
  devices.kubevirt.io/vhost-net:          1k
  ephemeral-storage:                      243127276Ki
  hugepages-1Gi:                          0
  hugepages-2Mi:                          0
  memory:                                 197792288Ki
  nvidia.com/GP104_GeForce_GTX_1070_Ti:   3
  nvidia.com/GP104_GeForce_GTX_1080:      6
  pods:                                   250
Allocatable:
  cpu:                                    47500m
  devices.kubevirt.io/kvm:                1k
  devices.kubevirt.io/sev:                0
  devices.kubevirt.io/tun:                1k
  devices.kubevirt.io/vhost-net:          1k
  ephemeral-storage:                      224066097191
  hugepages-1Gi:                          0
  hugepages-2Mi:                          0
  memory:                                 196641312Ki
  nvidia.com/GP104_GeForce_GTX_1070_Ti:   3
  nvidia.com/GP104_GeForce_GTX_1080:      6
  pods:                                   250
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                                Requests     Limits
  --------                                --------     ------
  cpu                                     774m (1%)    0 (0%)
  memory                                  1988Mi (1%)  0 (0%)
  ephemeral-storage                       0 (0%)       0 (0%)
  hugepages-1Gi                           0 (0%)       0 (0%)
  hugepages-2Mi                           0 (0%)       0 (0%)
  devices.kubevirt.io/kvm                 0            0
  devices.kubevirt.io/sev                 0            0
  devices.kubevirt.io/tun                 0            0
  devices.kubevirt.io/vhost-net           0            0
  nvidia.com/GP104_GeForce_GTX_1070_Ti    0            0
  nvidia.com/GP104_GeForce_GTX_1080       0            0
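Because the removed entry differs from a remaining one only by letter case, checking the output by eye is error-prone. A small sketch of a scripted check follows; the function name resource_absent is made up for this example. It does a case-sensitive substring match on whatever is piped in.

```shell
# Reads `oc describe node` output on stdin. Succeeds only if the given
# resource name (matched literally and case-sensitively) does not appear.
resource_absent() {
  ! grep -qF -- "$1"
}

# Usage (needs cluster access):
#   oc describe node trt2ocp1.localdomain \
#     | resource_absent 'nvidia.com/GP104_GEFORCE_GTX_1080' && echo removed
```

Since this is a plain substring match, a short name such as nvidia.com/gpu would also match any longer resource name that begins with it.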
The same patch can be repeated for other nodes and devices, for example on trt2ocp2.localdomain
curl --header "Content-Type: application/json-patch+json" --request PATCH --data '[{"op": "remove", "path": "/status/capacity/nvidia.com~1GP104_GEFORCE_GTX_1080"}]' http://localhost:8001/api/v1/nodes/trt2ocp2.localdomain/status
curl --header "Content-Type: application/json-patch+json" --request PATCH --data '[{"op": "remove", "path": "/status/capacity/nvidia.com~1GP104_GEFORCE_GTX_1070_TI"}]' http://localhost:8001/api/v1/nodes/trt2ocp2.localdomain/status
curl --header "Content-Type: application/json-patch+json" --request PATCH --data '[{"op": "remove", "path": "/status/capacity/nvidia.com~1gpu"}]' http://localhost:8001/api/v1/nodes/trt2ocp2.localdomain/status
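When several entries need removing from one node, the repeated commands can be generated in a loop. The sketch below only prints the curl commands so they can be reviewed first. The function name print_remove_command and the NODE and API variables are made up for this example, and it assumes oc proxy is still serving on localhost:8001.

```shell
# Sketch only: prints the curl commands instead of running them.
# Keys are written already escaped ("~1" for "/"), as in the commands above.
NODE=trt2ocp2.localdomain
API=http://localhost:8001   # assumption: `oc proxy` on its default port

print_remove_command() {
  # $1 = node name, $2 = escaped capacity key
  printf "curl --header 'Content-Type: application/json-patch+json' --request PATCH --data '[{\"op\": \"remove\", \"path\": \"/status/capacity/%s\"}]' %s/api/v1/nodes/%s/status\n" \
    "$2" "$API" "$1"
}

for key in 'nvidia.com~1GP104_GEFORCE_GTX_1080' \
           'nvidia.com~1GP104_GEFORCE_GTX_1070_TI' \
           'nvidia.com~1gpu'; do
  print_remove_command "$NODE" "$key"
done
```

Once the printed commands look right, the output of the script can be piped to sh to run them.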