gpusInUse info error when kubelet restarts #46087
Conversation
Hi @tianshapjq. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with `@k8s-bot ok to test`. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
@vishh could you verify this? thx :)
@@ -101,8 +102,7 @@ func (ngm *nvidiaGPUManager) Start() error {
 	if err := ngm.discoverGPUs(); err != nil {
 		return err
 	}
-	// It's possible that the runtime isn't available now.
-	ngm.allocated = ngm.gpusInUse()
+
I'm -1 on this. If containers recover and continue using GPUs, then allocatable will be wrong for the node.
But as the PR describes, the number of activePods always remains 0 inside the Start() function, and once ngm.allocated has been initialized with that empty set it never gets a chance to pick up the right data.
If we don't remove this line, we would have to call Start() only after the pods have been recovered, so that activePods is no longer empty when gpusInUse() runs.
I tried to figure out the right place to call Start(), but pod synchronization is done concurrently by the statusManager, which means there is no explicit checkpoint that tells us when the pods have been recovered. So if we cannot get accurate active-pod information in gpusInUse(), the initialization of ngm.allocated will never be correct when the kubelet restarts.
Ok. I suppose things should suss themselves out in the unit & e2e tests. I think this should be helped a bit once checkpoints are added with #43240.
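For readers following this thread, below is a minimal, self-contained sketch of the lazy-synchronization idea being discussed. The types and names are simplified stand-ins, not the actual nvidia_gpu_manager.go code; the point is only that the in-use set is built on the first allocation request instead of in Start(), by which time the recovered pods are visible to gpusInUse().

```go
package main

import (
	"fmt"
	"sync"
)

// gpuManager is a simplified stand-in for the kubelet's nvidiaGPUManager.
type gpuManager struct {
	sync.Mutex
	allGPUs   []string
	allocated map[string]bool // nil until the first allocation request
	// gpusInUse stands in for ngm.gpusInUse(): it inspects active pods,
	// which are only populated some time after the kubelet (re)starts.
	gpusInUse func() map[string]bool
}

// allocate builds the in-use set lazily on the first request, by which time
// the active-pod list reflects any recovered containers. Doing this in
// Start() instead would capture an empty set after a kubelet restart.
func (m *gpuManager) allocate(n int) ([]string, error) {
	m.Lock()
	defer m.Unlock()
	if m.allocated == nil {
		m.allocated = m.gpusInUse()
	}
	var picked []string
	for _, dev := range m.allGPUs {
		if len(picked) == n {
			break
		}
		if !m.allocated[dev] {
			picked = append(picked, dev)
		}
	}
	if len(picked) < n {
		return nil, fmt.Errorf("requested %d GPUs, only %d free", n, len(picked))
	}
	for _, dev := range picked {
		m.allocated[dev] = true
	}
	return picked, nil
}

func main() {
	m := &gpuManager{
		allGPUs: []string{"/dev/nvidia0", "/dev/nvidia1"},
		// Pretend one recovered pod is already using /dev/nvidia0.
		gpusInUse: func() map[string]bool { return map[string]bool{"/dev/nvidia0": true} },
	}
	fmt.Println(m.allocate(1)) // [/dev/nvidia1] <nil>
}
```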
@@ -239,7 +239,7 @@ func (ngm *nvidiaGPUManager) gpusInUse() *podGPUs {
 		var containersToInspect []containerIdentifier
 		for _, container := range pod.Status.ContainerStatuses {
 			if containers.Has(container.Name) {
-				containersToInspect = append(containersToInspect, containerIdentifier{container.ContainerID, container.Name})
+				containersToInspect = append(containersToInspect, containerIdentifier{strings.Replace(container.ContainerID, "docker://", "", 1), container.Name})
I'm very surprised that this would not happen by default. @yujuhong Is this a subtle bug from dockertools -> dockershim?
I wonder the same thing.
No, the code has always been like that.
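As background for this thread: the ContainerID recorded in a pod's status carries a runtime prefix (for Docker, `docker://<id>`), while the Docker client expects the bare ID, which is why the diff above strips the prefix. A small self-contained illustration follows; the generic helper is my own generalization of the hard-coded strings.Replace in the PR, not code from the repository.

```go
package main

import (
	"fmt"
	"strings"
)

// stripRuntimePrefix removes a "<runtime>://" prefix from a container ID as
// it appears in ContainerStatus.ContainerID (e.g. "docker://4f2c1e9a"),
// returning the bare ID the runtime client expects. It generalizes the
// strings.Replace(id, "docker://", "", 1) used in the change above.
func stripRuntimePrefix(id string) string {
	if i := strings.Index(id, "://"); i >= 0 {
		return id[i+len("://"):]
	}
	return id
}

func main() {
	fmt.Println(stripRuntimePrefix("docker://4f2c1e9a")) // 4f2c1e9a
	fmt.Println(stripRuntimePrefix("4f2c1e9a"))          // already bare: unchanged
}
```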
@tianshapjq Thanks for the PR. Can you extend
@vishh ok, I will do it~
Your idea of synchronizing allocated GPUs lazily SGTM. Kubelet startup has to be reorganized, but that's beyond the scope of a bug fix.
@k8s-bot ok to test
@vishh is there anything wrong with my commit?
I think we need some e2e to prove out this logic @tianshapjq
@cmluciano I am working on it.
cc @mindprince
@cmluciano I am considering a new e2e test for this scenario: 1. the node should have at least 2 GPU cards; 2. create a pod requesting one GPU card; 3. restart the kubelet; 4. create another pod requesting one GPU card; 5. the two pods should not be assigned the same GPU card. What do you think?
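For illustration, here is a rough Ginkgo-style sketch of that scenario, in the spirit of the existing GPU e2e tests. It is not code from this PR or from #46644: `restartKubelet` and `podGPUDevices` are assumed helpers that would still have to be written, the import paths follow the 1.7 tree, and the pod spec is just a plausible shape using the alpha GPU resource name of the time.

```go
package e2e

import (
	. "github.com/onsi/ginkgo"
	. "github.com/onsi/gomega"

	"k8s.io/apimachinery/pkg/api/resource"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/kubernetes/pkg/api/v1"
	"k8s.io/kubernetes/test/e2e/framework"
)

// oneGPUPod returns a pod that requests a single GPU and just sleeps.
func oneGPUPod(name string) *v1.Pod {
	return &v1.Pod{
		ObjectMeta: metav1.ObjectMeta{Name: name},
		Spec: v1.PodSpec{
			Containers: []v1.Container{{
				Name:    name,
				Image:   "nvidia/cuda", // any image that can sleep would do
				Command: []string{"sleep", "3600"},
				Resources: v1.ResourceRequirements{
					Limits: v1.ResourceList{
						// alpha GPU resource name used in this era
						"alpha.kubernetes.io/nvidia-gpu": resource.MustParse("1"),
					},
				},
			}},
		},
	}
}

var _ = framework.KubeDescribe("GPUs across kubelet restart [Feature:GPU]", func() {
	f := framework.NewDefaultFramework("gpu-restart")

	It("should not assign the same GPU card to two pods", func() {
		By("creating a pod that requests one GPU")
		pod1 := f.PodClient().CreateSync(oneGPUPod("gpu-pod-1"))

		By("restarting the kubelet")
		restartKubelet() // assumed helper, does not exist in the suite

		By("creating a second pod that requests one GPU")
		pod2 := f.PodClient().CreateSync(oneGPUPod("gpu-pod-2"))

		By("checking that the two pods were given different GPU devices")
		Expect(podGPUDevices(pod1)).NotTo(Equal(podGPUDevices(pod2))) // assumed helper
	})
})
```

The open question, raised later in this thread, is how something like podGPUDevices could actually observe which device a container was assigned; nothing in the pod API exposes that directly.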
@tianshapjq I took a stab at this in #46644
@cmluciano @vishh I am working on the e2e test in gpus.go. Is there any way to get the devices assigned to a container in a pod? I want to check whether two pods are assigned the same GPU card when the kubelet restarts, but there doesn't seem to be any attribute that exposes this info.
@tianshapjq I'm not sure if this exists or not. We only have available GPUs in the current suite.
Looks like this is a bugfix (intended to go in for code freeze). If there was an issue filed for the bug, @tianshapjq please reference it in the PR body. Otherwise, @vishh the change looks small enough that you can approve this PR going in with no issue via `/approve no-issue`.
@grodrigues3 there is no issue for this PR, I just created it directly. Do I need to create one?
/approve no-issue
I'm waiting on @mindprince to validate this PR with #46644
/retest
/lgtm
@mindprince: changing LGTM is restricted to assignees, and only kubernetes org members may be assigned issues. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: mindprince, tianshapjq, vishh. Associated issue requirement bypassed by: vishh. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing `/approve` in a comment.
@tianshapjq: The following test failed, say `/retest` to rerun it:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
/retest
Automatic merge from submit-queue (batch tested with PRs 45877, 46846, 46630, 46087, 47003)
What this PR does / why we need it:
In my testing, I found two errors in nvidia_gpu_manager.go.
Special notes for your reviewer:
Release note: