
Delete all dead containers and sandboxes when under disk pressure. #45896

Merged: 1 commit into kubernetes:master on Jun 4, 2017

Conversation

@dashpole (Contributor) commented May 16, 2017:

This PR modifies the eviction manager to add dead container and sandbox garbage collection as a resource reclaim function for disk (see the sketch below). It also modifies the container GC logic so that containers belonging to pods that are terminated, but not yet deleted, can be removed.

It still does not delete containers younger than minGcAge. This should prevent nodes from entering a permanently bad state if the entire disk is occupied by pods that are terminated (in the Failed or Succeeded phase) but not deleted.
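For orientation, here is a minimal, self-contained sketch of the reclaim-function pattern this description refers to. The type, constant, and method names (`resourceName`, `DeleteAllUnusedContainers`, `deleteTerminatedContainers`) are stand-ins modeled on the snippet reviewed later in this thread, not the PR's exact code:

```go
// Stand-in types approximating the kubelet's v1.ResourceName,
// nodeReclaimFunc(s), and container GC; names are illustrative.
package main

import "fmt"

type resourceName string

const (
	resourceNodeFs       resourceName = "nodefs"
	resourceNodeFsInodes resourceName = "nodefsInodes"
)

type nodeReclaimFunc func() error
type nodeReclaimFuncs []nodeReclaimFunc

// containerGC abstracts "delete all dead containers and sandboxes".
type containerGC interface {
	DeleteAllUnusedContainers() error
}

// deleteTerminatedContainers adapts container GC into a reclaim function
// that the eviction manager can try before evicting pods.
func deleteTerminatedContainers(gc containerGC) nodeReclaimFunc {
	return func() error { return gc.DeleteAllUnusedContainers() }
}

type noopGC struct{}

func (noopGC) DeleteAllUnusedContainers() error { return nil }

func main() {
	reclaim := map[resourceName]nodeReclaimFuncs{
		resourceNodeFs:       {deleteTerminatedContainers(noopGC{})},
		resourceNodeFsInodes: {deleteTerminatedContainers(noopGC{})},
	}
	for r, fns := range reclaim {
		for _, fn := range fns {
			fmt.Printf("reclaiming %s: err=%v\n", r, fn())
		}
	}
}
```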

There are two improvements we should consider making in the future:

  • Track the disk space and inodes reclaimed by deleting containers. We currently do not track this, which prevents us from determining whether deleting containers resolved disk pressure. As a result, we may still evict a pod even when deleting dead containers has freed enough disk space. (A measurement sketch follows this list.)
  • Once we can track the disk space and inodes reclaimed, we should consider deleting only the containers we need to in order to relieve disk pressure. This would help avoid a scenario where we try to delete a massive number of containers all at once and overwhelm the runtime.
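A hypothetical sketch of the first improvement: measure free space before and after running a reclaim function (Linux statfs), so the eviction manager could tell whether GC alone relieved disk pressure. None of this is in the PR; it only illustrates the idea.

```go
// Package reclaimsketch is illustrative only (Linux-specific statfs).
package reclaimsketch

import "syscall"

// freeBytes returns the free bytes on the filesystem containing path.
func freeBytes(path string) (uint64, error) {
	var st syscall.Statfs_t
	if err := syscall.Statfs(path, &st); err != nil {
		return 0, err
	}
	return st.Bavail * uint64(st.Bsize), nil
}

// reclaimAndMeasure runs reclaim and reports how many bytes it freed.
func reclaimAndMeasure(path string, reclaim func() error) (uint64, error) {
	before, err := freeBytes(path)
	if err != nil {
		return 0, err
	}
	if err := reclaim(); err != nil {
		return 0, err
	}
	after, err := freeBytes(path)
	if err != nil {
		return 0, err
	}
	if after <= before {
		return 0, nil // nothing measurably freed; caller proceeds to evict
	}
	return after - before, nil
}
```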

/assign @vishh
cc @derekwaynecarr

Disk Pressure triggers the deletion of terminated containers on the node.

@k8s-ci-robot added the `cncf-cla: yes` label May 16, 2017
@k8s-github-robot added the `size/L` and `release-note` labels and removed `release-note-label-needed` May 16, 2017
@dashpole force-pushed the disk_pressure_reclaim branch 2 times, most recently from 87a39e8 to 4b170e1, May 16, 2017 20:58
@dashpole (Contributor, Author):

@k8s-bot kops aws e2e test this

@dashpole (Contributor, Author):

@k8s-bot gce etcd3 e2e test this

@dashpole (Contributor, Author):

This passes the inode eviction test:

```go
resourceToReclaimFunc := map[v1.ResourceName]nodeReclaimFuncs{}
// usage of an imagefs is optional
if withImageFs {
	// with an imagefs, nodefs pressure should just delete logs
	resourceToReclaimFunc[resourceNodeFs] = nodeReclaimFuncs{deleteLogs()}
	resourceToReclaimFunc[resourceNodeFsInodes] = nodeReclaimFuncs{deleteLogs()}
	resourceToReclaimFunc[resourceNodeFs] = nodeReclaimFuncs{deleteTerminatedContainers(containerGC)}
```
Review comment from a Member on the line above:

shouldn't this be on the imageFs and not the nodeFs?

Reply from @dashpole (Contributor, Author):

Yep, I'll fix that.
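Applied to the snippet above, the agreed fix would move the container-GC reclaim function onto the imagefs resources, roughly as follows. This is a sketch: `resourceImageFs` and `resourceImageFsInodes` follow the naming of the nodefs constants and the final committed code may differ.

```go
resourceToReclaimFunc := map[v1.ResourceName]nodeReclaimFuncs{}
// usage of an imagefs is optional
if withImageFs {
	// with an imagefs, nodefs pressure should just delete logs
	resourceToReclaimFunc[resourceNodeFs] = nodeReclaimFuncs{deleteLogs()}
	resourceToReclaimFunc[resourceNodeFsInodes] = nodeReclaimFuncs{deleteLogs()}
	// ...while imagefs pressure deletes dead containers and sandboxes
	resourceToReclaimFunc[resourceImageFs] = nodeReclaimFuncs{deleteTerminatedContainers(containerGC)}
	resourceToReclaimFunc[resourceImageFsInodes] = nodeReclaimFuncs{deleteTerminatedContainers(containerGC)}
}
```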

@dashpole (Contributor, Author):

I also changed the ContainerGC to take a SourcesReadyProvider, so that callers of ContainerGC don't need a reference to the kubelet's AllReady function.
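The interface implied by this comment is small; the shape below is inferred from the comment, not copied from the kubelet source:

```go
package sketch

// SourcesReadyProvider reports whether all of the kubelet's pod config
// sources (apiserver, file, http) have been observed at least once.
// GC must not run before then, or it could delete containers belonging
// to pods it simply hasn't learned about yet.
type SourcesReadyProvider interface {
	AllReady() bool
}

// A GC implementation can then hold the provider instead of a reference
// to the kubelet's AllReady function (illustrative).
type gc struct {
	sourcesReady SourcesReadyProvider
}

func (g *gc) GarbageCollect() error {
	if !g.sourcesReady.AllReady() {
		return nil // too early to tell which containers are orphaned
	}
	// ...delete dead containers and sandboxes...
	return nil
}
```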

@derekwaynecarr (Member):

fyi @sjenning

@sjenning (Contributor):

What is the rationale for not deleting pod containers on pod termination? Getting the logs from them? Debugging?

@derekwaynecarr (Member):

@sjenning - yes

@dashpole (Contributor, Author):

@derekwaynecarr or @vishh, if either of you has the time, this would be good to include in 1.7.

@vishh (Contributor) commented Jun 1, 2017:

> Track the disk space and inodes reclaimed by deleting containers. We currently do not track this, and it prevents us from determining if deleting containers resolves disk pressure. So we may still evict a pod even if we are able to free disk space by deleting dead containers.

Can we get a count of the number of containers deleted? If that's 0, then we can safely assume no disk space has been reclaimed and proceed to evict pods.

> Once we can track disk space and inodes reclaimed, we should consider only deleting the containers we need to in order to relieve disk pressure. This should help avoid a scenario where we try to delete a massive number of containers all at once and overwhelm the runtime.

We need to rate-limit all our calls to the runtime in a central location. It doesn't make sense for each module in the kubelet to perform its own rate limiting.
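vishh's count-based proxy could look like this. Note this is a hypothetical helper with an assumed count-returning GC signature; the PR does not implement it.

```go
package sketch

// countingGC is an assumed variant of container GC that reports how many
// containers it deleted (hypothetical; the real interface returns only
// an error).
type countingGC interface {
	DeleteAllUnusedContainers() (deleted int, err error)
}

// reclaimedAnything treats "deleted > 0" as a cheap proxy for "some disk
// was freed": if nothing was deleted, the manager can safely proceed
// straight to pod eviction.
func reclaimedAnything(gc countingGC) (bool, error) {
	deleted, err := gc.DeleteAllUnusedContainers()
	if err != nil {
		return false, err
	}
	return deleted > 0, nil
}
```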

@vishh (Contributor) commented Jun 1, 2017:

With this change, will the kubelet avoid evicting pods if GC has reclaimed the required amount of disk?

@vishh (Contributor) commented Jun 1, 2017:

@dashpole if you can provide an answer to my previous question, this PR is ready to go.

@dashpole (Contributor, Author) commented Jun 1, 2017:

> With this change, will the kubelet avoid evicting pods if GC has reclaimed the required amount of disk?

No. We don't know how much disk we have reclaimed, so we assume we have reclaimed 0. This PR isn't meant to prevent the eviction of pods, but to prevent the node from entering a bad state when it is filled with dead containers. The behavior is essentially the same as before, except that containerGC is triggered.
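In sketch form (continuing the stand-in types `resourceName` and `nodeReclaimFuncs` from the first sketch above), the flow dashpole describes is roughly:

```go
// The manager runs every reclaim function for the starved resource,
// assumes zero bytes were reclaimed (nothing is measured), and lets a
// fresh observation of disk usage decide whether eviction is still
// needed. Illustrative only; not the eviction manager's actual code.
type manager struct {
	reclaimFuncs  map[resourceName]nodeReclaimFuncs
	underPressure func(resourceName) bool // re-observes real disk usage
}

func (m *manager) reclaimNodeLevelResources(r resourceName) bool {
	for _, reclaim := range m.reclaimFuncs[r] {
		_ = reclaim() // best effort; an error doesn't stop the attempt
	}
	// No accounting of freed space, so only re-checking the signal can
	// tell us whether pressure is resolved.
	return !m.underPressure(r)
}
```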

@dashpole (Contributor, Author) commented Jun 1, 2017:

@k8s-bot pull-kubernetes-federation-e2e-gce test this

@vishh (Contributor) commented Jun 1, 2017 via email

@vishh (Contributor) commented Jun 1, 2017 via email

@k8s-ci-robot added the `lgtm` label Jun 1, 2017
@k8s-github-robot added the `approved` label Jun 1, 2017
@falenn commented Jun 1, 2017 via email

@dashpole (Contributor, Author) commented Jun 1, 2017:

@vishh should we add the 1.7 milestone to this?

@vishh added this to the v1.7 milestone Jun 1, 2017
@k8s-github-robot added the `needs-rebase` label Jun 3, 2017
@k8s-github-robot removed the `lgtm` and `needs-rebase` labels Jun 3, 2017
@dashpole (Contributor, Author) commented Jun 3, 2017:

@vishh rebased. Please reapply lgtm

@dashpole (Contributor, Author) commented Jun 4, 2017:

@k8s-bot pull-kubernetes-federation-e2e-gce test this
@k8s-bot pull-kubernetes-e2e-gce-etcd3 test this

@vishh (Contributor) commented Jun 4, 2017:

/lgtm

@k8s-ci-robot added the `lgtm` label Jun 4, 2017
@k8s-github-robot:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dashpole, vishh

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@dashpole (Contributor, Author) commented Jun 4, 2017:

@k8s-bot pull-kubernetes-federation-e2e-gce test this

@k8s-github-robot:

Automatic merge from submit-queue

@k8s-github-robot merged commit 3fdf6c3 into kubernetes:master Jun 4, 2017
@dashpole deleted the disk_pressure_reclaim branch June 4, 2017 19:10
@blakebarnett:

Any chance this can go into 1.6.8? We're seeing it (or something very like it) occasionally on 1.6.6. The nodes never seemed to be under disk pressure, but we're now seeing these messages constantly in the kubelet logs.

@dashpole (Contributor, Author):

@blakebarnett as this is a large change, I would rather not cherry-pick it to 1.6. As a workaround, it should be fairly trivial to run a job that periodically deletes terminated pods (see the sketch below). I have also heard there is a controller called podgc, but I am not familiar enough with it to know whether it would be a good solution to this issue.
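The suggested workaround can be a small in-cluster job. Here is a sketch using client-go; modern API signatures are shown, and the 1.6-era client differed, so treat the paths and signatures as illustrative rather than a drop-in solution:

```go
// Deletes pods whose phase is Succeeded or Failed across all namespaces.
// Run it in-cluster on a schedule (for example, via a CronJob).
package main

import (
	"context"
	"log"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}
	pods, err := client.CoreV1().Pods(metav1.NamespaceAll).List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		log.Fatal(err)
	}
	for _, p := range pods.Items {
		if p.Status.Phase == v1.PodSucceeded || p.Status.Phase == v1.PodFailed {
			if err := client.CoreV1().Pods(p.Namespace).Delete(context.TODO(), p.Name, metav1.DeleteOptions{}); err != nil {
				log.Printf("delete %s/%s: %v", p.Namespace, p.Name, err)
			}
		}
	}
}
```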

@blakebarnett:

Thanks for the quick response. Sounds like we may just go to 1.7 ahead of schedule then. :)

k8s-github-robot pushed a commit that referenced this pull request Sep 29, 2017
Automatic merge from submit-queue (batch tested with PRs 44596, 52708, 53163, 53167, 52692). If you want to cherry-pick this change to another branch, please follow the instructions [here](https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md).

Do not GC exited containers in running pods

This fixes a regression introduced by #45896, and was identified by #52462.
This bug causes the kubelet to garbage collect exited containers in a running pod.
This manifests in strange and confusing state when viewing the cluster.  For example, it can show running pods as having no init container (see #52462), if that container has exited and been removed.

This PR solves this problem by only removing containers and sandboxes from terminated pods.
The important line change is from:

`if cgc.podDeletionProvider.IsPodDeleted(podUID) || evictNonDeletedPods {`

to:

`if cgc.podStateProvider.IsPodDeleted(podUID) || (cgc.podStateProvider.IsPodTerminated(podUID) && evictTerminatedPods) {`

cc @MrHohn @yujuhong @kubernetes/sig-node-bugs 

```release-note
BugFix: Exited containers are not Garbage Collected by the kubelet while the pod is running
```
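The follow-up fix hinges on distinguishing "deleted" pods from "terminated" ones. The provider shape below is inferred from the quoted condition, not copied from the kubelet source:

```go
package sketch

import "k8s.io/apimachinery/pkg/types"

// PodStateProvider distinguishes pods that are gone (deleted) from pods
// that have merely finished running (terminated).
type PodStateProvider interface {
	IsPodDeleted(types.UID) bool
	IsPodTerminated(types.UID) bool
}

// shouldRemoveContainers mirrors the quoted condition: remove a pod's
// exited containers only if the pod is deleted, or if it is terminated
// and the caller explicitly opted in to evicting terminated pods.
func shouldRemoveContainers(sp PodStateProvider, podUID types.UID, evictTerminatedPods bool) bool {
	return sp.IsPodDeleted(podUID) ||
		(sp.IsPodTerminated(podUID) && evictTerminatedPods)
}
```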