Commit 5bec0a3

jfernandez authored and fuweid committed
sys: fix pidfd leak in UnshareAfterEnterUserns
UnshareAfterEnterUserns() creates a pidfd via os.StartProcess() with CLONE_PIDFD but fails to close the file descriptor in any code path, resulting in a file descriptor leak for every container that uses user namespace isolation.

The leak occurs because:
- The pidfd is created when the PidFD field is set in SysProcAttr
- The original defer block only calls PidfdSendSignal() and pidfdWaitid()
- No code path calls unix.Close(pidfd) to release the file descriptor

This causes one pidfd leak per container launch when user namespace isolation is enabled (e.g., Kubernetes pods with hostUsers: false). In production environments with high container churn, this can exhaust the system's file descriptor limit.

Fix the leak by adding a defer statement immediately after process creation that ensures unix.Close(pidfd) is always called, regardless of which code path is taken. This guarantees cleanup even if the function returns early due to errors or lack of pidfd support.

This follows the same cleanup pattern already established in core/mount/mount_idmapped_utils_linux.go:getUsernsFD(), which properly closes its pidfd.

Closes: #12166

Signed-off-by: Jose Fernandez <[email protected]>
[Move SupportsPidFD up to handle dupfd in Go 1.23.{0,1} and simplify backport]
Signed-off-by: Wei Fu <[email protected]>
1 parent e240415 commit 5bec0a3

1 file changed

Lines changed: 8 additions & 2 deletions

pkg/sys/unshare_linux.go

@@ -35,6 +35,10 @@ func UnshareAfterEnterUserns(uidMap, gidMap string, unshareFlags uintptr, f func
 		return fmt.Errorf("unshare flags should not include user namespace")
 	}

+	if !SupportsPidFD() {
+		return fmt.Errorf("kernel doesn't support pidfd")
+	}
+
 	uidMaps, err := parseIDMapping(uidMap)
 	if err != nil {
 		return err
@@ -65,7 +69,7 @@ func UnshareAfterEnterUserns(uidMap, gidMap string, unshareFlags uintptr, f func
 		return fmt.Errorf("failed to start noop process for unshare: %w", err)
 	}

-	if pidfd == -1 || !SupportsPidFD() {
+	if pidfd == -1 {
 		proc.Kill()
 		proc.Wait()
 		return fmt.Errorf("kernel doesn't support CLONE_PIDFD")
@@ -81,11 +85,13 @@ func UnshareAfterEnterUserns(uidMap, gidMap string, unshareFlags uintptr, f func
 		if dupErr != nil {
 			proc.Kill()
 			proc.Wait()
-			return fmt.Errorf("failed to dupfd: %w", err)
+			return fmt.Errorf("failed to dupfd: %w", dupErr)
 		}
 		pidfd = dupPidfd
 	}

+	defer unix.Close(pidfd)
+
 	defer func() {
 		derr := unix.PidfdSendSignal(pidfd, unix.SIGKILL, nil, 0)
 		if derr != nil {
