Restricting root with per-process securebits
Linux capabilities have had a long and somewhat tortuous journey as part of the Linux kernel. Slowly—and very carefully—functionality is being added to this security feature to get it to a point where it is a viable alternative to the all-or-nothing setuid(0) model. A recently merged patch adds a per-process securebits feature that will allow capabilities-based daemons or subsystems to coexist with existing setuid utilities.
Linux capabilities break up the privileged tasks normally associated with root (i.e. uid 0) into finer-grained abilities which can be individually granted or revoked for specific processes. The idea is to change the standard Unix model that root has all special privileges while all other users have none. The terminology is always a bit contentious, though, as Linux capabilities are derived from a POSIX proposal that was never adopted, but shares the name "capabilities" with an entirely different approach; this article is only concerned with capabilities of the Linux variety.
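As a rough illustration of what "finer-grained" means in practice, the sketch below reads the calling process's effective capability set from the CapEff line of /proc/self/status and tests individual bits in it. The helper names (ex_effective_caps, ex_has_cap) are invented for this example; the capability numbers are assumed to come from <linux/capability.h>, and /proc is assumed to be mounted.

```c
#include <inttypes.h>
#include <linux/capability.h>   /* CAP_SETPCAP, CAP_NET_BIND_SERVICE, ... */
#include <stdint.h>
#include <stdio.h>

/* Read the effective capability set of the calling process.
 * Returns 0 and stores the bitmask in *caps, or -1 if the CapEff line
 * could not be found or parsed. (Hypothetical helper.) */
int ex_effective_caps(uint64_t *caps)
{
    FILE *fp = fopen("/proc/self/status", "r");
    char line[256];
    int result = -1;

    if (fp == NULL)
        return -1;
    while (fgets(line, sizeof line, fp) != NULL) {
        /* The line looks like "CapEff:<tab><16 hex digits>". */
        if (sscanf(line, "CapEff: %" SCNx64, caps) == 1) {
            result = 0;
            break;
        }
    }
    fclose(fp);
    return result;
}

/* Return 1 if capability number cap is set in the bitmask, else 0.
 * (Hypothetical helper.) */
int ex_has_cap(uint64_t caps, int cap)
{
    if (cap < 0 || cap > 63)
        return 0;
    return (int)((caps >> cap) & 1u);
}
```

A caller might fetch the mask once and then ask, for example, ex_has_cap(caps, CAP_SETPCAP) before attempting an operation that needs that capability.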
There has long been interest in creating a Linux system that did not rely upon a single root account. Capabilities are seen as the way to get there, but they have suffered from a bit of a chicken-and-egg problem. With the recent work to add file-based capabilities and restore CAP_SETPCAP to its original meaning, a true capabilities-based system is becoming possible. In the patch, which has been merged for 2.6.26, Andrew Morgan describes the new functionality:
The patch removes the global securebits variable, replacing it with an entry in struct task_struct, that can be manipulated by a process, but only for itself—and any children. Morgan envisions hybrid systems that have some utilities using capabilities to get their privileges along with some setuid(0) utilities. In that scenario, a capabilities-based utility or daemon may wish to limit what its children can do, even if they execute a setuid(0) binary. As part of the evolution, process trees can be created that cannot get root privileges.
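A process tree of that kind might be started along the following lines. This is only a sketch under stated assumptions: the function name ex_run_without_root and the exit codes 126 and 127 are invented here, and the value 0x3 is assumed to correspond to the SECURE_NOROOT bit plus its lock bit. The child sets its own securebits before calling exec, so the restriction is in place for whatever it runs.

```c
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Assumption: bits 0 and 1 are SECURE_NOROOT and its lock bit. */
#define EX_NOROOT_AND_LOCK 0x3UL

/* Fork, restrict the child with securebits, then exec argv[0].
 * Returns the child's exit status, or -1 on fork/wait failure or if the
 * child was killed by a signal. (Hypothetical helper.) */
int ex_run_without_root(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: give uid 0 no special treatment from here on. */
        if (prctl(PR_SET_SECUREBITS, EX_NOROOT_AND_LOCK, 0, 0, 0) != 0)
            _exit(126);     /* most likely the caller lacks CAP_SETPCAP */
        execvp(argv[0], argv);
        _exit(127);         /* exec itself failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Run without CAP_SETPCAP, the prctl() call is expected to fail and the helper reports 126 rather than running the program unrestricted; failing closed seemed the safer choice for an example.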
Processes which have the CAP_SETPCAP capability can change their securebits setting via the prctl() system call. There are three separate bits that govern the interaction of capabilities and setuid:
- SECURE_NOROOT – enabling this gives no special privileges to uid 0
- SECURE_NO_SETUID_FIXUP – setting this bit disables capability fixes when transitioning from or to uid 0 via setuid. This might be done for compatibility with older programs that use setuid to reduce their privileges.
- SECURE_KEEP_CAPS – when set, a process can retain its capabilities even when transitioning to a normal (not uid 0) user. This bit is cleared by exec().
prctl(PR_SET_SECUREBITS, 0x2f);
This is the equivalent of setting SECURE_NOROOT, SECURE_NOROOT_LOCKED, SECURE_NO_SETUID_FIXUP, SECURE_NO_SETUID_FIXUP_LOCKED, and SECURE_KEEP_CAPS_LOCKED.
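To see where 0x2f comes from, the value can be rebuilt from individual bit positions. The sketch below assumes the positions 0 through 5 used in <linux/securebits.h>; the EX_ names and helper functions are local to this example and are not kernel identifiers. It also shows a query of the current setting with PR_GET_SECUREBITS.

```c
#include <assert.h>
#include <sys/prctl.h>

/* Bit positions, assumed to match <linux/securebits.h>. Each flag is
 * followed immediately by its lock bit. */
enum ex_securebit {
    EX_NOROOT                 = 0,
    EX_NOROOT_LOCKED          = 1,
    EX_NO_SETUID_FIXUP        = 2,
    EX_NO_SETUID_FIXUP_LOCKED = 3,
    EX_KEEP_CAPS              = 4,
    EX_KEEP_CAPS_LOCKED       = 5
};

/* Turn a bit position into a mask. */
unsigned long ex_mask(enum ex_securebit bit)
{
    assert(bit >= EX_NOROOT && bit <= EX_KEEP_CAPS_LOCKED);
    return 1UL << bit;
}

/* The combination described in the text: NOROOT and NO_SETUID_FIXUP set
 * and locked, KEEP_CAPS left clear but locked. */
unsigned long ex_article_mask(void)
{
    return ex_mask(EX_NOROOT) | ex_mask(EX_NOROOT_LOCKED)
         | ex_mask(EX_NO_SETUID_FIXUP) | ex_mask(EX_NO_SETUID_FIXUP_LOCKED)
         | ex_mask(EX_KEEP_CAPS_LOCKED);
}

/* Return 1 if the lock bit paired with flag is set in bits. The flag
 * argument must be one of the three non-lock positions. */
int ex_is_locked(unsigned long bits, enum ex_securebit flag)
{
    return (int)((bits >> (flag + 1)) & 1UL);
}

/* Current securebits of the calling process, or -1 on error. */
long ex_current_securebits(void)
{
    return prctl(PR_GET_SECUREBITS, 0, 0, 0, 0);
}
```

Note that the mask leaves SECURE_KEEP_CAPS itself clear while setting its lock bit, so the process cannot turn that flag on later.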
The memory of the sendmail-capabilities bug from 2000 makes some people a bit queasy, or worse, about any patches that involve capabilities and setuid. Andrew Morton asks: "what was the bug which caused us to cripple capability inheritance back in the days of yore? (Some sendmail thing?)"
That bug arose because unprivileged users could take away the CAP_SETUID capability from setuid binaries like sendmail. When sendmail then called setuid() to drop its privileges, the call failed, but sendmail did not check the return value, so it kept running as root. A user could leverage this to gain root privileges. The underlying problem was a disconnect between capabilities and the longstanding behavior of Unix-like systems when dropping privileges.
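The defensive pattern that would have caught this failure is simple to sketch: check the result of setuid() and then confirm that the uids really changed. The function below is an illustration only, not sendmail's actual code, and the name ex_drop_privileges is invented for the example.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Switch to the given uid and verify that the switch took effect.
 * Returns 0 on success and -1 on any failure; a caller that gets -1
 * should stop rather than carry on with its old privileges.
 * (Hypothetical helper.) */
int ex_drop_privileges(uid_t target)
{
    if (setuid(target) != 0) {
        perror("setuid");
        return -1;
    }
    /* Belt and braces: make sure both real and effective uid changed. */
    if (getuid() != target || geteuid() != target) {
        fprintf(stderr, "uid did not change as requested\n");
        return -1;
    }
    return 0;
}
```

A daemon would call this once its privileged setup is done and exit immediately on a non-zero result.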
Morgan has written a detailed description of the sendmail-capabilities bug in response to Morton's questions. He makes it clear that he wants to move toward full capability support without breaking existing code:
As folk get more comfortable with this full capability model. I believe we can delete more cruft from the main kernel, but even that clean up will leave a fully functional legacy model in place. I feel it should be for something like init, or one of its children to be able to run subsystems in capability-only or legacy modes.
Morton seemed satisfied that his concerns had been addressed, but still wonders about the future for capabilities: "So how do we ever get to the stage where we can recommend that distributors turn these things on, and have them agree with us?" This was echoed by Ismail Dönmez, who was looking for concrete examples of how to use the per-process securebits feature. Morgan provides a pointer to some examples along with his belief that sometime soon the capabilities developers will become confident enough to recommend turning off the "experimental" flag for the SECURITY_FILE_CAPABILITIES kernel configuration option. That flag governs both the file-based capabilities and the per-process securebits. In addition, Morgan says:
A developerWorks article on file-based capabilities by Serge Hallyn and a web page on POSIX capabilities by Chris Friedhoff were both mentioned in the thread as good references for the work being done to actually use capabilities in systems. Those pre-date the securebits work, so Dönmez was looking for use-cases for the new feature. Morgan replied that containers were one, deferring to Hallyn who has some ideas on using securebits:
But I especially like the thought of for instance postfix running in a carefully crafted application container (with its own virtual network card and limited file tree and no visibility of other processes) with SECURE_NOROOT on.
Capabilities are an interesting, but complicated, security feature. For most of the ten years they have been part of the Linux kernel, they have either been broken, ignored, or both. With the latest work being done by Hallyn, Morgan, and others, capabilities are finally becoming a fully-working alternative to things like SELinux. It will be interesting to see if more user utilities will become capability-aware and whether distributions start using capabilities. Some day, root may just fade away.
Index entries for this article | |
---|---|
Kernel | Capabilities |
Security | Linux kernel/Linux/POSIX capabilities |
Dropping root's ability to write all files
Posted May 1, 2008 16:16 UTC (Thu) by bkw1a (subscriber, #4101)
I'd love to be able to create a process that had root's ability to READ all files, but lacked root's ability to WRITE all files. This would eliminate the need to run remote backup jobs as root, which has always worried me.

Dropping root's ability to write all files
Posted May 1, 2008 17:41 UTC (Thu) by Lennie (subscriber, #49641)
I'm not sure if there is an other way already, but having a process just bind a port below 1024 without being root while doing so would be really nice as well.

Dropping root's ability to write all files
Posted May 2, 2008 3:44 UTC (Fri) by bronson (subscriber, #4806)
Can't SELinux do both of these now?
SELinux is one of those things that I keep meaning to set up and learn... Just never enough round tuits.

Limiting privilege to binding reserved port
Posted May 2, 2008 15:22 UTC (Fri) by giraffedata (guest, #1954)
"having a process just bind a port below 1024 without being root while doing so would be really nice as well."
That's been there since the earliest days of capabilities. It is the NET_BIND_SERVICE capability. Whenever I have a program that wants superuser privilege just so it can bind a reserved port, I invoke it from a program "capexec", which sets NET_BIND_SERVICE capability only, sets the proper uid and gid, and execs the untrusted program. (The process is superuser when it invokes capexec).
Sometimes I have to modify the untrusted program because it contains a bogus check for uid zero.
But an even better way to deal with this is not to give any privilege to the untrusted program at all -- just pass it a file descriptor for a socket already bound to the reserved port. For that, I use "socketexec" before "capexec". Socketexec creates the socket and execs capexec. Capexec drops all privileges and execs the untrusted program. I usually have to modify the program to be able to take an already bound socket. Some programs have to bind multiple times in their life and this won't work.