1. 57
    path.join Considered Harmful, or openat() All The Things unix val.packett.cool
    1. 6

      Great write up. I consider bare open a serious code smell in new code. There are some times when you want it, but it’s quite rare. I wish cap-std could become the default for Rust.

      It made me very sad that WH21 standardised the filesystems TS with its path model that is impossible to implement in a non-racy way and which cannot be used with sandboxing (and fails to learn any other lessons from the last few decades).

      1. 6

        I guess this is the main problem with the “standardize existing practice” thing… existing practice often isn’t very good and there’s very often some emerging obviously-much-better way to do it. Maybe standardization should be trying to make the best API possible using the newest research, not standardize existing bad practice?

        1. 1

          By bare open do you mean open without O_PATH or O_DIRECTORY? The article is titled “considered harmful” but, as with many such articles, I wish it had a concise description of what the author proposes the reader use instead.

          1. 5

            I’d count those anywhere except early setup as well. Once you’ve identified the valid scopes, you should be using openat.

            1. 1

              Ah I see, thanks!
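For readers who want to see what "identify the scope early, then use openat" looks like in practice, here is a minimal sketch in Python, where `os.open(..., dir_fd=...)` maps to openat(2). The helper names `open_scope` and `read_in_scope` are made up for illustration and do not come from the article or from any library.

```python
import os


def open_scope(path: str) -> int:
    """Early setup: the one place where a plain path-based open is used."""
    return os.open(path, os.O_RDONLY | os.O_DIRECTORY)


def read_in_scope(scope_fd: int, name: str) -> bytes:
    """Open `name` relative to the directory fd (openat under the hood)."""
    fd = os.open(name, os.O_RDONLY, dir_fd=scope_fd)
    with os.fdopen(fd, "rb") as handle:
        return handle.read()
```

The directory fd stays attached to the same directory even if that directory is later renamed or its parent path is swapped out, which is the race that re-joining path strings cannot avoid. Plain openat on its own still follows ".." and symlinks, and ignores the fd entirely for absolute paths, so this is only the first half of the story.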

          2. 1

            What’s WH21?

            1. 3

              A typo for WG21.

          3. 5

            the new OCaml I/O library eio has a sandboxed dir type that for now uses realpath + string trickery and doesn’t hold an fd, but the authors are aware that it probably should and RESOLVE_BENEATH exists.

            Note that the Linux backend does use RESOLVE_BENEATH: lib_eio_linux/low_level.ml#L287-L291, but it would be good to support this on FreeBSD too, indeed. The posix backend’s realpath hack was inherited from when we were using libuv and is something that I intend to fix before the 1.0 release.

            BTW, is the fallback algorithm documented somewhere (beyond the cap-std source code)?

            1. 2

              Oh cool! I’ll update that part then.

              Re: algorithm: not to my knowledge, but that part of the source is pretty well commented.
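To give a rough idea of what a userspace fallback can look like when RESOLVE_BENEATH is unavailable, here is a deliberately simplified sketch. It is not cap-std's algorithm and not eio's: it is much stricter, refusing every symlink and every ".." component instead of resolving them safely. The function name `open_strictly_beneath` is hypothetical.

```python
import os


def open_strictly_beneath(root_fd: int, relative_path: str) -> int:
    """Walk one component at a time from root_fd, never leaving it.

    Simplification: symlinks and '..' are rejected outright rather than
    being resolved and re-checked.
    """
    if relative_path.startswith("/"):
        raise PermissionError("absolute paths are not allowed")
    parts = [p for p in relative_path.split("/") if p not in ("", ".")]
    if not parts:
        raise ValueError("empty path")
    if ".." in parts:
        raise PermissionError("'..' components are not allowed")

    current = os.dup(root_fd)  # private copy, so the caller's fd is untouched
    try:
        for name in parts[:-1]:
            # O_NOFOLLOW makes the open fail if this component is a symlink.
            next_fd = os.open(
                name,
                os.O_RDONLY | os.O_DIRECTORY | os.O_NOFOLLOW,
                dir_fd=current,
            )
            os.close(current)
            current = next_fd
        return os.open(parts[-1], os.O_RDONLY | os.O_NOFOLLOW, dir_fd=current)
    finally:
        os.close(current)
```

Because every step is an openat relative to an fd that is already known to be inside the root, a concurrent rename or symlink swap cannot redirect the walk outside it. A real fallback has to do more work than this to support symlinks and ".." that stay inside the root.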

            2. 3

              Fun fact: Windows NT has had directory-relative paths since forever, and they are not only more secure but also more performant than doing full path name resolution (which is particularly finicky on Windows). More recently it’s gotten OBJ_DONT_REPARSE (don’t follow any links[1]), though this only really works well with relative paths, as things like the C: drive are implemented as a type of symlink.

              However, the higher level Win32 APIs don’t expose this at all. The good news is that NtCreateFile is very much a stable function. Still, it would be good for there to be higher level functions that expose this functionality.

              [1]: It’s a bit more complicated than that but I didn’t feel like explaining reparse points ;).

              1. 2

                Article is more interesting as a run-down of how community consensus is achieved for such things.

                glibc provides no wrapper for openat2()

                Well, almost achieved. Guess I’ll be waiting a bit longer perhaps.

                At the very least, I’m glad this came to Linux finally (though I’m late noticing by 2 years). I’m looking forward to using it.
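Since there is no libc wrapper, openat2() has to be reached through the generic syscall() entry point. Below is a hedged sketch of doing that from Python with ctypes. It assumes Linux 5.6 or newer and assumes the syscall number 437, which is what x86-64 and aarch64 use; other ABIs may differ. `openat2_beneath` and `OpenHow` are names invented for this example.

```python
import ctypes
import os

SYS_OPENAT2 = 437       # assumption: x86-64 / aarch64 syscall number
RESOLVE_BENEATH = 0x08  # value from <linux/openat2.h>


class OpenHow(ctypes.Structure):
    """Mirror of the kernel's struct open_how (three 64-bit fields)."""
    _fields_ = [
        ("flags", ctypes.c_uint64),
        ("mode", ctypes.c_uint64),
        ("resolve", ctypes.c_uint64),
    ]


_libc = ctypes.CDLL(None, use_errno=True)
_libc.syscall.restype = ctypes.c_long


def openat2_beneath(dir_fd: int, path: str, flags: int = os.O_RDONLY) -> int:
    """Open `path` relative to dir_fd, refusing any resolution that escapes it."""
    how = OpenHow(flags=flags | os.O_CLOEXEC, mode=0, resolve=RESOLVE_BENEATH)
    fd = _libc.syscall(
        ctypes.c_long(SYS_OPENAT2),
        ctypes.c_long(dir_fd),
        ctypes.c_char_p(os.fsencode(path)),
        ctypes.byref(how),
        ctypes.c_size_t(ctypes.sizeof(how)),
    )
    if fd < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), path)
    return fd
```

On a kernel that supports it, a path such as "../secret" or an absolute path fails with EXDEV instead of escaping the directory. On older kernels, or under a seccomp filter that does not know the call, expect ENOSYS or EPERM and fall back to something else.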

                1. 1

                  I understand how all of the technologies you mentioned help prevent race conditions when navigating the file system, but I didn’t understand how RESOLVE_BENEATH for openat2() makes a directory file descriptor usable as a capability. The problem I see is that the API user is able to decide whether or not they pass that flag in the resolve member, making them able to easily go beyond the capability. Is the idea that something like seccomp-bpf would enforce the use of RESOLVE_BENEATH?

                  1. 3

                    Capsicum is really two related sets of features. The first adds fine-grained permissions to file descriptors. This lets you do things like have file descriptors that can not extend the file, or can be used for read-only mmap but not read-write, and so on. The other is a mode where you lose access to the global namespace and can access things only via file descriptors. This is a monotonic switch triggered by cap_enter. Before you’re running in capability mode, you have ambient authority and can choose whether or not openat will go up the tree and so on.

                    Once you’re in capability mode, you cannot follow .., you cannot call open, and so on. The only operations that you can perform are ones for which you have an authorising capability.

                    Capsicum is the only sandboxing framework that I’ve seen on mainstream operating systems that lets me usefully reason at the source level about what my sandbox is doing. It’s very sad that NIH prevented it from being upstreamed to Linux.

                    1. 2

                      Is the idea that something like seccomp-bpf would enforce the use of RESOLVE_BENEATH?

                      I think the answer is basically “yes”, capability mode in FreeBSD implies the behavior of RESOLVE_BENEATH:

                      [ENOTCAPABLE]

                      path is an absolute path, or contained a “..” component leading to a directory outside of the directory hierarchy specified by fd, and the process is in capability mode.

                      [ENOTCAPABLE]

                      path contains a “..” component leading to a directory outside of the directory hierarchy specified by fd and O_RESOLVE_BENEATH is specified.

                      From the man page for openat(2)