-
-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
convince OpenBSD kernel developers to support an executable obtaining the path to its own binary #6718
Comments
if I take my OpenBSD core-developer hat, I would say it will be complex 😃 but I am opened to discuss it let's me try to explain the problem from kernel point of vue. kernel should only provider information which is accurate, else it could lead to obscure errors or eventually security issues. having a interface (syscall or sysctl entry) to retrieving the pathname of the current running executable which is accurate all the time is really complex:
Even Linux provides only a partial solution (to my knowledge). For example, zig code source has a comment in zig code saying This lead to a second question: for what usage such path is need ? Because getting a pathname is per-se asking for trouble: the pathname could be out-of-date as soon as retrieved even if the kernel takes care of all the possible problems (see TOCTOU). For citing an example, Rust
Regarding zig's standard library,
Which makes me to think that every zig binary could potentially call such racy function to do the complex task of parsing a binary, whereas not all OS provides strong guarantee on the path quality. but I am unsure of the cases where stacktrace is printed (instead of an error return path). Regarding zig compiler (the binary), it is using exclusively
in both cases, the "traditional" way (when an application is source compiled) is to use a path provided at compile-time. but I agree it doesn't work for binary distribution where the installation directory isn't know at compile-time. if the path returned is wrong, it would mean building a executable with "wrong" std, or code-execution to "wrong" binary. but I agree that such problem is more theorical than pratical. Now, if I am returning to the original question to have such interface in OpenBSD kernel. I hope to have explained correctly why the kernel should not provide possibly inaccurate pathname. Eventually, an interface which return the descriptor (like for |
I don't understand any concerns with letting an executable read its own data. |
It may have a lot of complexities, but imho, it would be much better, to be done just once in the kernel, by qualified people like you, that do understand all the risks and potential pitfalls, then documented very well, with all the possible pitfalls, like most other platform do. The current situation, leads to the fact, that each program and language implementation, that needs to be ported to OpenBSD, now has to have a lot of unreliable heuristics in user space, that are rediscovered over and over again. For example, see: https://github.com/time-killer-games/libfilesystem/blob/ea009efea523439455327552f12fc71fa30c1e50/filesystem.cpp#L506C5-L506C5 and compare the complexity of the case for OpenBSD, for all other cases. |
This would help us as well. Please look at our We run on Linux, Android, AIX, macOS, Windows, Cygwin, FreeBSD, NetBSD, OpenBSD, DragonFly BSD, GNU/Hurd, Haiku, Solaris, illumos, SerenityOS, and Node.js+Noderawfs. Only the OpenBSD and AIX implementations are ridiculous and probably unreliable. Edit: The reason we need this is to implement features that operate relative to the location of the executable in the filesystem, such as dynamically loading plug-ins that add new functionality, without requiring the user (or an installation program) configuring absolute paths, and not breaking an existing installation if the program is moved. |
@johnsonjh not to be too far off topic or anything but I'd like to help contribute to your project, may I have a means to contact you outside of github? I prefer discord or slack for realtime chat if you have one, that way when I tell you the things I have to say it wont go here or somewhere else where it doesnt belong. My discord is "samuelvenable" my slack is "Samuel Venable". Its easier to chat in realtime than on a github issue or pull request. |
@semarie I would like to mention, Windows prevents a lot of the issues you mentioned by outright preventing an executable from being moved or unlinked if it currently has an actively running process which spawned from it. That would be a very drastic change to how OpenBSD works if they were to go that route, but I have no doubt there are more elegant ways of handling this. For example, if a sysctl() had an interface for getting the executable path, and the path returned points to a hard link which either no longer exists or doesn't have a matching kinfo_file.va_fileid (stat()/fstat() ino_t) and kinfo_file.va_fsid (stat()/fstat() dev_t) then under those cases, let sysctl() have a return value to indicate failure and set errno to indicate what the actual error was. See this code I wrote for example: https://github.com/time-killer-games/xproc/blob/fd06a3e5978c99a33ef57bc13b577eab8648cdcd/process.cpp#L907-L1007 And of course, if the link returned by sysctl() is a soft link, just use realpath() or readlink() recursively until it returns a valid hard link, otherwise, return an error indicator and set errno. On error, the developer could then use a fallback solution or a fallback path option they come up with on their own, even if that means just using the working directory or $HOME/.confg/whatever. It would be up to them because the Kernel-level data would not be accurate or useable. There is even a way to prevent the concern you had with multiple hard links, if multiple hard links to the executable exists (stat()/fstat() st_nlink), again, return error value to sysctl() and set errno. All these failure cases could be properly documented and all your concerns are nullified by this in my honest opinion. I tried suggesting these things to the OpenBSD mailing list, and everyone was helpful and nice to me, except the project leader Theo de Raadt, (go figure), so I gave up on the possibility of ever contributing to OpenBSD, I want to apologize to him for how I wrongly reacted to him, but I doubt that will help the situation. |
@time-killer-games I've found Theo to be "hard" but fair, if a little bit stressed at times. I'd give him another chance for sure. The "problem" with adopting some of the windows solutions is that it's impossible to do and maintain full POSIX semantics that BSD and other Unix systems target with their filesystem implementations. Making a change like that would break many existing applications that use a rename/remove/replace/re-execution workflow for updating themselves, for example. Also, you need to remember that a program executing may be running directly from memory and might not even exist on the filesystem at all - or ever. Look at "crazy" things like https://github.com/XiphosResearch/netelf or https://github.com/Frodox/execute-machine-code-from-memory/ However, these corner cases are acceptable if the use case is correctly specified and the limitations documented, so application developers can avoid TOCTOU (time-of-check vs. time-of-use errors). In most cases, when using a function like this, what the caller really wants to know is where the program was on the disk at the time that it was executed, and it shouldn't make any further assumptions beyond that, especially nothing that could be security sensitive. I believe that Theo had commented that he didn't want to introduce an interface where you could easily introduce security issues if you misunderstood it, but perhaps I'm misremembering. Edit: I'll try to reach you again later on Discord or via email. |
OpenBSD takes the prize for the hackiest implementation of
std.fs.selfExePath
, a function which is readily available on Linux, Darwin/macOS, FreeBSD, DragonFlyBSD, NetBSD, and Microsoft Windows.zig/lib/std/fs.zig
Lines 2235 to 2271 in 71ac5b1
Let's put some friendly pressure on the OpenBSD project to improve this use case.
The text was updated successfully, but these errors were encountered: