strscpy() and the hazards of improved interfaces
Back in the distant past (May 2015), LWN looked at a couple of efforts to provide improved string-handling primitives to the kernel. One of those two was recently merged, while the other has run into trouble; both cases highlight a fundamental concern Linus has about this type of kernel patch. The end result is that it is possible to evolve the kernel toward safer interfaces, but attempts to do so as a series of mass changes will probably not end well.
Normally, one does not expect to see a new API merged into the mainline for an -rc4 release, but Linus decided to make an exception when he pulled in the strscpy() patch just before 4.3-rc4. That patch set has changed a bit since it was examined here, though the intent is the same: to provide a string-copy API that is safer and easier to use than strncpy() or strlcpy(). The new copy function is:
ssize_t strscpy(char *dest, const char *src, size_t count);
This function will copy the string found in src to dest, taking care not to overflow dest, which is count bytes long. Unlike strncpy(), it always null-terminates the destination string. The return value is the number of characters copied (without the trailing NUL byte) — unless the string would not fit into dest, in which case the return value is -E2BIG.
Unlike previous versions, strscpy() will always copy what it can, returning a truncated string in dest if the whole thing does not fit. That change took away the need for the strscpy_truncate() variant, so that function is no longer provided. Opinions may differ on whether returning a truncated string is the right thing to do, but there were enough opinions in favor of doing so that this change needed to be made to get the patch merged.
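The semantics described above can be illustrated with a minimal userspace sketch. This is not the kernel's implementation; the names sketch_strscpy() and set_name() are invented for this example, and the code only mirrors the behavior described in the text: always NUL-terminate, copy what fits, and return either the copied length or -E2BIG.

```c
#include <assert.h>
#include <errno.h>      /* E2BIG */
#include <stddef.h>     /* size_t */
#include <string.h>
#include <sys/types.h>  /* ssize_t */

/*
 * Hypothetical userspace stand-in for strscpy(); an assumption made for
 * illustration, not the kernel's code.
 */
ssize_t sketch_strscpy(char *dest, const char *src, size_t count)
{
    size_t i;

    if (count == 0)
        return -E2BIG;          /* no room even for the NUL byte */

    /* Copy at most count - 1 bytes, reading each source byte once. */
    for (i = 0; i < count - 1 && src[i] != '\0'; i++)
        dest[i] = src[i];
    dest[i] = '\0';             /* the destination is always terminated */

    /* If the source did not end here, the copy was truncated. */
    return src[i] == '\0' ? (ssize_t)i : -E2BIG;
}

/* Hypothetical caller: returns 0 if name fit, -1 if it was truncated. */
int set_name(char *buf, size_t buflen, const char *name)
{
    ssize_t n = sketch_strscpy(buf, name, buflen);

    if (n < 0)  /* -E2BIG: buf holds a truncated but terminated prefix */
        return -1;
    return 0;
}
```

A caller that cares about truncation only has to test for a negative return value; a caller that does not care still ends up with a terminated string.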
There are a number of advantages claimed for this API. It lacks an internal race condition found in the others, making it more robust in the face of a string that changes while it is being copied. The return value, it is claimed, more clearly indicates overflow than the value returned by strlcpy(). Unlike strncpy(), the result is always a null-terminated string. In the end, we might have finally come up with a reasonable string-copy function after about four attempts — not bad for such a complex task.
Anybody who is firing up their editor to start converting call sites in the kernel to strscpy() may want to reconsider, though. There is a warning in both the fate of parse_integer() and Linus's comments around the merging of strscpy().
parse_integer() is the other string function covered in the May article; its purpose is to make string-to-integer conversions easier and more robust. Linus recently got rather upset about this patch set which, he thought, changed the semantics of the API and introduced bugs. Various call sites were changed to the new functions and, in the process, some of them were broken. The idea was that parse_integer() would replace the kernel's existing integer-conversion functions (simple_strtoul(), kstrtoul(), and the like), but the actual act of replacing those functions introduced regressions.
Linus was clearly afraid that the strscpy() patch could end up being a source of regressions as well. That wouldn't happen with the patch set itself, which does not convert any existing strncpy() or strlcpy() call sites. The problem happens when other, well-intentioned developers start doing those conversions. Linus described his worries in the merge commit that brought in strscpy():
Every time we introduce a new-and-improved interface, people start doing these interminable series of trivial conversion patches.
And every time that happens, somebody does some silly mistake, and the conversion patch to the improved interface actually makes things worse. Because the patch is mindnumbing and trivial, nobody has the attention span to look at it carefully, and it's usually done over large swatches of source code which means that not every conversion gets tested.
To try to head off such an outcome, Linus has made it clear that he will not be accepting patches that do mass conversions to strscpy() (note though that certain developers are already considering mass conversions anyway). It is there to be used with new code, but existing code should not be converted without some compelling reason to do so — or without a high level of attention to the possible implications of the change.
One might be tempted to think that this proclamation from Linus signals the
end of the "trivial clean-up patch" era. But that would almost certainly
be reading too much into what he said. Patches that do not make functional
changes to the code do not, one would hope, pose the same sort of risk that
API replacements do. So the flow of white-space adjustments is likely to
continue unabated. But developers who want to convert a bunch of working
code to a "safer" interface may want to think twice before sending in a
patch.
Index entries for this article | |
---|---|
Kernel | String processing |
Posted Oct 7, 2015 12:19 UTC (Wed)
by gerdesj (subscriber, #5446)
[Link]
Quite right too. It's a little worrying that a good engineering practice pronouncement from Mr T could actually cause any form of controversy. The road to OOPs is paved with good intentions and mass code changes.
Posted Oct 7, 2015 12:57 UTC (Wed)
by sorokin (guest, #88478)
[Link] (9 responses)
It is a pity that the kernel doesn't have any automated testing. Refactoring/cleanups are much less scary when test coverage of a program is good.
Posted Oct 7, 2015 13:35 UTC (Wed)
by compenguy (guest, #25359)
[Link] (5 responses)
Posted Oct 7, 2015 14:13 UTC (Wed)
by NAR (subscriber, #1313)
[Link] (4 responses)
Posted Oct 7, 2015 16:20 UTC (Wed)
by smckay (guest, #103253)
[Link] (1 responses)
Posted Oct 12, 2015 3:57 UTC (Mon)
by jwarnica (subscriber, #27492)
[Link]
If they wanted code coverage, they would have it.
Posted Oct 7, 2015 16:25 UTC (Wed)
by raven667 (subscriber, #5198)
[Link]
Heck, if you put a gateway between the highly-tested parts and the others you are half way to a micro-kernel 8-)
Posted Oct 7, 2015 20:49 UTC (Wed)
by riddochc (guest, #43)
[Link]
That said, it doesn't mean that the hardware actually works that way. As far as testing goes, this would be a big step forward, but still not quite a substitute for having the actual hardware to test against.
QEMU is a remarkable piece of software, made even better by KVM. It enables lots of things that would have been difficult or tedious before, like this sort of testing. It could use more documentation, but it's well worth spending some time with to see what it can offer.
Posted Oct 7, 2015 13:54 UTC (Wed)
by intgr (subscriber, #39733)
[Link]
Posted Oct 8, 2015 20:43 UTC (Thu)
by robclark (subscriber, #74945)
[Link]
you mean like http://kernelci.org/ or 0-day kbuild robot thing (which iirc is doing boot tests on qemu?)
(granted that probably doesn't scratch the surface when it comes to driver coverage w/ all the different hw that is out there)
Posted Oct 16, 2015 5:40 UTC (Fri)
by ksandstr (guest, #60862)
[Link]
In addition, tests rot and turn crusty. Old tests accrete and become difficult to validate (even if test validation were automated) as the collective knowledge used as their basis fades. At that point one could insist on formal specs, and then derive tests from those specs concurrently with the implementation, so that test rot is preceded by spec rot by at least one rung on a metaphorical ladder of things I'll climb tomorrow (honest!) because The Boss wants results last week.
Not to mention that there's at least two incompatible schools of test-writing, just divided by style of tool: the multiple-assertion tests (as in Check, JUnit, etc), and the multiple-point tests (Perl's Test Anything Protocol). The latter has great density, produces a useful output for successful tests, and offers many comforts for the programmer, whereas the former is popular and well familiar to the Java generation (those who studied after 2000).
Running all these tests as part of regular development, i.e. so that a TDD-ey test replaces operator interactions, would also become more difficult as coverage increases. Tens of megabytes of code means hundreds of megabytes of test code, and if each of (say) 20,000 tests runs for five seconds, then the grand suite would execute serially in ~28 hours. That's long enough that 1/100 of it is too long a break to sustain programmer attention over an edit-compile-test cycle.
So there's quite a few obstacles there, with few gains early on (the first decade?). I've not even touched on the architectural testability issues, i.e. whether it's even possible to programmatically explore significant areas of the (state * parameter) space for various important operations within the kernel.
Posted Oct 7, 2015 13:29 UTC (Wed)
by pj (subscriber, #4506)
[Link] (3 responses)
Posted Oct 7, 2015 14:23 UTC (Wed)
by rriggs (guest, #11598)
[Link] (2 responses)
Posted Oct 7, 2015 20:43 UTC (Wed)
by utoddl (guest, #1232)
[Link] (1 responses)
Posted Oct 12, 2015 9:00 UTC (Mon)
by JdGordy (subscriber, #70103)
[Link]
1) There is going to be someone somewhere which depends on that behaviour(!)
Posted Oct 7, 2015 13:34 UTC (Wed)
by ncm (guest, #165)
[Link] (27 responses)
Ultimately, though, all the hand-wringing is really over what a weak language C is. I will be happy to see Rust begin to displace C, particularly in codecs, which are a particularly dense source of security holes that have no reason to exist. Since Rust is entirely compatible with kernel code, needing no special runtime support, we might reasonably hope to see it in kernel code soon, starting probably with drivers. Torvalds has to know that any rants over it will just make him look foolish, so we should only need to wait on more maturity in the compiler. (That is not to say that is how it will go.)
Posted Oct 7, 2015 13:58 UTC (Wed)
by intgr (subscriber, #39733)
[Link] (3 responses)
Was there a rant about Rust by Torvalds that I missed, or are you referencing his rants against C++?
Posted Oct 7, 2015 19:57 UTC (Wed)
by hummassa (guest, #307)
[Link] (2 responses)
I don't know if there was any specific Rust rant, but his C++ rants look really dated these days.
Posted Oct 8, 2015 19:23 UTC (Thu)
by lsl (subscriber, #86508)
[Link] (1 responses)
Posted Oct 28, 2015 15:39 UTC (Wed)
by hummassa (guest, #307)
[Link]
1. C++ programmers are idiots
this is one of LT's least brilliant moments, IMNSHO.
2. C++ leads to bad design choices
yeah, and still does somewhat but look: kernel-C uses more or less the same design choices, and loses the correctness checks that C++ would give. RAII, people. RAII.
3. STL & Boost are non-portable, non-stable OR
old-school
4. STL & Boost can be nice, but are full of hidden surprises
old-school
5. C++ is unportable
where there is gcc, there is g++ and libstdc++; and probably clang++ and libc++ too.
He had reservations (in a 2004 post IIRC) about exceptions, too (because at the time they were expensive even when unused... which nowadays is ancient history). Another complaint for which he had ample reason (and still has, to some extent) is that the type system causes ginormous, illegible error messages (but this has been mitigated a lot by both clang++ and g++ lately).
Posted Oct 7, 2015 15:20 UTC (Wed)
by wahern (subscriber, #37304)
[Link] (5 responses)
Top of the list of problems are boxed types like Arc, which require dynamic allocation. But Rust provides no way to handle OOM for these hidden allocations except to terminate the thread.
Rust needs a simple exception mechanism, perhaps similar to Lua's pcall(), which allows you to stop a stack unwind. But the majority of Rust developers seem adamant that it's impossible to handle OOM correctly or cleanly (correctly even, without hacks like emergency pools), despite the fact that people have been doing it for ages, and that it would be even easier with Rust.
Posted Oct 7, 2015 15:22 UTC (Wed)
by wahern (subscriber, #37304)
[Link]
Posted Oct 7, 2015 20:31 UTC (Wed)
by [email protected] (guest, #94131)
[Link] (3 responses)
Posted Oct 7, 2015 22:55 UTC (Wed)
by wahern (subscriber, #37304)
[Link] (2 responses)
Per the paper:
Rust is designed around the notion of immutability and copy-by-value, with the compiler optimizing the copies away. If you have to use mutable references everywhere you would use a pointer in C (because Rust no longer has pointers at all, AFAIU), wouldn't it make it impossible to write idiomatic Rust code? Mutable types come with significant constraints regarding what you can do with the value and when, so it seems to me like it'd be a heavy burden. But I haven't used Rust before, so probably I'm missing something here.
Posted Oct 8, 2015 2:21 UTC (Thu)
by jameslivingston (guest, #57330)
[Link] (1 responses)
There are a lot of places in low-level code (collection implementations, concurrency things, etc) where you want to allow multiple mutable references to memory, and Rust lets you do that provided you do it in a block marked "unsafe", which turns off the compiler checking and is the programmer promising to maintain memory safety. The advantage is that those lower level bits of code can expose safe functions with unsafe blocks inside them, meaning the implementer takes responsibility for correctness but the user does not have to worry, unlike something like Haskell's IO Monad which requires making everything up the call chain also in IO.
Rust used in a kernel would use little to none of the standard Rust library. Unlike most recent languages, it's a library not a runtime - just like C, and the kernel doesn't use libc's malloc().
Posted Oct 8, 2015 3:37 UTC (Thu)
by nybble41 (subscriber, #55106)
[Link]
... but more or less exactly like Haskell's `unsafePerformIO` primitive, which takes an IO block and executes it inside pure code, leaving it up to the implementer to take responsibility for correctness while exposing a safe (i.e. pure) interface.
The IO type is for the majority of IO-based functions which do _not_ expose a pure interface.
There is also the ST type, which permits mutation of local variables without the possibility of I/O. In this case the language enforces the purity of the interface; only pure values are permitted to escape the ST computation via the `runST` primitive. By taking advantage of rank-2 polymorphism, the language ensures that any attempt to pass a reference to an ST variable outside `runST` and into pure code results in a type error.
Posted Oct 7, 2015 16:12 UTC (Wed)
by wahern (subscriber, #37304)
[Link] (10 responses)
But ultimately I think the difference is slight. The scorn some people show for strlcpy goes beyond all reason. I'd be happy to see either interface become standardized, but the fact of the matter is that strlcpy is already much more widely used and is a de facto standard. Any deficiencies in the semantics are overcome by the huge amount of production and example code out there using it correctly, often with copious descriptions.
Posted Oct 7, 2015 17:53 UTC (Wed)
by ballombe (subscriber, #9523)
[Link] (3 responses)
I agree that -E2BIG is dangerous.
Posted Oct 7, 2015 20:57 UTC (Wed)
by reubenhwk (guest, #75803)
[Link]
Only when dst is shorter than src right?
> Not only this slows thing down for no good reason
The reason is to inform the caller how large dst needs to be for success in the case of a failure. That seems like a very good reason to me. I doubt the extra slowness would be measurable in a typical use case.
Posted Oct 7, 2015 21:04 UTC (Wed)
by wahern (subscriber, #37304)
[Link] (1 responses)
Huh. So that's what's meant by supposedly being free of a race condition?
But if src isn't properly NUL terminated there are much bigger problems afoot. strscpy is claiming to the unwary user that it's somehow safer (or worse, safe without qualification) in a context where the src may be corruptible. But that's a guarantee it can't make. An attacker isn't always limited to writing forward in step with the loop in strscpy. I would think such a constraint would be the exception rather than the rule. As src is only being read, what we're concerned with here are information leaks or reads of unmapped memory, both of which are still possible if you can't trust the src.
strscpy seems especially problematic for making such a claim. It has the same problems as strlcpy when the user makes the wrong assumptions about the return value, but also intentionally misleads them about the semantics. And for all the ridicule, it still adopts strlcpy's poorly ordered argument list inherited from strncpy which has _actually_ been a factor or cause in many misuses of strlcpy--accidentally passing a src length (or otherwise src-derived value) instead of a destination buffer size.
If people were serious about the claimed strlcpy deficiencies, I would think they'd adopt something like error_t strscpy(char *dst, size_t *len, size_t lim, const char *src).
It's not a drop-in replacement for strncpy (or strlcpy), requiring people to pay attention to any code refactor, but just as easy to use given that to use the computed length of either strscpy or strlcpy properly you already need a named variable. The error condition and length are communicated via two separate values, significantly reducing the likelihood that the length will be used unchecked. Indeed, the length could even be set to 0 on error. And when inlined it should perform just as well as either alternative.
Otherwise, all the debate just seems like bike shedding and NIH syndrome. It seems disingenuous to fault strlcpy for problems that aren't fixed by the alternative. The only difference from strlcpy in the implied misuse scenario is the possibility of a compiler warning, notwithstanding that signed-to-unsigned conversions are legal. And strscpy could be worse because (size_t)-E2BIG is likely to be much bigger than the src length. (OTOH, it could be better, leading to segfaults more quickly.) strlcpy was a pragmatic compromise. It would seem so is strscpy, reflecting a certain set of preferred esthetics and compromises that certainly aren't objectively better than strlcpy.
At the very least the inclusion adds weight to the argument that glibc's stance has been misguided all along. These sorts of routines have utility and address a legitimate gap in C's string handling API. The real issue comes down to having to wade through the bike shedding, which I guess I can't fault glibc maintainers for wanting to avoid. musl libc has added strlcpy, though, and by most accounts musl reflects exceptional code quality. Maybe glibc will acquiesce, or at least propose an objectively better interface.
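The four-argument interface proposed in the comment above, with the error and the length reported separately, might look something like the following sketch. Everything here is an assumption for illustration: the name strscpy_split(), the use of int in place of error_t, and the decision to set the length to 0 on error (which the comment offers only as a possibility).

```c
#include <assert.h>
#include <errno.h>   /* E2BIG */
#include <stddef.h>  /* size_t */
#include <string.h>

/*
 * Hypothetical interface: returns 0 on success or E2BIG if src did not
 * fit into lim bytes. The copied length is reported through *len and is
 * left at 0 on error, so it cannot be mistaken for a usable length.
 */
int strscpy_split(char *dst, size_t *len, size_t lim, const char *src)
{
    size_t i;

    *len = 0;
    if (lim == 0)
        return E2BIG;

    for (i = 0; i < lim - 1 && src[i] != '\0'; i++)
        dst[i] = src[i];
    dst[i] = '\0';              /* dst is terminated in every case */

    if (src[i] != '\0')
        return E2BIG;           /* truncated; *len stays 0 */

    *len = i;
    return 0;
}
```

Because the destination size sits next to the destination-related arguments and ahead of src, passing a source-derived length by accident is at least visually harder than with the strncpy()-style argument order.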
Posted Oct 8, 2015 22:32 UTC (Thu)
by ncm (guest, #165)
[Link]
Almost always just "if (0 > strscpy" suffices. But if somebody meant to adopt a sensible interface, that would always suffice, because it would take two more arguments, a size_t* and a size_t, with the latter an offset and the former a place to record the offset of the NUL after the copy. But sensible is too much to expect when strlcpy seems like a good idea to many. (And, no, the pointer alone isn't enough; it's too easy to forget to initialize what it points at before the call.) The name of the sensible interface would be strto(), because long, cryptic names with mystifying initials do nobody any good.
Posted Oct 7, 2015 20:23 UTC (Wed)
by PaXTeam (guest, #24616)
[Link] (5 responses)
Posted Oct 7, 2015 22:05 UTC (Wed)
by wahern (subscriber, #37304)
[Link] (4 responses)
Regarding not checking the return value: many times strlcpy is replacing code that already didn't check for the return value, and added as a stop-gap. More importantly, not checking for a return value is not always a bug, or even usually a bug, in the contexts where you add strlcpy. Checking the return value is neither necessary nor sufficient as a general matter; it's context specific.[1] If the semantics of the code are garbage-in, garbage-out, then not checking for truncation is not necessary. If the semantics are that truncated input could subvert some condition, then checking for truncation is not always sufficient. Coding an algorithm so that garbage-in is safely garbage-out is arguably the most robust way to write secure code. If the data is highly structured, you probably shouldn't be using C strings much. Separating constraint checks from the core algorithm that processes input is an anti-pattern when writing secure code--it's usually better to put constraints on the _actual_ state of the algorithm processing the input. This is why, for example, Lua removed its bytecode validator--it was neither necessary nor sufficient, and in practice added needless complexity; remove needless complexity and it becomes easier to focus on finding and fixing bugs in the code that matters.
And strscpy is hardly better relative to strlcpy weak points. strscpy overloads the return value just like strlcpy does. If you don't check for a failure condition, the length will be too large (extremely large in the case of (size_t)-E2BIG). And nothing inherent in strscpy forces a developer to check for truncation.
You could argue truncation checking is slightly less prone to bugs. The obvious issue with strlcpy is using n > lim instead of n >= lim to compare, an off-by-one. But I could just as easily envision people doing n <= 0 or n == E2BIG instead of n < 0 or n == -E2BIG with strscpy. What matters is the _idiom_ that people will use and adopt. And in any event, in both cases the misuses are fairly easy to locate using pattern matching.
It's a shame that so many otherwise rational people have adopted such an irrational aversion to strlcpy. There's so much poor reasoning involved. Even if it was the worst interface in the world, the situation is compounded by creating conditions where people constantly reimplement it, often with bugs. It's widely included in many projects; attempts at berating people into not using it have manifestly failed. It's like the war on drugs. Yes, drugs harm society--largely as a result of abuse by a small subset of individuals (sorta like the specter of strlcpy misuses). Yes society would be better off without drugs. And yet banning them doesn't work.
[1] Grepping for an unchecked strlcpy in the OpenBSD source tree, the first hit brings me to line 776 in bin/csh/csh.c. rechist is copying a command-line into the history buffer. It's old, crufty code, dating to the 1980s. That edit was made in 2003 and replaced a use of strcpy. There's no easy way to bubble up a truncation error, and panicking or exiting failure on a truncation could break existing code; indeed it could introduce security issues of its own by obscuring the exit status of the command. And truncation in this context is about as benign as these things come in the real world. Would you suggest refactoring all of the logic in csh related to management of the history buffer? Make it all dynamically allocated, as GNU developers insist upon? Reject command-lines longer than a record buffer when in interactive sessions? Yeesh.
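The two truncation-check idioms discussed in the comment above can be put side by side. The following is a sketch under stated assumptions: sketch_strlcpy() and sketch_strscpy() are simplified userspace stand-ins written for this example (not the BSD or kernel implementations), and the two *_truncated() helpers are hypothetical names.

```c
#include <assert.h>
#include <errno.h>      /* E2BIG */
#include <stdbool.h>
#include <stddef.h>     /* size_t */
#include <string.h>
#include <sys/types.h>  /* ssize_t */

/* strlcpy()-style stand-in: returns strlen(src) whether or not it fit. */
size_t sketch_strlcpy(char *dst, const char *src, size_t lim)
{
    size_t n = strlen(src);

    if (lim != 0) {
        size_t k = n < lim - 1 ? n : lim - 1;

        memcpy(dst, src, k);
        dst[k] = '\0';
    }
    return n;
}

/* strscpy()-style stand-in: returns the copied length or -E2BIG. */
ssize_t sketch_strscpy(char *dst, const char *src, size_t lim)
{
    size_t i;

    if (lim == 0)
        return -E2BIG;
    for (i = 0; i < lim - 1 && src[i] != '\0'; i++)
        dst[i] = src[i];
    dst[i] = '\0';
    return src[i] == '\0' ? (ssize_t)i : -E2BIG;
}

/* Correct strlcpy idiom: ">= lim". Using "> lim" misses strlen(src) == lim. */
bool strlcpy_truncated(char *dst, const char *src, size_t lim)
{
    return sketch_strlcpy(dst, src, lim) >= lim;
}

/* Correct strscpy idiom: "< 0". Using "<= 0" misreports an empty source. */
bool strscpy_truncated(char *dst, const char *src, size_t lim)
{
    return sketch_strscpy(dst, src, lim) < 0;
}
```

The boundary cases are the interesting ones: a four-byte source with a four-byte buffer for the strlcpy() check, and an empty source for the strscpy() check.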
Posted Oct 7, 2015 22:54 UTC (Wed)
by PaXTeam (guest, #24616)
[Link] (2 responses)
check their kernel tree for strlcpy, it's called over 1800 times, 10 of which check the return value (8 of which are in ofdev.c), 2 of them blindly accumulate its return value and the rest do exactly nothing which means potential silent truncation. from here on the argument can go two ways, neither of which is good for your case. either those silent truncations cannot occur in which case strlcpy is utterly useless or they can occur in which case the use of strlcpy is wrong.
> Regarding not checking the return value: many times strlcpy is replacing code that already didn't check for the return value, and added as a stop-gap.
you've just proved how strlcpy encouraged even more sloppy programming. the copy-paste hoard turned one kind of bug into another. i wouldn't call that progress let alone an example to set for others to follow.
> And strscpy is hardly better relative to strlcpy weak points.
it is infinitely better as it doesn't waste cycles to compute a useless strlen that pretty much no caller cares about. in other news, you must have never written code that tries to copy out substrings from a big one using strlcpy. face it, strlcpy is a design mistake that should just die.
> And nothing inherent in strscpy forces a developer to check for truncation.
that's a strawman, nothing forces anybody to check anything at this rate. if people care about callers doing the right thing then there's __must_check in linux (a gcc attribute so it's not really specific to the kernel) but given how nobody cares about truncation (or at least doesn't want to learn about it from str*cpy) i can understand why it's not enforced. IOW, i don't think the return value even matters, if there's a potential of truncation and it matters, the callers will already have to do something else and the return value from str*cpy is irrelevant.
> It's a shame that so many otherwise rational people have adopted such an irrational aversion to strlcpy.
it's a shame that so many otherwise rational people have adopted such an irrational affection to strlcpy. how about you resort to rational arguments instead of cheap rhetoric?
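For readers unfamiliar with the __must_check annotation mentioned above: outside the kernel, the same effect can be had from the GCC/Clang warn_unused_result attribute. The sketch below is illustrative only; sketch_must_check, checked_strscpy() and copy_label() are names invented for this example, and the copy routine is a simplified stand-in rather than the kernel's strscpy().

```c
#include <assert.h>
#include <errno.h>      /* E2BIG */
#include <stddef.h>     /* size_t */
#include <string.h>
#include <sys/types.h>  /* ssize_t */

/* Local spelling of the attribute; requires GCC or Clang. */
#define sketch_must_check __attribute__((warn_unused_result))

sketch_must_check
ssize_t checked_strscpy(char *dest, const char *src, size_t count);

/* Simplified stand-in: copies what fits, returns length or -E2BIG. */
ssize_t checked_strscpy(char *dest, const char *src, size_t count)
{
    size_t i;

    if (count == 0)
        return -E2BIG;
    for (i = 0; i < count - 1 && src[i] != '\0'; i++)
        dest[i] = src[i];
    dest[i] = '\0';
    return src[i] == '\0' ? (ssize_t)i : -E2BIG;
}

/* Hypothetical caller: returns 0 on success, -1 if src was truncated. */
int copy_label(char *dst, size_t dstlen, const char *src)
{
    /*
     * Writing "checked_strscpy(dst, src, dstlen);" as a bare statement
     * here would draw an unused-result warning from the compiler.
     */
    if (checked_strscpy(dst, src, dstlen) < 0)
        return -1;
    return 0;
}
```

The attribute only produces a warning when the result is discarded; it does nothing to make sure the caller handles the truncated case sensibly.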
Posted Oct 8, 2015 8:40 UTC (Thu)
by epa (subscriber, #39769)
[Link] (1 responses)
I'm not saying this is the best way to do things. Personally, if the truncation "cannot occur", I would rather use a string-copying function which just panics the kernel if that "impossible" condition should ever happen in practice. But I guess the programmers of OpenBSD have their reasons for preferring the way they do it. This is defensive programming: code which is demonstrably wrong, since it is predicated on something which will "never" happen (an input not meeting a precondition, an out-of-range value in something which has already been range-checked earlier, and so on), but which given human fallibility may have some value in practice.
Posted Oct 8, 2015 9:15 UTC (Thu)
by PaXTeam (guest, #24616)
[Link]
Posted Oct 8, 2015 7:37 UTC (Thu)
by kleptog (subscriber, #1183)
[Link]
This is the kernel we're talking about and there error codes less than zero are used throughout to handle exceptions. Arguably this means this function will fit in perfectly with the rest of the kernel thus reducing errors of this sort. The idiom is the same as for the rest.
Now, if they were proposing that *user-space* use this function, you'd have a point because error handling in C is all over the place, even within the C library, so that chance of issues is higher. Fortunately, no-one is proposing strscpy for user space.
Posted Oct 8, 2015 6:47 UTC (Thu)
by epa (subscriber, #39769)
[Link] (5 responses)
Posted Oct 9, 2015 0:12 UTC (Fri)
by Richard_J_Neill (subscriber, #23093)
[Link] (4 responses)
Posted Oct 9, 2015 0:55 UTC (Fri)
by dlang (guest, #313)
[Link] (2 responses)
Posted Oct 10, 2015 2:07 UTC (Sat)
by zlynx (guest, #2285)
[Link]
Posted Oct 12, 2015 12:05 UTC (Mon)
by renox (guest, #23785)
[Link]
Posted Oct 4, 2018 0:47 UTC (Thu)
by klossner (subscriber, #30046)
[Link]
A more likely design choice at the time would have been the UCSD Pascal scheme in which the first bytes of the char array contain the length. On the PDP-11, a two-byte length was sufficient. But that would have given up the convenience that a pointer to string is a pointer to the first byte of its data, allowing the same function to take either a string or a non-string buffer, e.g.
Buffer overflow wasn't on the radar much for 16-bit machines. We had so little space that we took better care of it. Then the 32-bit VAX came along and that discipline went by the wayside.
Posted Oct 7, 2015 18:36 UTC (Wed)
by madscientist (subscriber, #16861)
[Link] (5 responses)
Posted Oct 7, 2015 19:49 UTC (Wed)
by Karellen (subscriber, #67644)
[Link] (4 responses)
ssize_t strscpy(char * dest, size_t destlen, char const * src);
where the destination buffer and its size are together as the first two parameters, as they are for snprintf(), fgets(), fread(), strftime(), and probably some others that I'm forgetting right now.
I realise that this is to be consistent with the parameter order/meaning of strncpy() and strlcpy() (and memset()) - but I think the parameter order for those functions is daft too. Obviously we can't change them, but that doesn't stop them from being Wrong, and it doesn't mean we have to copy them in the future.
Posted Oct 7, 2015 20:37 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link] (3 responses)
Posted Oct 7, 2015 21:06 UTC (Wed)
by reubenhwk (guest, #75803)
[Link]
Posted Oct 7, 2015 21:11 UTC (Wed)
by josh (subscriber, #17465)
[Link]
(See also opinions on AT&T versus Intel assembly syntax.)
Posted Oct 8, 2015 18:19 UTC (Thu)
by Karellen (subscriber, #67644)
[Link]
Yes, I know that read(), gettimeofday(), getrusage(), getrlimit() and others have the destination buffer last, but they seem like the odd ones out to me.
(My sense of tidiness is less aggravated by the *_r() functions, as creating an updated API by appending a parameter to the new function seems less disruptive than reordering the existing parameters.)
Posted Oct 8, 2015 12:28 UTC (Thu)
by pabs (subscriber, #43278)
[Link] (3 responses)
Posted Oct 8, 2015 16:43 UTC (Thu)
by reubenhwk (guest, #75803)
[Link] (2 responses)
Posted Oct 8, 2015 18:33 UTC (Thu)
by Karellen (subscriber, #67644)
[Link] (1 responses)
If the problem with strlcpy() is that people would still use the truncated string after ignoring the return value being too big, then I'm not entirely convinced that just changing the return value to -E2BIG will fix that. Making an overrun provide an empty string to the caller seems like it would be even less ignorable and more likely to result in even more robust code.
Is truncation a better alternative than NULing the string? I am not yet convinced - but I have not read the thread where such arguments are likely to have been explored.
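The alternative raised in the comment above, handing back an empty string on overrun instead of a truncated one, could be sketched as follows. This is purely hypothetical: strscpy_or_empty() is a name invented here, and nothing in the merged patch behaves this way.

```c
#include <assert.h>
#include <errno.h>      /* E2BIG */
#include <stddef.h>     /* size_t */
#include <string.h>
#include <sys/types.h>  /* ssize_t */

/*
 * Hypothetical variant: on overflow, dest becomes the empty string and
 * -E2BIG is returned, so a caller that ignores the return value cannot
 * go on to use a silently truncated copy.
 */
ssize_t strscpy_or_empty(char *dest, const char *src, size_t count)
{
    size_t i;

    if (count == 0)
        return -E2BIG;

    for (i = 0; i < count - 1 && src[i] != '\0'; i++)
        dest[i] = src[i];

    if (src[i] != '\0') {
        dest[0] = '\0';         /* overflow: discard the partial copy */
        return -E2BIG;
    }

    dest[i] = '\0';
    return (ssize_t)i;
}
```

Whether an empty string is a safer failure mode than a truncated one depends on the caller, which is much the same context-dependence argued over elsewhere in this thread.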
Posted Oct 10, 2015 15:20 UTC (Sat)
by pabs (subscriber, #43278)
[Link]
Posted Oct 15, 2015 17:02 UTC (Thu)
by dfsmith (guest, #20302)
[Link] (1 responses)
Is strscpy functionally equivalent (though presumably faster than) the following?
Posted Oct 15, 2015 22:47 UTC (Thu)
by PaXTeam (guest, #24616)
[Link]
One could argue that, in each case it is introduced, if the code to handle -E2BIG isn't already written, then there's an existing bug waiting to be hit.
2) The code might be there but not in the immediate area, so you'd miss it with a mass replace (i.e. the safety checks are performed up the call chain), and you'd end up adding unnecessary code because "you" thought there was a preexisting bug when the code wasn't understood well enough.
LT's C++ rants
The standard recommendation in rust is to never
write a function that directly returns a boxed
object[5]. Instead, the function should return the object
by value and the user should place it in a box using
the box keyword. This is because (as mentioned in
subsubsection 3.1.1) rust will automatically rewrite
many functions returning objects to instead use outpointers
to avoid a copy.
(3.3.1p2 of http://scialex.github.io/reenix.pdf)
either those silent truncations cannot occur in which case strlcpy is utterly useless or they can occur in which case the use of strlcpy is wrong.
I would suggest a third: they "cannot occur" as far as the programmer knows, and as far as anyone who has reviewed the code knows - but since the programmer is only human, it is possible he or she has made a mistake. In the case of such a mistake, a silent truncation of the string is less bad than allowing a buffer overflow.
write(1, "Hello, world\n", 13).
#define strscpy(DEST,SRC,LEN) ((strncpy(DEST,SRC,LEN)[(LEN)-1]='\0',strnlen(SRC,LEN)>=(LEN))?-E2BIG:strlen(DEST))