GNU libc 2.15 released

[Posted March 22, 2012 by corbet]

From:		Carlos O'Donell <carlos_odonell-AT-mentor.com>
To:		libc-alpha <libc-alpha-AT-sourceware.org>, <libc-announce-AT-sourceware.org>, <info-gnu-AT-gnu.org>
Subject:		The GNU C Library version 2.15 is now available.
Date:		Wed, 21 Mar 2012 15:54:17 -0400
Message-ID:		<[email protected]>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

The GNU C library
=================

The GNU C Library version 2.15 is now available.

The GNU C Library is used as *the* C library in the GNU systems 
and most systems with the Linux kernel. 

The GNU C Library is primarily designed to be a portable 
and high performance C library. It follows all relevant 
standards including ISO C99 and POSIX.1-2008.  It is also
internationalized and has one of the most complete 
internationalization interfaces known. 

The GNU C library webpage is at http://www.gnu.org/software/libc/

Packages for the 2.15 release may be downloaded from:
        http://ftpmirror.gnu.org/libc/
        http://ftp.gnu.org/gnu/libc/

The mirror list is at http://www.gnu.org/order/ftp.html

NEWS for version 2.15
=====================

* The following bugs are resolved with this release:

  6779, 6783, 9696, 10103, 10709, 11589, 12403, 12786, 12840, 12847, 12868,
  12852, 12874, 12885, 12892, 12906, 12907, 12922, 12935, 12962, 13007,
  13021, 13061, 13062, 13067, 13068, 13085, 13088, 13090, 13092, 13096,
  13114, 13118, 13123, 13134, 13138, 13147, 13150, 13166, 13179, 13185,
  13189, 13192, 13268, 13276, 13282, 13291, 13305, 13328, 13335, 13337,
  13344, 13358, 13367, 13413, 13416, 13423, 13439, 13446, 13472, 13484,
  13506, 13515, 13523, 13524, 13538, 13540

* New program pldd to list loaded object of a process
  Implemented by Ulrich Drepper.

* Add nss_db support back to glibc.  No more dependency on Berkeley db
  and support for initgroups lookups.
  Implemented by Ulrich Drepper.

* Optimized strcpy, strncpy, stpcpy, stpncpy for SSE2 and SSSE3 on x86-32.
  Contributed by HJ Lu.

* Improved strcpy, strncpy, stpcpy, stpncpy for SSE2 and SSSE3 on x86-64.
  Contributed by HJ Lu.

* Optimized strcat, strncat on x86-64 and optimized wcscmp, wcslen, strnlen
  on x86-32 and x86-64.
  Contributed by Liubov Dmitrieva.

* Optimized strchr and strrchr for SSE on x86-32.
  Contributed by Liubov Dmitrieva.

* Optimized memchr, memrchr, rawmemchr, memcmp, wmemcmp, wcschr, wcscpy
  for x86-64 and x86-32.
  Contributed by Liubov Dmitrieva.

* New interfaces: scandirat, scandirat64
  Implemented by Ulrich Drepper.

* Checking versions of FD_SET, FD_CLR, and FD_ISSET added.
  Implemented by Ulrich Drepper.

* nscd now also caches the netgroup database.
  Implemented by Ulrich Drepper.

* Integrate libm with gcc's -ffinite-math-only option.
  Implemented by Ulrich Drepper.

* Lots of generic, 64-bit, and x86-64-specific performance optimizations
  to math functions.  Implemented by Ulrich Drepper.

* Optimized strcasecmp and strncasecmp for AVX on x86-64.
  Implemented by Ulrich Drepper.

* New Linux interfaces: process_vm_readv, process_vm_writev

* Optimized strcasecmp and strncasecmp for SSSE3 and SSE4.2 on x86-32.
  Implemented by Ulrich Drepper.

* Optimized nearbyint and strcasecmp for PPC.
  Implemented by Adhemerval Zanella.

* New locales: bho_IN, unm_US, es_CU, ta_LK

Contributors
============

This release was made possible by the contributions of many people.
The maintainers are grateful to everyone who has contributed
changes or bug reports.  These include:

Adhemerval Zanella
Alan Modra
Andreas Dilger
Andreas Jaeger
Andreas Krebbel
Andreas Schwab
Aurelien Jarno
Bruno Haible
Carlos O'Donell
David S. Miller
Denis Zaitceff
H.J. Lu
Jakub Jelinek
Jeff Law
Jiri Olsa
John Stanley
Joseph Myers
Liubov Dmitrieva
Marek Polacek
Michael Zolotukhin
Mike Frysinger
Paul Pluzhnikov
Petr Baudis
Rafael Ávila de Espíndola
Richard B. Kreckel
Roland McGrath
Ross Lagerwall
Samuel Thibault
Thomas Jarosch
Thorsten Kukuk
Ulrich Drepper
Will Schmidt
==

Cheers,
Carlos.
- -- 
Carlos O'Donell
Mentor Graphics / CodeSourcery
[email protected]
[email protected]
+1 (613) 963 1026
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (MingW32)

iQEcBAEBAgAGBQJPajHpAAoJECXvCkNsKkr/4IkIAKMi5K1k5aMH/oQeRyPodo4n
2qhkOKBptwxACvWlDzo8kVCNVNsVrbBpdNIao+Tr8YUePfiTP+J+aI/0IFIrPUfY
7y8Tn8vB+JJGrxE1LJi3ZcZSP5v/g7/E0vSuMbu072h50s2RC/xHIBbL2jfLyWGk
+6rRGR5vHt57nHHiOsyoKo25GzPkyhROq4itCpcbWyqkyIxdLQwrxDaCK5SVfJZm
Lk1LhqeB/Y622LRB6nBCU3JbTAKuP4sw5txfYcUi8zDF1OYHAbcatByl9ICAWEEp
uHNuLGJMLg2XVHl3TDzpM8NySggX1dgZpegEaUEXr9u/6X+DurHUcuuJODyuP/M=
=dju7
-----END PGP SIGNATURE-----

Still no strlcpy and friends

Posted Mar 22, 2012 17:45 UTC (Thu) by smurf (subscriber, #17840) [Link] (72 responses)

No comment. :-(

Still no strlcpy and friends

Posted Mar 22, 2012 17:53 UTC (Thu) by tjc (guest, #137) [Link]

I was going to complain about that, but you beat me to it. :)

Still no strlcpy and friends

Posted Mar 22, 2012 18:03 UTC (Thu) by daney (guest, #24551) [Link] (9 responses)

Contributing to glibc has become easier recently, so you could try sending a patch to add these. It might even be accepted.

Still no strlcpy and friends

Posted Mar 22, 2012 21:02 UTC (Thu) by mgedmin (subscriber, #34497) [Link] (8 responses)

IIRC strlcpy was explicitly rejected by Ulrich Drepper on the grounds of being an attractive nuisance (I'm paraphrasing). Something about programmers using it and thinking they're safe, without properly considering string truncation attacks. I'm too lazy to google it up now, sorry!

Still no strlcpy and friends

Posted Mar 22, 2012 22:12 UTC (Thu) by tjc (guest, #137) [Link] (7 responses)

sources.redhat.com/ml/libc-alpha/2000-08/msg00053.html

sources.redhat.com/ml/libc-alpha/2000-08/msg00061.html

As usual, the motive behind Debian's switch to EGLIBC is in evidence here.

Still no strlcpy and friends

Posted Mar 23, 2012 6:47 UTC (Fri) by smurf (subscriber, #17840) [Link] (6 responses)

> switch to eglibc

… which doesn't help at all in this case, since eglibc decided to stay binary compatible to glibc and thus doesn't have strlcpy either.

Anyway, I reject Drepper's argument that this facilitates path truncation attacks. Sure it does, but the alternative (programs running off into la-la land because a string didn't get NUL-terminated) is much worse.
Besides that, returning the actual length is a sensible optimization for string concatenation.

Still no strlcpy and friends

Posted Mar 23, 2012 9:03 UTC (Fri) by khim (subscriber, #9252) [Link] (1 responses)

> Anyway, I reject Drepper's argument that this facilitates path truncation attacks. Sure it does, but the alternative (programs running off into la-la land because a string didn't get NUL-terminated) is much worse.

I'm not so sure. If a program runs off into la-la land then it tends to crash, which will be noticed; if a program just rejects your input, it's a less visible problem. This means that a program with incorrect string handling based on strlcpy is worse in the presence of a skilled administrator and about the same in the absence of such an administrator. The net worth is probably still negative.

Still no strlcpy and friends

Posted Mar 23, 2012 13:18 UTC (Fri) by NAR (subscriber, #1313) [Link]

> I'm not so sure. If program runs off into la-la land then it tends to crash which will be noticed, if program just rejects your input it's less visible problem.

This time I'm not sure :-) The crash will be noticed on one system. The exploit created based on that crash might break the other system without crashing anything there...

Still no strlcpy and friends

Posted Mar 23, 2012 9:47 UTC (Fri) by tialaramex (subscriber, #21167) [Link]

Your use of the phrase /the alternative/ implies that you're dealing with the sort of programmers who, when they discover that there's no built-in standard library function which does what they need, just do something else and tell the end user too bad it can't work the way you asked.

The _real_ alternative is for programmers to think about what they actually want to do with outliers and then do it. What if the GET request is 40kbytes long? Should they truncate it until it fits in the maximum path size? That seems wrong. Maybe they should overwrite the stack? No, that seems wrong too. Aha! They should report an HTTP error to the remote client saying that request is stupid.

I'm always much happier to see
/* We can't call the standard function here because: ... */

Than an uncommented line which calls some standard function and doesn't do anything with the return value or errors.

New standard library functions should, on the whole, reflect good existing practice. It's not clear that strlcpy-like functions in existing code are good practice, they're often just laziness. The C library already has more than enough of that.

Still no strlcpy and friends

Posted Mar 23, 2012 16:40 UTC (Fri) by tjc (guest, #137) [Link] (2 responses)

> … which doesn't help at all in this case, since eglibc decided to stay binary compatible to glibc and thus doesn't have strlcpy either.

Actually, the reason for Debian switching to EGLIBC that I alluded to was so that they didn't have to deal with Drepper. It wasn't the only reason, but it was a factor.

Still no strlcpy and friends

Posted Mar 25, 2012 1:51 UTC (Sun) by nix (subscriber, #2304) [Link] (1 responses)

I wonder how long it will be before they switch back? glibc development has really livened up recently (probably because drepper has left RH and isn't so involved anymore).

Still no strlcpy and friends

Posted Mar 26, 2012 14:46 UTC (Mon) by cortana (subscriber, #24596) [Link]

eglibc is more of a branch than a fork, so it's not clear they really need to.

Still no strlcpy and friends

Posted Mar 22, 2012 18:08 UTC (Thu) by guillemj (subscriber, #49706) [Link]

These (and some others) are provided by libbsd, which should be available on several mainstream distributions.

Still no strlcpy and friends

Posted Mar 22, 2012 21:12 UTC (Thu) by HelloWorld (guest, #56129) [Link] (26 responses)

Just use something sensible like glib's GString already and stop whining.

Still no strlcpy and friends

Posted Mar 22, 2012 21:43 UTC (Thu) by slashdot (guest, #22014) [Link] (25 responses)

std::string is best.

Concatenation:
a += b

Copy:
a = b

Still no strlcpy and friends

Posted Mar 22, 2012 22:11 UTC (Thu) by david.a.wheeler (subscriber, #72896) [Link] (24 responses)

Strange, "a += b" doesn't seem to work properly in my .c file.

I've recommended adding strlcpy and friends, and I'm sad that glibc still fails to include them. While string truncation is a potential problem, unbounded allocation has its own problems (because it's, well, unbounded). The OpenBSD folks - who are big on security - specifically advocate strlcpy. I don't buy the argument that "strlcpy is a security problem" - it helps you deal with them.

Whenever you have buffers, you can either try to grow them unboundedly (which eventually fails) or cut them off at some point (strlcpy and friends). We need tools to let developers easily do either.

Still no strlcpy and friends

Posted Mar 23, 2012 3:32 UTC (Fri) by kevinm (guest, #69913) [Link] (23 responses)

It's not about doing things unbounded.

It's that if you write the code around your use of strlcpy() to detect and respond to truncation (instead of just ignoring it), you will find that you've written all the code you would need to just use memcpy() instead.
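A minimal sketch of the strlen-plus-memcpy pattern this comment alludes to may make the point concrete. The helper name copy_if_fits is hypothetical and not part of any libc; it refuses to copy instead of truncating, and leaving dst untouched on failure is one possible policy, not the only one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a libc function): copy src into dst only if the
 * whole string, including its terminating NUL, fits in dstsize bytes.
 * Returns 0 on success, -1 if src is too long; dst is untouched on failure. */
int copy_if_fits(char *dst, size_t dstsize, const char *src)
{
    assert(dst != NULL && src != NULL);

    size_t len = strlen(src);      /* first pass: find the terminating NUL */
    if (len >= dstsize)
        return -1;                 /* would not fit: refuse, do not truncate */
    memcpy(dst, src, len + 1);     /* second pass: copy, NUL included */
    return 0;
}
```

The caller still has to decide what to do with the -1, which is the same decision a checked strlcpy call forces.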

Still no strlcpy and friends

Posted Mar 23, 2012 6:54 UTC (Fri) by smurf (subscriber, #17840) [Link] (21 responses)

memcpy() doesn't find the terminating NUL for you.

If you want to carry the string length around, use C++ and std::string.
I have no problem with that. But let the rest of us C programmers use the functions that actually make sense.

Still no strlcpy and friends

Posted Mar 23, 2012 9:06 UTC (Fri) by khim (subscriber, #9252) [Link] (12 responses)

> But let the rest of us C programmers use the functions that actually make sense.

Well, strlcpy does not make any sense in UTF-8 world thus your wish is obviously granted.

Still no strlcpy and friends

Posted Mar 23, 2012 12:11 UTC (Fri) by adobriyan (subscriber, #30858) [Link] (7 responses)

> Well, strlcpy does not make any sense in UTF-8 world thus your wish is obviously granted.

Reasonable programming languages have real character data type.
Reasonable programming languages also have one-dimensional arrays.
This route reasonable programming languages get strings (for free).
They also have automatic OOB access checks.

No part of this strncpy/strlcpy() idiocy makes sense in reasonable programming language universe. They won't even understand what the fuss is all about.

Still no strlcpy and friends

Posted Mar 23, 2012 13:22 UTC (Fri) by dgm (subscriber, #49227) [Link] (1 responses)

> No part of this strncpy/strlcpy() idiocy makes sense in reasonable programming language universe. They won't even understand what the fuss is all about.

That's fine. They are not the people that need to care. If a slow and safe string is all you need, why worry about all this? Just use whatever is given in your chosen language and move on. There's nothing for you to see here.

Still no strlcpy and friends

Posted Mar 23, 2012 14:07 UTC (Fri) by adobriyan (subscriber, #30858) [Link]

> If a slow and safe string is all you need

nice

> Just use whatever is given in your chosen language and move on.

The problem is that I'm C guy mostly.
But there is nothing given in C.
Exactly nothing.

The correct fix belongs into the programming language.
Looking at other PLs and several C "solutions" it should be obvious.

From this POV, Ulrich's decision is very smart.

Still no strlcpy and friends

Posted Mar 23, 2012 15:12 UTC (Fri) by tialaramex (subscriber, #21167) [Link] (4 responses)

“Reasonable programming languages have real character data type.”

A "real character data type" is almost never what you actually want because of how poorly defined characters are (or from another point of view, how many different and incompatible definitions there are for "character").

It's tempting to create a data type for Unicode code points (Java specifications, some parts of the Win32 API, and various databases historically did this, to their cost) because they are sometimes called "characters". But they can also represent things which aren't intuitively characters (such as the Byte Order Mark, or the LTR/RTL mode switches) and they can represent fractions of a character (like a macron) or symbols which are arguably groups of characters (like ligatures).

On the whole it's best to forget "characters" and handle only strings and sequences of bytes. This obliges the programmer to focus with due caution on any places that translate between the two. On the rare occasion that you do want to process Unicode code points they fit nicely into any modern integer type, such as 32-bit signed integers common in C.

This still leaves you with plenty of tricky problems (e.g. canonicalisation) with your Unicode strings if you need more work.

Still no strlcpy and friends

Posted Mar 23, 2012 20:23 UTC (Fri) by cmccabe (guest, #60281) [Link]

> > “Reasonable programming languages have real character data type.”

> A "real character data type" is almost never what you actually want
> because of how poorly defined characters are (or from another point of
> view, how many different and incompatible definitions there are for
> "character").

Please. We're trying to do "programming language advocacy" here.

Don't intrude on this with your "facts" or "logic."

Can I get an A-men?

Still no strlcpy and friends

Posted Mar 26, 2012 18:58 UTC (Mon) by bronson (subscriber, #4806) [Link] (2 responses)

Great post! It's unfortunate how many people still think they should be using C/C++'s fundamental char type.

The days of pointer arithmetic and character arrays have mostly drawn to a close. Even in C.

Still no strlcpy and friends

Posted Mar 27, 2012 15:09 UTC (Tue) by dgm (subscriber, #49227) [Link] (1 responses)

I think you have to read again the post you're answering to.

Still no strlcpy and friends

Posted Apr 2, 2012 20:57 UTC (Mon) by bronson (subscriber, #4806) [Link]

Care to say more? Code like "while(*c != '/') *b++ = *c++" doesn't work so well anymore.

Still no strlcpy and friends

Posted Mar 24, 2012 3:14 UTC (Sat) by cmccabe (guest, #60281) [Link] (3 responses)

> Well, strlcpy does not make any sense in UTF-8 world thus your
> wish is obviously granted.

Wow. You just put forward the ONLY reasonable argument in this thread for why strlcpy should not be included in glibc. Congratulations.

However, your argument is flawed, because it assumes that I will not check the return value of strlcpy and act appropriately. Functions have return values for a reason. If I am dealing with UTF8 data, I will not try to truncate the string on an arbitrary byte boundary. I will simply bail out when I'm told the string is too long to fit in my buffer.
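To illustrate why a byte-count cut is risky for UTF-8 data, here is a small sketch that tests whether cutting a string at a given byte offset would land inside a multi-byte sequence. The function name cut_splits_utf8 is made up for this example, and the check assumes the input is already valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if truncating s so that it keeps exactly
 * `cut` bytes would split a multi-byte UTF-8 sequence, 0 otherwise.
 * Assumes s is valid UTF-8 and cut <= strlen(s). */
int cut_splits_utf8(const char *s, size_t cut)
{
    assert(s != NULL && cut <= strlen(s));

    /* Bytes of the form 10xxxxxx continue a sequence that started earlier,
     * so a cut directly in front of one leaves a dangling lead byte. */
    return ((unsigned char)s[cut] & 0xC0) == 0x80;
}
```

A caller that bails out whenever the copy reports truncation never needs this check; it only matters for code that wants to keep the truncated prefix.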

The rest of the thread is just a bunch of "advocacy" (read: ignorance). Most of it seems to center on people saying "just use std::string, it will fix what ails ye!"

This is good except that,
1. std::string offers the same level of support for unicode as char* -- i.e., none.
2. using std::string to blindly copy user-supplied data opens you up to a different kind of security vulnerability, the denial of service.
3. std::string always allocates space on the heap, which makes it unsuitable for many uses
4. a lot of functions in the C++ standard library take char* arguments, so you have to learn how to use char* anyway.

Then there's the people who claim that still other programming languages will magically make the problems of string manipulation go away.

It simply isn't so.

For example, in Python or Java, control characters can be inserted into strings. This means that printing the strings to stdout later could cause undesirable effects to the user. If they are running in an xterm or in a GNU screen session, it could execute arbitrary commands. Will the higher level languages protect you from this? No.

Then there's the problem of normalization. There are four different ways to normalize unicode strings. That means that if your programming language wants to natively support the operation of comparing two strings, it has to have four different kinds of equals sign.

There's no Java worshippers in this thread (strange, we have every other kind of critter) but just to set the record straight, you cannot simply count the number of chars in a string in Java and assume that that is the length. Java uses UCS-2, so this will work only for the basic multilingual plane. There are reasons to use Java, but this isn't one of them.

tl;dr. There are lots of gotchas surrounding using strings in C, but some of them exist in every language. Unicode is complex and understood by few. Simple solutions to complicated problems are usually wrong.

Looks like you are correct…

Posted Mar 24, 2012 7:56 UTC (Sat) by khim (subscriber, #9252) [Link] (2 responses)

Actually the arguments against strlcpy look sensible, but most of them are centered on the ability to not check the return result (it looks like it's impossible to use it correctly without checking the return result).

Why not declare it like this:

#ifdef I_REALLY_NEED_STRLCPY
#pragma GCC diagnostic error "-Wunused-result"
size_t strlcpy(char *dst, const char *src, size_t size)
    __attribute__((warn_unused_result));
#endif

GLibC is tightly tied to GCC anyway and it looks like this approach should be fine WRT safety, no?

P.S. In reality I'm not sure I like strlcpy all that much: it's interface is too complicated. You really can not do anything sensible with strlcpy return value except compare it with “size” and for that strcpy_s looks like a saner interface (especially in C++).

Looks like you are correct… or not

Posted Mar 24, 2012 12:38 UTC (Sat) by smurf (subscriber, #17840) [Link] (1 responses)

strcpy_s is useless. Its argument order is *of course* different, and the requirement to not change the destination when the source doesn't fit ignores the only reason why strlcpy() even exists -- as opposed to a macro that calls strlen and memcpy.

Of course you can do something sensible with strlcpy's return value -- you can use it as offset to the end of the string, instead of calling strlcat(), when you want to append something.
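The append-by-offset idea can be sketched as follows. Both function names are hypothetical: copy_trunc is a stand-in that is assumed to follow the strlcpy contract as described in this thread (truncate, always NUL-terminate when size is nonzero, return the length of the source), and it is written with strlen and memcpy for brevity rather than as a single pass.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in with strlcpy-like semantics (an assumption of this sketch):
 * copies at most size-1 bytes, NUL-terminates if size > 0, and returns
 * strlen(src) so the caller can detect truncation. */
size_t copy_trunc(char *dst, const char *src, size_t size)
{
    size_t len = strlen(src);
    if (size > 0) {
        size_t n = len < size - 1 ? len : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}

/* Build "dir/file" in out. Returns 0 on success, -1 if it does not fit. */
int join_path(char *out, size_t outsize, const char *dir, const char *file)
{
    assert(out != NULL && dir != NULL && file != NULL);

    size_t used = copy_trunc(out, dir, outsize);
    if (used >= outsize)
        return -1;                 /* dir alone was truncated */
    if (outsize - used < 2)
        return -1;                 /* no room for '/' plus a NUL */
    out[used++] = '/';             /* used is the offset of the old NUL */

    /* Append at the known offset instead of rescanning from the start. */
    if (copy_trunc(out + used, file, outsize - used) >= outsize - used)
        return -1;
    return 0;
}
```

On failure the buffer holds a truncated but terminated string, so the caller must treat -1 as an error and not use the contents.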

NB: *its interface. :-P

Looks like you are correct… or not

Posted Mar 24, 2012 13:19 UTC (Sat) by khim (subscriber, #9252) [Link]

> Its argument order is *of course* different, and the requirement to not change the destination when the source doesn't fit ignores the only reason why strlcpy() even exists -- as opposed to a macro that calls strlen and memcpy.

You want to imply that strlcpy exists only to introduce subtle security holes? Then the arguments to not include it are quite obviously valid…

> Of course you can do something sensible with strlcpy's return value -- you can use it as offset to the end of the string, instead of calling strlcat(), when you want to append something.

Right. But the point is that you should not do that. Low-level functions like memcpy, strlcpy, or strcpy_s only make sense when you deal with buffers of fixed size. If you need/want to deal with reallocation and other similar tricks then the whole thing becomes so fragile that it must be put either in separate set of functions or in language core.

> NB: *its interface. :-P

Yes. A horrible one. Good API must not only be easy to correctly use, it must be hard to misuse. strlcpy does so-so on the first requirement and completely blows up the second while strcpy_s is fine on both fronts.

Still no strlcpy and friends

Posted Mar 23, 2012 9:29 UTC (Fri) by HelloWorld (guest, #56129) [Link] (7 responses)

You didn't get the point. The point is that if you actually want to make sure that strlcpy won't silently truncate your string, you need to find the terminating NUL anyway, and at that point you might as well use memcpy.

Still no strlcpy and friends

Posted Mar 23, 2012 12:58 UTC (Fri) by smurf (subscriber, #17840) [Link] (6 responses)

Umm, what? strlcpy finds that NUL and copies my string, all in one pass.
memcpy only copies, so I'd need a second pass.

The whole point of strlcpy is that it does a strlen+memcpy at the same time. That's where its name comes from.

Please post sample code that's any shorter, faster OR safer than

if (strlcpy(dest,src,LEN) >= LEN) { return_with_error }

before making such assertions. Thanks.

C as a language does not have exceptions. You can complain all you want that calling strlcpy() and friends without checking its result leaves your data in an inconsistent state, but that's a problem of the language. Any other function that copies a string into a buffer will have the exact same problem.
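For readers without a strlcpy at hand, the checked call above can be tried with a single-pass sketch like the following. The name sketch_strlcpy is hypothetical (chosen to avoid clashing with any libc that does provide strlcpy), and the semantics are assumed from the description in this thread: truncate, NUL-terminate when size is nonzero, return the length of the source. set_name and NAME_LEN are likewise illustrative.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of a strlcpy-like function (assumed semantics, see above). */
size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    assert(src != NULL && (dst != NULL || size == 0));

    size_t i = 0;
    /* Copy while there is room for this byte plus the final NUL. */
    while (src[i] != '\0' && i + 1 < size) {
        dst[i] = src[i];
        i++;
    }
    if (size > 0)
        dst[i] = '\0';
    /* Keep walking src so the return value equals strlen(src). */
    while (src[i] != '\0')
        i++;
    return i;
}

#define NAME_LEN 16

/* The checked idiom from the comment: report an error on truncation. */
int set_name(char name[NAME_LEN], const char *src)
{
    if (sketch_strlcpy(name, src, NAME_LEN) >= NAME_LEN)
        return -1;                 /* src did not fit; name holds a prefix */
    return 0;
}
```

Note that on the error path the destination already contains the truncated prefix, which is exactly the state the surrounding discussion argues about.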

Still no strlcpy and friends

Posted Mar 23, 2012 13:37 UTC (Fri) by mpr22 (subscriber, #60784) [Link]

If you ever have to handle strings in encodings that have shift states or multi-byte characters, not even strl{cat,cpy} are safe enough for production use.

Still no strlcpy and friends

Posted Mar 23, 2012 13:56 UTC (Fri) by adobriyan (subscriber, #30858) [Link]

It depends on what "response to truncation" is.
If it's usual i-do-not-care exit(EXIT_FAILURE), strlcpy() is OK.
If it's not and, say, dst reallocation is required, using strlcpy is double work if truncation happens.

"inconsistent state" is not a problem of the language. strlcpy could easily restore original \0.

Still no strlcpy and friends

Posted Mar 23, 2012 15:21 UTC (Fri) by HelloWorld (guest, #56129) [Link] (3 responses)

> if (strlcpy(dest,src,LEN) >= LEN) { return_with_error }
Why would you write code like that? If you want to avoid truncating strings, you need to dynamically allocate a buffer to copy the string into, and to do so you need its size.

By the way, this problem doesn't even exist when using a sensible string representation which stores the string's length separately, such as GString. Functions like strlcpy are just a (bad) workaround for C's string representation.

Still no strlcpy and friends

Posted Mar 23, 2012 19:24 UTC (Fri) by smurf (subscriber, #17840) [Link] (2 responses)

I might have to fix a more-or-less-buggy legacy program which uses fixed-size buffers. My task, in this case, is not to rewrite the damn thing with GString, but to replace all the unsafe string crapola with sane code -- code that doesn't incur a performance penalty if at all possible.

str*cpy's use case is to copy a string A, whose length I don't know, to a buffer B, whose length I do know. Given this, my argument is that there's nothing better than strlcpy to do that safely. Given that this fairly common use case is not going to go away any time soon, IMHO not including it in glibc is stupid.

There are lots of situations with different use cases. I don't contest that. My point is that I don't always have a choice. In fact, whenever I _do_ have a choice, I use Python. :-P

Still no strlcpy and friends

Posted Mar 23, 2012 19:32 UTC (Fri) by HelloWorld (guest, #56129) [Link] (1 responses)

> str*cpy's use case is to copy a string A, whose length I don't know, to a buffer B, whose length I do know. Given this, my argument is that there's nothing better than strlcpy to do that safely. Given that this fairly common use case is not going to go away any time soon, IMHO not including it in glibc is stupid.
I completely disagree. New functions should be added to a library because they make sense for today's code, not because they make poor fixes to broken legacy programs easier. After all, nothing stops you from using strlcpy: write it yourself, copy it from somewhere, use libbsd, whatever.

Still no strlcpy and friends

Posted Mar 23, 2012 19:38 UTC (Fri) by HelloWorld (guest, #56129) [Link]

I just read tialaramex' comment here:
http://lwn.net/Comments/488249/
and he put it better than I could:
> New standard library functions should, on the whole, reflect good existing practice. It's not clear that strlcpy-like functions in existing code are good practice, they're often just laziness. The C library already has more than enough of that.

Still no strlcpy and friends

Posted Mar 23, 2012 9:36 UTC (Fri) by dgm (subscriber, #49227) [Link]

Actually this is a very nice angle.

Usually the people that compare C's string and memory functions just talk about their interfaces, as if they appeared in vacuum. Comparing their actual usage patterns (both the usual and correct cases) would be so much more interesting.

Still no strlcpy and friends

Posted Mar 22, 2012 23:55 UTC (Thu) by PaXTeam (guest, #24616) [Link] (32 responses)

since strl* are broken in several ways, it's little wonder.

Still no strlcpy and friends

Posted Mar 23, 2012 2:42 UTC (Fri) by wahern (subscriber, #37304) [Link]

That's just nonsense, and so is the resistance to these functions. If you want to see broken interfaces, just look at the utter crap that has made it into glibc over the years. strlcpy is gold by such a standard, and strlcat doesn't look so shabby, either.

Still no strlcpy and friends

Posted Mar 23, 2012 6:50 UTC (Fri) by smurf (subscriber, #17840) [Link] (16 responses)

Care to elucidate as to what these ways actually are?

Specifically, why does libc have strncpy(), which is far worse than strlcpy() in every possible way (useless return value, non-NUL terminated destination string)?
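The missing terminator is easy to demonstrate. The sketch below wraps the standard strncpy in a hypothetical helper, strncpy_terminates, that reports whether the first n bytes of the destination contain a NUL after the copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: run strncpy(dst, src, n) and report whether the
 * first n bytes of dst contain a terminating NUL afterwards.
 * Returns 1 if terminated, 0 if not. dst must have room for n bytes. */
int strncpy_terminates(char *dst, const char *src, size_t n)
{
    assert(dst != NULL && src != NULL);

    strncpy(dst, src, n);          /* copies at most n bytes, pads with NULs */
    return memchr(dst, '\0', n) != NULL;
}
```

With a 4-byte buffer, copying "abc" yields a terminated string, while copying "abcd" fills all four bytes and leaves no NUL, so the caller has to add one by hand.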

Still no strlcpy and friends

Posted Mar 23, 2012 8:14 UTC (Fri) by PaXTeam (guest, #24616) [Link] (10 responses)

a quite thorough and recent examination is here: http://byuu.org/articles/programming/strcpycat . note that i didn't say that the commonly used alternatives (e.g., strn*) were that much better, it's just that strl* are *not* the proper solution either.

for wahern (since i'm lazy to split the post ;): count the number of strl* usages in the OpenBSD base system and then count the usages where the return value is checked and acted upon (and for added bonus, figure out which of the rest introduces a bug due to the silent truncation). that's why it's a conceptually broken interface if even its own promoters don't do it right.

Still no strlcpy and friends

Posted Mar 23, 2012 20:21 UTC (Fri) by cmccabe (guest, #60281) [Link] (9 responses)

> a quite thorough and recent examination is here: http://byuu.org/articles/programming/strcpycat

Recent, maybe. Thorough-- nope. Among other things, the author suggests replacing strlcpy with a new function of his own design, which returns the length of the target string rather than the length of the source string. It's not clear that programmers would find this an easier to use API. It is clear that it would make error checking more difficult. No thanks.

> Note that i didn't say that the commonly used alternatives (e.g., strn*)
> were that much better, it's just that strl* are *not* the proper
> solution either.

This is the crux of the issue. People are trying to make the situation better and you are getting in the way with "but it doesn't meet my criterion for absolute perfection." It's not even clear that *any* C string manipulation function could meet your amorphous, unspoken criterion.

Meanwhile, the rest of us just emulate strlcpy with snprintf.

Still no strlcpy and friends

Posted Mar 24, 2012 13:58 UTC (Sat) by PaXTeam (guest, #24616) [Link] (8 responses)

first, i referenced that blog for explaining some of the problems with strl* (i note you didn't reflect on that though), not to pass judgement on his solution (speaking of which, if you're unable to understand error checking with strm* then you're quite hopeless as a programmer anyway, no amount of str* functions will save you).

second, i never said anything about 'absolute perfection', although now that you brought it up, yes, i believe that there's no substitute for correct code. the problem is that strl* doesn't let you easily write it. however it does easily let you produce silent truncation as you can see it in the OpenBSD base system itself. for correct code you have to know your string sizes and adjust your buffer sizes accordingly, simple as that.

third, i very much hope the snprintf comment was just a silly programmer's joke. otherwise please stop writing code.

Still no strlcpy and friends

Posted Mar 24, 2012 16:23 UTC (Sat) by tjc (guest, #137) [Link] (1 responses)

I've always found snprintf to be a reasonable alternative to strlcpy, and I remain unmoved by your contempt for programmers who use it! :)

Perhaps a well-reasoned explanation as to why you think it's bad would persuade me otherwise...

Still no strlcpy and friends

Posted Mar 24, 2012 18:16 UTC (Sat) by PaXTeam (guest, #24616) [Link]

if strlcpy is the best fit for a particular problem then you don't 'emulate' it, you use it. and if it's not present in your system then you implement it.

Still no strlcpy and friends

Posted Mar 28, 2012 18:39 UTC (Wed) by cmccabe (guest, #60281) [Link] (5 responses)

> first, i referenced that blog for explaining some of the problems with
> strl* (i note you didn't reflect on that though), not to pass judgement on
> his solution

Well, do you agree with his conclusions or not? If so, you should be telling the world about the virtues of strmcpy and strmcat. If not, then you shouldn't be linking to his post as an explanation. If you agree with some of his analysis but disagree with other parts (including the conclusion), what's the point of linking to it? It seems like you are just creating confusion. Maybe you are confused yourself.

> (speaking of which, if you're unable to understand error
> checking with strm* then you're quite hopeless as a programmer anyway, no
> amount of str* functions will save you).

So it's acceptable for use of strmcpy to require error checking, but not acceptable for use of strlcpy to require error checking. Got it.

> second, i never said anything about 'absolute perfection', although
> now that you brought it up, yes, i believe that there's no
> substitute for correct code

This is more "yes, but also no" out of you. Nobody said anything about coming up with a substitute for correct code. I like correct code too, just like fluffy bunnies, motherhood, and apple pie. Where we are disagreeing is in what C interfaces tend to produce that code.

Any interface that copies a string into a fixed buffer has to have one of two failure modes: truncation or refusal to write anything. There are other failure modes, like strncpy's insane "truncate, but don't NUL-terminate," or strcpy's "cause a buffer overflow," but we'll ignore those because they're obviously bad. Both truncation and failure to write anything can lead to bugs if the programmer fails to check the return code of the copy function.

Given this, strlcpy seems the best possible choice, since it at least defines what the output should be in the case of failure, whereas strcpy_s and similar functions do not.

> third, i very much hope the snprintf comment was just a
> silly programmer's joke. otherwise please stop writing code.

Another opinion presented without justification. snprintf is fast, and it checks buffer sizes. There is absolutely no reason not to use it, especially when glibc refuses to provide saner string copying interfaces.

And before you spout more confusion, you can easily check whether snprintf truncated the input-- and I almost always do, unless it truly is irrelevant.
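A minimal sketch of the truncation check mentioned above, assuming snprintf stands in for strlcpy. The helper name `copy_checked` and its return convention are illustrative assumptions, not anything proposed in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: copy src into dst (capacity dstsize) via snprintf.
 * Returns 0 on success, -1 if the output was truncated or snprintf failed. */
int copy_checked(char *dst, size_t dstsize, const char *src)
{
    assert(dst != NULL && src != NULL && dstsize > 0);

    /* snprintf returns the length it wanted to write, excluding the NUL. */
    int n = snprintf(dst, dstsize, "%s", src);

    if (n < 0 || (size_t)n >= dstsize)
        return -1;  /* on truncation dst still holds a NUL-terminated prefix */
    return 0;
}
```

The `"%s"` format matters: passing the source as the format string itself would let any `%` in the input be interpreted as a conversion.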

Still no strlcpy and friends

Posted Mar 28, 2012 20:48 UTC (Wed) by PaXTeam (guest, #24616) [Link] (4 responses)

> If you agree with some of his analysis but disagree with other parts
> (including the conclusion), what's the point of linking to it?

i assume a certain level of intelligence of LWN readers, but apparently not everyone meets it. not that i had expectations based on your past performance (or rather, lack thereof) in those other security related threads ;).

> So it's acceptable for use of strmcpy to require error checking, but not
> acceptable for use of strlcpy to require error checking. Got it.

too bad i didn't say anything like it :). on the other hand you did say this:

> It is clear that it would make error checking more difficult.

so you're smart enough to understand strl* error checking (ok, i'm assuming it for the sake of discussion ;) but you're too dumb to understand strm* error checking. i think that pretty much establishes your programming knowledge and the value your opinion on security matters carries.

> Nobody said anything about coming up with a substitute for correct code[...]

you did. everyone else promoting strl* did too. i even explained why. you can find it in this very thread, but then i don't know if i should directly link to it, it might confuse you. oh the irony ;).

> Another opinion presented without justification.

actually, i did explain it as well, but linking to it... ok, forget it, it's way beyond your mental capacity.

> snprintf is fast

yeah dude. it smokes str*. /o\

Still no strlcpy and friends

Posted Mar 29, 2012 1:16 UTC (Thu) by cmccabe (guest, #60281) [Link] (1 responses)

Look, at the end of the day, this is a debate about best practices. I have proposed using strlcpy rather than strcpy to copy a variable length string into a fixed-length buffer as a best practice. And if strlcpy is not available, I propose using snprintf.

What have you proposed? You linked to a web page with a proposal for strmcpy. When questioned about that, you said that you didn't agree with everything on the web page, without clarifying which parts you agreed with and which you didn't. As far as I can see, the question of what you are actually proposing remains unanswered.

Also, I would prefer that we keep the name-calling to a minimum. Remarks about "my mental capacity" are uncalled for. I realize that my earlier post might have been a little too aggressive, but let's not take this down to the grade school level.

> > snprintf is fast

> yeah dude. it smokes str*. /o\

snprintf performs better than a naive implementation of strlcpy. I have benchmarked this.

Still no strlcpy and friends

Posted Mar 29, 2012 13:42 UTC (Thu) by PaXTeam (guest, #24616) [Link]

> What have you proposed?

you'd know if you had actually cared to read *all* my comments in this thread. as far as i see it, you want to argue about the color of the bikeshed whereas we haven't even established that it's a good idea to have one. go read the rest in my comments, i won't repeat them here. if you still have an argument then, i'm all ears.

> snprintf performs better than a naive implementation of strlcpy. I have benchmarked this.

an optimized library implementation against a 'naive implementation'? without providing any numbers, any context, any details, anything whatsoever? and you're wondering why people are not taking you seriously? /o\

Still no strlcpy and friends

Posted Mar 29, 2012 7:30 UTC (Thu) by ekj (guest, #1524) [Link] (1 responses)

If you have an actual argument, make it.

If the best you can do is tossing characteristics like 'mental capacity' around while lamenting about lack of past performance, and in going for the man rather than the ball - then please go somewhere else than LWN.

And no, adding smileys after every uncalled for personal attack does not make it okay.

Still no strlcpy and friends

Posted Mar 29, 2012 13:47 UTC (Thu) by PaXTeam (guest, #24616) [Link]

> If you have an actual argument, make it.

i actually have, you'd just have to read my earlier comments. have you? i didn't see any feedback from you at least.

as for name calling, etc: you might want to check up on your friend's earlier comments and his tone towards everyone else who didn't see the wisdom in strl* the way he does. but then it seems you too have reading comprehension problems and nothing useful to add except for 'going in for the man rather than the ball'. will you take your own advice then?

Still no strlcpy and friends

Posted Mar 23, 2012 8:18 UTC (Fri) by pdewacht (subscriber, #47633) [Link] (4 responses)

Libc has strncpy not because it's good but because the standards require it. It also has gets.

Still no strlcpy and friends

Posted Mar 26, 2012 11:54 UTC (Mon) by jwakely (subscriber, #60262) [Link] (3 responses)

In glibc 2.15 gets() is not declared when compiling with _GNU_SOURCE, even if compiling as C90 or C99

Still no strlcpy and friends

Posted Mar 26, 2012 13:30 UTC (Mon) by nix (subscriber, #2304) [Link] (2 responses)

That would be glibc 2.16: 2.15 does have it, at least upstream 2.15 does. See also GCC PR51785 (he says, pointlessly because you already commented on it).

Still no strlcpy and friends

Posted Mar 26, 2012 14:28 UTC (Mon) by jwakely (subscriber, #60262) [Link] (1 responses)

I have, but apparently didn't read it closely enough to notice the change is not in 2.15 :) I assume the change went into git after the code was branched for last week's release then.

Still no strlcpy and friends

Posted Mar 28, 2012 18:22 UTC (Wed) by nix (subscriber, #2304) [Link]

Yeah, that branch happened months ago, but no actual official release was made from it until recently.

Still no strlcpy and friends

Posted Mar 23, 2012 10:04 UTC (Fri) by liljencrantz (guest, #28458) [Link] (1 responses)

The strl* functions are not the be all, end all string handling library, but in some situations, they are actually somewhat useful, and unlike the strn* equivalents, they are at least possible to use correctly without jumping through insane hoops. More importantly, while strl* is a BSD extension, there are many projects that actually make use of it, which has historically been reason enough to include stuff into glibc.

When you consider the fact that glibc has happily included completely useless junk functions like memfrob and strfry, not wanting to include strl*, which are actively used by a bunch of different projects, starts to smell a lot like somebody has a personal vendetta against a certain person whose first name is Theo.

Still no strlcpy and friends

Posted Mar 25, 2012 2:00 UTC (Sun) by nix (subscriber, #2304) [Link]

memfrob and strfry went in when glibc was a lot younger and still done pretty much purely for the fun of it: it had no real users back then. They can't be removed now because it would break the ABI, but the bar for inclusion is a lot higher now.

Still no strlcpy and friends

Posted Mar 23, 2012 20:02 UTC (Fri) by cmccabe (guest, #60281) [Link] (11 responses)

char buf[32];
if (strlcpy(buf, input, sizeof(buf)) >= sizeof(buf))
    return -ENAMETOOLONG;

Oh no, the error checking! It's so hard! Won't someone please think of the children? We should all be forced to use strcpy instead, that will fix things!

Still no strlcpy and friends

Posted Mar 24, 2012 13:40 UTC (Sat) by PaXTeam (guest, #24616) [Link] (10 responses)

> Oh no, the error checking! It's so hard!

apparently it is. you didn't check that
- input is not NULL
- input is a null terminated C string

and if you had done these checks then your code wouldn't need strlcpy to begin with because you'd already have the length of input at hand.

Still no strlcpy and friends

Posted Mar 24, 2012 15:02 UTC (Sat) by mb (subscriber, #50428) [Link] (9 responses)

> apparently it is. you didn't check that
> - input is a null terminated C string

How do you check this? And how would that be different from strlcpy running into the output buffer length limit (and thus the limit check by the caller)?

Either you _know_ that input is a C string, or you don't use _any_ str* at all.

Still no strlcpy and friends

Posted Mar 24, 2012 16:12 UTC (Sat) by tjc (guest, #137) [Link]

> How do you check this?

Yeah, I was wondering that.

Finding the first binary zero is not sufficient, since there's no way to tell if it's within the area allocated for the source string. C's string representation doesn't allow for allocation errors to be detected downstream.
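To illustrate the point: a terminator check only becomes meaningful once the caller supplies the allocation size separately. A small sketch, with the hypothetical helper name `is_terminated_within` as an assumption:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if buf[0..bufsize) contains a NUL terminator.
 * The caller must pass the real allocation size; C cannot recover it
 * from the pointer alone. */
bool is_terminated_within(const char *buf, size_t bufsize)
{
    assert(buf != NULL);
    return memchr(buf, '\0', bufsize) != NULL;
}
```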

Still no strlcpy and friends

Posted Mar 24, 2012 18:43 UTC (Sat) by PaXTeam (guest, #24616) [Link] (7 responses)

> Either you _know_ that input is a C string, or you don't use _any_ str* at all.

quite correct! ;)

what i was trying to reflect on is the silliness of cmccabe's example as it's so often touted by strl* proponents. namely, that it'll somehow save you from security problems if you replace your strcpy's with strlcpy on unknown inputs. of course none of the str* functions work on arbitrary inputs, they must be C strings. but then it means that your program logic must have determined that already and that implies that at some earlier point you must have found the null terminator and hence computed the input's length. which then eliminates the need for strlcpy. now on the other hand if you don't actually know that your input is a C string then you have two cases: either you know the minimum length of the input (that you can access without triggering a segfault, etc) or you don't. in the latter case you cannot use str* at all, you'll need other programming constructs. in the former case however all you can do is to create a possibly truncated copy (at the minimum length) - but not with strlcpy since it'll still attempt to compute the input's length and that may trigger undefined behaviour. so what use case is left for strl*?
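A sketch of the "known minimum readable length" case described above, assuming the caller can state how many source bytes are safe to read. The name `bounded_copy` and the extra `src_max` parameter are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copy at most src_max readable bytes of src into dst
 * (capacity dstsize), always NUL-terminating. Never reads past src_max.
 * Returns the number of bytes copied, excluding the terminator. */
size_t bounded_copy(char *dst, size_t dstsize, const char *src, size_t src_max)
{
    assert(dst != NULL && src != NULL && dstsize > 0);

    /* Look for a terminator only inside the region known to be readable. */
    const char *nul = memchr(src, '\0', src_max);
    size_t len = nul ? (size_t)(nul - src) : src_max;

    if (len > dstsize - 1)
        len = dstsize - 1;      /* truncate to fit the destination */

    memcpy(dst, src, len);
    dst[len] = '\0';
    return len;
}
```

Unlike strlcpy, this never scans beyond `src_max`, so it stays defined even when the source has no terminator. The cost is that the caller cannot tell from the return value alone how long the full input was.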

Still no strlcpy and friends

Posted Mar 24, 2012 19:17 UTC (Sat) by smurf (subscriber, #17840) [Link] (6 responses)

The problem is that, sure, you once knew how long the string was. You then call a random bunch of functions expecting C strings, and one of them wants to copy that string into a buffer, or whatever.

If you have an existing program, you are *not* going to change a whole lot of random functions, adding string length parameters left and right, just to replace that lone insecure strcpy() you (or your compiler) found in there.

I fail to see what's so bad about leaving a truncated string in the destination buffer. If you do proper error checking, you'll ignore the contents of the destination anyway.

If you don't do any error checking, using strcpy_s instead of strlcpy means that you now have a performance regression, and (in the overrun case) a non-initialized destination buffer which might contain sensitive information before the first NUL. Or you'll just continue with non-debuggable binary junk. That's supposed to be an improvement?

Nobody said strlcpy() is a panacea. But if you don't want to rewrite the whole program to use GString, at least it gets the job done.

Still no strlcpy and friends

Posted Mar 24, 2012 22:11 UTC (Sat) by PaXTeam (guest, #24616) [Link] (5 responses)

> If you have an existing program, you are *not* going to change a whole
> lot of random functions, adding string length parameters left and right,
> just to replace that lone insecure strcpy() you (or your compiler) found
> in there.

this is the difference between a programmer who writes correct programs and one who doesn't. the biggest issue with strl* is that it cultivates the latter kind. an insecure strcpy is already a sign of a badly designed program, strlcpy won't make that bad design go away either (at worst, it'll introduce new issues due to the all too often omitted truncation checks). and when you fix the design you'll suddenly find yourself without the need for strlcpy.

> I fail to see what's so bad about leaving a truncated string in the
> destination buffer. If you do proper error checking, you'll ignore the
> contents of the destination anyway.

then it was a pointless burn of precious CPU time (remember how strlcpy still has to compute the input's length? read that blog i linked above to see how that scales). it's not exactly the hallmark of properly written programs. a good design simply avoids truncation altogether and limits it only to cases that are somehow dictated by existing API or the problem itself. also it's funny that you talk about error checking and ignoring the truncated content (hello sensitive information leak) whereas right in the next paragraph you hold the same against strcpy_s ;).

> [...]at least [strlcpy] gets the job done.

it gets something done, no doubt. the difference in opinion is whether that something is the right thing or not.

Still no strlcpy and friends

Posted Mar 24, 2012 23:05 UTC (Sat) by slashdot (guest, #22014) [Link] (1 responses)

Most of the time the string is not truncated, and a "copy until \0" is faster than strlen + memcpy (if properly implemented), so the "burned CPU time" for the copy is actually good, because otherwise you would slow down the common no-truncation path.

The totally absurd thing of strlcpy is that it returns the *source* string length, which is totally useless, and causes it to have run time depending on the length of it.

For example, if you try to copy a 1 GB string into a 256-byte buffer, strlcpy will waste a billion instructions uselessly determining the length.

The only sensible solution is to return a pointer to the end of the string in case of non-truncation, and a NULL pointer if the string was truncated (but still terminating it).

Naive implementation:

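#include <string.h>

/* Branch hints as used in the Linux kernel; __builtin_expect is GCC/Clang. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Returns a pointer to the terminating NUL in dst on success, or NULL if
   src was truncated (dst is still terminated when size > 0). */
char *sensible_strcpy(char *dst, size_t size, const char *src)
{
    for (size_t i = 0; i < size; ++i) {
        char c = src[i];
        dst[i] = c;
        if (unlikely(!c))
            return dst + i;
    }
    if (likely(size > 0))
        dst[size - 1] = 0;
    return NULL;
}

/* dst must already hold a NUL-terminated string shorter than size. */
char *sensible_strcat(char *dst, size_t size, const char *src)
{
    size_t len = strlen(dst);
    return sensible_strcpy(dst + len, size - len, src);
}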
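For comparison, a sketch of the strlcpy behaviour criticised above: the return value is the full source length, so the whole source is always walked. It is written under the hypothetical name `sketch_strlcpy` to avoid clashing with a system-provided strlcpy, and it is a simplified rendering of the documented semantics rather than any particular libc's code.

```c
#include <assert.h>
#include <string.h>

/* Sketch of strlcpy semantics: copy up to dstsize - 1 bytes, always
 * NUL-terminate when dstsize > 0, and return strlen(src). */
size_t sketch_strlcpy(char *dst, const char *src, size_t dstsize)
{
    assert(dst != NULL && src != NULL);

    size_t srclen = strlen(src);    /* walks all of src, however long */

    if (dstsize > 0) {
        size_t n = srclen < dstsize - 1 ? srclen : dstsize - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;                  /* >= dstsize signals truncation */
}
```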

Still no strlcpy and friends

Posted Mar 25, 2012 23:26 UTC (Sun) by cmccabe (guest, #60281) [Link]

If you blindly loaded a 1 GB string into memory earlier without doing any bounds checking, you have bigger problems than how well strlcpy will perform on it. These counter-arguments are starting to reek of desperation.

Still no strlcpy and friends

Posted Mar 25, 2012 23:50 UTC (Sun) by cmccabe (guest, #60281) [Link] (2 responses)

> then it was a pointless burn of precious CPU time (remember how strlcpy
> still has to compute the input's length? read that blog i linked above to
> see how that scales). it's not exactly the hallmark of properly written
> programs. a good design simply avoids truncation altogether and limits it
> only to cases that are somehow dictated by existing API or the problem
> itself. also it's funny that you talk about error checking and ignoring
> the truncated content (hello sensitive information leak) whereas right in
> the next paragraph you hold the same against strcpy_s ;).

The only pointless burn of CPU cycles here is this thread. Nearly every string-related function in the C standard library has a running time linear in the length of the string. This is fine, because strings are usually short, and most string operations don't make sense except in the context of the string as a whole.

It is not the C standard library's responsibility to keep you from using uninitialized memory or a pointer to the wrong thing as a string. If you want that kind of functionality, use something like valgrind or electric fence. Or don't use C. There are plenty of languages to choose from.

It is not the C standard library's responsibility to keep you from loading strings that are unreasonably long or doing unreasonable things with them. Even higher level languages like Java or Python don't impose arbitrary limits on string length, for a very good reason: it is stupid. If you want a function to load the first N bytes from a long buffer, there are already functions to do that. Just use memchr and memcpy.

So far, nobody here has suggested a better alternative to strlcpy. khim suggested strcpy_s, but that has the same "problem": bad programmers will not check the return code of strcpy_s, leading to undesired behavior. Some people suggested calling strlen to ensure that the source string wasn't too long, followed by strcpy if possible. However, that is strictly slower and more verbose than strlcpy, which can do the same thing in one pass.

Using memcpy and adding the zero-terminator manually all the time, which some people here have suggested, is usually going to be slower than strcpy and friends. Should I blow the L2 cache by writing out 4 kilobytes of zeros to copy my 12-byte long string into the buffer with MAX_LEN 4096? Hmm, let me think about it... NO.

A lot of the suggestions here strain credulity. If you don't understand or like C, then please don't post on a thread about glibc. It's fine to not like or use C. It is not the right choice for a lot of jobs these days. But please don't post something silly.

Still no strlcpy and friends

Posted Mar 26, 2012 6:37 UTC (Mon) by khim (subscriber, #9252) [Link] (1 responses)

> khim suggested strcpy_s, but that has the same "problem": bad programmers will not check the return code of strcpy_s, leading to undesired behavior.

As I've pointed out you can make it error to not check the return value. And strcpy_s has huge advantage over strlcpy: it never truncates the string. Either it succeeds and copies the whole string or it fails and does nothing. This closes the whole slew of potential errors caused by incorrect strlcpy use.

> A lot of the suggestions here strain credulity. If you don't understand or like C, then please don't post on a thread about glibc.

Niice. Ok, let's assume glibc's primary developer and a guy who's doing Linux hardening work are C novices. Who's not then and how can we determine if s/he has the right to talk about glibc?

Still no strlcpy and friends

Posted Mar 26, 2012 9:52 UTC (Mon) by smurf (subscriber, #17840) [Link]

> Either [strcpy_s] succeeds and copies the whole string or it fails and does nothing.

I fail to see how processing binary junk, which might include former stack or heap contents which might be a security risk (think "password from the database setup which happened earlier" or "user-generated content, left there during the last iteration") can possibly be better than working with a truncated string.

Different (and IMHO worse) potential security hole. Much worse debuggability.

And in any case, forcing a compiler error when you're ignoring the return value is not exactly rocket science.
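A sketch of how an unchecked return value can be flagged at compile time, assuming GCC or Clang. `copy_or_fail` is a hypothetical all-or-nothing copy in the spirit of strcpy_s, not the real Annex K function. The attribute produces a warning by default; it only becomes a hard error with something like `-Werror=unused-result`.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical all-or-nothing copy. Callers that discard the result get a
 * diagnostic from GCC/Clang thanks to warn_unused_result. */
__attribute__((warn_unused_result))
int copy_or_fail(char *dst, size_t dstsize, const char *src)
{
    assert(dst != NULL && src != NULL && dstsize > 0);

    size_t len = strlen(src);
    if (len >= dstsize) {
        dst[0] = '\0';      /* leave an empty string, not stale contents */
        return ERANGE;
    }
    memcpy(dst, src, len + 1);  /* len + 1 includes the terminator */
    return 0;
}
```

Writing an empty string on failure is a design choice of this sketch. As far as I understand C11 Annex K, strcpy_s likewise sets the first destination byte to the null character on a constraint violation, which would sidestep the stale-contents concern raised above.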

GNU libc 2.15 released

Posted Mar 22, 2012 18:50 UTC (Thu) by jengelh (subscriber, #33263) [Link]

>New locales: unm_US

I wonder when we are going to get the tlh locale.

GNU libc 2.15 released

Posted Mar 22, 2012 21:10 UTC (Thu) by welinder (guest, #4699) [Link] (2 responses)

Sadly the math part of libc is seeing very little attention.

For example, here's pow() returning the wrong result in a bug from 2004:
http://sourceware.org/bugzilla/show_bug.cgi?id=369

Here's strtod returning the wrong number from 2006:
http://sourceware.org/bugzilla/show_bug.cgi?id=3479

Here's gamma inaccurate from 2006: http://sourceware.org/bugzilla/show_bug.cgi?id=2542

Here's erf failing to set errno: http://sourceware.org/bugzilla/show_bug.cgi?id=6785

GNU libc 2.15 released

Posted Mar 22, 2012 21:41 UTC (Thu) by daney (guest, #24551) [Link]

> Sadly the math part of libc is seeing very little attention.

Actually that is not true. It may not be in 2.15, but as of late there has been a bunch of activity on the large backlog of libm bugs.

GNU libc 2.15 released

Posted Mar 23, 2012 12:59 UTC (Fri) by foom (subscriber, #14868) [Link]

That's actually not at all true. The math part of glibc has been getting *a lot* of attention over the last few weeks!

http://thread.gmane.org/gmane.comp.lib.glibc.alpha/18040

GNU libc 2.15 released

Posted Mar 22, 2012 23:46 UTC (Thu) by HenrikH (subscriber, #31152) [Link] (1 responses)

It's nice to see optimized versions of strcpy(), memcpy() and friends, but doesn't GCC override them with its own functions anyway?

GNU libc 2.15 released

Posted Mar 23, 2012 0:14 UTC (Fri) by JoeBuck (subscriber, #2330) [Link]

GCC will replace strcpy (for example) with better code in some cases, for example when the input argument is a constant string. But in the general case it will call the library function.
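A small sketch of the two cases, with hypothetical function names. Whether the first call is actually expanded inline depends on the compiler version and optimization level; comparing the assembly from `gcc -O2 -S` is one way to check on a given setup.

```c
#include <assert.h>
#include <string.h>

/* Source is a literal: the compiler knows the length (6 bytes with the NUL)
 * and may expand the copy inline instead of calling the library. */
void copy_literal(char dst[16])
{
    strcpy(dst, "hello");
}

/* Source is unknown at compile time: typically a real call into libc,
 * where the optimized SSE2/SSSE3 variants can be selected. */
void copy_unknown(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    strcpy(dst, src);   /* caller must guarantee dst is large enough */
}
```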