Wow, newlines in filenames being officially deprecated?!
Re. modern C, multithreaded code really needs to target C11 or later for atomics. POSIX now requires C17 support; C17 is basically a bugfix revision of C11 without new features. (hmm, I have been calling it C18 owing to the publication year of the standard, but C23 is published this year so I guess there’s now a tradition of matching nominal years with C++ but taking another year for the ISO approval process…)
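To make the atomics point concrete, here is a minimal sketch of a reference counter built on C11 <stdatomic.h>. The struct and function names are invented for illustration, and the memory orderings are just the conventional choice for this pattern, not something the comment above prescribes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reference counter that several threads may share. */
struct refcount {
    atomic_uint count;
};

/* Start with one reference, owned by the creator. */
void ref_init(struct refcount *r)
{
    assert(r != NULL);
    atomic_init(&r->count, 1);
}

void ref_acquire(struct refcount *r)
{
    /* Relaxed is enough here: taking a reference publishes no other data. */
    atomic_fetch_add_explicit(&r->count, 1, memory_order_relaxed);
}

/* Returns true when the caller has just dropped the last reference. */
bool ref_release(struct refcount *r)
{
    /* acq_rel orders earlier writes to the object before its teardown. */
    unsigned previous =
        atomic_fetch_sub_explicit(&r->count, 1, memory_order_acq_rel);
    assert(previous > 0); /* releasing more often than acquiring is a bug */
    return previous == 1;
}
```

Before C11 this needed compiler builtins or platform-specific primitives; with C11 the same source should build on any conforming toolchain that provides the header.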
Nice improvements to make, and plenty of other good stuff too.
It seems like both C and POSIX have woken up from a multi-decade slumber and are improving much faster than they used to. Have a bunch of old farts retired or something?
> I have been calling it C18 owing to the publication year of the standard, but C23 is published this year so I guess there’s now a tradition of matching nominal years with C++
I believe the date is normally the year when the standard is ratified. Getting ISO to actually publish the standard takes an unbounded amount of time and no one cares because everyone works from the ratified draft.
As a fellow brit, you may be amused to learn that the BSI shut down the BSI working group that fed WG14 this year because all of their discussions were on the mailing list and so they didn’t have the number of meetings that the BSI required for an active standards group. The group that feeds WG21 (of which I am a member) is now being extra careful about recording attendance.
Unfortunately, there were a lot of changes after the final public draft and the document actually being finished. ISO is getting harsher about this and didn’t allow the final draft to be public. This time around people will probably reference the “first draft” of C2y instead, which is functionally identical to the final draft of C23.
There are a bunch of web sites that have links to the free version of each standard. The way to verify that you are looking at the right one is:
- look at the committee mailings, which include a summary of the documents for a particular meeting
- look for the editor’s draft and the editor’s comments (two adjacent documents)
- the comments will say if the draft is the one you want
Sadly I can’t provide examples because www.open-std.org isn’t working for me right now :-( It’s been unreliable recently, does anyone know what’s going on?
For C23, Cppreference links to N3301, the most recent C2y draft. Unfortunate that the site is down, so we can’t easily check whether all those June 2024 changes were also made to C23. The earlier C2y draft (N3220) only has minor changes listed. Cppreference also links to N3149, the final WD of C23, which is protected by low quality ZIP encryption.
I think for C23 the final committee draft was last year but they didn’t finish the ballot process and incorporating the feedback from national bodies until this summer. Dunno how that corresponds to ISO FDIS and ratification. Frankly, the less users of C and C++ (or any standards tbh) have to know or care about ISO the better.
Re modern C: are there improvements in C23 that neither came from C++ nor standardize things existing implementations have had for ages?
I think the main ones are _BitInt, <stdbit.h>, <stdckdint.h>, #embed
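As a rough illustration of what <stdckdint.h> offers, here is a minimal sketch of overflow-checked summation. The helper name checked_sum is made up for this example, and the fallback to __builtin_add_overflow is an assumption that only holds on GCC and Clang when the C23 header is not available.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Use the C23 header when the toolchain ships it. */
#if defined(__has_include)
#  if __has_include(<stdckdint.h>)
#    include <stdckdint.h>
#  endif
#endif

/* Assumption: on older GCC/Clang the builtin behaves like ckd_add. */
#ifndef ckd_add
#  define ckd_add(result, a, b) __builtin_add_overflow((a), (b), (result))
#endif

/* Hypothetical helper: sums values[0..count) into *out.
 * Returns false, leaving *out untouched, if the sum does not fit in int. */
bool checked_sum(const int *values, size_t count, int *out)
{
    assert(out != NULL);
    assert(values != NULL || count == 0);

    int total = 0;
    for (size_t i = 0; i < count; i++) {
        /* ckd_add returns true when the exact result was not representable. */
        if (ckd_add(&total, total, values[i]))
            return false;
    }
    *out = total;
    return true;
}
```

Summing INT_MAX and 1 with this helper reports failure instead of silently wrapping or hitting signed-overflow UB.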
Generally the standard isn’t the place where innovation should happen, though that’s hard to avoid if existing practice is a load of different solutions for the same problem.
They made realloc(ptr, 0) undefined behaviour. Oh, sorry, you said improvements.
I learned about this yesterday in the discussion of rebasing C++26 on C23 and the discussion from the WG21 folks can be largely summarised as ‘new UB, in a case that’s trivial to detect dynamically? WTF? NO!’. So hopefully that won’t make it back into C++.
realloc(ptr,0) was broken by C99: since then, when NULL is returned you can’t tell whether it successfully freed the pointer or whether it failed to malloc(0).
POSIX has changed its specification so realloc(ptr, 0) is obsolescent so you can’t rely on POSIX to save you. (My links to old versions of POSIX have mysteriously stopped working which is super annoying, but I’m pretty sure the OB markers are new.)
C ought to require that malloc(0) returns NULL and (like it was before C99) realloc(ptr,0) is equivalent to free(ptr). It’s tiresome having to write the stupid wrappers to fix the spec bug in every program.
Maybe C++ can fix it and force C to do the sensible thing and revert to the non-footgun ANSI era realloc().
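The kind of wrapper meant here might look like the sketch below. The names realloc_or_free and resize_buffer and their exact contracts are assumptions for illustration, not anything taken from the standards: a zero size always frees and returns NULL, so a NULL result for a nonzero size can only mean allocation failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical wrapper restoring the pre-C99 convention:
 *   size == 0 -> free(ptr) and return NULL (never an error)
 *   size  > 0 -> realloc(); NULL means failure and ptr is still valid
 * realloc(ptr, 0) itself is never called, so the case C23 leaves
 * undefined is not reachable through this function. */
void *realloc_or_free(void *ptr, size_t size)
{
    if (size == 0) {
        free(ptr); /* free(NULL) is a no-op */
        return NULL;
    }
    return realloc(ptr, size);
}

/* Hypothetical caller: resizes *buf in place and reports failure
 * unambiguously. On failure *buf is left untouched. */
bool resize_buffer(char **buf, size_t size)
{
    assert(buf != NULL);
    void *p = realloc_or_free(*buf, size);
    if (p == NULL && size != 0)
        return false; /* genuine allocation failure */
    *buf = p;         /* NULL here only when size == 0 */
    return true;
}
```

With this contract the caller never has to guess what a NULL return meant, which is the ambiguity described above.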
> C ought to require that malloc(0) returns NULL and (like it was before C99) realloc(ptr,0) is equivalent to free(ptr). It’s tiresome having to write the stupid wrappers to fix the spec bug in every program.
98% sure some random vendor with a representative via one of the national standards orgs will veto it.
In cases like this it would be really helpful to know who are the bad actors responsible for making things worse, so we can get them to fix their bugs.
It was already UB in practice. I guarantee that there are C++ compilers / C stdlib implementations out there that together will make 99% of C++ programs that do realloc(ptr, 0) have UB.
Not even slightly true. POSIX mandates one of two behaviours for this case, which are largely compatible. I’ve seen a lot of real-world code that is happy with either of those behaviours but does trigger things that are now UB in C23.
But POSIX is not C++. And realloc(ptr, 0) will never be UB with a POSIX-compliant compiler, since POSIX defines the behavior. Compilers and other standards are always free to define things that aren’t defined in the C standard. realloc(ptr, 0) was UB “in practice” for C due to non-POSIX compilers. They could not find any reasonable behavior for it that would work for every vendor. Maybe there just aren’t enough C++ compilers out there for this to actually be a problem for C++, though.
> And realloc(ptr, 0) will never be UB with a POSIX-compliant compiler
In general, POSIX does not change the behaviour of compiler optimisations. Compilers are free to optimise based on UB in accordance with the language semantics.
> They could not find any reasonable behavior for it that would work for every vendor
Then make it IB, which comes with a requirement that you document what you do, but doesn’t require that you do a specific thing, only that it’s deterministic.
> Maybe there just aren’t enough C++ compilers out there for this to actually be a problem for C++, though.
No, the C++ standards committee just has a policy of not introducing new kinds of UB in a place where they’re trivially avoidable.
> In general, POSIX does not change the behaviour of compiler optimisations. Compilers are free to optimise based on UB in accordance with the language semantics.
C23 does not constrain implementations when it comes to the behavior of realloc(ptr, 0), but POSIX does. POSIX C is not the same thing as standard C. Any compiler that wants to be POSIX-compliant has to follow the semantics laid out by POSIX. Another example of this is function pointer to void * casts and vice versa. UB in C, but mandated by POSIX.
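For the function-pointer case, a minimal sketch of the round trip that POSIX-style code relies on (dlsym() being the usual reason) could look like this. All names are hypothetical, and the code assumes a POSIX platform where the conversion is required to work.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*binary_op)(int, int);

static int add(int a, int b) { return a + b; }

/* ISO C does not define this conversion; POSIX requires it to round-trip. */
void *erase_op(binary_op op)
{
    return (void *)op;
}

binary_op restore_op(void *p)
{
    assert(p != NULL);
    return (binary_op)p;
}

/* Stand-in for a dlsym()-style lookup that hands back a void *. */
void *lookup_add(void)
{
    return erase_op(add);
}
```

A strictly conforming ISO C compiler is allowed to reject or mishandle the casts; a POSIX one is not, which is the distinction being drawn here.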
> No, the C++ standards committee just has a policy of not introducing new kinds of UB in a place where they’re trivially avoidable.
They introduced lots of new UB in C++20, so I don’t believe this.
I love this opinionated POSIX standard. Instead of baking in a bunch of hacks to support filenames with newlines, they just said “don’t do that, not supported going forward”. That’s a change I can get behind.
what will happen to existing files tho? i have some, currently all programs support those without any hacks, except one (ls). will i be unable to even rename or delete them if i mount my storage on a new posix 2024 system?
The article says:
> the following utilities are now either encouraged to error out if they are to create a filename that contains a newline, and/or encouraged to error out if they are about to print a pathname that contains a newline in a context where newlines may be used as a separator
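In application code the same policy could be sketched roughly as follows. Both function names are hypothetical, and using EILSEQ as the error code is an assumption made for this sketch rather than something quoted above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: true if the pathname contains a newline byte. */
bool path_has_newline(const char *path)
{
    assert(path != NULL);
    return strchr(path, '\n') != NULL;
}

/* Hypothetical wrapper: refuses pathnames containing a newline before
 * ever reaching fopen(), so such a file cannot be created through it. */
FILE *fopen_no_newline(const char *path, const char *mode)
{
    if (path_has_newline(path)) {
        errno = EILSEQ; /* assumption: error code chosen for this sketch */
        return NULL;
    }
    return fopen(path, mode);
}
```

The check only guards against creating such names through this wrapper; files that already exist can still be renamed or removed as before.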
I’m curious how this plays with locales. The most common places I’ve seen this issue are where the file is created with something like a Big5 locale and then viewed in a C locale.
Trail byte on Big5 is 0x40 or higher, so not line feed.
It would really make POSIX ready for 2024 to drop non-UTF-8 locales, though. (Which probably won’t happen as long as someone finds it important to be able to claim a Big5 AIX system POSIX-compliant.)
So you know that thing where you run make on a C program, and you go looking for the compiled output, and you can’t find it because it’s in the src/ directory, and compiled output is, well, the exact opposite of source? I’m a little fuzzy on the exact details, but I’ve been told this happens because people want to write makefiles that are posix-compliant, and up until now, posix make has no way to do this sensibly.
Do these new changes mean we can finally be done with putting compiled output in the src/ directory?
Nothing in POSIX make requires intermixing of source code and build artifacts, but its design makes that the path of least resistance, and unfortunately that does not change.
When you are writing makefiles by hand, you want to avoid having this kind of repetition:
With GNU make you can rewrite this using a template
%.o: %.c
	$(CC) $(CFLAGS) $(CPPFLAGS) -c -o $@ $<
POSIX make also allows you to do so, but using a bit more special-cased logic with .SUFFIXES
# Tell make that file extensions .c and .o participate in this machinery
.SUFFIXES:
.SUFFIXES: .c .o
# Instead of a template, you specify a magic target name, and the dependencies are implicit
.c.o:
	$(CC) $(CFLAGS) $(CPPFLAGS) -c -o $@ $<
If you want to separate your object files into build/ and your source to src/, with GNU make’s templates you could do so simply
However, the .SUFFIXES machinery only works with files in the same directory. To get the same behaviour as with the GNUMakefile you’d have to repeat yourself
VPATH, as mentioned by ~fanf in the sibling comment, sidesteps the issue quite elegantly: you can run the makefile in the build directory and have it pick up the source code as if it was located there. Sadly it is not one of the features added to the POSIX 2024 standard.
Thus far we’ve been talking about situations where you’d write a makefile by hand. However, if you are generating the makefile programmatically, the presence of repeated rules in the file is not really a problem. As such, it becomes pretty easy to implement out-of-tree builds even with POSIX make limitations – I believe GNU Autoconf does this.
Worth noting that POSIX make has a default ruleset whose .c.o rule is usually sufficient. In my makefiles in small projects I usually specify just the .o dependencies and let make work out the build commands from its defaults.
Thanks for the detailed answer! VPATH looks pretty nice; is there a reason it wasn’t considered for standardization?
In its current state, it’s pretty difficult to argue that anyone should actually target the POSIX standard since there are very basic use cases it doesn’t handle gracefully. I’m personally fine just targeting GNU Make, but I know it makes a lot of people sad, so it’s unfortunate they don’t seem to be doing much to address that. Like… maybe the BSD people hate VPATH or whatever for some reason, but surely they hate it more when people just give up on the standard because it’s not good enough, right?
I sympathize with people who want to make their Makefiles portable, but I wonder in how many cases depending on new POSIX Make features would really be more portable than depending on long-standing GNU Make features.
Even in standard naming they couldn’t avoid off by 1 error. ¯\_(ツ)_/¯
Or just look at cppreference …
https://en.cppreference.com/w/cpp/language/history
https://en.cppreference.com/w/c/language/history
I think most of open-std is available via the Archive, e.g. here is N3301: https://web.archive.org/web/20241002141328/https://open-std.org/JTC1/SC22/WG14/www/docs/n3301.pdf
For C23 the documents are
It’s best to watch the standard editor’s blog and sometimes twitter for this information.
https://thephd.dev/
https://x.com/__phantomderp
Alas, I don’t know. I’ve just heard from people on the C committee that certain things would be vetoed by certain vendors.
Oh good grief, it looks like some of the BSDs did not implement C89 properly, and failed to implement realloc(ptr, 0) as free(ptr) as they should have
FreeBSD 2.2 man page / phkmalloc source
OpenBSD also used phkmalloc; NetBSD’s malloc was conformant with C89 in 1999.
thank you, i need to read more carefully (either i missed “create” or assumed create a filename means create a parsed path object from a string)
The traditional way to do that is with the VPATH make variable, which still isn’t in POSIX.
VPATH has worked with FreeBSD and NetBSD make for well over 25 years, as have !=, ?=, +=, .PHONY, .WAIT and .NOTPARALLEL.