I noticed in the footer that the webpage is generated with the C preprocessor, and the source can be viewed by appending .h to the url (https://aartaka.me/c-not-c.h). I hadn’t considered the preprocessor as an approach to site generation. Interesting to read through and see the tradeoffs like needing a hash() macro in code blocks to render a #.
Another thing I really like is the middle dot separator that Go’s source code had in its infancy:
int string·len(string s)
void net·Listen(/* ... */)
You can also stray even further from C and define a few macros that communicate intent but do nothing:
#define mut
#define private static
private void foo(mut int *n) {
    // mutates n
}
There are some other ideas (like range macros or defer pre-processor) that are probably harmful in the long run in a project with more than 1 person in it. Apparently Bash’s original source code had a lot of these language modifications that made it unmaintainable. But with a few macros and strong conventions you get a nicer looking language.
I like that middle dot idea. How would you do a defer pre-processor? Is that just a macro or something more (like the cleanup attribute extension)?
This is actually all quite relevant to an experiment I’ve been doing lately with a custom codegen step before C compilation. I used to not be into codegen due to the seeming inconvenience of an extra build step, but it’s been fine: I just put it in my build script and don’t think about it now. The idea with the codegen is to read your types and only generate additional types and functions in additional files you #include, but not, e.g., transform the code inside function bodies. So far what I’ve gotten from it (for each user type T):
void TDealloc(T *value) functions that propagate dealloc calls to each field of a struct recursively, offering one of the features of destructors that I find useful. You still do the top-level call explicitly, which I like.
TArray types for growable arrays that are type-safe. Also generates macros that let you write for (TArrayEach(elem, array)) { ... }.
TFromJSON and TToJSON functions for JSON serialization and deserialization.
The clear names give me some things I like over templates / generics / ast-macros / comptime / … – you can ‘go to definition’ of these functions in your editor and just read them, step into them in a debugger, get readable callstacks in debuggers and profilers, find a function with a global symbol search on its unique name, etc. When the generated code has compile errors, they take you to the concrete examples where they fail. The generated code is just regular, readable code.
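To make the "regular, readable code" point concrete, here is a rough sketch of what such output could look like for one type. It is not the actual output of the generator in the gist: the Player struct, the PlayerArray layout, and PlayerArrayPush are made up for illustration, and only the TDealloc / TArray / TArrayEach naming pattern comes from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* --- user-written type (hypothetical) --- */
typedef struct Player {
  char *name; /* owned heap string, may be NULL */
  int score;
} Player;

/* --- what a generator could emit for Player (hypothetical) --- */

/* Releases what the fields own; the struct's own storage stays with the caller. */
void PlayerDealloc(Player *value) {
  free(value->name);
  value->name = NULL;
}

typedef struct PlayerArray {
  Player *data;
  size_t count;
  size_t capacity;
} PlayerArray;

/* Appends by value, doubling capacity when full; returns the stored element. */
Player *PlayerArrayPush(PlayerArray *array, Player value) {
  if (array->count == array->capacity) {
    size_t capacity = array->capacity ? array->capacity * 2 : 8;
    Player *data = realloc(array->data, capacity * sizeof(Player));
    assert(data != NULL); /* sketch only: abort on allocation failure */
    array->data = data;
    array->capacity = capacity;
  }
  array->data[array->count] = value;
  return &array->data[array->count++];
}

/* Expands to the three clauses of a for statement:
   for (PlayerArrayEach(elem, array)) { ... } */
#define PlayerArrayEach(elem, array)                                   \
  Player *elem = (array).data,                                         \
         *elem##End = elem ? elem + (array).count : elem;              \
  elem != elem##End;                                                   \
  ++elem

/* Propagates dealloc to every element, then frees the storage. */
void PlayerArrayDealloc(PlayerArray *array) {
  for (PlayerArrayEach(elem, *array)) {
    PlayerDealloc(elem);
  }
  free(array->data);
  *array = (PlayerArray){0};
}
```

Because every function has a concrete name like PlayerArrayPush, a debugger or symbol search lands on exactly this text rather than on a template instantiation.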
I’ve been working on a little game project to test it out: https://gist.github.com/nikki93/0ffa13a3b6e690c0317065ba0b415433 – ‘game.c’ is meant to be the main user-written code. ‘generate.c’ reads that to generate ‘generated.{h,c}’. And ‘main.c’ is surrounding scaffolding to launch the game and also provide ‘hot-reload’ that uses the serialization.
Here’s a quick video showing the resulting workflow: https://youtu.be/zGelkFXP4mo – the hot reload, debug stepping into generated functions, going to definition.
I’ve had a Go->C++ transpiler (https://github.com/nikki93/gx) I’ve used a lot for this kind of thing before but I’m finding that this C + codegen is getting me most of what I wanted from that. I’m liking the ‘explicitly writing types everywhere’ vibe for comprehension after-the-fact.
With codegen. The idea is to just parse the source code for defer and generate an output where the defer statement is put before every return in order.
It has to be a little bit smarter, because when the original if/else/whatever doesn’t have braces it has to add them, etc.
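A before/after sketch of that rewrite might look like this. The defer keyword and CountLines are hypothetical, and no particular tool is implied. The input is shown in a comment because it isn't valid C. The function below it is what a generator could emit. I'm assuming the usual defer convention here, where deferred statements run in reverse order of registration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Input as the user would write it (not valid C; `defer` is hypothetical):
 *
 *   long CountLines(const char *path) {
 *     assert(path != NULL);
 *     FILE *file = fopen(path, "r");
 *     if (!file) return -1;
 *     defer fclose(file);
 *     char *line = malloc(4096);
 *     if (!line) return -1;
 *     defer free(line);
 *     long count = 0;
 *     while (fgets(line, 4096, file))
 *       if (strchr(line, '\n')) count++;
 *     return count;
 *   }
 */

/* Output a generator could emit: counts newline-terminated lines in `path`,
   returning -1 if the file cannot be opened or memory runs out. */
long CountLines(const char *path) {
  assert(path != NULL);
  FILE *file = fopen(path, "r");
  if (!file) return -1; /* nothing deferred yet, left untouched */
  char *line = malloc(4096);
  if (!line) { /* braces added so the deferred call fits */
    fclose(file);
    return -1;
  }
  long count = 0;
  while (fgets(line, 4096, file))
    if (strchr(line, '\n')) count++;
  free(line);   /* deferred last, runs first */
  fclose(file); /* deferred first, runs last */
  return count;
}
```

Note that the second early return is where the brace problem shows up: a single-statement if has to become a block before the deferred call can be placed in front of the return.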
At some point too many of these modifications become more like a custom language that compiles to C, without any modern conveniences. However in general I agree with you that codegen can be clearer than generics/templates in some cases. It also shines in some other more basic scenarios where there would be a perfectly valid solution without codegen (e.g. Protobufs, OpenAPI, SQL query functions, etc).
Oh gotcha. Yeah __attribute__((cleanup(...))) in clang and gcc might be a reasonable way to get at this if using those compilers. I like that that attribute associates a specific function with a specific variable (the usage of defer that is usually desired), rather than usual designs of defer which involve an arbitrary code block and writing the defer anywhere.
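For anyone who hasn't used it, a minimal sketch of the attribute, assuming gcc or clang (it is a compiler extension, not standard C). The handler names and FirstLineLength are made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* A cleanup handler receives a pointer to the annotated variable. */
static void CloseFile(FILE **file) {
  if (*file) fclose(*file);
}

static void FreeBuffer(char **buffer) {
  free(*buffer);
}

/* Returns the length of the first line of `path` (without the newline),
   or -1 if the file cannot be opened, is empty, or memory runs out. */
long FirstLineLength(const char *path) {
  assert(path != NULL);
  __attribute__((cleanup(CloseFile))) FILE *file = fopen(path, "r");
  if (!file) return -1;
  __attribute__((cleanup(FreeBuffer))) char *buffer = malloc(4096);
  if (!buffer) return -1; /* CloseFile(&file) still runs here */
  if (!fgets(buffer, 4096, file)) return -1;
  long length = (long)strcspn(buffer, "\n");
  return length; /* FreeBuffer, then CloseFile, run as the scope exits */
}
```

Each handler is tied to one variable and runs on every exit path from that variable's scope, which is the "specific function with a specific variable" association described above.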
I agree re: your point about a custom language. I’m thinking of limiting my codegen to only generating additional types and functions (essentially more regular API for your code to use) rather than transforming your code, meaning replacing it. I think that sets a good boundary that keeps user code still feeling like regular C. I find that what I want more from C tends to be APIs I’d like to have available on my types (reflection, generic growable arrays or other data structures), more than transformations of expressions and statements.
This balance without the ‘modern conveniences’ is actually what I’m wanting right now, because I find that those conveniences often lure me into trying to make my code ‘nicer’ using language features, or force me to make decisions about which way to write something: accidental decisions that aren’t actual design decisions about the project I’m working on. E.g., I find that deciding whether a function should be a method or a free function usually falls under this category in modern languages. By just removing the need to make such decisions, I’ve found that C can keep me focused on actually working on the project. It’s a delicate balance for sure, and the tradeoffs vary across projects and authors (and would especially break down in teams), but it’s very reasonable for the ‘fun single person project’ category.
The Windows headers actually do something similar to the mut concept you put in, with in, out, and inout parameter annotations. They’re definitely a nice way to communicate intent in a function signature.
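A portable sketch of the same idea, using empty macros like mut above. The PARAM_* names and SplitSeconds are invented for this example. The real Windows annotations (for instance SAL's _In_, _Out_, and _Inout_) come from the SDK headers and can be checked by Microsoft's analysis tooling, whereas these do nothing beyond documenting intent.

```c
#include <assert.h>
#include <stddef.h>

/* Empty markers, same trick as `mut`: they vanish after preprocessing. */
#define PARAM_IN
#define PARAM_OUT
#define PARAM_INOUT

/* Splits `total` seconds into minutes and leftover seconds, and adds
   one to the caller's running call counter. */
void SplitSeconds(PARAM_IN int total, PARAM_OUT int *minutes,
                  PARAM_OUT int *seconds, PARAM_INOUT int *calls) {
  assert(minutes != NULL && seconds != NULL && calls != NULL);
  *minutes = total / 60;
  *seconds = total % 60;
  *calls += 1;
}
```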
Were you thinking of the Unix V7 Bourne shell? Stephen Bourne used macros to make C look like ALGOL 68. Here’s an example file.
You’re right, I mixed up the two: Bourne-Again Shell wasn’t written by Bourne. Rob Pike had a talk where he talked (briefly) a little more fondly about the Bourne shell and those macros.
Sort of related: Inspired by the IOCCC, instead of doing school work in high school, I would often play around in C with #defines and typedefs to make weird and obscure syntaxes. It was really fun to mess about with.
At least the author didn’t go so far as to add whilst loops.
(Also, C23 has type inference?! I haven’t been paying attention to recent developments in C, so this is the first time I’ve heard of it.)
I guess it’s technically type inference but not how I’d normally understand the term. It’s just a language-level way to do typeof(expr) foo = expr.
I’ve also mentally compiled a few ideas of a “nicer” C over the years. Aside from the ones in the article there’s also “generics”:
This can’t be right…?
Explained at https://www.reddit.com/r/programming/comments/1fnsgjy/comment/lon6wer/. It has been fixed now.
Aaah of course. Generating a website about C using CPP is a rather fraught endeavor!
Ya, that produces an error using GCC.