In my last post, I looked into the infrastructure of mimalloc, mostly a few details on how it actually hooks the system allocator when used. I’m still using what is effectively a random scatter/gather approach to the codebase, mostly because it is small. I’m trying to… grok it would be the best term, I guess. I’m going over the code because it also gives me a good idea about practices in what seems to be a damn good C codebase.
I should mention that I dove right into the code. There is also the tech report, which I intend to read, but only after I get through enough of the code to get a good feeling for it.
I ran into this code in the options.c file, for instance:
This is a really nice way to get configuration values from the user. What I find really interesting is not the actual options, which will be interesting later on, but the fact that this is such a nice way to represent things in a readable manner.
I’m doing similar things in C# (a much higher level language) to create readable options (typically using dictionary & dictionary initializer). I like how the code is able to express things so succinctly in a language with far fewer features.
However, the order of the entries is critical (it should match the mi_option_t enum values), and nothing in the code expresses or enforces this.
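To make the ordering concern concrete, here is a minimal sketch of an enum-indexed option table that ties each entry to its enum value. The names (opt_id_t, opt_desc_t, opt_get and the option names) are hypothetical and are not mimalloc's; the sketch assumes a C11 compiler, since array designators are not part of standard C++.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

/* Hypothetical option ids; illustrative only, not mimalloc's actual enum. */
typedef enum {
    opt_show_errors,
    opt_verbose,
    opt_page_reset,
    opt_count          /* number of options, keep last */
} opt_id_t;

typedef struct {
    long        value;  /* default value */
    opt_id_t    id;     /* which option this entry describes */
    const char *name;   /* name used when reading configuration */
} opt_desc_t;

/* Designated initializers bind each entry to its enum value, so reordering
   the enum or the table cannot silently pair an entry with the wrong id. */
static const opt_desc_t options[] = {
    [opt_show_errors] = { 0, opt_show_errors, "show_errors" },
    [opt_verbose]     = { 0, opt_verbose,     "verbose" },
    [opt_page_reset]  = { 1, opt_page_reset,  "page_reset" },
};

/* Catches a missing trailing entry at compile time. A hole in the middle
   would still compile and be zero-initialized. */
static_assert(sizeof(options) / sizeof(options[0]) == opt_count,
              "every option needs a table entry");

static const opt_desc_t *opt_get(opt_id_t id) {
    return &options[id];
}
```

The trade-off is portability: a plain positional table compiles as both C and C++, while this form is C only.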
I also liked this code, which is part of reading configuration values from env variables:
I like that this is using strstr() in reverse in this manner. It is really elegant.
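The reversed strstr() idea can be sketched roughly like this: the haystack is the fixed list of accepted spellings and the needle is whatever the user supplied. The helper name is_truthy, the buffer size, and the exact list of spellings are my assumptions for illustration, not the library's code.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true if `value` looks like an "on" setting. */
static bool is_truthy(const char *value) {
    assert(value != NULL);

    char buf[16];
    size_t len = strlen(value);
    /* Treating empty or overly long input as false is a choice made for
       this sketch. */
    if (len == 0 || len >= sizeof(buf)) return false;

    /* Upper-case a copy so the comparison is case-insensitive. */
    for (size_t i = 0; i < len; i++) {
        buf[i] = (char)toupper((unsigned char)value[i]);
    }
    buf[len] = '\0';

    /* Reversed strstr: search for the user's input inside the list of
       accepted values, instead of searching the input for each value. */
    return strstr("1;TRUE;YES;ON", buf) != NULL;
}
```

One quirk of this trick is that any substring of the list matches, so "rue" or ";" count as true as well. For environment variable parsing that is probably an acceptable price for a one-line check.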
Going over the OS abstraction layer, on the other hand, shows some gnarliness. Take a look here:
I actually simplified the code a bit, because it also had #if there for BSD, Linux, etc. I find this style harder to follow. Maybe adding indentation would help, but I had to read this function multiple times, filtering for specific OSes, to get it right.
I did find this tidbit, which is interesting:
This is attempting to do allocation with exclusive access. I wonder what this is actually used for. It looks like mimalloc is attempting to allocate at specific addresses, so that should be interesting.
Indeed, in _mi_os_reset() they explicitly ask the OS to throw the memory away, using madvise() with MADV_FREE or VirtualAlloc() with MEM_RESET. I find this interesting, because this lets the OS know that the contents can be thrown away, but the allocation (and the address range it occupies) still persists. I’m currently looking into some fragmentation issues in 32 bits, which won’t be helped by this scenario, since the address space is never given back. Then again, I don’t think that mimalloc is primarily intended for 32 bits systems (I can see code handling 32 bits, but I don’t think this is the primary use case or that 32 bits got a lot of attention).
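To illustrate the "contents are gone, but the range stays" behavior, here is a small POSIX-only sketch. The function names region_alloc and region_reset are hypothetical, and the fallback to MADV_DONTNEED when MADV_FREE is unavailable or rejected is my assumption for the sketch, not a description of what mimalloc does.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS and madvise() on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: map `size` bytes of anonymous read/write memory. */
static void *region_alloc(size_t size) {
    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Hypothetical helper: tell the OS the contents are disposable.
   The mapping itself is untouched, so the address range stays reserved
   and the memory can be written to again afterwards. */
static int region_reset(void *p, size_t size) {
    assert(p != NULL);
#if defined(MADV_FREE)
    if (madvise(p, size, MADV_FREE) == 0) return 0;
    /* Older kernels may reject MADV_FREE; fall back below. */
#endif
    return madvise(p, size, MADV_DONTNEED);
}
```

Only munmap() actually returns the address range to the system, which is why a reset does nothing for address space fragmentation in a 32 bits process.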
The mi_os_alloc_aligned_ensured() call is doing some interesting things. If I need a 16MB buffer aligned on a 1MB boundary, I have no real way to ask the OS for this. So this is implemented directly by over-allocating. To be fair, I can’t think of a good reason why you’d want to do something like that (you have no guarantees about what the actual physical memory layout would be, after all, and that is the only scenario where I can think this would be useful). Given that page aligned memory (which is what you get anyway from the OS) is usually sufficient, I wonder what the use case is for that here.
I get why mimalloc has to struggle with this, given that it is limited to just returning a pointer back (the malloc interface), and doesn’t have a good way to tell that it played games with the alignment when you pass a value to free(). There seems to be a lot of code around here to deal with memory alignment requirements. I think that I’ll need to go up the stack to figure out what kind of alignment requirements it has.
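The over-allocation trick can be sketched in portable C on top of malloc(). This is not mimalloc's implementation; aligned_block_t and alloc_aligned are hypothetical names, and the sketch assumes the alignment is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    void *base;     /* what malloc() returned; this is what goes to free() */
    void *aligned;  /* first address in the block with the wanted alignment */
} aligned_block_t;

/* Hypothetical helper: get `size` usable bytes aligned to `alignment`
   by asking for extra room and rounding the start address up. */
static aligned_block_t alloc_aligned(size_t size, size_t alignment) {
    aligned_block_t blk = { NULL, NULL };

    /* The bit-mask rounding below only works for powers of two. */
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    if (size > SIZE_MAX - alignment) return blk;   /* size overflow */

    blk.base = malloc(size + alignment);
    if (blk.base == NULL) return blk;

    /* Round up to the next multiple of `alignment`. */
    uintptr_t addr = (uintptr_t)blk.base;
    uintptr_t up   = (addr + alignment - 1) & ~(uintptr_t)(alignment - 1);
    blk.aligned = (void *)up;
    return blk;
}
```

Note that the caller has to hold on to base in order to release the memory, which is exactly the information a plain malloc()/free() interface has no place to carry. When the memory comes straight from the OS on POSIX systems, the unused head and tail of the over-sized mapping can instead be handed back with munmap().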
That is enough for now, I think. I’m going to get to the core of mimalloc in the next post.