December Adventure 2023¶
Last year I've participated to the "AWKdvent of Code", my version of the advent of code: with AWK. This year let's try something else :)
The December Adventure!
┌───────────────────────────┐
│ December 2023 │
├───┬───┬───┬───┬───┬───┬───┤
│ │ │ │ │*1 │*2 │*3 │
├───┼───┼───┼───┼───┼───┼───┤
│*4 │*5 │*6 │*7 │*8 │*9 │*10│
├───┼───┼───┼───┼───┼───┼───┤
│*11│*12│*13│*14│ 15│ 16│ 17│
├───┼───┼───┼───┼───┼───┼───┤
│ 18│*19│*20│ 21│ 22│ 23│ 24│
├───┼───┼───┼───┼───┼───┼───┤
│ 25│ 26│ 27│ 28│ 29│ 30│ 31│
└───┴───┴───┴───┴───┴───┴───┘
Goals¶
Stuff I'd like to do:
- implement the plan/line interception for collision detection
- rework the text layout/display on my (unreleased) text editor
- finally do the PCB design to bring USB to a palm portable keyboard (foldable!)
- implement texture selection in my live shader coding tool "bonz"
- add a web page with a selection of shaders to be displayed on a web-browser
01¶
Created a quick demo page with a openGL shader that can run in the web-browser. I would like to create custom HTML elements, like what Stargirl did with KiCanvas.
02¶
Today was snowing outside, and so I spent the entire day on the computer...
Worked on a demo project, the goal is to create a 4K demo/intro for Linux. The idea is to compress a dynamic elf linked against the libSDL2 and embed all of this into another statically linked tiny elf. The second elf will decompress the first elf in memory and call an exec syscall on it, so the ld interpreter will be used. I hope it make sense.
Cleaned-up a skeleton that uses SDL2 to display a single full screen shader. Also hacked the GL loader "glad" to be more tightly integrated so it will only "load" (resolve) the openGL symbols that are actually used, this saves a lot of bytes.
Then I've started cleaning up the "loader" part, which I already started few month ago after finding a neat trick to execute an elf directly from memory:
int fd = memfd_create("", MFD_CLOEXEC);
execveat(fd, "", argv, envp, AT_EMPTY_PATH);
This trick is very close to the memfd_create()
+ fexecve()
syscall, except
this one uses execveat()
with AT_EMPTY_PATH
which doesn't changes much but
it doesn't require to craft an path to /proc/pid/fd/3
.
The execveat()
man-page suggest that is it used to execute a given file located
in a directory specified by a file descritor, which in my case isn't what I need.
However the AT_EMPTY_PATH
flags change its behavior by executing the given file
descriptor when the file path is the empty string ""
!
At that time I had a linker script to create a tiny elf solely using PHDRS
.
This script wasn't on my computer, but on the "old" one sitting next to me...
I didn't bother powering this "old" computer again to get the previous script.
Instead I tried to do something else this time: creating the elf header directly
in C by using the system elf.h
header and some symbols defined in a custom
linker script.
~~~
Things I've learned.
In the program header the "physical" file size p_filesz
must be a smaller or
equal to the "virtual" memory size p_memsz
, otherwise the elf loader will refuse
to load the program and will return a "segfault".
The environment variables are needed by SDL, or by X11, without it fails to connect to the X11 server
The environment variables pointer envp
is located at the end of the args.
/* pointer arithmetics */
char **envp = argv + argc + 1;
But then, where does argc
and argv
comes from ? Turns out it is passed by the kernel
on the stack (at least on x86_64):
mov %rsp,%rdi ;; copy the stack pointer to register rdi
The stack should be organized as the following, with argc
being on "top" of stack followed by the argv
pointer:
┌ $rsp ┌ $rsp+8
├──┬──┬──┬──┬──┬──┬──┬──┼──┬──┬──┬──┬──┬──┬──┬──┐
│ argc │ argv │
└──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┘
Listening to 8-bit mentality, on a loop...
what a day
Next goal is to look at how to lzma works and how it can be decompressed.
03¶
Yesterday was far more "productive" than expected, today's goal is to keep it cool. The initial plan for this sunday was to go outside to do some moderate mixed alpine climbing, but we had to cancel due to the recent snowfall.
Today's goal is to look into ray-marching and having fun with shaders.
~~~
I have not looked much at lzma, but I've looked at using ray marching in shaders...
I haven't fully understood the usual ray marching implementation details (such as how to create a "camera") but the general idea and algorithm is quite "simple", the catch is how to create complex worlds that I still don't understand.
I think the ray marching approche could be applied create a simple collision systems for games.
Started adding "timeline" controls into my live shader coding tool bonz.
04¶
Worked more on the "timeline" controls.
The time can be stopped and two markers can be placed on the timeline to create loops.
05¶
Started tinkering on a simple markdown parser in AWK, I want to be able write my own implementation of notmarkdown, which I already modified quite a bit. I want this version to support having lists in quoted blocks...
06¶
Studied a bit more on how LZMA works, but it is still a mystery.
LZMA stands for Lempel–Ziv–Markov chain Algorithm, and if i understood correctly is it the combination of an Lempel-Ziv based compression (see LZ77) which is then further compressed by a range coder, which is an entropy coding method...
An actual LZMA file starts with the following structure:
- 1 byte with the compression configuration (a complex combination)
- 4 bytes for the dictionary size (big-endian encoded ?)
- 8 bytes for the uncompressed size, or UINT64_MAX (all
0xff
) for a stream, the end is noted with a special marker
0 +1 +4 +8/
├──┼──┬──┬──┬──┼──┬──┬──┬──┬──┬──┬──┬──┼──
│cf│ dico_size │ uncompressed_size │<stream start>
└──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──┴──
Then the stream starts with 5 bytes that are used to initialize the range coder,
the first byte is always NUL
(and doesn't have much impact).
──┼──┬──┬──┬──┬──┼──
..│r0│r1│r2│r3│r4│<...>
──┴──┴──┴──┴──┴──┴──
Code snippet for the range coder initilization, the first iteration has no effect on the final rc->code
state
which is 32-bits.
for (i = 0; i < 5; i++)
rc->code = (rc->code << 8) + in->buf[in->pos++];
There is so much more ...
07¶
More study on LZMA. And cleanup of shader initialization in bonz.
Nothing visual.
08¶
I can't remember this far... I should have keept taking note...
09¶
Got a simple LZMA decompression working but the size of the decoding binary is very high (~3.8K bytes).
10¶
Reduce the binary size by re-writting few functions: passing parameters by values, reducing branches, using globals, removing testing for error cases. In the end this saved ~250 bytes, this is not enough.
11¶
Starting exploring other means to gain space on the lzma decompressor. I've got severals ideas:
- compressing the lzma decoder with lz4
- implemeting a lzma decoder with some form of byte code to achieve a better code reusability
At that point I wanted to put all piece togther and see, at least, if the LZMA decoder works well enough to properly decode the demo... somehow my laptop's kernel refuse to execute the tiny crafted elf and I don't know why. I'll have to take a look at this.
12¶
Fixed the issue with the tiny elf not being loaded by the kernel. Turns out the elf load address matters, despit being a virtual address, maybe this is because of VDSO.
To debug this I've looked at tmpout.sh which has a lot of informations about the elf executable and many links to other ressources.
My previous attempt at creating a tiny LZMA decoder failed and I think this is caused by me attempting to optimise for binary size too early on, before having a working setup. Thus I started all over but with the objective of having a working setup first, only then to look at size optimisation.
Managed to get a single file implemtation of an LZMA decoder. I had to debug along side a working implementation, some functions that I copied from my first attempt were broken.
13¶
Got a LZMA decoder working but the loader failed to execute the binary directly from memory, despit successfully decompressing.
It looked like it was a memory corruption because the debug printfs I added were
printing non-sense... But it turns out it was the execveat
syscall that simply
returned a negative error code (in this case -14
aka -EFAULT
) and my implementation
of the %d
decimal format had a bug when printing negative numbers.
After a lot of investigation it seems to be stack based variables that cause the issue,
and so I made every variables static and it "worked".
The current total binary size is 4483
bytes, a bit above the target of 4096
, the loader
itself takes around 1667
bytes, which is a lot.
14¶
Today's goal is to investigate on the stack corruption...
Increasing the stack, by substracting the stack pointer ($rsp
on x86_64) "fix" the issue,
which confirms that there is a stack overflow or corruption.
After more investigation this was caused by the inline assembly in the custom entry point
which had the naked
attribute, the main
function declared static
was inlined into the
entry point and somehow inherited the naked
attribute.
The issue can be simply fixed by removing the static
attribute on the main
function.
I'll have to see if there is way to make this startup code more compact. But that's for another day.
15¶
I went to Antibes to participate in a regatta for the weekend (16 and 17). We sail two days on a J130, this was the first time I was on such a huge boat (~13m). This was really nice and the weather was really hot for december, there were not much wind but still this was fun.
18¶
On the road back from Antibes.
19¶
Fixed the startup in the demo project, the stack is not initialized three instructions golfed into the elf header and simply fallthrough to the main function. The main function isn't declared static anymore solving the stack initialization issue, and is placed right after the elf header by the linker script.
20¶
Modified the demo to use X11 instead of SDL to open a window an get an openGL context but the SDL version is still smaller than the one using X11, maybe I'll need to look at this more. I hope using SDL is authorised by the demoparty.
23¶
Participated in a local regatta on a lake near-by.
eol¶
Keeping a log is harder that what I thought, I found it difficult to write and I tend to postpone writing until I don't remember what I did...