for the latest chapter of what's becoming a blog on the most cursed bash you can imagine, let's do some maths together
euclid's algorithm for gcd could be written like this in python:
>>> def gcd(a, b):
... if b:
... return gcd(b, a%b)
... return a
...
>>> gcd(36, 48)
12
here's a funky way to do the same thing in bash:
$ gcd=b?c=a,a=b,b=c%b,gcd:a
$ echo "$((a=36,b=48,gcd))"
12
bash recursively expands gcd
in the arithmetic context as if it was a macro,
lazily evaluating only one side of ?:
at every level. there is a fixed
recursion limit, mainly to prevent infinite expansions like a=a; echo $((a))
,
but it is 1024 steps, which is not too small and enough for many use cases
this turns out to be a powerful tool, and it's usually the fastest way to do nontrivial1 calculations in pure bash.
one could think of these recursive expansions as if they were functions, and
try to build a library of convenient building blocks. unfortunately it's quite
hard to compose them, because they're just evaluated via text replacement and
not as proper functions. there is no separate scope for the variables, and all
the names are fixed. using weird names is a possible way to simulate private
variables, but it remains hard to calculate gcd(gcd(396, 594), gcd(462, 363))
and a possible solution to this could be: a stack 🥳
(this may sound obvious to you, dear reader, but i'm slow so it took me a bit)
here's how it can work:
- an array of values
v
can serve as a generic stack, and a variablep
as a stack pointer. note that this kind of expansion cannot unset array elements, but that's fine, decrementing the stack pointer is enough - to call our new "functions", we'll make sure that the correct number of parameters are on the top of the stack
- the functions will be able to use all the space at the top of the stack (including the parameters) as a scratch buffer
- they will finally leave their retun value(s) at the top of the stack. if they need to return a variable number of values, they can communicate this to their callers by also pushing the number of values
overall we have just created the exact opposite of type safety, but you're writing this kind of bash code, you don't need safety, or types, or sanity
what we're left with is a 1:1 translation for our old gcd:
p = ${#v[@]} number of elements on the stack
a = v[p-2] passed on the stack
b = v[p-1] passed on the stack
c = v[p] temp parameter created by gcd on top of the stack
we take 2 values from the stack
we return 1 value
--> decrement the stack pointer to return
gcd=v[p-1]?v[p]=v[p-2],v[p-2]=v[p-1],v[p-1]=v[p]%v[p-1],gcd:v[--p-1]
b ? c = a , a = b , b = c % b ,gcd: a
we can verify that it works, and compute our nested gcd from before:
$ gcd=v[p-1]?v[p]=v[p-2],v[p-2]=v[p-1],v[p-1]=v[p]%v[p-1],gcd:v[--p-1]
$ v=(36 48) p=${#v[@]}
$ echo "$((gcd))"
12
$ declare -p v p
declare -a v=([0]="12" [1]="0" [2]="36")
declare -- p="1"
$ echo "$((v[p++]=396, v[p++]=594, gcd, v[p++]=462, v[p++]=363, gcd, gcd))"
33
$ declare -p v p
declare -a v=([0]="12" [1]="33" [2]="0" [3]="198" [4]="66")
declare -- p="2"
previous: revolutionary invention
Footnotes
-
shameless plug: one application of this trick is the wall hit code of my bash raycasting 3d engine, pretty much the only reason why things run at an almost acceptable speed https://github.com/izabera/pseudo3d/blob/90e3e58/game.bash#L181-L185 ↩