Tail call optimization in Ruby
Tail call optimization is an optimization where tail recursive functions are transformed into loops by the compiler. A tail recursive function is one where the final statement is a call to the same method. In this post, we will look at what tail recursive functions look like, how tail call optimization helps them, and how to enable TCO in Ruby.
Recursion
The factorial function provides an excellent example to demonstrate tail recursive functions.
Even though the above method is recursive, you cannot call it tail recursive, because the last line is an operation rather than a simple call to the #fact
method. Let’s see the sequence of operations that would happen to calculate fact(4)
:
As you can see from the shape of the above example, the operations keep stacking up with each recursive call to fact
. The interpreter needs to keep track of all these values as the function call gets expanded into the longest line above, and then reduces down to a single value. Because the stack size is limited, this will cause a SystemStackError for large values of n.
Tail recursion
There is a way to avoid the expand-then-contract shape of operations we saw above. We could introduce an additional argument to the fact
method that keeps track of intermediate values.
If we were to look at the sequence of operations in this example, it would look like this:
This version of the method can be called tail recursive. Even though this version only needs to keep track of the two arguments at a time, this will still cause SystemStackError. Why?
The reason is that when a method is called, the interpreter needs to keep track of where to return after executing that method. The location where execution has to return to is stored in the call stack. Each recursive call to the #fact
method gets added to the call stack.
Enabling tail call optimization
When tail call optimization is enabled, the tail recursive calls can be optimized to work like a loop. Instead of stacking the method calls on the call stack, the interpreter replaces the topmost stack frame with the new one.
Ruby does not enable tail call optimization by default, but you can enable it by setting a compile option in the code.
When the method is called as fact(4, 1)
, the final statement in the method can be expressed as fact(3, 4)
. Instead of making a method call, the interpreter can replace the contents of the arguments at the top of the stack, and jump back to the start of the method.
As a result of this, the stack size remains constant while recursing through to the result of the function, and you don’t run into stack errors.
Ruby does not enable TCO by default, but we can enable it in our program by setting the compile flag during runtime as shown here:
Writing a method decorator
In my previous post, I showed how we can write a method decorator for memoizing a method. Just for fun, let’s try writing a similar one for tail recursive functions.
In this example, we need to be able to define a Calculator class and declare the fact method as tail_recursive. The TailCallOptimization
module (which we will see soon) contains this tail_recursive
method that recompiles the method with TCO enabled.
To make this tail recursive, we need the source of the fact
method as a string, so that we can recompile it with the correct compile options. Unfortunately there is no easy way to access this, so I’m going to cheat and use the method_source gem to access the source. With this gem, calling #source
on a method object returns the source for that method.
In the above snippet, we instantiate a RubyVM::InstructionSequence with the string representation of the method, and then call #eval
on it. Since we’ve set compile options for TCO, this allows us to call the #fact
method with large input values. This isn’t a very useful method decorator, but it’s fun to explore how to recompile some parts of Ruby at runtime.
You can try calling #fact
with really large values for n, and you will not run into SystemStackError.
A negative side to TCO?
Guido van Rossum wrote about why he is against supporting TCO in Python. One major problem he points out is that TCO messes up the stack traces, and therefore makes debugging harder. For this reason, he does not support implementing TCO in Python.
In Ruby world, on the other hand, Matz is less opposed to the idea, and as we’ve seen, Ruby allows you to optionally enable it, even though it’s not the default.
Further reading
- Magnus Holm has written an article, Tailin’ Ruby that presents some more approaches to achieve TCO in Ruby.
- Tail Call Optimization in Ruby: Deep Dive goes into the internals of how TCO works in Ruby.