Functional Recursive JavaScript Programming
Introduction
This post isn’t about Functional Programming per se, as there are many parts to pure FP (seen in Lisp, Clojure, Haskell etc) which we can’t achieve completely in the JavaScript language (for example, implementing fundamental principles of FP such as “immutable data” is very difficult in a language designed around allowing data structures to be mutable. Although immutable data would be a lot easier with ES5 getters/setters and ES6 proxies).
What we do cover is recursion and a problem known as “tail call optimisation” and how to work around it in JavaScript.
The problem
In JavaScript, if a function calls itself recursively then the JavaScript engine has to create what’s called a new ‘stack’. A stack is a chunk of memory allocated to help keep track of all the information related to the function at the point of execution (such as its arguments and their initialised values).
Here in lies the problem: creating stacks is expensive, as the JavaScript engine can only create as many stacks as it has memory available. If we write a function that recursively calls itself many times then we’ll find that we can exhaust the memory allocation and trigger an error.
Let’s look at some code that demonstrates this.
The following function sum
simply adds two numbers together; but for the purpose of demonstrating recursion (a potentially complex topic) and making it easier to understand, we have written the sum
function in a very convoluted way:
function sum(x, y) {
if (y > 0) {
return sum(x + 1, y - 1);
} else {
return x;
}
}
sum(1, 10); // => 11
Let’s quickly explain what the code does:
- We pass in two numbers which are assigned to the parameters
x
andy
(in this instance we pass1
and10
) - We check if
y
is greater than0
- If
y
is greater than0
(which we know it is) then we recursively callsum
again but this time modify the arguments so thatx
is incremented by1
andy
is reduced by1
- When the
sum
function is next called we’re passing2
and9
. - At this point we’re now back inside the
sum
function but the first call tosum
has yet to finish (as we didn’t reach the end of the function, the focus moved to another function being executed - which in this case wassum
again) - Also at this point the JavaScript engine has two stacks. One for the point in time when we passed in
1
and10
, and now (as it has to remember2
and9
). - The JavaScript engine has to remember the previous arguments
1
and10
because potentially once this secondsum
execution finishes we’ll end up back in the first execution context.
As we can see, in the above code the JavaScript engine has to create a new stack for each recursive call. For a small sum of 1
and 10
this is fine; but if we try to sum 1
and 100000
then that’ll require more stacks than we have memory to allocate. This will cause an error like so:
sum(1, 100000);
// => `RangeError: Maximum call stack size exceeded`
Recursion in the form we’ve written it above requires many stacks to be created because of how the function is calling itself.
The first call to the sum
function (i.e. sum(1, 100000)
) doesn’t complete until the very last call to sum
returns the value of x
(i.e. sum(0, 99999)
is technically the final execution). When the final call to sum
occurs and we discover that y
is no longer greater than zero, we return the accumulated value which has (for each recursive call) been stored inside the argument x
. That returned value needs to then be passed back through each function execution context (effectively closing each stack) until we reach the very first function execution context that was opened when we ran sum(1, 100000)
.
If we create a stack deep enough (such as we did with sum(1, 100000)
) then the JavaScript engine will throw a Maximum call stack size exceeded
error.
This problem can occur very easily (as we’ve seen in this simple code example).
Tail Call Optimisation
In other programming languages the recursion could be rewritten in such a way that the engine would recognise a recursive execution was happening and optimise the code internally into a loop form. This is called a “tail call optimisation” (TCO).
Unfortunately the JavaScript language doesn’t implement this optimisation.
Note: apparently ECMAScript 6 has plans to implement TCO
The following is an example of a tail optimised version of the previous code…
function sum(x, y) {
function recur(a, b) {
if (b > 0) {
return recur(a + 1, b - 1);
} else {
return a;
}
}
return recur(x, y);
}
sum(1, 10); // => 11
This works because the recur
function ends up being the last invocation of the sum
function. Now the engine is able to construct internal code that executes recur
inside of a loop instead of via recursion.
As mentioned above, this optimisation has no effect in the current version of the JavaScript language (fingers crossed ES6 will implement TCO so the above code would work fine).
For now, we need an alternative solution.
The solution
The fix to this problem (in JavaScript at least) is to consume fewer stacks.
One way we could do this is to rewrite the code to not be recursive; so in other words: use a loop! Effectively we would be producing our own implementation of TCO (remember that TCO recognises a recursion and internally implements it as a loop).
The problem with using a loop is that it’s not as elegant as the recursive style we associate with functional programming.
Another solution is to use a type of functional pattern called “trampolining”. Let’s take a look at one implementation of it…
Trampolining
Note: the following code (and my explanation of it) assumes an understanding of the this
keyword and changing its context. If you’re unsure about what this means then read more about it here):
function trampoline(f) {
while (f && f instanceof Function) {
f = f();
}
return f;
}
function sum(x, y) {
function recur(x, y) {
if (y > 0) {
return recur.bind(null, x + 1, y - 1);
} else {
return x;
}
}
return trampoline(recur.bind(null, x, y));
}
sum(1, 10); // => 11
The reason the above code works is because we’ve replaced our functional recursion style with a loop.
In the above code we don’t create a deep nest of stacks because the sum
function only gets called once. The trampoline
function also is only called once; and the recur
function inside of sum
, although called multiple times, is called via a loop (again, no more than one stack is required at any one time).
The code breaks down like so…
- We call
sum(1, 10)
. - Our
sum
function ultimately returns a value. In this case whatever is returned by calling thetrampoline
function. - The
trampoline
function accepts a reference to a function as its argument (it’s important to understand it needs a reference to a function; doingreturn trampoline(recur(x, y))
wouldn’t work as that would end up passing the result of callingrecur(x, y)
to thetrampoline
function). - So we use
Function#bind
to pass a reference to therecur
function (we usenull
as thethis
binding because it doesn’t matter what the context the function executes in as we don’t use the function as a constructor). - When we execute
sum(1, 10)
we pass the reference to the internalrecur
method to thetrampoline
function. - The
trampoline
function checks if we passed a function and if so we execute the function and store its return value inside thef
variable. - If what we pass into the
trampoline
function isn’t a function then we assume it’s the end (i.e. accumulated) value and we return the value straight back to thesum
function which returns that value as the accumulated value. - Inside the
recur
function we check to see ify
is greater than zero, and if it is we modify thex
andy
values (like we did in the previous example) and then return another reference to therecur
function but this time with the modifiedx
andy
values. - Inside the
trampoline
function the newly referenced function is assigned to thef
variable and thewhile
loop on its next iteration checks to see iff
is indeed a function or not. If it is (which it would be in this instance) we again execute the function (which is nowrecur(2, 9)
) and the whole process starts again. - Until of course we reach the point where
y
is set to zero. Then when thetrampoline
function executes the function reference (recur
) and inside theif
conditional fails (asy
is now zero and no longer greater than zero) and so we return the accumulatedx
value; which then gets sent back and stored in thef
variable inside thetrampoline
function. - On the next iteration of the
while
loop,f
is now a value and not a function and so thewhile
loop ends and the accumulated value is returned to thesum
function which returns that as its accumulated value.
A more advanced solution
The previous code we just looked at works fine, but it required us to modify our code to work with the ‘trampoline’ pattern. This is a bit of a pain and means if we have lots of recursive code then it means each one might need subtle changes to accommodate this pattern.
The following code is an abstraction around that concept and it’ll allow us to keep our code exactly the same, with no modifications, and the abstraction will handle all of the work for us!
Let’s take a look at the implementation…
function tco(f) {
var value;
var active = false;
var accumulated = [];
return function accumulator() {
accumulated.push(arguments);
if (!active) {
active = true;
while (accumulated.length) {
value = f.apply(this, accumulated.shift());
}
active = false;
return value;
}
}
}
var sum = tco(function(x, y) {
if (y > 0) {
return sum(x + 1, y - 1)
}
else {
return x
}
});
sum(1, 100000) // => 100001
Here we’ve written a tco
function which simply wraps around our original code (so no modification required to our code).
Let’s now take a look at how it works.
Explanation
Be warned that understanding this code could take a while (lord knows it took me long enough to figure it out; that’s what prompted me to write this post, so I wouldn’t forget!).
If you don’t understand how the code works after the first time reading through these notes then it’s probably best to execute the above code via your browser of choice’s developer tools and use a debugger
statement to step through the code whilst reading this explanation…
Note: the above code was written by Irakli Gozalishvili (an engineer at Mozilla): https://gist.github.com/Gozala/1697037. As you can see above, it makes tail call optimising any function really easy.
- We store the result of calling
tco
(wrapped around our code) into thesum
variable. - The
sum
variable is now a function expression that when called (e.g.sum(1, 10)
) will execute theaccumulator
function thattco
returned. - The
accumulator
is a closure (meaning when we callsum
theaccumulator
will have access to the variables defined inside oftco
->value
,active
andaccumulated
; as well as our own code which is accessible via the identifierf
). - When we call
sum
for the first time (e.g.sum(1, 10)
) we indirectly executeaccumulator
. - The first thing we do inside of
accumulator
is push the arguments object (an Array-like object) into theaccumulated
Array (so theaccumulated
will now have a length of 1). - It’s worth knowing that the
accumulated
Array only ever has 1 item inside of it (as we’ll soon see). - The
active
variable by default isfalse
. So asaccumulator
is called for the first time, we end up inside theif
conditional, and then resetactive
totrue
. - Now we get to a
while
loop (we’re still calling functions recursively, as you’ll see in a moment; but this is a very clean loop compared to an ugly for loop with lots of operators/operands). - The
while
loop simply checks whether theaccumulated
Array has any content. If it does then we call thef
function and pass through the arguments that were insideaccumulated[0]
(for the first run of this function that would’ve been[1, 10]
). - When we pass the contents of
accumulated[0]
we use theshift
Array method to actually remove it from theaccumulated
Array so it now has a length of zero. - If we ignore for a moment the code inside the loop; on the next iteration of this loop the condition of
accumulated.length
will fail and so we would end up settingactive
tofalse
and ultimately return tosum
the value of the variablevalue
. - This is where it gets confusing, but hold tight!
- The
f
function is our own code. Our own code calls thesum
function (which indirectly calls theaccumulator
function).
The secret sauce to this entire chunk of code is coming up!
- If our code returns
x
then thewhile
loop will stop (I’ll explain why in a moment). - If our code can’t return
x
(notice our code has a conditional check to see ify
is greater than zero, if it is then we returnx
, otherwise…) we callsum
again and pass through modified arguments. - This time when we call
sum
we don’t get inside of theif
conditional becauseactive
is already set totrue
. - So the purpose of calling
sum
from inside our own code is simply to allow us to store the newly modified arguments inside theaccumulated
Array. - The
sum
function then returnsundefined
(as we weren’t able to move into theif
conditional) - The flow of control then throws us back into the original
while
loop (that’s still going, it hasn’t stopped yet) andundefined
is what’s stored into thevalue
variable. - At this point the
accumulated
Array has once again got some content and so thewhile
loop’s condition passes once more. - And that is where the recursive magic happens. At some point our code is going to call
sum
with they
argument set to zero meaning that when theaccumulator
function calls our code the conditiony > 0
will fail and so we return the value ofx
(which has been incremented each time along the way). - When that happens,
x
is returned and assigned as the value to thevalue
variable, and so our code never calledsum
and thus theaccumulated
Array is never modified again; meaning thewhile
loop condition inside theaccumulator
function will fail and thus theaccumulator
function will end returning whatever value is stored inside thevalue
variable (which in this example is the value ofx
).
Conclusion
There you have it, a quick run through of TCO, what it means and how potentially to work around it in JavaScript. Go forth and be productive!
Update 1
I’ve since discovered a nice explanation of trampolining in the JS Drip Newsletter (#65).
The following code is an example that is NOT tail-call optimised:
function isEvenNaive (num) {
if (num === 0) {
return true;
}
if (num === 1) {
return false;
}
return isEvenNaive(Math.abs(num) - 2);
}
isEvenNaive(10);
// => true
isEvenNaive(9);
// => false
isEvenNaive(99999);
// => InternalError: too much recursion
The following code snippet demonstrates the first step towards trampolining, which is to flatten the stack by requiring the user to manually execute the function over and over:
function isEvenInner (num) {
if (num === 0) {
return true;
}
if (num === 1) {
return false;
}
return function() {
return isEvenInner(Math.abs(num) - 2);
};
}
isEvenInner(8);
// => function() {
// return isEvenInner(Math.abs(num) - 2);
// };
isEvenInner(8)()()()();
// => true
The first thing to notice about our isEvenInner
function is that instead of directly calling itself again, it returns an anonymous function. That means each call to isEvenInner
gets resolved immediately, and doesn’t increase the size of the stack. It also means that we need a way to automatically invoke all of those anonymous functions that will get returned along the way. That’s where trampoline comes in.
The trampoline function effectively turns this recursive algorithm into something that is executed by a while
loop. As long as isEvenInner
keeps returning functions, trampoline will keep executing them. When we finally reach a non-function value, trampoline will return the result.
Now, finally, let’s see an example that automates the execution of the flatterned stack:
function trampoline (func, arg) {
var value = func(arg);
while(typeof value === "function") {
value = value();
}
return value;
}
trampoline(isEvenInner, 99999);
// => false
trampoline(isEvenInner, 99998);
// => true
var isEven = trampoline.bind(null, isEvenInner);
isEven(99999);
// => false