Java Language Deep recursion is problematic in Java


Example

Consider the following naive method for adding two positive numbers using recursion:

public static int add(int a, int b) {
    if (a == 0) {
        return b;
    } else {
        return add(a - 1, b + 1);  // TAIL CALL
    }
}

This is algorithmically correct, but it has a major problem. If you call add with a large a, it will crash with a StackOverflowError, on any version of Java up to (at least) Java 9.

In a typical functional programming language (and many other languages) the compiler optimizes tail recursion. The compiler would notice that the call to add (at the tagged line) is a tail call, and would effectively rewrite the recursion as a loop. This transformation is called tail-call elimination.

However, current generation Java compilers do not perform tail call elimination. (This is not a simple oversight. There are substantial technical reasons for this; see below.) Instead, each recursive call of add causes a new frame to be allocated on the thread's stack. For example, if you call add(1000, 1), it will take 1000 recursive calls to arrive at the answer 1001.

The problem is that the size of Java thread stack is fixed when the thread is created. (This includes the "main" thread in a single-threaded program.) If too many stack frames are allocated the stack will overflow. The JVM will detect this and throw a StackOverflowError.

One approach to dealing with this is to simply use a bigger stack. There are JVM options that control the default size of a stack, and you can also specify the stack size as a Thread constructor parameter. Unfortunately, this only "puts off" the stack overflow. If you need to do a computation that requires an even larger stack, then the StackOverflowError comes back.

The real solution is to identify recursive algorithms where deep recursion is likely, and manually perform the tail-call optimization at the source code level. For example, our add method can be rewritten as follows:

public static int add(int a, int b) {
    while (a != 0) {
       a = a - 1;
       b = b + 1;
    }
    return b;
}

(Obviously, there are better ways to add two integers. The above is simply to illustrate the effect of manual tail-call elimination.)

Why tail-call elimination is not implemented in Java (yet)

There are a number of reasons why adding tail call elimination to Java is not easy. For example:

  • Some code could rely on StackOverflowError to (for example) place a bound on the size of a computational problem.
  • Sandbox security managers often rely on analyzing the call stack when deciding whether to allow non-privileged code to perform a privileged action.

As John Rose explains in "Tail calls in the VM":

"The effects of removing the caller’s stack frame are visible to some APIs, notably access control checks and stack tracing. It is as if the caller’s caller had directly called the callee. Any privileges possessed by the caller are discarded after control is transferred to the callee. However, the linkage and accessibility of the callee method are computed before the transfer of control, and take into account the tail-calling caller."

In other words, tail-call elimination could cause an access control method to mistakenly think that a security sensitive API was was being called by trusted code.