Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Show HN: Tail Recursion Optimization for the JVM (github.com/sipkab)
107 points by sipkab on April 22, 2020 | hide | past | favorite | 38 comments


I've been hoping for tail call elimination[1] on the JVM for a long time, and historically it has been blocked by a security requirement[2]. Changing the number of stack frames between two calls would cause an error. Apparently that check is no longer there so TCE is possible!

[1] Using the word 'optimization' suggests that eliminating tail calls is optional, but for it to be really useful it has to be guaranteed. If you're writing in recursive style, your program becomes instantly incorrect if TCE goes away because many acceptable inputs will start causing stack overflows.

[2] See this answer: https://softwareengineering.stackexchange.com/a/272086


> in jdk classes [...] there are a number of security sensitive methods that rely on counting stack frames between jdk library code and calling code to figure out who's calling them.

That's... lame. BTW, tail code elimination could keep a count of tail calls for this. It wouldn't work so well for mutual recursion, but maybe that would suffice. In any case, nothing should be so dependent on stack traces!


How to do it for mutual recursion is sketched in my other comment. Java's stack inspection doesn't inherently require counting stack frames -- it's a choice. I agree it's a bad model anyway.


Java's security manager model is fairly lame in general.


Yes. And JAAS won't die even though applets did.


I know for JavaScript it was not implemented due to issues around debugging. I think for that reason if tail call optimization was done, you would need some sort of control over it such as an annotation to turn it on or off per method, maybe even a global flag for it.


It's actually possible to implement calling in such a way that TCO is compatible with Java's security architecture (or at least as it was back in the late 90s when I cared about this). IIRC you'd have a finite state machine along with your calls, which would transit when you're making a tail call of the sort which would matter to the security checks. In other words, the amount of security state needed by a tail-call chain was bounded, if you implemented the spec in a tail-call conscious way.

I thought stack inspection was a bad idea anyway, but you could implement it together with TCO.

[edited because truncated by misclick]


I remember the rationale behind clojure's recur - if you want to use it, your recur calls needs to be tail call optimizable, otherwise it fails to compile (sorry was long time since I've done any clojure, so could be wrong here).

At first it was - why such requirement? Then it hit me - this is not bad - it's actually enforcing that either you would succeed doing that, and the compiler would also do it, or not. So there is a strong guarantee there.


Unfortunately Clojure’s recur only works for self tail calls.


This does not appear to be general TCE (i.e. between two possibly different functions), but strictly recursive TCE. So with mutual recursion you could still overflow the stack. Nevertheless it seems like a big step forward.


i recently learned about project loom that's addressing tail call elimination among other things, exciting stuff

https://wiki.openjdk.java.net/display/loom/Main


Now only if pron would tell us when its ready!


There is an updated EA build available.


Doesn't Scala already do this?


Yeah, but you need to explicitly annotate your functions with @rec.


Not sure if this is accurate. You can enforce @tailrec if you want, but I believe it will try to in any case, even if it is not annotated.


This is correct. @tailrec will simply cause a compile failure if the compiler DOESN’T optimize the function, but the compiler does still optimize tail recursion without the annotation.

It’s a weird but helpful annotation. For example, Scala won’t optimize methods that can be overridden (non-final and public/protected), which is easy to forget. So the annotation is a nice check/confirmation that the compiler is doing what you expect.


This is exciting. I'm guessing since it applies to bytecode it works for clojure as well? That would mean we'll be able to use (func (when cond function)) instead of (loop [] (recur)).


The use of recur in Clojure is by design. It would have been quite trivial to optimize this type of self recursion, but since it doesn't do general tail call optimization, they introduced the recur keyword. See also [1]. By the way, you don't need to use loop in Clojure either, it also works with functions.

[1]. https://clojure.org/about/functional_programming#_recursive_...


This library just does self-calls, doesn't optimize(and can't cause of JVM limitations) calls to others functions (mutual recursion).


It works on bytecode, so yes, it should apply to all JVM based languages. (Though I haven't done tests for them)


Could/Should these be added to the Graal project as compiler optimizations?


Definitely not by default.

This optimization messes with the stack trace, so if you throw an exception in one of the recursively called methods, it will only show up only once in the stack trace. This may very well cause confusion in cases where the developer is not aware that such optimizations have been performed.

However, you can still perform the optimization on your Java bytecode classes, and then pass them to Graal. (If I understand Graal correctly.)


Code gets optimized out of existence all the time and isn't a problem for exception handling code and debugging tools, I don't see why this would be any different.

http://cesquivias.github.io/blog/2015/01/15/writing-a-langua...


This optimization is simply a goto to the start of the function, and reassignment of the arguments. It doesn't prevent exception handling in any way. It doesn't interfere with debugging tools.

The point is that when your program throws an exception, and prints the stacktrace, you will only see a single stack frame for your method. Even if it recursively descended into itself multiple times. This is counter-intuitive as you don't see the evidence of the recursive calls in the stack trace. If you're unaware that tail recursion optimization was performed, you won't know why it seems like your method was only called once.

The program flow stays the same, only the diagnostic information of the program changes.


I spend a small amount of time programming in Scheme as a hobby, and honestly I would find the presence of a single stack frame resulting from a "recursive" call like this no more confusing than having a single stack frame containing a for loop. In both cases the state of the program at the point of the exception is specified by the variable bindings in a single stack frame rather than the depth of the call stack.


Sure, but it wouldn't be that hard to do something like <677 tail calls elided> from the stack trace.


Well, yes, it would be that hard.

The whole idea with tail call elimination is that a stack trace shouldn't be able to see the frames for the tail calls, because the tail call reuses the frame. To add support for tracking the tail call stack frames, you would need to modify the compiler to output code that specifically keeps track of "elided tail call frames", and you would need to update the stack trace traversal code to be able to recognize the extra debug information.

Sure, it's something you could build into a new toolchain, but adding something like this to the JVM would be harder due to the constraints already placed on the JVM. Furthermore I don't know of such support in any toolchains, not even for Lua or Scheme, both of which guarantee tail call optimization. (If anybody has an example, please share!)


Lua's stack trace includes an indication that 1 or more tail calls happened, but not how many

  $ cat tailrec.lua
  function f(g,n)
    if n == 0 then
      error("oh no")
    end
    return g(g,n-1)
  end
  f(f,5)
  $ lua tailrec.lua
  lua: tailrec.lua:3: oh no
  stack traceback:
   [C]: in function 'error'
   tailrec.lua:3: in function 'f'
   (...tail calls...)
   tailrec.lua:7: in main chunk
   [C]: in ?


JavaScriptCore keeps track of frames, even for JavaScript proper tail calls: https://webkit.org/blog/6240/ecmascript-6-proper-tail-calls-...


The essential words in that article are...

> There are some non-standard ECMAScript features in JavaScript that work differently in the presence of PTC.

Java has a strict spec, and the relevant methods which would break aren’t non-standard like they are in JavaScript.

If you’re willing to change the spec (I think Loom does) then yeah, but you can’t implement it as an ‘optimisation’ until then, because it’s not an optimisation if it changes behaviour.


What would Java's be? Is there an interface besides Thread.getStackTrace()?


It’s also the format of the backtrack - I think that’s covered by the TCK.


The point of the tail call is to not have to maintain the activation that you are tail calling from.

So we’d need to reconstruct some information about the activation. How are you proposing we do that? From what information?

Are you suggesting we could store some metadata perhaps? Can you suggest a mechanism for the metadata we should use? There’s a tricky constraint before you suggest an idea! It has to use constant storage, because making storage not grow with the number of calls is the whole point in the first place.


JavaScriptCore has probably the best answer for this question, and it's the hilariously-named "Shadow Chicken" mechanism.

JSC's call stack is the C stack, and this stack has real TCO. There is a separate, heap-allocated "shadow" stack which records calls (but not returns). A single recycled C-stack frame may correspond to multiple shadow-stack calls, but between the two, a debugger can recover many tail calls.

It's best effort. The GC may collect frames if it chooses to. Highly optimized code won't touch the shadow stack. etc. But it's good enough for a debugger.

It's named "Shadow Chicken" because it's a shadow stack inspired by CHICKEN scheme, which uses the legendary Cheney-on-the-MTA strategy.

https://trac.webkit.org/changeset/199076/webkit


Haskell has stack traces despite pervasive TCE.


Project Loom is added TCE to the JVM. It is apparently very close to release.

https://openjdk.java.net/projects/loom/


It sounds to be a great idea for an IDE plugin. Looking at you, Intellij IDEA.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: