hngopher.com

       [HN Gopher] Tail Call Recursion in Java with ASM (2023)
       ___________________________________________________________________
        
       Tail Call Recursion in Java with ASM (2023)
        
       Author : hyperbrainer
       Score  : 78 points
       Date   : 2025-03-30 12:47 UTC (10 hours ago)
        
 (HTM) web link (unlinkedlist.org)
 (TXT) w3m dump (unlinkedlist.org)
        
       | cempaka wrote:
       | Very nice article demonstrating a neat use of ASM bytecode. The
       | Java language devs are also working on Project Babylon (code
       | reflection), which will bring additional techniques to manipulate
       | the output from the Java compiler:
       | https://openjdk.org/projects/babylon/articles/code-models
        
         | gavinray wrote:
         | This was delivered in JDK 24 as the "Class-File API"
         | 
         | https://openjdk.org/jeps/484
        
           | algo_trader wrote:
           | Can this improve/replace AspectJ and similar
           | instrumentations? We do lots of instruction level
           | modifications
        
       | droideqa wrote:
       | Cool, now ABCL can have TCO!
        
         | dapperdrake wrote:
         | Finally.
         | 
         | The ANTLR guys went through terrible contortions for their
         | parsers.
         | 
         | Never felt like working those details out for ABCL.
        
         | 1932812267 wrote:
         | This isn't a _general_ tail call optimization--just tail
         | recursion. The issue is that this won't support mutual tail
         | recursion.
         | 
         | e.g.:
         | 
         | (defun func-a (x) (func-b (- x 34)) (defun func-b (x) (cond
         | ((<= 0 x) x) ('t (func-a (-x 3))))
         | 
         | Because func-a and func-b are different (JVM) functions, you'd
         | need an inter-procedural goto (i.e. a tail call) in order to
         | natively implement this.
         | 
         | As an alternative, some implementations will use a trampoline.
         | func-a and func-b return a _value_ which says what function to
         | call (and what arguments) for the next step of the computation.
         | The trampoline then calls the appropriate function. Because
         | func-a and func-b _return_ instead of actually calling their
         | sibling, the stack depth is always constant, and the trampoline
         | takes care of the dispatch.
        
           | knome wrote:
           | Sounds like a manual form of clojures recur function.
           | 
           | https://clojuredocs.org/clojure.core/recur
        
             | 1932812267 wrote:
             | Clojure's loop/recur is specifically tail recursion like
             | scala's tailrec or the optimization described in the
             | blogpost. It doesn't use trampolines to enable tail calls
             | that aren't tail recursion.
        
       | bradley13 wrote:
       | Every compiler should recognize and optimize for tail recursion.
       | It's not any harder than most other optimizations, and some
       | algorithms are far better expressed recursively.
       | 
       | Why is this not done?
        
         | _old_dude_ wrote:
         | Parroting something i have heard at a Java conference several
         | years ago, tail recursion remove stack frames but the security
         | model is based on stack frames, so it has to be a JVM
         | optimization, not a compiler optimization.
         | 
         | I've no idea if this fact still holds when the security manager
         | will be removed.
        
           | smarks wrote:
           | The security manager was removed (well, "permanently
           | disabled") in Java 24. As you note, the permissions available
           | at any given point can depend on the permissions of the code
           | on the stack, and TCO affects this. Removal of the SM thus
           | removes one impediment to TCO.
           | 
           | However, there are other things still in the platform for
           | which stack frames are significant. These are referred to as
           | "caller sensitive" methods. An example is Class.forName().
           | This looks up the given name in the classloader of the class
           | that contains the calling code. If the stack frames were
           | shifted around by TCO, this might cause Class.forName() to
           | use the wrong classloader.
           | 
           | No doubt there are ways to overcome this -- the JVM does
           | inlining after all -- but there's work to be done and
           | problems to be solved.
        
             | thfuran wrote:
             | Is there? As you say, there's already inlining, and I don't
             | see how tco presents a harder case for that.
        
         | SkiFire13 wrote:
         | In general, tail recursion destroys stacktrace information,
         | e.g. if f calls g which tail calls h, and h crashes, you won't
         | see g in the stacktrace, and this is bad for debuggability.
         | 
         | In lower level languages there are also a bunch of other
         | issues:
         | 
         | - RAII can easily make functions that appear in a tail position
         | not actually tail calls, due to destructors implicitly running
         | after the call;
         | 
         | - there can be issues when reusing the stack frame of the
         | caller, especially with caller-cleanup calling conventions;
         | 
         | - the compiler needs to prove that no pointers to the stack
         | frame of the function being optimized have escaped, otherwise
         | it would be reusing the memory of live variables which is
         | illegal.
        
           | hyperbrainer wrote:
           | AFAIK Zig is the only somewhat-big and known low-level
           | language with TCO. Obviously, Haskell/Ocaml and the like
           | support and it are decently fast too, but system programming
           | languages they are not.
        
             | vlovich123 wrote:
             | For guarantee:
             | 
             | https://crates.io/crates/tiny_tco
             | 
             | https://crates.io/crates/tco
             | 
             | As an optimization my understanding is that GCC and LLVM
             | implement it so Rust, C, and C++ also have it implicitly as
             | optimizations that may or may not apply to your code.
             | 
             | But yes, zig does have a formal language syntax for
             | guaranteeing tail calls to happen at the language level
             | (which I agree with as the right way to expose this
             | optimization).
        
             | SkiFire13 wrote:
             | Zig's tco support is not much different than Clang's
             | `[[clang::musttail]]` in C++. Both have the big restriction
             | that the two functions involved are required to have the
             | same signature.
        
               | hyperbrainer wrote:
               | > Both have the big restriction that the two functions
               | involved are required to have the same signature.
               | 
               | I did not know that! But I am a bit confused, since I
               | don't really program in either language. Where exactly in
               | the documentation could I read more about this? Or see
               | more examples?
               | 
               | The language reference for @call[0] was quite unhelpful
               | for my untrained eye.
               | 
               | [0] https://ziglang.org/documentation/master/#call
        
               | SkiFire13 wrote:
               | Generally I also find Zig's documentation pretty lacking,
               | instead I try looking for the relevant issues/prs. In
               | this case I found comments on this issues [1] which seem
               | to still hold true. That same issue also links to the
               | relevant LLVM/Clang issue [2], and the same restriction
               | is also being proposed for Rust [3]. This is were I first
               | learned about it and prompted me to investigate whether
               | Zig also suffers from the same issue.
               | 
               | [1]: https://github.com/ziglang/zig/issues/694#issuecomme
               | nt-15674... [2]: https://github.com/llvm/llvm-
               | project/issues/54964 [3]: https://github.com/rust-
               | lang/rfcs/pull/3407
        
               | ufo wrote:
               | This limitation is to ensure that the two functions use
               | the exact same calling convention (input & output
               | registers, and values passed via stack). It can depend on
               | the particular architecture.
        
             | Thorrez wrote:
             | C++:
             | 
             | > All current mainstream compilers perform tail call
             | optimisation fairly well (and have done for more than a
             | decade)
             | 
             | https://stackoverflow.com/questions/34125/which-if-any-c-
             | com... (2008)
        
               | hyperbrainer wrote:
               | I couldn't actually figure out whether this TCO being
               | done "fairly well" was a guarantee or simply like Rust (I
               | am referring to the native support of the language, not
               | what crates allow)
        
             | johnisgood wrote:
             | Depends on what you mean by "systems programming", you can
             | definitely do that in OCaml.
        
             | pjmlp wrote:
             | "Unix system programming in OCaml"
             | 
             | https://ocaml.github.io/ocamlunix/ocamlunix.html
             | 
             | MirageOS
             | 
             | https://mirage.io/
             | 
             | House OS,
             | 
             | https://programatica.cs.pdx.edu/House/
             | 
             | Just saying.
        
           | chowells wrote:
           | I'll believe destroying stacktrace information is a valid
           | complaint when people start complaining that for loops
           | destroy the entire history of previous values the loop
           | variables have had. Tail recursion is equivalent to looping.
           | People should stop complaining when it gives them the same
           | information as looping.
        
             | ameliaquining wrote:
             | I mean, if I were doing an ordinary non-recursive function
             | call that just happened to be in tail position, and it got
             | eliminated, and this caused me to not be able to get the
             | full stack trace while debugging, I might be annoyed.
             | 
             | In a couple languages I've seen proposals to solve this
             | problem with a syntactic opt-in for tail call elimination,
             | though I'm not sure whether any mainstream language has
             | actually implemented this.
        
               | chowells wrote:
               | Language designers could keep taking ideas from Haskell,
               | and allow functions to opt in to appearing in stack
               | traces. Give the programmer control, and all.
        
         | vlovich123 wrote:
         | My bigger issue with tail call optimization is that you really
         | want it to be enforceable since if you accidentally deoptimize
         | it for some reason then you can end up blowing up your stack at
         | runtime. Usually failure to optimize some pattern doesn't have
         | such a drastic effect - normally code just runs more slowly. So
         | tail call is one of those special optimizations you want a
         | language annotation for so that if it fails you get a compiler
         | error (and similarly you may want it applied even in debug
         | builds).
        
         | javier2 wrote:
         | In theory, if all you do is implement algorithms, this sounds
         | fine. But most apps implement horrible business processes, so
         | what would one do with missing stacktraces? Maybe in languages
         | that can mark functions as pure.
        
       | 1932812267 wrote:
       | Scala has been using this technique for years with its
       | scala.annotation.tailrec annotation. Regardless, it's cool to see
       | this implemented as a bytecode pass.
        
         | gavinray wrote:
         | Kotlin as well, with the "tailrec" keyword, e.g. "tailrec fun
         | fibonacci()"
         | 
         | https://kotlinlang.org/docs/functions.html#tail-recursive-fu...
         | 
         | Kotlin also has a neat other tool, "DeepRecursiveFunction<T,
         | R>" that allows defining deep recursion that is not necessarily
         | tail-recursive.
         | 
         | Really useful if you wind up a problem that is most cleanly
         | solved with mutual recursion or similar:
         | 
         | https://kotlinlang.org/api/core/kotlin-stdlib/kotlin/-deep-r...
        
           | deepsun wrote:
           | Interesting, does it depend on Kotlin compiler or it can be
           | implemented in Java as well?
        
             | gavinray wrote:
             | The "DeepRecursiveFunction<T,R>" could be implemented in
             | Java. The Kotlin implementation leverages Kotlin's native
             | coroutines and uses continuations.
             | 
             | It'd require a bit of engineering to get something working
             | in native Java I'd imagine, even with the new JDK
             | Structured Concurrency API offering you a coroutines
             | alternative.
             | 
             | On the other hand, "tailrec" is a keyword and implemented
             | as a compiler optimization.
             | 
             | The closest I've seen in Java is a neat IntelliJ plugin
             | that has a transformation to convert recursive method calls
             | into imperative loops with a stack frame.
             | 
             | This transformation and resulting tool was the result of
             | someone's thesis, it's pretty cool:
             | 
             | https://github.com/andreisilviudragnea/remove-recursion-
             | insp...
        
       | fsckboy wrote:
       | the "lambda the ultimate" papers and the birth of scheme was a
       | loong time ago, so it grates on my ears to hear this topic
       | presented as "an optimization". Yes, it is sometimes an
       | optimization a compiler can make, but the idea is much better
       | presented as a useful semantic of a language.
       | 
       | in the same way that passing parameters to a subfunction
       | "creates" a special set of local variables for the subfunction,
       | the tail recursion semantic updates this set of local variables
       | in an especially clean way for loop semantics, allowing
       | "simultaneous assignment" from old values to new ones.
       | 
       | (yes, it would be confusing with side effected C/C++ operators
       | like ++ because then you'd need to know order of evaluation or
       | know not to do that, but those are already issues in those
       | languages quite apart from tail recursion)
       | 
       | because it's the way I learned it, I tend to call the semantic
       | "tail recursion" and the optimization "tail call elimination",
       | but since other people don't do the same it's somewhat pointless;
       | but I do like to crusade for awareness of the semantic beyond the
       | optimization. If it's an optimization, you can't rely on it
       | because you could blow the stack on large loops. If it's a
       | semantic, you can rely on it.
       | 
       | (the semantic is not entirely "clean" either. it's a bit of a
       | subtle point that you need to return straightaway the return
       | value of the tail call or it's not a tail call. fibonacci is the
       | sum of the current with the next so it's not a tail call unless
       | you somewhat carefully arrange the values you pass/keep around.
       | also worth pointing out that all "tail calls" are up for
       | consideration, not just recursive ones)
        
         | ameliaquining wrote:
         | I mean, in the particular case demonstrated in this blog post
         | it can only be an optimization, because semantically
         | guaranteeing it would require language features that Java
         | doesn't have.
        
       ___________________________________________________________________
       (page generated 2025-03-30 23:00 UTC)