[HN Gopher] How to Use the Foreign Function API in Java 22 to Ca...
       ___________________________________________________________________
        
       How to Use the Foreign Function API in Java 22 to Call C Libraries
        
       Author : pjmlp
       Score  : 133 points
       Date   : 2024-05-06 08:40 UTC (2 days ago)
        
 (HTM) web link (ifesunmola.com)
 (TXT) w3m dump (ifesunmola.com)
        
       | xyst wrote:
       | What's the use case here? Developing drivers with Java?
        
         | invalidname wrote:
         | Invoking native code has always been necessary in Java. In the
         | past it was done via JNI which has many issues. These new APIs
         | solve the issues and simplify the API. The use case is
         | interacting with anything that isn't written in Java.
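
For readers who have not seen the new API, here is a minimal sketch of a native call: it invokes the C standard library's `strlen` from Java. It assumes Java 22 or later and a 64-bit platform where `size_t` can be modelled as a Java `long`; the class name `StrlenDowncall` is made up for this illustration.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

// Hypothetical example class; not taken from the article or the thread.
final class StrlenDowncall {
    static long strlen(String text) {
        Linker linker = Linker.nativeLinker();
        // Look up the symbol in the default (C standard library) lookup.
        MemorySegment symbol = linker.defaultLookup().find("strlen").orElseThrow();
        // Assumption: size_t strlen(const char*) maps to (ADDRESS) -> JAVA_LONG.
        MethodHandle handle = linker.downcallHandle(
                symbol, FunctionDescriptor.of(ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));
        try (Arena arena = Arena.ofConfined()) {
            // Copies the string into native memory as a NUL-terminated UTF-8 string.
            MemorySegment cString = arena.allocateFrom(text);
            return (long) handle.invokeExact(cString);
        } catch (Throwable t) {
            throw new IllegalStateException("strlen call failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(strlen("hello"));
    }
}
```

The native memory for the string is released when the arena is closed at the end of the try block.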
        
           | xtracto wrote:
           | Blast from the past! I remember doing JNI integration in Java
           | around 2003! It's been so long I don't remember details but
           | you had to declare some interfaces in java, then some
           | middleware .h or .c and then call the native library iirc.
           | 
           | Glad to see things are progressing!!
        
         | neonsunset wrote:
         | Same use case as to why .NET has low/zero-cost FFI.
         | 
         | This is similar, except more boilerplate and much, much slower.
        
           | pron wrote:
           | The FFM downcalls in OpenJDK compile down to argument
           | shuffling + a CALL instruction (in "critical" linker mode),
           | i.e. the same machine code gcc/clang would generate for a
           | call from a C program.
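
The "critical" mode mentioned above appears to correspond to the `Linker.Option.critical` linker option in Java 22. A minimal sketch of requesting it follows, using the C library's `abs` as a stand-in for a very short native function; the class name is hypothetical, and nothing here measures or implies a particular speedup.

```java
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

// Hypothetical example class illustrating the critical linker option.
final class CriticalDowncall {
    private static final MethodHandle ABS;

    static {
        Linker linker = Linker.nativeLinker();
        ABS = linker.downcallHandle(
                linker.defaultLookup().find("abs").orElseThrow(),
                FunctionDescriptor.of(ValueLayout.JAVA_INT, ValueLayout.JAVA_INT),
                // false: the native function is not given access to Java heap segments.
                Linker.Option.critical(false));
    }

    static int abs(int value) {
        try {
            return (int) ABS.invokeExact(value);
        } catch (Throwable t) {
            throw new IllegalStateException("abs call failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(abs(-7));
    }
}
```

The option is meant for functions that return almost immediately and never call back into Java, so it is not a general-purpose switch.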
        
             | neonsunset wrote:
             | This is what it is compiled to in .NET[0] today more or
             | less[1]. What does OpenJDK compile these to? (edit: misread
             | as _could_ compile. Hmm, I wonder how much of a difference
             | there will be in average FFI cost with newer APIs vs direct
             | calls)
             | 
             | [0] Objects that need pinning are pinned (by toggling a bit
             | in object header), byrefs are pinned by simply storing them
             | on the stack, arguments that need marshalling involve
             | calling corresponding marshalling code. That code can
             | allocate intermediate data on heap, on stack or call
             | NativeMemory.Alloc/.Free C-style.
             | 
             | [1] Overhead can be further reduced by 1. annotating FFI
             | calls with [SuppressGCTransition] which saves on possible
             | arguments stack spills and GC helper call, replacing the
             | call with a single flag check and optional call into GC in
             | epilog, 2. in NativeAOT, p/invokes can be "direct" which
             | saves on initialization checks and indirections (though
             | they are reduced in JIT as it can bake data directly into
             | codegen after static init has finished on recompilation).
             | This has a tradeoff as system's dynamic loader will be used
             | at application startup instead of regular lazy
             | initialization and 3. direct p/invokes can be upgraded to
             | static linking, which transforms them into direct calls
             | identical to regular C calls save for the same GC flag
             | check in post-condition. This comes with compiling .NET
             | executables and libraries into a single statically linked
             | binary (well, statically linked for the native dependencies
             | the user has opted into linking this way).
        
           | pjmlp wrote:
           | I still have some hopes that it will evolve towards a
           | P/Invoke like experience.
           | 
           | While a step closer to Valhalla, the whole dev experience is
           | still quite lacking versus what .NET offers.
           | 
           | Currently it is too much like making direct use of
           | InteropServices.
        
             | pron wrote:
             | > I still have some hopes that it will evolve towards a
             | P/Invoke like experience.
             | 
             | Doubtful, given that this is something we worked hard to
             | avoid. To be efficient, a P/Invoke-like model places
             | restrictions on the runtime, which inhibits optimisation
             | and flexibility and this cost is worth it only when native
             | calls are relatively common. In Java they are rare and
             | easily abstracted away, so we opted for a model that offers
             | full control without giving up on abstraction, given that
             | only a very small number of experts (<1%) would directly
             | write native calls and then hide them as implementation
             | details. I'm not saying this approach is the right one for
             | all languages, but it's clearly the right one for Java
             | given the frequency of native calls and who makes them.
             | 
             | Of course, you can wrap FFM with a higher-level P/Invoke-
             | like mechanism, but it won't give you as much control.
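
To make the idea of wrapping FFM in a higher-level, declaration-driven mechanism concrete, here is a rough sketch that binds a Java interface to C functions of the same name through a dynamic proxy. Everything in it (`NativeBinder`, `LibC`, the supported type set) is hypothetical and assumes Java 22; it is not how any existing library does it, and it only handles `int`, `long` and `double` signatures.

```java
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

// Hypothetical interface: each method name must match a C symbol.
interface LibC {
    int abs(int value);
    int toupper(int character);
}

// Hypothetical binder; a sketch, not a production design.
final class NativeBinder {
    // Maps a small set of Java primitive types to C-compatible layouts.
    private static MemoryLayout layoutOf(Class<?> type) {
        if (type == int.class) return ValueLayout.JAVA_INT;
        if (type == long.class) return ValueLayout.JAVA_LONG;
        if (type == double.class) return ValueLayout.JAVA_DOUBLE;
        throw new IllegalArgumentException("Unsupported type: " + type);
    }

    static <T> T bind(Class<T> iface) {
        Linker linker = Linker.nativeLinker();
        Map<Method, MethodHandle> handles = new HashMap<>();
        for (Method method : iface.getMethods()) {
            Class<?>[] params = method.getParameterTypes();
            MemoryLayout[] argLayouts = new MemoryLayout[params.length];
            for (int i = 0; i < params.length; i++) {
                argLayouts[i] = layoutOf(params[i]);
            }
            FunctionDescriptor descriptor =
                    FunctionDescriptor.of(layoutOf(method.getReturnType()), argLayouts);
            MemorySegment symbol =
                    linker.defaultLookup().find(method.getName()).orElseThrow();
            handles.put(method, linker.downcallHandle(symbol, descriptor));
        }
        // Only interface methods are handled; Object methods such as toString are not.
        Object proxy = Proxy.newProxyInstance(
                iface.getClassLoader(),
                new Class<?>[] {iface},
                (self, method, args) -> handles.get(method)
                        .invokeWithArguments(args == null ? new Object[0] : args));
        return iface.cast(proxy);
    }

    public static void main(String[] args) {
        LibC libc = bind(LibC.class);
        System.out.println(libc.abs(-5));
    }
}
```

The boxing and reflective dispatch in the proxy illustrate the loss of control mentioned above: the convenience layer decides the calling path, not the author of the binding.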
        
               | pjmlp wrote:
               | Well, for developers like myself that feel at home with
               | JNI, the current development experience, even with
               | jpackage, is too much to ask for.
               | 
               | I would rather keep writing C++ with JNI than endure the
               | current boilerplate, especially if I already need to
               | manually create header files to feed into jpackage for
               | basic stuff like struct definitions, which I don't feel
               | like writing by hand.
               | 
               | As for performance, here I agree with neonsunset: unless
               | we see TechEmpower-benchmark-level evidence of Panama
               | beating P/Invoke, it is pretty much theoretical stuff at
               | the expense of developer convenience.
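
For a sense of what a hand-written struct definition involves, here is a sketch of a layout for a hypothetical C struct `struct Point { int x; int y; };`, with field access through var handles. It assumes Java 22, where layout-derived var handles take a segment plus a base offset; the struct and the class name are invented for the example.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.StructLayout;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.VarHandle;

// Hypothetical mirror of: struct Point { int x; int y; };
final class PointStruct {
    static final StructLayout POINT = MemoryLayout.structLayout(
            ValueLayout.JAVA_INT.withName("x"),
            ValueLayout.JAVA_INT.withName("y"));

    // Var handles take (segment, base offset) coordinates in Java 22.
    static final VarHandle X = POINT.varHandle(PathElement.groupElement("x"));
    static final VarHandle Y = POINT.varHandle(PathElement.groupElement("y"));

    // Writes both fields into native memory and reads them back.
    static int sumOfFields(int x, int y) {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment point = arena.allocate(POINT);
            X.set(point, 0L, x);
            Y.set(point, 0L, y);
            return (int) X.get(point, 0L) + (int) Y.get(point, 0L);
        }
    }

    public static void main(String[] args) {
        System.out.println(POINT.byteSize());
        System.out.println(sumOfFields(3, 4));
    }
}
```

A segment allocated this way can be passed to a downcall handle wherever the C function expects a `struct Point *`.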
        
               | _old_dude_ wrote:
               | s/jpackage/jextract/g
        
               | pron wrote:
               | We can't tailor every feature to the widely disparate
               | preferences of so many developers nor do we try to
               | convince every last developer of the merit of our
               | approach -- this is both impractical and a losing
               | strategy. Rather, we rely on our experience designing a
               | highly successful language and platform, and consult with
               | companies -- each employing thousands of Java developers
               | -- and authors of some of the most popular relevant Java
               | libraries to ensure that we meet their requirements. Of
               | course, we also look at what other languages have done
               | and the tradeoffs they've accepted (some of which may not
               | be appropriate for Java [1]), but there are always many
               | possible designs and we don't adopt one from a less
               | successful language _just_ because it, too, has its fans.
               | 
               | I would encourage those who think that we're
               | _consistently_ making suboptimal choices for Java
               | compared to choices made by significantly less successful
               | languages to consider whether it is possible that their
               | preferences are not aligned with those of the software
               | market at large. Java is and aims to continue being the
               | world's most popular language for serious server
               | software, and that requires tailoring designs to a very
               | large audience.
               | 
               | I always notice a certain lack of respect on forums such
               | as HN for the world's most consistently successful and
               | popular languages -- JS, Java, and Python. Different
               | programmers have different preferences and I'm all for
               | rooting for the underdog now and again, but you simply
               | cannot consistently make wrong decisions over a very long
               | period of time and yet consistently win. What we do may
               | not be everyone's cup of tea (no language is), but it is
               | clearly that of a whole lot of people. We work to offer
               | value to them.
               | 
               | [1]: E.g. the design of native interop has significantly
               | impacted that of user-mode threads (or lack thereof:
               | https://github.com/dotnet/runtimelab/issues/2398) in both
               | .NET and Go, and we weren't willing to make such
               | tradeoffs in either performance or programming model.
        
               | pjmlp wrote:
               | I can say that in my bubble we reach for Java because of
               | Spring, AEM and Android.
               | 
               | That is it; other use cases have other programming
               | stacks.
               | 
               | As such, our native libraries are written to be consumed,
               | at the very least, across .NET (P/Invoke, C++/CLI, COM),
               | Java (JNI), nodejs (C++ addons), and Swift.
               | 
               | So to move the existing development workflow from JNI to
               | Panama, it must be an easy sell why we should budget
               | rewrites to start with.
               | 
               | Also, regarding "hate": if all decisions were that great,
               | there would be no need to create a new library support
               | group to help the Java ecosystem actually move forward
               | and adopt new Java versions, as I learned from
               | JFokus-related content.
        
               | pron wrote:
               | You shouldn't! We're not trying to "sell" any rewrite
               | from JNI to FFM. Since FFM is both significantly easier
               | to use and offers better performance, most people would
               | choose to write _new_ interop code with FFM; that is an
               | easy sell. But that's not to say that these benefits
               | justify a rewrite of _existing_ code, and we have no plan
               | to remove JNI. JNI and FFM can coexist in the same program
               | (and even in the same class). However, we are about to
               | place the same protections on JNI as those we have on FFM
               | to ensure that Java programs are free of undefined
               | behaviour by default, and that modules that may introduce
               | undefined behaviour are clearly acknowledged by the
               | application so that the application owners may give them
               | closer scrutiny if they wish [1].
               | 
               | To elaborate just a bit more on what I wrote in my
               | previous comment, to get a straightforward interop with C
               | you need to place certain restrictions on the runtime
               | which limit your ability to implement certain
               | abstractions such as moving GCs and user-mode threads.
               | Because native interop requires special care anyway due
               | to native memory management, which makes it significantly
               | more complex than ordinary code and so less suitable for
               | direct exposure to application developers -- so it's best
               | done by experts in the area -- and on top of that native
               | calls in Java aren't common, we decided not to sacrifice
               | the runtime in favour of more direct interop. As a
               | result, native interop is somewhat more elaborate to
               | code, but as it requires some special expertise and so
               | should be hidden away from application developers anyway,
               | we decided it's better to place the extra burden on the
               | experts doing the interop rather than trade off runtime
               | capabilities and performance. We think this is the better
               | tradeoff for Java. Consequently, we have both compacting
               | collectors and no performance penalty for native calls on
               | virtual threads. Other languages made whatever tradeoffs
               | they thought were right for them, but they did very
               | clearly sacrifice something.
               | 
               | [1]: https://openjdk.org/jeps/472
        
               | cesarb wrote:
               | > [...] and this cost is worth it only when native calls
               | are relatively common. In Java they are rare and easily
               | abstracted away, [...] but it's clearly the right one for
               | Java given the frequency of native calls [...]
               | 
               | Native calls are rare in Java _because they're such a
               | pain_. If it wasn't so hard to do native calls in Java,
               | it would be common even for non-experts to make use of
               | non-Java libraries.
        
               | pron wrote:
               | I don't think so, given that there are more popular Java
               | libraries than popular libraries with a C ABI. There is a
               | small number of very popular C libraries that result in
               | the majority of native call uses. But in any event,
               | calling native libraries in Java is now no longer a pain
               | thanks to FFM (and jextract [1]) so we'll see.
               | 
               | Note that interaction with native libraries often
               | requires a more careful management of native memory that,
               | though much easier now with FFM, is still significantly
               | trickier (and more dangerous in terms of introducing
               | undefined behaviour) than interacting with Java code
               | regardless of how that interaction is declared in code.
               | In Java, as in Python, interaction with native code -- in
               | the vast majority of cases -- is best encapsulated inside
               | a Java library and not often directly exposed to
               | application programmers.
               | 
               | [1]: https://github.com/openjdk/jextract
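
As a small illustration of the native-memory management discussed above, the sketch below shows the two checks an FFM arena performs on Java-side access: use after the arena is closed, and access outside the allocated bounds. It assumes Java 22; the class and method names are made up for the example.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

// Hypothetical example class demonstrating arena lifetime and bounds checks.
final class ArenaLifetime {
    // Returns true if reading a segment after its arena is closed is rejected.
    static boolean accessAfterCloseFails() {
        MemorySegment segment;
        try (Arena arena = Arena.ofConfined()) {
            segment = arena.allocate(ValueLayout.JAVA_INT);
            segment.set(ValueLayout.JAVA_INT, 0, 42);
        }
        try {
            segment.get(ValueLayout.JAVA_INT, 0);
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    // Returns true if reading past the end of a 4-byte segment is rejected.
    static boolean outOfBoundsFails() {
        try (Arena arena = Arena.ofConfined()) {
            MemorySegment segment = arena.allocate(ValueLayout.JAVA_INT);
            segment.get(ValueLayout.JAVA_INT, 4);
            return false;
        } catch (IndexOutOfBoundsException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(accessAfterCloseFails());
        System.out.println(outOfBoundsFails());
    }
}
```

These checks cover accesses made from Java; what a native function does with a pointer it has been handed is outside their reach, which is where the extra care comes in.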
        
       | marginalia_nu wrote:
       | What I'm missing is a model for building/distributing those C
       | libraries with a java application.
       | 
       | Every ffi example I've found seems to operate on the assumption
       | that you want to invoke syscalls or libc, which (with the
       | possible exception of things like madvise and aioring) Java
       | already mostly has decent facilities to interact with even
       | without native calls.
        
         | ruslan_talpa wrote:
         | Put them in a jar?
        
         | pjmlp wrote:
         | You do it the standard way, package them inside the jar file.
        
           | marginalia_nu wrote:
           | Oh, does this actually work?
           | 
           | I was under the assumption that it was dynamically linking the
           | library with the OS dynamic linker, which in no OS I'm aware
           | of is capable of loading libraries inside of zip files.
           | 
           | Not sure where I got that notion. Maybe I was overthinking
           | this.
        
             | zten wrote:
             | Yes. Check out a library like zstd-jni. You'll find native
             | libraries inside it. It'll load from the classpath first,
             | and then ask the OS linker to find it.
        
               | marginalia_nu wrote:
               | Sounds promising.
               | 
               | I have some extremely unwieldy off-heap operations
               | currently implemented in Java (like quicksort for 128 bit
               | records) that would be very nice to offload as FFI calls
               | to the corresponding single-line C++ function.
        
               | neonsunset wrote:
               | Why not give C# a try instead? It has everything you ask
               | for and then some.
        
               | maksut wrote:
               | I'd like to learn how they do it. Because last time I
               | looked at this, the suggested solution was to copy the
               | binaries from the classpath (e.g. the jar) into a temporary
               | folder and then load them from there. It feels icky :)
        
               | renewiltord wrote:
               | EDIT: Disregard. I am wrong. Original below.
               | 
               | You can just load as a resource. We do this internally
               | since much of network stack is C. But we use JNI because
               | code is older than Java 22.
        
               | maksut wrote:
               | You made me search it again. And still I don't see how
               | that's possible. `Runtime.load` requires a regular file
               | with an absolute path[0].
               | 
               | Stackoverflow is full of "copy it into a temp file"
               | solutions. ChatGPT keeps saying "sorry" but still insists
               | on copying it into a temp file :)
               | 
               | [0] - https://docs.oracle.com/en%2Fjava%2Fjavase%2F22%2Fd
               | ocs%2Fapi...
        
               | renewiltord wrote:
               | Embarrassing of me to give you the wrong answer. I went and
               | checked my old code and:
               | 
               |     new FileOutputStream(tmpFile)
               | 
               | Apologies.
        
               | zten wrote:
               | Yep, you're right, they do exactly that. Apologies for
               | the confusion.
               | 
               | Decompiled class file:
               | 
               |     try {
               |         var4 = File.createTempFile("libzstd-jni-1.5.0-4",
               |                 "." + libExtension(), var0);
               |         var4.deleteOnExit();
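
The "copy to a temp file, then load" pattern that the thread converges on can be sketched roughly as follows. The class and method names are hypothetical, and the resource path is whatever a given library chooses; only standard JDK calls are used.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; a sketch of the pattern, not any library's actual code.
final class NativeExtractor {
    // Copies the library bytes to a temp file that is deleted when the JVM exits.
    static Path extractToTemp(InputStream library, String prefix, String suffix) {
        try (InputStream in = library) {
            Path target = Files.createTempFile(prefix, suffix);
            target.toFile().deleteOnExit();
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Extracts a classpath resource and loads it; System.load needs an absolute path.
    static void loadFromClasspath(String resourcePath, String prefix, String suffix) {
        InputStream in = NativeExtractor.class.getResourceAsStream(resourcePath);
        if (in == null) {
            throw new IllegalArgumentException("Resource not found: " + resourcePath);
        }
        Path extracted = extractToTemp(in, prefix, suffix);
        System.load(extracted.toAbsolutePath().toString());
    }
}
```

With FFM, the extracted path could instead be handed to `SymbolLookup.libraryLookup(Path, Arena)`, which ties the library's lifetime to an arena.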
        
             | brabel wrote:
             | I remember using Sqlite Java and not having to install
             | sqlite on the image. Then I looked inside the Sqlite-java's
             | jar and they just packed the sqlite binaries for the
             | different OSs in the jar!!
        
               | DannyB2 wrote:
               | I once (2016 ish) used a serial-port library for Java.
               | Needed to be cross platform desktop app for Linux,
               | Windows and Mac (in that order, all on x86/64). And it
               | was. I have forgotten the name of the library project I
               | included, but it included DLL binaries for the platforms
               | we were targeting.
        
           | fire_lake wrote:
           | Is there a solution when the binaries are 500mb+ per
           | platform?
        
             | pjmlp wrote:
             | People seem pretty happy when Go and Rust do the same with
             | static linking, advocating how great it happens to be.
        
         | pron wrote:
         | The recommended distribution model for Java applications is a
         | jlinked runtime image [1], which supports including native
         | libraries in the image.
         | 
         | [1]: Technically, this is the only distribution model because
         | all Java runtimes as of JDK 9 are created with jlink, including
         | the runtime included in the JDK (which many people use as-is),
         | but I mean a custom runtime packaged with the application.
        
           | maksut wrote:
           | Is that still true when distributing libraries?
        
             | pron wrote:
             | If you distribute libraries as jmod files, which few
             | libraries do (in that case, jlink would automatically
             | extract the native libraries and place them in the
             | appropriate location).
        
             | brabel wrote:
             | Absolutely not. jlink is used to distribute applications
             | (it includes your code, the Java libs you use, i.e. their
             | jars, and the trimmed-down JVM with the modules you're
             | using so that your distribution is not so big - typically
             | around 30MB).
             | 
             | Java libraries are still obtained from Maven repositories
             | via Maven/Gradle/Ant/Bazel/etc.
        
         | sedro wrote:
         | Native libraries are typically packaged inside a jar so that
         | everything works over the existing build and dependency
         | management systems.
         | 
         | For example, each of these jars named "native-$os-$arch.jar"
         | contains a .dll/.so/.dylib:
         | https://repo1.maven.org/maven2/com/aayushatharva/brotli4j/
         | 
         | JNA will extract the appropriate native library (using os.name
         | and os.arch system properties), save the library to a temp
         | file, then load it.
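
Picking the right bundled binary from `os.name` and `os.arch` might look something like the sketch below. The directory scheme (`/native/<os>-<arch>/...`) and the normalized names are assumptions made up for this example; they are not the actual conventions of JNA or brotli4j.

```java
import java.util.Locale;

// Hypothetical helper mapping system properties to a bundled-library resource path.
final class PlatformResource {
    static String normalizeOs(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        // Check macOS first: "darwin" also contains the substring "win".
        if (os.contains("mac") || os.contains("darwin")) return "osx";
        if (os.contains("win")) return "windows";
        return "linux";
    }

    static String normalizeArch(String osArch) {
        String arch = osArch.toLowerCase(Locale.ROOT);
        if (arch.equals("amd64") || arch.equals("x86_64")) return "x86_64";
        if (arch.equals("aarch64") || arch.equals("arm64")) return "aarch64";
        return arch;
    }

    static String libraryFileName(String os, String baseName) {
        switch (os) {
            case "windows": return baseName + ".dll";
            case "osx":     return "lib" + baseName + ".dylib";
            default:        return "lib" + baseName + ".so";
        }
    }

    // Assumed layout: /native/<os>-<arch>/<platform-specific file name>
    static String resourcePath(String osName, String osArch, String baseName) {
        String os = normalizeOs(osName);
        return "/native/" + os + "-" + normalizeArch(osArch) + "/"
                + libraryFileName(os, baseName);
    }

    public static void main(String[] args) {
        System.out.println(resourcePath(
                System.getProperty("os.name"), System.getProperty("os.arch"), "demo"));
    }
}
```

The resulting path would then be extracted to a temporary file and loaded, as described in the comments above.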
        
         | gwbas1c wrote:
         | > Every ffi example I've found seem to operate on the
         | assumption that you want to invoke syscalls or libc ... Java
         | already mostly has decent facilities to interact with even
         | without native calls.
         | 
         | Because you would use ffi to interact with libraries that don't
         | have Java wrappers yet: i.e., you're writing the wrapper.
         | 
         | Using syscalls or libc is a way to write an example against a
         | known library that you're probably familiar with.
        
       | alex_suzuki wrote:
       | Wonder if this will make JNA (Java Native Access) redundant at
       | some point: https://github.com/java-native-access/jna
       | 
       | Very useful, especially the prebundled platform bindings.
        
       | xyproto wrote:
       | Does this mean that one can use SDL2 together with Java without
       | bending over backwards?
        
         | neonsunset wrote:
         | It seems it will make it somewhat easier.
         | 
         | But if you want to use SDL2 from something higher-level, you
         | will be _much_ better served by C# which will give you minimal
         | FFI cost and most data structures you want to express in C as-
         | is.
        
           | maksut wrote:
           | I don't know much about C#. It certainly looks more popular
           | in gamedev circles.
           | 
           | When I played with this new java api. I wasn't worried about
           | the FFI cost. It seemed fast enough to me. My toy application
           | was performing about 0.77x of pure C equivalent. I think
           | Java's memory model and heavy heap use might hurt more.
           | Hopefully Java will catch up when it gets value objects with
           | Project Valhalla. Next decade or so :)
        
             | neonsunset wrote:
             | Genuine curiosity - what would be your motivation to use
             | Java over C# here aside from familiarity (which is
             | perfectly understandable)? The latter takes heavy focus on
             | making sure to provide features like structs and pointers
             | with little to no friction, you can even AOT compile it and
             | statically link SDL2 into a single executable.
             | 
             | In the improbable case you may want to try it out, all it
             | needs is
             | 
             | - SDK from https://dot.net/download (or package manager of
             | your choice if you are on Linux e.g. `sudo apt-get install
             | dotnet-sdk-8.0`, !do not! use Homebrew if you are on macOS
             | however, use .pkg installer)
             | 
             | - C# extension for VS Code (DevKit is not needed)
             | 
             | - SDL2 abstraction: https://github.com/dotnet/Silk.NET
             | (there are all sorts of alternate bindings depending on
             | your preferences)
        
               | lazide wrote:
               | Not the original poster, but most folks have little
               | choice what ecosystem they're using.
               | 
               | And once you have enough momentum, switching isn't
               | usually worth it.
               | 
               | (As someone who has done Perl, C, Java, C#, Kotlin, JS,
               | and Python professionally - god help me. Maybe a million
               | lines of code all in now?)
        
           | p0w3n3d wrote:
           | You are (or were) right. Java has (had) awful performance
           | for foreign API calls, and I wonder whether this was fixed in
           | this release because, as I heard, fixing it was the main
           | reason for the upcoming functionality.
        
             | neonsunset wrote:
             | It could bring Java closer in FFI overhead but not
             | necessarily match. There are still missing features like
             | structs, C pointers (though in C# they are superseded quite
             | a bit by byrefs aka `ref T` syntax, e.g. used by Span<T>),
             | stack allocated buffers, etc.
             | 
             | C# also has function pointers (managed/unmanaged) and C
             | exports with NativeAOT.
        
         | marginalia_nu wrote:
         | Oh man that's a cool idea.
         | 
         | Might just build an SDL2 wrapper just as an FFI and FFM
         | exercise.
        
         | maksut wrote:
         | I have played with raylib bindings for clojure by using the new
         | foreign function api. It was a lot of fun. SDL might be a
         | better fit because it prefers pass by reference arguments [0].
         | 
         | [0]
         | https://gist.github.com/raysan5/17392498d40e2cb281f5d09c0a4b...
        
       | creativeSlumber wrote:
       | Not directly related to the article, but is there any article that
       | explains how memory management (stack/heap) works when using FFI in
       | Java? Also, when a call is made through FFI to a C library, is
       | there a separate Java and C call stack? I haven't found a good
       | article yet on what happens under the hood.
        
         | dzaima wrote:
         | Don't have an article, but the gist on stacks is that Java
         | still uses the regular architecture stack (rsp on x86, etc)
         | that the FFI'd code will, and on exit to/entry from FFI it'd
         | have to store its stack end/start pointer (or otherwise be able
         | to figure the range out) such that GC knows what to scan.
        
         | w10-1 wrote:
         | For the heap, JEP 454 is reasonably detailed:
         | https://openjdk.org/jeps/454
         | 
         | It describes how to adopt memory from C and have C adopt memory
         | you allocate, and gives control over how memory is allocated in
         | an arena.
         | 
         | The arena has lifecycle boundaries, and allocations determine
         | the memory space available. Java guarantees (only) that you
         | can't use unallocated memory or memory outside the arena, and
         | if you access via a (correct) value layout, you should be able
         | to navigate structure correctly.
         | 
         | The interesting stuff is passing function pointers back and
         | forth - look for `downcall method handles` and upcall stubs.
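
Passing a function pointer from Java to C can be sketched along the lines of the `qsort` example in JEP 454: a Java comparator is turned into an upcall stub and handed to the C library. The sketch assumes Java 22 and a 64-bit platform where `size_t` maps to a Java `long`; the class name is hypothetical.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;

// Hypothetical example class: sorts ints off-heap with C's qsort and a Java comparator.
final class QsortUpcall {
    // Called from C; each argument points at one int inside the array.
    static int compareInts(MemorySegment a, MemorySegment b) {
        return Integer.compare(a.get(ValueLayout.JAVA_INT, 0), b.get(ValueLayout.JAVA_INT, 0));
    }

    static int[] sort(int[] values) {
        Linker linker = Linker.nativeLinker();
        try (Arena arena = Arena.ofConfined()) {
            // void qsort(void *base, size_t nmemb, size_t size, int (*cmp)(const void*, const void*))
            MethodHandle qsort = linker.downcallHandle(
                    linker.defaultLookup().find("qsort").orElseThrow(),
                    FunctionDescriptor.ofVoid(ValueLayout.ADDRESS, ValueLayout.JAVA_LONG,
                            ValueLayout.JAVA_LONG, ValueLayout.ADDRESS));
            // The target layout lets the comparator read 4 bytes through each pointer.
            FunctionDescriptor comparatorDescriptor = FunctionDescriptor.of(
                    ValueLayout.JAVA_INT,
                    ValueLayout.ADDRESS.withTargetLayout(ValueLayout.JAVA_INT),
                    ValueLayout.ADDRESS.withTargetLayout(ValueLayout.JAVA_INT));
            MethodHandle comparator = MethodHandles.lookup().findStatic(
                    QsortUpcall.class, "compareInts", comparatorDescriptor.toMethodType());
            // The stub is a native function pointer valid until the arena is closed.
            MemorySegment comparatorPointer =
                    linker.upcallStub(comparator, comparatorDescriptor, arena);
            MemorySegment array = arena.allocateFrom(ValueLayout.JAVA_INT, values);
            qsort.invokeExact(array, (long) values.length,
                    ValueLayout.JAVA_INT.byteSize(), comparatorPointer);
            return array.toArray(ValueLayout.JAVA_INT);
        } catch (Throwable t) {
            throw new IllegalStateException("qsort call failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(sort(new int[] {3, 1, 2})));
    }
}
```

Both the off-heap array and the comparator stub live in the same confined arena, so they are released together when the try block exits.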
        
       | thefaux wrote:
       | I am sort of surprised that there isn't a widely used tool that
       | uses codegen to generate jni bindings sort of like what the jna
       | does but at build time. You could go meta and bundle a builder in
       | a jar that looks for the shared library in a particular place and
       | shells out to build and install the native library if it is
       | missing on the host computer. This would run once pretty similar
       | I think to bundling native code in npm.
       | 
       | I have bundled shared libraries for five or six platforms in a
       | java library that needs to make syscalls. It works but it is a
       | pain if anything ever changes or a new platform needs to be
       | brought up. Checking in binaries always feels icky but is
       | necessary if not all targets can be built on a single machine.
       | 
       | The problem with the new api is that people upgrade java very
       | slowly in most contexts. For an oss library developer, I see very
       | little value add in this feature because I'm still stuck for all
       | of my users who are using an older version of java. If I
       | integrate the new ffi api, now I have to support both it and the
       | jni api.
        
         | MaxBarraclough wrote:
         | > I am sort of surprised that there isn't a widely used tool
         | that uses codegen to generate jni bindings sort of like what
         | the jna does but at build time
         | 
         | There are several, including SWIG.
        
         | lelanthran wrote:
         | There is SWIG, which does bindings to and from C for almost every
         | language that exists.
        
       | iso8859-1 wrote:
       | Calling C is easy. But how do you call C++? Shiboken has a
       | language that lets you express ownership properties on C++ data
       | structures/methods/functions. It's tailored to generating Python
       | FFI bindings though. It would be so nice if there were a cross-
       | platform language to do this.
        
         | secondcoming wrote:
         | You put your C++ behind a C API.
        
         | p0w3n3d wrote:
         | This is something new. Before it, you had to create a
         | native-compatible shared library that returns jstring/jobject
         | instead, or use a proxy that did this for you (JNA). Let's see
         | what happens next, maybe even shiboken.
        
       | p0w3n3d wrote:
       | Last time I checked (ca. 2017-9), every call to a foreign API in
       | Java had to create a memory barrier, causing a flush of all CPU
       | caches. This was different from using normal JVM interfaces, and
       | when I asked some guy at a Java conference, he told me they cheated
       | when writing calls to the JVM API, but other people need to adhere
       | to the rules. I wonder what happened with this in Java 22, as this
       | change was highly anticipated.
        
         | ryanpetrich wrote:
         | Memory barriers don't force a flush of all CPU cache. They will
         | enforce the ordering of memory operations issued before and
         | after the barrier instruction, preserving the contents of the
         | CPU's various caches.
        
       ___________________________________________________________________
       (page generated 2024-05-08 23:00 UTC)