[HN Gopher] Crossing the impossible FFI boundary, and my gradual...
___________________________________________________________________
Crossing the impossible FFI boundary, and my gradual descent into
madness
Author : signa11
Score : 88 points
Date : 2024-06-17 15:17 UTC (7 hours ago)
(HTM) web link (verdagon.dev)
(TXT) w3m dump (verdagon.dev)
| jauntywundrkind wrote:
| There's a huge amount of doom & gloom, prophecies of failure
| against wasm's component-model, a latent expectation that trying
| to solve FFI is impossible & destined to fail. But what if?
|
| It'd be so neat for language creators to be able to use &
| leverage other works. Getting there wouldn't be easy, but there'd
| be a standard path to getting the hard-fought capability here.
| kodablah wrote:
| It seems like the struggle here is trying to use Rust
| transparently/automatically from another language instead of just
| making bindings easier. I have found that trying to auto-FFI
| existing Rust types is not the best for languages because there
| is often an impedance mismatch with how the language treats
| things and how Rust does. Therefore trying an always-works
| transparent binding may inevitably end up with people asking for
| more flexibility to fit the language better (e.g. controlling
| lifetime semantics, type mappings/copying, etc).
|
| I think it's clearer to take an approach like Neon and PyO3 and
| other FFI-to-lang helpers do where you just make it easy/safe to
| write these Vale functions in Rust.
| DSMan195276 wrote:
| I agree with you, but it's always hard to ignore the allure of
| not needing to write all the bindings manually. If nobody is
| willing to write the bulk of the initial bindings then the
| chance of someone using it seems low, and in theory writing a
| transparent layer between the two takes less time/effort (in
| practice I agree that the incompatibilities will make it messy
| long term).
|
| Rust has the same problem with C APIs: in the past I've gone to
| use something and found that the binding was not there. For a
| couple functions it's no big deal, but if say half or more of
| the ones I needed weren't there already then I wouldn't have
| bothered trying to use it at all.
| toyg wrote:
| _> Anyone trying to make a new mainstream language is completely
| insane, unless they're backed by a huge corporation. There are
| only two exceptions in the last 25 years that come close: Scala
| and Kotlin_
|
| Kotlin was designed and backed by JetBrains from the start. Maybe
| not a "huge" corporation but a pretty big company still (by
| revenue).
| iudqnolq wrote:
| I don't know the story of how the Android team went Kotlin-
| first. If that wasn't a deliberate plan they got quite lucky.
| Could Kotlin arguably be backed by Google?
| izacus wrote:
| No, the Android community adopted Kotlin before Google added
| any support for it from their side.
| kernal wrote:
| I don't know when the first Kotlin Android app was
| published, but Kotlin 1.0 was released in 2016 and then
| announced as a first class language at Google I/O in 2017.
| kernal wrote:
| Android Studio is based on IntelliJ and there's a lot of
| collaboration between both teams. The adoption of Kotlin was
| a logical next step, considering a lot of IntelliJ is written
| in Kotlin.
| throwawaymaths wrote:
| Rustler crosses the Rust/Erlang barrier relatively well, though
| its error messages when you try to cross it wrong are somewhat
| unhelpful.
| move-on-by wrote:
| I've not used rust, and quite frankly I think a lot of the post
| is over my head, but I enjoyed the read nonetheless.
|
| > I don't have any specific plans to turn this C proof-of-concept
| into a production-quality tool that would enable calling Rust
| from C, but if anyone wants to take it from here, I'd be happy to
| assist!
|
| I laughed at this, I'd bet my bottom dollar it's an attempted
| nerd snipe!
| ingve wrote:
| This could be great for scripting with Neptune! [0]
|
| [0] https://github.com/Srinivasa314/neptune-lang
| yobananaboy wrote:
| I'm gonna find some reason to use this for my Battleship game
| too!
| marklar423 wrote:
| With all this effort required (as the author points out), I start
| to wonder if a better solution is to communicate via RPC over
| local sockets.
|
| There will be some overhead, but it might be a wash considering
| calling over an FFI often involves similar overhead to marshal /
| unmarshal objects. And the simplicity gains would be massive.
| layer8 wrote:
| Why over a socket? You could perform the same protocol more
| efficiently with normal functions in-process. Maybe we need a
| standard serializing LPC protocol just using the platform ABI.
| Or maybe this comes down to something like ZeroMQ in-process.
| marklar423 wrote:
| Mostly because sockets are supported by everything today, and
| they're easy to understand. What you're describing would
| certainly work but it looks similar to what the OP did in the
| blog post, with all the complexity it comes with.
| layer8 wrote:
| The OP doesn't serialize. My proposal would still serialize
| as with RPC, but instead of passing the data over a socket,
| just pass the data as a binary blob over a regular function
| call.
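
A minimal sketch of the "binary blob over a regular function call" idea, assuming a hand-rolled length-prefixed encoding. The `Greeting` type, the wire format, and `handle_blob` are all hypothetical names invented for illustration; a real setup would more likely use an existing serialization format and export the entry point with an unmangled symbol name.

```rust
use std::slice;

/// Hypothetical message type; the name and fields are illustrative only.
#[derive(Debug, PartialEq)]
struct Greeting {
    id: u32,
    name: String,
}

/// Encode as: 4-byte little-endian id, 4-byte little-endian name length,
/// then the UTF-8 bytes of the name.
fn encode(g: &Greeting) -> Vec<u8> {
    let mut out = Vec::with_capacity(8 + g.name.len());
    out.extend_from_slice(&g.id.to_le_bytes());
    out.extend_from_slice(&(g.name.len() as u32).to_le_bytes());
    out.extend_from_slice(g.name.as_bytes());
    out
}

/// Decode the format above; returns None on malformed input.
fn decode(bytes: &[u8]) -> Option<Greeting> {
    if bytes.len() < 8 {
        return None;
    }
    let id = u32::from_le_bytes([bytes[0], bytes[1], bytes[2], bytes[3]]);
    let len = u32::from_le_bytes([bytes[4], bytes[5], bytes[6], bytes[7]]) as usize;
    let body = bytes.get(8..)?;
    if body.len() != len {
        return None;
    }
    let name = String::from_utf8(body.to_vec()).ok()?;
    Some(Greeting { id, name })
}

/// The only signature the foreign side has to understand: pointer + length in,
/// status code out (here: the decoded name length, or -1 on error).
pub extern "C" fn handle_blob(ptr: *const u8, len: usize) -> i32 {
    if ptr.is_null() {
        return -1;
    }
    // SAFETY: the caller promises that ptr points to len readable bytes.
    let bytes = unsafe { slice::from_raw_parts(ptr, len) };
    match decode(bytes) {
        Some(g) => g.name.len() as i32,
        None => -1,
    }
}

fn main() {
    let blob = encode(&Greeting { id: 7, name: "vale".to_string() });
    let status = handle_blob(blob.as_ptr(), blob.len());
    println!("status = {}", status);
}
```

The boundary itself only ever sees bytes, so neither side needs to know the other's struct layout; the cost is the encode/decode step on every call.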
| spongebobstoes wrote:
| The main thing on my mind is that the build system would
| become more bespoke when doing it that way, compared to
| running a few processes that interact with each other.
|
| The overhead of socket read+write is typically much less
| than the serialization overhead, although both can be
| optimized to the point of irrelevance for many
| applications.
|
| It's also interesting because it ends up looking like a
| microservices architecture, except all on one machine
| (even all in one process tree).
| marklar423 wrote:
| https://zeromq.org/ -> TIL really cool, thanks for the
| pointer.
| masfuerte wrote:
| COM [1] was a solution to these problems thirty years ago.
|
| In-process it's just function calls. Cross-process COM has
| automatic marshalling for standard types ("automation types")
| or you can define custom marshalling that does whatever you
| want.
|
| WinRT [2] is a more modern version. It builds on COM and (among
| other things) provides the basis for the latest UI frameworks
| in Windows.
|
| [1]: https://en.wikipedia.org/wiki/Component_Object_Model
|
| [2]: https://en.wikipedia.org/wiki/Windows_Runtime
| nsguy wrote:
| A long time ago I worked on a project where we needed to
| distribute an in-process COM object, so we moved it to DCOM,
| instantiated multiple instances, and that worked! All in all
| COM was a fairly pleasant technology. Not really that
| different than gRPC (e.g. idl vs. proto).
| alexvitkov wrote:
| If you want to interop well with Rust code, it feels to me like
| your language has to inherit so many Rust semantics that I
| question why I would use it over Rust.
|
| If you're making a new language, just have good interop with C.
| Most libraries worth using are written in C. Calling into C is
| trivial* and enforces almost no limitations on what you can do
| language-design wise.
|
| * trivial, with the somewhat sizable asterisk that you have to
| rewrite the header files in your language.
| verdagon wrote:
| I've been looking into this, and I _suspect_ that one actually
| needs surprisingly little to interoperate safely with Rust.
|
| TL;DR: The lowest common denominator between Rust and any other
| memory-safe language is a borrow-less affine type.
|
| The key insight is that Rust is actually several different
| mechanisms stacked on top of each other.
|
| To illustrate, imagine a program in a Rust-like language.
|
| Now, refactor it so you don't have any & references, only &mut.
| It actually works, if you're willing to refactor a bit: you'll
| be storing a lot of things in collections and referring to them
| by index, and cloning even more, but nothing too bad.
|
| Now, go even further and refactor the program to not have any
| &mut either. This requires some acrobatics: you'll be
| temporarily removing things from those collections and moving
| things into and out of functions like in [2], but it's still
| possible.
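
A rough sketch of what that refactored style could look like, not taken from the linked posts. `Ship`, `Fleet`, and the function names are hypothetical. The program's own signatures contain no `&` or `&mut`; the standard-library calls (`take`, indexing) still borrow internally, which a purely move-only language would have to provide as primitives.

```rust
/// Hypothetical game entity, used only for illustration.
#[derive(Debug, PartialEq)]
struct Ship {
    hp: u32,
}

/// Takes the ship by value and hands it back: no & or &mut in the signature.
fn damage(mut ship: Ship, amount: u32) -> Ship {
    ship.hp = ship.hp.saturating_sub(amount);
    ship
}

/// Owns every ship; callers refer to ships by index rather than by reference.
struct Fleet {
    slots: Vec<Option<Ship>>,
}

/// Move the ship out of its slot, transform it, then move it back in.
/// The fleet itself is also passed in and returned by value.
fn damage_at(mut fleet: Fleet, index: usize, amount: u32) -> Fleet {
    if index < fleet.slots.len() {
        // take() leaves None behind, so the slot is temporarily empty.
        if let Some(ship) = fleet.slots[index].take() {
            fleet.slots[index] = Some(damage(ship, amount));
        }
    }
    fleet
}

fn main() {
    let fleet = Fleet {
        slots: vec![Some(Ship { hp: 10 }), Some(Ship { hp: 3 })],
    };
    let fleet = damage_at(fleet, 1, 5);
    println!("{:?}", fleet.slots);
}
```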
|
| You're left with something I refer to as "borrowless affine
| style" in [1] or "move-only programming" in [0].
|
| I believe that's the bare minimum needed to interoperate with
| Rust in a memory safe way: unreference-able moveable types.
|
| The big question then becomes: if our language has only these
| moveable types, and we want to call a Rust function that
| accepts a reference, what then?
|
| I'd say: make the language move the type in as an argument,
| take a temporary reference just for Rust, and then move-return
| the type back to the caller. The rest of our language doesn't
| need to know about borrowing, it's just a private
| implementation detail of the FFI.
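
One way to picture the glue described above, written in plain Rust for simplicity. `count_words` and `shout` stand in for existing Rust functions that take references, and the `*_by_move` wrappers are hypothetical examples of what an FFI layer might generate; none of this is Vale's actual implementation.

```rust
/// Stand-in for an existing Rust API that takes a shared reference.
fn count_words(text: &String) -> usize {
    text.split_whitespace().count()
}

/// Stand-in for an existing Rust API that takes a mutable reference.
fn shout(text: &mut String) {
    text.make_ascii_uppercase();
}

/// Hypothetical generated glue: ownership moves in, a borrow exists only
/// inside this function, and ownership moves back out with the result.
fn count_words_by_move(text: String) -> (String, usize) {
    let n = count_words(&text);
    (text, n)
}

/// Same pattern for &mut: the caller never sees the mutable borrow.
fn shout_by_move(mut text: String) -> String {
    shout(&mut text);
    text
}

fn main() {
    let (text, n) = count_words_by_move(String::from("crossing the boundary"));
    println!("{} words", n);
    println!("{}", shout_by_move(text));
}
```

The caller's side of the boundary only ever deals in values it owns, so the calling language needs move semantics but no notion of lifetimes.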
|
| These weird moveable types are, of course, _extremely
| unergonomic,_ but they serve as a foundation. A language could
| use these only for Rust interop, or it could go further: it
| could add other mechanisms on top such as & (hard), or &mut
| (easy), or both (like Rust), or a lot of cloning (like [3]), or
| generational references (like Vale), or some sort of RefCell/Rc
| blend, or linear types + garbage collection (like Haskell) and
| so on.
|
| (This is actually the topic of the next post, you can tell I've
| been thinking about it a lot, lol)
|
| [0] "Move-only programming" in
| https://verdagon.dev/grimoire/grimoire#the-list
|
| [1] "Borrowless affine style" in
| https://verdagon.dev/blog/vale-memory-safe-cpp
|
| [2] https://verdagon.dev/blog/linear-types-borrowing
|
| [3]
| https://web.archive.org/web/20230617045201/https://degaz.io/...
| rng_civ wrote:
| Have you taken a look at the paper "Foreign Function Typing:
| Semantic Type Soundness for FFIs" [0]?
|
| > We wish to establish type soundness in such a setting,
| where there are two languages making foreign calls to one
| another. In particular, we want a notion of convertibility,
| that a type tA from language A is convertible to a type tB
| from language B, which we will write tA ~ tB , such that
| conversions between these types maintain type soundness
| (dynamically or statically) of the overall system
|
| > ...the languages will be translated to a common target. We
| do this using a realizability model, that is, by setting up a
| logical relation indexed by source types but inhabited by target
| terms that behave as dictated by source types. The
| conversions tA ~ tB that should be allowed, are the ones
| implemented by target-level translations that convert terms
| that semantically behave like tA to terms that semantically
| behave like tB (and vice versa)
|
| I've toyed with this approach to formalize the FFI for
| TypeScript and Pyret and it seemed to work pretty well. It
| might get messier with Rust because you would probably need
| to integrate the Stacked/Tree Borrows model into the common
| target.
|
| But if you can restrict the exposed FFI as a Rust-sublanguage
| without borrows, maybe you wouldn't need to.
|
| [0] (PDF Warning):
| https://wgt20.irif.fr/wgt20-final23-acmpaginated.pdf
| alexvitkov wrote:
| Thanks for the write-up. My biggest fear is not references,
| overloads or memory management, but rather just the layout of
| their structures.
|
| We have this:
|
|     sizeof(String) == 24
|     sizeof(Option<String>) == 24
|
| Which is cool. But Option<T> is defined like this:
|
|     enum Option<T> {
|         Some(T),
|         None,
|     }
|
| I didn't find any "template specialization" tricks like you
| would see in C++; as far as I can see, the compiler figures
| out some trick to squeeze Option<String> into 24 bytes.
| Whatever those tricks are, unless rustc has an option to
| export the layout of a type, you will need to implement
| them yourself.
| vlovich123 wrote:
| You don't need to determine the internal representation as
| long as you're dealing with opaque types and invoking Rust
| functions on them.
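
A small sketch of the opaque-type approach, assuming the foreign side only stores the pointer and passes it back. `Counter` and the `counter_*` functions are hypothetical; a real library would also export them under unmangled names so another language can link to them.

```rust
/// The foreign side sees only a pointer to this; the layout stays private to
/// Rust, so it is free to contain String, Vec, Option, and so on.
pub struct Counter {
    hits: Vec<String>,
}

/// Allocate on the Rust heap and hand back an opaque handle.
pub extern "C" fn counter_new() -> *mut Counter {
    Box::into_raw(Box::new(Counter { hits: Vec::new() }))
}

/// Record one hit and return the running total.
///
/// # Safety
/// `handle` must come from `counter_new` and must not have been freed.
pub unsafe extern "C" fn counter_record(handle: *mut Counter) -> usize {
    let counter = unsafe { &mut *handle };
    counter.hits.push(String::from("hit"));
    counter.hits.len()
}

/// Give ownership back to Rust so the value is dropped. Null is ignored.
///
/// # Safety
/// `handle` must be null or an unfreed pointer from `counter_new`.
pub unsafe extern "C" fn counter_free(handle: *mut Counter) {
    if !handle.is_null() {
        drop(unsafe { Box::from_raw(handle) });
    }
}

fn main() {
    let handle = counter_new();
    let total = unsafe {
        counter_record(handle);
        counter_record(handle)
    };
    println!("recorded {} hits", total);
    unsafe { counter_free(handle) };
}
```

Every operation on the value goes through a Rust function, so the caller never needs to know that the struct is 24 bytes or how its fields are arranged.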
|
| As for the tricks used to make both 24 bytes, it's NonNull
| within String that Option then detects and knows it can
| represent transparently without any enum tags. For what
| it's worth, you can do similar tricks in C++ using
| zero-sized types and tags to declare nullable state (in fact
| std::optional already knows to do this for pointer types, if
| I recall correctly).
| ithkuil wrote:
| Yeah, currently "niche optimization" is performed when the
| compiler can infer that some values of the structure are
| illegal.
|
| This can currently be done when a type declares the range
| of an integer to not be complete, with the
| rustc_layout_scalar_valid_range_start or _end attribute
| (requires #![feature(rustc_attrs)]).
|
| In your example it works for String, because String
| contains a Vec<u8>, which inside contains a capacity field
| of type struct Cap(usize), but the usize is effectively
| constrained to contain values from 0..=isize::MAX.
|
| The only way for you to know that is to effectively be the
| rustc compiler or be able to consume its output.
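
The effect can be observed from stable Rust with `std::mem::size_of`. This is a small sketch; `option_overhead` is a hypothetical helper. The `NonZeroU32` and `Box` cases are documented guarantees of the standard library, while the `String` case is only what current compilers happen to do and is not a stable promise.

```rust
use std::mem::size_of;
use std::num::NonZeroU32;

/// How many extra bytes wrapping T in Option costs on this compiler/target.
fn option_overhead<T>() -> usize {
    size_of::<Option<T>>() - size_of::<T>()
}

fn main() {
    // Zero is not a valid NonZeroU32, so None can reuse that bit pattern.
    println!("NonZeroU32: +{}", option_overhead::<NonZeroU32>());

    // A Box is never null, so None is represented as the null pointer.
    println!("Box<u8>:    +{}", option_overhead::<Box<u8>>());

    // Every u32 bit pattern is valid, so the tag needs space of its own
    // (padded up to the alignment of u32).
    println!("u32:        +{}", option_overhead::<u32>());

    // Not guaranteed: relies on a niche somewhere inside String's fields.
    println!("String:     +{}", option_overhead::<String>());
}
```

A foreign language that wants to read an `Option<String>` directly would have to reproduce exactly this decision, which is why treating such values as opaque tends to be the safer route.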
| jlarocco wrote:
| I wish Rust would standardize their ABI already. I started a
| project to call Rust from Common Lisp, but haven't got very
| far. It's a lot of work, and they can break compatibility at
| any time.
|
| If they really want to replace C and C++ then they really need
| to support being called from third party languages.
| guipsp wrote:
| https://github.com/rust-lang/rust/issues/111423
| revskill wrote:
| Why not WASI?
| Retr0id wrote:
| How could WASI solve (or be involved in solving) this problem?
| Findecanor wrote:
| I suspect confusion with the WebAssembly Component Model --
| whose development is somewhat intertwined with that of
| WASI.
|
| It defines a function call ABI between sandboxes. No object
| is in shared memory: parameters are passed by value or by
| handle. Has its own IDL and ABI that languages' ABIs need to
| have adaptors to, if they don't conform.
| jamilbk wrote:
| The article provides a very detailed exploration of all of the
| fun challenges you can face designing FFIs with Rust, but there's
| a good chance you can "get away" with simpler approaches if you
| think ahead a bit.
|
| In our case, we call into Rust from Kotlin using JNI [0] and
| Swift using swift-bridge [1]. Thankfully our use case for the FFI
| [2] is for non-performance-critical calls and the data structures
| are fairly simple, so we just serialize objects with JSON.
|
| No major issues so far.
|
| One thing I am surprised hasn't been mentioned so far is
| Mozilla's UniFFI [3] which seems to solve some of the issues
| brought up in the article. We plan to switch to that once our FFI
| requirements become more complex.
|
| [0] https://docs.rs/jni/latest/jni/
|
| [1] https://github.com/chinedufn/swift-bridge
|
| [2] https://www.firezone.dev/kb/architecture/tech-
| stack#client-a...
|
| [3] https://github.com/mozilla/uniffi-rs
| ar7hur wrote:
| > Anyone trying to make a new mainstream language is completely
| insane, unless they're backed by a huge corporation. There are
| only two exceptions in the last 25 years that come close: Scala
| and Kotlin.
|
| And Clojure! (also a JVM language)
| munchler wrote:
| I would also add Zig to the list. I certainly hear about it
| often enough on HN.
| zem wrote:
| Elixir and Gleam in the Erlang world
| blaise-pabon wrote:
| I'm a novice on this topic, but I'm surprised that no one has
| mentioned Python. Is that because it is a solved problem, thanks
| to https://github.com/PyO3/pyo3 and is no longer a challenge?
| forrestthewoods wrote:
| C APIs are the best APIs. I do a lot of mixed language work and I
| would never attempt anything like this. Just write a C API and
| provide trivial FFI bindings for your favorite language.
|
| That said, I thoroughly enjoyed the article and the author's
| admission of its insanity! Great read. But do the simple thing
| and call it a day.
___________________________________________________________________
(page generated 2024-06-17 23:01 UTC)