[HN Gopher] Pair Your Compilers at the ABI Cafe
       ___________________________________________________________________
        
       Pair Your Compilers at the ABI Cafe
        
       Author : nrabulinski
       Score  : 88 points
       Date   : 2024-05-06 06:35 UTC (2 days ago)
        
 (HTM) web link (faultlore.com)
 (TXT) w3m dump (faultlore.com)
        
       | gumby wrote:
       | It's Interop for compilers!
        
       | not2b wrote:
       | There is a specified common C++ ABI that gcc, clang, Intel's
       | proprietary compiler, and others use. It was originally developed
       | for the Itanium processor but is now used by gcc and clang for
       | everything. See
       | 
       | https://itanium-cxx-abi.github.io/cxx-abi/abi.html
       | 
       | Unfortunately this ABI didn't specify how __int128 (and other
       | nonstandard types) are to be passed.
        
         | jcranmer wrote:
         | The Itanium ABI effectively specifies how to lower the C++ ABI
         | to an assumed C ABI, and the C ABI is given by what is known as
         | the "psABI" (processor-specific ABI).
         | 
         | The (not-most-recent) x86-64 ABI is here:
         | https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/x...,
         | and it does actually explain how to pass __int128.
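
A rough model of that rule, assuming (from my reading of the x86-64 psABI) that `__int128` is classified as if it were a struct of two 64-bit halves, low then high. The helper names are made up for illustration, and the sketch only models the split into halves, not actual register assignment or the 16-byte memory alignment:

```python
MASK64 = (1 << 64) - 1


def split_int128(value):
    """Split a signed 128-bit integer into (low, high) unsigned 64-bit halves."""
    if not -(1 << 127) <= value < (1 << 127):
        raise OverflowError("value does not fit in 128 bits")
    raw = value & ((1 << 128) - 1)  # two's-complement bit pattern
    return raw & MASK64, raw >> 64


def join_int128(low, high):
    """Rebuild the signed 128-bit integer from its two 64-bit halves."""
    raw = (high << 64) | low
    return raw - (1 << 128) if raw >> 127 else raw


low, high = split_int128(-2)
print(hex(low), hex(high))
print(join_int128(low, high))
```

Each half fits in one general-purpose register, which is why a caller and callee that disagree on the order of the halves, or on whether the value travels in registers at all, silently corrupt the argument.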
        
         | kevingadd wrote:
         | WebAssembly also has a de-facto standard ABI:
         | https://github.com/WebAssembly/tool-conventions/blob/main/Ba...
        
       | jcranmer wrote:
       | I can definitely feel the pain of trying to work out ABI mismatch
       | concerns. It doesn't help that it often isn't clear from the
       | output what some of the underlying assumptions are--expected
       | stack alignment, for example, or structs being broken up into
       | registers or not.
       | 
       | It would be nice if compilers could output some sort of metadata
       | that basically says "ah yes, here's a struct, it requires this
       | alignment, fields are at these offsets and have these sizes, and
       | are present in these cases" (the latter option being able to
       | support discriminated unions) or "function call, parameters are
       | here, here, and here." You'd think this is what DWARF itself
       | provides, but if you play around with DWARF for a bit, you
       | discover that it actually lacks a lot of the low-level ABI
       | details you want to uncover; instead, it's more of a format
       | that's meant to be generic enough to convey to the debugger the
       | AST of the source program along with some hint of how to map the
       | binary code to that AST--you can't really write a language-
       | agnostic DWARF-based debugger.
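
For a sense of what such metadata could look like, here is a minimal sketch that asks Python's `ctypes` (which follows the platform's C layout rules) for a struct's size, alignment, and field offsets. `Packet` and `describe_layout` are hypothetical names, and this says nothing about how the struct would be passed in a call:

```python
import ctypes
import json


class Packet(ctypes.Structure):  # hypothetical example struct
    _fields_ = [
        ("tag", ctypes.c_uint8),
        ("length", ctypes.c_uint32),
        ("payload", ctypes.c_uint16),
    ]


def describe_layout(struct_type):
    """Return size, alignment, and per-field offset/size for a ctypes struct."""
    fields = []
    for name, *_ in struct_type._fields_:
        descriptor = getattr(struct_type, name)  # ctypes field descriptor
        fields.append(
            {"name": name, "offset": descriptor.offset, "size": descriptor.size}
        )
    return {
        "name": struct_type.__name__,
        "size": ctypes.sizeof(struct_type),
        "alignment": ctypes.alignment(struct_type),
        "fields": fields,
    }


print(json.dumps(describe_layout(Packet), indent=2))
```

The output is only as trustworthy as the layout engine behind it; the values here come from `ctypes`, not from the compiler that built the library you actually link against.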
        
         | PartiallyTyped wrote:
         | What you are asking for sounds quite a bit like what rustc
         | does.
         | 
         | Rustc (outside `#[repr(C)]`) offers no guarantees on the
         | ordering of the fields, however, it guarantees that every
         | instance of struct A will have the same ordering during that
         | particular compilation. This allows rustc to compile external
         | crates (as long as no monomorphization is needed) in a
         | consistent manner across all crates that depend on that.
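
Why field order matters can be seen with a C-style, declaration-order layout, modelled here with `ctypes`. This is a sketch with made-up struct names; it shows the padding cost that a compiler free to reorder fields can avoid, not rustc's actual algorithm:

```python
import ctypes


class DeclaredOrder(ctypes.Structure):
    # Small, large, small: padding is needed before and after "b".
    _fields_ = [
        ("a", ctypes.c_uint8),
        ("b", ctypes.c_uint32),
        ("c", ctypes.c_uint8),
    ]


class Reordered(ctypes.Structure):
    # Largest field first: the two small fields share the tail.
    _fields_ = [
        ("b", ctypes.c_uint32),
        ("a", ctypes.c_uint8),
        ("c", ctypes.c_uint8),
    ]


print(ctypes.sizeof(DeclaredOrder), ctypes.sizeof(Reordered))
```

On mainstream platforms, where a 32-bit integer is 4-byte aligned, the first struct is typically larger than the second even though both hold the same fields.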
        
           | menaerus wrote:
           | Most of the ABI issues arise when you start to mix and match
           | shared libraries produced by different compilers, or even the
           | libraries produced by the different versions of the same
           | compiler.
           | 
           | Rust has none of that, nor does it support dynamic linking, so
           | I fail to understand what it is that rustc can offer in that
           | solution space. There is none.
        
             | bluGill wrote:
             | Rust works around the issue by not allowing all the useful
             | things that get you there. There are other useful things
             | like sharing pointers across threads that rust will not let
             | you do - for both better and worse. (better in that you
             | avoid a lot of problems for something you rarely need -
             | worse for those few cases where you actually need to do
             | those and cannot)
        
               | Georgelemental wrote:
               | You can share pointers across threads in Rust, it's just
               | `unsafe`.
        
         | orivej wrote:
         | Some Common Lisp FFIs have opted to coax this information out
         | of the compiler. https://github.com/rpav/c2ffi is a C++ tool
         | that links to libclang-cpp and literally outputs JSON with
         | sizes and alignments. (It is then used by
         | https://github.com/rpav/cl-autowrap to autogenerate a Lisp
         | wrapper.) The older CFFI Groveller [1] works by generating C
         | code which is compiled by the system C compiler (e.g. GCC or
         | Clang) and, when executed, prints Lisp code that contains
         | resolved values of constants, sizes, alignments, etc.
         | 
         | [1] https://cffi.common-lisp.dev/manual/html_node/The-Groveller....
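
The groveller idea can be sketched in a few lines: generate a C program whose only job is to print the facts that the system compiler knows. This covers the generation step only; `grovel_source` and the `defconstant` naming are hypothetical and are not CFFI's actual spec format:

```python
def grovel_source(header, types):
    """Build C source that prints Lisp constants holding sizeof() of each type."""
    lines = [
        "#include <stdio.h>",
        "#include <stddef.h>",
        f"#include <{header}>",
        "",
        "int main(void) {",
    ]
    for c_type in types:
        lisp_name = c_type.replace(" ", "-").replace("_", "-")
        lines.append(
            f'    printf("(defconstant +{lisp_name}-size+ %zu)\\n", sizeof({c_type}));'
        )
    lines += ["    return 0;", "}"]
    return "\n".join(lines) + "\n"


print(grovel_source("time.h", ["struct timespec", "time_t"]))
```

Compiling and running the generated file with the same compiler and flags as the target library is what makes the printed values match that library's ABI.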
        
       | w10-1 wrote:
       | Also function pointers, errors & exception-handling,
       | async/channels/thread-locals, Go stacks, Swift @objc, @cdecl and
       | C++ interop, FFI dialects...
       | 
       | It's not really pain anymore; it's a kind of hilarity
        
       | zX41ZdbW wrote:
       | I think the right way to avoid this problem is to avoid using ABI
       | at runtime or build time.
       | 
       | At runtime, it means - don't use shared libraries. At build time,
       | it means - build every library from the source, don't use pre-
       | built artifacts.
       | 
       | This sounds controversial... But it allows you to change compiler
       | or compiler options at any time, and you don't have to bother. It
       | also enables cross-compilation, reproducible builds, and portable
       | binaries. You no longer have to ask developers to set up a
       | complex build environment on a specific Linux distribution
       | because it works everywhere.
       | 
       | I use this approach for ClickHouse.
        
         | comex wrote:
         | Even then, you still need ABI consistency between compilers if
         | you want to link together codebases written in different
         | languages (e.g. C and Rust).
         | 
         | In practice this almost always 'just works' because most cross-
         | language calls simply don't use the kinds of complicated types
         | discussed in the blog post. They tend to stick to simple
         | integer and pointer types, where ABI consistency is usually a
         | given.
         | 
         | Though you can still get into trouble when passing function
         | pointers, especially when combined with some modern control-
         | flow integrity systems.
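
A small illustration of the "simple integer types" case, using a function pointer. The sketch assumes CPython's `ctypes`, which builds a C-callable thunk for a Python function; `BinaryOp` and `add` are names invented for the example:

```python
import ctypes

# C signature: int (*)(int, int), using the platform's default C calling convention.
BinaryOp = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int, ctypes.c_int)


def add(a, b):
    return a + b


# Wrap the Python function in a C-callable function pointer, then call
# through that pointer so the arguments cross the C ABI boundary.
c_add = BinaryOp(add)
print(c_add(2, 3))
```

Both sides only have to agree on how `int` is passed and returned, which is about the least contentious part of any C ABI. Nothing here exercises control-flow integrity checks.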
        
           | tester756 wrote:
           | >Even then, you still need ABI consistency between compilers
           | if you want to link together codebases written in different
           | languages (e.g. C and Rust).
           | 
           | Let's talk over HTTP, a queue, or some other IPC-ish mechanism
        
             | 0xdeafbeef wrote:
             | You can't pass values in registers using this model
        
             | dwattttt wrote:
             | You _still_ need a consistent way to talk about values; IPC
             | systems tackle the same problems under the name marshalling
             | & de/serialisation. They just tend to take much more
             | conservative options to deal with exactly this kind of
             | problem (you don't have to care about integer endian-ness
             | if integers are expressed as strings).
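
The endianness point in concrete terms, as a minimal sketch with Python's `struct` module: the same integer has two different binary encodings, while the decimal-string encoding has only one.

```python
import struct

value = 0x12345678

little = struct.pack("<I", value)  # 32-bit unsigned, little-endian
big = struct.pack(">I", value)     # 32-bit unsigned, big-endian
text = str(value).encode("ascii")  # decimal string, no byte order involved

print(little.hex(), big.hex(), text)

# Decoding with the wrong byte order yields a different number;
# the string form decodes the same way on every machine.
print(hex(struct.unpack(">I", little)[0]), int(text) == value)
```

The string form is larger and slower to parse, which is the conservative trade-off the comment describes.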
        
         | jcranmer wrote:
         | Okay, how do you propose to talk to your kernel then?
        
           | anonymoushn wrote:
           | Over an SPSC queue (unfortunately you cannot mmap this way
           | yet, and you cannot set up the SPSC queue itself this way)
        
             | vlovich123 wrote:
             | io_uring?
        
           | eddd-ddde wrote:
           | Your kernel will likely have well defined interfaces. You
           | don't need libraries to talk to the kernel.
        
             | jcranmer wrote:
             | But how can you use those interfaces without an ABI?
             | 
             | Fundamentally, an ABI _is_ the way you define interfaces.
        
             | vlovich123 wrote:
             | Except for Linux, those well-defined interfaces sit behind
             | a C API.
        
         | menaerus wrote:
         | This can work only if you own the entire codebase and have all
         | the external dependencies that you depend on statically linked
         | (compiled) into your product.
         | 
         | I also very much prefer this way of handling dependencies, but
         | it's not a solution for all ABI problems, since it also implies
         | that you will need to statically link (compile) against all the
         | transitive dependencies. These include, at the very minimum,
         | libc++ or libstdc++. And with this requirement in place, this
         | already isn't possible for many of the codebases out there.
         | 
         | And it also brings another issue to the table: X version of
         | libc++/libstdc++ depends on Y version of libc.
         | 
         | Since you generally cannot statically link against the libc,
         | and you don't own it since it's part of the OS, this becomes a
         | hairy problem. You really need to make sure that your code
         | works across different versions, and combinations thereof, of
         | libc++/libstdc++/libc.
         | 
         | And then there's ... a bunch of other different platforms which
         | aren't Linux.
        
       | gigel82 wrote:
       | I struggled with this many times and at the end of the day threw
       | in the towel and just wrapped everything in plain C exports.
       | That's the only way I know to get ABI compatibility across
       | different compilers/toolsets/versions. COM-like constructs come
       | as a close second.
       | 
       | It's an unfortunate state.
        
       ___________________________________________________________________
       (page generated 2024-05-08 23:01 UTC)