[HN Gopher] Pair Your Compilers at the ABI Cafe
___________________________________________________________________
Pair Your Compilers at the ABI Cafe
Author : nrabulinski
Score : 88 points
Date : 2024-05-06 06:35 UTC (2 days ago)
(HTM) web link (faultlore.com)
(TXT) w3m dump (faultlore.com)
| gumby wrote:
| It's Interop for compilers!
| not2b wrote:
| There is a specified common C++ ABI that gcc, clang, Intel's
| proprietary compiler, and others use. It was originally developed
| for the Itanium processor but is now used by gcc and clang for
| everything. See
|
| https://itanium-cxx-abi.github.io/cxx-abi/abi.html
|
| Unfortunately this ABI didn't specify how __int128 (and other
| nonstandard types) are to be passed.
| jcranmer wrote:
| The Itanium ABI effectively specifies how to lower the C++ ABI
| to an assumed C ABI, and the C ABI is given by what is known as
| the "psABI" (processor-specific ABI).
|
| The (not-most-recent) x86-64 ABI is here:
| https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/x...,
| and it does actually explain how to pass __int128.
| kevingadd wrote:
| WebAssembly also has a de-facto standard ABI:
| https://github.com/WebAssembly/tool-conventions/blob/main/Ba...
| jcranmer wrote:
| I can definitely feel the pain of trying to work out ABI mismatch
| concerns. It doesn't help that it often isn't clear from the
| output what some of the underlying assumptions are--expected
| stack alignment, for example, or structs being broken up into
| registers or not.
|
| It would be nice if compilers could output some sort of metadata
| that basically says "ah yes, here's a struct, it requires this
| alignment, fields are at these offsets and have these sizes, and
| are present in these cases" (the latter option being able to
| support discriminated unions) or "function call, parameters are
| here, here, and here." You'd think this is what DWARF itself
| provides, but if you play around with DWARF for a bit, you
| discover that it actually lacks a lot of the low-level ABI
| details you want to uncover; instead, it's more of a format
| that's meant to be generic enough to convey to the debugger the
| AST of the source program along with some hint of how to map the
| binary code to that AST--you can't really write a language-
| agnostic DWARF-based debugger.
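
A minimal sketch of the kind of layout metadata described above, using
Python's ctypes module, which lays structs out by the platform's C
rules. The `Packet` struct and the `describe` helper are hypothetical
names made up for illustration; the numbers it prints depend on the
platform it runs on.

```python
import ctypes
import json


class Packet(ctypes.Structure):
    # Hypothetical example struct, laid out by ctypes using the native C rules.
    _fields_ = [
        ("tag", ctypes.c_uint8),
        ("length", ctypes.c_uint32),
        ("payload", ctypes.c_void_p),
    ]


def describe(struct_type):
    """Collect size, alignment, and per-field offsets of a ctypes struct."""
    fields = []
    for name, ctype in struct_type._fields_:
        descriptor = getattr(struct_type, name)  # field descriptor carries the offset
        fields.append(
            {
                "name": name,
                "offset": descriptor.offset,
                "size": ctypes.sizeof(ctype),
            }
        )
    return {
        "name": struct_type.__name__,
        "size": ctypes.sizeof(struct_type),
        "alignment": ctypes.alignment(struct_type),
        "fields": fields,
    }


print(json.dumps(describe(Packet), indent=2))
```

This only covers struct layout. It says nothing about how a struct is
passed in registers or on the stack, which is the other half of the
wish list.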
| PartiallyTyped wrote:
| What you are asking for sounds quite a bit like what rustc
| does.
|
| Rustc (outside `#[repr(C)]` types) offers no guarantees on the
| ordering of the fields; however, it guarantees that every
| instance of struct A will have the same ordering during that
| particular compilation. This allows rustc to compile external
| crates (as long as no monomorphization is needed) in a
| consistent manner across all crates that depend on them.
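
A rough illustration of why a compiler might want the freedom to
reorder fields. This is not rustc's algorithm; it uses Python's ctypes,
which always keeps the declared order and follows the platform's C
layout rules. `Declared` and `Reordered` are hypothetical structs with
the same three fields in different orders.

```python
import ctypes


class Declared(ctypes.Structure):
    # Small, large, small: padding is needed before and after the u32.
    _fields_ = [
        ("a", ctypes.c_uint8),
        ("b", ctypes.c_uint32),
        ("c", ctypes.c_uint8),
    ]


class Reordered(ctypes.Structure):
    # Same fields, largest first: the two u8 fields can share the tail.
    _fields_ = [
        ("b", ctypes.c_uint32),
        ("a", ctypes.c_uint8),
        ("c", ctypes.c_uint8),
    ]


for struct_type in (Declared, Reordered):
    offsets = {name: getattr(struct_type, name).offset for name, _ in struct_type._fields_}
    print(struct_type.__name__, ctypes.sizeof(struct_type), offsets)
```

On typical platforms where a 32-bit integer is 4-byte aligned, the
first layout takes 12 bytes and the second 8. Two pieces of code that
disagree on which order is in use will read the wrong bytes, which is
why the ordering has to be consistent within one compilation.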
| menaerus wrote:
| Most of the ABI issues arise when you start to mix and match
| shared libraries produced by different compilers, or even
| libraries produced by different versions of the same
| compiler.
|
| Rust has none of that, nor does it support dynamic linking, so I
| fail to understand what it is that rustc can offer in that
| solution space. There is nothing.
| bluGill wrote:
| Rust works around the issue by not allowing all the useful
| things that get you there. There are other useful things
| like sharing pointers across threads that rust will not let
| you do - for both better and worse. (better in that you
| avoid a lot of problems for something you rarely need -
| worse for those few cases where you actually need to do
| those and cannot)
| Georgelemental wrote:
| You can share pointers across threads in Rust, it's just
| `unsafe`.
| orivej wrote:
| Some Common Lisp FFIs have opted to coax this information out
| of the compiler. https://github.com/rpav/c2ffi is a C++ tool
| that links to libclang-cpp and literally outputs JSON with
| sizes and alignments. (It is then used by
| https://github.com/rpav/cl-autowrap to autogenerate a Lisp
| wrapper.) The older CFFI Groveller [1] works by generating C
| code which is compiled by the system C compiler (e.g. GCC or
| Clang) and, when executed, prints Lisp code that contains
| resolved values of constants, sizes, alignments, etc.
|
| [1] https://cffi.common-lisp.dev/manual/html_node/The-Groveller....
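
A loose sketch of the groveller idea described above: generate a small
C program that, once compiled by the system C compiler and run, prints
the resolved sizes and alignments. The `grovel_source` helper and the
`(defconstant ...)` output shape are assumptions made up for
illustration and are not CFFI's actual output format. Compiling and
running the generated program is left out.

```python
def grovel_source(header: str, types: list[str]) -> str:
    """Return C source that prints the size and alignment of each type."""
    lines = [
        "#include <stdio.h>",
        "#include <stddef.h>",
        f"#include <{header}>",
        "",
        "int main(void) {",
    ]
    for c_type in types:
        # Turn "struct tm" into "struct-tm" for the printed constant name.
        name = c_type.replace(" ", "-")
        lines.append(
            f'    printf("(defconstant +{name}-size+ %zu)\\n", sizeof({c_type}));'
        )
        lines.append(
            f'    printf("(defconstant +{name}-align+ %zu)\\n", _Alignof({c_type}));'
        )
    lines += ["    return 0;", "}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(grovel_source("time.h", ["struct tm"]))
```

The point of the design is that the compiler itself answers the
question, so the values match whatever that compiler and its flags
actually do on the target.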
| w10-1 wrote:
| Also function pointers, errors & exception-handling,
| async/channels/thread-locals, Go stacks, Swift @objc, @cdecl and
| C++ interop, FFI dialects...
|
| It's not really pain anymore; it's a kind of hilarity.
| zX41ZdbW wrote:
| I think the right way to avoid this problem is to avoid using ABI
| at runtime or build time.
|
| At runtime, it means - don't use shared libraries. At build time,
| it means - build every library from the source, don't use pre-
| built artifacts.
|
| This sounds controversial... But it allows you to change compiler
| or compiler options at any time, and you don't have to bother. It
| also enables cross-compilation, reproducible builds, and portable
| binaries. You no longer have to ask developers to set up a
| complex build environment on a specific Linux distribution
| because it works everywhere.
|
| I use this approach for ClickHouse.
| comex wrote:
| Even then, you still need ABI consistency between compilers if
| you want to link together codebases written in different
| languages (e.g. C and Rust).
|
| In practice this almost always 'just works' because most cross-
| language calls simply don't use the kinds of complicated types
| discussed in the blog post. They tend to stick to simple
| integer and pointer types, where ABI consistency is usually a
| given.
|
| Though you can still get into trouble when passing function
| pointers, especially when combined with some modern control-
| flow integrity systems.
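
A small sketch of the "simple integer types" case, using Python's
ctypes to build a C-callable function pointer and call through it. The
`BinaryOp` type and `add` function are hypothetical. Both sides agree
on the signature only because it is spelled out by hand; nothing checks
it. The sketch does not model control-flow integrity, and it assumes
the platform allows ctypes to allocate callback trampolines.

```python
import ctypes

# Hypothetical callback type, equivalent to: int32_t (*)(int32_t, int32_t)
BinaryOp = ctypes.CFUNCTYPE(ctypes.c_int32, ctypes.c_int32, ctypes.c_int32)


def add(a, b):
    return a + b


# Wrap the Python function in a function pointer that follows the C
# calling convention for the declared signature.
c_add = BinaryOp(add)

# Calling the wrapper converts the arguments to int32_t and goes
# through the C-callable entry point.
result = c_add(2, 40)
print(result)
```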
| tester756 wrote:
| >Even then, you still need ABI consistency between compilers
| if you want to link together codebases written in different
| languages (e.g. C and Rust).
|
| Let's talk over http, queue or other IPC-ish way
| 0xdeafbeef wrote:
| You can't pass values in registers using this model
| dwattttt wrote:
| You _still_ need a consistent way to talk about values; IPC
| systems tackle the same problems under the name marshalling
| & de/serialisation. They just tend to take much more
| conservative options to deal with exactly this kind of
| problem (you don't have to care about integer endian-ness
| if integers are expressed as strings).
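
A short sketch of that trade-off using Python's struct module: the same
integer packed as raw little-endian bytes, raw big-endian bytes, and as
a decimal string. The value is arbitrary.

```python
import struct

value = 0x12345678

little = struct.pack("<I", value)  # 4 bytes, least significant byte first
big = struct.pack(">I", value)     # 4 bytes, most significant byte first
text = str(value).encode("ascii")  # decimal digits, no byte order involved

print(little.hex())  # 78563412
print(big.hex())     # 12345678
print(text)          # b'305419896'

# Reading little-endian bytes as big-endian silently yields a different number.
misread = struct.unpack(">I", little)[0]
print(hex(misread))  # 0x78563412

# The string form round-trips without either side knowing the other's endianness.
roundtrip = int(text.decode("ascii"))
```

The string form costs more bytes and more parsing, which is the
conservative choice the comment refers to.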
| jcranmer wrote:
| Okay, how do you propose to talk to your kernel then?
| anonymoushn wrote:
| Over an spsc queue (unfortunately you cannot mmap this way
| yet, and you cannot set up the spsc queue itself this way)
| vlovich123 wrote:
| io_uring?
| eddd-ddde wrote:
| Your kernel will likely have well defined interfaces. You
| don't need libraries to talk to the kernel.
| jcranmer wrote:
| But how can you use those interfaces without an ABI?
|
| Fundamentally, an ABI _is_ the way you define interfaces.
| vlovich123 wrote:
| Except on Linux, those well-defined interfaces sit behind
| a C API.
| menaerus wrote:
| This can work only if you own the entire codebase and have all
| the external dependencies that you rely on statically linked
| (compiled) into your product.
|
| I also very much prefer this way of handling dependencies, but
| it's not a solution for all ABI problems, since it also implies
| that you will need to statically link (compile) against all the
| transitive dependencies. These include, at the very minimum,
| libc++ or libstdc++. And with this requirement in place, this
| already isn't possible for many of the codebases out there.
|
| And it also brings another issue to the table: X version of
| libc++/libstdc++ depends on Y version of libc.
|
| Since you generally cannot statically link against the libc,
| and you don't own it since it's part of the OS, this becomes a
| hairy problem. You really need to make sure that your code
| works across different versions, and combinations thereof, of
| libc++/libstdc++/libc.
|
| And then there's ... a bunch of other different platforms which
| aren't Linux.
| gigel82 wrote:
| I struggled with this many times and at the end of the day threw
| in the towel and just wrapped everything in plain C exports.
| That's the only way I know to get ABI compatibility across
| different compilers/toolsets/versions. COM-like constructs come
| as a close second.
|
| It's an unfortunate state.
___________________________________________________________________
(page generated 2024-05-08 23:01 UTC)