[HN Gopher] What the hell is a target triple?
___________________________________________________________________
What the hell is a target triple?
Author : ingve
Score : 154 points
Date : 2025-04-15 18:35 UTC (22 hours ago)
(HTM) web link (mcyoung.xyz)
(TXT) w3m dump (mcyoung.xyz)
| Retr0id wrote:
| Note to author, I'm not sure the word "anachronism" is being used
| correctly in the intro.
| compyman wrote:
| I think the meaning is that the idea that compilers can only
| compile for their host machine is an anachronism, since that
| was historically the case but is no longer true.
| stefan_ wrote:
| Telling people that "Clang can compile for any architecture
| you like!" tends to confuse them more than it helps. I
| suppose it sets up unrealistic assumptions because of course
| outputting assembly for some architecture is a very long way
| from making working userland binaries for a system based on
| that architecture, which is what people actually want.
|
| And ironically in all of this, building a full toolchain
| based on GCC is _still_ easier than with LLVM.
| bregma wrote:
| Heck, it hasn't been true since the 1950s. Consider it as
| "has never been true".
|
| Oh, sure, there have been plenty of native-host-only
| compilers. It was never a property of all compilers, though.
| Most system bring-ups, from the mainframes of the 1960s
| through the minis of the 1970s to the micros and embeddeds of
| the 1980s and onwards have required cross compilers.
|
| I think what he means is that a single-target toolchain is an
| anachronism. That's also not true, since even clang doesn't
| target everything under the sun in one binary. A toolchain
| needs far more than a compiler, for a start; it needs the
| headers and libraries and it needs a linker. To go from
| source to executable (or herd of dynamic shared objects)
| requires a whole lot more than installing the clang (or
| whatever front-end) binary and choosing a nifty target
| triple. Most builds of clang don't even support all the
| interesting target triples and you need to build it yourself,
| which requires a lot more computer than I can afford.
|
| Target triples are not even something limited to toolchains.
| I maintain software that gets cross-built to all kinds of
| targets all the time and that requires target triples for the
| same reasons compilers do. Target triples are just a basic
| tool of the trade if you deal with anything other than
| scripting the browser and they're a solved problem
| rediscovered every now and then by people who haven't studied
| their history.
| kupiakos wrote:
| It's being used correctly: something that is conspicuously old-
| fashioned for its environment is an anachronism. A toolchain
| that only supports native builds fits.
| Retr0id wrote:
| The article does not place any given toolchain within an
| incorrect environment, though.
|
| If someone said "old compilers were usually cross-compilers",
| that would be an _ahistoric_ statement (somewhat).
|
| If someone used clang in a movie set in the 90s, that would
| be anachronistic.
| bqmjjx0kac wrote:
| It's technically correct, but feels a bit forced.
| jkelleyrtp wrote:
| The author's blog is a FANTASTIC source of information. I
| recommend checking out some of their other posts:
|
| - https://mcyoung.xyz/2021/06/01/linker-script/
|
| - https://mcyoung.xyz/2023/08/09/yarns/
|
| - https://mcyoung.xyz/2023/08/01/llvm-ir/
| eqvinox wrote:
| Given TFA's bias against GCC, I'm not so sure. e.g. looking at
| the linker script article... it's also missing the __start_XYZ
| and __stop_XYZ symbols automatically created by the linker.
| matheusmoreira wrote:
| It also focuses exclusively on sections. I wish it had at
| least mentioned segments, also known as program headers.
| Linux kernel's ELF loader does not care about sections, it
| only cares about segments.
|
| Sections and segments are more or less the same concept:
| metadata that tells the loader how to map each part of the
| file into the correct memory regions with the correct memory
| protection attributes. Biggest difference is segments don't
| have names. Also they aren't neatly organized into logical
| blocks like sections are, they're just big file extents. The
| segments table is essentially a table of arguments for the
| mmap system call.
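|
| To make the "table of mmap arguments" reading concrete, here is a
| minimal Python sketch that lists the PT_LOAD entries of an ELF
| image. It assumes a well-formed 64-bit little-endian file; the
| name `load_segments` and the dict keys are made up for
| illustration, and this is not how the kernel's loader is written.

```python
import struct

# Field layouts from the ELF64 specification (little-endian assumed).
EHDR = struct.Struct("<16sHHIQQQIHHHHHH")   # Elf64_Ehdr, 64 bytes
PHDR = struct.Struct("<IIQQQQQQ")           # Elf64_Phdr, 56 bytes
PT_LOAD = 1
PF_X, PF_W, PF_R = 1, 2, 4


def load_segments(image: bytes):
    """Return one dict per PT_LOAD program header in an ELF64 LE image."""
    ehdr = EHDR.unpack_from(image, 0)
    ident, phoff, phentsize, phnum = ehdr[0], ehdr[5], ehdr[9], ehdr[10]
    # ident[4] is EI_CLASS (2 = 64-bit), ident[5] is EI_DATA (1 = little-endian).
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a 64-bit little-endian ELF image")
    segments = []
    for i in range(phnum):
        (p_type, p_flags, p_offset, p_vaddr, _p_paddr,
         p_filesz, p_memsz, _p_align) = PHDR.unpack_from(image, phoff + i * phentsize)
        if p_type != PT_LOAD:
            continue
        # Render the permission bits the way /proc/<pid>/maps does.
        prot = "".join(ch if p_flags & bit else "-"
                       for ch, bit in (("r", PF_R), ("w", PF_W), ("x", PF_X)))
        segments.append({"offset": p_offset, "vaddr": p_vaddr,
                         "filesz": p_filesz, "memsz": p_memsz, "prot": prot})
    return segments
```

| Each returned entry is roughly one mapping: a file offset, a
| target address, a length and a protection. Section names never
| appear anywhere in it.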
|
| Learning this stuff from scratch was pretty tough. Linker
| script has commands to manipulate the program header table
| but I couldn't figure those out. In the end I asked
| developers to add command line options instead and the
| maintainer of mold actually obliged.
|
| Looks like very few people know about stuff like this. One
| can use it to do some heavy wizardry though. I leveraged this
| machinery into a cool mechanism for embedding arbitrary data
| into ELF files. The kernel just memory maps the data in
| before the program has even begun execution. Typical
| solutions involve the program finding its own executable on
| the file system, reading it into memory and then finding some
| embedded data section. I made the kernel do almost all of
| that automatically.
|
| https://www.matheusmoreira.com/articles/self-contained-
| lone-...
| o11c wrote:
| I wouldn't call them "same concept" at all. Segments
| (program headers) are all about the runtime (executables
| and shared libraries) and are low-cost. Sections are all
| about development (.o files) and are detailed.
|
| Generally there are many sections combined into a single
| segment, other than special-purpose ones. Unless you are
| reimplementing ld.so, you almost certainly don't want to
| touch segments; sections are far easier to work with.
|
| Also, normally you just call `getauxval`, but if
| needed the type is already named `ElfW(auxv_t)*`.
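|
| For reference, the auxiliary vector that `getauxval` reads is a
| flat array of (type, value) machine-word pairs terminated by
| AT_NULL. A rough Python sketch of walking it follows; `parse_auxv`
| is a made-up helper, the tag table is deliberately incomplete, and
| the default 8-byte little-endian word is an assumption (x86-64
| Linux).

```python
# A few auxv tags from <elf.h>; this table is deliberately incomplete.
AT_NAMES = {0: "AT_NULL", 3: "AT_PHDR", 4: "AT_PHENT",
            5: "AT_PHNUM", 6: "AT_PAGESZ", 9: "AT_ENTRY"}


def parse_auxv(raw: bytes, word_size: int = 8, byteorder: str = "little"):
    """Decode (tag, value) word pairs until the AT_NULL terminator."""
    entries = {}
    step = 2 * word_size
    for pos in range(0, len(raw) - step + 1, step):
        tag = int.from_bytes(raw[pos:pos + word_size], byteorder)
        value = int.from_bytes(raw[pos + word_size:pos + step], byteorder)
        if tag == 0:  # AT_NULL ends the vector
            break
        # Unknown tags are kept under their numeric value.
        entries[AT_NAMES.get(tag, tag)] = value
    return entries
```

| On Linux the raw bytes can be read from /proc/self/auxv, so
| `parse_auxv(open("/proc/self/auxv", "rb").read())` is one way to
| try it without writing any C.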
| matheusmoreira wrote:
| > I wouldn't call them "same concept" at all.
|
| They are both metadata about file extents and their
| memory images.
|
| > sections are far easier to work with
|
| Yes. They are not, however, loaded into memory by
| default. Linkers do not generate LOAD segments for
| section metadata since they are not needed for execution.
| Thus it's impossible for a program to introspect its own
| sections without additional logic and I/O to read them
| into memory.
|
| > Also, normally you just call `getauxval`, but if
| needed the type is already named `ElfW(auxv_t)*`.
|
| True. I didn't use it because it was not available. I
| wrote my article in the context of a freestanding nolibc
| program.
| o11c wrote:
| Right, but you can just use the section start/end symbols
| for a section that already goes into a mapped segment.
| matheusmoreira wrote:
| Can you show me how that would work?
|
| It's trivial to put arbitrary files into sections:
| objcopy --add-section program.files.1=file.1.dat \
| --add-section program.files.2=file.2.dat \
| program program+files
|
| The problem is the program.files.* sections do not get
| mapped in by a LOAD segment. I ended up having to write
| my own tool to patch in a LOAD segment into the segments
| table because objcopy does not have the ability to do it.
|
| Even asked a Stack Overflow question about this two years
| ago:
|
| https://stackoverflow.com/q/77468641
|
| The only answer I got told me to simply read the sections
| into memory via /proc/self/exe or edit the segments table
| and make it so that the LOAD segments cover the whole
| file. I eventually figured out ways to add LOAD segments
| to the table. By that point I didn't need sections
| anymore, just a custom segment type.
| o11c wrote:
| The whole point of section names is that they mean
| something. If you give it a name that matches `.rodata.*`
| it will be part of the existing read-only LOADed
| segments, or `.data.*` for (private) read-write.
|
| Use `ld --verbose` to see what sections are mapped by
| default (it is impossible for a linker to work without
| having such a linker script; we're just lucky that GNU ld
| exposes it in a sane form rather than hard-coding it as C
| code). In modern versions of the linker (there is still
| old documentation found by search engines), you can
| specify multiple SECTIONS commands (likely from multiple
| scripts, i.e. just files passed on the command line), but
| why would you when you can conform to the default one?
|
| You should pick a section name that won't collide with
| the section names generated by `-fdata-sections` (or
| `-ffunction-sections` if that's ever relevant for you).
| matheusmoreira wrote:
| That requires relinking the executable. That is not
| always desirable or possible. Unless the dynamic linker
| ignores the segments table in favor of doing this on the
| fly... Even if that's the case, it won't work for
| statically linked executables. Only the dynamic linker
| can assign meaning to section names at runtime and the
| dynamic linker isn't involved at all in the case of
| statically linked programs.
| eqvinox wrote:
| Absolutely agree. Had my own fun dealings with ELF, and to
| be clear, on plain mainline shipping products (amd64
| Linux), not toys/exercise/funky embedded. (Wouldn't have
| known about section start/stop symbols otherwise)
| sramsay wrote:
| I was really struck by the antipathy toward GCC. I'm not sure
| I quite understand where it's coming from.
| fweimer wrote:
| I think GCC's more-or-less equivalent to Clang's --target is
| called -B: https://gcc.gnu.org/onlinedocs/gcc/Directory-
| Options.html#in...
|
| I assume it works with an all-targets binutils build. I haven't
| seen anyone building their cross-compilers in this way (at least
| not in recent memory).
| JoshTriplett wrote:
| I haven't either, probably because it would require building
| once per target and installing all the individual binaries.
|
| This is one of the biggest differences between clang and GCC:
| clang has one binary that supports multiple targets, while a
| GCC build is always target-specific.
| o11c wrote:
| Old versions of GCC used to provide `-b <machine>` (and also
| `-V <version>`), but they were removed a long time ago in favor
| of expecting people to just use and set `CC` correctly.
|
| It looks like gcc 3.3 through 4.5 just forwards to an external
| driver; prior to that it seems like it used the same driver for
| different paths, and after that it is removed.
| jcranmer wrote:
| I did start to try to take clang's TargetInfo code
| (https://github.com/llvm/llvm-project/blob/main/clang/lib/Bas...)
| and port it over to TableGen, primarily so somebody could
| actually extract useful auto-generated documentation out of it,
| like "What are all the targets available?"
|
| I actually do have working code for the triple-to-TargetInfo
| instantiation portion (which is fun because there's one or two
| cases that juuuust aren't quite like all of the others, and I'm
| not sure if that's a bad copy-paste job or actually intentional).
| But I never got around to working out how to actually integrate
| the actual bodies of TargetInfo implementations--which provide
| things like the properties of C/C++ fundamental types or default
| macros--into the TableGen easily, so that patch is still merely
| languishing somewhere on my computer.
| IshKebab wrote:
| Funny thing I found when I gave up trying to find documentation
| and read the LLVM source code (seems to be what happened to the
| author too!): there are actually _five_ components of the triple,
| not four.
|
| I can't remember what the fifth one is, but yeah... insane
| system.
|
| Thanks for writing this up! I wonder if anyone will ever come up
| with something more sensible.
| o11c wrote:
| There are up to 7 components in a triple, but not all are used
| at once. The general format is:
| <machine>-<vendor>-<kernel>-<libc?><abi?><fabi?>
|
| But there's also <obj>, see below.
|
| Note that there are both canonical and non-canonical triples in
| use. Canonical triples are output by `config.guess` or
| `config.sub`; non-canonical triples are input to `config.sub`
| and used as prefixes for commands.
|
| The <machine> field (1st) is what you're running on, and on
| some systems it includes a version number of sorts. Most 64-bit
| vs 32-bit differences go here, except if the runtime differs
| from what is natural (commonly "32-bit pointers even though the
| CPU is in 64-bit mode"), which goes in <abi> instead.
| Historically, "arm" and "mips" have been a mess here, but that
| has largely been fixed, in large part as a side-effect of
| Debian multiarch (whose triples only have to differ from GNU
| triples in that they canonicalize i[34567]86 to i386, but you
| should use dpkg-architecture to do the conversion for sanity).
|
| The <vendor> field (2nd) is not very useful these days. It
| defaults to "unknown" but as of a few years ago "pc" is used
| instead on x86 (this means that the canonical triple can
| change, but this hasn't been catastrophic since you should
| almost always use the non-canonical triple except when pattern-
| matching, and when pattern-matching you should usually ignore
| this field anyway).
|
| The <kernel> field (3rd) is pretty obvious when it's called
| that, but it's often called <os> instead since "linux" is an
| oddity for regularly having a <libc> component that differs. On
| many systems it includes version data (again, Linux is the
| oddity for having a stable syscall API/ABI). One notable
| exception: if a GNU userland is used on a BSD/Solaris system, a
| "k" is prepended. "none" is often used for
| freestanding/embedded compilation, but see <obj>.
|
| The <libc> field (main part of the 4th) is usually absent on
| non-Linux systems, but mandatory for "linux". If it is absent,
| the dash after the kernel is usually removed, except if there
| are ABI components. Note that "gnu" can be both a kernel (Hurd)
| and a libc (glibc). Android uses "android" here, so maybe
| <libc> is a bit of a misnomer (it's not "bionic") - maybe
| <userland>?
|
| <abi>, if present, means you aren't doing the historical
| default for the platform specified by the main fields. Other
| than "eabi" for ARM, most of this is for "use 32-bit pointers
| but 64-bit registers".
|
| <fabi> can be "hf" for 32-bit ARM systems that actually support
| floats in hardware. I don't think I've seen anything else,
| though I admit the main reason I separately document this from
| <abi> is because of how Debian's architecture puts it
| elsewhere.
|
| <obj> is the object file format, usually "aout", "coff", or
| "elf". It can be appended to the kernel field (but before the
| kernel version number), or replace it if "none", or it can go
| in the <abi> field.
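|
| A minimal Python sketch of the field breakdown above, for
| canonical triples only. The `Triple` type, `parse_canonical` and
| the `KNOWN_LIBCS` list are all invented for illustration; the
| splitting is a heuristic, it ignores the <obj> and "none" special
| cases, and it is not what config.sub actually does.

```python
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    machine: str
    vendor: str
    kernel: str
    libc: Optional[str]
    abi: Optional[str]
    fabi: Optional[str]


# Illustrative, incomplete list of userland names seen in the 4th field.
KNOWN_LIBCS = ("gnu", "musl", "uclibc", "android")


def parse_canonical(triple: str) -> Triple:
    """Split a canonical machine-vendor-kernel[-env] triple (heuristic)."""
    parts = triple.split("-", 3)
    if len(parts) < 3:
        raise ValueError(f"expected at least 3 fields: {triple!r}")
    machine, vendor, kernel = parts[:3]
    env = parts[3] if len(parts) == 4 else ""
    libc = abi = fabi = None
    for name in KNOWN_LIBCS:
        if env.startswith(name):
            libc, env = name, env[len(name):]
            break
    if env.endswith("hf"):   # hard-float marker, e.g. gnueabihf
        fabi, env = "hf", env[:-2]
    if env:                  # whatever is left is treated as the ABI
        abi = env
    return Triple(machine, vendor, kernel, libc, abi, fabi)
```

| With this, "arm-unknown-linux-gnueabihf" comes out as libc "gnu",
| abi "eabi", fabi "hf", while "x86_64-pc-linux-gnu" has only a
| libc. Non-canonical inputs such as "x86_64-linux-gnu" are out of
| scope for the sketch.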
| IshKebab wrote:
| Nah I dunno where you're getting your information from but
| LLVM only supports 5 components.
|
| See the code starting at line 1144 here:
| https://llvm.org/doxygen/Triple_8cpp_source.html
|
| The components are arch-vendor-os-environment-objectformat.
|
| It's absolutely full of special cases and hacks. Really at
| this point I think the only sane option is an explicit list
| of fixed strings. I think Rust does that.
| o11c wrote:
| LLVM didn't invent the scheme; why should we pay attention
| to their copy and not look at the original?
|
| The GNU Config project is the original.
| IshKebab wrote:
| The article goes into this a bit. But basically because
| LLVM is extremely popular and used as a backend by lots
| of other languages, e.g. Rust.
|
| Frankly being the originators of this deranged scheme is
| a good reason _not_ to listen to GNU!
| jcranmer wrote:
| You're not really contradicting o11c here; what LLVM calls
| "environment" is a mixture of what they called
| libc/abi/fabi. There's also what LLVM calls "subarch" to
| distinguish between different architectures that may be
| relevant (e.g., i386 is not the same as i686, although LLVM
| doesn't record this difference since it's generally less
| interested in targeting old hardware), and there's also OS
| version numbers that may or may not be relevant.
|
| The underlying problem with target triples is that
| architecture-vendor-system isn't sufficient to uniquely
| describe the relevant details for specifying a toolchain,
| so the necessary extra information has been somewhat
| haphazardly added to the format. On top of that, since the
| relevance of some of the information is questionable for
| some tasks (especially the vendor field), different
| projects have chosen not to care about subtle differences,
| so the _normalization_ of a triple is different between
| different projects.
|
| LLVM's definition is not more or less correct than gcc's
| here, nor are these the only definitions floating around.
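|
| As an illustration of how normalization can differ, here is a
| Python sketch of one possible policy: drop the vendor field and
| fold i[3-7]86 onto i386 before comparing two triples. The name
| `comparison_key` and the policy itself are assumptions made for
| the example; this is neither config.sub's nor LLVM's behaviour.

```python
import re


def comparison_key(triple: str) -> tuple:
    """Reduce a triple to the fields this (hypothetical) policy compares."""
    fields = triple.split("-")
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 fields: {triple!r}")
    # Fold the 32-bit x86 variants together, Debian-multiarch style.
    machine = re.sub(r"^i[3-7]86$", "i386", fields[0])
    # fields[1] is the vendor; this policy ignores it entirely.
    return (machine, *fields[2:])
```

| Under this policy "i686-pc-linux-gnu" and "i386-unknown-linux-gnu"
| compare equal. A project that cares about the vendor field, or
| about i386 versus i686, would need a different key, which is
| exactly the divergence described above.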
| o11c wrote:
| Hm, looking to see if the vendor field is actually
| meaningful ... I see some stuff for m68k and mips and
| sysv targets ... some of it working around pre-standard
| vendor C implementations
|
| Ah, I found a modern one:
| i[3456789]86-w64-mingw* does not use winsup
| i[3456789]86-*-mingw* with other vendors does use winsup
|
| There are probably more; this is embedded in all sorts of
| random configure scripts and it is very not-greppable.
| vient wrote:
| > Kalimba, VE
|
| > No idea what this is, and Google won't help me.
|
| Seems that Kalimba is a DSP, originally by CSR and now by
| Qualcomm. CSR8640 is using it, for example
| https://www.qualcomm.com/products/internet-of-things/consume...
|
| VE is harder to find with such short name.
| AKSF_Ackermann wrote:
| NEC Vector Engine. Basically not a thing outside
| supercomputers.
| fc417fc802 wrote:
| $800 for the 20B-P model on ebay. More memory bandwidth than
| a 4090. I wonder if llama.cpp could be made to run on it?
|
| I see rumors they charge for the compiler though.
| kridsdale1 wrote:
| I really appreciate the angular tilt of the heading type on that
| blog.
| throw0101d wrote:
| Noticed endians listed in the table. It seems like little-endian
| has basically taken over the world in 2025:
|
| * https://en.wikipedia.org/wiki/Endianness#Hardware
|
| Is there anything that is used a lot that is not little? IBM's
| stuff?
|
| Network byte order is BE:
|
| * https://en.wikipedia.org/wiki/Endianness#Networking
| thro3838484848 wrote:
| Java VM is BE.
| kbolino wrote:
| This is misleading at best. The JVM only exposes multibyte
| values to ordinary applications in such a way that byte order
| doesn't matter. You can't break out a pointer and step
| through the bytes of a long field to see what order it's in,
| at least not without the unsafe memory APIs.
|
| In practice, any real JVM implementation will simply use
| native byte order as much as possible. While bytecode and
| other data in class files is serialized in big endian order,
| it will be converted to native order whenever it's actually
| used. If you do pull out the unsafe APIs, you can see that
| e.g. values are little endian on x86(-64). The JVM would
| suffer from major performance issues if it tried to impose a
| byte order different from the underlying platform.
| PhilipRoman wrote:
| One relatively commonly used class which exposes this is
| ByteBuffer and its Int/Long variants, but there you can
| specify the endianness explicitly (or set it to match the
| native one).
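|
| The same idea in Python, as a rough analogue rather than the Java
| API: integer/byte conversions take the byte order as an explicit
| argument, and `sys.byteorder` reports the native one. The helper
| name `encode_u32` is made up for the example.

```python
import sys


def encode_u32(value: int, byteorder: str = sys.byteorder) -> bytes:
    """Encode an unsigned 32-bit integer in the requested byte order."""
    return value.to_bytes(4, byteorder)


big = encode_u32(0x01020304, "big")        # most significant byte first
little = encode_u32(0x01020304, "little")  # least significant byte first
native = encode_u32(0x01020304)            # whatever the host uses
```

| `native` matches `little` on x86(-64) and `big` on a big-endian
| host; code that always passes the order explicitly never has to
| care which.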
| forrestthewoods wrote:
| BE isn't technically dead but it's practically dead for almost
| all projects. You can static_assert byte order and then never
| think about BE ever again.
|
| All of my custom network serialization formats use LE because
| there's literally no reason to use BE for network byte order.
| It's pure legacy cruft.
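|
| A small Python sketch of that approach: pin the wire format to
| little-endian in the format string, and optionally refuse to run
| on big-endian hosts as a loose analogue of the static_assert. The
| header layout (magic, version, payload length) and all names are
| invented for the example.

```python
import struct
import sys

# "<" means little-endian with no padding: 4-byte magic, u16 version, u32 length.
HEADER = struct.Struct("<4sHI")
MAGIC = b"DEMO"


def require_little_endian_host() -> None:
    """Fail early on big-endian hosts (rough analogue of a static_assert)."""
    if sys.byteorder != "little":
        raise RuntimeError("big-endian hosts are not supported")


def pack_message(version: int, payload: bytes) -> bytes:
    """Prefix the payload with a little-endian header."""
    return HEADER.pack(MAGIC, version, len(payload)) + payload


def unpack_message(data: bytes):
    """Return (version, payload), validating the magic and the length."""
    magic, version, length = HEADER.unpack_from(data, 0)
    if magic != MAGIC:
        raise ValueError("bad magic")
    body = data[HEADER.size:HEADER.size + length]
    if len(body) != length:
        raise ValueError("truncated payload")
    return version, body
```

| Because the format string fixes the order, the encoded bytes are
| the same on any host; the host check only matters for code that
| reinterprets raw memory directly.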
| f_devd wrote:
| ...Until you find yourself having to work around legacy code
| to support some weird target that does still use BE. Speaking
| from experience (tbf usually lower level than anything
| actually networked, more like RS485 and friends).
| dharmab wrote:
| LEON, used by the European Space Agency, is big endian.
| naruhodo wrote:
| Should have been called BEON.
| formerly_proven wrote:
| 10 years ago the fastest BE machines that were practical were
| then-ten year old powermacs. This hasn't really changed. I
| guess they're more expensive now.
| eqvinox wrote:
| e6500/T4240 are faster than powermacs. Not sure how rare they
| are nowadays, we didn't have any trouble buying some (on
| eBay). 12x2 cores, 48GB RAM, for BE that's essentially
| heaven...
| Palomides wrote:
| IBM's Power chips can run in either little or big modes, but
| "used a lot" is a stretch
| inferiorhuman wrote:
| Most PowerPC related stuff (e.g. Freescale MPC5xx found in a
| bunch of automotive applications) can run in either big or
| little endian mode, as can most ARM and MIPS (routers, IP
| cameras) stuff. Can't think of the last time I've seen any of
| them configured to run in big endian mode tho.
| classichasclass wrote:
| For the large Power ISA machines, it's most commonly when
| running AIX or IBM i these days, though the BSDs generally
| run big too.
| richardwhiuk wrote:
| Some ARM stuff.
| rv3392 wrote:
| Apart from IBM Power/AIX systems, SPARC/Solaris is another one.
| I wouldn't say either of these are used a lot, but there's a
| reasonable amount of legacy systems out there that are still
| being supported by IBM and Oracle.
| psyclobe wrote:
| Sounds like what we use with vcpkg to define the systems tooling;
| still trying to make sense of it all these years later, but we
| define things like x64-Linux-static to imply target architecture
| platform and linkage style to runtime.
| bruce343434 wrote:
| Why does this person have such negative views of GCC and positive
| bias towards LLVM?
| nemothekid wrote:
| If OP is above 30 - it's probably due to the frustration of
| trying to modularize GCC that led to the creation of LLVM in
| the first place. If OP is below 30, it's probably because he
| grew up in a world where most compiler research and design is
| done on LLVM and GCC is for grandpa.
| matheusmoreira wrote:
| Good question. Author is incredibly hostile to one of the most
| important pieces of software ever developed because of the way
| they approached the problem nearly 40 years ago. Then he
| criticizes Go for trying to redesign the system instead of just
| using target triples...
| FitCodIa wrote:
| The author writes: "really stupid way in which GCC does cross
| compiling [...] Nobody with a brain does this [...]", and
| then admits in the footnote, "I'm not sure why GCC does
| this".
|
| Immature to the point of alienating.
| xyst wrote:
| Seems to have a decent amount of knowledge in this domain in
| education and professional work. Author is from MIT so maybe
| professors had a lot of influence here.
|
| also, gcc is relatively old and comes with a lot of baggage.
| LLVM is sort of the de facto standard now with improvements in
| performance
| rlpb wrote:
| > LLVM is sort of the de facto standard now...
|
| Distributions, and therefore virtually all the software used
| by a distribution user, still generally use gcc. LLVM is only
| the de facto standard when doing something new, and for JIT.
| bruce343434 wrote:
| as someone who uses both Clang and GCC to cover each other's
| weaknesses, as far as I can tell both LLVM and GCC are
| hopelessly beastly codebases in terms of raw size and their
| complexity. I think that's just what happens when people
| desire to build an "everything compiler".
|
| From what I gathered, LLVM has a lot of C++ specific design
| choices in its IR language anyway. I think I'd count that as
| baggage.
|
| I personally don't think one is better than the other.
| Sometimes clang produces faster code, sometimes gcc. I
| haven't really dealt with compiler bugs from either. They
| compile my projects at the same speed. Clang is better at
| certain analyses, gcc better at certain others.
| ahartmetz wrote:
| Clang used to compile much faster than GCC. I was excited.
| Now there is barely any difference, so I keep using GCC and
| occasionally some Clang-based tools such as iwyu,
| ClangBuildAnalyzer or sanitizer options (rare, Valgrind is
| easier and more powerful though sanitizers also have unique
| features).
| Skywalker13 wrote:
| It is unfortunate. GCC has enabled the compilation of countless
| lines of source code for nearly 40 years and has served
| millions of users. Regardless of whether its design is
| considered good or bad today, GCC has played an essential role
| and has enabled the emergence of many projects and new
| compilers. GCC deserves deep respect.
| steveklabnik wrote:
| I have intense respect for the history of gcc, but everything
| about using it screams that it's stuck in the past.
|
| LLVM has a lot of problems, but it feels significantly more
| modern.
|
| I do wish we had a "new LLVM" doing to LLVM what it did to gcc.
| Just because it's better doesn't mean it's perfect.
|
| Basically, you can respect history while also being honest
| about the current state of things. But also, doing so requires
| you to care primarily about things like ease of use, rather
| than things like licenses. For some people, they care about
| licenses first, usability second.
| flkenosad wrote:
| Honestly, I love that both exist with their respective world
| views.
| steveklabnik wrote:
| I for sure don't want to suggest that anyone who loves gcc
| shouldn't be working on what they love. More compilers are
| a good thing, generally. Just trying to say why I have a
| preference.
| tialaramex wrote:
| Their IR is a mess. So a "new LLVM" ought to start by nailing
| down the IR.
|
| And as a bonus, seems to me a nailed down IR actually _is_
| that portable assembly language the C people keep telling us
| is what they wanted. Most of them don't actually want that
| and won't thank you - but if even 1% of the "I need a
| portable assembler" crowd actually did want a portable
| assembler they're a large volume of customers from day one.
| o11c wrote:
| Having tried writing plugins for both, I very much prefer
| GCC's codebase. You have to adapt to its quirks, but at
| least it won't pull the rug from under your feet
| gratuitously. There's a reason every major project ends up
| embedding a years-old copy of LLVM rather than just using
| the system version.
|
| If you're ignoring the API and writing IR directly there
| are advantages to LLVM though.
| flkenosad wrote:
| It's the new anti-woke mind virus going around attacking
| anything "communist" such as copyleft, Stallman, GCC, GNU, etc.
| forrestthewoods wrote:
| What a great article.
|
| Every time I deal with target triples I get confused and have to
| refresh my memory. This article makes me feel better in knowing
| that target triples are an unmitigated cluster fuck of cruft and
| bad design.
|
| > Go does the correct thing and distributes a cross compiler.
|
| Yes but also no. AFAIK Zig is the _only_ toolchain to provide
| native cross compiling out of the box without bullshit.
|
| Missing from this discussion is the ability to specify and target
| different versions of glibc. Something that I think only Zig even
| _attempts_ to do because Linux's philosophy of building against
| local system globals is an incomprehensibly bad choice. So all
| these target triples are woefully underspecified.
|
| I like that at least Rust defines its own clear list of target
| triples that are more rational than LLVM's. At this point I feel
| like the whole concept of target triples needs to be thrown
| away. Everything about it is bad.
| peterldowns wrote:
| Some other sources of target triples (some mentioned in the
| article, some not):
|
| rustc: `rustc --print target-list`
|
| golang: `go tool dist list`
|
| zig: `zig targets`
|
| As the article points out, the complete lack of standardization
| and consistency in what constitutes a "triple" (sometimes
| actually a quad!) is kind of hellishly hilarious.
| ycombinatrix wrote:
| at least we don't have to deal with --build, --host, --target
| nonsense anymore
| rendaw wrote:
| You do on Nix. And it's as inconsistently implemented there
| as anywhere.
| lifthrasiir wrote:
| > what constitutes a "triple" (sometimes actually a quad!)
|
| It is actually a quintuple at most because the first part,
| architecture, may contain a version for e.g. ARM. And yet it
| doesn't fully describe the actual target because it may require
| an additional OS version for e.g. macOS. Doubly silly.
| achierius wrote:
| Why would macOS in particular require an OS version where
| other platforms would not -- just backwards compatibility?
| cbmuser wrote:
| >>32-bit x86 is extremely not called "x32"; this is what Linux
| used to call its x86 ILP32 variant before it was removed.<<
|
| x32 support has not been removed from the Linux kernel. In fact,
| we're still maintaining Debian for x32 in Debian Ports.
| AceJohnny2 wrote:
| Offtopic, but I'm distracted by the opening example:
|
| > _After all, you don't want to be building your iPhone app on
| literal iPhone hardware._
|
| iPhones are impressively powerful, but you wouldn't know it from
| the software lockdown that Apple holds on it.
|
| Example: https://www.tomsguide.com/phones/iphones/iphone-16-is-
| actual...
|
| There's a reason people were clamoring for Apple to make ARM
| laptops/desktops for years before Apple finally committed.
| boricj wrote:
| A more pertinent (if dated) example would be "_you don't want
| to be building your GBA game on literal Game Boy Advance
| hardware_".
| richardwhiuk wrote:
| Or a microcontroller
| AceJohnny2 wrote:
| I do not think I like this author...
|
| > _A critical piece of history here is to understand the really
| stupid way in which GCC does cross compiling. Traditionally,
| each GCC binary would be built for one target triple. [...]
| Nobody with a brain does this ^2_
|
| You're doing GCC a great disservice by ignoring its storied and
| essential history. It's over 40 years old, and was created at a
| time where there were no free/libre compilers. Computers were
| small and slow. _Of course_ you wouldn't bundle multiple
| targets in one distribution.
|
| LLVM benefitted from a completely different architecture and
| starting from a blank slate when computers were already faster
| and much larger, and was heavily sponsored by a vendor that was
| innately interested in cross-compiling: Apple. (Guess where
| LLVM's creator worked for years and led the development tools
| team)
| steveklabnik wrote:
| "This was the right way to do it forty years ago, so that's
| why the experience is worse" isn't a compelling reason for a
| user to suffer today.
|
| Also, in this specific case, this ignores the history around
| LLVM offering itself up to the FSF. gcc could have benefitted
| from this fresh start too. But purely by accident, it did
| not.
| AceJohnny2 wrote:
| I'd love to learn what accident you're referring to, Steve!
|
| I vaguely recall the FSF (or maybe only Stallman) arguing
| _against_ the modular nature of LLVM because a monolithic
| structure (like GCC's) makes it harder for anti-GPL actors
| (Apple!) to undermine it. Was this related?
| steveklabnik wrote:
| That is true history, in my understanding, but it's not
| related.
|
| Chris Lattner offered to donate the copyright of LLVM to
| the FSF at one point: https://gcc.gnu.org/legacy-
| ml/gcc/2005-11/msg00888.html
|
| He even wrote some patches: https://gcc.gnu.org/legacy-
| ml/gcc/2005-11/msg01112.html
|
| However, due to Stallman's... idiosyncratic email setup,
| he missed this: https://lists.gnu.org/archive/html/emacs-
| devel/2015-02/msg00...
|
| > I am stunned to see that we had this offer.
|
| > Now, based on hindsight, I wish we had accepted it.
|
| Note this email is in 2015, ten years after the initial
| one.
| Philpax wrote:
| Incredible. Thank you for sharing.
| steveklabnik wrote:
| You're welcome! It's a wild story. Sometimes, history
| happens by accident.
| matheusmoreira wrote:
| Wow that is wild. Imagine how different things could have
| been...
| FitCodIa wrote:
| > "This was the right way to do it forty years ago, so
| that's why the experience is worse" isn't a compelling
| reason for a user to suffer today.
|
| On my system, "dnf repoquery --whatrequires
| cross-gcc-common" lists 26 gcc-*-linux-gnu packages (that is,
| kernel / firmware cross compilers for 26 architectures). The
| command "dnf repoquery --whatrequires cross-binutils-common"
| lists 31 binutils-*-linux-gnu packages.
|
| The author writes, "LLVM and all cross compilers that
| follow it instead put all of the backends in one binary".
| Do those compilers support 25+ back-ends? And if they do,
| is it good design to install back-ends for (say) 23 such
| target architectures that you're never going to cross-
| compile for, in practice? Does that benefit the user?
|
| My impression is that the author does not understand the
| modularity of gcc cross compilers / packages because he's
| unaware of (or doesn't care for) the scale that gcc aims
| at.
| steveklabnik wrote:
| > And if they do, is it good design to install back-ends
| for (say) 23 such target architectures that you're never
| going to cross-compile for, in practice? Does that
| benefit the user?
|
|     rustc --print target-list | wc -l
|     287
|
| I'm kinda surprised at how large that is, actually. But
| yeah, I don't mind if I have the capability to cross-
| compile to x86_64-wrs-vxworks that I'm never going to
| use.
|
| I am not an expert on all of these details in clang
| specifically, but with rustc, we take advantage of llvm's
| target specifications, so that you can even configure
| a backend that the compiler doesn't yet know about by
| simply giving it a json file with a description.
| https://doc.rust-lang.org/nightly/nightly-rustc/rustc_target...
|
| While these built-in ones aren't defined as JSON, you can
| ask the compiler to print one for you:
|     rustc +nightly -Z unstable-options \
|       --target=x86_64-unknown-linux-gnu --print target-spec-json
|
| It's lengthy so instead of pasting here, I've put this in
| a gist:
| https://gist.github.com/steveklabnik/a25cdefda1aef25d7b40df3...
|
| Anyway, it is true that gcc supports more targets than
| llvm, at least in theory.
| https://blog.yossarian.net/2021/02/28/Weird-architectures-we...
| jaymzcampbell wrote:
| The older I get the more this kind of commentary (the OP, not
| you!) is a total turn off. Systems evolve and there's
| usually, not always, a reason for why _"things are the way
| they are"_. It's typically arrogance to have this kind of
| tone. That said I was a bit like that when I was younger, and
| it took a few knockings down to realise the world is complex.
| FitCodIa wrote:
| > and was heavily sponsored by a vendor that was innately
| interested in cross-compiling
|
| and innately disinterested in Free Software, too
| plorkyeran wrote:
| iPhones have terrible heat dispersion compared to even a
| fanless computer like a macbook air. You get a few minutes at
| full load before thermal throttling kicks in, so you could do
| the occasional build of your iPhone app on an iPhone but it'd
| be pretty terrible as a development platform.
|
| At work we had some benchmarking suites that ran on physical
| devices and even with significant effort put into cooling them
| they spent more time sleeping waiting to cool off than actually
| running the benchmarks.
| cwood-sdf wrote:
| "And no, a "target quadruple" is not a thing and if I catch you
| saying that I'm gonna bonk you with an Intel optimization manual.
| "
|
| https://github.com/ziglang/zig/issues/20690
| debugnik wrote:
| The argument is that they're called triples even when they
| have more or fewer than three components. They should have simply
| been called target tuples or target monikers.
| o11c wrote:
| "gnu tuple" and "gnu type" are also common names.
|
| The comments in `config.guess` and `config.sub`, which are
| the origin of triples, use a large variety of terms, at least
| the following:
|
|     configuration name
|     configuration type
|     [machine] specification
|     system name
|     triplet
|     tuple
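|
| A minimal sketch of why "triple" is a loose name, using only
| target names that appear in this thread. It assumes the common
| arch-vendor-system[-environment] layout; the `Target` type and
| `split_target` helper are hypothetical, and real parsers (LLVM,
| config.sub) apply many more special cases than a plain split.

```python
from typing import NamedTuple, Optional


class Target(NamedTuple):
    """Hypothetical container for the pieces of a target name."""
    arch: str
    vendor: Optional[str]
    system: Optional[str]
    env: Optional[str]


def split_target(name: str) -> Target:
    """Naively split a target name on '-' (illustrative only)."""
    parts = name.split("-")
    if len(parts) == 2:
        # Assumed layout: arch-system, e.g. wasm32-none.
        return Target(parts[0], None, parts[1], None)
    if len(parts) == 3:
        # Assumed layout: arch-vendor-system, e.g. x86_64-wrs-vxworks.
        return Target(parts[0], parts[1], parts[2], None)
    if len(parts) == 4:
        # Assumed layout: arch-vendor-system-environment.
        return Target(*parts)
    raise ValueError(f"unexpected number of components in {name!r}")


if __name__ == "__main__":
    for name in ("wasm32-none", "x86_64-wrs-vxworks",
                 "wasm32-unknown-unknown", "x86_64-unknown-linux-gnu"):
        # Print the component count next to the naive breakdown.
        print(len(name.split("-")), split_target(name))
```

| The same word "triple" covers two, three, and four components
| here, which is the naming complaint raised above.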
| therein wrote:
| I like the code editor style preview on the right. Enough to
| forgive the slightly clunky scroll.
| tiffanyh wrote:
| FYI - to see this you need to have your browser at least 1435px
| wide.
| SrslyJosh wrote:
| It looks nice, but I find the choppy scrolling (on an M1 MBP,
| no less!) to be distracting.
|
| It also doesn't really tell me anything about the content,
| except where I'm going to see tables or code blocks, so I'm not
| sure what the benefit is.
|
| Given the really janky scrolling, I'd like to have a way to
| hide it.
| Starlevel004 wrote:
| Unfortunately the text in the preview shows up in ctrl+f.
| matheusmoreira wrote:
| > Go originally wanted to not have to link any system libraries,
| something that does not actually work
|
| It does work on Linux, the only kernel that promises a stable
| binary interface to user space.
|
| https://www.matheusmoreira.com/articles/linux-system-calls
| guipsp wrote:
| Does it _really_ tho? I've had address resolution break more
| than once in go programs.
| matheusmoreira wrote:
| That's because on Linux systems it's typical for domain name
| resolution to be provided by glibc. As a result, people ended
| up depending on glibc. They were writing GNU/Linux software,
| not Linux software.
|
| https://wiki.archlinux.org/title/Domain_name_resolution
|
| https://en.wikipedia.org/wiki/Name_Service_Switch
|
| https://man.archlinux.org/man/getaddrinfo.3
|
| This is user space stuff. You can trash all of this and roll
| your own mechanism to resolve the names however you want. Go
| probably did so. Linux will not complain in any way
| whatsoever.
|
| Linux is the only kernel that lets you do this. Other kernels
| will break your software if you bypass their system
| libraries.
| guipsp wrote:
| I mean, that is fine and all, but it doesn't really matter
| for making the software run correctly on systems that
| currently exist.
| matheusmoreira wrote:
| It works fine on current Linux systems. We can have
| freestanding executables that talk to Linux directly and
| link against zero system libraries.
|
| It's just that those executables are going to have to
| resolve names all by themselves. Chances are they aren't
| going to do it exactly like glibc does. That may or may
| not be a problem.
| o11c wrote:
| Historically, when DNS breaks in a not-glibc environment,
| it's very often found to in fact be a violation of some
| standard by the not-glibc, rather than a program that
| fails to document a glibc dependency.
| fc417fc802 wrote:
| Just connect to the service running on localhost ...
|
| I'm curious. Why isn't getaddrinfo implemented in a
| similar manner to the loaders that graphics APIs use?
| Shouldn't that functionality be the responsibility of
| whatever resolver has been installed?
| o11c wrote:
| That _is_ how `getaddrinfo` works under GLIBC; it's
| called NSS. The problem (well, one of them) is the non-
| GLIBC implementations that say "we don't need no stinkin'
| loader!"
| lonjil wrote:
| FreeBSD does as well, but old ABI versions aren't kept forever.
| damagednoob wrote:
| When developing a small program for my Synology NAS in Go, I'm
| sure I had to target a specific version of glibc.
| o11c wrote:
| This article should be ignored, since it disregards the canonical
| _origin_ of target triples (and the fact that it's linked to
| `configure`):
|
| https://git.savannah.gnu.org/cgit/config.git/tree/
|
| The `testsuite/` directory contains some data files with a fairly
| extensive list of known targets. The vendor field should be
| considered fully extensible, and new combinations of known
| machine/kernel/libc shouldn't be considered invalid, but anything
| else should have a patch submitted.
| jcranmer wrote:
| This article is a very LLVM-centric view, and it does ignore
| the GNU idea of a target triple, which is essentially $(uname
| -m)-_vendor_-$(uname -s), with _vendor_ determined (so far as
| I can tell) entirely from uname -s, the system name undergoing
| some amount of butchering, and version numbers sometimes being
| included and sometimes not, and Linux getting a LIBC tacked on.
|
| But that doesn't mean the article should be ignored in its
| entirety. LLVM's target triple parsing _is_ more relevant for
| several projects (especially given that the GNU target triple
| scheme _doesn't include native Windows_, which is one of the
| most common targets in practice!). Part of the problem is that
| for many people "what is a target triple" is actually a lead-in
| to the question "what are the valid targets?", and trying to
| read config.guess is not a good vehicle to discover the answer.
| config.guess also isn't a good way to find out about target triples
| for systems that aren't designed to run general-purpose
| computing, like if you're trying to compile for a GPU
| architecture, or even a weird x86 context like UEFI.
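|
| A rough sketch of the composition described above: a machine
| name, a vendor, a system name, and an optional libc suffix. The
| `gnu_style_name` helper and its "unknown" default vendor are
| assumptions for illustration only; the real config.guess
| special-cases many systems and does not work this simply.

```python
import platform
from typing import Optional


def gnu_style_name(machine: str, system: str,
                   vendor: str = "unknown",
                   libc: Optional[str] = None) -> str:
    """Join machine, vendor, and system into a GNU-style name."""
    # Lower-casing stands in for the "butchering" of the system name.
    parts = [machine, vendor, system.lower()]
    if libc:
        # On Linux a libc name is tacked on as a fourth component.
        parts.append(libc)
    return "-".join(parts)


if __name__ == "__main__":
    # platform.machine() and platform.system() report values comparable
    # to `uname -m` and `uname -s` on Unix-like hosts.
    print(gnu_style_name(platform.machine(), platform.system()))
    print(gnu_style_name("x86_64", "Linux", libc="gnu"))
```

| Running this on different hosts gives different spellings, which
| is part of why reading config.guess is a poor way to discover
| which targets a given compiler actually accepts.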
| o11c wrote:
| The GNU scheme does in fact have support for various windows
| targets. It's just that the GNU _compilers_ don't support
| them all.
| pjc50 wrote:
| There's MinGW.
| psanford wrote:
| As a Go developer, I certainly find the complaints about the go
| conventions amusing. I guess if you have really invested so much
| into understanding all the details in the rest of this article
| you might be annoyed that it doesn't translate 1 to 1 to Go.
|
| But for the rest of us, I'm so glad that I can just cross compile
| things in Go without thinking about it. The annoying thing with
| setting up cross compilation in GCC is not learning the naming
| conventions, it is getting the correct toolchains installed and
| wired up correctly in your build system. Go just ships that out
| of the box and it is so much more pleasant.
|
| It's also one thing that is great about zig. Using Go+zig when I
| need to cross compile something that includes cgo in it is so
| much better than trying to get GCC toolchains setup properly.
| theoperagoer wrote:
| Great content. Also, this website is gorgeous!
| ComputerGuru wrote:
| Great article but I was really put off by this bit, which aside
| from being very condescending, simply isn't true and reveals a
| lack of appreciation for the innovation that I would have thought
| someone posting about target triples and compilers would have
| appreciated:
|
| > Why the Windows people invented a whole other ABI instead of
| making things clean and simple like Apple did with Rosetta on ARM
| MacBooks? I have no idea, but
| http://www.emulators.com/docs/abc_arm64ec_explained.htm contains
| various excuses, none of which I am impressed by. My read is that
| their compiler org was just worse at life than Apple's, which is
| not surprising, since Apple does compilers better than anyone
| else in the business.
|
| I was already familiar with ARM64EC from reading about its
| development from Microsoft over the past years but had not come
| across the emulators.com link before - it's a stupendous (long)
| read and well worth the time if you are interested in lower-level
| shenanigans. The truth is that Microsoft's ARM64EC solution is a
| hundred times more brilliant and a thousand times better for
| backwards (and forwards) compatibility than Rosetta on macOS,
| which gave the user a far inferior experience than native code,
| executed (sometimes far) slower, prevented interop between legacy
| and modern code, left app devs having to do a full port to move
| to use newer tech (or even just have a UI that matched the rest
| of the system), and was always intended as a merely transitional
| bit of tech to last the few years it took for native x86 apps to
| be developed and take the place of (usurp) old ppc ones.
|
| Microsoft's solution has none of these drawbacks (except the
| noted lack of AVX support), doesn't require every app to be 2x or
| 3x as large as a sacrifice to the fat binaries hack, offers a
| much more elegant solution for developers to migrate their code
| (piecemeal or otherwise) to a new platform where they don't know
| if it will be worth their time/money to invest in a full rewrite,
| lets users use all the apps they love, and maintains Microsoft's
| very much well-earned legacy for backwards compatibility.
|
| When you run an app for Windows 2000 on Windows 11 (x86 or ARM),
| you don't see the old Windows 2000 aesthetic (and if you do,
| there's an easy way for _users_ to opt into newer theming rather
| than requiring the developer to do something about it) and you
| aren't stuck with bugs from 30 years ago that were long since
| patched by the vendor many OS releases ago.
| juped wrote:
| You have neglected to consider that Microsoft bad; consider how
| they once did something differently from a Linux distribution I
| use. (This sentiment is alive and well among otherwise
| intelligent people; it's embarrassing to read.)
| Philpax wrote:
| This author has a tendency to be condescending about things
| they find disagreeable. It's why I stopped reading them.
| Zamiel_Snawley wrote:
| Do those criticisms of Rosetta hold for Rosetta 2?
|
| I assumed the author was talking about the x86 emulator
| released for the arm migration a few years ago, not the powerpc
| one.
| plorkyeran wrote:
| The thing named Rosetta (actually Rosetta 2) for the x86_64 ->
| ARM transition is technologically completely unrelated to the
| PPC -> x86 Rosetta, and has none of the problems you mention.
| There's no user-observable difference between a program using
| Rosetta and a native program in modern macOS, and porting
| programs which didn't have any assembly or other CPU-arch-
| specific code was generally just a matter of wrangling your
| build system.
| arp242 wrote:
| _> There are also many fictitious names for 64-bit x86, which you
| should avoid unless you want the younger generation to make fun
| of you. amd64 refers to AMD's original implementation of long
| mode in their K8 microarchitecture, first shipped in their Athlon
| 64 product. Calling it amd64 is silly and also looks a lot like
| arm64, and I am honestly kinda annoyed at how much Go code I've
| seen with files named fast_arm64.s and fast_amd64.s. Debian also
| uses amd64/arm64, which makes browsing packages kind of
| annoying._
|
| I prefer amd64 as it's so much easier to type and scans so much
| easier. x86_64 is so awkward.
|
| Bikeshed I guess and in the abstract I can see how x86_64 is
| better, but pragmatism > purity and you'll take my amd64 from my
| cold dead hands.
|
| As for Go, you can get the GOARCH/GOOS combinations from "go tool
| dist list". Can be useful at times if you want to ensure your
| code cross-compiles in CI.
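|
| To make the naming overlap concrete, here is a small sketch
| that takes platform strings in the GOOS/GOARCH form printed by
| "go tool dist list" and maps the architecture onto its
| triple-style spelling. The alias table and both helper names are
| hypothetical and cover only a few well-known spellings.

```python
from typing import Tuple

# Hypothetical alias table: alternative spellings -> triple-style name.
ARCH_ALIASES = {
    "amd64": "x86_64",
    "x64": "x86_64",
    "arm64": "aarch64",
}


def normalize_arch(name: str) -> str:
    """Return the triple-style spelling, or the input lower-cased."""
    key = name.lower()
    return ARCH_ALIASES.get(key, key)


def parse_go_platform(entry: str) -> Tuple[str, str]:
    """Split one 'GOOS/GOARCH' entry and normalize the architecture."""
    goos, goarch = entry.strip().split("/", 1)
    return goos, normalize_arch(goarch)


if __name__ == "__main__":
    for entry in ("linux/amd64", "linux/arm64", "windows/amd64"):
        print(entry, "->", parse_go_platform(entry))
```

| A CI script could feed the real "go tool dist list" output through
| something like this to label build artifacts consistently.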
| ycombinatrix wrote:
| >There's a few variants. wasm32-unknown-unknown (here using
| unknown instead of none as the system, oops)
|
| Why isn't it called wasm32-none-none?
| pie_flavor wrote:
| As far as I can tell, it's because libstd exists (but is full
| of do-nothing stubs). There is another `wasm32-none` target
| which is no_std.
| pie_flavor wrote:
| Sorry, going to keep typing x64. Unlike the article's
| recommendation of x86, literally everyone knows exactly what it
| means at all times.
| qu4z-2 wrote:
| If someone tells me x86, I am certainly thinking 32-bit
| protected mode not 64-bit long mode... Granted I'm in the weird
| space where I know enough to be dangerous but not enough to
| keep me up-to-date with idiomatic naming conventions.
| kevin_thibedeau wrote:
| You mean AMD64?
| dvektor wrote:
| Great read. Love those articles where you go in thinking that you
| have a pretty solid understanding of the topic and then proceed
| to learn much more than you thought you would.
| Joker_vD wrote:
| > no one calls it x64 except for Microsoft. And even though it is
| fairly prevalent on Windows, I absolutely give my gamedev friends
| a hard time when they write x64.
|
| So, it turns out, actually a _lot_ of people call it x64 --
| including author's own friends! -- it's just that the author
| dislikes it. Disliking something is fine, but why claim outright
| falsehood which you know first-hand is false?
|
| Also, the actual proper name for this ISA is, of course, EM64T.
| /s
|
| > The fourth entry of the triple (and I repeat myself, yes, it's
| still a triple)
|
| Any actual justification except the bald assertions from the
| personal preferences? Just call it a "tuple", or something...
| IAmLiterallyAB wrote:
| > However, due to the runaway popularity of LLVM, virtually all
| compilers now use target triples.
|
| That's a wild take. I think it's pretty universally accepted that
| GCC and the GNU toolchain are what made this ubiquitous.
|
| Also, the x32 ABI is still around, support is still around, I
| don't know where the author got that notion
| cestith wrote:
| > "i386" (the first Intel microarchitecture that implemented
| protected mode)
|
| This is technically incorrect. The 286 had protected mode. It was
| a 16-bit protected mode, being a 16-bit processor. It was also
| incompatible with the later protected mode of the 386 through
| today's processors. It did, however, exist.
___________________________________________________________________
(page generated 2025-04-16 17:03 UTC)