[HN Gopher] What the hell is a target triple?
       ___________________________________________________________________
        
       What the hell is a target triple?
        
       Author : ingve
       Score  : 154 points
       Date   : 2025-04-15 18:35 UTC (22 hours ago)
        
 (HTM) web link (mcyoung.xyz)
 (TXT) w3m dump (mcyoung.xyz)
        
       | Retr0id wrote:
       | Note to author, I'm not sure the word "anachronism" is being used
       | correctly in the intro.
        
         | compyman wrote:
         | I think the meaning is that the idea that compilers can only
          | compile for their host machine is an anachronism, since that
         | was historically the case but is no longer true.
        
           | stefan_ wrote:
           | Telling people that "Clang can compile for any architecture
           | you like!" tends to confuse them more than it helps. I
           | suppose it sets up unrealistic assumptions because of course
           | outputting assembly for some architecture is a very long way
           | from making working userland binaries for a system based on
           | that architecture, which is what people actually want.
           | 
           | And ironically in all of this, building a full toolchain
           | based on GCC is _still_ easier than with LLVM.
        
           | bregma wrote:
           | Heck, it hasn't been true since the 1950s. Consider it as
           | "has never been true".
           | 
           | Oh, sure, there have been plenty of native-host-only
           | compilers. It was never a property of all compilers, though.
            | Most system bring-ups, from the mainframes of the 1960s
           | through the minis of the 1970s to the micros and embeddeds of
           | the 1980s and onwards have required cross compilers.
           | 
           | I think what he means is that a single-target toolchain is an
           | anachronism. That's also not true, since even clang doesn't
           | target everything under the sun in one binary. A toolchain
           | needs far more than a compiler, for a start; it needs the
           | headers and libraries and it needs a linker. To go from
           | source to executable (or herd of dynamic shared objects)
           | requires a whole lot more than installing the clang (or
           | whatever front-end) binary and choosing a nifty target
           | triple. Most builds of clang don't even support all the
           | interesting target triples and you need to build it yourself,
            | which requires a lot more computer than I can afford.
           | 
           | Target triples are not even something limited to toolchains.
           | I maintain software that gets cross-built to all kinds of
           | targets all the time and that requires target triples for the
           | same reasons compilers do. Target triples are just a basic
           | tool of the trade if you deal with anything other than
           | scripting the browser and they're a solved problem
            | rediscovered every now and then by people who haven't studied
           | their history.
        
         | kupiakos wrote:
         | It's being used correctly: something that is conspicuously old-
         | fashioned for its environment is an anachronism. A toolchain
         | that only supports native builds fits.
        
           | Retr0id wrote:
           | The article does not place any given toolchain within an
           | incorrect environment, though.
           | 
           | If someone said "old compilers were usually cross-compilers",
           | that would be an _ahistoric_ statement (somewhat).
           | 
           | If someone used clang in a movie set in the 90s, that would
           | be anachronistic.
        
         | bqmjjx0kac wrote:
         | It's technically correct, but feels a bit forced.
        
       | jkelleyrtp wrote:
       | The author's blog is a FANTASTIC source of information. I
       | recommend checking out some of their other posts:
       | 
       | - https://mcyoung.xyz/2021/06/01/linker-script/
       | 
       | - https://mcyoung.xyz/2023/08/09/yarns/
       | 
       | - https://mcyoung.xyz/2023/08/01/llvm-ir/
        
         | eqvinox wrote:
         | Given TFA's bias against GCC, I'm not so sure. e.g. looking at
         | the linker script article... it's also missing the __start_XYZ
         | and __stop_XYZ symbols automatically created by the linker.
        
           | matheusmoreira wrote:
           | It also focuses exclusively on sections. I wish it had at
           | least mentioned segments, also known as program headers.
            | The Linux kernel's ELF loader does not care about sections; it
           | only cares about segments.
           | 
           | Sections and segments are more or less the same concept:
           | metadata that tells the loader how to map each part of the
           | file into the correct memory regions with the correct memory
           | protection attributes. Biggest difference is segments don't
           | have names. Also they aren't neatly organized into logical
           | blocks like sections are, they're just big file extents. The
           | segments table is essentially a table of arguments for the
           | mmap system call.
           | 
           | Learning this stuff from scratch was pretty tough. Linker
            | scripts have commands to manipulate the program header table
           | but I couldn't figure those out. In the end I asked
           | developers to add command line options instead and the
           | maintainer of mold actually obliged.
           | 
           | Looks like very few people know about stuff like this. One
           | can use it to do some heavy wizardry though. I leveraged this
           | machinery into a cool mechanism for embedding arbitrary data
           | into ELF files. The kernel just memory maps the data in
           | before the program has even begun execution. Typical
           | solutions involve the program finding its own executable on
           | the file system, reading it into memory and then finding some
           | embedded data section. I made the kernel do almost all of
           | that automatically.
           | 
           | https://www.matheusmoreira.com/articles/self-contained-
           | lone-...
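            | 
            | To make the mmap analogy concrete, here's a minimal sketch
            | (assuming a normal glibc process, so `getauxval` is
            | available, unlike in my freestanding case) that walks the
            | program header table the kernel hands every process:
            | 
            |     #include <link.h>      /* ElfW() */
            |     #include <stdio.h>
            |     #include <sys/auxv.h>  /* getauxval */
            | 
            |     int main(void) {
            |         const ElfW(Phdr) *ph =
            |             (const ElfW(Phdr) *)getauxval(AT_PHDR);
            |         unsigned long n = getauxval(AT_PHNUM);
            |         /* each PT_LOAD is, in effect, an mmap the
            |            kernel already performed for us */
            |         for (unsigned long i = 0; i < n; i++)
            |             if (ph[i].p_type == PT_LOAD)
            |                 printf("LOAD vaddr=%#lx filesz=%#lx\n",
            |                        (unsigned long)ph[i].p_vaddr,
            |                        (unsigned long)ph[i].p_filesz);
            |     }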
        
             | o11c wrote:
             | I wouldn't call them "same concept" at all. Segments
             | (program headers) are all about the runtime (executables
             | and shared libraries) and are low-cost. Sections are all
             | about development (.o files) and are detailed.
             | 
             | Generally there are many sections combined into a single
             | segment, other than special-purpose ones. Unless you are
             | reimplementing ld.so, you almost certainly don't want to
             | touch segments; sections are far easier to work with.
             | 
              | Also, normally you just call `getauxval`, but if
             | needed the type is already named `ElfW(auxv_t)*`.
        
               | matheusmoreira wrote:
               | > I wouldn't call them "same concept" at all.
               | 
               | They are both metadata about file extents and their
               | memory images.
               | 
               | > sections are far easier to work with
               | 
               | Yes. They are not, however, loaded into memory by
               | default. Linkers do not generate LOAD segments for
               | section metadata since they are not needed for execution.
               | Thus it's impossible for a program to introspect its own
               | sections without additional logic and I/O to read them
               | into memory.
               | 
                | > Also, normally you just call `getauxval`, but if
               | needed the type is already named `ElfW(auxv_t)*`.
               | 
               | True. I didn't use it because it was not available. I
               | wrote my article in the context of a freestanding nolibc
               | program.
        
               | o11c wrote:
               | Right, but you can just use the section start/end symbols
               | for a section that already goes into a mapped segment.
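                | 
                | A minimal sketch of that pattern (the
                | section name "myblob" here is just an
                | example; the linker only emits these
                | symbols for allocated sections whose
                | names are valid C identifiers):
                | 
                |     #include <stdio.h>
                | 
                |     __attribute__((used, section("myblob")))
                |     static const char blob[] = "embedded data";
                | 
                |     /* created by the linker for allocated,
                |        C-identifier-named sections */
                |     extern const char __start_myblob[];
                |     extern const char __stop_myblob[];
                | 
                |     int main(void) {
                |         size_t n = __stop_myblob - __start_myblob;
                |         printf("%zu bytes at %p\n", n,
                |                (const void *)__start_myblob);
                |     }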
        
               | matheusmoreira wrote:
               | Can you show me how that would work?
               | 
               | It's trivial to put arbitrary files into sections:
               | objcopy --add-section program.files.1=file.1.dat \
               | --add-section program.files.2=file.2.dat \
               | program program+files
               | 
               | The problem is the program.files.* sections do not get
               | mapped in by a LOAD segment. I ended up having to write
               | my own tool to patch in a LOAD segment into the segments
               | table because objcopy does not have the ability to do it.
               | 
               | Even asked a Stack Overflow question about this two years
               | ago:
               | 
               | https://stackoverflow.com/q/77468641
               | 
               | The only answer I got told me to simply read the sections
               | into memory via /proc/self/exe or edit the segments table
               | and make it so that the LOAD segments cover the whole
               | file. I eventually figured out ways to add LOAD segments
               | to the table. By that point I didn't need sections
               | anymore, just a custom segment type.
        
               | o11c wrote:
               | The whole point of section names is that they mean
               | something. If you give it a name that matches `.rodata.*`
               | it will be part of the existing read-only LOADed
               | segments, or `.data.*` for (private) read-write.
               | 
               | Use `ld --verbose` to see what sections are mapped by
               | default (it is impossible for a linker to work without
               | having such a linker script; we're just lucky that GNU ld
               | exposes it in a sane form rather than hard-coding it as C
               | code). In modern versions of the linker (there is still
               | old documentation found by search engines), you can
               | specify multiple SECTIONS commands (likely from multiple
               | scripts, i.e. just files passed on the command line), but
               | why would you when you can conform to the default one?
               | 
               | You should pick a section name that won't collide with
               | the section names generated by `-fdata-sections` (or
               | `-ffunction-sections` if that's ever relevant for you).
        
               | matheusmoreira wrote:
               | That requires relinking the executable. That is not
               | always desirable or possible. Unless the dynamic linker
               | ignores the segments table in favor of doing this on the
               | fly... Even if that's the case, it won't work for
               | statically linked executables. Only the dynamic linker
               | can assign meaning to section names at runtime and the
               | dynamic linker isn't involved at all in the case of
               | statically linked programs.
        
             | eqvinox wrote:
             | Absolutely agree. Had my own fun dealings with ELF, and to
             | be clear, on plain mainline shipping products (amd64
             | Linux), not toys/exercise/funky embedded. (Wouldn't have
             | known about section start/stop symbols otherwise)
        
           | sramsay wrote:
           | I was really struck by the antipathy toward GCC. I'm not sure
           | I quite understand where it's coming from.
        
       | fweimer wrote:
       | I think GCC's more-or-less equivalent to Clang's --target is
       | called -B: https://gcc.gnu.org/onlinedocs/gcc/Directory-
       | Options.html#in...
       | 
       | I assume it works with an all-targets binutils build. I haven't
       | seen anyone building their cross-compilers in this way (at least
       | not in recent memory).
        
         | JoshTriplett wrote:
         | I haven't either, probably because it would require building
         | once per target and installing all the individual binaries.
         | 
         | This is one of the biggest differences between clang and GCC:
         | clang has one binary that supports multiple targets, while a
         | GCC build is always target-specific.
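          | 
          | As a rough illustration (assuming a distro-packaged cross
          | GCC is installed; the clang invocation still needs a
          | matching sysroot before it can actually link anything):
          | 
          |     # clang: one driver, target picked per invocation
          |     clang --target=aarch64-unknown-linux-gnu -c foo.c
          | 
          |     # GCC: a separate driver per configured target
          |     aarch64-linux-gnu-gcc -c foo.c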
        
         | o11c wrote:
         | Old versions of GCC used to provide `-b <machine>` (and also
         | `-V <version>`), but they were removed a long time ago in favor
         | of expecting people to just use and set `CC` correctly.
         | 
         | It looks like gcc 3.3 through 4.5 just forwards to an external
         | driver; prior to that it seems like it used the same driver for
         | different paths, and after that it is removed.
        
       | jcranmer wrote:
       | I did start to try to take clang's TargetInfo code
       | (https://github.com/llvm/llvm-project/blob/main/clang/lib/Bas...)
        | and port it over to TableGen, primarily so somebody could
       | actually extract useful auto-generated documentation out of it,
       | like "What are all the targets available?"
       | 
       | I actually do have working code for the triple-to-TargetInfo
       | instantiation portion (which is fun because there's one or two
       | cases that juuuust aren't quite like all of the others, and I'm
       | not sure if that's a bad copy-paste job or actually intentional).
       | But I never got around to working out how to actually integrate
       | the actual bodies of TargetInfo implementations--which provide
       | things like the properties of C/C++ fundamental types or default
       | macros--into the TableGen easily, so that patch is still merely
       | languishing somewhere on my computer.
        
       | IshKebab wrote:
       | Funny thing I found when I gave up trying to find documentation
       | and read the LLVM source code (seems to be what happened to the
       | author too!): there are actually _five_ components of the triple,
       | not four.
       | 
       | I can't remember what the fifth one is, but yeah... insane
       | system.
       | 
       | Thanks for writing this up! I wonder if anyone will ever come up
       | with something more sensible.
        
         | o11c wrote:
         | There are up to 7 components in a triple, but not all are used
          | at once; the general format is:
         | <machine>-<vendor>-<kernel>-<libc?><abi?><fabi?>
         | 
         | But there's also <obj>, see below.
         | 
         | Note that there are both canonical and non-canonical triples in
         | use. Canonical triples are output by `config.guess` or
         | `config.sub`; non-canonical triples are input to `config.sub`
         | and used as prefixes for commands.
         | 
         | The <machine> field (1st) is what you're running on, and on
         | some systems it includes a version number of sorts. Most 64-bit
         | vs 32-bit differences go here, except if the runtime differs
         | from what is natural (commonly "32-bit pointers even though the
         | CPU is in 64-bit mode"), which goes in <abi> instead.
         | Historically, "arm" and "mips" have been a mess here, but that
         | has largely been fixed, in large part as a side-effect of
         | Debian multiarch (whose triples only have to differ from GNU
         | triples in that they canonicalize i[34567]86 to i386, but you
         | should use dpkg-architecture to do the conversion for sanity).
         | 
         | The <vendor> field (2nd) is not very useful these days. It
         | defaults to "unknown" but as of a few years ago "pc" is used
         | instead on x86 (this means that the canonical triple can
         | change, but this hasn't been catastrophic since you should
         | almost always use the non-canonical triple except when pattern-
         | matching, and when pattern-matching you should usually ignore
         | this field anyway).
         | 
         | The <kernel> field (3rd) is pretty obvious when it's called
         | that, but it's often called <os> instead since "linux" is an
         | oddity for regularly having a <libc> component that differs. On
         | many systems it includes version data (again, Linux is the
         | oddity for having a stable syscall API/ABI). One notable
          | exception: if a GNU userland is used on a BSD/Solaris system, a
         | "k" is prepended. "none" is often used for
         | freestanding/embedded compilation, but see <obj>.
         | 
         | The <libc> field (main part of the 4th) is usually absent on
         | non-Linux systems, but mandatory for "linux". If it is absent,
         | the dash after the kernel is usually removed, except if there
         | are ABI components. Note that "gnu" can be both a kernel (Hurd)
         | and a libc (glibc). Android uses "android" here, so maybe
         | <libc> is a bit of a misnomer (it's not "bionic") - maybe
         | <userland>?
         | 
         | <abi>, if present, means you aren't doing the historical
         | default for the platform specified by the main fields. Other
         | than "eabi" for ARM, most of this is for "use 32-bit pointers
         | but 64-bit registers".
         | 
         | <fabi> can be "hf" for 32-bit ARM systems that actually support
         | floats in hardware. I don't think I've seen anything else,
         | though I admit the main reason I separately document this from
         | <abi> is because of how Debian's architecture puts it
         | elsewhere.
         | 
         | <obj> is the object file format, usually "aout", "coff", or
         | "elf". It can be appended to the kernel field (but before the
         | kernel version number), or replace it if "none", or it can go
         | in the <abi> field.
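          | 
          | A few concrete examples, split per the scheme above (the
          | field assignments are my reading, so check them against
          | `config.sub` before relying on them):
          | 
          |     x86_64-pc-linux-gnu
          |         machine=x86_64 vendor=pc kernel=linux libc=gnu
          |     arm-unknown-linux-gnueabihf
          |         machine=arm vendor=unknown kernel=linux
          |         libc=gnu abi=eabi fabi=hf
          |     x86_64-w64-mingw32
          |         machine=x86_64 vendor=w64 kernel/os=mingw32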
        
           | IshKebab wrote:
           | Nah I dunno where you're getting your information from but
           | LLVM only supports 5 components.
           | 
           | See the code starting at line 1144 here:
           | https://llvm.org/doxygen/Triple_8cpp_source.html
           | 
           | The components are arch-vendor-os-environment-objectformat.
           | 
           | It's absolutely full of special cases and hacks. Really at
           | this point I think the only sane option is an explicit list
           | of fixed strings. I think Rust does that.
        
             | o11c wrote:
             | LLVM didn't invent the scheme; why should we pay attention
             | to their copy and not look at the original?
             | 
             | The GNU Config project is the original.
        
               | IshKebab wrote:
               | The article goes into this a bit. But basically because
               | LLVM is extremely popular and used as a backend by lots
               | of other languages, e.g. Rust.
               | 
               | Frankly being the originators of this deranged scheme is
               | a good reason _not_ to listen to GNU!
        
             | jcranmer wrote:
             | You're not really contradicting o11c here; what LLVM calls
             | "environment" is a mixture of what they called
             | libc/abi/fabi. There's also what LLVM calls "subarch" to
             | distinguish between different architectures that may be
             | relevant (e.g., i386 is not the same as i686, although LLVM
             | doesn't record this difference since it's generally less
             | interested in targeting old hardware), and there's also OS
             | version numbers that may or may not be relevant.
             | 
             | The underlying problem with target triples is that
             | architecture-vendor-system isn't sufficient to uniquely
             | describe the relevant details for specifying a toolchain,
             | so the necessary extra information has been somewhat
             | haphazardly added to the format. On top of that, since the
             | relevance of some of the information is questionable for
             | some tasks (especially the vendor field), different
             | projects have chosen not to care about subtle differences,
             | so the _normalization_ of a triple is different between
             | different projects.
             | 
             | LLVM's definition is not more or less correct than gcc's
             | here, nor are these the only definitions floating around.
        
               | o11c wrote:
               | Hm, looking to see if the vendor field is actually
               | meaningful ... I see some stuff for m68k and mips and
               | sysv targets ... some of it working around pre-standard
               | vendor C implementations
               | 
               | Ah, I found a modern one:
               | i[3456789]86-w64-mingw* does not use winsup
               | i[3456789]86-*-mingw* with other vendors does use winsup
               | 
               | There are probably more; this is embedded in all sorts of
               | random configure scripts and it is very not-greppable.
        
       | vient wrote:
       | > Kalimba, VE
       | 
       | > No idea what this is, and Google won't help me.
       | 
       | Seems that Kalimba is a DSP, originally by CSR and now by
       | Qualcomm. CSR8640 is using it, for example
       | https://www.qualcomm.com/products/internet-of-things/consume...
       | 
        | VE is harder to find with such a short name.
        
         | AKSF_Ackermann wrote:
         | NEC Vector Engine. Basically not a thing outside
         | supercomputers.
        
           | fc417fc802 wrote:
           | $800 for the 20B-P model on ebay. More memory bandwidth than
           | a 4090. I wonder if llama.cpp could be made to run on it?
           | 
           | I see rumors they charge for the compiler though.
        
       | kridsdale1 wrote:
       | I really appreciate the angular tilt of the heading type on that
       | blog.
        
       | throw0101d wrote:
       | Noticed endians listed in the table. It seems like little-endian
       | has basically taken over the world in 2025:
       | 
       | * https://en.wikipedia.org/wiki/Endianness#Hardware
       | 
       | Is there anything that is used a lot that is not little? IBM's
       | stuff?
       | 
       | Network byte order is BE:
       | 
       | * https://en.wikipedia.org/wiki/Endianness#Networking
        
         | thro3838484848 wrote:
         | Java VM is BE.
        
           | kbolino wrote:
           | This is misleading at best. The JVM only exposes multibyte
           | values to ordinary applications in such a way that byte order
           | doesn't matter. You can't break out a pointer and step
           | through the bytes of a long field to see what order it's in,
           | at least not without the unsafe memory APIs.
           | 
           | In practice, any real JVM implementation will simply use
           | native byte order as much as possible. While bytecode and
           | other data in class files is serialized in big endian order,
           | it will be converted to native order whenever it's actually
           | used. If you do pull out the unsafe APIs, you can see that
           | e.g. values are little endian on x86(-64). The JVM would
              | suffer from major performance issues if it tried to impose a
           | byte order different from the underlying platform.
        
             | PhilipRoman wrote:
             | One relatively commonly used class which exposes this is
             | ByteBuffer and its Int/Long variants, but there you can
             | specify the endianness explicitly (or set it to match the
             | native one).
        
         | forrestthewoods wrote:
          | BE isn't technically dead but it's practically dead for almost
         | all projects. You can static_assert byte order and then never
         | think about BE ever again.
         | 
         | All of my custom network serialization formats use LE because
         | there's literally no reason to use BE for network byte order.
         | It's pure legacy cruft.
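          | 
          | The compile-time check is a one-liner, e.g. with the
          | GCC/Clang predefined macros (MSVC doesn't define these,
          | but its targets are all little-endian anyway):
          | 
          |     #include <assert.h>
          |     static_assert(__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__,
          |                   "big-endian targets are not supported");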
        
           | f_devd wrote:
           | ...Until you find yourself having to workaround legacy code
           | to support some weird target that does still use BE. Speaking
           | from experience (tbf usually lower level than anything
           | actually networked, more like RS485 and friends).
        
         | dharmab wrote:
         | LEON, used by the European Space Agency, is big endian.
        
           | naruhodo wrote:
           | Should have been called BEON.
        
         | formerly_proven wrote:
         | 10 years ago the fastest BE machines that were practical were
          | then ten-year-old Power Macs. This hasn't really changed. I
         | guess they're more expensive now.
        
           | eqvinox wrote:
           | e6500/T4240 are faster than powermacs. Not sure how rare they
           | are nowadays, we didn't have any trouble buying some (on
           | eBay). 12x2 cores, 48GB RAM, for BE that's essentially
           | heaven...
        
         | Palomides wrote:
         | IBM's Power chips can run in either little or big modes, but
         | "used a lot" is a stretch
        
           | inferiorhuman wrote:
           | Most PowerPC related stuff (e.g. Freescale MPC5xx found in a
              | bunch of automotive applications) can run in either big or
           | little endian mode, as can most ARM and MIPS (routers, IP
           | cameras) stuff. Can't think of the last time I've seen any of
           | them configured to run in big endian mode tho.
        
             | classichasclass wrote:
             | For the large Power ISA machines, it's most commonly when
             | running AIX or IBM i these days, though the BSDs generally
             | run big too.
        
         | richardwhiuk wrote:
         | Some ARM stuff.
        
         | rv3392 wrote:
         | Apart from IBM Power/AIX systems, SPARC/Solaris is another one.
         | I wouldn't say either of these are used a lot, but there's a
         | reasonable amount of legacy systems out there that are still
         | being supported by IBM and Oracle.
        
       | psyclobe wrote:
        | Sounds like what we use with vcpkg to define the system's tooling;
        | still trying to make sense of it all these years later, but we
        | define things like x64-Linux-static to imply target architecture,
        | platform, and linkage style for the runtime.
        
       | bruce343434 wrote:
       | Why does this person have such negative views of GCC and positive
       | bias towards LLVM?
        
         | nemothekid wrote:
         | If OP is above 30 - it's probably due to the frustration of
         | trying to modularize GCC that led to the creation of LLVM in
         | the first place. If OP is below 30, it's probably because he
         | grew up in a world where most compiler research and design is
         | done on LLVM and GCC is for grandpa.
        
         | matheusmoreira wrote:
         | Good question. Author is incredibly hostile to one of the most
         | important pieces of software ever developed because of the way
         | they approached the problem nearly 40 years ago. Then he
         | criticizes Go for trying to redesign the system instead of just
         | using target triples...
        
           | FitCodIa wrote:
           | The author writes: "really stupid way in which GCC does cross
           | compiling [...] Nobody with a brain does this [...]", and
           | then admits in the footnote, "I'm not sure why GCC does
           | this".
           | 
           | Immature to the point of alienating.
        
         | xyst wrote:
         | Seems to have a decent amount of knowledge in this domain in
         | education and professional work. Author is from MIT so maybe
         | professors had a lot of influence here.
         | 
         | also, gcc is relatively old and comes with a lot of baggage.
            | LLVM is sort of the de facto standard now with improvements in
         | performance
        
           | rlpb wrote:
           | > LLVM is sort of the defacto standard now...
           | 
           | Distributions, and therefore virtually all the software used
           | by a distribution user, still generally use gcc. LLVM is only
           | the de facto standard when doing something new, and for JIT.
        
           | bruce343434 wrote:
            | as someone who uses both Clang and GCC to cover each other's
           | weaknesses, as far as I can tell both LLVM and GCC are
           | hopelessly beastly codebases in terms of raw size and their
           | complexity. I think that's just what happens when people
           | desire to build an "everything compiler".
           | 
           | From what I gathered, LLVM has a lot of C++ specific design
           | choices in its IR language anyway. I think I'd count that as
           | baggage.
           | 
           | I personally don't think one is better than the other.
           | Sometimes clang produces faster code, sometimes gcc. I
           | haven't really dealt with compiler bugs from either. They
           | compile my projects at the same speed. Clang is better at
           | certain analyses, gcc better at certain others.
        
             | ahartmetz wrote:
             | Clang used to compile much faster than GCC. I was excited.
             | Now there is barely any difference, so I keep using GCC and
             | occasionally some Clang-based tools such as iwyu,
             | ClangBuildAnalyzer or sanitizer options (rare, Valgrind is
             | easier and more powerful though sanitizers also have unique
             | features).
        
         | Skywalker13 wrote:
         | It is unfortunate. GCC has enabled the compilation of countless
         | lines of source code for nearly 40 years and has served
         | millions of users. Regardless of whether its design is
         | considered good or bad today, GCC has played an essential role
         | and has enabled the emergence of many projects and new
         | compilers. GCC deserves deep respect.
        
         | steveklabnik wrote:
         | I have intense respect for the history of gcc, but everything
         | about using it screams that it's stuck in the past.
         | 
         | LLVM has a lot of problems, but it feels significantly more
         | modern.
         | 
         | I do wish we had a "new LLVM" doing to LLVM what it did to gcc.
         | Just because it's better doesn't mean it's perfect.
         | 
         | Basically, you can respect history while also being honest
         | about the current state of things. But also, doing so requires
         | you to care primarily about things like ease of use, rather
         | than things like licenses. For some people, they care about
         | licenses first, usability second.
        
           | flkenosad wrote:
           | Honestly, I love that both exist with their respective world
           | views.
        
             | steveklabnik wrote:
             | I for sure don't want to suggest that anyone who loves gcc
             | shouldn't be working on what they love. More compilers are
             | a good thing, generally. Just trying to say why I have a
             | preference.
        
           | tialaramex wrote:
           | Their IR is a mess. So a "new LLVM" ought to start by nailing
           | down the IR.
           | 
           | And as a bonus, seems to me a nailed down IR actually _is_
           | that portable assembly language the C people keep telling us
              | is what they wanted. Most of them don't actually want that
           | and won't thank you - but if even 1% of the "I need a
           | portable assembler" crowd actually did want a portable
           | assembler they're a large volume of customers from day one.
        
             | o11c wrote:
             | Having tried writing plugins for both, I very much prefer
             | GCC's codebase. You have to adapt to its quirks, but at
             | least it won't pull the rug from under your feet
             | gratuitously. There's a reason every major project ends up
             | embedding a years-old copy of LLVM rather than just using
             | the system version.
             | 
             | If you're ignoring the API and writing IR directly there
             | are advantages to LLVM though.
        
         | flkenosad wrote:
         | It's the new anti-woke mind virus going around attacking
         | anything "communist" such as copyleft, Stallman, GCC, GNU, etc.
        
       | forrestthewoods wrote:
       | What a great article.
       | 
        | Every time I deal with target triples I get confused and have to
       | refresh my memory. This article makes me feel better in knowing
       | that target triples are an unmitigated cluster fuck of cruft and
       | bad design.
       | 
       | > Go does the correct thing and distributes a cross compiler.
       | 
       | Yes but also no. AFAIK Zig is the _only_ toolchain to provide
       | native cross compiling out of the box without bullshit.
       | 
       | Missing from this discussion is the ability to specify and target
       | different versions of glibc. Something that I think only Zig even
       | _attempts_ to do because Linux's philosophy of building against
       | local system globals is an incomprehensibly bad choice. So all
       | these target triples are woefully underspecified.
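        | 
        | For instance, Zig lets you pin the glibc version right in
        | the target (spelling from memory, so verify against
        | `zig targets`):
        | 
        |     zig cc -target x86_64-linux-gnu.2.28 -o app main.c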
       | 
       | I like that at least Rust defines its own clear list of target
       | triples that are more rational than LLVM's. At this point I feel
       | like the whole concept of a target triples needs to be thrown
       | away. Everything about it is bad.
        
       | peterldowns wrote:
       | Some other sources of target triples (some mentioned in the
       | article, some not):
       | 
       | rustc: `rustc --print target-list`
       | 
       | golang: `go tool dist list`
       | 
       | zig: `zig targets`
       | 
        | As the article points out, the complete lack of standardization
       | and consistency in what constitutes a "triple" (sometimes
       | actually a quad!) is kind of hellishly hilarious.
        
         | ycombinatrix wrote:
         | at least we don't have to deal with --build, --host, --target
         | nonsense anymore
        
           | rendaw wrote:
           | You do on Nix. And it's as inconsistently implemented there
           | as anywhere.
        
         | lifthrasiir wrote:
         | > what constitutes a "triple" (sometimes actually a quad!)
         | 
          | It is actually a quintuple at most because the first part,
         | architecture, may contain a version for e.g. ARM. And yet it
         | doesn't fully describe the actual target because it may require
         | an additional OS version for e.g. macOS. Doubly silly.
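          | 
          | For instance (LLVM-style spellings, quoted from memory):
          | 
          |     armv7-unknown-linux-gnueabihf   (version in the arch)
          |     x86_64-apple-macosx10.15        (version in the OS)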
        
           | achierius wrote:
           | Why would macOS in particular require an OS version where
           | other platforms would not -- just backwards compatibility?
        
       | cbmuser wrote:
       | >>32-bit x86 is extremely not called "x32"; this is what Linux
        | used to call its x86 ILP32 variant before it was removed.<<
       | 
       | x32 support has not been removed from the Linux kernel. In fact,
       | we're still maintaining Debian for x32 in Debian Ports.
        
       | AceJohnny2 wrote:
       | Offtopic, but I'm distracted by the opening example:
       | 
       | > _After all, you don't want to be building your iPhone app on
       | literal iPhone hardware._
       | 
       | iPhones are impressively powerful, but you wouldn't know it from
       | the software lockdown that Apple holds on it.
       | 
       | Example: https://www.tomsguide.com/phones/iphones/iphone-16-is-
       | actual...
       | 
       | There's a reason people were clamoring for Apple to make ARM
       | laptops/desktops for years before Apple finally committed.
        
         | boricj wrote:
          | A more pertinent (if dated) example would be "_you don't want
         | to be building your GBA game on literal Game Boy Advance
         | hardware_".
        
           | richardwhiuk wrote:
           | Or a microcontroller
        
         | AceJohnny2 wrote:
         | I do not think I like this author...
         | 
         | > _A critical piece of history here is to understand the really
         | stupid way in which GCC does cross compiling. Traditionally,
         | each GCC binary would be built for one target triple. [...]
         | Nobody with a brain does this ^2_
         | 
         | You're doing GCC a great disservice by ignoring its storied and
         | essential history. It's over 40 years old, and was created at a
          | time when there were no free/libre compilers. Computers were
          | small and slow. _Of course_ you wouldn't bundle multiple
         | targets in one distribution.
         | 
         | LLVM benefitted from a completely different architecture and
         | starting from a blank slate when computers were already faster
         | and much larger, and was heavily sponsored by a vendor that was
         | innately interested in cross-compiling: Apple. (Guess where
          | LLVM's creator worked for years and led the development tools
         | team)
        
           | steveklabnik wrote:
           | "This was the right way to do it forty years ago, so that's
           | why the experience is worse" isn't a compelling reason for a
           | user to suffer today.
           | 
           | Also, in this specific case, this ignores the history around
           | LLVM offering itself up to the FSF. gcc could have benefitted
           | from this fresh start too. But purely by accident, it did
           | not.
        
             | AceJohnny2 wrote:
             | I'd love to learn what accident you're referring to, Steve!
             | 
             | I vaguely recall the FSF (or maybe only Stallman) arguing
              | _against_ the modular nature of LLVM because a monolithic
              | structure (like GCC's) makes it harder for anti-GPL actors
             | (Apple!) to undermine it. Was this related?
        
               | steveklabnik wrote:
               | That is true history, in my understanding, but it's not
               | related.
               | 
               | Chris Lattner offered to donate the copyright of LLVM to
                | the FSF at one point:
                | https://gcc.gnu.org/legacy-ml/gcc/2005-11/msg00888.html
               | 
                | He even wrote some patches:
                | https://gcc.gnu.org/legacy-ml/gcc/2005-11/msg01112.html
               | 
               | However, due to Stallman's... idiosyncratic email setup,
                | he missed this:
                | https://lists.gnu.org/archive/html/emacs-devel/2015-02/msg00...
               | 
               | > I am stunned to see that we had this offer.
               | 
               | > Now, based on hindsight, I wish we had accepted it.
               | 
               | Note this email is in 2015, ten years after the initial
               | one.
        
               | Philpax wrote:
               | Incredible. Thank you for sharing.
        
               | steveklabnik wrote:
               | You're welcome! It's a wild story. Sometimes, history
               | happens by accident.
        
               | matheusmoreira wrote:
               | Wow that is wild. Imagine how different things could have
               | been...
        
             | FitCodIa wrote:
             | > "This was the right way to do it forty years ago, so
             | that's why the experience is worse" isn't a compelling
             | reason for a user to suffer today.
             | 
             | On my system, "dnf repoquery --whatrequires cross-gcc-
             | common" lists 26 gcc-*-linux-gnu packages (that is, kernel
             | / firmware cross compilers for 26 architectures). The
             | command "dnf repoquery --whatrequires cross-binutils-
             | common" lists 31 binutils-*-linux-gnu packages.
             | 
             | The author writes, "LLVM and all cross compilers that
             | follow it instead put all of the backends in one binary".
             | Do those compilers support 25+ back-ends? And if they do,
             | is it good design to install back-ends for (say) 23 such
             | target architectures that you're never going to cross-
             | compile for, in practice? Does that benefit the user?
             | 
             | My impression is that the author does not understand the
             | modularity of gcc cross compilers / packages because he's
             | unaware of (or doesn't care for) the scale that gcc aims
             | at.
        
               | steveklabnik wrote:
               | > And if they do, is it good design to install back-ends
               | for (say) 23 such target architectures that you're never
               | going to cross-compile for, in practice? Does that
                | benefit the user?
                | 
                |     rustc --print target-list | wc -l
                |     287
               | 
               | I'm kinda surprised at how large that is, actually. But
               | yeah, I don't mind if I have the capability to cross-
               | compile to x86_64-wrs-vxworks that I'm never going to
               | use.
               | 
               | I am not an expert on all of these details in clang
               | specifically, but with rustc, we take advantage of llvm's
               | target specifications, so you that you can even configure
               | a backend that the compiler doesn't yet know about by
               | simply giving it a json file with a description.
               | https://doc.rust-lang.org/nightly/nightly-
               | rustc/rustc_target...
               | 
               | While these built-in ones aren't defined as JSON, you can
               | ask the compiler to print one for you:
                | 
                |     rustc +nightly -Z unstable-options \
                |       --target=x86_64-unknown-linux-gnu \
                |       --print target-spec-json
               | 
               | It's lengthy so instead of pasting here, I've put this in
               | a gist: https://gist.github.com/steveklabnik/a25cdefda1ae
               | f25d7b40df3...
               | 
               | Anyway, it is true that gcc supports more targets than
               | llvm, at least in theory.
               | https://blog.yossarian.net/2021/02/28/Weird-
               | architectures-we...
        
           | jaymzcampbell wrote:
           | The older I get the more this kind of commentary (the OP, not
           | you!) is a total turn off. Systems evolve and there's
            | usually, not always, a reason for why _"things are the way
           | they are"_. It's typically arrogance to have this kind of
           | tone. That said I was a bit like that when I was younger, and
           | it took a few knockings down to realise the world is complex.
        
           | FitCodIa wrote:
           | > and was heavily sponsored by a vendor that was innately
           | interested in cross-compiling
           | 
           | and innately disinterested in Free Software, too
        
         | plorkyeran wrote:
         | iPhones have terrible heat dispersion compared to even a
         | fanless computer like a macbook air. You get a few minutes at
         | full load before thermal throttling kicks in, so you could do
         | the occasional build of your iPhone app on an iPhone but it'd
         | be pretty terrible as a development platform.
         | 
         | At work we had some benchmarking suites that ran on physical
         | devices and even with significant effort put into cooling them
         | they spent more time sleeping waiting to cool off than actually
         | running the benchmarks.
        
       | cwood-sdf wrote:
       | "And no, a "target quadruple" is not a thing and if I catch you
        | saying that I'm gonna bonk you with an Intel optimization manual."
       | 
       | https://github.com/ziglang/zig/issues/20690
        
         | debugnik wrote:
         | The argument is that they're called triples even when they've
          | got more or fewer components than 3. They should have simply
         | been called target tuples or target monikers.
        
           | o11c wrote:
           | "gnu tuple" and "gnu type" are also common names.
           | 
           | The comments in `config.guess` and `config.sub`, which are
           | the origin of triples, use a large variety of terms, at least
            | the following:
            | 
            |     configuration name
            |     configuration type
            |     [machine] specification
            |     system name
            |     triplet
            |     tuple
        
       | therein wrote:
       | I like the code editor style preview on the right. Enough to
       | forgive the slightly clunky scroll.
        
         | tiffanyh wrote:
         | FYI - to see this you need to have your browser at least 1435px
         | wide.
        
         | SrslyJosh wrote:
         | It looks nice, but I find the choppy scrolling (on an M1 MBP,
         | no less!) to be distracting.
         | 
         | It also doesn't really tell me anything about the content,
         | except where I'm going to see tables or code blocks, so I'm not
         | sure what the benefit is.
         | 
         | Given the really janky scrolling, I'd like to have a way to
         | hide it.
        
         | Starlevel004 wrote:
         | Unfortunately the text in the preview shows up in ctrl+f.
        
       | matheusmoreira wrote:
       | > Go originally wanted to not have to link any system libraries,
       | something that does not actually work
       | 
       | It does work on Linux, the only kernel that promises a stable
       | binary interface to user space.
       | 
       | https://www.matheusmoreira.com/articles/linux-system-calls
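        | 
        | A minimal sketch of what that looks like (x86-64 only; the
        | syscall numbers are part of Linux's stable ABI; build with
        | `gcc -nostdlib -static`):
        | 
        |     static long sys3(long n, long a, long b, long c) {
        |         long r;
        |         __asm__ volatile ("syscall"
        |                           : "=a"(r)
        |                           : "a"(n), "D"(a), "S"(b), "d"(c)
        |                           : "rcx", "r11", "memory");
        |         return r;
        |     }
        | 
        |     void _start(void) {
        |         sys3(1, 1, (long)"hi\n", 3);  /* write */
        |         sys3(60, 0, 0, 0);            /* exit  */
        |     }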
        
         | guipsp wrote:
          | Does it _really_ tho? I've had address resolution break more
         | than once in go programs.
        
           | matheusmoreira wrote:
           | That's because on Linux systems it's typical for domain name
           | resolution to be provided by glibc. As a result, people ended
           | up depending on glibc. They were writing GNU/Linux software,
           | not Linux software.
           | 
           | https://wiki.archlinux.org/title/Domain_name_resolution
           | 
           | https://en.wikipedia.org/wiki/Name_Service_Switch
           | 
           | https://man.archlinux.org/man/getaddrinfo.3
           | 
           | This is user space stuff. You can trash all of this and roll
           | your own mechanism to resolve the names however you want. Go
           | probably did so. Linux will not complain in any way
           | whatsoever.
           | 
           | Linux is the only kernel that lets you do this. Other kernels
           | will break your software if you bypass their system
           | libraries.
        
             | guipsp wrote:
             | I mean, that is fine and all, but it doesn't really matter
             | for making the software run correctly on systems that
             | currently exist.
        
               | matheusmoreira wrote:
               | It works fine on current Linux systems. We can have
               | freestanding executables that talk to Linux directly and
               | link against zero system libraries.
               | 
               | It's just that those executables are going to have to
               | resolve names all by themselves. Chances are they aren't
               | going to do it exactly like glibc does. That may or may
               | not be a problem.
        
               | o11c wrote:
               | Historically, when DNS breaks in a not-glibc environment,
               | it's very often found to in fact be a violation of some
               | standard by the not-glibc, rather than a program that
               | fails to document a glibc dependency.
        
               | fc417fc802 wrote:
               | Just connect to the service running on localhost ...
               | 
               | I'm curious. Why isn't getaddrinfo implemented in a
               | similar manner to the loaders that graphics APIs use?
               | Shouldn't that functionality be the responsibility of
               | whatever resolver has been installed?
        
               | o11c wrote:
                | That _is_ how `getaddrinfo` works under GLIBC; it's
               | called NSS. The problem (well, one of them) is the non-
               | GLIBC implementations that say "we don't need no stinkin'
               | loader!"
        
         | lonjil wrote:
         | FreeBSD does as well, but old ABI versions aren't kept forever.
        
         | damagednoob wrote:
         | When developing a small program for my Synology NAS in Go, I'm
         | sure I had to target a specific version of glibc.
        
       | o11c wrote:
       | This article should be ignored, since it disregards the canonical
        | _origin_ of target triples (and the fact that it's linked to
       | `configure`):
       | 
       | https://git.savannah.gnu.org/cgit/config.git/tree/
       | 
       | The `testsuite/` directory contains some data files with a fairly
       | extensive list of known targets. The vendor field should be
        | considered fully extensible, and new combinations of known
       | machine/kernel/libc shouldn't be considered invalid, but anything
       | else should have a patch submitted.
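        | 
        | For example, roughly (output shape from memory):
        | 
        |     $ sh config.guess
        |     x86_64-pc-linux-gnu
        |     $ sh config.sub x86_64-linux
        |     x86_64-pc-linux-gnu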
        
         | jcranmer wrote:
         | This article is a very LLVM-centric view, and it does ignore
         | the GNU idea of a target triple, which is essentially $(uname
          | -m)- _vendor_ -$(uname -s), with _vendor_ determined (so far as
         | I can tell) entirely from uname -s, the system name undergoing
         | some amount of butchering, and version numbers sometimes being
         | included and sometimes not, and Linux getting a LIBC tacked on.
         | 
         | But that doesn't mean the article should be ignored in its
         | entirety. LLVM's target triple parsing _is_ more relevant for
         | several projects (especially given that the GNU target triple
          | scheme _doesn't include native Windows_, which is one of the
         | most common targets in practice!). Part of the problem is that
         | for many people "what is a target triple" is actually a lead-in
         | to the question "what are the valid targets?", and trying to
         | read config.guess is not a good vehicle to discover the answer.
          | config.guess also isn't a good way to find out about target triples
         | for systems that aren't designed to run general-purpose
         | computing, like if you're trying to compile for a GPU
         | architecture, or even a weird x86 context like UEFI.
        
           | o11c wrote:
           | The GNU scheme does in fact have support for various windows
            | targets. It's just that the GNU _compilers_ don't support
           | them all.
        
           | pjc50 wrote:
           | There's MinGW.
        
       | psanford wrote:
       | As a Go developer, I certainly find the complaints about the go
       | conventions amusing. I guess if you have really invested so much
       | into understanding all the details in the rest of this article
       | you might be annoyed that it doesn't translate 1 to 1 to Go.
       | 
       | But for the rest of us, I'm so glad that I can just cross compile
       | things in Go without thinking about it. The annoying thing with
       | setting up cross compilation in GCC is not learning the naming
       | conventions, it is getting the correct toolchains installed and
       | wired up correctly in your build system. Go just ships that out
       | of the box and it is so much more pleasant.
       | 
        | It's also one thing that is great about zig. Using Go+zig when I
       | need to cross compile something that includes cgo in it is so
        | much better than trying to get GCC toolchains set up properly.
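        | 
        | For example (the zig target spelling here is my assumption,
        | not something from the article):
        | 
        |     # pure Go: just set GOOS/GOARCH
        |     GOOS=linux GOARCH=arm64 go build ./...
        | 
        |     # cgo: hand Go a cross C compiler, e.g. zig's clang
        |     CGO_ENABLED=1 CC="zig cc -target aarch64-linux-gnu" \
        |         GOOS=linux GOARCH=arm64 go build ./...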
        
       | theoperagoer wrote:
       | Great content. Also, this website is gorgeous!
        
       | ComputerGuru wrote:
       | Great article but I was really put off by this bit, which aside
       | from being very condescending, simply isn't true and reveals a
       | lack of appreciation for the innovation that I would have thought
       | someone posting about target triples and compilers would have
       | appreciated:
       | 
       | > Why the Windows people invented a whole other ABI instead of
       | making things clean and simple like Apple did with Rosetta on ARM
       | MacBooks? I have no idea, but
       | http://www.emulators.com/docs/abc_arm64ec_explained.htm contains
       | various excuses, none of which I am impressed by. My read is that
       | their compiler org was just worse at life than Apple's, which is
       | not surprising, since Apple does compilers better than anyone
       | else in the business.
       | 
       | I was already familiar with ARM64EC from reading about its
       | development from Microsoft over the past years but had not come
       | across the emulators.com link before - it's a stupendous (long)
       | read and well worth the time if you are interested in lower-level
       | shenanigans. The truth is that Microsoft's ARM64EC solution is a
       | hundred times more brilliant and a thousand times better for
       | backwards (and forwards) compatibility than Rosetta on macOS,
       | which gave the user a far inferior experience than native code,
       | executed (sometimes far) slower, prevented interop between legacy
       | and modern code, left app devs having to do a full port to move
       | to use newer tech (or even just have a UI that matched the rest
       | of the system), and was always intended as a merely transitional
       | bit of tech to last the few years it took for native x86 apps to
        | be developed and take the place of (usurp) the old PPC ones.
       | 
       | Microsoft's solution has none of these drawbacks (except the
       | noted lack of AVX support), doesn't require every app to be 2x or
       | 3x as large as a sacrifice to the fat binaries hack, offers a
       | much more elegant solution for developers to migrate their code
       | (piecemeal or otherwise) to a new platform where they don't know
       | if it will be worth their time/money to invest in a full rewrite,
       | lets users use all the apps they love, and maintains Microsoft's
       | very much well-earned legacy for backwards compatibility.
       | 
       | When you run an app for Windows 2000 on Windows 11 (x86 or ARM),
       | you don't see the old Windows 2000 aesthetic (and if you do,
       | there's an easy way for _users_ to opt into newer theming rather
       | than requiring the developer to do something about it) and you
       | aren 't stuck with bugs from 30 years ago that were long since
       | patched by the vendor many OS releases ago.
        
         | juped wrote:
         | You have neglected to consider that Microsoft bad; consider how
         | they once did something differently from a Linux distribution I
         | use. (This sentiment is alive and well among otherwise
         | intelligent people; it's embarrassing to read.)
        
         | Philpax wrote:
         | This author has a tendency to be condescending about things
         | they find disagreeable. It's why I stopped reading them.
        
         | Zamiel_Snawley wrote:
         | Do those criticisms of Rosetta hold for Rosetta 2?
         | 
         | I assumed the author was talking about the x86 emulator
         | released for the arm migration a few years ago, not the powerpc
         | one.
        
         | plorkyeran wrote:
         | The thing named Rosetta (actually Rosetta 2) for the x86_64 ->
         | ARM transition is technologically completely unrelated to the
         | PPC -> x86 Rosetta, and has none of the problems you mention.
         | There's no user-observable difference between a program using
         | Rosetta and a native program in modern macOS, and porting
         | programs which didn't have any assembly or other CPU-arch-
         | specific code was generally just a matter of wrangling your
         | build system.
        
       | arp242 wrote:
        | _> There are also many fictitious names for 64-bit x86, which you
       | should avoid unless you want the younger generation to make fun
       | of you. amd64 refers to AMD's original implementation of long
       | mode in their K8 microarchitecture, first shipped in their Athlon
       | 64 product. Calling it amd64 is silly and also looks a lot like
       | arm64, and I am honestly kinda annoyed at how much Go code I've
       | seen with files named fast_arm64.s and fast_amd64.s. Debian also
        | uses amd64/arm64, which makes browsing packages kind of
       | annoying._
       | 
       | I prefer amd64 as it's so much easier to type and scans so much
       | easier. x86_64 is so awkward.
       | 
       | Bikeshed I guess and in the abstract I can see how x86_64 is
       | better, but pragmatism > purity and you'll take my amd64 from my
       | cold dead hands.
       | 
       | As for Go, you can get the GOARCH/GOOS combinations from "go tool
       | dist list". Can be useful at times if you want to ensure your
       | code cross-compiles in CI.
        
       | ycombinatrix wrote:
       | >There's a few variants. wasm32-unknown-unknown (here using
       | unknown instead of none as the system, oops)
       | 
       | Why isn't it called wasm32-none-none?
        
         | pie_flavor wrote:
         | As far as I can tell, it's because libstd exists (but is full
         | of do-nothing stubs). There is another `wasm32-none` target
         | which is no_std.
        
       | pie_flavor wrote:
       | Sorry, going to keep typing x64. Unlike the article's
       | recommendation of x86, literally everyone knows exactly what it
       | means at all times.
        
         | qu4z-2 wrote:
         | If someone tells me x86, I am certainly thinking 32-bit
         | protected mode not 64-bit long mode... Granted I'm in the weird
         | space where I know enough to be dangerous but not enough to
         | keep me up-to-date with idiomatic naming conventions.
        
         | kevin_thibedeau wrote:
         | You mean AMD64?
        
       | dvektor wrote:
       | Great read. Love those articles where you go in thinking that you
       | have a pretty solid understanding of the topic and then proceed
       | to learn much more than you thought you would.
        
       | Joker_vD wrote:
       | > no one calls it x64 except for Microsoft. And even though it is
       | fairly prevalent on Windows, I absolutely give my gamedev friends
       | a hard time when they write x64.
       | 
       | So, it turns out, actually a _lot_ of people call it x64 --
        | including the author's own friends! -- it's just that the author
       | dislikes it. Disliking something is fine, but why claim outright
       | falsehood which you know first-hand is false?
       | 
       | Also, the actual proper name for this ISA is, of course, EM64T.
       | /s
       | 
       | > The fourth entry of the triple (and I repeat myself, yes, it's
       | still a triple)
       | 
        | Any actual justification beyond bald assertions of personal
        | preference? Just call it a "tuple", or something...
        
       | IAmLiterallyAB wrote:
       | > However, due to the runaway popularity of LLVM, virtually all
       | compilers now use target triples.
       | 
        | That's a wild take. I think it's pretty universally accepted that
        | GCC and the GNU toolchain are what made this ubiquitous.
       | 
       | Also, the x32 ABI is still around, support is still around, I
       | don't know where the author got that notion
        
       | cestith wrote:
       | > "i386" (the first Intel microarchitecture that implemented
        | protected mode)
       | 
       | This is technically incorrect. The 286 had protected mode. It was
       | a 16-bit protected mode, being a 16-bit processor. It was also
       | incompatible with the later protected mode of the 386 through
       | today's processors. It did, however, exist.
        
       ___________________________________________________________________
       (page generated 2025-04-16 17:03 UTC)