[HN Gopher] Wild - A fast linker for Linux
       ___________________________________________________________________
        
       Wild - A fast linker for Linux
        
       Author : hkalbasi
       Score  : 233 points
       Date   : 2025-01-24 16:25 UTC (6 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | throwaway106382 wrote:
       | > Mold is already very fast, however it doesn't do incremental
       | linking and the author has stated that they don't intend to. Wild
       | doesn't do incremental linking yet, but that is the end-goal. By
       | writing Wild in Rust, it's hoped that the complexity of
       | incremental linking will be achievable.
       | 
       | Can someone explain what is so special about Rust for this?
        
         | dralley wrote:
         | I assume they're referring to thread-safety and the ability to
         | more aggressively parallelize.
        
           | compiler-guy wrote:
           | Mold and lld are already very heavily parallelized. It's one
           | of the things that makes them very fast already.
        
         | senkora wrote:
         | I assume that he is referring to "fearless concurrency", the
         | idea that Rust makes it possible to write more complex
         | concurrent programs than other languages because of the safety
         | guarantees:
         | 
         | https://doc.rust-lang.org/book/ch16-00-concurrency.html
         | 
         | So the logic would go:
         | 
         | 1. mold doesn't do incremental linking because it is too
         | complex to do it while still being fast (concurrent).
         | 
         | 2. Rust makes it possible to write very complex fast
         | (concurrent) programs.
         | 
         | 3. A new linker written in Rust can do incremental linking
         | while still being fast (concurrent).
         | 
         | EDIT: I meant this originally, but comments were posted before
         | I added it so I want to be clear that this part is new: (Any of
         | those three could be false; I take no strong position on that.
         | But I believe that this is the motivating logic.)
        
           | compiler-guy wrote:
           | Both mold and lld are already very heavily concurrent. There
           | is no fear at all there.
        
           | ComputerGuru wrote:
            | Actually a lot of the hacks that mold uses to be the fastest
            | linker would be, ironically, harder to reproduce with rust
            | because they're antithetical to its approach. E.g. mold
            | intentionally skips freeing resources to speed up execution
            | (memory is reclaimed by the OS when the process exits),
            | while rust has a strong RAII approach here that would
            | introduce slowdowns.
        
             | Philpax wrote:
             | I mean, that's pretty easy to do in Rust: https://doc.rust-
             | lang.org/std/mem/struct.ManuallyDrop.html
             | 
             | Also see various arena allocator crates, etc.
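              | 
              | A minimal sketch of that idea (illustrative only, not
              | code from wild or mold), leaking a buffer on purpose so
              | no destructor ever runs:

```rust
use std::mem::ManuallyDrop;

// Allocate a buffer and deliberately never run its destructor; the OS
// reclaims the memory at process exit, which is the trick mold relies on
// to skip free() entirely.
fn leaky_buffer(len: usize) -> &'static mut [u8] {
    let buf = ManuallyDrop::new(vec![0u8; len].into_boxed_slice());
    // ManuallyDrop::into_inner hands the Box back; Box::leak then turns
    // it into a 'static slice without ever scheduling a deallocation.
    Box::leak(ManuallyDrop::into_inner(buf))
}

fn main() {
    let buf = leaky_buffer(1024);
    buf[0] = 42;
    assert_eq!(buf.len(), 1024);
    assert_eq!(buf[0], 42);
}
```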
        
               | ComputerGuru wrote:
               | Not really. You would have to either wrap any standard
               | library types in newtypes with ManuallyDrop implemented
               | or (for some) use a custom allocator. And if you want to
               | free some things in one go but not others that gets much
               | harder, especially when you look at how easy a language
               | like zig makes it.
               | 
               | And if you intentionally leak everything it is onerous to
               | get the borrow checker to realize that unless you use a
               | leaked box for all declaration/allocations, which
               | introduces both friction and performance regressions (due
               | to memory access patterns) because the use of custom
               | allocators doesn't factor into lifetime analysis.
               | 
                | (Spoken as a die-hard rust dev that still thinks it's a
                | better language than zig for most everything.)
        
               | Aurornis wrote:
               | > You would have to either wrap any standard library
               | types in newtypes with ManuallyDrop implemented
               | 
               | ManuallyDrop would presumably be implemented on large
               | data structures where it matters, not on every single
               | type involved in the program.
        
             | cogman10 wrote:
             | Depends on how things are approached.
             | 
                | You could, for example, take advantage of a bump arena
                | allocator in rust, which would allow the linker to have
                | just 1 alloc/dealloc. Mold is still using more
                | traditional allocators under the covers, which won't be
                | as fast as a bump allocator. (Nothing would stop mold
                | from doing the same.)
        
               | cma wrote:
               | Traditional allocators are fast if you never introduce
               | much fragmentation with free, though you may still get
               | some gaps and have some other overhead and not be quite
                | as fast. But why couldn't you just LD_PRELOAD a malloc
                | for mold that worked as a bump/stack/arena allocator and
                | just ignored free, if the third-party stuff isn't making
                | that many allocations?
        
               | zamalek wrote:
               | > Traditional allocators are fast
               | 
               | Really it's allocators in general. Allocations are
               | perceived as expensive only because they are mostly
               | dependent on the amortized cost of prior deallocations.
               | As an extreme example, even GCs can be fast if you avoid
               | deallocation because most typically have a long-lived
               | object heap that rarely gets collected - so if you keep
               | things around that can be long-lived (pooling) their cost
               | mostly goes away.
        
               | cogman10 wrote:
               | Slight disagreement here.
               | 
               | Allocation is perceived as slow because it is. Getting
               | memory from the OS is somewhat expensive because a page
               | of memory needs to be allocated and stored off. Getting
                | memory from traditional allocators is expensive because
                | free space needs to be tracked. When you say "I need 5
                | bytes" the allocator needs to find 5 free bytes to give
                | back to you.
               | 
               | Bump allocators are fast because the operation of "I need
               | 5 bytes" is incrementing the allocation pointer forward
               | by 5 bytes and maybe doing a new page allocation if
               | that's exhausted.
               | 
               | GC allocators are fast because they are generally bump
               | allocators! The only difference is that when exhaustion
               | happens the GC says "I need to run a GC".
               | 
               | Traditional allocators are a bit slower because they are
               | typically something like an arena with skiplists used to
               | find free space. When you free up memory, that skiplist
               | needs to be updated.
               | 
               | But further, unlike bump and GC allocators another fault
               | of traditional allocators is they have a tendency to
               | scatter memory which has a negative impact on CPU cache
               | performance. With the assumption that related memory
               | tends to be allocated at the same time, GCs and bump
               | allocators will colocate memory. But, because of the
               | skiplist, traditional allocators will scattershot
               | allocations to avoid free memory fragmentation.
               | 
               | All this said, for most apps this doesn't matter a whole
               | lot. However, if you are doing a CPU/memory intense
               | operation then this is stuff to know.
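                | 
                | The bump-pointer idea above can be sketched in a few
                | lines of Rust (a toy arena for illustration, not the
                | actual allocator of mold or wild):

```rust
// Minimal bump allocator sketch: one backing buffer, allocation is just
// a pointer increment, and nothing is freed until the whole arena goes.
struct Bump {
    buf: Vec<u8>,
    next: usize,
}

impl Bump {
    fn with_capacity(cap: usize) -> Self {
        Bump { buf: vec![0; cap], next: 0 }
    }

    // "I need n bytes" is a bounds check plus a pointer bump; there is
    // no free list to search and no bookkeeping on free.
    fn alloc(&mut self, n: usize) -> Option<&mut [u8]> {
        let start = self.next;
        let end = start.checked_add(n)?;
        if end > self.buf.len() {
            return None; // exhausted; a real arena would grab a new chunk
        }
        self.next = end;
        Some(&mut self.buf[start..end])
    }
}

fn main() {
    let mut arena = Bump::with_capacity(16);
    assert!(arena.alloc(10).is_some());
    assert!(arena.alloc(10).is_none()); // only 6 bytes left
    assert_eq!(arena.next, 10);
}
```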
        
             | junon wrote:
             | You can absolutely introduce free-less allocators and the
             | like, as well as use `ManuallyDrop` or `Box::leak`. Rust
             | just asks that you're explicit about it.
        
             | dralley wrote:
             | Nothing about Rust requires the use of the heap or RAII.
             | 
             | Also, if wild is indeed faster than mold even without
             | incrementalism, as the benchmarks show, then it seems quite
             | silly to go around making the argument that it's harder to
             | write a fast linker in Rust. It's apparently not _that_
             | hard.
        
         | devit wrote:
         | It's feasible to write complex correct programs with optimal
         | performance in Rust, unlike any other programming language
         | (complex+correct is not feasible in C/C++/assembly/Zig/etc.,
         | optimal performance not possible in any other language).
        
         | compiler-guy wrote:
         | That's puzzling to me too. Rust is a great language, and
         | probably makes developing Wild faster. But the complexity of
         | incremental linking doesn't stem from the linker's
         | implementation language. It stems from all the tracking,
         | reserved spacing, and other issues required to link a
         | previously linked binary (or at least parts of it) a second
         | time.
        
           | tialaramex wrote:
           | I would guess the idea is that in Rust the complexity is
           | cheaper on a "per unit" basis so you can afford more
           | complexity. So yes, it is a more complicated problem than the
           | previous linkers, but, in Rust maybe you can get that done
           | anyway.
        
           | IshKebab wrote:
            | Rust allows you to enforce more invariants at compile time,
           | so implementing a complex system where you are likely to make
           | a mistake and violate those invariants is easier.
        
         | IshKebab wrote:
         | There are two main factors:
         | 
         | 1. Rust's well designed type system and borrow checker makes
         | writing code that works just easier. It has the "if it compiles
         | it works" property (not unique to Rust; people say this about
         | e.g. Haskell too).
         | 
          | 2. Rust's type system - especially its trait system - can be
          | used to enforce safety constraints statically. The obvious
          | one is the Send and Sync traits for thread safety, but there
          | are others; e.g. the Fuchsia network code statically
          | guarantees deadlocks are impossible.
         | 
         | Mold is written in C++ which is extremely error prone in
         | comparison.
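          | 
          | A small standard-library sketch of the Send/Sync point
          | (nothing here is from wild itself):

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Send + Sync are checked at compile time: this only builds because
// Arc<Mutex<u32>> may cross thread boundaries. Swapping Arc for Rc, or
// dropping the Mutex, is rejected by the compiler instead of racing.
fn parallel_count(threads: u32) -> u32 {
    let shared = Arc::new(Mutex::new(0u32));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let s = Arc::clone(&shared);
            thread::spawn(move || *s.lock().unwrap() += 1)
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    let total = *shared.lock().unwrap();
    total
}

fn main() {
    assert_eq!(parallel_count(4), 4);
}
```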
        
         | manoweb wrote:
         | That is baffling. Maybe the author assumes that a language with
         | many safeguards will lead to keeping complexity under control
         | for a difficult task.
         | 
          | By the way, I had to look up what incremental linking is. In
          | practice I think it means that code from libraries and
          | modules that have not changed won't need to be re-packed each
          | time, which will save time for frequent development builds.
          | It's actually ingenious.
        
         | the_duke wrote:
         | Rust has a pretty good incremental caching compiler that makes
         | debug builds relatively fast.
         | 
         | Linking is often a very notable bottleneck for debug binaries
         | and mold can make a big difference.
         | 
         | So interest in speeding up linking for Rust is expected.
        
         | wffurr wrote:
          | I went looking for some writing by the author about _how_ he
          | made wild fast, but couldn't find much:
         | https://davidlattimore.github.io/
        
         | panstromek wrote:
         | Apart from what others said, maybe he plans to use Salsa or
         | something like that. Rust has a few popular libraries for doing
         | this.
        
       | devit wrote:
       | I think the optimal approach for development would be to not
       | produce a traditional linked executable at all, but instead just
       | place the object files in memory, and then produce a loader
       | executable that hooks page faults in those memory areas and on-
       | demand mmaps the relevant object elsewhere, applies relocations
       | to it, and then moves it in place with mremap.
       | 
       | Symbols would be resolved based on an index where only updated
       | object files are reindexed. It could also eagerly relocate in the
       | background, in order depending on previous usage data.
       | 
       | This would basically make a copyless lazy incremental linker.
        
         | IshKebab wrote:
         | Sounds like dynamic linking, sort of.
        
         | fsfod wrote:
         | You can sort of do that with some of LLVM's JIT systems
          | https://llvm.org/docs/JITLink.html, I'm surprised that no one
          | has yet made an edit-and-continue system using it.
        
           | all2 wrote:
           | My parens sense is tingling. This sounds like a lisp-machine,
           | or just standard lisp development environment.
        
         | eseidel wrote:
         | Sounds like Apple's old ZeroLink from the aughts?
        
         | checker659 wrote:
         | Linker overlays?
        
         | ignoramous wrote:
         | > _Symbols would be resolved based on an index where only
         | updated object files are reindexed. It could also eagerly
         | relocate in the background, in order depending on previous
         | usage data._
         | 
          | Not exactly this, but Google's _Propeller_ fixes up
          | ("relinks") Basic Blocks (hot code as traced from PGO) in
          | native code at runtime (like an optimizing JIT compiler
          | would):
         | https://research.google/pubs/propeller-a-profile-guided-reli...
        
         | 95014_refugee wrote:
         | This makes some very naive assumptions about the relationships
         | between entities in a program; in particular that you can make
         | arbitrary assertions about the representation of already-
         | allocated datastructures across multiple versions of a
         | component, that the program's compositional structure morphs in
         | understandable ways, and that you can pause a program in a
         | state where a component can actually be replaced.
         | 
         | By the time you have addressed these, you'll find yourself
         | building a microkernel system with a collection of independent
         | servers and well-defined interaction protocols. Which isn't
         | necessarily a terrible way to assemble something, but it's not
         | quite where you're trying to go...
        
         | jjmarr wrote:
         | Isn't this how dynamic linking works? If you really want to
         | reduce build times, you should be making your hot path in the
         | build a shared library, so you don't have to relink so long as
         | you're not changing the interface.
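          | 
          | In Cargo terms, that setup would look roughly like this (the
          | crate layout and names are hypothetical):

```toml
# Hypothetical Cargo.toml for the frequently-edited "business logic"
# crate: building it as a dylib means the final binary only has to be
# relinked against its interface, not its contents.
[lib]
crate-type = ["dylib"]
```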
        
           | hinkley wrote:
           | But do rust's invariants work across dynamic links?
           | 
           | I thought a lot of its proofs were done at compile time not
           | link time.
        
         | cbsmith wrote:
         | That sounds a lot like traditional dynamic language runtimes.
         | You kind of get that for free with Smalltalk/LISP/etc.
        
       | satvikpendem wrote:
       | I looked at this before, is it ready for production? I thought
       | not based on the readme, so I'm still using mold.
       | 
       | For those on macOS, Apple released a new linker about a year or
       | two ago (which is why the mold author stopped working on their
       | macOS version), and if you're using it with Rust, put this in
        | your config.toml:
        | 
        |     [target.aarch64-apple-darwin]
        |     rustflags = [
        |         "-C",
        |         "link-arg=-fuse-ld=/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/ld",
        |         "-C",
        |         "link-arg=-ld_new",
        |     ]
        
         | brink wrote:
         | I don't even use mold for production. It's for development.
        
         | dralley wrote:
         | No, the author is pretty clear that it shouldn't be used for
         | production yet
        
           | satvikpendem wrote:
           | Great, I'll keep a look out but will hold off on using it for
           | now.
        
         | newman314 wrote:
          | Can you confirm that's still the right location for Sequoia?
         | 
         | I have the command line tools installed and I only have
         | /usr/bin/ld and /usr/bin/ld-classic
        
           | satvikpendem wrote:
            | Then it'd be /usr/bin/ld; I believe my solution was from
            | before they moved the linker.
        
       | ComputerGuru wrote:
       | There's been a lot of interest in faster linkers spurred by the
       | adoption and popularity of rust.
       | 
       | Even modest statically linked rust binaries can take a couple of
       | minutes in the link stage of compilation in release mode (using
        | mold). It's not a rust-specific issue but an amalgam of
        | (usually) strictly static linking, advanced link-time
        | optimizations enabled by llvm like LTO and bolt, and a general
        | dissatisfaction with compile times in the rust community.
        | Rust's (clinically) strong relationship with (read: dependency
        | on) LLVM makes it the most popular language where LLVM link-
        | time magic has been most heavily and universally adopted; you
        | could face these issues with C++ too, but there it would be
        | chalked up to your toolchain rather than the language.
       | 
       | I've been eyeing wild for some time as I'm excited by the promise
       | of an optimizing _incremental_ linker, but to be frank, see zero
       | incentive to even fiddle with it until it can actually, you know,
       | link incrementally.
        
         | sitkack wrote:
         | I solved this by using Wasm. Your outer application shell calls
         | into Wasm business logic, only the inner logic needs to get
         | recompiled, the outer app shell doesn't even need to restart.
        
           | ComputerGuru wrote:
           | I don't think I can use wasm with simd or syscalls, which is
           | the bulk of my work.
        
             | sitkack wrote:
             | I haven't used SIMD in Rust (or Wasm). Syscalls can be
             | passed into the Wasm env.
             | 
             | https://doc.rust-lang.org/core/arch/wasm32/index.html#simd
             | 
             | https://nickb.dev/blog/authoring-a-simd-enhanced-wasm-
             | librar...
             | 
             | Could definitely be more effort than it is worth just to
             | speed up compilation.
        
           | SkiFire13 wrote:
           | How is this different than dynamically linking the business
           | logic library?
        
         | pjmlp wrote:
         | C++ can be rather faster to compile than Rust, because some
         | compilers do have incremental compilation, and incremental
         | linking.
         | 
          | Additionally, the acceptance of binary libraries across the
          | C and C++ ecosystem means that, more often than not, you
          | only need to care about compiling your own application, and
          | not the world, every time you clone a repo or switch
          | development branches.
        
       | pzmarzly wrote:
       | Ever since mold relicensed from AGPL to MIT (as part of mold 2.0
       | release), the worldwide need for making another fast linker has
       | been greatly reduced, so I wasn't expecting a project like this
       | to appear. And definitely wasn't expecting it to already be 2x
       | faster than mold in some cases. Will keep an eye on this project
       | to see how it evolves, best of luck to the author.
        
         | secondcoming wrote:
         | Maybe I'm holding it wrong, but mold isn't faster at all if
         | you're using LTO, which you probably should be.
        
           | 0x457 wrote:
           | I think we're talking about non-release builds here. In
           | those, you don't want to use LTO, you just want to get that
           | binary as fast as possible.
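            | 
            | As a sketch, that split looks like this in Cargo profiles
            | (whether release wants "thin" or "fat" LTO is a per-project
            | choice):

```toml
# LTO off (the default) for fast edit-compile-run cycles...
[profile.dev]
lto = false

# ...and on only for the binaries you actually ship or benchmark.
[profile.release]
lto = "thin"
```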
        
           | compiler-guy wrote:
           | Mold will be faster than LLD even using LTO, but all of its
           | benefits will be absolutely swamped by the LTO process, which
           | is, more or less, recompiling the entire program from high-
           | level LLVM-IR. That's extremely expensive and dwarfs any
           | linking advantages.
           | 
            | So the benefit will be barely noticeable. As another comment
           | points out, LTO should only be used when you need a binary
           | optimized to within an inch of its life, such as a release
           | copy, or a copy for performance testing.
        
             | paulddraper wrote:
             | Username checks out.
             | 
             | And factual.
        
           | Arelius wrote:
            | Yeah, if your development process requires LTO you may be
            | holding it wrong....
            | 
            | Specifically, if LTO is so important that you need to be
            | using it during development, you likely have a very
            | exceptional case, or you have some big architectural issues
            | that are causing much larger performance regressions than
            | they should be.
        
             | benatkin wrote:
             | Being able to choose a middle ground between
             | development/debug builds and production builds is becoming
             | increasingly important. This is especially true when
             | developing in the browser, when often something appears to
             | be slow in development mode but is fine in production mode.
             | 
              | WebAssembly and lightweight MicroVMs are enabling FaaS
              | with real-time code generation, but the build toolchain
              | makes it less appealing when you don't want builds to
              | take half a minute or the result to be slow.
        
             | jcalvinowens wrote:
             | If you're debugging, and your bug only reproduces with LTO
             | enabled, you don't have much of a choice...
        
               | paulddraper wrote:
               | Sure, for that 1% of the time.
        
               | thesz wrote:
                | ...which takes the remaining 99% of the development
                | time...
        
           | benatkin wrote:
           | Agreed. Both fast and small are desirable for sandboxed
           | (least authority) isomorphic (client and server)
           | microservices with WebAssembly & related tech.
        
         | estebank wrote:
         | Note that Mold has no interest in becoming incremental, so
         | there is a big reason there for another linker to exist. I find
         | it kind of embarrassing that MS' linker has been incremental by
         | default for decades, yet there's no production ready
         | incremental linker on Linux yet.
        
           | pjmlp wrote:
            | Additionally, the way precompiled headers are handled in
            | Visual C++ and C++ Builder has always been much better
            | than in traditional UNIX compilers, and now we have
            | modules as well.
        
           | paulddraper wrote:
            | It has to be a candidate for the biggest, longest-standing
            | gap in build tooling ever.
        
         | easythrees wrote:
         | Wait a minute, it's possible to relicense something from GPL to
         | MIT?
        
           | DrillShopper wrote:
           | Yes. Generally you need permissions from contributors (either
           | asking them directly or requiring a contribution agreement
           | that assigns copyright for contributions to either the author
           | or the org hosting the project), but you can relicense from
           | any license to any other license.
           | 
           | That doesn't extinguish the prior versions under the prior
           | license, but it does allow a project to change its license.
        
           | prmoustache wrote:
            | Yes, if you are the only developer and never received nor
            | accepted external contributions, or if you managed to get
            | permission from every single person who contributed, or
            | replaced their code with your own.
        
             | computably wrote:
             | > or if you managed to get permission from every single
             | person who contributed
             | 
             | This makes it sound more difficult than it actually is
             | (logistically); it's not uncommon for major projects to
             | require contributors to sign a CLA before accepting PRs.
        
               | mrighele wrote:
                | That depends on how old and big the project is. For
                | example Linux is "stuck" on GPL2, and even if they
                | wanted to move to something else it wouldn't be
                | feasible to get permission from all the people
                | involved. Some contributors have passed away, making it
                | even more difficult.
        
       | kryptiskt wrote:
       | What would be refreshing would be a C/C++ compiler that did away
       | with the intermediate step of linking and built the whole program
       | as a unit. LTO doesn't even have to be a thing if the compiler
        | can see the entire program in the first place. It would still
        | have to save some build products so that incremental builds
        | are possible, but not as object files; the compiler would need
        | metadata about the origin and dependencies of all the
        | generated code so it would be able to replace the right
        | things.
       | 
       | External libs are most often linked dynamically these days, so
       | they don't need to be built from source, so eliminating the
       | linker doesn't pose a problem for non-open source dependencies.
        | And if that's not enough, letting the compiler also consume object
       | files could provide for legacy use cases or edge cases where you
       | must statically link to a binary.
        
         | almostgotcaught wrote:
         | People trot this out like it's some kind of brilliant insight
         | all the time and I always laugh.
         | 
         | First of all UNITY_BUILD is supported in CMake for a long time
         | - try it out and please report back how many ODR violations
         | your code base has.
         | 
         | Secondly, if you think any compiler is meaningfully doing
         | anything optimal ("whole program analysis") on a TU scale
         | greater than say ~50kloc (ie ~10 files) relative to compiling
         | individually you're dreaming. Let alone on a codebase with
         | millions of lines. Maybe inlining functions at most but you
         | should have those in a header already.
        
           | nn3 wrote:
            | > Secondly, if you think any compiler is meaningfully doing
            | > anything optimal ("whole program analysis") on a TU scale
            | > greater than say ~50kloc (ie ~10 files) relative to
            | > compiling individually you're dreaming.
           | 
            | That's wrong. gcc generates summaries of function
            | properties and propagates those up and down the call tree,
            | which for LTO is then built in a distributed way. It does
            | much more than mere inlining, including advanced analyses
            | like points-to analysis.
           | 
           | https://gcc.gnu.org/onlinedocs/gccint/IPA.html
           | https://gcc.gnu.org/onlinedocs/gccint/IPA-passes.html
           | 
            | It scales to millions of lines of code because it's
            | partitioned.
        
           | jcalvinowens wrote:
           | > if you think any compiler is meaningfully doing anything
           | optimal ("whole program analysis") on a TU scale greater than
           | say ~50kloc (ie ~10 files) relative to compiling individually
           | you're dreaming.
           | 
           | You can build the Linux kernel with LTO: simply diff the LTO
           | vs non-LTO outputs and it will be obvious you're wrong.
        
           | dapperdrake wrote:
           | SQLite3 may be a counter-example:
           | 
           | https://sqlite.org/amalgamation.html
        
         | dapperdrake wrote:
          | SQLite3 just concatenates everything together into one
          | compilation unit. So, more people have been using this than
          | probably know about it.
         | 
         | https://sqlite.org/amalgamation.html
        
       | shmerl wrote:
        | That looks promising. In Rust to begin with, and with the goal
        | of being fast and supporting incremental linking.
        | 
        | To use it with Rust, this can probably also work using gcc as
        | the linker driver.
       | 
        | In the project's .cargo/config.toml:
        | 
        |     [target.x86_64-unknown-linux-gnu]
        |     rustflags = ["-C", "link-arg=-fuse-ld=wild"]
       | 
       | Side note, but why does Rust need to plug into gcc or clang for
       | that? Some missing functionality?
        
         | sedatk wrote:
            | Because the Rust compiler generates IR bytecode, not
            | machine code.
        
           | shmerl wrote:
              | That's a reason to use llvm as part of the Rust compiler
              | toolchain, not to use gcc or clang as the linker driver?
        
             | sedatk wrote:
             | You're right, @davidlattimore seems to have answered that.
        
         | davidlattimore wrote:
          | Unfortunately gcc doesn't accept arbitrary linkers via the
          | `-fuse-ld=` flag. The only linkers it accepts are bfd, gold,
          | lld and mold. It is possible to use gcc to invoke wild as the
          | linker, but currently to do that, you need to create a
          | directory containing the wild linker and rename the binary
          | (or a symlink) to "ld", then pass
          | `-B/path/to/directory/containing/wild` to gcc.
         | 
         | As for why Rust uses gcc or clang to invoke the linker rather
         | than invoking the linker directly - it's because the C compiler
         | knows what linker flags are needed on the current platform in
         | order to link against libc and the C runtime. Things like
         | `Scrt1.o`, `crti.o`, `crtbeginS.o`, `crtendS.o` and `crtn.o`.
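          | 
          | The -B workaround can be sketched in shell (paths are
          | illustrative; a stand-in script plays the role of the wild
          | binary):

```shell
# Create a directory whose "ld" is (a symlink to) wild; here a stand-in
# script takes the place of the real wild binary.
WILD_DIR="$(mktemp -d)"
printf '#!/bin/sh\nexit 0\n' > "$WILD_DIR/wild"
chmod +x "$WILD_DIR/wild"
ln -s "$WILD_DIR/wild" "$WILD_DIR/ld"   # gcc only looks for "ld" by name
# Then point gcc at that directory (not run here):
#   gcc -B"$WILD_DIR" main.o -o main
echo "linker shim ready"
```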
        
           | shmerl wrote:
           | Ah, good to know, thanks!
           | 
            | Maybe it's worth filing a feature request for gcc to have
            | parity with clang for arbitrary linkers?
        
       | sylware wrote:
       | The real issue is actually runtime ELF (and PE) which are
       | obsolete on modern hardware architecture.
        
         | bmacho wrote:
         | What do you mean by this?
        
           | sylware wrote:
            | ELF (COFF) should now be only an assembler output format
            | on modern large hardware architectures.
            | 
            | On modern large hardware architectures, ELF (PE[+]) has
            | overkill complexity for executable files and dynamic
            | libraries.
           | 
            | I am personally using an executable file format of my own
            | that I wrap into an "ELF capsule" on the Linux kernel.
            | With position-independent code, you basically only need
            | memory-mapped segments (which is what dynamic libraries
            | are in this very format). I have two very simple partial
            | linkers written in plain and simple C, one for RISC-V
            | assembly and one for x86_64 assembly, which let me link
            | simple ELF object files (from binutils GAS) into such an
            | executable file.
           | 
           | There is no more centralized "ELF loader".
           | 
            | Of course, there are tradeoffs, but they are worth it a
            | billion times over given the acute simplicity of the
            | format.
           | 
            | (I even have a little VM which lets me interpret simple
            | RISC-V binaries on x86_64.)
        
       | ajb wrote:
       | 2008: gold, a new linker intended to be faster than GNU ld
       | 
       | 2015(?): lld, a drop-in replacement linker, at least 2x as fast
       | as gold
       | 
       | 2021: mold, a new linker, several times faster than lld
       | 
       | 2025: wild, a new linker...
        
         | dundarious wrote:
          | For Windows, there is also [The RAD
         | Linker](https://github.com/EpicGamesExt/raddebugger?tab=readme-
         | ov-fi...) though quite early days.
        
         | wolfd wrote:
         | I'm not sure if you're intending to leave a negative or
         | positive remark, or just a brief history, but the fact that
         | people are still managing to squeeze better performance into
         | linkers is very encouraging to me.
        
           | ajb wrote:
           | Certainly no intention to be negative. Not having run the
           | numbers, I don't know if the older ones got slower over time
           | due to more features, or the new ones are squeezing out _new_
            | performance gains. I guess it's also partly that the bigger
           | codebases scaled up so much over this period, so that there
           | are gains to be had that weren't interesting before.
        
       | bjourne wrote:
       | What a coincidence. :) Just an hour ago I compared the
       | performance of wild, mold, and (plain-old) ld on a C project I'm
       | working on. 23 kloc and 172 files. Takes about 23.4 s of user
       | time to compile with gcc+ld, 22.5 s with gcc+mold, and 21.8 s
       | with gcc+wild, which leads me to believe that link time
       | shouldn't be much of a problem for well-structured projects.
        
         | ndesaulniers wrote:
         | How about ld.lld?
        
         | searealist wrote:
         | Fast linkers are mostly useful in incremental compilation
         | scenarios to cut down on the edit cycle.
        
         | davidlattimore wrote:
         | It sounds like you're building from scratch. In that case, the
         | majority of the time will be spent compiling code, not linking.
         | The case for fast linkers is strongest when doing iterative
         | development. i.e. when making small changes to your code then
         | rebuilding and running the result. With a small change, there's
         | generally very little work for the compiler to do, but linking
         | is still done from scratch, so tends to dominate.
        
         | wolf550e wrote:
         | The linker time is important when building something like
         | Chrome, not small projects.
        
       | ndesaulniers wrote:
       | Can it link the Linux kernel yet? That was a useful milestone
       | for LLD.
        
       | KerrAvon wrote:
       | I'm curious: what's the theory behind why this would be faster
       | than mold in the non-incremental case? "Because Rust" is a fine
       | explanation for a bunch of things, but doesn't explain expected
       | performance benefits.
       | 
       | "Because there's low hanging concurrent fruit that Rust can help
       | us get?" would be interesting but that's not explicitly stated or
       | even implied.
        
       | fuzztester wrote:
       | Related, and a good one, though old:
       | 
       | The book Linkers and Loaders by John Levine.
       | 
       | Last book in the list here:
       | 
       | https://www.johnlevine.com/books.phtml
       | 
       | I had read it some years ago, and found it quite interesting.
       | 
       | It's a standard one in the field.
       | 
       | He has also written some other popular computer books (see _link_
       | above - pun not intended, but noticed).
        
       ___________________________________________________________________
       (page generated 2025-01-24 23:00 UTC)