[HN Gopher] Linkers and Loaders (1999) [pdf]
       ___________________________________________________________________
        
       Linkers and Loaders (1999) [pdf]
        
       Author : Tomte
       Score  : 116 points
       Date   : 2024-03-05 10:58 UTC (1 days ago)
        
 (HTM) web link (www.wh0rd.org)
 (TXT) w3m dump (www.wh0rd.org)
        
       | fractalb wrote:
       | I want to learn more about linkers and loaders. Can someone tell
       | me if this is still relevant today? Any other books that are more
       | recent?
        
         | begriffs wrote:
         | I have a hard copy of this book, and it seems like the PDF
         | isn't the final version, judging by the hand-drawn
         | illustrations at least.
         | 
         | The book does dive into some old and arcane object file
         | formats, but that's interesting in its own way. Nice to get a
         | comparison of how different systems did things.
         | 
         | After reading that book and other sources, I examined how
         | people typically create libraries in the UNIX/Linux world. Some
         | of the common practices seem to be lacking, so I wrote an
         | article about how they might be improved:
         | https://begriffs.com/posts/2021-07-04-shared-libraries.html
         | 
         | Edit: That covers the dynamic linking side of things at least.
        
         | t-3 wrote:
         | The concepts are still relevant, but the specifics are mostly
         | outdated. If you read this and then read the relevant standards
         | documents for your platform, you should have a good grounding.
         | I don't know of any other books that cover the topic well.
        
         | bregma wrote:
         | Still pretty much relevant in terms of introductory
         | understanding. Some specific details seem slightly
         | anachronistic for the general population (segmented memory for
         | example, which still exists in modern PC hardware but is of
         | little import to the great majority).
        
         | Tomte wrote:
         | A book that's even older:
         | http://www.davidsalomon.name/assem.advertis/AssemAd.html
        
         | monocasa wrote:
         | Very relevant.
         | 
         | The big outlier not listed here is apple. Quick overview from
         | someone who's written binary analysis tools targeting most of
         | these:
         | 
         | Mach-O is the format they use, going back to NeXTSTEP. They use
         | it everywhere, including their kernels and their internal-only
         | L4 variant they run on the secure enclave. Instead of being
         | structured as a declarative table of the expected address space
         | when loaded (like PE and ELF), Mach-O is built around a
         | command list that has to be run to load the file. So instead of
         | an entry point address in a table somewhere, mach-o has a
         | 'start a thread with this address' command in the command list.
         | Really composable, which means binaries can do a lot of nasty
         | things with that command list.
         | 
         | They also very heavily embrace their ld cache, to the point
         | that they don't bother including a lot of the system libraries
         | on the actual root filesystem anymore, and the kernel is
         | ultimately a cache of the minimal kernel image itself plus the
         | drivers needed at least to boot, all in one file (and actually
         | all of the drivers, I think, on iOS, with driver loading
         | disabled if it's not in the cache?).
         | 
         | There's a neat wrapper format of Mach-O called "fat binaries"
         | that lets you hold multiple Mach-O images in one file, tagged
         | by architecture. This is what's letting current osx have the
         | same application be a native arm binary and a native x86_64
         | binary, the loader just picks the right one based on current
         | arch and settings.
         | 
         | I think those are the main points, but I might have missed
         | something; this was pretty off the cuff.
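Editor's sketch of the two structures described above: the fat wrapper that tags each image by architecture, and the load-command list inside one image. The constants are from Apple's `<mach-o/fat.h>` and `<mach-o/loader.h>`. The function names `fat_slices` and `load_commands` are hypothetical, and the sketch assumes a little-endian 64-bit image. It only lists commands and is nowhere near a real loader.

```python
import struct

FAT_MAGIC = 0xCAFEBABE      # the fat header is always stored big-endian
MH_MAGIC_64 = 0xFEEDFACF    # 64-bit Mach-O header magic
CPU_NAMES = {0x01000007: "x86_64", 0x0100000C: "arm64"}
LC_NAMES = {                # a small subset of the load commands
    0x1: "LC_SEGMENT", 0x2: "LC_SYMTAB", 0x5: "LC_UNIXTHREAD",
    0xC: "LC_LOAD_DYLIB", 0x19: "LC_SEGMENT_64", 0x80000028: "LC_MAIN",
}


def fat_slices(data):
    """Return (arch, offset, size) for each image inside a fat binary."""
    magic, nfat_arch = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a fat binary")
    slices = []
    for i in range(nfat_arch):
        # struct fat_arch: cputype, cpusubtype, offset, size, align
        cputype, _sub, offset, size, _align = struct.unpack_from(
            ">5I", data, 8 + 20 * i)
        slices.append((CPU_NAMES.get(cputype, hex(cputype)), offset, size))
    return slices


def load_commands(data, base=0):
    """Return (name, cmdsize) for each load command of the image at base."""
    # struct mach_header_64 is eight 32-bit fields (32 bytes).
    magic, _cpu, _sub, _ftype, ncmds, _size, _flags, _rsvd = \
        struct.unpack_from("<8I", data, base)
    if magic != MH_MAGIC_64:
        raise ValueError("not a little-endian 64-bit Mach-O image")
    pos = base + 32
    commands = []
    for _ in range(ncmds):
        # Every load command starts with (cmd, cmdsize); cmdsize covers
        # the whole command, so it is also the stride to the next one.
        cmd, cmdsize = struct.unpack_from("<II", data, pos)
        if cmdsize < 8:
            raise ValueError("corrupt load command")
        commands.append((LC_NAMES.get(cmd, hex(cmd)), cmdsize))
        pos += cmdsize
    return commands
```

On a real universal binary one would call `fat_slices(data)` first, then `load_commands(data, offset)` for the slice matching the current architecture. The entry point shows up as an `LC_MAIN` (or, in older binaries, `LC_UNIXTHREAD`) entry in that list rather than as a header field.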
        
           | Someone wrote:
           | > Mach-O is built around a command list that has to be run to
           | load the file. So instead of an entry point address in a
           | table somewhere, mach-o has a 'start a thread with this
           | address' command in the command list. Really composable,
           | which means binaries can do a lot of nasty things with that
           | command list.
           | 
           | ELF isn't immune to doing nasty things at link time.
           | https://www.usenix.org/system/files/conference/woot13/woot13...
           | has an example where they make ping call _execl_ on arbitrary
           | executables as root by tweaking such a declarative table.
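Editor's sketch of what "declarative table" means for ELF: the entry point and the location of the program header table are plain fields in the file header, at offsets fixed by the ELF specification. The function name `elf_summary` is hypothetical. The sketch reads only the header and ignores relocation and dynamic metadata, which is what the linked paper manipulates.

```python
import struct


def elf_summary(data):
    """Return the entry point and program-header location from an ELF header."""
    if data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class, ei_data = data[4], data[5]
    endian = "<" if ei_data == 1 else ">"   # 1 = ELFDATA2LSB, 2 = ELFDATA2MSB
    # After the 16-byte e_ident come e_type, e_machine, e_version,
    # e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize,
    # e_phnum, e_shentsize, e_shnum, e_shstrndx. The three address and
    # offset fields are 8 bytes in ELF64 and 4 bytes in ELF32.
    if ei_class == 2:                       # ELFCLASS64
        fmt = endian + "HHIQQQIHHHHHH"
    elif ei_class == 1:                     # ELFCLASS32
        fmt = endian + "HHIIIIIHHHHHH"
    else:
        raise ValueError("unknown ELF class")
    fields = struct.unpack_from(fmt, data, 16)
    return {
        "entry": fields[3],                 # e_entry
        "phoff": fields[4],                 # e_phoff
        "phnum": fields[9],                 # e_phnum
    }
```

For a file on disk, `elf_summary(open(path, "rb").read(64))` is enough, since the ELF64 header is 64 bytes; the values should match what `readelf -h` reports for the same file.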
        
         | Upvoter33 wrote:
         | definitely worth reading and understanding. Concepts haven't
         | really changed.
        
         | snnn wrote:
         | The mechanisms for Windows DLLs have changed a lot (like how
         | thread-local vars are handled). Besides, this book could not
         | cover C++11's magic statics, or Windows' ARM64X format, or
         | Apple's universal2, because these things are very new. Windows
         | now has the apiset concept, which is unique. On top of it
         | there are direct forwarding and reverse forwarders.
         | 
         | I think for C/C++ programmers it is more practical to know
         | that:
         | 
         | 1. The construction/destruction order for global vars in DLLs
         |    (shared libs) is very different between Linux and Windows.
         |    The same code might work fine on one platform but not on
         |    the other, which makes writing portable code harder.
         | 2. On Linux it is hard to get a shared lib cleanly unloaded,
         |    which might affect how global vars are destructed and
         |    might cause unexpected crashes at exit.
         | 3. Since Windows has a DLL loader lock, there are a lot of
         |    things you cannot do in the constructors/destructors of a
         |    C++ class if the class could be used to define a global
         |    variable. For example, no thread synchronization is
         |    allowed.
         | 4. It is difficult to clean up a thread-local variable if the
         |    variable lives in a DLL and its destructor depends on
         |    another global object.
         | 5. When one static lib depends on another, the linker doesn't
         |    use this information to sort the initialization order of
         |    global vars. So A.lib could depend on B.lib and still get
         |    initialized first. The best way to avoid this problem is
         |    using dynamic linking.
         | 
         | For Windows related topics I highly recommend the "Windows
         | Internals" book.
        
       | elteto wrote:
       | This is a good book, probably still relevant but not up to date.
       | See Advanced C and C++ Compiling [0] for a more modern treatment,
       | including using tools like objdump, nm, and readelf to explore
       | compilation artifacts.
       | 
       | [0] https://a.co/d/gckw03M
        
         | 0xcafefood wrote:
         | https://practicalbinaryanalysis.com/ may not be focused on
         | linking and loading per se, but touches on these topics in a
         | way that felt (to me) more lucid than the book you've linked.
        
           | elteto wrote:
           | Very nice! Thanks for the link.
        
         | kragen wrote:
         | ian lance taylor, author of gold, major binutils contributor,
         | and one of the core golang team, wrote a book on linkers as a
         | series of 20 blog posts in 02008, which of course also includes
         | using tools like objdump, nm, and readelf to explore
         | compilation artifacts https://lwn.net/Articles/276782/
         | 
         | levine's book was _the book_ in the 01990s
        
           | KerrAvon wrote:
           | Big retro revival in the 3700's, huh?
        
             | kragen wrote:
             | man, you have no idea
        
           | jollyllama wrote:
           | Has the subject matter just not changed enough in the last
           | ~decade to warrant new books?
        
       | fuzztester wrote:
       | Someone ping gumby :)
       | 
       | https://news.ycombinator.com/user?id=gumby
       | 
       | He's likely to have something interesting to say here.
        
       | adancalderon wrote:
       | This is a fun book. I remember checking it out from the NC State
       | University Library.
        
       | ndand wrote:
       | Steve Yegge mentions the book in the "Compilers, why learn them?"
       | episode,
       | 
       | https://www.youtube.com/watch?v=o87pOVK8khc&t=1939s
       | 
       | "you're not done actually when you learn how compilers work you
       | also need to learn how linkers and loaders work and you need to
       | learn how operating systems work before all of the magic is
       | gone. So you really want to learn compilers and operating systems
       | and then get this book that I have called Linkers and Loaders
       | it's like the only book on linkers and loaders. I should have
       | brought it and uh and it's really good"
        
       | CalChris wrote:
       | Absolutely start with Levine. There's a lot that hasn't changed
       | and Levine will give you a solid background in the old school. I
       | have a hard copy and it's not going anywhere.
       | 
       | Next up, Ian Lance Taylor's 20-part series. This will give you
       | some motivation for why the Levine world needed to change:
       | performance.
       | 
       | https://lwn.net/Articles/276782/
       | 
       | Next, look at LLVM's lld and their Developer Meeting tutorials on
       | it. There have been many, but Rui Ueyama's 2017 _lld: A Fast,
       | Simple and Portable Linker_ and Peter Smith's _How to add a new
       | target to LLD_ are where to start. Bluntly, Rui utterly
       | solved/crushed/killed the linker performance problem. He probably
       | should get an ACM Software System Award but won't.
       | 
       | After that Oracle's (!) _Linker and Libraries Guide_ and their
       | Linker Aliens blog posts have a lot of gritty detail.
       | 
       | https://docs.oracle.com/cd/E23824_01/html/819-0690/toc.html
       | 
       | http://www.linker-aliens.org
        
       ___________________________________________________________________
       (page generated 2024-03-06 23:00 UTC)