[HN Gopher] Linkers and Loaders (1999) [pdf]
___________________________________________________________________
Linkers and Loaders (1999) [pdf]
Author : Tomte
Score : 116 points
Date : 2024-03-05 10:58 UTC (1 days ago)
(HTM) web link (www.wh0rd.org)
(TXT) w3m dump (www.wh0rd.org)
| fractalb wrote:
| I want to learn more about linkers and loaders. Can someone tell
| me if this is still relevant today? Any other books that are more
| recent?
| begriffs wrote:
| I have a hard-copy of this book, and it seems like the PDF
| isn't the final version, judging by the hand drawn
| illustrations at least.
|
| The book does dive into some old and arcane object file
| formats, but that's interesting in its own way. Nice to get a
| comparison of how different systems did things.
|
| After reading that book and other sources, I examined how
| people typically create libraries in the UNIX/Linux world. Some
| of the common practices seem to be lacking, so I wrote an
| article about how they might be improved:
| https://begriffs.com/posts/2021-07-04-shared-libraries.html
|
| Edit: That covers the dynamic linking side of things at least.
| t-3 wrote:
| The concepts are still relevant, but the specifics are mostly
| outdated. If you read this and then read the relevant standards
| documents for your platform, you should have a good grounding.
| I don't know of any other books that cover the topic well.
| bregma wrote:
| Still pretty much relevant in terms of introductory
| understanding. Some specific details seem slightly
| anachronistic for the general population (segmented memory for
| example, which still exists in modern PC hardware but is of
| little import to the great majority).
| Tomte wrote:
| A book that's even older:
| http://www.davidsalomon.name/assem.advertis/AssemAd.html
| monocasa wrote:
| Very relevant.
|
| The big outlier not listed here is apple. Quick overview from
| someone who's written binary analysis tools targeting most of
| these:
|
| Mach-O is the format they use, going back to nextstep. They use
| it everywhere including their kernels and their internal only
| L4 variant they run on the secure enclave. Instead of being
| structured as a declarative table of the expected address space
| when loaded (like PE and ELF), Mach-O is built around a
| command list that has to be run to load the file. So instead of
| an entry point address in a table somewhere, mach-o has a
| 'start a thread with this address' command in the command list.
| Really composable, which means binaries can do a lot of nasty
| things with that command list.
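
A minimal sketch of the command-list idea described above, assuming a little-endian 64-bit image and the constants published in Apple's `<mach-o/loader.h>`. The helper names (`walk_load_commands`, `find_entry_offset`, `build_toy_image`) are hypothetical, and the toy image is only a header plus one command, not a loadable binary. Modern binaries usually carry `LC_MAIN`; the older "start a thread" form is `LC_UNIXTHREAD`, which this sketch does not decode.

```python
import struct

# Constants as defined in Apple's <mach-o/loader.h>.
MH_MAGIC_64 = 0xFEEDFACF
LC_REQ_DYLD = 0x80000000
LC_UNIXTHREAD = 0x5               # older "start a thread" command (not decoded here)
LC_MAIN = 0x28 | LC_REQ_DYLD      # modern entry point command

HEADER = struct.Struct("<IiiIIIII")  # mach_header_64: 8 fields, 32 bytes
LOAD_CMD = struct.Struct("<II")      # every command starts with cmd, cmdsize


def walk_load_commands(image: bytes):
    """Yield (cmd, payload) for each load command in a 64-bit Mach-O image."""
    magic, _cpu, _sub, _ftype, ncmds, sizeofcmds, _flags, _rsvd = \
        HEADER.unpack_from(image, 0)
    if magic != MH_MAGIC_64:
        raise ValueError("not a little-endian 64-bit Mach-O image")
    offset = HEADER.size
    end = offset + sizeofcmds
    for _ in range(ncmds):
        cmd, cmdsize = LOAD_CMD.unpack_from(image, offset)
        if cmdsize < LOAD_CMD.size or offset + cmdsize > end:
            raise ValueError("malformed load command")
        yield cmd, image[offset + LOAD_CMD.size: offset + cmdsize]
        offset += cmdsize


def find_entry_offset(image: bytes):
    """Return entryoff from the first LC_MAIN command, or None if absent."""
    for cmd, payload in walk_load_commands(image):
        if cmd == LC_MAIN:
            entryoff, _stacksize = struct.unpack_from("<QQ", payload, 0)
            return entryoff
    return None


def build_toy_image(entryoff: int) -> bytes:
    """Hypothetical helper: a header followed by a single LC_MAIN command."""
    lc_main = struct.pack("<IIQQ", LC_MAIN, 24, entryoff, 0)
    # cputype 0x0100000C is CPU_TYPE_ARM64; filetype 2 is MH_EXECUTE.
    header = HEADER.pack(MH_MAGIC_64, 0x0100000C, 0, 2, 1, len(lc_main), 0, 0)
    return header + lc_main


if __name__ == "__main__":
    print(hex(find_entry_offset(build_toy_image(0x4000))))
```

The point of the sketch is that the entry point is found by iterating commands rather than by reading a fixed field, which is also why a parser has to validate every `cmdsize` it walks over.
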
|
| They also very heavily embrace their ld cache to the point that
| they don't bother including a lot of the system libraries on
| the actual root filesystem anymore, and the kernel is
| ultimately a cache of the minimal kernel image itself as well
| as the drivers needed at least to boot, all in one file (and
| actually all of the drivers I think on iOS with driver loading
| disabled if it's not in the cache?).
|
| There's a neat wrapper format of Mach-O called "fat binaries"
| that lets you hold multiple Mach-O images in one file, tagged
| by architecture. This is what's letting current osx have the
| same application be a native arm binary and a native x86_64
| binary, the loader just picks the right one based on current
| arch and settings.
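
A rough sketch of how a fat (universal) wrapper can be read, assuming the 32-bit `fat_header` / `fat_arch` layout from Apple's `<mach-o/fat.h>`, which is stored big-endian. The helper names are hypothetical, the toy payloads are placeholder bytes rather than real Mach-O images, and the real loader also weighs CPU subtype and user settings, which this sketch ignores.

```python
import struct

# Constants as defined in Apple's <mach-o/fat.h> and <mach/machine.h>.
FAT_MAGIC = 0xCAFEBABE
CPU_ARCH_ABI64 = 0x01000000
CPU_TYPE_X86_64 = 7 | CPU_ARCH_ABI64
CPU_TYPE_ARM64 = 12 | CPU_ARCH_ABI64

FAT_HEADER = struct.Struct(">II")     # magic, nfat_arch (big-endian)
FAT_ARCH = struct.Struct(">iiIII")    # cputype, cpusubtype, offset, size, align


def list_slices(blob: bytes):
    """Return one dict per architecture slice described in the fat header."""
    magic, nfat_arch = FAT_HEADER.unpack_from(blob, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a 32-bit fat binary")
    slices = []
    for i in range(nfat_arch):
        cputype, cpusubtype, offset, size, align = FAT_ARCH.unpack_from(
            blob, FAT_HEADER.size + i * FAT_ARCH.size)
        slices.append({"cputype": cputype, "cpusubtype": cpusubtype,
                       "offset": offset, "size": size, "align": align})
    return slices


def pick_slice(blob: bytes, wanted_cputype: int):
    """Return the bytes of the first slice matching wanted_cputype, or None."""
    for s in list_slices(blob):
        if s["cputype"] == wanted_cputype:
            return blob[s["offset"]: s["offset"] + s["size"]]
    return None


def build_toy_fat(images: dict) -> bytes:
    """Hypothetical helper: pack {cputype: payload} into a fat wrapper.

    Real tools page-align each slice; this toy packs them back to back.
    """
    table_len = FAT_HEADER.size + len(images) * FAT_ARCH.size
    table = bytearray(FAT_HEADER.pack(FAT_MAGIC, len(images)))
    body = bytearray()
    for cputype, payload in images.items():
        table += FAT_ARCH.pack(cputype, 0, table_len + len(body), len(payload), 0)
        body += payload
    return bytes(table + body)


if __name__ == "__main__":
    fat = build_toy_fat({CPU_TYPE_X86_64: b"x86-slice",
                         CPU_TYPE_ARM64: b"arm64-slice"})
    print(pick_slice(fat, CPU_TYPE_ARM64))
```
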
|
| I think those are the main points, but I might have missed
| something; this was pretty off the cuff.
| Someone wrote:
| > Mach-O is built around a command list that has to be run to
| load the file. So instead of an entry point address in a
| table somewhere, mach-o has a 'start a thread with this
| address' command in the command list. Really composable,
| which means binaries can do a lot of nasty things with that
| command list.
|
| ELF isn't immune to doing nasty things at link time.
| https://www.usenix.org/system/files/conference/woot13/woot13... has
| an example where they make ping call _execl_ on arbitrary
| executables as root by tweaking such a declarative table.
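
For comparison with the command-list style, here is a small sketch of the "declarative table" approach: in ELF the entry point is a fixed field (`e_entry`) and the segments to map are rows in the program header table. It assumes a little-endian ELF64 image with the standard `Elf64_Ehdr` / `Elf64_Phdr` layouts; the helper names and the toy image are hypothetical, and this does not reproduce the technique from the linked paper.

```python
import struct

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # Elf64_Ehdr, 64 bytes
PHDR = struct.Struct("<IIQQQQQQ")          # Elf64_Phdr, 56 bytes
PT_LOAD = 1


def read_elf64(image: bytes):
    """Return (entry_address, segments) from a little-endian ELF64 image.

    Each segment is a tuple (p_type, p_flags, p_vaddr, p_memsz).
    """
    (ident, _type, _machine, _version, e_entry, e_phoff, _shoff, _flags,
     _ehsize, e_phentsize, e_phnum, _shentsize, _shnum, _shstrndx) = \
        EHDR.unpack_from(image, 0)
    # ident[4] == 2 means ELFCLASS64, ident[5] == 1 means little-endian.
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 image")
    segments = []
    for i in range(e_phnum):
        (p_type, p_flags, _offset, p_vaddr, _paddr, _filesz, p_memsz,
         _align) = PHDR.unpack_from(image, e_phoff + i * e_phentsize)
        segments.append((p_type, p_flags, p_vaddr, p_memsz))
    return e_entry, segments


def build_toy_elf(entry: int, vaddr: int) -> bytes:
    """Hypothetical helper: an ELF header plus one PT_LOAD row, no contents."""
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
    # e_type 2 is ET_EXEC, e_machine 62 is EM_X86_64.
    ehdr = EHDR.pack(ident, 2, 62, 1, entry, EHDR.size, 0, 0,
                     EHDR.size, PHDR.size, 1, 0, 0, 0)
    # p_flags 5 is PF_R | PF_X.
    phdr = PHDR.pack(PT_LOAD, 5, 0, vaddr, vaddr, 0x1000, 0x1000, 0x1000)
    return ehdr + phdr


if __name__ == "__main__":
    entry, segments = read_elf64(build_toy_elf(0x401000, 0x400000))
    print(hex(entry), segments)
```
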
| Upvoter33 wrote:
| definitely worth reading and understanding. Concepts haven't
| really changed.
| snnn wrote:
| The mechanisms for Windows DLLs have changed a lot (like
| how thread local vars are handled). Besides, this book could
| not cover C++11's magic statics, or Windows' ARM64X format, or
| Apple's universal2, because these things are very new. Windows
| now has the apiset concept, which is very unique. Upon it there
| are direct forwarding and reverse forwarders.
|
| I think for C/C++ programmers it is more practical to know
| that:
|
| 1. The construction/destruction order for global vars in
| DLLs (shared libs) is very different between Linux and Windows.
| It means the same code might work fine on one platform but not
| the other, which makes writing portable code harder.
|
| 2. On Linux it is hard to get a shared lib cleanly unloaded,
| which might affect how global vars are destructed and might
| cause unexpected crashes at exit.
|
| 3. Since Windows has a DLL loader lock, there are a lot of
| things you cannot do in the constructors/destructors of a C++
| class if the class could be used to define a global variable.
| For example, no thread synchronization is allowed.
|
| 4. It is difficult to clean up a thread local variable if the
| variable lies in a DLL and the variable's destructor depends on
| another global object.
|
| 5. When one static lib depends on another, the linker doesn't
| use this information to sort the initialization order of global
| vars. So A.lib may depend on B.lib and yet A.lib gets
| initialized first. The best way to avoid this problem is using
| dynamic linking.
|
| For Windows related topics I highly recommend the "Windows
| Internals" book.
| elteto wrote:
| This is a good book, probably still relevant but not up to date.
| See Advanced C and C++ Compiling [0] for a more modern treatment,
| including using tools like objdump, nm, and readelf to explore
| compilation artifacts.
|
| [0] https://a.co/d/gckw03M
| 0xcafefood wrote:
| https://practicalbinaryanalysis.com/ may not be focused on
| linking and loading per se, but touches on these topics in a
| way that felt (to me) more lucid than the book you've linked.
| elteto wrote:
| Very nice! Thanks for the link.
| kragen wrote:
| ian lance taylor, author of gold, major binutils contributor,
| and one of the core golang team, wrote a book on linkers as a
| series of 20 blog posts in 02008, which of course also includes
| using tools like objdump, nm, and readelf to explore
| compilation artifacts https://lwn.net/Articles/276782/
|
| levine's book was _the book_ in the 01990s
| KerrAvon wrote:
| Big retro revival in the 3700's, huh?
| kragen wrote:
| man, you have no idea
| jollyllama wrote:
| Has the subject matter just not changed enough in the last
| ~decade to warrant new books?
| fuzztester wrote:
| Someone ping gumby :)
|
| https://news.ycombinator.com/user?id=gumby
|
| He's likely to have something interesting to say here.
| adancalderon wrote:
| This is a fun book. I remember checking it out from the NC State
| University Library.
| ndand wrote:
| Steve Yegge mentions the book in the "Compilers, why learn them?"
| episode,
|
| https://www.youtube.com/watch?v=o87pOVK8khc&t=1939s
|
| "you're not done actually when you learn how compilers work you
| also need to learn how linkers and loaders work and you need to
| learn how operating systems work before all of the magic is
| gone. So you really want to learn compilers and operating systems
| and then get this book that I have called Linkers and Loaders
| it's like the only book on linkers and loaders. I should have
| brought it and uh and it's really good"
| CalChris wrote:
| Absolutely start with Levine. There's a lot that hasn't changed
| and Levine will give you a solid background in the old school. I
| have a hard copy and it's not going anywhere.
|
| Next up, Ian Lance Taylor's 20-part series. This will give
| you some motivation why the Levine world needed to change:
| performance.
|
| https://lwn.net/Articles/276782/
|
| Next, look at LLVM's lld and their Developer Meeting tutorials on
| it. There have been many, but Rui Ueyama's 2017 _lld: A Fast,
| Simple and Portable Linker_ and Peter Smith's _How to add a new
| target to LLD_ are where to start. Bluntly, Rui utterly
| solved/crushed/killed the linker performance problem. He
| probably should get an ACM Software System Award but won't.
|
| After that Oracle's (!) _Linker and Libraries Guide_ and their
| Linker Aliens blog posts have a lot of gritty detail.
|
| https://docs.oracle.com/cd/E23824_01/html/819-0690/toc.html
|
| http://www.linker-aliens.org
___________________________________________________________________
(page generated 2024-03-06 23:00 UTC)