[HN Gopher] Evolution of the ELF object file format
       ___________________________________________________________________
        
       Evolution of the ELF object file format
        
       Author : MaskRay
       Score  : 40 points
       Date   : 2024-05-26 22:42 UTC (1 days ago)
        
 (HTM) web link (maskray.me)
 (TXT) w3m dump (maskray.me)
        
       | MaskRay wrote:
       | A few folks have asked me about the status of the generic ABI
       | (unmaintained?) and the availability of an up-to-date
       | specification (no). I compiled "History" and "Evolution of the
       | generic ABI" in the blog post.
       | 
       | I have two specific questions:
       | 
       | - Key features (symbol visibility, section groups, SHF_MERGE,
       | etc) were all available as of April 2001. Where can we find the
       | discussion mailing lists? Are they still available?
       | 
       | - How does the ABI end up being "All rights reserved" by SCO?
       | Tool Interface Standard (TIS) Portable Formats Specification,
       | version 1.2 effectively put the specification in the public
       | domain.
        
         | Joker_vD wrote:
         | > How does the ABI end up being "All rights reserved" by SCO?
         | 
         | The text of the spec is copyrighted by SCO; they did not put
         | it in the public domain. That's what "All rights reserved"
         | was intended to mean, even though the phrase has not meant
         | anything since 2000.
        
           | LegionMammal978 wrote:
           | Yeah, it's not like they could copyright the ELF format
           | anyway, only that particular description, or any particular
           | implementation they had. Pretty much everyone seems to agree
           | that file formats themselves aren't copyrightable expression,
           | though I have trouble finding any U.S. case law to that
           | effect.
           | 
           | (Personally, that makes me a bit uneasy: for instance, if the
           | copyrighted spec lists the file sections in a certain order,
           | and your implementation happens to output them in the same
           | order even if it doesn't have to, then have you infringed on
           | the owner's copyright of that particular arrangement?)
           | 
           | Meanwhile, they could have gotten a patent on (some parts of)
           | the format, but that doesn't seem to be the case here.
        
             | Veserv wrote:
             | The rule in general (not just for software) is that patents
             | are for functional elements, whereas copyrights are for
             | non-functional elements. The two are mutually exclusive: an
             | element can be patentable or copyrightable, but not both.
             | 
             | If it is necessary for operation or inter-operation [1],
             | then they probably cannot enforce copyright or trademark.
             | 
             | [1] https://en.wikipedia.org/wiki/Sega_v._Accolade
        
               | LegionMammal978 wrote:
               | Sure, but usually in any file format, an encoder has some
               | choice in how to produce its output, with no functional
               | difference either way. For instance, different PNG
               | encoders might include the necessary chunks in a
               | different order in the file. And my concern is that some
               | such choices, if they align with choices described in the
               | copyrighted spec, might be protectable as a creative
               | arrangement or similar. (E.g., a spec might list the
               | chunks in such an order that their names form a creative
               | acronym.)
               | 
               | Of course, the defense against this would be to scramble
               | (or normalize) all non-functional choices compared to
               | anything in the spec. But you have to be careful to make
               | sure there's nothing left of the spec's non-functional
               | influence.
               | 
               | At least _Oracle v. Google_ appears to provide some
               | ammunition here in favor of implementers: the court found
               | the transformative use in that case enough to trump even
               | the byte-for-byte copying of the API signatures. So
               | perhaps interoperability could similarly trump such
               | non-functional copying from the spec. But overall, it's
               | still on shakier ground than I'd like.
        
             | MaskRay wrote:
             | I feel that The SCO Group's role in the evolution of the
             | System V ABI was more that of a curator/editor than an
             | innovator, having inherited the System V ABI from previous
             | entities. Given that the Tool Interface Standard
             | (TIS) Committee has essentially released the ELF-related
             | chapters into the public domain, and others have made
             | changes, it's unclear what specific rights The SCO Group
             | (and now Xinuos) could claim to reserve. (That said, their
             | maintenance work needs to be remembered.)
        
           | vintagedave wrote:
           | It's concerning that a huge open source ecosystem like Linux
           | depends on a closed specification. That is, I read it as
           | meaning that the spec cannot be continued, developed,
           | evolved, or republished without a full redefinition and
           | respecification.
           | 
           | Has anyone looked at creating a new object format for Linux?
           | A non-open spec seems a minor issue, really, in an era when
           | we put binary blobs in the kernel (hi Nvidia.) But the more
           | decades I work in closed source, the more I value open
           | source, and believe in keeping _everything_ open.
        
             | yjftsjthsd-h wrote:
             | I wouldn't think you'd need to create a new format, just
             | write up a spec of what all available FOSS implementations
             | are actually doing and agree that that's the standard going
             | forward. Between Linux/*BSD/illumos you should be able to
             | pretty completely describe all currently-active uses of
             | ELF.
             | 
             | (Of course this assumes that doing this is sufficient to
             | make it legally not derived from the original spec as far
             | as copyright law cares, which, as a non-lawyer, I can't
             | be confident about.)
        
             | aseipp wrote:
             | > Has anyone looked at creating a new object format for
             | Linux?
             | 
             | No, because that's overkill for this problem: the
             | _de facto_ standard is "What do modern free Unices do",
             | where those are basically Linux/BSD/illumos. Besides, it's
             | not unheard of in Unix history to have free implementations
             | of proprietary standards, where those implementations
             | became de facto standards anyway.
             | 
             | Even if it were a seriously pressing issue, writing a new
             | specification that covers ELF as it exists today is going
             | to be many, many, many times easier than introducing a new
             | executable format across the ecosystem; they're not even in
             | the same league or sport in terms of effort. If you were
             | going to introduce a whole new object file format it would
             | be much better to do it and justify it on the basis of
             | fundamental technical/architectural changes.
        
         | jcranmer wrote:
         | One of the things I've been (very slowly) putting together is a
         | master list of all the references one needs to build a compiler
         | toolchain--such as processor ISA manuals, ABI specifications,
         | language standards, even things like IEEE 754 or DWARF.
         | 
         | While IEEE 754 is an issue because it's not freely available
         | for most people, the only document that has truly stumped me is
         | the ELF specification. From your blog post, it seems like my
         | failure to find the most up-to-date specification is simply
         | because one _doesn't exist_.
        
           | marssaxman wrote:
           | Are you familiar with the OSDev wiki? Perhaps their list of
           | links will be helpful:
           | 
           | https://wiki.osdev.org/ELF#External_Links
        
             | jcranmer wrote:
             | That wiki's list of links isn't always the most up-to-date.
        
               | marssaxman wrote:
               | How often does anyone change the ELF file format, though?
               | 
               | No matter; that's the most up-to-date list of
               | references I know about.
        
               | jcranmer wrote:
               | The ELF format isn't the one that I'm the most concerned
               | about, but for example compressed sections were updated
               | in the past decade. The ABI for x86-64 is out of date
               | (actually, the site it links to is now a 404, but my
               | recollection is that it was a version from around
               | 2013)--missing
               | things like _Float16 rules, DWARF register naming for AMX
               | and APX registers, microarchitecture levels, new
               | relocation types,
        
           | agent281 wrote:
           | Are you going to make this list of resources available online
           | or is it for personal consumption only?
        
             | jcranmer wrote:
             | So far, the list of resources is a collection of PDFs I've
             | downloaded plus a script to download the latest version of
             | every relevant specification. (Although it does seem that
             | Intel likes breaking its links to the latest ISA PDF every
             | now and then). I don't have a great solution for things
             | that are essentially web-only pages, which is how the
             | Itanium ABI is handled (and unlike C, C++ makes it hard to
             | find the relevant working draft copies, instead preferring
             | to use the HTML-based https://eel.is/c++draft for that).
             | 
             | I'll probably make it publicly available eventually; it
             | just hasn't been a priority for me.
        
           | MaskRay wrote:
           | Thank you! This will be very useful. Yes, for C/C++ one
           | needs:
           | 
           | * ISA manual
           | * ELF (generic ABI, psABI (e.g. x86-64-psABI), OSABI)
           | * DWARF
           | * Floating-point
           | * Language standards
           | * Itanium C++ ABI
           | 
           | and probably a few other things.
           | 
           | > it seems like my failure to find the most up-to-date
           | specification is simply because one doesn't exist.
           | 
           | While an up-to-date specification is unavailable, hopefully
           | that is not too bad, because all essential features were
           | complete as of 2001 :)
           | 
           | RELR is a linker and loader feature, not on the compiler
           | side. Compressed debug sections are a pretty natural
           | extension.
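
For readers unfamiliar with the compressed-section extension mentioned above, here is a minimal Python sketch of how such a section could be unpacked. It assumes a 64-bit little-endian object, the gABI `Elf64_Chdr` layout (`ch_type`, `ch_reserved`, `ch_size`, `ch_addralign`), and zlib compression (`ELFCOMPRESS_ZLIB = 1`). The helper names `decompress_section` and `compress_section` are hypothetical and not taken from any toolchain.

```python
import struct
import zlib

ELFCOMPRESS_ZLIB = 1  # ch_type value for zlib-compressed section data

# Elf64_Chdr, little-endian: ch_type (u32), ch_reserved (u32),
# ch_size (u64, uncompressed size), ch_addralign (u64, uncompressed alignment)
CHDR64 = struct.Struct("<IIQQ")


def decompress_section(raw: bytes) -> bytes:
    """Return the uncompressed contents of an SHF_COMPRESSED section.

    `raw` is the section as stored in the file: an Elf64_Chdr followed
    by the compressed payload. Hypothetical helper, for illustration only.
    """
    if len(raw) < CHDR64.size:
        raise ValueError("section too small to hold an Elf64_Chdr")
    ch_type, _reserved, ch_size, _addralign = CHDR64.unpack_from(raw, 0)
    if ch_type != ELFCOMPRESS_ZLIB:
        raise ValueError(f"unsupported ch_type {ch_type}")
    data = zlib.decompress(raw[CHDR64.size:])
    if len(data) != ch_size:
        raise ValueError("ch_size does not match the decompressed length")
    return data


def compress_section(data: bytes, addralign: int = 1) -> bytes:
    """Build a synthetic compressed section (assumed helper for testing)."""
    header = CHDR64.pack(ELFCOMPRESS_ZLIB, 0, len(data), addralign)
    return header + zlib.compress(data)
```

A round trip through `compress_section` and `decompress_section` should give back the original bytes. A real reader would also check the section's `SHF_COMPRESSED` flag and handle 32-bit and big-endian objects, which this sketch skips.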
        
             | jcranmer wrote:
             | I file things like linkers, loaders, and debuggers into the
             | same bucket as compilers, so all of the ELF extensions are
             | interesting to me, even if compilers don't interact with
             | them directly. I'm also interested in things like the
             | loader-debugger rendezvous interface. Basically any
             | document anyone who touches any project in llvm-project
             | might need to read.
             | 
             | And then there are things like the layout of
             | .gcc_except_table, which are undocumented except for random
             | blog posts.
        
       | yjftsjthsd-h wrote:
       | > Q18: How can you get a single binary to work identically across
       | all these diverse systems?
       | 
       | > Most Unix-on-Intel binary packages are already largely similar.
       | Almost all such operating systems use the "ELF" binary
       | 'packaging'; the various operating systems have small but
       | significant differences, though, that make each system's ELF
       | binary unusable on others'.
       | 
       | Though the scope has diminished with the decline of proprietary
       | unixen, there is a nice bright spot with APE binaries (
       | https://justine.lol/ape.html ) which are in some regards even
       | _more_ portable since they work on Darwin (which natively uses
       | Mach-O) and NT (which natively uses PE).
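
As a small illustration of the shared ELF "packaging" in the quoted answer, the sketch below decodes the identification bytes (`e_ident`) at the start of an ELF file, including the `EI_OSABI` byte. The function name `describe_ident` is hypothetical, and the OSABI table lists only a few values. Many Linux binaries are marked `ELFOSABI_NONE`, so this byte alone does not identify the target system; the differences the quote refers to mostly live elsewhere.

```python
import struct

ELF_MAGIC = b"\x7fELF"

# A few EI_OSABI values; this table is intentionally incomplete.
OSABI_NAMES = {
    0: "ELFOSABI_NONE (System V)",
    3: "ELFOSABI_GNU (Linux)",
    6: "ELFOSABI_SOLARIS",
    9: "ELFOSABI_FREEBSD",
    12: "ELFOSABI_OPENBSD",
}


def describe_ident(header: bytes) -> dict:
    """Decode the 16 e_ident bytes at the start of an ELF file."""
    if len(header) < 16 or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    # Bytes 4..8: EI_CLASS, EI_DATA, EI_VERSION, EI_OSABI, EI_ABIVERSION
    ei_class, ei_data, ei_version, ei_osabi, ei_abiversion = struct.unpack_from(
        "5B", header, 4
    )
    return {
        "class": {1: "ELF32", 2: "ELF64"}.get(ei_class, "invalid"),
        "data": {1: "little-endian", 2: "big-endian"}.get(ei_data, "invalid"),
        "version": ei_version,
        "osabi": OSABI_NAMES.get(ei_osabi, f"unknown ({ei_osabi})"),
        "abiversion": ei_abiversion,
    }
```

To try it on a real file, read the first 16 bytes in binary mode and pass them to `describe_ident`.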
        
       | CalChris wrote:
       | Who controls the ELF standard? As best I can tell, you submit
       | a pull request to LLVM and maybe GCC.
        
         | MaskRay wrote:
         | - 2003-2010 The SCO Group
         | 
         | - 2011- Xinuos, but Xinuos has stopped updating
         | https://www.sco.com/developers/gabi/latest/contents.html . It
         | seems that Xinuos has moved on from System V based
         | OpenServer/UnixWare. The newer OpenServer seems to be based on
         | FreeBSD. They likely no longer care about the System V ABI.
         | 
         | Nowadays, people make proposals to the generic-abi Google
         | Group. Essentially, a proposal is considered "standardized" if
         | it receives approval from GNU, LLVM, and Ali Bahrami (Solaris
         | representative).
         | 
         | Many GNU extensions are implemented by LLVM and adopted by BSD
         | and newer ELF-based OSes. For extensions that are considered
         | generic enough, it's recommended to propose them through the
         | generic-abi Google Group. psABI documents generally prefer
         | generic extensions over GNU or LLVM-specific ones.
        
       ___________________________________________________________________
       (page generated 2024-05-27 23:00 UTC)