[HN Gopher] Linux and Glibc API Changes
       ___________________________________________________________________
        
       Linux and Glibc API Changes
        
       Author : arunc
       Score  : 52 points
       Date   : 2021-05-05 17:34 UTC (5 hours ago)
        
 (HTM) web link (man7.org)
 (TXT) w3m dump (man7.org)
        
       | itamarst wrote:
       | "The Linux Programming Interface" book referenced in the intro is
       | an amazingly useful book if you do Linux (or POSIX) system
       | programming: a massive amount of detailed and clearly explained
       | information.
       | 
       | https://man7.org/tlpi/
       | 
       | I do hope there's a new edition eventually, but as he says the
       | core is still the same.
        
       | froh wrote:
       | Are there any interface removals or backwards-incompatible glibc
       | and syscall changes? To my understanding, the kernel and glibc have
       | managed to stay binary compatible since, IIRC, the switch to POSIX
       | threads?
       | 
       | If so, that would be glibc and kernel ABI additions only?
        
         | rwmj wrote:
         | glibc APIs are occasionally deprecated, usually because they
         | are broken or dangerous. A well-known example is gets[1] which
         | is impossible to use without introducing a buffer overflow.
         | Less well-known ones include readdir_r, sigpause,
         | register_printf_function.
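
For the gets case, the usual replacement is fgets, which is told how large the buffer is. A minimal sketch follows; the helper name `read_line` is made up for illustration and is not a glibc function.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read one line from `in` into buf (capacity cap)
 * and drop the trailing newline. Returns buf, or NULL on EOF or error.
 * Unlike gets(), fgets() never stores more than cap bytes, including
 * the terminating NUL. */
char *read_line(FILE *in, char *buf, size_t cap)
{
    assert(in != NULL && buf != NULL);
    if (cap == 0 || cap > INT_MAX)
        return NULL;
    if (fgets(buf, (int)cap, in) == NULL)
        return NULL;
    buf[strcspn(buf, "\n")] = '\0';   /* strip the newline if one was read */
    return buf;
}
```

An over-long line is split across successive calls rather than written past the end of the buffer.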
         | 
         | APIs are occasionally broken too, although that would be a bug.
         | 
         | At least one API is known buggy and hasn't been fixed:
         | fts_open. It only works properly if used on 32 bit
         | architectures. [Edit: I just noticed that glibc got around to
         | fixing this - yay!]
         | 
         | glibc+Linux is not entirely conforming to POSIX - threads being
         | a good example where it differs in some significant respects
         | from what POSIX requires.
         | 
         | Sometimes POSIX itself isn't well defined. Until very recently
         | it was not properly specified if a file descriptor is closed if
         | close(2) returns an error. Some POSIX systems closed it, some
         | didn't, some closed it on some errors but not on others; and
         | most programs wouldn't retry the close on error so would leak
         | the fd. It has since been changed so that the fd is always
         | closed even in the error case.
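
A minimal sketch of the "don't retry" pattern, assuming Linux behaviour, where the descriptor is released even when close() reports an error. The helper name `close_once` is hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: close fd exactly once.
 * Returns 0 on success, or the errno value from close() on failure.
 * Assumption: Linux semantics, where the descriptor is gone even if
 * close() fails, so a retry could close an unrelated descriptor that
 * another thread has opened in the meantime. */
int close_once(int fd)
{
    assert(fd >= 0);
    if (close(fd) == -1)
        return errno;   /* report the error (e.g. EINTR, EIO), do not retry */
    return 0;
}
```

The caller can log or propagate the returned error, but should treat the descriptor as closed either way.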
         | 
         | [1] https://www.man7.org/linux/man-pages/man3/gets.3.html
        
           | jcranmer wrote:
           | Fun fact: the _linker_ complains at you when you try to use
           | gets. It makes it rather annoying when you're trying to use
           | it in a testsuite making sure your tool can handle some idiot
           | using gets correctly...
        
           | pjmlp wrote:
           | Actually gets() was deprecated in C99 (TC3) and removed in ISO C11.
        
             | einpoklum wrote:
             | ... but this does not apply retroactively; nor are you
             | required to use C11 to interface with libc. So maybe this
             | will have an impact, oh, 20-30 years from now? :-(
        
           | cesarb wrote:
           | > A well-known example is gets[1] which is impossible to use
           | without introducing a buffer overflow.
           | 
           | Actually, it _is_ possible to use gets() safely, under a
           | sufficiently contrived set of circumstances. Since it reads
           | from stdin, you just have to make sure that stdin reads from
           | a pipe which has its write end under the control of the same
           | process, and be careful to only ever write a limited amount
           | of data to that pipe, smaller than the buffer of any gets()
           | call in the process.
        
           | zlynx wrote:
           | I like to nitpick and point out gets() can be used safely, as
           | a stunt.
           | 
           | Memory map a read/write page and after that memory map a no-
           | permissions guard page. Now you can safely use gets() to read
           | a page size string without allowing a buffer overflow.
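
The mapping described above could be set up roughly like this. This is only a sketch: `map_guarded_page` is a hypothetical name, and the gets() call itself is left out since modern headers no longer declare it.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map one read/write page followed directly by one PROT_NONE guard page.
 * Returns the start of the writable page, or NULL on failure.
 * A sequential write that runs off the first page faults on the guard
 * page instead of reaching other memory. */
void *map_guarded_page(size_t *page_size_out)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);
    size_t page = (size_t)ps;

    /* Reserve both pages in a single mapping so they are contiguous. */
    char *base = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Revoke all access to the second page. */
    if (mprotect(base + page, page, PROT_NONE) != 0) {
        munmap(base, 2 * page);
        return NULL;
    }
    if (page_size_out != NULL)
        *page_size_out = page;
    return base;
}
```

The returned pointer would be the buffer handed to gets(); a line that does not fit in the page, terminating NUL included, ends in a fault rather than a silent overflow.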
        
             | gpm wrote:
             | Or if you own both sides of stdin...
        
             | kevincox wrote:
             | Does gets() guarantee that it will write its output in
             | order? If not it could in theory write after your guard
             | page before touching the guard page itself. Of course I
             | don't know if either the kernel or glibc would ever do
             | this.
             | 
             | I think the only safe way to use gets() is with trusted
             | input.
        
               | Arnavion wrote:
               | Why does the order matter? It'll only write to the guard
               | page if the input string is long enough to necessitate
               | it, in which case it was going to fault anyway regardless
               | of which page it touched first.
               | 
               | Edit: I guess you're considering "used safely" to include
               | reading a truncated string, in which case writing in
               | order would allow the program to be written such that it
               | recovers from the fault and reads the valid page-worth of
               | string.
        
               | jmgao wrote:
               | You could hypothetically have a situation where libc has
               | an arbitrarily large internal FILE* buffer (instead of
               | reading a block, looking for a newline, and copying
               | everything over immediately), and then copies in reverse,
               | corrupting data after the guard page before it hits the
               | guard page.
               | 
               | If there are other threads accessing data that happens to
               | be placed after the guard page, bad things could happen,
               | but this seems rather unlikely to be a real problem.
        
               | Arnavion wrote:
               | Ah, right. >2 pages-worth of input would break the
               | scheme.
        
               | tsimionescu wrote:
               | If it can write past the guard page, even if your program
               | faults afterwards, it could have already compromised the
               | larger system. Not claiming that it can, just
               | entertaining the what-if.
        
               | kelnos wrote:
               | I think the parent meant that if your string was longer
               | than the size of the read/write page _plus_ the size of
               | the guard page, and if gets() is allowed to, say, write
               | the string from the end to the beginning (I think this is
               | unlikely, but let's say it could), then it would try to
               | write the last character (first) all the way out beyond
               | your guard page, possibly scribbling on some memory that
               | the application had allocated for something else.
        
             | Denvercoder9 wrote:
             | If we're nitpicking, doesn't this technically still allow a
             | buffer overflow, and just negate the consequences of it?
        
         | arunc wrote:
         | glibc breaks ABI quite often. Linus has openly roasted it in
         | the past: https://www.youtube.com/watch?v=Pzl1B7nB9Kc
         | 
         | Notable quote from that: If there's a bug that people rely on,
         | it's not a bug, it's a feature.
        
           | wahern wrote:
           | Linux famously removed the sysctl syscall (the original, BSD-
           | derived syscall version of /proc). It was justified because
           | distros had already removed it. The removal was a _huge_ API
           | breakage and even broke security sensitive software, like
           | Tor, for _countless_ deployed systems. But because the
           | distros removed it first (RedHat, specifically), Linus got to
           | claim that "nobody was using it" and was shielded from the
           | fallout.
           | 
           | Otherwise, both the kernel and glibc regularly break things
           | _accidentally_. You rarely hear about it, though, because it's
           | the nature of software development that the areas most likely
           | to be broken are those where people rarely lurk. glibc makes
           | at least as much effort as Linux in terms of supporting
           | backward compatibility, but glibc's job is in some ways much
           | more difficult, and they have far fewer contributors to help
           | out. There's no shortage of bugs in glibc, and I have plenty
           | of my own gripes, but by the standards of the industry
           | (particularly of FOSS), they do an outstanding job of
           | maintaining ABI compatibility.
           | 
           | Once upon a time people would claim that glibc's efforts were
           | feeble as compared to proprietary OSs like Solaris, AIX, or
           | Windows. But these days those backward compat stories are far
           | more complex and less pristine, and glibc has well over a
           | decade (or two decades?) of using ELF symbol versioning to
           | maintain compat.
        
             | Denvercoder9 wrote:
             | _> The removal was a huge API breakage and even broke
             | security sensitive software, like Tor, for countless
             | deployed systems._
             | 
             | Honestly, I'd say that is on them. Its use has been discouraged
             | since basically forever (it has been noted in all-caps in the
             | manpage since at least 2001), the kernel
             | started complaining about its usage since Linux 2.6.24
             | which was released in January 2008, and it finally
             | disappeared in Linux 5.5, released in January 2020. That's
             | a two-decade deprecation period.
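
For reference, the sysctl(2) man page points callers at the /proc/sys files instead. A minimal sketch of reading one value that way; the helper name `read_proc_sysctl` is made up, and "kernel/ostype" is just a convenient example entry.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: read a sysctl value through /proc/sys rather than
 * the removed sysctl(2) system call. `name` is a slash-separated path
 * such as "kernel/ostype". Returns 0 on success, -1 on failure. */
int read_proc_sysctl(const char *name, char *buf, size_t cap)
{
    char path[256];
    assert(name != NULL && buf != NULL);
    if (cap == 0 || cap > INT_MAX)
        return -1;

    int n = snprintf(path, sizeof path, "/proc/sys/%s", name);
    if (n < 0 || (size_t)n >= sizeof path)
        return -1;                        /* name too long for path[] */

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    char *line = fgets(buf, (int)cap, f);
    fclose(f);
    if (line == NULL)
        return -1;
    buf[strcspn(buf, "\n")] = '\0';       /* values end with a newline */
    return 0;
}
```

On a typical Linux system with /proc mounted, `read_proc_sysctl("kernel/ostype", buf, sizeof buf)` leaves the string Linux in buf.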
        
         | mhitza wrote:
         | > are there any interface removals or backwards incompatible
         | glibc
         | 
         | Yes https://abi-laboratory.pro/?view=timeline&l=glibc
        
         | JoshTriplett wrote:
         | The Linux kernel very occasionally removes interfaces, but
         | generally only via "this turns out to have been broken for
         | years and nobody has been using it".
         | 
         | glibc does deprecate and remove things, but it uses very
         | careful symbol versioning, such that code compiled against
         | previous versions of glibc continues to run but _new_ code
         | can't use those interfaces. It's a rare example of being ABI-
         | compatible but not API-compatible.
        
         | matheusmoreira wrote:
         | The Linux kernel system call interface is considered stable. I
         | checked the changelog and didn't see any removals. Don't know
         | about glibc.
        
       ___________________________________________________________________
       (page generated 2021-05-05 23:01 UTC)