[HN Gopher] I thought I found a bug
       ___________________________________________________________________
        
       I thought I found a bug
        
       Author : MBCook
       Score  : 116 points
       Date   : 2024-12-25 19:55 UTC (1 day ago)
        
 (HTM) web link (www.os2museum.com)
 (TXT) w3m dump (www.os2museum.com)
        
       | userbinator wrote:
       | I was a bit disappointed that the article didn't go into the
       | system calls themselves, since AFAIK those have always supported
       | interleaved reads and writes with no problems even on early
       | Unices. E.g. POSIX has this:
       | 
       | https://pubs.opengroup.org/onlinepubs/9699919799/functions/w...
       | 
       |  _After a write() to a regular file has successfully returned:
       | 
       | Any successful read() from each byte position in the file that
       | was modified by that write shall return the data specified by the
       | write() for that position until such byte positions are again
       | modified._
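
       A minimal sketch of the guarantee quoted above, assuming a POSIX
       system. The function name `write_then_read` and the scratch path
       are hypothetical; only open(), write(), pwrite(), pread(), close()
       and unlink() are real calls. It overwrites two bytes and reads
       them back through the same descriptor with no flush or seek.

```c
#define _POSIX_C_SOURCE 200809L  /* for pread()/pwrite() under strict -std= modes */
#include <assert.h>              /* assert.h and string.h serve the caller-side check */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Write "ABCD", overwrite offsets 1..2 with "xy", then read all four bytes
 * back through the same descriptor. No flush or seek is needed in between.
 * Returns 0 and stores a NUL-terminated copy in `out`, or -1 on error. */
int write_then_read(const char *path, char *out, size_t outsz)
{
    int rc = -1;
    int fd;

    if (outsz < 5)
        return -1;
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    if (write(fd, "ABCD", 4) == 4 &&
        pwrite(fd, "xy", 2, 1) == 2 &&
        pread(fd, out, 4, 0) == 4) {
        out[4] = '\0';
        rc = 0;
    }

    close(fd);
    unlink(path);   /* the scratch file is only needed for the demo */
    return rc;
}
```

       Called as `write_then_read("rw_demo.tmp", buf, sizeof buf)`, this
       should leave "AxyD" in `buf`: the read sees the bytes from the
       most recent write at each position.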
        
         | PeterWhittaker wrote:
         | Perhaps because the article is specifically about the buffered
         | f*() calls in stdio, and not the system calls?
         | 
         | Though, as I offer that thought, the divergence between C and
         | the system calls is definitely curious.
        
         | mystified5016 wrote:
         | I get a real kick out of the different ways people pluralize
         | Unixen. Unices is a good one
        
       | anonymousiam wrote:
       | While reading this, I realized that I have a copy of the elusive
       | usr/group Standard that the author mentions. I just pulled it off
       | an image of my early DOS hard drive (before I migrated to Linux
       | in 1991). I should probably post it somewhere.
       | 
       |     # ls -altr
       |     total 540
       |     -rw-r--r-- 1 root root  22606 Apr 12 1990 NOTES
       |     -rw-r--r-- 1 root root 172645 Apr 12 1990 LIB
       |     -rw-r--r-- 1 root root 102349 Apr 12 1990 APP
       |     -rw-r--r-- 1 root root 224037 Apr 12 1990 C
        
         | codetrotter wrote:
         | > I should probably post it somewhere.
         | 
         | Upload it to the Internet Archive! :D
         | 
         | https://help.archive.org/help/uploading-a-basic-guide/
        
           | anonymousiam wrote:
           | Thanks. It looks like the document itself is still
           | copyrighted. The files can be uploaded, so I will.
        
         | xelxebar wrote:
         | > before I migrated to Linux in 1991
         | 
         | What a mic drop. That must have been a fun ride from then until
         | now! Would love to hear some of your battle stories.
        
           | jsjohnst wrote:
           | My first Linux machine was in 1993, what would you like to
           | know? Pre-1.0 kernels were an adventure, that's for sure.
        
             | icedchai wrote:
             | My first Linux box ran 0.99.10. I was running SLS, which
             | installed off of a dozen or so floppies. I eventually moved
             | to Slackware a year or so later.
        
               | xenophonf wrote:
               | I remember downloading and installing one of the MCC
               | Interim releases in 1993? 1994? before switching to
               | Slackware. Early *BSD and Linux were certainly an
               | adventure back then. I don't miss it.
        
               | anonymousiam wrote:
               | SLS was also my first distro. I also played with
               | Yggdrasil Linux (bootable CD) for a while, because at the
               | time, nobody could afford a hard drive with as much
               | capacity as a CD-ROM.
               | 
               | Those early Linux distros borrowed a lot from SunOS
               | (Solaris 1), so it was easy to adapt between work/home.
        
       | purplesyringa wrote:
       | I must be missing something.
       | 
       | The article lists three libcs (Open Watcom, Microsoft Visual C++
       | 6.0, IBM C/C++ 3.6 for Windows) from the good old times. Does the
       | emulator link to Open Watcom, i.e., does it emulate DOS on
       | machines about as old as DOS itself? What's the point here?
        
         | AntiRush wrote:
         | The article is about compiling and running a program inside the
         | emulator. When the unexpected behavior occurred, the author
         | assumed it was a bug in the emulator.
        
           | purplesyringa wrote:
           | So if it's not a bug in the emulator, then it's a bug in
           | COMMAND.COM? I don't think that's the case, surely it
           | couldn't have been missed by Microsoft at the time. The
           | article goes on to talk about fread/fwrite calls, but
           | COMMAND.COM was written in assembly, I'm pretty sure it
           | didn't link to any libc, and certainly not to Open Watcom --
           | why would MS use it instead of their own library?
        
             | grodriguez100 wrote:
             | It is not a bug. The article explains that this is the
             | expected behaviour.
        
               | purplesyringa wrote:
               | What is expected behavior? Surely `echo AB> foo.txt; echo
               | CD>> foo.txt` producing `ABBC` is either a bug in
               | COMMAND.COM, the emulator, or something else? That can't
               | be correct.
        
         | justin_ wrote:
         | I believe it is a bug in the emulator's implementation of
         | COMMAND.COM. Often, these DOS "emulators" re-implement the
         | standard commands of DOS, including the shell[1]. This is in
         | addition to emulating weird 16-bit environment stuff and the
         | BIOS.
         | 
         | The bug can pop up in any C program using stdio that assumes
         | it's fine to do `fread` followed immediately by `fwrite` on the
         | same update-mode stream. The C standard forbids this without an
         | intervening file-positioning call (unless the read hit EOF). To
         | make matters more confusing, this behavior does _not_ seem to be
         | in modern libc implementations.
         | Or at least, it works on my machine. I bet modern
         | implementations are able to be more sane about managing
         | different buffers for reading and writing.
         | 
         | The original COMMAND.COM from MS-DOS probably did not have this
         | problem, since at least in some versions it was written in
         | assembly[2]. Even for a shell written in C, the fix is pretty
         | easy: seek the file before switching between reading/writing.
         | 
         | The title of this post is confusing, since it clearly _is_ a
         | bug somewhere. But I think the author was excited about
         | possibly finding a bug in libc:
         | 
         | > Sitting down with a debugger, I could just see how the C run-
         | time library (Open Watcom) could be fixed to avoid this
         | problem.
         | 
         | [1] Here's DOSBox, for example:
         | https://github.com/dosbox-staging/dosbox-staging/blob/main/s...
         | 
         | [2] MS-DOS 4.0:
         | https://github.com/microsoft/MS-DOS/tree/main/v4.0/src/CMD/C...
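
       The fix described in this comment (seek before switching from
       reading to writing) can be sketched in portable C. This is an
       illustration only, not the shell's actual code; the function name
       `append_after_read` and the scratch path are made up for it.

```c
#include <assert.h>   /* for the caller-side check shown after the block */
#include <stdio.h>
#include <string.h>

/* Create a file holding "AB", reopen it in update mode, read the two bytes,
 * then write "CD" after them. The fseek() between fread() and fwrite() is
 * the required file-positioning call; it moves nowhere but resets the
 * stream's buffer state. Returns 0 and the final contents in `out`. */
int append_after_read(const char *path, char *out, size_t outsz)
{
    char head[2];
    size_t n;
    FILE *f;

    if (outsz == 0)
        return -1;

    f = fopen(path, "w");
    if (!f)
        return -1;
    if (fputs("AB", f) == EOF) { fclose(f); return -1; }
    if (fclose(f) != 0)
        return -1;

    f = fopen(path, "r+");
    if (!f)
        return -1;
    if (fread(head, 1, sizeof head, f) != sizeof head ||
        fseek(f, 0L, SEEK_CUR) != 0 ||          /* switch: read -> write */
        fwrite("CD", 1, 2, f) != 2) {
        fclose(f);
        return -1;
    }
    if (fclose(f) != 0)
        return -1;

    f = fopen(path, "r");
    if (!f)
        return -1;
    n = fread(out, 1, outsz - 1, f);
    out[n] = '\0';
    fclose(f);
    remove(path);   /* scratch file, only needed for the demo */
    return 0;
}
```

       With the seek in place, the result should be "ABCD" on any
       conforming libc. Dropping the fseek() line makes the behavior
       undefined, which is where an old single-buffer stdio could
       write stale bytes.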
        
           | rep_lodsb wrote:
           | The article is very vague about which emulator and
           | COMMAND.COM it is about, and if they're integrated with each
           | other. Can't be DOSBox, since it handles it correctly:
           | 
           |     C:\> echo AB> foo.txt
           |     C:\> echo CD>> foo.txt
           |     C:\> type foo.txt
           |     AB
           |     CD
           | 
           | (Note that echo adds a newline, same as on real DOS, or even
           | UNIX without "-n". This other shell doesn't for some reason.)
           | 
           | The "real" COMMAND.COM, and all other essential parts of
           | MS-/PC-/DR-DOS, have _always_ been written in asm, where none
           | of this libc nonsense matters.
           | 
           | Also it annoys me greatly when people talk about "_the_ C
           | Library" as if it exists in some Platonic realm, and is
           | essential to all software ever written.
        
         | stevage wrote:
         | There's a lot of weird missing details.
        
       | raldi wrote:
       | I'm having trouble following whether the problem occurs with any
       | append or only when it's two consecutive commands like this.
        
       | cryptonector wrote:
       | In the stdio implementations that don't support free intermixing
       | of reads and writes the issue typically is that there is only one
       | buffer for both reading and writing. You have to reset the buffer
       | in order to switch from reading to writing or vice-versa, else
       | you will have a dirty, non-empty buffer that does not correspond.
       | The functions `fflush()`, `fseek()`, `rewind()`, and `fsetpos()`
       | happen to clear the buffer, which is why you have to use them
       | before switching from reading to writing or vice-versa!
       | 
       | Without an indicator in `struct FILE` of whether the last
       | operation was a read or a write, the stdio implementation has no
       | way to detect the problem and correct the situation by
       | automatically flushing and resetting the buffer, say. An
       | alternative would be to have two buffers, naturally. But you can
       | see how a pre-update version could be trivially made to support
       | update modes without adding a second buffer or automatic buffer
       | flushing. And that's almost certainly what happened when update
       | mode was added. My guess is someone got bitten by that and then
       | the maintainer decided to just document the problem rather than
       | fix it, probably because by then fixing the problem was hard.
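
       To make the "indicator of the last operation" idea concrete, here
       is a toy model, not any real libc's `FILE`. Everything in it
       (`toy_stream`, `toy_getc`, `toy_putc`, `toy_sync`, the in-memory
       `disk` array standing in for the file) is hypothetical. One shared
       buffer is reset automatically whenever the direction changes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { TOY_BUF = 8, TOY_FILE = 64 };
enum toy_op { TOY_NONE, TOY_READ, TOY_WRITE };

struct toy_stream {
    unsigned char disk[TOY_FILE];  /* stands in for the file */
    size_t disk_len;               /* bytes currently in the "file" */
    size_t base;                   /* file offset that buf[0] maps to */
    unsigned char buf[TOY_BUF];    /* the single shared buffer */
    size_t len;   /* read mode: valid bytes; write mode: pending bytes */
    size_t pos;   /* read mode: next unread byte; unused in write mode */
    enum toy_op last;              /* the indicator a bare stdio lacks */
};

void toy_init(struct toy_stream *s, const char *contents)
{
    memset(s, 0, sizeof *s);
    s->disk_len = strlen(contents);
    assert(s->disk_len <= TOY_FILE);
    memcpy(s->disk, contents, s->disk_len);
}

/* Reset the buffer: write back pending output, or discard read-ahead. */
void toy_sync(struct toy_stream *s)
{
    if (s->last == TOY_WRITE) {
        memcpy(s->disk + s->base, s->buf, s->len);
        s->base += s->len;
        if (s->base > s->disk_len)
            s->disk_len = s->base;
    } else if (s->last == TOY_READ) {
        s->base += s->pos;         /* keep only what was consumed */
    }
    s->len = s->pos = 0;
    s->last = TOY_NONE;
}

int toy_getc(struct toy_stream *s)
{
    if (s->last == TOY_WRITE)
        toy_sync(s);               /* automatic write -> read switch */
    if (s->pos == s->len) {        /* buffer exhausted: refill */
        size_t n;
        s->base += s->len;
        s->len = s->pos = 0;
        n = s->disk_len - s->base;
        if (n > TOY_BUF)
            n = TOY_BUF;
        if (n == 0)
            return -1;             /* end of file */
        memcpy(s->buf, s->disk + s->base, n);
        s->len = n;
    }
    s->last = TOY_READ;
    return s->buf[s->pos++];
}

int toy_putc(struct toy_stream *s, int c)
{
    if (s->last == TOY_READ)
        toy_sync(s);               /* automatic read -> write switch */
    if (s->len == TOY_BUF)
        toy_sync(s);               /* buffer full: write it back */
    if (s->base + s->len >= TOY_FILE)
        return -1;                 /* toy "disk" is full */
    s->buf[s->len++] = (unsigned char)c;
    s->last = TOY_WRITE;
    return (unsigned char)c;
}

/* Read one byte, write one, read one more, with no explicit seek. */
int toy_demo(char *out, size_t outsz)
{
    struct toy_stream s;
    int a, c;

    toy_init(&s, "ABCDEF");
    a = toy_getc(&s);
    toy_putc(&s, 'x');
    c = toy_getc(&s);
    toy_sync(&s);
    if (outsz <= s.disk_len)
        return -1;
    memcpy(out, s.disk, s.disk_len);
    out[s.disk_len] = '\0';
    return (a == 'A' && c == 'C') ? 0 : -1;
}
```

       In `toy_demo` the first read pulls "ABCDEF" into the buffer as
       read-ahead. Without the `last` check, the following write would
       land in a buffer that still describes offsets 0..5. With it, the
       file ends up as "AxCDEF". The cost is one extra comparison on
       every call.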
        
         | fweimer wrote:
         | Historically, before mandatory locking, getc and putc have been
         | implemented as macros, and an extra check for stream state
         | likely mattered from a performance perspective.
         | 
         | To avoid the extra check, you don't actually need two buffers,
         | just separate buffer pointers for reading and writing. (This is
         | probably how most libcs implement this today.) I suppose memory
         | was really scarce back then.
        
           | cryptonector wrote:
           | Separate non-overlapping pointers into one buffer is not that
           | different from two buffers, notionally, but yeah.
        
             | fweimer wrote:
             | The idea is that for the non-active mode, the current/end
             | pointers are equal, signifying that the buffer is
             | exhausted. This forces entering the slow path, where the
             | mode can be switched.
             | 
             | I don't think an implementation with two active, non-empty
             | buffers is all that useful because you can't tell which
             | buffer's progress should be used for the file pointer
             | adjustment in ftell.
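
       A rough sketch of the pointer scheme described in this subthread,
       under the same toy assumptions as any in-memory model: `pstream`,
       `P_GETC`, `P_PUTC`, `p_underflow`, `p_overflow` and `p_tell` are
       invented names, and `disk` stands in for the file. The inactive
       direction keeps its current and end pointers equal, so the
       macro's single comparison fails and the slow path switches modes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { P_BUF = 8, P_FILE = 64 };

struct pstream {
    unsigned char disk[P_FILE];  /* stands in for the file */
    size_t disk_len;             /* bytes currently in the "file" */
    size_t base;                 /* file offset that buf[0] maps to */
    unsigned char buf[P_BUF];    /* one buffer, two pointer pairs */
    unsigned char *rcur, *rend;  /* read window; rcur == rend when inactive */
    unsigned char *wcur, *wend;  /* write window; wcur == wend when inactive */
};

int p_underflow(struct pstream *s);
int p_overflow(struct pstream *s, int c);

/* Fast paths: one pointer comparison, no mode check. `s` is evaluated
 * more than once, so pass an expression without side effects. */
#define P_GETC(s)    ((s)->rcur < (s)->rend ? *(s)->rcur++ : p_underflow(s))
#define P_PUTC(s, c) ((s)->wcur < (s)->wend \
                      ? (*(s)->wcur++ = (unsigned char)(c)) \
                      : p_overflow((s), (c)))

void p_init(struct pstream *s, const char *contents)
{
    memset(s, 0, sizeof *s);
    s->disk_len = strlen(contents);
    assert(s->disk_len <= P_FILE);
    memcpy(s->disk, contents, s->disk_len);
    s->rcur = s->rend = s->wcur = s->wend = s->buf;
}

/* Empty both windows, carrying their progress into `base`. */
static void p_reset(struct pstream *s)
{
    size_t pending = (size_t)(s->wcur - s->buf);     /* 0 unless writing */
    memcpy(s->disk + s->base, s->buf, pending);
    s->base += pending + (size_t)(s->rcur - s->buf); /* one term is always 0 */
    if (s->base > s->disk_len)
        s->disk_len = s->base;
    s->rcur = s->rend = s->wcur = s->wend = s->buf;
}

/* Slow path for reads: switch to read mode and refill. */
int p_underflow(struct pstream *s)
{
    size_t n;

    p_reset(s);
    n = s->disk_len - s->base;
    if (n > P_BUF)
        n = P_BUF;
    if (n == 0)
        return -1;               /* end of file */
    memcpy(s->buf, s->disk + s->base, n);
    s->rend = s->buf + n;
    return *s->rcur++;
}

/* Slow path for writes: switch to write mode (or flush a full buffer). */
int p_overflow(struct pstream *s, int c)
{
    size_t room;

    p_reset(s);
    room = P_FILE - s->base;
    if (room > P_BUF)
        room = P_BUF;
    if (room == 0)
        return -1;               /* toy "disk" is full */
    s->wend = s->buf + room;
    return *s->wcur++ = (unsigned char)c;
}

/* Only one window is ever non-empty, so the position is unambiguous. */
size_t p_tell(const struct pstream *s)
{
    return s->base + (size_t)(s->rcur - s->buf) + (size_t)(s->wcur - s->buf);
}

/* Read one byte, write one, read one more, with no explicit seek. */
int p_demo(char *out, size_t outsz)
{
    struct pstream s;
    int a, c;
    size_t where;

    p_init(&s, "ABCDEF");
    a = P_GETC(&s);
    (void)P_PUTC(&s, 'x');
    c = P_GETC(&s);
    where = p_tell(&s);
    p_reset(&s);
    if (outsz <= s.disk_len)
        return -1;
    memcpy(out, s.disk, s.disk_len);
    out[s.disk_len] = '\0';
    return (a == 'A' && c == 'C' && where == 3) ? 0 : -1;
}
```

       The design choice here is that the mode check lives entirely in
       the slow path, so the macros stay as cheap as in a stdio with no
       update-mode support. `p_tell` also shows the ftell point: with at
       most one active window there is never a question of whose
       progress counts.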
        
               | cryptonector wrote:
               | I get that. One buffer that can be maximized by the path
               | that most needs it (read or write). I'm just saying that
               | notionally it's two independent buffers, which solves the
               | problem of not having to force a buffer flush between
               | mode change.
               | 
               | > I don't think an implementation with two active, non-
               | empty buffers is all that useful because you can't tell
               | which buffer's progress should be used for the file
               | pointer adjustment in ftell.
               | 
               | Oh interesting. The other problem is that two buffers
               | reduces memory utilization.
        
       ___________________________________________________________________
       (page generated 2024-12-26 23:01 UTC)