[HN Gopher] I built a 2x faster lexer, then discovered I/O was t...
       ___________________________________________________________________
        
       I built a 2x faster lexer, then discovered I/O was the real
       bottleneck
        
       Author : modulovalue
       Score  : 173 points
       Date   : 2026-01-20 16:10 UTC (5 days ago)
        
 (HTM) web link (modulovalue.com)
 (TXT) w3m dump (modulovalue.com)
        
       | marginalia_nu wrote:
        | Zip with no compression is a nice contender for a container
        | format that shouldn't be slept on. It effectively reduces the
        | I/O while, unlike TAR, allowing direct random access to the
        | files without "extracting" them or seeking through the entire
        | file; this is possible even via mmap, over HTTP range queries,
        | etc.
       | 
        | You can still get the compression benefits by serving files with
        | Content-Encoding: gzip or whatever. Though ZIP has built-in
        | compression, you can simply not use it and rely on external
        | compression instead, especially over the wire.
       | 
       | It's pretty widely used, though often dressed up as something
       | else. JAR files or APK files or whatever.
       | 
        | I think the article's complaint about lacking unix access rights
        | and metadata is a bit strange. That seems like a feature more
        | than a bug, as I wouldn't expect this to be something that
        | transfers between machines. I don't want to unpack an archive and
        | have to scrutinize it for files with o+rxst permissions, or have
        | their creation date be anything other than when I unpacked them.
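        | 
        | As a minimal sketch of that random-access property (Python,
        | hypothetical file names): with ZIP_STORED the member bytes sit
        | verbatim in the archive at a known offset, so you can read, or
        | even mmap, just the one you want without touching the rest.
        | 
        |   import mmap, struct, zipfile
        | 
        |   # write an archive with no compression
        |   with zipfile.ZipFile("bundle.zip", "w",
        |                        compression=zipfile.ZIP_STORED) as zf:
        |       zf.write("a.txt")
        |       zf.write("b.txt")
        | 
        |   with open("bundle.zip", "rb") as f:
        |       zf = zipfile.ZipFile(f)
        |       info = zf.getinfo("b.txt")   # from central directory
        |       mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        |       ho = info.header_offset
        |       # skip local header: 30 bytes + name + extra
        |       nlen, xlen = struct.unpack("<HH", mm[ho + 26:ho + 30])
        |       start = ho + 30 + nlen + xlen
        |       data = mm[start:start + info.file_size]  # raw bytes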
        
         | stabbles wrote:
         | > Zip with no compression is a nice contender for a container
         | format that shouldn't be slept on
         | 
         | SquashFS with zstd compression is used by various container
         | runtimes, and is popular in HPC where filesystems often have
         | high latency. It can be mounted natively or with FUSE, and the
         | decompression overhead is not really felt.
        
           | ciupicri wrote:
           | Wouldn't you still have a lot of syscalls?
        
             | stabbles wrote:
             | Yes, but with much lower latency. The squashfs file ensures
             | the files are close together and you benefit from fs cache
             | a lot.
        
             | LtdJorge wrote:
             | You then use io_uring
        
           | __turbobrew__ wrote:
           | Just make sure you mount the squashfs with --direct-io or
           | else you will be double caching (caching the sqfs pages, and
           | caching the uncompressed files within the sqfs). I have no
           | idea why this isn't the default. Found this out the hard way.
        
         | 1718627440 wrote:
         | Isn't this what is already common in the Python community?
         | 
         | > I don't want to unpack an archive and have to scrutinize it
         | for files with o+rxst permissions, or have their creation date
         | be anything other than when I unpacked them.
         | 
          | I'm the opposite: when I pack and unpack something, I want the
          | files to be identical, including attributes. Why should I throw
          | away all the timestamps just because the files were temporarily
          | in an archive?
        
           | rustyhancock wrote:
           | Yes, it's a lossy process.
           | 
           | If your archive drops it you can't get it back.
           | 
           | If you don't want it you can just chmod -R u=rw,go=r,a-x
        
             | 1718627440 wrote:
             | > If your archive drops it you can't get it back.
             | 
             | Hence, the common archive format is tar not zip.
        
           | password4321 wrote:
            | > _Why should I throw away all the timestamps just because
            | the files were temporarily in an archive?_
            | 
            | In case anyone is unaware, you don't have to throw away all
            | the timestamps when using "zip with no compression". The
            | metadata for each zipped file includes one timestamp
            | (originally rounded to an even number of seconds, in local
            | time).
           | 
           | I am a big last modified timestamp fan and am often
           | discouraged that scp, git, and even many zip utilities are
           | not (at least by default).
        
             | rcxdude wrote:
              | git updates timestamps in part out of necessity, for
              | compatibility with build systems. If it applied the
             | timestamp of when the file was last modified on checkout
             | then most build systems would break if you checked out an
             | older commit.
        
               | ralferoo wrote:
               | git blame is more useful than the file timestamp in any
               | case.
        
           | nh2 wrote:
           | There is some confusion here.
           | 
            | ZIP retains timestamps. This makes sense because timestamps
            | are a global concept. Consider them, in ZIP, an attribute
            | that depends only on the file itself, similar to the file's
            | name.
           | 
           | Owners and permissions are dependent also on the computer the
           | files are stored on. User "john" might have a different user
           | ID on another computer, or not exist there at all, or be a
           | different John. So there isn't one obvious way to handle
           | this, while there is with timestamps. Archiving tools will
           | have to pick a particular way of handling it, so you need to
           | pick the tool that implements the specific way you want.
        
             | pwg wrote:
             | > ZIP retains timestamps.
             | 
             | It does, but unless the 'zip' archive creator being used
             | makes use of the extensions for high resolution timestamps,
             | the basic ZIP format retains only old MSDOS style
              | timestamps (rounded to two-second resolution). So one may
              | lose some precision in one's timestamps when passing files
              | through a zip archive.
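              | 
              | A quick illustration of that precision loss with Python's
              | zipfile (the file names here are made up):
              | 
              |   import zipfile
              |   dt = (2026, 1, 20, 16, 10, 3)      # odd second
              |   zi = zipfile.ZipInfo("a.txt", dt)
              |   with zipfile.ZipFile("t.zip", "w") as zf:
              |       zf.writestr(zi, b"hi")
              |   with zipfile.ZipFile("t.zip") as zf:
              |       print(zf.getinfo("a.txt").date_time)
              |   # (2026, 1, 20, 16, 10, 2) - the DOS timestamp
              |   # encoding forces seconds to a multiple of two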
        
           | zahlman wrote:
           | > Isn't this what is already common in the Python community?
           | 
           | I'm not aware of standards language mandating it, but build
           | tools generally do compress wheels and sdists.
           | 
           | If you're thinking of zipapps, those are not actually common.
        
             | 1718627440 wrote:
             | I was talking about using zipfile as a generic file format,
             | instead of open and close.
        
               | zahlman wrote:
               | I'm afraid I don't understand specifically what you're
               | referring to. Maybe you could show some code citations of
               | popular projects doing it?
        
         | LtdJorge wrote:
         | Doesn't ZIP have all the metadata at the end of the file,
         | requiring some seeking still?
        
           | conradludgate wrote:
           | Yes, but it's an O(1) random access seek rather than O(n)
           | scanning seek
        
           | marginalia_nu wrote:
           | It has an index at the end of the file, yeah, but once you've
           | read that bit, you learn where the contents are located and
           | if compression is disabled, you can e.g. memory map them.
           | 
           | With tar you need to scan the entire file start-to-finish
           | before you know where the data is located, as it's literally
           | a tape archiving format, designed for a storage medium with
           | no random access reads.
        
         | 01HNNWZ0MV43FF wrote:
         | I thought Tar had an extension to add an index, but I can't
         | find it in the Wikipedia article. Maybe I dreamt it.
        
           | st_goliath wrote:
            | You might be thinking of _ar_, the classic Unix ARchive that
            | is used for static libraries?
           | 
            | The format used by `ar` is quite simple, somewhat like tar,
            | with files glued together, a short header in between, and no
            | index.
           | 
            | Early Unix eventually introduced a program called `ranlib`
            | that generates and appends an index for libraries (also
            | containing extracted symbols) to speed up linking. The index
            | is simply embedded as a file with a special name.
           | 
           | The GNU version of `ar` as well as some later Unix
           | descendants support doing that directly instead.
        
           | cb321 wrote:
            | Besides `ar`, as a sibling observed, you might also be
           | thinking of pixz - https://github.com/vasi/pixz , but really
           | any archive format (cpio, etc.) can, in principle, just put a
           | stake in the ground to have its last file be any kind of
           | binary / whatever index file directory like Zip. Or it could
           | hog a special name like .__META_INF__ instead.
        
         | zahlman wrote:
         | > It's pretty widely used, though often dressed up as something
         | else. JAR files or APK files or whatever.
         | 
          | JAR files generally do/did use compression, though. I imagine
          | you _could_ forgo it, but I didn't see it being done. (But
         | maybe that was specific to the J2ME world where it was more
         | necessary?)
        
           | charcircuit wrote:
           | Specifically the benefit is for the native libraries within
           | the file as you can map the library directly to memory
           | instead of having to make a decompressed copy and then
           | mapping that copy to memory.
        
             | zahlman wrote:
             | Yes, that's clear. I'm just not aware of people actually
             | _doing_ that, or having done it back in the era when Java
             | was more dominant.
        
               | charcircuit wrote:
                | The bigger issue is that glibc doesn't support loading
                | libraries from zip archives, whereas bionic's linker does.
               | So on platforms where glibc is used you wouldn't see it
               | being done.
        
               | zahlman wrote:
               | Again, I was talking about Java (not C). Good to know,
               | though.
        
         | paulddraper wrote:
         | > I wouldn't expect this to be something that transfers between
         | machines
         | 
         | Maybe non-UNIX machines I suppose.
         | 
         | But I 100% need executable files to be executable.
        
           | marginalia_nu wrote:
           | Do you also want the setuid bit I added?
        
           | Joker_vD wrote:
           | Honestly, sometimes I just want to mark all files on a Linux
           | system as executable and see what would even break and why.
            | Seriously, why is there a whole bit for something that's
            | essentially a 'read permission, but you can also directly
            | execute it from the shell'?
        
             | paulddraper wrote:
             | It's a security thing, in conjunction with sudoers, I
             | think.
        
           | dwattttt wrote:
            | This seems like something that shouldn't be the container
            | format's responsibility. You can record arbitrary metadata and
           | put it in a file in the container, so it's trivial to layer
           | on top.
           | 
           | On the other hand, tie the container structure to your OS
           | metadata structure, and your (hopefully good) container
           | format is now stuck with portability issues between other
           | OSes that don't have the same metadata layout, as well as
           | your own OS in the past & future.
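            | 
            | For instance (just a sketch, not any standard scheme), the
            | archive can carry its own manifest with whatever OS metadata
            | you care about, which unpackers are free to apply or ignore:
            | 
            |   import json, os, zipfile
            | 
            |   files = ["bin/run.sh", "data/a.txt"]   # hypothetical
            |   meta = {f: {"mode": oct(os.stat(f).st_mode),
            |               "mtime": os.stat(f).st_mtime}
            |           for f in files}
            |   with zipfile.ZipFile("out.zip", "w") as zf:
            |       for f in files:
            |           zf.write(f)
            |       zf.writestr("META/attrs.json", json.dumps(meta))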
        
         | hinkley wrote:
         | Gzip will make most line protocols efficient enough that you
         | can do away with needing to write a cryptic one that will just
         | end up being friction every time someone has to triage a
         | production issue. Zstd will do even better.
         | 
          | The real one-two punch is to make your parser faster and then
          | spend the CPU cycles on better compression.
        
           | cb321 wrote:
           | DNA researchers developed a parallel format for gzip they
           | call "bgzip" ( https://learngenomics.dev/docs/genomic-file-
           | formats/compress... ) that makes data seem less trapped
           | behind a decompression perf wall. Zstd is still a bit faster
           | (but < ~2X) and also gets better compression ratios
           | (https://forum.nim-lang.org/t/5103#32269)
        
         | smallstepforman wrote:
          | This is how Haiku packages are managed: from the outside it's
          | a single zstd file; internally, all dependencies and files are
          | included in a read-only file. It reduces IO, reduces file
          | clutter, gives instant install/uninstall, leaves zero chance
          | for the user to corrupt files or dependencies, and makes it
          | easy to switch between versions. The Haiku file system also
          | supports virtual dir mapping, so the stubborn Linux port
          | thinks it's talking to /usr/local/lib, but in reality it's
          | part of the zstd file in /system/packages.
        
         | 0xbadcafebee wrote:
         | Strangely enough, there is a tool out there that gives Zip-like
         | functionality while preserving Tar metadata functionality, that
         | nobody uses. It even has extra archiving functions like binary
         | deltas. dar (Disk ARchive) http://dar.linux.free.fr/
        
           | spwa4 wrote:
           | You mean ZIP?
           | 
           | Zip has 2 tricks: First, compression is per-file, allowing
           | extraction of single files without decompressing anything
           | else.
           | 
           | Second, the "directory" is at the end, not the beginning, and
           | ends in the offset of the beginning of the directory. Meaning
           | 2 disk seeks (matters even on SSDs) and you can show the user
           | all files.
           | 
            | Then you know exactly which bytes belong to which file, and
            | everything's fast. Also, you can easily chop the directory
            | off the zip file, allowing new files to be appended without
            | modifying the rest of the file; this can be extended to allow
            | arbitrary modification of the contents, although you may need
            | to "defragment" the file.
            | 
            | And I believe encryption is also per-file, meaning that to
            | decrypt a file you need _both_ the password and the directory
            | entry. So if you delete a file and rewrite just the
            | directory, the data becomes unrecoverable, without needing a
            | total rewrite of the bytes.
        
             | hedora wrote:
             | I think Zip's main trick is that it's been preloaded on
             | everything forever.
        
         | thaumasiotes wrote:
          | > It effectively reduces the I/O while, unlike TAR, allowing
          | direct random access to the files without "extracting" them or
          | seeking through the entire file
         | 
         | How do you access a particular file without seeking through the
         | entire file? You can't know where anything is without first
         | seeking through the whole file.
        
           | charcircuit wrote:
           | You look at the end of the file which tells you where the
           | central directory is. The directory tells you where
           | individual files are.
        
           | nikanj wrote:
           | At the end of the ZIP file, there's a central directory of
           | all files contained in that archive. Read the last block,
           | seek to the block containing the file you want to access,
           | done
        
         | account42 wrote:
          | One problem with the zip format is that metadata is stored both
          | in the central directory and also before each file's data -
          | that creates ambiguity when the metadata differs, which
          | different programs/libraries don't handle consistently.
        
       | akaltar wrote:
       | Amazing article, thanks for sharing. I really appreciate the deep
       | investigations in response to the comments
        
       | stabbles wrote:
       | "I/O is the bottleneck" is only true in the loose sense that
       | "reading files" is slow.
       | 
       | Strictly speaking, the bottleneck was latency, not bandwidth.
        
       | raggi wrote:
       | there are a loooot of languages/compilers for which the most
       | wall-time expensive operation in compilation or loading is
       | stat(2) searching for files
        
         | ghthor wrote:
          | I actually ran into this issue building dependency graphs of a
          | golang monorepo. We analyzed the cpu trace and found that the
          | program was doing a lot of GC, so we reduced allocations. That
          | was just noise, though: the runtime was simply making use of
          | the time spent waiting for I/O, because we had shelled out to
          | go list to get a json dep graph from the CLI program, and that
          | turns out to be slow due to stat calls and reading from disk.
          | We replaced our usage of go list with a custom package import
          | graph parser built on the std lib parser packages, and instead
          | of reading from disk we give the parser byte blobs from git,
          | also using git ls-files to "stat" the files. I don't remember
          | the specifics, but I believe we brought the time to build the
          | dep graph from 30-45s down to 500ms.
        
       | nudpiedo wrote:
       | Same thing applies to other system aspects:
       | 
        | compressing the kernel loads it into RAM faster even though it
        | still has to execute the decompression step. Why?
        | 
        | Loading from disk to RAM is a bigger bottleneck than the CPU
        | decompressing.
        | 
        | The same applies to algorithms: always find the largest
        | bottleneck in your chain of dependent executions and apply
        | changes there, as the rest of the pipeline waits for it. Often
        | picking the right algorithm "solves it", but it may be something
        | else, like waiting for IO or coordinating across actors (mutexes,
        | if concurrency is done the way it used to be).
       | 
       | That's also part of the counterintuitive take that more
       | concurrency brings more overhead and not necessarily faster
        | execution speeds (a topic largely discussed a few years ago with
        | async concurrency and immutable structures).
        
         | direwolf20 wrote:
         | Networks too. Compressing the response with gzip is usually
         | faster than sending it uncompressed through the network. This
         | wasn't always the case.
        
       | solatic wrote:
        | Headline is wrong. I/O wasn't the bottleneck; syscalls were the
        | bottleneck.
       | 
       | Stupid question: why can't we get a syscall to load an entire
       | directory into an array of file descriptors (minus an array of
       | paths to ignore), instead of calling open() on every individual
       | file in that directory? Seems like the simplest solution, no?
        
         | cb321 wrote:
         | One aspect of the question is that "permissions" are mostly
         | regulated at the time of open and user-code should check for
         | failures. This was a driving inspiration for the tiny 27 lines
         | of C virtual machine in https://github.com/c-blake/batch that
         | allows you to, e.g., synthesize a single call that mmaps a
         | whole file
         | https://github.com/c-blake/batch/blob/64a35b4b35efa8c52afb64...
         | which seems like it would have also helped the article author.
        
         | stabbles wrote:
         | What comes closest is scandir [1], which gives you an iterator
         | of direntries, and can be used to avoid lstat syscalls for each
         | file.
         | 
         | Otherwise you can open a dir and pass its fd to openat together
         | with a relative path to a file, to reduce the kernel overhead
         | of resolving absolute paths for each file.
         | 
         | [1] https://man7.org/linux/man-pages/man3/scandir.3.html
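          | 
          | Roughly, in Python, which exposes both (the directory name
          | here is illustrative):
          | 
          |   import os
          | 
          |   # open the directory once, then do openat-style opens
          |   # against it via dir_fd
          |   dfd = os.open("src", os.O_RDONLY | os.O_DIRECTORY)
          |   for entry in os.scandir("src"):      # uses getdents
          |       # d_type means no extra lstat on most filesystems
          |       if not entry.is_file(follow_symlinks=False):
          |           continue
          |       fd = os.open(entry.name, os.O_RDONLY, dir_fd=dfd)
          |       data = os.read(fd, 1 << 20)      # first chunk
          |       os.close(fd)
          |   os.close(dfd)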
        
           | zokier wrote:
           | in what way does scandir avoid stat syscalls?
        
             | stabbles wrote:
             | Because you get an iterator over `struct dirent`, which
             | includes `d_type` for popular filesystems.
             | 
             | Notice that this avoids `lstat` calls; for symlinks you may
             | still need to do a stat call if you want to stat the
             | target.
        
           | direwolf20 wrote:
           | This is a (3) man page which means it's not a syscall. Have
           | you checked it doesn't call lstat on each file?
        
             | stabbles wrote:
             | Fair, https://www.man7.org/linux/man-
             | pages/man2/getdents64.2.html is a better link. You'd have
             | to call lstat when d_type is DT_UNKNOWN
        
         | arter45 wrote:
         | >why can't we get a syscall to load an entire directory into an
         | array of file descriptors (minus an array of paths to ignore),
         | instead of calling open() on every individual file in that
         | directory?
         | 
         | You mean like a range of file descriptors you could use if you
         | want to save files in that directory?
        
         | levodelellis wrote:
         | Not sure, I'd like that too
         | 
         | You could use io_uring but IMO that API is annoying and I
         | remember hitting limitations. One thing you could do with
         | io_uring is using openat (the op not the syscall) with the dir
         | fd (which you get from the syscall) so you can asynchronously
         | open and read files, however, you couldn't open directories for
         | some reason. There's a chance I may be remembering wrong
        
         | justsomehnguy wrote:
         | If you don't need the security at all then yes. Otherwise you
         | need to check every file for the permissions.
        
         | direwolf20 wrote:
         | You can probably do it with io_uring, as a generic syscall
         | batching mechanism.
        
         | king_geedorah wrote:
         | io_uring supports submitting openat requests, which sounds like
         | what you want. Open the dirfd, extract all the names via
         | readdir and then submit openat SQEs all at once. Admittedly I
         | have not used the io uring api myself so I can't speak to edge
         | cases in doing so, but it's "on the happy path" as it were.
         | 
         | https://man7.org/linux/man-pages/man3/io_uring_prep_open.3.h...
         | 
         | https://man7.org/linux/man-pages/man2/readdir.2.html
         | 
         | Note that the prep open man page is a (3) page. You could of
         | course construct the SQEs yourself.
        
           | torginus wrote:
           | You have a limit of 1k simultaneous open files per process -
           | not sure what overhead exists in the kernel that made them
           | impose this, but I guess it exists for a reason. You might
            | run into trouble if you open too many files at once (either
           | the kernel kills your process, or you run into some internal
           | kernel bottleneck that makes the whole endeavor not so
           | worthwhile)
        
             | dinosaurdynasty wrote:
              | That's mainly for historical reasons (the select syscall
              | can only handle fds < 1024); modern programs can just set
              | their soft limit to their hard limit and not worry about it
              | anymore: https://0pointer.net/blog/file-descriptor-
              | limits.html
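              | 
              | In Python, for example, that's just:
              | 
              |   import resource
              |   soft, hard = resource.getrlimit(
              |       resource.RLIMIT_NOFILE)
              |   resource.setrlimit(
              |       resource.RLIMIT_NOFILE, (hard, hard))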
        
         | paulddraper wrote:
         | io_uring can open multiple files.
        
         | ori_b wrote:
         | It's not the syscalls. There were only 300,000 syscalls made.
         | Entering and exiting the kernel takes 150 cycles on my (rather
         | beefy) Ryzen machine, or about 50ns per call.
         | 
            | Even assuming it takes 1us per mode switch, which would be
            | insane, you'd be looking at 0.3s out of the 17s for syscall
            | overhead.
         | 
         | It's not obvious to me where the overhead is, but random seeks
         | are still expensive, even on SSDs.
        
           | ncruces wrote:
           | Didn't test, but my guess is it's not "syscalls" but "open,"
           | "stat," etc; "read" would be fine. And something like
           | "openat" might mitigate it.
        
       | mnahkies wrote:
        | Something that struck me earlier this week while profiling
        | certain workloads: I'd really like a flame graph that includes
        | wall time spent waiting on IO, be it a database call, the
        | filesystem, or another RPC.
       | 
       | For example, our integration test suite on a particular service
       | has become quite slow, but it's not particularly clear where the
       | time is going. I suspect a decent amount of time is being spent
       | talking to postgres, but I'd like a low touch way to profile this
        
         | trillic wrote:
         | See if you can wrap the underlying library call to pg.query or
         | whatever it is with a generic wrapper that logs time in the
         | query function. Should be easy in a dynamic lang.
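          | 
          | Something like this generic sketch; the client object and
          | query method are stand-ins for whatever driver you actually
          | use:
          | 
          |   import functools, time
          | 
          |   def timed(fn):
          |       @functools.wraps(fn)
          |       def wrapper(*args, **kwargs):
          |           t0 = time.perf_counter()
          |           try:
          |               return fn(*args, **kwargs)
          |           finally:
          |               ms = (time.perf_counter() - t0) * 1000
          |               print(f"{fn.__name__}: {ms:.1f} ms")
          |       return wrapper
          | 
          |   # e.g. monkey-patch a hypothetical client:
          |   # client.query = timed(client.query)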
        
           | Kuinox wrote:
            | A tracing profiler can do exactly that; you don't need a
            | dynamic lang.
        
         | 6keZbCECT2uB wrote:
         | There's prior work:
         | https://www.brendangregg.com/FlameGraphs/offcpuflamegraphs.h...
         | 
          | There are a few challenges here:
          | 
          | - Off-cpu profiling is missing the interrupt with integrated
          | collection of stack traces, so you either instrument a full
          | timeline when threads move on and off cpu, or periodically walk
          | every thread for its stack trace.
          | 
          | - Applications have many idle threads, and waiting for IO is a
          | common threadpool case, so it's harder to distinguish the
          | thread waiting on a pool doing delegated IO from idle worker
          | pool threads.
          | 
          | Some solutions:
          | 
          | - I've used nsight systems for non-GPU stuff to visualize off-
          | CPU time equally with on-CPU time.
          | 
          | - gdb thread apply all bt is slow but does full call stack
          | walking. In python, we have py-spy dump for supported
          | interpreters.
          | 
          | - Remember that anything you can represent as call stacks and
          | integers can be converted easily to a flamegraph, e.g. taking
          | strace durations by tid and maybe fd and aggregating to a
          | flamegraph.
        
       | jchw wrote:
       | Sounds more like the VFS layer/FS is the bottleneck. It would be
       | interesting to try another FS or operating system to see how it
       | compares.
        
         | arter45 wrote:
         | Some say Mac OS X is (or used to be) slower than Linux at least
         | for certain syscalls.
         | 
         | https://github.com/golang/go/issues/28739#issuecomment-10426...
         | 
         | https://stackoverflow.com/questions/64656255/why-is-the-c-fu...
         | 
         | https://github.com/valhalla/valhalla/issues/1192
         | 
         | https://news.ycombinator.com/item?id=13628320
         | 
         | Not sure what's the root cause, though.
        
           | jchw wrote:
           | This would not be surprising at all! An impressive amount of
           | work has gone into making the Linux VFS and filesystem code
           | fast and scalable. I'm well aware that Linux didn't invent
           | the RCU scheme, but it uses variations on RCU liberally to
           | make filesystem operations minimally contentious, and
           | aggressively caches. (I've also learned recently that the
           | Linux VFS abstractions are quite different from BSD/UNIX, and
            | they don't really map to each other. Linux has many
           | structures, like dentries and generic inodes, that map to
           | roughly one structure in BSD/UNIX, the vnode structure. I'm
           | not positive that this has huge performance implications but
           | it does seem like Linux is aggressive at caching dentries
           | which may make a difference.)
           | 
           | That said, I'm certainly no expert on filesystems or OS
           | kernels, so I wouldn't know if Linux would perform faster or
           | slower... But it would be very interesting to see a
           | comparison, possibly even with a hypervisor adding overhead.
        
       | djmips wrote:
       | profile profile profile
        
       | rurban wrote:
       | I still use a 10x faster lexer, RE2C over flex, because it does
        | so much more at compile-time. And on top of that it has a bunch
        | of optimization options for better compilers, like computed
        | gotos.
       | 
       | Of course syscalls suck, slurping the whole file at once always
       | wins, and in this case all files at once.
       | 
       | Kernels suck in general. You don't really need one for high perf
       | and low space.
        
       | sheepscreek wrote:
        | I always knew this anecdotally, through experience, and told
        | myself it's the file system block size that's the culprit. Put
        | together with SSD random seek times, it makes sense on the
        | surface. I never thought of syscalls being so expensive. But
        | they might just be a symptom and not the bottleneck itself
        | (after all, it's just a function call to the server). My
        | initial thought was DMA. You see, CPUs usually have direct
        | access to only one PCIe slot in most consumer hardware. The
        | other PCIe and NVMe M.2 slots share the same bandwidth and
        | take turns. When something wants access, it does a dance with
        | the CPU and other parts of the computer using INT (interrupt)
        | instructions that make the CPU pause so I/O can take over for
        | a bit. The switching back and forth is costly too and adds up
        | quickly.
       | 
        | That said, it wouldn't explain why a MacBook (which should have
        | the SSD already on the fastest/dedicated pathway) would be this
        | slow, unless something else in the OS was the bottleneck?
       | 
       | I think we're just scratching the surface here and there is more
       | to this story that is waiting to be discovered. But yeah, to get
       | the job done, package it in fewer files for the OS, preload into
       | RAM or use mmap, then profit.
        
       | Dwedit wrote:
       | On Windows, an optimization is to call CloseHandle from a
       | secondary thread.
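        | 
        | The same shape, sketched in Python with a plain worker thread
        | (the Windows win comes from CloseHandle itself; the hand-off
        | looks like this):
        | 
        |   import os, queue, threading
        | 
        |   _pending = queue.Queue()
        | 
        |   def _closer():
        |       while True:
        |           os.close(_pending.get())
        | 
        |   threading.Thread(target=_closer, daemon=True).start()
        | 
        |   def close_async(fd):
        |       # hand the fd off; the caller never blocks on close()
        |       _pending.put(fd)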
        
       | hwspeed wrote:
       | Classic case of optimizing the wrong thing. I've hit similar
       | issues with ML training pipelines where GPU utilization looks
       | terrible because data loading is the bottleneck. The profiler
       | tells you the GPU kernel is fast, but doesn't show you it's
       | sitting idle 80% of the time waiting for the next batch. Amdahl's
       | law is brutal when you've got a serial component in your
       | pipeline.
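        | 
        | The arithmetic really is brutal: if the kernel is only 20% of
        | wall time, even an infinitely fast kernel caps the overall
        | speedup at 1.25x. A tiny Amdahl's-law check:
        | 
        |   def amdahl(f, s):
        |       # overall speedup when a fraction f of wall time
        |       # is accelerated by a factor s
        |       return 1 / ((1 - f) + f / s)
        | 
        |   # GPU kernel is 20% of wall time, data loading is 80%
        |   print(amdahl(0.2, 10))    # ~1.22x overall
        |   print(amdahl(0.2, 1e9))   # ~1.25x ceiling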
        
       | torginus wrote:
        | I started programming on DOS - I remember how amazing it was
        | that you almost talked to the hardware directly; there was
        | very little restriction on what you could do, and the OS (which
        | imo
       | was much more akin to a set of libraries) provided very little
       | abstraction for you.
       | 
       | Then I moved to Windows, and Linux. Each had its own
       | idiosyncrasies, like how everything is a file on Linux, and
       | you're supposed to write programs by chaining existing
       | executables together, or on the desktop, both Win32 and X11
       | started out with their own versions of UI elements, so XWindow or
       | Win32 would know about where a 'button' was, and the OS was
       | responsible for event handling and drawing stuff.
       | 
       | Eventually both Windows and Linux programs moved to a model where
       | the OS just gave you the window as a drawing surface, and you
       | were supposed to fill it.
       | 
       | Similarly, all other OS supplied abstractions slowly fell out of
       | use beyond the bare minimum.
       | 
        | Considering this, I wonder if it's time to design a new, much
        | lower level abstraction, for file systems in this case. This
        | would be a way to mmap an entire directory into the process
        | space, where each file would be a struct which has a list of
        | pointers to its pages on disk, and each directory would be a
        | list of such entries, again stored in some data structure you
        | could access. Synchronizing reads/writes would be orchestrated
        | by the kernel somehow (I'm thinking locking/unlocking pages
        | being written to).
       | 
       | So that way there'd be no difference between traversing an in-
       | memory data structure and reading the disk.
       | 
        | I know this approach isn't super compatible with the async/await
        | style of I/O; however, I'm not 100% convinced that's the correct
        | approach either (disk paging is a fundamental feature of all
        | OSes, yet it is absolutely inexpressible in programming terms).
        
         | LorenPechtel wrote:
         | I'd love to see this.
         | 
         | Bring back the "segmented" memory architecture. It was not evil
         | because of segments, but because of segment size. If any
         | segment can be any size the bad aspects fall away.
         | 
         | File handles aren't needed anymore. You open a file, you get
         | back a selector rather than an ID. You reference memory from
          | that selector, and the system silently swaps pages in as
          | needed.
         | 
         | You could probably do the same thing with directories but I
         | haven't thought about it.
        
           | torginus wrote:
           | The idea as I stated it is super half-baked but
           | 
           | > You could probably do the same thing with directories but I
           | haven't thought about it.
           | 
           | For example in the FAT filesystem, a directory is just a file
           | with a special flag set in its file descriptor and inside
           | said file there is just a list of file descriptors. Not sure
            | if something so simple would be a good idea, but it certainly
           | works and has worked IRL.
        
       | 1vuio0pswjnm7 wrote:
       | "TAR vs ZIP: Sequential vs Random Access"
       | 
       | https://199.233.217.201/pub/pkgsrc/distfiles/dictd-1.13.3.ta...
       | 
       | Wikipedia:
       | 
       | "In order to efficiently store dictionary data, dictzip, an
       | extension to the gzip compression format (also the name of the
       | utility), can be used to compress a .dict file. Dictzip
       | compresses file in chunks and stores the chunk index in the gzip
       | file header, thus allowing random access to the data."
        
       | Sparkyte wrote:
        | I/O has been the bottleneck for many things, especially
        | databases.
       | 
        | So as someone who has seen a long spread of technological
        | advancements over the years, I can confidently tell you that
        | chips have far surpassed any peripheral components.
       | 
       | Kind of that scenario where compute has to be fast enough anyway
       | to support I/O. So really it always has to be faster, but I am
       | saying that it has exceeded those expectations.
        
       ___________________________________________________________________
       (page generated 2026-01-26 15:01 UTC)