[HN Gopher] Freqfs: In-memory filesystem cache for Rust
___________________________________________________________________
Freqfs: In-memory filesystem cache for Rust
Author : haydnv
Score : 75 points
Date : 2021-10-01 18:17 UTC (4 hours ago)
(HTM) web link (docs.rs)
(TXT) w3m dump (docs.rs)
| munro wrote:
| Musing: sometimes I wish file systems & databases were
| unified. I'm imagining just a single fast DB engine sitting on my
| storage--and your traditional file system structure would just be
| tables in there. I kinda just treat SQLite like that, but it's
| not as transparently optimized as it could be for large files.
| Why? I don't want to mentally jump around technologies. I want to
| query my FS like a DB, and I want to store files in my DB like a
| FS. The reality, though, is that no one-size-fits-all DB exists.
|
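| For what it's worth, the SQLite version of that is just a blob
| table (rusqlite; untested sketch, and the schema is made up):
|
|     use rusqlite::Connection;
|
|     fn main() -> rusqlite::Result<()> {
|         // one table standing in for a directory tree
|         let db = Connection::open("files.db")?;
|         db.execute(
|             "CREATE TABLE IF NOT EXISTS files \
|              (path TEXT PRIMARY KEY, body BLOB)",
|             [],
|         )?;
|         // store a file in the DB like a FS...
|         let body = std::fs::read("photo.jpg").expect("read");
|         db.execute(
|             "INSERT OR REPLACE INTO files (path, body) VALUES (?1, ?2)",
|             rusqlite::params!["photo.jpg", body],
|         )?;
|         // ...and query the FS like a DB
|         let n: i64 =
|             db.query_row("SELECT COUNT(*) FROM files", [], |r| r.get(0))?;
|         println!("{} files", n);
|         Ok(())
|     }
|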
| And more on topic: tokio-uring is really fast [1], and I'm really
| loving tokio in general.
|
| [1]
| https://gist.github.com/munro/14219f9a671484a8fe820eb35d26bb...
| haydnv wrote:
| Yeah, I just learned about tokio-uring and I'm planning to get
| it into the next major release of freqfs.
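|
| The basic read looks roughly like this (untested sketch;
| "data.bin" is just a placeholder path):
|
|     use tokio_uring::fs::File;
|
|     fn main() -> std::io::Result<()> {
|         // tokio-uring drives io_uring on Linux, so file I/O
|         // doesn't hop to a blocking thread pool like tokio::fs
|         tokio_uring::start(async {
|             let file = File::open("data.bin").await?;
|             let buf = vec![0u8; 4096];
|             // buffers are passed by value and handed back
|             let (res, buf) = file.read_at(buf, 0).await;
|             let n = res?;
|             println!("read {} bytes: {:?}", n, &buf[..n]);
|             Ok(())
|         })
|     }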
| weinzierl wrote:
| We've been there. Before we had filesystems as we know them
| today, there were many different approaches to persistent data
| storage. Roughly, these could be grouped into two camps: the
| files camp and the records camp.
|
| The record-based approach had many properties we know from
| modern databases. It was a first class citizen on the mainframe
| and IBM was its champion.
|
| In my opinion hierarchical filesystems won as everyday data
| storage _because_ of their simplicity and not despite it. I
| think the idea of a file being just a series of bytes and
| leaving the interpretation to the application is ingenious.
| That doesn't mean there is no room for standardized OS-level
| database-like storage. In fact I'd love to see that.
| Koshkin wrote:
| Modern filesystems in a way combine both approaches: they
| store the data unstructured but also give the ability to
| store metadata (attributes) in a structured way.
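|
| For example, on Linux/macOS you can hang key-value metadata
| off an ordinary file via extended attributes (sketch using
| the xattr crate; file name and keys are made up):
|
|     fn main() -> std::io::Result<()> {
|         // structured metadata on an unstructured file
|         xattr::set("report.pdf", "user.author", b"alice")?;
|         if let Some(v) = xattr::get("report.pdf", "user.author")? {
|             println!("author = {}", String::from_utf8_lossy(&v));
|         }
|         Ok(())
|     }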
| pkaye wrote:
| Are you thinking of something like WinFS?
|
| https://en.wikipedia.org/wiki/WinFS
|
| Or more like BeOS BFS with its extended attributes, indexing,
| and querying?
|
| https://en.wikipedia.org/wiki/Be_File_System
|
| Also I think a lot of the old mainframe filesystems had the
| concept of records and indexes built in since they were
| primarily used for business operations.
| arghwhat wrote:
| Your filesystem _is_ a database. It's just a document-oriented
| database rather than a relational SQL one.
| edoceo wrote:
| You mean it's always been NoSQL?
|
| _Astronaut with gun_: always has been.
| [deleted]
| cogman10 wrote:
| It even has a lot of the same features as a full-fledged DB.
|
| For example, most file systems today are journaled, which is
| exactly how most databases provide the atomicity, consistency,
| and durability in ACID.
|
| About the only thing it's missing is automatic document
| locking (though most file systems support explicit advisory
| locks; rough sketch below).
|
| That said, there are often some pretty hard limits on the
| number of objects in a table (directory). Depending on the
| file system you can be looking at anywhere from 10k to 1
| billion files per directory.
|
| There are also some unfortunate storage characteristics. Most
| file systems have a minimum allocation size of around 4 KB,
| mostly to optimize for disk access. DBs often pack things
| together much more tightly.
|
| But hey, if you can spin using the FS as a DB... do it.
| Particularly for a read-heavy application, the FS is nearly
| perfect for such operations.
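|
| Explicit locking is a one-liner with the fs2 crate (rough
| sketch; the locks are advisory, so every writer has to opt
| in):
|
|     use fs2::FileExt;
|     use std::fs::File;
|
|     fn main() -> std::io::Result<()> {
|         let f = File::open("shared.db")?;
|         // blocks until no other process holds the lock
|         f.lock_exclusive()?;
|         // ... read/modify the file ...
|         f.unlock()?;
|         Ok(())
|     }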
| gpderetta wrote:
| The biggest problem is the lack of good transactional
| facilities.
| wongarsu wrote:
| There is a transactional API for NTFS in Windows [1]. It
| allows transactional operations not just within a file
| but also across files or across multiple computers (to
| make sure something is applied to your whole fleet
| atomically).
|
| 1: https://en.wikipedia.org/wiki/Transactional_NTFS
| cogman10 wrote:
| Yup, the I in ACID is a bitch :)
| the8472 wrote:
| You can lock directories, you can atomically swap
| directories (via RENAME_EXCHANGE on Linux), and CoW
| filesystems make cloning kind of cheap. That could be used
| to implement transactions and commits. Getting the
| consistency checks/conflict detection during the commit
| right would be the most difficult part. Change notifications
| could be used to do some of that proactively. It's a
| terrible idea, but it could be done.
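|
| The simplest version of a "commit" is write-then-rename
| (sketch; assumes the temp file and the target live on the
| same filesystem):
|
|     use std::io::Write;
|
|     fn commit(path: &str, data: &[u8]) -> std::io::Result<()> {
|         let tmp = format!("{}.tmp", path);
|         let mut f = std::fs::File::create(&tmp)?;
|         f.write_all(data)?;
|         f.sync_all()?; // durable before we publish
|         // rename(2) is atomic within one filesystem: readers
|         // see the old file or the new one, never a torn write
|         std::fs::rename(&tmp, path)
|     }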
| amelius wrote:
| It doesn't really support transactions very well, though.
| jerrysievert wrote:
| Until an underlying change in technology happens, and then
| you wish they were no longer unified (spinning rust to SSD to
| NVMe, for example).
|
| I would prefer more pluggable interfaces, personally.
|
| (hi Ryan, long time no see!)
| [deleted]
| mountainboy wrote:
| requires tokio / async. ugh. I'm out.
| nonameiguess wrote:
| I honestly don't think I like this at the application level.
| You're removing a degree of freedom from operators and users. I
| have a ton of memory on all of my home devices, and usually just
| take the working directories for frequently used applications and
| mount them as tmpfs. I do the same thing for application working
| directories of applications I deploy at work as well, where we
| have complete freedom to deploy memory-optimized servers with
| lots of RAM. Putting an extra in-memory cache on top of an OS
| filesystem that is already in memory is an unnecessary extra
| step that doubles the memory use of each file, and it can't be
| turned off without patching and recompiling your application.
| The OS is already smart enough not to add a cache on top of
| tmpfs.
| haydnv wrote:
| I don't know that it's fair to say it's "doubling" the memory
| use of each file, because the OS cache memory is still "free"
| from the perspective of an application. Where it comes in handy
| is in applications like databases or ML model training, where
| there are hot spots that get accessed/updated extremely
| frequently--then the application doesn't have to incur
| serialization overhead in order to read/write the data that the
| file encodes (although, as another poster pointed out, it might
| also be possible to do this with mmap).
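|
| Not freqfs's actual API, but the shape of the idea is: keep
| the deserialized value in memory behind a lock and only pay
| for serde + disk on a miss (sketch; the Record type is made
| up):
|
|     use std::collections::HashMap;
|     use std::sync::RwLock;
|
|     #[derive(Clone, serde::Deserialize)]
|     struct Record { hits: u64 }
|
|     struct Cache { files: RwLock<HashMap<String, Record>> }
|
|     impl Cache {
|         fn get(&self, path: &str) -> std::io::Result<Record> {
|             if let Some(r) = self.files.read().unwrap().get(path) {
|                 return Ok(r.clone()); // hot path: no deserialization
|             }
|             let raw = std::fs::read(path)?;
|             let rec: Record = serde_json::from_slice(&raw)
|                 .map_err(std::io::Error::from)?;
|             self.files.write().unwrap().insert(path.into(), rec.clone());
|             Ok(rec)
|         }
|     }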
| axegon_ wrote:
| Why hasn't anyone told me about this?!?!? I love you so much for
| posting, I needed something like this for a personal project I'm
| fiddling around with in my spare time.
| haydnv wrote:
| That's great! Please let me know how it goes! Feel free to file
| any bug reports or feature requests here:
| https://github.com/haydnv/freqfs/issues
| Koshkin wrote:
| They just did.
| Svetlitski wrote:
| Upon seeing this, I can't help but think of "So What's Wrong with
| 1975 Programming" from the author of Varnish [1].
|
| [1] https://varnish-cache.org/docs/trunk/phk/notes.html
| sagichmal wrote:
| Don't filesystems already do this? Like, really really well?
| __s wrote:
| Unfortunately this doesn't meet my use case (which they list as
| an intended use case): serving static assets over HTTP. I
| currently use an in-memory cache without eviction; freqfs
| doesn't meet my requirements because I store the in-memory
| content precompressed.
|
| https://github.com/serprex/openEtG/blob/master/src/rs/server...
|
| edit: seems it can. Nice
| haydnv wrote:
| I think if you call your precompression function in
| FileLoad::load it should do what you need--please file an issue
| if this is not the case:
| https://github.com/haydnv/freqfs/issues
| jagged-chisel wrote:
| I think I don't understand the problem. Precompressed files are
| still files and can be cached.
| WJW wrote:
| I think what they mean is that they want the file on disk to
| be uncompressed but the file in memory to be compressed.
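|
| Which would just mean compressing at load time, something
| like (flate2; untested sketch):
|
|     use flate2::{write::GzEncoder, Compression};
|     use std::io::Write;
|
|     // the on-disk copy stays uncompressed; the cached
|     // copy is gzipped once, up front
|     fn load_precompressed(path: &str) -> std::io::Result<Vec<u8>> {
|         let raw = std::fs::read(path)?;
|         let mut enc = GzEncoder::new(Vec::new(), Compression::default());
|         enc.write_all(&raw)?;
|         enc.finish()
|     }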
| amelius wrote:
| > freqfs automatically caches the most frequently-used files and
| backs up the others to disk. This allows the developer to create
| and update large collections of data purely in-memory without
| explicitly sync'ing to disk, while still retaining the
| flexibility to run on a host with extremely limited memory.
|
| Why not let the OS take care of this?
| jdeaton wrote:
| My thought exactly.
| BiteCode_dev wrote:
| You can pick and choose.
|
| Maybe your OS's caching strategy isn't best for your use
| case. Also, you may use a network file system, or several types
| of FS, and want your cache warm-up to be tunable and
| consistent.
| the8472 wrote:
| > Maybe your OS's caching strategy isn't best for your use
| case.
|
| On the other hand the OS does know about memory pressure from
| IO and from heap memory for the whole system. This crate will
| only know about cache pressure within a single process.
|
| > Also, you may use a network file system
|
| Which can also be set to do aggressive caching, at the
| expense of consistency.
|
| > and want your cache warm-up to be tunable and consistent.
|
| The description doesn't say that it does cache warm-up any
| more eagerly than regular reads would.
| kbenson wrote:
| The benefit of bringing something in-process is, as always,
| more control, usually at the expense of having to make
| decisions with less data about the rest of the system than
| an OS-level service would have.
|
| Sometimes you need very explicit control over when things
| are read from cache and when they aren't. This can be hard
| with network file systems. Especially when you have two
| different use cases on the same filesystem, which isn't
| that odd, even within a single application.
| shakna wrote:
| Presumably for a use case similar to SQLite's [0]:
| performance. You can beat the OS, and by a noticeable margin,
| by doing things in memory and avoiding the I/O bottleneck.
|
| [0] https://www.sqlite.org/fasterthanfs.html
| cornstalks wrote:
| I think GP's point is that the OS usually has a file system
| cache in RAM.
| topspin wrote:
| I think the P's point, supported by evidence, is that the
| OS cache is not optimal for all use cases.
| edoceo wrote:
| No cache is optimal for all use cases. That's an
| impossible goal.
| topspin wrote:
| Thus why things like Freqfs exist and we don't always
| "let the OS take care of this."
| edoceo wrote:
| Yea friend, we're in violent agreement :)
| haydnv wrote:
| One advantage is consistency across host platforms, but the
| main advantage is that the file data can be accessed (and
| mutated) in memory in a deserialized format. If you let the OS
| take care of it, you would still have the overhead of
| serializing & deserializing a file every time it's accessed.
| dfranke wrote:
| That's what mmap is for.
| haydnv wrote:
| It might be possible to replace freqfs with mmap on a POSIX
| OS, but a) you would still have to implement your own read-
| write lock, and b) you would (I think probably?) lose some
| consistency in behavior across different host operating
| systems.
| vlovich123 wrote:
| Which OSes does this run on that don't have some kind
| of mmap operation?
| haydnv wrote:
| It should work on Windows (because tokio::fs works on
| Windows), although I have not personally tested this.
| julian37 wrote:
| You can do mmap on Windows, e.g.
| https://github.com/danburkert/memmap-rs
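|
| A minimal read-only map with that crate (sketch; "data.bin"
| is a placeholder):
|
|     use memmap::Mmap;
|     use std::fs::File;
|
|     fn main() -> std::io::Result<()> {
|         let file = File::open("data.bin")?;
|         // unsafe: the mapping is only sound while no other
|         // process truncates the file underneath us
|         let mmap = unsafe { Mmap::map(&file)? };
|         println!("{} bytes mapped", mmap.len());
|         Ok(())
|     }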
| gpderetta wrote:
| mmap for reads, an explicit API for writes, a la LMDB.
| Buggy readers can read inconsistent data but cannot
| corrupt the OS.
| otterley wrote:
| Corrupt the OS? How might that happen?
| gpderetta wrote:
| Sorry, I meant the DB!
| [deleted]
| alexruf wrote:
| Personally I don't see a scenario for myself, but I can
| imagine that there are some where this might be useful. But
| isn't there an extremely high risk of data loss and
| inconsistency when adding an extra layer on top of the OS's
| file system handling?
| haydnv wrote:
| If there is any concurrent access to cached files not
| through freqfs, there is a risk of inconsistency and
| crashes.
| ericbarrett wrote:
| Freqfs seems like a shim you'd add to an existing project
| for a quick optimization. Whereas mmap et al. are "better"
| the same way any specific, built-to-purpose code will be
| "better" than just bolting a framework on. Sometimes it's
| the right call to do the extra work; sometimes it's 100%
| more effort (both development and maintenance) for an extra
| 10% gain.
___________________________________________________________________
(page generated 2021-10-01 23:00 UTC)