[HN Gopher] Show HN: Versioning Filesystem for SQLite
       ___________________________________________________________________
        
       Show HN: Versioning Filesystem for SQLite
        
       Author : devnull3
       Score  : 78 points
       Date   : 2022-08-28 13:49 UTC (9 hours ago)
        
 (HTM) web link (github.com)
 (TXT) w3m dump (github.com)
        
       | ctas wrote:
       | Is there a way to provide the same functionality via extensions
       | instead of VFS? Recently, there's been a few other interesting
       | projects adding backup and replication to SQLite, but they all
       | use VFS. AFAIK only one VFS can be active at a time.
        
         | devnull3 wrote:
         | I am the author. Mojo is both an extension and a VFS
         | implementation.
         | 
         | > AFAIK only one VFS can be active at a time.
         | 
         | You can specify VFS as a part of connect URI.
        
       | TheRealPomax wrote:
       | Given that sqlite databases are just files, what is the benefit
       | of this vs. version controlling the database file itself?
        
         | operator-name wrote:
         | > The main feature of the fs is versioning/snapshotting. Only
         | one version is writable and all the old versions are immutable.
        
       | devnull3 wrote:
       | (I am the author) Some of the use-cases of mojo I could think of:
       | 
       | 1. Rollbacking DDL changes (like create table, adding columns,
       | etc)
       | 
       | 2. (My use-case) Multiple readers-single writer. Readers read old
       | versions. Writers do not block readers and vice-versa. Note: this
       | is different than reading+writing in the same version can still
       | happen but with database-level locking.
       | 
       | 3. Extension of #2: The older versions read from S3 so that lamda
       | can read.
       | 
       | 4. Extension of #3: Readers reading from old immutable snapshots
       | makes the database as a queue of complex changes with indexes and
       | all the benefits of SQL.
       | 
       | 5. Backups and replication.
       | 
       | Oh: I have change the license to MIT
        
       | rnavi wrote:
       | Would love to learn more on the motivation behind the project
        
       | koeng wrote:
       | How would something like this work with litestream? I imagine
       | that the WAL file gets written the same way, but would recovering
       | the database allow for mojo to continue working?
        
         | devnull3 wrote:
         | From what I know of litestream, it should be decoupled from VFS
         | implementation. The versions/snapshots are not known to the
         | database.
        
       | punnerud wrote:
       | Could this be used to have multiple writes? where every write
       | have their own sqlite3 and read is a combination of all the files
        
         | Lex-2008 wrote:
         | regarding multiple writes - I think hardest part would be to
         | find an sqlite version where "read is a combination of all the
         | files".
         | 
         | Speaking of multiple writers, I've recently heard about "BEGIN
         | CONCURRENT" feature of SQLite [1] - currently it lives on its
         | separate branch, but I hope they will eventually merge it to
         | the main branch. Not sure if it can be used in your case, but
         | worth mentioning anyway, I think.
         | 
         | [1]: https://www.sqlite.org/cgi/src/doc/begin-
         | concurrent/doc/begi...
        
         | devnull3 wrote:
         | Not sure I understood your question fully, but if there are
         | multiple versions, then read will be over multiple files but
         | writes go to a single active version.
        
         | remram wrote:
         | Each writer can get its own branch but you would never be able
         | to re-combine them.
        
       | phil294 wrote:
       | This seems great, also in performance, but why would you use this
       | over making plain file copy-based backups, since SQLite is single
       | file based? I'd assume this solution might be preferable because
       | it saves deltas instead of absolute, but the Readme does not say
       | so. If the main advantage here is on-the-fly selecting from
       | different versions, granted, that is unique, even though I cannot
       | really imagine a production use case for this.
       | 
       | So far, I have been using file copies and GFS backup scheme based
       | solutions like Borg, which also does deltas in chunks,
       | compression, and encryption. Perhaps with Borg one should better
       | even backup SQL dumps instead of db files, I don't know.
        
         | devnull3 wrote:
         | Only modified pages go into a version data file [1]
         | 
         | [1] https://github.com/sudeep9/mojo/blob/main/design.md
        
           | arjvik wrote:
           | Could you achieve the same by using BTRFS to store the db
           | file?
        
             | devnull3 wrote:
             | Yes. But you need to have such a filesystem installed at
             | the first place.
             | 
             | Mojo can run on any filesystem which does not support
             | snapshots/versions. This makes it portable not only across
             | different fs but also across OS.
        
           | rahimiali wrote:
           | don't you get the same benefit if you version controlled the
           | db file with git? with git, each commit saves a diff from the
           | previous one as a blob. the difference is that in git, in
           | addition to the diffs, you also have to create a working copy
           | of the db, which means you use up at least 2x the storage
           | your system uses. in your implementation, the diff blobs
           | _are_ the live db, which saves you ~2x storage. is that the
           | main benefit?
        
         | Lex-2008 wrote:
         | re: backup SQL dumps instead of db files - indeed! In my small
         | experiment (SQL databases in a browser profile, [1]), simple
         | `sqlite3 "$file" .dump | gzip >"$file.sql.gz"` decreased size
         | of files to be backed up about 10 times!
         | 
         | I'm not sure how it compares to Borg's delta- and zstd-
         | compression, though.
         | 
         | [1]: http://alexey.shpakovsky.ru/en/minimizing-size-of-browser-
         | pr...
        
       | dkjaudyeqooe wrote:
       | > At present there is no license and to my knowledge it is very
       | restrictive. This will change. I have not made up my mind on the
       | exact license.
       | 
       | There might be something lost in translation but since there is
       | no licence this is a copyrighted work and cannot be used without
       | permission.
        
         | raro11 wrote:
         | You are correct
        
         | metadat wrote:
        
           | devnull3 wrote:
           | > is it some kind of a tease
           | 
           | (I am the author). It's definitely not a tease. The license
           | will change within a week (max 2). This is my first open
           | source project. The purpose is to validate the approach.
           | 
           | This has been a private project for a very long time. It can
           | be argued that I should have figured out the license but the
           | delay from my side reached a tipping point. I just thought
           | "make it public, figure the rest later".
        
             | metadat wrote:
             | Got it, if you're open to feedback: in the future, a few
             | more words of explanation about the reasoning in the readme
             | will be appreciated by folks and go a long way in making it
             | clear you are reasonable and thoughtful person. Otherwise
             | it's up to the reader to guess what kind of individual sits
             | on the other side of the screen.
             | 
             | Unfortunately there is a lot of code license drama these
             | days compared to a decade ago. For reference, see the
             | Elasticsearch OSS License renege fiasco from January 2021.
             | 
             | https://www.infoq.com/news/2021/01/elastic-aws-open-source/
             | 
             | https://en.wikipedia.org/wiki/Elasticsearch#Licensing_chang
             | e...
             | 
             | Keep up the good work!
        
               | kevinmgranger wrote:
               | Isn't "...and to my knowledge it is very restrictive.
               | This will change." enough of an explanation?
        
               | devnull3 wrote:
               | Noted.
               | 
               | I have changed the license to MIT.
        
             | GoOnThenDoTell wrote:
             | Should we avoid even reading this code if there is no
             | license? Its currently proprietary code?
        
               | devnull3 wrote:
               | I think you can read the code. I knowingly made the repo
               | public. You do not even have to login to github. I think
               | it should be ok.
        
               | dkjaudyeqooe wrote:
               | It's just a matter of legalities. Even if you share it
               | openly, it doesn't change its copyright status. Unless
               | you offer a licence or make a statement putting it in the
               | public domain (this doesn't work in all jurisdictions),
               | it remains proprietary.
               | 
               | Of course that doesn't matter too much because you're not
               | going to sue people who use it, but it can still have
               | potentially significant consequences for people who use
               | the code.
        
         | devnull3 wrote:
         | I have changed the license to MIT. Sorry for the wrong
         | impression.
        
         | squeaky-clean wrote:
         | That's exactly how I read that sentence. There was no license,
         | and ironically no license is more restrictive than most
         | licenses.
         | 
         | Once you release code under a certain license you can pretty
         | much never undo that. You can change the license, but the
         | previous version under the previous license will still be
         | valid. So I understand wanting to take time to think about what
         | license you will use.
        
         | [deleted]
        
       ___________________________________________________________________
       (page generated 2022-08-28 23:01 UTC)