[HN Gopher] Show HN: Versioning Filesystem for SQLite
___________________________________________________________________
Show HN: Versioning Filesystem for SQLite
Author : devnull3
Score : 78 points
Date : 2022-08-28 13:49 UTC (9 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| ctas wrote:
| Is there a way to provide the same functionality via extensions
| instead of VFS? Recently, there's been a few other interesting
| projects adding backup and replication to SQLite, but they all
| use VFS. AFAIK only one VFS can be active at a time.
| devnull3 wrote:
| I am the author. Mojo is both an extension and a VFS
| implementation.
|
| > AFAIK only one VFS can be active at a time.
|
| You can specify VFS as a part of connect URI.
| TheRealPomax wrote:
| Given that sqlite databases are just files, what is the benefit
| of this vs. version controlling the database file itself?
| operator-name wrote:
| > The main feature of the fs is versioning/snapshotting. Only
| one version is writable and all the old versions are immutable.
| devnull3 wrote:
| (I am the author) Some of the use-cases of mojo I could think of:
|
| 1. Rollbacking DDL changes (like create table, adding columns,
| etc)
|
| 2. (My use-case) Multiple readers-single writer. Readers read old
| versions. Writers do not block readers and vice-versa. Note: this
| is different than reading+writing in the same version can still
| happen but with database-level locking.
|
| 3. Extension of #2: The older versions read from S3 so that lamda
| can read.
|
| 4. Extension of #3: Readers reading from old immutable snapshots
| makes the database as a queue of complex changes with indexes and
| all the benefits of SQL.
|
| 5. Backups and replication.
|
| Oh: I have change the license to MIT
| rnavi wrote:
| Would love to learn more on the motivation behind the project
| koeng wrote:
| How would something like this work with litestream? I imagine
| that the WAL file gets written the same way, but would recovering
| the database allow for mojo to continue working?
| devnull3 wrote:
| From what I know of litestream, it should be decoupled from VFS
| implementation. The versions/snapshots are not known to the
| database.
| punnerud wrote:
| Could this be used to have multiple writes? where every write
| have their own sqlite3 and read is a combination of all the files
| Lex-2008 wrote:
| regarding multiple writes - I think hardest part would be to
| find an sqlite version where "read is a combination of all the
| files".
|
| Speaking of multiple writers, I've recently heard about "BEGIN
| CONCURRENT" feature of SQLite [1] - currently it lives on its
| separate branch, but I hope they will eventually merge it to
| the main branch. Not sure if it can be used in your case, but
| worth mentioning anyway, I think.
|
| [1]: https://www.sqlite.org/cgi/src/doc/begin-
| concurrent/doc/begi...
| devnull3 wrote:
| Not sure I understood your question fully, but if there are
| multiple versions, then read will be over multiple files but
| writes go to a single active version.
| remram wrote:
| Each writer can get its own branch but you would never be able
| to re-combine them.
| phil294 wrote:
| This seems great, also in performance, but why would you use this
| over making plain file copy-based backups, since SQLite is single
| file based? I'd assume this solution might be preferable because
| it saves deltas instead of absolute, but the Readme does not say
| so. If the main advantage here is on-the-fly selecting from
| different versions, granted, that is unique, even though I cannot
| really imagine a production use case for this.
|
| So far, I have been using file copies and GFS backup scheme based
| solutions like Borg, which also does deltas in chunks,
| compression, and encryption. Perhaps with Borg one should better
| even backup SQL dumps instead of db files, I don't know.
| devnull3 wrote:
| Only modified pages go into a version data file [1]
|
| [1] https://github.com/sudeep9/mojo/blob/main/design.md
| arjvik wrote:
| Could you achieve the same by using BTRFS to store the db
| file?
| devnull3 wrote:
| Yes. But you need to have such a filesystem installed at
| the first place.
|
| Mojo can run on any filesystem which does not support
| snapshots/versions. This makes it portable not only across
| different fs but also across OS.
| rahimiali wrote:
| don't you get the same benefit if you version controlled the
| db file with git? with git, each commit saves a diff from the
| previous one as a blob. the difference is that in git, in
| addition to the diffs, you also have to create a working copy
| of the db, which means you use up at least 2x the storage
| your system uses. in your implementation, the diff blobs
| _are_ the live db, which saves you ~2x storage. is that the
| main benefit?
| Lex-2008 wrote:
| re: backup SQL dumps instead of db files - indeed! In my small
| experiment (SQL databases in a browser profile, [1]), simple
| `sqlite3 "$file" .dump | gzip >"$file.sql.gz"` decreased size
| of files to be backed up about 10 times!
|
| I'm not sure how it compares to Borg's delta- and zstd-
| compression, though.
|
| [1]: http://alexey.shpakovsky.ru/en/minimizing-size-of-browser-
| pr...
| dkjaudyeqooe wrote:
| > At present there is no license and to my knowledge it is very
| restrictive. This will change. I have not made up my mind on the
| exact license.
|
| There might be something lost in translation but since there is
| no licence this is a copyrighted work and cannot be used without
| permission.
| raro11 wrote:
| You are correct
| metadat wrote:
| devnull3 wrote:
| > is it some kind of a tease
|
| (I am the author). It's definitely not a tease. The license
| will change within a week (max 2). This is my first open
| source project. The purpose is to validate the approach.
|
| This has been a private project for a very long time. It can
| be argued that I should have figured out the license but the
| delay from my side reached a tipping point. I just thought
| "make it public, figure the rest later".
| metadat wrote:
| Got it, if you're open to feedback: in the future, a few
| more words of explanation about the reasoning in the readme
| will be appreciated by folks and go a long way in making it
| clear you are reasonable and thoughtful person. Otherwise
| it's up to the reader to guess what kind of individual sits
| on the other side of the screen.
|
| Unfortunately there is a lot of code license drama these
| days compared to a decade ago. For reference, see the
| Elasticsearch OSS License renege fiasco from January 2021.
|
| https://www.infoq.com/news/2021/01/elastic-aws-open-source/
|
| https://en.wikipedia.org/wiki/Elasticsearch#Licensing_chang
| e...
|
| Keep up the good work!
| kevinmgranger wrote:
| Isn't "...and to my knowledge it is very restrictive.
| This will change." enough of an explanation?
| devnull3 wrote:
| Noted.
|
| I have changed the license to MIT.
| GoOnThenDoTell wrote:
| Should we avoid even reading this code if there is no
| license? Its currently proprietary code?
| devnull3 wrote:
| I think you can read the code. I knowingly made the repo
| public. You do not even have to login to github. I think
| it should be ok.
| dkjaudyeqooe wrote:
| It's just a matter of legalities. Even if you share it
| openly, it doesn't change its copyright status. Unless
| you offer a licence or make a statement putting it in the
| public domain (this doesn't work in all jurisdictions),
| it remains proprietary.
|
| Of course that doesn't matter too much because you're not
| going to sue people who use it, but it can still have
| potentially significant consequences for people who use
| the code.
| devnull3 wrote:
| I have changed the license to MIT. Sorry for the wrong
| impression.
| squeaky-clean wrote:
| That's exactly how I read that sentence. There was no license,
| and ironically no license is more restrictive than most
| licenses.
|
| Once you release code under a certain license you can pretty
| much never undo that. You can change the license, but the
| previous version under the previous license will still be
| valid. So I understand wanting to take time to think about what
| license you will use.
| [deleted]
___________________________________________________________________
(page generated 2022-08-28 23:01 UTC)