[HN Gopher] Show HN: Stanchion - Column-oriented tables in SQLite
___________________________________________________________________
Show HN: Stanchion - Column-oriented tables in SQLite
Hello HN! I built stanchion to scratch my own itch. I have
personal applications running on SQLite where I want to store log
and metric data (basically forever) but don't want to add
complexity by bringing in a new database or whole separate server.
Some of these applications are running on a Raspberry Pi, where
storage and compute are limited, but I still want data-warehouse-like
capabilities. I envision stanchion being used in similar
scenarios: on phones and resource-limited devices in applications
that are already using SQLite. I know that there are alternatives
like DuckDB (which is very cool), but I want stanchion to be "good
enough" that it is useful without having to add whole new database
technology. If you think stanchion may be a good fit for your use
case and you are interested in contributing, please test it and
provide feedback by opening issues for any bugs, difficulties, or
missing features you would need! Ideas are also welcome in this
thread or as a github issue. Of course stars are always appreciated
as well. The CONTRIBUTING doc in the repository has more details.
- Dan
Author : dgllghr
Score : 101 points
Date   : 2024-01-31 18:38 UTC (1 day ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| datadrivenangel wrote:
| I'm excited to see how this compares to DuckDB. More
| local/embedded analytics databases will be good for the industry.
| lainga wrote:
| Well, if Wikipedia's anything to go by, it seems to be the
| first in-mem column-oriented DB written in Zig
| Iridescent_ wrote:
| As far as I can tell, the linked github repository is written
| only in C. Zig is used only as the build system
|
| EDIT: nvm, the github stats seem to be completely wrong? It
| shows 96% C to me...
|
| Found it, I believe this is the result of including code from
| SQLite in the repo
| lr1970 wrote:
| Yes, this is because it includes the SQLite source (in C), which
| dwarfs the size of the columnar storage extension (in Zig).
| dgllghr wrote:
| Including the sqlite source is for the convenience of
| running unit tests without needing to rely on sqlite as an
| external dependency :)
| BiteCode_dev wrote:
| DuckDB can read from several heterogeneous sources, but is read
| only.
|
| SQLite is limited to its own format but can write.
|
| So very different use cases.
| speedgoose wrote:
| Actually DuckDB has its own database format for persistence
| and can support writes. It's not obvious from their
| documentation. People told me that on HN not long ago, as I
| thought the same as you.
| BiteCode_dev wrote:
| Oh, thanks, TIL.
| remram wrote:
| It can also write Parquet files and other formats.
| orlp wrote:
| DuckDB isn't optimized for writing speed, but it is not read-
| only.
| jitl wrote:
| I'm looking at some DuckDB code right here that does `INSERT
| INTO ... ON CONFLICT (id) DO UPDATE SET` so it can definitely
| do upserts :)
| BiteCode_dev wrote:
| Ok, I retract.
| cvalka wrote:
| It's called DuckDB.
| speedgoose wrote:
| If you are into alternative storage engines for SQLite, there is
| also an LSM (Log-Structured Merge-tree) extension in the main
| repository that is neither announced nor documented but seems to
| work. It's based on the SQLite 4 project.
|
| https://github.com/sqlite/sqlite/tree/master/ext/lsm1
|
| https://www.charlesleifer.com/blog/lsm-key-value-storage-in-...
| conradev wrote:
| "SQLite4 was an experimental rewrite of SQLite that was active
| from 2012 through 2014. All development work on SQLite4 has
| ended. Lessons learned from SQLite4 have been folded into the
| main SQLite3 product. SQLite4 was never released. There are no
| plans to revive it. You should be using SQLite3."
|
| https://sqlite.org/src4/doc/trunk/www/index.wiki
| alexgarcia-xyz wrote:
| This is an awesome project, I love SQLite extensions and I think
| they have a ton of use. Giant props to Dan here, I haven't seen
| many SQLite extensions written in Zig, and I'm learning a ton
| reading through the source code.
|
| The column-oriented data is stored in large BLOBs inside of
| regular SQLite tables. It uses the SQLite incremental BLOB I/O
| API [0] to incrementally read/write data in a column oriented
| way.
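|
| To make that concrete, here's a rough sketch of reading one
| segment through that API (the table/column names are made up,
| not stanchion's actual schema), assuming an open db handle and
| a known rowid:
|
|     #include <sqlite3.h>
|
|     /* Read the first 16 bytes of the BLOB stored at `rowid` in
|     ** segments.data without pulling the whole value into memory. */
|     static int read_segment_header(sqlite3 *db, sqlite3_int64 rowid,
|                                    char header[16]) {
|       sqlite3_blob *seg;
|       int rc = sqlite3_blob_open(db, "main", "segments", "data",
|                                  rowid, /* read-only */ 0, &seg);
|       if (rc != SQLITE_OK) return rc;
|       rc = sqlite3_blob_read(seg, header, 16, /* offset */ 0);
|       sqlite3_blob_close(seg);
|       return rc;
|     }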
|
| However, this project (and other SQLite extensions) will
| eventually hit a limit with SQLite's virtual table API. When you
| create a virtual table, you can perform a number of optimizations
| on queries. For example, SQLite will tell your virtual table
| implementation the WHERE clauses that appear on the virtual
| table, any ORDER BYs, which columns are SELECT'ed, and other
| limited information. This allows extension developers to do
| things like predicate + projection pushdowns to make queries
| faster.
|
| Unfortunately, it doesn't offer many ways to make analytical
| queries faster. For example, no matter what you do, a `SELECT
| COUNT(*) FROM my_vtab` will always iterate through every single
| row in your virtual table to determine a count. There's no
| "shortcut" to provide top-level counts. Same with other aggregate
| functions like SUM() or AVG(): SQLite will perform full scans
| and do the calculations itself.
|
| So for this project, while column-oriented datasets could make
| analytical queries like that much faster, the SQLite API does
| limit you quite a bit. I'm sure there are workarounds for this
| (custom UDFs or exposing other query systems), but they would be
| hefty to add.
|
| That being said, I still love this project! Really would love to
| see if there's any size benefit to this, and will definitely
| contribute more when I get a chance. Great job Dan!
|
| [0] https://www.sqlite.org/c3ref/blob_open.html
| zzo38computer wrote:
| This is true, and I had similar issues with the virtual
| table API. They did add some things, but they won't do
| everything. There are some other problems with SQLite as well.
| However, there are benefits of SQLite, and I wrote some
| extensions, too.
| jerrysievert wrote:
| > The column-oriented data is stored in large BLOBs inside of
| regular SQLite tables.
|
| this is similar to how citus and hydra columnar engines for
| postgres work - it seems to be a fairly successful use-case.
| dgllghr wrote:
| Thanks for the kind words, Alex! You have probably written more
| great sqlite extensions than anyone, so your feedback is very
| meaningful.
|
| You are right that there are limitations. The sqlite virtual
| table API is very clearly designed with row-oriented access in
| mind. However, there are still ways that stanchion makes
| analytical queries faster: lazily loading segments (which are
| the BLOBs that contain the actual values) so data that is not
| accessed is skipped, using encodings like bit packing to reduce
| the size of data that needs to be traversed on disk, and
| allowing users to specify a clustered index to ensure records
| are ordered in a way that allows querying to minimize data
| access.
|
| One area that I want to explore more deeply is the
| xFindFunction callback in the virtual table API. It allows
| a virtual table to "overload" a function with its own
| implementation. I believe there are some opportunities to work
| around some of the limitations you are describing, but I'm not
| even sure at this point if they can apply to aggregate
| functions.
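|
| For anyone curious, the callback looks roughly like this. The
| overload of length() here is purely hypothetical, just to show
| the mechanism; it doesn't help with aggregates:
|
|     /* Hypothetical: substitute our own implementation of length()
|     ** when it is applied to this virtual table's columns. */
|     static void vtabLengthImpl(sqlite3_context *ctx, int argc,
|                                sqlite3_value **argv) {
|       (void)argc; (void)argv;
|       /* ...answer from segment metadata instead of the value... */
|       sqlite3_result_int64(ctx, 0);
|     }
|
|     static int vtabFindFunction(sqlite3_vtab *pVtab, int nArg,
|                                 const char *zName,
|                                 void (**pxFunc)(sqlite3_context *, int,
|                                                 sqlite3_value **),
|                                 void **ppArg) {
|       (void)pVtab;
|       if (nArg == 1 && sqlite3_stricmp(zName, "length") == 0) {
|         *pxFunc = vtabLengthImpl;
|         *ppArg = 0;
|         return 1; /* non-zero: use the overloaded implementation */
|       }
|       return 0; /* 0: keep SQLite's built-in */
|     }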
|
| This is all theoretical until there are solid benchmarks, which
| is something that I want to add in the near term. And if you
| know of any workarounds to the limitations that you think may
| be useful, I am all ears!
| alexgarcia-xyz wrote:
| That's great to hear! The clustered index sounds really cool.
| Especially since SQLite tells you about ORDER BYs in
| xBestIndex (with the nOrderBy[0]), so it would be super cool
| to have super-fast ORDER BYs with those.
|
| Very interested to see how xFindFunction works for you. One
| limitation I've found is that you don't know inside of
| xBestIndex (ie at query time) whether a user calls an overloaded
| function, unless 1) it's part of a WHERE clause and 2) only two
| arguments are provided, the first being a column value and the
| 2nd any literal. I've found this limiting in the past, only having 1
| argument to work with in that narrow case. But I'm sure
| there's clever tricks there!
|
| One trick I've noticed: you can kind of detect a COUNT(*) with
| the `colUsed` field in xBestIndex. In that case, `colUsed` will
| be 0 (ie 0 columns are requested), so you can use that as a
| signal to just iterate N times instead of accessing the
| underlying data. Still slow, but you can probably do something
| like ~1 million rows/sec, which is better than accessing the
| data that many times!
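|
| Roughly like this in xBestIndex (a sketch only; the idxNum flag
| and what the cursor does with it are entirely up to the
| extension):
|
|     static int vtabBestIndex(sqlite3_vtab *pVtab,
|                              sqlite3_index_info *pInfo) {
|       (void)pVtab;
|       if (pInfo->colUsed == 0) {
|         /* No columns referenced, e.g. SELECT COUNT(*): signal our
|         ** xFilter/xNext (via idxNum) to only count rows and skip
|         ** decoding any column data. */
|         pInfo->idxNum = 1;          /* our own "count only" flag */
|         pInfo->estimatedCost = 1.0; /* make this plan attractive */
|       }
|       return SQLITE_OK;
|     }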
|
| [0] https://www.sqlite.org/vtab.html#order_by_and_orderbycons
| ume...
| m1sta_ wrote:
| Is there any value in just treating rows as columns and vice
| versa?
| nattaylor wrote:
| This is particularly interesting to me for Android/iOS. I can't
| even picture the use case where there'd be enough data on the
| device for the row-based format to be a bottleneck, but maybe some
| case that involves many, many aggregations
| simcop2387 wrote:
| Maybe not a bottleneck, but it could improve battery life
| there for those tasks by performing better.
| jitl wrote:
| Note that DuckDB can ingest SQLite tables. If you need OLAP on
| your SQLite system today instead of whenever this is stable,
| consider pairing up DuckDB. In the simplest case if you don't
| need amazing latency and have small data (<1gb), I would copy the
| whole table from SQLite into an in-memory or temp file DuckDB, do
| your OLAP queries, then throw it away. For larger datasets, you
| can incrementally replicate the SQLite table into DuckDB by
| adding a logical clock column or something to the SQLite table,
| and then copy rows where logical_clock>last_duckdb_change before
| running your next DuckDB query.
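|
| Something like this via DuckDB's C API (just a sketch: the
| db/table/column names are invented, the target table is assumed
| to already exist, and double-check the sqlite extension docs for
| the exact ATTACH syntax):
|
|     #include "duckdb.h"
|
|     int main(void) {
|       duckdb_database db;
|       duckdb_connection con;
|       duckdb_open("analytics.duckdb", &db);
|       duckdb_connect(db, &con);
|
|       duckdb_query(con, "INSTALL sqlite;", NULL);
|       duckdb_query(con, "LOAD sqlite;", NULL);
|       /* Attach the OLTP SQLite database, then pull only rows that
|       ** are newer than what we already copied. */
|       duckdb_query(con, "ATTACH 'app.db' AS src (TYPE SQLITE);", NULL);
|       duckdb_query(con,
|         "INSERT INTO events_olap "
|         "SELECT * FROM src.events "
|         "WHERE logical_clock > "
|         "  (SELECT coalesce(max(logical_clock), 0) FROM events_olap);",
|         NULL);
|
|       duckdb_disconnect(&con);
|       duckdb_close(&db);
|       return 0;
|     }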
|
| We're currently doing a bake-off between a few databases for
| small datasets (<10m rows) with dynamic schemas, and have pretty
| comparable read latencies between DuckDB and SQLite for our
| workflow, so you might be able to get away with switching
| wholesale from SQLite to DuckDB if you don't do a lot of updates.
| chrisjc wrote:
| Adding to this.
|
| https://duckdb.org/2024/01/26/multi-database-support-in-duck...
|
| Unsure what the parent means by "ingest" SQLite tables
| (although I believe you can use COPY with sqlite tables in
| DuckDB), but you can interact with sqlite tables in DuckDB
| using the extension to attach to a sqlite db.
|
| https://duckdb.org/docs/extensions/sqlite
| jitl wrote:
| Yeah by ingest I mean COPY FROM or attach via the SQLite
| extension. Then you can `INSERT INTO duckdbTable (select *
| from sqliteDb.sqliteTable)` or similar. I don't think a duckdb
| aggregate over the native SQLite format will have much advantage;
| based on my reading of the duckdb extension code, the native
| duckdb format would be necessary to see a big advantage over
| doing the aggregate in SQLite directly.
| dgllghr wrote:
| DuckDB is great! But row-oriented storage is also great for
| many use cases. My goal is to provide a way for people to have
| access to the benefits of column-oriented storage without
| leaving behind the benefits of OLTP. Many people are already
| using sqlite and might not have the time/energy/budget to make
| a full switch to a new database. I also think there should be
| many options for embedded columnar storage, and right now there
| really aren't.
| breadchris wrote:
| It seems like there are a lot of extensions that are being built
| for sqlite. I would like to use these extensions, but I am
| skeptical about their support over time. I like sqlite for how
| freakin stable it is. How do people feel about sqlite extensions?
| yellowapple wrote:
| SQLite being as stable as it is means that even an unmaintained
| extension will probably continue to work for a very long time.
| thamer wrote:
| The "Data Storage Internals" section[1] of the README sounds to
| me like it has its own column-oriented format for these tables,
| at least that's how I'm reading the part about segments. Is that
| the case? If so, have you tried using Apache Arrow or Parquet to
| see how they compare?
|
| [1] https://github.com/dgllghr/stanchion#data-storage-internals
| dgllghr wrote:
| Yes it does. I have found it easier to start simple than to try to
| integrate the complexity of Parquet whole hog. The Parquet
| format is also designed with the idea of having multiple
| columns in the same "file". I'm sure there are ways to split
| the components of a Parquet file across multiple BLOBs and be
| able to append those BLOBs together to create a valid Parquet
| file, but that is more complexity and does not lend itself to
| existing Parquet code. Keeping the blobs separate is valuable
| because it means not touching data for columns that are not
| being read in queries.
|
| My understanding is that Arrow has really focused on an in-
| memory format. It is binary, so it can be written to disk, but
| they are seemingly just scratching the surface on compression.
| Compression is a big advantage of columnar storage because
| really cool compression schemes like bit packing, run-length
| encoding, dictionary compression, etc. can be used to
| significantly reduce the size of data on disk.
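|
| As a toy illustration (not stanchion's actual encoding): bit
| packing values that fit in 4 bits stores two per byte, so a
| column of a million small integers takes ~0.5 MB instead of the
| ~4 MB it would as int32.
|
|     #include <stddef.h>
|
|     /* Toy bit packing: pack 4-bit values two per output byte.
|     ** Returns the number of bytes written to `out`. */
|     static size_t pack4(const unsigned *in, size_t n,
|                         unsigned char *out) {
|       size_t o = 0;
|       for (size_t i = 0; i < n; i += 2) {
|         unsigned char lo = (unsigned char)(in[i] & 0xF);
|         unsigned char hi =
|             (i + 1 < n) ? (unsigned char)(in[i + 1] & 0xF) : 0;
|         out[o++] = (unsigned char)(lo | (hi << 4));
|       }
|       return o;
|     }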
|
| Arrow interop especially is a great idea, though, regardless of
| the storage format. And if they are developing a storage-
| friendly version of Arrow, I would certainly consider it.
| didgetmaster wrote:
| Is this really an important feature for DB users here? I built my
| own hobby database system that was column oriented (using the
| metadata tagging system I invented for a file system replacement
| I built). It does lightning fast analytics and even outperforms
| SQLite in a bunch of other operations. I posted a benchmark video
| on my YouTube channel and made a free beta download available;
| but the response has been lukewarm at best.
|
| https://YouTube.com/@didgetsdemos https://www.didgets.com
| thecowgoes wrote:
| Can't wait to see compression!
|
| Any timeline for that?
___________________________________________________________________
(page generated 2024-02-01 23:00 UTC)