[HN Gopher] Rearchitecting: Redis to SQLite
       ___________________________________________________________________
        
       Rearchitecting: Redis to SQLite
        
       Author : thecodemonkey
       Score  : 227 points
       Date   : 2024-09-25 08:37 UTC (14 hours ago)
        
 (HTM) web link (wafris.org)
 (TXT) w3m dump (wafris.org)
        
       | rini17 wrote:
        | If you need writes, you can just use a second sqlite database.
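
        A minimal sketch of the two-database idea in Python's sqlite3 (file
        and table names here are hypothetical): the synced-down database
        stays effectively read-only, and a second attached database takes
        any local writes.

        ```python
        import sqlite3

        # Main DB is the distributed, read-mostly rules file; the attached
        # DB takes any local writes (hypothetical schema).
        con = sqlite3.connect("rules.db")
        con.execute("ATTACH DATABASE 'local_writes.db' AS w")
        con.execute("CREATE TABLE IF NOT EXISTS w.command_log (cmd TEXT, ts REAL)")
        con.execute("INSERT INTO w.command_log (cmd, ts) VALUES (?, ?)",
                    ("block 1.2.3.4", 1727250000.0))
        con.commit()
        ```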
        
         | HelloNurse wrote:
         | It would be a small command log (a batch of requested changes
         | from that client) with a completely different schema from the
         | main database.
         | 
         | But if we are sending deltas to a central server performance
         | isn't critical: there can be a traditional web service to call,
         | without uploading databases.
        
       | ten13 wrote:
       | Nice post! I'm curious how the SQLite-per-instance model works
       | for rate-limiting in the scale-out scenario. I took a cursory
       | glance at the docs but nothing jumped out at me about how it
       | works.
        
         | michaelbuckbee wrote:
         | Post author and Wafris co-founder here. Conceptually "rate
         | limiting to prevent abuse" (what we're doing here) and "rate
          | limiting for API throttling" have different levels of
          | tolerance.
         | 
         | With that in mind, it's setting higher levels of limiting and
         | doing the math to push that out over many
         | machines/instances/dynos. That helps for things like scraping
         | prevention, etc.
         | 
         | For issues like credential stuffing attacks, you'd want a lower
         | limit but also coupled with mitigations like IP bans, IP
          | reputation, etc. to deal with the underlying issue.
        
       | matharmin wrote:
       | It sounds like a niche use case where SQLite does work quite well
       | server-side without needing any replication, since the database
       | is read-only.
       | 
       | Other alternatives may use static files loaded in-memory, but I'm
       | guessing the data is more than you'd want to keep in memory in
       | this case, making SQLite a nice alternative.
        
         | michaelbuckbee wrote:
         | (article author here) - yes 100% and I hope that came through
          | in the article that this is a great solution given our particular
         | use case and that it's not a 1:1 swap out of Redis or Postgres.
        
         | chipdart wrote:
         | > Other alternatives may use static files loaded in-memory, but
         | I'm guessing the data is more than you'd want to keep in memory
         | in this case, making SQLite a nice alternative.
         | 
         | Ultimately a RDBMS like SQLite is what you'd get if you start
         | with loading static files into memory and from that point
         | onward you add the necessary and sufficient features you need
         | to get it to work for the most common usecases. Except it's
          | rock solid, very performant, and exceptionally well tested.
        
       | simonw wrote:
       | I'm really interested in this model where each application server
       | has a copy of a SQLite database file which is then replaced on a
       | scheduled basis.
       | 
       | Here it's being used for web application firewall rules.
       | 
       | Another place I've thought about using this is feature flag
       | configuration. Feature flags can be checked dozens of times per
       | request and often need the kind of queries (user is a member of
       | group A and has an IP located in country B) which could be well
       | served by a local SQLite - and feature flags have a tolerance for
       | updates taking a few seconds (or longer) to roll out.
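
        A sketch of what such a check could look like against a local
        SQLite copy; the schema (flags, group_members) and the country
        resolution are assumptions for illustration.

        ```python
        import sqlite3

        def flag_enabled(con, flag_name, user_id, country):
            # One local read: flag on, user in the required group, and
            # request country matching (hypothetical schema).
            row = con.execute(
                """SELECT 1
                     FROM flags f
                     JOIN group_members g ON g.group_name = f.required_group
                    WHERE f.name = ? AND f.enabled = 1
                      AND g.user_id = ? AND f.required_country = ?""",
                (flag_name, user_id, country),
            ).fetchone()
            return row is not None

        con = sqlite3.connect("flags.db")  # local copy, replaced on a schedule
        ```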
        
         | michaelbuckbee wrote:
         | SQLite for distribution is neat. FWIW - this is at least
          | partially inspired by your datasette project, which we may
          | still try to do something with later on the reporting and data
          | exploration side of things.
        
         | supriyo-biswas wrote:
         | > I'm really interested in this model where each application
         | server has a copy of a SQLite database file which is then
         | replaced on a scheduled basis.
         | 
         | BTW, this is also the model used by all CDNs, where the global
         | configuration file containing the certificates, HTTP routing
          | rules etc. for all customers will be updated into a
         | single-file b-tree structure*, and that "bundle" is distributed
         | among all edge locations frequently.
         | 
          | * I've yet to see someone use sqlite for this purpose; it's
          | usually DBM-style databases like LMDB or Kyoto Cabinet.
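
        For illustration, a read against such a single-file bundle with the
        py-lmdb bindings might look like this; the path and key layout are
        hypothetical.

        ```python
        import lmdb  # py-lmdb bindings

        # Open the distributed config bundle read-only: each lookup is a
        # local b-tree traversal, with no server process involved.
        env = lmdb.open("/var/edge/config.lmdb", subdir=False,
                        readonly=True, lock=False)
        with env.begin() as txn:
            routing = txn.get(b"customer:1234:routing")  # hypothetical key
        ```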
        
           | redserk wrote:
           | I've worked on a project a long time ago where we did this
           | with BerkeleyDB files. BDB was used to store configuration
           | data that was frequently looked up.
           | 
           | Periodically we would run a full sync to replace the
           | database. Between the periodic full syncs, we had a
           | background process keep changes applied on a rolling basis.
           | 
           | All-in-all, it worked pretty well at the time! The full
           | database file sync ensured a bad database was timeboxed and
           | we got a bootstrapping mechanism for free.
        
           | Capricorn2481 wrote:
            | > will be updated into a single-file b-tree structure
           | 
            | I'm not knowledgeable on this, but my understanding was that
            | a b-tree is a way of storing values that can be ordered in a
            | certain way. Like this would be a b-tree of IDs:
           | 
            | ```
            |                [8]
            |               /    \
            |         [3, 5]      [10, 12]
            |        /  |  \       /    \
            |     [1] [4] [6,7]  [9] [11, 13]
            | ```
           | 
           | You traverse by comparing your needle to the root node and
           | going left or right depending on the results.
           | 
           | How is that done with configuration options? That seems like
           | it would just be a regular hashmap which is already efficient
           | to read. What would a b-tree of key/values even look like
           | that wouldn't be less efficient than a hashmap?
        
             | lucianbr wrote:
             | A hash table makes sense in memory. If it's loaded just
             | right for fast access, it has holes - empty entries. That
             | makes little sense if you are building a file that will be
              | transferred to many places over the internet. Bandwidth
             | waste would be significant.
             | 
             | So it might seem that simply enumerating the data (sorted
             | or not) would be a better option for a file. (After all,
             | the receiver will read everything anyway.) I guess frequent
             | updates make this inefficient, so a tree helps.
        
             | remram wrote:
             | Each number in your btree would actually be a key-value
             | pair. So you can find the key fast, and then you have the
             | value.
             | 
             | Databases including SQLite usually use b+tree for tables (a
             | variant where only the leaves have data, the interior nodes
             | only have keys) and regular btrees for indexes.
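
        The essence of that point in a few lines: within a node the keys
        are kept sorted, lookup is a binary search to a key/value pair,
        and unlike a hash map the layout has no empty buckets and supports
        range scans.

        ```python
        import bisect

        keys   = [3, 5, 8, 10, 12]                 # sorted keys in one node
        values = ["v3", "v5", "v8", "v10", "v12"]  # paired values (or child pointers)

        def lookup(k):
            i = bisect.bisect_left(keys, k)        # binary search
            if i < len(keys) and keys[i] == k:
                return values[i]
            return None  # in a real b-tree you'd descend into child i here

        print(lookup(10))  # -> "v10"
        print(keys[1:4])   # a range scan (5..10) is just a contiguous slice
        ```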
        
           | twic wrote:
           | > Kyoto Cabinet
           | 
           | Now, that's a name I've not heard in a long time.
           | 
           | Are people still using Kyoto Cabinet in new projects? Are
           | people still using DBM-style storage generally? I thought
           | that whole branch of the evolutionary tree had sort of died
           | out.
        
             | supriyo-biswas wrote:
             | > Are people still using Kyoto Cabinet in new projects?
             | 
             | Cloudflare used to use Kyoto Cabinet[1] and moved to
              | LMDB[2] in 2020; other implementations that I'm familiar
             | with (but don't have a link to share) also use LMDB.
             | 
             | > Are people still using DBM-style storage generally?
             | 
              | It's fairly common in these scenarios, and it's also the
              | underlying key-value store for popular software like
              | Consul[3].
             | 
              | [1] https://blog.cloudflare.com/kyoto-tycoon-secure-replication/
             | 
              | [2] https://blog.cloudflare.com/introducing-quicksilver-configur...
             | 
             | [3] https://github.com/hashicorp/consul/issues/8442
        
           | d0mine wrote:
            | You can use sqlite as a dbm db:
            | https://docs.python.org/3.13/library/dbm.html#module-dbm.sql...
        
         | quesera wrote:
         | > _Feature flags can be checked dozens of times per request_
         | 
          | My strategy for resolving this is to fetch the flag value once
          | and store it in the request object, so that a) you never
         | have to take the expensive lookup hit more than once per
         | request, and b) there's no risk of an inconsistent value if the
         | flag is updated mid-request.
        
           | CraigJPerry wrote:
           | What's the use case for re-checking the same feature flag in
           | a single session?
           | 
           | I can see why you need to check multiple different flags in a
           | session and I understand the parent point about looking in
           | SQLite for them (effectively a function call into a library
           | in process address space rather than a call over the network
           | for each flag).
        
             | quesera wrote:
             | Sorry, s/session/request/g; corrected above.
             | 
             | One example is a multistep transaction processing request.
             | The feature flag could gate several branch points.
             | 
             | A memory-mapped SQLite file is great too, but the strategy
             | I describe above is less code to write, adds no new
             | dependencies, is quicker to implement, avoids the SQLite
             | file distribution/availability issues, and should get you a
             | very similar performance improvement.
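
        A sketch of that strategy; Request and fetch_flag are hypothetical
        stand-ins for the framework's request object and the expensive
        lookup.

        ```python
        class Request:
            """Carries a per-request cache of resolved feature flags."""

            def __init__(self):
                self._flags = {}

            def flag(self, name):
                # Pay the expensive lookup once; later checks in the same
                # request reuse the value, so a mid-request flag flip can't
                # gate the same transaction inconsistently.
                if name not in self._flags:
                    self._flags[name] = fetch_flag(name)  # hypothetical
                return self._flags[name]
        ```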
        
           | jitl wrote:
           | Where is the "session object" stored?
        
             | quesera wrote:
             | Apologies, I meant "request object". Corrected above.
        
         | bob1029 wrote:
         | > a SQLite database file which is then replaced on a scheduled
         | basis.
         | 
         | You could look into WAL replication if you wanted an efficient
         | way to update the copies. Something like Litestream.
        
           | er0k wrote:
           | litestream is great. Dead simple and rock solid IME.
        
         | chipdart wrote:
         | > Feature flags can be checked dozens of times per request and
         | often need the kind of queries (user is a member of group A and
         | has an IP located in country B) which could be well served by a
         | local SQLite - and feature flags have a tolerance for updates
         | taking a few seconds (or longer) to roll out.
         | 
         | This doesn't sound right. A feature flag only requires checking
         | if a request comes from a user that is in a specific feature
         | group. This is a single key:value check.
         | 
          | The business logic lies in assigning a user to a specific
          | feature group, which in the simplest case means pre-assigning
          | the user and in the most complex cases takes place at app
          | start/first request to dynamically control dialups.
          | 
          | Either way, it's a single key:value check where the key is
          | user ID+feature ID, or session ID+feature ID.
         | 
          | I mean, I guess you can send a boatload of data to perform the
          | same complex query over and over again. I suppose. But that
          | means you haven't invested any thought into the issue and have
          | insisted on making things very hard for you, QA, and users too.
          | I mean, read your own description: why are you making the exact
          | same complex query over and over and over again, multiple times
          | in the same request? At most, do it once, cache the result, and
          | from there on just do a key:value check. You can use sqlite for
          | that if you'd like.
        
           | nnf wrote:
           | GP's comment is talking about checking multiple feature
           | flags, not checking a single feature flag multiple times.
        
             | chipdart wrote:
             | GP referred specifically to queries checking if "user is a
             | member of group A and has an IP located in country B".
             | 
             | The number of feature flags is irrelevant. In fact, the
             | feature flag and A/B testing services I used always
             | returned all default treatment overrides in a single
             | request.
        
           | simonw wrote:
           | I've worked at places where the feature flag system was much
           | more dynamic than that, considering way more than just
           | membership in a group.
           | 
           | This meant you could roll features out to:
           | 
           | - Specific user IDs
           | 
           | - Every user ID in a specific group
           | 
           | - Every object owned by a specific user ID (feature flags
           | might apply to nested objects in the system)
           | 
           | - Requests from IP addresses in certain countries
           | 
           | - Requests served by specific website TLDs
           | 
           | - Users who are paid members of a specific plan
           | 
           | - etc etc etc
           | 
            | It was an enormously complicated system that had evolved
           | over 5-10 years.
           | 
           | Not saying that level of complexity is recommended, but
           | that's what we had.
           | 
           | Looks like I gave a talk about this back in 2014:
           | https://speakerdeck.com/simon/feature-flags
        
         | nnf wrote:
         | I've wanted to implement this on a distributed web server
         | environment I manage. Right now there's a centralized MySQL
         | database that the web servers read from when rendering a web
         | page, but there can be lots of queries for a single render
         | (page, sections, snippets, attributes, assets, etc.), and
         | sending that all over the wire, while fast, is slower than
         | reading from a database running on the same host. It'd be great
         | to be able to copy the "master" database onto each web server
         | instance, maybe once per minute, or just on-demand when a
         | change to the data is made. I imagine this would make reads
         | much faster.
        
           | pkhuong wrote:
           | That's how https://github.com/backtrace-labs/verneuil 's read
           | replication is meant to be used. There's a command-line tool
           | to recreate a sqlite DB file from a snapshot's manifest, with
           | an optional local cache to avoid fetching unchanged pages, or
           | you can directly use a replica in memory, with pragmas for
           | (async) refreshes.
           | 
           | The write tracking needs to intercept all writes with a
           | custom VFS, but once registered and configured, it's regular
           | in-memory SQLite (no additional daemon).
        
         | jitl wrote:
         | We used this model to distribute translations, feature flags,
         | configuration, search indexes, etc at Airbnb. But instead of
         | SQLite we used Sparkey, a KV file format developed by Spotify.
         | In early years there was a Cron job on every box that pulled
         | that service's thingies; then once we switched to Kubernetes we
          | used a daemonset & host tagging (taints?) to pull a variety of
         | thingies to each host and then ensure the services that use the
         | thingies only ran on the hosts that had the thingies.
         | 
         | In Ruby we called this "hammerspace"
         | https://github.com/airbnb/hammerspace
        
           | michaelbuckbee wrote:
           | 1. Very cool
           | 
           | 2. If you were making it today would you just use SQLite?
        
             | skevy wrote:
             | Yes, we would. And likely will switch to SQLite at some
              | point in the future. (source: I work adjacent to these
              | systems at Airbnb)
        
         | otoolep wrote:
          | rqlite[1] could basically do this, if you use read-only
          | nodes[2], though it's not quite a drop-in replacement for
          | SQLite on the write side. But from the point of view of clients
          | at the edge, they see a SQLite database being updated which
          | they can directly read[3].
         | 
         | That said, it may not be practical to have hundreds of read-
         | only nodes, but for moderate-size needs, should work fine.
         | 
         | Disclaimer: I'm the creator of rqlite.
         | 
         | [1] https://rqlite.io/
         | 
         | [2] https://rqlite.io/docs/clustering/read-only-nodes/
         | 
         | [3] https://rqlite.io/docs/guides/direct-access/
        
           | michaelbuckbee wrote:
           | (author here) I had looked at both Rqlite and the different
           | commercial versions of this, but I didn't pursue them as they
           | all seemed to require running an additional service on the
           | host machines.
        
             | otoolep wrote:
             | Yes, that is right, it would require a new service running
             | on the host machines.
             | 
             | That said, I do think it depends on what you consider
             | important, and what your experience has been in the past. I
             | used to value simplicity above all, so reducing the number
             | of moving pieces was important to my designs. For the
             | purpose of this discussion let's count a service as a
             | single moving piece.
             | 
             | But over time I've decided that I also value _reliability_.
              | Operators don't necessarily want simplicity. What they
             | want is reliability and ease-of-use. Simplicity sometimes
             | helps you get there, but not always.
             | 
             | So, yes, rqlite means another service. But I put a lot of
             | emphasis on reliability when it comes to rqlite, and ease-
             | of-operation. Because often when folks want something
             | "simple" what they really want is "something that just
             | works, works really well, and which I don't have to think
             | about". SQLite certainly meets that requirement, that is
             | true.
        
         | closeparen wrote:
         | This is the type of architecture we use for feature flagging,
         | but it's just a JSON file.
        
         | Salgat wrote:
         | We use something similar for our event based databases. We
         | project millions of events into an in-memory object (usually a
         | dictionary), and periodically persist that state as json in S3.
         | It's guaranteed to be consistent across all service instances
         | because the persisted state contains the event stream positions
         | to catchup from. The only drawback of course is that it can use
         | up to several GB of memory depending on how big the state is,
         | but it's extremely fast and doing in-memory operations is
         | trivial compared to using SQL or a library with api calls.
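
        A sketch of that snapshot-plus-position pattern, with a local file
        standing in for S3 and read_events as a hypothetical event store
        client.

        ```python
        import json

        state, position = {}, 0  # in-memory projection and stream position

        def apply(event):
            state[event["key"]] = event["value"]

        def save_snapshot(path="snapshot.json"):
            # Persisting the position alongside the state is what keeps
            # instances consistent: each catches up from the same point.
            with open(path, "w") as f:
                json.dump({"position": position, "state": state}, f)

        def load_and_catch_up(path="snapshot.json"):
            global state, position
            with open(path) as f:
                snap = json.load(f)
            state, position = snap["state"], snap["position"]
            for ev in read_events(since=position):  # hypothetical call
                apply(ev)
                position = ev["position"]
        ```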
        
         | TmpstsTrrctta wrote:
          | I've utilized this in Lambda@Edge for a use case that's half
          | feature-flag-y, half HTTP routing/proxy serving rules, as
          | mentioned in a sibling comment. Lambdas pick up ~50-200MB of
          | data on first
         | boot, and keep their copy through their lifecycle.
         | 
          | As requests come in, we gather their features, convert them to
          | effectively an int vector, filter for the matching row, and
          | serve the request.
        
         | adroitboss wrote:
         | You may want to check out LiteFS and LiteStream by benbjohnson.
          | There was a time in 2022 when he was trending every week for
         | his projects. The following snippet is taken from the LiteFS
         | webpage. "LiteFS is a distributed file system that
         | transparently replicates SQLite databases. You can run your
         | application like it's running against a local on-disk SQLite
         | database but behind the scenes the database is replicated to
         | all the nodes in your cluster."
        
       | dangoodmanUT wrote:
        | I have a hard time believing that Redis local was beaten by
        | SQLite local unless the workload was poorly suited to Redis
        | structures, or
       | the integration code wasn't well written.
       | 
       | But always happy to see a discovery of a better solution. I agree
       | removing the network is a win.
        
         | michaelbuckbee wrote:
         | In Redis, the data is a sorted-set that we forced into being
         | lexicographically ordered by setting all the scores to 0. We
          | went through a lot of iterations of it and, to be clear, it's
          | not _slow_, it's just not as fast as essentially `fopen`.
         | 
         | 1 - Redis sorted sets -
         | https://redis.io/docs/latest/develop/data-types/sorted-sets/
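
        In redis-py terms the trick looks roughly like this (key name and
        padding width are illustrative): with every score at 0, Redis
        orders members lexicographically, and ZRANGEBYLEX can then scan
        the encoded IPs.

        ```python
        import redis

        r = redis.Redis()
        # Score 0 for every member forces lexicographic ordering.
        r.zadd("ip_ranges", {"003232235520": 0,   # 192.168.0.0 as a padded int
                             "003232301055": 0})  # 192.168.255.255 likewise

        # '[' makes the bound inclusive; this scans members between two IPs.
        hits = r.zrangebylex("ip_ranges", "[003232235520", "[003232301055")
        ```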
        
         | epcoa wrote:
          | I do agree it is somewhat fishy that the large performance
          | difference isn't explained by comparatively, fundamentally
          | poor data access patterns.
         | 
         | However, Redis runs as an out of process server with marshaling
         | and unmarshaling of data across sockets. SQLite is in process
         | and with a prepared query is basically one library call to a
         | purpose built data access VM. So I'm not sure why it would be
         | hard to believe this cache and TLB friendly setup can beat
         | Redis.
        
           | mrl5 wrote:
            | Thank you. First explanation of what might be the root cause :)
        
       | masfoobar wrote:
       | NICE!
       | 
       | I have not used Redis myself, but have been using Sqlite more and
        | more over the years... and found it a perfect fit for an
        | application I wrote with Sqlite under the hood.
        | 
        | Powerful and convenient database system!
        
       | aquilaFiera wrote:
       | Somewhat related: for the Neon internal hackathon a few weeks ago
       | I wrote a little Node.js server that turns Redis's wire protocol
       | (RESP) into Postgres queries. Very fun hack project:
       | https://github.com/btholt/redis-to-postgres
        
       | ragu4u wrote:
       | So is the sqlite file on disk or in memory somehow?
        
         | michaelbuckbee wrote:
          | The sqlite db is on disk, synced down to the clients from our
          | service.
         | 
         | The client is responsible for checking with our servers and, if
         | rule updates are found, downloading a new database file. To
         | avoid locking and contention issues, these are each uniquely
         | named, and which DB is "current" is just updated.
         | 
         | Note: This is only in "managed" mode. If you'd rather, you can
         | distribute a SQLite database of the rules alongside your app.
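
        One way such a "current database" swap can be done (a sketch; the
        symlink approach and file names are assumptions, not necessarily
        Wafris's mechanism):

        ```python
        import os

        def promote(new_db_path, current_link="current.db"):
            # Point a temp symlink at the freshly downloaded database, then
            # atomically rename it over the "current" link. Readers holding
            # the old file handle are undisturbed; new connections get the
            # new file, and no one ever sees a half-written database.
            tmp = current_link + ".tmp"
            if os.path.lexists(tmp):
                os.remove(tmp)
            os.symlink(new_db_path, tmp)
            os.replace(tmp, current_link)  # atomic on POSIX

        promote("rules-2024-09-25T0800.db")
        ```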
        
         | TheDong wrote:
         | > on disk or in memory somehow?
         | 
         | Due to the magic of the page cache, the answer to that can be
         | "both".
         | 
         | If the sqlite database is being read often and not being
         | written, the page cache will be valid and reads will pretty
          | much never go to the filesystem.
        
       | macspoofing wrote:
       | >While Redis is "fast" in comparison to traditional RDBMS, it's
       | still a database that you have to manage connections, memory,
       | processes, etc., which introduces more brittleness into the stack
       | (the opposite of what we're trying to achieve).
       | 
       | Every database, Relational or Nonrelational, requires
       | approximately the same level of management and maintenance when
       | you start dealing with non-toy levels of transactions.
       | 
       | The "Fast" part is a little funny. If you don't care about joins,
       | then row inserts and retrievals are pretty damn fast too =)
        
         | gwbas1c wrote:
         | SQLite has its vacuum operation, which is kind-of like running
         | a garbage collection. Every time I read the docs about when to
         | run a vacuum, I end up confused.
         | 
         | The last time I shipped an application on SQLite, I ended up
         | just using a counter and vacuuming after a large number of
         | write operations.
        
           | prirun wrote:
           | HashBackup author here, been using SQLite for about 15 years.
           | 
           | Doing a vacuum after a large number of _deletes_ might make
           | sense. The only real purpose of vacuum IMO is to recover free
           | space from a database. Vacuum _may_ also optimize certain
           | access patterns for a short while, though I have never tested
           | this, and it would be highly dependent on the queries used.
           | If fragmentation is a bigger concern for you than recovering
           | free space, you can also compute the fragmentation to decide
           | whether to vacuum by using the dbstat table:
           | 
           | https://www.sqlite.org/dbstat.html
           | 
           | Then again, computing this will require accessing most of the
           | database pages I'm guessing, so might take nearly as long as
           | a vacuum. The other gotcha here is that just because db pages
           | appear to be sequential in a file doesn't mean they are
           | sequential on a physical drive, though filesystems do strive
           | for that.
           | 
           | SQLite has pragma commands to tell you the number of total
           | and free db pages. When the percentage of free pages is
           | greater than x% and it's a convenient time, do a vacuum. For
           | a highly volatile db, you can add a table containing this
           | percentage, update it every day, and make your decision based
           | on an average, but IMO it's easier just to check for more
           | than 50% free (or whatever) and do the vacuum.
           | 
           | Vacuums used to be (circa 2019) pretty slow operations, but
           | the SQLite team has sped them up greatly since then.
           | Vacuuming a 3GB SQLite db on a SSD takes less than a minute
           | these days. That's with the db 100% full; with only 50% used
           | pages, it would be considerably faster.
           | 
           | Vacuums are done in a statement transaction, so you don't
           | have to worry about a "half vacuum that runs out of disk
           | space" screwing up your database.
        
         | chipdart wrote:
         | > The "Fast" part is a little funny. If you don't care about
         | joins, then row inserts and retrievals are pretty damn fast too
         | =)
         | 
         | What makes SQLite exceptionally fast in a server environment is
         | that you do not require a network call to do the query or even
         | retrieve the data. Your remarks about joins and transactions
         | are meaningless once you understand you're just reading stuff
         | from your very own local HD, which is already orders of
         | magnitude faster.
        
       | justinclift wrote:
       | Wonder if they had indexes on their SQLite tables?
       | 
       | Not seeing a mention of that in the article.
        
         | michaelbuckbee wrote:
         | The answer is "yes." We had indexes - but it's also a little
         | more complicated than that, as we're storing IPv4 and IPv6
         | ranges in a single table in a format _designed_ to be indexed a
         | particular way.
         | 
         | In the article, we refer to this as "decimal lexical"
         | formatting, where we're taking the IPs and making them integers
         | but actually treating them as strings. We're doing this in both
         | Redis with sorted sets and then in a single table in SQLite.
         | 
         | I was going to explain all this in the article, but it was too
         | long already, so it will be a future blog post.
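
        Pending that post, here is a guess at the basic encoding: turn the
        IP into its integer value, then zero-pad to a fixed width so that
        string order matches numeric order and an ordinary TEXT index can
        answer range queries. The width and schema are assumptions.

        ```python
        import ipaddress

        def decimal_lexical(ip, width=39):  # 39 digits covers the IPv6 max
            return str(int(ipaddress.ip_address(ip))).zfill(width)

        decimal_lexical("192.168.0.1")  # 29 zeros followed by '3232235521'

        # An indexed TEXT column can then answer "which range contains
        # this IP?" with plain string comparison:
        #   SELECT * FROM ranges
        #   WHERE ? BETWEEN range_start AND range_end
        ```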
        
           | filleokus wrote:
           | Really great article and I really appreciate seeing this
           | "flavour" of "distributed" sqlite, think it can be useful in
           | many no/low-write scenarios.
           | 
           | But about the formatting of the data, is it completely
           | inherent to the rest of the system / unchangeable?
           | Spontaneously I would have guessed that for example a
           | bitfield in redis would have performed better. Did you test
            | any other formats?
        
           | a12b wrote:
            | You should definitely write an article with all the tricks
            | you used to make it fast!
        
           | epcoa wrote:
           | Curious, what is the advantage of decimal? Why not base-64 or
           | some larger and power of 2 base?
        
       | gwbas1c wrote:
       | How large is the SQLite database you're syncing?
       | 
       | Is it even "worth" using SQLite at this point? What about a
       | configuration file, and straight-up code that works with in-
       | memory data structures?
        
         | michaelbuckbee wrote:
          | This is something we seriously considered. The SQLite dbs are
          | several hundred megabytes in size (millions of IP ranges), so
          | while it would be technically doable to send around rules files
          | as JSON or something more specifically suited, there's still a
          | number of wins that SQLite gives us:
         | 
         | - Really strong support across multiple platforms (we have
         | clients for most of the major web frameworks)
         | 
          | - Efficiency: sure, we have lots of RAM on servers nowadays,
          | but on some platforms it's constrained, and if we don't have to
          | burn it, we'd just rather not.
         | 
         | - When we started mapping this out, we ended up with something
         | that looked like a JSON format that we were adding indexes
          | to... and then we were re-inventing SQLite.
        
         | wormlord wrote:
         | I don't know how it works exactly, but I believe you can have a
         | fully in-memory SQLite database. Bun's sqlite library and
         | SqlAlchemy both let you operate on in-memory SQLite db's which
         | you can then write to disk.
         | 
         | Edit: reading the docs it looks like it operates the same way,
         | just reading sections of the db from memory instead of disk
         | 
         | https://www.sqlite.org/atomiccommit.html
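
        In Python that looks like the following, using the sqlite3 backup
        API to copy a disk database into RAM (and back out if needed):

        ```python
        import sqlite3

        disk = sqlite3.connect("rules.db")
        mem = sqlite3.connect(":memory:")
        disk.backup(mem)  # copy disk -> memory; queries now touch RAM only
        mem.execute("SELECT count(*) FROM sqlite_master").fetchone()
        mem.backup(disk)  # memory -> disk, if the in-memory copy takes writes
        ```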
        
       | codingbot3000 wrote:
        | It's posts like this, explaining architecture decisions in
        | detail, that I read HN for. Thank you!
        
         | michaelbuckbee wrote:
         | (author) - It's genuinely delightful to know that you liked it.
        
       | keybits wrote:
       | People reading this might be interested in Redka - Redis re-
       | implemented with SQLite in Go: https://github.com/nalgeon/redka
        
         | meowface wrote:
         | Was interested and considering switching until I saw this part:
         | 
         | >According to the benchmarks, Redka is several times slower
         | than Redis.
         | 
         | Still a cool project, don't get me wrong. But this kind of
         | doesn't give me any incentive to switch.
        
           | anonzzzies wrote:
           | We (keydb users; it's _much_ faster than redis for all our
           | cases) use redka for our dev machines; we develop
           | _everything_ on sqlite so there is no install of anything and
           | in prod, we just switch to our mysql, clickhouse, redis etc
           | cluster and it all works while having a light experience for
           | dev.
        
             | mikeshi42 wrote:
             | How are you guys using sqlite in dev instead of clickhouse?
             | (Afaik there's a good bit of difference between the two
             | dialects so I'm surprised it's possible without hurting dx
             | through one compromise or another)
        
               | anonzzzies wrote:
               | We have our own query language based on prolog which
               | compiles to efficient queries depending on the underlying
               | db. We haven't caught any cases for about half a decade
               | where humans could do better queries. We are in a niche
               | market so this is not a catch all solution; it is
               | specifically for our market.
        
               | pdimitar wrote:
               | And I assume you have no intention of open-sourcing that
               | query language and its compiler? Because it sounds pretty
               | awesome...
        
         | nikisweeting wrote:
         | Holy cow this is amazing, I've been looking for something like
         | this for years!!
         | 
         | Thanks for sharing.
        
       | tiffanyh wrote:
       | FoundationDB
       | 
       | Isn't "redis to sqlite" effectively what foundationDB?
       | 
       | https://www.foundationdb.org/
        
       | favorited wrote:
       | > Further, when we exhibited at RailsWorld 2023, there was a
       | definite "blood in the water" vibe regarding Redis and the
       | assumption that you'd automatically need a Redis server running
       | alongside your Rails application.
       | 
       | I've only worked on one production Rails application in my career
       | (and it did use Redis!), so I'm way out of the loop - is the
       | ecosystem turning against Redis from a business perspective (I
       | know there have been some license changes), or is it a YAGNI
       | situation, or something else?
       | 
        | IIRC we used it mainly with Resque to schedule asynchronous jobs
       | like indexing, transcoding, etc., but it seemed like a neat tool
       | at the time.
        
         | x0x0 wrote:
         | I think it's purely a simplicity thing.
         | 
         | Right now, most rails setups with decent traffic will have
         | frontend boxes, a sql db, a KV store (redis or memcached), and
         | a cache store pointed at the kv store, with, annoyingly, very
         | different usage patterns than typical KV store usage, eg for
         | maintaining api quotas or rate limiting.
         | 
         | Disk performance has gotten fast enough and SQL performance has
         | gotten good enough that there's a movement to drop the KV store
         | and split the usages (for traditional KV use and also backing a
         | cache) to the sql db and disk, respectively. Plus new nvme
         | disks are almost as fast and still much cheaper than ram so you
         | can cache more.
        
         | michaelbuckbee wrote:
         | It's a little YAGNI - I think the biggest driver of Redis in
          | the community was exactly what you described, async jobs, and
          | the tool most folks reached for was Sidekiq.
         | 
         | The 2024 Rails community survey just came out and Redis is
         | still listed as the top datastore that people use in their
         | apps.
         | 
         | FWIW - we found that while many folks are _using_ Redis in
         | their apps, they're just using it for things like Sidekiq and
         | not actually taking advantage of it for holding things like
          | real time leaderboards, vector db functions, etc., so the
          | actual usage is a little fuzzy.
        
       | doubleorseven wrote:
       | > Benchmarking is a dark art of deceiving yourself with highly
       | precise numbers
       | 
       | .
        
       | nikisweeting wrote:
       | I really wish there were a compatibility layer that could sit on
       | top of SQLite and make it pretend to be redis, so we could switch
       | more things to use SQLite. It doesn't even need to satisfy all
        | the distributed systems guarantees or even implement proper
       | pub/sub, it could just do everything with polling and a single
       | event loop. It would be great for smaller projects that want to
       | run something like celery or any app that depends on redis
       | without needing to install redis.
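
        Not that compatibility layer itself, but a sketch of the polling
        idea: a table stands in for a Redis list, and the consumer polls
        instead of blocking on BLPOP. Schema and interval are illustrative.

        ```python
        import sqlite3, time

        con = sqlite3.connect("queue.db", isolation_level=None)  # autocommit
        con.execute("CREATE TABLE IF NOT EXISTS q (id INTEGER PRIMARY KEY, body TEXT)")

        def push(body):
            con.execute("INSERT INTO q (body) VALUES (?)", (body,))

        def pop(interval=0.1):
            while True:  # poll-based stand-in for BLPOP
                row = con.execute(
                    "SELECT id, body FROM q ORDER BY id LIMIT 1").fetchone()
                if row:
                    con.execute("DELETE FROM q WHERE id = ?", (row[0],))
                    return row[1]
                time.sleep(interval)
        ```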
        
         | m_sahaf wrote:
         | There's Redka: https://github.com/nalgeon/redka/
        
           | nikisweeting wrote:
           | Wow amazing, thank you so much! I've spent many hours over
           | the years looking for a project like this, but it makes sense
           | that I haven't seen this yet as it's only 6mo old.
        
       | vchynarov wrote:
       | Apart from network latency, one of the behaviours I've seen with
        | Redis is that read/write latencies are fairly linearly
        | proportional to the number of keys queried - which seems to be
       | shown in your chart as well.
       | 
       | We had a different problem, where our monolithic app used both
       | Postgres / Redis for different use cases and worked relatively
       | well. However - it was a lot easier to shove new functionality in
       | the shared Redis cluster. Because Redis is single-threaded, one
       | inconsiderate feature that does bulk reads (100K+ keys) may start
       | to slow down other things. One of the guidelines I proposed was
       | that Redis is really good when we're reading/writing a key, or
       | small fixed-cardinality set of keys at a time, because we have a
       | lot of random things using Redis (things like locks and rate
       | limits on popular endpoints, etc).
       | 
       | However, in your case, I'm guessing Redis shines in the case of a
       | naive single-key (IP address) lookup, but also doesn't do well
        | with more complicated reads (your range query
        | representation?). Cool write-up overall, I don't have a deeper
       | understanding of how SQLite performs so well when compared to a
       | local Redis instance, so that was unexpected and interesting to
       | observe.
        
       | tony-allan wrote:
       | Best quote:
       | 
       | "SQLite does not compete with client/server databases. SQLite
       | competes with fopen()."
        
       | jszymborski wrote:
        | A bit strange they replaced Redis with SQLite rather than LMDB or
        | RocksDB, which are key-value stores.
        
       ___________________________________________________________________
       (page generated 2024-09-25 23:00 UTC)