[HN Gopher] ReadySet Core: next-generation SQL caching, freely a...
       ___________________________________________________________________
        
       ReadySet Core: next-generation SQL caching, freely available
        
       Author : marzoeva
       Score  : 86 points
       Date   : 2022-06-16 16:19 UTC (6 hours ago)
        
 (HTM) web link (readyset.io)
 (TXT) w3m dump (readyset.io)
        
       | jjice wrote:
       | I love the concept. Not needing to have extra code and logic for
       | a caching layer seems very nice. In my experience, I haven't ever
       | been in a situation where I needed super heavy caching, but this
       | seems like it gives it to you "for free". Interested to see if we
       | see more of ReadySet in the future.
        
         | [deleted]
        
       | anentropic wrote:
       | Seems clever!
       | 
       | I'm curious what might be pathological cases, patterns of query
       | watching and updates that give the cache a lot of work to do to
       | keep up
        
       | steve-chavez wrote:
       | How does ReadySet interact with Row level security[1]? For RLS to
       | work you'd need validation at the origin server anyway right?
       | 
       | [1]: https://www.postgresql.org/docs/current/ddl-rowsecurity.html
        
         | zasdffaa wrote:
         | Damn, that's a good question! Or security in general. But I
         | wouldn't blame them at all if they didn't do it. Good thought
         | though.
        
       | jensneuse wrote:
       | BSL :(
        
         | jjice wrote:
         | From their README:
         | 
         | > ReadySet is licensed under the BSL 1.1 license, converting to
         | the open-source Apache 2.0 license after 4 years. The ReadySet
         | team is hard at work getting the codebase ready to be hosted on
         | Github.
        
           | simonw wrote:
           | Has anyone else done this thing with BSL for 4 years that
           | then converts to Apache 2? I've not seen it before.
        
             | gst wrote:
             | In addition to the other software already listed here
             | Materialize does this too:
             | https://github.com/MaterializeInc/materialize/blob/main/LICE...
        
             | js4ever wrote:
             | Redpanda is doing the same. TBH I really don't like BSL; I
             | prefer the open core model 10 times over.
        
             | ritesofbryan wrote:
             | This is what Cockroach does as well:
             | https://www.cockroachlabs.com/docs/stable/licensing-faqs.htm...
        
       | freitasm wrote:
       | Interesting. I would love to see this available for MS SQL
       | Server.
       | 
       | I've played with SafePeak (1), which runs on Windows Server. It
       | was later sold to an Israeli company (2), which has since gone
       | out of business; the assets ended up with another company and
       | are now sold as ScaleArc (3).
       | 
       | The original SafePeak is available free but no maintenance or
       | anything, so not really production ready. It works, as tested in
       | a test environment but eight years without support or updates...
       | 
       | (1) http://www.safepeak.org/
       | (2) https://en.wikipedia.org/wiki/SafePeak
       | (3) https://www.devgraph.com/scalearc/
        
       | zasdffaa wrote:
       | There are too many questions here. What does it not do? What's
       | the overhead of monitoring the main DB and how's it done -
       | triggers? Does it need schema changes? What about race conditions
       | - can you guarantee none? What's the memory overhead you need for
       | the cache? Can you control what gets cached?
       | 
       | > It can serve millions of reads per second on a single node ...
       | 
       | I'm not a network guy but that seems just astonishing - what is a
       | 'node' here?
       | 
       | > ReadySet incrementally maintains result sets of SQL queries
       | based on writes to the primary database.
       | 
       | So basically you've solved the general materialised view
       | incremental update problem? That's an unsolved problem in
       | general, surely?
       | 
       | Edit: not dissing but trying to see where the limits are.
        
         | trollied wrote:
         | Oracle has had MVs that can refresh on update for decades.
        
           | zasdffaa wrote:
           | So do Pgres & mssql, but _general_ views that are
           | _incrementally_ updated - that's another matter. I'd be very
           | surprised (and pleased).
        
             | marzoeva wrote:
             | This is indeed our goal - we're most of the way there with
             | SQL 92 and plan to continue to expand our query support
             | over time!
        
       | pmarec wrote:
       | Are you looking into spreading the dataflow even more down to the
       | clients ? Think realtime subscription for complex queries over
       | structured data.
        
         | greg-m wrote:
         | Yes! We've thought about this in depth and have some ideas but
         | I'd love to chat more. Shoot me an email: greg@readyset.io
        
       | aeyes wrote:
       | If you release something new, you should make sure that your
       | documentation contains useful information.
       | 
       | Even the most fundamental information like available
       | configuration options, command-line arguments, deployment
       | information and so on is missing.
       | 
       | Looking at the code it appears that you need Consul, Zookeeper
       | and Redis to make this fly and the docs don't mention this
       | anywhere. They (barely) explain how to run the SQL proxy on a
       | local machine but that's it.
       | 
       | I wonder if the testimonials on your website are just pulled from
       | thin air, I don't think any sane person would even experiment
       | with this anywhere near production environments.
        
         | [deleted]
        
         | greg-m wrote:
         | Hey, PM @ ReadySet - fair points, and thanks for checking us
         | out.
         | 
         | We've been in pretty heavy development and have been heads down
         | on getting ReadySet into your hands as quickly as we could.
         | We'll be doing a major documentation pass soon which will have
         | more info about clustering, etc.
         | 
         | There's also a bit more detail in our development guide - see
         | https://github.com/readysettech/readyset/blob/main/developme...
        
       | d_watt wrote:
       | Interesting. What would you say the use case is for this, rather
       | than setting up read replicas? Not having to maintain routing to
       | the replicas on the application side?
        
         | _jezell_ wrote:
         | When you care about perf a lot more than consistency
        
           | marzoeva wrote:
           | Hi! CEO of ReadySet here. You can think of ReadySet as being
           | a cross between a traditional read replica and a custom
           | caching layer (e.g. one you might build on top of Redis).
           | With read replicas, you still rerun queries from scratch
           | every time they're issued, which means you still have to
           | think about things like query optimization. ReadySet caches
           | frequently run queries in memory so you get super-fast query
           | latencies on cache hits. Because of this, you can scale to
           | much higher read throughputs without extra effort. This is
           | especially useful for read-heavy applications (e.g. websites,
           | certain types of dashboards, among others!)
           | 
           | You can read more about how it works here:
           | https://docs.readyset.io/concepts/overview
        
             | jinjin2 wrote:
             | How do you deal with security? In modern databases like
             | MongoDB permissions are granular down to the field level.
             | 
             | The same query could produce wildly different results based
             | on the user issuing it, and the caching somehow has to
             | take that into account. Is that something you address?
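
One simple way to picture the concern above is a cache whose key includes the identity of the caller, so two users issuing the same query text never share an entry. This is a minimal sketch under stated assumptions, not a description of how ReadySet or MongoDB handle permissions. `PerPrincipalCache`, `fake_backend`, and the `hr`/`intern` principals are all hypothetical names invented for illustration.

```python
class PerPrincipalCache:
    """Toy read cache keyed by (principal, query) instead of query alone."""

    def __init__(self, run_query):
        self._run_query = run_query  # callable(principal, query) -> rows
        self._entries = {}           # (principal, query) -> cached rows
        self.misses = 0

    def fetch(self, principal, query):
        key = (principal, query)
        if key not in self._entries:
            # Cache miss: ask the backend, which applies its own permissions.
            self.misses += 1
            self._entries[key] = self._run_query(principal, query)
        return self._entries[key]


def fake_backend(principal, query):
    """Hypothetical backend: only 'hr' may see the salary field."""
    rows = [{"name": "ada", "salary": 100}, {"name": "bob", "salary": 90}]
    if principal == "hr":
        return rows
    return [{"name": row["name"]} for row in rows]


cache = PerPrincipalCache(fake_backend)
query = "SELECT * FROM employees"
hr_rows = cache.fetch("hr", query)
intern_rows = cache.fetch("intern", query)
hr_again = cache.fetch("hr", query)  # served from the cache, no new miss
```

The trade-off is that entries are no longer shared between users, so memory use grows with the number of distinct principals.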
        
               | marzoeva wrote:
               | Funnily enough, I wrote a paper on this topic when I was
               | in grad school. It's not on our short-term roadmap at
               | ReadySet, but this idea is certainly compatible with the
               | underlying dataflow model. I'd love to hear more about
               | what you had in mind here, shoot me an email at
               | alana@readyset.io
               | 
               | https://people.csail.mit.edu/malte/pub/papers/2019-hotos-mul...
        
       | _ben_ wrote:
       | PolyScale [1] is a serverless plug-and-play database edge cache.
       | Our goal is for devs to be able to scale reads globally in a few
       | minutes. It's wire compatible with Postgres, MySQL, MS SQL Server
       | (more coming including no-sql).
       | 
       | It has a global edge network, so no infrastructure to deploy and
       | AI managed cache and auto invalidation, so no cache configuration
       | needed.
       | 
       | [1] https://www.polyscale.ai/
        
       | trollied wrote:
       | > ReadySet is a lightweight SQL caching engine that precomputes
       | frequently-accessed query results and automatically keeps these
       | results up-to-date over time as the underlying data in your
       | database changes
       | 
       | I don't see the point in using an extra app - you can do this
       | natively in Postgres. Materialized views.
       | https://www.postgresql.org/docs/current/rules-materializedvi...
        
         | cpursley wrote:
         | Are materialized views aware of applied where filters?
        
         | adwf wrote:
         | Except then you need to be paying someone to monitor your
         | queries and develop your views rather than just dropping a
         | container in the middle with this app.
        
         | hbrundage wrote:
         | Postgres materialized views have to be manually refreshed on a
         | schedule, and so are always out of date, whereas ReadySet keeps
         | your results up to date automatically as the input changes. For
         | PG materialized views, the compute required is proportional to
         | the size of the input data, and is paid every time, whereas with
         | ReadySet the computation is incremental, so it's proportional
         | to the size of the change in the data over time.
         | 
         | And finally, ReadySet's (Noria's) big innovation is that the
         | result set can be only partial, storing only the elements of
         | the result set (and underlying data flow graph) that are
         | frequently accessed, instead of the whole result set like a
         | materialized view would.
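
The cost difference described above can be sketched with a grouped count. This is an illustrative toy, not ReadySet's or Noria's implementation. The `votes` data, `story_id` column, `full_refresh`, and `IncrementalCount` are hypothetical names. The refresh-style function rescans every row, while the incremental version does a constant amount of work per write.

```python
from collections import defaultdict


def full_refresh(rows):
    """Recompute COUNT(*) GROUP BY story_id from scratch: cost grows with len(rows)."""
    counts = defaultdict(int)
    for story_id, _voter in rows:
        counts[story_id] += 1
    return dict(counts)


class IncrementalCount:
    """Keeps the same grouped count up to date by applying one write at a time."""

    def __init__(self):
        self.counts = defaultdict(int)

    def on_insert(self, story_id):
        self.counts[story_id] += 1

    def on_delete(self, story_id):
        self.counts[story_id] -= 1
        if self.counts[story_id] == 0:
            del self.counts[story_id]  # drop empty groups, like GROUP BY would

    def result(self):
        return dict(self.counts)


votes = [(1, "a"), (1, "b"), (2, "c")]
view = IncrementalCount()
for story_id, _voter in votes:
    view.on_insert(story_id)

# One new vote arrives: the incremental view touches a single counter,
# while a refresh-style view has to rescan all of `votes`.
votes.append((2, "d"))
view.on_insert(2)
```

Both approaches end up with the same result set; what differs is how much work each write or refresh costs.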
        
       | wasd wrote:
       | I signed up for the waitlist! I noticed it asked about AWS,
       | Azure, and GCP but we use Heroku. Hopefully, that won't put me
       | too low on the list.
       | 
       | Do you have a sense for when people can try it? Most of our app
       | is reads and we're using Rails + Redis and it's fine but
       | sometimes a pain. Would love to try it.
        
         | greg-m wrote:
         | Hey, PM @ ReadySet here. Shoot an email to greg@readyset.io and
         | we can see what we can do :)
        
       | xtreak29 wrote:
       | GitHub repo : https://github.com/readysettech/readyset
        
         | mdaniel wrote:
         | BSL
         | <https://github.com/readysettech/readyset/blob/main/LICENSE> so
         | they're serious about the "freely available" part I guess
        
       | cpursley wrote:
       | Woah, I had the same idea not so long ago. Right now I'm using
       | GraphCDN but would much rather cache at the database level. Looks
       | like this could be a drop in for lots of people already on
       | Postgres & MySQL (meaning no more dog-slow Rails apps).
       | 
       | There was a cool article about intercepting the Postgres
       | connection with Elixir not long ago:
       | https://docs.statetrace.com/blog/build-a-postgres-proxy/
        
         | jensneuse wrote:
         | As an alternative, you can use WunderGraph (oss) to compile
         | GraphQL Queries to REST Endpoints so that you can use fastly or
         | Cloudflare as a CDN (and the Browser Cache obviously):
         | https://wundergraph.com/docs/overview/features/caching It
         | supports configurable Cache-Control Headers per Operation and
         | comes with ETags out of the box, so content can be invalidated
         | easily.
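
The general HTTP mechanism mentioned above (Cache-Control plus ETags) can be sketched as follows. This is a generic illustration, not WunderGraph's implementation. `make_etag` and `respond` are hypothetical helpers, and the 60-second `max_age` default is an arbitrary assumption.

```python
import hashlib
import json


def make_etag(payload):
    """Derive a strong ETag from a stable JSON encoding of the payload."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(payload, if_none_match=None, max_age=60):
    """Return (status, headers, body), honoring an If-None-Match value."""
    etag = make_etag(payload)
    headers = {"ETag": etag, "Cache-Control": f"public, max-age={max_age}"}
    if if_none_match == etag:
        # Client copy is still current: no body needs to be sent.
        return 304, headers, None
    return 200, headers, payload


status, headers, body = respond({"id": 1, "title": "hello"})
status_repeat, _, body_repeat = respond(
    {"id": 1, "title": "hello"}, if_none_match=headers["ETag"]
)
status_changed, _, _ = respond(
    {"id": 1, "title": "edited"}, if_none_match=headers["ETag"]
)
```

When the content changes, the ETag changes with it, so a revalidating client gets the fresh body.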
        
           | cpursley wrote:
           | That's pretty cool, thanks.
        
         | greg-m wrote:
         | PM at ReadySet here - that's the idea! We think sub-ms reads
         | while still using SQL are pretty cool :)
         | 
         | If you want to dig in more, hop into our community slack:
         | https://readysetcommunity.slack.com/
        
       | tmikaeld wrote:
       | > Traditional databases would compute the results of this query
       | from scratch every time it was issued.
       | 
       | Is it really the case that queries can't be cached on
       | traditional databases?
        
         | CharlesW wrote:
         | It doesn't appear to be the case:
         | https://docs.oracle.com/database/121/TGDBA/tune_result_cache...
        
         | greg-m wrote:
         | ReadySet PM here - depends on if there are writes to the table
         | or not!
         | 
         | For example, MySQL deprecated their query cache, but previously
         | it would only cache until there were any writes to the tables
         | that the queries were referencing
         | https://dev.mysql.com/doc/refman/5.7/en/query-cache-configur...
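
The behavior described above, where a cached result survives only until something writes to a table the query reads, can be modeled in a few lines. This is a toy model of table-level invalidation, not MySQL's actual query cache code. `TableInvalidatedCache` and the sample queries are hypothetical.

```python
class TableInvalidatedCache:
    """Result cache that drops every entry reading a table as soon as it is written."""

    def __init__(self):
        self._results = {}   # query text -> cached rows
        self._by_table = {}  # table name -> set of query texts that read it

    def get(self, query):
        return self._results.get(query)

    def put(self, query, tables, rows):
        self._results[query] = rows
        for table in tables:
            self._by_table.setdefault(table, set()).add(query)

    def on_write(self, table):
        # Any write, however small, evicts all cached queries on that table.
        for query in self._by_table.pop(table, set()):
            self._results.pop(query, None)


cache = TableInvalidatedCache()
q_users = "SELECT name FROM users WHERE id = 1"
q_posts = "SELECT title FROM posts WHERE id = 7"
cache.put(q_users, ["users"], [("ada",)])
cache.put(q_posts, ["posts"], [("hello",)])

# A write to users evicts q_users even if it touched an unrelated row.
cache.on_write("users")
```

On a write-heavy table, entries are evicted almost as fast as they are stored, which is the limitation the comment points at.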
        
           | tmikaeld wrote:
           | I was just looking this up and it's correct, they don't cache
           | queries (if they do, it's a separate feature), they only
           | manage query planning in ways that make them faster.
           | 
           | Even CockroachDB doesn't do query cache, only query planning
           | is cached. [0] [1]
           | 
           | [0] https://www.cockroachlabs.com/blog/memory-usage-cockroachdb/
           | 
           | [1] https://www.cockroachlabs.com/blog/query-plan-caching-in-coc...
        
       ___________________________________________________________________
       (page generated 2022-06-16 23:01 UTC)