[HN Gopher] ReadySet Core: next-generation SQL caching, freely a...
___________________________________________________________________
ReadySet Core: next-generation SQL caching, freely available
Author : marzoeva
Score : 86 points
Date : 2022-06-16 16:19 UTC (6 hours ago)
(HTM) web link (readyset.io)
(TXT) w3m dump (readyset.io)
| jjice wrote:
| I love the concept. Not needing to have extra code and logic for
| a caching layer seems very nice. In my experience, I haven't ever
| been in a situation where I needed super heavy caching, but this
| seems like it gives it to you "for free". Interested to see if we
| see more of ReadySet in the future.
| anentropic wrote:
| Seems clever!
|
| I'm curious what the pathological cases might be: patterns of
| query watching and updates that give the cache a lot of work to
| do to keep up.
| steve-chavez wrote:
| How does ReadySet interact with row-level security[1]? For RLS to
| work you'd need validation at the origin server anyway, right?
|
| [1]: https://www.postgresql.org/docs/current/ddl-rowsecurity.html
| zasdffaa wrote:
| Damn, that's a good question! Or security in general. But I
| wouldn't blame them at all if they didn't do it. Good thought
| though.
| jensneuse wrote:
| BSL :(
| jjice wrote:
| From their README:
|
| > ReadySet is licensed under the BSL 1.1 license, converting to
| the open-source Apache 2.0 license after 4 years. The ReadySet
| team is hard at work getting the codebase ready to be hosted on
| Github.
| simonw wrote:
| Has anyone else done this thing with BSL for 4 years that
| then converts to Apache 2? I've not seen it before.
| gst wrote:
| In addition to the other software already listed here,
| Materialize does this too:
| https://github.com/MaterializeInc/materialize/blob/main/LICE...
| js4ever wrote:
| Redpanda is doing the same. TBH I really don't like BSL; I'd
| prefer an open core model ten times over.
| ritesofbryan wrote:
| This is what Cockroach does as well:
| https://www.cockroachlabs.com/docs/stable/licensing-faqs.htm...
| freitasm wrote:
| Interesting. I would love to see this available for MS SQL
| Server.
|
| I've played with SafePeak (1), which runs on Windows Server. It
| was later sold to an Israeli company (2), which has since gone
| out of business; its assets ended up with another company and
| are now sold as ScaleArc (3).
|
| The original SafePeak is available for free, but with no
| maintenance or anything, so it's not really production ready.
| It works, as tested in a test environment, but eight years
| without support or updates...
|
| (1) http://www.safepeak.org/
| (2) https://en.wikipedia.org/wiki/SafePeak
| (3) https://www.devgraph.com/scalearc/
| zasdffaa wrote:
| There are too many questions here. What does it not do? What's
| the overhead of monitoring the main DB and how's it done -
| triggers? Does it need schema changes? What about race conditions
| - can you guarantee none? What's the memory overhead you need for
| the cache? Can you control what gets cached?
|
| > It can serve millions of reads per second on a single node ...
|
| I'm not a network guy but that seems just astonishing - what is a
| 'node' here?
|
| > ReadySet incrementally maintains result sets of SQL queries
| based on writes to the primary database.
|
| So basically you've solved the general materialised view
| incremental update problem? That's an unsolved problem in
| general, surely?
|
| Edit: not dissing but trying to see where the limits are.
| trollied wrote:
| Oracle has had MVs that can refresh on update for decades.
| zasdffaa wrote:
| So do Pgres & mssql, but _general_ views that are
| _incrementally_ updated - that's another matter. I'd be very
| surprised (and pleased).
| marzoeva wrote:
| This is indeed our goal - we're most of the way there with
| SQL-92 and plan to continue to expand our query support
| over time!
| pmarec wrote:
| Are you looking into pushing the dataflow even further down to
| the clients? Think realtime subscriptions for complex queries
| over structured data.
| greg-m wrote:
| Yes! We've thought about this in depth and have some ideas but
| I'd love to chat more. Shoot me an email: greg@readyset.io
| aeyes wrote:
| If you release something new, you should make sure that your
| documentation contains useful information.
|
| Even the most fundamental information like available
| configuration options, command-line arguments, deployment
| information and so on is missing.
|
| Looking at the code, it appears that you need Consul, Zookeeper
| and Redis to make this fly, and the docs don't mention this
| anywhere. They (barely) explain how to run the SQL proxy on a
| local machine, but that's it.
|
| I wonder if the testimonials on your website are just pulled from
| thin air. I don't think any sane person would even experiment
| with this anywhere near production environments.
| greg-m wrote:
| Hey, PM @ ReadySet - fair points, and thanks for checking us
| out.
|
| We've been in pretty heavy development and have been heads down
| on getting ReadySet into your hands as quickly as we could.
| We'll be doing a major documentation pass soon which will have
| more info about clustering, etc.
|
| There's also a bit more detail in our development guide - see
| https://github.com/readysettech/readyset/blob/main/developme...
| d_watt wrote:
| Interesting. What would you say the use case is for this, rather
| than setting up read replicas? Not having to maintain routing to
| the replicas on the application side?
| _jezell_ wrote:
| When you care about perf a lot more than consistency
| marzoeva wrote:
| Hi! CEO of ReadySet here. You can think of ReadySet as being
| a cross between a traditional read replica and a custom
| caching layer (e.g. one you might build on top of Redis).
| With read replicas, you still rerun queries from scratch
| every time they're issued, which means you still have to
| think about things like query optimization. ReadySet caches
| frequently run queries in memory so you get super-fast query
| latencies on cache hits. Because of this, you can scale to
| much higher read throughputs without extra effort. This is
| especially useful for read-heavy applications (e.g. websites,
| certain types of dashboards, among others!)
|
| You can read more about how it works here:
| https://docs.readyset.io/concepts/overview
| jinjin2 wrote:
| How do you deal with security? In modern databases like
| MongoDB permissions are granular down to the field level.
|
| The same query could produce wildly different results based
| on the user issuing it, and the caching somehow has to
| take that into account. Is that something you address?
| marzoeva wrote:
| Funnily enough, I wrote a paper on this topic when I was
| in grad school. It's not on our short-term roadmap at
| ReadySet, but this idea is certainly compatible with the
| underlying dataflow model. I'd love to hear more about
| what you had in mind here, shoot me an email at
| alana@readyset.io
|
| https://people.csail.mit.edu/malte/pub/papers/2019-hotos-mul...
| _ben_ wrote:
| PolyScale [1] is a serverless plug-and-play database edge cache.
| Our goal is for devs to be able to scale reads globally in a few
| minutes. It's wire compatible with Postgres, MySQL, MS SQL Server
| (more coming, including NoSQL).
|
| It has a global edge network, so there's no infrastructure to
| deploy, and an AI-managed cache with automatic invalidation, so
| no cache configuration is needed.
|
| [1] https://www.polyscale.ai/
| trollied wrote:
| > ReadySet is a lightweight SQL caching engine that precomputes
| frequently-accessed query results and automatically keeps these
| results up-to-date over time as the underlying data in your
| database changes
|
| I don't see the point in using an extra app - you can do this
| natively in Postgres. Materialized views.
| https://www.postgresql.org/docs/current/rules-materializedvi...
| cpursley wrote:
| Are materialized views aware of applied where filters?
| adwf wrote:
| Except then you need to be paying someone to monitor your
| queries and develop your views rather than just dropping a
| container in the middle with this app.
| hbrundage wrote:
| Postgres materialized views have to be manually refreshed on a
| schedule, and so are always out of date, whereas ReadySet keeps
| your results up to date automatically as the input changes. For
| PG materialized views, the compute required is proportional to
| the size of the input data, and is paid every time, whereas with
| ReadySet the computation is incremental, so it's proportional
| to the size of the change in the data over time.
|
| And finally, ReadySet's (Noria's) big innovation is that the
| result set can be only partial, storing only the elements of
| the result set (and underlying data flow graph) that are
| frequently accessed, instead of the whole result set like a
| materialized view would.
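The contrast drawn above between a full refresh and incremental maintenance can be illustrated with a minimal Python sketch. This is not ReadySet's or Noria's implementation; the class `IncrementalCountView`, the function `full_refresh`, and the sample keys are hypothetical names chosen for illustration. It maintains the equivalent of a `COUNT(*) ... GROUP BY key` result, and it does not model partial materialization (evicting and refilling rarely read entries).

```python
from collections import defaultdict


class IncrementalCountView:
    """Hypothetical view holding COUNT(*) per key, updated one write at a time."""

    def __init__(self):
        self.counts = defaultdict(int)

    def apply(self, key, delta):
        # delta is +1 for an inserted row and -1 for a deleted row.
        # The work done here does not depend on how many rows exist.
        self.counts[key] += delta
        if self.counts[key] == 0:
            del self.counts[key]

    def read(self, key):
        # .get() avoids creating an entry for keys that were never written.
        return self.counts.get(key, 0)


def full_refresh(rows):
    """Recompute the same result from scratch; cost grows with len(rows)."""
    counts = defaultdict(int)
    for key in rows:
        counts[key] += 1
    return dict(counts)


# Base table modeled as a list of grouping keys (assumed sample data).
rows = ["story_1", "story_1", "story_2"]
view = IncrementalCountView()
for key in rows:
    view.apply(key, +1)

# A new write updates a single entry instead of rescanning the table.
rows.append("story_2")
view.apply("story_2", +1)
```

After the final write, `view.read("story_2")` agrees with what `full_refresh(rows)` would compute, but only one counter was touched to get there.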
| wasd wrote:
| I signed up for the waitlist! I noticed it asked about AWS,
| Azure, and GCP but we use Heroku. Hopefully, that won't put me
| too low on the list.
|
| Do you have a sense for when people can try it? Most of our app
| is reads and we're using Rails + Redis and it's fine and
| sometimes a pain. Would love to try it.
| greg-m wrote:
| Hey, PM @ ReadySet here. Shoot an email to greg@readyset.io and
| we can see what we can do :)
| xtreak29 wrote:
| GitHub repo : https://github.com/readysettech/readyset
| mdaniel wrote:
| BSL
| <https://github.com/readysettech/readyset/blob/main/LICENSE> so
| they're serious about the "freely available" part I guess
| cpursley wrote:
| Whoa, I had the same idea not so long ago. Right now I'm using
| GraphCDN but would much rather cache at the database level. Looks
| like this could be a drop-in for lots of people already on
| Postgres & MySQL (meaning no more dog-slow Rails apps).
|
| There was a cool article about intercepting the Postgres
| connection with Elixir not long ago:
| https://docs.statetrace.com/blog/build-a-postgres-proxy/
| jensneuse wrote:
| As an alternative, you can use WunderGraph (oss) to compile
| GraphQL Queries to REST Endpoints so that you can use Fastly or
| Cloudflare as a CDN (and the Browser Cache obviously):
| https://wundergraph.com/docs/overview/features/caching
| It supports configurable Cache-Control Headers per Operation and
| comes with ETags out of the box, so content can be invalidated
| easily.
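For readers unfamiliar with how Cache-Control headers and ETags work together, here is a generic Python sketch of the HTTP mechanism. It is not WunderGraph's code or API; `make_etag`, `respond`, and the `max_age` default are hypothetical, and the ETag is assumed to be a truncated SHA-256 of the response body.

```python
import hashlib


def make_etag(body: bytes) -> str:
    # Assumption: derive the ETag from a hash of the body, so the tag
    # changes whenever the content changes.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match=None, max_age=60):
    """Return (status, headers, payload) for a cacheable GET response."""
    etag = make_etag(body)
    headers = {
        "ETag": etag,
        "Cache-Control": f"public, max-age={max_age}",
    }
    if if_none_match == etag:
        # The client's copy is still current: send no body.
        return 304, headers, b""
    return 200, headers, body


# First request: full response, client stores the ETag.
status, headers, payload = respond(b'{"stories": [1, 2, 3]}')

# Revalidation with unchanged content: 304 and an empty body.
status_again, _, payload_again = respond(
    b'{"stories": [1, 2, 3]}', if_none_match=headers["ETag"]
)

# Revalidation after the content changed: full response again.
status_changed, _, _ = respond(
    b'{"stories": [1, 2, 3, 4]}', if_none_match=headers["ETag"]
)
```

A CDN or browser keeps serving its copy until `max-age` expires, then revalidates with `If-None-Match`; the origin only resends the body when the hash differs.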
| cpursley wrote:
| That's pretty cool, thanks.
| greg-m wrote:
| PM at ReadySet here - that's the idea! We think sub-millisecond
| reads while still using SQL are pretty cool :)
|
| If you want to dig in more, hop into our community Slack:
| https://readysetcommunity.slack.com/
| tmikaeld wrote:
| > Traditional databases would compute the results of this query
| from scratch every time it was issued.
|
| Is it really the case that queries can't be cached on
| traditional databases?
| CharlesW wrote:
| It doesn't appear to be the case:
| https://docs.oracle.com/database/121/TGDBA/tune_result_cache...
| greg-m wrote:
| ReadySet PM here - depends on if there are writes to the table
| or not!
|
| For example, MySQL deprecated their query cache, but previously
| it would only cache until there were any writes to the tables
| that the queries were referencing
| https://dev.mysql.com/doc/refman/5.7/en/query-cache-configur...
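The behavior described above, where a cached result survives only until any write hits a table the query references, can be sketched in a few lines of Python. This is an illustration of table-level invalidation in general, not MySQL's actual query cache; `TableInvalidatedCache`, `fake_db`, and the requirement that callers pass the referenced table names explicitly are all assumptions.

```python
class TableInvalidatedCache:
    """Hypothetical result cache that drops entries on any write to a table."""

    def __init__(self, run_query):
        self.run_query = run_query  # assumed callable: sql text -> result
        self.results = {}           # sql text -> cached result
        self.by_table = {}          # table name -> set of cached sql texts

    def read(self, sql, tables):
        if sql not in self.results:
            self.results[sql] = self.run_query(sql)
            for table in tables:
                self.by_table.setdefault(table, set()).add(sql)
        return self.results[sql]

    def on_write(self, table):
        # Every cached query that references this table is discarded,
        # even if the write did not change that query's result.
        for sql in self.by_table.pop(table, set()):
            self.results.pop(sql, None)


calls = []


def fake_db(sql):
    # Stand-in for the real database; records how often it is hit.
    calls.append(sql)
    return f"rows for {sql}"


cache = TableInvalidatedCache(fake_db)
query = "SELECT * FROM votes WHERE story_id = 1"

cache.read(query, ["votes"])   # miss: runs against the database
cache.read(query, ["votes"])   # hit: served from the cache
cache.on_write("stories")      # unrelated table: entry survives
cache.read(query, ["votes"])   # still a hit
cache.on_write("votes")        # any write to votes drops the entry
cache.read(query, ["votes"])   # miss again: recomputed from scratch
```

On a write-heavy table this scheme spends most of its time recomputing, which is the gap incremental maintenance aims to close.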
| tmikaeld wrote:
| I was just looking this up and it's correct, they don't cache
| queries (if they do, it's a separate feature), they only
| manage query planning in ways that make them faster.
|
| Even CockroachDB doesn't do query caching; only query plans
| are cached. [0] [1]
|
| [0] https://www.cockroachlabs.com/blog/memory-usage-cockroachdb/
|
| [1] https://www.cockroachlabs.com/blog/query-plan-caching-in-coc...
___________________________________________________________________
(page generated 2022-06-16 23:01 UTC)