[HN Gopher] How we built a serverless SQL database
___________________________________________________________________
How we built a serverless SQL database
Author : TheresaBraccio
Score : 179 points
Date : 2021-10-26 15:04 UTC (7 hours ago)
(HTM) web link (www.cockroachlabs.com)
(TXT) w3m dump (www.cockroachlabs.com)
| boynamedsue wrote:
| Any plans to expand the region availability for AWS beyond us-
| west-2 in the US? I am interested in us-east-2.
| andydb wrote:
| Yes, definitely, we'll be expanding to more regions soon.
| ed25519FUUU wrote:
| > _And you'll never be surprised by a bill, because you can set a
| guaranteed monthly spend limit._
|
| It's amazing that this is a killer feature and not a standard
| one, but here we are!
| geoduck14 wrote:
| How does this compare to Snowflake?
| reilly3000 wrote:
| Snowflake is OLAP and billed based on usage time + storage.
| This looks like a regular OLTP SQL DB, billed per request.
| phoboslab wrote:
| > If you've created a database before, you probably had to
| estimate how many servers to use based on the expected traffic.
|
| The answer is "one". If you have less than 10k req/s you
| shouldn't even start to think about multiple DB servers or
| migrating from bog-standard MySQL/MariaDB or Postgres.
|
| I will never understand this obsession with "scaling". Modern web
| dev seriously over-complicates so many things; it's not even
| funny anymore.
| [deleted]
| brokencode wrote:
| The question isn't whether you have 10k req/s now, but whether
| you expect to in the future. If you are designing a blog, then
| yeah, you probably don't need to worry about it. If you are
| starting a social network or SaaS business application, then
| you probably do.
| GuB-42 wrote:
| The future will be different.
|
| A lot of successful businesses start with things that are not
| scalable, and it is a strength, not a weakness. If you start
| a social network for instance, you can't beat Facebook at its
| own game. You have to do something that Facebook can't do
| because it is too big. Scalability problems will be tackled
| as you grow.
|
| Among the many things Facebook can't do is run its service on
| a single database. That makes things much harder for them;
| thankfully, you are much smaller than Facebook, so you can.
| Take advantage of that.
| wyager wrote:
| Alternatively, people really _do_ need to put a lot of thought
| into scaling, but only because they did something like write
| some core web service in an interpreted language framework that
| maxes out at 100 requests per second.
| bitwize wrote:
| (robot voice) Does /dev/null support sharding? Sharding is the
| secret ingredient in the web scale sauce.
| zkldi wrote:
| it's impressive how well that video has aged with modern web
| dev.
| lumost wrote:
| The answer is unfortunately less clear-cut, particularly if
| you assume that whoever is tasked with scaling this
| hypothetical DB doesn't know what they are doing a priori.
|
| The following questions are likely to come up
|
| 1) My t3.xl DB is down, how much bigger can I make it?
|
| 2) My r3.24xl DB can only handle 100 TPS and now my site is
| down, what can I do?
|
| 3) My 2x r3.24xl DB cluster costs a lot of money. Are other
| solutions cheaper?
|
| 4) My latency is high, are other solutions faster?
|
| For someone who hasn't dealt with these questions before, these
| will become long and painful lessons with massive material
| impacts to the business.
|
| It's appealing to use Dynamo as it takes the magic out of
| scaling. It's appealing to use serverless RDBMS as you don't
| have to think about it anymore unless it has high
| costs/latency.
| mjb wrote:
| What happens when that database fails? Are you OK losing some
| data, or do you want the data to be synchronously replicated
| off the machine and be available somewhere else after failure?
| Distribution isn't only about scale, it's also about
| availability.
|
| What happens when that database loses some data? Do you want an
| up-to-the second backup, or point-in-time recovery? Or are you
| OK restoring last night's backup? Distribution isn't only about
| scale, it's also about durability.
|
| What happens when you need to run an expensive business process
| ad-hoc? Do you want it to be easy to scale out reads, or to
| export that data to an analytics system? Or are you OK building
| something else to handle that case? Distribution isn't only
| about scale, it's also about flexibility.
|
| What happens when you want to serve customers in one market,
| and make sure that their data stays local for regulatory
| compliance reasons or latency? Are you OK with having separate
| databases? Distribution isn't only about scale, it's also about
| locality.
| phoboslab wrote:
| Fair points. I would argue that for most people a simple
| master-slave setup with manual failover will produce far
| fewer headaches than a "serverless" architecture.
|
| When you are big enough to worry about the other issues, you
| surely are big enough to handle the requirements in-house. I
| see the dependence on some specific companies as the bigger
| threat to reliability.
| mjibson wrote:
| The setup you describe is very much not simple. I worked at
| a place with very good DBAs and our replication setup
| caused us more downtime than anything else. Cockroach and
| Spanner exist because many programmers observed that what
| you describe is hard.
| Groxx wrote:
| As a counter-anecdote: multiple startup projects I've
| worked on with separate MySQL setups where each had just
| a single master + two slaves (one warm for fast failover
| in case of hardware failure or upgrades, one cold for
| slow analytics-style queries) did just fine with millions
| (to tens of millions) of users. No downtime at all for
| years on end.
|
| MySQL and Postgres are _massively_ more widely used than
| Cockroach and Spanner, broadly very successfully. It's
| entirely feasible to run them with high uptime.
| dboreham wrote:
| Very few deployments experience actual failures. Could be
| some fridge-door/light situation going on.
| dcposch wrote:
| This is a good argument against running a k8s cluster when
| you don't need one, but not a good argument against this new
| serverless Cockroach product.
|
| Serverless is not just about autoscaling up from 1 to n; it's
| about autoscaling down from 1 to 0.
|
| If Cockroach provides a robust SQL DB at a marginal cost of
| ~$0/mo for small request volumes, that is a real value add over
| running your own pg server.
|
| Not having to deal with administration or backups is another
| big value add.
|
| This offering looks like it compares very nicely to, say,
| running an RDS instance with auto backups enabled.
| psadri wrote:
| Unfortunately, it's easy to hit that with, say, GraphQL,
| where each client request can resolve to dozens of DB selects
| vs a single hand-written/tuned SQL select.
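|
| For illustration (hypothetical posts/users schema and a
| psycopg2-style cursor, not any particular GraphQL framework),
| the difference looks roughly like:
|
|     def posts_with_authors_naive(cur):
|         """Per-field resolvers: 1 query for posts, then 1 per post."""
|         cur.execute("SELECT id, author_id, title FROM posts LIMIT 50")
|         page = []
|         for post_id, author_id, title in cur.fetchall():
|             cur.execute("SELECT name FROM users WHERE id = %s",
|                         (author_id,))
|             (author,) = cur.fetchone()
|             page.append((post_id, title, author))
|         return page  # 51 round trips for one page
|
|     def posts_with_authors_tuned(cur):
|         """One hand-written join: the same page in one round trip."""
|         cur.execute(
|             "SELECT p.id, p.title, u.name "
|             "FROM posts AS p JOIN users AS u ON u.id = p.author_id "
|             "LIMIT 50")
|         return cur.fetchall()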
| babelfish wrote:
| Really great post, thanks for sharing. I spent a lot of time a
| couple months ago researching 'DBaaS' offerings (for fun, not
| business) and found it difficult to find any posts outlining the
| architecture of a DBaaS. Really cool to see CRDB putting this out
| in the open.
| andydb wrote:
| It's been something we've done since the start and plan to
| continue doing. If you read back over our engineering blog,
| you'll find a surprisingly thorough description of the entire
| CockroachDB stack, from the lowest storage layer to distributed
| transactions, Raft consensus, SQL => key-value mapping, online
| schema changes, cost-based SQL optimizer, Kubernetes usage, and
| so on. In fact, when we onboard new engineers, we typically
| point them to a stack of external blog entries to read in order
| to get up-to-speed on how CockroachDB works.
|
| Being open on how we solve hard problems is the way to build
| our collective knowledge as a developer community. Certainly
| CockroachDB itself has benefited enormously from all that has
| gone before and been published in the open.
| rabaut wrote:
| Does CockroachDB Serverless expose an HTTP API? This sounds
| like a great fit for use with Cloudflare Workers, but that
| requires an HTTP API.
| sorenbs wrote:
| Prisma will launch a data proxy, as well as support for
| CockroachDB and Cloudflare Workers, early next year.
| sharps_xp wrote:
| Why do you prefer an HTTP API versus a DB connection? Isn't the
| former going to inherently have the overhead cost of creating
| the connection + TLS handshakes?
|
| My question is similar: is CockroachDB going to have an
| equivalent of RDS Proxy so that apps can handle traffic
| spikes without having to deal with DB connection pool
| problems?
| pistoriusp wrote:
| I think we won't be getting socket connections in some of the
| wasm-powered JS runtime engines anytime soon. Using HTTP
| solves that, and a bunch of caching issues.
| estambar wrote:
| There are some great projects that present an HTTP API for
| Postgres that you could use with CRDB, I think. I'm thinking
| of something like https://github.com/pramsey/pgsql-http
| pistoriusp wrote:
| Now it would be awesome if we could use Prisma against an
| open standard like this.
| zaidf wrote:
| I kind of assumed this came with an HTTP API, whoops :) I've
| been using Airtable with Workers for tiny side projects. This
| seemed like a nice alternative, given Airtable's lack of
| support for vanilla SQL.
| andydb wrote:
| Great question. We recognize how important this is and are
| actively working on it.
| andydb wrote:
| To anyone who might see this: I'm the author of the blog post and
| led the engineering team that built CockroachDB Serverless. I'll
| be monitoring this thread in case there are any questions you'd
| like to ask me about it.
| dilyevsky wrote:
| Haven't had time yet to read the whole thing, so sorry if
| it's already answered, but is it possible to run this SQL
| pod / storage pod separation setup yourself with CRDB
| community/enterprise? We run enterprise CRDB, but it's all in
| one process (with replicas).
| andydb wrote:
| It's not currently possible, partly because it complicates
| the deployment model quite a bit. Dynamically bringing SQL
| pods up and down requires a technology like Kubernetes. It
| takes some serious operational know-how to keep it running
| smoothly, which is why we thought it would be perfect for a
| managed Cloud service.
|
| What would be your company's reasons for wanting this
| available in self-hosted CRDB? What kinds of use cases would
| it address for you?
| nonameiguess wrote:
| It's understandably not something a lot of vendors consider,
| especially for early-stage products, but classified systems
| don't have Internet access and can't use managed cloud
| services (unless offered by AWS via C2S, and possibly in the
| future by whoever wins JEDI if they resurrect that), leaving
| no choice but to self-host everything. That's the most
| extreme example, but lots of enterprises airgap part of
| their network for other reasons, and private companies that
| don't handle officially classified data may still be
| restricted in where they can host data based on PII/PHI
| legal regulations, and may not be able to use your data
| center.
| dilyevsky wrote:
| In addition to the security/legal stuff, there are also
| concerns around SLOs/business continuity and costs.
| andydb wrote:
| Sure, those are all good points. We'll definitely keep an
| eye on customer demand in these areas so we can make good
| prioritization decisions. It can be frustrating when
| there are so many things that we'd like to do, but we can
| only pick a small subset to actually do.
| dilyevsky wrote:
| I'm actually thinking of this for a side project (I'm very
| familiar with running production CRDB in k8s via my current
| job) - the use case is to bring up an isolated "virtual" CRDB
| cluster per customer to save on the operational overhead of
| managing many CRDB clusters in Kubernetes (which is doable
| but requires extra automation). That's not how we handle
| customer multitenancy at my current job, but I'm sure this
| could be a common use case. Now, I could use your cloud, but
| I sure am interested in a DIY option as well.
| vegasje wrote:
| I'm quite confused by Request Units and how to predict how
| many would be used by queries/operations.
|
| I launched a test cluster, and the RUs are continuously
| increasing without me having connected to the cluster yet. At
| this rate of RU climb, the cluster would use over 8M of the
| available 10M RUs in a month without me touching it.
|
| Coming from AWS, one of the most difficult aspects of using
| Aurora is guessing how many I/Os will be used for different
| workloads. It would be a shame to introduce this complexity for
| CockroachDB Serverless, especially if the RUs are impacted by
| internal cluster operations that aren't initiated by the user.
| sebbul wrote:
| I haven't connected to my cluster but my RUs keep going up.
| Extrapolating I'll be at 20M RUs over 30 days without using
| it.
| andydb wrote:
| You've run into a "rough edge" of the beta release that will
| be fixed soon. When you keep the Cluster Overview page open,
| it runs queries against your cluster so that it can display
| information like "# databases in the cluster". Unfortunately,
| those queries run every 10 seconds in the background, and are
| consuming RUs, which is why you see RU usage without having
| connected to the cluster yet. But never fear, we'll get that
| fixed.
|
| One thing that may not be clear - you get 10M RUs for free,
| up front, but you also get a constant accumulation of 100
| RU/s for free throughout the month. That adds up to >250M
| free RUs per month. This ensures that your cluster is always
| accessible, and that you never truly "run out" of RUs - at
| most you get throttled to 100 RU/s.
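|
| Back-of-the-envelope, in Python (assuming a 30-day month):
|
|     accrued = 100 * 60 * 60 * 24 * 30    # 100 RU/s, every second
|     print(f"{accrued:,}")                # 259,200,000 (~259M free RUs)
|     print(f"{accrued + 10_000_000:,}")   # plus the 10M grant: ~269M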
|
| I hear you on the difficulty of understanding how your
| queries map to RUs. SQL queries can be enormously complex and
| differ by multiple orders of magnitude from one another in
| terms of their compute cost. That's why we built a near real-
| time dashboard that shows you how quickly you're consuming
| RUs. You can run your workload for a few minutes and then
| check back on the dashboard to see how many RUs that workload
| consumed.
| corentin88 wrote:
| Do you have a getting started guide on CockroachDB Serverless?
| I couldn't find one in your docs [1]. This looks very
| interesting.
|
| [1] https://www.cockroachlabs.com/docs/v21.1/example-apps.html
| shampeon wrote:
| The Quickstart [1] is what you're looking for. Examples for
| NodeJS, Python, Go, and Java.
|
| [1]https://www.cockroachlabs.com/docs/cockroachcloud/quicksta
| rt...
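|
| A minimal Python sketch uses a standard Postgres driver
| (psycopg2), since CockroachDB speaks the Postgres wire
| protocol. The host and credentials below are placeholders;
| the real connection string comes from the Cloud console:
|
|     import psycopg2
|
|     # Placeholder DSN: substitute the one the console gives you.
|     conn = psycopg2.connect(
|         "postgresql://user:secret@free-tier.example.com:26257/"
|         "defaultdb?sslmode=verify-full")
|     with conn.cursor() as cur:
|         cur.execute("SELECT now()")
|         print(cur.fetchone())
|     conn.close()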
| jawns wrote:
| What's your elevator pitch for why my organization should use
| CockroachDB Serverless vs. something like AWS Aurora
| Serverless, particularly if we're already relatively invested
| in the AWS ecosystem?
| andydb wrote:
| Oh boy, I'm an engineer, but I'll do my best to pretend I'm
| on the sales or marketing team for a minute...
|
| First of all, CockroachDB Serverless is available on AWS, and
| should integrate quite well with that ecosystem, including
| with Serverless functions offered by AWS Lambda.
|
| Here are a few advantages of CockroachDB Serverless that
| Aurora will struggle to match (note that we're still working
| on Serverless multi-region support):
|
| 1. Free-forever tier. We offer a generous "free forever" tier
| that doesn't end after a month or a year. As the blog post
| outlines, our architecture is custom-built to make this
| economical.
|
| 2. No ceiling on write scalability. Even non-Serverless
| Aurora runs into increasing trouble as the number of writes /
| second increases past what a single machine can handle.
| CockroachDB just keeps going. We've had multiple high-scale
| customers who hit Aurora limits and had to move over to
| Cockroach to support business growth.
|
| 3. True multi-region support. Aurora only allows read-only,
| stale replicas in other regions, while CRDB allows full ACID
| SQL transactions. If you want to move into other regions of
| the world and have latency concerns or GDPR concerns, CRDB is
| custom-built to make the full SQL experience possible.
|
| 4. No Cloud lock-in. Perhaps this is not a concern for your
| company, but many companies don't like getting completely
| locked in to a single Cloud provider. CockroachDB works on
| multiple cloud providers and doesn't have a monetary interest
| in locking you in to just one.
|
| 5. Online schema changes. CockroachDB supports operations
| like adding/removing columns, renaming tables, and adding
| constraints without any downtime. You can perform arbitrary
| schema changes without disturbing your running application
| workloads. SQL DDL "just works" (see the sketch at the end of
| this comment).
|
| 6. Cold start in an instant. CockroachDB clusters
| automatically "scale to zero" when they're not in use. When
| traffic arrives, they resume in a fraction of a second.
| Compare that to Aurora, where you need to either have a
| minimum compute reservation, or you need to endure multi-
| second cold starts.
|
| 7. Great support. We've got a friendly Slack room where you
| can get free support and rub shoulders with fellow
| CockroachDB users, as well as CockroachDB folks like myself.
| We also have 24/7 paid support for deeper problems you might
| encounter.
|
| Taken altogether, CockroachDB can go wherever your business
| needs it to go, without all the constraints that traditional
| SQL databases usually have. Do you want thousands of clusters
| for testing/development/tiny apps at a reasonable cost? Could
| your business take off and need the scale that CRDB offers?
| Could your business need to expand into multiple geographic
| regions? Are some of your workloads erratic or periodic, but
| still should start up instantly when needed? It's not just
| about what you need now, but what you _may need_ in the
| future. It makes sense to plan ahead and go with a database
| that has "got you covered" wherever you need to go.
| zaphar wrote:
| I'm sold. What's your migration story from a decades-old
| MySQL database over to CockroachDB?
|
| I'm only half joking there.
| estambar wrote:
| not sure if this is working yet for Serverless but...
| https://www.cockroachlabs.com/docs/v21.1/migrate-from-
| mysql....
| the_arun wrote:
| You would do great in a sales role as well. Thanks for this
| write-up!
| 9dev wrote:
| That's a nice pitch, and it certainly piqued my curiosity!
| Going to check out CRDB.
| qaq wrote:
| Not a CRDB employee, but CRDB scales beyond what Aurora can
| support.
| kendru wrote:
| The lack of a serverless option was the only reason that I did
| not use Cockroach on a recent project. I am excited to see the
| serverless offering now, and the architectural details in the
| post are awesome. Nice work!
| timwis wrote:
| This sounds great! I've wanted to create an open data portal
| for a while that lets you spin up an (ephemeral, read-only)
| Postgres database of a dataset and run queries on it, maybe
| with a notebook. Sounds like this might be perfect!
| Shelnutt2 wrote:
| [Disclaimer: I work for TileDB, Inc]
|
| We have developed just this[1], except using a MariaDB
| storage engine (MyTile) we've written. You can run serverless
| queries against TileDB arrays without spinning up a MariaDB
| instance, and you can run any type of query MariaDB supports
| (joins, aggregates, CTEs, etc.). I've linked the basic
| documentation and an example notebook below[2]. You can run
| SQL queries from Python/R or even JS or curl. We support a
| number of data return formats, e.g. Arrow and JSON, to
| facilitate different use cases.
|
| I'll also mention that we have a number of public example
| datasets[3] in TileDB Cloud, such as the NYC taxi data used
| in this notebook[4], which you can explore!
|
| [1] https://docs.tiledb.com/cloud/api-reference/serverless-sql
|
| [2] https://cloud.tiledb.com/notebooks/details/TileDB-
| Inc/Quicks...
|
| [3] https://cloud.tiledb.com/explore/arrays
|
| [4] https://cloud.tiledb.com/notebooks/details/TileDB-
| Inc/tutori...
| chatmasta wrote:
| You might like what we're building at Splitgraph:
| https://www.splitgraph.com/connect
| timwis wrote:
| Oh wow, very relevant indeed! I guess I thought ephemeral DBs
| would be better so that a user's expensive query wouldn't bog
| down the DB for other users. And rather than just limiting
| users, it would let them run whatever queries they could with
| pg running locally.
| pachico wrote:
| I really see the greatness of the serverless option.
| Congratulations!
|
| What I can't really understand is why someone would use the
| dedicated cluster on AWS at that price.
| qaq wrote:
| pricing
| dmitriid wrote:
| - Pricing
|
| - Storage and DB access still mostly goes through hoops
|
| - Long-running processes
|
| - Pain to develop, test, and debug
| estambar wrote:
| One of the first apps to migrate from CockroachDB Dedicated to
| Serverless is live now if someone wants to try it out.
| https://web.flightchop.com/dashboard - posting for a friend.
___________________________________________________________________
(page generated 2021-10-26 23:01 UTC)