[HN Gopher] DoorDash manages high-availability CockroachDB clusters at scale
___________________________________________________________________
DoorDash manages high-availability CockroachDB clusters at scale
Author : orangechairs
Score : 16 points
Date : 2023-11-02 21:13 UTC (1 hour ago)
(HTM) web link (www.cockroachlabs.com)
(TXT) w3m dump (www.cockroachlabs.com)
| rickreynoldssf wrote:
| I'm not really seeing why DoorDash needs all their operational
| data in one monster clustered database. I would think it's so much
| simpler to shard the data by region for operational queries and
| aggregate in the background for long-term storage.
| killingtime74 wrote:
| Resume-driven development
| mbyio wrote:
| Cockroach automates the sharding of data by region and provides
| tools that let you use and manage it more like a traditional
| database. If they didn't use Cockroach, they would have to
| write/set up tools and adapters to do all that anyway. It would
| probably be more familiar to developers conceptually if they
| used traditional sharding, but why build and maintain all that
| when you can just use off-the-shelf software?
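|
| For reference, the multi-region handling Cockroach exposes is
| just DDL. A minimal sketch (connection string, database, table,
| and region names are all invented; the regions have to match
| the cluster's configured localities):
|
|     import psycopg2  # CockroachDB speaks the Postgres wire protocol
|
|     conn = psycopg2.connect("postgresql://root@localhost:26257/orders")
|     conn.autocommit = True
|     cur = conn.cursor()
|
|     # Tell the database which regions it lives in.
|     cur.execute('ALTER DATABASE orders SET PRIMARY REGION "us-east1"')
|     cur.execute('ALTER DATABASE orders ADD REGION "us-west1"')
|
|     # Pin each row to the region in its hidden crdb_region
|     # column; CockroachDB handles placement and rebalancing.
|     cur.execute("ALTER TABLE deliveries SET LOCALITY REGIONAL BY ROW")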
| JohnBooty wrote:
| I've never done geographic sharding but it seems kind of hard.
| How do you pick shard boundaries? How do you deal with entities
| who are near the boundaries and whose current operational data
| therefore spans more than one shard? (Imagine somebody near the
| geographic intersection of, like, five shards looking for pizza
| in a 10-mile radius or w/e)
|
| Also the majority of entities they're tracking (users, drivers)
| do not have fixed locations.
|
| Maybe it's not as hard as I'm thinking. I guess you just have
| to accept that any query can span an arbitrary number of shards
| and the results need to be union'd.
|
| I'm sure a lot of smart people have tackled this at the
| DoorDashes and Ubers of the world and maybe there's some
| optimal way of handling it. I would love to hear about that.
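|
| The fan-out-and-union version would look something like this (a
| rough sketch, not anyone's production code: shard names, DSNs,
| and the PostGIS-style schema are invented, and deciding which
| shards the search circle touches is left to the caller):
|
|     import psycopg2
|
|     SHARDS = {
|         "sf": "postgresql://app@sf-db:5432/ops",
|         "la": "postgresql://app@la-db:5432/ops",
|     }
|
|     def restaurants_near(lat, lng, radius_m, shard_ids):
|         """Query every shard the search circle touches and
|         union the results in the application."""
|         results = []
|         for sid in shard_ids:
|             with psycopg2.connect(SHARDS[sid]) as conn:
|                 with conn.cursor() as cur:
|                     cur.execute(
|                         "SELECT id, name FROM restaurants"
|                         " WHERE ST_DWithin(location,"
|                         " ST_MakePoint(%s, %s)::geography, %s)",
|                         (lng, lat, radius_m),
|                     )
|                     results.extend(cur.fetchall())
|         return results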
| jfim wrote:
| > I've never done geographic sharding but it seems kind of
| hard. How do you pick shard boundaries? How do you deal with
| entities who are near the boundaries and whose current
| operational data therefore spans more than one shard? (Imagine
| somebody near the geographic intersection of, like, five shards
| looking for pizza in a 10-mile radius or w/e)
|
| You could do it by market (e.g. SFBA, Los Angeles, San Diego)
| or by state.
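|
| The routing layer for that can start out as a static map
| (sketch; market names and DSNs are invented):
|
|     MARKET_TO_DSN = {
|         "sfba": "postgresql://app@shard-sfba:5432/ops",
|         "los_angeles": "postgresql://app@shard-la:5432/ops",
|         "san_diego": "postgresql://app@shard-sd:5432/ops",
|     }
|
|     def dsn_for(market: str) -> str:
|         # Every query path in the application has to go
|         # through a lookup like this one.
|         return MARKET_TO_DSN[market]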
| sciurus wrote:
| The article says they have 300+ clusters, not one monster one.
| jordanthoms wrote:
| Sharding is anything but simple. A single shard per region
| wouldn't have enough write capacity, so they'd likely be
| managing 100+ shards in each region - you'd have to build a
| lot of infrastructure to automate setting those up and
| rebalancing traffic to avoid hot spots and underutilized
| shards, all in sync with schema migrations, etc.
|
| Even after that, now your applications using the DB have to be
| aware of the sharding - interactions between users who are
| housed on different shards etc could require a lot of work at
| the application layer. If your customers can easily be split
| into tenants that never interact with each other, this isn't so
| bad, but for a consumer app like DoorDash there are no clear
| tenant boundaries.
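|
| To make that concrete, a cross-shard interaction roughly forces
| this shape on the application (a sketch with invented table and
| column names; real systems add outbox tables, retries, and
| reconciliation):
|
|     def send_referral(conn_a, conn_b, from_user, to_user):
|         """The two users live on different shards, so no single
|         ACID transaction covers both writes; the application
|         must order them and handle partial failure itself."""
|         with conn_a.cursor() as cur:
|             cur.execute(
|                 "UPDATE users SET referrals_sent = referrals_sent + 1"
|                 " WHERE id = %s", (from_user,))
|         conn_a.commit()
|         try:
|             with conn_b.cursor() as cur:
|                 cur.execute(
|                     "UPDATE users SET referral_credit = referral_credit + 1"
|                     " WHERE id = %s", (to_user,))
|             conn_b.commit()
|         except Exception:
|             # The second write failed after the first committed;
|             # compensation logic goes here.
|             raise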
|
| We looked at all this for Kami and realised that it would be
| much easier for us to move from PostgreSQL to CockroachDB (we
| had exceeded the write capacity of a single PostgreSQL primary)
| than to shard Postgres, and it'd make future development much
| faster. We could have made sharding work if we had to... but
| it's not 2013 any more and we have distributed SQL databases -
| why not use them?
| snihalani wrote:
| Interesting. Curious if anyone has benchmarked it relative to
| other DBs, like https://benchmark.clickhouse.com/
| cebert wrote:
| This reads like a long-form advertisement.
| al_borland wrote:
| Case studies hosted on a company's own website generally are.
| It's kind of an "it worked for them, so it will work for
| you" thing.
| candiddevmike wrote:
| "Art of the possible" (YMMV)
___________________________________________________________________
(page generated 2023-11-02 23:00 UTC)