[HN Gopher] D1: Improvements to performance and scalability
___________________________________________________________________
D1: Improvements to performance and scalability
Author : eallam
Score : 94 points
Date : 2023-05-19 13:08 UTC (9 hours ago)
(HTM) web link (blog.cloudflare.com)
(TXT) w3m dump (blog.cloudflare.com)
| davnicwil wrote:
| So this looks like basically a distributed SQLite with read
| replication for free across cloudflare's edge. Is that right?
| m3kw9 wrote:
| So serverless meaning you don't manage the db servers, so they
| may or may not put your stuff on the edge, cloudflare takes
| care of maybe load balancing too?
| jgrahamc wrote:
| Yes, Cloudflare handles load balancing and neat tricks like
| measuring the latency between your users, your code and
| databases etc. and moving your code around the network to
| make it run fast: https://blog.cloudflare.com/announcing-
| workers-smart-placeme...
|
| The vision of the Supercloud is that you give us your code
| and we'll figure out where and how to execute it:
| https://blog.cloudflare.com/welcome-to-the-supercloud-and-
| de...
| kneebonian wrote:
| So I hate cloudflare with a burning passion, at some point
| cloudflare decided me home IP was bad and has started flagging it
| all over the place, which leads to vast swathes of the internet
| now being inaccessible to me.
|
| There is danger in centralized systems.
| v0idzer0 wrote:
| Change your IP? Most ISP change every modem reboot. Others you
| can call and request a change
| bombcar wrote:
| You may be able to force your ISP to give you a new IP by
| resetting your modem or leaving it off long enough, that may
| help.
|
| But once cloud flare hates you, you have to bow to them.
|
| Try browsing the normal web over Tor sometime and see how bad
| it can get.
| depingus wrote:
| When I started using Firefox with the Temporary Containers
| extension I would get absolutely bombarded by Cloudflare's
| captchas. To the point where if I clicked on a search result
| that sent me to a captcha, I would just close the tab and click
| the next result (which often resulted in another captcha, rinse
| repeat). That's when I realized just how big Cloudflare had
| gotten.
|
| I still use Temporary Containers and lately, I've noticed a
| sharp decline in these Cloudflare captchas. I don't know if its
| because people are moving away from them, or Cloudflare just
| found a better way to finger print me.
| kentonv wrote:
| Have you tried using Privacy Pass? It is a browser extension
| which helps prove that your traffic is legitimate which should
| reduce the incidence of challenges from Cloudflare, even if
| your IP has bad rep. It uses advanced cryptography to do this
| without creating any new tracking vectors.
|
| (Disclosure/disclaimer: I work for Cloudflare but in a
| different department; I'm not an expert on Privacy Pass.)
| simiones wrote:
| > It uses advanced cryptography to do this without creating
| any new tracking vectors.
|
| Sounds like something which is designed to create _hard to
| detect_ tracking vectors.
| kentonv wrote:
| Note that Privacy Pass is not a Cloudflare product, it's an
| open protocol which Cloudflare supports.
|
| https://datatracker.ietf.org/doc/charter-ietf-privacypass/
| threatofrain wrote:
| I enjoy Cloudflare and have been using workers + D1 for a few
| months, but warn that Cloudflare's definition of beta is far more
| beta than what other companies mean. The API surface may change
| repeatedly without corresponding documentation and their Discord
| can be a bit sparse on help.
| marcopicentini wrote:
| Why and when should I use a serverless Postgresql instead of a
| postresql hosted on a server?
| rozenmd wrote:
| this isn't postgres
| fire wrote:
| hmm, is this roughly equivalent to Neon[1], but sqlite based
| rather than postgres?
|
| 1: https://neon.tech
| nikita wrote:
| (neon CEO)
|
| This is true. Neon however offers bottomless storage and D1 is
| 100Mb currently going to 1Gb.
| colesantiago wrote:
| Can I connect directly to a D1 database with a SQLite URL? That
| would be awesome.
| ericstegemann wrote:
| Eek 0.75 per GB of storage! AWS is 0.115 One of our DBs has 2.3TB
| in it.
| Akkuma wrote:
| The costs as a whole has me worried. I'm not sure it'll be
| better than just using https://turso.tech/pricing as that is
| already a free tier of 8GB and it might cost less overall the
| paid tier.
| jFriedensreich wrote:
| at least at the moment D1 is for a completely different use
| case, d1 has currently a limit of 100mb and will now be
| increased to 1gb.
| yashap wrote:
| Yeah, I'd imagine this will be cheaper than AWS RDS for low
| storage use cases (lack of fixed monthly compute costs wins
| out), but more expensive for high storage. Like quite cheap for
| a 10-100 GB DB, quite expensive for a 1-10 TB GB.
|
| Though they do say:
|
| > when we enable global read replication, you won't have to pay
| extra for it, nor will replication multiply your storage
| consumption
|
| With AWS you'll have at least one read replica for failover, so
| $0.23/GB. And if you really want global read replicas, with AWS
| you might end up with something like a primary in North America
| and read replicas in South America, Europe and Asia. That would
| work out to $0.46/GB, so closes the gap a bit.
| nemothekid wrote:
| This is Cloudflare's flavor of Fly.io?
| pier25 wrote:
| Fly doesn't really offer distributed data.
|
| Edit:
|
| It does!
| avinassh wrote:
| is this not same - https://fly.io/docs/litefs/ ? They mention
| it as 'LiteFS - Distributed SQLite'
| pier25 wrote:
| Ah you're right!
|
| I thought this was a wrapper for Litestream but apparently
| it's a parallel project by the same author who Fly hired.
|
| https://litestream.io/
|
| https://github.com/benbjohnson/litestream/pull/411
| trollied wrote:
| Interesting, Time Travel works the same as Oracle Flashback. Hope
| there aren't any patents to trip over.
| pottertheotter wrote:
| So when would someone use something like this? I learn better by
| example if anyone has any.
|
| And for reference, here's the original D1 announcement with some
| additional info https://blog.cloudflare.com/introducing-d1/
| leetrout wrote:
| I use D1 through microfeed https://www.microfeed.org/
| kentonv wrote:
| You'd use it whenever you're building an application on
| Cloudflare Workers and you need to store data. D1 provides you
| with a SQL database. It's based on SQLite, so it's designed for
| relatively small datasets but can serve them very quickly from
| the edge.
|
| Note there are several alternatives here, too. Workers Durable
| Objects[0] provide a lower-level primitive for building
| advanced distributed systems. But D1 is easier to use for
| typical use cases. For blob storage you might use R2[1]. And
| for large databases Workers can easily integrate with several
| serverless database providers.[2]
|
| [0] https://blog.cloudflare.com/introducing-workers-durable-
| obje... [1] https://www.cloudflare.com/products/r2/ [2]
| https://blog.cloudflare.com/announcing-database-integrations...
| Akkuma wrote:
| Is it a non-goal to be long term usable for larger databases?
| That would force the usage of something like turso your
| closest direct comparison as a possible migration strategy or
| relying on "Smart Placement" (which from my point of view
| reduces the benefit of global edge) for other serverless non-
| global dbs.
| jgrahamc wrote:
| I mean, our object storage is called R2, our first database
| offering is called D1, and if we were to offer a fully
| Distributed Database then D2 seems like a good name.
| Akkuma wrote:
| And you can follow up D2 with D2: Lord of Distributed.
| kentonv wrote:
| Personally, I'm a firm believer that most "web app" use
| cases are better served by many small databases (e.g. per-
| user or per-document) rather than a single monolithic
| databases. This is especially true when serving users all
| around the world -- per-user databases can be located near
| each user (both for speed and to comply with data locality
| laws).
|
| What I'd like to enable here is a progression where you
| start out prototyping your app with a single D1 database,
| which is easy to use and reasonably fast. Then as you grow
| we provide tools to let you transition to many D1 databases
| sharded in a way that makes sense (e.g. per-user). Apps
| that want even more control can move to using full-on
| Durable Objects (which will soon support a SQLite database
| per-object).
|
| That said, there are certainly many use cases out there
| where simple monoliths make sense, especially non-
| interactive data crunching. I'm not sure yet if D1 will
| ever be the right choice for those, but the Workers
| platform aims to provide many options.
| Akkuma wrote:
| Thanks for the insight, I greatly appreciate it! This
| definitely is a reasonable idea for many things and I'm
| looking forward to seeing something similar to the
| sharding mechanism in the future.
|
| I've only started to think about this and I'm thinking
| the hardest part will be dealing with cross-cutting
| concerns (in a non-auto sharded world manually creating
| multiple database) and trying to find a way to keep each
| database isolated without extra burden compared to using
| a hosted Postgres.
|
| As an aside, that lan optimized house was a gaming dream.
| Hope your new house is as awesome.
| Wallacy wrote:
| " Apps that want even more control can move to using
| full-on Durable Objects (which will soon support a SQLite
| database per-object)."
|
| Can you elaborate this little bit more? Im using DO today
| and i have a bad time sharding my data (works, but i hate
| it);
|
| So i will have the option to use the standard store
| or/and SQLite?
|
| If so, i dont can keep with my DO (because i have control
| of everything) and use SQLite for things that is bigger
| than what the value store supports.
| kentonv wrote:
| Sorry, I don't quite understand what you're asking.
|
| In the future each DO will have a private SQLite
| database. The key/value store will actually be redirected
| to store into a special table in this database, but
| probably new apps will just use the database and not the
| KV store.
|
| Separately from that, I would like to develop tools that
| make sharding Durable Objects (and D1 databases) easier.
| Today it's a pain to do manually. This is independent
| from the underlying storage model, though.
| fyzix wrote:
| Cloudflare are 285 pops. Surely you dont need your db
| replication them all of them. A few locations per continent
| should suffice.
|
| For comparison, fly.io, turso's provider, has 34 locations
| and well-documented reliability issues.
| Akkuma wrote:
| That is a fair point. A few centralized locations will
| likely be more than sufficient for most use cases.
| Mystery-Machine wrote:
| "up to 37x faster" is NOT before: 37.81ms, now: 1.82ms
|
| You can't just round down 1.82ms to 1ms.
| elithrar wrote:
| Fixing this. We ran a few benchmarks (and some were much faster
| than 37x), but this was a more typical case. Not our goal to
| inflate numbers.
| ccorda wrote:
| Kenton Varda (tech lead) has some more notes in this twitter
| thread:
| https://twitter.com/KentonVarda/status/1659551757796515846
| rektide wrote:
| > _Our new engine is based on intercepting SQLite 's disk
| writes and doing clever stuff with them. It was so easy because
| the file format is not just well designed but amazingly well-
| documented._
|
| Quite the hobby project for a lot of people too! Other folks
| doing this:
|
| Rqlite https://hn.algolia.com/?query=Rqlite&sort=byDate ,
| Dqlite https://hn.algolia.com/?query=dqlite&sort=byDate ,
| Litestream https://hn.algolia.com/?query=Litestream&sort=byDate
| / LiteFS https://hn.algolia.com/?query=LiteFS&sort=byDate,
| marmot, mvsqlite
| https://hn.algolia.com/?query=mvsqlite&sort=byDate
| otoolep wrote:
| I've been doing it for almost 10 years. :-)
|
| https://www.philipotoole.com/9-years-of-open-source-
| database...
| aranke wrote:
| Curious if you prototyped DuckDB before deciding to invest
| further into SQLite.
|
| DuckDB works great as an in-memory database (it's also the
| default mode).
| kentonv wrote:
| We didn't, no.
|
| I'm sure there's a lot of really cool local-first databases
| out there, but SQLite has the benefit of being incredibly
| widely battle-tested, with literally billions of
| installations worldwide. It has received thorough security
| research and fuzzing (it's part of Chrome's attack surface
| after all). And there's tons of resources online to help
| people understand how to use it. Although I'm sure there are
| alternatives that serve certain use cases better it's hard to
| imagine anything coming close for ours.
|
| That said, the storage engine we've built is not that heavily
| dependent on SQLite specifically. Any database that uses a
| write-ahead log like SQLite does should be possible to adapt
| to it in the future. So maybe we'll eventually open it up to
| a variety of choices, or even let you bring your own as a
| Wasm module.
| kentonv wrote:
| Oh, I've been informed that DuckDB uses SQLite under the
| hood, so maybe compiling DuckDB to Wasm and running it on
| top of this will be possible, we'll see.
| aranke wrote:
| From https://news.ycombinator.com/item?id=23290512:
|
| > DuckDB is indeed a free columnar database system, but
| it is not entirely built on top of SQLite. It exposes the
| same front-end and uses components of SQLite (the shell
| and testing infrastructure), but the execution
| engine/storage code is new.
| rektide wrote:
| I expect most workloads are more OLTP/transactional than
| OLAP/analytical.
| elithrar wrote:
| Correct. DuckDB is really interesting technology, but it's
| not a direct successor to SQLite for transactional
| workloads. It's also very new: there's a LOT of new code in
| DuckDB on top of the (heavily fuzzed) SQLite parts.
|
| (I use it personally, but it's not the same thing as what
| we're building with D1)
| pier25 wrote:
| Has the DX of Workers improved?
|
| I think Deno is lightyears ahead.
| kentonv wrote:
| Hard to say without knowing what specific problem you have had
| but we are improving all the time. Several improvements
| announced just this week, take a look at the blog.
|
| Among other things, we made Wrangler (Workers CLI tool) use the
| open source workerd by default for local development, so local
| dev should produce a much more precise simulation now (since
| it's literally running the same code).
| pier25 wrote:
| And what about the DX of using Workers with Pages?
|
| I tried to use that recently and it was a disaster. I wrote
| about my experience here:
|
| https://twitter.com/pierbover/status/1641474067013271552
|
| I then opened these two issues:
|
| https://github.com/cloudflare/workers-sdk/issues/2962
|
| https://github.com/cloudflare/workers-sdk/issues/2964
|
| I ended up moving the project over to Netlify + Edge
| functions. I had it all working in like 5-10 mins as it
| should. Took me two hours to figure out why Workers weren't
| working in my Pages project, and could never get Workers
| working properly with my Astro project.
|
| I think you're working exclusively on the engine of Workers
| which is really top notch, but Cloudflare really needs to
| improve the outer layer which affects DX considerably.
| kentonv wrote:
| Sorry you experienced that. FWIW this announcement from
| Wednesday should help address the problems you ran into:
|
| https://blog.cloudflare.com/pages-and-workers-are-
| converging...
| KRAKRISMOTT wrote:
| Still no compatibility with standard ORMs like Prisma either.
|
| And the team has been aware of the issue for years now
|
| https://github.com/cloudflare/workers-sdk/issues/2701
|
| https://news.ycombinator.com/item?id=31341513
| kentonv wrote:
| Sorry that D1 has been slow out the gate. Now that we've
| solved the basic technical issues we can really focus on
| improving DX.
| Akkuma wrote:
| You can use drizzle-orm with Cloudflare and that is fully
| compatible.
| rozenmd wrote:
| Drizzle is amazing.
| Akkuma wrote:
| This is my first time using it and I've been very pleased
| with it so far. It keeps it simple, has solid typing &
| schema building, and reminds me of LINQ. I'm also a thin
| models kind of person and the fact this is just an object
| without needing to build ORM classes is even better.
| mscccc wrote:
| I've been using workers on & off since it launched. Just tried
| it again recently and the local dev experience with wrangler is
| excellent now.
___________________________________________________________________
(page generated 2023-05-19 23:00 UTC)