[HN Gopher] Show HN: WunderBase - Serverless OSS database on top...
___________________________________________________________________
Show HN: WunderBase - Serverless OSS database on top of SQLite,
Firecracker
Author : jensneuse
Score : 135 points
Date : 2022-09-15 14:22 UTC (8 hours ago)
(HTM) web link (wundergraph.com)
(TXT) w3m dump (wundergraph.com)
| azebazenestor wrote:
| Another way to do sqlite over S3 is:
|
| - https://github.com/uktrade/sqlite-s3vfs (Read/Write)
| - https://github.com/michalc/sqlite-s3-query (Read Only)
| avl999 wrote:
| SQLite, KVM, Firecracker, GraphQL, Serverless... if this was
| written in Rust it would hit the holy trinity of all the HN
| buzzwords that pull a post to the frontpage ;)
| Sujan wrote:
| The Prisma Query Engine is indeed written in Rust :D
| tptacek wrote:
| I'd love to say that the most interesting thing here is Fly
| Machines (I am biased) but really it's SQLite, which, no matter
| what platform you're using, makes it architecturally simpler to
| scale up and down, since you don't have to scale a database
| server up and down with your workload; the database is embedded
| in the app, which already had the scaling logic.
|
| People have been sleeping on SQLite and are starting to wake up
| and I'm kind of psyched to see what else they come up with
| (another very cool example of a software tool that really plays
| to SQLite's strengths is Datasette: https://datasette.io/).
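A minimal sketch of what "the database is embedded in the app" can look like, using Python's standard-library sqlite3 module. The `notes` table and the `open_db`, `add_note`, and `list_notes` helpers are hypothetical names chosen for illustration; nothing here is taken from WunderBase itself.

```python
import sqlite3

def open_db(path=":memory:"):
    """Open (or create) the database inside this process; there is no server to start."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn

def add_note(conn, body):
    # `with conn` wraps the statement in a transaction and commits on success.
    with conn:
        cur = conn.execute("INSERT INTO notes (body) VALUES (?)", (body,))
    return cur.lastrowid

def list_notes(conn):
    return [row[0] for row in conn.execute("SELECT body FROM notes ORDER BY id")]

conn = open_db()
add_note(conn, "hello")
add_note(conn, "world")
print(list_notes(conn))  # ['hello', 'world']
```

Because the queries are plain function calls into a library, scaling the app process up or down is the only scaling decision left for a single instance.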
| nijave wrote:
| As soon as you start scaling the app beyond 1 replica, you have
| to handle data replication again
| tptacek wrote:
| Of course.
| gregwebs wrote:
| This just runs a request proxy that turns off after 10 seconds of
| no activity and starts it up (with a half second delay) when
| there is a new request. It runs SQLite with Prisma. Prisma is an
| API server that puts a GraphQL API in front of a DB.
|
| It's a nice blog post about gluing technology and I can see how
| this could be a really nice way to run some lower-cost databases
| in a non-demanding development environment. However, it is not a
| reliable way of operating a database. For me it isn't really
| serverless since it only scales between 0 and 1 instance whereas
| a serverless DB ideally would scale-out but should at least have
| some ability to scale to greater load in response to demand,
| along with higher reliability and availability, and backing up
| data to object storage.
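The scale-between-0-and-1 behaviour described above can be sketched roughly as follows. This is not the project's Go proxy; `IdleScaler` and its `start`/`stop` callbacks are hypothetical stand-ins for whatever actually boots and stops the VM, and the clock is injectable only so the example runs without waiting.

```python
import time

class IdleScaler:
    """Tracks whether a backend should be running, based on request activity."""

    def __init__(self, start, stop, idle_seconds=10.0, clock=time.monotonic):
        self._start = start              # hypothetical callback: boot the backend
        self._stop = stop                # hypothetical callback: shut it down
        self._idle_seconds = idle_seconds
        self._clock = clock
        self._running = False
        self._last_request = None

    def on_request(self):
        """Call before forwarding a request; boots the backend if needed."""
        if not self._running:
            self._start()                # the cold-start delay is paid here
            self._running = True
        self._last_request = self._clock()

    def tick(self):
        """Call periodically; stops the backend once it has been idle long enough."""
        if self._running and self._clock() - self._last_request >= self._idle_seconds:
            self._stop()
            self._running = False

    @property
    def running(self):
        return self._running

# Usage with a fake clock so the example finishes instantly.
events = []
now = [0.0]
scaler = IdleScaler(
    start=lambda: events.append("start"),
    stop=lambda: events.append("stop"),
    clock=lambda: now[0],
)
scaler.on_request()   # first request boots the backend
now[0] = 5.0
scaler.tick()         # only 5 s idle: still running
now[0] = 15.0
scaler.tick()         # 15 s since the last request: stopped
scaler.on_request()   # next request boots it again
print(events)  # ['start', 'stop', 'start']
```

The sketch only ever toggles one backend, which is the limitation the comment points out: nothing in this logic scales beyond a single instance.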
| ithrow wrote:
| Couldn't find anything in the Prisma docs about it exposing a
| GraphQL API.
| jensneuse wrote:
| https://github.com/prisma/prisma-engines#query-engine
| lbhdc wrote:
| I was pretty interested to see how this worked as well. I think
| you are right, this is a toy. It will be interesting to see if
| they can solve scaling.
|
| It would be cool if you could ditch the graphql layer. It seems
| like there are other alternatives that go the vfs route so you
| still get to use a standard sqlite client.
| tptacek wrote:
| You can, of course, scale to >1 with the Fly Machines API. I
| don't know enough about how they're managing the SQLite part of
| this to say more about how this scales, except that I think
| scaling out SQLite is about to get a lot more interesting.
|
| But I mostly agree that we need a better term than "serverless"
| for this kind of stuff. The big things people seem to want from
| "serverless" solutions are "not managing long-running server
| instances" and "true usage-metered billing".
|
| Whether or not there are servers, like, at all has not all that
| much to do with things.
| shabbatt wrote:
| For many, especially those that are already on the AWS train,
| fly.io is not a viable alternative.
|
| I think that AWS already offers Aurora Serverless v2, which
| pretty much accomplishes what a lot of these me-too
| serverless services do, services that won't integrate as well
| as something that is offered out of AWS.
|
| Even if you were insistent on a cloud-agnostic mandate (which
| is really not logical, since there are at best 3 public clouds
| to choose from, and they are also vulnerable to targeted
| cyberattacks and faultlines), it would be hard to convince a
| large organization to switch to using SQLite on Fly.io.
| vosper wrote:
| > I think that AWS already offers Aurora Serverless v2,
| which pretty much accomplishes what a lot of these me-too
| serverless services do, services that won't integrate as
| well as something that is offered out of AWS.
|
| People should just be aware that Aurora Serverless v2 won't
| scale to zero, and you'll pay for it even if you never use
| it.
|
| https://www.lastweekinaws.com/blog/no-aws-aurora-
| serverless-...
| shabbatt wrote:
| Ah, thanks for pointing that out. Since this is new, it's
| bound to change. I'd be surprised if this didn't eventually
| scale to zero, but if I had to bet I would back AWS here
| going forward. This is way too critical for its serverless
| stack to get the Cognito treatment.
| tptacek wrote:
| Forget the Fly.io part; SQLite is what's interesting here.
| I agree in advance that it's unlikely anyone's going to
| convert a large app from Postgres to SQLite; if full-stack
| SQLite succeeds as a trend, it'll be with new apps that
| grow up using it.
| shabbatt wrote:
| interested to know what you see in sqlite here? why is
| there so much interest in this all of a sudden lately? am
| I missing something?
| tptacek wrote:
| It's a database that in full-stack culture has been
| relegated to "unit test database mock" for about 15 years
| that is (1) surprisingly capable as a SQL engine, (2) the
| simplest SQL database to get your head around and manage,
| and (3) can embed directly in literally every application
| stack, which is especially interesting in latency-
| sensitive and globally-distributed applications.
|
| Reason (3) is clearly our ulterior motive here, so we're
| not disinterested: our model user deploys a full-stack
| app (Rails, Elixir, Express, whatever) in a bunch of
| regions around the world, hoping for sub-100ms responses
| for users in most places around the world. Even within a
| single data center, repeated queries to SQL servers can
| blow that budget. Running an in-process SQL server neatly
| addresses it. Conveniently, most applications are read-
| heavy, and most performance-sensitive app requests are
| reads.
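The latency argument above is simple arithmetic, sketched here with integer microseconds. Only the 100 ms budget comes from the comment; the query count, round-trip time, and per-query execution time are hypothetical numbers picked to make the point visible.

```python
BUDGET_US = 100_000  # the "sub-100ms" target, in microseconds

def db_time_us(queries, round_trip_us, execute_us=100):
    """Time spent on sequential queries: each pays one round trip plus execution."""
    return queries * (round_trip_us + execute_us)

# Hypothetical request: 100 sequential queries, 1 ms round trip to a database
# server in the same data center, versus no network hop for an in-process call.
networked_us = db_time_us(100, round_trip_us=1_000)
in_process_us = db_time_us(100, round_trip_us=0)

print(networked_us, networked_us > BUDGET_US)    # 110000 True
print(in_process_us, in_process_us > BUDGET_US)  # 10000 False
```

With these assumed figures the networked case exceeds the budget on database time alone, while the in-process case leaves most of it for everything else.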
| shabbatt wrote:
| hmm but how would the replication and sync be handled if
| you have many sqlite instances on edge locations around
| the world? If someone inserts a row with id 234 and
| somebody on the other side of the world does the same,
| wouldn't this type of logic involve reaching into a central
| source of truth to compare the diff?
|
| tryna wrap my head around this architecture, it is quite
| interesting but concerning that it is now sharding into
| close-to-local sqlite instances located near the user.
| tptacek wrote:
| Yes: the model topology you should have in your head is
| "single writer, multiple readers" --- exactly the same
| way it would work with a conventional Postgres setup.
| What you're getting with SQLite here is that the reads
| themselves are served out of the app process rather than
| round-tripping over the network; otherwise, it's the same
| architecture.
|
| (You're not generally "reaching back to the central
| source of truth to compare" things, so much as
| "satisfying the write centrally and shipping out the new
| database pages back to the read replicas at the edges").
|
| More on this model: https://fly.io/blog/globally-
| distributed-postgres/
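A rough sketch of the "single writer, multiple readers" topology, with every database held in one process for illustration. The `Cluster` class is hypothetical, and `sqlite3.Connection.backup` (a whole-database copy) is used here only as a stand-in for shipping changed pages to replicas; real tooling would move far less data.

```python
import sqlite3

class Cluster:
    """One writable primary plus read replicas, all in-process for illustration."""

    def __init__(self, replica_count=2):
        self.primary = sqlite3.connect(":memory:")
        self.primary.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
        self.primary.commit()
        self.replicas = [sqlite3.connect(":memory:") for _ in range(replica_count)]
        self.ship()

    def write(self, name):
        # Every write goes to the single primary, so ids are assigned in one place.
        with self.primary:
            cur = self.primary.execute("INSERT INTO items (name) VALUES (?)", (name,))
        return cur.lastrowid

    def ship(self):
        # Stand-in for page shipping: copy the primary's contents to each replica.
        for replica in self.replicas:
            self.primary.backup(replica)

    def read(self, replica_index):
        rows = self.replicas[replica_index].execute(
            "SELECT id, name FROM items ORDER BY id"
        )
        return rows.fetchall()

cluster = Cluster()
a = cluster.write("from Frankfurt")
b = cluster.write("from Sydney")
print(a, b)                  # 1 2
stale = cluster.read(0)
print(stale)                 # [] because nothing has been shipped yet
cluster.ship()
fresh = cluster.read(1)
print(fresh)                 # [(1, 'from Frankfurt'), (2, 'from Sydney')]
```

Two writers never pick the same id because only the primary assigns ids; the cost is that replicas serve slightly stale reads until the next shipment arrives.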
| shabbatt wrote:
| Interesting, do you have plans to support GPU as well? I
| can see this is a bottom up approach: put a low load
| instance close to the user for reads and have a globally
| synced write that should handle race conditions etc
|
| Are there cold start delays? From the moment I type
| domain.com is it going to spin up a fly instance closest
| to me and serve the SQLite database reads?
|
| I'm gonna give this a go this weekend to see what it can
| do
| tptacek wrote:
| This is getting into Fly.io stuff and not WunderBase or
| SQLite stuff. GPU is a ways off for us: the programming
| interface for GPUs is tricky to implement with full
| isolation between VMs. The post we're commenting on talks
| a bit about cold start delays (a couple hundred
| milliseconds).
| gregwebs wrote:
| This is SQLite, so how would you scale it to > 1? Certainly
| you can put an app tier in front of this DB tier and scale
| the app tier to infinity.
| tptacek wrote:
| By replicating the SQLite transactions to other SQLite
| databases.
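One way to picture "replicating the SQLite transactions" is a log of committed transactions that replicas replay in order. The sketch below replays SQL statements, which is an assumption made for brevity; the `Primary` and `Replica` classes and the `kv` table are hypothetical, and this is not a description of how any tool named in the thread works.

```python
import sqlite3

SCHEMA = "CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)"

class Primary:
    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(SCHEMA)
        self.log = []  # one entry per committed transaction

    def transact(self, statements):
        """Apply a list of (sql, params) atomically, then record it in the log."""
        with self.db:  # commits on success, rolls back on error
            for sql, params in statements:
                self.db.execute(sql, params)
        self.log.append(list(statements))

class Replica:
    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(SCHEMA)
        self.applied = 0  # number of log entries already replayed

    def catch_up(self, log):
        """Replay every transaction this replica has not seen yet, in order."""
        for statements in log[self.applied:]:
            with self.db:
                for sql, params in statements:
                    self.db.execute(sql, params)
            self.applied += 1

primary, replica = Primary(), Replica()
primary.transact([("INSERT INTO kv VALUES (?, ?)", ("a", "1"))])
primary.transact([
    ("UPDATE kv SET v = ? WHERE k = ?", ("2", "a")),
    ("INSERT INTO kv VALUES (?, ?)", ("b", "3")),
])
replica.catch_up(primary.log)
print(replica.db.execute("SELECT k, v FROM kv ORDER BY k").fetchall())
# [('a', '2'), ('b', '3')]
```

A failed transaction raises before it reaches the log, so replicas only ever see work the primary actually committed.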
| soamv wrote:
| Hey, I like SQLite and I also like fly.io, but
| "distributed DB built with sqlite as storage" is really a
| very different beast from just "sqlite".
| tptacek wrote:
| I'm mostly interested in the SQLite part of this, and we
| don't have an SQLite offering, only Postgres, just to be
| clear. So you can't hurt my feelings here.
|
| When does SQLite become "a distributed DB built with
| sqlite as storage"? Did Postgres stop being Postgres when
| someone plugged log shipping into it? That's basically
| what we're talking about here --- not stuff like rqlite,
| which I'm also pretty interested in, but which really is
| a new database built on SQLite.
| gregwebs wrote:
| Users generally don't plug ad-hoc log shipping solutions
| into Postgres. They generally use the built-in battle-
| tested Postgres replication features, and they can set up
| synchronous replication to avoid data loss. Shipping a
| log is trivial but synchronous replication and failover
| are quite difficult to get right (see jepsen.io), and
| setting up failover for Postgres is still quite
| difficult. Newer DBs have been built from the ground up
| (CRDB, TiDB, etc) in part because of the difficulty of
| attempting to operate traditional DBs as reliable
| distributed systems.
| tptacek wrote:
| They do now, but that wasn't always the case, and people
| didn't say that you weren't running Postgres when you did
| that.
|
| Cockroach is not the same thing we're talking about here;
| it's a much more ambitious design, just like rqlite is
| much more ambitious than shipping SQLite transactions.
| What we're talking about here is the tooling needed to generate
| a single-writer multi-reader cluster the way you would
| for Postgres, but for SQLite instead. I don't know if
| single-writer multi-reader clusters for Postgres qualify
| as "easy", but they're not science projects.
|
| If it's not obvious: we love Cockroach. Our commercial
| bias is that we built a platform that is especially
| useful for distributed services and clusters, and
| Cockroach is very much that.
| jensneuse wrote:
| As discussed in the post, the next steps are to add read
| replicas. Regarding backups, that's possible with Litestream:
| https://github.com/benbjohnson/litestream
| gregwebs wrote:
| Are you going to use LiteFS then for replication [1]? LiteFS
| replication is asynchronous, meaning failover can lose the
| latest data. Will LiteFS scale down to 0? Does scaling down
| to zero mean electing a leader when scaling back up to 1 and
| will that have a delay? Will the read replicas scale down to
| 0 along with LiteFS when the primary scales down to 0?
|
| [1] https://github.com/superfly/litefs
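The failover concern raised above can be shown with a toy model. `Node`, `write`, `ship`, and `lost_on_failover` are hypothetical names, and the model says nothing about LiteFS internals; it only contrasts acknowledging a write before versus after the replica has it.

```python
class Node:
    def __init__(self):
        self.committed = []

def write(primary, replica, value, synchronous):
    """Commit on the primary; in synchronous mode, also on the replica before acking."""
    primary.committed.append(value)
    if synchronous:
        replica.committed.append(value)
    return "ack"

def ship(primary, replica):
    """Asynchronous catch-up: copy whatever the replica is missing."""
    replica.committed.extend(primary.committed[len(replica.committed):])

def lost_on_failover(primary, replica):
    """Acknowledged writes that vanish if the replica is promoted right now."""
    return primary.committed[len(replica.committed):]

# Asynchronous: the second write is acknowledged before it reaches the replica.
primary, replica = Node(), Node()
write(primary, replica, "w1", synchronous=False)
ship(primary, replica)
write(primary, replica, "w2", synchronous=False)
async_lost = lost_on_failover(primary, replica)
print(async_lost)  # ['w2']

# Synchronous: nothing is acknowledged until the replica has it.
primary, replica = Node(), Node()
write(primary, replica, "w1", synchronous=True)
write(primary, replica, "w2", synchronous=True)
sync_lost = lost_on_failover(primary, replica)
print(sync_lost)   # []
```

The window between the acknowledgement and the next shipment is where an asynchronous setup can lose data on failover.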
| benbjohnson wrote:
| LiteFS/Litestream author here. LiteFS will scale to zero
| with a persistent volume attached. For short-lived
| instances (aka serverless), we still have a few more
| features to complete on the road map (e.g. S3 replication,
| synchronous replication) to make that work well. Pure
| serverless (e.g. Lambda, Vercel) is also something we plan
| to support but we want to get LiteFS working well on more
| traditional deployments (e.g. longer running instances,
| Kubernetes, etc) first.
| snadal wrote:
| Slightly off-topic: according to what I read, it is a lightweight
| proxy written in golang that is capable of starting a vm when
| receiving network traffic and 10 seconds after the last request
| it turns off the fly machine.
|
| I've been looking for something similar for some time to use in
| my development docker instances (specifically with dokku). I have
| many services that consume little CPU time but a lot of RAM
| overall, yet they are actually used for only a few minutes each
| day.
|
| I don't want to use kubernetes for this as it adds too much
| complexity for the benefit I would get.
|
| Do you know any solution similar to this, to turn on / off docker
| containers when network traffic comes in?
| talhof8 wrote:
| Serverless database. Loving it! Good luck!
| bragr wrote:
| >I do not recommend to expose WunderBase to the public internet.
| The intended use case is to run it on a private network and
| expose it to your frontend via an API Gateway, like WunderGraph!
|
| My SEC team felt a disturbance in the force from me even
| considering this on our internal network. Security should not be
| a secondary consideration for a DB!
| shabbatt wrote:
| Seeing how many MongoDB instances were running wide open to the
| internet despite calls to not do so, your concern is certainly
| valid.
___________________________________________________________________
(page generated 2022-09-15 23:00 UTC)