[HN Gopher] Show HN: DriftDB - an open source WebSocket backend ...
___________________________________________________________________
Show HN: DriftDB - an open source WebSocket backend for real-time
apps
Hey HN! I've written a bunch of WebSocket servers over the years to
do simple things like state synchronization, WebRTC signaling, and
notifying a client when a backend job was run. I realized that if I
had a simple way to create a private, temporary, mini-redis that
the client could talk to directly, it would save a lot of time. So
we created DriftDB. In addition to the open source server that you
can run yourself, we also provide https://jamsocket.live where you
can use an instance we host on Cloudflare's edge (~13ms round trip
latency from my home in NY). You may have seen my blog post a
couple months back, "You might not need a CRDT"[1]. Some of those
ideas (especially the emphasis on state machine synchronization)
are implemented in DriftDB. Here's an IRL talk I gave on DriftDB
last week at Browsertech SF[2] and a 4-minute tutorial of building
a cross-client synchronized slider component in React[3].
[1] https://news.ycombinator.com/item?id=33865672
[2] https://www.youtube.com/watch?v=wPRv3MImcqM
[3] https://www.youtube.com/watch?v=ktb6HUZlyJs
Author : paulgb
Score : 302 points
Date : 2023-02-03 11:12 UTC (11 hours ago)
(HTM) web link (driftdb.com)
| globalise83 wrote:
| This looks just about perfect for powering all those team online
| games we all played a lot during lockdown (and still do), is that
| right?
| paulgb wrote:
| Yep, in fact, it was making a word game[1] for my family to
| play on zoom calls early in lockdown that sent me down the
| rabbit hole of synchronizing state in distributed systems.
|
| [1] https://word.red
| Thaxll wrote:
| WebSocket is not very good for online games because it's
| TCP-based. Also, there are millions of WebSocket libraries in
| every language.
| paulgb wrote:
| Right, WebSocket is fine for a chess game, but you wouldn't
| use it for a first-person shooter.
|
| If you do want UDP from a browser (via a WebRTC data channel)
| you first need a side channel to establish the connection,
| and DriftDB is handy for that.
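A minimal sketch of the side-channel idea above, assuming the room behaves as a plain broadcast channel: every peer sees every signaling message and keeps only the ones addressed to it. `BroadcastRoom`, `SignalMessage`, and `isForMe` are hypothetical names for illustration, not part of DriftDB. In a browser the payloads would typically be the serialized offer, answer, and ICE candidates produced by `RTCPeerConnection`; here they are plain strings.

```typescript
// Hypothetical in-memory stand-in for a broadcast room used as a WebRTC
// signaling side channel. Not DriftDB's API.
type SignalKind = "offer" | "answer" | "ice-candidate";

interface SignalMessage {
  from: string;
  to: string;
  kind: SignalKind;
  payload: string; // e.g. serialized SDP or ICE candidate
}

class BroadcastRoom {
  private readonly listeners = new Map<string, (msg: SignalMessage) => void>();

  // Register a peer and the callback that receives every broadcast.
  join(peerId: string, onMessage: (msg: SignalMessage) => void): void {
    this.listeners.set(peerId, onMessage);
  }

  // Deliver the message to every peer except the sender.
  broadcast(msg: SignalMessage): void {
    this.listeners.forEach((listener, peerId) => {
      if (peerId !== msg.from) listener(msg);
    });
  }
}

// Each peer ignores signaling messages addressed to someone else.
function isForMe(myId: string, msg: SignalMessage): boolean {
  return msg.to === myId;
}
```

Once the offer and answer have been exchanged this way, the peers can talk over the data channel directly and the room is no longer on the hot path.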
| alexisread wrote:
| I've not looked at DriftDB in depth (cloudflare worker running
| this is neat!), but can't MQTT handle this sort of workload?
|
| Obv. there's not a cloudflare worker running say an MQTT server
| over websockets, but you can scope topics with wildcards
| (https://www.hivemq.com/blog/mqtt-essentials-part-5-mqtt-
| topi...), replay missed messages on reconnection, last-will-and-
| testament, ACLs, dynamic topic creation, binary messages etc.
|
| I'm asking as many of these websocket projects seem to use custom
| protocols rather than anything standard aka interoperable.
| nine_k wrote:
| Maybe it's the richness of MQTT that makes it a worse choice
| for a startup. Offering a conformant MQTT broker is a lot of
| work, and the semantics come from elsewhere, not geared towards
| emphasizing your unique advantages.
|
| Building a much simpler, custom-tailored protocol lets you
| ship faster and improve gradually. If the point is to deploy
| on Cloudflare in a massively parallel fashion (which is likely
| harder for a regular MQTT broker), the custom protocol lets
| you concentrate on that special advantage, and not on standards
| conformance or interoperability with a bevy of existing
| libraries.
| paulgb wrote:
| The problem with MQTT is that most of the use cases I'm
| interested in involve a web browser as at least one party of
| the connection, and the browser doesn't support MQTT. I could
| wrap MQTT in a WebSocket, but then I'd lose the advantages of
| MQTT's compactness and interoperability (unless MQTT-over-
| WebSocket is a thing?)
|
| The other operation that I haven't seen elsewhere, but is vital
| to enabling stream compaction without a leader, is the idea of
| a stream rollup _up to a specific stream number_. NATS
| Jetstream, for example, has the ability to roll up an entire
| stream, but if another message hits the stream between when the
| rollup is computed and when it arrives at the server, that
| message too will be replaced (IIRC). So I thought about using
| NATS (which already has a WebSocket protocol), but ruled it
| out.
| alexisread wrote:
| MQTT-over-WebSocket does exist
| (https://github.com/mqttjs/MQTT.js), and most MQTT brokers
| support it (Mosquitto, Amazon MQ, etc.). You're right about the
| compaction: MQTT doesn't have anything in its protocol
| about compaction, and I don't know of any brokers that
| implement it. Having said that, you could use an MQTT-Kafka
| bridge.
|
| Something like Mosquitto +
| https://github.com/nodefluent/mqtt-to-kafka-bridge + Redpanda
| in a Docker image would work, though obviously this might be
| overkill for most. Having said that, it does open many new
| avenues for interaction at scale. You pays your money...
| fud101 wrote:
| What is compaction?
| paulgb wrote:
| Compaction is where you take a chunk of messages and
| replace them with a single message.
|
| For example, one of the DriftDB demos is a counter
| (https://demos.driftdb.com/counter). State is
| synchronized by putting increment/decrement events into a
| stream. When a new user connects, their client gets all
| the messages in the stream, plays them back, and arrives
| at the same state as everyone else.
|
| If that's all we did, over time, the stream would grow
| unruly. It would take ages to load the page because we'd
| have to load every state change. But we only really care
| about a single numeric value. Compaction takes a chunk of
| messages that look like this:
| {"apply":"increment"}
| {"apply":"increment"}
| {"apply":"decrement"}
| {"apply":"increment"}
|
| And replaces them with a message that looks like this:
| {"reset":2}
|
| DriftDB doesn't know _how_ to compute the compaction, it
| relies on clients to do that. When a client does
| something that increases the length of the stream, the
| server sends back the new length of the stream, so that
| the client can decide whether to compact it (i.e. if it
| passes some threshold).
|
| The important part that I haven't seen elsewhere is that
| when a client compacts the stream, it includes a sequence
| number of the last message that's part of the compaction.
| The server will preserve messages greater than that
| sequence number, since they are not part of the
| compaction.
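A minimal sketch of the compaction scheme described above, using the counter messages from the example. `CompactableStream`, `counterValue`, and the method names are hypothetical and only model the idea in memory; they are not DriftDB's protocol or API. The point it illustrates is that a rollup carries the sequence number it covers, so a message that arrives after the client computed the rollup survives.

```typescript
// Hypothetical in-memory model of a stream with "compact up to sequence N".
// Names and shapes are illustrative assumptions, not DriftDB's API.
type CounterMessage =
  | { apply: "increment" }
  | { apply: "decrement" }
  | { reset: number };

interface Entry {
  seq: number;
  value: CounterMessage;
}

class CompactableStream {
  private entries: Entry[] = [];
  private nextSeq = 1;

  // Append a message and report the new stream length, so the caller can
  // decide whether the stream has passed its compaction threshold.
  push(value: CounterMessage): { seq: number; length: number } {
    const seq = this.nextSeq++;
    this.entries.push({ seq, value });
    return { seq, length: this.entries.length };
  }

  // Replace every entry with seq <= upToSeq by a single rollup entry.
  // Entries with a greater sequence number are preserved.
  compact(upToSeq: number, rollup: CounterMessage): void {
    const newer = this.entries.filter((e) => e.seq > upToSeq);
    this.entries = [{ seq: upToSeq, value: rollup }, ...newer];
  }

  // What a newly connected client would receive.
  replay(): Entry[] {
    return [...this.entries];
  }
}

// Client-side playback: fold the messages into the current counter value.
function counterValue(entries: Entry[]): number {
  let n = 0;
  for (const { value } of entries) {
    if ("reset" in value) n = value.reset;
    else if (value.apply === "increment") n += 1;
    else n -= 1;
  }
  return n;
}
```

With the four messages from the example, a client that computes `{"reset":2}` at sequence 4 and compacts after a fifth increment has already landed leaves a two-entry stream that still plays back to 3.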
| chrisdalke wrote:
| Most MQTT implementations do support MQTT-over-Websocket. I
| use it extensively at work and it's been fairly reliable!
| elithrar wrote:
| > The problem with MQTT is that most of the use cases I'm
| interested in involve a web browser as at least one party of
| the connection, and the browser doesn't support MQTT. I could
| wrap MQTT in a WebSocket, but then I'd lose the advantages of
| MQTT's compactness and interoperability (unless MQTT-over-
| WebSocket is a thing?)
|
| We support MQTT over WS (or JSON over WS, or just HTTP) in
| Cloudflare Pub/Sub, FWIW -
| https://developers.cloudflare.com/pub-
| sub/learning/websocket...
|
| I also agree with the comments re: MQTT being well suited to
| a lot of these "broadcast" use cases, but that the IoT roots
| seem to hold it back. MQTT 5.0 is just a great protocol --
| clear spec, explicit about errors, flexible payloads -- which
| makes it well suited to these broadcast/fan-in/real-time
| workloads. The traditional cloud providers do MQTT (3.1.1) in
| their respective IoT platforms but never grew it beyond that.
| jconley wrote:
| Can I get in on that beta? Submitted the form yesterday.
| Currently building something that could use it. ;)
| manv1 wrote:
| Funny, the IoT space has bought into MQTT but the general
| internet space has not.
|
| MQTT scales and works. And it's easy, fast, and small.
|
| I've been trying to get our guys to do MQTT-based pub/sub, and
| they'd rather do their own thing with WebSockets because MQTT
| is scary. <shrug>.
|
| That's the problem when front-end guys make decisions about
| tech sometimes, they choose stuff that seems easy to integrate
| without caring about things like deployment, scalability,
| capabilities, etc.
| manv1 wrote:
| I mean, it'd be trivial to write stream replay for MQTT. It's
| literally just stashing messages and sending them back on
| connect. Not sure what the issue is there.
| Scottopherson wrote:
| Jeez that's a big paint brush you're slinging around.
|
| That's the problem when non-front-end guys make decisions
| about tech sometimes, they choose stuff that seems easy to
| integrate without caring about things like accessibility,
| design scalability, client device capabilities, etc.
| nine_k wrote:
| How does a wire protocol relate to UX concerns like
| accessibility or design scalability?
|
| Client device capabilities are there, MQTT is neither
| rocket science nor a resource hog, since it was designed
| for underpowered IoT devices.
| jwilber wrote:
| Awesome stuff. Here's a short video talking about DriftDB at
| Browsertech SF (I believe this is an event put on by them,
| "Drifting in Space"): https://www.youtube.com/watch?v=wPRv3MImcqM
| ocimbote wrote:
| Plane.dev is mentioned.
|
| Does anyone have experience with it? It seems quite interesting,
| but I need more opinions on what they call "backend sessions"...
| BTBurke wrote:
| This is great. I'm going to use this with something I'm working
| on. The edge behavior is just what I need.
|
| When you say limitations are a "relatively small number of
| clients need to share some state over a relatively short period
| of time," I read in another comment about a dozen or so clients,
| but what about the time factor? Can it be on the order of hours?
| paulgb wrote:
| > but what about the time factor? Can it be on the order of
| hours?
|
| So far I've focused on use cases where clients are online for
| overlapping time intervals. When all the clients go offline,
| Cloudflare will shut down the worker after some period and the
| replay ability will be lost. The core data structure is
| designed such that it could be stored in the Durable Object
| storage Cloudflare provides, but I haven't wired it up yet.
| BTBurke wrote:
| One more thought - any consideration of hooking this to
| Cloudflare's queue? Then you could optionally connect another
| worker to that and e.g. persist everything in their D1 SQLite
| database.
| paulgb wrote:
| I haven't looked at the queue specifically, but Durable
| Objects have a nice key/value storage mechanism that
| happens to map nicely. It would take a bit of munging to
| make it work for a stream instead of a single value, but I
| have a design in mind.
| BTBurke wrote:
| That works perfectly for what I'm using it for. Thanks for
| building this!
| jcq3 wrote:
| I didn't find a use case section, which is the first thing I
| read, before code, implementation examples, or anything else.
| Why is it always lacking on SaaS landing pages?
| paulgb wrote:
| Good feedback, here you go :) https://github.com/drifting-in-
| space/driftdb/commit/8d946217...
| jcq3 wrote:
| Beautiful, now it makes me want to use your tool because I
| can relate to use cases I might have...
| speps wrote:
| Reminds me of Colyseus: https://github.com/colyseus/colyseus
|
| Colyseus has support for persistence as well as matchmaking!
| mrtksn wrote:
| How are race conditions handled? If one of the clients of the
| shared state delivers its input with a delay (network issue,
| etc.), will it overwrite the other client's state once delivered,
| or will it be dismissed? Is there a concept of slave/master client?
|
| Edit:
|
| So, I played a bit and it appears that if a client is
| disconnected and changes to the state happen while it is offline,
| once it reconnects these changes will be applied to the other
| client, which was making its own changes to the state. So it's
| working on a "last message" basis? Also, it seems like it can't
| detect the offline/online status?
|
| I'm curious because the interesting part of this kind of system
| is the way races are handled.
| paulgb wrote:
| > So, I played a bit and it appears that if a client is
| disconnected and changes to the state happen while it is offline,
| once it reconnects these changes will be applied to the other
| client, which was making its own changes to the state. So it's
| working on a "last message" basis? Also, it seems like it can't
| detect the offline/online status?
|
| From the server's point of view, it's just an ordered broadcast
| channel with replay. The conflict semantics are whatever you
| build on top of that.
|
| The `useSharedState` hook in the React bindings implements
| last-write-wins. For the `useSharedReducer` hook, the reducer
| itself determines the semantics, but in the voxel editor demo
| we also use last-write-wins.
|
| > Also it seems like it can't detect the offline/online status?
|
| Online/offline status is exposed in the client libraries, e.g.
| in the React bindings there is a useConnectionStatus hook:
| https://driftdb.com/docs/react#useconnectionstatus-hook
|
| > I'm curious because the interesting part of this kind of
| system is the way races are handled.
|
| It's academically the interesting part, but I think it matters
| less than people assume it does. Here's a section from a blog
| post I wrote a couple months ago:
|
| > Developers may find it tempting to treat collaborative
| applications as any other distributed systems, and in many ways
| that's a useful way to look at them. But they differ in an
| important way, which is that they always have humans-in-the-
| loop. As a result, many edge cases can simply be deferred to
| the user.
|
| > For example, every multiplayer application has to decide how
| to handle two users modifying the same object concurrently. In
| practice, this tends to be rare, because of something I call
| social locking: the tendency of reasonable people not to
| clobber each other's work-in-progress, even in the absence of
| software-based locking features. This is especially the case
| when applications have presence features that provide hints to
| other users about where their attention is (cursor position,
| selection, etc.) In the rare times it does occur, the users can
| sort it out among themselves.
|
| > A general theme of successful multiplayer approaches we've
| seen is not overcomplicating things. We've heard a number of
| companies confess that their multiplayer approach feels naive
| -- especially compared to the academic literature on the topic
| -- and yet it works just fine in practice.
|
| https://driftingin.space/posts/you-might-not-need-a-crdt
| mrtksn wrote:
| Good point, in the case of users interacting it's probably a
| non issue. Thanks for the insight.
| Aldipower wrote:
| How can something be real-time if there is a WebSocket
| connection in between? How do you ensure real time? In real-time
| applications, response times must be guaranteed. That seems
| impossible to me with WebSocket connections.
| paulgb wrote:
| I mean real-time apps in the colloquial sense - applications
| where two people see the same state nearly instantly. In the
| strict computation sense, it's true that you can't guarantee an
| upper bound for delivery of a message. This isn't just a
| limitation of WebSockets, it's a limitation of TCP/IP, which
| doesn't provide a way to reserve bandwidth along a path between
| hosts (IIRC).
| bufferoverflow wrote:
| SurrealDB was supposed to be a websocket real time DB, but it
| seems they never finished that websocket part.
|
| Glad there's an alternative.
|
| https://surrealdb.com/docs/integration/websockets
| winrid wrote:
| Reminds me of DerbyJS and ShareDB/Racer. It's a pretty productive
| stack, but came out at the wrong time. You can plug in different
| storage engines (mongo, postgres) and it handles conflicts via
| operational transform.
| JohnCClarke wrote:
| useState() --> useSharedState()
|
| My brain just exploded with how perfect this DX is! Love it!
| stmblast wrote:
| This is really cool!
|
| Looking forward to seeing how this progresses.
| ArtWomb wrote:
| Seems expensive, no? To start an HTTP container per request? But I
| suppose it does solve many server-side persistence issues. And I
| love the power it affords you in creating virtual worlds. Awesome
| stuff ;)
|
| https://github.com/drifting-in-space/plane
| paulgb wrote:
| We created Plane, but we're actually not using it for this!
| DriftDB stemmed out of realizing that a lot of the use cases
| people were coming to Plane for were simple WebSocket servers
| for which spinning up a container is excessive.
|
| Plane is still great (I mean, I'm biased) if you want to run a
| WebSocket server that implements custom business logic, uses
| heavy compute, GPUs[1], or is stateful.
|
| [1] teaser: https://canvas.stream/
| ArtWomb wrote:
| Blender over WebRTC demo looks fast too ;)
| paulgb wrote:
| Thanks! My colleagues gave a talk last week on streaming
| data visualization that you might like:
| https://www.youtube.com/watch?v=0WyeZ9lKdSU
| quickthrower2 wrote:
| Would be fascinating if you could build Jitsi-like video on top
| of this.
|
| I think DB in the name is a little misleading due to there being
| no persistence (I assume?) but that is a small nitpick!
| paulgb wrote:
| > I think DB in the name is a little misleading due to there
| being no persistence (I assume?) but that is a small nitpick!
|
| Yes, I feel a bit guilty about that part. When I started it the
| design looked more like a traditional key/value or durable
| stream database with real-time capabilities, but over time I
| realized that the use cases I had in mind usually didn't
| actually need long-term persistence. The DB stuck, partly
| because it turns out if you add "db" as a suffix it's a lot
| easier to find available package names and domains :). If it's
| any consolation, I still do intend to support persistence
| eventually.
| quickthrower2 wrote:
| Thanks for the reply. Sounds like a neat bit of
| infrastructure. Well done for getting it done! I almost want
| to create a project as an excuse to use it ha ha! Also naming
| stuff is hard of course.
| dabeeeenster wrote:
| This is super interesting! Do you have any data on how well this
| scales when running on Cloudflare Edge? Can you run more than one
| instance and have them share state?
| paulgb wrote:
| Thanks! When hosted on Cloudflare, it uses their Durable
| Objects product. Rather than running multiple backend instances
| that share state, it's set up so that all users in the same
| "room" are connected to the same instance. The instances can
| then be scaled out horizontally (but Cloudflare takes care of
| that.)
|
| Within a room, things are a bit more constrained. We haven't
| found the limit yet, and I suspect it's pretty high, but our
| design goal was to support on the order of dozens of users in a
| room, not necessarily beyond that. (Targeting e.g. a shared
| whiteboard use case)
| tmikaeld wrote:
| We also looked at using Cloudflare, but it was prohibitively
| expensive, because you pay for the duration of each "room"
| (Connection, depending on how you use it).
|
| https://developers.cloudflare.com/workers/platform/pricing/#...
|
| Eventually we went with Centrifuge.
| paulgb wrote:
| Yeah, it remains to be seen whether it is economical for us
| to keep the hosted version on CF. I suspect that for users
| who want to run their own geographically distributed
| instance of it, CF will be the path that makes sense for
| the majority of them.
|
| Who did you end up going with as a hosting provider?
| (Centrifuge looks to be a library, if I'm looking at the
| right thing)
| unraveller wrote:
| CF edge wants you to be more one and done, very anti
| connection. Deno is the better priced edgejs compute for
| websockets last I checked.
|
| Probably still worthwhile for DriftDB SaaS if mainly short
| lived connections are used, even though similar
| functionality can be had with NATS bridge + an ordered
| streaming library in your fav language on fly.io
| e1g wrote:
| "Centrifuge" as in
| https://github.com/centrifugal/centrifugo ?
| [deleted]
| matt-attack wrote:
| > DriftDB is a real-time data backend that runs on the edge.
|
| What does "on the edge" mean in this context? Can I just run the
| server part on my own infrastructure? What if I have multiple
| pods for redundancy, and client web connections might get
| connected randomly to any of those pods? How would the pods all
| share state between each other?
| paulgb wrote:
| > What does "on the edge" mean in this context?
|
| DriftDB has a concept of "rooms", which are essentially
| broadcast channels. By "on the edge", what I mean is that the
| authoritative server for each room can be geographically
| located near the participants in that room. In practice, today
| that means that it can be compiled to WebAssembly and run as a
| Cloudflare Worker.
|
| > Can I just run the server part on my own infrastructure?
|
| Kinda. It includes a server that runs locally, but it's only
| useful as a development server at this point. Your question
| about multiple pods is exactly the reason -- unless you have a
| routing layer that is aware of DriftDB's "rooms", it won't work
| if you scale it up. We also make https://plane.dev which
| provides the routing layer, but it might be overkill for a
| DriftDB use case.
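A minimal sketch of what a room-aware routing layer could look like for the multiple-pods question above. It assumes a fixed list of backends and a stable hash of the room ID, so every connection for a given room lands on the same backend. `hashRoomId` and `backendForRoom` are hypothetical helpers, not something DriftDB or Plane is confirmed to provide.

```typescript
// Hypothetical room-aware routing: all connections for one room ID go to the
// same backend, so that backend can act as the room's single authority.

// FNV-1a-style 32-bit hash over UTF-16 code units. Any stable hash would do.
function hashRoomId(roomId: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < roomId.length; i++) {
    hash ^= roomId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash;
}

// Pick the backend for a room. The same room ID and backend list always
// produce the same answer.
function backendForRoom(roomId: string, backends: readonly string[]): string {
  if (backends.length === 0) throw new Error("no backends configured");
  const backend = backends[hashRoomId(roomId) % backends.length];
  if (backend === undefined) throw new Error("index out of range");
  return backend;
}
```

Plain modulo routing reassigns most rooms whenever the backend list changes, so a real deployment would likely want consistent hashing or an explicit room-to-backend table instead.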
| avinassh wrote:
| This is really cool! But how are conflicts handled?
| paulgb wrote:
| As far as the server itself is concerned, it's just a broadcast
| channel with replay and compaction capabilities, so it's not
| directly concerned with conflict resolution. You could use it
| as a broadcast channel for CRDTs if you wanted to.
|
| The useSharedState React hook is more opinionated: it uses
| last-write-wins semantics in the case of a conflict. The
| useSharedReducer hook's behavior on conflict is up to the
| reducer provided.
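A minimal sketch of the two conflict behaviors described above, assuming only that every client sees the same server-ordered sequence of messages. `lastWriteWins`, `foldActions`, and `listReducer` are hypothetical helpers for illustration. They are not the useSharedState or useSharedReducer hooks, whose internals are not shown in this thread.

```typescript
// Hypothetical helpers that model conflict semantics on top of an ordered
// broadcast channel. These are not the DriftDB React hooks.

// Last-write-wins: whichever write the server ordered last is the state.
function lastWriteWins<T>(initial: T, orderedWrites: readonly T[]): T {
  let state = initial;
  for (const write of orderedWrites) state = write;
  return state;
}

// Reducer-based: the reducer decides what concurrent actions mean.
function foldActions<S, A>(
  reducer: (state: S, action: A) => S,
  initial: S,
  orderedActions: readonly A[],
): S {
  let state = initial;
  for (const action of orderedActions) state = reducer(state, action);
  return state;
}

// Example reducer where two concurrent "add" actions are both kept.
type ListAction =
  | { type: "add"; item: string }
  | { type: "remove"; item: string };

function listReducer(state: readonly string[], action: ListAction): readonly string[] {
  if (action.type === "add") {
    return state.indexOf(action.item) >= 0 ? state : [...state, action.item];
  }
  return state.filter((existing) => existing !== action.item);
}
```

Because both functions are deterministic folds over the same ordered input, every client that replays the stream ends up with the same state, whichever semantics it picks.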
| samhuk wrote:
| Looks interesting. Coincidentally, I've _just_ completed the bulk
| of work on a distributed Websocket network system to synchronize
| certain bits of state between multiple clients for my own kind of
| Storybook tool [0]. How interesting!
|
| This kind of tool is exactly what I would have needed, instead of
| the approach I've taken which is a bit kludgy and grass-roots.
|
| By far the most difficult part of it for me was ensuring that the
| web socket network can heal from outages of any of the clients or
| the server. E.g. If a client loses connection, how does it regain
| knowledge of state? If the server dies, what do clients do with
| state changes they want to upload? Etc. It was really difficult!
|
| Good work :)
|
| [0] https://github.com/samhuk/exhibitor/pull/22
| rlt wrote:
| Neat.
|
| > DriftDB is a real-time data backend that runs on the edge
|
| What does it mean for these backends to be "on the edge"? Do
| geographically dispersed clients connect to different backends?
| If so, are messages synchronized between them? If so, what's the
| point of them being on the edge?
| fernandopj wrote:
| OP must have meant it runs on Cloudflare Edge.
| scaredginger wrote:
| Please explain your reasoning here
| paulgb wrote:
| That's essentially what I meant. The core database is
| separate from the Cloudflare parts, so it could in theory
| run on other edges (I want to get it running on fly.io!),
| but for now "the edge" can be read as "Cloudflare Workers".
| paulgb wrote:
| By "on the edge", I mean that if you're in London and I'm in
| Amsterdam, and we want to exchange messages, the messages
| shouldn't have to do a round-trip through Virginia, they should
| go through a server closer to both of us. (Of course, if I'm in
| SF and you're in London, this is less of a win.)
|
| The way it works in DriftDB is that everything is siloed into
| "rooms", which are effectively broadcast channels. The room is
| started based on the geography of the person who first joins it
| (Cloudflare handles this part).
| trollitarantula wrote:
| Nice! Would love to see Cloudflare deployment guide.
| Cloudflare isn't mentioned in the docs.
| paulgb wrote:
| Ah, you're right, I haven't written that up yet. The tl;dr
| is something like:
|     cd driftdb-worker
|     npm i
|     npm run deploy
|
| You'll need to sign in to wrangler if you haven't already,
| and will need to have rustc/cargo available (wrangler will
| install some things and build it into a WebAssembly
| module).
| HighlandSpring wrote:
| Oh, cool! So kinda like IRC?
| paulgb wrote:
| Yes, the concept of rooms is analogous to rooms in a chat
| service. One difference from IRC as a protocol (besides
| being over websocket) is that each connection corresponds
| to exactly one room (since different rooms may be on
| different servers.)
| rlt wrote:
| > The room is started based on the geography of the person
| who first joins it
|
| Cool, makes a lot of sense because people using a given
| "room" are often likely to be geographically collocated.
| paulgb wrote:
| Exactly!
| atentaten wrote:
| Can this be used in the Dart/Flutter world?
| paulgb wrote:
| The server itself speaks a very simple WebSocket protocol[1],
| so it could be used by anything that can speak WebSocket.
|
| The JS/React bindings that implement the actual data sync
| patterns (shared state, shared reducers, presence) haven't been
| ported to Dart (yet?) though.
|
| [1] https://driftdb.com/docs/api
| rgbrgb wrote:
| > presence
|
| Congrats on the launch! You have a pointer to docs about
| presence? Use-case is an ephemeral chatroom where I want to
| show who's online.
| paulgb wrote:
| Good catch, this should be in the react docs but it's
| missing. Until then, it's pretty simple. You call `const
| presence = usePresence({})` and pass in any data you want,
| and the `presence` value that gets returned is an object
| that maps client IDs (a unique string for each client) to
| the values that _they_ passed in to `usePresence`.
|
| Here's an example from the voxel demo:
| https://github.com/drifting-in-
| space/driftdb/blob/af64f62b29...
|
| And from the canvas demo: https://github.com/drifting-in-
| space/driftdb/blob/af64f62b29...
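A minimal sketch of the data shape described above: an object that maps client IDs to whatever each client shared. `PresenceMap` and its methods are hypothetical and only model the bookkeeping in memory. How the real `usePresence` hook transports updates or expires disconnected clients is not covered in this thread.

```typescript
// Hypothetical in-memory model of presence: client ID -> value shared by
// that client. Not the implementation behind usePresence.
class PresenceMap<T> {
  private readonly clients = new Map<string, T>();

  // Record (or overwrite) the value a client is currently sharing.
  update(clientId: string, value: T): void {
    this.clients.set(clientId, value);
  }

  // Forget a client, e.g. after it disconnects.
  leave(clientId: string): void {
    this.clients.delete(clientId);
  }

  // Plain-object view, keyed by client ID.
  snapshot(): Record<string, T> {
    const out: Record<string, T> = {};
    this.clients.forEach((value, clientId) => {
      out[clientId] = value;
    });
    return out;
  }
}
```

For the ephemeral chatroom case, each client would share something like `{ name: "Alice" }`, and the list of who is online is the set of keys in the snapshot.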
| jaime-ez wrote:
| For those interested in open source WebSocket servers, check out
| deepstream.io: data persistence, subscriptions, RPC calls,
| authorization, permissions, custom connectors... basically
| everything you need to develop an app.
___________________________________________________________________
(page generated 2023-02-03 23:00 UTC)