[HN Gopher] Patterns for Building Realtime Features
___________________________________________________________________
Patterns for Building Realtime Features
Author : zknill
Score : 43 points
Date : 2025-02-10 19:42 UTC (3 hours ago)
(HTM) web link (zknill.io)
(TXT) w3m dump (zknill.io)
| martinsnow wrote:
| How do you handle deployments of realtime back ends which need
| state in memory?
| jakewins wrote:
| In general you do it with a failover behind some variation of
| a reverse proxy.
|
| If you can start new instances quickly and clients can handle
| short delays you can do it by just stopping the old deployment
| and starting the new one, booting off of the snapshotted state
| from the prior deployment.
|
| If you need "instant" you do it by implementing some form of
| catchup and then fail over.
|
| It is a lot easier to do this if you have a dedicated component
| that "does" the failover, rather than having the old and new
| deployments try to solve it bilaterally. It could just be a
| script run by a human, or something like a k8s operator if you
| do this a lot.
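A rough sketch of the scripted failover sequence described above: drain the old instance, snapshot its state, boot the new one off the snapshot, then repoint the proxy. `Instance`, `Proxy`, and `failover` are invented stand-ins for illustration, not any real tool.

```python
# Hypothetical failover script for a stateful realtime backend.
import json


class Instance:
    """Stand-in for a deployed backend process holding state in memory."""

    def __init__(self, state=None):
        self.state = state or {}
        self.accepting = True

    def drain(self):
        # Stop taking new work so the snapshot is consistent.
        self.accepting = False

    def snapshot(self):
        return json.dumps(self.state)


class Proxy:
    """Stand-in for the reverse proxy that routes clients."""

    def __init__(self, target):
        self.target = target


def failover(proxy, old):
    old.drain()
    snap = old.snapshot()
    new = Instance(state=json.loads(snap))  # boot off the snapshot
    proxy.target = new                      # repoint clients in one step
    return new
```

The point of the dedicated `failover` step is that neither deployment has to know about the other; the coordinator owns the handover.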
| cess11 wrote:
| On BEAM/OTP you can control how state is handled during code
| updates. Finicky, but you can.
|
| In most other contexts you'd externalise state to a data store
| like Redis or RDBMS, and spawn one, kill one or do blue-green
| in the nebula behind your load balancer constellation.
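The externalise-state pattern sketched minimally, with a plain dict standing in for Redis or an RDBMS; with real Redis you'd use HSET/HGETALL, but the shape is the same. `handle_message` is a hypothetical name.

```python
# Stateless handler: load state from the external store, apply the
# message, write it back. Any replica can then serve the next message,
# so app servers can be spawned and killed freely.

def handle_message(store, session_id, message):
    state = store.get(session_id, {"count": 0})
    state["count"] += 1
    state["last"] = message
    store[session_id] = state
    return state
```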
| blixt wrote:
| What I found building multiplayer editors at scale is that it's
| very easy to very quickly overcomplicate this. For example, once
| you get into pub/sub territory, you have a very complex
| infrastructure to manage, and if you're a smaller team this can
| slow down your product development a lot.
|
| What I found to work is:
|
| Keep the data you wish multiplayer to operate on atomic. Don't
| split it out into multiple parallel data blobs that you sometimes
| want to keep in sync (e.g. if you are doing a multiplayer drawing
| app that has commenting support, keep comments inline with the
| drawings, don't add a separate data store). This does increase
| the size of the blob you have to send to users, but it
| dramatically decreases complexity. Especially once you inevitably
| want versioning support.
|
| Start with a simple protocol for updates. This won't be possible
| for every type of product, but surprisingly often you can do just
| fine with a JSON patching protocol where each operation patches
| properties on a giant object which is the atomic data you operate
| on. There are exceptions to this such as text, where something
| like CRDTs will help you, but I'd try to avoid the temptation to
| make your entire data structure a CRDT even though it's
| theoretically great because this comes with additional complexity
| and performance cost in practice.
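A minimal sketch of such a patching protocol, assuming a simplified op format (a path of keys into the one big object) rather than full RFC 6902 JSON Patch:

```python
# Apply one {"path": [...], "value": ...} op to the shared document,
# creating intermediate objects as needed. An op with "delete": True
# removes the leaf instead of setting it.

def apply_op(doc, op):
    *parents, leaf = op["path"]
    node = doc
    for key in parents:
        node = node.setdefault(key, {})
    if op.get("delete"):
        node.pop(leaf, None)
    else:
        node[leaf] = op["value"]
    return doc
```

Because every op targets the same atomic object, replaying a list of ops in order is all a late-joining client has to do.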
|
| You will inevitably need to deal with getting all clients to
| agree on the order in which operations are applied. CRDTs solve
| this perfectly, but again have a high cost. You might actually
| have an easier time letting a central server increment a number
| and making sure all clients re-apply all their updates that
| didn't get assigned the number they expected from the server.
| Your mileage may vary here.
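The central-counter scheme could look roughly like this; `Server` and `client_submit` are illustrative names, not a real library. The server assigns each op a sequence number, and a client that guessed the wrong number first applies the ops it missed, then resubmits.

```python
class Server:
    """Central authority that orders ops with a single counter."""

    def __init__(self):
        self.seq = 0
        self.log = []

    def submit(self, op, expected_seq):
        if expected_seq != self.seq + 1:
            # Client is behind: reject, and return the ops it missed.
            return None, self.log[expected_seq - 1:]
        self.seq += 1
        self.log.append(op)
        return self.seq, []


def client_submit(server, op, local_seq, apply_fn):
    """Client side: rebase on rejection, then resubmit."""
    assigned, missed = server.submit(op, local_seq + 1)
    while assigned is None:
        for missed_op in missed:
            apply_fn(missed_op)  # apply server-ordered ops first
            local_seq += 1
        assigned, missed = server.submit(op, local_seq + 1)
    apply_fn(op)
    return assigned
```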
|
| On that note, just going for a central server instead of trying
| to go fully distributed is probably the most maintainable way for
| you to work. This makes it easier to add on things like
| permissions and honestly most products will end up with a central
| authority. If you're doing something that is actually local-
| first, then ignore me.
|
| I found it very useful to deal with large JSON blobs next to a
| "transaction log", i.e. a list of all operations in the order the
| server received them (again, I'm assuming a central authority
| here). Save lines to this log immediately so that if the server
| crashes you can recover most of the data. This also lets you
| avoid rebuilding the large JSON blob on the server too often (but
| clients will need to be able to handle JSON blob + pending
| updates list on connect, though this follows naturally since
| other clients may be sending updates while they connect).
|
| The trickiest part is choosing a simple server-side
| infrastructure. Honestly, if you're not a big company, a single
| fat server is going to get you very far for a long time. I've
| asked a lot of people about this, and I've heard many
| alternatives that are cloud scale, but they have downsides I
| personally don't like from a product experience perspective
| (harder to implement features, latency/throughput issues,
| possibility of data loss, etc.). Durable Objects from Cloudflare
| do give you the best of both worlds: you get perfect sharding
| on a per-object (project / whatever unit your users work on)
| basis.
|
| Anyway, that's my braindump on the subject. The TLDR is: keep it
| as simple as you can. There are a lot of ways to overcomplicate
| this. And of course some may claim I am the one overcomplicating
| things, but I'd love to hear more alternatives that work well at
| a startup scale.
| jvanderbot wrote:
| I've worked in robotics for a long time. In robotics nowadays
| you always end up with a distributed system, where each robot
| has to have a view of the world, its mission, etc., and also of
| every other robot, and the command and control dashboards do
| too.
|
| Always always always follow parent's advice. Pick one canonical
| owner for the data, and have everyone query it. Build an
| estimator at each node that can predict what the robot is doing
| when you don't have timely data (usually just running a shadow
| copy of the robot's software), but try to never ever do
| distributed state.
|
| Even something as simple as a map gets arbitrarily complicated
| when you're sensing multiple locations. Just push everyone's
| guesses to a central location and periodically batch update and
| disseminate updates. You'll be much happier.
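The push-to-one-canonical-owner-and-batch idea might be sketched like this; averaging the per-landmark guesses is just a placeholder for whatever fusion you actually do, and `MapOwner` is a hypothetical name.

```python
from collections import defaultdict


class MapOwner:
    """Central owner of the shared map: everyone pushes guesses here."""

    def __init__(self):
        self.pending = defaultdict(list)  # landmark -> [(x, y), ...]
        self.canonical = {}

    def push_guess(self, landmark, pos):
        self.pending[landmark].append(pos)

    def batch_update(self):
        """Periodically merge pending guesses and disseminate the result."""
        for landmark, guesses in self.pending.items():
            xs = [p[0] for p in guesses]
            ys = [p[1] for p in guesses]
            self.canonical[landmark] = (sum(xs) / len(xs),
                                        sum(ys) / len(ys))
        self.pending.clear()
        return dict(self.canonical)  # what gets broadcast to every node
```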
| mikhmha wrote:
| Wow, this sounds like how the AI simulation for my
| multiplayer game works. Each AI agent has a view of the world
| and makes local steering decisions for collision avoidance and
| self-preservation. Agents carry out low-level goals that
| are given to them by squad leaders. A squad leader receives
| high level "world" objectives from a commander. High-level
| objectives are broken down into low level objectives
| distributed among squad units based on their attributes and
| preferences.
| recroad wrote:
| It's amazing how much boilerplate stuff you don't have to worry
| about when you use Phoenix LiveView. I think I'm in love with it.
| pawelduda wrote:
| Exactly! I was halfway through the article and thought how
| LiveView is basically the equivalent of the "push ops" pattern
| described, but beautifully abstracted away and coming for free,
| while you (mostly) write dynamic HTML markup. Magic!
| mervz wrote:
| I have not enjoyed a language and framework like I'm enjoying
| Elixir and Phoenix! It has become my stack for just about
| everything.
| jtwaleson wrote:
| I'm building a simple version with horizontally scalable app
| servers that each use LISTEN/NOTIFY on the database. The article
| says this will lead to problems and you'll need PubSub services,
| but I was hoping LISTEN/NOTIFY would easily scale to hundreds of
| concurrent users. Please let me know if that won't work ;)
|
| Some context: The use case is a digital whiteboard like Miro and
| the heaviest realtime functionality will be tracking all of the
| pointers of all the users updating 5x per second. I'm not
| expecting thousands/millions of users as I'm planning on running
| each instance of the software on-prem.
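One way to keep LISTEN/NOTIFY viable for this pointer use case is to coalesce moves per tick, so each NOTIFY carries only the latest position per user. That keeps payloads small (PostgreSQL caps NOTIFY payloads at just under 8000 bytes by default) and channel traffic at O(users) per tick instead of O(events). A sketch of the coalescing side, with invented names:

```python
import json


class PointerCoalescer:
    """Buffer pointer moves; emit one payload per tick (e.g. 200 ms)."""

    def __init__(self):
        self.latest = {}  # user_id -> (x, y)

    def on_move(self, user_id, x, y):
        self.latest[user_id] = (x, y)  # overwrite; older moves are stale

    def flush(self):
        """Build the payload a single NOTIFY would carry this tick."""
        payload = json.dumps(self.latest)
        self.latest = {}
        return payload
```

At 5 flushes per second with hundreds of users that is a few kilobytes per second through the channel, which seems well within what LISTEN/NOTIFY handles comfortably.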
___________________________________________________________________
(page generated 2025-02-10 23:00 UTC)