[HN Gopher] Using CRDTs for multiplayer text editing
___________________________________________________________________
Using CRDTs for multiplayer text editing
Author : henning
Score : 70 points
Date : 2022-12-01 19:03 UTC (3 hours ago)
(HTM) web link (zed.dev)
(TXT) w3m dump (zed.dev)
| eep_social wrote:
| This reminds me of Oracle's SCN which blew my mind back in ~2009
| when I moved up from mysql home gamer stuff to the corporate
| world.
|
| > Even long edit histories barely compare to the memory savings
| Zed obtains from not being built with Electron.
|
| I got a good chuckle out of this. Speaking of performance, I
| think it's I nteresting that ms word and google docs are still
| using OT rather than CRDT. The goog version seems to work well
| but I have had nothing but frustration with ms word. Bad merges
| and weird states are typical, particularly from the fat client.
|
| Anyone else interested in the background of OT vs CRDT might find
| this thread useful: https://news.ycombinator.com/item?id=18191867
| tym0 wrote:
| Has Microsoft published anything about their pairing plug-in for
| VS Code?
| rubatuga wrote:
| Yeah CRDTs are pretty annoying for collaborative text editing,
| even Google Docs doesn't use them. Some companies have switched
| from CRDTs back to traditional methods.
| paulgb wrote:
| In addition to OP (nice work, Nathan!), one of the yjs authors
| has made a pretty in-depth argument that (modern) CRDTs are
| suitable for human-scale text editing.
| https://blog.kevinjahns.de/are-crdts-suitable-for-shared-edi...
| nathansobo wrote:
| Thanks very much!
| nathansobo wrote:
| We're quite happy with them. Why do you find them annoying?
| tlb wrote:
| One common failure mode is that two people start typing at
| the beginning of the same line, and rather than getting two
| lines, it alternates characters. At least, Etherpad did this.
| maxbond wrote:
| This is a common artifact of operational transforms I
| believe. Etherpad launched in 2008; CRDTs were proposed in
| 2011. AFAIK Etherpad used (uses?) OT.
|
| Open to correction though, it's been a while since I dug
| into the differences in these approaches & my memory is
| imperfect.
| muxator wrote:
| Etherpad used Operational Transform, not CRDTs.
|
| Source: I have been etherpad's maintainer for two years.
| josephg wrote:
| This is called the "interleaving problem". It shows up with
| simple list algorithms like fractional indexing.
|
| All the main text editing CRDT algorithms around today
| solve this no problem. (Yjs, automerge, diamond types,
| etc).
| nathansobo wrote:
| We honestly don't solve it yet, but haven't found it that
| big of an issue in practice. Would be curious to see the
| best resource on it.
| josephg wrote:
| I remember reading about it in one of Martin Kleppmann's
| papers, though I can't remember which one.
|
| It's an ordering problem that comes from some of the
| simpler ordering algorithms. For Diamond types I'm using
| a variant of Yjs's ordering. But even RGA doesn't have
| this problem because each character's insert location is
| specified by naming the character immediately to the left
| when that character was typed.
|
| This repository implements a few different list CRDTs
| using an insertion sort approach, where the algorithm
| scans for the appropriate location every time an insert
| happens. This is the scanning function for RGA
| (automerge's algorithm):
|
| https://github.com/josephg/reference-
| crdts/blob/fed747255df9...
|
| And this is an interactive visualisation of how diamond
| types works (which uses Yjs's algorithm instead),
| complete with run-length encoding:
| https://home.seph.codes/public/diamond-vis/
| maxbond wrote:
| I'll offer up that I've had a hard time wrapping my mind
| around them and that in practice it seems like you need to
| implement a check pointing operation on top of them which is
| necessarily not conflict-free, or your replication log will
| expand without bound & eventually overwhelm your system.
| (Though perhaps not, depending on your problem domain. Not
| CRDTs but in this interview the engineers discuss a massive,
| high frequency replication log that they don't checkpoint,
| which they've been running at scale for years. Though you
| could also say they implicitly checkpoint every trading day,
| and they're working on implementing checkpointing.
| https://signalsandthreads.com/state-machine-replication-
| and-...)
|
| That being said I would use CRDTs for any greenfield
| collaboration project.
| maxbond wrote:
| The first version of Google Docs was written in 2006; CRDTs
| weren't proposed until 2011, and still seem to be maturing as a
| technology.
|
| So, I don't think this is a reflection on the merits of CRDTs
| versus operational transforms as much as it is a reflection on
| the ecosystem and the history of the codebase.
| josephg wrote:
| Yep. I've been working in the collaborative editing space for
| over a decade - with both OT algorithms and CRDT algorithms.
| The modern CRDTs I've been developing are significantly
| faster and smaller than any OT system I made 5+ years ago.
|
| Much more complex though. (~3k loc for a good, high
| performance text crdt merging algorithm, vs 300 loc for a
| good text OT algorithm.)
| HWR_14 wrote:
| Sorry, I'm not sure I properly understood OT as an acronym.
| Did you mean "operational transformation"? Because I
| thought that OT was method of making operational based
| CRDTs[0].
|
| But I'm probably wrong in at least one way! Hoping to
| learn.
|
| [0] https://link.springer.com/chapter/10.1007/978-3-662-433
| 52-2_...
| zackoverflow wrote:
| Can you explain why they're more complex? I've always heard
| the narrative that OT is harder to get right because of
| edge case
| nmlt wrote:
| This webpage crashed my iPad 6 Gen Safari when repeatedly
| scrolling down without reading (iPadOS 16.1.1). It's probably
| just too little memory, but still, isn't this just a blog post?
| iamnbutler wrote:
| Apologies, looks like we underestimated how heavy the
| autoplaying diagrams would be. If you could try again in about
| 5 minutes and see if you still have the problem I'd appreciate
| it!
|
| Sorry for the hassle.
| nathansobo wrote:
| We embeded a lot of videos for animated diagrams. I think we
| need to disable autoplay, at least on mobile.
| sghosh2 wrote:
| Very cool use case for CRDTs! I've seen a bunch of different use
| cases from other products like https://liveblocks.io/ and
| https://electric-sql.com/. It's interesting how CRDTs are now
| taking hold so much for all these collaborative syncing
| scenarios. Wonder what's driving the proliferation now given
| they've been around for awhile?
___________________________________________________________________
(page generated 2022-12-01 23:00 UTC)