[HN Gopher] Collaborative Text Editing Without CRDTs or OT
___________________________________________________________________
Collaborative Text Editing Without CRDTs or OT
Author : samwillis
Score : 161 points
Date : 2025-05-21 17:05 UTC (5 hours ago)
(HTM) web link (mattweidner.com)
(TXT) w3m dump (mattweidner.com)
| hencq wrote:
| I'm not an expert on this, but the main difference from a CRDT
| like Automerge seems to be the server reconciliation. See for
| example this article [1]. Automerge handles concurrent insertions
| by using a sequence number and falling back to an agreed ordering
| of agent ids to break ties, while this scheme relies on the
| server to handle operations in the order they arrive.
|
| The article mentions this:
|
| > This contrasts with text-editing CRDTs, in which IDs are
| ordered for you by a fancy algorithm. That ordering algorithm is
| what differs between the numerous text-editing CRDTs, and it's
| the complicated part of any CRDT paper; we get to avoid it
| entirely.
|
| I can buy the idea that many apps have a central server anyway,
| so you can avoid the "fancy algorithm", though the server
| reconciliation requires undoing and replaying of local edits, so
| it's not 100% clear to me if that's much simpler.
|
| [1] https://josephg.com/blog/crdts-go-brrr/
| hem777 wrote:
| Agreed, the undoing and replaying isn't exactly non-fancy. A
| persistent B+Tree is also not very non-fancy.
| mweidner wrote:
| I believe that Automerge internally stores all operations in an
| eventually consistent total order, which you _can_ use as a
| substitute for the server in server reconciliation (cf.
| https://mattweidner.com/2025/05/21/text-without-
| crdts.html#d...). However, Automerge does not actually do that
| - instead it processes text operations using a traditional
| CRDT, RGA, probably because (as you point out) undoing and
| replaying ops is not trivial to implement.
| hem777 wrote:
| Cool to see a write up on this! Discovered the same method years
| ago and also wondered why it doesn't show up in academic
| literature.
|
| I implemented this in a decentralized context and as a CRDT
| though, so that the properties hold (commutative, idempotent and
| associative).
| k__ wrote:
| If the idea is to have an alternative to CRDTs, what did you
| gain from building it as a CRDT?
| hem777 wrote:
| It wasn't really that we wanted to have a CRDT per se, but as
| it was implemented on top of an op-based append-only log
| CRDT, it turned out to hold those properties, which makes it
| a CRDT. We wanted the edits to be able to arrive in any order
| or after delays due to network partitions (this was for a p2p
| network).
| uh2010 wrote:
| wonder if there would be a perf gain with UUIDv7
| poly2it wrote:
| Why not just use an ever incrementing u64?
| zamadatix wrote:
| An incrementing u64 requires either atomic increments between
| concurrent clients or recalculation logic to consistently
| find the newly incremented ID after conflicting information
| syncs. UUIDs just spit out a unique ID without any complexity
| or associations with other clients.
| ChadNauseam wrote:
| There's a closely related idea that might work, though. Each
| device editing text could be assigned a 32-bit ID by the
| server (perhaps auto-incrementing). Devices then maintain a
| separate 32-bit ID that they increment for each operation
| they perform. The ID used for each character is (device_id,
| edit_id), which should fit nicely in 8 bytes.
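|
| A minimal sketch of that packing (TypeScript, with hypothetical
| helper names; assumes both values fit in 32 bits):
|
|       // Combine a server-assigned device id and a per-device
|       // counter into one 64-bit character id.
|       function packId(deviceId: number, editId: number): bigint {
|         return (BigInt(deviceId >>> 0) << 32n) |
|           BigInt(editId >>> 0);
|       }
|
|       function unpackId(id: bigint): [number, number] {
|         return [Number(id >> 32n), Number(id & 0xffffffffn)];
|       }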
| mweidner wrote:
| Indeed, this is close to what Yjs (popular CRDT library)
| does: each client instance (~ browser tab) chooses a
| random 32-bit clientId, and character IDs combine this
| clientId with local counters. https://github.com/yjs/yjs/
| blob/987c9ebb5ad0a2a89a0230f3a0c6...
|
| Any given collaborative document will probably only see
| ~1k clientIds in its lifetime, so the odds of a collision
| are fairly low, though I'd be more comfortable with a
| 64-bit ID.
| kortex wrote:
| Then you need central coordination, either a single central
| server containing the counter, or something like Snowflake
| where you have multiple counters, each assigned orthogonal
| blocks ahead of time (that need to coordinate with a central
| server).
|
| UUIDs/ULIDs/etc are fully distributed: two clients can each
| assign an ID without coordinating, with a ~0% chance of
| collision.
| poly2it wrote:
| You could also split the u64 to have the first 24 bits be
| unique to the client, and the last 40 be unique to the
| produced character. This would still allow 1 TiB of data
| produced per user and session. The single mutex would be
| the user ID counter.
| pshc wrote:
| They do discuss optimization of the IDs at the bottom of the
| article.
| asdffdasy wrote:
| so, an unoptimized crdt? just set max set size to 1 and yolo?
| th0ma5 wrote:
| It's very appealing as a kind of irreducible complexity. It
| feels close to what is actually happening and is simple, hahah,
| even if, like you say, not optimized.
| math_dandy wrote:
| Is the take-home message of the post that the full complexity of
| CRDTs/OT is necessary only in the absence of a central server?
| sampton wrote:
| OT requires a centralized server.
| mweidner wrote:
| Even in the absence of a central server, you can still avoid
| CRDT/OT complexity _if_ you have a decentralized way to
| eventually total order operations & apply them in that order:
| https://mattweidner.com/2025/05/21/text-without-crdts.html#d...
|
| As others in the comments argue, this is technically a CRDT
| (though a fully general one); also, undoing/replaying ops is
| itself non-trivial to implement. However, I hope this is still
| simpler than using a traditional CRDT/OT for each data type.
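|
| A minimal sketch of that "total order, then replay" idea
| (TypeScript, hypothetical types, not the article's code):
|
|       // Each op carries a Lamport timestamp and a replica id;
|       // every replica sorts its log the same way and rebuilds
|       // the same state.
|       type Op = {
|         lamport: number;
|         replicaId: string;
|         apply: (doc: string) => string;
|       };
|
|       function totalOrder(a: Op, b: Op): number {
|         return a.lamport - b.lamport ||
|           a.replicaId.localeCompare(b.replicaId);
|       }
|
|       function rebuild(log: Op[]): string {
|         return [...log].sort(totalOrder)
|           .reduce((doc, op) => op.apply(doc), "");
|       }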
| apt-apt-apt-apt wrote:
| Does this finally solve collaborative text editing and its
| friends?
|
| Such an awesome approach.
| criddell wrote:
| Could it? If you and I are simultaneously editing a list of
| people and you are trying to order them by age and I'm ordering
| them alphabetically, we are still going to have to reconcile
| the result.
| shermantanktop wrote:
| By that definition, it's not a solvable problem, is it?
| Footkerchief wrote:
| Surprised to see no discussion of other data structures like
| dicts/maps, or arrays of arbitrary type. Hopefully they'd be a
| straightforward extension. IME, apps need collaborative data
| structures more often than they need pure collaborative text
| editing.
|
| The motivating examples (update validation, partial loading,
| higher-level operations) are interesting, but I don't see a
| strong case that the reason Yjs etc. lack these features is the
| underlying CRDT implementation, as opposed to these features
| being intrinsically difficult to build.
| filleokus wrote:
| > Surprised to see no discussion of other data structures like
| dicts/maps, or arrays of arbitrary type. Hopefully they'd be a
| straightforward extension. IME, apps need collaborative data
| structures more often than they need pure collaborative text
| editing.
|
| Totally agree. I guess an array of "atomic" objects, where the
| properties of the objects can't be changed, could be done just
| by replacing the string with your own type. Changes inside the
| objects are probably trickier, but maybe it's just a matter of
| efficiently storing/traversing the tree?
|
| I've also always thought it should be possible to create
| something where the consumer of the helper library (per OP
| terminology) can hook in their own lightweight "semantic model"
| logic, to prevent/manage invalid states. A todo item can't both
| have isDone: true and state: inProgress at the same time.
| Similar to rich text formatting semantics mentioned in the
| linked article.
| pshc wrote:
| Use of server reconciliation makes me think client-side
| reconciliation would be tricky... how do you preserve smooth
| editor UX while applying server updates as they arrive?
|
| For example, if your client-sent request to insert a character
| fails, do you just retry the request? What if an update arrived
| in the intervening time? (Edit: they acknowledge this case in the
| "Client-Side" section; the proposal is to rewind and replay, with
| a simpler alternative of blocking until the pending queue is
| flushed.)
|
| From a frontend vantage I feel like there may be a long tail of
| underspecified UI/UX edge cases, such that CRDT would be simpler
| overall. And how does the editor feel to use while riding the NYC
| subway where coverage is spotty?
| straws wrote:
| Both ProseMirror and the newer version of CodeMirror have a
| pretty elegant solution to this: each modification to the
| document is modeled as a step that tracks indices instead of
| node/text identities. A data structure called a "position map"
| lets buffered steps be mapped through concurrent changes to
| produce steps with updated positions, which are then applied to
| your document.
|
| In practice, it works quite well. Here's more info:
|
| https://marijnhaverbeke.nl/blog/collaborative-editing.html
| https://marijnhaverbeke.nl/blog/collaborative-editing-cm.htm...
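|
| A rough, library-agnostic sketch of the mapping idea (TypeScript;
| hypothetical names, not ProseMirror's actual API):
|
|       // Map a document position through a concurrent insertion
|       // of `len` characters at `at`: positions at or after `at`
|       // shift to the right.
|       function mapThroughInsert(pos: number, at: number,
|                                 len: number): number {
|         return pos < at ? pos : pos + len;
|       }
|
|       // A buffered step at position 10, mapped through an
|       // insertion of 3 characters at position 4, now targets 13.
|       const rebased = mapThroughInsert(10, 4, 3);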
| Animats wrote:
| That is very neat. The algorithm:
|
| - Label each text character with a globally unique ID (e.g., a
| UUID), so that we can refer to it in a consistent way across
| time, instead of using an array index that changes constantly.
|
| - Clients send the server "insert after" operations that
| reference an existing ID. The server looks up the target ID and
| inserts the new characters immediately after it.
|
| - Deletion hides a character for display purposes, but it is
| still kept for "insert after" position purposes.
|
| This might have potential outside text editing. Game world
| synchronization, maybe.
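|
| A minimal sketch of the structure those three rules describe
| (TypeScript, hypothetical names, not the article's actual code):
|
|       interface Char { id: string; value: string; deleted: boolean }
|
|       class SimpleDoc {
|         private chars: Char[] = [];
|
|         // Insert `value` immediately after the character with id
|         // `afterId` (or at the start when afterId is null).
|         insertAfter(afterId: string | null, id: string, value: string) {
|           const i = afterId === null
|             ? -1
|             : this.chars.findIndex(c => c.id === afterId);
|           this.chars.splice(i + 1, 0, { id, value, deleted: false });
|         }
|
|         // Deletion only hides the character; it stays in the list
|         // as an anchor for later insert-after operations.
|         delete(id: string) {
|           const c = this.chars.find(ch => ch.id === id);
|           if (c) c.deleted = true;
|         }
|
|         text(): string {
|           return this.chars
|             .filter(c => !c.deleted)
|             .map(c => c.value)
|             .join("");
|         }
|       }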
| yubblegum wrote:
| Is this really that novel? I mean using a central process for
| serializing a distributed system is like a no brainer -- didn't
| we start off from here originally? -- until you have to worry
| about network partitions, and CAP and all that jazz. You also
| now have a single point of failure. Also I skimmed the thing
| but was performance discussed?
| bongodongobob wrote:
| Yeah I have the same question. I'm not familiar with the
| problem space but this seems like my naive first idea so I'm
| wondering what the catch is.
| sdeframond wrote:
| What you describe is a CRDT, isn't it?
| ori_b wrote:
| No; there is no single consistent final state that the system
| must converge to if the parts go offline. If you have this
| document:
|
|     a{uuid=1}
|
| and two clients send the following operations:
|
|     b{uuid=2} insert-after{uuid=1}
|     c{uuid=3} insert-after{uuid=1}
|
| then the following two documents are both valid final states:
|
|     abc
|     acb
|
| That's fine as long as you have an authoritative server that
| observes all events in a single order and a way to unwind
| misordered local state, but it means that it's not a CRDT.
| Yoric wrote:
| If you have the additional semantics that says "operation
| with lowest uuid gets applied first", don't you essentially
| get a CRDT?
|
| I mean, a uuid is kind of a poor man's Lamport clock, isn't
| it?
| ori_b wrote:
| Yes, you can extend the algorithm that was described with
| additional semantics, and you can turn it into a CRDT.
| sdeframond wrote:
| What about ordering concurrent operations by id? Then "abc"
| is the only consistent final state.
|
| I get what you mean though, having a central authority
| greatly relaxes the requirement.
| 3cats-in-a-coat wrote:
| We could generalize a tree of known upstream
| authorities and what a CRDT does into two special
| cases of a more general consistent event processing
| model. A CRDT makes events commute, so the
| "authority" becomes a property of the math itself
| rather than a physical service.
| YoumuChan wrote:
| Like Raft is a "special case" of Paxos, this feels like a
| "special case" of CRDT.
|
| It has all the flavor of a CRDT, but adds a leader and a
| different way of establishing the total ordering (basically
| using the leader's local Lamport clock to break ties).
|
| Throw in leader re-election and some ledger syncing, give
| everything some other names, and I bet you can have
| "collaborative text editing on one page".
| jahewson wrote:
| This is literally a degenerate CRDT. Central server for tie-
| breaking goes back to Google Wave.
___________________________________________________________________
(page generated 2025-05-21 23:00 UTC)