[HN Gopher] Collaborative Text Editing Without CRDTs or OT
       ___________________________________________________________________
        
       Collaborative Text Editing Without CRDTs or OT
        
       Author : samwillis
       Score  : 161 points
       Date   : 2025-05-21 17:05 UTC (5 hours ago)
        
 (HTM) web link (mattweidner.com)
 (TXT) w3m dump (mattweidner.com)
        
       | hencq wrote:
       | I'm not an expert on this, but the main difference with a CRDT
       | like Automerge seems to be the server reconciliation. See for
       | example this article [1]. Automerge handles concurrent insertions
       | by using a sequence number and relying on an agreed ordering of
       | agent ids when insertions are concurrent, while this scheme
       | relies on the server to handle them in the order they come in.
       | 
       | The article mentions this:
       | 
       | > This contrasts with text-editing CRDTs, in which IDs are
       | ordered for you by a fancy algorithm. That ordering algorithm is
       | what differs between the numerous text-editing CRDTs, and it's
       | the complicated part of any CRDT paper; we get to avoid it
       | entirely.
       | 
       | I can buy the idea that many apps have a central server anyway,
       | so you can avoid the "fancy algorithm", though the server
       | reconciliation requires undoing and replaying of local edits, so
       | it's not 100% clear to me if that's much simpler.
       | 
       | [1] https://josephg.com/blog/crdts-go-brrr/
        
         | hem777 wrote:
         | Agreed, the undoing and replying isn't exactly non-fancy.
         | Peristent B+Tree is also not very non-fancy.
        
         | mweidner wrote:
         | I believe that Automerge internally stores all operations in an
         | eventually consistent total order, which you _can_ use as a
         | substitute for the server in server reconciliation (cf.
         | https://mattweidner.com/2025/05/21/text-without-
         | crdts.html#d...). However, Automerge does not actually do that
         | - instead it processes text operations using a traditional
         | CRDT, RGA, probably because (as you point out) undoing and
         | replaying ops is not trivial to implement.
        
       | hem777 wrote:
       | Cool to see a write up on this! Discovered the same method years
       | ago and also wondered why it doesn't show up in academic
       | literature.
       | 
       | I implemented this in a decentralized context and as a CRDT
       | though, so that the properties hold (commutative, idempotent and
       | associative).
        
         | k__ wrote:
         | If the idea is to have an alternative to CRDT, what did you
         | gain from building it as CRDT?
        
           | hem777 wrote:
           | It wasn't really that we wanted to have a CRDT per se, but as
           | it was implemented on top of an op-based append-only log
           | CRDT, it turned out to hold those properties, which makes it
           | a CRDT. We wanted to have the edits to be able to arrive in
           | any order or after delays due to network partitions (this was
           | for a p2p network).
        
       | uh2010 wrote:
       | wonder if there would be a perf gain with UUIDv7
        
         | poly2it wrote:
         | Why not just use an ever incrementing u64?
        
           | zamadatix wrote:
           | An incrementing u64 requires either atomic increments between
           | concurrent clients or recalculation logic to consistently
           | find the newly incremented ID after conflicting information
           | syncs. UUIDs just spit out a unique ID without any complexity
           | or associations with other clients.
        
             | ChadNauseam wrote:
             | There's closely related idea that might work, though. Each
             | device editing text could be assigned a 32-bit ID by the
             | server (perhaps auto-incrementing). Devices then maintain a
             | separate 32-bit ID that they increment for each operation
             | they perform. The ID used for each character is (device_id,
             | edit_id), which should fit nicely in 8 bytes.
        
               | mweidner wrote:
               | Indeed, this is close to what Yjs (popular CRDT library)
               | does: each client instance (~ browser tab) chooses a
               | random 32-bit clientId, and character IDs combine this
               | clientId with local counters. https://github.com/yjs/yjs/
               | blob/987c9ebb5ad0a2a89a0230f3a0c6...
               | 
               | Any given collaborative document will probably only see
               | ~1k clientIds in its lifetime, so the odds of a collision
               | are fairly low, though I'd be more comfortable with a
               | 64-bit ID.
        
           | kortex wrote:
           | Then you need central coordination, either a single central
           | server containing the counter, or something like Snowflake
           | where you have multiple counters, each assigned orthogonal
           | blocks ahead of time (that need to coordinate with a central
           | server).
           | 
           | UUIDs/ULIDs/etc are fully distributed, you can have two
           | clients assign an ID without coordinating with ~0% of
           | collision.
        
             | poly2it wrote:
             | You could also split the u64 to have the first 24 bits be
             | unique to the client, and the last 40 be unique to the
             | produced character. This would still allow 1 TiB of data
             | produced per user and session. The single mutex would be
             | the user ID counter.
        
         | pshc wrote:
         | They do discuss optimization of the IDs at the bottom of the
         | article.
        
       | asdffdasy wrote:
       | so, an unoptimized crdt? just set max set size to 1 and yolo?
        
         | th0ma5 wrote:
         | It's very appealing as a kind of irreducible complexity. It
         | feels close to what is actually happening and is simple hahah
         | if like you say not optimized.
        
       | math_dandy wrote:
       | Is the take-home message of the post that the full complexity of
       | CRDTs/OT is necessary only in the absence of a central server?
        
         | sampton wrote:
         | OT requires centralized server.
        
         | mweidner wrote:
         | Even in the absence of a central server, you can still avoid
         | CRDT/OT complexity _if_ you have a decentralized way to
         | eventually total order operations  & apply them in that order:
         | https://mattweidner.com/2025/05/21/text-without-crdts.html#d...
         | 
         | As others in the comments argue, this is technically a CRDT
         | (though a fully general one); also, undoing/replaying ops is
         | itself non-trivial to implement. However, I hope this is still
         | simpler than using a traditional CRDT/OT for each data type.
        
       | apt-apt-apt-apt wrote:
       | Does this finally solve collaborative text editing and its
       | friends?
       | 
       | Such an awesome approach.
        
         | criddell wrote:
         | Could it? If you and I are simultaneously editing a list of
         | people and you are trying to order them by age and I'm ordering
         | them alphabetically, we are still going to have to reconcile
         | the result.
        
           | shermantanktop wrote:
           | By that definition, it's not a solvable problem, is it?
        
       | Footkerchief wrote:
       | Surprised to see no discussion of other data structures like
       | dicts/maps, or arrays of arbitrary type. Hopefully they'd be a
       | straightforward extension. IME, apps need collaborative data
       | structures more often than they need pure collaborative text
       | editing.
       | 
       | The motivating examples (update validation, partial loading,
       | higher-level operations) are interesting, but I don't see a
       | strong case that the reason Yjs etc. lack these features is the
       | underlying CRDT implementation, as opposed to these features
       | being intrinsically difficult to build.
        
         | filleokus wrote:
         | > Surprised to see no discussion of other data structures like
         | dicts/maps, or arrays of arbitrary type. Hopefully they'd be a
         | straightforward extension. IME, apps need collaborative data
         | structures more often than they need pure collaborative text
         | editing.
         | 
         | Totally agree. I guess an array of "atomic" objects, where the
         | properties of the objects can't be changed can be done just by
         | replacing the string with your own type. Changes inside of the
         | object is probably trickier, but maybe it's just about
         | efficiently storing/traversing the tree?
         | 
         | I've also always thoguth it should be possible to create
         | something where the consumer of the helper library (per OP
         | terminology) can hook in their own lightweight "semantic model"
         | logic, to prevent/manage invalid states. A todo item can't both
         | have isDone: true and state: inProgress at the same time.
         | Similar to rich text formatting semantics mentioned in the
         | linked article.
        
       | pshc wrote:
       | Use of server reconciliation makes me think client-side
       | reconciliation would be tricky... how do you preserve smooth
       | editor UX while applying server updates as they arrive?
       | 
       | For example, if your client-sent request to insert a character
       | fails, do you just retry the request? What if an update arrived
       | in the intervening time? (Edit: they acknowledge this case in the
       | "Client-Side" section, the proposal is to rewind and replay, and
       | a simpler proposal to block until the pending queue is flushed)
       | 
       | From a frontend vantage I feel like there may be a long tail of
       | underspecified UI/UX edge cases, such that CRDT would be simpler
       | overall. And how does the editor feel to use while riding the NYC
       | subway where coverage is spotty?
        
         | straws wrote:
         | Both ProseMirror and the newer version of CodeMirror have a
         | pretty elegant solution to this: each modification to the
         | document is modeled as a step that keeps track of indices
         | instead of node/text identities, and uses a data structure
         | called a "position map" that the buffered steps can be mapped
         | through to create steps with updated positions, which are then
         | applied to your document.
         | 
         | In practice, it works quite well. Here's more info:
         | 
         | https://marijnhaverbeke.nl/blog/collaborative-editing.html
         | https://marijnhaverbeke.nl/blog/collaborative-editing-cm.htm...
        
       | Animats wrote:
       | That is very neat. The algorithm:
       | 
       | - Label each text character with a globally unique ID (e.g., a
       | UUID), so that we can refer to it in a consistent way across time
       | - instead of using an array index that changes constantly.
       | 
       | - Clients send the server "insert after" operations that
       | reference an existing ID. The server looks up the target ID and
       | inserts the new characters immediately after it.
       | 
       | - Deletion hides a character for display purposes, but it is
       | still kept for "insert after" position purposes.
       | 
       | This might have potential outside text editing. Game world
       | synchronization, maybe.
        
         | yubblegum wrote:
         | Is this really that novel? I mean using a central process for
         | serializing a distributed system is like a no brainer -- didn't
         | we start off from here originally? -- until you have to worry
         | about network partitions, and CAP and all that jazz. You also
         | now have a single point of failure. Also I skimmed the thing
         | but was performance discussed?
        
           | bongodongobob wrote:
           | Yeah I have the same question. I'm not familiar with the
           | problem space but this seems like my naive first idea so I'm
           | wondering what the catch is.
        
         | sdeframond wrote:
         | What you describe is a CRDT, isn't it ?
        
           | ori_b wrote:
           | No; there is no single consistent final state that the system
           | must converge to if the parts go offline. If you have this
           | document:                  a{uuid=1}
           | 
           | and two clients send the following operations:
           | b{uuid=2} insert-after{uuid=1}        c{uuid=3} insert-
           | after{uuid=1}
           | 
           | then the following two documents are both valid final states:
           | abc        acb
           | 
           | That's fine as long as you have an authoritative server that
           | observes all events in a single order and a way to unwind
           | misordered local state, but it means that it's not a CRDT.
        
             | Yoric wrote:
             | If you have the additional semantics that says "operation
             | with lowest uuid gets applied first", don't you essentially
             | get a CRDT?
             | 
             | I mean, a uuid is kind of a poor man's Lamport clock, isn't
             | it?
        
               | ori_b wrote:
               | Yes, you can extend the algorithm that was described with
               | additional semantics, and you can turn it into a CRDT.
        
             | sdeframond wrote:
             | What about ordering concurrent operations by id? Then "abc"
             | is the only consistent final state.
             | 
             | I get what you mean though, having a central authority
             | greatly relaxes the requirement.
        
               | 3cats-in-a-coat wrote:
               | We could generalize having a tree of known authorities
               | upstream with what a CRDT does, resulting in both being
               | two special cases of a more general consistent event
               | processing model. CRDT makes the order of events
               | commutative, hence the "authority" becomes a property of
               | the math itself, rather than a physical service.
        
             | YoumuChan wrote:
             | Like Raft is a "special case" of Paxos, this feels like a
             | "special case" of CRDT.
             | 
             | It has all the flavor of CRDT, but adds a leader and a
             | different way for the total ordering (basically using
             | leader's local lamport clock to break tie).
             | 
             | Throw in leader reelection and some ledger syncing and then
             | give everything some other names, I bet you can have
             | "collaborative text editing on one page".
        
         | jahewson wrote:
         | This is literally a degenerate CRDT. Central server for tie-
         | breaking goes back to Google Wave.
        
       ___________________________________________________________________
       (page generated 2025-05-21 23:00 UTC)