Lies I was Told About Collaborative Editing, Part 1: Algorithms for offline editing
Moment devlog, Nov 16, 2024
By Alex Clemmer

In early 2024, I began investigating collaborative editing systems for use in Moment's core text editor. In some ways, we are in a golden era for the problem. Several algorithms now claim to solve not only the online case (where many people simultaneously edit a document), but also the offline case (where people edit offline for an unbounded amount of time, come online, and "magically" their changes are merged together).

I had read the papers and listened to the talks, and like others, I was initially optimistic that the community had settled on the "right answer" to the whole problem space. One of the earliest surprises in the evaluation process was that this is not true--that despite the marketing copy, blog posts, and technical papers stating otherwise, these collaborative editing algorithms provide a systematically poor offline editing experience. Specifically:

- The two most popular families of collaborative editing algorithms (CRDTs and OT) resolve direct editing conflicts unintuitively enough that users generally interpreted their results as data corruption.
- Offline editing dramatically increases the likelihood of direct conflicts, which means these algorithms are not, on their own, appropriate for offline editing.

In many ways, this is bad news. And we will spend much of this post discussing why we think these things are true--example anomalies, failure modes, what the practical limitations are, and so on.

But the news is not all bad. We are heartened to see some recent research reversing the trend of treating collaborative editing as a pure algorithms problem, and instead treating it as a UI/UX problem in which algorithms assist. One of the goals of this post is to help motivate this line of reasoning, and we will make time at the end to encourage that, too.

Example: a seemingly trivial editing conflict

Let's start with a game. I provide a trivial editing conflict and you try to guess what happens. Ready? Here's the scenario. Alice and Bob are both offline. They are editing a document containing the following text:

    The Color of Pomegranates

While offline, Alice deletes the text entirely. Bob, meanwhile, changes the spelling of "Color" to "Colour" by inserting a 'u'. Then they both come back online and their changes are merged. What does the resulting document contain?

Did you guess? Seriously--you really should try this before continuing on.

You might expect an empty document (Alice's delete wins), or perhaps "The Colour of Pomegranates" (Bob's edit wins). You'd think so, but no. The final result, for every collaborative editing algorithm we tested, is a document containing only the string 'u'. Yes, you read that right! "The Color of Pomegranates" is deleted, but Bob's addition, 'u', is kept! The reason for this is that they all work roughly the same: in direct conflicts, they first accept any deletes, and then apply any additions.
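To make that behavior concrete, here is a minimal sketch in TypeScript--not any real library's API, just the "deletes first, then additions" rule described above, with operations addressed by character position in the original document. Run on the example, it produces exactly the 'u' document.

```typescript
// Minimal sketch of the character-wise merge heuristic described above.
// Hypothetical types and merge(); not the API of Yjs, ShareJS, or any OT library.

type Insert = { kind: "insert"; afterIndex: number; text: string };
type Delete = { kind: "delete"; index: number };
type Op = Insert | Delete;

function merge(original: string, opSets: Op[][]): string {
  const deleted = new Set<number>();
  const insertedAfter = new Map<number, string>();

  // Collect every concurrent edit, regardless of which author it came from.
  for (const ops of opSets) {
    for (const op of ops) {
      if (op.kind === "delete") {
        deleted.add(op.index);
      } else {
        insertedAfter.set(op.afterIndex, (insertedAfter.get(op.afterIndex) ?? "") + op.text);
      }
    }
  }

  // Rebuild the document: deletes are honored first, then additions are kept.
  let result = "";
  for (let i = 0; i < original.length; i++) {
    if (!deleted.has(i)) result += original[i];
    result += insertedAfter.get(i) ?? "";
  }
  return result;
}

const doc = "The Color of Pomegranates";

// Alice deletes the entire text.
const alice: Op[] = [...doc].map((_, i): Op => ({ kind: "delete", index: i }));

// Bob inserts a 'u' after the second 'o' in "Color" (index 7), spelling "Colour".
const bob: Op[] = [{ kind: "insert", afterIndex: 7, text: "u" }];

console.log(JSON.stringify(merge(doc, [alice, bob]))); // "u"
```

Real CRDT and OT implementations are far more sophisticated about identifying and ordering operations, but on a direct conflict like this one the post's observation is that their resolution collapses to the same shape: Alice's deletes all apply, Bob's insertion is kept, and nobody gets the document they intended.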
These anomalies are common on direct conflicts.
The example above is interesting because it is a trivial, direct
editing conflict which these (popular) collaborative editing
algorithms do seem to claim to support, but which empirically
produces a document that no human would ever write on their own. This
is also a case where users interpreted the result as the Moment
product corrupting their data.
So how common is this problem? Every use case is different, but anecdotally, we found that roughly 20-30% of the direct conflicts we tried produced results that were similarly--in our opinion--unacceptable for our use case of offline editing. [N.B., I'm not sure it's helpful to list the other anomalies we encountered here; if it is, let me know at [email protected] or in the Moment Discord server and I'll follow up with a more thorough review.]
Ultimately, though, because of the negativity of the feedback and the
frequency of its occurrence, we felt we could not defend this result
to our users.
These algorithms do seem to claim to support offline editing.
Initially, when we encountered this result, we thought that we must
have misunderstood the semantics offered by these tools. Even now,
though, when I read the technical copy for these projects, I can't
help but think this scenario really is in scope.
- The Yjs readme explicitly states that it "supports [...] offline editing."
- The ShareJS landing page claims to allow collaboration "with any amount of lag (from zero to an extended holiday)."
- The Peritext authors describe it as "allow[ing] users to edit independent copies of a document" which can be "automatically merg[ed] [...] back together in a way that preserves the users' intent as much as possible."
I admit to still not being quite sure what to make of this. We are in
the market for a solution to this problem! I want to believe! I just
don't see how to reconcile these claims with the frequency and kinds
of errors we're seeing in practice.
Coordination-free algorithms will always have anomalies on direct
conflicts.
At some point in the evaluation process we resigned ourselves to the fact that the algorithms did not do what we wanted. But a lingering question remained: is this a fundamental problem with the algorithms, or could we fix it with some careful contributions upstream?
Unfortunately, we think it's a fundamental problem for the
algorithms, for a few reasons.
The algorithms only have enough information to guess what the result should be. The algorithms can't email Alice and Bob to see what they wanted, or review it in the GitHub Pull Request UI. They receive Alice's proposal to delete all the text and Bob's proposal to change the spelling without knowing what their intent was, and they must use a heuristic to decide what to do with them.
The algorithms operate on characters and have very, very poor guarantees about their output. This is why, in the example above, Alice and Bob end up with a document simply containing the letter 'u'--not a valid sentence, and not even a valid word! (The sketch below makes this concrete.)
Alice and Bob might make different decisions if they knew what the other was doing. Alice and Bob both made decisions based on the document containing the text "The Color of Pomegranates." If Alice knew Bob was changing the spelling, or Bob knew Alice was deleting the paragraph, they might have reconsidered making their edits at all.
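To make the second point concrete, here is the same hypothetical character-wise heuristic applied to another small, made-up conflict: Alice respells "Color" as "Colour" while Bob rewrites "Color" as "Hue." Again, this is a sketch of the behavior described above, not any specific library.

```typescript
// Same hypothetical "deletes first, then additions" heuristic as the earlier
// sketch, applied to a second made-up conflict over "The Color of Pomegranates".

const base = "The Color of Pomegranates";

const aliceInsert = { afterIndex: 7, text: "u" };   // Alice: Color -> Colour
const bobDeleted = new Set([4, 5, 6, 7, 8]);        // Bob: delete "Color"...
const bobInsert = { afterIndex: 3, text: "Hue" };   // ...and write "Hue" instead

let merged = "";
for (let i = 0; i < base.length; i++) {
  if (!bobDeleted.has(i)) merged += base[i];                  // deletes are honored first
  if (i === bobInsert.afterIndex) merged += bobInsert.text;   // then additions are kept
  if (i === aliceInsert.afterIndex) merged += aliceInsert.text;
}

console.log(merged); // "The Hueu of Pomegranates" -- a word neither author wrote
```

Because the merge is character-by-character, it happily manufactures a word that appears in neither Alice's nor Bob's version of the document.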
I think this is probably worth a post all on its own, particularly
with more treatment given to the fact that these algorithms don't--and
cannot--obey a causal ordering (cf., Jamie Brandon's post). But for
now, I think we can settle for the intuitive version of this
argument.
Offline editing is a UI/UX problem, not an algorithms problem; and
other conclusions.
So far the news has been bad. We entered the evaluation hopeful that we'd be able to support true offline editing "incidentally," simply by having implemented a tricky algorithm. And it is true that we left the evaluation process, well, bereft of hope--forced to conclude that users were right to think these algorithms were corrupting their data, that the anomalies were fundamental to the algorithms themselves, and that they would be frequent enough to be a real issue. YMMV, but that is where we ended up.
As I said, though, the news is not all bad. Another way to view this
result is as a strong argument for putting significant resources into
collaborative editing as a UI/UX problem. The algorithms cannot
completely solve the problem, but they can play a role in the
solution.
There are a few reasons to be optimistic about this. One is that we
already have one widely-adopted UI for merging documents: git! So in
some sense, the research question is how much better it can be--more
approachable, more accessible, more automatic, etc.
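For contrast, git's answer to the Pomegranates example would be to refuse to guess. If Alice's deletion and Bob's respelling lived on two branches (say, the current branch and a hypothetical bob branch), a three-way merge would stop and leave conflict markers in the file, handing the decision back to a human--roughly:

```
<<<<<<< HEAD
=======
The Colour of Pomegranates
>>>>>>> bob
```

The key property is that the conflicting edits are surfaced to a person rather than silently merged away.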
I think this resemblance is not superficial. In 2009, a surprising
amount of the discourse was focused on the algorithms git used to
automatically merge changes together. git adapted the Myers O(ND)
diff algorithm, then used mostly by biologists to do BLAST-like
sequence analysis. Bram Cohen thought the diff results were
unintuitive, and invented the patience diff algorithm, which was then
adopted by bzr, a now-defunct git competitor. The list goes on. I
think the primary change is that these debates were focused on
producing diffs that humans read; now the debate is whether the
algorithms can accomplish this result with no human involvement at
all.
Another reason to be optimistic is that some researchers do seem to
be concentrating on this problem as a UI/UX problem. I am thinking
particularly of Ink & Switch, e.g., the work on collaborative
history. I understand Ink & Switch to be operating purposefully a few
years ahead of production, but I am excited to see where this lands
when it settles down.
Acknowledgements
Heartfelt thanks to Kyle Kingsbury, Sean Gillespie, Ian Livingstone,
David Adrian, Ben Linsay, Alex Rasmussen, Lita Cho, Lenny Pruss,
Michael Yamnitsky, Camille Fournier, James Turnbull, and Ryan Cox for
their comments and feedback.
If you enjoyed this, you may enjoy other entries in the Moment devlog. We're also hiring! [email protected]