https://zjy.cloud/posts/collaborative-web-apps
A Simple Way to Build Collaborative Web Apps
published on 2021-08-17
---------------------------------------------------------------------
---------------------------------------------------------------------
Recently I've been thinking about how to build a collaborative web
app in 2021.
By collaborative web app I mean apps with desktop-like interactions
and realtime collaborations, such as Notion, Discord, Figma, etc.
I want to find an approach suitable for small teams, simple but not
necessarily easy, fast by default, and scalable when it grows.
Thus I started a journey to find out by creating a demo todo app,
exploring the tools and methods along the way. (I welcome all kinds of
third-party tools and services, as long as they don't bring in
hard-to-replace vendor lock-in.)
Our todo app has these features:
* users can create, edit, delete and reorder todos, which are
stored on the server for persistency
* users can cooperatively edit the same list of todos, and changes
are automatically synced between different clients in realtime
* all the operations should be as fast as possible for all the
users around the world
The fast part is mainly concerned with latency, because most apps
won't have much throughput to deal with in the beginning. More
specifically, we want the changes made by one client to be delivered
to other clients at a speed as close to that of light as practical:
in less than 100ms within the same geographical region and in several
hundred ms across continents. The app is named Todo Light for this
reason.
You can play with the end product here. (The result failed to meet
our goal for speed, depending on your location in the world. More on
this later.) The client and server code are both hosted on GitHub.
[Figure: Todo Light app screenshot]
A random list id is created when the app starts. You can share the
URL with the list_id parameter to collaboratively edit the same list
of todos with others (or yourself in another browser tab).
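A minimal sketch of how such a shareable URL could be produced. The helper name `ensureListID` and the caller-supplied ID generator are assumptions made for this example; the app itself may do this differently.

```javascript
// Hypothetical helper: returns the list id found in the URL, or adds a
// freshly generated one. `generateID` is supplied by the caller (nanoid
// would be a natural choice in the browser).
function ensureListID(href, generateID) {
  const url = new URL(href)
  let listID = url.searchParams.get('list_id')
  if (!listID) {
    listID = generateID()
    url.searchParams.set('list_id', listID)
  }
  return { listID, href: url.toString() }
}

// Example usage with a fixed generator so the result is predictable.
const fresh = ensureListID('https://example.com/', () => 'abc123')
console.log(fresh.href) // https://example.com/?list_id=abc123
```

In the browser, you would pass in `window.location.href` and write the returned URL back to the address bar, so that copying it shares the list.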
Todo Light is simple enough as a demo but nonetheless embodies some
essence of the SaaS apps mentioned above, albeit in a much-simplified
manner.
Let's get started building it.
Client
---------------------------------------------------------------------
---------------------------------------------------------------------
We begin with the client because it contains the core of our app.
The user interface part is easy; we build it with React. Other
reactive UI frameworks like Svelte and Vue should work as well. I
chose React because of familiarity.
The client-only version of the app is straightforward to write:
    import React, { useState } from 'react'

    export default function TodoApp() {
      const [todos, setTodos] = useState([])
      const [content, setContent] = useState('')

      return (
        <>
          {/* markup for the input form and the todo list is omitted here */}
        </>
      )
    }
With fewer than 100 lines of code, the resulting app already looks and
works like the end product, except that its state is volatile.
Refresh the browser, and you lose all your todo items!
We use React's state to store data, which is okay for the input value
because it is temporary by nature, but not quite right for the todos.
The todos need to be:
1. updated in a local browser cache for maximal speed (this pattern
is sometimes called Optimistic UI)
2. synced to the server for persistency
3. delivered to other clients in correct order and state.
Nowadays, there is a plethora of frontend state management libraries
to choose from: Redux, MobX, Recoil, GraphQL clients like Apollo and
Relay, etc. Sadly none of them works in our use case. What we need is
a distributed system with realtime syncing and conflict resolution
baked in. Although there are good writings on this subject,
distributed systems are still too hard to implement correctly for a
one-person team. I'd like to bring in some help.
After some searching, a promising option showed up: Replicache, whose
homepage says:
Replicache makes it easy to add realtime collaboration, lag-free
UI, and offline support to web apps. It works with any backend
stack.
Sounds too good to be true (spoiler: it's mostly true). How does
Replicache achieve these bold claims? Its doc site has a whole page
to explain how it works. To save you time, I will summarize it roughly
here.
Replicache implements a persistent store in the browser, using
IndexedDB. You can mutate the store locally and subscribe to part of
the store in your UI. When data changes, subscriptions re-fire, and
the UI refreshes.
You need to provide two backend endpoints for Replicache to talk to:
replicache-pull and replicache-push. replicache-pull sends back a
subset of your database for the current client. replicache-push
updates the database from local mutations. After applying a mutation
on the server, you send a WebSocket message hinting to affected
clients to pull again. (As the Replicache doc says, managing your own
WebSocket backend has a very high operational cost. We use Ably
here.) That's all you need to do. Replicache orchestrates the whole
process to make sure the state is consistent while being synced in
realtime.
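The "pull again" hint can be sketched on the client as follows. This is a sketch under assumptions: `listenForPokes` is a hypothetical helper, `channel` is expected to be an Ably realtime channel (only its `subscribe(eventName, listener)` method is used), and `rep` is a Replicache instance exposing `pull()`. The stand-in objects at the bottom exist only to make the example runnable on its own.

```javascript
// Hypothetical helper: whenever the server publishes a 'change' message,
// ask Replicache to pull the latest state.
function listenForPokes(channel, rep) {
  channel.subscribe('change', () => {
    rep.pull()
  })
}

// Stand-ins for the Ably channel and the Replicache instance.
const listeners = []
const fakeChannel = {
  subscribe(eventName, listener) {
    listeners.push({ eventName, listener })
  },
  publish(eventName, data) {
    listeners
      .filter((l) => l.eventName === eventName)
      .forEach((l) => l.listener({ name: eventName, data }))
  },
}
let pullCount = 0
const fakeRep = {
  pull() {
    pullCount += 1
  },
}

listenForPokes(fakeChannel, fakeRep)
fakeChannel.publish('change', {})
console.log(pullCount) // 1
```

The message carries no data on purpose: it only tells clients that something changed, and the pull itself fetches the actual state.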
We will dive into the backend integration in the next section of this
article. For now, let's rewrite the state-related code utilizing
Replicache:
    // Only the relevant parts are shown
    import { Replicache } from 'replicache'
    import { useSubscribe } from 'replicache-react'
    import { nanoid } from 'nanoid'

    const rep = new Replicache({
      // other replicache options
      mutators: {
        async createTodo(tx, { id, completed, content, order }) {
          await tx.put(`todo/${id}`, {
            completed,
            content,
            order,
            id,
          })
        },
        async updateTodoCompleted(tx, { id, completed }) {
          const key = `todo/${id}`
          const todo = await tx.get(key)
          // the todo may have been deleted by another client in the meantime
          if (!todo) return
          // write a new object instead of mutating the value read from the store
          await tx.put(key, { ...todo, completed })
        },
        async deleteTodo(tx, { id }) {
          await tx.del(`todo/${id}`)
        },
      },
    })

    export default function TodoApp() {
      // content and setContent come from useState, as in the client-only version
      const todos =
        useSubscribe(rep, async (tx) => {
          return await tx.scan({ prefix: 'todo/' }).entries().toArray()
        }) ?? []

      const onSubmit = (e) => {
        e.preventDefault()
        if (content.length > 0) {
          rep.mutate.createTodo({
            id: nanoid(),
            content,
            completed: false,
          })
          setContent('')
        }
      }

      // the two handlers below are attached to a single todo item;
      // `todo` refers to that item
      const onChangeCompleted = (e) => {
        rep.mutate.updateTodoCompleted({
          id: todo.id,
          completed: e.target.checked,
        })
      }

      const onDelete = (_e) => {
        rep.mutate.deleteTodo({ id: todo.id })
      }

      // render
    }
We replace React's in-memory state with Replicache's persistent
store. The app should work as before, except your carefully written
todo items won't disappear when the browser tab closes.
Notice the mutators we register when initializing Replicache. They
are the main APIs we use to interact with Replicache's store. When
they are executed on the client, the corresponding mutations will be
sent to the replicache-push endpoint by Replicache.
With the help of Replicache, you can think about your client state as
a giant hashtable. You can read from it and write to it as you like,
and Replicache will dutifully keep the state in sync between the
server and all the clients.
Server
---------------------------------------------------------------------
---------------------------------------------------------------------
Now let's move on to the server.
The plan is clear: we will implement the two endpoints needed by
Replicache, using some backend language (we use Node.js in this case)
and some database. The only requirement by Replicache is that the
database must support a certain kind of transaction.
Before we set out to write the code, we need to think about the
architecture. Remember the third feature of Todo Light? It should be
as fast as possible for all users around the world.
Since we have implemented Optimistic UI on the client, most
operations are already speedy (zero latency). For changes to be
synced from one client to others quickly, we still need to achieve
low latency for the requests to the server. Hopefully, the latency
should be under 100ms for the collaboration to feel realtime.
We can only achieve that by globally deploying the server and the
database. If we deploy to only one region, the latency for a user on
another continent will be several hundred milliseconds no matter what
we do. It's the speed of light, period.
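A quick back-of-the-envelope calculation shows where that floor comes from. The figures are rough assumptions: light in optical fiber travels at about two thirds of its vacuum speed, and the distance between Hong Kong and Los Angeles is taken as roughly 11,600 km.

```javascript
// Light in optical fiber covers about 200,000 km per second,
// i.e. about 200 km per millisecond.
const FIBER_KM_PER_MS = 200

// Lower bound for one request/response round trip over a given distance.
function minRoundTripMs(distanceKm) {
  return (2 * distanceKm) / FIBER_KM_PER_MS
}

// Assumed distance: roughly 11,600 km between Hong Kong and Los Angeles.
console.log(minRoundTripMs(11600)) // 116
```

This is only the physical minimum for a single round trip. Real routes are longer than the straight line, and a request usually needs more than one round trip, which is how the total reaches several hundred milliseconds.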
Globally deploying a stateless server should be easy. At least that's
what I initially thought. Turns out I was wrong. In 2021, most cloud
providers still only allow you to deploy your server to a single
region. (Mostly I'm referring to PaaS like Heroku and Google App
Engine. FaaS, or function as a service, is much easier to deploy
globally but comes with its own gotchas.) You need to go through many
extra steps to have a global setup.
Luckily I found Fly.io, a cloud service that helps you "deploy app
servers close to your users", which is exactly what we need. It comes
with an excellent command-line tool and a smooth "push to deploy"
deployment flow. Scaling out to multiple regions (in our case, Hong
Kong and Los Angeles) takes only a few keystrokes. Even better, they
offer a pretty generous free tier.
The only question left is which database we should use. Globally
distributed databases with strong consistency are a huge and
complicated area that has been tackled by big companies in recent
years.
Inspired by Google's Spanner, many open source solutions have emerged.
One of the most polished competitors is CockroachDB. Luckily, they
offer a managed service with a 30-day trial.
Although I managed to build a version of Todo Light using
CockroachDB, the end product in this article is based on a much
simpler Postgres setup with distributed read replicas. Dealing with a
global database brings in much complexity that is not essential to
the subject matter of this article, which will wait for another
piece.
We need two tables, one for todos and one for replicache clients.
[Figure: database schema]
Replicache needs to track the last_mutation_id of different clients
to coordinate all mutations, whether confirmed or pending. The
deleted column is used for soft deletes. The version column is used
to compute the changes for Replicache pulls, which we will explain
later.
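To illustrate why the server reports a last mutation ID per client, here is a toy model of the client-side bookkeeping. It is a simplification and not Replicache's actual implementation, and all names in it (`rebase`, `pendingMutations`, `serverState`) are made up for the example. Mutations the server has confirmed are dropped, and the still-pending ones are replayed on top of the state the server returned.

```javascript
// Toy model: each pending mutation has an id and a function that applies it
// to a plain object map of key -> value, returning a new map.
function rebase(serverState, lastMutationID, pending) {
  // Mutations with id <= lastMutationID are already reflected in serverState.
  const stillPending = pending.filter((m) => m.id > lastMutationID)
  // Replay the rest, in order, on top of a copy of the server state.
  const state = stillPending.reduce((acc, m) => m.apply(acc), { ...serverState })
  return { state, pending: stillPending }
}

const pendingMutations = [
  {
    id: 1,
    apply: (s) => ({ ...s, 'todo/a': { content: 'buy milk', completed: false } }),
  },
  {
    id: 2,
    apply: (s) => ({ ...s, 'todo/a': { ...s['todo/a'], completed: true } }),
  },
]

// The server has processed mutation 1 but not yet mutation 2.
const serverState = { 'todo/a': { content: 'buy milk', completed: false } }
const result = rebase(serverState, 1, pendingMutations)
console.log(result.pending.map((m) => m.id)) // [ 2 ]
console.log(result.state['todo/a'].completed) // true
```

The user keeps seeing their unconfirmed change (the todo stays checked), while everything the server already knows about comes from the server's version of the data.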
The replicache-push endpoint receives arguments from the local
mutators. Let's persist them to the database. We also need to
increment the lastMutationID in the same transaction, as mandated.
    router.post('/replicache-push', async (req, res) => {
      const { list_id: listID } = req.query
      const push = req.body
      try {
        // db is a typical object that represents a database connection
        await db.tx(async (t) => {
          let lastMutationID = await getLastMutationID(t, push.clientID)
          for (const mutation of push.mutations) {
            const expectedMutationID = lastMutationID + 1
            if (mutation.id < expectedMutationID) {
              console.log(
                `Mutation ${mutation.id} has already been processed - skipping`,
              )
              continue
            }
            if (mutation.id > expectedMutationID) {
              console.warn(`Mutation ${mutation.id} is from the future - aborting`)
              break
            }
            // these mutations are automatically sent by Replicache when we
            // execute their counterparts on the client
            switch (mutation.name) {
              case 'createTodo':
                await createTodo(t, mutation.args, listID)
                break
              case 'updateTodoCompleted':
                await updateTodoCompleted(t, mutation.args)
                break
              case 'updateTodoOrder':
                await updateTodoOrder(t, mutation.args)
                break
              case 'deleteTodo':
                await deleteTodo(t, mutation.args)
                break
              default:
                throw new Error(`Unknown mutation: ${mutation.name}`)
            }
            lastMutationID = expectedMutationID
          }
          await t.none(
            'UPDATE replicache_clients SET last_mutation_id = $1 WHERE id = $2',
            [lastMutationID, push.clientID],
          )
        })
        // once the transaction has committed, we use Ably to notify the
        // clients, so that their next pull sees the new state
        const channel = ably.channels.get(`todos-of-${listID}`)
        channel.publish('change', {})
        res.send('{}')
      } catch (e) {
        console.error(e)
        res.status(500).send(e.toString())
      }
    })

    async function getLastMutationID(t, clientID) {
      const clientRow = await t.oneOrNone(
        'SELECT last_mutation_id FROM replicache_clients WHERE id = $1',
        clientID,
      )
      if (clientRow) {
        return parseInt(clientRow.last_mutation_id, 10)
      }
      // first push from this client: start counting from 0
      await t.none(
        'INSERT INTO replicache_clients (id, last_mutation_id) VALUES ($1, 0)',
        clientID,
      )
      return 0
    }

    async function createTodo(t, { id, completed, content, order }, listID) {
      await t.none(
        `INSERT INTO todos (
          id, completed, content, ord, list_id) values
          ($1, $2, $3, $4, $5)`,
        [id, completed, content, order, listID],
      )
    }

    async function updateTodoCompleted(t, { id, completed }) {
      // a fresh version marks the row as changed for the next pull
      await t.none(
        `UPDATE todos
        SET completed = $2, version = gen_random_uuid()
        WHERE id = $1
        `,
        [id, completed],
      )
    }

    // other similar SQL CRUD functions are omitted
The replicache-pull endpoint requires more effort. The general plan
is that for every request to replicache-pull, we compute a diff of
the state and an arbitrary cookie (not to be confused with an HTTP
cookie) to send back to the client. The client attaches the cookie to
its next request, and we use it to compute the next diff. Rinse and
repeat.
How to compute the diff may be the most challenging part of
integrating Replicache. The team provides several helpful strategies.
We will use the most recommended one: the row version strategy.
    router.post('/replicache-pull', async (req, res) => {
      const pull = req.body
      const { list_id: listID } = req.query
      try {
        await db.tx(async (t) => {
          const lastMutationID = parseInt(
            (
              await t.oneOrNone(
                'select last_mutation_id from replicache_clients where id = $1',
                pull.clientID,
              )
            )?.last_mutation_id ?? '0',
            10,
          )
          const todosByList = await t.manyOrNone(
            'select id, completed, content, ord, deleted, version from todos where list_id = $1',
            listID,
          )
          // patch is an array of mutations that will be applied to the client
          const patch = []
          const cookie = {}
          // For the initial call we will just clear the client store.
          if (pull.cookie == null) {
            patch.push({ op: 'clear' })
          }
          todosByList.forEach(
            ({ id, completed, content, ord, version, deleted }) => {
              // The cookie is a map from row id to row version.
              // As the todos count grows, it might become too big to be
              // efficiently exchanged. By then, we can compute a hash as a
              // cookie and store the actual cookie on the server.
              cookie[id] = version
              const key = `todo/${id}`
              if (pull.cookie == null || pull.cookie[id] !== version) {
                if (deleted) {
                  patch.push({
                    op: 'del',
                    key,
                  })
                } else {
                  // additions and updates are all represented as the 'put' op
                  patch.push({
                    op: 'put',
                    key,
                    value: {
                      id,
                      completed,
                      content,
                      order: ord,
                    },
                  })
                }
              }
            },
          )
          // res.json also ends the response
          res.json({ lastMutationID, cookie, patch })
        })
      } catch (e) {
        res.status(500).send(e.toString())
      }
    })
Because version is a random UUID generated by Postgres's
gen_random_uuid function, we can use it to efficiently calculate
whether a todo item has been updated or not.
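As a worked example of that comparison, the sketch below isolates the diff into a standalone function and runs it on sample rows. `diffByRowVersion` and the sample data are made up for illustration, and the short version strings stand in for the UUIDs Postgres would generate.

```javascript
// rows: current rows of one list. prevCookie: map of row id -> version that
// the client saw last time, or null on its first pull.
function diffByRowVersion(rows, prevCookie) {
  const patch = prevCookie == null ? [{ op: 'clear' }] : []
  const cookie = {}
  for (const { id, version, deleted, ...value } of rows) {
    cookie[id] = version
    // same version as last time: the client already has this row
    if (prevCookie != null && prevCookie[id] === version) continue
    const key = `todo/${id}`
    patch.push(
      deleted ? { op: 'del', key } : { op: 'put', key, value: { id, ...value } },
    )
  }
  return { patch, cookie }
}

const sampleRows = [
  { id: 'a', content: 'buy milk', completed: false, deleted: false, version: 'v1' },
  { id: 'b', content: 'walk dog', completed: true, deleted: false, version: 'v7' },
  { id: 'c', content: 'old item', completed: false, deleted: true, version: 'v9' },
]

// The client last saw a@v1, b@v6 and c@v8, so only b and c have changed.
const { patch, cookie } = diffByRowVersion(sampleRows, { a: 'v1', b: 'v6', c: 'v8' })
console.log(patch.map((p) => `${p.op} ${p.key}`)) // [ 'put todo/b', 'del todo/c' ]
```

Comparing versions only for equality is what makes random UUIDs sufficient here: the server never needs to know which version is newer, only whether the client's copy is still current.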
That's all for the server code, and we've come to the end of our
journey. With the help of many great tools, we've successfully built
a fast, collaborative todo app. More importantly, we've worked out a
reasonably simple approach to building similar web apps. As the user
base and feature set grow, this approach should scale well in both
performance and complexity.
Bonus - Implement Reordering with Fractional Indexing
---------------------------------------------------------------------
---------------------------------------------------------------------
You may notice that we use the type text for the ord column in the
database schema, even though a number type may seem a better fit. The
reason is that we are using a technique called Fractional Indexing to
implement reordering. Check the source code of Todo Light or try to
implement it yourself. It should be an interesting exercise.
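For readers who want a starting point, here is a simplified sketch of the idea. It is not the implementation used by Todo Light; `keyBetween` and the base-36 alphabet are choices made for this example. Keys are strings compared lexicographically, and a new key is generated to sort strictly between its two neighbours, so moving an item only rewrites that item's ord value.

```javascript
const DIGITS = '0123456789abcdefghijklmnopqrstuvwxyz'

// Returns a key that sorts strictly between `a` and `b`.
// `a` may be '' (no lower neighbour), `b` may be null (no upper neighbour).
// Keys must not end in '0', otherwise nothing would fit right below them;
// keys produced by this function never do.
function keyBetween(a, b) {
  if (b !== null && a >= b) throw new Error(`${a} must sort before ${b}`)
  if (b !== null) {
    // Skip the common prefix, treating missing digits of `a` as '0'.
    let n = 0
    while ((a[n] ?? '0') === b[n]) n++
    if (n > 0) return b.slice(0, n) + keyBetween(a.slice(n), b.slice(n))
  }
  const digitA = a ? DIGITS.indexOf(a[0]) : 0
  const digitB = b !== null ? DIGITS.indexOf(b[0]) : DIGITS.length
  if (digitB - digitA > 1) {
    // There is room between the first digits: take the one in the middle.
    return DIGITS[Math.round((digitA + digitB) / 2)]
  }
  // The first digits are adjacent, so the new key needs to be longer.
  if (b !== null && b.length > 1) return b[0]
  return DIGITS[digitA] + keyBetween(a.slice(1), null)
}

const first = keyBetween('', null) // 'i'
const second = keyBetween(first, null) // 'r'
const moved = keyBetween(first, second) // 'n'
console.log([second, moved, first].sort()) // [ 'i', 'n', 'r' ]
```

Because the keys are strings of arbitrary length, there is always room for another key between two neighbours, which is why the column is text rather than a number.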
At the time of writing, one shortcoming of Replicache is that its
local transactions are not fast enough to enable heavy interactions
such as drag and drop. To prevent lag, we turned on the
useMemstore: true option, which disables offline support. Hopefully,
this will be fixed soon.