[HN Gopher] A simple way to build collaborative web apps
___________________________________________________________________
A simple way to build collaborative web apps
Author : thezjy
Score : 165 points
Date : 2021-08-17 13:42 UTC (9 hours ago)
(HTM) web link (zjy.cloud)
(TXT) w3m dump (zjy.cloud)
| Gabrielwxf wrote:
| > Dealing with a global database brings in much complexity that
| is not essential to the subject matter of this article, which
| will wait for another piece.
|
| Excellent write-up. It would be great to know why CockroachDB failed
| to meet your needs.
| BackBlast wrote:
| You could build this with couchdb multi master regional servers
| and pouchdb on the client and have full consistency with the
| replication both to clients and servers as well as conflict
| resolution (in case of collision) done for you.
|
| This route seems like a lot of extra work for pretty similar
| functionality.
| nesarkvechnep wrote:
| I'm interested how this stacks against Phoenix Channels +
| Presence.
| deathtrader666 wrote:
| Yes, it would be great if someone with this experience can
| chime in, especially since Phoenix has CRDTs built-in.
| sambroner wrote:
| I'm really glad to see an article like this. I've worked in the
| space for a while (Fluid Framework) and there's a growing number
| of libraries addressing realtime collab. One of the key things
| that many folks miss is that building a collaborative app with
| real time coauthoring is tricky. Setting up a websocket and
| hoping for the best won't work.
|
| The libraries are also not functionally equivalent. Some use OT,
| some use CRDTs, some persist state, some are basically websocket
| wrappers, fairly different perf guarantees in both memory &
| latency etc. The very different capabilities make it complicated
| to evaluate all the tools at once.
|
| Obviously I'm partial to the Fluid Framework, but not many realtime
| coauthoring libraries have made it as easy to get started as
| Replicache. Kudos to them!
|
| A few solutions with notes...
|
| - Fluid Framework - My old work... service announced at Microsoft
|   Build '21 and will be available on Azure
| - yJS - CRDTs. Great integration with many open source projects (no
|   service)
| - Automerge - CRDTs. Started by Martin Kleppmann, used by many at Ink
|   & Switch (no service)
| - Replicache - Seen here, founder has done a great job with previous
|   dev tools (service integration)
| - Codox.io - Written by Chengzheng Sun, who is super impressive and
|   wrote one of my fav CRDT/OT papers
| - Chronofold - CRDTs. Oriented towards versioned text. I'm mostly
|   unfamiliar
| - Convergence.io - Looks good, but I haven't dug in
| - Liveblocks.io - Seems to focus on live interactions without storing
|   state
| - derbyjs - Somewhat defunct. Cool, early effort.
| - ShareJS/ShareDB - Somewhat defunct, but the code and thinking is
|   very readable/understandable and there are good OSS integrations
| - Firebase - Not the typical model people think of for RTC, but
|   frequently used nonetheless
|
| I should add... I talk to many folks in the space. People are
| very welcoming and excited to help each other. Really fun space
| right now.
| vyrotek wrote:
| Fluid Framework looks pretty cool! I somehow missed the Build
| announcement about this.
|
| Maybe it's just me, but it has a SignalR + Orleans sort of vibe
| to it when I think about the types of problems it solves. I
| will definitely be digging into this a bit more.
| BackBlast wrote:
| I'll have to look at some of these; I've reviewed some but not all.
| You are missing some I'm familiar with.
|
| PouchDB+CouchDB work well out of the box with minimal fuss for
| open pieces you can just plug into this role. PouchDB handles
| state persistence and replication on the client; CouchDB is the
| reliable cloud service you can replicate to.
|
| Meteor, at least their pre-Apollo stack, had realtime collab
| type features with their mini-mongo client and oplog tailing.
| nikodunk wrote:
| I've used YJS and can strongly recommend.
| https://github.com/yjs/yjs
|
| Built a Google Docs like rich text collaborator for a client on
| Express/Psql and React. Worked like a charm. The hardest part
| was dealing with ports on AWS to be honest.
| memco wrote:
| Very nice writeup! However, the example did not fully work for
| me. I could perform CRUD on a single tab, but opening the list in
| multiple tabs did not replicate the list or actions. Seeing this
| in the console:
|
| [Error] Could not connect to the server.
| [Error] Fetch API cannot load
| https://damp-fire-554.fly.dev/replicache-pull?list_id=kx1I-gXPWwOxU9teRUJ_c
| due to access control checks.
| [Error] Failed to load resource: Could not connect to the server.
| (replicache-pull, line 0)
|
| Safari 14.2 on macOS 10.15.7.
| timwis wrote:
| What about conflict resolution? If two users update the same
| record/field around the same time? Isn't that the trickiest part
| of real time?
| tommoor wrote:
| I believe it uses a CRDT hosted by a third party service.
| amelius wrote:
| Trickiest part is probably adding fine-grained access control
| rules.
| tmikaeld wrote:
| That's what Replicache[0] solves, it provides for Causal+
| Consistency across the entire system.
|
| "This means that transactions are guaranteed to be applied
| atomically, in the same order, across all clients. Further, all
| clients will see an order of transactions that is compatible
| with causal history. Basically: all clients will end up seeing
| the same thing, and you're not going to have any weirdly
| reordered or dropped messages."
|
| [0] https://doc.replicache.dev/design
|
| Note: There's more in their links, but the linked sites are
| down..
| btown wrote:
| It appears Replicache doesn't use CRDTs since it has a
| central source of truth:
| https://news.ycombinator.com/item?id=22175530
|
| See also the commentary here:
| https://doc.replicache.dev/guide/local-mutations
|
| This sounds a lot like Operational Transform but without the
| transform part - it assumes that locally applied mutations
| can be undone and rebased without user interaction. But I
| feel like the Google Wave team would have a lot of objections
| to the idea that this can just be ignored. If your state is
| just a group of key value stores where last write wins and
| everyone can agree on who's last, that's fine, but text/token
| streams require a notion of transformation that I'm worried
| Replicache simply glosses over.
| aboodman wrote:
| I'm not sure if you are understanding that when Replicache
| rebases operations locally it actually re-executes code
| which can have arbitrary effects. This design yields a lot
| of flexibility to preserve intent: the function can look at
| current state of world and decide to do something
| different.
|
| Now, it is true that OT is considered the gold standard for
| certain kinds of collaborative editing, in particular
| unstructured text. But CRDTs are quickly catching up and I
| believe that any CRDT should by definition be implementable
| on top of Replicache.
|
| It's also quite a lot easier to implement a Replicache
| backend than an OT backend.
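|
| A minimal sketch of the rebase idea described above, in TypeScript.
| It does not use the Replicache API: Mutation, markDone, rename,
| rebase and the to-do shaped state are hypothetical names chosen for
| illustration. The point it tries to show is that pending local
| mutations are re-executed as code on top of the latest server
| state, so each one can look at the current state and decide what
| to do.
|
```typescript
// Hypothetical types for illustration; this is not the Replicache API.
type Todo = { id: string; text: string; done: boolean };
type State = Map<string, Todo>;

// A mutation is code, not a diff: it is re-run against whatever state exists.
type Mutation = (state: State) => void;

// Mark a todo done only if it still exists when the mutation runs.
const markDone = (id: string): Mutation => (state) => {
  const todo = state.get(id);
  if (todo) state.set(id, { ...todo, done: true });
};

// Rename a todo, or recreate it if it is missing from the current state.
const rename = (id: string, text: string): Mutation => (state) => {
  const todo = state.get(id);
  state.set(id, todo ? { ...todo, text } : { id, text, done: false });
};

// Rebase: copy the authoritative server snapshot, then replay the
// mutations the server has not confirmed yet, in their original order.
function rebase(serverState: State, pending: Mutation[]): State {
  const next: State = new Map(serverState);
  for (const mutate of pending) mutate(next);
  return next;
}

// In this snapshot, todo "a" is gone (say another client deleted it).
const server: State = new Map([["b", { id: "b", text: "milk", done: false }]]);
const local = rebase(server, [
  markDone("a"), // no-op: "a" no longer exists
  rename("b", "oat milk"), // applies to the server's version of "b"
  rename("c", "eggs"), // creates "c"
]);
```
|
| Whether a mutation should skip, recreate, or do something else
| when the state has changed underneath it is an application
| decision; the two policies above are only examples.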
| tmikaeld wrote:
| I'd rather it was configurable, since there are different
| use cases for both and they can be in the same app. So you're
| definitely making a valid point.
| Chris_Newton wrote:
| Indeed, there can never be one universal solution to this,
| because the problem is one of specification rather than
| (only) implementation.
|
| For example, suppose we have an edit/delete conflict, where
| two clients concurrently interact with the same entity in
| your data model. In a simple case, we can decide to
| "resurrect" the affected entity and apply the edit, which
| is the option that never results in significant data loss
| and so might be a reasonable behaviour if no user
| interaction is involved.
|
| Now, what if there were other consequences of deleting that
| entity? Maybe the client that deleted the entity then
| created a new entity that would violate some uniqueness
| constraint if both existed simultaneously. Or maybe it
| wasn't the originally deleted entity that would violate
| that constraint, but some related one that was also deleted
| implicitly because of a cascade. How should we reconcile
| these changes, if simply allowing either one to take
| precedence means discarding data from the other?
|
| At least if all clients are communicating in close to real
| time, it's unlikely that any one of them will diverge far
| from the others before they get resynchronised, so the
| scope for awkward conflicts is limited. But in general, we
| might also need to support offline working for extended
| periods, when multiple clients might come back with longer
| sequences of potentially conflicting operations, and
| there's no general way to resolve that without the
| intervention of users who can make intelligent decisions
| about intent, or at least a set of automated rules that
| makes sense in the context of that specific application.
| And in the latter case, we'd still probably want to prove
| that our chosen rules were internally consistent and
| covered all possible situations, which might not be easy.
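|
| To make the simple edit/delete case above concrete, here is a
| hedged TypeScript sketch of the "resurrect" policy. The Op and
| Entity types and the resolveConcurrent function are hypothetical,
| and the sketch ignores the harder cases (uniqueness constraints,
| cascades, long offline sequences) discussed in the same comment.
|
```typescript
// Hypothetical operation and entity types, for illustration only.
type Entity = Record<string, string>;
type Op =
  | { kind: "edit"; id: string; fields: Entity }
  | { kind: "delete"; id: string };

// Resolve two concurrent operations on the same entity.
// Policy assumed by this sketch: an edit beats a concurrent delete, so the
// entity is "resurrected" and the edited fields are kept.
// Returns null only when both sides deleted the entity.
function resolveConcurrent(base: Entity, a: Op, b: Op): Entity | null {
  let result: Entity | null = null;
  for (const op of [a, b]) {
    if (op.kind === "edit") {
      // If both sides edited, the later argument wins on overlapping fields.
      // A real system would need a deterministic order all clients agree on.
      result = { ...(result ?? base), ...op.fields };
    }
  }
  return result;
}

const base: Entity = { title: "Draft", owner: "ann" };
const resurrected = resolveConcurrent(
  base,
  { kind: "delete", id: "1" },
  { kind: "edit", id: "1", fields: { title: "Final" } },
);
const gone = resolveConcurrent(
  base,
  { kind: "delete", id: "1" },
  { kind: "delete", id: "1" },
);
```
|
| Here resurrected keeps the owner and takes the new title, while
| gone is null. Any other rule could be swapped in; which one is
| right is the specification question raised above.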
| soco wrote:
| The good old CAP theorem hits again...
| tabtab wrote:
| How one wants to see them could depend; that's why I
| recommend using an RDBMS. One can "play back" transactions
| using different orders and filters. If teams get confused or
| accidentally "step on each other's toes", then one may need to
| review different scenarios to see what was intended by two or
| more parties.
| Gabrielwxf wrote:
| I suppose, as mentioned in the essay, it's handled by
| Replicache.
| Zealotux wrote:
| Figma's blog has a few valuable articles on that subject:
| https://www.figma.com/blog/how-figmas-multiplayer-technology...
| [deleted]
| Wowfunhappy wrote:
| I remember listening to an episode of the Exponent podcast, in
| which Ben Thompson said something like (paraphrasing from
| memory):
|
| > People who love "native apps" can complain about Electron all
| they want--but there's simply no replacement for the real-time
| collaboration offered by web-based apps like Figma!
|
| As someone who's not exactly thrilled with Electron and its
| memory usage--is there a reason the two go together? Is there a
| reason we can't build collaborative apps in Cocoa and GTK? I
| think these systems are awesome, I just think they'd be even
| better if they weren't also running full web browsers!
| rl3 wrote:
| Figma's performance is excellent due in large part to the fact
| they compile a lot of native code to Wasm. Electron or not it's
| still fast.
|
| To answer your question, collaborative apps ideally need to
| target the widest possible audience. Barring a massive budget,
| the best way to accomplish this is to also have a singular
| compile/build target. In most cases, that's the web platform.
| Wowfunhappy wrote:
| Figma's performance is impressive for an Electron app, but it
| does choke on very large files, which Sketch would have
| handled without a care. It's not great.
|
| If Sketch had had Figma's collaboration features, we wouldn't
| have switched. But during the pandemic it was necessary.
| BackBlast wrote:
| It could totally be done natively. The obstacle is how much of
| the stack you have to write and maintain. There are js
| libraries that do most of this heavy lifting for you, and CRDTs
| are pretty new to most devs.
|
| It's just much, much easier and more cost-effective to build a
| single code base and hit many, many target platforms with it.
|
| Computing history has also shown that publishing efficient lean
| software doesn't help in the market. At least not as much as time
| to market, getting the key features right, and your ongoing costs.
| idontevengohere wrote:
| Really interesting...you can build a similar (websocket/db
| backed) app with LiveView out of the box, no? Any idea how well
| that'd hold up against this solution?
| paulgb wrote:
| Does LiveView have any conflict resolution, or would it just be
| last-write-wins?
| _virtu wrote:
| This was my first thought as well.
| sirtimbly wrote:
| A 225K gzipped .wasm file download for a client-side state
| management and persistence layer is not great. It is
| competitive with some similar solutions, but still a lot for any
| web app's performance budget.
| aboodman wrote:
| The release build is 100k brotli I believe. It's possible this
| site is using the dev binary.
| albertgoeswoof wrote:
| This stack reminds me of Meteor, which came out nearly a decade
| ago(!). https://meteor.com
|
| It never really took off in the mainstream - I think because it
| was before many developers really trusted JS on the server, and a
| "full stack" framework is quite a big commitment for a team to
| shift to. Also most CRUD apps don't need real time collab.
|
| I remember being amazed when changes were instantly propagated
| between my phone and laptop browsers with almost zero lag. This
| was the demo that sold it for me
| https://www.youtube.com/watch?v=MGbmW9bwJh4
| eatonphil wrote:
| I haven't yet done this but based on some research it seems to me
| like the core of any collaborative app today (one that wants to
| avoid Firebase and other hosted platforms, which Replicache seems to
| be) is easiest served by picking some CRDT library.
|
| There are a couple of open-source CRDT libraries that provide
| both clients and servers (yjs [0] and automerge [1] are two big
| ones for JavaScript I'm aware of).
|
| My basic assumption is that as long as you put all your relevant
| data into one of these data structures and have the CRDT library
| hook into a server for storing the data, you're basically done.
|
| This may be a simplistic view of the problem though. For example
| I've heard people mention that CRDTs can be space inefficient so
| you may want/have to do periodic compaction.
|
| [0] https://github.com/yjs/yjs
|
| [1] https://github.com/automerge/automerge
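|
| For readers new to the idea, a minimal sketch of one of the
| simplest CRDTs, a grow-only counter, in TypeScript. It is not the
| yjs or automerge API (both offer far richer types such as text,
| maps and lists); GCounter, increment, merge and value are names
| made up for this example. It only illustrates the property those
| libraries rely on: replicas can merge in any order and still
| converge.
|
```typescript
// Grow-only counter: one slot per replica; merge takes the per-slot maximum.
type GCounter = Record<string, number>;

// Each replica only ever increases its own slot.
function increment(counter: GCounter, replicaId: string, by = 1): GCounter {
  return { ...counter, [replicaId]: (counter[replicaId] ?? 0) + by };
}

// Merge is commutative, associative and idempotent, so replicas can
// exchange state in any order, any number of times.
function merge(a: GCounter, b: GCounter): GCounter {
  const out: GCounter = { ...a };
  for (const [id, n] of Object.entries(b)) {
    out[id] = Math.max(out[id] ?? 0, n);
  }
  return out;
}

// The counter's value is the sum of all replica slots.
function value(counter: GCounter): number {
  return Object.values(counter).reduce((sum, n) => sum + n, 0);
}

// Two replicas work independently, then sync in either direction.
const phone = increment(increment({}, "phone"), "phone"); // phone slot = 2
const laptop = increment({}, "laptop", 3); // laptop slot = 3
const phoneThenLaptop = merge(phone, laptop);
const laptopThenPhone = merge(laptop, phone);
```
|
| Both merge orders give a value of 5, and merging a state with
| itself changes nothing. The space concern mentioned above shows
| up in richer CRDTs, which typically keep extra metadata per
| element.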
| brunoqc wrote:
| Would Chronofold work for this too?
| eatonphil wrote:
| If this [0] is what you're talking about, at the moment yjs
| and automerge are significantly more full-featured and used
| by many major companies.
|
| [0] https://github.com/dkellner/chronofold
| brunoqc wrote:
| Thanks!
| tabtab wrote:
| > _is easiest served by picking some CRDT library._
|
| RDBMS A.C.I.D. and transactions are also capable of much of the
| same.
| feanaro wrote:
| You probably don't want to use Automerge. See
| https://josephg.com/blog/crdts-go-brrr/ for a nice CRDT
| optimization story.
| eatonphil wrote:
| Interesting! I know there was a large performance refactor
| that was merged in May [0]. This post you link was written in
| June of this year. It's unclear whether the performance fix is
| related to the reported issues, or whether those issues still exist.
|
| At the very least, the automerge maintainers seem to be very
| actively tackling performance problems.
|
| [0] https://github.com/automerge/automerge/pull/253
| Zealotux wrote:
| So far I've managed to keep the state in my side-project in sync
| with Websockets and Redux, Replicache sounds like the kind of
| solution I'd love to use, but boy the pricing makes it impossible
| to even consider.
| ZeroCool2u wrote:
| I don't have any plans to use Replicache, but I went and looked
| at the pricing and I was kind of struck by your comment.
| Looking at it, it seems pretty fair to me? Especially under 10k
| MACs. It seems like a flat rate / month is pretty nice too.
| Plus, it's free for all non-commercial use.
|
| Am I wildly off base here? Is it just that middle tier jump to
| over 10k that is a no go?
|
| Again, I don't have a horse in this race or even my own
| startup, just trying to understand if my own judgement is way
| off.
| Zealotux wrote:
| I would quickly be in the $500/mo tier and that would be a
| considerable cost to handle since I don't really make that kind
| of profit yet. But I have to agree anything beyond 10K is
| very reasonable given the features. I just kind of wish they
| had a more affordable bracket between 500 and 10K but they
| probably have reasons not to.
| craig_asp wrote:
| We implemented all that manually, more or less in swift (and
| sqlite), then react+redux, and on the back end - postgres and
| python+flask. Works flawlessly so far. We do have the same setup
| more or less, with listeners triggering UI updates and push
| messages signalling the clients to fetch data from the server.
| Then, on the server, we have two dbs -> one where we store each
| update or create message, in a postgres-based queue, and another
| one, in a normalised format which we use for login (it's way
| faster than replaying all messages from the queue). There are
| complexities when you move beyond one or two tables, though -
| like maintaining relations, ensuring things get done in the
| correct order, that they get merged (we merge all attributes of
| each item - e.g. one client can change color, and if another
| changes the text content of the item, these will get merged), etc.
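|
| One way such attribute-level merging could look, as a hedged
| TypeScript sketch. This is not the code described above; Item and
| mergeAttributes are hypothetical, attributes are assumed never to
| be removed, and the tie-break (the second client wins when both
| changed the same attribute) is an arbitrary choice for the
| example.
|
```typescript
// Hypothetical item shape: a flat bag of string attributes.
type Item = Record<string, string>;

// Three-way merge per attribute: compare each side against the common base
// and keep whichever side changed. If both changed the same attribute,
// `theirs` wins (an arbitrary policy chosen for this sketch).
function mergeAttributes(base: Item, ours: Item, theirs: Item): Item {
  const merged: Item = { ...base };
  for (const key of Object.keys({ ...ours, ...theirs })) {
    if (key in theirs && theirs[key] !== base[key]) {
      merged[key] = theirs[key];
    } else if (key in ours && ours[key] !== base[key]) {
      merged[key] = ours[key];
    }
  }
  return merged;
}

const original: Item = { color: "red", text: "buy milk" };
const clientA: Item = { ...original, color: "blue" }; // changed color only
const clientB: Item = { ...original, text: "buy oat milk" }; // changed text only
const mergedItem = mergeAttributes(original, clientA, clientB);
```
|
| With these inputs mergedItem has color "blue" and text
| "buy oat milk", so neither client's change is lost.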
|
| We gave up on the websocket part and implemented basic polling,
| because websockets were not supported by App Engine at the time
| (things might have moved on since then, which is a couple of years ago).
| Yet, for a note/todo/habit tracking app, it simply doesn't need
| to be real-time from our experience.
|
| Have a play at https://www.mindpad.io/app/. You can see how it
| works if you open up the web app in two incognito tabs, or on an
| iPhone and the web.
| davedx wrote:
| It's a nice summary of how to use these technologies, but
| considering it states avoiding vendor lock-in is a goal, I was
| surprised to see it using fly.io and a managed cockroachDB.
| mrkurt wrote:
| It didn't actually use CockroachDB, they ended up using
| Postgres + Read Replicas.
|
| I work on Fly.io, but there's very little vendor lock in here.
| We can't afford to lock people in, we're too small. We need to
| make their existing stuff work with zero friction.
___________________________________________________________________
(page generated 2021-08-17 23:00 UTC)