[HN Gopher] Automerge 2.0
       ___________________________________________________________________
        
       Automerge 2.0
        
       Author : llimllib
       Score  : 667 points
       Date   : 2023-01-30 21:10 UTC (1 days ago)
        
 (HTM) web link (automerge.org)
 (TXT) w3m dump (automerge.org)
        
       | ghoomketu wrote:
       | Sorry if this is a stupid question, but how do I write the JSON
       | changes to a MySQL database via PHP? From what I can understand,
       | the JSON is being updated on the client side via JavaScript, but
       | to save the changes do I have to send the entire JSON doc to the
       | server, or is it possible to somehow patch the JSON on the PHP
       | side as well? I want to create Google Docs-style autosave
       | functionality, so sending the whole JSON seems quite wasteful.
       | 
       | I'm learning web programming and this seems quite useful for what
       | I'm doing. Any tips on how to do it in PHP with this?
        
         | 1123581321 wrote:
         | Use the sync methods, which send, receive and merge packages of
         | changes until all local instances of the data agree that they
         | are caught up. These packages are intelligently created by
         | Automerge and balance the amount of data transferred against
         | how quickly all instances are consistent. Your PHP server app
         | would keep track of all the local instances and pass the sends
         | to them. You have a lot of freedom in how you implement this so
         | long as you process the messages.
         | 
         | Eventually you'll want to implement this using web sockets for
         | performance, using a system like Laravel Broadcasting.
         | 
         | You can also package the total state JSON as a compact binary
         | and store it in your server app, but you wouldn't use that file
         | to keep clients synced.
         | 
         | https://automerge.org/docs/cookbook/real-time/
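The exchange described above (send, receive and merge packages of changes until every instance is caught up) can be sketched with a toy model. This is not Automerge's API or wire format: `Replica`, `generate_message`, `receive_message` and `sync` are hypothetical names, and each side is handed the peer's known change ids directly, which a real sync protocol would have to discover for itself.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A toy replica: just a bag of changes keyed by a unique change id."""
    name: str
    changes: dict = field(default_factory=dict)

    def make_change(self, payload):
        # Ids are unique because each replica prefixes them with its own name
        # and its change count only ever grows.
        self.changes[f"{self.name}-{len(self.changes)}"] = payload

    def generate_message(self, peer_ids):
        """Return the changes the peer lacks, or None when it is caught up."""
        missing = {cid: p for cid, p in self.changes.items() if cid not in peer_ids}
        return missing or None

    def receive_message(self, message):
        self.changes.update(message)


def sync(a, b):
    """Swap messages until neither side has anything to send; return rounds used."""
    rounds = 0
    while True:
        to_b = a.generate_message(set(b.changes))
        to_a = b.generate_message(set(a.changes))
        if to_b is None and to_a is None:
            return rounds
        if to_b:
            b.receive_message(to_b)
        if to_a:
            a.receive_message(to_a)
        rounds += 1
```

In this model a server only needs to relay and apply messages; it never has to receive the whole document on every save.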
        
           | francislavoie wrote:
           | I have a usecase where I think this would be useful, but it's
           | the PHP backend producing the changes to state, not the
           | front-end. I have a test runner that runs async on the server
           | via a job system, and I want to sync the state to the front-
           | end. This means I'd have to produce the diff in the backend.
           | 
           | What are my options for that? There's no PHP library for
           | that, it seems. Is their goal to have someone build a PHP C
           | extension (or FFI) to call down to their library? That seems
           | not very fun, because it's somewhat less portable than having
           | a pure PHP implementation (even if it might be less
           | performant).
        
             | 1123581321 wrote:
             | You're right, there doesn't seem to be a PHP SDK yet. This
             | is unholy, but perhaps you could execute it in a node
             | environment with v8js. https://github.com/phpv8/v8js
             | 
             | Otherwise I think you'd be looking at a headless browser in
             | the test runner.
        
       | mateusfreira wrote:
       | Great improvements. Performance was probably the major concern
       | with previous Automerge versions. I have begun reading about
       | CRDTs, and this is the right direction for the field at the
       | moment.
       | 
       | Local first in my mind means your users can still work even if
       | your server is down for a small amount of time for any reason,
       | which means the users can also jump on an airplane and decide to
       | review/update their price table or check out their sales
       | commissions if they want. It benefits both sides of the software
       | economy (users and providers)[1].
       | 
       | I recommend reading the 2019 paper about local-first software.
       | It is not too academic and gives a good view of the challenges
       | [2].
       | 
       | [1]
       | https://mateusfreira.github.io/@mateusfreira-2022-12-04-my-t...
       | [2] https://martin.kleppmann.com/papers/local-first.pdf
        
       | jitl wrote:
       | Big congratulations to the Automerge folks for shipping this
       | after years of work. At this point both Yjs and Automerge have
       | Rust libraries and (soon?) bindings for more languages than just
       | JS that stay in sync.
       | 
       | Yjs (pure javascript?) is quoted on the paper benchmark at
       | 1,074ms and 10,141,696 bytes of memory, compared to Automerge
       | 2.0.2-unstable at 661ms and 22,953,984 bytes of memory. It looks
       | like Automerge 2 latest is faster than Yjs, but still uses 2x
       | more memory.
       | 
       | I wonder if this is comparing usage from JS via bindings, or
       | directly comparing two different rust implementations, or
       | comparing Automerge 2.0.2-unstable via Rust to Yjs via NodeJS.
       | 
       | I am still not sure which set of tools I would recommend; I
       | believe Yjs is more actively deployed in production since the
       | Automerge implementation was so far behind performance wise until
       | now. However one of the Peritext authors
       | (https://twitter.com/sliminality who is on my team at Notion)
       | tells me that Automerge is better at text because it doesn't
       | suffer from interleaved characters like Yjs does. So consider it
       | instead of Yjs!
        
         | meitros wrote:
         | Was also curious how the comparisons were being done. Isn't
         | there a new non-negligible overhead of converting and passing
         | data structures between js and rust?
        
           | josephg wrote:
           | I've gotten similar performance from yjs (about 1 second from
           | this test). To do it, I ran the benchmark itself in nodejs /
           | V8. When I benchmarked in rust, the same data set was loaded
           | from JSON and benchmarked using pure rust code.
           | 
           | It's not a fair test if you'll be running this code in a
           | browser, since the rust code will need to be compiled to wasm
           | (and suffer a ~3x performance penalty), while the javascript
           | code will run at the same speed. But whether that matters to
           | you depends on what you're doing.
        
             | kevincox wrote:
             | 3x seems unusually high. It definitely depends what you are
             | doing but I recently compared a Ricochet Robots solver that
             | I wrote (A* search) and it took only about 110% of native
             | time. When I benchmarked it a couple of years ago it took
             | about 2x so things have definitely improved a lot.
             | 
             | I'm sure the use case matters a lot. But at least in some
             | cases the result can be very close to native.
        
             | lll-o-lll wrote:
             | Wow, 3x perf penalty? I thought wasm was supposed to get us
             | to "near native speeds". Is this typical of the penalty
             | paid, or something specific to automerge?
        
               | josephg wrote:
               | Yep. At least, that's the slowdown rate I saw porting my
               | own (very optimized) CRDT to wasm last year. I haven't
               | measured automerge's wasm build but 3x slower is a
               | reasonable baseline for wasm's performance compared to
               | x86_64 code.
               | 
               | Some of that difference is a lack of auto-vectorization
               | in wasm. Wasm SIMD is pretty new, and not well supported
               | in wasm runtimes yet as far as I know.
        
         | josephg wrote:
         | I've spent a lot of time benchmarking both libraries and
         | talking to the authors. The main difference is that yjs has an
         | extra optimization that's still missing from automerge: Yjs
         | does internal run-length encoding of adjacent inserted items.
         | And adjacent inserts come up a lot in real text editing traces.
         | 
         | Adding this optimization to diamond types, in pure rust,
         | improved performance by another order of magnitude (25ms for
         | the same test with this tweak). It also dropped memory usage to
         | about 2MB. The automerge engineers know about this trick (I've
         | talked to them about it). So I assume it's in the pipeline
         | somewhere. And yjs is working on a rust reimplementation, which
         | should bring its performance in line too.
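A minimal sketch of the run-length idea mentioned above, assuming a simplified edit log in which every typed character is an insert tagged with an agent, a per-agent sequence number and a document position. The `Run` record and `append_insert` helper are hypothetical and do not reflect the internal structures of Yjs, Automerge or diamond-types.

```python
from dataclasses import dataclass


@dataclass
class Run:
    agent: str  # who typed the run
    seq: int    # sequence number of the first character in the run
    pos: int    # document position of the first character
    text: str   # the characters, stored together as one item


def append_insert(runs, agent, seq, pos, char):
    """Record a one-character insert, extending the last run when it is adjacent."""
    if runs:
        last = runs[-1]
        if (last.agent == agent
                and last.seq + len(last.text) == seq
                and last.pos + len(last.text) == pos):
            last.text += char  # same agent typing right after its previous character
            return
    runs.append(Run(agent, seq, pos, char))
```

Typing a word left to right collapses into a single run, which is why the saving is large on real editing traces where adjacent inserts dominate.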
        
           | memorythought wrote:
           | This optimization is indeed in the pipeline, although there
           | are other things nearer the front because performance is
           | currently Good Enough (tm) that other things are more
           | pressing (other things being e.g. completing the Peritext
           | implementation, improving the sync protocol).
        
           | satvikpendem wrote:
           | diamond-types (for reference for others [0]) still only
           | supports plain text, is that right? I was thinking of using
           | it for more general use cases such as an offline habit
           | tracker, which isn't text of course, but I was interested to
           | hear more on the progress towards other data types such as
           | generic JSON data.
           | 
           | Currently for this use case I've been using autosurgeon [1]
           | so far which has a nice Rust API for structs, even if it
           | might be slower than yjs (or yrs, its Rust implementation) or
           | diamond-types.
           | 
           | [0] https://github.com/josephg/diamond-types
           | 
           | [1] https://github.com/automerge/autosurgeon
        
             | josephg wrote:
             | Yep; sadly still true. I started some work last year to
             | simultaneously add support for arbitrary JSON data and add
             | a database-like storage layer to allow us to safely stream
             | changes to disk. (Automerge and yjs usually require the
             | entire data set to be re-saved in its entirety when updates
             | happen). It's taken longer than I thought, because I've gone
             | through a bunch of different designs for both pieces. We'll
             | get there; everything just takes longer than you want when
             | you do it for the first time.
             | 
             | I'll look at autosurgeon. Having similar APIs is good for
             | everyone.
        
       | EGreg wrote:
       | I loved Automerge and they chose all the right tech. But they
       | disclosed a huge caveat: the system got massively slower as more
       | operations were done on the CRDT. Is that still the case or was
       | it fixed in 2.0?
        
         | pvh wrote:
         | The article has quite a few performance numbers in it. The
         | short answer is that it's much, much faster but that we will
         | continue to pursue improvements pretty much forever.
        
       | dqpb wrote:
       | > you can just think of it as a version controlled data
       | structure. Automerge lets you record changes made to data and
       | then replay them in other places
       | 
       | This is not my understanding of what a CRDT is. From Wikipedia:
       | 
       | > [CRDTs] send their full local state to other replicas, where
       | the states are merged by a function which must be commutative,
       | associative, and idempotent. The merge function provides a join
       | for any pair of replica states, so the set of all states forms a
       | semilattice. The update function must monotonically increase the
       | internal state, according to the same partial order rules as the
       | semilattice.
       | 
       | For those people saying there is no such thing as conflict-free,
       | there is! But only for datatypes that satisfy these constraints.
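The properties in the Wikipedia quote can be checked on the smallest state-based example, a grow-only counter. This is a textbook sketch, not Automerge's data model: each replica only bumps its own slot, and merging takes the per-replica maximum, which is commutative, associative and idempotent.

```python
def increment(state, replica_id, amount=1):
    """Return a new state in which `replica_id` has counted `amount` more."""
    new_state = dict(state)
    new_state[replica_id] = new_state.get(replica_id, 0) + amount
    return new_state


def merge(a, b):
    """Join two states by taking the per-replica maximum."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in a.keys() | b.keys()}


def value(state):
    """The counter's value is the sum of every replica's contribution."""
    return sum(state.values())
```

Because `increment` can only raise a slot, every update moves the state upward in the same partial order that `merge` joins on.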
        
       | jchook wrote:
       | Any insight for why they choose CRDTs over Operational Transform?
       | 
       | AFAIK Google Docs uses OT, for example.
        
         | mkl wrote:
         | OT needs an authoritative server to coordinate things, but
         | CRDTs can be purely peer-to-peer.
        
       | LAC-Tech wrote:
       | This is really cool. There's a large class of applications for
       | which deterministic conflict resolution makes a lot of sense.
       | Having a mature library available and not having to implement
       | your own versions from research papers is great.
       | 
       | At least in theory... has anyone used this?
        
       | quartz wrote:
       | Excited to see this!
       | 
       | I built a personal project with automerge 1 recently because I
       | liked the philosophy of offline-first from the team and because
       | the docs were frankly much more approachable than yjs but ended
       | up switching to yjs half way through because of performance
       | issues and also for the rich text support (via the delta doc
       | type).
       | 
       | A little bummed about 2.0 not working with react native because
       | of webassembly but excited to see the peritext work progress for
       | rich text coming soon.
       | 
       | CRDT is one of those technologies I assumed was "done" a decade
       | ago and I was surprised to learn how much of the major progress
       | was only just recently made when I dug in.
       | 
       | Definitely expecting to see some cool new multiplayer startups
       | built on this tech.
        
       | tluyben2 wrote:
       | For now we use pouch/couch for this purpose, which does this
       | (merge docs automatically and pick a winner) out of the box, but
       | has the disadvantage of having to run couch which is an
       | infrastructure pain. We have been exploring substituting it with
       | crdt and this release seems to be the sign of maturity we needed
       | to get us over the line.
        
         | LAC-Tech wrote:
         | I think you're misunderstanding how couch/pouch works.
         | 
         | It doesn't merge documents automatically - it deterministically
         | picks a "winning" version of the data, but the winner is
         | _completely arbitrary_. You need to look at the conflicting
         | versions to properly resolve conflicts, otherwise you're
         | pretty much just rolling a die and saying "yeah that version
         | of the data will do whatever". There's no actual merging going
         | on.
        
       | therockhead wrote:
       | Is it advised to use CRDTs for an offline app that syncs to a
       | central server, such as a todo app like Apple's Reminders or
       | Todoist? Can simpler methods suffice?
        
         | jamil7 wrote:
         | You can if you like but, with a centralised server, you can do
         | all the merging logic in one place without a CRDT. With a CRDT,
         | you'd be storing the entire history of a document, which might
         | be overkill for something like a todo app that's just doing
         | last write wins.
        
         | LAC-Tech wrote:
         | I hate to give a vague answer, but it really depends.
         | 
         | What do you think a todo app should do when it detects a
         | conflict/concurrent update?
        
       | brunoqc wrote:
       | with CRDTs, can you do something like "discard old revisions,
       | like after 1 year", to make it more efficient?
        
       | codeptualize wrote:
       | Link isn't working (extra /),
       | https://automerge.org/blog/automerge-2/
        
         | llimllib wrote:
         | yikes! no idea how that happened, I'll delete and resubmit
         | 
         | edit: there's no way to delete a submission? weird
         | 
         | dang: any way you'll see this and fix the URL?
        
           | alixanderwang wrote:
           | you could just resubmit, and leave a comment here pointing
           | there. i'll look for it in the new section to upvote!
        
           | dang wrote:
           | You submitted https://automerge.org/blog/automerge-2/. But
           | that page has https://automerge.github.io//blog/automerge-2/
           | as its canonical URL (note the extra slash).* The canonical
           | URL redirected back to automerge.org, still with the double
           | slash. I've fixed the URL now.
           | 
           | * Our software canonizes URLs when it can. I suppose we could
           | make sure the canonical doesn't 404 before doing that, but
           | it's a bit tricky when there's an additional redirect etc.
        
             | llimllib wrote:
             | Thanks for fixing!
        
             | pvh wrote:
             | Thanks. We'll fix that before the next one.
        
           | satvikpendem wrote:
           | There is a delete button in one's submission history I
           | believe, so for you it'd be
           | https://news.ycombinator.com/submitted?id=llimllib
        
             | llimllib wrote:
             | no delete button there for me
        
               | satvikpendem wrote:
               | Ah, well dang might fix it if he sees this, or you could
               | email them too.
        
               | layer8 wrote:
               | You can't delete anymore once there are comments, I
               | believe.
        
               | dang wrote:
               | That is correct.
        
       | x-complexity wrote:
       | This is awesome, especially the massive improvements to
       | performance: The fact that a CRDT file can be within 2x the size
       | of a plain text file whilst still fully loading within 2s is a
       | great sight to behold.
        
       | the_duke wrote:
       | As it stands CRDTs are only really useful for a narrow subset of
       | data. Data is guaranteed to converge, but there is no guarantee
       | that the final result makes any kind of semantic sense in the
       | application domain.
       | 
       | One can write custom conflict resolution and treat the data
       | structures as a convenient baseline for event sourcing, but that
       | requires a lot of work and potentially often user guided
       | resolution.
       | 
       | I'd really love to see some research into deriving CRDT merge
       | semantics from a formal description of application behaviour.
        
         | pvh wrote:
         | In fact, convergence is a very easy property to preserve in all
         | distributed systems. The trivial but technically valid version
         | of convergence is to throw away all the writes and always
         | return an empty document. A "last writer wins" version at the
         | document level is what you get from a blob store like S3, but
         | while it does converge, it's not that great either.
         | 
         | What we probably want from a distributed system is useful
         | convergence properties that preserve the intent of the
         | participants. A CRDT might not be a good fit for a bank
         | account: if we can both withdraw the last $20 from my account,
         | the bank will be upset. On the other hand, it's a pretty great
         | way of combining independent observations into a list: it
         | doesn't matter what order the observations arrive. Easy!
         | 
         | Most CRDTs aim to preserve causality: if I see your change, and
         | then make my change, my new value will win. If we both make
         | changes without knowing about each other, that's a conflict.
         | 
         | Of course if we both edit unrelated fields -- maybe it's not a
         | conflict! At least, that's how we handle it in Automerge.
         | 
         | In the most conservative case, we should never merge data
         | automatically. Most systems have _unmodeled_ constraints. For
         | example, sometimes a `git` merge will produce no conflicts but
         | fail to compile anyway. Git's model (another CRDT) doesn't
         | model program behaviour, nor do we expect it to. In this case,
         | we rely on a combination of our experience, programming tools,
         | and git's version history tooling to figure out what went
         | wrong.
         | 
         | The conclusion I have is that a CRDT should give us robust
         | tools for minimizing conflict, but also needs to be able to
         | explain how things got to be the way they are and what you can
         | do to make them how you want.
         | 
         | We've made a decent amount of progress on this in Automerge and
         | have a paper coming up about this problem soon but I agree
         | there's still more distance to go. If there are particular
         | questions you have about merge semantics, I'm all ears! We'll
         | continue to explore this space for the foreseeable future and I
         | love to hear about new questions.
         | 
         | The last thing I want to add is that when you say "CRDTs are
         | only really useful for a narrow subset of data", you're really
         | drawing a lot of conclusions all at once about other people's
         | needs and interests. From my perspective, CRDTs are useful for
         | a _lot_ of kinds of data. Not everything certainly, but from
         | where I sit, perhaps more kinds of data than a limited single-
         | node relational database and more kinds than a POSIX file which
         | doesn't retain any history at all.
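The causality rule in the comment above ("if I see your change and then make mine, mine wins; if neither of us knew about the other, that's a conflict") can be illustrated with version vectors. This is a generic sketch; the thread does not say how Automerge tracks causality internally, and `compare` is a hypothetical helper name.

```python
def compare(a, b):
    """Relate two version vectors: 'equal', 'before', 'after' or 'concurrent'.

    Each vector maps a replica id to how many of that replica's changes
    the author had seen when making the change.
    """
    keys = a.keys() | b.keys()
    a_le_b = all(a.get(k, 0) <= b.get(k, 0) for k in keys)
    b_le_a = all(b.get(k, 0) <= a.get(k, 0) for k in keys)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # b had already seen a, so b supersedes it
    if b_le_a:
        return "after"       # a had already seen b, so a supersedes it
    return "concurrent"      # neither saw the other: a potential conflict
```

Only the "concurrent" case needs a merge policy; the other three outcomes already have an obvious winner.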
        
           | josephg wrote:
           | > From my perspective, CRDTs are useful for a lot of kinds of
           | data.
           | 
           | Yep I 100% agree.
           | 
           | I think the highest value uses for technology like this are
           | in creative applications. I think about wikis, blogs, shared
           | whiteboards, music production and video editing. In all of
           | these cases, "referential integrity" (database constraints)
           | don't really matter that much, and the working set is usually
           | pretty small.
           | 
           | Sketch was outcompeted by Figma because figma used a CRDT as
           | its backend, which enabled it to be collaborative. Sketch had
           | an arguably better product, and was first to market. But it
           | was stuck in the single-editor model because they didn't have
           | a tool like automerge.
           | 
           | As for conflicts, increasingly my favorite CRDT for "general
           | purpose" data (JSON trees) is MVRegisters. In the case of a
           | conflict, a MV (Multi-value) register stores all of the
           | conflicting values. But the application doesn't have to care
           | - we can still treat it like a "single writer wins" register.
           | 
           | To make this work, the CRDT provides two APIs: a simple API
           | and a complex API:
           | 
           | - The simple API just gives the application "the current
           | value". In the case of concurrent edits, the system quietly
           | chooses a winner. This is enough for most software most of
           | the time. It's certainly enough to get started.
           | 
           | - The complex API returns all current values when a conflict
           | has happened. Applications further along in their development
           | lifecycle can use this API to present conflicts to the user
           | and ask the user what should happen. (Or the application can
           | resolve the conflict itself using application-specific
           | logic).
           | 
           | The nice thing about this approach is that the data itself
           | doesn't have to change. It's just an application / UI change
           | to show conflicts. So collaborative applications can be
           | written without caring about conflicts (at first). And later,
           | when conflicts between multiple users cause problems, the
           | applications can move to a richer API if they want to. (And
           | remember, it all works like git under the hood anyway. We can
           | store the full history so even when conflicts are resolved in
           | a weird way, you still haven't lost the users' original
           | edits.)
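The two-API idea above can be sketched as a small multi-value register. This is an illustration under assumptions rather than any library's implementation: causality is tracked with plain version vectors, and the "simple API" winner is chosen by an arbitrary but deterministic rule so every replica agrees on it.

```python
import copy


def dominates(a, b):
    """True if version vector `a` has seen everything that `b` has."""
    return all(a.get(k, 0) >= b.get(k, 0) for k in a.keys() | b.keys())


class MVRegister:
    def __init__(self):
        self.entries = []  # list of (value, version_vector) pairs

    def fork(self):
        """Copy the register, e.g. to hand it to another replica."""
        return copy.deepcopy(self)

    def set(self, replica, value):
        # A local write has seen every current entry, so it replaces them all.
        version = {}
        for _, seen in self.entries:
            for r, count in seen.items():
                version[r] = max(version.get(r, 0), count)
        version[replica] = version.get(replica, 0) + 1
        self.entries = [(value, version)]

    def merge(self, other):
        # Keep every entry that no other entry has causally superseded.
        combined = self.entries + other.entries
        kept = []
        for value, version in combined:
            superseded = any(
                dominates(v2, version) and v2 != version for _, v2 in combined
            )
            if not superseded and (value, version) not in kept:
                kept.append((value, version))
        self.entries = kept

    def value(self):
        """Simple API: one value, chosen the same way on every replica."""
        if not self.entries:
            return None
        winner = max(self.entries, key=lambda e: (sorted(e[1].items()), repr(e[0])))
        return winner[0]

    def values(self):
        """Complex API: every concurrent value, for the app or user to resolve."""
        return sorted((v for v, _ in self.entries), key=repr)
```

An application can start with `value()` alone and later switch to `values()` to surface conflicts, without changing the stored data.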
        
           | crabmusket wrote:
           | > Most CRDTs aim to preserve causality: if I see your change,
           | and then make my change, my new value will win. If we both
           | make changes without knowing about each other, that's a
           | conflict.
           | 
           | I haven't kept track of CRDTs since I worked with them in
           | ~2015 and having read the paper by Shapiro et al, but I
           | thought a casual description would be more along the lines of
           | "once we both receive each other's changes, we will agree on
           | the final state"? Or does that no longer reflect current
           | state of the art, or was I just mistaken at the time?
        
           | pharmakom wrote:
           | Your mention of Git reminds me of CI and makes me think of a
           | general strategy:
           | 
           | 1. Allow the user (of the CRDT library) to define a fitness
           | function that should be minimised
           | 
           | 2. When multiple valid merges are possible, pick the result
           | according to the fitness function
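The two steps above can be sketched in a few lines. Everything here is hypothetical: `pick_merge` is not an API of any CRDT library, and `unbalanced_parens` merely stands in for a CI-like check. The one design point worth noting is the deterministic tie-break, without which replicas that see the candidates in different orders could pick different winners.

```python
def pick_merge(candidates, fitness):
    """Return the candidate with the lowest fitness score.

    Ties are broken by repr() so the choice does not depend on the order
    in which a replica happened to receive the candidates.
    """
    return min(candidates, key=lambda c: (fitness(c), repr(c)))


def unbalanced_parens(source):
    """Hypothetical fitness: how far the text is from balanced parentheses."""
    return abs(source.count("(") - source.count(")"))
```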
        
           | lll-o-lll wrote:
           | Would you say that automerge is useful for applications that
           | don't involve a human? I'm imagining a cluster of "service
           | registry" services that use automerge as a way to manage
           | shared state between them. There wouldn't be a human to fix a
           | merge conflict, so all possible merge outcomes would need to
           | be well defined.
           | 
           | The CRDT examples I see are all oriented around human
           | collaboration, are they a bad choice for something more akin
           | to a distributed database?
        
             | pharmakom wrote:
             | There is a talk about using CRDT across a server cluster to
             | maintain a social media "like" counter
        
       | satvikpendem wrote:
       | See also, Autosurgeon (with a 0.3.0 release today), which is a
       | higher-level API on top of Automerge for Rust:
       | https://github.com/automerge/autosurgeon
       | 
       | I'm building a mobile app with a server backend, and I was
       | looking for resources to build them in an offline-first way
       | (since unlike on the browser, people expect to use apps offline,
       | if they can, such as fitness or habit trackers).
       | 
       | I found the concept of conflict-free replicated data types
       | (CRDTs) interesting as it allows you to have fully offline
       | experiences while also having a conflict-free syncing experience.
       | I was looking for some good libraries and came across automerge
       | [0] and yrs [1], but both had some rough APIs as they're
       | primarily low-level Rust libraries that are wrapped by higher-
       | level TypeScript APIs.
       | 
       | Autosurgeon wraps the low-level API of automerge to make it much
       | more ergonomic, closer to the TypeScript experience, but in Rust
       | of course. You can for example use `struct`s which autosurgeon
       | will serialize and deserialize automatically, which is not
       | present in base automerge, which focuses more on string keys and
       | arbitrary values.
       | 
       | I am planning on using this together with Flutter and
       | flutter_rust_bridge [2] in order to use this same Rust library
       | everywhere. In this case, the server just becomes another (albeit
       | more privileged) client.
       | 
       | [0] https://github.com/automerge/automerge-rs
       | 
       | [1] https://github.com/y-crdt/y-crdt
       | 
       | [2] https://github.com/fzyzcjy/flutter_rust_bridge
        
         | abiro wrote:
         | Another thing to keep in mind is that if you want the data to
         | be end-to-end encrypted, then you need both devices to be
         | online at the same time to sync with Automerge.
        
         | pugio wrote:
         | Thanks for the links, this is pretty interesting stuff. Just a
         | quick note: it's Conflict Free _Replicated_ Data Types - not
         | Relational.
        
           | satvikpendem wrote:
           | Yep you're right, fixed.
        
         | paulgb wrote:
         | Autosurgeon repo: https://github.com/automerge/autosurgeon
        
         | KRAKRISMOTT wrote:
         | Be careful when using CRDTs. Having no conflicts does not mean
         | the end result is correct. In many cases you essentially
         | converge to last-write-wins with respect to the Lamport clock.
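To make the warning above concrete, here is a minimal last-write-wins register ordered by a Lamport clock, with the replica id as a tie-break. It is a generic sketch, not what any particular library does: both replicas converge, yet one of the two concurrent edits disappears without any signal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LWW:
    value: object
    clock: int     # Lamport clock of the write
    replica: str   # tie-break when two writes share a clock value


def write(current, value, replica):
    """Produce a new register state that supersedes `current`."""
    return LWW(value, current.clock + 1, replica)


def merge(a, b):
    """Keep whichever write has the larger (clock, replica) pair."""
    return a if (a.clock, a.replica) >= (b.clock, b.replica) else b
```

In the usage below, Alice's and Bob's writes both get clock 1, so the replica id decides and Alice's title is dropped on every replica.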
        
           | CGamesPlay wrote:
           | While this is true, the base "text" CRDT generally does the
           | right thing for user documents, and conflicts are generally
           | handled reasonably (though it's fair to say a bad conflict
           | would not be automatically resolved "correctly"). Yjs (not
           | Automerge) also has an XML CRDT, which extends the text CRDT
           | to always have correct XML syntax (although again, which text
           | falls into the <em> and which text falls outside of it may
           | not be "correct" in the case of a conflict).
        
           | satvikpendem wrote:
           | In automerge's (and usually any CRDT implementation's) case,
           | if it encounters a merge conflict, it will allow you to
           | handle it with a custom merge function. So it's not
           | necessarily that CRDTs are truly "conflict-free," just that
           | it will merge correctly in all other cases than editing the
           | same value at the same time.
        
           | LAC-Tech wrote:
           | Generally agree with this. There is no magic solution to
           | resolving conflicts in multi master systems (despite what
           | some database marketing may imply). CRDTs are predictable but
           | they are 'dumb' in how they automatically merge. Make sure
           | the outputs are likely to make sense for your problem domain.
        
           | pvh wrote:
           | I prefer to think of Automerge as a form of version control:
           | because the full history is retained, if you don't like the
           | merge you can decide what you want to do instead.
        
           | api wrote:
           | The fundamental thing is that no merge or consensus algorithm
           | can somehow telepathically know the real world intent of its
           | users.
           | 
           | CRDTs can best be thought of as a way to eliminate spurious
           | and false conflicts, leaving only real errors. Without them
           | anyone who has ever coded a data merge knows you tend to get
           | a ton of noise.
           | 
           | So basically you have reduced the problem surface area.
        
             | LAC-Tech wrote:
             | Can you expand on "spurious and false conflicts" here?
        
               | CGamesPlay wrote:
               | Not the OP, but I'm guessing he's referring to, for
               | example, two users each correcting a typo in a different
               | location in the document. From the perspective of the
               | text CRDT, there's no conflict, and users are likely to
               | agree. Raising a "file edited simultaneously, choose
               | which version to use" error would be a "spurious and
               | false conflict" in this sense.
               | 
               | Note that from a different user perspective, say a code
               | document, such a conflict is actually correct and
               | desired. So it's all about context.
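
The distinction drawn above between a spurious conflict and a real one
can be sketched with a toy three-way merge. This is an illustration
only, assuming equal-length word lists; `merge_words` is a hypothetical
helper and is not how Automerge or any text CRDT is implemented.

```python
def merge_words(base, ours, theirs):
    """Toy three-way merge over equal-length word lists (not a real text CRDT)."""
    merged, conflicts = [], []
    for index, (b, o, t) in enumerate(zip(base, ours, theirs)):
        if o == t:
            # Both sides agree: either untouched or changed identically.
            merged.append(o)
        elif o == b:
            # Only "theirs" changed this word.
            merged.append(t)
        elif t == b:
            # Only "ours" changed this word.
            merged.append(o)
        else:
            # Both sides changed the same word differently: a real conflict.
            conflicts.append(index)
            merged.append(b)
    return merged, conflicts


base = "teh quick brown fox jumsp".split()
alice = "the quick brown fox jumsp".split()   # fixes the first typo
bob = "teh quick brown fox jumps".split()     # fixes the last typo
carol = "ten quick brown fox jumsp".split()   # edits the same word as alice

merged, conflicts = merge_words(base, alice, bob)
_, real_conflicts = merge_words(base, alice, carol)
```

Alice's and Bob's fixes touch different words, so a file-level "choose
a version" prompt would be noise; only the Alice/Carol pair edits the
same word and is reported.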
        
           | samstave wrote:
           | Is it possible to have a selectable roll-back/diff feature,
           | so that once the sync goes through, the originals on both
           | sides keep a 'backup'/source-of-truth copy that you can
           | easily revert to?
        
             | ChadNauseam wrote:
             | Yes, the full history is always retained
        
         | crabmusket wrote:
         | This is neither here nor there, but I've always preferred
         | "convergent replicated data-types", as there is some confusion
         | about what the "true acronym" was intended to be.
        
       | jasmer wrote:
       | Can someone answer how we go about the single-source-of-truth
       | problem in this distributed scenario? Or does this approach
       | guarantee consistency among all synced data so that they are all
       | effectively the same?
       | 
       | I can see how this would be used for 'live data sharing' but what
       | about for more persistent information, like documents, designs
       | etc?
        
         | LAC-Tech wrote:
         | The term used in the CRDT papers is "strong eventual
         | consistency" - basically it's an eventually consistent system
         | with the added guarantee that any two replicas that have
         | received the same updates - _in any order_ - will have the same
         | state.
         | 
         | So as for documents etc, if you can find or come up with a CRDT
         | where the automatic merging function will give you something
         | that makes sense for a user - sure.
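
The "same updates, any order, same state" guarantee can be checked on
a very small example. The sketch below assumes a last-writer-wins map
keyed by a (logical clock, replica id) stamp; the update format and the
names `apply_update` and `replay` are made up for illustration and do
not come from Automerge or the CRDT papers.

```python
from itertools import permutations


def apply_update(state, update):
    """Apply one update to a last-writer-wins map; the larger stamp wins."""
    key, value, stamp = update  # stamp = (logical_clock, replica_id)
    current = state.get(key)
    if current is None or stamp > current[1]:
        state[key] = (value, stamp)
    return state


def replay(order):
    """Build a replica's state by applying updates in the given order."""
    state = {}
    for update in order:
        apply_update(state, update)
    return state


updates = [
    ("title", "Draft", (1, "a")),
    ("title", "Final", (2, "b")),
    ("title", "Finale", (2, "c")),  # concurrent with the update from "b"
    ("owner", "sam", (1, "b")),
]

# Every delivery order of the same four updates.
states = [replay(order) for order in permutations(updates)]
all_equal = all(state == states[0] for state in states)
```

All orderings converge, but note how: the concurrent "Final" is
silently discarded because "c" sorts after "b". That is the kind of
predictable but 'dumb' merge that has to make sense for the domain.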
        
         | satvikpendem wrote:
         | > _Or does this approach guarantee consistency among all synced
         | data so that they are all effectively the same?_
         | 
         | Correct, this is what CRDTs are for: eventual consistency.
        
       | renke1 wrote:
       | I am currently using yjs. What would be the Automerge equivalent
       | of the approach described here [0] for syncing docs in yjs? I don't
       | need any WebSockets or real time stuff. It always seemed so
       | complicated in Automerge compared to yjs. I just want to roll my
       | own simple sync mechanism via HTTP.
       | 
       | [0]: https://docs.yjs.dev/api/document-updates#syncing-clients
        
         | OJFord wrote:
         | Looks pretty much the same with different words?
         | 
         | https://automerge.org/docs/cookbook/real-time/#changes-inter...
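
A rough sketch of the general shape of such a sync, with no WebSockets
involved: each side says what it already has, and the other side
returns only the missing changes. `Replica`, `state_vector`,
`changes_since` and `sync` are hypothetical names for a toy model, not
the Automerge or Yjs API, and the model assumes each replica's changes
always arrive as a contiguous run of sequence numbers.

```python
class Replica:
    """Toy replica holding a log of changes keyed by (origin, sequence)."""

    def __init__(self, name):
        self.name = name
        self.log = {}  # (origin_name, seq) -> payload

    def local_change(self, payload):
        seq = self.state_vector().get(self.name, 0) + 1
        self.log[(self.name, seq)] = payload

    def state_vector(self):
        # Highest sequence number seen from each origin.
        vector = {}
        for origin, seq in self.log:
            vector[origin] = max(vector.get(origin, 0), seq)
        return vector

    def changes_since(self, remote_vector):
        # Only the changes the remote side has not seen yet.
        return {
            (origin, seq): payload
            for (origin, seq), payload in self.log.items()
            if seq > remote_vector.get(origin, 0)
        }

    def apply_changes(self, changes):
        self.log.update(changes)


def sync(a, b):
    """One round trip: each side receives only what it lacks."""
    to_b = a.changes_since(b.state_vector())
    to_a = b.changes_since(a.state_vector())
    b.apply_changes(to_b)
    a.apply_changes(to_a)
    return len(to_a), len(to_b)


client, server = Replica("client"), Replica("server")
client.local_change("insert 'a'")
client.local_change("insert 'b'")
server.local_change("insert 'x'")

first_round = sync(client, server)
second_round = sync(client, server)
```

Over plain HTTP, the state vector would travel in the request body and
the result of `changes_since` in the response; a second round with no
new edits transfers nothing.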
        
       | MrBuddyCasino wrote:
       | > Cloud software is fragile and prone to outages, rarely supports
       | offline use, and is expensive to scale to large audiences.
       | 
       | Hint to whoever wrote this: except the offline thing, this has
       | not been my experience at all. Automerge sounds cool enough as
       | is, no need to make up reasons why Cloud Bad.
        
       | pharmakom wrote:
       | Is there a way to hard delete old history in CRDTs? I'm thinking
       | about legal and privacy requirements.
        
         | CGamesPlay wrote:
         | I know this is possible in Yjs: replace the key itself with a
         | new instance of a text CRDT, and populate it with the latest
         | value. Such a change will destroy any concurrent edits, however
         | (concurrent changes will be overwritten by the new instance of
         | the CRDT upon merge). A more complex solution is garbage
         | collection, which depends on the internals of the CRDT. I don't
         | think Yjs exposes this for specific edits in text fields.
         | 
         | Automerge... the same approach would work but Automerge
         | advertises itself as storing the full history, so I think the
         | history of the root object would leak the data. I am not sure
         | if it's possible to erase such history with Automerge.
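
The trade-off described above can be modeled in a few lines. This is a
toy with invented names (`TextField`, `replace_with_fresh_instance`);
it mirrors neither the Yjs nor the Automerge data model, and only shows
why swapping in a fresh instance drops both old history and any edit
made against the old instance.

```python
class TextField:
    """Toy field that keeps every value it has ever held."""

    def __init__(self, instance_id, text):
        self.instance_id = instance_id
        self.history = [text]

    @property
    def value(self):
        return self.history[-1]

    def edit(self, target_instance_id, text):
        # An edit names the instance it was made against. If that
        # instance has been replaced, the edit is dropped.
        if target_instance_id != self.instance_id:
            return False
        self.history.append(text)
        return True


def replace_with_fresh_instance(field, new_instance_id):
    # Carry over only the latest value; older values are not copied.
    return TextField(new_instance_id, field.value)


field = TextField("v1", "name: Jane Doe, ssn: 123")
field.edit("v1", "name: Jane Doe")

fresh = replace_with_fresh_instance(field, "v2")

# A concurrent edit made against the old instance arrives late.
late_edit_applied = fresh.edit("v1", "name: Jane D.")
```

The fresh instance no longer contains the sensitive earlier value, but
the late edit is lost, which is the cost mentioned in the comment.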
        
           | pharmakom wrote:
           | This is a big problem!
           | 
           | I like full history most of the time, but there are
           | situations where hard delete is a... hard requirement.
        
       | dang wrote:
       | Related:
       | 
       |  _Automerge CRDT - Build local-first software_ -
       | https://news.ycombinator.com/item?id=30881016 - April 2022 (8
       | comments)
       | 
       |  _Automerge: A JSON-like data structure (a CRDT) that can be
       | modified concurrently_ -
       | https://news.ycombinator.com/item?id=30412550 - Feb 2022 (69
       | comments)
       | 
       |  _Automerge: a new foundation for collaboration software [video]_
       | - https://news.ycombinator.com/item?id=29501465 - Dec 2021 (29
       | comments)
       | 
       |  _Automerge: A library [..] for building collaborative
       | applications in JavaScript_ -
       | https://news.ycombinator.com/item?id=24791713 - Oct 2020 (1
       | comment)
       | 
       |  _Automerge: JSON-like data structure for building collaborative
       | apps_ - https://news.ycombinator.com/item?id=16309533 - Feb 2018
       | (98 comments)
        
         | satvikpendem wrote:
         | I'd also add
         | 
         | - Local First Software
         | [https://news.ycombinator.com/item?id=31594613 (28 comments)]
         | by Martin Kleppmann (who works on Automerge at the company Ink
         | and Switch, perhaps better known as the author of Designing
         | Data-Intensive Applications), which introduces Automerge
         | 
         | - CRDTs: The Hard Parts
         | [https://news.ycombinator.com/item?id=23802208 (124 comments)],
         | a video talk also by Kleppmann
         | 
         | - CRDTs go brrr, 5000x faster CRDT implementations
         | [https://news.ycombinator.com/item?id=28017204 (151 comments)],
         | by the creator of another CRDT library in Rust, Diamond Types
         | [https://github.com/josephg/diamond-types]
        
       | MayeulC wrote:
       | Both y-crdt and Automerge are rust libraries (though the latter
       | seems to have a lot more targets), is there a short rundown/vs
       | comparison somewhere? This article compares performance with yjs,
       | but not y-crdt, and there are probably other comparison points to
       | be studied.
        
       | chaxor wrote:
       | Can you cram a DuckDB in this? Or maybe SQLite? I suppose you
       | can export several tables into jsonlines files, but it might be
       | cool to have a cleaner solution on the backend.
        
       | josephg wrote:
       | Congratulations to the automerge team! This is a fantastic
       | accomplishment.
       | 
       | I particularly enjoy the performance improvements. I benchmarked
       | automerge 18 months ago and the benchmark took about 5 minutes to
       | process the edits of a paper. Some single character inserts took
       | as much as 2 seconds of cpu time. From the article it looks like
       | this entire editing trace (of 260000 keystrokes) is down to
       | 600ms. That's a huge improvement. It means automerge is similar
       | in performance to yjs, and in turn that makes automerge useful
       | for a much broader set of applications.
       | 
       | One thing I really enjoy about the collaborative editing space is
       | how much ideas are shared around. The highly compact binary
       | encoding was done first in automerge, then copied and tuned in
       | yjs and diamond types. The idea of using an internal list rather
       | than a tree went the other way - yjs came up with the idea, and
       | that approach has landed in all production sequence CRDTs that I
       | know about.
       | 
       | There's a bunch of work in the pipeline around non-interleaving,
       | BFT properties, database interoperability and more performance
       | tuning that we are (collectively) still figuring out. But the
       | future of CRDTs seems bright. In a few years I'd love all new
       | software to be built on local first fundamentals. Work like this
       | is how we get there.
       | 
       | To everyone involved, great work! Keep it coming!
        
         | jamil7 wrote:
         | Thanks for all your work on performance, my side project is a
         | rich text CRDT in Swift which wraps AttributedString. I took a
         | lot of inspiration from Peritext and used your blog post
         | extensively for performance tuning.
        
           | nugmanoff wrote:
           | hey, it sounds super cool! mind sharing the link? definitely
           | something I'd love to use
        
             | munhitsu wrote:
             | shameless plug :) you can steal whatever you want from:
             | https://github.com/munhitsu/CRAttributes
             | https://github.com/munhitsu/CRAttributesDemo
        
             | jamil7 wrote:
             | Haven't open sourced it yet but will definitely post to HN
             | at some point.
        
       | sirodoht wrote:
       | So exciting! Strangely enough, a couple of hours before this
       | release, we just managed to wrap our heads around Yjs after
       | playing with it on and off for a few weeks!
       | 
       | For anyone not up to date with the world of CRDTs, Seph Gentle's
       | two blog posts have become legendary:
       | 
       | * https://josephg.com/blog/crdts-are-the-future/
       | 
       | * https://josephg.com/blog/crdts-go-brrr/
       | 
       | these are also worth checking out:
       | 
       | * https://github.com/y-crdt/y-crdt (rust implementation started
       | by the creator of Yjs, Kevin Jahns)
       | 
       | * https://github.com/y-crdt/ypy (python bindings for the rust
       | implementation)
       | 
       | * https://github.com/josephg/diamond-types (Seph Gentle's rust
       | implementation of YATA, the algorithm behind Yjs)
        
         | mkl wrote:
         | Some big past HN threads on those blog posts:
         | 
         | CRDTs are the future
         | https://news.ycombinator.com/item?id=24617542 312 comments,
         | https://news.ycombinator.com/item?id=31049883 45 comments
         | 
         | Faster CRDTs: An Adventure in Optimization
         | https://news.ycombinator.com/item?id=28017204 151 comments,
         | https://news.ycombinator.com/item?id=33903563 22 comments
        
       | riverdweller wrote:
       | Would be great to have a CDN-based JS distribution for those who
       | want to play without the heartache of JS build systems
       | (npm/yarn/webpack/etc).
        
       ___________________________________________________________________
       (page generated 2023-01-31 23:02 UTC)