[HN Gopher] Downsides of Offline First
___________________________________________________________________
Downsides of Offline First
Author : typingmonkey
Score : 254 points
Date : 2021-10-01 12:57 UTC (10 hours ago)
(HTM) web link (rxdb.info)
(TXT) w3m dump (rxdb.info)
| akkartik wrote:
| I remember the good old days when we had offline first by
| default. We called them just "computer programs".
|
| Offline first as a principle is more important than web apps. If
| today's browsers have trouble with offline first, consider that a
| downside of today's browsers.
| candiddevmike wrote:
| This is a good list. The app I built
| (https://about.homechart.app) is "read only" offline first, it's
| a compromise I chose over having to solve queuing writes and
| consistency checks during the (assuming rare) occurrences users
| find themselves offline. I'd add write support, but no one has
| asked for it. Don't go into offline first thinking you need to
| solve for writes, it can be added later if necessary.
| JamesSwift wrote:
| The write queue is messy but not hard as long as you are
| serializing all the operations. The real difficult part is
| conflict resolution. If you have an easy resolution model (e.g.
| last one wins) then I think it's not too much extra work for
| you.
| [deleted]
| endisneigh wrote:
| In practice I believe last write wins or compare last two writes
| is sufficient.
|
| Any thoughts on this in practice?
| sroussey wrote:
| Last write wins is a simple CRDT merge type. It's just that
| there are others depending on the data type. LWW might be good
| for an avatar photo, but not great for a counter.
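The avatar-versus-counter distinction above can be sketched in a few lines. This is a minimal illustration, not any particular library's API: `LWWRegister` and `GCounter` are hypothetical names, and the integer timestamps and replica ids are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class LWWRegister:
    """Last-write-wins register: keeps the value with the highest (timestamp, replica_id)."""
    value: str
    timestamp: int
    replica_id: str

    def merge(self, other: "LWWRegister") -> "LWWRegister":
        # Timestamp ties are broken by replica id so every replica picks the same winner.
        return max(self, other, key=lambda r: (r.timestamp, r.replica_id))


@dataclass
class GCounter:
    """Grow-only counter: one slot per replica, merged with an element-wise max."""
    counts: dict = field(default_factory=dict)

    def increment(self, replica_id: str, amount: int = 1) -> None:
        self.counts[replica_id] = self.counts.get(replica_id, 0) + amount

    def value(self) -> int:
        return sum(self.counts.values())

    def merge(self, other: "GCounter") -> "GCounter":
        keys = self.counts.keys() | other.counts.keys()
        return GCounter({k: max(self.counts.get(k, 0), other.counts.get(k, 0)) for k in keys})


# Avatar: discarding the older write is acceptable.
phone_avatar = LWWRegister("cat.png", timestamp=1, replica_id="phone")
laptop_avatar = LWWRegister("dog.png", timestamp=2, replica_id="laptop")
print(phone_avatar.merge(laptop_avatar).value)

# Counter: both offline increments must survive the merge.
phone_count, laptop_count = GCounter(), GCounter()
phone_count.increment("phone", 2)
laptop_count.increment("laptop", 3)
print(phone_count.merge(laptop_count).value())
```

Storing the counter in an LWW register instead would keep only one replica's total, which is why the merge rule has to follow the data type.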
| megamix wrote:
| Humans first. How about that for an idea.
| lovemenot wrote:
| Not great to be honest. I cannot imagine how to use this idea
| to support or reject a decision.
|
| Maybe as a slogan or sales pitch...
|
| Reminds me of Fujitsu's Vision Statement: Human-centric
| Computing. I asked a dozen employees of that company what it
| meant, and not one could give a coherent answer.
| olah_1 wrote:
| Idk why you're being voted down. It's a genuine design
| principle. Some ideas that come to mind that fit with "human
| first":
|
| - offline-first
|
| - human curation instead of algorithms (or at least transparent
| algorithms that are customizable)
|
| - the user is not the account number. the account is just the
| mech that the user climbs into. user can have multiple mechs.
|
| - leverage existing social fabric to provide better user
| experience. account recovery, etc.
| hdjjhhvvhga wrote:
| An alternative: a native app.
| endisneigh wrote:
| native to what?
| j1elo wrote:
| The word "native" has diluted in a sea of abstractions and
| supporting technologies, but in this context I'd read "native
| app" as removing the browser engine layer (so, not even
| really native by far, but much closer to the OS and hardware
| than when writing code on top of the mentioned layer)
| hdjjhhvvhga wrote:
| Yes. Putting an app in a browser has solved tens of
| problems and introduced a hundred new ones. At some point
| you start to wonder if a plain old native app wouldn't work
| much better than an uncontrollably growing stack of
| technologies, some of which were designed for something
| completely different than what we're trying to accomplish.
| hunterb123 wrote:
| Okay. You still need a client side database with syncing.
| Which is what this is about.
|
| The pure native approach only solves (kind of) the
| limited storage issues, nothing else.
|
| If you nuke your site and only develop native apps
| then you'll lose users to competitors because people can
| try their service without downloading an app.
|
| Yeah maybe we shouldn't have shoved an app run-time into
| a document viewer, but here we are and things run
| decently well (thanks to v8 and other tech) if you do
| things right.
| Veen wrote:
| Surely many of the problems highlighted in this article apply
| whether it's web or native. Unless all the app's data is
| generated on device, it needs some way to synchronize with a
| server, which might either be an "offline-first sync all the
| data approach" or an "online-only sync a little bit at a time
| approach".
| lytefm wrote:
| Exactly, and the boundaries blur even more when you consider
| that it's possible to build apps that look native to the user
| with web technology (Electron, Cordova,..)
|
| And I'd add that even if all data is generated and stored on
| the device, syncing capabilities are desirable if the user
| wants to use his Offline-first app across devices.
| ltearno wrote:
| I once made an app and gave a talk on this matter. The problems
| mentioned in the article are addressed from slide 15
| onwards. Mainly I remember having used Lamport clocks to track
| rows' causal history. And negative primary keys for inserting
| data in offline mode (these days I'd use UUID). Since IndexedDB
| was not a thing yet on every browser, I used asm.js (ancestor to
| WebAssembly) to compile SQLite for the browser. The database file
| was stored in the LocalStorage and I used zlib (compiled with
| emscripten) to make most of the little space LocalStorage gave to
| us. It has been a learning experience, and worked at the end. I
| wonder how I would do differently today... By the way, we were
| using GWT to code for the browser, but that's just an anecdote
| and not important for that matter...
|
| Here is the link: https://fr.slideshare.net/ltearno/easing-offline-web-applica...
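For readers unfamiliar with the Lamport clocks mentioned above, here is a minimal sketch of the idea. The class and method names are assumptions for illustration and are not taken from the talk or its slides.

```python
class LamportClock:
    """Logical clock: a counter that only moves forward and absorbs remote stamps."""

    def __init__(self) -> None:
        self.time = 0

    def tick(self) -> int:
        """Advance for a local event (e.g. an offline row edit) and return its stamp."""
        self.time += 1
        return self.time

    def receive(self, remote_time: int) -> int:
        """Absorb the stamp attached to an incoming change, then advance past it."""
        self.time = max(self.time, remote_time) + 1
        return self.time


client, server = LamportClock(), LamportClock()
first_edit = client.tick()             # stamp for a row edited offline
second_edit = client.tick()            # a later edit on the same client
applied = server.receive(second_edit)  # the server sees the change during sync
print(first_edit, second_edit, applied)
```

A change that causally follows another always carries a larger stamp, which is what makes the stamps usable for ordering a row's history without synchronized wall clocks.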
| JamesSwift wrote:
| I've also gone the negative key route, and would move to a
| "client dictates the key via UUID" like you mention if I redid
| it today.
| mamcx wrote:
| Why? I have done this before and found negative keys much easier
| to use and understand. UUIDs have the nasty property of denying
| easy debugging, logging or any kind of HUMAN understanding...
| JamesSwift wrote:
| The primary benefit is there's no post-hoc reconciliation
| you need to do where you re-write all the foreign keys and
| object ids. It greatly streamlines the entire process, and
| lets you eliminate a lot of code / potential bugs.
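The reconciliation step being discussed can be sketched as follows. This is an illustrative comparison only: the row shape (`id`, `parent_id`) and both helper functions are hypothetical, not code from either commenter's app.

```python
import uuid


def reconcile_negative_keys(rows, id_map):
    """Rewrite temporary negative ids, and foreign keys pointing at them, to server ids."""
    fixed = []
    for row in rows:
        row = dict(row)  # copy so the caller's rows are left untouched
        row["id"] = id_map.get(row["id"], row["id"])
        if row.get("parent_id") is not None:
            row["parent_id"] = id_map.get(row["parent_id"], row["parent_id"])
        fixed.append(row)
    return fixed


def new_row_with_uuid(parent_id=None, **fields):
    """The client picks the final key up front, so nothing is rewritten after sync."""
    return {"id": str(uuid.uuid4()), "parent_id": parent_id, **fields}


# Negative keys: the server hands back a mapping and every reference must be patched.
offline_rows = [
    {"id": -1, "parent_id": None, "name": "invoice"},
    {"id": -2, "parent_id": -1, "name": "line item"},
]
synced_rows = reconcile_negative_keys(offline_rows, {-1: 501, -2: 502})
print(synced_rows)

# UUID keys: the child already references the parent's permanent id.
invoice = new_row_with_uuid(name="invoice")
line_item = new_row_with_uuid(parent_id=invoice["id"], name="line item")
print(line_item["parent_id"] == invoice["id"])
```

The trade-off raised in the replies still applies: the negative ids make unsynced rows easy to spot, while the UUID variant removes the rewrite pass entirely.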
| mamcx wrote:
| I found it makes it far easier to see which objects are
| candidates for sync, and to know the original row on the
| server.
|
| I think it makes a difference whether there is a master
| database or it is peer-to-peer.
| JamesSwift wrote:
| Yeah peer-to-peer is a different scenario. I'm assuming
| traditional client-server.
| WorldMaker wrote:
| Some of that depends on the version of UUID you are using.
| v1 and v6 UUIDs with timestamps can prove very useful for
| debugging. Of course v1 is problematic because of MAC address
| embedding, and v6 is still just "draft/experimental" with the
| IETF.
|
| For my offline-first apps I settled on using ULIDs, which have
| time stamps and put them first so that they sort
| lexicographically (such as when included in string
| CouchDB/PouchDB _ids), and I've been pretty happy with
| that. That timestamp up front in the first few bytes can
| help a lot in debugging/"human understanding" of about
| where the ULID fits in a log stream.
|
| I can also tell you at this point way more than you care to
| know about storing ULIDs in Microsoft SQL Server to get
| decent clustered index behaviors.
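To make the "timestamp up front" property concrete, here is a rough sketch of the ULID layout (48-bit millisecond timestamp followed by 80 random bits, rendered as 26 Crockford Base32 characters). It is an illustration of the format, not the commenter's code or a substitute for a maintained ULID library; `make_ulid` and `ulid_timestamp_ms` are hypothetical helper names.

```python
import os
import time

# Crockford Base32 alphabet (no I, L, O, U); it is in ascending ASCII order,
# which is what makes the encoded strings sort like the underlying numbers.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def make_ulid(timestamp_ms=None, randomness=None):
    """Build a ULID-style id: timestamp in the high bits, random bytes in the low bits."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    if randomness is None:
        randomness = os.urandom(10)  # 80 bits
    value = (timestamp_ms << 80) | int.from_bytes(randomness, "big")
    chars = []
    for _ in range(26):  # 26 characters x 5 bits each
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))


def ulid_timestamp_ms(ulid):
    """Recover the millisecond timestamp from the first 10 characters."""
    value = 0
    for ch in ulid[:10]:
        value = value * 32 + CROCKFORD.index(ch)
    return value


earlier = make_ulid(timestamp_ms=1_633_093_020_000)
later = make_ulid(timestamp_ms=1_633_093_020_001)
print(earlier < later)             # string order follows creation time
print(ulid_timestamp_ms(earlier))  # the timestamp is readable while debugging
```

Because the ids sort by creation time as plain strings, they fit string-keyed stores such as the CouchDB/PouchDB `_id` field mentioned above.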
| leetrout wrote:
| Thanks for sharing.
|
| I do not miss GWT but I think it was inspirational.
| jadafaa wrote:
| Enjoyed reading this
| josephg wrote:
| I'm a "true believer" in CRDTs, which I have some experience in.
| You can implement a useful CRDT for simple applications in under
| 100 lines if all you care about are standard database objects -
| like maps, sets and values. List CRDTs are where they get
| complicated, but most applications aren't collaborative text
| editors.
|
| The promise of CRDTs is that unlike most conflict resolution
| systems, you can layer over a crdt library and basically ignore
| all the implementation details. Applications should (can) mostly
| ignore how a CRDT works when building up the stack.
|
| The biggest roadblock to their use is that they're poorly
| understood. Well, that and implementation maturity. Automerge-rs
| merged a PR the other day which brought a 5 minute benchmark run
| down to 2 seconds. Bit by bit we're getting there.
| the_duke wrote:
| I might be missing something, but I have trouble seeing how
| CRDTs can work for regular CRUD style applications.
|
| Just the first example that pops into my head: edit A sets an
| invoice status to paid, edit B changes the invoice amount from
| 100 to 120. The merge is a paid invoice with an incorrect
| amount.
|
| A workaround would be to record a separate PaidInvoice that
| won't be changed by the application logic.
|
| But that's just a really trivial example that only involves
| scalar fields inside a single object, and also relies on the
| application logic considering all ways that the CRDT might
| behave.
|
| There are countless ways to end up with data that violates the
| constraints of the domain.
|
| Is there any theoretical groundwork happening on how CRDTs can
| preserve domain semantics?
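The invoice scenario above can be reproduced with a small per-field last-write-wins map. This sketch assumes that a record is modeled as independent fields, each stamped with a (timestamp, replica) pair; the field names and the `merge_fields` helper are hypothetical.

```python
def merge_fields(a, b):
    """Per-field last-write-wins merge; each field maps to (value, timestamp, replica_id)."""
    merged = dict(a)
    for name, entry in b.items():
        # Compare (timestamp, replica_id) so both sides pick the same winner per field.
        if name not in merged or entry[1:] > merged[name][1:]:
            merged[name] = entry
    return merged


base = {"status": ("open", 1, "server"), "amount": (100, 1, "server")}
edit_a = {**base, "status": ("paid", 2, "A")}  # user A marks the invoice paid
edit_b = {**base, "amount": (120, 2, "B")}     # user B raises the amount

merged = merge_fields(edit_a, edit_b)
print({name: entry[0] for name, entry in merged.items()})
```

Both replicas converge on the same record, yet it is a state neither user ever saw: a paid invoice for 120. Convergence comes from the merge function; the domain rule that a paid invoice keeps its amount has to be enforced somewhere else.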
| olah_1 wrote:
| I think CRDTs necessitate a robust permission system to be
| more than a toy.
|
| If you want a decentralized CRDT, you'd probably end up re-
| inventing smart contracts (which is what I found myself doing
| once).
| ollysb wrote:
| Your general point notwithstanding, in this case wouldn't the
| merge be a partially paid invoice?
| mamcx wrote:
| I'm interested to see how this can apply to "regular" database
| apps (like invoicing). I need to add sync/offline to my main
| app and want something solid to build upon (I have done homemade
| things before) and have wondered whether CRDTs could be applied,
| but how?
| azteceagle wrote:
| We have the same need right now. Something that we have
| researched and think has a lot of potential, at least for
| us, is James Long's sync implementation. You can check out his
| talks and demos at:
|
| * https://www.youtube.com/watch?v=2dh_gtndayY&feature=emb_imp_...
| * https://www.youtube.com/watch?v=DEcwa68f-jY
|
| And his demo implementation (and annotated fork):
|
| * https://github.com/jlongster/crdt-example-app
| * https://github.com/clintharris/crdt-example-app_annotated/blob/master/NOTES.md
|
| I wonder why there isn't some open source engine based on
| this at least for CRUD apps since it has a lot of potential
| and it is really "simple" to implement and even understand.
| a_conservative wrote:
| I'm a CRDT newbie, but I'll take a stab, hopefully someone
| can correct me if I'm getting it wrong.
|
| After quickly reading the "CRDTs go brrr" post and the Wikipedia
| page, I think CRDTs give us a mathematical strategy for
| resolving conflicts. It doesn't mean that the end result will
| make sense.
|
| The Wikipedia article gives an example of merging an event
| flag represented by a boolean variable. So the var in this
| case means that "someone observed this event happening". So
| the rule for merging this var from different sources is
| simple, if any source of data reports the var as true, the
| merged result should be true as well.
|
| The implication is it matters what the data represents, not
| just whether it is a boolean or a string, etc.
|
| I'm guessing that a collaborative notes field, or a "did
| someone call this customer" boolean might benefit from a CRDT
| more so than keeping track of bank account values.
| fauigerzigerk wrote:
| I would like to know the answer to that as well.
|
| My current understanding is that CRDTs merely guarantee
| deterministic merging of updates to some basic data structures,
| where deterministic means that the outcome is well defined
| and always the same regardless of the order in which updates
| are merged.
|
| That doesn't mean the outcome makes any sense at all in terms
| of the sort of application level requirements and constraints
| you would typically find in a transactional database
| application. Conflicts may still arise on that level.
|
| So I think what we need to really fulfill the promise of
| CRDTs is a way to express those application level constraints
| on top of them.
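The order-independence described above can be checked mechanically for tiny state-based merges, such as the "event observed" flag from earlier in the thread (merged with OR) and a grow-only set (merged with union). The helper names below are assumptions for illustration; the brute-force permutation check is a teaching aid, not how real libraries verify their merges.

```python
from functools import reduce
from itertools import permutations


def merge_flag(a: bool, b: bool) -> bool:
    """'Someone observed the event': true if any replica says so."""
    return a or b


def merge_set(a: frozenset, b: frozenset) -> frozenset:
    """Grow-only set: the merge is a plain union."""
    return a | b


def overwrite(a, b):
    """Not a CRDT merge: whichever update is applied last simply wins."""
    return b


def merge_all(merge, states):
    """Fold a sequence of replica states into one using the given merge function."""
    return reduce(merge, states)


def order_independent(merge, states) -> bool:
    """True if every arrival order of the states produces the same merged result."""
    results = {merge_all(merge, order) for order in permutations(states)}
    return len(results) == 1


print(order_independent(merge_flag, [False, True, False]))
print(order_independent(merge_set, [frozenset("a"), frozenset("b"), frozenset("ac")]))
print(order_independent(overwrite, [1, 2, 3]))
```

The first two merges pass because OR and union are commutative, associative and idempotent. The third fails, and none of the three checks whether the merged value satisfies an application-level constraint.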
| JamesSwift wrote:
| Just read your "crdts-go-brrr" post from the other commenter
| and just wanted to say thanks for writing all that up! Great
| insight and great information.
| thamer wrote:
| I was also a "true believer" in CRDTs for a long time,
| implementing my first ones in Erlang about 9 years ago[1], but
| my opinion of where they fit has changed significantly.
|
| The one issue with CRDT that I find is rarely mentioned and
| often ignored is the case where you've deployed these data
| structures that include merge logic to a set of participating
| nodes that you can't necessarily update at will. Think phones
| that people don't update, or IOT/sensor devices like electric
| meters or other devices "in the wild".
|
| When you include merge logic - really any code or rules that
| dictate what happens when the data of 2 or more CRDTs are
| merged - and you have bugs in this code running on devices you
| can never update, this can be a huge mess. Sure you can
| implement simple counters easily (like the ones I linked to),
| and you can even use model checking to validate them. But what
| about complex tree logic like for edits made to a document?
| Conflict resolution logic? Distributed file system operations?
| These are already very complex and hard to get right without
| multiple versions involved and unfixable bugs causing mayhem.
|
| Having to deal with these bugs in the context of a fleet of
| participants on a wide range of versions of the code, the
| combinatorial explosion of the number of possible interactions
| and effects of these differing versions and bugs taken
| _together_ can really become impossible to manage.
|
| I'd be interested to hear from folks who have experience with
| these kinds of issues and how they have dealt with them,
| especially if they are still convinced that CRDTs were the
| right choice.
|
| [1] https://github.com/nicolasff/distributed-counters
| bmurphy1976 wrote:
| How are CRDTs unique in this respect? Ossification is a
| challenge for all APIs, domain models, network layers,
| protocol specs, etc.
| lazide wrote:
| Because those other options abstract the decision making
| away from the 'can never be updated' part usually into some
| part of the system where the consequences of a
| problem/interest in the right solution are also collocated
| with someone who can do something about it (an API server
| with ossified clients, or network switch with buggy NICs
| wired into it, etc.)
|
| If truly peer to peer, that is a lot less clear - do you
| end up in a collaborative p2p document model forking the
| documents between 'new rev' clients and 'old rev' clients?
| Who 'wins'? What is the consequence of losing?
|
| At least an API server can clearly reject the client and
| give an error message - if it's a CRDT, how does that work?
| hinkley wrote:
| I started trying to build a non-PHP Wiki for a hobby, and the
| notion that I was going to have to implement my own version
| control on top of it stymied me, so CRDTs looked good to me
| when I first started to get familiar with the idea.
|
| Trac (project management) does some stuff behind the scenes
| stored in svn, which it uses for edit history. I always liked
| that idea. Why have two? I have wished for some time that
| someone did a Trac for git. I just don't want to be the one to
| write it.
|
| I've also wished for some time that someone would make a new
| git, not designed by barbarian cannibals, potentially based on
| CRDTs.
|
| Somehow these notions melted together and now making progress
| hinges on whether there's a CRDT out there that's up to the
| task of managing source code - and written in a language I can
| handle. So far those two haven't appeared, and based on the
| number of corner cases I've heard described when I watch
| lectures on CRDTs, I'm pretty sure if I tried to write one
| myself I'd never see daylight again, but might see the inside
| of a padded room. Do you think it's safe to say we're still in
| the distillation phase of invention where CRDTs are concerned?
| Is this accidental complexity we are seeing or is most of it
| intrinsic?
|
| I'm instead spending a lot of my free time on the hobby instead
| of on writing collaborative tools and/or accidentally writing a
| Trac replacement.
| a_conservative wrote:
| Can you point to a good introduction to Conflict-free
| replicated data types (CRDT)?
|
| It's crossed my (short attention span) radar a couple times and
| seems interesting. I really love the idea about being able to
| use it, but also "forget" about it while developing.
|
| Does "leaky abstraction" apply here? If you constrain things
| enough, can a dev really use it and forget about the details?
| BiteCode_dev wrote:
| Man, I will remember Google Wave for the rest of my life. It
| was really ahead of its time. Kudos for being that visionary.
| BiteCode_dev wrote:
| We all think of CRDTs for collaborative text editors, but can
| you give examples of how they're used in the wild that are
| unexpected, yet very useful?
| jitl wrote:
| I work on a collaborative text editing system
| (https://www.notion.so) and read a lot about CRDTs, thank you
| for your work in this area, particularly
| https://josephg.com/blog/crdts-go-brrr/
|
| I agree with your assessments here - CRDT is the way forward
| for most applications; no user wants to fiddle with a merge UI
| or picking versions like with iCloud. I think RxDB's position
| here is from their CouchDB lineage.
|
| _> The biggest roadblock to their use is that they're poorly
| understood. Well, that and implementation maturity._
|
| I certainly have more understanding to do. My biggest open
| question is how to design my centralized server side storage
| system for CRDT data. To service writes from old clients I need
| to retain a total history of a document, but I don't want to
| force new clients to download the entire history nor do I want
| these big histories in my cache, so I end up wanting a hot/cold
| system; and building that kind of thing and dealing with the
| edge cases seems like more than 100 lines of code.
|
| It seems like the Yjs authors also recognize that CRDT storage
| on the server is an area to address, there was some work on a
| custom database in 2018, although my thinking is more about how
| to retrofit text CRDTs into my existing very conservative
| production cloud software stack than about writing to block
| storage.
| derefr wrote:
| > no user wants to fiddle with a merge UI or picking versions
| like with iCloud.
|
| For prose text, what do you think about combining a document-
| scale CRDT, with _fine-grained_ locking -- e.g. splitting the
| document into a "list of lines/sentences", where lines have
| identity, and then only allowing one person to be modifying
| _a given line_ at a time?
|
| I've always felt like this was under-explored, given that in
| prose text it's almost always _semantically_ incoherent for
| multiple people to be trying to change a single sentence in
| different ways at the same time anyway (i.e. they would each
| have a series of edits they want to do to line A to turn it
| into either A' or A"; but any result that's not either
| purely A' or A" will very likely be nonsensical. One could
| say that the A - A' transformation a user does to a sentence
| is _intentionally transactional_.)
|
| I almost thought Notion would be a good example of this, but
| apparently not (https://www.notion.so/Real-time-
| collaboration-20a1873baf334d...) -- they actually do allow
| multiple users to be editing the same leaf-node content block
| at the same time, and so have taken on the full scope of the
| CRDT problem.
|
| > but I don't want to force new clients to download the
| entire history nor do I want these big histories in my cache,
| so I end up wanting a hot/cold system; and building that kind
| of thing and dealing with the edge cases seems like more than
| 100 lines of code.
|
| Yes, but these needs are able to be cleanly abstracted away
| on the backend -- there are internally-complex infrastructure
| components (like CouchDB, or Kafka) that expose a pure CQRS
| model to clients, but internally are doing fancier stuff
| involving reducing changes onto snapshots and then exposing
| new CQRS streams that begin history with atomic "introduce
| all this stuff from a snapshot"-typed 'changes'.
|
| There's also some convergent evolution happening here with
| non-replay sync strategies in blockchains (which can be more
| interesting to look at if you care about the serverless p2p
| Operational Transformation type use-case of CRDTs.)
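The "reduce changes onto a snapshot, then start the stream from the snapshot" idea can be sketched with a toy change log. This is an assumption-laden illustration: the change format (`set`, `delete`, `snapshot`) and the function names are invented here and do not describe how CouchDB or Kafka implement compaction.

```python
from functools import reduce


def apply_change(state, change):
    """Apply one change to a key/value state and return the new state."""
    if change["op"] == "snapshot":
        return dict(change["state"])  # a snapshot replaces everything before it
    new_state = dict(state)
    if change["op"] == "set":
        new_state[change["key"]] = change["value"]
    elif change["op"] == "delete":
        new_state.pop(change["key"], None)
    return new_state


def replay(changes):
    """Rebuild the current state by folding the whole stream from empty."""
    return reduce(apply_change, changes, {})


def compact(changes, keep_last):
    """Fold all but the newest `keep_last` changes into one snapshot-typed change."""
    split = max(len(changes) - keep_last, 0)
    old, recent = changes[:split], changes[split:]
    if not old:
        return list(changes)
    return [{"op": "snapshot", "state": replay(old)}] + recent


log = [
    {"op": "set", "key": "a", "value": 1},
    {"op": "set", "key": "b", "value": 2},
    {"op": "delete", "key": "a"},
    {"op": "set", "key": "c", "value": 3},
]
short_log = compact(log, keep_last=1)
print(len(short_log), replay(short_log) == replay(log))
```

A new client replaying the compacted stream reaches the same state without downloading the full history. What this toy version leaves out is the hard part raised earlier in the thread: an old client whose edits are based on a version inside the folded range.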
| tylerhou wrote:
| There is no notion of "same time" in a distributed system
| -- what if a client locks a line and disconnects?
|
| Also, it leads to a poor user experience.
| derefr wrote:
| My vision of this wouldn't involve not accepting edits;
| but rather, the document would react to you trying to
| edit a locked block the same way macOS reacts to you
| trying to edit a locked document: offer to duplicate the
| block and then let you edit the duplicate. Both versions
| of the block would then appear in the document, perhaps
| with some annotation that they're sibling forks of the
| same original text. (Compare and contrast: books that
| compare-and-contrast works with subtle changes between
| different versions, e.g. the four Gospels of the Bible.)
|
| Also, I'm speaking about locking that's fine-grained
| temporally as well as spatially: a line would only need
| to be locked with a ~10s TTL when a user begins to type
| in that line. Think of it like the user composing a
| transaction of modifications to the same line, and then
| committing it. A lot like typing a message into a chat
| program. Just the user having their cursor on the line,
| wouldn't imply that the line is locked; it would only
| lock when they start typing.
|
| This is already how group-chat apps work, mind you; if
| you're an admin who can edit other people's message
| lines, you nevertheless can't edit someone else's message
| line while _they're_ editing it. But they're only
| considered to be editing it while they're actively
| typing, and for a few seconds after that. If they go
| idle, someone else edits their message-line, and then they
| come back and try to submit their edit, it will be
| rejected. (Of course, that behaviour makes perfect sense
| for group chat software, where the only other people who
| can edit your text are moderators, and so moderation
| actions _should_ "trump" user actions. In a p2p
| collaboration context, IMHO adding resistance
| /intentionality to per-line forking, but nevertheless
| allowing it, makes the most sense.)
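The typing-triggered lock with a short TTL can be sketched as below. It assumes a single authoritative lock table with one clock, which is exactly the assumption questioned in the replies; `LineLocks` and `try_type` are hypothetical names, and the clock is injectable only so the behaviour can be demonstrated deterministically.

```python
import time


class LineLocks:
    """Per-line locks that are taken on typing and expire after a short TTL."""

    def __init__(self, ttl_seconds=10.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._locks = {}  # line_id -> (owner, expires_at)

    def try_type(self, line_id, user):
        """Called on each keystroke; returns True if `user` holds (or renews) the lock."""
        now = self.clock()
        owner, expires_at = self._locks.get(line_id, (None, 0.0))
        if owner not in (None, user) and expires_at > now:
            return False  # someone else typed here within the TTL
        self._locks[line_id] = (user, now + self.ttl)
        return True


fake_now = [0.0]
locks = LineLocks(ttl_seconds=10.0, clock=lambda: fake_now[0])
print(locks.try_type("line-7", "alice"))  # alice starts typing and takes the lock
print(locks.try_type("line-7", "bob"))    # bob is refused while the lock is live
fake_now[0] = 10.5
print(locks.try_type("line-7", "bob"))    # alice went idle, so the lock expired
```

In the proposal above, a refusal would lead to an offer to fork the line rather than a rejected edit. Without a central table, each participant would need to agree on who holds the lock and when it expired, which is where the distributed-systems objections in the replies come in.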
| resoluteteeth wrote:
| > My vision of this wouldn't involve not accepting edits;
| but rather, the document would react to you trying to
| edit a locked block the same way macOS reacts to you
| trying to edit a locked document: offer to duplicate the
| block and then let you edit the duplicate. Both versions
| of the block would then appear in the document, perhaps
| with some annotation that they're sibling forks of the
| same original text. (Compare and contrast: books that
| compare-and-contrast works with subtle changes between
| different versions, e.g. the four Gospels of the Bible.)
|
| If you're using a system where you're guaranteed to have
| knowledge of what other people are editing at all times,
| there's really no need to use CRDTs in the first place.
| tylerhou wrote:
| Group chat apps don't care about preserving functionality
| when the network drops. But a document editor does. And
| editing a message is relatively rare in chat apps so they
| can afford to lock; editing a line is common in documents
| and the chance of a lock conflict is higher.
|
| Your suggestion also assumes that the network is
| reliable. What happens if a user takes a lock and there
| is a partition? If the document is P2P, there is no
| central authority; when should the other participants
| override the lock? How much overhead does that add to the
| protocol?
|
| The main point is that there is no notion of a central
| clock in a distributed system; hence "lock temporally" is
| not precise. Relative to which participant? And what
| happens when messages are dropped? (Even the lock message
| might be dropped!) A distributed lock implementation is
| non-trivial.
|
| https://lamport.azurewebsites.net/pubs/time-clocks.pdf
|
| https://static.googleusercontent.com/media/research.google.c...
|
| https://en.m.wikipedia.org/wiki/Fallacies_of_distributed_com...
| suchire wrote:
| I don't have personal experience using Quip, but from
| engineers I know who work there, fine-grained locking is
| how Quip handles collaboration.
| jitl wrote:
| The locking in Quip is like a UI concern - it's not a
| guarantee, and I don't know how (or if) Quip handles
| concurrent offline edits. As a user of Quip (while at
| Airbnb) I was pretty frustrated by the lock UI, although
| it improved once they added the "steal this lock" button.
| josephg wrote:
| Oh cool! I've wanted something like notion for years. Ideally
| on top of CRDTs (so I own my own data). I really appreciate
| all the work your company is doing! Feel free to get in touch
| if you want to have a proper chat about this stuff.
|
| > My biggest open question is how to design my centralized
| server side storage system for CRDT data. To service writes
| from old clients I need to retain a total history of a
| document, but I don't want to force new clients to download
| the entire history nor do I want these big histories in my
| cache, so I end up wanting a hot/cold system; and building
| that kind of thing and dealing with the edge cases seems like
| more than 100 lines of code.
|
| Yeah definitely more than 100 lines of code. I'm sad to
| report that in diamond types (my own CRDT) I've spent ~12000
| lines of code in an attempt to solve some of these problems.
| I could probably get that down under 3000 loc in a rewrite if
| I'm happy to throw away some of my optimizations. Doing so
| would dramatically lower the size of the compiled wasm bundle
| too - though the wasm bundle is still comfortably under 100kb
| over the wire, so maybe it's fine?
|
| Regarding history, I have a lot of thoughts. The first is
| that with the right approach, historical data compresses
| really well. Martin Kleppmann's automerge-perf data set has
| 260k edits ending in 100kb of text. The saving system I'm
| working on can store the entire editing history (enough to
| merge changes from any version) in this example with just
| 23kb of overhead on disk. I think that resulting data set
| might only need to be accessed in the case of concurrent
| changes, and then only back as far as the common ancestor.
| But I haven't implemented that optimization yet.
|
| And yeah; I've been thinking a lot about what a CRDT-native
| database could look like too. There's way too many
| interesting and useful problems here to explore.
| prox wrote:
| I've really seen a lot of buzz recently about Notion, cool work
| you are doing.
| taeric wrote:
| While I am sympathetic to the idea that users don't like
| merges, they also hate idiotic combinations of data without
| their oversight. So, bit of a rock and a hard place.
|
| If you want collaboration between people, you have to
| structure it in a way that makes it a conversation, I
| believe.
|
| I could almost see an idea that you could pattern it after
| musicians playing together, but that is a very particular
| kind of rehearsal that has not been done in any other
| practice, as far as I am aware. Improv may come close, but
| even that has very specific techniques that really don't make
| sense in a CRDT landscape.
| somenewaccount1 wrote:
| Is there a developer community or forum for the notion.so
| api? The slack channel published on your developer website is
| down.
| jitl wrote:
| We have the Slack you mentioned (invite:
| https://notiondevs.slack.com/join/shared_invite/zt-
| vkinpzs0-...) and a Stack Overflow tag
| (https://stackoverflow.com/questions/tagged/notion-api).
|
| The Slack link from https://developers.notion.com works for
| me, maybe Slack has a DNS issue according to this article?
| https://www.theverge.com/2021/9/30/22702876/slack-is-down-
| ou...
| somenewaccount1 wrote:
| Thank you for the correct link
|
| The Slack link I mentioned is
| https://join.slack.com/t/notiondevs/shared_invite/zt-
| lkrnk74...
|
| It says the "invite has expired".
|
| It's linked in the footer of
| https://developers.notion.com/
| [deleted]
| fendrak wrote:
| I would very much love to see the "simple CRDT" implementation
| described above, seems like it would be a great learning tool
| and/or foundation on which to build something more complicated!
| azteceagle wrote:
| This presentation by James Long helped me a lot:
| https://www.youtube.com/watch?v=DEcwa68f-jY
| mikevm wrote:
| So how do the various popular CRDT libraries compare nowadays?
|
| There's Yjs (with a Rust port that is in progress), Automerge-
| rs, and your own diamond-types project :).
|
| Is Yjs still the current go-to for most projects' needs?
| hunterb123 wrote:
| There's also GUN (https://gun.eco/) if you want a CRDT graph
| p2p syncing database. I have no relation to the project, but
| the community is awesome.
| amelius wrote:
| > I'm a "true believer" in CRDTs
|
| Don't make the same mistake as the Google Wave team.
| Collaborative editing is a great intellectual challenge to work
| on, but in reality users don't care much about documents that
| auto-update in real time. In fact, it can be annoying.
| readams wrote:
| That's like the main feature of Google Docs and people use it
| all the time.
| Thaxll wrote:
| Most of the time they don't use it for the realtime aspect
| of it, they use it to easily share documents.
| twicetwice wrote:
| I think you're typical-minding here. The realtime aspect
| is used all the time by many many people.
| jitl wrote:
| You're replying to someone who worked on Wave.
| cjblomqvist wrote:
| Parent was actually a part of the team doing the core engine
| for Wave, so in that case it's more like he never dropped his
| belief in it.
| jasode wrote:
| > _I'm a "true believer" in CRDTs, which I have some experience
| in._
|
| What is the largest-scale or highest-profile real-world usage
| of CRDT today?
|
| (I glanced at this CRDT vs OT topic before but I'm not up-to-
| date on where things stand in the real-world performance:
| https://news.ycombinator.com/item?id=23988999)
| dunham wrote:
| It's not really high scale / high performance, but Apple's
| notes.app uses state based CRDTs internally for conflict
| resolution. It is perhaps high profile, since a lot of people
| are using it.
| dugmartin wrote:
| I'm not sure about the largest-scale single usage but the
| Phoenix Framework uses CRDTs for handling user presence.
|
| It isn't enabled by default but it is very easy to use and
| the CRDT backend is basically hidden away.
|
| More info:
|
| - https://hexdocs.pm/phoenix/presence.html
| - https://dockyard.com/blog/2016/03/25/what-makes-phoenix-pres...
| jitl wrote:
| Grandparent wrote a good article on CRDT performance
| https://josephg.com/blog/crdts-go-brrr/
| LAC-Tech wrote:
| TomTom uses CRDTs for their SatNav software.
| LeanderK wrote:
| I am a strong believer in PWAs. I think most of the apps I use
| could be PWAs without a problem. I really don't get why Apple
| isn't developing them for MacOS, since I've never really used the
| app-store on MacOS and in comparison to iOS a lot of the apps on
| my Mac are productivity apps that are basically electron-apps. I
| think some basic PWA functionality on the desktop would be more
| interesting for me than more advanced PWAs for ios.
| jdavis703 wrote:
| I want to believe in PWAs. But in the real world they are just
| way too clunky. I've observed this on both software I've
| written, and on PWAs from teams that should objectively know
| what they're doing like Google Calendar.
| atatatat wrote:
| PWAs lead to the walls of Apple's walled garden being torn
| down.
|
| They'll fight it with subtlety until the end, and I hope they
| burn in hell for it.
| blauditore wrote:
| Exactly. They introduced auto-deletion of localStorage under
| the guise of "privacy", when it was really about driving
| people to build standalone apps (and use their store).
| breakfastduck wrote:
| Good. PWAs are absolutely awful compared to native apps.
| breakfastduck wrote:
| Because native apps are basically a huge chunk of the appeal to
| using macOS and PWAs are the absolute antithesis of that.
|
| Encourage developers to make apps that all look/feel completely
| different in terms of style/UX - no thanks.
| LeanderK wrote:
| it's not working at all. I currently use vscode, jupyter lab
| and Slack. They are all not native. I can't see myself
| switch.
|
| I use some native apps and they are great and all, but to be
| honest I don't care. I want a good user experience, not the
| technical details of what's under the hood of the UI.
|
| I also feel like the stance is pointless since electron apps
| are here to stay and it just makes the experience of
| everything worse.
| breakfastduck wrote:
| It's not about what's under the hood. It's about a
| consistent UX & performance.
|
| Honestly VS Code is one of the best electron apps available
| and it still feels clunky compared to basically any native
| macOS app.
| kevincox wrote:
| I do find it disappointing that there is no reliable local
| storage system for the web. I get the resistance to trackers but
| there should be a way to request permission to store data that
| isn't deleted except for explicit action by users. It means that
| you effectively have to store the data in "the Cloud" which means
| 1. I have to pay for it 2. If I shut down the service you are
| screwed and 3. I have at least some access to it (encryption
| aside).
|
| I would also like to see a synced version since most browsers
| these days support syncing settings and passwords. But creating a
| generic syncing solution that is actually useful is hard.
| oblib wrote:
| A few years ago I tried using the browser's IndexedDB for my
| invoicing app and loaded 1,000 documents into it at a time. It
| did ok up to about 2-3k documents but choked the browser to the
| point of being unusable at 5k documents. That was on a Late '09
| Mac Mini so maybe newer or more powerful PCs would do better,
| but that's not an issue at all if you're using CouchDB to store
| the data on the client side.
|
| I use CouchDB installed on the client side to implement Offline
| First data storage. This works for Desktop PCs and if you run
| it on a local device that's accessible via your local network,
| like a Raspberry Pi in-house for example, you can also use it
| with mobile devices in-house too.
|
| The local CouchDB will sync with the Cloud based CouchDB as
| soon as they're both online. CouchDB will decide which version
| to keep and deliver it from that point.
|
| It's certainly not perfect and doesn't provide "real time
| collaboration" but so far that's been out of reach anyway and
| may not be a very good approach at all. The notion of several
| people editing the same document at the same time seems to me
| to be chaotic no matter how you approach it.
|
| The biggest downside to this approach is that the user has to
| install and configure CouchDB. I made a simple web app to help
| with this, but it's a bit too much to expect users to install
| and configure it.
|
| What we need is a client side DB pre-installed that any web app
| can access and the data for that app is sandboxed and can only
| access the DB assigned to it. But it's not reasonable to add
| that to a web browser. CouchDB can do that now.
| richardwhiuk wrote:
| Can you allow the user to load and save files?
| kevincox wrote:
| I mean sure, you can get the user to download/upload files,
| but this is very awkward and not suited for storing every
| change. (But is good for migrating and backing up the data).
| Having to get the user to manually break the data out of the
| browser is not a good UX for day-to-day work.
|
| I know that Chrome is pushing for a filesystem API but I
| don't know if that will be exempt from the usual
| ephemerality. IIRC it is just a private storage space with a
| filesystem-like API.
| easrng wrote:
| Chrome has an API that allows you to save to files without
| a new download every time. They made a pretty nice library
| that wraps that API on Chrome and it gracefully degrades on
| other browsers.
|
| https://web.dev/browser-fs-access/
| jcims wrote:
| i noticed this behavior in a drawing app called
| excalidraw. first save opens a file dialog, subsequent
| saves just update the file, basically like a standard
| local text editor.
|
| i keep doing 'save as' to create new files because i
| don't trust it lol
| nathcd wrote:
| `navigator.storage.persist()` will prompt the user to allow
| persistent storage for your site, so data won't be evicted
| under storage pressure (for IndexedDB, service worker
| registrations, localStorage, sessionStorage, etc.)
|
| https://developer.mozilla.org/en-US/docs/Web/API/StorageMana...
| marcosdumay wrote:
| The problem runs deeper than what the GP complains about.
|
| If you rely on that and write important things to browser
| storage, then a week later your user visits some moronic site
| that doesn't work, its support tells them to clear their
| browser cache, and your data will almost certainly be wiped
| along with it.
|
| The problem is that browsers do not even consider that they
| may be storing important data. There is no clear way to
| recover or backup that data, and it is tangled with what
| comes from every other site.
| simiones wrote:
| That doesn't work on iOS, nor on Safari on Mac for any
| released version as far as I understand:
|
| https://caniuse.com/mdn-api_storagemanager_persist
| _fat_santa wrote:
| > I do find it disappointing that there is no reliable local
| storage system for the web. I get the resistance to trackers
| but there should be a way to request permission to store data
| that isn't deleted except for explicit action by users.
|
| I've built a number of apps in the last year or two that use
| browser local storage. It also annoys me that the idea around
| "storing data in your browser" is automatically attributed to
| ad tracking. I have to go out of my way in my apps to inform
| users that yes this app uses localStorage/cookies, but no this
| is not used for any ad tracking, rather for actually storing
| your app's data.
| kevincox wrote:
| How do you deal with the fact that the browser may just
| decide to wipe the storage at any time?
| jitl wrote:
| The most important thing to do is get your users data
| somewhere safe as quickly as possible. For the vast
| majority of users that means your cloud database.
|
| As the user generates new data, spray it to your servers
| _as well_ as writing it to the syncable IndexedDB local
| storage, and to an in-memory buffer. Make your backend
| handle writes idempotently, and retry all failures a few
| times. (Eg, IndexedDB disk might be full or flaking out, so
| retry writing the memory buffer to disk.)
|
| As long as the _write path_ is quick, users can tolerate
| the browser nuking offline storage cache because they can
| re-download all the data that made it up to your server.
|
| Hopefully soon the browser vendors will allow more durable
| file system access with appropriate user controls. Chromium
| built out the file system access API
| (https://web.dev/file-system-access/) but it's not supported in
| Firefox or Safari.
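|
| A rough sketch of that write path, under stated assumptions:
| `FlakyStore` is a made-up stand-in for both the server and local
| disk, and `record_change` / `write_with_retry` are illustrative
| names, not any real API. The point is that each change carries a
| client-generated id, so retrying a write can never duplicate it.

```python
import uuid


class FlakyStore:
    """Hypothetical store that fails a fixed number of times, then works."""

    def __init__(self, failures):
        self.failures = failures
        self.rows = {}

    def put(self, op_id, payload):
        if self.failures > 0:
            self.failures -= 1
            raise IOError("transient failure")
        # Idempotent: re-sending the same op_id never creates a second row.
        self.rows.setdefault(op_id, payload)


def write_with_retry(store, op_id, payload, attempts=3):
    """Try the write a few times; report whether it eventually landed."""
    for _ in range(attempts):
        try:
            store.put(op_id, payload)
            return True
        except IOError:
            continue
    return False


def record_change(payload, memory_buffer, local_store, server):
    """Buffer in memory first, then write to local storage and the server."""
    op_id = str(uuid.uuid4())
    memory_buffer[op_id] = payload
    saved_locally = write_with_retry(local_store, op_id, payload)
    sent = write_with_retry(server, op_id, payload)
    return op_id, saved_locally, sent
```

| Anything still sitting in the memory buffer with a failed write
| can be replayed later with the same id.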
| chrisfinazzo wrote:
| Is this (generally) reliable?
|
| I know terrible, awful bugs eventually doomed WebSQL from
| getting any traction and IndexedDB seems to be a more
| competent replacement, but the fact that Google is
| leaning on FSA seems like a non-starter.
|
| It just feels like there's no way in hell Webkit will
| ever implement this stuff - not because of the divide
| between the App Store and PWA's - but due to the
| implications for privacy.
|
| Hard pass.
| jitl wrote:
| I'm not sure exactly what your reliability question is
| about. I haven't actually used the file system API I
| posted, probably same as you I'm waiting for it to ship
| outside of Chromium.
|
| On the subject of WebKit, IndexedDB bugs are also pretty
| bad especially on iOS; we have debated about turning off
| IndexedDB write buffering in Safari and just do in-memory
| there. The best thing to do on Apple platforms is to make
| the app they're trying to force you to make. Then you can
| make a little adapter so your web app can write to disk
| using SQLite and enjoy a nice relational API without
| needing to worry about the whims of the browser.
| aboodman wrote:
| For a SaaS style (aka client-server) application, the right
| way to think of client-side storage is as a persistent
| cache, for a few reasons:
|
| * it can be deleted at any time (by the browser, or even by
| the user!)
|
| * you generally want the server to be authoritative. if
| there's a bug client-side, server view of state should win.
|
| * it's not possible in the general case to store _all_ user
| data offline, it's always a subset.
|
| Once you realize that the client-side state is a cache,
| potential uses of it become a lot more clear.
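|
| A minimal sketch of that framing, assuming a hypothetical
| `fetch_from_server` callable that raises `ConnectionError` when
| offline (`CachedClient` is an illustrative name too). The server
| answer always overwrites the local copy, and wiping the cache
| loses nothing that can't be fetched again.

```python
class CachedClient:
    """Client-side state as a cache in front of an authoritative server."""

    def __init__(self, fetch_from_server):
        self.fetch = fetch_from_server  # hypothetical: key -> value
        self.cache = {}

    def get(self, key):
        try:
            value = self.fetch(key)
        except ConnectionError:
            # Offline: serve whatever we have, possibly stale or missing.
            return self.cache.get(key)
        # Online: the server's view replaces the local one.
        self.cache[key] = value
        return value

    def evict_all(self):
        """Simulates the browser (or the user) clearing storage."""
        self.cache.clear()
```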
| kevincox wrote:
| That's the thing. I want to make apps where the client-
| state is more than a cache. I want it to be able to be
| authoritative.
|
| Sure, you probably want to put some sort of syncing on
| top, but that isn't even always necessary.
| aboodman wrote:
| Alright, there is some naming collision then.
|
| "offline-first" (terrible name, but here we are)
| generally refers to a classic web application that wants
| to be able to run offline either for network resiliency
| reasons or for performance.
|
| "local-first" is a term that has been coined for
| something close to what you are talking about:
| https://www.inkandswitch.com/local-first.html
| cdbattags wrote:
| Not much more to say other than Noms was my favorite project
| (https://github.com/attic-labs/noms) for a while until
| acquisition and the engineers are now the ones behind
| Replicache (https://replicache.dev/).
|
| I think this is going to be the next "Realm" that works
| everywhere.
| nacs wrote:
| Looks like replicache is pretty expensive though.
|
| If you have more than 500 users, the price is $500/mo (and it
| goes up from there).
| aboodman wrote:
| aww, thanks.
| jefftk wrote:
| _> there should be a way to request permission to store data
| that isn't deleted except for explicit action by users_
|
| I think that's "request permission to use files on disk". This
| is in progress as
| https://developer.mozilla.org/en-US/docs/Web/API/File_System...,
| though there's still more work before there's a version all the
| browsers like (Mozilla likes the ability to work with files, but
| thinks cross-site access should not be included:
| https://mozilla.github.io/standards-positions/#native-file-s...)
| danShumway wrote:
| > but it is wrapped together with aspects for which we do not
| think meaningful end user consent is possible to obtain (in
| particular cross-site access to the end user's local file
| system)
|
| This is a really difficult problem to solve, and I get
| Mozilla's hesitation. I'm also frankly very hesitant about
| Google leading the charge on this, not because I'm paranoid
| about them sneaking in tracking, just because I think Google
| tends to create less thoughtful web specifications sometimes.
|
| But... cross-site file access is really important for data
| portability and open standards, and Google's current proposal
| isn't bad, it might be rough but it's definitely workable.
| Mozilla really should try to figure out a way to move forward
| on this.
|
| We've seen the difference in data portability between mobile
| and desktop apps, and a big part of the difference between
| those two platforms is being able to very easily have
| multiple sources working on your data at the same time.
| Siloing data has downsides. It's tough to embrace a Unix-
| style philosophy without allowing programs to operate on the
| same data. And having Unix-style smaller webapps that work
| with each other is a good way of fighting against data silos
| and in some cases a good way of fighting against anti-user
| and anti-privacy services in general.
|
| I'd love to see more progress made on this, but who knows how
| that will work out. Caution is probably warranted for the
| moment, I'm just disappointed that the language suggests
| Mozilla would never consider a proposal that included this.
|
| It's also very important that this expose _user-accessible_
| file system access and not just a virtual filesystem in the
| browser; otherwise it just becomes another data-silo in the
| web browser. This is something that Google's proposal really
| gets right, and it's disappointing to see what appears to be
| pushback on the idea that users should be able to open up the
| directories that a web browser is writing to, inspect the
| files, and open them or move them around the filesystem, or
| even write to them from native apps. That to me is an
| essential part of the proposal.
| hinkley wrote:
| sqlite bindings in browsers came and went in the time between
| when I learned about them and finally found a nail that needed
| that hammer.
|
| I started on a design and literally within a few weeks Firefox
| announced it was deprecated.
| EamonnMR wrote:
| What's wrong with offering the user a file to "download"
| (actually creating a file on the fly) or "upload" (actually
| loads it into the local web app?)
|
| Edit: Here's the stack overflow copypasta I used to achieve it:
| https://github.com/EamonnMR/Flythrough.Space/blob/master/src...
| lytefm wrote:
| I've been working on offline-first apps (CouchDB/PouchDB +
| Cordova/Capacitor and published via App/Play Store) in the last
| years and can definitely relate. But some points to add:
|
| - The 7 days IDB limitation does not apply for apps that are
| published through the stores
|
| - Conflicts can happen, but depending on your design they might
| not matter in practice. "Implement a proper conflict resolution
| strategy" has been on my "todo: maybe" list for over 3 years now
| but was never important enough.
|
| - Data migration is not needed as long as schema changes are
| additive (new doc fields, new doc types). Design carefully early
| on, keep track of "abandoned" properties and you'll rarely need
| a difficult migration.
|
| - Depending on the performance of your customers' phones and the
| amount of data your app is processing, it (JS -> ... -> IDB and
| back) might not be fast enough. I had to add caching layers for
| some use cases. But at some point, you probably want a proper
| state management library anyways which should include caching
| nearly for free.
|
| - You can (and should!) still consider most of your data
| relational. There is even a relational-pouch plugin. But I'm
| strongly missing foreign key constraints and better DB-level data
| validation than CouchDB's design docs provide.
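|
| To illustrate the additive-schema point from the list above, a
| small sketch. The field names and the `read_doc` helper are
| invented for the example and are not part of PouchDB or CouchDB.
| Old documents are never rewritten; newer fields just get a
| default when a document is read.

```python
import copy

# Fields introduced by later app versions (hypothetical names).
DEFAULTS = {"tags": [], "archived": False}


def read_doc(stored):
    """Overlay a stored document on the defaults for newer fields."""
    doc = copy.deepcopy(DEFAULTS)  # deep copy so default lists are never shared
    doc.update(stored)
    return doc


old_doc = {"_id": "note-1", "title": "written by app v1"}
upgraded = read_doc(old_doc)
```

| Removing or renaming a field is where this stops working and a
| real migration is needed.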
| yanis_t wrote:
| PWAs are the future.
|
| Along with other benefits comes the fact that you can "install"
| apps on devices bypassing the app stores (which means not paying
| commission fees), which I guess is the primary reason why
| Apple is so reluctant to give it proper support in Safari.
| twobitshifter wrote:
| Good article. For structured data, I've had good luck using UUIDs
| for keys rather than needing an approach that relies on an atomic
| clock.
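|
| A small sketch of why this works, with made-up record shapes and
| helper names (`new_record`, `merge_inserts`). Each device mints
| its own random key while offline, so inserts from different
| devices can be unioned later without a central id generator. It
| only covers key collisions; concurrent edits to the same record
| still need some conflict policy.

```python
import uuid


def new_record(title):
    """Create a record with a client-generated random (version 4) UUID key."""
    return {"id": str(uuid.uuid4()), "title": title}


def merge_inserts(*replicas):
    """Union records from several devices, keyed by their UUID."""
    merged = {}
    for replica in replicas:
        for record in replica:
            merged[record["id"]] = record
    return merged


phone = [new_record("buy milk")]
laptop = [new_record("file taxes")]
# Syncing the phone twice is harmless because the keys are stable.
merged = merge_inserts(phone, laptop, phone)
```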
| ngrilly wrote:
| Many of the problems mentioned in the article are solved by doing
| offline first with a native app instead of a PWA.
| taneq wrote:
| The most confusing thing to me is a discussion of "offline first"
| applications which starts with, and maintains, the assumption
| that your only option is a web app.
|
| Back in my day we had a word for software that always worked
| without an internet connection. We called it "software" and it
| was installed on the user's computer.
| Swenrekcah wrote:
| A user's computer? Written with a possessive? What a bizarre
| idea!
| tehbeard wrote:
| Har de har look at these modern programmers with their web 2.0s
| and js frameworks of the week...
|
| Offline first is not "software" as you classified it.
|
| Your "software" you are on about is more accurately described
| as "offline only", little to none of the functionality requires
| network access.
|
| Offline first refers to how online functionality is needed for
| the design of that app, but with steps taken to ensure that
| even without connectivity for periods of time, it still
| functions.
| jjnoakes wrote:
| While offline first seems to be discussed in the context of web
| apps a lot, to me it is more about the data and synchronization
| than where the executable lives, and most of the ideas also
| apply to certain kinds of desktop software.
| jitl wrote:
| It is approximately infinity times easier to distribute and
| grow the user base of a web app compared to locally installed
| software. The consumer software economy is moving online for
| this reason - it's much better for business.
|
| RxDB software comes from this context - it's a JavaScript
| library built for this world that attempts to retain the
| distribution and sharing advantages of the web, while adding
| back the responsiveness and availability of traditional
| installed software.
|
| I think you'll find the original "local first" manifesto more
| aligned with both the user & traditional installed software
| with less of a web focused bias:
| https://www.inkandswitch.com/local-first.html
|
| _> In this article we propose "local-first software": a set of
| principles for software that enables both collaboration and
| ownership for users. Local-first ideals include the ability to
| work offline and collaborate across multiple devices, while
| also improving the security, privacy, long-term preservation,
| and user control of data._
| blacktriangle wrote:
| Also back in that day 95% of our users were on Windows and we
| had a direct relationship with them.
|
| Now your user base is roughly split 40/40 between iOS and
| Android with a non-ignorable 20 running some windows tablet
| thing. Then for extra fun your access to those users is
| mediated by the giant black boxes staffed by assholes that are
| the iOS and Play stores.
|
| And of course back then nobody had any expectation that your
| software would easily sync between devices and users because
| that just wasn't something that was doable easily.
|
| The world changes. Yeah it was better for us devs back then,
| but honestly the new world has some real advantages.
|
| For me personally, I'm cautiously optimistic that PWAs are our
| way out of the hell that is supporting 3 native platforms for
| all but the rarest cases.
| poetaster wrote:
| Qt/qml works for me. Write once.
| lytefm wrote:
| For me, PWA doesn't quite cut it yet but using a single
| codebase based on web technology + adding the appropriate
| wrappers for native functionality (Electron, React Native,
| Capacitor, whatever...) is fine for now.
| dspillett wrote:
| That isn't offline first though: desktop software like that is
| generally offline _only_, and the user wraps their own chosen
| sync method (which could simply be good ol' sneaker-net or
| frisbee-net) around that if they want/need to.
|
| Offline first is only used in the context of web applications
| and sometimes their Android/iOS cousins (which probably share
| the same backend, where both are available), once the decision
| had been made that a not-locally-installed and/or remote synced
| application is desirable where possible, so isn't being
| suggested (directly) as an alternative to locally installed
| offline programs.
| JamesSwift wrote:
| The difference is that back in the day it _only_ worked on your
| computer. There was no cloud component and no collaborative
| use-case. If you only have a single, local client then a lot of
| this doesn't apply. But it's rare for that to be the case in
| modern software.
| jasode wrote:
| _> The most confusing thing to me is a discussion of "offline
| first" applications which starts with, and maintains, the
| assumption that your only option is a web app._
|
| I didn't downvote your comment but in this author's article,
| it's deliberate for the _starting context for discussion_ to be
| a networked collaborative app.
|
| Yes, internet-connected apps are a _subset_ of all possible
| software but that's not the point.
|
| As an analogy, imagine if someone else submitted an article
| about C Language memory techniques on an embedded chip. E.g.:
| https://www.embedded.com/memory-allocation-in-c/
|
| And then a commenter misunderstands that article complaining,
| _" it's confusing to me because this article maintains that the
| only option is C in an embedded app but over here, I'm using
| Python with Cloudflare Workers"_
|
| In other words, it doesn't seem like you're interested in
| collaborative apps that require distributed data consistency so
| this article looks "wrong" to you.
| phkahler wrote:
| It's confusing because "offline-first" doesn't even seem to
| make sense in the context of web-apps, which I thought meant
| "fancy (functional, able to do stuff) web sites" or similar.
| josephg wrote:
| I've been working in this problem space for awhile now and I
| sort of agree with the GP poster. I really like native
| software; and I want native software which can work simply in
| a distributed, collaborative context. As an example, I have a
| note taking app on my laptop. I want to be able to read and
| edit all my notes on all my other devices. And I want that to
| work in a way that doesn't depend on some random startup
| keeping their servers on the other side of the planet
| running. Right now every software company which wants to
| build something like this needs to invent their own data
| stack, network protocols and storage systems. And the prize
| at the end is software that can only talk to itself via
| closed protocols. It's infeasible, and inefficient.
|
| We have an opportunity right now to do an awful lot better.
| When we do, I want to service both native and web apps. If we
| do it right, from the network level the distinction should
| just boil away anyway.
| zorr wrote:
| I'm in this boat with the product I'm currently hacking on
| but I just can't seem to commit to an architecture I'm
| happy with. Mostly due to what you mentioned here.
|
| Essentially I want to build an opensource/hackable
| notes/tasks/calendar system with a central datastore and
| message broker for coordination between various
| systems/scripts/components/clients.
|
| The thing I'm struggling with is to define which features
| are supported in online and offline mode. Every feature
| that gets added to offline mode adds tons of duplication
| and complexity. I'm almost at the point of just saying I
| don't need offline-mode except for viewing already-cached
| data and maybe very basic creates/updates, with all the
| processing happening on the backend once the client comes
| back online.
|
| edit for more context: The old-style native apps (for
| example OmniFocus) usually have all the logic only in the
| client and use "dumb" cloud storage for synchronizing
| between clients. The difference with what I'm trying to
| build is that I want that central hub to be "smart" and
| always online so it becomes easy to hack/interact with the
| system from cronjobs/scripts/external services.
| josephg wrote:
| The architecture I have in mind is to use a CRDT of some
| sort as the data store. (Or OT if you have a centralised
| server and want to keep complexity down). Then make the
| client smart, like old school apps like OmniFocus. Do
| concurrent editing via the data layer - so the
| application only really deals with the local data and
| hears about updates via the underlying data itself
| changing.
|
| The data model can be reused across many applications,
| since there's nothing application specific about it. So
| we can make standard, interoperable debugging tools,
| backup tools, viewers, etc.
|
| If you want to interact with the same data from scripts,
| cron jobs and external services, just have another peer
| on the network with access to the same set of API methods
| the application can access. You can already read and
| write data via that API, and any applications with the
| data open should see any changes instantly.
|
| Basically, what I'm imagining is pretty similar to a self
| hosted firebase. Except, ideally, I want a CRDT under the
| hood so we don't _need_ to send all edits via someone
| else's computer.
| nonameiguess wrote:
| Networked collaborative app doesn't need to mean runs in a
| browser. A git repo fits that description, even a true
| distributed VCS with no server where every editor has their
| own copy and no single copy is authoritative. Each user
| chooses which changes to merge into their personal copies.
| The native versions of Microsoft Office when backed by
| SharePoint also operate that way, allowing users to check
| out individual copies and edit them in a native editor,
| although in that case Microsoft is clearly trying to push
| people into editing directly in the browser.
|
| A lot of these problems go away if you don't run in a
| browser, because users inherently trust software more when
| they're running a copy that can't change from underneath them
| on a second-by-second basis. I'm a lot more willing to give
| filesystem access to an application I have to explicitly
| install and that remains what I installed until I knowingly
| and intentionally upgrade it, as opposed to code pulled
| continuously from the network as I am working.
| psychometry wrote:
| Ok, and if you want your non-web app to be compatible with
| Windows, MacOS, iOS, and Android and you don't want to write
| more than one app, your options are...?
| poetaster wrote:
| Qt/qml works for me. With Python sometimes.
| justinclift wrote:
| Generally Qt.
| blondin wrote:
| i agree with the essence of your comment. the "back in my day"
| is what is, perhaps, causing resistance.
|
| my impression these days is that web engineers have outnumbered
| other software engineers. that's a problem because context
| matters as you are alluding to.
|
| we shouldn't use terms like "offline first" without an
| appropriate context. or assume context.
| hengheng wrote:
| I will still call any web app a "web site", and if that is my
| age showing, so be it.
| atatatat wrote:
| That's fine, if it's on your tablet or PC.
|
| If you're looking at a web "site" on a vertical phone screen,
| one side or the other did something wrong.
|
| My point is: sites and services should have two separate
| experiences, depending on screensize. These are: desktop
| site, mobile webapp.
| hoarad wrote:
| i do so also. and it is because they have a link
| taneq wrote:
| We can shout at the cloud together.
| ItsMonkk wrote:
| The way I've learned to use git is to
|
| 0. Sync with remote
|
| 1. Edit files until I'm ready to check-in
|
| 2. Stash changes
|
| 3. Sync up with latest from remote
|
| 4. Pop changes
|
| 5. If there are any conflicts, deal with them here, locally.
| Possibly delete all changes and redo. Redoing is equivalent to
| someone checking in something at step 0. If this takes some time,
| move back to Step 2.
|
| 6. Push changes
|
| If done this way, the only benefit of CRDTs is during step 5.
| One of the lessons I've learned and truly believe is that if
| something sucks, and it gets worse as the problem gets bigger,
| you need to do it more often. Git merges are a great example of
| this concept.
|
| And this is where CRDTs are in trouble. The best way to make
| step 5 easier is by making step 1 smaller. The CRDT viewpoint is
| that we are offline, and therefore should allow any number of
| edits, and when we reconnect the system should be able to work it
| out. It flies in the face of smaller commits. The more changes
| you need to merge, the harder it is. We live in a world that's
| connected 99% of the time, and CRDTs simply aren't needed.
|
| On the other hand, the "Offline First" model is great! A user
| having access to all of their data is wonderful. As the blog
| notes, it's not preferable for a user to have the entire
| Wikipedia or Google indexes on their device. So you need to have
| a use-case for Offline, and we need to do better about this type
| of stuff, but we don't need to wait on CRDT research for this.
| Smart caching and materialized views are where I think the real
| progress is going to come from, making things like Offline
| Wikipedia possible.
| jcun4128 wrote:
| I made a PWA that used IndexedDB and base64 photos. I ran into a
| funky max-length issue that was an error from Chromium kind of
| interesting. But yeah the main problem I had was the base64
| images would get too big (if you had too many to load) and then
| you would see a slower render vs. an image pulled by url.
|
| Still pretty cool since I'm not a native developer and RN is
| something I've dabbled in but don't use daily.
| jitl wrote:
| These days IndexedDB supports storing binary blobs as File
| objects or even ArrayBuffers; the relevant bug on the Chrome
| issue tracker was marked fixed in 2014:
| https://bugs.chromium.org/p/chromium/issues/detail?id=108012
| jcun4128 wrote:
| Interesting. I think I was straight up using just text or
| whatever the default is. Though I was using the Dexie wrapper,
| which is nice.
|
| To be clear I don't know if the bug was from IndexedDB it was
| something about length exceeded.
|
| I did try to use small images too, e.g. some 150px by 150px, but
| going to base64 usually multiplies the size by about 1.33.
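|
| The arithmetic behind that factor, as a quick sketch: base64
| turns every 3 input bytes into 4 output characters, so the ratio
| is 4/3 plus a little padding. The 3072-byte buffer is just a
| stand-in for a small image and `base64_length` is a helper made
| up for the example.

```python
import base64
import math


def base64_length(raw_length):
    """Every 3 input bytes become 4 characters; the last group is padded."""
    return 4 * math.ceil(raw_length / 3)


raw = bytes(range(256)) * 12          # 3072 bytes standing in for an image
encoded = base64.b64encode(raw)
ratio = len(encoded) / len(raw)       # 4096 / 3072
```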
|
| For anyone interested [1] not a phenomenal app but one I
| poured idk 2-300+hrs into (was working on an RN version too)
| and went nowhere sucks. I was contributing to one of those
| codefor# deals.
|
| [1]
| https://github.com/codeforkansascity/tagging-tracker-alt-app...
| ngokevin wrote:
| Perhaps Service Workers is better than IDB for that?
| begueradj wrote:
| The day before yesterday, I read an article here praising the
| "Offline First" principle. And as for everything related to
| software, I am reading again what was good is bad, and what was
| bad is good. All you have to do for a successful clickbait is to
| wait until someone shares his opinion so that you can finally
| write something against it... (most of the time, just for the
| purpose of saying you're against)
| esrch wrote:
| The other article you mention comes from the same website:
| https://rxdb.info/offline-first.html. It's in an opinion
| section, giving the arguments from both sides.
| typingmonkey wrote:
| Yes. I have written both of them, mostly to make sure I
| trigger 100% of people, no matter if they like offline first
| or not.
|
| I mean, read all these comments on both articles. People say
| offline first does not work, then they say that every
| software is offline first by default and it is nothing new.
|
| It was totally worth spending two or three days on it :)
| somenewaccount1 wrote:
| And you know what, I AM triggered!
|
| I feel like you missed the most important point of "offline
| first" is that your data belongs to you and does not need
| to be shared to the cloud in order to have tremendous
| value. The code/logic to enhance your data should be
| shipped to you, rather than the other way around.
| JamesSwift wrote:
| Its a good article but it should be noted that this very much is
| a web-centric view of Offline First and its challenges. When I
| say web-centric, I mean as opposed to offline-first on a mobile
| app.
|
| Native apps don't deal with the same issues around storage, and
| are actually _much_ more performant overall. If you haven't done
| an offline-first app, I highly recommend it. The experience is
| magical. You can fly around your app at the speed of the user's
| touch. Content is magically loaded as soon as it's clicked on. It's
| amazing as a user.
|
| As for similarities, conflict resolution is a universal problem
| and is either the most difficult or second most difficult problem
| [1] to solve. What makes this more difficult is that there is no
| one-size-fits-all for this. You need to have a deep, nuanced
| understanding of your system and what makes sense for your use
| case in terms of resolution strategies. Then you need to
| implement them, which is not easy especially if your backend is
| bog-standard REST on a classic SQL datastore.
|
| I've enjoyed reading the past 2 rxdb articles on this (the one
| mentioned here as well as the one from a couple days ago [2]).
| It's great to have more content on this publicly available; when I
| was getting into offline-first I only had a couple of options.
|
| [1] - At the end of the day, offline-first is a whole bunch of
| caching, and so you end up needing to deep dive into cache
| invalidation strategies which we all know is a hairy problem.
|
| [2] - https://news.ycombinator.com/item?id=28690427
| lytefm wrote:
| > At the end of the day, offline-first is a whole bunch of
| caching, and so you end up needing to deep dive into cache
| invalidation strategies which we all know is a hairy problem.
|
| Looks like you're not truly getting the concept of Offline-
| first apps then. You don't have or need a cache. You have a
| local database on the device that syncs up with the server, if
| online.
|
| Conflicts can occur, but modeling data well with a
| CouchDB/PouchDB setup (i.e. preventing conflicts by creating new
| docs rather than modifying existing ones all the time) plus a simple
| timestamp-based heuristic can already be sufficient.
| Otherwise, using a CRDT is an option.
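A timestamp-based heuristic like the one described above could look roughly like this. It is only a sketch: the `NoteDoc` shape, the `updatedAt` field and the `resolveLww` helper are hypothetical names invented for this example, not part of the CouchDB or PouchDB APIs.

```typescript
// Hypothetical document shape; `updatedAt` is a client-supplied
// timestamp in milliseconds (an assumption for this sketch).
interface NoteDoc {
  id: string;
  body: string;
  updatedAt: number;
}

// Last-write-wins: keep the revision with the newer timestamp.
// Equal timestamps are broken by comparing the body text, so every
// replica picks the same winner regardless of argument order.
function resolveLww(a: NoteDoc, b: NoteDoc): NoteDoc {
  if (a.updatedAt !== b.updatedAt) {
    return a.updatedAt > b.updatedAt ? a : b;
  }
  return a.body >= b.body ? a : b;
}

// Two devices edited the same note while offline.
const phoneEdit: NoteDoc = { id: "note-1", body: "buy milk", updatedAt: 1000 };
const laptopEdit: NoteDoc = { id: "note-1", body: "buy oat milk", updatedAt: 2000 };

const winner = resolveLww(phoneEdit, laptopEdit);
console.log(winner.body);
```

Note that the older edit is dropped silently, and that the result depends on device clocks being roughly right. Whether that is acceptable depends on the data, which is why the "create new docs instead of modifying" advice matters.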
| JamesSwift wrote:
| > You don't have or need a cache. You have a local database
| on the device that syncs up with the server, if online.
|
| The source of truth is the server. Everything else (i.e. the
| local copy) is a snapshot of that, aka a cache. It's just that
| offline-first is _always_ a cache-first read, so you seem to
| think that this makes it not a cache any more, but a regular
| data store.
| lytefm wrote:
| > so you seem to think that this makes it not a cache any
| more, but a regular data store.
|
| Yes. In a true offline-first approach with
| CouchDB/PouchDB, the client-side database can be considered
| to be the main data store and the server-side DB could just be
| a backup. Or it might not be needed at all, or might only be
| used to migrate from one device to another.
|
| I'd say whether it's a master-slave or multi-master model
| depends on the conflict resolution strategy.
| BackBlast wrote:
| CRDT is really a multi-master model with eventual
| consistency. It's not merely a cache but also operates as a
| master and source of truth. With a well-built CRDT you can
| also skip the server and sync client-to-client where both
| have master copies.
|
| The tricky part is conflict resolution.
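To make the multi-master idea concrete, here is a sketch of one of the simplest CRDTs, a grow-only counter (G-Counter). The replica names and helper functions are invented for this illustration and do not come from any particular library. Each replica only increments its own slot, and merging takes the per-replica maximum, so replicas can sync in any order, directly with each other, and still converge.

```typescript
// Grow-only counter: one slot per replica id.
type GCounter = Record<string, number>;

// A replica only ever increments its own slot.
function increment(counter: GCounter, replicaId: string): GCounter {
  return { ...counter, [replicaId]: (counter[replicaId] ?? 0) + 1 };
}

// Merge takes the maximum per slot. This is commutative, associative
// and idempotent, so sync order and repeated syncs do not matter.
function merge(a: GCounter, b: GCounter): GCounter {
  const merged: GCounter = { ...a };
  for (const replicaId of Object.keys(b)) {
    merged[replicaId] = Math.max(merged[replicaId] ?? 0, b[replicaId] ?? 0);
  }
  return merged;
}

// The counter's value is the sum of all slots.
function value(counter: GCounter): number {
  return Object.keys(counter).reduce((sum, id) => sum + (counter[id] ?? 0), 0);
}

// Two replicas count independently while offline, then sync peer-to-peer.
let replicaA: GCounter = {};
let replicaB: GCounter = {};
replicaA = increment(increment(replicaA, "a"), "a"); // two increments on A
replicaB = increment(replicaB, "b"); // one increment on B

const synced = merge(replicaA, replicaB);
console.log(value(synced)); // 3
```

A counter is the easy case. For richer data types the merge function is where the real design work goes.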
| JamesSwift wrote:
| Right, the peer-to-peer or anything with a concept of
| multi-master / majority is a different class of problem.
| I am speaking only to traditional client-server
| scenarios.
| LAC-Tech wrote:
| That just sounds like last write wins - the first client to
| sync with the server is now the source of truth and other
| clients that were working on the same thing will get
| clobbered.
| Trufa wrote:
| Yeah, I'm trying to implement an Electron offline-first app that
| syncs, and there seems to be no ready-made solution.
|
| Stuff like https://github.com/aerogear/offix seems to be heading in
| the right direction for what I'm looking for, but it's not nearly
| mature enough.
|
| I don't want to put too much effort into the app, so I would like
| something more or less ready-made, preferably with GraphQL
| APIs.
|
| Any suggestions welcome.
| firebase-user wrote:
| You should try using Firebase. It handles the data-sync for
| you, and it applies all updates locally first so that your
| app feels snappy.
| JamesSwift wrote:
| I mentioned Realm because that's what I'm familiar with, but
| yeah, Firebase is also a good choice if you want a
| client-server setup.
| code-is-code wrote:
| Firebase works until you want a different conflict
| resolution than last-write-wins. It is also only offline
| first if the user is authenticated. Otherwise you need a
| connection to the servers before using the local state.
| jamil7 wrote:
| Firebase is offline-resistant more so than offline-first
| I'd argue.
| LAC-Tech wrote:
| Last write wins is not handling data sync, it's washing
| your hands of it.
|
| Your users will be frustrated.
| davidzweig wrote:
| The (JS) frontend SDK can be set to either online or
| offline mode. When in online mode, it seems to insist on
| downloading full copies of records, even when a version is
| somewhere in the local cache, slowing things down and
| incurring cost. Some optimisation could have been applied
| here; I didn't find anything about it in the docs. Does
| this match others' experience?
| danuker wrote:
| Ah yes. Convenience is how Big Tech gets all data to pass
| through it.
| JamesSwift wrote:
| I would hesitate to use any client-server setup that isn't
| also your primary data store. So, if you have an "offline
| aware, multi-tenant" store like CouchDB but you still need to
| sync to the primary store which is SQL, then you lose a lot
| of context and awareness around conflict resolution. If you
| are going to eventually sync to the other store, I would say
| only use Couch on the frontend, and do the syncing/resolution
| from the perspective of the client, since the client knows
| what it was trying to do and how best to notify on conflict.
|
| The "better" option is to have an offline-aware client-server
| component (e.g. Realm) which is the primary store as well.
| This eliminates the sync and so all conflict resolution stays
| in the same system with well defined semantics.
| Trufa wrote:
| Thanks for the recommendation. I'm a little bit skeptical
| of MongoDB, but Realm looks nice. With Firebase I'm a little
| reluctant to be so dependent on Google, but otherwise it
| looks good.
| JamesSwift wrote:
| I share your concerns and avoid it for the same reasons :)
|
| Just giving options and google fodder.
| BackBlast wrote:
| PouchDB works well as the primary client store and can
| sync to CouchDB in the cloud. IMHO this is one of the
| more mature combinations that gives you a ready client
| side document database that takes care of data
| replication for you.
| WorldMaker wrote:
| It's just unfortunate that running CouchDB in the cloud
| seems increasingly perilous. Since IBM acquired Cloudant,
| it's much tougher to run CouchDB "at scale PaaS" on any
| data center other than IBM's (for obvious reasons),
| whereas early Cloudant had robust Azure and AWS support
| for years.
|
| I wish Couchbase were more helpful practically than they
| try to present themselves theoretically. Even if their
| products weren't so expensive the impedance mismatch
| between their version of CouchDB's sync APIs and their
| own APIs seems to increase by the year, and is pretty
| noticeable in how differently it works from a PouchDB
| standpoint and how easy it is to break sync. (Impedance
| mismatches in allowable database names and _id keys are
| huge on their own and have massive repercussions in
| application design.)
|
| Even CouchDB is not CouchDB anymore with impedance
| mismatches of its own between versions. On the one hand
| it's good that Cloudant upstreamed a lot of their cluster
| management tools directly into Apache CouchDB 2+ (even as
| they made their PaaS offering IBM Cloud only [or whatever
| its name of the month is]), but huge architectural
| changes below the covers in CouchDB 3+ start to present
| their own sync issues akin to but distinct from
| Couchbase's (and even some of Cloudant's as they seem [?]
| to be diverging again into their own 2+ fork after all
| that work upstreaming stuff?).
|
| More than ever, Azure CosmosDB's focus on bare minimum
| MongoDB capability and not supporting anything like
| CouchDB sync, despite having close to the same raw
| ingredients (Cosmos' change feed looks a lot more like
| Couch's, but is just missing a couple subtle things to
| make it directly and immediately useful for Couch
| replication) seems like a "CouchDB is dead and not worth
| supporting" signal from Microsoft.
|
| Unfortunately, I think PouchDB <=> CouchDB replication
| has passed the "mature" point and reached the "decrepit" and
| "falling apart" stage, maybe to the point of
| "evolutionary dead end" if I'm feeling pessimistic
| enough, and I've been trying for years to figure out
| what to replace it with.
| BackBlast wrote:
| That's a timely response, and I too have noticed that
| CouchDB may not have the best/compatible future. I have
| an existing product that depends on PouchDB in the client
| and the replication.
|
| I'm not really into the managed DB service offerings; I
| like to have a level of control that they rarely offer.
|
| My tentative solution for the application is to go with
| PouchDB on node.js on the backend too.
| psychometry wrote:
| It's infuriating to me that WebSQL was killed. 99% of real-world
| data is relational and yet the powers that be decided that we
| should all be forced to use IndexedDB and hacky layers like
| PouchDB built atop it.
|
| I'm excited about https://github.com/jlongster/absurd-sql though.
| goohle wrote:
| IMHO, these days browsers should just ship popular libraries,
| e.g. SQLite, with the browser, or download them once and cache them
| permanently until a newer version is released. Think of a
| "Linux distribution", but for web/wasm.
| dragonwriter wrote:
| WebSQL wasn't killed by opposition to having a relational API,
| it was killed because the spec was tied to, and only
| implemented by embedding, a specific, identified version of
| SQLite.
| EvanAnderson wrote:
| I think that a developer distaste for relational databases
| was a major driver. Digging back into correspondence on this
| a few months ago (when this came up on HN) I found clear
| statements that Mozilla opposed anything relational. The
| SQLite version is a convenient excuse for some developers
| who, at the time, were enamoured with "NoSQL".
|
| Discussion here:
| https://news.ycombinator.com/item?id=28156831
| WorldMaker wrote:
| Mozilla for a long time backed their IndexedDB with SQLite,
| they wouldn't have done that if they were that antagonistic
| to relational databases.
|
| I trust Mozilla's surface reasons here: they inherited the
| mess that was NPAPI from Netscape, then decades of
| experience with XUL binary components, were among the many
| dealing with Flash bugs and zero-day fallout well after
| Flash's "heyday", and have combined multiple decades of
| experience in what happens if the web depends on _specific_
| binaries to do its job. Given that they were
| already knee deep in trying to sandbox/rein in NPAPI,
| remove XUL, and remove Flash, I absolutely understand why
| "you want the web to depend on the bugs and zero days of
| SQLite directly with no abstraction layer between?" was a
| complete non-starter.
| ec109685 wrote:
| The need to productionize another embedded database in order
| to support embedded SQL in the browser seems like a tough
| hill to climb given how widespread SQLite is. This stance is
| always going to keep web apps behind mobile apps in terms of
| features and performance.
| thayne wrote:
| > in the end the user itself could be the one that deletes the
| browsers local data
|
| This is especially true because customer support for many
| websites often has customers begin troubleshooting by clearing the
| browser's "cache and cookies". In other words, deleting all of the
| local data for all sites. There are ways to delete the local data
| for just one site, but they are pretty hidden, and involve
| multiple steps. I wish browsers had a simple "delete all data for
| just this site" button.
| lambda_dn wrote:
| It's not worth it; having an internet connection is more ubiquitous
| every day with Wi-Fi, 4G/5G and, coming soon, low-orbit satellite
| grids.
|
| Trying to engineer your app to work offline causes complexity in
| the design and implementation for an issue that might never be an
| issue for most customers.
|
| Assuming your app is pretty crippled when it can't access the
| cloud.
|
| What's next, making your app still work when there is no display,
| by screen reading?
| lytefm wrote:
| This statement very much depends on the kind of app you're
| developing, as mentioned in the article. Is the main use case
| communicating with another user? Do you need a third party API
| for your app's basic functionality? Sure, don't even consider
| offline first.
|
| But what if the app is only about storing and displaying data
| entered by the user and you'd definitely also want to be able
| to use it on an airplane, e.g. a todo/notetaking/journaling
| app? Then offline first can make sense.
|
| I'd even say that the complexity of developing an offline-first
| app can be lower than that of a classical client/server app +
| caching logic. Sure, you need to figure out how to do schema
| migrations and you probably want a kill switch to lock out
| older apps at some point. But the same applies in a classical
| setting when API-endpoints should be changed or removed.
| Basically, your document model is now your API. And a very
| simplistic conflict resolution strategy like "just take the
| most recent version" is often good enough.
|
| Once that + basics like auth and account creation are set up,
| it's very productive and low overhead to work with
| PouchDB/CouchDB and offline first:
|
| - no need to coordinate with a backend team because nothing
| else than auth happens there
|
| - simpler state management and error handling because the state
| in the local DB can always be assumed to be correct and the DB is
| always there
|
| - no need for schema migrations as long as you only add new
| docs or extend existing ones
|
| - it's great for quickly hacking a prototype or for beginners
| who just know some HTML/CSS/JS
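The "only extend existing docs" point in the list above can be sketched as follows. The todo document shape, the later-added `priority` field and the `normalizeTodo` helper are all hypothetical, chosen only to show the idea: new fields are optional in storage, and readers fill in a default, so documents written by older app versions never need to be rewritten.

```typescript
// Shape as stored on disk. `priority` was added in a later app
// version (an assumption for this sketch), so older docs lack it.
interface StoredTodo {
  _id: string;
  title: string;
  done: boolean;
  priority?: number;
}

// Shape the rest of the app works with: every field is present.
interface Todo {
  _id: string;
  title: string;
  done: boolean;
  priority: number;
}

const DEFAULT_PRIORITY = 0;

// Fill in defaults at read time instead of migrating stored docs.
function normalizeTodo(stored: StoredTodo): Todo {
  return { ...stored, priority: stored.priority ?? DEFAULT_PRIORITY };
}

// A doc written by the old app version and one written by the new one.
const oldDoc: StoredTodo = { _id: "todo-1", title: "water plants", done: false };
const newDoc: StoredTodo = { _id: "todo-2", title: "file taxes", done: false, priority: 3 };

console.log(normalizeTodo(oldDoc).priority, normalizeTodo(newDoc).priority);
```

This only covers additive changes. Renaming or removing a field, or changing its meaning, still needs a real migration or a kill switch for older clients.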
| jamil7 wrote:
| Ideally yes, you'd make some effort to make your app accessible
| by a screen reader.
| austincheney wrote:
| > When you create a web based offline first app, you cannot store
| data directly on the users filesystem. In fact there are many
| layers between your JavaScript code and the filesystem of the
| operation system.
|
| Solved: File system in the browser plus network distribution -
| https://github.com/prettydiff/share-file-systems
___________________________________________________________________
(page generated 2021-10-01 23:00 UTC)