[HN Gopher] Cloudant/IBM back off from FoundationDB based CouchD...
___________________________________________________________________
Cloudant/IBM back off from FoundationDB based CouchDB rewrite
Author : jFriedensreich
Score : 97 points
Date : 2022-03-12 16:06 UTC (6 hours ago)
(HTM) web link (lists.apache.org)
(TXT) w3m dump (lists.apache.org)
| elitepleb wrote:
| So what's the deal with the unpopularity of CouchDB?
|
| It seems like a compelling database, but I've yet to run into
| it in the wild.
| tehbeard wrote:
| Beyond the meta of it being old/mature and thus not continually
| piercing the tech newsspace with releases etc.
|
| Querying in a more ad-hoc way (vs. building indexes ahead of
| time and querying by key, etc) is a bit janky / not 1st class
| (I think mango addresses this but not entirely sure).
|
| The runtime being Erlang? It certainly seemed to be the cause
| of some issues when I tried to run it in WSL, or at least my
| lack of knowledge of Erlang made diagnosing it more trouble.
|
| The JS query server engine is/was fairly old (I think it might
| have jumped to a more recent version of SpiderMonkey at some
| point), and hooked up in a way that, while more modular, limits
| the performance (documents have to be serialized to/from the
| engine in another process, rather than just natively passed in)
|
| The authorization model is... unique. You can limit down to a
| doc/field level who can submit changes via
| validate_doc_update(...) in a design doc. So allowing those
| with a reviewer role to only be able to edit a notes field on a
| document, while the user in the author field has full access to
| the other fields is possible. But read access is at the
| database level, as in you can either read the db, or not.
|
| The way around this for having "private" storage is enabling a
| feature to make a db per user automatically and assign them
| rights, but this is more complicated to manage client side (two
| dbs to talk to) and replication even more of a nightmare if
| stuff needs to be shareable instead of just private.
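
A rough sketch of the reviewer/author rule described in the comment above. CouchDB calls validate_doc_update with (newDoc, oldDoc, userCtx, secObj) and rejects the write when the function throws an object with a "forbidden" key. The field names "author" and "notes", the "reviewer" role, and the TypeScript types are assumptions for illustration; a real design doc would hold this as plain JavaScript.

```typescript
// Shapes below are illustrative assumptions, not a CouchDB-defined schema.
interface UserCtx { name: string | null; roles: string[]; }
type Doc = { [field: string]: unknown };

// Same argument order CouchDB uses (the fourth argument, secObj, is omitted).
function validateDocUpdate(newDoc: Doc, oldDoc: Doc | null, userCtx: UserCtx): void {
  // The user named in the "author" field may change anything.
  const author = oldDoc ? oldDoc["author"] : newDoc["author"];
  if (userCtx.name !== null && userCtx.name === author) return;

  // Reviewers may edit existing documents, but only the "notes" field.
  if (oldDoc && userCtx.roles.indexOf("reviewer") !== -1) {
    const fields = Object.keys(oldDoc).concat(Object.keys(newDoc));
    for (const field of fields) {
      // Revision bookkeeping fields change on every write, so skip them.
      if (field === "notes" || field === "_rev" || field === "_revisions") continue;
      // Crude deep comparison; sensitive to key order inside nested objects.
      if (JSON.stringify(oldDoc[field]) !== JSON.stringify(newDoc[field])) {
        throw { forbidden: 'reviewers may only edit "notes", not "' + field + '"' };
      }
    }
    return;
  }
  throw { forbidden: "only the author or a reviewer may change this document" };
}
```

Nothing comparable exists on the read side, which is why the comment falls back to one database per user for private data.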
| kache_ wrote:
| I've used it; it's pretty decent provided you understand the
| internals.
| Already__Taken wrote:
| npm does or did run on it, if that's what you mean by
| in-the-wild: https://github.com/npm/npm-registry-couchapp
| gedy wrote:
| It is/was nice, just an early NoSQL DB with a lot of
| interesting features. Just better options came about to take
| its mindshare. We used it about 11 years ago for an internal
| marketing CMS system and the replication and attachment support
| were a good fit.
| jFriedensreich wrote:
| I would strongly object to "better options came about". I am
| not disputing that a better fit for your specific problems may
| have come along, but in the general case of CouchDB's sweet
| spot there are no obvious better alternatives. The sweet spot
| being "a schemaless JSON database with a REST API and
| first-class support for master-master and online/offline
| replication that values your data safety and reliability
| first and everything else second."
| tehbeard wrote:
| What's out there with a better client device sync option?
|
| This is something I've been looking for for a few PWAs that
| need to operate on bad/no network, and most other solutions
| are build your own entire sync setup, or magic-in-a-box you
| can't tune.
|
| With Couch/pouch, I can sync with a filter/several filters to
| make sure the subset of data I need is on the device.
| gedy wrote:
| Yeah agreed that's really cool. Closest I've seen is Apollo
| client, but to your point you have a lot less fine grained
| control.
| rat9988 wrote:
| What better options would you have in mind? Asking out of
| curiosity, I don't follow this space closely.
| gedy wrote:
| It's been a while, but it seems like many people wanted
| something simpler like MongoDB for a NoSQL document
| database. CouchDB's map/reduce queries were hard for people
| to get their heads around, many people didn't need
| attachments, etc.
| bsaul wrote:
| Sidenote: I've heard FoundationDB was used for CloudKit, but is
| it also used for iMessage?
|
| It seems like its transactional properties would be quite well
| suited to something like a messenger service (where the order of
| messages matters, especially with e2e encryption)
| navarro485 wrote:
| Pretty sure Cassandra is used for iMessage, although that may
| have changed after Apple acquired FoundationDB.
| gigatexal wrote:
| On device iMessage is SQLite I think. Backend not sure.
| samwillis wrote:
| While I don't have enough knowledge of the wider implications of
| this, it does impact something I was experimenting with last
| year.
|
| The FoundationDB rewrite would introduce a size limit on document
| attachments; there currently isn't one. Arguably attachments are
| a rarely used feature, but I found a useful use case for them.
|
| I combined the CRDT Yjs toolkit with CouchDB (PouchDB on the
| client) to automatically handle sync conflicts. Each couch
| document was an export of the current state of the Yjs Doc (for
| indexing and search), all changes done via Yjs. The Yjs doc was
| then attached to the Couch document as an attachment. When there
| was a sync conflict the Yjs documents would be merged and re-
| exported to create a new version. The issue being that the
| FoundationDB rewrite would limit the size and that makes this
| architecture more difficult. It's partly why I ultimately put the
| project on hold.
|
| (Slight aside, a CouchDB-like DB with native support for a CRDT
| toolkit such as Yjs or Automerge would be awesome, when syncing
| mirrors you would be able to just exchange the document state
| vectors - the changes to it - rather than the whole document)
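
A toy sketch of the state-vector exchange idea in the aside above: each replica summarizes what it has seen as a map from client id to a counter, and the peer sends back only the operations missing from that summary. This is not the Yjs or Automerge API or wire format; Replica, Op, and every method name here are hypothetical, and it assumes each client's operations arrive in order without gaps.

```typescript
// One operation, identified by the client that made it and a per-client counter.
interface Op { client: string; clock: number; payload: string; }
// client id -> number of operations seen from that client
type StateVector = { [client: string]: number };

class Replica {
  ops: Op[] = [];

  stateVector(): StateVector {
    const sv: StateVector = {};
    for (const op of this.ops) sv[op.client] = Math.max(sv[op.client] || 0, op.clock + 1);
    return sv;
  }

  // Record a local edit with the next clock value for this client.
  edit(client: string, payload: string): void {
    this.ops.push({ client: client, clock: this.stateVector()[client] || 0, payload: payload });
  }

  // Operations the remote replica lacks, judged from its state vector alone.
  diffFor(remote: StateVector): Op[] {
    return this.ops.filter(op => op.clock >= (remote[op.client] || 0));
  }

  // Apply a received diff, skipping operations this replica already holds.
  apply(diff: Op[]): void {
    const sv = this.stateVector();
    for (const op of diff) {
      if (op.clock >= (sv[op.client] || 0)) this.ops.push(op);
    }
  }

  // Deterministic view: order by client, then clock, so replicas converge.
  snapshot(): string[] {
    return this.ops.slice().sort((p, q) =>
      p.client < q.client ? -1 : p.client > q.client ? 1 : p.clock - q.clock
    ).map(op => op.payload);
  }
}
```

Only the small state vector and the missing operations cross the wire, rather than the whole document.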
| HelloNurse wrote:
| But is it a small size limit that affects realistic usage?
| Don't you have performance issues if you use a CRDT implemented
| in JavaScript and running in the browser with large files?
| samwillis wrote:
| So yes, a particularly large document is not the norm but it
| can happen.
|
| JavaScript CRDTs can be quite performant, see the Yjs
| benchmarks: https://github.com/dmonad/crdt-benchmarks
| kevincox wrote:
| I don't see a fundamental reason for an attachment size limit.
| I guess it would just need to be implemented by breaking the
| attachment into multiple keys? There may be some overhead, but
| it seems valuable because it allows large attachments to be
| split across servers as required.
| tlarkworthy wrote:
| When you chunk it you have problems about what happens if
| that process is interrupted. So it's not trivial (though
| solvable) but it's the kind of atomics you want the new
| engine to do.
| aseipp wrote:
| I think the person you're replying to is saying that the
| document should be split across keys inside the
| implementation, i.e. split across the fdb keyspace, not
| split by the user at the application level. Which is the
| approach you almost always have to use for 'large' values;
| FoundationDB has size limitations on the k/v pairs it can
| accept and splitting documents and writing those chunks in
| small transactional batches is the recommended workaround
| (along with some other 'switch over' transactional write
| which makes the complete document visible all at once.)
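
A minimal sketch of the chunk-then-switch-over pattern described above, using a plain object as a stand-in for the key/value store. It does not use the FoundationDB client API; the key layout, the MAX_VALUE_LEN limit, and the function names are all assumptions for illustration, and the attachment is a string rather than binary data to keep the example short.

```typescript
type KV = { [key: string]: string };
const MAX_VALUE_LEN = 8; // assumed, deliberately tiny per-value limit

// Stand-in for one small transaction: check every write, then apply them together.
function commit(store: KV, writes: KV): void {
  for (const key of Object.keys(writes)) {
    if (writes[key].length > MAX_VALUE_LEN) throw new Error("value too large: " + key);
  }
  for (const key of Object.keys(writes)) store[key] = writes[key];
}

function putAttachment(store: KV, name: string, version: number, data: string): void {
  const prefix = "att/" + name + "/v" + version + "/";
  let count = 0;
  // Each chunk is written in its own small batch. An interrupted upload leaves
  // orphaned chunks under a version prefix that no pointer refers to yet.
  for (let offset = 0; offset < data.length; offset += MAX_VALUE_LEN) {
    const writes: KV = {};
    writes[prefix + count] = data.slice(offset, offset + MAX_VALUE_LEN);
    commit(store, writes);
    count++;
  }
  // The "switch over" write: one small pointer makes the whole version visible.
  const pointer: KV = {};
  pointer["att/" + name + "/current"] = version + ":" + count;
  commit(store, pointer);
}

function getAttachment(store: KV, name: string): string | undefined {
  const pointer = store["att/" + name + "/current"];
  if (pointer === undefined) return undefined;
  const parts = pointer.split(":");
  const count = Number(parts[1]);
  let data = "";
  for (let i = 0; i < count; i++) data += store["att/" + name + "/v" + parts[0] + "/" + i];
  return data;
}
```

Readers only follow the pointer, so a half-written version is never observed; cleaning up orphaned chunks is left out of the sketch.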
| tehbeard wrote:
| If I remember the fdb docs, there's also a time limit on
| transactions that further limits the feasible max size.
| malkia wrote:
| Reminds me of when a team I worked in had to migrate from one
| database to another (we were the only team left using that one,
| and no one was supporting it internally), but the new one had a
| 22 MB (or was it 44 MB) limit on the total transaction size,
| while the previous one did not (AFAIR). Someone worked on
| splitting into several transactions (the bulk was really due to
| long recorded conversation "forum" like messages related to
| specific data), but overall it changed how things worked and
| had some issues initially... Who would've thought you would
| need that, years from the day it was originally designed...
| robertnewson wrote:
| The (low) attachment size limit at Cloudant is about service
| quality and guiding folks to good uses of the service more than
| a technical issue.
|
| As others have noted, the solution to storing attachments in
| FDB, where keys and values have an enforced maximum length, is
| to split the attachments over multiple key/values, which is
| exactly what the CouchDB-FDB code currently does.
|
| The other limit in FDB is the five second transaction duration,
| which is a more fundamental constraint on how large attachments
| can be, as we are keen to complete a document update in a
| single transaction. The S3 approach of uploading multiple parts
| of a file and then joining them together in another request
| would also work for CouchDB-FDB. While it _could_ be done,
| there's no interest in the CouchDB project to support it.
| samwillis wrote:
| Exactly, almost all the time you would be better to save the
| attachment to an object store. However I think I found that
| small edge case where the attachment system was perfect. It
| was essential to save the binary Yjs doc with the couch
| document, it needed to be synced to clients with the main
| document. Saving it to an object store is not viable due to
| the overhead during syncing.
| robertnewson wrote:
| Yup. The purpose of CouchDB's original attachment support
| was for "couchapps": the notion that you'd serve your
| entire application from CouchDB. Attachments were therefore
| for html, javascript, image, font assets, which are all
| relatively small. The attachment support in CouchDB <= 3.x
| is a bit more capable than that due to its implementation,
| but storing large binaries was not strictly a goal of the
| project.
| tluyben2 wrote:
| Why don't you open source your work? Can you contact me
| otherwise, maybe I can take over this work on couchdb; we have
| to do it anyway and we would open source it.
| [deleted]
| matlin wrote:
| This is too bad. I understand there is likely a ton of complexity
| in making this switch but I think it still leaves CouchDB with a
| frustrating problem which is document conflicts within a given
| cluster. Client <-> Server conflicts are very understandable but
| when you might unexpectedly get a document conflict from two
| server instances replicating with each other, you're just bound
| to run into a bunch of issues.
|
| To have multi-master work properly you basically need Strong
| Eventual Consistency via CRDTs which most databases don't
| natively support (I think only Riak). Otherwise, you're better
| off switching to a single writer model.
| endisneigh wrote:
| What's the simplest client or way to use foundationDB? I was
| excited for this because FDB is somewhat unintuitive to use and
| deploy
| manishsharan wrote:
| The FoundationDB Document Layer is compatible with the MongoDB
| 3.x API: https://github.com/FoundationDB/fdb-document-layer and
| you get transactional integrity.
|
| I stopped using MongoDB and switched to this.
| endisneigh wrote:
| Amazing - much appreciated. How is it going compared to
| mongo?
| manishsharan wrote:
| Better than MongoDB. Easy to scale up. And no MongoDB
| gotchas for transaction.
|
| I use my FoundationDB cluster as a MongoDB alternative and
| a Redis alternative. Only one cluster to maintain and two
| types of functionality! I have tried setting up and
| maintaining clusters of MongoDB and Redis in the past and
| it was horribly complicated. A FoundationDB cluster is so
| much easier to set up and maintain. And it gives me the
| functionality of both Redis KV and MongoDB.
___________________________________________________________________
(page generated 2022-03-12 23:00 UTC)