[HN Gopher] Immudb 1.0 - open-source, immutable database with SQ...
___________________________________________________________________
Immudb 1.0 - open-source, immutable database with SQL and verified
timetravel
Author : vchain-dz
Score : 247 points
Date : 2021-05-25 11:49 UTC (11 hours ago)
(HTM) web link (www.codenotary.com)
(TXT) w3m dump (www.codenotary.com)
| nerdponx wrote:
| So is this something I would want to use for a basic CRUD
| application, and reap the benefits of time travel and
| immutability?
|
| Or are there downsides that would relegate it to specific use
| cases? And what would those use cases be?
| brokencube wrote:
| It wouldn't be suitable for any application where you care
| about GDPR (i.e. you store personal information and have users
| in the EU)
|
| The "right to be forgotten" is not compatible with immutable
| data. You can't simply mark data as deleted; you need to
| 'purge' it from your system (and possibly backups, depending
| on how long you keep historic backups) - that isn't possible in
| a system with immutable data.
| blablabla123 wrote:
| I mean there are solutions for this. About CQRS/Event
| sourcing I've read that it's possible to solve it by
| encrypting the data with different keys and then
| rotating/throwing away the keys every now and then. Seems a
| bit hacky but probably there are more elegant approaches.
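
A minimal sketch of the "throw away the key" idea above (often called crypto-shredding). Everything here is an assumption for illustration: the `ShreddableStore` class and its method names are hypothetical, and a one-time pad is used only so the example stays within the Python standard library. A real system would more likely use an authenticated cipher such as AES-GCM and keep the keys in a separate, mutable store.

```python
import secrets


class ShreddableStore:
    """Append-only ciphertext log plus a separate, deletable key table."""

    def __init__(self):
        self.log = []   # immutable part: (owner_id, ciphertext), never removed
        self.keys = {}  # mutable part: log index -> key for that record

    def append(self, owner_id, plaintext: bytes) -> int:
        # Fresh random key as long as the record, used exactly once.
        key = secrets.token_bytes(len(plaintext))
        ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
        self.log.append((owner_id, ciphertext))
        index = len(self.log) - 1
        self.keys[index] = key
        return index

    def read(self, index: int) -> bytes:
        _, ciphertext = self.log[index]
        key = self.keys[index]  # raises KeyError once the key is shredded
        return bytes(c ^ k for c, k in zip(ciphertext, key))

    def forget(self, owner_id) -> None:
        # The log itself is untouched; only the keys are destroyed.
        for index, (owner, _) in enumerate(self.log):
            if owner == owner_id:
                self.keys.pop(index, None)
```

After `forget("alice")` the ciphertext is still in the log, but reading it raises `KeyError` because the key no longer exists. Note that `owner_id` is stored in the clear here, so it would need to be an opaque identifier rather than personal data.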
| LukeEF wrote:
| There seems to be a growth in the number of time traveling
| immutable-first databases available. We have OpenCrux, Datomic,
| TerminusDB, Noms, Dolt, and now Immudb. Three using datalog for
| query and two forcing SQL (not sure about Noms).
|
| What sort of use cases are most common? GitHub repository says:
|
| > Companies use immudb to protect credit card transactions and to
| secure processes by storing digital certificates and checksums.
|
| But I am not sure how people are building that into their
| architecture to be honest.
| clusterhacks wrote:
| I have used "immutable" schema designs when there were strong
| requirements for full audit needs over time. It works very well
| even in a normal RDBMS. It also allows some very neat
| reporting e.g. compare the same report at different points in
| time.
|
| The basic idea was that every operation (create, update,
| delete) is actually a normal SQL insert, and all reads are
| against views defined such that the most recent tuples are
| returned unless they are flagged as "deleted."
|
| I have typically used these types of designs in mostly simple
| applications with tables where the row counts are in the low
| millions of tuples. Dealing with this design in the billions of
| tuples (probably sharded somehow) might have motivated us out
| of a normal RDBMS and into one of the specialized immutable DBs
| mentioned.
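
The insert-only design described above can be sketched with SQLite from the Python standard library. The table, view, and function names (`account_log`, `account`, `write`, `as_of`) are made up for this example, and the schema is an assumption about what such a design might look like, not the commenter's actual one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account_log (
    version INTEGER PRIMARY KEY AUTOINCREMENT,  -- global write order
    id      INTEGER NOT NULL,                   -- logical row identity
    balance INTEGER,
    deleted INTEGER NOT NULL DEFAULT 0          -- 1 marks a logical delete
);

-- Reads go through this view: latest tuple per id, unless flagged deleted.
CREATE VIEW account AS
SELECT l.id, l.balance
FROM account_log AS l
WHERE l.deleted = 0
  AND l.version = (SELECT MAX(i.version)
                   FROM account_log AS i
                   WHERE i.id = l.id);
""")


def write(id_, balance, deleted=0):
    """Create, update and delete are all plain INSERTs."""
    cur = conn.execute(
        "INSERT INTO account_log (id, balance, deleted) VALUES (?, ?, ?)",
        (id_, balance, deleted),
    )
    return cur.lastrowid  # the version number of this write


def as_of(version):
    """The same 'report' as the view, but as it looked at an older version."""
    return conn.execute(
        """
        SELECT l.id, l.balance
        FROM account_log AS l
        WHERE l.deleted = 0
          AND l.version = (SELECT MAX(i.version)
                           FROM account_log AS i
                           WHERE i.id = l.id AND i.version <= ?)
        ORDER BY l.id
        """,
        (version,),
    ).fetchall()
```

Calling `write(2, None, deleted=1)` hides row 2 from the `account` view while `as_of()` with an earlier version still returns it, which is the "compare the same report at different points in time" property.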
| ak39 wrote:
| Thanks. Can you comment on how this differs from a "mutable"
| RDBMS model but one with automatic history based on triggers
| for example?
| gregwebs wrote:
| > Data stored in immudb is cryptographically coherent and
| verifiable, just like blockchains, but without all the
| complexity. Unlike blockchains, immudb can handle millions of
| transactions per second, and can be used both as a lightweight
| service or embedded in your application as a library.
|
| > Companies use immudb to protect credit card transactions and to
| secure processes by storing digital certificates and checksums.
|
| This explanation is available on their github repo [1]. It has
| been a common refrain on Hacker News that you don't need a
| blockchain and instead can just use a database, but this product
| may actually fill the gap where tamper resistance is desired.
|
| [1] https://github.com/codenotary/immudb
| simtel20 wrote:
| Yeah, this interests me because I'm thinking about how to use
| grafeas - its role is critical for reliable software
| development going forward - but storing its data in a backend
| like this would add one more layer of trust and verifiability
| to a software supply chain. There are some interesting
| possibilities with making e.g. public software repos' metadata
| clonable verifiable and queryable via local immutable copies.
| decodebytes wrote:
| Maybe take a look at rekor, part of the sigstore project,
| it's built specifically for software supply chain
| transparency (disclaimer I am one of the community):
|
| https://github.com/sigstore/rekor
| c01n wrote:
| Does immudb offer mechanisms for distributed consensus? That
| is one of the top features of blockchains, and they do this
| while remaining P2P.
| jeroiraz wrote:
| the order of changes is not subject to consensus, but clients
| have the tools to ensure no history rewrite happened
| judge2020 wrote:
| sounds like git :)
| stingraycharles wrote:
| I think both blockchains and git are based on the concept
| of merkle trees, so that sounds about right.
|
| https://en.m.wikipedia.org/wiki/Merkle_tree
| c01n wrote:
| Can Immudb work in a decentralized network while remaining
| secure from attacks in such networks, or is Immudb meant for
| centralized systems? If so, I think you cannot compare it to
| blockchains. Maybe a better comparison is Git.
| jeroiraz wrote:
| immudb is not meant for public decentralized networks,
| although it might be possible to use embedded immudb to
| build a public blockchain... but that's a different
| story. immudb server is tailored to provide a database
| where any tampering will be subject to detection by any
| single client application consuming its data.
| decodebytes wrote:
| Maybe take a look at rekor, part of the sigstore project, it's
| built specifically for software supply chain transparency
| (disclaimer I am one of the community). Being a transparency
| log, you get much better guarantees around inclusion proof (it
| uses a merkle tree):
|
| https://github.com/sigstore/rekor
| cryptonector wrote:
| It sounds a lot like ZFS.
|
| What I really want is a way to get a hash of a root node /
| snapshot.
| simias wrote:
| Was there a gap in the first place? We could design
| tamper-proof data storage way before the blockchain. All you
| need is checksums, public key cryptography and a way to publish
| your signed checksums.
|
| I'm not saying that this isn't a good project but it's a bit
| strange to frame it as if it was a major technical
| breakthrough.
|
| If anything what catches my eye in this announcement is the
| "time travel" feature as well as the wire protocol
| compatibility with Postgres, that's pretty cool.
| capableweb wrote:
| I don't quite understand how something you run yourself on your
| own hardware can be tamper-proof (digitally, not physically).
| If you're running the software you can modify it, so no matter
| how many processes there are in place for resisting mutability,
| you'll always be able to find some way to mutate it.
|
| Compared to blockchain which is running on X number of nodes
| that you'd have to have access to in order to modify something,
| immudb doesn't actually seem to replace the use case when you
| need something actually tamper-proof.
| jeroiraz wrote:
| the entire state of the database gets captured by a hash
| value. Having light-weight clients (or auditors) keep track
| of it is how tampering is detected, regardless of where the
| database server is running
| exfalso wrote:
| This is insufficient. The strongest guarantee you can get
| without consensus is that the state of the DB you see on
| the client is/was a correct state at some point, it doesn't
| provide freshness/rollback attack prevention, aka that the
| state you see is in fact the latest one.
|
| Keeping track of the "HEAD" hash on the clients _is_ what
| consensus protocols achieve. You can also achieve it with
| trusted counters like the one SGX provides (depends on
| Intel ME so not exactly recommended, also most probably
| switched off in cloud environments). Alternative is an
| implementation of something like
| https://dl.acm.org/doi/10.5555/3241189.3241289.
|
| You can of course say that it's the clients' responsibility
| to do this, but in practice they won't and they'll
| implicitly trust the server state.
|
| Having said this, the project does look promising, we may
| end up using it in a confidential compute setting where
| clients can verify the server code running, and we'll add
| rollback protection on top
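
A rough sketch of what "keeping track of the HEAD hash on the client" could look like. This assumes a simple hash chain over the log; `Auditor`, `extend` and `head_of` are hypothetical names, not immudb's API, and a production system would presumably use logarithmic-size consistency proofs instead of replaying every new entry.

```python
import hashlib

GENESIS = b"\x00" * 32


def extend(head: bytes, entry: bytes) -> bytes:
    """New head = H(old head || H(entry)), so the head commits to all history."""
    return hashlib.sha256(head + hashlib.sha256(entry).digest()).digest()


def head_of(entries) -> bytes:
    """Head hash of a whole log (what an honest server would report)."""
    head = GENESIS
    for entry in entries:
        head = extend(head, entry)
    return head


class Auditor:
    """Client that remembers the last state it accepted."""

    def __init__(self):
        self.size = 0
        self.head = GENESIS

    def observe(self, claimed_size, claimed_head, new_entries):
        """new_entries: what the server says was appended since self.size."""
        if claimed_size < self.size:
            raise ValueError("rollback: state is older than one already seen")
        if claimed_size != self.size + len(new_entries):
            raise ValueError("claimed size does not match supplied entries")
        head = self.head
        for entry in new_entries:
            head = extend(head, entry)
        if head != claimed_head:
            raise ValueError("history rewritten: heads do not line up")
        self.size, self.head = claimed_size, claimed_head
```

This only catches rollbacks relative to what this particular client has already seen. A client that starts fresh, or a server that shows different clients different histories, is exactly the gap the comment above points to.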
| toolslive wrote:
| > aka that the state you see is in fact the latest one.
|
| This is an impossible guarantee. Suppose the state that
| is sent to you from the server needs some time to get to
| you. Meanwhile the state on the server could have
| changed. You don't even need a remote server to have this
| issue. Your thread (where you see the latest state) is
| put to sleep for a while (scheduler, os, ...) It wakes up.
| Is the state it observes still the latest? That's
| impossible to know. The only thing you can do is to
| refuse future updates if the state they were built upon
| is not the current state of the database.
| capableweb wrote:
| I see. It's a blockchain without calling it a blockchain,
| so people who hate blockchain can use it without having to
| realize they use a blockchain.
| ForHackernews wrote:
| It's only the actually-useful bits of a "blockchain"
| without the planet-cooking proof-of-waste consensus
| algorithm brute-forcing sha256 over and over again.
| f38zf5vdt wrote:
| And with git being the most superior blockchain of them
| all.
| jacquesm wrote:
| Blockchain is just a special case of Merkle trees, there
| isn't anything original about them other than that
| Bitcoin served as a marketing engine for the term
| blockchain because some people made a ton of money with
| it.
|
| https://en.wikipedia.org/wiki/Merkle_tree
| [deleted]
| joshuak wrote:
| No, block chains are different than Merkle trees entirely.
| Block chains include previous hashes in each block,
| whereas Merkle trees, as the name implies are trees of
| block hashes. In Merkle trees blocks do not include the
| previous block's hash.
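
To make the structural difference concrete, here is a small sketch of both constructions over the same list of byte strings. The function names and the leaf/node prefixes are choices made for this example. For context, Bitcoin uses both: each block header carries the previous block's hash and also a Merkle root of that block's transactions.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def chain_head(blocks) -> bytes:
    """Hash chain: every step hashes the previous hash plus the new payload."""
    prev = b"\x00" * 32
    for payload in blocks:
        prev = h(prev + payload)
    return prev


def merkle_root(blocks) -> bytes:
    """Merkle tree: hash leaves, then hash pairs upward until one node is left."""
    if not blocks:
        raise ValueError("need at least one block")
    level = [h(b"\x00" + b) for b in blocks]  # 0x00 prefix marks a leaf
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level), 2):
            if i + 1 < len(level):
                nxt.append(h(b"\x01" + level[i] + level[i + 1]))  # inner node
            else:
                nxt.append(level[i])  # odd node is promoted unchanged
        level = nxt
    return level[0]
```

Both values change if any block changes. The difference is in how they are built: the chain is strictly sequential, while the tree hashes pairs level by level, which is what later makes short proofs about a single block possible.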
| decodebytes wrote:
| Rekor is just that. It's a merkle tree implementation
| (with extras such as timestamping)
|
| https://github.com/sigstore/rekor
| [deleted]
| foepys wrote:
| Blockchains like Bitcoin are actually not tamper-proof. They
| can be attacked by 51% attacks where you can even rewrite
| history if you have enough hashpower. The protocol is
| _explicitly_ designed to always follow the longest chain thus
| the only defense is to hash faster than the attackers. This
| might vary for other blockchains but the biggest and most
| mentioned is particularly unsafe in that regard.
| kdragon wrote:
| You can't rewrite the history of blocks that have already
| been distributed. You may fool SPV nodes but any node with
| a copy of the blockchain (even if pruned) will reject your
| version.
| [deleted]
| PhilippGille wrote:
| There are two things you skip over:
|
| 1. You can't rewrite everything. Given enough hash power to
| create a longer chain, you can create a block that a)
| removes any transactions from a block in the original chain
| and b) contains new valid transactions (must be signed by
| you so you must be the owner of the Bitcoins used in the
| tx), allowing you to double spend your coins, but you can't
| change other people's transactions.
|
| 2. With each new block changing a past one becomes harder,
| while you make it sound as if you could arbitrarily rewrite
| history. Merchants usually wait several blocks before
| accepting your on-chain payment. Exchanges wait 6 blocks, as
| it's seen as infeasible for non-nation-state actors to change
| a block that's buried under 5 other blocks.
|
| TL;DR: 1) Other people's transactions can at most be
| removed but not be changed and 2) data on the Bitcoin
| blockchain is tamper proof after x blocks.
| rcoder wrote:
| You can build a Merkle tree from any data that is append-
| only; Git does this, as do ZFS and Dat/Hypercore. That lets
| you make strong assertions about data integrity, even without
| blocking local writes.
|
| Now add a mirror: git upstream, FS snapshot, immudb replica,
| etc...or even just an outside log of the merkle proofs
| themselves. Then, if your database ever fails a check against
| that proof, you know the data has been modified, not just
| appended to.
|
| To use a familiar Git workflow example: you can do whatever
| local writes you want, but if you disallow force pushes no
| one can erase history on the upstream repo.
|
| Put another way: if you have immutable backups you don't need
| a blockchain to ensure data integrity. OTOH, if you can't
| trust your own infrastructure even as far as a secure remote
| backup you have other problems that a blockchain won't solve
| either.
| rhacker wrote:
| you can do something quite simple like posting a tweet or
| inserting something into a public chain, like Ethereum. Then
| follow that back to the private immudb hash.
| theamk wrote:
| You can still have the extra verifier nodes, but those don't
| have to be on the critical read/write path.
|
| Presumably you can create a config where you have your "main"
| beefy server where all the activity is -- which is backed up,
| redundant, etc... And a bunch of "client" servers, which just
| pull and verify the data all the time. And the client servers
| notify if there are any errors using some out-of-band
| channel, probably using the same system you use for general
| server health monitoring.
|
| So you are getting same security guarantees as "private
| blockchain", but with drastically higher performance, and
| only needing one beefy server. And the downside is that you
| won't auto-stop all operations on tampering, you'll only get
| an alert for it.
| akiselev wrote:
| _> immudb doesn't actually seem to replace the use case when
| you need something actually tamper-proof._
|
| I think that's an unrealistic requirement. There's tamper-
| evident and tamper-resistant but AFAIK _nothing_ is tamper
| proof. Best you can do is an HSM with a tamper resistant HMAC
| with keys and a running checksum in unrecoverable ROM coupled
| to the packaging.
| capableweb wrote:
| > nothing is tamper proof
|
| I beg to differ.
|
| If I place a signed message in the Bitcoin chain, can you
| then modify that message?
|
| If you can prove that you can somehow modify the message,
| I'll give you $1,000,000 USD tomorrow.
| akiselev wrote:
| That's "tamper proof" in the colloquial sense. As a term
| of art, it means something very specific. For example,
| see FIPS 140-2/3 [1]
|
| It makes no sense to say that the blockchain is tamper
| proof because the blockchain is just information. Tamper
| "proofness"/resistance is first a property of the devices
| storing the information - once you get into custody
| chains, provenance documents, etc. that's when a system
| becomes tamper resistant. At best the blockchain as a
| system is "tamper evident" in the colloquial sense
| because the network of all the other nodes decides which
| bits of information form the "real" blockchain. However,
| without verifying the (physical) identity and data
| integrity of the devices that run (at least?) 50%+1 of
| the nodes, you have no idea whether the system has been
| tampered with.
|
| [1] https://en.wikipedia.org/wiki/FIPS_140
| k_ wrote:
| Not after you post it, but by infecting your device
| before you make that message, and tampering when you
| place it in the Bitcoin chain
| [deleted]
| codetrotter wrote:
| > this product may actually fill the gap where tamper
| resistance is desired
|
| I think in the future, all enterprise storage solutions will be
| append-only by default. To protect against cryptolocker
| malware. But also with isolated functionality for actually
| deleting data, for example because of GDPR requests or because
| of malware that tries to fill all writable storage with
| garbage. So that data can still be deleted, but not from any of
| the regular servers that are reading and appending data to the
| system. Instead from separate servers that are isolated and for
| data storage management only.
| ampdepolymerase wrote:
| How does this compare feature wise to
| https://aws.amazon.com/qldb/
| jeroiraz wrote:
| there are many differences (as immudb contributor):
|
| - immudb can be used embedded or as a client-server database,
|   while qldb is an AWS service
| - immudb behaves as a key-value store but also provides SQL
|   support, while qldb provides a document-like data model with
|   the PartiQL language
| - immudb provides time travel features
| - immudb is faster, with a built-in mode of operation designed
|   for fast writes which works with eventual indexing.
|
| Finally but super important, immudb can be deployed anywhere
| and it's open source!
| giaour wrote:
| QLDB provides time travel features, too (if by "time
| travel" you mean being able to query the state of a record
| at an arbitrary point in the past):
| https://docs.aws.amazon.com/qldb/latest/developerguide/worki...
| jeroiraz wrote:
| immudb already included history support for key-value
| entries in previous releases. But since v1.0.0, immudb
| provides query resolution at a given point in time, using the
| data as it was at that specific moment, but also being able
| to combine data from different points in time in the same
| query. It is not clear to me if that can be achieved with
| "SELECT * FROM history"; it requires at most one result per
| distinct entry (the most recent one)
| giaour wrote:
| QLDB is a document DB, so you are limited to a single
| point or range per query. Also keep in mind `history` in
| QLDB is a function, not just a store of previous values;
| given a table "foo" and a key "bar", getting its
| immutable state from last Tuesday at 4 PM EDT would be:
|
| SELECT * FROM history('foo', `2021-05-18T20:00:00`,
| `2021-05-18T20:00:00`) as t WHERE t.metadata.id = 'bar';
| jeroiraz wrote:
| temporal features provided in immudb allow query (and
| subquery) resolution based on older states of the
| database. So for instance, it can be thought of as
| retrieving the documents in their current state in a given
| time range. Querying the history of changes of a given
| key or document is slightly different and it's also
| covered with history operation in immudb.
| giaour wrote:
| Ok, that sounds extremely similar to the history function
| in QLDB.
|
| In the examples shown in the AWS docs, the results of a
| historical query are not changes made to the document,
| but the fully resolved state of a document at the
| requested timestamp (or within the timestamp range). Like
| other threads on this page mention, this is an unusual
| but not uncommon DB feature these days.
| deknos wrote:
| this is hugely interesting, i have to look into this, but... for
| dev/test environments, can i have a "unverified" version, where
| clients reget/reset the state?
| supergirl wrote:
| not exactly immutable is it? their docs say you can do UPSERT for
| example. the key is that once you update something, the clients
| can check using crypto that something was changed. you can't do
| this in regular databases.
| dmacvicar wrote:
| Immutable in the sense that the old value is preserved, even if
| you update it, and you can't change the history (tamper-
| evident).
| boshomi wrote:
| GDPR requires you to erase user data if users withdraw their
| consent or if their data is no longer required for the purpose
| for which you originally collected or processed it.
|
| Therefore, you must carefully check that no personal data is
| stored in immutable databases.
| endisneigh wrote:
| How is this any different than taking every mutation, signing it
| using whatever signing mechanism you'd like and adding a column,
| in addition to the ones you'd like, with the hash?
|
| Then, if anything changes you know it's been mutated because the
| computed signature has changed.
| jeroiraz wrote:
| In some way, it's basically that but on steroids... Note that
| if the signature includes the previous one then you are
| protecting the history of changes. However, this simple
| approach may not scale when dealing with a considerable amount
| of data: proving that some older entry was not tampered with
| may require validating all signatures from that point up to
| the latest one. immudb employs hash trees to optimise these
| proofs.
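
As an illustration of why a hash tree helps here, the sketch below builds a Merkle tree over a list of entries and checks a single entry against the root using only a logarithmic number of sibling hashes. This is a generic construction with made-up function names, not immudb's actual data structure or proof format.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf(entry: bytes) -> bytes:
    return h(b"\x00" + entry)


def node(left: bytes, right: bytes) -> bytes:
    return h(b"\x01" + left + right)


def build_levels(entries):
    """All tree levels, leaves first; the root is levels[-1][0]."""
    if not entries:
        raise ValueError("need at least one entry")
    levels = [[leaf(e) for e in entries]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        levels.append([
            node(cur[i], cur[i + 1]) if i + 1 < len(cur) else cur[i]
            for i in range(0, len(cur), 2)  # an odd last node is promoted
        ])
    return levels


def prove(levels, index):
    """Sibling hashes from the leaf up to the root: [(hash, sibling_is_left)]."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling < len(level):
            proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(entry, proof, root) -> bool:
    """Recompute the root from one entry and its proof."""
    acc = leaf(entry)
    for sibling, sibling_is_left in proof:
        acc = node(sibling, acc) if sibling_is_left else node(acc, sibling)
    return acc == root
```

With 1000 entries a proof holds at most 10 hashes, whereas walking a chain of signatures from an old entry to the latest one could touch up to 1000 of them.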
| ianpurton wrote:
| Your solution wouldn't handle the case of row deletion.
|
| It's a little harder than you might think to make a database
| with tamper resistance.
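
The deletion problem can be shown in a few lines. In this sketch (hypothetical helper names, a demo key, HMAC standing in for "whatever signing mechanism you'd like"), per-row tags catch edits but not a removed row, while chaining each tag to the previous one and remembering the last tag somewhere else catches both.

```python
import hashlib
import hmac

KEY = b"demo-signing-key"  # placeholder; a real key would be kept secret


def sign(data: bytes) -> bytes:
    return hmac.new(KEY, data, hashlib.sha256).digest()


def sign_rows(rows):
    """Independent tag per row: [(row, tag)]."""
    return [(row, sign(row)) for row in rows]


def verify_rows(table) -> bool:
    return all(hmac.compare_digest(tag, sign(row)) for row, tag in table)


def sign_chained(rows):
    """Each tag also covers the previous tag, so order and presence matter."""
    table, prev = [], b"\x00" * 32
    for row in rows:
        prev = sign(prev + row)
        table.append((row, prev))
    return table


def verify_chained(table, expected_last: bytes) -> bool:
    """expected_last is the final tag, remembered outside the database."""
    prev = b"\x00" * 32
    for row, tag in table:
        prev = sign(prev + row)
        if not hmac.compare_digest(tag, prev):
            return False
    return hmac.compare_digest(prev, expected_last)
```

Dropping a row from the output of `sign_rows` still passes `verify_rows`. Dropping a row from the output of `sign_chained`, including the last one, fails `verify_chained` as long as the expected final tag is stored out of the attacker's reach.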
| hypertele-Xii wrote:
| According to its own description, this database does not
| support deletion at all.
|
| "You can [...] never change or delete records."
| endisneigh wrote:
| Oh I'm sure - but without delving into philosophy, how would
| you know that something was deleted and tampered with vs.
| Immudb (for example) being compromised and turns out it's
| possible to delete something without you knowing vs. it never
| existed to begin with?
|
| In my mind the only way to guarantee is to maintain a copy
| yourself and check against the "original", but if you're
| going to do that, then what I described is sufficient, no?
|
| I only mention this because the project mentions that the
| history is protected by clients, which I imagine is similar
| to what I'm describing, e.g. copying and checking against the
| original.
| ianpurton wrote:
| > In my mind the only way to guarantee is to maintain a
| copy yourself and check against the "original", but if
| you're going to do that, then what I described is
| sufficient, no?
|
| The attacker in that case could update your copy. But you
| have somewhat started to fix the issue.
|
| To cover the case where a bad admin has access to the DB
| and any copies, you need to send a hash every so often to
| an outside source. In this case they use clients (I'm not
| sure exactly how they do this).
|
| In fact you need a list of hashes, one for every 100 rows
| for example. Regenerating the hashes and checking against
| an external source should detect a tamper.
|
| In the case of Bitcoin (which is extremely tamper
| resistant) every node operator is a validator. The hashes
| are stored in a merkle tree.
| foobarbazetc wrote:
| Definitely not the first database to allow time travel, TM or
| not.
| slver wrote:
| I think it's the first to allow it with TM.
| alrs wrote:
| > For any question contact us on Discord.
|
| Hard no.
| endymi0n wrote:
| There, I did it for you in PostgreSQL: ALTER TABLE table_name SET
| (autovacuum_enabled = false);
|
| Snark aside, it's still not 100% clear what's the upside of using
| a completely different database, just for that use case.
| _bohm wrote:
| Huh? Dead tuples are not queryable in Postgres.
| anentropic wrote:
| > immudb is the first database which allows you to do queries
| across time.
|
| I don't think it is
|
| e.g. Datomic already had this for a long time, no?
| dspillett wrote:
| Several databases (MS SQL Server, MariaDB, Postgres with
| appropriate extension) support system versioned temporal tables
| (added in the SQL2011 standard, though I don't know if any DB
| entirely follows the standard) which I'm pretty sure counts as
| "queries across time".
|
| Maybe they are claiming to be the first with it built-in as a
| core part of the engine that it is specifically optimised for,
| but even that might not be true.
| refset wrote:
| > even that might not be true
|
| It's not. For example, see SAP HANA's "Timeline Index"
| https://websci.informatik.uni-freiburg.de/publications/sigmo...
| [deleted]
| foobarbazetc wrote:
| Yeah, it's not. Which makes the rest suspect.
| branko_d wrote:
| Oracle has had flashback queries for a long time.
|
| Though this does not do what immudb claims:
|
| > _immudb is the first database to provide tamper-evident data
| management, immutable history and client-cryptographic proof._
|
| And:
|
| > _Clients do not need to trust the server and every new client
| adds trust to the deployment_
| waheoo wrote:
| Yes. https://youtube.com/watch?v=Cym4TZwTCNU
| endymi0n wrote:
| Datomic does, so does Oracle, Snowflake and BigQuery.
| CharlesW wrote:
| Teradata Vantage, too.
| cbsmith wrote:
| Yeah, that line was a real head scratcher. I think someone in
| the marketing department got a bit ahead of their reality.
| foobarbazetc wrote:
| It's not marketing as much as it is "try to get a patent for
| something that's been done for decades by doing it slightly
| differently".
| [deleted]
| chatmasta wrote:
| We're building something similar to this at Splitgraph, at
| least in the sense that we have immutable data in a Postgres-
| compatible DB with point-in-time queries across versioned,
| addressable snapshots. In our case, we apply the idea of
| immutability to "data images" that are analogous to Docker
| images. You build and push them in the same way, and then you
| can reference any "image" (version) [0] of data by addressing
| it with the correct tag.
|
| For example, here is a link to a live query on our Data
| Delivery Network (DDN) that runs a JOIN on two daily snapshots
| (20200809 and 20200810). [1] In this case, these images are the
| result of a daily script that builds and pushes a new image
| each day. The storage costs are minimal, as each new image only
| needs to store the changed rows, rather than a duplicative
| snapshot.
|
| Each immutable image is comprised of a set of small content-
| addressable cstore fragments uploaded to object storage, which
| we only load into the database when they become necessary to
| satisfy a query. When a query arrives at the DDN, we intercept
| it at the network level by scripting PgBouncer with embedded
| Python to orchestrate the infrastructure required to answer the
| query. The embedded code parses the AST of the query for table
| references, which it uses to "mount" a temporary schema for
| serving the query. The temporary schema includes an FDW that
| implements a "layered querying" protocol (think AUFS) to lazily
| download only the fragments required to satisfy the query.
|
| (Also, we support live data. But that's for another time!)
|
| [0] https://www.splitgraph.com/docs/concepts/images
|
| [1]
| https://www.splitgraph.com/workspace/ddn?layout=hsplit&query...
| ignoramous wrote:
| Doesn't Bigtable, according to the 2006 paper, allow for this
| too?
|
| > _Each cell in a Bigtable can contain multiple versions of the
| same data; these versions are indexed by timestamp. Bigtable
| timestamps are 64-bit integers. They can be assigned by
| Bigtable, in which case they represent realtime in
| microseconds..._
|
| https://research.google/pubs/pub27898.pdf
| parentheses wrote:
| Seems like a database that stores content hashes. Very cool but
| what makes it better than simply adding a table to my database
| (or a DB specifically for this) and running `insert into
| content_hashes...`?
|
| The above approach also allows me to choose any database because
| I can model this data however I want.
| jeroiraz wrote:
| immudb can hold the actual data. An equivalent approach using
| an existing database without these features would involve
| creating a cryptographic data structure which captures not only
| individual content but the entire history of changes, and also
| having the functionality to construct and verify the
| cryptographic proofs to validate read data
| hutrdvnj wrote:
| What happens if you have to delete some data e.g. due to law?
| jacquesm wrote:
| You have several options here:
|
| - store the data encrypted using a secondary protocol, lose the
| key
|
| - rewrite the whole db
|
| If either of these is not feasible then you should have thought
| longer about what tech is suitable for which application.
| Operating your company in a legal manner is a pretty strong
| factor when making such choices.
| remram wrote:
| Is losing the key sufficient to comply with the law? "We
| didn't actually delete anything but I promise I don't
| remember how to decrypt it" would be acceptable for the court
| to not e.g. seize your drives?
| speed_spread wrote:
| It's the same as "we actually deleted the data and I
| promise we didn't keep any backup copies", except it's
| probably even easier to enforce, since you already have to
| secure the key instead of the whole database.
| imhoguy wrote:
| IANAL. With the GDPR right to be forgotten you need to get rid
| of any identifiable subject information. If you can't tell a
| subject from data then you comply. Encrypted data without a
| key is just noise.
|
| You are allowed to keep aggregations and hashes of data.
| These shouldn't allow identifying a subject. E.g. you can
| keep list of banned emails as MD5s to verify on sign up
| etc.
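
The ban-list idea above might look like the sketch below. MD5 is used only because the comment names it; the function names and the normalization step (trim and lowercase) are assumptions. Whether an unsalted hash of an email address is anonymous enough is a legal question this example does not settle, since common addresses can be guessed and hashed.

```python
import hashlib


def fingerprint(email: str) -> str:
    """Hash of the normalized address; the address itself is not stored."""
    normalized = email.strip().lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


# Only fingerprints are kept after the account data is erased.
banned = {fingerprint("Spammer@Example.com")}


def may_sign_up(email: str) -> bool:
    return fingerprint(email) not in banned
```

A sign-up attempt with ` spammer@example.com ` is rejected because it normalizes to the same fingerprint, while any other address passes.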
| remram wrote:
| In this situation though, any client who still knows the
| key can access the data, since there is no way to remove
| data from the database server, or make it unavailable at
| the server level.
|
| Assuming the clients and server are operated by different
| entities (otherwise the immutability and verifiability
| are not that interesting), if someone comes to the server
| operator with a court order and asks that data be removed,
| it seems like there is nothing they can do.
| setr wrote:
| You can't do much of anything if you've already given
| away the information in question -- the same is true if
| someone copied the data itself.
|
| You have to not give away the key in the first place, at
| least not to any clients that you don't own.
|
| E.g. following the rule "any problem can be solved with a
| level of indirection", external clients get some Auth key
| A, which they feed to internal client, who internally
| maps it to some data key B, and decrypts the data and
| hands it back to the external client.
|
| When the data is removed, you delete the mapping from
| your internal client.
| hutrdvnj wrote:
| > store the data encrypted using a secondary protocol, lose
| the key
|
| Thing is that you have to do this upfront. I think it's very
| possible to get into a situation where the data you have to
| delete is in plaintext. Dropping the whole DB and recreating
| it from scratch is a bit hefty.
| cyberge99 wrote:
| I love what you've done. I think you may have an issue with the
| TimeTravel trademark however. Snowflake uses it in your exact
| market segment (not to mention where else it may be used in a
| similar context). Good stuff though, I'll be checking it out.
| artemonster wrote:
| Can someone ELI5 what is an "immutable database"? If you can add
| to the table, that means mutation, right? I am missing
| something...
| f38zf5vdt wrote:
| SQL system versioned tables but with git hash tree versioning
| for every mutable command.
| goto11 wrote:
| It basically means "append only". You can add new data to the
| database, but you can't change or delete existing data.
| dspillett wrote:
| _> immudb is the first database to provide tamper-evident data
| management, immutable history and client-cryptographic proof.
| Every change is preserved and can't be changed without clients
| noticing._
|
| Sounds like they are recording all changes (like SQL2011's
| system versioned tables, as implemented more-or-less by several
| common DB engines) but with some sort of hash-chain ledger so
| that history can be verified and therefore any tampering
| detected.
|
| _> If you can add to the table, that means mutation, right?_
|
| It isn't keeping the current view of the data immutable, but is
| keeping an immutable history of the data. It is immutable in
| the sense that nothing written to it is ever lost, and you can
| use the "time-travel" query functions (like SELECT stuff FROM
| atable FOR SYSTEM_TIME AS OF '2021-03-05') to retrieve it even
| if it looks to have been completely mangled or deleted if you
| use a non-time-travelling query.
| qsort wrote:
| It's immutable in the same sense a purely functional data
| structure is immutable. You represent mutation by making a new
| version of the data structure. Of course you don't _literally_
| do that on the database because it would be inefficient, but
| there are several algorithmic tricks that can expose an
| interface that works _as if_.
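
One such trick is path copying with structural sharing, sketched here on a tiny unbalanced binary search tree. The `Node`, `put` and `get` names are made up for the example, and a real engine would use something like a copy-on-write B-tree rather than this toy.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    key: str
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def put(root: Optional[Node], key: str, value: int) -> Node:
    """Return a NEW root; only nodes on the path to `key` are copied."""
    if root is None:
        return Node(key, value)
    if key < root.key:
        return root._replace(left=put(root.left, key, value))
    if key > root.key:
        return root._replace(right=put(root.right, key, value))
    return root._replace(value=value)


def get(root: Optional[Node], key: str) -> Optional[int]:
    while root is not None:
        if key == root.key:
            return root.value
        root = root.left if key < root.key else root.right
    return None
```

Every call to `put` yields a new version while older roots stay valid and unchanged, so keeping a list of roots gives a queryable history at the cost of one copied path per write rather than a full copy.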
| artemonster wrote:
| that makes sense on a language level, when you hold a
| reference to some data and you can assume nothing can be
| changed about it. how does that hold on DB level?
| qsort wrote:
| In the same way. A database is basically just a giant data
| structure, a table is not unlike a B-Tree (in some engines
| it _literally_ is a B-tree). Data warehouses already do
| something like this informally, as they are structured in a
| star schema around a single "append-only" fact table.
| [deleted]
| ianpurton wrote:
| You would be able to query and INSERT but not DELETE and
| UPDATE.
|
| This is useful, for example, in banking applications that keep
| an audit trail.
|
| A sysadmin would not be able to update or delete items in the
| audit table and so can't cover up a crime.
|
| If the database is tampered with at the file level, they have a
| way to detect that. (Probably some kind of merkle tree.)
| artemonster wrote:
| all right, makes perfect sense. thank you!
| JulianMorrison wrote:
| If this is deployed in a situation where record volumes are
| large, example: recording credit card transactions, there is
| going to have to be a process to "retire" old records (and
| perhaps, move them to external archives). The alternative is
| endlessly growing storage, and the resulting performance
| degradation.
|
| At a first glance, I don't see anything like that in there.
| 1cvmask wrote:
| Any major customers using this and if so how?
| tutfbhuf wrote:
| I would like to have such a database based on git. Where every
| change is a git commit. This should then work with things like
| github where you can connect to your database via github api. The
| db git repositories could be either private or even public. You
| can then deploy a serverless webpage to gh-pages and use a
| serverless gh-gitdb as storage.
|
| serverless := you don't have to operate the infrastructure
| yourself
| agbell wrote:
| It seems like this is somewhat in that direction. It looks like
| it is using merkle trees to store the history.
| quasiperson wrote:
| You should check out https://www.dolthub.com/ then. They are
| working on something very similar.
| lifty wrote:
| Check out https://replicache.dev and
| https://github.com/attic-labs/noms
| arpinum wrote:
| The QLDB performance comparison looks quite dodgy, but I can't
| find their QLDB benchmark code to see what they are doing wrong.
| 0xbadcafebee wrote:
| > This new functionality allows travel back in time through the
| data change history, and even compares these values in the same
| query!
|
| So we can actually treat our databases like immutable
| infrastructure and actually roll back changes now without the
| hulking kludge that is snapshots/restores and database
| migrations? That's game-changing.
| robto wrote:
| Reminds me a lot of Fluree[0], an immutable, cryptographically
| verifiable, temporal database, but with RDF as a query language,
| which I think is very nice. SQL is nice because it's familiar but
| it's honestly not that hard to improve on.
|
| [0] https://flur.ee/
___________________________________________________________________
(page generated 2021-05-25 23:00 UTC)