[HN Gopher] Data-at-Rest Encryption in DuckDB
       ___________________________________________________________________
        
       Data-at-Rest Encryption in DuckDB
        
       Author : chmaynard
       Score  : 215 points
       Date   : 2025-11-20 19:26 UTC (1 days ago)
        
 (HTM) web link (duckdb.org)
 (TXT) w3m dump (duckdb.org)
        
       | kianN wrote:
       | I'm just continually amazed by the DuckDB team. We had built out
       | a naive solution with OpenSSL to encrypt duckdb files, but that
       | led to a 2x runtime cost for first-time queries and used up a
       | lot of RAM because we were encrypting/decrypting the entire file
       | all at once. It seems like because DuckDB is encrypting at the
       | page level and leveraging modern processors' native AES
       | operations, they are able to perform reads/writes at practically
       | no cost.
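        
       A rough sketch of why page-level encryption avoids the whole-file
       cost described above: each fixed-size page gets its own nonce, so
       a reader can decrypt one page without touching the rest.
       Everything here is an assumption for illustration: the 4096-byte
       PAGE_SIZE, the layout, the function names, and above all the
       SHA-256 keystream, which stands in for AES only so the sketch
       runs on the Python standard library. It is not DuckDB's on-disk
       format and it is not secure.

```python
import hashlib
import os

PAGE_SIZE = 4096   # hypothetical page size, chosen for illustration
NONCE_SIZE = 12    # per-page random nonce stored in front of each page


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in keystream (SHA-256 in counter mode). NOT AES, NOT secure;
    it only lets the sketch run with the standard library."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def encrypt_page(key: bytes, page: bytes) -> bytes:
    """Encrypt one page under a fresh random nonce; returns nonce + ciphertext."""
    nonce = os.urandom(NONCE_SIZE)
    stream = _keystream(key, nonce, len(page))
    return nonce + bytes(a ^ b for a, b in zip(page, stream))


def decrypt_page(key: bytes, blob: bytes) -> bytes:
    """Split off the stored nonce and undo the XOR with the same keystream."""
    nonce, body = blob[:NONCE_SIZE], blob[NONCE_SIZE:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


def read_page(key: bytes, encrypted_file: bytes, page_no: int) -> bytes:
    """Decrypt a single page by offset, without touching the rest of the file."""
    slot = NONCE_SIZE + PAGE_SIZE
    start = page_no * slot
    return decrypt_page(key, encrypted_file[start:start + slot])


key = os.urandom(32)
pages = [bytes([i]) * PAGE_SIZE for i in range(4)]
encrypted_file = b"".join(encrypt_page(key, p) for p in pages)
page_two = read_page(key, encrypted_file, 2)  # only one slot is decrypted
```

       A whole-file scheme would have to decrypt all four pages to
       answer the same read. A real AES-GCM scheme would also carry an
       authentication tag per encrypted unit, which this sketch omits.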
        
         | PunchyHamster wrote:
         | Why not just LUKS? Kernel level, leverages acceleration,
         | transparent to anything you run on top of it.
         | 
         | DB encryption is useful if you have multiple things that need
         | separate ACLs and encryption keys, but if it is one app and one
         | DB there is no need for it.
        
           | letmetweakit wrote:
           | I believe it's also to protect against the occasionally
           | "lost" DB file.
        
           | beala wrote:
           | From the article:
           | 
           | > This allows for some interesting new deployment models for
           | DuckDB, for example, we could now put an encrypted DuckDB
           | database file on a Content Delivery Network (CDN). A fleet of
           | DuckDB instances could attach to this file read-only using
           | the decryption key. This elegantly allows efficient
           | distribution of private background data in a similar way like
           | encrypted Parquet files, but of course with many more
           | features like multi-table storage. When using DuckDB with
           | encrypted storage, we can also simplify threat modeling when
           | - for example - using DuckDB on cloud providers. While in the
           | past access to DuckDB storage would have been enough to leak
           | data, we can now relax paranoia regarding storage a little,
           | especially since temporary files and WAL are also encrypted.
        
           | kianN wrote:
           | We are in the separate ACL/encryption key bucket. We provide
           | a Bayesian data analytics platform/api for other companies.
           | Each company can have hundreds to thousands of datasets
           | ("indices") each of which has a separate encryption key, and
           | those keys are also stored encrypted with an organizational
           | level key that is rotated daily.
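        
           The two-level key scheme described above can be sketched as
           follows. This is not kianN's implementation: the names
           (wrap, unwrap, rotate, the "index-N" labels) are
           hypothetical, and the HMAC-based XOR pad is a stand-in so
           the example runs on the Python standard library. It is NOT a
           secure key-wrap; a real system would use an authenticated
           construction such as AES Key Wrap or AES-GCM. The point is
           only structural: rotating the organization key re-wraps the
           small dataset keys and leaves the datasets untouched.

```python
import hashlib
import hmac
import os


def _pad(kek: bytes, salt: bytes) -> bytes:
    # Stand-in for a real key-wrap cipher: NOT secure, illustration only.
    return hmac.new(kek, salt, hashlib.sha256).digest()


def wrap(kek: bytes, data_key: bytes) -> bytes:
    """Wrap a 32-byte dataset key under the organization key (kek)."""
    salt = os.urandom(16)
    return salt + bytes(a ^ b for a, b in zip(data_key, _pad(kek, salt)))


def unwrap(kek: bytes, wrapped: bytes) -> bytes:
    """Recover the dataset key from its wrapped form."""
    salt, body = wrapped[:16], wrapped[16:]
    return bytes(a ^ b for a, b in zip(body, _pad(kek, salt)))


def rotate(old_kek: bytes, new_kek: bytes, wrapped_keys: dict) -> dict:
    """Re-wrap every dataset key under the new organization key.
    The datasets themselves are not re-encrypted."""
    return {name: wrap(new_kek, unwrap(old_kek, w))
            for name, w in wrapped_keys.items()}


org_key_day1 = os.urandom(32)
dataset_keys = {f"index-{i}": os.urandom(32) for i in range(3)}
stored = {name: wrap(org_key_day1, k) for name, k in dataset_keys.items()}

# Daily rotation: only the wrapped keys change.
org_key_day2 = os.urandom(32)
stored = rotate(org_key_day1, org_key_day2, stored)
```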
        
         | notorious_pgb wrote:
         | With respect, none of this sounds like "amazing" work on
         | DuckDB's part. It's not bad work, either! It's competent work.
         | 
         | Comparing it to a naive approach (encrypting an entire database
         | file in a single shot and loading it all into memory at once)
         | is always going to make competent work seem "amazing".
         | 
         | I say this not to shit on DuckDB (I see no reason to shit on
         | them); rather, I think it's important that we as professionals
         | have realistic standards that we expect _ourselves_ to hit.
         | Work we view as "amazing" is work we allow ourselves not to be
         | able to replicate. But this is not in that category, and
         | therefore, you should hold yourself to the same standard.
        
           | kianN wrote:
           | I'm more amazed that they released this as part of their
           | open-source offering (not clear from my above comment).
           | Encryption is a standard lever for open-source projects to
           | monetize.
           | 
           | I run a small company and needed to budget a solid chunk of
           | time for next year to dig into improving this
           | component of our system. I respect your perspective around
           | holding high standards, but I do think it's worth getting
           | excited about and celebrating reliable performant software
           | that demonstrates consistent competence.
        
         | vjerancrnjak wrote:
         | It's just pipelining. Encryption is free compared to reads or
         | writes to storage.
        
       | glenjamin wrote:
       | Other than motherduck, is anyone aware of any good models for
       | running multi-user cloud-based duckdb?
       | 
       | ie. Running it like a normal database, and getting to take
       | advantage of all of its goodies
        
         | mritchie712 wrote:
         | For pure duckdb, you can put an Arrow Flight server in front of
         | duckdb[0] or use the httpserver extension[1].
         | 
         | Where you store the .duckdb file will make a big difference in
         | performance (e.g. S3 vs. Elastic File System).
         | 
         | But I'd take a good look at ducklake as a better multiplayer
         | option. If you store `.parquet` files in blob storage, it will
         | be slower than `.duckdb` on EFS, but if you have largish data,
         | EFS gets expensive.
         | 
         | We[2] use DuckLake in our product and we've found a few ways to
         | mitigate the performance hit. For example, we write all data
         | into ducklake in blob storage, then create analytics tables and
         | store them on faster storage (e.g. GCP Filestore). You can have
         | multiple storage methods in the same DuckLake catalog, so this
         | works nicely.
         | 
         | 0 - https://www.definite.app/blog/duck-takes-flight
         | 
         | 1 - https://github.com/Query-farm/httpserver
         | 
         | 2 - https://www.definite.app/
        
           | anentropic wrote:
           | I wonder if anyone has experimented with "Mountpoint for S3"
           | + DuckDB yet
           | 
           | https://docs.aws.amazon.com/AmazonS3/latest/userguide/mountp.
           | ..
        
             | sigwinch wrote:
             | The duckdb http extension reads S3 compatibles.
        
           | glenjamin wrote:
           | that looks neat, but how do you handle failover/restarts?
        
             | mritchie712 wrote:
             | in which one? restarts are no problem on ducklake (ACID
             | transactions in catalog)
             | 
             | I haven't tried handling it in the others.
        
         | derekhecksher wrote:
         | https://github.com/gizmodata/gizmosql
        
         | tempest_ wrote:
         | Feels like I keep seeing "Duckdb in your postgres" posts here.
         | Likely that is what you want.
        
       | jedisct1 wrote:
       | "Sqlite [...] encryption extension is a $2000 add-on".
       | 
       | SQLite3MultipleCiphers has been around for ages and is free:
       | https://utelle.github.io/SQLite3MultipleCiphers/
       | 
       | And Turso Database supports encryption out of the box:
       | https://docs.turso.tech/tursodb/encryption
        
         | michaelsbradley wrote:
         | There's also SQLCipher, it's been in development since 2009 and
         | works quite well:
         | 
         | https://github.com/sqlcipher/sqlcipher
        
         | memset wrote:
         | How do you use these in practice? Neither Python nor Go makes
         | it easy to link a different variation of SQLite with one of
         | these plugins compiled in. How do you make it work?
        
           | ncruces wrote:
           | I don't think SQLite3MultipleCiphers can be built into a
           | runtime loadable extension (and the Turso thing is just a
           | copy of it).
           | 
           | I'm confident that a scheme based on tweakable block ciphers
           | (like Adiantum or AES-XTS) could be made into a decent
           | runtime loadable extension.
           | 
           | I implemented such schemes for my Go driver, but Go code is
           | not really ideal to make a runtime loadable extension of
           | (it'd have to be ported to C/Rust/zig).
           | 
           | https://news.ycombinator.com/item?id=40208800
        
       | jasonthorsness wrote:
       | AES-GCM sensitivity to nonce reuse is a tricky implementation
       | detail. Here they acknowledge it but then don't share their
       | solution - and in fact the header contains 16 bytes for the nonce
       | instead of the expected 12 bytes and they do not say which bytes
       | are random. Did I miss something? Does anyone know?
        
         | jedisct1 wrote:
         | Static key, random 12 byte nonces, no per-session key for temp
         | buffers.
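        
         For a sense of scale on the nonce-reuse concern with a static
         key and random 12-byte nonces, here is a back-of-the-envelope
         birthday estimate. It assumes nonces are drawn uniformly at
         random and uses the approximation p ~ n(n-1) / (2 * 2^96),
         which only holds while p is small. The function name and the
         chosen exponents are illustrative, not taken from DuckDB.

```python
from fractions import Fraction

NONCE_BITS = 96  # a 12-byte nonce, as mentioned in the reply above


def collision_probability(num_nonces: int, nonce_bits: int = NONCE_BITS) -> float:
    """Birthday approximation p ~ n(n-1) / (2 * 2^bits); valid for small p."""
    n = num_nonces
    return float(Fraction(n * (n - 1), 2 * 2 ** nonce_bits))


for exponent in (20, 32, 40):
    p = collision_probability(2 ** exponent)
    print(f"2^{exponent} encryptions under one key -> p ~ {p:.3e}")
```

         At 2^32 encryptions under one key the estimate is about 2^-33,
         which is in line with the NIST SP 800-38D guidance to stay at
         or below 2^32 invocations per key when nonces are random.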
        
       | dismantle wrote:
       | Curious how the indexing of a key is handled. I'm not sure if the
       | document already covers it (I don't remember coming across this),
       | but I'm just a bit curious. Will the key being searched for be
       | "encrypted" before a search, or will a decryption occur for each
       | block during a search?
        
       | biophysboy wrote:
       | DuckDB has been more useful to me than all AI combined (and I
       | like LLMs overall)
        
       ___________________________________________________________________
       (page generated 2025-11-21 23:02 UTC)