[HN Gopher] Is there such a thing as "private, interactive datab...
       ___________________________________________________________________
        
       Is there such a thing as "private, interactive databases" for
       SaaS's
        
       So I've been building a product and my clients really hate the idea
       that their code is stored on my database (unencrypted). The problem
       is that I need to process the data in the background often and thus
       I cannot store it end-to-end encrypted. Is there any service that
       allows you to deploy some sort of database that only the client
       accesses and at the same time allows me to process it somehow, maybe
       via APIs?
        
       Author : alliewithane
       Score  : 10 points
       Date   : 2024-12-30 11:49 UTC (2 days ago)
        
       | curious_curios wrote:
       | Two options I've seen:
       | 
       | Customer Managed Keys - You have everything encrypted in your
       | database via a key the customer has. You request (likely
       | automated) that key every time you process the data. They can
       | revoke at any point, and have an audit log of every access.
       | 
       | Self Hosting - Let the customer host your solution themselves or
       | automate spinning up a cloud environment for them that they have
       | full control over.
       | 
       | Both are kind of a pain to implement, but that lets you charge
       | more for these enterprise features.
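        
       A minimal sketch of the customer-managed-key flow described above,
       not a definitive implementation. CustomerKeyService, store and
       process are hypothetical names, and keystream_xor is a toy
       stand-in for a vetted AEAD cipher (for example AES-GCM from a
       crypto library). The point is the control flow: the vendor asks
       for the key on every job, each request is logged, and the
       customer can revoke at any time.

```python
import hashlib
import hmac
import secrets
from datetime import datetime, timezone


def keystream_xor(key, nonce, data):
    """Toy stream cipher (HMAC-SHA256 in counter mode).

    Placeholder only: it is unauthenticated and exists so the sketch runs
    on the standard library. Use a vetted AEAD in a real system.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        pad = hmac.new(key, nonce + counter, hashlib.sha256).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


class CustomerKeyService:
    """Hypothetical stand-in for a key service the customer controls."""

    def __init__(self):
        self._key = secrets.token_bytes(32)
        self._revoked = False
        self.audit_log = []  # list of (timestamp, reason) tuples

    def request_key(self, reason):
        # Every request is logged, whether or not it is granted.
        stamp = datetime.now(timezone.utc).isoformat()
        self.audit_log.append((stamp, reason))
        if self._revoked:
            raise PermissionError("customer has revoked key access")
        return self._key

    def revoke(self):
        self._revoked = True


def store(kms, plaintext):
    """Vendor side: encrypt before writing to the vendor's database."""
    nonce = secrets.token_bytes(16)
    key = kms.request_key("initial upload")
    return nonce, keystream_xor(key, nonce, plaintext)


def process(kms, record):
    """Vendor-side background job: fetch key, decrypt in memory, compute."""
    nonce, ciphertext = record
    key = kms.request_key("background job: count lines")
    source = keystream_xor(key, nonce, ciphertext)
    return source.count(b"\n")
```

       The vendor never persists the key, so once the customer calls
       revoke() the stored ciphertext can no longer be processed, and
       the refused attempt still shows up in the audit log.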
        
         | alliewithane wrote:
         | I see, I heard about "fully homomorphic encryption" which is
         | faster to implement and allows you to run code on encrypted
         | data but the time complexity is O((10^6) * n) which is insane.
        
           | bobbiechen wrote:
           | Confidential Computing also provides data-in-use protection
           | and has a significantly more realistic overhead, often <10%
           | in real-world workloads I've seen. However, in this case you
           | might want to combine it with customer managed keys (BYOK) or
           | self-hosting anyways - otherwise the customer has no
           | opportunity to perform remote attestation and prove you're
           | really running in Confidential Computing.
           | 
           | The visualization about halfway down
           | https://www.anjuna.io/solution/secure-ai (my employer) is an
           | example of the self-hosted flavor of this. Happy to discuss
           | deeper, my contact info is in my bio.
        
       | ezekg wrote:
       | Sounds overly complicated. Use at-work encryption (i.e. encrypt
       | it in the database), on top of encryption in-transit and at-rest,
       | hosted/managed by a reputable database vendor. If that won't fly,
       | then I agree with the (enterprise) self-hosted offering another
       | commenter mentioned.
        
         | alliewithane wrote:
         | The problem is that I cannot do that. I need to run code on the
         | data which means I can access the data theoretically any time
         | and thus my client is super uncomfortable with that considering
         | I need to access their code base.
        
           | ezekg wrote:
           | Are they uncomfortable with you accessing their data, or are
           | they uncomfortable with you storing their data unencrypted,
           | risking their IP in case of a breach? Two different things.
           | 
           | The former means they aren't a fit for SaaS (i.e. offer self-
           | hosting), and the latter means you can use at-work
           | encryption, only decrypting the data to process it.
           | 
           | Without more info on what you're actually building, I can't
           | really be of more help here.
        
       | rozenmd wrote:
       | Do they hate that it's unencrypted in the DB, or that the DB's
       | storage itself is unencrypted?
       | 
       | (for my business, anyway) I've found this wording to be enough
       | for bigger customers:
       | 
       | Data is stored on AWS RDS, encrypted at rest by an industry
       | standard AES-256 encryption algorithm (more on that here:
       | https://aws.amazon.com/rds/features/security/)
        
         | alliewithane wrote:
         | My main problem is that I need to do operations on the data
         | while it's in the DB. This means that I cannot leave it
         | encrypted end-to-end there.
        
           | atmosx wrote:
           | When RDS is encrypted at rest, it means that the data stored
           | in the database is encrypted while it resides on disk. That
           | means the data is protected against unauthorised access to raw
           | storage.
           | 
           | The data accessed by the app is not encrypted, you can still
           | work on the data as you would usually do. It's mostly a
           | compliance thing. Not sure what level of security it
           | _actually_ brings to the data itself, but most companies are
           | okay with "encryption at rest".
        
             | UltraSane wrote:
             | Encryption at rest is meant to protect data when the
             | storage device is stolen or lost.
        
           | cr125rider wrote:
           | Sure you can. You just can't do zero knowledge encryption.
        
       | austin-cheney wrote:
       | Why not run the database in a docker container, one for each
       | client? They could even run on the same machine.
        
         | alliewithane wrote:
         | That makes sense, I could add some code in the container that
         | can communicate with private APIs in my servers. Is this
         | standard practice or just an ad hoc solution?
        
           | austin-cheney wrote:
           | It's the scenario Kubernetes was created for.
        
             | alliewithane wrote:
             | Thank you!
        
       | JambalayaJimbo wrote:
       | Confidential Computing is a way in which cloud providers let
       | their customers encrypt data "in-use" - that might be what you're
       | looking for.
        
         | tonygiorgio wrote:
         | Yeah exactly this. Especially if you need to programmatically
         | process that data too. You can even let the customers provide
         | their own managed key too (such as AWS externally managed KMS)
         | in combination with something like AWS Nitro Enclaves.
         | 
         | I've enjoyed building on Nitro myself and most things should
         | run in it just fine; you just need to build the networking vsock
         | proxy into the Nitro image for anything that needs networking
         | (such as the DB, where you store the encrypted-at-rest data).
        
       | aimazon wrote:
       | You need access to their data to process it, so any layer of
       | indirection (like a database they control) is additional
       | complexity without meaningful benefit. For clients with strict
       | data control requirements, self-hosting of the whole system is
       | the standard solution (with a very high licensing fee).
       | 
       | Something to keep in mind is that some clients are not operating
       | in good faith, their goal isn't to work together to find a
       | solution but to present roadblocks. The reasoning can be
       | complicated, perhaps there's internal politics around which
       | solution to use, perhaps your solution is receiving pushback
       | because it's not the preferred solution of one stakeholder.
       | You'll probably never know the true motivations, so it's important
       | not to get caught up in engineering a solution to a problem that
       | doesn't really exist.
       | 
       | You've mentioned that the data you need access to is code: GitHub
       | is a perfect comparable. GitHub's cloud service is used by the
       | majority of companies with code, in fact, I'd guess even your
       | clients are using GitHub's hosted services. If the problem is
       | that your company doesn't have the reputation necessary to give
       | these clients confidence that you can securely manage their code,
       | that may just be a sign that right now, these clients aren't the
       | right fit for you, and you should work with less antsy clients
       | until you have built up the credibility.
        
         | ukoki wrote:
         | > their goal isn't to work together to find a solution but to
         | present roadblocks. The reasoning can be complicated..
         | 
         | Or as simple as "the less I appear to value this solution, the
         | lower the supplier will estimate my maximum price for it"
        
       | williamtrask wrote:
       | I lead an open source nonprofit which deploys things like this.
       | Feel free to shoot me a DM on Twitter. Handle is @iamtrask
        
       | oceanparkway wrote:
       | I would ask them what their ideal setup is and then compare
       | feasibility. There's probably a lot of indirections/hoops you
       | could jump through but if your security concerns are being driven
       | by your customers you should probably ask them. If it is the case
       | that you need to access their unencrypted data then at one point
       | or another you're going to have to do it; the question is which
       | possible way would your customers feel happiest about? On-
       | premises contract, storing encrypted + customer-specific decrypt
       | keys with a managed auth service, etc etc
        
       | chiph wrote:
       | Are you using one database per customer or a shared database
       | (with an additional key on the tables)?
       | 
       | Because for enterprise clients they're going to want their own
       | database. Which has its own licensing and operating costs - that
       | you should be building into your price. And since they will have
       | their own database it can be encrypted with a key that is unique
       | to them.
       | 
       | For small business customers, a shared database is the only way
       | to stay profitable.
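        
       A rough sketch of the database-per-customer layout described
       above, assuming SQLite in-memory databases purely so the example
       is self-contained. TenantRegistry and the files table are
       hypothetical; a real deployment would provision a separate
       database instance per customer and keep each key in a key
       management service rather than in process memory.

```python
import secrets
import sqlite3


class TenantRegistry:
    """Hypothetical registry: one isolated database and one key per customer."""

    def __init__(self):
        self._tenants = {}

    def provision(self, tenant_id):
        if tenant_id in self._tenants:
            raise ValueError(f"tenant {tenant_id!r} already exists")
        # Each connect(":memory:") call creates a separate, private database.
        conn = sqlite3.connect(":memory:")
        conn.execute("CREATE TABLE files (path TEXT PRIMARY KEY, body BLOB)")
        # The key is unique to this tenant; encrypting with it is omitted here.
        self._tenants[tenant_id] = {
            "conn": conn,
            "key": secrets.token_bytes(32),
        }

    def connection(self, tenant_id):
        return self._tenants[tenant_id]["conn"]

    def key(self, tenant_id):
        return self._tenants[tenant_id]["key"]
```

       Because every query goes through connection(tenant_id), there is
       no shared table where a missing tenant filter could leak one
       customer's rows to another, which is the isolation enterprise
       customers tend to pay for.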
        
       | VTimofeenko wrote:
       | Disclaimer: I work for Snowflake.
       | 
       | This idea (customer owns the data, code is deployed next to the
       | data, data never leaves customer perimeter) is the exact use case
       | for the native application framework:
       | 
       | https://docs.snowflake.com/en/developer-guide/native-apps/na...
        
       ___________________________________________________________________
       (page generated 2025-01-01 23:01 UTC)