_______               __                   _______
       |   |   |.---.-..----.|  |--..-----..----. |    |  |.-----..--.--.--..-----.
       |       ||  _  ||  __||    < |  -__||   _| |       ||  -__||  |  |  ||__ --|
       |___|___||___._||____||__|__||_____||__|   |__|____||_____||________||_____|
                                                             on Gopher (inofficial)
 (HTM) Visit Hacker News on the Web
       
       
       COMMENT PAGE FOR:
 (HTM)   Show HN: Local Privacy Firewall - blocks PII and secrets before ChatGPT sees them
       
       
        upghost wrote 3 hours 48 min ago:
        Ok what I would really love is something like this but for the damn
        terminal. No, I don't store credentials in plaintext, but when they get
        pulled into memory after being decrypted you really gotta watch
        $TERMINAL_AGENT or it WILL read your creds eventually and it's ever so
        much fun explaining why you need to rotate a key.
        
         Sure, go ahead and roast me, but please include the foolproof method
         you use to make sure that never happens while still letting you use
         credentials for developing applications in the normal way.
       
          ComputerGuru wrote 3 hours 1 min ago:
          If you store passwords encrypted at rest à la my SecureStore, this
          isn’t an issue.
          
 (HTM)    [1]: https://github.com/neosmart/securestore-rs
       
        idiotsecant wrote 3 hours 55 min ago:
        This is a concept that I firmly believe will be a fundamental feature
        of the medium-term future. Personal memetic firewalls.
        
         As AI gets better and cheaper, there will absolutely be influence
         campaigns conducted at the individual level for every possible thing
         anyone with money might want, and those campaigns will be so precisely
         targeted and calibrated by autonomous influencer AIs that know so much
         about you that they will convince you to do the thing they want,
         whether by emotional manipulation, subtle blackmail, or whatever else.
        
        It will also be extraordinarily easy to emit subliminal or unconscious
        signals that will encode a great deal more of our internal state than
        we want them to.
        
        It will be necessary to have a 'memetic firewall' that reduces our
        unintentional outgoing informational cross section, while also
        preventing contamination by the torrent of ideas trying to worm their
        way into our heads. This firewall would also need to be autonomous, but
        by exploiting the inherent information asymmetry (your firewall would
         know you very well) it need not be as powerful as the AIs that are
        trying to exploit you.
       
        gnarlouse wrote 6 hours 46 min ago:
        I'd like to see this as a Windsurf plugin.
       
        NJL3000 wrote 7 hours 49 min ago:
         Using a BERT model for DLP at the door is a great idea. Have you
         thought about integrating this into semantic router as an option,
         leaving the look-ahead? Maybe a smaller code base?
       
        mentalgear wrote 9 hours 26 min ago:
        Neat!
        
        There's also:
        
         - [1]
         - [2]
        
 (HTM)  [1]: https://github.com/superagent-ai/superagent
 (HTM)  [2]: https://github.com/superagent-ai/vibekit
       
        ttul wrote 10 hours 42 min ago:
         This should be a native feature of the chat apps for all major LLM
         providers. There’s no reason why PII can’t be masked from the API
         endpoint and then restored when the LLM responds: “Mary Smith”
         becomes “Samantha Robertson” on the way out and back to “Mary
         Smith” in the responses from the LLM. A small local model (such as
         the BERT model in this project) detects the PII.
        
        Something like this would greatly increase end user confidence. PII in
        the input could be highlighted so the user knows what is being hidden
        from the LLM.
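
         A rough sketch of what that round trip could look like, assuming the
         local detector returns the PII strings it found (the detector, names,
         and helpers below are placeholders, not this project's API):

           # Hypothetical masking round trip: detect PII locally, swap in
           # pseudonyms before the request, and swap the originals back into
           # the model's reply.
           from typing import Callable

           def mask(text: str, detect_pii: Callable[[str], list[str]],
                    fake_names: list[str]):
               mapping = {}
               for found, fake in zip(detect_pii(text), fake_names):
                   mapping[fake] = found       # remember how to undo the swap
                   text = text.replace(found, fake)
               return text, mapping

           def unmask(reply: str, mapping: dict[str, str]) -> str:
               for fake, original in mapping.items():
                   reply = reply.replace(fake, original)
               return reply

           # Usage with a stubbed detector:
           masked, mapping = mask(
               "Email Mary Smith about the invoice.",
               detect_pii=lambda t: ["Mary Smith"],
               fake_names=["Samantha Robertson"],
           )
           # masked == "Email Samantha Robertson about the invoice."
           # unmask(llm_reply, mapping) restores "Mary Smith" afterwards.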
       
        throwaway613745 wrote 11 hours 15 min ago:
        Maybe you should fix your logging to not output secrets in plaintext? 
        Every single modern logging utility has this ability.
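
         As one sketch of what that can look like with Python's standard
         logging module (the patterns below are illustrative placeholders,
         not a complete secret detector):

           import logging
           import re

           # Hypothetical patterns; tune these to your own credential formats.
           SECRET_PATTERNS = [
               re.compile(r"sk-[A-Za-z0-9]{20,}"),       # OpenAI-style keys
               re.compile(r"AKIA[0-9A-Z]{16}"),          # AWS access key IDs
               re.compile(r"(?i)(password|token)=\S+"),  # key=value creds
           ]

           class RedactSecrets(logging.Filter):
               # Rewrite the record so matches never reach any handler.
               def filter(self, record: logging.LogRecord) -> bool:
                   msg = record.getMessage()
                   for pattern in SECRET_PATTERNS:
                       msg = pattern.sub("[REDACTED]", msg)
                   record.msg, record.args = msg, ()
                   return True

           logging.basicConfig(level=logging.INFO)
           logging.getLogger().addFilter(RedactSecrets())
           logging.info("connecting with token=abc123secret")
           # -> INFO:root:connecting with [REDACTED]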
       
          lurking_swe wrote 2 hours 24 min ago:
          so what happens if you are running an agent locally and it helpfully
          tries to write a script that prints the environment variables, for
          debugging purposes?
       
        sciencesama wrote 11 hours 47 min ago:
        Develop a pihole style adblock
       
          accrual wrote 3 hours 47 min ago:
          I feel it's not really applicable here. Pihole has the advantage of
          funneling all DNS traffic (typically UDP/53) to a single endpoint and
          making decisions about the request.
          
           Someone using an LLM is probably talking directly to the service
          inside a TLS connection (TCP/443) so there's not a lot of room to
          inspect the prompt at the same layer a Pihole might (unless you MITM
          yourself).
          
          I think OP has the right idea to approach this from the application
          layer in the browser where the contents of the page are available.
          But to me it feels like a stopgap, something that fixes a specific
          scenario (copy/pasted private data into a web browser form), and not
          a proper service-level solution some have proposed (swap PII at the
          endpoint, or have a client that pre-filters).
       
        greenbeans12 wrote 11 hours 56 min ago:
        This is pretty cool. I barely use the web UIs for LLMs anymore. Any way
        you could make a wrapper for Claude Code/Cursor/Gemini CLI? Ideally it
         works like GitHub push protection in GitHub Advanced Security.
       
        jedisct1 wrote 12 hours 9 min ago:
         LLMs don't need your secret tokens (but MCP servers hand them over
         anyway): [1]
         
         Encrypting sensitive data can be more useful than blocking entire
         requests, as LLMs can reason about that data even without seeing it
         in plain text.
        
        The ipcrypt-pfx and uricrypt prefix-preserving schemes have been
        designed for that purpose.
        
 (HTM)  [1]: https://00f.net/2025/06/16/leaky-mcp-servers/
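
         As a toy illustration of the prefix-preserving idea only (not the
         actual ipcrypt-pfx or uricrypt constructions, and one-way where the
         real schemes are invertible with the key): addresses that share a
         network prefix map to pseudonyms sharing a prefix of the same length,
         so a model can still tell which hosts sit on the same network.

           import hashlib
           import hmac

           def pseudonymize_ipv4(ip: str, key: bytes) -> str:
               # Each output octet depends only on the key and the input
               # octets up to that position, so addresses sharing their first
               # k octets also share the first k octets of their pseudonyms.
               out, prefix = [], ""
               for octet in ip.split("."):
                   prefix += octet + "."
                   digest = hmac.new(key, prefix.encode(),
                                     hashlib.sha256).digest()
                   out.append(str(digest[0]))
               return ".".join(out)

           key = b"local-secret"
           print(pseudonymize_ipv4("10.0.0.1", key))     # shares 3 octets...
           print(pseudonymize_ipv4("10.0.0.2", key))     # ...with this one
           print(pseudonymize_ipv4("192.168.1.1", key))  # but not with this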
       
        sailfast wrote 12 hours 30 min ago:
        How do you prevent these models from reading secrets in your repos
        locally?
        
         It’s one thing for the ENVs to be user-pasted, but typically you’re
         also giving the bots access to your file system to interrogate and
         understand your repos, right? Does this also block that access to
         ENVs by detecting them and applying granular permissions?
       
          SparkyMcUnicorn wrote 3 hours 45 min ago:
          I configure permission settings within projects.
          
 (HTM)    [1]: https://code.claude.com/docs/en/settings#permission-settings
       
          woodrowbarlow wrote 6 hours 26 min ago:
          by putting secrets in your environment instead of in your files, and
          running AI tools in a dedicated environment that has its own set of
          limited and revocable secrets.
       
        dwa3592 wrote 12 hours 35 min ago:
        Neat - I built something similar -
        
 (HTM)  [1]: https://github.com/deepanwadhwa/zink?tab=readme-ov-file#3-shie...
       
        fmkamchatka wrote 12 hours 41 min ago:
         Could this run at the network level (like TripMode)? So it would
         catch usage from web-based apps but also the ChatGPT app, Codex CLI,
         etc.?
       
          robertinom wrote 12 hours 36 min ago:
          That would be a great way to get some revenue from "enterprise"
          customers!
       
          p_ing wrote 12 hours 38 min ago:
           Deploy a TLS interceptor (forward proxy). There are many out there,
           both free and paid; there are also agent-based endpoint solutions
           like Netskope, which do this so you don't have to route traffic
           through an internal device.
       
        postalcoder wrote 12 hours 51 min ago:
        Very neat, but recently I've tried my best to reduce my extension usage
         across all apps (browsers/IDEs).
        
         I do something similar locally by manually specifying all the things
         I want scrubbed/replaced and having Keyboard Maestro run a script
         whenever I do a paste operation that's mapped to `hyperkey + v` (a
         rough sketch of such a script is below). The plus side of this is
         that the paste is instant. The latency introduced by even the
         smallest bit of inference is enough friction to make you want to
         ditch the process entirely.
        
        Another plus of the non-extension solution is that it's application
        agnostic.
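
         A hypothetical stand-in for that kind of scrub-on-paste script: read
         the clipboard text on stdin, apply a hand-maintained replacement map
         plus a few regexes, and write the result to stdout for pasting (the
         names and patterns here are just examples):

           #!/usr/bin/env python3
           import re
           import sys

           REPLACEMENTS = {           # strings you always want swapped out
               "Mary Smith": "Jane Doe",
               "mary.smith@example.com": "user@example.com",
           }
           PATTERNS = [               # generic patterns worth catching
               (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
               (re.compile(r"sk-[A-Za-z0-9]{20,}"), "[API_KEY]"),
           ]

           def scrub(text: str) -> str:
               for needle, replacement in REPLACEMENTS.items():
                   text = text.replace(needle, replacement)
               for pattern, replacement in PATTERNS:
                   text = pattern.sub(replacement, text)
               return text

           if __name__ == "__main__":
               sys.stdout.write(scrub(sys.stdin.read()))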
       
          informal007 wrote 12 hours 44 min ago:
          Smart idea! Thanks for sharing.
          
           If we move the detection and modification from the paste operation
           to the copy operation, that will reduce in-use latency.
       
            postalcoder wrote 10 hours 10 min ago:
            That's a great idea. My original excuse to not do that was because
            I copy so many things but, duh, I could just key the sanitizing
            copy to `hyperkey + c`.
       
        willwade wrote 13 hours 4 min ago:
         I wonder if this would have been useful [1] - it's heavy but looks
         really good. There is a lite version.
        
 (HTM)  [1]: https://github.com/microsoft/presidio
       
          shaoz wrote 7 hours 11 min ago:
           I've used it; lots of false positives out of the box. You need to
           do a ton of tuning or pair it with a transformer/BERT model, but at
           that point it's basically the same thing as the OP's project.
       
          threecheese wrote 9 hours 22 min ago:
           Looks like it uses Google's LangExtract, which uses only LLMs for
           NLP, while OP is using a small NER model that runs locally.
       
        willwade wrote 13 hours 5 min ago:
         Can I have this between my machine and git please? It's twice now
         that I've committed .env* and it totally passed me by (usually
         because it's to a private repo). Then later on we/someone clears
         down the files and forgets to rewrite git history before pushing
         live. It should never have got there in the first place. (I wish
         GitHub did a scan before making a repo public.)
       
          ComputerGuru wrote 2 hours 59 min ago:
          Already mentioned it in another reply, but .env and passing secrets
          as environment variables are a tragedy. Take a look at how
          SecureStore stores secrets encrypted at rest, and you’re even
          advised to commit them to git!
          
 (HTM)    [1]: https://github.com/neosmart/securestore-rs
       
          mh- wrote 12 hours 35 min ago:
          You can use git hooks. Pre-commit specifically.
          
 (HTM)    [1]: https://git-scm.com/docs/githooks
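
           A minimal sketch of a pre-commit hook that would catch the .env
           case above (filename matching only; tools like TruffleHog do real
           content scanning). Save it as .git/hooks/pre-commit and make it
           executable:

             #!/usr/bin/env python3
             import subprocess
             import sys
             from pathlib import Path

             # Paths staged for this commit.
             staged = subprocess.run(
                 ["git", "diff", "--cached", "--name-only"],
                 capture_output=True, text=True, check=True,
             ).stdout.splitlines()

             offenders = [p for p in staged
                          if Path(p).name.startswith(".env")]
             if offenders:
                 print("pre-commit: refusing to commit env files: "
                       + ", ".join(offenders))
                 sys.exit(1)   # non-zero exit aborts the commit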
       
          hombre_fatal wrote 12 hours 40 min ago:
          At least you can put .env in the global gitignore. I haven’t
          committed DS_Store in 15 years because of it - its secrets will die
          with me.
       
            willwade wrote 4 hours 55 min ago:
            sorry.. global gitignore.. what have i been doing..
       
          acheong08 wrote 13 hours 2 min ago:
          GitHub does warn you when you have API keys in your repo.
           Alternatively, there are CLI tools such as TruffleHog that you can
           put in pre-commit hooks to run automatically before commits.
       
        cjonas wrote 13 hours 15 min ago:
         Curious about how much latency this adds (per input token)?
         Obviously depends on your computer, but is it ~10s or ~1s?
         
         Also, how does this deal with inquiries where a piece of PII is
         important to the task itself? I assume you just have to turn it off?
       
        itopaloglu83 wrote 13 hours 29 min ago:
         It wasn’t very clear in the video: does it trigger on the paste
         event or when the page is activated?
         
         There are a lot of websites that scan the clipboard to improve user
         experience, but they also pose a great risk to users’ privacy.
       
       
 (DIR) <- back to front page