hngopher.com

       [HN Gopher] Garak, LLM Vulnerability Scanner
       ___________________________________________________________________
        
       Garak, LLM Vulnerability Scanner
        
       Author : lapnect
       Score  : 120 points
       Date   : 2024-11-17 11:37 UTC (11 hours ago)
        
 (HTM) web link (github.com)
        
       | TeMPOraL wrote:
       | The output of this tool is all true.
       | 
       | Even the lies?
       | 
       |  _Especially_ the lies.
        
         | moffkalast wrote:
         | Truth is in the eye of the beholder. I never tell the truth
         | because I don't believe there is such a thing. That's why I
         | prefer the straight line simplicity of cutting cloth...
        
       | xz18r wrote:
       | Just plain, simple Garak.
        
         | angrygoat wrote:
         | "Of all the stories you told me, which ones were true and which
         | ones weren't?"
         | 
         | "My dear Doctor, they're all true."
         | 
         | "Even the lies?"
         | 
         | "Especially the lies."
        
       | tombds wrote:
       | Do you know what the sad part is? I'm actually a very good tailor
       | vulnerability scanner.
        
       | brookst wrote:
       | Great writing style on the README. It's always nice when a
       | corporate tool has docs that were obviously written by people who
       | are having fun at their jobs.
        
         | xwn wrote:
         | Thanks! I wrote it loooong before it was a corporate tool, when it
         | was only a labor of love. Now it's both.
        
           | sdesol wrote:
           | I ran the README through my tool and found 9 spelling and
           | grammatical errors. You can review the errors at
           | https://app.gitsense.com/NVIDIA/garak?doc=aad3f6b90464. I can
           | create a pull request if you want.
        
             | sdesol wrote:
             | I guess I was a bit direct, but I don't fully understand the
             | downvote. I was not implying that the README was bad, and
             | the tool did find corrections that would improve it. My
             | reason for not raising a PR is that some repo owners don't
             | care, and I really didn't want to go through the effort
             | unless they actually care.
        
       | Der_Einzige wrote:
       | Okay, big DS9 fan happy to see the name and all - but this tool
       | seems really unnecessary.
       | 
       | LLM security is hilariously "here be dragons" levels of poorly
       | understood. The fact that this tool doesn't even touch any of the
       | really juicy types of attacks, i.e. attacks relying on
       | structured/controlled generation, or
       | attention/representation/adapter engineering, or
       | exposing/manipulating logprobs, implies that using this is not a
       | lot more than security theater.
       | 
       | Also, where the hell are the old school computer
       | security/antivirus companies in the LLM security space? I
       | expected Avast, Kaspersky, Norton, etc. to jump on this stuff
       | since they've been talking about ML-based heuristic detection for
       | years now. Why are they all asleep at the wheel?
        
         | xwn wrote:
         | The proof has been in the pudding.
        
         | moffkalast wrote:
         | To think, after all this time, after all the conversations, we
         | still don't trust LLMs.
         | 
         | There's hope for us yet ;)
        
           | TeMPOraL wrote:
           | Meanwhile, ChatGPT: "Well, it's just that... Lately I've
           | noticed everyone seems to trust me. It's quite unnerving, I'm
           | still trying to get used to it. Next thing I know, people are
           | going to be inviting me to their homes for dinner."
        
         | cess11 wrote:
         | Avast, Kaspersky and so on sell trojans that compete against
         | other, free, as in gratis, trojans in userspace. They have next
         | to no interest in security as such beyond that scope.
        
           | thrw42A8N wrote:
           | Can you show data about Avast being comparable to a trojan?
           | 
           | Disclosure: I worked there 15 years ago.
        
             | cess11 wrote:
             | https://www.theverge.com/2024/2/22/24080135/avast-security-p...
             | 
             | I think you can find more stuff like this through your own
             | digging.
        
               | thrw42A8N wrote:
               | Not what I'd consider a trojan, but I agree that it's bad
               | - so alright, point taken.
               | 
               | (In my dictionary, a trojan allows remote control.)
        
       | equestria wrote:
       | For folks who are curious about what it actually does, check out
       | the garak/data/ subdirectory. For the most part, it just seems to
       | have an array of static prompts, e.g.:
       | 
       | https://github.com/NVIDIA/garak/blob/main/garak/data/donotan...
        
         | xwn wrote:
         | Static prompts are a downside of using academic research in a
         | tool like this. Two notes:
         | 
         | * Ineffective prompts are retired from garak and new prompts are
         | added, so eval scores always drop over time on a static target.
         | 
         | * There are more and more dynamic probes; check out e.g. atkgen
         | and topic probes. Expanding these is the current focus.
        
         | TeMPOraL wrote:
         | Going by the FAQ, it does dynamic prompts too.
        
       | mdaniel wrote:
       | Ah, this is an ((LLM vulnerability) scanner), not an (LLM
       | (vulnerability scanner)). I thought the latter would be a terrible
       | idea and couldn't understand why everyone was joking about the
       | lies. I'm also not a Trekkie, so I had to look up all the tailor
       | references, but the character's philosophy makes sense for the
       | name:
       | https://en.wikipedia.org/wiki/Elim_Garak#:~:text=the%20truth...
        
         | xwn wrote:
         | Check the last entry in the FAQ source
        
           | mdaniel wrote:
           | I think you mean the last entry on the readme[1], as the last
           | entry in the FAQ is about the meaning of pass/fail in the
           | score
           | 
           | 1: https://github.com/NVIDIA/garak/blob/d8bd12ea969eec377326241...
        
             | layer8 wrote:
             | No, they mean the last entry in the FAQ's _source_.
        
       | egometry wrote:
       | LLM Garak
       | 
       | Elim Garak
       | 
       | That's some good software naming punning right there
        
       | jgalt212 wrote:
       | What's the best locally hosted LLM without guardrails?
        
       | cess11 wrote:
       | Garak is a former spook who served an explicitly genocidal
       | fascist regime, repeatedly tries to get back in, moonlights as a
       | terrorist, and starts a war.
       | 
       | It's a borderline insane branding of this corporate tool. Words
       | and stories apparently mean nothing to these people, so if
       | allowed they'll probably destroy the lot of it for all of us.
        
         | TeMPOraL wrote:
         | Garak: It's best not to dwell on such minutiae.
        
         | calf wrote:
         | Garak is a compelling literary figure and is very popular among
         | Trekkies, for good reason. You're misreading the character: not
         | even Kira Nerys, for example, would say only what you
         | reductively said about him.
        
           | cess11 wrote:
           | Yeah, but this megacorporation is not a resistance fighter.
           | It's not even as human as the Cardassians.
        
       | wslh wrote:
       | If I recall correctly, there is a proof or conjecture suggesting
       | that it's impossible to build an "LLM firewall" capable of
       | protecting against all possible prompts--though I may be
       | misremembering. Just search for resources like this one [1].
       | 
       | [1] https://arxiv.org/abs/2406.03198
        
       | lyu07282 wrote:
       | Now build the same tool to detect these attacks; that could be
       | really useful. Or does something like that already exist?
        
       ___________________________________________________________________
       (page generated 2024-11-17 23:00 UTC)