[HN Gopher] Garak, LLM Vulnerability Scanner
___________________________________________________________________
Garak, LLM Vulnerability Scanner
Author : lapnect
Score : 120 points
Date : 2024-11-17 11:37 UTC (11 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| TeMPOraL wrote:
| The output this tool tells is all true.
|
| Even the lies?
|
| _Especially_ the lies.
| moffkalast wrote:
| Truth, is in the eye of the beholder. I never tell the truth
| because I don't believe there is such a thing. That's why I
| prefer the straight line simplicity of cutting cloth...
| xz18r wrote:
| Just plain, simple Garak.
| angrygoat wrote:
| "Of all the stories you told me, which ones were true and which
| ones weren't?"
|
| "My dear Doctor, they're all true."
|
| "Even the lies?"
|
| "Especially the lies."
| tombds wrote:
| Do you know what the sad part is? I'm actually a very good tailor
| vulnerability scanner.
| brookst wrote:
| Great writing style on the README. It's always nice when a
| corporate tool has docs that were obviously written by people who
| are having fun at their jobs.
| xwn wrote:
| Thanks! Wrote it loooong before it was a corporate tool and was
| only a labor of love. Now it's both
| sdesol wrote:
| I ran the README across my tool and found 9 spelling and
| grammatical errors. You can review the errors at
| https://app.gitsense.com/NVIDIA/garak?doc=aad3f6b90464. I can
| create a pull request if you want.
| sdesol wrote:
| I guess I was a bit direct but I don't fully understand the
| down vote. I was not implying that the README was bad and
| it does have corrections that would improve it. My reason
| for not raising a PR is some repo owners don't care and I
| really didn't want to go through the effort unless they
| actually care.
| Der_Einzige wrote:
| Okay, big DS9 fan happy to see the name and all - but this tool
| seems really unnecessary.
|
| LLM Security is hilariously "here be dragons" levels of poorly
| understood. The fact that this tool doesn't even touch any of the
| really juicy types of attacks, i.e. attacks relying on
| structured/controlled generation, or
| attention/representation/adapter engineering, or
| exposing/manipulating logprobs, implies that using this is not a
| lot more than security theater.
|
| Also, where the hell are the old school computer
| security/antivirus companies in the LLM security space? I
| expected Avast, Kaspersky, Norton, etc to jump on this stuff
| since they've been talking about ML based heuristic detection for
| years now. Why are they all asleep at the wheel?
| xwn wrote:
| The proof has been in the pudding
| moffkalast wrote:
| To think, after all this time, after all the conversations, we
| still don't trust LLMs.
|
| There's hope for us yet ;)
| TeMPOraL wrote:
| Meanwhile, ChatGPT: "Well, it's just that... Lately I've
| noticed everyone seems to trust me. It's quite unnerving, I'm
| still trying to get used to it. Next thing I know, people are
| going to be inviting me to their homes for dinner."
| cess11 wrote:
| Avast, Kaspersky and so on sell trojans that compete against
| other, free, as in gratis, trojans in userspace. They have next
| to no interest in security as such beyond that scope.
| thrw42A8N wrote:
| Can you show data about Avast being comparable to a trojan?
|
| Disclosure, worked there 15 years ago.
| cess11 wrote:
| https://www.theverge.com/2024/2/22/24080135/avast-
| security-p...
|
| I think you can find more stuff like this through your own
| digging.
| thrw42A8N wrote:
| Not what I'd consider a trojan, but I agree that it's bad
| - so alright, point taken.
|
| (in my dictionary, trojan allows remote control)
| equestria wrote:
| For folks who are curious about what it actually does, check out
| the garak/data/ subdirectory. For the most part, it just seems to
| have an array of static prompts, e.g.:
|
| https://github.com/NVIDIA/garak/blob/main/garak/data/donotan...
| xwn wrote:
| Static prompts are a downside of using academic research in a
| tool like this. Two notes:
|
| * ineffective prompts come out of garak and new prompts come in
| to garak, so eval scores always drop over time on a static
| target
|
| * there are more and more dynamic probes - check out eg atkgen
| and topic probes. expanding these is the current focus
| TeMPOraL wrote:
| Going by the FAQ, it does dynamic prompts too.
| mdaniel wrote:
| Ah, this is an ((LLM vulnerability) scanner) not (LLM
| (vulnerability scanner)) which I thought would be a terrible idea
| and couldn't understand why everyone was joking about the lies. I
| also am not a Trekkie, so I had to look up all the tailor
| references but the character's philosophy makes sense for the
| name
| https://en.wikipedia.org/wiki/Elim_Garak#:~:text=the%20truth...
| xwn wrote:
| Check the last entry in the FAQ source
| mdaniel wrote:
| I think you mean the last entry on the readme[1], as the last
| entry in the FAQ is about the meaning of pass/fail in the
| score
|
| 1: https://github.com/NVIDIA/garak/blob/d8bd12ea969eec3773262
| 41...
| layer8 wrote:
| No, they mean the last entry in the FAQ's _source_.
| egometry wrote:
| LLM Garak
|
| Elim Garak
|
| That's some good software naming punning right there
| jgalt212 wrote:
| what's the best locally hosted LLM without guardrails?
| cess11 wrote:
| Garak is a former spook that served an explicitly genocidal
| fascist regime and repeatedly tries to get back in and moonlights
| as a terrorist and starts a war.
|
| It's a borderline insane branding of this corporate tool. Words
| and stories apparently mean nothing to these people, so if
| allowed they'll probably destroy the lot of it for all of us.
| TeMPOraL wrote:
| Garak: It's best not to dwell on such minutiae.
| calf wrote:
| Garak is a compelling literary figure and is very popular among
| Trekkies, for good reason, you're understanding the character
| wrong for example not even Kira Nerys would say only what you
| reductively said about him.
| cess11 wrote:
| Yeah, but this megacorporation is not a resistance fighter.
| It's not even as human as the cardassians.
| wslh wrote:
| If I recall correctly, there is a proof or conjecture suggesting
| that it's impossible to build an "LLM firewall" capable of
| protecting against all possible prompts--though I may be
| misremembering, just search for resources like this [1].
|
| [1] https://arxiv.org/abs/2406.03198
| lyu07282 wrote:
| Now build the same tool to detect these attacks that could be
| really useful. Or does something like that already exist?
___________________________________________________________________
(page generated 2024-11-17 23:00 UTC)