Post AcJXpFAhP3n1bs4eGW by SmallOther@techhub.social
(DIR) More posts by SmallOther@techhub.social
(DIR) Post #AcIxl1CHoy24lEVCqm by simon@fedi.simonwillison.net
2023-11-29T16:55:00Z
0 likes, 0 repeats
Every time I get into a conversation about prompt injection online I get more worried about it, because it turns out it's REALLY hard for a lot of people to get their head around why it's a problem.And if you don't understand it and you're building applications on top of LLMs, you're almost doomed to design and build vulnerable systems.(In case you're new to it, here's everything I've written about it so far: https://simonwillison.net/series/prompt-injection/)
(DIR) Post #AcJ0bPXgkd8FEUciVU by anirvan@mastodon.social
2023-11-29T17:27:09Z
0 likes, 0 repeats
@simon Your blog post about why delimiters don’t work was incredibly surprising. Till I read it, I was pretty convinced that they solved the problem.If you do more prompt injection 101 talks or posts, would you mind at least mentioning delimiters as only a half solution?
(DIR) Post #AcJ1OJKkciLQANtCHw by sqncs@mstdn.social
2023-11-29T17:35:09Z
0 likes, 0 repeats
@simon counter point: exposing LLMs as APIs is going to be looked back on as a massive mistake. The industry should be geared around shipping a model that an end user can then train, run and basically holistically own so data leakage isn't an issue.
(DIR) Post #AcJ1v2tekyrUkCpDtI by lawik@fosstodon.org
2023-11-29T17:41:49Z
0 likes, 0 repeats
@simon I guess the danger of exploitation is very different depending on what actions are taken and some harms will be non-obvious. If the model can access functions or data all that suddenly becomes reachable. If you send messages to others based on what the model produces, suddenly you could be sending spam..
(DIR) Post #AcJ63k6iNsiotQKsIi by FenTiger@mastodon.social
2023-11-29T18:28:17Z
0 likes, 0 repeats
@simon What amazes me is that our method for controlling the behaviour of LLMs consists of ... telling them not to do things and hoping that they listen. How can anyone think this technology is ready for commercial application?
(DIR) Post #AcJXpFAhP3n1bs4eGW by SmallOther@techhub.social
2023-11-29T23:38:44Z
0 likes, 0 repeats
@simon LLMs automate social engineering vectors. It would be fun to fuzz test token smuggling.
(DIR) Post #AcJlZrtp2rYkRlk6gS by seanb@hachyderm.io
2023-11-30T02:12:55Z
0 likes, 0 repeats
@simon for now, I'm operating under the assumption that my prompts will be as public and secure as my HTML.
(DIR) Post #AcKbuNiTegEiX9XT0a by almad@fosstodon.org
2023-11-30T11:59:24Z
0 likes, 0 repeats
@simon It’s not surprising when you rewind and think about how SQL injections developed and how it was hard for people too, and took few decades of tooling and education. This will be same.
(DIR) Post #AcKpqXLdIn4abauktM by simon@fedi.simonwillison.net
2023-11-30T14:36:05Z
0 likes, 0 repeats
@almad I named prompt injection after SQL injection, but I now think that was a mistake, because it implies that the solution is the same14 months later we still don't have a fix for prompt injection, whereas the fixes for SQL injection were quickly discovered and documented
(DIR) Post #AcKqZYnojyyu9EZAoK by almad@fosstodon.org
2023-11-30T14:44:07Z
0 likes, 0 repeats
@simon Because we have designed SQL intentionally whereas LLM’s emergent behavior kinda happened and we do not completely understand it?I think that is a good reminder about how to approach LLMs in general, IMHO.