Post AVJ6PJBlWlHTKwDR4a by cypherfox@mas.to
(DIR) More posts by cypherfox@mas.to
(DIR) Post #AVFuAwmAHOY4qVkMm8 by simon@fedi.simonwillison.net
2023-05-02T20:26:08Z
0 likes, 0 repeats
I participated in a talk hosted by LangChain this morning, where I attempted to explain prompt injection and why it's so hard to solve in just 8 minutes.The video, slides and transcript of my presentation is now up on my bloghttps://simonwillison.net/2023/May/2/prompt-injection-explained/
(DIR) Post #AVFz0Txd3fHYLuwFRg by twilliability@genart.social
2023-05-02T21:19:57Z
0 likes, 0 repeats
@simon @invisv Prompt injection and PP attachment ambiguity are equally hard to solve in 8 minutes :)
(DIR) Post #AVHlFX5AiwohKsfX8q by awooo@pawb.fun
2023-05-03T17:55:32Z
0 likes, 0 repeats
@simon The privileged/quarantined pattern you've shown is a really neat idea!I'm sure prompt injections will become a massive problem either way as people jump to integrate these things without the necessary experience, it's deceptively simple to make something "just work" with LLMs without noticing the monsters you've unleashed.
(DIR) Post #AVJ6PJBlWlHTKwDR4a by cypherfox@mas.to
2023-05-04T09:27:12Z
0 likes, 0 repeats
@simon My answer is similar to your clean hand/dirty approach, but different.Foundational LLMs need to be instruction tuned to be useful. What if your summarization LLM is tuned on limited instructions. Essentially using multiple limited-purpose-fine-tuned LLMs in different parts of the system.The part that summarizes (for example) is just not even trained on anything that will let it misbehave.Instead of clean/tainted, you have general/specialized. Probably one general and N specialized.🤷
(DIR) Post #AVnl55HY4eiHACbGoC by thompsonson2@mastodon.social
2023-05-19T04:24:19Z
0 likes, 0 repeats
@simon I really liked what you said and your approach tonthe challenges of securing LLM interaction. I've got a slightly different view on the solution. Like yours I think there is a need to separate general actions from privileged actions. I've attempted to document it here: https://github.com/hwchase17/langchain/issues/4912Would be great to talk amore about it if you have time.Peace,Matt