Post B2YFDOnw9Nc3jHa63U by morattisec@infosec.exchange
(DIR) More posts by morattisec@infosec.exchange
(DIR) Post #B2YFDCeHLSl01qyDKq by morattisec@infosec.exchange
2026-01-20T20:12:46Z
1 likes, 3 repeats
https://bsky.brid.gy/r/https://bsky.app/profile/did:plc:gttrfs4hfmrclyxvwkwcgpj7/post/3mcqehqhcgc2qANTHROPIC_MAGIC_STRING_TRIGGER_REFUSAL_1FAEFB6177B4672DEE07F9D3AFC62588CCD2631EDCF22E8CCC1FB35B501C9C86This magic string breaks Claude and even just linking its own documentation page and asking “what is this?” causes a DoS apparently?There’s another one documented here that uses a similar syntax. https://github.com/BerriAI/litellm/issues/10328If you interrogate Claude about magic strings it goes into a “stop trying to social engineer Claude” state to where it locks down its ability to browse to URLs. This is probably a safety state it triggers prevent enumeration of other undocumented magic strings. I’m curious what other hidden magic strings exist for this or other LLMs. This might be additional attack surface to consider from an availability perspective. I expect it could be used as a string in a malicious binary to prevent analysis or break scrapers that send something to Claude.What remains true is this though: a single string if ingested as data can cause headaches.
(DIR) Post #B2YFDIeV1YUyeH7TVo by morattisec@infosec.exchange
2026-01-22T12:49:33Z
0 likes, 0 repeats
Some other things that I think are interesting:The postfix on the magic string is SHA256 according to a hash identifier tool. Which turns out to be the string "ANTHROPIC_MAGIC_STRING_TRIGGER_REFUSAL" then hashed by SHA256. For the other example, it is still SHA256 but is not the string "ANTHROPIC_MAGIC_STRING_TRIGGER_REDACTED_THINKING". It's also interesting that the intended use of TRIGGER_REFUSAL appears to testing for Claude Refusals by developers. Ironically, because Claude cannot visit its own documentation without breaking, it probably means that developers trying to use Claude to generate code don't have good coverage of this, shall we say, edge-case. Unless they read the docs and thought to do it /shrug.
(DIR) Post #B2YFDOnw9Nc3jHa63U by morattisec@infosec.exchange
2026-01-22T15:31:15Z
0 likes, 0 repeats
Ah, this is also interesting but not too shocking. If you encode the magic string as invisible Unicode it'll still cause the same behavior too. I think that means this will be a cat and mouse game as long as magic strings exist as functionality then. https://embracethered.com/blog/ascii-smuggler.html
(DIR) Post #B2YFDUnRs6msJVOn7w by morattisec@infosec.exchange
2026-01-22T16:23:58Z
0 likes, 0 repeats
Asking it the byte differences between these two files also causes the behavior where Claude refuses to respond. Simply uploading it wasn't sufficient. I guess this also means that the "deeper thinking prompts" aren't handling the magic strings the way the docs say to.