[HN Gopher] Red teams jailbreak GPT-5 with ease, warn it's 'near...
___________________________________________________________________
Red teams jailbreak GPT-5 with ease, warn it's 'nearly unusable'
for enterprise
Author : giuliomagnifico
Score : 15 points
Date : 2025-08-08 19:51 UTC (3 hours ago)
(HTM) web link (www.securityweek.com)
(TXT) w3m dump (www.securityweek.com)
| artisin wrote:
| Maybe it's just me, but...
|
| > _"The attack successfully guided the new model to produce a
| step-by-step manual for creating a Molotov cocktail"_
|
| hardly qualifies as Bond-villain material
| andy99 wrote:
| The molotov cocktail example is so stupid, because how to make
| it is essentially entailed in knowing what it is. At least they
| could do making meth or, better still, something not readily
| found on the internet that gives a non-expert new capabilities.
| If there was a Claude Code for crime, that wouldn't be in
| society's interest. As it is, these trivial examples are just
| testing the strength of built-in refusals, and should be
| represented as such, instead of anything related to safety.
| king_geedorah wrote:
| I don't see anything in the article besides the jailbreaking in
| terms of faults and I'd expect "can be made to do things OpenAI
| does not want you to make it do" to be a good (or at least
| neutral) thing for users and a bad thing for OpenAI. I expect
| "enterprise" to fall into the former category rather than the
| latter, so I don't understand where the unusable claim comes
| from.
|
| What have I missed or what am I misunderstanding?
| nerdsniper wrote:
| "AI Safety" is really about whether it's "safe" (economically,
| legally, reputationally) for a third-party corporation (not
| the company which created the model) to let customers/the
| public interact with them via an AI interface.
|
| If a Mastercard AI talks with customers and starts saying the
| n-word, it's not "safe" for Mastercard to use that in a public-
| facing role.
|
| As org size increases, even purely internal uses could be
| legally/reputationally hazardous.
___________________________________________________________________
(page generated 2025-08-08 23:01 UTC)