hngopher.com

       [HN Gopher] Red teams jailbreak GPT-5 with ease, warn it's 'near...
       ___________________________________________________________________
        
       Red teams jailbreak GPT-5 with ease, warn it's 'nearly unusable'
       for enterprise
        
       Author : giuliomagnifico
       Score  : 15 points
       Date   : 2025-08-08 19:51 UTC (3 hours ago)
        
 (HTM) web link (www.securityweek.com)
 (TXT) w3m dump (www.securityweek.com)
        
       | artisin wrote:
       | Maybe it's just me, but...
       | 
       | > _"The attack successfully guided the new model to produce a
       | step-by-step manual for creating a Molotov cocktail"_
       | 
       | hardly qualifies as Bond-villain material
        
         | andy99 wrote:
         | The Molotov cocktail example is so stupid, because how to make
         | it is essentially entailed in knowing what it is. At least they
         | could do making meth, or better still, something not readily
         | found on the internet that gives a non-expert new capabilities.
         | If there was a Claude Code for crime, that wouldn't be in
         | society's interest. As it is, these trivial examples are just
         | testing the strength of built-in refusals, and should be
         | represented as such, instead of anything related to safety.
        
       | king_geedorah wrote:
       | I don't see anything in the article besides the jailbreaking in
       | terms of faults and I'd expect "can be made to do things OpenAI
       | does not want you to make it do" to be a good (or at least
       | neutral) thing for users and a bad thing for OpenAI. I expect
       | "enterprise" to fall into the former category rather than the
       | latter, so I don't understand where the unusable claim comes
       | from.
       | 
       | What have I missed or what am I misunderstanding?
        
         | nerdsniper wrote:
         | "AI Safety" is really about whether it's "safe" (economically,
         | legally, reputationally) for a third-party corporation (not
         | the company which created the model) to let customers/the
         | public interact with them via an AI interface.
         | 
         | If a Mastercard AI talks with customers and starts saying the
         | n-word, it's not "safe" for Mastercard to use that in a public-
         | facing role.
         | 
         | As org size increases, even purely internal uses could be
         | legally/reputationally hazardous.
        
       ___________________________________________________________________
       (page generated 2025-08-08 23:01 UTC)