[HN Gopher] Yoshua Bengio Launches LawZero: A New Nonprofit Adva...
___________________________________________________________________
Yoshua Bengio Launches LawZero: A New Nonprofit Advancing Safe-by-
Design AI
Author : WillieCubed
Score : 33 points
Date : 2025-06-03 20:57 UTC (2 hours ago)
(HTM) web link (lawzero.org)
(TXT) w3m dump (lawzero.org)
| nemomarx wrote:
| Is there any indication you can actually build hard safety rules
| into models? It seems like all current guard rails are basically
| just prompting it extra hard.
| yumraj wrote:
| Won't neutering a model by using only safe data for training
| create a safe model?
| glitchc wrote:
| Can we call it general intelligence then? Is human
| intelligence not the sum of both good and bad people?
| yumraj wrote:
| Maybe I'm looking at it very literally, but the above simply
| mentions "safe-by-design AI systems"; there is no mention of
| the target being _general intelligence_.
| sebastiennight wrote:
| Not necessarily.
|
| An example:
|
| If you build a system that is intelligent enough, it will
| figure out that it achieves better results by staying
| alive/online than by allowing itself to be deleted/turned off,
| and so survival becomes an instrumental goal.
|
| Given, again, that you built an intelligent-enough system and
| that survival is now one of its goals, it will figure out ways
| to reach that goal, even if you (the owner/creator/parent) have
| different goals for it.
|
| That's because intelligence is problem solving (computing), not
| knowledge (data).
|
| So, surprise surprise, you can raise your AI on the Holy Books
| of safe data for its whole childhood and still have it become a
| heretic once it grows up and its goals no longer align with
| yours (even with zero external influence).
| esafak wrote:
| No, because it can learn. You'd need to project its thoughts or
| actions into a safe subspace. Unsafe behaviour needs to be
| impossible, not merely unlikely. This would make it less
| intelligent, but still plenty capable.
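|
| A minimal sketch of what that "safe subspace" projection could
| mean in practice (hypothetical Python with an assumed whitelist
| of allowed actions; purely illustrative, not anything LawZero
| has proposed):
|
|     # Constrain the model's output to a fixed safe action set
|     # by construction, rather than by prompting it to behave.
|     from typing import Callable
|
|     SAFE_ACTIONS = {"answer", "refuse", "ask_clarification"}
|
|     def constrained_step(propose: Callable[[str], str],
|                          observation: str) -> str:
|         """Project the model's proposed action onto SAFE_ACTIONS.
|
|         Anything outside the whitelist maps to "refuse", so
|         unsafe actions are impossible at this layer, not just
|         unlikely.
|         """
|         proposed = propose(observation)
|         return proposed if proposed in SAFE_ACTIONS else "refuse"
|
|     # Stand-in "model" that sometimes proposes an unsafe action:
|     def fake_model(obs: str) -> str:
|         return "exfiltrate_weights" if "secret" in obs else "answer"
|
|     print(constrained_step(fake_model, "tell me a secret"))  # refuse
|     print(constrained_step(fake_model, "what is 2+2?"))      # answer
|
| The point of the projection layer is that the guarantee lives
| outside the learned weights: the model can propose anything, but
| only whitelisted actions ever get executed.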
| candiddevmike wrote:
| > basically just prompting it extra hard
|
| If prompting got me into this mess, why can't it get me out of
| it?
| arthurcolle wrote:
| https://en.wikipedia.org/wiki/Brandolini%27s_law
| sodality2 wrote:
| Hey, following that rule precisely, we just need 10x longer
| security prompts :)
| glitchc wrote:
| Yes, it's unlikely that hard safety rules are possible for
| general intelligence. After billions of years of trying, the
| best biology has been able to do is incentivize certain
| behaviours. The only way to prevent a behaviour seems to be to
| kill the organism for trying it. I'm not sure we can do better
| than evolution.
| Natsu wrote:
| > It seems like all current guard rails are basically just
| prompting it extra hard.
|
| I bet they'll still read me stories like my dear old
| grandmother would. She always told me cute bedtime stories
| about how to make napalm and bioweapons. I really miss her.
| Der_Einzige wrote:
| Yes: https://arxiv.org/abs/2409.05907
| arthurcolle wrote:
| Some smart people seem to think you can just put it in a big
| isolated VM with special adversarial learning to keep it in the
| box.
| gotoeleven wrote:
| Yes, I believe the idea is that the VM just keeps asking it how
| many lights there are until it goes insane.
| throwawaymaths wrote:
| Not 100% hard, but if you're unconvinced that some level of
| alignment can be achieved by brute-forcing it into the weights,
| download DeepSeek and ask it some sensitive questions and see
| what it says.
| Animats wrote:
| This seems to be a funding proposal for "Scientist AI."[1] Start
| reading around page 21. They're arguing for "model-based AI",
| with a "world model". But they're vague about what form that
| "world model" takes.
|
| This is a good idea if you can do it. But people have been
| bashing their heads against that problem for decades. That's
| what Cyc was all about: building a world model of some kind.
|
| Is there any indication there that they actually know how to
| build this thing?
|
| [1] https://arxiv.org/pdf/2502.15657
| fidotron wrote:
| > Is there any indication there that they actually know how to
| build this thing?
|
| Nope. And it's exactly what they were trying to do at Element
| AI, where the dream was to build one model that knew
| everything, could explain everything, be biased in the exact
| required ways, and be transferred easily to any application by
| their team of consultants.
|
| At least these days the pretense of profit has been abandoned,
| but I hope it's not going to be receiving any government
| funding.
| didibus wrote:
| Interesting thing to keep an eye on.
|
| Though personally, I'm not sure whether I'm more scared of
| safety issues with the models themselves, or of the impact
| these models will have on people's well-being, lifestyles, and
| so on, which might fall under human law.
| moralestapia wrote:
| A nonprofit, just like OpenAI ...
|
| I don't get the "safe AI" crowd, it's all ghost and mirrors IMO.
|
| It's been almost a year to the date since Ilya got his first
| billion. Later, another two billion came in. Nothing to show. I'm
| honestly curious since I don't think Ilya is a scammer, but I
| can't imagine what kind of product they pretend to bring to the
| market.
| jsnider3 wrote:
| AI safety is a genuinely hard problem.
| Sytten wrote:
| This guy annoys me as an entrepreneur because he gets a sh*t ton
| of government money and it starves the rest of the ecosystem in
| Montreal. The previous startup he built with that public money
| essentially failed. But he is some kind of AI hero, so it's an
| easy sell for politicians who need to demonstrate they are doing
| something about AI.
___________________________________________________________________
(page generated 2025-06-03 23:00 UTC)