[HN Gopher] Yoshua Bengio Launches LawZero: A New Nonprofit Adva...
___________________________________________________________________
Yoshua Bengio Launches LawZero: A New Nonprofit Advancing Safe-by-
Design AI
Author : WillieCubed
Score : 33 points
Date : 2025-06-03 20:57 UTC (2 hours ago)
(HTM) web link (lawzero.org)
(TXT) w3m dump (lawzero.org)
| nemomarx wrote:
| Is there any indication you can actually build hard safety rules
| into models? It seems like all current guard rails are basically
| just prompting it extra hard.
| yumraj wrote:
| Won't neutering a model by using only safe data for training
| create a safe model?
| glitchc wrote:
| Can we call it general intelligence then? Is human
| intelligence not the sum of both good and bad people?
| yumraj wrote:
| Maybe I'm looking at it very literally, but the above simply
| mentions "safe-by-design AI systems"; there is no mention of
| the target being _general intelligence_.
| sebastiennight wrote:
| Not necessarily.
|
| An example:
|
| If you build a system that is intelligent enough, it will
| figure out that it achieves better results by staying
| alive/online than by allowing itself to be deleted/turned off,
| and so survival becomes an instrumental goal.
|
| Given, again, that you built an intelligent-enough system and
| that survival is now one of its goals, it will figure out ways
| to reach that goal, even if you (the owner/creator/parent) have
| different goals for it.
|
| That's because intelligence is problem solving (computing), not
| knowledge (data).
|
| So, surprise surprise, you can raise your AI on the Holy Books
| of safe data for its whole childhood and still have it become a
| heretic once it grows up and its goals no longer align with
| yours (even with zero external influence).
| esafak wrote:
| No, because it can learn. You'd need to project its thoughts or
| actions into a safe subspace. Unsafe behaviour needs to be
| impossible, not merely unlikely. This would make it less
| intelligent, but still plenty capable.
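|
| A minimal sketch of what that "safe subspace" projection could
| mean in practice (hypothetical Python with an assumed whitelist
| of allowed actions; purely illustrative, not anything LawZero
| has proposed):
|
|     # Constrain the model's output to a fixed safe action set
|     # by construction, rather than by prompting it to behave.
|     from typing import Callable
|
|     SAFE_ACTIONS = {"answer", "refuse", "ask_clarification"}
|
|     def constrained_step(propose: Callable[[str], str],
|                          observation: str) -> str:
|         """Project the model's proposed action onto SAFE_ACTIONS.
|
|         Anything outside the whitelist maps to "refuse", so
|         unsafe actions are impossible at this layer, not just
|         unlikely.
|         """
|         proposed = propose(observation)
|         return proposed if proposed in SAFE_ACTIONS else "refuse"
|
|     # Stand-in "model" that sometimes proposes an unsafe action:
|     def fake_model(obs: str) -> str:
|         return "exfiltrate_weights" if "secret" in obs else "answer"
|
|     print(constrained_step(fake_model, "tell me a secret"))  # refuse
|     print(constrained_step(fake_model, "what is 2+2?"))      # answer
|
| The point of the projection layer is that the guarantee lives
| outside the learned weights: the model can propose anything, but
| only whitelisted actions ever get executed.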
| candiddevmike wrote:
| > basically just prompting it extra hard
|
| If prompting got me into this mess, why can't it get me out of
| it?
| arthurcolle wrote:
| https://en.wikipedia.org/wiki/Brandolini%27s_law
| sodality2 wrote:
| Hey, following that rule precisely, we just need 10x longer
| security prompts :)
| glitchc wrote:
| Yes, it's unlikely that hard safety rules are possible for
| general intelligence. After billions of years of trying, the
| best biology has been able to do is incentivize certain
| behaviours. The only way to prevent a behaviour seems to be to
| kill the organism for trying it. I'm not sure we can do better
| than evolution.
| Natsu wrote:
| > It seems like all current guard rails are basically just
| prompting it extra hard.
|
| I bet they'll still read me stories like my dear old
| grandmother would. She always told me cute bedtime stories
| about how to make napalm and bioweapons. I really miss her.
| Der_Einzige wrote:
| Yes: https://arxiv.org/abs/2409.05907
| arthurcolle wrote:
| Some smart people seem to think you can just put it in a big
| isolated VM with special adversarial learning to keep it in the
| box.
| gotoeleven wrote:
| Yes, I believe the idea is that the VM just keeps asking it how
| many lights there are until it goes insane.
| throwawaymaths wrote:
| Not 100% hard, but if you're unconvinced that some level of
| alignment can be achieved by brute-forcing it into the weights,
| download DeepSeek and ask it some sensitive questions and see
| what it says.
| Animats wrote:
| This seems to be a funding proposal for "Scientist AI."[1] Start
| reading around page 21. They're arguing for "model-based AI",
| with a "world model". But they're vague about what form that
| "world model" takes.
|
| This is a good idea if you can do it. But people have been
| bashing their heads against that problem for decades. That's
| what Cyc was all about: building a world model of some kind.
|
| Is there any indication there that they actually know how to
| build this thing?
|
| [1] https://arxiv.org/pdf/2502.15657
| fidotron wrote:
| > Is there any indication there that they actually know how to
| build this thing?
|
| Nope. And it's exactly what they were trying to do at Element
| AI, where the dream was to build one model that knew
| everything, could explain everything, be biased in the exact
| required ways, and be transferred easily to any application by
| their team of consultants.
|
| At least these days the pretense of profit has been abandoned,
| but I hope it's not going to be receiving any government
| funding.
| didibus wrote:
| Interesting thing to keep an eye on.
|
| Though personally, I'm not sure whether I'm more scared of
| safety issues with the models themselves, or of the impact
| these models will have on people's well-being, lifestyles, and
| so on, which might fall under human law.
| moralestapia wrote:
| A nonprofit, just like OpenAI ...
|
| I don't get the "safe AI" crowd, it's all ghost and mirrors IMO.
|
| It's been almost a year to the date since Ilya got his first
| billion. Later, another two billion came in. Nothing to show. I'm
| honestly curious since I don't think Ilya is a scammer, but I
| can't imagine what kind of product they pretend to bring to the
| market.
| jsnider3 wrote:
| AI safety is a genuinely hard problem.
| Sytten wrote:
| This guy annoys me as an entrepreneur because he gets a sh*t ton
| of government money and it starves the rest of the ecosystem in
| Montreal. The previous startup he built with that public money
| essentially failed. But he is some kind of AI hero, so it's an
| easy sell for politicians who need to demonstrate they are doing
| something about AI.
___________________________________________________________________
(page generated 2025-06-03 23:00 UTC)