[HN Gopher] OpenAI delays launch of open-weight model
___________________________________________________________________
OpenAI delays launch of open-weight model
Author : martinald
Score : 234 points
Date : 2025-07-12 01:07 UTC (21 hours ago)
(HTM) web link (twitter.com)
(TXT) w3m dump (twitter.com)
| stonogo wrote:
| we'll never hear about this again
| mystraline wrote:
| To be completely and utterly fair, I trust Deepseek and Qwen
| (Alibaba) more than American AI companies.
|
| American AI companies have shown they are money and compute
| eaters, and massively so at that. Billions later, and well, not
| much to show.
|
| But Deepseek cost $5M to develop, and made multiple novel ways to
| train.
|
| Oh, and their models and code are all FLOSS. The US companies are
| closed. Basically, the US AI companies are too busy treating each
| other as vultures.
| ryao wrote:
| Wasn't that figure just the cost of the GPUs and nothing else?
| rynn wrote:
| It was more than $5m
|
| https://interestingengineering.com/culture/deepseeks-ai-
| trai...
| rpdillon wrote:
| Yeah, I hate that this figure keeps getting thrown around.
| IIRC, it's the price of 2048 H800s for 2 months at
| $2/hour/GPU. If you consider months to be 30 days, that's
| around $5.7M, which lines up. What doesn't line up is
| ignoring the costs of facilities, salaries, non-cloud
| hardware, etc. which will dominate costs, I'd expect. $100M
| seems like a fairer estimate, TBH. The original paper had
| more than a dozen authors, and DeepSeek had about 150
| researchers working on R1, which supports the notion that
| personnel costs would likely dominate.
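|
| (Back-of-the-envelope check with those round numbers - my
| arithmetic, not DeepSeek's reported figure:)
|
|     gpus = 2048              # H800s
|     hours = 2 * 30 * 24      # two 30-day months
|     usd_per_gpu_hour = 2.0
|     # 5,898,240 -> the same ~$5-6M ballpark
|     print(gpus * hours * usd_per_gpu_hour)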
| moralestapia wrote:
| >ignoring the costs of facilities, salaries, non-cloud
| hardware, etc.
|
| If you lease, those costs are amortized. It was definitely
| more than $5M, but I don't think it was as high as $100M.
| All things considered, I still believe Deepseek was trained
| at one (perhaps two) orders of magnitude lower cost than
| other competing models.
| rpdillon wrote:
| Perhaps. Do you think DeepSeek made use of those
| competing models at all in order to train theirs?
| moralestapia wrote:
| I believe so, but have no proof obviously.
| 3eb7988a1663 wrote:
| That is also just the final production run. How many
| experimental runs were performed before starting the final
| batch? It could be some ratio like 10 hours of research to
| every one hour of final training.
| kamranjon wrote:
| Actually the majority of Google models are open source and they
| also were pretty fundamental in pushing a lot of the techniques
| in training forward - working in the AI space I've read quite a
| few of their research papers and I really appreciate what
| they've done to share their work and also release their models
| under licenses that allow you to use them for commercial
| purposes.
| simonw wrote:
| "Actually the majority of Google models are open source"
|
| That's not accurate. The Gemini family of models are all
| proprietary.
|
| Google's Gemma models (which are some of the best available
| local models) are open weights but not technically OSI-
| compatible open source - they come with usage restrictions:
| https://ai.google.dev/gemma/terms
| kamranjon wrote:
| You're ignoring the T5 series of models that were
| incredibly influential; the T5 models and their derivatives
| (FLAN-T5, Long-T5, ByT5, etc) have been downloaded millions
| of times on huggingface and are real workhorses. There are
| even variants still being produced within the last year or
| so.
|
| And yeah, the Gemma series is incredible and while maybe not
| meeting the standards of OSI - I consider them to be pretty
| open as far as local models go. And it's not just the
| standard Gemma variants, Google is releasing other
| incredible Gemma models that I don't think people have
| really even caught wind of yet like MedGemma, of which the
| 4b variant has vision capability.
|
| I really enjoy their contributions to the open source AI
| community and think it's pretty substantial.
| Aunche wrote:
| $5 million was the GPU-hour cost of a single training run.
| dumbmrblah wrote:
| Exactly. Not to minimize DeepSeek's tremendous achievement,
| but that $5 million was just for the training run, not the
| GPUs they purchased beforehand, nor all the OpenAI API calls
| they likely used to assist in synthetic data generation.
| IncreasePosts wrote:
| Deepseek R1 was trained at least partially on the output of
| other LLMs. So, it might have been much more expensive if they
| needed to do it themselves from scratch.
| nomel wrote:
| Lawsuit, since it was against OpenAI TOS:
| https://hls.harvard.edu/today/deepseek-chatgpt-and-the-
| globa...
| refulgentis wrote:
| > Billions later, and well, not much to show.
|
| This is obviously false, I'm curious why you included it.
|
| > Oh, and their models and code are all FLOSS.
|
| No?
| NitpickLawyer wrote:
| > But Deepseek cost $5M to develop, and made multiple novel
| ways to train
|
| This is highly contested, and was either a big misunderstanding
| by everyone reporting it, or maliciously placed there (by a
| quant company, right before the stock fell a lot for nvda and
| the rest) depending on who you ask.
|
| If we're being generous and assume no malicious intent (big
| if), anyone who has trained a big model can tell you that the
| cost of 1 run is useless in the big scheme of things. There is
| a lot of cost in getting there, in the failed runs, in the
| subsequent runs, and so on. The fact that R2 isn't there after
| ~6 months should say a lot. Sometimes you get a great training
| run, but no-one is looking at the failed ones and adding up
| that cost...
| jampa wrote:
| They were pretty explicit that this was only the cost in GPU
| hours to USD for the final run. Journalists and Twitter tech
| bros just saw an easy headline there. It's the same with
| Sandfall, the developer of Clair Obscur, where people say
| that the game was made by 30 people, when there were 200
| people involved.
| badsectoracula wrote:
| These "200 people" were counted from credits which list
| pretty much everyone who even sniffed at the general
| direction of the studio's direction. The studio itself is
| ~30 people (just went and check on their website, they have
| a team list with photos for everyone). The rest are
| contractors whose contributions usually vary wildly.
| Besides, credits are free so unless the the company are
| petty (see Rockstar not crediting people on their games if
| they leave before the game is released even if they worked
| on it for years) people err on the site on crediting
| everyone. Personally i've been credited on a game that used
| a library i wrote once and i learned about it years after
| the release.
|
| Most importantly those who mention that the game was made
| by 30 people do it to compare it with other much larger
| teams with hundreds if not thousands of people _and those
| teams use contractors too_!
| NitpickLawyer wrote:
| > They were pretty explicit that this was only the cost in
| GPU hours to USD for the final run.
|
| The researchers? Yes.
|
| What followed afterwards, I'm not so sure about. There were
| clearly some "cheap headlines" in the media, but there was
| also some weird coverage being pushed everywhere, from
| obscure TLDs, all pushing the same line: Nvidia is dead,
| DeepSeek is cheap, you can run it on a Raspberry Pi, etc.
| That _might_ have been a campaign designed to help short
| the stocks.
| buyucu wrote:
| Deepseek is far more worthy of the name OpenAI than Sam
| Altman's ClosedAI.
| baobabKoodaa wrote:
| > American AI companies have shown they are money and compute
| eaters
|
| Don't forget they also quite literally eat books
| knicholes wrote:
| Who is literally eating books?
| jasonjmcghee wrote:
| Parent is referencing the recent court case with Anthropic,
| and the legal requirement of not copying books, but
| consuming them - translating to Anthropic having to destroy
| every book it uses as input data in order to comply with
| said requirements.
| root_axis wrote:
| > _But Deepseek cost $5M to develop_
|
| Not true. It was $5M to train - it was many more millions in
| R&D.
| krackers wrote:
| Probably the results were worse than the K2 model released
| today. No serious engineer would say it's for "safety"
| reasons given that ablation nullifies any safety
| post-training.
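|
| (For context, "ablation" here refers to the "abliteration"
| trick the community applies to open-weight models. A rough
| conceptual sketch with numpy and made-up stand-in data, not
| any real model's API: estimate a "refusal direction" from
| activations and project it out of a weight matrix that
| writes into the residual stream.)
|
|     import numpy as np
|
|     # Stand-in hidden states at one layer (hypothetical data).
|     harmful_acts = np.random.randn(200, 4096)
|     harmless_acts = np.random.randn(200, 4096)
|
|     # Estimate the refusal direction as a difference of means.
|     r = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
|     r /= np.linalg.norm(r)
|
|     # Stand-in for a weight matrix whose output lives in the
|     # residual stream (shape: d_model x d_in).
|     W = np.random.randn(4096, 11008)
|
|     # Ablate: W' = (I - r r^T) W, so this matrix can no longer
|     # write anything along the refusal direction.
|     W_ablated = W - np.outer(r, r @ W)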
| simonw wrote:
| I'm expecting (and indeed hoping) that the open weights OpenAI
| model is a _lot_ smaller than K2. K2 is 1 trillion parameters
| and almost a terabyte to download! There's no way I'm running
| that on my laptop.
|
| I think the sweet spot for local models may be around the 20B
| size - that's Mistral Small 3.x and some of the Gemma 3 models.
| They're very capable and run in less than 32GB of RAM.
|
| I really hope OpenAI put one out in that weight class,
| personally.
| NitpickLawyer wrote:
| Early rumours (from a hosting company that apparently got
| early access) were that you'd need "multiple H100s to run
| it", so I doubt it's a Gemma / Mistral Small tier model.
| simonw wrote:
| I think you're right, I've seen a couple of other comments
| now that indicate the same thing.
| aabhay wrote:
| You will get a 20GB model. Distillation is so compute
| efficient that it's all but inevitable that if not OpenAI,
| numerous other companies will do it.
|
| I would rather have an open weights model that's the best
| possible one I can run and fine tune myself, allowing me to
| exceed SOTA models on the narrower domain my customers care
| about.
| dorkdork wrote:
| Maybe they're making last minute changes to compete with Grok 4?
| puttycat wrote:
| https://nitter.space/sama/status/1943837550369812814
| ryao wrote:
| Am I the only one who thinks mention of "safety tests" for LLMs
| is a marketing scheme? Cars, planes and elevators have safety
| tests. LLMs don't. Nobody is going to die if a LLM gives an
| output that its creators do not like, yet when they say "safety
| tests", they mean that they are checking to what extent the LLM
| will say things they do not like.
| eviks wrote:
| Why is your definition of safety so limited? Death isn't the
| only type of harm...
| ryao wrote:
| There are other forms of safety, but whether a digital parrot
| says something that people do not like is not a form of
| safety. They are abusing the term safety for marketing
| purposes.
| eviks wrote:
| You're abusing the terms by picking either the overly
| limited ("death") or overly expansive ("not like")
| definitions to fit your conclusion. Unless you reject the
| fact that harm can come from words/images, a parrot can
| parrot harmful words/images, and so be unsafe.
| ryao wrote:
| The maxim "sticks and stones can break my bones, but
| words can never hurt me" comes to mind here. That said, I
| think this misses the point that the LLM is not a
| gatekeeper to any of this.
| eviks wrote:
| Don't let your mind potential be limited by such
| primitive slogans!
| jiggawatts wrote:
| I find it particularly irritating that the models are so
| overly puritan that they refuse to translate subtitles
| because they mention violence.
| jazzyjackson wrote:
| it's like complaining about bad words in the dictionary
|
| the bot has no agency, the bot isn't doing anything,
| people talk to themselves, augmenting their chain of
| thought with an automated process. If the automated
| process is acting in an undesirable manner, the human
| that started the process can close the tab.
|
| Which part of this is dangerous or harmful?
| ks2048 wrote:
| You could be right about this being an excuse for some other
| reason, but lots of software has "safety tests" beyond life or
| death situations.
|
| Most companies, for better or worse (I say for better) don't
| want their new chatbot to be a RoboHitler, for example.
| ryao wrote:
| It is possible to turn any open weight model into that with
| fine tuning. It is likely possible to do that with closed
| weight models, even when there is no creator provided sandbox
| for fine tuning them, through clever prompting and trying
| over and over again. It is unfortunate, but there really is
| no avoiding that.
|
| That said, I am happy to accept the term safety used in other
| places, but here it just seems like a marketing term. From my
| recollection, OpenAI had made a push to get regulation that
| would stifle competition by talking about these things as
| dangerous and needing safety. Then they backtracked somewhat
| when they found the proposed regulations would restrict
| themselves rather than just their competitors. However, they
| are still pushing this safety narrative that was never really
| appropriate. They have a term for this called alignment and
| what they are doing are tests to verify alignment in areas
| that they deem sensitive so that they have a rough idea to
| what extent the outputs might contain things that they do not
| like in those areas.
| natrius wrote:
| An LLM can trivially instruct someone to take medications with
| adverse interactions, steer a mental health crisis toward
| suicide, or make a compelling case that a particular ethnic
| group is the cause of your society's biggest problem so they
| should be eliminated. Words can't kill people, but words can
| definitely lead to deaths.
|
| That's not even considering tool use!
| ryao wrote:
| This is analogous to saying a computer can be used to do bad
| things if it is loaded with the right software.
| Coincidentally, people do load computers with the right
| software to do bad things, yet people are overwhelmingly
| opposed to measures that would stifle such things.
|
| If you hook up a chat bot to a chat interface, or add tool
| use, it is probable that it will eventually output something
| that it should not and that output will cause a problem.
| Preventing that is an unsolved problem, just as preventing
| people from abusing computers is an unsolved problem.
| ronsor wrote:
| As the runtime of any program approaches infinity, the
| probability of the program behaving in an undesired manner
| approaches 1.
| ryao wrote:
| That is not universally true. The yes program is a
| counter example:
|
| https://www.man7.org/linux/man-pages/man1/yes.1.html
| cgriswald wrote:
| Devil's advocate:
|
| (1) Execute yes (with or without arguments, whatever you
| desire).
|
| (2) Let the program run as long as you desire.
|
| (3) When you stop desiring the program to spit out your
| argument,
|
| (4) Stop the program.
|
| Between (3) and (4) some time must pass. During this time
| the program is behaving in an undesired way. Ergo, yes is
| not a counter example of the GP's claim.
| ryao wrote:
| I upvoted your reply for its clever (ab)use of ambiguity
| to say otherwise to a fairly open and shut case.
|
| That said, I suspect the other person was actually
| agreeing with me, and tried to state that software
| incorporating LLMs would eventually malfunction by
| stating that this is true for all software. The yes
| program was an obvious counter example. It is almost
| certain that all LLMs will eventually generate some
| output that is undesired given that it is determining the
| next token to output based on probabilities. I say almost
| only because I do not know how to prove the conjecture.
| There is also some ambiguity in what is a LLM, as the
| first L means large and nobody has made a precise
| definition of what is large. If you look at literature
| from several years ago, you will find people saying 100
| million parameters is large, while some people these days
| will refuse to use the term LLM to describe a model of
| that size.
| cgriswald wrote:
| Thanks, it was definitely tongue-in-cheek. I agree with
| you on both counts.
| pesfandiar wrote:
| Society has accepted that computers bring more benefit
| than harm, but LLMs could still get pushback due to bad PR.
| 0points wrote:
| > This is analogous to saying a computer can be used to do
| bad things if it is loaded with the right software.
|
| It's really not. Parent's examples are all out-of-the-box
| behavior.
| 123yawaworht456 wrote:
| does your CPU, your OS, your web browser come with ~~built-in
| censorship~~ safety filters too?
|
| AI 'safety' is one of the most neurotic twitter-era nanny
| bullshit things in existence, blatantly obviously invented to
| regulate small competitors out of existence.
| no_wizard wrote:
| It isn't. This is dismissive without first thinking through
| the difference of application.
|
| AI safety is about proactive safety. An example: if an AI
| model is used to screen hiring applications, making sure it
| doesn't have any weighted racial biases.
|
| The difference here is that it's not reactive. Reading a
| book with a racial bias would be the inverse; where you
| would be reacting to that information.
|
| That's the basis of proper AI safety in a nutshell
| ryao wrote:
| As someone who has reviewed people's resumes that they
| submitted with job applications in the past, I find it
| difficult to imagine this. The resumes that I saw had no
| racial information. I suppose the names might have some
| correlation to such information, but anyone feeding these
| things into a LLM for evaluation would likely censor the
| name to avoid bias. I do not see an opportunity for
| proactive safety in the LLM design here. It is not even
| clear that they even are evaluating whether there is bias
| in such a scenario when someone did not properly sanitize
| inputs.
| thayne wrote:
| > but anyone feeding these things into a LLM for
| evaluation would likely censor the name to avoid bias
|
| That should really be done for humans reviewing the
| resumes as well, but in practice that isn't done as much
| as it should be
| kalkin wrote:
| > I find it difficult to imagine this
|
| Luckily, this is something that can be studied and has
| been. Sticking a stereotypically Black name on a resume
| on average substantially decreases the likelihood that
| the applicant will get past a resume screen, compared to
| the same resume with a generic or stereotypically White
| name:
|
| https://www.npr.org/2024/04/11/1243713272/resume-bias-
| study-...
| bigstrat2003 wrote:
| That is a terrible study. The stereotypically black names
| are not just stereotypically black, they are
| stereotypical for the underclass of trashy people. You
| would also see much higher rejection rates if you slapped
| stereotypical white underclass names like "Bubba" or
| "Cleetus" on resumes. As is almost always the case, this
| claim of racism in America is really classism and has
| little to do with race.
| stonogo wrote:
| "Names from N.C. speeding tickets were selected from the
| most common names where at least 90% of individuals are
| reported to belong to the relevant race and gender
| group."
|
| Got a better suggestion?
| selfhoster11 wrote:
| If you're deploying LLM-based decision making that
| affects lives, you should be the one held responsible for
| the results. If you don't want to do due diligence on
| automation, you can screen manually instead.
| derektank wrote:
| iOS certainly does by limiting you to the App Store and
| restricting what apps are available there
| selfhoster11 wrote:
| They have been forced to open up to alternative stores in
| the EU. This is unequivocally a good thing, and a victory
| for consumer rights.
| jowea wrote:
| Social media does. Even person to person communication has
| laws that apply to it. And the normal self-censorship a
| normal person will engage in.
| 123yawaworht456 wrote:
| okay. and? there are no AI 'safety' laws in the US.
|
| without OpenAI, Anthropic and Google's fearmongering, AI
| 'safety' would exist only in the delusional minds of
| people who take sci-fi way too seriously.
|
| https://en.wikipedia.org/wiki/Regulatory_capture
|
| for fuck's sake, how more obvious could they be? sama
| himself went on a world tour begging for laws and
| regulations, only to purge safetyists a year later. if
| you believe that he and the rest of his ilk are motivated
| by anything other than profit, smh tbh fam.
|
| it's all deceit and delusion. China will crush them all,
| inshallah.
| bongodongobob wrote:
| Books can do this too.
| derektank wrote:
| Major book publishers have sensitivity readers that
| evaluate whether or not a book can be "safely" published
| nowadays. And even historically there have always been at
| least a few things publishers would refuse to print.
| selfhoster11 wrote:
| All it means is that the Overton window on "should we
| censor speech" has shifted in the direction of less
| freedom.
| snozolli wrote:
| GP said major publishers. There's nothing stopping you
| from printing out your book and spiral binding it by
| hand, if that's what it takes to get your ideas into the
| world. Companies having standards for what _they_ publish
| isn't censorship.
| ben_w wrote:
| There's a reason the inheritors of the copyright* refused
| to allow more copies of Mein Kampf to be produced until
| that copyright expired.
|
| * the federal state of Bavaria
| nofriend wrote:
| Was there? It seems like that was the perfect natural
| experiment then. So what was the outcome? Was there a
| sudden rash of holocausts the year that publishing
| started again?
| ben_w wrote:
| > Was there a sudden rash of holocausts the year that
| publishing started again?
|
| Bit worse than the baseline, I'd say. You judge:
| https://en.wikipedia.org/wiki/List_of_genocides
|
| 2016 was also first Trump, Brexit, and roughly when the
| AfD (who are metaphorically wading ankle deep in the
| waters of legal trouble of this topic) made the
| transition from "joke party" to "political threat".
| bilsbie wrote:
| PDFs can do this too.
| jiggawatts wrote:
| Twitter does it at scale.
| xigoi wrote:
| In such a case, the author of the PDF can be held
| responsible.
| bilsbie wrote:
| Radical idea: let's hold the reader responsible for the
| actions they take from the material.
| amelius wrote:
| Radical rebuttal of this idea: if you hire an assassin
| then you are responsible too (even more so, actually),
| even if you only told them stuff over the phone.
| bilsbie wrote:
| I don't see the connection. Publishing != hiring.
| amelius wrote:
| Then I don't see the connection in your idea. Answering
| questions != publishing.
| johnfn wrote:
| So we should hold my grandmother responsible for the
| phishing emails she gets? Hmm.
| thayne wrote:
| Part of the problem is due to the marketing of LLMs as more
| capable and trustworthy than they really are.
|
| And the safety testing actually makes this worse, because it
| leads people to trust that LLMs are less likely to give
| dangerous advice, when they could still do so.
| jdross wrote:
| Spend 15 minutes talking to a person in their 20's about
| how they use ChatGPT to work through issues in their
| personal lives and you'll see how much they already trust
| the "advice" and other information produced by LLMs.
|
| Manipulation is a genuine concern!
| justacrow wrote:
| It's not just young people. My boss (originally a
| programmer) agreed with me that there's lots of problems
| using ChatGPT for our products and programs as it gives
| the wrong answers too often, but then 30 seconds later
| told me that it was apparently great at giving medical
| advice.
|
| ...later someone higher-up decided that it's actually
| great at programming as well, and so now we all believe
| it's incredibly useful and necessary for us to be able to
| do our daily work
| literalAardvark wrote:
| Most doctors will prescribe antibiotics for viral
| infections just to get you out and the next guy in, they
| have zero interest in sitting there to troubleshoot with
| you.
|
| For this reason o3 is way better than most of the doctors
| I've had access to, to the point where my PCP just writes
| whatever I brought in because she can't follow 3/4 of it.
|
| Yes, the answers are often wrong and incomplete, and it's
| up to you to guide the model to sort it out, but it's
| just like vibe coding: if you put in the steering effort,
| you can get a decent output.
|
| Would it be better if you could hire an actual
| professional to do it? Of course. But most of us are
| priced out of that level of care.
| andsoitis wrote:
| > Most doctors will prescribe antibiotics for viral
| infections just to get you out and the next guy in
|
| Where do you get this data from?
| somenameforme wrote:
| Family in my case. There are two reasons they do this. A
| lot of people like medicine - they think it justifies the
| cost of the visit, and there's a real placebo effect
| (which is not an oxymoron as many might think).
|
| The second is that many viral infections can, in rare
| scenarios, lead to bacterial infections. For instance a
| random flu can leave one more susceptible to developing
| pneumonia. Throwing antibiotics at everything is a
| defensive measure to help ward off malpractice lawsuits.
| Even if frivolous, it's something no doctor wants to deal
| with, but some absurd number of them - something like 1 in
| 15 per year - will.
| literalAardvark wrote:
| Lived experience. I'm not in the US and neither are most
| doctors.
| bdangubic wrote:
| I can co-sign this being bi-coastal. in the US not once
| have I or my 12-year old kid been prescribed antibiotics.
| on three occasions in Europe I had to take my kid to the
| doctor and each time antibiotics were prescribed (never
| consumed)
| asadotzler wrote:
| Your claim of _most_ here is not only unsupported, it's
| completely wrong.
| literalAardvark wrote:
| I'd like to see your support for that very confident
| take.
|
| In my experience it's not only correct, but so common
| that it's hard not to get a round of antibiotics to go.
|
| The only caveat is that I'm in the EU, not the US.
| DiscourseFan wrote:
| LLMs are really good at medical diagnostics, though...
| jpeeler wrote:
| Netflix needs to do a Black Mirror episode where either a
| sentient AI pretends that it's "dumber" than it is while
| secretly plotting to overthrow humanity, or an LLM is
| hacked by deep-state actors and provides similarly
| manipulated advice.
| seam_carver wrote:
| One of the story arcs in "The Phoenix" by Osamu Tezuka is
| on a similar topic.
| brookst wrote:
| Can you point to a specific bit of marketing that says to
| take whatever medications a LLM suggests, or other similar
| overreach?
|
| People keep talking about this "marketing", and I have yet
| to see a single example.
| pyuser583 wrote:
| The problem is "safety" prevents users from using LLMs to
| meet their requirements.
|
| We typically don't critique the requirements of users, at
| least not in functionality.
|
| The marketing angle is that this measure is needed because
| LLMs are "so powerful it would be unethical not to!"
|
| AI marketers are continually emphasizing how powerful their
| software is. "Safety" reinforces this.
|
| "Safety" also brings up many of the debates
| "mis/disinformation" brings up. Misinformation concerns
| consistently overestimate the power of social media.
|
| I'd feel much better if "safety" focused on preventing
| unexpected behavior, rather than evaluating the motives of
| users.
| selfhoster11 wrote:
| Yes, and a table saw can take your hand. As can a whole
| variety of power tools. That does not render them illegal to
| sell to adults.
| ZiiS wrote:
| It does render them illegal to sell without studying their
| safety.
| vntok wrote:
| An interesting comparison.
|
| Table saws sold all over the world are inspected and
| certified by trusted third parties to ensure they operate
| safely. They are illegal to sell without the approval seal.
|
| Moreover, table saws sold in the United States & EU (at
| least) have at least 3 safety features (riving knife, blade
| guard, antikickback device) designed to prevent personal
| injury while operating the machine. They are illegal to
| sell without these features.
|
| Then of course there are additional devices like sawstop,
| but it is not mandatory yet as far as I'm aware. Should be
| in a few years though.
|
| LLMs have none of those board labels or safety features, so
| I'm not sure what your point was exactly?
| xiphias2 wrote:
| They are somewhat self regulated, as they can cause
| permanent damage to the company that releases them, and
| they are meant for general consumers without any
| training, unlike table saws that are meant for trained
| people.
|
| An example is the first Microsoft bot that started to go
| extreme rightwing when people realized how to make it go
| that direction. Grok had a similar issue recently.
|
| Google had racial issues with its image generation (and
| earlier with image detection). Again something that
| people don't forget.
|
| Also an OpenAI 4o release was encouraging stupid things
| to people when they asked stupid questions and they just
| had to roll it back recently.
|
| Of course I'm not saying that that's the real reason
| (somehow they never say that the problem is with
| performance for not releasing stuff), but safety matters
| with consumer products.
| latexr wrote:
| > They are somewhat self regulated, as they can cause
| permament damage to the company that releases them
|
| And then you proceed to give a number of examples of that
| _not_ happening. Most people already forgot those.
| andsoitis wrote:
| An LLM is not gonna chop off your limb. You can't use it
| to attack someone.
| vntok wrote:
| An LLM is gonna convince you to treat your wound with
| quack remedies instead of seeing a doctor, which will
| eventually result in the limb being chopped off to save
| you from gangrene.
|
| You can perfectly use an LLM to attack someone. Your
| sentence is very weird as it comes off as a denial of
| things that have been happening for months and are
| ramping up. Examples abound: generate scam letters, find
| security flaws in a codebase, extract personal
| information from publicly-available-yet-not-previously-
| known locations, generate attack software customized for
| particular targets, generate untraceable hit offers and
| then post them on anonymized Internet services on your
| behalf, etc. etc.
| andsoitis wrote:
| > You can perfectly use an LLM to attack someone.
|
| The act of generating content is not the attack on
| someone.
|
| That would be like saying that if I write something with
| pen and paper but don't expose anyone to it, I have
| attacked someone.
| conception wrote:
| No but they have guards on them.
| anonymoushn wrote:
| The closed weights models from OpenAI already do these things
| though
| buyucu wrote:
| At the end of the day an LM is just a machine that talks. It
| might say silly things, bad things, nonsensical things, or
| even crazy insane things. But at the end of the day it just
| talks. Words don't kill.
|
| LM safety is just a marketing gimmick.
| hnaccount_rng wrote:
| We absolutely regulate which words you can use in certain
| areas. Take instructions on medicine for one example
| andsoitis wrote:
| > An LLM can trivially instruct someone to take medications
| with adverse interactions,
|
| What's an example of such a medication that does not require
| a prescription?
| edoceo wrote:
| Oil of wintergreen?
| pixl97 wrote:
| How about just telling people that drinking grapefruit
| juice with their liver medicine is a good idea and to
| ignore their doctor.
| andsoitis wrote:
| > liver medicine
|
| What is an example liver medicine that does not require a
| prescription?
| mdemare wrote:
| Tylenol.
| andsoitis wrote:
| > Tylenol
|
| This drug comes with warnings: "Taking acetaminophen and
| drinking alcohol in large amounts can be risky. Large
| amounts of either of these substances can cause liver
| damage. Acetaminophen can also interact with warfarin,
| carbamazepine (Tegretol), and cholestyramine. It can also
| interact with antibiotics like isoniazid and rifampin."
|
| It is on the consumer to read it.
| andsoitis wrote:
| > An LLM can trivially make a compelling case that a
| particular ethnic group is the cause of your society's
| biggest problem so they should be eliminated
|
| This is an extraordinary claim.
|
| I trust that the vast majority of people are good and would
| ignore such garbage.
|
| Even assuming that an LLM can trivially build a compelling
| case to convince someone who is not already murderous to go
| on a killing spree to kill a large group of people, one
| killer has limited impact radius.
|
| For contrast, many books and religious texts, have vastly
| more influence and convincing power over huge groups of
| people. And they have demonstrably caused widespread death or
| other harm. And yet we don't censor or ban them.
| amelius wrote:
| Yeah, give it access to some bitcoin and the internet, and it
| can definitely cause deaths.
| recursive wrote:
| I also think it's marketing but kind of for the opposite
| reason. Basically I don't think any of the current technology
| can be made safe.
| nomel wrote:
| Yes, perfection is difficult, but it's relative. It can
| definitely be made much safer. Looking at the analysis of pre
| vs post alignment makes this obvious, including when the raw
| unaligned models are compared to "uncensored" models.
| jrflowers wrote:
| > Am I the only one who thinks mention of "safety tests" for
| LLMs is a marketing scheme?
|
| It is. It is also part of Sam Altman's whole thing about being
| _the_ guy capable of harnessing the theurgical magicks of his
| chat bot without shattering the earth. He periodically goes on
| Twitter or a podcast or whatever and reminds everybody that he
| will yet again single-handedly save mankind. Dude acts like
| he's Buffy the Vampire Slayer
| olalonde wrote:
| Especially since "safety" in this context often just means
| making sure the model doesn't say things that might offend
| someone or create PR headaches.
| SV_BubbleTime wrote:
| Don't draw pictures of celebrities.
|
| Don't discuss making drugs or bombs.
|
| Don't call yourself MechaHitler... which, I don't care,
| that whole scenario was objectively funny in its sheer
| ridiculousness.
| jekwoooooe wrote:
| Sure it's funny until some mentally unstable Nazi
| sympathizer goes and shoots up another synagogue. So funny.
| halfjoking wrote:
| It's overblown. Elon shipped Hitler grok straight to prod
|
| Nobody died
| pona-a wrote:
| Playing devil's advocate, what if it was more subtle?
|
| Prolonged use of conversational programs does reliably induce
| certain mental states in vulnerable populations. When ChatGPT
| got a bit too agreeable, that was enough for a man to kill
| himself in a psychotic episode [1]. I don't think this
| magnitude of delusion was possible with ELIZA, even if the
| fundamental effect remains the same.
|
| Could this psychosis be politically weaponized by biasing the
| model to include certain elements in its responses? We know
| this rhetoric works: cults have been using love-bombing,
| apocalypticism, us-vs-them dynamics, assigned special
| missions, and isolation from external support systems to
| great success. What we haven't seen is what happens when
| everyone has a cult recruiter in their pocket, waiting for a
| critical moment to offer support.
|
| ChatGPT has an estimated 800 million weekly active users [2].
| How many of them would be vulnerable to indoctrination? About
| 3% of the general population has been involved in a cult [3],
| but that might be a reflection of conversion efficiency, not
| vulnerability. Even assuming 5% are vulnerable, that's still
| 40 million people ready to sacrifice their time, possessions,
| or even their lives in their delusion.
|
| [1] https://www.rollingstone.com/culture/culture-
| features/chatgp...
|
| [2] https://www.forbes.com/sites/martineparis/2025/04/12/chat
| gpt...
|
| [3] https://www.peopleleavecults.com/post/statistics-on-cults
| stogot wrote:
| You're worried about indoctrination in an LLM but it starts
| much earlier than that. The school system is indoctrination
| of our youngest minds, both today in the West and in its
| Prussian origins.
|
| https://today.ucsd.edu/story/education-systems-were-first-
| de...
|
| We should fix both systems. I don't want Altman's or Musk's
| opinions doing the indoctrinating.
| simianwords wrote:
| I hope the same people questioning AI safety (which is
| reasonable) don't also hold concerns about Grok due to the
| recent incident.
|
| You have to understand that a lot of people do care about these
| kind of things.
| ignoramous wrote:
| > _Nobody is going to die_
|
| Callous. Software does have real impact on real people.
|
| Ex: https://news.ycombinator.com/item?id=44531120
| layer8 wrote:
| It's about safety for the LLM provider, not necessarily the
| user.
| stogot wrote:
| At my company (which produces models) almost all the
| responsible AI jazz is about DEI and banning naughty words.
| Little action on preventing bad outcomes.
| etaioinshrdlu wrote:
| It's worth remembering that the safety constraints can be
| successfully removed, as demonstrated by uncensored fine-tunes of
| Llama.
| adidoit wrote:
| Not sure if it's coincidental that OpenAI's open weights release
| got delayed right after an ostensibly excellent open weights
| model (Kimi K2) got released today.
|
| https://moonshotai.github.io/Kimi-K2/
|
| OpenAI know they need to raise the bar with their release. It
| can't be a middle-of-the-pack open weights model.
| lossolo wrote:
| This could be it, especially since they announced last week
| that it would be the best open-source model.
| reactordev wrote:
| Technically they were right when they said it, in their
| minds. Things are moving so fast that in a week, it will be
| true again.
| sigmoid10 wrote:
| They might also be focusing all their work on beating Grok 4
| now, since xAI has a significant edge in accumulating computing
| power and they opened a considerable gap in raw intelligence
| tests like ARC and HLE. OpenAI is in this to win the
| competitive race, not the open one.
| unsupp0rted wrote:
| > They might also be focusing all their work on beating Grok
| 4 now,
|
| With half the key team members they had a month prior
| sigmoid10 wrote:
| I'm starting to think talent is way less concentrated in
| these individuals than execs would have investors believe.
| While all those people who left OpenAI certainly have the
| ability to raise ridiculous sums of venture capital in all
| sorts of companies, Anthropic remains the only offspring
| that has actually reached a level where they can go head-
| to-head with OpenAI. Zuck now spending billions on
| snatching those people seems more like a move out of
| desperation than a real plan.
| agentcoops wrote:
| At this point, it seems to be more engineering throughput
| that will decide short to medium term outcomes. I've yet
| to see a case where an IC who took a position only
| because of an, in fact, outrageous compensation package
| (especially one not directly tied to long-term company
| performance through equity) was ever productive again.
| Meta certainly doesn't strike me as a company that
| attracts talent for their "mission."
|
| TLDR Zuck's recent actions definitely smell like a
| predictable failure driven by desperation to me.
| gdbsjjdn wrote:
| They've kind of played themselves making "genius
| engineers" their competitive advantage. Anyone can hire
| those engineers!
| macawfish wrote:
| Yet it suspiciously can't draw a pelican?
| mhuffman wrote:
| simonw is going to force every competitive LLM to over-
| ingest cartoon svg pelicans before this is over!
| ethbr1 wrote:
| Cue unpublished battery of '{animal} riding
| {motiveDevice}' real benchmarks behind the scenes.
| bilsbie wrote:
| Btw why is there no k2 discussion on HN? Isn't it pretty huge
| news?
| always_imposter wrote:
| Had to search for the discussion, it's here; seems like
| nobody noticed it and it only got a couple hundred upvotes.
|
| Here: https://news.ycombinator.com/item?id=44533403
| homebrewer wrote:
| You've been shadowbanned for saying some things that go
| against the prevailing groupthink, so all your comments
| within the last couple of months are invisible for most
| users.
|
| I really think it's disrespectful towards honest users
| (excluding spammers and obvious trolls), but I don't pay
| HN's moderation bills...
| Alifatisk wrote:
| How can you tell the user has been shadowbanned?
| gruez wrote:
| check his comment history. it's all [flagged]
| otterley wrote:
| Why don't you start one?
| Alifatisk wrote:
| There is, but it's not on the front page so you don't find it
| unless you go through multiple pages or manually search it
| up.
|
| Moonshot AI has released banger models without much noise
| about it. Like for example Kimi K1.5, it was quite impressive
| at the time
| segmondy wrote:
| Probably because maybe 1 or 2 folks on here can run it? It's
| a 1000B-parameter model; at 16-bit precision you need about
| 2000GB of GPU VRAM to run it. Or about 80 5090s hooked up to
| the same machine. Or 20 of them to run it in Q2.
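|
| (Rough weights-only math behind those numbers - a sketch
| with round figures, ignoring KV cache and other runtime
| overhead:)
|
|     params = 1e12                 # ~1T parameters
|     fp16_gb = params * 2 / 1e9    # 2 bytes/param -> ~2000 GB
|     q2_gb = params * 0.25 / 1e9   # ~2 bits/param -> ~250 GB
|     print(fp16_gb, q2_gb)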
| jekwoooooe wrote:
| Every openai model since gpt4 has been behind the curve by
| miles
| buyucu wrote:
| Probably ClosedAI's model was not as good as some of the models
| being released now. They are delaying it to do some last minute
| benchmark hacking.
| Y_Y wrote:
| My hobby: monetizing cynicism.
|
| I go on Polymarket and find things that would make me happy or
| optimistic about society and tech, and then bet a couple of
| dollars (of some shitcoin) against them.
|
| e.g. OpenAI releasing an open weights model before September is
| trading at 81% at time of writing -
| https://polymarket.com/event/will-openai-release-an-open-sou...
|
| Last month I was up about ten bucks because OpenAI wasn't open,
| the ceasefire wasn't a ceasefire, and the climate metrics got
| worse. You can't hedge away all the existential despair, but you
| can take the sting out of it.
| heeton wrote:
| My friend does this and calls it "hedging humanity". Every time
| some big political event has happened that bums me out, he's
| made a few hundred.
| hereme888 wrote:
| people still use crypto? I thought the hype died around the
| time when AI boomed.
| ben_w wrote:
| Unfortunately crypto hype is still high; and I think still on
| the up, but that's vibes not market analysis.
| unsupp0rted wrote:
| Bitcoin is higher than ever. People can't wait until it gets
| high enough that they can sell it for dollars, and use those
| dollars to buy things and make investments in things that are
| valuable.
| esperent wrote:
| > Bitcoin is higher than ever
|
| That's just speculation though. I saw a cynical comment on
| reddit yesterday that unfortunately made a lot of sense.
| Many people now are just so certain that the future of work
| is not going to include many humans, so they're throwing
| everything into stocks and crypto, which is why they remain
| so high even in the face of so much political uncertainty.
| It's not that people are investing because they have hope.
| People are just betting everything as a last ditch survival
| attempt before the robots take over.
|
| Of course this is hyperbolic - market forces are never that
| simple. But I think there might be some truth to it.
| dmd wrote:
| What does 'just' mean here? The monetary value of a thing
| is what people will pay you for it. Full stop.
| esperent wrote:
| The question was about people using crypto to buy things.
| The person above me was implying that because it's going
| up in value, people are using it that way. I replied to
| say that it's (mostly) just speculation. Which is a kind
| of use, but not the one being implied.
| nl wrote:
| Wait until you hear about this thing called gold and how
| its price behaves during periods of uncertainty.
| esperent wrote:
| Gold (and land, jewels, antiques etc.) are bought and
| held during times of uncertainty because people believe
| they will retain their value through virtually anything.
| Stocks don't work that way. In times of uncertainty, gold
| should increase in value, stocks should decrease.
| matthewdgreen wrote:
| The irony is that these people think their rights as
| shareholders will be respected in this future world.
| literalAardvark wrote:
| It's a bit like when peasants had to sell their land for
| cash and ended up enslaved working their own land.
|
| It'll work at first but it's just what the parent poster
| said: a last ditch effort of the desperate and soon to be
| desperater.
| deadbabe wrote:
| Crypto is high because people keep believing some sucker
| in the future will buy it from them even higher. So far,
| they've been right all along. You really think crypto is
| ever going to pay some kind of dividend?
| lsaferite wrote:
| Isn't that equivalent to saying the USD won't pay
| dividends. Correct, but also not the point. I say this as
| someone with no crypto ownership.
| deadbabe wrote:
| There is always demand for USD, it is the only way to pay
| taxes in the US.
| esperent wrote:
| Does gold pay dividends? Would you say it's a bad
| investment?
|
| I'm also someone who owns zero crypto.
| deadbabe wrote:
| Gold is useful.
| yorwba wrote:
| People use crypto on Polymarket because it doesn't comply
| with gambling regulations, so in theory isn't allowed to have
| US customers. Using crypto as an intermediary lets Polymarket
| pretend not to know where the money is coming from. Though I
| think a more robust regulator would call them out on the
| large volume of betting on US politics on their platform...
| miki123211 wrote:
| > a more robust regulator would call them out
|
| Calling them out is one thing, but do you think the US
| could realistically stop them?
|
| I don't know much about Polymarket's governance structure;
| if it's a decentralized smart contract, the US is DOA. Even
| if it's not... the Pirate Bay wasn't, the US really tried
| to stop them, and they basically didn't get anywhere.
| yorwba wrote:
| Looking it up, it seems like the CEO actually got raided
| by the FBI last year:
| https://www.reuters.com/world/us/fbi-raids-polymarket-
| ceos-h... So maybe the wheels of justice are just
| grinding a bit slowly. The terms of use want to have
| Panamanian law apply https://polymarket.com/tos but that
| doesn't provide much protection when the company is
| physically operating in the US.
| amelius wrote:
| people use crypto for speculation, and for (semi)illegal
| purposes
|
| only a small percentage of use is for actual legitimate money
| transfers
| khurs wrote:
| "Gambling can be addictive. Please gamble responsibly. You must
| be 18 years or older to gamble. If you need help, please
| contact your local gambling advice group or your doctor"
| xnx wrote:
| > go on Polymarket and find things that would make me happy or
| optimistic about society and tech, and then bet a couple of
| dollars (of some shitcoin) against them.
|
| Classic win win bet. Your bet wins -> you make money (win).
| Your bet loses -> something good happened for society (win).
| fresh_broccoli wrote:
| Delays aside, I wonder what kind of license they're planning to
| use for their weights.
|
| Will it be restricted like Llama, or fully open like Whisper or
| Granite?
| Havoc wrote:
| Pointless security theatre. The community worked out long ago how
| to strip away any safeguards.
| thrance wrote:
| Whenever I read something similar I immediately remember how
| "Open"AI refused to release GTP2 XL at the time because it was
| "too powerful".
| qoez wrote:
| My pet theory is that they delayed this because Grok 4 released,
| and they explicitly want to _not_ be seen as competing with
| them by pulling the usual trick of releasing right around when
| Google does. Feels like a very Sam Altman move in my model of
| his mind.
| gnarlouse wrote:
| Honestly, they're distancing themselves optically/temporally from
| the HerrGrokler headlines.
| seydor wrote:
| > this is new for us
|
| So much for the company that should never be new to that
| jmugan wrote:
| What is their business purpose for releasing an open-weights
| model? How does it help them? I asked an LLM but it just said
| vague unconvincing things about ecosystem plays and fights for
| talent.
| brap wrote:
| PR
| macawfish wrote:
| Wow. Twitter is not a serious website anymore. Why are companies
| and professionals still using it? Is it really like that now,
| with all that noise from grok floating to the top?
| hansmayer wrote:
| Is it now coming before or after the release of AGI, which
| OpenAI, "knows how to build now" ?
| groggo wrote:
| why would OpenAI release an open weight model? Genuinely curious.
___________________________________________________________________
(page generated 2025-07-12 23:01 UTC)