[HN Gopher] OpenAI delays launch of open-weight model
___________________________________________________________________
OpenAI delays launch of open-weight model
Author : martinald
Score : 234 points
Date : 2025-07-12 01:07 UTC (21 hours ago)
(HTM) web link (twitter.com)
(TXT) w3m dump (twitter.com)
| stonogo wrote:
| we'll never hear about this again
| mystraline wrote:
| To be completely and utterly fair, I trust Deepseek and Qwen
| (Alibaba) more than American AI companies.
|
| American AI companies have shown they are money and compute
| eaters, and massively so at that. Billions later, and well, not
| much to show.
|
| But Deepseek cost $5M to develop, and made multiple novel ways to
| train.
|
| Oh, and their models and code are all FLOSS. The US companies are
| closed. Basically, the US AI companies are too busy treating each
| other as vultures.
| ryao wrote:
| Wasn't that figure just the cost of the GPUs and nothing else?
| rynn wrote:
| It was more than $5m
|
| https://interestingengineering.com/culture/deepseeks-ai-
| trai...
| rpdillon wrote:
| Yeah, I hate that this figure keeps getting thrown around.
| IIRC, it's the price of 2048 H800s for 2 months at
| $2/hour/GPU. If you consider months to be 30 days, that's
| around $5.7M, which lines up. What doesn't line up is
| ignoring the costs of facilities, salaries, non-cloud
| hardware, etc. which will dominate costs, I'd expect. $100M
| seems like a fairer estimate, TBH. The original paper had
| more than a dozen authors, and DeepSeek had about 150
| researchers working on R1, which supports the notion that
| personnel costs would likely dominate.
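|
| (Back-of-the-envelope check with those round numbers - my
| arithmetic, not DeepSeek's reported figure:)
|
|     gpus = 2048              # H800s
|     hours = 2 * 30 * 24      # two 30-day months
|     usd_per_gpu_hour = 2.0
|     # 5,898,240 -> the same ~$5-6M ballpark
|     print(gpus * hours * usd_per_gpu_hour)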
| moralestapia wrote:
| >ignoring the costs of facilities, salaries, non-cloud
| hardware, etc.
|
| If you lease, those costs are amortized. It was definitely
| more than $5M, but I don't think it was as high as $100M.
| All things considered, I still believe Deepseek was trained
| at one (perhaps two) orders of magnitude lower cost than
| other competing models.
| rpdillon wrote:
| Perhaps. Do you think DeepSeek made use of those
| competing models at all in order to train theirs?
| moralestapia wrote:
| I believe so, but have no proof obviously.
| 3eb7988a1663 wrote:
| That is also just the final production run. How many
| experimental runs were performed before starting the final
| batch? It could be some ratio like 10 hours of research to
| every one hour of final training.
| kamranjon wrote:
| Actually the majority of Google models are open source and they
| also were pretty fundamental in pushing a lot of the techniques
| in training forward - working in the AI space I've read quite a
| few of their research papers and I really appreciate what
| they've done to share their work and also release their models
| under licenses that allow you to use them for commercial
| purposes.
| simonw wrote:
| "Actually the majority of Google models are open source"
|
| That's not accurate. The Gemini family of models are all
| proprietary.
|
| Google's Gemma models (which are some of the best available
| local models) are open weights but not technically OSI-
| compatible open source - they come with usage restrictions:
| https://ai.google.dev/gemma/terms
| kamranjon wrote:
| You're ignoring the T5 series of models that were
| incredibly influential; the T5 models and their derivatives
| (FLAN-T5, Long-T5, ByT5, etc) have been downloaded millions
| of times on huggingface and are real workhorses. There are
| even variants still being produced within the last year or
| so.
|
| And yeah, the Gemma series is incredible and while maybe not
| meeting the standards of OSI - I consider them to be pretty
| open as far as local models go. And it's not just the
| standard Gemma variants, Google is releasing other
| incredible Gemma models that I don't think people have
| really even caught wind of yet like MedGemma, of which the
| 4b variant has vision capability.
|
| I really enjoy their contributions to the open source AI
| community and think it's pretty substantial.
| Aunche wrote:
| $5 million was the GPU-hour cost of a single training run.
| dumbmrblah wrote:
| Exactly. Not to minimize DeepSeek's tremendous achievement,
| but that $5 million was just for the training run, not the
| GPUs they purchased beforehand, nor all the OpenAI API calls
| they likely used to assist in synthetic data generation.
| IncreasePosts wrote:
| Deepseek R1 was trained at least partially on the output of
| other LLMs. So, it might have been much more expensive if they
| needed to do it themselves from scratch.
| nomel wrote:
| Lawsuit, since it was against OpenAI TOS:
| https://hls.harvard.edu/today/deepseek-chatgpt-and-the-
| globa...
| refulgentis wrote:
| > Billions later, and well, not much to show.
|
| This is obviously false, I'm curious why you included it.
|
| > Oh, and their models and code are all FLOSS.
|
| No?
| NitpickLawyer wrote:
| > But Deepseek cost $5M to develop, and made multiple novel
| ways to train
|
| This is highly contested, and was either a big misunderstanding
| by everyone reporting it, or maliciously placed there (by a
| quant company, right before the stock fell a lot for nvda and
| the rest) depending on who you ask.
|
| If we're being generous and assume no malicious intent (big
| if), anyone who has trained a big model can tell you that the
| cost of 1 run is useless in the big scheme of things. There is
| a lot of cost in getting there, in the failed runs, in the
| subsequent runs, and so on. The fact that R2 isn't there after
| ~6 months should say a lot. Sometimes you get a great training
| run, but no-one is looking at the failed ones and adding up
| that cost...
| jampa wrote:
| They were pretty explicit that this was only the cost in GPU
| hours to USD for the final run. Journalists and Twitter tech
| bros just saw an easy headline there. It's the same with
| Sandfall, the developer of Clair Obscur, where people say
| that the game was made by 30 people, when there were 200
| people involved.
| badsectoracula wrote:
| These "200 people" were counted from credits which list
| pretty much everyone who even sniffed at the general
| direction of the studio's direction. The studio itself is
| ~30 people (just went and check on their website, they have
| a team list with photos for everyone). The rest are
| contractors whose contributions usually vary wildly.
| Besides, credits are free so unless the the company are
| petty (see Rockstar not crediting people on their games if
| they leave before the game is released even if they worked
| on it for years) people err on the site on crediting
| everyone. Personally i've been credited on a game that used
| a library i wrote once and i learned about it years after
| the release.
|
| Most importantly those who mention that the game was made
| by 30 people do it to compare it with other much larger
| teams with hundreds if not thousands of people _and those
| teams use contractors too_!
| NitpickLawyer wrote:
| > They were pretty explicit that this was only the cost in
| GPU hours to USD for the final run.
|
| The researchers? Yes.
|
| What followed afterwards, I'm not so sure about. There were
| clearly some "cheap headlines" in the media, but there was
| also some weird coverage being pushed everywhere, from
| obscure TLDs, all pushing the same line: Nvidia is dead,
| DeepSeek is cheap, you can run it on a Raspberry Pi, etc.
| That _might_ have been a campaign designed to help short
| the stocks.
| buyucu wrote:
| Deepseek is far more worthy of the name OpenAI than Sam
| Altman's ClosedAI.
| baobabKoodaa wrote:
| > American AI companies have shown they are money and compute
| eaters
|
| Don't forget they also quite literally eat books
| knicholes wrote:
| Who is literally eating books?
| jasonjmcghee wrote:
| Parent is referencing the recent court case with Anthropic,
| and the legal requirement of not copying books, but
| consuming them - translating to Anthropic having to destroy
| every book it uses as input data in order to comply with
| said requirements.
| root_axis wrote:
| > _But Deepseek cost $5M to develop_
|
| Not true. It was $5M to train - it was many more millions in
| R&D.
| krackers wrote:
| Probably the results were worse than the K2 model released
| today. No serious engineer would say it's for "safety"
| reasons given that ablation nullifies any safety
| post-training.
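|
| (For context, "ablation" here refers to the "abliteration"
| trick the community applies to open-weight models. A rough
| conceptual sketch with numpy and made-up stand-in data, not
| any real model's API: estimate a "refusal direction" from
| activations and project it out of a weight matrix that
| writes into the residual stream.)
|
|     import numpy as np
|
|     # Stand-in hidden states at one layer (hypothetical data).
|     harmful_acts = np.random.randn(200, 4096)
|     harmless_acts = np.random.randn(200, 4096)
|
|     # Estimate the refusal direction as a difference of means.
|     r = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
|     r /= np.linalg.norm(r)
|
|     # Stand-in for a weight matrix whose output lives in the
|     # residual stream (shape: d_model x d_in).
|     W = np.random.randn(4096, 11008)
|
|     # Ablate: W' = (I - r r^T) W, so this matrix can no longer
|     # write anything along the refusal direction.
|     W_ablated = W - np.outer(r, r @ W)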
| simonw wrote:
| I'm expecting (and indeed hoping) that the open weights OpenAI
| model is a _lot_ smaller than K2. K2 is 1 trillion parameters
| and almost a terabyte to download! There's no way I'm running
| that on my laptop.
|
| I think the sweet spot for local models may be around the 20B
| size - that's Mistral Small 3.x and some of the Gemma 3 models.
| They're very capable and run in less than 32GB of RAM.
|
| I really hope OpenAI put one out in that weight class,
| personally.
| NitpickLawyer wrote:
| Early rumours (from a hosting company that apparently got
| early access) were that you'd need "multiple H100s to run
| it", so I doubt it's a Gemma / Mistral Small tier model.
| simonw wrote:
| I think you're right, I've seen a couple of other comments
| now that indicate the same thing.
| aabhay wrote:
| You will get a 20GB model. Distillation is so compute
| efficient that it's all but inevitable that if not OpenAI,
| numerous other companies will do it.
|
| I would rather have an open weights model that's the best
| possible one I can run and fine tune myself, allowing me to
| exceed SOTA models on the narrower domain my customers care
| about.
| dorkdork wrote:
| Maybe they're making last minute changes to compete with Grok 4?
| puttycat wrote:
| https://nitter.space/sama/status/1943837550369812814
| ryao wrote:
| Am I the only one who thinks mention of "safety tests" for LLMs
| is a marketing scheme? Cars, planes and elevators have safety
| tests. LLMs don't. Nobody is going to die if a LLM gives an
| output that its creators do not like, yet when they say "safety
| tests", they mean that they are checking to what extent the LLM
| will say things they do not like.
| eviks wrote:
| Why is your definition of safety so limited? Death isn't the
| only type of harm...
| ryao wrote:
| There are other forms of safety, but whether a digital parrot
| says something that people do not like is not a form of
| safety. They are abusing the term safety for marketing
| purposes.
| eviks wrote:
| You're abusing the terms by picking either the overly
| limited ("death") or overly expansive ("not like")
| definitions to fit your conclusion. Unless you reject the
| fact that harm can come from words/images, a parrot can
| parrot harmful words/images, and so be unsafe.
| ryao wrote:
| The maxim "sticks and stones can break my bones, but
| words can never hurt me" comes to mind here. That said, I
| think this misses the point that the LLM is not a
| gatekeeper to any of this.
| eviks wrote:
| Don't let your mind potential be limited by such
| primitive slogans!
| jiggawatts wrote:
| I find it particularly irritating that the models are so
| overly puritan that they refuse to translate subtitles
| because they mention violence.
| jazzyjackson wrote:
| it's like complaining about bad words in the dictionary
|
| the bot has no agency, the bot isn't doing anything,
| people talk to themselves, augmenting their chain of
| thought with an automated process. If the automated
| process is acting in an undesirable manner, the human
| that started the process can close the tab.
|
| Which part of this is dangerous or harmful?
| ks2048 wrote:
| You could be right about this being an excuse for some other
| reason, but lots of software has "safety tests" beyond life or
| death situations.
|
| Most companies, for better or worse (I say for better) don't
| want their new chatbot to be a RoboHitler, for example.
| ryao wrote:
| It is possible to turn any open weight model into that with
| fine tuning. It is likely possible to do that with closed
| weight models, even when there is no creator provided sandbox
| for fine tuning them, through clever prompting and trying
| over and over again. It is unfortunate, but there really is
| no avoiding that.
|
| That said, I am happy to accept the term safety used in other
| places, but here it just seems like a marketing term. From my
| recollection, OpenAI had made a push to get regulation that
| would stifle competition by talking about these things as
| dangerous and needing safety. Then they backtracked somewhat
| when they found the proposed regulations would restrict
| themselves rather than just their competitors. However, they
| are still pushing this safety narrative that was never really
| appropriate. They have a term for this called alignment and
| what they are doing are tests to verify alignment in areas
| that they deem sensitive so that they have a rough idea to
| what extent the outputs might contain things that they do not
| like in those areas.
| natrius wrote:
| An LLM can trivially instruct someone to take medications with
| adverse interactions, steer a mental health crisis toward
| suicide, or make a compelling case that a particular ethnic
| group is the cause of your society's biggest problem so they
| should be eliminated. Words can't kill people, but words can
| definitely lead to deaths.
|
| That's not even considering tool use!
| ryao wrote:
| This is analogous to saying a computer can be used to do bad
| things if it is loaded with the right software.
| Coincidentally, people do load computers with the right
| software to do bad things, yet people are overwhelmingly
| opposed to measures that would stifle such things.
|
| If you hook up a chat bot to a chat interface, or add tool
| use, it is probable that it will eventually output something
| that it should not and that output will cause a problem.
| Preventing that is an unsolved problem, just as preventing
| people from abusing computers is an unsolved problem.
| ronsor wrote:
| As the runtime of any program approaches infinity, the
| probability of the program behaving in an undesired manner
| approaches 1.
| ryao wrote:
| That is not universally true. The yes program is a
| counter example:
|
| https://www.man7.org/linux/man-pages/man1/yes.1.html
| cgriswald wrote:
| Devil's advocate:
|
| (1) Execute yes (with or without arguments, whatever you
| desire).
|
| (2) Let the program run as long as you desire.
|
| (3) When you stop desiring the program to spit out your
| argument,
|
| (4) Stop the program.
|
| Between (3) and (4) some time must pass. During this time
| the program is behaving in an undesired way. Ergo, yes is
| not a counter example of the GP's claim.
| ryao wrote:
| I upvoted your reply for its clever (ab)use of ambiguity
| to say otherwise to a fairly open and shut case.
|
| That said, I suspect the other person was actually
| agreeing with me, and tried to state that software
| incorporating LLMs would eventually malfunction by
| stating that this is true for all software. The yes
| program was an obvious counter example. It is almost
| certain that all LLMs will eventually generate some
| output that is undesired given that it is determining the
| next token to output based on probabilities. I say almost
| only because I do not know how to prove the conjecture.
| There is also some ambiguity in what is a LLM, as the
| first L means large and nobody has made a precise
| definition of what is large. If you look at literature
| from several years ago, you will find people saying 100
| million parameters is large, while some people these days
| will refuse to use the term LLM to describe a model of
| that size.
| cgriswald wrote:
| Thanks, it was definitely tongue-in-cheek. I agree with
| you on both counts.
| pesfandiar wrote:
| Society has accepted that computers bring more benefit
| than harm, but LLMs could still get pushback due to bad PR.
| 0points wrote:
| > This is analogous to saying a computer can be used to do
| bad things if it is loaded with the right software.
|
| It's really not. Parent's examples are all out-of-the-box
| behavior.
| 123yawaworht456 wrote:
| does your CPU, your OS, your web browser come with ~~built-in
| censorship~~ safety filters too?
|
| AI 'safety' is one of the most neurotic twitter-era nanny
| bullshit things in existence, blatantly obviously invented to
| regulate small competitors out of existence.
| no_wizard wrote:
| It isn't. This is dismissive without first thinking through
| the difference of application.
|
| AI safety is about proactive safety. An example: if an AI
| model is used to screen hiring applications, making sure it
| doesn't have any weighted racial biases.
|
| The difference here is that it's not reactive. Reading a
| book with a racial bias would be the inverse; where you
| would be reacting to that information.
|
| That's the basis of proper AI safety in a nutshell
| ryao wrote:
| As someone who has reviewed people's resumes that they
| submitted with job applications in the past, I find it
| difficult to imagine this. The resumes that I saw had no
| racial information. I suppose the names might have some
| correlation to such information, but anyone feeding these
| things into a LLM for evaluation would likely censor the
| name to avoid bias. I do not see an opportunity for
| proactive safety in the LLM design here. It is not even
| clear that they even are evaluating whether there is bias
| in such a scenario when someone did not properly sanitize
| inputs.
| thayne wrote:
| > but anyone feeding these things into a LLM for
| evaluation would likely censor the name to avoid bias
|
| That should really be done for humans reviewing the
| resumes as well, but in practice that isn't done as much
| as it should be
| kalkin wrote:
| > I find it difficult to imagine this
|
| Luckily, this is something that can be studied and has
| been. Sticking a stereotypically Black name on a resume
| on average substantially decreases the likelihood that
| the applicant will get past a resume screen, compared to
| the same resume with a generic or stereotypically White
| name:
|
| https://www.npr.org/2024/04/11/1243713272/resume-bias-
| study-...
| bigstrat2003 wrote:
| That is a terrible study. The stereotypically black names
| are not just stereotypically black, they are
| stereotypical for the underclass of trashy people. You
| would also see much higher rejection rates if you slapped
| stereotypical white underclass names like "Bubba" or
| "Cleetus" on resumes. As is almost always the case, this
| claim of racism in America is really classism and has
| little to do with race.
| stonogo wrote:
| "Names from N.C. speeding tickets were selected from the
| most common names where at least 90% of individuals are
| reported to belong to the relevant race and gender
| group."
|
| Got a better suggestion?
| selfhoster11 wrote:
| If you're deploying LLM-based decision making that
| affects lives, you should be the one held responsible for
| the results. If you don't want to do due diligence on
| automation, you can screen manually instead.
| derektank wrote:
| iOS certainly does by limiting you to the App Store and
| restricting what apps are available there
| selfhoster11 wrote:
| They have been forced to open up to alternative stores in
| the EU. This is unequivocally a good thing, and a victory
| for consumer rights.
| jowea wrote:
| Social media does. Even person to person communication has
| laws that apply to it. And the normal self-censorship a
| normal person will engage in.
| 123yawaworht456 wrote:
| okay. and? there are no AI 'safety' laws in the US.
|
| without OpenAI, Anthropic and Google's fearmongering, AI
| 'safety' would exist only in the delusional minds of
| people who take sci-fi way too seriously.
|
| https://en.wikipedia.org/wiki/Regulatory_capture
|
| for fuck's sake, how more obvious could they be? sama
| himself went on a world tour begging for laws and
| regulations, only to purge safetyists a year later. if
| you believe that he and the rest of his ilk are motivated
| by anything other than profit, smh tbh fam.
|
| it's all deceit and delusion. China will crush them all,
| inshallah.
| bongodongobob wrote:
| Books can do this too.
| derektank wrote:
| Major book publishers have sensitivity readers that
| evaluate whether or not a book can be "safely" published
| nowadays. And even historically there have always been at
| least a few things publishers would refuse to print.
| selfhoster11 wrote:
| All it means is that the Overton window on "should we
| censor speech" has shifted in the direction of less
| freedom.
| snozolli wrote:
| GP said major publishers. There's nothing stopping you
| from printing out your book and spiral binding it by
| hand, if that's what it takes to get your ideas into the
| world. Companies having standards for what _they_ publish
| isn't censorship.
| ben_w wrote:
| There's a reason the inheritors of the copyright* refused
| to allow more copies of Mein Kampf to be produced until
| that copyright expired.
|
| * the federal state of Bavaria
| nofriend wrote:
| Was there? It seems like that was the perfect natural
| experiment then. So what was the outcome? Was there a
| sudden rash of holocausts the year that publishing
| started again?
| ben_w wrote:
| > Was there a sudden rash of holocausts the year that
| publishing started again?
|
| Bit worse than the baseline, I'd say. You judge:
| https://en.wikipedia.org/wiki/List_of_genocides
|
| 2016 was also first Trump, Brexit, and roughly when the
| AfD (who are metaphorically wading ankle deep in the
| waters of legal trouble of this topic) made the
| transition from "joke party" to "political threat".
| bilsbie wrote:
| PDFs can do this too.
| jiggawatts wrote:
| Twitter does it at scale.
| xigoi wrote:
| In such a case, the author of the PDF can be held
| responsible.
| bilsbie wrote:
| Radical idea: let's hold the reader responsible for the
| actions they take from the material.
| amelius wrote:
| Radical rebuttal of this idea: if you hire an assassin
| then you are responsible too (even more so, actually),
| even if you only told them stuff over the phone.
| bilsbie wrote:
| I don't see the connection. Publishing != hiring.
| amelius wrote:
| Then I don't see the connection in your idea. Answering
| questions != publishing.
| johnfn wrote:
| So we should hold my grandmother responsible for the
| phishing emails she gets? Hmm.
| thayne wrote:
| Part of the problem is due to the marketing of LLMs as more
| capable and trustworthy than they really are.
|
| And the safety testing actually makes this worse, because it
| leads people to trust that LLMs are less likely to give
| dangerous advice, when they could still do so.
| jdross wrote:
| Spend 15 minutes talking to a person in their 20's about
| how they use ChatGPT to work through issues in their
| personal lives and you'll see how much they already trust
| the "advice" and other information produced by LLMs.
|
| Manipulation is a genuine concern!
| justacrow wrote:
| It's not just young people. My boss (originally a
| programmer) agreed with me that there's lots of problems
| using ChatGPT for our products and programs as it gives
| the wrong answers too often, but then 30 seconds later
| told me that it was apparently great at giving medical
| advice.
|
| ...later someone higher-up decided that it's actually
| great at programming as well, and so now we all believe
| it's incredibly useful and necessary for us to be able to
| do our daily work
| literalAardvark wrote:
| Most doctors will prescribe antibiotics for viral
| infections just to get you out and the next guy in, they
| have zero interest in sitting there to troubleshoot with
| you.
|
| For this reason o3 is way better than most of the doctors
| I've had access to, to the point where my PCP just writes
| whatever I brought in because she can't follow 3/4 of it.
|
| Yes, the answers are often wrong and incomplete, and it's
| up to you to guide the model to sort it out, but it's
| just like vibe coding: if you put in the steering effort,
| you can get a decent output.
|
| Would it be better if you could hire an actual
| professional to do it? Of course. But most of us are
| priced out of that level of care.
| andsoitis wrote:
| > Most doctors will prescribe antibiotics for viral
| infections just to get you out and the next guy in
|
| Where do you get this data from?
| somenameforme wrote:
| Family in my case. There are two reasons they do this. A
| lot of people like medicine - they think it justifies the
| cost of the visit, and there's a real placebo effect
| (which is not an oxymoron as many might think).
|
| The second is that many viral infections can, in rare
| scenarios, lead to bacterial infections. For instance a
| random flu can leave one more susceptible to developing
| pneumonia. Throwing antibiotics at everything is a
| defensive measure to help ward off malpractice lawsuits.
| Even if frivolous, it's something no doctor wants to deal
| with, but some absurd number of them - something like 1 in
| 15 per year - will.
| literalAardvark wrote:
| Lived experience. I'm not in the US and neither are most
| doctors.
| bdangubic wrote:
| I can co-sign this being bi-coastal. in the US not once
| have I or my 12-year old kid been prescribed antibiotics.
| on three occasions in Europe I had to take my kid to the
| doctor and each time antibiotics were prescribed (never
| consumed)
| asadotzler wrote:
| Your claim of _most_ here is not only unsupported, it's
| completely wrong.
| literalAardvark wrote:
| I'd like to see your support for that very confident
| take.
|
| In my experience it's not only correct, but so common
| that it's hard not to get a round of antibiotics to go.
|
| The only caveat is that I'm in the EU, not the US.
| DiscourseFan wrote:
| LLMs are really good at medical diagnostics, though...
| jpeeler wrote:
| Netflix needs to do a Black Mirror episode where either a
| sentient AI pretends that it's "dumber" than it is while
| secretly plotting to overthrow humanity, or an LLM is
| hacked by deep-state actors and provides similarly
| manipulated advice.
| seam_carver wrote:
| One of the story arcs in "The Phoenix" by Osamu Tezuka is
| on a similar topic.
| brookst wrote:
| Can you point to a specific bit of marketing that says to
| take whatever medications a LLM suggests, or other similar
| overreach?
|
| People keep talking about this "marketing", and I have yet
| to see a single example.
| pyuser583 wrote:
| The problem is "safety" prevents users from using LLMs to
| meet their requirements.
|
| We typically don't critique the requirements of users, at
| least not in functionality.
|
| The marketing angle is that this measure is needed because
| LLMs are "so powerful it would be unethical not to!"
|
| AI marketers are continually emphasizing how powerful their
| software is. "Safety" reinforces this.
|
| "Safety" also brings up many of the debates
| "mis/disinformation" brings up. Misinformation concerns
| consistently overestimate the power of social media.
|
| I'd feel much better if "safety" focused on preventing
| unexpected behavior, rather than evaluating the motives of
| users.
| selfhoster11 wrote:
| Yes, and a table saw can take your hand. As can a whole
| variety of power tools. That does not render them illegal to
| sell to adults.
| ZiiS wrote:
| It does render them illegal to sell without studying their
| safety.
| vntok wrote:
| An interesting comparison.
|
| Table saws sold all over the world are inspected and
| certified by trusted third parties to ensure they operate
| safely. They are illegal to sell without the approval seal.
|
| Moreover, table saws sold in the United States & EU (at
| least) have at least 3 safety features (riving knife, blade
| guard, antikickback device) designed to prevent personal
| injury while operating the machine. They are illegal to
| sell without these features.
|
| Then of course there are additional devices like sawstop,
| but it is not mandatory yet as far as I'm aware. Should be
| in a few years though.
|
| LLMs have none of those board labels or safety features, so
| I'm not sure what your point was exactly?
| xiphias2 wrote:
| They are somewhat self regulated, as they can cause
| permanent damage to the company that releases them, and
| they are meant for general consumers without any
| training, unlike table saws that are meant for trained
| people.
|
| An example is the first Microsoft bot that started to go
| extreme rightwing when people realized how to make it go
| that direction. Grok had a similar issue recently.
|
| Google had racial issues with its image generation (and
| earlier with image detection). Again something that
| people don't forget.
|
| Also an OpenAI 4o release was encouraging stupid things
| to people when they asked stupid questions and they just
| had to roll it back recently.
|
| Of course I'm not saying that that's the real reason
| (somehow they never say that the problem is with
| performance for not releasing stuff), but safety matters
| with consumer products.
| latexr wrote:
| > They are somewhat self regulated, as they can cause
| permament damage to the company that releases them
|
| And then you proceed to give a number of examples of that
| _not_ happening. Most people already forgot those.
| andsoitis wrote:
| An LLM is not gonna chop off your limb. You can't use it
| to attack someone.
| vntok wrote:
| An LLM is gonna convince you to treat your wound with
| quack remedies instead of seeing a doctor, which will
| eventually result in the limb being chopped off to save
| you from gangrene.
|
| You can perfectly use an LLM to attack someone. Your
| sentence is very weird as it comes off as a denial of
| things that have been happening for months and are
| ramping up. Examples abound: generate scam letters, find
| security flaws in a codebase, extract personal
| information from publicly-available-yet-not-previously-
| known locations, generate attack software customized for
| particular targets, generate untraceable hit offers and
| then post them on anonymized Internet services on your
| behalf, etc. etc.
| andsoitis wrote:
| > You can perfectly use an LLM to attack someone.
|
| The act of generating content is not the attack on
| someone.
|
| That would be like saying that if I write something with
| pen and paper but don't expose anyone to it, I have
| attacked someone.
| conception wrote:
| No but they have guards on them.
| anonymoushn wrote:
| The closed weights models from OpenAI already do these things
| though
| buyucu wrote:
| At the end of the day an LM is just a machine that talks. It
| might say silly things, bad things, nonsensical things, or
| even crazy insane things. But at the end of the day it just
| talks. Words don't kill.
|
| LM safety is just a marketing gimmick.
| hnaccount_rng wrote:
| We absolutely regulate which words you can use in certain
| areas. Take instructions on medicine for one example
| andsoitis wrote:
| > An LLM can trivially instruct someone to take medications
| with adverse interactions,
|
| What's an example of such a medication that does not require
| a prescription?
| edoceo wrote:
| Oil of wintergreen?
| pixl97 wrote:
| How about just telling people that drinking grapefruit
| juice with their liver medicine is a good idea and to
| ignore their doctor.
| andsoitis wrote:
| > liver medicine
|
| What is an example liver medicine that does not require a
| prescription?
| mdemare wrote:
| Tylenol.
| andsoitis wrote:
| > Tylenol
|
| This drug comes with warnings: "Taking acetaminophen and
| drinking alcohol in large amounts can be risky. Large
| amounts of either of these substances can cause liver
| damage. Acetaminophen can also interact with warfarin,
| carbamazepine (Tegretol), and cholestyramine. It can also
| interact with antibiotics like isoniazid and rifampin."
|
| It is on the consumer to read it.
| andsoitis wrote:
| > An LLM can trivially make a compelling case that a
| particular ethnic group is the cause of your society's
| biggest problem so they should be eliminated
|
| This is an extraordinary claim.
|
| I trust that the vast majority of people are good and would
| ignore such garbage.
|
| Even assuming that an LLM can trivially build a compelling
| case to convince someone who is not already murderous to go
| on a killing spree to kill a large group of people, one
| killer has limited impact radius.
|
| For contrast, many books and religious texts, have vastly
| more influence and convincing power over huge groups of
| people. And they have demonstrably caused widespread death or
| other harm. And yet we don't censor or ban them.
| amelius wrote:
| Yeah, give it access to some bitcoin and the internet, and it
| can definitely cause deaths.
| recursive wrote:
| I also think it's marketing but kind of for the opposite
| reason. Basically I don't think any of the current technology
| can be made safe.
| nomel wrote:
| Yes, perfection is difficult, but it's relative. It can
| definitely be made much safer. Looking at the analysis of pre
| vs post alignment makes this obvious, including when the raw
| unaligned models are compared to "uncensored" models.
| jrflowers wrote:
| > Am I the only one who thinks mention of "safety tests" for
| LLMs is a marketing scheme?
|
| It is. It is also part of Sam Altman's whole thing about being
| _the_ guy capable of harnessing the theurgical magicks of his
| chat bot without shattering the earth. He periodically goes on
| Twitter or a podcast or whatever and reminds everybody that he
| will yet again single-handedly save mankind. Dude acts like
| he's Buffy the Vampire Slayer
| olalonde wrote:
| Especially since "safety" in this context often just means
| making sure the model doesn't say things that might offend
| someone or create PR headaches.
| SV_BubbleTime wrote:
| Don't draw pictures of celebrities.
|
| Don't discuss making drugs or bombs.
|
| Don't call yourself MechaHitler... which, I don't care,
| that whole scenario was objectively funny in its sheer
| ridiculousness.
| jekwoooooe wrote:
| Sure it's funny until some mentally unstable Nazi
| sympathizer goes and shoots up another synagogue. So funny.
| halfjoking wrote:
| It's overblown. Elon shipped Hitler grok straight to prod
|
| Nobody died
| pona-a wrote:
| Playing devil's advocate, what if it was more subtle?
|
| Prolonged use of conversational programs does reliably induce
| certain mental states in vulnerable populations. When ChatGPT
| got a bit too agreeable, that was enough for a man to kill
| himself in a psychotic episode [1]. I don't think this
| magnitude of delusion was possible with ELIZA, even if the
| fundamental effect remains the same.
|
| Could this psychosis be politically weaponized by biasing the
| model to include certain elements in its responses? We know
| this rhetoric works: cults have been using love-bombing,
| apocalypticism, us-vs-them dynamics, assigned special
| missions, and isolation from external support systems to
| great success. What we haven't seen is what happens when
| everyone has a cult recruiter in their pocket, waiting for a
| critical moment to offer support.
|
| ChatGPT has an estimated 800 million weekly active users [2].
| How many of them would be vulnerable to indoctrination? About
| 3% of the general population has been involved in a cult [3],
| but that might be a reflection of conversion efficiency, not
| vulnerability. Even assuming 5% are vulnerable, that's still
| 40 million people ready to sacrifice their time, possessions,
| or even their lives in their delusion.
|
| [1] https://www.rollingstone.com/culture/culture-
| features/chatgp...
|
| [2] https://www.forbes.com/sites/martineparis/2025/04/12/chat
| gpt...
|
| [3] https://www.peopleleavecults.com/post/statistics-on-cults
| stogot wrote:
| You're worried about indoctrination in an LLM but it starts
| much earlier than that. The school system is indoctrination
| of our youngest minds, both today in the West and in its
| Prussian origins.
|
| https://today.ucsd.edu/story/education-systems-were-first-
| de...
|
| We should fix both systems. I don't want Altman's or Musk's
| opinions doing the indoctrinating.
| simianwords wrote:
| I hope the same people questioning AI safety (which is
| reasonable) don't also hold concerns about Grok due to the
| recent incident.
|
| You have to understand that a lot of people do care about these
| kind of things.
| ignoramous wrote:
| > _Nobody is going to die_
|
| Callous. Software does have real impact on real people.
|
| Ex: https://news.ycombinator.com/item?id=44531120
| layer8 wrote:
| It's about safety for the LLM provider, not necessarily the
| user.
| stogot wrote:
| At my company (which produces models) almost all the
| responsible AI jazz is about DEI and banning naughty words.
| Little action on preventing bad outcomes.
| etaioinshrdlu wrote:
| It's worth remembering that the safety constraints can be
| successfully removed, as demonstrated by uncensored fine-tunes of
| Llama.
| adidoit wrote:
| Not sure if it's coincidental that OpenAI's open weights release
| got delayed right after an ostensibly excellent open weights
| model (Kimi K2) got released today.
|
| https://moonshotai.github.io/Kimi-K2/
|
| OpenAI know they need to raise the bar with their release. It
| can't be a middle-of-the-pack open weights model.
| lossolo wrote:
| This could be it, especially since they announced last week
| that it would be the best open-source model.
| reactordev wrote:
| Technically they were right when they said it, in their
| minds. Things are moving so fast that in a week, it will be
| true again.
| sigmoid10 wrote:
| They might also be focusing all their work on beating Grok 4
| now, since xAI has a significant edge in accumulating computing
| power and they opened a considerable gap in raw intelligence
| tests like ARC and HLE. OpenAI is in this to win the
| competitive race, not the open one.
| unsupp0rted wrote:
| > They might also be focusing all their work on beating Grok
| 4 now,
|
| With half the key team members they had a month prior
| sigmoid10 wrote:
| I'm starting to think talent is way less concentrated in
| these individuals than execs would have investors believe.
| While all those people who left OpenAI certainly have the
| ability to raise ridiculous sums of venture capital in all
| sorts of companies, Anthropic remains the only offspring
| that has actually reached a level where they can go head-
| to-head with OpenAI. Zuck now spending billions on
| snatching those people seems more like a move out of
| desperation than a real plan.
| agentcoops wrote:
| At this point, it seems to be more engineering throughput
| that will decide short to medium term outcomes. I've yet
| to see a case where an IC who took a position only
| because of an, in fact, outrageous compensation package
| (especially one not directly tied to long-term company
| performance through equity) was ever productive again.
| Meta certainly doesn't strike me as a company that
| attracts talent for their "mission."
|
| TLDR Zuck's recent actions definitely smell like a
| predictable failure driven by desperation to me.
| gdbsjjdn wrote:
| They've kind of played themselves making "genius
| engineers" their competitive advantage. Anyone can hire
| those engineers!
| macawfish wrote:
| Yet it suspiciously can't draw a pelican?
| mhuffman wrote:
| simonw is going to force every competitive LLM to over-
| ingest cartoon svg pelicans before this is over!
| ethbr1 wrote:
| Cue unpublished battery of '{animal} riding
| {motiveDevice}' real benchmarks behind the scenes.
| bilsbie wrote:
| Btw why is there no k2 discussion on HN? Isn't it pretty huge
| news?
| always_imposter wrote:
| Had to search for the discussion, it's here; seems like
| nobody noticed it and it only got a couple hundred upvotes.
|
| Here: https://news.ycombinator.com/item?id=44533403
| homebrewer wrote:
| You've been shadowbanned for saying some things that go
| against the prevailing groupthink, so all your comments
| within the last couple of months are invisible for most
| users.
|
| I really think it's disrespectful towards honest users
| (excluding spammers and obvious trolls), but I don't pay
| HN's moderation bills...
| Alifatisk wrote:
| How can you tell the user has been shadowbanned?
| gruez wrote:
| check his comment history. it's all [flagged]
| otterley wrote:
| Why don't you start one?
| Alifatisk wrote:
| There is, but it's not on the front page so you don't find it
| unless you go through multiple pages or manually search it
| up.
|
| Moonshot AI has released banger models without much noise
| about it. Like for example Kimi K1.5, it was quite impressive
| at the time
| segmondy wrote:
| Probably because maybe 1 or 2 folks on here can run it? It's
| a 1000B-parameter model; at 16-bit precision you need about
| 2000GB of GPU VRAM to run it. Or about 80 5090s hooked up to
| the same machine. Or 20 of them to run it in Q2.
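|
| (Rough weights-only math behind those numbers - a sketch
| with round figures, ignoring KV cache and other runtime
| overhead:)
|
|     params = 1e12                 # ~1T parameters
|     fp16_gb = params * 2 / 1e9    # 2 bytes/param -> ~2000 GB
|     q2_gb = params * 0.25 / 1e9   # ~2 bits/param -> ~250 GB
|     print(fp16_gb, q2_gb)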
| jekwoooooe wrote:
| Every openai model since gpt4 has been behind the curve by
| miles
| buyucu wrote:
| Probably ClosedAI's model was not as good as some of the models
| being released now. They are delaying it to do some last minute
| benchmark hacking.
| Y_Y wrote:
| My hobby: monetizing cynicism.
|
| I go on Polymarket and find things that would make me happy or
| optimistic about society and tech, and then bet a couple of
| dollars (of some shitcoin) against them.
|
| e.g. OpenAI releasing an open weights model before September is
| trading at 81% at time of writing -
| https://polymarket.com/event/will-openai-release-an-open-sou...
|
| Last month I was up about ten bucks because OpenAI wasn't open,
| the ceasefire wasn't a ceasefire, and the climate metrics got
| worse. You can't hedge away all the existential despair, but you
| can take the sting out of it.
| heeton wrote:
| My friend does this and calls it "hedging humanity". Every time
| some big political event has happened that bums me out, he's
| made a few hundred.
| hereme888 wrote:
| people still use crypto? I thought the hype died around the
| time when AI boomed.
| ben_w wrote:
| Unfortunately crypto hype is still high; and I think still on
| the up, but that's vibes not market analysis.
| unsupp0rted wrote:
| Bitcoin is higher than ever. People can't wait until it gets
| high enough that they can sell it for dollars, and use those
| dollars to buy things and make investments in things that are
| valuable.
| esperent wrote:
| > Bitcoin is higher than ever
|
| That's just speculation though. I saw a cynical comment on
| reddit yesterday that unfortunately made a lot of sense.
| Many people now are just so certain that the future of work
| is not going to include many humans, so they're throwing
| everything into stocks and crypto, which is why they remain
| so high even in the face of so much political uncertainty.
| It's not that people are investing because they have hope.
| People are just betting everything as a last ditch survival
| attempt before the robots take over.
|
| Of course this is hyperbolic - market forces are never that
| simple. But I think there might be some truth to it.
| dmd wrote:
| What does 'just' mean here? The monetary value of a thing
| is what people will pay you for it. Full stop.
| esperent wrote:
| The question was about people using crypto to buy things.
| The person above me was implying that because it's going
| up in value, people are using it that way. I replied to
| say that it's (mostly) just speculation. Which is a kind
| of use, but not the one being implied.
| nl wrote:
| Wait until you hear about this thing called gold and how
| its price behaves during periods of uncertainty.
| esperent wrote:
| Gold (and land, jewels, antiques etc.) are bought and
| held during times of uncertainty because people believe
| they will retain their value through virtually anything.
| Stocks don't work that way. In times of uncertainty, gold
| should increase in value, stocks should decrease.
| matthewdgreen wrote:
| The irony is that these people think their rights as
| shareholders will be respected in this future world.
| literalAardvark wrote:
| It's a bit like when peasants had to sell their land for
| cash and ended up enslaved working their own land.
|
| It'll work at first but it's just what the parent poster
| said: a last ditch effort of the desperate and soon to be
| desperater.
| deadbabe wrote:
| Crypto is high because people keep believing some sucker
| in the future will buy it from them even higher. So far,
| they've been right all along. You really think crypto is
| ever going to pay some kind of dividend?
| lsaferite wrote:
| Isn't that equivalent to saying the USD won't pay
| dividends. Correct, but also not the point. I say this as
| someone with no crypto ownership.
| deadbabe wrote:
| There is always demand for USD, it is the only way to pay
| taxes in the US.
| esperent wrote:
| Does gold pay dividends? Would you say it's a bad
| investment?
|
| I'm also someone who owns zero crypto.
| deadbabe wrote:
| Gold is useful.
| yorwba wrote:
| People use crypto on Polymarket because it doesn't comply
| with gambling regulations, so in theory isn't allowed to have
| US customers. Using crypto as an intermediary lets Polymarket
| pretend not to know where the money is coming from. Though I
| think a more robust regulator would call them out on the
| large volume of betting on US politics on their platform...
| miki123211 wrote:
| > a more robust regulator would call them out
|
| Calling them out is one thing, but do you think the US
| could realistically stop them?
|
| I don't know much about Polymarket's governance structure;
| if it's a decentralized smart contract, the US is DOA. Even
| if it's not... the Pirate Bay wasn't, the US really tried
| to stop them, and they basically didn't get anywhere.
| yorwba wrote:
| Looking it up, it seems like the CEO actually got raided
| by the FBI last year:
| https://www.reuters.com/world/us/fbi-raids-polymarket-
| ceos-h... So maybe the wheels of justice are just
| grinding a bit slowly. The terms of use want to have
| Panamanian law apply https://polymarket.com/tos but that
| doesn't provide much protection when the company is
| physically operating in the US.
| amelius wrote:
| people use crypto for speculation, and for (semi)illegal
| purposes
|
| only a small percentage of use is for actual legitimate money
| transfers
| khurs wrote:
| "Gambling can be addictive. Please gamble responsibly. You must
| be 18 years or older to gamble. If you need help, please
| contact your local gambling advice group or your doctor"
| xnx wrote:
| > go on Polymarket and find things that would make me happy or
| optimistic about society and tech, and then bet a couple of
| dollars (of some shitcoin) against them.
|
| Classic win win bet. Your bet wins -> you make money (win).
| Your bet loses -> something good happened for society (win).
| fresh_broccoli wrote:
| Delays aside, I wonder what kind of license they're planning to
| use for their weights.
|
| Will it be restricted like Llama, or fully open like Whisper or
| Granite?
| Havoc wrote:
| Pointless security theatre. The community worked out long ago how
| to strip away any safeguards.
| thrance wrote:
| Whenever I read something similar I immediately remember how
| "Open"AI refused to release GTP2 XL at the time because it was
| "too powerful".
| qoez wrote:
| My pet theory is that they delayed this because Grok 4 released,
| and they explicitly want to _not_ be seen as competing with
| them by pulling the usual trick of releasing right around when
| Google does. Feels like a very Sam Altman move in my model of
| his mind.
| gnarlouse wrote:
| Honestly, they're distancing themselves optically/temporally from
| the HerrGrokler headlines.
| seydor wrote:
| > this is new for us
|
| So much for the company that should never be new to that
| jmugan wrote:
| What is their business purpose for releasing an open-weights
| model? How does it help them? I asked an LLM but it just said
| vague unconvincing things about ecosystem plays and fights for
| talent.
| brap wrote:
| PR
| macawfish wrote:
| Wow. Twitter is not a serious website anymore. Why are companies
| and professionals still using it? Is it really like that now,
| with all that noise from grok floating to the top?
| hansmayer wrote:
| Is it now coming before or after the release of AGI, which
| OpenAI, "knows how to build now" ?
| groggo wrote:
| why would OpenAI release an open weight model? Genuinely curious.
___________________________________________________________________
(page generated 2025-07-12 23:01 UTC)