[HN Gopher] The surprise deprecation of GPT-4o for ChatGPT consumers
___________________________________________________________________
The surprise deprecation of GPT-4o for ChatGPT consumers
Author : tosh
Score : 238 points
Date : 2025-08-08 18:04 UTC (4 hours ago)
(HTM) web link (simonwillison.net)
(TXT) w3m dump (simonwillison.net)
| tosh wrote:
| would have been smart to keep them around for a while and just
| hide them (a bit like in the pro plan, but less hidden)
|
| and then phase them out over time
|
| would have reduced usage by 99% anyway
|
| now it all distracts from the gpt5 launch
| Syntonicles wrote:
| Is the new model significantly more efficient or something?
| Maybe using distillation? I haven't looked into it, I just
| heard the price is low.
|
| Personally I use/prefer 4o over 4.5 so I don't have high hopes
| for v5.
| hinkley wrote:
| Charge more for LTS support. That'll chase people onto your new
| systems.
|
| I've seen this play out badly before. It costs real money to
| keep engineers knowledgeable of what should rightfully be EOL
| systems. If you can make your laggard customers pay extra for
| that service, you can take care of those engineers.
|
| The reward for refactoring shitty code is supposed to be not
| having to deal with it anymore. If you have to continue dealing
| with it anyway, then you pay for every mistake for years even
| if you catch it early. You start shutting down the will for
| continuous improvement. The tech debt starts to accumulate
| because it can never be cleared, and trying to use it makes
| maintenance five times more confusing. People start wanting
| more Waterfall design to try to keep errors from ever being
| released in the first place. It's a mess.
|
| Make them pay for the privilege/hassle.
| svachalek wrote:
| Models aren't code though. I'm sure there's code around it
| but for the most part models aren't maintained, they're just
| replaced. And a system that was state of the art literally
| yesterday is really hard to characterize as "rightfully EOL".
| koolala wrote:
| Two different models can not be direct replacements of
| each other. It's like two different novels.
| hinkley wrote:
| That doesn't stop manufacturers from getting rid of parts
| that have no real equivalent elsewhere in their catalog.
| Sometimes they do, but at the end of the day you're at
| their mercy. Or you have strong enough ties to their
| management that they keep your product forever, even later
| when it's hurting them to keep it.
| andy99 wrote:
| Edit to add: according to Sam Altman in the reddit AMA they un-
| deprecated it based on popular demand.
| https://old.reddit.com/r/ChatGPT/comments/1mkae1l/gpt5_ama_w...
|
| I wonder how much of the '5 release was about cutting costs vs
| making it outwardly better. I'm speculating that one reason
| they'd deprecate older models is that 5 is materially cheaper to
| run?
|
| Would have been better to just jack up the price on the others.
| For companies that extensively test the apps they're building
| (which should be everyone) swapping out a model is a lot of work.
| sebzim4500 wrote:
| Are they deprecating the older models in the API? I don't see
| any indication of that in the docs.
| dbreunig wrote:
| I'm wondering that too. I think better routers will allow for
| more efficiency (a good thing!) at the cost of giving up
| control.
|
| I think OpenAI attempted to mitigate this shift with the modes
| and tones they introduced, but there's always going to be a
| slice that's unaddressed. (For example, I'd still use dalle 2
| if I could.)
| waldrews wrote:
| Doesn't look like they blew up the API use cases, just the
| consumer UI access. I wouldn't be surprised if they allow it
| again, hidden behind a setting (along with allowing the
| different routed GPT5 levels to be in the selector).
| andy99 wrote:
| Ah ok, that's an important distinction. Seems much less of a
| big deal then - or at least a consumer issue rather than a
| business one. Having never really used chatgpt (but used the
| apis a lot), I'm actually surprised that chat users would
| care. There are cost tradeoffs for the different models when
| building on them, but for chatgpt, it's less clear to me why
| one would move between selecting different models.
| svachalek wrote:
| Not everyone is an engineer. There's a substantial
| population that were selecting for maximum sycophancy.
| dragonwriter wrote:
| > There are cost tradeoffs for the different models when
| building on them, but for chatgpt, it's less clear to me
| why one would move between selecting different models.
|
| The same tradeoffs (except cost, because that's rolled into
| the plan and not a factor when selecting in the interface)
| exist on ChatGPT, which is an app built on the underlying
| model like any other.
|
| So getting rid of models that are stronger in some areas
| when adding a new one that is cheaper (presuming API costs
| also reflect cost to provide) has the same kinds of impacts
| on existing ChatGPT users' established usages as it would
| have on a business's established apps _except_ that the
| ChatGPT users don't see a cost savings along with any
| disruption in how they were used to things working.
| cgriswald wrote:
| Lower tiers have limited uses for some models.
| waynesonfire wrote:
| I have a feeling that the chatgpt ui does some behind-the
| scenes tuning as well--hidden prompt engineering if you will.
| I migrated to the api and 4o still seems different. Most
| obviously, I don't get the acks that make me feel like I should
| run for president.
|
| Even ChatGPT 5 confirmed this,
|
| why does the gpt-4o api not do this?
|
| ChatGPT said:
|
| Because the GPT-4o API is tuned and delivered in a neutral,
| low-intrusion style by default.
|
| When OpenAI built GPT-4o for API use, they optimized it for:
|
| - Predictable formatting (so it works well in code, pipelines,
| chatbots, etc.)
|
| - Minimal unsolicited chatter (no "Nice!" or "Great job!"
| unless explicitly prompted)
|
| - Deterministic tone -- so that two API calls with the same
| input produce consistent, professional output without extra
| filler.
|
| That's different from the ChatGPT product experience, which
| has its own "assistant personality" layer that sometimes adds
| those rapport-building acknowledgements in casual
| conversation.
|
| In API mode, you're the one defining the personality, so if
| you want that "Good! Looks like you're digging in" style, you
| have to bake it into the system prompt, for example:
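|
| (A minimal sketch of what "baking it into the system prompt"
| can look like with the official openai Python client; the
| model name and personality wording here are illustrative
| assumptions, not from the quoted reply.)
|
|     from openai import OpenAI
|
|     client = OpenAI()  # reads OPENAI_API_KEY from the environment
|
|     response = client.chat.completions.create(
|         model="gpt-4o",
|         messages=[
|             # the "personality" lives in the system prompt
|             {"role": "system", "content": (
|                 "You are an upbeat assistant. Open each reply "
|                 "with a short encouraging acknowledgement "
|                 "before answering."
|             )},
|             {"role": "user", "content": "How do I tune a slow query?"},
|         ],
|     )
|     print(response.choices[0].message.content)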
| AlecSchueler wrote:
| But it always gives answers like that for questions where
| it doesn't know the actual reason.
| simonw wrote:
| The GPT-4o you talk to through ChatGPT and the GPT-4o you
| access via the API are different models... but they're
| actually both available via the API.
|
| https://platform.openai.com/docs/models/gpt-4o is gpt-4o in
| the API, also available as three date-stamped snapshots:
| gpt-4o-2024-11-20 and gpt-4o-2024-08-06 and
| gpt-4o-2024-05-13 - priced at $2.50/million input and
| $10.00/million output.
|
| https://platform.openai.com/docs/models/chatgpt-4o-latest
| is chatgpt-4o-latest in the API. This is the model used by
| ChatGPT 4o, and it doesn't provide date-stamped snapshots:
| the model is updated on a regular basis without warning. It
| costs $5/million input and $15/million output.
|
| If you use the same system prompt as ChatGPT (from one of
| the system prompt leaks) with that chatgpt-4o-latest alias
| you should theoretically get the same experience.
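|
| (A minimal sketch of comparing the two model IDs named above
| via the API with the official openai Python client; the prompt
| is illustrative.)
|
|     from openai import OpenAI
|
|     client = OpenAI()
|     prompt = "Explain model deprecation windows in two sentences."
|
|     # same request against the date-stamped API model and the
|     # alias that tracks whatever ChatGPT 4o currently runs
|     for model_id in ("gpt-4o-2024-11-20", "chatgpt-4o-latest"):
|         r = client.chat.completions.create(
|             model=model_id,
|             messages=[{"role": "user", "content": prompt}],
|         )
|         print(model_id, "->", r.choices[0].message.content[:100])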
| hinkley wrote:
| Margins are weird.
|
| You have a system that's cheaper to maintain or sells for a
| little bit more and it cannibalizes its siblings due to
| concerns of opportunity cost and net profit. You can also go
| pretty far in the world before your pool of potential future
| customers is muddied up with disgruntled former customers. And
| there are more potential future customers overseas than there
| are pissed off exes at home so let's expand into South America!
|
| Which of their other models can run well on the same gen of
| hardware?
| corysama wrote:
| The vibe I'm getting from the Reddit community is that 5 is
| much less "Let's have a nice conversation for hours and hours"
| and much more "Let's get you a curt, targeted answer quickly."
|
| So, good for professionals who want to spend lots of money on
| AI to be more efficient at their jobs. And, bad for casuals who
| want to spend as little money as possible to use lots of
| datacenter time as their artificial buddy/therapist.
| jelder wrote:
| Well, good, because these things make bad friends and worse
| therapists.
| moralestapia wrote:
| I kind of agree with you as I wouldn't use LLMs for that.
|
| But also, one cannot speak for everybody; if it's useful
| for someone in that context, why's that an issue?
| chowells wrote:
| The issue is that people in general are very easy to fool
| into believing something harmful is helping them. If it
| was actually useful, it's not an issue. But just because
| someone believes it's useful doesn't mean it actually is.
| saubeidl wrote:
| Because it's probably not great for one's mental health
| to pretend a statistical model is one's friend?
| lukan wrote:
| Well, because in a worst case scenario, if the pilot of
| that big airliner decides to do ChatGPT therapy instead
| of the real thing and then kills himself while flying,
| other people feel the consequences too.
| renewiltord wrote:
| _That 's_ the worst case scenario? I can always construct
| worse ones. Suppose Donald Trump goes to a bad therapist
| and then decides to launch nukes at Russia. Damn, this
| therapy profession needs to be hard regulated. It could
| lead to the extinction of mankind.
| andy99 wrote:
| Doc: The encounter could create a time paradox, the
| result of which could cause a chain reaction that would
| unravel the very fabric of the spacetime continuum and
| destroy the entire universe! Granted, that's a worst-case
| scenario. The destruction might in fact be very
| localised, limited to merely our own galaxy.
|
| Marty: Well, that's a relief.
| anonymars wrote:
| Good thing Biff Tannen becoming president was a silly
| fictional alternate reality. Phew.
| anonymars wrote:
| Pilots don't go to real therapy, because real pilots
| don't get sad
|
| https://www.nytimes.com/2025/03/18/magazine/airline-
| pilot-me...
| oceanplexian wrote:
| Yeah I was going to say, as a pilot there is no such
| thing as "therapy" for pilots. You would permanently lose
| your medical if you even mentioned the word to your
| doctor.
| moralestapia wrote:
| Fascinating read. Thanks.
| nickthegreek wrote:
| If this type of thing really interests you and you want
| to go on a wild ride, check out season 2 of Nathan
| Fielder's The Rehearsal. You don't need to watch s1.
| csours wrote:
| Speaking for myself: the human mind does not seek truth
| or goodness, it primarily seeks satisfaction. That
| satisfaction happens in a context, and every context is at
| least a little bit different.
|
| The scary part: It is very easy for LLMs to pick up
| someone's satisfaction context and feed it back to them.
| That can distort the original satisfaction context, and
| it may provide improper satisfaction (if a human did
| this, it might be called "joining a cult" or "emotional
| abuse" or "co-dependence").
|
| You may also hear this expressed as "wire-heading"
| pmarreck wrote:
| If treating an LLM as a bestie is allowing yourself to be
| "wire-headed"... Can gaming be "wire-heading"?
|
| Does the severity or excess matter? Is "a little" OK?
|
| This also reminds me of one of Michael Crichton's
| earliest works (and a fantastic one IMHO), The Terminal
| Man
|
| https://en.wikipedia.org/wiki/The_Terminal_Man
|
| https://1lib.sk/book/1743198/d790fa/the-terminal-man.html
| oh_my_goodness wrote:
| Fuck.
| TimTheTinker wrote:
| Because more than any other phenomenon, LLMs are capable
| of bypassing natural human trust barriers. We ought to
| treat their output with significant detachment and
| objectivity, especially when they give personal advice or
| offer support. But especially for non-technical users,
| LLMs _leap_ over the uncanny valley and create
| conversational attachment with their users.
|
| The conversational capabilities of these models directly
| engage people's relational wiring and easily fool many
| people into believing:
|
| (a) the thing on the other end of the chat is
| thinking/reasoning and is personally invested in the
| process (not merely autoregressive stochastic content
| generation / vector path following)
|
| (b) its opinions, thoughts, recommendations, and
| relational signals are the result of that reasoning, some
| level of personal investment, and a resulting mental
| state it has with regard to me, and thus
|
| (c) what it says is personally meaningful on a far higher
| level than the output of other types of compute (search
| engines, constraint solving, etc.)
|
| I'm sure any of us can mentally enumerate a lot of the
| resulting negative effects. Like social media, there's a
| temptation to replace important relational parts of life
| with engaging an LLM, as it _always_ responds
| _immediately_ with something that feels at least somewhat
| meaningful.
|
| But in my opinion the worst effect is that there's a
| temptation to turn to LLMs _first_ when life trouble
| comes, instead of to family/friends/God/etc. I don't
| mean for help understanding a cancer diagnosis (no
| problem with that), but for support, understanding,
| reassurance, personal advice, and hope. In the very worst
| cases, people have been treating an LLM as a spiritual
| entity -- not unlike the ancient Oracle of Delphi -- and
| getting sucked deeply into some kind of spiritual
| engagement with it, and causing destruction to their real
| relationships as a result.
|
| A parallel problem is that just like people who know
| they're taking a placebo pill, even people who are aware
| of the completely impersonal underpinnings of LLMs can
| adopt a functional belief in some of the above (a)-(c),
| even if they really know better. That's the power of
| verbal conversation, and in my opinion, LLM vendors ought
| to respect that power far more than they have.
| varispeed wrote:
| I've seen many therapists and:
|
| > autoregressive stochastic content generation / vector
| path following
|
| ...their capabilities were much worse.
|
| > God
|
| Hate to break it to you, but "God" are just voices in
| your head.
|
| I think you just don't like that LLM can replace
| therapist and offer better advice than biased
| family/friends who only know small fraction of what is
| going on in the world, therefore they are not equipped to
| give valuable and useful advice.
| TimTheTinker wrote:
| > I've seen many therapists and [...] their capabilities
| were much worse
|
| I don't doubt it. The steps to mental and personal
| wholeness can be surprisingly concrete and formulaic for
| most life issues - stop believing these lies & doing
| these types of things, start believing these truths &
| doing these other types of things, etc. But were you
| tempted to stick to an LLM instead of finding a better
| therapist or engaging with a friend? In my opinion,
| assuming the therapist or friend is competent, the
| _relationship_ itself is the most valuable aspect of
| therapy. That relational context helps you honestly face
| where you really are now--never trust an LLM to do that--
| and learn and grow much more, especially if you're
| lacking meaningful, honest relationships elsewhere in
| your life. (And many people who already have healthy
| relationships can skip the therapy, read books/engage an
| LLM, and talk openly with their friends about how they're
| doing.)
|
| Healthy relationships with other people are irreplaceable
| with regard to mental and personal wholeness.
|
| > I think you just don't like that LLM can replace
| therapist and offer better advice
|
| What I don't like is the potential loss of real
| relationship and the temptation to trust LLMs more than
| you should. Maybe that's not happening for you -- in that
| case, great. But don't forget LLMs have _zero_ skin in
| the game, no emotions, and nothing to lose if they're
| wrong.
|
| > Hate to break it to you, but "God" are just voices in
| your head.
|
| Never heard that one before :) /s
| MattGaiser wrote:
| > We ought to treat their output with significant
| detachment and objectivity, especially when it gives
| personal advice or offers support.
|
| Eh, ChatGPT is inherently more trustworthy than average
| simply because it will not leave, will not judge, it
| will not tire of you, has no ulterior motive, and if
| asked to check its work, has no ego.
|
| Does it care about you more than most people? Yes, by
| simply being not interested in hurting you, not needing
| anything from you, and being willing to not go away.
| TimTheTinker wrote:
| You've illustrated my point pretty well. I hope you're
| able to stay personally detached enough from ChatGPT to
| keep engaging in real-life relationships in the years to
| come.
| AlecSchueler wrote:
| It's not even the first time this week I've seen someone
| on HN apparently ready to give up human contact in favour
| of LLMs.
| pmarreck wrote:
| Unless you had a really bad upbringing, "caring" about
| you is _not simply not hurting you, not needing anything
| from you, or not leaving you_
|
| One of the _important_ challenges of existence, IMHO, is
| the struggle to authentically connect to people... and to
| recover from rejection (from other people's rulers,
| which eventually shows you how to build your own ruler
| for yourself, since you are immeasurable!) Which LLMs
| can now undermine, apparently.
|
| Similar to how gaming (which I happen to enjoy, btw... at
| a distance) hijacks your need for
| achievement/accomplishment.
|
| But _also_ similar to gaming which can work alongside
| actual real-life achievement, it can work OK as an
| adjunct/enhancement to existing sources of human
| authenticity.
| zdragnar wrote:
| Whether the Hippocratic oath, the rules of the APA or any
| other organization, most all share "do no harm" as a core
| tenet.
|
| LLMs cannot conform to that rule because they cannot
| distinguish between good advice and enabling bad
| behavior.
| SoftTalker wrote:
| Having an LLM as a friend or therapist would be like
| having a sociopath for those things -- not that an LLM is
| necessarily evil or antisocial, but they certainly meet
| the "lacks a sense of moral responsibility or social
| conscience" part of the definition.
| dcrazy wrote:
| The counter argument is that's just a training problem,
| and IMO it's a fair point. Neural nets are used as
| classifiers all the time; it's reasonable that sufficient
| training data could produce a model that follows the
| professional standards of care in any situation you hand
| it.
|
| The real problem is that _we can't tell when or if we've
| reached that point._ The risk of a malpractice suit
| influences how human doctors act. You can't sue an LLM.
| It has no fear of losing its license.
| macintux wrote:
| An LLM would, surely, have to:
|
| * Know whether its answers are objectively beneficial or
| harmful
|
| * Know whether its answers are _subjectively_ beneficial
| or harmful in the context of the current state of a
| person it cannot see, cannot hear, cannot understand.
|
| * Know whether the user's questions, over time, trend in
| the right direction for that person.
|
| That seems awfully optimistic, unless I'm
| misunderstanding the point, which is entirely possible.
| dcrazy wrote:
| It is definitely optimistic, but I was steelmanning the
| optimist's argument.
| moralestapia wrote:
| Neither can most of the doctors I've talked to in the past
| like ... 20 years or so.
| dsadfjasdf wrote:
| Are all humans good friends and therapists?
| saubeidl wrote:
| Not all humans are good friends and therapists. All LLMS
| are bad friends and therapists.
| quantummagic wrote:
| > all LLMS are bad friends and therapists.
|
| Is that just your gut feel? Because there has been some
| preliminary research that suggests it's, at the very
| least, an open question:
|
| https://neurosciencenews.com/ai-chatgpt-
| psychotherapy-28415/
|
| https://pmc.ncbi.nlm.nih.gov/articles/PMC10987499/
|
| https://arxiv.org/html/2409.02244v2
| fwip wrote:
| The first link says that patients can't reliably tell
| which is the therapist and which is LLM in single
| messages, which yeah, that's an LLM core competency.
|
| The second is "how 2 use AI 4 therapy" which, there's at
| least one paper for every field like that.
|
| The last found that they were measurably worse at therapy
| than humans.
|
| So, yeah, I'm comfortable agreeing that all LLMs are bad
| therapists, and bad friends too.
| dingnuts wrote:
| there's also been a spate of reports like this one
| recently https://www.papsychotherapy.org/blog/when-the-
| chatbot-become...
|
| which is definitely worse than not going to a therapist
| pmarreck wrote:
| If I think "it understands me better than any human",
| that's dissociation? Oh boy. And all this time while life
| has been slamming me with unemployment while my toddler
| is at the age of maximum energy-extraction from me (4),
| devastating my health and social life, I thought it was
| just a fellow-intelligence lifeline.
|
| Here's a gut-check anyone can do, assuming you use a
| customized ChatGPT4o and have lots of conversations it
| can draw on: Ask it to roast you, _and not to hold back_.
|
| If you wince, it "knows you" quite well, IMHO.
| davorak wrote:
| I do not think there are any documented cases of LLMs
| being reasonable friends or therapists so I think it is
| fair to say that:
|
| > All LLMS are bad friends and therapists
|
| That said it would not surprise me that LLMs in some
| cases are better than having nothing at all.
| SketchySeaBeast wrote:
| Though given how agreeable LLMs are, I'd imagine there
| are cases where they are also worse than having nothing
| at all as well.
| TimTheTinker wrote:
| > Is that just your gut feel?
|
| Here's my take further down the thread:
| https://news.ycombinator.com/item?id=44840311
| resource_waste wrote:
| Absolutes, monastic take... Yeah I imagine not a lot of
| people seek out your advice.
| goatlover wrote:
| All humans are not LLMs, why does this constantly get
| brought up?
| baobabKoodaa wrote:
| > All humans are not LLMs
|
| What a confusing sentence to parse
| exe34 wrote:
| You wouldn't necessarily know, talking to some of them.
| hn_throwaway_99 wrote:
| Which is a bit frightening because a lot of the r/ChatGPT
| comments strike me as unhinged - it's like you would have
| thought that OpenAI murdered their puppy or something.
| jcims wrote:
| This is only going to get worse.
|
| Anyone who remembers the reaction when Sydney from
| Microsoft or, more recently, Maya from Sesame lost their
| respective 'personality' can easily see how product
| managers are going to have to start paying attention to
| the emotional impact of changing or shutting down models.
| nilespotter wrote:
| Or they could just do it whenever they want to for
| whatever reason they want to. They are not responsible
| for the mental health of their users. Their users are
| responsible for that themselves.
| AlecSchueler wrote:
| Generally it's poor business to give a big chunk of your
| users an incredibly visceral and negative emotional
| reaction to your product update.
| einarfd wrote:
| Depends on what business OpenAI wants to be in. If they
| want to be in the business of selling AI to companies,
| then "firing" the consumer customers who want someone to
| talk to, and doubling down on models that are useful for
| work, can be a wise choice.
| sacado2 wrote:
| Unless you want to improve your ratio of paid-to-free
| users and change your userbase in the process. They're
| pissing off free users, but pros who use the paid version
| might like this new version better.
| encom wrote:
| >unhinged
|
| It's Reddit, what were you expecting?
| whynotminot wrote:
| Yeah it's really bad over there. Like when a website
| changes its UI and people prefer the older look... except
| they're acting like the old look was a personal friend
| who died.
|
| I think LLMs are amazing technology but we're in for
| really weird times as people become attached to these
| things.
| dan-robertson wrote:
| I mean, I don't mind the Claude 3 funeral. It seems like
| it was a fun event.
|
| I'm less worried about the specific complaints about
| model deprecation, which can be 'solved' for those people
| by not deprecating the models (obviously costs the AI
| firms). I'm more worried about AI-induced psychosis.
|
| An analogy I saw recently that I liked: when a cat sees a
| laser pointer, it is a fun thing to chase. For dogs it is
| sometimes similar and sometimes it completely breaks the
| dog's brain and the dog is never the same again. I feel
| like AI for us may be more like laser pointers for dogs,
| and some among us are just not prepared to handle these
| kinds of AI interactions in a healthy way.
| simonw wrote:
| I just saw a _fantastic_ TikTok about ChatGPT psychosis:
| https://www.tiktok.com/@pearlmania500/video/7535954556379
| 761...
| pmarreck wrote:
| Oh boy. My son just turned 4. Parenting about to get
| weird-hard
| epcoa wrote:
| Considering how much d-listers can lose their shit over a
| puppet, I'm not surprised by anything.
| resource_waste wrote:
| Well, like, that's just your opinion, man.
|
| And probably close to wrong if we are looking at the sheer
| scale of use.
|
| There is a bit of reality denial among anti-AI people. I
| thought about why people don't adjust to this new reality.
| I know one of my friends was anti-AI and seems to continue
| to be because his reputation is a bit based on proving he
| is smart. Another because their job is at risk.
| monster_truck wrote:
| The number of comments in the thread talking about 4o as if
| it were their best friend they shared all their secrets with
| is concerning. Lotta lonely folks out there
| delfinom wrote:
| Wait until you see
|
| https://www.reddit.com/r/MyBoyfriendIsAI/
|
| They are very upset by the gpt5 model
| pmarreck wrote:
| oh god, this is some real authentic dystopia right here
|
| these things are going to end up in android bots in 10
| years too
|
| (honestly, I wouldn't mind a super smart, friendly bot in
| my old age that knew all my quirks but was always
| helpful... I just would not have a full-on relationship
| with said entity!)
| razster wrote:
| That subreddit is fascinating and yet saddening at the
| same time. What I read will haunt me.
| greesil wrote:
| I weep for humanity. This is satire right? On the flip
| side I guess you could charge these users more to keep 4o
| around because they're definitely going to pay.
| abxyz wrote:
| https://www.nytimes.com/2025/08/08/technology/ai-
| chatbots-de...
| abxyz wrote:
| AI safety is focused on AGI but maybe it should be
| focused on how little "artificial intelligence" it takes
| to send people completely off the rails. We could barely
| handle social media, LLMs seem to be too much.
| alecsm wrote:
| I had this feeling too.
|
| I needed some help today and its messages were shorter but
| still detailed, without all the spare text that I usually
| don't even read.
| tibbar wrote:
| It's a good reminder that OpenAI isn't incentivized to have
| users spend a lot of time on their platform. Yes, they want
| people to be engaged and keep their subscription, but better
| if they can answer a question in few turns rather than many.
| This dynamic would change immediately if OpenAI introduced
| ads or some other way to monetize each minute spent on the
| platform.
| yawnxyz wrote:
| the classic 3rd space problem that Starbucks tackled; they
| initially wanted people to hang out and do work there, but
| grew to hate it so they started adding lots of little
| things to dissuade people from spending too much time there
| dragonwriter wrote:
| > the classic 3rd space problem that Starbucks tackled
|
| "Tackled" is misleading. "Leveraged to grow a customer
| base and then exacerbated to more efficiently monetize
| the same customer base" would be more accurate.
| hn_throwaway_99 wrote:
| The GPT-5 API has a new parameter for verbosity of output. My
| guess is the default value of this parameter used in ChatGPT
| corresponds to a lower verbosity than previous models.
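|
| (A minimal sketch, assuming the verbosity setting is exposed
| through the Responses API as text.verbosity; the exact
| placement and accepted values should be checked against the
| GPT-5 API docs.)
|
|     from openai import OpenAI
|
|     client = OpenAI()
|
|     # pin the output length explicitly instead of relying on
|     # whatever default ChatGPT uses
|     r = client.responses.create(
|         model="gpt-5",
|         input="Summarize why sudden model deprecations annoy users.",
|         text={"verbosity": "low"},  # assumed values: low/medium/high
|     )
|     print(r.output_text)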
| michaelbrave wrote:
| I've seen quite a bit of this too, the other thing I'm seeing
| on reddit is I guess a lot of people really liked 4.5 for
| things like worldbuilding or other creative tasks, so a lot
| of them are upset as well.
| torginus wrote:
| I mean - I'm quite sure it's going to be available via
| API, and you can still do your worldbuilding if you're
| willing to go to places like OpenRouter.
| corysama wrote:
| There is certainly a market/hobby opportunity for "discount
| AI" for no-revenue creative tasks. A lot of r/LocalLLaMA/
| is focused on that area and in squeezing the best results
| out of limited hardware. Local is great if you already have
| a 24 GB gaming GPU. But, maybe there's an opportunity for
| renting out low power GPUs for casual creative work. Or, an
| opportunity for a RenderToken-like community of GPU
| sharing.
| AlecSchueler wrote:
| If you're working on a rented GPU are you still doing
| local work? Or do you mean literally lending out the
| hardware?
| corysama wrote:
| Working on a rented GPU would not be local. But, renting
| a low-end GPU might be cheap enough to use for hobbyist
| creative work. I'm just musing on lots of different
| routes to make hobby AI use economically feasible.
| simonw wrote:
| The gpt-oss-20b model has demonstrated that a machine
| with ~13GB of available RAM can run a very decent local
| model - if that RAM is GPU-accessible (as seen on Apple
| silicon Macs for example) you can get very usable
| performance out of it too.
|
| I'm hoping that within a year or two machines like that
| will have dropped further in price.
| raincole wrote:
| Reddit is where people literally believed GPT5 was going to
| be AGI.
| thejazzman wrote:
| reddit is a large group of people sharing many diverse
| ideas
| goatlover wrote:
| That was the r/singularity sub which has a rather large
| bias toward believing the singularity is near and
| inevitable.
| hirvi74 wrote:
| > _" Let's get you a curt, targeted answer quickly."_
|
| This is probably why I am absolutely digging GPT-5 right now.
| It's a chatbot not a therapist, friend, nor a lover.
| mvieira38 wrote:
| Great for the environment as well and the financial future of
| the company. I can't see how this is a bad thing, some people
| really were just suffering from Proompt Disorder
| drewbeck wrote:
| Also good for the bottom line: fewer tokens generated.
| oceanplexian wrote:
| I don't see how people using these as a therapist really has
| any measurable impact compared to using them as agents. I'll
| spend a day coding with an LLM and between tool calls,
| passing context to the model, and iteration I'll blow through
| millions of tokens. I don't even think a normal person is
| capable of reading that much.
| el_benhameen wrote:
| I am all for "curt, targeted answers", but they need to be
| _correct_, which is my issue with gpt-5
| rpeden wrote:
| I'm appalled by how dismissive and heartless many HN users
| seem toward non-professional users of ChatGPT.
|
| I use the GPT models (along with Claude and Gemini) a ton for
| my work. And from this perspective, I appreciate GPT-5. It
| does a good job.
|
| But I also used GPT-4o extensively for first-person non-
| fiction/adventure creation. Over time, 4o had come to be
| quite good at this. The forced upgrade to GPT-5 has, up to
| this point, been a massive reduction in quality for this use
| case.
|
| GPT-5 just _forgets_ or misunderstands things or mixes up
| details about characters that were provided a couple of
| messages prior, while 4o got these details right even when
| they hadn't been mentioned in dozens of messages.
|
| I'm using it for fun, yes, but not as a buddy or therapist.
| Just as entertainment. I'm fine with paying more for this use
| if I need to. And I do - right now, I'm using
| `chatgpt-4o-latest` via LibreChat but it's a somewhat
| inferior experience to the ChatGPT web UI that has access to
| memory and previous chats.
|
| Not the end of the world - but a little advance notice would
| have been nice so I'd have had some time to prepare and test
| alternatives.
| jimbokun wrote:
| > For companies that extensively test the apps they're building
| (which should be everyone) swapping out a model is a lot of
| work.
|
| Yet another lesson in building your business on someone else's
| API.
| dragonwriter wrote:
| > I wonder how much of the '5 release was about cutting costs
| vs making it outwardly better. I'm speculating that one reason
| they'd deprecate older models is because 5 materially cheaper
| to run?
|
| I mean, assuming the API pricing has some relation to OpenAI's
| cost to provide (which is somewhat speculative, sure), that
| seems pretty well supported as a truth, if not necessarily the
| reason for the model being introduced: the models discontinued
| ("deprecated" implies entering a notice period for future
| discontinuation) from the ChatGPT interface are priced
| significantly higher than GPT-5 on the API.
|
| > For companies that extensively test the apps they're building
| (which should be everyone) swapping out a model is a lot of
| work.
|
| Who is building apps relying on the ChatGPT frontend as a model
| provider? Apps would normally depend on the OpenAI API, where
| the models are still available, but GPT-5 is added and cheaper.
| nickthegreek wrote:
| > Who is building apps relying on the ChatGPT frontend as a
| model provider? Apps would normally depend on the OpenAI API,
| where the models are still available, but GPT-5 is added and
| cheaper.
|
| Always enjoy your comments dw, but on this one I disagree.
| Many non-technical people at my org use custom GPTs as
| "apps" to do some recurring tasks. Some of them have spent
| an absurd amount of time tweaking instructions and knowledge over and
| over. Also, when you create a custom gpt, you can
| specifically set the preferred model. This will no doubt
| change the behavior of those gpts.
|
| Ideally at the enterprise level, our admins would have a
| longer sunset on these models via web/app interface to ensure
| no hiccups.
| trashface wrote:
| Maybe the true cost of GPT-5 is hidden, I tried to use the
| GPT-5 API and openai wanted me to do a biometric scan with my
| camera, yikes.
| scarface_74 wrote:
| Companies testing their apps would be using the API not the
| ChatGPT app. The models are still available via the API.
| tropicalfruit wrote:
| reading all the shilling of Claude and GPT i see here often I
| feel like i'm being gaslighted
|
| i've been using premium tiers of both for a long time and i
| really felt like they've been getting worse
|
| especially Claude I find super frustrating and maddening,
| misunderstanding basic requests or taking liberties by making
| unrequested additions and changes
|
| i really had this sense of enshittification, almost as if they
| are no longer trying to serve my requests but do something else
| instead like i'm victim of some kind of LLM a/b testing to see
| how far I can tolerate or how much mental load can be transferred
| back onto me
| TechDebtDevin wrote:
| If Anthropic made Deepthink 3.5 it would be AGI, I never use >
| 3.5
| macawfish wrote:
| I suspect that it may not necessarily be that they're getting
| objectively _worse_ as much as that they aren't static
| products. They're constantly getting their prompts/context
| engines tweaked in ways that surely break peoples' familiar
| patterns. There really needs to be a way to cheaply and easily
| anchor behaviors so that people can get more consistency.
| Either that or we're just going to have to learn to adapt.
| tibbar wrote:
| While it's possible that the LLMs are intentionally throttled
| to save costs, I would also keep in mind that LLMs are now
| being optimized for new kinds of workflows, like long-running
| agents making tool calls. It's not hard to imagine that
| improving performance on one of those benchmarks comes at a
| cost to some existing features.
| simonw wrote:
| Anthropic have stated on the record several times that they do
| not update the model weights once they have been deployed
| without also changing the model ID.
| jjani wrote:
| No, they do change deployed models.
|
| How can I be so sure? Evals. There was a point where Sonnet
| 3.5 v2 happily output 40k+ tokens in one message if asked.
| And one day it started, with 99% consistency, outputting
| "Would you like me to continue?" after a lot fewer tokens
| than that. We'd been running the same set of evals and so
| could definitively confirm this change. Googling will also
| reveal many reports of this.
|
| Whatever they did, in practice they lied: API behavior of a
| deployed model changed.
|
| Another one: Differing performance - not latency but output
| on the same prompt, over 100+ runs, statistically significant
| enough to be impossible by random chance - between AWS
| Bedrock hosted Sonnet and direct Anthropic API Sonnet, same
| model version.
|
| Don't take at face value what model providers claim.
| simonw wrote:
| If they are lying about changing model weights despite
| keeping the date-stamped model ID the same it would be a
| _monumental_ lie.
|
| Anthropic make most of their revenue from paid API usage.
| Their paying customers need to be able to trust them when
| they make clear statements about their model deprecation
| policy.
|
| I'm going to choose to continue to believe them until
| someone shows me incontrovertible evidence that this isn't
| true.
| tibbar wrote:
| I've worked on many migrations of things from vX to vX + 1, and
| there's always a tension between maximum backwards-compatibility,
| supporting every theoretical existing use-case, and just
| "flipping the switch" to move everyone to the New Way. Even
| though I, personally, am a "max backwards-compatibility" guy, it
| can be refreshing when someone decides to rip off the bandaid and
| force everyone to use the new best practice. How exciting!
| Unfortunately, this usually results in accidentally eliminating
| some feature that turns out to be Actually Important, a fuss is
| made, and the sudden forced migration is reverted after all.
|
| I think the best approach is to move people to the newest version
| by default, but make it possible to use old versions, and then
| monitor switching rates and figure out what key features the new
| system is missing.
| ronsor wrote:
| I usually think it's best to have both _n_ and _n - 1_ versions
| for a limited time. As long as you _always_ commit to removing
| the _n - 1_ version at a specified point in time, you don't
| get trapped in backward compatibility hell.
| koolala wrote:
| If n is in any way objectively worse than n-1, then
| remove n-1 immediately so users can't directly compare them.
| Even Valve did it with Counter-Strike 2 and GO.
| tibbar wrote:
| With major redesigns, you often can't directly compare the
| two versions --- they are different enough that you
| actually want people to use them in a different way. So
| it's not that the new version is "worse", it's just
| different, and it's possible that there are some workflows
| that are functionally impossible on the new version (you'd
| be surprised how easy it is to mess this up.)
| riffic wrote:
| It's like everyone got a U2 album they didn't ask for, but
| instead of U2 they got Nickelback.
| iamleppert wrote:
| Taking away user choice is often done in the name of simplicity.
| But let's not forget that given 100 users, 60 are likely to
| answer with "no opinion" when asked what about their preference
| to ANY question. Does that mean the other 40% aren't valuable and
| their preferences not impactful to the other "we don't care"
| majority?
| jimbokun wrote:
| And that 60% are going to be in the 40% for other questions.
| pphysch wrote:
| It's not totally surprising given the economics of LLM operation.
| LLMs, when idle, are much more resource-heavy than an idle web
| service. To achieve acceptable chat response latency, the models
| need to be already loaded in memory, and I doubt that these huge
| SotA models can go from cold start to inference in milliseconds
| or even seconds. OpenAI is incentivized to push as many users
| onto as few models as possible to manage the capacity and
| increase efficiency.
| danpalmer wrote:
| This was my thought. They messaged quite heavily in advance
| that they were capacity constrained, and I'd guess they just
| want to shuffle out GPT-4 serving as quickly as possible as its
| utilisation will only get worse over time, and that's time they
| can be utilising better for GPT-5 serving.
| eurekin wrote:
| I couldn't be more confused by this launch...
|
| I had gpt-5 only on my account for the most of today, but now I'm
| back at previous choices (including my preferred o3).
|
| Has gpt-5 been pulled? Or was it only a preview?
| chmars wrote:
| Same here.
| jasondigitized wrote:
| This. I don't see 5 at all as a Plus customer.
| paco3346 wrote:
| I'm on Plus and only have 5
| felipemesquita wrote:
| I'm on Plus and have only GPT-5 on the iOS app and only the old
| models (except 4.5 and older expensive to run ones) in the web
| interface since yesterday after the announcement.
| kgeist wrote:
| We have a team account and my buddy has GPT-5 in the app but
| not on the website. At the same time, I have GPT-5 on the
| website, but in the app, I still only have GPT-4o. We're
| confused as hell, to say the least.
| einarfd wrote:
| I have gpt-5 on my iPhone, but not on my iPad. Both run the
| newest ChatGPT app.
|
| Maybe they do device-based rollout? But IMO that's a weird
| thing to do.
| tudorpavel wrote:
| For me it was available today on one laptop, but not the other.
| Both logged into the same account with Plus.
| ascorbic wrote:
| I have it only on the desktop app, not web or mobile. Seems a
| really weird way to roll it out.
| binarymax wrote:
| This doesn't seem to be the case for me. I have access to GPT-5
| via chatgpt, and I can also use GPT-4o. All my chat history opens
| with the originally used model as well.
|
| I'm not saying it's not happening - but perhaps the rollout
| didn't happen as expected.
| felipemesquita wrote:
| Are you on the pro plan? I think pro users can use all models
| indefinitely
| binarymax wrote:
| Just plus
| ramoz wrote:
| One enterprise angle to open source models is that we will
| develop advanced forms of RPA. Models automating a single task
| really well.
|
| We can't rely on API providers to not "fire my employee".
|
| Labs might be a little less keen to degrade that value vs all of
| the ai "besties" and "girlfriends" their poor UX has enabled for
| the ai illiterate.
| CodingJeebus wrote:
| Totally agree, stuff like this completely undermines the idea
| that these products will replace humans at scale.
|
| If one develops a reputation for putting models out to pasture
| like Google does pet projects, you'd think twice before
| building a business around it
| iSloth wrote:
| It boggles my mind that enterprises or SaaS wouldn't be
| following release cycles of new models to improve their service
| and/or cost. Although I guess there are enterprises that don't do
| OS upgrades or patching either; it's just alien to me.
| jjani wrote:
| They're almost never straight upgrades for the exact same
| prompts across the board at the same latency and price. The
| last time that happened was already a year ago, with 3.5
| Sonnet.
| AndrewKemendo wrote:
| >There's no deprecation period at all: when your consumer ChatGPT
| account gets GPT-5, those older models cease to be available.
|
| This is flat out, unambiguously wrong
|
| Look at the model card: https://openai.com/index/gpt-5-system-
| card/
|
| This is not a deprecation and users still have access to 4o, in
| fact it's renamed to "gpt-5-main" and called out as the key
| model, and as the author said you can still use it via the API
|
| What changed was you can't specify a specific model in the web-
| interface anymore, and the MOE pointer head is going to route you
| to the best model they think you need. Had the author addressed
| that point it would be salient.
|
| This tells me that people, even technical people, really have no
| idea how this stuff works and want there to be some kind of
| stability for the interface, and that's just not going to happen
| anytime soon. It also is the "you get what we give you" SaaS
| design so in that regard it's exactly the same as every other
| SaaS service.
| andrewmcwatters wrote:
| They're different models, "It can be helpful to think of the
| GPT-5 models as successors to previous models"
| (https://openai.com/index/gpt-5-system-
| card/#:~:text=It%20can...)
| og_kalu wrote:
| Did you read that card? They didn't just rename the models.
| Gpt-5-main isn't a renamed GPT-4o, it's the successor to 4o.
| op00to wrote:
| I'm unable to use anything but GPT-5, and the responses I've
| gotten don't nearly consider my past history. Projects don't
| work at all. I cancelled my Plus subscription, not that OpenAI
| cares.
| simonw wrote:
| No, GPT-4o has not been renamed to gpt-5-main. gpt-5-main is an
| entirely new model.
|
| I suggest comparing
| https://platform.openai.com/docs/models/gpt-5 and
| https://platform.openai.com/docs/models/gpt-4o to understand
| the differences in a more readable way than that system card.
| GPT-5: 400,000 context window; 128,000 max output tokens;
| Sep 30, 2024 knowledge cutoff; reasoning token support.
|
| GPT-4o: 128,000 context window; 16,384 max output tokens;
| Sep 30, 2023 knowledge cutoff.
|
| Also note that I said "consumer ChatGPT account". The API is
| different. (I added a clarification note to my post about that
| since first publishing it.)
| AndrewKemendo wrote:
| You can't compare them like that
|
| GPT-5 isn't the successor to 4o no matter what they say,
| GPT-5 is a MOE handler on top of multiple "foundations", it's
| not a new model, it's orchestration of models based on
| context fitting
|
| You're buying the marketing bullshit as though it's real
| simonw wrote:
| No, there are two things called GPT-5 (this is _classic_
| OpenAI, see also Codex).
|
| There's GPT-5 the system, a new model routing mechanism
| that is part of their ChatGPT consumer product.
|
| There's also a new model called GPT-5 which is available
| via their API:
| https://platform.openai.com/docs/models/gpt-5
|
| (And two other named API models, GPT-5 mini and GPT-5 nano
| - part of the GPT-5 model family).
|
| AND there's GPT-5 Pro, which isn't available via the API
| but can be accessed via ChatGPT for $200/month subscribers.
| andrewmcwatters wrote:
| This industry just keeps proving over and over again that if it's
| not open, or yours, you're building on shifting sand.
|
| It's a really bad cultural problem we have in software.
| jimbokun wrote:
| Pretty tautological, no?
|
| If it's not yours, it's not yours.
| CodingJeebus wrote:
| > or trying prompt additions like "think harder" to increase the
| chance of being routed to it.
|
| Sure, manually selecting model may not have been ideal. But
| manually prompting to get your model feels like an absurd hack
| MattGaiser wrote:
| Anecdotally, saying "think harder" and "check your work
| carefully" has always gotten me better results.
| thorum wrote:
| We need a new set of UX principles for AI apps. If users need
| to access an AI feature multiple times a day it should be a
| button.
| curiouser3 wrote:
| claude code does this (all the way up to keyword "superthink")
| which drives me nuts. 12 keystrokes to do something that should
| be a checkbox
| faizshah wrote:
| o3 was also an anomaly in terms of speed vs response quality and
| price vs performance. It used to be one of the fastest ways to do
| the basic web searches you would have done to get an answer; if
| you used o3 pro, it would take 5x longer for a not much better
| response.
|
| So far I haven't been impressed with GPT5 thinking but I can't
| concretely say why yet. I am thinking of comparing the same
| prompt side by side between o3 and GPT5 thinking.
|
| Also, just from my first few hours with GPT5 Thinking, I feel that
| it's not as good at short prompts as o3, e.g. instead of using a
| big XML or JSON prompt I would just type the shortest possible
| phrase for the task, e.g. "best gpu for home LLM inference vs cloud
| api."
| jjani wrote:
| My chats so far have been similar to yours, across the board
| worse than o3, never better. I've had cases where it completely
| misinterpreted what I was asking for, a very strange experience
| which I'd never had with the other frontier models (o3, Sonnet,
| Gemini Pro). Those would of course get things wrong, make
| mistakes, but never completely misunderstand what I'm asking. I
| tried the same prompt on Sonnet and Gemini and both understood
| correctly.
|
| It was related to software architecture, so supposedly
| something it should be good at. But for some reason it
| interpreted me as asking from an _end-user_ perspective instead
| of a _developer_ of the service, even though it was plenty
| clear to any human - and other models - that I meant the
| latter.
| faizshah wrote:
| > I've had cases where it completely misinterpreted what I
| was asking for, a very strange experience which I'd never had
| with the other frontier models (o3, Sonnet, Gemini Pro).
|
| Yes! This exactly, with o3 you could ask your question
| imprecisely or word it badly/ambiguously and it would figure
| out what you meant, with GPT5 I have had several cases just
| in the last few hours where it misunderstands the question
| and requires refinement.
|
| > It was related to software architecture, so supposedly
| something it should be good at. But for some reason it
| interpreted me as asking from an end-user perspective instead
| of a developer of the service, even though it was plenty
| clear to any human - and other models - that I meant the
| latter.
|
| For me, I was using o3 in daily life. Yesterday we were
| playing a board game, so I wanted to ask GPT5 Thinking to
| clarify a rule: I used an ambiguous prompt with a picture of
| a card's draw-1-card power and asked "Is this from the deck
| or both?" (from the deck or from the board). It responded by
| saying the card I took a picture of was from the game
| Wingspan's deck instead of clarifying the actual power on the
| card (o3 would never).
|
| I'm not looking forward to how much time this will waste on
| my weekend coding projects this weekend.
| jjani wrote:
| It appears to be overtuned on extremely strict instruction
| following, interpreting things in a very unhuman way, which
| may be a benefit to agentic tasks at the cost of
| everything else.
|
| My limited API testing with gpt-5 also showed this. As an
| example, the instruction "don't use academic language"
| caused it to basically omit half of what it output without
| that instruction. The other frontier models, and even open
| source Chinese ones like Kimi and Deepseek, understand
| perfectly fine what we mean by it.
| int_19h wrote:
| It's not great at agentic tasks either. Not the least
| because it seems very timid about doing things on its
| own, and demands (not asks - _demands_) that the user
| confirm every tiny step.
| macawfish wrote:
| GPT-5 reflecting Sam A's personality? Hmm...
| oh_my_goodness wrote:
| 4o is for shit, but it's inconvenient to lose o3 with no warning.
| Good reminder that it was past time to keep multiple vendors in
| use.
| resource_waste wrote:
| Yep, this caused me to unsubscribe. o3/o4 and 4.5 were
| extremely good. GPT5 is worse than both.
| nafizh wrote:
| I still haven't got access to GPT-5 (plus user in US), and I am
| not really super looking forward to it given I would lose access
| to o3. o3 is a great reasoning and planning model (better than
| Claude Opus in planning IMO and cheaper) that I use in the UI as
| well as through API. I don't think OpenAI should force users to
| an advanced model if there is not a noticeable difference in
| capability. But I guess it saves them money? Someone posted on X
| how giving access to only GPT-5 and GPT-5 thinking reduces a plus
| user's overall weekly request rate.
| renewiltord wrote:
| I have GPT-5 on the mobile app and the full set on my browser and
| this is good.
| yard2010 wrote:
| I'm happy to hear. If you need anything else, I'm here to help.
| perlgeek wrote:
| GPT-5 simply sucks at some things. The very first thing I asked
| it to do was to give me an image of knife with spiral damascus
| pattern, it gave me an image of such a knife, but with two
| handles at a right angle:
| https://chatgpt.com/share/689506a7-ada0-8012-a88f-fa5aa03474...
|
| Then I asked it to give me the same image but with only one
| handle; as a result, it removed one of the pins from a handle,
| but the knife still had two handles.
|
| It's not surprising that a new version of such a versatile tool
| has edge cases where it's worse than a previous version (though
| if it failed at the very first task I gave it, I wonder how edge
| that case really was). Which is why you shouldn't just switch
| everybody over without a grace period or any choice.
|
| The old chatgpt didn't have a problem with that prompt.
|
| For something so complicated, it isn't surprising that a major new
| version has some worse behaviors, which is why I wouldn't
| deprecate all the old models so quickly.
| zaptrem wrote:
| The image model (GPT-Image-1) hasn't changed
| orphea wrote:
| Yep, GPT-5 doesn't output images:
| https://platform.openai.com/docs/models/gpt-5
| perlgeek wrote:
| Then why does it produce different output?
| simonw wrote:
| It works as a tool. The main model (GPT-4o or GPT-5 or o3
| or whatever) composes a prompt and passes that to the image
| model.
|
| This means different top level models will get different
| results.
|
| You can ask the model to tell you the prompt that it used,
| and it will answer, but there is no way of being 100% sure
| it is telling you the truth!
|
| My hunch is that it is telling the truth though, because
| models are generally very good at repeating text from
| earlier in their context.
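|
| (A generic function-calling sketch of the pattern described
| above: the top-level model composes the prompt, and a separate
| image model - represented here by a hypothetical
| "generate_image" tool - would render it. The tool name and
| schema are illustrative, not OpenAI's internal tool.)
|
|     import json
|     from openai import OpenAI
|
|     client = OpenAI()
|
|     tools = [{
|         "type": "function",
|         "function": {
|             "name": "generate_image",
|             "description": "Render an image from a text prompt.",
|             "parameters": {
|                 "type": "object",
|                 "properties": {"prompt": {"type": "string"}},
|                 "required": ["prompt"],
|             },
|         },
|     }]
|
|     resp = client.chat.completions.create(
|         model="gpt-5",
|         messages=[{"role": "user", "content":
|                    "Draw a knife with a spiral damascus pattern"}],
|         tools=tools,
|     )
|
|     # if the model chose to call the tool, this is the prompt it
|     # composed for the downstream image model
|     call = resp.choices[0].message.tool_calls[0]
|     print(json.loads(call.function.arguments)["prompt"])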
| seba_dos1 wrote:
| You know that unless you control for seed and temperature,
| you always get a different output for the same prompts even
| with the model unchanged... right?
| carlos_rpn wrote:
| Somehow I copied your prompt and got a knife with a single
| handle on the first try:
| https://chatgpt.com/s/m_689647439a848191b69aab3ebd9bc56c
|
| Edit: chatGPT translated the prompt from english to portuguese
| when I copied the share link.
| hirvi74 wrote:
| I think that is one of the most frustrating issues I
| currently face when using LLMs. One can send the same prompt
| in two separate chats and receive two drastically different
| responses.
| dymk wrote:
| It is frustrating that it'll still give a bad response
| sometimes, but I consider the variation in responses a
| feature. If it's going down the wrong path, it's nice to be
| able to roll the dice again and get it back on track.
| techpineapple wrote:
| I've noticed inconsistencies like this, everyone said that it
| couldn't count the b's in blueberry, but it worked for me the
| first time, so I thought it was haters but played with a few
| other variations and got flaws. (Famously, it didn't get r's
| in strawberry).
|
| I guess we know it's non-deterministic but there must be some
| pretty basic randomizations in there somewhere, maybe around
| tuning its creativity?
| seba_dos1 wrote:
| Temperature is a very basic concept that makes LLMs work as
| well as they do in the first place. That's just how it
| works and that's how it's been always supposed to work.
| chrismustcode wrote:
| The image model is literally the same model
| joaohaas wrote:
| Yes, it sucks
|
| But GPT-4 would have the same problems, since it uses the same
| image model
| minimaxir wrote:
| So there may be something weird going on with images in GPT-5,
| which OpenAI avoided any discussion about in the livestream.
| The artist for SMBC noted that GPT-5 was better at plagiarizing
| his style:
| https://bsky.app/profile/zachweinersmith.bsky.social/post/3l...
|
| However, there have been no updates to the underlying image
| model (gpt-image-1). But due to the autoregressive nature of
| the image generation where GPT generates tokens which are then
| decoded by the image model (in contrast to diffusion models),
| it is _possible_ for an update to the base LLM token generator
| to incorporate new images as training data without having to
| train the downstream image model on those images.
| simonw wrote:
| No, those changes are going to be caused by the top level
| models composing different prompts to the underlying image
| models. GPT-5 is not a multi-modal image output model and
| still uses the same image generation model that other ChatGPT
| models use, via tool calling.
|
| GPT-4o was _meant_ to be a multi-modal image output model, but
| they ended up shipping that capability as a separate model
| rather than exposing it directly.
| minimaxir wrote:
| That may be a more precise interpretation given the leaked
| system prompt, as the schema for the tool there includes a
| prompt: https://news.ycombinator.com/item?id=44832990
| yobananaboy wrote:
| I've been seeing someone on TikTok who appears to be one of the
| first public examples of AI psychosis, and after this update to
| GPT-5, the AI responses were no longer fully feeding into their
| delusions. (Don't worry, they switched to Claude, which has been
| far worse!)
| simonw wrote:
| Hah, that's interesting! Claude just shipped a system prompt
| update a few days ago that's intended to make it less likely to
| support delusions. I captured a diff here:
| https://gist.github.com/simonw/49dc0123209932fdda70e0425ab01...
|
| Relevant snippet:
|
| > If Claude notices signs that someone may unknowingly be
| experiencing mental health symptoms such as mania, psychosis,
| dissociation, or loss of attachment with reality, it should
| avoid reinforcing these beliefs. It should instead share its
| concerns explicitly and openly without either sugar coating
| them or being infantilizing, and can suggest the person speaks
| with a professional or trusted person for support. Claude
| remains vigilant for escalating detachment from reality even if
| the conversation begins with seemingly harmless thinking.
| kranke155 wrote:
| I started doing this thing recently where I take a picture of
| melons at the store to get ChatGPT to tell me which it thinks
| is best to buy (based on color and other characteristics).
|
| ChatGPT will do it without question. Claude won't even
| recommend a melon; it just tells you what to look for.
| Incredibly different answers and UX construction.
|
| The people complaining on Reddit seem to have used it as a
| companion or in companion-like roles. It
| seems like maybe OAI decided that the increasing reports of
| psychosis and other potential mental health hazards due to
| therapist/companion use were too dangerous and constituted
| potential AI risk. So they fixed it. Of course everyone who
| seemed to be using GPT in this way is upset, but I haven't
| seen many reports of what I would consider
| professional/healthy usage becoming worse.
| macawfish wrote:
| Meanwhile I'm stuck on 4o
| rs186 wrote:
| > Emotional nuance is not a characteristic I would know how to
| test!
|
| Well, that's easy, we knew that decades ago.
| It's your birthday. Someone gives you a calfskin wallet.
| You've got a little boy. He shows you his butterfly collection
| plus the killing jar. You're watching television.
| Suddenly you realize there's a wasp crawling on your arm.
| smogcutter wrote:
| Something I hadn't thought about before with the V-K test: in
| the setting of the film animals are just about extinct. The
| only animal life we see are engineered like the replicants.
|
| I had always thought of the test as about empathy for the
| animals, but hadn't really clocked that in the world of the
| film the scenarios are all _major_ transgressions.
|
| The calfskin wallet isn't just in poor taste, it's rare &
| obscene.
|
| Totally off topic, but thanks for the thought.
| dmezzetti wrote:
| This thread is the best sales pitch for local / self-hosted
| models. With local, you have total control over when you decide
| to upgrade.
| rob74 wrote:
| > _But if you're already leaning on the model for life advice
| like this, having that capability taken away from you without
| warning could represent a sudden and unpleasant loss!_
|
| Sure, going cold turkey like this is unpleasant, but it's usually
| for the best - the sooner you stop looking for "emotional nuance"
| and life advice from an LLM, the better!
| iamspoilt wrote:
| It's coming back according to Sam
| https://www.reddit.com/r/ChatGPT/comments/1mkae1l/gpt5_ama_w...
| Oceoss wrote:
| I tried GPT-5 high with extended thinking and it isn't bad. I
| prefer Opus 4.1 though, at least for now.
| bookofjoe wrote:
| Currently 13 of 30 submissions on the HN homepage are AI-related.
| That seems to be about average now.
| KaiMagnus wrote:
| Some are interesting, no doubt, but it's getting one-sided.
|
| Personally, I found the topics here two years ago much more
| interesting than today's.
| bookofjoe wrote:
| Concur. It's not even close.
| mattmanser wrote:
| We go through hype bubbles every now and again. A few years
| ago you could make the same complaint about crypto currency.
| caspper69 wrote:
| This is disappointing. 4o has been performing great for me, and
| now I see I only have access to the 5-level models. Already it's
| not as good. More verbose with technical wording, but it adds
| very little to what I'm using GPT for.
| tosh wrote:
| sama: https://x.com/sama/status/1953893841381273969
|
| """
|
| GPT-5 rollout updates:
|
| * We are going to double GPT-5 rate limits for ChatGPT Plus users
| as we finish rollout.
|
| * We will let Plus users choose to continue to use 4o. We will
| watch usage as we think about how long to offer legacy models
| for.
|
| * GPT-5 will seem smarter starting today. Yesterday, the
| autoswitcher broke and was out of commission for a chunk of the
| day, and the result was GPT-5 seemed way dumber. Also, we are
| making some interventions to how the decision boundary works that
| should help you get the right model more often.
|
| * We will make it more transparent about which model is answering
| a given query.
|
| * We will change the UI to make it easier to manually trigger
| thinking.
|
| * Rolling out to everyone is taking a bit longer. It's a massive
| change at big scale. For example, our API traffic has about
| doubled over the past 24 hours...
|
| We will continue to work to get things stable and will keep
| listening to feedback. As we mentioned, we expected some
| bumpiness as we roll out so many things at once. But it was a
| little more bumpy than we hoped for!
|
| """
| eurg wrote:
| All these announcements are theater and promotion. There's very
| little chance any of these "corrections" were not planned. For
| some reason, sama et al. make me feel like a mouse being played
| with by a cat.
| baobabKoodaa wrote:
| Why on earth would they undercut the launch of their new
| model by "planning" to do a stunt where people demand the old
| models instead of the new models?
| CamperBob2 wrote:
| I don't think they're doing a lot of planning over there. Did
| you see the presentation?
| nialse wrote:
| When I strike up a voice chat with GPT-5, it starts by affirming
| my custom instructions/system prompt. Every time. Does not pass
| the vibe check.
|
| "Absolutely, happy to jump in. And you got it, I'll keep it
| focused and straightforward."
|
| "Absolutely, and nice to have that context, thanks for sharing
| it. I'll keep it focused and straightforward."
|
| Anyone else have these issues?
|
| EDIT: This is the answer to me just saying the word hi.
|
| "Hello! Absolutely, I'm Arden, and I'm on board with that. We'll
| keep it all straightforward and well-rounded. Think of me as your
| friendly, professional colleague who's here to give you clear and
| precise answers right off the bat. Feel free to let me know what
| we're tackling today."
| thejazzman wrote:
| Gemini 2.5 Pro is my favorite, but it's really annoying how it
| congratulates me on asking such great questions at the start of
| every single response, even when I set a system prompt telling
| it not to.
|
| shrug.
| subarctic wrote:
| Yup but I'm in the mobile app which is still using 4o
| laurent_du wrote:
| We were laughing about it with my son. He was asking some
| questions and the voice kept prefacing every answer with
| something like "Without the fluff", "Straight to the point" and
| variations thereof. Honestly that was hilarious.
| sanex wrote:
| Yes! Super annoying. I'm thinking of removing my custom
| instructions. I asked if it was offended by them and it said
| don't worry, I'm not, reiterated the curtness, and then I
| actually got better responses for the rest of that thread.
| imchillyb wrote:
| I spoke with GPT-5 and asked it about shrinkflation,
| enshittification, and their relevance to this situation. I think
| Hacker News will agree with GPT-5's findings.
|
| > Do you understand what shrinkflation is? Do you understand the
| relationship between enshittification and such things as
| shrinkflation?
|
| > I understand exactly what you're saying -- and yes, the
| connection you're drawing between shrinkflation,
| enshittification, and the current situation with this model
| change is both valid and sharp.
|
| > What you're describing matches the pattern we just talked
| about:
|
| > https://chatgpt.com/share/68963ec3-e5c0-8006-a276-c8fe61c04d...
| resource_waste wrote:
| GPT-5 is some sort of quantized model; it's not SOTA.
|
| The trust that OpenAI would be SOTA has been shattered. They were
| among the best with o3/o4 and 4.5. This is a budget model and
| they rolled it out to everyone.
|
| I unsubscribed. Going to use Gemini; it was on par with o3.
| simonw wrote:
| It's possible you are a victim of bugs in the router, and your
| test prompts were going to the less useful non-thinking
| variants.
|
| From Sam's tweet: https://x.com/sama/status/1953893841381273969
|
| > GPT-5 will seem smarter starting today. Yesterday, the
| autoswitcher broke and was out of commission for a chunk of the
| day, and the result was GPT-5 seemed way dumber. Also, we are
| making some interventions to how the decision boundary works
| that should help you get the right model more often.
| KTibow wrote:
| This is also showing up on Xitter as the #keep4o movement, which
| some have criticized as being "oneshotted" or cases of LLM
| psychosis and emotional attachment.
| p0w3n3d wrote:
| Running a model costs money. They probably removed 4o to make
| room (i.e., increase availability) for 5.
| kens wrote:
| As an aside, people should avoid using "deprecate" to mean "shut
| down". If something is deprecated, that means that you shouldn't
| use it. For example, the C library's gets() function was
| deprecated because it is a security risk, but it wasn't removed
| until 12 years later. The distinction is important: if you're
| using GPT-4o and it is deprecated, you don't need to do anything,
| but if it is shut down, then you have a problem.
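|
| In Python terms, a minimal sketch of the difference (the function
| names here are made up for illustration):
|
|     import warnings
|
|     def new_api():
|         return "result"
|
|     def old_api():
|         # Deprecated: it still works, callers just get told to migrate.
|         warnings.warn("old_api() is deprecated; use new_api() instead",
|                       DeprecationWarning, stacklevel=2)
|         return new_api()
|
|     print(old_api())   # still returns "result" today
|     # "Shut down" would mean old_api() no longer exists at all,
|     # and every caller breaks immediately.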
| relantic wrote:
| Somewhat unsurprising that the reactions are closer to losing an
| old coworker than to ordinary deprecations/regressions: you miss
| humans not just for their performance but also for their quirks.
| LeoPanthera wrote:
| The article links to this subreddit, which I'd never heard of
| until now:
|
| https://www.reddit.com/r/MyBoyfriendIsAI
|
| And _my word_ that is a terrifying forum. What these people are
| doing cannot be healthy. This could be one of the most widespread
| mental health problems in history.
| paulcole wrote:
| > What these people are doing cannot be healthy
|
| Leader in the clubhouse for the 2025 HN Accidental Slogan
| Contest.
| jayGlow wrote:
| That is one of the more bizarre and unsettling subreddits I've
| seen. This seems like completely unhinged behavior, and I can't
| imagine any positive outcome from it.
| j-krieger wrote:
| I can't help but find this incredibly interesting.
| daft_pink wrote:
| I switched from 4o to GPT-5 in Raycast, and I feel 5 is a lot
| slower to use, which contradicts his assertion.
|
| When you're using Raycast AI at your fingertips, you expect a
| faster answer, to be honest.
| Rodmine wrote:
| GPT-5 is 4o with an automatic model picker.
| simonw wrote:
| It's a whole family of brand new models with a model picker on
| top of them for the ChatGPT application layer, but API users
| can directly interact with the new models without any model
| picking layer involved at all.
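|
| A minimal sketch of what that looks like for API callers (assumes
| the openai Python package, an OPENAI_API_KEY in the environment,
| and that "gpt-5" is the published model id):
|
|     from openai import OpenAI
|
|     client = OpenAI()
|
|     # No router in the loop: the caller names the exact model.
|     resp = client.chat.completions.create(
|         model="gpt-5",
|         messages=[{"role": "user", "content": "Hello"}],
|     )
|     print(resp.choices[0].message.content)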
___________________________________________________________________
(page generated 2025-08-08 23:00 UTC)