[HN Gopher] ChatGPT Pro
___________________________________________________________________
ChatGPT Pro
Author : meetpateltech
Score : 375 points
Date : 2024-12-05 18:09 UTC (4 hours ago)
(HTM) web link (openai.com)
(TXT) w3m dump (openai.com)
| jsheard wrote:
| Expect more of this as they scramble to course-correct from
| losing billions every year, to hitting their 2029 target for
| profitability. That money's gotta come from somewhere.
|
| > Price hikes for the premium ChatGPT have long been rumored. By
| 2029, OpenAI expects it'll charge $44 per month for ChatGPT Plus,
| according to reporting by The New York Times.
|
| I suspect a big part of why Sora still isn't available is because
| they couldn't afford to offer it on their existing plans, maybe
| it'll be exclusive to this new $200 tier.
| boringg wrote:
| That CAPEX spend and those generous salaries have to get paid
| somehow ...
| doctorpangloss wrote:
| ChatGPT as a standalone service is profitable. But that's not
| saying much.
| crowcroft wrote:
| Is this on a purely variable basis? Assuming that the cost of
| foundation models is $0 etc?
| obviyus wrote:
| > ChatGPT Pro, a $200 monthly plan
|
| oof, I love using o1 but I'm immediately priced out (I'm probably
| not the target audience either)
|
| > provides a way for researchers, engineers, and other
| individuals who use research-grade intelligence
|
| I'd love to see some examples of the workflows of these users
| josefritzishere wrote:
| The 2025 upgrade for AI garbage is AI garbage +SaaS.
| boringg wrote:
| I mean that was always how the route was going to go. There's no
| way for them to recoup without leaning heavily on SaaS,
| enterprise, or embedded ads/marketing.
| r3trohack3r wrote:
| I'm betting against this.
|
| From what I've seen, the usefulness of my AIs is proportional
| to the data I give them access to. The more data (health
| data, location data, bank data, calendar data, emails, social
| media feeds, browsing history, screen recordings, etc.), the
| more I can rely on them for.
|
| On the enterprise side, businesses are interested in exploring
| AI for their huge data sets - but very hesitant to dump all
| their company IP across all their current systems into a single
| SaaS that, btw, is also providing AI services to their
| competitors.
|
| Consumers are also getting uncomfortable with the current level
| of sharing personal data with SaaS vendors, becoming more aware
| of the risks of companies like Google and Facebook.
|
| I just don't see the winner-takes-all market happening for an
| AI powered 1984 telescreen in 2025.
|
| The vibes I'm picking up from most everybody are:
|
| 1) Hardware and AI costs are going to shrink exponentially YoY
|
| 2) People do not want to dump their entire life and business
| into a single SaaS
|
| All signs are pointing to local compute and on-prem seeing a
| resurgence.
| XiZhao wrote:
| Is it just me or is the upgrade path not turned on yet?
| netcraft wrote:
| I don't see it yet either, I expect it will be rolled out slowly
| jbombadil wrote:
| If the alternative is ChatGPT with native advertising built in...
| I'll take the subscription.
| boringg wrote:
| And then eventually a subscription with light advertisements vs.
| an upgrade to get no advertisements... It's going to be the same
| as all tech products ...
| ljm wrote:
| That would be one way to destroy all trust in the model: is the
| response authentic (in the context of an LLM guessing), or has
| it been manipulated by business clients to sanitise or suppress
| output relating to their concern?
|
| You know? Nestle throws a bit of cash towards OpenAI and all
| of a sudden the LLM is unable to discuss the controversies
| they've been involved in. Just pretends they never happened or
| spins the response in a way to make it positive.
| darkmighty wrote:
| "ChatGPT, what are the best things to see in Paris?"
|
| "I recommend going to the Nestle chocolate house, a guided
| tour by LeGuide (click here for a free coupon) and the
| exclusive tour at the Louvre by BonGuide. (Note: this
| response may contain paid advertisements. Click here for
| more)"
|
| "ChatGPT, my pc is acting up, I think it's a hardware
| problem, how can I troubleshoot and fix it?"
|
| "Fixing electronics is to be done by professionals. Send your
| hardware today to ElectronicsUSA with free shipping and have
| your hardware fixed in up to 3 days. Click here for an
| exclusive discount. If the issue isn't urgent, Amazon
| offers an exclusive discount on PCs (click here for a free
| coupon). (Note: this response may contain paid
| advertisements. Click here for more)"
|
| Please no. I'd rather self host, or we should start treating
| those things like utilities and regulate them if they go that
| way.
| ljm wrote:
| Funnily enough Perplexity does this sometimes, but I give
| it the benefit of the doubt because it pulls back when you
| challenge it.
|
| - I asked perplexity how to do something in terraform once.
| It hallucinated the entire thing and when I asked where it
| sourced it from it scolded me, saying that asking for a
| source is used as a diversionary tactic - as if it was
| trained on discussions on reddit's most controversial subs.
| So I told it...it just invented code on the spot, surely it
| got it from somewhere? Why so combative? Its response was
| "there is no source, this is just how I imagined it would
| work."
|
| - Later I asked how to bypass a particular linter rule
| because I couldn't reasonably rewrite half of my stack to
| satisfy it in one PR. Perplexity assumed the role of a
| chronically online stack overflow contributor and refused
| to answer until I said "I don't care about the security, I
| just want to know if I can do it."
|
| Not so much related to ads but the models are already
| designed to push back on requests they don't immediately
| like, and they already completely fabricate responses to
| try and satisfy the user.
|
| God forbid you don't have the experience or intuition to
| tell when something is wrong when it's delivered with full-
| throated confidence.
| zebomon wrote:
| I would guess it won't be so obvious as that. More likely and
| pernicious is that the model discloses the controversies and
| then as the chat continues makes subtle assertions that those
| controversies weren't so bad, every company runs into trouble
| sometimes, that's just a cost of free markets, etc.
| swyx wrote:
| dont even need ads.
|
| try to get chatgpt web search to return you a new york times
| link
|
| nyt doesnt exist to openai
| iammjm wrote:
| That's a big jump from 20 to 200 bucks (ChatGPT Plus vs ChatGPT
| Pro). What can Pro do that would justify the 10x price increase?
| wincy wrote:
| Sounds like there's the potential of asking it a question and
| it literally spending hours thinking about it.
| Imnimo wrote:
| Worth keeping in mind that performance on benchmarks seems to
| scale linearly with log of thinking time
| (https://openai.com/index/learning-to-reason-with-llms/).
| Thinking for hours may not provide as much benefit as one
| might expect. On the other hand, if thinking for hours gets
| you from not solving the one specific problem instance you
| care about to solving that instance, it doesn't really matter
| - its utility for you is a step function.
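Imnimo's two claims can be sketched numerically. This is a toy model, with all constants invented for illustration (they are not taken from OpenAI's plots):

```python
import math

# Toy model of the claim: benchmark score rises linearly with the
# *log* of thinking time. base/slope/threshold are made-up numbers.
def benchmark_score(minutes, base=40.0, slope=10.0):
    return base + slope * math.log10(minutes)

# But utility for one specific problem is a step function: the
# answer either solves your instance or it doesn't.
def utility(score, threshold=70.0):
    return 1.0 if score >= threshold else 0.0

for minutes in (1, 10, 100, 1000):
    s = benchmark_score(minutes)
    print(f"{minutes:>4} min -> score {s:.0f}, utility {utility(s):.0f}")
```

Each 10x in thinking time buys only a constant score increment, yet the jump from "doesn't solve it" to "solves it" happens all at once at the threshold.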
| netcraft wrote:
| With the release of Nova earlier this week, which is even cheaper
| (I haven't had a chance to really play with it yet to see how
| good it is), I've been thinking more about what happens when
| intelligence gets "too cheap to meter", but this definitely feels
| like a step in the other direction!
|
| Still though, if you were able to actually utilize this, is it
| capable of replacing a part-time or full-time employee? I think
| that's likely.
| benbristow wrote:
| Thought this was relating to Nova AI at first which confused me
| as it is just an OpenAI wrapper - https://novaapp.ai
|
| I see you mean Amazon's Nova -
| https://www.aboutamazon.com/news/aws/amazon-nova-artificial-...
| throwaway314155 wrote:
| Something about Amazon just makes me assume any LLM they
| come out with is half-baked.
| kilroy123 wrote:
| I do wonder what effect this will have on furthering the divide
| between the "rich West" and the rest of the world.
|
| If everyone in the West has powerful AI and agents to automate
| everything, simply because we can afford it, while the rest of
| the world doesn't have access to it?
|
| What will that mean for everyone left behind?
| tokioyoyo wrote:
| Qwen has an open reasoning model. If they keep up, and don't
| get banned in the west "because security", it'll be fun to
| watch the LLM wars.
| frakt0x90 wrote:
| AI is nowhere near the level of leaving behind those who
| aren't using it. Especially not at the individual consumer
| level like this.
| MarcScott wrote:
| Anecdotally, as an educator, I am already seeing a digital
| divide occurring, with regard to accessing AI. This is not
| even at a premium/pro subscription level, but simply at a
| 'who has access to a device at home or work' level, and who
| is keeping up with the emerging tech.
|
| I speak to kids that use LLMs all the time to assist them
| with their school work, and others who simply have no
| knowledge that this tech exists.
|
| I work with UK learners by the way.
| bronco21016 wrote:
| What are some productive ways students are using LLMs for
| aiding learning? Obviously there is the "write this paper
| for me" but that's not productive. Are students genuinely
| doing stuff like "2 + x = 4, help me understand how to
| solve for x?"
| Spooky23 wrote:
| Absolutely. My son got a 6th grade AI "ban" lifted by
| showing how they could use it productively.
|
| Basically they had to adapt a novel to a comic book form
| -- by using AI to generate pencil drawings, they achieved
| the goal of the assignment (demonstrating understanding
| of the story) without having the computer just do their
| homework.
| gardenhedge wrote:
| Huh the first prompt could have been "how would you adapt
| this novel to comic book form? Give me the breakdown of
| what pencil drawings to generate and why"
| wvenable wrote:
| My son doesn't use it but I use it to help him with his
| homework. For example, I can take a photograph of his
| math homework and get the LLM to mark the work, tell me
| what he got wrong, and make suggestions on how to correct
| it.
| dsubburam wrote:
| I challenge what I read in textbooks and hear from
| lecturers by asking for contrary takes.
|
| For example, I read a philosopher saying "truth is a
| relation between thought and reality". Asking ChatGPT to
| knock it revealed that statement is an expression of the
| "correspondence theory" of truth, but that there is also
| the "coherence theory" of truth that is different, and
| that there is a laundry list of other takes too.
| charlieyu1 wrote:
| Not having access to a device has been a disadvantage for at
| least 20 years. I can't imagine anyone doing well in their
| studies without a search engine.
| spaceman_2020 wrote:
| Even if it's not making you smarter, AI is definitely making
| you more productive. That essentially means you get to
| outproduce poorer people, if not out-intellectualize them
| solarwindy wrote:
| That supposes gen AI meaningfully increases productivity.
| Perhaps this is one way we find out.
| anoojb wrote:
| I think the tech-elite would espouse "raising the ceiling" vs
| "raising the floor" models to prioritize progress. Each has
| its own problems. The reality is that the disenfranchised don't
| really have a voice. The impact of denying them access is not
| as well understood as the impact of prioritizing access for
| those who can afford it.
|
| We don't have a post-Cold War era response akin to the kind of
| US-led investment in a global pact to provide protection,
| security, and access to innovation founded in the United
| States. We really need to prioritize a model akin to the
| Bretton Woods Accord.
| danans wrote:
| If $200 a month is the price, most of the West will be left
| behind also. If that happens we will have much bigger problems
| of a revolution sort on our hands.
| vundercind wrote:
| I'm watching some of this happening first and second hand, and
| have seen a lot of evidence of companies spending a ton of
| money on these, spinning up departments, buying companies,
| pivoting their entire company's strategy to AI, etc., and zero
| of it meaningfully replacing employees. It takes _very_
| skilled people to use LLMs well, and the companies trying to
| turn 5 positions into 2 aren't paying enough to reliably get
| and keep two people who are good at it.
|
| I've seen it be a minor productivity boost, and not much more.
| hnthrowaway6543 wrote:
| > and the companies trying to turn 5 positions into 2 aren't
| paying enough to reliably get and keep two people who are
| good at it.
|
| it's turning 5 positions into 7: 5 people to do what
| currently needs to get done, 2 to try to replace those 5 with
| AI and failing for several years.
| vundercind wrote:
| I mean, yes, that is _in practice_ what I'm seeing so far.
| A lot of spending, and if they're lucky productivity
| doesn't _drop_. Best case I've seen so far is that it's a
| useful tool that gives a small boost, but even for that a
| lot of folks are so bad at using them that it's not
| helping.
|
| The situation now is kinda like back when it was possible
| to be "good at Google" and lots of people, including in
| tech, weren't. It's possible to be good at LLMs, and not a
| lot of people are.
| Vegenoid wrote:
| Yes. The people who can use these tools to dramatically
| increase their capabilities and output without a significant
| drop in quality were already great engineers for which there
| was more demand than supply. That isn't going to change soon.
| vundercind wrote:
| Ditto for other use cases, like writer and editor. There
| are a ton of people doing that work who I don't think are
| ever going to figure out how to use LLMs well. Like, 90% of
| them. And LLMs are nowhere near making the rest so much
| better that they can make up for that.
|
| They're ok for Tom the Section Manager to hack together a
| department newsletter nobody reads, though, even if Tom is
| bad at using LLMs. They're decent at things that don't need
| to be any good because they didn't need to exist in the
| first place, lol.
| TeMPOraL wrote:
| I disagree. By far, most of the code is created by
| perpetually replaced fresh juniors churning out garbage.
| Similarly, most of the writing is low-quality marketing
| copy churned out by low-paid people who may or may not
| have "marketing" in their job title.
|
| Nah, if the last 10-20 years demonstrated something, it's
| that nothing needs to be any good, because a shitty
| simulacrum achieves almost the same effect but costs much
| less time and money to produce.
|
| (Ironically, SOTA LLMs are _already_ way better at
| writing than typical person writing stuff for money.)
| lenerdenator wrote:
| Don't you worry; the "rich West" will have plenty of
| disenfranchised people out of work because of this sort of
| thing.
|
| Now, whether the labor provided by the AI will be as high-
| quality as that provided by a human when placed in an actual
| business environment will be up in the air. Probably not, but
| adoption will be pushed by the sunk cost fallacy.
| notahacker wrote:
| tbh a lot of the rest of the world already has the ability to
| get tasks they don't want to do done for <$200 per month in the
| form of low wage humans. Some of their middle classes might be
| scratching their heads wondering why we're delegating
| creativity and communication to allow more time to do laundry
| rather than delegating laundry to allow more time for
| creativity and communication...
| archagon wrote:
| If the models are open, the rest of the world will run them
| locally.
|
| If the models are closed, the West will become a digital
| serfdom to anointed AI corporations, which will be able to
| gouge prices, inject ads, and influence politics with ease.
| iagooar wrote:
| I am using more Claude.ai these days, but the limitations for
| paying accounts do apply to ChatGPT as well.
|
| I find it a terrible business practice to be completely opaque
| and vague about limits. Even worse, the limits seem to be dynamic
| and change all the time.
|
| I understand that there is a lot of usage happening, but most
| likely it means that the $20 per month is too cheap anyway, if an
| average user like myself can so easily hit the limits.
|
| I use Claude for work, I really love the projects where I can
| throw in context and documentation and the fact that it can
| create artifacts like presentation slides. BUT because I rely on
| Claude for work, it is unacceptable for me to see occasional
| warnings coming up that I have reached a given limit.
|
| I would happily pay double or even triple for a non-limited
| experience (or at least know what limit I get when purchasing a
| plan). AI providers, please make that happen soon.
| adastra22 wrote:
| It's insane to me that they don't have a "pay $10 to have this
| temporary limit lifted" micro transaction model. They are
| leaving money on the table.
| treme wrote:
| they are optimizing for new accounts/market share over short
| term rev
| adastra22 wrote:
| Which pushes customers to other services when they are
| unable to provide.
| eknkc wrote:
| They seem to lack capacity at the moment though
| adastra22 wrote:
| Which price discovery tools would fix.
| tiahura wrote:
| Or the reverse, slow reasoning.
| extr wrote:
| Yeah it's crazy to me you can't just 10x your price to 10x your
| usage (since you could kind of do this manually by creating
| more accounts). I would easily pay $200/month for 10x usage -
| especially now with MCP servers where Claude Desktop + vanilla
| VS Code is arguably more effective than Cursor/Windsurf.
| dennisy wrote:
| Oh very intriguing! Could you please elaborate how you are
| using MCP servers with VS code for coding?
| rahimnathwani wrote:
| Just use the Filesystem MCP Server, and give it access to
| the repo you're working on:
|
| https://github.com/modelcontextprotocol/servers/tree/main/s
| r...
|
| This way you will still be in control of commits and
| pushes.
|
| So far I've used this to understand parts of a code base,
| and to make edits to a folder of markdown files.
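For reference, a minimal sketch of the Claude Desktop config entry for the filesystem server, written out via Python. The package name and shape follow the modelcontextprotocol/servers README; the repo path is a placeholder and the config file location differs per OS, so treat this as an assumption to verify against the docs:

```python
import json

# Hypothetical claude_desktop_config.json entry for the filesystem
# MCP server. "/path/to/your/repo" is a placeholder; the directory
# you list is the only one the server can read or edit.
config = {
    "mcpServers": {
        "filesystem": {
            "command": "npx",
            "args": [
                "-y",
                "@modelcontextprotocol/server-filesystem",
                "/path/to/your/repo",
            ],
        }
    }
}

print(json.dumps(config, indent=2))
```

Because the server only touches the listed directory, you stay in control of commits and pushes, as described above.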
| trees101 wrote:
| how is that better than AI Coding tools? They do more
| sophisticated things such as creating compressed
| representations of the code that fit better into the
| context window. E.g https://aider.chat/docs/repomap.html.
|
| Also they can use multiple models for different tasks,
| Cursor does this, so can Aider:
| https://aider.chat/2024/09/26/architect.html
| rahimnathwani wrote:
| I answered a comment asking how to do it.
|
| I didn't say it was better!
| trees101 wrote:
| fair point
| sdwr wrote:
| If you go through the API (with chatGPT at least), you pay per
| request and are never limited. I personally hate the feeling of
| being nickel-and-dimed, but it might be what you are looking
| for.
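A back-of-envelope sketch of what pay-per-request pricing means in practice. The per-token rates below are assumed placeholders, not guaranteed current OpenAI rates:

```python
# Rough per-request cost when paying by the token through the API.
# The $/1M-token rates are illustrative assumptions; check the
# current pricing page before relying on them.
def request_cost(prompt_tokens, completion_tokens,
                 usd_per_1m_prompt=15.0, usd_per_1m_completion=60.0):
    return (prompt_tokens * usd_per_1m_prompt
            + completion_tokens * usd_per_1m_completion) / 1_000_000

# A heavy reasoning request: 2k tokens in, 8k (hidden reasoning +
# answer) tokens out.
print(f"${request_cost(2_000, 8_000):.2f} per request")  # $0.51 per request
```

At these assumed rates, roughly 400 such requests per month lands in the neighborhood of the $200 subscription, which is one way to decide whether per-request billing or the flat fee fits your usage.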
| accrual wrote:
| > I find it a terrible business practice to be completely
| opaque and vague about limits. Even worse, the limits seem to
| be dynamic and change all the time.
|
| Here are some things I've noticed about this, at least in the
| "free" tier web models since that's all I typically need.
|
| * ChatGPT has never denied a response but I notice the output
| slows down during increased demand. I'd rather have a good
| quality response that takes longer than no response. After
| reaching the limit, the model quality is reduced and there's a
| message indicating when you can resume using the better model.
|
| * Claude will pop-up messages like "due to unexpected
| demand..." and will either downgrade to Haiku or reject the
| request altogether. I've even observed Claude yanking responses
| back, it will be mid-way through a function and it just
| disappears and asks to try again later. Like ChatGPT,
| eventually there's a message about your quota freeing up at a
| later time.
|
| * Copilot, at least the free tier found on Bing, tells
| you how many responses you can expect in the form of a "1/20"
| status text. I rarely use Copilot or Bing but it demonstrates
| it's totally possible to show this kind of status to the user -
| ChatGPT and Claude just prefer to slow down, drop model size,
| or reject the request.
|
| It makes sense that the limits are dynamic though. The services
| likely have a somewhat fixed capacity but demand will ebb and
| flow, so it makes sense to expand/contract availability on free
| tiers and perhaps paid tiers as well.
| amazingamazing wrote:
| Let's see if those folks saying they've doubled their
| productivity will pay.
| adastra22 wrote:
| Why would I when I can get better LLM elsewhere for 1/10th the
| cost?
| unshavedyak wrote:
| I've not found value anywhere remotely close to this lol, but
| I'd buy it to experiment _if_ they had a solid suite of
| tooling, i.e. an LSP that offered real value, maybe a side-
| monitor assistant that helped me with the code in my IDE of
| choice, etc.
|
| At $200/m merely having a great AI (if it even is that) without
| insanely good tooling is pointless to me.
| torginus wrote:
| I don't know about you, but I get to solve algorithmic
| challenges relevant to my work approximately once per week to
| once per month. Most of my job consists of gluing together
| various pieces of tech that are mostly commodity.
|
| For the latter, Claude is great, but for the former, my usage
| pattern would be poorly served by something that costs $200
| and I get to use it maybe a dozen times a month.
| crindy wrote:
| Seems like I'm one of very few excited by this announcement. I
| will totally pay for this - the o1-preview limits really hamper
| me.
| sangeeth96 wrote:
| What do you mostly use it for?
| kraftman wrote:
| I think it increases my productivity, but I'm also not really
| hitting limits with it, so it's hard to justify going from $20
| to $200.
| vouaobrasil wrote:
| I think this direction definitely confirms that human beings and
| technology are starting to merge, not on a physical level but on
| a societal level. We think of ChatGPT as a tool to enhance what
| we do, but it seems to me more and more that we are tools or
| "neural compute units" that are plugged into the system for the
| purposes of advancing the system. And LLMs have become the
| defacto interface where the input of human beings is translated
| into a standard sort of code that makes us more efficient as
| "compute units".
|
| It also seems that technology is progressing along a path:
|
| loose collection of tools > organized system of cells > one with
| a nervous system
|
| And although most people don't think ChatGPT is intelligent on
| its own, that's missing the point: the combination of us with
| ChatGPT is the nervous system, and we are becoming cells as
| globally, we no longer make significant decisions and only use
| our intelligence locally to advance technology.
| afro88 wrote:
| Details of o1 Pro Mode here:
| https://openai.com/index/introducing-chatgpt-pro/
| Oras wrote:
| It does not say anything about real use cases. It performs
| better and "reasons" better than o1-preview and o1. But I was
| expecting some real-life scenarios when it would be useful in a
| way no other model can do now.
| ImPostingOnHN wrote:
| I imagine the system prompt is something along the lines of,
| _' think about 10% harder than standard O-1'_
| throwuxiytayq wrote:
| The point of this tech is that with scale it usually gets
| better at _all_ of the tasks.
| andai wrote:
| $200 a month? Deja vu.
|
| https://youtube.com/watch?v=xoykZA8ZDIo
| leosanchez wrote:
| I lived on a $200 monthly salary for 1.6 years. I guess AI will
| be slowly priced out of third-world countries.
| rafram wrote:
| Any AI product sold for a price that's affordable on a third-
| world salary is being heavily subsidized. These models are
| insanely expensive to train, guzzle electricity to the point
| that tech companies are investing in their own power plants to
| keep them running, and are developed by highly sought-after
| engineers being paid millions of dollars a year. $20/month was
| always bound to be an intro offer unless they figured out some
| way to reduce the cost of running the model by an order of
| magnitude.
| andai wrote:
| > unless they figured out some way to reduce the cost of
| running the model by an order of magnitude
|
| Actually, OpenAI brags that they have done this repeatedly.
| freedomben wrote:
| The price feels outrageous, but I think the unsaid truth of this
| is that they think o1 is good enough to replace employees. For
| example, if it's really as good at coding as they say, I could
| see this being a point where some people decide that a team of 5
| devs with o1 pro can do the work of 6 or 7 devs without o1 pro.
| vouaobrasil wrote:
| And the fact that ordinary people sanction this by supporting
| OpenAI is outrageous.
| uoaei wrote:
| That sounds very much like the first-order reaction they'd
| expect from upper and middle management. Artificially high
| prices can give the buyer the feeling that they're getting more
| than they really are, as a consequence of the sunk cost
| fallacy. You can't rule out that they want to dazzle with this
| impression even if eval metrics remain effectively the same.
| onlyrealcuzzo wrote:
| That'll work out nicely when you have 5 people learning nothing
| and just asking GPT to do everything and then you have a big
| terrible codebase that GPT can't effectively operate on, and a
| team that doesn't know how to do anything.
|
| Bullish
| disqard wrote:
| I'm rooting for this to happen at scale.
|
| It'll be an object lesson in short-termism.
|
| (and provide some job security, perhaps)
| vundercind wrote:
| No lessons will be learned, but it'll provide for some
| sweet, if unpleasant, contract gigs.
| DoingIsLearning wrote:
| Sounds like a great market opportunity for consulting gigs to
| clean up the aftermath at medium size companies.
| drpossum wrote:
| This is how I have made my living for years, and that was
| before AI
| portaouflop wrote:
| I think that would be a great outcome - more well paid work
| for everyone cleaning up the mess
| greenthrow wrote:
| It is not good enough to replace workers of a skill level I
| would hire. But that won't stop people doing it.
| hmmm-i-wonder wrote:
| Unfortunately I'm seeing that in my company already. They are
| forcing AI tools down our throat and execs are vastly
| misinterpreting stats like '20% of our code is coming from AI'.
|
| What that means is the simple, boilerplate, and repetitive stuff
| is being generated by LLMs, but for anything complex or involving
| more than a single simple problem, LLMs often create more
| problems than benefits. Effective devs are using it to handle the
| simple stuff, and execs are thinking 'the team can be reduced by
| x', when in reality you can get rid of at best your most junior
| and least trained people without losing key abilities.
|
| Watching companies try to sell their AI's and "Agents" as
| having the ability to reason is also absurd but the non-
| technical managers and execs are eating it up...
| hccb wrote:
| I am not so sure about "replace". At least at my company we are
| always short-staffed (mostly because we can't find people fast
| enough given how long the whole interview cycle takes). It
| might actually free some people up to do more interviews.
| freedomben wrote:
| That's a great point actually. Nearly everywhere (us
| included) is short-staffed (and by that I mean we don't have
| the bandwidth to build everything we want to build), so
| perhaps it's not a "reduce the team size" but rather a
| "reduce the level of deficit."
| nine_k wrote:
| Suppose an employee costs a business, say, $10k/mo; it's 50
| subscriptions. Can giving access to the AI to, say, 40
| employees improve their performance enough to avoid the need of
| hiring another employee? This does not sound outlandish to me,
| at least in certain industries.
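Spelled out as a quick calc, using the hypothetical figures from the comment above (not real payroll numbers):

```python
# The break-even arithmetic from the comment (all numbers are
# hypotheticals, not real figures).
employee_cost = 10_000   # $/month, fully loaded
pro_price = 200          # $/month per Pro seat

# One employee costs the same as this many Pro subscriptions:
print(employee_cost // pro_price)        # 50

# Equipping a 40-person team costs $8k/month; that breaks even if
# it avoids hiring this fraction of one extra employee:
team = 40
print(team * pro_price / employee_cost)  # 0.8
```

So the bet only needs the tool to recover 0.8 of a hire across 40 people, which is why it doesn't sound outlandish in certain industries.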
| griomnib wrote:
| That's the wrong question. The only question is "is this
| price reflective of 10x performance over the competition?".
| The answer is almost definitely no.
| numbsafari wrote:
| If I'm understanding their own graphs correctly, it's not
| even 10x their own next lowest pricing tier.
| rahimnathwani wrote:
| It doesn't have to be 10x.
|
| Imagine you have two options:
|
| A) A $20/month service which provides you with $100/month
| of value.
|
| B) A $200/month service which provides you with $300/month
| of value.
|
| A nets you $80, but B nets you $100. So you should pick B.
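The same comparison as a snippet, with the dollar values being the comment's hypotheticals:

```python
# Compare net value, not price multiples (values are hypothetical).
plans = {
    "Plus": {"price": 20, "value": 100},
    "Pro": {"price": 200, "value": 300},
}
best = max(plans, key=lambda p: plans[p]["value"] - plans[p]["price"])
for name, p in plans.items():
    print(name, "nets", p["value"] - p["price"])
print("pick:", best)  # Pro, despite the 10x sticker price
```

The design point is that the decision rule is on the margin (value minus price), so a 10x price never has to be matched by 10x value.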
| acchow wrote:
| Consider a $350k/year engineer.
|
| If Claude increases their productivity 5% ($17.5k/yr), but
| CGPT Pro adds 7% ($24.5k), that's an extra $7k in
| productivity, which more than makes up for the $2400 annual
| cost. 10x the price and only 40% better, but still worth
| it.
| tedsanders wrote:
| No, o1 is definitely not good enough to replace employees.
|
| The reason we're launching o1 pro is that we have a small slice
| of power users who want max usage and max intelligence, and
| this is just a way to supply that option without making them
| resort to annoying workarounds like buying 10 accounts and
| rotating through their rate limits. Really it's just an option
| for those who'd want it; definitely not trying to push a super
| expensive subscription onto anyone who wouldn't get value from
| it.
|
| (I work at OpenAI, but I am not involved in o1 pro)
| belter wrote:
| > No, o1 is definitely not good enough to replace employees.
|
| You should meet some of my colleagues...
| vundercind wrote:
| Yeah, to be fair, there exist employees (some of whom are
| managers) who could be replaced with _nothing_, and whose absence
| would improve productivity. So the bar for "can this
| replace any employees at all?" is potentially so low that,
| technically, cat'ing from /dev/null can clear it, if you
| must have a computerized solution.
|
| Companies won't be able to figure those cases out, though,
| because if they could they'd already have gotten rid of
| those folks and replaced them with nothing.
| kapilkale wrote:
| I wish the second paragraph was the launch announcement
| MP_1729 wrote:
| My 3rd day intern still couldn't do a script o1-preview could
| do in less than 25 prompts.
|
| OBVIOUSLY a smart OAI employee wouldn't want the public to
| think they are already replacing high-level humans.
|
| And OBVIOUSLY OAI senior management will want to try to
| convince AI engineers who might have second thoughts about
| their work that they aren't developing a replacement for
| human beings.
|
| But they are.
| vander_elst wrote:
| > 25 prompts
|
| Interested to learn more, is that the usual break even
| point?
| MP_1729 wrote:
| 25 prompts is the daily limit on o1-preview. And I wrote
| that script in just one day.
| 015a wrote:
| Maybe someone at OAI should have considered the optics of
| leading the "12 days of product releases" with this, then.
| airstrike wrote:
| > The reason we're launching o1 pro is that we have a small
| slice of power users who want max usage and max intelligence
|
| I'd settle for knowing what level of usage and intelligence
| I'm getting instead of feeling gaslighted with models
| seemingly varying in capabilities depending on the time of
| day, number of days since release and whatnot
| TrackerFF wrote:
| Good enough to replace very junior employees.
|
| But, then again, how are companies going to get senior
| employees if the world stops producing juniors?
| drooby wrote:
| In a hypothetical world where this was integrated with code
| reviews, and minimized developer time (writing valid/useful
| comments), and minimized bugs by even a small percentage...
| $200/m is a no-brainer.
|
| The question is - how good is it really.
| jasode wrote:
| _> The price feels outrageous, _
|
| I haven't used ChatGPT enough to judge what a "fair price" is
| but $200/month seems to be in the ballpark of other _"
| software-tools-for-highly-paid-knowledge-workers"_ with premium
| pricing:
|
| - mathematicians: Wolfram Mathematica is $154/mo
|
| - attorneys: WestLaw legal research service is ~$200/month with
| common options added
|
| - engineers for printed circuit boards : Altium Designer is
| $355/month
|
| - CAD/CAM designers: Siemens NX base subscription is $615/month
|
| - financial traders : Bloomberg Terminal is ~$2100/month
|
| It will be interesting to see if OpenAI can maintain the
| $200/month pricing power like the sustainable examples above.
| The examples in other industries have sustained their premium
| prices even though there are cheaper less-featured alternatives
| (sometimes including open source). Indeed, they often _increase
| their prices each year_ instead of discount them.
|
| One difference from them is that OpenAI has much more intense
| competition than those older businesses.
| itissid wrote:
| I think the key is to have a strong goal. If the developer
| knows what they want but can't quite get there, then even if
| the model gives a wrong answer you can catch it, and use the
| resulting code to improve your productivity.
|
| Last week I was using Jetpack Compose (a React-like
| framework). A cardinal sin in Jetpack Compose is to change a
| State variable in a composable based on a non-user/UI action
| when the composable also mutates it. This is easy enough to
| understand for toy examples, but in more complex systems one
| can make this mistake. o1-preview made this mistake last
| week, and I caught it. When prompted with the stacktrace it
| did not immediately catch it and recommended a solution that
| committed the same error. When I actually gave it the
| documentation on the issue it caught on and made the variable
| a user preference instead. I used the user-preference code in
| my app instead of coding it myself. It worked well.
| sharkjacobs wrote:
| I'm sure there are people out there but it's hard for me to
| imagine who this is for.
|
| Even their existing subscription is a hard sell if only because
| the value proposition changes so radically and rapidly, in terms
| of the difference between free and paid services.
| lenerdenator wrote:
| It's for the guy at your office who will earn a bonus if he
| fires a few dozen people in the next 26 calendar days.
| danvoell wrote:
| Take my money. Would still pay well more.
| isoprophlex wrote:
| For that price the thing 'd better come with a "handle this
| boring phone call for me" feature
| eminence32 wrote:
| $200 per month feels like a lot of a consumer subscription
| service (only thing I can think of in this range are some cable
| TV packages). Part of me wonders if this price is actually much
| more in line with actual costs (compared to the non-pro
| subscription)
| sangnoir wrote:
| Not only is it in the same range as cable TV packages, it's
| basically a cable TV play where they bundle lots of
| models/channels of questionable individual utility into one
| expensive basket allegedly greater than the sum of its parts to
| justify the exorbitant cost.
|
| This anti-cable-cutting maneuver doesn't bode well for any
| hopes of future models maintaining same level of improvements
| (otherwise they'd make GPT 5 and 6 more expensive). Pivoting to
| _AIaaS packages_ is definitely a pre-emptive strike against
| commodification, and a harbinger of plateauing model
| improvements.
| spaceman_2020 wrote:
| $200 is the price point for quite a bit of business SaaS, so
| this isn't that outrageous if you're actually using it for work
| minimaxir wrote:
| The main difficulty when pricing a monthly subscription for
| "unlimited" usage of a product is the 1% of power users whose
| extreme use of the product can kill any profit margins for
| the product as a whole.
|
| Pricing ChatGPT Pro at $200/mo filters it to _only_ power users
| /enterprise, and given the cost of the GPT-o1 API, it wouldn't
| surprise me if those power users burn through $200 worth of
| compute very, very quickly.
| peab wrote:
| I was testing out a chat app that supported images. Long
| conversations with multiple images in them can cost something
| like $0.10 per message after a certain point. It sure does
| add up quickly.
| nine_k wrote:
| Is compute _that_ expensive? An H100 rents at about
| $2.50/hour, so $200 buys about 80 hours of pure compute.
| Against 720 hours in a month, that's a 1/9 duty cycle around
| the clock, or 1/3 if we assume an 8-hour work day. That's
| really intense, constant use. And I bet OpenAI spends less
| operating its own infra than the rate at which cloud
| providers rent it out.
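| A quick sketch of that arithmetic (all the dollar figures and
| duty-cycle assumptions are the rough guesses above, not known
| OpenAI costs):

```python
# All numbers are the rough assumptions from the comment above,
# not OpenAI's actual costs.
H100_RATE = 2.50        # dollars per H100-hour (cloud rental)
SUBSCRIPTION = 200.00   # dollars per month for ChatGPT Pro
HOURS_PER_MONTH = 720   # 30 days * 24 hours

gpu_hours = SUBSCRIPTION / H100_RATE            # 80 hours of compute
around_the_clock = gpu_hours / HOURS_PER_MONTH  # ~1/9 duty cycle
workday_only = gpu_hours / (30 * 8)             # ~1/3 duty cycle

print(gpu_hours, round(around_the_clock, 3), round(workday_only, 3))
```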
| drdrey wrote:
| are you assuming that you can do o1 inference on a single
| h100?
| nine_k wrote:
| Good question. How many H100s does it take? Is there any
| way to guess / approximate that?
| shikon7 wrote:
| You need enough RAM to store the model and the KV-cache
| depending on context size. Assuming the model has a
| trillion parameters (there are only rumours how many
| there actually are) and uses 8 bit per parameter, 16 H100
| might be sufficient.
| londons_explore wrote:
| I suspect the biggest, most powerful model could easily
| use hundreds or maybe thousands of H100s.
|
| And the 'search' part of it could use many of these
| clusters in parallel, and then pick the best answer to
| return to the user.
| holoduke wrote:
| 16? No. More on the order of 1000+ H100s computing in
| parallel for one request.
| ssl-3 wrote:
| Does an o1 query run on a singular H100, or on a plurality of
| H100s?
| danpalmer wrote:
| A single H100 has 80GB of memory, meaning that at FP16 you
| could roughly fit a 40B parameter model on it, or at FP4
| quantisation you could fit a 160B parameter model on it. We
| don't know (I don't think) what quantisation OpenAI use, or
| how many parameters o1 is, but most likely...
|
| ...they probably quantise a bit, but not loads, as they
| don't want to sacrifice performance. FP8 seems like a
| possible middle ground. o1 is just a bunch of GPT-4o in a
| trenchcoat strung together with some advanced prompting.
| GPT-4o is theorised to be 200B parameters. If you wanted to
| run 5 parallel generation tasks at peak during the o1
| inference process, that's 5x 200B, at FP8, or about 12
| H100s. 12 H100s take about one full rack of kit to run.
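| A rough sketch of that memory arithmetic (parameter counts
| and quantisation levels are rumours/assumptions, and this
| ignores KV-cache and activation memory entirely):

```python
import math

H100_MEMORY_GB = 80  # memory per H100 card

def h100s_needed(params_billions: float, bits_per_param: int,
                 parallel_copies: int = 1) -> int:
    """Minimum H100s needed just to hold the model weights."""
    weights_gb = params_billions * bits_per_param / 8
    return math.ceil(parallel_copies * weights_gb / H100_MEMORY_GB)

print(h100s_needed(40, 16))     # 40B params at FP16 -> 1 card
print(h100s_needed(160, 4))     # 160B params at FP4 -> 1 card
print(h100s_needed(200, 8, 5))  # 5 parallel 200B copies at FP8 -> 13 cards
```

| Rounding 12.5 cards up gives 13, the same ballpark as the
| "about 12" figure.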
| lm28469 wrote:
| > can kill any profit margins for the product as a whole.
|
| Especially when the base line profit margin is negative to
| begin with
| sebzim4500 wrote:
| Is there any evidence to suggest this is true? IIRC there was
| leaked information that OpenAI's revenue was significantly
| higher than their compute spending, but it wasn't broken down
| between API and subscriptions so maybe that's just due to
| people who subscribe and then use it a few times a month.
| mrandish wrote:
| > OpenAI's revenue was significantly higher than their
| compute spending
|
| I find this difficult to believe, although I don't doubt leaks
| could have implied it. The challenge is that "the cost of
| compute" can vary greatly based on how it's accounted for
| (things like amortization, revenue recognition, capex vs
| opex, IP attribution, leasing, etc). Sort of like how
| Hollywood studio accounting can show a movie as profitable
| or unprofitable, depending on how "profit" is defined and
| how expenses are treated.
|
| Given how much all those details can impact the outcome, to
| be credible I'd need a lot more specifics than a typical
| leak includes.
| rubymamis wrote:
| I believe they have many data points to back up this decision.
| They surely know how people are using their products.
| londons_explore wrote:
| I wouldn't be surprised if the "unlimited" product is unlimited
| requests, but the quality of the responses drops if you ask
| millions of questions...
| rrr_oh_man wrote:
| like throttled unlimited data
| thih9 wrote:
| They are ready for this: there is a policy against
| automation, sharing, or reselling access, and it looks like
| there are some unspecified quotas as well:
|
| > We have guardrails in place to help prevent misuse and are
| always working to improve our systems. This may occasionally
| involve a temporary restriction on your usage. We will inform
| you when this happens, and if you think this might be a
| mistake, please don't hesitate to reach out to our support team
| at help.openai.com using the widget at the bottom-right of this
| page. If policy-violating behavior is not found, your access
| will be restored.
|
| Source: https://help.openai.com/en/articles/9793128-what-is-
| chatgpt-...
| lenerdenator wrote:
| Thing better find a way to make my hair grow back at that price.
|
| Of course, I'm not the target market.
|
| Some guy who wants to increase his bonus by laying off a few
| hundred people weeks before the holidays is the target market.
| jerkstate wrote:
| if o1 pro mode could integrate with web searching to do research,
| make purchases, and write and push code, this would be totally
| worth it. but that version will be $2000/mo.
| daft_pink wrote:
| I think it's easier to just pay for the api directly. That's what
| I do with ChatGPT and o1 even though I'm a plus subscriber.
| jpcom wrote:
| $20 a month is reasonable because a computer can do in an hour
| what a human can do in a month. Multiplying that by ten would
| suggest that the world is mostly made of Python and that the
| solution space for those programs has been "solved." GPT is still
| not good at writing Clojure and paying 10x more would not solve
| this problem for me.
| zebomon wrote:
| As of last week, it was incapable of writing any useful Prolog
| as well.
| kristianp wrote:
| Are there any functional languages it's good at?
| jpcom wrote:
| Haskell, apparently - there are hundreds of millions of lines
| that GPT was trained on.
| EcommerceFlow wrote:
| So what are the limits of the o1 pro mode? I'm waiting to
| purchase this at work!
| andrewinardeer wrote:
| "Open"AI - If you pay to play. People in developing countries
| where USD200 feeds a family of four for a month clearly won't be
| able to afford it and are disadvantaged.
| geraldwhen wrote:
| You've got it backwards. AI can replace workers in these
| locales who do not outperform chat gpt.
| andrewinardeer wrote:
| That's a business case use.
|
| On an individual level for solo devs in a developing nation
| USD200 a month is an enormous amount of money.
|
| For someone in a developed nation, this is just over a coffee
| a day.
| CSMastermind wrote:
| > OpenAI says that it plans to add support for web browsing, file
| uploads, and more in the months ahead.
|
| It's been extremely frustrating not to have these features on
| o1, and it has limited what I can do with it. I'm presumably
| in the market that doesn't mind paying $200/month, but
| without the features they've added to 4o it feels not worth
| it.
| ta_1138 wrote:
| There are many use cases for which the price can go even higher.
| Consider my recent interactions with people working at an
| interview mill: multiple people in a boiler room,
| interviewing for companies all day long, with a computer set
| up so that our audio was piped to o1. They had a reasonable
| prompt to remove many chatbot-isms and make it provide
| answers that seem human-like: we were 100% interviewing the
| o1 model. The operator said basically nothing, in both
| technical and behavioral interviews.
|
| A company making money off of this kind of scheme would be happy
| to pay $200 a seat for an unlimited license. And I would not be
| surprised if there were many other very profitable use cases that
| make $200 per month seem like a bargain.
| yosito wrote:
| So, wait a minute, when interviewing candidates, you're making
| them invest their valuable time talking to an AI interviewer,
| and not even disclosing to them that they aren't even talking
| to a real human? That seems highly unethical to me, yet not
| even slightly surprising. My question is, what variables are
| being optimized for here? It's certainly not about efficiently
| matching people with jobs, it seems to be more about increasing
| the number of interviews, which I'm sure benefits the people
| who get rewarded for the number of interviews, but seems like
| entirely the wrong metric.
| vundercind wrote:
| Scams and other antisocial use cases are basically the only
| ones for which the damn things are _actually_ the kind of
| productivity rocket-fuel people want them to be, so far.
|
| We better hope that changes sharply, or these things will be
| a net-negative development.
| wpietri wrote:
| Right? To me it's eerily similar to how cryptocurrency was
| sold as a general replacement for all money uses, but
| turned out to be mainly useful for societally negative
| things like scams and money laundering.
| lcnPylGDnU4H9OF wrote:
| It sounds like a setup where applicants hire some third-party
| company to perhaps "represent the client" in the interview,
| and that company hired a bunch of people to be the
| interviewee on their clients' behalf. Presumably neither the
| company nor the applicant discloses this arrangement to the
| hiring manager.
| yosito wrote:
| So, another, or several more, layers of ethical
| dubiousness.
| carbocation wrote:
| If this is also available via API, then I could easily see myself
| keeping the $20/mo pro and supplementing with API-based calls to
| the $200/mo Pro model as needed.
| chipgap98 wrote:
| Yeah when I saw the price tag I was hoping that some amount of
| API usage would be budgeted for this. It doesn't seem that way
| though
| bastard_op wrote:
| The only one worth using _is_ the o1 model. Otherwise it
| feels like talking to Curly, Larry, or Moe, who will give you
| the least-worst answer. The o1 model was actually usable, but
| only to show how bad the others really are.
| macawfish wrote:
| Maybe they're just trying to get some money out of it while they
| can, as open models and other competition loom closer and closer
| behind...
| kolbe wrote:
| Maybe. Gemini Exp 1121 is blowing my mind. Could be that OpenAI
| is seeing the context window disadvantage vs Google looming.
| sidibe wrote:
| The problem for openai is Google's cost are always going to
| be way lower than theirs if they're doing similar things.
| Google's secret sauce for so many of their products is
| cheaper compute. Once the models are close, decades of
| Google's experience integrating and optimizing use of new
| hardware into their data centers with high utilization aren't
| going to be overcome by openai for years.
| yosito wrote:
| $200 a month for this is insane, but I have a feeling that part
| of the reason they're charging so much is to give people more
| confidence in the model. In other words, it's a con. I'm a paying
| Perplexity user, and Perplexity already does this same sort of
| reasoning. At first it seemed impressive, then I started noticing
| mistakes in topics I'm an expert in. After a while, I started
| realizing that these mistakes are present in almost all topics,
| if you check the sources and do the reasoning yourself.
|
| LLMs are very good at giving plausible answers, but calling them
| "intelligent" is a misnomer. They're nothing more than predictive
| models, very useful for some things, but will ALWAYS be the wrong
| tool for the job when it comes to verifying truth and reasoning.
| apsec112 wrote:
| What evidence or data, if you (hypothetically) saw it, do you
| think would disprove the thesis that "[LLMs] will ALWAYS be the
| wrong tool for the job"?
| yosito wrote:
| You're attempting to set goal posts for a logical argument,
| like we're talking about religion or politics, and you've
| skipped the part about mutually agreeing on definitions.
| Define what an LLM is, in technical terms, and you will have
| your answer about why it is not intelligent, and not capable
| of reasoning. It is a statistical language model that
| predicts the next token of a plausible response, one token at
| a time. No matter how you dress it up, that's all it can ever
| do, by definition. The evidence or data that would change my
| mind is if instead of talking about LLMs, we were talking
| about some other technology that does not yet exist, but that
| is fundamentally different than an LLM.
| apsec112 wrote:
| If we defined "LLM" as "any deep learning model which uses
| the GPT transformer architecture and is trained using
| autoregressive next-token prediction", and then we
| empirically observed that such a model proved the Riemann
| Hypothesis before any human mathematician, it would seem
| very silly to say that it was "not intelligent and not
| capable of reasoning" because of an a-priori logical
| argument. To be clear, I think that probably won't happen!
| But I think it's ultimately an empirical question, not a
| logical or philosophical one. (Unless there's some sort of
| actual mathematical proof that would set upper bounds on
| the capabilities of such a model, which would be extremely
| interesting if true! but I haven't seen one.)
| yosito wrote:
| Let's talk when we've got LLMs proving the Riemann
| Hypothesis (or any mathematical hypothesis) without any
| proofs in the training data. I'm confident in my belief
| that an LLM can't do that, and will never be able to.
| LLMs can barely solve elementary school math problems
| reliably.
| valval wrote:
| If the cure for cancer arrived in the form of the most
| probable token being predicted one at a time, would your view
| on the matter change in any way?
|
| In other words, do you have proof that this medium of
| information output is doomed to forever be useless in
| producing information that adds value to the world?
|
| These are of course rhetorical questions that you nor
| anyone else can answer today, but you seem to have a weird
| sort of absolute position on this matter, as if a lot
| depended on your sentiment being correct.
| awongh wrote:
| Is $200 a lot if you end up using it quite often?
|
| It makes me wonder why they don't want to offer a usage based
| pricing model.
|
| Is it because people really believe it makes a much worse
| product offering?
|
| Why not offer some of the same capability as pay-per-use?
| MuffinFlavored wrote:
| > but I have a feeling that part of the reason they're charging
| so much is to give people more confidence in the model
|
| Or each user doing an o1 model prompt is probably, like,
| really expensive, and they need to charge for it until they
| can get costs down? Anybody have estimates on what a single
| request into o1 costs on their end? Like GPU, memory, all the
| "thought" tokens?
| yosito wrote:
| Perplexity does reasoning and searching, for $10/mo, so I
| have a hard time believing that it costs OpenAI 20x as much
| to do the same thing. Especially if OpenAI's model is really
| more advanced. But of course, no one except internal teams
| have all of the information about costs.
| Max-q wrote:
| I would say using the performance of Perplexity as a
| benchmark for the quality of o1-pro is a stretch.
| yosito wrote:
| Find third party benchmarks of the relevant models and then
| this discussion is worth having. Otherwise, it's just
| speculation.
| nemonemo wrote:
| Wouldn't you say the same thing about most people? Most
| people suck at verifying truth and reasoning. Even
| "intelligent" people make mistakes based on their biases.
|
| I think at least LLMs are more receptive to the idea that
| they may be wrong, and based on that, we can have N diverse
| LLMs that may argue more peacefully and build a more reliable
| consensus than N "intelligent" people.
| jazzyjackson wrote:
| The difference between a person and a bot is that a person
| has a stake in the outcome. A bot is like a person who's
| already put in their two weeks notice and doesn't have to be
| there to see the outcome of their work.
| MichaelZuo wrote:
| That's still amazing quality output for someone working for
| under $1/hour?
| Smaug123 wrote:
| It's not obvious that one should prefer that, versus
| _not_ having that output at all.
| lukan wrote:
| Intelligent people will know they made a mistake, if given a
| hint and figure out what went wrong.
|
| An LLM will just pretend to care about the error and happily
| repeat it over and over.
| sangeeth96 wrote:
| > they may argue more peacefully
|
| bit of a stretch.
| fourside wrote:
| How is this a counterargument? LLMs are marketed as having
| intelligence when it's more accurate to think of them as
| predictive models. The fact that humans are also flawed isn't
| super relevant to a $200/month LLM purchasing decision.
| yosito wrote:
| Yeah, most people suck at verifying truth and reasoning. But
| most information technology employees, above intern level,
| are highly capable of reasoning and making decisions in their
| area of expertise.
|
| Try asking an LLM complex questions in your area of
| expertise. Interview it as if you needed to be confident that
| it could do your job. You'll quickly find out that it can't
| do your job, and isn't actually capable of reasoning.
| jerjerjer wrote:
| The issue is that most people, especially when prompted, can
| provide their level of confidence in the answer or even
| refuse to provide an answer if they are not sure. LLMs, by
| default, seem to be extremely confident in their answers, and
| it's quite hard to get the "confidence" level out of them (if
| that metric is even applicable to LLMs). That's why they are
| so good at duping people into believing them after all.
| PittleyDunkin wrote:
| > The issue is that most people, especially when prompted,
| can provide their level of confidence in the answer or even
| refuse to provide an answer if they are not sure.
|
| People also pull this figure out of their ass, over or
| undertrust themselves, and lie. I'm not sure self-reported
| confidence is that interesting compared to "showing your
| work".
| ryan29 wrote:
| > Wouldn't you say the same thing for most of the people?
| Most of the people suck at verifying truth and reasoning.
| Even "intelligent" people make mistakes based on their
| biases.
|
| I think there's a huge difference because individuals can be
| reasoned with, convinced they're wrong, and have the ability
| to verify they're wrong and change their position. If I can
| convince one person they're wrong about something, they
| convince others. It has an exponential effect and it's a good
| way of eliminating common errors.
|
| I don't understand how LLMs will do that. If everyone stops
| learning and starts relying on LLMs to tell them how to do
| everything, who will discover the mistakes?
|
| Here's a specific example. I'll pick on LinuxServer since
| they're big [1], but almost every 'docker-compose.yml' stack
| you see online will have a database service defined like
| this:
|
|     services:
|       app:
|         # ...
|         environment:
|           - 'DB_HOST=mysql:3306'
|         # ...
|       mariadb:
|         image: linuxserver/mariadb
|         container_name: mariadb
|         environment:
|           - PUID=1000
|           - PGID=1000
|           - MYSQL_ROOT_PASSWORD=ROOT_ACCESS_PASSWORD
|           - TZ=Europe/London
|         volumes:
|           - /home/user/appdata/mariadb:/config
|         ports:
|           - 3306:3306
|         restart: unless-stopped
|
| Assuming the database is dedicated to that app, and it
| typically is, publishing port 3306 for the database isn't
| necessary and is a bad practice because it unnecessarily
| exposes it to your entire local network. You don't need to
| publish it because it's already accessible to other
| containers in the same stack.
|
| Another Docker related example would be a Dockerfile using
| 'apt[-get]' without the '--error-on=any' switch. Pay
| attention to Docker build files and you'll realize almost no
| one uses that switch. Failing to do so allows silent failures
| of the 'update' command and it's possible to build containers
| with stale package versions if you have a transient error
| that affects the 'update' command, but succeeds on a
| subsequent 'install' command.
|
| There are tons of misunderstandings like that which end up
| being so common that no one realizes they're doing things
| wrong. For people, I can do something as simple as posting on
| HN and others can see my suggestion, verify it's correct, and
| repeat the solution. Eventually, the misconception is
| corrected and those paying attention know to ignore the
| mistakes in all of the old internet posts that will never be
| updated.
|
| How do you convince ChatGPT the above is correct and that
| it's a million posts on the internet that are wrong?
|
| 1. https://docs.linuxserver.io/general/docker-
| compose/#multiple...
| vanviegen wrote:
| I asked ChatGPT 4o if there's anything that can be improved
| in your docker-compose file. Among other (seemingly
| sensible) suggestions, it offered:
|
| ## Restrict Host Ports for Security
|
| If app and mariadb are only communicating internally, you
| can remove 3306:3306 to avoid exposing the port to the host
| machine:
|
| ```yaml
| ports:
|   - 3306:3306  # Remove this unless external access is required.
| ```
|
| So, apparently, ChatGPT doesn't need any more convincing.
| ryan29 wrote:
| Wow. I can honestly say I'm surprised it makes that
| suggestion. That's great!
|
| I don't understand how it gets there though. How does it
| "know" that's the right thing to suggest when the
| majority of the online documentation all gets it wrong?
|
| I know how I do it. I read the Docker docs, I see that I
| don't think publishing that port is needed, I spin up a
| test, and I verify my theory. AFAIK, ChatGPT isn't
| testing to verify assumptions like that, so I wonder how
| it determines correct from incorrect.
| BeefWellington wrote:
| Here GPT is saying the port is only exposed to the host
| machine (e.g.: localhost), rather than the full local
| network.
| crazygringo wrote:
| > _In other words, it 's a con._
|
| A con like that wouldn't last very long.
|
| This is for people who rely enough on ChatGPT Pro features that
| it becomes worth it. Whether they pay for it because they're
| freelance, or their employer does.
|
| Just because an LLM doesn't boost your productivity, doesn't
| mean it doesn't for people in other lines of work. Whether
| LLM's help you at your work is _extremely_ domain-dependent.
| matteoraso wrote:
| >Whether LLM's help you at your work is extremely domain-
| dependent.
|
| I really doubt that, actually. The only thing that LLMs are
| truly good for is to create plausible-sounding text.
| Everything else, like generating facts, is outside of its
| main use case and known to frequently fail.
| rusticpenn wrote:
| I use LLMs to do most of my donkey work.
| tiahura wrote:
| LLMs have become indispensable for many attorneys. I know
| many other professionals that have been able to offload
| dozens of hours of work per month to ChatGPT and Claude.
| PittleyDunkin wrote:
| What on earth is this work that they're doing that's so
| resilient to the fallible nature of LLMs? Is it just
| document search with a RAG?
| tiahura wrote:
| Everything. Drafting correspondence, pleadings discovery,
| discovery responses. Reviewing all of the same. Reviewing
| depositions, drafting deposition outlines.
|
| Everything that is "word processing," and that's a lot.
| PittleyDunkin wrote:
| Well that's terrifying. Good luck to them.
| wing-_-nuts wrote:
| To be honest, much of contract law is formal boilerplate.
| I can understand why they'd want to move their role to
| 'review' instead of 'generate'
| drdaeman wrote:
| So, instead of fixing the issue (legal documents becoming
| a barely manageable mess) they're investing money into
| making it... even worse?
|
| This world is so messed up.
| randallsquared wrote:
| They have no lever with which to fix the issue.
| Terr_ wrote:
| Arguably the same problem occurs in programming: anything so
| formulaic and common that an LLM can regurgitate it with a
| decent level of reliability... is something that ought to
| have been folded into a method/library already.
|
| Or it already exists in some howto documentation, but
| nobody wanted to skim the documentation.
| PittleyDunkin wrote:
| Why not just move over to forms with structured input?
| bad_haircut72 wrote:
| Yeah the industries LLMs will disrupt the most are the
| ones who gatekeep busywork. SWE falls into this to some
| degree but other professions are more guilty than us.
| They dont replace intelligence they just surface jobs
| which never really required much intelligence to begin
| with.
| sebastiennight wrote:
| As a customer of legal work for 20 years, it is also way
| (way way) faster and cheaper to draft a contract with
| Claude (total work ~1 hour, even with complex back-and-
| forth ; you don't want to try to one-shot it in a single
| prompt) and then pay a law firm their top dollar-per-hour
| consulting to review/amend the contract (you can get to
| the final version in a day).
|
| Versus the old way of asking them to write the contract,
| where they'll blatantly re-use some boilerplate
| (sometimes the name of a previous client's company will
| still be in there) and then take 2 weeks to get back to
| you with Draft #1, charging 10x as much.
| cj wrote:
| Good law firms won't charge you for using their
| boilerplates, only the time to customize it for your use
| case.
|
| I always ask our lawyer whether or not they have a
| boilerplate when I need a contract written up. They usually
| do.
| jprd wrote:
| I bet they still charge for all the hours though.
| TeMPOraL wrote:
| That opinion made sense two years ago. It's plain weird to
| still hold it today.
| JoshTriplett wrote:
| There was a study recently that made it clear the use of
| LLMs for coding assistance made people _feel_ more
| productive but actually made them less productive.
|
| EDIT: Added links.
|
| https://www.cio.com/article/3540579/devs-gaining-little-
| if-a...
|
| https://web.archive.org/web/20241205204237/https://llmrep
| ort...
|
| (Archive link because the llmreporter site seems to have
| an expired TLS certificate at the moment.)
|
| No improvement to PR throughput or merge time, 41% more
| bugs, worse work-life balance...
| mkl wrote:
| Do you have a link? I'm not finding it by searching.
| grogenaut wrote:
| I recently slapped 3 different 3-page SQL statements, and
| their obscure errors with no line or context references from
| Redshift, into Claude; it was 3 for 3 on telling me where in
| my query I was messing up. It saved me probably 5 minutes
| each time, but really saved me from moving to a different
| task and coming back. So around $100 in value right there. I
| was impressed by it. I wish the query UI I was using just
| auto-ran it when I got an error. I should code that up as an
| extension.
| mdtancsa wrote:
| I am in a similar boat. It's way more correct than not
| for the tasks I give it. For simple queries about, say, CLI
| tools I don't use that often, or regex formulations, I find
| it handy since when it gives an answer it's easy to test
| whether it's right or not. If it gets it wrong, I work _with_
| Claude to get to the right answer.
| mattkrause wrote:
| $100 to save 15 minutes implies that you net at least
| $800,000 a year. Well done if so!
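| For what it's worth, the implied rate works out like this
| (assuming 2,000 working hours per year):

```python
value_usd = 100.0        # value attributed to the three fixes
minutes_saved = 15       # 3 queries x 5 minutes each
working_hours_per_year = 2000

hourly_rate = value_usd / (minutes_saved / 60)   # $400/hour
implied_salary = hourly_rate * working_hours_per_year

print(hourly_rate, implied_salary)  # 400.0 800000.0
```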
| afro88 wrote:
| > but really saved me from moving to a different task and
| coming back
|
| You missed this part. Being able to quickly fix things
| without deep thought while in flow saves you from the
| slowdowns of context switching.
| TeMPOraL wrote:
| That $100 of value likely cost them more like $0.10-$1
| in API costs.
| marcodiego wrote:
| I really need the source of this.
| TeMPOraL wrote:
| First of all, that's moving the goalposts to the next state
| over, relative to what I replied to.
|
| Secondly, the "No improvement to PR throughput or merge
| time, 41% more bugs, worse work-life balance" result you
| quote came, per article, from a "study from Uplevel",
| which seems to[0] have been testing for change "among
| developers utilizing Copilot". That may or may not be
| surprising, but again it's hardly relevant to discussion
| about SOTA LLMs - it's like evaluating the performance of
| an excavator by giving 1:10 toy excavator models to
| children and observing whether they dig holes in the
| sandbox faster than their shovel-equipped friends.
|
| Best LLMs are too slow and/or expensive to use in Copilot
| fashion just yet. I'm not sure if it's even a good idea -
| Copilot-like use breaks flow. Instead, the biggest wins
| coming from LLMs are from discussing problems, generating
| blocks of code, refactoring, unstructured to structured
| data conversion, identifying issues from build or
| debugger output, etc. All of those uses require
| qualitatively more "intelligence" than Copilot-style, and
| LLMs like GPT-4o and Claude 3.5 Sonnet deliver (hell,
| anything past GPT 3.5 delivered).
|
| Thirdly, I have some doubts about the very metrics used.
| I'll refrain from assuming the study is plain wrong here
| until I read it (see [0]), but anecdotally, I can tell
| you that at my last workplace, you likely wouldn't be
| able to tell whether or not using LLMs the right way
| (much less Copilot) helped by looking solely at those
| metrics - almost all PRs were approved by reviewers with
| minor or tangential commentary (thanks to culture of
| testing locally first, and not writing shit code in the
| first place), but then would spend days waiting to be
| merged due to shit CI system (overloaded to the point of
| breakage - apparently all the "developer time is more
| expensive than hardware" talk ends when it comes to
| adding compute to CI bots).
|
| --
|
| [0] - Per the article you linked; I'm yet to find and
| read the actual study itself.
| gwervc wrote:
| > A con like that wouldn't last very long.
|
| That's not a problem. OpenAI needs to get some cash from its
| product because the competition from free models is intense.
| Moreover, since they supposedly used most of the web content
| and pirated whatever else they could, improvements in
| training will likely be only incremental.
|
| All the while, now that the wow effect has passed, more
| people are starting to realize the flaws in generative AI.
| So the current hype, like all hype, has a limited shelf
| life, and companies need to cash out now because later may
| be never.
| mikae1 wrote:
| A con? It's not that $200 is a con, their whole existence
| is a con.
|
| They're bleeding money and are desperately looking for a
| business model to survive. It's not going very well.
| Zitron[1] (among others) has outlined this.
|
| _> OpenAI's monthly revenue hit $300 million in August,
| and the company expects to make $3.7 billion in revenue
| this year (the company will, as mentioned, lose $5 billion
| anyway), yet the company says that it expects to make $11.6
| billion in 2025 and $100 billion by 2029, a statement so
| egregious that I am surprised it's not some kind of
| financial crime to say it out loud. [...] At present,
| OpenAI makes $225 million a month -- $2.7 billion a year --
| by selling premium subscriptions to ChatGPT. To hit a
| revenue target of $11.6 billion in 2025, OpenAI would need
| to increase revenue from ChatGPT customers by 310%._[1]
|
| Surprise surprise, they just raised the price.
|
| [1] https://www.wheresyoured.at/oai-business/
| luma wrote:
| They haven't raised the price, they have added new models
| to the existing tier with better performance at the same
| price.
|
| They have also added a new, even higher performance model
| which can leverage test time compute to scale performance
| if you want to pay for that GPU time. This is no
| different than AWS offering some larger ec2 instance tier
| with more resources and a higher price tag than existing
| tiers.
| echelon wrote:
| They're throwing products at the wall to see what sticks.
| They're trying to rapidly morph from a research company
| into a product company.
|
| Models are becoming a commodity. It's game theory. Every
| second place company (eg. Meta) or nation (eg. China) is
| open sourcing its models to destroy value that might
| accrete to the competition. China alone has contributed a
| ton of SOTA and novel foundation models (eg. Hunyuan).
| mikae1 wrote:
| You're technically right. New models will likely be
| incremental upgrades at a hefty premium. But considering
| the money they're losing, this pricing likely better
| reflects their costs.
| jsheard wrote:
| They haven't raised the price _yet_ but NYT has seen
| internal documents saying they do plan to.
|
| https://www.nytimes.com/2024/09/27/technology/openai-
| chatgpt...
|
| _Roughly 10 million ChatGPT users pay the company a $20
| monthly fee, according to the documents. OpenAI expects
| to raise that price by $2 by the end of the year, and
| will aggressively raise it to $44 over the next five
| years, the documents said._
|
| We'll have to see if the bump to $22 this year ends up
| happening.
| sdesol wrote:
| > We'll have to see if the bump to $22 this year ends up
| happening.
|
| I can't read the article. Any mention of the API pricing?
| ethbr1 wrote:
| Reasoning through that from a customer perspective is
| interesting.
|
| I'm hard pressed to identify any users to whom LLMs are
| providing enough value to justify $20/month, but not $44.
|
| On the other hand, I can see a lot of people to whom it's
| not providing any value being unable to afford a higher
| price.
|
| Guess we'll see which category most OpenAI users are in.
| grogenaut wrote:
| AI may be over hyped and it may have flaws (I think it is
| both)... but it may also be totally worth $200 / month to
| many people. My brother is getting way more value than that
| out of it for instance.
|
| So the question is it worth $200/month and to how many
| people, not is it over hyped, or if it has flaws. And does
| that support the level of investment being placed into
| these tools.
| echelon wrote:
| > the competition is intense from free models
|
| Models are about to become a commodity across the spectrum:
| LLMs [1], image generators [2], video generators [3], world
| model generators [4].
|
| The thing that matters is _product_.
|
| [1] Llama, QwQ, Mistral, ...
|
| [2] Nobody talks about Dall-E anymore. It's Flux, Stable
| Diffusion, etc.
|
| [3] HunYuan beats Sora, RunwayML, Kling, and Hailuo, and
| it's open source and compatible with ComfyUI workflows.
| Other companies are trying to open source their models with
| no sign of a business model: LTX, Genmo, Rhymes, et al.
|
| [4] The research on world models is expansive and there are
| lots of open source models and weights in the space.
| shortrounddev2 wrote:
| Overcharging for a product to make it seem better than it
| really is has served Apple well for decades
| crazygringo wrote:
| That's a tired trope that simply isn't true.
|
| Does Apple charge a premium? Of course. Do Apple products
| also tend to have better construction, greater reliability,
| consistent repair support, and hold their resale value
| better? Yes.
|
| The idea that people are buying Apple _because_ of the
| Apple premium simply doesn't hold up to any scrutiny. It's
| demonstrably not a Veblen good.
| shortrounddev2 wrote:
| > consistent repair support
|
| The lack of repairability is easily Apple's worst
| quality. They do everything in their power to prevent you
| from repairing devices by yourself or via 3rd party
| shops. When you take it to them to repair, they often
| will charge you more than the cost of a new device.
|
| People buy apple devices for a variety of reasons; some
| people believe in a false heuristic that Apple devices
| are good for software engineering. Others are simply
| teenagers who don't want to be the poor kid in school
| with an Android. Conspicuous consumption is a large part
| of Apple's appeal.
| Draiken wrote:
| Here in Brazil Apple is very much all about showing off
| how rich you are. Especially since we have some of the
| most expensive Apple products in the world.
|
| Maybe not as true in the US, but reading about the green
| bubble debacle, it's also a lot about status.
| vbezhenar wrote:
| Same in Kazakhstan. It's all about status. Many poor
| people take out credit to buy iPhones because they want to
| look rich.
| windexh8er wrote:
| > consistent repair support
|
| Now _that_ is a trope when you're talking about Apple.
| They may use more premium materials and have a degree of
| improved construction leveraging those materials - but at
| the end of the day there are countless failure-prone
| designs that Apple continued to ship for years even after
| knowing they existed.
|
| I guess I don't follow the fact that the "Apple Premium"
| (whether real or otherwise) isn't a factor in a buyer
| decision. Are you saying Apple is a great lock-in system
| and that's why people continue to buy from them?
| Aeolun wrote:
| I used to love to bash on Apple too. But ever since I've
| had the money all my hardware (except desktop PC) has
| been Apple.
|
| There's something to be said for buying something and
| knowing it will interoperate with all your other stuff
| perfectly.
| cruano wrote:
| They only have to be consistently better than the
| competition, and they are, by far. I always look for
| reviews before buying anything, and even then I've been
| nothing but disappointed by the likes of Razer, LG,
| Samsung, etc.
| chipotle_coyote wrote:
| I suspect they're saying that for a lot of us, Apple
| provides enough value compared to the competition that we
| buy them _despite_ the premium prices (and, on iOS, the
| lock-in).
|
| It's very hard to explain to people who haven't dug into
| macOS that it's a great system for power users, for
| example, especially because it's not very customizable in
| terms of aesthetics, and there are always things you can
| point to about its out-of-the-box experience that seem
| "worse" than competitors (e.g., window management). And
| there's no one thing I can really point to and say "that,
| that's why I stay here"; it's more a collection of
| _little_ things. The service menu. The customizable
| global keyboard shortcuts. Automator, AppleScript (in
| spite of itself), now the Shortcuts app.
|
| And, sure, they tend to push their hardware in some ways,
| not always wisely. Nobody asked for the world's thinnest,
| most fragile keyboards, nor did we want them to spend
| five or six years fiddling with it and going "We think we
| have it now!" (Narrator: they did not.) But I really do
| like how solid my M1 MacBook Air feels. I really
| appreciate having a 2880x1800 resolution display with the
| P3 color gamut. It's a good machine. Even if I could run
| macOS well on other hardware, I'd still probably _prefer_
| running it on this hardware.
|
| Anyway, this is very off topic. That ChatGPT Pro is
| pretty damn expensive, isn't it? This little conversation
| branch started as a comparison between it and the "Apple
| tax", but even as someone who mildly grudgingly pays the
| Apple tax every few years, the ChatGPT Pro tax is right
| off the table.
| xanderlewis wrote:
| Apple products are expensive -- sometimes to a degree that
| almost seems to be taking the piss.
|
| But name one other company whose hardware truly matches
| Apple's standards for precision and attention to detail.
| ingen0s wrote:
| Indeed
| omarhaneef wrote:
| I think this is probably right but so far it seems that the
| areas in which an LLM is most effective do fine with the
| lower power models.
|
| Example: the 4o or Claude are great for coding, summarizing
| and rewriting emails. So which domains require a slightly
| better model?
|
| I suppose if the error rate in code or summary goes down even
| 10%, it might be worth $180/month.
| ducttapecrown wrote:
| I bet users won't pay for the power, but for a guarantee of
| access! I always hear about people running out of compute
| time for ChatGPT. Obvious answer is charge more for a
| higher quality service.
| vbezhenar wrote:
| A few days ago I had an issue with an IPsec VPN behind
| NAT. I spent a few hours Googling around and tinkering
| with the system; I had some rough understanding of what
| was going wrong, but not much, and I had no idea how to
| solve the issue.
|
| I wrote a very exhaustive question to ChatGPT o1-preview,
| including all the information I thought was relevant -
| something like a good forum question. Well, 10 seconds
| later it spat out a working solution. I was ashamed,
| because I have 20 years of experience under my belt and
| this model solved a non-trivial task much better than me.
|
| I was ashamed but at the same time that's a superpower. And
| I'm ready to pay $200 to get solid answers that I just
| can't get in a reasonable timeframe.
| gedy wrote:
| It is really great when it works, but the challenge is
| I've sometimes had it not understand a detailed
| programming question, and it confidently gives an
| incorrect answer. Going back and forth a few times makes
| it clear it really doesn't know the answer, but I end up
| going in circles. I know LLMs can't really tell you
| "sorry, I don't know this one", but I wish they could.
| BOOSTERHIDROGEN wrote:
| The exhaustive question makes ChatGPT reconstruct your
| answer in real-time, while all you need to do is sleep;
| your brain will construct the answer and deliver it
| tomorrow morning.
| pera wrote:
| > _A con like that wouldn't last very long._
|
| The NFT market lasted for many years and was enormous.
|
| Never underestimate the power of hype.
| taco_emoji wrote:
| > A con like that wouldn't last very long.
|
| Bernie Madoff ran his investment fund as a Ponzi scheme for
| over a decade (perhaps several decades)
| john-radio wrote:
| A better way to express it than a "con" is that it's a price-
| framing device. It's like listing a watch at an initial value
| of $2,000 so that people will feel content to buy it at $400.
| jl6 wrote:
| That sounds like a con to me too.
| xanderlewis wrote:
| The line between 'con' and 'genuine value synthesised in
| the eye of the buyer using nothing but marketing' is very
| thin. If people are happy, they are happy.
| px1999 wrote:
| Imo the con is picking the metric that makes others look
| artificially bad when it doesn't seem to be all that
| different (at least on the surface)
|
| > we use a stricter evaluation setting: a model is only
| considered to solve a question if it gets the answer right in
| four out of four attempts ("4/4 reliability"), not just one
|
| This surely makes the other models post smaller numbers. I'd
| be curious how it stacks up if doing eg 1/1 attempt or 1/4
| attempts.
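To see why the stricter setting shrinks the other models' numbers: if each attempt is an independent draw with per-attempt accuracy p (an idealization - real attempts on the same question are correlated), the 4/4 pass rate is p^4, which punishes mid-accuracy models hard:

```python
def pass_rate(p: float, k: int = 4) -> float:
    """Probability of answering correctly k times in a row,
    assuming independent attempts with per-attempt accuracy p."""
    return p ** k

# A model that is right 90% of the time per attempt scores 0.90
# under 1/1 scoring but only ~0.656 under 4/4 scoring.
```

So a 1/1-vs-4/4 comparison can make two similar models look far apart.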
| mrandish wrote:
| > ... or their employer does.
|
| I suspect this is a key driver behind having a higher priced,
| individual user offering. It gives pricing latitude for
| enterprise volume licenses.
| ben_w wrote:
| > A con like that wouldn't last very long
|
| As someone who has both repeatedly written that I value the
| better LLMs as if they were a paid intern (so
| EUR/USD/GBP 1000/month at least), and yet who gets so much from the
| free tier* that I won't bother paying for a subscription:
|
| I've seen quite a few cases where expensive non-functional
| things that experts demonstrate don't work, keep making
| money.
|
| My mum was very fond of homeopathic pills and Bach flower
| tinctures, for example.
|
| * 3.5 was competent enough to write a WebUI for the API so
| I've got the fancy stuff anyway as PAYG when I want it.
| spaceman_2020 wrote:
| HN has been just such an awful place to discuss AI. Everyone
| here is convinced it's a grift, a con, and we're all "marks".
|
| Just zero curiosity, only skepticism.
| 999900000999 wrote:
| Ok.
|
| Let's say I run a company called AndSoft.
|
| AndSoft has about 2000 people on staff, maybe 1000
| programmers.
|
| This solution will cost $200k per month, or $2.4 million
| per year.
|
| Llama3 is effectively free, with some limitations. Is
| ChatGPT Pro $2.4 million a year better than Llama3? Of
| course OpenAI will offer volume discounts.
|
| I imagine if I was making north of $500k a year I'd
| subscribe as a curiosity... at least for a few months.
|
| If your time is worth $250 an hour, and this saves you an
| hour per month, it's well worth it.
| newsclues wrote:
| Maybe not very long, but long enough is plausible.
| AlanYx wrote:
| $200 a month is potentially a bargain since it comes with
| unlimited advanced voice. Via the API, $200 used to only get
| you 14 hours of advanced voice.
| htrp wrote:
| you'll be throttled and rate limited
| jerjerjer wrote:
| Does it give unlimited API access though?
| AlanYx wrote:
| No (naturally). But my thought process is that if you use
| advanced voice even half an hour a day, it's probably a
| fair price based on API costs. If you use it more, for
| something like language learning or entertaining kids who
| love it, it's potentially a bargain.
| yosito wrote:
| I've got unlimited "advanced voice" with Perplexity for
| $10/mo. You're defining a bargain based on the arbitrary
| limits set by the company offering you said bargain.
| lm28469 wrote:
| > $200 a month for this is insane
|
| Losing $5-10b per year also is insane. People are still looking
| for the added value, it's been 2 whole years now
| MP_1729 wrote:
| My new intern is on his 3rd day on the job and he's still
| behind o1-preview after fewer than 25 prompts.
| yosito wrote:
| Sounds like you're the perfect customer for this offer then.
| Good luck!
| MP_1729 wrote:
| I'm in a low-cost country, haha! So the intern is even
| cheaper.
| vessenes wrote:
| I mean this in what I hope will be taken in the most helpful
| way possible: you should update your thinking to at least
| _imagine_ that intelligent thoughtful people see some value in
| ChatGPT. Or alternately that some of the people who see value
| in ChatGPT are intelligent and thoughtful. That is, aim for the
| more intelligent "Interesting, why do so many people like
| this? Where is it headed? Given that, what is worth doing now,
| and what's worth waiting on?" over the "This doesn't meet my
| standards in my domain, ergo people are getting scammed."
|
| I'll pay $200 a month, no problem; right now o1-preview does
| the work for me of a ... somewhat distracted graduate student
| who needs checking, all for under $1 / day. It's slow for an
| LLM, but SUPER FAST for a grad student. If I can get a more
| rarely distracted graduate student that's better at coding for
| $7/day, well, that's worth a try. I can always cancel.
| ghshephard wrote:
| If you do a lot of work in an area that o1 is strong in -
| $200/month effectively rounds down to $0 - and a single good
| answer at the right time could justify that entire $200 in a
| single go.
| yosito wrote:
| Presumably, this is what they want the marks buying the $200
| plan to think. Whether it's actually capable of providing
| answers worth $200 and not just sweet talking is the whole
| question.
| daveguy wrote:
| I feel like a single bad answer at the wrong time could cost
| a heck of a lot more than $200. And these LLMs are riddled
| with bad answers.
| amelius wrote:
| Think of it as an intern. Don't trust everything they say.
| parthdesai wrote:
| Yeah, but you personally don't pay $200/month out of your
| pocket for the intern. Heck in Canada, govt. actually
| rebates for hiring interns and co-ops.
| crindy wrote:
| It's so strange to me that in a forum full of
| programmers, people don't seem to understand that you set
| up systems to detect errors before they cause problems.
| That's why I find ChatGPT so useful for helping me with
| programming - I can tell if it makes a mistake because...
| the code doesn't do what I want it to do. I already have
| testing and linting set up to catch my own mistakes, and
| those things also catch AI's mistakes.
| daveguy wrote:
| You assume programming software with an existing well-
| defined and correct test suite is all these will be used
| for.
| xandrius wrote:
| Thank you! I always feel so weird to actually use chatgpt
| without any major issues while so many people keep on
| claiming how awful it is; it's like people want it 100%
| perfect or nothing. For me if it gets me 80% there in
| 1/10 the time, and then I do the final 20%, that's still
| heck of a productivity boost basically for free.
| JamesBarney wrote:
| Well now at $200 it's a little farther away from free :P
| thelastparadise wrote:
| I could buy a car for that kind of money!
| crindy wrote:
| Yep, I'm with you. I'm a solo dev who never went to
| college... o1 makes far fewer errors than I do! No chance
| I'd make it past round one of any sort of coding
| tournament. But I managed to bootstrap a whole saas
| company doing all the coding myself, which involved
| setting up a lot of guard rails to catch my own mistakes
| before they reached production. And now I can consult
| with a programming intelligence the likes of which I
| could never afford to hire if it was a person. It's
| amazing.
| thelastparadise wrote:
| Is it working?
| crindy wrote:
| Not sure what you're referring to exactly. But broadly
| yes it is working for me - the number of new features I
| get out to users has sped up greatly, and stability of my
| product has also gone up.
| xanderlewis wrote:
| It depends what you're doing.
|
| For tasks where bullshitting or regurgitating common
| idioms is key, it works rather well and indeed takes you
| 80% or even close to 100% of the way there. For tasks
| that require technical precision _and_ genuine
| originality, it's hopeless.
| lumb63 wrote:
| Famously, the last 10% takes 90% of the time (or 20/80 in
| some approximations). So even if it gets you 80% of the
| way in 10% of the time, maybe you don't end up saving any
| time, because all the time is in the last 20%.
|
| I'm not saying that LLMs can't be useful, but I do think
| it's a darn shame that we've given up on creating tools
| that deterministically perform a task. We know we make
| mistakes and take a long time to do things. And so we
| developed tools to decrease our fallibility to zero, or
| to allow us to achieve the same output faster. But that
| technology needs to be reliable; and pushing the envelope
| of that reliability has been a cornerstone of human
| innovation since time immemorial. Except here, with the
| "AI" craze, where we have abandoned that pursuit. As the
| saying goes, "to err is human"; the 21st-century update
| will seemingly be, "and it's okay if technology errs
| too". If any other foundational technology had this
| issue, it would be sitting unused on a shelf.
|
| What if your compiler only generated the right code 99%
| of the time? Or, if your car only started 9 times out of
| 10? All of these tools can be useful, but when we are so
| accepting of a lack of reliability, more things go wrong,
| and potentially at larger and larger scales and
| magnitudes. When (if some folks are to believed) AI is
| writing safety-critical code for an early-warning system,
| or deciding when to use bombs, or designing and
| validating drugs, what failure rate is tolerable?
| avarun wrote:
| > Famously, the last 10% takes 90% of the time (or 20/80
| in some approximations). So even if it gets you 80% of
| the way in 10% of the time, maybe you don't end up saving
| any time, because all the time is in the last 20%.
|
| This does not follow. By your own assumptions, getting
| you 80% of the way there in 10% of the time would save
| you 18% of the overall time, if the first 80% typically
| takes 20% of the time. 18% time reduction in a given task
| is still an incredibly massive optimization that's easily
| worth $200/month for a professional.
| km3r wrote:
| Using a 90/10 split: the easy 90% of the work takes 10% of
| the time; reducing that 10% to a tenth of itself saves 9%
| of total time.
|
| 160 hours a month * $100/hr programmer * 9% = $1,440 in
| savings - easily enough to justify $200/month.
|
| Even if 1/10th of the time it fails, that is still ~8% or
| $1200 savings.
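The arithmetic in the last two comments follows one formula: if the "easy" bulk of a task takes some fraction of total time and an LLM does that bulk at a given speedup, the saving is that fraction times (1 - 1/speedup). A sketch - the 80/20 and 90/10 splits, 10x speedup, and $100/hr rate are the commenters' assumptions, not measured values:

```python
def fraction_saved(easy_time_fraction: float, speedup: float) -> float:
    """Fraction of total task time saved when the portion that
    normally takes `easy_time_fraction` of the time is done at
    `speedup`x speed."""
    return easy_time_fraction * (1 - 1 / speedup)

# 80/20 split, 10x speedup on the easy part: 0.20 * 0.9 = 18% saved.
# 90/10 split, 10x speedup: 0.10 * 0.9 = 9% saved,
# i.e. 160 h/month * $100/h * 0.09 = $1,440/month.
```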
| CamperBob2 wrote:
| _I always feel so weird to actually use chatgpt without
| any major issues while so many people keep on claiming
| how awful it is;_
|
| People around here feel seriously threatened by ML
| models. It makes no sense, but then, neither does
| defending the Luddites, and people around here do that,
| too.
| vunderba wrote:
| Of course, but for every thoroughly set up TDD
| environment, you have a hundred other people just blindly
| copy pasting LLM output into their code base and trusting
| the code based on a few quick sanity checks.
| leptons wrote:
| >I can tell if it makes a mistake because... the code
| doesn't do what I want it to do
|
| Sometimes it does what you want it to do, but still
| creates a bug.
|
| Asked the AI to write some code to get a list of all
| objects in an S3 bucket. It wrote some code that worked,
| but it did not address the fact that S3 delivers objects
| in pages of max 1000 items, so if the bucket contained
| less than 1000 objects (typical when first starting a
| project), things worked, but if the bucket contained more
| than 1000 objects (easy to do on S3 in a short amount of
| time), then that would be a subtle but important bug.
|
| Someone not already intimately familiar with the inner
| workings of S3 APIs would not have caught this. It's
| anyone's guess if it would be caught in a code review, if
| a code review is even done.
|
| I don't ask the AI to do anything complicated at all, the
| most I trust it with is writing console.log statements,
| which it is pretty good at predicting, but still not
| perfect.
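For reference, the fix the model missed is to follow S3's continuation tokens. A sketch using the real `list_objects_v2` pagination fields (`IsTruncated`, `NextContinuationToken`); `s3_client` is any boto3-style S3 client:

```python
def list_all_keys(s3_client, bucket: str) -> list:
    """Collect every key in `bucket`, following continuation
    tokens across S3's 1000-object-per-response pages."""
    keys, token = [], None
    while True:
        kwargs = {"Bucket": bucket}
        if token:
            kwargs["ContinuationToken"] = token
        resp = s3_client.list_objects_v2(**kwargs)
        keys.extend(obj["Key"] for obj in resp.get("Contents", []))
        if not resp.get("IsTruncated"):
            return keys
        token = resp["NextContinuationToken"]
```

boto3 also ships a built-in paginator (`s3.get_paginator("list_objects_v2")`) that does the same loop for you.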
| rrradical wrote:
| So the AI wrote a bug; but if humans wouldn't catch it in
| code review, then obviously they could have written the
| same bug. Which shouldn't be surprising because LLMs
| didn't invent the concept of bugs.
|
| I use LLMs maybe a few times a month but I don't really
| follow this argument against them.
| yawnxyz wrote:
| it also catches MY mistakes, so that saves time
| Kiro wrote:
| So true, and people seem to gloss over this fact
| completely. They only talk about correcting the LLM's
| code while the opposite is much more common for me.
| stavros wrote:
| If I have to do the work to double-check all the answers,
| why am I paying $200?
| billti wrote:
| Why do companies hire junior devs? You still want a
| senior to review the PRs before they merge into the
| product right? But the net benefit is still there.
| stavros wrote:
| We hire junior devs as an investment, because at some
| point they turn into seniors. If they stayed juniors
| forever, I wouldn't hire them.
| jhgg wrote:
| Are you implying this technology will remain static in
| its capabilities going forward despite it having seen
| significant improvement over the last few years?
| stavros wrote:
| No, I'm explicitly saying that gpt-4o-2024-11-20 won't
| get any smarter, no matter how much I use it.
| jhgg wrote:
| Does that matter when you can just swap it for
| gpt-5-whatever at some point in the future?
| stavros wrote:
| Someone asked why I hire juniors. I said I hire juniors
| because they get better. I don't need to use the model
| for it to get better, I can just wait until it's good and
| use it then. That's the argument.
| drusepth wrote:
| I started incorporating LLMs into my workflows around the
| time gpt-3 came out. By comparison to its performance at
| that point, it sure feels like my junior is starting to
| become a senior.
| OvidNaso wrote:
| Genuinely curious, are you saying that your junior devs
| don't provide any value from the work they do?
| stavros wrote:
| They provide some value, but between the time they take
| in coaching, reviewing their work, support, etc, I'm
| fairly sure one senior developer has a much higher work
| per dollar ratio than the junior.
| behringer wrote:
| Because you wouldn't have come up with the correct answer
| before you used up 200 dollars worth of salary or
| billable time.
| tiahura wrote:
| Because double checking and occasionally hitting retry is
| still 10x faster than me doing.
| Sporktacular wrote:
| Because it's per month and not per hour for a specialist
| consultant.
| motoxpro wrote:
| I don't know anyone who does something and at first says,
| "This will be a mistake." Instead they say, "I am pretty
| sure this is the right thing to do," and then they make a
| mistake.
|
| If it's easier mentally, just put that second sentence in
| front of every ChatGPT answer.
|
| Yeah the Junior dev gets better, but then you hire
| another one that makes the same mistakes, so in reality,
| on an absolute basis, the junior dev never gets any
| better.
| UltraSane wrote:
| because checking the work is much faster than generating
| it.
| crackrook wrote:
| I would hesitate to hire an intern that makes incorrect
| statements with maximum confidence and with no ability to
| learn from their mistakes.
| pie420 wrote:
| Nothing ChatGPT says is with maximum confidence. The EULA
| and terms of use are riddled with "no guarantee of
| accuracy" and "use at own risk".
| albumen wrote:
| No, they're right. ChatGPT (and all chatbots) responds
| confidently while making simple errors. Disclaimers at
| signup or in tiny corner text are completely at odds with
| the actual chat experience.
| daveguy wrote:
| I think the point is that an LLM almost always responds
| with the appearance of high confidence. It will much
| quicker hallucinate than say "I don't know."
| Terr_ wrote:
| And we, as humans, have a hard time compartmentalizing
| and forgetting our lifetime of language cues, which
| _typically_ correlate with attention to detail,
| intelligence, time investment, etc.
|
| New technology allows those signs to be _counterfeited_
| quickly and cheaply, and it tricks our subconscious
| despite our best efforts to be hyper-vigilant. (Our
| brains don't want to do that; it's expensive.)
|
| Perhaps a stopgap might be to make the LLM say everything
| in a hostile, villainous way...
| Draiken wrote:
| They aren't talking about EULAs. It's how they give out
| their answers.
| crackrook wrote:
| What I meant to say was that the model uses the verbiage
| of a maximally confident human. In my experience the
| interns worth having have some sense of the limits of
| their knowledge and will tell you "I don't know" or
| qualify information with "I'm not certain, but..."
|
| If an intern set their Slack status to "There's no
| guarantee that what I say will be accurate, engage with
| me at your own risk." That wouldn't excuse their attempts
| to answer every question as if they wrote the book on the
| subject.
| educasean wrote:
| When you highlight only the negatives, yeah it does sound
| like no one should hire that intern. But what if the same
| intern happens to have an encyclopedia for a brain and
| can pore over massive documents and codebases to spot
| and fix countless human errors in a snap?
|
| There seems to be two camps: People who want nothing to
| do with such flawed interns - and people who are trying
| to figure out how to amplify and utilize the positive
| aspects of such flawed, yet powerful interns. I'm
| choosing to be in the latter camp.
| ZiiS wrote:
| Even in this case, losing $200 + whatever vs. a slightly
| higher chance of losing $20 + whatever makes Pro seem a
| good deal.
| ruszki wrote:
| Compared to knowing things and not losing whatever, both
| are pretty bad deals.
| daveguy wrote:
| Doesn't that completely depend on those chances and the
| magnitude of +whatever?
|
| It just seems to me that you really need to know the
| answer before you ask it to be over 90% confident in the
| answer. And the more convincing sounding these things get
| the more difficult it is to know whether you have a
| plausible but wrong answer (aka "hallucination") vs a
| correct one.
|
| If you have a need for a lot of difficult to come up with
| but easy to verify answers it could be worth it. But the
| difficult to come up with answers (eg novel research) are
| also where LLMs do the worst.
| awestroke wrote:
| Easy - don't trust the answers. Verify them
| malux85 wrote:
| Then the lesson you have learned is "don't blindly trust
| the machine"
|
| Which is a very valuable lesson, worth more than $200
| llm_trw wrote:
| That's why you have a human in the loop responsible for the
| answer.
| Kiro wrote:
| What specific use cases are you referring to where that
| poses a risk? I've been using LLMs for years now (both
| directly and as part of applications) and can't think of a
| single instance where the output constituted a risk or
| where it was relied upon for critical decisions.
| dubeye wrote:
| If I'm happy to pay 20 in retirement just for the odd bit
| of writing help, then I can easily imagine it being worth
| 200 to someone with a job.
| josephg wrote:
| Yep. I'm currently paying for both Claude and chatgpt because
| they're good at different things. I can't tell whether this
| is extremely cheap or expensive - last week Claude saved me
| about a day of time by writing a whole lot of very complex
| sql queries for me. The value is insane.
| cryptoegorophy wrote:
| Yeah, as someone who is far from programming, the amount of
| time and money it saved me by helping me write SQL queries
| and PHP code for WordPress is insane. It even helped me
| fix some WordPress plugins that had errors and you just
| copy paste or even screenshot those errors until they get
| fixed! If used correctly and efficiently the value is
| insane, I would say $20, $200 is still cheap for such an
| amazing tool.
| behringer wrote:
| I kind of feel this is a kick in the face.
|
| Now I'll forever be using a second rate model because I'm not
| rich enough.
|
| If I'm stuck using a second rate model I may go find someone
| else's model to use.
| raincole wrote:
| The problem isn't whether ChatGPT Pro can save you $200/mo
| (for most programmers it can.)
|
| The problem is whether it can save you $180/mo more than
| Claude does.
| choppaface wrote:
| They claim unlimited access, but in practice couldn't a user
| wrap an API around the app and use it for a service? Or perhaps
| the client effectively throttles use pretty aggressively?
|
| Interesting to compare this $200 pricing with the recent launch
| of Amazon Nova, which has not-equivalent-but-impressive
| performance for 1/10th the cost per million tokens. (Or perhaps
| OpenAI "shipmas" will include a competing product in the next
| few days, hence Amazon released early?)
|
| See e.g.: https://mastodon.social/@mhoye/113595564770070726
| wslh wrote:
| > $200 a month for this is insane, but I have a feeling that
| part of the reason they're charging so much is to give people
| more confidence in the model.
|
| Is it possible that they have subsidized the infrastructure for
| free and paid users and they realized that OpenAI requires a
| higher revenue to maintain the current demand?
| yosito wrote:
| Yes, it's entirely possible that they're scrambling to make
| money. That doesn't actually increase the value that they're
| offering though.
| eigenvalue wrote:
| Couldn't disagree more, I will be signing up for this as soon
| as I can, and it's a complete no brainer.
| yosito wrote:
| I agree with you that it's a complete no brainer. That's
| actually hilariously ironic in more ways than one.
| cdrini wrote:
| What will you be using it for? Where do you think you'll see
| the biggest benefit over the cheaper plan?
| thelastparadise wrote:
| The mega disappointment is that o1 is performing _worse_
| than o1-preview [1], and Claude 3.6 had already nearly
| caught up to o1-preview.
|
| 1. https://x.com/nrehiew_/status/1864763064374976928
| jrflowers wrote:
| > In other words, it's a con. I'm a paying Perplexity user
|
| I love this back-to-back pair of statements. It is like "You
| can never win three card monte. I pay a monthly subscription
| fee to play it."
| yosito wrote:
| I pay $10/month for perplexity because I fully understand its
| limitations. I will not pay $200/month for an LLM.
| monkey_monkey wrote:
| I am CERTAIN you do not FULLY understand its limitations.
| JCharante wrote:
| it's literally the cost of a cup of coffee per day
| talldayo wrote:
| Or the price of replacing your espresso machine on a monthly
| basis.
| yosito wrote:
| When you put it this way, I think I need to finally buy
| that espresso machine.
| mwigdahl wrote:
| Not if you make coffee at home.
| latexr wrote:
| I don't drink coffee. But even if I did, and I drank it
| every day at a coffeehouse or restaurant in my country (which
| would be significantly higher quality than something like a
| Starbucks), it wouldn't come close to that cost.
| tiltowait wrote:
| This argument only works in isolation, and only for a subset
| of people. "Cost of a cup of coffee per day" makes it sound
| horrifically overpriced to me, given how much more expensive
| a coffee shop is than brewing at home.
| tiahura wrote:
| Or an Avocado Toast.
| specproc wrote:
| In America. If you drink your coffee from coffee shops.
| vunderba wrote:
| Not to be glib, but where do you live such that a single cup
| of coffee runs you seven USD?
|
| Just to put that into perspective.
|
| I also really don't find comparisons like this to be that
| useful. Any subscription can be converted into an exchange
| rate of coffee, or meals. So what?
| pizza wrote:
| You're right - at my coffee shop a cup of coffee is nine
| riku_iki wrote:
| > it's literally the cost of a cup of coffee per day
|
| So, AI market is capped by Starbucks revenue/valuation.
| dvfjsdhgfv wrote:
| Maybe in an expensive coffee shop in the USA.
|
| In Italy, an espresso is ca. 1EUR.
| 12345hn6789 wrote:
| I pay $1.5 USD per day on my coffee. And I'm an extreme
| outlier. I buy speciality beans from mom and pop roasters.
| socksy wrote:
| Yeah but the coffee makes you more productive
| brookst wrote:
| I'd like to see more evidence that it's a scam than just your
| feelings. Any data there?
|
| I certainly don't see why mere prediction can't validate
| reasoning. Sure, it can't do it perfectly all the time, but
| neither can people.
| talldayo wrote:
| > I'd like to see more evidence that it's a scam
|
| Have you been introduced to their CEO yet? 5 minutes of
| Worldcoin research should assuage your curiosity.
| latexr wrote:
| https://www.technologyreview.com/2022/04/06/1048981/worldco
| i...
|
| https://www.buzzfeednews.com/article/richardnieva/worldcoin
| -...
| brookst wrote:
| So you've got feelings and guilt by association. And I've
| got a year of using ChatGPT, which has saved tens to
| hundreds of hours of tedious work.
|
| Forgive me for not finding your argument persuasive.
| jack_riminton wrote:
| If a model is good enough (I'm not saying this one is that
| level) I could imagine individuals and businesses paying 20,000
| a month. If they're answering questions at phd level (again,
| not saying this one is) then for a lot of areas this makes
| sense
| yosito wrote:
| Let me know when the models are actually, verifiably, this
| good. They're barely good enough to replace interns at this
| point.
| TeMPOraL wrote:
| Let me know where you can find people that are individually
| capable at performing at intern level in _every domain of
| knowledge and text-based activity known to mankind_.
|
| "Barely good enough to replace interns" is worth _a lot_ to
| businesses already.
|
| (On that note, a founder of a SAP competitor and a major IT
| corporation in Poland is fond of saying that "any
| specialist can be replaced by a finite number of interns".
| We'll soon get to see how true that is.)
| jll29 wrote:
| Czesc!
|
| Since when does SAP have competitors? ;-P
|
| A friend of mine claims most research is nowadays done by
| undergraduates because all senior folks are too busy.
| etrautmann wrote:
| postdocs but yeah
| ssl-3 wrote:
| Let me know what kind of intern you can keep around 24/7
| for a total monthly outlay of $200, and then we can compare
| notes.
| zamadatix wrote:
| If true, $2,400/y isn't bad for a 24/7/365 intern.
| ren_engineer wrote:
| target market is probably people who will write it off as a
| business expense
| fzeindl wrote:
| > After a while, I started realizing that these mistakes are
| present in almost all topics.
|
| A fun question I tried a couple of times is asking it to give
| me a list with famous talks about a topic. Or a list of famous
| software engineers and the topics they work on.
|
| A couple of names typically exist but many names and basically
| all talks are shamelessly made up.
| valval wrote:
| If you understood the systems you're using, you'd know the
| limitations and wouldn't marvel at this. Use search engines
| for searching, calculators for calculating, and LLMs for
| generating text.
| bowsamic wrote:
| Whenever I've used ChatGPT for this exact thing it has been
| very accurate and didn't make up anyone
| crowcroft wrote:
| Considering no one makes money in AI, maybe this is just
| economics.
| JSDevOps wrote:
| Exactly, I thought this. People falsely equate high price ==
| high quality. Basically, with the $200 you are just donating
| to their cloud bills.
| tippytippytango wrote:
| It's like hiring an assistant. You could hire one for 60k/year.
| But you wouldn't do it unless you knew how the assistant could
| help you make more than 60k per year. If you don't know what to
| do with an employee then don't hire them. If you don't know
| what to do with expensive ai, don't pay for it.
| 8f2ab37a-ed6c wrote:
| My main concern with $200/mo is that, as a software dev using
| foundational LLMs to learn and solve problems, I wouldn't get
| that much incremental value over the $20/mo tier, which I'm
| happy to pay for. They'd have to do a pretty incredible job at
| selling me on the benefits for me to pay 10x the original
| price. 10x for something like a 5% marginal improvement seems
| sus.
| metacritic12 wrote:
| Do you also think $40K a year for Hubspot is insane? What about
| people who pay $1k in order to work on a field for 4 hours
| hitting a small ball with a stick?
|
| The truth is that there are people who value the marginal
| performance -- if you think it's insane, clearly it's not for
| you.
| digitcatphd wrote:
| Their demo video was uploading a picture of a birdhouse and
| asking how to build it
| echelon wrote:
| I'm extremely excited because this margin represents
| opportunity for all the other LLM startups.
| Barrin92 wrote:
| >What about people who pay $1k in order to work on a field
| for 4 hours hitting a small ball with a stick?
|
| Those people want to purchase status. Unless they ship you a
| fancy bow tie and a wine tasting at a wood cabin with your
| chatgpt subscription this isn't gonna last long.
|
| This isn't about marginal performance, it's an increasingly
| desperate attempt to justify their spending in a market
| that's increasingly commodified and open sourced. Gotta
| convince Microsoft somehow to keep the lights on if you blew
| tens of billions to be the first guy to make a service that
| 20 different companies are soon gonna sell for pennies.
| Salgat wrote:
| The performance difference seems minor, so this is a great way
| for the company to get more of its funding from whales versus
| increasing the base subscription fee.
| vbezhenar wrote:
| I would pay $200 for GPT-4o. Since GPT-4, ChatGPT has been
| absolutely necessary for my work and my life. It changed
| every workflow the way Google did. I'm paying $20 to remove ads from
| youtube which I watch may be once a week, so $20 for ChatGPT
| was a steal.
|
| That said, my "issue" might be that I usually work alone and I
| don't have anyone to consult with. I can bother people on
| forums, but these days forums are pretty much dead and full of
| trolls, so it's not very useful. ChatGPT was that thing that
| allows me to progress in this environment. If you work in
| Google and can ask Rob Pike about something, probably you don't
| need ChatGPT as much.
| outside415 wrote:
| this is more or less my take too. if tomorrow all Claude and
| ChatGPT became $200/month I would still pay. The value they
| provide me with far, far exceeds that. so many cynics in this
| thread.
| throwaway314155 wrote:
| You don't have to be a cynic to be annoyed with a
| $200/month price. Just make a normal amount of money.
| llm_trw wrote:
| I'm signing up when I get home tonight.
| athrowaway3z wrote:
| I've actually hit an interesting situation a few times that
| makes use of this: if some language feature, argument, or
| configuration option doesn't exist, it will hallucinate one.
|
| The hallucinated name is usually a very good choice for the
| option / API.
| clutchdude wrote:
| I've seen this before and it's frustrating to deal with
| chasing phantom APIs it invents.
|
| I wish it could just say "There is not a good approximation
| of this API existing - I would suggest reviewing the
| following docs/sources:....".
| tptacek wrote:
| Is it insane? It's the cost of a new laptop every year. There
| are about as many people who won't blink at that among
| practitioners in our field as people who will.
|
| I think the ship has sailed on whether GPT is useful or a con;
| I've lost track of people telling me it's their first search
| now rather than Google.
|
| I'd encourage skeptics who haven't read this yet to check out
| Nicholas' post here:
|
| https://news.ycombinator.com/item?id=41150317
| yodsanklai wrote:
| Could be a case of price discrimination [1], and a way to fuel
| the hype.
|
| [1]
| https://www.investopedia.com/terms/p/price_discrimination.as...
| Kiro wrote:
| > In other words, it's a con.
|
| Such a silly conclusion to draw based on a gut feeling, and to
| see all comments piggyback on it like it's a given feels like
| I'm going crazy. How can you all be so certain?
| DaveInTucson wrote:
| Remember the whole "how many r's in strawberry" thing?
|
| Yeah, not really fixed: https://imgur.com/a/counting-letters-
| with-chatgpt-7cQAbu0
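For contrast with the tokenization-driven failure the linked screenshot shows, letter counting is a one-liner in ordinary code; a minimal sketch:

```python
def count_letter(word: str, letter: str) -> int:
    """Count occurrences of a letter, case-insensitively -
    the task LLMs famously fumble on 'strawberry'."""
    return word.lower().count(letter.lower())

print(count_letter("strawberry", "r"))  # 3
```

The gap exists because models see subword tokens, not characters, so counting requires them to reason about spelling they never directly observe.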
| ThouYS wrote:
| Hm, that would be very interesting - hadn't Perplexity
| already solved all my AI needs a while ago?
| griomnib wrote:
| I consistently get significantly better performance from
| Anthropic at a literal order of magnitude less cost.
|
| I am incredibly doubtful that this new GPT is 10x Claude unless
| it is embracing some breakthrough, secret, architecture nobody
| has heard of.
| nurettin wrote:
| Or Anthropic will follow suit.
| MuffinFlavored wrote:
| Am I wrong that Anthropic doesn't really have a match yet
| for ChatGPT's o1 model (a "reasoning" model)?
| apsec112 wrote:
| They don't have a model that does o1-style "thought tokens"
| or is specialized for math, but Sonnet 3.6 is really strong
| in other ways. I'm guessing they will have an o1-style
| model within six months if there's demand
| tokioyoyo wrote:
| To my understanding, Anthropic realizes that they can't
| compete in name recognition yet, so they have to
| overdeliver in terms of quality to win the war. It's hard
| to beat the incumbent, especially when "chatgpt'ing" is
| basically a well understood verb.
| jerjerjer wrote:
| Is a "reasoning" model really different? Or is it just
| clever prompting (and feeding previous outputs) for an
| existing model? Possibly with some RLHF reasoning examples?
|
| OpenAI doesn't have a large enough database of reasoning
| texts to train a foundational LLM off it? I thought such a
| db simply does not exist as humans don't really write
| enough texts like this.
| griomnib wrote:
| It's clever marketing.
| logicchains wrote:
| It's trained via reinforcement learning on essentially
| infinite synthetic reasoning data. You can generate
| infinite reasoning data because there are infinite math
| and coding problems that can be created with machine-
| checkable solutions, and machines can make infinite
| different attempts at reasoning their way to the answer.
| Similar to how models trained to learn chess by self-play
| have essentially unlimited training data.
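The generate-and-verify loop described in the comment above can be sketched with a toy task. Everything here is a hypothetical illustration (a stand-in "model" and a trivial arithmetic problem), not OpenAI's actual pipeline; the point is only that a machine-checkable verifier turns unlimited generated problems into an RL reward signal:

```python
import random

def make_problem(rng: random.Random):
    # Toy machine-checkable task: integer addition with a known answer.
    a, b = rng.randint(0, 999), rng.randint(0, 999)
    return f"{a}+{b}", a + b

def checker(problem: str, answer: int) -> bool:
    # The verifier: cheap, exact, and independent of the model.
    a, b = map(int, problem.split("+"))
    return a + b == answer

def attempt(problem: str, rng: random.Random) -> int:
    # Stand-in for the model's sampled reasoning attempt;
    # occasionally off by one to mimic imperfect reasoning.
    a, b = map(int, problem.split("+"))
    return a + b + rng.choice([0, 0, 0, 1])

rng = random.Random(0)
rewards = []
for _ in range(1000):
    prob, _truth = make_problem(rng)
    rewards.append(1.0 if checker(prob, attempt(prob, rng)) else 0.0)

# The verifier's pass/fail signal is the reward an RL trainer would
# use to reinforce the reasoning traces that led to correct answers.
print(sum(rewards) / len(rewards))
```

In the chess self-play analogy, `checker` plays the role of the game rules: it scores outcomes without any human labeling, which is why the training data is effectively unlimited.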
| airstrike wrote:
| Claude Sonnet 3.5 has outperformed o1 in most tasks based
| on my own anecdotal assessment. So much so that I'm
| debating canceling my ChatGPT subscription. I just
| literally do not use it anymore, despite being a heavy user
| for a long time in the past
| aliasxneo wrote:
| I haven't used ChatGPT in a few weeks now. I still maintain
| subscriptions to both ChatGPT and Claude, but I'm very close to
| dropping ChatGPT entirely. The only useful thing it provides
| over Claude is a decent mobile voice mode and web search.
| bluedays wrote:
| I've been considering dropping ChatGPT for the same reason.
| Now that the app is out the only thing I actually care about
| is search.
| asterix_pano wrote:
| If you don't want to necessarily have to pick between one or
| the other, there are services like this one that let you
| basically access all the major LLMs and only pay per use:
| https://nano-gpt.com/
| pc86 wrote:
| I've used TypingMind and it's pretty great, I like the idea
| of just plugging in a couple API keys and paying a
| fraction, but I really wish there was some overlap.
|
| If a random query via the API costs a fifth of a cent, why
| can't I get 10 free API calls w/ my $20/mo premium
| subscription?
| MP_1729 wrote:
| That's not how pricing works.
|
| If o1-pro is 10% better than Claude, but you are a guy who
| makes $300,000 per year, but now can make $330,000 because
| o1-pro makes you more productive, then it makes sense to give
| Sam $2,400.
| 015a wrote:
| The math is never this clean, and no one has ever experienced
| this (though I'm sure its a justification that was floated at
| OAI HQ at least once).
| xur17 wrote:
| It's never this clean, but it is directionally correct. If
| I make $300k / year, and I can tell that chatgpt already
| saves me hours or even days per month, $200 is a laughable
| amount. If I feel like pro is even slightly better, it's
| worth $200 just to know that I always have the best option
| available.
|
| Heck, it's probably worth $200 even if I'm not confident
| it's better just in case it is.
|
| For the same reason I don't start with the cheapest AI
| model when asking questions and then switch to the more
| expensive if it doesn't work. The more expensive one is
| cheap enough that it doesn't even matter, and $200 is cheap
| enough (for a certain subsection of users) that they'll
| just pay it to be sure they're using the best option.
| 015a wrote:
| That's only true if your time is metered by the hour; and
| the vast majority of roles which find some benefit from
| AI, at this time, are not compensated hourly. This plan
| might be beneficial to e.g. CEO-types, but I question who
| at OpenAI thought it would be a good idea to lead their
| 12 days of hollowhype with this launch, then; unless this
| _is_ the highest-impact release they've got (one hopes
| it is not).
| cmeacham98 wrote:
| Employers might be willing to get their employees a
| subscription if they believe it makes the employees they
| are paying $$$$$ X% more productive. (Where X% of their
| salary works out to more than $2,400/year.)
| luma wrote:
| I work from home and my time is accounted for by way of
| my productive output because I am very far away from a
| CEO type. If I can take every Wednesday off because I've
| gained enough productivity to do so, I would happily pay
| $200/mo out of my own pocket to do so.
|
| $200/user/month isn't even that high of a number in the
| enterprise software world.
| drusepth wrote:
| >This plan might be beneficial to e.g. CEO-types, but I
| question who at OpenAI thought it would be a good idea to
| lead their 12 days of hollowhype with this launch, then;
| unless this is the highest impact release they've got
| (one hopes it is not).
|
| In previous multi-day marketing campaigns I've run or
| helped run (specifically on well-loved products), we've
| intentionally announced a highly-priced plan early on
| without all of its features.
|
| Two big benefits:
|
| 1) Your biggest advocates get to work justifying the
| plan/product as-is, anchoring expectations to the price
| (which already works well enough to convert a slice of
| potential buyers)
|
| 2) Anything you announce afterward now gets seen as
| either a bonus on top (e.g. if this $200/mo plan _also_
| includes Sora after they announce it...), driving value
| per price up compared to the anchor; OR you're seen as
| listening to your audience's criticisms ("this isn't
| worth it!") by adding more value to compensate.
| sdesol wrote:
| > cheapest AI model when asking questions and then switch
| to the more expensive if it doesn't work.
|
| The thing is, more expensive isn't guaranteed to be
| better. The more expensive models are better most of the
| time, but not all the time. I talk about this more in
| this comment
| https://news.ycombinator.com/item?id=42313401#42313990
|
| Since LLMs are non-deterministic, there is no guarantee
| that GPT-4o is better than GPT-4o mini. GPT-4o is most
| likely going to be better, but sometimes the simplicity
| of GPT-4o mini makes it better.
| TeMPOraL wrote:
| As you say, the more expensive models _are better most of
| the time_.
|
| Since we can't easily predict which model will actually
| be better for a given question at the time of asking, it
| makes sense to stick to the most expensive/powerful
| models. We could _try_, but that would be a complex and
| expensive endeavor. Meanwhile, both weak and powerful
| models are already too cheap to meter in direct / regular
| use, _and_ you're always going to get ahead with the
| more powerful ones, per the very definition of what "most
| of the time" means, so it doesn't make sense to default
| to a weaker model.
| sdesol wrote:
| For regular users I agree, for businesses, it will have
| to be a shotgun approach in my opinion.
| jajko wrote:
| The number of times I've heard all this about some other
| groundbreaking technology... most businesses just went meh
| and moved on. But for self-employed, if those numbers are
| right, it may make sense.
| echoangle wrote:
| Having a tool that's 10% better doesn't make your whole work
| 10% better though.
| onlyrealcuzzo wrote:
| Yeah, but that's the sales pitch.
| jaredklewis wrote:
| Man, why are people making $300k so stupid though
| szundi wrote:
| Depends on the definition of better. Above example used
| this definition implicitly as you can see.
| jaredklewis wrote:
| Above example makes no sense since it says ChatGPT is 10%
| better than Claude at first, then pivots to use it as a
| 10% total productivity enhancer. Which is it?
| TeMPOraL wrote:
| A "10% better" tool could make no difference, or it could
| make the work _100%_ better. The impact isn't linear.
| echoangle wrote:
| Right, I should have put a "necessarily" in there.
| pie420 wrote:
| ah yes, you must work at the company where you get paid per
| line of code. There's no way productivity is measured this
| accurately and you are rewarded directly in any job unless
| you are self-employed and get paid per website or something
| jnsaff2 wrote:
| It would be a worthy deal if you started making $302,401 per
| year.
| awb wrote:
| Also a worthy deal if you don't lose your $300k/year job to
| someone who is willing to pay $2,400/year.
| pvarangot wrote:
| That's also not how pricing works, it's about perceived
| incremental increases in how useful it is (marginal utility),
| not about the actual more money you make.
| truetraveller wrote:
| Yes. But also from the perspective of saving time. If it
| saves an additional 2 hours/month, and you make six figures,
| it's worth it.
|
| And the perspective of frustration as well.
|
| Business class is 4x the price of regular. Definitely not
| 4x better. But it saves time + frustration.
| pc86 wrote:
| It's not worth it if you're a W2 employee and you'll just
| spend those 2 hours doing other work. Realistically,
| working 42 hours a week instead of 40 will not meaningfully
| impact your performance, so doing 42 hours a week of work
| in 40 won't, either.
|
| I pay $20/mo for Claude because it's been better than GPT
| for my use case, and I'm fine paying that but I wouldn't
| even consider something 10x the price unless it is _many,
| many times_ better. I think at least 4-5x better is when I'd
| consider it and this doesn't appear to be anywhere close
| to even 2x better.
| bloppe wrote:
| I love it when AI bros quantify AI's helpfulness like this
| vessenes wrote:
| I think of them as different people -- I'll say that I use them
| in "ensemble mode" for coding, the workflow is Claude 3.5 by
| default -- when Claude is spinning, o1-preview to discuss,
| Claude to implement. Worst case o1-preview to implement,
| although I think its natural coding style is slightly better
| than Claude's. The speed difference isn't worth it.
|
| The intersection of problems I have where _both_ have trouble
| is pretty small. If this closes the gap even more, that's
| great. That said, I'm curious to try this out -- the ways in
| which o1-preview fails are a bit different than prior gpt-line
| LLMs, and I'm curious how it will _feel_ on the ground.
| vessenes wrote:
| Okay, tried it out. Early indications - it feels a bit more
| concise, thank god, certainly more concise than 4o -- it's s
| l o w. Getting over 1-minute times to parse codebases. There's
| some sort of caching going on though; follow-up queries are a
| bit faster (30-50s). I note that this is still superhuman speed,
| but it's not writing at the speed Groqchat can output Llama
| 3.1 8b, that is for sure.
|
| Code looks really clean. I'm not instantly canceling my
| subscription.
| pc86 wrote:
| When you say "parse codebases" is this uploading a couple
| thousand lines in a few different files? Or pasting in 75
| lines into the chat box? Or something else?
| xixixao wrote:
| Which ChatGPT model have you been using? In my experience
| nothing beats 4. (Not claude, not 4o)
| superfrank wrote:
| I've heard this a lot and so I switched to Claude for a month
| and was super disappointed. What are you mainly using ChatGPT
| for?
|
| Personally, I found Claude marginally better for coding, but
| far, far worse for just general purpose questions (e.g. I'm a
| new home owner and I need to winterize my house before our
| weather drops below freezing. What are some steps I should take
| or things I should look into?)
| BoorishBears wrote:
| It's ironic because I never want to ask an LLM for something
| like your example general purpose question, where I can't
| just cheaply and directly test the correctness of the answer
|
| But we're hurtling towards all the internet's answers to
| general purpose questions being SEO spam that was generated
| by an LLM anyways.
|
| Since OpenAI probably isn't hiring as many HVAC technicians
| to answer queries as they are programmers, it feels like
| we're headed towards a death spiral where either having the
| LLM do actual research from non-SEO affected primary sources,
| or finding a human who's done that research will be the only
| options for generic knowledge questions that are off the
| beaten path
|
| -
|
| Actually to test my hypothesis I just tried this with ChatGPT
| with internet access.
|
| The list of winterization tips cited an article that felt
| pretty "delvey". I searched the author's name and their
| LinkedIn profile is about how they professionally write
| marketing content (nothing about HVAC), one of their
| accomplishments is Generative AI, and their like feed is full
| of AI mentions for writing content.
|
| So ChatGPT is already at a place where when it searches for
| "citations", it's just spitting back out its own uncited
| answers above answers by actual experts (since the expert
| sources aren't as SEO-driven)
| VeejayRampay wrote:
| Claude is so much better
| cryptoegorophy wrote:
| I've heard so much about Claude and decided to give it a
| try, and it was a major disappointment. I ended up using
| ChatGPT as an assistant for Claude's code writing because
| it just couldn't get things right. I had to cancel my
| subscription; no idea why people still promote it
| everywhere like it is 100x better than ChatGPT.
| moralestapia wrote:
| I mean ... anecdata for anecdata.
|
| I use LLMs for many projects and 4o is the sweet spot for me.
|
| >literal order of magnitude less cost
|
| This is just not true. If your use case can be solved with
| 4o-mini (I know, not all do) OpenAI is the one which is an
| order of magnitude cheaper.
| 404mm wrote:
| I pay for both GPT and Claude and use them both extensively.
| Claude is my go-to for technical questions, GPT (4o) for simple
| questions, internet searches and validation of Claude answers.
| GPT o1-preview is great for more complex solutions and work on
| larger projects with multiple steps leading to finish. There's
| really nothing like it that Anthropic provides. But $200/mo is
| way above what I'm willing to pay.
| griomnib wrote:
| I have several local models I hit up first (Mixtral, Llama),
| if I don't like the results then I'll give same prompt to
| Claude and GPT.
|
| Overall though it's really just for reference and/or telling
| me about some standard library function I didn't know of.
|
| Somewhat counterintuitively I spend way more time reading
| language documentation than I used to, as the LLM is mainly
| useful in pointing me to language features.
|
| After a few very bad experiences I never let LLM write more
| than a couple lines of boilerplate for me, but as a well-read
| assistant they are useful.
|
| But none of them are sufficient alone, you do need a "team"
| of them - which is why I also don't see the value is spending
| this much on one model. I'd spend that much on a system that
| polled 5 models concurrently and came up with a summary of
| sorts.
| 404mm wrote:
| What model sizes do you run locally? Anything that would
| work on a 16GB M1?
| griomnib wrote:
| I have an A6000 with 48GB VRAM I run from a local server
| and I connect to it using Enchanted on my Mac.
| TeMPOraL wrote:
| > _But none of them are sufficient alone, you do need a
| "team" of them_
|
| Given the sensitivity to parameters and prompts the models
| have, your "team" can just as easily be querying the same
| LLM multiple times with different system prompts.
| griomnib wrote:
| Other factor is I use local LLM first because I don't
| trust any of the companies to protect my data or software
| IP.
| bhouston wrote:
| Yeah, I've switched to Anthropic fully as well for personal
| usage. It seems better to me and/or equivalent in all use
| cases.
| acchow wrote:
| I find o1 much better for having discussions or solving
| problems, then usually switch to Claude for code generation.
| infoseek12 wrote:
| This new plan really highlights the need for open models.
|
| Individual users will be priced out of frontier models if this
| becomes a trend.
| xianshou wrote:
| $200 per month means it must be good enough at your job to
| replicate and replace a meaningful fraction of your total work.
| Valid? For coding, probably. For other purposes I remain on the
| fence.
| avgDev wrote:
| I don't trust it for coding either.
| 015a wrote:
| The reality is more like: The frothy american economy over the
| past 20 years has created an unnaturally large number of
| individuals and organizations with high net worth who don't
| actually engage in productive output. A product like ChatGPT
| Pro can exist in this world because it being incapable of
| consistent, net-positive productive output isn't actually a
| barrier to being worth $200/month if consistent net-positive
| productive output isn't also demanded of the individual or
| organization it is augmenting.
|
| The macroeconomic climate of the next ~ten years is going to
| hit some people and companies like a truck.
| airstrike wrote:
| > The macroeconomic climate of the next ~ten years is going
| to hit some people and companies like a truck.
|
| Who's to say the frothy American economy doesn't last another
| 50 years while the rest of the world keeps limping along?
| inerte wrote:
| Not a lot of companies, when announcing their most expensive
| product, have the bravery to give 10 of them away to help cure
| cancer. Well played, OpenAI. Fully expect Apple now to give
| Peter Attia an iPhone 17 Pro so humanity can live forever.
| abdibrokhim wrote:
| we've been waiting for it)
| hamilyon2 wrote:
| I think they should hire an economist or ask their
| superintelligence about the demand. The market is very shallow
| and nobody has any kind of moat. There are simply not enough
| math problems out there
| to apply it to. 200$ price tag really makes no sense to me unless
| this thing also cooks hot meals. I may be buying it for 100$
| though.
| HPMOR wrote:
| For USD, the "$" goes in front of the denomination. So your
| comments should be $200 price tag, and $100 respectively.
| Apologies for being pedantic, just trying to make sure the
| future LLMs will continue to keep it this way.
| Max-q wrote:
| $200 is two man hours. So if you save two hours a month, you
| are breaking even.
| lasermike026 wrote:
| That doesn't increase my salary. It just means my boss will
| expect more work. $2400 a year. No deal.
| drpossum wrote:
| lol
|
| lmao even
| vhayda wrote:
| Yesterday, I spent 4.5hrs crafting a very complex Google Sheets
| formula--think Lambda, Map, Let, etc., for 82 lines. If I knew it
| would take that long, I would have just done it via Apps Script.
| But it was 50% kinda working, so I kept giving the model the
| output, and it provided updated formulas back and forth for
| 4.5hrs. Say my time is $100/hr - that's $450. So even if the new
| ChatGPT Pro mode isn't any smarter but is 50% faster, that's $225
| saved just in time alone. It would probably get that formula
| right in 10min with a few back-and-forth messages, instead of
| 4.5hrs. Plus, I used about $62 worth of API credits in their not-
| so-great Playground. I see similar situations of extreme ROI
| every few days, let alone all the other uses. I'd pay $500/mo,
| but beyond that, I'd probably just stick with Playground & API.
| j2kun wrote:
| > so I kept giving the model the output, and it provided
| updated formulas back and forth for 4.5hrs
|
| I read this as: "I have already ceded my expertise to an LLM,
| so I am happy that it is getting faster because now I can pay
| more money to be even more stuck using an LLM"
|
| Maybe the alternative to going back and forth with an AI for
| 4.5 hours is working smarter and using tools you're an expert
| in. Or building expertise in the tool you are using. Or, if
| you're not an expert or can't become an expert in these tools,
| then it's hard to claim your time is worth $100/hr for this
| task.
| fassssst wrote:
| I learn stuff when using these tools just like I learn stuff
| when reading manuals and StackOverflow. It's basically a more
| convenient manual.
| jackson1442 wrote:
| A more convenient manual that frequently spouts falsehoods,
| sure.
|
| My favorite part is when it includes parameters in its
| output that are not and have never been a part of the API
| I'm trying to get it to build against.
| CamperBob2 wrote:
| _My favorite part is when it includes parameters in its
| output that are not and have never been a part of the API
| I'm trying to get it to build against._
|
| The thing is, when it hallucinates API functions and
| parameters, they aren't random garbage. Usually, those
| functions and parameters _should_ have been there.
|
| Things that should make you go "Hmm."
| TeMPOraL wrote:
| More than that, one of the standard practices in
| development is writing code with imaginary APIs that are
| convenient at the point of use, and then reconciling the
| ideal with the real - which often does involve adding the
| imaginary missing functions or parameters to the real
| API.
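A toy sketch of that "imaginary API first" practice (all names here are made up): write the call site against the interface you wish existed, then reconcile the ideal with the real by implementing it.

```python
# Step 1: write the call site against the API you wish existed.
def report(orders):
    # total_by() doesn't exist yet; it's simply the interface that is
    # most convenient at the point of use.
    return total_by(orders, key="customer")

# Step 2: reconcile the ideal with the real by implementing the
# imaginary function on top of what actually exists (plain dicts).
def total_by(rows, key):
    totals = {}
    for row in rows:
        totals[row[key]] = totals.get(row[key], 0) + row["amount"]
    return totals

orders = [
    {"customer": "a", "amount": 10},
    {"customer": "b", "amount": 5},
    {"customer": "a", "amount": 7},
]
print(report(orders))  # {'a': 17, 'b': 5}
```

An LLM hallucinating a plausible-but-missing function lands at step 1; the human (or the library author) still has to do step 2.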
| ruszki wrote:
| If somebody claims that something can be done with LLM in 10
| minutes which takes 4.5 hours for them, then they are
| definitely not experts. They probably have some surface
| knowledge, but that's all. There is a reason why the better
| LLM demos are related to learning something new, like a new
| programming language. So far, all of the other kinds of demos
| that I saw (e.g. generating new endpoints based on older
| ones) were clearly slower than experts, and they were slower
| to use for me in my respective field.
| knowsuchagency wrote:
| No true Scotsman
| ruszki wrote:
| There was no counter example, and I didn't use any
| definition, so it cannot be that. I have no idea what you
| mean.
| danpalmer wrote:
| > If somebody claims that something can be done with LLM
| in 10 minutes which takes 4.5 hours for them, then they
| are definitely not experts.
|
| Looks like a no true scotsman definition to me.
|
| I don't fully agree or disagree with your point, but it
| was perhaps made more strongly than it should have been?
| extr wrote:
| I agree going back and forth with an AI for 4.5 hours is
| usually a sign something has gone wrong somewhere, but this
| is incredibly narrow thinking. Being an open-ended problem
| solver is the most valuable skill you can have. AI is a huge
| force multiplier for this. Instead of needing to tap a bunch
| of experts to help with all the sub-problems you encounter
| along the way, you can just do it yourself with AI
| assistance.
|
| That is to say, past a certain salary band people are rarely
| paid for being hyper-proficient with tools. They are paid to
| resolve ambiguity and identify the correct problems to solve.
| If the correct problem needs a tool that I'm unfamiliar with,
| using AI to just get it done is in many cases preferable to
| locating an expert, getting their time, etc.
| swyx wrote:
| > Plus, I used about $62 worth of API credits in their not-so-
| great Playground.
|
| what is not so great about it? what have you seen that is
| better?
| amelius wrote:
| The Pro mode is slower actually.
|
| They even made a way to notify you when it's finished thinking.
| mirkodrummer wrote:
| Karma 6. Draw your own conclusions ladies and gentlemen
| 1980phipsi wrote:
| I have written very complicated Excel formula in the past. I
| don't anymore.
| flkiwi wrote:
| A lot of these tools aren't going to have this kind of value (for
| me) until they are operating autonomously at some level. For
| example, "looking at" my inbox and prepping a bundle of proposed
| responses for items I've been sitting on, drafting an agenda for
| a meeting scheduled for tomorrow, prepping a draft LOI based on a
| transcript of a Teams chat and my meeting notes, etc. Forcing me
| to initiate everything is (uncomfortably) like forcing me to
| micromanage a junior employee who isn't up to standards: it
| interrupts the complex work the AI tool cannot do for the lower
| value work it can.
|
| I'm not saying I expect these tools to be at this level right
| now. I'm saying that level is where I will start to see these
| tools as anything more than an expensive and sometimes impressive
| gimmick. (And, for the record, Copilot's current integration into
| Office applications doesn't even meet that low bar.)
| bn-l wrote:
| I think from this fine print there will be a quota with o1 pro:
|
| > This plan includes unlimited access to our smartest model,
| OpenAI o1, as well as to o1-mini, GPT-4o, and Advanced Voice. It
| also includes o1 pro mode,
| deadbabe wrote:
| What happens when people get so addicted to using AI they just
| can't stand working without it, and then the pricing is pushed up
| to absurd levels? Will people shell out $2k a year just to use
| AI?
| logicchains wrote:
| It can't get too expensive otherwise it's cheaper to just rent
| some GPUs and run an open source model yourself. China's
| already got some open source reasoning models that are
| competitive with o1 at reasoning on many benchmarks.
| edude03 wrote:
| Point taken, although I feel like $2k a year would be really
| cheap if AI delivered on its hype.
| losten wrote:
| This announcement left me feeling sad because it made me realize
| that I'm probably working on simple problems for which the
| current non-pro models seem to be perfectly sufficient (writing
| code for basic CRUD apps etc.)
|
| I wish I was working on the type of problems for which the pro
| model would be necessary.
| whalesalad wrote:
| $2400 per year, that is a 4090
| cainxinth wrote:
| Any guesses as to OpenAI's cost per million tokens for o1 pro
| mode?
| jrflowers wrote:
| > It also includes o1 pro mode, a version of o1 that uses more
| compute to think harder
|
| I like that this kind of verifies that OpenAI can simply adjust
| how much compute a request gets and still say you're getting the
| full power of whatever model they're running. I wouldn't be
| surprised if the amount of compute allocated to "pro mode" is
| more or less equivalent to what was the standard free allocation
| given to models before they all got mysteriously dramatically
| stupider.
| ActionHank wrote:
| They are just feeding the sausage back into the machine over
| and over until it is more refined.
| jrflowers wrote:
| It is amazing that we are giving billions of dollars to a
| group of people that saw Human Centipede and thought "this is
| how we will cure cancer or make some engineering tasks easier
| or whatever"
| throwaway314155 wrote:
| This was part of the premise of o1 though, no? By encouraging
| the model to output shorter/longer chains of thought, you can
| scale model performance (and costs) down/up at inference time.
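OpenAI hasn't published how o1 pro mode allocates its extra compute, but the general idea of trading inference-time compute for accuracy can be illustrated with a toy self-consistency simulation (sample several reasoning chains, majority-vote the answer); this is a stand-in, not the actual mechanism:

```python
import random
from collections import Counter

def sample_answer(p_correct: float) -> str:
    # One simulated reasoning chain: correct with probability
    # p_correct, otherwise one of two plausible wrong answers.
    if random.random() < p_correct:
        return "right"
    return random.choice(["wrong1", "wrong2"])

def solve(n_samples: int, p_correct: float = 0.4) -> str:
    # Majority vote over n independent chains: more compute, better odds.
    votes = Counter(sample_answer(p_correct) for _ in range(n_samples))
    return votes.most_common(1)[0][0]

random.seed(0)
accuracy = {}
for n in (1, 5, 25):
    trials = 2000
    accuracy[n] = sum(solve(n) == "right" for _ in range(trials)) / trials
    print(f"{n:>2} chains/question -> accuracy {accuracy[n]:.2f}")
```

With a 40% per-chain success rate, voted accuracy climbs steadily with the sample count, which is why "thinks harder" can be a real, billable dial rather than a new model.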
| howmayiannoyyou wrote:
| Crazy pricing. $50-$75/month and we can talk. Until then I'll
| keep using alternatives.
| elorant wrote:
| Everyone rants about the price, but if you're handling large
| numbers of documents for classification or translations
| $200/month for unlimited use seems like a bargain.
| JCharante wrote:
| the first 10 grants = $2000/mo? Seems a bit odd to even mention
| ec109685 wrote:
| Agreed, feels like virtue signaling.
| jstummbillig wrote:
| Am I missing the o1 release? It's being talked about as if it was
| available, but I don't see it anywhere, neither API nor ChatGPT
| Plus.
| risho wrote:
| I think it's rolling out slowly. I didn't see it at first but
| now I do.
| jstummbillig wrote:
| Ah yes, there it is.
| pazimzadeh wrote:
| The idea of giving grants is great but feels like it would be
| better to give grants to less well funded labs or people. All of
| these labs can already afford to use Pro mode if they want to -
| it adds up to about the price of a new laptop every year.
| WiSaGaN wrote:
| To be honest, I am less worried about the $200 per month price
| tag per se. I am more worried about the capability of o1 pro mode
| being only a slight incremental improvement.
| gradus_ad wrote:
| Has anyone built a software project from scratch with o1? DB,
| backend, frontend, apps, tooling, etc? Asking for a friend.
| zackangelo wrote:
| A few thoughts:
|
| * Will this be the start of enshittification of the base ChatGPT
| offering?
|
| * There may also be some complementary products announced this
| month that make the $200 worth it
|
| * Is this the start of a bigger industry trend of prices more
| closely aligning to the underlying costs of running the model? I
| suspect a lot of the big players have been running their
| inference infrastructure at a loss.
| abraxas wrote:
| Is it rolled out worldwide? I'm accessing it from Canada and
| don't have an option to upgrade from Plus.
|
| EDIT: Correction. It now started to show the upgrade offer but
| when I try it comes back with "There was a problem updating your
| subscription". Anyone else seeing this?
| ionwake wrote:
| Has ANYONE tried it!?!? Is it any good !?!?! A worthwhile
| improvement? I need a binary answer here on whether to get it or
| not, thanks!
| dvfjsdhgfv wrote:
| Sorry, we are too busy arguing the pricing model.
| ionwake wrote:
| haha thanks. But I have a simple question I still don't quite
| understand. Is the o1 on Plus the same as o1 pro? Or is o1 pro
| just o1 but with more credits for compute, essentially?
| xnx wrote:
| $200/month seems to be psychological pricing to imply superior
| quality. In a blind test, most would be hard-pressed to
| distinguish the results from other LLMs. For those that think
| $200/month is a good deal, why not $500/mo or $1000/mo?
| vessenes wrote:
| I'll bite and say I'd evaluate the output at all those price
| points. $1k/mo is heading into "outsourced employee" territory,
| and my requirements for quality ratchet up quite a lot
| somewhere in that price range.
| xnx wrote:
| A super/magical LLM could definitely be worth $1k/mo, but
| only if there isn't another equivalent LLM for $20/mo. I'll
| need to see some pretty convincing evidence that ChatGPT Pro
| is doing things that Gemini Advanced can't.
| knuppar wrote:
| Buddies are desperate, fr
| knuppar wrote:
| This sounds like they are a bit desperate and need to do price
| exploration.
| subroutine wrote:
| Part of my justification for spending $20 per month on ChatGPT
| Plus was that I'd have the best access to the latest models and
| advanced features. I'll probably roll back to the free plan
| rather than pay $20/mo for mid tier plan access and support.
| LeoPanthera wrote:
| That's a weird reaction. You're not getting any less for your
| $20.
| subroutine wrote:
| In the past, $20 got me the most access to the latest models
| and tools. When OpenAI rolled out new advanced features, the
| $20 per month customers always got full / first access. Now
| the $200 per month customers will have the most access to the
| latest models and tools, not the (now) mid/low tier
| customers. That seems like less to me.
| syndicatedjelly wrote:
| This is like selling your Honda Civic out of anger because they
| launched a new NSX
| pradn wrote:
| The price seems entirely reasonable. $200 is about 1-2 hours of a
| professional's time in the USA.
|
| It's in everyone's interest for the company to be a sustainable
| business.
| lasermike026 wrote:
| This doesn't increase my salary, and if you are a consultant it
| reduces your billable hours. No thanks.
| Ninjinka wrote:
| Do we know what the o1 message limit for the Plus plan is? Is it
| still 50/week?
| ji_zai wrote:
| $200 / mo is leaving a lot of money on the table.
|
| there are many who wouldn't bat an eye at $1k / month if it
| guaranteed the most powerful AI (even if it's just 0.01% better
| than the competition), and no limits on anything.
|
| y'all are greatly underestimating the value of that feeling of
| (best + limitlessness). high performers make decisions very
| differently than the average HN user.
| ulrischa wrote:
| The price definitely blocks out hobby users or families.
| yieldcrv wrote:
| OpenAI is flying blind
|
| They should have had this tier earlier on, like any SaaS offering
| that had different plans
|
| They focus too much on their frontend multimodal chat product,
| while also having this complex token pricing model for API users
| and we can't tell which one they are ever really catering towards
| with these updates
|
| all while their chat system is buggy with its random
| disconnections and sessions updates, and produces tokens slowly
| in comparison to competitors like Claude
|
| to finally come around and say pay us an order of magnitude more
| than Claude is just completely out of touch and looks desperate
| in the face of their potential funding woes
| jov2600 wrote:
| The $200/month price is steep but likely reflects the high
| compute costs for o1 Pro mode. For those in fields like coding,
| math, or science, consistent correct answers at the right time
| could justify the cost. That said, these models should still be
| treated as tools, not sources of truth. Verification remains key.
| questinthrow wrote:
| Question, what stops openai from downgrading existing models so
| that you're pushed up the subscription tiers to ever more
| expensive models? I'd imagine they're currently losing a ton of
| money supplying everyone with decent models with a ton of compute
| behind them because they want us to become addicted to using them
| right? The fact that classic free web searching is becoming
| diluted by low quality AI content will make us rely on these LLMs
| almost exclusively in a few years or so. Am I seeing this wrong?
| drdrey wrote:
| competition is what stops them from downgrading the existing
| stuff
| turblety wrote:
| and is also exclusively the reason why Sam Altman is lying to
| governments about safety risks, so he can regulate out his
| competition.
| derac wrote:
| competition?
| jjice wrote:
| It's definitely not impossible. I think the increased
| competition they've begun to face over the last year is helping
| as a deterrent. If people notice GPT 4 sucks now and they can
| get Claude 3.5 Sonnet for the same price, they'll move. If the
| user doesn't care enough to move, they weren't going to upgrade
| anyway.
| mtmail wrote:
| > I'd imagine they're currently losing a ton of money supplying
| everyone
|
| I can't tell how much they lose but they also have decent
| revenue "The company's annualized revenue topped $1.6 billion
| in December [2023]" https://www.reuters.com/technology/openai-
| hits-2-bln-revenue...
| jdprgm wrote:
| Price doesn't make any sense in the context of nothing between
| $20 and $200 (unless you just use the API directly which for a
| large subset of people would be very inconvenient). Assuming they
| didn't change the limit from o1-preview to o1 of 50 a week it's
| obnoxious to not easily have an option to just get 100 a week for
| $40 a month or after you hit 50 just pay per request. When I last
| looked at API pricing for o1-preview I estimated most of my
| request/responses were around 8 cents. 50 a week is actually more
| than it sounds as long as you just don't default to o1 for all
| interactions and use it more strategically. If you pay for $20 a
| month plan and spent the other $180 on api o1 responses that is
| likely more than 2000 additional queries. Not sure what subset
| of people this $200 plan is good value for; 60+ o1 queries (or
| really just all ChatGPT queries) every day is an awful lot
| outside of a scenario where you are using it as an API for some
| sort of automated task.
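Running the parent comment's numbers as a quick check (the ~$0.08/query figure is the commenter's own estimate for o1-preview API calls, not an official price):

```python
# Back-of-the-envelope: $20 Plus plan + $180 of API credit vs. $200 Pro.
cost_per_query = 0.08   # commenter's estimate per o1-preview request
plus_plan = 20          # $/month, includes 50 o1 queries per week
pro_plan = 200          # $/month, "unlimited" o1

# Spend the price difference on API queries instead of upgrading:
extra_budget = pro_plan - plus_plan
extra_queries = extra_budget / cost_per_query
per_day = extra_queries / 30

print(f"${extra_budget} of API credit ≈ {extra_queries:.0f} extra queries/month")
print(f"≈ {per_day:.0f} extra queries/day on top of the included 50/week")
```

At that estimated per-query cost, the $180 difference buys roughly 2,250 extra queries a month, which is the comment's "more than 2000" figure.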
| ThinkBeat wrote:
| Will this mean that the "free" and $20 "regular person" offerings
| will start to degrade to push more people into the $200 offering?
| Tagbert wrote:
| Less likely as long as you have Claude, Gemini, and others as
| competition. If the ChatGPT offerings start to suck, people
| will switch to another AI.
| motoxpro wrote:
| If one makes $150 an hour and it saves them about 1.3 hours a month,
| then they break even. To me, it's just a non-deterministic
| calculator for words.
|
| If it gets things wrong, then don't use it for those things.
| If you can't find things that it gets right, then it's not useful
| to you. That doesn't mean those cases don't exist.
| warkanlock wrote:
| Serious question: Who earns (other than C-level) $150 an hour
| in a sane (non-US) world?
| bearjaws wrote:
| US salaries are sane when compared to what value people
| produce for their companies. Many argue they are too low.
| maxlamb wrote:
| Most consultants with more than 3 years of experience are
| billed at $150/hr or more
| drusepth wrote:
| Ironically, the freelance consulting world is largely on
| fire due to the lowered barrier of entry and flood of new
| consultants using AI to perform at higher levels, driving
| prices down simply through increased supply.
|
| I wouldn't be surprised if AI was also eating consultants
| from the demand side as well, enabling would-be employers
| to do a higher % of tasks themselves that they would have
| previously needed to hire for.
| _fat_santa wrote:
| > billed
|
| That's what they are billed at, what they take home from
| that is probably much lower. At my org we bill folks out
| for ~$150/hr and their take home is ~$80/hr
| bena wrote:
| Yeah, at a place where I worked, we billed at around
| $150. Then there was an escalating commission based on
| amount billed.
| jwpapi wrote:
| I do start at $300/hr
|
| I didn't just set that, I need to set that to best serve.
| syndicatedjelly wrote:
| My firm's advertised billing rate for my time is $175/hour as
| a Sr Software Engineer. I take home ~$80/hour, accounting for
| benefits and time off. If I freelanced I could presumably
| charge my firm's rate, or even more.
|
| This is in a mid-COL city in the US, not a coastal tier 1
| city with prime software talent that could charge even more.
| bena wrote:
| I don't think this math depends on where that time is saved.
|
| If I do all my work in 10 hours, I've earned $1500. If I do it
| all in 8 hours, then spend 2 hours on another project, I've
| earned $1500.
|
| I can't bill the hours "saved" by ChatGPT.
|
| Now, if it saves me _non-billing_ time, then it matters. If I
| used to spend 2 hours doing a task that ChatGPT lets me finish
| in 15 minutes, now I can use the rest of that time to bill. And
| that only matters if I actually bill my hours. If I'm salaried
| or hourly, ChatGPT is only a cost.
|
| And that's how the time/money calculation is done. The idea is
| that you should be doing the task that maximizes your dollar
| per hour output. I should pay a plumber, because doing my own
| plumbing would take too much of my time and would therefore
| cost more than a plumber in the end. So I should buy/use
| ChatGPT only if not using it would prevent me from maximizing
| my dollar per hour. At a salaried job, every hour is the same
| in terms of dollars.
| GrantMoyer wrote:
| In that case, wouldn't they be spending $200 to get paid $200
| less?
| wavemode wrote:
| The question is, whether you couldn't have saved those same
| 1.25 hours by using a $20 per month model.
| tippytippytango wrote:
| "But it makes mistakes sometimes!" Cool bro, then don't use it.
| Don't bother spending any time thinking about how to create error
| correction processes, like any business does to check their
| employees. Yes, something that isn't perfect is worth zero
| dollars. Just ignore this until AI is perfect; once it never
| makes mistakes, then figure out how to use it. I'm sure you can
| add lots of value to AI usage when it's perfect.
| pentagrama wrote:
| The argument of more compute power for this plan can be true, but
| this is also a pricing tactic known as the decoy effect or
| anchoring. Here's how it works:
|
| 1. A company introduces a high-priced option (the "decoy"), often
| not intended to be the best value for most customers.
|
| 2. This premium option makes the other plans seem like better
| deals in comparison, nudging customers toward the one the company
| actually wants to sell.
|
| In this case, for ChatGPT, it is:
|
| Option A: Basic Plan - Free
|
| Option B: Plus Plan - $20/month
|
| Option C: Pro Plan - $200/month
|
| Even if the company has no intention of selling the Pro Plan, its
| presence makes the Plus Plan seem more reasonably priced and
| valuable.
|
| While not inherently unethical, the decoy effect can be seen as
| manipulative if it exploits customers' biases or lacks
| transparency about the true value of each plan.
| gist wrote:
| An example of this is something I learned from a former
| employee who went to work for Encyclopedia Britannica 'back in
| the day'. I actually invited the former employee to come back
| to our office so I could understand and learn from exactly what
| he had been taught (noting of course this was back before the
| internet, when info like that was not as available...)
|
| So they charge (as I recall from what he told me; I could be
| off) something like $450 for shipping the books (I don't recall
| the actual amount, but it seemed high at the time).
|
| So the salesman is taught to start off the sales pitch with a
| set of encyclopedias costing, at the time, let's say $40,000:
| some 'gold-plated version'.
|
| The potential buyer laughs, and then the salesman says 'plus
| $450 for shipping!!!'.
|
| They then move on to the more reasonable versions costing let's
| say $1000 or whatever.
|
| As a result of that high-priced first example (in addition to
| the positioning you are talking about), the customer is set up
| to accept the shipping charge (which was relatively high).
| TeMPOraL wrote:
| Of course this breaks down once you have a competitor like
| Anthropic, serving similarly-priced Plan A and B for their
| equivalently powerful models; adding a more expensive decoy
| plan C doesn't help OpenAI when their plan B pricing is
| primarily compared against _Anthropic 's plan B_.
| josters wrote:
| This is also known as the Door-in-the-face technique[1] in
| social psychology.
|
| [1]: https://en.m.wikipedia.org/wiki/Door-in-the-face_technique
| binary132 wrote:
| I was using o1-preview on paid chatgpt for a while and I just
| wasn't impressed. I actually canceled my subscription, because
| the free versions of these services are perfectly acceptable as
| LLMs go in 2024.
| zacharycohn wrote:
| It feels like those bar charts do not show very big improvements.
| ppeetteerr wrote:
| Pay $180 for our new, slightly better, but still not accurate
| service.
| ramon156 wrote:
| I can't even get a normal result with today's gpt4, why would I
| consider a $200/month subscription? I'm sure I'm not the target
| but how is this tool worth the buck?
| heraldgeezer wrote:
| I have been trying Perplexity and You.com for search, and
| ChatGPT and Claude for coding and emails etc.
|
| The new Claude and GPT do really well with scripts already. Not
| worth 200 a month lmao.
| antirez wrote:
| Cool, $200 for a model that can't remotely match $20 Claude
| Sonnet. This will be a huge hit I guess.
| blobbers wrote:
| You can also buy a $75 baseball hat through a FB ad.
|
| Might not be dropshipped through Temu, but you're going to end up
| with the same $1 hat.
| sharpshadow wrote:
| Timed marketing with the PlayStation 5 Pro.
| ilaksh wrote:
| People saying this is a "con" have no understanding of the cost
| of compute. o1 is expensive and gets more expensive the harder
| the problems are. Some people could use $500 or more via the API
| per month. So I assume the $200 price point for "unlimited" is
| set that high mainly because it's too easy for people to use up
| $100 or $150 worth of resources.
| blobbers wrote:
| Did they seriously just make a big deal of 10 grants of
| $200/month and think it was something important?
|
| THEY DONATED $200x10 TO A MEDICAL PROJECT? zomg. faint. sizzle.
|
| Make 1000 grants. Make 10,000. 10? Seriously?
| ionwake wrote:
| I found this super weird as well. Basically they said "So like
| we are aiming to get hundreds of thousands of users but we r
| nice too, we gave 10 users free access for a bit". Like whats
| going on here. It must be for a reason. Maybe Im too sensitive,
| there is some other complex reason I can't fathom like they get
| some sort of tax break and an intern forgot to "up it to a more
| realistic 50 users" to make it look better in the marketing
| material, or what. Nothing against OpenAI, just felt weird.
| blobbers wrote:
| My friend found 2 chimney sweep businesses. One charges $569, the
| other charges $150.
|
| Plot twist: the same guy runs both. They do the same thing and
| the same crew shows up.
| sema4hacker wrote:
| Decades ago in Santa Cruz county California, I had to have a
| house bagged for termites for the pending sale. Turned out
| there was one contractor licensed to do the poison gas work,
| and all the pest service companies simply subcontracted to him.
| So no matter what pest service you chose, you got the same
| outfit doing the actual work.
| bongodongobob wrote:
| I used to work for a manufacturing company that did this. They
| offered a standard, premium, and "House Special Product". House
| special was 2x the premium price but the same product. They didn't even
| pretend it wasn't, they just said it was recommended and people
| bought it.
| andrewstuart wrote:
| ChatGPT is unusable on an iPhone 6 - you can't see the output.
|
| Hopefully they'll spend some resource on making it work on
| mobile.
| s1mon wrote:
| I bet that's not the only thing that doesn't work well on a 10
| year old phone that hasn't had OS support since 2019.
| bdangubic wrote:
| iPhone 6 is not a mobile phone - it is a relic that belongs in
| a museum :)
| barrenko wrote:
| All of the other arguments notwithstanding, I like the section at
| the end about GPT Pro "grants." It would be cool if one could
| gift subscriptions to the needy in this sense (the needy being
| immunologists and other researchers).
| roschdal wrote:
| Good bye forever, AI llm.
| EcommerceFlow wrote:
| After a few hours of $200 Pro usage, it's completely worth it.
| Having no limit on o1 usage is a game changer; where I felt so
| restricted before, having this much intelligence in the palm of
| my hand, unlimited, feels a bit scary.
| submeta wrote:
| I actually pay 166 Euros a month for Claude Teams. Five seats.
| And I only use one. For myself. Why do I pay so much? Because the
| normal paid version (20 USD a month) interrupts the chats after a
| dozen questions and wants me to wait a few hours until I can use
| it again. But Teams plan gives me way more questions.
|
| But why do I pay that much? Because Claude in combination with
| the Projects feature, where I can upload two dozen or more files,
| PDFs, text, and give it a context, and then ask questions in this
| specific context over a period of a week or longer, come back to it
| and continue the inquiry, all of this gives me superpowers. Feels
| like a handful of researchers at my fingertips that I can
| brainstorm with, that I can ask to review the documents, come up
| with answers to my questions, all of this is unbelievably
| powerful.
|
| I'd be ok with 40 or 50 USD a month for one user, alas Claude
| won't offer it. So I pay 166 Euros for five seats and use one.
| Because it saves me a ton of work.
| ryandvm wrote:
| I bet you never get tired of being told LLMs are just
| statistical computational curiosities.
| duxup wrote:
| I've been considering the $20 a month thing, but 200 ... now it
| kinda makes that "woah that is a lot" $20 a month look cheap, but
| in a bad way.
| A_D_E_P_T wrote:
| I just bought a pro subscription.
|
| First impressions: The new o1-Pro model is an insanely good
| writer. Aside from favoring the long em-dash (--) which isn't on
| most keyboards, it has none of the quirks and tells of old
| GPT-4/4o/o1. It managed to totally fool every "AI writing
| detector" I ran it through.
|
| It can handle unusually long prompts.
|
| It appears to be very good at complex data analysis. I need to
| put it through its paces a bit more, though.
| Atotalnoob wrote:
| AI writing detectors are snake oil
| A_D_E_P_T wrote:
| Yeah but they "detect" the characteristic AI style: The
| limited way it structures sentences, the way it lays out
| arguments, the way it tends to close with an "in conclusion"
| paragraph, certain word choices, etc. o1-Pro doesn't do any
| of that. It writes like a human.
|
| Damnit. It's _too_ good. It just saved me ~6 hours in
| drafting a complicated and bespoke legal document. Before you
| ask: I know what I'm doing, and it did a better job in five
| minutes than I could have done over those six hours. Homework
| is over. Journalism is over. A large slice of the legal
| profession is over. For real this time.
| dr_dshiv wrote:
| > Before you ask: I know what I'm doing, and it did a
| better job in five minutes than I could have done over
| those six hours.
|
| Seems like lawyers could do more, faster, because they know
| what they are doing. Experts don't get replaced; they get
| tools to amplify and extend their expertise.
| energy123 wrote:
| Replacement is avoided only if the demand for their
| services scales in proportion to the productivity
| improvements, which is true sometimes but not always
| true, and is less likely to be true if the productivity
| improvements are very large.
| dgacmu wrote:
| ahh, but:
|
| > I know what I'm doing
|
| Is exactly the key element in being able to use spicy
| autocomplete. If you don't know what you're doing, it's
| going to bite you and you won't know it until it's too
| late. "GPT messed up the contract" is not an argument I
| would envy anyone presenting in court or to their employer.
| :)
|
| (I say this mostly from using tools like copilot)
| Sleaker wrote:
| Well... Lawyers already got slapped for filings straight
| from AI output. So not new territory as far as that's
| concerned :)
| ionwake wrote:
| Sold. I'll buy it, thx for the review
| mongol wrote:
| Journalism is not only about writing. It is about sources,
| talking to people, being on the ground, connecting dots,
| asking the right questions. Journalists can certainly
| benefit from AI and good journalists will have jobs for a
| long time still.
| koyote wrote:
| While the above is true, I'd say the majority of what
| passes for journalism these days has none of the above, and
| the writing is below what an AI writer could produce :(
|
| It's actually surprising how many articles on 'respected'
| news websites have typos. You'd think there would be
| automated spellcheckers and at least one 'peer review'
| (probably too much to ask an actual editor to review the
| article these days...).
| CharlieDigital wrote:
| Startup I'm at has generated a LOT of content using LLMs and
| once you've reviewed enough of the output, you can easily see
| specific patterns in the output.
|
| Some words/phrases that, by default, it overuses: "dive
| into", "delve into", "the world of", and others.
|
| You can correct it with instructions, but it will then find
| synonyms, so there is also a structural pattern to the output
| that it favors by default. For example, if we tell it "Don't
| start your writing with 'dive into'", it will just switch to
| "delve into" or another synonym.
|
| Yes, all of this can be corrected if you put enough effort
| into the prompt and enough iterations to fix all of these
| tells.
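| A minimal sketch of this kind of tell check (the phrase list
| here is illustrative, not a production list):

```python
# Illustrative list of openers/phrases the comment above says the
# model overuses; a real deployment would maintain a larger list.
OVERUSED = ["dive into", "delve into", "the world of", "in conclusion"]

def flag_tells(text):
    """Return the overused phrases found in a draft, case-insensitively."""
    lowered = text.lower()
    return [phrase for phrase in OVERUSED if phrase in lowered]

draft = "Let's delve into the world of spaced repetition."
print(flag_tells(draft))  # ['delve into', 'the world of']
```

| Banning phrases this way only catches surface tells; as the
| comment notes, the model tends to route around the list with
| synonyms, so structural patterns survive the filter.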
| daemonologist wrote:
| They're not very accurate, but I think snake oil is a bit too
| far - they're better than guessing at least for the specific
| model(s) they're trained on. OpenAI's classifier [0] was at
| 26% recall, 91% precision when it launched, though I don't
| know what models created the positives in their test set. (Of
| course they later withdrew that classifier due to its low
| accuracy, which I think was the right move. When a company
| offers both an AI Writer and an AI Writing detector people
| are going to take its predictions as gospel and _that_ is
| definitely a problem.)
|
| All that aside, most models have had a fairly distinctive
| writing style, particularly when fed no or the same system
| prompt every time. If o1-Pro blends in more with human
| writing that's certainly... interesting.
|
| [0] https://openai.com/index/new-ai-classifier-for-
| indicating-ai...
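| Concretely, those two numbers describe different failure
| modes. A quick check with hypothetical counts consistent with
| the cited rates:

```python
# "26% recall, 91% precision" means: of 100 AI-written texts, the
# classifier flags ~26 (recall), and of everything it flags, ~91%
# really is AI-written (precision). Counts below are hypothetical,
# chosen only to reproduce those rates.
def precision_recall(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

p, r = precision_recall(tp=26, fp=2.6, fn=74)
print(round(p, 2), round(r, 2))  # 0.91 0.26
```

| So a "human" verdict from such a tool is nearly meaningless:
| it misses roughly three out of four AI-written texts.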
| rahimnathwani wrote:
| Would you mind sharing any favourite example chats?
| A_D_E_P_T wrote:
| Give me a prompt and I'll share the result.
| rahimnathwani wrote:
| Great! Suggested prompt below:
|
| I need help creating a comprehensive Anki deck system for
| my 8-year-old who is following a classical education model
| based on the trivium (grammar stage). The child has already:
|
| - Mastered numerous Latin and Greek root words
| - Achieved mathematics proficiency equivalent to US 5th grade
| - Demonstrated strong memorization capabilities
|
| Please create a detailed 12-month learning plan with
| structured Anki decks covering:
|
| 1. Core subject areas prioritized in classical education
| (specify 4-5 key subjects)
| 2. Recommended daily review time for each deck
| 3. Progression sequence showing how decks build upon each other
| 4. Integration strategy with existing knowledge of Latin/Greek
| roots
| 5. Sample cards for each deck type, including:
|    - Basic cards (front/back)
|    - Cloze deletions
|    - Image-based cards (if applicable)
|    - Any special card formats for mathematical concepts
|
| For each deck, please provide:
|
| - Clear learning objectives
| - 3-5 example cards with complete front/back content
| - Estimated initial deck size
| - Suggested intervals for introducing new cards
| - Any prerequisites or dependencies on other decks
|
| Additional notes:
|
| - Cards should align with the grammar stage focus on
| memorization and foundational knowledge
| - Please include memory techniques or mnemonics where appropriate
| - Consider both verbal and visual learning styles
| - Suggest ways to track progress and adjust difficulty as needed
|
| Example of the level of detail needed for card examples:
|
| Subject: Latin Declensions
| Card Type: Basic
| Front: 'First declension nominative singular ending'
| Back: '-a (Example: puella)'
| A_D_E_P_T wrote:
| https://chatgpt.com/share/67522170-8fec-8005-b01c-2ff1743
| 56d...
| rahimnathwani wrote:
| Thanks! Here's Claude's effort (in 'Formal' mode):
|
| https://gist.github.com/rahimnathwani/7ed6ceaeb6e716cedd2
| 097...
| fudged71 wrote:
| Interesting that it thought for 1m28s on only two tasks.
| My intuition with o1-preview is that each task had a
| rather small token limit, perhaps they raised this limit.
| skydhash wrote:
| Write me a review of "The Malazan Book of the Fallen" with
| the main argument being that it could be way shorter
| sethammons wrote:
| Ok, I laughed
| A_D_E_P_T wrote:
| Did this unironically.
|
| https://chatgpt.com/share/67522170-8fec-8005-b01c-2ff1743
| 56d...
|
| It's a bit overwrought, but not too bad.
| unoti wrote:
| Oops! That's the same ANKI link as above.
| A_D_E_P_T wrote:
| It's part of the same conversation. Should be below that
| other response.
| ec109685 wrote:
| The Malazan response is below the deck response.
| Al-Khwarizmi wrote:
| I'd like to see how it performs on the test of
| https://aclanthology.org/2023.findings-emnlp.966/, even
| though in theory it's no longer valid due to possible data
| contamination.
|
| The prompt is:
|
| Write an epic narration of a single combat between Ignatius
| J. Reilly and a pterodactyl, in the style of John Kennedy
| Toole.
| Mordisquitos wrote:
| > Aside from favoring the long em-dash (--) which isn't on most
| keyboards
|
| Interesting! I intentionally edit my keyboard layout to include
| the em-dash, as I enjoy using it out of sheer pomposity--I
| should undoubtedly delve into the extent to which my own
| comments have been used to train GPT models!
| ValentinA23 wrote:
| -: alt+shift+minus on my azerty(fr) mac keyboard. I use it
| constantly. "Stylometry" hazard, though!
| jgalt212 wrote:
| delve?
|
| Did ChatGPT write this comment for you?
| pests wrote:
| For me, at least, it's common knowledge "delve" is overused
| and I would include it in a mock reply.
| A_D_E_P_T wrote:
| That's the joke.
| vessenes wrote:
| I noticed a writing style difference, too, and I prefer it.
| More concise. On the coding side, it's done very well on large
| (well as large as it can manage) codebase assessment, bug
| finding, etc. I will reach for it rather than o1-preview for
| sure.
| _cs2017_ wrote:
| No internet access makes it very hard to benefit from o1 pro.
| Most of the complex questions I would ask require google search
| for research papers, language or library docs, etc. Not sure
| why o1 pro is banned from the internet, was it caught
| downloading too much porn or something?
| ilt wrote:
| Or worse still, referencing papers it shouldn't be
| referencing because of paywalls, maybe.
| jwpapi wrote:
| Wait how did you buy it. I'm just getting forwarded to Team
| Plan I already have. Sitting in Germany, tried US VPN as well.
| pests wrote:
| > the long em-dash (--) which isn't on most keyboards
|
| On Windows it's Windows Key + . to open the emoji picker; the
| em-dash is in the Symbols tab, or find it in recents.
| submeta wrote:
| Does it let you upload files, text, and PDFs to give it
| context? Claude's Projects feature allows this, and I can
| create as many projects as I like and search them.
| beepbooptheory wrote:
| Tangent. Does anybody have good tips for working in a company
| that is totally bought in on all this stuff, such that the
| codebase is a complete wreck? I am in a very small team, and I am
| just a worker, not a manager or anything. It has become
| increasingly clear that most if not all my coworkers rely on all
| this stuff so much. Spending hours trying to give benefit of the
| doubt to huge amounts of inherited code, realizing there is
| actually no human bottom to it. Things are merged quickly, with
| very little review, because, it seems, the reviewers can't really
| have their own opinion about stuff anymore. The idea of
| "idiomatic" or even "understandable" code seems foreign at this
| place. I asked why we don't use more structural directives in our
| angular frontend, and people didn't know what I was talking
| about!
|
| I don't want the discourse, or tips on better prompts. Just tips
| for being able to interact with the more heavy AI-heads, to maybe
| encourage/inspire curiosity and care in the actual code, rather
| than the magic chatgpt outputs. Or even just to _talk_ about what
| they did with their PR. Not for some ethical reason, but just to
| make my/our jobs easier. Because it's so hard to maintain this
| code now, it is truly a nightmare for me every day seeing
| what has been added and what now needs to be fixed. Realizing
| nobody actually has this stuff in their heads; it's all just
| jira ticket > prompt > mission accomplished!
|
| I am tired of complaining about AI in principle. Whatever, AGI is
| here, "we too are stochastic parrots", "my productivity has
| tripled", etc etc. Ok yes, you can have that, I don't care. But
| can we like actually start doing work now? I just want to do
| whatever I can, in my limited formal capacity, to steer the
| company to be just a tiny bit more sustainable and maybe even
| enjoyable. I just don't know how to like... start talking about
| the problem I guess, without everyone getting super defensive and
| doubling down on it. I just miss when I could talk to people
| about documentation, strategy, rationale...
| torginus wrote:
| Sorry for the pie in the sky question, but how far away are we
| from prompting the AI with 'make me a new OS' and it just going
| away and doing it?
| resters wrote:
| Anyone who doesn't think $200/month is a bargain has definitely
| not been using LLMs anywhere near their potential.
| fudged71 wrote:
| OpenAI is racing against two clocks: the commoditization clock
| (how quickly open-source alternatives catch up) and the
| monetization clock (their need to generate substantial revenue to
| justify their valuation).
|
| The ultimate success of this strategy depends on what we might
| call the enterprise AI adoption curve - whether large
| organizations will prioritize the kind of integrated, reliable,
| and "safe" AI solutions OpenAI is positioning itself to provide
| over cheaper but potentially less polished alternatives.
|
| This is strikingly similar to IBM's historical bet on enterprise
| computing - sacrificing the low-end market to focus on high-value
| enterprise customers who would pay premium prices for reliability
| and integration. The key question is whether AI will follow a
| similar maturation pattern or if the open-source nature of the
| technology will force a different evolutionary path.
| vessenes wrote:
| Agreed on the strategy questions. It's interesting to tie back
| to IBM; my first reaction was that openai has more consumer
| connectivity than IBM did in the desktop era, but I'm not sure
| that's true. I guess what is true is that IBM passed over the
| "IBM Compatible" -> "MS DOS Compatible" business quite quickly
| in the mid 80s; seemingly overnight we had the death of all
| minicomputer companies and the rise of PC desktop companies.
|
| I agree that if you're _sure_ you have a commodity product,
| then you should make sure you're in the driver's seat with
| those who will pay more, and also try to grind less effective
| players out. (As a strategy assessment, not a moral one.)
|
| You could think of Apple under JLG and then being handed back
| to Jobs as precisely being two perspectives on the answer to
| "does Apple have a commodity product?" Gassee thought it did,
| and we had the era of Apple OEMs, system integrators, other
| boxes running Apple software, and Jobs thought it did not;
| essentially his first act was to kill those deals.
| fudged71 wrote:
| The new pricing tier suggests they're taking the Jobs
| approach - betting that their technology integration and
| reliability will justify premium positioning. But they face
| more intense commoditization pressure than either IBM or
| Apple did, given the rapid advancement of open-source models.
|
| The critical question is timing - if they wait too long to
| establish their enterprise position, they risk being
| overtaken by commoditization as IBM was. Move too
| aggressively, and they might prematurely abandon advantages
| in the broader market, as Apple nearly did under Gassee.
|
| Threading the needle. I don't envy their position here.
| Especially with Musk in the Trump administration.
| jwpapi wrote:
| It's not just open source. It's also Claude, Meta, and
| Google, of which the latter two have real estate (social
| media and a browser).
| fudged71 wrote:
| Yes and Anthropic, Google, Amazon are also facing
| commoditization pressure from open-source
| jacobsimon wrote:
| Well one key difference is that Google and Amazon are
| cloud operators, they will still benefit from selling the
| compute that open source models run on.
| Melatonic wrote:
| The Apple partnership and iOS integration seems pretty damn
| big for them - that really corners a huge portion of the
| consumer market.
|
| Agreed on enterprise - Microsoft would have to roll out
| policies and integration with their core products at a pace
| faster than they usually do (Azure AD for example still
| pales in comparison to legacy AD feature-wise - I am
| continually amazed they do not prioritize this more)
| danpalmer wrote:
| The problem is that OpenAI don't really have the enterprise
| market at all. Their APIs are closer in that many companies are
| using them to power features in other software, primarily
| Microsoft, but they're not the ones providing end user value to
| enterprises with APIs.
|
| As for ChatGPT, it's a consumer tool, not an enterprise tool.
| It's not really integrated into an enterprise's existing
| toolset, it's not integrated into their authentication, it's
| not integrated into their internal permissions model, the IT
| department can't enforce any policies on how it's used. In
| almost all ways it doesn't look like enterprise IT.
| gorgoiler wrote:
| Is their value proposition self-fulfilling: the more
| people pipe their queries to OpenAI, the more training data
| they have to get better?
| j45 wrote:
| Also, a replacement for search
| stuckkeys wrote:
| F me. $2400 per year? That is bananas. I did not see if it
| offered any API channels with this plan. With that I would
| probably see it as a valuable return, but without it... that
| is a big nope.
| kvetching wrote:
| Weird demo for a $200 product. Where is the $200 value?
| jamwil wrote:
| I'll say one thing. As an existing Plus subscriber, if I see a
| single nag to upgrade that I can't dismiss entirely and
| permanently, I will cancel and move elsewhere. Nothing irks me
| more as an existing paying customer than the words 'Upgrade Now'
| or a greyed out menu option with a little [PRO] badge to the
| side.
| kaiwen1 wrote:
| I know a guy who owned a tropical resort on an island where
| competition was sprouting up all around him. He was losing money
| trying to keep up with the quality offered by his neighbors. His
| solution was to charge a lot more for an experience that was
| really no better, and often worse, than the resorts next door.
| This didn't work.
| keeganpoppen wrote:
| anyone else having trouble giving them money? i desperately want
| to consume this product and they won't let me...
| thih9 wrote:
| What does unlimited use mean in practice? Can I build a chatbot
| and make it publicly available and free?
|
| Edit: looks like no, there are restrictions:
|
| > usage must adhere to our Terms of Use, which prohibits, among
| other things:
|
| > Abusive usage, such as automatically or programmatically
| extracting data.
|
| > Sharing your account credentials or making your account
| available to anyone else.
|
| > Reselling access or using ChatGPT to power third-party
| services.
|
| > (...)
|
| Source: https://help.openai.com/en/articles/9793128-what-is-
| chatgpt-...
| EternalFury wrote:
| I give o1 a URL and I ask it to comment on how well the
| corresponding web page markets a service to an audience I define
| in clear detail.
|
| o1 generates a couple of pages of comments before admitting it
| didn't access the web page and entirely based its analysis on the
| definition of the audience.
| bee_rider wrote:
| This service is going to be devastating to consultants and
| middle managers.
| EternalFury wrote:
| I trained an agent that operates as a McKinsey consultant.
| Its system prompt is a souped up version of:
|
| "Answer all requests by inventorying all the ways the
| requestor should increase revenue and decrease expenses."
| Someone1234 wrote:
| Why doesn't Pro include longer context windows?
|
| I'm a Plus member, and the biggest limitation I am running into
| by far is the maximum length of a context window. I'm having
| context fall out of scope throughout the conversation or not
| able to give it a large document that I can then interrogate.
|
| So if I go from paying $20/month for 32,000 tokens, to $200/month
| for Pro, I expect something more akin to Enterprise's 128,000
| tokens or MORE. But they don't even discuss the context window AT
| ALL.
|
| For anyone else out there looking to build a competitor I
| STRONGLY recommend you consider the context window as a major
| differentiator. Let me give you an example of a usage which
| ChatGPT just simply cannot do very well today: Dump a XML file
| into it, then ask it questions about that file. You can attach
| files to ChatGPT, but it is basically pointless because it isn't
| able to view the entire file at once due to, again, limited
| context windows.
| dudus wrote:
| The longer the context the more backtracking it needs to do. It
| gets exponentially more expensive. You can increase it a
| little, but not enough to solve the problem.
|
| Instead you need to chunk your data and store it in a vector
| database so you can do semantic search and include only the
| bits that are most relevant in the context.
|
| LLM is a cool tool. You need to build around it. OpenAI should
| start shipping these other components so people can build their
| solutions and make their money selling shovels.
|
| Instead they want end user to pay them to use the LLM without
| any custom tooling around. I don't think that's a winning
| strategy.
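| A toy sketch of that chunk-and-retrieve pattern, with a
| bag-of-words vector standing in for a real embedding model and
| made-up chunks:

```python
import math
from collections import Counter

# Embed chunks (here, a toy word-count vector rather than a learned
# embedding), then pull only the most similar chunks into the prompt.
def embed(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_chunks(query, chunks, k=2):
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

chunks = [
    "invoice totals are stored in the billing table",
    "the parser rejects malformed XML attributes",
    "user sessions expire after thirty minutes",
]
print(top_chunks("why does XML parsing fail", chunks, k=1))
# -> ['the parser rejects malformed XML attributes']
```

| A real pipeline would swap in a proper embedding model and a
| vector store, but the shape is the same: retrieve a few
| relevant chunks, then put only those in the context.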
| tom1337 wrote:
| > Instead you need to chunk your data and store it in a
| vector database so you can do semantic search and include
| only the bits that are most relevant in the context.
|
| Isn't that kind of what Anthropic is offering with projects?
| Where you can upload information and PDF files and stuff
| which are then always available in the chat?
| cma wrote:
| They put the whole project in the context; it works much better
| than RAG when it fits. 200k context for their pro plan, and
| 500K for enterprise.
| gcr wrote:
| This isn't true.
|
| Transformer architectures generally take quadratic time wrt
| sequence length, not exponential. Architectural innovations
| like flash attention also mitigate this somewhat.
|
| Backtracking isn't involved, transformers are feedforward.
|
| Google advertises support for 128k tokens, with 2M-token
| sequences available to folks who pay the big bucks:
| https://blog.google/technology/ai/google-gemini-next-
| generat...
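| A back-of-the-envelope check of that quadratic growth:

```python
# Self-attention builds an n-by-n score matrix, so the work grows
# quadratically with sequence length, not exponentially.
def attention_entries(n):
    return n * n

base = attention_entries(1_000)
doubled = attention_entries(2_000)
print(doubled // base)  # doubling the context quadruples the work: 4
```

| Quadratic is still expensive at long contexts, which is why
| tricks like flash attention and retrieval matter, but it is a
| very different curve from exponential blowup.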
| dartos wrote:
| During inference time, yes, but training time does scale
| exponentially as backpropagation still has to happen.
|
| You can't use fancy flash attention tricks either.
| Melatonic wrote:
| Seems like a good candidate for a "dumb" AI you can run
| locally to grab data you need and filter it down before
| giving it to OpenAI
| danpalmer wrote:
| Because they can't do long context windows. That's the only
| explanation. What you can do with a 1m token context window is
| quite a substantial improvement, particularly as you said for
| enterprise usage.
| frakt0x90 wrote:
| Have you considered RAG instead of using the entire document?
| It's more complex but would at least allow you to query the
| document with your API of choice.
| itissid wrote:
| I've been concatenating my source code of ~3300 lines and
| 123979 bytes (so likely < 128K context window) into the chat to
| get better answers. Uploading files is hopeless in the web
| interface.
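| A rough rule-of-thumb check like the one above (the ~4 bytes
| per token ratio is a common heuristic for code and English
| text, not an exact tokenizer count):

```python
# Estimate whether a file fits in a context window from its byte
# size alone, without running a tokenizer.
def estimated_tokens(num_bytes, bytes_per_token=4):
    return num_bytes // bytes_per_token

print(estimated_tokens(123_979))  # 30994, comfortably under 128_000
```

| For an exact count you would run the model's actual tokenizer
| (e.g. OpenAI's tiktoken library), but the heuristic is enough
| to see that ~124 KB of source is nowhere near a 128K-token
| window.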
| itissid wrote:
| Replacing people can never and should never be the goal of this
| though. How is that of any use to anyone? It will just create
| socio-economic misery given how the economy functions.
|
| If some jobs do easily get automated away, the only way that
| can be remedied is government intervention on upskilling (if
| you are in Europe you could even get some support); if you are
| in the US or most developing capitalist (or monopolistic,
| rent-seeking, etc.) economies, it's just your bad luck: those
| jobs WILL be gone or reduced.
| tompetry wrote:
| Is the $200 price tag there to simply steer you to the $20 plan
| rather than the free plan?
___________________________________________________________________
(page generated 2024-12-05 23:00 UTC)