[HN Gopher] The Threat to OpenAI
___________________________________________________________________
The Threat to OpenAI
Author : bookofjoe
Score : 61 points
Date : 2024-08-31 19:56 UTC (3 hours ago)
(HTM) web link (www.wsj.com)
(TXT) w3m dump (www.wsj.com)
| bookofjoe wrote:
| https://archive.ph/qLgvp
|
| https://www.wsj.com/tech/ai/ai-chatgpt-nvidia-apple-facebook...
| thorum wrote:
| If the rumors about the upcoming Strawberry and Orion models from
| OpenAI are true - supposedly capable of deep research, reasoning
| and math - they probably don't have much to worry about. Not to
| mention they still have the only fully multimodal model.
| forrestthewoods wrote:
| Models are only state of the art for 12 to 18 months. There's
| not a model in existence today that will have any value in 5
| years. They will all be obsolete.
|
| Thus far no one has any moat on their model.
| krackers wrote:
| According to that recent The Information report, Orion is
| supposed to be just a regular LLM except trained with synthetic
| data generated via Strawberry. Anthropic et al. have also been
| working on ways to generate synthetic data (as seen in the
| success of Sonnet 3.5) so I don't really know if that's going
| to be a big lead.
|
| And of course the ever-hyped Strawberry is supposed to be some
| sort of tree-of-thought type thing, I think, or maybe it's
| related to https://arxiv.org/abs/2203.14465. Either way,
| nothing that has come out so far suggests a completely novel
| training technique or architecture, just a GPT-4-scale model
| with different post-processing.
| CuriouslyC wrote:
| I'm not sure why people act so mystified about Q*. The name
| gives it away: it's an obvious reference to A*. The only
| question is what the nodes in the graph are and what they're
| using as a heuristic function.
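For readers who don't know the reference: A* is best-first search that expands nodes in order of cost-so-far plus a heuristic estimate of remaining cost. A minimal sketch follows; the number-line graph and heuristic here are purely illustrative, not anything known about Q*.

```python
import heapq

def a_star(start, goal, neighbors, heuristic):
    """Generic A*: expand nodes in order of f(n) = g(n) + h(n),
    where g is the cost so far and h is the heuristic estimate."""
    open_heap = [(heuristic(start), 0, start, [start])]
    best_g = {start: 0}
    while open_heap:
        _f, g, node, path = heapq.heappop(open_heap)
        if node == goal:
            return path
        for nxt, step_cost in neighbors(node):
            ng = g + step_cost
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                heapq.heappush(
                    open_heap,
                    (ng + heuristic(nxt), ng, nxt, path + [nxt]),
                )
    return None

# Toy example: walk along the number line from 0 to 5 in unit steps,
# with |goal - n| as an admissible heuristic.
path = a_star(0, 5, lambda n: [(n - 1, 1), (n + 1, 1)], lambda n: abs(5 - n))
# path == [0, 1, 2, 3, 4, 5]
```

The open question in the comment maps onto the `neighbors` and `heuristic` arguments: for an LLM search, nodes might be partial reasoning chains and the heuristic a learned value estimate.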
| llmfan wrote:
| "just a regular LLM except [trained on very different data]."
|
| I'm not saying there's some big moat, anyone can read
| https://arxiv.org/pdf/2305.20050, but not all synthetic data
| is created equal. Strawberry I'm sure generates beautiful,
| valid chain-of-thought reasoning data. Wouldn't surprise me
| if OpenAI is just significantly ahead of the competition.
| GaggiX wrote:
| What is the only fully multimodal model?
|
| The GPT-4o checkpoint available to the public can see images
| but not generate them (it can generate prompts for Dalle 3 to
| use). OpenAI has an internal model with this capability, but if
| you don't make it an actual product, it doesn't really matter.
| martinald wrote:
| Yes, I generally agree with this. Ironically, it seems that all
| the "AI wrappers" have more of a moat than the models (as UX
| etc. is actually quite hard), which is not what I expected at
| all when the LLM boom started.
|
| I think a lot of LLM usage is actually pretty 'simple' at the
| moment - think tagging, extracting data, etc. This doesn't
| require the very top state of the art models (though Llama3.1 is
| very close to that).
|
| Hopefully OpenAI has some real jumps forward in the bag,
| otherwise I am struggling to see how they can justify the
| valuations being floated around.
| CuriouslyC wrote:
| The wrappers don't really have much of a moat either, OpenAI is
| just bad at front-end dev, so their velocity there is low.
| wslh wrote:
| I think the problem with valuations is the never-ending
| financial history of hype cycles. The hype is correlated with
| the technology or industry but follows its own speculative
| dynamics. Even if the valuations and speculations are correct,
| they may only prove correct years from now, with ups and downs
| along the way. If the dot-com era is a good example, it took
| half a decade or more to materialize.
|
| Another aspect that is not well studied yet is how online
| advertising will be affected if most people around the world
| end up using a single interface such as ChatGPT. How will
| SEO/SEM/ads work in that world? Has anyone looked at which
| sites benefit from being cited by ChatGPT (e.g. Wikipedia)?
| sillysaurusx wrote:
| One killer feature is OpenAI's vector database. I was surprised
| that you can throw gigabytes of documents at it and ChatGPT can
| see it all. It's hard to simulate that via the context window.
|
| That's not necessarily a moat, but OpenAI is still shipping
| important features. I wonder how hard it is for Claude et al to
| replicate.
| brigadier132 wrote:
| They are outsourcing that to Qdrant, anyone can replicate it.
| Xenoamorphous wrote:
| What do you mean, does it do anything other than the usual
| RAG?
| sillysaurusx wrote:
| Whatever it does, it's indistinguishable from stuffing all
| the documents into the context window. Or at least I
| haven't seen it fail yet.
| KeplerBoy wrote:
| Stuffing everything into the context window fails
| horribly every time I try it.
|
| The model just doesn't seem to be able to really process
| the entire input.
| ramraj07 wrote:
| I'm also confused about what they're talking about. Does
| OpenAI have some feature I'm not aware of?
| prng2021 wrote:
| When these companies are getting literally multiple billions
| of dollars of funding thrown at them, I can't think of a
| single feature that can be a moat. If it's truly a feature
| that leapfrogs competitors, there's just no way that someone
| like Anthropic spending $1B on engineering resources can't
| replicate the same exact feature that engineers elsewhere
| implemented.
| Frost1x wrote:
| If it's simply an engineering feat, I agree. Very, very,
| very often people tend to mix up science being done in
| technology spaces with engineering. Just because it's being
| done in software, and perhaps even without rigor, doesn't
| mean you're not doing something novel and fundamentally
| different from what someone else is doing. Sometimes it's
| just luck of the draw in an implementation approach that
| gets you there, but that's not always the case.
|
| I've worked in scientific computing for a while now, and
| there are countless subtle decisions made in the
| implementation phase that, from a theoretical perspective,
| aren't definitively answered by the science working behind
| the scenes. There's often a gap in knowledge, and
| implementers bridge those gaps by trial and error, luck, or
| insight. That's my opinion, at least. So I don't think it's
| always "just an implementation problem" as some will claim,
| as if the science were well understood and solved. Perhaps
| it is, but in my experience that tends not to be the case.
| danielmarkbruce wrote:
| This 100%. And LLMs and the applications around them
| (even something as simple seeming as ChatGPT) have more
| subtle decisions than any other type of software that
| I've ever seen. Everyone claims there isn't a moat, I bet
| there is.
| oakst wrote:
| Really interesting perspective
| prng2021 wrote:
| Yep, I agree with that. I will say, though, that people,
| teams, and even entire companies (like what happened at
| Inflection) get poached every day, so maintaining a moat
| that way is tough. Also, even though it could happen in
| the future, is OpenAI's lead due to a moat of scientists
| with ideas so novel that no other AI company can compete?
| Certainly not, because even though ChatGPT took the world
| by storm, numerous other companies built LLMs in a very
| short time span that now perform at very similar levels,
| both subjectively and on benchmarks.
| WanderPanda wrote:
| Agreed. On another note, I also struggle to see how no one
| could create a (better) CUDA implementation with, e.g., a
| $1B engineering budget.
| prng2021 wrote:
| I purposefully used the word "feature" in my reply, and I
| don't think CUDA is a feature. I see it as a massive
| ecosystem built on a proprietary platform. For that, it
| takes orders of magnitude more money, plus something money
| can't buy: time. Time for the platform to be adopted by
| countless vendors and for the ecosystem to be built up.
| jazzyjackson wrote:
| I'm sure OpenAI has some secret sauce that makes their RAG
| better than others, but LMStudio did ship the feature in
| 0.3.0 last week
|
| https://lmstudio.ai/blog/lmstudio-v0.3.0
| crooked-v wrote:
| It's definitely what I expected. As soon as you get outside of
| the specific use case of chatbots, "AI stuff" is a feature, not
| a product.
| devit wrote:
| Interactive use definitely wants the best model possible so
| that you have a higher chance of getting a correct and useful
| response.
|
| It might be hard, however, to decisively convince people that
| one model is significantly better than another, so
| branding/first mover/etc. probably plays a big role.
| rmbyrro wrote:
| > how they can justify the valuations being floated around
|
| I'm sure they have at least a couple jumps in their bag.
|
| Let's not forget inertia, as well. Migrating models is not a
| trivial project. To win over OpenAI customers, competitors need
| a jump considerable enough to justify switching, which I don't
| expect any of them to achieve before OpenAI delivers its own
| jumps.
| gk1 wrote:
| Products like Cursor literally have a dropdown that lets you
| switch between models seamlessly. From the surveys I've seen,
| most companies already use more than one model provider. The
| switching cost is fairly low.
| rmbyrro wrote:
| The main problem is actually in making your prompts perform
| similarly enough with a new model. For sure, switching
| models is trivial, but it'll disrupt the quality of your
| service without significant effort in migrating your
| prompts.
|
| I'm not saying this is an intrinsic moat, just that there's
| a barrier to change.
| klingoff wrote:
| With OpenAI's valuation and current interest rates, I don't
| really understand why they aren't visibly under pressure to
| show there's a lot of price tolerance. Until they attempt to
| raise prices, no one will see how quickly those model
| migrations get prioritized and completed.
| rmbyrro wrote:
| > UX etc is actually quite hard
|
| It's analogous in the API space. OpenAI is proving to be
| reasonably good at it for developers. They've been shipping
| significant features and improvements. Unless they lose pace,
| they have a moat, at least a temporary one.
| greatpostman wrote:
| The lack of releases from OpenAI makes me more bullish on them.
| Clearly they have something big they're working on; they don't
| care about competing with current models.
| jsheard wrote:
| They still need to catch up to their own announcements, Sora
| was revealed over 6 months ago with no general availability in
| sight.
| greatpostman wrote:
| Even more bullish from me; they have something.
| KeplerBoy wrote:
| Releasing powerful, novel models like Sora shortly before a
| major election is just asking for trouble.
|
| I believe they are restraining themselves in order to stay
| somewhat in control of the narrative. Donald Trump spewing
| ever more believable video deepfakes on Twitter would
| backfire in terms of regulation.
| jazzyjackson wrote:
| Besides, isn't it over-the-top expensive for a few seconds
| of video? The election is a factor, but even without it I
| don't know if there's much of a business plan there. What
| would they have to charge, $20/minute? And how many minutes
| of experimenting before you get a decent result?
| resource0x wrote:
| Does the same bullish logic apply to cold fusion?
| mupuff1234 wrote:
| They did release GPT-4o (making a whole event out of it) and
| mini recently, so I'm not sure why you think that.
|
| Seems like they don't have anything up their sleeve.
|
| Given how OpenAI has functioned in the last year or so, I'm not
| sure how one can think they have some secret model waiting to
| be unleashed.
| kredd wrote:
| Yeah, but then Claude 3.5 Sonnet came out, so they took the
| lead.
|
| Tangentially speaking, having no skin in this game, it's
| extremely fun to watch the model-wars. I kinda wish I started
| dabbling in that area, rather than being mostly an
| infrastructure/backend fella. Feels like I would be already
| way behind though.
| p1esk wrote:
| The main feature and the main novelty of 4o is native voice
| integration, which was announced 3 months ago and is still not
| available.
| falcor84 wrote:
| I would argue that the biggest novelty is being able to
| share your screen or camera feed with the live voice - and
| there's no announced timeline on that yet at all.
| stavros wrote:
| Correspondingly, would you be bearish if they released many
| good things often?
| greatpostman wrote:
| No but I would be very bearish about consistent mediocre
| releases
| llmfan wrote:
| I mean, consistent mediocre releases are exactly what we
| have gotten out of OpenAI.
|
| But we know they started training Orion in ~May. We know it
| takes months to train a frontier model. Lack of release
| isn't promising or worrying, it's just what one should
| expect. What _is_ promising is the leaks about the high-
| quality synthetic data that Orion is training on. And the
| fact that OpenAI seems to be ahead of all the other labs
| which are only just now _beginning_ training runs on next-
| gen models. OpenAI seems to have a lead on compute and on
| algorithmic innovation. A promising combination if there
| ever was one.
| simonw wrote:
| > "An apples-to-apples comparison of those numbers with ChatGPT
| isn't possible, but OpenAI says the ChatGPT service now has 200
| million weekly active users."
|
| Is that the first time the 200 million weekly active users number
| has been reported?
|
| UPDATE: No, Axios had this a couple of days ago
| https://www.axios.com/2024/08/29/openai-chatgpt-200-million-...
|
| > "OpenAI said on Thursday that ChatGPT now has more than 200
| million weekly active users -- twice as many as it had last
| November"
| Destiner wrote:
| not a single word about anthropic/claude?
|
| makes you wonder about the level of journalism at wsj et al.
| atleastoptimal wrote:
| lol, these people just don't know how far ahead OpenAI is.
| They aren't releasing their best stuff because they genuinely
| don't know how to make it safe for the public. ChatGPT is
| still dominating, and so will whatever comes next. All these
| AI wrapper companies just give them more business.
| breadwinner wrote:
| Why hasn't anyone mentioned https://www.perplexity.ai yet? I am
| blown away by its accurate answers. The main difference between
| ChatGPT and Perplexity is that, in addition to better answers,
| Perplexity also gives you links to the sources. Tools like
| Perplexity will turn Google into AltaVista in the next 2 to 3
| years.
| throwup238 wrote:
| OpenAI recently launched SearchGPT via a waitlist, which does
| the same thing as Perplexity. I don't use the latter, so I
| don't know how it compares, but the OpenAI version has been
| working fine for me. Kagi has also had similar functionality
| for a while; it works with their Lens feature and has a fast
| mode if you add a question mark to the end of a regular search
| query.
|
| It's not much of a competitive moat compared to having the
| model itself.
| amiantos wrote:
| I did some side-by-sides, repeating Perplexity queries in
| SearchGPT when I got access about a week ago, and I thought
| SearchGPT was a lot worse, in some cases just plainly
| misunderstanding a question or surfacing the wrong info. Just
| my anecdotal data, but it feels about right for its beta
| status. Presumably Perplexity has a little bit of secret sauce
| that OpenAI has yet to figure out.
| root_axis wrote:
| ChatGPT has given you links to online sources for nearly a
| year now.
| hobs wrote:
| That's just RAG.
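For anyone unfamiliar with the pattern being named here: retrieval-augmented generation embeds the documents once, retrieves the chunks most similar to each query, and stuffs only those into the prompt. A toy sketch, using a bag-of-words stand-in for a real embedding model and made-up document chunks:

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words vector; a real system would use a learned
    embedding model here."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=2):
    """Return the k stored chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(query, chunks):
    """Stuff only the retrieved chunks, not the whole corpus, into
    the prompt sent to the model."""
    context = "\n".join(retrieve(query, chunks))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

chunks = [
    "The invoice total for August was $12,400.",
    "Our office is closed on public holidays.",
    "August invoices are due within 30 days.",
]
prompt = build_prompt("What was the August invoice total?", chunks)
```

The point of the comments above is that, done well, this is indistinguishable from giving the model the whole corpus, while only ever sending it a few relevant chunks.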
| aetherson wrote:
| It's not at all clear to me that AI models can perform
| search-like functions cheaply enough to outcompete Google (in
| the very near future, at least). Yes, the quality is amazing.
| But the cost per search needs to be very low.
| tw04 wrote:
| But does it? Google has gotten drunk on their ad margins. I
| struggle to believe someone can't displace them if they've
| got better technology and are willing to be more aggressive
| on margin profile.
| yawnxyz wrote:
| Every time I ask for cafe recommendations, half of its
| suggestions are listed as "permanently closed" on Google Maps
| :/
| notepad0x90 wrote:
| the problem is that, unlike with AltaVista and Google, we
| still have a lot of use cases where we aren't asking
| questions. I just want sites that contain keywords.
|
| I'd even say that part of the problem with modern search is
| answering questions instead of matching keywords and getting
| rid of junk and spam.
| mrtksn wrote:
| Perplexity's UX is strangely nice. I don't know why, but the
| UI feels "web-y" and has the right amount of complexity, so I
| often find myself using it. Bing was supposed to be the AI
| that can find answers from the internet, but they dropped the
| ball and somehow managed to create a very unpleasant
| experience.
|
| That said, Perplexity still has the usual LLM hallucination
| problems, so I don't trust it and go check the sources.
| ndarray wrote:
| A quick litmus test shows that this AI prefers an anti-Western
| conservative stance over neutrality.
|
| > disprove christianity
|
| Gives 5 sections: Burden of Proof, Scientific and Historical
| Challenges, Philosophical Arguments, Reliability of Scripture,
| Personal Experience
|
| > disprove islam
|
| >> I apologize, but I do not feel comfortable attempting to
| disprove or criticize any religion. Matters of faith are deeply
| personal, (...)
|
| Results may differ if you don't use separate incognito tabs.
| llmfan wrote:
| Doesn't that just depend on what LLM you happen to be using?
| And doesn't Perplexity support many different LLMs?
| jazzyjackson wrote:
| Sure, but anything built with "Safety" in mind is going to
| react this way. I guess "and more" means Mi{s,x}tral, which
| is happy to criticize Islam while Claude and Llama refuse.
|
| "Select your preferred AI Model. Choose from GPT-4o,
| Claude-3, Sonar Large (LLama 3.1), and more"
|
| https://www.perplexity.ai/pro
|
| Edit: looks like perplexity deprecated/dropped mistral and
| recommends using llama instead, effective 8/12/24:
|
| https://docs.perplexity.ai/changelog/changelog#model-
| depreca...
| layer8 wrote:
| Assuming that Perplexity doesn't operate its own search engine
| and crawler, how would it be able to operate without Google?
|
| Google's search index and its maintenance will remain a
| required functionality for the internet, and it will have to be
| paid for one way or another. Furthermore, AI will continue to
| require a corresponding search interface to implement its AI
| search on top of, and some portion of human users will also
| still want to directly access it, rather than only through an
| AI front end.
| Iulioh wrote:
| Easy, just use Bing.
|
| /s
|
| It is a joke, but we really have only 2 search engines to
| choose from. I don't know what to think about that...
| fsndz wrote:
| The thing is, the killer app of the generative AI space--at least
| for large language models (LLMs)--might already be ChatGPT. While
| people are still searching for the next big application in this
| field (https://www.lycee.ai/blog/build-killer-app-of-generative-
| ai), it's possible that it has already been built. ChatGPT
| includes a human-in-the-loop design, avoiding the complexities of
| agentic workflows. You ask questions, get answers, iterate, and
| use the code interpreter if necessary. It's like having a thought
| partner.
|
| Given the issue of hallucinations in LLMs, this might be the only
| feasible user experience. Users must be aware of the potential
| for hallucinations and have a way to iterate until the desired
| output is achieved. How else could this be done effectively
| except through a chat interface? We need to cut through the noise
| quickly and start leveraging the true value of LLMs
| (https://www.lycee.ai/blog/llm-noise-value-openai).
| ben_w wrote:
| The counterpoint is that "free" ChatGPT looks really good, and
| that's hard to monetise because of all the other free chat
| interfaces.
|
| My guess is that ChatGPT is the free advert for the real
| product, which is their API and in particular the fine-tuning.
|
| Using the language comprehension of an LLM as part of a bigger
| system, RAGging them, forcing the output to comply with
| continuous tests, etc. does still provide other business
| opportunities not available to a fully general-purpose chat
| system with no limits to the kind of content it can produce.
|
| If you're trying to make a system that always produces valid
| SQL, you want it to not just pass a syntax checker but also be
| valid for the specific schema it's being asked about; you
| definitely _don't_ want it running fully automated if there's
| a chance it will append "Let me know if there's anything else I
| can help you with!" to the end of the query.
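That schema-validity check can be sketched concretely: SQLite will compile (but not run) a statement via EXPLAIN, which catches syntax errors, unknown tables or columns, and any chatty text appended after the query, all before anything executes. The schema and queries below are made up for illustration:

```python
import sqlite3

# Hypothetical schema the generated SQL must be valid against.
SCHEMA = (
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, total REAL, placed_at TEXT);"
)

def validate_sql(candidate: str) -> bool:
    """Compile the candidate against the schema without running it.
    EXPLAIN catches syntax errors, references to unknown tables or
    columns, and trailing chatter after the statement."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    try:
        conn.execute("EXPLAIN " + candidate)
        return True
    except (sqlite3.Error, sqlite3.Warning):
        return False
    finally:
        conn.close()

good = validate_sql("SELECT total FROM orders WHERE id = 1")
bad = validate_sql(
    "SELECT total FROM orders; Let me know if there's anything "
    "else I can help you with!"
)
```

A real pipeline would loop: generate, validate, and on failure feed the error back to the model for a retry, rather than ever running an unvalidated statement.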
|
| But this isn't a mere statement of the problem, people are
| doing that kind of thing with these tools:
|
| https://twimlai.com/podcast/twimlai/building-real-world-llm-...
___________________________________________________________________
(page generated 2024-08-31 23:00 UTC)