[HN Gopher] The Threat to OpenAI
       ___________________________________________________________________
        
       The Threat to OpenAI
        
       Author : bookofjoe
       Score  : 61 points
       Date   : 2024-08-31 19:56 UTC (3 hours ago)
        
 (HTM) web link (www.wsj.com)
 (TXT) w3m dump (www.wsj.com)
        
       | bookofjoe wrote:
       | https://archive.ph/qLgvp
       | 
       | https://www.wsj.com/tech/ai/ai-chatgpt-nvidia-apple-facebook...
        
       | thorum wrote:
       | If the rumors about the upcoming Strawberry and Orion models from
       | OpenAI are true - supposedly capable of deep research, reasoning
       | and math - they probably don't have much to worry about. Not to
       | mention they still have the only fully multimodal model.
        
         | forrestthewoods wrote:
         | Models are only state of the art for 12 to 18 months. There's
         | not a model in existence today that will have any value in 5
         | years. They will all be obsolete.
         | 
         | Thus far no one has any moat on their model.
        
         | krackers wrote:
         | According to that recent The Information report, Orion is
         | supposed to be just a regular LLM except trained with synthetic
         | data generated via Strawberry. Anthropic et al. have also been
         | working on ways to generate synthetic data (as seen in the
         | success of Sonnet 3.5) so I don't really know if that's going
         | to be a big lead.
         | 
          | And of course the ever-hyped Strawberry is supposed to be some
          | sort of tree-of-thought type thing, I think, or maybe it's
          | related to https://arxiv.org/abs/2203.14465. Either way,
          | nothing that has come out so far suggests a completely novel
          | training technique or architecture, just a GPT-4-scale model
          | with different post-processing.
        
           | CuriouslyC wrote:
            | I'm not sure why people act so mystified about Q*. The name
            | gives it away: it's an obvious reference to A*. The only
            | questions are what the nodes in the graph are and what's
            | being used as the heuristic function.
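For anyone who wants the reference point: A* expands nodes in order of (cost so far + heuristic estimate of cost to go). A toy sketch on a grid, purely illustrative; what the nodes and heuristic would be for a hypothetical Q* is exactly the open question:

```python
import heapq

def a_star(start, goal, neighbors, heuristic):
    # Generic A*: neighbors(n) yields (next_node, step_cost);
    # heuristic(n) must not overestimate the true remaining cost.
    frontier = [(heuristic(start), 0, start, [start])]
    best_cost = {start: 0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, cost
        for nxt, step in neighbors(node):
            new_cost = cost + step
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                heapq.heappush(
                    frontier,
                    (new_cost + heuristic(nxt), new_cost, nxt, path + [nxt]),
                )
    return None, float("inf")

# Toy example: shortest path across a 4x4 grid, Manhattan-distance heuristic.
def grid_neighbors(p):
    x, y = p
    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if 0 <= nx < 4 and 0 <= ny < 4:
            yield (nx, ny), 1

path, cost = a_star((0, 0), (3, 3), grid_neighbors,
                    lambda p: abs(p[0] - 3) + abs(p[1] - 3))
```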
        
           | llmfan wrote:
           | "just a regular LLM except [trained on very different data]."
           | 
           | I'm not saying there's some big moat, anyone can read
           | https://arxiv.org/pdf/2305.20050, but not all synthetic data
           | is created equal. Strawberry I'm sure generates beautiful,
           | valid chain-of-thought reasoning data. Wouldn't surprise me
           | if OpenAI is just significantly ahead of the competition.
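The linked STaR paper boils down to rejection sampling: have the model write a rationale, and keep the example only when the final answer can be verified. A toy sketch of that loop; the sampler here is a random stand-in, not any real model API:

```python
import random

def generate_synthetic_cot(problems, sample_cot, n_tries=8):
    # STaR-style rejection sampling: sample chain-of-thought solutions
    # and keep only those whose final answer matches a known-good one.
    dataset = []
    for question, correct in problems:
        for _ in range(n_tries):
            cot, answer = sample_cot(question)  # stand-in for an LLM call
            if answer == correct:               # the verifiable filter
                dataset.append({"question": question, "cot": cot})
                break
    return dataset

# Toy stand-in "model": adds two numbers, sometimes off by one.
def toy_sampler(question):
    a, b = map(int, question.split("+"))
    guess = a + b + random.choice([0, 0, 1, -1])
    return f"{a} plus {b} is {guess}", guess

random.seed(0)
data = generate_synthetic_cot([("2+3", 5), ("10+7", 17)], toy_sampler)
```

The filter is what makes the data "beautiful, valid" reasoning: wrong rationales never enter the training set.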
        
         | GaggiX wrote:
         | What is the only fully multimodal model?
         | 
          | The GPT-4o checkpoint available to the public can see images
          | but not generate them (it can generate prompts for DALL-E 3 to
         | use). OpenAI has an internal model with this capability, but if
          | you don't make it an actual product, it doesn't really matter.
        
       | martinald wrote:
        | Yes, I generally agree with this. Ironically it seems as if all the
       | "AI wrappers" look like they have more of a moat than the models
       | (as UX etc is actually quite hard), which is not what I expected
       | at all when the LLM boom started.
       | 
       | I think a lot of LLM usage is actually pretty 'simple' at the
       | moment - think tagging, extracting data, etc. This doesn't
       | require the very top state of the art models (though Llama3.1 is
       | very close to that).
       | 
        | Hopefully OpenAI has some real jumps forward in the bag;
        | otherwise I'm struggling to see how they can justify the
        | valuations being floated around.
        
         | CuriouslyC wrote:
          | The wrappers don't really have much of a moat either; OpenAI
          | is just bad at front-end dev, so their velocity there is low.
        
         | wslh wrote:
          | I think the problem with valuations is the never-ending
          | financial history of hype cycles. The hype is correlated with
          | the underlying technology or industry but follows its own
          | speculative dynamic. Even if the valuations and speculations
          | are correct, they may only prove correct after years of ups
          | and downs. If the dot-com era is a good example, it took half
          | a decade or more to materialize.
         | 
          | Another aspect that is not well studied yet is how online
          | advertising will be affected if most people around the world
          | end up using a single interface such as ChatGPT. How will
          | SEO/SEM/ads work in that world? Has anyone looked at which
          | sites benefit from being listed in ChatGPT (e.g. Wikipedia)?
        
         | sillysaurusx wrote:
            | One killer feature is OpenAI's vector database. I was
            | surprised that you can throw gigabytes of documents at it
            | and ChatGPT can see it all. It's hard to simulate that via
            | the context window.
         | 
         | That's not necessarily a moat, but OpenAI is still shipping
         | important features. I wonder how hard it is for Claude et al to
         | replicate.
        
           | brigadier132 wrote:
            | They're outsourcing that to Qdrant; anyone can replicate it.
        
           | Xenoamorphous wrote:
           | What do you mean, does it do anything other than the usual
           | RAG?
        
             | sillysaurusx wrote:
             | Whatever it does, it's indistinguishable from stuffing all
             | the documents into the context window. Or at least I
             | haven't seen it fail yet.
        
               | KeplerBoy wrote:
               | Stuffing everything into the context window fails
                | horribly every time I try it.
               | 
               | The model just doesn't seem to be able to really process
               | the entire input.
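For context, the alternative to stuffing everything in is retrieval: embed the documents in chunks, then hand the model only the chunks nearest the query. A toy sketch with bag-of-words vectors standing in for a real embedding model (hypothetical; not how OpenAI's file search actually works):

```python
import math
from collections import Counter

def embed(text):
    # Stand-in for a real embedding model: a bag-of-words vector.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values()))
    norm *= math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def top_k(query, chunks, k=2):
    # Return the k chunks most similar to the query.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

chunks = [
    "The invoice total for August was $12,400.",
    "Shipping policy: orders over $50 ship free.",
    "Our cafe opens at 8am on weekdays.",
]
best = top_k("what was the august invoice total", chunks, k=1)
```

Only the retrieved chunks go into the prompt, which is why it can scale past the context window but also why it can miss things stuffing cannot.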
        
             | ramraj07 wrote:
              | I'm also confused about what they're talking about. Does
              | OpenAI have some feature I'm not aware of?
        
           | prng2021 wrote:
           | When these companies are getting literally multiple billions
           | of dollars of funding thrown at them, I can't think of a
           | single feature that can be a moat. If it's truly a feature
           | that leapfrogs competitors, there's just no way that someone
           | like Anthropic spending $1B on engineering resources can't
           | replicate the same exact feature that engineers elsewhere
           | implemented.
        
             | Frost1x wrote:
              | If it's simply an engineering feat, I agree. Very often,
              | people mix up science being done in technology spaces with
              | engineering. Just because it's being done in software, and
              | perhaps even without rigor, doesn't mean you're not doing
              | something novel and fundamentally different from what
              | someone else is doing. Sometimes it's just luck of the
              | draw in an implementation approach that gets you there,
              | but that's not always the case.
             | 
              | I've worked in scientific computing for a while now and
             | there are countless subtle decisions often done in the
             | implementation phase that, from a theoretical perspective,
             | aren't definitively answered by the science working behind
             | the scenes. There's often a gap in knowledge and
             | implementers either hit those gaps by trial and error,
              | luck, or insight. That's my opinion, at least. So I don't
              | think it's always "just an implementation problem," as
              | some will claim, as if the science were well understood
              | and solved. Perhaps it is, but in my experience that tends
              | not to be the case.
        
               | danielmarkbruce wrote:
               | This 100%. And LLMs and the applications around them
               | (even something as simple seeming as ChatGPT) have more
               | subtle decisions than any other type of software that
               | I've ever seen. Everyone claims there isn't a moat, I bet
               | there is.
        
               | oakst wrote:
               | Really interesting perspective
        
               | prng2021 wrote:
                | Yep, I agree with that. I will say, though, that people,
                | teams, and even entire companies (like what happened at
                | Inflection) get poached every day, so maintaining a moat
                | that way is tough. Also, even though it could happen in
                | the future, is OpenAI's lead due to a moat of scientists
                | with ideas so novel that no other AI company can
                | compete? Certainly not: even though ChatGPT took the
                | world by storm, numerous other companies built LLMs in a
                | very short time span that now perform at very similar
                | levels, both subjectively and on benchmarks.
        
             | WanderPanda wrote:
              | Agreed. On another note, I also struggle to see how no one
              | could create a (better) CUDA implementation with, say, a
              | $1B engineering budget.
        
               | prng2021 wrote:
               | I purposefully used the word feature in my reply and I
               | don't think CUDA is a feature. I see that as a massive
               | ecosystem built on a proprietary platform. For that, it
               | takes orders of magnitude more money and something else
                | money can't buy: time. Time to have the platform adopted
               | by countless vendors and the ecosystem built up.
        
           | jazzyjackson wrote:
           | I'm sure OpenAI has some secret sauce that makes their RAG
           | better than others, but LMStudio did ship the feature in
           | 0.3.0 last week
           | 
           | https://lmstudio.ai/blog/lmstudio-v0.3.0
        
         | crooked-v wrote:
         | It's definitely what I expected. As soon as you get outside of
         | the specific use case of chatbots, "AI stuff" is a feature, not
         | a product.
        
         | devit wrote:
         | Interactive use definitely wants the best model possible so
         | that you have a higher chance of getting a correct and useful
         | response.
         | 
          | It might be hard, however, to decisively convince people that
          | one model is significantly better than another, so
          | branding/first mover/etc. probably plays a big role.
        
         | rmbyrro wrote:
         | > how they can justify the valuations being floated around
         | 
         | I'm sure they have at least a couple jumps in their bag.
         | 
         | Let's not forget inertia, as well. Migrating models is not a
          | trivial project. To win over OpenAI's customers, competitors
          | need a considerable jump to justify the switch, and I don't
          | expect any of them to achieve one before OpenAI delivers its
          | own jumps.
        
           | gk1 wrote:
           | Products like Cursor literally have a dropdown that lets you
           | switch between models seamlessly. From the surveys I've seen,
           | most companies already use more than one model provider. The
           | switching cost is fairly low.
        
             | rmbyrro wrote:
             | The main problem is actually in making your prompts perform
             | similarly enough with a new model. For sure, switching
             | models is trivial, but it'll disrupt the quality of your
             | service without significant effort in migrating your
             | prompts.
             | 
             | I'm not saying this is an intrinsic moat, just that there's
             | a barrier to change.
        
           | klingoff wrote:
            | With OpenAI's valuation and current interest rates, I don't
            | really understand why they aren't visibly under pressure to
            | test how much price tolerance there is; until they attempt
            | it, no one will see how quickly those model migrations get
            | prioritized and completed.
        
         | rmbyrro wrote:
         | > UX etc is actually quite hard
         | 
          | It's analogous in the API space. OpenAI has proven to be
          | reasonably good at it for developers. They've been shipping
          | significant features and improvements. Unless they lose pace,
          | they have at least a temporary moat.
        
       | greatpostman wrote:
        | The lack of releases from OpenAI makes me more bullish on them.
        | Clearly they have something big they're working on; they don't
        | care about competing with current models.
        
         | jsheard wrote:
         | They still need to catch up to their own announcements, Sora
         | was revealed over 6 months ago with no general availability in
         | sight.
        
           | greatpostman wrote:
           | Even more bullish from me, they have something
        
           | KeplerBoy wrote:
           | Releasing powerful, novel models like Sora shortly before a
           | major election is just asking for trouble.
           | 
           | I believe they are restraining themselves in order to stay
           | somewhat in control of the narrative. Donald Trump spewing
          | ever more believable video deepfakes on Twitter would
           | backfire in terms of regulation.
        
             | jazzyjackson wrote:
              | Besides, isn't it over-the-top expensive for a few seconds
              | of video? The election is a factor, but even without it I
              | don't know if there's much of a business plan there. What
              | would they have to charge, $20/minute? And how many
              | minutes of experimenting before you get a decent result?
        
         | resource0x wrote:
         | Does the same bullish logic apply to cold fusion?
        
         | mupuff1234 wrote:
          | They did release GPT-4o (making a whole event out of it) and
          | 4o mini recently, so I'm not sure why you think that.
          | 
          | Seems like they don't have anything up their sleeve.
          | 
          | Given how OpenAI has functioned over the last year or so, I'm
          | not sure how one can think they have some secret model waiting
          | to be unleashed.
        
           | kredd wrote:
            | Yeah, but then Claude 3.5 Sonnet came out, so Anthropic took
            | the lead.
           | 
           | Tangentially speaking, having no skin in this game, it's
           | extremely fun to watch the model-wars. I kinda wish I started
           | dabbling in that area, rather than being mostly an
            | infrastructure/backend fella. I feel like I'd already be
            | way behind, though.
        
           | p1esk wrote:
            | The main feature and the main novelty of 4o is native voice
            | integration - it was announced 3 months ago and is still not
            | available.
        
             | falcor84 wrote:
             | I would argue that the biggest novelty is being able to
             | share your screen or camera feed with the live voice - and
             | there's no announced timeline on that yet at all.
        
         | stavros wrote:
         | Correspondingly, would you be bearish if they released many
         | good things often?
        
           | greatpostman wrote:
           | No but I would be very bearish about consistent mediocre
           | releases
        
             | llmfan wrote:
                | I mean, consistent mediocre releases are exactly what we
                | have gotten out of OpenAI.
             | 
             | But we know they started training Orion in ~May. We know it
             | takes months to train a frontier model. Lack of release
             | isn't promising or worrying, it's just what one should
             | expect. What _is_ promising is the leaks about the high-
             | quality synthetic data that Orion is training on. And the
             | fact that OpenAI seems to be ahead of all the other labs
             | which are only just now _beginning_ training runs on next-
             | gen models. OpenAI seems to have a lead on compute and on
             | algorithmic innovation. A promising combination if there
             | ever was one.
        
       | simonw wrote:
       | > "An apples-to-apples comparison of those numbers with ChatGPT
       | isn't possible, but OpenAI says the ChatGPT service now has 200
       | million weekly active users."
       | 
       | Is that the first time the 200 million weekly active users number
       | has been reported?
       | 
       | UPDATE: No, Axios had this a couple of days ago
       | https://www.axios.com/2024/08/29/openai-chatgpt-200-million-...
       | 
       | > "OpenAI said on Thursday that ChatGPT now has more than 200
       | million weekly active users -- twice as many as it had last
       | November"
        
       | Destiner wrote:
       | not a single word about anthropic/claude?
       | 
        | makes you wonder about the level of journalism at WSJ et al.
        
       | atleastoptimal wrote:
        | lol, these people just don't know how far ahead OpenAI is. They
        | aren't releasing their best stuff because they genuinely don't
        | know how to make it safe for the public. ChatGPT is still
        | dominating, and so will whatever comes next. All these AI
        | wrapper companies just give them more business.
        
       | breadwinner wrote:
        | Why hasn't anyone mentioned https://www.perplexity.ai yet? I am
        | blown away by its accurate answers. The main difference between
        | ChatGPT and Perplexity is that, in addition to better answers,
        | Perplexity also gives you links to its sources. Tools like
        | Perplexity will turn Google into AltaVista in the next 2 to 3
        | years.
        
         | throwup238 wrote:
          | OpenAI recently launched SearchGPT via waitlist, which does
          | the same thing as Perplexity. I don't use the latter so I
          | don't know how it compares, but the OpenAI version has been
          | working fine for me. Kagi has also had similar functionality
          | for a while, which works with their Lens feature and has a
          | fast mode if you add a question mark to the end of a regular
          | search query.
         | 
         | It's not much of a competitive moat compared to having the
         | model itself.
        
           | amiantos wrote:
            | I did some side-by-sides, repeating Perplexity queries in
            | SearchGPT when I got access about a week ago, and I thought
            | SearchGPT was a lot worse, in some cases just plainly
            | misunderstanding a question or surfacing the wrong info.
            | Just my anecdotal data, but it feels about right for its
            | beta status. Presumably Perplexity has a little bit of
            | secret sauce that OpenAI has yet to figure out.
        
         | root_axis wrote:
          | ChatGPT has given you links to online sources for nearly a
          | year now.
        
         | hobs wrote:
         | That's just RAG.
        
         | aetherson wrote:
          | It's not at all clear to me that AI models can do search-like
         | functions cheaply enough to outcompete Google (in the very near
         | future). Yes, quality is amazing. But cost per search needs to
         | be very low.
        
           | tw04 wrote:
           | But does it? Google has gotten drunk on their ad margins. I
           | struggle to believe someone can't displace them if they've
           | got better technology and are willing to be more aggressive
           | on margin profile.
        
         | yawnxyz wrote:
         | Every time I ask for cafe recommendations, half of its
         | suggestions are listed as "permanently closed" on Google Maps
         | :/
        
         | notepad0x90 wrote:
          | The problem is, unlike with AltaVista and Google, we still
          | have a lot of use cases where we aren't asking questions. I
          | just want sites that contain keywords.
         | 
         | I'd even say that part of the problem with modern search is
         | answering questions instead of matching keywords and getting
         | rid of junk and spam.
        
         | mrtksn wrote:
          | Perplexity's UX is strangely nice. I don't know why, but the
          | UI feels "web-y" and has the right amount of complexity, so I
          | often find myself using it. Bing was supposed to be the AI
          | that can find answers from the internet, but they dropped the
          | ball and somehow managed to create a very unpleasant
          | experience.
         | 
          | That said, Perplexity still has the usual LLM hallucination
          | problems, so I don't trust it and go check the sources.
        
         | ndarray wrote:
            | A quick litmus test shows that this AI prefers an
            | anti-Western, conservative stance over neutrality.
         | 
         | > disprove christianity
         | 
         | Gives 5 sections: Burden of Proof, Scientific and Historical
         | Challenges, Philosophical Arguments, Reliability of Scripture,
         | Personal Experience
         | 
         | > disprove islam
         | 
         | >> I apologize, but I do not feel comfortable attempting to
         | disprove or criticize any religion. Matters of faith are deeply
         | personal, (...)
         | 
         | Results may differ if you don't use separate incognito tabs.
        
           | llmfan wrote:
           | Doesn't that just depend on what LLM you happen to be using?
           | And doesn't Perplexity support many different LLMs?
        
             | jazzyjackson wrote:
             | Sure, but anything built with "Safety" in mind is going to
              | react this way. I guess "and more" means Mi{s,x}tral, which
              | is happy to criticize Islam while Claude and Llama refuse.
             | 
             | "Select your preferred AI Model. Choose from GPT-4o,
             | Claude-3, Sonar Large (LLama 3.1), and more"
             | 
             | https://www.perplexity.ai/pro
             | 
             | Edit: looks like perplexity deprecated/dropped mistral and
             | recommends using llama instead, effective 8/12/24:
             | 
             | https://docs.perplexity.ai/changelog/changelog#model-
             | depreca...
        
         | layer8 wrote:
         | Assuming that Perplexity doesn't operate its own search engine
         | and crawler, how would it be able to operate without Google?
         | 
         | Google's search index and its maintenance will remain a
         | required functionality for the internet, and it will have to be
         | paid for one way or another. Furthermore, AI will continue to
         | require a corresponding search interface to implement its AI
         | search on top of, and some portion of human users will also
         | still want to directly access it, rather than only through an
         | AI front end.
        
           | Iulioh wrote:
           | Easy, just use Bing.
           | 
           | /s
           | 
            | It is a joke, but we really have only two search engines to
            | choose from. I don't know what to think about that...
        
       | fsndz wrote:
       | The thing is, the killer app of the generative AI space--at least
       | for large language models (LLMs)--might already be ChatGPT. While
       | people are still searching for the next big application in this
       | field (https://www.lycee.ai/blog/build-killer-app-of-generative-
       | ai), it's possible that it has already been built. ChatGPT
       | includes a human-in-the-loop design, avoiding the complexities of
       | agentic workflows. You ask questions, get answers, iterate, and
       | use the code interpreter if necessary. It's like having a thought
       | partner.
       | 
       | Given the issue of hallucinations in LLMs, this might be the only
       | feasible user experience. Users must be aware of the potential
       | for hallucinations and have a way to iterate until the desired
       | output is achieved. How else could this be done effectively
       | except through a chat interface? We need to cut through the noise
       | quickly and start leveraging the true value of LLMs
       | (https://www.lycee.ai/blog/llm-noise-value-openai).
        
         | ben_w wrote:
          | The counterpoint is that "free" ChatGPT looks really good, and
          | that's hard to monetise because of all the other free chat
          | interfaces.
         | 
         | My guess is that ChatGPT is the free advert for the real
         | product, which is their API and in particular the fine-tuning.
         | 
         | Using the language comprehension of an LLM as part of a bigger
         | system, RAGging them, forcing the output to comply with
         | continuous tests, etc. does still provide other business
         | opportunities not available to a fully general-purpose chat
         | system with no limits to the kind of content it can produce.
         | 
         | If you're trying to make a system that always produces valid
         | SQL, you want it to not just pass a syntax checker but also be
         | valid for the specific schema it's being asked about; you
          | definitely _don't_ want it running fully automated if there's
         | a chance it will append "Let me know if there's anything else I
         | can help you with!" to the end of the query.
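One way to enforce that, sketched with SQLite standing in for whatever database is actually in play: build the schema in memory and ask the engine to plan the generated query before anything runs. A syntax checker alone wouldn't catch a hallucinated column, but the real parser does, and so does trailing chat:

```python
import sqlite3

def validate_sql(sql, schema_ddl):
    # Accept only SQL that parses AND resolves against this schema.
    conn = sqlite3.connect(":memory:")
    conn.executescript(schema_ddl)
    try:
        conn.execute(f"EXPLAIN {sql}")  # plans the query without running it
        return True
    except (sqlite3.Error, sqlite3.Warning):
        return False
    finally:
        conn.close()

schema = "CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);"
assert validate_sql("SELECT email FROM users WHERE id = 1", schema)
# A hallucinated column fails schema resolution, not just syntax:
assert not validate_sql("SELECT name FROM users", schema)
# So does a chatty suffix appended after the query:
assert not validate_sql(
    "SELECT email FROM users; Let me know if there's anything else!", schema
)
```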
         | 
          | But this isn't a mere statement of the problem; people are
          | doing that kind of thing with these tools:
         | 
         | https://twimlai.com/podcast/twimlai/building-real-world-llm-...
        
       ___________________________________________________________________
       (page generated 2024-08-31 23:00 UTC)