[HN Gopher] OpenAI's plans according to sama
       ___________________________________________________________________
        
       Author : razcle
       Score  : 179 points
       Date   : 2023-05-31 18:05 UTC (4 hours ago)
        
 (HTM) web link (humanloop.com)
 (TXT) w3m dump (humanloop.com)
        
       | londons_explore wrote:
       | > is limited by GPU availability.
       | 
       | Which is all the more curious, considering OpenAI said this only
       | in January:
       | 
       | > Azure will remain the exclusive cloud provider for all OpenAI
       | workloads across our research, API and products [1]
       | 
        | So... OpenAI is severely GPU constrained; it is hampering their
        | ability to execute, onboard customers to existing products, and
        | launch products. Yet they signed an agreement _not_ to just go
        | rent a bunch of GPUs from AWS???
       | 
       | Did someone screw up by not putting a clause in that contract
       | saying "exclusive cloud provider, _unless you cannot fulfil our
       | requests_ "?
       | 
       | [1]: https://openai.com/blog/openai-and-microsoft-extend-
       | partners...
        
         | ilaksh wrote:
          | AWS might not really have much extra GPU capacity for them
          | anyway... Also, they would cost more.
         | 
         | I think that there aren't a lot of GPUs available and it takes
         | time to add more to the datacenter even when you do get them.
        
           | carom wrote:
           | I heard earlier this year that people were having trouble
           | getting allocations on GCP as well. Probably why Nvidia is at
           | $1T now.
        
         | chaostheory wrote:
         | Even if they weren't exclusive with Azure, aren't GPU prices
         | reasonable again?
        
           | verdverm wrote:
            | They have to be available to buy, regardless of the price. My
            | understanding is there is a distinct lack of supply.
        
         | londons_explore wrote:
         | Perhaps they are cash flow constrained, which in turn means
          | they are GPU constrained, since GPUs are their biggest
         | expense?
        
         | sebzim4500 wrote:
          | >So... OpenAI is severely GPU constrained; it is hampering
          | their ability to execute, onboard customers to existing
          | products, and launch products. Yet they signed an agreement not
          | to just go rent a bunch of GPUs from AWS???
         | 
         | > Did someone screw up by not putting a clause in that contract
         | saying "exclusive cloud provider, unless you cannot fulfil our
         | requests"?
         | 
         | Maybe MSFT refused to sign such an agreement?
        
         | catchnear4321 wrote:
         | this has nothing to do with sama clamoring for regulation.
         | 
         | that absolutely isn't an attempt to slow down all competition.
         | 
         | which isn't necessary because nobody made such a mistake.
         | 
         | this won't lead to any hasty or reckless internal decisions in
         | a feckless effort to stay in front.
         | 
         | not that any have already been made.
         | 
         | not that that could lead to disaster.
        
         | jiggawatts wrote:
         | One of Azure's unique offerings is very large HPC clusters with
         | GPUs. You can deploy ~1,000 node scale sets with very high
         | speed networking. AWS has many single-server GPU offerings, but
         | nothing quite like what Azure has.
         | 
         | Don't assume Microsoft is bad at _everything_ and that AWS is
         | automatically superior at all product categories...
        
         | HarHarVeryFunny wrote:
         | There's an interesting recent video here from Microsoft
         | discussing Azure. The format is a bit cheesy, but lots of
         | interesting information nonetheless.
         | 
         | https://www.youtube.com/watch?v=Rk3nTUfRZmo&t=5s "What runs
         | ChatGPT? Inside Microsoft's AI supercomputer"
         | 
         | The relevance here is that Azure appears to be very well
         | designed to handle the hardware failures that will inevitably
         | happen during a training run taking weeks or months and using
         | many thousands of GPUs... There's a lot more involved than just
         | renting a bunch of Amazon GPUs, and anyways the partnership
         | between OpenAI and Microsoft appears quite strategic, and can
         | handle some build-out delays, especially if they are not
         | Microsoft's fault.
        
       | simse wrote:
       | > A stateful API
       | 
       | This would be huge for many applications, as "chatting" with
        | GPT-4 gets really, really expensive very quickly. I've played
        | with the API with friends, and winced as I watched my usage hit
       | several dollars for just a bit of fun.
        
       | dontupvoteme wrote:
          | Having been recently taken aboard by the mothership, I expect
          | they'll start trying to tune out anything related to programming
          | to push people towards Copilot X.
       | 
          | It's pretty hilarious and annoying to see Bing start to write
          | code only to censor itself after a few lines (deleting what
          | was there! no wonder these guys love websockets and dynamic
          | histories)
       | 
       | Whoops!
        
         | _boffin_ wrote:
         | Wait... what? Can you elaborate.
        
           | mistymountains wrote:
            | He's speculating that Microsoft is nerfing OpenAI / ChatGPT
            | to funnel narrow capabilities into silos like Copilot.
        
             | _boffin_ wrote:
              | I understand that... I should have specified a bit more:
              | I'm interested in knowing more about the removal of
              | answers as it's writing them, if they're code.
        
           | dontupvoteme wrote:
           | https://en.wikipedia.org/wiki/Embrace,_extend,_and_extinguis.
           | ..
        
             | _boffin_ wrote:
             | yes... I know about this, but that's not what I'm asking
             | about. I'm asking about it removing partial answers as it's
             | writing them.
             | 
              | Please put more effort in next time than providing me with
              | a Wiki article.
        
       | sharkjacobs wrote:
       | > He reiterated his belief in the importance of open source and
       | said that OpenAI was considering open-sourcing GPT-3. Part of the
       | reason they hadn't open-sourced yet was that he was skeptical of
       | how many individuals and companies would have the capability to
       | host and serve large LLMs.
       | 
       | Am I reading this right? "We're not open sourcing GPT-3 because
       | we don't think it would be useful to anyone else"
        
         | xxprogamerxy wrote:
         | He wants the release of the model to primarily benefit
         | individuals and smaller teams as opposed to large deep-pocketed
         | firms.
        
         | stavros wrote:
         | Reads to me like "we don't know how many people will have
         | hardware powerful enough to run this".
        
         | razcle wrote:
         | I think I worded this poorly. What he said was that a lot of
         | people say they want open-source models but they underestimate
         | how hard it is to serve them well. So he wondered how much real
         | benefit would come from open-sourcing them.
         | 
         | I think this is reasonable. Giving researchers access is great
          | but most small companies are likely better off having a service
          | provider manage inference for them rather than navigating the
          | infra challenge.
        
           | roganartu wrote:
           | The beauty of open source is that the community will either
           | figure out how to make it easier, or collectively decide it's
           | not worth the effort. We saw this with stable diffusion, and
           | we are seeing it with all the existing OSS LLMs.
           | 
           | "It's too hard, trust us" doesn't really make sense in that
           | context. If it is indeed too hard for small orgs to self host
           | then they won't. Hiding behind the guise of protecting these
           | people by not open sourcing it seems a bit disingenuous.
        
         | greenie_beans wrote:
         | lmao i had the same reaction. sounds like some bullshit.
        
         | TigeriusKirk wrote:
         | How can you sign a statement that AI presents an extinction
         | risk on par with nuclear weapons and then even consider open
         | sourcing your research?
         | 
         | We don't provide nuclear weapons for everyone to keep in their
         | basement, why would someone who believes AI is an existential
         | risk provide their code?
        
         | bibanez wrote:
         | I agree, this is so bizarre
        
           | ftxbro wrote:
           | yes i also can't wrap my head around how a ceo of a billion
           | dollar company isn't sincere in his public statements
        
             | wintogreen74 wrote:
             | Really? Even after saying this? "While Sam is calling for
             | regulation of future models, he didn't think existing
             | models were dangerous and thought it would be a big mistake
             | to regulate or ban them."
        
               | cinntaile wrote:
               | It was a tongue in cheek reaction.
        
               | ethanbond wrote:
               | Why couldn't that be true? E.g. even scientists who
               | worked on the Manhattan Project (justifiably) had
               | antipathy toward the much more powerful hydrogen bomb.
               | 
               | It's possible to think squirt guns shouldn't be regulated
               | but AR-15s should, or AR-15s shouldn't but cruise
               | missiles should. Or driving at 25mph should be allowed
               | but driving 125mph shouldn't.
        
           | RosanaAnaDana wrote:
            | It's just a way to lie that doesn't sound as much like a lie.
        
         | paxys wrote:
         | More like - it won't be useful to small-time developers (since
         | they won't have the capability to host and run it themselves)
         | and so all the benefits will be reaped by AWS and other large
         | players.
        
         | TapWaterBandit wrote:
         | When you stop listening to what Sam Altman says and just focus
         | on what he does, you can see the guy is a bit of a snake.
         | Greedy power-hungry man imho.
        
         | sebzim4500 wrote:
         | It is weird, but GPT-3 is worse than much smaller LLaMA models
         | so I doubt it would see much use anyway.
        
           | flangola7 wrote:
              | Are you referring to DaVinci or ChatGPT-3.5?
        
             | sebzim4500 wrote:
             | DaVinci
        
           | killjoywashere wrote:
           | How do you measure this? Pointers to papers would be very
           | helpful
        
             | sebzim4500 wrote:
             | The LLaMA paper had a bunch of comparisons
        
         | candiddevmike wrote:
         | OpenAI: Regulations must be passed to protect our moat
         | 
         | Also OpenAI: Meta is pissing in our moat, let's drop a hint
         | about open sourcing our shit too!
        
       | [deleted]
        
       | naillo wrote:
       | > Plugins "don't have PMF"
       | 
       | Probability mass functions? Anyone know what this means in this
       | context?
        
         | simonbutt wrote:
         | Product market fit
        
         | [deleted]
        
       | sovietmudkipz wrote:
       | I'm hoping GPT will remove the information cutoff date. I write
       | plenty of terraform/AWS and it's a bit of a pain that the latest
       | API isn't accessible by GPT yet.
       | 
       | There's been quite a bit happening in the programming space since
       | sept 2021.
       | 
       | I use GPT to keep things high level and then do my normal
       | research methodology for implementation details.
        
         | mustacheemperor wrote:
         | I enjoy using GPT4 as a co-programmer, and funny enough it is
         | very challenging to get advice on Microsoft's own .NET MAUI
         | because that framework was in prerelease at the time the model
         | was trained.
         | 
         | My understanding is right now they essentially need to train a
         | new model on a new updated corpus to fix this, but maybe some
         | other techniques could be devised...or they'll train something
         | more up to date.
        
           | ilaksh wrote:
           | You might actually get pretty far if you just went through
           | the Microsoft docs and created a bunch of really concise
           | examples and fed that as the start of the prompt. Use like
           | 6-7kb for that and then the question at the end.
        
             | mustacheemperor wrote:
             | I have had some luck doing exactly that, and not even as
             | efficiently as you describe - If my question is limited
             | enough that the discussion won't overwhelm the context
             | window I've found I can just paste in big chunks of the
             | docs wholesale like a 'zero shot.'
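The doc-stuffing approach described in the two comments above can be made concrete. Below is a minimal sketch; `build_messages`, the system-prompt wording, and the ~7 KB budget are illustrative assumptions, not part of any OpenAI API:

```python
# Sketch of "paste concise doc excerpts at the start of the prompt, then
# ask the question". Excerpts are packed most-relevant-first up to a byte
# budget; whatever doesn't fit is dropped.

DOC_BUDGET_BYTES = 7_000  # rough budget for doc excerpts (the "6-7kb" above)

def build_messages(doc_excerpts, question, budget=DOC_BUDGET_BYTES):
    """Pack doc excerpts into a chat-message list up to the byte budget."""
    packed, used = [], 0
    for excerpt in doc_excerpts:
        size = len(excerpt.encode("utf-8"))
        if used + size > budget:
            break  # context window is finite; stop packing
        packed.append(excerpt)
        used += size
    context = "\n\n".join(packed)
    return [
        {"role": "system",
         "content": "Answer using the documentation excerpts below.\n\n"
                    + context},
        {"role": "user", "content": question},
    ]
```

The resulting list is in the shape the chat completions API expects as its `messages` parameter; retrieval of the most relevant excerpts (e.g. via embeddings) is the part this sketch leaves out.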
        
           | buildbot wrote:
           | Context drift! https://qntm.org/mmacevedo
        
         | ilaksh wrote:
          | As stated, your request is entirely impossible. They cannot
          | simply "remove the cut-off date". It takes months and huge
         | amounts of hardware to train. Then they do the reinforcement
         | adjustments on top of it while researching how to train the
         | next batch.
        
         | furyofantares wrote:
          | It's not an arbitrary imposition; that's the data it was
          | trained on, and it's expensive to train. I hope they find a way
         | to continually train in new information too but it's not like
         | they can just remove the cutoff date.
        
       | hoschicz wrote:
       | - they are working on a stateful API - they are working on a
       | cheaper version of GPT-4
       | 
       | Most probably this is driven by their use of it in ChatGPT, which
       | is on fire from PMF. Clearly they're experimenting with the
       | cheaper GPT-4 in ChatGPT right now as it's fairly turbo now, as
       | discussed earlier today.
        
       | atemerev wrote:
       | All AI companies (OpenAI included) are now working full tilt on
       | making AIs improve themselves (writing their own code, inventing
        | new pipelines, etc.). I don't know why they would choose anything
        | else to work on. This is the prime directive that will bring the
        | greatest payoff.
        
         | ren_engineer wrote:
         | if this was currently possible wouldn't it lead to
         | sentient/superhuman AI rapidly?
         | 
         | >tell AI to make itself more efficient by finding performance
         | improvements in human written code
         | 
         | >that newly available processing power can now be used to find
         | more ways to improve itself
         | 
         | >flywheel effect of AI improving itself as it gets smarter and
         | smarter
         | 
         | eventually you'd turn it loose on improving the actual hardware
         | it runs on. I think the question now is really how far
         | transformers can be taken and if they are really the path to
         | "real" AI.
        
           | ilaksh wrote:
            | Within a couple of years, improvement processes like you
            | suggest will actually be really dangerous and stupid.
           | 
           | Also don't confuse all other types of human/animal
           | characteristics like sentience with intelligence. They are
           | different things. Things like sentience, subjective stream of
           | experience, or other aspects of being alive don't just
           | accidentally fall out of larger training datasets.
           | 
           | And we should be glad. The models are going to be orders of
           | magnitude faster (and perhaps X times higher IQ) than humans
           | within a few years. It is incredibly foolish to try to make
           | something like that into a living creature (or emulation of
           | living).
        
             | visarga wrote:
             | Intelligence is about action, and sentience is about
             | qualia, which I equate to perceptions coloured by values.
             | Action is visible and qualia are hidden, but they are
             | closely interconnected: we choose our actions in accordance
             | with our values and situation at hand.
        
         | m3kw9 wrote:
          | I think they are at least 1-2 new big research breakthroughs
          | (on the level of Attention) away from having this.
        
         | huijzer wrote:
          | I disagree, since GPUs are currently a major constraint and
          | skilled specialists almost always outperform GPT-4 as long as
          | they stay in their domain.
         | 
         | Will they use copilot(s) to improve the models? Yes, but they
         | have been doing that since 2021 already (the release year of
         | GitHub Copilot).
        
       | flakiness wrote:
       | If they open up fine-tuning API for their latest models, I wonder
       | how the enthusiasm around the open source model is impacted. One
       | of the advantages of the open source models is the ability to be
       | fine-tuned. Are other benefits enough to keep the momentum going?
        
         | m3kw9 wrote:
          | You'd better have deep pockets. Have you checked the prices and
          | the rates for using the tuned models? They're easily 10x to 100x
          | more expensive than non-tuned models.
        
       | dr_dshiv wrote:
       | > OpenAI will avoid competing with their customers -- other than
       | with ChatGPT. Quite a few developers said they were nervous about
       | building with the OpenAI APIs when OpenAI might end up releasing
       | products that are competitive to them. Sam said that OpenAI would
       | not release more products beyond ChatGPT. He said there was a
       | history of great platform companies having a killer app and that
       | ChatGPT would allow them to make the APIs better by being
       | customers of their own product. The vision for ChatGPT is to be a
       | super smart assistant for work but there will be a lot of other
       | GPT use-cases that OpenAI won't touch.
       | 
       | Can anyone elaborate on this? This is a big issue for me.
        
         | ilaksh wrote:
         | I think the tricky part for me is that "work" is extremely
         | broad and now that ChatGPT has plugins, it can kind of do
         | anything. Heh.
        
         | jiggawatts wrote:
         | Is this guy Aes Sedai?
         | 
         |  _Technically_ he can claim that OpenAI will not release
         | competing products while Microsoft plugs AI into _everything_.
         | 
         | Microsoft just announced at Build 2023 that they'll have OpenAI
         | tech integrated with: Windows, Bing, Outlook, Word, Teams,
         | Visual Studio, Visual Studio Code, Microsoft Fabric, Dynamics,
         | GitHub, Azure DevOps, and Logic Apps. I probably missed a
         | bunch.
         | 
         | Very soon now, _everything_ Microsoft sells will have OpenAI
         | integration.
         | 
         | Unless you're selling a niche product too small for Microsoft
         | to bother with, you're competing directly against OpenAI.
         | 
         | Oh, and to top it off: Microsoft can use GPT 4 all they want,
         | via API access. Third parties have to _beg and plead_ to get
         | rate-limited access. That access can be withdrawn at any time
          | if you're doing something unsafe to OpenAI's profit margins.
         | 
         | "Please Sir Sam, may I have some GPT please?"
         | 
         | "No."
        
       | yesimahuman wrote:
       | The bit about plugins not having PMF is interesting and possibly
       | flawed. I, like many others, got access to plugins but not the
       | browsing or code interpreter plugins which feel like the bedrock
       | plugins that make the whole offering useful. I think there's also
       | just education that has to happen to teach users how to
       | effectively use other plugins, and the UX isn't really there to
       | help new users figure out what to even do with plugins.
        
         | gistbug wrote:
          | Yeah, seems weird to allow people to use plugins, but not all
          | of them. Then they have the gall to say that no one is using
          | plugins; yeah, because half of them don't have any context
          | outside of America.
        
           | lumost wrote:
           | I tried the plugins - they honestly didn't seem to work very
           | well. GPT-4 wasn't sure when it could use a plugin, or when
           | it should talk about how it would do something. I wasn't able
           | to get the plugins to activate most of the time.
        
         | CSMastermind wrote:
         | Have you found plugins to be useful?
         | 
         | For what it's worth I've found the model actually performs
         | significantly worse at most tasks when given access to
         | browsing, in part because it relies on that instead of its own
         | in built knowledge.
         | 
         | I haven't found a good way to have it only access the web for
         | specific parts of its response.
        
         | verdverm wrote:
          | Most of the plugins are garbage, and for those that aren't,
          | most seem like they would be better as a chat-like experience in
          | the original app than in the OpenAI app.
        
         | furyofantares wrote:
         | PMF meaning "product market fit"? I had to look it up, curious
         | if I found the right thing or not.
        
           | nico wrote:
           | Thought it meant Pull My Finger
        
           | typest wrote:
           | Yes, PMF = "product market fit".
        
           | wsgeorge wrote:
            | Had the same reaction. I was just about to Google it when it
            | hit me. Funny how the brain can work out a random acronym
            | given context.
        
       | cryptoz wrote:
       | Great content and great answers except for open source question.
       | Sam is saying that he doesn't think anyone would be able to run
       | the code at scale so they didn't bother? Seems like a nonsense
       | answer, maybe I'm misunderstanding. The ability for individuals
       | or businesses to effectively run and host the code shouldn't have
       | an impact on the ability to open source.
        
       | boringuser2 wrote:
       | >Dedicated capacity offering is limited by GPU availability.
       | OpenAI also offers dedicated capacity, which provides customers
       | with a private copy of the model. To access this service,
       | customers must be willing to commit to a $100k spend upfront.
       | 
       | How many shell corporations are intelligence agencies seeding
       | right now?
        
         | m3kw9 wrote:
          | They are not gonna give out the weights for sure, but it will
          | still be inferencable. I'm not sure how, but it'd be
          | self-destructive if they did.
        
           | boringuser2 wrote:
           | Exactly, with a private model you could easily extract the
           | weights.
        
         | MacsHeadroom wrote:
         | Private instance means a dedicated endpoint fully managed by
         | OpenAI. You do not get model access or anything a regular API
         | user doesn't already get, except your API url will be something
         | like customer123.openai.com/api instead of api.openai.com/api
        
         | ftxbro wrote:
         | I had been putting theories in comments but they kept getting
         | flagged or banned or downvoted to oblivion, but maybe its time
         | has come. I'll keep it tame. If you are curious you can google
         | connections of OpenAI board of directors, Will Hurd, In-Q-Tel
         | trustees, Allen and Company, etc. There is more but whatever.
         | The conspiracy theory is that 'the govt stepped in' during the
         | six month pause after gpt-4 was trained and before it was
         | released.
        
           | refulgentis wrote:
           | It probably keeps getting flagged because it's ahistorical,
           | source: OpenAI engineers, and #2 somewhat obviously so. You
           | heard of RLHF?
        
             | ftxbro wrote:
             | > You heard of RLHF?
             | 
             | The conspiracy theory isn't that every employee of OpenAI
             | only had meetings with govt agencies for 8 hours every day
             | for six months.
        
         | cwkoss wrote:
         | Last night I was musing how many different countries'
         | intelligence agencies have moles working at OpenAI currently.
         | Gotta be at least 6, maybe as high as two dozen?
        
           | CSMastermind wrote:
           | US, France, Israel ... then who? Maybe another five eyes
           | country like the UK? Possibly China? I'm pretty skeptical
           | Russia would be able to get someone in there but maybe.
        
             | invaliduser wrote:
             | Hi. French here. I may be wrong, but I really feel like you
             | are overestimating us.
        
               | CSMastermind wrote:
                | DGSE essentially puts all of its money/effort into
               | industrial espionage and they're the best in the world at
               | it.
        
           | m3kw9 wrote:
           | I bet the NSA has dossier on every employee there as well
        
             | boringuser2 wrote:
             | "Cooperate or we'll kill your family".
             | 
             | (Just to be clear, this is a hypothetical intelligence
             | agent saying this, not me.)
             | 
             | I mean, it's not exactly rocket science, who wouldn't
             | instantly fold to that?
        
               | layer8 wrote:
               | Someone without family?
        
               | boringuser2 wrote:
               | You know the next step, right?
        
               | jameshart wrote:
               | ChatGPT responds by threatening to torture a simulation
               | of the agents' consciousness in the cloud for eternity?
               | 
               | (I mean, since we're just making up wild hypotheticals)
        
           | boringuser2 wrote:
           | Agent Lee Chen Huwang, reporting for duty.
        
       | twobitshifter wrote:
       | Left the best part until the end. Scaling models larger is still
       | paying off for openai. It's not AGI yet, but how much bigger will
       | a model need to get to max out?
       | 
       | >The scaling hypothesis is the idea that we may have most of the
       | pieces in place needed to build AGI and that most of the
       | remaining work will be taking existing methods and scaling them
       | up to larger models and bigger datasets. If the era of scaling
       | was over then we should probably expect AGI to be much further
       | away. The fact the scaling laws continue to hold is strongly
       | suggestive of shorter timelines.
        
       | vb-8448 wrote:
       | 7. find a sustainable business model and make some money
        
       | heyzk wrote:
       | Great writeup, this helps us understand where to spend our time
       | vs what OpenAI's progress will solve.
        
       | BeenAGoodUser wrote:
       | Nice to see they are working on reducing the pricing. GPT-4 is
       | just too expensive right now imo. A long conversation would
        | quickly end up costing tens of dollars if not more, so lower
        | model costs plus a stateful API are urgently needed. I think
       | even OpenAI will actually gain a lot by reducing the pricing,
       | right now I wouldn't be surprised if many uses of GPT-4 weren't
       | viable just because of the costs.
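A rough sketch of why long conversations add up without a stateful API: the full history must be resent as prompt tokens on every turn. The rates below are GPT-4's 8K-context pricing at the time ($0.03 per 1K prompt tokens, $0.06 per 1K completion tokens); `chat_cost` and the per-message token count are illustrative assumptions:

```python
# Back-of-the-envelope cost of a multi-turn chat when the whole history is
# resent on each request.

def chat_cost(turns, tokens_per_message=500,
              prompt_rate=0.03, completion_rate=0.06):
    """Total dollar cost of `turns` exchanges, resending history each turn."""
    cost, history = 0.0, 0
    for _ in range(turns):
        history += tokens_per_message          # user message joins history
        cost += history / 1000 * prompt_rate   # whole history sent as prompt
        cost += tokens_per_message / 1000 * completion_rate
        history += tokens_per_message          # assistant reply joins too
    return cost
```

Prompt cost grows quadratically with the number of turns: under these assumptions, a 50-turn conversation at 500 tokens per message already comes to about $39, which is why a stateful API (and cheaper models) matter so much here.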
        
         | Terretta wrote:
         | This is off by probably x10 or more.
         | 
         | Dozens of people using it daily for coding and conversations
         | and review in a month might be a couple hundred bucks. All day
         | convo, constantly, as fast as it can respond, might add up to
         | $5.
         | 
         | Not sure what kind of convo you're having that you could hit
         | $10 unless you're parallelizing with something like the
         | "guidance" tool or langchain.
        
           | refulgentis wrote:
            | Absolutely not. Dinner just got here, but tl;dr: GPT-4 is
            | $0.03 per ~750 words in and $0.06 per ~750 words out. People
            | expect the history to be included as well.
        
           | jiggawatts wrote:
           | The version of GPT 4 with 32K token context length is the
           | enabler for a huge range of "killer apps", but is even more
           | expensive than the 8K version.
           | 
           | And yes, parallelism and loops are also key enablers for
           | advanced use-cases.
           | 
           | For example, I have a lot of legacy code that needs
           | uplifting. I'd love to be able to run different prompts over
           | reams of code in parallel, iterating the prompts, etc...
           | 
           | The point of these things is that they're like humans you can
           | _clone_ at will.
           | 
           | The ability to point _thousands_ of these things at a code
           | base could be mindblowing.
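The fan-out idea above can be sketched in a few lines; `ask_llm` is a stand-in for whatever API client is actually used, not a real library call:

```python
# Run one uplift prompt over many source files concurrently.
from concurrent.futures import ThreadPoolExecutor

PROMPT = "Uplift this legacy code to modern idioms:\n\n{source}"

def uplift_files(sources, ask_llm, max_workers=8):
    """Fan the prompt out over the sources; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(
            lambda src: ask_llm(PROMPT.format(source=src)), sources))
```

A thread pool is enough here since the work is I/O-bound; in practice rate limits, retries, and per-file context-window budgeting are the real complications.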
        
       | purplecats wrote:
       | > Cheaper and faster GPT-4 -- This is their top priority. In
       | general, OpenAI's aim is to drive "the cost of intelligence" down
       | as far as possible and so they will work hard to continue to
       | reduce the cost of the APIs over time.
       | 
       | this certainly aligns with the massive (albeit subjective and
       | anecdotal) degradation in quality i've experienced with ChatGPT
       | GPT-4 over the past few weeks.
       | 
       | hopefully a superior (higher quality) alternative surfaces before
        | it's unusable. i'm not considering continuing my subscription at
       | this rate.
        
         | nonethewiser wrote:
          | I wonder if it actually is because they're tuning it to make it
          | less offensive (by their standards). That's the only explanation
          | I keep seeing repeated.
        
           | reaperman wrote:
           | I would be very surprised. Things that are very, very far
           | from that are also much worse. I'm having difficulty finding
           | the difference between GPT-3.5 and GPT-4 for a lot of my
           | programming tasks lately. It's noticeably degraded.
        
         | brucethemoose2 wrote:
         | Anthropic's Claude is said to be very good.
         | 
         | Instruction tuned LLaMA 65B/Falcon 40B are good, especially
         | with an embeddings database.
         | 
         | ...But OpenAI has all the name recognition and ease of use now,
         | so it might not even matter if others ambiguously surpass
         | OpenAI models.
        
       | ftxbro wrote:
       | why should I believe what someone says their plans are
        
       | cwkoss wrote:
       | I love the tongue-in-cheek paradox myth that the Bitcoin
       | whitepaper was written by a future god-AI to increase demand for
       | GPUs (and thus boost supply) so we are able to assemble the
       | future god-AI.
        
         | thelittleone wrote:
          | Conceptually this is paradoxical because of the notion that
          | time is linear, which it is to the best of our current
          | understanding.
        
         | mkoubaa wrote:
         | Then that god AI must have also pulled some strings for early
         | video games
        
         | cwkoss wrote:
         | Fun to imagine a time machine being built but the only thing it
         | can transmit backward in time is PDFs
        
         | tivert wrote:
         | > I love the tongue-in-cheek paradox myth that the Bitcoin
         | whitepaper was written by a future god-AI to increase demand
         | for GPUs (and thus boost supply) so we are able to assemble the
         | future god-AI.
         | 
          | I know it's a joke, but the hole in it is that the god-AI
          | couldn't have been that smart, since cryptocurrency mining
          | quickly switched to ASICs, muting the demand increase for
          | GPUs.
        
           | modernpink wrote:
            | Well, humans stopped using their brains to store all their
            | memories once they could dump data onto external media via
            | writing. Much like how crypto switching to ASICs frees up
            | GPU capacity for AGI, writing freed the brain to develop
            | higher GI.
        
           | cwkoss wrote:
            | I think there are some derivative coins that extended the
            | viability of GPU mining, but I've been out of the game for
            | a decade.
        
           | anamexis wrote:
           | But not before ramping up development and production of GPUs.
        
             | tivert wrote:
             | > But not before ramping up development and production of
             | GPUs.
             | 
              | Did the GPU manufacturers ever embrace cryptocurrency?
              | IIRC, they actually tried to discourage it (e.g. by
              | building throttling into mass-market models to discourage
              | their use for computation).
             | 
             | Also, the graphs here show a long-term downward trend, with
             | only a short-term sales blip 5 years ago due to
             | cryptocurrency: https://www.tomshardware.com/news/sales-of-
             | desktop-graphics-....
        
               | refulgentis wrote:
                | "Desktop" is a very, very key word there; I believe
                | that's why they repeat it so much. And it's all tongue
                | in cheek, and we all largely understand mining drove up
                | GPU demand.
        
         | barbazoo wrote:
         | I'd read that book!
        
           | majormajor wrote:
           | Hyperion/The Fall of Hyperion by Dan Simmons has something
           | similar.
        
           | baq wrote:
           | Watch Tenet.
        
             | skulk wrote:
             | I'm from the future, traveling backwards in time to tell
             | you to not watch Tenet.
        
               | barbazoo wrote:
               | Somehow I'm super sensitive to the audio (or might be
               | video) and start feeling nauseous after a short time. Is
               | there an explanation to this? I think it's that scratchy
               | humming background sound.
        
               | KineticLensman wrote:
               | Also lots of similar wisdom, from Percival Dunwoody,
               | Idiot Time Traveller from 1909 [0].
               | 
               | [0] https://www.gocomics.com/tomthedancingbug/2022/06/17
        
       | bobbyi wrote:
       | The roadmap here is completely focused on ChatGPT and GPT-4. I
       | wonder what portion of their resources is still going to other
       | areas (DALL-E, audio/ video processing, etc.)
        
         | cmelbye wrote:
         | Maybe some of those things that are currently separate projects
         | will eventually converge with a multimodal model.
        
       | sashank_1509 wrote:
        | Really great news to hear about a cheaper and faster GPT-4. As
        | a ChatGPT Plus subscriber, the most annoying thing is the 25
        | message limit every 3 hours; I really want that removed.
        | 
        | A bit sad to hear that the multimodal model will only come next
        | year. I was hoping to get it this year.
        | 
        | 100k to 1 million context length sounds phenomenal, especially
        | if it comes to GPT-4. I've used Claude's 100k context length
        | and I found it so useful that when I have large documents I
        | just default to Claude now.
        
         | alchemist1e9 wrote:
          | Do you have any tips on how you got access to Claude? I
          | submitted the access request but never got any email or any
          | contact.
        
           | refulgentis wrote:
           | Poe, I'm in the same boat btw
        
       | jasmer wrote:
        | It's absurd that people are still thinking that a language
        | model in which a bunch of tokens are indexed is some kind of
        | 'AGI'.
        
       | asnyder wrote:
        | His statements on open sourcing in this interview/write-up are
        | somewhat in conflict with his statement made last week in
        | Munich, where he explicitly said the frontier of GPT won't be
        | open sourced due to what they perceive as safety reasons:
        | https://youtu.be/uaQZIK9gvNo?t=1170 (19:30 - 22:00).
        
         | muskmusk wrote:
         | I don't see the conflict. They see _current_ models as mostly
         | harmless, but what comes next is dangerous.
         | 
         | It sounds a little too much sci-fi for me, but I guess he knows
         | better.
        
           | wintogreen74 wrote:
           | plus this conveniently pairs with "we don't need to regulate
           | current models, but future models... oh boy do those need to
           | be regulated!"
        
         | ilaksh wrote:
         | He was talking about open sourcing GPT-3. That is not the
         | frontier.
         | 
         | The frontier is the multimodal versions of GPT-4 which he just
         | said wasn't even going to public release until next year. Or
         | whatever they are on now which they are carefully not calling
         | GPT-5.
        
         | ftxbro wrote:
         | it's legal to make contradictory statements that's one of the
         | job of a ceo and it's why they aren't usually overly literal
         | types you know the kind i'm talking about
        
       | hervature wrote:
       | I never know if I have an inside scoop or an outside scoop. Has
       | Hyena not addressed the scaling of context length [1]? I know
       | this version is barely a month old but it was shared to me by a
        | non-engineer the week it came out. Still, giving interviews
        | where the person takes away that the main limitation is context
        | length, and that it requires a big breakthrough which may have
        | already happened, makes me seriously question whether or not he
        | is qualified to speak on behalf of OpenAI. Maybe he and OpenAI
        | are far beyond this paper and know it does not work, but surely
        | it should be addressed?
       | 
       | [1] - https://arxiv.org/pdf/2302.10866.pdf
        
         | arugulum wrote:
         | As someone who is in the field: papers proposing to solve the
         | context length problem come out every month. Almost none of the
         | solutions stick or work as well as a dense or mostly dense
         | model.
         | 
          | You'll know the problem is solved when model after model
          | consistently uses a method. Until then (and especially if
          | you're not in the field as a researcher), assume that every
          | paper claiming to tackle context length is simply a nice
          | proposal.
        
           | dr_dshiv wrote:
           | What about Meta's megabyte? Also nice proposal?
        
             | visarga wrote:
              | Yes. Solving context length has been attempted with
              | hundreds of different approaches, and yet most LLMs are
              | still almost identical to the original Transformer from
              | 2017.
              | 
              | Just to name a few families of approaches: Sparse
              | Attention, Hierarchical Attention, Global-Local
              | Attention, Sliding Window Attention, Locality-Sensitive
              | Hashing Attention, State Space Models, EMA-gated
              | attention.
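To make one of those families concrete: here's a minimal pure-Python sketch (an illustration, not from the thread) of the sliding-window idea, where each token attends only to a fixed number of recent tokens, so attention cost grows linearly with sequence length rather than quadratically.

```python
def sliding_window_mask(seq_len, window):
    """Boolean causal attention mask where token i attends only to
    itself and the previous `window - 1` tokens. A full causal mask
    would let token i attend to all i + 1 earlier positions."""
    return [
        [max(0, i - window + 1) <= j <= i for j in range(seq_len)]
        for i in range(seq_len)
    ]

mask = sliding_window_mask(6, 3)
# Tokens attended to per row caps at the window size:
print([sum(row) for row in mask])  # [1, 2, 3, 3, 3, 3]
```

With a full causal mask the per-row counts would keep growing (1, 2, 3, 4, 5, 6), which is where the quadratic cost comes from.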
        
               | Loquebantur wrote:
                | I assume there is a common point of failure?
                | 
                | Notably, human working memory isn't great either, which
                | raises the question (if the comparison is valid) of
                | whether that limitation might be fundamental.
        
         | [deleted]
        
         | [deleted]
        
       | m3kw9 wrote:
        | If you look at their API limits, no serious company can use
        | this to scale up beyond, say, 10k users: 3,500 requests per
        | minute for GPT-3.5 Turbo. They have a long way to go to make it
        | usable for the remaining 95%.
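A rough back-of-envelope on that limit; the per-user message rate here is an assumption for illustration, not a number from the comment.

```python
def max_concurrent_users(rate_limit_rpm, requests_per_user_per_min):
    """Rough ceiling on simultaneously active chat users under a
    shared requests-per-minute cap (ignores bursts and retries)."""
    return rate_limit_rpm // requests_per_user_per_min

# 3,500 requests/min cap; assume an active user sends about
# 2 messages per minute mid-conversation.
print(max_concurrent_users(3500, 2))  # 1750
```

So a few thousand concurrent users at most, consistent with the comment's point that a product with a large user base can't run on this limit without a quota increase.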
        
         | thorax wrote:
          | I've had to move to using the Azure OpenAI service during
          | business hours for the API; it's much more stable unless the
          | prompts stray into something a little odd and their API
          | censorship blocks the calls.
        
       ___________________________________________________________________
       (page generated 2023-05-31 23:00 UTC)