[HN Gopher] OpenAI's plans according to sama
___________________________________________________________________
OpenAI's plans according to sama
Author : razcle
Score : 179 points
Date : 2023-05-31 18:05 UTC (4 hours ago)
(HTM) web link (humanloop.com)
(TXT) w3m dump (humanloop.com)
| londons_explore wrote:
| > is limited by GPU availability.
|
| Which is all the more curious, considering OpenAI said this only
| in January:
|
| > Azure will remain the exclusive cloud provider for all OpenAI
| workloads across our research, API and products [1]
|
| So... OpenAI is severely GPU constrained, it is hampering their
| ability to execute, onboard customers to existing products and
| launch products. Yet they signed an agreement _not_ to just go
| rent a bunch of GPUs from AWS???
|
| Did someone screw up by not putting a clause in that contract
| saying "exclusive cloud provider, _unless you cannot fulfil our
| requests_ "?
|
| [1]: https://openai.com/blog/openai-and-microsoft-extend-
| partners...
| ilaksh wrote:
| AWS might not really have much extra GPU capacity for them
| anyway... Also, they would cost more.
|
| I think that there aren't a lot of GPUs available and it takes
| time to add more to the datacenter even when you do get them.
| carom wrote:
| I heard earlier this year that people were having trouble
| getting allocations on GCP as well. Probably why Nvidia is at
| $1T now.
| chaostheory wrote:
| Even if they weren't exclusive with Azure, aren't GPU prices
| reasonable again?
| verdverm wrote:
| They have to be available to buy, regardless of the price. My
| understanding is that there is a distinct lack of supply.
| londons_explore wrote:
| Perhaps they are cash flow constrained, which in turn means
| they are GPU constrained, since GPUs are their biggest
| expense?
| sebzim4500 wrote:
| >So... OpenAI is severely GPU constrained, it is hampering
| their ability to execute, onboard customers to existing
| products and launch products. Yet they signed an agreement not
| to just go rent a bunch of GPU's from AWS???
|
| > Did someone screw up by not putting a clause in that contract
| saying "exclusive cloud provider, unless you cannot fulfil our
| requests"?
|
| Maybe MSFT refused to sign such an agreement?
| catchnear4321 wrote:
| this has nothing to do with sama clamoring for regulation.
|
| that absolutely isn't an attempt to slow down all competition.
|
| which isn't necessary because nobody made such a mistake.
|
| this won't lead to any hasty or reckless internal decisions in
| a feckless effort to stay in front.
|
| not that any have already been made.
|
| not that that could lead to disaster.
| jiggawatts wrote:
| One of Azure's unique offerings is very large HPC clusters with
| GPUs. You can deploy ~1,000 node scale sets with very high
| speed networking. AWS has many single-server GPU offerings, but
| nothing quite like what Azure has.
|
| Don't assume Microsoft is bad at _everything_ and that AWS is
| automatically superior at all product categories...
| HarHarVeryFunny wrote:
| There's an interesting recent video here from Microsoft
| discussing Azure. The format is a bit cheesy, but lots of
| interesting information nonetheless.
|
| https://www.youtube.com/watch?v=Rk3nTUfRZmo&t=5s "What runs
| ChatGPT? Inside Microsoft's AI supercomputer"
|
| The relevance here is that Azure appears to be very well
| designed to handle the hardware failures that will inevitably
| happen during a training run taking weeks or months and using
| many thousands of GPUs... There's a lot more involved than just
| renting a bunch of Amazon GPUs, and anyways the partnership
| between OpenAI and Microsoft appears quite strategic, and can
| handle some build-out delays, especially if they are not
| Microsoft's fault.
| simse wrote:
| > A stateful API
|
| This would be huge for many applications, as "chatting" with
| GPT-4 gets really, really expensive very quickly. I've played
| with the API with friends, and winced as I watched my usage hit
| several dollars for just a bit of fun.
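A stateful API matters because the current chat API is stateless: every request resends the full conversation history, so billed input tokens grow roughly quadratically with the number of turns. A rough sketch of the cost curve (the ~500-token message size is an assumption; the $0.03/$0.06 per 1K token figures are OpenAI's published GPT-4 8K prices at the time):

```python
# Rough cost model for a stateless chat API: each turn resends the
# entire history, so billed input tokens grow with every exchange.

def chat_cost(turns, tokens_per_message=500,
              price_in_per_1k=0.03, price_out_per_1k=0.06):
    """Estimate the dollar cost of a conversation with `turns`
    user/assistant exchanges of ~tokens_per_message tokens each."""
    total = 0.0
    history = 0  # tokens of prior context resent with each request
    for _ in range(turns):
        prompt = history + tokens_per_message       # history + new user message
        completion = tokens_per_message             # assistant reply
        total += prompt / 1000 * price_in_per_1k
        total += completion / 1000 * price_out_per_1k
        history = prompt + completion               # reply joins the history
    return total

# ~$14 for a 30-turn conversation under these assumptions,
# almost all of it from resending history.
print(round(chat_cost(30), 2))
```

A stateful API that keeps the history server-side would remove most of that resent-input cost.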
| dontupvoteme wrote:
| Having been recently taken aboard by the mothership I expect
| they'll start trying to tune out anything related to programming
| to push people towards Copilot X...
|
| It's pretty hilarious and annoying to see Bing start to write
| code, only to censor itself after a few lines (deleting what
| was there! no wonder these guys love websockets and dynamic
| histories)
|
| Whoops!
| _boffin_ wrote:
| Wait... what? Can you elaborate?
| mistymountains wrote:
| He's speculating that Microsoft is nerfing OpenAI / chatGPT
| to funnel narrow capabilities to silos like CoPilot.
| _boffin_ wrote:
| I understand that... I should have specified a bit more:
| I'm interested in knowing more about the removal of
| answers as it's writing them, if they're code.
| dontupvoteme wrote:
| https://en.wikipedia.org/wiki/Embrace,_extend,_and_extinguis.
| ..
| _boffin_ wrote:
| yes... I know about this, but that's not what I'm asking
| about. I'm asking about it removing partial answers as it's
| writing them.
|
| Please make more effort next time than to provide me with a
| Wiki article.
| sharkjacobs wrote:
| > He reiterated his belief in the importance of open source and
| said that OpenAI was considering open-sourcing GPT-3. Part of the
| reason they hadn't open-sourced yet was that he was skeptical of
| how many individuals and companies would have the capability to
| host and serve large LLMs.
|
| Am I reading this right? "We're not open sourcing GPT-3 because
| we don't think it would be useful to anyone else"
| xxprogamerxy wrote:
| He wants the release of the model to primarily benefit
| individuals and smaller teams as opposed to large deep-pocketed
| firms.
| stavros wrote:
| Reads to me like "we don't know how many people will have
| hardware powerful enough to run this".
| razcle wrote:
| I think I worded this poorly. What he said was that a lot of
| people say they want open-source models but they underestimate
| how hard it is to serve them well. So he wondered how much real
| benefit would come from open-sourcing them.
|
| I think this is reasonable. Giving researchers access is great
| but for most small companies they're likely better off having a
| service provider manage inference for them rather than navigate
| the infra challenge.
| roganartu wrote:
| The beauty of open source is that the community will either
| figure out how to make it easier, or collectively decide it's
| not worth the effort. We saw this with stable diffusion, and
| we are seeing it with all the existing OSS LLMs.
|
| "It's too hard, trust us" doesn't really make sense in that
| context. If it is indeed too hard for small orgs to self host
| then they won't. Hiding behind the guise of protecting these
| people by not open sourcing it seems a bit disingenuous.
| greenie_beans wrote:
| lmao i had the same reaction. sounds like some bullshit.
| TigeriusKirk wrote:
| How can you sign a statement that AI presents an extinction
| risk on par with nuclear weapons and then even consider open
| sourcing your research?
|
| We don't provide nuclear weapons for everyone to keep in their
| basement, why would someone who believes AI is an existential
| risk provide their code?
| bibanez wrote:
| I agree, this is so bizarre
| ftxbro wrote:
| yes i also can't wrap my head around how a ceo of a billion
| dollar company isn't sincere in his public statements
| wintogreen74 wrote:
| Really? Even after saying this? "While Sam is calling for
| regulation of future models, he didn't think existing
| models were dangerous and thought it would be a big mistake
| to regulate or ban them."
| cinntaile wrote:
| It was a tongue in cheek reaction.
| ethanbond wrote:
| Why couldn't that be true? E.g. even scientists who
| worked on the Manhattan Project (justifiably) had
| antipathy toward the much more powerful hydrogen bomb.
|
| It's possible to think squirt guns shouldn't be regulated
| but AR-15s should, or AR-15s shouldn't but cruise
| missiles should. Or driving at 25mph should be allowed
| but driving 125mph shouldn't.
| RosanaAnaDana wrote:
| It's just a way to lie that doesn't sound as much like a lie.
| paxys wrote:
| More like - it won't be useful to small-time developers (since
| they won't have the capability to host and run it themselves)
| and so all the benefits will be reaped by AWS and other large
| players.
| TapWaterBandit wrote:
| When you stop listening to what Sam Altman says and just focus
| on what he does, you can see the guy is a bit of a snake.
| Greedy power-hungry man imho.
| sebzim4500 wrote:
| It is weird, but GPT-3 is worse than much smaller LLaMA models
| so I doubt it would see much use anyway.
| flangola7 wrote:
| Are you referring to DaVinci or ChatGPT-3.5?
| sebzim4500 wrote:
| DaVinci
| killjoywashere wrote:
| How do you measure this? Pointers to papers would be very
| helpful
| sebzim4500 wrote:
| The LLaMA paper had a bunch of comparisons
| candiddevmike wrote:
| OpenAI: Regulations must be passed to protect our moat
|
| Also OpenAI: Meta is pissing in our moat, let's drop a hint
| about open sourcing our shit too!
| [deleted]
| naillo wrote:
| > Plugins "don't have PMF"
|
| Probability mass functions? Anyone know what this means in this
| context?
| simonbutt wrote:
| Product market fit
| [deleted]
| sovietmudkipz wrote:
| I'm hoping GPT will remove the information cutoff date. I write
| plenty of terraform/AWS and it's a bit of a pain that the latest
| API isn't accessible by GPT yet.
|
| There's been quite a bit happening in the programming space since
| sept 2021.
|
| I use GPT to keep things high level and then do my normal
| research methodology for implementation details.
| mustacheemperor wrote:
| I enjoy using GPT4 as a co-programmer, and funny enough it is
| very challenging to get advice on Microsoft's own .NET MAUI
| because that framework was in prerelease at the time the model
| was trained.
|
| My understanding is right now they essentially need to train a
| new model on a new updated corpus to fix this, but maybe some
| other techniques could be devised...or they'll train something
| more up to date.
| ilaksh wrote:
| You might actually get pretty far if you just went through
| the Microsoft docs and created a bunch of really concise
| examples and fed that as the start of the prompt. Use like
| 6-7kb for that and then the question at the end.
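The suggestion above can be sketched as a simple prompt builder: greedily pack concise documentation snippets into a fixed byte budget, then put the question at the end. The snippet texts, the ~7 KB budget, and the formatting are all illustrative assumptions:

```python
# Pack concise doc examples into a byte budget, question last.
# The budget and snippet format are illustrative assumptions.

def build_prompt(snippets, question, budget_bytes=7000):
    """Greedily include doc snippets until the byte budget is spent,
    then append the user's question."""
    parts, used = [], 0
    for s in snippets:
        size = len(s.encode("utf-8")) + 2  # +2 for the blank-line separator
        if used + size > budget_bytes:
            break
        parts.append(s)
        used += size
    parts.append("Question: " + question)
    return "\n\n".join(parts)

docs = [
    "MAUI example: a minimal ContentPage with a Button ...",
    "MAUI example: binding a Label to a view model property ...",
]
prompt = build_prompt(docs, "How do I navigate between pages in .NET MAUI?")
```

The resulting string would then be sent as the prompt (or system message) of an API request.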
| mustacheemperor wrote:
| I have had some luck doing exactly that, and not even as
| efficiently as you describe - If my question is limited
| enough that the discussion won't overwhelm the context
| window I've found I can just paste in big chunks of the
| docs wholesale like a 'zero shot.'
| buildbot wrote:
| Context drift! https://qntm.org/mmacevedo
| ilaksh wrote:
| As stated, your request is entirely impossible. They cannot
| simply "remove the cut-off date". It takes months and huge
| amounts of hardware to train. Then they do the reinforcement
| adjustments on top of it while researching how to train the
| next batch.
| furyofantares wrote:
| It's not like an arbitrary imposition, that's the data it was
| trained on and it's expensive to train. I hope they find a way
| to continually train in new information too but it's not like
| they can just remove the cutoff date.
| hoschicz wrote:
| - they are working on a stateful API
| - they are working on a cheaper version of GPT-4
|
| Most probably this is driven by their use of it in ChatGPT, which
| is on fire from PMF. Clearly they're experimenting with the
| cheaper GPT-4 in ChatGPT right now as it's fairly turbo now, as
| discussed earlier today.
| atemerev wrote:
| All AI companies (OpenAI included) are now working full tilt on
| making AIs improve themselves (writing their own code, inventing
| new pipelines, etc.). I don't know why they would choose anything
| else to work on. This is the prime directive that will bring the
| greatest payoff.
| ren_engineer wrote:
| if this was currently possible wouldn't it lead to
| sentient/superhuman AI rapidly?
|
| >tell AI to make itself more efficient by finding performance
| improvements in human written code
|
| >that newly available processing power can now be used to find
| more ways to improve itself
|
| >flywheel effect of AI improving itself as it gets smarter and
| smarter
|
| eventually you'd turn it loose on improving the actual hardware
| it runs on. I think the question now is really how far
| transformers can be taken and if they are really the path to
| "real" AI.
| ilaksh wrote:
| Within a couple of years, improvement processes like you
| suggest will actually be really dangerous and stupid.
|
| Also don't confuse all other types of human/animal
| characteristics like sentience with intelligence. They are
| different things. Things like sentience, subjective stream of
| experience, or other aspects of being alive don't just
| accidentally fall out of larger training datasets.
|
| And we should be glad. The models are going to be orders of
| magnitude faster (and perhaps X times higher IQ) than humans
| within a few years. It is incredibly foolish to try to make
| something like that into a living creature (or emulation of
| living).
| visarga wrote:
| Intelligence is about action, and sentience is about
| qualia, which I equate to perceptions coloured by values.
| Action is visible and qualia are hidden, but they are
| closely interconnected: we choose our actions in accordance
| with our values and situation at hand.
| m3kw9 wrote:
| I think they are at least 1-2 new big research breakthroughs(on
| the level of Attention) away from having this.
| huijzer wrote:
| I disagree, since GPUs are currently a major constraint and
| skilled specialists almost always outperform GPT-4 as long as
| they stay in their domain.
|
| Will they use copilot(s) to improve the models? Yes, but they
| have been doing that since 2021 already (the release year of
| GitHub Copilot).
| flakiness wrote:
| If they open up fine-tuning API for their latest models, I wonder
| how the enthusiasm around the open source model is impacted. One
| of the advantages of the open source models is the ability to be
| fine-tuned. Are other benefits enough to keep the momentum going?
| m3kw9 wrote:
| You'd better have deep pockets. Have you checked the prices and
| the rates for using the tuned models? They're 10x to
| 100x more expensive than non-tuned models.
| dr_dshiv wrote:
| > OpenAI will avoid competing with their customers -- other than
| with ChatGPT. Quite a few developers said they were nervous about
| building with the OpenAI APIs when OpenAI might end up releasing
| products that are competitive to them. Sam said that OpenAI would
| not release more products beyond ChatGPT. He said there was a
| history of great platform companies having a killer app and that
| ChatGPT would allow them to make the APIs better by being
| customers of their own product. The vision for ChatGPT is to be a
| super smart assistant for work but there will be a lot of other
| GPT use-cases that OpenAI won't touch.
|
| Can anyone elaborate on this? This is a big issue for me.
| ilaksh wrote:
| I think the tricky part for me is that "work" is extremely
| broad and now that ChatGPT has plugins, it can kind of do
| anything. Heh.
| jiggawatts wrote:
| Is this guy Aes Sedai?
|
| _Technically_ he can claim that OpenAI will not release
| competing products while Microsoft plugs AI into _everything_.
|
| Microsoft just announced at Build 2023 that they'll have OpenAI
| tech integrated with: Windows, Bing, Outlook, Word, Teams,
| Visual Studio, Visual Studio Code, Microsoft Fabric, Dynamics,
| GitHub, Azure DevOps, and Logic Apps. I probably missed a
| bunch.
|
| Very soon now, _everything_ Microsoft sells will have OpenAI
| integration.
|
| Unless you're selling a niche product too small for Microsoft
| to bother with, you're competing directly against OpenAI.
|
| Oh, and to top it off: Microsoft can use GPT 4 all they want,
| via API access. Third parties have to _beg and plead_ to get
| rate-limited access. That access can be withdrawn at any time
| if you're doing something unsafe to OpenAI's profit margins.
|
| "Please Sir Sam, may I have some GPT please?"
|
| "No."
| yesimahuman wrote:
| The bit about plugins not having PMF is interesting and possibly
| flawed. I, like many others, got access to plugins but not the
| browsing or code interpreter plugins which feel like the bedrock
| plugins that make the whole offering useful. I think there's also
| just education that has to happen to teach users how to
| effectively use other plugins, and the UX isn't really there to
| help new users figure out what to even do with plugins.
| gistbug wrote:
| Yea, seems weird to allow people to use plugins, but not all of
| them. Then they have the gall to say that no one is using
| plugins; yeah, because half of them don't have any context
| outside of America.
| lumost wrote:
| I tried the plugins - they honestly didn't seem to work very
| well. GPT-4 wasn't sure when it could use a plugin, or when
| it should talk about how it would do something. I wasn't able
| to get the plugins to activate most of the time.
| CSMastermind wrote:
| Have you found plugins to be useful?
|
| For what it's worth I've found the model actually performs
| significantly worse at most tasks when given access to
| browsing, in part because it relies on that instead of its own
| built-in knowledge.
|
| I haven't found a good way to have it only access the web for
| specific parts of its response.
| verdverm wrote:
| Most of the plugins are garbage and for those that aren't, most
| seem like they would be better as a chat like experience in the
| original app than the OpenAI app
| furyofantares wrote:
| PMF meaning "product market fit"? I had to look it up, curious
| if I found the right thing or not.
| nico wrote:
| Thought it meant Pull My Finger
| typest wrote:
| Yes, PMF = "product market fit".
| wsgeorge wrote:
| Had the same reaction. I was just about Googling it when it
| hit. Funny how the brain can work out a random acronym given
| context.
| cryptoz wrote:
| Great content and great answers except for open source question.
| Sam is saying that he doesn't think anyone would be able to run
| the code at scale so they didn't bother? Seems like a nonsense
| answer, maybe I'm misunderstanding. The ability for individuals
| or businesses to effectively run and host the code shouldn't have
| an impact on the ability to open source.
| boringuser2 wrote:
| >Dedicated capacity offering is limited by GPU availability.
| OpenAI also offers dedicated capacity, which provides customers
| with a private copy of the model. To access this service,
| customers must be willing to commit to a $100k spend upfront.
|
| How many shell corporations are intelligence agencies seeding
| right now?
| m3kw9 wrote:
| They are not gonna give the weights for sure, but it will still
| be inferencable. I'm not sure how, but it'd be self-destructive
| if they did.
| boringuser2 wrote:
| Exactly, with a private model you could easily extract the
| weights.
| MacsHeadroom wrote:
| Private instance means a dedicated endpoint fully managed by
| OpenAI. You do not get model access or anything a regular API
| user doesn't already get, except your API url will be something
| like customer123.openai.com/api instead of api.openai.com/api
| ftxbro wrote:
| I had been putting theories in comments but they kept getting
| flagged or banned or downvoted to oblivion, but maybe its time
| has come. I'll keep it tame. If you are curious you can google
| connections of OpenAI board of directors, Will Hurd, In-Q-Tel
| trustees, Allen and Company, etc. There is more but whatever.
| The conspiracy theory is that 'the govt stepped in' during the
| six month pause after gpt-4 was trained and before it was
| released.
| refulgentis wrote:
| It probably keeps getting flagged because it's ahistorical,
| source: OpenAI engineers, and #2 somewhat obviously so. You
| heard of RLHF?
| ftxbro wrote:
| > You heard of RLHF?
|
| The conspiracy theory isn't that every employee of OpenAI
| only had meetings with govt agencies for 8 hours every day
| for six months.
| cwkoss wrote:
| Last night I was musing how many different countries'
| intelligence agencies have moles working at OpenAI currently.
| Gotta be at least 6, maybe as high as two dozen?
| CSMastermind wrote:
| US, France, Israel ... then who? Maybe another five eyes
| country like the UK? Possibly China? I'm pretty skeptical
| Russia would be able to get someone in there but maybe.
| invaliduser wrote:
| Hi. French here. I may be wrong, but I really feel like you
| are overestimating us.
| CSMastermind wrote:
| DGSE essentially puts all of its money/effort into
| industrial espionage and they're the best in the world at
| it.
| m3kw9 wrote:
| I bet the NSA has dossier on every employee there as well
| boringuser2 wrote:
| "Cooperate or we'll kill your family".
|
| (Just to be clear, this is a hypothetical intelligence
| agent saying this, not me.)
|
| I mean, it's not exactly rocket science, who wouldn't
| instantly fold to that?
| layer8 wrote:
| Someone without family?
| boringuser2 wrote:
| You know the next step, right?
| jameshart wrote:
| ChatGPT responds by threatening to torture a simulation
| of the agents' consciousness in the cloud for eternity?
|
| (I mean, since we're just making up wild hypotheticals)
| boringuser2 wrote:
| Agent Lee Chen Huwang, reporting for duty.
| twobitshifter wrote:
| Left the best part until the end. Scaling models larger is still
| paying off for openai. It's not AGI yet, but how much bigger will
| a model need to get to max out?
|
| >The scaling hypothesis is the idea that we may have most of the
| pieces in place needed to build AGI and that most of the
| remaining work will be taking existing methods and scaling them
| up to larger models and bigger datasets. If the era of scaling
| was over then we should probably expect AGI to be much further
| away. The fact the scaling laws continue to hold is strongly
| suggestive of shorter timelines.
| vb-8448 wrote:
| 7. find a sustainable business model and make some money
| heyzk wrote:
| Great writeup, this helps us understand where to spend our time
| vs what OpenAI's progress will solve.
| BeenAGoodUser wrote:
| Nice to see they are working on reducing the pricing. GPT-4 is
| just too expensive right now imo. A long conversation would
| quickly end up costing tens of dollars if not more, so less
| expensive model costs + stateful API is urgently needed. I think
| even OpenAI will actually gain a lot by reducing the pricing,
| right now I wouldn't be surprised if many uses of GPT-4 weren't
| viable just because of the costs.
| Terretta wrote:
| This is off by probably x10 or more.
|
| Dozens of people using it daily for coding and conversations
| and review in a month might be a couple hundred bucks. All day
| convo, constantly, as fast as it can respond, might add up to
| $5.
|
| Not sure what kind of convo you're having that you could hit
| $10 unless you're parallelizing with something like the
| "guidance" tool or langchain.
| refulgentis wrote:
| Absolutely not. Dinner just got here, but tl;dr: GPT-4 is $0.03
| per 750 words in, $0.06 per 750 words out. People expect the
| history to be included as well.
| jiggawatts wrote:
| The version of GPT 4 with 32K token context length is the
| enabler for a huge range of "killer apps", but is even more
| expensive than the 8K version.
|
| And yes, parallelism and loops are also key enablers for
| advanced use-cases.
|
| For example, I have a lot of legacy code that needs
| uplifting. I'd love to be able to run different prompts over
| reams of code in parallel, iterating the prompts, etc...
|
| The point of these things is that they're like humans you can
| _clone_ at will.
|
| The ability to point _thousands_ of these things at a code
| base could be mindblowing.
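Fanning prompts out over reams of legacy code first requires splitting files into pieces that fit the model's context window. A minimal chunker sketch, using the rough (assumed) heuristic of ~4 characters per token:

```python
def chunk_source(text, max_tokens=6000, chars_per_token=4):
    """Split source text into chunks that each fit within a model's
    context window, breaking only at line boundaries."""
    limit = max_tokens * chars_per_token
    chunks, cur, size = [], [], 0
    for line in text.splitlines(keepends=True):
        if size + len(line) > limit and cur:
            chunks.append("".join(cur))
            cur, size = [], 0
        cur.append(line)
        size += len(line)
    if cur:
        chunks.append("".join(cur))
    return chunks

# Each chunk could then be sent to the model in parallel with the same
# uplift prompt, e.g. via a thread pool or async requests.
```

Joining the chunks reproduces the original text, so per-chunk results can be stitched back together in order.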
| purplecats wrote:
| > Cheaper and faster GPT-4 -- This is their top priority. In
| general, OpenAI's aim is to drive "the cost of intelligence" down
| as far as possible and so they will work hard to continue to
| reduce the cost of the APIs over time.
|
| this certainly aligns with the massive (albeit subjective and
| anecdotal) degradation in quality i've experienced with ChatGPT
| GPT-4 over the past few weeks.
|
| hopefully a superior (higher quality) alternative surfaces before
| it's unusable. i'm not planning to continue my subscription at
| this rate.
| nonethewiser wrote:
| I wonder if it actually is because they're tuning it to make it
| less offensive (by their standards). That's the only explanation
| I keep seeing repeated.
| reaperman wrote:
| I would be very surprised. Things that are very, very far
| from that are also much worse. I'm having difficulty finding
| the difference between GPT-3.5 and GPT-4 for a lot of my
| programming tasks lately. It's noticeably degraded.
| brucethemoose2 wrote:
| Anthropic's Claude is said to be very good.
|
| Instruction tuned LLaMA 65B/Falcon 40B are good, especially
| with an embeddings database.
|
| ...But OpenAI has all the name recognition and ease of use now,
| so it might not even matter if others ambiguously surpass
| OpenAI models.
| ftxbro wrote:
| why should I believe what someone says their plans are
| cwkoss wrote:
| I love the tongue-in-cheek paradox myth that the Bitcoin
| whitepaper was written by a future god-AI to increase demand for
| GPUs (and thus boost supply) so we are able to assemble the
| future god-AI.
| thelittleone wrote:
| Conceptually this is paradoxical because of the notion that
| time is linear. Which it is to the best of our current
| understanding.
| mkoubaa wrote:
| Then that god AI must have also pulled some strings for early
| video games
| cwkoss wrote:
| Fun to imagine a time machine being built but the only thing it
| can transmit backward in time is PDFs
| tivert wrote:
| > I love the tongue-in-cheek paradox myth that the Bitcoin
| whitepaper was written by a future god-AI to increase demand
| for GPUs (and thus boost supply) so we are able to assemble the
| future god-AI.
|
| I know it's a joke, but the hole is that the god-AI couldn't have
| been that smart, since cryptocurrency mining quickly switched
| to ASICs, which muted the demand increase for GPUs.
| modernpink wrote:
| Well, humans switched from using brains to store all
| their memories once they could dump data onto external media
| via writing. Much like how crypto switching to ASICs frees up
| GPU capacity for AGI, writing freed the brain to develop
| higher GI.
| cwkoss wrote:
| I think there are some derivative coins that extended the
| viability of GPU mining, but I've been out of the game for a
| decade.
| anamexis wrote:
| But not before ramping up development and production of GPUs.
| tivert wrote:
| > But not before ramping up development and production of
| GPUs.
|
| Did the GPU manufacturers ever embrace cryptocurrency? IIRC,
| they actually tried to discourage it (e.g. by building
| throttling into mass-market models to discourage their use
| for computation).
|
| Also, the graphs here show a long-term downward trend, with
| only a short-term sales blip 5 years ago due to
| cryptocurrency: https://www.tomshardware.com/news/sales-of-
| desktop-graphics-....
| refulgentis wrote:
| Desktops is a very very key word there, I believe that's
| why they repeat it so much. And it's all tongue in cheek,
| and we all largely understand mining drove up gpu demand
| barbazoo wrote:
| I'd read that book!
| majormajor wrote:
| Hyperion/The Fall of Hyperion by Dan Simmons has something
| similar.
| baq wrote:
| Watch Tenet.
| skulk wrote:
| I'm from the future, traveling backwards in time to tell
| you to not watch Tenet.
| barbazoo wrote:
| Somehow I'm super sensitive to the audio (or might be
| video) and start feeling nauseous after a short time. Is
| there an explanation to this? I think it's that scratchy
| humming background sound.
| KineticLensman wrote:
| Also lots of similar wisdom, from Percival Dunwoody,
| Idiot Time Traveller from 1909 [0].
|
| [0] https://www.gocomics.com/tomthedancingbug/2022/06/17
| bobbyi wrote:
| The roadmap here is completely focused on ChatGPT and GPT-4. I
| wonder what portion of their resources is still going to other
| areas (DALL-E, audio/ video processing, etc.)
| cmelbye wrote:
| Maybe some of those things that are currently separate projects
| will eventually converge with a multimodal model.
| sashank_1509 wrote:
| Really great news to hear about a cheaper and faster GPT-4. As a
| GPT+ subscriber, the most annoying thing is the 25-message limit
| every 3 hours; I really want that removed.
|
| A bit sad to hear that the multimodal model will only come next
| year, was hoping to get it this year
|
| 100k to 1 Million context length, sounds phenomenal especially if
| it comes to GPT4. I've used Claude's 100k context length and I
| found it so useful that when I have large documents I just
| default to Claude now
| alchemist1e9 wrote:
| Do you have any tips on how you got access to Claude? I did the
| access request but never got any email or any contact.
| refulgentis wrote:
| Poe, I'm in the same boat btw
| jasmer wrote:
| It's absurd that people are still thinking that a language model
| in which a bunch of tokens are indexed is some kind of 'AGI'.
| asnyder wrote:
| His statements on open sourcing in this interview/write-up are
| somewhat in conflict with his recent statement made last week in
| Munich, where he explicitly said the frontier of GPT won't be
| open sourced due to what they perceive as safety reasons:
| https://youtu.be/uaQZIK9gvNo?t=1170 (19:30 - 22:00).
| muskmusk wrote:
| I don't see the conflict. They see _current_ models as mostly
| harmless, but what comes next is dangerous.
|
| It sounds a little too much sci-fi for me, but I guess he knows
| better.
| wintogreen74 wrote:
| plus this conveniently pairs with "we don't need to regulate
| current models, but future models... oh boy do those need to
| be regulated!"
| ilaksh wrote:
| He was talking about open sourcing GPT-3. That is not the
| frontier.
|
| The frontier is the multimodal versions of GPT-4 which he just
| said wasn't even going to public release until next year. Or
| whatever they are on now which they are carefully not calling
| GPT-5.
| ftxbro wrote:
| it's legal to make contradictory statements that's one of the
| jobs of a ceo and it's why they aren't usually overly literal
| types you know the kind i'm talking about
| hervature wrote:
| I never know if I have an inside scoop or an outside scoop. Has
| Hyena not addressed the scaling of context length [1]? I know
| this version is barely a month old but it was shared to me by a
| non-engineer the week it came out. Still, giving interviews where
| the person takes away that the main limitation is context length
| and requires a big breakthrough that already happened makes me
| seriously question whether or not he is qualified to speak on
| behalf of OpenAI. Maybe he and OpenAI are far beyond this paper
| and know it does not work but surely it should be addressed?
|
| [1] - https://arxiv.org/pdf/2302.10866.pdf
| arugulum wrote:
| As someone who is in the field: papers proposing to solve the
| context length problem come out every month. Almost none of the
| solutions stick or work as well as a dense or mostly dense
| model.
|
| You'll know the problem is solved when model after model
| consistently uses a method. Until then (and especially if you're
| not in the field as a researcher), assume that every paper
| claiming to tackle context length is simply a nice proposal.
| dr_dshiv wrote:
| What about Meta's megabyte? Also nice proposal?
| visarga wrote:
| Yes. Solving context length has been tried in hundreds of
| different approaches, and yet most LLMs are almost
| identical to the original one from 2017.
|
| Just to name a few families of approaches: Sparse
| Attention, Hierachical Attention, Global-Local
| Attention,Sliding Window Attention, Locality sensitive
| hashing Attention, State space model, EMA gated attention.
| Loquebantur wrote:
| I assume, there is a common point of failure?
|
| Notably, human working memory isn't great either. Which
| begs the question (if the comparison is valid) as to
| whether that limitation might be fundamental.
| [deleted]
| [deleted]
| m3kw9 wrote:
| If you look at their API limits, no serious company can use this
| to scale up beyond, say, 10k users: 3,500 requests per minute for
| GPT-3.5 Turbo. They have a long way to go to make it usable for
| the rest of the 95%.
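Until those limits rise, staying under a per-minute cap has to happen client-side. A minimal sliding-window throttle sketch (the 3,500/min figure is the one quoted above; actual quotas vary by account and model):

```python
import time
from collections import deque

class MinuteRateLimiter:
    """Client-side throttle for an API with a per-minute request cap."""

    def __init__(self, max_per_minute=3500, clock=time.monotonic):
        self.max = max_per_minute
        self.clock = clock          # injectable for testing
        self.stamps = deque()       # timestamps of recent requests

    def wait_time(self):
        """Seconds to wait before the next request is allowed (0 if now)."""
        now = self.clock()
        # Drop timestamps older than the 60-second window.
        while self.stamps and now - self.stamps[0] >= 60:
            self.stamps.popleft()
        if len(self.stamps) < self.max:
            return 0.0
        return 60 - (now - self.stamps[0])

    def record(self):
        """Call after each request is actually sent."""
        self.stamps.append(self.clock())
```

A caller would sleep for `wait_time()` seconds before each request and call `record()` after it; production code would also retry on HTTP 429 with backoff.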
| thorax wrote:
| I've had to move to using Azure OpenAI service during business
| hours for the API-- much more stable unless the prompts stray
| into something a little odd and their API censorship blocks the
| calls.
___________________________________________________________________
(page generated 2023-05-31 23:00 UTC)