[HN Gopher] GPT-Neo - Building a GPT-3-sized model, open source ...
___________________________________________________________________
GPT-Neo - Building a GPT-3-sized model, open source and free
Author : sieste
Score : 638 points
Date : 2021-01-18 09:13 UTC (13 hours ago)
(HTM) web link (www.eleuther.ai)
(TXT) w3m dump (www.eleuther.ai)
| rexreed wrote:
| Once Microsoft became the sole and exclusive licensee of GPT-3, a
| credible open source effort was bound to emerge, and I hope it
| really is an effective alternative.
| fakedang wrote:
| Eleuther works with Google. I'd prefer the Microsoft demon to
| the Google demon.
| jne2356 wrote:
| The resources for the GPT-3 replication will be provided by
| the cloud company CoreWeave.
| kashyapc wrote:
| Incidentally, over the weekend I listened to a two-hour-long
| presentation[1] by the cognitive scientist, Mark Turner[2]. The
| talk's main goal was to explore the question _"Does cognitive
| linguistics offer a way forward for second-language teaching?"_
|
| During the Q&A, Turner explicitly mentions GPT-3 (that's when I
| first heard of it) as a "futuristic (but possible) language-
| learning environment" that is likely to be a great boon for
| second-language learners. One of the appealing points seems to be
| "conversation [with GPT-3] is not scripted; it keeps going on any
| subject you like". Thus allowing you to simulate some bits of the
| gold standard (immersive language learning in the real world).
|
| As an advanced Dutch learner (my fourth language), I'm curious
| about these approaches. And glad to see this open source model. (It
| is beyond ironic that the so-called "Open AI" turned out to have
| dubious ethics.)
|
| [1] https://www.youtube.com/watch?v=A4Q977p8PfQ
|
| [2] Check out the _excellent_ book he co-authored, _"Clear and
| Simple as the Truth"_--it has valuable insights on improving
| writing based on some robust research.
| Jack000 wrote:
| Hope there will be a distilled or approximate-attention version
| so it can be run on consumer GPUs.
| wumms wrote:
| https://github.com/EleutherAI/gpt-neo (Couldn't find it on the
| website)
| matteocapucci wrote:
| Do you know how one can donate?
| newswasboring wrote:
| The intention behind it is pretty good. Best of luck to them.
|
| I wonder if I can donate computing power to this remotely, like
| the old SETI or protein-folding projects: use idle CPU time to
| compute for the network. The estimates I have seen of how much
| compute it would take to train these models are enormous.
| mryab wrote:
| Not directly related, but the Learning@home [1] project aims to
| achieve precisely that goal of public, volunteer-trained neural
| networks. The idea is that you can host separate "experts," or
| parts of your model (akin to Google's recent Switch
| Transformers paper) on separate computers.
|
| This way, you never have to synchronize the weights of the
| entire model across the participants -- you only need to send
| the gradients/activations to a set of peers. Slow connections
| are mitigated with asynchronous SGD and unreliable/disconnected
| experts can be discarded, which makes it more suitable for
| Internet-like networks.
|
| Disclaimer: I work on this project. We're currently
| implementing a prototype, but it's not yet GPT-3 sized. Some
| issues like LR scheduling (crucial for Transformer convergence)
| and shared parameter averaging (for gating etc.) are tricky to
| implement for decentralized training over the Internet.
|
| [1] https://learning-at-home.github.io/
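|
| To make the idea concrete, here is a minimal sketch in plain
| PyTorch (a local toy simulation, not the actual Learning@home /
| hivemind API): each "expert" is a small network that could live
| on its own peer, and top-1 gating means only the selected
| expert's activations and gradients would ever cross the network.
|
|   import torch
|   import torch.nn as nn
|
|   class Expert(nn.Module):
|       # One shard of the model; in practice hosted by a volunteer peer.
|       def __init__(self, dim):
|           super().__init__()
|           self.ff = nn.Sequential(nn.Linear(dim, 4 * dim),
|                                   nn.ReLU(),
|                                   nn.Linear(4 * dim, dim))
|
|       def forward(self, x):
|           return self.ff(x)
|
|   class Router(nn.Module):
|       # Top-1 gating, Switch-Transformer style: each input row goes
|       # to exactly one expert, so only that expert's activations and
|       # gradients would need to travel to that expert's host.
|       def __init__(self, dim, n_experts):
|           super().__init__()
|           self.gate = nn.Linear(dim, n_experts)
|           self.experts = nn.ModuleList(
|               [Expert(dim) for _ in range(n_experts)])
|
|       def forward(self, x):
|           scores = self.gate(x).softmax(dim=-1)  # [batch, n_experts]
|           top = scores.argmax(dim=-1)            # chosen expert per row
|           out = torch.zeros_like(x)
|           for i, expert in enumerate(self.experts):
|               mask = top == i
|               if mask.any():
|                   # Only the selected rows are sent to (and returned
|                   # from) this expert.
|                   out[mask] = expert(x[mask]) * scores[mask, i:i + 1]
|           return out
|
|   layer = Router(dim=64, n_experts=8)
|   y = layer(torch.randn(32, 64))  # each row only touches one expert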
| echelon wrote:
| Do you have a personal Twitter account I can follow? Your
| career is one I'd like to follow.
| teruakohatu wrote:
| TensorFlow supports distributed training with a client server
| model.
| newswasboring wrote:
| Does it also solve the problem of everyone having different
| hardware?
| londons_explore wrote:
| It does.
|
| For most models, your home broadband would be far too slow
| though.
| emteycz wrote:
| What about some kind of sharding, where parts of the
| computation could be executed in isolation for a
| longer period of time?
| Filligree wrote:
| An ongoing research problem. OpenAI would certainly like
| being able to use smaller GPUs, instead of having to fit
| the entire model into one.
| jne2356 wrote:
| GPT-3 does not fit in any one GPU that exists at present.
| It's already spread out across multiple GPUs.
| newswasboring wrote:
| Is it because they will have to communicate back errors
| during training? I forgot that training these models is
| more of a global task than protein folding. In that
| sense this is less parallelizable over the internet.
| londons_explore wrote:
| Yes, and also activations if your GPU is too small to fit
| the whole model. The minimum useful bandwidth for that
| stuff is a few gigabits...
| jne2356 wrote:
| They get this suggestion a lot. There's a section in their FAQ
| that explains why it's infeasible.
|
| https://github.com/EleutherAI/info
| chillee wrote:
| The primary issue is that large-scale GPU training is dominated
| by communication costs. Since, to some approximation, everything
| needs to be synchronized after every gradient update, it very
| quickly becomes infeasible to add communication overhead.
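|
| A rough back-of-envelope sketch (the parameter count, gradient
| precision and link speed below are illustrative assumptions, not
| measured figures) of why home broadband doesn't work for this:
|
|   # Back-of-envelope: one full gradient synchronization for a
|   # GPT-3-scale model over a home internet link.
|   params = 175e9            # ~175B parameters (GPT-3 scale)
|   bytes_per_grad = 2        # fp16 gradients, 2 bytes per parameter
|   grad_bytes = params * bytes_per_grad           # ~350 GB per sync
|
|   link_bps = 1e9            # an optimistic 1 Gbit/s home connection
|   seconds_per_sync = grad_bytes * 8 / link_bps   # bits / bits-per-sec
|
|   print(f"{grad_bytes / 1e9:.0f} GB per gradient sync")
|   print(f"{seconds_per_sync / 3600:.1f} hours to ship gradients once")
|   # ~350 GB and ~0.8 hours per step -- training needs hundreds of
|   # thousands of steps, which is why datacenter interconnects are
|   # used instead of volunteer broadband.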
| danwills wrote:
| Yeah! Sounds great. If I could easily run a SETI-at-home-style
| thing to contribute to training a (future) model similar to
| GPT-x, but with the result freely available to play with, I
| reckon I'd do it. It could even be made into a game, I'd say! I
| am totally aware that GPT-3 itself can't be run on a regular
| workstation, but maybe hosting for instances of models from the
| best/most-interesting training runs could be worked out by
| crowd-funding?
| colobas wrote:
| I was just gonna propose something like this. Democratizing
| large ML models.
| hooande wrote:
| In my experience, the output from GPT-3, DALL-E, et al is similar
| to what you get from googling the prompt and stitching together
| snippets from the top results. These transformers are trained on
| "what was visible to google", which provides the limitation on
| their utility.
|
| I think of the value proposition of GPT-X as "what would you do
| with a team of hundreds of people who can solve arbitrary
| problems only by googling them?". And honestly, not a lot of
| productive applications come to mind.
|
| This is basically the description of a modern content farm. You
| give N people a topic, e.g. "dating advice", and they'll use Google
| to put together different ideas, sentences and paragraphs to
| produce dozens of articles per day. You could also write very
| basic code with this, similar to googling a code snippet and
| combining the results from the first several stackoverflow pages
| that come up (which, incidentally, is how I program now). After a
| few more versions, you could probably use GPT to produce fiction
| that matches the quality of the average self published ebook. And
| DALL-E can come up with novel images in the same way that a
| graphic designer can visually merge the google image results for
| a given query.
|
| One limitation of this theoretical "team of automated googlers"
| is that the content they "search" is frozen at the date of the
| last GPT model update. Right now the big news story is the Jan
| 6th, 2021 insurrection at the US Capitol. GPT-3 can produce
| infinite bad articles about politics, but won't be able to say
| anything about current events in real time.
|
| I generally think that GPT-3 is awesome, and it's a damn shame
| that "Open"AI couldn't find a way to actually be open. At this
| point, it seems like a very interesting technology that is still
| in desperate need of a Killer App.
| devonkim wrote:
| The current problem is that we don't have a reliable, scalable
| way to merge the features of knowledge engines, which model
| ontological relationships between entities, with generative
| engines, which are good at producing more natural-looking or
| natural-sounding qualitative output. There's certainly research
| going on to join them together, but it's not getting the kind of
| press that the generative and pattern-recognition work gets,
| because that work is comparatively much easier. The whole
| "General AI Complete" class of problems seems to be the ones that
| try to combine multiple areas of more specific AI systems, but
| that's exactly where the more practical problems for the average
| person arise.
| glebshevchuk wrote:
| Agreed, but that's because they're hard to integrate: one is
| concerned with enumerating all the facts that humans know about
| (a la Cyc) and the other is concerned
| with learning those directly from data. Developing feedback
| systems that combine these two would be quite exciting.
| dnautics wrote:
| > And honestly, not a lot of productive applications come to
| mind
|
| I can't go into too many details, since I haven't started
| yet, but I'm thinking about mixing a flavor of GPT with DETR for
| OCR tasks where the model must predict categorization
| vectors; the chief difficulty of the task is that it must
| identify and classify arbitrary-length content in the OCR output.
| visarga wrote:
| > honestly, not a lot of productive applications come to mind
|
| Not so convincing when you enumerate so many applications
| yourself.
|
| > but won't be able to say anything about current events
|
| There are variants that use a transformer plus retrieval, so
| they have unlimited memory that can be easily extended.
| gxqoz wrote:
| I've mentioned this in another thread, but a GPT-3 that could
| reliably generate quizbowl questions like the ones on
| https://www.quizbowlpackets.com would be great in this
| domain. My experience with it indicates it's nowhere near
| being able to do this, though.
| pydry wrote:
| Content farms are hardly a productive application.
| visarga wrote:
| You missed the forest for the trees. If you have a tool that
| can use StackOverflow to solve simple programming tasks, or
| to generally solve any simple task with Google, then you're
| sitting on a gold mine.
| notahacker wrote:
| That's a big _if_ though.
|
| GPT-3 is much more _interesting autocomplete based on
| most commonly used patterns_ than something which figures
| out that Problem X has a lot of conceptual similarities
| with Solved Problem Y so it can just reuse the code
| example with some different variable names.
| jokethrowaway wrote:
| Yes and no.
|
| It may be useful for hiring fewer low-skilled employees and
| keeping a few senior ones who take input from the machine and
| decide what to keep and what to throw away. I'm not sure
| if a senior engineer would be more productive patching up
| code written by a bot or writing it from scratch. It's
| going to be a hard sell while you still need human
| supervisors.
|
| You can't trust a machine that can't reason to implement code,
| or even to create content. You need a human to supervise, or a
| better machine.
|
| We already have AI-based auto-completion for code; GPT-3
| can be useful for that (but at what cost? Storing a huge
| model on your disk, or making a slow / unsafe HTTP request
| to the cloud?)
| polynomial wrote:
| > if a senior engineer would be more productive patching
| up code written by a bot or writing it from scratch.
|
| I have no doubt writing from scratch would win hands
| down. The main reason we patch wonky legacy code is
| because it's already running and depended on. If you
| remove that as a consideration, a senior engineer writing
| the equivalent code (rather than debugging code generated
| randomly from Google searches) would, IMO, be more
| efficient and produce a higher quality program.
| schaefer wrote:
| I feel your stance [1] is demonstrably false; consider two challenges.
|
| 1) Please play a winning game of Go against Alpha Zero, just by
| googling the topic.
|
| 2) Next, please explain how Alpha Zero's games could forever
| change Go opening theory[2], without any genuine creativity.
|
| [1] that "the output from GPT-3, DALL-E, et al is similar to
| what you get from googling the prompt and stitching together
| snippets from the top results."
|
| [2]"Rethinking Opening Strategy: AlphaGo's Impact on Pro Play"
| by Yuan Zhou
| ravi-delia wrote:
| Op was clearly not talking about Alpha Zero, a different
| technology made by different people for a different purpose.
| Instead, they were noting that despite displaying some truly
| excellent world modeling, GPT-3 is _trained_ on data that
| encourages it to vomit up rehashes. It's very possible that
| the next generation will overcome this and wind up completely
| holding together long-run concepts and recursion, at least if
| scaling parameters keeps working, but for now it is a real
| limitation.
|
| GPT-3 writes like a sleepy college student with 30 minutes
| before the due date; with shockingly complete grasp of
| _language_, but perhaps not complete understanding of
| content. That's not just an analogy, I am a sleepy college
| student. When I write an essay without thinking too hard it
| displays exactly the errors that GPT-3 makes.
| tbenst wrote:
| GPT-3 can't play Go.
| Tenoke wrote:
| It almost definitely can, to some extent, given that GPT-2
| could play chess [0].
|
| 0. https://slatestarcodex.com/2020/01/06/a-very-unlikely-
| chess-...
| rich_sasha wrote:
| I think this is exactly right, and indeed this is a lot of the
| value. "Content-generation" is already a thing, and yes it
| doesn't need to make much sense. Apparently people who read it
| don't mind.
| throwaway6e8f wrote:
| People don't read it, search engines do.
| masswerk wrote:
| BTW, we should mandatorily tag generated content for search
| engines in order to exclude it from future training sets.
| hundchenkatze wrote:
| Apart from that, hopefully the people building training
| sets use gltr or something similar to prevent training on
| generated text.
|
| http://gltr.io/
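|
| For illustration, a rough sketch of the GLTR-style heuristic
| (assuming the Hugging Face transformers library, with the small
| public GPT-2 as the scoring model; this is an illustration, not
| GLTR's own code): score each token by how highly a language
| model ranked it, since machine-generated text tends to stay
| inside the model's top predictions.
|
|   import torch
|   from transformers import GPT2LMHeadModel, GPT2TokenizerFast
|
|   tok = GPT2TokenizerFast.from_pretrained("gpt2")
|   model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
|
|   def mean_token_rank(text):
|       ids = tok(text, return_tensors="pt").input_ids   # [1, seq]
|       with torch.no_grad():
|           logits = model(ids).logits                   # [1, seq, vocab]
|       ranks = []
|       for pos in range(ids.shape[1] - 1):
|           next_id = ids[0, pos + 1]
|           # Rank = how many tokens the model preferred over the one
|           # that actually appears next (0 = the model's top guess).
|           rank = (logits[0, pos] > logits[0, pos, next_id]).sum()
|           ranks.append(rank.item())
|       return sum(ranks) / max(len(ranks), 1)
|
|   # Lower mean rank -> text looks more like something the model
|   # would have generated itself; thresholding this could flag
|   # documents to exclude from a training set.
|   print(mean_token_rank("The quick brown fox jumps over the lazy dog."))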
| ape4 wrote:
| Or not even googling, but pre-googling: using its predictive
| typing in the text box at google.com, because you are giving it
| something to complete.
| Lucasoato wrote:
| > I think of the value proposition of GPT-X as "what would you
| do with a team of hundreds of people who can solve arbitrary
| problems only by googling them?". And honestly, not a lot of
| productive applications come to mind.
|
| Damn, this could replace so many programmers, we're doomed!
| throwaway_6142 wrote:
| *realizing GPT-3 was probably created by programmers whose
| job really is mostly googling for Stack Overflow answers*
|
| #singularity
| crdrost wrote:
| We were worried that the singularity was going to involve
| artificial intelligences that could make themselves
| smarter, and we were underwhelmed when it turned out to be
| neural networks that started to summarize Stack Overflow
| neural network tips, to try to optimize themselves,
| instead.
|
| GPT-[?]: still distinguishable from human advice, but it
| contains a quadrillion parameters and nobody knows how
| exactly it's able to tune them.
| yowlingcat wrote:
| Curses. We've been found out.
| napier wrote:
| >"what would you do with a team of hundreds of people who can
| solve arbitrary problems only by googling them?"
|
| What would you do with a team of hundreds of people who can
| instantly access an archive comprising the sum total of
| digitized human knowledge and use it to solve problems?
| hooande wrote:
| We have that now, it's called googling. You could easily hire
| 100 people to do that job, but you'd have to pay them at
| least $15/hr now in the US. Say equivalent GPT-3 servers cost
| a fraction of that. How do you make money with that resource?
| eru wrote:
| Well, they can use it to write text. Not to solve problems
| directly.
| orm wrote:
| GPT-3 is trained on text prediction, and there's been a lot of
| commentary about the generation aspect, but some of the
| applications that excite me most are not necessarily about
| generating text. Instead, GPT-3 (and other language models)
| creates very useful vector representations of natural language
| as a side effect, and these can then be used for other tasks
| with much less data. Using the text prediction task as a way to
| supervise learning this representation, without having to create
| an expensive labelled dataset, is very helpful, and not just for
| language tasks. See for example the CLIP work that came out
| recently for image classification, which uses natural-language
| captions to supervise training. There is other work referred to
| in that blog post that also exploits captions or descriptions in
| natural language to help understand images better. More
| speculatively, being able to use natural language to supervise
| or give feedback to automated systems that have little to do
| with NLP seems very very useful.
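|
| A minimal sketch of that reuse (assuming the Hugging Face
| transformers and scikit-learn libraries; the model name and the
| tiny dataset are illustrative stand-ins, since GPT-3's own
| weights aren't publicly downloadable): mean-pool the hidden
| states of a pretrained language model and fit a simple
| classifier on a handful of labelled examples.
|
|   import torch
|   from transformers import AutoModel, AutoTokenizer
|   from sklearn.linear_model import LogisticRegression
|
|   name = "distilbert-base-uncased"   # illustrative stand-in model
|   tok = AutoTokenizer.from_pretrained(name)
|   lm = AutoModel.from_pretrained(name).eval()
|
|   def embed(texts):
|       # Mean-pool the final hidden states into one vector per text.
|       enc = tok(texts, padding=True, truncation=True,
|                 return_tensors="pt")
|       with torch.no_grad():
|           hidden = lm(**enc).last_hidden_state   # [batch, seq, dim]
|       mask = enc["attention_mask"].unsqueeze(-1) # ignore padding
|       return ((hidden * mask).sum(1) / mask.sum(1)).numpy()
|
|   # A tiny labelled set often goes a long way once the
|   # representation is good.
|   texts = ["great product, would buy again",
|            "terrible, broke in a day",
|            "absolutely loved it",
|            "waste of money"]
|   labels = [1, 0, 1, 0]
|
|   clf = LogisticRegression().fit(embed(texts), labels)
|   print(clf.predict(embed(["really happy with this purchase"])))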
| hooande wrote:
| "More speculatively, being able to use natural language to
| supervise or give feedback to automated systems that have
| little to do with NLP seems very very useful."
|
| I agree with this, and it isn't entirely speculative. One of
| the most useful applications I have seen that goes beyond
| googling is generating CSS using natural language, e.g.,
| "change the background to blue and put a star at the top of
| the page". There are heavily sample selected demos of this on
| twitter right now https://twitter.com/sharifshameem/status/12
| 82676454690451457...
|
| This is definitely practical, though I wouldn't design my
| corporate website using this. It could be useful if you need to
| make 10 new sites a day for something with SEO or domains.
| theonlybutlet wrote:
| I work in a niche sector of the insurance industry. Based on
| what it can already do, I can see it doing half my job with
| basically no learning curve for the user. Based on this alone,
| I could see it reducing headcount in the company and sector by
| 5%. This is massive when you consider the low margins in the
| industry and high costs of "skilled" staff.
| mistermann wrote:
| > I think of the value proposition of GPT-X as "what would you
| do with a team of hundreds of people who can solve arbitrary
| problems only by googling them?". And honestly, not a lot of
| productive applications come to mind.
|
| If I was Xi Jinping, I would use it to generate arbitrary
| suggestions _for consideration by_ my advisory team, as I
| develop my ongoing plan for managing The Matrix.
| darepublic wrote:
| It's not even commercially available afaik, it's only been made
| available to _some_ developers who applied for access, through
| a completely opaque process-- this after soliciting for
| application ideas from every applicant. So I heartily support
| finding a way to truly open source it.
| comboy wrote:
| Have you tried conversing with it, after a few lines of setting
| a proper context? Like two scientists talking, or something like
| that. It can provide very interesting outputs that are not
| googlable.
|
| Yes, every time you see something that obviously doesn't make
| sense to a human, you're tempted to dismiss it. But you would
| look at that output differently if you were talking with a
| child. Just as a child can miss some information, making it
| say something ridiculous, it may miss some pattern connections.
|
| But have you ever observed carefully how we connect patterns
| and make sentences? Our highly sophisticated discussions and
| reasoning are just pattern matching. Then the most prominent
| patterns, ordered in time, are what we call consciousness.
|
| Watch Hacker News comments and notice how, after somebody uses a
| rare adjective or cluster of words, more commenters tend to use
| it without even paying conscious attention to it.
|
| Long story short, give it a try and see examples of what
| people have already done with it, even in its limited form.
|
| To me you are looking at an early computer and saying that it's
| not doing anything that a bunch of people with calculators
| couldn't do.
| tda wrote:
| Did anyone try to ask GPT-3 something about the insurrection? I
| can't find any free/open text prompts for GPT-3, so this is what
| GPT-2 has to say:
|
| _After Trump lost the 2020 election his supporters stormed the
| Capitol_
|
| > They smashed windows, tore apart the Republican Party's
| headquarters, destroyed a statue of Ronald Reagan and set fire
| to the Texas statehouse.
|
| > A month later, after the Democrats took control of the Senate
| and the House, Trump's supporters stormed their own offices.
| They attacked the offices of California Democratic Senator
| Kamala Harris and New York Democratic Representative Joe
| Crowley. They ransacked offices in the Minnesota and Texas
| statehouses and burned down the Republican Party headquarters
| in Tennessee. The damage was estimated at
| throwaway_6142 wrote:
| my god it should be called CNN-2
| notahacker wrote:
| You can play with GPT-3 in a custom world at AIdungeon.io. The
| responses are biased towards giving you RPG second-person
| narrative, but the corpus of data, mastery of syntax and more
| uncertain grasp of events and relationships are all there.
|
| Example with the prompt _You are Donald Trump. The recent
| election results have been a disappointment to you_.
|
| https://pastebin.com/dSYZypCw
|
| Props for turns of phrase like "Your opponent is a typical
| liberal. He hails from the right wing of the Democratic
| Party, but has been trying to appeal to the left to gain more
| support.", but poor marks for apparently not having grasped
| how elections work. (There's a joke in there somewhere)
|
| If you don't pick a custom world and your own prompt, you get
| something more like this:
|
| > You are Donald Trump, a noble living in the kingdom of
| Larion. You are awakened by a loud noise outside the gate.
| You look out the window and see a large number of orcish
| troops on the road outside the castle.
|
| I'd like 'orcish troops' better if I thought it was inspired
| by media reports of Capitol events rather than a corpus of
| RPGs.
| visarga wrote:
| GPT-3
|
| > The Trump supporters were armed with guns and knives. They
| were also carrying torches. They were chanting "Trump 2020"
| and "Build the Wall." The Trump supporters were also chanting
| "Lock her up." But they were referring to Hillary Clinton.
| danmur wrote:
| This is hilarious, more please :)
| dvfjsdhgfv wrote:
| In science, some amazing discoveries are made years or even
| centuries before practical applications for them are found. I
| believe in humanity; sooner or later we'll find some actually
| useful applications for GPT-X.
| Filligree wrote:
| It's a wonderful co-writer of fiction, for one. Maybe the
| better authors wouldn't need it, but as for everyone else --
| take a look at
| https://forums.sufficientvelocity.com/threads/generic-
| pawn-t..., and compare the first couple of story posts to the
| last few.
|
| One of the ways in which people get GPT-3 wrong is, they give
| it a badly worded order and get disappointed when the output
| is poor.
|
| It doesn't work with orders. It takes a lot of practice to
| work out what it does well with. It always imitates the
| input, and it's great at matching style -- and it knows good
| writing as well as bad, but it can't ever write any better
| than the person using it. If you want to write a good story
| with it, you need to already be a good writer.
|
| But it's wonderful at busting writer's block, and at writing
| _differently_ than the person using it.
| tigerBL00D wrote:
| I don't necessarily see the "team of automated googlers" as a
| fundamental or damning problem with GPT-like approaches. First,
| I think people may have a lot fewer truly original ideas than
| they are willing to admit. Original thought is sought after and
| celebrated in the arts as a rare commodity. But unlike in the
| arts, where there are almost no constraints, when it comes to
| science or engineering almost every incremental step is of the
| form Y = Fn(X0,..,Xn) where X0..Xn are widely known and proven to be
| true. With sufficient logical reasoning and/or experimental
| data, after numerous peer reviews, we can accept Fn(...) to be
| a valid transform and Y becomes Xn+1, etc. Before internet or
| Google one had to go to a library and read books and magazines,
| or ask other people to find inputs from which new ideas could
| be synthesized. I think GPT-like stuff is a small step towards
| automating and speeding up this general synthesis process in
| the post-Google world.
|
| But if we are looking to replace end-to-end intelligence at
| scale, it's not just about synthesis. We also need to automate
| the peer review process so that its bandwidth is matched to the
| increased rate of synthesis. Most good researchers and
| engineers are able to self-critique their work (and the degree
| to which they can do that well is really what makes one good
| IMHO). And then we rely on our colleagues and peers to review
| our work and form a consensus on its quality. Currently GPT-
| like systems can easily overwhelm humans with such peer review
| requests. Even if a model is capable of writing the next great
| literary work, predicting exactly what happened on Jan 6, or
| formulating new laws of physics, the sheer amount of crap it
| will produce alongside makes it very unlikely that anyone will
| notice.
| Keyframe wrote:
| "team of automated googlers" where google is baked-in. Google
| results, and content behind it, changes. Meaning, GPT would
| have to be updated as well. Could be a cool google feature, a
| service.
| breck wrote:
| I call it the "Prior-Units" theorem. Given that you are able
| to articulate an idea useful to many people, there exist
| prior units of that idea. The only way, then, to come up with a
| "new idea" is to come up with an idea useful only to
| yourself (plenty of those) or to small groups, or to translate an
| old idea into a new language.
|
| The reason for this is that your adult life consists of
| just a tiny, tiny, tiny fraction of the total time of all
| adults, so if an idea is relevant to more people, the odds
| decrease exponentially that no one has thought of it before.
|
| There are always new languages though, so a great strategy is
| to take old ideas and bring them to new languages. I count
| new high-level, non-programming languages as new languages as
| well.
| PaulHoule wrote:
| Art (music, literature, ...) involves satisfaction of
| constraints. For instance you need to tune your guitar like
| the rest of the band, write 800 words like the editor told
| you, tell a story with beginning, middle, and end and
| hopefully not use the cheap red pigments that were
| responsible for so many white, blue, and gray flags I saw in
| December 2001.
| anticristi wrote:
| I love the initiative, but I'm starting to get scared of what a
| post-GPT-3 world will look like. We are already struggling to
| distinguish fake news from real news, automated customer request
| replies from genuine replies, etc. How will I know that I have a
| conversation with a real human in the future?
|
| On the other side, the prospect of having an oracle that answers
| all trivia, fixes spelling and grammar, and allows humans to
| focus on higher level information processing is interesting.
| visarga wrote:
| > How will I know that I have a conversation with a real human
| in the future?
|
| This problem should be solved with cryptography, not by banning
| large neural nets.
| namelosw wrote:
| It's likely to be bad. For example:
|
| Massively plagiarized articles, where the search engine probably
| has no way to identify which is the original content. It's
| like rewriting everything on the internet in your own
| words; this may leave the internet filled with this kind of
| garbage.
|
| Reddit and similar platforms filled with bots spouting bullshit
| all the time that is hard for a human to identify in the first
| place (the current model is pretty good at generating
| metaphysical bullshit, but rarely insightful content). People may
| be surrounded by bot bullshitters and trolls, with very few real
| users among them.
|
| Scams at larger scale. The skillset is essentially
| customer service plus bad intentions. With new models, scammers
| can do their thing at scale and find qualified victims more
| efficiently.
| pixl97 wrote:
| > (the current model is pretty good at generating
| metaphysical bullshit, but rarely insightful content)
|
| Wait, are we talking about bots posting crap, or the average
| political discussion?
| gianlucahmd wrote:
| We already live in a post-GPT-3 world, but one where all its
| power is in the hands of a private company.
|
| The conversation needs to move on to whether making it open and
| democratic is a good idea, but the tech itself is here to stay.
| dkjaudyeqooe wrote:
| I know! One day it's going to get so bad people are going to
| have to deploy critical thinking instead of accepting what they
| read at face value and suffer the indignity of having to think
| for themselves.
| _underfl0w_ wrote:
| Maybe that'll also be the Year of the Linux desktop I keep
| hearing so much about.
| Lewton wrote:
| Will they also learn to avoid magical thinking like "People
| en masse will all of a sudden develop abilities out of the
| blue"?
| Kuinox wrote:
| Sadly they don't. They start to believe random things
| and reject what they don't like.
| gmueckl wrote:
| Critical thinking won't help you when the majority (or all)
| of your sources are tainted and contradictory. At some point,
| the actual truth just gets swamped.
| inglor_cz wrote:
| This. Robots can spout 1000x more content than humans, if
| not more.
| spion wrote:
| This is already happening, just with humans networked into
| social networks that favor quick reshare over deep review
| thelastwave wrote:
| "the actual truth"
| hooande wrote:
| 10,000 years of human civilization and this hasn't happened
| yet, huh? Any day now, I'm sure
| emphatizer2000 wrote:
| Maybe somebody could create an AI that evaluates the
| factfulness of articles.
| visarga wrote:
| This is possible, especially with human in the loop.
| lrossi wrote:
| Or have the AI generate only fact-based, polite and
| relevant comments.
|
| Related xkcd: https://xkcd.com/810/
| realusername wrote:
| I'm going to throw some wild guess here and say that this
| sudden increase in critical thinking won't happen.
| Erlich_Bachman wrote:
| Photoshop has existed for decades. Is it really that big of a
| problem for photo news?
| technocratius wrote:
| The difference between Photoshop and generative models is not
| in what it can technically achieve, but the cost of achieving
| the desired result. Fake news photo or text generation is
| possible by humans, but it scales poorly (more humans) compared
| to an algorithmically automated process (just some more compute).
| draugadrotten wrote:
| Yes!
|
| https://en.wikipedia.org/wiki/Adnan_Hajj_photographs_controv...
| https://www.bbc.com/news/world-asia-china-55140848
|
| and many more
| Agentlien wrote:
| Wow, that first Adnan Hajj photograph looks absolutely
| terrible.
| danielscrubs wrote:
| Touch-ups were done before Photoshop, but now it's ALWAYS
| done. The issues this has created in society might have a
| bigger emotional impact than we give it credit for.
|
| Regarding photo news, there have been quite a lot of scandals,
| to the point that I'd guess the touch-ups are more or less
| accepted.
| Pyramus wrote:
| I conducted a workshop in media competency for teenage
| girls, and one of the key learnings was that _every_ image
| of a female subject they encounter in printed media (this
| was before Instagram) has been retouched.
|
| To hammer the point home I let them retouch a picture by
| themselves to see what is possible even for a completely
| untrained manipulator.
|
| It was eye-opening - one of the things that should
| absolutely be taught in school but isn't.
| IndySun wrote:
| "one of the things that should absolutely be taught in
| school but isn't."
|
| Namely, critical thinking?
| darkwater wrote:
| I don't think "critical thinking" is the point here.
| Because first you need to know that such modifications
| CAN be done, and not everybody knows what can be
| retouched with PS or similar programs. So yeah, if you see some
| super-model on a magazine cover, and you don't know that PS
| can edit photos easily, it would not be that immediate to
| think "hey, maybe that's not real!".
|
| As an extreme example: would you ever have checked, 20 years
| ago, a newspaper text to see if it was generated by an AI
| or by a human? Obviously not, because you didn't know of
| any AI that could do that.
| Pyramus wrote:
| Exactly this.
|
| There is a secondary aspect of becoming aware that
| society has agreed on beauty standards (different for
| different societies) and PS being used as a means to
| adhere to these standards.
| IndySun wrote:
| I think I made my point badly, because I also agree.
|
| I am lamenting that teenagers were, in this day and age,
| surprised at what can be done with Photoshop. And that,
| let loose on the appropriate software, they were surprised at
| what can be altered and how easily.
|
| My point is that this may be so because people have
| not been taught how to think for themselves, and so accept
| things (in this case female images) 'as is', without a
| hint of curiosity. There is also a problem at the other
| end of the stick: many young people I work with
| consider Wikipedia to be 100% full of misinformation
| and fake news.
| Rastonbury wrote:
| The concern is not so much AI generated news, but malicious
| actors misleading, influencing or scamming people online at
| scale with realistic conversations. Today we already have
| large-scale automated scams via email and robocalls. Less
| scalable scams like Tinder catfishing or
| Russian/Chinese trolls on Reddit are currently run by real
| people; imagine them being automated. If human moderators cannot
| distinguish these bots from real humans, that is a scary
| thought: imagine not being able to tell if this comment was
| written by a human or robot.
| hooande wrote:
| Why does this matter? The internet is filled with millions
| of very low quality human-generated discussions right now.
| There might not be much of a difference between thousands
| of humans generating comment spam and thousands of gpt-3
| instances doing the same
| mekkkkkk wrote:
| It does matter. The nice feeling of being one of many is
| a feature of echo chambers. If you can create that
| artificially for anything with a push of a button, it's a
| powerful tool to edit discourse or radicalize people.
|
| Have a look at Russia's interference in the previous US
| election. This is what they did, but manually. To be able
| to scale and automate it is huge.
| ccozan wrote:
| But careful, the human psyche has some kind of tipping
| point. Too much fake news, and it will flip. Too little, and no
| real influence is made.
|
| The exact balance must be orchestrated by a human.
| visarga wrote:
| > imagine not being able to tell if this comment was
| written by a human or robot
|
| I think neural nets could help find fake news and
| factual mistakes. Then it wouldn't matter who wrote it, if
| it is helpful and true.
| Yetanfou wrote:
| That is like saying "black powder has existed for centuries,
| are nuclear weapons really that big a problem?". The
| difference between an image editor like Photoshop and an
| automated image generation program is the far greater
| production capacity, the speed, the lower cost, and the fact
| that anyone with the right equipment can use it, whereas the end
| result of an image editor is only as good as the person
| using it.
| turing_complete wrote:
| Don't read news. Go to original sources and scientific papers.
| If you really want to understand something, a news website
| should only be your starting point to look for keywords. That
| is as true today as it will be "post-GPT-3".
| wizzwizz4 wrote:
| > _Go to original sources and scientific papers._
|
| Given how much bunk "science" (and I'm talking things
| completely transparent to someone remotely competent in the
| field) gets published, especially in psychology, it's
| difficult to do even that.
| turing_complete wrote:
| You are right. You still have to read critically or find
| trusted sources, of course.
| yread wrote:
| Primary sources need to be approached with caution
| https://clas.uiowa.edu/history/teaching-and-writing-
| center/g...
| gmueckl wrote:
| And so does every other source. You can play that analysis
| game with any source material. The problem is that the
| accuracy and detail of the reporting usually fades with
| each step towards mass media content.
| savolai wrote:
| This scales badly today and will scale even worse in the
| future. Those without education or time resources will _at
| best_ manage to read the news. Humanity will need a low-effort
| way to relay reliable information to more of its members.
| anticristi wrote:
| Besides the time scalability aspect highlighted by someone
| else, I am worried that GPT-3 will have the potential to
| produce even "fake scientific papers".
|
| Our trust fabric is already quite fragile in this post-truth
| era. GPT-3 might make it even more fragile.
| op03 wrote:
| I wouldn't worry about the science bit. No one worries
| about the university library getting larger and larger, or
| how it's going to confuse or misguide people, even though
| everyone knows there are many books in there, full of
| errors, outdated information, badly written, boring etc etc
| etc.
|
| Why? Cause there is always someone on campus who knows
| something about a subject to guide you to the right stuff.
| cbozeman wrote:
| Most people are not smart enough to do this, and even if they
| are, they don't have enough time in their day.
| hntrader wrote:
| That's what people _should_ do, and that's what you and I
| will do, but many won't, especially the less educated (no
| condescension intended). They'll buy into the increased
| proliferation of fake info. It's because of these people that
| I think the concerns are valid.
| anticristi wrote:
| Honestly, I consider myself fairly educated (I have a PhD
| in CS), but if the topic at hand is sufficiently far from
| my core competence, then reading the scientific article
| won't help. I keep reading about p-value hacking, subtle
| ways of biasing research, etc., and I realize that, to
| validate a scientific article, you have to be a domain
| expert and constantly keep up-to-date with today's best
| standards. Given the increasing number of domains to be an
| expert in, I fail to see how any single human can achieve
| that without going insane. :D
|
| I mean, Pfizer could dump their clinical trial reports at
| me, and I would probably be unable to compute their
| vaccine's efficacy, let alone find any flaws.
| taneq wrote:
| The fake news thing is a real problem (and may become worse
| under GPT-3 but certainly exists already). As for the others -
| to quote Westworld, "if you can't tell the difference, does it
| really matter?"
| phreeza wrote:
| Genuine question, why is this a problem? Sure, someone may be
| able to generate thousands of real-sounding fake news
| articles, but it's not like they will also be able to flood
| the New York Times with these articles. How do you worry you
| will be exposed to these articles?
| chimprich wrote:
| It's not me I'm worried about - it's the 50% [1] of people
| who get their news from social media and "entertainment"
| news platforms. These people vote, and can get manipulated
| into performing quite extreme acts.
|
| At the moment a lot of people seem to have trouble engaging
| with reality, and that seems to be caused by relatively
| small disinformation campaigns and viral rumours. How much
| worse could it get when there's a vast number of realistic-
| sounding news articles appearing, accompanied by realistic
| AI-generated photos and videos?
|
| And that might not even be the biggest problem. If these
| things can be generated automatically and easily, it's
| going to be very easy to dismiss real information as fake.
| The labelling of real news as "fake news" phenomenon is
| going to get bigger.
|
| It's going to be more work to distinguish what is real from
| what is fake. If it's possible to find articles supporting
| any position, and a suspicion hangs over any contrary news, then
| a lot of people are going to find it easier to just believe
| what they prefer to believe... even more than they do now.
|
| [1] made-up number, but doesn't feel far off.
| profunctor wrote:
| The fact that you made up that number is extremely funny
| in this context.
| chimprich wrote:
| I don't think so - I was aware that it was a made-up
| number, and highlighted the fact that it was. It's the
| lack of awareness of what is backed up by data that is
| the problem I think.
|
| Or am I missing your point?
| _underfl0w_ wrote:
| Right, it's definitely good that you cited it as made up,
| but I think the parent was pointing out the subtle (and
| likely unintentional) irony of discussing fake news while
| providing _fake_ numbers to support your opinion.
| jokethrowaway wrote:
| The majority of "fake news" is factual news described
| from a partial point of view and with a political spin.
|
| Even fact checkers are not immune to this and brand other
| news as true or false not based on facts but based on the
| political spin they favour.
|
| Fake news is a vastly overstated problem. Thanks to the
| internet, we now have a wider breadth of political news
| and opinions, and it's easy to label everything-but-your-
| side as fake news.
|
| There are a few patently false lies on the internet which
| are taken as examples of fake news - but they have very
| few supporters.
| mistermann wrote:
| > There are a few patently false lies on the internet
| which are taken as examples of fake news - but they have
| very few supporters.
|
| Very true. What's interesting, though, is how many
| supporters there are of the idea that extremely large
| numbers of people buy into such fake news stories 100%,
| "hook, line, and sinker" - based, ironically, on
| _fake-news-like_ articles assuring us (with specious
| evidence, if any) that this is the true state of reality.
|
| The world is amazingly paradoxical if you look at it from
| the proper abstract angle.
| chimprich wrote:
| > Even fact checkers are not immune to this and brand
| other news as true or false not based on facts but based
| on the political spin they favour.
|
| Could you give an example?
|
| > There are a few patently false lies on the internet
| which are taken as examples of fake news - but they have
| very few supporters.
|
| How many do you consider "few"?
|
| I can go to my local news site and read a story about the
| novel coronavirus and the _majority_ of comments below
| the article are stating objectively false facts.
|
| "It's just a flu" "Hospitals are empty" "The survival
| rate is 99.9%" "Vaccines alter your DNA"
|
| ...and so on.
|
| There is the conspiracy theory or cult called QAnon,
| which "includes in its belief system that President Trump
| is waging a secret war against elite Satan-worshipping
| paedophiles in government, business and the media."
|
| One QAnon Gab group has more than 165,000 users. I don't
| think these are small numbers.
| perpetualpatzer wrote:
| > made-up number, but doesn't feel far off.
|
| Pew Research says 18% report getting news primarily from
| social media (fielded 10/19-6/20)[0]. November 2019
| research said 41% among 18-29 year olds, which was the
| peak age group. Older folks largely watch news on TV[1].
|
| [0] https://www.journalism.org/2020/07/30/americans-who-
| mainly-g... [1] https://www.pewresearch.org/pathways-2020
| /NEWS_MOST/age/us_a...
| chimprich wrote:
| Thanks for providing data. Evidence is better than making
| up numbers.
| hnlmorg wrote:
| If recent times have told us anything, it's that the
| biggest distributor of "news" is social media. And worse
| still, people generally have no interest in researching the
| items they read. If "fake news" confirms their pre-existing
| bias then they will automatically believe it. If real news
| disagrees with their biases then it is considered fake.
|
| So in theory, the rise of deep fakes could lead to more
| people getting suckered into conspiracy theories and other
| such extreme opinions. We've already seen a small trend
| this way with low-resolution images of different people
| with vaguely similar physical features being used as
| "evidence" of actors in hospitals / shootings / terrorist
| scenes / etc.
|
| That all said, I don't see this as a reason not to pursue
| GPT-3. In that regard, the proverbial genie is already out
| of the bottle. What we need to work on is a better
| framework for distributing knowledge.
| xerxespoy wrote:
| Journalists are paid by the word.
| qayxc wrote:
| > "if you can't tell the difference, does it really matter?"
|
| It indeed does. The problem is that societies and cultures
| are heavily influenced and changed by communication, media,
| and art.
|
| By replacing big portions of these components with artificial
| content, generated from previously created content, you run
| the risk of creating feedback cycles (e.g. train future
| systems from output of their predecessors) and forming
| standards (beauty, aesthetics, morality, etc.) controlled by
| the entities that build, train, and filter the output of the
| AIs.
|
| You'll basically run the risk of killing individuality and
| diversity in culture and expression; consequences on society
| as a whole and individual behaviour are difficult to predict,
| but seeing how much power social media (an unprecedented
| phenomenon in human culture) have, there's reason to at the
| very least be cautious about this.
| visarga wrote:
| This problem affects all types of agents - natural or
| artificial. An agent acts in the environment; this produces
| experience and learning, and thus conditions the future.
| The agent has no idea what other opportunities were lost
| behind past choices.
| notahacker wrote:
| Most communications between humans have some physical-
| world purpose, and so an algorithm which is trained to
| _create the impression_ that a purpose has been fulfilled
| whilst not actually having any capabilities beyond text
| generation is going to have negative effects except where the
| sole purpose of interacting is receiving satisfactory text.
|
| Reviews that look just like real reviews but are actually a
| weighted average of comments on a different product are
| negative. Customer service bots that go beyond FAQ to do a
| very convincing impression of a human service rep promising
| an investigation into an incident but can't actually start an
| investigation into the incident are negative. An information
| retrieval tool which has no information on a subject but can
| spin a very plausible explanation based on data on a
| different subject is negative.
|
| Of course, it's entirely possible for humans to bullshit, but
| unlike text generation algorithms it isn't our default
| response to everything.
| skybrian wrote:
| If you ask GPT-3 for three Lord of the Rings quotes it might
| give you two real ones and one fake one, because it doesn't
| know what truth is and just wants to give you something
| plausible.
|
| There are creative applications for bullshit, but something
| that cites its sources (so you can check) and doesn't
| hallucinate things would be much more useful. Like a search
| engine.
| Drakim wrote:
| What scares me personally is the idea that I might be
| floating in a sea of uncanny valley content. Content that's
| 98% human-like, but then that 2% sticks out like a nail and
| snaps me out of it.
|
| Sure, I might not be able to tell the difference the majority
| of the time, but when I can tell the difference it's gonna
| bother me a lot.
| fakedang wrote:
| To me, a lot of content seems to be digital marketing
| horseshit tbh.
| mistermann wrote:
| Do you not already have this feeling on a fairly regular
| basis? (Serious question)
| normanmatrix wrote:
| You will not. Welcome to the scary generative future.
| anticristi wrote:
| I was hoping for a "yes, we can" attitude here. :D
| Agentlien wrote:
| Deep fakes still feel quite uncanny-valley to me. Even if they
| move beyond that, convincing fake images have existed for a long
| while.
|
| As for support, I don't really see why it matters if I'm
| talking to a clever script or an unmotivated human.
| falcor84 wrote:
| > already struggling to distinguish ... automated customer
| request replies from genuine replies
|
| I hope it's not only due to a decline in the quality of human
| support. If we could have really useful automated support
| agents, I for one would applaud that.
| anticristi wrote:
| I agree. As long as it is transparent that I am speaking to
| an automated agent and I can easily escalate the issue to a
| human that can solve my problem when the agent gets stuck.
| bencollier49 wrote:
| We'll go full circle and you'll be forced to meet people in
| person again.
| Erlich_Bachman wrote:
| It's a shame that it has turned out to be necessary to externally
| re-make and re-train a model that has come out of a company called
| `OPEN`AI. Wasn't one of the founding principles of it that all of
| the research would be available to the public? Isn't that the
| premise on which the initial funding was secured? Best of luck to
| Eleuther.
| mhuffman wrote:
| But I was told GPT-3 was too powerful for mere mortal hands
| (unless you have an account!) and that it would be used for
| hate speech and to bring about skynet.
|
| How will this project avoid those terrible outcomes?
| visarga wrote:
| By putting the cat back in the bag. Oh, it's too late ...
| useless to think about it - we can't de-invent an invention
| or stop people from replicating it. It's like that time when
| the NSA wanted to restrict crypto.
| dvfjsdhgfv wrote:
| I don't know a single intelligent person who believed this
| argument, it simply doesn't hold up.
| thelastwave wrote:
| Lots of people "believe" that, they just prefer to downvote
| anonymously rather than try to defend their position.
| ForHackernews wrote:
| New research has revealed that intelligence is not a
| prerequisite for generating hate speech on social media
| platforms.
| visarga wrote:
| It was probably bait and switch to hire top researchers and get
| initial funding. Now that OpenAI is a household name, they
| don't have to pretend anymore.
| b3kart wrote:
| I buy the former, researchers might be happier knowing their
| work potentially benefits all of humanity, not just a bunch
| of investors. But wouldn't it be _more_ difficult to get
| funding as a non-profit?
| littlestymaar wrote:
| It's just never going to be difficult to get funding when
| you have Elon Musk and Sam Altman as founders (and even
| more so when founders put one billion of their own money
| into it).
| b3kart wrote:
| Sure, but that's OpenAI's particular set of
| circumstances. Generally speaking I struggle to see
| investors preferring a nebulous non-profit over a for-
| profit with a clear path to market.
| littlestymaar wrote:
| Sure, but we're explicitly talking about OpenAI here.
| b3kart wrote:
| Of course. It's just that the comment I've been
| responding to suggested OpenAI going the "open"/non-
| profit route was to 1) get top researchers and 2) get
| investment. I was arguing that this doesn't seem to
| (generally) be a good way to get investment, but I agree
| with you in that in their case investment just wasn't a
| consideration at all.
| spiderfarmer wrote:
| I don't really care if OpenAI offers commercial licenses as
| long as the underlying research is truly open. This way
| alternative options will become available eventually.
| querez wrote:
| Arguably OpenAI is one of the most closed industry AI labs
| (among those that are still participating in the research
| community), on par only with DeepMind (though DeepMind at
| least publishes way more). Funnily enough, FAIR and Google
| Brain have a vastly better track record wrt. publishing not
| only papers but also code and models.
| dave_sullivan wrote:
| Really. OpenAI assembled some of the best minds from the deep
| learning community. The problem isn't that they are a for-
| profit SaaS; the problem is that they lied.
| thelastwave wrote:
| And ended up making an AI service that's really good at...
| lying.
| Sambdala wrote:
| Wild-Ass Guess (Ass-Guess) incoming:
|
| OpenAI was built to influence the eventual _value chain_ of AI
| in directions that would give the funding parties more
| confidence that their AI bets would pay off.
|
| This value chain basically being one revolving around AI as
| substituting predictions and human judgement in a business
| process, much like cloud can be (oversimply) modeled as moving
| Capex to Opex in IT procurement.
|
| They saw that, like any primarily B2B sector, the value chain
| was necessarily going to be vertically stratified. The output
| of the AI value chain is as an input to another value chain,
| it's not a standalone consumer-facing proposition.
|
| The point of OpenAI is to invest/incubate a Microsoft or Intel,
| not a Compaq or Sun.
|
| They wanted to spend a comparatively small amount of money to
| get a feel for a likely vision of the long-term AI value chain,
| and weaponize selective openness to: 1) establish moats, 2)
| encourage commodification of complementary layers which add
| value to, or create an ecosystem around, 'their' layer(s), and
| 3) get insider insight into who their true substitutes are by
| subsidizing companies to use their APIs
|
| As AI is a technology that largely provides benefit by
| modifying business processes, rather than by improving existing
| technology behind the scenes, your blue ocean strategy will
| largely involve replacing substitutes instead of displacing
| direct competitors, so points 2 and 3 are most important when
| deciding where to funnel the largest slice of the funding pie.
|
| _Side Note: Becoming an Apple (end-to-end vertical integration)
| is much harder to predict ahead of time, relies on the 'taste'
| and curation of key individuals giving them much of the
| economic leverage, and is more likely to derail along the way._
|
| They went non-profit to for-profit after they confirmed the
| hypothesis that they can create generalizeable base models that
| others can add business logic and constraints to and generate
| "magic" without having to share the underlying model.
|
| In turn, a future AI SaaS provider can specialize in tuning the
| "base+1" model, then selling that value-add service to the
| companies who are actually incorporating AI into their business
| processes.
|
| It turned out, a key advantage at the base layer is just brute
| force and money, and further outcomes have shown there doesn't
| seem to be an inherent ceiling to this; you can just spend more
| money to get a model which is unilaterally better than the last
| one.
|
| There is likely so much more pricing power here than cloud.
|
| In cloud, your substitute (for the category) is buying and
| managing commodity hardware. This introduces a large-ish
| baseline cost, but then can give you more favorable unit costs
| if your compute load is somewhat predictable in the long term.
|
| More importantly, projects like OpenStack and Kubernetes have
| been desperately doing everything to commoditize the base layer
| of cloud, largely to minimize switching costs and/or move the
| competition over profits up to a higher layer. You also have
| category buyers like Facebook, BackBlaze, and Netflix investing
| heavily into areas aimed at minimizing the economic power of
| cloud as a category, so they have leverage to protect their own
| margins.
|
| It's possible the key "layer battle" will be between the
| hardware (Nvidia/TPUs) and base model (OpenAI) layers.
|
| It's very likely hardware will win this for as long as they're
| the bottleneck. If value creation is a direct function of how
| much hardware is being utilized for how long, and the value
| creation is linear-ish as the amount of total hardware scales,
| the hardware layer just needs to let a bidding war happen, and
| they'll be capturing much of the economic profit for as long as
| that continues to be the case.
|
| However, the hardware appears (I'm no expert though) to be
| something that is easier to design and manufacture, it's mostly
| a capacity problem at this point, so over time this likely gets
| commoditized (still highly profitable, but with less pricing
| power) to a level where the economic leverage goes to the Base
| model layer, and then the base layer becomes the oligopsony
| buyer, and the high fixed investment the hardware layer made
| then becomes a problem.
|
| The 'Base+1' layer will have a large boom of startups and
| incumbent entrants, and much of the attention and excitement in
| the press will be equal parts gushing and schadenfreude-mining
| about that layer, but they'll be wholly dependent on their
| access to base models, which will slowly (and deliberately)
| look more and more boring apart from the occasional
| handwringing over their monopoly power over our economy and
| society.
|
| There will be exceptions to this: companies that can leverage
| proprietary data and are large enough to build their own base
| models in-house on that data. Those models are likely to be
| valuable for internal AI services, preventing an 'OpenAI' from
| having as much leverage over them and being much better matched
| to their process needs, but they will not be as generalized as
| the models coming from the arms race of companies who see that
| as their primary competitive advantage. Facebook and Twitter
| are two obvious ones in this category, and they will primarily
| consume their own models rather than expose them as
| model-as-a-service directly.
|
| The biggest question to me is whether there's a feedback loop
| here which leads to one clear winning base layer company
| (probably the world's most well-funded startup to date due to
| the inherent upfront costs and potential long-term income), or
| if multiple large, incumbent tech companies see this as an
| existential enough question that they more or less keep pace
| with each other, and we have a long-term stable oligopoly of
| mostly interchangeable base layers, like we do in cloud at the
| moment.
|
| Things get more complex when you look to other large investment
| efforts such as in China, but this feels like a plausible
| scenario for the SV-focused upcoming AI wars.
| visarga wrote:
| Apparently you don't need to be a large company to train
| GPT-3. EleutherAI is using free GPU from CoreWeave, the
| largest North American GPU miner, who agreed to this deal to
| get the final model open sourced and have their name on it.
| They are also looking at offering it as an API.
| Sambdala wrote:
| I think it's great they're doing this, but GPT-3 is the
| bellwether, not the end state.
|
| Open models will function a lot like Open Source does
| today, where there are hobby projects, charitable projects,
| and companies making bad strategic decisions (Sun open
| sourcing Java), but the bulk of Open AI (open research and
| models, not the company) will be funded and released
| strategically by large companies trying to maintain market
| power.
|
| I'm thinking of models that will take $100 million to $1
| billion to create, or even more.
|
| We spend billions on chip fabs because we can project out
| long term profitability of a huge upfront investment that
| gives you ongoing high-margin capacity. The current
| (admittedly early and noisy) data we have about AI models
| looks very similar IMO.
|
| The other parallel is that the initial computing revolution
| allowed a large-scale shift of business activities away from
| requiring teams of people doing manual work, coordinated by a
| supervisor, towards having those functions live inside a
| spreadsheet, word processor, or email.
|
| This replaced a team of people with (outdated) specializations
| with fewer people accomplishing the same admin/clerical work,
| by letting the computer do what it's good at doing.
|
| I think a similar shift will happen with AI (and other
| technologies) where work done by humans in cost centers is
| retooled to allow fewer people to do a better job at less
| cost. Think compliance, customer support, business
| intelligence, HR, etc.
|
| If that ends up being the case, donating a few million
| dollars worth of GPU time doesn't change the larger trends,
| and likely ends up being useful cover as to why we
| shouldn't be worried about what the large companies are up
| to in AI because we have access to crowdsourced and donated
| models.
| jariel wrote:
| This is neat, but almost no startups of any kind, even
| mid-size corps, have such complicated and intricate plans.
|
| More likely: OpenAI was a legit premise, they started to run
| out of money, MS wanted to license and it wasn't going to
| work otherwise, so they just took the temperature with their
| initial sponsors and staff and went commercial.
|
| And that's it.
| ccostes wrote:
| I think calling this a "wild-ass guess" undersells it a bit
| (either that or we have very different definitions of a
| WAG). Very well thought-through and compelling case.
|
| My biggest question is whether composable models are indeed
| the general case, which you say they confirmed as evidenced
| by the shift away from non-profit. It's certainly true for
| some domains, but I wonder if it's universal enough to enable
| the ecosystem you describe.
| wraptile wrote:
| OpenAI is turning out to be a total bait and switch, especially
| when your co-founder is actively calling you out on it [1]
|
| Remember, kids: if it's not a non-profit organization, it is a
| _for_ profit one! It was silly to expect anything else:
|
| > In 2019, OpenAI transitioned from non-profit to for-profit.
| The company distributed equity to its employees and partnered
| with Microsoft Corporation, who announced an investment package
| of US$1 billion into the company. OpenAI then announced its
| intention to commercially license its technologies, with
| Microsoft as its preferred partner [2]
|
| 1 - https://edition.cnn.com/2020/09/27/tech/elon-musk-tesla-
| bill...
|
| 2 - https://en.wikipedia.org/wiki/OpenAI
| person_of_color wrote:
| So OpenAI employees get Microsoft RSUs?
| unixhero wrote:
| What is an RSU?
| ourcat wrote:
| Restricted Stock Units
| agravier wrote:
| It means restricted stock unit, and it's a kind of
| company stock unit that may be distributed to some
| "valued" employees. There is usually a vesting schedule,
| and you can't do whatever you want with it.
| garmaine wrote:
| Why would they? It's a separate company.
| dvfjsdhgfv wrote:
| It will be interesting to see the attitude of Microsoft
| towards this project in the light of their "Microsoft loves
| open source" propaganda.
| eeZah7Ux wrote:
| Like many other companies, Microsoft loves unpaid labor.
|
| Free Software is about giving freedom and security all the
| way to the end users - rather than SaaS providers.
|
| If you remove this goal and only focus on open source as a
| development methodology you end up with something very
| similar to volunteering for free for some large
| corporation.
| Closi wrote:
| I don't know where people got this idea that Microsoft
| can't participate positively in Open Source, and do that
| sincerely, without open sourcing absolutely everything.
|
| Of course they can - just because you contribute to open
| source, and do that because you also benefit from open
| source projects, doesn't mean you have to do absolutely
| everything under open source.
|
| Especially considering OpenAI isn't even Microsoft's IP or
| codebase.
| taf2 wrote:
| How about when Steve Ballmer said something along the
| lines of
|
| "Linux is a cancer that attaches itself in an
| intellectual property sense to everything it touches"
|
| Pretty sure that is hostile towards open source? Linux
| being one of the flagship projects of open source.
|
| [edit] source https://www.zdnet.com/article/ex-windows-
| chief-heres-why-mic...
| Closi wrote:
| It's hostile to the GPL licence which means anything
| licensed under GPL can't be used in Microsoft's
| proprietary products.
|
| I would personally say Microsoft wasn't necessarily
| driven by anti-open-source hate; they were just very
| anti-competitor. Microsoft tried to compete with their
| biggest competitor? Colour me shocked.
| Daho0n wrote:
| I don't think this should be seen in the light of "open
| source everything", but more that many see Microsoft
| doing open source not as part of "being good" but as part
| of their age-old "embrace, extend, extinguish" policy.
| dvfjsdhgfv wrote:
| > I don't know where people got this idea that Microsoft
| can't participate positively in Open Source, and do that
| sincerely, without open sourcing absolutely everything.
|
| I'm not claiming that. Of course there is a place for
| closed and open elements in their offerings. Let me
| clarify.
|
| In the past, Microsoft was very aggressive towards open
| source. When they realized this strategy of FUD brought
| little result, they changed their attitude 180 degrees
| and decided to embrace it, putting literal hearts
| everywhere.
|
| Personally, I find it hypocritical. There is no
| love/hate, just business. They will use whatever strategy
| works to get their advantage. What I find strange is that
| people fell for it.
| Closi wrote:
| But why on this thread then, about GPT-3? It's not even
| their own company, IP or source to give away.
|
| But even when Microsoft _can't_ open source it because
| it's _not theirs_, we _still_ have people posting in
| this thread that this is further evidence that Microsoft
| is hypocritical. It sounds a lot like a form of
| Confirmation Bias to me where any evidence is used as
| proof that Microsoft is 'anti-open-source'.
| taf2 wrote:
| I think it is because each model from OpenAI was public
| until Microsoft became an investor.
| [deleted]
| pessimizer wrote:
| I don't know where people got the idea that companies can
| be "sincere." Sincerity is faithfully communicating your
| mental state to others. A company's mental state can
| change on a dime based on the decisionmaking of people
| who rely on the company for the degree of profit it
| generates. Any analog to sincerity that you think you see
| can probably be eliminated by firing one person after an
| exceptionally bad quarter (or an exceptionally good one.)
| Closi wrote:
| Sincere to me just means that you are being truthful, or
| not trying to be deceptive.
|
| And I think companies can be sincere - because companies
| are really just groups of people and assets when you get
| down to the nuts and bolts of it.
| eeZah7Ux wrote:
| > companies can be sincere
|
| "sincere", "honest", "hypocritical" usually refers to a
| long-term pattern. Being able to be sincere from time to
| time is beside the point.
|
| > companies are really just groups of people
|
| ...with profit as their first priority.
|
| For-profit companies "can be sincere" only as long as
| it's the most profitable strategy.
| JacobiX wrote:
| It's a recurring theme in OpenAI research: they become more and
| more closed. For instance, their latest model, DALL-E, hit the
| headlines before the release of the paper. Needless to say, the
| model is not available and no online demo has been published so
| far.
| cbozeman wrote:
| Because it's winner-take-all in this research, not "winner-
| take-some".
|
| Andrew Yang talked about this and why breaking up Big Tech
| won't work. No one wants to use the second best search
| engine. The second best search engine is Bing and I almost
| never go there.
|
| Tech isn't like automobiles, where you might prefer a Honda
| over a Toyota, but ultimately they're interchangeable. A
| Camry isn't dramatically different and doesn't perform
| dramatically better than an Accord. Whoever builds the best
| AI "wins" and wins totally.
| visarga wrote:
| But they still released the CLIP model, which is the
| complement of DALL-E and is used in the DALL-E pipeline as a
| final filter. There are Colab notebooks with CLIP floating
| around and even a web demo.
| JacobiX wrote:
| Thank you for this info. As you mentioned, CLIP is used for
| re-ranking DALL-E outputs; by itself it is just a classifier
| of (image, text) pairs.
| Tenoke wrote:
| The research is open to the public. Here's the gpt3 paper
| https://arxiv.org/abs/2005.14165
|
| Also, the GPT-2 models and code at least were publicly
| released, as has a lot of their other work.
|
| And yes, they realized they can achieve more by turning
| for-profit and partnering with Microsoft. True, they are not
| fully 'open', but pretending they don't release things to the
| public and making the constant 'more like closedai aimirite'
| comments is getting old.
| avivo wrote:
| I'd love to see an equal amount of the effort put toward
| initiatives like this, also being put toward mitigating their
| _extremely likely_ negative societal impacts (and putting in
| safeguards).
|
| Of course, that's not nearly as sexy.
|
| Yes, there are lots of incredible positive impacts of such
| technology, just like there was with fire, or nuclear physics.
| But that doesn't mean that safeguards aren't _absolutely
| critical_ if you want it to be a net win for society.
|
| These negative impacts are not theoretical. They are obvious and
| already a problem for anyone who works in the right parts of the
| security and disinformation world.
|
| We've been through all this before...
| https://aviv.medium.com/the-path-to-deepfake-harm-da4effb541...
|
| Of course, some of the same people who ignored recommendations[1]
| for harm mitigations in visual deepfake synthesis tools (which
| ended up being used for espionage and botnets) seem to be working
| on this.
|
| [1] e.g.
| https://www.technologyreview.com/2019/12/12/131605/ethical-d...
| mrfusion wrote:
| It still baffles me that GPT turned out to be more than a
| glorified markov chain text generator. It seems we've actually
| made it create a model of the world to some degree.
|
| And we kind of just stumbled on the design by throwing massive
| data and neural networks together?
| nullc wrote:
| You're made of _meat_ and yet you manage to be more than a
| glorified markov chain generator. :)
|
| (I hope)
| Filligree wrote:
| It turns out that brute-force works, and the scaling curve is
| _still_ not bending.
|
| I doubt we'll ever see a GPT-4, because there are known
| improvements they could make besides just upsizing it further,
| but that's beside the point. If that curve doesn't bend soon
| then a 10x larger network would be human-level in many ways.
|
| (Well, that is to say. It's actually bending. Upwards.)
| hntrader wrote:
| What % of all digitized and reasonably easy-to-access text
| data did they use to train GPT-3? I'm wondering whether the
| current limits on GPT-n are computation or data.
| kortex wrote:
| > As per the creators, the OpenAI GPT-3 model has been
| trained about 45 TB text data from multiple sources which
| include Wikipedia and books.
|
| It's about 400B tokens. The Library of Congress is about 40M
| books; let's say 50K tokens per book, or about 2T tokens.
| Not necessarily unique.
|
| I would say it's plausible that it was a decent percent of
| the indexed text available, and even more of the unique
| content. GPT-2 was 10B tokens. Do we have 20T tokens
| available for GPT-4? Maybe. But the low-hanging fruit is
| definitely plucked.
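|
| A rough sanity check of that arithmetic (a minimal sketch in
| Python; the 40M books and 50K tokens/book figures are the
| ballpark assumptions above, not measured values):
|
|     gpt3_tokens = 400e9              # ~400B training tokens
|     loc_tokens = 40e6 * 50e3         # ~40M books * 50K tokens/book
|     print(loc_tokens / 1e12)         # ~2.0 trillion tokens
|     print(gpt3_tokens / loc_tokens)  # GPT-3 used ~20% of that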
| mrfusion wrote:
| So fascinating. I'd love to understand why it's working so
| well. I guess no one knows.
|
| Wouldn't gpt4 just be more data and more parameters?
| nemoniac wrote:
| Good initiative, but tell us more about the governance. After
| all, OpenAI was "open" until it was bought by Microsoft.
| wizzwizz4 wrote:
| No, it wasn't. And iirc, only GPT-3 was.
| joshlk wrote:
| How does the outfit intend to fund the project? OpenAI spends
| millions on computing resources to train the models.
| jne2356 wrote:
| The cloud company CoreWeave has agreed to provide the GPU
| resources necessary.
| stellaathena wrote:
| Hey! One of the lead devs here. A cloud computing company
| called CoreWeave is giving us the compute for free in exchange
| for us releasing it. We're currently at the ~10B scale and are
| working on understanding datacenter-scale parallelized training
| better, but we expect to train the model on 300-500 V100s for
| 4-6 months.
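|
| As a rough sanity check (a sketch only: the ~3.14e23 FLOPs
| figure is the published training-compute estimate for GPT-3,
| and the 50% sustained utilization is an assumption):
|
|     gpt3_flops = 3.14e23           # published training estimate
|     v100_peak = 125e12             # fp16 tensor-core FLOP/s, peak
|     gpus, months, util = 400, 5, 0.5
|     seconds = months * 30 * 24 * 3600
|     budget = gpus * v100_peak * util * seconds
|     print(budget / gpt3_flops)     # ~1.0, so the plan is plausible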
| tmalsburg2 wrote:
| I imagine recreating the model will be computationally cheaper
| because they will not have to sift through the same huge
| hyperparameter space as the initial GPT-3 team had to.
| thelastwave wrote:
| Why is that?
| jne2356 wrote:
| This is not true. The OpenAI team only trained one full-sized
| GPT-3, and conducted their hyperparameter sweep on
| significantly smaller models (see:
| https://arxiv.org/abs/2001.08361). The compute savings from
| not having to do the hyperparameter sweep are negligible and
| do not significantly change the feasibility of the project.
| 2Gkashmiri wrote:
| So how much money would it take to rebuild this FOSS
| alternative? And could distributed computing power like
| seti@home help? If it can be done, and I hope it is, what
| benefit would the original proprietary one have over this?
| Licensing?
| astrange wrote:
| OpenAI will execute the original one for you. If you can get an
| account, anyway.
| jne2356 wrote:
| EleutherAI has already secured the resources necessary.
|
| They get the seti@home suggestion a lot. There's a section in
| their FAQ that explains why it's infeasible.
|
| https://github.com/EleutherAI/info
| pjfin123 wrote:
| What does the future of open-source large neural nets look like?
| My understanding is GPT-3 takes ~600GB of GPU memory to run
| inference. Does an open source model just allow you a choice of a
| handful of cloud providers instead of one?
| aabhay wrote:
| Open source doesn't mean that everyone will be rolling their
| own. It means that lots of players will start to offer
| endpoints with GPT-X, perhaps bundled with other services. It
| is good for the market.
| mirekrusin wrote:
| I'd gladly contribute (power and) a few of the idle GTX cards I
| have to a public peer/volunteer/seti@home-like project if the
| resulting snapshot(s) are available publicly or to registered,
| active contributors.
| Voloskaya wrote:
| SETI@home-style distributed computation is not suitable for
| training something like GPT-3. Unlike SETI, the unit of work a
| node can do before needing to share its output with the next
| node is really small, so a very fast interconnect between the
| nodes is needed (InfiniBand and NVLink are used in the clusters
| that train it). It would probably take a decade to train such a
| model over the regular internet.
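|
| A minimal sketch of why (assuming naive data-parallel training
| where each step exchanges full fp16 gradients for all ~175B
| parameters, a 100 Mbit/s home link, and ~300 GB/s NVLink):
|
|     grad_bytes = 175e9 * 2              # fp16 gradients, ~350 GB
|     home_bw = 100e6 / 8                 # bytes/s over 100 Mbit/s
|     nvlink_bw = 300e9                   # bytes/s over NVLink
|     print(grad_bytes / home_bw / 3600)  # ~7.8 hours per sync
|     print(grad_bytes / nvlink_bw)       # ~1.2 seconds per sync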
| mirekrusin wrote:
| Are there any models/research optimised for working on these
| kinds of small, distributed batches that would fit in, e.g.,
| ~10GB of commodity GPU memory?
| mitjam wrote:
| Maybe a case for a community colocation cloud, where I, a
| consumer, can buy a system and colocate it in a large data
| center with great internal networking. Edit: typo
| leogao wrote:
| Handling heterogeneous (and potentially untrustworthy)
| systems also adds overhead, not to mention that buying
| hardware in bulk is cheaper, so it makes the most sense
| just to raise the money and buy the hardware.
| mirekrusin wrote:
| The problem is potentially solvable, as generating
| solutions takes a lot of GPU time and verifying them is
| very fast. Acquiring input data may be a problem, but
| should be possible with dedicated models for this type of
| computation.
| dmingod666 wrote:
| With OpenAI being corporate-controlled and not really 'Open',
| is Neo a nod to 'The Matrix'?
| [deleted]
| habitue wrote:
| Is it standard to prune these kinds of large language models once
| they've been trained to speed them up?
| dvfjsdhgfv wrote:
| If they succeed, Eleuther should change their name to
| ReallyOpenAI.
| stellaathena wrote:
| Or for extra irony, ClosedAI
| techlatest_net wrote:
| Is there any real justification behind this fear of the closed
| nature of OpenAI, or is this just frustration coming out? We
| had this debate of closed vs. open source 20 years back, and
| eventually open source won for various reasons. Won't those
| same reasons apply to this situation with the closed nature of
| OpenAI? If so, then why are people worried about this? What is
| different this time?
| pmontra wrote:
| The cost.
|
| Closed source and open source developers use the same
| $300-3,000 laptops / desktops. Everybody can afford them.
|
| Training a large model in a reasonable time costs much more.
| According to https://lambdalabs.com/blog/demystifying-gpt-3/
| the cost of training GPT-3 was $4.6 million. Multiply that by
| the number of trials and errors.
|
| Of course we can't expect something that costs tens or hundreds
| of millions to be given away for free, or to be able to rebuild
| it without some collective training effort that distributes the
| cost across at least thousands of volunteers.
| qayxc wrote:
| This. Plus the increasing number of opaque results.
| Training data is private, so it's impossible to even try to
| recreate results, validate methods, or find biases/failure
| cases.
| jne2356 wrote:
| OpenAI only trained the full-sized GPT-3 once. The
| hyperparameter sweep was conducted on significantly smaller
| models (see:
| https://arxiv.org/abs/2001.08361)
| ttctciyf wrote:
| I love the name's play on Greek _Eleutheria_ (ἐλευθερία) -
| freedom, liberty!
| Havoc wrote:
| Would be good if this could be decentralized BitTorrent/BOINC
| style somehow.
|
| Wouldn't mind contributing some horsepower
| jne2356 wrote:
| They get this suggestion a lot. There's a section in their FAQ
| that explains why it's infeasible.
|
| https://github.com/EleutherAI/info
| onenightnine wrote:
| This is beautiful. Why not? Maybe we can eventually make
| something better than the now closed-source version.
| Mizza wrote:
| Serious question: is there a warez scene for trained models yet?
|
| (I don't know how the model is accessed - are users of mainline
| GPT-3 given a .pb and a stack of NDAs, or do they have to access
| it through an access-controlled API?)
|
| Wherever data is desired by many but held by a few, a pirate crew
| inevitably emerges.
| jokowueu wrote:
| I think this also might be of interest to you
|
| https://the-eye.eu/public/AI/pile_preliminary_components/
| MasterScrat wrote:
| Those are datasets though, not models.
| notretarded wrote:
| Not really
| Voloskaya wrote:
| The checkpoint is not shared with customers; you only get
| access to an API endpoint.
| vessenes wrote:
| GPT-3 users are given an API link which routes to Azure, full
| blackbox.
| exhilaration wrote:
| It's via API https://openai.com/blog/openai-api/
| kordlessagain wrote:
| The model is huge and is currently run in the cloud on many
| machines.
| mortehu wrote:
| It's only 175 billion parameters, so presumably it can fit on
| a single computer with 1024 GB RAM.
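|
| Quick sketch of the weights-only footprint (precision is the
| assumption here; activations and runtime overhead come on top):
|
|     params = 175e9
|     for name, nbytes in [("fp32", 4), ("fp16", 2)]:
|         print(name, round(params * nbytes / 2**30), "GiB")
|     # fp32 ~652 GiB, fp16 ~326 GiB -- the weights alone would fit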
| Voloskaya wrote:
| On CPU the latency would be absolutely prohibitive to the
| point of being useless.
| typon wrote:
| For training yes, but not for inference.
| leogao wrote:
| The inference latency would also be prohibitive.
| kordlessagain wrote:
| From 2019: https://heartbeat.fritz.ai/deep-learning-has-
| a-size-problem-...
|
| > Earlier this year, researchers at NVIDIA announced
| MegatronLM, a massive transformer model with 8.3 billion
| parameters (24 times larger than BERT)
|
| > The parameters alone weigh in at just over 33 GB on
| disk. Training the final model took 512 V100 GPUs running
| continuously for 9.2 days.
|
| Running this model on a "regular" machine at some useful
| rate is probably not possible at this time.
| Voloskaya wrote:
| Inference on GPU is already very slow on the full-scale
| non-distilled model (in the 1-2 sec range iirc); on CPU
| it would be an order of magnitude more.
| stingraycharles wrote:
| Wouldn't you need this model to be in GPU RAM instead of
| regular RAM, though?
___________________________________________________________________
(page generated 2021-01-18 23:00 UTC)