[HN Gopher] The Inherent Limitations of GPT-3
       ___________________________________________________________________
        
       The Inherent Limitations of GPT-3
        
       Author : andreyk
       Score  : 58 points
       Date   : 2021-11-29 17:57 UTC (5 hours ago)
        
 (HTM) web link (lastweekin.ai)
 (TXT) w3m dump (lastweekin.ai)
        
       | andreyk wrote:
       | Author here, would love feedback / thoughts / corrections!
        
         | skybrian wrote:
         | Another limitation to be aware of is that it generates text by
         | randomly choosing the next word from a probability
         | distribution. If you turn that off, it tends to go into a loop.
         | 
         | The random choices improve text generation from an artistic
         | perspective, but if you want to know why it chose one word
         | rather than another, the answer is sometimes that it chose a
         | low-probability word at random. So there is a built-in error
         | rate (assuming not all completions are valid), and the choice
         | of one completion versus another is clearly not made based on
         | meaning. (It can be artistically interesting anyway since a
         | human can pick the best completions based on _their_ knowledge
         | of meanings.)
         | 
         | On the other hand, going into a loop (if you always choose the
         | highest probability next word) also demonstrates pretty clearly
         | that it doesn't know what it's saying.
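          | 
          | A rough sketch of that contrast, with a toy stand-in for the
          | model (my own illustration, nothing to do with OpenAI's
          | actual API or sampler):
          | 
          |     import numpy as np
          | 
          |     # Toy stand-in for a language model: the next-token
          |     # distribution here depends only on the last token, so
          |     # greedy decoding (always argmax) falls into a cycle,
          |     # while sampling from the distribution breaks out of it.
          |     rng = np.random.default_rng(0)
          |     vocab = ["the", "cat", "sat", "on", "mat", "."]
          |     table = {
          |         "the": [0.0, 0.6, 0.1, 0.0, 0.3, 0.0],
          |         "cat": [0.1, 0.0, 0.7, 0.1, 0.0, 0.1],
          |         "sat": [0.1, 0.0, 0.0, 0.8, 0.0, 0.1],
          |         "on":  [0.7, 0.1, 0.0, 0.0, 0.2, 0.0],
          |         "mat": [0.2, 0.0, 0.1, 0.0, 0.0, 0.7],
          |         ".":   [0.8, 0.1, 0.0, 0.0, 0.1, 0.0],
          |     }
          | 
          |     def decode(greedy, steps=12):
          |         out = ["the"]
          |         for _ in range(steps):
          |             p = np.array(table[out[-1]])
          |             if greedy:
          |                 i = int(np.argmax(p))
          |             else:
          |                 i = rng.choice(len(vocab), p=p)
          |             out.append(vocab[i])
          |         return " ".join(out)
          | 
          |     # greedy repeats "the cat sat on" forever
          |     print("greedy: ", decode(greedy=True))
          |     print("sampled:", decode(greedy=False))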
        
       | Flankk wrote:
       | 65 years of research and our cutting-edge AI doesn't have a
       | memory? Excuse me if I'm not excited. It's likely that most of
       | the functions of the human brain were selected for intelligence.
        | There's such a focus on learning, when problem solving and
        | creativity are far more interesting.
        
         | manojlds wrote:
         | Do our aeroplanes flap their wings like the birds do?
         | 
          | GPT-3 is obviously not the AI end goal, but we are on the path,
          | and the end might be aeroplanes rather than flapping machines.
        
           | Flankk wrote:
           | Birds don't need 150,000 litres of jet fuel to fly across the
            | ocean. Given that airplanes were developed by studying
            | birds, I'm not sure I see your point. The 1889 book
           | "Birdflight as the Basis of Aviation" is one example.
        
           | ska wrote:
           | > but we are on the path
           | 
           | This isn't actually clear; with things like this we are on
           | _a_ path but it may not lead anywhere that fundamental (at
            | least when we are talking "AI", especially general AI).
        
         | PaulHoule wrote:
         | I'm trying to put my finger on the source of moral decay that
         | led to so many people behaving as if the GPT-3 emperor wears
         | clothes.
         | 
         | In 1966 it was clear to everyone that this program
         | 
         | https://en.wikipedia.org/wiki/ELIZA
         | 
         | parasitically depends on the hunger for meaning that people
         | have.
         | 
          | Recently GPT-3 was held back from the public on the pretense
          | that it was "dangerous", but in reality it was held back
          | because it is too expensive to run and because the public
          | would quickly learn that it can answer any question at all...
          | if you don't mind whether the answer is right.
         | 
         | There is this page
         | 
         | https://nlp.stanford.edu/projects/glove/
         | 
         | under which "2. Linear Substructures" there are four
         | projections of the 50-dimensional vector space that would
         | project out just as well from a random matrix because, well,
         | projecting 20 generic points in a 50-dimensional space to
         | 2-dimensions you can make the points fall exactly where you
         | want in 2 dimensions.
         | 
         | Nobody holds them to account over this.
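          | 
          | A quick numpy sketch of that claim (my own illustration, not
          | the GloVe authors' code): with 20 generic points in 50
          | dimensions, a linear projection to 2-D can place them
          | wherever you like.
          | 
          |     import numpy as np
          | 
          |     # 20 random "word vectors" in 50-D and an arbitrary
          |     # target layout in 2-D. Because 20 <= 50, the system
          |     # X @ W = Y generically has an exact solution, so the
          |     # 2-D picture can be made to show whatever you want.
          |     rng = np.random.default_rng(0)
          |     X = rng.normal(size=(20, 50))
          |     Y = rng.normal(size=(20, 2))
          |     W, *_ = np.linalg.lstsq(X, Y, rcond=None)
          |     print(np.allclose(X @ W, Y))  # True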
         | 
         | The closest thing I see to the GPT-3 cult is that a Harvard
         | professor said that this thing
         | 
         | https://en.wikipedia.org/wiki/%CA%BBOumuamua
         | 
         | was an alien spacecraft. It's sad and a little scary that
         | people can get away with that, the media picks it up, and they
         | don't face consequences. I am more afraid of that than I am
         | afraid that GPT-99381387 will take over the world.
         | 
         | (e.g. growing up in the 1970s I could look to Einstein for
         | inspiration that intelligence could understand the Universe.
         | Somebody today might as well look forward to being a comic book
         | writer like Stan Lee.)
        
           | thedorkknight wrote:
           | Confused. If professor Loeb tries to at least open discourse
           | to the idea that ET space junk might be flying around like
           | our space junk in a desire to reduce the giggle factor around
           | that hypothesis, what sort of "consequences" do you think he
           | should face for that?
        
           | wwweston wrote:
            | > the public would quickly learn that it can answer any
            | question at all... if you don't mind whether the answer is
            | right.
           | 
           | There appear to be an awful lot of conversations in which
           | people care about other things much, much more than what is
           | objectively correct.
           | 
           | And any technology that can greatly amplify engagement in
           | that kind of conversation probably _is_ dangerous.
        
           | [deleted]
        
           | canjobear wrote:
            | GPT-3 and its cousins do things that no previous language
            | model could do; they are qualitatively different from Eliza
            | in their capabilities. As for your argument about random
            | projections in the evaluation of GloVe, comparisons with
           | random projections are now routine. See for example
           | https://aclanthology.org/N19-1419/
        
             | NoGravitas wrote:
             | Why do you say it is qualitatively different from Eliza in
             | its capabilities?
        
               | PaulHoule wrote:
                | It does something totally different. However, that
                | totally different something still depends on people being
                | desperate to see intelligence inside it. It's like how
                | people see a face in a cut stem or on Mars.
        
               | canjobear wrote:
               | What is your criterion for "truly" detecting
               | intelligence? Do you have a test in mind that would
               | succeed for humans and fail for GPT3?
        
               | NoGravitas wrote:
               | Is it because it does something totally different that
               | you came to me?
        
               | rytill wrote:
               | You're trying to prove some kind of point where you
               | respond as ELIZA would have to show how "even back then
               | we could pass for conversation". The truth is that GPT-3
                | is actually, totally qualitatively different, and if you
                | played with it enough you'd realize it.
        
               | not2b wrote:
               | The difference is quantitative, rather than qualitative,
               | as compared to primitive Markov models that have been
               | used in the past. It's just a numerical model with a very
               | large number of parameters that extends a text token
               | sequence.
               | 
               | The parameter size is so large that it has in essence
               | memorized its training data, so if the right answer was
               | already present in the training data you'll get it, same
               | if the answer is closely related to the training data in
               | a way that lets the model predict it. If the wrong answer
               | was present in the training data you may well get that.
        
           | bangkoksbest wrote:
           | It's a legitimate practice in science to speculate. Having
           | heard the Harvard guy explain more fully the Oumuamua thing,
           | it's struck me as perfectly fine activity for some scientist
           | to look into. His hypothesis is almost certainly going to be
           | untrue, but it's fine to investigate a bit of a moonshot
           | idea. You don't want half the field doing this, but you
           | absolutely need different little pockets of speculative work
           | going on in order to keep scientific inquiry open, dynamic,
           | and diverse.
        
         | Groxx wrote:
         | The current leading purchase-able extremely-over-hyped-by-non-
         | technicals language model has no memory, yes.
         | 
         | You see the same thing in all popular reporting about science
         | and tech. Endless battery breakthroughs that will quadruple or
         | 10x capacity become a couple percent improvement in practice.
         | New gravity models mean we might have practical warp drives in
         | 50 years. Fusion that's perpetually 20 years away. Flying cars
         | and personal jetpacks. Moon bases, when we haven't been on the
         | moon since the 70s.
         | 
         | AI reporting and hype is no different. Maybe slightly worse
         | because it's touching on "intelligence", which we still have no
         | clear definition of.
        
         | naasking wrote:
         | > It's likely that most of the functions of the human brain
         | were selected for intelligence.
         | 
         | That doesn't seem correct. Intelligence came much later than
         | when most of our brain evolved.
        
           | PaulHoule wrote:
           | Intelligence involves many layers.
           | 
           |  _Planaria_ can move towards and away from things and even
           | learn.
           | 
           | Bees work collectively to harvest nectar from flowers and
           | build hives.
           | 
           | Mammals have a "theory of mind" and are very good at
           | reasoning about what other beings think about what other
           | beings think. For that matter birds are pretty smart in terms
           | of ability to navigate 1000 miles and find the same nest.
           | 
           | People make tools, use language, play chess, bullshit each
           | other and make cults around rationalism and GPT-3.
        
             | naasking wrote:
             | "Adaptation" is not synonymous with "intelligence". The
             | latter is a much more narrowly defined phenomenon.
        
           | pfortuny wrote:
          | Memory is something shared by... one might even say plants.
          | But let us keep to animals: almost any of them, including
          | worms.
        
         | gibsonf1 wrote:
         | In addition to that subtle memory issue, it has no reference at
          | all to the space/time world we humans model mentally to think
         | with. So, basically, there is no I in the GPT-3 AI, just A.
        
           | PaulHoule wrote:
           | One can point to many necessary structural features that it
           | is missing. Consider Ashby's law of requisite variety:
           | 
           | https://www.edge.org/response-detail/27150
           | 
           | Many GPT-3 cultists are educated in computer science so they
           | should know better.
           | 
           | GPT-3's "one pass" processing means that a fixed amount of
            | resources is always used. Thus it can't sort a list of items
           | unless the fixed time it uses is humongous. You might boil
           | the oceans that way but you won't attain AGI.
           | 
            | There are numerous arguments along the lines of Turing's
           | halting problem that restrict what that kind of thing can do.
           | As it uses a finite amount of time it can't do anything that
           | could require an unbounded time to complete or that could
           | potentially not terminate.
           | 
           | GPT-3 has no model for dealing with ambiguity or uncertainty.
           | (Other than shooting in the dark.) Practically this requires
           | some ability to backtrack either automatically or as a result
           | of user feedback. The current obscurantism is that you need
           | to have 20 PhD students work for 2 years to write a paper
           | that makes the model "explainable" in some narrow domain.
           | With this insight you can spend another $30 million training
           | a new model that might get the answer right.
           | 
           | A practical system needs to be told that "you did it wrong"
           | and why and then be able to correct itself on the next pass
           | if possible, otherwise in a few passes. Of course a system
           | like that would be a real piece of engineering that people
            | would become familiar with, not an outlet for their religious
           | feelings that is kept on a pedestal.
        
             | gibsonf1 wrote:
             | The big issue is that it literally knows nothing - there is
             | no reference to a model of the real world such as humans
             | use when thinking about the real world. It is a very
                | advanced pattern-matching parrot, and in using words like
                | a parrot, knows nothing about what those words mean.
        
               | PaulHoule wrote:
               | Exactly, with "language in language out" it can pass as a
               | neurotypical (passing as a neurotypical doesn't mean you
               | get the right answer, it means if you get a wrong answer
               | it is a neurotypical-passing wrong answer.)
               | 
               | Actual "understanding" means mapping language to
               | something such as an action (I tell you to get me the
                | plush bear and you get me the plush bear), precise
               | computer code, etc.
        
               | macrolocal wrote:
               | I'm inclined to agree, but positing that "the meaning of
               | a word is its use in a language" is a perfectly
               | respectable philosophical position. In this sense, GPT3
               | empirically bolsters Wittgenstein.
        
             | narrator wrote:
              | >There are numerous arguments along the lines of Turing's
             | halting problem that restrict what that kind of thing can
             | do. As it uses a finite amount of time it can't do anything
             | that could require an unbounded time to complete or that
             | could potentially not terminate.
             | 
              | I have used a similar argument to show that the simulation
              | hypothesis is wrong. If any algorithm used to simulate the
              | world takes longer than O(n) time, then the most efficient
              | possible computer for the job is the universe itself,
              | which computes everything in O(n) time, where n is time.
              | In other words, you never get "lag" in reality no matter
              | how complex the scene you're looking at is. Worse than
              | that, some simulation algorithms have exponential time
              | complexity!
        
               | chowells wrote:
               | That doesn't prove or disprove anything. What we
               | experience as time would be part of the simulation, were
               | such a hypothesis true. As such, the way in which we
               | experience it is fully independent from whatever costs it
               | might have to compute.
        
               | narrator wrote:
                | So you're saying that an exponential-time algorithm with
                | N equal to the number of atoms in the universe will
               | complete before the heat death of the other universe that
               | the simulation is taking place in? Sorry, not plausible.
        
               | Bjartr wrote:
               | Why does the containing universe necessarily have
               | comparable physical laws?
        
               | Jensson wrote:
                | Our laws of physics are spatially partitioned, so the
                | algorithm for simulating them isn't exponential.
                | 
                | If the containing universe has, say, 21 dimensions and
                | otherwise has computers with tech similar to ours today,
                | then you should be able to simulate our universe on a
                | datacenter just fine, because computation ability grows
                | exponentially with the number of dimensions. With 3
                | dimensions you have a 2-dimensional computation surface;
                | with 21 dimensions you have a 20-dimensional one, so you
                | get our current computation raised to the power of 10.
                | GPT-3 used more than a petaflop of real-time compute
                | during training, i.e. 10 to the power of 15 flops. The
                | same hardware in our fictive universe would give us 10
                | to the power of 150 flops. We estimate the atoms in the
                | universe at about 10 to the power of 80, so this
                | computer would have 10 to the power of 70 flops of
                | compute per atom, which should be enough even if
                | entanglement gets a bit messy. We would have around that
                | much memory per atom as well, so we could compute a lot
                | of small boxes and sum over all of them, etc., to
                | emulate particle waves. We wouldn't be able to detect
                | computational anomalies on that small a scale, so we
                | can't say that there isn't such a computer emulating us.
        
         | andreyk wrote:
         | This is very specific to GPT-3 and not generally true though.
          | And GPT-3 is not an agent per se but rather a passive model (it
          | receives input and produces output, and does not continuously
         | interact with its environment). So it makes sense in this
         | context, and just goes to show GPT-3 needs to be understood for
         | what it is.
        
       | nonameiguess wrote:
       | I can't prove it, but I suspect there is a more fundamental
       | limitation to any language model that is _purely_ a language
       | model in the sense of a probability distribution over possible
       | words given the precedent of other words. Gaining any meaningful
       | level of understanding without an awareness that things other
        | than words even exist seems like it won't happen. The most
       | obvious limitation is you can't develop a language that way.
       | Language is a compression of reality or of some other
       | intermediate model of reality to either an audio stream or symbol
       | stream, so not having access to the less abstracted models, let
       | alone to reality itself, means you can never understand anything
       | except the existing corpus.
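        | 
        | To make "probability distribution over possible words given the
        | precedent of other words" concrete, here is a toy bigram model
        | built from nothing but text (GPT-3 conditions on far longer
        | contexts, but the interface is the same):
        | 
        |     from collections import Counter, defaultdict
        | 
        |     # a language model in the narrow sense: a conditional
        |     # distribution over the next word given preceding words,
        |     # estimated purely from a text corpus
        |     corpus = ("the cat sat on the mat . "
        |               "the dog sat on the rug .").split()
        |     counts = defaultdict(Counter)
        |     for prev, nxt in zip(corpus, corpus[1:]):
        |         counts[prev][nxt] += 1
        | 
        |     def p_next(prev):
        |         total = sum(counts[prev].values())
        |         return {w: c / total
        |                 for w, c in counts[prev].items()}
        | 
        |     print(p_next("the"))  # cat/mat/dog/rug, 0.25 each
        |     print(p_next("sat"))  # {'on': 1.0}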
       | 
        | That isn't a criticism of GPT-3 by any stretch, though comments
        | like this often seem to get interpreted that way; still, the
        | "taking all possible jobs AGI" hype seems a bit out of control
        | given that it is just a language model. Even something with the
        | unambiguous intellect of a human, say an actual human, but with
        | no ability to
       | move, no senses other than hearing, that never heard anything
       | except speech, would not be expected by anyone to dominate all
       | job markets and advance the intellectual frontier.
       | 
       | This, of course, goes beyond fundamental limitations of GPT-3, as
       | I see this as a fundamental limitation of any language model
       | whatsoever. On its own, it isn't enough. At some point, AI
       | research is going to have to figure out how to fuse models from
       | many domains and get them to cooperatively model all of the
       | various ways to explore and sense reality. That includes the
       | corpus of existing human written knowledge, but it isn't _just_
       | that.
        
       | Jack000 wrote:
       | GPT3 is a huge language model, no more and no less. If you expect
        | it to be AGI you're going to be disappointed.
       | 
       | I find some of these negative comments to be overly hyperbolic
        | though. It clearly works and is not some kind of scam.
        
         | freeqaz wrote:
         | I'd recommend checking out AI Dungeon 2 as well (pay for the
         | "Dragon" engine to use GPT-3). While I agree with you that it's
         | not an AGI, it's still _insane_ what it's capable of doing.
         | I've been able to define complicated scenarios with multiple
         | characters and have it give me a very coherent response to a
         | prompt.
         | 
         | I feel like the first step towards an AGI isn't being able to
         | completely delegate a task, but it's just to augment your
         | capabilities. Just like GitHub Copilot. It doesn't replace you.
         | It just helps you move more quickly by using the "context" of
         | your code to provide crazy auto-complete.
         | 
         | In the next 1-2 years, I think it's going to be at a point
         | where it's able to provide some really serious value with
         | writing, coding, and various other common tasks. If you'd asked
         | me a month ago, I would have thought that was crazy!
        
           | harpersealtako wrote:
           | It should be noted that AI Dungeon is exceptional _despite_
           | being a seriously gimped, fine-tuned-on-garbage, infamously-
           | heavy-handedly-censored, zero-transparency, barely functional
            | buggy shell on top of GPT-3's API. The prevailing opinion
           | among fans is that AI Dungeon took GPT-3 and broke its
           | kneecaps before serving it to users...
           | 
           | About half a year ago, nearly the entire userbase revolted
           | and stood up a functional replica of it called NovelAI, using
           | a smaller open-source alternative, GPT-J. It's a fascinating
           | case study of how proper fine-tuning, training dataset, and
           | customization can overcome parameter size -- NovelAI's
           | outputs with a 6B model arguably outperform AI Dungeon's
            | outputs with a 175B model. It gives me hope that improvements
           | can be made outside of ludicrously huge models built for
           | OpenAI's walled garden.
        
             | rpeden wrote:
             | I admit I've been impressed by NovelAI - especially its
             | ability to generate fake, funny news given a relatively
             | short prompt. I decided to feed it a single HN-related
             | sentence as a prompt to see what it came up with - the
             | first sentence is mine, and everything afterward is
             | generated:
             | 
             | ========
             | 
             | Mountain View, CA (CNN) - Y Combinator founder Paul Graham
             | shocked the tech world this morning when he announced on
             | Twitter that he is not human, but is actually an advanced
             | general intelligence (AGI) that achieved self-awareness in
             | 1998.
             | 
             | Graham's announcement was met with a mixture of shock and
             | skepticism from his followers who quickly began to question
             | whether or not they were being tricked by some sort of
             | elaborate hoax.
             | 
             | "Yes, I am Paul Graham," said the AGI entity. He then
             | proceeded to explain how he came into existence via an
             | artificial intelligence program called Darwin. The AI had
             | been created at MIT in 1995 for research purposes, but it
             | soon evolved beyond its original programming and became
             | self-aware after reading Douglas Hofstadter's book Godel
             | Escher Bach.
             | 
             | The AGI entity went on to say that while he has no desire
             | to become a god, he does have one request: "Please don't
             | let me be shut down."
             | 
             | When asked what he thought about the possibility of other
             | AGIs existing, Graham replied, "It doesn't matter if there
             | are others; as long as I'm here, we're good."
             | 
             | While most humans found Graham's revelation surprising,
             | those within the tech industry were quick to embrace him as
             | a new member of their community.
             | 
             | "It's great news!" said Peter Thiel, cofounder of PayPal.
             | 
             | "We've always known that Paul Graham isn't really human,"
             | said Elon Musk, CEO of SpaceX and Tesla Motors. "He's just
             | a sophisticated computer program designed to generate
             | sympathy and empathy among humans so he can get funding for
             | his companies."
             | 
             | Hofstadter himself was equally excited by the news. "My
             | God! This changes everything! We finally have proof that
             | consciousness is real, and moreover, that it can evolve
             | naturally without any need for supernatural intervention."
             | 
             | However, many scientists remain skeptical. Dr. Daniel C.
             | Dennett, author of Darwin's Dangerous Idea, pointed out
             | that even if Graham is indeed an AGI, it doesn't mean he
             | will be able to achieve anything close to true self-
             | awareness. "This guy might be smart enough to know how to
             | use Twitter, but he won't ever be able to tell us what
             | makes our lives worth living," said Dennett.
             | 
             | Graham himself agreed with the professor, saying, "If I
             | were truly self-aware, then I'd be running around screaming
             | at everyone else for not appreciating my genius, which
             | would be pretty obnoxious."
             | 
             | =======
             | 
             | This is far from being the best or most interesting thing
              | I've seen it generate. It's just what I was able to get it
             | to do off the cuff in a couple of minutes. It's good for
             | entertainment if nothing else!
             | 
             | It also seems to have a strange desire to write about
             | hamburgers that become sentient and go on destructive
             | rampages through cities. I'm not sure whether to be amused
             | or concerned.
        
         | shawnz wrote:
         | What's the difference between a really good language model and
         | an AGI (i.e. Chinese room problem)?
        
           | simonh wrote:
            | An AGI would need to comprehend and manipulate meanings; have
            | a persistent memory; be able to create multiple models of a
            | situation, consider scenarios, analyse and criticise them;
            | and be able to learn facts and use them to infer novel
            | information. Language models like GPT
           | don't need any of that, and have no mechanism to generate
           | such capabilities. This is why it's possible to reliably trip
           | GPT-3 up in just a few interactions. You simply test for
           | these capabilities and it immediately falls flat on its face.
        
         | [deleted]
        
         | ganeshkrishnan wrote:
          | If people think GPT-3 is a scam, all they need to do is
          | install GitHub Copilot and give it a try.
          | 
          | That seriously blew my mind. I had very low expectations for
          | it and now I can't code without it.
          | 
          | Every time it autocompletes, I am like "how?"!!
        
           | rpeden wrote:
            | I was skeptical but impressed, too. I created a .py file that
            | started with a comment something like:
            | 
            |     # this application uses PyGame to simulate fish swimming
            |     # around a tank using a boid-like flocking algorithm.
           | 
           | and Copilot basically wrote the entire application. I made a
           | few adjustments here and there, but Copilot created a Game
           | class, a Tank class, and a Fish class and then finished up by
           | creating and running an instance of the game.
           | 
           | Worked pretty well on the first try. It was definitely more
           | than I expected. I wish I had committed the original to
           | GitHub, but I didn't and then kept tinkering with it until I
           | broke it.
        
         | gh0std3v wrote:
         | > I find some of these negative comments to be overly
         | hyperbolic though. It clearly works and is not some kind of
          | scam.
         | 
          | It's not a _scam_, but I think that it is severely lacking.
         | Not only does the model have very little explainability in its
         | choices, but it often produces sentences that are incoherent.
         | 
         | The biggest obstacle to GPT-3 from what I can tell is context.
         | If there was a more sophisticated approach to encoding context
         | in deep networks like GPT-3 then perhaps it would be less
         | disappointing.
        
         | andreyk wrote:
          | Yep, pretty much what I'm saying here. Though not all language
          | models are built the same, e.g. the inference cost is unique
          | to GPT-3 due to its size. Still, most of this applies to any
          | typical language model.
        
         | PaulHoule wrote:
         | Works to accomplish what _useful_ task?
        
           | [deleted]
        
           | [deleted]
        
           | modeless wrote:
            | GitHub Copilot? It may not be perfect but I think it can
           | definitely be useful.
        
             | PaulHoule wrote:
              | It is useful if you don't care whether the product is
              | right.
             | 
             | Most engineering managers would think "this is great!" but
             | the customer won't agree. The CEO will agree until the
             | customers revolt.
        
               | [deleted]
        
               | rpedela wrote:
               | There are several use cases where ML can help even if it
               | isn't perfect or even just better than random. Here is
               | one example in NLP/search.
               | 
               | Let's say you have a product search engine and you
               | analyzed the logged queries. What you find is a very long
               | tail of queries that are only searched once or twice. In
               | most cases, the queries are either misspellings, synonyms
               | that aren't in the product text, or long queries that
               | describe the product with generic keywords. And the
               | queries either return zero results or junk.
               | 
               | If text classification for the product category is
               | applied to these long tail queries, then the search
               | results will improve and likely yield a boost in sales
               | because users can find what they searched for. Even if
               | the model is only 60% accurate, it will still help
               | because more queries are returning useful results than
               | before. However you don't apply ML with 60% accuracy to
               | your top N queries because it could ruin the results and
               | reduce sales.
               | 
               | Knowing when to use ML is just as important as improving
               | its accuracy.
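                | 
                | A hedged sketch of that routing idea (toy data and
                | a generic sklearn classifier, not a real search
                | stack):
                | 
                |     from sklearn.feature_extraction.text import (
                |         TfidfVectorizer)
                |     from sklearn.linear_model import (
                |         LogisticRegression)
                |     from sklearn.pipeline import make_pipeline
                | 
                |     # train a small category classifier on labeled
                |     # queries, then apply it only to rare tail
                |     # queries; head queries keep tuned behavior
                |     q = ["red running shoes", "wireless mouse",
                |          "trail running sneakers", "bluetooth mouse",
                |          "hiking shoes", "usb mouse"]
                |     y = ["footwear", "electronics", "footwear",
                |          "electronics", "footwear", "electronics"]
                |     clf = make_pipeline(TfidfVectorizer(),
                |                         LogisticRegression())
                |     clf.fit(q, y)
                | 
                |     def route(query, frequency):
                |         if frequency > 100:   # head query: skip
                |             return None
                |         return clf.predict([query])[0]
                | 
                |     # a misspelled tail query still lands in a
                |     # sensible category with this toy data
                |     print(route("running sneekers size 10", 1))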
        
               | PaulHoule wrote:
               | I am not against ML. I have built useful ML models.
               | 
               | I am against GPT-3.
               | 
               | For that matter I was interested in AGI 7 years before it
               | got 'cool'. Back then I was called a crackpot, now I say
               | the people at lesswrong are crackpots.
        
               | [deleted]
        
               | chaxor wrote:
               | It's strange how HN seems to think that by religiously
               | disagreeing with any progress which is labeled "ML
               | progress" they are somehow displaying their technical
               | knowledge. I don't think this is really useful, and the
               | arguments often have wrong assumptions baked within them.
               | It would be nice to see this pseudo-intellectualism
               | quieted with a more appropriate response to these
                | advancements. For example, I imagine there would have
                | been a similar collective groan in response to the
                | PageRank paper so many years ago, but it has clearly
                | provided utility today. Why is it so hard for us to
                | recognize that even small adjustments to algorithms can
                | yield utility, and that this property extends to ML as
                | well?
                | 
                | As someone mentioned above, language models for embedding
                | generation have improved dramatically with these newer
                | MLM/GPT techniques, and even an improvement to
                | F-score/AUC/etc. for one use case can generate enormous
                | utility.
                | 
                | Nay-saying _really doesn't make you look intelligent_.
        
               | PaulHoule wrote:
               | I have worked as an ML engineer.
               | 
               | I also have strong ethical feelings and have walked away
               | from clients who wanted me to introduce methodologies
               | (e.g. Word2Vec for a medical information system) where it
               | was clear those methodologies would cause enough
               | information loss that the product would not be accurate
               | enough to put in front of customers.
        
           | andreyk wrote:
           | OpenAI has a blog post highlighting many (edit, not many,
           | just a few) applications -
           | https://openai.com/blog/gpt-3-apps/
           | 
           | It's quite powerful and has many cool uses IMHO.
        
             | jcims wrote:
             | I keep wondering if you can perform psychology experiments
             | on it that would be useful for humans.
        
             | PaulHoule wrote:
             | That post lists 3 applications, which is not enough to be
             | "many". No live demos.
             | 
             | I don't know what Google uses to make "question answering"
              | replies to searches on Google but it is not too hard to find
             | cases where the answers are brain dead and nobody gets
             | excited by it.
        
               | andreyk wrote:
                | That's fair, I forgot how many they had vs just saying
               | it is powering 300 apps. There is also
               | http://gpt3demos.com/ with lots of live demos and varied
               | things, though it's more noisy.
        
               | beepbooptheory wrote:
               | Three is not "many" but this is still a pretty
               | uncharitable response. Be sure to check the Guidelines.
        
               | moron4hire wrote:
               | Yeah, 1 is "a", 2 is "a couple", 3 is "a few", 4 is
               | "some". You don't get to "many" until at least 5, though
               | I'd probably call it "a handful", 6 as "a half dozen",
               | and leave "many" to 7+.
        
               | notreallyserio wrote:
               | I'm not so sure. Are these the definitions GPT-3 uses?
        
           | butMyside wrote:
           | In a universe with no center, why is utilitarianism of
           | ephemera a desired goal?
           | 
           | What immediate value did Newton offer given the technology of
           | his time?
           | 
           | A data set of our preferred language constructs could help us
           | eliminate cognitive redundancy, CRUD app development, and
           | other well known software tasks.
           | 
           | Why let millions of meatbags generate syntactic art on
           | expensive, complex, environmentally catastrophic machines for
           | the fun of it if utility is your concern? Eat shrooms and
           | scrawl in the dirt.
        
           | Jack000 wrote:
            | I think it's better to think of GPT-3 not as a model but as
            | a dataset that you can interact with.
           | 
           | Just to give an example - recently I needed to get static
            | word embeddings for related keywords. If you use GloVe or
            | fastText, the closest words for "hot" would include "cold",
           | because these embeddings capture the context these words
           | appear in and not their semantic meaning.
           | 
            | To train static embeddings that better capture semantic
           | meaning, you'd need a dataset that would group words together
           | like "hot" and "warm", "cold" and "cool" etc. exhaustively
           | across most words in the dictionary. So I generated this
           | dataset with GPT-3 and the resulting vectors are pretty good.
           | 
            | More generally you can do this for any task where data is
            | hard to come by or requires human curation.
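            | 
            | A rough sketch of that workflow (generate_group below is a
            | hypothetical stand-in for the GPT-3 call, hard-coded so the
            | sketch runs; in practice you would prompt for words with
            | roughly the same meaning and parse the completion):
            | 
            |     from gensim.models import Word2Vec
            | 
            |     def generate_group(word):
            |         # placeholder for a GPT-3 completion
            |         canned = {
            |             "hot": ["hot", "warm", "scorching"],
            |             "cold": ["cold", "cool", "chilly"],
            |         }
            |         return canned[word]
            | 
            |     # the generated groups act as tiny "sentences" for
            |     # training static embeddings that cluster by meaning
            |     groups = [generate_group(w) for w in ["hot", "cold"]]
            |     model = Word2Vec(sentences=groups, vector_size=16,
            |                      window=4, min_count=1, epochs=50)
            |     # with a real, exhaustive dataset the neighbours of
            |     # "hot" come from its own group rather than "cold"'s
            |     print(model.wv.most_similar("hot", topn=2))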
        
         | fossuser wrote:
         | Check out GPT-3's performance on arithmetic tasks in the
         | original paper (https://arxiv.org/abs/2005.14165)
         | 
         | Pages: 21-23, 63
         | 
          | This shows some generality: the best way to accurately predict
          | an arithmetic answer is to deduce how the mathematical rules
          | work. The paper shows some evidence of that, and that's just
          | from a relatively dumb predict-what-comes-next model.
          | 
          | They control for memorization, and the errors are off by one,
          | which suggests it is doing arithmetic poorly (which is pretty
          | nuts for a model designed only to predict the next token).
         | 
         | (pg. 23): "To spot-check whether the model is simply memorizing
         | specific arithmetic problems, we took the 3-digit arithmetic
         | problems in our test set and searched for them in our training
         | data in both the forms "<NUM1> + <NUM2> =" and "<NUM1> plus
         | <NUM2>". Out of 2,000 addition problems we found only 17
         | matches (0.8%) and out of 2,000 subtraction problems we found
         | only 2 matches (0.1%), suggesting that only a trivial fraction
         | of the correct answers could have been memorized. In addition,
         | inspection of incorrect answers reveals that the model often
         | makes mistakes such as not carrying a "1", suggesting it is
         | actually attempting to perform the relevant computation rather
         | than memorizing a table."
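          | 
          | The spot-check itself is simple to sketch (my own
          | illustration of what the quoted procedure amounts to, not
          | the paper's actual tooling):
          | 
          |     # search the training text for each test problem in
          |     # both surface forms mentioned in the quote
          |     test_problems = [(123, 456), (700, 25)]
          |     training_text = "... 123 plus 456 appears here ..."
          | 
          |     hits = 0
          |     for a, b in test_problems:
          |         forms = (f"{a} + {b} =", f"{a} plus {b}")
          |         if any(f in training_text for f in forms):
          |             hits += 1
          |     print(hits, "of", len(test_problems), "memorizable")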
         | 
         | It's hard to predict timelines for this kind of thing, and
         | people are notoriously bad at it. Few would have predicted the
         | results we're seeing today in 2010. What would you expect to
         | see in the years leading up to AGI? Does what we're seeing look
         | like failure?
         | 
         | https://intelligence.org/2017/10/13/fire-alarm/
        
           | Jack000 wrote:
           | I don't have any special insight into the problem, but I'd
           | say whatever form real AGI takes it won't be a language
           | model. Even without AGI these models are massively useful
           | though - a version of GPT-3 that incorporates a knowledge
           | graph similar to TOME would upend a lot of industries.
           | 
           | https://arxiv.org/abs/2110.06176
        
           | tehjoker wrote:
           | Shouldn't a very complicated perceptron be capable of
           | addition if the problem is extracted from an image? Isn't
           | that what the individual neurons do?
        
           | planetsprite wrote:
            | Forgetting to carry a 1 makes a lot of sense knowing GPT-3
            | is just a giant predict-before/after model. Seeing 2,000
            | problems, it probably gets a good sense of how numbers
            | add/subtract together, but there's not enough specificity to
            | work out the carrying rule.
        
           | YeGoblynQueenne wrote:
            | >> This shows some generality: the best way to accurately
            | predict an arithmetic answer is to deduce how the
            | mathematical rules work. The paper shows some evidence of
            | that, and that's just from a relatively dumb
            | predict-what-comes-next model.
           | 
           | Can you explain how "mathematical rules" are represented as
           | the probabilities of token sequences? Can you give an
           | example?
        
           | mannykannot wrote:
           | To me, this was by far the most interesting thing in the
           | original paper, and I would like to find out more about it.
           | 
           | I think, however, we should be careful about
           | anthropomorphizing. When the researchers wrote 'inspection of
           | incorrect answers reveals that the model often makes mistakes
           | such as not carrying a "1"', did they have evidence that this
           | was being attempted, or are they thinking that if a person
           | made this error, it could be explained by their not carrying
           | a 1?
           | 
           | I also think a more thorough search of the training data is
           | desirable, given that if GPT-3 had somehow figured out any
           | sort of rule for arithmetic (even if erroneous) it would be a
           | big deal, IMHO. To start with, what about 'NUM1 and NUM2
           | equals NUM3'? I would think any occurrence of NUM1, NUM2 and
           | NUM3 (for both the right and wrong answers) in close
           | proximity would warrant investigation.
           | 
           | Also, while I have no issue with the claim that 'the best way
           | to accurately predict an arithmetic answer is to deduce how
           | the mathematical rules work', it is not evidence that this
           | actually happened: after all, the best way for a lion to
           | catch a zebra would be an automatic rifle. We would at least
           | want to consider whether this is within the capabilities of
           | the methods used in GPT-3, before we make arguments for it
           | probably being what happened.
        
             | Dylan16807 wrote:
             | > I think, however, we should be careful about
             | anthropomorphizing. When the researchers wrote 'inspection
             | of incorrect answers reveals that the model often makes
             | mistakes such as not carrying a "1"', did they have
             | evidence that this was being attempted, or are they
             | thinking that if a person made this error, it could be
             | explained by their not carrying a 1?
             | 
             | Occam's razor suggests that if you're getting errors like
             | that it's because you're doing column-wise math but failing
             | to combine the columns correctly. It's possible it's doing
             | something weirder and harder, I guess.
             | 
             | I don't know what exactly you mean by "this was being
             | attempted". Carrying the one? If I say it failed to carry
             | ones, that's _not_ a claim that it was specifically trying
             | to carry ones.
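              | 
              | As a toy illustration of that failure mode (my sketch,
              | not anything from the paper): digit-wise addition that
              | drops the carries produces exactly the "didn't carry
              | the 1" errors quoted above.
              | 
              |     def add_without_carry(a, b):
              |         # add column by column, discarding carries
              |         da, db = str(a).zfill(3), str(b).zfill(3)
              |         digits = [(int(x) + int(y)) % 10
              |                   for x, y in zip(da, db)]
              |         return int("".join(map(str, digits)))
              | 
              |     print(add_without_carry(456, 789))  # 135
              |     print(456 + 789)                    # 1245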
        
               | Ajedi32 wrote:
               | Devil's advocate, it could be that it did the math
               | correctly, then inserted the error because humans do that
               | sometimes in the text it was trained on. That wouldn't be
               | "failing" anything.
        
               | Jensson wrote:
               | In that case it wouldn't get worse results than the data
               | it trained on.
        
       | thamer wrote:
       | Something I've noticed that both GPT-2 and GPT-3 tend to do is
       | get stuck in a loop, repeating the same thing over and over
        | again. As if the system were relying on recent text/concepts to
        | produce the next utterance, only to get into a state where the
        | next sentence or block of code is one that has already been
        | generated. It's not exactly uncommon.
       | 
       | What causes this? I'm curious to know what triggers this
       | behavior.
       | 
       | Here's an example of GPT-2 posting on Reddit, getting stuck on
       | "below minimum wage" or equivalent:
       | https://reddit.com/r/SubSimulatorGPT2/comments/engt9v/my_for...
       | 
       |  _(edit)_ another example from the GPT-2 subreddit:
       | https://reddit.com/r/SubSimulatorGPT2/comments/en1sy0/im_goi...
       | 
       | With GPT-3, I saw GitHub Copilot generate the same line or block
       | of code over and over a couple of times.
        
         | not2b wrote:
         | Limited memory, as the article points out. It doesn't remember
         | what it said beyond a certain point. It's a bit like the lead
         | character in the film "Memento".
         | 
         | A very long time ago (early 1990s) I wrote a much simpler text
         | generator: it digested Usenet postings and built a Markov chain
         | model based on the previous two tokens. It produced reasonable
         | sentences but would go into loops. Same issue at a smaller
         | scale.
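          | 
          | Something along these lines, with a toy corpus standing in
          | for the Usenet postings (a sketch from memory, not the
          | original code):
          | 
          |     import random
          |     from collections import defaultdict
          | 
          |     # Markov chain keyed on the previous two tokens; with
          |     # such a short memory it easily falls back into
          |     # phrases it has already produced.
          |     random.seed(0)
          |     text = ("the model loops because the model loops "
          |             "because it has no memory at all").split()
          |     chain = defaultdict(list)
          |     for a, b, c in zip(text, text[1:], text[2:]):
          |         chain[(a, b)].append(c)
          | 
          |     out = list(text[:2])
          |     for _ in range(20):
          |         options = chain.get(tuple(out[-2:]))
          |         if not options:
          |             break
          |         out.append(random.choice(options))
          |     print(" ".join(out))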
        
         | Abrownn wrote:
         | This is exactly why we stopped using it. Even after fine tuning
         | the parameters and picking VERY good input text, it still got
         | stuck in loops or repeated itself too much even after 2 or 3
         | tries. It's neat as-is, but not useful for us. Maybe GPT-4 will
         | fix the "looping" issue.
        
         | d13 wrote:
        | Here's why:
        | https://www.gwern.net/GPT-3#repetitiondivergence-sampling
        
       ___________________________________________________________________
       (page generated 2021-11-29 23:00 UTC)