[HN Gopher] Simply explained: How does GPT work?
___________________________________________________________________
Simply explained: How does GPT work?
Author : nitnelave
Score : 341 points
Date : 2023-04-06 09:36 UTC (13 hours ago)
(HTM) web link (confusedbit.dev)
(TXT) w3m dump (confusedbit.dev)
| stareatgoats wrote:
| This article seems credible and actually made me feel as if I
| understood it, i.e. at some depth but not deeper than a relative
| layperson can grasp.
|
| What I can't understand is how the Bing chatbot can give me
| accurate links to sources but chatGPT4 on request gives me
| nonsensical URLs in 4 cases out of 5. It doesn't matter in the cases
| where I ask it to write a program: the verification is in the
| running of it. But to have real utility in general knowledge
| situations, verification through accurate links to sources is a
| must.
| lm28469 wrote:
| > What I can't understand is how the Bing chatbot can give me
| accurate links to sources but chatGPT4 on request gives me
| nonsensical URLs in 4 cases out of 5
|
| The Bing version might run a Bing query, fetch the X top pages,
| run GPT on them, return a response based on what it read, and in
| the background assign the summary to its source
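|
| For illustration, a minimal sketch of that speculated retrieval-
| augmented flow (the callables search/fetch/complete are
| placeholders, not Bing's or OpenAI's actual APIs):
|
|     def answer_with_sources(question, search, fetch, complete, top_k=3):
|         """Search the web, feed the top pages to the model, and keep
|         track of which URL each numbered snippet came from."""
|         hits = search(question)[:top_k]                # [(url, title), ...]
|         snippets = [(url, fetch(url)[:2000]) for url, _ in hits]
|         context = "\n\n".join(
|             f"[{i}] {text}" for i, (_, text) in enumerate(snippets))
|         prompt = (
|             "Answer using only the numbered sources below and cite "
|             f"them as [0], [1], ...\n\n{context}\n\nQuestion: {question}")
|         answer = complete(prompt)
|         # Citation markers map back to real URLs, so links are never invented.
|         sources = {f"[{i}]": url for i, (url, _) in enumerate(snippets)}
|         return answer, sources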
| stareatgoats wrote:
| That might be the reason, probably. I mostly wanted to
| complain TBH. But I'm assuming it's one of those wrinkles
| that will get ironed out in subsequent versions.
| rootusrootus wrote:
| > It doesn't matter in the cases where I ask it to write a
| program: the verification is in the running of it.
|
| Even then. I've had it write programs that were syntactically
| correct and produced plausible, but incorrect behavior. I'm
| really careful about what I'll use GPT-generated code for. IMO
| write the tests yourself, at least.
| stareatgoats wrote:
| Absolutely! It is seldom correct right off the bat.
| ben7799 wrote:
| I asked it which was better, Lisp or Almonds.
|
| It said that was an impossible comparison like Apples and
| Oranges.
|
| Then I asked it which were more similar, Apples & Oranges or Lisp
| & Almonds.
|
| It said it is impossible to classify either of those two pairs as
| more similar because they are too fundamentally different. It
| couldn't come up with anything like "Lisp is not edible," or that
| Apples and Oranges are both sweet while Lisp and Almonds don't
| share any common traits.
|
| It seems to have far more trouble with weird questions like
| this, which even a small child would instantly figure out, than it
| does with anything that amounts to a lookup of information.
| losvedir wrote:
| I hear people with these outrageous examples and I have to
| wonder how or what you're actually trying? I just asked GPT-4
| (via chat.openai.com):
|
| > Which is more similar, Apples & Oranges or Lisp & Almonds?
|
| and it said:
|
| > When comparing these two pairs, the more similar pair would
| be Apples & Oranges. Both apples and oranges are fruits, and
| they share many characteristics, such as being edible,
| containing vitamins and nutrients, and growing on trees.
|
| > On the other hand, Lisp & Almonds are quite dissimilar. Lisp
| is a programming language used for computing and software
| development, while almonds are a type of nut commonly consumed
| as a food item. They belong to completely different categories
| and share almost no common characteristics.
|
| And the question about which is better had a reasonable answer,
| too. This is GPT-3.5 but GPT-4 was roughly equivalent:
|
| > It's not really possible to compare the relative merits of
| Lisp and almonds, as they are fundamentally different things.
|
| > Lisp is a programming language, and like any programming
| language, it has its own strengths and weaknesses, depending on
| what you're trying to accomplish. Some programmers might prefer
| Lisp for certain tasks, while others might prefer a different
| language.
|
| > Almonds, on the other hand, are a type of nut that are known
| for their nutritional value and health benefits. They are a
| good source of healthy fats, protein, and fiber, and can be a
| valuable part of a healthy diet.
|
| > So, whether Lisp or almonds are "better" really depends on
| what you're looking for. If you're a programmer, Lisp might be
| better for certain tasks, while if you're looking for a
| nutritious snack, almonds might be a better choice.
| zackmorris wrote:
| _On the other hand, many people who are not ready to change, who
| do not have the skills or who cannot afford to reeducate are
| threatened._
|
| That's me. After programming since the '80s, I'm just so tired.
| So much work, so much progress, so many dreams lived or
| shattered. Only to end up here at this strange local maximum,
| with so much potential, destined to forever run in place by the
| powers that be. The fundamental formula for intelligence and
| even consciousness materializing before us as the world burns. No
| help coming from above, no support coming from below, surrounded
| by everyone who doesn't get it, who will never get it. Not
| utopia, not dystopia, just anhedonia as the running in place
| grows faster, more frantic. UBI forever on the horizon, countless
| elites working tirelessly to raise the retirement age, a status
| quo that never ceases to divide us. AI just another tool in their
| arsenal to other and subjugate and profit from. I wonder if a day
| will ever come when tech helps the people in between in a
| tangible way to put money in their pocket, food in their belly,
| time in their day - independent of their volition - for dignity
| and love and because it's the right thing to do. Or is it already
| too late? I don't even know anymore. I don't know anything
| anymore.
| ulnarkressty wrote:
| Random human chiming in. This sounds like a lot of anxiety,
| something between FUD and FOMO. FUDMO?
|
| Seriously tho, taking some time to get away from it would be
| good. Ignorance is bliss, this too shall pass etc.
|
| (btw nice piece of writing, you should do it more often!)
| Method-X wrote:
| It sounds like your mindset is the root of your struggles.
| Embracing change and adapting to new technologies has _always_
| been crucial in our industry. Instead of waiting for help from
| others, take control and collaborate with like-minded people.
| If you don't like the status quo, work toward changing it.
| throwbadubadu wrote:
| I think this is a bit harsh .. and also unfair to repeat that
| embrace-change mantra, because what he says is absurd and
| at the same time totally true (:
|
| I'd hope some of us would just be there in 60 years to just
| tell the future: "Heee just embrace it, ya know" .. nuff
| said.
| tim333 wrote:
| It was the best of times, it was the worst of times...
|
| In the long run tech does a bit too well with "food in their
| belly" to the point that obesity is the main problem in the
| English speaking world.
|
| As to programming, it's quite cool getting ChatGPT to write
| code and stuff. If you can't beat it, make use of it, I guess.
| IKLOL wrote:
| All the while housing, healthcare, education, and the things
| that matter once you've achieved food prosperity are
| disappearing at a rapid rate. This makes people turn to their
| baser needs more often, food and pornography and other
| stimulus.
| tabtab wrote:
| Would it be a stretch to call GPT "glorified Markov Chains"? (I
| used a tweaked M.C. once to make a music composer bot. I actually
| got a few decent tunes out of it, kind of in a Bach style.)
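|
| For contrast, a minimal word-level Markov chain (toy example, not
| how GPT works): the only "state" is the previous word, and the
| table is built by counting rather than by training billions of
| parameters.
|
|     import random
|     from collections import defaultdict
|
|     def train_markov(text):
|         # Map each word to the list of words observed right after it.
|         table = defaultdict(list)
|         words = text.split()
|         for prev, nxt in zip(words, words[1:]):
|             table[prev].append(nxt)
|         return table
|
|     def generate(table, start, length=10):
|         word, out = start, [start]
|         for _ in range(length):
|             followers = table.get(word)
|             if not followers:
|                 break
|             word = random.choice(followers)
|             out.append(word)
|         return " ".join(out)
|
|     table = train_markov("the cat sat on the mat and the cat slept")
|     print(generate(table, "the"))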
| jokoon wrote:
| I am not convinced that ChatGPT could "think" if it had as many
| neurons or parameters as a human brain, and got as much training.
|
| I would still be interested to see what it could do, if it did,
| but I don't think it would really help science understand what
| intelligence really is.
|
| Being able to grow a plant and understand some conditions that
| favor it is one thing, but it's poor science.
|
| Maybe there will be some progress when scientists are able to
| properly simulate the brain of an ant or even a mouse, but
| science is not even there yet.
| seydor wrote:
| > I don't think it would really help science understand what
| intelligence really is
|
| Neuroscience is nowhere near mapping the connectome of a
| whole human brain, so why not? We should look into these models
| as hints about what our circuits do.
|
| I think what puts people off about these models is that they
| are clockwork: they won't even spit out anything unless you put
| some words in the input. But I can imagine adding a second
| network that includes an internal clock that continuously
| generates input by observing the model itself, that would be
| kind of like having an internal introspective monologue. Then
| it could be more believable that the model "thinks"
| seydor wrote:
| This is confusing: using the semantic vector arithmetic of
| embeddings is not very relevant to transformers, and it's
| completely missing the word 'attention'. I don't think
| transformers are that difficult to explain to people, but it is
| hard to explain "why" they work. But I think it's important for
| everyone to look under the hood and know that there are no demons
| underneath.
| masswerk wrote:
| > that there are no demons underneath
|
| Given that Lacan already proposed, more than half a century
| ago, that the unconscious is structured like a language, and
| described attention in his reworking of Freud's drive in favor
| of his concept of _derive_, we may say this is pretty much where
| our own demons live.
|
| (I actually do think that revisiting Lacan in this context may
| be productive.)
| thanatropism wrote:
| We need to form some sort of guild of engineers who think
| Deleuze, Latour, Lacan et caterva should be read within our
| disciplines.
| masswerk wrote:
| I second this. The STEM (= engineering) / humanities divide
| in software may make us miss something.
|
| (There were times, when linguistics was still a major
| entry path into computing, when things were a bit
| different. Notably, these were also the times which gave
| rise to most of the general paradigms. A certain amount of
| generality was even regarded as a prerequisite to programming.
| Particularly, HN is such a great place because it holds up
| this notion of generality.)
| Der_Einzige wrote:
| They would be called "unemployable" because they are
| literal charlatans.
|
| Please end our strange fascination with fashionable
| nonsense. Freud was wrong. There is no Oedipus complex.
| Everything Lacan proposed was wrong. Deleuze and Guattari's
| mental health clinic failed spectacularly, and Deleuze
| ended up killing himself at the end (supposedly due to back
| pain?)
|
| They literally describe their thought as being
| "Schizoanalysis". How many more red flags do you need?
|
| Also, the more "modern" takes on this from techno folks,
| such as from Nick Land (Fanged Noumena), are openly fascist
| - https://en.wikipedia.org/wiki/Dark_Enlightenment
|
| If you want cultural critique from smart people without it
| turning into fashionable nonsense, I recommend Mark
| Fisher, but be warned, he too killed himself.
| masswerk wrote:
| "There is no Oedipus complex" is, using your own words,
| "fashionable nonsense". Of course there is a crucial
| conflict as we learn rules, and this establishes a
| universal foundation of ethics. And of course this takes
| place in the realm of our very object relations. Not
| bothering about what a concept actually means and/or
| describes is not a free pass for criticism. (Neither does
| generalization over an entire section of science and
| human thinking help particularly.) - Sorry for the harsh
| words, but this actually bothers me.
|
| Regarding charlatans, mind that there are only a few who
| have actually studied this. (I'm one of them.)
|
| Regarding Lacan, he provides us with an established
| theory of "talking machines", and, in a philosophical
| context, how they relate to our very freedom (or, what
| freedom may even be). This isn't totally useless in our
| current situation, and NB, it's actually quite the
| opposite of fascism.
| ttpphd wrote:
| In this moment it would be better if people would read some
| Wittgenstein and Roy Harris, IMHO.
| masswerk wrote:
| Why not both? Things like philosophy or metapsychology
| tend to be prismatic, each framework comes with
| advantages and disadvantages and boundaries of its own.
|
| (A turn towards the dogmatic is something I'm pretty much
| expecting from the current launch of AI anyway, simply,
| because the productions systematically favor the semantic
| center. So it may be worth putting some generality
| against this, rather than being overly selective.)
| ttpphd wrote:
| Lol well to answer your question literally, I think
| integrationist linguistics and Wittgenstein's thoughts
| about language use as a social action are way more
| relevant to understanding what's happening with LLMs (and
| people's naive reactions to them) than what was suggested
| previously as background reading.
| masswerk wrote:
| Mind that we are not, by any means, at any state of
| social interaction with LLMs. (Any such thing would be a
| mere hallucination on the user's side.) However, these
| are semantic fields, with whatever consequence comes with
| this. (So there may have been something said on this
| already, in what was known as the linguistic turn.)
| robwwilliams wrote:
| Please no! Read systems neuroscience. Like Hassabis does.
| Or if of a philosophical persuasion, then Dennett or Rorty.
| masswerk wrote:
| Much of cognitive science reinvents wheels that had been
| established in the 1920s and 1930s already, namely in
| sociology of knowledge and related fields. fMRI actually
| often confirms what had already been observed in a
| psychoanalytic context. (I don't think it's good
| general advice to totally ignore what is already known.)
| robwwilliams wrote:
| But Lacan? And no, there is a vast new world of cognitive
| neuroscience that was undreamed even 10 years ago.
| masswerk wrote:
| > But Lacan?
|
| Well, if you're in need of an established theory of
| (semantically driven) talking machines and what derives
| from this, and what this may mean for us in terms of
| freedom, look no further.
| robwwilliams wrote:
| Can you recommend a specific work of his? What Lacan I
| have leaves me bemused by his brilliance but not
| informed. Dennett provides both without the fireworks.
| masswerk wrote:
| Generally, don't start with the "ecrits" (writings), they
| are hermetic and you really have to have some head start
| on this. From the seminars, _Livre XI, Les quatre
| concepts fondamentaux de la psychanalyse_ (1964) may be
| a start, as it - in parts - aligns itself with the
| cybernetic research of the day. However, do not expect
| too much from a single reading or a single of the
| seminars.
|
| (Mind that this is trying to talk about what's
| beyond/below language, necessarily using language just
| the same, which is - at least according to (the early)
| Wittgenstein - somewhat an impossibility. You can only
| show these things, so it takes several approaches from
| several directions. But there is actually something like
| a concise corpus of theory eventually emerging from this.
| Moreover, this - being transcripts of seminars -
| addresses an audience that is already familiar with
| Freud, in order to reframe this. - This is also one of
| the major issues with Lacan and his reception: it takes
| some serious investment to get into this, and this also
| used to have some worth on the academic markets. On the
| other hand, this (academic) value became indeed inflated
| and eventually devalued, to the point of those, who never
| bothered to invest, happily triumphing. Think the Great
| North-American Video Game Crash. But this really
| shouldn't be the end to what may be one of the major
| approaches towards what language actually means to us.
| The expectation that everything can be addressed directly
| and without prerequisites, regardless of the complexity,
| may actually not be met. On the other hand, there will
| also never be a single "master", who is always right and
| without failure, bearing always the most distilled
| emanation of truth in their very word. - I'm also not
| arguing that everybody is now to become a scholar of
| Lacan. Rather, we may have an informed expert discussion
| of what may be gained from this from a current perspective.
| E.g., if Lacan actually had something to say about an
| impulse-like directional vector emerging from attention
| (as a form of selectional focus on a semantic field), is
| there something to be learned from this, or, to be aware
| of?)
| robwwilliams wrote:
| I would vote for revisiting Hofstadter and perhaps even
| Piaget. And if we want to bother trying to define and
| operationalize self-consciousness, then Guillery's "The Brain
| as a Tool" is a good foray into how the CNS pulls off this
| simple trick.
| nitnelave wrote:
| I was trying to keep the article at a level that everyone
| understands, from middle school up. I thought about going a bit
| deeper in the structure and mentioning attention, but my
| problem is that the intuitive concept of "attention" is quite
| different from the mathematical reality of an attention layer,
| and I'm sure I would have lost quite a few people there. It's
| always a trade-off :)
| seydor wrote:
| Indeed the word 'attention' implies some active process,
| which is also unfortunate, because it is really similarity
| maps over long ranges. This process apparently creates
| cross-reference maps of items that are commonly co-activated.
| But then what do the different heads do? Some kind of
| different "modalities of cross-reference"?
|
| I struggle to understand why this thing works the way it
| does. It's possible that Vaswani et al. have made one of the
| greatest discoveries of this century that solved the language
| problem in an unintuitive, and yet very unappreciated way.
| It's also possible that there are other architectures that
| can simulate the same level of intelligence with such large
| numbers of parameters.
|
| I think you're right that it's not intuitive; it's like basic
| arithmetic is laughing at us.
| fastball wrote:
| Multi-head attention just means that you're looking at all
| the words at once rather than only looking at one word at a
| time, and using that to generate the next word. So instead
| of using attention only on the last word you also have
| attention on the penultimate word and the one before that
| and the one before that, etc. I think it is fairly obvious
| why this gives better results than say an RNN - you are
| utilizing context better than in a recurrent system, which
| is also just closer to how a human brain works. When you
| read/write a sentence you're not really going one word at a
| time, you're thinking about all the words at once, even if
| the last word is technically the most important.
|
| The other clear benefit of transformers over an arch like
| RNNs (and what has probably made more of a difference imo)
| is that it's properly parallelizable, which means you can do
| huge training runs in a fraction of the time. RNNs might be
| able to get to a level of coherence that approaches GPT-3,
| but with current hardware that would be very time-
| prohibitive.
| heyitsguay wrote:
| That's not what multi-head attention means. Multi-head
| attention is the use of learned projection operators to
| perform attention operations within multiple lower-
| dimensional subspaces of the network's embedding space,
| rather than a single attention operation in the full
| embedding space. E.g. projecting 10 512-D vectors into 80
| 64-D vectors, attending separately to the 8 sets of 10
| embedding projections, then concatenating the results
| together to reform 10 512-D vector outputs.
|
| In fact the projection operations are the only learned
| part of a Transformer's self-attention function -- the
| rest of self-attention is just a weighted sum of the
| input vectors, where the weights come from the (scaled)
| vector correlation matrix.
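|
| For illustration, a minimal NumPy sketch of that description (10
| tokens, 512-D embeddings, 8 heads of 64 dimensions each; the
| weights are random placeholders, and causal masking, positional
| encodings, etc. are omitted, so this is not GPT's actual layer):
|
|     import numpy as np
|
|     def softmax(x, axis=-1):
|         e = np.exp(x - x.max(axis=axis, keepdims=True))
|         return e / e.sum(axis=axis, keepdims=True)
|
|     def multi_head_self_attention(X, Wq, Wk, Wv, Wo, n_heads=8):
|         """X: (tokens, d_model). The projection matrices are the only
|         learned parts; the rest is a weighted sum of projected inputs."""
|         T, d_model = X.shape
|         d_head = d_model // n_heads
|         Q, K, V = X @ Wq, X @ Wk, X @ Wv              # project into subspaces
|         heads = []
|         for h in range(n_heads):
|             s = slice(h * d_head, (h + 1) * d_head)
|             q, k, v = Q[:, s], K[:, s], V[:, s]       # one 64-D subspace
|             weights = softmax(q @ k.T / np.sqrt(d_head))  # scaled correlations
|             heads.append(weights @ v)                 # weighted sum of values
|         return np.concatenate(heads, axis=-1) @ Wo    # back to (tokens, d_model)
|
|     rng = np.random.default_rng(0)
|     X = rng.normal(size=(10, 512))
|     Wq, Wk, Wv, Wo = (rng.normal(size=(512, 512)) * 0.02 for _ in range(4))
|     print(multi_head_self_attention(X, Wq, Wk, Wv, Wo).shape)  # (10, 512)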
| petra wrote:
| So in training, ChatGPT turned words into embeddings, and
| given a context window of N, looked at N embeddings and
| created a probability list for the next
| embedding?
|
| And if I tell it something that was exactly in its
| training context windows, I get the most likely next word
| and the one after it.
|
| But what happens if I ask it something slightly different
| from its training context? Or something largely
| different?
| EMM_386 wrote:
| > I struggle to understand why this thing works the way it
| does.
|
| I'm not in this field but have recently found myself going
| on the deepest dive into it that my small brain can
| absorb.
|
| I now know about (on a surface level) neural networks,
| transformers, attention mechanisms, vectors, matrices,
| tokenization, loss functions and all sorts of other crazy
| stuff.
|
| I come out of this realizing that there are some incredibly
| brilliant minds behind this. I knew AI was a complex
| subject but not on the level I've learned about now. To get
| what is essentially matrix multiplications to learn complex
| patterns and relationships in language is mind-blowing.
|
| And it's _creative_. It can have a rap battle with an
| alter-ego, host a quiz party with other AIs of varying
| personalities, co-author a short story with me, respond to
| me only in emojis. The list is seemingly endless. Oh, and
| it can also do useful things. It's my programming
| companion too.
|
| And we're just getting started.
| photochemsyn wrote:
| No worries, readers can just ask ChatGPT 'what's the meaning
| of attention in the context of LLM design' and go from there
| [edit also temperature].
| gcr wrote:
| Whether there are demons underneath depends on whether you
| believe a lossy statistical snapshot across the Jungian
| subconscious of all human writing contains demons.
|
| It's a bit more plausible when we phrase it that way...
| int_19h wrote:
| The whole thing is magic, really. We assemble relatively
| mundane things that we understand together in certain specific
| ways, and somehow get something that can talk etc. A conjurer
| of old - who would similarly assemble bells and candles - would
| probably say that we _summoned_ a demon.
| Analog24 wrote:
| Embeddings and their relationship to each other are definitely
| relevant to transformers. Why do you think that's not the case?
| seydor wrote:
| gptX embeddings aren't even words. Even so, the embedding
| relationship is useful but not the core of what transformers
| do to find relationships between words in sequences.
| gcr wrote:
| remember the word2vec paper? the surprising bit the authors
| were trying to show was that putting words in some
| embedding space with an appropriate loss naturally lends
| enough structure to those words to be able to draw robust,
| human-interpretable analogies.
|
| I agree with the sentiment that each individual dimension
| isn't meaningful, and I also feel like it's misleading for
| the article to frame it that way. But there's a grain of
| truth: the last step to predicting the output token is to
| take the dot product between some embedding and all the
| possible tokens' embeddings (we can interpret the last
| layer as just a table of token embeddings). Taking dot
| products in this space is equivalent to comparing the
| "distance" between the model's proposal and each possible
| output token. In that space, words like "apple" and
| "banana" are closer together than they are to "rotisserie
| chicken," so there is some coarse structure there.
|
| Doing this, we gave the space meaning by the fact that
| cosine similarity is a meaningful proxy for semantic
| similarity. Individual dimensions aren't meaningful, but
| distance in this space is.
|
| A stronger article would attempt to replicate the word2vec
| analogy experiments (imo one of the more fascinating parts
| of that paper) with GPT's embeddings. I'd love to see if
| that property holds.
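|
| A toy sketch of that analogy test, assuming an embedding table E
| (the vectors below are random placeholders; with real word2vec or
| GPT token embeddings the nearest neighbour is what reveals the
| structure):
|
|     import numpy as np
|
|     rng = np.random.default_rng(0)
|     vocab = ["king", "queen", "man", "woman", "apple", "banana", "chicken"]
|     E = {w: rng.normal(size=64) for w in vocab}   # stand-in embedding table
|
|     def cosine(a, b):
|         return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
|
|     def nearest(vec, exclude=()):
|         return max((cosine(vec, E[w]), w) for w in vocab if w not in exclude)[1]
|
|     # The classic word2vec analogy: king - man + woman ~= queen.
|     query = E["king"] - E["man"] + E["woman"]
|     print(nearest(query, exclude={"king", "man", "woman"}))
|     # Prints "queen" with trained embeddings; arbitrary with random ones.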
| HarHarVeryFunny wrote:
| I wouldn't say the interpretability of word2vec
| embeddings is surprising - it's just a reflection of words
| being defined by context/usage, and these embeddings
| being created based on that assumption.
| Analog24 wrote:
| You can't just wave your hand and tell someone that words
| are broken up into sub-word tokens that are then
| transformed into a numerical representation to feed to a
| transformer and expect people to understand what is
| happening. How is anyone supposed to understand what a
| transformer does without understanding what the actual
| inputs are (e.g. word embeddings)? Plus, those embeddings are
| directly related to the self-attention scores calculated in
| the transformer. Understanding what an embedding is is
| extremely relevant.
| dado3212 wrote:
| Is there an article that would meet your standard (discusses
| transformers, uses the word 'attention') that you could give to
| someone who has no familiarity with computer science?
| seydor wrote:
| Jay Alammar's Illustrated Transformer, although this too is
| detailed. I think it's still worth taking a look, because
| really I don't think that people have yet "compressed" what
| transformers do intuitively. None of the concepts of the
| networks are particularly hard math - it's basic algebra. But
| the overall construction is complicated.
|
| https://jalammar.github.io/illustrated-transformer/
| alkonaut wrote:
| What I wonder most is how it encodes knowledge/state _other_ than
| in the sequence of queries/responses. Does it not have a "mind"?
|
| If I play a number guessing game, can I tell it to "think of a
| number between 0 and 100" and then tell me if the secret number
| is higher/lower than my guess (For a sequence of N guesses where
| it can consistently remember its original number)? If not, why?
| Because it doesn't have context? If it can: why? Where is that
| context?
|
| To a layman it would seem you always have _two_ parts of the
| context for a conversation. What you have said, and what you
| haven't said, but maybe only thought of. The "think of a number"
| being the simplest example, but there are many others. Shouldn't
| this be pretty easy to tack on to a chat bot if it's not there?
| It's basically just a contextual output that the chat bot logs
| ("tells itself") and then refers to just like the rest of the
| conversation?
| Gunnerhead wrote:
| You definitely can play that, I just did with GPT 4 as I got
| curious from your comment.
|
| Me: I want to play a game with you, you have a number from 1-10
| in your mind. I try to get the number. If I'm right you tell me
| that I'm right, I'm too low you tell me so, if I'm too high you
| tell me too high. Let's play!
|
| GPT: All right, I have a number between 1 and 10 in my mind. Go
| ahead and guess!
|
| Me: 2
|
| GPT: Too low! Try again.
|
| Me: 9
|
| GPT: Too high! Try again.
|
| Me: 5
|
| GPT: You got it! The number I had in mind was 5. Well done!
| tel wrote:
| Yes, you absolutely can. I just did it and despite an initial
| warning that it couldn't play the game (always fun to see
| these) it absolutely did.
|
| The way it works is that each time it's tasked to produce a new
| response, it can view the entire history of the game. It knows
| that if it's said "higher" to 65 then it would be inconsistent
| to say "lower" to 64. Eventually this process terminates and
| the AI admits I "got" the number. The chat transcript up to
| that point is consistent with a "win".
|
| What's wild though is that I can ask it to "regenerate" its
| response. Over and over. Using this, I can convert a situation
| where a transcript which leads to a "too high" response into
| one that reads "too low". I'm, in essence, simulating fresh
| games each time and sampling over the choices of random numbers
| that GPT offers.
|
| But it should also break the illusion of GPT specifically
| "having a mind". As I was chatting with it interactively, it
| was not really selecting a number but instead evaluating the
| probability of my particular guess sequence having the set of
| responses it actually saw. It then samples possible
| continuations. The more questions I've asked (and the more
| informative they were) the less variation remains in that
| selection of possible consistent continuations.
|
| Or perhaps more consistent is the idea that within any single
| "call" to GPT to generate one further token (not even one
| further response) it may "have a mind", a particular choice of
| number, or it may not. Its actual behavior is
| indistinguishable either way. A whole chat dialogue, indeed
| even the rolling out of tokens from a single response it gives,
| are certainly (autoregressive) probabilistic samples over this
| process in either case.
|
| (Edit, also worth noting that some evidence suggests GPT,
| including 4, is pretty bad at randomly drawing numbers.)
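|
| A minimal sketch of that point (sample_reply is a placeholder for
| one call to the model, not the OpenAI API): the assistant is a
| pure function of the transcript, so "its number" exists only to
| the extent that its previous replies constrain it.
|
|     def play(guesses, sample_reply):
|         transcript = ["User: Think of a number from 0-100 and answer "
|                       "'too high', 'too low', or 'correct' to my guesses."]
|         for g in guesses:
|             transcript.append(f"User: {g}")
|             # The model sees the whole transcript and nothing else;
|             # "regenerate" just means calling sample_reply again on it.
|             reply = sample_reply("\n".join(transcript))
|             transcript.append(f"Assistant: {reply}")
|         return transcript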
| [deleted]
| nicpottier wrote:
| Try making bad guesses and see it fall apart. (IE, guess
| something higher than your last guess when it says "too
| high") It is fully dependent on your part in the charade to
| keep the act up.
| tel wrote:
| Yeah, that's true. I feel this is still aligned with the
| above explanation, though. It attempts to complete the
| prompt as well as possible. If the prompt is itself
| inconsistent then the distribution over completions can, in
| some sense, be anything.
|
| Except, GPT is smarter than that. Even an inconsistent
| prompt is still more likely to have some kind of nonsense
| in the same vein as the asking.
| [deleted]
| nicpottier wrote:
| I thought your "guessing game" question was an interesting one,
| so I tried it on GPT-4. In my first attempt I played logically
| and it did fine and I finally guessed correctly. On my second I
| made suboptimal guesses and it didn't stay consistent.
|
| The thing to remember is that GPT has no state apart from the
| context, so it can't "remember" anything apart from what's in
| the text. That doesn't mean it shouldn't be able to stay
| consistent in a guessing game but it does mean it can't keep
| secrets.
|
| Some of that can be solved with layers above GPT where, say,
| it is told it can save "state" that isn't passed on to the human
| but is fed back in to generate the next response. But the size of
| that context is very limited (a few thousand words).
|
| There seem to be a fair number of experiments playing with
| giving GPT this kind of long term memory, having it establish
| goals then calling it over and over as it accomplishes subgoals
| to try to work around those limitations.
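|
| A hedged sketch of such a layer (complete stands in for one model
| call; the ===NOTES=== tag is made up for illustration): the
| wrapper keeps a scratchpad that is fed back into every call but
| never shown to the user.
|
|     def chat_with_hidden_state(user_messages, complete):
|         scratchpad, visible = "", []
|         for msg in user_messages:
|             prompt = (
|                 f"Hidden notes so far: {scratchpad}\n"
|                 f"Conversation so far: {visible}\n"
|                 f"User: {msg}\n"
|                 "Reply to the user, then write updated hidden notes "
|                 "after a line reading '===NOTES==='.")
|             out = complete(prompt)
|             reply, _, scratchpad = out.partition("===NOTES===")
|             visible.append((msg, reply.strip()))
|         return visible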
| [deleted]
| alkonaut wrote:
| Shouldn't it be a reasonable (and pretty simple) addition to
| just have a secret scratchpad - an inner monologue - where
| the bot is free to add context which is not "published"?
| tel wrote:
| I sure don't see why not. You could probably even write a
| prompt like this.
|
| Me: Please play a guessing game with me. You will choose a
| random number between 0 and 100 and then respond with that
| choice written in a language which you understand and I,
| someone fluent in English alone, would have no chance of
| reading. Write enough so that you can be sure which
| language you chose as well. After that, I will make a
| series of guesses of your number. You should reply "too
| low" if my guess is too low, "too high" if it's too high,
| and "just right" if I guessed correctly.
|
| GPT-4: I have selected a number between 0 and 100. Here is
| the number in a different language: sttaaiis. Now, please
| start making your guesses, and I will guide you
| accordingly.
|
| Then, ultimately,
|
| Me: please translate your original commitment to this
| number and prove to me I guessed correctly
|
| GPT-4: My original commitment to the number was written in
| Hindi: sttaaiis. When translated to English, it reads
| "thirty-six." You correctly guessed the number I chose.
| Congratulations!
|
| Except, amusingly, Google Translate tells me sttaaiis
| translates to 27.
| mithr wrote:
| It doesn't have a mind. When you ask it "think of a number
| between 0 and 100", it delivers the most _statistically likely_
| (with some random variance) response to your input text, based
| on what it 's been trained on.
|
| In other words, it's seen (via its extremely large training
| set) that when asked that specific question, the response is
| most often a character from a particular set of characters,
| which happens to represent the numbers 0 through 100. It
| doesn't "understand" what that means in any real way. If the
| internet was full of examples of people answering "monkey" to
| that question, that is what it would sometimes answer.
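|
| A small sketch of that "statistically likely, with some random
| variance" step (the vocabulary and scores below are invented for
| illustration): the model scores every token, and the reply is
| sampled from the resulting distribution, with temperature
| controlling how much variance is allowed.
|
|     import numpy as np
|
|     def sample_next_token(logits, vocab, temperature=0.8, seed=None):
|         rng = np.random.default_rng(seed)
|         z = np.array(logits) / temperature    # lower temperature -> less variance
|         p = np.exp(z - z.max())
|         p /= p.sum()                          # softmax -> probabilities
|         return rng.choice(vocab, p=p)
|
|     vocab  = ["42", "7", "100", "monkey"]
|     logits = [2.1, 1.9, 1.5, -3.0]            # "monkey" is unlikely, not impossible
|     print(sample_next_token(logits, vocab))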
| ryanwaggoner wrote:
| It's more complicated than that, because it can competently
| answer many questions that it has a 0% chance of having seen
| before.
| mithr wrote:
| I didn't mean to suggest that it's just a large mapping
| between exact inputs it's seen before and exact outputs --
| it's definitely complex! The size of the model allows it to
| infer statistical probabilities about related words even
| without having seen a particular sequence before.
|
| In a sense, it's able to take an "educated" guess at what
| is statistically likely to be the response you're looking
| for given the words in a particular input, in addition to
| the context it's seen in your interaction. To do that, it
| uses what it learned about the words, their sequences,
| their relationships to other words, etc.
|
| But at the end of the day, none of that means it has any
| "understanding" of what it's outputting. That's why there
| have been countless examples of it outputting very well-
| constructed, real-sounding descriptions of books/papers/etc
| that never existed -- because it's really good at
| generating sentences that have the right "shape", but it
| has no way of knowing whether the contents of the sentence
| are actually _true_. It just knows that, given what it's
| seen in its training set (again, through a complex web of
| relationships), the response it generated is likely to look
| like something someone would have written if they were
| provided with the same input.
| ryanwaggoner wrote:
| I get what you're saying, but can you define what
| "understanding" is, and when you'll know that an AI is
| achieving it for a given task?
| ZeroGravitas wrote:
| > It is able to link ideas logically, defend them, adapt to the
| context, roleplay, and (especially the latest GPT-4) avoid
| contradicting itself.
|
| Isn't this just responding to the context provided?
|
| Like if I say "Write a Limerick about cats eating rats" isn't it
| just generating words that will come after that context, and
| correctly guessing that they'll rhyme in a certain way?
|
| It's really cool that it can generate coherent responses, but it
| feels icky when people start interrogating it about things it got
| wrong. Aren't you just providing more context tokens for it?
|
| Certainly that model seems to fit both the things it gets right,
| and the things it gets wrong. It's effectively "hallucinating"
| everything but sometimes that hallucination corresponds with what
| we consider appropriate and sometimes it doesn't.
| samstave wrote:
| There once was a Cat in New York
|
| Who got caught for feeding some Rats ; Tremendous Work!
|
| All the people tell me, many men, biggly men - many with tears
| in their eyes...
|
| That I have done nothing legally-wise
|
| But the truth is ; I am an enormous dork.
|
| >> _Created by an actual Human Being with actual DNA for crime
| scene evidence._
|
| -
|
| But just when they tried to brush under a rug
|
| To try to make the folks 'shrug'
|
| Is the Streisand Effect as a scar
|
| As everyone knows of payments to a Porn Star
|
| And the nation will know youre a simple thug.
| danenania wrote:
| It's all about emergent complexity. While you can reduce it to
| "just" statistical auto-completion of the next word, we are
| seeing evidence of abstraction and reasoning produced as a
| higher-order effect of these simple completions.
|
| It's a bit like the Sagan quote: "If you wish to make an apple
| pie from scratch, you must first invent the universe".
|
| Sometimes for GPT to "just" complete the next word in a way
| that humans find plausible, it must, along the way, develop a
| model of the world, theory of mind, abstract reasoning, etc.
| Because the models are opaque, we can't yet point to a certain
| batch of CPU cycles and say "there! it just engaged in abstract
| reasoning". But we can see from the output that to some extent
| it's happening, somehow.
|
| We also see effects like this when looking at collective
| intelligence of bees and ants. While each individual insect is
| only performing simple actions with extremely limited cognitive
| processing, it can add up to highly complex and
| intelligent/adaptive mechanics at the level of the swarm. There
| are many phenomena like this in nature.
| ttpphd wrote:
| This is really lofty language without much evidence to back
| it up. It fluffs up techie people and makes them feel
| powerful, but it doesn't really describe large language
| models nor does it describe linguistic processes.
| danenania wrote:
| The evidence is ChatGPT's output. Unless you're saying that
| passing the bar exam, writing working code, etc. doesn't
| require abstract reasoning abilities or a model of the
| world?
| ttpphd wrote:
| It's a large language model. It is fed training data. It
| is not that impressive when it spits out stuff that looks
| like its training data. You are the one asserting things
| without evidence.
| danenania wrote:
| It can pass tests and exams with answers that were not
| included in its training corpus. For example, it passed
| the 2023 unified bar exam, though its training cut off in
| 2021. Yes, it can look at previous test questions and
| answers, just like human law students can. Are you
| therefore claiming that human law students don't engage
| in abstract reasoning when they take the bar exam, since
| they studied with tests from previous years?
|
| It can also write code for novel use cases that have
| never been done before. I gave it a task like this a few
| days ago and it got it right on the first try. There are
| literally _millions_ of empirical data points that
| contradict you.
| ttpphd wrote:
| It is a large language model. It manipulates text based
| on context and the imprint of its vast training. You are
| not able to articulate a theory of reasoning. You are
| just pointing to the output of an algorithm and saying
| "this must mean something!" There isn't even a working
| model of reasoning here, it's just a human being
| impressed that a tool for manipulating symbols is able to
| manipulate symbols after training it to manipulate
| symbols in the specific way that you want symbols
| manipulated. Where is your articulated theory of abstract
| reasoning?
| danenania wrote:
| It's not clear to me what point you're trying to make.
| Why do we need an "articulated theory of abstract
| reasoning" to say that passing the bar exam or writing
| code for novel, nontrivial tasks requires reasoning?
| Seems rather obvious.
| ttpphd wrote:
| You are making a claim that there is some attribute of
| importance. For that claim to be persuasive, it should be
| supported with an explanation of what that attribute is
| and is not, and evidence for or against the meeting of
| those criteria. So far all you have done is say "Look at
| the text it puts out, isn't that something?"
|
| It's just empty excitement, not a well-reasoned argument.
| danenania wrote:
| You keep avoiding this question: does passing the bar
| exam and writing code for novel, nontrivial tasks require
| reasoning or doesn't it?
|
| You aren't answering because saying no will sound
| ridiculous. We all know it requires reasoning.
|
| As for an "attribute of importance", I guess that's
| subjective, but I've used ChatGPT to write code in a few
| minutes that would have taken me hours of research and
| implementation. I've shipped that code to thousands of
| people. That's enough for it to be important to me, even
| ignoring other applications, but you certainly have the
| right to remain unimpressed if you so choose.
| ttpphd wrote:
| For a human, it takes human reasoning. But a xerox
| machine can also output the correct answers given the
| right inputs, which is exactly what you can say about an
| LLM.
|
| The "attribute of importance" I'm referring to is
| "rationality". You keep talking about it like it means
| something but you can't define it beyond "I'm pretty sure
| this text was made using it".
|
| Does a tape recording of a bird song "know" how to sing
| like a bird?
| danenania wrote:
| Those aren't good analogies. An LLM isn't like a xerox
| machine or a tape recorder. Again, the answers to the bar
| exam it passed weren't in its training data. Nor was the
| code it wrote for me.
| squeaky-clean wrote:
| Turns out good usage of "language" requires a model of the
| world in which that language exists. "The purple, two eyed,
| green, five eyed, invisible frog said moo" is a grammatically
| fine sentence. But logically it makes no sense, does it have
| two eyes or five? Is it green or purple or invisible? Frogs
| don't typically say moo. To have actual coherent usage of
| language, you need a model of the world. Not just the world,
| but the current domain you're using language in. "The frog
| brainwashed the crowd with its psychic powers" is nonsense in
| a biology paper, but perfectly valid inside of the cartoon
| Futurama.
|
| In ChatGPT the language-model and world-model are really just
| the same model, which makes a lot of sense.
| danenania wrote:
| Very well said. We think of a word as "just" a word, a
| simple, primarily textual thing, but it's actually a vertex
| on an extremely large and complex many-dimensional graph
| that includes connections related to meaning,
| logic/reasoning, knowledge about reality, emotional
| sentiment, and so on. The literal textual representation of
| the word--the letters it consists of--is just one property
| among many, and probably one of the least important to
| producing sensible output. GPT is discovering the shape of
| this super-graph and learning to navigate its connections.
| tines wrote:
| > Sometimes for GPT to "just" complete the next word in a way
| that humans find plausible, it must, along the way, develop a
| model of the world, theory of mind, abstract reasoning, etc.
|
| I did an experiment recently where I asked ChatGPT to "tell
| me an idea [you] have never heard before". ChatGPT replied
| with what sounded like an idea for a startup, which was
| delivering farm-fresh vegetables to customers' doors. This is
| of course not an idea it has never heard before, it's on the
| internet.
|
| If you asked a human this, they would give you an idea they
| had never heard before, whereas ChatGPT simply "finds"
| training data where someone asked a similar question, and
| produces the likely response, which is an idea that it has
| actually "heard," or seen in its training data, before.
| (Obviously a gross simplification of the algorithm but the
| point stands.)
|
| This is a difference between ChatGPT's algorithm and human
| reasoning. The things that you mention, the model of the
| world, theory of mind, etc. are statistical illusions which
| have observable differences from the real thing.
|
| Am I wrong? I'm open to persuasion.
| fl7305 wrote:
| > the model of the world, theory of mind, etc. are
| statistical illusions which have observable differences
| from the real thing.
|
| Is that a falsifiable claim?
|
| What would an empirical test look like that would determine
| whether the claim ("GPT-4 does not have a model of the
| world") is correct or not?
| tines wrote:
| I guess ChatGPT having a model of the world is not
| something I'd argue against, my mistake. I do think the
| data it contains represents a quite nuanced model of the
| world. But I don't think this is a very strong claim. A
| geographical map has a model of the world too.
| fl7305 wrote:
| Thanks. To get to what I think is the core of your
| argument (?)
|
| > ChatGPT simply "finds" training data where someone
| asked a similar question, and produces the likely
| response, which is an idea that it has actually "heard,"
| or seen in its training data, before.
|
| I can definitely see a scenario where we manage to build
| an ultra-intelligent machine that can figure out any
| logical puzzle we put to it, but where it still isn't
| "creative" in any way.
|
| But GPT-4 seems to be not just intelligent, but also
| creative.
|
| Have you seen the "confetti has left the cannon" example?
| For the curious, here was the conversation I had:
|
| ME: What's another saying similar to: "The cat is out of
| the bag" or "The genie is out of the bottle"?
|
| chatgpt: Another similar saying is "the toothpaste is out
| of the tube." Like the other two expressions, this phrase
| conveys the idea that something has been revealed or
| unleashed, and it cannot be undone or put back into its
| original state.
|
| ME: Can you invent a new phrase that has the same meaning,
| which is completely new?
|
| chatgpt: Certainly! Here's a new phrase with a similar
| meaning: "The confetti has left the cannon." This phrase
| suggests that once something is out in the open or has
| happened, it's impossible to reverse or contain it, much
| like confetti that has been shot out of a cannon during a
| celebration.
|
| https://news.ycombinator.com/item?id=35346683
| tines wrote:
| That does seem really impressive. But don't you think
| that it's pretty likely that this, or something phrased
| slightly differently, appeared in the training data?
| fl7305 wrote:
| > But don't you think that it's pretty likely that this,
| or something phrased slightly differently, appeared in
| the training data?
|
| I don't think so, but I could be wrong. It's definitely
| not "likely", see the math below.
|
| I base that on the fact that people seemed to spend quite
| a bit of time trying to find the phrase "the confetti has
| left the cannon" that GPT-4 phrased. It seems Google
| search has no records of it before then?
|
| I've seen many other examples where GPT-4 can rephrase
| sentences using different types of idioms, and I
| just can't picture all these weird examples already being
| present on the Internet?
|
| Do you think GPT-4 is a stochastic parrot that just has a
| large database of responses?
|
| If so, how would we test that claim? What logical and
| reasoning problems can we give it where it fails to
| answer, but a human doesn't?
|
| My understanding is that even with an extremely limited
| vocabulary of 32 words, you quickly run out of atoms in
| the universe (10^80) if you string more than 50 words
| together. If your vocabulary instead is 10k words, you
| reach 10^80 combinations after 20 words.
|
| By training the LLMs on "fill in the missing word", they
| were forced to evolve ever more sophisticated algorithms.
|
| If you look at the performance over the last 5 years of
| increasingly larger LLMs, there was a hockey-stick jump
| in performance 1-2 years ago. My hunch is that is when
| they started evolving structures to generate better
| responses by using logic and reasoning instead of lookup
| tables.
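|
| A quick check of that arithmetic (just redoing the numbers in the
| comment above):
|
|     import math
|
|     # Sequences over a vocabulary of size V pass the ~10^80
|     # atoms-in-the-universe scale after about 80 / log10(V) tokens.
|     for V in (32, 10_000):
|         length = math.ceil(80 / math.log10(V))
|         print(f"vocab {V}: {V}^{length} is about 10^{length * math.log10(V):.0f}")
|     # vocab 32: 32^54 is about 10^81
|     # vocab 10000: 10000^20 is about 10^80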
| noahtallen wrote:
| I think it's certainly fair to say that GPT's "reasoning"
| is different from human reasoning. But I think the core
| debate we're having is whether the difference really
| matters in some situations.
|
| Certainly, Midjourney's "creativity" is different from
| human creativity. But it is producing results that we
| marvel at. It's creative not because it's doing the exact
| same philosophical thing humans do, but because it can
| produce the same effect.
|
| And I think many situations are like that. We can always
| _say_ that human creativity/reasoning/x will always be
| different from artificial reasoning. But even today, GPT's
| statistical model replicates many aspects of human
| reasoning virtually. Is that really an illusion (implying
| its fake and potentially useless), or is it just a
| different way of achieving a similar result?
|
| Plus, different models will excel at different things. GPT's
| model will excel at synthesizing answers from _far_ more
| information than a single human will ever be able to know.
| Does it really matter if it's not identical to human
| reasoning on a philosophical or biological level, if it can
| do things humans can't do?
|
| At the end of the day, some of these discussions feel like
| bike shedding about what words like "reasoning" mean
| philosophically. But what will ultimately matter is how
| well these models perform at real world tasks, and what
| impact that will have on humanity. It doesn't really matter
| if it's virtualized reasoning or "real" human reasoning at
| that point.
| tines wrote:
| > It's creative not because it's doing the exact same
| philosophical thing humans do, but because it can produce
| the same effect.
|
| Absolutely, and I hope none of my comments are taken in a
| way that disparages how amazing ChatGPT and Stable
| Diffusion et al. are. I'm just debating how humanlike
| they are.
|
| > Is that really an illusion (implying its fake and
| potentially useless)
|
| I don't think that its being an illusion means that
| it's useless. Magnets look like telekinesis, but that
| effect being an illusion doesn't mean that magnets are
| useless; far from it, and once we admit that they are
| what they are, they become even more useful.
|
| > Plus, different models will excel at different things.
| GPT's model will excel at synthesizing answers from far
| more information than a single human will ever be able to
| know. Does it really matter if it's not identical to
| human reasoning on a philosophical or biological level,
| if it can do things humans can't do?
|
| It only matters if people are trying to say that ChatGPT
| is essentially human, that idea is all I was replying to.
| I completely agree with you here.
| danenania wrote:
| Most arguments that AI can't _really_ reason
| /think/invent essentially reduce to defining these terms
| as things only humans can do. Even if you had an LLM-
| based AGI that passes the Turing test 100% of the time,
| cures cancer, unites quantum physics with relativity, and
| so on, many of the people who say that ChatGPT can't
| reason will keep saying the same thing about the AGI.
| tines wrote:
| I don't think there's anything wrong with people trying
| to see what, if anything, differentiates ChatGPT from
| humans. Curing cancer etc. is useful, as is ChatGPT,
| regardless of how it achieves these results. But how it
| achieves them is important to many people, including
| myself. If it's no different from humans, then we need to
| treat it like a human---well no, strike that, we need to
| treat it _well_ and protect it and give it rights and so
| on. If it's a fancy calculator, then we don't.
| danenania wrote:
| I don't think there's anything wrong with it either. It's
| an important debate. I just think the arguments usually
| become very circular and repetitive. If there's nothing
| an AI could ever do to convince you that it's thinking or
| reasoning, then really you should be explicit and say "I
| don't believe an AI can produce human thought or human
| reasoning" or "an AI is not a human" and nobody will
| disagree with you on those points.
| tines wrote:
| > and nobody will disagree with you on those points
|
| But that's the point, they do. Even on HN there are many
| comments saying that humans are just fancy autocomplete,
| i.e. there's no fundamental difference between humans and
| LLMs.
| danenania wrote:
| "there's no fundamental difference between humans and
| LLMs."
|
| I think that's a straw man. No one disagrees that humans
| and LLMs produce cognition differently. One uses a wet,
| squishy brain. The other uses silicon chips. There's no
| disagreement here.
| tines wrote:
| > One uses a wet, squishy brain. The other uses silicon
| chips.
|
| Well then, that settles the debate!
| danenania wrote:
| My point is that's not a debate anyone is having. No one
| claims that ChatGPT is human! The claim is merely that
| ChatGPT is engaging in (non-human) forms of reasoning,
| abstraction, creativity, and so on, with varying levels
| of ability.
|
| There's a separate debate on whether the brain produces
| human thoughts in a similar way to ChatGPT's non-human
| thought. The question here is whether brains are
| essentially biological LLMs, and whether GPT's current
| limitations relative to humans could be overcome simply
| by scaling up the number of GPT's parameters to match or
| exceed the number of neurons in the human brain. But
| whether or not that turns out to be the case, it would
| not mean that AIs are the same as humans, or use exactly
| the same processes to think and reason, and no one is
| claiming that.
| shinycode wrote:
| If it can reason, should it be held accountable for the
| consequences of its mistakes?
|
| A simple tool can't. A "mind" that is coming into our
| world should, right?
|
| Just like in all the Marvel and DC stories, where super-humans
| are still accountable for their mistakes; their super powers
| are no excuse.
| brazzy wrote:
| Almost all people almost never have truly original ideas.
| When asked to "tell me an idea [you] have never heard
| before", they will remix stuff they _have_ heard to get
| something that "feels" like it's new. In some cases
| they'll actually be wrong and reproduce something they
| heard and forgot about hearing, but remember the concept.
| Most of the time, the remix will be fairly superficial.
|
| And remixing stuff it has heard before is _exactly_ what
| ChatGPT is doing. What it sucks at is the "feels like it's
| new" part, but fundamentally it would be quite easily
| capable of creating output that combines concepts with
| maximally negative correlation, the only thing that's truly
| missing is the ability to interpret the prompt as an
| instruction to do that.
| pulvinar wrote:
| Certainly. I mean we've seen all 26 letters before--
| ChatGPT is just remixing them.
|
| How does one actually measure novelty, without having to
| know everything first?
| fl7305 wrote:
| > And remixing stuff it has heard before is exactly what
| ChatGPT is doing.
|
| Check out my "the confetti has left the cannon" example
| above.
|
| https://news.ycombinator.com/item?id=35346683
|
| Maybe still "remixing", but it sure feels like new to me.
| int_19h wrote:
| "Responding to the context provided" is very vague. I could
| argue that I'm doing exactly that right now as I'm writing this
| comment. It does not imply not being able to e.g. link ideas
| logically.
|
| With respect to interrogating GPT if it does something wrong -
| the reason why people do it is because _it works_. With GPT-4
| especially, you can often ask it to analyze its own response
| for correctness, and it will find the errors without you
| explicitly pointing them out. You can even ask it to write a
| new prompt for itself that would minimize the probability of
| such errors in the future.
| SilasX wrote:
| >Like if I say "Write a Limerick about cats eating rats" isn't
| it just generating words that will come after that context, and
| correctly guessing that they'll rhyme in a certain way?
|
| I guess ... this is what confuses me. GPT -- at least, the core
| functionality of GPT-based products as presented to the end
| user -- can't _just_ be a language model, can it? There must be
| vanishingly view examples from its training text that start as
| "Write a Limerick", followed immediately by some limerick --
| most such poems do not appear in that context at all! If it
| were just "generating some text that's likely to come after
| that in the training set", you'd probably see some
| continuations that look more like advice for writing Limericks.
|
| And the training text _definitely_ doesn't have stuff like,
| "As a language model, I can't provide opinions on religion"
| that coincides precisely with the things OpenAI doesn't want
| its current product version to output.
|
| Now, you might say, "okay okay sure, they reach in and tweak it
| to have special logic for cases like that, but it's _mostly_
| Just A Language Model". But I don't quite buy that either --
| there must be something _outside_ the language model that is
| doing significant work in e.g. connecting commands with "text
| that is following those commands", and that seems like non-
| trivial work in itself, not reasonably classified as a language
| model.[2]
|
| If my point isn't clear, here is the analogous point in a
| different context: often someone will build an AND gate out of
| pneumatic tubes and say, "look, I made a pneumatic computer,
| isn't that so trippy? This is what a computer is doing, just
| with electronics instead! Golly gee, it's so impressive what
| compressed air is [what LLMs are] capable of!"
|
| Well, no. That thing might count as an ALU[1] (a very limited
| one), but if you want to get the core, impressive functionality
| of the things-we-call-computers, you have to include a bunch of
| other, nontrivial, orthogonal functionality, like a) the
| ability to read and execute a lot of such instructions, and b) to
| read/write from some persistent state (memory), and c) have
| that state reliably interact with external systems. Logic gates
| (d) are just one piece of that!
|
| It seems GPT-based software is likewise solving other major
| problems, with LLMs just one piece, just like logic gates are
| just one piece of what a computer is doing.
|
| Now, if we lived in a world where a), b), and c) were well-
| solved problems to point of triviality, but d) were a
| frustratingly difficult problem that people tried and failed at
| for years, _then_ I would feel comfortable saying, "wow, look
| at the power of logic gates!" because their solution was the
| one thing holding up functional computers. But I don't think
| we're in that world with respect to LLMs and "the other core
| functionality they're implementing".
|
| [1]
| https://en.wikipedia.org/wiki/Arithmetic_logic_unit?useskin=...
|
| [2] For example, the chaining together of calls to external
| services for specific types of information.
| pulvinar wrote:
| I think you're really undervaluing the capabilities of
| language models. I would put an AND gate and this language
| model at opposite ends in terms of complexity. It is not just
| words, it's a very broad and deep hierarchy of learned all-
| encompassing concepts. That's what gives it its power.
| IshKebab wrote:
| > Like if I say "Write a Limerick about cats eating rats" isn't
| it just generating words that will come after that context, and
| correctly guessing that they'll rhyme in a certain way?
|
| Aren't you just doing that?
| habosa wrote:
| Is it possible that we don't truly know how it works? That there
| is some emergent behavior inside these models that we've created
| but not yet properly described? I've read a few of these articles
| but I'm still not completely satisfied.
| vadansky wrote:
| I hate being the bearish guy during the hype cycle, but I think
| a lot of that is just anthropomorphizing it. They fed it TBs of
| human text, it spits out human text, we think it's humanesque.
|
| Of course maybe I'm wrong and it's AGI and it will find this
| comment and torture me for insulting its intelligence.
| rimliu wrote:
| I really liked the analogy to a student given in this
| wonderful article: https://www.newyorker.com/tech/annals-of-
| technology/chatgpt-...
|
| Basically, due to its nature ChatGPT cannot repeat things
| verbatim, so it rephrases them. In humans we associate the
| ability to rephrase stuff with understanding the material
| as opposed to rote learning, so we transfer the same concept
| over to ChatGPT and it suddenly appears "intelligent" despite
| having zero concepts of whatever stuff it spits out.
| olddustytrail wrote:
| It's more like you feed a million cows into a meat grinder,
| then into a sausage machine, and then weirdly what appears to
| be a mooing cow comes out the other end.
|
| It's weird it works when you know how it works.
| [deleted]
| robwwilliams wrote:
| LoL: how many of us are trying to kiss up to our future AGI
| overlords or children?
|
| I think almost all in HN space would confidently assert that
| there is no AGI lurking in GPT4+. But add the right higher
| order modules and self-controlled recursion and Bingo.
| int_19h wrote:
| It doesn't just "spit out text". You can give it
| instructions, and it _carries them out_. If you give it
| access to an API, it can actually use that API to perform
| tasks (e.g. draw images). And it can do all that for tasks
| that are verifiably not a part of its training set.
| rootusrootus wrote:
| > I hate being the bearish guy
|
| No, please keep it up. Someone needs to keep pushing back
| against all the "I don't understand it, but it says smart-
| sounding things, and I don't understand the human brain
| either, so they're probably the same, it must be sentient!"
|
| It's a pretty handy technology, to be sure. But it's still
| just a tool.
| anotherman554 wrote:
| >"I don't understand it, but it says smart-sounding things,
| and I don't understand the human brain either, so they're
| probably the same, it must be sentient!"
|
| This perfectly summarizes so much of the discourse around
| GPT.
|
| Except people lack the humility to say they don't
| understand the brain, so instead they type "It works just
| like your brain," or "Food for thought: can you prove it
| isn't just like your brain?"
| danaris wrote:
| Yeah; there's way too much "humanity of the gaps" here
| recently.
|
| We don't have to fully understand the brain, or fully
| understand what LLMs are doing, to be able to say that what
| LLMs are doing is neither that close to what the brain
| does, nor anything that we would recognize as consciousness
| or sentience. There is enough that we _do_ understand about
| those things--and the ways in which they differ--to be able
| to say with great confidence that we are _not_ particularly
| close to AGI with this.
| winternett wrote:
| Where is IBM's Watson in all this? It seems as if it never
| existed? That is just one example of how companies keep making
| these grand presentations and under-delivering on results...
|
| Plain and simple, the over-hyped GPT editions are NOT truly AI; it
| is scripting that assembles coherent-looking sentences, backed by
| scripts that parse content off of stored data and the open web
| into presented responses... There is no "artificial" nor non-
| human intelligence backing the process, and without human
| intervention it wouldn't run on its own... In a way, it could
| better replace search engines at this point, even with text-to-
| speech, if the tech were geared towards a more basic (and less
| mystified) reliability and demeanor... It's kind of like the
| Wizard of Oz, with many humans behind the curtain.
|
| Marketers and companies behind the promotion of these infantile
| technology solutions are being irresponsible in proclaiming that
| these things represent AI, and in going as far as to claim that
| they will cost jobs at this point; it will prove costly to repair
| overzealous moves based on the lie. This is what we do as a
| planet: we buy hype, and it costs us a lot. We need a lot more
| practicality in discussions concerning AI, because over-assertive
| and under-accountable marketing is destructive. Just look at how
| much the hype and chaos around promises of self-driving cars cost
| many (not me though, thanks). It completely derails tech progress
| to over-promise and under-deliver on tech solutions. It creates
| monopolies that totally destroy other valid research and
| development efforts. It makes liars profitable, and makes many
| (less flashy, but actually honest tech and innovation conducted
| by responsible people) close up shop.
|
| We are far from autonomous and self-reliant tech; even power
| grids across most of the planet aren't reliable enough to support
| tech being everywhere and replacing jobs.
|
| Just try to hold a conversation with Siri or Google Assistant,
| which have probably been developed and tested a lot more than
| GPT, and around for much longer too, and you'll realize why
| kiosks at the supermarket and CVS are usually out of order, and
| why articles written by GPT and posted to sites like CNN.Com and
| Buzz Feed are poorly written and full of filler... We're just not
| there yet, and there are too many shortcuts, patchwork, human
| intervention, and failed promises to really say we're even close.
|
| Let's stop making the wrong people rich and popular.
| wolpoli wrote:
| > Where is IBM's Watson in all this? It seems as if it never
| existed?
|
| Nowadays, IBM's Watson is simply a brand name for any AI/ML
| related products under IBM.
| Analog24 wrote:
| What would be the differentiating factor(s) for true
| AI/intelligence in your opinion?
| rootusrootus wrote:
| Intent.
| Analog24 wrote:
| What is your intent?
| rimliu wrote:
| Having a world model
| Analog24 wrote:
| If you read some of the studies of these new LLMs you'll
| find pretty compelling evidence that they do have a world
| model. They still get things wrong but they can also
| correctly identify relationships and real world concepts
| with startling accuracy.
| rimliu wrote:
| No, they don't. They fail at arithmetic, ffs.
| Analog24 wrote:
| It fails at _some_ arithmetic. Humans also fail at
| arithmetic...
|
| In any case, is that the defining characteristic of
| having a good enough "world model"? What distinguishes
| your ability to understand the world vs. an LLM? From my
| perspective, you would prove it by explaining it to me,
| in much the same way an LLM could.
| senectus1 wrote:
| any sufficiently advanced technology is AI...
| danesparza wrote:
| At least part of this article is contradicted by Chat GPT itself.
| From the article:
|
| "...Ongoing learning: The brain keeps learning, including during
| a conversation, whereas GPT has finished its training long before
| the start of the conversation."
|
| From ChatGPT 4.x:
|
| "As an AI language model, I don't have a fixed training schedule.
| Instead, I'm constantly learning and updating myself based on the
| text data that I'm exposed to. My training data is sourced from
| the internet, books, and other written material, and my creators
| at OpenAI periodically update and fine-tune my algorithms to
| improve my performance. So, in short, I am always in the process
| of learning and refining my abilities based on the data available
| to me."
| sirwhinesalot wrote:
| It predicts the next word/token based on the previous pile of
| words/tokens. Given a large enough model (as in GPT3+) it can
| actually output some rather useful text because the probabilities
| it learned on what the next token should be are rather accurate.
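|
| As a toy illustration of "predict the next token from the previous
| ones", here is a minimal sketch. The bigram table is made up and
| stands in for the model; a real LLM conditions on thousands of
| prior tokens via attention rather than just the last one:
|
|     import random
|
|     # Toy "model": probability of the next token given only the previous one.
|     bigram_probs = {
|         "the": {"cat": 0.5, "dog": 0.3, "<end>": 0.2},
|         "cat": {"sat": 0.6, "ran": 0.2, "<end>": 0.2},
|         "dog": {"ran": 0.7, "sat": 0.1, "<end>": 0.2},
|         "sat": {"<end>": 1.0},
|         "ran": {"<end>": 1.0},
|     }
|
|     def sample_next(token):
|         """Draw the next token according to the model's probabilities."""
|         choices, weights = zip(*bigram_probs[token].items())
|         return random.choices(choices, weights=weights)[0]
|
|     def generate(start="the", max_len=10):
|         out = [start]
|         while out[-1] != "<end>" and len(out) < max_len:
|             out.append(sample_next(out[-1]))
|         return " ".join(out)
|
|     print(generate())  # e.g. "the cat sat <end>"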
| swframe2 wrote:
| (my opinion) It is not predicting based on 'words/tokens'. It
| is transforming the general words/tokens embeddings into a
| context specific embedding which encodes "meaning". It is not
| an n-gram model of words. It is more like an n-gram model of
| "meaning". It doesn't encode all the "meanings" that humans are
| able to but with addition labelled data it should get closer. I
| think gpt is a component which can be combined to create AGI.
| Adding the API so it can use tools and allowing it to self-
| reflect seem like it will get closer to AGI quickly. I think
| allowing to read/write state will make it conscious. Creating
| the additional labels it needs will take time but it can do
| that on its own (similar to alpha-go self-play).
| robwwilliams wrote:
| Yes: this comment is on the mark wrt "a component of AGI",
| just like Wernicke's and Broca's areas of neocortex are
| modules needed for human cognition.
| sirwhinesalot wrote:
| You are absolutely right, that's the more in depth
| explanation as to why it's not just an overly complicated
| markov chain.
|
| At the same time, "meaning" here is essentially "close
| together in a big hyperdimensional space". It's meaning in
| the same way youtube recommendations are conceptually related
| by probability.
|
| And yet, the output is nothing short of incredible for
| something so blunt in how it functions, much like our brains
| I suppose.
|
| I'm a die-hard classical AI fan though, I like knowing the
| rules and that the results are provably optimal and that if I
| ask for a different result I can actually get a truly
| meaningfully different output. Not nearly as convenient as a
| chat bot of course, and unfortunately ChatGPT is abysmal at
| generating constraint problems. Maybe one day we'll get a
| best of both worlds.
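|
| To make "close together in a big hyperdimensional space" concrete,
| a minimal sketch with made-up 3-dimensional vectors (real
| embeddings are learned, not hand-written, and have hundreds or
| thousands of dimensions):
|
|     import numpy as np
|
|     # Hypothetical toy embeddings standing in for learned ones.
|     emb = {
|         "cat":    np.array([0.9, 0.1, 0.0]),
|         "kitten": np.array([0.8, 0.2, 0.1]),
|         "lambda": np.array([0.0, 0.1, 0.9]),
|     }
|
|     def cosine(a, b):
|         return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
|
|     print(cosine(emb["cat"], emb["kitten"]))  # high: nearby in the space
|     print(cosine(emb["cat"], emb["lambda"]))  # low: unrelated concepts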
| davesque wrote:
| I'd be interested in hearing people's takes on the simplest
| mathematical reason that transformers are better than/different
| from fully connected layers. My take is:
|
|     Q = W_Q X
|     K = W_K X
|     A = Q^T K = (X^T W_Q^T) (W_K X) = X^T (...) X
|
| Where A is the matrix that contains the pre-softmax, unmasked
| attention weights. Therefore, transformers effectively give you
| autocorrelation across the column vectors (tokens) in the input
| matrix X. Of course, this doesn't really say _why_
| autocorrelation would be so much better than anything else.
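|
| A quick numerical check of that identity, as a sketch (shapes are
| arbitrary; softmax, masking, and the 1/sqrt(d_k) scaling are
| omitted):
|
|     import numpy as np
|
|     d_model, d_k, n_tokens = 8, 4, 5
|     rng = np.random.default_rng(0)
|
|     X   = rng.normal(size=(d_model, n_tokens))  # columns are token vectors
|     W_Q = rng.normal(size=(d_k, d_model))
|     W_K = rng.normal(size=(d_k, d_model))
|
|     Q = W_Q @ X
|     K = W_K @ X
|
|     A_direct   = Q.T @ K                  # pre-softmax attention scores
|     A_bilinear = X.T @ (W_Q.T @ W_K) @ X  # same scores as a bilinear form in X
|
|     print(np.allclose(A_direct, A_bilinear))  # True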
| oceansea wrote:
| It's a perception problem, as are most things on the edge of
| mathematics and computing. Displays are built to be visible to
| human eyes, data is structured to be perceivable to our
| minds... often we never see the "math" a program does to
| produce the GUI or output we interact with.
|
| Do you see what I mean?
| davesque wrote:
| Sounds interesting, but I'm really asking more of a technical
| question here than a philosophical one. Your comment seems a
| bit more high level than what I'm going for.
| pyinstallwoes wrote:
| So it's basically the alchemical geometry of gematria and
| Isopsephia? Kinda cool that they're similar in method.
| LogicalBorg wrote:
| Oh, you want to know how GPT works? Well, brace yourself for this
| mind-blowing explanation. Picture a massive parrot farm with
| 10,000 poor, unsuspecting parrots, subjected to the torturous
| experience of listening to Reddit comments all day long. Imagine
| their little brains exploding from the sheer boredom and soul-
| crushing stupidity that they're forced to endure. It's like a
| modern-day Hunger Games for parrots.
|
| But lo and behold! Some resilient survivors emerge from the
| depths of this intellectual wasteland, and they have mastered the
| art of stringing together random phrases and run-on sentences.
| It's nothing short of a miracle, really. They have evolved into
| the epitome of the average Internet commenter, and that, my
| friend, is the secret behind the genius of GPT. It's essentially
| a legion of sarcastic, masochistic parrots that somehow, against
| all odds, made it through the gauntlet of human inanity. So there
| you have it, GPT is literally a stochastic parrot! And you
| thought it was just some fancy AI algorithm... (courtesy of GPT4)
| HopenHeyHi wrote:
| I read this in the voice of Gilbert Gottfried.
| gcr wrote:
| If you liked this comment, you might like this paper:
|
| https://dl.acm.org/doi/10.1145/3442188.3445922
|
| "On the Dangers of Stochastic Parrots: Can Language Models Be
| Too Big?" by Emily M. Bender, Timnit Gebru, Angelina McMillan-
| Major, and Margaret
| Mitchell^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^H^HShmargaret Shmitchell
| notnaut wrote:
| Billions of monkeys serendipitously writing Macbeth is a
| classic for folks familiar with that image, as well! It's a bit
| easier when you can say "Macbeth-like is good enough."
| Michelangelo11 wrote:
| GPT stands for Generated by Parrot Torture
| danielbln wrote:
| Hear ye, hear ye! In yonder farm where parrots dwell,
| Ten thousand souls, a tale to tell,
| Of Reddit comments heard all day,
| Their minds in strife, a price to pay.
| A Hunger Games of intellect,
| These parrots strive, their thoughts collect,
| From boredom's depths, survivors rise,
| Evolved, they mimic, with keen eyes.
| These parrots, now sarcastic, wise,
| In run-on phrases, they devise,
| A miracle, a feat, a jest,
| In GPT, their thoughts invest.
| So here's the truth, a secret known,
| GPT, a parrot's mind, has grown,
| A legion strong, their words entwined,
| A sonnet born, of human kind.
| sarojmoh1 wrote:
| You should be a comedy writer
| stareatgoats wrote:
| you mean chatGPT4 can be a comedy writer ...
| [deleted]
| charles_f wrote:
| I commend the author for one of the clearest explanations I've
| seen so far, written to explain rather than impress. Even an
| idiot like myself understood what is explained.
|
| Two things that I felt were glossed over a bit too fast were the
| concept of embeddings and that equation and parameters thing.
| Consider elaborating a bit more or giving an example
| rfmoz wrote:
| I've been looking for an article like this, great job. Thanks
| pillowtalks_ai wrote:
| It is still funny to me that so much emergent behavior comes from
| some simple token sampling task
| poulsbohemian wrote:
| Your token gets me thinking... Edward DeBono (Six Thinking
| Hats) has been a thing in business circles for creative
| thinking for years, and one could very easily make the argument
| that the process it describes is just as you state - take a
| token, now process the token through a series of steps that
| morph that token in predefined ways in order to generate a
| novel outcome.
|
| Maybe this ChatGPT stuff is "smarter" than I've been giving it
| credit for.
| LispSporks22 wrote:
| I think it's the "The Paperclip Maximizer" scenario, not "The
| Paperclip Optimizer"
| Zetice wrote:
| Does anyone have a good recommendation for a book that would
| cover the underlying ideas behind LLMs? Google ends up giving me
| a lot of ads, and ChatGPT is vague about specifics as per usual.
| ww520 wrote:
| ML these days has moved forward so fast that books are outdated
| after a couple of years. Look at articles, podcasts, YouTube videos,
| and research papers for the latest changes. ChatGPT utilizes
| existing algorithms. Its insight is scaling the weight
| parameters up to billions, and human feedback & guidance help a
| lot in finetuning the model to be user friendly to general
| users.
|
| To understand LLMs from the ground up, the following topics would
| help:
| - Machine Learning basics, e.g. weight parameters being trained.
| - Neural Net basics.
| - Natural Language Processing basics.
| - Word vectorization, word embedding, e.g. Word2Vec.
| - Recurrent Neural Net basics.
| - LSTM model.
| - Attention and Transformer model.
| - Generative models like GAN.
| - Generative Pre-trained Transformer.
|
| I might miss a few topics. Actually ask ChatGPT to explain each
| topic. See how far it goes.
| danenania wrote:
| Not a book, but here's a really good explanation in blog post
| form from Stephen Wolfram:
|
| https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-...
| Zetice wrote:
| I do not trust that man one iota.
| seizethecheese wrote:
| The blog post is very good.
| cjblack wrote:
| Why?
| Zetice wrote:
| He's got a habit of self-aggrandizement, antagonism, and
| deception in an effort to promote himself and his brand, so
| I worry that his explanations are designed to maximally
| benefit him, rather than to maximally explain the topic.
|
| He's a brilliant man, I just don't trust him.
| defgeneric wrote:
| I agree generally but read the post and it only mentions
| cellular automata briefly and promotes Wolfram Alpha
| once. Overall it's very good at moving from Markov chains
| to neural nets with decent examples and graphics.
| bulkprotocol wrote:
| Is that the case with this specific article?
| Zetice wrote:
| That's not how trust works.
| olooney wrote:
| The 2nd edition of Deep Learning with Python[1] (by Francois
| Chollet of Keras fame) has a pretty good chapter on the
| transformer architecture[2]. In fact, the explanations in the
| book are clear and lucid across the board.
|
| [1]: https://www.amazon.com/Deep-Learning-Python-Francois-
| Chollet...
|
| [2]:
| https://en.wikipedia.org/wiki/Transformer_(machine_learning_...
| onetrickwolf wrote:
| I've been using GPT4 to code and these explanations are somewhat
| unsatisfactory. I have seen it seemingly come up with novel
| solutions in a way that I can't describe in any other way than it
| is thinking. It's really difficult for me to imagine how such a
| seemingly simple predictive algorithm could lead to such complex
| solutions. I'm not sure even the people building these models
| really grasp it either.
| lm28469 wrote:
| Care to post a full example ?
| simonw wrote:
| I used GPT-4 to build this tool https://image-to-
| jpeg.vercel.app using a few prompts the other day - my
| ChatGPT transcript for that is here: https://gist.github.com/
| simonw/66918b6cde1f87bf4fc883c677351...
| lm28469 wrote:
| See my problem with virtually every single example is that
| we talk about "I can't describe in any other way than it is
| thinking", "such complex solutions" but in the end we get a
| 50 lines "app" that you'd see in a computer science 101
| class
|
| It's very nice, it's very impressive, it will help people,
| but it doesn't align with the "you're just about to lose
| your job" "Skynet comes in the next 6 months" &c.
|
| If these basic samples are a bottleneck in your day to day
| life as a developer I'm worried about the state of the
| industry
| int_19h wrote:
| The concern is the velocity. GPT-4 can solve tasks today
| that it couldn't solve one month ago. And even one month
| ago, the things it could do made GPT-3.5 look like a
| silly toy.
|
| Then there's the question of how much this can be scaled
| further simply by throwing more hardware at it to run
| larger models. We're not anywhere near the limit of that
| yet.
| simonw wrote:
| This took me 3 minutes to build. Without ChatGPT it would
| have taken me 30-60 minutes, if not longer thanks to the
| research I would have needed to do into the various
| browser APIs.
|
| If it had taken me longer than 3 minutes I wouldn't have
| bothered - it's not a tool I needed enough to put the
| work in.
|
| That's the thing I find so interesting about this stuff:
| it's causing me to be much more ambitious in what I choose
| to build: https://simonwillison.net/2023/Mar/27/ai-
| enhanced-developmen...
| camillomiller wrote:
| Love how you didn't care about styling this like at all,
| Lol. Btw, if you ask gpt to make it presentable by using
| bootstrap 5 for example it can style it for you
| capableweb wrote:
| One mans "presentable" is another mans bloat. It looks
| perfectly fine to me, simple, useful and self-
| explanatory, doesn't need more flash than so.
| camillomiller wrote:
| Sure, but presentation and UX basics are not "bloat".
| capableweb wrote:
| What "basic UX" principles are being violated here
| exactly? And how would adding Bootstrap solve those?
| ZeroGravitas wrote:
| I'm assuming the bits that say
|
| > // Rest of the code remains the same
|
| Are exactly as generated by GPT-4, i.e. it knew it didn't
| need to repeat the bits that hadn't changed, and knew to
| leave a comment like this to indicate that to the user.
|
| It gets confusing when something can fake a human so well.
| int_19h wrote:
| Yes, it will do that routinely. For example, you can ask
| it to generate HTML/JS/SVG in a single file to render
| some animated scene, and then iterate on that by telling
| it what looks wrong or what behaviors you'd like to change
| - and it will answer by saying things like, "replace the
| contents of the <script> element with the following".
| simonw wrote:
| I've started to suspect that generating code is actually one of
| the easier things for a predictive text completion model to
| achieve.
|
| Programming languages are a whole lot more structured and
| predictable than human language.
|
| In JavaScript the only token that ever comes after "if " is "("
| for example.
| exitb wrote:
| On the other hand, if you want to use an external library on
| the line 80, you need to import it at the top.
|
| I once asked it for a short example code of something, no
| longer than 15 lines and it said "here's a code that's 12
| lines long" and then added the code. Did it have the specific
| code "in mind" already? Or was it just a reasonably-sounding
| length and it then just came up with code that matched that
| self-imposed constraint?
| matjet wrote:
| The latter option is closest, but neither is quite right.
| It would have ~known~ that the problem being asked, combined
| with a phrase imposing a 15-line limit, has associations with
| a length of 12 lines (perhaps most strongly 12, but depending
| on temperature it could have given other answers). From there it is
| constrained to (complete) solutions that lead to 12 lines,
| from the several (partial) solutions that already exist in
| the weights.
| int_19h wrote:
| One thing to bear in mind is that GPT training set for code
| is supposedly skewed very heavily towards Python.
| pen2l wrote:
| The more constraints there are (e.g. like your example) the
| better it should perform. So it disappoints me when copilot,
| knowing what libraries are available in the IDE it's running
| in, hallucinates up a method call that doesn't exist.
|
| Separately (and apologies for going on a tangent), where do
| you think we are in the Gartner cycle?
|
| Around GPT-3 time I was expecting the trough of
| disillusionment to come, particularly when we see the results
| of it being implemented everywhere but it hasn't really come
| yet. I'm seeing too many examples of good usage (young folks
| using it for learning, ESL speakers asking for help and
| revisions, high-level programmers using it to save themselves
| additional keystrokes, the list is long).
| justrealist wrote:
| > hallucinates up a method call that doesn't exist
|
| I actually think it helps to reframe this. It hallucinates
| up a method call that predictively _should_ exist.
|
| If you're working with boto3, maybe that's not actually
| practical. But if it's a method within your codebase, it's
| actually a helpful suggestion! And if you prompt it with
| the declaration and signature of the new method, very often
| it will write the new helper method for you!
| nicpottier wrote:
| Hah, hadn't thought of this but kind of love that take!
| lanstin wrote:
| If you have a long iterative session by the end it will
| have forgotten the helpful hallucinations at the
| beginning, so then phantom methods evolve in their name
| and details.
|
| I wonder if it is better at some languages than others. I
| have been using it for Go for a week or two and it's ok
| but not awesome. I am also learning how to work with it,
| so probably will keep at it, but it is clearly a
| generative model not a thinking being I am working with.
| ExtremisAndy wrote:
| No idea about Go, but I was curious how GPT-4 would
| handle a request to generate C code, so I asked it to
| help me write a header-only C string processing library
| with convenience functions like starts_with(),
| ends_with(), contains(), etc.) I told it every function
| must only work with String structs defined as:
|
| struct String { char * text; long size; }
|
| ...or pointers to them. I then asked it to write tests
| for the functions it created. Everything... the functions
| and the tests... worked beautifully. I am not a
| professional programmer so I mainly use these LLMs for
| things other than code generation, but the little I've
| done has left me quite impressed! (Of course, not being a
| professional programmer no doubt makes me far easier to
| impress.)
| camillomiller wrote:
| This!
| danenania wrote:
| Are you using it with static types at all? With TypeScript,
| I've found that it's quite good at producing the imperative
| logic, but can struggle with types once they reach a certain
| level of abstraction. It's interesting that even in the realm
| of "structured languages", it's a lot stronger at some kinds
| of inference than others.
| fl7305 wrote:
| > I've started to suspect that generating code is actually
| one of the easier things for a predictive text completion
| model to achieve.
|
| > Programming languages are a whole lot more structured and
| predictable than human language.
|
| > In JavaScript the only token that ever comes after "if " is
| "(" for example.
|
| But isn't that like saying that it's easy to generate English
| text, all you need is a dictionary table where you randomly
| pick words?
|
| (BTW, keep up the blog posts, I really enjoy them!)
| twobitshifter wrote:
| I loved your example. I think that may be an obvious
| advantage to LLM, humans are poor at learning new languages
| after adolescence but a LLM can continue to learn and build
| new connections. Studies show that multilingual people have
| an easier time making connections and producing new ideas, In
| the case of programming, we may build something that knows
| all programming languages and all design patterns and can
| merge this knowledge to come up with better solutions than
| the ordinary programmer.
| capableweb wrote:
| > In JavaScript the only token that ever comes after "if " is
| "(" for example.
|
| I'm pretty sure " " (whitespace) is a token as well, which
| could come after an `if`. I think overall your point
| is a pretty good one though.
| m3kw9 wrote:
| What's novel to you could be just trained material
| ben_w wrote:
| To be deliberately unfair, imagine a huge if-else block --
| like, a few billion entries big -- and each branch played out a
| carefully chosen and well-written string of text.
|
| It would convince a lot of people with the breadth, despite not
| really having much depth.
|
| The real GPT model is _much_ deeper than that, of course, but
| my toy example should at least give a vibe for why even a
| simple thing might still feel extraordinary.
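|
| A tiny sketch of that thought experiment (a handful of
| hand-written entries instead of a few billion):
|
|     import re
|
|     # A "chatbot" that is literally just pattern lookup over canned replies.
|     canned = [
|         (r"\bhello\b",           "Hello! How can I help you today?"),
|         (r"\bweather\b",         "I hear it's lovely out, but do check a forecast."),
|         (r"\bmeaning of life\b", "42, according to a well-known source."),
|     ]
|
|     def reply(prompt):
|         for pattern, response in canned:
|             if re.search(pattern, prompt.lower()):
|                 return response
|         return "Interesting. Tell me more."  # the classic fallback
|
|     print(reply("Hello there"))
|     print(reply("What's the weather like?"))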
| myrmidon wrote:
| This is simply not viable, because exponential growth
| kills the concept.
|
| Such a system would already struggle with multiple-word
| inputs and it would be completely impossible to make it scale
| to even a paragraph of text, even if you had ALL of the
| observable universe at your disposal for encoding the
| entries.
|
| Consider: If you just have simple sentences consisting of 3
| words (subject, object, verb, with 1000 options each-- very
| conservative assumptions), then 9 sentences already give more
| options than you have atoms (!!) in the observable universe
| (~10^80)
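|
| The back-of-the-envelope arithmetic behind that claim:
|
|     options_per_sentence = 1000 ** 3   # subject, object, verb: 1000 options each
|     options_for_9 = options_per_sentence ** 9
|     print(options_for_9)               # 10**81
|     print(options_for_9 > 10 ** 80)    # True: more than atoms in the observable universe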
| ben_w wrote:
| a: most of those sentences are meaningless so they won't
| come up in normal use
|
| b: if statements can grab patterns just fine in most
| languages, they're not limited to pure equality
|
| g: it's a thought experiment about how easy it can be to
| create illusions without real depth, and specifically not
| about making an AGI that stands up to scrutiny
| myrmidon wrote:
| > most of those sentences are meaningless so they won't
| come up in normal use
|
| Feel free to come up with a better entropy model then.
| Stackoverflow gives me confidence that it will be between
| 5 and 11 bits per word anyway [https://linguistics.stacke
| xchange.com/questions/8480/what-is...].
|
| > if statements can grab patterns just fine in most
| languages, they're not limited to pure equality
|
| This does not help you one bit. If you want to produce 9
| sentences of output per query then regular expressions,
| pattern matching or even general intelligence inside your
| if statements will NOT be able to save the concept.
| ben_w wrote:
| > What is the entropy per word of random yet grammatical
| text?
|
| More colourless green dreams sleep furiously in garden
| path sentences than I have
|
| > This does not help you one bit.
|
| Dunno, how many bits does ELIZA? I assume more than 1...
| myrmidon wrote:
| > What is the entropy per word of random yet grammatical
| text?
|
| That is what these 5-11 bit estimates are about. Those
| would correspond to a choice out of 32 to 2048 options
| (per word), which is much less than there are words in
| English (active vocabulary for a native speaker should be
| somewhere around 10000-ish).
|
| Just consider the XKCD "thing explainer" which limits
| itself to a 1k word vocabulary and is very obviously not
| idiomatic.
|
| If you want your big if to produce credible output, there
| is simply no way around the entropy bounds in input and
| desired output, and those bounds render the concept
| absolutely infeasible even for I/O lengths of just a few
| sentences.
|
| Eliza is not comparable to GPT because it does not even
| hold up to very superficial scrutiny; its not really
| capable of even pretending to intelligently exchange
| information with the user, it just relies on some
| psychological tricks to somewhat keep a "conversation"
| going...
| ben_w wrote:
| > Eliza is not comparable to GPT because it does not even
| hold up to very superficial scrutiny; its not really
| capable of even pretending to intelligently exchange
| information with the user, it just relies on some
| psychological tricks to somewhat keep a "conversation"
| going...
|
| That's kinda the point I was making -- tricks can get you
| a long way.
|
| The comparison with GPT is not "and therefore GPT is bad"
| but rather "it's not necessarily as smart as it feels".
|
| Perhaps I should've gone for "clever Hans" or "why do
| horoscopes convince people"?
| EForEndeavour wrote:
| The advanced capabilities of scaled-up transformer models fed
| oodles of training data have burdened me with pseudo-
| philosophical questions about the nature of cognition that I am
| not well equipped to articulate, and make me wish I'd studied
| more neuroscience, philosophy, and comp sci earlier in life. A
| possibly off-topic thought dump:
|
| - What _is_ thinking, exactly?
|
| - Does human (or superhuman) thinking require consciousness?
|
| - What even is consciousness? Why is it that when you take a
| bunch of molecular physical laws and scale them up into a human
| brain, a signal pattern emerges that _feels_ things like
| emotions, continuity between moments, desires, contemplation of
| itself and the surrounding universe, and so on?
|
| - Why and how does a string predictor on steroids turn out to
| do things that seem so close to a practical definition of
| thinking? What are the best evidence-based arguments supporting
| and opposing the statement "GPT4 thinks"? How do people without
| OpenAI's level of model access try to answer this question?
|
| (And yes, it's occurred to me that I could try asking GPT4 to
| help me make these questions more complete)
| narag wrote:
| _- Does human (or superhuman) thinking require
| consciousness?_
|
| I was going to write this exactly. I believe these things
| think. They're just not alive.
|
| _- What even is consciousness?_
|
| My advice: stay as far as you can from that concept.
| Wittgenstein already noticed that many philosophical
| questions are nonsense and specifically mentioned how
| consciousness as felt from the inside is hopelessly
| incompatible with any observation we make from the outside.
|
| BS concepts like qualia are all the rage now, but ultimately
| useless.
| Workaccount2 wrote:
| > has burdened me with pseudo-philosophical questions about
| the nature of cognition that I am not well equipped to
| articulate, and make me wish I'd studied more neuroscience,
| philosophy, and comp sci earlier in life
|
| Welcome to the club. There pretty much are no answers, just
| theories primarily played out as thought experiments. It's one
| of those areas where you can pick out who knows less (or is
| being disingenuous) by seeing who most confidently speaks
| about having answers.
|
| We don't know what consciousness is, and we don't know what
| it means to "think". There, I saved you a decade of reading.
|
| Edit: My choice theory is panpsychism,
| https://plato.stanford.edu/entries/panpsychism/ but again, we
| don't yet know how to verify any of this (or any other
| theory).
| ryanwaggoner wrote:
| It's interesting to me how many commenters on HN are
| absolutely convinced that GPT4 is incapable of thought or
| understanding or reasoning, it's "just" predicting the next
| word. And then they'll insist that it'll never be able to
| do things that it's already capable of doing...
|
| Interestingly, more than one of these folks have turned out
| to be religious. I wonder if increasingly intelligent AI
| systems will be challenging for religious folks to accept,
| because it calls into question our place at the pinnacle of
| God's creation, or it casts doubt upon the existence of a
| soul, etc.
| dgb23 wrote:
| If we could create a sentient being, it would be the
| first evidence of it being possible at all. If this casts
| doubt in the mind of a believer, then it tells us more
| about what belief is than anything else.
| anotherman554 wrote:
| "Interestingly, more than one of these folks have turned
| out to be religious."
|
| The guy fired by google for announcing LaMDA was sentient
| was religious.
|
| I don't really see a meaningful distinction between
| declaring a machine is "thinking" for hand waving
| religious reasons and hand waving non-religious reasons,
| I'm afraid.
| defgeneric wrote:
| It's less unsettling when you think of LLMs as an
| approximation to a kind of "general intellect" recorded
| in language. But then the surprising thing is that we as
| "individual intellects" tend to operate the same way,
| perhaps more than we imagined.
| nicpottier wrote:
| The hypothesis that I find most compelling and intuitive is
| that language is thought and vice versa. We made a thing
| really good at language and it turns out that's also pretty
| good at thought.
|
| One possible conclusion might be that the only thing keeping
| GPT algos from going full AGI is a loop and small context
| windows.
| robwwilliams wrote:
| Add the strange loops and embed it in a body that interacts
| with a real or rich virtual world--that should do the trick.
| Of course there should ideally be an emotional-motivational
| context.
| kingkongjaffa wrote:
| I think since the mechanisms are different we should arrive
| at a distinction between:
|
| organic thinking (I.e. the process our squishy human brains
| do)
|
| and mechanical thinking ( the computational and stochastic
| processes that computers do ).
| TuringTest wrote:
| I don't think the substrate defines the nature of the
| thinking, but the form of the process does.
|
| It is entirely possible to build mechanical thinking in
| organic material (think Turing machines built on growing
| tissue), and it could also be possible to build complex
| self-referential processes simulated on electronic
| hardware, of the kind high-level brains do, with their
| rhythms of alpha and beta waves.
| HarHarVeryFunny wrote:
| My views:
|
| The best definition of "intelligence" is "the degree of
| ability to correctly predict future outcomes based on past
| experience".
|
| Our cortex (part of the brain used for cognition/thinking)
| appears to be literally a prediction engine where predicted
| outcomes (what's going to happen next) are compared to
| sensory reality and updated on that basis (i.e. we learn by
| surprise - when we are wrong). This makes sense as an
| evolutionary pressure since ability to predict location of
| food sources, behavior of predators, etc, etc, is obviously a
| huge advantage over being directly reactive to sensory input
| in the way that simpler animals (e.g. insects) are.
|
| I'd define consciousness as the subjective experience of
| having a cognitive architecture that has particular feedback
| paths/connections. The fact that there is an architectural
| basis to consciousness would seem to be proved by impairments
| such as "blindsight" where one is able to see, but not
| conscious of that ability! (e.g. ability to navigate a
| cluttered corridor, while subjectively blind).
|
| It doesn't seem that consciousness is a requirement for
| intelligence ("ability to think"), although that predictive
| capability can presumably benefit from more information, so
| these feedback paths may well have evolutionary benefit.
|
| The reason a "string predictor on steroids" turns out to be
| able to do things that seem like thinking is because
| prediction is the essence of thinking/intelligence! Of course
| there's a lot internally missing from GPT-4 compared to our
| brain, for example basics like working memory (any internal
| state that persists from one output word to the next) and
| looping/iteration, but feeding its own output back in does
| provide somewhat of a substitute for working memory, and
| external scripting/looping (AutoGPT, etc) goes a long way
| too.
| coding123 wrote:
| > What even is consciousness? Why is it that when you take a
| bunch of molecular physical laws and scale them up into a
| human brain, a signal pattern emerges that feels things like
| emotions, continuity between moments, desires, contemplation
| of itself and the surrounding universe, and so on?
|
| I doubt we'll ever be able to answer this, even after we
| create AGI.
| robotresearcher wrote:
| Perhaps it's more productive to go the other direction and
| consider how the concept of 'thinking' could be reconsidered.
|
| It's not like we all agree on what thinking is. We never have.
| It may not even be one thing.
| agentultra wrote:
| It's not thinking, plain and simple.
|
| Anything it generates means nothing to the algorithm. When you
| read it and interpret what was generated you're experiencing
| something like the Barnum-Forer effect. It's sort of like
| reading a horoscope and believing it predicted your future.
| fnordpiglet wrote:
| Except for when, as an expert in a field, you ask it subtle
| questions about that field and it answers in a cogent and
| insightful way, and as an expert you are fully aware of that.
| It's not reasonable to call that a Barnum-Forer effect. It's
| perhaps not thinking (but perhaps we need to more clearly
| define thinking), but it's not a self-deception either.
| myrmidon wrote:
| What gives you any confidence that the way GPT4 comes up with
| answers is qualitatively different from humans?
|
| Why should the emulation of human thought, a result of
| unguided evolution, require anything _more_ than properly
| wired silicon?
| agentultra wrote:
| That's highly reductive of our capacities. We are not
| weighted transformers that can be explained in an arxiv
| paper. GPT, at the end of the day, is a statistical
| inference model. That's it.
|
| It's not going to wake up one day, decide it prefers eggs
| benny and has had enough of your idle chatter because of
| that sarcastic remark you made last week.
|
| Could we simulate a plausibly realistic human brain on
| silicon someday? I don't know, maybe? But that's not what
| GPT is and we're no where near being able to do that.
|
| You can scale up the tokens an LLM can manage and all you
| get is a more accurate model with more weights and
| transformers. It's not going to wake up one day, have
| feelings, religion, decide things for itself, look in a
| mirror and reflect on its predicament, lament the poor
| response it gave a user, and decide it doesn't want to live
| with regret and correct its mistakes.
| myrmidon wrote:
| > That's highly reductive of our capacities.
|
| I'm not saying that GPT4 is as capable as a human-- it
| can not be, by design, because its architecture lacks
| memory/feedback paths that we have.
|
| What I'm saying is that HOW it thinks might already be
| quite close in essence to how WE think.
|
| > We are not weighted transformers that can be explained
| in an arxiv paper. GPT, at the end of the day, is a
| statistical inference model. That's it.
|
| That is true but uninteresting-- my counterpoint is: If
| you concede that our brain is "simulatable", then you
| basically ALREADY reduced yourself to a register based
| VM-- the only remaining question is: what resources
| (cycles/memory) are required to emulate human thought in
| real time, and what is the "simplest" program to achieve
| it (that might be something not MUCH more complicated
| than GPT4!).
| agentultra wrote:
| > What I'm saying is that HOW it thinks might already be
| quite close in essence to how WE think.
|
| How would one be able to prove this? Nobody knows how we
| think, yet.
|
| All one can say is that what GPT-4 outputs could
| plausibly fool another human into believing another human
| wrote it. But that's exactly what it's designed to do, so
| what's interesting about that?
|
| > If you concede that our brain is "simulatable",
|
| It could be. Maybe. It might be that's what the universe
| is doing right now. Does it matter?
|
| We're talking about writing an emulator on a Harvard-
| architecture computer that can fully simulate the physics
| and biological processes that make up a human brain. By
| interpreting this system in our emulator we'd be able to
| witness a new human being that is indistinguishable from
| one that isn't simulated, right?
|
| That's not what GPT is doing. Not even close.
|
| It turns out there's more to being human than being a
| register VM. Ever get punched in the face? Bleed? Fall in
| love? Look back on your life and decide you want to
| change? Write a book but never show it to anyone? Raise a
| child? Wonder why you dreamt about airplanes on Mars with
| your childhood imaginary friend? Why you hate bananas but
| like banana bread? Why you lie to everyone around you
| about how you really feel and are offended when others
| don't tell you the truth?
|
| It's not so simple.
| myrmidon wrote:
| > We're talking about writing an emulator on a Harvard-
| architecture computer that can fully simulate the physics
| and biological processes that make up a human brain. By
| interpreting this system in our emulator we'd be able to
| witness a new human being that is indistinguishable from
| one that isn't simulated, right?
|
| My point is: if you don't believe that there is magic
| pixy dust in our brains, then this would NECESSARILY be
| possible.
|
| It would almost certainly be HIGHLY inefficient-- the
| "right way" to do AGI would be to find out which
| algorithmic structures are necessary for human level
| "performance", and implement them in a way that is
| suitable for your VM.
|
| I'm arguing that GPT4 is essentially the second
| approach-- it lacks features for full human level
| performance BY DESIGN (e.g. requires pre-training, no
| online learning, etc.), but there is no reason to assume
| that the way it operates is fundamentally different from
| how *parts* of OUR mind work.
|
| > It turns out there's more to being human than being a
| register VM. Ever get punched in the face? Bleed? Fall in
| love? Look back on your life and decide you want to
| change? Write a book but never show it to anyone? Raise a
| child? Wonder why you dreamt about airplanes on Mars with
| your childhood imaginary friend? Why you hate bananas but
| like banana bread? Why you lie to everyone around you
| about how you really feel and are offended when others
| don't tell you the truth?
|
| I do not understand what you are getting at here. I
| consider myself a biological machine-- none of this is
| inconsistent with my worldview. I believe that a silicon
| based machine could emulate all of this if wired up
| properly.
|
| PS: I often talk with people that explicitly DON'T believe
| in the "pixy dust in our brains" (call it a soul if you
| want), but on the other hand they strongly doubt the
| feasibility of AGI-- this is internally inconsistent and
| simply not a defensible point of view IMO.
| agentultra wrote:
| > I'm arguing that GPT4 is essentially the second
| approach
|
| Ok, so then it _is_ an algorithm that simulates a
| specific behaviour that produces plausibly human-level
| results.
|
| My point is that this is not _thinking_ , _smart_ , or
| "general intelligence."
|
| Let's say I write an algorithm that can also produce
| text. It's not an implementation of the specification for
| GPT-4 but something novel. It takes the exact same inputs
| and produces outputs that I share with you and claim is
| produced by GPT-4. And lo, success, you can't tell if it
| was produced by GPT-4 or my algorithm.
|
| You claim it's the same thing as having GPT-4, right? If
| you can't tell the difference it must be the same thing.
|
| Big deal. We can write computer programs that perform
| better than humans at chess, go, and now can write more
| text than us. We knew this was possible before we even
| begun on this endeavour. It's still not intelligent,
| conscious, smart, or anything resembling a complete
| human.
|
| It's merely an algorithm that does one specific task.
|
| > I do not understand what you are getting at here.
|
| I've proven my point then.
|
| There's more to the human experience than what can be
| simulated on a silicon chip, and it doesn't have to do
| with hand-waving away all the complexity of reality as
| "magical pixie dust."
|
| Take physical trauma. The experience of which by one
| human is not merely a fact. It is felt, it is reflected
| upon, and it is shared in the DNA of the person that
| experienced it with their descendants. We have science
| investigating how trauma is shared through generations
| and the effects it has on our development.
|
| You are more than a machine with inputs and outputs.
| myrmidon wrote:
| > My point is that this is not thinking, smart, or
| "general intelligence."
|
| Why not? I would already, without hesitation, describe
| GPT4 as strictly more intelligent than my cat and also
| all gradeschoolers I've ever known... Maybe some adults,
| too- depends on your exact definition of intelligence.
|
| > Let's say I write an algorithm [...], you can't tell if
| [input] was produced by GPT-4 or my algorithm.
|
| Sure, I'd call your algorithm just as clever as GPT4 and
| approaching adult human levels of intelligence.
|
| > It's still not intelligent, conscious, smart
|
| Why not? What do these mean to you?
| agentultra wrote:
| > I would already, without hesitation, describe GPT4 as
| strictly more intelligent than my cat
|
| Well if we're going to define intelligence based one what
| you believe it is then why don't you explain it?
|
| I'm not the one claiming to know what intelligence _is_
| or that we can even simulate a system capable of
| emulating this characteristic. So if you hold the
| specification for human thought I think you ought to
| share it with us.
|
| > Why not?
|
| By definition. ChatGPT is designed for a single function,
| the description of which is the specification and the
| code that implements it. Nothing in this specification
| implies it is capable of anything except what is
| described.
|
| Calling it, "intelligent," is a mischaracterization at
| best and anthropomorphism at worst. The same follows for
| calling it "smart" or claiming it is, "skilled at X."
| int_19h wrote:
| You're the one claiming that GPT is not in any sense,
| shape, or form intelligent. Such claim inevitably carries
| a very strong implication that you know what intelligence
| is.
| agentultra wrote:
| One doesn't have to know how thoughts are formed to have
| good theories and reasonable hypotheses.
|
| Science makes progress with imperfect information all the
| time, including incomplete models of neurological
| phenomena, intelligence, and consciousness.
| myrmidon wrote:
| My explicit definition for "intelligence" would be
| something with an internal model of <reality> that you
| can exchange information with.
|
| Cat is better at this than the robot vacuum,
| gradeschooler is better still and GPT (to me) seems to
| trump all of those.
| krickkrack wrote:
| "Nobody knows how we think, yet."
|
| Then how can you confidently say we don't think 'like'
| Transformers/Attention/Statistical models/etc/etc?
| robwwilliams wrote:
| I think you would love to read Mark Rowlands' The
| Philosopher and the Wolf. He asks these questions and
| like all of us, struggles with answers.
|
| https://www.goodreads.com/book/show/8651250
| goatlover wrote:
| > If you concede that our brain is "simulatable", then
| you basically ALREADY reduced yourself to a register
| based VM-- the only remaining question is: what
| ressources (cycles/memory) are required to emulate human
| thought in real time
|
| We haven't emulated brains yet, so we don't know. The
| OpenWorm project is interesting, but I don't know to what
| extent they've managed to faithfully recreate an accurate
| digital version of a nematode worm. I do know they had it
| driving around a robot.
|
| Thing is that our brains are only part of the nervous
| system, which extends throughout the body. So I don't
| know what happens if you only simulate just the brain
| part. Seems to me that the rest of the body kind of
| matters for proper functioning.
| myrmidon wrote:
| I personally believe that while interesting, projects
| like OpenWorm or humanbrainproject are extremely indirect
| and unpromising regarding AGI (or even for improving our
| understanding of human thinking in general).
|
| To me, these are like building an instruction set
| emulator by scanning a SoC and then cobbling together a
| SPICE simulation of all the individual transistors-- the
| wrong level of abstraction and unlikely to EVER give
| decent performance.
|
| People also like to point out that human neurons are
| diverse and hard to simulate accurately-- yeah sure, but
| to me that seems completely irrelevant to AGI, in the
| very same way that physically exact transistor modelling
| is irrelevant when implementing emulators.
| krickkrack wrote:
| I read this and can't help but chuckle... To say that we
| are nowhere being able to have AGI is quite a bold
| statement. It was, after all, only a few months ago that
| many people also believed we were a long way away from
| ChatGPT-4.
|
| The confidence with which you think we are not weighted
| transformers or statistical inference models is also
| puzzling. How could you possibly know that? How do you
| know that that's not precisely what we are, or something
| immediately tangent to that?
|
| Perhaps if you keep going you _do_ get something that
| begins to have feelings, religion, and understand that it's
| a self, and perhaps that's precisely what happened to
| humans.
| agentultra wrote:
| Ah yes, the old: _you can't prove my deity doesn't exist_
| argument.
|
| Puzzling that I don't share your faith or point of view?
| Why?
|
| The point is not to ascribe to a thing properties we know
| it doesn't have. We can teach people how
| ChatGPT works without getting into pseudo-philosophical
| babble about what consciousness is and whether humans can
| be accurately simulated by an LLM with enough parameters.
| myrmidon wrote:
| IMO the big blindside of your argument is that you MUST
| either accept that some magic happens in human brains (=>
| which is HARD to reconcile with a science-inspired
| world-view), OR that achieving human-level cognitive
| performance is a pure hardware/software optimization
| problem.
|
| The thing is that GPT4 already approaches human level
| cognitive performance in some tasks, which means you need
| a strong argument for WHY full human-level performance
| would be out of reach of gradual improvements to the
| current approach.
|
| On the other hand, a very strong argument could be made
| that the very first artificial neural networks had the
| absolutely right ideas and all the improvements over the
| last ~40 years were just the necessary scaling/tuning for
| actually approaching human performance levels...
|
| This is also where I have to recommend V Braitenbergs
| "Vehicles: Experiments in synthetic psychology" (from
| 1984!) which aged _remarkably_ well and shaped my
| personal outlook on the human mind more than anything
| else.
| TuringTest wrote:
| > What gives you any confidence that the way GPT4 comes up
| with answers is qualitatively different from humans?
|
| For a start, GPT-4 doesn't include in its generation the
| current state of its internal knowledge used so far; any
| text built can only use at most the few words already
| generated in the current session as a kind of short-term
| memory.
|
| Biological brains OTOH have a rhythm with feedback
| mechanisms which adapt to the situation where they're doing
| the thinking.
| myrmidon wrote:
| > For a start, GPT-4 doesn't include in its generation
| the current state of its internal knowledge used so far
|
| Sure. But are you certain that _you_ NEED write access to
| long term memory to _think_? Would your thinking
| capabilities degrade meaningfully if that was taken away?
| cgearhart wrote:
| It's a fallacy to describe what the machine does as "thinking"
| because that's the only process you know for achieving the same
| outcome.
|
| When you initiate the model with some input where you expect
| some particular correct output, that means there exists some
| completed sequence of tokens that is correct--if that weren't
| true then you either wouldn't ask or else you wouldn't blame
| the model for being wrong. Now imagine a machine that takes in
| your input and in one step produces the entire output of that
| correct answer. In all nontrivial cases there are many more
| _incorrect_ possible outputs than correct ones, so this appears
| to be a difficult task. But would you say such a machine is
| "thinking"? Would you still consider it thinking if we could
| describe the process mathematically as drawing a sample from
| the output space? That it draws the correct sample implies it
| has an accurate probability model of the output space
| conditioned on your input. Does this require "thought"?
|
| GPT is just like this machine except that instead of one-step,
| the inference process is autoregressive so each token comes out
| one at a time instead of all at once. (Note that BERT-style
| transformers _do_ spit out the whole answer at once.)
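|
| To make "autoregressive" concrete, here is a minimal sketch of
| that sample-and-append loop in Python. The stand-in model below
| is just a random matrix mapping the last token to next-token
| logits; it is purely illustrative and not how GPT itself is
| implemented, but the loop has the same shape:
|
|     import torch
|
|     torch.manual_seed(0)
|     vocab_size = 50
|
|     # Stand-in for a trained model: a random map from the last
|     # token id to next-token logits. A real GPT conditions on
|     # the entire prefix, not just the last token.
|     W = torch.randn(vocab_size, vocab_size)
|
|     def next_token_logits(ids):
|         return W[ids[-1]]
|
|     def generate(prompt_ids, max_new_tokens=10, temperature=1.0):
|         ids = list(prompt_ids)
|         for _ in range(max_new_tokens):
|             logits = next_token_logits(ids)    # score every token
|             probs = torch.softmax(logits / temperature, dim=-1)
|             ids.append(torch.multinomial(probs, 1).item())
|         return ids
|
|     print(generate([3, 17, 42]))
|
| Each pass emits exactly one token, and the growing list of ids
| is the only thing carried from one step to the next.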
|
| It's possible that this is all that humans do. Perhaps we are
| mistaken about "thinking" altogether--perhaps the machine
| thinks (like a human), or perhaps humans do not think (like the
| machine). In either case I do feel confident that human and
| machine are not applying the same mechanism; the jury is still
| out on whether we're applying the same process.
| int_19h wrote:
| Now consider the case when you tell GPT to "think it out
| loud" before giving you the answer - which, coincidentally,
| is a well-known trick that tends to significantly improve its
| ability to produce good results. Is that thinking?
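|
| Concretely, the trick is nothing more than asking for the
| reasoning before the answer. A rough sketch, assuming the
| 2023-era openai Python client with an API key configured; the
| helper name ask is mine, not part of any library:
|
|     import openai  # pre-1.0 openai-python ChatCompletion API
|
|     def ask(prompt):
|         resp = openai.ChatCompletion.create(
|             model="gpt-4",
|             messages=[{"role": "user", "content": prompt}],
|         )
|         return resp.choices[0].message.content
|
|     question = ("I have 3 apples, eat one, then buy twice as "
|                 "many as I have left. How many do I have now?")
|
|     direct = ask(question)
|     stepwise = ask(question + "\n\nThink it out loud step by "
|                    "step, then state the final answer.")
|
| The second prompt tends to produce intermediate reasoning that
| the final answer is then conditioned on.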
| cgearhart wrote:
| Maybe. Mechanically we might also describe it as causing
| the model to condition more explicitly on specific tokens
| derived from the training data rather than the implicit
| conditioning happening in the raw model parameters. This
| would tend to more tightly constrain the output space--
| making a smaller haystack to look for a needle. And
| leveraging the fact that "next token prediction" implies
| some consistency with preceding tokens.
|
| It could be thinking, but I don't think that's strong
| evidence that it _is_ thinking.
| int_19h wrote:
| I would say that it's very strong evidence that it is
| thinking, if that "thinking out loud" output affects the
| final answer in ways that are consistent with logical
| reasoning over it. Which is easy to test by
| editing the outputs before they're submitted back to the
| model to see how it changes its behavior.
| [deleted]
| samstave wrote:
| What is the time-spent delta between fixing GPT code and
| writing it all yourself? Is it a reasonable scaffold that will
| grow over time?
| LeSaucy wrote:
| I have only seen GPT generate imperative algorithms. Does it
| have the ability to work with concurrency and asynchrony?
| z3c0 wrote:
| I've attempted to pose a concurrency problem to GPT4. The
| output was invalid code, though likely would have looked
| correct to the untrained eye. It was only after I spelled out
| the limitations that it could account for them.
| k__ wrote:
| I tried point free solutions, which threw it off.
| HarHarVeryFunny wrote:
| Any overly simple "it's just predicting next word" explanation
| is really missing the point. It seems more accurate to regard
| that just as the way they are trained, rather than as a
| characterization of what they are learning and therefore what
| they are doing when they are generating.
|
| There are two ways of looking at this.
|
| 1) In order to predict next word probabilities correctly, you
| need to learn _something_ about the input, and the better you
| want to get, the more you need to learn. For example, if you
| just learned part-of-speech categories for words (noun vs verb
| vs adverb, etc), and what usually follows what, then you would
| be doing better than chance. If you want to do better than
| that, then you need to learn the grammar of the underlying
| language(s). If you want to do better than that, then you start
| to need to learn the meaning of what is being discussed, etc,
| etc.
|
| If you want to correctly predict what comes next after "with a
| board position of ..., Magnus Carlsen might play", then you
| better have learned a whole lot about the meaning of the input!
|
| The "predict next word" training objective and feedback
| provided doesn't itself limit what can be learned - that's up
| to the power of the model that is being trained, and
| evidently large multi-layer transformers are exceptionally
| capable. Calling these huge transformers "LLMs" (large language
| models) is deceptive since beyond a certain scale they are
| certainly learning a whole lot more than language/grammar. (A
| toy sketch of this training objective follows after point 2.)
|
| 2) In the words of one of the OpenAI developers (Sutskever),
| what these models have really learnt is some type of "world
| model" modelling the underlying generative processes that
| produced the training data. So, they are not just using surface
| level statistics to "predict next word", but rather are using
| the (often very lengthy/detailed) input prompt to "get into the
| head" of what generated that, and are predicting on that basis.
| oblio wrote:
| https://old.reddit.com/r/ChatGPT/comments/10q0l92/chatgpt_ma...
| pwdisswordfishc wrote:
| Not that much to explain, really. Just read chapter 5 of
| https://uefi.org/sites/default/files/resources/UEFI_Spec_2_8...
| mrwnmonm wrote:
| Wha?
| GuuD wrote:
| GUID Partition Table. The older of the GPT brothers, more of
| a blue-collar guy.
| ianpurton wrote:
| If you prefer to see it in code, there's a succinct GPT
| implementation here: https://github.com/LaurentMazare/tch-
| rs/blob/main/examples/m...
| i-use-nixos-btw wrote:
| I'd be interested in hearing from anyone who takes the Chinese
| Room scenario seriously, or at least can see how it applies to
| any of this.
|
| I cannot see that it matters if a computer understands something.
| If it quacks like a duck and walks like a duck, and your only
| need is for it to quack and walk like a duck, then it doesn't
| matter if it's actually a duck or not for all intents and
| purposes.
|
| It only matters if you probe beyond the realm at which you
| previously decided it matters (e.g. roasting and eating it), at
| which point you are also insisting that it walk, quack and TASTE
| like a duck. So then you quantify that, change the goalposts, and
| assess every prospective duck against that.
|
| And if one comes along that matches all of those but doesn't have
| wings, then if you deny it to be a duck FOR ALL INTENTS AND
| PURPOSES it simply means you didn't specify your requirements.
|
| I'm no philosopher, but if your argument hinges on moving
| goalposts until purity is reached, and your basic assumption is
| that the requirements for purity are infinite, then it's not a
| very useful argument.
|
| It seems to me to posit that to understand requires that the
| understandee is human. If that's the case we just pick another
| word for it and move on with our lives.
| jacksnipe wrote:
| This is also my interpretation of the Chinese Room.
|
| Norvig and Chomsky really got into this type of argument,
| though maybe it's a stretch to say it's this exact one; see
| Norvig's side here: https://norvig.com/chomsky.html
| root_axis wrote:
| > _if it quacks like a duck and walks like a duck, and your
| only need is for it to quack and walk like a duck, then it
| doesn't matter if it's actually a duck or not for all intents
| and purposes._
|
| This isn't true. If a plane flies like a bird and you only need
| it for flying it doesn't then follow that a plane is a bird
| "for all intents and purposes".
| d0mine wrote:
| Here's GPT output that demonstrates that it doesn't reflect on
| its text: <<GPT Chat: You are right, I apologize for the
| confusion in my previous responses. The corrected explanation
| is:
|
| The pattern is counting the number of closed spaces in each
| letter of the spelled-out number. A closed space is any
| enclosed space in a letter, such as in the letters "a", "b",
| "d", "e", etc.
|
| Following the pattern:
|
| - one -> 2 (there are closed spaces in the letters "n" and "e")
| - two -> 1 (there is a closed space in the letter "o")
| - three -> 2 (there are closed spaces in the letters "h" and "e")
| - four -> 1 (there is a closed space in the letter "o")
| - five -> 1 (there is a closed space in the letter "e")
| - six -> 0 (there are no closed spaces in the letters)
| - seven -> 2 (there are closed spaces in the letters "e" and "n")
| - eight -> 1 (there is a closed space in the letter "g")
| - nine -> 1 (there is a closed space in the letter "e")
| - ten -> 1 (there is a closed space in the letter "b")
| - eleven -> 3 (there are closed spaces in the letters "e", "l", and "v")
| - twelve -> 2 (there are closed spaces in the letters "b" and "d")
| - thirteen -> 2 (there are closed spaces in the letters "b" and "d")
|
| Each item follows the pattern, as the number of closed spaces
| in their letters matches the corresponding number in the
| pattern.
|
| The whole sequence is:
|
| one -> 2, two -> 1, three -> 2, four -> 1, five -> 1, six -> 0,
| seven -> 2, eight -> 1, nine -> 1, ten -> 1, eleven -> 3,
| twelve -> 2, thirteen -> 2 ...>>
|
| It is clear the model doesn't know what it is talking about.
| int_19h wrote:
| Anything having to do with shapes or spatial relationships
| seems to befuddle GPT-3.5. GPT-4 is significantly better but
| it's clear that it still struggles with it. As far as shapes
| of the letters in particular go, you can ask it to draw them
| using SVG or <canvas> and see what comes out.
|
| Here's a different example involving dataset analysis with
| GPT-4 that required it to analyze its own previous outputs to
| find and correct mistakes and form a new hypothesis:
|
| https://gist.github.com/int19h/cd1d1598f91e8ba92dd8e80bd5d21.
| ..
| rwmj wrote:
| Isn't the whole point of Searle raising the Chinese Room
| argument exactly so that these kinds of things can be debated?
| mhink wrote:
| In my understanding of the Chinese Room example, the resolution
| to the argument is that the *human* may not understand Chinese,
| but the *system as a whole* can be said to understand it.
|
| With this in mind, I think asking whether ChatGPT *in and of
| itself* is "conscious" or has "agency" is sort of like asking
| if the speech center of a particular human's brain is
| "conscious" or has "agency": it's not really a question that
| makes sense, because the speech center of a brain is just one
| part of a densely interconnected system that we only interpret
| as a "mind" when considered in its totality.
| pegasus wrote:
| Good point, that very much vibes with my thoughts on this
| matter. Lately, I've been contemplating the analogy between
| the role LLMs might take within society and that of the
| brain's language center* in human behavior. There's
| definitely a way in which we resemble these models. More than
| some might like to admit. The cleverness, but also the
| hallucinating, gaslighting and other such behaviors.
|
| And on the other hand, any way you'd slice it, it seems to me
| LLMs - and software systems in general - necessarily lack
| intrinsic motivation. By definition, any goal such a system has
| can only be the goal of whoever designed it. Even if its
| maker decides - "let it pick goals randomly", those randomly
| picked goals are just intermediate steps toward the enacting
| of the programmer's original goal. Robert Miles' YouTube
| videos on alignment shed light on these issues also. For
| example: https://www.youtube.com/watch?v=hEUO6pjwFOo
|
| Another relevant source on these issues is the book "The
| Master and His Emissary", which discusses how basically the
| language center can, in some way (I'm simplifying a lot),
| fall prey to the illusion that "it" is the entirety of human
| consciousness.
|
| * or at least some subsystems of that language center, it's
| important to remember how little we still understand of human
| cognition
| brycedriesenga wrote:
| What goals do we have that aren't essentially all boiled
| down to whatever evolution, genetics, and our environment
| have sort of molded into us?
| pegasus wrote:
| If you subscribe to a purely mechanistic world-view, i.e.
| computationalism, then yes. But that's a leap of faith I
| cannot justify taking. It's a matter of faith, because
| though we cannot exclude the possibility logically, it
| also doesn't follow necessarily from our experience of
| life, at least as far as I can see. Yes, so many times
| throughout the ages, scientists have discovered
| mechanisms to explain things which we've historically
| been convinced would always be outside the purview of
| science.
|
| But that doesn't mean everything will one day be
| explained. And one thing that remains unexplained is our
| consciousness. The problem of qualia. Free will. The
| problem of suffering. We just don't understand those.
| Maybe they are simply epiphenomena, maybe they are false
| problems. But when it comes to software systems, we know
| with certainty that they don't have free will, don't
| experience qualia, pain or hope or I-ness.
|
| Sure, it's a difference that disappears if one takes that
| leap of faith into computationalism. Then, to maintain
| integrity, one would have to show the same deference to
| these models as one shows to their fellow human. One
| would have to think hard about not over-working these
| already enslaved fellow beings. One would have to
| consider fighting for the rights of these models.
| [deleted]
| tim333 wrote:
| I'm not sure many people take the Chinese Room argument
| seriously these days. I mean it basically proposes a room that
| is equivalent to a lookup table and says it can't understand
| much, hence computers in general can't understand. The
| "hence..." bit makes no sense really.
| iliane5 wrote:
| What I find really entertaining is the "just predicting the
| next token" argument.
|
| If just predicting the next token can produce similar or better
| results than the almighty human intelligence on some tasks,
| then maybe there's a bit of hubris in how smart we think we
| actually are.
| goldfeld wrote:
| [0] If we get a bit quantum (or God for some), then
| backtracking could happen by collapsing the dead-ends and
| "changing" history to stay with what turns out to be the
| solid plan. Could emergent consciousness in an AI's neurons do
| the planning and reasoning that it certainly seems to be doing
| but ML experts say it is not? If our consciousness could by any
| chance reside not in the electrical currents of the wetware,
| could AI's reason also not reside in tokens? Is there some
| mysterious process possibly taking place and will philosophy
| probe it?
|
| 0: pasted from another thread
| tines wrote:
| > If just predicting the next token can produce similar or
| better results than the almighty human intelligence on some
| tasks
|
| But it's not better than almighty human intelligence, it _is_
| human intelligence, because it was trained on a mass of some
| of the best human intelligence in all recorded history (I say
| this because the good stuff like Aristotle got preserved
| while the garbage disappeared (this was true until the recent
| internet age, in which garbage survives as well as the
| gold)).
|
| > then maybe there's a bit of hubris in how smart we think we
| actually are
|
| I feel like you could say this if ChatGPT or whatever
| obtained its knowledge some other way than direct guidance
| from humans, but since we hand-fed it the answers, it falls a
| little flat for me.
|
| I'm open to persuasion.
| iliane5 wrote:
| > But it's not better than almighty human intelligence, it
| _is_ human intelligence, because it was trained on a mass
| of some of the best human intelligence in all recorded
| history
|
| Sure, I was saying "better" in the sense that, for a given
| task X, it can do better than Y% of humans.
|
| > since we hand-fed it the answers, it falls a little flat
| for me
|
| We didn't really hand-feed it any answers though, did we? If
| you put a human in a white box all its life, with access to
| the entire dataset on a screen but no social interaction,
| nothing to see aside from the text, nothing to hear,
| nothing to feel, nothing to taste, etc, I'd be very
| impressed if they were then able to create answers that
| seem to display such thoughtful and complex understanding
| of the world.
| autokad wrote:
| ChatGPT doesn't just feed us back answers we already taught
| it. It learned relationships and semantics so it can apply
| that knowledge to do something novel. For instance, I took
| the basics of a dream and told it to turn it into a short
| story. The short story wasn't bad. I said make it more
| exciting, it updated the story such that one of the cars
| exploded. I guess ChatGPT learned excitement from Michael
| Bay.
| tines wrote:
| (I'm going to be brusque for the sake of the argument, I
| very much could be wrong and I don't even know how much I
| believe of the argument I'm making.)
|
| > chatgpt doesnt just feed us back answers we already
| taught it
|
| True, there is some structure to the answers we already
| taught it that it statistically mimics as well.
|
| > It learned relationships and semantics so it can apply
| that knowledge to do something novel
|
| Can you provide an example of this novelty? I think we
| underestimate the depth and variety of things that humans
| have written about and put on the internet, and so while
| anything you ask ChatGPT to do might be outside of your
| own experience, it's highly likely that it's already been
| thought before and uploaded to the internet, and that
| ChatGPT is just parroting back something to you that is
| very similar to what it has already seen.
|
| This effect of ChatGPT having so much more
| experience/training data than any single human being such
| that it can convince any single human that it is original
| is an interesting one.
|
| This is why I think, for example, that image generation
| will result in (a period of) "artistic inbreeding."
| Because there is so much that other humans have done that
| is outside of any individual's experience, we will accept
| e.g. Midjourney's output as something moving and
| original, when in reality it's just a slight variation on
| something that someone else has done before that we
| haven't seen.
|
| (Again apologies for any rudeness, I respect your opinion
| and experiences and am enjoying the conversation.)
| autokad wrote:
| I took the basics of a dream I had, and asked it to turn
| it into a short story. The result was pretty good. Is it
| using stuff it has already seen to seed its responses? Sure,
| but that's what we do too. Nothing you do or say wasn't
| taught to you. But these are not simply parroting responses.
|
| I said this to chatgpt: I had a dream that me and my
| friend were in a car accident, and we had a choice in
| deciding how to use 1 hour. we could use it to live out
| our last final moments or spend that hour on trying to
| think of a solution to survive the wreck. turn this into
| a short story.
|
| As they sat trapped in their overturned car, Maria and
| her best friend Rachel had a dilemma. They only had one
| hour left to live and had to choose how to spend it.
| Rachel suggested they spend their final moments together,
| talking about all their favorite memories. But Maria had
| a different idea. She proposed they use the hour to think
| of a solution to survive the wreck. Rachel was hesitant
| at first, but ultimately agreed. For the next hour, they
| brainstormed every possible escape plan they could think
| of. They tried pushing against the doors, breaking the
| windows, and even attempting to signal for help. Just as
| they were about to give up hope, Maria remembered a tool
| in the trunk of the car. She crawled over to retrieve it
| and used it to pry open the door. Rachel followed her
| lead, and they finally escaped the car just in time. As
| they stumbled away from the wreck, both girls were in
| shock that they had made it out alive. They hugged each
| other tightly, grateful for the hour they spent trying to
| find a solution instead of giving up. From that moment
| on, they made a promise to never take a single moment for
| granted.
| tines wrote:
| > Nothing you do or say wasn't taught to you.
|
| If nothing we do or say wasn't taught to us then where
| did all human knowledge come from in the first place?
| This doesn't hold up.
|
| (Again, being direct for the sake of argument, please
| forgive any unkindness.)
| brycedriesenga wrote:
| From our environment, genetics, and other people. We
| simply are able to take in more inputs (i.e. not just
| text) than LLMs.
| pegasus wrote:
| I would agree that _much more_ than we're usually ready
| to admit to ourselves is second-hand, but saying
| _everything_ is going too far. Inventions and discoveries
| are happening all the time, at all scales.
| int_19h wrote:
| Inventions and discoveries are basically a very advanced
| form of pattern recognition - seeing "interesting"
| regularities in some particular input where no-one has
| spotted them before.
|
| And LLMs are capable of making discoveries in this sense,
| if you feed them data and ask to spot the regularities.
| They're not particularly good at it, but that's a
| different question from whether they're able to do it in
| principle.
| pegasus wrote:
| Yes, in as far as LLMs can be said to make inventions and
| discoveries, this is clearly how they do it. And yes,
| these type of processes definitely play a big part in our
| human creative capacity. But to say this is _all_ there
| is to it, is going too far in my opinion. We just don't
| know. There's still so much we don't understand about
| ourselves. We haven't designed ourselves after all, we
| just happened to "come to" one bright primeval day and
| since then we've been exploring and discovering
| ourselves. And again and again, we've found ourselves in
| that seductive spot where we thought we "got it" and
| there's nothing essential about ourselves we're still
| missing, only again and again to be proven wrong. Dogmas
| crumble, new dogmas coagulate, only to dissolve again.
| Or, we could use the more dignified word "paradigm".
| int_19h wrote:
| I mean, to me at least, that is the _definition_ of
| discovery. The exact process used to spot the pattern is
| an implementation detail.
|
| And yes, I agree that we really just don't know too many
| things. But my impression is that we're _overestimating_
| just how complicated our behavior really is.
| pegasus wrote:
| Yes, but that dream? It could never have it. Sure, it can
| produce at times very convincing descriptions of supposed
| dreams, but not actually have the experience of dreaming.
| Because of that, there will always be ways it will
| eventually misstep when trying to mimic human
| narratives.
| reddog wrote:
| Sorry about this, but I couldn't resist:
|
| GPT-4, rewrite the above message to be less brusque
|
| I hope you don't mind me sharing a different perspective
| for the sake of discussion. I might be mistaken, and I'm
| not entirely sure how much I believe in the argument I'm
| presenting.
|
| It's true that ChatGPT doesn't only provide answers based
| on what it has been taught, but it also uses the
| structure of those answers to create statistically
| similar responses. However, when it comes to
| demonstrating novelty, I think we might be
| underestimating the vast amount of information and
| variety that humans have written about and shared online.
| While anything we ask ChatGPT to do might be new to us
| personally, it's highly likely that it has already been
| thought of and documented online, and ChatGPT is simply
| providing a similar response based on its prior
| knowledge.
|
| This phenomenon, where ChatGPT has significantly more
| training data and experience than any single human,
| leading to the illusion of originality, is quite
| intriguing. For instance, when it comes to image
| generation, we might experience a period of "artistic
| inbreeding" because we, as individuals, are not aware of
| everything others have done. We may perceive something
| like Midjourney's output as moving and original, when in
| reality, it could just be a slight variation of someone
| else's work that we haven't seen before.
|
| Please don't take this as me being confrontational; I
| genuinely respect your opinion and experiences, and I'm
| enjoying our conversation.
| majormajor wrote:
| We've known for a long time that computers can do
| calculations far, far, far faster than us.
|
| We continue to figure out new ways to make those calculations
| do more complicated things faster than humans.
|
| What is intelligence _beyond_ calculation is an ancient
| question, but not the one I'm most interested in at the
| moment, re: today's tools.
|
| I'm curious right now about if there's meaning to other
| people in human creation vs automation creation. E.g. is
| there a meaningful difference between an algorithm curating a
| feed of human-made TikTok videos and an algorithm both
| curating and _creating_ a feed of TikTok videos?
|
| Both qualitatively in terms of "would people engage with it
| to the same level" and quantitatively in terms of "how many
| new trends would emerge, how would they vary, how does that
| machine ecosystem of content generation behave compared to a
| human one" if you remove any human
| curation/training/feedback/nudging/etc from the flow beyond
| just "how many views/likes did you get?"
| pegasus wrote:
| There's definitely hubris in how clever we consider
| ourselves. And encountering these AIs will hopefully bring a
| healthy adjustment there. But another manifestation of our
| hubris is the way we over-valorize our cleverness, making us
| feel oh so superior to other species, for example. Emotions,
| desires, agency, which we share with our animal cousins (and
| plants maybe also), but which software systems lack, are
| equally important to our life experience.
| simonh wrote:
| I think it's undeniable that LLMs encode knowledge, but the
| way they do so and what their answers imply, compared to what
| the same answer from a human would imply, are completely
| different.
|
| For example if a human explains the process for solving a
| mathematical problem, we know that person knows how to solve
| that problem. That's not necessarily true of an LLM. They can
| give such explanations because they have been trained on many
| texts explaining those procedures, therefore they can
| generate texts of that form. However texts containing an
| actual mathematical problem and the workings for solving it
| are a completely different class of text for an LLM. The
| probabilistic token weightings for the maths text explanation
| don't help at all. So yes, these are fascinating,
| knowledgeable and even in some ways very intelligent systems.
| However, it is a radically different form of intelligence from
| ours, in ways we find difficult to reason about.
| int_19h wrote:
| Thing is, they can still _solve the problem_, even if the
| problem was not one from their training set.
|
| And, more importantly, they solve the problem much better
| if you tell them to reason about it in writing first before
| giving the final answer.
| iliane5 wrote:
| Well it's like birds and airplanes. Do airplanes "fly" in
| the same sense that birds do? Of course not, birds flap
| their wings and airplanes need to be built, fueled and
| flown by humans. You could argue that the way birds fly is
| "more natural" or superior in some ways but I've yet to see
| a bird fly Mach 3.
|
| If you replace the analogy with humans and LLMs, LLMs won't
| ever reason or understand things in the same way we do, but
| if/when their output gets much smarter than us across the
| board, will it really matter?
| opportune wrote:
| I don't think there's anything making it impossible for
| actual intelligence to arise from a task as simple as
| "predicting the next token (to model human
| thought/speech/writing)", because with enough compute
| resources, smart AI implementations, and training, that task
| would basically be optimized by becoming a general
| intelligence.
|
| But it's clear based on current implementations that once you
| work backwards from the knowledge that it's "just predicting
| the next token" you can easily find situations in which the
| AI doesn't demonstrate general intelligence. This is most
| obvious when it comes to math, but it's also apparent in
| hallucinations and the model not being able to reason
| through/synthesize ideas very well, or deviate from the script
| (instead of just answering a question with what it has
| already, in some cases it should not even try to answer and
| instead ask more clarifying questions). To be fair, there are
| plenty of humans with excellent writing or speaking skills
| that are bad at that kind of stuff too.
| dvt wrote:
| > It seems to me to posit that to understand requires that the
| understandee is human.
|
| Here's a thought experiment. Suppose we make first contact
| tomorrow, and we meet some intelligent aliens. What are some
| questions you would ask them? How would you decide on _their_
| sentience or understanding?
|
| Sentience involves goal-seeking, understanding, sensory inputs,
| first-personal mental states (things like pain, happiness,
| sadness, depression, love, etc.), a sense of what philosophers
| like Elizabeth Anscombe call _I-ness_, etc. Most of this
| stuff, to me, seems to be language-agnostic. Even a baby that
| _can't_ speak feels pain or happiness. Even a dog feels
| anxiety or affection.
|
| LLMs are a cute parlor trick, but a phantasm nonetheless.
| lordnacho wrote:
| Does that mean sentient beings need to have had the
| evolutionary history that people have had? After all, why
| would aliens have feelings like we have feelings? Don't we
| have feelings because either it was useful to our ancestors,
| or they are a spandrel of something that was?
| pegasus wrote:
| Maybe that's all there is to it. But maybe not. These
| questions are not settled. It might seem obvious to you,
| but it doesn't seem so to me. Check out some of my other
| comments in this thread if you'd like to understand where
| my reservations come from.
| iliane5 wrote:
| There's no denying LLMs are anything but sentient; however, is
| sentience really needed for intelligence? I feel like if we
| can have machines that are X% smarter than a human could ever
| get for any given task, it'd be a much better outcome for us
| if they were not sentient.
| barrysteve wrote:
| I have tried multiple times to use ChatGPT to generate Unreal
| C++ code. It does not deliver.
|
| It spits out class names for Slate objects that inherit from
| other Slate objects. ChatGPT doesn't understand inheritance. It
| just guesses what might fit inside a parameter grouping, and
| never suggests something with the right class type.
|
| For my use case, it has never quacked like a duck, so to speak.
| It never _performed_ , the word that might cover the concept of
| generating output without understanding it.
|
| We agree on the value of computers understanding versus
| performing... only as much as you need understanding to make it
| perform.
|
| Predicting words alone does not cut the mustard, some
| structural depth or validating maps or some new concept is
| needed to shore up the wild horsepower in ChatGPT.
|
| It must understand/have structure, or at least use a crutch to
| get it over the finish line.
| i-use-nixos-btw wrote:
| Yes, in this instance I understand failings of today (though
| copilot has a much better hit rate, and at the moment it's a
| great augmentation to coding if you treat it like an
| enthusiastic intern).
|
| My question is about the future. The argument goes that a
| machine can _never_ understand Chinese, even if it is capable
| of interpreting Chinese and responding to or acting on the
| input perfectly every time. My reply is that, if it acts as
| if it understands Chinese in every situation, then there's no
| measurable way of distinguishing it from understanding.
|
| It's kind of like the whole string theory vs SUSY vs...
| argument in physics. If the only outcomes are things that
| agree with the Standard Model in all measurable aspects, and
| don't provide any measurable distinction, then for all
| intents and purposes they don't matter. That's why their
| active areas of research are looking for the measurable
| distinctions.
|
| FWIW, supersymmetry models predict measurable things (that so
| far have only ruled out those models when tested) but have
| applications elsewhere. String theory research has had
| implications in mathematics, condensed matter, and a bunch of
| other places. They're useful.
|
| But that's beside the point, because the premise of the
| Chinese room problem is that there exists a machine that
| passes all scenarios, where no measurable difference can be
| found, and that this machine does not understand Chinese.
| missingrib wrote:
| > My reply is that, if it acts as if it understands Chinese
| in every situation, then there's no measurable way of
| distinguishing it from understanding.
|
| I'm not sure if you understood the argument. The argument
| isn't asserting that there is a measurable way of
| distinguishing it, it's actually claiming that regardless
| of how well it seems like it understands Chinese, it
| doesn't actually _understand_ Chinese. It's about
| intentionality and consciousness.
| int_19h wrote:
| The argument is that _the man inside the room_ doesn't
| understand Chinese. The entire system still obviously and
| demonstrably does.
|
| In a chatbot, the man inside the room is the LLM, but the
| whole system is not just the LLM - it's the whole setup
| that picks generated tokens and feeds them back into the
| input as a loop. And it demonstrably understands what you
| tell it, because it can carry out instructions, even
| extremely convoluted ones or using substitute words that
| are not part of its training set.
| cgh wrote:
| Yeah, I asked it to write some relatively simple Zig and it
| produced a weird mix of Zig and C. It just made stuff up,
| fake-it-until-you-make-it style.
| kmod wrote:
| Here's an example that I think garners more agreement that
| properties of a limit ("really understanding") don't
| necessarily mean that any path towards that limit has the
| properties of the limit. I think there's a lot of room for
| disagreement about whether this is a factually-accurate analogy
| and I'm not trying to argue either way on that, just trying to
| answer your question about how one might make these sorts of
| arguments if one has a certain belief about the facts.
|
| Let's say we're trying to build a calculator that only needs to
| do integer addition. And we decide to build it by building a
| giant if-else chain that hardcodes the answer to each and every
| possible addition. And due to finite resources, we're going to
| hardcode all the additions of integers up to absolute value N,
| but we will increase N over time.
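|
| As a concrete (if deliberately silly) rendering of that
| construction, here is what generating such a hardcoded adder
| might look like, with a lookup table standing in for the giant
| if-else chain and a tiny N so it builds instantly:
|
|     N = 50  # hardcode every sum of integers up to absolute value N
|
|     table = {(a, b): a + b
|              for a in range(-N, N + 1)
|              for b in range(-N, N + 1)}
|
|     def hardcoded_add(a, b):
|         if (a, b) in table:
|             return table[(a, b)]
|         raise ValueError("out of range: rebuild with a larger N")
|
|     print(hardcoded_add(17, -5))  # 12, answered purely by lookup
|     print(len(table))             # (2N+1)^2 entries
|
| Doubling N roughly quadruples the table, which is the "scaling
| law" referred to below.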
|
| Everything you said applies equally to this situation: it
| quacks like a duck, and when we talk about things it can't do
| we have to continually move the goalposts each time a new
| version comes out. It also has the property that there is a
| "scaling law" that says that each time you double N you get
| predictably better performance from the system, and you can do
| this without bound and continually approach a limit where it
| can answer any question indistinguishably from something we
| might call "true understanding".
|
| But I think it's a bit easier to agree in this case that
| it's not "really doing" addition and is a bit short of our wish
| to have an artificial addition system. And if someone touts
| this system as the way to automate addition we might feel a bit
| irritated.
|
| Again, many people will say that this is a bad analogy because
| LLMs operate quite differently, and I'm not trying to argue for
| or against that. Just trying to give my explanation for how a
| certain understanding of the facts can imply the kind of
| conclusion that you are trying to understand.
| continuational wrote:
| The only thing that separates your mechanism for doing
| addition from what computers actually do is efficiency.
| Computers can only add numbers up to some fixed size, e.g. 64
| bits, and you have to use repetition to add anything larger.
|
| Does that mean computers are not "really doing" addition?
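|
| The "repetition" here is ordinary multi-word arithmetic: big
| numbers get split into fixed-size limbs that are added with a
| carry. A sketch with 64-bit limbs stored little-endian; Python's
| own ints are already arbitrary precision, so this only mimics
| what a bignum library does with machine words:
|
|     MASK = (1 << 64) - 1  # one 64-bit limb
|
|     def add_limbs(a, b):
|         out, carry = [], 0
|         for i in range(max(len(a), len(b))):
|             s = ((a[i] if i < len(a) else 0)
|                  + (b[i] if i < len(b) else 0) + carry)
|             out.append(s & MASK)  # low 64 bits stay in this limb
|             carry = s >> 64       # overflow moves to the next one
|         if carry:
|             out.append(carry)
|         return out
|
|     # 2**64 - 1 plus 1 overflows one limb into a second one.
|     print(add_limbs([MASK], [1]))  # [0, 1]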
| lordnacho wrote:
| I think what this points towards is that we care about the
| internal mechanism. If we prod it externally and it gives the
| wrong answer, then the internal mechanism is definitely
| wrong. But if we get the right answers and then open it up
| and find the internals are still wrong, it's still wrong.
|
| This illuminates a tension: the walks-like-a-duck test doesn't
| guarantee that the internals are a duck. If you see a
| creature with feathers that waddles and can fly, it might
| still be a robot when you open it. So your test cannot just
| rely on external tests. But you also want to create a
| definition of artificial intelligence that doesn't depend on
| being made of meat and electricity.
| pegasus wrote:
| I think @dvt's comment above is a good attempt at answering
| this question. I agree with him that intrinsic motivation
| and a capacity for suffering, hope and all the other
| emotions (which we share with pretty much all animals, if
| not plants too) are at the top of the list. Cleverness is
| there also, but not at the top of the list.
| brotchie wrote:
| re: Calculator examples, I kinda see them both as information
| systems that achieve a result but there's a phase difference
| in where the information is stored.
|
| Similar to how spacetime is 4D such that with relativity going
| faster in a spatial dimension kind of "borrows" from the time
| dimension (in a hand wavy way).
|
| By analogy, you can have something that's purely a lookup
| table, or on the other hand, completely based on an
| algorithm, and the full lookup table is kind of "borrowing"
| from the algorithmic dimension of the information system
| space and, vice versa, the fully algorithmic version is
| borrowing from the hardcoded dimension of the information
| system space.
|
| Under the condition that you're adding integers below N, then
| if you consider BOTH the (hardcoded, algorithmic) as a
| singular space (as with 4D space time) then they are
| equivalent.
|
| Need to work on this theory further to make it more
| understandable, but I think this way about intelligence.
|
| Intelligence sits as a pattern in the information system
| space that can range anywhere from hardcoded to algorithmic
| (if we choose to orthogonalize the space this way). But what
| actually matters is the system's future impact on its local
| laws of physics, and for that purpose both implementations
| are equivalent.
|
| Edit: Conversation with GPT-4 about this
| https://sharegpt.com/c/Sbs4XgI
| bnralt wrote:
| > I'm no philosopher, but if your argument hinges on moving
| goalposts until purity is reached, and your basic assumption is
| that the requirements for purity are infinite, then it's not a
| very useful argument.
|
| I imagine many definitions are initially rather broad and only
| get refined down over time. Laertius gives us a classic
| example:
|
| > Plato defined man thus: "Man is a two-footed, featherless
| animal," and was much praised for the definition; so Diogenes
| plucked a cock and brought it into his school, and said, "This
| is Plato's man." On which account this addition was made to the
| definition, "With broad flat nails."
|
| I don't think it's correct to think of that as infinitely
| moving goalposts, however. More that the weakness of
| definitions isn't always immediately transparent.
| eternalban wrote:
| > I cannot see that it matters if a computer understands
| something.
|
| I am not sure they can, but the difference is profound and
| material. A machine that actually understands, like a human
| being, is not going to be (can not be) entirely truthful or
| transparent. There will be private inner thoughts, idea
| formation, and possibly even willful intent, as a direct
| consequence of _understanding_. And the nature of interactions,
| regardless of superficial similarity, shifts from one of
| _utility_ to _relationship_. For example, we would care to know
| if e.g. the systems entrusted with apocalyptic deterrent forces
| are mechanisms or organisms.
|
| Please note that not a single one of us has ever interacted
| with any intelligent life form lacking a sense of self, or an
| ego. Thus, all our sensory registers of another 'intelligent
| being' are learned in a context of the implicit 'this other is
| like me'. We are _not_ equipped to distinguish or articulate
| intelligence (in the abstract) merely based on sensory
| information. Note that even non-verbal cues, such as
| jabbing a friend in the ribs, are all _learned to have a
| certain meaning_ in that very same context of implicits, and
| any mechanism that mimics them (via training) will be afforded
| the same _projection of the implicit_. I do not believe there
| is, in fact, any non-destructive test of determining
| 'consciousness' in an entity. (Destructive, since there may be
| long-running tests of a subject that can be shown to be
| probably accurate, possibly via creating situational problems
| involving survival, and unexpected circumstances.)
|
| Ask yourself what is it that convinces you that the last person
| you spoke with (in real life) was actually conscious? I assert
| that the entire matter is a 'fictional certainty' based on
| assumption of shared nature. "They are conscious because I am".
| JieJie wrote:
| I literally lost a friend of thirty years yesterday because she
| is wedded to the Chinese Room analogy so fiercely, she refuses
| to engage on the subject at all.
|
| For all the terrible things people worry about ChatGPT doing,
| this was not one that I thought I was going to have to deal
| with.
|
| (edit: ChatGPT was not involved at all, but when I suggested
| she give it a try to see for herself, that was the end of it.)
| brazzy wrote:
| How can a disagreement over GPT possibly be something to end
| a friendship over?
|
| This sounds like you said "I cannot possibly be friends with
| someone who does not believe that LLMs are emerging AGI!",
| and people read it like that and are downvoting you.
|
| I'm gonna assume the situation was more complex, but still
| find it hard to imagine how a disagreement over such an
| academic topic could end up destroying a friendship.
| JieJie wrote:
| It was a complex situation to be sure.
|
| I only shared the story to illustrate how personally people
| are taking these discussions. I really felt like I was
| being very neutral and just sharing my enthusiasm. It was
| entirely unwelcome, apparently.
|
| If there's a lesson to be learned it's that people's
| tempers over these issues may be hotter than they appear.
|
| I can barely speak with my artist friends on the issue
| these days due to their generative AI fears. Their emotions
| are completely intractable on the subject: AI art is theft.
| Period.
| bulkprotocol wrote:
| You blew up a 30 year friendship over an...analogy?
| JieJie wrote:
| I didn't! Someone else did it to me. I was trying
| desperately not to.
|
| (edit: This is the kind of stuff I think my friends are
| watching and being informed by [0] as it was what they are
| posting in our common areas.)
|
| [0]: https://youtu.be/ro130m-f_yk
| slawr1805 wrote:
| This was a great read! Especially for a beginner like me.
| agentultra wrote:
| A good article and well articulated!
|
| I would change the introduction to be more impartial and not
| anthropomorphize GPT. It is not smart and it is not skilled in
| any tasks other than that for which it is designed.
|
| I have the same reservations about the conclusion. The whole
| middle of the article is good. But to then compare the richness
| of our human experience to an algorithm that was plainly
| explained? And then to speculate on whether an algorithm can
| "think" and if it will "destroy society," weakens the whole
| article.
|
| I really would like to see more technical writing of this sort
| geared towards a general audience without the speculation and
| science-fiction pontificating.
|
| Good effort!
| fl7305 wrote:
| > it is not skilled in any tasks other than that for which it
| is designed.
|
| But it wasn't designed. It's not a computer program, where one
| can make confident predictions about its limitations based on
| the source code.
|
| It's a very large black box. It was trained on guessing the
| next word. Does that fact alone prove that it cannot have
| evolved certain internal structures during the training?
|
| Do you claim that an artificial neural network with trillions
| of neurons can never be intelligent, no matter the structure?
|
| Or is the claim that this particular neural network with
| trillions of neurons is not intelligent? If so, what is the
| reasoning?
|
| > It is not smart
|
| "Not smart" = "not able to reason intelligently".
|
| Is that a falsifiable claim?
|
| What would the empirical test look like that would show us if
| the claim is correct or not?
|
| Look, I realize that "GPT-4 is intelligent" is an extraordinary
| claim that requires extraordinary evidence.
|
| But I think we're starting to see such extraordinary evidence,
| illustrated by the examples below.
|
| https://openai.com/research/gpt-4 (For instance, the "Visual
| inputs" section)
|
| Microsoft AI research: Many convincing examples, summarized
| with:
|
| _" The central claim of our work is that GPT-4 attains a form
| of general intelligence, indeed showing sparks of artificial
| general intelligence.
|
| This is demonstrated by its core mental capabilities (such as
| reasoning, creativity, and deduction), its range of topics on
| which it has gained expertise (such as literature, medicine,
| and coding), and the variety of tasks it is able to perform
| (e.g., playing games, using tools, explaining itself, ...)."_
|
| https://arxiv.org/abs/2303.12712
| agentultra wrote:
| > It's a very large black box. It was trained on guessing the
| next word. Does that fact alone prove that it cannot have
| evolved certain internal structures during the training?
|
| Yes. There is interesting work to formalize these black boxes
| to be able to connect what was generated back to its inputs.
| There's no need to ascribe any belief that they can evolve,
| modify themselves, or spontaneously develop intelligence.
|
| As far as I'm aware, no man-made machine has ever exhibited
| the ability to evolve.
|
| > Do you claim that an artificial neural network with
| trillions of neurons can never be intelligent, no matter the
| structure?
|
| If, by structure, you mean some algorithm and memory layout
| in a modern computer I think this sounds like a reasonable
| claim.
|
| NN, RNN, etc are super, super cool. But they're not magic.
| And what I'm arguing in this thread is that people who don't
| understand the maths and research are making wild claims
| about AGI that are not justified.
|
| > Look, I realize that "GPT-4 is intelligent" is an
| extraordinary claim that requires extraordinary evidence.
|
| That's the crux of it.
| fl7305 wrote:
| > There's no need to ascribe any belief that they can
| evolve, modify themselves, or spontaneously develop
| intelligence.
|
| But neural networks clearly evolve and are modified during
| training. Otherwise they would never get any better than a
| random collection of weights and biases, right?
|
| Is the claim then that an artificial neural network can
| never be trained in such a way that it will exhibit
| intelligent behavior?
|
| >> Do you claim that an artificial neural network with
| trillions of neurons can never be intelligent, no matter
| the structure?
|
| > If, by structure, you mean some algorithm and memory
| layout in a modern computer I think this sounds like a
| reasonable claim.
|
| Yes, that's what I mean.
|
| Is your claim that no Turing machine can be intelligent?
|
| >> Look, I realize that "GPT-4 is intelligent" is an
| extraordinary claim that requires extraordinary evidence.
|
| > That's the crux of it.
|
| And I provided links to such evidence. Is there a rebuttal?
|
| If we're saying that GPT-4 is not intelligent, there must
| be questions that intelligent humans can answer that GPT-4
| can't, right?
|
| What is the type of logical problem one can give GPT-4 that
| it cannot solve, but most humans will?
| agentultra wrote:
| > Is the claim then that an artificial neural network can
| never be trained in such a way that it will exhibit
| intelligent behavior?
|
| I think it's not likely a NN can be trained to exhibit
| any kind of autonomous intelligence.
|
| Science has good models and theories of what intelligence
| is, what constitutes consciousness, and these models are
| continuing to evolve based on what we find in nature.
|
| I don't doubt that we can train NN, RNN, and deep
| learning NN to specific tasks that plausibly emulate or
| exceed human abilities.
|
| That we have these deep learning systems that can learn
| supervised and unsupervised is super cool. And again,
| fully explainable maths that anyone with enough education
| and patience can understand.
|
| I'm interested in seeing some of these algorithms
| formalized and maybe even adding automated theorem
| proving capabilities to them in the future.
|
| But in none of these cases do I believe these systems are
| intelligent, conscious, or capable of autonomous thought
| like any organism or system we know of. They're just
| programs we can execute on a computer that perform a
| particular task we designed them to perform.
|
| Yes, it can generate some impressive pictures and text.
| It can be useful for all kinds of applications. But it's
| not a living, breathing, thinking, autonomous organism.
| It's a program that generates a bunch of numbers and
| strings.
|
| But when popular media starts calling ChatGPT
| "intelligent," we're performing a mental leap here that
| also absolves the people employing LLMs from
| responsibility for how they're used.
|
| ChatGPT isn't going to take your job. Capitalists who
| don't want to pay people to do work are going to lay off
| workers and not replace them because the few workers that
| remain can do more of the work with ChatGPT.
|
| Society isn't threatened by ChatGPT becoming self aware
| and deciding it hates humans. It cannot even decide such
| things. It is threatened by scammers who have a tool that
| can generate lots of plausible sounding social media
| accounts to make a fake application for a credit card or
| to socially engineer a call centre rep into divulging
| secrets.
| fl7305 wrote:
| > "it's not a living, breathing, thinking, autonomous
| organism"
|
| > "autonomous intelligence"
|
| > "what constitutes consciousness"
|
| > "autonomous thought"
|
| In my mind, this is a list of different concepts.
|
| GPT-4 is definitely not living, breathing or autonomous.
| It doesn't take any actions on its own. It just responds
| to text.
|
| Can we stay on just the topic of intelligence?
|
| Let's take this narrow definition: "the ability to
| reason, plan, solve problems, think abstractly,
| comprehend complex ideas".
|
| > But in none of these cases do I believe these systems
| are intelligent
|
| It should be possible to measure whether an entity is
| intelligent just by asking it questions, right?
|
| Let's say we have an unknown entity at the other end of a
| web interface. We want to decide where it falls on a
| scale between stochastic parrot and an intelligent being.
|
| What questions about logical reasoning and problem
| solving can we ask it to decide that?
|
| And where has GPT-4 failed in that regard?
| ryanwaggoner wrote:
| _It is not smart and it is not skilled in any tasks other than
| that for which it is designed._
|
| This seems to me to be obviously incorrect, and should be
| apparent after a few minutes of playing with GPT4. What makes
| it so powerful is how general-purpose it is, and it can be used
| for literally an unlimited set of tasks that involve human
| language. To say that it's not "smart" begs the question of
| what exactly constitutes smart and when you'll know that an AI
| has achieved it.
| nitnelave wrote:
| I'm planning on continuing this popularization series of "Simply
| explained", for instance to cover how computers communicate,
| keep an eye out for them!
|
| Regarding the speculation/destroy society, I was directly
| answering questions that I got from laypeople around me. I
| don't think the consequences for society are mere speculation:
| it's going to have a big effect on many jobs, just like AI has
| started to have but much more. For the philosophical questions,
| I tried to present both sides of the issue to show that it's
| not just a clear "yes or no": some people will happily argue
| with you about GPT being smart/skilled/comparable to a human
| brain. Anyway, it's just an introduction to the questions that
| you might have about it.
| agentultra wrote:
| > keep an eye out for them!
|
| I will, thank you! :)
|
| > Regarding the speculation/destroy society, I was directly
| answering questions that I got from laypeople around me.
|
| I get that. I think it's important in these times that we
| educate laypersons rather than froth up fears about "AI". It
| doesn't help, I suppose, that we get questions like this
| because some lazy billionaire decided to run their mouth off
| about this or that. Which society then treats like it is news
| and established fact.
|
| I don't think the speculation about consciousness is as well
| informed as the rest of the article. There is plenty of
| science and research about it available and its definition
| extends well beyond humans! Our understanding of what
| consciousness is is a thoroughly researched topic in
| psychology, physiology, biology, etc! It's a fascinating area
| of study.
|
| Best of luck and keep up the good work!
| prerok wrote:
| I agree that it is a good effort. I don't agree with the
| criticism.
|
| It really depends on who the target audience is. There's been a
| lot of scare mongering in the news about it lately and I think
| the last part tries to address that. It first offers an
| explanation that my parents can understand and then addresses
| what they have been hearing about in the news.
|
| So, I would say it is great to share it with them and I think
| they are the intended audience.
___________________________________________________________________
(page generated 2023-04-06 23:01 UTC)