[HN Gopher] Quiet-STaR: Language Models Can Teach Themselves to ...
       ___________________________________________________________________
        
       Quiet-STaR: Language Models Can Teach Themselves to Think Before
       Speaking
        
       Author : hackerlight
       Score  : 236 points
       Date   : 2024-03-15 09:24 UTC (13 hours ago)
        
 (HTM) web link (arxiv.org)
 (TXT) w3m dump (arxiv.org)
        
       | roschdal wrote:
        | Next: language models teaching themselves to think, then killing
        | humans, based on crawled Russian websites with secret AI
        | instructions.
        
         | raidicy wrote:
          | Although this is obviously satirical hyperbole, dataset
          | poisoning is real and will be underappreciated until the first
          | catastrophic example of it actually occurs.
        
           | DFHippie wrote:
            | Many years ago now I wrote my kids a very simple chatbot to
            | play with. You'd type in a phrase. It would tokenize it,
            | adding start and stop tokens, then update its token
            | transition probabilities, using the two preceding tokens to
            | pick the next one. It would then generate a response from
            | these probabilities.
           | 
           | The data poisoning began immediately. Because "poop" was such
           | a funny word, they quickly taught it that the most probable
           | token after any bigram was "poop".
           | 
           | No humans were killed, but two small kids were amused for an
           | hour or so.
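            A minimal Python sketch of that kind of bigram chatbot (the
            names are illustrative, not taken from the original program):

              import random
              from collections import defaultdict

              START, STOP = "<s>", "</s>"
              # counts[(t1, t2)] maps a candidate next token to how often
              # it followed that bigram
              counts = defaultdict(lambda: defaultdict(int))

              def learn(phrase):
                  tokens = [START, START] + phrase.lower().split() + [STOP]
                  for a, b, c in zip(tokens, tokens[1:], tokens[2:]):
                      counts[(a, b)][c] += 1

              def respond(max_len=30):
                  a, b, out = START, START, []
                  for _ in range(max_len):
                      options = counts[(a, b)]
                      if not options:
                          break
                      # sample the next token in proportion to its count
                      nxt = random.choices(list(options),
                                           weights=list(options.values()))[0]
                      if nxt == STOP:
                          break
                      out.append(nxt)
                      a, b = b, nxt
                  return " ".join(out)

              learn("the sky is blue")
              learn("poop poop poop")   # this is how the poisoning starts
              print(respond())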
        
             | raidicy wrote:
              | My condolences for your model's poisoning. It sounds like a
              | real crappy way to go :?
        
       | dcrimp wrote:
        | I had this thought the other day that the way the chain-of-
        | thought reasoning pattern improves performance in LLM-based
        | systems seems to sit parallel to Kahneman's two-system model of
        | the mind, which he covers in 'Thinking, Fast and Slow'.
       | 
        | Haven't read it in a few years, but I recall the book suggests
        | that we use 'System 1' in our brains primarily for low-effort,
        | low-computation thinking - like 1+1=? or "the sky is ____".
       | 
       | It then suggests that we use a 'System 2' for deliberate,
       | conscious, high-cognitive tasks. Dense multiplication, reasoning
       | problems, working with tools - generally just decision-making.
       | Anything that requires focus or brain power. Our brain escalates
       | tasks from S1 to S2 if they feel complex or dangerous.
       | 
        | Maybe I'm being too cute, but it feels like the critique that
        | "LLMs aren't intelligent because they are stochastic parrots" is
        | an observation that they are only equipped to use their 'System
        | 1'.
       | 
        | When we prompt an LLM to think step-by-step, we allow it a
        | workspace to write down its thoughts, which it can then consider
        | in its next token prediction - a rudimentary System 2, like a
        | deliberation sandbox.
       | 
       | We do a similar thing when we engage our System 2 - we hold a
       | diorama of the world in the front of our mind, where we simulate
       | what the environment will do if we proceed with a given action -
       | what our friend might respond to what we say, how the sheet steel
       | might bend to a force, how the code might break, how the tyres
       | might grip. And we use that simulation to explore a tree of
       | possibilities and decide an action that rewards us the most.
       | 
        | I'm no expert, but this paper seems to recognise a similar
        | framework to the above. Perhaps a recurrent
        | deliberation/simulation mechanism will make its way into models
        | in the future, especially the action models we are seeing in
        | robotics.
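        As a rough sketch of that "deliberation sandbox" idea, the only
        difference between the two prompts below is the extra instruction
        that gives the model room to emit intermediate tokens before
        committing to an answer (the wording is just an example, not taken
        from the paper):

          question = ("A bat and a ball cost $1.10 together and the bat "
                      "costs $1.00 more than the ball. "
                      "How much does the ball cost?")

          # "System 1" style: the model must commit to an answer at once.
          direct_prompt = f"{question}\nAnswer:"

          # "System 2" style: the model first generates a scratchpad of
          # reasoning tokens, which then condition the final answer tokens.
          cot_prompt = (f"{question}\n"
                        "Let's think step by step, then give the final "
                        "answer on the last line.")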
        
         | OJFord wrote:
         | I'm currently reading it for the first time, completely
         | coincidentally/not for this reason, and on a few occasions I've
         | thought 'Gosh that's just like' or 'analogous to' or 'brilliant
         | description of that problem' for LLMs/generative AI or some
         | aspect of it. I wish I could recall some examples.
        
         | machiaweliczny wrote:
          | It's a bit over my head for now, but it seems like GFlowNets
          | are tackling this problem a bit.
        
           | dcrimp wrote:
           | interesting, hadn't come across these. Will be doing some
           | more reading up on them.
        
         | dougmwne wrote:
         | I had the same thought from Thinking, Fast and Slow.
         | 
         | Another variation of this seems to be the "thought loop" that
         | agents such as Devin and AutoGPT use.
        
           | mistermann wrote:
           | https://en.m.wikipedia.org/wiki/OODA_loop
        
         | biosed wrote:
          | Weren't most of the claims in that book refuted, some even by
          | the author? I really enjoyed it and found some great insights,
          | only to be told later by a friend in that sphere that the book
          | was not correct and that even the author had "retracted" some
          | of the assertions.
        
           | jerpint wrote:
            | He won a Nobel prize for his work, so I'm not sure how much
            | of it would be refuted.
        
             | gryn wrote:
              | One quick Google search and you can find multiple links on
              | that, including some that were posted here. It wasn't
              | proven to be false, but the evidence used was not much of
              | evidence either.
              | 
              | Here's the first one in my results:
             | 
             | https://retractionwatch.com/2017/02/20/placed-much-faith-
             | und...
        
               | mistermann wrote:
               | As luck would have it, a System 1 vs System 2 scenario
               | falls into our laps.
        
           | mannykannot wrote:
           | It might still be a useful concept in developing LLMs.
        
         | toisanji wrote:
          | That is the approach also taken in this paper for building LLM
          | agents with metacognition: https://replicantlife.com/
        
         | HarHarVeryFunny wrote:
         | > it feels like critique that "LLMs aren't intelligent because
         | they are stochastic parrots" is an observation that they are
         | only equipped to use their 'System 1'.
         | 
         | I wouldn't say LLMs aren't intelligent (at all) since they are
         | based on prediction which I believe is the ability that we
         | recognize as intelligence. Prediction is what our cortex has
         | evolved to do.
         | 
         | Still, intelligence isn't an all or nothing ability - it exists
         | on a spectrum (and not just an IQ score spectrum). My
         | definition of intelligence is "degree of ability to correctly
         | predict future outcomes based on past experience", so it
         | depends on the mechanisms the system (biological or artificial)
         | has available to recognize and predict patterns.
         | 
         | Intelligence also depends on experience, minimally to the
         | extent that you can't recognize (and hence predict) what you
         | don't have experience with, although our vocabulary for talking
         | about this might be better if we distinguished predictive
         | ability from experience rather than bundling them together as
         | "intelligence".
         | 
         | If we compare the predictive machinery of LLMs vs our brain,
         | there is obviously quite a lot missing. Certainly "thinking
         | before speaking" (vs LLM fixed # steps) is part of that, and
         | this Q* approach and tree-of-thoughts will help towards that.
         | Maybe some other missing pieces such as thalamo-cortical loop
         | (iteration) can be retrofitted to LLM/transformer approach too,
         | but I think the critical piece missing for human-level
         | capability is online learning - the ability to act then see the
         | results of your action and learn from that.
         | 
          | We can build a "book smart" AGI (you can't learn what you
          | haven't been exposed to, so maybe it's unfair to withhold the
          | label "AGI" just because of that) based on the current
          | approach, but the only way to learn a skill is by practicing it
          | and experimenting. You can't learn to be a developer, or
          | anything else, just by reading a book or analyzing what other
          | people have produced - you need to understand the real-world
          | results of your _own_ predictions/actions, and learn from that.
        
           | RandomLensman wrote:
            | Defining intelligence as prediction leaves out a lot of other
            | things that humans would see as intelligence in other humans
            | (e.g., creating a novel); also, quite simple organisms make
            | predictions (e.g., a predator jumping at prey makes a
            | prediction about positions).
        
             | HarHarVeryFunny wrote:
              | Maybe a better way to say it rather than "intelligence is
              | prediction" is that prediction is what supports the
              | behaviors we see as intelligent. For example, prediction is
              | the basis of what-if planning (multi-step prediction),
              | prediction (as LLMs have proved) is the basis of learning
              | and using language, prediction is the basis of modelling
              | other people and their actions, etc. So, ultimately, the
              | ability to write a novel is a result of prediction.
             | 
             | Yes, an insect (a praying mantis, perhaps) catching another
             | is exhibiting some degree of prediction, and per my
             | definition I'd say is exhibiting some (smallish) degree of
             | intelligence in doing so, regardless of this presumably
             | being a hard-coded behavior. Prediction becomes more and
             | more useful the better you are at it, from avoiding
             | predators, to predicting where the food is, etc, so this
             | would appear to be the selection pressure that has evolved
             | our cortex to be a very powerful prediction machine.
        
               | RandomLensman wrote:
               | The ability to write a novel is different from actually
               | writing a novel. If prediction forms the basis of (at
               | least some forms of) intelligence, intelligence itself is
               | more than prediction.
        
               | HarHarVeryFunny wrote:
               | That's why I say our vocabulary for talking about these
               | things leaves something to be desired - the way we use
               | the word "intelligence" combines both raw/potential
               | ability to do something (prediction), and the experience
               | we have that allows that ability to be utilized. The only
               | way you are going to learn to actually write a novel is
               | by a lot of reading and writing and learning how to write
               | something that provides the experience you hope it to
               | have.
        
               | RandomLensman wrote:
                | Kind of agree. I think, though, trying to shoe-horn
                | intelligence into some evolutionary concepts is tricky
                | because it is easy to stack hypotheses there.
        
               | coldtea wrote:
               | > _The ability to write a novel is different from
               | actually writing a novel_
               | 
               | In what way, except as in begging the question?
        
               | RandomLensman wrote:
               | Which LLM will on its own go and write a novel? Also,
               | even for humans, just because you technically know how to
               | write a novel, you might fail at it.
        
               | coldtea wrote:
               | > _Which LLM will on its own go and write a novel?_
               | 
               | Which human will?
               | 
               | We get prompts all the time, it's called sensory input.
               | 
                | Instead of "write a novel" it's more like information
                | about literature, life experience, that partner who broke
                | our heart and triggered our writing this personal novel,
                | and so on.
        
               | RandomLensman wrote:
               | Some people write novels, some don't. Why some people do
               | so we sometimes know, sometimes we don't (maybe they
               | flipped a coin to decide). Some start to write but fail
               | to finish.
               | 
                | You have to believe that humans have no free will in a
                | certain way to have them be like an LLM, i.e., every
                | action is externally driven and determined.
        
               | coldtea wrote:
               | > _You have to believe that humans have no free will in a
               | certain way to have them be like an LLM, i.e, every
               | action is externally driven and determined._
               | 
                | Free will doesn't have much meaning. If I don't base my
                | action at time t on my development from inputs at times
                | before t, what would I base it on?
                | 
                | Would it be random?
                | 
                | Or would there be a small thinking presence inside me
                | that gets information about my current situation and
                | decides "impartially", able to decide in whatever
                | direction, because it wasn't itself entirely determined
                | by my experiences thus far?
        
               | RandomLensman wrote:
               | Randomness is certainly an option. Ignoring information
               | is an option.
        
               | spookie wrote:
                | I think you're confusing prediction with ratiocination.
                | 
                | I'm sure you've deduced hypotheses based solely on the
                | assertion that "contradiction and being are
                | incompatible". Note, there wasn't prediction involved in
                | that process.
                | 
                | I consider prediction a subset of reason, but not the
                | contrary. Therefore, I beg to differ with the whole
                | assumption that "intelligence is prediction". It's more
                | than that; prediction is but a subset of it.
                | 
                | This is perhaps the biggest reason for the high
                | computational costs of LLMs: they aren't taking the
                | shortcuts necessary to achieve true intelligence,
                | whatever that is.
        
               | HarHarVeryFunny wrote:
               | > I think you're confusing prediction with ratiocination.
               | 
                | No, exactly not! Prediction is probabilistic and liable
                | to be wrong, with those probabilities needing
                | updating/refining.
               | 
               | Note that I'm primarily talking about prediction as the
               | brain does it - not about LLMs, although LLMs have proved
               | the power of prediction as a (the?) learning mechanism
               | for language. Note though that the words predicted by
               | LLMs are also just probabilities. These probabilities are
               | sampled from (per a selected sampling "temperature" -
               | degree of randomness) to pick which word to actually
               | output.
               | 
               | The way the brain learns, from a starting point of
               | knowing nothing, is to observe and predict that the same
               | will happen next time, which it often will, once you've
               | learnt what observations are appropriate to include or
                | exclude from that prediction. This is all highly
                | probabilistic, which is appropriate given that the thing
                | being predicted (what'll happen if I throw a rock at that
                | tiger?) is often semi-random in nature.
               | 
               | We can better rephrase "intelligence is ability to
               | predict well", as "intelligence derives from ability to
               | predict well". It does of course also depend on
               | experience.
               | 
               | One reason why LLMs are so expensive to train is because
               | they learn in an extremely brute force fashion from the
               | highly redundant and repetitive output of others. Humans
               | don't do that - if we're trying to learn something, or
               | curious about it, we'll do focused experiments such as
               | "Let's see what happens if I do this, since I don't
               | already know", or "If I'm understanding this right, then
               | if I do X then Y should happen".
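                A small sketch of the sampling step described above,
                assuming we already have the model's raw scores (logits)
                over a few candidate next words; lower temperature sharpens
                the distribution, higher temperature flattens it:

                  import math, random

                  def sample_next_word(logits, temperature=1.0):
                      # softmax with temperature over a {word: logit} dict
                      words = list(logits)
                      scaled = [logits[w] / temperature for w in words]
                      m = max(scaled)  # subtract max for stability
                      weights = [math.exp(s - m) for s in scaled]
                      return random.choices(words, weights=weights)[0]

                  logits = {"blue": 2.0, "grey": 1.0, "green": 0.3}
                  print(sample_next_word(logits, temperature=0.7))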
        
             | jimbokun wrote:
             | LLMs have shown that writing a novel can be accomplished as
             | an application of prediction, at least to a certain level
             | of quality.
        
               | RandomLensman wrote:
                | I have yet to see an LLM write a novel of its own
                | volition.
        
             | coldtea wrote:
             | > _Defining intelligence as prediction leaves out a lot of
             | other things that humans would see as intelligence in other
             | humans (e.g., creating a novel)_
             | 
             | Would it?
             | 
              | Why would "creating a novel" by a human not itself be text
              | generation based on predicting the next good choices (of
              | themes, words, etc.) from a training data set of lived
              | experience and of reading other literature?
        
               | RandomLensman wrote:
               | What is the human predicting there? Why would it need to
               | be a prediction task at all? How about a dada-ist poem?
               | Made-up words and syntax? If it is prediction but the
               | criterion for "what is a good next choice" can totally be
               | made up on the fly - what does the word "prediction" even
               | mean?
        
               | coldtea wrote:
               | > _What is the human predicting there?_
               | 
               | Their next action - word put on page, and so on.
               | 
               | > _Why would it need to be a prediction task at all?_
               | 
               | What else would it be?
               | 
               | Note that prediction in LLM terminology doesn't mean
               | "what is going to happen in the future" like Nostradamus.
               | It means "what is a good next word given the input I was
               | given and the words I've answered so far".
               | 
               | > _How about a dada-ist poem? Made-up words and syntax?_
               | 
                | How about it? People have their training (sensory input,
                | stuff they've read, school, discussions) and sit down to
                | predict (come up with, based on what they know) a made-up
                | word and then another.
        
               | RandomLensman wrote:
                | That is a meaningless definition of prediction if "what
                | is a good next word" has an ever-changing definition in
                | humans (as everything would fulfill that definition).
        
               | coldtea wrote:
                | That's the very definition of prediction in an LLM.
                | 
                | What does "has an ever-changing definition" mean?
                | 
                | And why would "everything fulfill that definition"?
                | 
                | At any time, what the "good next word" is is based on the
                | state created by our inputs thus far (including
                | chemical/physiological state, like decaying memories, and
                | so on). And not only does not everything fulfill it, but
                | it can be only a single specific word.
                | 
                | (Same as if we include the random seed with an LLM's
                | output: we get the same results given the same training
                | and the same prompt.)
        
               | RandomLensman wrote:
                | "it can be only a single specific word" - that is
                | incorrect, as a human can change the process used to
                | generate the next word, up to and including using a
                | random process to create or select the next word (i.e.,
                | any word would be fine).
                | 
                | You could say the process chosen is somehow predetermined
                | (even if the choices are then all made using randomness),
                | but then the word "prediction" really has very little
                | meaning, as the criteria for what is a "good next word"
                | have a nearly unlimited and ever-changing range as the
                | generating process changes.
        
               | duskwuff wrote:
               | > Why would "creating a novel" by a human not itself be
               | text generation based on prediction on what are the next
               | good choices (of themes, words, etc) based on a training
               | data set of lived experience stream and reading other
               | literature?
               | 
               | Unless you're Stephen King on a cocaine bender, you don't
               | typically write a novel in a single pass from start to
               | finish. Most authors plan things out, at least to some
               | degree, and go back to edit and rewrite parts of their
               | work before calling it finished.
        
           | hackerlight wrote:
           | > online learning - the ability to act then see the results
           | of your action and learn from that.
           | 
           | I don't think that should be necessary, if you are talking
           | about weight updates. Offline batch mode Q-learning achieves
           | the same thing.
           | 
            | By online learning, did you mean working memory? I'd agree
            | with that. Whether it's RAG, ultra-long context, an LSTM-
            | like approach, or something else, is TBD.
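            For reference, the core of a tabular Q-learning update; run
            over a fixed batch of logged (state, action, reward,
            next_state) transitions it is "offline", run on transitions as
            they arrive it is "online". This is a generic sketch, not tied
            to any particular LLM training setup:

              from collections import defaultdict

              Q = defaultdict(float)    # Q[(state, action)] -> value
              ALPHA, GAMMA = 0.1, 0.99  # learning rate, discount
              ACTIONS = ["a0", "a1"]

              def q_update(s, a, r, s_next):
                  # one Bellman backup; identical whether the transition
                  # was just observed or replayed from a logged batch
                  best_next = max(Q[(s_next, b)] for b in ACTIONS)
                  Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])

              # offline/batch mode: sweep repeatedly over a fixed dataset
              batch = [("s0", "a1", 1.0, "s1"), ("s1", "a0", 0.0, "s0")]
              for _ in range(100):
                  for s, a, r, s_next in batch:
                      q_update(s, a, r, s_next)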
        
             | HarHarVeryFunny wrote:
             | By online learning I mean incremental real-time learning
             | (as opposed to pre-training), such that you can predict
             | something (e.g. what some external entity is going to do
             | next, or the results of some action you are about to take),
             | then receive the sensory feedback of what actually
             | happened, and use that feedback to improve your predictions
             | for next time.
             | 
             | I don't think there is any substitute for a predict-act-
             | learn loop here - you don't want to predict what someone
             | else has done (which is essentially what LLMs learn from a
             | training set), you want to learn how your OWN predictions
             | are wrong, and how to update them.
        
               | exe34 wrote:
               | > By online learning I mean incremental real-time
               | learning, such that you can predict something (e.g. what
               | some external entity is going to do next, or the results
               | of some action you are about to take),
               | 
               | I used to believe this, but the recent era of LLMs has
               | changed my mind. It's clear that the two things are not
               | related: you don't need to update weights in real-time if
               | you can hold context another way (attention) while
               | predicting the next token.
               | 
               | The fact that we appear to remember things with one-shot,
               | online training might be an illusion. It appears that we
               | don't immediately update the weights (long term memory),
               | but we store memories in short term memory first (e.g.
               | https://www.scientificamerican.com/article/experts-short-
               | ter...).
        
               | HarHarVeryFunny wrote:
               | The fundamental difference is that humans do learn,
               | permanently (eventually at least), from prediction
               | feedback, however this works. I'm not convinced that STM
               | is necessarily involved in this particular learning
               | process (maybe just for episodic memories?), but it makes
               | no difference - we do learn from the feedback.
               | 
                | An LLM can perform one-shot in-context learning, which in
                | conversational mode will include (up to the context
                | limit) feedback from its actions (output), but this is
                | never learned permanently.
               | 
               | The problem with LLMs not permanently learning from the
               | feedback to their own actions is that it means they will
               | never learn new skills - they are doomed to only learn
               | what they were pre-trained with, which isn't going to
               | include the skills of any specific job unless that
               | specific on-the-job experience of when to do something,
               | or avoid doing it, were made a part of it. The training
               | data for this does not exist - it's not the millions of
               | lines of code on GitHub or the bug fixes/solutions
               | suggested on Stack Overflow - what would be needed would
               | be the inner thoughts (predictions) of developers as they
               | tackled a variety of tasks and were presented with
               | various outcomes (feedback) continuously throughout the
               | software development cycle (or equivalent for any other
               | job/skill one might want them to acquire).
               | 
               | It's hard to see how OpenAI or anyone else could provide
               | this on-the-job training to an LLM even if they let it
               | loose in a programming playground where it could generate
               | the training dataset. How fast would the context fill
               | with compiler/link errors, debugger output, program
               | output etc ... once context was full you'd have to pre-
               | train on that (very slow - months, expensive) before it
               | could build on that experience. Days of human experience
               | would take years to acquire. Maybe they could train it to
               | write crud apps or some other low-hanging fruit, but it's
               | hard to see this ever becoming the general purpose "AI
               | programmer" some people think is around the corner. The
               | programming challenges of any specialized domain or task
               | would require training for that domain - it just doesn't
               | scale. You really need each individual deployed instance
               | of an LLM/AI to be able to learn itself - continuously
               | and incrementally - to get the on-the-job training for
               | any given use.
        
               | exe34 wrote:
               | > but this is never learned permanently.
               | 
               | Are you sure? I think "Open"AI uses the chat transcripts
               | to help the next training run?
               | 
               | > they are doomed to only learn what they were pre-
               | trained with
               | 
               | Fine-tuning.
               | 
               | > The training data for this does not exist
               | 
               | What does "this" refer to? Have you read the Voyager
               | paper? (https://arxiv.org/abs/2305.16291) Any lesson
               | learnt in the library could be used for fine-tuning or
               | the next training run for a base model.
               | 
               | > what would be needed would be the inner thoughts
               | (predictions) of developers as they tackled a variety of
               | tasks and were presented with various outcomes (feedback)
               | continuously throughout the software development cycle
               | 
               | Co-pilot gets to watch people figure stuff out - there's
               | no reason that couldn't be used for the next version. Not
               | only does it not need to read minds, but people go out of
               | their way to write comments or chat messages to tell it
               | what they think is going on and how to improve its code.
               | 
               | > Days of human experience would take years to acquire
               | 
               | And once learnt, that skill will never age, never get
               | bored, never take annual leave, never go to the kids'
               | football games, never die. It can be replicated as many
                | millions of times as necessary.
               | 
               | > they could train it to write crud apps
               | 
               | To be fair, a lot of computer code is crud apps. But
               | instead of learning it in one language, now it can do it
               | in every language that existed on stackoverflow the day
               | before its training run.
        
           | iteygib wrote:
           | To me, it is one of those things like defining what 'art' is,
           | as in creating a model in our heads around a concept. We take
           | our definitions and then use those to construct models like
           | AI that simulate our model well enough.
           | 
            | In other words, I personally do not believe any system we
            | develop will be truly 'intelligent', since intelligence is a
            | concept we created to help explain ourselves. We can't even
            | truly define it, and yet we try to test technologies we
            | develop to see if they possess it. It is a bit nonsensical
            | to me.
        
             | HarHarVeryFunny wrote:
              | Sure, we created the word intelligence to help describe
              | ourselves, and our differing levels of ability, as well as
              | applying it to animals such as apes or dogs that seem to
              | possess some similar abilities.
             | 
              | However, if we want to understand where this rather
              | nebulous ability/quality of "intelligence" comes from, the
              | obvious place to look is our cortex, which it turns out
              | actually has a rather simple architecture! If uncrumpled,
              | our cortex would be a thin sheet about the size of a tea
              | towel, consisting of six layers of neurons of different
              | types, with a specific pattern of connectivity and massive
              | amounts of feedback. We can understand this architecture to
              | be a prediction machine, which makes sense from an
              | evolutionary point of view. Prediction is what lets you act
              | according to what will happen in the future, as opposed to
              | being stuck in the present reacting to what is happening
              | right now.
             | 
             | Now, if we analyze what capabilities arise from an ability
             | to predict, such as multi-step what-if planning (multi-step
             | prediction), ability to learn and use language (as proven
             | by LLMs - a predict-next-word architecture), etc, etc, it
             | does appear (to me at least!) that this predictive function
             | of the cortex is behind all the abilities that we consider
             | as "intelligence".
             | 
             | For sure there is very little agreement on a definition of
             | intelligence, but I have offered here a very concrete
             | definition "degree of ability to predict future outcomes
             | based on past experience" that I think gets to the core of
             | it.
             | 
             | Part of the problem people have in agreeing on a definition
             | of intelligence is that this word arose from self-
              | observation as you suggest, and is more a matter of "I know
              | it when I see it" rather than having any better-defined
              | meaning. For technical discussion of AI/AGI and brain
             | architecture we really need a rigorously defined
             | vocabulary, and might be better off avoiding such a poorly
             | defined concept in the first place, but it seems we are
             | stuck with it since the word is so entrenched and people
             | increasingly want to compare machines to ourselves and
             | judge whether they too have this quality.
             | 
             | Of course we can test for intelligence, in ourselves as
             | well as machines, by using things like IQ tests to see the
             | degree to which we/they can do the things we regard as
             | intelligent (we'd really need a much deeper set of tests
             | than a standard IQ test to do a good job of assessing
             | this), but the utility of understanding what is actually
             | behind intelligence (prediction!) is that this allows us to
             | purposefully design machines that have this property, and
             | to increasing degrees of capability (via more powerful
             | predictive architectures).
        
         | airstrike wrote:
         | I'll preface this by saying I know this may sound entirely made
         | up, unscientific, anecdotal, naive, or adolescent even, but
         | luckily nobody has to believe me...
         | 
          | A few weeks back I was in that limbo state where you're neither
          | fully awake nor fully asleep, and for some reason I got into a
          | cycle where I could _notice_ my fast-thinking brain spitting
          | out words/concepts at what felt like the speed of light before
          | my slow-thinking brain would take those and turn them into
          | actual sentences.
         | 
         | It was like I was seeing my chain of thought as a list of
         | _ideas_ that was filled impossibly fast before it got
         | summarized into a proper  "thought" as a carefully selected
         | list of _words_
         | 
         | I have since believed, as others have suggested in much more
         | cogent arguments before me, that what we perceive as our
         | thoughts are, indeed, a curated output of the brainstormy
         | process that immediately precedes it
        
           | mirror_neuron wrote:
           | It's hard (impossible?) to know if we're talking about the
           | same thing or not, but I experience something like this all
           | the time, without being on the edge of sleep. We might both
           | be wrong, but it's relatable!
        
           | dicroce wrote:
           | This is fascinating. I had another experience that I think
           | sheds light on some of this. One day I was in my office and
           | the lights were off. I turned around and looked at the dark
           | shape on top of my coworkers desk. For a few seconds I stared
           | blankly and then suddenly I had a thought: PC, it's his PC.
            | Then I started to think about that period of time just before
            | I realized what I was looking at... The only word I can use
            | to describe what it felt like is: unconscious. Is it possible
            | that consciousness is just a stream of recognition?
        
           | Swizec wrote:
           | > I got into a cycle where I could notice my fast-thinking
           | brain spitting out words/concepts in what felt like the speed
           | of light before my slow-thinking brain would take those and
           | turn them into actual sentences
           | 
            | The way I've seen this described by psychologists is that
            | System 1 is driving the car while System 2 panics in the
            | back seat, screaming out explanations for every action and
            | shouting directions to the driver so it can feel in control.
            | The driver may listen to those directions, but there's no
            | direct link between System 2 in the backseat and System 1
            | holding the wheel.
           | 
           | Various experiments have shown that in many situations our
           | actions come first and our conscious
           | understanding/explanation of those actions comes second.
           | Easiest observed in people with split brain operations. The
           | wordy brain always thinks it's in control even when we know
           | for a fact it couldn't possibly have been because the link
           | has been surgically severed.
           | 
           | Being super tired, on the edge of sleep, or on drugs can
           | disrupt these links enough to let you observe this directly.
           | It's pretty wild when it happens.
           | 
           | Another easy way, for me, is to get up on stage and give a
           | talk. Your mouth runs away presenting things and you're in
           | the back of your head going "Oh shit no that's going in the
           | wrong direction and won't make the right point, adjust
           | course!"
        
             | devinprater wrote:
             | Oh, yes, that's what I do! I act first, and then consider
             | the action.
        
             | nuancebydefault wrote:
             | Sometimes when I am in a Teams call, I observe myself
             | talking. I know for myself that I can get carried away
             | whilst talking and that time passes faster then. My
             | conscious self sometimes needs to interrupt my talky self
             | with a 'nough explained signal, or even with a 'nough
             | joking signal.
             | 
             | I read several studies that show that brains don't have a
              | central point of command, so our true self cannot exist
             | (as one single origin). We are the sum of all our
             | consciousnesses, similar to how a car is the sum of its
             | parts.
        
           | nico wrote:
           | There is a technique for achieving this state of
           | consciousness, it's called noting
           | 
           | This is an awareness that advanced meditators seek, practice
           | and develop to perceive "reality as it is"
           | 
           | If you are curious, you might find related discussions, and a
           | great welcoming community at r/streamentry on Reddit
           | 
           | Also the book Mastering the Core Teachings of the Buddha
           | talks about it quite a bit, including instructions on how to
           | do it
        
             | jondwillis wrote:
              | Is this different from Dzogchen Buddhism?
        
               | nico wrote:
               | Noting is just a meditation technique
               | 
               | You might also call it an exercise for insight practice
               | 
               | There are multiple traditions that use noting or similar
               | techniques for insight practice (maybe with different
               | names)
               | 
                | Can't vouch for this thread, as I just found it, but
                | here's a related discussion (Dzogchen vs Vipassana):
                | https://www.reddit.com/r/Buddhism/comments/9t3095/dzogchen_v...
        
             | jprete wrote:
             | Noting is very useful as long as you remember not to do it
             | all the time.
        
               | 0xdeadbeefbabe wrote:
               | If you don't remember then what? Stack overflow? Heap
               | overflow?
        
           | giva wrote:
            | Well, this sounds weird to me in the sense that I don't feel
            | that I think in _words_. I only convert my thoughts into
            | words when I need to speak or write them down; so when I need
            | to communicate them to others, when I need to remember them
            | for later, or when I am stuck and I need to clear things up.
           | 
           | I was actually convinced it was the same for most people, and
           | that for this reason "Rubber duck debugging"[1] is a thing.
           | 
           | 1) https://en.wikipedia.org/wiki/Rubber_duck_debugging
        
             | JoBrad wrote:
             | Same. If I try to visualize my thoughts it's like a cloud
             | that coalesces into various forms, to show different
             | scenarios. It definitely isn't word-based until I decide to
             | actually translate it into that mode.
        
               | mewpmewp2 wrote:
               | Interesting. I think all of my thoughts are this record
               | I'm listening to as if it's an audiobook almost.
               | Sometimes, it's like multiple parallel streams of
               | different thoughts at different strengths that I can
               | observe, like a thought line that is going on, on a more
               | subconscious level, and it's something that if I notice,
               | I might want to pay attention to.
               | 
               | Like multiple LLMs are generating tokens in my head in
               | parallel, but like in my field of view, some I can only
               | listen/see barely because I'm not focusing on them.
        
             | kjqgqkejbfefn wrote:
              | Am I the only one visualizing some of my most creative
              | thoughts in a mental palace that is formed by many distinct
              | (Euclidean) spaces, whose axes connect to each other
              | through a graph? The closest thing I've found that can
              | describe this is simplicial sets:
             | 
             | picture: https://encrypted-
             | tbn0.gstatic.com/images?q=tbn:ANd9GcRx5Xam...
             | 
             | It seems it's used by cognitive models, although I'm not
             | formally trained enough to tell exactly how:
             | 
             | https://arxiv.org/pdf/1703.08314.pdf
        
               | mewpmewp2 wrote:
               | I wish I had something like this in my head to tie things
               | in together. Right now I feel like my understanding of
               | things is so disorganised and "lucky" in a sense. I feel
               | lucky that I have grasp of anything.
        
               | giva wrote:
                | I don't know what a simplicial set is and Wikipedia
                | didn't really help me. However, I could roughly describe
                | my "mind" as many mental maps where concepts are laid out
                | and connected in different ways. Learning means putting
                | new things on these maps, and thinking is navigating
                | through them.
        
               | karmakaze wrote:
                | Reminds me of the saying about a poet vs a mathematician:
                | the first gives different names to the same thing and the
                | latter the same name to different things. _Maybe that's
                | why I can't stand highly descriptive prose (aka
                | describing the water while I'm drowning over here)._
                | 
                | Now what if you're a poetic mathematician _(or
                | mathematical poet)_, what's that mind map look like?
        
               | LargoLasskhyfv wrote:
               | Well... what about that palace of mind thing, and the
               | ability to rewind into almost all older memories at will,
               | and on demand being able to look up things from there,
               | like reading, without having it memorized at all? Also
               | full stream of consciousness, like smells, tastes, light
               | wind on your skin, 'silken air' at just the right
               | temperature and humidity.
               | 
               | All of that arranged in something like 'eigengrau',
               | represented by glitterlike points connected by graphs,
                | mostly in 'phosphene' colors, but not exclusively so.
               | 
               | Sometimes very non-euclidean, moving/warping.
               | 
                | _KNOWING_ what's behind every glitter point, like a small
                | cinema or large home theatre, from several points of view
                | at the same time.
               | 
               | No words involved. Just visuals.
               | 
               | Thinking, like juggling/weighing blobs, like that glowing
               | stuff which moves slowly up and down in a lava-lamp.
               | 
               | Somehow 'knowing' what each blob, its size/form/viscosity
               | /weight/speed/color/brightness/'feel'/smell represents.
               | 
               | Slowly emerging new 'visuals' from this. Which are then
               | translated into 'language', if ever.
        
             | jiggawatts wrote:
             | https://mymodernmet.com/inner-monologue/
        
           | marmaduke wrote:
           | > curated output of the brainstormy process that immediately
           | precedes it
           | 
           | Daniel Dennett gives a nice albeit more detailed version of
           | your idea in his book Consciousness Explained, could be worth
           | a read
        
           | samstave wrote:
           | Mandelthought psyt.
        
           | melagonster wrote:
            | From a positive perspective, it surely shows that our
            | thinking/mind is not just language and is always faster than
            | sentence formation.
        
           | JoBrad wrote:
           | I had a similar experience when I was put under during
           | surgery a few years ago. Later I learned that they used
           | ketamine in their concoction.
        
           | allemagne wrote:
           | I occasionally reach a similar state near sleep where I will
           | be half-dreaming that I'm reading from a page of a book where
           | the words materialize/"come into focus" right before my eyes
           | into what is usually vaguely grammatically correct nonsense.
        
           | pictureofabear wrote:
           | This seems like it might upend Descartes' "cogito, ergo sum"
           | ("I think therefore I am") in that the process for forming
           | thoughts in a language is not indicative that we exist,
           | rather it merely indicates that we have evolved a brain that
           | can produce and interpret language.
           | 
           | Seems like we're dismantling a lot of what Descartes came up
           | with these days.
        
             | TriNetra wrote:
              | For that I came up with this (or got inspired by it from
              | somewhere): I'm aware, therefore I exist. Pure awareness,
              | devoid of all objects (thoughts/visualization), is me.
        
           | theaussiestew wrote:
           | I have this too. My cognitive processes are not related to my
           | thinking brain, which I define as the part of my mental
           | process which produces the sounds of words in my mind.
           | Instead, I've observed that first, my subconscious processes
           | concepts at a much more fine grained level, much like the
           | latent space of a machine learning model. Only substantially
           | after, let's say 10ms after, do thoughts arise, which are
           | just pointers to the already processed subconscious process.
           | A very rough analogy would be the inference of an LLM in
           | words, vs all the processing of embeddings that happens
           | internally.
        
         | tasty_freeze wrote:
          | People often say that LLMs aren't really thinking because they
          | are just producing a stream of words (tokens, really)
          | reflexively based on a window of previous text, either read in
          | or from their own response. That is true.
         | 
         | But I have the experience when talking of not knowing what I'm
         | going to say until I hear what I've said. Sometimes I do have
         | deliberative thought and planning, trialing phrases in my head
         | before uttering them, but apparently I'm mostly an LLM that is
         | just generating a stream of tokens.
        
           | Workaccount2 wrote:
           | This is something that is easily observable by anyone at
           | virtually any moment, yet at the same time is something that
           | escapes 99% of the population.
           | 
            | When you are talking to someone in normal conversation, you
            | and they are both taking in the words you are saying at the
            | same time.
        
         | iteygib wrote:
          | How does evolutionary instinct factor into the system model?
          | Fight-or-flight responses, reflexes, etc. 'Thinking' does have
          | consequences in terms of evolutionary survival in some
          | circumstances, as in spending too much time
          | deliberating/simulating.
        
         | kderbe wrote:
         | Andrej Karpathy makes this same point, using the same book
         | reference, in his "[1hr Talk] Intro to Large Language Models"
         | video from Nov. 2023.
         | 
         | Here is a link to the relevant part of his presentation:
         | https://youtu.be/zjkBMFhNj_g?t=2120
        
         | emmender2 wrote:
          | Thinking step-by-step requires 100% accuracy in each step. If
          | you are 95% accurate in each step, after the 10th step the
          | accuracy of the reasoning chain drops to 59%. This is the
          | fundamental problem with LLMs for reasoning.
          | 
          | Reasoning requires deterministic symbolic manipulation for
          | accuracy. Only then can it be composed into long chains.
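          The arithmetic behind that figure, assuming the per-step errors
          are independent:

            p_step = 0.95
            steps = 10
            p_chain = p_step ** steps   # chance every step is correct
            print(f"{p_chain:.2%}")     # ~59.87%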
        
           | hesdeadjim wrote:
           | I dream of a world where the majority of humans could come
           | close to 59% after attempting a ten step logical process.
        
           | throwuwu wrote:
           | You've never made a mistake in your reasoning?
           | 
            | Tongue in cheek, but this has been considered and has
            | resulted in experiments like tree-of-thought and various
            | check-your-work and testing approaches. Thinking step by step
            | is really just another way of saying "make a plan" or "use an
            | algorithm", and when humans do either, they need to
            | periodically re-evaluate what they've done so far and ensure
            | it's correct.
           | 
            | The trick is training the model to do this as a matter of
            | course and to learn which tool to apply at the right time,
            | which is what the paper is about wrt interspersed thoughts.
        
           | trenchgun wrote:
           | >reasoning requires deterministic symbolic manipulation for
           | accuracy
           | 
            | No, that is automation. Automated reasoning is a thing,
            | indeed. And I can kind of see a world where there is a system
            | which uses an LLM for creative thinking, augmented with
            | automated reasoning systems (think Datalog, egg, SMT solvers,
            | probabilistic model checking, etc.).
        
         | glial wrote:
          | I think of COT as a memory scratchpad. It gives the LLM some
          | limited append-only working memory that it can use for simple
          | computations (or associations, in its case). Now suppose an LLM
          | had re-writeable memory... I think every prompt hack, of which
          | COT is one example, is an opportunity for an architecture
          | improvement.
        
           | HarHarVeryFunny wrote:
           | I think of COT more as a type of planning or thinking before
           | you speak. If you just open your mouth and start talking,
           | which is what a plain LLM does, then you may talk yourself
           | into a corner with no good way to get out of it, or find
           | yourself saying something that really makes no sense. COT
           | effectively allows the LLM to see the potential continuations
           | of what it is considering saying, and pick one that makes
           | sense!
           | 
            | I think lack of COT or any ability to plan ahead is part of
            | why LLMs are prone to hallucinate - if you've already run
            | your mouth and said "the capital of Australia is", then it's
            | a bit late to realize you don't know what it is. The plain
            | LLM solution is to do what they always do and predict the
            | next word using whatever was in the training set, such as the
            | names of some Australian cities and maybe a notion that a
            | capital should be a large, important city. IOW it'll
            | hallucinate/bullshit a continuation word such as "Melbourne".
            | With COT it would potentially have the ability to realize
            | that "the capital of Australia is" is not a good way to start
            | a sentence when you don't know the answer, and instead say "i
            | don't know". Of course the other cause of hallucinations is
            | that the LLM might not even know what it doesn't know, so
            | might think that "Melbourne" is a great answer.
        
         | eightysixfour wrote:
          | This is a common comparison in the LLM world. I actually think
          | it is closer to the left/right brain differences described in
          | The Master and His Emissary, but that's for a blog post later.
        
         | bun_at_work wrote:
         | I have a similar view to you and not much to add to your
         | comment, other than to reference a couple books that you might
         | like if you enjoyed 'Thinking, Fast and Slow'.
         | 
          | 'The Righteous Mind' by Jonathan Haidt. Here, Haidt describes a
          | very similar 2-system model that he frames as the elephant-and-
          | rider model.
         | 
         | 'A Thousand Brains: A New Theory of Intelligence' by Jeff
         | Hawkins. Here Jeff describes his Thousand Brains theory, which
         | has commonality with the 2-system model described by Kahneman.
         | 
         | I think these theories of intelligence help pave the way for
         | future improvements on LLMs for sure, so just want to share.
        
         | thwarted wrote:
         | This sounds similar to the A Brain/B Brain concept that was
         | described by, I believe, Marvin Minsky. I don't know how this
         | might be related to Kahneman's work.
        
         | kouru225 wrote:
         | Feel like this is better represented as the default mode
         | network: https://en.m.wikipedia.org/wiki/Default_mode_network
         | 
         | There are questions we know the answers to and we just
         | reflexively spit them out, but then there are questions that
         | are new to us and we have to figure them out separately.
         | 
         | Recent research has shown that new memories are recorded in the
         | brain differently depending on how unique the memory is:
         | https://www.quantamagazine.org/the-usefulness-of-a-memory-gu...
        
       | adlpz wrote:
       | Any relation to OpenAI's rumored Q* (i.e. q-star) model? Authors
       | of this paper don't seem affiliated.
       | 
       | Just a name coincidence?
        
         | HarHarVeryFunny wrote:
          | I was thinking the same. The STaR paper this is an extension of
          | came out in 2022, so it's at least possible this is what q-star
          | is based on too, but maybe with Q standing for something else.
        
         | smusamashah wrote:
         | I think it's just a play on the same hyped up term.
        
       | anon291 wrote:
       | So it seems 'obvious' to me that a network about 50 layers deep
       | (for example) can only reason about symbolic questions for 50
       | 'steps' (in quotes because it's not a step as we think about it).
       | It only seems there's more complexity because it's 50 steps in
       | one or more learned subspaces that the model has been trained in
       | (which might mean the model can accomplish more than one 'human
       | step' in its 'step'). Humans (well intelligent humans at least)
       | seem able to obviously reason beyond those steps, but we all know
       | it requires real thinking and deliberation and perhaps a notepad
       | to be able to do that.
       | 
       | It's quite something to, for example, expect ChatGPT to be able
       | to correctly do 4 digit multiplications without any thought or
       | recourse to 'paper' when very few human beings can do that.
        
         | blackbear_ wrote:
         | This paper does indeed follow your intuition to investigate the
         | limits of transformers on compositional tasks (i.e., those that
         | require multi-step reasoning, including your multiplication
         | example): https://arxiv.org/abs/2305.18654
         | 
         | > Our empirical findings suggest that transformer LLMs solve
         | compositional tasks by reducing multi-step compositional
         | reasoning into linearized subgraph matching, without
         | necessarily developing systematic problem-solving skills. To
         | round off our empirical study, we provide theoretical arguments
         | on abstract multi-step reasoning problems that highlight how
         | autoregressive generations' performance can rapidly decay with
         | increased task complexity.
        
           | anon291 wrote:
           | Ah good... This is definitely a research path I've been
           | looking into. Great to see someone else has already gone
           | there!
        
           | visarga wrote:
           | Maybe the Skill Mix paper is relevant here. They define a
           | list of 100 skills, and then randomly sample tuples of n
           | skills (usually less than 6) and generate a test example
           | using those skills. Apparently only GPT-4 (at the time of the
           | paper) was able to compose 5 skills, the other models just 3
           | or 2. Beyond 5 skills even GPT-4 was doing much worse.
           | 
            | The interesting finding of the paper is that GPT-4 couldn't
            | have seen all the (topic, skill-tuple) combinations in the
            | training set. If you have 10,000 examples on a topic, and
            | skills are drawn 5 at a time from a pool of 100, there are
            | on the order of 100 choose 5 (roughly 75 million) possible
            | skill combinations - far too many to have all been covered.
            | In conclusion GPT-4 generalizes to new skill combinations,
            | thus it is not a stochastic parrot.
           | 
           | https://arxiv.org/abs/2310.17567
        
         | radarsat1 wrote:
         | This is true but you have to also consider the autoregressive
         | component. In your example, it's 50 steps _per iteration of the
          | model_, where the model is executed once for each token in the
         | output.
         | 
         | So practically speaking it's a bit more complicated to
         | calculate how much the model can "think". Of course once a
         | token is output it is committed to that (in the most basic
         | scenario), but that doesn't mean it is not still "thinking" as
         | it produces subsequent tokens.
         | 
         | > perhaps a notepad
         | 
         | Exactly, the context and previously output tokens can be
         | considered such a notepad since they are input for the next
         | steps of the model.
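          | 
          | To make that concrete, here's a toy decode loop (the model and
          | greedy sampler are stand-ins, not any particular library)
          | showing how each committed token is appended to the context
          | and fed back in, so the effective budget is depth times
          | tokens:
          | 
          |   def sample_greedy(logits):
          |       # pick the highest-scoring token id
          |       return max(range(len(logits)), key=logits.__getitem__)
          | 
          |   def generate(model, prompt_tokens, max_new_tokens=64, eos_id=0):
          |       context = list(prompt_tokens)    # the growing "notepad"
          |       for _ in range(max_new_tokens):
          |           logits = model(context)      # one pass through all layers
          |           tok = sample_greedy(logits)  # commit to a token...
          |           context.append(tok)          # ...which feeds the next pass
          |           if tok == eos_id:
          |               break
          |       return context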
        
           | Closi wrote:
            | Agreed - prompt engineering encourages LLMs to do this too
            | (i.e. asking the LLM to explain the steps it will take to
            | solve a problem, prior to answering - e.g. Zero-Shot CoT's
            | 'Let's think step by step').
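            | 
            | A minimal sketch of that zero-shot CoT pattern (the
            | `complete` function is just a stand-in for whatever text-
            | completion API is being used; the trigger and answer-
            | extraction phrasing roughly follow the Kojima et al.
            | zero-shot CoT recipe):
            | 
            |   COT_TRIGGER = "Let's think step by step."
            | 
            |   def ask_direct(complete, question):
            |       return complete(f"Q: {question}\nA:")
            | 
            |   def ask_with_cot(complete, question):
            |       # first pass: elicit step-by-step reasoning
            |       reasoning = complete(f"Q: {question}\nA: {COT_TRIGGER}")
            |       # second pass: pull a short final answer out of it
            |       prompt = (f"Q: {question}\nA: {COT_TRIGGER} {reasoning}\n"
            |                 "Therefore, the answer is")
            |       return complete(prompt)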
        
           | anon291 wrote:
            | So part of my general issue with this kind of thinking is
            | that, if we take this as the main means of creating
            | complexity, then shorter prompts are worse for reasoning than
            | longer ones, because longer ones automatically give the model
            | more 'space' to think. Now, I realize that the research
            | community knows this, but I like papers like this one that
            | explicitly seek ways to enable the model to 'breathe' a bit.
        
         | visarga wrote:
         | You are missing an important detail here - number of tokens -
         | yes, you have 50 "steps" in network depth, but you could have
         | extra tokens. Assuming you don't run out of tape, there is no
         | reason for LLMs to be limited to simple operations.
        
       | FeepingCreature wrote:
       | Here we go!! I've been waiting years for them to try this. Let's
       | see how it does when scaled up to GPT-3/4 level.
       | 
       | This might be the missing piece to AGI.
        
         | parthianshotgun wrote:
         | The missing piece is unknowable
        
           | Cthulhu_ wrote:
           | We'll likely reconstruct what the missing piece was in
           | hindsight, but it's very probable there's no one missing
           | piece. Just like human evolution.
        
           | digging wrote:
           | Until it's been found, you mean?
        
             | sroussey wrote:
             | Maybe even then too!
        
           | arendtio wrote:
           | I am not convinced there even is a missing piece. I mean,
           | LLMs are being _used_ very differently compared to how
           | traditional AI programs were written. Combining both worlds
           | might be all that is needed.
           | 
            | I would not be surprised if, when we have general artificial
            | intelligences, we see that advancing LLMs wasn't necessary.
        
       | 082349872349872 wrote:
        | Edsger Dijkstra had a precise English style; even though his
        | mother tongue was Dutch, I find he made better use of English
        | than many native speakers.
        | 
        | In one of the EWDs, he reminisced that, as children, they were
       | taught to never begin to speak a sentence unless they already
       | knew how they were going to finish it.
       | 
       | I'd bet these two observations have a causal connection.
        
         | ricardobeat wrote:
         | Is that even possible, or just hyperbole? I'd bet the latter. I
         | wouldn't be surprised if some people are able to fully unravel
         | entire paragraphs of conversation in their head in a couple of
         | seconds, but that's not something you could teach to children
         | in general.
        
           | mannykannot wrote:
           | I don't think it is feasible, at least for conversation, but
           | as an aspirational goal for children, along the lines of "put
           | your toys away when you've finished playing with them", it is
           | not a bad one.
           | 
           | It's not unusual for me to think I know how I am going to end
           | a sentence, but then find that I can't get there.
        
           | h34t wrote:
           | in Dutch (and German) the verb often goes at the end of a
           | sentence, so the advice is rather practical.
        
             | ricardobeat wrote:
                | Dat week ik heel goed :( (Dutch for "I know that all too
                | well")
        
               | ricardobeat wrote:
               | *weet, thanks autocarrot
        
           | ted_bunny wrote:
           | German children would with you disagree.
        
         | caddy wrote:
         | I also wonder if it has anything to do with the process of
         | learning a new language in general. I've thought more
         | thoroughly about how English works since I've been learning
         | French (not that I'm very eloquent in either)
        
         | Cthulhu_ wrote:
         | I've observed two things. One, writing is different to
         | speaking, because it's async, you can think before you write,
         | you can edit, etc.
         | 
          | But second, speaking in a non-native language makes you think
          | harder about what you're about to say. Fewer colloquialisms,
          | more focus on making sure your meaning is understood, more
          | sensitivity in case you might offend someone, perhaps?
         | 
          | It's not new either; a lot of science and whatnot has been done
          | in people's non-native language, like French, German, Latin,
          | etc. Another factor there is the lingo of the field; I can't
          | simply say "Kubernetes is een open-bron houder
          | orkestratiesysteem voor het automatiseren van de inzet,
          | schalen, en het beheer van zachte waren" (a word-for-word Dutch
          | rendering of "Kubernetes is an open-source container
          | orchestration system for automating the deployment, scaling,
          | and management of software") without confusing half my native-
          | speaking audience.
        
         | torginus wrote:
          | I also learned English from textbooks, and one of the strangest
          | things I encountered was that native speakers routinely confuse
          | "their, there, they're", which I never thought was a mistake I
          | could make. It would be like confusing 'wet' and 'vet'. So
          | there's definitely a difference in how native and non-native
          | speakers use the language.
        
           | qup wrote:
           | The people who confuse that mostly have not done very much
           | reading. Audibly, those words are identical.
        
           | leobg wrote:
           | Even crazier:
           | 
           | "Could of".
           | 
           | Like "You could of said so".
        
         | zoogeny wrote:
         | When I was a young man I was taking a language course while I
         | was temporarily living in a foreign country. There was an older
         | man in the course (not elderly, more like mid-fifties) who was
         | very bad at the new language we were both learning. Yet I
         | noticed he had, what seemed to me, a magic power: he could
         | always make people laugh. He would often whisper something to
         | one of our classmates and they would always get a giant smile
         | on their face or even laugh out loud.
         | 
         | I was intensely curious and I spent some time wondering how he
         | did it. One day, out of the blue, he invited me out to lunch
         | after class. We just chatted for most of the lunch, exchanging
         | backgrounds and stories. Then his face took on a serious
         | expression and he slowly and carefully began to explain
         | something to me as if he was passing on some wisdom.
         | 
         | He said that he never spoke a single sentence without fully
         | saying the sentence in his mind. He said he would often think
         | of the words several times in his mind, revising the phrase
         | until he was happy. He would imagine saying the words to the
         | person in front of him and he would imagine their reaction. And
         | he would continue to revise until he felt confident the person
         | who heard the words he would say would react in the way he
         | wanted them to react. If he could not imagine the person
         | reacting how he wanted them to react, he would not say anything
         | at all.
         | 
         | It was clear to me that he was passing along this advice but
         | also that he was calling me out a bit. He was letting me know
          | that I spoke without thinking. I say what pops into my head. It
          | was like he read my mind, honestly: he knew exactly what I was
          | curious about and he answered the question I had for him that I
          | never asked.
         | 
         | I wish I could say that I learned the lesson. When I have tried
         | the technique it has rewarded the effort. But I haven't formed
         | it into a habit and I still tend to let my mouth race ahead of
         | my mind.
        
         | wara23arish wrote:
          | I love reading his EWDs. I had a professor who worked with him
          | who mentioned he made his students use pens while taking his
          | tests. To make it less likely for the students to make
          | mistakes??
        
           | westurner wrote:
            | Perhaps to make it easier to determine how to correct
            | instruction.
           | 
           | - "Guidelines for keeping a laboratory notebook" (2019)
           | https://news.ycombinator.com/item?id=19123430#19126809
        
           | float4 wrote:
            | > he made his students use pens while taking his tests
           | 
           | This is very common in the Netherlands, I think that's why it
           | was a rule of his.
           | 
            | In general, the Dutch education system seems to be against
            | pencils (at least this was the case until recently; I'm Dutch
            | and in my mid 20s). You're taught to write using a fountain
            | pen, not a pencil. In high school, you're allowed to switch
            | to ballpoint but absolutely not to pencil. In university, you
            | can write with pretty much anything you want, but... not with
            | a pencil.
           | If you do take your test with a pencil, there's genuinely a
           | chance your teacher will give you a 0, although most of the
           | time they'll probably be forgiving.
           | 
           | I majored in CS in the Netherlands and every test was done
           | with good old pen and paper. Students still make mistakes all
           | the time, which is why everyone uses a scrap sheet.
        
       | QuantumG wrote:
       | We're done for!
        
       | pawnty wrote:
        | This is the missing piece for training AI that has the ability
        | to reason. There are so many tasks whose answers are known but
        | whose reasoning steps are missing. With this method, we can use
        | less annotated data to reach that ability.
        | 
        | The interesting part (I imagine): the generated thoughts could be
        | hard for humans to understand while still being far more helpful
        | for getting the correct answer! If that happens, we have created
        | something more intelligent than ourselves.
        
       | silent_cal wrote:
       | Neural networks do not think
        
         | adlpz wrote:
         | Do neurons think? Do a bunch of neurons?
         | 
         | Is this semantics?
        
           | empath-nirvana wrote:
           | basically this: https://en.wikipedia.org/wiki/Sorites_paradox
           | 
           | One neuron doesn't think. Three neurons don't think. Billions
           | of neurons think. Somewhere between one neuron and billions
           | of neurons, thinking starts happening. Probably also true for
           | neural networks. The main problem is that people throw around
           | terms like: "Thought", "Intelligence", "Will", "Reasoning",
           | "Knowledge", "Consciousness", etc like they are very well
           | defined and well understood terms and they very much are not.
        
             | silent_cal wrote:
             | Billions of neurons don't think, people do.
        
               | empath-nirvana wrote:
               | ...with what?
        
               | silent_cal wrote:
               | With their minds
        
             | adlpz wrote:
             | My point precisely. Those are all vague terms. Saying that
             | "neural nerworks do not think" is as meaningless as any
             | equivalent (or opposite) statement on any other system
             | including any number of neurons, a whole brain or a
             | _person_.
             | 
             | It's all semantics.
        
           | silent_cal wrote:
            | There are no real neurons in a neural network.
        
         | PoignardAzur wrote:
         | You're not giving information anybody on this forum doesn't
         | already know.
         | 
         | Obviously they don't "speak" either. Both "think" and "speak"
         | are used as shorthands here for what the language models
         | actually do.
        
           | silent_cal wrote:
           | What are you upset with me for? The authors are using the
           | misleading language, not me. Take it up with them.
        
         | optimalsolver wrote:
         | Could you give a definition of "think" that NNs fail to live up
         | to?
        
           | silent_cal wrote:
           | Abstracting immaterial concepts from physical reality and
           | deliberately using them in analytical or deductive processes
           | to discover truths.
        
             | optimalsolver wrote:
             | So basically finding ways to compress your observational
             | history?
        
               | silent_cal wrote:
               | No, it's not "basically" that at all.
        
               | optimalsolver wrote:
               | That's pretty much what it is, as you stated it. Finding
               | abstractions that let you encode your observational
               | history more efficiently than you previously could, or
               | "discovering truths", if you want to be all mystical
               | about it.
        
             | ogogmad wrote:
             | Might be relevant:
             | https://www.nature.com/articles/s41586-023-06924-6
             | _Mathematical discoveries from program search with large
             | language models_
        
         | 4RealFreedom wrote:
         | I don't understand the downvotes - you are correct.
        
           | silent_cal wrote:
           | I think people just get mad when they're reminded of this
           | obvious fact. They want computers to prove that our minds are
           | an illusion, the product of a "meat computer".
        
             | stoniejohnson wrote:
             | Read some Daniel Dennett!
        
               | silent_cal wrote:
               | Are you serious?
        
               | stoniejohnson wrote:
               | You're very grumpy I think you need some food and a nap
               | :-)
        
           | viraptor wrote:
           | It's fair to talk about thinking in a handwavey "you know
           | what I mean" way. This is not a philosophy paper. It's a fine
           | point if that's what you want to discuss, but doesn't change
           | anything about the issue at hand and is needlessly pedantic.
           | It's the "what you're referring to is actually GNU/Linux" of
           | AI discussions.
        
       | YetAnotherNick wrote:
        | Another RL paper with a terrible baseline. They used zero-shot,
        | non-instruction-tuned Mistral for GSM8k, which has a very
        | specific output format. They got 11% accuracy after improving
        | it, while few-shot prompting achieves 37%[1]. GPT-4 could get
        | ~97% with prompting.
       | 
       | [1]:
       | https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderb...
        
         | hiddencost wrote:
         | Fwiw if they're serious scientists, taking a known method and
         | baseline and improving it is good science. Extensions to get
         | state of the art are probably possible, but their goal is to
         | measure just the impact of their change in a simple setting.
         | Let the engineers do the munged system combinations and get
         | SoTA.
        
           | YetAnotherNick wrote:
            | I am not talking about SoTA. I am talking about a
            | deliberately poor baseline. GSM8k consists of two things:
            | solving the problem and getting the output format correct.
            | Getting the output format correct gives 30% accuracy for the
            | same model where they got 11%. SoTA is 97%.
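            | 
            | For reference, the format issue looks roughly like this
            | (illustrative sketch only - GSM8k reference solutions end in
            | "#### <number>", and harnesses typically just compare the
            | final number they can parse out of the completion; the exact
            | prompt any given leaderboard uses may differ):
            | 
            |   import re
            | 
            |   FEWSHOT = [
            |       ("Tom has 3 apples and buys 2 more. How many does he have?",
            |        "Tom starts with 3 and buys 2, so 3 + 2 = 5.\n#### 5"),
            |   ]
            | 
            |   def build_prompt(question, shots=FEWSHOT):
            |       parts = [f"Question: {q}\nAnswer: {a}\n" for q, a in shots]
            |       parts.append(f"Question: {question}\nAnswer:")
            |       return "\n".join(parts)
            | 
            |   def extract_answer(completion):
            |       # grade on the last number found in the completion
            |       nums = re.findall(r"-?\d+\.?\d*", completion.replace(",", ""))
            |       return nums[-1] if nums else None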
        
       | lionkor wrote:
        | This is purely anecdotal, and I try to keep it to myself, but
        | it's very difficult when at least half of the HN homepage is AI
        | related: LLMs like ChatGPT do so utterly terribly at any non-
        | trivial job I throw at them that I seriously consider people who
        | use them daily to either be straight up incompetent, or maybe
        | their domain is so trivial that the LLM actually does well.
        | 
        | From asking LLMs to solve a highly difficult async C++
        | parallelism problem, to German language specifics, they just fuck
        | up at a fundamental level. I understand that LLMs cannot solve
        | these issues and why, but then I do not understand the heavy
        | focus on AI by so many tech people.
       | 
        | Is the day-to-day programming job so trivial that LLMs do a good
        | job, while at the same time being too difficult for you to do it
        | yourself? I really, really want to know exactly what the use case
        | is.
        | 
        | Do people just throw simple problems at it to validate their own
        | preconceived notion of how cool and useful LLMs are? What's the
        | deal?
        
         | rplnt wrote:
         | Every other query I've given to ChatGPT came up with an utterly
         | wrong answer. Followup always yielded "sorry, I made an obvious
         | mistake, here's another wrong answer". Confident and stupid is
         | a very bad combination.
        
         | jollyllama wrote:
         | There are plenty of jobs where people have to complete various
         | tasks that are outside of their domain or otherwise tedious on
         | a daily basis. For example, plenty of devs have to set up or
         | change the configuration of remote hosts. Some LLMs are pretty
         | good at generating devops scripts to speed up this work.
        
           | orzig wrote:
           | Exactly. Example: maybe 1% of the code I generate is bash. I
           | used to try to memorize patterns, but of the top 20 I'd use
           | each less than once per year. Now, instead of that 1% taking
           | 5% of my time, it takes 2%. It's all "simple stuff", and I
           | can verify it instantly.
           | 
           | I have ~10 similar use cases. So it hasn't revolutionized my
           | life, but it's been well worth $20/mo ChatGPT Plus and $3/mo
           | API calls.
        
         | williamcotton wrote:
         | Boilerplate, test code, and general tedium. Most software just
         | needs to handle IO.
         | 
         | The next time you want to use SQL to compute a rolling sum try
         | asking ChatGPT 4 instead of searching through documentation or
         | search engine results for windowing functions.
         | 
          | With competency at programming, very good technical
          | communication skills, and a touch of learning how not to hold
          | the tool backwards, you should find the appeal.
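          | 
          | For instance, the sort of answer you'd hope to get back for the
          | rolling sum (a sketch in Python/SQLite; window functions need
          | SQLite >= 3.25, and the table and column names here are made
          | up):
          | 
          |   import sqlite3
          | 
          |   con = sqlite3.connect(":memory:")
          |   con.execute("CREATE TABLE sales (day TEXT, amount REAL)")
          |   con.executemany("INSERT INTO sales VALUES (?, ?)",
          |                   [(f"2024-03-{d:02d}", d * 10.0) for d in range(1, 15)])
          | 
          |   rows = con.execute("""
          |       SELECT day, amount,
          |              SUM(amount) OVER (
          |                  ORDER BY day
          |                  ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
          |              ) AS rolling_7day_sum
          |       FROM sales
          |       ORDER BY day
          |   """).fetchall()
          | 
          |   for row in rows:
          |       print(row)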
        
           | slices wrote:
           | yes. just used Cody to get me on the right path with an
           | obscure postgresql JSON query, it easily saved me an hour of
           | fiddling around.
        
         | luma wrote:
         | This is an observation I've seen a lot around here. Underneath
         | it is the assumption that "if I can't figure out how to get
         | meaningful use out of a tool, the tool must be useless".
         | 
         | OpenAI didn't sign up 100M users without somebody somewhere
          | finding it to be useful. Like any other tool, its utility is
          | limited mostly by the person wielding it.
        
           | bluGill wrote:
            | The tools seem useful, but I'm not sure they are. Too often
            | they will confidently make up an answer that is wrong. When I
            | use them they do great on trivial problems but can't help on
            | hard ones.
        
             | luma wrote:
             | Reframe your thinking. You're approaching it like other
             | computer systems, where a given input yields a determined
              | output. Instead, treat it like a junior dev on whom you
              | can unload an unlimited amount of work, but the result still
             | requires review.
             | 
             | We're all used to working this way in human systems, people
             | that sound confident might also be wrong, and you learn
             | where you might trust them more or less as you work with
             | them over time. Until you are confident that they are
             | always "right" in a given problem domain, you need to apply
             | some level of review.
             | 
             | Finally, keep in mind that there are "smarter" and "dumber"
             | LLMs. If you didn't pay for what you were doing, you were
             | talking to a "dumber" model. The quality does go up if you
             | have $20 in your pocket.
        
               | bluGill wrote:
                | The junior engineers I know tend to ask questions, not be
               | confidently wrong. That isn't to say they are always
               | right but they make a very different class of errors.
        
               | luma wrote:
               | Again, this is a tool you can use. You can complain that
               | it doesn't work in the way you expect, or you can learn
               | how it operates and how best to use it. If you can't
               | figure out how to apply it to your work, that's fine, but
               | loads of other people are doing exactly that with or
               | without you.
        
             | EForEndeavour wrote:
             | > When I use them they do great on trivial problems but
             | can't help on hard ones.
             | 
              | That sounds _super_ useful! The tools free you up from
              | wasting time on trivial problems so you have more time to
              | focus on the hard ones. What's not to love?
        
               | bluGill wrote:
                | I try to work on complex problems. Sometimes they hide
                | something easy.
        
         | Tadpole9181 wrote:
          | They're good autocomplete, they can help search for solutions
          | sometimes better than Google (SEO spam), you can use them as a
          | rubber duck, and you can make them auto-fill trivial stuff that
          | would take you a few minutes to write out manually, like test
          | scaffolding. I would _never_ use one to actually complete a
          | non-trivial task and I _always_ confirm its answers. And yeah,
          | sometimes it sucks - it's a tool with a learning curve, and
          | part of that is knowing its limitations.
         | 
          | The reason there's so much money and time is that even semi-
          | competent AI is relatively new and the methods are still
          | extremely crude, and yet it's this advanced. This seems like
          | the path to an AGI, and if someone were to even approach that
          | point, it would radically change the world forever and could
          | lead to either really good things or really bad things.
          | 
          | Now, GPT-4 isn't considered the best at specialized tasks. It's
          | a master of many, but there are _much_ smaller models that can
          | do things like incredibly complex symbolic/geometric math
          | proofs, write code, perform translations, etc. better. A lot of
          | ideas are about making expert systems using many of those
          | specialists combined with a generalist, like the segmentation
          | of a brain.
         | 
         | Anyway:
         | 
         | > I seriously consider people who use it daily to either be
         | straight up incompetent, or maybe their domain is so trivial
         | that the LLM actually does well.
         | 
          | This kind of radical thinking about a significant proportion of
          | enthused professionals (in any industry) who aren't having the
          | same experience as you is a red flag for introspection. It's so
          | easy to fall into the "enlightened me" trap.
         | 
         | I appreciate you asking for more information!
        
         | nathas wrote:
         | I had a similar take until about a week ago. A friend showed me
          | his workflow with Copilot and whatever the JetBrains AI
          | assistant is called.
         | 
          | Use it as a tool: what if, instead of opening up a new tab,
          | searching for the API docs for the library you're trying to
          | find a function in, finding the function, re-reading the
          | parameter arguments for the 400th time, and then using it, you
          | could just highlight a snippet and say "Paginate the results
          | from S3 using boto3" and the code would just populate?
         | 
         | You have to have the clarity of thought to know what you're
         | doing, but the time it takes to write every line for basic
         | stuff you've done 1000x before can be greatly compressed if
         | it's inlined with your IDE.
         | 
         | I think this is the move for most LLM tools: integrate it with
         | existing tooling. An LLM for Excel for corporate bookkeepers,
         | CPAs, etc will be great. A Word/PDF summarizer that's tuned for
         | attorneys will also be fantastic. Highlight a paragraph, ask
         | for relevant case law, etc.
         | 
         | I thought ~2 years ago the results were... not great. Now I'm
         | pretty happy with it.
         | 
         | SecureFrame (helps with compliance regimes like SOC2) recently
         | added the ability to generate Terraform templates to
         | automatically generate infrastructure that will fix specific
         | platform risks for AWS, Azure, GCP, etc.
         | 
         | It definitely needs someone at the helm since it does
         | hallucinate, but I have found it to cut down my time on mundane
         | tasks or otherwise niche/annoying problems. When was the last
         | time you visited 4+ StackOverflow posts to find your answer?
         | Copilot, so far, has always hit a pretty close answer very
         | quickly.
        
           | samstave wrote:
            | Sorry if this is sophomoric, but when you said "you have to
            | have clarity of thought" - what jumped to mind was the phrase
            | "you have to speak to the code"... I thought it encapsulated
            | your point about clarity of thought quite saliently for me.
        
             | throwup238 wrote:
             | You must be one with the code. You must _be the code_.
        
           | dkjaudyeqooe wrote:
            | I don't know exactly how you use it, but this isn't my
            | experience at all. If you ask an LLM anything too specific,
            | that isn't obvious and a common issue/discussion (something
            | that I almost never need to do), it just makes up nonsense to
            | fill the space.
           | 
           | Equally, if you ask it general questions it misses
           | information and is almost always incomplete, leaving out
           | slightly more obscure elements. Again, I need comprehensive
           | answers, I can come up with incomplete ones myself.
           | 
           | What's really obvious to me when I use it is that it's a LLM
           | trained on pre-existing text, that really comes through in
           | the character of its answers and its errors.
           | 
            | I'm very glad others find them useful and productive, but
            | for me they're disappointing given how I want to use them.
        
             | orzig wrote:
             | That's fair, it might not be for you. In 'old school ML',
             | for a binary classifier, there's the concept of Precision
             | (% of Predicted Positive that's ACTUALLY Positive) and
             | Recall (% of ACTUALLY Positive that's Predicted to be
             | Positive).
             | 
             | It sounds like you want perfect Precision (no errors on
             | specific Qs) and perfect Recall (comprehensive on general
             | Qs). You're right that no model of any type has ever
             | achieved that on any large real-world data, so if that's
             | truly the threshold for useful in your use cases, they
             | won't make sense.
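              | 
              | In code form, just to pin the two definitions down (toy
              | counts, nothing more):
              | 
              |   tp, fp, fn = 80, 20, 40   # true/false positives, false negatives
              | 
              |   precision = tp / (tp + fp)  # of predicted positives, share correct
              |   recall = tp / (tp + fn)     # of actual positives, share found
              | 
              |   print(f"precision={precision:.2f}, recall={recall:.2f}")
              |   # precision=0.80, recall=0.67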
        
               | dkjaudyeqooe wrote:
               | I just want something useful. I'm not talking perfection,
               | I'm talking about answers which are not fit for purpose.
               | 80% of the time the answers are just not useful.
               | 
               | How are you supposed to use LLMs if the answers they give
               | are not salvageable with less work than answering the
               | question yourself using search?
               | 
                | Again, for some people it might be fine, but for
                | technical work, LLMs don't seem to cut it.
        
           | orzig wrote:
           | I also had to build intuition for when it will be appropriate
           | versus not. It's hard to describe but one very positive
           | signal is certainly "will any hallucination be caught in
           | <30s"? Even in ChatGPT Plus you can have it write its own
           | unit tests and run them in the original prompt (even in the
           | profile's Custom Instructions so you don't have to type it
           | all the time).
           | 
            | So a mistake was using it for something where runtime
            | performance on dozens of quirky data files was critical; that
            | nearly set my CPU on fire. But str>str data cleanup, a chain
            | of simple API calls, or a one-off data visualization? _chef
            | kiss_
        
           | jmull wrote:
           | > to write every line for basic stuff you've done 1000x
           | before
           | 
           | There are ways to avoid writing basic stuff you've done 1000x
           | before that are better than LLMs though...
           | 
           | Put it in a well-thought-out function or package or other
           | form of shared/reusable code. You can validate it, spend the
           | time to make sure it covers your edge cases, optimize it,
           | test it, etc. so that when you go to reuse it you can have
           | confidence it will reliably do what you need it to do. LLM-
           | generated code doesn't have that.
           | 
           | (When you think about how LLMs are trained and work, you
           | realize they are actually just another form of code reuse,
           | but one where there are various transformations to the
           | original code that may or may not be correct.)
           | 
           | Where LLMs shine for coding is in code-completion. You get
           | the LLM output in little chunks that you can immediately
           | review correctly and completely, in the moment: "yeah that's
           | what I want" or "no, that's no good" or "ok, I can work with
           | that". Not surprising, since predicting completion is what
           | LLMs actually do.
        
         | kthartic wrote:
         | Some questions we've thrown at GPT-4 recently (real use cases):
         | 
         | > how does torchmetrics IOU work? Does it match gt with
         | detection boxes? or does it do pairwise IOU and average?
         | 
         | > What predictions has Ray Kurzweil made that he got correct
         | and incorrect? Please produce a table
         | 
         | > can you give me a stack implementation with min function in
         | O(1) time
         | 
         | > (A question about how we should solve a UX problem specific
         | to our app)
         | 
         | > What is the best way to return raw image data via a REST
         | endpoint?
         | 
         | > How is Return on Capital Employed (ROCE) calculated?
         | 
         | > Following the email exchange below, write a cross intro email
         | to introduce (X person) to (Y person)
         | 
         | > How do I run this code on TPU in Collab?
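          | 
          | The O(1)-min stack question above is the kind of self-contained
          | task these models tend to handle well; a typical answer looks
          | something like this sketch:
          | 
          |   class MinStack:
          |       def __init__(self):
          |           self._items = []
          |           self._mins = []   # parallel stack of running minimums
          | 
          |       def push(self, x):
          |           self._items.append(x)
          |           self._mins.append(x if not self._mins
          |                             else min(x, self._mins[-1]))
          | 
          |       def pop(self):
          |           self._mins.pop()
          |           return self._items.pop()
          | 
          |       def min(self):
          |           return self._mins[-1]   # O(1): current minimum on top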
        
           | samstave wrote:
           | RE: Ray Kurzweil
           | 
           | Did you see him on JRE last week:
           | 
           | https://www.youtube.com/watch?v=w4vrOUau2iY
           | 
           | (or was that why you asked)
        
           | jrmg wrote:
           | Did it correctly answer all of these?
        
         | keiferski wrote:
         | You should treat LLMs the same way you treat any other smart
         | entity, human or otherwise: realize that they can be both
         | immensely _useful_ and fundamentally _wrong_ at the same time.
         | Intelligence is not equivalent to correctness.
        
         | zmgsabst wrote:
         | Three examples:
         | 
         | 1. having ChatGPT generate boilerplate, because I'm lazy;
         | 
         | 2. having ChatGPT attempt something I don't know as a starting
         | point, eg JavaScript; or,
         | 
         | 3. having ChatGPT give a reference rather than Google myself,
         | eg of a config option.
         | 
         | ChatGPT makes 1 less tedious, 3 less a game of "what magic
         | phrase finds the right SO post?", and means I do 2 at all, eg
         | trying out JS features on my blog.
         | 
         | I think it does alright at composition if you break down the
         | task sufficiently, but it struggles with higher order structure
         | -- particularly if you're using multiple responses.
         | 
         | That said, I suspect we need a theory shift to get AI to
         | comprehend higher order structure in composition.
        
           | empath-nirvana wrote:
           | It's pretty amazing at generating rust structs from yaml
           | examples, and also at writing generic versions of rust
           | functions.
           | 
           | Neither of those tasks are especially _difficult_, but they
           | are _annoying_.
        
         | slifin wrote:
         | Not everything in tech is difficult
         | 
         | I find LLMs great for creating SQL queries and regexes
        
         | readyman wrote:
         | Profit. The question at hand is whether LLMs can produce
         | profit, which is an extremely different question than the
         | questions you're asking.
        
         | leothecool wrote:
         | I train my LLM to barf up my domain specific boilerplate code.
         | I don't ask it to solve business problems.
        
         | BenFranklin100 wrote:
          | I signed up for OpenAI's monthly subscription. Its performance
         | on non-trivial tasks is abysmal. It's a regurgitation machine.
         | One might mischievously argue the average tech worker isn't
         | much better than an LLM, thus the interest? On a related note,
         | we are deluged daily with firms offering AI services. I see a
         | bubble.
        
         | Havoc wrote:
         | For me it's more like brainstorming.
         | 
         | Even if half of it is garbage it's a net win. At least in
         | domains where I can distinguish the two.
         | 
          | There are also cases where the cost of failure is very low.
          | E.g. I could spend half an hour reading an API spec, or I could
          | make an AI give me a curl command and test it out in 30
          | seconds. If it works, great; if not, oh well, time to read the
          | spec.
        
         | dmos62 wrote:
         | Why do you presume that people commonly use it for non-trivial
         | things? It excels at trivial things. That's what most people
         | use it for, probably. Like google search. Is there something
         | that leads you to think otherwise?
        
           | xanderlewis wrote:
           | Perhaps the incessant talk of GPT-x being AGI, whatever that
           | means.
        
         | samstave wrote:
          | I think it's safe to remind oneself that this thing is
          | literally a zygote. So patience, and in ~5 years, it will be a
          | different story.
          | 
          | @xanderlewis
          | 
          | Doesn't that mean it is simply now consuming the internet in
          | real-time?
        
           | xanderlewis wrote:
           | Why? It's already eaten all of the publicly available data on
           | the web.
        
         | reportgunner wrote:
         | First sentence is 100% my sentiment, cheers !
        
         | visarga wrote:
         | Your view on LLM usage is too narrow. Yes, they are pretty shit
         | for me too in solving coding problems, but they are still
         | useful for bespoke information extraction, classification and
         | creative applications. The interest is justified, we're just
         | having a hard time understanding the limitations.
        
         | atoav wrote:
         | Technology is complex and hard to make sense of. That is why
         | most non-experts have a strong wish for a kind of mythical
         | technology, which you can just pour onto your problem and it
         | magically knows what you wanted (and which things you did not
         | want).
         | 
         | For a certain class of problems LLMs achieved new, never before
         | seen, almost magical results. Now imagine you were someone who
         | hates dealing with the constant complexity of solving problems
         | with technology and something comes along that seems to carry
         | promise of lifting that off your shoulders. Then you know why
          | people react like they do. Recall the blockchain craze? There
          | were people who declared that it somehow magically solved any
          | IT-security problem there ever was - instead of seeing it as a
          | good solution for a very specific set of circumstances that
          | nearly nobody faced in practice.
         | 
          | In reality, of course, LLMs also have limitations, e.g. the
          | above-mentioned ambiguity that is inherent to any magical
          | technology: to be _true_ magic the technology would have to be
          | able to read the thoughts of those who apply it and somehow
          | infer from that the true thing they want or need. LLMs are in
          | the end still just very good guessers based on statistical
          | data; that means the guess could be just what you want, but
          | they lack an actual understanding of what they are doing.
          | 
          | Those applying the technology to things it is actually good at
          | (e.g. classification problems etc.) will put it to good use,
          | but there will be a lot who will apply it and have things fall
          | apart Air Canada style.
        
         | epanchin wrote:
         | I daily drive KDB/Q. This is readily extendable for example in
         | C, which was my previous daily, and Python which I use
         | sporadically.
         | 
         | I don't use LLMs for C or KDB, I do use them for Python.
         | 
          | ChatGPT is good at Python. I guess because Python programmers
          | rely on Stack Exchange there is lots to learn from, and Python
          | anyway is largely an exercise in finding the correct library.
          | 
          | If the only thing ChatGPT did was listen to my problem and
          | suggest which imports to use/manuals to read, that would be
          | good enough to use regularly. If I wasn't after a library/pre-
          | existing code I wouldn't be using Python!
        
           | BoxOfRain wrote:
           | I've definitely noticed ChatGPT generally writes better
           | Python than it writes Scala, presumably for the same reason
           | of there being a fair bit more Python code in the wild.
        
             | mrguyorama wrote:
             | The actual reason probably has to do with the fact that LLM
             | developers and academics are more familiar with Python than
             | other programming languages, and therefore have policed
              | its correctness better.
        
         | sebzim4500 wrote:
          | Stop using it for things that are in your area of expertise but
          | are too difficult for you. Use it for things where you think
          | "this is probably easy but I have no idea how to do it". For
         | example, I needed to do some pretty trivial task in powershell
         | but I have never used it so I got chatGPT to do it for me and
         | it worked first time. Obviously I checked the commands looked
         | plausible before I ran them, but it still probably took 2 mins
         | to do something that would have otherwise taken 30.
        
           | porkbeer wrote:
           | That just means you are ignorant of how wrong it guides you.
           | You need to first build trust before taking it new places.
           | You do that with topics and concepts you are familiar with.
        
             | OmarShehata wrote:
             | This has always been true of anything anyone has ever
             | googled or looked up on stackoverflow
             | 
              | I copy-paste code from stackoverflow all the time. I used
              | to agonize over making sure I fully understood every line
              | I was copying. Now I have the discretion of making that
             | decision: sometimes it does really matter, sometimes all
             | you need to know is that it produces the right result for
             | your limited use & test case of it. (it's no different than
             | relying on a 3rd party library in that way)
             | 
             | I think we need to apply the same discretion to LLM output.
             | The answer "it depends". Sometimes using its output blindly
             | leads to disaster. Sometimes using it without fully
             | understanding all the details is a great way to make
             | progress.
        
             | mrguyorama wrote:
             | This is no different from my coworker who regularly
             | copy/pastes from stackoverflow to do things he doesn't have
             | any idea how to do himself, and just as awful,
             | unproductive, and problem inducing.
        
           | OmarShehata wrote:
           | I want to second this:
           | 
            | > Use it for things where you think "this is probably easy
            | but I have no idea how to do it"
            | 
            | I had exactly the same reaction as OP (LLMs suck, what's with
            | all the hype). These people are using it differently. For
           | me it's often something like, asking it to put together a
           | specific sequence of matrix transformations in ThreeJS or
           | some other library.
           | 
           | This is not a difficult task but it's often one I waste a lot
           | of time getting right. It's sort of about finding the right
           | level of abstraction you need to ask it.
        
           | runeofdoom wrote:
           | And how often will those "plausible looking commands" create
           | obvious or subtle problems that cost far more than 30
           | minutes?
        
             | sebzim4500 wrote:
             | Probably about as often as if I cobbled something together
             | from random blog posts except faster.
             | 
             | It's not like the script is running a nuclear power
             | station.
        
         | Al-Khwarizmi wrote:
         | Does your job involve solving complex, challenging problems
         | _all_ the time?
         | 
         | I am a CS professor, I don't think most people would class that
         | as a trivial job, but I find myself needing to do plenty of
         | trivial tasks every day: mixed bureaucracy (periodic reports,
         | grant requests, various evaluations, etc.), trivial programming
         | (a Seaborn chart to show some Excel results), text polishing
         | (need to cut a text to 500 words without altering meaning),
         | writing student assignments, writing emails in (non-Native)
         | English for sensitive requests with the right tone, etc... all
         | of those are things I have found LLMs to do fairly well and
         | save me a lot of time.
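          | 
          | The "Seaborn chart from some Excel results" kind of chore looks
          | roughly like this (a sketch; the file name and column names are
          | made up, and reading .xlsx with pandas needs openpyxl
          | installed):
          | 
          |   import pandas as pd
          |   import seaborn as sns
          |   import matplotlib.pyplot as plt
          | 
          |   df = pd.read_excel("results.xlsx")
          | 
          |   sns.barplot(data=df, x="method", y="accuracy")
          |   plt.xticks(rotation=45)
          |   plt.tight_layout()
          |   plt.savefig("results.png", dpi=150)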
         | 
         | I wouldn't use them to do the core job of designing novel
         | algorithms, doing experiments, writing the bulk of a paper or
         | teaching students. But most of my working hours are not really
         | that "core" stuff. And I would assume it's the same for most
         | professionals.
         | 
          | If you have an environment where you are _constantly_
          | challenged by difficult tasks... wow. I don't know if I should
          | envy you (because I love difficult problems and hate mindless
          | chores) or whether it would be too stressful.
         | 
         | PS: I don't think "being too difficult for you to do it
         | yourself" is the right litmus test for LLM usefulness. I _can_
         | draw charts with Seaborn, of course. But the LLM does it much
          | faster, and I don't think doing it myself would make me grow,
         | hone useful skills or anything. I'd rather devote my time to
         | something else. So (in my view) it's clearly better to have the
         | LLM do it.
        
         | Benjaminsen wrote:
         | I'm preparing. Learning how to work with an AI is the only way
         | to stay competitive. The AIs will become smarter much faster
         | than I will.
        
         | mordymoop wrote:
         | When you say ChatGPT, are you referring to GPT4? I find a huge
         | and avoidable miscommunication happens when two people both
         | think they are using "ChatGPT" but talking about two different
         | models which vary in size by a factor of 10.
         | 
         | Assuming you are talking about GPT4, for the sake of argument,
         | the answer is speed. Of course I can write a small parser
         | script that deals with some data I received from a client. It
         | will take me an hour and be a tedious task far distant from my
         | actual expertise. An LLM can do it in 45 seconds, including the
         | time it took me to describe the task.
        
         | KLejmooo wrote:
         | I don't use it constantly but regularly.
         | 
          | LLMs' English skills are much better than mine.
          | 
          | And when I do a little bit of Go coding once a week (I'm a Java
          | developer by trade), I don't have the time to learn Go well
          | enough to just type stuff down without looking things up.
          | Instead of googling, I tell it "I need a struct with the
          | following attributes..." and it doesn't just tell me how I do
          | structs in Go, it also creates them for me.
          | 
          | Also: there are a TON of issues where I would write a short
          | script to do something (formatting text into a table, searching
          | for specific lines etc.) where a normal person doesn't even
          | have those tools at hand.
          | 
          | For companies overall: it's not just what an LLM can do for
          | you, it's also a very, very good interface to your application.
          | The demos I saw in my company are really good and totally make
          | sense and do reduce the entry barrier for people.
          | 
          | I know a friend whose job is to create reports with SQL. She
          | doesn't do anything else, just reports across the whole data
          | warehouse. Why? Because a normal non-dev person can't just
          | write SQL or automate things.
         | 
         | The gap between tech people and management is huge.
        
       | archibaldJ wrote:
        | This looks really interesting; any possibility the researchers
        | will release some code soon?
        
       | iAkashPaul wrote:
        | Base Mistral 7B is hardly suitable for the evaluations; even one
        | team at Intel tried to pull a fast one with NeuralChat in the
        | exact same way: https://huggingface.co/Intel/neural-
        | chat-7b-v3#quantitative-...
        
       | kjqgqkejbfefn wrote:
        | This is basically what I tried this morning at the prompt level
        | (awful results), but the sketchy idea I had in mind went further
        | by introducing control-flow "meta-tokens" to help the LLM
        | renavigate its context. In this perspective the context would be
        | rethought as a self-editing structured mind-map, with the linear
        | aspect of the context at a time T standing for the execution
        | trace of the exploration of this mind-map so far.
        | 
        | Some of those meta-tokens would be able to have side effects on
        | the context: to highlight, give structure to, summarize, forget,
        | and so on, some of its parts. This could allow for native
        | structured output without using a syntactic format such as JSON,
        | programmatic constructs in the style of LMQL, implementing
        | memory, etc. The goal: not just to give logical/reasoning
        | abilities to an LLM, but to give it the means to come up with
        | its own cognitive architecture.
        | 
        | Implementing structured output (using a <label
        | name="stuff">...</label> token) to also implement
        | memory/scratchpads would also bring inspectability of those
        | cognitive structures for free. Of course I have no idea how to
        | implement this (I'm an ML tourist).
        
       | lawlessone wrote:
        | If it is doing this, is it still a language model? Or also a
        | thought model?
        
       | aaroninsf wrote:
       | Observation: "expertise" (hence "reflex") is the learning of the
       | nonlinear solution space that can be inferred from initial
       | conditions.
       | 
       | Conjecture: models which engage in self-training on the solutions
       | they derive will get to something that looks a bit like
       | bootstrapping when you squint.
       | 
       | Lemma: there's a nice opportunity for cloud-hosted model SaaS to
       | offer discounts for actionable feedback on the quality of their
       | output, so as to drive this retraining.
       | 
       | Idle comment: I'd use the language of REM sleep and the idea of
       | "memory consolidation" for this.
       | 
       | Most of the premises of model training can be extended to the
       | level of reasoned solutions, rather than tokens.
        
       | thesz wrote:
       | They do not cite [1], a paper on (learned) variable computation
       | in RNN, applied to language modeling, that predates their work by
       | almost 8 years.
       | 
       | [1] https://openreview.net/pdf?id=S1LVSrcge
       | 
        | Microsoft also had something similar at that time, but for image
        | recognition: a CNN at the input and then variable computation at
        | classification.
        
       ___________________________________________________________________
       (page generated 2024-03-15 23:00 UTC)