[HN Gopher] Natural language instructions induce generalization ...
       ___________________________________________________________________
        
       Natural language instructions induce generalization in networks of
       neurons
        
       Author : birriel
       Score  : 116 points
       Date   : 2024-03-19 16:47 UTC (6 hours ago)
        
 (HTM) web link (www.nature.com)
 (TXT) w3m dump (www.nature.com)
        
       | kevindamm wrote:
       | It's as if language is itself the latent space for these
       | psychophysical tasks, especially compositional instruction. Their
       | description of it as a scaffolding also seems apt.
        
         | zer00eyz wrote:
         | I hate the reductive nature of the concept of "latent spaces".
         | 
         | A good enough formula for a task isn't a solution for every
          | task. Yes, Newtonian mechanics works, but Einstein is a better
         | reflection of reality.
        
           | bongodongobob wrote:
           | I'm not sure I understand the analogy. The very idea of NNs
           | is that it's not perfect, it is messy and not optimal, but is
           | very generalizable.
        
             | zer00eyz wrote:
             | >> The very idea of NNs is that it's not perfect, it is
             | messy and not optimal, but is very generalizable.
             | 
              | Newton: Do you need more than that to describe the speed of
              | a thrown baseball on a train? No. Do you need more than
              | Newton to get to the moon? No. Is it going to be accurate
              | at high speed in a large-scale system (anything traveling
              | near c)? No, it fails spectacularly.
             | 
              | NNs are great at simulation, language, weather... But what
              | people using them for weather seem to understand, and the ML
              | folks (screaming about AI and AGI) don't, is that simulation
              | is not a path to emulation. Lorenz showed that there were
              | limits in weather forecasting, limits that most other
              | disciplines have embraced.
        
           | nerdponx wrote:
           | The entire innovation (discovery?) of LLMs is that a good
           | formula for the task of sequence completion turns out to also
           | be a good formula for a wide range of AI tasks. That emergent
           | property is why language models are called language models.
        
         | nothis wrote:
         | I'm not pretending to understand half the words uttered in this
         | discussion but I'm constantly reminded of how much it helps me
         | to articulate things (explain them to others, write them down,
         | etc) to understand them. Maybe that thinking indeed happens
         | almost entirely on a linguistic level and I'm not doing half as
         | much other thinking (visualization, abstract logic, etc.) in
         | the process as I thought. That feels weird.
        
           | robwwilliams wrote:
           | Or is the real thinking sub-linguistic and "you" and those
           | you talk to are the target audience of language? Sentences
           | emerge from a pre-linguistic space we do not understand.
        
         | mrblah wrote:
          | I've always assumed that language was required to give your
          | brain the abstractions needed to reference things in the past
          | compared to your current perception (aka now), like an index.
          | If you think about your earliest memories, they almost
          | certainly came after language. I'd be interested to know if any
          | of the documented 'wild child' cases (infants 'raised by
          | wolves') ever delved into what the children remembered from
          | before, after being taught language as adolescents.
        
       | jameshart wrote:
       | > We found that language scaffolds sensorimotor representations
       | such that activity for interrelated tasks shares a common
       | geometry with the semantic representations of instructions,
       | allowing language to cue the proper composition of practiced
       | skills in unseen settings.
       | 
       | Sapir-Whorf with the surprise comeback?
        
         | sisyphus_coding wrote:
         | > comeback
         | 
         | Did it fall out of favour?
        
           | jameshart wrote:
           | Strong Sapir-Whorf (linguistic determinism - language
           | _constrains_ thought) became pretty much seen as a joke by
           | the 1980s. Linguistic relativism (weak Sapir-Whorf - language
           | _shapes_ thought) is still respectable (because, I mean, of
           | course it does).
           | 
           | Actually, this research might just as well be evidence for
           | linguistic universalism (Chomsky - language _enables_
           | thought).
           | 
            | In general, linguistic philosophers have been coming out with
            | either laughably obvious or utterly untestable hypotheses for
            | a century, and it's amusing to see how these AI studies stir
            | up the hornets' nest.
        
             | a_gnostic wrote:
             | Oftentimes I find myself understanding complex concepts
             | _before_ I can describe them, even _internally_. I am sure
              | everyone has this, as I often read comments praising others'
              | submissions for formulating their thoughts efficiently.
             | So thoughts occur independent of language, but need it to
             | be expressed and shared, even if through pictures and
             | sounds.
        
               | lo_zamoyski wrote:
               | Expression is not language. What you're having trouble
               | doing is expressing what you understand.
        
               | Jensson wrote:
               | But that would be an impossibility if understanding
               | requires expressing it in language.
        
               | vaidhy wrote:
                | Saying thoughts occur independent of language is the same
                | as saying sentient beings think. The question is: does the
                | thought you have depend on the language?
                | 
                | I speak Tamil and English and can distinctly see how the
                | language drives some of my understanding. If you had a
                | language that has evolved to describe 3D space, would you
                | understand spatial ideas better/faster?
               | 
                | If we are pattern-matching creatures, then the patterns
                | are built over a period of time and our earliest
                | scaffolding for the patterns comes from our mother tongue
                | (or the languages learnt in early childhood). Subsequent
               | understanding depends on building and expanding on those
               | patterns.
        
               | a_gnostic wrote:
               | I grew up trilingual, and have noticed that I understand
               | mechanical concepts better in one language, industrial
               | concepts in another... but have mostly defaulted to
               | English nowadays. I find learning new concepts easier by
                | playing translation games: which language is the root for
               | this word, and how does it mechanically relate to the
               | concept?
        
             | lo_zamoyski wrote:
             | The distinction between language and "thought" to me is
             | odd. Language and "thought" _are the same thing_. The mouth
              | sounds or hand scribbles aren't the language, but
             | _expressions_ of it.
        
               | somewhereoutth wrote:
               | You don't need language to catch a ball, but clearly
               | thinking is required to intercept its trajectory
               | correctly.
               | 
               | Language is about _communication_.
        
               | photonthug wrote:
               | There's an argument that communication which is internal
               | is still communication, and that a language of
               | trajectories required for coordination is still
               | linguistic in a meaningful sense. Most of the ways to
               | differentiate thought from language are probably going to
               | end up splitting hairs. It all comes back to
               | Wittgenstein, and it's arguable whether the POV is
               | useful, but it's certainly coherent and defensible.
        
               | noiv wrote:
               | I consider a sentence as a formatted thought. That
               | implies a thought exists before it is expressed in words.
               | There's a ton of thoughts in my head which can't be
               | transformed into any language I speak. I wish I could
               | somehow acquire some proficiency in the other thousands
               | of languages spoken by humans on this planet, just to
                | prove their immense lack of features.
               | 
               | Also our natural languages restrict information bandwidth
               | to a few bytes per second. Imagine doing sports like
               | tennis, chess or soccer at this speed...
        
               | jameshart wrote:
               | One issue here is semantics. The things that happen in
               | our brains which we can put into words tend to be the
               | things we categorize as 'thoughts'. But there are things
               | that happen in our brains which we struggle to connect to
               | language too, and we might call those 'feelings' or
               | 'emotions' or 'instincts' instead. So we're trying to use
               | language to think about how we think about language and I
               | suspect this might be why that end of neurolinguistics
               | falls off the deep end into philosophy.
        
       | nyrikki wrote:
       | >Tasks that are instructed using conditional clauses also require
       | a simple form of deductive reasoning (if p then q else s)
       | 
        | > Our models offer several experimentally testable predictions
       | outlining how linguistic information must be represented to
       | facilitate flexible and general cognition in the human brain.
       | 
        | Aren't those claims falsified by more recent studies showing
        | that, even in flies, preferred direction to a moving stimulus
        | uses the timing of spikes, and that fear conditioning, even in
        | mice, uses dendritic compartmentalization?
       | 
        | Or that humans can even do XOR with a single neuron?
       | 
        | If "must be represented" were "may be modeled by" I would have
        | less of an issue, and obviously spiking artificial NNs have had
        | problems with riddled basins and make autograd problematic in
        | general.
       | 
        | So ANNs need to be binary, and it is best to model biological
        | neurons as such for practical models... but can someone please
        | clarify why 'must' can apply when using what we now know are
        | oversimplified artificial neuron models?
       | 
        | Here are a couple of recent papers, but I think dendritic
        | compartmentalization and spike-timing sensitivity have been
        | established for over a decade.
       | 
       | https://pubmed.ncbi.nlm.nih.gov/35701166/
       | 
       | https://www.sciencedirect.com/science/article/pii/S009286741...
        
         | ben_w wrote:
          | > Or that humans can even do XOR with a single neuron?
         | 
         | That's news to me.
         | 
          | I'm not _hugely_ surprised given I've heard a biological
         | neuron is supposed to be equivalent to a small ANN network, but
         | still, first I've heard of that claim.
        
           | orbifold wrote:
           | https://www.science.org/doi/full/10.1126/science.aax6239
        
             | ben_w wrote:
             | Thanks :)
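         
        A note on why the single-neuron XOR result linked above is striking:
        a single linear-threshold unit, the textbook artificial neuron,
        provably cannot compute XOR, while a tiny two-layer network of such
        units can. The sketch below is illustrative only; the weights are
        hand-picked and nothing here is taken from the paper or the linked
        study.
         
            import itertools
         
            def unit(w, b, x):
                # Classic artificial neuron: fires (1) iff w . x + b > 0.
                return int(sum(wi * xi for wi, xi in zip(w, x)) + b > 0)
         
            inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
            xor = [0, 1, 1, 0]
         
            # No single unit reproduces XOR, even searching a coarse weight
            # grid (XOR is not linearly separable, so no weights work at all).
            grid = [-2, -1, -0.5, 0, 0.5, 1, 2]
            print(any([unit((w1, w2), b, x) for x in inputs] == xor
                      for w1, w2, b in itertools.product(grid, repeat=3)))
            # -> False
         
            # Two hidden units plus an output unit compute XOR easily.
            def tiny_net(x):
                h_or = unit((1, 1), -0.5, x)   # fires if x1 OR x2
                h_and = unit((1, 1), -1.5, x)  # fires if x1 AND x2
                return unit((1, -2), -0.5, (h_or, h_and))  # OR but not AND
         
            print([tiny_net(x) for x in inputs])  # [0, 1, 1, 0]
         
        Dendritic compartmentalization is what lets a single biological
        neuron behave more like the small network above, which is why it
        undercuts the "one neuron = one linear-threshold unit"
        simplification.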
        
         | canjobear wrote:
         | There's a long debate in neuroscience about whether information
         | is encoded in timing of individual spikes or only their rates
         | (where rate coding is a bit more similar to how ANNs work, but
         | still different). It hasn't been decided by any one paper, nor
         | is it likely to be: it seems that different populations of
         | neurons in different parts of the brain encode information
         | through different means.
        
           | robwwilliams wrote:
            | Not either-or. It is both. Spike rate variation is way too
            | slow for some types of low-level compute. Spike timing is
            | critical for actions as "simple" as throwing a fastball into
            | the strike zone.
        
         | robwwilliams wrote:
         | Yes, the word "represented" is too widely used and abused in
         | neuroscience--to the point where a frog has "fly detector"
         | neurons. Humberto Maturana pushed back against this pervasive
          | idea. Chapter 4 of Terry Winograd and Fernando Flores's
          | Understanding Computers and Cognition has a good overview of
          | common presumptions.
         | 
          | Given that the CNS is a 700-million-year hack, there will be
          | lots of odd tricks used to generate effective behaviors.
        
       | cs702 wrote:
       | TL;DR: The authors embed task instructions in a vector space with
       | a language model, and train a sensorimotor-controlling model on
       | top to perform tasks given the instruction embeddings. The
       | authors find that the models generalize to previously unseen
       | tasks, specified in natural language. Moreover, the authors show
       | that the hidden states learn to represent task subcomponents,
        | which helps explain why the model is able to generalize.
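         
        A minimal sketch of the setup summarized above, assuming a frozen
        sentence-embedding model and a small GRU readout; the model name,
        dimensions, and example instruction are illustrative assumptions,
        not the authors' code.
         
            import torch
            import torch.nn as nn
            from transformers import AutoModel, AutoTokenizer
         
            class InstructedSensorimotorNet(nn.Module):
                def __init__(self, embed_dim=384, sensory_dim=32,
                             hidden_dim=256, motor_dim=33):
                    super().__init__()
                    # Frozen pretrained language model supplies instruction
                    # embeddings (all-MiniLM-L6-v2 is an assumed stand-in).
                    name = "sentence-transformers/all-MiniLM-L6-v2"
                    self.tokenizer = AutoTokenizer.from_pretrained(name)
                    self.lm = AutoModel.from_pretrained(name).eval()
                    for p in self.lm.parameters():
                        p.requires_grad = False
                    # Recurrent sensorimotor network, conditioned on the
                    # instruction embedding at every time step.
                    self.rnn = nn.GRU(sensory_dim + embed_dim, hidden_dim,
                                      batch_first=True)
                    self.readout = nn.Linear(hidden_dim, motor_dim)
         
                def embed_instruction(self, text):
                    tokens = self.tokenizer(text, return_tensors="pt")
                    with torch.no_grad():
                        out = self.lm(**tokens).last_hidden_state
                    return out.mean(dim=1)  # mean-pooled sentence embedding
         
                def forward(self, instruction, sensory):
                    # sensory: (batch, time, sensory_dim)
                    emb = self.embed_instruction(instruction)
                    emb = emb.expand(sensory.size(0), -1).unsqueeze(1)
                    emb = emb.expand(-1, sensory.size(1), -1)
                    h, _ = self.rnn(torch.cat([sensory, emb], dim=-1))
                    return self.readout(h)  # motor output per time step
         
            # Train on a set of instructed tasks, then test on an
            # instruction the network has never seen.
            net = InstructedSensorimotorNet()
            stimulus = torch.randn(8, 100, 32)  # 8 trials, 100 time steps
            motor = net("respond in the direction opposite to the stimulus",
                        stimulus)
            print(motor.shape)  # torch.Size([8, 100, 33])
         
        The paper's claim (quoted in jameshart's comment above) is that,
        after training, the hidden-state geometry for interrelated tasks
        aligns with the semantic geometry of the instruction embeddings,
        which is what lets such a network compose practiced skills when
        given held-out instructions.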
        
       | worstspotgain wrote:
       | Birds inspired planes. Later, aerodynamics fed back into
       | ornithology. I've been waiting for LLMs to be evaluated as a
       | model of human thought. Complexity and scale have held
       | neuroscience back. It's nowhere close to building a high-level
       | brain model up from biological primitives. Like ornithology, it
       | could use some feedback.
       | 
       | All arguments about AGI aside, a machine was built that writes
       | like a human. Its design is very un-biological in places, so it's
       | tempting to dismiss it. Why not see how deep the rabbit hole
       | goes?
       | 
       | Like all conjectures it just might surprise us. For one, the
       | intuition that language is central to thought predates LLMs, but
        | it's certainly consistent with them.
        
         | smokel wrote:
         | > Later, aerodynamics fed back into ornithology.
         | 
         | You mean to imply that birds have learned from jet fighter
         | designs?
         | 
         | I fail to understand the point you're making.
        
           | worstspotgain wrote:
           | Concepts from the mechanics of flight (e.g. lift and drag)
           | have helped ornithologists understand birds better. Birds
           | themselves did not learn much.
        
           | vaidhy wrote:
            | I believe the point is that we understand birds' flight
            | better by applying the principles we learnt designing and
            | flying airplanes. Similarly, we can learn more about the
            | human brain by applying things we learn from building ANNs
            | back to the study of humans (anthropology in general, maybe
            | neuroscience and psychology).
        
           | Sardtok wrote:
           | He didn't say it fed back into birds, but into the study of
           | birds.
        
         | lo_zamoyski wrote:
         | Why assume this is sensible? Aerodynamics can help discover
         | general _principles_ that apply to both birds and planes or
          | whatever else. I don't see how this holds for LLMs and brains.
         | The similarity between LLMs and brains is superficial.
         | 
         | Besides, with human beings, we have a host of philosophical
         | problems that undermine the neuroscientific presumption that a
         | mechanistic and closed view of the brain can account for mental
         | activity entirely, like the problem of intentionality.
        
           | og_kalu wrote:
           | >I don't see how this holds for LLMs and brains. The
           | similarity between LLMs and brains is superficial.
           | 
           | It's really not any more superficial than planes and bird
           | flight.
        
         | visarga wrote:
         | > a machine was built that writes like a human
         | 
         | But the real hero here is not the LLM, but the training set. It
         | took ages to collect all the knowledge, ideas and methods we
         | put in books. It cost a lot of human effort to provide the
         | data. Without the data we would have nothing. Without GPT we
          | could use RWKV, Mamba, S4, etc. and still get similar results.
          | It's the data, not the model.
         | 
         | > the intuition that language is central to thought predates
         | LLMs
         | 
         | Language carries AI and humans. The same distribution of
         | language can be the software running in our brains and in LLMs.
          | I think humans act like conditional language models with
          | multimodality and actions. We use language to plan and solve
          | our problems, work together, and learn (a lot) from others.
         | 
         | Language itself is an evolutionary system and a self
         | replicator. Its speed is much faster than biology. We've been
         | on the language exponential for millennia, but just now hit the
         | critical mass for LLMs to be possible.
         | 
          | It's not so important that GPT-4 is a 2T-weight model; what
          | matters is that it was trained on 13T tokens of human
         | experience and it now "writes like a human". Does that mean
         | humans also learn the same skills GPT-4 has learned from its
         | training set mostly by language as well?
        
           | AlecSchueler wrote:
           | > But the real hero here is not the LLM, but the training
           | set.
           | 
           | And in the case of windmills the hero is the wind. But the
           | mill is still a fantastic achievement.
        
       | retskrad wrote:
        | It's clear that both biological sentient beings and sentient
        | beings made in factories in the future will essentially be two
        | sides of the same coin, differing only in their physical
        | composition. Humans today operate as biological AI, powered by
        | cells, while factory-made sentient beings would run on
        | transistors. As we progress toward a future where both kinds of
        | sentient being exhibit comparable intelligence, emotions, and
        | learned experiences, the
       | distinction between the two becomes increasingly blurred. It
       | wouldn't be crazy to think that we'll program a person made out
       | of transistors to go through the same life as a biological human.
       | In such a scenario, why should we consider the sentient being
       | made of cells inherently superior to its transistor-based
       | counterpart?
        
         | arijun wrote:
         | I don't think it's a given that people do. Also what does that
         | have to do with the article?
        
         | spr-alex wrote:
         | proof left as exercise to the reader
        
         | smokel wrote:
         | Hm, no. To start, cells are able to replicate themselves,
         | whereas most silicon used today is not even close to doing so.
         | 
         | The story you suggest seems to be built on a limited
         | understanding of the processes involved. It's pretty hard to
         | predict the future, especially given incorrect assumptions.
        
           | KoolKat23 wrote:
            | Cell division is a solved problem. Install a new RAM module
            | and ctrl+c, ctrl+v from backup, done.
        
           | Workaccount2 wrote:
           | Compute is substrate independent.
           | 
           | We use transistors because they are wicked fast and
           | efficient. But a 4090 built from metal balls and wood blocks
           | would still be able to perform all the same calculations. Or
           | a 4090 made by drawing X's and O's on a (really massive)
           | piece of paper. Or one made by connecting a bunch of neurons
           | together for that matter.
           | 
            | Saying cells can multiply doesn't really mean anything,
            | unless it gives the ability to access some higher form of
            | compute that is outside the reach of Turing machines. Which
            | it doesn't, because if it did, it would be supernatural.
        
             | Jensson wrote:
             | Analogue computers aren't equivalent to Turing machines.
        
       | dmead wrote:
        | I hate that we just turned out to be stochastic machines and not
        | something more interesting.
        
         | KoolKat23 wrote:
         | Delusions of grandeur
        
       | alexfromapex wrote:
       | I think the title of this is wrong...it should say "Structured
       | Language Patterns Facilitate Structured Generalization in Neural
       | Networks".
        
       | vassilis-uk wrote:
        | Here is a video of the author explaining the work:
       | https://www.youtube.com/watch?v=miEwuSz7Pts
        
       ___________________________________________________________________
       (page generated 2024-03-19 23:00 UTC)