[HN Gopher] Natural language instructions induce generalization ...
___________________________________________________________________
Natural language instructions induce generalization in networks of
neurons
Author : birriel
Score : 116 points
Date : 2024-03-19 16:47 UTC (6 hours ago)
(HTM) web link (www.nature.com)
(TXT) w3m dump (www.nature.com)
| kevindamm wrote:
| It's as if language is itself the latent space for these
| psychophysical tasks, especially compositional instruction. Their
| description of it as a scaffolding also seems apt.
| zer00eyz wrote:
| I hate the reductive nature of the concept of "latent spaces".
|
| A good-enough formula for a task isn't a solution for every
| task. Yes, Newtonian mechanics works, but Einstein is a
| better reflection of reality.
| bongodongobob wrote:
| I'm not sure I understand the analogy. The very idea of NNs
| is that they're not perfect; they're messy and not optimal,
| but very generalizable.
| zer00eyz wrote:
| >> The very idea of NNs is that they're not perfect; they're
| messy and not optimal, but very generalizable.
|
| Newton: Do you need more than that to describe the speed of
| a thrown baseball on a train? No. Do you need more than
| Newton to get to the moon? No. Is it going to be accurate
| at high speed in a large-scale system (anything traveling
| near c)? No, it fails spectacularly.
|
| NNs are great at simulation, language, weather... But what
| people using them for weather seem to understand, and the ML
| folks (screaming about AI and AGI) don't, is that simulation
| is not a path to emulation. Lorenz showed that there were
| limits in weather prediction, limits that most other
| disciplines have embraced.
| nerdponx wrote:
| The entire innovation (discovery?) of LLMs is that a good
| formula for the task of sequence completion turns out to also
| be a good formula for a wide range of AI tasks. That emergent
| property is why language models are called language models.
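|
| Concretely, the "formula" is next-token prediction. A sketch
| of the objective in PyTorch (function name and tensor shapes
| are my assumptions, nothing more):
|
|   import torch.nn.functional as F
|
|   def next_token_loss(logits, tokens):
|       # logits: (batch, seq, vocab) from any sequence model;
|       # tokens: (batch, seq) integer ids of the training text.
|       # Shift by one: predict token t+1 from the prefix up to t.
|       return F.cross_entropy(
|           logits[:, :-1].reshape(-1, logits.size(-1)),
|           tokens[:, 1:].reshape(-1),
|       )
|
| Everything else falls out of minimizing this one loss at scale.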
| nothis wrote:
| I'm not pretending to understand half the words uttered in
| this discussion, but I'm constantly reminded of how much it
| helps me to articulate things (explain them to others, write
| them down, etc.) in order to understand them. Maybe that
| thinking indeed happens almost entirely on a linguistic
| level, and I'm not doing half as much other thinking
| (visualization, abstract logic, etc.) in the process as I
| thought. That feels weird.
| robwwilliams wrote:
| Or is the real thinking sub-linguistic and "you" and those
| you talk to are the target audience of language? Sentences
| emerge from a pre-linguistic space we do not understand.
| mrblah wrote:
| i've always assumed that language was required to give your
| brain the abstractions needed to reference things in the past
| compared to your current perception (aka now), like an index.
| if you think about your earliest memories, they almost
| certainly came after language. i'd be interested to know if any
| of the documented 'wild child' cases (infants 'raised by
| wolves') ever delved into what the children remembered before,
| after being taught language as an adolescent.
| jameshart wrote:
| > We found that language scaffolds sensorimotor representations
| such that activity for interrelated tasks shares a common
| geometry with the semantic representations of instructions,
| allowing language to cue the proper composition of practiced
| skills in unseen settings.
|
| Sapir-Whorf with the surprise comeback?
| sisyphus_coding wrote:
| > comeback
|
| Did it fall out of favour?
| jameshart wrote:
| Strong Sapir-Whorf (linguistic determinism - language
| _constrains_ thought) had come to be seen as pretty much a
| joke by the 1980s. Linguistic relativism (weak Sapir-Whorf -
| language _shapes_ thought) is still respectable (because, I
| mean, of course it does).
|
| Actually, this research might just as well be evidence for
| linguistic universalism (Chomsky - language _enables_
| thought).
|
| In general, linguistic philosophers have been coming out with
| either laughably obvious or utterly untestable hypotheses for
| a century, and it's amusing to see how these AI studies stir
| up the hornets' nest.
| a_gnostic wrote:
| Oftentimes I find myself understanding complex concepts
| _before_ I can describe them, even _internally_. I am sure
| everyone has this, as I often read comments praising others'
| submissions for formulating their thoughts efficiently.
| So thoughts occur independent of language, but need it to
| be expressed and shared, even if through pictures and
| sounds.
| lo_zamoyski wrote:
| Expression is not language. What you're having trouble
| doing is expressing what you understand.
| Jensson wrote:
| But that would be an impossibility if understanding
| requires expressing it in language.
| vaidhy wrote:
| Saying that thoughts occur independent of language is the
| same as saying sentient beings think. The question is: does
| the thought you have depend on the language?
|
| I speak Tamil and English and can distinctly see how the
| language drives some of my understanding. If you had a
| language that had evolved to describe 3D space, would you
| understand spatial ideas better/faster?
|
| If we are pattern-matching creatures, then the patterns
| are built over a period of time, and our earliest
| scaffolding for the patterns comes from our mother tongue
| (or the languages learnt in early childhood). Subsequent
| understanding depends on building and expanding on those
| patterns.
| a_gnostic wrote:
| I grew up trilingual, and have noticed that I understand
| mechanical concepts better in one language, industrial
| concepts in another... but have mostly defaulted to
| English nowadays. I find learning new concepts easier by
| playing translation games: which language is the root for
| this word, and how does it mechanically relate to the
| concept?
| lo_zamoyski wrote:
| The distinction between language and "thought" is odd to
| me. Language and "thought" _are the same thing_. The mouth
| sounds or hand scribbles aren't the language, but
| _expressions_ of it.
| somewhereoutth wrote:
| You don't need language to catch a ball, but clearly
| thinking is required to intercept its trajectory
| correctly.
|
| Language is about _communication_.
| photonthug wrote:
| There's an argument that communication which is internal
| is still communication, and that a language of
| trajectories required for coordination is still
| linguistic in a meaningful sense. Most of the ways to
| differentiate thought from language are probably going to
| end up splitting hairs. It all comes back to
| Wittgenstein, and it's arguable whether the POV is
| useful, but it's certainly coherent and defensible.
| noiv wrote:
| I consider a sentence a formatted thought. That implies a
| thought exists before it is expressed in words. There are
| tons of thoughts in my head which can't be transformed
| into any language I speak. I wish I could somehow acquire
| some proficiency in the thousands of other languages
| spoken by humans on this planet, just to prove their
| immense lack of features.
|
| Also, our natural languages restrict information bandwidth
| to a few bytes per second. Imagine doing sports like
| tennis, chess or soccer at this speed...
| jameshart wrote:
| One issue here is semantics. The things that happen in
| our brains which we can put into words tend to be the
| things we categorize as 'thoughts'. But there are things
| that happen in our brains which we struggle to connect to
| language too, and we might call those 'feelings' or
| 'emotions' or 'instincts' instead. So we're trying to use
| language to think about how we think about language and I
| suspect this might be why that end of neurolinguistics
| falls off the deep end into philosophy.
| nyrikki wrote:
| >Tasks that are instructed using conditional clauses also require
| a simple form of deductive reasoning (if p then q else s)
|
| > Our models offer several experimentally testable predictions
| outlining how linguistic information must be represented to
| facilitate flexible and general cognition in the human brain.
|
| Aren't those claims falsified by more recent studies showing
| that, even in flies, preferred direction to a moving stimulus
| uses the timing of spikes, and that fear conditioning, even
| in mice, uses dendritic compartmentalization?
|
| Or that humans can even do XOR with a single neuron?
|
| If "must be represented" were "may be modeled by" I would
| have less of an issue, and obviously spiking artificial NNs
| have had problems with riddled basins and make autograd
| problematic in general.
|
| So ANNs need to be binary, and it is best to model biological
| neurons as such for practical models... but can someone
| please clarify why 'must' can apply when using what we now
| know is an oversimplified artificial neuron model?
|
| Here are a couple of recent papers, but I think dendritic
| compartmentalization and spike-timing sensitivity have been
| established for over a decade.
|
| https://pubmed.ncbi.nlm.nih.gov/35701166/
|
| https://www.sciencedirect.com/science/article/pii/S009286741...
| ben_w wrote:
| > Or that humans can even do xor with a single neuron.
|
| That's news to me.
|
| I'm not _hugely_ surprised, given I've heard a biological
| neuron is supposed to be equivalent to a small ANN, but
| still, it's the first I've heard of that claim.
| orbifold wrote:
| https://www.science.org/doi/full/10.1126/science.aax6239
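|
| If I'm reading that result right: the reported dendritic
| calcium action potentials are strongest near threshold and
| shrink as input grows, and that non-monotonic response lets
| a single neuron compute XOR, which a classic monotonic point
| neuron cannot. A toy illustration in Python (made-up numbers,
| not a biophysical model):
|
|   def point_neuron(x1, x2, w=(1.0, 1.0), theta=0.5):
|       # Monotonic threshold unit: no choice of w and theta
|       # reproduces the XOR truth table.
|       return int(w[0] * x1 + w[1] * x2 >= theta)
|
|   def dendritic_neuron(x1, x2, low=0.5, high=1.5):
|       # Non-monotonic "bump": active near threshold, silent
|       # for stronger input, loosely mimicking the paper's
|       # dendritic spikes.
|       return int(low <= x1 + x2 < high)
|
|   for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
|       assert dendritic_neuron(a, b) == (a ^ b)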
| ben_w wrote:
| Thanks :)
| canjobear wrote:
| There's a long debate in neuroscience about whether information
| is encoded in timing of individual spikes or only their rates
| (where rate coding is a bit more similar to how ANNs work, but
| still different). It hasn't been decided by any one paper, nor
| is it likely to be: it seems that different populations of
| neurons in different parts of the brain encode information
| through different means.
| robwwilliams wrote:
| Not either-or. It is both. Spike-rate variation is way too
| slow for some types of low-level compute. Spike timing is
| critical for actions as "simple" as throwing a fastball into
| the strike zone.
| robwwilliams wrote:
| Yes, the word "represented" is too widely used and abused in
| neuroscience--to the point where a frog has "fly detector"
| neurons. Humberto Maturana pushed back against this pervasive
| idea. Chapter 4 of Terry Winograd and Fernando Flores's
| Understanding Computers and Cognition has a good overview of
| common presumptions.
|
| Given that the CNS is a 700-million-year hack, there will be
| lots of odd tricks used to generate effective behaviors.
| cs702 wrote:
| TL;DR: The authors embed task instructions in a vector space
| with a language model, and train a sensorimotor-controlling
| model on top to perform tasks given the instruction
| embeddings. The authors find that the models generalize to
| previously unseen tasks specified in natural language.
| Moreover, the authors show that the hidden states learn to
| represent task subcomponents, which helps explain why the
| model is able to generalize.
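|
| A minimal PyTorch sketch of that setup (class, argument and
| dimension names are my own illustrative assumptions, not
| taken from the paper):
|
|   import torch
|   import torch.nn as nn
|
|   class InstructedSensorimotorNet(nn.Module):
|       # Recurrent controller that reads the sensory input plus
|       # a fixed embedding of the task instruction at each step.
|       def __init__(self, stim_dim, embed_dim, hidden_dim,
|                    motor_dim):
|           super().__init__()
|           self.rnn = nn.GRU(stim_dim + embed_dim, hidden_dim,
|                             batch_first=True)
|           self.readout = nn.Linear(hidden_dim, motor_dim)
|
|       def forward(self, stimuli, instruction_embedding):
|           # stimuli: (batch, steps, stim_dim);
|           # instruction_embedding: (batch, embed_dim), produced
|           # once by a pretrained sentence encoder and broadcast
|           # to every timestep.
|           steps = stimuli.shape[1]
|           ctx = instruction_embedding.unsqueeze(1)
|           ctx = ctx.expand(-1, steps, -1)
|           hidden, _ = self.rnn(torch.cat([stimuli, ctx], dim=-1))
|           return self.readout(hidden)
|
| Because the instruction embeddings come from a pretrained
| language model, a novel instruction that is semantically close
| to trained ones lands near them in embedding space and can cue
| the right composition of practiced skills.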
| worstspotgain wrote:
| Birds inspired planes. Later, aerodynamics fed back into
| ornithology. I've been waiting for LLMs to be evaluated as a
| model of human thought. Complexity and scale have held
| neuroscience back. It's nowhere close to building a high-level
| brain model up from biological primitives. Like ornithology, it
| could use some feedback.
|
| All arguments about AGI aside, a machine was built that writes
| like a human. Its design is very un-biological in places, so it's
| tempting to dismiss it. Why not see how deep the rabbit hole
| goes?
|
| Like all conjectures, it just might surprise us. For one, the
| intuition that language is central to thought predates LLMs,
| but it's certainly consistent with them.
| smokel wrote:
| > Later, aerodynamics fed back into ornithology.
|
| You mean to imply that birds have learned from jet fighter
| designs?
|
| I fail to understand the point you're making.
| worstspotgain wrote:
| Concepts from the mechanics of flight (e.g. lift and drag)
| have helped ornithologists understand birds better. Birds
| themselves did not learn much.
| vaidhy wrote:
| I believe the point is that we understand birds' flight
| better by applying the principles we learnt designing and
| flying airplanes. Similarly, we can learn more about the
| human brain by applying things we learn from building ANNs
| back to the study of humans (anthropology in general, maybe
| neuroscience and psychology).
| Sardtok wrote:
| He didn't say it fed back into birds, but into the study of
| birds.
| lo_zamoyski wrote:
| Why assume this is sensible? Aerodynamics can help discover
| general _principles_ that apply to both birds and planes or
| whatever else. I don't see how this holds for LLMs and
| brains. The similarity between LLMs and brains is
| superficial.
|
| Besides, with human beings, we have a host of philosophical
| problems that undermine the neuroscientific presumption that a
| mechanistic and closed view of the brain can account for mental
| activity entirely, like the problem of intentionality.
| og_kalu wrote:
| >I don't see how this holds for LLMs and brains. The
| similarity between LLMs and brains is superficial.
|
| It's really not any more superficial than planes and bird
| flight.
| visarga wrote:
| > a machine was built that writes like a human
|
| But the real hero here is not the LLM, but the training set.
| It took ages to collect all the knowledge, ideas and methods
| we put in books. It cost a lot of human effort to provide the
| data. Without the data we would have nothing. Without GPT we
| could use RWKV, Mamba, S4, etc. and still get similar
| results. It's the data, not the model.
|
| > the intuition that language is central to thought predates
| LLMs
|
| Language carries both AI and humans. The same distribution of
| language can be the software running in our brains and in
| LLMs. I think humans act like conditional language models
| with multimodality and actions. We use language to plan and
| solve our problems, work together and learn (a lot) from
| others.
|
| Language itself is an evolutionary system and a
| self-replicator. Its speed is much faster than biology's.
| We've been on the language exponential for millennia, but
| just now hit the critical mass for LLMs to be possible.
|
| It's not so important that GPT-4 is a 2T-weight model; what
| matters is that it was trained on 13T tokens of human
| experience and now "writes like a human". Does that mean
| humans, too, learn the skills GPT-4 picked up from its
| training set mostly through language?
| AlecSchueler wrote:
| > But the real hero here is not the LLM, but the training
| set.
|
| And in the case of windmills the hero is the wind. But the
| mill is still a fantastic achievement.
| retskrad wrote:
| It's clear that biological sentient beings and the sentient
| beings made in factories in the future will essentially be
| two sides of the same coin, differing only in their physical
| composition. Humans today operate as biological AI, powered
| by cells, while the factory-made sentient beings will operate
| on transistors. As we progress toward a future where both
| kinds of sentient being exhibit comparable intelligence,
| emotions, and learned experiences, the distinction between
| the two becomes increasingly blurred. It wouldn't be crazy to
| think that we'll program a person made out of transistors to
| go through the same life as a biological human. In such a
| scenario, why should we consider the sentient being made of
| cells inherently superior to its transistor-based
| counterpart?
| arijun wrote:
| I don't think it's a given that people do. Also what does that
| have to do with the article?
| spr-alex wrote:
| proof left as exercise to the reader
| smokel wrote:
| Hm, no. To start, cells are able to replicate themselves,
| whereas most silicon used today is not even close to doing so.
|
| The story you suggest seems to be built on a limited
| understanding of the processes involved. It's pretty hard to
| predict the future, especially given incorrect assumptions.
| KoolKat23 wrote:
| Cell division is a solved problem. Install a new RAM module
| and Ctrl+C, Ctrl+V from backup, done.
| Workaccount2 wrote:
| Compute is substrate independent.
|
| We use transistors because they are wicked fast and
| efficient. But a 4090 built from metal balls and wood blocks
| would still be able to perform all the same calculations. Or
| a 4090 made by drawing X's and O's on a (really massive)
| piece of paper. Or one made by connecting a bunch of neurons
| together for that matter.
|
| Saying cells can multiply doesn't really mean anything,
| unless it gives access to some higher form of compute that
| is outside the reach of Turing machines. Which it doesn't,
| because if it did, it would be supernatural.
| Jensson wrote:
| Analogue computers aren't equivalent to Turing machines.
| dmead wrote:
| I hate that we just turned out to be stochastic machines and
| not something more interesting.
| KoolKat23 wrote:
| Delusions of grandeur
| alexfromapex wrote:
| I think the title of this is wrong... it should say
| "Structured Language Patterns Facilitate Structured
| Generalization in Neural Networks".
| vassilis-uk wrote:
| Here is a video of the author explaining the work:
| https://www.youtube.com/watch?v=miEwuSz7Pts
___________________________________________________________________
(page generated 2024-03-19 23:00 UTC)