Artificial intelligence sheds light on how the brain processes language

Neuroscientists find the internal workings of next-word prediction models resemble those of language-processing centers in the brain.

Anne Trafton | MIT News Office
Publication Date: October 25, 2021

In the past few years, artificial intelligence models of language have become very good at certain tasks. Most notably, they excel at predicting the next word in a string of text; this technology helps search engines and texting apps predict the next word you are going to type. The most recent generation of predictive language models also appears to learn something about the underlying meaning of language. These models can not only predict the word that comes next, but also perform tasks that seem to require some degree of genuine understanding, such as question answering, document summarization, and story completion. Such models were designed to optimize performance for the specific function of predicting text, without attempting to mimic anything about how the human brain performs this task or understands language.
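To make the prediction task concrete, the sketch below queries a small, openly available model (GPT-2, a predecessor of GPT-3) for its most likely next words, using the Hugging Face transformers library. The prompt is purely illustrative and not drawn from the study.

```python
# Minimal next-word prediction sketch using the Hugging Face
# "transformers" library. GPT-2 stands in here for larger models
# such as GPT-3; the prompt is illustrative only.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "The scientists measured activity in the brain's language"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, seq_len, vocab_size)

# The logits at the last position score every candidate next token.
next_token_logits = logits[0, -1]
top = torch.topk(next_token_logits, k=5)
for score, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id)!r}: {score.item():.2f}")
```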
But a new study from MIT neuroscientists suggests the underlying function of these models resembles the function of language-processing centers in the human brain. Computer models that perform well on other types of language tasks do not show this similarity to the human brain, offering evidence that the human brain may use next-word prediction to drive language processing.

"The better the model is at predicting the next word, the more closely it fits the human brain," says Nancy Kanwisher, the Walter A. Rosenblith Professor of Cognitive Neuroscience, a member of MIT's McGovern Institute for Brain Research and Center for Brains, Minds, and Machines (CBMM), and an author of the new study. "It's amazing that the models fit so well, and it very indirectly suggests that maybe what the human language system is doing is predicting what's going to happen next."

Joshua Tenenbaum, a professor of computational cognitive science at MIT and a member of CBMM and MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL), and Evelina Fedorenko, the Frederick A. and Carole J. Middleton Career Development Associate Professor of Neuroscience and a member of the McGovern Institute, are the senior authors of the study, which appears this week in the Proceedings of the National Academy of Sciences. Martin Schrimpf, an MIT graduate student who works in CBMM, is the first author of the paper.

Making predictions

The new, high-performing next-word prediction models belong to a class of models called deep neural networks. These networks contain computational "nodes" that form connections of varying strength, and layers that pass information between each other in prescribed ways.

Over the past decade, scientists have used deep neural networks to create models of vision that can recognize objects as well as the primate brain does. Research at MIT has also shown that the underlying function of visual object recognition models matches the organization of the primate visual cortex, even though those computer models were not specifically designed to mimic the brain.

In the new study, the MIT team used a similar approach to compare language-processing centers in the human brain with language-processing models. The researchers analyzed 43 different language models, including several that are optimized for next-word prediction. These include a model called GPT-3 (Generative Pre-trained Transformer 3), which, given a prompt, can generate text similar to what a human would produce. Other models were designed to perform different language tasks, such as filling in a blank in a sentence.

As each model was presented with a string of words, the researchers measured the activity of the nodes that make up the network. They then compared these patterns to activity in the human brain, measured in subjects performing three language tasks: listening to stories, reading sentences one at a time, and reading sentences in which one word is revealed at a time. These human datasets included functional magnetic resonance imaging (fMRI) data and intracranial electrocorticographic measurements taken in people undergoing brain surgery for epilepsy.

They found that the best-performing next-word prediction models had activity patterns that very closely resembled those seen in the human brain. Activity in those same models was also highly correlated with human behavioral measures, such as how quickly people were able to read the text.
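The model-to-brain comparison can be pictured as follows: record the network's internal activations for each stimulus, fit a linear mapping from those activations to the measured brain responses, and score the fit on held-out data. The sketch below does this with ridge regression on synthetic stand-in data; it illustrates the general approach, not the paper's exact pipeline or datasets.

```python
# Illustrative sketch of a model-to-brain comparison: fit a linear map
# from a network's hidden activations to recorded brain responses and
# score it by held-out correlation. Synthetic data stand in for the
# study's fMRI/ECoG recordings; this is not the paper's pipeline.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_sentences, n_units, n_voxels = 200, 768, 50

# Pretend these are per-sentence hidden activations from a language model.
model_acts = rng.normal(size=(n_sentences, n_units))
# Pretend brain responses are a noisy linear readout of those activations.
true_map = rng.normal(size=(n_units, n_voxels))
brain_resp = model_acts @ true_map + rng.normal(scale=5.0, size=(n_sentences, n_voxels))

X_train, X_test, y_train, y_test = train_test_split(
    model_acts, brain_resp, test_size=0.2, random_state=0
)
reg = Ridge(alpha=1.0).fit(X_train, y_train)
pred = reg.predict(X_test)

# Score: mean Pearson correlation across voxels on held-out sentences.
corrs = [np.corrcoef(pred[:, v], y_test[:, v])[0, 1] for v in range(n_voxels)]
print(f"mean held-out correlation: {np.mean(corrs):.3f}")
```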
"We found that the models that predict the neural responses well also tend to best predict human behavior responses, in the form of reading times. And then both of these are explained by the model performance on next-word prediction. This triangle really connects everything together," Schrimpf says. "A key takeaway from this work is that language processing is a highly constrained problem: The best solutions to it that AI engineers have created end up being similar, as this paper shows, to the solutions found by the evolutionary process that created the human brain. Since the AI network didn't seek to mimic the brain directly -- but does end up looking brain-like -- this suggests that, in a sense, a kind of convergent evolution has occurred between AI and nature," says Daniel Yamins, an assistant professor of psychology and computer science at Stanford University, who was not involved in the study. Game changer One of the key computational features of predictive models such as GPT-3 is an element known as a forward one-way predictive transformer. This kind of transformer is able to make predictions of what is going to come next, based on previous sequences. A significant feature of this transformer is that it can make predictions based on a very long prior context (hundreds of words), not just the last few words. Scientists have not found any brain circuits or learning mechanisms that correspond to this type of processing, Tenenbaum says. However, the new findings are consistent with hypotheses that have been previously proposed that prediction is one of the key functions in language processing, he says. "One of the challenges of language processing is the real-time aspect of it," he says. "Language comes in, and you have to keep up with it and be able to make sense of it in real time." The researchers now plan to build variants of these language processing models to see how small changes in their architecture affect their performance and their ability to fit human neural data. "For me, this result has been a game changer," Fedorenko says. "It's totally transforming my research program, because I would not have predicted that in my lifetime we would get to these computationally explicit models that capture enough about the brain so that we can actually leverage them in understanding how the brain works." The researchers also plan to try to combine these high-performing language models with some computer models Tenenbaum's lab has previously developed that can perform other kinds of tasks such as constructing perceptual representations of the physical world. "If we're able to understand what these language models do and how they can connect to models which do things that are more like perceiving and thinking, then that can give us more integrative models of how things work in the brain," Tenenbaum says. "This could take us toward better artificial intelligence models, as well as giving us better models of how more of the brain works and how general intelligence emerges, than we've had in the past." The research was funded by a Takeda Fellowship; the MIT Shoemaker Fellowship; the Semiconductor Research Corporation; the MIT Media Lab Consortia; the MIT Singleton Fellowship; the MIT Presidential Graduate Fellowship; the Friends of the McGovern Institute Fellowship; the MIT Center for Brains, Minds, and Machines, through the National Science Foundation; the National Institutes of Health; MIT's Department of Brain and Cognitive Sciences; and the McGovern Institute. 
Other authors of the paper are Idan Blank PhD '16 and graduate students Greta Tuckute, Carina Kauf, and Eghbal Hosseini.
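The "forward one-way" property described earlier can be seen directly in a causal attention mask: each position attends only to itself and earlier positions, however long the preceding context. Below is a minimal sketch in plain PyTorch with toy dimensions; it illustrates the masking idea only, not a full transformer.

```python
# Minimal sketch of the causal ("forward one-way") attention mask that
# lets decoder-style transformers such as GPT-3 predict each word from
# all earlier words -- and only earlier words. Toy sizes, not a full model.
import torch

seq_len, d = 6, 8                      # toy sequence length and width
q = k = v = torch.randn(seq_len, d)    # toy query/key/value vectors

scores = q @ k.T / d ** 0.5            # raw scaled attention scores

# Causal mask: position i may attend to positions 0..i, never the future.
mask = torch.triu(torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1)
scores = scores.masked_fill(mask, float("-inf"))

attn = torch.softmax(scores, dim=-1)   # each row sums to 1 over the allowed past
out = attn @ v

print(attn.round(decimals=2))          # upper triangle is all zeros
```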