https://news.mit.edu/2024/mit-researchers-develop-efficiency-training-more-reliable-ai-agents-1122

MIT researchers develop an efficient way to train more reliable AI agents

The technique could make AI systems better at complex tasks that involve variability.

Adam Zewe | MIT News
Publication Date: November 22, 2024

[Image caption: MIT researchers develop an efficient approach for training more reliable reinforcement learning models, focusing on complex tasks that involve variability. Image: MIT News; iStock]

Fields ranging from robotics to medicine to political science are attempting to train AI systems to make meaningful decisions of all kinds. For example, using an AI system to intelligently control traffic in a congested city could help motorists reach their destinations faster, while improving safety or sustainability.

Unfortunately, teaching an AI system to make good decisions is no easy task. Reinforcement learning models, which underlie these AI decision-making systems, still often fail when faced with even small variations in the tasks they are trained to perform. In the case of traffic, a model might struggle to control a set of intersections with different speed limits, numbers of lanes, or traffic patterns.
To boost the reliability of reinforcement learning models for complex tasks with variability, MIT researchers have introduced a more efficient algorithm for training them.

The algorithm strategically selects the best tasks for training an AI agent so it can effectively perform all tasks in a collection of related tasks. In the case of traffic signal control, each task could be one intersection in a task space that includes all intersections in the city.

By focusing on a smaller number of intersections that contribute the most to the algorithm's overall effectiveness, this method maximizes performance while keeping the training cost low.

The researchers found that their technique was between five and 50 times more efficient than standard approaches on an array of simulated tasks. This gain in efficiency helps the algorithm learn a better solution faster, ultimately improving the performance of the AI agent.

"We were able to see incredible performance improvements, with a very simple algorithm, by thinking outside the box. An algorithm that is not very complicated stands a better chance of being adopted by the community because it is easier to implement and easier for others to understand," says senior author Cathy Wu, the Thomas D. and Virginia W. Cabot Career Development Associate Professor in Civil and Environmental Engineering (CEE) and the Institute for Data, Systems, and Society (IDSS), and a member of the Laboratory for Information and Decision Systems (LIDS).

She is joined on the paper by lead author Jung-Hoon Cho, a CEE graduate student; Vindula Jayawardana, a graduate student in the Department of Electrical Engineering and Computer Science (EECS); and Sirui Li, an IDSS graduate student. The research will be presented at the Conference on Neural Information Processing Systems.

Finding a middle ground

To train an algorithm to control traffic lights at many intersections in a city, an engineer would typically choose between two main approaches. She can train one algorithm for each intersection independently, using only that intersection's data, or train a larger algorithm using data from all intersections and then apply it to each one. But each approach comes with its share of downsides. Training a separate algorithm for each task (such as a given intersection) is a time-consuming process that requires an enormous amount of data and computation, while training one algorithm for all tasks often leads to subpar performance.

Wu and her collaborators sought a sweet spot between these two approaches. For their method, they choose a subset of tasks and train one algorithm for each task independently. Importantly, they strategically select the individual tasks that are most likely to improve the algorithm's overall performance on all tasks.

They leverage a common trick from the reinforcement learning field called zero-shot transfer learning, in which an already trained model is applied to a new task without being further trained. With transfer learning, the model often performs remarkably well on the new, neighboring task, as illustrated in the short sketch below.

"We know it would be ideal to train on all the tasks, but we wondered if we could get away with training on a subset of those tasks, apply the result to all the tasks, and still see a performance increase," Wu says.

To identify which tasks they should select to maximize expected performance, the researchers developed an algorithm called Model-Based Transfer Learning (MBTL).
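Zero-shot transfer is the building block MBTL relies on, so a minimal illustration may help before getting into the algorithm's components. The sketch below is purely illustrative and is not the researchers' code: each task is reduced to a single number (say, an intersection's traffic intensity), "training" simply tunes a parameter to that number, and transferred performance decays as tasks drift apart.

```python
# Illustrative stand-ins only -- not the MBTL authors' code.
def train_policy(task: float) -> float:
    """Stand-in for reinforcement learning: return a policy tuned to one task."""
    return task

def evaluate(policy: float, task: float) -> float:
    """Stand-in for evaluation: performance drops as the task drifts from the policy."""
    return 1.0 - abs(policy - task)

tasks = [0.1, 0.3, 0.5, 0.7, 0.9]                  # five related tasks (e.g., intersections)
policy = train_policy(tasks[2])                    # train once, on the middle task
scores = {t: evaluate(policy, t) for t in tasks}   # zero-shot: no further training
print(scores)                                      # the frozen policy does best on nearby tasks
```

The point of the sketch is only the workflow: train once, then reuse the unchanged policy on neighboring tasks and measure how gracefully its performance degrades.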
The MBTL algorithm has two pieces. First, it models how well each algorithm would perform if it were trained independently on one task. Second, it models how much each algorithm's performance would degrade if it were transferred to each other task, a concept known as generalization performance.

Explicitly modeling generalization performance allows MBTL to estimate the value of training on a new task.

MBTL does this sequentially, choosing the task which leads to the highest performance gain first, then selecting additional tasks that provide the biggest subsequent marginal improvements to overall performance. (A schematic sketch of this selection loop appears at the end of this article.) Since MBTL focuses only on the most promising tasks, it can dramatically improve the efficiency of the training process.

Reducing training costs

When the researchers tested this technique on simulated tasks, including controlling traffic signals, managing real-time speed advisories, and executing several classic control tasks, it was five to 50 times more efficient than other methods. This means they could arrive at the same solution by training on far less data. For instance, with a 50x efficiency boost, the MBTL algorithm could train on just two tasks and achieve the same performance as a standard method that uses data from 100 tasks.

"From the perspective of the two main approaches, that means data from the other 98 tasks was not necessary or that training on all 100 tasks is confusing to the algorithm, so the performance ends up worse than ours," Wu says.

With MBTL, adding even a small amount of additional training time could lead to much better performance.

In the future, the researchers plan to design MBTL algorithms that can extend to more complex problems, such as high-dimensional task spaces. They are also interested in applying their approach to real-world problems, especially in next-generation mobility systems.

The research is funded, in part, by a National Science Foundation CAREER Award, the Kwanjeong Educational Foundation PhD Scholarship Program, and an Amazon Robotics PhD Fellowship.

Paper: "Model-Based Transfer Learning for Contextual Reinforcement Learning"
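For readers curious how the sequential selection idea might look in code, here is a schematic sketch under simplifying assumptions; it is not the published MBTL algorithm. It assumes we already have estimates, for every (source, target) pair of tasks, of how well a policy trained on the source would perform when transferred zero-shot to the target (training performance on the diagonal, degraded generalization performance elsewhere), and it greedily adds the training task that yields the largest marginal improvement to overall performance.

```python
# Schematic sketch of greedy task selection (illustrative; not the published MBTL code).
def greedy_task_selection(tasks, estimated_transfer, budget):
    """
    tasks: list of task identifiers.
    estimated_transfer: dict mapping (source, target) -> estimated performance of a
        policy trained on `source` and applied zero-shot to `target`.
    budget: how many training tasks we can afford.
    """
    selected = []
    best = {t: 0.0 for t in tasks}  # performance achievable on each task so far (0 = untrained baseline)

    for _ in range(budget):
        def marginal_gain(candidate):
            # total improvement across all tasks if we also trained on `candidate`
            return sum(max(0.0, estimated_transfer[(candidate, t)] - best[t]) for t in tasks)

        candidate = max((t for t in tasks if t not in selected), key=marginal_gain)
        selected.append(candidate)
        for t in tasks:  # update what is now achievable on every task
            best[t] = max(best[t], estimated_transfer[(candidate, t)])
    return selected


# Toy usage: performance decays linearly with the gap between source and target tasks.
tasks = [0.1, 0.3, 0.5, 0.7, 0.9]
estimates = {(s, t): 1.0 - abs(s - t) for s in tasks for t in tasks}
print(greedy_task_selection(tasks, estimates, budget=2))  # picks the central task first
```

In this toy setup the greedy loop first trains on the task that helps the whole collection most, then spends its remaining budget wherever the marginal improvement is largest, mirroring the sequential, marginal-improvement logic described above.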