https://news.mit.edu/2024/mit-researchers-develop-efficiency-training-more-reliable-ai-agents-1122

MIT researchers develop an efficient way to train more reliable AI agents

The technique could make AI systems better at complex tasks that involve variability.

Adam Zewe | MIT News
Publication Date: November 22, 2024

[Image caption: MIT researchers develop an efficient approach for training more reliable reinforcement learning models, focusing on complex tasks that involve variability. Image: MIT News; iStock]

Fields ranging from robotics to medicine to political science are attempting to train AI systems to make meaningful decisions of all kinds. For example, using an AI system to intelligently control traffic in a congested city could help motorists reach their destinations faster, while improving safety or sustainability.

Unfortunately, teaching an AI system to make good decisions is no easy task. Reinforcement learning models, which underlie these AI decision-making systems, still often fail when faced with even small variations in the tasks they are trained to perform. In the case of traffic, a model might struggle to control a set of intersections with different speed limits, numbers of lanes, or traffic patterns.
To boost the reliability of reinforcement learning models for complex tasks with variability, MIT researchers have introduced a more efficient algorithm for training them.

The algorithm strategically selects the best tasks for training an AI agent so it can effectively perform all tasks in a collection of related tasks. In the case of traffic signal control, each task could be one intersection in a task space that includes all intersections in the city.

By focusing on a smaller number of intersections that contribute the most to the algorithm's overall effectiveness, this method maximizes performance while keeping the training cost low.

The researchers found that their technique was between five and 50 times more efficient than standard approaches on an array of simulated tasks. This gain in efficiency helps the algorithm learn a better solution faster, ultimately improving the performance of the AI agent.

"We were able to see incredible performance improvements, with a very simple algorithm, by thinking outside the box. An algorithm that is not very complicated stands a better chance of being adopted by the community because it is easier to implement and easier for others to understand," says senior author Cathy Wu, the Thomas D. and Virginia W. Cabot Career Development Associate Professor in Civil and Environmental Engineering (CEE) and the Institute for Data, Systems, and Society (IDSS), and a member of the Laboratory for Information and Decision Systems (LIDS).

She is joined on the paper by lead author Jung-Hoon Cho, a CEE graduate student; Vindula Jayawardana, a graduate student in the Department of Electrical Engineering and Computer Science (EECS); and Sirui Li, an IDSS graduate student. The research will be presented at the Conference on Neural Information Processing Systems.

Finding a middle ground

To train an algorithm to control traffic lights at many intersections in a city, an engineer would typically choose between two main approaches. She can train one algorithm for each intersection independently, using only that intersection's data, or train a larger algorithm using data from all intersections and then apply it to each one. But each approach comes with its share of downsides. Training a separate algorithm for each task (such as a given intersection) is a time-consuming process that requires an enormous amount of data and computation, while training one algorithm for all tasks often leads to subpar performance.

Wu and her collaborators sought a sweet spot between these two approaches. For their method, they choose a subset of tasks and train one algorithm for each task independently. Importantly, they strategically select the individual tasks that are most likely to improve the algorithm's overall performance on all tasks.

They leverage a common trick from the reinforcement learning field called zero-shot transfer learning, in which an already trained model is applied to a new task without being further trained. With transfer learning, the model often performs remarkably well on the new, neighboring task, as illustrated in the short sketch below.

"We know it would be ideal to train on all the tasks, but we wondered if we could get away with training on a subset of those tasks, apply the result to all the tasks, and still see a performance increase," Wu says.

To identify which tasks they should select to maximize expected performance, the researchers developed an algorithm called Model-Based Transfer Learning (MBTL).
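Zero-shot transfer is the building block MBTL relies on, so a minimal illustration may help before getting into the algorithm's components. The sketch below is purely illustrative and is not the researchers' code: each task is reduced to a single number (say, an intersection's traffic intensity), "training" simply tunes a parameter to that number, and transferred performance decays as tasks drift apart.

```python
# Illustrative stand-ins only -- not the MBTL authors' code.
def train_policy(task: float) -> float:
    """Stand-in for reinforcement learning: return a policy tuned to one task."""
    return task

def evaluate(policy: float, task: float) -> float:
    """Stand-in for evaluation: performance drops as the task drifts from the policy."""
    return 1.0 - abs(policy - task)

tasks = [0.1, 0.3, 0.5, 0.7, 0.9]                  # five related tasks (e.g., intersections)
policy = train_policy(tasks[2])                    # train once, on the middle task
scores = {t: evaluate(policy, t) for t in tasks}   # zero-shot: no further training
print(scores)                                      # the frozen policy does best on nearby tasks
```

The point of the sketch is only the workflow: train once, then reuse the unchanged policy on neighboring tasks and measure how gracefully its performance degrades.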
The MBTL algorithm has two pieces. First, it models how well each algorithm would perform if it were trained independently on one task. Second, it models how much each algorithm's performance would degrade if it were transferred to each other task, a concept known as generalization performance.

Explicitly modeling generalization performance allows MBTL to estimate the value of training on a new task.

MBTL does this sequentially, choosing the task which leads to the highest performance gain first, then selecting additional tasks that provide the biggest subsequent marginal improvements to overall performance. (A schematic sketch of this selection loop appears at the end of this article.) Since MBTL focuses only on the most promising tasks, it can dramatically improve the efficiency of the training process.

Reducing training costs

When the researchers tested this technique on simulated tasks, including controlling traffic signals, managing real-time speed advisories, and executing several classic control tasks, it was five to 50 times more efficient than other methods. This means they could arrive at the same solution by training on far less data. For instance, with a 50x efficiency boost, the MBTL algorithm could train on just two tasks and achieve the same performance as a standard method that uses data from 100 tasks.

"From the perspective of the two main approaches, that means data from the other 98 tasks was not necessary or that training on all 100 tasks is confusing to the algorithm, so the performance ends up worse than ours," Wu says.

With MBTL, adding even a small amount of additional training time could lead to much better performance.

In the future, the researchers plan to design MBTL algorithms that can extend to more complex problems, such as high-dimensional task spaces. They are also interested in applying their approach to real-world problems, especially in next-generation mobility systems.

The research is funded, in part, by a National Science Foundation CAREER Award, the Kwanjeong Educational Foundation PhD Scholarship Program, and an Amazon Robotics PhD Fellowship.

Paper: "Model-Based Transfer Learning for Contextual Reinforcement Learning"
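For readers curious how the sequential selection idea might look in code, here is a schematic sketch under simplifying assumptions; it is not the published MBTL algorithm. It assumes we already have estimates, for every (source, target) pair of tasks, of how well a policy trained on the source would perform when transferred zero-shot to the target (training performance on the diagonal, degraded generalization performance elsewhere), and it greedily adds the training task that yields the largest marginal improvement to overall performance.

```python
# Schematic sketch of greedy task selection (illustrative; not the published MBTL code).
def greedy_task_selection(tasks, estimated_transfer, budget):
    """
    tasks: list of task identifiers.
    estimated_transfer: dict mapping (source, target) -> estimated performance of a
        policy trained on `source` and applied zero-shot to `target`.
    budget: how many training tasks we can afford.
    """
    selected = []
    best = {t: 0.0 for t in tasks}  # performance achievable on each task so far (0 = untrained baseline)

    for _ in range(budget):
        def marginal_gain(candidate):
            # total improvement across all tasks if we also trained on `candidate`
            return sum(max(0.0, estimated_transfer[(candidate, t)] - best[t]) for t in tasks)

        candidate = max((t for t in tasks if t not in selected), key=marginal_gain)
        selected.append(candidate)
        for t in tasks:  # update what is now achievable on every task
            best[t] = max(best[t], estimated_transfer[(candidate, t)])
    return selected


# Toy usage: performance decays linearly with the gap between source and target tasks.
tasks = [0.1, 0.3, 0.5, 0.7, 0.9]
estimates = {(s, t): 1.0 - abs(s - t) for s in tasks for t in tasks}
print(greedy_task_selection(tasks, estimates, budget=2))  # picks the central task first
```

In this toy setup the greedy loop first trains on the task that helps the whole collection most, then spends its remaining budget wherever the marginal improvement is largest, mirroring the sequential, marginal-improvement logic described above.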