The saying “practice makes perfect,” typically applied to humans, is also an excellent principle for robots operating in new and unfamiliar environments.
Imagine a robot arriving at a warehouse with the skills it was trained on, such as placing an object, but now needing to pick items from an unfamiliar shelf. Initially, the robot struggles as it must adapt to its new environment. To improve, it must identify which aspects of its skills need enhancement and then specialize in refining those actions.
While a human could manually program the robot to optimize its performance, researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) and The AI Institute have developed a more efficient approach. Their “Estimate, Extrapolate, and Situate” (EES) algorithm, presented at the Robotics: Science and Systems Conference, allows robots to independently practice and improve their skills, potentially enhancing their performance in factories, homes, and hospitals.
The study is available on the arXiv preprint server.
Assessing the Situation
To enhance a robot’s skills, such as floor sweeping, the EES algorithm pairs with a vision system that monitors the robot’s surroundings. Following its name, the algorithm first estimates how reliably the robot performs a given action (like sweeping), then extrapolates how much that reliability would improve with additional practice, and finally situates the robot to practice the skill predicted to benefit most. The vision system then checks whether each practice attempt succeeded.
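The estimate–extrapolate–situate loop described above might be sketched as follows. This is only an illustrative sketch, not the paper’s method: the function names, the Beta-style competence estimate, and the “close a fraction of the gap to a ceiling” improvement model are all assumptions made for the example.

```python
def estimate_competence(successes, attempts, prior_successes=1, prior_attempts=2):
    """Estimate a skill's success probability from observed practice outcomes.

    Uses a simple smoothed (Beta-prior-style) ratio; this is an assumed
    model, not the one from the EES paper.
    """
    return (successes + prior_successes) / (attempts + prior_attempts)


def extrapolate_gain(current, ceiling=0.95, learning_rate=0.5):
    """Predict how much more practice would improve a skill.

    Assumed model: each round of practice closes a fixed fraction of the
    gap between current competence and a performance ceiling.
    """
    return learning_rate * max(ceiling - current, 0.0)


def situate(skills):
    """Pick the skill whose predicted improvement from practice is largest."""
    return max(
        skills,
        key=lambda s: extrapolate_gain(
            estimate_competence(s["successes"], s["attempts"])
        ),
    )


# Hypothetical practice history: sweeping is unreliable, placing is solid.
skills = [
    {"name": "sweep", "successes": 1, "attempts": 10},
    {"name": "place", "successes": 8, "attempts": 10},
]
best = situate(skills)
print(best["name"])  # the under-practiced skill is selected
```

Under these assumed models, the robot would be directed to practice sweeping, since the weaker skill has the larger predicted gain; after each attempt, the outcome (as judged by the vision system) would update that skill’s success counts.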
EES could be valuable in various settings, including hospitals, factories, homes, or coffee shops. For instance, a robot cleaning a living room would need practice in tasks like sweeping. According to Nishanth Kumar SM ’24 and his colleagues, EES can help the robot improve with minimal human intervention, using just a few practice trials.
“Initially, we questioned whether specialization could be achieved with a reasonable number of samples on a real robot,” says Kumar, co-lead author of the study, Ph.D. student in electrical engineering and computer science, and CSAIL affiliate.
“Now, we have an algorithm that allows robots to significantly improve at specific tasks with just tens or hundreds of data points, compared to the thousands or millions typically required by conventional reinforcement learning algorithms.”
Demonstrating Improvement
EES’s efficiency was demonstrated using Boston Dynamics’ Spot quadruped during trials at The AI Institute. The robot, equipped with an arm, successfully completed manipulation tasks after just a few hours of practice. In one instance, the robot learned to place a ball and ring on a slanted table in about three hours. In another case, it improved its ability to sweep toys into a bin in approximately two hours. These results represent a significant improvement over previous methods, which would have required more than 10 hours per task.
“We aimed to enable the robot to independently gather experience and identify the most effective strategies for its deployment,” says co-lead author Tom Silver SM ’20, Ph.D. ’24, an electrical engineering and computer science alumnus and CSAIL affiliate who is now an assistant professor at Princeton University.
“By focusing on what the robot already knows, we sought to determine which skills would be most beneficial to practice at any given time.”
While EES shows promise for autonomous practice in new environments, it currently has limitations. For example, the researchers used low tables to make objects easier for the robot to see, and 3D-printed an attachable handle so that Spot could grip the brush more easily. In some trials, the robot failed to detect certain items or misidentified their locations; those attempts were counted as failures.
Future Directions
The researchers believe that combining real and simulated practice could accelerate skill development further. They aim to refine EES to reduce latency and overcome imaging delays experienced during trials. Future work may involve developing algorithms that manage practice sequences more effectively rather than simply selecting skills to refine.
“Enabling robots to autonomously learn is both highly useful and challenging,” says Danfei Xu, an assistant professor at Georgia Tech and research scientist at NVIDIA AI, who was not involved in the study.
“In the future, home robots will be sold to a variety of households and expected to perform diverse tasks. Pre-programming every potential task is impractical, so robots must learn on the job. However, allowing robots to explore and learn without guidance can be slow and may lead to unintended results.
“The research by Silver and his team introduces an algorithm that enables robots to practice autonomously and systematically, representing a significant step toward developing home robots that can continuously improve and adapt.”
Silver and Kumar’s co-authors include The AI Institute researchers Stephen Proulx and Jennifer Barry, as well as CSAIL members Linfeng Zhao, Willie McClinton, and professors Leslie Pack Kaelbling and Tomás Lozano-Pérez. Their work was supported by The AI Institute, the U.S. National Science Foundation, the U.S. Air Force Office of Scientific Research, the U.S. Office of Naval Research, the U.S. Army Research Office, and MIT Quest for Intelligence, with high-performance computing resources provided by the MIT SuperCloud and Lincoln Laboratory Supercomputing Center.