
From Reinforcement Learning to Robot Learning: Leveraging Prior Data and Shared Evaluation

PhD Thesis, Tech. Report CMU-RI-TR-23-36, July 2023

Abstract

Deep learning holds promise for learning complex patterns from data, which is especially useful when the input or output space is high-dimensional. In robot learning, both the input (images or other sensor data) and the output (actions such as joint angles) can be high-dimensional, suggesting that deep learning could be especially well-suited to making progress on challenging problems in robotics.

However, unlike most machine learning applications, robotics involves physical constraints that make off-the-shelf learning challenging. Robots are expensive and typically require human involvement for resetting environments and fixing hardware. These constraints make large-scale data collection and training difficult, presenting a major roadblock to applying today's data-intensive algorithms. Robot learning faces an additional roadblock in evaluation: every physical space is different, making results inconsistent across labs.

Two common assumptions of the robot learning paradigm limit data efficiency. First, an agent typically assumes an isolated environment and no prior knowledge or experience: learning is done tabula rasa. Second, agents typically receive only image observations as input, relying on vision alone to learn tasks. In the real world, however, humans learn with many senses across many environments and bring prior experience to each new task. Lifting these assumptions is not only natural but also crucial for feasibility in real robotics, where collecting many samples from deployed physical systems is cost-prohibitive.

In this thesis, I present work that lifts these two assumptions, improving the data efficiency of robot learning by leveraging multimodality and pretraining. First, I show how multimodal sensing like sight and sound can provide rich self-supervision (Chapter 2). Second, I introduce a framework for pretraining and evaluating self-supervised exploration via environment transfer (Chapter 3). In Chapter 4, I apply these ideas to real-world manipulation, combining the benefits of large-scale pretraining and multimodality through audio-visual pretraining for contact microphones. Finally, drawing upon the benchmarking efforts from Chapter 3, I introduce a real-robot benchmark for evaluating the generalization of both visual and policy learning methods via shared data and hardware (Chapter 5).

BibTeX

@phdthesis{Dean-2023-136969,
author = {Victoria Dean},
title = {From Reinforcement Learning to Robot Learning: Leveraging Prior Data and Shared Evaluation},
year = {2023},
month = {July},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-23-36},
keywords = {robot learning, reinforcement learning, manipulation, exploration, benchmarking, pretraining},
}