
Learning with Structured Priors for Robust Robot Manipulation

Jacky Liang
PhD Thesis, Tech. Report, CMU-RI-TR-22-66, Robotics Institute, Carnegie Mellon University, December, 2022


Robust and generalizable robots that can autonomously manipulate objects in semi-structured environments can bring material benefits to society. Data-driven learning approaches are crucial for enabling such systems by identifying and exploiting patterns in semi-structured environments, allowing robots to adapt to novel scenarios with minimal human supervision. However, despite significant prior work in learning for robot manipulation, large gaps remain before robots can be widely deployed in the real world. This thesis addresses three particular challenges to advance toward this goal: sensing in semi-structured environments, adapting manipulation to novel scenarios, and flexible planning for diverse skills and tasks. A common theme among the discussed approaches is enabling efficient and generalizable learning by incorporating “structures”, or priors specific to robot manipulation, into the design and implementation of learning algorithms. The works in this thesis are organized around these three challenges.

We first leverage contact-based sensing in scenarios that are difficult for vision-based perception. In one work, we use contact feedback to track in-hand object poses during dexterous manipulation. In another, we learn to localize contact on the surface of robot arms to enable whole-arm sensing.

Next, we explore adapting manipulation to novel objects and environments for both model-based and model-free skills. We show how learning task-oriented interactive perception can improve the performance of downstream model-based skills by identifying relevant dynamics parameters. We also show how using object-centric action spaces can make deep reinforcement learning of model-free skills more efficient and generalizable.
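To illustrate the general idea of an object-centric action space (a minimal generic sketch, not the thesis code; the function name and planar 2-D setup are assumptions for illustration): the policy outputs an offset expressed in the object's frame, which is mapped to a world-frame target, so the same learned action transfers as the object's pose changes.

```python
import numpy as np

def object_centric_to_world(obj_pos, obj_yaw, action_offset):
    """Map a policy action (dx, dy) expressed in the object's frame to a
    world-frame target position. Illustrative planar 2-D sketch."""
    c, s = np.cos(obj_yaw), np.sin(obj_yaw)
    R = np.array([[c, -s], [s, c]])  # rotation of the object frame
    return np.asarray(obj_pos) + R @ np.asarray(action_offset)

# The same object-relative action (e.g. approach 5 cm "behind" the object)
# yields different world-frame targets as the object moves or rotates,
# without retraining the policy.
target = object_centric_to_world((0.5, 0.2), np.pi / 2, (-0.05, 0.0))
```

Because the policy reasons in the object's frame, it does not need to re-learn the mapping from world coordinates to actions for every new object pose.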

Lastly, we explore flexible planning methods to leverage low-level skills for more complex manipulation tasks. We develop a search-based task planner that relaxes assumptions on skill and task representations made in prior work by learning skill-level dynamics models. This planner is then applied in a follow-up work that uses learned preconditions of Hybrid Force-Velocity Controllers to perform multi-step contact-rich manipulation tasks. We also explore planning for more flexible tasks described by natural language by using code as the structured action space. This is done by prompting Large Language Models to directly map natural language task instructions to robot policy code, which orchestrates existing robot perception and skill libraries to complete the task.
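The idea of code as a structured action space can be sketched as follows (a hedged illustration, not the thesis implementation: the primitives `detect_objects` and `pick_place` are hypothetical stand-ins for a robot's perception and skill libraries). An LLM would be prompted with the available API and a natural language instruction such as "put the red block in the bin", and would emit policy code like the function below, which composes the primitives to complete the task.

```python
# Hypothetical perception and skill primitives standing in for a robot API;
# the names and signatures here are illustrative, not from the thesis.
def detect_objects(description):
    # Stub: a real system would query a perception module here.
    scene = {"red block": (0.4, 0.1), "bin": (0.6, -0.2)}
    return [scene[description]] if description in scene else []

executed = []  # log of skill invocations, for inspection

def pick_place(pick_xy, place_xy):
    # Stub: a real system would invoke a low-level manipulation skill.
    executed.append(("pick_place", pick_xy, place_xy))

# Policy code of the kind an LLM might generate for the instruction
# "put the red block in the bin", orchestrating the primitives above.
def put_red_block_in_bin():
    blocks = detect_objects("red block")
    bins = detect_objects("bin")
    if blocks and bins:
        pick_place(blocks[0], bins[0])

put_red_block_in_bin()
```

Because the generated program is ordinary code, it can express conditionals, loops, and composition over the skill library, which is what makes code a more flexible action space than a fixed set of skill identifiers.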


@phdthesis{CMU-RI-TR-22-66,
  author = {Jacky Liang},
  title = {Learning with Structured Priors for Robust Robot Manipulation},
  year = {2022},
  month = {December},
  school = {Carnegie Mellon University},
  address = {Pittsburgh, PA},
  number = {CMU-RI-TR-22-66},
  keywords = {Robot Learning, Manipulation},
}
