Planning and Execution using Inaccurate Models with Provable Guarantees on Task Completeness

PhD Thesis, Tech. Report, CMU-RI-TR-22-05, Robotics Institute, Carnegie Mellon University, February, 2022

Abstract

Modern planning methods are effective in computing feasible and optimal plans for robotic tasks when given access to accurate dynamical models. However, robots operating in the real world often face situations that cannot be modeled perfectly before execution. Thus, we only have access to simplified but potentially inaccurate models. This imperfect modeling can lead to highly suboptimal plans or even the inability to reach the goal during execution. Existing approaches present a learning-based solution where real-world experience is used to learn a complex dynamical model that is subsequently used for planning. However, this requires a prohibitively large amount of experience over the entire state space, and can be wasteful if we are interested in completing the task and not in modeling the dynamics accurately. Furthermore, real robots often have operating constraints and cannot spend hours acquiring experience to learn dynamics. This thesis argues that by updating the behavior of the planner and not the dynamics of the model, we can leverage simplified and potentially inaccurate models and significantly reduce the amount of real-world experience needed to provably guarantee that the robot completes the task.

We support this argument from an algorithmic perspective by presenting two novel algorithms. The first algorithm, CMAX, guarantees that the robot reaches the goal using the inaccurate model without any resets. It achieves this by biasing the planner away from transitions whose dynamics are discovered, during online execution, to be inaccurately modeled. However, CMAX requires strong assumptions on the accuracy of the model used for planning and fails to improve the quality of the solution over repetitions of the same task. The second algorithm, CMAX++, leverages real-world experience to improve the quality of the resulting plans over successive repetitions of a robotic task. CMAX++ achieves this by integrating model-free learning using acquired experience with model-based planning using the potentially inaccurate model. As a consequence, in addition to completeness, CMAX++ guarantees asymptotic convergence to the optimal path cost as the number of repetitions increases, under relaxed assumptions. Crucially, neither algorithm requires any updates to the dynamics of the model, unlike any existing method for planning using inaccurate models.
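The idea of biasing the planner away from mismodeled transitions, rather than correcting the model, can be illustrated with a small sketch. This is not the thesis implementation: it is a toy grid world that replans from scratch with Dijkstra's algorithm at every step, and the names `model_step`, `real_step`, `plan`, `run`, and the `PENALTY` value are all assumptions made for illustration. The model believes every cell is free; the real world contains walls the model does not know about.

```python
import heapq

ACTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}
PENALTY = 1000  # hypothetical cost for transitions found to be mismodeled


def model_step(state, action, size):
    """Inaccurate model: assumes every cell inside the grid is free."""
    dx, dy = ACTIONS[action]
    x, y = state[0] + dx, state[1] + dy
    return (x, y) if 0 <= x < size and 0 <= y < size else state


def real_step(state, action, size, walls):
    """True dynamics: moving into a wall leaves the robot in place."""
    nxt = model_step(state, action, size)
    return state if nxt in walls else nxt


def plan(start, goal, size, incorrect):
    """Dijkstra on the (unchanged) model; returns the first action to take.

    Transitions in `incorrect` keep their modeled outcome but cost PENALTY,
    so the planner is steered away from them without editing the model.
    """
    frontier = [(0, start, "")]
    best = {start: 0}
    while frontier:
        cost, state, first = heapq.heappop(frontier)
        if state == goal:
            return first
        if cost > best[state]:
            continue  # stale queue entry
        for action in ACTIONS:
            nxt = model_step(state, action, size)
            step_cost = PENALTY if (state, action) in incorrect else 1
            new_cost = cost + step_cost
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                heapq.heappush(frontier, (new_cost, nxt, first or action))
    return None


def run(start, goal, size, walls, max_steps=200):
    """Interleave planning and execution, recording mismodeled transitions."""
    state, incorrect, path = start, set(), [start]
    for _ in range(max_steps):
        if state == goal:
            break
        action = plan(state, goal, size, incorrect)
        predicted = model_step(state, action, size)
        state_after = real_step(state, action, size, walls)
        if state_after != predicted:
            incorrect.add((state, action))  # bias future plans away
        state = state_after
        path.append(state)
    return path, incorrect


if __name__ == "__main__":
    path, incorrect = run((0, 0), (4, 0), 5, walls={(2, 0), (2, 1)})
    print("visited:", path)
    print("transitions marked incorrect:", sorted(incorrect))
```

In this toy setting the robot reaches the goal because a path exists that avoids every transition it has marked as incorrect, which loosely mirrors the kind of assumption on the model that the abstract attributes to CMAX. The model function itself is never modified; only the planner's costs change.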

From a theoretical perspective, this thesis presents a performance analysis for methods that leverage inaccurate models in optimal control of linearized systems with quadratic costs. Our analysis shows that naively using inaccurate models can lead to large suboptimality gaps when modeling errors are significant, while updating the behavior of the planner during execution, as CMAX and CMAX++ do, can substantially reduce the suboptimality gap. The thesis concludes by exploring the paradigm of updating the dynamics of the model and presents an algorithm, TOMS, that directly optimizes task performance rather than prediction error. We show that in the online setting, where the robot does not have access to any resets and collects data as it executes, TOMS outperforms prior methods that either optimize a maximum likelihood objective or rely on an offline collected dataset with good coverage.
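To make the notion of a suboptimality gap concrete, the sketch below works through a scalar linear-quadratic example. It is a textbook-style illustration and not the analysis from the thesis: the system x_{t+1} = a x_t + b u_t, the cost weights q and r, and all numeric values are assumptions chosen for illustration. A gain is computed from an inaccurate model and then evaluated on the true system, and its cost is compared with that of the gain computed from the true system.

```python
def lqr_gain(a, b, q, r, iters=1000):
    """Infinite-horizon scalar LQR via fixed-point iteration of the Riccati equation.

    Returns (k, p) such that u = -k * x, with p the cost-to-go coefficient
    under the model (a, b).
    """
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    k = a * b * p / (r + b * b * p)
    return k, p


def closed_loop_cost(k, a, b, q, r, x0=1.0):
    """Total cost of applying u = -k * x on the system (a, b) starting from x0."""
    c = a - b * k  # closed-loop dynamics: x_{t+1} = c * x_t
    if abs(c) >= 1.0:
        return float("inf")  # the controller does not stabilize the system
    # Geometric series: sum over t of (q + r k^2) * c^(2t) * x0^2
    return (q + r * k * k) * x0 * x0 / (1.0 - c * c)


if __name__ == "__main__":
    q, r = 1.0, 1.0
    a_true, b_true = 1.2, 1.0    # assumed true dynamics
    a_model, b_model = 0.9, 1.0  # assumed inaccurate model

    k_opt, _ = lqr_gain(a_true, b_true, q, r)
    k_model, _ = lqr_gain(a_model, b_model, q, r)

    cost_opt = closed_loop_cost(k_opt, a_true, b_true, q, r)
    cost_model = closed_loop_cost(k_model, a_true, b_true, q, r)
    print("optimal cost:", cost_opt)
    print("cost using inaccurate model:", cost_model)
    print("suboptimality gap:", cost_model - cost_opt)
```

Increasing the difference between `a_model` and `a_true` widens the gap, and a sufficiently wrong model yields a gain that fails to stabilize the true system, in which case the cost is reported as infinite.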

BibTeX

@phdthesis{Vemula-2022-130825,
author = {Anirudh Vemula},
title = {Planning and Execution using Inaccurate Models with Provable Guarantees on Task Completeness},
year = {2022},
month = {February},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-22-05},
keywords = {Robotics, Planning, Reinforcement Learning, Numerical Optimization},
}