Fast and High-Quality GPU-based Deliberative Perception for Object Pose Estimation

Master's Thesis, Tech. Report, CMU-RI-TR-20-22, Robotics Institute, Carnegie Mellon University, June, 2020

Abstract

Pose estimation of known objects is fundamental to tasks such as robotic grasping and manipulation. The need for reliable grasping imposes stringent accuracy requirements on pose estimation in cluttered, occluded scenes in dynamic environments. Modern methods employ large sets of training data to learn features and object templates in order to find correspondences between models and observed data. However, these methods require extensive annotation of ground-truth poses. An alternative is to use algorithms that search for the best explanation of the observed scene in a space of possible rendered scenes. A recently developed algorithm, PERCH (PErception Via SeaRCH), does so by using depth data to converge to a globally optimal solution through a search over a specially constructed tree. While PERCH offers strong guarantees on accuracy, the current formulation scales poorly owing to its high runtime. In addition, its sole reliance on depth data for pose estimation restricts the algorithm to scenes where no two objects have the same shape.

In this work, we propose PERCH 2.0, a deliberative pose estimation approach that takes advantage of GPU acceleration and RGB data. We show that our approach can achieve an order of magnitude speedup over PERCH and meets scalability requirements for evaluating thousands of poses in parallel. We demonstrate that the proposed work directly allows for an extension of deliberative pose estimation methods to new domains such as object articulation, conveyor picking, and 6-DoF pose estimation. Our combined deliberative and discriminative framework for 6-DoF pose estimation achieves higher accuracy than purely data-driven approaches without the need for any ground-truth pose annotation.
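
The "search for the best explanation of the observed scene" idea in the abstract can be illustrated with a deliberately tiny render-and-compare loop. The sketch below is not the PERCH or PERCH 2.0 implementation: the 1-D depth scan, the `render`, `mismatch_cost`, and `best_pose` helpers, the background depth, and the pixel-mismatch cost are all hypothetical stand-ins chosen for illustration. It only shows that each candidate pose is rendered and scored independently of the others, which is the property that makes evaluating many poses in parallel natural.

```python
# Toy render-and-compare search over candidate poses (illustrative only;
# all names and the cost function are assumptions, not taken from the thesis).
BACKGROUND_DEPTH = 5.0  # hypothetical depth of the empty scene


def render(profile, offset, width):
    """Render a 1-D depth scan with the object profile placed at `offset`."""
    scan = [BACKGROUND_DEPTH] * width
    for i, depth in enumerate(profile):
        if 0 <= offset + i < width:
            scan[offset + i] = depth
    return scan


def mismatch_cost(rendered, observed, tol=0.05):
    """Count pixels whose rendered and observed depths differ by more than tol."""
    return sum(1 for r, o in zip(rendered, observed) if abs(r - o) > tol)


def best_pose(profile, observed, candidates):
    """Score every candidate offset independently and return the cheapest one."""
    costs = {
        c: mismatch_cost(render(profile, c, len(observed)), observed)
        for c in candidates
    }
    best = min(costs, key=costs.get)
    return best, costs[best]


# Build an "observed" scan: the object sits at offset 4, plus one stray pixel
# standing in for clutter that no rendered hypothesis can explain.
profile = [2.0, 1.8, 2.0]
observed = render(profile, 4, 10)
observed[0] = 4.0

pose, cost = best_pose(profile, observed, range(0, 8))
print(pose, cost)
```

Because the cost of one candidate never depends on another, the dictionary comprehension in `best_pose` could in principle be replaced by a batched or GPU evaluation; how the thesis actually structures that computation is not described in this abstract.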

BibTeX

@mastersthesis{Agarwal-2020-122934,
author = {Aditya Agarwal},
title = {Fast and High-Quality GPU-based Deliberative Perception for Object Pose Estimation},
year = {2020},
month = {June},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-20-22},
keywords = {pose estimation, deliberative perception, manipulation},
}
