Vision-Based Navigation and Deep-Learning Explanation for Autonomy

Master's Thesis, Tech. Report CMU-RI-TR-17-27, Robotics Institute, Carnegie Mellon University, May 2017

Abstract

In this thesis, we investigate vision-based techniques to support mobile robot autonomy in human environments, including techniques for understanding which image features are important to a classification task. Given this broad goal of transparent vision-based autonomy, the work proceeds along three main fronts. First, we present an algorithm that enables a UAV to visually localize and navigate with respect to CoBot, a ground mobile robot, in order to perform visual search tasks. Our approach leverages the robust localization and navigation capabilities of CoBot while allowing the UAV to search for the object of interest in locations that CoBot cannot access. Second, to enable safe UAV navigation using its monocular camera, we contribute a deep-learning-based perception system that avoids obstacles in real time. We demonstrate that, using our system, UAVs can navigate safely in various challenging environments. Finally, we address our goal of justifying vision-based decisions. We investigate an explanation technique for understanding the predictions of a deep-learning-based image classifier. We contribute the Automatic Patch Pattern Labeling for Explanation (APPLE) algorithm, which analyzes a deep network to find neurons that are "important" to the network's classification outcome and automatically labels the patches of the input image that activate these important neurons. We investigate several measures of neuron importance and demonstrate that our technique can be used to gain insight into how a network decomposes an image to make its classification. The performance of each of these contributions is demonstrated through experimental results.
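
As a rough illustration of the two steps the abstract attributes to APPLE (ranking neurons by importance, then locating the input patches that drive them), the following minimal sketch may help. It is not the thesis's implementation. The importance measure (activation times a per-channel weight toward the predicted class), the single unpadded convolutional layer, and the names `rank_neurons` and `neuron_to_patch` are all assumptions made for illustration; the abstract does not specify which measures the thesis uses.

```python
from typing import List, Tuple

Neuron = Tuple[int, int, int]      # (channel, row, col) in the feature map
Patch = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def rank_neurons(
    activations: List[List[List[float]]],
    channel_weights: List[float],
    top_k: int,
) -> List[Neuron]:
    """Return the top_k neurons under an ASSUMED importance measure:
    activation value times the weight of the neuron's channel."""
    scored = []
    for c, feature_map in enumerate(activations):
        for y, row in enumerate(feature_map):
            for x, value in enumerate(row):
                scored.append((value * channel_weights[c], (c, y, x)))
    # Highest score first; Python's sort is stable, so ties keep scan order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [neuron for _, neuron in scored[:top_k]]


def neuron_to_patch(neuron: Neuron, kernel: int, stride: int) -> Patch:
    """Map a neuron to its receptive field in the input image, ASSUMING a
    single convolutional layer with no padding."""
    _, y, x = neuron
    top, left = y * stride, x * stride
    return (top, left, top + kernel, left + kernel)


if __name__ == "__main__":
    # Hypothetical 2-channel, 2x2 feature maps and class weights.
    acts = [
        [[0.1, 0.9], [0.0, 0.2]],
        [[0.5, 0.0], [0.0, 0.3]],
    ]
    weights = [1.0, 2.0]
    for n in rank_neurons(acts, weights, top_k=2):
        print(n, neuron_to_patch(n, kernel=3, stride=2))
```

In a deeper network the receptive field would have to be accumulated across layers, and labeling the resulting patches would require a separate patch classifier; both are omitted here.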

BibTeX

@mastersthesis{Konam-2017-22207,
author = {Sandeep Konam},
title = {Vision-Based Navigation and Deep-Learning Explanation for Autonomy},
year = {2017},
month = {May},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-17-27},
keywords = {Unmanned aerial vehicles, Autonomy, Vision, Navigation, Deep-learning, Interpretability},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.