Verbalization of Service Robot Experience as Explanations in Language Including Vision-Based Learned Elements

Sai Prabhakar Pandi Selvaraj
Master's Thesis, Tech. Report, CMU-RI-TR-17-53, Robotics Institute, Carnegie Mellon University, August, 2017

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author’s copyright. These works may not be reposted without the explicit permission of the copyright holder.


In this thesis, we focus on making robots more trustworthy by enabling them to describe and explain their actions. First, to address the problem of having robots describe their experience, we introduce the concept of verbalization, a parallel to visualization. Our verbalization algorithm can analyze log files as well as the robot's live execution data to produce narratives describing its actions, taking the user's preferences into consideration through the verbalization space. We introduce the verbalization space to cover the variability in the utterances the robot may use to narrate its experience, since different people may be interested in different types of descriptions from the robot.
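As a rough illustration of how user preferences could steer the narrative, the sketch below parameterizes a verbalization preference and selects between a summary and a detailed account. The axis names (abstraction, locality, specificity) follow the verbalization-space framing; the class, function, and log format are hypothetical stand-ins, not the thesis's actual code.

```python
from dataclasses import dataclass

@dataclass
class VerbalizationPreference:
    abstraction: int   # 1 = fine-grained route steps ... 4 = one-line summary
    locality: str      # e.g. "segment", "region", "whole route"
    specificity: str   # e.g. "general picture", "detailed narrative"

def verbalize(route_log, pref: VerbalizationPreference) -> str:
    """Turn a list of (landmark, elapsed_seconds) log entries into a narrative."""
    if pref.abstraction >= 3:
        # High abstraction: summarize the whole route in one sentence.
        total = route_log[-1][1] - route_log[0][1]
        return f"I went from {route_log[0][0]} to {route_log[-1][0]} in {total:.0f} s."
    # Low abstraction: narrate every logged landmark.
    steps = "; ".join(f"passed {lm} at +{t:.0f} s" for lm, t in route_log)
    return f"My route: {steps}."

log = [("office 7001", 0), ("elevator", 35), ("office 8002", 80)]
print(verbalize(log, VerbalizationPreference(4, "whole route", "general picture")))
print(verbalize(log, VerbalizationPreference(1, "segment", "detailed narrative")))
```

The same log yields different utterances depending on the preference, which is the core idea of covering variability with a verbalization space.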

We demonstrate verbalization at multiple points of the verbalization space to describe CoBot's path while performing multi-floor navigation tasks. Our initial formulation of verbalization requires manually grounding the log data to natural language phrases, which makes the algorithm unscalable. To address this problem, we propose using classifiers and similar techniques that act on the robot's data to annotate, or ground, the data automatically. We then discuss and analyze the classifier we use to ground the log data to natural language. We build a DNN-based classifier that identifies the floor CoBot has entered via the elevator, using input from the camera mounted on CoBot. To analyze the classification, we apply different techniques to find the regions of the image that are important for the classification, and we develop metrics to quantify the relative importance of these regions. Finally, using the important regions in an image, we produce a natural language explanation of the classification. We evaluate each algorithm and technique developed in this work and compare them with similar state-of-the-art techniques. Although our work focuses on CoBot, we contribute techniques that generalize beyond it.
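One standard way to find image regions that matter for a classification, in the spirit described above, is occlusion sensitivity: slide a patch over the image and measure how much the class score drops. The sketch below shows the idea on a toy grayscale image; `score_floor` is a hypothetical stand-in for the thesis's CNN floor classifier so the example is self-contained.

```python
import numpy as np

def score_floor(img: np.ndarray) -> float:
    # Toy "classifier": responds to brightness in the top-left corner,
    # as if a floor sign were located there. Stand-in for a real CNN score.
    return float(img[:8, :8].mean())

def occlusion_map(img: np.ndarray, score_fn, patch: int = 8, fill: float = 0.5):
    """Heatmap of score drop when each patch-sized region is occluded."""
    h, w = img.shape
    base = score_fn(img)
    heat = np.zeros((h // patch, w // patch))
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = img.copy()
            occluded[i:i + patch, j:j + patch] = fill
            # Larger drop in score => the region mattered more.
            heat[i // patch, j // patch] = base - score_fn(occluded)
    return heat

img = np.zeros((32, 32))
img[:8, :8] = 1.0                      # bright "floor sign" region
heat = occlusion_map(img, score_floor)
print(np.unravel_index(heat.argmax(), heat.shape))  # cell with the largest drop
```

A heatmap like this can also feed a templated explanation ("the region around the sign was most important"), which is the kind of region-to-language step the thesis describes.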

BibTeX Reference

@mastersthesis{pandiselvaraj2017verbalization,
  author   = {Sai Prabhakar Pandi Selvaraj},
  title    = {Verbalization of Service Robot Experience as Explanations in Language Including Vision-Based Learned Elements},
  year     = {2017},
  month    = {August},
  school   = {Carnegie Mellon University},
  address  = {Pittsburgh, PA},
  number   = {CMU-RI-TR-17-53},
  keywords = {Explainable AI, Human Robot Interaction, Mobile Robots, Vision, Deep Learning, Deep Visualization, Metrics},
}