Transparency in Deep Reinforcement Learning Networks

Master's Thesis, Tech. Report CMU-RI-TR-18-48, Robotics Institute, Carnegie Mellon University, August 2018

Abstract

In recent years there has been growing interest in explainability for machine learning models in general and deep learning in particular. Deep learning-based approaches have made tremendous progress in computer vision, reinforcement learning, and language-related domains, and are increasingly used in application areas such as medicine and finance. But before we fully adopt these models, it is important to understand the motivations behind network decisions. This helps us gain trust in the network, verify that its decisions are fair to those affected by them, and debug the network model. Moreover, it gives us insight into the underlying mechanisms learned by the network and into its limitations, i.e., the domain in which the network performs well and the conditions under which it fails.

In this work, we explore transparency in deep reinforcement learning networks. We focus on answering why a value-based deep reinforcement learning agent took a particular decision, and on identifying attributes in the input space that positively or negatively influence its future actions in a human-interpretable manner. In particular, we discuss the “object saliency” approach at length and demonstrate that it is a simple and effective computational tool for this purpose. We compare and contrast it with existing saliency approaches using a quantitative measure, discuss results from a pilot human experiment on the intuitiveness of object saliency, and show how object saliency can reveal differences in the value functions learned by different RL architectures or training approaches that are not highlighted by existing methods. We also show that it is possible to generate rule-based textual descriptions of object saliency maps for easy interpretation by humans, which is difficult to do with existing approaches.
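To make the idea concrete, the sketch below shows one way an object-level saliency score can be computed for a value-based agent: mask out an object, re-evaluate the Q-function, and take the change in the Q-value as that object's influence. This is an illustration of the general technique rather than the exact procedure used in the thesis; the names object_saliency, q_function, and object_masks, as well as the mean-background fill, are assumptions for the example.

import numpy as np

def object_saliency(q_function, frame, object_masks, action):
    """Per-object saliency for a value-based agent (illustrative sketch).

    q_function   -- callable mapping a frame of shape (H, W) to an array of Q-values, one per action
    frame        -- the original observation, shape (H, W)
    object_masks -- list of boolean masks, one per detected object, each of shape (H, W)
    action       -- index of the action whose Q-value is probed
    """
    base_q = q_function(frame)[action]
    scores = []
    for mask in object_masks:
        perturbed = frame.copy()
        # Replace the object's pixels with a background estimate
        # (here simply the mean of the remaining pixels, a simplifying assumption).
        perturbed[mask] = frame[~mask].mean()
        # Positive score: removing the object lowers the Q-value, so the object
        # supports the chosen action; negative score: the object argues against it.
        scores.append(base_q - q_function(perturbed)[action])
    return np.array(scores)

Painting each object's score back onto its mask yields a saliency map over whole objects rather than individual pixels, which is what makes rule-based textual descriptions of the kind mentioned above tractable.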

BibTeX

@mastersthesis{Sundar-2018-107285,
author = {Ramitha Sundar},
title = {Transparency in Deep Reinforcement Learning Networks},
year = {2018},
month = {August},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-18-48},
keywords = {Transparency, Explainable AI, Reinforcement Learning, Saliency},
}