A Multimodal Dialogue System for Conversational Image Editing

T.-H. Lin, T. Bui, D. S. Kim and J. Oh
Workshop Paper, The Second Workshop on Conversational AI at the Thirty-second Conference on Neural Information Processing Systems (NeurIPS 2018), November, 2018


Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Abstract

In this paper, we present a multimodal dialogue system for Conversational Image Editing. We formulate our multimodal dialogue system as a Partially Observable Markov Decision Process (POMDP) and train it with a Deep Q-Network (DQN) and a user simulator. Our evaluation shows that the DQN policy outperforms a rule-based baseline policy, achieving a 90% success rate even under high error rates. We also conducted a real user study and analyzed real user behavior.
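The training setup the abstract describes (a dialogue policy learned with a DQN against a user simulator) can be sketched in miniature as follows. Everything here is an illustrative assumption, not the paper's implementation: the toy UserSimulator, the reward values, and the single-layer linear Q-network are stand-ins chosen for brevity, and the real system would use richer dialogue states and a deeper network.

```python
import random
import numpy as np


class UserSimulator:
    """Toy stand-in for a user simulator: the agent must pick the action
    matching a hidden user goal within a turn budget. (Illustrative only.)"""

    def __init__(self, n_goals=4, max_turns=5, seed=0):
        self.n_goals = n_goals
        self.max_turns = max_turns
        self.rng = random.Random(seed)

    def reset(self):
        self.goal = self.rng.randrange(self.n_goals)
        self.turn = 0
        return self._obs()

    def _obs(self):
        # One-hot observation of the hidden goal.
        obs = np.zeros(self.n_goals)
        obs[self.goal] = 1.0
        return obs

    def step(self, action):
        self.turn += 1
        if action == self.goal:
            return self._obs(), 1.0, True          # success reward
        return self._obs(), -0.1, self.turn >= self.max_turns  # turn penalty


class DQN:
    """Minimal linear Q-network trained with one-step TD targets
    (no replay buffer or target network, for brevity)."""

    def __init__(self, n_obs, n_actions, lr=0.1, gamma=0.9, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.01, size=(n_obs, n_actions))
        self.lr, self.gamma = lr, gamma

    def q(self, obs):
        return obs @ self.W

    def update(self, obs, action, reward, next_obs, done):
        target = reward if done else reward + self.gamma * self.q(next_obs).max()
        td_error = target - self.q(obs)[action]
        self.W[:, action] += self.lr * td_error * obs


def train(env, agent, episodes=500, eps=0.2, seed=1):
    """Epsilon-greedy training against the simulator."""
    rng = random.Random(seed)
    for _ in range(episodes):
        obs, done = env.reset(), False
        while not done:
            if rng.random() < eps:
                action = rng.randrange(agent.W.shape[1])
            else:
                action = int(np.argmax(agent.q(obs)))
            next_obs, reward, done = env.step(action)
            agent.update(obs, action, reward, next_obs, done)
            obs = next_obs


def success_rate(env, agent, episodes=100):
    """Fraction of greedy-policy dialogues that reach the user's goal."""
    wins = 0
    for _ in range(episodes):
        obs, done = env.reset(), False
        while not done:
            obs, reward, done = env.step(int(np.argmax(agent.q(obs))))
            if reward > 0:
                wins += 1
    return wins / episodes


env = UserSimulator()
agent = DQN(n_obs=4, n_actions=4)
train(env, agent)
```

After training, `success_rate(env, agent)` on this toy task approaches 1.0, which mirrors the kind of simulator-based success-rate evaluation the abstract reports (the paper's 90% figure is measured under injected errors, which this sketch omits).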


@workshop{Lin-2018-113122,
author = {T.-H. Lin and T. Bui and D. S. Kim and J. Oh},
title = {A Multimodal Dialogue System for Conversational Image Editing},
booktitle = {The Second Workshop on Conversational AI at the Thirty-second Conference on Neural Information Processing Systems (NeurIPS 2018)},
year = {2018},
month = {November},
}