Training Strategies for Time Series: Learning for Prediction, Filtering, and Reinforcement Learning

PhD Thesis, Tech. Report, CMU-RI-TR-18-01, Robotics Institute, Carnegie Mellon University, October, 2017


Abstract

Data-driven approaches to modeling time series are important in a variety of applications, from market prediction in economics to the simulation of robotic systems. However, traditional supervised machine learning techniques designed for i.i.d. data often perform poorly on these sequential problems. This thesis proposes that time series and sequential prediction, whether for forecasting, filtering, or reinforcement learning, can be effectively achieved by directly training recurrent prediction procedures rather than building generative probabilistic models.

To this end, we introduce a new training algorithm for learned time-series models, Data as Demonstrator (DaD), that theoretically and empirically improves multi-step prediction performance on model classes such as recurrent neural networks, kernel regressors, and random forests. Additionally, experimental results indicate that DaD can accelerate model-based reinforcement learning. We next show that latent-state time-series models, where a sufficient state parametrization may be unknown, can be learned effectively in a supervised way using predictive representations derived from observations alone. Our approach, Predictive State Inference Machines (PSIMs), directly optimizes the inference performance (through a DaD-style training procedure) without local optima by identifying the recurrent hidden state as a predictive belief over statistics of future observations. Finally, we experimentally demonstrate that augmenting recurrent neural network architectures with Predictive-State Decoders (PSDs), derived using the same objective optimized by PSIMs, improves both the performance and convergence of recurrent networks on probabilistic filtering, imitation learning, and reinforcement learning tasks. Fundamental to our learning framework is that the prediction of observable quantities is a lingua franca for building AI systems.
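The abstract does not spell out the DaD procedure, so the following is only an illustrative sketch of the general idea of training a one-step model on its own multi-step rollouts, with the true next observation as the corrective target. Everything in it is an assumption made for illustration: the linear least-squares model, the function names (fit_linear, rollout, multistep_error, dad_style_train), the number of iterations, and the choice to select the best model on the training trajectories rather than on held-out data.

```python
import numpy as np

def fit_linear(X, Y):
    # Hypothetical one-step model: least-squares fit of Y ≈ X @ W.
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W

def rollout(W, x0, steps):
    # Multi-step prediction: feed the model's own outputs back in as inputs.
    preds = [x0]
    for _ in range(steps):
        preds.append(preds[-1] @ W)
    return np.array(preds)

def multistep_error(W, traj):
    # Mean squared error between a full rollout and the observed trajectory.
    preds = rollout(W, traj[0], len(traj) - 1)
    return float(np.mean((preds - traj) ** 2))

def dad_style_train(trajs, iters=5):
    # Initial dataset: observed state at time t -> observed state at time t+1.
    X = np.concatenate([t[:-1] for t in trajs])
    Y = np.concatenate([t[1:] for t in trajs])
    W = fit_linear(X, Y)
    best_W = W
    best_err = np.mean([multistep_error(W, t) for t in trajs])
    for _ in range(iters):
        for t in trajs:
            preds = rollout(W, t[0], len(t) - 1)
            # Correction pairs: predicted state at time k -> observed state
            # at time k+1, aggregated with all previously collected data.
            X = np.concatenate([X, preds[1:-1]])
            Y = np.concatenate([Y, t[2:]])
        W = fit_linear(X, Y)
        err = np.mean([multistep_error(W, t) for t in trajs])
        # Keep whichever iterate has the lowest multi-step error so far.
        if err < best_err:
            best_W, best_err = W, err
    return best_W, best_err
```

In this sketch, trajs is a list of arrays of shape (T, d), one per observed trajectory. Because the initial one-step fit is included among the candidates, the returned model is never worse in multi-step error on those trajectories than the plain one-step baseline.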

Notes
Thesis Committee: J. Andrew Bagnell (Co-chair); Martial Hebert (Co-chair); Jeff Schneider; Byron Boots (Georgia Institute of Technology)

BibTeX

@phdthesis{Venkatraman-2017-104248,
author = {Arun Venkatraman},
title = {Training Strategies for Time Series: Learning for Prediction, Filtering, and Reinforcement Learning},
year = {2017},
month = {October},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-18-01},
keywords = {Time series, sequential prediction, Bayesian filtering, reinforcement learning, machine learning for robotics},
}
