
Understanding, Exploiting and Improving Inter-view Relationships

PhD Thesis, Tech. Report CMU-RI-TR-22-19, Robotics Institute, Carnegie Mellon University, May 2022

Abstract

Multi-view machine learning has garnered substantial attention in various applications over recent years.
Many such applications involve learning from data obtained from multiple heterogeneous sources of information: for example, multi-sensor systems such as self-driving cars, or the monitoring of intensive-care patients' vital signs at the bedside.

Learning models for such applications can often benefit from leveraging not only the information from individual sources, but also the interactions and relationships between these sources.

In our research, we look at multi-view learning approaches that explicitly model these inter-view interactions.
Here, we define interactions and relationships between views in terms of the information shared across them, including the corroboration and redundancy of that information.

We distinguish between global relationships, which are shared across all views, and local relationships, which are shared only among a subset of views.
For example, in a multi-camera system, global relationships can be thought of as defined over the part of a scene visible to all cameras, while local relationships are defined by the intersection of the fields of view of only some of the cameras.

We consider three main aspects of modeling such relationships.
First, we develop and study a framework for discovering and understanding them within multi-view data.
We describe different approaches to uncover and model these global and local relationships.
We look at simple multi-view extensions of auto-encoders, and then move on to more sophisticated generative models.
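As a rough illustration, the following is a minimal sketch of such a multi-view auto-encoder for two views, written in PyTorch; the architecture, dimensions, and fusion-by-averaging choice are illustrative assumptions, not the exact models developed in the thesis.

import torch
import torch.nn as nn

class TwoViewAutoEncoder(nn.Module):
    """Sketch: one encoder per view into a shared latent, one decoder per view."""
    def __init__(self, dim_v1, dim_v2, dim_latent):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Linear(dim_v1, 64), nn.ReLU(), nn.Linear(64, dim_latent))
        self.enc2 = nn.Sequential(nn.Linear(dim_v2, 64), nn.ReLU(), nn.Linear(64, dim_latent))
        self.dec1 = nn.Sequential(nn.Linear(dim_latent, 64), nn.ReLU(), nn.Linear(64, dim_v1))
        self.dec2 = nn.Sequential(nn.Linear(dim_latent, 64), nn.ReLU(), nn.Linear(64, dim_v2))

    def forward(self, x1, x2):
        # Average the per-view codes as a simple stand-in for "shared" information.
        z = 0.5 * (self.enc1(x1) + self.enc2(x2))
        return self.dec1(z), self.dec2(z)

model = TwoViewAutoEncoder(dim_v1=10, dim_v2=20, dim_latent=8)
x1, x2 = torch.randn(32, 10), torch.randn(32, 20)
r1, r2 = model(x1, x2)
# Reconstructing each view from the fused code pushes the latent to carry
# information that is shared (redundant / corroborated) across both views.
loss = nn.functional.mse_loss(r1, x1) + nn.functional.mse_loss(r2, x2)
loss.backward()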

Second, we explore how this understanding of inter-view relationships can benefit downstream modeling tasks, exploiting the structure that multi-view data affords us. Here, we adapt our models to tackle different applications, and demonstrate the utility and effectiveness of explicitly modeling these relationships.
We first look at incorporating the downstream loss function into the representation learning framework, tailoring the learned representations to the task at hand.
We then consider applications in the domains of image data and temporal data to evaluate the adaptability of our methods.
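As a hedged sketch of this idea, one can simply augment the reconstruction objective of the auto-encoder sketch above with a task-specific loss term; the classifier head, placeholder labels, and trade-off weight below are assumptions made for illustration, not the thesis's formulation.

# Continues the two-view auto-encoder sketch above.
task_head = nn.Linear(8, 2)                    # hypothetical 2-class downstream classifier on the latent
labels = torch.randint(0, 2, (32,))            # placeholder downstream labels
lam = 0.1                                      # reconstruction vs. task trade-off (assumed)

z = 0.5 * (model.enc1(x1) + model.enc2(x2))    # shared code, as in forward()
recon = nn.functional.mse_loss(model.dec1(z), x1) + nn.functional.mse_loss(model.dec2(z), x2)
task = nn.functional.cross_entropy(task_head(z), labels)
(recon + lam * task).backward()                # one joint objective drives representation learning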

Third, we investigate a methodology for improving these relationships directly by facilitating favorable interactions between views.
We first look at how one can re-interpret individual views as data points, which allows us to apply traditional machine learning approaches to modeling inter-view relationships. Using this re-interpretation, we look at view selection, where we directly select views that manifest favorable relationships, and propose Scalable Active Search as a candidate approach. Active Search allows us to interactively search for informative views, given an initial set of views and a measure of similarity between them.
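To make the selection loop concrete, here is a minimal sketch of an active-search-style procedure over views in NumPy; the greedy similarity-based scoring is a simplified stand-in for illustration, not the Scalable Active Search algorithm proposed in the thesis.

import numpy as np

def active_search(sim, oracle, seed_positives, budget):
    """sim: (n, n) view-similarity matrix; oracle(i) -> bool ("is view i informative?")."""
    n = sim.shape[0]
    labeled = {i: True for i in seed_positives}
    for _ in range(budget):
        pos = [i for i, y in labeled.items() if y]
        neg = [i for i, y in labeled.items() if not y]
        unlabeled = [i for i in range(n) if i not in labeled]
        if not unlabeled:
            break
        # Score candidates by net similarity to known informative vs. uninformative views.
        scores = [sim[i, pos].sum() - sim[i, neg].sum() for i in unlabeled]
        query = unlabeled[int(np.argmax(scores))]
        labeled[query] = bool(oracle(query))   # interactively ask for this view's label
    return [i for i, y in labeled.items() if y]

# Example: start from one known informative view and query five more.
# found = active_search(sim_matrix, oracle_fn, seed_positives=[0], budget=5)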

BibTeX

@phdthesis{Venkatesan-2022-131701,
author = {Sibi Venkatesan},
title = {Understanding, Exploiting and Improving Inter-view Relationships},
year = {2022},
month = {May},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-22-19},
keywords = {Multi-view Machine Learning, Auto-Encoders, Generative Modeling, Active Learning, Unsupervised Learning},
}