Beyond Grids: Learning Graph Representations for Visual Recognition

Yin Li and Abhinav Gupta
Conference Paper, Proceedings of Neural Information Processing Systems (NeurIPS), pp. 9225-9235, December 2018

Abstract

We propose learning graph representations from 2D feature maps for visual recognition. Our method draws inspiration from region-based recognition and learns to transform a 2D image into a graph structure. The vertices of the graph define clusters of pixels ("regions"), and the edges measure the similarity between these clusters in a feature space. Our method further learns to propagate information across all vertices of the graph, and is able to project the learned graph representation back onto a 2D grid. Our graph representation facilitates reasoning beyond regular grids and can capture long-range dependencies among regions. We demonstrate that our model can be trained end-to-end and is easily integrated into existing networks. Finally, we evaluate our method on three challenging recognition tasks: semantic segmentation, object detection, and object instance segmentation. For all tasks, our method outperforms state-of-the-art methods.
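The pipeline outlined in the abstract (pixels grouped into vertices, edges from feature similarity, propagation over vertices, projection back to the grid) can be illustrated with a short NumPy sketch. This is not the authors' implementation. The function name `graph_layer`, the distance-based soft assignment, the softmax-normalized adjacency, and the single linear-plus-ReLU propagation step are all assumptions chosen for illustration; in a trained model, `centers` and `weight` would be learned parameters.

```python
import numpy as np

def softmax(x, axis):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def graph_layer(feat, centers, weight):
    """Hypothetical grid -> graph -> grid layer (illustrative only).

    feat:    (H, W, C) 2D feature map
    centers: (K, C) one anchor per graph vertex (assumed parameter)
    weight:  (C, C) propagation weights (assumed parameter)
    """
    h, w, c = feat.shape
    x = feat.reshape(h * w, c)  # N = H * W pixels, each with C features

    # 1. Soft-assign every pixel to the K vertices ("regions").
    #    Assumption: softmax over negative squared distances to the centers.
    d2 = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)  # (N, K)
    q = softmax(-d2, axis=1)                                        # rows sum to 1

    # 2. Vertex features: assignment-weighted mean of the pixel features.
    z = (q.T @ x) / (q.sum(axis=0)[:, None] + 1e-8)                 # (K, C)

    # 3. Edges: pairwise similarity between vertices in feature space,
    #    row-normalized so each vertex averages over its neighbors.
    adj = softmax(z @ z.T, axis=1)                                  # (K, K)

    # 4. Propagate information across all vertices (one step, then ReLU).
    z_new = np.maximum(adj @ z @ weight, 0.0)                       # (K, C)

    # 5. Project vertex features back onto the 2D grid using the same
    #    soft assignments.
    out = (q @ z_new).reshape(h, w, c)
    return out, q, adj
```

Because the output has the same shape as the input feature map, a layer of this kind could be dropped between two stages of an existing convolutional network, which is one way to read the abstract's claim about easy integration.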

BibTeX

@conference{Li-2018-113275,
author = {Yin Li and Abhinav Gupta},
title = {Beyond Grids: Learning Graph Representations for Visual Recognition},
booktitle = {Proceedings of Neural Information Processing Systems (NeurIPS)},
year = {2018},
month = {December},
pages = {9225--9235},
}