Occlusion-Net: 2D/3D Occluded Keypoint Localization Using Graph Networks

N. Dinesh Reddy, Minh Vo, and Srinivasa G. Narasimhan
Conference Paper, Proceedings of (CVPR) Computer Vision and Pattern Recognition, pp. 7318 - 7327, June 2019

Abstract

We present Occlusion-Net, a framework to predict 2D and 3D locations of occluded keypoints for objects, in a largely self-supervised manner. We use an off-the-shelf detector (e.g., Mask R-CNN) as input that is trained only on visible keypoint annotations; this is the only supervision used in this work. A graph encoder network then explicitly classifies invisible edges, and a graph decoder network corrects the occluded keypoint locations from the initial detector. Central to this work is a trifocal tensor loss that provides indirect self-supervision for occluded keypoint locations that are visible in other views of the object. The 2D keypoints are then passed into a 3D graph network that estimates the 3D shape and camera pose using a self-supervised reprojection loss. At test time, Occlusion-Net successfully localizes keypoints in a single view under a diverse set of occlusion settings. We validate our approach on synthetic CAD data as well as a large image set capturing vehicles at many busy city intersections. As an interesting aside, we compare the accuracy of human labels of invisible keypoints against those predicted by the trifocal tensor.
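
To make the last stage of the pipeline concrete, the sketch below illustrates the kind of self-supervised reprojection loss the abstract refers to: the 3D graph network's outputs (3D keypoints plus a camera pose) are projected back to the image and compared against the 2D keypoints. This is a minimal illustration, not the authors' implementation; the function names, the weak-perspective camera model, and the visibility weighting are assumptions made for the example.

```python
# Hedged sketch of a self-supervised reprojection loss (PyTorch).
# All names and the camera model are illustrative assumptions.
import torch


def reproject(points_3d, rotation, translation, scale):
    """Project Nx3 keypoints to Nx2 image coordinates using a
    weak-perspective camera (assumed here for simplicity)."""
    cam = points_3d @ rotation.T + translation   # rigid transform into camera frame
    return scale * cam[:, :2]                    # drop depth, apply image scale


def reprojection_loss(points_3d, rotation, translation, scale,
                      keypoints_2d, visibility):
    """Squared distance between projected 3D keypoints and 2D keypoints,
    weighted by a per-keypoint visibility/confidence mask."""
    projected = reproject(points_3d, rotation, translation, scale)
    residual = (projected - keypoints_2d).pow(2).sum(dim=-1)
    return (visibility * residual).mean()


if __name__ == "__main__":
    # Toy usage with random tensors standing in for network outputs.
    n = 12                                        # e.g., 12 vehicle keypoints
    points_3d = torch.randn(n, 3, requires_grad=True)
    rotation = torch.eye(3)
    translation = torch.zeros(3)
    scale = torch.tensor(100.0)
    keypoints_2d = torch.randn(n, 2) * 50
    visibility = torch.ones(n)

    loss = reprojection_loss(points_3d, rotation, translation, scale,
                             keypoints_2d, visibility)
    loss.backward()                               # gradients flow to the 3D keypoints
    print(float(loss))
```

In the paper's setting, such a loss would be driven by the 2D keypoints output by the graph decoder rather than ground-truth annotations, which is what makes the 3D stage self-supervised.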

BibTeX

@conference{Narapureddy-2019-118026,
author = {N. Dinesh Reddy and Minh Vo and Srinivasa G. Narasimhan},
title = {Occlusion-Net: 2D/3D Occluded Keypoint Localization Using Graph Networks},
booktitle = {Proceedings of (CVPR) Computer Vision and Pattern Recognition},
year = {2019},
month = {June},
pages = {7318 - 7327},
keywords = {occlusion detection, keypoints, car pose},
}