
Deep Radio-Visual Localization

Tatsuya Ishihara, Kris M. Kitani, Chieko Asakawa and Michitaka Hirose
Conference Paper, Winter Conf. on Applications of Computer Vision, March, 2018

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author’s copyright. These works may not be reposted without the explicit permission of the copyright holder.


For many automated navigation applications, the underlying localization algorithm must continuously produce results that are both accurate and stable by drawing on a spectrum of redundant sensing technologies. To this end, various sensors have been used for localization, such as Wi-Fi, Bluetooth, GPS, LiDAR and cameras. In particular, a class of vision-based localization techniques using Structure from Motion (SfM) has been shown to produce very accurate position estimates in the real world under moderate assumptions about the motion of the camera and the amount of visual texture in the environment. However, when these assumptions are violated, SfM techniques can fail catastrophically (i.e., they cannot generate any estimate). Recently, deep convolutional neural networks (CNNs) have been applied to images to robustly regress 6-DOF camera poses, at the cost of lower accuracy than SfM. In this work, we propose improving the image-based localization accuracy of a deep CNN by combining images with Bluetooth radio-wave signal readings. In our experiments, we show that our proposed dual-stream CNN can robustly regress 6-DOF poses from images and radio-wave signals better than either sensing modality alone. More importantly, we show that when both modalities are used, we can obtain localization accuracy rivaling that of SfM but with significantly improved robustness to SfM failure modes.
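As a rough illustration of the dual-stream idea described above, the following is a minimal NumPy sketch and not the architecture from the paper. Everything in it is an assumption made for illustration: the class name `DualStreamPoseRegressor`, the layer sizes, the fusion by simple concatenation, the use of a precomputed image feature vector in place of a real convolutional stream, and the PoseNet-style output of a 3-D position plus a unit quaternion. The weights are random, so the outputs are not meaningful poses; the sketch only shows how two modalities might be fused into one 6-DOF regression.

```python
import numpy as np


def dense(rng, n_in, n_out):
    """Random weights and zero biases for one fully connected layer."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)


def relu(x):
    return np.maximum(x, 0.0)


class DualStreamPoseRegressor:
    """Hypothetical two-stream regressor: image features + beacon signal strengths."""

    def __init__(self, image_dim=128, num_beacons=20, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        # Visual stream (assumption: stands in for a CNN feature extractor).
        self.w_img, self.b_img = dense(rng, image_dim, hidden)
        # Radio stream (assumption: one signal-strength value per beacon).
        self.w_rad, self.b_rad = dense(rng, num_beacons, hidden)
        # Joint head: 3 position values + 4 quaternion values.
        self.w_out, self.b_out = dense(rng, 2 * hidden, 7)

    def forward(self, image_features, rssi):
        h_img = relu(image_features @ self.w_img + self.b_img)
        h_rad = relu(rssi @ self.w_rad + self.b_rad)
        # Fuse the two modalities by concatenating their hidden features.
        fused = np.concatenate([h_img, h_rad], axis=-1)
        out = fused @ self.w_out + self.b_out
        position = out[..., :3]
        quat = out[..., 3:]
        # Normalise so the orientation output is a unit quaternion.
        quat = quat / (np.linalg.norm(quat, axis=-1, keepdims=True) + 1e-12)
        return position, quat


if __name__ == "__main__":
    model = DualStreamPoseRegressor()
    rng = np.random.default_rng(1)
    images = rng.normal(size=(4, 128))   # placeholder image features
    signals = rng.normal(size=(4, 20))   # placeholder beacon readings
    position, orientation = model.forward(images, signals)
    print(position.shape, orientation.shape)
```

In this sketch, dropping either input stream would reduce the model to a single-modality regressor, which is the kind of baseline the abstract compares against.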

BibTeX Reference
author = {Tatsuya Ishihara and Kris M. Kitani and Chieko Asakawa and Michitaka Hirose},
title = {Deep Radio-Visual Localization},
booktitle = {Winter Conf. on Applications of Computer Vision},
year = {2018},
month = {March},
publisher = {IEEE},
keywords = {Pittsburgh},