3D Object Detection from CT Scans using a Slice-and-fuse Approach

Master's Thesis, Tech. Report CMU-RI-TR-19-23, Robotics Institute, Carnegie Mellon University, May 2019

Abstract

Automatic object detection in 3D X-ray Computed Tomography imagery has recently gained research attention due to its promising applications in aviation baggage screening. The high resolution of an individual 3D scan, however, poses formidable computational challenges when coupled with deep 3D convolutional networks for inference. In this thesis, we propose the slice-and-fuse strategy -- a generic framework for leveraging image-based detection and segmentation in high-resolution 3D volumes. We encode the input 3D volume as multiple slices along the XY, YZ, and XZ directions, apply 2D CNNs to generate 2D predictions, and then fuse those predictions into a 3D estimate. Using the proposed strategy, we design two 3D object detectors for 3D baggage CT scans. Retinal-SliceNet uses a unified, single network to detect target objects in the input 3D CT scans. U-SliceNet follows a two-stage paradigm, first generating proposals with a voxel-labeling network and then refining them with a 3D classification network. U-SliceNet produces high-quality segmentation masks along with bounding boxes for target objects. We evaluate the two SliceNets on a large-scale 3D baggage CT dataset for three tasks: baggage classification, 3D object detection, and 3D semantic segmentation.
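
The following is a minimal sketch of the slice-and-fuse idea described in the abstract, not the thesis implementation. The 2D predictor (predict_slice) is a hypothetical stand-in for any 2D CNN that returns a per-pixel score map, and the fusion rule (a voxel-wise average over the three viewing directions) is one simple illustrative choice.

    import numpy as np

    def predict_slice(img2d):
        # Placeholder for a 2D CNN forward pass; returns a score in [0, 1]
        # for every pixel of the input slice (hypothetical).
        return np.zeros_like(img2d, dtype=np.float32)

    def slice_and_fuse(volume):
        """volume: (D, H, W) CT volume -> (D, H, W) fused 3D score map."""
        fused = np.zeros(volume.shape, dtype=np.float32)
        # XY slices: fix the depth index and predict on each axial slice.
        for z in range(volume.shape[0]):
            fused[z, :, :] += predict_slice(volume[z, :, :])
        # XZ slices: fix the row index.
        for y in range(volume.shape[1]):
            fused[:, y, :] += predict_slice(volume[:, y, :])
        # YZ slices: fix the column index.
        for x in range(volume.shape[2]):
            fused[:, :, x] += predict_slice(volume[:, :, x])
        # Fuse by averaging the three per-voxel predictions.
        return fused / 3.0

    scores = slice_and_fuse(np.random.rand(128, 256, 256).astype(np.float32))

The fused 3D score map can then be thresholded to obtain voxel labels or grouped into 3D bounding boxes, corresponding to the proposal and refinement stages sketched in the abstract.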

BibTeX

@mastersthesis{Yang-2019-112979,
author = {Anqi Yang},
title = {3D Object Detection from CT Scans using a Slice-and-fuse Approach},
year = {2019},
month = {May},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-19-23},
keywords = {3D X-ray CT imagery, 3D object detection, image-based 3D detection, 3D semantic segmentation, classification},
}