3D Object Detection from CT Scans using a Slice-and-fuse Approach

Anqi Yang
Master's Thesis, Tech. Report, CMU-RI-TR-19-23, May, 2019


Automatic object detection in 3D X-ray Computed Tomography (CT) imagery has recently gained research attention due to its promising applications in aviation baggage screening. The high resolution of an individual 3D scan, however, poses formidable computational challenges for inference with deep 3D convolutional networks. In this thesis, we propose the slice-and-fuse strategy, a generic framework for leveraging image-based detection and segmentation in high-resolution 3D volumes. We encode each input 3D volume as multiple slices along the XY, YZ, and XZ directions, apply 2D CNNs to generate 2D predictions, and then fuse the 2D predictions into a 3D estimate. Using the proposed strategy, we design two 3D object detectors for baggage CT scans. Retinal-SliceNet uses a unified, single network to detect target objects in the input 3D CT scans. U-SliceNet follows a two-stage paradigm, first generating proposals with a voxel labeling network and then refining the proposals with a 3D classification network. U-SliceNet generates high-quality segmentation masks along with bounding boxes for target objects. We evaluate the two SliceNets on a large-scale 3D baggage CT dataset on three tasks: baggage classification, 3D object detection, and 3D semantic segmentation.
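The slice-and-fuse idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the function names `slice_and_fuse` and `toy_predictor` are hypothetical, the 2D CNN is replaced by a simple intensity threshold, and fusing by averaging the three directional predictions is an assumption chosen only to keep the example short.

```python
import numpy as np

def slice_and_fuse(volume, predict_2d):
    """Fuse per-slice 2D score maps into one 3D score volume.

    volume: array of shape (Z, Y, X).
    predict_2d: hypothetical stand-in for a 2D CNN; maps a 2D slice to a
        same-shaped array of per-pixel scores.
    """
    fused = np.zeros(volume.shape, dtype=np.float64)
    # Slicing along axis 0 yields XY slices, axis 1 XZ slices, axis 2 YZ slices.
    for axis in range(3):
        # Move the slicing axis to the front so iteration yields 2D slices.
        slices = np.moveaxis(volume, axis, 0)
        scores = np.stack([predict_2d(s) for s in slices])
        # Restore the original (Z, Y, X) layout before accumulating.
        fused += np.moveaxis(scores, 0, axis)
    # Assumption: fuse by averaging the three directional predictions.
    return fused / 3.0

def toy_predictor(slice_2d):
    # Placeholder for a trained 2D network: thresholds voxel intensity.
    return (slice_2d > 0.5).astype(np.float64)

# Synthetic volume containing one bright box as the "target object".
volume = np.zeros((4, 5, 6))
volume[1:3, 2:4, 1:5] = 1.0

scores = slice_and_fuse(volume, toy_predictor)
mask = scores > 0.5  # voxel-level estimate after fusion
```

With a real 2D network the three directional predictions would generally disagree near object boundaries, which is where the choice of fusion rule matters; averaging is only one simple option.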

author = {Anqi Yang},
title = {3D Object Detection from CT Scans using a Slice-and-fuse Approach},
year = {2019},
month = {May},
school = {},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-19-23},
keywords = {3D X-ray CT imagery, 3D object detection, image-based 3D detection, 3D semantic segmentation, classification},
}
2019-05-16T09:05:00-04:00