Building and Leveraging Category Hierarchies for Large-scale Image Classification

Hao Zhang
Tech. Report, CMU-RI-TR-16-38, Robotics Institute, Carnegie Mellon University, August, 2016


In image classification, visual separability between object categories is highly uneven: some categories are much harder to distinguish than others, and such difficult categories demand more dedicated classifiers. However, existing deep convolutional neural networks (CNNs) are trained as flat N-way classifiers, and few efforts have been made to leverage the hierarchical structure of categories. Incorporating external knowledge from category hierarchies therefore presents a major opportunity to improve classification. However, traditional methods of constructing category hierarchies manually, whether by experts (e.g., WordNet, ImageNet) or by interest communities (e.g., Wikipedia), are knowledge- or time-intensive, and the resulting hierarchies have limited coverage. In this report, we study the problem of automatically learning and utilizing category hierarchies (taxonomies) for large-scale image classification. First, we present a probabilistic model for taxonomy induction that jointly leverages text corpora and images from the web. The model is discriminatively trained on a small set of existing ontologies and is capable of building full category hierarchies from scratch for a collection of unseen conceptual labels with associated images. Then, we introduce hierarchical deep convolutional neural networks (HD-CNNs), which embed deep convolutional neural networks into a category hierarchy. An HD-CNN separates easy classes using a coarse category classifier while distinguishing difficult classes using fine category classifiers. We report state-of-the-art results on both taxonomy induction and image classification tasks.
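
As a rough illustration of the coarse-to-fine idea described above, the following minimal Python sketch combines the output of a coarse category classifier with the outputs of per-category fine classifiers by probability-weighted averaging. The combination rule, the function name `combine_coarse_and_fine`, the category names, and all numbers are assumptions made for illustration only; the abstract does not state how the report combines the two levels.

```python
from typing import Dict, List


def combine_coarse_and_fine(
    coarse_probs: List[float],
    fine_probs_per_coarse: List[Dict[str, float]],
) -> Dict[str, float]:
    """Weight each fine classifier's class probabilities by the probability
    that the coarse classifier assigns to its coarse category, then sum.

    This weighting scheme is an assumed, illustrative rule, not a method
    confirmed by the abstract.
    """
    combined: Dict[str, float] = {}
    for weight, fine_probs in zip(coarse_probs, fine_probs_per_coarse):
        for label, p in fine_probs.items():
            combined[label] = combined.get(label, 0.0) + weight * p
    return combined


# Hypothetical scores for one image with two coarse categories.
coarse_probs = [0.9, 0.1]  # assumed: P("animal"), P("vehicle")
fine_probs_per_coarse = [
    {"cat": 0.7, "dog": 0.3},    # assumed fine classifier for "animal"
    {"car": 0.5, "truck": 0.5},  # assumed fine classifier for "vehicle"
]

final = combine_coarse_and_fine(coarse_probs, fine_probs_per_coarse)
prediction = max(final, key=final.get)
```

In this toy setting the coarse classifier handles the easy animal-versus-vehicle split, while the fine classifier for the likelier coarse category decides between the visually similar classes within it.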

Master's thesis

BibTeX Reference
author = {Hao Zhang},
title = {Building and Leveraging Category Hierarchies for Large-scale Image Classification},
year = {2016},
month = {August},
institution = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-16-38},