Visual object category recognition is one of the most challenging problems in computer vision. Even if advances in visual input devices and low-level vision techniques allowed us to obtain near-perfect instance-level representations, object categorization would still remain a difficult problem, because it requires drawing boundaries between instances in a continuous world, boundaries that are defined solely by human conceptualization. Object categorization is essentially a perceptual process that takes place in a human-defined semantic space.
In this semantic space, categories reside not in isolation but in relation to one another. Some categories are similar, grouped, or co-occurring, and some are not. Despite this semantic nature of object categorization, however, most of today's automatic visual category recognition systems rely only on category labels for training discriminative recognition models. This can mislead a model into learning incorrect associations between visual features and semantic labels, essentially overfitting to training-set biases and limiting the model's predictive power on new test instances.
Human knowledge of the world holds great potential benefit for object category recognition, as it provides much richer information than class membership alone, in the form of inter-category (or category-concept) distances and structures. In this talk, I will introduce discriminative learning methods for categorization that leverage semantic knowledge for object recognition, focusing on the semantic relationships among different categories and concepts. To this end, I explore three semantic sources, namely attributes, taxonomies, and analogies, which are incorporated into the original discriminative model as a form of structural regularization that penalizes models deviating from the structures implied by the provided semantic knowledge.
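As a rough illustration of the general idea (not the speaker's actual formulation), structural regularization can be sketched as a standard discriminative loss plus a penalty that pulls the weight vectors of semantically related categories toward one another. Here the hinge loss, the `edges` list of related class pairs, and the weighting `lam` are all illustrative assumptions:

```python
import numpy as np

def structurally_regularized_loss(W, X, y, edges, lam=0.1):
    """Multiclass hinge loss plus a graph-based structural regularizer.

    W     : (C, d) array, one linear classifier per category.
    X, y  : (n, d) features and (n,) integer labels.
    edges : pairs (i, j) of semantically related categories (hypothetical
            encoding of attribute/taxonomy/analogy knowledge); the penalty
            encourages related categories to share similar weight vectors.
    """
    n = X.shape[0]
    scores = X @ W.T                                   # (n, C)
    correct = scores[np.arange(n), y]
    margins = np.maximum(0.0, 1.0 + scores - correct[:, None])
    margins[np.arange(n), y] = 0.0
    hinge = margins.sum() / n
    # Structural term: penalize deviation from the known semantic structure.
    reg = sum(np.sum((W[i] - W[j]) ** 2) for i, j in edges)
    return hinge + lam * reg
```

Minimizing this objective trades off discriminative accuracy against agreement with the semantic structure: when the structural term is zero, the objective reduces to the purely discriminative loss.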
The proposed methods are evaluated on challenging public datasets and are shown to improve recognition accuracy over purely discriminative models, owing to the better generalization obtained from semantic regularization.
Host: Kris Kitani
Appointments: Kris Kitani
Sung Ju Hwang is a postdoctoral research associate at Disney Research Pittsburgh, working under the supervision of Dr. Leonid Sigal. He received a B.S. degree in computer science and engineering from Seoul National University, Korea, and an M.A. degree in computer science from the University of Texas at Austin, in 2008 and 2010 respectively. He recently graduated from the University of Texas at Austin with a Ph.D. degree in computer science, where he performed research on visual recognition under the supervision of Prof. Kristen Grauman. His primary research interests lie at the intersection of visual recognition and machine learning, with a special focus on exploiting human knowledge for visual recognition, scalable large-scale recognition, learning with structural regularization, and multitask and transfer learning.