Learning Statistical Structure for Object Detection

Henry Schneiderman

Conference Paper, Proceedings of International Conference on Computer Analysis of Images and Patterns (CAIP '03), pp. 434 - 441, August, 2003

Abstract

Many classes of images exhibit sparse structuring of statistical dependency. Each variable has strong statistical dependency with a small number of other variables and negligible dependency with the remaining ones. Such structuring makes it possible to construct a powerful classifier by only representing the stronger dependencies among the variables. In particular, a semi-naïve Bayes classifier compactly represents sparseness. A semi-naïve Bayes classifier decomposes the input variables into subsets and represents statistical dependency within each subset, while treating the subsets as statistically independent. However, learning the structure of a semi-naïve Bayes classifier is known to be NP-complete. The high dimensionality of images makes statistical structure learning especially challenging. This paper describes an algorithm that searches for the structure of a semi-naïve Bayes classifier in this large space of possible structures. The algorithm seeks to optimize two cost functions: a localized error in the log-likelihood ratio function to restrict the structure and a global classification error to choose the final structure. We use this approach to train detectors for several objects including faces, eyes, ears, telephones, push-carts, and door-handles. These detectors perform robustly with a high detection rate and low false alarm rate in unconstrained settings over a wide range of variation in background scenery and lighting.
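
As a rough illustration of the decomposition the abstract describes, the sketch below scores an input with a semi-naïve Bayes log-likelihood ratio: each subset of variables gets its own joint probability table, and the per-subset log ratios are summed as if the subsets were independent. It is not the paper's implementation. The function names, the binary toy data, the hand-picked subsets, and the add-one smoothing are all assumptions made for this example; the paper's contribution is the search for the subsets, which is not shown here.

```python
import math
from collections import Counter


def fit_tables(samples, subsets):
    """Count the joint value tuples of each subset over one class's samples."""
    tables = [Counter() for _ in subsets]
    for x in samples:
        for table, subset in zip(tables, subsets):
            table[tuple(x[i] for i in subset)] += 1
    return tables


def log_likelihood_ratio(x, subsets, obj_tables, bg_tables,
                         n_obj, n_bg, n_values=2, alpha=1.0):
    """Sum per-subset log ratios, treating the subsets as independent.

    n_values is the number of discrete values per variable and alpha is an
    add-one style smoothing constant (both are assumptions of this sketch).
    """
    total = 0.0
    for subset, obj_t, bg_t in zip(subsets, obj_tables, bg_tables):
        key = tuple(x[i] for i in subset)
        cells = n_values ** len(subset)  # size of this subset's joint table
        p_obj = (obj_t[key] + alpha) / (n_obj + alpha * cells)
        p_bg = (bg_t[key] + alpha) / (n_bg + alpha * cells)
        total += math.log(p_obj / p_bg)
    return total


# Hypothetical toy data: four binary variables. In the "object" class the
# variables within each pair agree; in the "background" class they differ.
subsets = [(0, 1), (2, 3)]
obj = [(0, 0, 1, 1), (1, 1, 0, 0), (0, 0, 0, 0), (1, 1, 1, 1)]
bg = [(0, 1, 1, 0), (1, 0, 0, 1), (0, 1, 0, 1), (1, 0, 1, 0)]

obj_tables = fit_tables(obj, subsets)
bg_tables = fit_tables(bg, subsets)

score_obj = log_likelihood_ratio((0, 0, 1, 1), subsets, obj_tables, bg_tables,
                                 len(obj), len(bg))
score_bg = log_likelihood_ratio((0, 1, 1, 0), subsets, obj_tables, bg_tables,
                                len(obj), len(bg))
print(score_obj > 0, score_bg < 0)
```

In this toy setup every single variable has the same marginal distribution in both classes, so a fully naïve Bayes model would score every input as zero; only the within-subset joint tables separate the two classes.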

BibTeX

@conference{Schneiderman-2003-8720,
author = {Henry Schneiderman},
title = {Learning Statistical Structure for Object Detection},
booktitle = {Proceedings of International Conference on Computer Analysis of Images and Patterns (CAIP '03)},
year = {2003},
month = {August},
pages = {434--441},
publisher = {Springer-Verlag},
keywords = {computer vision, object recognition, object detection, face detection, learning, graphical models, statistical structure},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.