Dressed Human Modeling, Detection, and Parts Localization

Liang Zhao
doctoral dissertation, tech. report CMU-RI-TR-01-19, Robotics Institute, Carnegie Mellon University, July, 2001

This dissertation presents an integrated vision system for human shape modeling, detection, and body part localization. It demonstrates that the system can (1) detect pedestrians of various shapes, sizes, and postures, under partial occlusion and in varied clothing, from a moving vehicle using stereo cameras; and (2) locate a person's joints automatically and accurately without placing any markers around the joints.

The following contributions distinguish this dissertation from previous work:

1. Dressed human modeling and dynamic model assembling: Unlike previous work that employs a fixed human body model or global deformable template to perform human detection, in this dissertation merged body parts are introduced to represent the deformations caused by clothing, segmentation errors, or low image resolution. A dressed human model is dynamically assembled from the model parts in the recognition step; the shapes of the body parts and the size and spatial relationships between them (the contextual information) are represented as invariant under translation, rotation, and scaling. Therefore, the system can detect people in different clothes, positions, sizes, and orientations.

2. Bayesian similarity measure: A probabilistic similarity measure is derived from the human model that combines the local shape and global relationship constraints to guide body part identification and human detection. Thus, the identification of a part depends not only on its own shape but also on the contextual constraints from other parts. In contrast with previous work, the proposed similarity measure enables efficient shape matching and comparison that is robust to articulation, partial occlusion, and segmentation errors through coarse-to-fine human model assembling.

3. Recursive context reasoning algorithm: Contour-based human detection depends on reliable contour extraction, but contour extraction is an under-constrained problem without knowledge of the objects to be detected. Unlike previous work that assumes perfect and complete contours are available, this dissertation proposes a recursive context reasoning (RCR) algorithm to resolve this dilemma. A contour updating procedure is introduced that integrates the human model and the identified body parts to predict the shapes and locations of the parts missed by the contour detector; the refined contours are then used to reevaluate the Bayesian similarity measure and to determine whether a person is present. Therefore, contour extraction, body part localization, and human detection are improved iteratively.
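
The idea behind contributions 1 and 2, scoring a candidate part by its own shape together with its relationship to other parts, using features that do not change under translation, rotation, or scaling, can be illustrated with a small sketch. This is not the dissertation's formulation: the elongation feature, the two contextual features, the independent Gaussian likelihoods, and every name and number in `model` are assumptions chosen only to keep the example short.

```python
import math

def centroid(points):
    """Mean position of a list of (x, y) contour points."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def rms_radius(points):
    """Part size: root-mean-square distance of the points from their centroid."""
    cx, cy = centroid(points)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / len(points))

def elongation(points):
    """Local shape feature (assumed): sqrt of the ratio of covariance eigenvalues.

    Unchanged by translation, rotation, and uniform scaling.
    Assumes the points are not all collinear.
    """
    cx, cy = centroid(points)
    n = len(points)
    sxx = sum((x - cx) ** 2 for x, _ in points) / n
    syy = sum((y - cy) ** 2 for _, y in points) / n
    sxy = sum((x - cx) * (y - cy) for x, y in points) / n
    half_trace = (sxx + syy) / 2
    root = math.sqrt(max(half_trace ** 2 - (sxx * syy - sxy ** 2), 0.0))
    return math.sqrt((half_trace + root) / (half_trace - root))

def context_features(part_a, part_b):
    """Contextual features (assumed): relative size and size-normalized distance."""
    size_a, size_b = rms_radius(part_a), rms_radius(part_b)
    (ax, ay), (bx, by) = centroid(part_a), centroid(part_b)
    return size_b / size_a, math.hypot(bx - ax, by - ay) / size_a

def gaussian_log_pdf(x, mean, std):
    """Log density of a 1-D Gaussian."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))

def log_similarity(head, torso, model):
    """Sum of local-shape and contextual log-likelihoods (independence assumed)."""
    size_ratio, distance = context_features(head, torso)
    observed = {
        "head_elongation": elongation(head),
        "torso_elongation": elongation(torso),
        "size_ratio": size_ratio,
        "distance": distance,
    }
    return sum(gaussian_log_pdf(observed[k], *model[k]) for k in model)

def transform(points, angle, k, dx, dy):
    """Rotate by `angle`, scale by `k`, then translate by (dx, dy)."""
    c, s = math.cos(angle), math.sin(angle)
    return [(k * (c * x - s * y) + dx, k * (s * x + c * y) + dy) for x, y in points]

# Hypothetical (mean, std) parameters; real values would be learned from data.
model = {
    "head_elongation": (1.4, 0.3),
    "torso_elongation": (2.0, 0.5),
    "size_ratio": (2.5, 0.5),
    "distance": (3.5, 0.7),
}

# Toy rectangular "contours" for a head and a torso.
head = [(0, 0), (2, 0), (2, 3), (0, 3)]
torso = [(-1, -1), (3, -1), (3, -9), (-1, -9)]

score = log_similarity(head, torso, model)
moved = log_similarity(transform(head, 0.7, 2.5, 10, -3),
                       transform(torso, 0.7, 2.5, 10, -3), model)
swapped = log_similarity(torso, head, model)
```

In this toy setup `score` and `moved` agree up to floating-point error, because every feature is a ratio of quantities that scale together, while `swapped` (the torso labeled as the head and vice versa) scores much lower because both the shape terms and the contextual terms disagree with the model.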

Keywords: human detection, human modeling, body part identification, body part localization, human motion capture

Associated Center(s) / Consortia: Vision and Autonomous Systems Center
Associated Lab(s) / Group(s): NavLab
Associated Project(s): Side Collision Warning System for Transit Buses

Text Reference
Liang Zhao, "Dressed Human Modeling, Detection, and Parts Localization," doctoral dissertation, tech. report CMU-RI-TR-01-19, Robotics Institute, Carnegie Mellon University, July, 2001

BibTeX Reference
   author = "Liang Zhao",
   title = "Dressed Human Modeling, Detection, and Parts Localization",
   booktitle = "",
   school = "Robotics Institute, Carnegie Mellon University",
   month = "July",
   year = "2001",
   number= "CMU-RI-TR-01-19",
   address= "Pittsburgh, PA",