Analysis of a Spatio-Temporal Clustering Algorithm for Counting People in a Meeting

Yongjun Jeon and Paul Rybski
tech. report CMU-RI-TR-06-04, Robotics Institute, Carnegie Mellon University, January, 2006



Abstract
This paper proposes an algorithm that, given a time interval and the positions of people's faces located by a face detector, automatically determines the number of people present at a meeting. Such a face detector often produces noisy output: in any given frame, false positives may appear and legitimate faces may go undetected. This makes direct analysis of its results difficult and calls for the use of statistical methods in the algorithm.

Exploiting clustering patterns based on the temporal and spatial alignment of the detected faces, our algorithm employs the expectation-maximization (EM) algorithm [4] for mixture models and the K-Means clustering algorithm [8]. A Gaussian mixture model [2] is used to estimate the probability density function of the data points; its parameters are then optimized using the EM algorithm, whose performance is in turn enhanced by using it jointly with the K-Means algorithm. In addition, random restarts are performed in the final model verification stage: different estimates are sampled using different parameters, and the most consistent result is chosen, under the assumption that an incorrect parameter set will produce inconsistent fits.
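
The joint use of K-Means and EM described above can be illustrated with a minimal sketch. It is not the report's implementation: it assumes one common way of combining the two (K-Means centroids seed the mixture means before EM refines them), works on one-dimensional positions only, and uses synthetic data. The function names kmeans_1d and em_gmm_1d, the variance floor min_var, and the iteration counts are all hypothetical choices made for illustration.

```python
import math
import random


def kmeans_1d(xs, k, rng, iters=20):
    """Plain Lloyd's K-Means on scalars; returns k centroids."""
    centers = rng.sample(xs, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda j: abs(x - centers[j]))
            groups[nearest].append(x)
        # Keep the old centroid if a cluster ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


def gauss_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def em_gmm_1d(xs, k, rng, iters=50, min_var=1e-3):
    """EM for a 1-D Gaussian mixture whose means are seeded by K-Means."""
    n = len(xs)
    means = kmeans_1d(xs, k, rng)
    overall = sum(xs) / n
    # Start every component with the overall data variance (assumed choice).
    variances = [max(sum((x - overall) ** 2 for x in xs) / n, min_var)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * gauss_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300  # guard against numerical underflow
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave a vanished component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var_j, min_var)
    return weights, means, variances


rng = random.Random(0)
# Synthetic horizontal face positions (pixels) around two hypothetical seats.
xs = [rng.gauss(100, 5) for _ in range(40)] + [rng.gauss(300, 5) for _ in range(40)]
weights, means, variances = em_gmm_1d(xs, k=2, rng=rng)
print(sorted(round(m) for m in means))
```

Seeding EM with K-Means centroids is a common way to reduce EM's sensitivity to its starting point; whether the report combines the two algorithms in exactly this way is not stated in the abstract.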

The results obtained with this combination of algorithms on the sample training data set indicate that there exists an optimal set of parameters that produces estimates with locally minimal standard deviation and percentage error.
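
The two quantities named here, the standard deviation of the estimates and their percentage error, can be computed as in the following sketch. It assumes that consistency is measured as the population standard deviation of the head counts returned by repeated restarts under one parameter value; the abstract does not specify this. The helper names and the numbers in runs are hypothetical toy values, not results from the report.

```python
import statistics


def percentage_error(estimate, truth):
    """Absolute error of an estimate relative to the ground truth, in percent."""
    return abs(estimate - truth) / truth * 100.0


def most_consistent(estimates_by_param):
    """Return the parameter whose restart estimates vary the least."""
    return min(
        estimates_by_param,
        key=lambda p: statistics.pstdev(estimates_by_param[p]),
    )


# Hypothetical toy data: head counts from four random restarts per parameter value.
runs = {
    0.5: [3, 6, 4, 7],
    1.0: [5, 5, 5, 4],
    2.0: [2, 5, 3, 6],
}

best = most_consistent(runs)
mean_estimate = statistics.mean(runs[best])
# The ground truth of 5 people is also an assumed toy value.
print(best, mean_estimate, percentage_error(mean_estimate, truth=5))
```

In this toy setting the parameter with the tightest spread of estimates is selected first, and the percentage error is then used only to check that choice against known ground truth.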

Finally, a stand-alone module will first be trained on a data set for which ground truth is available, so that percentage errors can be calculated. It will also implement an automatic, but simplified, model verification procedure using the parameters obtained from that data set.


Text Reference
Yongjun Jeon and Paul Rybski, "Analysis of a Spatio-Temporal Clustering Algorithm for Counting People in a Meeting," tech. report CMU-RI-TR-06-04, Robotics Institute, Carnegie Mellon University, January, 2006

BibTeX Reference
@techreport{Rybski_2006_5991,
   author = "Yongjun Jeon and Paul Rybski",
   title = "Analysis of a Spatio-Temporal Clustering Algorithm for Counting People in a Meeting",
   institution = "Robotics Institute",
   month = "January",
   year = "2006",
   number = "CMU-RI-TR-06-04",
   address = "Pittsburgh, PA",
}