Carnegie Mellon University
Discriminative Cluster Analysis

Fernando De la Torre Frade and Takeo Kanade
International Conference on Machine Learning, June, 2006, pp. 241 - 248.


Clustering is one of the most widely used statistical tools for data analysis. Among existing clustering techniques, k-means is very popular because it is easy to program and offers a good trade-off between performance and computational complexity. However, k-means is prone to local minima, and it does not scale well to high-dimensional data sets. A common approach to dealing with high-dimensional data is to cluster in the space spanned by the principal components (PC). In this paper, we show the benefits of clustering in a low-dimensional discriminative space rather than in the (generative) PC space. In particular, we propose a new clustering algorithm called Discriminative Cluster Analysis (DCA), which jointly performs dimensionality reduction and clustering. Several toy and real examples show the benefits of DCA over traditional PCA+k-means clustering. Additionally, a new matrix formulation is proposed, and connections with related techniques such as spectral graph methods and linear discriminant analysis are provided.
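For readers unfamiliar with the PCA+k-means baseline that the abstract compares against, the following is a minimal sketch of that two-stage pipeline, assuming NumPy. It is not the DCA algorithm and is not taken from the paper; the function names (`pca_project`, `kmeans`, `pca_kmeans`) and all parameter choices are hypothetical and chosen only for illustration.

```python
import numpy as np


def pca_project(X, n_components):
    """Project the rows of X onto the top principal components."""
    Xc = X - X.mean(axis=0)  # center the data
    # The right singular vectors of the centered data are the principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def kmeans(Z, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means (illustrative only); returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # Initialize the centers with k distinct data points.
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers


def pca_kmeans(X, k, n_components):
    """Two-stage baseline: reduce dimensionality first, then cluster."""
    return kmeans(pca_project(X, n_components), k)


if __name__ == "__main__":
    # Synthetic data (an assumption for this demo): two Gaussian blobs in 50-D.
    rng = np.random.default_rng(1)
    X = np.vstack([
        rng.normal(0.0, 1.0, size=(100, 50)),
        rng.normal(5.0, 1.0, size=(100, 50)),
    ])
    labels, _ = pca_kmeans(X, k=2, n_components=2)
    print(np.bincount(labels))
```

In this sketch the projection is chosen once, without regard to the cluster labels, and clustering happens afterwards. The abstract's point is that DCA instead performs the dimensionality reduction and the clustering jointly.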

Keywords: Clustering, Linear Discriminant Analysis, Component Analysis

Associated Project(s): Component Analysis for Data Analysis
Number of pages: 8

Text Reference
Fernando De la Torre Frade and Takeo Kanade, "Discriminative Cluster Analysis," International Conference on Machine Learning, June, 2006, pp. 241 - 248.

BibTeX Reference
   author = "Fernando {De la Torre Frade} and Takeo Kanade",
   title = "Discriminative Cluster Analysis",
   booktitle = "International Conference on Machine Learning",
   pages = "241--248",
   publisher = "ACM Press",
   address = "New York, NY, USA",
   month = "June",
   year = "2006",
   volume = "148",