Source Constrained Clustering

Ekaterina Taralova, Fernando De la Torre Frade and Martial Hebert
Conference Paper, 13th International Conference on Computer Vision 2011, November, 2011

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author’s copyright. These works may not be reposted without the explicit permission of the copyright holder.


We consider the problem of quantizing data generated from disparate sources, e.g., subjects performing actions with different styles, movies with a particular genre bias, or images of objects taken under various conditions. In these scenarios, unsupervised clustering produces inadequate codebooks because algorithms like K-means tend to cluster samples based on data biases (e.g., grouping by subject) rather than clustering similar samples across sources (e.g., grouping by action). We propose a new quantization technique, Source Constrained Clustering (SCC), which extends the K-means algorithm by constraining clusters to group samples from multiple sources. We evaluate the method in the context of activity recognition from videos in an unconstrained environment. Experiments on several tasks and features show that using source information improves classification performance.
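The abstract does not give the SCC objective, so the following is only an illustrative sketch of the general idea (a K-means variant that uses source labels to discourage clusters dominated by a single source), not the authors' formulation. The function name `source_penalized_kmeans`, the penalty weight `lam`, and the penalty itself (the fraction of a cluster's current members that share the sample's source) are all assumptions introduced for this example.

```python
import numpy as np

def source_penalized_kmeans(X, sources, k, lam=1.0, n_iter=20, seed=0):
    """Lloyd-style K-means with a hypothetical source-mixing penalty.

    Assumption (not from the paper): the cost of assigning a sample to a
    cluster is its squared distance to the centroid plus `lam` times the
    fraction of that cluster's current members that come from the same
    source. With lam=0 this reduces to standard K-means.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    # Map arbitrary source labels (strings, ids) to integers 0..n_src-1.
    _, src_idx = np.unique(np.asarray(sources), return_inverse=True)
    n_src = src_idx.max() + 1

    # Initialize centroids from k distinct random samples.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    dist = ((X[:, None, :] - centers[None]) ** 2).sum(-1)
    labels = dist.argmin(axis=1)

    for _ in range(n_iter):
        # counts[c, s]: number of samples from source s currently in cluster c.
        counts = np.zeros((k, n_src))
        np.add.at(counts, (labels, src_idx), 1)
        frac = counts / np.maximum(counts.sum(axis=1, keepdims=True), 1)

        # Assignment step: distance term plus source-dominance penalty.
        dist = ((X[:, None, :] - centers[None]) ** 2).sum(-1)
        cost = dist + lam * frac[:, src_idx].T  # shape (n_samples, k)
        labels = cost.argmin(axis=1)

        # Update step: recompute centroids of non-empty clusters.
        for c in range(k):
            members = labels == c
            if members.any():
                centers[c] = X[members].mean(axis=0)

    return labels, centers
```

In a bag-of-words pipeline, `X` would hold local feature descriptors and `sources` the subject (or other source) each descriptor came from; the returned `centers` would serve as the codebook. The value of `lam` has to be chosen relative to the scale of the squared distances, which this sketch leaves to the user.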

BibTeX Reference
author = {Ekaterina Taralova and Fernando De la Torre Frade and Martial Hebert},
title = {Source Constrained Clustering},
booktitle = {13th International Conference on Computer Vision 2011},
year = {2011},
month = {November},
keywords = {clustering, quantization, k-means, bag-of-words, computer vision, activity recognition},