An investigation of HMM classifier combination strategies for improved audio-visual speech recognition

S. Lucey, S. Sridharan, and V. Chandran

Conference Paper, Proceedings of 7th European Conference on Speech Communication and Technology (EUROSPEECH '01), September, 2001

Abstract

Combining independent audio and visual HMM classifiers (late integration) has been shown to outperform combining audio and visual features in a single HMM classifier (early integration) for speech recognition when either or both modalities are distorted. The theoretical foundations for optimally combining these audio and video classifiers are still unclear. In this paper, a number of strategies for combining these classifiers are investigated. An argument for using a hybrid of the sum and product rules is made based on empirical, theoretical, and heuristic evidence.
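
As a rough illustration of the combination rules named in the abstract, the sketch below fuses per-class scores from an audio classifier and a visual classifier with a weighted sum rule, a weighted product rule, and a simple blend of the two. It is not the formulation from the paper: the function names (`normalize`, `sum_rule`, `product_rule`, `hybrid_rule`), the stream weight `w`, the blend factor `mix`, and the particular hybrid form are all assumptions made for this example, and the inputs are assumed to be normalized posterior-like scores rather than raw HMM log-likelihoods.

```python
def normalize(scores):
    """Rescale non-negative scores so they sum to 1."""
    total = sum(scores)
    if total <= 0.0:
        raise ValueError("scores must have a positive sum")
    return [s / total for s in scores]


def sum_rule(p_audio, p_video, w=0.5):
    """Weighted sum rule: w * audio + (1 - w) * video, per class."""
    return normalize([w * a + (1.0 - w) * v for a, v in zip(p_audio, p_video)])


def product_rule(p_audio, p_video, w=0.5):
    """Weighted product rule: audio**w * video**(1 - w), per class."""
    return normalize([(a ** w) * (v ** (1.0 - w)) for a, v in zip(p_audio, p_video)])


def hybrid_rule(p_audio, p_video, w=0.5, mix=0.5):
    """Hypothetical hybrid: a linear blend of the sum-rule and product-rule outputs.

    mix = 1.0 gives the pure sum rule, mix = 0.0 the pure product rule.
    This blend is an assumption for illustration only.
    """
    s = sum_rule(p_audio, p_video, w)
    p = product_rule(p_audio, p_video, w)
    return normalize([mix * si + (1.0 - mix) * pi for si, pi in zip(s, p)])


if __name__ == "__main__":
    # Two classes; the visual stream assigns zero score to class 0.
    audio = [0.9, 0.1]
    video = [0.0, 1.0]
    print("sum:    ", sum_rule(audio, video))
    print("product:", product_rule(audio, video))
    print("hybrid: ", hybrid_rule(audio, video))
```

In the usage example, a zero score from one stream vetoes the class under the product rule, while the sum rule still leaves it some support. That difference in sensitivity to a single unreliable stream is one common motivation for considering a compromise between the two rules.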

BibTeX

@conference{Lucey-2001-121091,
author = {S. Lucey and S. Sridharan and V. Chandran},
title = {An investigation of HMM classifier combination strategies for improved audio-visual speech recognition},
booktitle = {Proceedings of 7th European Conference on Speech Communication and Technology (EUROSPEECH '01)},
year = {2001},
month = {September},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.