Improving visual noise insensitivity in small vocabulary audio-visual speech recognition applications - Robotics Institute Carnegie Mellon University

S. Lucey, S. Sridharan, and V. Chandran
Conference Paper, Proceedings of 6th International Symposium on Signal Processing and its Applications (ISSPA '01), Vol. 2, pp. 434-437, August 2001

Abstract

Insensitivity to visual noise is important in audio-visual speech recognition (AVSR). Visual noise can take a number of forms, such as varying frame rate, occlusion, lighting, or speaker variability. We investigate the use of a high-dimensional secondary classifier, operating on the word likelihood scores from both the audio and video modalities, for adaptive fusion. Preliminary results show that our confidence measure performs above the catastrophic fusion boundary irrespective of the type of visual noise presented to it. Our experiments were restricted to small-vocabulary applications.
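
The idea of a secondary classifier over per-word likelihood scores can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not say which classifier or scores were used, so a nearest-centroid classifier stands in here, and the two-word vocabulary, the score values, and the function names (stack_scores, train_centroids, classify) are all hypothetical. The sketch only shows the general shape of the approach: scores from both modalities are concatenated into one vector, and a second-stage classifier trained on clean and degraded video examples makes the final word decision.

```python
from math import dist  # Euclidean distance, Python 3.8+


def stack_scores(audio_scores, video_scores):
    """Concatenate per-word scores from both modalities into one feature vector."""
    return list(audio_scores) + list(video_scores)


def train_centroids(examples):
    """Average the feature vectors of each word label.

    examples: list of (feature_vector, word_label) pairs.
    Returns a dict mapping each label to its mean feature vector.
    """
    sums, counts = {}, {}
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total]
            for label, total in sums.items()}


def classify(centroids, features):
    """Return the label whose centroid is closest to the feature vector."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


# Hypothetical log-likelihood scores, ordered as [score("yes"), score("no")].
# Each word appears once with clean video and once with degraded
# (uninformative) video, so the second stage sees both conditions.
training = [
    (stack_scores([-1.0, -5.0], [-1.0, -4.0]), "yes"),  # clean video
    (stack_scores([-1.0, -5.0], [-3.0, -3.0]), "yes"),  # degraded video
    (stack_scores([-5.0, -1.0], [-4.0, -1.0]), "no"),   # clean video
    (stack_scores([-5.0, -1.0], [-3.0, -3.0]), "no"),   # degraded video
]
centroids = train_centroids(training)

# Test utterance: audio favors "yes", degraded video weakly favors "no".
test_features = stack_scores([-1.5, -4.5], [-3.2, -2.8])
predicted = classify(centroids, test_features)
```

In this toy setup the second stage has seen degraded video during training, so a weak, misleading video score does not override a confident audio score. Whether the paper's classifier behaves this way is not stated in the abstract.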

BibTeX

@conference{Lucey-2001-121093,
author = {S. Lucey and S. Sridharan and V. Chandran},
title = {Improving visual noise insensitivity in small vocabulary audio-visual speech recognition applications},
booktitle = {Proceedings of 6th International Symposium on Signal Processing and its Applications (ISSPA '01)},
year = {2001},
month = {August},
volume = {2},
pages = {434--437},
}