K.-F. Lee, H.-W. Hon, and Raj Reddy
IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. 38, No. 1, January, 1990, pp. 35 - 45.
A description is given of SPHINX, a system that demonstrates the feasibility of accurate, large-vocabulary, speaker-independent, continuous speech recognition. SPHINX is based on discrete hidden Markov models (HMMs) with LPC- (linear-predictive-coding) derived parameters. To provide speaker independence, knowledge was added to these HMMs in several ways: multiple codebooks of fixed-width parameters, and an enhanced recognizer with carefully designed models and word-duration modeling. To deal with coarticulation in continuous speech, yet still adequately represent a large vocabulary, two new subword speech units are introduced: function-word-dependent phone models and generalized triphone models. With grammars of perplexity 997, 60, and 20, SPHINX attained word accuracies of 71, 94, and 96%, respectively, on a 997-word task.
Note: see also IEEE Transactions on Signal Processing
K.-F. Lee, H.-W. Hon, and Raj Reddy, "An overview of the SPHINX speech recognition system," IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. 38, No. 1, January, 1990, pp. 35 - 45.
author = "K.-F. Lee and H.-W. Hon and Raj Reddy",
title = "An overview of the SPHINX speech recognition system",
journal = "IEEE Transactions on Acoustics, Speech and Signal Processing",
pages = "35 - 45",
month = "January",
year = "1990",
volume = "38",
number = "1",
Notes = "see also IEEE Transactions on Signal Processing"
The Robotics Institute is part of the School of Computer Science, Carnegie Mellon University.