Carnegie Mellon Robotics Institute
K.-F. Lee, H.-W. Hon, and Raj Reddy
IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. 38, No. 1, January, 1990, pp. 35 - 45.
| Abstract |
| A description is given of SPHINX, a system that demonstrates the feasibility of accurate, large-vocabulary, speaker-independent, continuous speech recognition. SPHINX is based on discrete hidden Markov models (HMMs) with LPC- (linear-predictive-coding) derived parameters. To provide speaker independence, knowledge was added to these HMMs in several ways: multiple codebooks of fixed-width parameters, and an enhanced recognizer with carefully designed models and word-duration modeling. To deal with coarticulation in continuous speech, yet still adequately represent a large vocabulary, two new subword speech units are introduced: function-word-dependent phone models and generalized triphone models. With grammars of perplexity 997, 60, and 20, SPHINX attained word accuracies of 71, 94, and 96%, respectively, on a 997-word task. |
| Notes |
Note: see also IEEE Transactions on Signal Processing |
| Text Reference |
| K.-F. Lee, H.-W. Hon, and Raj Reddy, "An overview of the SPHINX speech recognition system," IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. 38, No. 1, January, 1990, pp. 35 - 45. |
| BibTeX Reference |
|
@article{Reddy_1990_3707,
  author = "K.-F. Lee and H.-W. Hon and Raj Reddy",
  title = "An overview of the SPHINX speech recognition system",
  journal = "IEEE Transactions on Acoustics, Speech and Signal Processing",
  pages = "35 - 45",
  month = "January",
  year = "1990",
  volume = "38",
  number = "1",
  Notes = "see also IEEE Transactions on Signal Processing"
} |
| The Robotics Institute is part of the School of Computer Science, Carnegie Mellon University. |