An overview of the SPHINX speech recognition system
K. Lee, H. Hon, and R. Reddy
IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. 38, No. 1, January 1990, pp. 35-45.


Download

Adobe portable document format (pdf) [1032 KB]

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.


Abstract

A description is given of SPHINX, a system that demonstrates the feasibility of accurate, large-vocabulary, speaker-independent, continuous speech recognition. SPHINX is based on discrete hidden Markov models (HMMs) with parameters derived from linear predictive coding (LPC). To provide speaker independence, knowledge was added to these HMMs in several ways: multiple codebooks of fixed-width parameters, and an enhanced recognizer with carefully designed models and word-duration modeling. To deal with coarticulation in continuous speech, yet still adequately represent a large vocabulary, two new subword speech units are introduced: function-word-dependent phone models and generalized triphone models. With grammars of perplexity 997, 60, and 20, SPHINX attained word accuracies of 71%, 94%, and 96%, respectively, on a 997-word task.


Notes

Note: see also IEEE Transactions on Signal Processing


Text Reference

K. Lee, H. Hon, and R. Reddy, "An overview of the SPHINX speech recognition system," IEEE Transactions on Acoustics, Speech and Signal Processing, Vol. 38, No. 1, January 1990, pp. 35-45.


BibTeX Reference

@article{Lee_1990_3707,
   author = "K.-F. Lee and H.-W. Hon and Raj Reddy",
   title = "An overview of the SPHINX speech recognition system",
   journal = "IEEE Transactions on Acoustics, Speech and Signal Processing",
   month = "January",
   year = "1990",
   volume = "38",
   number = "1",
   pages = "35--45",
   note = "see also IEEE Transactions on Signal Processing"
}


The Robotics Institute is part of the School of Computer Science, Carnegie Mellon University.