Error-Responsive Feedback Mechanisms for Speech Recognizers

Lin Chase
doctoral dissertation, tech. report CMU-RI-TR-97-18, Robotics Institute, Carnegie Mellon University, April, 1997


Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Abstract
This thesis is about modeling, analyzing, and predicting errorful behavior in large vocabulary continuous speech recognition systems. Because today's state-of-the-art recognizers are not designed to be situated naturally in an error feedback loop, they are ill-positioned for inclusion in multi-modal interfaces, multi-media databases, and other interesting applications. I improve on the current approach to predicting and analyzing error behaviors, which is based only on the measurement of word error rate.
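
For readers unfamiliar with the word error rate mentioned above, the following is a minimal sketch of how it is commonly computed: the word-level edit distance between a reference transcript and the recognizer's hypothesis, divided by the number of reference words. The function name and the whitespace tokenization are assumptions made for illustration and are not taken from the thesis.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length.

    Illustrative only: tokenization is a plain whitespace split.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Here the hypothesis contains one substitution and one deletion against a six-word reference. A single number of this kind says how many words were wrong but not which ones or why, which is the limitation the abstract points to.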

The speech recognizer's functionality is extended to include confidence annotations, which are "meta-level" markings that indicate how certain the recognizer is that it has decoded its input correctly. This is accomplished by feeding externally defined error conditions back to the recognizer. Error feedback enables the construction of statistical models that map measurements of the recognizer's internal states and behaviors to externally defined error conditions.
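
As a rough illustration of the idea in the paragraph above, the sketch below fits a small logistic regression that maps per-word measurements to a probability that the word was decoded correctly. Everything in it is an assumption made for illustration: the abstract does not say which statistical model or which measurements the thesis uses, and the two features (a normalized acoustic score and a normalized language model score), the toy data, and the function names are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_confidence_model(features, labels, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent.

    features: list of per-word measurement vectors (hypothetical).
    labels:   1 if the word was decoded correctly, 0 otherwise; these
              play the role of the externally defined error conditions.
    """
    n_feat = len(features[0])
    n = len(features)
    weights = [0.0] * n_feat
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_feat
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y
            for k in range(n_feat):
                grad_w[k] += err * x[k]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def confidence(weights, bias, x):
    """Estimated probability that a word with measurements x is correct."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Toy data: [normalized acoustic score, normalized language model score].
train_x = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.6],
           [0.2, 0.3], [0.3, 0.1], [0.1, 0.4]]
train_y = [1, 1, 1, 0, 0, 0]

weights, bias = train_confidence_model(train_x, train_y)
high = confidence(weights, bias, [0.85, 0.85])
low = confidence(weights, bias, [0.15, 0.2])
```

The value returned by `confidence` could then be attached to each decoded word as its annotation. Logistic regression is chosen here only because it is a simple stand-in for a model that outputs a probability.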

The measuring and modeling techniques used for confidence annotation are extended to create a blame assignment system for utterances whose actual transcripts are known. Errors are classified into a set of categories, some of which are directly useful in automatic adaptation schemes while others are more suited for human interpretation.

This classification approach is enhanced when used in conjunction with a visual error analysis tool that was developed during the thesis project.


Notes
Sponsor: NSA
Grant ID: MDA904-96-1-0113, MDA904-97-1-0006
Number of pages: 288

Text Reference
Lin Chase, "Error-Responsive Feedback Mechanisms for Speech Recognizers," doctoral dissertation, tech. report CMU-RI-TR-97-18, Robotics Institute, Carnegie Mellon University, April, 1997

BibTeX Reference
@phdthesis{Chase_1997_444,
   author = "Lin Chase",
   title = "Error-Responsive Feedback Mechanisms for Speech Recognizers",
   school = "Robotics Institute, Carnegie Mellon University",
   month = "April",
   year = "1997",
   number = "CMU-RI-TR-97-18",
   address = "Pittsburgh, PA",
}