KIDS: A database of children's speech - Robotics Institute Carnegie Mellon University

Journal Article, Journal of the Acoustical Society of America, Vol. 100, No. 4, October, 1996

Abstract

We have collected a database of children reading age- and reading-level-appropriate text aloud. This labelled data, to be distributed in the near future, was primarily intended for use in CMU's LISTEN tutor, which employs speech recognition to monitor children's reading and then help correct errors. The speaker population was therefore chosen to represent good and poor readers and to incorporate the dialects of the speakers for whom the reading coach is intended. Phonemic balance could not be achieved (although it has been calculated), since the primary concern in recording children reading is to present sentences that first through third graders can read effectively. The text is a series of sentences we adapted from the Weekly Reader series; most of the adaptation concerned the lack of the accompanying images. The text was chosen for its intrinsic interest and widespread use. Several trial recording sessions allowed us to develop a protocol that kept extraneous noises produced by the children to a minimum. We will discuss this and other problems inherent in recording children reading. Novel techniques developed for labelling this kind of speech will also be presented.

BibTeX

@article{Mostow-1996-14274,
author = {Jack Mostow},
title = {KIDS: A database of children's speech},
journal = {Journal of the Acoustical Society of America},
year = {1996},
month = {October},
volume = {100},
number = {4},
}