SOLAR: Sound Object Localization and Retrieval in Complex Audio Environments - Robotics Institute Carnegie Mellon University

Conference Paper, Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), Vol. 5, pp. 429-432, March 2005

Abstract

The ability to identify sounds in complex audio environments is highly useful for multimedia retrieval, security, and many mobile robotic applications, but very little work has been done in this area. We present SOLAR, a system capable of finding sound objects, such as dog barks or car horns, in complex audio data extracted from movies. SOLAR avoids the need for segmentation by scanning over the audio data in fixed increments and classifying each short audio window separately. It employs boosted decision tree classifiers both to select suitable features for modeling each sound object and to discriminate between the object of interest and all other sounds. We demonstrate the effectiveness of our approach with experiments on thirteen sound object classes, each trained using only tens of positive examples and tested on hours of audio data extracted from popular movies.
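
The scanning step described in the abstract (fixed increments, each short window classified on its own) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `scan_windows` and `mean_energy`, the window and hop sizes, and the threshold are all assumptions made for illustration, and the simple energy score only stands in for the boosted decision tree classifiers that the paper uses.

```python
from typing import Callable, List, Sequence, Tuple


def scan_windows(
    samples: Sequence[float],
    window: int,
    hop: int,
    score_fn: Callable[[Sequence[float]], float],
    threshold: float,
) -> List[Tuple[int, float]]:
    """Return (start_index, score) for each window scoring above threshold."""
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    hits: List[Tuple[int, float]] = []
    # Step through the signal in fixed increments; no segmentation is attempted.
    for start in range(0, len(samples) - window + 1, hop):
        score = score_fn(samples[start:start + window])
        if score > threshold:
            hits.append((start, score))
    return hits


def mean_energy(frame: Sequence[float]) -> float:
    # Placeholder scorer (an assumption): a trained per-class classifier
    # would be plugged in here instead.
    return sum(x * x for x in frame) / len(frame)


# Toy signal: silence, a short loud burst, then silence again.
signal = [0.0] * 8 + [1.0] * 4 + [0.0] * 8
detections = scan_windows(
    signal, window=4, hop=2, score_fn=mean_energy, threshold=0.9
)
print(detections)
```

Only the window that lines up exactly with the burst clears the threshold here; windows that overlap it partially score lower. In a real detector, the hop size would trade off localization precision against the number of windows to classify.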

BibTeX

@conference{Hoiem-2005-9124,
author = {Derek Hoiem and Yan Ke and Rahul Sukthankar},
title = {SOLAR: Sound Object Localization and Retrieval in Complex Audio Environments},
booktitle = {Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05)},
year = {2005},
month = {March},
volume = {5},
pages = {429--432},
keywords = {sound detection, sound retrieval},
}