SOLAR: Sound Object Localization and Retrieval in Complex Audio Environments

Derek Hoiem, Yan Ke, and Rahul Sukthankar
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March, 2005, pp. 429 - 432.


Download
  • Adobe portable document format (pdf) (266KB)
Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Abstract
The ability to identify sounds in complex audio environments is highly useful for multimedia retrieval, security, and many mobile robotic applications, but very little work has been done in this area. We present the SOLAR system, a system capable of finding sound objects, such as dog barks or car horns, in complex audio data extracted from movies. SOLAR avoids the need for segmentation by scanning over the audio data in fixed increments and classifying each short audio window separately. SOLAR employs boosted decision tree classifiers to select suitable features for modeling each sound object and to discriminate between the object of interest and all other sounds. We demonstrate the effectiveness of our approach with experiments on thirteen sound object classes trained using only tens of positive examples and tested on hours of audio data extracted from popular movies.
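
The scanning step described in the abstract (stepping over the audio in fixed increments and classifying each short window on its own) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names scan_windows and energy_above, the window and hop sizes, and the toy signal are all hypothetical, and a simple mean-energy threshold stands in for the boosted decision tree classifier that the paper actually uses.

```python
from typing import Callable, List, Sequence, Tuple


def scan_windows(
    samples: Sequence[float],
    window_len: int,
    hop: int,
    classify: Callable[[Sequence[float]], bool],
) -> List[Tuple[int, int]]:
    """Slide a fixed-length window over `samples` in steps of `hop` and
    return the (start, end) sample spans that `classify` flags as positive.

    No segmentation is performed: every window is classified independently.
    """
    if window_len <= 0 or hop <= 0:
        raise ValueError("window_len and hop must be positive")
    hits = []
    for start in range(0, len(samples) - window_len + 1, hop):
        window = samples[start:start + window_len]
        if classify(window):
            hits.append((start, start + window_len))
    return hits


def energy_above(threshold: float) -> Callable[[Sequence[float]], bool]:
    """Hypothetical stand-in classifier: fires when the mean energy of the
    window exceeds `threshold`. A trained per-class classifier would be
    plugged in here instead."""
    def classify(window: Sequence[float]) -> bool:
        return sum(x * x for x in window) / len(window) > threshold
    return classify


# Toy signal: silence, a short burst, then silence again.
signal = [0.0] * 8 + [1.0] * 4 + [0.0] * 8
detections = scan_windows(signal, window_len=4, hop=2, classify=energy_above(0.4))
print(detections)
```

Because the hop is smaller than the window length in this sketch, overlapping windows around the burst are all flagged; a practical system would likely merge or rank such overlapping detections, a step this sketch leaves out.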

Keywords
sound detection, sound retrieval

Notes
Sponsor: Intel Research Pittsburgh
Associated Center(s) / Consortia: Vision and Autonomous Systems Center
Associated Lab(s) / Group(s): Face Group
Number of pages: 4

Text Reference
Derek Hoiem, Yan Ke, and Rahul Sukthankar, "SOLAR: Sound Object Localization and Retrieval in Complex Audio Environments," IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), March, 2005, pp. 429 - 432.

BibTeX Reference
@inproceedings{Hoiem_2005_5006,
   author = "Derek Hoiem and Yan Ke and Rahul Sukthankar",
   title = "SOLAR: Sound Object Localization and Retrieval in Complex Audio Environments",
   booktitle = "IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)",
   pages = "429--432",
   month = "March",
   year = "2005",
   volume = "5",
}