Text, Speech, and Vision for Video Segmentation: The Informedia Project - The Robotics Institute Carnegie Mellon University

Alex Hauptmann and Michael Smith
Conference Paper, Proceedings of AAAI Fall '95 Symposium on Computational Models for Integrating Language and Vision, pp. 10-12, November 1995

Abstract

We describe three technologies involved in creating a digital video library suitable for full-content search and retrieval. Image processing analyzes scenes, speech processing transcribes the audio signal, and natural language processing determines word relevance. The integration of these technologies enables us to include vast amounts of video data in the library.

BibTeX

@conference{Hauptmann-1995-16187,
author = {Alex Hauptmann and Michael Smith},
title = {Text, Speech, and Vision for Video Segmentation: The Informedia Project},
booktitle = {Proceedings of AAAI Fall '95 Symposium on Computational Models for Integrating Language and Vision},
year = {1995},
month = {November},
pages = {10--12},
}