An Adaptive Approach to Named Entity Extraction for Meeting Application

Fei Huang and Alex Waibel

Conference Paper, Proceedings of 2nd International Conference on Human Language Technology Research (HLT '02), pp. 165-170, March 2002


Abstract

Named entity extraction has been intensively investigated in the past several years. Both statistical and rule-based approaches have achieved satisfactory performance on regular written and spoken language. However, when these methods are applied to highly informal or ungrammatical language, such as meeting speech, their performance decreases significantly because of the many mismatches in language genre. In this paper we propose an adaptive method of named entity extraction for meeting understanding. The method combines a statistical model trained on broadcast news data with a cache model built online for ambiguous words, computes the global-context name class probability of those words from their local-context name class probabilities, and integrates name-list information from meeting profiles. This fusion of supervised and unsupervised learning improves named entity extraction for meeting applications. When evaluated on manual meeting transcripts, the proposed method demonstrates a 26.07% improvement over the baseline model. Its performance is also comparable to that of a statistical model trained on a small annotated meeting corpus. We are currently applying the proposed method to automatic meeting transcripts.
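To make the cache idea concrete, the following is a minimal sketch of how local-context name class probabilities might be pooled into one global-context estimate per word, with a name list taking precedence. It is an illustration only, not the authors' implementation: the abstract does not give the combination formula, so the simple averaging used here, the class inventory `NAME_CLASSES`, and the functions `global_class_probs` and `relabel` are all assumptions.

```python
from collections import defaultdict

# Hypothetical class inventory; the abstract does not list the paper's classes.
NAME_CLASSES = ("PERSON", "LOCATION", "ORGANIZATION", "OTHER")


def global_class_probs(local_probs):
    """Pool per-occurrence (local) class distributions into one per word.

    local_probs: list of (word, {class: probability}) pairs, one per occurrence.
    Averaging is an assumption made for this sketch, not the paper's formula.
    """
    sums = defaultdict(lambda: dict.fromkeys(NAME_CLASSES, 0.0))
    counts = defaultdict(int)
    for word, dist in local_probs:
        key = word.lower()
        counts[key] += 1
        for cls in NAME_CLASSES:
            sums[key][cls] += dist.get(cls, 0.0)
    return {
        word: {cls: total[cls] / counts[word] for cls in NAME_CLASSES}
        for word, total in sums.items()
    }


def relabel(local_probs, name_list=None):
    """Assign each occurrence the class with the highest pooled probability.

    name_list: optional {word: class} mapping standing in for the name lists
    from meeting profiles; entries here override the pooled estimate.
    """
    names = {word.lower(): cls for word, cls in (name_list or {}).items()}
    cache = global_class_probs(local_probs)
    labels = []
    for word, _ in local_probs:
        key = word.lower()
        if key in names:
            labels.append(names[key])
        else:
            dist = cache[key]
            labels.append(max(dist, key=dist.get))
    return labels
```

In this sketch, a word that is labeled confidently in one context can pull its weaker occurrences toward the same class, which is the intuition behind using global context for ambiguous words.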

BibTeX

@conference{Huang-2002-8398,
author = {Fei Huang and Alex Waibel},
title = {An Adaptive Approach to Named Entity Extraction for Meeting Application},
booktitle = {Proceedings of 2nd International Conference on Human Language Technology Research (HLT '02)},
year = {2002},
month = {March},
pages = {165--170},
}
