Feature Selection for Extracting Semantically Rich Words

Young-Woo Seo, Anupriya Ankolekar, and Katia Sycara
Tech. Report, CMU-RI-TR-04-18, Robotics Institute, Carnegie Mellon University, March 2004

Abstract

The utility of semantic knowledge, in the form of ontologies, is widely acknowledged. In particular, semantic knowledge facilitates the integration, visualization, and maintenance of information from various sources. However, most previous work in this field has tried to learn ontologies for relatively constrained domains. In other words, there has been relatively little work to date on constructing ontologies for an open domain, where the need for such ontologies is enormous. Moreover, few studies have empirically examined the value of text learning techniques for extracting candidate words for the concept words in a domain ontology. The goal of this work is to examine the usefulness of existing feature selection methods for extracting a set of good candidate words for concept words in an ontology. Our experimental results show that existing word feature selection methods are quite useful for ontology learning, in that there is a good overlap between the word sets identified by feature selection methods and the words in a manually built domain ontology. Finally, drawing on our experience with this work, we enumerate the desiderata for a domain ontology learning system.
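The evaluation idea in the abstract (select words with a feature selection method, then measure their overlap with a manually built ontology) can be illustrated with a minimal sketch. The abstract does not name the specific methods used in the report, so the chi-square statistic below is an assumption, chosen only because it is a common word feature selection criterion. The function names, the toy documents, the class labels, and the ontology word set are all hypothetical.

```python
from collections import Counter


def chi_square_scores(docs, labels):
    """Score each word by the chi-square statistic of its 2x2 table
    (word present/absent vs. document in class/not in class),
    taking the maximum over classes. Assumed criterion, not the report's."""
    n = len(docs)
    doc_sets = [set(d.lower().split()) for d in docs]
    vocab = set().union(*doc_sets)
    class_counts = Counter(labels)
    scores = {}
    for word in vocab:
        best = 0.0
        for cls, n_cls in class_counts.items():
            a = sum(1 for s, y in zip(doc_sets, labels) if word in s and y == cls)
            b = sum(1 for s, y in zip(doc_sets, labels) if word in s and y != cls)
            c = n_cls - a            # word absent, document in class
            d = (n - n_cls) - b      # word absent, document not in class
            denom = (a + b) * (c + d) * (a + c) * (b + d)
            if denom:
                best = max(best, n * (a * d - b * c) ** 2 / denom)
        scores[word] = best
    return scores


def top_k(scores, k):
    """Return the k highest-scoring words, ties broken alphabetically."""
    return [w for w, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]


def overlap(selected, ontology_words):
    """Precision and recall of the selected words against the ontology words."""
    hits = set(selected) & set(ontology_words)
    return len(hits) / len(selected), len(hits) / len(ontology_words)


# Hypothetical toy corpus and hand-built concept words, for illustration only.
docs = [
    "the flight departs from the airport",
    "the airline cancelled the flight",
    "the hotel room has the view",
    "the hotel offers the room upgrade",
]
labels = ["transport", "transport", "lodging", "lodging"]
ontology_words = {"flight", "hotel", "airport", "room"}

selected = top_k(chi_square_scores(docs, labels), 3)
precision, recall = overlap(selected, ontology_words)
print(selected, precision, recall)
```

In this toy setting a word such as "the", which occurs in every document, scores zero, while words that occur only in one class score highest. The overlap figures are a property of the made-up data and say nothing about the report's actual results.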

BibTeX

@techreport{Seo-2004-8868,
author = {Young-Woo Seo and Anupriya Ankolekar and Katia Sycara},
title = {Feature Selection for Extracting Semantically Rich Words},
year = {2004},
month = {March},
institution = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-04-18},
keywords = {ontology learning, text learning, feature selection, machine learning},
}