Carnegie Mellon Robotics Institute
Young-Woo Seo and Katia Sycara
tech. report CMU-RI-TR-04-03, Robotics Institute, Carnegie Mellon University, January, 2004
| Abstract |
| The World Wide Web holds a vast store of information, but its sheer volume makes it practically impossible for any human user to be aware of much of it. A system that automatically discovers relevant yet previously unknown information and reports it to users in human-readable form would therefore be very helpful. As a first step toward this goal, we propose a new clustering algorithm and compare it with existing clustering algorithms. The proposed method is motivated by constructive and competitive learning from neural network research. In the construction phase, it tries to find the optimal number of clusters by adding a new cluster whenever an intrinsic difference between the presented instance and the existing clusters is detected. Each cluster then moves toward the optimal cluster center, at a rate set by the learning rate, by adjusting its weight vector. In experiments on three different real-world data sets, the proposed method performs evenly across the different domains, and its performance on text domains is better than that reported in previous research. |
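The two phases described in the abstract can be illustrated with a minimal sketch. This is not the report's actual algorithm: the abstract does not say how the "intrinsic difference" is measured, so the cosine-similarity test, the `threshold` and `learning_rate` values, and every function name below are assumptions chosen for illustration only.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def constructive_competitive_clustering(instances, threshold=0.5,
                                        learning_rate=0.1, epochs=5):
    """Hypothetical sketch: grow clusters on demand, then nudge the winner.

    Assumption: an instance is "intrinsically different" when its best
    cosine similarity to any existing cluster falls below `threshold`.
    """
    centers = []  # one weight vector per cluster
    for _ in range(epochs):
        for x in instances:
            if centers:
                sims = [cosine(x, c) for c in centers]
                winner = max(range(len(centers)), key=sims.__getitem__)
                best = sims[winner]
            else:
                winner, best = -1, -1.0
            if best < threshold:
                # Constructive step: no existing cluster is close enough.
                centers.append(list(x))
            else:
                # Competitive step: only the winning cluster moves toward x.
                c = centers[winner]
                centers[winner] = [ci + learning_rate * (xi - ci)
                                   for ci, xi in zip(c, x)]
    # Final assignment: each instance goes to its most similar cluster.
    labels = [max(range(len(centers)), key=lambda k: cosine(x, centers[k]))
              for x in instances]
    return centers, labels


# Toy term-weight vectors: two documents per topic (made-up data).
docs = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.0, 0.9, 0.1]]
centers, labels = constructive_competitive_clustering(docs)
print(len(centers), labels)
```

In this sketch the number of clusters is never fixed in advance; it emerges from the threshold, so a lower threshold merges more instances into existing clusters and a higher one creates more clusters.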
| Keywords |
| text clustering, topic detection, information retrieval, machine learning, artificial intelligence |
| Notes |
| Sponsor: AFOSR; Grant ID: F49620-01-1-0542 |
| Text Reference |
| Young-Woo Seo and Katia Sycara, "Text Clustering for Topic Detection," tech. report CMU-RI-TR-04-03, Robotics Institute, Carnegie Mellon University, January, 2004 |
| BibTeX Reference |
|
@techreport{Seo_2004_4569,
  author = "Young-Woo Seo and Katia Sycara",
  title = "Text Clustering for Topic Detection",
  booktitle = "",
  institution = "Robotics Institute",
  month = "January",
  year = "2004",
  number = "CMU-RI-TR-04-03",
  address = "Pittsburgh, PA",
} |