Carnegie Mellon Robotics Institute
Young-Woo Seo, Joseph Andrew Giampapa, and Katia Sycara
tech. report CMU-RI-TR-02-14, Robotics Institute, Carnegie Mellon University, May, 2002
| Abstract |
| In the application domain of stock portfolio management, software agents that evaluate the risks associated with the individual companies of a portfolio should be able to read electronic news articles that are written to give investors an indication of the financial outlook of a company. There is a positive correlation between news reports on a company's financial outlook and the company's attractiveness as an investment. However, because of the volume of such reports, it is impossible for financial analysts or investors to track and read each one. Therefore, it would be very helpful to have a system that automatically classifies news reports that reflect positively or negatively on a company's financial outlook. To accomplish this task, we treat the understanding of news articles as a text classification problem. In this paper, we propose a text classification method that we call "Domain Experts" and "Self-Confident" sampling, and compare it with naive Bayes with expectation maximization (EM). We evaluate these learning techniques in terms of how well they improve with unlabeled data after being initially trained on a small number of human-labeled articles and how well they classify the latest financial news articles. The significance of this work lies in the new classification method that we propose and in the sampling technique we used for improving classification accuracy. |
| Keywords |
| Text Classification, Sampling technique of unlabeled data |
| Notes |
Sponsor: DARPA
Associated Center(s) / Consortia: Center for Integrated Manufacturing Decision Systems
Associated Lab(s) / Group(s): Advanced Agent - Robotics Technology Lab
Associated Project(s): WARREN and Text Miner |
| Text Reference |
| Young-Woo Seo, Joseph Andrew Giampapa, and Katia Sycara, "Text Classification for Intelligent Portfolio Management," tech. report CMU-RI-TR-02-14, Robotics Institute, Carnegie Mellon University, May, 2002 |
| BibTeX Reference |
|
@techreport{Seo_2002_3976,
  author = "Young-Woo Seo and Joseph Andrew Giampapa and Katia Sycara",
  title = "Text Classification for Intelligent Portfolio Management",
  booktitle = "",
  institution = "Robotics Institute",
  month = "May",
  year = "2002",
  number = "CMU-RI-TR-02-14",
  address = "Pittsburgh, PA",
} |
| The Robotics Institute is part of the School of Computer Science, Carnegie Mellon University. |