Text Classification for Intelligent Portfolio Management

Young-Woo Seo, Joseph Andrew Giampapa, and Katia Sycara
tech. report CMU-RI-TR-02-14, Robotics Institute, Carnegie Mellon University, May 2002



Abstract
In the application domain of stock portfolio management, software agents that evaluate the risks associated with the individual companies of a portfolio should be able to read electronic news articles that are written to give investors an indication of the financial outlook of a company. There is a positive correlation between news reports on a company's financial outlook and the company's attractiveness as an investment. However, because of the volume of such reports, it is impossible for financial analysts or investors to track and read each one. Therefore, it would be very helpful to have a system that automatically classifies news reports by whether they reflect positively or negatively on a company's financial outlook. To accomplish this task, we treat the understanding of news articles as a text classification problem. In this paper, we propose a text classification method that we call "Domain Experts" and "Self-Confident" sampling, and compare it with naive Bayes with expectation maximization (EM). We evaluate these learning techniques in terms of how well they improve with unlabeled data after being initially trained on a small number of human-labeled articles, and how well they classify the latest financial news articles. The significance of this work lies in the new classification method that we propose and in the sampling technique that we used to improve classification accuracy.

Keywords
Text Classification, Sampling technique of unlabeled data

Notes
Sponsor: DARPA
Associated Center(s) / Consortia: Center for Integrated Manufacturing Decision Systems
Associated Lab(s) / Group(s): Advanced Agent - Robotics Technology Lab
Associated Project(s): WARREN and Text Miner

Text Reference
Young-Woo Seo, Joseph Andrew Giampapa, and Katia Sycara, "Text Classification for Intelligent Portfolio Management," tech. report CMU-RI-TR-02-14, Robotics Institute, Carnegie Mellon University, May 2002

BibTeX Reference
@techreport{Seo_2002_3976,
   author = "Young-Woo Seo and Joseph Andrew Giampapa and Katia Sycara",
   title = "Text Classification for Intelligent Portfolio Management",
   institution = "Robotics Institute",
   month = "May",
   year = "2002",
   number = "CMU-RI-TR-02-14",
   address = "Pittsburgh, PA"
}