The Importance of Simplicity and Validation in Genetic Programming for Data Mining in Financial Data

James Thomas and Katia Sycara
Proceedings of the joint AAAI-1999 and GECCO-1999 Workshop on Data Mining with Evolutionary Algorithms, July, 1999.


Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.

Abstract
A genetic programming system for data mining trading rules out of past foreign exchange data is described. The system is tested on real data from the dollar/yen and dollar/DM markets, and shown to produce considerable excess returns in the dollar/yen market. Design issues relating to potential rule complexity and validation regimes are explored empirically. Keeping potential rules as simple as possible is shown to be the most important component of success. Validation issues are more complicated. Inspection of fitness on a validation set is used to cut off search in hopes of avoiding overfitting. Additional attempts to use the validation set to improve performance are shown to be ineffective in the standard framework. An examination of correlations between performance on the validation set and on the test set leads to an understanding of how such measures can be marginally beneficial; unfortunately, this suggests that further attempts to improve performance through validation will be difficult.
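
The validation cutoff mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' system: the function name `search_with_validation_cutoff`, the `patience` parameter, and the toy scores are all assumptions made for illustration. The sketch only shows the general idea of halting a search once fitness on a held-out validation set stops improving.

```python
from typing import Callable, Iterable, Optional, Tuple, TypeVar

Rule = TypeVar("Rule")


def search_with_validation_cutoff(
    generations: Iterable[Rule],
    validation_fitness: Callable[[Rule], float],
    patience: int = 3,
) -> Tuple[Optional[Rule], float, int]:
    """Scan best-of-generation rules and stop when validation fitness stalls.

    `generations` yields one candidate rule per generation (hypothetical
    interface). The search is cut off after `patience` consecutive
    generations without a new best validation score.
    Returns (best rule, its validation fitness, generations examined).
    """
    best_rule: Optional[Rule] = None
    best_score = float("-inf")
    stale = 0      # generations since the last validation improvement
    examined = 0
    for rule in generations:
        examined += 1
        score = validation_fitness(rule)
        if score > best_score:
            best_rule, best_score, stale = rule, score, 0
        else:
            stale += 1
            if stale >= patience:
                break  # cut off the search to limit overfitting
    return best_rule, best_score, examined


# Toy usage: rules are just generation indices, with made-up validation scores.
toy_scores = [0.1, 0.3, 0.5, 0.4, 0.45, 0.2, 0.9]
best, score, seen = search_with_validation_cutoff(
    range(len(toy_scores)), lambda i: toy_scores[i], patience=3
)
print(best, score, seen)
```

In the toy run the search stops after generation 5 and never reaches the higher score at generation 6, which illustrates one way a fixed cutoff rule can miss later improvements.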

Text Reference
James Thomas and Katia Sycara, "The Importance of Simplicity and Validation in Genetic Programming for Data Mining in Financial Data," Proceedings of the joint AAAI-1999 and GECCO-1999 Workshop on Data Mining with Evolutionary Algorithms, July, 1999.

BibTeX Reference
@inproceedings{Sycara_1999_3327,
   author = "James Thomas and Katia Sycara",
   title = "The Importance of Simplicity and Validation in Genetic Programming for Data Mining in Financial Data",
   booktitle = "Proceedings of the joint AAAI-1999 and GECCO-1999 Workshop on Data Mining with Evolutionary Algorithms",
   month = "July",
   year = "1999",
}