Efficient Algorithms for Minimizing Cross Validation Error - Robotics Institute Carnegie Mellon University

Andrew Moore and M. S. Lee
Conference Paper, Proceedings of the International Conference on Machine Learning (ICML), pp. 190-198, July 1994

Abstract

Model selection is important in many areas of supervised learning. Given a dataset and a set of models for predicting with that dataset, we must choose the model which is expected to best predict future data. In some situations, such as online learning for control of robots or factories, data is cheap and human expertise costly. Cross validation can then be a highly effective method for automatic model selection. Large scale cross validation search can, however, be computationally expensive. This paper introduces new algorithms to reduce the computational burden of such searches. We show how experimental design methods can achieve this, using a technique similar to a Bayesian version of Kaelbling's Interval Estimation. Several improvements are then given, including (1) the use of blocking to quickly spot near-identical models, and (2) schemata search: a new method for quickly finding families of relevant features. Experiments are presented for robot data and noisy synthetic datasets. The new algorithms speed up computation without sacrificing reliability, and in some cases are more reliable than conventional techniques.
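The core idea of abandoning weak models before their cross validation score has been fully computed can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a simple normal-approximation confidence interval on each model's mean leave-one-out error in place of the Bayesian interval estimate, and it omits blocking and schemata search entirely. The names `knn_predict`, `race`, `z`, and `min_points`, the k-nearest-neighbor models, and the synthetic sine data are all hypothetical choices made for illustration.

```python
import math
import random


def knn_predict(train, x, k):
    """Predict y at x as the mean target of the k nearest training points."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def race(data, ks, z=2.0, min_points=10):
    """Race k-NN models on leave-one-out errors, dropping clear losers early.

    Assumption: intervals are mean +/- z * standard error (a normal
    approximation), not the Bayesian estimate described in the abstract.
    Returns (sorted surviving ks, number of model evaluations performed).
    """
    # k -> [count, sum of squared errors, sum of squared errors squared]
    stats = {k: [0, 0.0, 0.0] for k in ks}
    alive = set(ks)
    evaluations = 0
    for i, (x, y) in enumerate(data):
        train = data[:i] + data[i + 1:]  # leave point i out
        for k in alive:
            err = (knn_predict(train, x, k) - y) ** 2
            s = stats[k]
            s[0] += 1
            s[1] += err
            s[2] += err * err
            evaluations += 1
        if i + 1 < min_points or len(alive) == 1:
            continue
        # Interval on each surviving model's mean error so far.
        bounds = {}
        for k in alive:
            n, total, total_sq = stats[k]
            mean = total / n
            var = max(total_sq / n - mean * mean, 0.0)
            half = z * math.sqrt(var / n)
            bounds[k] = (mean - half, mean + half)
        # Drop any model whose lower bound lies above the best upper bound.
        best_upper = min(hi for _, hi in bounds.values())
        alive = {k for k in alive if bounds[k][0] <= best_upper}
    return sorted(alive), evaluations


# Usage on noisy synthetic data (hypothetical example, fixed seed).
rng = random.Random(0)
data = []
for _ in range(200):
    x = rng.uniform(0.0, 6.0)
    data.append((x, math.sin(x) + rng.gauss(0.0, 0.1)))

ks = [1, 3, 5, 9, 100]
survivors, evaluations = race(data, ks)
print("surviving k values:", survivors)
print("evaluations:", evaluations, "of", len(data) * len(ks))
```

In this sketch a model stops consuming computation as soon as its interval separates from the current leader's, so the number of evaluations can fall below the full `len(data) * len(ks)` that exhaustive leave-one-out search would need. Models with near-identical error are never separated this way, which is the situation the abstract's blocking improvement targets.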

BibTeX

@conference{Moore-1994-16060,
author = {Andrew Moore and M. S. Lee},
title = {Efficient Algorithms for Minimizing Cross Validation Error},
booktitle = {Proceedings of (ICML) International Conference on Machine Learning},
year = {1994},
month = {July},
pages = {190--198},
publisher = {Morgan Kaufmann},
}