Carnegie Mellon Robotics Institute
Matthew Mullin and Rahul Sukthankar
Proceedings of the International Conference on Machine Learning, June, 2000.
| Abstract |
| Cross-validation is an established technique for estimating the accuracy of a classifier and is normally performed either using a number of random test/train partitions of the data, or using k-fold cross-validation. We present a technique for calculating the complete cross-validation for nearest-neighbor classifiers: i.e., averaging over all desired test/train partitions of data. This technique is applied to several common classifier variants such as K-nearest-neighbor, stratified data partitioning and arbitrary loss functions. We demonstrate, with complexity analysis and experimental timing results, that the technique can be performed in time comparable to k-fold cross-validation, though in effect it averages an exponential number of trials. We show that the results of complete cross-validation are biased equally compared to subsampling and k-fold cross-validation, and there is some reduction in variance. This algorithm offers significant benefits both in terms of time and accuracy. |
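The abstract's central claim, that an exponential number of test/train partitions can be averaged without enumerating them, can be illustrated for the simplest case of a 1-nearest-neighbor classifier under 0/1 loss. The sketch below is not taken from the paper and may differ from the authors' formulation. It assumes no ties in distance, squared Euclidean distance, and a fixed training-set size `n_train`; all function names are hypothetical. For a test point, its r-th closest other point is the nearest training neighbor exactly when the r closer points are all left out of the training set, which can be counted with binomial coefficients. A brute-force enumeration is included only to check the closed form on small inputs.

```python
from itertools import combinations
from math import comb


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def complete_cv_error_1nn(points, labels, n_train):
    """Mean 1-NN error over every split with n_train training points.

    Illustrative sketch: assumes 0/1 loss and no ties in distance.
    """
    n = len(points)
    if not 1 <= n_train < n:
        raise ValueError("n_train must satisfy 1 <= n_train < len(points)")
    # Number of training sets that can accompany a given test point.
    denom = comb(n - 1, n_train)
    total = 0.0
    for i in range(n):
        # All other points, ordered from closest to farthest from point i.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: sq_dist(points[i], points[j]),
        )
        for rank, j in enumerate(others):
            if labels[j] == labels[i]:
                continue
            # j is the nearest training point when the `rank` closer points
            # are excluded and the other n_train - 1 training points come
            # from the n - 2 - rank points that are farther away.
            total += comb(n - 2 - rank, n_train - 1) / denom
    return total / n


def brute_force_cv_error_1nn(points, labels, n_train):
    """Same quantity by explicit enumeration (exponential; for checking only)."""
    n = len(points)
    errors = trials = 0
    for train in combinations(range(n), n_train):
        train_set = set(train)
        for i in range(n):
            if i in train_set:
                continue
            nearest = min(train, key=lambda j: sq_dist(points[i], points[j]))
            errors += labels[nearest] != labels[i]
            trials += 1
    return errors / trials
```

The closed form needs one sort per point, so its cost is polynomial in the number of points, whereas the brute-force check visits every training subset. Extensions mentioned in the abstract (K-nearest-neighbor, stratified partitioning, arbitrary loss functions) are not covered by this sketch.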
| Keywords |
| machine learning |
| Notes |
Associated Center(s) / Consortia:
Vision and Autonomous Systems Center |
| Text Reference |
| Matthew Mullin and Rahul Sukthankar, "Complete Cross-Validation for Nearest Neighbor Classifiers," Proceedings of the International Conference on Machine Learning, June, 2000. |
| BibTeX Reference |
|
@inproceedings{Sukthankar_2000_3394,
  author = "Matthew Mullin and Rahul Sukthankar",
  title = "Complete Cross-Validation for Nearest Neighbor Classifiers",
  booktitle = "Proceedings of the International Conference on Machine Learning",
  month = "June",
  year = "2000",
} |
| The Robotics Institute is part of the School of Computer Science, Carnegie Mellon University. |