Carnegie Mellon University
Kernel Conjugate Gradient for Fast Kernel Machines

Nathan Ratliff and J. Andrew (Drew) Bagnell
International Joint Conference on Artificial Intelligence, January, 2007.


We propose a novel variant of the conjugate gradient algorithm, Kernel Conjugate Gradient (KCG), designed to speed up learning for kernel machines with differentiable loss functions. This approach leads to a better conditioned optimization problem during learning. We establish an upper bound on the number of iterations for KCG that indicates it should require fewer than the square root of the number of iterations that standard conjugate gradient requires. In practice, for various differentiable kernel learning problems, we find that KCG consistently and significantly outperforms existing techniques. The algorithm is simple to implement, requires no more computation per iteration than standard approaches, and is well motivated by Reproducing Kernel Hilbert Space (RKHS) theory. We further show that data-structure techniques recently used to speed up kernel machine approaches are well matched to the algorithm because they reduce the dominant costs of training: function evaluation and RKHS inner product computation.
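The abstract does not spell out the update rules, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation. It assumes a squared loss with an RKHS-norm regularizer (kernel ridge regression), a Fletcher-Reeves update, and an exact line search; the names `gaussian_kernel` and `kcg_ridge` are hypothetical. The point it illustrates is that the search works with kernel-gradient coefficients and measures all inner products in the RKHS, i.e. as u^T K v rather than u^T v.

```python
import numpy as np

def gaussian_kernel(X, Z, bandwidth=0.5):
    """Gaussian (RBF) kernel matrix between the rows of X and the rows of Z."""
    sq = np.sum(X**2, axis=1)[:, None] + np.sum(Z**2, axis=1)[None, :] - 2.0 * X @ Z.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def kcg_ridge(K, y, lam=0.1, max_iter=500, tol=1e-12):
    """Conjugate gradient with RKHS inner products for the objective
    0.5 * ||K a - y||^2 + 0.5 * lam * a^T K a   (assumed squared loss).

    The function is f = sum_i a_i k(x_i, .); its kernel gradient has
    coefficients g = (K + lam I) a - y, and <u, v>_H = u^T K v.
    """
    alpha = np.zeros(K.shape[0])
    g = K @ alpha + lam * alpha - y      # kernel-gradient coefficients
    d = -g                               # initial search direction
    gg = g @ (K @ g)                     # squared RKHS norm of the gradient
    for _ in range(max_iter):
        if gg <= tol:
            break
        Kd = K @ d
        curvature = Kd @ (Kd + lam * d)  # d^T K (K + lam I) d
        if curvature <= 0.0:
            break
        step = -(Kd @ g) / curvature     # exact line search for a quadratic
        alpha = alpha + step * d
        g = K @ alpha + lam * alpha - y
        gg_new = g @ (K @ g)
        d = -g + (gg_new / gg) * d       # Fletcher-Reeves with RKHS norms
        gg = gg_new
    return alpha

# Small synthetic regression problem (illustrative data only).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(30, 2))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.standard_normal(30)
K = gaussian_kernel(X, X)
alpha = kcg_ridge(K, y, lam=0.1)
fitted = K @ alpha                       # function values at the training points
```

For this quadratic objective the fitted values can be checked against a direct solve of (K + lam I) a = y. A non-quadratic differentiable loss such as logistic loss would need an inexact line search in place of the closed-form step.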

Keywords: Kernel Machine, Functional Gradient, Kernel Gradient, Conjugate Gradient, KD-tree, Gaussian Processes, Kernel Logistic Regression

Sponsor: DARPA Learning for Locomotion
Associated Center(s) / Consortia: Center for the Foundations of Robotics
Associated Lab(s) / Group(s): Planning and Autonomy Lab
Associated Project(s): Learning Locomotion

Text Reference
Nathan Ratliff and J. Andrew (Drew) Bagnell, "Kernel Conjugate Gradient for Fast Kernel Machines," International Joint Conference on Artificial Intelligence, January, 2007.

BibTeX Reference
   author = "Nathan Ratliff and J. Andrew (Drew) Bagnell",
   title = "Kernel Conjugate Gradient for Fast Kernel Machines",
   booktitle = "International Joint Conference on Artificial Intelligence",
   month = "January",
   year = "2007",
   volume = "20",