Budgeted Training: Rethinking Deep Neural Network Training Under Resource Constraints

Mengtian Li, Ersin Yumer, and Deva Ramanan

Conference Paper, Proceedings of (ICLR) International Conference on Learning Representations, April, 2020

Abstract

In most practical settings and theoretical analyses, one assumes that a model can be trained until convergence. However, the growing complexity of machine learning datasets and models may violate such assumptions. Indeed, current approaches for hyper-parameter tuning and neural architecture search tend to be limited by practical resource constraints. Therefore, we introduce a formal setting for studying training under the non-asymptotic, resource-constrained regime, i.e., budgeted training. We analyze the following problem: "given a dataset, algorithm, and fixed resource budget, what is the best achievable performance?" We focus on the number of optimization iterations as the representative resource. Under such a setting, we show that it is critical to adjust the learning rate schedule according to the given budget. Among budget-aware learning rate schedules, we find simple linear decay to be both robust and high-performing. We support our claim through extensive experiments with state-of-the-art models on ImageNet (image classification), Kinetics (video classification), MS COCO (object detection and instance segmentation), and Cityscapes (semantic segmentation). We also analyze our results and find that the key to a good schedule is budgeted convergence, a phenomenon whereby the gradient vanishes at the end of each allowed budget. Finally, we revisit existing approaches for fast convergence and show that budget-aware learning rate schedules readily outperform such approaches in the practical but under-explored budgeted training setting.
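
As an illustration of the budget-aware linear decay mentioned in the abstract, the following is a minimal sketch, assuming the learning rate falls linearly from a base value to zero over exactly the number of iterations in the budget. The function name `linear_decay_lr` and its parameters are hypothetical and are not taken from the authors' code.

```python
def linear_decay_lr(step: int, budget: int, base_lr: float) -> float:
    """Learning rate at iteration `step` (0-indexed) for a budget of `budget` iterations.

    Assumption: the rate decays linearly from `base_lr` at step 0 to zero at
    step == budget. The names here are illustrative only.
    """
    if budget <= 0:
        raise ValueError("budget must be positive")
    # Clamp the step so that anything outside [0, budget] never yields a
    # negative rate or a rate above base_lr.
    progress = min(max(step, 0), budget) / budget
    return base_lr * (1.0 - progress)


# The same base rate under two budgets: the schedule is compressed or stretched
# so that it reaches zero exactly when the budget is exhausted.
full = [linear_decay_lr(t, budget=100, base_lr=0.1) for t in range(101)]
quarter = [linear_decay_lr(t, budget=25, base_lr=0.1) for t in range(26)]
```

The point of the sketch is that the schedule is a function of the fraction of the budget consumed, not of the absolute iteration count, so shrinking the budget rescales the whole decay instead of truncating it.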

BibTeX

@conference{Li-2020-131685,
author = {Mengtian Li and Ersin Yumer and Deva Ramanan},
title = {Budgeted Training: Rethinking Deep Neural Network Training Under Resource Constraints},
booktitle = {Proceedings of (ICLR) International Conference on Learning Representations},
year = {2020},
month = {April},
keywords = {Budgeted training, learning rate schedule, linear decay, resource constraints},
}

Copyright notice: This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.