by A. Makarova, H. Shen, V. Perrone, A. Klein, J. B. Faddoul, A. Krause, M. Seeger, C. Archambeau
Abstract:
Bayesian Optimization (BO) is a successful methodology to tune the hyperparameters of machine learning models. The user defines a metric of interest, such as the validation error, and BO finds the optimal hyperparameters that minimize it. However, the metric improvements on the validation set may not translate to improvements on the test set, especially when tuning models trained on small datasets. In other words, contrary to what conventional wisdom dictates, BO can overfit. While cross-validation can mitigate this, it comes with an increased computational cost. In this paper, we carry out the first systematic investigation of overfitting in BO and demonstrate that this issue is a serious, yet often overlooked concern in practice. We propose the first problem-adaptive and interpretable criterion to early stop BO, reducing overfitting while mitigating the cost of cross-validation. Experimental results on real-world hyperparameter optimization tasks show that our approach can substantially reduce compute time with little to no loss of test accuracy, demonstrating a practical advantage over existing techniques.
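To make the setting concrete, below is a minimal sketch of where an early-stopping check plugs into a standard BO loop. It does not implement the criterion proposed in the paper; the toy objective, the plateau-based stopping rule, and all names (`validation_error`, `plateau_stop`, `patience`, `tol`) are placeholders chosen for illustration only.

```python
# Illustrative BO loop with a pluggable early-stopping hook (not the paper's criterion).
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from scipy.stats import norm


def validation_error(x):
    # Toy 1-D "validation error" standing in for a real train/validate cycle.
    return np.sin(3.0 * x) + 0.1 * x ** 2


def expected_improvement(gp, X_cand, y_best):
    # Standard EI acquisition for minimization.
    mu, sigma = gp.predict(X_cand, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (y_best - mu) / sigma
    return (y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)


def plateau_stop(incumbents, patience=5, tol=1e-3):
    # Placeholder stopping rule: stop once the incumbent has not improved
    # by more than `tol` over the last `patience` iterations.
    if len(incumbents) <= patience:
        return False
    return incumbents[-patience - 1] - incumbents[-1] < tol


rng = np.random.default_rng(0)
bounds = (-2.0, 2.0)
X = rng.uniform(*bounds, size=(3, 1))                     # initial design
y = np.array([validation_error(x[0]) for x in X])

for it in range(50):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(X, y)
    X_cand = np.linspace(*bounds, 500).reshape(-1, 1)
    x_next = X_cand[np.argmax(expected_improvement(gp, X_cand, y.min()))]
    X = np.vstack([X, x_next])
    y = np.append(y, validation_error(x_next[0]))
    incumbents = np.minimum.accumulate(y).tolist()
    if plateau_stop(incumbents):
        print(f"early stop at iteration {it}, best error {y.min():.4f}")
        break
```

The point of the sketch is structural: any stopping criterion, including the problem-adaptive one proposed in the paper, can be evaluated after each BO iteration to truncate the search once further evaluations are unlikely to pay off.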
Reference:
A. Makarova, H. Shen, V. Perrone, A. Klein, J. B. Faddoul, A. Krause, M. Seeger, C. Archambeau. "Overfitting in Bayesian Optimization: an empirical study and early-stopping solution." ICLR Workshop on Neural Architecture Search, 2021.
Bibtex Entry:
@misc{makarova2021overfitting,
  author    = {Anastasia Makarova and Huibin Shen and Valerio Perrone and Aaron Klein and Jean Baptiste Faddoul and Andreas Krause and Matthias Seeger and Cedric Archambeau},
  title     = {Overfitting in Bayesian Optimization: an empirical study and early-stopping solution},
  publisher = {ICLR Workshop on Neural Architecture Search},
  month     = {May},
  year      = {2021},
  video     = {https://slideslive.com/38955387/overfitting-in-bayesian-optimization-an-empirical-study-and-earlystopping-solution?ref=search}
}