New Frontiers of Hyperparameter Optimization: Recent advances and open challenges in theory and practice
Abstract
Machine learning algorithms operate on data, and for any given task the most effective method depends on the data at hand. Hyperparameter optimization and algorithm selection are therefore crucial for obtaining the best performance in terms of accuracy, efficiency, reliability, interpretability, and related criteria. We first survey techniques commonly used in practice for hyperparameter optimization in machine learning, including Bayesian optimization and bandit-based approaches. We next discuss newer approaches developed in the context of Large Language Models, including neural scaling laws and parameterization-aware methods. We examine the chief advantages and shortcomings of these approaches, in particular their limited theoretical guarantees. We then turn to recent developments in hyperparameter tuning with strong theoretical guarantees. A growing line of work over the past decade from the learning theory community has analysed how algorithmic performance varies with the hyperparameters for several fundamental algorithms in machine learning, including decision trees, linear regression, unsupervised and semi-supervised learning, and very recently even deep learning. This has enabled techniques that take this structure into account, apply naturally to both hyperparameter tuning and algorithm selection, work well in dynamic or online learning environments, and come with provable PAC (probably approximately correct) guarantees on the generalization error of the learned hyperparameter. Open research directions include integrating these structure-aware, principled approaches with the techniques currently used in practice, optimizing more effectively over high-dimensional and discrete search spaces, and improving scalability in distributed settings.
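To make the bandit-based idea mentioned above concrete, the following is a minimal, self-contained sketch of successive halving, the core subroutine behind methods such as Hyperband: many candidate configurations are evaluated cheaply, and only the best-performing fraction receives additional budget in each round. The quadratic objective, the learning-rate search space, and the budget schedule below are illustrative assumptions for this sketch, not material from the tutorial itself.

```python
# Minimal sketch of successive halving for hyperparameter search.
# The objective (gradient descent on a fixed quadratic) and the budget
# schedule are illustrative assumptions chosen to keep the example runnable.
import numpy as np

rng = np.random.default_rng(0)


def run_config(lr, num_steps):
    """Run gradient descent on f(w) = ||w - w*||^2 for `num_steps` steps
    with learning rate `lr`, and report the final loss (lower is better)."""
    w_star = np.ones(10)
    w = np.zeros(10)
    for _ in range(num_steps):
        grad = 2.0 * (w - w_star)
        w = w - lr * grad
    return float(np.sum((w - w_star) ** 2))


def successive_halving(configs, min_budget=2, eta=2, rounds=4):
    """Evaluate all surviving configs on an increasing budget and keep
    the best 1/eta fraction after each round."""
    survivors = list(configs)
    budget = min_budget
    for _ in range(rounds):
        losses = [run_config(lr, budget) for lr in survivors]
        order = np.argsort(losses)              # best (lowest loss) first
        keep = max(1, len(survivors) // eta)
        survivors = [survivors[i] for i in order[:keep]]
        budget *= eta                           # more budget for fewer configs
    return survivors[0]


# Sample candidate learning rates log-uniformly in [1e-3, 1].
candidates = 10 ** rng.uniform(-3, 0, size=16)
best_lr = successive_halving(candidates)
print(f"selected learning rate: {best_lr:.4f}")
```

With 16 candidates, eta=2, and 4 rounds, the pool shrinks 16 → 8 → 4 → 2 → 1 while the per-configuration budget doubles each round, which is the resource-allocation trade-off that bandit-based tuners exploit.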
Schedule
11:30 AM