From Tuning to Guarantees: Statistically Valid Hyperparameter Selection
Abstract
"The performance and reliability of modern machine learning systems depend critically on hyperparameter selection. Whether tuning a large language model, configuring a vision pipeline, or deploying AI in safety-critical environments, the choice of hyperparameters is decisive. Current tuning strategies, such as grid search, random search, and Bayesian optimization, are powerful for empirical optimization, but they do not provide statistical guarantees on the reliability of the selected configuration after deployment. This gap becomes critical when models must satisfy strict performance, safety, or fairness requirements.
This tutorial introduces a rigorous and practical framework that treats hyperparameter selection as a statistical testing problem. By constructing valid p- or e-values for candidate configurations and applying multiple hypothesis testing (MHT) procedures, practitioners can control deployment risk with finite-sample guarantees. We begin with the Learn-Then-Test (LTT) methodology for average-risk control and build up to multiple key extensions, such as controlling the quantile risk using quantile LTT (QLTT), multi-objective optimization through Pareto Testing (PT), incorporating prior information through the concept of reliability graphs, and data-efficient selection through adaptive LTT (aLTT). Throughout the tutorial, we emphasize conceptual clarity, plain-language explanations of assumptions, and hands-on demonstrations with minimal, reproducible notebooks.
Attendees will gain a drop-in toolkit for augmenting existing tuning workflows with statistically valid selection. They will learn how to formalize relevant risk functions, generate valid evidence, choose an appropriate error-rate criterion (family-wise error rate, FWER, or false discovery rate, FDR), and navigate the trade-offs between statistical conservatism and power under limited data. No prior expertise in multiple hypothesis testing is required."
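To make the testing view of hyperparameter selection concrete, here is a minimal sketch of an LTT-style selection step. It rests on assumptions not stated in the abstract: losses are bounded in [0, 1], the p-value comes from Hoeffding's inequality, and FWER is controlled with a plain Bonferroni correction. The function names and the toy calibration data are hypothetical and are not taken from the tutorial's notebooks.

```python
import math


def hoeffding_p_value(losses, alpha):
    """P-value for the null hypothesis 'true average risk > alpha'.

    Assumes each loss lies in [0, 1] and the losses are i.i.d.
    """
    n = len(losses)
    empirical_risk = sum(losses) / n
    # Only an empirical risk below alpha counts as evidence against the null.
    gap = max(alpha - empirical_risk, 0.0)
    return math.exp(-2.0 * n * gap ** 2)


def ltt_bonferroni(losses_per_config, alpha, delta):
    """Return configurations whose null is rejected at level delta / m.

    With m candidates, Bonferroni keeps the probability of returning any
    configuration with true risk above alpha at most delta (FWER control).
    """
    m = len(losses_per_config)
    return [
        name
        for name, losses in losses_per_config.items()
        if hoeffding_p_value(losses, alpha) <= delta / m
    ]


# Toy calibration data (hypothetical): 1000 held-out 0/1 losses per candidate.
calibration = {
    "config_a": [1.0] * 50 + [0.0] * 950,   # empirical risk 0.05
    "config_b": [1.0] * 190 + [0.0] * 810,  # empirical risk 0.19
    "config_c": [1.0] * 300 + [0.0] * 700,  # empirical risk 0.30
}

selected = ltt_bonferroni(calibration, alpha=0.2, delta=0.1)
print(selected)
```

In this toy setup only `config_a` is returned: `config_b` has an empirical risk below the target of 0.2, but the margin is too small to reject its null at the corrected level. This illustrates the trade-off between conservatism and power under limited data mentioned above.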
Schedule
9:30 AM
10:10 AM
10:25 AM
10:30 AM
10:40 AM
11:55 AM