Tabular datasets are the last "unconquered castle" for deep learning, with traditional ML methods like Gradient-Boosted Decision Trees still performing strongly even against recent specialized neural architectures. In this paper, we hypothesize that the key to boosting the performance of neural networks lies in rethinking the joint and simultaneous application of a large set of modern regularization techniques. As a result, we propose regularizing plain Multilayer Perceptron (MLP) networks by searching for the optimal combination/cocktail of 13 regularization techniques for each dataset using a joint optimization over the decision on which regularizers to apply and their subsidiary hyperparameters. We empirically assess the impact of these regularization cocktails for MLPs in a large-scale empirical study comprising 40 tabular datasets and demonstrate that (i) well-regularized plain MLPs significantly outperform recent state-of-the-art specialized neural network architectures, and (ii) they even outperform strong traditional ML methods, such as XGBoost.
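To make the idea concrete, the sketch below shows what such a joint search space looks like in code: every regularizer carries an on/off decision, and its subsidiary hyperparameters are only sampled when it is switched on. The regularizer names, value ranges, and the plain random-search loop are illustrative assumptions for this sketch, not the paper's exact 13 techniques or its actual (multi-fidelity) hyperparameter optimizer.

```python
import random

# Illustrative cocktail space (assumed names/ranges, not the paper's exact ones):
# each regularizer maps to samplers for its subsidiary hyperparameters.
COCKTAIL_SPACE = {
    "weight_decay":     {"lambda":    lambda: 10 ** random.uniform(-5, -1)},
    "dropout":          {"p":         lambda: random.uniform(0.0, 0.8)},
    "mixup":            {"alpha":     lambda: random.uniform(0.1, 1.0)},
    "cutmix":           {"alpha":     lambda: random.uniform(0.1, 1.0)},
    "label_smoothing":  {"eps":       lambda: random.uniform(0.0, 0.2)},
    "stochastic_depth": {"drop_rate": lambda: random.uniform(0.0, 0.5)},
}

def sample_cocktail():
    """Jointly sample which regularizers to apply and their hyperparameters."""
    config = {}
    for name, hp_samplers in COCKTAIL_SPACE.items():
        enabled = random.random() < 0.5                  # the on/off decision
        config[name] = {"enabled": enabled}
        if enabled:                                      # subsidiary HPs only if active
            config[name].update({hp: draw() for hp, draw in hp_samplers.items()})
    return config

def search_cocktail(evaluate, n_trials=50):
    """Plain random search as a stand-in for a real HPO method; `evaluate`
    should train an MLP with the given cocktail and return a validation
    score (higher is better)."""
    best_cfg, best_score = None, float("-inf")
    for _ in range(n_trials):
        cfg = sample_cocktail()
        score = evaluate(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

if __name__ == "__main__":
    # Dummy objective so the sketch runs on its own; replace with real MLP training.
    best, score = search_cocktail(lambda cfg: random.random(), n_trials=20)
    print(score, best)
```

In practice, the random-search loop would be replaced by a proper per-dataset hyperparameter optimizer, and `evaluate` would train a plain MLP with the sampled cocktail applied and report validation performance.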
Author Information
Arlind Kadra (University of Freiburg)
Marius Lindauer (Leibniz University Hannover)
Frank Hutter (University of Freiburg & Bosch)
Frank Hutter is a Full Professor for Machine Learning at the Computer Science Department of the University of Freiburg (Germany), where he was previously an assistant professor from 2013 to 2017. Before that, he spent eight years at the University of British Columbia (UBC) for his PhD and postdoc. Frank's main research interests lie in machine learning, artificial intelligence, and automated algorithm design. For his 2009 PhD thesis on algorithm configuration, he received the CAIAC doctoral dissertation award for the best thesis in AI in Canada that year, and together with his coauthors he has received several best paper awards and prizes in international competitions on machine learning, SAT solving, and AI planning. Since 2016, he has held an ERC Starting Grant for a project on automating deep learning based on Bayesian optimization, Bayesian neural networks, and deep reinforcement learning.
Josif Grabocka (Universität Freiburg)
More from the Same Authors
- 2021 : OpenML Benchmarking Suites »
  Bernd Bischl · Giuseppe Casalicchio · Matthias Feurer · Pieter Gijsbers · Frank Hutter · Michel Lang · Rafael Gomes Mantovani · Jan van Rijn · Joaquin Vanschoren
- 2021 : HPO-B: A Large-Scale Reproducible Benchmark for Black-Box HPO based on OpenML »
  Sebastian Pineda Arango · Hadi Jomaa · Martin Wistuba · Josif Grabocka
- 2021 : HPOBench: A Collection of Reproducible Multi-Fidelity Benchmark Problems for HPO »
  Katharina Eggensperger · Philipp Müller · Neeratyoy Mallik · Matthias Feurer · Rene Sass · Aaron Klein · Noor Awad · Marius Lindauer · Frank Hutter
- 2021 : Transformers Can Do Bayesian-Inference By Meta-Learning on Prior-Data »
  Samuel Müller · Noah Hollmann · Sebastian Pineda Arango · Josif Grabocka · Frank Hutter
- 2021 : Transfer Learning for Bayesian HPO with End-to-End Landmark Meta-Features »
  Hadi Jomaa · Sebastian Pineda Arango · Lars Schmidt-Thieme · Josif Grabocka
- 2022 : c-TPE: Generalizing Tree-structured Parzen Estimator with Inequality Constraints for Continuous and Categorical Hyperparameter Optimization »
  Shuhei Watanabe · Frank Hutter
- 2022 : PI is back! Switching Acquisition Functions in Bayesian Optimization »
  Carolin Benjamins · Elena Raponi · Anja Jankovic · Koen van der Blom · Maria Laura Santoni · Marius Lindauer · Carola Doerr
- 2022 : TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second »
  Noah Hollmann · Samuel Müller · Katharina Eggensperger · Frank Hutter
- 2022 : On the Importance of Architectures and Hyperparameters for Fairness in Face Recognition »
  Samuel Dooley · Rhea Sukthanker · John Dickerson · Colin White · Frank Hutter · Micah Goldblum
- 2022 : Efficient Bayesian Learning Curve Extrapolation using Prior-Data Fitted Networks »
  Steven Adriaensen · Herilalaina Rakotoarison · Samuel Müller · Frank Hutter
- 2022 : Transfer NAS with Meta-learned Bayesian Surrogates »
  Gresa Shala · Thomas Elsken · Frank Hutter · Josif Grabocka
- 2022 : Gray-Box Gaussian Processes for Automated Reinforcement Learning »
  Gresa Shala · André Biedenkapp · Frank Hutter · Josif Grabocka
- 2022 : AutoRL-Bench 1.0 »
  Gresa Shala · Sebastian Pineda Arango · André Biedenkapp · Frank Hutter · Josif Grabocka
- 2022 : Bayesian Optimization with a Neural Network Meta-learned on Synthetic Data Only »
  Samuel Müller · Sebastian Pineda Arango · Matthias Feurer · Josif Grabocka · Frank Hutter
- 2022 : Towards Automated Design of Bayesian Optimization via Exploratory Landscape Analysis »
  Carolin Benjamins · Anja Jankovic · Elena Raponi · Koen van der Blom · Marius Lindauer · Carola Doerr
- 2022 : GraViT-E: Gradient-based Vision Transformer Search with Entangled Weights »
  Rhea Sukthanker · Arjun Krishnakumar · sharat patil · Frank Hutter
- 2022 : PriorBand: HyperBand + Human Expert Knowledge »
  Neeratyoy Mallik · Carl Hvarfner · Danny Stoll · Maciej Janowski · Edward Bergman · Marius Lindauer · Luigi Nardi · Frank Hutter
- 2022 : Towards Discovering Neural Architectures from Scratch »
  Simon Schrodi · Danny Stoll · Robin Ru · Rhea Sukthanker · Thomas Brox · Frank Hutter
- 2022 : On the Importance of Architectures and Hyperparameters for Fairness in Face Recognition »
  Samuel Dooley · Rhea Sukthanker · John Dickerson · Colin White · Frank Hutter · Micah Goldblum
- 2022 : Multi-objective Tree-structured Parzen Estimator Meets Meta-learning »
  Shuhei Watanabe · Noor Awad · Masaki Onishi · Frank Hutter
- 2022 Spotlight: Supervising the Multi-Fidelity Race of Hyperparameter Configurations »
  Martin Wistuba · Arlind Kadra · Josif Grabocka
- 2022 Spotlight: Lightning Talks 3B-1 »
  Tianying Ji · Tongda Xu · Giulia Denevi · Aibek Alanov · Martin Wistuba · Wei Zhang · Yuesong Shen · Massimiliano Pontil · Vadim Titov · Yan Wang · Yu Luo · Daniel Cremers · Yanjun Han · Arlind Kadra · Dailan He · Josif Grabocka · Zhengyuan Zhou · Fuchun Sun · Carlo Ciliberto · Dmitry Vetrov · Mingxuan Jing · Chenjian Gao · Aaron Flores · Tsachy Weissman · Han Gao · Fengxiang He · Kunzan Liu · Wenbing Huang · Hongwei Qin
- 2022 : TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second »
  Noah Hollmann · Samuel Müller · Katharina Eggensperger · Frank Hutter
- 2022 Poster: Supervising the Multi-Fidelity Race of Hyperparameter Configurations »
  Martin Wistuba · Arlind Kadra · Josif Grabocka
- 2022 Poster: Joint Entropy Search For Maximally-Informed Bayesian Optimization »
  Carl Hvarfner · Frank Hutter · Luigi Nardi
- 2022 Poster: Probabilistic Transformer: Modelling Ambiguities and Distributions for RNA Folding and Molecule Design »
  Jörg Franke · Frederic Runge · Frank Hutter
- 2022 Poster: NAS-Bench-Suite-Zero: Accelerating Research on Zero Cost Proxies »
  Arjun Krishnakumar · Colin White · Arber Zela · Renbo Tu · Mahmoud Safari · Frank Hutter
- 2022 Poster: JAHS-Bench-201: A Foundation For Research On Joint Architecture And Hyperparameter Search »
  Archit Bansal · Danny Stoll · Maciej Janowski · Arber Zela · Frank Hutter
- 2021 : CARL: A Benchmark for Contextual and Adaptive Reinforcement Learning »
  Carolin Benjamins · Theresa Eimer · Frederik Schubert · André Biedenkapp · Bodo Rosenhahn · Frank Hutter · Marius Lindauer
- 2021 : Hyperparameters in Contextual RL are Highly Situational »
  Theresa Eimer · Carolin Benjamins · Marius Lindauer
- 2021 Workshop: 5th Workshop on Meta-Learning »
  Erin Grant · Fábio Ferreira · Frank Hutter · Jonathan Richard Schwarz · Joaquin Vanschoren · Huaxiu Yao
- 2021 Poster: How Powerful are Performance Predictors in Neural Architecture Search? »
  Colin White · Arber Zela · Robin Ru · Yang Liu · Frank Hutter
- 2021 Poster: NAS-Bench-x11 and the Power of Learning Curves »
  Shen Yan · Colin White · Yash Savani · Frank Hutter
- 2021 Poster: Explaining Hyperparameter Optimization via Partial Dependence Plots »
  Julia Moosbauer · Julia Herbinger · Giuseppe Casalicchio · Marius Lindauer · Bernd Bischl
- 2021 Poster: Neural Ensemble Search for Uncertainty Estimation and Dataset Shift »
  Sheheryar Zaidi · Arber Zela · Thomas Elsken · Chris C Holmes · Frank Hutter · Yee Teh
- 2020 : Q/A for invited talk #1 »
  Frank Hutter
- 2020 : Meta-learning neural architectures, initial weights, hyperparameters, and algorithm components »
  Frank Hutter
- 2019 : Frank Hutter (University of Freiburg) "A Proposal for a New Competition Design Emphasizing Scientific Insights" »
  Frank Hutter
- 2019 Workshop: Meta-Learning »
  Roberto Calandra · Ignasi Clavera Gilaberte · Frank Hutter · Joaquin Vanschoren · Jane Wang
- 2019 Poster: Meta-Surrogate Benchmarking for Hyperparameter Optimization »
  Aaron Klein · Zhenwen Dai · Frank Hutter · Neil Lawrence · Javier González
- 2018 Workshop: NIPS 2018 Workshop on Meta-Learning »
  Joaquin Vanschoren · Frank Hutter · Sachin Ravi · Jane Wang · Erin Grant
- 2018 Poster: Maximizing acquisition functions for Bayesian optimization »
  James Wilson · Frank Hutter · Marc Deisenroth
- 2018 Tutorial: Automatic Machine Learning »
  Frank Hutter · Joaquin Vanschoren
- 2017 Workshop: Workshop on Meta-Learning »
  Roberto Calandra · Frank Hutter · Hugo Larochelle · Sergey Levine
- 2016 : Invited talk, Frank Hutter »
  Frank Hutter
- 2016 Workshop: Bayesian Optimization: Black-box Optimization and Beyond »
  Roberto Calandra · Bobak Shahriari · Javier Gonzalez · Frank Hutter · Ryan Adams
- 2016 : Frank Hutter (University Freiburg) »
  Frank Hutter
- 2016 Poster: Bayesian Optimization with Robust Bayesian Neural Networks »
  Jost Tobias Springenberg · Aaron Klein · Stefan Falkner · Frank Hutter
- 2016 Oral: Bayesian Optimization with Robust Bayesian Neural Networks »
  Jost Tobias Springenberg · Aaron Klein · Stefan Falkner · Frank Hutter
- 2015 : Scalable and Flexible Bayesian Optimization for Algorithm Configuration »
  Frank Hutter
- 2015 Poster: Efficient and Robust Automated Machine Learning »
  Matthias Feurer · Aaron Klein · Katharina Eggensperger · Jost Springenberg · Manuel Blum · Frank Hutter