Timezone: »
Meta-learning hyperparameter optimization (HPO) algorithms from prior experiments is a promising approach to improve optimization efficiency over objective functions from a similar distribution. However, existing methods are restricted to learning from experiments sharing the same set of hyperparameters. In this paper, we introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction when trained on vast tuning data from the wild, such as Google’s Vizier database, one of the world’s largest HPO datasets. Our extensive experiments demonstrate that the OptFormer can simultaneously imitate at least 7 different HPO algorithms, which can be further improved via its function uncertainty estimates. Compared to a Gaussian Process, the OptFormer also learns a robust prior distribution for hyperparameter response functions, and can thereby provide more accurate and better calibrated predictions. This work paves the path to future extensions for training a Transformer-based model as a general HPO optimizer.
Author Information
Yutian Chen (DeepMind)
Xingyou Song (Google Brain)
Chansoo Lee (Google)
Zi Wang (Google Brain)
Richard Zhang (Google Brain)
David Dohan (Google Brain)
Kazuya Kawakami
Greg Kochanski (Google, Inc.)
Arnaud Doucet (Oxford)
Marc'Aurelio Ranzato (DeepMind)
Sagi Perel (Carnegie Mellon University)
Nando de Freitas (DeepMind)
More from the Same Authors
-
2021 Spotlight: Optimal Sketching for Trace Estimation »
Shuli Jiang · Hai Pham · David Woodruff · Richard Zhang -
2021 : Introducing Symmetries to Black Box Meta Reinforcement Learning »
Louis Kirsch · Sebastian Flennerhag · Hado van Hasselt · Abram Friesen · Junhyuk Oh · Yutian Chen -
2021 : Introducing Symmetries to Black Box Meta Reinforcement Learning »
Louis Kirsch · Sebastian Flennerhag · Hado van Hasselt · Abram Friesen · Junhyuk Oh · Yutian Chen -
2022 : HyperBO+: Pre-training a universal hierarchical Gaussian process prior for Bayesian optimization »
Zhou Fan · Xinran Han · Zi Wang -
2022 : Multi-step Planning for Automated Hyperparameter Optimization with OptFormer »
Lucio M Dery · Abram Friesen · Nando de Freitas · Marc'Aurelio Ranzato · Yutian Chen -
2022 : Spectral Diffusion Processes »
Angus Phillips · Thomas Seror · Michael Hutchinson · Valentin De Bortoli · Arnaud Doucet · Emile Mathieu -
2023 Workshop: Adaptive Experimental Design and Active Learning in the Real World »
Willie Neiswanger · Mojmir Mutny · Ilija Bogunovic · Ava Soleimany · Zi Wang · Stefano Ermon · Andreas Krause -
2022 Spotlight: Optimal Query Complexities for Dynamic Trace Estimation »
David Woodruff · Fred Zhang · Richard Zhang -
2022 Spotlight: Lightning Talks 1A-4 »
Siwei Wang · Jing Liu · Nianqiao Ju · Shiqian Li · Eloïse Berthier · Muhammad Faaiz Taufiq · Arsene Fansi Tchango · Chen Liang · Chulin Xie · Jordan Awan · Jean-Francois Ton · Ziad Kobeissi · Wenguan Wang · Xinwang Liu · Kewen Wu · Rishab Goel · Jiaxu Miao · Suyuan Liu · Julien Martel · Ruobin Gong · Francis Bach · Chi Zhang · Rob Cornish · Sanmi Koyejo · Zhi Wen · Yee Whye Teh · Yi Yang · Jiaqi Jin · Bo Li · Yixin Zhu · Vinayak Rao · Wenxuan Tu · Gaetan Marceau Caron · Arnaud Doucet · Xinzhong Zhu · Joumana Ghosn · En Zhu -
2022 Spotlight: Conformal Off-Policy Prediction in Contextual Bandits »
Muhammad Faaiz Taufiq · Jean-Francois Ton · Rob Cornish · Yee Whye Teh · Arnaud Doucet -
2022 Workshop: Gaussian Processes, Spatiotemporal Modeling, and Decision-making Systems »
Alexander Terenin · Elizaveta Semenova · Geoff Pleiss · Zi Wang -
2022 Poster: Conformal Off-Policy Prediction in Contextual Bandits »
Muhammad Faaiz Taufiq · Jean-Francois Ton · Rob Cornish · Yee Whye Teh · Arnaud Doucet -
2022 Poster: A Continuous Time Framework for Discrete Denoising Models »
Andrew Campbell · Joe Benton · Valentin De Bortoli · Thomas Rainforth · George Deligiannidis · Arnaud Doucet -
2022 Poster: Score-Based Diffusion meets Annealed Importance Sampling »
Arnaud Doucet · Will Grathwohl · Alexander Matthews · Heiko Strathmann -
2022 Poster: A Multi-Resolution Framework for U-Nets with Applications to Hierarchical VAEs »
Fabian Falck · Christopher Williams · Dominic Danks · George Deligiannidis · Christopher Yau · Chris C Holmes · Arnaud Doucet · Matthew Willetts -
2022 Poster: Optimal Query Complexities for Dynamic Trace Estimation »
David Woodruff · Fred Zhang · Richard Zhang -
2022 Poster: Riemannian Score-Based Generative Modelling »
Valentin De Bortoli · Emile Mathieu · Michael Hutchinson · James Thornton · Yee Whye Teh · Arnaud Doucet -
2022 Poster: Solving Quantitative Reasoning Problems with Language Models »
Aitor Lewkowycz · Anders Andreassen · David Dohan · Ethan Dyer · Henryk Michalewski · Vinay Ramasesh · Ambrose Slone · Cem Anil · Imanol Schlag · Theo Gutman-Solo · Yuhuai Wu · Behnam Neyshabur · Guy Gur-Ari · Vedant Misra -
2021 : Retrospective Panel »
Sergey Levine · Nando de Freitas · Emma Brunskill · Finale Doshi-Velez · Nan Jiang · Rishabh Agarwal -
2021 Poster: Optimal Sketching for Trace Estimation »
Shuli Jiang · Hai Pham · David Woodruff · Richard Zhang -
2021 Poster: Active Offline Policy Selection »
Ksenia Konyushova · Yutian Chen · Thomas Paine · Caglar Gulcehre · Cosmin Paduraru · Daniel Mankowitz · Misha Denil · Nando de Freitas -
2020 : Is Transfer Learning Necessary for Protein Landscape Prediction? »
David Belanger · David Dohan -
2020 : Panel »
Emma Brunskill · Nan Jiang · Nando de Freitas · Finale Doshi-Velez · Sergey Levine · John Langford · Lihong Li · George Tucker · Rishabh Agarwal · Aviral Kumar -
2020 : Offline RL »
Nando de Freitas -
2020 Poster: Critic Regularized Regression »
Ziyu Wang · Alexander Novikov · Konrad Zolna · Josh Merel · Jost Tobias Springenberg · Scott Reed · Bobak Shahriari · Noah Siegel · Caglar Gulcehre · Nicolas Heess · Nando de Freitas -
2020 Poster: Modular Meta-Learning with Shrinkage »
Yutian Chen · Abram Friesen · Feryal Behbahani · Arnaud Doucet · David Budden · Matthew Hoffman · Nando de Freitas -
2020 Spotlight: Modular Meta-Learning with Shrinkage »
Yutian Chen · Abram Friesen · Feryal Behbahani · Arnaud Doucet · David Budden · Matthew Hoffman · Nando de Freitas -
2020 Poster: RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning »
Caglar Gulcehre · Ziyu Wang · Alexander Novikov · Thomas Paine · Sergio Gómez · Konrad Zolna · Rishabh Agarwal · Josh Merel · Daniel Mankowitz · Cosmin Paduraru · Gabriel Dulac-Arnold · Jerry Li · Mohammad Norouzi · Matthew Hoffman · Nicolas Heess · Nando de Freitas -
2019 : Poster and Coffee Break 2 »
Karol Hausman · Kefan Dong · Ken Goldberg · Lihong Li · Lin Yang · Lingxiao Wang · Lior Shani · Liwei Wang · Loren Amdahl-Culleton · Lucas Cassano · Marc Dymetman · Marc Bellemare · Marcin Tomczak · Margarita Castro · Marius Kloft · Marius-Constantin Dinu · Markus Holzleitner · Martha White · Mengdi Wang · Michael Jordan · Mihailo Jovanovic · Ming Yu · Minshuo Chen · Moonkyung Ryu · Muhammad Zaheer · Naman Agarwal · Nan Jiang · Niao He · Nikolaus Yasui · Nikos Karampatziakis · Nino Vieillard · Ofir Nachum · Olivier Pietquin · Ozan Sener · Pan Xu · Parameswaran Kamalaruban · Paul Mineiro · Paul Rolland · Philip Amortila · Pierre-Luc Bacon · Prakash Panangaden · Qi Cai · Qiang Liu · Quanquan Gu · Raihan Seraj · Richard Sutton · Rick Valenzano · Robert Dadashi · Rodrigo Toro Icarte · Roshan Shariff · Roy Fox · Ruosong Wang · Saeed Ghadimi · Samuel Sokota · Sean Sinclair · Sepp Hochreiter · Sergey Levine · Sergio Valcarcel Macua · Sham Kakade · Shangtong Zhang · Sheila McIlraith · Shie Mannor · Shimon Whiteson · Shuai Li · Shuang Qiu · Wai Lok Li · Siddhartha Banerjee · Sitao Luan · Tamer Basar · Thinh Doan · Tianhe Yu · Tianyi Liu · Tom Zahavy · Toryn Klassen · Tuo Zhao · Vicenç Gómez · Vincent Liu · Volkan Cevher · Wesley Suttle · Xiao-Wen Chang · Xiaohan Wei · Xiaotong Liu · Xingguo Li · Xinyi Chen · Xingyou Song · Yao Liu · YiDing Jiang · Yihao Feng · Yilun Du · Yinlam Chow · Yinyu Ye · Yishay Mansour · · Yonathan Efroni · Yongxin Chen · Yuanhao Wang · Bo Dai · Chen-Yu Wei · Harsh Shrivastava · Hongyang Zhang · Qinqing Zheng · SIDDHARTHA SATPATHI · Xueqing Liu · Andreu Vall -
2019 : Lunch Break and Posters »
Xingyou Song · Elad Hoffer · Wei-Cheng Chang · Jeremy Cohen · Jyoti Islam · Yaniv Blumenfeld · Andreas Madsen · Jonathan Frankle · Sebastian Goldt · Satrajit Chatterjee · Abhishek Panigrahi · Alex Renda · Brian Bartoldson · Israel Birhane · Aristide Baratin · Niladri Chatterji · Roman Novak · Jessica Forde · YiDing Jiang · Yilun Du · Linara Adilova · Michael Kamp · Berry Weinstein · Itay Hubara · Tal Ben-Nun · Torsten Hoefler · Daniel Soudry · Hsiang-Fu Yu · Kai Zhong · Yiming Yang · Inderjit Dhillon · Jaime Carbonell · Yanqing Zhang · Dar Gilboa · Johannes Brandstetter · Alexander R Johansen · Gintare Karolina Dziugaite · Raghav Somani · Ari Morcos · Freddie Kalaitzis · Hanie Sedghi · Lechao Xiao · John Zech · Muqiao Yang · Simran Kaur · Qianli Ma · Yao-Hung Hubert Tsai · Ruslan Salakhutdinov · Sho Yaida · Zachary Lipton · Daniel Roy · Michael Carbin · Florent Krzakala · Lenka Zdeborová · Guy Gur-Ari · Ethan Dyer · Dilip Krishnan · Hossein Mobahi · Samy Bengio · Behnam Neyshabur · Praneeth Netrapalli · Kris Sankaran · Julien Cornebise · Yoshua Bengio · Vincent Michalski · Samira Ebrahimi Kahou · Md Rifat Arefin · Jiri Hron · Jaehoon Lee · Jascha Sohl-Dickstein · Samuel Schoenholz · David Schwab · Dongyu Li · Sang Keun Choe · Henning Petzka · Ashish Verma · Zhichao Lin · Cristian Sminchisescu -
2019 Workshop: Science meets Engineering of Deep Learning »
Levent Sagun · Caglar Gulcehre · Adriana Romero Soriano · Negar Rostamzadeh · Nando de Freitas -
2019 : Welcoming remarks and introduction »
Levent Sagun · Caglar Gulcehre · Adriana Romero Soriano · Negar Rostamzadeh · Nando de Freitas -
2019 : Coffee/Poster session 2 »
Xingyou Song · Puneet Mangla · David Salinas · Zhenxun Zhuang · Leo Feng · Shell Xu Hu · Raul Puri · Wesley Maddox · Aniruddh Raghu · Prudencio Tossou · Mingzhang Yin · Ishita Dasgupta · Kangwook Lee · Ferran Alet · Zhen Xu · Jörg Franke · James Harrison · Jonathan Warrell · Guneet Dhillon · Arber Zela · Xin Qiu · Julien Niklas Siems · Russell Mendonca · Louis Schlessinger · Jeffrey Li · Georgiana Manolache · Debojyoti Dutta · Lucas Glass · Abhishek Singh · Gregor Koehler -
2019 Poster: Large Memory Layers with Product Keys »
Guillaume Lample · Alexandre Sablayrolles · Marc'Aurelio Ranzato · Ludovic Denoyer · Herve Jegou -
2019 Spotlight: Large Memory Layers with Product Keys »
Guillaume Lample · Alexandre Sablayrolles · Marc'Aurelio Ranzato · Ludovic Denoyer · Herve Jegou -
2019 Poster: Regularized Weighted Low Rank Approximation »
Frank Ban · David Woodruff · Richard Zhang -
2019 Poster: Learning Compositional Neural Programs with Recursive Tree Search and Planning »
Thomas PIERROT · Guillaume Ligner · Scott Reed · Olivier Sigaud · Nicolas Perrin · Alexandre Laterre · David Kas · Karim Beguir · Nando de Freitas -
2019 Spotlight: Learning Compositional Neural Programs with Recursive Tree Search and Planning »
Thomas PIERROT · Guillaume Ligner · Scott Reed · Olivier Sigaud · Nicolas Perrin · Alexandre Laterre · David Kas · Karim Beguir · Nando de Freitas -
2019 Poster: Augmented Neural ODEs »
Emilien Dupont · Arnaud Doucet · Yee Whye Teh -
2018 : TBA 5 »
Nando de Freitas -
2018 : Invited Talk 5: Nando de Freitas »
Nando de Freitas -
2018 : Invited Speaker #3 Marc'Aurelio Ranzato »
Marc'Aurelio Ranzato -
2018 Poster: Playing hard exploration games by watching YouTube »
Yusuf Aytar · Tobias Pfaff · David Budden · Thomas Paine · Ziyu Wang · Nando de Freitas -
2018 Spotlight: Playing hard exploration games by watching YouTube »
Yusuf Aytar · Tobias Pfaff · David Budden · Thomas Paine · Ziyu Wang · Nando de Freitas -
2018 Poster: Hamiltonian Variational Auto-Encoder »
Anthony Caterini · Arnaud Doucet · Dino Sejdinovic -
2018 Tutorial: Unsupervised Deep Learning »
Alex Graves · Marc'Aurelio Ranzato -
2017 : Invited talk: Learning to learn without gradient descent by gradient descent. »
Yutian Chen -
2017 Poster: Filtering Variational Objectives »
Chris Maddison · John Lawson · George Tucker · Nicolas Heess · Mohammad Norouzi · Andriy Mnih · Arnaud Doucet · Yee Teh -
2017 Poster: Clone MCMC: Parallel High-Dimensional Gaussian Gibbs Sampling »
Andrei-Cristian Barbos · Francois Caron · Jean-François Giovannelli · Arnaud Doucet -
2017 Poster: Robust Imitation of Diverse Behaviors »
Ziyu Wang · Josh Merel · Scott Reed · Nando de Freitas · Gregory Wayne · Nicolas Heess -
2017 Poster: Fader Networks:Manipulating Images by Sliding Attributes »
Guillaume Lample · Neil Zeghidour · Nicolas Usunier · Antoine Bordes · Ludovic DENOYER · Marc'Aurelio Ranzato -
2017 Poster: Gradient Episodic Memory for Continual Learning »
David Lopez-Paz · Marc'Aurelio Ranzato -
2017 Tutorial: Deep Learning: Practice and Trends »
Nando de Freitas · Scott Reed · Oriol Vinyals -
2016 Workshop: Neural Abstract Machines & Program Induction »
Matko Bošnjak · Nando de Freitas · Tejas Kulkarni · Arvind Neelakantan · Scott E Reed · Sebastian Riedel · Tim Rocktäschel -
2016 : Nando De Freitas »
Nando de Freitas -
2016 : Learning To Optimize »
Nando de Freitas -
2016 Poster: Learning to learn by gradient descent by gradient descent »
Marcin Andrychowicz · Misha Denil · Sergio Gómez · Matthew Hoffman · David Pfau · Tom Schaul · Nando de Freitas -
2015 Workshop: Bayesian Optimization: Scalability and Flexibility »
Bobak Shahriari · Ryan Adams · Nando de Freitas · Amar Shah · Roberto Calandra -
2015 Workshop: Scalable Monte Carlo Methods for Bayesian Analysis of Big Data »
Babak Shahbaba · Yee Whye Teh · Max Welling · Arnaud Doucet · Christophe Andrieu · Sebastian J. Vollmer · Pierre Jacob -
2015 Symposium: Deep Learning Symposium »
Yoshua Bengio · Marc'Aurelio Ranzato · Honglak Lee · Max Welling · Andrew Y Ng -
2015 Poster: Expectation Particle Belief Propagation »
Thibaut Lienart · Yee Whye Teh · Arnaud Doucet -
2014 Poster: Asynchronous Anytime Sequential Monte Carlo »
Brooks Paige · Frank Wood · Arnaud Doucet · Yee Whye Teh -
2014 Oral: Asynchronous Anytime Sequential Monte Carlo »
Brooks Paige · Frank Wood · Arnaud Doucet · Yee Whye Teh -
2014 Session: Oral Session 4 »
Marc'Aurelio Ranzato -
2013 Poster: DeViSE: A Deep Visual-Semantic Embedding Model »
Andrea Frome · Greg Corrado · Jonathon Shlens · Samy Bengio · Jeff Dean · Marc'Aurelio Ranzato · Tomas Mikolov -
2013 Poster: Predicting Parameters in Deep Learning »
Misha Denil · Babak Shakibi · Laurent Dinh · Marc'Aurelio Ranzato · Nando de Freitas -
2012 Poster: Large Scale Distributed Deep Networks »
Jeff Dean · Greg Corrado · Rajat Monga · Kai Chen · Matthieu Devin · Quoc V Le · Mark Mao · Marc'Aurelio Ranzato · Andrew Senior · Paul Tucker · Ke Yang · Andrew Y Ng -
2011 Workshop: Challenges in Learning Hierarchical Models: Transfer Learning and Optimization »
Quoc V. Le · Marc'Aurelio Ranzato · Russ Salakhutdinov · Josh Tenenbaum · Andrew Y Ng -
2010 Workshop: Deep Learning and Unsupervised Feature Learning »
Honglak Lee · Marc'Aurelio Ranzato · Yoshua Bengio · Geoffrey E Hinton · Yann LeCun · Andrew Y Ng -
2010 Poster: Generating more realistic images using gated MRF's »
Marc'Aurelio Ranzato · Volodymyr Mnih · Geoffrey E Hinton -
2010 Poster: Phone Recognition with the Mean-Covariance Restricted Boltzmann Machine »
George Dahl · Marc'Aurelio Ranzato · Abdel-rahman Mohamed · Geoffrey E Hinton -
2009 Poster: Bayesian Nonparametric Models on Decomposable Graphs »
Francois Caron · Arnaud Doucet -
2009 Tutorial: Sequential Monte-Carlo Methods »
Arnaud Doucet · Nando de Freitas -
2007 Spotlight: Bayesian Policy Learning with Trans-Dimensional MCMC »
Matthew Hoffman · Arnaud Doucet · Nando de Freitas · Ajay Jasra -
2007 Poster: Bayesian Policy Learning with Trans-Dimensional MCMC »
Matthew Hoffman · Arnaud Doucet · Nando de Freitas · Ajay Jasra -
2007 Poster: Sparse Feature Learning for Deep Belief Networks »
Marc'Aurelio Ranzato · Y-Lan Boureau · Yann LeCun -
2006 Poster: Efficient Learning of Sparse Representations with an Energy-Based Model »
Marc'Aurelio Ranzato · Christopher Poultney · Sumit Chopra · Yann LeCun -
2006 Spotlight: Efficient Learning of Sparse Representations with an Energy-Based Model »
Marc'Aurelio Ranzato · Christopher Poultney · Sumit Chopra · Yann LeCun