The (gradient-based) bilevel programming framework is widely used in hyperparameter optimization and has achieved excellent empirical performance. Previous theoretical work mainly focuses on its optimization properties, leaving the analysis of generalization largely open. This paper attempts to address the issue by presenting an expectation bound w.r.t. the validation set based on uniform stability. Our results can explain some mysterious behaviours of bilevel programming in practice, for instance, overfitting to the validation set. We also present an expectation bound for the classical cross-validation algorithm. Our results suggest that, from a theoretical perspective, gradient-based algorithms can be better than cross-validation under certain conditions. Furthermore, we prove that regularization terms in both the outer and inner levels can relieve the overfitting problem in gradient-based algorithms. In experiments on feature learning and data reweighting for noisy labels, we corroborate our theoretical findings.
Author Information
Fan Bao (Tsinghua University)
Guoqiang Wu (Shandong University)
Chongxuan LI (Tsinghua University)
Assistant Professor @ RUC
Jun Zhu (Tsinghua University)
Bo Zhang (Tsinghua University)
More from the Same Authors
-
2021 : Counter-Strike Deathmatch with Large-Scale Behavioural Cloning »
Tim Pearce · Jun Zhu -
2022 Poster: A Unified Hard-Constraint Framework for Solving Geometrically Complex PDEs »
Songming Liu · Hao Zhongkai · Chengyang Ying · Hang Su · Jun Zhu · Ze Cheng -
2022 Poster: Isometric 3D Adversarial Examples in the Physical World »
Yibo Miao · Yinpeng Dong · Jun Zhu · Xiao-Shan Gao -
2022 Poster: Confidence-based Reliable Learning under Dual Noises »
Peng Cui · Yang Yue · Zhijie Deng · Jun Zhu -
2022 Poster: EGSDE: Unpaired Image-to-Image Translation via Energy-Guided Stochastic Differential Equations »
Min Zhao · Fan Bao · Chongxuan LI · Jun Zhu -
2022 Poster: ViewFool: Evaluating the Robustness of Visual Recognition to Adversarial Viewpoints »
Yinpeng Dong · Shouwei Ruan · Hang Su · Caixin Kang · Xingxing Wei · Jun Zhu -
2022 Poster: DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps »
Cheng Lu · Yuhao Zhou · Fan Bao · Jianfei Chen · Chongxuan LI · Jun Zhu -
2022 Poster: Accelerated Linearized Laplace Approximation for Bayesian Deep Learning »
Zhijie Deng · Feng Zhou · Jun Zhu -
2022 : Physics-Guided Discovery of Highly Nonlinear Parametric Partial Differential Equations »
Yingtao Luo · Qiang Liu · Yuntian Chen · Wenbo Hu · TIAN TIAN · Jun Zhu -
2022 : All are Worth Words: a ViT Backbone for Score-based Diffusion Models »
Fan Bao · Chongxuan LI · Yue Cao · Jun Zhu -
2022 : Why Are Conditional Generative Models Better Than Unconditional Ones? »
Fan Bao · Chongxuan LI · Jiacheng Sun · Jun Zhu -
2022 : On Equivalences between Weight and Function-Space Langevin Dynamics »
Ziyu Wang · Yuhao Zhou · Ruqi Zhang · Jun Zhu -
2023 Poster: On Evaluating Adversarial Robustness of Large Vision-Language Models »
Yunqing Zhao · Tianyu Pang · Chao Du · Xiao Yang · Chongxuan LI · Ngai-Man (Man) Cheung · Min Lin -
2023 Poster: ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation »
Zhengyi Wang · Cheng Lu · Yikai Wang · Fan Bao · Chongxuan LI · Hang Su · Jun Zhu -
2023 Poster: Memory Efficient Optimizers with 4-bit States »
Bingrui Li · Jianfei Chen · Jun Zhu -
2023 Poster: Toward Understanding Generative Data Augmentation »
Chenyu Zheng · Guoqiang Wu · Chongxuan LI -
2023 Poster: Overcoming Recency Bias of Normalization Statistics in Continual Learning: Balance and Adaptation »
Yilin Lyu · Liyuan Wang · Xingxing Zhang · Zicheng Sun · Hang Su · Jun Zhu · Liping Jing -
2023 Poster: Diffusion Models and Semi-Supervised Learners Benefit Mutually with Few Labels »
Zebin You · Yong Zhong · Fan Bao · Jiacheng Sun · Chongxuan LI · Jun Zhu -
2023 Poster: Training Transformers with 4-bit Integers »
Haocheng Xi · ChangHao Li · Jianfei Chen · Jun Zhu -
2023 Poster: DPM-Solver-v3: Improved Diffusion ODE Solvers with Empirical Model Statistics »
Kaiwen Zheng · Cheng Lu · Jianfei Chen · Jun Zhu -
2023 Poster: Hierarchical Decomposition of Prompt-Based Continual Learning: Rethinking Obscured Sub-optimality »
Liyuan Wang · Jingyi Xie · Xingxing Zhang · Mingyi Huang · Hang Su · Jun Zhu -
2023 Poster: Learning Sample Difficulty from Pre-trained Models for Reliable Prediction »
Peng Cui · Dan Zhang · Zhijie Deng · Yinpeng Dong · Jun Zhu -
2023 Poster: Towards Accelerated Model Training via Bayesian Data Selection »
Zhijie Deng · Peng Cui · Jun Zhu -
2023 Poster: Gaussian Mixture Solvers for Diffusion Models »
Hanzhong Guo · Cheng Lu · Fan Bao · Tianyu Pang · Shuicheng Yan · Chao Du · Chongxuan LI -
2022 Spotlight: Lightning Talks 6A-2 »
Yichuan Mo · Botao Yu · Gang Li · Zezhong Xu · Haoran Wei · Arsene Fansi Tchango · Raef Bassily · Haoyu Lu · Qi Zhang · Songming Liu · Mingyu Ding · Peiling Lu · Yifei Wang · Xiang Li · Dongxian Wu · Ping Guo · Wen Zhang · Hao Zhongkai · Mehryar Mohri · Rishab Goel · Yisen Wang · Yifei Wang · Yangguang Zhu · Zhi Wen · Ananda Theertha Suresh · Chengyang Ying · Yujie Wang · Peng Ye · Rui Wang · Nanyi Fei · Hui Chen · Yiwen Guo · Wei Hu · Chenglong Liu · Julien Martel · Yuqi Huo · Wu Yichao · Hang Su · Yisen Wang · Peng Wang · Huajun Chen · Xu Tan · Jun Zhu · Ding Liang · Zhiwu Lu · Joumana Ghosn · Shanshan Zhang · Wei Ye · Ze Cheng · Shikun Zhang · Tao Qin · Tie-Yan Liu -
2022 Spotlight: A Unified Hard-Constraint Framework for Solving Geometrically Complex PDEs »
Songming Liu · Hao Zhongkai · Chengyang Ying · Hang Su · Jun Zhu · Ze Cheng -
2022 Spotlight: EGSDE: Unpaired Image-to-Image Translation via Energy-Guided Stochastic Differential Equations »
Min Zhao · Fan Bao · Chongxuan LI · Jun Zhu -
2022 Spotlight: Accelerated Linearized Laplace Approximation for Bayesian Deep Learning »
Zhijie Deng · Feng Zhou · Jun Zhu -
2022 Spotlight: DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps »
Cheng Lu · Yuhao Zhou · Fan Bao · Jianfei Chen · Chongxuan LI · Jun Zhu -
2022 Spotlight: Lightning Talks 4B-1 »
Alexandra Senderovich · Zhijie Deng · Navid Ansari · Xuefei Ning · Yasmin Salehi · Xiang Huang · Chenyang Wu · Kelsey Allen · Jiaqi Han · Nikita Balagansky · Tatiana Lopez-Guevara · Tianci Li · Zhanhong Ye · Zixuan Zhou · Feng Zhou · Ekaterina Bulatova · Daniil Gavrilov · Wenbing Huang · Dennis Giannacopoulos · Hans-peter Seidel · Anton Obukhov · Kimberly Stachenfeld · Hongsheng Liu · Jun Zhu · Junbo Zhao · Hengbo Ma · Nima Vahidi Ferdowsi · Zongzhang Zhang · Vahid Babaei · Jiachen Li · Alvaro Sanchez Gonzalez · Yang Yu · Shi Ji · Maxim Rakhuba · Tianchen Zhao · Yiping Deng · Peter Battaglia · Josh Tenenbaum · Zidong Wang · Chuang Gan · Changcheng Tang · Jessica Hamrick · Kang Yang · Tobias Pfaff · Yang Li · Shuang Liang · Min Wang · Huazhong Yang · Haotian CHU · Yu Wang · Fan Yu · Bei Hua · Lei Chen · Bin Dong -
2022 Spotlight: Lightning Talks 3B-2 »
Yu Huang · Tero Karras · Maxim Kodryan · Shiau Hong Lim · Shudong Huang · Ziyu Wang · Siqiao Xue · ILYAS MALIK · Ekaterina Lobacheva · Miika Aittala · Hongjie Wu · Yuhao Zhou · Yingbin Liang · Xiaoming Shi · Jun Zhu · Maksim Nakhodnov · Timo Aila · Yazhou Ren · James Zhang · Longbo Huang · Dmitry Vetrov · Ivor Tsang · Hongyuan Mei · Samuli Laine · Zenglin Xu · Wentao Feng · Jiancheng Lv -
2022 Spotlight: Fast Instrument Learning with Faster Rates »
Ziyu Wang · Yuhao Zhou · Jun Zhu -
2022 Poster: Fast Instrument Learning with Faster Rates »
Ziyu Wang · Yuhao Zhou · Jun Zhu -
2022 Poster: Censored Quantile Regression Neural Networks for Distribution-Free Survival Analysis »
Tim Pearce · Jong-Hyeon Jeong · yichen jia · Jun Zhu -
2021 Poster: On the Convergence of Prior-Guided Zeroth-Order Optimization Algorithms »
Shuyu Cheng · Guoqiang Wu · Jun Zhu -
2021 Poster: Scalable Quasi-Bayesian Inference for Instrumental Variable Regression »
Ziyu Wang · Yuhao Zhou · Tongzheng Ren · Jun Zhu -
2021 Poster: Rethinking and Reweighting the Univariate Losses for Multi-Label Ranking: Consistency and Generalization »
Guoqiang Wu · Chongxuan LI · Kun Xu · Jun Zhu -
2021 Poster: AFEC: Active Forgetting of Negative Transfer in Continual Learning »
Liyuan Wang · Mingtian Zhang · Zhongfan Jia · Qian Li · Chenglong Bao · Kaisheng Ma · Jun Zhu · Yi Zhong -
2021 Poster: Accumulative Poisoning Attacks on Real-time Data »
Tianyu Pang · Xiao Yang · Yinpeng Dong · Hang Su · Jun Zhu -
2020 : Fan Bao---Variational (Gradient) Estimate of the Score Function in Energy-based Latent Variable Models »
Fan Bao -
2020 Poster: Multi-label classification: do Hamming loss and subset accuracy really conflict with each other? »
Guoqiang Wu · Jun Zhu -
2020 Poster: Bi-level Score Matching for Learning Energy-based Latent Variable Models »
Fan Bao · Chongxuan LI · Kun Xu · Hang Su · Jun Zhu · Bo Zhang -
2020 Poster: Further Analysis of Outlier Detection with Deep Generative Models »
Ziyu Wang · Bin Dai · David P Wipf · Jun Zhu -
2020 Poster: Efficient Learning of Generative Models via Finite-Difference Score Matching »
Tianyu Pang · Kun Xu · Chongxuan LI · Yang Song · Stefano Ermon · Jun Zhu -
2020 Poster: Calibrated Reliable Regression using Maximum Mean Discrepancy »
Peng Cui · Wenbo Hu · Jun Zhu -
2020 Poster: Boosting Adversarial Training with Hypersphere Embedding »
Tianyu Pang · Xiao Yang · Yinpeng Dong · Kun Xu · Jun Zhu · Hang Su -
2020 Poster: Adversarial Distributional Training for Robust Deep Learning »
Yinpeng Dong · Zhijie Deng · Tianyu Pang · Jun Zhu · Hang Su -
2020 Poster: Understanding and Exploring the Network with Stochastic Architectures »
Zhijie Deng · Yinpeng Dong · Shifeng Zhang · Jun Zhu -
2019 Poster: Improving Black-box Adversarial Attacks with a Transfer-based Prior »
Shuyu Cheng · Yinpeng Dong · Tianyu Pang · Hang Su · Jun Zhu -
2019 Poster: Generative Well-intentioned Networks »
Justin Cosentino · Jun Zhu -
2019 Poster: Multi-objects Generation with Amortized Structural Regularization »
Kun Xu · Chongxuan LI · Jun Zhu · Bo Zhang -
2018 Poster: Towards Robust Detection of Adversarial Examples »
Tianyu Pang · Chao Du · Yinpeng Dong · Jun Zhu -
2018 Spotlight: Towards Robust Detection of Adversarial Examples »
Tianyu Pang · Chao Du · Yinpeng Dong · Jun Zhu -
2018 Poster: Graphical Generative Adversarial Networks »
Chongxuan LI · Max Welling · Jun Zhu · Bo Zhang -
2017 Poster: Triple Generative Adversarial Nets »
Chongxuan LI · Kun Xu · Jun Zhu · Bo Zhang -
2017 Poster: Population Matching Discrepancy and Applications in Deep Learning »
Jianfei Chen · Chongxuan LI · Yizhong Ru · Jun Zhu -
2016 Poster: Kernel Bayesian Inference with Posterior Regularization »
Yang Song · Jun Zhu · Yong Ren -
2016 Poster: Stochastic Gradient Geodesic MCMC Methods »
Chang Liu · Jun Zhu · Yang Song -
2016 Poster: Conditional Generative Moment-Matching Networks »
Yong Ren · Jun Zhu · Jialian Li · Yucen Luo -
2015 Poster: Max-Margin Majority Voting for Learning from Crowds »
TIAN TIAN · Jun Zhu -
2015 Poster: Max-Margin Deep Generative Models »
Chongxuan Li · Jun Zhu · Tim Shi · Bo Zhang -
2015 Poster: Convolutional Neural Networks with Intra-Layer Recurrent Connections for Scene Labeling »
Ming Liang · Xiaolin Hu · Bo Zhang -
2014 Poster: Distributed Bayesian Posterior Sampling via Moment Sharing »
Minjie Xu · Balaji Lakshminarayanan · Yee Whye Teh · Jun Zhu · Bo Zhang -
2014 Poster: Spectral Methods for Supervised Topic Models »
Yining Wang · Jun Zhu -
2014 Poster: Robust Bayesian Max-Margin Clustering »
Changyou Chen · Jun Zhu · Xinhua Zhang -
2013 Poster: Scalable Inference for Logistic-Normal Topic Models »
Jianfei Chen · Jun Zhu · Zi Wang · Xun Zheng · Bo Zhang -
2012 Poster: Monte Carlo Methods for Maximum Margin Supervised Topic Models »
Qixia Jiang · Jun Zhu · Maosong Sun · Eric Xing -
2012 Poster: Bayesian Nonparametric Maximum Margin Matrix Factorization for Collaborative Prediction »
Minjie Xu · Jun Zhu · Bo Zhang -
2011 Poster: Infinite Latent SVM for Classification and Multi-task Learning »
Jun Zhu · Ning Chen · Eric Xing -
2010 Poster: Large Margin Learning of Upstream Scene Understanding Models »
Jun Zhu · Li-Jia Li · Li Fei-Fei · Eric Xing -
2010 Poster: Predictive Subspace Learning for Multi-view Data: a Large Margin Approach »
Ning Chen · Jun Zhu · Eric Xing -
2010 Poster: Adaptive Multi-Task Lasso: with Application to eQTL Detection »
Seunghak Lee · Jun Zhu · Eric Xing -
2010 Poster: Efficient Relational Learning with Hidden Variable Detection »
Ni Lao · Jun Zhu · Liu Xinwang · Yandong Liu · William Cohen -
2008 Poster: Partially Observed Maximum Entropy Discrimination Markov Networks »
Jun Zhu · Eric Xing · Bo Zhang