The loss landscapes of deep neural networks remain poorly understood because they are highly nonconvex. Empirically, the local minima of these loss functions can be connected by a learned curve in model space along which the loss remains nearly constant, a feature known as mode connectivity. Yet current curve-finding algorithms do not consider the influence of the symmetry in the loss surface created by permutations of model weights. We propose a more general framework to investigate the effect of symmetry on landscape connectivity by accounting for the weight permutations of the networks being connected. To approximate the optimal permutation, we introduce an inexpensive heuristic referred to as neuron alignment. Neuron alignment promotes similarity between the distribution of intermediate activations of a model along the curve and that of the endpoint models. We provide theoretical analysis establishing the benefit of alignment to mode connectivity based on this simple heuristic. We empirically verify, via a proximal alternating minimization scheme, that the permutation given by alignment is locally optimal. Empirically, optimizing the weight permutation is critical for efficiently learning a simple, planar, low-loss curve between networks that successfully generalizes. Our alignment method can also significantly alleviate the recently identified robust loss barrier on the path connecting two adversarially robust models, and it finds more robust and accurate models on the path.
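The permutation symmetry and the alignment idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`hidden`, `mlp`, `permute_hidden`, `align`, `midpoint`), the two-layer ReLU network, and the correlation-based matching objective are all assumptions chosen for illustration. The exhaustive search over permutations is only feasible for tiny layers; a linear assignment solver would be the usual choice at realistic widths.

```python
import itertools
import numpy as np

def hidden(x, W1, b1):
    """Hidden-layer ReLU activations, shape (samples, width)."""
    return np.maximum(x @ W1 + b1, 0.0)

def mlp(x, W1, b1, W2):
    """Two-layer network: x -> relu(x W1 + b1) W2."""
    return hidden(x, W1, b1) @ W2

def permute_hidden(params, perm):
    """Reorder hidden units; the function the network computes is unchanged."""
    W1, b1, W2 = params
    return W1[:, perm], b1[perm], W2[perm, :]

def align(acts_a, acts_b):
    """Permutation p maximizing sum_i corr(acts_a[:, i], acts_b[:, p[i]]).

    Brute force over all permutations (illustrative assumption, tiny widths only).
    """
    n = acts_a.shape[1]
    # Cross-correlation block between units of model A and units of model B.
    corr = np.corrcoef(acts_a.T, acts_b.T)[:n, n:]
    best = max(itertools.permutations(range(n)),
               key=lambda p: corr[np.arange(n), list(p)].sum())
    return np.array(best)

def midpoint(p, q):
    """Midpoint of the straight line between two parameter tuples."""
    return tuple(0.5 * (a + b) for a, b in zip(p, q))

rng = np.random.default_rng(0)
width = 5
x = rng.normal(size=(256, 3))
model_a = (rng.normal(size=(3, width)),
           0.1 * rng.normal(size=width),
           rng.normal(size=(width, 2)))

# Model B computes the same function as A, but with its hidden units shuffled.
shuffle = rng.permutation(width)
model_b = permute_hidden(model_a, shuffle)

# Recover a permutation of B's units from activations alone, then apply it.
perm = align(hidden(x, *model_a[:2]), hidden(x, *model_b[:2]))
aligned_b = permute_hidden(model_b, perm)

# Compare the midpoint of the linear path before and after alignment.
out_a = mlp(x, *model_a)
gap_unaligned = np.abs(mlp(x, *midpoint(model_a, model_b)) - out_a).max()
gap_aligned = np.abs(mlp(x, *midpoint(model_a, aligned_b)) - out_a).max()
print("midpoint output gap, unaligned:", gap_unaligned)
print("midpoint output gap, aligned:  ", gap_aligned)
```

In this contrived setting the two endpoints are the same network up to a permutation, so alignment makes the straight path between them trivial. For independently trained networks the match is only approximate, which is the regime the abstract addresses with learned curves.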
Author Information
Norman J Tatro (Rensselaer Polytechnic Institute)
Pin-Yu Chen (IBM Research AI)
Payel Das (IBM Research)
Igor Melnyk (IBM Research)
Prasanna Sattigeri (IBM Research)
Rongjie Lai (Rensselaer Polytechnic Institute)
More from the Same Authors
-
2020 : Paper 10: Certified Interpretability Robustness for Class Activation Mapping »
Alex Gu · Tsui-Wei Weng · Pin-Yu Chen · Sijia Liu · Luca Daniel -
2021 : Accurate Multi-Endpoint Molecular Toxicity Predictions in Humans with Contrastive Explanations »
Bhanushee Sharma · Vijil Chenthamarakshan · Amit Dhurandhar · James Hendler · Jonathan S. Dordick · Payel Das -
2021 : Certified Robustness for Free in Differentially Private Federated Learning »
Chulin Xie · Yunhui Long · Pin-Yu Chen · Krishnaram Kenthapadi · Bo Li -
2021 : MAML is a Noisy Contrastive Learner »
Chia-Hsiang Kao · Wei-Chen Chiu · Pin-Yu Chen -
2021 : QTN-VQC: An End-to-End Learning Framework for Quantum Neural Networks »
Jun Qi · Huck Yang · Pin-Yu Chen -
2021 : Sample-Efficient Generation of Novel Photo-acid Generator Molecules using a Deep Generative Model »
Samuel Hoffman · Vijil Chenthamarakshan · Dmitry Zubarev · Daniel Sanders · Payel Das -
2021 : Grapher: Multi-Stage Knowledge Graph Construction using Pretrained Language Models »
Igor Melnyk · Pierre Dognin · Payel Das -
2021 : Pessimistic Model Selection for Offline Deep Reinforcement Learning »
Huck Yang · Yifan Cui · Pin-Yu Chen -
2022 : Reducing Down(stream)time: Pretraining Molecular GNNs using Heterogeneous AI Accelerators »
Jenna A Bilbrey · Kristina Herman · Henry Sprueill · Sotiris Xantheas · Payel Das · Manuel Lopez Roldan · Mike Kraus · Hatem Helal · Sutanay Choudhury -
2022 : Physics-Constrained Deep Learning for Climate Downscaling »
Paula Harder · Qidong Yang · Venkatesh Ramesh · Prasanna Sattigeri · Alex Hernandez-Garcia · Campbell Watson · Daniela Szwarcman · David Rolnick -
2022 : Generating physically-consistent high-resolution climate data with hard-constrained neural networks »
Paula Harder · Qidong Yang · Venkatesh Ramesh · Prasanna Sattigeri · Alex Hernandez-Garcia · Campbell Watson · Daniela Szwarcman · David Rolnick -
2022 : Consistent Training via Energy-Based GFlowNets for Modeling Discrete Joint Distributions »
Chanakya Ekbote · Moksh Jain · Payel Das · Yoshua Bengio -
2022 : Visual Prompting for Adversarial Robustness »
Aochuan Chen · Peter Lorenz · Yuguang Yao · Pin-Yu Chen · Sijia Liu -
2022 : Do Domain Generalization Methods Generalize Well? »
Akshay Mehra · Bhavya Kailkhura · Pin-Yu Chen · Jihun Hamm -
2022 : On the Adversarial Robustness of Vision Transformers »
Rulin Shao · Zhouxing Shi · Jinfeng Yi · Pin-Yu Chen · Cho-Jui Hsieh -
2023 Poster: Pre-Training Protein Encoder via Siamese Sequence-Structure Diffusion Trajectory Prediction »
Zuobai Zhang · Minghao Xu · Aurelie Lozano · Vijil Chenthamarakshan · Payel Das · Jian Tang -
2023 Poster: Effective Human-AI Teams via Learned Natural Language Rules and Onboarding »
Hussein Mozannar · Jimin Lee · Dennis Wei · Prasanna Sattigeri · Subhro Das · David Sontag -
2023 Poster: Equivariant Few-Shot Learning from Pretrained Models »
Sourya Basu · Pulkit Katdare · Prasanna Sattigeri · Vijil Chenthamarakshan · Katherine Driggs-Campbell · Payel Das · Lav Varshney -
2023 Poster: The Impact of Positional Encoding on Length Generalization in Transformers »
Amirhossein Kazemnejad · Inkit Padhi · Karthikeyan Natesan Ramamurthy · Payel Das · Siva Reddy -
2022 : Panel »
Pin-Yu Chen · Alex Gittens · Bo Li · Celia Cintas · Hilde Kuehne · Payel Das -
2022 : Q & A »
Sayak Paul · Sijia Liu · Pin-Yu Chen -
2022 : Deep dive on foundation models for computer vision »
Pin-Yu Chen -
2022 Tutorial: Foundational Robustness of Foundation Models »
Pin-Yu Chen · Sijia Liu · Sayak Paul -
2022 : Basics in foundation model and robustness »
Pin-Yu Chen · Sijia Liu -
2022 : SynBench: Task-Agnostic Benchmarking of Pretrained Representations using Synthetic Data »
Ching-Yun Ko · Pin-Yu Chen · Jeet Mohapatra · Payel Das · Luca Daniel -
2022 Poster: Fair Infinitesimal Jackknife: Mitigating the Influence of Biased Training Data Points Without Refitting »
Prasanna Sattigeri · Soumya Ghosh · Inkit Padhi · Pierre Dognin · Kush Varshney -
2022 Expo Talk Panel: Uncertainty quantification for fair and transparent AI-assisted decision-making »
Prasanna Sattigeri -
2022 Expo Demonstration: Real-time Navigation of Chemical Space with Cloud-Based Inference from MoLFormer »
Payel Das · Brian Belgodere -
2021 Poster: Predicting Deep Neural Network Generalization with Perturbation Response Curves »
Yair Schiff · Brian Quanz · Payel Das · Pin-Yu Chen -
2021 Poster: Mean-based Best Arm Identification in Stochastic Bandits under Reward Contamination »
Arpan Mukherjee · Ali Tajer · Pin-Yu Chen · Payel Das -
2021 Poster: Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Sparse Neural Networks »
Shuai Zhang · Meng Wang · Sijia Liu · Pin-Yu Chen · Jinjun Xiong -
2021 Poster: CAFE: Catastrophic Data Leakage in Vertical Federated Learning »
Xiao Jin · Pin-Yu Chen · Chia-Yi Hsu · Chia-Mu Yu · Tianyi Chen -
2021 Poster: Adversarial Attack Generation Empowered by Min-Max Optimization »
Jingkang Wang · Tianyun Zhang · Sijia Liu · Pin-Yu Chen · Jiacen Xu · Makan Fardad · Bo Li -
2021 : Live Q&A session: MAML is a Noisy Contrastive Learner »
Chia-Hsiang Kao · Wei-Chen Chiu · Pin-Yu Chen -
2021 : Contributed Talk (Oral): MAML is a Noisy Contrastive Learner »
Chia-Hsiang Kao · Wei-Chen Chiu · Pin-Yu Chen -
2021 : SenSE: A Toolkit for Semantic Change Exploration via Word Embedding Alignment »
Maurício Gruppi · Sibel Adali · Pin-Yu Chen -
2021 Poster: When does Contrastive Learning Preserve Adversarial Robustness from Pretraining to Finetuning? »
Lijie Fan · Sijia Liu · Pin-Yu Chen · Gaoyuan Zhang · Chuang Gan -
2021 Poster: Formalizing Generalization and Adversarial Robustness of Neural Networks to Weight Perturbations »
Yu-Lin Tsai · Chia-Yi Hsu · Chia-Mu Yu · Pin-Yu Chen -
2021 Poster: Scalable Intervention Target Estimation in Linear Models »
Burak Varici · Karthikeyan Shanmugam · Prasanna Sattigeri · Ali Tajer -
2021 Poster: Understanding the Limits of Unsupervised Domain Adaptation via Data Poisoning »
Akshay Mehra · Bhavya Kailkhura · Pin-Yu Chen · Jihun Hamm -
2020 : Spotlight: Characterizing the Latent Space of Molecular Generative Models with Persistent Homology Metrics »
Yair Schiff · Payel Das · Vijil Chenthamarakshan · Karthikeyan Natesan Ramamurthy -
2020 Poster: A Decentralized Parallel Algorithm for Training Generative Adversarial Nets »
Mingrui Liu · Wei Zhang · Youssef Mroueh · Xiaodong Cui · Jarret Ross · Tianbao Yang · Payel Das -
2020 Poster: ScaleCom: Scalable Sparsified Gradient Compression for Communication-Efficient Distributed Training »
Chia-Yu Chen · Jiamin Ni · Songtao Lu · Xiaodong Cui · Pin-Yu Chen · Xiao Sun · Naigang Wang · Swagath Venkataramani · Vijayalakshmi (Viji) Srinivasan · Wei Zhang · Kailash Gopalakrishnan -
2020 : Spotlight on women at IBM Research »
Lisa Amini · Francesca Rossi · Celia Cintas · Payel Das -
2020 Poster: CogMol: Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models »
Vijil Chenthamarakshan · Payel Das · Samuel Hoffman · Hendrik Strobelt · Inkit Padhi · Kar Wai Lim · Benjamin Hoover · Matteo Manica · Jannis Born · Teodoro Laino · Aleksandra Mojsilovic -
2020 : CogMol: Target-Specific and Selective Drug Design for COVID-19 Using Deep Generative Models »
Payel Das -
2020 Poster: Higher-Order Certification For Randomized Smoothing »
Jeet Mohapatra · Ching-Yun Ko · Tsui-Wei Weng · Pin-Yu Chen · Sijia Liu · Luca Daniel -
2020 Spotlight: Higher-Order Certification For Randomized Smoothing »
Jeet Mohapatra · Ching-Yun Ko · Tsui-Wei Weng · Pin-Yu Chen · Sijia Liu · Luca Daniel -
2020 Expo Talk Panel: AI against COVID-19 at IBM Research »
Divya Pathak · Payel Das · Michal Rosen-Zvi · Salim Roukos -
2019 : Coffee Break and Poster Session »
Rameswar Panda · Prasanna Sattigeri · Kush Varshney · Karthikeyan Natesan Ramamurthy · Harvineet Singh · Vishwali Mhasawade · Shalmali Joshi · Laleh Seyyed-Kalantari · Matthew McDermott · Gal Yona · James Atwood · Hansa Srinivasan · Yonatan Halpern · D. Sculley · Behrouz Babaki · Margarida Carvalho · Josie Williams · Narges Razavian · Haoran Zhang · Amy Lu · Irene Y Chen · Xiaojie Mao · Angela Zhou · Nathan Kallus -
2019 : Poster Session »
Ahana Ghosh · Javad Shafiee · Akhilan Boopathy · Alex Tamkin · Theodoros Vasiloudis · Vedant Nanda · Ali Baheri · Paul Fieguth · Andrew Bennett · Guanya Shi · Hao Liu · Arushi Jain · Jacob Tyo · Benjie Wang · Boxiao Chen · Carroll Wainwright · Chandramouli Shama Sastry · Chao Tang · Daniel S. Brown · David Inouye · David Venuto · Dhruv Ramani · Dimitrios Diochnos · Divyam Madaan · Dmitrii Krashenikov · Joel Oren · Doyup Lee · Eleanor Quint · elmira amirloo · Matteo Pirotta · Gavin Hartnett · Geoffroy Dubourg-Felonneau · Gokul Swamy · Pin-Yu Chen · Ilija Bogunovic · Jason Carter · Javier Garcia-Barcos · Jeet Mohapatra · Jesse Zhang · Jian Qian · John Martin · Oliver Richter · Federico Zaiter · Tsui-Wei Weng · Karthik Abinav Sankararaman · Kyriakos Polymenakos · Lan Hoang · mahdieh abbasi · Marco Gallieri · Mathieu Seurin · Matteo Papini · Matteo Turchetta · Matthew Sotoudeh · Mehrdad Hosseinzadeh · Nathan Fulton · Masatoshi Uehara · Niranjani Prasad · Oana-Maria Camburu · Patrik Kolaric · Philipp Renz · Prateek Jaiswal · Reazul Hasan Russel · Riashat Islam · Rishabh Agarwal · Alexander Aldrick · Sachin Vernekar · Sahin Lale · Sai Kiran Narayanaswami · Samuel Daulton · Sanjam Garg · Sebastian East · Shun Zhang · Soheil Dsidbari · Justin Goodwin · Victoria Krakovna · Wenhao Luo · Wesley Chung · Yuanyuan Shi · Yuh-Shyang Wang · Hongwei Jin · Ziping Xu -
2019 Poster: Learning New Tricks From Old Dogs: Multi-Source Transfer Learning From Pre-Trained Networks »
Joshua Lee · Prasanna Sattigeri · Gregory Wornell -
2018 : Contributed Work »
Thaer Moustafa Dieb · Aditya Balu · Amir H. Khasahmadi · Viraj Shah · Boris Knyazev · Payel Das · Garrett Goh · Georgy Derevyanko · Gianni De Fabritiis · Reiko Hagawa · John Ingraham · David Belanger · Jialin Song · Kim Nicoli · Miha Skalic · Michelle Wu · Niklas Gebauer · Peter Bjørn Jørgensen · Ryan-Rhys Griffiths · Shengchao Liu · Sheshera Mysore · Hai Leong Chieu · Philippe Schwaller · Bart Olsthoorn · Bianca-Cristina Cristescu · Wei-Cheng Tseng · Seongok Ryu · Iddo Drori · Kevin Yang · Soumya Sanyal · Zois Boukouvalas · Rishi Bedi · Arindam Paul · Sambuddha Ghosal · Daniil Bash · Clyde Fare · Zekun Ren · Ali Oskooei · Minn Xuan Wong · Paul Sinz · Théophile Gaudin · Wengong Jin · Paul Leu -
2018 Poster: Zeroth-Order Stochastic Variance Reduction for Nonconvex Optimization »
Sijia Liu · Bhavya Kailkhura · Pin-Yu Chen · Paishun Ting · Shiyu Chang · Lisa Amini -
2018 Poster: Efficient Neural Network Robustness Certification with General Activation Functions »
Huan Zhang · Tsui-Wei Weng · Pin-Yu Chen · Cho-Jui Hsieh · Luca Daniel -
2018 Demonstration: PatentAI: IP Infringement Detection with Enhanced Paraphrase Identification »
Youssef Drissi · Karthikeyan Natesan Ramamurthy · Prasanna Sattigeri -
2018 Poster: Explanations based on the Missing: Towards Contrastive Explanations with Pertinent Negatives »
Amit Dhurandhar · Pin-Yu Chen · Ronny Luss · Chun-Chen Tu · Paishun Ting · Karthikeyan Shanmugam · Payel Das -
2018 Poster: Co-regularized Alignment for Unsupervised Domain Adaptation »
Abhishek Kumar · Prasanna Sattigeri · Kahini Wadhawan · Leonid Karlinsky · Rogerio Feris · Bill Freeman · Gregory Wornell -
2017 : Poster session + Coffee break »
Mikael Kågebäck · Igor Melnyk · Amir-Hossein Karimi · Gino Brunner · Ershad Banijamali · Chris Donahue · Jake Zhao · Giambattista Parascandolo · Valentin Thomas · Abhishek Kumar · Chris Burgess · Amanda Nilsson · Maria Larsson · Cian Eastwood · Momchil Peychev -
2017 Poster: Semi-supervised Learning with GANs: Manifold Invariance with Improved Inference »
Abhishek Kumar · Prasanna Sattigeri · Tom Fletcher