Importance sampling has become an indispensable strategy to speed up optimization algorithms for large-scale applications. Improved adaptive variants, which use importance values defined by the complete gradient information that changes during optimization, enjoy favorable theoretical properties but are typically computationally infeasible. In this paper we propose an efficient approximation of gradient-based sampling, which is based on safe bounds on the gradient. The proposed sampling distribution is (i) provably the *best sampling* with respect to the given bounds, (ii) always better than uniform sampling and fixed importance sampling, and (iii) efficiently computable, in many applications at negligible extra cost. The proposed sampling scheme is generic and can easily be integrated into existing algorithms. In particular, we show that coordinate descent (CD) and stochastic gradient descent (SGD) can enjoy a significant speed-up under the novel scheme. The proven efficiency of the proposed sampling is verified by extensive numerical testing.
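To make the idea of sampling from gradient bounds concrete, here is a minimal sketch. It is not the paper's optimal distribution, which the abstract does not spell out. As a simple stand-in, it samples each example with probability proportional to an assumed upper bound on its gradient magnitude, then reweights the sampled gradient so the estimate stays unbiased. The function names, the scalar per-example gradients, and the fallback to uniform sampling are all illustrative assumptions.

```python
import random


def sampling_from_upper_bounds(upper_bounds):
    """Map nonnegative upper bounds on gradient magnitudes to probabilities.

    Assumption: plain proportional sampling, used here only as a stand-in
    for the bound-based distribution described in the abstract.
    """
    total = sum(upper_bounds)
    if total <= 0:
        # No information in the bounds: fall back to uniform sampling.
        n = len(upper_bounds)
        return [1.0 / n] * n
    return [u / total for u in upper_bounds]


def importance_sampled_gradient(per_example_grads, probs, rng):
    """Draw one index i ~ probs and return grad_i / (n * p_i).

    The 1 / (n * p_i) reweighting makes the estimate unbiased for the
    mean gradient (1/n) * sum_i grad_i.
    """
    n = len(per_example_grads)
    i = rng.choices(range(n), weights=probs, k=1)[0]
    return per_example_grads[i] / (n * probs[i])


# Hypothetical toy data: scalar gradients and valid upper bounds on |grad_i|.
grads = [0.5, -2.0, 4.0, 0.0]
bounds = [1.0, 2.5, 4.0, 0.5]
probs = sampling_from_upper_bounds(bounds)
estimate = importance_sampled_gradient(grads, probs, random.Random(0))
```

Examples with larger bounds are drawn more often, and the reweighting compensates, so the expected estimate still equals the mean gradient.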
Author Information
Sebastian Stich (EPFL)
Dr. [Sebastian U. Stich](https://sstich.ch/) is a faculty member at the CISPA Helmholtz Center for Information Security. Research interests:
- *methods for machine learning and statistics*, at the interface of theory and practice
- *collaborative learning* (distributed, federated and decentralized methods)
- *optimization for machine learning* (adaptive stochastic methods and generalization performance)
Anant Raj (Max Planck Institute for Intelligent Systems)
Martin Jaggi (EPFL)
Related Events (a corresponding poster, oral, or spotlight)
-
2017 Poster: Safe Adaptive Importance Sampling »
Thu. Dec 7th 02:30 -- 06:30 AM Room Pacific Ballroom #172
More from the Same Authors
-
2021 : Interpreting Language Models Through Knowledge Graph Extraction »
Vinitra Swamy · Angelika Romanou · Martin Jaggi -
2021 : Escaping Local Minima With Stochastic Noise »
Harshvardhan Harshvardhan · Sebastian Stich -
2021 : Understanding Memorization from the Perspective of Optimization via Efficient Influence Estimation »
Futong Liu · Tao Lin · Martin Jaggi -
2021 : WAFFLE: Weighted Averaging for Personalized Federated Learning »
Martin Beaussart · Mary-Anne Hartley · Martin Jaggi -
2021 : The Peril of Popular Deep Learning Uncertainty Estimation Methods »
Yehao Liu · Matteo Pagliardini · Tatjana Chavdarova · Sebastian Stich -
2022 : Data-heterogeneity-aware Mixing for Decentralized Learning »
Yatin Dandi · Anastasiia Koloskova · Martin Jaggi · Sebastian Stich -
2022 : Decentralized Stochastic Optimization with Client Sampling »
Ziwei Liu · Anastasiia Koloskova · Martin Jaggi · Tao Lin -
2022 : Towards Provably Personalized Federated Learning via Threshold-Clustering of Similar Clients »
Mariel A Werner · Lie He · Sai Praneeth Karimireddy · Michael Jordan · Martin Jaggi -
2022 : Diversity through Disagreement for Better Transferability »
Matteo Pagliardini · Martin Jaggi · François Fleuret · Sai Praneeth Karimireddy -
2023 Poster: MultiMoDN—Multimodal, Multi-Task, Interpretable Modular Networks »
Vinitra Swamy · Malika Satayeva · Jibril Frej · Thierry Bossy · Thijs Vogels · Martin Jaggi · Tanja Käser · Mary-Anne Hartley -
2023 Poster: Hardware-Efficient Transformer Training via Piecewise Affine Operations »
Atli Kosson · Martin Jaggi -
2023 Poster: Faster Causal Attention Over Large Sequences Through Sparse Flash Attention »
Matteo Pagliardini · Daniele Paliotta · Martin Jaggi · François Fleuret -
2023 Poster: Collaborative Learning via Prediction Consensus »
Dongyang Fan · Celestine Mendler-Dünner · Martin Jaggi -
2023 Poster: Random-Access Infinite Context Length for Transformers »
Amirkeivan Mohtashami · Martin Jaggi -
2022 : Scalable Collaborative Learning via Representation Sharing »
Frédéric Berdoz · Abhishek Singh · Martin Jaggi · Ramesh Raskar -
2022 Poster: Sharper Convergence Guarantees for Asynchronous SGD for Distributed and Federated Learning »
Anastasiia Koloskova · Sebastian Stich · Martin Jaggi -
2022 Poster: FLamby: Datasets and Benchmarks for Cross-Silo Federated Learning in Realistic Healthcare Settings »
Jean Ogier du Terrail · Samy-Safwan Ayed · Edwige Cyffers · Felix Grimberg · Chaoyang He · Regis Loeb · Paul Mangold · Tanguy Marchand · Othmane Marfoq · Erum Mushtaq · Boris Muzellec · Constantin Philippenko · Santiago Silva · Maria Teleńczuk · Shadi Albarqouni · Salman Avestimehr · Aurélien Bellet · Aymeric Dieuleveut · Martin Jaggi · Sai Praneeth Karimireddy · Marco Lorenzi · Giovanni Neglia · Marc Tommasi · Mathieu Andreux -
2022 Poster: Beyond spectral gap: the role of the topology in decentralized learning »
Thijs Vogels · Hadrien Hendrikx · Martin Jaggi -
2021 : [S11] Interpreting Language Models Through Knowledge Graph Extraction »
Vinitra Swamy · Angelika Romanou · Martin Jaggi -
2021 : Contributed Talks in Session 1 (Zoom) »
Sebastian Stich · Futong Liu · Abdurakhmon Sadiev · Frederik Benzing · Simon Roburin -
2021 : Q&A with Martin Jaggi »
Martin Jaggi -
2021 : Learning with Strange Gradients, Martin Jaggi »
Martin Jaggi -
2021 : Opening Remarks to Session 1 »
Sebastian Stich -
2021 Workshop: OPT 2021: Optimization for Machine Learning »
Courtney Paquette · Quanquan Gu · Oliver Hinder · Katya Scheinberg · Sebastian Stich · Martin Takac -
2021 Poster: Breaking the centralized barrier for cross-device federated learning »
Sai Praneeth Karimireddy · Martin Jaggi · Satyen Kale · Mehryar Mohri · Sashank Reddi · Sebastian Stich · Ananda Theertha Suresh -
2021 Poster: RelaySum for Decentralized Deep Learning on Heterogeneous Data »
Thijs Vogels · Lie He · Anastasiia Koloskova · Sai Praneeth Karimireddy · Tao Lin · Sebastian Stich · Martin Jaggi -
2021 Poster: An Improved Analysis of Gradient Tracking for Decentralized Machine Learning »
Anastasiia Koloskova · Tao Lin · Sebastian Stich -
2020 : Closing remarks »
Quanquan Gu · Courtney Paquette · Mark Schmidt · Sebastian Stich · Martin Takac -
2020 : Contributed talks in Session 1 (Zoom) »
Sebastian Stich · Laurent Condat · Zhize Li · Ohad Shamir · Tiffany Vlaar · Mohammadi Zaki -
2020 : Live Q&A with Volkan Cevher (Zoom) »
Sebastian Stich -
2020 : Live Q&A with Tong Zhang (Zoom) »
Sebastian Stich -
2020 : Welcome remarks to Session 1 »
Sebastian Stich -
2020 Workshop: OPT2020: Optimization for Machine Learning »
Courtney Paquette · Mark Schmidt · Sebastian Stich · Quanquan Gu · Martin Takac -
2020 : Welcome event (gather.town) »
Quanquan Gu · Courtney Paquette · Mark Schmidt · Sebastian Stich · Martin Takac -
2020 Poster: Stochastic Stein Discrepancies »
Jackson Gorham · Anant Raj · Lester Mackey -
2020 Poster: Ensemble Distillation for Robust Model Fusion in Federated Learning »
Tao Lin · Lingjing Kong · Sebastian Stich · Martin Jaggi -
2020 Poster: Dual Instrumental Variable Regression »
Krikamol Muandet · Arash Mehrjou · Si Kai Lee · Anant Raj -
2020 Poster: Practical Low-Rank Communication Compression in Decentralized Deep Learning »
Thijs Vogels · Sai Praneeth Karimireddy · Martin Jaggi -
2020 Poster: Model Fusion via Optimal Transport »
Sidak Pal Singh · Martin Jaggi -
2019 Poster: PowerSGD: Practical Low-Rank Gradient Compression for Distributed Optimization »
Thijs Vogels · Sai Praneeth Karimireddy · Martin Jaggi -
2019 Poster: Unsupervised Scalable Representation Learning for Multivariate Time Series »
Jean-Yves Franceschi · Aymeric Dieuleveut · Martin Jaggi -
2018 Poster: Accelerated Stochastic Matrix Inversion: General Theory and Speeding up BFGS Rules for Faster Second-Order Optimization »
Robert Gower · Filip Hanzely · Peter Richtarik · Sebastian Stich -
2018 Poster: COLA: Decentralized Linear Learning »
Lie He · Yatao Bian · Martin Jaggi -
2018 Poster: Sparsified SGD with Memory »
Sebastian Stich · Jean-Baptiste Cordonnier · Martin Jaggi -
2018 Poster: Training DNNs with Hybrid Block Floating Point »
Mario Drumond · Tao Lin · Martin Jaggi · Babak Falsafi -
2017 Poster: Greedy Algorithms for Cone Constrained Optimization with Convergence Guarantees »
Francesco Locatello · Michael Tschannen · Gunnar Ratsch · Martin Jaggi -
2017 Poster: Efficient Use of Limited-Memory Accelerators for Linear Learning on Heterogeneous Systems »
Celestine Dünner · Thomas Parnell · Martin Jaggi -
2015 Poster: On the Global Linear Convergence of Frank-Wolfe Optimization Variants »
Simon Lacoste-Julien · Martin Jaggi -
2014 Workshop: OPT2014: Optimization for Machine Learning »
Zaid Harchaoui · Suvrit Sra · Alekh Agarwal · Martin Jaggi · Miro Dudik · Aaditya Ramdas · Jean Lasserre · Yoshua Bengio · Amir Beck -
2014 Poster: Communication-Efficient Distributed Dual Coordinate Ascent »
Martin Jaggi · Virginia Smith · Martin Takac · Jonathan Terhorst · Sanjay Krishnan · Thomas Hofmann · Michael Jordan -
2013 Workshop: Greedy Algorithms, Frank-Wolfe and Friends - A modern perspective »
Martin Jaggi · Zaid Harchaoui · Federico Pierucci