Timezone: »
Stochastic Gradient Descent (SGD) is a popular algorithm that can achieve state-of-the-art performance on a variety of machine learning tasks. Several researchers have recently proposed schemes to parallelize SGD, but all require performance-destroying memory locking and synchronization. This work aims to show using novel theoretical analysis, algorithms, and implementation that SGD can be implemented without any locking. We present an update scheme called Hogwild which allows processors access to shared memory with the possibility of overwriting each other's work. We show that when the associated optimization problem is sparse, meaning most gradient updates only modify small parts of the decision variable, then Hogwild achieves a nearly optimal rate of convergence. We demonstrate experimentally that Hogwild outperforms alternative schemes that use locking by an order of magnitude.
Author Information
Benjamin Recht (UC Berkeley)
Christopher Ré (Stanford)
Stephen Wright (UW-Madison)
Steve Wright is a Professor of Computer Sciences at the University of Wisconsin-Madison. His research interests lie in computational optimization and its applications to science and engineering. Prior to joining UW-Madison in 2001, Wright was a Senior Computer Scientist (1997-2001) and Computer Scientist (1990-1997) at Argonne National Laboratory, and Professor of Computer Science at the University of Chicago (2000-2001). He is the past Chair of the Mathematical Optimization Society (formerly the Mathematical Programming Society), the leading professional society in optimization, and a member of the Board of the Society for Industrial and Applied Mathematics (SIAM). Wright is the author or co-author of four widely used books in numerical optimization, including "Primal Dual Interior-Point Methods" (SIAM, 1997) and "Numerical Optimization" (with J. Nocedal, Second Edition, Springer, 2006). He has also authored over 85 refereed journal papers on optimization theory, algorithms, software, and applications. He is coauthor of widely used interior-point software for linear and quadratic optimization. His recent research includes algorithms, applications, and theory for sparse optimization (including applications in compressed sensing and machine learning).
Feng Niu (MeasureMe)
More from the Same Authors
-
2021 : Personalized Benchmarking with the Ludwig Benchmarking Toolkit »
Avanika Narayan · Piero Molino · Karan Goel · Willie Neiswanger · Christopher Ré -
2021 : SKM-TEA: A Dataset for Accelerated MRI Reconstruction with Dense Image Labels for Quantitative Clinical Evaluation »
Arjun Desai · Andrew Schmidt · Elka Rubin · Christopher Sandino · Marianne Black · Valentina Mazzoli · Kathryn Stevens · Robert Boutin · Christopher Ré · Garry Gold · Brian Hargreaves · Akshay Chaudhari -
2021 : Do ImageNet Classifiers Generalize to ImageNet? »
Benjamin Recht · Becca Roelofs · Ludwig Schmidt · Vaishaal Shankar -
2021 : Evaluating Machine Accuracy on ImageNet »
Vaishaal Shankar · Becca Roelofs · Horia Mania · Benjamin Recht · Ludwig Schmidt -
2021 : Measuring Robustness to Natural Distribution Shifts in Image Classification »
Rohan Taori · Achal Dave · Vaishaal Shankar · Nicholas Carlini · Benjamin Recht · Ludwig Schmidt -
2021 : Combining Recurrent, Convolutional, and Continuous-Time Models with Structured Learnable Linear State-Space Layers »
Isys Johnson · Albert Gu · Karan Goel · Khaled Saab · Tri Dao · Atri Rudra · Christopher Ré -
2022 : BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach »
Mao Ye · Bo Liu · Stephen Wright · Peter Stone · Qiang Liu -
2023 Poster: HyenaDNA: Long-Range Genomic Sequence Modeling at Single Nucleotide Resolution »
Eric Nguyen · Michael Poli · Marjan Faizi · Armin Thomas · Michael Wornow · Callum Birch-Sykes · Stefano Massaroli · Aman Patel · Clayton Rabideau · Yoshua Bengio · Stefano Ermon · Christopher Ré · Stephen Baccus -
2023 Poster: Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture »
Dan Fu · Jessica R Grogan · Isys Johnson · Simran Arora · Evan Sabri Eyuboglu · Armin Thomas · Benjamin Spector · Michael Poli · Atri Rudra · Christopher Ré -
2023 Poster: A case for reframing automated medical image classification as segmentation »
Sarah Hooper · Mayee Chen · Khaled Saab · Kush Bhatia · Curtis Langlotz · Christopher Ré -
2023 Poster: TART: A plug-and-play Transformer module for task-agnostic reasoning »
Kush Bhatia · Avanika Narayan · Christopher De Sa · Christopher Ré -
2023 Poster: H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models »
Zhenyu Zhang · Ying Sheng · Tianyi Zhou · Tianlong Chen · Lianmin Zheng · Ruisi Cai · Zhao Song · Yuandong Tian · Christopher Ré · Clark Barrett · Zhangyang Wang · Beidi Chen -
2023 Poster: Laughing Hyena Distillery: Extracting Compact Recurrences From Convolutions »
Stefano Massaroli · Michael Poli · Dan Fu · Hermann Kumbong · David Romero · Rom Parnichkun · Aman Timalsina · Quinn McIntyre · Beidi Chen · Atri Rudra · Ce Zhang · Christopher Ré · Stefano Ermon · Yoshua Bengio -
2023 Poster: Skill-it! A data-driven skills framework for understanding and training language models »
Mayee Chen · Nicholas Roberts · Kush Bhatia · Jue WANG · Ce Zhang · Frederic Sala · Christopher Ré -
2023 Poster: Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification »
Neel Guha · Mayee Chen · Kush Bhatia · Azalia Mirhoseini · Frederic Sala · Christopher Ré -
2023 Poster: Robust Second-Order Nonconvex Optimization and Its Application to Low Rank Matrix Sensing »
Shuyao Li · Yu Cheng · Ilias Diakonikolas · Jelena Diakonikolas · Rong Ge · Stephen Wright -
2023 Poster: LegalBench: A Collaboratively Built Benchmark for Measuring Legal Reasoning in Large Language Models »
Neel Guha · Julian Nyarko · Daniel Ho · Christopher Ré · Adam Chilton · Aditya K · Alex Chohlas-Wood · Austin Peters · Brandon Waldon · Daniel Rockmore · Diego Zambrano · Dmitry Talisman · Enam Hoque · Faiz Surani · Frank Fagan · Galit Sarfaty · Gregory Dickinson · Haggai Porat · Jason Hegland · Jessica Wu · Joe Nudell · Joel Niklaus · John Nay · Jonathan Choi · Kevin Tobia · Margaret Hagan · Megan Ma · Michael Livermore · Nikon Rasumov-Rahe · Nils Holzenberger · Noam Kolt · Peter Henderson · Sean Rehaag · Sharad Goel · Shang Gao · Spencer Williams · Sunny Gandhi · Tom Zur · Varun Iyer · Zehua Li -
2023 Oral: Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture »
Dan Fu · Jessica R Grogan · Isys Johnson · Simran Arora · Evan Sabri Eyuboglu · Armin Thomas · Benjamin Spector · Michael Poli · Atri Rudra · Christopher Ré -
2022 Spotlight: Machine Learning on Graphs: A Model and Comprehensive Taxonomy »
Ines Chami · Sami Abu-El-Haija · Bryan Perozzi · Christopher Ré · Kevin Murphy -
2022 Poster: On the Parameterization and Initialization of Diagonal State Space Models »
Albert Gu · Karan Goel · Ankit Gupta · Christopher Ré -
2022 Poster: Self-Supervised Learning of Brain Dynamics from Broad Neuroimaging Data »
Armin Thomas · Christopher Ré · Russell Poldrack -
2022 Poster: HAPI: A Large-scale Longitudinal Dataset of Commercial ML API Predictions »
Lingjiao Chen · Zhihua Jin · Evan Sabri Eyuboglu · Christopher Ré · Matei Zaharia · James Zou -
2022 Poster: BOME! Bilevel Optimization Made Easy: A Simple First-Order Approach »
Bo Liu · Mao Ye · Stephen Wright · Peter Stone · Qiang Liu -
2022 Poster: Coordinate Linear Variance Reduction for Generalized Linear Programming »
Chaobing Song · Cheuk Yin Lin · Stephen Wright · Jelena Diakonikolas -
2022 Poster: FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness »
Tri Dao · Dan Fu · Stefano Ermon · Atri Rudra · Christopher Ré -
2022 Poster: Contrastive Adapters for Foundation Model Group Robustness »
Michael Zhang · Christopher Ré -
2022 Poster: Decentralized Training of Foundation Models in Heterogeneous Environments »
Binhang Yuan · Yongjun He · Jared Davis · Tianyi Zhang · Tri Dao · Beidi Chen · Percy Liang · Christopher Ré · Ce Zhang -
2022 Poster: Transform Once: Efficient Operator Learning in Frequency Domain »
Michael Poli · Stefano Massaroli · Federico Berto · Jinkyoo Park · Tri Dao · Christopher Ré · Stefano Ermon -
2022 Poster: Machine Learning on Graphs: A Model and Comprehensive Taxonomy »
Ines Chami · Sami Abu-El-Haija · Bryan Perozzi · Christopher Ré · Kevin Murphy -
2022 Poster: S4ND: Modeling Images and Videos as Multidimensional Signals with State Spaces »
Eric Nguyen · Karan Goel · Albert Gu · Gordon Downs · Preey Shah · Tri Dao · Stephen Baccus · Christopher Ré -
2022 Poster: Fine-tuning Language Models over Slow Networks using Activation Quantization with Guarantees »
Jue WANG · Binhang Yuan · Luka Rimanic · Yongjun He · Tri Dao · Beidi Chen · Christopher Ré · Ce Zhang -
2021 Poster: Combining Recurrent, Convolutional, and Continuous-time Models with Linear State Space Layers »
Albert Gu · Isys Johnson · Karan Goel · Khaled Saab · Tri Dao · Atri Rudra · Christopher Ré -
2021 Poster: Rethinking Neural Operations for Diverse Tasks »
Nicholas Roberts · Mikhail Khodak · Tri Dao · Liam Li · Christopher Ré · Ameet Talwalkar -
2020 : Contributed Talk 6: Do Offline Metrics Predict Online Performance in Recommender Systems? »
Karl Krauth · Sarah Dean · Wenshuo Guo · Benjamin Recht · Michael Jordan -
2020 Workshop: Differential Geometry meets Deep Learning (DiffGeo4DL) »
Joey Bose · Emile Mathieu · Charline Le Lan · Ines Chami · Frederic Sala · Christopher De Sa · Maximilian Nickel · Christopher Ré · Will Hamilton -
2020 Poster: HiPPO: Recurrent Memory with Optimal Polynomial Projections »
Albert Gu · Tri Dao · Stefano Ermon · Atri Rudra · Christopher Ré -
2020 Spotlight: HiPPO: Recurrent Memory with Optimal Polynomial Projections »
Albert Gu · Tri Dao · Stefano Ermon · Atri Rudra · Christopher Ré -
2020 Poster: Measuring Robustness to Natural Distribution Shifts in Image Classification »
Rohan Taori · Achal Dave · Vaishaal Shankar · Nicholas Carlini · Benjamin Recht · Ludwig Schmidt -
2020 Spotlight: Measuring Robustness to Natural Distribution Shifts in Image Classification »
Rohan Taori · Achal Dave · Vaishaal Shankar · Nicholas Carlini · Benjamin Recht · Ludwig Schmidt -
2020 Poster: From Trees to Continuous Embeddings and Back: Hyperbolic Hierarchical Clustering »
Ines Chami · Albert Gu · Vaggos Chatziafratis · Christopher Ré -
2019 : Second-order methods for nonconvex optimization with complexity guarantees »
Stephen Wright -
2019 Workshop: KR2ML - Knowledge Representation and Reasoning Meets Machine Learning »
Veronika Thost · Christian Muise · Kartik Talamadupula · Sameer Singh · Christopher Ré -
2019 Poster: On the Downstream Performance of Compressed Word Embeddings »
Avner May · Jian Zhang · Tri Dao · Christopher Ré -
2019 Spotlight: On the Downstream Performance of Compressed Word Embeddings »
Avner May · Jian Zhang · Tri Dao · Christopher Ré -
2019 Poster: Multi-Resolution Weak Supervision for Sequential Data »
Paroma Varma · Frederic Sala · Shiori Sagawa · Jason Fries · Dan Fu · Saelig Khattar · Ashwini Ramamoorthy · Ke Xiao · Kayvon Fatahalian · James Priest · Christopher Ré -
2019 Poster: Slice-based Learning: A Programming Model for Residual Learning in Critical Data Slices »
Vincent Chen · Sen Wu · Alexander Ratner · Jen Weng · Christopher Ré -
2019 Poster: Hyperbolic Graph Convolutional Neural Networks »
Ines Chami · Zhitao Ying · Christopher Ré · Jure Leskovec -
2019 Poster: Model Similarity Mitigates Test Set Overuse »
Horia Mania · John Miller · Ludwig Schmidt · Moritz Hardt · Benjamin Recht -
2019 Poster: A Meta-Analysis of Overfitting in Machine Learning »
Becca Roelofs · Vaishaal Shankar · Benjamin Recht · Sara Fridovich-Keil · Moritz Hardt · John Miller · Ludwig Schmidt -
2019 Poster: Finite-time Analysis of Approximate Policy Iteration for the Linear Quadratic Regulator »
Karl Krauth · Stephen Tu · Benjamin Recht -
2019 Poster: Certainty Equivalence is Efficient for Linear Quadratic Control »
Horia Mania · Stephen Tu · Benjamin Recht -
2018 Workshop: Relational Representation Learning »
Aditya Grover · Paroma Varma · Frederic Sala · Christopher Ré · Jennifer Neville · Stefano Ermon · Steven Holtzen -
2018 Poster: Simple random search of static linear policies is competitive for reinforcement learning »
Horia Mania · Aurelia Guy · Benjamin Recht -
2018 Poster: Regret Bounds for Robust Adaptive Control of the Linear Quadratic Regulator »
Sarah Dean · Horia Mania · Nikolai Matni · Benjamin Recht · Stephen Tu -
2018 Poster: ATOMO: Communication-efficient Learning via Atomic Sparsification »
Hongyi Wang · Scott Sievert · Shengchao Liu · Zachary Charles · Dimitris Papailiopoulos · Stephen Wright -
2018 Poster: Learning Compressed Transforms with Low Displacement Rank »
Anna Thomas · Albert Gu · Tri Dao · Atri Rudra · Christopher Ré -
2017 Workshop: Learning with Limited Labeled Data: Weak Supervision and Beyond »
Isabelle Augenstein · Stephen Bach · Eugene Belilovsky · Matthew Blaschko · Christoph Lampert · Edouard Oyallon · Emmanouil Antonios Platanios · Alexander Ratner · Christopher Ré -
2017 Workshop: OPT 2017: Optimization for Machine Learning »
Suvrit Sra · Sashank J. Reddi · Alekh Agarwal · Benjamin Recht -
2017 Workshop: ML Systems Workshop @ NIPS 2017 »
Aparna Lakshmiratan · Sarah Bird · Siddhartha Sen · Christopher Ré · Li Erran Li · Joseph Gonzalez · Daniel Crankshaw -
2017 Demonstration: Babble Labble: Learning from Natural Language Explanations »
Braden Hancock · Paroma Varma · Percy Liang · Christopher Ré · Stephanie Wang -
2017 Poster: Learning to Compose Domain-Specific Transformations for Data Augmentation »
Alexander Ratner · Henry Ehrenberg · Zeshan Hussain · Jared Dunnmon · Christopher Ré -
2017 Poster: The Marginal Value of Adaptive Gradient Methods in Machine Learning »
Ashia C Wilson · Becca Roelofs · Mitchell Stern · Nati Srebro · Benjamin Recht -
2017 Oral: The Marginal Value of Adaptive Gradient Methods in Machine Learning »
Ashia C Wilson · Becca Roelofs · Mitchell Stern · Nati Srebro · Benjamin Recht -
2017 Poster: Gaussian Quadrature for Kernel Features »
Tri Dao · Christopher M De Sa · Christopher Ré -
2017 Spotlight: Gaussian Quadrature for Kernel Features »
Tri Dao · Christopher M De Sa · Christopher Ré -
2017 Oral: Test of Time Award »
ali rahimi · Benjamin Recht -
2017 Poster: Inferring Generative Model Structure with Static Analysis »
Paroma Varma · Bryan He · Payal Bajaj · Nishith Khandwala · Imon Banerjee · Daniel Rubin · Christopher Ré -
2017 Poster: k-Support and Ordered Weighted Sparsity for Overlapping Groups: Hardness and Algorithms »
Cong Han Lim · Stephen Wright -
2016 : Invited Talk: You've been using asynchrony wrong your whole life! (Chris Re, Stanford) »
Christopher Ré -
2016 Poster: The Power of Adaptivity in Identifying Statistical Alternatives »
Kevin Jamieson · Daniel Haas · Benjamin Recht -
2016 Poster: Cyclades: Conflict-free Asynchronous Machine Learning »
Xinghao Pan · Maximilian Lam · Stephen Tu · Dimitris Papailiopoulos · Ce Zhang · Michael Jordan · Kannan Ramchandran · Christopher Ré · Benjamin Recht -
2016 Poster: Sub-sampled Newton Methods with Non-uniform Sampling »
Peng Xu · Jiyan Yang · Farbod Roosta-Khorasani · Christopher Ré · Michael Mahoney -
2015 Poster: Asynchronous stochastic convex optimization: the noise is in the noise and SGD don't care »
Sorathan Chaturapruek · John Duchi · Christopher Ré -
2015 Poster: Parallel Correlation Clustering on Big Graphs »
Xinghao Pan · Dimitris Papailiopoulos · Samet Oymak · Benjamin Recht · Kannan Ramchandran · Michael Jordan -
2015 Poster: Rapidly Mixing Gibbs Sampling for a Class of Factor Graphs Using Hierarchy Width »
Christopher M De Sa · Ce Zhang · Kunle Olukotun · Christopher Ré -
2015 Spotlight: Rapidly Mixing Gibbs Sampling for a Class of Factor Graphs Using Hierarchy Width »
Christopher M De Sa · Ce Zhang · Kunle Olukotun · Christopher Ré · Christopher Ré -
2015 Poster: Taming the Wild: A Unified Analysis of Hogwild-Style Algorithms »
Christopher M De Sa · Ce Zhang · Kunle Olukotun · Christopher Ré · Christopher Ré -
2014 Workshop: 4th Workshop on Automated Knowledge Base Construction (AKBC) »
Sameer Singh · Fabian M Suchanek · Sebastian Riedel · Partha Pratim Talukdar · Kevin Murphy · Christopher Ré · William Cohen · Tom Mitchell · Andrew McCallum · Jason E Weston · Ramanathan Guha · Boyan Onyshkevych · Hoifung Poon · Oren Etzioni · Ari Kobren · Arvind Neelakantan · Peter Clark -
2014 Poster: Beyond the Birkhoff Polytope: Convex Relaxations for Vector Permutation Problems »
Cong Han Lim · Stephen Wright -
2014 Poster: Parallel Feature Selection Inspired by Group Testing »
Yingbo Zhou · Utkarsh Porwal · Ce Zhang · Hung Q Ngo · XuanLong Nguyen · Christopher Ré · Venu Govindaraju -
2013 Workshop: Big Learning : Advances in Algorithms and Data Management »
Xinghao Pan · Haijie Gu · Joseph Gonzalez · Sameer Singh · Yucheng Low · Joseph Hellerstein · Derek G Murray · Raghu Ramakrishnan · Michael Jordan · Christopher Ré -
2013 Poster: An Approximate, Efficient LP Solver for LP Rounding »
Srikrishna Sridhar · Stephen Wright · Christopher Re · Ji Liu · Victor Bittorf · Ce Zhang -
2012 Workshop: Log-Linear Models »
Dimitri Kanevsky · Tony Jebara · Li Deng · Stephen Wright · Georg Heigold · Avishy Carmi -
2011 Workshop: Optimization for Machine Learning »
Suvrit Sra · Stephen Wright · Sebastian Nowozin -
2011 Poster: Hogwild!: A Lock-Free Approach to Parallelizing Stochastic Gradient Descent »
Benjamin Recht · Christopher Re · Stephen Wright · Feng Niu -
2010 Workshop: Optimization for Machine Learning »
Suvrit Sra · Sebastian Nowozin · Stephen Wright -
2010 Tutorial: Optimization Algorithms in Machine Learning »
Stephen Wright -
2009 Workshop: Optimization for Machine Learning »
Sebastian Nowozin · Suvrit Sra · S.V.N Vishwanthan · Stephen Wright