Workshop
|
|
Revisiting the Noise Model of SGD
Barak Battash · Ofir Lindenbaum
|
|
Workshop
|
|
The Noise Geometry of Stochastic Gradient Descent: A Quantitative and Analytical Characterization
Mingze Wang · Lei Wu
|
|
Poster
|
Wed 8:45
|
Implicit Bias of (Stochastic) Gradient Descent for Rank-1 Linear Neural Network
Bochen Lyu · Zhanxing Zhu
|
|
Poster
|
Tue 15:15
|
Universal Gradient Descent Ascent Method for Nonconvex-Nonconcave Minimax Optimization
Taoli Zheng · Linglingzhi Zhu · Anthony Man-Cho So · Jose Blanchet · Jiajin Li
|
|
Workshop
|
Fri 7:00
|
DoG is SGD’s best friend: toward tuning-free stochastic optimization
Yair Carmon
|
|
Workshop
|
|
On the Parallel Complexity of Multilevel Monte Carlo in Stochastic Gradient Descent
Kei Ishikawa
|
|
Poster
|
Wed 8:45
|
Implicit Bias of Gradient Descent for Two-layer ReLU and Leaky ReLU Networks on Nearly-orthogonal Data
Yiwen Kou · Zixiang Chen · Quanquan Gu
|
|
Poster
|
Thu 8:45
|
Complex-valued Neurons Can Learn More but Slower than Real-valued Neurons via Gradient Descent
Jin-Hui Wu · Shao-Qun Zhang · Yuan Jiang · Zhi-Hua Zhou
|
|
Workshop
|
|
Accelerated gradient descent: A guaranteed bound for a heuristic restart strategy
Walaa Moursi · Stephen Vavasis · Viktor Pavlovic
|
|
Workshop
|
|
Improved Stein Variational Gradient Descent with Importance Weights
Lukang Sun · Peter Richtarik
|
|
Poster
|
Wed 8:45
|
Transformers learn to implement preconditioned gradient descent for in-context learning
Kwangjun Ahn · Xiang Cheng · Hadi Daneshmand · Suvrit Sra
|
|
Workshop
|
|
GradTree: Learning Axis-Aligned Decision Trees with Gradient Descent
Sascha Marton · Stefan Lüdtke · Christian Bartelt · Heiner Stuckenschmidt
|
|