Poster
Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian
Jack Parker-Holder · Luke Metz · Cinjon Resnick · Hengyuan Hu · Adam Lerer · Alistair Letcher · Alexander Peysakhovich · Aldo Pacchiano · Jakob Foerster

Thu Dec 10 09:00 PM -- 11:00 PM (PST) @ Poster Session 6 #1851

Over the last decade, a single algorithm has changed many facets of our lives: Stochastic Gradient Descent (SGD). In the era of ever-decreasing loss functions, SGD and its various offspring have become the go-to optimization tool in machine learning and are a key component of the success of deep neural networks (DNNs). While SGD is guaranteed to converge to a local optimum (under loose assumptions), in some cases it may matter which local optimum is found, and this is often context-dependent. Examples frequently arise in machine learning, from shape-versus-texture features to ensemble methods and zero-shot coordination. In these settings, there are desired solutions which SGD on 'standard' loss functions will not find, since it instead converges to the 'easy' solutions. In this paper, we present a different approach. Rather than following the gradient, which corresponds to a locally greedy direction, we instead follow the eigenvectors of the Hessian. By iteratively following and branching amongst the ridges, we effectively span the loss surface to find qualitatively different solutions. We show both theoretically and experimentally that our method, called Ridge Rider (RR), offers a promising direction for a variety of challenging problems.
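To make the core idea concrete, the following is a minimal toy sketch, not the authors' implementation of RR. It assumes a hand-picked 2D loss f(x, y) = (x^2 - 1)^2 + y^2 with a saddle at the origin, and every function name, step size, and stopping rule here is a hypothetical choice made for illustration. Gradient descent started at the saddle does not move, because the gradient is zero there. Following the Hessian eigenvector with negative eigenvalue in both of its directions, and then finishing with plain gradient descent, reaches the two distinct minima.

```python
import math

def loss(p):
    x, y = p
    return (x * x - 1.0) ** 2 + y * y

def grad(p):
    x, y = p
    return (4.0 * x * (x * x - 1.0), 2.0 * y)

def hessian(p):
    x, _ = p
    return ((12.0 * x * x - 4.0, 0.0), (0.0, 2.0))

def eigh_2x2(h):
    """Eigenpairs of a symmetric 2x2 matrix, smallest eigenvalue first."""
    (a, b), (_, c) = h
    if abs(b) < 1e-12:  # already diagonal
        pairs = [(a, (1.0, 0.0)), (c, (0.0, 1.0))]
    else:
        mean, radius = (a + c) / 2.0, math.hypot((a - c) / 2.0, b)
        pairs = []
        for lam in (mean - radius, mean + radius):
            n = math.hypot(lam - c, b)
            pairs.append((lam, ((lam - c) / n, b / n)))
    return sorted(pairs, key=lambda pair: pair[0])

def dot(u, v):
    return u[0] * v[0] + u[1] * v[1]

def ride_ridge(start, direction, step=0.05, max_steps=200):
    """Follow the eigenvector best aligned with `direction` while its eigenvalue is negative."""
    p = start
    for _ in range(max_steps):
        # Track the same ridge by picking the eigenvector closest to the last one.
        lam, v = max(eigh_2x2(hessian(p)), key=lambda pair: abs(dot(pair[1], direction)))
        if lam >= 0.0:  # curvature is no longer negative, so stop riding
            break
        # Eigenvectors are only defined up to sign; keep moving the same way.
        if dot(v, direction) < 0.0:
            v = (-v[0], -v[1])
        p = (p[0] + step * v[0], p[1] + step * v[1])
        direction = v
    return p

def gradient_descent(p, lr=0.05, steps=500):
    for _ in range(steps):
        g = grad(p)
        p = (p[0] - lr * g[0], p[1] - lr * g[1])
    return p

saddle = (0.0, 0.0)
stuck = gradient_descent(saddle)           # zero gradient: never leaves the saddle
_, v = eigh_2x2(hessian(saddle))[0]        # direction of most negative curvature
solutions = [gradient_descent(ride_ridge(saddle, d)) for d in (v, (-v[0], -v[1]))]

print("gradient descent from saddle:", stuck)
print("solutions found by riding both ridge directions:", solutions)
```

The sketch omits the branching over multiple ridges that the abstract describes; it only illustrates why an eigenvector direction can reach solutions that a gradient step from the same point cannot.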

Author Information

Jack Parker-Holder (University of Oxford)
Luke Metz (Google Brain)
Cinjon Resnick (NYU)
Hengyuan Hu (Facebook)
Adam Lerer (Facebook AI Research)
Alistair Letcher
Alexander Peysakhovich (Facebook)
Aldo Pacchiano (UC Berkeley)
Jakob Foerster (Facebook AI Research)

Jakob Foerster is a PhD student in AI at the University of Oxford under the supervision of Shimon Whiteson and Nando de Freitas. Using deep reinforcement learning, he studies the emergence of communication in multi-agent AI systems. Prior to his PhD, Jakob spent four years working at Google and Goldman Sachs. He has also worked on a number of research projects in systems neuroscience, including work at MIT and the Weizmann Institute.
