Recurrent Neural Networks (RNNs) frequently exhibit complicated dynamics, and their sensitivity to the initialization process often renders them notoriously hard to train. Recent works have shed light on such phenomena by analyzing when exploding or vanishing gradients may occur, either of which is detrimental to training dynamics. In this paper, we point to a formal connection between RNNs and chaotic dynamical systems and prove a qualitatively stronger phenomenon about RNNs than what exploding gradients seem to suggest. Our main result proves that under standard initialization (e.g., He, Xavier), RNNs will exhibit \textit{Li-Yorke chaos} with \textit{constant} probability, \textit{independent} of the network's width. This explains the experimentally observed phenomenon of \textit{scrambling}, under which trajectories of nearby points may appear to be arbitrarily close during some timesteps, yet will be far away in later timesteps. In stark contrast to their feedforward counterparts, we show that chaotic behavior in RNNs is preserved under small perturbations and that their expressive power remains exponential in the number of feedback iterations. Our technical arguments rely on viewing RNNs as random walks under non-linear activations, and on studying the existence of certain types of higher-order fixed points, called \textit{periodic points}, in order to establish phase transitions from order to chaos.
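The scrambling behavior described in the abstract can be illustrated numerically. Below is a minimal sketch, not the paper's formal construction: it iterates a randomly initialized autonomous tanh RNN from two nearby hidden states and tracks their distance over time. The width, the He-style initialization scale, the zero input, and the random seed are all illustrative assumptions.

```python
import numpy as np

# Minimal sketch of "scrambling" (illustrative assumptions: width 64,
# He-style variance 2/width, tanh activation, zero input, fixed seed).
# Not the paper's formal construction, just a toy demonstration.
rng = np.random.default_rng(0)
width = 64
W = rng.normal(0.0, np.sqrt(2.0 / width), size=(width, width))

h = rng.normal(size=width)                   # reference hidden state
h_pert = h + 1e-6 * rng.normal(size=width)   # nearby perturbed copy

for t in range(61):
    if t % 10 == 0:
        gap = np.linalg.norm(h - h_pert)
        print(f"t={t:3d}  ||h - h'|| = {gap:.3e}")
    # Autonomous RNN update: next hidden state depends only on the previous one.
    h = np.tanh(W @ h)
    h_pert = np.tanh(W @ h_pert)
```

Depending on the random draw, the printed gap may repeatedly shrink and grow rather than settling, which is the qualitative signature of scrambling that the paper formalizes via Li-Yorke chaos.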
Author Information
Vaggos Chatziafratis (University of California, Santa Cruz)
Ioannis Panageas (UC Irvine)
Clayton Sanford (Columbia University)
Stelios Stavroulakis (UCI)
More from the Same Authors
- 2022 : Improving the predictions of ML-corrected climate models with novelty detection »
  Clayton Sanford · Anna Kwa · Oliver Watt-Meyer · Spencer K. Clark · Noah Brenowitz · Jeremy McGibbon · Christopher S. Bretherton
- 2022 Spotlight: On Scrambling Phenomena for Randomly Initialized Recurrent Networks »
  Vaggos Chatziafratis · Ioannis Panageas · Clayton Sanford · Stelios Stavroulakis
- 2022 Poster: Optimistic Mirror Descent Either Converges to Nash or to Strong Coarse Correlated Equilibria in Bimatrix Games »
  Ioannis Anagnostides · Gabriele Farina · Ioannis Panageas · Tuomas Sandholm
- 2022 Poster: Learning single-index models with shallow neural networks »
  Alberto Bietti · Joan Bruna · Clayton Sanford · Min Jae Song
- 2021 Poster: Support vector machines and linear regression coincide with very high-dimensional features »
  Navid Ardeshir · Clayton Sanford · Daniel Hsu
- 2020 Poster: Fast Convergence of Langevin Dynamics on Manifold: Geodesics meet Log-Sobolev »
  Xiao Wang · Qi Lei · Ioannis Panageas