
Path-Normalized Optimization of Recurrent Neural Networks with ReLU Activations
Behnam Neyshabur · Yuhuai Wu · Russ Salakhutdinov · Nati Srebro

Tue Dec 06 09:00 AM -- 12:30 PM (PST) @ Area 5+6+7+8 #42

We investigate the parameter-space geometry of recurrent neural networks (RNNs) and develop an adaptation of the path-SGD optimization method, attuned to this geometry, that can learn plain RNNs with ReLU activations. On several datasets that require capturing long-term dependency structure, we show that path-SGD can significantly improve the trainability of ReLU RNNs compared to RNNs trained with SGD, even under various recently suggested initialization schemes.
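The core idea behind path-SGD is to rescale each weight's gradient by a per-edge "path scale": the sum, over all input-to-output paths through that edge, of the product of the squared weights on the path's other edges. The sketch below illustrates this update for a plain one-hidden-layer ReLU network, not the authors' RNN adaptation; the function name, shapes, and hyperparameters are illustrative assumptions.

```python
import numpy as np

def path_sgd_step(W1, W2, g1, g2, lr=0.1, eps=1e-8):
    """One path-normalized update for a two-layer ReLU net (sketch).

    For an edge W1[i, j], every input->hidden->output path through it
    contributes the squared weight of its other edge, so its path scale
    is sum_k W2[j, k]**2; symmetrically for W2[j, k] it is
    sum_i W1[i, j]**2. Each gradient entry is divided by its scale
    before the SGD step (eps guards against division by zero).
    """
    kappa1 = np.sum(W2 ** 2, axis=1)[None, :]   # scale for W1[i, j]
    kappa2 = np.sum(W1 ** 2, axis=0)[:, None]   # scale for W2[j, k]
    W1_new = W1 - lr * g1 / (kappa1 + eps)
    W2_new = W2 - lr * g2 / (kappa2 + eps)
    return W1_new, W2_new
```

Because the path scales shrink updates on edges that feed into large downstream weights, the update is approximately invariant to the node-wise rescalings that leave a ReLU network's function unchanged, which is the geometric property the abstract refers to.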

Author Information

Behnam Neyshabur (TTI-Chicago)
Yuhuai Wu (University of Toronto)
Russ Salakhutdinov (University of Toronto)
Nati Srebro (TTI-Chicago)