The loss landscape of deep learning seems hopelessly complex, and most structures discovered in it have been empirical observations. The landscape is determined by the model architecture and the dataset, yet the interplay between architecture and the structure of local minima is not well understood. We uncover a key part of this relationship. First, we show that even nonlinear neural networks admit large groups of continuous symmetries that leave the loss invariant. These symmetries imply that many local minima are in fact valleys: they contain directions in parameter space along which the loss remains constant. Second, we show that these symmetries give rise to quantities that are conserved during gradient flow, and we derive their explicit form using an analog of Noether's theorem for gradient flow. The conserved quantities allow us to define coordinates along the valley of a local minimum. Finally, the symmetries can be used to generate an ensemble of trained models from a single trained model.
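As a concrete illustration of both claims, consider the well-known positive rescaling symmetry of a two-layer ReLU network: scaling a hidden neuron's incoming weights by c > 0 and its outgoing weights by 1/c leaves the output (and hence the loss) unchanged, and the per-neuron quantity ||w_in||² − ||w_out||² is conserved under gradient flow. The sketch below checks both numerically on toy data; all names, shapes, and the step count are illustrative assumptions, and small-step gradient descent stands in for the continuous flow.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))            # toy inputs (hypothetical data)
Y = rng.normal(size=(32, 2))            # toy targets
W1 = 0.5 * rng.normal(size=(5, 4))      # hidden-layer weights
W2 = 0.5 * rng.normal(size=(2, 5))      # output-layer weights

def loss(W1, W2):
    H = np.maximum(W1 @ X.T, 0.0)       # ReLU hidden activations
    return 0.5 * np.mean((W2 @ H - Y.T) ** 2)

# Continuous symmetry: per-neuron rescaling W1 -> diag(c) W1, W2 -> W2 diag(1/c)
# leaves the loss invariant, because relu(c*z) = c*relu(z) for c > 0.
c = np.exp(rng.normal(size=5))          # arbitrary positive scales
loss_before = loss(W1, W2)
loss_after = loss(np.diag(c) @ W1, W2 @ np.diag(1.0 / c))
print(abs(loss_before - loss_after))    # tiny (floating-point level)

# Conserved quantity: Q_i = ||row i of W1||^2 - ||column i of W2||^2
# stays constant along gradient flow on this loss.
def Q(W1, W2):
    return np.sum(W1**2, axis=1) - np.sum(W2**2, axis=0)

def grads(W1, W2):
    Z = W1 @ X.T
    H = np.maximum(Z, 0.0)
    R = W2 @ H - Y.T
    dR = R / R.size                     # gradient of 0.5 * mean(R**2)
    dW2 = dR @ H.T
    dZ = (W2.T @ dR) * (Z > 0)          # backprop through ReLU
    dW1 = dZ @ X
    return dW1, dW2

Q0 = Q(W1, W2)
lr = 1e-3
for _ in range(1000):                   # small steps approximate the flow
    dW1, dW2 = grads(W1, W2)
    W1 -= lr * dW1
    W2 -= lr * dW2
drift = np.max(np.abs(Q(W1, W2) - Q0))
print(drift)                            # small: Q is (approximately) conserved
```

The symmetry check is exact up to rounding, while the conservation check incurs a small discretization error that shrinks with the learning rate; under the continuous-time flow the drift would vanish.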