Cryogenic electron microscopy (cryo-EM) provides a unique opportunity to study the structural heterogeneity of biomolecules. Explaining this heterogeneity with atomic models would improve our understanding of their functional mechanisms, but the size and ruggedness of the structural space present an immense challenge. In this work, we describe a heterogeneous reconstruction method based on an atomistic representation whose deformation is reduced to a handful of collective motions through normal mode analysis. Our implementation follows an encoder-decoder approach. The amplitudes of motion along the normal modes and the 2D shift between the center of the image and the center of the molecule are jointly estimated by an encoder, while a physics-based decoder aggregates the images into a representation of the heterogeneity that is readily interpretable at the atomic level. We illustrate our method on three synthetic datasets corresponding to different distributions along a simulated trajectory of adenylate kinase transitioning from its open to its closed conformation. We show for each distribution that, given enough normal modes, our approach is able to recapitulate the intermediate atomic models with atomic-level accuracy.
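To make the decoder side of this description concrete, the following is a minimal NumPy sketch, not the implementation used in this work. It assumes that the atomic coordinates are deformed linearly along a small set of normal modes and that an image is formed by summing isotropic Gaussians at the shifted (x, y) atom positions. The function names `deform` and `project`, the Gaussian atom model, the grid parameters, and the random "modes" in the usage lines are all illustrative assumptions; particle orientation, the contrast transfer function, and noise are deliberately left out.

```python
import numpy as np

def deform(reference, modes, amplitudes):
    """Displace atoms linearly along normal modes (hypothetical helper).

    reference:  (n_atoms, 3) reference coordinates
    modes:      (n_modes, n_atoms, 3) normal-mode displacement vectors
    amplitudes: (n_modes,) one amplitude per mode
    """
    # Weighted sum of the modes, added to the reference coordinates.
    return reference + np.tensordot(amplitudes, modes, axes=1)

def project(coords, shift, grid_size=32, pixel_size=1.0, sigma=1.0):
    """Render a 2D image along z as a sum of isotropic Gaussians.

    Assumption: each atom contributes a separable Gaussian centered at its
    (x, y) position plus the in-plane shift. No rotation, CTF, or noise.
    """
    axis = (np.arange(grid_size) - grid_size / 2) * pixel_size
    xy = coords[:, :2] + np.asarray(shift)            # (n_atoms, 2)
    gx = np.exp(-(axis[None, :] - xy[:, 0:1]) ** 2 / (2 * sigma ** 2))
    gy = np.exp(-(axis[None, :] - xy[:, 1:2]) ** 2 / (2 * sigma ** 2))
    # Rows index y, columns index x.
    return gy.T @ gx                                  # (grid_size, grid_size)

# Usage with placeholder data: random vectors stand in for real normal modes.
rng = np.random.default_rng(0)
reference = rng.normal(scale=3.0, size=(50, 3))
modes = rng.normal(size=(4, 50, 3))
amplitudes = np.array([0.5, -0.2, 0.0, 0.1])
image = project(deform(reference, modes, amplitudes), shift=(1.0, -2.0))
```

In an encoder-decoder setting of the kind described above, `amplitudes` and `shift` would be the per-image quantities predicted by the encoder, and the rendered image would be compared with the experimental one. In practice the modes would come from a normal mode analysis of the reference structure rather than from a random generator.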