Crystal Diffusion Variational Autoencoder for Periodic Material Generation
Tian Xie · Xiang Fu · Octavian Ganea · Regina Barzilay · Tommi Jaakkola

Generating the periodic structure of stable materials is a long-standing challenge for the material design community. This task is difficult because stable materials only exist in a low-dimensional subspace of all possible periodic arrangements of atoms: 1) the coordinates must lie in the local energy minimum defined by quantum mechanics, and 2) different atom types have complex, yet specific bonding preferences. Existing methods fail to incorporate these factors and often lack proper invariances. We propose a Crystal Diffusion Variational Autoencoder (CDVAE) that captures the physical inductive bias of material stability. By learning from the data distribution of stable materials, the decoder generates materials in a diffusion process that moves atomic coordinates towards a lower energy state and updates atom types to satisfy bonding preferences between neighbors. Our model also explicitly encodes interactions across periodic boundaries and respects permutation, translation, rotation, and periodic invariances. We generate significantly more realistic materials than past methods in two tasks: 1) reconstructing the input structure, and 2) generating valid, diverse, and realistic materials. Our contribution also includes the creation of several standard datasets and evaluation metrics for the broader machine learning community.
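To make the periodic and translation invariances mentioned in the abstract concrete, the following is a minimal illustrative sketch, not the CDVAE implementation. It assumes a cubic unit cell and atoms given in fractional coordinates; the names `wrap`, `min_image_distance`, and `cell_length` are hypothetical and chosen only for this example. It shows that an interatomic distance computed across the periodic boundary does not change when every atom is shifted by the same vector and wrapped back into the cell.

```python
import math


def wrap(frac):
    """Map fractional coordinates into the unit cell [0, 1)."""
    return [x - math.floor(x) for x in frac]


def min_image_distance(frac_a, frac_b, cell_length=1.0):
    """Distance between two atoms in a cubic periodic cell.

    Uses the minimum-image convention: each fractional difference is
    moved to the nearest periodic image before scaling by the cell
    edge length. Assumes a cubic cell (an assumption of this sketch).
    """
    sq = 0.0
    for xa, xb in zip(frac_a, frac_b):
        d = xa - xb
        d -= round(d)  # nearest periodic image along this axis
        sq += (d * cell_length) ** 2
    return math.sqrt(sq)


# Two atoms that are close only when the periodic boundary is considered.
a = [0.05, 0.50, 0.50]
b = [0.95, 0.50, 0.50]

# Translate both atoms by the same arbitrary vector, then wrap into the cell.
shift = [0.30, -1.20, 2.70]
a_shifted = wrap([x + s for x, s in zip(a, shift)])
b_shifted = wrap([x + s for x, s in zip(b, shift)])

d_original = min_image_distance(a, b, cell_length=4.0)
d_shifted = min_image_distance(a_shifted, b_shifted, cell_length=4.0)
```

Here `d_original` and `d_shifted` agree up to floating-point error, and both reflect the short distance across the cell boundary rather than the longer one through the cell interior. A model built only on such distances inherits translation and periodic invariance; a general (non-cubic) lattice would need the full lattice matrix and a search over neighboring images.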

Author Information

Tian Xie (Massachusetts Institute of Technology)
Xiang Fu (MIT)
Octavian Ganea (MIT)
Regina Barzilay (Massachusetts Institute of Technology)
Tommi Jaakkola (MIT)

Tommi Jaakkola is a professor of Electrical Engineering and Computer Science at MIT. He received an M.Sc. degree in theoretical physics from Helsinki University of Technology and a Ph.D. in computational neuroscience from MIT. Following a Sloan postdoctoral fellowship in computational molecular biology, he joined the MIT faculty in 1998. His research interests include statistical inference, graphical models, and large-scale modern estimation problems with predominantly incomplete data.
