Abstract
Generative models trained on internet-scale data are capable of generating novel yet highly realistic texts, images, and videos. A natural next question is whether these models can advance science, for example by generating novel stable materials. Traditionally, models with explicit structures (e.g., graphs) have been used to model structural relationships in scientific data (e.g., atoms and bonds in crystals), but generating structures explicitly might be difficult to scale to large and complex systems. Another challenge for generative models of materials lies in the mismatch between generative modeling metrics and downstream applications. For instance, common metrics such as reconstruction error do not correlate well with the downstream goal of discovering novel stable materials. In this work, we tackle the scalability challenge by first developing a unified crystal representation that can effectively represent any crystal structure (UniMat), and then training a diffusion probabilistic model on the UniMat representations. Our empirical results suggest that, despite the lack of explicit structure modeling, UniMat can generate high-fidelity crystal structures from larger and more complex chemical systems, outperforming previous graph-based approaches under various generative modeling metrics. To better connect the generation quality of materials to downstream applications such as discovering novel stable materials, we propose using decomposition energy from Density Functional Theory (DFT) calculations and the resulting stability with respect to convex hulls as additional evaluation metrics for generative models of materials. Lastly, we show that conditional generation with UniMat can scale to previously established crystal datasets with up to millions of crystal structures, and outperform random structure search (the current leading method for structure discovery) in discovering new stable materials.
Authors
Sherry Yang, Amil Merchant, Dale Schuurmans, Igor Mordatch, Ekin Dogus Cubuk, Pieter Abbeel*, Ruben Cho*
* External author
Venue
ICLR 2024