3D Pre-training for Molecular Property Prediction

3D Infomax improves GNNs for Molecular Property Prediction

Molecular property prediction is one of the fastest-growing applications of deep learning with critical real-world impacts. Including 3D molecular structure as input to learned models improves their performance for many molecular tasks. However, this information is infeasible to compute at the scale required by several real-world applications. We propose pre-training a model to reason about the geometry of molecules given only their 2D molecular graphs. Using methods from self-supervised learning, we maximize the mutual information between 3D summary vectors and the representations of a graph neural network (GNN) such that they contain latent 3D information. During fine-tuning on molecules with unknown geometry, the GNN still generates implicit 3D information and can use it to improve downstream tasks. We show that 3D pre-training provides significant improvements for a wide range of properties, such as a 22% average reduction in mean absolute error (MAE) on eight quantum mechanical properties.
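To make the pre-training objective concrete, the following is a minimal sketch of one common way to maximize mutual information between paired embeddings: a contrastive InfoNCE loss. The abstract does not say which estimator is used, so the choice of InfoNCE, the cosine similarity, the temperature value, and the names `cosine` and `info_nce` are all assumptions made for illustration. The random vectors stand in for the outputs of a 2D GNN and a 3D encoder, neither of which is implemented here.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(z2d, z3d, temperature=0.1):
    """InfoNCE loss over a batch of paired 2D and 3D embeddings (hypothetical helper).

    z2d[i] and z3d[i] belong to the same molecule (positive pair);
    z3d[j] for j != i serves as a negative for z2d[i].
    """
    losses = []
    for i, anchor in enumerate(z2d):
        logits = [cosine(anchor, z) / temperature for z in z3d]
        # Numerically stable log-sum-exp over all candidates in the batch.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Negative log-probability of picking the true 3D partner.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


random.seed(0)
batch, dim = 8, 16

# Stand-in for 3D summary vectors (assumption: random Gaussian embeddings).
z3d = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(batch)]
# Aligned case: 2D embeddings are noisy copies of their 3D partners.
z2d_aligned = [[x + 0.05 * random.gauss(0, 1) for x in row] for row in z3d]
# Misaligned case: the same embeddings, each paired with the wrong molecule.
z2d_shuffled = z2d_aligned[1:] + z2d_aligned[:1]

loss_aligned = info_nce(z2d_aligned, z3d)
loss_shuffled = info_nce(z2d_shuffled, z3d)
print(f"aligned: {loss_aligned:.3f}  shuffled: {loss_shuffled:.3f}")
```

The loss is lower when each 2D embedding sits close to the 3D vector of its own molecule than when the pairing is shuffled. For a batch of size N, the quantity log(N) minus this loss is a lower bound on the mutual information between the two views, which is why minimizing the loss pushes the 2D representations to carry information about the 3D structure.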