Applied ML
Graph Neural Networks for Molecules
Message-passing neural networks treat molecules as graphs of atoms and bonds. Variants like SchNet, D-MPNN, DimeNet, and NequIP add 3D geometry, edge messages, and rotational equivariance.
Why This Matters
Molecules are graphs: atoms are nodes, bonds are edges, 3D coordinates label the nodes. Property prediction (solubility, toxicity, atomization energy, binding affinity) used to rely on hand-crafted descriptors and kernel regression. The Gilmer et al. (2017) message-passing reformulation showed that several earlier graph models (Duvenaud fingerprints, GG-NN, interaction networks) are special cases of one update rule, and that learning the message function on raw atom and bond features beats the hand-crafted pipelines on QM9.
The downstream stakes are not academic. Force fields trained on density functional theory data drive molecular dynamics simulations that would otherwise be intractable; binding-affinity models steer virtual screening in drug discovery. An atomization energy that is off by 1 kcal/mol can flip the predicted reaction pathway. A wrong toxicity prediction wastes a year of wet-lab work.
Core Ideas
A message-passing neural network (MPNN) maintains a hidden state $h_v^t$ on each atom $v$ at step $t$. At each step, every atom receives messages from its neighbors and updates its state:

$$m_v^{t+1} = \sum_{w \in N(v)} M_t\left(h_v^t, h_w^t, e_{vw}\right), \qquad h_v^{t+1} = U_t\left(h_v^t, m_v^{t+1}\right)$$

After $T$ rounds, a readout $R$ aggregates atom states into a graph-level prediction $\hat{y} = R\left(\{h_v^T : v \in G\}\right)$. Different choices for $M_t$, $U_t$, and $R$ recover GG-NN, GraphSAGE, GCN, and the Duvenaud fingerprint.
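A minimal sketch of one such round in PyTorch, using a GRU-style update in the spirit of GG-NN; the class name, the toy tensors, and the layer sizes are illustrative choices, not the API of any particular library:

```python
import torch
import torch.nn as nn

class MPNNLayer(nn.Module):
    """One round of message passing: m_v = sum_w M(h_v, h_w, e_vw); h_v = U(h_v, m_v)."""
    def __init__(self, hidden_dim: int, edge_dim: int):
        super().__init__()
        # M_t: learned message function over (receiver state, sender state, bond features)
        self.message_mlp = nn.Sequential(
            nn.Linear(2 * hidden_dim + edge_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )
        # U_t: GRU-style state update, as in GG-NN
        self.update = nn.GRUCell(hidden_dim, hidden_dim)

    def forward(self, h, edge_index, edge_attr):
        src, dst = edge_index                      # edge_index: [2, num_edges]
        msg_in = torch.cat([h[dst], h[src], edge_attr], dim=-1)
        messages = self.message_mlp(msg_in)
        # Sum incoming messages per destination atom (permutation-invariant)
        agg = torch.zeros_like(h).index_add_(0, dst, messages)
        return self.update(agg, h)

# Toy molecule: 3 atoms, 2 undirected bonds stored as 4 directed edges
h = torch.randn(3, 16)
edge_index = torch.tensor([[0, 1, 1, 2], [1, 0, 2, 1]])
edge_attr = torch.randn(4, 4)
layer = MPNNLayer(hidden_dim=16, edge_dim=4)
h = layer(h, edge_index, edge_attr)

# Readout R: sum-pool atom states, then a small regression head
readout = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 1))
y_hat = readout(h.sum(dim=0))  # graph-level prediction
```

The sum aggregation and sum-pool readout are what make the whole pipeline invariant to atom ordering.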
SchNet (Schütt et al. 2018, J. Chem. Phys. 148) replaces discrete bond features with continuous-filter convolutions over interatomic distances, making the model differentiable in atomic coordinates and hence usable as a force field. D-MPNN (Yang et al. 2019, J. Chem. Inf. Model. 59) passes messages along directed bonds rather than between nodes, which removes the "messages bouncing back" pathology of vertex-centric MPNNs and is the default in ChemProp.
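A hedged sketch of a continuous-filter convolution in the SchNet spirit. The Gaussian RBF expansion, cutoff, and layer sizes below are illustrative choices rather than SchNet's published hyperparameters; the point is that the filter is a smooth function of distance, so forces come out of autograd:

```python
import torch
import torch.nn as nn

class CFConv(nn.Module):
    """Continuous-filter convolution: filters depend smoothly on interatomic
    distance, so the output is differentiable in atomic coordinates."""
    def __init__(self, hidden_dim=64, n_rbf=20, cutoff=5.0):
        super().__init__()
        # Gaussian RBF centers from 0 to the cutoff (spacing is an illustrative choice)
        self.centers = torch.linspace(0.0, cutoff, n_rbf)
        self.gamma = 10.0
        self.filter_net = nn.Sequential(
            nn.Linear(n_rbf, hidden_dim), nn.Softplus(),
            nn.Linear(hidden_dim, hidden_dim),
        )

    def forward(self, h, pos, edge_index):
        src, dst = edge_index
        dist = (pos[src] - pos[dst]).norm(dim=-1, keepdim=True)   # [E, 1]
        rbf = torch.exp(-self.gamma * (dist - self.centers) ** 2)  # [E, n_rbf]
        W = self.filter_net(rbf)                                   # distance-dependent filter
        messages = h[src] * W                                      # filter * neighbor features
        return torch.zeros_like(h).index_add_(0, dst, messages)

pos = torch.randn(3, 3, requires_grad=True)
h = torch.randn(3, 64)
edge_index = torch.tensor([[0, 1, 1, 2], [1, 0, 2, 1]])
# A stand-in scalar "energy" just to demonstrate forces via autograd
energy = CFConv()(h, pos, edge_index).sum()
forces = -torch.autograd.grad(energy, pos)[0]
```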
The expressive ceiling matters. Standard MPNNs are at most as powerful as the 1-Weisfeiler-Lehman (1-WL) graph isomorphism test (Xu et al. 2019, GIN paper): they cannot distinguish certain pairs of non-isomorphic molecules, including some regioisomers. This motivates higher-order GNNs and 3D-aware models. DimeNet (Klicpera et al. 2020) adds bond-angle messages; EGNN (Satorras et al. 2021) and NequIP (Batzner et al. 2022, Nat. Commun. 13) build E(n)- or E(3)-equivariance directly into the message function, so predicted forces transform correctly under rotation and reflection without data augmentation.
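A simplified layer in the spirit of EGNN, to show how the equivariance is obtained structurally. This is a sketch of the idea, not the published architecture: scalar features only ever see rotation-invariant quantities, and coordinates are only ever moved along relative difference vectors:

```python
import torch
import torch.nn as nn

class EGNNLayer(nn.Module):
    """E(n)-equivariant layer (simplified, after Satorras et al. 2021).
    Rotating the input coordinates rotates the output coordinates."""
    def __init__(self, dim=16):
        super().__init__()
        self.phi_e = nn.Sequential(nn.Linear(2 * dim + 1, dim), nn.SiLU(), nn.Linear(dim, dim))
        self.phi_x = nn.Sequential(nn.Linear(dim, dim), nn.SiLU(), nn.Linear(dim, 1))
        self.phi_h = nn.Sequential(nn.Linear(2 * dim, dim), nn.SiLU(), nn.Linear(dim, dim))

    def forward(self, h, x, edge_index):
        src, dst = edge_index
        rel = x[dst] - x[src]                                # equivariant difference vectors
        d2 = (rel ** 2).sum(-1, keepdim=True)                # invariant squared distance
        m = self.phi_e(torch.cat([h[dst], h[src], d2], -1))  # messages see only invariants
        # Coordinate update: scalar weight times relative vector, which stays equivariant
        x = x + torch.zeros_like(x).index_add_(0, dst, rel * self.phi_x(m))
        agg = torch.zeros_like(h).index_add_(0, dst, m)
        return self.phi_h(torch.cat([h, agg], -1)), x

h, x = torch.randn(3, 16), torch.randn(3, 3)
edge_index = torch.tensor([[0, 1, 1, 2], [1, 0, 2, 1]])
h, x = EGNNLayer()(h, x, edge_index)
```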
QM9 (134k small organic molecules with DFT-computed properties) and OC20 (catalysis; roughly 1.3M DFT relaxations comprising over 130M single-point structures) are the standard benchmarks. NequIP and its successors (Allegro, MACE) reach chemical accuracy on QM9 atomization energies with thousands of training points, where Coulomb-matrix kernel ridge regression needs orders of magnitude more.
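For orientation, QM9 ships with PyTorch Geometric; a usage sketch (the target column for U0 follows PyG's documented ordering and is worth verifying against the installed release):

```python
from torch_geometric.datasets import QM9

dataset = QM9(root="data/qm9")  # downloads on first use; ~130k graphs after filtering
mol = dataset[0]
print(mol.z)        # atomic numbers (node labels)
print(mol.pos)      # 3D coordinates, one row per atom
print(mol.y[:, 7])  # U0, internal energy at 0 K (index 7 per PyG's target ordering)
```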
Common Confusions
MPNNs are not strictly more expressive than fingerprints
A vanilla MPNN bounded by 1-WL cannot separate certain molecule pairs that ECFP4 with a large enough hash space can, and vice versa. The empirical advantage of MPNNs comes from learned, task-specific features trained end to end on the target property, not from strict dominance in expressive power.
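To make the 1-WL bound concrete, here is a minimal color-refinement routine in plain Python (an illustrative sketch, not any library's implementation). A hexagon and a disjoint pair of triangles receive identical color histograms at every round, so no vanilla MPNN can tell them apart from topology and atom types alone:

```python
from collections import Counter

def wl_colors(adj, labels, rounds=3):
    """1-Weisfeiler-Lehman color refinement. adj: {node: [neighbors]},
    labels: initial node colors (e.g., atomic numbers)."""
    colors = dict(labels)
    for _ in range(rounds):
        # New color = hash of (own color, sorted multiset of neighbor colors)
        colors = {v: hash((colors[v], tuple(sorted(colors[w] for w in adj[v]))))
                  for v in adj}
    return Counter(colors.values())

# Both graphs are 2-regular with identical node labels: 1-WL cannot separate them.
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}
labels = {i: 6 for i in range(6)}  # say, all carbons
print(wl_colors(hexagon, labels) == wl_colors(two_triangles, labels))  # True
```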
Equivariance is about symmetry of the function, not the data
Augmenting training data with random rotations does not give the same inductive bias as a structurally equivariant network. A non-equivariant model can still memorize a rotation-augmented training set yet fail out-of-distribution; an equivariant model satisfies the symmetry exactly for every input.
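The difference is checkable numerically. Below, a toy E(3)-equivariant map built from invariant weights and difference vectors commutes with an arbitrary orthogonal transform for every input, something augmentation can only encourage statistically (the function itself is a made-up example, not a model from the literature):

```python
import torch

def equivariant_displacement(x):
    """Toy E(3)-equivariant map: weight each pairwise difference vector by a
    function of its rotation-invariant length, then sum per atom."""
    rel = x[:, None, :] - x[None, :, :]               # [N, N, 3] difference vectors
    w = torch.exp(-(rel ** 2).sum(-1, keepdim=True))  # invariant weights
    return (w * rel).sum(dim=1)

# Random orthogonal matrix (rotation or reflection) via QR decomposition
Q, _ = torch.linalg.qr(torch.randn(3, 3))
x = torch.randn(5, 3)
# Exact symmetry: f(x Q^T) == f(x) Q^T holds for every input, not just training data
assert torch.allclose(equivariant_displacement(x @ Q.T),
                      equivariant_displacement(x) @ Q.T, atol=1e-5)
```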
References
Gilmer 2017
Gilmer, Schoenholz, Riley, Vinyals, Dahl, "Neural Message Passing for Quantum Chemistry," ICML 2017, arXiv:1704.01212.
Yang 2019 D-MPNN
Yang et al., "Analyzing Learned Molecular Representations for Property Prediction," J. Chem. Inf. Model. 59(8), 2019, pp. 3370-3388 (ChemProp).
Schütt 2018 SchNet
Schütt et al., "SchNet - A deep learning architecture for molecules and materials," J. Chem. Phys. 148(24), 2018, 241722, arXiv:1712.06113.
Klicpera 2020 DimeNet
Klicpera, Groß, Günnemann, "Directional Message Passing for Molecular Graphs," ICLR 2020, arXiv:2003.03123.
Batzner 2022 NequIP
Batzner et al., "E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials," Nat. Commun. 13:2453, 2022, arXiv:2101.03164.
Xu 2019 GIN
Xu, Hu, Leskovec, Jegelka, "How Powerful are Graph Neural Networks?" ICLR 2019, arXiv:1810.00826.
Last reviewed: April 18, 2026
Prerequisites
Foundations this topic depends on.
- Graph Neural Networks (Layer 3)
- Convolutional Neural Networks (Layer 3)
- Feedforward Networks and Backpropagation (Layer 2)
- Differentiation in Rn (Layer 0A)
- Sets, Functions, and Relations (Layer 0A)
- Basic Logic and Proof Techniques (Layer 0A)
- Matrix Calculus (Layer 1)
- The Jacobian Matrix (Layer 0A)
- The Hessian Matrix (Layer 0A)
- Matrix Operations and Properties (Layer 0A)
- Eigenvalues and Eigenvectors (Layer 0A)
- Activation Functions (Layer 1)
- Convex Optimization Basics (Layer 1)
- Vectors, Matrices, and Linear Maps (Layer 0A)
- Equivariant Deep Learning (Layer 4)