Applied ML
Graph Neural Networks for Molecules
Message-passing neural networks treat molecules as graphs of atoms and bonds. Variants like SchNet, D-MPNN, DimeNet, and NequIP add 3D geometry, edge messages, and rotational equivariance.
Why This Matters
Molecules are graphs: atoms are nodes, bonds are edges, 3D coordinates label the nodes. Property prediction (solubility, toxicity, atomization energy, binding affinity) used to rely on hand-crafted descriptors and kernel regression. The Gilmer et al. (2017) message-passing reformulation showed that several earlier graph models (Duvenaud fingerprints, GG-NN, interaction networks) are special cases of one update rule, and that learning the message function on raw atom and bond features beats the hand-crafted pipelines on QM9.
The downstream stakes are not academic. Force fields trained on density functional theory data drive molecular dynamics simulations that would otherwise be intractable; binding-affinity models steer virtual screening in drug discovery. An atomization energy that is off by 1 kcal/mol can flip the predicted reaction pathway. A wrong toxicity prediction wastes a year of wet-lab work.
Core Ideas
A message-passing neural network (MPNN) maintains a hidden state $h_v^t$ on each atom $v$ at step $t$. At each step, every atom receives messages from its neighbors and updates its state:

$$m_v^{t+1} = \sum_{w \in N(v)} M_t\left(h_v^t, h_w^t, e_{vw}\right), \qquad h_v^{t+1} = U_t\left(h_v^t, m_v^{t+1}\right)$$

After $T$ rounds, a readout $R$ aggregates atom states into a graph-level prediction $\hat{y} = R\left(\{h_v^T : v \in G\}\right)$. Different choices for $M_t$, $U_t$, and $R$ recover GG-NN, GraphSAGE, GCN, and the Duvenaud fingerprint.
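A minimal sketch of one such round in PyTorch, using a GRU-style update in the spirit of GG-NN; the class name, the toy tensors, and the layer sizes are illustrative choices, not the API of any particular library:

```python
import torch
import torch.nn as nn

class MPNNLayer(nn.Module):
    """One round of message passing: m_v = sum_w M(h_v, h_w, e_vw); h_v = U(h_v, m_v)."""
    def __init__(self, hidden_dim: int, edge_dim: int):
        super().__init__()
        # M_t: learned message function over (receiver state, sender state, bond features)
        self.message_mlp = nn.Sequential(
            nn.Linear(2 * hidden_dim + edge_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
        )
        # U_t: GRU-style state update, as in GG-NN
        self.update = nn.GRUCell(hidden_dim, hidden_dim)

    def forward(self, h, edge_index, edge_attr):
        src, dst = edge_index                      # edge_index: [2, num_edges]
        msg_in = torch.cat([h[dst], h[src], edge_attr], dim=-1)
        messages = self.message_mlp(msg_in)
        # Sum incoming messages per destination atom (permutation-invariant)
        agg = torch.zeros_like(h).index_add_(0, dst, messages)
        return self.update(agg, h)

# Toy molecule: 3 atoms, 2 undirected bonds stored as 4 directed edges
h = torch.randn(3, 16)
edge_index = torch.tensor([[0, 1, 1, 2], [1, 0, 2, 1]])
edge_attr = torch.randn(4, 4)
layer = MPNNLayer(hidden_dim=16, edge_dim=4)
h = layer(h, edge_index, edge_attr)

# Readout R: sum-pool atom states, then a small regression head
readout = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 1))
y_hat = readout(h.sum(dim=0))  # graph-level prediction
```

The sum aggregation and sum-pool readout are what make the whole pipeline invariant to atom ordering.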
SchNet (Schütt et al. 2018, J. Chem. Phys. 148) replaces discrete bond features with continuous-filter convolutions over interatomic distances, making the model differentiable in atomic coordinates and hence usable as a force field. D-MPNN (Yang et al. 2019, J. Chem. Inf. Model. 59) passes messages along directed bonds rather than between nodes, which removes the "messages bouncing back" pathology of vertex-centric MPNNs and is the default in ChemProp.
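A hedged sketch of a continuous-filter convolution in the SchNet spirit. The Gaussian RBF expansion, cutoff, and layer sizes below are illustrative choices rather than SchNet's published hyperparameters; the point is that the filter is a smooth function of distance, so forces come out of autograd:

```python
import torch
import torch.nn as nn

class CFConv(nn.Module):
    """Continuous-filter convolution: filters depend smoothly on interatomic
    distance, so the output is differentiable in atomic coordinates."""
    def __init__(self, hidden_dim=64, n_rbf=20, cutoff=5.0):
        super().__init__()
        # Gaussian RBF centers from 0 to the cutoff (spacing is an illustrative choice)
        self.centers = torch.linspace(0.0, cutoff, n_rbf)
        self.gamma = 10.0
        self.filter_net = nn.Sequential(
            nn.Linear(n_rbf, hidden_dim), nn.Softplus(),
            nn.Linear(hidden_dim, hidden_dim),
        )

    def forward(self, h, pos, edge_index):
        src, dst = edge_index
        dist = (pos[src] - pos[dst]).norm(dim=-1, keepdim=True)   # [E, 1]
        rbf = torch.exp(-self.gamma * (dist - self.centers) ** 2)  # [E, n_rbf]
        W = self.filter_net(rbf)                                   # distance-dependent filter
        messages = h[src] * W                                      # filter * neighbor features
        return torch.zeros_like(h).index_add_(0, dst, messages)

pos = torch.randn(3, 3, requires_grad=True)
h = torch.randn(3, 64)
edge_index = torch.tensor([[0, 1, 1, 2], [1, 0, 2, 1]])
# A stand-in scalar "energy" just to demonstrate forces via autograd
energy = CFConv()(h, pos, edge_index).sum()
forces = -torch.autograd.grad(energy, pos)[0]
```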
The expressive ceiling matters. Standard MPNNs are at most as powerful as the 1-Weisfeiler-Lehman (1-WL) graph isomorphism test (Xu et al. 2019, GIN paper): they cannot distinguish certain pairs of non-isomorphic molecules, including some regioisomers. This motivates higher-order GNNs and 3D-aware models. DimeNet (Klicpera et al. 2020) adds bond-angle messages; EGNN (Satorras et al. 2021) and NequIP (Batzner et al. 2022, Nat. Commun. 13) build E(n)- or E(3)-equivariance directly into the message function, so predicted forces transform correctly under rotation and reflection without data augmentation.
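A simplified layer in the spirit of EGNN, to show how the equivariance is obtained structurally. This is a sketch of the idea, not the published architecture: scalar features only ever see rotation-invariant quantities, and coordinates are only ever moved along relative difference vectors:

```python
import torch
import torch.nn as nn

class EGNNLayer(nn.Module):
    """E(n)-equivariant layer (simplified, after Satorras et al. 2021).
    Rotating the input coordinates rotates the output coordinates."""
    def __init__(self, dim=16):
        super().__init__()
        self.phi_e = nn.Sequential(nn.Linear(2 * dim + 1, dim), nn.SiLU(), nn.Linear(dim, dim))
        self.phi_x = nn.Sequential(nn.Linear(dim, dim), nn.SiLU(), nn.Linear(dim, 1))
        self.phi_h = nn.Sequential(nn.Linear(2 * dim, dim), nn.SiLU(), nn.Linear(dim, dim))

    def forward(self, h, x, edge_index):
        src, dst = edge_index
        rel = x[dst] - x[src]                                # equivariant difference vectors
        d2 = (rel ** 2).sum(-1, keepdim=True)                # invariant squared distance
        m = self.phi_e(torch.cat([h[dst], h[src], d2], -1))  # messages see only invariants
        # Coordinate update: scalar weight times relative vector, which stays equivariant
        x = x + torch.zeros_like(x).index_add_(0, dst, rel * self.phi_x(m))
        agg = torch.zeros_like(h).index_add_(0, dst, m)
        return self.phi_h(torch.cat([h, agg], -1)), x

h, x = torch.randn(3, 16), torch.randn(3, 3)
edge_index = torch.tensor([[0, 1, 1, 2], [1, 0, 2, 1]])
h, x = EGNNLayer()(h, x, edge_index)
```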
QM9 (134k small organic molecules with DFT-computed properties) and OC20 (catalysis; roughly 1.3M DFT relaxations comprising over 130M single-point structures) are the standard benchmarks. NequIP and its successors (Allegro, MACE) reach chemical accuracy on QM9 atomization energies with thousands of training points, where Coulomb-matrix kernel ridge regression needs orders of magnitude more.
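For orientation, QM9 ships with PyTorch Geometric; a usage sketch (the target column for U0 follows PyG's documented ordering and is worth verifying against the installed release):

```python
from torch_geometric.datasets import QM9

dataset = QM9(root="data/qm9")  # downloads on first use; ~130k graphs after filtering
mol = dataset[0]
print(mol.z)        # atomic numbers (node labels)
print(mol.pos)      # 3D coordinates, one row per atom
print(mol.y[:, 7])  # U0, internal energy at 0 K (index 7 per PyG's target ordering)
```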
Common Confusions
MPNNs are not strictly more expressive than fingerprints
A vanilla MPNN bounded by 1-WL cannot separate certain molecule pairs that ECFP4 with a large enough hash space can, and vice versa. The empirical advantage of MPNNs comes from learned, task-specific features trained end to end on the target property, not from strict dominance in expressive power.
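To make the 1-WL bound concrete, here is a minimal color-refinement routine in plain Python (an illustrative sketch, not any library's implementation). A hexagon and a disjoint pair of triangles receive identical color histograms at every round, so no vanilla MPNN can tell them apart from topology and atom types alone:

```python
from collections import Counter

def wl_colors(adj, labels, rounds=3):
    """1-Weisfeiler-Lehman color refinement. adj: {node: [neighbors]},
    labels: initial node colors (e.g., atomic numbers)."""
    colors = dict(labels)
    for _ in range(rounds):
        # New color = hash of (own color, sorted multiset of neighbor colors)
        colors = {v: hash((colors[v], tuple(sorted(colors[w] for w in adj[v]))))
                  for v in adj}
    return Counter(colors.values())

# Both graphs are 2-regular with identical node labels: 1-WL cannot separate them.
hexagon = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}
labels = {i: 6 for i in range(6)}  # say, all carbons
print(wl_colors(hexagon, labels) == wl_colors(two_triangles, labels))  # True
```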
Equivariance is about symmetry of the function, not the data
Augmenting training data with random rotations does not give the same inductive bias as a structurally equivariant network. A non-equivariant model can still memorize a rotation-augmented training set yet fail out-of-distribution; an equivariant model satisfies the symmetry exactly for every input.
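The difference is checkable numerically. Below, a toy E(3)-equivariant map built from invariant weights and difference vectors commutes with an arbitrary orthogonal transform for every input, something augmentation can only encourage statistically (the function itself is a made-up example, not a model from the literature):

```python
import torch

def equivariant_displacement(x):
    """Toy E(3)-equivariant map: weight each pairwise difference vector by a
    function of its rotation-invariant length, then sum per atom."""
    rel = x[:, None, :] - x[None, :, :]               # [N, N, 3] difference vectors
    w = torch.exp(-(rel ** 2).sum(-1, keepdim=True))  # invariant weights
    return (w * rel).sum(dim=1)

# Random orthogonal matrix (rotation or reflection) via QR decomposition
Q, _ = torch.linalg.qr(torch.randn(3, 3))
x = torch.randn(5, 3)
# Exact symmetry: f(x Q^T) == f(x) Q^T holds for every input, not just training data
assert torch.allclose(equivariant_displacement(x @ Q.T),
                      equivariant_displacement(x) @ Q.T, atol=1e-5)
```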
References
Gilmer 2017
Gilmer, Schoenholz, Riley, Vinyals, Dahl, "Neural Message Passing for Quantum Chemistry," ICML 2017, arXiv:1704.01212.
Yang 2019 D-MPNN
Yang et al., "Analyzing Learned Molecular Representations for Property Prediction," J. Chem. Inf. Model. 59(8), 2019, pp. 3370-3388 (ChemProp).
Schütt 2018 SchNet
Schütt et al., "SchNet - A deep learning architecture for molecules and materials," J. Chem. Phys. 148(24), 2018, 241722, arXiv:1712.06113.
Klicpera 2020 DimeNet
Klicpera, Groß, Günnemann, "Directional Message Passing for Molecular Graphs," ICLR 2020, arXiv:2003.03123.
Batzner 2022 NequIP
Batzner et al., "E(3)-equivariant graph neural networks for data-efficient and accurate interatomic potentials," Nat. Commun. 13:2453, 2022, arXiv:2101.03164.
Xu 2019 GIN
Xu, Hu, Leskovec, Jegelka, "How Powerful are Graph Neural Networks?" ICLR 2019, arXiv:1810.00826.
Last reviewed: April 18, 2026
Prerequisites
Foundations this topic depends on.
- Graph Neural Networks (Layer 3)
- Convolutional Neural Networks (Layer 3)
- Feedforward Networks and Backpropagation (Layer 2)
- Differentiation in Rn (Layer 0A)
- Sets, Functions, and Relations (Layer 0A)
- Basic Logic and Proof Techniques (Layer 0A)
- Matrix Calculus (Layer 1)
- The Jacobian Matrix (Layer 0A)
- The Hessian Matrix (Layer 0A)
- Matrix Operations and Properties (Layer 0A)
- Eigenvalues and Eigenvectors (Layer 0A)
- Activation Functions (Layer 1)
- Convex Optimization Basics (Layer 1)
- Vectors, Matrices, and Linear Maps (Layer 0A)
- Equivariant Deep Learning (Layer 4)