What does the Fokker-Planck equation describe vs. the SDE itself?
Question
Consider an Ito SDE dX_t = b(X_t, t) dt + sigma(X_t, t) dW_t and its associated Fokker-Planck (forward Kolmogorov) equation partial_t p(x, t) = - div(b * p) + (1/2) sum_ij partial_i partial_j (D_ij * p), where D = sigma * sigma^T. The SDE and the Fokker-Planck equation describe different objects. Which statement is correct?
Why this matters
This distinction is the conceptual hinge between sample-based methods (SDE simulation, Langevin samplers) and density-based methods (score matching, normalizing flows). Score matching trains a model of the score nabla log p_t(x), where p_t is the marginal density of X_t that solves the Fokker-Planck equation; it does NOT train a model of individual SDE paths. Conflating the two is a common conceptual error in diffusion model derivations.
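To make the point concrete, here is a minimal sketch that assumes a one-dimensional Ornstein-Uhlenbeck process dX_t = -theta * X_t dt + sigma dW_t started from a point mass at x0. This example is not part of the original question; it is chosen because its Fokker-Planck solution is a known Gaussian. The function names and parameter values are illustrative. The score is computed from the density alone, with no path ever simulated.

```python
import numpy as np

def ou_moments(t, x0, theta, sigma):
    """Mean and variance of the Gaussian Fokker-Planck solution at time t."""
    mean = x0 * np.exp(-theta * t)
    var = sigma**2 / (2.0 * theta) * (1.0 - np.exp(-2.0 * theta * t))
    return mean, var

def ou_log_density(x, t, x0=1.0, theta=1.0, sigma=1.0):
    """log p_t(x) for the assumed OU process (illustrative defaults)."""
    mean, var = ou_moments(t, x0, theta, sigma)
    return -0.5 * (x - mean) ** 2 / var - 0.5 * np.log(2.0 * np.pi * var)

def ou_score(x, t, x0=1.0, theta=1.0, sigma=1.0):
    """Closed-form score d/dx log p_t(x) = -(x - mean) / var."""
    mean, var = ou_moments(t, x0, theta, sigma)
    return -(x - mean) / var

x = np.linspace(-2.0, 3.0, 11)
t = 0.5
h = 1e-5

# Central finite difference of the log density, compared with the closed form.
fd_score = (ou_log_density(x + h, t) - ou_log_density(x - h, t)) / (2.0 * h)
score_gap = np.max(np.abs(fd_score - ou_score(x, t)))
print("max |finite-difference score - closed-form score|:", score_gap)
```

The score here is a deterministic function of (x, t) fixed by the marginal density p_t; a score-matching model approximates this field rather than any individual trajectory.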
Common mistake
Believing that 'simulating the SDE' and 'solving the Fokker-Planck PDE' are the same task. They give complementary information: SDE simulation gives you samples (cheap per sample, but estimating density tails takes many samples); solving the Fokker-Planck equation gives you the density everywhere (cheap to evaluate once solved, but more expensive to set up and solve).
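The two views can be compared side by side in a case where the Fokker-Planck equation has a known solution. The sketch below assumes a one-dimensional Ornstein-Uhlenbeck process dX_t = -theta * X_t dt + sigma dW_t with X_0 = x0, whose density at time t is Gaussian. The process, the Euler-Maruyama discretization, and all names and parameter values are assumptions made for illustration, not taken from the source.

```python
import numpy as np

def simulate_ou(x0, theta, sigma, t_end, n_steps, n_paths, rng):
    """Sample view: Euler-Maruyama for dX = -theta * X dt + sigma dW.

    Returns the simulated values of X at time t_end, one per path.
    """
    dt = t_end / n_steps
    x = np.full(n_paths, x0, dtype=float)
    for _ in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt), size=n_paths)  # Brownian increments
        x = x + (-theta * x) * dt + sigma * dw
    return x

def ou_density(x, t, x0, theta, sigma):
    """Density view: closed-form Fokker-Planck solution from a point mass at x0."""
    mean = x0 * np.exp(-theta * t)
    var = sigma**2 / (2.0 * theta) * (1.0 - np.exp(-2.0 * theta * t))
    return np.exp(-((x - mean) ** 2) / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

rng = np.random.default_rng(0)
x0, theta, sigma, t_end = 1.0, 1.0, 1.0, 1.0
samples = simulate_ou(x0, theta, sigma, t_end, n_steps=1000, n_paths=20000, rng=rng)

# Turning samples into a density estimate needs a histogram (or similar).
hist, edges = np.histogram(samples, bins=40, range=(-2.0, 3.0), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])

# The Fokker-Planck solution can be evaluated directly at any point.
pdf = ou_density(centers, t_end, x0, theta, sigma)
max_err = np.max(np.abs(hist - pdf))
print("largest gap between histogram and density:", max_err)
```

The histogram is noisiest where few samples land, which is the "expensive to estimate density tails" side of the trade-off. The closed-form density is exact everywhere, but only because this particular equation can be solved by hand.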
Source anchor
content/topics/fokker-planck-equation.mdx#why-this-matters