Diffusion Lab

Learn the real diffusion loop: corrupt clean structure in closed form, then spend sampler budget bringing it back with the right amount of guidance.

[Interactive demo. Controls: Target, Sampler, CFG, Train, Sample, Quality vs steps. Panels: Target distribution (300 fresh samples from p(x, c)); Model samples; Score field (arrows point toward higher density at noise level t); Sample quality vs steps, DDIM (η=0) vs Heun (Karras), 4–128 steps (network forwards: DDIM = N, Heun ≈ 2N); Training loss.]

Training runs in the main thread, ~30 steps per animation frame. Sampling does 1000 reverse steps per point and blocks briefly. DDIM and CFG controls land in later phases.

How to read this

A small MLP is being trained right now in your tab to denoise samples from the chosen 2D distribution. The forward process adds Gaussian noise on the cosine schedule of Nichol & Dhariwal (2021): at step t a clean point x_0 becomes x_t = sqrt(bar_alpha_t) · x_0 + sqrt(1 − bar_alpha_t) · eps. The network learns to predict that eps from x_t with class and time conditioning. Reverse-time samplers turn those eps predictions back into samples.
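Concretely, the schedule and the corruption step fit in a few lines. A minimal TypeScript sketch, assuming T = 1000 and the standard s = 0.008 offset; all identifiers here are illustrative, not the lab's actual source:

    // Cosine noise schedule (Nichol & Dhariwal 2021) and the closed-form
    // forward corruption. Illustrative sketch, not the lab's source.
    const T = 1000;
    const S = 0.008; // small offset so bar_alpha stays near 1 at t = 0

    function f(t: number): number {
      const c = Math.cos(((t / T + S) / (1 + S)) * (Math.PI / 2));
      return c * c;
    }

    // bar_alpha_t = f(t) / f(0), one value per step
    const abar: number[] = Array.from({ length: T + 1 }, (_, t) => f(t) / f(0));

    // Box-Muller draw from a standard normal
    function randn(): number {
      const u = 1 - Math.random();
      const v = Math.random();
      return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
    }

    // Closed-form corruption: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps.
    // Returns both x_t and the eps the network is trained to predict.
    function forward(x0: [number, number], t: number): { xt: [number, number]; eps: [number, number] } {
      const a = Math.sqrt(abar[t]);
      const b = Math.sqrt(1 - abar[t]);
      const eps: [number, number] = [randn(), randn()];
      return { xt: [a * x0[0] + b * eps[0], a * x0[1] + b * eps[1]], eps };
    }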

Sampler choice

DDPM walks the full T = 1000 reverse chain with fresh noise at each step (Ho et al. 2020). DDIM keeps the same trained network but takes a sub-schedule of N deterministic steps through the predicted x_0 (Song, Meng & Ermon 2021). Heun (EDM) rewrites the trajectory in sigma-space using σ = sqrt((1 − bar_alpha_t)/bar_alpha_t) and integrates the probability-flow ODE with a 2nd-order predictor-corrector step on a Karras ρ = 7 schedule. The quality chart makes the trade-off visible: at low step counts Heun usually wins on MMD; at high step counts all three samplers converge.
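A sketch of the two budget-limited steppers, in TypeScript. `epsNet` stands in for the trained eps-prediction network and `D` for the sigma-space denoiser; both, like every name below, are assumptions rather than the lab's API:

    type Vec2 = [number, number];

    // One DDIM step (eta = 0): jump from step t to an earlier step tPrev
    // through the predicted x_0.
    function ddimStep(x: Vec2, t: number, tPrev: number, abar: number[], epsNet: (x: Vec2, t: number) => Vec2): Vec2 {
      const e = epsNet(x, t);
      const sa = Math.sqrt(abar[t]);
      const sb = Math.sqrt(1 - abar[t]);
      // Predicted clean point, then re-noise deterministically to level tPrev.
      const x0: Vec2 = [(x[0] - sb * e[0]) / sa, (x[1] - sb * e[1]) / sa];
      const pa = Math.sqrt(abar[tPrev]);
      const pb = Math.sqrt(1 - abar[tPrev]);
      return [pa * x0[0] + pb * e[0], pa * x0[1] + pb * e[1]];
    }

    // Karras sigma schedule (rho = 7): N noise levels from sigMax down to sigMin, then 0.
    function karrasSigmas(n: number, sigMin: number, sigMax: number, rho = 7): number[] {
      const inv = (s: number) => Math.pow(s, 1 / rho);
      const out: number[] = [];
      for (let i = 0; i < n; i++) {
        out.push(Math.pow(inv(sigMax) + (i / (n - 1)) * (inv(sigMin) - inv(sigMax)), rho));
      }
      out.push(0);
      return out;
    }

    // One Heun step on the probability-flow ODE dx/dsigma = (x - D(x, sigma)) / sigma,
    // where x lives in sigma-space (x_t / sqrt(abar_t)) and D is the denoised estimate.
    function heunStep(x: Vec2, sig: number, sigNext: number, D: (x: Vec2, s: number) => Vec2): Vec2 {
      const slope = (p: Vec2, s: number): Vec2 => {
        const d = D(p, s);
        return [(p[0] - d[0]) / s, (p[1] - d[1]) / s];
      };
      const h = sigNext - sig;
      const d1 = slope(x, sig);                     // predictor (Euler) slope
      const xe: Vec2 = [x[0] + h * d1[0], x[1] + h * d1[1]];
      if (sigNext === 0) return xe;                 // last step stays Euler
      const d2 = slope(xe, sigNext);                // corrector slope at sigNext
      return [x[0] + h * 0.5 * (d1[0] + d2[0]), x[1] + h * 0.5 * (d1[1] + d2[1])];
    }

Chaining heunStep over karrasSigmas(N, …) costs two D-calls per step except the final Euler step, which is where the ≈ 2N forward count in the chart comes from.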

Classifier-free guidance

The network is trained on both class labels and an empty token (Ho & Salimans 2022). At sample time the eps used is (1 + w) · eps_c − w · eps_∅. The vector field plot lets you watch the eps-arrows on a 16×16 grid stretch and sharpen as w increases; w = 0 recovers the plain class-conditional field.
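The guidance rule itself is one line per coordinate. A sketch, assuming the conditional and unconditional predictions are already in hand (the helper name is hypothetical):

    type Vec2 = [number, number];

    // Classifier-free guidance: extrapolate from the unconditional prediction
    // past the conditional one. w = 0 returns eps_c unchanged.
    function guidedEps(epsC: Vec2, epsNull: Vec2, w: number): Vec2 {
      // (1 + w) * eps_c - w * eps_null, per coordinate
      return [(1 + w) * epsC[0] - w * epsNull[0], (1 + w) * epsC[1] - w * epsNull[1]];
    }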

Vector field

Each arrow plots the score s_θ = −eps_θ / sqrt(1 − bar_alpha_t) at one (x, y) location. At small t the arrows snap clean points into mode centers; at large t they point inward toward the origin because almost all of x_t is noise. Watch the field morph as training progresses.
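The arrows are a direct rescaling of the network output. A sketch, with `epsNet` and `abar` assumed as in the earlier snippets:

    type Vec2 = [number, number];

    // Score of the noised marginal at step t, recovered from the eps prediction:
    // s_theta(x, t) = -eps_theta(x, t) / sqrt(1 - abar_t).
    function score(x: Vec2, t: number, abar: number[], epsNet: (x: Vec2, t: number) => Vec2): Vec2 {
      const e = epsNet(x, t);
      const k = -1 / Math.sqrt(1 - abar[t]);
      return [k * e[0], k * e[1]];
    }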

Quality vs steps

Each point on the chart is the kernel MMD between 200 generated samples and a fresh batch from the target, computed with RBF kernels at bandwidths σ ∈ {0.5, 1, 2} × the median pairwise distance (Gretton et al. 2012). Lower is better. The DDIM and Heun curves use the same trained weights and the same step counts; Heun costs roughly twice as many network calls as DDIM at the same step count because each Heun step is a predictor-corrector pair.
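The metric is a few nested loops over the two point sets. A sketch of a biased MMD² estimate that sums the three RBF bandwidths; the exact estimator and kernel combination used by the lab are assumptions here:

    type Vec2 = [number, number];

    function sqDist(a: Vec2, b: Vec2): number {
      const dx = a[0] - b[0];
      const dy = a[1] - b[1];
      return dx * dx + dy * dy;
    }

    function medianPairwise(pts: Vec2[]): number {
      const d: number[] = [];
      for (let i = 0; i < pts.length; i++)
        for (let j = i + 1; j < pts.length; j++) d.push(Math.sqrt(sqDist(pts[i], pts[j])));
      d.sort((a, b) => a - b);
      return d[Math.floor(d.length / 2)];
    }

    // Biased (V-statistic) MMD^2 between generated X and target Y,
    // summing RBF kernels at {0.5, 1, 2} x the median pairwise distance.
    function mmd2(X: Vec2[], Y: Vec2[]): number {
      const med = medianPairwise(X.concat(Y));
      const sigmas = [0.5, 1, 2].map(s => s * med);
      const k = (a: Vec2, b: Vec2) =>
        sigmas.reduce((acc, s) => acc + Math.exp(-sqDist(a, b) / (2 * s * s)), 0);
      const mean = (A: Vec2[], B: Vec2[]) => {
        let sum = 0;
        for (const a of A) for (const b of B) sum += k(a, b);
        return sum / (A.length * B.length);
      };
      return mean(X, X) + mean(Y, Y) - 2 * mean(X, Y);
    }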

References

  • Ho, Jain, Abbeel. Denoising Diffusion Probabilistic Models. NeurIPS 2020. arXiv:2006.11239
  • Song, Meng, Ermon. Denoising Diffusion Implicit Models. ICLR 2021. arXiv:2010.02502
  • Nichol, Dhariwal. Improved Denoising Diffusion Probabilistic Models. ICML 2021. arXiv:2102.09672
  • Ho, Salimans. Classifier-Free Diffusion Guidance. NeurIPS 2021 Workshop. arXiv:2207.12598
  • Karras, Aittala, Aila, Laine. Elucidating the Design Space of Diffusion-Based Generative Models. NeurIPS 2022. arXiv:2206.00364
  • Gretton, Borgwardt, Rasch, Schölkopf, Smola. A Kernel Two-Sample Test. JMLR 2012.