NUTS / Funnel Lab

See why NUTS can fix path-length tuning and still fail on bad hierarchical geometry. The real choice is often centered vs non-centered coordinates, not sampler branding.

Geometry decides whether warmup can save the run

Watch the same hierarchical model in two coordinate systems. The centered form can squeeze into a sharp funnel neck; the non-centered form spreads that geometry back onto a friendlier latent scale. NUTS adapts path length, but it still inherits the geometry you hand it.

Blue or green points are posterior mass, the dashed path is the adapted trajectory, and rose crosses mark where divergences cluster first.

Centered hierarchy

Local effects are sampled directly, so under weak data their spread tracks the shared scale and the sampler has to pass through a neck that narrows as that scale shrinks.

Divergence risk: moderate

Bulk ESS per 1,000 draws: 211

What to watch

Neck pinches, path kinks, and divergences cluster where the local scale collapses.

Non-centered hierarchy

Local effects move onto a standard-normal base scale, so the same posterior becomes much easier for HMC and NUTS to traverse.

Divergence risk: low

Bulk ESS per 1,000 draws: 465

What to watch

Cloud stays close to isotropic and the adapted path spends more time exploring than recovering.

Data strength: 34%

Low strength means the shared scale parameter dominates. High strength means each group effect is pinned down by its own likelihood.
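One way to make the data-strength idea concrete is to look at a single group with a normal likelihood. The sketch below assumes y_j ~ N(θ_j, σ²) with a known σ, which this page does not specify, and the names conditional_widths and width_ratio are hypothetical. Conditional on τ, θ_j then has posterior standard deviation (1/τ² + 1/σ²)^(-1/2), and z_j = (θ_j - μ)/τ has that same width divided by τ. Comparing how much each width changes across τ shows which coordinate system carries the funnel.

```python
import numpy as np

def conditional_widths(tau, sigma):
    """Conditional posterior sd of theta_j and of z_j = (theta_j - mu) / tau, given tau.

    Assumes y_j ~ N(theta_j, sigma^2) and theta_j ~ N(mu, tau^2) (illustrative model).
    """
    sd_theta = 1.0 / np.sqrt(1.0 / tau**2 + 1.0 / sigma**2)
    sd_z = sd_theta / tau
    return sd_theta, sd_z

def width_ratio(widths):
    """Widest over narrowest conditional sd across the tau grid."""
    return float(np.max(widths) / np.min(widths))

taus = np.array([0.05, 0.5, 5.0])   # hypothetical range of the shared scale
summary = {}
for label, sigma in [("weak data", 10.0), ("strong data", 0.1)]:
    sd_theta, sd_z = conditional_widths(taus, sigma)
    summary[label] = (width_ratio(sd_theta), width_ratio(sd_z))
    print(f"{label}: theta width ratio {summary[label][0]:.1f}, "
          f"z width ratio {summary[label][1]:.1f}")
```

Under these assumed values, weak data makes the θ width change by a large factor across τ while the z width barely moves, which is the centered funnel. With strong data the roles swap: θ is pinned near σ regardless of τ, and it is the z coordinate whose width now depends on τ.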

Integrator aggressiveness: 58%

This bundles how large a step size the sampler attempts and how far it tries to move through the posterior in one trajectory. Higher values expose bad geometry sooner.
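Why an ambitious step size exposes narrow regions can be seen on the simplest possible target. The sketch below runs a leapfrog integrator with unit mass on a one-dimensional Gaussian N(0, scale²), where the integrator is stable only when the step size is below 2 × scale. The function name leapfrog_energy_error and all numeric settings are illustrative assumptions, not values used by this demo.

```python
def leapfrog_energy_error(scale, step_size, n_steps=30):
    """Largest |H - H0| along a leapfrog path on N(0, scale^2) with unit mass."""
    def grad(q):
        # Gradient of the potential U(q) = q^2 / (2 * scale^2).
        return q / scale**2

    def energy(q, p):
        return 0.5 * q**2 / scale**2 + 0.5 * p**2

    q, p = scale, 1.0            # start one sd from the mode with unit momentum
    h0 = energy(q, p)
    worst = 0.0
    for _ in range(n_steps):
        p -= 0.5 * step_size * grad(q)   # half kick
        q += step_size * p               # full drift
        p -= 0.5 * step_size * grad(q)   # half kick
        worst = max(worst, abs(energy(q, p) - h0))
    return worst

wide_ok = leapfrog_energy_error(scale=1.0, step_size=0.2)      # step well below 2 * scale
narrow_bad = leapfrog_energy_error(scale=0.05, step_size=0.2)  # step above 2 * scale
narrow_ok = leapfrog_energy_error(scale=0.05, step_size=0.02)  # step shrunk to fit
print(wide_ok, narrow_bad, narrow_ok)
```

The same step size that is harmless on the wide target blows up on the narrow one, and only a much smaller step restores stability. A funnel contains both scales at once, so no single step size suits the whole posterior.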

Try this

Start with the weak-data funnel preset. Then keep the same weak-data regime and reduce aggressiveness. Tuning should help a little, but the non-centered panel remains much easier because the real fix is geometric.

Scenario note

Centered geometry pinches into a narrow neck, so aggressive integration throws divergences there first.

Diagnosis

Classic weak-data funnel

The centered hierarchy is forcing the sampler through a narrow neck, so aggressive integration keeps breaking at the same place. The non-centered panel stays much more isotropic.

Why it matters

This is the regime where reparameterization changes geometry more than NUTS path-length adaptation does.

Next move

Lower aggressiveness first, then compare how much more the non-centered parameterization helps than pure tuning does.

Two equivalent models, different geometry

Centered: $\theta_j \sim N(\mu, \tau^2)$, with the sampler moving directly in $\theta_j$.

Non-centered: $z_j \sim N(0, 1)$ and $\theta_j = \mu + \tau z_j$, so the latent coordinates stay on a stable base scale.
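A quick numerical check of the equivalence, as a sketch: the two forms imply the same distribution for θ_j but couple the sampled coordinate to τ very differently. The prior on log τ, the fixed value of μ, and the helper name scale_coupling are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000
mu = 1.0                                   # hypothetical fixed population mean
log_tau = rng.normal(0.0, 1.0, size=n)     # assumed prior on log(tau)
tau = np.exp(log_tau)

# Centered: draw theta directly at scale tau.
theta_centered = rng.normal(mu, tau)

# Non-centered: draw z on a unit scale, then shift and scale.
z = rng.normal(0.0, 1.0, size=n)
theta_noncentered = mu + tau * z

def scale_coupling(coord):
    """Correlation between log(tau) and log|coord|; large values signal a funnel."""
    return float(np.corrcoef(log_tau, np.log(np.abs(coord)))[0, 1])

probs = [0.1, 0.5, 0.9]
q_centered = np.quantile(theta_centered, probs)
q_noncentered = np.quantile(theta_noncentered, probs)
coupling_theta = scale_coupling(theta_centered - mu)   # sampled coordinate, centered form
coupling_z = scale_coupling(z)                         # sampled coordinate, non-centered form
print(q_centered, q_noncentered, coupling_theta, coupling_z)
```

The quantiles of θ agree up to Monte Carlo error, so the model is unchanged. What differs is the coordinate the sampler actually moves in: the magnitude of θ_j - μ is strongly tied to τ, while z_j is independent of it under the prior.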

What NUTS really changes

NUTS removes the need to hand-pick the leapfrog count. It does not repeal the neck of a centered funnel. If the geometry is wrong, path-length adaptation still drives a bad path through a bad region.
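The point about path length can be illustrated without implementing NUTS itself. The sketch below uses a plain fixed-step leapfrog integrator on a two-dimensional Neal's funnel, v ~ N(0, 3²) and x | v ~ N(0, exp(v)), starting in the neck, and reports the first step at which the energy error exceeds a cutoff. The starting point, step size, cutoff of 1000, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def centered_potential(q):
    """U and grad U for the funnel in (v, x): v ~ N(0, 9), x | v ~ N(0, exp(v))."""
    v, x = q
    u = v**2 / 18.0 + 0.5 * v + 0.5 * x**2 * np.exp(-v)
    grad = np.array([v / 9.0 + 0.5 - 0.5 * x**2 * np.exp(-v), x * np.exp(-v)])
    return u, grad

def noncentered_potential(q):
    """U and grad U in (v, z): v ~ N(0, 9), z ~ N(0, 1), with x = exp(v / 2) * z."""
    v, z = q
    return v**2 / 18.0 + 0.5 * z**2, np.array([v / 9.0, z])

def first_divergent_step(potential, q0, p0, step_size, max_steps, threshold=1000.0):
    """Index of the first leapfrog step whose energy error exceeds threshold, else None."""
    q, p = np.array(q0, dtype=float), np.array(p0, dtype=float)
    u, grad = potential(q)
    h0 = u + 0.5 * p @ p
    for step in range(1, max_steps + 1):
        p = p - 0.5 * step_size * grad      # half kick
        q = q + step_size * p               # full drift
        u, grad = potential(q)
        p = p - 0.5 * step_size * grad      # half kick
        if abs(u + 0.5 * p @ p - h0) > threshold:
            return step                     # stop at the first blow-up
    return None

v0 = -6.0                                   # deep in the neck
budgets = (4, 16, 64)                       # allowed path lengths, in leapfrog steps
centered = [first_divergent_step(centered_potential, [v0, np.exp(v0 / 2)],
                                 [0.0, 1.0], 0.2, m) for m in budgets]
noncentered = [first_divergent_step(noncentered_potential, [v0, 1.0],
                                    [0.0, 1.0], 0.2, m) for m in budgets]
print(centered, noncentered)
```

In the centered coordinates the trajectory breaks within the first few steps no matter how long the path is allowed to be, so choosing the path length automatically cannot rescue it. The same starting point expressed in non-centered coordinates integrates cleanly for every budget.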

