NUTS / Funnel Lab

See why NUTS can fix path-length tuning and still fail on bad hierarchical geometry. The real choice is often centered vs non-centered coordinates, not sampler branding.

Geometry decides whether warmup can save the run

Watch the same hierarchical model in two coordinate systems. The centered form can squeeze into a sharp funnel neck; the non-centered form spreads that geometry back onto a friendlier latent scale. NUTS adapts path length, but it still inherits the geometry you hand it.

Blue or green points are posterior mass, the dashed path is the adapted trajectory, and rose crosses mark where divergences cluster first.

Centered hierarchy

Local effects are sampled directly on their own scale, so weak data forces the sampler through a narrow neck whose width depends on the shared scale parameter.

Divergence risk: moderate

Bulk ESS per 1k draws: 211

What to watch

Neck pinches, path kinks, and divergences cluster where the local scale collapses.

Non-centered hierarchy

Local effects move onto a standard-normal base scale, so the same posterior becomes much easier for HMC and NUTS to traverse.

Divergence risk: low

Bulk ESS per 1k draws: 465

What to watch

Cloud stays close to isotropic and the adapted path spends more time exploring than recovering.

Data strength: 34%

Low data strength means the shared scale parameter dominates. High data strength means each group effect is pinned down by its own likelihood.

Aggressiveness: 58%

Aggressiveness bundles two things: how ambitious the step size is, and how hard the sampler tries to move through the posterior in one trajectory. High values reveal bad geometry faster.

Try this

Start with the weak-data funnel preset. Then keep the same weak data regime and reduce aggressiveness. You should see tuning help a little, but the non-centered panel remains much easier because the real fix is geometric.

Scenario note

Centered geometry pinches into a narrow neck, so aggressive integration throws divergences there first.

Diagnosis

Classic weak-data funnel

The centered hierarchy is forcing the sampler through a narrow neck, so aggressive integration keeps breaking at the same place. The non-centered panel stays much more isotropic.
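The break at the neck can be reproduced with a hand-rolled leapfrog integrator. The sketch below is not the lab's own model. It assumes a two-dimensional Neal-style funnel (log-scale v with standard deviation 3 and a single local effect) as a stand-in for the weakest-data limit, and every function name is illustrative. Both runs start at the same point in the neck with the same step size; only the coordinate system differs.

```python
import math

# Assumed stand-in model: v = log-scale with v ~ Normal(0, 3).
# Centered:     x | v ~ Normal(0, exp(v / 2)), sampler moves in (v, x)
# Non-centered: z ~ Normal(0, 1), x = exp(v / 2) * z, sampler moves in (v, z)

def u_centered(v, x):
    # Negative log density in centered coordinates (constants dropped).
    return v * v / 18.0 + 0.5 * x * x * math.exp(-v) + 0.5 * v

def grad_u_centered(v, x):
    w = math.exp(-v)
    return v / 9.0 - 0.5 * x * x * w + 0.5, x * w

def u_non_centered(v, z):
    # Negative log density in non-centered coordinates (constants dropped).
    return v * v / 18.0 + 0.5 * z * z

def grad_u_non_centered(v, z):
    return v / 9.0, z

def max_energy_error(u, grad_u, q, p, step_size, n_steps):
    """Run leapfrog and return the largest |H - H_start| seen along the path."""
    q0, q1 = q
    p0, p1 = p
    h_start = u(q0, q1) + 0.5 * (p0 * p0 + p1 * p1)
    worst = 0.0
    for _ in range(n_steps):
        g0, g1 = grad_u(q0, q1)
        p0 -= 0.5 * step_size * g0   # half step in momentum
        p1 -= 0.5 * step_size * g1
        q0 += step_size * p0         # full step in position
        q1 += step_size * p1
        g0, g1 = grad_u(q0, q1)
        p0 -= 0.5 * step_size * g0   # second half step in momentum
        p1 -= 0.5 * step_size * g1
        h = u(q0, q1) + 0.5 * (p0 * p0 + p1 * p1)
        worst = max(worst, abs(h - h_start))
    return worst

# Same starting point in the neck, expressed in each coordinate system.
v_start, z_start = -4.0, 1.0
x_start = math.exp(v_start / 2.0) * z_start
momentum = (0.0, 1.0)

err_centered = max_energy_error(
    u_centered, grad_u_centered, (v_start, x_start), momentum, 0.5, 10)
err_non_centered = max_energy_error(
    u_non_centered, grad_u_non_centered, (v_start, z_start), momentum, 0.5, 10)

print("max energy error, centered:    ", err_centered)
print("max energy error, non-centered:", err_non_centered)
```

A large energy error is what samplers such as Stan report as a divergence. In this sketch the centered run exceeds the leapfrog stability limit because the curvature in x grows like exp(-v) at the neck, while the non-centered run sees constant curvature and stays stable at the same step size.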

Why it matters

This is the regime where reparameterization changes geometry more than NUTS path-length adaptation does.

Next move

Lower aggressiveness first, then compare how much more the non-centered parameterization helps than pure tuning does.

Two equivalent models, different geometry

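Centered: $\theta_j \sim \mathcal{N}(\mu, \tau^2)$, with the sampler moving directly in $\theta_j$.

Non-centered: $z_j \sim \mathcal{N}(0, 1)$ and $\theta_j = \mu + \tau z_j$, so the latent coordinates stay on a stable base scale.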
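To make the difference concrete, the following sketch draws from both parameterizations and compares how wide the sampled coordinate is in the neck (small scale) versus the mouth (large scale). It is an illustration under assumptions, not the lab's code: the symbols mu, tau, z, and theta, the standard-normal prior on log tau, and the cutoffs at -1 and +1 are all chosen here for the example.

```python
import math
import random
import statistics

random.seed(0)
MU = 0.0  # assumed population mean, fixed for simplicity

def draw_centered():
    # The sampler's coordinate is theta itself; its scale is tau.
    log_tau = random.gauss(0.0, 1.0)
    theta = random.gauss(MU, math.exp(log_tau))
    return log_tau, theta

def draw_non_centered():
    # The sampler's coordinate is z; theta is computed from it afterwards.
    log_tau = random.gauss(0.0, 1.0)
    z = random.gauss(0.0, 1.0)
    theta = MU + math.exp(log_tau) * z
    return log_tau, z, theta

centered = [draw_centered() for _ in range(20000)]
non_centered = [draw_non_centered() for _ in range(20000)]

def spread(values):
    return statistics.pstdev(values)

# Neck: log_tau < -1. Mouth: log_tau > 1.
theta_neck = spread([t for lt, t in centered if lt < -1.0])
theta_mouth = spread([t for lt, t in centered if lt > 1.0])
z_neck = spread([z for lt, z, _ in non_centered if lt < -1.0])
z_mouth = spread([z for lt, z, _ in non_centered if lt > 1.0])
# Same model: theta rebuilt from z has the same neck spread as centered theta.
theta_nc_neck = spread([t for lt, _, t in non_centered if lt < -1.0])

print("centered theta spread, neck vs mouth:", theta_neck, theta_mouth)
print("non-centered z spread, neck vs mouth:", z_neck, z_mouth)
print("theta rebuilt from z, neck:", theta_nc_neck)
```

The coordinate the sampler moves in changes width by a large factor between neck and mouth in the centered form, while z keeps roughly unit width everywhere. No single step size suits both ends of the centered funnel, which is the geometric problem the non-centered form removes.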

What NUTS really changes

NUTS removes the need to hand-pick the leapfrog count. It does not repeal the neck of a centered funnel. If the geometry is wrong, path-length adaptation still drives a bad path through a bad region.
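The path-length part can be sketched with a stripped-down U-turn check. This is a simplification made for illustration: real NUTS builds a doubling tree in both time directions and samples from it, whereas the hypothetical helper below only integrates forward on a one-dimensional Gaussian and stops when the trajectory starts heading back toward its start.

```python
def steps_until_u_turn(sigma, step_size=0.1, max_steps=1000):
    """Count leapfrog steps on Normal(0, sigma) until (q - q_start) * p < 0.

    Simplified one-directional version of the no-U-turn criterion,
    not the full NUTS tree-building procedure.
    """
    q_start = 0.0
    q, p = q_start, 1.0
    for n in range(1, max_steps + 1):
        p -= 0.5 * step_size * q / sigma ** 2   # half step in momentum
        q += step_size * p                      # full step in position
        p -= 0.5 * step_size * q / sigma ** 2   # second half step in momentum
        if (q - q_start) * p < 0.0:             # moving back toward the start
            return n
    return max_steps

steps_narrow = steps_until_u_turn(sigma=0.3)
steps_unit = steps_until_u_turn(sigma=1.0)
steps_wide = steps_until_u_turn(sigma=3.0)

print("steps before U-turn:", steps_narrow, steps_unit, steps_wide)
```

The step count adapts to the width of the target without anyone choosing it by hand. What the check cannot do is fix a step size that is too large for the local curvature, so a trajectory that enters a centered neck still breaks there.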