
Slice Sampling

Slice sampling draws from a target distribution by sampling uniformly from the region under its density curve. By introducing an auxiliary variable, it avoids the proposal-distribution tuning that random-walk Metropolis-Hastings requires.


Why This Matters

Random-walk Metropolis-Hastings requires choosing a proposal distribution, and performance is sensitive to this choice. A proposal that is too narrow gives high acceptance but slow exploration. A proposal that is too wide gives low acceptance and wastes computation. Slice sampling sidesteps this: it introduces an auxiliary variable and samples uniformly from the region under the density curve, automatically adapting to the local scale of the target.

The algorithm has no tuning parameters in its idealized form. In practice, the "stepping-out" procedure introduces a width parameter, but performance is robust to this choice.

The Idea

To sample from a density f(x) (known up to a normalizing constant), the key observation is:

X \sim f \iff (X, Y) \text{ is uniform on } \{(x, y) : 0 \leq y \leq f(x)\}

If you can sample uniformly from the region under the curve of f, the x-marginal is exactly f.
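This equivalence can be checked numerically with simple rejection sampling under an enclosing box. A minimal sketch, assuming an illustrative target (the Beta(2,2) density 6x(1-x) on [0, 1], with upper bound M = 1.5):

```python
import random

random.seed(0)

def f(x):
    # Assumed example target: Beta(2,2) density 6x(1-x) on [0, 1].
    return 6.0 * x * (1.0 - x)

M = 1.5  # upper bound on f over [0, 1]

# Sample points uniformly from the box [0, 1] x [0, M];
# keep the x-coordinate of those that fall under the curve.
xs = []
while len(xs) < 20000:
    x, y = random.random(), random.uniform(0.0, M)
    if y <= f(x):
        xs.append(x)

# The x-marginal of the kept points should match Beta(2,2), whose mean is 0.5.
mean = sum(xs) / len(xs)
print(round(mean, 2))
```

The kept x-values are distributed according to f, with no knowledge of f's normalizing constant required.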

The Algorithm

Definition

Slice Sampler

Given current state x_t and (unnormalized) target density f:

  1. Draw auxiliary variable: sample y_t ~ Uniform(0, f(x_t)).
  2. Define the slice: S = {x : f(x) > y_t}.
  3. Sample from slice: draw x_{t+1} uniformly from S.

This defines a Markov chain (x_0, x_1, x_2, …) with stationary distribution proportional to f.

Step 3 is the hard part. The slice S can be a complicated, disconnected set. In one dimension, the "stepping-out" and "shrinking" procedure makes this practical.

Stepping-out procedure (univariate):

  1. Start with an interval [L, R] of initial width w around x_t.
  2. Randomly position the interval: L = x_t - U · w, R = L + w, where U ~ Uniform(0, 1).
  3. Step out: while f(L) > y_t, set L = L - w. While f(R) > y_t, set R = R + w.
  4. The interval [L, R] now contains the slice (at least the connected component containing x_t).
  5. Sample x' ~ Uniform(L, R). If f(x') > y_t, accept x_{t+1} = x'. Otherwise, shrink the interval (replace L or R with x', depending on which side of x_t it falls) and repeat.

The shrinking step guarantees that the algorithm eventually finds a point in the slice.
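The full univariate procedure fits in a few lines. This is a minimal sketch, not Neal's reference implementation; the standard-normal target and w = 1 are assumptions chosen for the demo:

```python
import math
import random

random.seed(1)

def slice_step(x, f, w=1.0):
    """One univariate slice-sampling update from state x for unnormalized density f."""
    # 1. Auxiliary variable: uniform height under the curve at x.
    y = random.uniform(0.0, f(x))
    # 2. Stepping out: randomly position an interval of width w around x,
    #    then extend each end until it lies outside the slice {x' : f(x') > y}.
    L = x - random.random() * w
    R = L + w
    while f(L) > y:
        L -= w
    while f(R) > y:
        R += w
    # 3. Shrinking: propose uniformly in [L, R]; on rejection, shrink toward x.
    while True:
        x_new = random.uniform(L, R)
        if f(x_new) > y:
            return x_new
        if x_new < x:
            L = x_new
        else:
            R = x_new

def target(x):
    # Unnormalized standard normal density.
    return math.exp(-0.5 * x * x)

x, samples = 0.0, []
for _ in range(20000):
    x = slice_step(x, target)
    samples.append(x)

mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(round(mean, 2), round(var, 2))
```

For the standard normal, the sample mean and variance should land near 0 and 1. Note that the shrinking loop always terminates: every rejection moves one endpoint strictly closer to x_t, and f(x_t) > y by construction.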

Correctness

Theorem

Slice Sampler Preserves the Target Distribution

Statement

The joint distribution that is uniform on {(x, y) : 0 ≤ y ≤ f(x)} is invariant under the slice sampling Markov chain. The marginal distribution of x is proportional to f(x).

Intuition

Each conditional distribution is correct by construction. Given x, y is uniform on [0, f(x)]. Given y, x is uniform on the slice {x : f(x) > y}. Since each Gibbs step preserves the joint, the overall chain preserves it.

Proof Sketch

The slice sampler is a Gibbs sampler on the joint space (x, y). The joint density is p(x, y) ∝ 1[0 ≤ y ≤ f(x)], which is the uniform distribution on the subgraph of f. The conditionals p(y | x) = Uniform(0, f(x)) and p(x | y) = Uniform({x : f(x) > y}) are both exact. Each Gibbs step preserves the joint, and integrating out the auxiliary variable gives ∫₀^{f(x)} dy = f(x), so the x-marginal is proportional to f.
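The marginalization step at the end of the sketch can be written out explicitly:

```latex
p(x) \;\propto\; \int_0^{\infty} \mathbf{1}[0 \leq y \leq f(x)]\, dy
\;=\; \int_0^{f(x)} dy
\;=\; f(x)
```

so the x-marginal of the invariant joint distribution is proportional to f, as claimed.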

Why It Matters

This shows that slice sampling is exact (no approximation error in the stationary distribution), unlike methods that truncate or approximate the target. The only source of error is finite-time bias from not having mixed, which is common to all MCMC methods.

Failure Mode

The proof assumes exact uniform sampling from the slice. The stepping-out procedure finds a superset of the slice and uses rejection within it, which is exact. But in multivariate settings, the slice can be a complicated non-convex region, and uniform sampling becomes hard. Multivariate slice sampling typically updates one coordinate at a time (like Gibbs), which can be slow for correlated targets.

Theorem

Uniform Ergodicity of the Slice Sampler

Statement

If f is bounded (i.e., f(x) ≤ M for all x) and ∫ f(x) dx < ∞, the univariate slice sampler is uniformly ergodic: there exist constants C > 0 and ρ < 1 such that

d_{\text{TV}}(P^n(x_0, \cdot), \pi) \leq C \rho^n

for all starting points x_0.

Intuition

When f is bounded and integrable, each slice {x : f(x) > y} has finite measure, and from any state there is a fixed minimum probability of drawing a low height y whose slice covers most of the support. This lets the chain reach any part of the state space in a bounded number of steps, giving geometric convergence uniformly in the starting point.

Proof Sketch

Mira and Roberts (2002) show that the slice sampler satisfies a minorization condition when f is bounded. For any starting point, with positive probability the auxiliary variable y falls below a threshold where the slice covers a fixed "regeneration set." From this set, the chain has a fixed probability of reaching any target region, giving the uniform ergodicity bound.

Why It Matters

Uniform ergodicity is a strong mixing guarantee: convergence is geometric and uniform over all starting points. Many Metropolis-Hastings chains are only geometrically ergodic (not uniformly), so the slice sampler has an advantage for bounded targets.

Failure Mode

For unbounded densities (e.g., those with heavy tails or singularities), uniform ergodicity can fail. The chain may get stuck in the tails where the density is very small and the slices are very wide. For heavy-tailed targets, the slice sampler may still be geometrically ergodic, but the rate depends on the tail behavior.

Advantages Over Metropolis-Hastings

  1. No proposal distribution to tune: the width parameter w in stepping-out affects efficiency but not correctness.
  2. No rejected moves: every iteration produces a new state (though the stepping-out and shrinking procedures require multiple density evaluations).
  3. Automatic scale adaptation: the slice width adjusts to the local scale of f.

Common Confusions

Watch Out

Slice sampling still requires density evaluations

Slice sampling is not "free." The stepping-out procedure requires evaluating f at each boundary extension, and the shrinking procedure evaluates f for each rejected proposal. The total number of density evaluations per step depends on the width w and the target geometry. For expensive densities, this cost can dominate.

Watch Out

The width parameter w is not a proposal variance

In Metropolis-Hastings, the proposal variance directly determines the acceptance rate and mixing. In slice sampling, w determines how many steps the stepping-out procedure takes. Too small: many stepping-out evaluations. Too large: many shrinking evaluations. Either way, the chain is correct. Performance degrades gracefully, not catastrophically.
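This trade-off can be observed by instrumenting a toy sampler to count density evaluations per iteration at different widths. A hypothetical experiment on a standard-normal target (the specific counts will vary with the target, but correctness does not):

```python
import math
import random

def run(w, n=2000, seed=0):
    """Average density evaluations per slice-sampling iteration at width w."""
    rng = random.Random(seed)
    evals = 0

    def f(x):
        nonlocal evals
        evals += 1
        return math.exp(-0.5 * x * x)  # unnormalized standard normal

    x = 0.0
    for _ in range(n):
        y = rng.uniform(0.0, f(x))
        # Stepping out: a small w means many extensions here.
        L = x - rng.random() * w
        R = L + w
        while f(L) > y:
            L -= w
        while f(R) > y:
            R += w
        # Shrinking: a large w means many rejected proposals here.
        while True:
            x_new = rng.uniform(L, R)
            if f(x_new) > y:
                x = x_new
                break
            if x_new < x:
                L = x_new
            else:
                R = x_new
    return evals / n

for w in (0.1, 1.0, 10.0):
    print(w, round(run(w), 1))
```

A width far below the target's scale (w = 0.1 here) inflates the evaluation count through stepping out, while an oversized width pays a smaller, logarithmic-like price through shrinking; the samples remain exact draws from the stationary chain in every case.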

Canonical Examples

Example

Slice sampling from a mixture of Gaussians

Target: f(x) = 0.3 φ(x; -2, 1) + 0.7 φ(x; 3, 0.5), where φ(x; μ, σ) is the Gaussian density with mean μ and standard deviation σ. At x_t = -1.5, compute f(x_t) ≈ 0.106. Draw y ~ Uniform(0, 0.106), say y = 0.05. The slice S = {x : f(x) > 0.05} consists of two intervals (one around each mode). The stepping-out procedure finds bounds containing the component around x_t, and whenever the stepped-out interval reaches the other mode, the uniform draw can land there. The chain can therefore jump between modes in a single step, unlike random-walk MH, which must traverse the low-density region between the modes.
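A sketch of this example, checking that the chain visits the two modes in proportion to their mixture weights. The width w = 10 is an assumption chosen so the stepped-out interval usually covers both modes:

```python
import math
import random

random.seed(2)

def phi(x, mu, sigma):
    # Gaussian density with mean mu and standard deviation sigma.
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def f(x):
    # The two-component mixture target from the example.
    return 0.3 * phi(x, -2.0, 1.0) + 0.7 * phi(x, 3.0, 0.5)

def slice_step(x, w=10.0):
    # One slice-sampling update with stepping out and shrinking.
    y = random.uniform(0.0, f(x))
    L = x - random.random() * w
    R = L + w
    while f(L) > y:
        L -= w
    while f(R) > y:
        R += w
    while True:
        x_new = random.uniform(L, R)
        if f(x_new) > y:
            return x_new
        if x_new < x:
            L = x_new
        else:
            R = x_new

x = -1.5
samples = []
for _ in range(50000):
    x = slice_step(x)
    samples.append(x)

# Fraction of samples near the right mode; the mixture weight there is 0.7.
frac_right = sum(1 for s in samples if s > 0.5) / len(samples)
print(round(frac_right, 2))
```

With a sufficiently wide interval the chain switches modes routinely, and the long-run fraction of time spent near x = 3 approaches the 0.7 component weight.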

Exercises

ExerciseCore

Problem

For the uniform distribution f(x) = 1 on [0, 1], describe what the slice sampler does at each step. What is the slice S for any value of the auxiliary variable y?

ExerciseAdvanced

Problem

Consider the target f(x) ∝ exp(-|x|) (a Laplace distribution). At x_t = 2, draw auxiliary y = 0.05. What is the slice S? Approximately how wide is it? Compare this to the slice width at x_t = 0 with the same y value.

References

Canonical:

  • Neal, "Slice Sampling" (2003), Annals of Statistics 31(3), 705-767

Current:

  • Mira & Roberts, "Slice Sampling" (2002), in Highly Structured Stochastic Systems

  • Murray, Adams, MacKay, "Elliptical Slice Sampling" (2010)

  • Gelman et al., Bayesian Data Analysis (2013), Chapters 10-12

  • Brooks et al., Handbook of MCMC (2011), Chapters 1-5

Last reviewed: April 2026
