
Slice Sampling

Slice sampling draws from a target distribution by sampling uniformly from the region under its density curve. By introducing an auxiliary variable, it avoids the proposal-distribution tuning that random-walk Metropolis-Hastings requires.


Why This Matters

Random-walk Metropolis-Hastings requires choosing a proposal distribution, and performance is sensitive to this choice. A proposal that is too narrow gives high acceptance but slow exploration. A proposal that is too wide gives low acceptance and wastes computation. Slice sampling sidesteps this: it introduces an auxiliary variable and samples uniformly from the region under the density curve, automatically adapting to the local scale of the target.

The algorithm has no tuning parameters in its idealized form. In practice, the "stepping-out" procedure introduces a width parameter, but performance is robust to this choice.

The Idea

To sample from a density f(x) (known up to a normalizing constant), the key observation is:

X \sim f \iff (X, Y) \text{ is uniform on } \{(x, y) : 0 \leq y \leq f(x)\}

If you can sample uniformly from the region under the curve of f, the x-marginal is exactly f.
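This equivalence can be checked numerically with simple rejection sampling under an enclosing box. A minimal sketch, assuming an illustrative target (the Beta(2,2) density 6x(1-x) on [0, 1], with upper bound M = 1.5):

```python
import random

random.seed(0)

def f(x):
    # Assumed example target: Beta(2,2) density 6x(1-x) on [0, 1].
    return 6.0 * x * (1.0 - x)

M = 1.5  # upper bound on f over [0, 1]

# Sample points uniformly from the box [0, 1] x [0, M];
# keep the x-coordinate of those that fall under the curve.
xs = []
while len(xs) < 20000:
    x, y = random.random(), random.uniform(0.0, M)
    if y <= f(x):
        xs.append(x)

# The x-marginal of the kept points should match Beta(2,2), whose mean is 0.5.
mean = sum(xs) / len(xs)
print(round(mean, 2))
```

The kept x-values are distributed according to f, with no knowledge of f's normalizing constant required.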

The Algorithm

Definition

Slice Sampler

Given current state x_t and (unnormalized) target density f:

  1. Draw auxiliary variable: sample y_t ~ Uniform(0, f(x_t)).
  2. Define the slice: S = {x : f(x) > y_t}.
  3. Sample from slice: draw x_{t+1} uniformly from S.

This defines a Markov chain (x_0, x_1, x_2, …) with stationary distribution proportional to f.

Step 3 is the hard part. The slice S can be a complicated, disconnected set. In one dimension, the "stepping-out" and "shrinking" procedure makes this practical.

Stepping-out procedure (univariate):

  1. Start with an interval [L, R] of initial width w around x_t.
  2. Randomly position the interval: L = x_t - U · w, R = L + w, where U ~ Uniform(0, 1).
  3. Step out: while f(L) > y_t, set L = L - w. While f(R) > y_t, set R = R + w.
  4. The interval [L, R] now contains the slice (at least the connected component containing x_t).
  5. Sample x' ~ Uniform(L, R). If f(x') > y_t, accept x_{t+1} = x'. Otherwise, shrink the interval (replace L or R with x', depending on which side of x_t it falls) and repeat.

The shrinking step guarantees that the algorithm eventually finds a point in the slice.
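The full univariate procedure fits in a few lines. This is a minimal sketch, not Neal's reference implementation; the standard-normal target and w = 1 are assumptions chosen for the demo:

```python
import math
import random

random.seed(1)

def slice_step(x, f, w=1.0):
    """One univariate slice-sampling update from state x for unnormalized density f."""
    # 1. Auxiliary variable: uniform height under the curve at x.
    y = random.uniform(0.0, f(x))
    # 2. Stepping out: randomly position an interval of width w around x,
    #    then extend each end until it lies outside the slice {x' : f(x') > y}.
    L = x - random.random() * w
    R = L + w
    while f(L) > y:
        L -= w
    while f(R) > y:
        R += w
    # 3. Shrinking: propose uniformly in [L, R]; on rejection, shrink toward x.
    while True:
        x_new = random.uniform(L, R)
        if f(x_new) > y:
            return x_new
        if x_new < x:
            L = x_new
        else:
            R = x_new

def target(x):
    # Unnormalized standard normal density.
    return math.exp(-0.5 * x * x)

x, samples = 0.0, []
for _ in range(20000):
    x = slice_step(x, target)
    samples.append(x)

mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(round(mean, 2), round(var, 2))
```

For the standard normal, the sample mean and variance should land near 0 and 1. Note that the shrinking loop always terminates: every rejection moves one endpoint strictly closer to x_t, and f(x_t) > y by construction.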

Correctness

Theorem

Slice Sampler Preserves the Target Distribution

Statement

The joint distribution that is uniform on {(x, y) : 0 ≤ y ≤ f(x)} is invariant under the slice sampling Markov chain. The marginal distribution of x is proportional to f(x).

Intuition

Each conditional distribution is correct by construction. Given x, y is uniform on [0, f(x)]. Given y, x is uniform on the slice {x : f(x) > y}. Since each Gibbs step preserves the joint, the overall chain preserves it.

Proof Sketch

The slice sampler is a Gibbs sampler on the joint space (x, y). The joint density is p(x, y) ∝ 1[0 ≤ y ≤ f(x)], which is the uniform distribution on the subgraph of f. The conditionals p(y | x) = Uniform(0, f(x)) and p(x | y) = Uniform({x : f(x) > y}) are both exact. Each Gibbs step preserves the joint, and integrating out the auxiliary variable gives ∫₀^{f(x)} dy = f(x), so the x-marginal is proportional to f.
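The marginalization step at the end of the sketch can be written out explicitly:

```latex
p(x) \;\propto\; \int_0^{\infty} \mathbf{1}[0 \leq y \leq f(x)]\, dy
\;=\; \int_0^{f(x)} dy
\;=\; f(x)
```

so the x-marginal of the invariant joint distribution is proportional to f, as claimed.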

Why It Matters

This shows that slice sampling is exact (no approximation error in the stationary distribution), unlike methods that truncate or approximate the target. The only source of error is finite-time bias from not having mixed, which is common to all MCMC methods.

Failure Mode

The proof assumes exact uniform sampling from the slice. The stepping-out procedure finds a superset of the slice and uses rejection within it, which is exact. But in multivariate settings, the slice can be a complicated non-convex region, and uniform sampling becomes hard. Multivariate slice sampling typically updates one coordinate at a time (like Gibbs), which can be slow for correlated targets.

Theorem

Uniform Ergodicity of the Slice Sampler

Statement

If f is bounded (i.e., f(x) ≤ M for all x) and ∫ f(x) dx < ∞, the univariate slice sampler is uniformly ergodic: there exist constants C > 0 and ρ < 1 such that

d_{\text{TV}}(P^n(x_0, \cdot), \pi) \leq C \rho^n

for all starting points x_0.

Intuition

When f is bounded and integrable, each slice {x : f(x) > y} has finite measure, and from any state there is a fixed minimum probability of drawing a low height y whose slice covers most of the support. This lets the chain reach any part of the state space in a bounded number of steps, giving geometric convergence uniformly in the starting point.

Proof Sketch

Mira and Roberts (2002) show that the slice sampler satisfies a minorization condition when f is bounded. For any starting point, with positive probability the auxiliary variable y falls below a threshold where the slice covers a fixed "regeneration set." From this set, the chain has a fixed probability of reaching any target region, giving the uniform ergodicity bound.

Why It Matters

Uniform ergodicity is a strong mixing guarantee: convergence is geometric and uniform over all starting points. Many Metropolis-Hastings chains are only geometrically ergodic (not uniformly), so the slice sampler has an advantage for bounded targets.

Failure Mode

For unbounded densities (e.g., those with heavy tails or singularities), uniform ergodicity can fail. The chain may get stuck in the tails where the density is very small and the slices are very wide. For heavy-tailed targets, the slice sampler may still be geometrically ergodic, but the rate depends on the tail behavior.

Advantages Over Metropolis-Hastings

  1. No proposal distribution to tune: the width parameter w in stepping-out affects efficiency but not correctness.
  2. No rejected moves: every iteration produces a new state (though the stepping-out and shrinking procedures require multiple density evaluations).
  3. Automatic scale adaptation: the slice width adjusts to the local scale of f.

Common Confusions

Watch Out

Slice sampling still requires density evaluations

Slice sampling is not "free." The stepping-out procedure requires evaluating f at each boundary extension, and the shrinking procedure evaluates f for each rejected proposal. The total number of density evaluations per step depends on the width w and the target geometry. For expensive densities, this cost can dominate.

Watch Out

The width parameter w is not a proposal variance

In Metropolis-Hastings, the proposal variance directly determines the acceptance rate and mixing. In slice sampling, w determines how many steps the stepping-out procedure takes. Too small: many stepping-out evaluations. Too large: many shrinking evaluations. Either way, the chain is correct. Performance degrades gracefully, not catastrophically.
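This trade-off can be observed by instrumenting a toy sampler to count density evaluations per iteration at different widths. A hypothetical experiment on a standard-normal target (the specific counts will vary with the target, but correctness does not):

```python
import math
import random

def run(w, n=2000, seed=0):
    """Average density evaluations per slice-sampling iteration at width w."""
    rng = random.Random(seed)
    evals = 0

    def f(x):
        nonlocal evals
        evals += 1
        return math.exp(-0.5 * x * x)  # unnormalized standard normal

    x = 0.0
    for _ in range(n):
        y = rng.uniform(0.0, f(x))
        # Stepping out: a small w means many extensions here.
        L = x - rng.random() * w
        R = L + w
        while f(L) > y:
            L -= w
        while f(R) > y:
            R += w
        # Shrinking: a large w means many rejected proposals here.
        while True:
            x_new = rng.uniform(L, R)
            if f(x_new) > y:
                x = x_new
                break
            if x_new < x:
                L = x_new
            else:
                R = x_new
    return evals / n

for w in (0.1, 1.0, 10.0):
    print(w, round(run(w), 1))
```

A width far below the target's scale (w = 0.1 here) inflates the evaluation count through stepping out, while an oversized width pays a smaller, logarithmic-like price through shrinking; the samples remain exact draws from the stationary chain in every case.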

Canonical Examples

Example

Slice sampling from a mixture of Gaussians

Target: f(x) = 0.3 φ(x; -2, 1) + 0.7 φ(x; 3, 0.5), where φ(x; μ, σ) is the Gaussian density with mean μ and standard deviation σ. At x_t = -1.5, compute f(x_t) ≈ 0.106. Draw y ~ Uniform(0, 0.106), say y = 0.05. The slice S = {x : f(x) > 0.05} consists of two intervals (one around each mode). The stepping-out procedure finds bounds containing the component around x_t, and whenever the stepped-out interval reaches the other mode, the uniform draw can land there. The chain can therefore jump between modes in a single step, unlike random-walk MH, which must traverse the low-density region between the modes.
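A sketch of this example, checking that the chain visits the two modes in proportion to their mixture weights. The width w = 10 is an assumption chosen so the stepped-out interval usually covers both modes:

```python
import math
import random

random.seed(2)

def phi(x, mu, sigma):
    # Gaussian density with mean mu and standard deviation sigma.
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def f(x):
    # The two-component mixture target from the example.
    return 0.3 * phi(x, -2.0, 1.0) + 0.7 * phi(x, 3.0, 0.5)

def slice_step(x, w=10.0):
    # One slice-sampling update with stepping out and shrinking.
    y = random.uniform(0.0, f(x))
    L = x - random.random() * w
    R = L + w
    while f(L) > y:
        L -= w
    while f(R) > y:
        R += w
    while True:
        x_new = random.uniform(L, R)
        if f(x_new) > y:
            return x_new
        if x_new < x:
            L = x_new
        else:
            R = x_new

x = -1.5
samples = []
for _ in range(50000):
    x = slice_step(x)
    samples.append(x)

# Fraction of samples near the right mode; the mixture weight there is 0.7.
frac_right = sum(1 for s in samples if s > 0.5) / len(samples)
print(round(frac_right, 2))
```

With a sufficiently wide interval the chain switches modes routinely, and the long-run fraction of time spent near x = 3 approaches the 0.7 component weight.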

Exercises

ExerciseCore

Problem

For the uniform distribution f(x) = 1 on [0, 1], describe what the slice sampler does at each step. What is the slice S for any value of the auxiliary variable y?

ExerciseAdvanced

Problem

Consider the target f(x) ∝ exp(-|x|) (a Laplace distribution). At x_t = 2, draw auxiliary y = 0.05. What is the slice S? Approximately how wide is it? Compare this to the slice width at x_t = 0 with the same y value.

References

Canonical:

  • Neal, "Slice Sampling" (2003), Annals of Statistics 31(3), 705-767

Current:

  • Mira & Roberts, "Slice Sampling" (2002), in Highly Structured Stochastic Systems

  • Murray, Adams, MacKay, "Elliptical Slice Sampling" (2010)

  • Gelman et al., Bayesian Data Analysis (2013), Chapters 10-12

  • Brooks et al., Handbook of MCMC (2011), Chapters 1-5

Last reviewed: April 2026
