

Adaptive Rejection Sampling

For log-concave densities, build a piecewise linear upper hull of log f(x) that tightens automatically with each evaluation. Sample from the envelope and accept/reject.



Why This Matters

Standard rejection sampling requires you to find a proposal distribution $g$ and a constant $M$ such that $f(x) \leq M g(x)$ everywhere. For complex target densities, finding a tight envelope is difficult. A loose envelope means most samples are rejected. Adaptive rejection sampling (ARS), introduced by Gilks and Wild (1992), solves this for log-concave densities by automatically constructing and tightening an envelope as samples are drawn. The envelope only improves over time; no hand-tuning is needed. ARS is widely used as a subroutine inside Gibbs samplers for sampling from univariate full conditionals.

Mental Model

If $\log f(x)$ is concave (a dome shape), then tangent lines to $\log f(x)$ lie above the curve. The minimum of several tangent lines forms a piecewise linear upper bound on $\log f(x)$. Exponentiating gives an upper envelope for $f(x)$ itself, which is a piecewise exponential distribution (easy to sample from). Each time you evaluate $f$ at a new point (whether the sample is accepted or rejected), you can add a new tangent line, tightening the envelope. The envelope starts loose and automatically becomes tighter, increasing the acceptance rate over time.

Formal Setup

Let $f(x)$ be the unnormalized target density. Assume $h(x) = \log f(x)$ is concave.

Definition

Log-Concave Density

A density $f(x)$ is log-concave if $h(x) = \log f(x)$ is a concave function: $h(\lambda x + (1-\lambda) y) \geq \lambda h(x) + (1-\lambda) h(y)$ for all $x, y$ and $\lambda \in [0,1]$. In particular, the superlevel sets $\{x : f(x) \geq c\}$ are convex for all $c > 0$ (the converse does not hold in general).

Examples: Gaussian, exponential, logistic, gamma (with shape $\geq 1$), beta (with both parameters $\geq 1$), uniform on a convex set. Non-examples: Student-t, Cauchy, mixture of Gaussians (in general).
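As a quick numerical probe (a sketch, not part of the ARS algorithm itself), one can estimate the second derivative of $\log f$ by finite differences; a strictly positive value anywhere rules out log-concavity:

```python
import math

def second_deriv(h, x, eps=1e-4):
    # Central finite-difference estimate of h''(x)
    return (h(x + eps) - 2 * h(x) + h(x - eps)) / eps**2

h_gauss  = lambda x: -x * x / 2            # log N(0,1) density, up to a constant
h_cauchy = lambda x: -math.log(1 + x * x)  # log Cauchy density, up to a constant

print(second_deriv(h_gauss, 2.0))   # negative: consistent with concavity
print(second_deriv(h_cauchy, 2.0))  # positive in the tails: not log-concave
```

Note that such a check can only rule log-concavity out at the points tested; it cannot prove concavity everywhere.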

Definition

Upper Hull

Given $k$ evaluation points $x_1 < x_2 < \cdots < x_k$ where $h(x_i)$ and $h'(x_i)$ have been computed, the upper hull is:

$$u_k(x) = \min_{i=1,\ldots,k} \left[ h(x_i) + h'(x_i)(x - x_i) \right]$$

This is the minimum of $k$ tangent lines to $h$. Since $h$ is concave, each tangent line lies above $h$, so $u_k(x) \geq h(x)$ for all $x$.

Definition

Lower Hull

The lower hull is the piecewise linear interpolant through the evaluation points:

$$l_k(x) = h(x_i) + \frac{h(x_{i+1}) - h(x_i)}{x_{i+1} - x_i}(x - x_i) \quad \text{for } x \in [x_i, x_{i+1}]$$

By concavity, $l_k(x) \leq h(x)$ for all $x$ in the convex hull of the evaluation points.
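The two hulls sandwich $h$: $l_k(x) \leq h(x) \leq u_k(x)$ between the evaluation points. A minimal sketch verifying this numerically for the standard Gaussian (the point set and function names are illustrative):

```python
h  = lambda x: -x * x / 2   # log of the standard Gaussian, unnormalized
dh = lambda x: -x
pts = [-1.5, -0.3, 0.8, 2.0]  # sorted evaluation points

def upper_hull(x):
    # Minimum of the tangent lines at the evaluation points
    return min(h(p) + dh(p) * (x - p) for p in pts)

def lower_hull(x):
    # Chord interpolant; defined only between consecutive evaluation points
    for a, b in zip(pts, pts[1:]):
        if a <= x <= b:
            return h(a) + (h(b) - h(a)) * (x - a) / (b - a)
    return float("-inf")  # outside the evaluation points

for x in [-1.0, 0.0, 0.5, 1.5]:
    assert lower_hull(x) <= h(x) <= upper_hull(x)
```

Returning $-\infty$ outside the evaluation points makes the squeeze test (below) automatically fail there, which is the conventional choice.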

The ARS Algorithm

  1. Initialize with $k \geq 2$ evaluation points. Compute $h(x_i)$ and $h'(x_i)$ at each point. (If the support is unbounded, the leftmost point must have $h' > 0$ and the rightmost $h' < 0$ so that the envelope is integrable.)
  2. Construct the upper hull $u_k(x)$ and lower hull $l_k(x)$.
  3. Sample $x^* \sim \exp(u_k(x)) / Z_k$ where $Z_k = \int \exp(u_k(x))\, dx$. This is a piecewise exponential distribution: within each segment of $u_k$, the density is exponential, so sampling reduces to choosing a segment (with probability proportional to its area) and sampling from a truncated exponential.
  4. Squeezing test: draw $u \sim \text{Uniform}(0,1)$. If $u \leq \exp(l_k(x^*) - u_k(x^*))$, accept $x^*$ without evaluating $h(x^*)$. This saves a (potentially expensive) function evaluation.
  5. Rejection test: if the squeeze test fails, evaluate $h(x^*)$. If $u \leq \exp(h(x^*) - u_k(x^*))$, accept $x^*$.
  6. Update: whenever $h(x^*)$ was evaluated (i.e., whenever the squeeze test failed), add $x^*$ to the set of evaluation points and update $u_k$ and $l_k$, regardless of whether $x^*$ was accepted.
  7. Return to step 3.
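The steps above can be sketched in code. This is a minimal, unoptimized illustration under simplifying assumptions: the target has unbounded support, the initial points bracket the mode ($h' > 0$ at the leftmost, $h' < 0$ at the rightmost), no tangent slope is exactly zero, and the hulls are rebuilt from scratch each iteration rather than updated incrementally as a real implementation would.

```python
import bisect, math, random

def ars(h, dh, x_init, n, rng=random.Random(0)):
    # h: log of the (unnormalized) target density, assumed concave; dh: its derivative.
    # x_init must bracket the mode: dh > 0 at the leftmost point, dh < 0 at the rightmost.
    xs = sorted(x_init)
    hs = [h(x) for x in xs]
    ds = [dh(x) for x in xs]
    out = []
    while len(out) < n:
        # Tangent j is the line ds[j]*x + bs[j]; z[j] is where tangents j and j+1 cross.
        bs = [hs[j] - ds[j] * xs[j] for j in range(len(xs))]
        z = [(bs[j + 1] - bs[j]) / (ds[j] - ds[j + 1]) for j in range(len(xs) - 1)]
        lo = [-math.inf] + z
        hi = z + [math.inf]

        def exp_at(a, b, t):  # exp(a*t + b), with the infinite endpoints contributing 0
            return 0.0 if math.isinf(t) else math.exp(a * t + b)

        # Mass of exp(tangent_j) over its segment (assumes no slope is exactly zero)
        masses = [(exp_at(ds[j], bs[j], hi[j]) - exp_at(ds[j], bs[j], lo[j])) / ds[j]
                  for j in range(len(xs))]
        # Step 3: choose a segment with probability proportional to its mass,
        # then draw from the truncated exponential on it by inverse CDF.
        r, j = rng.random() * sum(masses), 0
        while j < len(masses) - 1 and r > masses[j]:
            r -= masses[j]
            j += 1
        a, b = ds[j], bs[j]
        x = (math.log(exp_at(a, b, lo[j]) + rng.random() * a * masses[j]) - b) / a
        u_x = min(hs[i] + ds[i] * (x - xs[i]) for i in range(len(xs)))  # upper hull
        i = bisect.bisect(xs, x)
        if 0 < i < len(xs):  # lower hull: chord between the bracketing points
            l_x = hs[i - 1] + (hs[i] - hs[i - 1]) * (x - xs[i - 1]) / (xs[i] - xs[i - 1])
        else:
            l_x = -math.inf
        w = rng.random()
        if w <= math.exp(l_x - u_x):        # step 4: squeeze, no evaluation of h
            out.append(x)
        else:
            hx = h(x)
            if w <= math.exp(hx - u_x):     # step 5: ordinary rejection test
                out.append(x)
            # Step 6: h(x) was evaluated either way, so add x as a new tangent point
            xs.insert(i, x); hs.insert(i, hx); ds.insert(i, dh(x))
    return out

# Draw from the standard Gaussian via its unnormalized log density
samples = ars(lambda x: -x * x / 2, lambda x: -x, [-1.0, 1.0], 5000)
mean = sum(samples) / len(samples)
var = sum(s * s for s in samples) / len(samples) - mean * mean
```

The sample mean and variance should come out close to 0 and 1 respectively, since ARS produces exact (i.i.d.) draws from the target.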

Core Theory

Theorem

Validity of the ARS Envelope

Statement

For any set of evaluation points $\{x_1, \ldots, x_k\}$ in the support of $f$, the upper hull $u_k(x) = \min_i [h(x_i) + h'(x_i)(x - x_i)]$ satisfies $u_k(x) \geq h(x)$ for all $x$ in the support. Therefore $\exp(u_k(x)) \geq f(x)$ everywhere, and rejection sampling using $\exp(u_k(x))/Z_k$ as the proposal is valid (it produces exact samples from $f/\int f$). Furthermore, $u_{k+1}(x) \leq u_k(x)$ for all $x$: adding evaluation points can only tighten the envelope.

Intuition

A concave function lies below all of its tangent lines. The minimum of tangent lines is the tightest piecewise linear upper bound available from the given evaluation points. Adding a new tangent line can only lower the minimum (or leave it unchanged), so the envelope never gets worse.

Proof Sketch

For any $i$ and any $x$: $h(x) \leq h(x_i) + h'(x_i)(x - x_i)$ by the first-order condition for concavity ($h(y) \leq h(z) + h'(z)(y - z)$ for all $y, z$ when $h$ is concave and differentiable). Taking the minimum over $i$ preserves the inequality: $h(x) \leq \min_i [h(x_i) + h'(x_i)(x - x_i)] = u_k(x)$. For monotonicity: $u_{k+1}(x) = \min(u_k(x), h(x_{k+1}) + h'(x_{k+1})(x - x_{k+1})) \leq u_k(x)$.

Why It Matters

This guarantees that ARS produces exact samples from the target (not approximate). The envelope is always valid regardless of where the evaluation points are placed. The automatic tightening means the acceptance rate improves over the course of sampling without any intervention.

Failure Mode

The theorem requires log-concavity. If $h(x)$ is not concave, a tangent line may lie below $h$ at some points, giving $u_k(x) < h(x)$ there. Rejection sampling with an invalid envelope produces biased samples. There is no diagnostic that catches this during sampling; the user must verify log-concavity before applying ARS.

Acceptance Rate Improvement

As $k$ grows, $u_k(x) \to h(x)$ pointwise (assuming the evaluation points become dense). The acceptance probability at step $k$ is:

$$\alpha_k = \frac{\int f(x)\, dx}{\int \exp(u_k(x))\, dx}$$

Since $u_k$ is non-increasing in $k$, $\alpha_k$ is non-decreasing. In practice, ARS often reaches acceptance rates above 90% after 10-20 evaluation points for smooth log-concave densities.
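To see the monotone improvement concretely, here is a small sketch that numerically integrates the envelope mass $Z_k = \int \exp(u_k)$ for nested point sets on the standard Gaussian and reports $\alpha_k$ (a crude trapezoid integral over a truncated range; the point sets are illustrative):

```python
import math

h  = lambda x: -x * x / 2   # log of the standard Gaussian, unnormalized
dh = lambda x: -x

def envelope_mass(pts, lo=-8.0, hi=8.0, n=4000):
    # Trapezoid integral of exp(u_k) over a truncated range
    u = lambda x: min(h(p) + dh(p) * (x - p) for p in pts)
    xs = [lo + (hi - lo) * i / n for i in range(n + 1)]
    ys = [math.exp(u(x)) for x in xs]
    return (hi - lo) / n * (sum(ys) - 0.5 * (ys[0] + ys[-1]))

target_mass = math.sqrt(2 * math.pi)  # exact integral of exp(-x^2/2)
alphas = [target_mass / envelope_mass(pts)
          for pts in ([-1.0, 1.0],
                      [-1.0, 0.3, 1.0],
                      [-2.0, -1.0, 0.3, 1.0, 2.0])]
print([round(a, 3) for a in alphas])  # acceptance rates increase with k
```

Because the point sets are nested, the envelopes can only tighten, so the printed $\alpha_k$ values increase.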

The squeezing test adds further efficiency: when $l_k(x^*)$ is close to $u_k(x^*)$, the squeeze test passes without evaluating $h(x^*)$, saving computation. The fraction of samples accepted by squeezing also increases as $k$ grows.

Limitations

Log-concavity is required. Many important distributions are not log-concave: Student-t (heavy tails), mixtures of Gaussians, posterior distributions with multimodal structure. For these, ARS cannot be applied directly.

Univariate only (in the basic form). The piecewise linear envelope construction works naturally in one dimension. Extensions to multiple dimensions exist (e.g., adaptive rejection Metropolis sampling, ARMS) but lose the guarantee of exact sampling, instead using the envelope as a Metropolis-Hastings proposal.

Derivative required. The basic ARS algorithm needs $h'(x)$ at the evaluation points. Derivative-free variants exist (using secant lines in place of tangent lines) but have slightly looser envelopes.
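A small illustration of the secant idea (a sketch; the key fact is that, for concave $h$, the chord through two points lies below $h$ inside the interval between them but above $h$ outside it, so extended secants can serve as envelope pieces without derivatives):

```python
h = lambda x: -x * x / 2          # concave log density (standard Gaussian)
x0, x1 = -1.0, 1.0
slope = (h(x1) - h(x0)) / (x1 - x0)
secant = lambda x: h(x0) + slope * (x - x0)

assert secant(0.0) <= h(0.0)   # inside [x0, x1]: secant lies below h
assert secant(2.5) >= h(2.5)   # outside: secant lies above h, usable as an envelope piece
```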

Common Confusions

Watch Out

ARS is not MCMC

ARS produces independent, exact samples from the target distribution. It is not a Markov chain; there is no burn-in period and no autocorrelation between samples. It is a rejection sampler with an automatically improving proposal. The samples are i.i.d. from the target once accepted.

Watch Out

Tightening the envelope does not invalidate previous samples

When a new tangent line is added, the envelope changes for future proposals, but all previously accepted samples remain valid. Each accepted sample passed a valid rejection test at the time it was drawn, and the envelope was a valid upper bound at that time. The evolving envelope does not retroactively affect past samples.

Key Takeaways

  • ARS builds a piecewise linear upper hull of $\log f(x)$ using tangent lines at evaluated points
  • The envelope is always valid (it stays above $h(x)$) by concavity, guaranteeing exact sampling
  • Each new evaluation tightens the envelope; acceptance rates improve monotonically
  • The squeezing test avoids expensive function evaluations when the lower hull is close to the upper hull
  • Log-concavity is a hard requirement; the method silently produces biased samples without it
  • ARS is widely used inside Gibbs samplers for univariate full conditionals

Exercises

ExerciseCore

Problem

Consider $h(x) = -x^2/2$ (the log of a standard Gaussian). You have two evaluation points: $x_1 = -1$ with $h(-1) = -0.5$, $h'(-1) = 1$, and $x_2 = 1$ with $h(1) = -0.5$, $h'(1) = -1$. Write the upper hull $u_2(x)$ and compute $u_2(0)$. How does this compare to $h(0) = 0$?

ExerciseAdvanced

Problem

Prove that the gamma distribution with shape parameter $\alpha < 1$ is not log-concave, and therefore ARS cannot be applied directly. Then describe how to handle this case by a change of variables.

References

Canonical:

  • Gilks and Wild, Adaptive Rejection Sampling for Gibbs Sampling, Applied Statistics (1992)
  • Gilks, Derivative-free Adaptive Rejection Sampling for Gibbs Sampling, Bayesian Statistics 4 (1992)

Current:

  • Robert and Casella, Monte Carlo Statistical Methods (2004), Section 2.3.4
  • Martino and Miguez, Generalized Adaptive Rejection Sampling Schemes, Signal Processing (2011)
  • Gelman et al., Bayesian Data Analysis (2013), Chapters 10-12
  • Brooks et al., Handbook of Markov Chain Monte Carlo (2011), Chapters 1-5

Next Topics

  • Gibbs sampling (where ARS is often used for full conditionals)
  • Adaptive rejection Metropolis sampling for non-log-concave targets

Last reviewed: April 2026
