Beta. Content is under active construction and has not been peer-reviewed. Report errors on GitHub.Disclaimer

Sampling MCMC

Squeezed Rejection Sampling

An optimization of rejection sampling that adds a cheap lower bound (squeeze function) to avoid expensive target density evaluations when the sample clearly falls in the accept or reject region.

CoreTier 3Stable~30 min

Prerequisites

0

Why This Matters

Standard rejection sampling requires evaluating the target density f(x)f(x) for every proposed sample. When ff is expensive to compute (think multivariate posteriors, normalizing constants involving special functions, or densities defined through numerical integration), each evaluation dominates the runtime.

Squeezed rejection sampling reduces the number of expensive evaluations by adding a cheap lower bound. If the proposal clearly falls in the accept region or clearly falls in the reject region, you skip the expensive evaluation entirely.

Mental Model

In standard rejection sampling, you have an envelope Mg(x)f(x)Mg(x) \geq f(x) and you accept a proposal xx with probability f(x)/(Mg(x))f(x) / (Mg(x)). You always need to evaluate f(x)f(x).

Now add a squeeze function s(x)s(x) satisfying s(x)f(x)s(x) \leq f(x) for all xx. Draw xgx \sim g and uUniform(0,Mg(x))u \sim \text{Uniform}(0, Mg(x)). If us(x)u \leq s(x), accept without evaluating ff. If u>Mg(x)u > Mg(x), reject without evaluating ff (this never happens by construction). If s(x)<uMg(x)s(x) < u \leq Mg(x), then and only then evaluate f(x)f(x) and accept if uf(x)u \leq f(x).

The squeeze handles the easy cases. You only pay for ff when the sample lands in the uncertain strip between s(x)s(x) and Mg(x)Mg(x).

Formal Setup

Definition

Squeeze Function

Given a target density f(x)f(x) and an envelope Mg(x)f(x)Mg(x) \geq f(x), a squeeze function is any function s:X[0,)s: \mathcal{X} \to [0, \infty) such that:

0s(x)f(x)Mg(x)for all xX0 \leq s(x) \leq f(x) \leq Mg(x) \quad \text{for all } x \in \mathcal{X}

The squeeze must be cheap to evaluate relative to ff.

Definition

Squeezed Rejection Sampling Algorithm

  1. Draw xg(x)x \sim g(x) from the proposal distribution.
  2. Draw uUniform(0,1)u \sim \text{Uniform}(0, 1).
  3. If us(x)/(Mg(x))u \leq s(x) / (Mg(x)), accept xx (no evaluation of ff).
  4. Else if uf(x)/(Mg(x))u \leq f(x) / (Mg(x)), accept xx (evaluate ff).
  5. Else reject xx (evaluate ff).
  6. Return to step 1.

Main Theorems

Theorem

Correctness of Squeezed Rejection Sampling

Statement

Squeezed rejection sampling produces independent samples distributed exactly according to the target density proportional to f(x)f(x). The acceptance probability is identical to standard rejection sampling: 1/M1/M.

Intuition

The squeeze only determines when you evaluate ff. It never changes the accept/reject decision. When you do evaluate ff, the decision is identical to standard rejection sampling. So the output distribution is unchanged.

Proof Sketch

Partition the probability space into three events for a proposal (x,u)(x, u) where uUniform(0,Mg(x))u \sim \text{Uniform}(0, Mg(x)):

Event A: us(x)u \leq s(x). Accept. Since s(x)f(x)s(x) \leq f(x), we have uf(x)u \leq f(x), so standard rejection sampling would also accept.

Event B: s(x)<uf(x)s(x) < u \leq f(x). Accept after evaluating ff. Same decision as standard.

Event C: u>f(x)u > f(x). Reject after evaluating ff. Same decision as standard.

In all three events, the accept/reject decision matches standard rejection sampling. The squeeze only changes which event triggers the evaluation.

Why It Matters

The fraction of proposals requiring ff-evaluation is:

P(s(x)<uMg(x))=1s(x)dxMg(x)dxP(s(x) < u \leq Mg(x)) = 1 - \frac{\int s(x) \, dx}{\int Mg(x) \, dx}

A tight squeeze (close to ff) means most proposals are resolved cheaply.

Failure Mode

If the squeeze is loose (s(x)0s(x) \approx 0), nearly every proposal falls in the uncertain strip and you evaluate ff anyway. The overhead of computing s(x)s(x) then makes squeezed rejection sampling slower than standard rejection sampling. The squeeze must be both cheap and tight.

Canonical Examples

Example

Squeeze for the normal distribution

Suppose f(x)ex2/2f(x) \propto e^{-x^2/2} and the proposal is a double exponential (Laplace) distribution. A common squeeze for the standard normal is the piecewise linear function that interpolates ff at a grid of points, staying below ff everywhere. For example, between grid points xix_i and xi+1x_{i+1}, the squeeze is the linear function connecting (xi,f(xi))(x_i, f(x_i)) and (xi+1,f(xi+1))(x_{i+1}, f(x_{i+1})). Since ff is log-concave, the linear interpolant stays below ff. With 10-20 grid points, this squeeze resolves over 95% of proposals without evaluating exp(x2/2)\exp(-x^2/2).

Common Confusions

Watch Out

The squeeze does not change the acceptance rate

Adding a squeeze does not accept more proposals. The overall acceptance probability remains 1/M1/M. The squeeze only reduces the number of proposals that require evaluating ff. It trades ff-evaluations for cheaper ss-evaluations.

Watch Out

The squeeze need not be a density

The squeeze s(x)s(x) is not required to integrate to 1 or even to be a proper density. It is just a lower bound on f(x)f(x). Any non-negative function that stays below ff works.

Summary

  • Add a cheap lower bound s(x)f(x)s(x) \leq f(x) to skip expensive evaluations
  • Correctness is immediate: the squeeze never changes the accept/reject decision, only when ff is evaluated
  • Savings depend on the tightness of the squeeze: s(x)dx/Mg(x)dx\int s(x) \, dx / \int Mg(x) \, dx
  • If ff is cheap to evaluate, the overhead of computing ss is not worth it
  • Log-concave densities admit natural piecewise linear squeezes

Exercises

ExerciseCore

Problem

Suppose the squeeze resolves 90% of proposals without evaluating ff, and evaluating ff takes 100 times longer than evaluating ss. What is the approximate speedup of squeezed rejection sampling over standard rejection sampling?

ExerciseAdvanced

Problem

For a log-concave density ff, explain why a piecewise linear interpolant through points (xi,f(xi))(x_i, f(x_i)) lies below ff between grid points. State the precise condition on ff that makes this work.

References

Canonical:

  • Devroye, Non-Uniform Random Variate Generation (1986), Chapter 7
  • von Neumann, "Various Techniques Used in Connection with Random Digits" (1951), original rejection sampling paper

Current:

  • Robert & Casella, Monte Carlo Statistical Methods (2004), Chapter 2.3

  • Gelman et al., Bayesian Data Analysis (2013), Chapters 10-12

  • Brooks et al., Handbook of MCMC (2011), Chapters 1-5

Next Topics

Last reviewed: April 2026

Prerequisites

Foundations this topic depends on.

Next Topics