Rejection Sampling
The simplest exact sampling method: propose from an envelope distribution and accept or reject to produce exact, independent draws from a target; simple, but doomed to fail in high dimensions.
Why This Matters
Rejection sampling is the conceptually simplest method for generating exact samples from a distribution that you can evaluate (up to a normalizing constant) but cannot directly sample from. Unlike MCMC methods, rejection sampling produces independent samples: no burn-in, no autocorrelation, no convergence diagnostics needed.
It is the first sampling algorithm you should learn, and understanding why it fails in high dimensions motivates the entire field of MCMC. Rejection sampling also appears as a building block inside more advanced algorithms: adaptive rejection sampling (ARS), slice sampling, and perfect simulation.
Mental Model
You want to throw darts at a dartboard shaped like your target distribution $f(x)$. But the shape is complicated, so instead you throw darts uniformly at a larger, simpler region $M g(x)$ that completely covers $f$. Any dart that lands below the curve $f(x)$ is kept; any that lands above is thrown away. The kept darts are distributed exactly according to $f$.
The efficiency depends on how tight the envelope is. If the envelope $M g(x)$ is much larger than $f(x)$, you throw away most darts.
Formal Setup and Notation
Let $f(x)$ be the target density (possibly unnormalized). Let $g(x)$ be a proposal density from which we can sample, and let $M$ be a constant such that $f(x) \le M g(x)$ for all $x$.
Envelope Function
The envelope function $e(x) = M g(x)$ is a scaled version of the proposal density that dominates the target everywhere:
$$f(x) \le M g(x) \quad \text{for all } x.$$
The constant must satisfy $M \ge \sup_x f(x)/g(x)$. The tightest envelope uses $M = \sup_x f(x)/g(x)$, which maximizes the acceptance probability.
Rejection Sampling Algorithm
To generate a sample from $f$:
- Propose: Draw $x \sim g$
- Accept/Reject: Draw $u \sim \text{Uniform}(0,1)$. If $u \le \frac{f(x)}{M g(x)}$, accept $x$; otherwise reject and return to step 1.
- Output the accepted $x$.
The accepted samples are exact, independent draws from .
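The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not part of the text above: the half-normal target, the Exponential(1) proposal, and the constant $M = e^{1/2}$ are choices made here for the demo.

```python
import math
import random

def rejection_sample(f, sample_g, g, M, n):
    """Draw n independent samples from the (possibly unnormalized) density f.

    f        : target density, evaluable pointwise (normalization not needed)
    sample_g : draws one sample from the proposal g
    g        : proposal density, evaluable pointwise
    M        : envelope constant with f(x) <= M * g(x) for all x
    """
    samples = []
    while len(samples) < n:
        x = sample_g()                      # step 1: propose x ~ g
        u = random.random()                 # step 2: u ~ Uniform(0, 1)
        if u <= f(x) / (M * g(x)):          # accept with probability f/(M g)
            samples.append(x)               # step 3: keep the accepted draw
    return samples

# Demo: half-normal target f(x) = exp(-x^2/2) on [0, inf), proposal
# g = Exponential(1); then f/g = exp(1/2 - (x-1)^2/2), so M = exp(1/2).
f = lambda x: math.exp(-x * x / 2)
g = lambda x: math.exp(-x)
M = math.exp(0.5)
draws = rejection_sample(f, lambda: random.expovariate(1.0), g, M, 10_000)
```

The sampler never needs the normalizing constant of `f`; the constant cancels into `M`, exactly as described later in this note.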
Acceptance Probability
The probability that a proposed sample is accepted is:
$$P(\text{accept}) = \int \frac{f(x)}{M g(x)}\, g(x)\, dx = \frac{1}{M} \int f(x)\, dx.$$
If $f$ is normalized (integrates to 1), then $P(\text{accept}) = 1/M$. The expected number of proposals per accepted sample is $1/P(\text{accept})$, which equals $M$ when $f$ is normalized.
Main Theorems
Correctness of Rejection Sampling
Statement
Samples accepted by the rejection sampling algorithm are distributed exactly according to the target distribution $f$. That is, for any measurable set $A$:
$$P(X \in A \mid \text{accept}) = \frac{\int_A f(x)\, dx}{\int f(x)\, dx}.$$
Intuition
The joint distribution of $(x, y)$, where $x \sim g$ and $y = u \cdot M g(x)$ with $u \sim \text{Uniform}(0,1)$, is uniform under the curve $M g(x)$. Accepting only points where $y \le f(x)$ is equivalent to restricting to the region under $f$. Points uniform under $f$ have $x$-marginal proportional to $f$.
Proof Sketch
For any set $A$:
$$P(X \in A \mid \text{accept}) = \frac{P(X \in A \text{ and accept})}{P(\text{accept})}.$$
The numerator is:
$$\int_A g(x)\, \frac{f(x)}{M g(x)}\, dx = \frac{1}{M} \int_A f(x)\, dx.$$
The denominator is $\frac{1}{M} \int f(x)\, dx$.
Dividing: $P(X \in A \mid \text{accept}) = \int_A f(x)\, dx \big/ \int f(x)\, dx$, which is exactly the normalized target probability of $A$.
Why It Matters
This guarantees that rejection sampling produces exact samples from the target: no approximation, no asymptotic arguments, no burn-in. Each accepted sample is an independent draw from $f$. This is a stronger guarantee than any MCMC method provides (MCMC only converges to the target asymptotically).
Failure Mode
The theorem guarantees correctness but says nothing about efficiency. If $M$ is very large, the acceptance rate is tiny and the algorithm is impractical. The fundamental challenge is finding a tight envelope.
Dimensional Scaling of Rejection Sampling
Statement
For "comparable" $d$-dimensional distributions (e.g., target and proposal are both Gaussian with different means or covariances), the envelope constant grows exponentially in $d$:
$$M \ge e^{c d}$$
for some constant $c > 0$. Consequently, the acceptance rate $1/M$ decreases exponentially with dimension.
Intuition
In high dimensions, probability mass concentrates on thin shells. Even if two distributions are similar in shape, their mass is concentrated on slightly different shells. The envelope must cover both shells, but the overlap between the two shells shrinks exponentially with dimension. The result is that must be exponentially large to maintain the dominance condition everywhere.
Proof Sketch
Consider $f = \mathcal{N}(0, \sigma_f^2 I_d)$ and $g = \mathcal{N}(0, \sigma_g^2 I_d)$ with $\sigma_g > \sigma_f$. Then:
$$\frac{f(x)}{g(x)} = \left(\frac{\sigma_g}{\sigma_f}\right)^{d} \exp\!\left(-\frac{\|x\|^2}{2}\left(\frac{1}{\sigma_f^2} - \frac{1}{\sigma_g^2}\right)\right).$$
The supremum is achieved at $x = 0$, giving $M = (\sigma_g/\sigma_f)^d$. Since $\sigma_g/\sigma_f > 1$, this is exponential in $d$. The acceptance rate is $1/M = (\sigma_f/\sigma_g)^d$.
Even for $\sigma_g/\sigma_f = 1.1$, in $d = 100$ dimensions: $M = 1.1^{100} \approx 13{,}780$, an acceptance rate of about $7 \times 10^{-5}$.
Why It Matters
This is the reason rejection sampling is impractical in high dimensions and the motivation for MCMC methods. MCMC avoids the need for a global envelope by making local moves that are always in roughly the right neighborhood. Understanding this dimensional failure is essential background for appreciating why MCMC was such a breakthrough.
Failure Mode
The exponential scaling is not a consequence of poor proposal choice. It is inherent to the accept-reject paradigm in high dimensions. No clever choice of $g$ can avoid it when the target lives in many dimensions. This is why rejection sampling is useful primarily for low-dimensional problems (say, $d \lesssim 10$).
Squeezed Rejection Sampling
When evaluating $f$ is expensive, you can add a lower bound (squeezing function) $s(x) \le f(x)$ that is cheap to evaluate:
- Propose $x \sim g$ and draw $y \sim \text{Uniform}(0, M g(x))$
- If $y \le s(x)$: accept immediately (cheap; $f$ was not evaluated)
- If $y > M g(x)$: reject immediately (never happens by construction)
- If $s(x) < y \le f(x)$: accept (requires evaluating $f$)
- If $f(x) < y \le M g(x)$: reject (requires evaluating $f$)
The squeeze saves expensive evaluations of $f$ for samples that would clearly be accepted. This is useful when $f$ involves a complex likelihood calculation.
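A minimal Python sketch of the squeeze idea, under assumptions chosen for this illustration: the standard-normal kernel plays the role of the "expensive" target, the proposal is Cauchy, and the cheap squeeze comes from the elementary bound $e^{-t} \ge 1 - t$.

```python
import math
import random

def squeezed_sample(n):
    """Squeezed rejection sampling for the standard-normal kernel
    f(x) = exp(-x^2/2), using a Cauchy proposal and the cheap lower
    bound (squeeze) s(x) = max(0, 1 - x^2/2) <= f(x)."""
    g = lambda x: 1.0 / (math.pi * (1.0 + x * x))    # Cauchy density
    M = 2.0 * math.pi * math.exp(-0.5)               # sup f/g, attained at |x| = 1
    samples, f_evals = [], 0
    while len(samples) < n:
        x = math.tan(math.pi * (random.random() - 0.5))  # x ~ Cauchy
        y = random.random() * M * g(x)                   # y ~ Uniform(0, M g(x))
        s = max(0.0, 1.0 - x * x / 2.0)                  # cheap squeeze s(x)
        if y <= s:                         # accept without touching f
            samples.append(x)
            continue
        f_evals += 1                       # only now pay for evaluating f
        if y <= math.exp(-x * x / 2.0):    # usual accept/reject against f
            samples.append(x)
    return samples, f_evals

draws, n_f = squeezed_sample(5_000)
```

The returned `n_f` counts how often $f$ actually had to be evaluated; proposals accepted via the squeeze never touch it.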
Adaptive Rejection Sampling (ARS)
For log-concave densities, there is a powerful refinement that avoids the problem of choosing the envelope manually.
Adaptive Rejection Sampling
When $\log f(x)$ is concave (i.e., $f$ is log-concave), the tangent lines to $\log f$ at a set of evaluation points form a piecewise-linear upper bound on $\log f$. Exponentiating gives an envelope for $f$ that is piecewise exponential and can be sampled exactly.
Algorithm:
- Start with a set of abscissae $x_1 < x_2 < \cdots < x_k$
- Construct the piecewise-linear upper hull of $\log f$ using tangent lines at each $x_i$
- Exponentiate to get the piecewise-exponential envelope
- Sample from the piecewise-exponential envelope
- Accept or reject as usual
- If rejected, add the rejected point to the abscissae and update the hull
Key property: the envelope gets tighter with each rejection, so the acceptance rate improves over time, converging to 1.
Log-concavity holds for many standard distributions: normal, exponential, gamma (with shape $\alpha \ge 1$), beta (with $\alpha, \beta \ge 1$), and logistic. ARS is the method of choice for sampling from univariate log-concave distributions.
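The geometric fact ARS rests on, that tangents to a concave $\log f$ form a piecewise-linear upper hull, can be checked numerically. The sketch below builds the hull for the standard-normal kernel at three arbitrary abscissae; sampling from the resulting piecewise-exponential envelope is omitted.

```python
def tangent_upper_hull(log_f, dlog_f, abscissae):
    """Return u(x) = min over tangent lines to log_f at the abscissae.
    For concave log_f each tangent lies above the curve, so u is a
    piecewise-linear upper hull of log_f."""
    # each tangent at a: line with slope dlog_f(a) through (a, log_f(a))
    lines = [(dlog_f(a), log_f(a) - dlog_f(a) * a) for a in abscissae]
    return lambda x: min(m * x + b for m, b in lines)

# Standard-normal kernel: log f(x) = -x^2/2 is concave.
log_f = lambda x: -x * x / 2
dlog_f = lambda x: -x
hull = tangent_upper_hull(log_f, dlog_f, [-2.0, -0.5, 1.0])

# The hull dominates log f everywhere, so exp(hull) is a valid envelope
# for f; adding an abscissa at any rejected point only tightens it.
xs = [i / 10 for i in range(-50, 51)]
gap = max(hull(x) - log_f(x) for x in xs)   # worst-case envelope slack
```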
Canonical Examples
Sampling Beta(2,5) using a uniform envelope
Target: $f(x) = x(1-x)^4$ on $[0,1]$ (unnormalized Beta(2,5) density).
Proposal: $g(x) = 1$ on $[0,1]$ (Uniform(0,1)).
Find $M$: We need $M = \sup_{x \in [0,1]} x(1-x)^4$. Setting the derivative $(1-x)^4 - 4x(1-x)^3 = (1-x)^3(1-5x)$ to zero gives the maximum at $x = 1/5$.
$M = (1/5)(4/5)^4 = 256/3125 \approx 0.0819$.
Acceptance rate: $\frac{1}{M}\int_0^1 f(x)\, dx = \frac{B(2,5)}{M} = \frac{1/30}{0.0819} \approx 0.407$.
So about 40.7% of proposals are accepted. The algorithm:
- Draw $x \sim \text{Uniform}(0,1)$
- Draw $u \sim \text{Uniform}(0,1)$
- If $u \le x(1-x)^4 / M$, accept $x$; else reject
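This worked example translates directly into code; the short Python run below also confirms the $\approx 40.7\%$ acceptance rate empirically.

```python
import random

# Rejection sampler for Beta(2,5) with a Uniform(0,1) proposal, following
# the derivation above: f(x) = x(1-x)^4, M = f(1/5) = 256/3125.
M = 256 / 3125
accepted, proposals = [], 0
while len(accepted) < 20_000:
    x = random.random()                 # x ~ Uniform(0,1)
    u = random.random()                 # u ~ Uniform(0,1)
    proposals += 1
    if u <= x * (1 - x) ** 4 / M:       # accept with probability f(x)/M
        accepted.append(x)

rate = len(accepted) / proposals        # should be near B(2,5)/M ~ 0.407
mean = sum(accepted) / len(accepted)    # Beta(2,5) mean is 2/7 ~ 0.286
```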
Sampling from a truncated normal
Target: $f(x) = e^{-x^2/2}$ for $x > 2$ (right tail of the standard normal, unnormalized).
Proposal: $g(x) = 2 e^{-2(x-2)}$ for $x > 2$ (Exponential with rate 2, shifted to start at 2).
Envelope constant: $M = \sup_{x > 2} \frac{e^{-x^2/2}}{2 e^{-2(x-2)}} = \frac{1}{2} e^{-2} \sup_{x > 2} e^{-(x-2)^2/2}$.
The exponent $-(x-2)^2/2$ is maximized at $x = 2$, where it equals 0. So $M = \frac{1}{2} e^{-2} \approx 0.0677$.
Acceptance probability at each point: $\frac{f(x)}{M g(x)} = e^{-(x-2)^2/2}$.
Average acceptance rate: $\frac{1}{M} \int_2^\infty e^{-x^2/2}\, dx = \frac{\sqrt{2\pi}\,(1 - \Phi(2))}{e^{-2}/2} \approx 0.84$.
This is reasonably efficient because the exponential proposal matches the tail behavior of the normal well.
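A short Python version of this tail sampler (the function name and defaults are ours), which also checks the $\approx 0.84$ average acceptance rate empirically:

```python
import math
import random

def truncated_normal_tail(n, a=2.0, rate=2.0):
    """Sample the standard-normal right tail x > a by rejection, using a
    shifted Exponential(rate) proposal. With a = rate = 2, the pointwise
    acceptance probability f(x)/(M g(x)) simplifies to exp(-(x-a)^2/2)."""
    samples, proposals = [], 0
    while len(samples) < n:
        x = a + random.expovariate(rate)              # x ~ a + Exponential(rate)
        proposals += 1
        if random.random() <= math.exp(-(x - a) ** 2 / 2):
            samples.append(x)
    return samples, proposals

draws, n_prop = truncated_normal_tail(10_000)
rate = len(draws) / n_prop            # empirical acceptance rate, near 0.84
```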
Why rejection sampling fails for 100-dimensional Gaussian
Target: $f = \mathcal{N}(0, I_{100})$. Proposal: $g = \mathcal{N}(0, 1.1^2 I_{100})$.
The proposal is only 10% wider in each dimension, seemingly a good match. But $M = 1.1^{100} \approx 13{,}780$. The acceptance rate is $1/M \approx 7 \times 10^{-5}$: you would need about 14,000 proposals for each accepted sample.
With $d = 1000$: $M = 1.1^{1000} \approx 2.5 \times 10^{41}$. Completely impractical.
This is not a failure of the proposal choice. It is inherent to rejection sampling in high dimensions.
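The collapse is easy to tabulate; this snippet evaluates the acceptance rate $1/M = (1/1.1)^d$ from the Gaussian example for a few dimensions:

```python
# Acceptance rate 1/M = (sigma_f / sigma_g)^d for a target N(0, I_d)
# and a proposal that is 10% wider in every dimension (sigma_g = 1.1).
sigma_ratio = 1.0 / 1.1
for d in (1, 10, 100, 1000):
    accept = sigma_ratio ** d          # = 1 / M = 1.1^{-d}
    print(f"d = {d:5d}: acceptance rate = {accept:.3e}")
```

For $d = 100$ this reproduces the $\approx 7 \times 10^{-5}$ rate above; by $d = 1000$ the rate is below $10^{-40}$.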
Common Confusions
Rejection sampling requires a dominating envelope, not just overlap
A common error is choosing $M$ such that $f(x) \le M g(x)$ holds on the high-probability region of $f$ but failing to ensure it everywhere. If the envelope does not dominate, the algorithm produces biased samples: regions where $f(x) > M g(x)$ are under-represented. You must verify the dominance condition mathematically or numerically before running the algorithm.
Rejected samples tell you nothing about the target
Unlike in Metropolis-Hastings (where the current state is repeated on rejection), in rejection sampling, rejected proposals are simply discarded. They do not contribute to the sample in any way. Only accepted samples are used. This means the efficiency is entirely determined by the acceptance rate.
Rejection sampling produces independent samples
This is a feature, not a bug, and a key distinction from MCMC. Each accepted sample is independent of all others. There is no autocorrelation, no burn-in, no need for convergence diagnostics. If you can afford the acceptance rate, rejection sampling is strictly preferable to MCMC for producing independent samples.
You do NOT need the normalizing constant of f
Just like MH, rejection sampling works with unnormalized targets. If $f(x) = c\,\tilde f(x)$ for unknown $c$, the acceptance ratio $\tilde f(x) / (M g(x))$ still produces samples from $f$. The constant is absorbed into $M$: you find $M$ such that $\tilde f(x) \le M g(x)$ for all $x$, and the acceptance condition does not require knowing $c$ separately.
Summary
- Rejection sampling produces exact, independent samples from the target
- Requires an envelope: $f(x) \le M g(x)$ for all $x$
- Acceptance rate $= \frac{1}{M}\int f(x)\, dx$ (or $1/M$ if $f$ is normalized)
- Squeezed rejection adds a lower bound to avoid expensive evaluations
- ARS exploits log-concavity for automatic, improving envelopes
- Acceptance rate decays exponentially with dimension; rejection sampling fails in high dimensions
- This dimensional failure is the primary motivation for MCMC methods
Exercises
Problem
Design a rejection sampler for the Beta(2, 5) distribution using a Uniform(0, 1) envelope. Find the optimal $M$, write out the algorithm, and compute the acceptance rate.
Problem
Prove that the acceptance rate of rejection sampling equals $\frac{1}{M}\int f(x)\, dx$ when $f$ is unnormalized, or $1/M$ when $f$ is a proper density.
Problem
Consider rejection sampling from $f(x) = e^{-x}$ on $[0, \infty)$ (standard Exponential) using the proposal $g(x) = (1+x)^{-2}$ on $[0, \infty)$ (a Pareto-like distribution). Find the optimal $M$ and the acceptance rate. Is this a good choice of proposal?
References
Canonical:
- von Neumann (1951), "Various Techniques Used in Connection with Random Digits"
- Gilks & Wild (1992), "Adaptive Rejection Sampling for Gibbs Sampling"
Current:
- Robert & Casella, Monte Carlo Statistical Methods (2004), Chapter 2
- Devroye, Non-Uniform Random Variate Generation (1986), Chapters 2-3
- Gelman et al., Bayesian Data Analysis (2013), Chapters 10-12
- Brooks et al., Handbook of MCMC (2011), Chapters 1-5
Next Topics
The natural next steps from rejection sampling:
- Adaptive rejection sampling: automatic envelope construction for log-concave densities
- Importance sampling: reweighting instead of rejecting, avoiding the need for a dominating envelope
Last reviewed: April 2026
Builds on This
- Adaptive Rejection Sampling (Layer 2)
- Squeezed Rejection Sampling (Layer 2)