Reversible Jump MCMC
MCMC for model selection: propose moves that change the number of parameters, maintain detailed balance across dimensions via Jacobian corrections, and sample over model space and parameter space simultaneously.
Why This Matters
Standard MCMC methods (Metropolis-Hastings, Gibbs, HMC) sample within a fixed-dimensional parameter space. But many real problems require choosing the number of parameters itself: how many clusters in a mixture model, how many change points in a time series, how many hidden factors in a latent model.
Reversible jump MCMC (Green, 1995) extends Metropolis-Hastings to sample over a union of spaces with different dimensions. It is the standard Bayesian approach to model selection when the model indicator is discrete and the parameter spaces differ across models.
Mental Model
Imagine the posterior distribution lives not on a single parameter space but on a disjoint union $\bigcup_{k} \{k\} \times \mathbb{R}^{d_k}$, where $k$ indexes models and $d_k$ is the dimension of model $k$. Reversible jump proposes moves that jump between these spaces: adding a parameter (birth), removing one (death), splitting one component into two (split), or merging two into one (merge).
The challenge is maintaining detailed balance when the "from" and "to" states live in different-dimensional spaces.
Formal Setup
Let $\{M_k : k \in K\}$ be a countable collection of models. Under model $M_k$, the parameter vector $\theta_k$ lives in $\mathbb{R}^{d_k}$. The joint posterior over $(k, \theta_k)$ is:
$$\pi(k, \theta_k \mid y) \propto p(k)\, p(\theta_k \mid k)\, p(y \mid k, \theta_k)$$
where $p(k)$ is the prior on the model index, $p(\theta_k \mid k)$ is the parameter prior under model $k$, and $p(y \mid k, \theta_k)$ is the likelihood.
Dimension-Matching Condition
To propose a move from state $(k, \theta_k)$ to state $(k', \theta_{k'})$, generate auxiliary random variables $u \sim g(u)$ of dimension $r$ such that:
$$d_k + r = d_{k'} + r'$$
where $u' \sim g'(u')$, of dimension $r'$, are the auxiliary variables for the reverse move. This ensures a bijection between the "from" and "to" augmented spaces, both of dimension $d_k + r = d_{k'} + r'$.
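As a concrete arithmetic check (with illustrative dimensions, not tied to a specific model), the condition can be verified before implementing a move:

```python
# Dimension matching for a hypothetical birth move (illustrative numbers):
d_from, r_forward = 3, 2   # current model dimension, forward auxiliaries u
d_to, r_reverse = 5, 0     # proposed model dimension, reverse auxiliaries u'

# Both augmented spaces must have equal dimension d_k + r:
assert d_from + r_forward == d_to + r_reverse
```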
Dimension-Changing Map
A deterministic bijection $T_{k \to k'}$ maps:
$$(\theta_k, u) \mapsto (\theta_{k'}, u') = T_{k \to k'}(\theta_k, u)$$
This map, together with its Jacobian, defines how parameters in one model translate to parameters in another.
Main Theorems
Reversible Jump Acceptance Probability
Statement
A move from $(k, \theta_k)$ to $(k', \theta_{k'})$ is proposed by drawing $u \sim g(u)$ and setting $(\theta_{k'}, u') = T_{k \to k'}(\theta_k, u)$. The acceptance probability is:
$$\alpha = \min\left\{ 1,\; \frac{\pi(k', \theta_{k'} \mid y)\, j(k \mid k')\, g'(u')}{\pi(k, \theta_k \mid y)\, j(k' \mid k)\, g(u)} \left| \det \frac{\partial T_{k \to k'}(\theta_k, u)}{\partial (\theta_k, u)} \right| \right\}$$
where $j(k' \mid k)$ is the probability of proposing a move from model $k$ to model $k'$ (the move-type probability).
Intuition
This is the standard Metropolis-Hastings ratio with one addition: the Jacobian determinant. When you change dimensions, the map $T_{k \to k'}$ stretches or compresses volume, and the Jacobian corrects for this volume distortion, just as in a standard change of variables in integration.
Proof Sketch
Write the detailed balance condition on the augmented space $(\theta_k, u)$. The proposal distribution on this space is $j(k' \mid k)\, g(u)$. The target on this space is $\pi(k, \theta_k \mid y)\, g(u)$ (the auxiliary variables are independent of the target). Apply the change-of-variables formula through $T_{k \to k'}$ to express the reverse move's contribution. The Jacobian arises from this change of variables.
Why It Matters
This formula reduces transdimensional sampling to a standard accept/reject computation. Once you specify the map $T_{k \to k'}$ and the auxiliary variable distribution $g$, the acceptance ratio is fully determined. The art of RJMCMC is choosing $T_{k \to k'}$ and $g$ so that the acceptance rate is not too low.
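A minimal sketch of this computation in Python, working in log space for numerical stability; the function name is hypothetical, and all log densities are assumed to be supplied by the caller:

```python
import math

def rj_accept_prob(log_post_new, log_post_old,
                   log_move_rev, log_move_fwd,
                   log_g_rev, log_g_fwd,
                   log_abs_det_jac):
    """Reversible jump acceptance probability, computed in log space.

    log_post_*      : log pi(k, theta | y) at the proposed / current state
    log_move_*      : log move-type probabilities j(k | k') and j(k' | k)
    log_g_*         : log auxiliary densities g'(u') and g(u)
                      (pass 0.0 when that direction's move is deterministic)
    log_abs_det_jac : log |det dT/d(theta, u)|
    """
    log_ratio = (log_post_new - log_post_old
                 + log_move_rev - log_move_fwd
                 + log_g_rev - log_g_fwd
                 + log_abs_det_jac)
    return 1.0 if log_ratio >= 0 else math.exp(log_ratio)
```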
Failure Mode
If the map is poorly designed, the proposed parameters in the new model space will have low posterior density, giving very low acceptance rates. The Jacobian determinant can also be extreme (very large or very small), causing most proposals to be rejected. Computing the Jacobian requires knowing the analytical form of $T_{k \to k'}$, which rules out implicit or iterative maps without special treatment.
Detailed Balance Across Dimensions
Statement
Under the RJMCMC acceptance probability, the Markov chain satisfies detailed balance with respect to $\pi(k, \theta_k \mid y)$ across all pairs of models:
$$\pi(x \mid y)\, P(x \to x') = \pi(x' \mid y)\, P(x' \to x), \qquad x = (k, \theta_k),\; x' = (k', \theta_{k'})$$
where $P$ is the transition kernel that integrates over all possible auxiliary variable realizations. Consequently, if the chain is irreducible and aperiodic, $\pi(k, \theta_k \mid y)$ is the unique stationary distribution.
Intuition
This is the same detailed balance property as standard MH, extended to a state space that is a union of spaces with different dimensions. The key is that the augmented space (original parameters plus auxiliary variables) has the same dimension on both sides of the transition, so the standard MH detailed balance argument applies.
Proof Sketch
On the augmented space $(\theta_k, u)$, define the extended target $\pi(k, \theta_k \mid y)\, g(u)$. The RJMCMC proposal on this space is a standard MH proposal (since $T_{k \to k'}$ is a bijection between equal-dimensional spaces). Standard MH detailed balance applies on the augmented space. Marginalizing out the auxiliary variables recovers detailed balance for $\pi(k, \theta_k \mid y)$.
Why It Matters
Detailed balance guarantees correctness: the chain's stationary distribution is the posterior $\pi(k, \theta_k \mid y)$, regardless of how poor the proposals are. Poor proposals affect efficiency (low acceptance rates, slow mixing) but not correctness. This separation of correctness from efficiency is the strength of the MCMC framework.
Failure Mode
Detailed balance alone does not guarantee convergence. The chain must also be irreducible (can reach any state from any other state) and aperiodic. In RJMCMC, irreducibility requires that the move types allow reaching all models. If some models are only accessible through a long chain of intermediate models, mixing can be extremely slow.
Common Move Types
Birth/Death. Add or remove a component. To "birth" a new component in a $k$-component mixture: draw new parameters from a proposal, adjust existing weights to sum to 1. The reverse "death" move removes a component and redistributes its weight. The auxiliary variables are the proposed parameters for the new component.
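A sketch of such a birth move for a 1D Gaussian mixture; the proposal distributions (Beta, normal, gamma) are illustrative assumptions, and the $(1 - w_{\text{new}})^{k-1}$ Jacobian follows from rescaling the $k - 1$ free existing weights under this parameterization:

```python
import numpy as np

def birth_proposal(weights, means, sds, rng):
    """Sketch of a birth move for a 1D Gaussian mixture. The proposal
    distributions below are illustrative choices, not canonical ones."""
    k = len(weights)
    w_new = rng.beta(1.0, k)                  # auxiliary: new component weight
    mu_new = rng.normal(np.mean(means), 2.0)  # auxiliary: new component mean
    sd_new = rng.gamma(2.0, 1.0)              # auxiliary: new component sd
    # Rescale existing weights so the enlarged vector still sums to 1.
    new_weights = np.append(weights * (1.0 - w_new), w_new)
    # Jacobian of the weight rescaling (k - 1 free existing weights):
    log_abs_det_jac = (k - 1) * np.log1p(-w_new)
    return (new_weights, np.append(means, mu_new),
            np.append(sds, sd_new), log_abs_det_jac)
```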
Split/Merge. Split one component into two or merge two into one. This is more efficient than birth/death because the split components inherit information from their parent. The map takes a parent component $(w, \mu, \sigma^2)$ to two new components $(w_1, \mu_1, \sigma_1^2)$ and $(w_2, \mu_2, \sigma_2^2)$, with weights $w_1 = u_1 w$ and $w_2 = (1 - u_1) w$ and with means and variances chosen so that the pair matches the parent's first two moments (as in Richardson & Green, 1997). The Jacobian of this map must be computed and included in the acceptance ratio.
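The moment-matching split of Richardson & Green (1997) can be written out and sanity-checked numerically; the parameter values below are arbitrary test inputs:

```python
import numpy as np

def split_component(w, mu, var, u1, u2, u3):
    """Richardson & Green (1997) moment-matching split of a 1D Gaussian
    mixture component, with auxiliaries u1, u2, u3 in (0, 1)."""
    w1, w2 = u1 * w, (1.0 - u1) * w
    sd = np.sqrt(var)
    mu1 = mu - u2 * sd * np.sqrt(w2 / w1)
    mu2 = mu + u2 * sd * np.sqrt(w1 / w2)
    var1 = u3 * (1.0 - u2**2) * var * w / w1
    var2 = (1.0 - u3) * (1.0 - u2**2) * var * w / w2
    return w1, mu1, var1, w2, mu2, var2

# The split preserves the parent's weight and mean:
w1, mu1, v1, w2, mu2, v2 = split_component(0.4, 1.0, 2.0, 0.3, 0.5, 0.6)
assert np.isclose(w1 + w2, 0.4)
assert np.isclose(w1 * mu1 + w2 * mu2, 0.4 * 1.0)
```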
Application: Mixture Models with Unknown
The canonical application is a Gaussian mixture model where the number of components $k$ is unknown. The model space is $\{M_k : k = 1, \dots, k_{\max}\}$, where $M_k$ has parameters $\theta_k = (w_{1:k}, \mu_{1:k}, \sigma^2_{1:k})$.
RJMCMC alternates between:
- Within-model moves: standard MH or Gibbs updates to $\theta_k$, holding $k$ fixed
- Between-model moves: birth/death or split/merge moves that change $k$
The posterior samples give a distribution over $k$: the fraction of time the chain spends in model $M_k$ estimates $p(k \mid y)$. This is the fully Bayesian approach to model selection, avoiding point estimates of the model order.
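Once the chain has run, estimating $p(k \mid y)$ is just a tabulation of the model-indicator trace (the trace values below are hypothetical):

```python
import numpy as np

# Estimate p(k | y) as the fraction of iterations the chain spends in each k.
k_trace = np.array([2, 2, 3, 3, 3, 2, 3, 4, 3, 3])  # hypothetical trace
ks, counts = np.unique(k_trace, return_counts=True)
p_k = dict(zip(ks.tolist(), (counts / k_trace.size).tolist()))
# p_k == {2: 0.3, 3: 0.6, 4: 0.1}
```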
Application: Change-Point Detection
A time series with unknown change points at positions $\tau_1 < \tau_2 < \cdots < \tau_m$, where $m$ is unknown. Each segment has its own parameter $\theta_j$. RJMCMC proposes adding a new change point (splitting a segment), removing one (merging two segments), or moving an existing change point. The dimension changes by the number of parameters per segment when adding or removing a change point.
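A sketch of a change-point birth move under one simple, assumed design: pick an unoccupied position uniformly and give the two halves of the split segment perturbed copies of the parent parameter. The map $(\theta, u) \mapsto (\theta - u, \theta + u)$ has Jacobian determinant 2:

```python
import numpy as np

def propose_changepoint_birth(taus, thetas, T, rng):
    """Sketch of a birth move for change-point detection (hypothetical
    design). taus: sorted change-point positions; thetas: one parameter
    per segment (len(thetas) == len(taus) + 1); T: series length."""
    tau_new = rng.integers(1, T)        # candidate position in 1..T-1
    if tau_new in taus:
        return None                     # position occupied: reject outright
    seg = np.searchsorted(taus, tau_new)  # index of the segment being split
    u = rng.normal(0.0, 0.1)              # auxiliary variable
    # Split the segment parameter; Jacobian of (theta, u) -> (theta-u, theta+u) is 2.
    theta_left, theta_right = thetas[seg] - u, thetas[seg] + u
    new_taus = np.sort(np.append(taus, tau_new))
    new_thetas = np.concatenate([thetas[:seg], [theta_left, theta_right],
                                 thetas[seg + 1:]])
    return new_taus, new_thetas, u
```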
Common Confusions
The Jacobian is not optional
Every dimension-changing map has a Jacobian that must be included in the acceptance ratio. If the map is the identity (e.g., the new parameters are exactly the auxiliary variables), the Jacobian is 1. But for split/merge or any nonlinear map, omitting the Jacobian breaks detailed balance and produces an incorrect stationary distribution. There is no warning; the chain will simply converge to the wrong distribution.
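One practical safeguard is to unit-test a hand-derived Jacobian against a finite-difference estimate; a small sketch (the helper name is ours, and the map must be differentiable):

```python
import numpy as np

def numerical_log_abs_det_jacobian(T_map, x, eps=1e-6):
    """Finite-difference estimate of log |det dT/dx| for a map R^n -> R^n.
    Useful as a unit test against a hand-derived Jacobian."""
    x = np.asarray(x, dtype=float)
    n = x.size
    J = np.empty((n, n))
    for j in range(n):
        dx = np.zeros(n)
        dx[j] = eps
        J[:, j] = (T_map(x + dx) - T_map(x - dx)) / (2 * eps)  # central difference
    return np.linalg.slogdet(J)[1]

# Example: the map (theta, u) -> (theta - u, theta + u) has |det J| = 2.
T = lambda x: np.array([x[0] - x[1], x[0] + x[1]])
assert np.isclose(numerical_log_abs_det_jacobian(T, [0.5, 0.1]), np.log(2.0))
```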
RJMCMC does not require special convergence theory
Some practitioners believe RJMCMC needs different convergence diagnostics than standard MCMC. It does not. The chain is a standard Markov chain on the augmented space. Standard diagnostics (trace plots, effective sample size, $\hat{R}$) apply, though monitoring the model indicator $k$ is especially important.
Low acceptance rates for dimension-changing moves are normal
Acceptance rates of 1-5% for birth/death or split/merge moves are common and acceptable. The chain spends most of its time doing within-model updates. The between-model moves need only fire occasionally to explore the model space. Compare this with standard MH, where acceptance rates of 20-40% are typical.
Why This Is Hard
The difficulty is entirely in designing good proposals. The dimension-changing map must produce parameter values in the new model that are plausible under the posterior. A naive birth move that draws new parameters from the prior will almost always be rejected because the prior is typically much wider than the posterior.
Good proposals use data-informed distributions (e.g., proposing new component means near existing data points) or deterministic maps that preserve sufficient statistics. Designing these proposals requires problem-specific insight; there is no general-purpose solution.
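A sketch of one such data-informed proposal, assuming a 1D mixture: propose the new component mean near a randomly chosen data point. Its proposal density, needed as $g(u)$ in the acceptance ratio, is an equal-weight mixture of normals over the data:

```python
import numpy as np

def norm_logpdf(x, loc, scale):
    """Log density of a normal distribution (vectorized over loc)."""
    return -0.5 * np.log(2 * np.pi) - np.log(scale) - 0.5 * ((x - loc) / scale) ** 2

def data_informed_mean_proposal(data, rng, scale=0.5):
    """Hypothetical data-informed birth proposal for a new component mean:
    center it on a randomly chosen data point instead of the wide prior."""
    anchor = rng.choice(data)
    return rng.normal(anchor, scale)

def proposal_logpdf(mu, data, scale=0.5):
    """Density of the proposal above: an equal-weight mixture of normals
    centered at the data points (marginalizing over the chosen anchor)."""
    comps = norm_logpdf(mu, data, scale)
    return np.logaddexp.reduce(comps) - np.log(len(data))
```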
Key Takeaways
- RJMCMC extends MH to sample over models with different numbers of parameters
- Dimension-matching: augment both sides so the bijection maps between equal-dimensional spaces
- The acceptance ratio includes a Jacobian for the dimension-changing map
- Detailed balance holds on the augmented space, guaranteeing correctness
- Birth/death and split/merge are the two standard move types
- The hard part is designing proposals with reasonable acceptance rates
Exercises
Problem
A 2-component Gaussian mixture has parameters $(w_1, w_2, \mu_1, \mu_2, \sigma_1^2, \sigma_2^2)$ with the constraint $w_1 + w_2 = 1$, giving 5 free parameters. A 3-component mixture has 8 free parameters. For a birth move from 2 to 3 components, how many auxiliary variables are needed? What is the dimension of the augmented space on each side?
Problem
Consider a split move for a 1D Gaussian mixture. Component $(w, \mu, \sigma^2)$ splits into two components using auxiliary variables $u_1, u_2, u_3 \in (0, 1)$ via:
$$w_1 = u_1 w, \qquad w_2 = (1 - u_1) w,$$
$$\mu_1 = \mu - u_2 \sigma \sqrt{w_2 / w_1}, \qquad \mu_2 = \mu + u_2 \sigma \sqrt{w_1 / w_2},$$
$$\sigma_1^2 = u_3 (1 - u_2^2)\, \sigma^2 \frac{w}{w_1}, \qquad \sigma_2^2 = (1 - u_3)(1 - u_2^2)\, \sigma^2 \frac{w}{w_2}.$$
Compute the Jacobian $\left| \det \dfrac{\partial (w_1, \mu_1, \sigma_1^2, w_2, \mu_2, \sigma_2^2)}{\partial (w, \mu, \sigma^2, u_1, u_2, u_3)} \right|$.
References
Canonical:
- Green, "Reversible Jump Markov Chain Monte Carlo Computation and Bayesian Model Determination" (Biometrika, 1995)
- Richardson & Green, "On Bayesian Analysis of Mixtures with an Unknown Number of Components" (JRSS-B, 1997)
Current:
- Brooks, Gelman, Jones, Meng, Handbook of Markov Chain Monte Carlo (2011), Chapter 6
- Green & Hastie, "Reversible Jump MCMC" (2009), a tutorial review
- Robert & Casella, Monte Carlo Statistical Methods (2004), Chapters 3-7
Next Topics
Natural extensions from reversible jump MCMC:
- Birth-death MCMC: continuous-time formulations that avoid explicit Jacobian computations
- Bayesian model comparison: other approaches to model selection, including Bayes factors and marginal likelihood estimation
Last reviewed: April 2026
Prerequisites
Foundations this topic depends on.
- Metropolis-Hastings Algorithm (Layer 2)
- Common Probability Distributions (Layer 0A)
- Sets, Functions, and Relations (Layer 0A)
- Basic Logic and Proof Techniques (Layer 0A)