Foundations
Distributions Atlas
A connection map for the parametric families used in statistical inference and ML. Lists each family with its support, parameterization, and the transformations that move between families: sum, mixture, conjugacy, limiting case, and ratio constructions.
Why This Matters
The named distributions are not a list. They are a graph. A Bernoulli($p$) trial summed $n$ times is a Binomial($n, p$); a Binomial with large $n$ and mean $np$ held fixed is a Poisson; the time between Poisson events is an Exponential; the sum of $k$ independent Exponentials with the same rate is a Gamma($k, \lambda$); a Gamma with shape $k/2$ and rate $1/2$ is a Chi-squared with $k$ degrees of freedom; a standard Normal divided by the square root of a Chi-squared over its degrees of freedom is a Student-t; the ratio of two scaled Chi-squareds is an F.
Knowing these edges turns a long memorization task into a short one. The sampling distribution of the standardized sample mean is Student-t because the numerator is Normal, the denominator squared is a scaled Chi-squared, and they are independent. The Pearson Chi-squared statistic has its name because under the null hypothesis it converges to a sum of squared Normals. The F statistic in analysis of variance is a ratio of two variance estimates, and each variance estimate is a scaled Chi-squared. Each new test follows from the same construction: identify the building blocks, identify the transformation, read off the limiting distribution.
This page is the atlas. It states each connection precisely once and links to the per-distribution page that proves it.
The Atlas
Discrete families
| Family | Notation | Support | Mean | Variance |
|---|---|---|---|---|
| Bernoulli | $\mathrm{Bern}(p)$ | $\{0, 1\}$ | $p$ | $p(1-p)$ |
| Binomial | $\mathrm{Bin}(n, p)$ | $\{0, 1, \dots, n\}$ | $np$ | $np(1-p)$ |
| Geometric | $\mathrm{Geom}(p)$ | $\{1, 2, \dots\}$ (trials to first success) | $1/p$ | $(1-p)/p^2$ |
| Negative Binomial | $\mathrm{NegBin}(r, p)$ | $\{r, r+1, \dots\}$ (trials to $r$-th success) | $r/p$ | $r(1-p)/p^2$ |
| Poisson | $\mathrm{Pois}(\lambda)$ | $\{0, 1, 2, \dots\}$ | $\lambda$ | $\lambda$ |
| Hypergeometric | $\mathrm{HGeom}(N, K, n)$ | $\{\max(0, n-N+K), \dots, \min(n, K)\}$ | $nK/N$ | $n \frac{K}{N} \frac{N-K}{N} \frac{N-n}{N-1}$ |
Continuous families
| Family | Notation | Support | Mean | Variance |
|---|---|---|---|---|
| Uniform | $\mathrm{Unif}(a, b)$ | $[a, b]$ | $(a+b)/2$ | $(b-a)^2/12$ |
| Normal | $N(\mu, \sigma^2)$ | $\mathbb{R}$ | $\mu$ | $\sigma^2$ |
| Exponential | $\mathrm{Exp}(\lambda)$ | $(0, \infty)$ | $1/\lambda$ | $1/\lambda^2$ |
| Gamma (rate) | $\mathrm{Gamma}(k, \lambda)$ | $(0, \infty)$ | $k/\lambda$ | $k/\lambda^2$ |
| Beta | $\mathrm{Beta}(\alpha, \beta)$ | $(0, 1)$ | $\frac{\alpha}{\alpha+\beta}$ | $\frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$ |
| Chi-squared | $\chi^2_k$ | $(0, \infty)$ | $k$ | $2k$ |
| Student-t | $t_\nu$ | $\mathbb{R}$ | $0$ for $\nu > 1$ | $\frac{\nu}{\nu-2}$ for $\nu > 2$ |
| F | $F_{d_1, d_2}$ | $(0, \infty)$ | $\frac{d_2}{d_2-2}$ for $d_2 > 2$ | see Casella-Berger 5.3 |
| Lognormal | $\mathrm{Lognormal}(\mu, \sigma^2)$ | $(0, \infty)$ | $e^{\mu + \sigma^2/2}$ | $(e^{\sigma^2} - 1)\, e^{2\mu + \sigma^2}$ |
| Pareto | $\mathrm{Pareto}(x_m, \alpha)$ | $[x_m, \infty)$ | $\frac{\alpha x_m}{\alpha - 1}$ for $\alpha > 1$ | see Casella-Berger 3.3 |
The variance entries with degree-of-freedom restrictions ($\nu > 2$ for Student-t, $d_2 > 4$ for F, $\alpha > 2$ for Pareto) reflect the fact that those distributions have polynomially decaying tails, so low-order moments only exist past a threshold.
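This threshold behavior is visible directly in software. A minimal check with `scipy.stats` (the specific parameter values below are arbitrary illustrations, not from the tables):

```python
# Heavy-tailed families report infinite variance below the threshold
# and the finite formula above it.
from scipy import stats

# Student-t: variance is nu/(nu-2) for nu > 2, infinite for 1 < nu <= 2.
assert stats.t(df=1.5).var() == float("inf")
assert abs(stats.t(df=5).var() - 5 / 3) < 1e-12

# Pareto: variance is finite only for alpha > 2.
assert stats.pareto(b=1.5).var() == float("inf")
```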
The Connection Graph
Each row of this table is a named transformation. The "Direction" column reads from the building block to the result.
| Direction | Construction | Result | Why it matters |
|---|---|---|---|
| Bernoulli $\to$ Binomial | Sum of $n$ i.i.d. $\mathrm{Bern}(p)$ | $\mathrm{Bin}(n, p)$ | The count of successes in $n$ fixed trials. |
| Geometric $\to$ Negative Binomial | Sum of $r$ i.i.d. $\mathrm{Geom}(p)$ | $\mathrm{NegBin}(r, p)$ | Trials until the $r$-th success. |
| Binomial $\to$ Poisson | $n \to \infty$, $p \to 0$, $np \to \lambda$ | $\mathrm{Pois}(\lambda)$ | Rare counts in large pools; defects, mutations, queue arrivals. |
| Poisson $\to$ Exponential | Gaps between $\mathrm{Pois}(\lambda)$ event times | $\mathrm{Exp}(\lambda)$ | Time-between-events under memoryless arrivals. |
| Exponential $\to$ Gamma | Sum of $k$ i.i.d. $\mathrm{Exp}(\lambda)$ | $\mathrm{Gamma}(k, \lambda)$ | Time-to-$k$-th-event in a Poisson process. |
| Gamma $\to$ Chi-squared | Gamma with shape $k/2$, rate $1/2$ | $\chi^2_k$ | Identification: Chi-squared is a specific Gamma. |
| Normal $\to$ Chi-squared | $Z^2$ for $Z \sim N(0, 1)$ | $\chi^2_1$ | The simplest Chi-squared random variable. |
| Normal $\to$ Chi-squared | Sum of $k$ i.i.d. squared standard Normals | $\chi^2_k$ | The sample variance up to a scaling factor. |
| Normal, Chi-squared $\to$ Student-t | $Z / \sqrt{V/\nu}$ with $Z \sim N(0,1)$ independent of $V \sim \chi^2_\nu$ | $t_\nu$ | The standardized sample mean has this form. |
| Chi-squared $\to$ F | $(U/d_1)/(V/d_2)$ with independent $U \sim \chi^2_{d_1}$, $V \sim \chi^2_{d_2}$ | $F_{d_1, d_2}$ | The F statistic for variance comparison and ANOVA. |
| Uniform $\to$ Beta | $k$-th order statistic of $n$ i.i.d. uniforms | $\mathrm{Beta}(k, n-k+1)$ | Beta arises geometrically before it arises as a prior. |
| Lognormal $\to$ Normal | $\log X$ for $X \sim \mathrm{Lognormal}(\mu, \sigma^2)$ | $N(\mu, \sigma^2)$ | Defines the Lognormal. Multiplicative noise becomes additive after the log. |
| Exponential $\to$ Pareto | $x_m e^{Y}$ for $Y \sim \mathrm{Exp}(\alpha)$ | $\mathrm{Pareto}(x_m, \alpha)$ | The log-Pareto is an Exponential. Heavy tails become light after a log. |
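Two of these edges can be spot-checked numerically. A sketch with `scipy.stats` (the value of $k$ below is an arbitrary choice; note that SciPy takes a scale, so rate $1/2$ becomes `scale=2`):

```python
# Edge check 1: Gamma(shape=k/2, rate=1/2) has the same density as
# Chi-squared with k degrees of freedom.
import numpy as np
from scipy import stats

k = 7
x = np.linspace(0.1, 20, 50)
gamma_pdf = stats.gamma(a=k / 2, scale=2).pdf(x)  # scale = 1/rate
chi2_pdf = stats.chi2(df=k).pdf(x)
assert np.allclose(gamma_pdf, chi2_pdf)

# Edge check 2: Z^2 for Z ~ N(0,1) behaves like Chi-squared(1);
# its sample mean should be close to the chi2(1) mean of 1.
rng = np.random.default_rng(0)
z2 = rng.standard_normal(100_000) ** 2
assert abs(z2.mean() - 1.0) < 0.05
```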
Sum of i.i.d. Exponentials is Gamma
Statement
If $X_1, \dots, X_k$ are independent $\mathrm{Exp}(\lambda)$ random variables, then $S = X_1 + \cdots + X_k \sim \mathrm{Gamma}(k, \lambda)$ with density

$$f_S(s) = \frac{\lambda^k s^{k-1} e^{-\lambda s}}{(k-1)!}, \qquad s > 0.$$
Intuition
A Poisson process with rate $\lambda$ has $\mathrm{Exp}(\lambda)$ inter-arrival times. The time of the $k$-th event is the sum of the first $k$ gaps. Each gap contributes a factor of $\lambda$ to the density and one power of $s$ to the polynomial term; the factorial $(k-1)!$ is the volume of the ordered-arrival simplex.
Proof Sketch
By induction or by MGF. The MGF of $\mathrm{Exp}(\lambda)$ is $M(t) = \lambda/(\lambda - t)$ for $t < \lambda$. By independence the MGF of $S$ is $\left(\lambda/(\lambda - t)\right)^k$, which is the MGF of $\mathrm{Gamma}(k, \lambda)$. MGF uniqueness identifies the distribution.
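The MGF identity is easy to check by simulation. A sketch (the values of $k$, $\lambda$, and the grid of $t$ values are arbitrary choices):

```python
# Empirical MGF of a sum of k i.i.d. Exp(lam) draws versus the
# closed form (lam/(lam-t))^k, which is the Gamma(k, lam) MGF.
import numpy as np

rng = np.random.default_rng(42)
k, lam = 3, 2.0
# numpy's exponential takes a scale, i.e. 1/rate
s = rng.exponential(scale=1 / lam, size=(500_000, k)).sum(axis=1)

for t in (0.2, 0.5):                 # need t < lam for the MGF to exist
    empirical = np.exp(t * s).mean()
    exact = (lam / (lam - t)) ** k
    assert abs(empirical - exact) / exact < 0.02
```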
Why It Matters
Every Chi-squared random variable is a sum of squared Normals, hence a Gamma with half-integer shape. Every F is a ratio of two Chi-squareds, hence built from Gammas. The Gamma family is the load-bearing wall behind half of the classical sampling distributions.
Failure Mode
The result requires independence and a common rate. Sums of Exponentials with different rates do not give a Gamma; they give a hypoexponential distribution, which has a different density.
Ratio of Normal and Root Chi-squared is Student-t
Statement
Let $Z \sim N(0, 1)$ and $V \sim \chi^2_\nu$ be independent. Then $T = \dfrac{Z}{\sqrt{V/\nu}} \sim t_\nu$ with density

$$f_T(t) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\,\Gamma\!\left(\frac{\nu}{2}\right)} \left(1 + \frac{t^2}{\nu}\right)^{-(\nu+1)/2}.$$
Intuition
The numerator $Z$ is the unit-variance source of randomness. The denominator $\sqrt{V/\nu}$ is a scale estimate, normalized so that $V/\nu \to 1$ as $\nu \to \infty$. Dividing by a noisy estimate of the unit scale inflates the tails by a polynomial amount; the heavier the noise (smaller $\nu$), the heavier the tails.
Proof Sketch
The joint density of $(Z, V)$ factors by independence. Change variables to $t = z/\sqrt{v/\nu}$ and $w = v$, then integrate out $w$. The integrand is a Gamma density in $w$ once collected, so the integral evaluates by the Gamma normalizing constant.
Why It Matters
For an i.i.d. $N(\mu, \sigma^2)$ sample of size $n$, the numerator $\sqrt{n}(\bar{X} - \mu)/\sigma$ is standard Normal, and the denominator $S/\sigma$ satisfies $(n-1)S^2/\sigma^2 \sim \chi^2_{n-1}$. The standardized statistic $\sqrt{n}(\bar{X} - \mu)/S$ is therefore exactly $t_{n-1}$. This is the one-sample t-test in one line.
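A quick simulation confirms the construction. A sketch (the values of $\mu$, $\sigma$, and $n$ below are arbitrary, not from the text):

```python
# sqrt(n)(xbar - mu)/S over many Normal samples should match
# Student-t with n-1 degrees of freedom; compare quantiles.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu, sigma, n = 5.0, 3.0, 6
samples = rng.normal(mu, sigma, size=(200_000, n))
xbar = samples.mean(axis=1)
s = samples.std(axis=1, ddof=1)          # sample standard deviation
t_stats = np.sqrt(n) * (xbar - mu) / s

for q in (0.05, 0.5, 0.95):
    assert abs(np.quantile(t_stats, q) - stats.t(df=n - 1).ppf(q)) < 0.05
```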
Failure Mode
Independence of $Z$ and $V$ is essential. In the t-test the relevant $Z$ is built from $\bar{X}$ and the relevant $V$ is built from $S^2$; their independence is a special fact about Normal samples and Cochran's theorem, not a generic phenomenon.
Conjugate Priors
A prior on $\theta$ is conjugate to a likelihood when the posterior is in the same parametric family as the prior. For the named families the conjugate pairs are short:
| Likelihood | Conjugate prior | Posterior update |
|---|---|---|
| $\mathrm{Bern}(p)$ or $\mathrm{Bin}(n, p)$ | $\mathrm{Beta}(\alpha, \beta)$ | $\alpha + \sum x_i$, $\beta + n - \sum x_i$ |
| $\mathrm{Geom}(p)$ or $\mathrm{NegBin}(r, p)$ | $\mathrm{Beta}(\alpha, \beta)$ | $\alpha + n$, $\beta + \sum (x_i - 1)$ (Geometric, trial-count convention) |
| $\mathrm{Pois}(\lambda)$ | $\mathrm{Gamma}(\alpha, \beta)$ | $\alpha + \sum x_i$, $\beta + n$ |
| $\mathrm{Exp}(\lambda)$ | $\mathrm{Gamma}(\alpha, \beta)$ | $\alpha + n$, $\beta + \sum x_i$ |
| $N(\mu, \sigma^2)$ with $\sigma^2$ known | $N(\mu_0, \tau^2)$ | precision-weighted average of $\mu_0$ and $\bar{x}$ |
| $N(\mu, \sigma^2)$ with $\mu$ known | $\mathrm{InvGamma}(\alpha, \beta)$ | $\alpha + n/2$, $\beta + \tfrac{1}{2}\sum (x_i - \mu)^2$ |
| $N(\mu, \sigma^2)$ both unknown | Normal-Inverse-Gamma | joint update on $(\mu, \sigma^2)$ |
The Bernoulli-Beta and Poisson-Gamma pairs are derived in their respective pages; the Normal pairs are derived in Bayesian estimation.
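The first row of the table is a two-line computation. A minimal sketch of the Beta-Bernoulli update (the prior hyperparameters and data below are arbitrary):

```python
# Beta-Bernoulli conjugate update: the posterior is
# Beta(a + successes, b + failures), so no integration is needed.
a, b = 2.0, 2.0                       # Beta(a, b) prior on p
data = [1, 0, 1, 1, 0, 1, 1, 1]       # observed Bernoulli trials

successes = sum(data)
failures = len(data) - successes
a_post, b_post = a + successes, b + failures

posterior_mean = a_post / (a_post + b_post)
assert (a_post, b_post) == (8.0, 4.0)
assert abs(posterior_mean - 8 / 12) < 1e-12
```

Note how the hyperparameters act as pseudo-counts: the prior behaves like $\alpha + \beta$ imaginary trials with $\alpha$ imaginary successes.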
Three Recurring Tricks
Trick 1: Sum identifies via MGF
If $X$ and $Y$ are independent and you want the law of $X + Y$, compute the MGF $M_{X+Y}(t) = M_X(t)\, M_Y(t)$ and match it to a known MGF. This is how sum-of-Exponentials-is-Gamma is proved, how sum-of-Normals-is-Normal is proved, and how Binomial additivity is proved. The MGF table in moment generating functions is the lookup index.
Trick 2: Limit identifies via characteristic-function convergence
If you want the limiting law of a sequence $X_n$, compute the characteristic function $\varphi_{X_n}(t)$ and take the limit. This is how Binomial-to-Poisson is proved (Poisson's theorem), how the central limit theorem is proved, and how $t_\nu \to N(0, 1)$ as $\nu \to \infty$ is proved.
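The Binomial-to-Poisson limit can be watched numerically. A sketch (the choices $\lambda = 3$ and $n = 10{,}000$ are arbitrary; the tolerance is illustrative):

```python
# With n large, p small, and np = lam fixed, the Binomial pmf is
# pointwise close to the Poisson(lam) pmf.
from scipy import stats

lam = 3.0
n = 10_000
p = lam / n
max_gap = max(
    abs(stats.binom(n, p).pmf(k) - stats.poisson(lam).pmf(k))
    for k in range(15)
)
assert max_gap < 1e-3
```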
Trick 3: Transformation identifies via change of variables
If $Y = g(X)$ for a smooth invertible $g$, then $f_Y(y) = f_X\!\left(g^{-1}(y)\right) \left|\tfrac{d}{dy} g^{-1}(y)\right|$. This is how Lognormal is derived from Normal, how the Pareto-log-Exponential connection is verified, how the Student-t density is computed from the joint density of $(Z, V)$, and how the F density is computed from the joint density of $(U, V)$.
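The Lognormal case makes the formula concrete: with $g(x) = e^x$ the Jacobian factor is $1/y$. A sketch using `scipy.stats` (the values of $\mu$ and $\sigma$ are arbitrary; note SciPy's Lognormal takes `s=sigma`, `scale=exp(mu)`):

```python
# Change-of-variables check: pushing N(mu, sigma^2) through exp()
# gives f_Y(y) = f_X(log y) / y, the Lognormal density.
import numpy as np
from scipy import stats

mu, sigma = 0.5, 0.8
y = np.linspace(0.1, 10, 50)
via_change_of_vars = stats.norm(mu, sigma).pdf(np.log(y)) / y
direct = stats.lognorm(s=sigma, scale=np.exp(mu)).pdf(y)
assert np.allclose(via_change_of_vars, direct)
```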
Two Common Confusions
Rate versus scale parameterization
Exponential and Gamma both have two conventions. Rate uses $\lambda$ with density $\lambda e^{-\lambda x}$ and mean $1/\lambda$. Scale uses $\theta = 1/\lambda$ with density $\theta^{-1} e^{-x/\theta}$ and mean $\theta$. SciPy and many engineering texts default to scale; mathematical-statistics texts default to rate. Always check which parameterization a software library uses before plugging in. The pages in this atlas use rate by default and note where scale is more natural.
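A sketch of the pitfall in `scipy.stats` (the values of $\lambda$ and the shape $a$ are arbitrary): a rate must be inverted before it is passed as `scale`.

```python
# scipy.stats uses the SCALE convention: Exponential with rate lam
# is expon(scale=1/lam), and Gamma(shape=a, rate=lam) is
# gamma(a, scale=1/lam).
from scipy import stats

lam = 4.0
assert abs(stats.expon(scale=1 / lam).mean() - 1 / lam) < 1e-12

a = 3.0
g = stats.gamma(a, scale=1 / lam)        # Gamma with shape a, rate lam
assert abs(g.mean() - a / lam) < 1e-12   # mean k/lambda in rate terms
assert abs(g.var() - a / lam**2) < 1e-12 # variance k/lambda^2
```

Passing the rate directly as `scale` silently gives a distribution with mean $\lambda^2$ times too large, which is a common source of wrong simulation results.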
Chi-squared as a sampling distribution versus a concentration class
The Chi-squared distribution in this atlas is the exact sampling distribution of $(n-1)S^2/\sigma^2$ for Normal samples. The phrase "Chi-squared concentration" elsewhere on the site refers to the Laurent-Massart sub-Gamma tail bounds for Chi-squared random variables, which is a finite-sample inequality, not a description of the law. See chi-squared concentration for the bound; see chi-squared distribution and tests for the law and its uses.
How to Use This Atlas
Pick a target distribution; read down the connection-graph table; follow the edges to the building blocks. Each distribution page derives its connections from this atlas and links back here.
- Want to know why the sample variance has a Chi-squared distribution? Read chi-squared distribution and tests.
- Want to derive the t-statistic from first principles? Read Student-t distribution and t-test.
- Want the F statistic for ANOVA? Read F distribution and ANOVA.
- Want a Bayesian update for a Bernoulli sequence? Read beta distribution.
- Want the time-to-failure model for a Poisson process? Read exponential distribution and gamma distribution.
Exercises
Problem
Let $X_1, \dots, X_n$ be i.i.d. with $P(X_i = 1) = p$ and $P(X_i = 0) = 1 - p$. Identify the distribution of $S = \sum_{i=1}^n X_i$ and compute $E[S]$ and $\mathrm{Var}(S)$.
Problem
A Poisson process with rate $\lambda$ events per minute is observed. Let $T$ be the time, in minutes, of the third event. Identify the distribution of $T$ and compute $P(T \le t)$ in terms of an incomplete Gamma function.
Problem
Let $Z \sim N(0, 1)$ and $V_\nu \sim \chi^2_\nu$ be independent, and let $T_\nu = Z/\sqrt{V_\nu/\nu}$. Show that as $\nu \to \infty$, the distribution of $T_\nu$ converges to $N(0, 1)$.
References
Canonical:
- Casella and Berger, Statistical Inference (2002), Chapters 3 and 5.
- Bickel and Doksum, Mathematical Statistics, Volume I (2015), Chapter 1.
- Johnson, Kotz, and Balakrishnan, Continuous Univariate Distributions, Volumes 1 and 2 (1994 and 1995).
- Johnson, Kemp, and Kotz, Univariate Discrete Distributions (2005).
Probability foundations:
- Blitzstein and Hwang, Introduction to Probability (2019), Chapters 3 through 8.
- Durrett, Probability: Theory and Examples (2019), Chapters 2 and 3.
- Grimmett and Stirzaker, Probability and Random Processes (2020), Chapters 3 through 6.
Bayesian framing:
- Gelman et al., Bayesian Data Analysis (2013), Chapter 2.
- Robert, The Bayesian Choice (2007), Chapter 3.
Last reviewed: May 11, 2026