
Hypergeometric Distribution

The Hypergeometric distribution is the law of the success count in n draws without replacement from a finite population of N items, K of which are successes. The PMF is the ratio of three binomial coefficients. The mean is nK/N, identical to the Binomial with p = K/N, but the variance carries the finite-population correction (N-n)/(N-1). The Binomial is the n << N limit. Fisher's exact test, capture-recapture, and quality-control acceptance sampling read off the Hypergeometric directly.


Plain-Language Definition

Take a finite population of N items. Mark K of them as successes. Draw a sample of size n without replacement and count the number of successes drawn. The probability law of that count is the Hypergeometric distribution.

The classic example is an urn with K red balls and N - K blue balls. Pull out n balls without replacement; how many are red? Sampling without replacement is what distinguishes the Hypergeometric from the Binomial. The Binomial is the law for the same setup with replacement, or equivalently for an infinite population. As the population grows large relative to the sample, the Hypergeometric converges to the Binomial.

Definition

Definition

Hypergeometric Distribution

A random variable X has a Hypergeometric distribution with population size N, success count K \in \{0, 1, \dots, N\}, and sample size n \in \{0, 1, \dots, N\} if its PMF is

\mathbb{P}(X = k) = \frac{\binom{K}{k}\binom{N - K}{n - k}}{\binom{N}{n}}, \quad k \in \{\max(0, n - (N - K)), \dots, \min(n, K)\}.

The numerator counts the ways to choose k successes from the K available and n - k failures from the N - K available. The denominator counts the ways to choose any n items from N. The ratio is the equally-likely-outcomes probability for that count.

The support boundaries are tight. The lower bound \max(0, n - (N - K)) kicks in when the sample is so large that some successes are forced into it; the upper bound \min(n, K) caps the count at the smaller of the sample size and the available-success count.
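The PMF and its tight support translate directly into code. A minimal sketch using only the Python standard library (the helper name `hypergeom_pmf` is ours, not from any package):

```python
from math import comb

def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(X = k) for X ~ Hypergeometric(N, K, n); zero outside the support."""
    lo, hi = max(0, n - (N - K)), min(n, K)
    if not lo <= k <= hi:
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Sanity check: the PMF sums to 1 over the tight support.
N, K, n = 100, 8, 20
total = sum(hypergeom_pmf(k, N, K, n) for k in range(min(n, K) + 1))
assert abs(total - 1.0) < 1e-12
```

Because `math.comb` computes exact integers, the only rounding happens in the final division.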

Why This Matters

Three places in the canon use the Hypergeometric directly.

  1. Fisher's exact test for 2 \times 2 tables. The null distribution of one cell count given fixed margins is exactly Hypergeometric. The conditioning argument that produces it is the cleanest derivation of an exact small-sample test in classical statistics, and it works when the chi-squared approximation breaks down for small cell counts.

  2. Capture-recapture. Tag K animals, release them, draw n from the population, and observe k tagged. The Hypergeometric PMF gives a likelihood for the unknown N; the maximum-likelihood estimator \widehat N = \lfloor Kn / k \rfloor is the Lincoln-Petersen estimator. The same idea works for software-defect estimation in code review.

  3. Acceptance sampling. A lot of N items contains K defectives. Draw n items and accept the lot if no more than c are defective. The operating characteristic of the inspection plan is a sum of Hypergeometric PMFs in K.

Mean and Variance

Theorem

Hypergeometric Mean and Variance

Statement

\mathbb{E}[X] = n\frac{K}{N}, \qquad \operatorname{Var}(X) = n\frac{K}{N}\frac{N - K}{N}\frac{N - n}{N - 1}.

Intuition

The mean is identical to a Binomial with n trials and per-trial success probability K/N. The variance differs by the factor (N - n)/(N - 1), the finite-population correction. As N \to \infty with K/N \to p, the correction tends to 1 and the Hypergeometric variance approaches np(1 - p), the Binomial variance.

Proof Sketch

Write X = \sum_{i=1}^{n} I_i, where I_i is the indicator that the i-th draw is a success. By symmetry \mathbb{P}(I_i = 1) = K/N for every i, so \mathbb{E}[X] = nK/N by linearity. For the variance, \operatorname{Var}(X) = \sum_i \operatorname{Var}(I_i) + \sum_{i \neq j} \operatorname{Cov}(I_i, I_j). Each \operatorname{Var}(I_i) = (K/N)(1 - K/N). For i \neq j, \mathbb{E}[I_i I_j] = (K/N) \cdot (K - 1)/(N - 1) by the chain rule on a uniformly sampled pair, giving \operatorname{Cov}(I_i, I_j) = -K(N - K) / [N^2 (N - 1)]. Summing n variances and n(n - 1) covariances and collecting terms gives the stated variance.
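Both closed forms can be checked numerically by summing against the PMF. A short verification sketch (the `pmf` helper is ours, standard library only):

```python
from math import comb

def pmf(k, N, K, n):
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

N, K, n = 100, 8, 20
support = range(max(0, n - (N - K)), min(n, K) + 1)

mean = sum(k * pmf(k, N, K, n) for k in support)
var = sum((k - mean) ** 2 * pmf(k, N, K, n) for k in support)

p = K / N
assert abs(mean - n * p) < 1e-9                               # E[X] = nK/N
assert abs(var - n * p * (1 - p) * (N - n) / (N - 1)) < 1e-9  # with correction
print(round(mean, 3), round(var, 3))  # 1.6 1.189
```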

Why It Matters

The mean alone cannot tell the Hypergeometric apart from the Binomial. The finite-population correction in the variance is the diagnostic. In any setting where the sample is a meaningful fraction of the population (sampling 10 percent or more from a finite population), the Binomial variance overstates the true variance, sometimes substantially.

Failure Mode

Software libraries differ on parameter order. Some take (N, K, n), others (K, N - K, n), others (n, K, N). Read the docstring before plugging in numbers.
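One defensive pattern in your own code is keyword-only parameters plus input validation, so a mis-ordered call fails loudly instead of computing the wrong distribution. A sketch with our own convention (for reference, scipy.stats.hypergeom takes (M, n, N) = population size, success count, sample size):

```python
from math import comb

def hypergeom_pmf(k, *, N, K, n):
    """Keyword-only: N = population size, K = successes, n = draws."""
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("need 0 <= K <= N and 0 <= n <= N")
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

p = hypergeom_pmf(2, N=100, K=8, n=20)  # every parameter must be named
# hypergeom_pmf(2, 100, 8, 20) would raise TypeError instead of silently
# treating the arguments in the wrong order.
```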

Binomial Limit

Theorem

Hypergeometric Converges to Binomial as N grows

Statement

For fixed n and k, \mathbb{P}(\operatorname{Hypergeometric}(N, K, n) = k) \to \binom{n}{k} p^k (1 - p)^{n - k} as N \to \infty with K/N \to p.

Intuition

When the population is much larger than the sample, removing a drawn item barely changes the success fraction in the remaining items. Sampling without replacement becomes indistinguishable from sampling with replacement, and the Hypergeometric reduces to the Binomial.

Proof Sketch

Expand the Hypergeometric PMF using \binom{K}{k} = K(K-1)\cdots(K-k+1)/k! and similarly for the other binomial coefficients. The numerator is a polynomial of total degree n in N, and the denominator \binom{N}{n} is also \Theta(N^n). Match leading coefficients: the k factors from K and the n - k factors from N - K give K^k (N - K)^{n - k} to leading order, while the denominator gives N^n / n! times a factor tending to 1 for large N. Reassembling the factorials into \binom{n}{k} and using K/N \to p produces the Binomial PMF.

Why It Matters

The rule of thumb in textbooks is to use the Binomial approximation when n \leq 0.05 N, i.e. the sample is at most five percent of the population. Below that threshold the Binomial PMF and the Hypergeometric PMF agree to two decimal places for moderate k. Above it, the finite-population correction matters and the exact Hypergeometric should be used.
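The convergence can be watched numerically: hold n and p fixed, grow N, and the worst-case PMF gap shrinks. A quick check with the standard library (helper names are ours):

```python
from math import comb

def hyper_pmf(k, N, K, n):
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.3
gaps = []
for N in (100, 1_000, 10_000):
    K = int(p * N)  # keep K/N = 0.3 exactly at each population size
    gaps.append(max(abs(hyper_pmf(k, N, K, n) - binom_pmf(k, n, p))
                    for k in range(n + 1)))
print([f"{g:.2e}" for g in gaps])

assert gaps[0] > gaps[1] > gaps[2]  # the gap shrinks as N grows
```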

Failure Mode

The convergence is pointwise in k, not uniform in the extreme tails. For very small \mathbb{P}(X = k) at the support boundaries, the relative error from the Binomial approximation can be substantial even when n/N is small.

Worked Example: Acceptance Sampling

A shipment contains N = 100 widgets, of which K = 8 are defective. An inspector draws n = 20 at random without replacement.

What is the probability of seeing exactly k = 2 defectives?

\mathbb{P}(X = 2) = \frac{\binom{8}{2}\binom{92}{18}}{\binom{100}{20}}.

Compute the three binomial coefficients (in practice via log-gamma to avoid overflow). The numerical value is approximately 0.3068.

What is the probability of seeing no defectives at all? That is the acceptance probability for a lot-tolerance plan with c = 0:

\mathbb{P}(X = 0) = \frac{\binom{92}{20}}{\binom{100}{20}} = \prod_{i = 0}^{19} \frac{92 - i}{100 - i} \approx 0.1558.

The Binomial approximation with p = 0.08, n = 20 gives (1 - 0.08)^{20} \approx 0.1887, which overestimates the acceptance probability because it ignores the finite-population correction.

The mean and standard deviation: \mathbb{E}[X] = 20 \cdot 8 / 100 = 1.6, \operatorname{Var}(X) = 1.6 \cdot 0.92 \cdot (100 - 20)/99 \approx 1.190, \operatorname{SD}(X) \approx 1.091. The Binomial would give variance 1.6 \cdot 0.92 = 1.472, about 24 percent too large.
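The worked numbers can be reproduced with exact integer binomial coefficients, sidestepping overflow entirely (Python's math.comb is exact):

```python
from math import comb

N, K, n = 100, 8, 20  # lot size, defectives in the lot, sample size

def pmf(k):
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

p_two = pmf(2)              # exactly two defectives in the sample
p_accept = pmf(0)           # acceptance probability under a c = 0 plan
p_binom = (1 - K / N) ** n  # with-replacement (Binomial) approximation

print(round(p_two, 4), round(p_accept, 4), round(p_binom, 4))
# 0.3068 0.1558 0.1887
```

The exact computation confirms that the Binomial approximation overstates the acceptance probability here.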

Comparison to Closely Related Distributions

Watch Out

Hypergeometric and Binomial have the same mean but different variances

The two distributions agree on the mean (\mathbb{E}[X] = nK/N versus \mathbb{E}[X] = np with p = K/N). They differ in the variance by the finite-population correction (N - n)/(N - 1), which equals 1 when n = 1 and decreases monotonically to 0 when n = N. Reporting the Binomial standard error on data from a finite-population without-replacement design will overstate the uncertainty.

Watch Out

The Hypergeometric is symmetric in K and n

A Hypergeometric(N, K, n) has the same distribution as a Hypergeometric(N, n, K) by the symmetry of "which side of the table is the sample". The PMF is symmetric in the roles of "successes" and "draws". This is a useful computational trick when K \ll n or n \ll K. It is also a frequent source of confusion when reading code that swaps the two.
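The K and n symmetry is easy to confirm pointwise. A small check on assumed example parameters N = 30, K = 12, n = 7:

```python
from math import comb

def pmf(k, N, K, n):
    lo, hi = max(0, n - (N - K)), min(n, K)
    if not lo <= k <= hi:
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

N, K, n = 30, 12, 7
for k in range(max(K, n) + 1):
    # Swapping the roles of "successes" and "draws" leaves the PMF unchanged.
    assert abs(pmf(k, N, K, n) - pmf(k, N, n, K)) < 1e-12
print("Hypergeometric(30, 12, 7) == Hypergeometric(30, 7, 12) pointwise")
```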

Watch Out

Conditional on the margins, the Hypergeometric is exact, not approximate

In a 2 \times 2 contingency table with fixed margins, the conditional distribution of the upper-left cell under the null of independence is exactly Hypergeometric. Fisher's exact test reports a p-value computed from this exact distribution. The chi-squared approximation to the same test statistic uses a continuous distribution and is approximate; the two answers can differ for small cell counts.
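A one-sided Fisher p-value is just a tail sum of this Hypergeometric. A sketch on a made-up 2x2 table (all counts hypothetical, chosen only for illustration):

```python
from math import comb

def pmf(k, N, K, n):
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

# Hypothetical table with fixed margins:
#            success  failure
#  group A      9        1     (row margin n_A = 10)
#  group B      3        7
a, n_A = 9, 10          # observed upper-left cell and its row margin
K, N = 9 + 3, 10 + 10   # success-column margin and grand total

# One-sided p-value: tables at least as extreme as the one observed.
p_value = sum(pmf(k, N, K, n_A) for k in range(a, min(n_A, K) + 1))
print(round(p_value, 5))  # 0.00988
```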

Hypergeometric vs Binomial vs Negative Hypergeometric

The Hypergeometric, the Binomial, and the Negative Hypergeometric all answer questions about counting successes in a sequence of draws but differ in what is fixed and what is random.

  • Binomial. n independent trials with constant success probability p. Random variable: the number of successes. Sampling with replacement, or equivalently from an infinite population.
  • Hypergeometric. n draws without replacement from a finite population with K successes and N - K failures. Random variable: the number of successes. Sampling without replacement from a finite population.
  • Negative Hypergeometric. Draw without replacement until r failures occur. Random variable: the number of successes before the r-th failure. Less commonly encountered, but appears in stopping-time and waiting-time problems analogous to the Negative Binomial under sampling without replacement.

The relationships are clean: the Hypergeometric converges to the Binomial as N \to \infty with K/N \to p, and the Negative Hypergeometric stands to the Hypergeometric as the Negative Binomial stands to the Binomial.

Capture-Recapture

A wildlife biologist tags K = 100 fish in a lake, releases them, waits for mixing, and then draws n = 60 fish, of which k = 12 are tagged. The likelihood of the unknown population size N is

L(N) = \frac{\binom{100}{12}\binom{N - 100}{48}}{\binom{N}{60}}.

Because N is an integer, maximize by comparing successive likelihood ratios L(N)/L(N-1); the integer maximum gives the Lincoln-Petersen estimator \widehat N = \lfloor Kn / k \rfloor = \lfloor 100 \cdot 60 / 12 \rfloor = 500 fish.

The Lincoln-Petersen estimator assumes a closed population (no births, deaths, immigration, or emigration during the experiment), random mixing of tagged fish, and no tag loss. Violations are common in field work, and modified estimators (Schnabel for multiple recapture occasions, Jolly-Seber for open populations) extend the same Hypergeometric likelihood approach.
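The integer maximization can also be done by brute force over a grid of candidate population sizes. A sketch using exact rational arithmetic (the grid bound 2000 is an arbitrary choice for this example):

```python
from fractions import Fraction
from math import comb

K, n, k = 100, 60, 12  # tagged fish, recapture sample size, tagged seen again

def likelihood(N):
    # Exact rationals avoid floating-point noise when likelihoods tie.
    return Fraction(comb(K, k) * comb(N - K, n - k), comb(N, n))

candidates = range(K + n - k, 2001)  # smallest feasible N has N - K >= n - k
best = max(likelihood(N) for N in candidates)
maximizers = [N for N in candidates if likelihood(N) == best]
print(maximizers)  # [499, 500] -- a tie; Lincoln-Petersen reports Kn/k = 500
```

The tie at 499 and 500 is exact: the likelihood ratio L(500)/L(499) = (400 · 440)/(500 · 352) = 1, which happens whenever Kn/k is an integer.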

Exercises

ExerciseCore

Problem

A deck of 52 cards contains 4 aces. Draw 5 cards without replacement. Find the probability of getting exactly 2 aces and the probability of getting at least 1 ace.

ExerciseCore

Problem

A factory ships boxes of N = 50 light bulbs. The quality-control protocol draws n = 5 bulbs without replacement and rejects the box if any are defective. If the box contains K = 3 defectives, what is the probability the box is accepted?

ExerciseCore

Problem

A lottery draws 6 numbers without replacement from 49. A ticket selects 6 numbers. Find the probability of matching exactly 4 of the 6 drawn numbers.

ExerciseAdvanced

Problem

Show that for fixed n, the variance ratio \operatorname{Var}_{\text{Hyper}}(X) / \operatorname{Var}_{\text{Bin}}(X) equals (N - n)/(N - 1), and interpret the two extremes n = 1 and n = N.

ExerciseAdvanced

Problem

A 2 \times 2 contingency table records the outcomes of 14 patients on two treatments. Treatment A: 6 of 8 successes. Treatment B: 1 of 6 successes. Compute Fisher's exact one-sided p-value for the null that treatments are equally effective, conditioning on the observed margins.

Sampling-Distribution Connections

The Hypergeometric is the without-replacement analogue of the Binomial and sits inside a small lattice of related distributions:

  • The Binomial is the with-replacement / infinite-population limit.
  • The Multivariate Hypergeometric generalizes to populations with more than two categories: N = \sum_c K_c items in categories c = 1, \dots, C, sample n, count the categories.
  • The Negative Hypergeometric changes the stopping rule from "fixed n" to "until r failures occur".
  • The Poisson is the rare-event limit of the Binomial and, transitively, a far-field limit of the Hypergeometric when both N and K grow large with K/N \to 0 and nK/N \to \lambda.

References

  • Casella, G., and Berger, R. L. (2002). Statistical Inference, 2nd ed., Duxbury. Chapter 3.2 covers discrete distributions including the Hypergeometric, with the Binomial limit worked through.
  • Blitzstein, J. K., and Hwang, J. (2019). Introduction to Probability, 2nd ed., Chapman and Hall / CRC. Chapter 3 has a chess-board treatment of the Hypergeometric and a clean derivation of the Lincoln-Petersen estimator.
  • Lehmann, E. L., and Romano, J. P. (2005). Testing Statistical Hypotheses, 3rd ed., Springer. Section 4.6 covers Fisher's exact test and the role of the Hypergeometric null distribution in conditional inference.

Last reviewed: May 12, 2026
