Foundations
Discrete and Continuous Distribution Pairs
A reference page mapping the canonical discrete-and-continuous companion pairs. Discrete uniform and continuous uniform; Bernoulli/Binomial counting process and the Poisson process; geometric and exponential waiting times (both memoryless); negative binomial and gamma (time to r-th event); hypergeometric (sampling without replacement, no clean continuous analog). Each pair is shown side by side with definition, PMF or PDF, support, mean, variance, MGF, memorylessness check, and conceptual bridge.
Prerequisites
Why This Matters
Several discrete distributions have natural continuous-time analogs. Knowing the pairs cleans up two sources of confusion: which distribution belongs in which problem (count-the-trials vs. wait-for-the-time), and why the same algebraic identities (memorylessness, sum-to-form-a-bigger-member) keep showing up. The pairing is also a useful sanity check: if a discrete-time analysis and a continuous-time analysis disagree about the answer to the same question, one of them is making a modeling error.
This page is a reference. The individual distributions are covered in detail on common probability distributions; the value here is the side-by-side view.
The Pairing at a Glance
| Discrete | Continuous | Bridge | Both memoryless? |
|---|---|---|---|
| Discrete uniform on $\{1,\dots,n\}$ | Uniform on $[a,b]$ | "No information" maximum-entropy on a finite or bounded set | No |
| Bernoulli / Binomial counting | Poisson process on $[0,\infty)$ | Rare-event limit: many trials with small per-trial success rate | n/a (counting processes) |
| Geometric on $\{1,2,\dots\}$ | Exponential on $(0,\infty)$ | Waiting time to the first success / event | Yes (the only memoryless families on their supports) |
| Negative binomial (sum of $r$ Geometrics) | Gamma with integer shape $r$ (sum of $r$ Exponentials) | Waiting time to the $r$-th success / event | No |
| Hypergeometric (sample without replacement) | No clean continuous analog | Sampling without replacement has no rate-based continuous version | n/a |
Uniform: Discrete and Continuous
| Property | Discrete Uniform on $\{1,\dots,n\}$ | Continuous Uniform on $[a,b]$ |
|---|---|---|
| PMF / PDF | $P(X=k)=\frac{1}{n}$ for $k\in\{1,\dots,n\}$ | $f(x)=\frac{1}{b-a}$ for $x\in[a,b]$ |
| CDF | $F(k)=\frac{k}{n}$ for $k\in\{1,\dots,n\}$ | $F(x)=\frac{x-a}{b-a}$ for $x\in[a,b]$ |
| Mean | $\frac{n+1}{2}$ | $\frac{a+b}{2}$ |
| Variance | $\frac{n^2-1}{12}$ | $\frac{(b-a)^2}{12}$ |
| MGF | $\frac{e^{t}(e^{nt}-1)}{n(e^{t}-1)}$ | $\frac{e^{tb}-e^{ta}}{t(b-a)}$ |
| Memorylessness | No (bounded support) | No (bounded support) |
The factor of $\frac{1}{12}$ in the variance is the same constant in both expressions; this is not a coincidence. The continuous uniform on $[0,1]$ is the limit of the discrete uniform on $\{\frac{1}{n},\frac{2}{n},\dots,1\}$ as $n\to\infty$, and the variance of the limit matches the limit of the variances: $\frac{n^2-1}{12n^2}\to\frac{1}{12}$.
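A minimal numerical sketch of this limit (the grid $\{\frac{1}{n},\dots,1\}$ and the sample values of $n$ are illustrative choices, not part of the reference tables):

```python
# Check that the variance of the discrete uniform on {1/n, 2/n, ..., 1}
# approaches the continuous Uniform[0, 1] variance of 1/12.

def discrete_uniform_variance(n: int) -> float:
    """Variance of the uniform distribution on the grid {1/n, 2/n, ..., 1}."""
    points = [k / n for k in range(1, n + 1)]
    mean = sum(points) / n
    return sum((x - mean) ** 2 for x in points) / n

for n in [10, 100, 1000]:
    print(n, discrete_uniform_variance(n))  # increases toward 1/12 ≈ 0.08333
```

The exact value is $\frac{n^2-1}{12n^2}$, which approaches $\frac{1}{12}$ from below.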
Bernoulli/Binomial Counting versus the Poisson Process
| Property | Binomial counting on $n$ trials | Poisson process on $[0,t]$ |
|---|---|---|
| What it counts | Number of successes in $n$ iid Bernoulli($p$) trials | Number of arrivals in a continuous-time interval of length $t$ with rate $\lambda$ |
| Distribution of the count | $P(X=k)=\binom{n}{k}p^k(1-p)^{n-k}$ | $P(N_t=k)=e^{-\lambda t}\frac{(\lambda t)^k}{k!}$ |
| Mean | $np$ | $\lambda t$ |
| Variance | $np(1-p)$ | $\lambda t$ |
| MGF (in the variable $s$) | $(1-p+pe^{s})^n$ | $\exp\big(\lambda t(e^{s}-1)\big)$ |
| Increment independence | Successes in disjoint trial sets are independent | Arrivals in disjoint time intervals are independent |
| Limit relationship | $n\to\infty$, $p\to 0$, $np\to\lambda$ gives Poisson($\lambda$) | The Poisson process is the rare-event continuous-time limit |
The rare-event limit is the bridge: as the number of opportunities for an event grows and the per-opportunity probability shrinks at the right rate ($n\to\infty$, $p\to 0$, $np\to\lambda$), the binomial PMF converges to the Poisson PMF $e^{-\lambda}\frac{\lambda^k}{k!}$. The variance of the binomial is $np(1-p)\to\lambda$, matching the Poisson variance.
Poisson Limit of the Binomial
Statement
If $X_n\sim\text{Binomial}(n,p_n)$ with $np_n\to\lambda$ as $n\to\infty$, then for every fixed $k\ge 0$, $P(X_n=k)\to e^{-\lambda}\frac{\lambda^k}{k!}$.
Proof Sketch
Write $P(X_n=k)=\binom{n}{k}p_n^k(1-p_n)^{n-k}$. The product $\binom{n}{k}p_n^k=\frac{n(n-1)\cdots(n-k+1)}{k!}\,p_n^k\to\frac{\lambda^k}{k!}$, since $np_n\to\lambda$ and each ratio $\frac{n-j}{n}\to 1$. And $(1-p_n)^{n-k}\to e^{-\lambda}$ since $np_n\to\lambda$ and $p_n\to 0$. Combining, $P(X_n=k)\to e^{-\lambda}\frac{\lambda^k}{k!}$.
Why It Matters
This justifies modeling rare-event counts (insurance claims per month, network packet drops per second, mutations per generation) as Poisson rather than binomial: when the underlying trial count is large and the per-trial rate is small, the Poisson approximation is both convenient (one parameter instead of two) and accurate.
Failure Mode
The approximation requires $p$ to be small and $np$ to be a moderate constant. For fixed $p$ bounded away from zero, the binomial does not converge to a Poisson but rather to a Gaussian (after standardization, by the CLT). The two limits handle different regimes.
Geometric and Exponential: The Memoryless Pair
| Property | Geometric on $\{1,2,\dots\}$ | Exponential on $(0,\infty)$ |
|---|---|---|
| What it measures | Number of trials until the first success in iid Bernoulli($p$) | Time until the first event in a rate-$\lambda$ Poisson process |
| PMF / PDF | $P(X=k)=(1-p)^{k-1}p$ for $k\ge 1$ | $f(x)=\lambda e^{-\lambda x}$ for $x>0$ |
| CDF | $F(k)=1-(1-p)^k$ for $k\ge 1$ | $F(x)=1-e^{-\lambda x}$ for $x>0$ |
| Mean | $\frac{1}{p}$ | $\frac{1}{\lambda}$ |
| Variance | $\frac{1-p}{p^2}$ | $\frac{1}{\lambda^2}$ |
| MGF | $\frac{pe^{t}}{1-(1-p)e^{t}}$ for $t<-\ln(1-p)$ | $\frac{\lambda}{\lambda-t}$ for $t<\lambda$ |
| Memorylessness | $P(X>m+n\mid X>m)=P(X>n)$ for integer $m,n\ge 0$ | $P(X>s+t\mid X>s)=P(X>t)$ for real $s,t\ge 0$ |
Geometric and Exponential Are the Unique Memoryless Distributions
Statement
On the positive integers, the geometric distribution is the unique distribution satisfying $P(X>m+n\mid X>m)=P(X>n)$ for all integers $m,n\ge 0$. On the positive reals, the exponential distribution is the unique continuous distribution satisfying $P(X>s+t\mid X>s)=P(X>t)$ for all $s,t\ge 0$.
Proof Sketch
Exponential (continuous case). Let $G(t)=P(X>t)$. Memorylessness gives $G(s+t)=G(s)G(t)$. The only right-continuous solutions on $[0,\infty)$ with $G(0)=1$ are of the form $G(t)=e^{-\lambda t}$ for some $\lambda>0$ (a classical Cauchy functional-equation result). Differentiating $1-G$ gives the exponential density $\lambda e^{-\lambda t}$.
Geometric (discrete case). Let $g(n)=P(X>n)$. Memorylessness gives $g(m+n)=g(m)g(n)$. Setting $q=g(1)$ gives $g(n)=q^n$, so $P(X>n)=q^n$ and $P(X=k)=q^{k-1}(1-q)$. Identifying $p=1-q$ as the success probability recovers the geometric.
Why It Matters
Memorylessness is the conceptual bridge between the two distributions. The continuous-time analog of "waiting one more trial after failed trials" is "waiting one more time unit after time units of no event"; both inherit the same lack-of-memory property because the underlying processes (Bernoulli trials, Poisson process) have independent increments.
Concretely: if calls arrive at a help line as a rate- Poisson process, the time you have already waited gives you no information about how much longer until the next call. The same is true for the next Bernoulli success after a string of failures. Any modeling that requires the waiting-time distribution to "remember" the past must use something other than geometric or exponential.
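The memorylessness identities can be checked directly from the survival functions in the table above; a minimal sketch, with the rates and offsets chosen arbitrarily for illustration:

```python
# Verify memorylessness from the survival functions:
#   Exponential: P(X > x) = exp(-lam * x)
#   Geometric on {1, 2, ...}: P(X > k) = (1 - p)^k
import math

lam, p = 1.5, 0.3      # illustrative rate and success probability
s, t = 0.7, 1.2        # arbitrary real offsets
m, n = 3, 5            # arbitrary integer offsets

def exp_sf(x: float) -> float:
    return math.exp(-lam * x)

def geo_sf(k: int) -> float:
    return (1 - p) ** k

# P(X > s + t | X > s) should equal P(X > t), and it does:
print(exp_sf(s + t) / exp_sf(s), exp_sf(t))
# Same identity for the geometric:
print(geo_sf(m + n) / geo_sf(m), geo_sf(n))
```

Both conditional survival ratios collapse algebraically because the survival functions are pure exponentials in their argument, which is exactly the content of the uniqueness result above.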
Failure Mode
The memoryless property is exact, not approximate. Mixture models (geometric with random , exponential with random ) are not memoryless. Truncated exponential distributions are not memoryless. Any failure of independent increments in the underlying process breaks memorylessness.
Negative Binomial and Gamma: Waiting for the r-th Event
| Property | Negative Binomial $(r,p)$ | Gamma with integer shape $r$ |
|---|---|---|
| What it measures | Number of trials until the $r$-th success in iid Bernoulli($p$) | Time until the $r$-th event in a rate-$\lambda$ Poisson process |
| PMF / PDF | $P(X=k)=\binom{k-1}{r-1}p^r(1-p)^{k-r}$ for $k\ge r$ | $f(x)=\frac{\lambda^r x^{r-1}e^{-\lambda x}}{(r-1)!}$ for $x>0$ |
| Mean | $\frac{r}{p}$ | $\frac{r}{\lambda}$ |
| Variance | $\frac{r(1-p)}{p^2}$ | $\frac{r}{\lambda^2}$ |
| Sum-of-independent decomposition | NB$(r,p)$ is the sum of $r$ iid Geometric($p$) | Gamma$(r,\lambda)$ is the sum of $r$ iid Exponential($\lambda$) |
| Memorylessness | No (the wait for the $r$-th event remembers prior trials when $r\ge 2$) | No |
The negative binomial (geometric counterpart) inherits its construction from summing $r$ independent geometrics. The gamma (exponential counterpart) inherits its construction from summing $r$ independent exponentials. Memorylessness is lost as soon as $r\ge 2$ because conditioning on having waited a long time without all $r$ events changes the distribution of the remaining wait.
The gamma distribution also has a non-integer shape parameter generalization (replacing $(r-1)!$ with the gamma function $\Gamma(r)$); the negative binomial similarly extends to non-integer $r$ via the gamma function, where it is most useful as a Poisson distribution with gamma-distributed rate (a mixture, not a waiting-time distribution).
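The sum-of-independent decomposition can be checked by simulation; a Monte Carlo sketch (the parameter values, seed, and sample size are arbitrary illustration choices):

```python
# Monte Carlo check of the decomposition row: a sum of r iid Geometric(p)
# draws has the NB(r, p) mean r/p, and a sum of r iid Exponential(lam)
# draws has the Gamma(r, lam) mean r/lam.
import random
from statistics import mean

random.seed(0)
r, p, lam, trials = 4, 0.25, 2.0, 100_000

def geometric(p: float) -> int:
    """Draw from Geometric(p), trials-until-first-success convention."""
    k = 1
    while random.random() >= p:
        k += 1
    return k

nb_draws = [sum(geometric(p) for _ in range(r)) for _ in range(trials)]
gamma_draws = [sum(random.expovariate(lam) for _ in range(r))
               for _ in range(trials)]

print(mean(nb_draws))     # close to r / p   = 16.0
print(mean(gamma_draws))  # close to r / lam = 2.0
```

Comparing the full empirical histograms to the PMF/PDF in the table above works the same way; the means are shown here only because they make the cleanest one-line check.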
Hypergeometric: No Clean Continuous Analog
The hypergeometric distribution models sampling without replacement: from a population of $N$ items with $K$ successes, draw $n$ items, and count the number of successes drawn. Its PMF is $P(X=k)=\frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}$. There is no clean continuous-time analog. The continuous version of "sampling without replacement" would be "drawing without re-sampling from a finite continuous population", which is not a standard probability construct. The hypergeometric appears in survey sampling, finite-population inference, and Fisher's exact test for contingency tables; in each setting, the continuous-population analog has different machinery (regression, asymptotic chi-squared, etc.).
Worth knowing: as $N\to\infty$ with $K/N\to p$, the hypergeometric converges to the Binomial($n,p$). So the "no continuous analog" gap is bridged by the binomial-Poisson rare-event chain in the large-population limit.
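This convergence is easy to check exactly from the PMFs; a sketch with illustrative values of $n$, $p$, and the growing population size $N$:

```python
# Exact check that Hypergeometric(N, K, n) approaches Binomial(n, K/N)
# as the population N grows with the success fraction K/N held fixed.
import math

def hypergeom_pmf(N: int, K: int, n: int, k: int) -> float:
    return math.comb(K, k) * math.comb(N - K, n - k) / math.comb(N, n)

def binom_pmf(n: int, p: float, k: int) -> float:
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.3
gaps = {}
for N in [50, 500, 5000]:
    K = int(p * N)  # keep the success fraction fixed at p
    gaps[N] = max(abs(hypergeom_pmf(N, K, n, k) - binom_pmf(n, p, k))
                  for k in range(n + 1))
    print(N, gaps[N])  # max pointwise PMF gap shrinks as N grows
```

Intuitively, when the sample is a vanishing fraction of the population, removing drawn items barely changes the success fraction, so sampling without replacement looks like sampling with replacement.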
Common Confusions
Geometric supports vary by convention
The two standard conventions are support $\{1,2,\dots\}$ (counting the trial of the first success) and support $\{0,1,2,\dots\}$ (counting the failures before the first success). The mean is $\frac{1}{p}$ in the first convention and $\frac{1-p}{p}$ in the second. Always state which version is in use; the formulas differ by a constant shift.
The negative binomial has at least two equally common parameterizations
The parameter $r$ can count trials until the $r$-th success (PMF supported on $\{r,r+1,\dots\}$) or failures before the $r$-th success (PMF supported on $\{0,1,2,\dots\}$). When $r$ is non-integer, the distribution is typically presented in the mean-dispersion parameterization used in GLM software. Match the convention to the source.
Memorylessness is about the residual lifetime, not the marginal distribution
The exponential is memoryless, but a sum of two iid exponentials (a gamma with shape $r=2$) is not memoryless. The reason: knowing you have waited some time without seeing both events tells you which event has happened (if any), and changes the conditional distribution of the remaining wait. Memorylessness is a fragile property and only survives in the single-event ($r=1$) special case.
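A minimal sketch of the failure, using the closed-form survival function of the shape-2 gamma (the rate and offsets are illustrative choices):

```python
# The Gamma(shape=2, rate=lam) survival function fails the memorylessness
# identity P(X > s + t | X > s) = P(X > t) that the Exponential satisfies.
import math

lam, s, t = 1.0, 1.0, 1.0  # illustration values

def gamma2_sf(x: float) -> float:
    """P(X > x) for Gamma(shape=2, rate=lam): (1 + lam*x) * exp(-lam*x)."""
    return (1 + lam * x) * math.exp(-lam * x)

residual = gamma2_sf(s + t) / gamma2_sf(s)  # P(X > s + t | X > s)
print(residual, gamma2_sf(t))  # not equal: the wait "remembers" s
```

Here the conditional survival is smaller than the unconditional one: having already waited $s$ makes it likely the first of the two events has occurred, so the remaining wait is stochastically shorter.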
The Poisson process is more than the Poisson distribution
"Poisson distribution" is a one-parameter family on describing the count of events. "Poisson process" is a continuous-time stochastic process whose count over any interval is Poisson and whose increments over disjoint intervals are independent. The two are tightly related but conceptually distinct; many of the discrete-continuous pairings above are pairings of stochastic processes (Bernoulli trial sequence vs. Poisson process), with the marginal distributions falling out as derived quantities.
Exercises
Problem
Verify that an exponential random variable $X\sim\text{Exponential}(\lambda)$ satisfies the memoryless property $P(X>s+t\mid X>s)=P(X>t)$ by direct computation.
Problem
Let $X_1,\dots,X_r$ be iid Geometric($p$) on $\{1,2,\dots\}$. Show that $S=X_1+\cdots+X_r$ has the negative binomial PMF $P(S=k)=\binom{k-1}{r-1}p^r(1-p)^{k-r}$ for $k\ge r$.
Problem
Let $T_1,T_2,\dots$ be iid Exponential($\lambda$) inter-arrival times and let $N_t$ be the number of arrivals in $[0,t]$. Show that $N_t\sim\text{Poisson}(\lambda t)$ by computing $P(N_t\ge k)=P(T_1+\cdots+T_k\le t)$ via the gamma distribution of the partial sums.
References
Canonical:
- Casella and Berger, Statistical Inference (2002), 2nd edition, Chapter 3 (common families of distributions, discrete and continuous)
- Blitzstein and Hwang, Introduction to Probability (2019), 2nd edition, Chapters 4 (discrete), 5 (continuous), and 13 (Poisson processes)
- Durrett, Probability: Theory and Examples (2019), 5th edition, Chapter 4 (Poisson processes)
Distribution catalogs:
- Johnson, Kotz, and Kemp, Univariate Discrete Distributions (2005), 3rd edition
- Johnson, Kotz, and Balakrishnan, Continuous Univariate Distributions, Volume 1 (1994), 2nd edition
Applied / process view:
- Ross, Introduction to Probability Models (2014), 11th edition, Chapter 5 (Poisson processes)
- Pinsky and Karlin, An Introduction to Stochastic Modeling (2011), 4th edition
Next Topics
- Central limit theorem: the universal continuous limit of normalized sums of discrete or continuous distributions.
- Law of large numbers: the matched-pair partner of the CLT; applies equally to discrete and continuous distributions.
- Method of moments: the estimation strategy that uses moments directly from the tables on this page.
Last reviewed: May 12, 2026
Canonical graph
Required prerequisites
- Common Probability Distributions
- Random Variables
Derived topics
- No published topic currently declares this as a prerequisite.