Foundations
Exponential Distribution
The Exponential distribution as the memoryless waiting time: density, CDF, MGF, the memoryless property as a characterization, the Poisson-process inter-arrival construction, the minimum of independent exponentials, MLE for the rate, and the bridge to the Gamma distribution.
Why This Matters
The Exponential is the only continuous distribution on $(0, \infty)$ that is memoryless: conditional on surviving past any time $s$, the remaining waiting time has the same distribution as the original. That single property determines the family up to its rate parameter, and it is the reason the Exponential is the canonical model for the time between events in a Poisson process. Once you have the Exponential, you have the Poisson process; once you have the Poisson process, you have the Poisson distribution for counts and the Gamma distribution for the time to the $n$-th event.
The Exponential is also the maximum-entropy distribution on $(0, \infty)$ with a fixed mean, which makes it the right default model when all you know about a positive random variable is its mean. Downstream uses include reliability, queueing, survival analysis, and the noise model in some Bayesian regressions.
Definition
Exponential Distribution
A random variable $X$ has an Exponential distribution with rate $\lambda > 0$, written $X \sim \text{Exp}(\lambda)$, if it has density
$$f(x) = \lambda e^{-\lambda x}, \quad x \ge 0,$$
and zero density for $x < 0$. The CDF is $F(x) = 1 - e^{-\lambda x}$ for $x \ge 0$.
An equivalent parameterization uses scale $\beta = 1/\lambda$: $f(x) = \beta^{-1} e^{-x/\beta}$. The rate parameterization gives mean $1/\lambda$; the scale parameterization gives mean $\beta$.
The two parameterizations describe the same family but differ in software conventions: SciPy and R survival packages tend to use scale, mathematical-statistics texts tend to use rate. Always check the convention before plugging in.
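A quick way to guard against the convention mismatch is to check the mean directly. A minimal sketch using SciPy, which follows the scale convention:

```python
# Rate vs. scale conventions: a quick sanity check, assuming SciPy is installed.
import numpy as np
from scipy import stats

lam = 2.0  # rate; the mean should be 1/lam = 0.5

# SciPy's expon uses the SCALE convention: pass scale = 1/rate.
right = stats.expon(scale=1/lam)  # mean 1/lam = 0.5
wrong = stats.expon(scale=lam)    # mean lam = 2.0 -- a common bug

print(right.mean())  # 0.5
print(wrong.mean())  # 2.0
```

Plugging a rate where a scale is expected quadruples the mean here; the error scales with $\lambda^2$, so it is rarely subtle.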
Moments and MGF
The mean of $X \sim \text{Exp}(\lambda)$ is $\mathbb{E}[X] = 1/\lambda$ and the variance is $\text{Var}(X) = 1/\lambda^2$. Both follow from the MGF.
Exponential MGF
Statement
For $X \sim \text{Exp}(\lambda)$ and $t < \lambda$,
$$M_X(t) = \mathbb{E}[e^{tX}] = \frac{\lambda}{\lambda - t}.$$
For $t \ge \lambda$ the MGF is infinite.
Intuition
The MGF is the Laplace transform of the density at $-t$. The integrand $e^{tx} \lambda e^{-\lambda x} = \lambda e^{-(\lambda - t)x}$ is integrable exactly when the exponent is negative, that is, when $t < \lambda$.
Proof Sketch
$$M_X(t) = \int_0^\infty e^{tx} \lambda e^{-\lambda x}\, dx = \lambda \int_0^\infty e^{-(\lambda - t)x}\, dx = \frac{\lambda}{\lambda - t} \quad \text{for } t < \lambda.$$
For $t \ge \lambda$ the integrand does not decay and the integral diverges.
Why It Matters
Differentiating twice and evaluating at zero gives $\mathbb{E}[X] = 1/\lambda$ and $\mathbb{E}[X^2] = 2/\lambda^2$, so $\text{Var}(X) = 1/\lambda^2$. The MGF has a pole at $t = \lambda$, which means the Exponential is light-tailed but not sub-Gaussian: it is sub-exponential. See sub-exponential random variables for the resulting tail bound.
Failure Mode
The MGF is finite only on the half-line $t < \lambda$. Chernoff bounds for the Exponential must keep the parameter inside this region; pushing $t$ to $\lambda$ blows up the MGF and gives no useful bound. The Gamma and Chi-squared inherit this restriction because they are sums of Exponentials.
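The closed form $M_X(t) = \lambda/(\lambda - t)$ can be checked against a Monte Carlo estimate of $\mathbb{E}[e^{tX}]$, keeping $t$ well inside the region $t < \lambda$. A minimal sketch in NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
lam, t = 2.0, 0.5  # t must stay strictly below the rate lam
x = rng.exponential(scale=1/lam, size=200_000)

mgf_mc = np.exp(t * x).mean()   # Monte Carlo estimate of E[e^{tX}]
mgf_closed = lam / (lam - t)    # closed form: 2 / 1.5 = 4/3

assert abs(mgf_mc - mgf_closed) < 0.02
```

Pushing $t$ toward $\lambda$ makes both the closed form and the Monte Carlo estimate blow up, which is the Chernoff-bound restriction in action.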
The Memoryless Property
Memoryless Property
Statement
For $X \sim \text{Exp}(\lambda)$ and every $s, t \ge 0$,
$$\mathbb{P}(X > s + t \mid X > s) = \mathbb{P}(X > t).$$
Intuition
A light bulb whose failure time is Exponential has no memory of how long it has been on. Given that it has not failed by time $s$, the distribution of the remaining time looks identical to the distribution of a fresh bulb.
Proof Sketch
By definition of conditional probability,
$$\mathbb{P}(X > s + t \mid X > s) = \frac{\mathbb{P}(X > s + t)}{\mathbb{P}(X > s)} = \frac{e^{-\lambda(s+t)}}{e^{-\lambda s}} = e^{-\lambda t} = \mathbb{P}(X > t).$$
Why It Matters
The memoryless property is the defining feature of the Exponential family among continuous distributions on $(0, \infty)$. It is what makes Poisson processes Markovian: the future does not depend on the past beyond the current time. It also makes Exponential models inappropriate when the hazard rate of the underlying process changes over time; in those cases use the Gamma distribution or a Weibull distribution.
Failure Mode
The memoryless property does not hold for the Gamma distribution with shape parameter different from one, the Lognormal, or the Weibull with shape parameter different from one. Reliability data with increasing or decreasing hazard rates should not be modeled by an Exponential; the wrong parametric family will systematically misprice tail risk.
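Memorylessness is easy to see in simulation: condition a large Exponential sample on surviving past $s$, subtract $s$, and the residual sample should look like a fresh draw from the same distribution. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
lam, s = 1.5, 2.0
x = rng.exponential(scale=1/lam, size=1_000_000)

residual = x[x > s] - s  # remaining waiting time, given survival past s

# Memorylessness: residual ~ Exp(lam), matching the unconditioned sample.
assert abs(residual.mean() - 1/lam) < 0.05
assert abs(np.median(residual) - np.log(2)/lam) < 0.05
```

Running the same check on Gamma or Weibull lifetimes with shape different from one fails: the residual distribution shifts as $s$ grows, which is exactly the non-constant hazard the Failure Mode above warns about.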
Memorylessness Characterizes the Exponential
Memoryless Characterization
Statement
If $X$ is a nonnegative continuous random variable that is memoryless and not identically zero, then there exists a unique $\lambda > 0$ with $X \sim \text{Exp}(\lambda)$.
Intuition
Memorylessness is the multiplicative functional equation $G(s + t) = G(s)\,G(t)$ for the survival function $G(t) = \mathbb{P}(X > t)$. The only nonzero continuous solutions are the decaying exponentials, $G(t) = e^{-\lambda t}$.
Proof Sketch
Let $G(t) = \mathbb{P}(X > t)$. Memorylessness gives $\mathbb{P}(X > s + t) = \mathbb{P}(X > s)\,\mathbb{P}(X > t)$, that is, $G(s + t) = G(s)\,G(t)$. With $G$ right-continuous and $G(t) > 0$ for some $t > 0$ (assuming $X$ is not almost surely zero), Cauchy's functional equation gives $G(t) = e^{ct}$ for some constant $c$. The condition that $G$ is a probability survival function ($G$ nonincreasing, $G(t) \to 0$ as $t \to \infty$) forces $c = -\lambda$ with $\lambda > 0$. This is $X \sim \text{Exp}(\lambda)$.
Why It Matters
The characterization is what makes "memoryless waiting time" a one-line argument for choosing the Exponential family: no other continuous distribution on $(0, \infty)$ has it. Combined with the discrete analogue (the Geometric is the only memoryless distribution on $\{1, 2, \dots\}$), the result anchors a clean decision: if you have evidence the hazard rate is constant in time, the Exponential is forced; if not, do not use it.
Failure Mode
The characterization assumes continuity. Replacing it with general right-continuity allows for trivial distributions concentrated at zero. The discrete analogue with Geometric distributions on $\{1, 2, \dots\}$ uses a different functional equation; the two characterizations are parallel but not interchangeable.
Minimum of Independent Exponentials
Minimum of Independent Exponentials
Statement
Let $X_i \sim \text{Exp}(\lambda_i)$ for $i = 1, \dots, n$ be independent. Then
$$\min_i X_i \sim \text{Exp}(\lambda_1 + \cdots + \lambda_n)$$
and
$$\mathbb{P}\Big(X_j = \min_i X_i\Big) = \frac{\lambda_j}{\sum_{i=1}^n \lambda_i}.$$
Intuition
The probability that no event has happened by time $t$ is the product of the survival probabilities, which is exponential in the sum of rates. The probability that event $j$ is the first equals the relative rate $\lambda_j / \sum_i \lambda_i$, by symmetry of the joint density.
Proof Sketch
$$\mathbb{P}\Big(\min_i X_i > t\Big) = \prod_{i=1}^n \mathbb{P}(X_i > t) = e^{-\left(\sum_i \lambda_i\right) t},$$
which is the survival function of $\text{Exp}(\sum_i \lambda_i)$. For the identity of the minimizing index, integrate the joint density over the region where $X_j$ is smallest:
$$\mathbb{P}\Big(X_j = \min_i X_i\Big) = \int_0^\infty \lambda_j e^{-\lambda_j t} \prod_{i \ne j} e^{-\lambda_i t}\, dt = \frac{\lambda_j}{\sum_i \lambda_i}.$$
Why It Matters
This identity is what competing-risks models, queueing systems, and the Gillespie algorithm for simulating continuous-time Markov chains depend on. Each event clock is exponential; the first to fire determines the next state, and the time of the first firing is itself exponential.
Failure Mode
The result requires independence and exponential marginals. For dependent or non-exponential lifetimes, the minimum is not exponential and the relative-rate identification of which clock fires first fails. The classical competing-risks formula generalizes via cause-specific hazards, not by the elementary calculation here.
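The two claims, minimum Exponential with summed rate and winner chosen with probability proportional to its rate, are exactly what one step of the Gillespie algorithm simulates. A minimal sketch with hypothetical clock rates:

```python
import numpy as np

rng = np.random.default_rng(2)
rates = np.array([1.0, 2.0, 3.0])  # hypothetical clock rates
n = 200_000
x = rng.exponential(scale=1/rates, size=(n, 3))  # one row per trial

mins = x.min(axis=1)
winners = x.argmin(axis=1)

# min ~ Exp(sum of rates): mean should be 1 / 6
assert abs(mins.mean() - 1/rates.sum()) < 0.01

# P(clock j fires first) = rates[j] / rates.sum() = [1/6, 2/6, 3/6]
freq = np.bincount(winners, minlength=3) / n
assert np.allclose(freq, rates/rates.sum(), atol=0.01)
```

In a Gillespie step, one draw of the minimum gives the time advance, and the winner's index gives the state transition; the identity above says the two draws can also be made independently.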
Connection to Poisson Process and Gamma
A Poisson process with rate $\lambda$ on $[0, \infty)$ has the following equivalent characterizations:
- The number of events in any interval of length $t$ is $\text{Poisson}(\lambda t)$.
- The inter-arrival times are i.i.d. $\text{Exp}(\lambda)$.
- The waiting time for the $n$-th event is $\text{Gamma}(n, \lambda)$.
The second characterization is what makes the Exponential the canonical continuous model for "time between rare events with constant rate". The third bridges directly to the Gamma distribution as a sum of Exponentials. See the proof that a sum of i.i.d. Exponentials is Gamma in the distributions atlas.
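The second characterization gives a direct way to simulate the process and check the first: cumulate i.i.d. $\text{Exp}(\lambda)$ gaps and count how many arrivals land before a horizon. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
lam, horizon, trials = 4.0, 10.0, 50_000

# Generate far more gaps per trial than could plausibly fit in [0, horizon].
gaps = rng.exponential(scale=1/lam, size=(trials, int(3 * lam * horizon)))
arrivals = np.cumsum(gaps, axis=1)            # arrival times per trial
counts = (arrivals <= horizon).sum(axis=1)    # events in [0, horizon]

# Counts should be Poisson(lam * horizon): mean and variance both 40.
assert abs(counts.mean() - lam * horizon) < 0.3
assert abs(counts.var() - lam * horizon) < 1.5
```

That the sample mean and variance agree is the Poisson signature; over-generating gaps is a crude but safe way to guarantee the horizon is covered.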
Maximum Likelihood Estimation
MLE for the Rate
Statement
Given an i.i.d. sample $x_1, \dots, x_n$ from $\text{Exp}(\lambda)$, the MLE is
$$\hat{\lambda} = \frac{n}{\sum_{i=1}^n x_i} = \frac{1}{\bar{x}}.$$
The MLE for the scale parameterization is $\hat{\beta} = \bar{x}$.
Intuition
The log-likelihood is concave in $\lambda$ with a single critical point. The MLE for the rate is the reciprocal of the sample mean; the MLE for the mean is the sample mean.
Proof Sketch
The log-likelihood is
$$\ell(\lambda) = n \log \lambda - \lambda \sum_{i=1}^n x_i.$$
Differentiating: $\ell'(\lambda) = n/\lambda - \sum_i x_i = 0$ gives $\hat{\lambda} = n / \sum_i x_i$. The second derivative $\ell''(\lambda) = -n/\lambda^2$ is negative, confirming a maximum.
Why It Matters
$\hat{\lambda}$ is biased upward in finite samples: $\mathbb{E}[\hat{\lambda}] = \frac{n}{n-1}\lambda$ for $n \ge 2$, computed from the fact that $\sum_i X_i \sim \text{Gamma}(n, \lambda)$ has expected reciprocal $\lambda/(n-1)$. The bias-corrected estimator is $\frac{n-1}{n}\hat{\lambda}$. For large $n$ the bias is negligible.
Failure Mode
The MLE is undefined if every $x_i = 0$, which happens with probability zero for continuous data but can happen with quantized or censored data. Survival analysis with censoring requires a modified likelihood; see survival analysis.
The Fisher information per observation in the rate parameterization is $I(\lambda) = 1/\lambda^2$, so the asymptotic variance of $\hat{\lambda}$ is $\lambda^2 / n$. The Cramer-Rao lower bound is achieved asymptotically by the MLE.
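The finite-sample bias and its correction are easy to verify by simulation. A minimal sketch with $n = 5$, where $\mathbb{E}[\hat{\lambda}] = \frac{5}{4}\lambda$:

```python
import numpy as np

rng = np.random.default_rng(4)
lam, n, reps = 3.0, 5, 100_000

x = rng.exponential(scale=1/lam, size=(reps, n))
mle = n / x.sum(axis=1)  # 1 / sample mean, one MLE per replication

# E[mle] = n/(n-1) * lam = 3.75 for n = 5: biased upward.
assert abs(mle.mean() - lam * n/(n - 1)) < 0.05

# The corrected estimator (n-1)/n * mle is exactly unbiased.
corrected = (n - 1)/n * mle
assert abs(corrected.mean() - lam) < 0.05
```

With $n = 5$ the bias is a full 25% of $\lambda$, which is why the correction matters for small reliability samples even though it washes out asymptotically.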
Sample Output
| Quantity | Formula | Numerical example, $\lambda = 1$ |
|---|---|---|
| Mean | $1/\lambda$ | $1$ |
| Variance | $1/\lambda^2$ | $1$ |
| Median | $\ln 2 / \lambda$ | $\approx 0.693$ |
| 95th percentile | $\ln 20 / \lambda$ | $\approx 2.996$ |
| 99th percentile | $\ln 100 / \lambda$ | $\approx 4.605$ |
The median is smaller than the mean because the distribution is right-skewed: most of the mass is concentrated near zero, with a long tail.
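The quantiles come from inverting the CDF: $F^{-1}(p) = -\ln(1 - p)/\lambda$. A minimal check against SciPy's `ppf`:

```python
import numpy as np
from scipy import stats

lam = 1.0

def quantile(p):
    # Closed-form inverse CDF of Exp(lam): solve 1 - exp(-lam * x) = p.
    return -np.log(1 - p) / lam

assert np.isclose(quantile(0.5), np.log(2))   # median, ~0.693
assert np.isclose(quantile(0.95), stats.expon(scale=1/lam).ppf(0.95))
assert quantile(0.5) < 1/lam                  # median below the mean: right skew
```

The last assertion is the right-skew statement in numbers: half the mass sits below $\ln 2 / \lambda \approx 0.69 / \lambda$, well under the mean $1/\lambda$.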
Common Confusions
Rate versus scale parameterization
An Exponential with rate $\lambda$ has mean $1/\lambda$; an Exponential with scale $\beta$ has mean $\beta$. They are the same family with $\beta = 1/\lambda$, but plugging a rate $\lambda$ into a scale-parameterization library gives mean $\lambda$, not $1/\lambda$. Read the docstring before reading the result.
Memoryless is not the same as light-tailed
The Exponential is light-tailed in the sense that the MGF is finite on a half-line. Memorylessness is a separate property. The Gamma and Chi-squared are still light-tailed but are not memoryless; their hazard rates depend on time.
The minimum is exponential, the sum is not
The minimum of independent Exponentials is Exponential with summed rate. The sum is Gamma, not Exponential. The maximum is neither; its survival function is $1 - \prod_{i=1}^n \left(1 - e^{-\lambda_i t}\right)$, which has no standard closed-form name.
Hazard rate constant means constant, not exact
The Exponential has hazard rate exactly $\lambda$ at every time. Empirical hazard estimates from real data are noisy; a rolling estimate that wobbles around a constant value is consistent with Exponential lifetimes, but a steady upward or downward trend is not, regardless of the average level.
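A constant hazard is directly checkable: estimate the fraction failing in a short window among the survivors at several times, and the estimates should all hover near $\lambda$. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(5)
lam, dt = 2.0, 0.02
x = rng.exponential(scale=1/lam, size=500_000)

# Crude rolling hazard: P(fail in [t, t+dt) | alive at t) / dt.
for t in [0.2, 0.5, 1.0]:
    at_risk = (x >= t).sum()
    failed = ((x >= t) & (x < t + dt)).sum()
    hazard = failed / (at_risk * dt)
    assert abs(hazard - lam) < 0.15  # roughly constant at lam = 2
```

The same loop applied to Weibull lifetimes with shape away from one produces hazard estimates that trend with $t$, which is the diagnostic the paragraph above describes.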
Exercises
Problem
Customers arrive at a service desk according to a Poisson process with rate $\lambda$ per hour. Find the probability that the time until the next customer arrives exceeds 15 minutes.
Problem
Let $X \sim \text{Exp}(\lambda)$. Show that $\mathbb{P}(X > s + t \mid X > s) = \mathbb{P}(X > t)$ for every $s, t \ge 0$.
Problem
Let $X_1, \dots, X_n$ be independent with $X_i \sim \text{Exp}(\lambda)$. Find the density of the maximum $\max_i X_i$.
Problem
Show that if $U \sim \text{Uniform}(0, 1)$ then $-\ln(U)/\lambda$ is $\text{Exp}(\lambda)$. (This is the probability integral transform.)
Problem
Show that the MLE is asymptotically Normal: $\sqrt{n}\,(\hat{\lambda} - \lambda) \xrightarrow{d} N(0, \lambda^2)$ as $n \to \infty$. Identify the role of the Fisher information.
References
Canonical:
- Casella and Berger, Statistical Inference (2002), Chapter 3 (Section 3.3 introduces the family), Chapter 7 (Section 7.2 covers Exponential MLE).
- Lehmann and Casella, Theory of Point Estimation (1998), Chapter 1 (sufficiency for the Exponential and the connection to one-parameter exponential families).
- Ross, Introduction to Probability Models (2019), Chapter 5 (memoryless property and Poisson process construction).
Probability:
- Blitzstein and Hwang, Introduction to Probability (2019), Chapter 5.
- Durrett, Probability: Theory and Examples (2019), Chapter 2 (Section 2.5 on Poisson processes).
- Grimmett and Stirzaker, Probability and Random Processes (2020), Chapter 6 (Poisson processes and renewal theory).
Last reviewed: May 11, 2026