Concentration Probability
Measure Concentration and Geometric Functional Analysis
High-dimensional geometry is counterintuitive: Lipschitz functions concentrate, random projections preserve distances, and most of a sphere's measure sits near the equator. Johnson-Lindenstrauss, Gaussian concentration, and Lévy's lemma.
Why This Matters
High-dimensional spaces behave nothing like low-dimensional ones. In $\mathbb{R}^n$ with $n$ large, a random vector $X \sim \mathcal{N}(0, I_n)$ is, with overwhelming probability, close to $\sqrt{n}$ in norm. Two independent random unit vectors are, with high probability, nearly orthogonal. A Lipschitz function of a Gaussian vector is, with high probability, close to its mean.
These phenomena are not curiosities. They are the reason dimensionality reduction works, random projections preserve structure, and empirical averages concentrate. The Johnson-Lindenstrauss lemma, which enables practical dimensionality reduction from $\mathbb{R}^n$ to $\mathbb{R}^k$ with $k = O(\varepsilon^{-2} \log N)$, is a direct consequence of Gaussian concentration.
Gaussian Concentration
Lipschitz Function
A function $f : \mathbb{R}^n \to \mathbb{R}$ is $L$-Lipschitz if $|f(x) - f(y)| \le L \|x - y\|_2$ for all $x, y \in \mathbb{R}^n$. The smallest such $L$ is the Lipschitz constant.
Gaussian Concentration of Lipschitz Functions
Statement
Let $X \sim \mathcal{N}(0, I_n)$ and let $f : \mathbb{R}^n \to \mathbb{R}$ be $L$-Lipschitz. Then for all $t > 0$:

$$\mathbb{P}\big(|f(X) - \mathbb{E} f(X)| \ge t\big) \le 2 \exp\!\left(-\frac{t^2}{2L^2}\right)$$

The random variable $f(X) - \mathbb{E} f(X)$ is sub-Gaussian with parameter $L$.
Intuition
Each coordinate of $X$ contributes independently to $f(X)$, and the Lipschitz condition limits how much each coordinate can influence the output. The independent contributions each have bounded effect, and their sum concentrates by the same mechanism as a sum of bounded random variables. The dimension $n$ does not appear in the bound.
Proof Sketch
Use the Gaussian Poincaré inequality: $\operatorname{Var}(f(X)) \le \mathbb{E}\,\|\nabla f(X)\|^2 \le L^2$. For the tail bound, apply the Herbst argument: compute the log-MGF $\psi(\lambda) = \log \mathbb{E}\, e^{\lambda (f(X) - \mathbb{E} f(X))}$ and show $\psi(\lambda) \le \lambda^2 L^2 / 2$ using the Gaussian log-Sobolev inequality. Then apply the Chernoff bound.
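The Herbst argument above can be written out in a few lines. A sketch, assuming for convenience that $\mathbb{E} f(X) = 0$ and that $f$ is smooth:

```latex
% Gaussian log-Sobolev: Ent(g^2) <= 2 E ||grad g||^2. Apply it with g = e^{lambda f / 2}:
\mathrm{Ent}\big(e^{\lambda f}\big)
  \;\le\; \frac{\lambda^2}{2}\,\mathbb{E}\big[\|\nabla f\|^2 e^{\lambda f}\big]
  \;\le\; \frac{\lambda^2 L^2}{2}\,\mathbb{E}\, e^{\lambda f}.
% With psi(lambda) = log E e^{lambda f}, the entropy equals E e^{lambda f} (lambda psi' - psi), so
\frac{d}{d\lambda}\!\left(\frac{\psi(\lambda)}{\lambda}\right)
  \;=\; \frac{\lambda \psi'(\lambda) - \psi(\lambda)}{\lambda^2}
  \;\le\; \frac{L^2}{2}.
% Since psi(lambda)/lambda -> 0 as lambda -> 0, integrating gives psi(lambda) <= lambda^2 L^2 / 2.
% Chernoff with the optimal choice lambda = t / L^2 then yields
\mathbb{P}\big(f(X) \ge t\big) \;\le\; e^{-\lambda t + \psi(\lambda)} \;\le\; e^{-t^2/(2L^2)}.
```

The same bound for the lower tail (apply the argument to $-f$) gives the two-sided inequality with the factor 2.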
Why It Matters
This single result has enormous consequences. It implies that the norm $\|X\|_2$ concentrates around $\sqrt{n}$, inner products $\langle X, u \rangle$ concentrate around $0$, and random projections preserve distances. Most of high-dimensional probability follows from applying this inequality to appropriately chosen Lipschitz functions.
Failure Mode
The result requires Gaussianity (or more generally, a distribution satisfying a log-Sobolev inequality). For heavy-tailed distributions, Lipschitz functions do not concentrate at the Gaussian rate. The bound also requires $f$ to be globally Lipschitz; local Lipschitz conditions are not sufficient.
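The dimension-free nature of the bound is easy to check numerically. A minimal sketch, assuming NumPy; the function name `lipschitz_fluctuations` is ours, and it uses the canonical example $f(x) = \|x\|_2$, which is $1$-Lipschitz:

```python
import numpy as np

# f(x) = ||x||_2 is 1-Lipschitz, so by Gaussian concentration the
# fluctuations of f(X) around its mean should stay O(1) for every n.
rng = np.random.default_rng(0)

def lipschitz_fluctuations(n, trials=10000):
    """Empirical standard deviation of ||X||_2 for X ~ N(0, I_n)."""
    X = rng.standard_normal((trials, n))
    return np.linalg.norm(X, axis=1).std()

for n in [10, 100, 1000]:
    # Stays near 1/sqrt(2) ~ 0.71 regardless of n, even though the
    # mean of ||X||_2 itself grows like sqrt(n).
    print(f"n={n:5d}  std(||X||_2) = {lipschitz_fluctuations(n):.3f}")
```

The mean of $\|X\|_2$ grows like $\sqrt{n}$, but the spread does not grow at all, exactly as the theorem predicts.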
Johnson-Lindenstrauss Lemma
Johnson-Lindenstrauss Lemma
Statement
For any set of $N$ points $x_1, \dots, x_N \in \mathbb{R}^n$ and any $\varepsilon \in (0, 1)$, there exists a linear map $A : \mathbb{R}^n \to \mathbb{R}^k$ with $k = O(\varepsilon^{-2} \log N)$ such that for all pairs $i, j$:

$$(1 - \varepsilon)\,\|x_i - x_j\|_2^2 \;\le\; \|A x_i - A x_j\|_2^2 \;\le\; (1 + \varepsilon)\,\|x_i - x_j\|_2^2$$

The map $A = \frac{1}{\sqrt{k}} G$, where $G$ is a $k \times n$ matrix with i.i.d. $\mathcal{N}(0, 1)$ entries, works with high probability.
Intuition
A random projection $Ax$ of a vector $x$ has squared norm that concentrates around $\|x\|_2^2$. This is because $\|Ax\|_2^2 = \frac{1}{k} \sum_{i=1}^{k} \langle g_i, x \rangle^2$, where $g_i$ is the $i$-th row of $G$ and each $\langle g_i, x \rangle \sim \mathcal{N}(0, \|x\|_2^2)$. The sum of squared Gaussians concentrates around its mean by chi-squared concentration. Taking a union bound over all $\binom{N}{2}$ pairs gives the result.
Proof Sketch
Fix a single pair and let $u = x_i - x_j$. Then $\frac{\|Au\|_2^2}{\|u\|_2^2} = \frac{1}{k} \sum_{m=1}^{k} Z_m^2$ where the $Z_m \sim \mathcal{N}(0, 1)$ are i.i.d. By sub-exponential concentration of chi-squared:

$$\mathbb{P}\!\left(\left|\frac{1}{k} \sum_{m=1}^{k} Z_m^2 - 1\right| \ge \varepsilon\right) \le 2 \exp(-c\,k\,\varepsilon^2)$$

for $\varepsilon \in (0, 1)$. With $k = C\,\varepsilon^{-2} \log N$ and a union bound over $\binom{N}{2}$ pairs, the failure probability is at most $1/N$ for $C$ large enough.
Why It Matters
This lemma says that $N$ points in arbitrarily high dimension can be embedded into $O(\varepsilon^{-2} \log N)$ dimensions while approximately preserving all pairwise distances. The target dimension depends only on $N$ and $\varepsilon$, not on the original dimension $n$. This is the theoretical foundation for random projection methods in dimensionality reduction, approximate nearest neighbor search, and compressed sensing.
Failure Mode
The target dimension $k = O(\varepsilon^{-2} \log N)$ is tight: there exist point sets requiring $\Omega(\varepsilon^{-2} \log N)$ dimensions for any linear map. For very small $\varepsilon$, the target dimension can still be large. The lemma also only preserves Euclidean distances; other metrics require different techniques.
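The random construction is short enough to try directly. A sketch, assuming NumPy; the helper `jl_max_distortion` is ours, and the sizes ($N = 50$ points in $\mathbb{R}^{10000}$, $k = 1000$) are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)

def jl_max_distortion(points, k):
    """Project with A = G / sqrt(k), G having i.i.d. N(0,1) entries, and
    return the worst relative error over all squared pairwise distances."""
    n = points.shape[1]
    A = rng.standard_normal((k, n)) / np.sqrt(k)
    proj = points @ A.T
    worst = 0.0
    N = len(points)
    for i in range(N):
        for j in range(i + 1, N):
            d2 = np.sum((points[i] - points[j]) ** 2)   # original distance
            p2 = np.sum((proj[i] - proj[j]) ** 2)       # projected distance
            worst = max(worst, abs(p2 - d2) / d2)
    return worst

X = rng.standard_normal((50, 10000))   # N = 50 points in R^10000
print(jl_max_distortion(X, k=1000))    # typically a small epsilon, well below 0.5
```

Note that the projection matrix never looks at the data: a single random draw works for all $\binom{N}{2}$ pairs simultaneously.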
Concentration on the Sphere
Lévy's Lemma (Concentration on the Sphere)
Statement
Let $X$ be uniformly distributed on the sphere $S^{n-1}$ and let $f : S^{n-1} \to \mathbb{R}$ be $L$-Lipschitz. Let $M$ be a median of $f(X)$. Then for all $t > 0$:

$$\mathbb{P}\big(|f(X) - M| \ge t\big) \le 2 \exp\!\left(-\frac{c\,n\,t^2}{L^2}\right)$$

for an absolute constant $c > 0$.
The concentration improves with dimension: higher dimension means tighter concentration.
Intuition
On a high-dimensional sphere, most of the surface area is concentrated near any equator. A Lipschitz function cannot vary much on this concentrated strip. The effective number of "independent directions" on the sphere is $n$, which is why the exponent contains $n$.
Proof Sketch
The proof uses the spherical isoperimetric inequality: among all sets of a given measure on $S^{n-1}$, spherical caps have the smallest boundary measure. For a set $A$ of measure $1/2$ (e.g. a hemisphere), the $t$-enlargement $A_t = \{x : d(x, A) \le t\}$ has measure at least $1 - 2e^{-c n t^2}$. For an $L$-Lipschitz function, the set $\{f \le M + Lt\}$ contains the $t$-enlargement of $\{f \le M\}$, which has measure at least $1/2$.
Why It Matters
Lévy's lemma makes precise the idea that high-dimensional spheres are "effectively one-dimensional" from the perspective of Lipschitz functions. Any continuous measurement on a high-dimensional sphere will give nearly the same value for almost every point. This is a key ingredient in proofs of the Johnson-Lindenstrauss lemma via spherical projections.
Failure Mode
The bound degrades for non-Lipschitz functions. For discontinuous functions or functions with large Lipschitz constant relative to the sphere's radius, the concentration is useless. Also, the bound is vacuous for $n = 2$ (the circle), where no concentration occurs.
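The dimension dependence in Lévy's lemma is visible in a simulation. A sketch, assuming NumPy; the function name `sphere_fluctuations` is ours, and it uses the coordinate function $f(x) = x_1$, which is $1$-Lipschitz on the sphere with median $0$:

```python
import numpy as np

rng = np.random.default_rng(2)

def sphere_fluctuations(n, trials=20000):
    """Empirical std of the first coordinate of a uniform point on S^{n-1}."""
    X = rng.standard_normal((trials, n))
    X /= np.linalg.norm(X, axis=1, keepdims=True)  # normalizing a Gaussian
                                                   # gives the uniform measure
    return X[:, 0].std()

for n in [10, 100, 1000]:
    # Shrinks like 1/sqrt(n): exactly the rate the exponent -c n t^2 predicts.
    print(f"n={n:5d}  std(x_1) = {sphere_fluctuations(n):.4f}")
```

Since $\mathbb{E}\,x_1^2 = 1/n$ for a uniform point on $S^{n-1}$, the printed values track $1/\sqrt{n}$: the "equator" $\{x_1 \approx 0\}$ absorbs nearly all of the measure as $n$ grows.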
Why High-Dimensional Geometry is Counterintuitive
Three specific phenomena that defy low-dimensional intuition:
- Norm concentration: if $X \sim \mathcal{N}(0, I_n)$, then $\|X\|_2 / \sqrt{n} \to 1$ in probability. The norm is approximately deterministic.
- Near-orthogonality: if $X, Y \sim \mathcal{N}(0, I_n)$ independently, then the inner product of $X/\|X\|_2$ and $Y/\|Y\|_2$ has standard deviation about $1/\sqrt{n}$. Random vectors are nearly orthogonal.
- Volume in corners: most of the volume of a high-dimensional cube is near the corners (at radius of order $\sqrt{n}$), not near the center. The inscribed ball has negligible volume for large $n$.
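All three phenomena can be checked by Monte Carlo. A minimal sketch, assuming NumPy; the dimension $n = 1000$ and sample count are arbitrary, and the cube is taken to be $[-1, 1]^n$, for which a uniform point sits at radius about $\sqrt{n/3}$ while the inscribed ball has radius $1$:

```python
import numpy as np

rng = np.random.default_rng(3)
n, trials = 1000, 5000

# 1. Norm concentration: ||X||_2 / sqrt(n) -> 1 for X ~ N(0, I_n).
X = rng.standard_normal((trials, n))
ratios = np.linalg.norm(X, axis=1) / np.sqrt(n)
print("norm ratio:", ratios.mean())      # close to 1.0

# 2. Near-orthogonality: inner product of independent random unit vectors.
U = X / np.linalg.norm(X, axis=1, keepdims=True)
V = rng.standard_normal((trials, n))
V /= np.linalg.norm(V, axis=1, keepdims=True)
dots = np.sum(U * V, axis=1)
print("inner product std:", dots.std())  # about 1/sqrt(n) ~ 0.032

# 3. Cube volume: a uniform point in [-1,1]^n lies at radius ~ sqrt(n/3),
#    far outside the inscribed unit ball.
C = rng.uniform(-1, 1, size=(trials, n))
print("cube radius / sqrt(n/3):", np.linalg.norm(C, axis=1).mean() / np.sqrt(n / 3))
```

Each printed quantity is itself concentrated, which is the point: in high dimension these "random" statistics behave like constants.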
Common Confusions
JL says nothing about structure preservation beyond distances
The JL lemma preserves pairwise Euclidean distances. It does not preserve cluster structure, manifold geometry, or topological properties beyond what the distance matrix determines. Two point sets with the same pairwise distance matrix receive exactly the same JL guarantee. For structure beyond distances, you need methods like t-SNE or UMAP that optimize different objectives.
"Dimension-free" Gaussian concentration does not make dimension irrelevant
The bound does not contain $n$, but the Lipschitz constant $L$ and the mean $\mathbb{E} f(X)$ can depend on $n$. For example, $f(x) = \|x\|_2$ is $1$-Lipschitz with mean about $\sqrt{n}$. The bound says $\|X\|_2$ deviates from $\sqrt{n}$ by at most $O(1)$, which is a relative deviation of only $O(1/\sqrt{n})$.
Summary
- Lipschitz functions of Gaussians are sub-Gaussian with parameter equal to the Lipschitz constant. No dependence on dimension.
- JL: $N$ points in $\mathbb{R}^n$ can be projected to $O(\varepsilon^{-2} \log N)$ dimensions, preserving all pairwise distances up to a factor $1 \pm \varepsilon$
- Lévy's lemma: on $S^{n-1}$, concentration improves with dimension
- High-dimensional norm concentrates: $\|X\|_2 \approx \sqrt{n}$ for $X \sim \mathcal{N}(0, I_n)$
- Random vectors are nearly orthogonal in high dimensions
Exercises
Problem
Let $f(x) = \max_{i \le n} x_i$. Show that $f$ is $1$-Lipschitz and use Gaussian concentration to bound $\mathbb{P}\big(|\max_i X_i - \mathbb{E} \max_i X_i| \ge t\big)$ for $X \sim \mathcal{N}(0, I_n)$.
Problem
Prove the JL lemma for a single pair: if $u \in \mathbb{R}^n$ and $A = \frac{1}{\sqrt{k}} G$ with $G \in \mathbb{R}^{k \times n}$ having i.i.d. $\mathcal{N}(0,1)$ entries, show that $\mathbb{P}\big(\big|\|Au\|_2^2 - \|u\|_2^2\big| \ge \varepsilon \|u\|_2^2\big) \le 2 e^{-c k \varepsilon^2}$ for $\varepsilon \in (0, 1)$.
References
Canonical:
- Vershynin, High-Dimensional Probability, Chapters 5 and 8
- Ledoux, The Concentration of Measure Phenomenon (2001)
Current:
- Boucheron, Lugosi, and Massart, Concentration Inequalities (2013), Chapter 5
- Wainwright, High-Dimensional Statistics (2019), Chapter 2
- van Handel, Probability in High Dimension (2016), Chapters 1-3
Next Topics
- Empirical processes and chaining: applying concentration to suprema of random functions
Last reviewed: April 2026
Prerequisites
Foundations this topic depends on.