Differential Privacy
Formal privacy guarantees for algorithms: epsilon-delta DP, Laplace and Gaussian mechanisms, composition theorems, DP-SGD for training neural networks, and the privacy-utility tradeoff.
Why This Matters
Models trained on sensitive data (medical records, financial transactions, user behavior) can memorize and leak individual records. Membership inference attacks can determine whether a specific person was in the training set. Model inversion attacks can reconstruct training examples. Differential privacy provides a mathematical guarantee: the output of the algorithm is nearly the same whether or not any single individual's data is included. This is the strongest formal privacy guarantee available, and it has been deployed at scale (Apple, Google, US Census Bureau).
The Definition
Differential Privacy
A randomized algorithm $M$ satisfies $(\epsilon, \delta)$-differential privacy if for all datasets $D$ and $D'$ differing in exactly one record (neighboring datasets), and for all measurable subsets $S \subseteq \mathrm{Range}(M)$:

$$\Pr[M(D) \in S] \le e^{\epsilon} \Pr[M(D') \in S] + \delta$$

The parameter $\epsilon$ is the privacy budget (smaller is more private). The parameter $\delta$ is the probability of a catastrophic privacy failure ($\delta = 0$ gives pure DP).
Sensitivity
The global sensitivity of a function $f$ is:

$$\Delta f = \max_{D, D' \text{ neighboring}} \|f(D) - f(D')\|$$

For $\ell_1$ sensitivity, use $\|\cdot\|_1$. For $\ell_2$ sensitivity, use $\|\cdot\|_2$. Sensitivity measures the maximum change in $f$ from adding or removing one record. It determines how much noise is needed for privacy.
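As a concrete sketch (the query types and bounds here are illustrative examples of mine, not from the text), the global sensitivities of a few common queries:

```python
# Global sensitivity of common queries (illustrative examples).

def count_sensitivity() -> float:
    # Adding or removing one record changes a count by at most 1.
    return 1.0

def bounded_sum_sensitivity(value_bound: float) -> float:
    # If each record's value lies in [0, value_bound], one record
    # shifts the sum by at most value_bound.
    return value_bound

def bounded_mean_sensitivity(value_bound: float, n: int) -> float:
    # Mean over a dataset of fixed, known size n: one record shifts
    # the mean by at most value_bound / n.
    return value_bound / n
```

Note how the mean's sensitivity shrinks with $n$: larger datasets need proportionally less noise for the same privacy level.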
Core Mechanisms
Laplace Mechanism
Statement
For a function $f$ with $\ell_1$ sensitivity $\Delta_1 f$, the Laplace mechanism:

$$M(D) = f(D) + \mathrm{Lap}\!\left(\frac{\Delta_1 f}{\epsilon}\right)$$

satisfies $\epsilon$-differential privacy (pure DP).
Intuition
Add Laplace noise scaled to the sensitivity. If changing one record changes $f$ by at most $\Delta_1 f$, then noise with scale $\Delta_1 f / \epsilon$ makes it hard to tell which dataset produced the output. Smaller $\epsilon$ (more privacy) requires more noise.
Proof Sketch
For neighboring $D, D'$, the density ratio at any output $z$ is $\frac{p_{M(D)}(z)}{p_{M(D')}(z)} = \exp\!\left(\frac{\epsilon \left(\|z - f(D')\|_1 - \|z - f(D)\|_1\right)}{\Delta_1 f}\right)$. By the triangle inequality and the bound $\|f(D) - f(D')\|_1 \le \Delta_1 f$, this ratio is at most $e^{\epsilon}$.
Why It Matters
The Laplace mechanism is the simplest and most widely used DP mechanism. It achieves pure DP ($\delta = 0$) with a clear noise-privacy tradeoff. For counting queries (sensitivity 1), adding $\mathrm{Lap}(1/\epsilon)$ noise provides $\epsilon$-DP.
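A minimal sketch of the Laplace mechanism in Python (function and parameter names are my own):

```python
import numpy as np

def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
    """Release true_value with Laplace noise of scale sensitivity/epsilon.

    Satisfies epsilon-DP (pure DP). Sketch only, not a hardened
    implementation (e.g., no protection against floating-point attacks).
    """
    if rng is None:
        rng = np.random.default_rng()
    scale = sensitivity / epsilon
    return true_value + rng.laplace(loc=0.0, scale=scale)

# Counting query: sensitivity 1. Smaller epsilon means wider noise.
rng = np.random.default_rng(0)
private_count = laplace_mechanism(true_value=1234, sensitivity=1.0,
                                  epsilon=0.5, rng=rng)
```

Each call draws fresh noise; releasing the same statistic twice consumes the budget twice (see composition below).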
Failure Mode
The Laplace mechanism uses global sensitivity (worst case over all possible databases). If one unusual record has disproportionate influence on $f$, the sensitivity is large and the noise overwhelms the signal. Local sensitivity or the smooth sensitivity framework can help but requires more sophisticated analysis.
Gaussian Mechanism
For $(\epsilon, \delta)$-DP with $\delta > 0$, use Gaussian noise:

$$M(D) = f(D) + \mathcal{N}(0, \sigma^2 I), \qquad \sigma \ge \frac{\Delta_2 f \sqrt{2\ln(1.25/\delta)}}{\epsilon}$$

where $\Delta_2 f$ is the $\ell_2$ sensitivity. The Gaussian mechanism is preferred when composing many mechanisms because Gaussian noise composes more tightly than Laplace noise.
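The calibration above can be sketched directly (a minimal example; the classic $\sqrt{2\ln(1.25/\delta)}$ bound assumes $\epsilon < 1$, and tighter analytic calibrations exist):

```python
import numpy as np

def gaussian_sigma(l2_sensitivity, epsilon, delta):
    """Classic Gaussian-mechanism calibration:
    sigma >= Delta_2 f * sqrt(2 ln(1.25/delta)) / epsilon.
    Valid for epsilon < 1; a sketch, not the tightest known bound."""
    return l2_sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon

def gaussian_mechanism(true_value, l2_sensitivity, epsilon, delta, rng=None):
    if rng is None:
        rng = np.random.default_rng()
    sigma = gaussian_sigma(l2_sensitivity, epsilon, delta)
    return true_value + rng.normal(0.0, sigma, size=np.shape(true_value))
```

For $\Delta_2 f = 1$, $\epsilon = 0.5$, $\delta = 10^{-5}$, this gives $\sigma \approx 9.7$: the small $\delta$ costs only a logarithmic factor in noise.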
Composition
Advanced Composition Theorem
Statement
If mechanisms $M_1, \dots, M_k$ each satisfy $(\epsilon, \delta)$-DP, then the composed mechanism satisfies $(\epsilon', k\delta + \delta')$-DP where:

$$\epsilon' = \epsilon\sqrt{2k\ln(1/\delta')} + k\epsilon(e^{\epsilon} - 1)$$

for any $\delta' > 0$.
Intuition
Naive composition says $k$ applications of $\epsilon$-DP give $k\epsilon$-DP (linear growth). Advanced composition shows privacy degrades as $\sqrt{k}\,\epsilon$ instead of $k\epsilon$. This square-root improvement is critical for iterative algorithms like DP-SGD that compose thousands of mechanism invocations.
Proof Sketch
Model the privacy loss as a random variable. Each mechanism contributes a bounded privacy loss. By a martingale argument (Azuma-Hoeffding on the privacy loss random variable), the total privacy loss concentrates around its mean with standard deviation $O(\epsilon\sqrt{k})$.
Why It Matters
DP-SGD runs for thousands of training steps, each with a private gradient computation. Without advanced composition, the total privacy budget would be impractical (thousands of times the per-step $\epsilon$). With it, the budget grows only as the square root of the number of steps, making private deep learning feasible.
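A quick numerical comparison of the two bounds (the step count and per-step budget below are illustrative values of mine):

```python
import numpy as np

def naive_epsilon(k, eps):
    # Basic composition: budgets add linearly.
    return k * eps

def advanced_epsilon(k, eps, delta_prime):
    # Advanced composition (Dwork & Roth):
    # eps' = eps*sqrt(2k ln(1/delta')) + k*eps*(e^eps - 1)
    return eps * np.sqrt(2 * k * np.log(1 / delta_prime)) \
        + k * eps * (np.exp(eps) - 1)

k, eps = 10_000, 0.01
print(naive_epsilon(k, eps))           # 100.0
print(advanced_epsilon(k, eps, 1e-5))  # ~5.8
```

At 10,000 steps of $\epsilon = 0.01$ each, naive composition gives a useless total of 100, while advanced composition gives roughly 5.8 (at the cost of an added $\delta' = 10^{-5}$).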
Failure Mode
Advanced composition still gives loose bounds for large $k$. Tighter accounting via Rényi DP or the moments accountant (Abadi et al., 2016) provides better bounds in practice. Numerical privacy accountants (e.g., google/dp-accounting) give the tightest known bounds.
Post-Processing Immunity
Post-Processing Immunity
Statement
If $M$ satisfies $(\epsilon, \delta)$-DP and $g$ is any (possibly randomized) function, then $g \circ M$ also satisfies $(\epsilon, \delta)$-DP.
Intuition
Once you have a private output, you can compute anything you want from it without degrading privacy. Training a model with DP-SGD and then deploying that model for inference does not consume additional privacy budget. Privacy is a property of how the output was generated, not what you do with it.
Proof Sketch
For any measurable set $S$ in the range of $g$, let $T = g^{-1}(S)$. Then $\Pr[g(M(D)) \in S] = \Pr[M(D) \in T] \le e^{\epsilon}\Pr[M(D') \in T] + \delta = e^{\epsilon}\Pr[g(M(D')) \in S] + \delta$.
Why It Matters
This means you only need to ensure privacy at the mechanism level. All downstream analysis, model deployment, prediction serving, and reporting are automatically private. This greatly simplifies the privacy analysis of complex ML pipelines.
Failure Mode
Post-processing immunity requires that $g$ does not access the original dataset $D$. If you use $D$ again (e.g., to evaluate the private model on the training set), that counts as additional composition and consumes more privacy budget.
DP-SGD
DP-SGD (Abadi et al., 2016) makes stochastic gradient descent differentially private via two modifications:
- Gradient clipping: clip each per-example gradient $g_i$ to $\ell_2$ norm $C$: $\bar{g}_i = g_i / \max(1, \|g_i\|_2 / C)$. This bounds the sensitivity of the average gradient to $C/B$ where $B$ is the batch size.
- Noise addition: add Gaussian noise calibrated to the clipped sensitivity: $\tilde{g} = \frac{1}{B}\left(\sum_i \bar{g}_i + \mathcal{N}(0, \sigma^2 C^2 I)\right)$.
The noise scale $\sigma$ is set based on the target $(\epsilon, \delta)$ and the number of training steps, using the moments accountant for tight composition.
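The two modifications above can be sketched as a single gradient step in NumPy (a sketch following Abadi et al., 2016; function and parameter names are my own, and real implementations such as Opacus or TF Privacy handle sampling and accounting too):

```python
import numpy as np

def dp_sgd_gradient(per_example_grads, clip_norm, noise_multiplier, rng=None):
    """One DP-SGD gradient estimate.

    per_example_grads: array of shape (B, d), one gradient per example.
    Clips each row to L2 norm clip_norm, sums, adds Gaussian noise with
    std noise_multiplier * clip_norm, then averages over the batch.
    """
    if rng is None:
        rng = np.random.default_rng()
    B, d = per_example_grads.shape
    # Per-example clipping bounds each example's contribution to clip_norm.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads / np.maximum(1.0, norms / clip_norm)
    # Noise calibrated to the clipped sensitivity (= clip_norm for the sum).
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=d)
    return (clipped.sum(axis=0) + noise) / B
```

With `noise_multiplier = 0` this reduces to ordinary clipped averaging; the privacy accountant maps the chosen multiplier and step count to a final $(\epsilon, \delta)$.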
The Privacy-Utility Tradeoff
More privacy (smaller $\epsilon$) requires more noise, which degrades model accuracy. This tradeoff is inherent, not an artifact of specific mechanisms. For DP-SGD:
- $\epsilon \le 1$: strong privacy, moderate utility loss for large datasets, significant loss for small datasets
- $\epsilon \approx 1$ to $10$: meaningful privacy against practical attacks, small utility loss
- $\epsilon > 10$: weak formal guarantees, minimal utility loss
The right $\epsilon$ depends on the threat model. There is no universal threshold for "good enough" privacy.
Common Confusions
DP protects individuals, not aggregate statistics
DP guarantees that no single individual's data significantly affects the output. It does not hide aggregate properties of the dataset. A DP mechanism can still accurately report that 60% of patients have condition X. It just cannot reveal whether you specifically are in the dataset.
Epsilon is not directly interpretable as a probability
$\epsilon = 1$ does not mean "1% privacy loss." It means the log of the probability ratio of any output changes by at most 1 when one record is added or removed. For $\epsilon = 1$, the probability ratio is at most $e \approx 2.72$. Translating $\epsilon$ into concrete attack success probabilities requires specifying the prior and the attack model.
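The worst-case likelihood ratio $e^{\epsilon}$ makes the nonlinearity concrete:

```python
import math

# Worst-case likelihood ratio e^eps between neighboring datasets.
for eps in (0.01, 0.1, 1.0, 10.0):
    print(f"eps={eps}: ratio <= {math.exp(eps):.2f}")
# eps=0.01 -> 1.01, eps=0.1 -> 1.11, eps=1.0 -> 2.72, eps=10.0 -> 22026.47
```

Note the scale: $\epsilon = 10$ permits a likelihood ratio over 22,000, which is why large budgets give only weak formal guarantees.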
DP-SGD noise is per step, not per model
Each gradient step adds noise independently. The total privacy cost is the composition of all steps, not the noise in a single step. A model trained for 1000 steps has a much larger total $\epsilon$ than one trained for 100 steps, all else equal.
Key Takeaways
- $(\epsilon, \delta)$-DP bounds how much any single record can change the output distribution; smaller $\epsilon$ is more private
- The Laplace mechanism achieves pure DP; the Gaussian mechanism achieves approximate DP with better composition properties
- Advanced composition gives $O(\sqrt{k}\,\epsilon)$ privacy degradation over $k$ steps instead of $k\epsilon$
- Post-processing immunity means all downstream computation on a DP output is automatically private
- DP-SGD clips per-example gradients and adds Gaussian noise to make neural network training private
- The privacy-utility tradeoff is fundamental and task-dependent
Exercises
Problem
You want to privately release the average salary of $n$ employees, where each salary is in $[0, S]$. The query has sensitivity $\Delta f = S/n$. What noise scale is needed for $\epsilon$-DP using the Laplace mechanism?
Problem
You run DP-SGD for $k$ steps, each satisfying $(\epsilon, \delta)$-DP. Using naive composition, what is the total privacy guarantee? Using advanced composition with failure probability $\delta'$, what is the approximate total $\epsilon'$?
References
Canonical:
- Dwork & Roth, The Algorithmic Foundations of Differential Privacy (2014), Chapters 2-3
- Dwork et al., "Calibrating Noise to Sensitivity in Private Data Analysis" (2006)
Current:
- Abadi et al., "Deep Learning with Differential Privacy" (DP-SGD, 2016), Sections 2-3
- Balle et al., "Hypothesis Testing Interpretations and Renyi Differential Privacy" (2020)
Next Topics
- Hypothesis testing for ML: statistical foundations that interact with privacy guarantees
Last reviewed: April 2026
Prerequisites
Foundations this topic depends on.
- Common Probability Distributions (Layer 0A)
- Sets, Functions, and Relations (Layer 0A)
- Basic Logic and Proof Techniques (Layer 0A)