

Differential Privacy

Formal privacy guarantees for algorithms: epsilon-delta DP, Laplace and Gaussian mechanisms, composition theorems, DP-SGD for training neural networks, and the privacy-utility tradeoff.

Advanced · Tier 2 · ~55 min

Why This Matters

Models trained on sensitive data (medical records, financial transactions, user behavior) can memorize and leak individual records. Membership inference attacks can determine whether a specific person was in the training set. Model inversion attacks can reconstruct training examples. Differential privacy provides a mathematical guarantee: the output of the algorithm is nearly the same whether or not any single individual's data is included. This is the strongest formal privacy guarantee available, and it has been deployed at scale (Apple, Google, US Census Bureau).

The Definition

Definition

Differential Privacy

A randomized algorithm $\mathcal{M}: \mathcal{D} \to \mathcal{R}$ satisfies $(\varepsilon, \delta)$-differential privacy if for all datasets $D$ and $D'$ differing in exactly one record (neighboring datasets), and for all measurable subsets $S \subseteq \mathcal{R}$:

$$P(\mathcal{M}(D) \in S) \leq e^\varepsilon \cdot P(\mathcal{M}(D') \in S) + \delta$$

The parameter $\varepsilon > 0$ is the privacy budget (smaller is more private). The parameter $\delta \geq 0$ is the probability of a catastrophic privacy failure ($\delta = 0$ gives pure DP).

Definition

Sensitivity

The global sensitivity of a function $f: \mathcal{D} \to \mathbb{R}^d$ is:

$$\Delta f = \max_{D, D' \text{ neighbors}} \|f(D) - f(D')\|_p$$

For $\ell_1$ sensitivity, use $p = 1$. For $\ell_2$ sensitivity, use $p = 2$. Sensitivity measures the maximum change in $f$ from adding or removing one record. It determines how much noise is needed for privacy.
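For a bounded mean query, the sensitivity has a closed form. The sketch below (the function name is illustrative) computes it, assuming each record lies in a known interval and neighboring datasets differ by replacing one record:

```python
def mean_sensitivity(lo: float, hi: float, n: int) -> float:
    """l1 sensitivity of the mean of n records, each bounded in [lo, hi].

    Replacing one record can move the sum by at most (hi - lo),
    so the mean moves by at most (hi - lo) / n.
    """
    return (hi - lo) / n

# Salaries bounded in [0, 200000] over n = 1000 records:
print(mean_sensitivity(0.0, 200000.0, 1000))  # 200.0
```

Note how sensitivity shrinks as $1/n$: averages over large datasets need far less noise than averages over small ones.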

Core Mechanisms

Theorem

Laplace Mechanism

Statement

For a function $f: \mathcal{D} \to \mathbb{R}^d$ with $\ell_1$ sensitivity $\Delta f$, the Laplace mechanism:

$$\mathcal{M}(D) = f(D) + (Z_1, \ldots, Z_d), \quad Z_i \sim \text{Lap}(0, \Delta f / \varepsilon)$$

satisfies $(\varepsilon, 0)$-differential privacy (pure DP).

Intuition

Add Laplace noise scaled to the sensitivity. If changing one record changes $f$ by at most $\Delta f$, then noise with scale $\Delta f / \varepsilon$ makes it hard to tell which dataset produced the output. Smaller $\varepsilon$ (more privacy) requires more noise.

Proof Sketch

For neighboring $D, D'$: $\frac{P(\mathcal{M}(D) = x)}{P(\mathcal{M}(D') = x)} = \prod_i \frac{\exp(-|x_i - f_i(D)|\,\varepsilon / \Delta f)}{\exp(-|x_i - f_i(D')|\,\varepsilon / \Delta f)}$. By the triangle inequality on $|x_i - f_i(D)| - |x_i - f_i(D')|$ and the bound $\|f(D) - f(D')\|_1 \leq \Delta f$, this ratio is at most $\exp(\varepsilon)$.

Why It Matters

The Laplace mechanism is the simplest and most widely used DP mechanism. It achieves pure DP ($\delta = 0$) with a clear noise-privacy tradeoff. For counting queries (sensitivity 1), adding $\text{Lap}(1/\varepsilon)$ noise provides $\varepsilon$-DP.
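As a concrete sketch (the helper name is ours, not from a DP library), the counting-query case looks like this:

```python
import numpy as np

def laplace_mechanism(true_value: float, sensitivity: float,
                      epsilon: float, rng=None) -> float:
    """Release true_value plus Laplace noise of scale sensitivity / epsilon."""
    if rng is None:
        rng = np.random.default_rng()
    return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

# Counting query: sensitivity 1, so Lap(1/epsilon) noise gives epsilon-DP.
noisy_count = laplace_mechanism(true_value=42.0, sensitivity=1.0,
                                epsilon=0.5, rng=np.random.default_rng(0))
```

One caveat: naive floating-point Laplace sampling is known to leak privacy in edge cases, so production deployments use hardened samplers rather than a direct call like this.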

Failure Mode

The Laplace mechanism uses global sensitivity (the worst case over all possible databases). If one unusual record has disproportionate influence on $f$, the sensitivity is large and the noise overwhelms the signal. Local sensitivity or the smooth sensitivity framework can help but requires more sophisticated analysis.

Gaussian Mechanism

For $(\varepsilon, \delta)$-DP with $\delta > 0$, use Gaussian noise:

$$\mathcal{M}(D) = f(D) + \mathcal{N}(0, \sigma^2 I), \quad \sigma \geq \frac{\Delta_2 f \sqrt{2 \ln(1.25/\delta)}}{\varepsilon}$$

where $\Delta_2 f$ is the $\ell_2$ sensitivity (this classical calibration holds for $\varepsilon \in (0, 1)$). The Gaussian mechanism is preferred when composing many mechanisms because Gaussian noise composes more tightly than Laplace noise.
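The calibration is easy to compute directly; this sketch assumes the classical regime $\varepsilon < 1$:

```python
import math

def gaussian_sigma(l2_sensitivity: float, epsilon: float, delta: float) -> float:
    """Noise scale for the classical Gaussian mechanism:
    sigma >= Delta_2 f * sqrt(2 ln(1.25/delta)) / epsilon,
    valid for 0 < epsilon < 1.
    """
    return l2_sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon

# l2 sensitivity 1, epsilon = 0.5, delta = 1e-5  ->  sigma around 9.7
print(gaussian_sigma(1.0, 0.5, 1e-5))
```

Note that $\sigma$ scales linearly in the sensitivity and $1/\varepsilon$, but only as $\sqrt{\ln(1/\delta)}$ in $\delta$, so pushing $\delta$ down is relatively cheap.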

Composition

Theorem

Advanced Composition Theorem

Statement

If mechanisms $\mathcal{M}_1, \ldots, \mathcal{M}_k$ each satisfy $(\varepsilon, \delta)$-DP, then the composed mechanism $(\mathcal{M}_1, \ldots, \mathcal{M}_k)$ satisfies $(\varepsilon', k\delta + \delta')$-DP where:

$$\varepsilon' = \sqrt{2k \ln(1/\delta')} \cdot \varepsilon + k\varepsilon(e^\varepsilon - 1)$$

for any $\delta' > 0$.

Intuition

Naive composition says $k$ applications of $\varepsilon$-DP give $k\varepsilon$-DP (linear growth). Advanced composition shows privacy degrades as $O(\sqrt{k}\,\varepsilon)$ instead of $O(k\varepsilon)$. This square root improvement is critical for iterative algorithms like DP-SGD that compose thousands of mechanism invocations.
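Plugging numbers into the two bounds makes the square-root improvement concrete (a small sketch; the function names are ours):

```python
import math

def naive_epsilon(eps: float, k: int) -> float:
    """Basic composition: k mechanisms at eps each cost k * eps in total."""
    return k * eps

def advanced_epsilon(eps: float, k: int, delta_prime: float) -> float:
    """Advanced composition:
    eps' = sqrt(2 k ln(1/delta')) * eps + k * eps * (e^eps - 1).
    """
    return (math.sqrt(2.0 * k * math.log(1.0 / delta_prime)) * eps
            + k * eps * (math.exp(eps) - 1.0))

k, eps = 10_000, 0.01
print(naive_epsilon(eps, k))           # 100.0
print(advanced_epsilon(eps, k, 1e-5))  # roughly 5.8
```

For 10,000 steps at $\varepsilon = 0.01$ each, the naive bound is a useless $\varepsilon' = 100$ while advanced composition gives roughly $\varepsilon' \approx 5.8$ (at the cost of an extra $\delta' = 10^{-5}$).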

Proof Sketch

Model the privacy loss $\log(P(\mathcal{M}(D) \in S) / P(\mathcal{M}(D') \in S))$ as a random variable. Each mechanism contributes a bounded privacy loss. By a martingale argument (Azuma-Hoeffding on the privacy loss random variable), the total privacy loss concentrates around its mean with standard deviation $O(\varepsilon\sqrt{k})$.

Why It Matters

DP-SGD runs for thousands of training steps, each with a private gradient computation. Without advanced composition, the total privacy budget would be impractically large (thousands of times $\varepsilon$). With it, the budget grows as $\sqrt{T}$ over $T$ steps, making private deep learning feasible.

Failure Mode

Advanced composition still gives loose bounds for large $k$. Tighter accounting via Rényi DP or the moments accountant (Abadi et al., 2016) provides better bounds in practice. The numerical privacy accountant (google/dp-accounting) gives the tightest known bounds.

Post-Processing Immunity

Proposition

Post-Processing Immunity

Statement

If $\mathcal{M}$ satisfies $(\varepsilon, \delta)$-DP and $g: \mathcal{R} \to \mathcal{R}'$ is any (possibly randomized) function, then $g \circ \mathcal{M}$ also satisfies $(\varepsilon, \delta)$-DP.

Intuition

Once you have a private output, you can compute anything you want from it without degrading privacy. Training a model with DP-SGD and then deploying that model for inference does not consume additional privacy budget. Privacy is a property of how the output was generated, not what you do with it.

Proof Sketch

For a deterministic $g$ and any set $S'$ in the range of $g$, let $S = g^{-1}(S')$. Then $P(g(\mathcal{M}(D)) \in S') = P(\mathcal{M}(D) \in S) \leq e^\varepsilon P(\mathcal{M}(D') \in S) + \delta = e^\varepsilon P(g(\mathcal{M}(D')) \in S') + \delta$. A randomized $g$ is a mixture of deterministic functions, so the bound follows by averaging.

Why It Matters

This means you only need to ensure privacy at the mechanism level. All downstream analysis, model deployment, prediction serving, and reporting are automatically private. This greatly simplifies the privacy analysis of complex ML pipelines.

Failure Mode

Post-processing immunity requires that $g$ does not access the original dataset $D$. If you use $D$ again (e.g., to evaluate the private model on the training set), that counts as additional composition and consumes more privacy budget.

DP-SGD

DP-SGD (Abadi et al., 2016) makes stochastic gradient descent differentially private via two modifications:

  1. Gradient clipping: clip each per-example gradient to norm $C$: $\bar{g}_i = g_i \cdot \min(1, C / \|g_i\|_2)$. This bounds the sensitivity of the average gradient to $C / B$ where $B$ is the batch size.

  2. Noise addition: add Gaussian noise calibrated to the clipped sensitivity: $\tilde{g} = \frac{1}{B}\left(\sum_i \bar{g}_i + \mathcal{N}(0, \sigma^2 C^2 I)\right)$.

The noise scale $\sigma$ is set based on the target $(\varepsilon, \delta)$ and the number of training steps, using the moments accountant for tight composition.
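The two modifications can be sketched in a few lines of NumPy (shapes and names are illustrative; production libraries such as Opacus compute per-example gradients and noise quite differently):

```python
import numpy as np

def dp_sgd_gradient(per_example_grads, clip_norm, noise_multiplier, rng=None):
    """One DP-SGD gradient estimate: clip each per-example gradient to
    l2 norm clip_norm, sum, add N(0, (noise_multiplier * clip_norm)^2 I),
    and average over the batch."""
    if rng is None:
        rng = np.random.default_rng()
    batch_size, dim = per_example_grads.shape
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=dim)
    return (clipped.sum(axis=0) + noise) / batch_size

# Two per-example gradients in R^2; the first gets clipped from norm 5 to 1.
grads = np.array([[3.0, 4.0], [0.1, 0.2]])
g = dp_sgd_gradient(grads, clip_norm=1.0, noise_multiplier=1.0,
                    rng=np.random.default_rng(0))
```

Clipping is what bounds the sensitivity; without it, a single outlier example could make the gradient arbitrarily large and no finite noise scale would suffice.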

The Privacy-Utility Tradeoff

More privacy (smaller $\varepsilon$) requires more noise, which degrades model accuracy. This tradeoff is inherent, not an artifact of specific mechanisms. For DP-SGD:

  • $\varepsilon \sim 1$: strong privacy; moderate utility loss for large datasets, significant loss for small datasets
  • $\varepsilon \sim 10$: meaningful privacy against practical attacks, small utility loss
  • $\varepsilon \sim 100$: weak formal guarantees, minimal utility loss

The right $\varepsilon$ depends on the threat model. There is no universal threshold for "good enough" privacy.

Common Confusions

Watch Out

DP protects individuals, not aggregate statistics

DP guarantees that no single individual's data significantly affects the output. It does not hide aggregate properties of the dataset. A DP mechanism can still accurately report that 60% of patients have condition X. It just cannot reveal whether you specifically are in the dataset.

Watch Out

Epsilon is not directly interpretable as a probability

$\varepsilon = 1$ does not mean "1% privacy loss." It means the log-odds ratio of any output changes by at most 1 when one record is added or removed. For $\varepsilon = 1$, the probability ratio is at most $e^1 \approx 2.72$. Translating $\varepsilon$ into concrete attack success probabilities requires specifying the prior and the attack model.

Watch Out

DP-SGD noise is per step, not per model

Each gradient step adds noise independently. The total privacy cost is the composition of all steps, not the noise in a single step. A model trained for 1000 steps has a much larger total $\varepsilon$ than one trained for 100 steps, all else equal.

Key Takeaways

  • $(\varepsilon, \delta)$-DP bounds how much any single record can change the output distribution; smaller $\varepsilon$ is more private
  • The Laplace mechanism achieves pure DP; the Gaussian mechanism achieves approximate DP with better composition properties
  • Advanced composition gives $O(\sqrt{k})$ privacy degradation over $k$ steps instead of $O(k)$
  • Post-processing immunity means all downstream computation on a DP output is automatically private
  • DP-SGD clips per-example gradients and adds Gaussian noise to make neural network training private
  • The privacy-utility tradeoff is fundamental and task-dependent

Exercises

ExerciseCore

Problem

You want to privately release the average salary of $n = 1000$ employees, where each salary is in $[0, 200000]$. The query $f(D) = \frac{1}{n}\sum_i x_i$ has $\ell_1$ sensitivity $\Delta f = 200000 / 1000 = 200$. What noise scale is needed for $\varepsilon = 1$ using the Laplace mechanism?

ExerciseAdvanced

Problem

You run DP-SGD for $T = 10000$ steps, each satisfying $(0.01, 10^{-5})$-DP. Using naive composition, what is the total privacy guarantee? Using advanced composition with $\delta' = 10^{-5}$, what is the approximate $\varepsilon'$?

References

Canonical:

  • Dwork & Roth, The Algorithmic Foundations of Differential Privacy (2014), Chapters 2-3
  • Dwork et al., "Calibrating Noise to Sensitivity in Private Data Analysis" (2006)

Current:

  • Abadi et al., "Deep Learning with Differential Privacy" (DP-SGD, 2016), Sections 2-3
  • Balle et al., "Hypothesis Testing Interpretations and Rényi Differential Privacy" (2020)


Last reviewed: April 2026
