Zero-Knowledge Proofs

Sneiderman, Robby

Applied Math

Zero-Knowledge Proofs

Prove you know a secret without revealing it. Interactive proofs, the simulation paradigm, ZK completeness, SNARKs, and connections to complexity theory and verifiable computation.

AdvancedTier 2CurrentReference~50 min

Prerequisites

Public Key Cryptography P vs Np

Prereq Map

Learning position

Read this page in the graph.

applied-math | layer 3 | tier 2. This page has 1 direct prerequisite and 0 published dependents.

Open Atlas Prerequisites Leads to

What next

Take the diagnostic

No published continuation is declared yet, so the diagnostic is the clean next route.

Evidence badge

Claim status

This page has no public Lean mapping yet. Use the evidence page to inspect how claim status labels work.

Show the backing system

AtlasOpen the full prerequisite graph and run grounding traces.EvidenceInspect source support, claim labels, and public trust status.LeanReview the checked declaration list, scopes, and axiom profile.

Why This Matters

A zero-knowledge proof lets a prover convince a verifier that a statement is true without revealing any information beyond the truth of the statement. This is a precise cryptographic notion, not a vague idea. The prover demonstrates knowledge (e.g., "I know the factorization of $n$ ") without leaking the knowledge itself (the factors $p$ and $q$ ).

The applications span cryptography, blockchain, and increasingly ML. Verifiable computation lets a client outsource a computation to an untrusted server and verify the result efficiently, without re-doing the work. Zero-knowledge proofs are the mechanism. In ML, this enables verifiable inference: proving that a model output was computed correctly from a specific model and input, without revealing the model weights.

Interactive Proof Systems

Definition

Interactive Proof System

An interactive proof system for a language $L$ consists of two parties:

A prover $P$ (computationally unbounded) who wants to convince the verifier that $x \in L$
A verifier $V$ (probabilistic polynomial-time) who interacts with $P$ and either accepts or rejects

The interaction proceeds in rounds: the verifier sends a challenge, the prover responds, and this repeats for polynomially many rounds.

Definition

Completeness and Soundness

An interactive proof system $(P, V)$ for language $L$ satisfies:

Completeness: For every $x \in L$ , the honest prover $P$ convinces $V$ to accept with probability at least $2/3$ :

$\Pr[V \text{ accepts after interacting with } P \text{ on input } x] \geq 2/3$

Soundness: For every $x \notin L$ and every (possibly cheating) prover $P^*$ :

$\Pr[V \text{ accepts after interacting with } P^* \text{ on input } x] \leq 1/3$

The constants $2/3$ and $1/3$ are conventional and can be amplified to $1 - 2^{-k}$ and $2^{-k}$ by $O(k)$ independent repetitions.

Definition

Zero-Knowledge Property (Simulation Paradigm)

An interactive proof $(P, V)$ for $L$ is zero-knowledge if and only if for every probabilistic polynomial-time verifier $V^*$ , there exists a probabilistic polynomial-time simulator $S$ such that for all $x \in L$ :

$\text{View}_{V^*}(P(x) \leftrightarrow V^*(x)) \approx S(x)$

where $\text{View}_{V^*}$ is the transcript of messages and $V^*$ 's random coins during the interaction, and $\approx$ denotes computational indistinguishability.

The simulator $S$ produces a fake transcript without access to the prover or any witness. If such a simulator exists, then the verifier learns nothing from the interaction that it could not have generated on its own.

The three variants of "zero-knowledge" depend on the strength of the $\approx$ relation:

Perfect ZK: the distributions are identical
Statistical ZK: the distributions are statistically close (negligible total variation distance)
Computational ZK: the distributions are computationally indistinguishable (no polynomial-time algorithm can tell them apart)

The Classic Example: Graph Isomorphism

Example

Zero-knowledge proof for graph isomorphism

Setup: The prover knows an isomorphism $\pi: G_0 \to G_1$ between two graphs. The verifier has both graphs but not $\pi$ .

Protocol (one round):

The prover picks a random permutation $\sigma$ of the vertices and sends $H = \sigma(G_0)$ (a random relabeling of $G_0$ ) to the verifier.
The verifier picks a random bit $b \in \{0, 1\}$ and sends it to the prover.
The prover must produce an isomorphism from $G_b$ $G_{b}$ to $H$ $H$ :
- If $b = 0$ : send $\sigma$ (since $H = \sigma(G_0)$ )
- If $b = 1$ : send $\sigma \circ \pi^{-1}$ (since $H = \sigma(G_0) = \sigma(\pi^{-1}(G_1))$ )
The verifier checks the isomorphism and accepts if it is valid.

Completeness: The honest prover always answers correctly, so the verifier accepts with probability 1.

Soundness: If $G_0 \not\cong G_1$ , then $H$ is isomorphic to $G_0$ but not $G_1$ (or vice versa). A cheating prover can answer correctly for only one value of $b$ , so the verifier rejects with probability at least $1/2$ . After $k$ rounds, the cheating probability drops to $2^{-k}$ .

Zero-knowledge: The simulator $S$ works as follows. Pick a random bit $b'$ . Compute $H = \sigma(G_{b'})$ for a random permutation $\sigma$ . Set the verifier's challenge to $b'$ . The simulated transcript is $(H, b', \sigma)$ . This has the same distribution as the real transcript because $H$ is a random relabeling of $G_0$ (equivalently, of $G_1$ since $G_0 \cong G_1$ ), and the challenge is a uniform random bit. The simulator never uses $\pi$ .

ZK Completeness for NP

Theorem

Zero-Knowledge Completeness for NP

Statement

Assuming the existence of one-way functions, every language $L \in \text{NP}$ has a computational zero-knowledge proof system. That is, there exists an interactive proof $(P, V)$ for $L$ that is complete, sound, and computational zero-knowledge.

Intuition

The proof reduces any NP language to graph 3-coloring (which is NP-complete). A zero-knowledge proof for graph 3-coloring is constructed using commitment schemes (which exist if one-way functions exist). The prover commits to a random permutation of a valid 3-coloring, the verifier picks a random edge, and the prover reveals the colors of that edge's endpoints. The verifier checks that the colors differ. After enough rounds, soundness is exponentially high. Zero-knowledge follows because a simulator can fake the commitments (since it only needs to show one consistent edge per round).

Proof Sketch

Reduce $L$ to graph 3-coloring via the NP-completeness of 3-coloring.
Construct a ZK proof for 3-coloring: the prover has a valid 3-coloring $c: V \to \{1,2,3\}$ . In each round: a. Pick a random permutation $\pi$ of $\{1,2,3\}$ . Commit to $\pi(c(v))$ for each vertex $v$ using a hiding and binding commitment scheme. b. The verifier picks a random edge $(u,v)$ . c. The prover opens the commitments for $u$ and $v$ , revealing $\pi(c(u))$ and $\pi(c(v))$ . d. The verifier checks that the two revealed colors are different and that the commitments were opened correctly.
Soundness: If the graph is not 3-colorable, every coloring has at least one monochromatic edge. The verifier catches it with probability at least $1/|E|$ per round. After $k|E|$ rounds, the cheating probability is exponentially small.
Zero-knowledge: The simulator picks a random edge $(u, v)$ , assigns two random distinct colors to $u$ and $v$ and arbitrary colors elsewhere, and commits. Since only one edge is revealed per round, the simulator's transcript is computationally indistinguishable from the real one (hiding property of the commitment scheme).

Why It Matters

This theorem says zero-knowledge proofs are a general-purpose tool: any statement that can be verified efficiently (i.e., any NP statement) can be proved in zero-knowledge. This is the theoretical foundation for all ZK applications. You do not need a specialized protocol for each problem; the general construction (though inefficient) guarantees existence.

Failure Mode

The theorem requires one-way functions to exist. If P = NP (which would imply no one-way functions exist), then ZK proofs become trivial (the verifier can find witnesses on its own) or impossible (depending on the formalization). The constructed protocol is also highly impractical: it requires many rounds and the reduction through 3-coloring blows up the instance size. Practical ZK systems use specialized constructions, not this generic reduction.

report a correction →

The Simulation Paradigm

The simulation paradigm is the conceptual core of zero-knowledge. It defines "learning nothing" rigorously.

The verifier's view of the interaction consists of: the input $x$ , the verifier's random coins $r$ , and the sequence of prover messages $(a_1, a_2, \ldots)$ . A proof is zero-knowledge if and only if this entire view can be simulated by a polynomial-time algorithm that has access only to $x$ (not to the prover's witness or the prover itself).

If such a simulator exists, then anything the verifier could compute from the real interaction, it could also compute from the simulated transcript. Since the simulator runs without the witness, the verifier gains no information about the witness.

This idea extends far beyond ZK proofs. Simulation-based security definitions appear throughout cryptography: secure multi-party computation, oblivious transfer, and garbled circuits all use the simulation paradigm to define what it means for a protocol to be "secure."

From Interactive to Non-Interactive: Fiat-Shamir

Definition

Fiat-Shamir Heuristic

The Fiat-Shamir transform converts an interactive public-coin proof (where the verifier's messages are just random strings) into a non-interactive one. Replace the verifier's random challenges with the output of a cryptographic hash function applied to the transcript so far:

$\text{challenge}_i = H(\text{transcript up to round } i)$

The prover computes the entire proof non-interactively by simulating the verifier's challenges using $H$ . The result is a single proof string that anyone can verify.

In the random oracle model (where $H$ is modeled as a truly random function), this transform preserves soundness. In the standard model (with a concrete hash function), provable security requires additional assumptions.

The Fiat-Shamir transform is the bridge between the theoretical world of interactive proofs and the practical world of blockchain and verifiable computation, where proofs must be transmitted as static objects.

SNARKs and STARKs

Modern zero-knowledge systems go far beyond the basic interactive framework.

SNARKs (Succinct Non-interactive Arguments of Knowledge) produce proofs that are:

Succinct: proof size is $O(\log n)$ or $O(1)$ regardless of the computation size
Non-interactive: a single message from prover to verifier
Arguments of knowledge: the prover must "know" a witness, not just convince the verifier of existence

The succinctness is the key property. A SNARK for a computation of size $n$ can be verified in time $O(\log n)$ or even $O(1)$ , while the computation itself takes time $n$ . This compression is what makes verifiable computation practical.

STARKs (Scalable Transparent Arguments of Knowledge) replace trusted setup with hash functions only (transparency) and achieve verification time $O(\log^2 n)$ . They are post-quantum secure (relying on hash functions rather than elliptic curves) but produce larger proofs than SNARKs.

Connections to Verifiable ML

Zero-knowledge proofs enable several ML applications:

Verifiable inference: prove that a neural network output was computed correctly from specific weights and inputs, without revealing the weights. The computation is encoded as an arithmetic circuit, and a SNARK proves correct execution.
Private model auditing: prove that a model satisfies certain properties (fairness constraints, accuracy thresholds) without revealing the model.
Federated learning verification: prove that a gradient update was computed correctly from local data, without revealing the data.

Current limitations: encoding neural network inference as arithmetic circuits is expensive. A single forward pass through a moderate network can produce circuits with billions of gates. Proof generation time is orders of magnitude slower than inference itself. This is an active area of engineering optimization.

Common Confusions

Watch Out

Zero-knowledge does not mean the verifier learns nothing at all

The verifier does learn one thing: that the statement is true. "Zero-knowledge" means the verifier learns nothing beyond the truth of the statement. Formally, anything the verifier can compute after the interaction, it could have computed before (given only the statement and the fact that it is true). The simulator captures this: it produces a valid-looking transcript without knowing the witness.

Watch Out

Soundness protects the verifier, not the prover

Soundness guarantees that a cheating prover cannot convince the verifier of a false statement. Zero-knowledge protects the prover's secret from the verifier. These are independent properties. A proof system could be sound but not zero-knowledge (the verifier learns extra information). Or it could be zero-knowledge but not sound (a cheating prover can always succeed). A useful ZK proof must be both.

Watch Out

The simulator does not run the real protocol

The simulator produces a transcript that looks like a real protocol execution but is generated without any prover. The simulator is a thought experiment: its existence is what proves the protocol is zero-knowledge. The simulator is not executed in practice. It is a proof technique, not a protocol component.

Exercises

ExerciseCore

Problem

In the graph isomorphism ZK protocol, why must the prover send a freshly random permutation $H = \sigma(G_0)$ in each round? What goes wrong if the prover reuses the same $H$ across rounds?

ExerciseCore

Problem

Explain why the soundness error of the graph isomorphism ZK protocol is $1/2$ per round. What is the soundness error after $k$ rounds?

ExerciseAdvanced

Problem

Construct a simulator for the graph 3-coloring ZK proof. The simulator does not know a valid 3-coloring. Describe precisely how it generates a transcript indistinguishable from a real execution.

ExerciseResearch

Problem

The Fiat-Shamir transform replaces the verifier's random challenges with $H(\text{transcript})$ . Explain why this is sound in the random oracle model but problematic in the standard model. What is a concrete attack if $H$ is not modeled as a random oracle?

References

Canonical:

Goldreich, Foundations of Cryptography, Volume 1 (2001), Chapter 4 for zero-knowledge definitions and the completeness theorem
Goldwasser, Micali, Rackoff, "The Knowledge Complexity of Interactive Proof Systems" (1989), the original paper defining ZK proofs

Foundational:

Goldreich, Micali, Wigderson, "Proofs that Yield Nothing But Their Validity" (1991), proving ZK completeness for NP via graph 3-coloring
Fiat & Shamir, "How To Prove Yourself" (1987), the Fiat-Shamir heuristic for non-interactive proofs

Current:

Thaler, Proofs, Arguments, and Zero-Knowledge (2023), Chapters 1-8, a modern textbook covering SNARKs and verifiable computation
Ben-Sasson et al., "Scalable, transparent, and post-quantum secure computational integrity" (2018), introducing STARKs

Next Topics

Public-key cryptography: the number-theoretic hardness assumptions that zero-knowledge proofs build on
P vs NP: the complexity-theoretic context for why NP completeness matters for ZK proofs

Last reviewed: April 15, 2026

Canonical graph

Required before and derived from this topic

These links come from prerequisite edges in the curriculum graph. Editorial suggestions are shown here only when the target page also cites this page as a prerequisite.

Full prerequisite chain All derived topics

Required prerequisites

1

Public-Key Cryptographylayer 2 · tier 2

Derived topics

0

No published topic currently declares this as a prerequisite.