

Feynman–Kac Formula

The probabilistic representation of solutions to linear parabolic PDEs as expectations over SDE trajectories. The bridge that lets you Monte Carlo a PDE, the reason high-dimensional Black–Scholes is tractable, and the foundation under every backward-SDE method, including deep BSDE.


Why This Matters

The Feynman–Kac formula is the precise statement that certain partial differential equations can be solved by simulating a stochastic differential equation. That is a strong claim: instead of meshing a domain in $\mathbb{R}^d$, which costs exponentially in $d$, you sample SDE trajectories and average a payoff at the terminal time, which costs only polynomially. For high-dimensional parabolic PDEs (Black–Scholes under counterparty risk, Hamilton–Jacobi–Bellman equations in stochastic control, committor functions in molecular dynamics) this is the only known route to a computable answer.

The formula also flips the mental model of what a PDE solution is. The classical view treats $u(t, x)$ as a function on a continuous domain satisfying differentiation rules. The Feynman–Kac view treats $u(t, x) = \mathbb{E}[\text{payoff} \mid X_t = x]$ as a value function over SDE trajectories starting from $(t, x)$. Every modern method that combines neural networks with stochastic numerics (deep BSDE, the deep splitting method, Han–Jentzen–E's fully nonlinear extensions) takes the value-function view as its starting point.

A useful slogan: the Fokker–Planck equation moves densities forward in time using the generator's adjoint; Feynman–Kac moves value functions backward in time using the generator itself. They are the two halves of the PDE-SDE duality.

Mental Model

Think of $u(t, x)$ as the expected payoff of a stochastic game: starting from state $x$ at time $t$, run the SDE forward to terminal time $T$ and collect a reward $g(X_T)$. Two complications turn the simple expectation $\mathbb{E}[g(X_T) \mid X_t = x]$ into the full Feynman–Kac form. First, you might discount the future at a rate $r(t, x)$, multiplying by $\exp(-\int_t^T r\,ds)$. Second, you might collect a running reward $f(s, X_s)$ along the trajectory, adding $\int_t^T f\,ds$. Both are common in finance and control. The Feynman–Kac formula says these probabilistic constructions are exactly the unique solutions to a class of linear parabolic PDEs.
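The stochastic-game reading translates directly into a simulator. Below is a minimal Monte Carlo sketch (all coefficient and parameter choices are illustrative, not from the text): an Euler–Maruyama loop that accumulates the path discount and the discounted running reward exactly as the game describes.

```python
import numpy as np

def feynman_kac_mc(x0, t, T, b, sigma, f, r, g, n_paths=20_000, n_steps=200, seed=0):
    """Monte Carlo estimate of u(t, x0) = E[ exp(-int_t^T r) g(X_T)
    + int_t^T exp(-int_t^s r) f ds | X_t = x0 ] via Euler-Maruyama."""
    rng = np.random.default_rng(seed)
    dt = (T - t) / n_steps
    X = np.full(n_paths, float(x0))
    disc = np.ones(n_paths)     # running discount exp(-int_t^s r(tau, X_tau) dtau)
    reward = np.zeros(n_paths)  # accumulated discounted running reward
    s = t
    for _ in range(n_steps):
        reward += disc * f(s, X) * dt
        disc *= np.exp(-r(s, X) * dt)
        X += b(s, X) * dt + sigma(s, X) * np.sqrt(dt) * rng.standard_normal(n_paths)
        s += dt
    return float(np.mean(disc * g(X) + reward))

# Toy check: b = 0, sigma = 1, r = 0.1, f = 1, g(x) = x has the closed form
# u(t, x) = exp(-r (T-t)) x + (1 - exp(-r (T-t))) / r.
estimate = feynman_kac_mc(
    2.0, 0.0, 1.0,
    b=lambda s, x: np.zeros_like(x),
    sigma=lambda s, x: np.ones_like(x),
    f=lambda s, x: np.ones_like(x),
    r=lambda s, x: 0.1 * np.ones_like(x),
    g=lambda x: x,
)
```

The toy parameters at the bottom are chosen so the expectation has a closed form to compare against; swapping in other coefficients only changes the lambdas.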

Formal Statement

Definition

Feynman–Kac Formula

Let $X_t \in \mathbb{R}^d$ solve the SDE $dX_s = b(s, X_s)\,ds + \sigma(s, X_s)\,dB_s$ with infinitesimal generator $\mathcal{L} u = b \cdot \nabla u + \tfrac{1}{2} \operatorname{Tr}(\sigma \sigma^\top \nabla^2 u)$. Let $u: [0, T] \times \mathbb{R}^d \to \mathbb{R}$ solve the backward Kolmogorov / Cauchy problem

$$\partial_t u + \mathcal{L} u + f(t, x) - r(t, x)\, u = 0, \qquad u(T, x) = g(x),$$

with sufficiently smooth $b, \sigma, f, r, g$ of bounded growth. Then $u$ admits the probabilistic representation

$$u(t, x) = \mathbb{E}\!\left[ e^{-\int_t^T r(s, X_s)\,ds}\, g(X_T) + \int_t^T e^{-\int_t^s r(\tau, X_\tau)\,d\tau}\, f(s, X_s)\,ds \,\Big|\, X_t = x \right].$$

The PDE has three ingredients that map cleanly to the SDE side: $\mathcal{L}$ is the generator of the diffusion, $r$ is a state-dependent discount rate applied to terminal and running payoffs, and $f$ is a forcing term that becomes a running cost integrated along the path.

Classical Feynman–Kac (the clean special case)

Set $f \equiv 0$ and $r \equiv 0$. The PDE collapses to $\partial_t u + \mathcal{L} u = 0$, $u(T, x) = g(x)$, and the formula reduces to $u(t, x) = \mathbb{E}[g(X_T) \mid X_t = x]$. This is the version most often called "Feynman–Kac" without qualification.

Theorem

Feynman–Kac Representation

Statement

Under the assumptions above, the unique classical solution to $\partial_t u + \mathcal{L} u = 0$ with terminal condition $u(T, \cdot) = g$ admits the probabilistic representation $u(t, x) = \mathbb{E}[g(X_T) \mid X_t = x]$, where $X_s$ solves $dX_s = b(s, X_s)\,ds + \sigma(s, X_s)\,dB_s$ with $X_t = x$.

Intuition

The map $s \mapsto u(s, X_s)$ should be a martingale: starting from $(t, x)$ and running forward, the expected future value of $u$ at time $s > t$ should equal $u(t, x)$, since $u$ already averages over everything the trajectory has not yet revealed. Itô's lemma identifies the drift of $u(s, X_s)$ as $\partial_t u + \mathcal{L} u$, which the PDE forces to be zero. The martingale property at the terminal time gives $u(t, x) = \mathbb{E}[u(T, X_T) \mid X_t = x] = \mathbb{E}[g(X_T) \mid X_t = x]$.

Proof Sketch

Apply Itô's lemma to $u(s, X_s)$ on $[t, T]$: $du(s, X_s) = (\partial_s u + \mathcal{L} u)(s, X_s)\,ds + (\nabla u)^\top \sigma\,dB_s = 0\,ds + (\nabla u)^\top \sigma\,dB_s$. Take expectations conditional on $X_t = x$. The stochastic integral is a true martingale (polynomial-growth bounds plus Burkholder–Davis–Gundy), so its expectation vanishes, giving $\mathbb{E}[u(T, X_T) \mid X_t = x] - u(t, x) = 0$. The terminal condition $u(T, x) = g(x)$ closes the identity.

Why It Matters

This is the cleanest statement of "PDE = expectation over an SDE." Three consequences. (1) Monte Carlo solver. Sample $N$ trajectories of $X$ starting from $(t, x)$, average $g(X_T)$, and get an unbiased estimator of $u(t, x)$ whose variance decays as $O(1/N)$ at a rate independent of dimension. (2) Linearity of expectation as a superposition principle. If $g$ is a sum of payoffs, the corresponding $u$ is the same sum; the PDE inherits the linearity for free. (3) A pricing equation for European options. With $X$ a risk-neutral asset price and $g$ the option payoff, the expectation is the price; any PDE-side method for $u$ must agree with the Monte Carlo estimator up to numerical error. In symbols: $u(t, x) = \mathbb{E}[g(X_T) \mid X_t = x]$ for all $(t, x) \in [0, T] \times \mathbb{R}^d$.
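Consequence (1) can be seen in a few lines. The dimension and payoff below are chosen for the demo: for $\mathcal{L} = \tfrac{1}{2}\Delta$ and $g(x) = \lvert x \rvert^2$, the exact solution is $u(t, x) = \lvert x \rvert^2 + d(T - t)$, and a plain Monte Carlo average recovers it in $d = 50$ dimensions, where any mesh would be hopeless.

```python
import numpy as np

# Monte Carlo solve of du/dt + (1/2) Laplacian(u) = 0, u(T, x) = |x|^2,
# in d = 50 dimensions. Exact solution: u(t, x) = |x|^2 + d (T - t).
d, t, T = 50, 0.0, 1.0
x = np.zeros(d)
n_paths = 200_000
rng = np.random.default_rng(0)

# X_T = x + B_{T-t}: Brownian increments are Gaussian, so no time-stepping.
X_T = x + np.sqrt(T - t) * rng.standard_normal((n_paths, d))
estimate = np.mean(np.sum(X_T**2, axis=1))
exact = float(np.sum(x**2) + d * (T - t))
```

The sample size, not the dimension, controls the error: the same 200,000 paths would be used for $d = 500$.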

Failure Mode

The polynomial-growth and regularity hypotheses can fail in two pedestrian ways. First, payoffs $g$ with exponential growth (e.g., $g(x) = e^{\lvert x \rvert^2}$) make the expectation infinite even when the PDE has a formal solution. Second, degenerate diffusions ($\sigma$ rank-deficient at some points) leave parts of the state space unreachable from $X_t = x$, and $u(t, x)$ becomes determined by boundary data the SDE cannot probe. Hörmander-type bracket conditions or viscosity-solution machinery are then needed.

Discounted Variant: Black–Scholes

The full Feynman–Kac formula with discounting is what gives the Black–Scholes equation its probabilistic content. For an asset $S_t$ following geometric Brownian motion under the risk-neutral measure, $dS_t = r S_t\,dt + \sigma S_t\,dB_t$, and a European option with payoff $g(S_T)$ at maturity $T$, the price at time $t$ with $S_t = s$ is

$$V(t, s) = \mathbb{E}\!\left[e^{-r(T-t)}\, g(S_T) \,\Big|\, S_t = s\right].$$

By Feynman–Kac, $V$ is the unique solution to the PDE $\partial_t V + \tfrac{1}{2}\sigma^2 s^2 \partial_{ss} V + r s\, \partial_s V - r V = 0$ with terminal condition $V(T, s) = g(s)$. This is the Black–Scholes equation in stock-price coordinates, and the equivalence "price = discounted expectation = PDE solution" is a Feynman–Kac identification, not an extra assumption.

The same machinery generalizes to multi-asset options, stochastic-volatility models (Heston), and counterparty-risk-adjusted pricing, except that the PDE becomes high-dimensional ($d$ assets means $d$ spatial dimensions) and classical mesh-based methods give up. Feynman–Kac says you can still Monte Carlo it.
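The "price = discounted expectation = PDE solution" identity can be checked numerically. In the sketch below (strike, rate, and volatility values are illustrative), the discounted-expectation price of a European call under GBM is compared against the closed-form Black–Scholes price; Feynman–Kac says the two must agree up to Monte Carlo error.

```python
import numpy as np
from math import erf, exp, log, sqrt

def bs_call(s, K, r, sigma, tau):
    """Closed-form Black-Scholes call price, for comparison."""
    N = lambda z: 0.5 * (1.0 + erf(z / sqrt(2.0)))  # standard normal CDF
    d1 = (log(s / K) + (r + 0.5 * sigma**2) * tau) / (sigma * sqrt(tau))
    d2 = d1 - sigma * sqrt(tau)
    return s * N(d1) - K * exp(-r * tau) * N(d2)

def mc_call(s, K, r, sigma, tau, n_paths=500_000, seed=0):
    """Feynman-Kac price: discounted expectation of the payoff under
    risk-neutral GBM, sampled exactly (no time-stepping needed)."""
    rng = np.random.default_rng(seed)
    Z = rng.standard_normal(n_paths)
    S_T = s * np.exp((r - 0.5 * sigma**2) * tau + sigma * sqrt(tau) * Z)
    return float(exp(-r * tau) * np.mean(np.maximum(S_T - K, 0.0)))
```

GBM has an explicit terminal distribution, so $S_T$ is sampled in one step; for Heston or other models without one, an Euler scheme would replace that single line.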

Worked Example: Heat Equation as Brownian Expectation

Take $X_t = x + B_t$ (standard Brownian motion shifted to start at $x$), $f \equiv 0$, $r \equiv 0$, and $g$ smooth. The generator is $\mathcal{L} = \tfrac{1}{2} \Delta$ and the PDE becomes $\partial_t u + \tfrac{1}{2} \Delta u = 0$, $u(T, x) = g(x)$. Feynman–Kac gives

$$u(t, x) = \mathbb{E}[g(x + B_{T-t})] = \int_{\mathbb{R}^d} g(y)\, \frac{1}{(2 \pi (T-t))^{d/2}}\,e^{-\lvert y - x \rvert^2 / (2(T-t))}\,dy.$$

The right-hand side is the heat-kernel convolution. So the Feynman–Kac formula reproduces the heat-kernel representation of the backward heat equation (and, by reversing the time direction, of the forward heat equation). This is the cleanest illustration of the formula: the stochastic representation and the PDE's Green's function are the same object viewed from two sides.
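A quick numerical confirmation of the worked example, with the test function $g(x) = \cos x$ chosen here because the Gaussian integral is explicit: $\mathbb{E}[\cos(x + B_s)] = \cos(x)\, e^{-s/2}$, so the Monte Carlo average and the closed form should match.

```python
import numpy as np

# Check the heat-kernel representation on g(x) = cos(x): the Gaussian
# integral has the closed form E[cos(x + B_{T-t})] = cos(x) exp(-(T-t)/2),
# which indeed solves du/dt + (1/2) d^2u/dx^2 = 0 with u(T, x) = cos(x).
t, T, x = 0.0, 1.0, 0.7
rng = np.random.default_rng(0)
mc = float(np.mean(np.cos(x + np.sqrt(T - t) * rng.standard_normal(500_000))))
exact = float(np.cos(x) * np.exp(-(T - t) / 2.0))
```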

Connection to Backward SDEs

The classical Feynman–Kac formula handles linear parabolic PDEs. For semilinear parabolic PDEs of the form $\partial_t u + \mathcal{L} u + f(t, x, u, \sigma^\top \nabla u) = 0$, with $f$ depending on $u$ and $\nabla u$ themselves, the linear formula no longer applies: the expectation is no longer well-defined because the "running payoff" $f$ depends on the unknown solution.

The right generalization is the backward stochastic differential equation (BSDE) of Pardoux and Peng (1990): a process $(Y_t, Z_t)$ satisfying $dY_t = -f(t, X_t, Y_t, Z_t)\,dt + Z_t^\top dB_t$ with terminal condition $Y_T = g(X_T)$. The pair $(Y, Z)$ encodes both $u$ and its gradient along the path: $Y_t = u(t, X_t)$ and $Z_t = \sigma^\top \nabla u(t, X_t)$. This is the nonlinear Feynman–Kac formula, and it is the mathematical object that the deep BSDE method numerically approximates.

Common Confusions

Watch Out

Forward vs backward Kolmogorov: a duality, not two unrelated equations

The forward Kolmogorov (Fokker–Planck) equation $\partial_t p = \mathcal{L}^* p$ evolves the density of $X_t$ forward in time using the generator's adjoint. The backward Kolmogorov equation $\partial_t u + \mathcal{L} u = 0$ evolves a value function backward in time using the generator itself. They are not unrelated equations; they are dual halves of the same operator, and Feynman–Kac is the explicit dictionary between them.

Watch Out

The discount factor is exp(-∫ r ds), not exp(-r(T-t))

For state-dependent or time-dependent rates $r(t, x)$, the discount factor inside the expectation is the path-integrated $\exp(-\int_t^T r(s, X_s)\,ds)$, not the simpler $\exp(-r(T-t))$. The latter is correct only when $r$ is a constant. This becomes important in stochastic interest-rate models (Vasicek, CIR), where $r$ itself follows an SDE; the discount factor then depends on the entire trajectory of $r$, not just its starting value.
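To see the difference numerically, here is a sketch under the Vasicek model $dr = a(b - r)\,dt + \sigma\,dB$ (all parameter values are illustrative): the path-integrated Monte Carlo discount matches the closed-form Vasicek zero-coupon bond price, while $\exp(-r_0 T)$ does not.

```python
import numpy as np

# Vasicek short rate dr = a(b - r) dt + sigma dB: the discount factor is
# path-dependent, E[exp(-int_0^T r_s ds)], not exp(-r_0 T).
a, b, sigma, r0, T = 1.0, 0.05, 0.01, 0.02, 1.0
n_paths, n_steps = 100_000, 200
dt = T / n_steps
rng = np.random.default_rng(0)

r = np.full(n_paths, r0)
integral = np.zeros(n_paths)
for _ in range(n_steps):
    integral += r * dt        # left-endpoint Riemann sum of int r ds per path
    r += a * (b - r) * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)

mc_discount = float(np.mean(np.exp(-integral)))
naive = float(np.exp(-r0 * T))   # wrong unless r is constant

# Closed-form Vasicek zero-coupon bond price, for comparison.
B = (1.0 - np.exp(-a * T)) / a
A = np.exp((b - sigma**2 / (2 * a**2)) * (B - T) - sigma**2 * B**2 / (4 * a))
closed = float(A * np.exp(-B * r0))
```

With $r_0$ below the mean-reversion level $b$, the naive constant-rate discount overstates the bond price by about a percent at these parameters; the path-integrated expectation does not.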

Watch Out

Feynman–Kac is for linear PDEs in u; nonlinear-in-u PDEs need BSDEs

The classical formula handles PDE coefficients that depend on $(t, x)$ but not on $u$ itself. Adding a term that is nonlinear in $u$ breaks the clean expectation representation: the running cost $f(s, X_s, u(s, X_s), \nabla u(s, X_s))$ depends on the unknown solution and cannot be evaluated along a trajectory without knowing $u$ first. The fix is the BSDE formulation, which is implicit rather than explicit and requires solving a fixed-point problem at every step.

Exercises

Exercise (Core)

Problem

Use Feynman–Kac to solve the backward heat equation $\partial_t u + \tfrac{1}{2} \partial_{xx} u = 0$, $u(T, x) = x^2$, on $[0, T] \times \mathbb{R}$. Verify your answer by direct PDE substitution.

Exercise (Advanced)

Problem

Derive the Black–Scholes PDE from the discounted Feynman–Kac formula, going in the opposite direction from the worked example: start from $V(t, s) = \mathbb{E}[e^{-r(T-t)} g(S_T) \mid S_t = s]$ for geometric Brownian motion $dS = r S\,dt + \sigma S\,dB$, apply Itô to $e^{-rs} V(s, S_s)$, and read off the PDE.



Last reviewed: April 18, 2026
