

GraphSLAM and Factor Graphs

SLAM as graph optimization: poses as nodes, constraints as edges, factor graph representation, MAP estimation via nonlinear least squares, and the sparsity structure that makes large-scale mapping tractable.

Advanced · Tier 2 · Stable · ~55 min

Why This Matters

Simultaneous Localization and Mapping (SLAM) is the problem of building a map of an unknown environment while simultaneously tracking the agent's position within it. It is one of the most important problems in robotics: every autonomous vehicle, delivery robot, and AR headset solves some form of SLAM.

GraphSLAM reformulates SLAM as a graph optimization problem. Instead of maintaining a probability distribution over all poses (which is what filter-based SLAM does), GraphSLAM collects all constraints (odometry, landmark observations, loop closures) and solves for all poses simultaneously via nonlinear least squares. This batch approach produces more accurate results than filtering and naturally exploits sparsity in the constraint structure.

Mental Model

Imagine you walk through a building while recording your position at each step and noting observations of landmarks. Each step gives you a noisy estimate of the displacement (odometry). Each landmark observation gives you a noisy constraint between your current position and the landmark.

Build a graph: nodes are your poses at each time step plus landmark positions. Edges are the constraints from odometry and observations. Now find the configuration of all nodes that best satisfies all constraints simultaneously. This is a least-squares optimization problem, and its structure makes it efficiently solvable.

Formal Setup and Notation

Definition

SLAM as a Graphical Model

Let $x_{0:T} = (x_0, x_1, \ldots, x_T)$ be the robot's poses and $m = (m_1, \ldots, m_M)$ be the landmark positions. The SLAM problem seeks the maximum a posteriori (MAP) estimate:

$$x_{0:T}^*, m^* = \arg\max_{x_{0:T}, m} \, p(x_{0:T}, m \mid z_{1:T}, u_{1:T})$$

where $z_{1:T}$ are observations and $u_{1:T}$ are odometry inputs.

Definition

Factor Graph

A factor graph is a bipartite graph with two types of nodes:

  • Variable nodes: represent unknown quantities ($x_0, x_1, \ldots, x_T, m_1, \ldots, m_M$)
  • Factor nodes: represent probabilistic constraints between variables

Each factor $f_k$ connects to the variables it depends on. The joint distribution factorizes as:

$$p(x_{0:T}, m \mid z, u) \propto \prod_{k} f_k(\mathcal{X}_k)$$

where $\mathcal{X}_k$ is the set of variables connected to factor $k$.
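To make the bipartite structure concrete, here is a minimal illustrative container for variables and factors. The class and method names are invented for this sketch, not taken from any SLAM library:

```python
# Minimal bipartite factor-graph container (illustrative, not a real library).
# Variables are keyed by strings like "x0" or "m1"; each factor stores the
# names of the variables it connects plus an error function, mirroring f_k(X_k).

class FactorGraph:
    def __init__(self):
        self.variables = {}   # name -> current estimate
        self.factors = []     # list of (connected variable names, error function)

    def add_variable(self, name, initial):
        self.variables[name] = initial

    def add_factor(self, names, error_fn):
        # error_fn maps the values of the connected variables to a residual
        self.factors.append((tuple(names), error_fn))

    def neighbors(self, name):
        # variables sharing at least one factor with `name` -- the graph adjacency
        out = set()
        for names, _ in self.factors:
            if name in names:
                out.update(n for n in names if n != name)
        return out

g = FactorGraph()
g.add_variable("x0", 0.0)
g.add_variable("x1", 0.0)
g.add_variable("m1", 0.0)
g.add_factor(["x0", "x1"], lambda x0, x1: x1 - x0 - 1.0)  # odometry: moved +1
g.add_factor(["x1", "m1"], lambda x1, m1: m1 - x1 - 2.0)  # landmark seen at +2
```

The `neighbors` method makes the sparsity explicit: `x0` and `m1` are not adjacent because no factor connects them directly.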

Definition

SLAM Factors

The standard SLAM factor graph contains three types of factors:

  1. Prior factor: $f_0(x_0) = \mathcal{N}(x_0; \mu_0, \Sigma_0)$ anchoring the initial pose
  2. Odometry factors: $f_t^{\text{odom}}(x_{t-1}, x_t) = \mathcal{N}(x_t - g(x_{t-1}, u_t); 0, Q_t)$ constraining consecutive poses via the motion model $g$
  3. Observation factors: $f_t^{\text{obs}}(x_t, m_j) = \mathcal{N}(z_t - h(x_t, m_j); 0, R_t)$ constraining a pose and a landmark via the observation model $h$
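These three factor types can be written directly as residual functions. The sketch below assumes a 1D robot with motion model $g(x, u) = x + u$ and observation model $h(x, m) = m - x$ (a signed range), chosen purely for brevity:

```python
# Residual functions for the three standard SLAM factor types, in 1D:
# poses and landmarks are scalars, g(x, u) = x + u, h(x, m) = m - x.
# (Illustrative assumptions; real systems use SE(2)/SE(3) poses.)

def prior_error(x0, mu0):
    return x0 - mu0                       # deviation from the anchored pose

def odom_error(x_prev, x_curr, u):
    return x_curr - (x_prev + u)          # x_t - g(x_{t-1}, u_t)

def obs_error(x, m, z):
    return z - (m - x)                    # z_t - h(x_t, m_j)

# Each residual e enters the MAP objective as e^2 / sigma^2,
# the scalar form of the squared Mahalanobis norm.
```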

Main Theorems

Proposition

GraphSLAM as Nonlinear Least Squares

Statement

Under Gaussian noise assumptions, the MAP estimation problem is equivalent to a nonlinear least-squares problem:

$$\theta^* = \arg\min_{\theta} \sum_{k=1}^{N} \|e_k(\theta)\|_{\Sigma_k^{-1}}^2$$

where $e_k(\theta)$ is the error function for factor $k$, $\|v\|_{\Sigma^{-1}}^2 = v^\top \Sigma^{-1} v$ is the squared Mahalanobis norm, and $\Sigma_k$ is the noise covariance for factor $k$.

Intuition

Taking the negative log of the posterior (which is a product of Gaussians) converts multiplication into summation and Gaussian exponents into squared Mahalanobis norms. Maximizing the posterior becomes minimizing the sum of squared errors. Each factor contributes one error term, weighted by the inverse of its noise covariance.
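Spelled out, under the Gaussian factor assumption:

```latex
-\log \prod_k \mathcal{N}\!\big(e_k(\theta);\, 0, \Sigma_k\big)
  = \sum_k \tfrac{1}{2}\, e_k(\theta)^\top \Sigma_k^{-1}\, e_k(\theta) + \text{const}
```

so maximizing the posterior is the same as minimizing $\sum_k \|e_k(\theta)\|_{\Sigma_k^{-1}}^2$; the constant collects the Gaussian normalizers, which do not depend on $\theta$.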

Why It Matters

This equivalence transforms SLAM from a probabilistic inference problem into an optimization problem, giving access to the entire toolbox of nonlinear least squares: Gauss-Newton, Levenberg-Marquardt, and their sparse variants. These are well-understood algorithms with convergence guarantees for well-conditioned problems.

Failure Mode

The equivalence requires Gaussian noise. Non-Gaussian noise (e.g., heavy-tailed sensor errors, outlier measurements from incorrect data associations) breaks the least-squares formulation. Robust variants use M-estimators (Huber, Cauchy kernels) instead of squared error to downweight outliers, but lose the clean Gaussian interpretation.

Solving the Optimization

Gauss-Newton Method

Linearize each error function around the current estimate $\theta^{(k)}$:

$$e_i(\theta) \approx e_i(\theta^{(k)}) + J_i \, \delta\theta$$

where $J_i = \partial e_i / \partial \theta \,\big|_{\theta^{(k)}}$ is the Jacobian. Substituting into the cost function and setting the gradient to zero gives the normal equations:

$$H \, \delta\theta^* = -b$$

where:

$$H = \sum_i J_i^\top \Sigma_i^{-1} J_i \qquad b = \sum_i J_i^\top \Sigma_i^{-1} e_i(\theta^{(k)})$$

$H$ is the approximate Hessian (also called the information matrix). Update: $\theta^{(k+1)} = \theta^{(k)} + \delta\theta^*$.
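The assembly of $H$ and $b$ can be sketched on a toy 1D problem: three poses on a line with a prior, two odometry factors, and one absolute measurement, all with unit covariance. The factor list and Jacobians below are hand-coded for illustration; because these toy factors are linear, a single Gauss-Newton iteration already reaches the minimum:

```python
import numpy as np

# Gauss-Newton on a 3-pose 1D chain (toy, linear case):
# prior x0 ~ 0, odometry x1 - x0 ~ 1.0 and x2 - x1 ~ 1.0,
# plus a direct "GPS-like" measurement x2 ~ 2.5. Unit covariances.

theta = np.zeros(3)                       # initial estimate [x0, x1, x2]
factors = [                               # (Jacobian row, error function)
    (np.array([1.0, 0.0, 0.0]),  lambda t: t[0] - 0.0),          # prior
    (np.array([-1.0, 1.0, 0.0]), lambda t: t[1] - t[0] - 1.0),   # odometry
    (np.array([0.0, -1.0, 1.0]), lambda t: t[2] - t[1] - 1.0),   # odometry
    (np.array([0.0, 0.0, 1.0]),  lambda t: t[2] - 2.5),          # absolute meas.
]

for _ in range(3):                        # GN iterations (one suffices here)
    H = np.zeros((3, 3))
    b = np.zeros(3)
    for J, err in factors:
        e = err(theta)
        H += np.outer(J, J)               # J^T Sigma^{-1} J with Sigma = I
        b += J * e                        # J^T Sigma^{-1} e
    delta = np.linalg.solve(H, -b)        # normal equations: H delta = -b
    theta = theta + delta
```

The conflicting odometry ($x_2 \approx 2$) and absolute measurement ($x_2 \approx 2.5$) are reconciled in the least-squares sense, with the residual spread across all three factors.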

Levenberg-Marquardt

Add a damping term: $(H + \lambda I)\,\delta\theta = -b$. When $\lambda$ is large, the step is small and steepest-descent-like (safe but slow). When $\lambda$ is small, the step approaches the Gauss-Newton step (fast but may overshoot). Adaptive $\lambda$ interpolates between the two regimes.
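A minimal version of the accept/reject damping logic, assuming caller-provided helpers `assemble` (returning $H$ and $b$ at the current estimate) and `cost` (total squared error); the factor-of-10 schedule is a common default, not a requirement:

```python
import numpy as np

# One Levenberg-Marquardt step with adaptive damping (sketch; names are
# illustrative, not from any particular solver library).

def lm_step(theta, assemble, cost, lam):
    H, b = assemble(theta)
    delta = np.linalg.solve(H + lam * np.eye(len(theta)), -b)
    if cost(theta + delta) < cost(theta):
        return theta + delta, lam / 10.0   # step helped: trust Gauss-Newton more
    return theta, lam * 10.0               # step hurt: damp harder, keep old theta

# Toy usage: a single factor with error e(theta) = theta[0] - 1, unit covariance.
def assemble(t):
    return np.array([[1.0]]), np.array([t[0] - 1.0])   # H = J^T J, b = J^T e

def cost(t):
    return (t[0] - 1.0) ** 2

theta, lam = np.array([0.0]), 1.0
theta, lam = lm_step(theta, assemble, cost, lam)       # accepted, lam shrinks
```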

Sparsity Structure

Proposition

Sparsity of the SLAM Information Matrix

Statement

The information matrix $H \in \mathbb{R}^{n \times n}$ (where $n = \dim(\theta)$) has a nonzero entry $H_{ij}$ only if variables $i$ and $j$ appear together in at least one factor. For SLAM:

  • Odometry factors create a block-tridiagonal structure in the pose-pose block
  • Observation factors create sparse connections between poses and landmarks
  • The total number of nonzero entries is $O(T + L)$, where $L$ is the number of observations, compared to the dense matrix size $O(n^2)$

Solving $H\,\delta\theta = -b$ via sparse Cholesky factorization costs $O(n)$ to $O(n^{1.5})$ depending on the graph structure, compared to $O(n^3)$ for dense factorization.

Intuition

Each factor connects only a small number of variables (typically two). The information matrix inherits this sparse connectivity: most variable pairs never appear in the same factor, so $H_{ij} = 0$. A robot that has visited 1000 poses and observed 500 landmarks has a state vector of dimension $\sim$4500 (in 3D), but the information matrix has only $\sim$10,000 nonzero entries out of $\sim$20 million possible.
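The counting argument can be checked directly. The sketch below builds only the sparsity pattern (which variable pairs are nonzero) for a 1000-pose chain observing 500 landmarks round-robin, one observation per pose; scalar variables keep the bookkeeping trivial:

```python
# Count nonzero entries of the information matrix for a chain of T poses
# with one landmark observation per pose (round-robin over M landmarks).
# Scalar variables for simplicity; the structure, not the values, is the point.

T, M = 1000, 500
n = T + M
H = {}                                    # sparse pattern: (i, j) -> True

def connect(i, j):
    # a factor over variables i and j fills in both diagonals and both
    # off-diagonal entries of the information matrix
    for a, b in [(i, i), (j, j), (i, j), (j, i)]:
        H[(a, b)] = True

for t in range(1, T):
    connect(t - 1, t)                     # odometry factor: poses t-1 and t
for t in range(T):
    connect(t, T + (t % M))               # observation: pose t, landmark t % M

nnz = len(H)
print(nnz, "nonzeros of", n * n, "possible")
```

A few thousand nonzeros out of 2.25 million possible entries, matching the $O(T + L)$ count.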

Why It Matters

Sparsity is what makes large-scale SLAM tractable. EKF-SLAM maintains a dense covariance matrix, which requires $O(n^2)$ storage and $O(n^2)$ time per update. GraphSLAM exploits sparsity to solve problems with millions of poses and landmarks. The key insight: the information (inverse covariance) matrix is sparse even though the covariance matrix is dense. This is why GraphSLAM works in the information form.

Loop Closure

Loop closure occurs when the robot revisits a previously mapped location and recognizes it. This adds a constraint between the current pose and a distant past pose, creating a long-range edge in the factor graph.

Loop closure is critical because odometry drift accumulates over time. Without loop closures, the map becomes increasingly distorted. A single correct loop closure can correct drift accumulated over thousands of poses, because the optimization distributes the correction across the entire trajectory.

The danger of incorrect loop closure: A false positive (incorrectly believing you have revisited a location) introduces a constraint between unrelated poses. The optimizer satisfies this constraint by distorting the map, often catastrophically. Robust loop closure detection is one of the hardest problems in SLAM.
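A toy computation shows the correction being distributed. Below, a 1D robot walks a loop whose true net displacement is zero, but every odometry reading is biased by +0.1; a single loop-closure factor pulls the endpoint drift from 1.0 down to about 0.09, spreading the correction evenly over all ten steps (linear least squares with unit covariances, an illustrative setup):

```python
import numpy as np

# Effect of one loop closure on a drifting 1D trajectory (linear toy).
# The robot takes 10 steps around a loop and returns to the start; each
# odometry reading is biased by +0.1, so the raw estimate drifts to 1.0.

T = 10
steps = np.full(T, 0.1)                   # true displacement 0, measured +0.1

# Rows of the least-squares system: prior on x0, then odometry rows
# encoding x_t - x_{t-1} = steps[t-1].
A = [np.eye(T + 1)[0]]
r = [0.0]
for t in range(1, T + 1):
    row = np.zeros(T + 1); row[t] = 1.0; row[t - 1] = -1.0
    A.append(row); r.append(steps[t - 1])

without, *_ = np.linalg.lstsq(np.array(A), np.array(r), rcond=None)

# Loop closure: x_T - x_0 is measured as 0 (we recognize the start).
A.append(np.concatenate(([-1.0], np.zeros(T - 1), [1.0])))
r.append(0.0)
with_lc, *_ = np.linalg.lstsq(np.array(A), np.array(r), rcond=None)

print(without[-1], with_lc[-1])           # endpoint drift before and after
```

One extra row in the system changes every pose estimate, because the least-squares solution trades the loop-closure residual off against all ten odometry residuals at once.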

Practical Systems

Modern SLAM systems (ORB-SLAM, RTAB-Map, Google Cartographer) all use factor-graph-based backends. The frontend extracts features and detects loop closures. The backend (typically iSAM2 or g2o) solves the nonlinear least squares problem incrementally: when new factors arrive, the solution is updated efficiently without re-solving from scratch.

Incremental solving (iSAM2): Maintains a Bayes tree (a data structure encoding the sparse factorization) and updates only the affected portion of the tree when new factors are added. This gives amortized $O(1)$ cost per new variable for chain-like graphs.

Common Confusions

Watch Out

GraphSLAM is batch, not filtering

EKF-SLAM processes observations one at a time and maintains a running state estimate. GraphSLAM collects all constraints and optimizes over the entire trajectory simultaneously. Incremental solvers (iSAM2) blur this distinction by updating the batch solution efficiently, but the formulation is still batch optimization, not filtering.

Watch Out

The information matrix is sparse but the covariance matrix is dense

In SLAM, the information matrix $\Lambda = \Sigma^{-1}$ is sparse because each measurement connects only a few variables. But the covariance matrix $\Sigma = \Lambda^{-1}$ is dense because every variable is correlated with every other variable through the chain of constraints. This is why information-form algorithms (like GraphSLAM) are efficient while covariance-form algorithms (like EKF-SLAM) are not.
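The contrast is easy to verify numerically for a small chain (unit-variance prior and odometry, six scalar variables, chosen only for illustration):

```python
import numpy as np

# A chain of 6 variables linked by unit-precision odometry factors plus a
# prior on the first: the information matrix is tridiagonal (sparse), but
# its inverse -- the covariance -- has no zero entries at all.

n = 6
Lam = np.zeros((n, n))
Lam[0, 0] += 1.0                          # prior factor on x0
for t in range(1, n):                     # odometry factor between t-1 and t
    Lam[t - 1, t - 1] += 1.0
    Lam[t, t] += 1.0
    Lam[t - 1, t] -= 1.0
    Lam[t, t - 1] -= 1.0

Sigma = np.linalg.inv(Lam)
print(np.count_nonzero(Lam), "nonzeros in information matrix")
print(np.count_nonzero(np.abs(Sigma) > 1e-12), "nonzeros in covariance")
```

The information matrix has 16 nonzeros; the covariance has all 36 entries nonzero, because every variable is correlated with every other through the chain.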

Watch Out

SLAM is not just localization

Localization assumes the map is known and estimates only the robot's pose. Mapping assumes the robot's poses are known and builds the map. SLAM solves both simultaneously, which creates the chicken-and-egg difficulty: you need the map to localize, and you need to be localized to build the map.

Summary

  • SLAM as graph: nodes are poses and landmarks, edges are constraints
  • Factor graph factorizes the joint posterior as a product of local factors
  • MAP estimation under Gaussian noise reduces to nonlinear least squares
  • Solve with Gauss-Newton or Levenberg-Marquardt on the normal equations
  • The information matrix is sparse: $O(n)$ nonzeros, not $O(n^2)$
  • Sparsity enables solving SLAM with millions of variables
  • Loop closure corrects accumulated drift but false positives are catastrophic
  • Incremental solvers (iSAM2) update the solution efficiently as new data arrives

Exercises

ExerciseCore

Problem

A robot takes 100 steps, observing 20 landmarks along the way. Each pose is 3D ($x, y, \theta$) and each landmark is 2D ($x, y$). What is the dimension of the state vector $\theta$? If the robot makes 80 landmark observations (some landmarks are seen from multiple poses), how many factors are in the factor graph (including the prior and odometry)?

ExerciseAdvanced

Problem

Explain why a single loop closure observation can dramatically improve the map quality even though it is just one additional factor in a graph with hundreds of factors. What happens to the information matrix structure when a loop closure is added?

References

Canonical:

  • Thrun & Montemerlo, "The Graph-SLAM Algorithm with Applications to Large-Scale Mapping of Urban Structures" (2006)
  • Dellaert & Kaess, "Factor Graphs for Robot Perception" (2017)

Current:

  • Kaess et al., "iSAM2: Incremental Smoothing and Mapping Using the Bayes Tree" (2012)
  • Cadena et al., "Past, Present, and Future of Simultaneous Localization and Mapping" (2016)

Next Topics

The natural next steps from GraphSLAM and factor graphs:

  • Particle filters: the sequential alternative for nonlinear state estimation
  • Convex optimization basics: the optimization tools underlying least-squares solvers

Last reviewed: April 2026
