
Foundations

Eigenvalues and Eigenvectors

Eigenvalues and eigenvectors: the directions a matrix scales without rotating. Characteristic polynomial, diagonalization, the spectral theorem for symmetric matrices, and the direct connection to PCA.


Why This Matters

[Figure: the matrix $A$ stretches eigenvector $v_1$ by a factor of 2.5 ($Av_1 = 2.5v_1$) and compresses $v_2$ by 0.8 ($Av_2 = 0.8v_2$). Eigenvectors define the axes; eigenvalues define the stretch factors.]

Eigenvalues and eigenvectors are among the most fundamental concepts in applied linear algebra. When you run PCA, you are computing eigenvectors of a covariance matrix. When you analyze the convergence rate of gradient descent, you are looking at eigenvalues of the Hessian. When you study Markov chains, the mixing time depends on the second-largest eigenvalue. When you compute the condition number of a symmetric positive definite matrix, you are taking the ratio of its largest and smallest eigenvalues.

If you do not understand eigenvalues and eigenvectors, you cannot understand PCA, SVD, spectral clustering, or any convergence analysis that involves matrices. This is foundational linear algebra, not decoration.

Mental Model

Multiply a vector $x$ by a matrix $A$, and in general $Ax$ points in a different direction than $x$. But for special vectors, the eigenvectors, multiplication by $A$ only stretches, shrinks, or flips them along their own line, without rotating them. The amount of scaling is the eigenvalue.

Think of $A$ as a transformation. Most vectors get rotated and stretched. Eigenvectors are the "natural axes" of the transformation: the directions along which $A$ acts by pure scaling. Finding these axes reveals the intrinsic geometry of whatever $A$ represents.

Formal Setup

Definition

Eigenvector and Eigenvalue

Let $A \in \mathbb{R}^{n \times n}$ be a square matrix. A nonzero vector $v \in \mathbb{C}^n$ (real when $\lambda$ is real) is an eigenvector of $A$ with eigenvalue $\lambda \in \mathbb{C}$ if:

$$Av = \lambda v$$

The eigenvalue $\lambda$ tells you the scaling factor. If $\lambda > 1$, the eigenvector is stretched. If $0 < \lambda < 1$, it is shrunk. If $\lambda < 0$, it is flipped (and scaled by $|\lambda|$). If $\lambda = 0$, the eigenvector is sent to the zero vector (and $A$ is singular).
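The defining equation $Av = \lambda v$ is easy to check numerically. A minimal sketch with numpy (the matrix here is illustrative, not taken from the text):

```python
import numpy as np

# An illustrative symmetric 2x2 matrix.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# np.linalg.eig returns the eigenvalues and a matrix whose columns
# are the corresponding eigenvectors.
eigvals, eigvecs = np.linalg.eig(A)

# Verify the defining property Av = lambda * v for each pair.
for lam, v in zip(eigvals, eigvecs.T):
    assert np.allclose(A @ v, lam * v)

# For this matrix the characteristic polynomial (2 - l)^2 - 1 = 0
# gives eigenvalues 1 and 3.
assert np.allclose(sorted(eigvals), [1.0, 3.0])
```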

Definition

Characteristic Polynomial

The eigenvalues of $A$ are the roots of the characteristic polynomial:

$$p(\lambda) = \det(A - \lambda I) = 0$$

This is a degree-$n$ polynomial in $\lambda$, so $A$ has exactly $n$ eigenvalues (counted with multiplicity) in $\mathbb{C}$. For real matrices, complex eigenvalues come in conjugate pairs $\lambda, \bar{\lambda}$.
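For a 2×2 matrix the characteristic polynomial is $\lambda^2 - \operatorname{tr}(A)\lambda + \det(A)$, so its roots can be cross-checked against a direct eigenvalue computation. A sketch (illustrative matrix):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])

# For a 2x2 matrix, det(A - lambda I) = lambda^2 - tr(A) lambda + det(A).
coeffs = [1.0, -np.trace(A), np.linalg.det(A)]
roots = np.roots(coeffs)

# The polynomial roots coincide with the eigenvalues computed directly.
eigvals = np.linalg.eigvals(A)
assert np.allclose(sorted(roots), sorted(eigvals))
```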

Definition

Eigenspace

The eigenspace for eigenvalue $\lambda$ is the set of all vectors $v$ satisfying $(A - \lambda I)v = 0$, which is the null space of $A - \lambda I$. Its dimension is the geometric multiplicity of $\lambda$. The algebraic multiplicity is the multiplicity of $\lambda$ as a root of the characteristic polynomial. The geometric multiplicity is always at most the algebraic multiplicity.
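The gap between the two multiplicities can be seen on a concrete matrix. A sketch using a Jordan block (the standard example of this gap, not taken from the text):

```python
import numpy as np

# A Jordan block: lambda = 2 has algebraic multiplicity 2
# but only a one-dimensional eigenspace.
A = np.array([[2.0, 1.0],
              [0.0, 2.0]])

eigvals = np.linalg.eigvals(A)
assert np.allclose(eigvals, [2.0, 2.0])          # algebraic multiplicity 2

# Geometric multiplicity = dim null(A - 2I) = n - rank(A - 2I).
rank = np.linalg.matrix_rank(A - 2.0 * np.eye(2))
geometric_multiplicity = 2 - rank
assert geometric_multiplicity == 1
```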

Trace, Determinant, and Eigenvalues

Two fundamental invariants of a matrix are directly related to its eigenvalues:

Trace: The trace of $A$ equals the sum of its eigenvalues:

$$\operatorname{tr}(A) = \sum_{i=1}^n \lambda_i$$

Determinant: The determinant of $A$ equals the product of its eigenvalues:

$$\det(A) = \prod_{i=1}^n \lambda_i$$

These are immediate consequences of the fact that the characteristic polynomial $\det(A - \lambda I)$ has roots $\lambda_1, \ldots, \lambda_n$. Expanding the polynomial and comparing coefficients of $\lambda^{n-1}$ gives the trace relation; evaluating at $\lambda = 0$ gives the determinant relation.

Consequence: $A$ is singular if and only if at least one eigenvalue is zero.
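Both identities hold for any square matrix, including non-symmetric ones with complex eigenvalues. A quick numerical sketch (random matrix, illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))   # generic (non-symmetric) random matrix

eigvals = np.linalg.eigvals(A)    # may be complex, in conjugate pairs

# Trace = sum of eigenvalues; determinant = product of eigenvalues.
# Conjugate pairs make both quantities real up to floating-point error.
assert np.isclose(np.trace(A), eigvals.sum().real)
assert np.isclose(np.linalg.det(A), eigvals.prod().real)
```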

Diagonalization

Definition

Diagonalization

If $A$ has $n$ linearly independent eigenvectors $v_1, \ldots, v_n$ with eigenvalues $\lambda_1, \ldots, \lambda_n$, then $A$ is diagonalizable:

$$A = PDP^{-1}$$

where $P = [v_1 \mid v_2 \mid \cdots \mid v_n]$ is the matrix whose columns are eigenvectors, and $D = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$ is the diagonal matrix of eigenvalues.

Why this matters: Diagonalization makes matrix powers trivial: $A^k = PD^kP^{-1}$, where $D^k = \operatorname{diag}(\lambda_1^k, \ldots, \lambda_n^k)$. This is why eigenvalues control the long-term behavior of dynamical systems and iterative algorithms.
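The power identity $A^k = PD^kP^{-1}$ can be verified directly. A sketch (the matrix is illustrative):

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])

# Eigendecomposition A = P D P^{-1} (P's columns are eigenvectors).
eigvals, P = np.linalg.eig(A)

k = 10
# A^k two ways: repeated multiplication vs. P D^k P^{-1},
# where D^k just raises each eigenvalue to the k-th power.
Ak_direct = np.linalg.matrix_power(A, k)
Ak_eig = P @ np.diag(eigvals ** k) @ np.linalg.inv(P)
assert np.allclose(Ak_direct, Ak_eig)
```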

Not every matrix is diagonalizable. A matrix fails to be diagonalizable when some eigenvalue has geometric multiplicity strictly less than its algebraic multiplicity (defective eigenvalues). However, symmetric matrices are always diagonalizable --- and in the best possible way.

Main Theorems

Theorem

Spectral Theorem for Symmetric Matrices

Statement

If $A \in \mathbb{R}^{n \times n}$ is symmetric ($A = A^\top$), then:

  1. All eigenvalues of $A$ are real.
  2. Eigenvectors corresponding to distinct eigenvalues are orthogonal.
  3. $A$ has an orthogonal diagonalization: $A = Q\Lambda Q^\top$

where $Q$ is an orthogonal matrix ($Q^\top Q = I$) whose columns are orthonormal eigenvectors, and $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$ contains the real eigenvalues.

Intuition

A symmetric matrix acts as a pure scaling along orthogonal axes, with no rotation component. The eigenvectors form an orthonormal basis for $\mathbb{R}^n$, and in this basis, $A$ is just a diagonal matrix. This is the simplest possible structure a matrix can have.

Geometrically: the level sets $\{x : x^\top A x = c\}$ of a positive definite symmetric matrix are ellipsoids whose axes are the eigenvectors and whose semi-axis lengths are $\sqrt{c/\lambda_i}$.
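All three guarantees of the theorem can be checked numerically with numpy's dedicated symmetric eigensolver. A sketch on a random symmetrized matrix (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                       # symmetrize so that A = A^T

# np.linalg.eigh is the symmetric/Hermitian eigensolver: it returns
# real eigenvalues (in ascending order) and orthonormal eigenvectors.
eigvals, Q = np.linalg.eigh(A)

assert np.allclose(Q.T @ Q, np.eye(5))               # Q^T Q = I
assert np.allclose(Q @ np.diag(eigvals) @ Q.T, A)    # A = Q Lambda Q^T
```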

Proof Sketch

Real eigenvalues: Let $Av = \lambda v$ with $v \neq 0$. Then $\bar{v}^\top Av = \lambda \bar{v}^\top v = \lambda \|v\|^2$. Since $A$ is real and symmetric, $\bar{v}^\top Av = \overline{v^\top A\bar{v}} = \overline{v^\top A^\top \bar{v}} = \overline{\bar{v}^\top Av}$. So $\lambda \|v\|^2 = \bar{\lambda}\|v\|^2$, giving $\lambda = \bar{\lambda}$, hence $\lambda \in \mathbb{R}$.

Orthogonality: If $Av = \lambda v$ and $Aw = \mu w$ with $\lambda \neq \mu$, then $\lambda v^\top w = (Av)^\top w = v^\top A^\top w = v^\top Aw = \mu v^\top w$. So $(\lambda - \mu)v^\top w = 0$, and since $\lambda \neq \mu$, $v^\top w = 0$.

Existence of a full orthonormal eigenbasis: By induction on $n$. A real symmetric matrix always has at least one real eigenvalue: the Rayleigh quotient $x^\top A x$ attains its maximum on the (compact) unit sphere, and the maximizer is an eigenvector $v_1$. The orthogonal complement $v_1^\perp$ is invariant under $A$ (by symmetry), so the argument applies inductively to the restriction of $A$ to $v_1^\perp$.

Why It Matters

The spectral theorem is why PCA works. The covariance matrix $\Sigma = \mathbb{E}[(X - \mu)(X - \mu)^\top]$ is symmetric and positive semidefinite. Its eigenvectors are the principal components --- the directions of maximum variance. Its eigenvalues are the variances along those directions. PCA is literally the spectral theorem applied to the sample covariance matrix.

Beyond PCA: the spectral theorem governs the convergence rate of gradient descent (eigenvalues of the Hessian), the mixing time of reversible Markov chains (eigenvalues of the transition matrix), and the behavior of graph Laplacians in spectral clustering. The spectral theory of operators generalizes these ideas to infinite-dimensional spaces.

Failure Mode

The spectral theorem requires symmetry. For non-symmetric matrices, eigenvalues can be complex, eigenvectors need not be orthogonal, and the matrix may not even be diagonalizable. If you are working with a non-symmetric matrix (e.g., a non-reversible Markov chain transition matrix), you need the Jordan normal form or the singular value decomposition instead.

Connection to PCA

The link between eigenvalues and PCA is direct:

Given data points $x_1, \ldots, x_n \in \mathbb{R}^d$ with sample covariance matrix $\hat{\Sigma} = \frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})(x_i - \bar{x})^\top$:

  1. $\hat{\Sigma}$ is symmetric and positive semidefinite
  2. By the spectral theorem: $\hat{\Sigma} = Q\Lambda Q^\top$
  3. The eigenvector with the largest eigenvalue is the direction of maximum variance (the first principal component)
  4. The $k$-th eigenvector (sorted by decreasing eigenvalue) is the $k$-th principal component
  5. The eigenvalue $\lambda_k$ equals the variance of the data projected onto the $k$-th principal component

This is not a loose analogy. PCA is eigendecomposition of the covariance matrix. Understanding eigenvalues is understanding PCA.
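The recipe above can be sketched end to end on synthetic data (the data-generating choices here are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
# Synthetic data: 500 points in R^2, stretched 3x along the first axis.
X = rng.standard_normal((500, 2)) * np.array([3.0, 1.0])

Xc = X - X.mean(axis=0)                 # center the data
Sigma = (Xc.T @ Xc) / len(Xc)           # sample covariance (1/n convention)

eigvals, Q = np.linalg.eigh(Sigma)      # eigh returns ascending order
order = np.argsort(eigvals)[::-1]       # re-sort descending
eigvals, Q = eigvals[order], Q[:, order]

# Variance of the data projected onto the first principal component
# equals the top eigenvalue lambda_1.
proj = Xc @ Q[:, 0]
assert np.isclose(proj.var(), eigvals[0])
```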

Canonical Examples

Example

Eigenvalues of a 2x2 matrix

Let $A = \begin{pmatrix} 4 & 1 \\ 1 & 3 \end{pmatrix}$.

The characteristic polynomial is: $\det(A - \lambda I) = (4 - \lambda)(3 - \lambda) - 1 = \lambda^2 - 7\lambda + 11 = 0$

Solving: $\lambda = \frac{7 \pm \sqrt{49 - 44}}{2} = \frac{7 \pm \sqrt{5}}{2}$.

So $\lambda_1 = \frac{7 + \sqrt{5}}{2} \approx 4.618$ and $\lambda_2 = \frac{7 - \sqrt{5}}{2} \approx 2.382$.

Check: $\operatorname{tr}(A) = 4 + 3 = 7 = \lambda_1 + \lambda_2$. And $\det(A) = 12 - 1 = 11 = \lambda_1 \lambda_2$. Both eigenvalues are positive, so $A$ is positive definite. Since $A$ is symmetric, the eigenvectors are orthogonal.
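The hand computation is easy to confirm numerically:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])

eigvals, eigvecs = np.linalg.eigh(A)    # ascending order for symmetric A

# Matches the hand computation (7 ± sqrt(5)) / 2.
expected = np.array([(7 - np.sqrt(5)) / 2, (7 + np.sqrt(5)) / 2])
assert np.allclose(eigvals, expected)

# Trace and determinant relations, and orthogonality of eigenvectors.
assert np.isclose(eigvals.sum(), np.trace(A))
assert np.isclose(eigvals.prod(), np.linalg.det(A))
assert np.isclose(eigvecs[:, 0] @ eigvecs[:, 1], 0.0)
```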

Example

Why eigenvalues control iteration convergence

Consider the iteration $x_{k+1} = Ax_k$. If $A$ is diagonalizable with $A = PDP^{-1}$, then $x_k = A^k x_0 = PD^kP^{-1}x_0$. Writing $c = P^{-1}x_0$ for the coordinates of $x_0$ in the eigenbasis:

$$x_k = \sum_{i=1}^n c_i \lambda_i^k v_i$$

The term with the largest $|\lambda_i|$ dominates. If $|\lambda_i| < 1$ for all $i$, the iteration converges to zero. If $|\lambda_i| > 1$ for some $i$ with $c_i \neq 0$, it diverges. The spectral radius $\rho(A) = \max_i |\lambda_i|$ determines the asymptotic behavior.

This is why eigenvalues appear in every convergence analysis: gradient descent, power iteration, PageRank, Markov chain mixing.
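The convergence claim can be demonstrated on a random matrix rescaled to a chosen spectral radius (an illustrative sketch, not a method from the text):

```python
import numpy as np

def spectral_radius(A):
    # rho(A) = max |lambda_i| over the eigenvalues of A.
    return np.abs(np.linalg.eigvals(A)).max()

rng = np.random.default_rng(3)
B = rng.standard_normal((4, 4))
A = 0.5 * B / spectral_radius(B)        # rescale so rho(A) = 0.5 < 1

x = rng.standard_normal(4)
for _ in range(200):
    x = A @ x                           # the iteration x_{k+1} = A x_k

# rho(A) < 1, so the iterates decay geometrically toward zero.
assert np.linalg.norm(x) < 1e-20
```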

Common Confusions

Watch Out

Eigenvalues can be complex even for real matrices

The matrix $A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ (a 90-degree rotation) has characteristic polynomial $\lambda^2 + 1 = 0$, giving eigenvalues $\lambda = \pm i$. There are no real eigenvectors because a rotation has no direction that it merely scales. Complex eigenvalues indicate a rotational component. For symmetric matrices, this never happens --- all eigenvalues are real.
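numpy reproduces the complex pair for the rotation matrix above:

```python
import numpy as np

# The 90-degree rotation matrix from the example above.
R = np.array([[0.0, -1.0],
              [1.0,  0.0]])

eigvals = np.linalg.eigvals(R)          # numpy switches to a complex dtype

# The conjugate pair +i, -i: no real direction is merely scaled.
assert np.allclose(sorted(eigvals, key=lambda z: z.imag), [-1j, 1j])
```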

Watch Out

Eigenvectors are only unique up to scaling

If $v$ is an eigenvector of $A$, so is $\alpha v$ for any nonzero scalar $\alpha$. When people say "the eigenvector," they usually mean a unit eigenvector ($\|v\| = 1$), but even then the sign is ambiguous ($v$ and $-v$ are both unit eigenvectors). This sign ambiguity appears in PCA: principal components are defined up to a sign flip.

Watch Out

Symmetric positive semidefinite is not the same as symmetric positive definite

A symmetric matrix is positive semidefinite ($A \succeq 0$) if all eigenvalues are $\geq 0$. It is positive definite ($A \succ 0$) if all eigenvalues are $> 0$. The covariance matrix is always positive semidefinite. It is positive definite if and only if the data spans all $d$ dimensions (no perfect collinearity). The distinction matters: positive semidefiniteness means $A$ might have a nontrivial null space.
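Perfect collinearity producing a zero eigenvalue can be shown on synthetic data (illustrative construction):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.standard_normal(100)
# The second feature is exactly twice the first: perfect collinearity.
X = np.column_stack([x, 2.0 * x])

Xc = X - X.mean(axis=0)
Sigma = (Xc.T @ Xc) / len(Xc)

eigvals = np.linalg.eigvalsh(Sigma)
# Positive semidefinite: no eigenvalue is (numerically) negative...
assert np.all(eigvals > -1e-12)
# ...but the smallest is zero, so Sigma is singular, not positive definite.
assert np.isclose(eigvals.min(), 0.0, atol=1e-10)
```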

Summary

  • Eigenvectors are directions that a matrix scales without rotating; eigenvalues are the scaling factors
  • Characteristic polynomial $\det(A - \lambda I) = 0$ gives eigenvalues
  • Trace = sum of eigenvalues; determinant = product of eigenvalues
  • Diagonalization $A = PDP^{-1}$ makes matrix powers trivial
  • Spectral theorem: symmetric matrices have real eigenvalues and orthogonal eigenvectors ($A = Q\Lambda Q^\top$)
  • PCA = eigendecomposition of the covariance matrix
  • Eigenvalues control convergence rates of iterative algorithms

Exercises

ExerciseCore

Problem

Find the eigenvalues and eigenvectors of $A = \begin{pmatrix} 3 & 2 \\ 2 & 6 \end{pmatrix}$. Verify that the eigenvectors are orthogonal and that the trace and determinant relations hold.

ExerciseAdvanced

Problem

Let $A$ be a real symmetric $n \times n$ matrix with eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$. Prove that:

$$\lambda_1 = \max_{\|x\| = 1} x^\top A x$$

This is the Rayleigh quotient characterization. Why does this directly imply that the first principal component is the leading eigenvector of the covariance matrix?
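Not a proof, but the claim is easy to probe numerically: random unit vectors never exceed $\lambda_1$ in the Rayleigh quotient, and the top eigenvector attains it (an illustrative sketch):

```python
import numpy as np

rng = np.random.default_rng(5)
M = rng.standard_normal((6, 6))
A = (M + M.T) / 2                       # random symmetric matrix

lam_max = np.linalg.eigvalsh(A).max()

# Random unit vectors never beat lambda_1 in the Rayleigh quotient...
for _ in range(1000):
    x = rng.standard_normal(6)
    x /= np.linalg.norm(x)
    assert x @ A @ x <= lam_max + 1e-10

# ...and the top eigenvector attains it exactly.
v1 = np.linalg.eigh(A)[1][:, -1]
assert np.isclose(v1 @ A @ v1, lam_max)
```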

References

Canonical:

  • Strang, Linear Algebra and Its Applications (4th ed., 2006), Chapters 5-6
  • Horn & Johnson, Matrix Analysis (2nd ed., 2012), Chapters 1-4
  • Halmos, Finite-Dimensional Vector Spaces (1958), Chapters 6-7 (spectral theory)

Current:

  • Boyd & Vandenberghe, Introduction to Applied Linear Algebra (2018), Chapter 10
  • Axler, Linear Algebra Done Right (4th ed., 2024), Chapters 5, 7
  • Trefethen & Bau, Numerical Linear Algebra (1997), Lectures 24-27 (eigenvalue algorithms and perturbation theory)

Next Topics

Building on eigenvalues and eigenvectors:

  • Singular value decomposition: the generalization to non-square and non-symmetric matrices
  • Principal component analysis: eigenvalues of the covariance matrix in action
  • Conditioning and condition number: the ratio of extreme eigenvalues

Last reviewed: April 2026
