Prerequisites
- chapter-07-matrices-as-functions
Learning Objectives
- State the eigen-equation $A\mathbf{v}=\lambda\mathbf{v}$ and explain it as the search for invariant directions, not as a root-finding exercise.
- Interpret an eigenvalue geometrically as a stretch factor: $|\lambda|>1$ grows, $|\lambda|<1$ shrinks, $\lambda<0$ flips, $\lambda=1$ is fixed.
- Identify the invariant directions of a $2\times 2$ transformation in the recurring visualizer as the arrows that stay on their own line.
- Find the eigenvalues and eigenvectors of a $2\times 2$ matrix by hand and describe the eigenspace of each eigenvalue.
- Explain why eigenvectors reveal what a matrix really does, and recognize the zero vector and scaled eigenvectors correctly.
- Verify an eigenpair with numpy's $\texttt{np.linalg.eig}$ and connect eigenvectors to PageRank and PCA as forward references.
In This Chapter
- 23.1 What is an eigenvector, geometrically?
- 23.2 What does the eigen-equation $A\mathbf{v}=\lambda\mathbf{v}$ actually say?
- 23.3 What does the eigenvalue mean? Reading $\lambda$ as a stretch factor
- 23.4 Why is a scaled eigenvector still an eigenvector? (and why $\mathbf{0}$ is not one)
- 23.5 What do the invariant directions look like in the visualizer?
- 23.6 Do eigenvectors always exist? A first look at finding them for a 2×2
- 23.7 What is an eigenspace?
- 23.8 Why do eigenvalues reveal what a matrix really does?
- 23.9 Where do eigenvectors take us? PageRank and PCA on the horizon
- 23.10 Putting it together: what to carry forward
Eigenvalues and Eigenvectors: The Vectors That a Matrix Doesn't Rotate
Learning paths. Math majors — read everything, especially the existence argument in §23.6, the eigenspace discussion in §23.7, and the Math-Major Sidebars; the full characteristic-polynomial machinery arrives in Chapter 24. CS / Data Science — focus on the Geometric Intuition, the visualizer experiments, the numpy checks, and the PageRank and PCA forward references; the proofs are optional but the picture is not. Physics / Engineering — focus on the invariant-direction geometry, the stretch-factor interpretation of $\lambda$, and the vibration-mode and steady-state applications; eigenvectors are the language of normal modes and stability.
Take a transformation — any of the matrices we have been pushing the plane around with since Chapter 1 — and watch what it does to a single arrow. Point the arrow east and apply the matrix; usually the image points somewhere new, swung off to the northeast or flipped around to the southwest. The arrow has been rotated: its direction changed. Try another arrow, pointing north this time; again, the image points off in some fresh direction. Most arrows, under most transformations, get turned. That turning is exactly why a matrix is hard to picture all at once — every direction is being sent somewhere different, and the eye cannot track all of it.
But now suppose you go hunting, patiently, for the arrows that don't turn. You sweep an arrow slowly around the circle, applying the matrix at every angle, and you watch for the special moments when the image lands right back on the same line as the input — pointing the same way (or exactly backward), only longer or shorter. For most matrices, such directions exist, and there are usually only a few of them. Those are the invariant directions of the transformation: the directions that survive the matrix without being rotated, that are merely stretched. The arrows along them are called eigenvectors, and the factor by which each one stretches is its eigenvalue. This chapter is about finding them and, far more importantly, about understanding why they are the most revealing thing you can know about a matrix.
This is the conceptual heart of the entire book, and it answers the deepest question in the subject, the one we flagged in the Part V introduction: stripped of any particular coordinate system, what does a matrix really do? A matrix written out as a grid of numbers is a description tangled up with the coordinate axes you happened to choose. The eigenvectors cut through that tangle. They are the transformation's own natural axes — the directions along which its action is pure, simple stretching with no rotation at all. Find them, and a confusing block of numbers resolves into a handful of independent scalings. That resolution is what we mean when we say, as recurring theme #6 of this book insists, that eigenvalues and eigenvectors reveal what a matrix really does.
Here is the promise of the chapter, and notice that it is geometric before it is algebraic. We will not begin by writing $\det(A - \lambda I) = 0$ and grinding out roots; that machinery is real and useful and it is the whole of Chapter 24, but starting there would be starting at the end. It would teach you to compute eigenvalues while leaving you with no idea what they mean — which is exactly the failure mode this book is built to avoid. Instead we start in the visualizer, looking at deformed grids and asking which arrows held their line. Only once the picture is solid will we earn the algebra. So let us begin where every concept in this book begins: with the picture.
23.1 What is an eigenvector, geometrically?
Let us make the hunt concrete with a real transformation. Take the matrix
$$A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix},$$
which we will return to again and again in this chapter as our home example. Recall from Chapter 7 that the columns of $A$ are the images of the basis vectors: $A$ sends $\mathbf{e}_1 = (1,0)$ to $(2,1)$ and $\mathbf{e}_2 = (0,1)$ to $(1,2)$. Neither basis vector held its line — $(1,0)$ rotated upward to point toward $(2,1)$, and $(0,1)$ rotated rightward to point toward $(1,2)$. The standard axes are not invariant directions for this matrix. That is the usual situation; the axes you start with are rarely special to the transformation.
Now try the diagonal direction. Take the arrow $\mathbf{v} = (1,1)$ — pointing northeast at $45°$ — and apply $A$:
$$A\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 + 1 \\ 1 + 2 \end{bmatrix} = \begin{bmatrix} 3 \\ 3 \end{bmatrix} = 3\begin{bmatrix} 1 \\ 1 \end{bmatrix}.$$
Look carefully at what happened. The output $(3,3)$ points in exactly the same direction as the input $(1,1)$ — both lie on the $45°$ line through the origin. The arrow did not turn at all. It was simply stretched to three times its length. The direction $(1,1)$ is an invariant direction of $A$, the vector $(1,1)$ is an eigenvector, and the stretch factor $3$ is its eigenvalue.
There is a second one. Take the anti-diagonal arrow $\mathbf{w} = (-1, 1)$, pointing northwest, and apply $A$:
$$A\begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix} -2 + 1 \\ -1 + 2 \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \end{bmatrix} = 1\begin{bmatrix} -1 \\ 1 \end{bmatrix}.$$
Again the output lies on the same line as the input — the northwest–southeast diagonal — and again the arrow did not rotate. This time the stretch factor is $1$: the vector is left completely unchanged, a fixed direction. So $(-1,1)$ is a second eigenvector of $A$, with eigenvalue $1$.
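The two hand computations above are easy to double-check numerically. What follows is a minimal sketch using numpy's matrix product; the variable names (`candidates`, `image_e1`) are illustrative choices, not part of the book's toolkit.

```python
import numpy as np

# Home matrix from this section.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Candidate pairs found by inspection: (stretch factor, direction).
candidates = [(3.0, np.array([1.0, 1.0])),
              (1.0, np.array([-1.0, 1.0]))]

for lam, v in candidates:
    image = A @ v          # apply the matrix to the arrow
    scaled = lam * v       # merely rescale the same arrow
    print(v, "->", image, "| same as lam * v:", np.allclose(image, scaled))

# A non-example: the basis vector e1 is turned off its own line.
e1 = np.array([1.0, 0.0])
image_e1 = A @ e1          # (2, 1), which is not a multiple of (1, 0)
print(e1, "->", image_e1)
```

Both candidate arrows should report a match, while the image of $\mathbf{e}_1$ has a nonzero second component and so cannot lie on the horizontal axis.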
Geometric Intuition — Picture the whole plane as a stretchy rubber sheet, and the matrix $A$ as a deformation of that sheet. Almost every arrow drawn on the sheet gets dragged to a new angle as the sheet deforms. But along the $45°$ line, the sheet only stretches outward by a factor of $3$ — points slide away from the origin along that line but never leave it. Along the perpendicular $135°$ line, the sheet does nothing at all — those points stay put. An eigenvector is an arrow drawn along one of those special, un-rotated grain lines of the deformation. The matrix may be a chaotic-looking grid of numbers, but its action is secretly just "triple everything along this line, leave everything along that line alone."
This is the definition you should carry in your head before any formula. An eigenvector of a matrix is a nonzero vector whose direction the matrix does not change — the matrix only scales it. The scaling factor is the eigenvalue. The whole subject of this Part flows from that one picture: a transformation, no matter how complicated it looks in coordinates, has a few private directions along which it acts as nothing more than a stretch.
23.1.1 Why "the vectors a matrix doesn't rotate" is the right slogan
The chapter's subtitle calls eigenvectors the vectors that a matrix doesn't rotate, and it is worth being precise about what that does and does not mean. It does not mean the vector is unmoved — an eigenvector with eigenvalue $3$ is moved a great deal, tripled in length. It means the vector is not moved off its own line. Its direction (or its opposite direction) is preserved; only its magnitude changes. Every other vector in the plane, when you apply the matrix, comes out pointing along a genuinely new line — it has been rotated, even if only slightly. The eigenvectors are the exceptions: the directions the rotation part of the transformation cannot touch.
That phrasing also tells you why eigenvectors are rare and special rather than generic. If you pick a direction at random and apply a typical matrix, the odds that the output lands back on the exact same line are essentially zero — you would have to hit one of a small handful of perfectly aligned directions. So eigenvectors are not "most vectors." They are the structural skeleton, the few load-bearing directions, hiding inside a transformation that scrambles everything else.
23.1.2 Sweeping the circle: why most arrows turn and a few do not
It helps to picture the search as a continuous sweep, because that picture makes the existence of eigenvectors feel almost inevitable. Place a single unit arrow pointing east and slowly rotate the input arrow counterclockwise through a full circle, applying our home matrix $A = \begin{psmallmatrix}2&1\\1&2\end{psmallmatrix}$ at every angle and noting, each time, the angle between the input arrow and its image. At due east ($\theta = 0°$) the input is $(1,0)$ and the image is $(2,1)$, which points up and to the right — the image sits above the input, so $A$ has rotated this arrow counterclockwise. Keep turning the input toward the northeast. By the time the input reaches $45°$, pointing along $(1,1)$, the image is $(3,3)$, which points along $(1,1)$ too — the angle between them has shrunk all the way to zero. The input and output are aligned. We have swept right through an eigenvector.
Continue past $45°$ toward north. Now the image falls below the input — $A$ rotates these arrows clockwise. So as the input swept from east ($0°$) to north ($90°$), the rotation $A$ imposes changed sign: counterclockwise before $45°$, clockwise after. Somewhere in between, the rotation had to pass through zero — and that crossing is the eigenvector. A direction where the imposed rotation is exactly zero is, by definition, a direction the matrix does not turn. The eigenvectors are the angles at which the transformation's rotation changes sign and momentarily vanishes. This is why a typical $2\times2$ matrix has a couple of them: as you sweep the input through $180°$, the imposed rotation generally swings from one sign to the other and back, crossing zero twice. (And when it never crosses zero — when the matrix rotates every arrow in the same rotational sense, like a true rotation — there are no real eigenvectors at all, the case we confront in §23.6.1.)
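The sweep just described can be imitated numerically. The sketch below is one possible way to do it, under the assumption that the sign of the 2D cross product of the input and its image is an acceptable stand-in for "which way the arrow was turned"; the names `turn` and `aligned` are made up for this illustration.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# Input angles from 0 to 180 degrees in quarter-degree steps
# (a line through the origin repeats after half a turn).
angles_deg = np.linspace(0.0, 180.0, 721)
theta = np.radians(angles_deg)
inputs = np.vstack([np.cos(theta), np.sin(theta)])   # one unit arrow per column
images = A @ inputs

# 2D cross product of input and image: positive when the image sits
# counterclockwise of the input, negative when clockwise, zero when
# both lie on the same line.
turn = inputs[0] * images[1] - inputs[1] * images[0]

# Angles at which the imposed turning is numerically zero.
aligned = angles_deg[np.isclose(turn, 0.0, atol=1e-9)]
print("aligned input angles (degrees):", aligned)
```

For this matrix the cross product works out to $\cos^2\theta - \sin^2\theta = \cos 2\theta$, so the turning is positive before $45°$, negative between $45°$ and $135°$, and zero exactly at those two angles, in agreement with the sweep.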
Geometric Intuition — Think of the matrix as a wind blowing across a field of arrows, each pinned at the origin and free to swing. For most arrows the wind pushes them to a new heading. The eigenvectors are the arrows that happen to point straight into or straight along the wind, so the wind only lengthens or shortens them without swinging them aside. Sweep your gaze around the circle and the eigen-directions are the calm headings where the sideways push falls to zero. Finding eigenvectors is finding the directions the transformation pushes along, not across.
23.2 What does the eigen-equation $A\mathbf{v}=\lambda\mathbf{v}$ actually say?
We now have enough understanding to write the picture down as an equation — and crucially, the equation will mean something, because we built the meaning first. The statement "applying $A$ to $\mathbf{v}$ produces a scaled copy of $\mathbf{v}$" is, symbol for symbol,
$$\boxed{\,A\mathbf{v} = \lambda\mathbf{v}\,}$$
This is the eigen-equation, the central equation of Part V and one of the most important equations in all of linear algebra. Read it slowly, left to right, as a sentence. On the left, $A\mathbf{v}$: take the vector $\mathbf{v}$ and transform it with the matrix $A$ — generally a complicated operation that mixes the components of $\mathbf{v}$ together. On the right, $\lambda\mathbf{v}$: take the same vector $\mathbf{v}$ and merely scale it by the number $\lambda$ — a trivial operation that touches no direction. The equation demands that, for this special $\mathbf{v}$, those two operations give the same answer. The full machinery of the matrix collapses, on $\mathbf{v}$, into simple multiplication by a single scalar.
The Key Insight — The eigen-equation $A\mathbf{v} = \lambda\mathbf{v}$ says that the matrix $A$ acts on the special vector $\mathbf{v}$ exactly as if it were the lowly scalar $\lambda$. A whole matrix, reduced to a single number — but only along the eigenvector's direction. Eigenvectors are the directions in which a matrix is "as simple as a number."
A pair $(\lambda, \mathbf{v})$ that satisfies the equation is called an eigenpair, and we say $\mathbf{v}$ is an eigenvector belonging to (or associated with) the eigenvalue $\lambda$. For our home matrix $A = \begin{psmallmatrix}2&1\\1&2\end{psmallmatrix}$, we found two eigenpairs by inspection: $(\lambda_1, \mathbf{v}_1) = (3, (1,1))$ and $(\lambda_2, \mathbf{v}_2) = (1, (-1,1))$. The complete list of a matrix's eigenvalues is called its spectrum — a word borrowed, as we will note shortly, from physics, and the reason eigenvalue problems are sometimes called spectral problems.
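These eigenpairs can also be recovered with numpy's `np.linalg.eig`. The snippet below is a small sketch rather than a recommended workflow; it relies only on the documented behavior that `eig` returns the eigenvalues together with a matrix whose columns are the corresponding eigenvectors.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eig returns (eigenvalues, matrix whose COLUMNS are eigenvectors).
eigenvalues, eigenvectors = np.linalg.eig(A)

for k in range(len(eigenvalues)):
    lam = eigenvalues[k]
    v = eigenvectors[:, k]                     # k-th column pairs with k-th eigenvalue
    residual = np.linalg.norm(A @ v - lam * v) # should be close to zero
    print(f"lambda = {lam:.4f}, v = {v}, |Av - lambda v| = {residual:.2e}")

# eig does not promise any particular ordering, so sort before comparing.
spectrum = np.sort(eigenvalues)
print("spectrum:", spectrum)
```

Expect the returned vectors to look different from $(1,1)$ and $(-1,1)$: numpy reports unit-length representatives, and the overall sign of each column is arbitrary. They still lie on the same two lines.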
The word eigen itself is German for "own," "proper," or "characteristic." An eigenvector is the matrix's own vector — a direction that belongs intrinsically to the transformation rather than to the coordinate system. (You will sometimes see the older English terms characteristic vector and characteristic value, and the eigenvalues are still occasionally called the matrix's proper values or latent roots. The hybrid German–English word "eigenvalue" won out and is now universal.)
Historical Note — The German prefix eigen- was attached to these "characteristic" quantities by David Hilbert in his work on integral equations around 1904, and the terms Eigenwert (eigenvalue) and Eigenfunktion entered mathematics through him and the Göttingen school. The underlying idea is far older: characteristic values appear in Euler's and Lagrange's eighteenth-century study of the rotational motion of rigid bodies (the "principal axes" of a spinning top are eigenvectors of its inertia tensor), and Cauchy connected them to quadratic surfaces in the 1820s. The hybrid English word "eigenvalue" did not become standard until the twentieth century; many older texts say "characteristic value" or "latent root" (a term coined by Sylvester).
23.2.1 The eigen-equation is the whole problem in one line
Everything in Part V is, in some form, this single equation. Chapter 24 will ask how to find the $\lambda$ and $\mathbf{v}$ that satisfy it for a given $A$ (the answer involves the characteristic polynomial). Chapter 25 will ask what happens when you have enough eigenvectors to make a basis (the matrix diagonalizes, and its powers become easy). Chapter 27 will ask what is special when $A$ is symmetric (the eigenvectors come out perpendicular). Chapter 29 will ask which eigenvector dominates when you apply $A$ over and over (the answer ranks the entire web). But all of it begins, and keeps returning, to $A\mathbf{v} = \lambda\mathbf{v}$ — a matrix, a vector, a number, and the simple demand that transforming equals scaling.
23.3 What does the eigenvalue mean? Reading $\lambda$ as a stretch factor
The eigenvector tells you which direction is invariant; the eigenvalue $\lambda$ tells you what happens along that direction. Because the eigen-equation says $A\mathbf{v} = \lambda\mathbf{v}$, the eigenvalue is precisely the stretch factor applied to the eigenvector. And a single number can do more than you might expect — its sign and its size encode the entire qualitative behavior of the transformation along that line. Let us read off the full vocabulary.
- $\lambda > 1$ — stretching (growth). The eigenvector points the same way after transformation but is longer. For our home matrix, $\lambda_1 = 3$ tripled the $(1,1)$ direction. Repeated application makes such directions grow without bound; these are the expanding directions of the transformation.
- $\lambda = 1$ — a fixed direction. The eigenvector is completely unchanged: $A\mathbf{v} = \mathbf{v}$. It is a fixed point direction; vectors along it survive the transformation untouched. We saw this with $\lambda_2 = 1$ for the $(-1,1)$ direction. This case is the seed of steady states, the heart of the Markov-chain and PageRank applications.
- $0 < \lambda < 1$ — shrinking (decay). The eigenvector points the same way but is shorter. Repeated application makes these directions collapse toward the origin; they are the contracting directions. When every eigenvalue satisfies $|\lambda| < 1$, repeated application pulls every vector toward the origin in the long run.
- $\lambda = 0$ — collapse onto a lower dimension. Then $A\mathbf{v} = \mathbf{0}$: the whole eigenvector direction is crushed to the zero vector. The eigenvector with eigenvalue $0$ is exactly a nonzero vector in the null space $N(A)$ from Chapter 13. So zero is an eigenvalue if and only if the matrix is singular — a beautiful link between two ideas you met separately, which we make precise below.
- $\lambda < 0$ — a flip plus a stretch. The eigenvector is reversed (pointed in the opposite direction) and scaled by $|\lambda|$. The line is still invariant — the arrow stays on its own line — but it now points backward. For example $\lambda = -2$ flips the eigenvector and doubles it.
- $|\lambda| > 1$ versus $|\lambda| < 1$ — the magnitude is the growth rate. What matters for long-run behavior is the absolute value $|\lambda|$. If $|\lambda| > 1$ the direction grows under repeated application; if $|\lambda| < 1$ it shrinks; if $|\lambda| = 1$ it neither grows nor shrinks. This single observation is the engine of dynamical-systems stability, which we will exploit in Chapter 25 and again in the ODE chapter, Chapter 37.
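To see the growth-rate reading of $|\lambda|$ in numbers, here is a small sketch. It assumes a diagonal matrix chosen purely for illustration, with eigenvalue $2$ along the $x$-axis and eigenvalue $-\tfrac12$ along the $y$-axis (both can be checked by multiplying out); the helper `length_after_k_steps` is a hypothetical name, not a toolkit function.

```python
import numpy as np

def length_after_k_steps(A, v, k):
    """Length of the vector obtained by applying A to v a total of k times."""
    w = np.asarray(v, dtype=float)
    for _ in range(k):
        w = A @ w
    return np.linalg.norm(w)

# Illustrative matrix: A e1 = 2 e1 and A e2 = -0.5 e2.
A = np.array([[2.0, 0.0],
              [0.0, -0.5]])

# Lengths after 0, 1, 2, 3, 4 applications along each eigen-direction.
growing = [length_after_k_steps(A, [1, 0], k) for k in range(5)]
shrinking = [length_after_k_steps(A, [0, 1], k) for k in range(5)]

# One application along the second direction: halved and flipped.
flipped = A @ np.array([0.0, 1.0])

print("lambda =  2  :", growing)
print("lambda = -0.5:", shrinking)
print("A e2 =", flipped)
```

The lengths follow $|\lambda|^k$: doubling each step along the first direction and halving along the second, where the sign of $\lambda$ shows up only as the flip and not in the length.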
Geometric Intuition — Think of each eigenvalue as a verdict on its own private axis, delivered the instant you apply the matrix. "Along this line, everything triples" ($\lambda = 3$). "Along that line, nothing happens" ($\lambda = 1$). "Along this other line, everything is cut in half and flipped to the far side" ($\lambda = -\tfrac12$). The transformation as a whole may look like an incomprehensible swirl, but on each eigen-axis its instruction is just one number — stretch by $\lambda$. The eigenvalues are the transformation's complete set of marching orders, one per invariant direction.
Common Pitfall — A negative eigenvalue does not mean the eigenvector "isn't really" an eigenvector or that something went wrong. Many students expect $\lambda$ to be positive because they picture "stretching," but $\lambda < 0$ is perfectly legitimate: the arrow stays on its own line (still invariant) and is flipped to point the opposite way. The direction is still preserved as a line through the origin — and a line does not care which way the arrow along it points. Likewise $\lambda = 0$ is a genuine eigenvalue, not a "missing" one; it signals a direction the matrix annihilates.
23.3.1 The eigenvalue is a number, the eigenvector is a direction
Hold the two halves of an eigenpair in their proper places. The eigenvalue $\lambda$ is a scalar — just a number, possibly negative, possibly zero, and (we will see in Chapter 26) possibly complex. The eigenvector $\mathbf{v}$ is a direction — really a whole line through the origin, as the next section makes clear. The eigenvalue answers "how much?" and the eigenvector answers "along what?" Confusing the two — asking for "the eigenvalue's direction" or "the eigenvector's size" — is a category error that the careful reader never makes. A matrix's essential behavior is a list of (direction, stretch-factor) pairs, and that list is its eigenstructure.
23.3.2 Reading transformations you already know off their eigenvalues
To make the stretch-factor vocabulary concrete, let us read the eigenvalues of three transformations we have studied in earlier chapters and watch the geometry and the numbers agree. This is good practice at the most important skill of the chapter: seeing what eigenvalues mean before computing anything.
A reflection across the $x$-axis, the matrix $\begin{psmallmatrix}1&0\\0&-1\end{psmallmatrix}$ from Chapter 21. What does a mirror leave on its own line? Two families of directions. Any arrow along the mirror line (the $x$-axis) is untouched — it stays exactly where it is — so it is an eigenvector with eigenvalue $\lambda = +1$. Any arrow perpendicular to the mirror (the $y$-axis) is flipped straight across to point the opposite way, staying on its own line but reversed — so it is an eigenvector with eigenvalue $\lambda = -1$. Every reflection has exactly this signature: one eigenvalue $+1$ (the mirror line, fixed) and one eigenvalue $-1$ (the perpendicular, flipped). The negative eigenvalue is the algebraic fingerprint of the flip — precisely the $\lambda < 0$ case from §23.3, made vivid. You can confirm with numpy that $\begin{psmallmatrix}1&0\\0&-1\end{psmallmatrix}$ returns eigenvalues $[\,1, -1\,]$, and you could have predicted it from the geometry alone.
A projection onto the $x$-axis, the matrix $\begin{psmallmatrix}1&0\\0&0\end{psmallmatrix}$ (project every point straight down onto the horizontal axis, the idea from Chapter 19). An arrow already lying on the $x$-axis is left alone — eigenvalue $+1$. An arrow on the $y$-axis is crushed flat to the origin — eigenvalue $0$. So a projection has eigenvalues $1$ and $0$: the directions it keeps ($\lambda = 1$) and the directions it annihilates ($\lambda = 0$). The eigenvalue-$0$ direction is exactly the null space, recovering the §23.3 fact that zero is an eigenvalue precisely when the matrix is singular — and projections, which throw away a dimension, are always singular. The eigenvalues read out the projection's anatomy: what survives and what collapses.
A pure rotation by $90°$, the matrix $\begin{psmallmatrix}0&-1\\1&0\end{psmallmatrix}$. We have already noted (and will confirm in §23.6.1) that it has no real eigenvalues — there is no real direction a $90°$ turn leaves on its own line. So "read the eigenvalues off the geometry" here returns the honest answer "there are none over the reals," and that absence is itself a fingerprint: a transformation with no real invariant direction must be doing some genuine turning. Three transformations, three eigen-signatures, no heavy computation — just the question which directions stay on their own line, and by how much?
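The three eigen-signatures above can be confirmed in a few lines. This sketch uses `np.linalg.eigvals`, which returns eigenvalues only; the dictionary keys and the name `spectra` are labels invented for the example.

```python
import numpy as np

transformations = {
    "reflection across x-axis": np.array([[1.0, 0.0], [0.0, -1.0]]),
    "projection onto x-axis":   np.array([[1.0, 0.0], [0.0, 0.0]]),
    "rotation by 90 degrees":   np.array([[0.0, -1.0], [1.0, 0.0]]),
}

spectra = {}
for name, M in transformations.items():
    eigenvalues = np.linalg.eigvals(M)
    spectra[name] = eigenvalues
    # np.isreal is True entry-by-entry where the imaginary part is zero.
    all_real = bool(np.all(np.isreal(eigenvalues)))
    print(f"{name}: eigenvalues {eigenvalues}, all real: {all_real}")
```

For the rotation, numpy does not fail; it reports a pair of complex numbers with nonzero imaginary parts, which is its way of saying that no real invariant direction exists.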
Check Your Understanding — Without computing a characteristic polynomial, what are the eigenvalues of (a) the identity matrix $I$, (b) the scaling $5I$, and (c) a reflection across the line $y = x$?
Answer
(a) $I$ leaves every vector exactly where it is, so every direction is invariant with eigenvalue $1$: the only eigenvalue is $\lambda = 1$ (and its eigenspace is the whole plane). (b) $5I$ stretches every direction by $5$, so the only eigenvalue is $\lambda = 5$ (eigenspace again the whole plane). (c) A reflection always has eigenvalues $+1$ (along the mirror line $y=x$, fixed) and $-1$ (perpendicular to it, along $y=-x$, flipped) — the same signature as any reflection. None of these needed algebra; the geometry hands you the answer.
23.3.3 The trace and determinant shortcut for a 2×2
For a $2\times 2$ matrix there is a fast check on your eigenvalues worth having in your pocket, and it foreshadows a theme of §23.8. As we will see in §23.6 (and derive carefully in Chapter 24), the characteristic polynomial of $\begin{psmallmatrix}a&b\\c&d\end{psmallmatrix}$ is
$$\lambda^2 - (a+d)\,\lambda + (ad - bc) = \lambda^2 - \operatorname{tr}(A)\,\lambda + \det(A).$$
By comparing this with the factored form $(\lambda - \lambda_1)(\lambda - \lambda_2) = \lambda^2 - (\lambda_1 + \lambda_2)\lambda + \lambda_1\lambda_2$, you read off two beautiful identities directly:
$$\lambda_1 + \lambda_2 = \operatorname{tr}(A) = a + d, \qquad \lambda_1 \lambda_2 = \det(A) = ad - bc.$$
The two eigenvalues sum to the trace and multiply to the determinant. This is both a sanity check and sometimes a shortcut to the eigenvalues themselves. For our home matrix $\begin{psmallmatrix}2&1\\1&2\end{psmallmatrix}$: trace $= 4$ and determinant $= 3$, so the eigenvalues are two numbers summing to $4$ and multiplying to $3$ — namely $3$ and $1$, exactly as we found. For $\begin{psmallmatrix}4&1\\2&3\end{psmallmatrix}$: trace $= 7$, determinant $= 10$, so two numbers summing to $7$ and multiplying to $10$ — namely $5$ and $2$. The check confirms our work in a line, and it is your first taste of the deeper truth in §23.8 that the trace and determinant are built from the eigenvalues and so are coordinate-free invariants of the transformation.
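The shortcut translates directly into code. The sketch below applies the quadratic formula to $\lambda^2 - \operatorname{tr}(A)\lambda + \det(A)$ for a $2\times 2$ matrix; the function name `eigenvalues_from_trace_det` is hypothetical, and the complex square root is an assumption made so that the function also behaves sensibly when the discriminant is negative.

```python
import numpy as np

def eigenvalues_from_trace_det(M):
    """Roots of lambda^2 - tr(M) lambda + det(M) for a 2x2 matrix M."""
    tr = M[0, 0] + M[1, 1]
    det = M[0, 0] * M[1, 1] - M[0, 1] * M[1, 0]
    disc = tr**2 - 4 * det           # negative means no real eigenvalues
    root = np.sqrt(complex(disc))    # complex sqrt copes with disc < 0
    return (tr + root) / 2, (tr - root) / 2

home = np.array([[2.0, 1.0], [1.0, 2.0]])
other = np.array([[4.0, 1.0], [2.0, 3.0]])

results = [eigenvalues_from_trace_det(M) for M in (home, other)]
for M, (l1, l2) in zip((home, other), results):
    print("eigenvalues:", l1, l2,
          "| sum:", l1 + l2, "vs trace", np.trace(M),
          "| product:", l1 * l2, "vs det", round(float(np.linalg.det(M)), 10))
```

The values come back as complex numbers with zero imaginary part; that is an artifact of the complex square root, not a sign that the eigenvalues are genuinely complex.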
23.4 Why is a scaled eigenvector still an eigenvector? (and why $\mathbf{0}$ is not one)
We have been a little casual in saying "the eigenvector $(1,1)$," as though there were just one. In fact, if $\mathbf{v}$ is an eigenvector, so is every nonzero scalar multiple of it. This is not a technicality; it is the reason an eigenvector is best thought of as a direction (or a line through the origin) rather than as one specific arrow.
The proof is immediate from linearity. Suppose $A\mathbf{v} = \lambda\mathbf{v}$, and let $c \ne 0$ be any scalar. Then
$$A(c\mathbf{v}) = c\,(A\mathbf{v}) = c\,(\lambda\mathbf{v}) = \lambda\,(c\mathbf{v}).$$
The vector $c\mathbf{v}$ satisfies the eigen-equation with the same eigenvalue $\lambda$. So $(1,1)$, $(2,2)$, $(7,7)$, and $(-5,-5)$ are all eigenvectors of our home matrix belonging to $\lambda = 3$. (Check the most extreme one: $A(7,7) = (21,21) = 3\cdot(7,7)$ ✓.) Geometrically this is obvious — if the matrix stretches a whole line by a factor of $3$, then every arrow on that line is stretched by $3$, so every arrow on the line is an eigenvector. The eigenvalue belongs to the line; the individual arrow is just a representative.
Because of this freedom, eigenvectors are only determined up to a nonzero scalar, and we are free to normalize — to pick a convenient representative. Common choices are the unit-length representative (so $\lVert\mathbf{v}\rVert = 1$, which is what numpy returns) or the "nicest integers" representative like $(1,1)$ that we use for hand work. There is no single "correct" eigenvector for a given eigenvalue; there is a correct direction, and any nonzero arrow along it will do.
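A short numerical sketch of this freedom, using the home matrix: every nonzero multiple of $(1,1)$ passes the same test, and dividing by the length gives the unit representative. The particular scalars tried here are arbitrary choices for illustration.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
v = np.array([1.0, 1.0])               # hand-work representative for lambda = 3

# Every nonzero multiple of v satisfies the same equation with the same lambda.
scalars = [2.0, 7.0, -5.0, 0.001]
still_eigen = [bool(np.allclose(A @ (c * v), 3.0 * (c * v))) for c in scalars]
print("multiples pass the check:", still_eigen)

# Unit-length representative: divide by the Euclidean norm.
unit_v = v / np.linalg.norm(v)         # approximately (0.7071, 0.7071)
print("unit representative:", unit_v)

# The columns returned by eig are unit length too (their sign may differ).
_, vecs = np.linalg.eig(A)
column_norms = np.linalg.norm(vecs, axis=0)
print("norms of eig's columns:", column_norms)
```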
Common Pitfall — The zero vector is never an eigenvector. This is the one exclusion built into the definition, and it trips up almost everyone at first. Notice that $A\mathbf{0} = \mathbf{0} = \lambda\mathbf{0}$ holds for every number $\lambda$ — the zero vector satisfies the eigen-equation for all $\lambda$ simultaneously. If we allowed $\mathbf{0}$ as an eigenvector, then every scalar would be an eigenvalue of every matrix, and the concept would carry no information at all. So we require an eigenvector to be nonzero by definition: $A\mathbf{v} = \lambda\mathbf{v}$ with $\mathbf{v} \ne \mathbf{0}$. (The eigenvalue $\lambda$, by contrast, is perfectly allowed to be $0$ — that is the singular case from §23.3. Do not confuse "eigenvalue zero is allowed" with "the zero vector is allowed"; the first is fine, the second is forbidden.)
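One way to keep the exclusion honest in code is to build it into whatever check you write. The helper below, `is_eigenpair`, is a hypothetical function written for this illustration (it is not part of numpy or the book's toolkit), and the tolerance is an arbitrary assumption.

```python
import numpy as np

def is_eigenpair(A, lam, v, tol=1e-10):
    """Return True if v is nonzero and A v equals lam * v within tol."""
    A = np.asarray(A, dtype=float)
    v = np.asarray(v, dtype=float)
    if np.linalg.norm(v) <= tol:       # the zero vector is excluded by definition
        return False
    return bool(np.linalg.norm(A @ v - lam * v) <= tol)

A = np.array([[2.0, 1.0], [1.0, 2.0]])
zero = np.zeros(2)
trial_lambdas = (-4.0, 0.0, 3.0, 17.5)

# The bare equation A0 = lam * 0 holds for every lam we try...
equation_holds = [bool(np.allclose(A @ zero, lam * zero)) for lam in trial_lambdas]
# ...which is exactly why the definition has to rule the zero vector out.
zero_accepted = [is_eigenpair(A, lam, zero) for lam in trial_lambdas]

# Eigenvalue zero, by contrast, is fine: a singular matrix with a nonzero
# vector that it sends to the origin.
P = np.array([[1.0, 0.0], [0.0, 0.0]])
eigenvalue_zero_ok = is_eigenpair(P, 0.0, [0.0, 1.0])

print(equation_holds, zero_accepted, eigenvalue_zero_ok)
```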
Check Your Understanding — You are told that $(2, 2)$ is an eigenvector of some matrix $M$ with eigenvalue $5$. What is $M(2,2)$? What is $M(-3,-3)$? Is $(0,0)$ an eigenvector of $M$?
Answer
$M(2,2) = 5\cdot(2,2) = (10,10)$, straight from the eigen-equation. Since $(-3,-3) = -\tfrac32(2,2)$ is a nonzero scalar multiple, it lies on the same line and is also an eigenvector with the same eigenvalue $5$: $M(-3,-3) = 5\cdot(-3,-3) = (-15,-15)$. And no — $(0,0)$ is not an eigenvector of $M$ (or of any matrix), by definition, because we require eigenvectors to be nonzero. (It does satisfy $M\mathbf{0} = 5\cdot\mathbf{0}$, but so would every $\lambda$, which is exactly why we exclude it.)
23.5 What do the invariant directions look like in the visualizer?
We have done enough algebra; it is time to see the eigenvectors. The 2D transformation visualizer introduced in Chapter 1 was built for exactly this moment. It draws the unit square (dashed blue), the deformed square after the matrix acts (orange), and the images of the two basis vectors (the red and green arrows). All we need to do is overlay the eigen-directions and watch them stay on their own lines. Here is the visualizer, reproduced verbatim from toolkit/visualizer.py as the style guide requires — do not restyle it; every transformation figure in this book is drawn by this same function:
# toolkit/visualizer.py — the recurring 2D transformation visualizer.
# Shows what a 2x2 matrix A does to the unit square and the basis vectors.
import numpy as np
import matplotlib.pyplot as plt
def visualize_2d(A, title="", ax=None):
    """Plot the action of 2x2 matrix A on the unit square and i-hat, j-hat."""
    A = np.asarray(A, dtype=float)
    square = np.array([[0, 1, 1, 0, 0],
                       [0, 0, 1, 1, 0]])  # unit-square corners (closed)
    out = A @ square  # transformed square
    e1, e2 = A @ np.array([1, 0]), A @ np.array([0, 1])  # images of basis vectors
    if ax is None:
        _, ax = plt.subplots(figsize=(5, 5))
    ax.plot(square[0], square[1], "b--", lw=1, label="input (unit square)")
    ax.fill(out[0], out[1], alpha=0.25, color="C1")
    ax.plot(out[0], out[1], "C1-", lw=2, label="A · (unit square)")
    ax.arrow(0, 0, *e1, color="C3", width=0.02, length_includes_head=True)  # A e1
    ax.arrow(0, 0, *e2, color="C2", width=0.02, length_includes_head=True)  # A e2
    ax.axhline(0, color="gray", lw=0.5)
    ax.axvline(0, color="gray", lw=0.5)
    ax.set_aspect("equal")
    ax.grid(True, alpha=0.3)
    ax.set_title(title or f"det = {np.linalg.det(A):.2f}")
    ax.legend(loc="best", fontsize=8)
    return ax
# Example: a horizontal shear
# visualize_2d([[1, 1], [0, 1]], title="Shear")
# plt.show()
Now we use it on our home matrix and draw the eigenvectors as lines through the origin. The trick is to plot each eigen-direction as a full dashed line: if the input arrow and the output arrow both lie along that same line, you have caught an eigenvector in the act.
# Overlay the eigenvectors of A = [[2,1],[1,2]] on the visualizer.
import numpy as np, matplotlib.pyplot as plt
from toolkit.visualizer import visualize_2d
A = np.array([[2, 1], [1, 2]])
ax = visualize_2d(A, title="A=[[2,1],[1,2]]: eigen-lines stay on themselves")
for vec, lam, c in [((1, 1), 3, "purple"), ((-1, 1), 1, "black")]:
    v = np.array(vec, dtype=float)
    t = np.linspace(-1.5, 1.5, 2)  # draw the whole eigen-LINE
    ax.plot(t * v[0], t * v[1], "--", color=c, lw=1.2,
            label=f"eigen-line, λ={lam}")
    Av = A @ v  # image stays on the same line
    ax.arrow(0, 0, *Av, color=c, width=0.03, length_includes_head=True)
ax.legend(loc="best", fontsize=7)
plt.show()
Figure 23.1. The action of $A = \begin{psmallmatrix}2&1\\1&2\end{psmallmatrix}$ on the unit square, with its two eigen-lines overlaid. Alt-text: the dashed blue unit square is deformed into an orange parallelogram; two dashed lines through the origin (the $45°$ and $135°$ diagonals) are drawn, and the arrows along them remain exactly on their own line after the matrix acts — the $45°$ arrow lengthens to triple size ($\lambda=3$), the $135°$ arrow is unchanged ($\lambda=1$).
What you see in Figure 23.1 is the entire chapter in one image. The red and green arrows — the images of the standard basis vectors — have clearly rotated off the horizontal and vertical axes; the standard axes are not invariant. But the two dashed eigen-lines tell a different story. The arrow along the $45°$ line stays exactly on that line and grows to three times its length: there is $\lambda = 3$, made visible as pure stretching. The arrow along the $135°$ line stays exactly on that line and does not move at all: there is $\lambda = 1$, made visible as a fixed direction. The matrix rotates almost everything, but it leaves these two grain-lines un-rotated. That is what an eigenvector is, and now you have watched it happen.
Geometric Intuition — Here is a way to find eigenvectors with your eyes alone, no algebra. Sweep a single arrow slowly around the full circle and, at each angle, lightly imagine its image under $A$ (the visualizer makes this concrete if you animate the input angle). For most angles the input arrow and its image point in different directions — there is an angle between them, a rotation. The eigen-directions are precisely the angles where that gap snaps shut: input and output lie on the same line. Geometrically, finding eigenvectors is finding the angles at which a matrix's rotation vanishes.
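The sweep described above can be sketched numerically. This is a minimal illustration rather than the toolkit's animation: it assumes the home matrix $\begin{psmallmatrix}2&1\\1&2\end{psmallmatrix}$, samples the input angle in $1°$ steps, and measures the "gap" with the 2D cross product of $\mathbf{v}$ and $A\mathbf{v}$, which is zero exactly when the two arrows are parallel. The helper name `misalignment` and the step size are choices made here for illustration.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])

def misalignment(A, theta):
    """2D cross product of v and A v; zero exactly when they are parallel."""
    v = np.array([np.cos(theta), np.sin(theta)])
    Av = A @ v
    return v[0] * Av[1] - v[1] * Av[0]

# A line has no orientation, so sweeping 0..180 degrees covers every direction.
thetas = np.linspace(0.0, np.pi, 181)          # 1-degree steps
gaps = np.array([misalignment(A, t) for t in thetas])

# Keep the angles where the gap is numerically zero.
aligned = np.degrees(thetas[np.isclose(gaps, 0.0, atol=1e-9)])
print("angles where v and Av share a line:", aligned)  # about 45 and 135 degrees
```

A fixed grid only catches eigen-directions that happen to fall on a sampled angle, so for a general matrix a finer grid or a root-finder would be needed.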
23.5.1 A pure-stretch example: when the axes themselves are invariant
Not every matrix rotates the basis vectors. Consider the diagonal matrix
$$D = \begin{bmatrix} 3 & 0 \\ 0 & \tfrac12 \end{bmatrix},$$
which simply triples the horizontal direction and halves the vertical one. Apply it to the basis vectors: $D\mathbf{e}_1 = (3,0) = 3\mathbf{e}_1$ and $D\mathbf{e}_2 = (0, \tfrac12) = \tfrac12\mathbf{e}_2$. Both basis vectors are eigenvectors — $\mathbf{e}_1$ with eigenvalue $3$, $\mathbf{e}_2$ with eigenvalue $\tfrac12$. A diagonal matrix wears its eigenstructure on its sleeve: the eigenvalues of a diagonal matrix are exactly its diagonal entries, and the eigenvectors are the standard axes. In the visualizer, $D$ stretches the unit square horizontally and squashes it vertically into a wide, flat rectangle ($3$ units across and $\tfrac12$ unit tall), and the two arrows stay locked on the horizontal and vertical axes — they never rotate, because the axes are the invariant directions here.
This is the cleanest possible matrix, and it tells us what "understanding a matrix" should feel like. For a diagonal matrix the action is transparently a list of independent scalings, one per axis. The grand goal of the next two chapters is to discover that many matrices are secretly diagonal — they are just written in the wrong coordinate system. Rotate your axes to align with the eigenvectors, and a messy matrix like $\begin{psmallmatrix}2&1\\1&2\end{psmallmatrix}$ becomes the diagonal matrix $\begin{psmallmatrix}3&0\\0&1\end{psmallmatrix}$ of its eigenvalues. That re-coordinatization is diagonalization (Chapter 25), and it is the single most useful payoff of eigenvectors.
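As a small numerical preview of the two claims above, the sketch below checks that $D$ only scales the standard basis vectors, and that the home matrix becomes $\operatorname{diag}(3, 1)$ when rewritten in axes along its eigen-lines. The matrix `Q` (unit vectors along the $45°$ and $135°$ lines as columns) is introduced here for illustration; the systematic treatment belongs to Chapter 25.

```python
import numpy as np

# The diagonal matrix only scales the standard basis vectors.
D = np.array([[3.0, 0.0], [0.0, 0.5]])
e1, e2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
print(D @ e1, D @ e2)            # 3*e1 and 0.5*e2

# Home matrix, rewritten in axes aligned with its eigen-lines.
A = np.array([[2.0, 1.0], [1.0, 2.0]])
Q = np.array([[1.0, -1.0],
              [1.0,  1.0]]) / np.sqrt(2.0)   # columns: unit eigenvectors
A_in_eigen_axes = Q.T @ A @ Q    # Q is orthogonal, so Q.T is its inverse
print(np.round(A_in_eigen_axes, 12))         # diagonal, with 3 and 1
```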
23.5.2 A second experiment: the shear, with only one eigen-line
Run a contrasting matrix through the same visualizer to sharpen the picture: the horizontal shear $\begin{psmallmatrix}1&1\\0&1\end{psmallmatrix}$ from Chapter 1, the very transformation the visualizer's own example draws. A shear slides the top of the unit square rightward while pinning its base, slanting the square into a leaning parallelogram. Which arrows hold their line under a shear? Only the horizontal ones: a vector along the $x$-axis, like $(1,0)$, maps to $(1,0)$ — it is fixed ($\lambda = 1$). But sweep any other direction and it tilts. The vertical arrow $(0,1)$ maps to $(1,1)$, swung off the $y$-axis. So the shear has a single eigen-line, the $x$-axis, and no second invariant direction.
# A shear has only ONE eigen-line (the x-axis); everything else tilts.
import numpy as np, matplotlib.pyplot as plt
from toolkit.visualizer import visualize_2d
A = np.array([[1, 1], [0, 1]])
ax = visualize_2d(A, title="Shear [[1,1],[0,1]]: one eigen-line only")
t = np.linspace(-1.5, 1.5, 2)
ax.plot(t, 0 * t, "--", color="purple", lw=1.2, label="eigen-line, λ=1 (x-axis)")
ax.legend(loc="best", fontsize=7)
plt.show()
print("eigenvalues:", np.linalg.eig(A)[0]) # [1. 1.] -- repeated!
Figure 23.2. The shear $\begin{psmallmatrix}1&1\\0&1\end{psmallmatrix}$ acting on the unit square, with its single eigen-line (the $x$-axis) drawn. Alt-text: the dashed blue unit square is slanted rightward into a parallelogram; only the horizontal axis is marked as an eigen-line, since every non-horizontal direction is tilted by the shear.
The numpy output is revealing: the shear's eigenvalues are $[\,1, 1\,]$ — the value $1$ repeated — yet there is only one eigen-line, not two. A matrix can have a repeated eigenvalue but still fail to supply a full set of independent invariant directions. Such matrices are called defective, and they are the reason the next chapter must distinguish algebraic multiplicity (how many times $\lambda$ is a root) from geometric multiplicity (how many independent eigenvectors it actually yields). The shear is the canonical defective matrix; we take up its eigenspace shortfall in §23.7.1 and meet its full theory in Chapter 24 and again, definitively, in the Jordan-form chapter, Chapter 36. For now the visual lesson stands: most matrices hand you a full set of eigen-lines, but a few — the shears and their kin — are stingy, and the visualizer lets you see the shortfall.
Real-World Application — Vibration and the natural frequencies of a structure. When engineers analyze a bridge, a building, or an aircraft wing, they model it as masses connected by springs, which leads to a matrix equation. The eigenvectors of that matrix are the normal modes — the special shapes in which the whole structure vibrates without changing shape, only scaling up and down — and the eigenvalues determine the corresponding natural frequencies. A structure pushed at a frequency matching one of those natural frequencies resonates, which is why soldiers break step crossing a bridge. (The Tacoma Narrows Bridge, which famously tore itself apart in 1940, is the popular illustration, although engineers now attribute that collapse mainly to aeroelastic flutter rather than simple resonance.) Finding the invariant directions of the structure's matrix is, quite literally, finding the ways it likes to shake. We develop this fully in this chapter's second case study.
23.6 Do eigenvectors always exist? A first look at finding them for a 2×2
So far we have recognized eigenvectors that were handed to us or guessed by symmetry. The natural next question is how to find them for an arbitrary matrix, and whether they always exist. The full, systematic answer — the characteristic polynomial $\det(A - \lambda I) = 0$ — is the entire subject of Chapter 24, and we will not steal its thunder. But we can take a first, geometry-driven look that reveals where the algebra comes from. We do this on a matrix that is not symmetric, so you see that eigenvectors are not a special privilege of nice matrices.
Take
$$A = \begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix}.$$
We want directions $\mathbf{v} \ne \mathbf{0}$ and scalars $\lambda$ with $A\mathbf{v} = \lambda\mathbf{v}$. The key algebraic move is to get everything on one side. Rewrite the right-hand side as $\lambda I \mathbf{v}$ (inserting the identity changes nothing) and subtract:
$$A\mathbf{v} = \lambda\mathbf{v} \;\;\Longleftrightarrow\;\; A\mathbf{v} - \lambda\mathbf{v} = \mathbf{0} \;\;\Longleftrightarrow\;\; (A - \lambda I)\mathbf{v} = \mathbf{0}.$$
This recasting is the hinge of the whole theory, so dwell on it. The eigen-equation has become a statement about the null space from Chapter 13: $\mathbf{v}$ is an eigenvector with eigenvalue $\lambda$ exactly when $\mathbf{v}$ is a nonzero vector in the null space of the matrix $A - \lambda I$. Now recall the crucial fact from Chapter 9 and Chapter 11: a square matrix has a nonzero null vector if and only if it is singular — that is, if and only if its determinant is zero. So a number $\lambda$ is an eigenvalue precisely when
$$\det(A - \lambda I) = 0.$$
There it is — the characteristic equation — but notice how we got here. We did not start by writing it down as a formula to memorize; we derived it as the condition for the invariant-direction equation to have a nonzero solution. The geometry came first; the polynomial is its consequence. (Chapter 24 turns this into a reliable procedure and studies the polynomial's roots in depth, including the subtle business of repeated roots.)
Let us carry it through for this $2\times 2$. Form $A - \lambda I$ and take its determinant:
$$A - \lambda I = \begin{bmatrix} 4 - \lambda & 1 \\ 2 & 3 - \lambda \end{bmatrix}, \qquad \det(A - \lambda I) = (4-\lambda)(3-\lambda) - (1)(2).$$
Expand:
$$\det(A - \lambda I) = 12 - 7\lambda + \lambda^2 - 2 = \lambda^2 - 7\lambda + 10.$$
Set it to zero and factor: $\lambda^2 - 7\lambda + 10 = (\lambda - 5)(\lambda - 2) = 0$, so the eigenvalues are $\lambda_1 = 5$ and $\lambda_2 = 2$. (For a $2\times2$ matrix $\begin{psmallmatrix}a&b\\c&d\end{psmallmatrix}$, this quadratic is always $\lambda^2 - (a+d)\lambda + (ad - bc) = \lambda^2 - \operatorname{tr}(A)\,\lambda + \det(A)$ — the coefficients are the trace and the determinant, a shortcut we revisit in §23.8.)
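The trace/determinant form of the quadratic can be checked in a few lines. The sketch below builds the coefficients $1, -\operatorname{tr}(A), \det(A)$ for our matrix and hands them to `np.roots`; it is meant only as a check of the hand computation, not as the way eigenvalues are computed in practice.

```python
import numpy as np

A = np.array([[4.0, 1.0], [2.0, 3.0]])

# Coefficients of lambda^2 - tr(A)*lambda + det(A), highest power first.
coeffs = [1.0, -np.trace(A), np.linalg.det(A)]
roots = np.sort(np.roots(coeffs))

print("coefficients:", coeffs)   # approximately [1, -7, 10]
print("roots:", roots)           # approximately [2, 5]
```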
Now find each eigenvector by solving $(A - \lambda I)\mathbf{v} = \mathbf{0}$ — that is, by finding the null space, exactly the skill from Chapter 13.
For $\lambda_1 = 5$: $$A - 5I = \begin{bmatrix} -1 & 1 \\ 2 & -2 \end{bmatrix}.$$ The equation $(A - 5I)\mathbf{v} = \mathbf{0}$ is $-v_1 + v_2 = 0$ (the two rows are multiples, as they must be for a singular matrix), so $v_2 = v_1$. Any vector with equal components works; the simplest is $\mathbf{v}_1 = (1, 1)$. Check: $A(1,1) = (4+1, 2+3) = (5,5) = 5\cdot(1,1)$ ✓.
For $\lambda_2 = 2$: $$A - 2I = \begin{bmatrix} 2 & 1 \\ 2 & 1 \end{bmatrix}.$$ The equation $(A - 2I)\mathbf{v} = \mathbf{0}$ is $2v_1 + v_2 = 0$, so $v_2 = -2v_1$. The simplest solution is $\mathbf{v}_2 = (1, -2)$ (or equivalently $(-1, 2)$). Check: $A(1,-2) = (4-2, 2-6) = (2, -4) = 2\cdot(1,-2)$ ✓.
So the non-symmetric matrix $\begin{psmallmatrix}4&1\\2&3\end{psmallmatrix}$ has two perfectly good real eigenpairs: $(5, (1,1))$ and $(2, (1,-2))$. It stretches the $(1,1)$ direction by $5$ and the $(1,-2)$ direction by $2$. Run it through the visualizer and you would see two dashed eigen-lines, both holding their own under the transformation. Eigenvectors are not a luxury of symmetric matrices.
Computational Note — Here is the same computation handed to numpy, and it must match what we found by hand (the book's verification standard demands it). `np.linalg.eig` returns the eigenvalues in a 1D array `w` and the eigenvectors as the columns of a 2D array `V`, each normalized to unit length and in an order that need not match yours.

```python
# Verify the eigenpairs of A = [[4,1],[2,3]] against our hand computation.
import numpy as np
A = np.array([[4.0, 1.0], [2.0, 3.0]])
w, V = np.linalg.eig(A)
print("eigenvalues:", w)  # [5. 2.]
print("eigenvectors (columns):\n", V)
# eigenvectors (columns):
# [[ 0.707107 -0.447214]
#  [ 0.707107  0.894427]]
```

The eigenvalues come back as `[5. 2.]`, matching $\lambda_1 = 5$, $\lambda_2 = 2$ exactly. The first eigenvector column is $(0.7071, 0.7071)$ — that is our $(1,1)$ normalized to unit length, since $(1,1)/\sqrt2 = (0.7071, 0.7071)$. The second column is $(-0.4472, 0.8944)$, which is our $(1,-2)$ normalized: $(-1,2)/\sqrt5 = (-0.4472, 0.8944)$ (numpy chose the $(-1,2)$ representative and the unit-length scaling — a different arrow, the same direction). Same eigenvalues, same eigen-lines, different choice of representative arrow. Always expect numpy's eigenvectors to differ from yours by a scalar — usually a length normalization and possibly a sign.
23.6.1 When there is no real eigenvector: a rotation
Do eigenvectors always exist over the real numbers? No — and the cleanest counterexample is the most geometrically obvious one. Consider the $90°$ rotation matrix from Chapter 21,
$$R = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}.$$
Geometrically, ask the defining question: is there any direction that $R$ leaves on its own line? A $90°$ rotation turns every arrow by a quarter turn — east becomes north, north becomes west, and so on. No nonzero real vector comes back pointing along its own line, because every direction is rotated to a perpendicular one. So $R$ has no real eigenvectors at all. Watch the algebra confirm the geometry: the characteristic equation is
$$\det(R - \lambda I) = \det\begin{bmatrix} -\lambda & -1 \\ 1 & -\lambda \end{bmatrix} = \lambda^2 + 1 = 0,$$
whose roots are $\lambda = \pm i$ — imaginary, with no real solutions. The absence of real invariant directions and the absence of real roots are the same fact, viewed geometrically and algebraically. This is not a defect; it is information, and it is so important that Chapter 26 is devoted to it. There you will learn the beautiful resolution: a matrix with complex eigenvalues is a rotation in disguise, and the complex eigenvalues encode precisely the angle it spins through. For now, simply absorb the honest caveat:
Warning
— Not every real matrix has real eigenvectors. A rotation (other than by $0°$ or $180°$) has none over $\mathbb{R}$, because it turns every direction. Over the complex numbers $\mathbb{C}$, however, the situation is far cleaner: by the Fundamental Theorem of Algebra the characteristic polynomial of an $n\times n$ matrix always has $n$ roots (counted with multiplicity), so every square matrix has $n$ complex eigenvalues. The reals can fall short; the complex numbers never do. Keep this condition in mind — "real eigenvalues" is a property some matrices have and some lack, while "complex eigenvalues exist" is universal. We meet the complex case head-on in Chapter 26, and a different subtlety (too few eigenvectors even when the roots are real) in Chapter 24.
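To see the warning in action, the sketch below passes the $90°$ rotation $R$ to `np.linalg.eig`. It assumes nothing beyond the matrix defined above: numpy switches to complex arithmetic, returns the conjugate pair $\pm i$, and the complex eigenvectors still satisfy $R\mathbf{v} = \lambda\mathbf{v}$.

```python
import numpy as np

R = np.array([[0.0, -1.0], [1.0, 0.0]])    # 90-degree rotation
w, V = np.linalg.eig(R)

print("eigenvalues:", w)                   # the pair +i and -i, in some order
print("complex result:", np.iscomplexobj(w))

# Each complex eigenpair still satisfies the eigen-equation.
pairs_ok = [np.allclose(R @ v, lam * v) for lam, v in zip(w, V.T)]
print("R v = lambda v for each pair:", pairs_ok)
```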
23.6.2 A worked example with a story: the Fibonacci matrix and the golden ratio
It is worth doing one more full computation, on a matrix whose eigenvalues you may already know by another name — because it shows, with no extra machinery, why eigenvalues are the long-run growth rates we will claim they are in §23.8. Recall the Fibonacci numbers $1, 1, 2, 3, 5, 8, 13, 21, \dots$, where each term is the sum of the previous two. We can package the recurrence as a matrix. If the current state is the pair $(F_{n}, F_{n-1})$ — the two most recent Fibonacci numbers — then the next state is $(F_{n+1}, F_n) = (F_n + F_{n-1},\, F_n)$, which is exactly
$$\begin{bmatrix} F_{n+1} \\ F_n \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} F_n \\ F_{n-1} \end{bmatrix}, \qquad A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}.$$
Generating Fibonacci numbers is applying this matrix over and over — precisely the repeated-application setup of §23.8, where we will argue that the largest-$|\lambda|$ eigenvalue governs the long-run behavior. So the growth rate of the Fibonacci sequence is the dominant eigenvalue of $A$. Let us find it. The characteristic equation is
$$\det(A - \lambda I) = \det\begin{bmatrix} 1-\lambda & 1 \\ 1 & -\lambda \end{bmatrix} = (1-\lambda)(-\lambda) - 1 = \lambda^2 - \lambda - 1 = 0.$$
The quadratic formula gives $\lambda = \dfrac{1 \pm \sqrt{5}}{2}$ — and the positive root is the golden ratio $\varphi = \dfrac{1+\sqrt5}{2} \approx 1.618$, with the other root $\psi = \dfrac{1-\sqrt5}{2} \approx -0.618$. The eigenvalues of the Fibonacci matrix are the golden ratio and its conjugate. (Sanity-check with the trace/determinant shortcut: $\operatorname{tr}(A) = 1 = \varphi + \psi$ ✓ and $\det(A) = -1 = \varphi\psi$ ✓, since $\varphi\psi = \frac{(1+\sqrt5)(1-\sqrt5)}{4} = \frac{1-5}{4} = -1$.)
Now read off the consequence. Because $|\varphi| \approx 1.618 > 1 > |\psi| \approx 0.618$, repeated application makes the $\varphi$-eigendirection grow while the $\psi$-eigendirection shrinks toward nothing. After many steps almost all of the state vector lies along the dominant eigenvector, so each step multiplies the vector by roughly $\varphi$. That is exactly the classical fact that consecutive Fibonacci numbers have ratio approaching the golden ratio: $F_{n+1}/F_n \to \varphi$. The famous appearance of $\varphi$ in sunflowers and pinecones is an eigenvalue making itself felt as a growth rate. Let us also find the dominant eigenvector, to complete the eigenpair. Solving $(A - \varphi I)\mathbf{v} = \mathbf{0}$ gives $(1-\varphi)v_1 + v_2 = 0$, so $v_2 = (\varphi - 1)v_1$; taking $v_1 = \varphi$ (a convenient choice using $\varphi - 1 = 1/\varphi$, hence $v_2 = 1$) yields $\mathbf{v} = (\varphi, 1)$. Verify with the defining relation $\varphi^2 = \varphi + 1$:
$$A\begin{bmatrix}\varphi\\1\end{bmatrix} = \begin{bmatrix}\varphi + 1\\\varphi\end{bmatrix} = \begin{bmatrix}\varphi^2\\\varphi\end{bmatrix} = \varphi\begin{bmatrix}\varphi\\1\end{bmatrix}.\quad\checkmark$$
# Verify: the Fibonacci matrix's eigenvalues are the golden ratio and its conjugate.
import numpy as np
A = np.array([[1.0, 1.0], [1.0, 0.0]])
w, V = np.linalg.eig(A)
print("eigenvalues:", w) # [ 1.618034 -0.618034]
print("golden ratio phi:", (1 + 5 ** 0.5) / 2) # 1.618033988749895
# Consecutive Fibonacci ratios approach the dominant eigenvalue phi:
a, b = 1, 1
for _ in range(10):
    a, b = b, a + b  # after 10 steps: a = F_11 = 89, b = F_12 = 144
print("F_12 / F_11 =", b / a) # 1.6179... -> phi
This single example earns three of the chapter's themes together: eigenvalues are roots of a polynomial (the algebra), they are stretch factors of invariant directions (the geometry), and they are the long-run growth rates of a repeated process (the dynamics). The same number, $\varphi$, is all three at once. That convergence of meanings is exactly what the Part V introduction promised, and it is why eigenvalues are worth the climb.
23.7 What is an eigenspace?
We noted in §23.4 that every scalar multiple of an eigenvector is again an eigenvector for the same eigenvalue. That observation has a clean structural name. For a fixed eigenvalue $\lambda$, collect all the vectors $\mathbf{v}$ satisfying $A\mathbf{v} = \lambda\mathbf{v}$ — including, this time, the zero vector. This set is the eigenspace of $\lambda$, sometimes written $E_\lambda$:
$$E_\lambda = \{\mathbf{v} : A\mathbf{v} = \lambda\mathbf{v}\} = N(A - \lambda I).$$
The second equality is the key insight, and we earned it in §23.6: the eigenspace of $\lambda$ is exactly the null space of $A - \lambda I$. That is wonderful news, because null spaces are not loose collections — they are subspaces (Chapter 13). The eigenspace is closed under addition and scalar multiplication: add two vectors of $E_\lambda$ and the sum is again in $E_\lambda$ ($A(\mathbf{u}+\mathbf{v}) = A\mathbf{u} + A\mathbf{v} = \lambda\mathbf{u} + \lambda\mathbf{v} = \lambda(\mathbf{u}+\mathbf{v})$); scale one and the result stays in $E_\lambda$ ($A(c\mathbf{v}) = cA\mathbf{v} = \lambda(c\mathbf{v})$). So the eigenvectors for a given $\lambda$, together with $\mathbf{0}$, form a genuine subspace of $\mathbb{R}^n$.
Why include the zero vector now, when we so carefully excluded it as an eigenvector in §23.4? Because we are describing a subspace, and every subspace must contain $\mathbf{0}$ (Chapter 13's first axiom). The resolution is a matter of precise language, and it is worth getting exactly right: the eigenspace $E_\lambda$ contains the zero vector and is a subspace; the eigenvectors are the nonzero members of that subspace. Both statements are true at once. The eigenspace is the line (or plane, or higher-dimensional flat) of invariant directions plus its origin point; the eigenvectors are the arrows on it.
For our home matrix $\begin{psmallmatrix}2&1\\1&2\end{psmallmatrix}$, the eigenspace $E_3$ is the entire $45°$ line $\{c(1,1) : c \in \mathbb{R}\}$ — a one-dimensional subspace, $\operatorname{span}\{(1,1)\}$. The eigenspace $E_1$ is the entire $135°$ line $\operatorname{span}\{(-1,1)\}$, also one-dimensional. The plane $\mathbb{R}^2$ is, in this case, beautifully split into two perpendicular eigen-lines, and the transformation acts as a simple stretch on each.
Notice how the subspace structure pays off the moment you take a vector that lies in neither eigen-line. Decompose an arbitrary vector, say $(5, -1)$, into its eigen-components. Writing $(5,-1) = a(1,1) + b(-1,1)$ and solving gives $a = 2$, $b = -3$, so $(5,-1) = 2(1,1) - 3(-1,1)$. Now applying $A$ is effortless, because $A$ acts on each eigen-piece by its own eigenvalue:
$$A(5,-1) = 2\cdot A(1,1) - 3\cdot A(-1,1) = 2\cdot 3(1,1) - 3\cdot 1(-1,1) = 6(1,1) - 3(-1,1) = (9, 3).$$
A quick direct check confirms it: $A(5,-1) = (2\cdot5 + 1\cdot(-1),\; 1\cdot5 + 2\cdot(-1)) = (9, 3)$ ✓. We computed the action of $A$ on a generic vector without ever touching the matrix entries — we split the vector along the eigen-axes, stretched each piece by its eigenvalue, and added the results back. This is the whole strategy of eigen-analysis in miniature: in the eigenbasis, a transformation is a list of independent stretches, and any vector's fate is just its components scaled term by term. When the eigenvectors span the space — as they do here — every vector can be handled this way, which is exactly the diagonalizability of Chapter 25.
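The split-stretch-recombine recipe translates directly into numpy. In this sketch, `V` holds the two eigenvectors as columns and `lams` the matching eigenvalues (both names are chosen here for illustration); solving $V\mathbf{c} = \mathbf{x}$ recovers the eigen-coordinates $(a, b)$ found by hand.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])
V = np.array([[1.0, -1.0],
              [1.0,  1.0]])          # columns: eigenvectors (1,1) and (-1,1)
lams = np.array([3.0, 1.0])          # eigenvalues, in the same order
x = np.array([5.0, -1.0])

coords = np.linalg.solve(V, x)       # (a, b) with x = a*(1,1) + b*(-1,1)
Ax_via_eigen = V @ (lams * coords)   # stretch each piece, then recombine

print("eigen-coordinates:", coords)          # a = 2, b = -3
print("A x via eigen-pieces:", Ax_via_eigen)
print("A x directly:", A @ x)
```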
Check Your Understanding — The matrix $A = \begin{psmallmatrix}2&1\\1&2\end{psmallmatrix}$ has eigenpairs $(3,(1,1))$ and $(1,(-1,1))$. Use the eigen-decomposition trick to compute $A^{10}(1,1)$ and $A^{10}(-1,1)$ without multiplying ten matrices. What is $A^{10}(2,2)$?
Answer
Each eigenvector just collects a factor of its eigenvalue once per application, so after ten applications $A^{10}\mathbf{v} = \lambda^{10}\mathbf{v}$. Thus $A^{10}(1,1) = 3^{10}(1,1) = 59049\,(1,1) = (59049, 59049)$, and $A^{10}(-1,1) = 1^{10}(-1,1) = (-1,1)$ (the $\lambda=1$ direction never grows). Since $(2,2) = 2(1,1)$ is on the $\lambda=3$ eigen-line, $A^{10}(2,2) = 3^{10}\cdot(2,2) = (118098, 118098)$. The eigenvalues turned a ten-fold matrix product into a single exponentiation — a preview of why diagonalization (Chapter 25) makes matrix powers trivial.
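If you want to confirm the answer by brute force, a short check with `np.linalg.matrix_power` does the ten multiplications for you. Integer entries are used here so that the arithmetic stays exact.

```python
import numpy as np

A = np.array([[2, 1], [1, 2]])            # integer entries keep powers exact
A10 = np.linalg.matrix_power(A, 10)

r1 = A10 @ np.array([1, 1])               # expect 3**10 * (1, 1)
r2 = A10 @ np.array([-1, 1])              # expect (-1, 1), unchanged
r3 = A10 @ np.array([2, 2])               # expect 3**10 * (2, 2)
print(r1, r2, r3)
```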
23.7.1 Eigenspaces can be more than one-dimensional
In our $2\times2$ examples every eigenspace has been a single line, but eigenspaces can be larger, and the dimension of an eigenspace (called the geometric multiplicity, studied in Chapter 24) turns out to be the crux of when a matrix behaves well. Consider the simplest possible example, a pure scaling by $2$ in the plane:
$$S = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} = 2I.$$
Here $S\mathbf{v} = 2\mathbf{v}$ for every vector $\mathbf{v}$ — every direction is invariant, because uniform scaling does not rotate anything at all. So the eigenspace $E_2$ is the whole plane $\mathbb{R}^2$, a two-dimensional subspace, and the single eigenvalue $2$ "uses up" both dimensions. There is only one eigenvalue, but its eigenspace is as big as it can be. At the opposite extreme, a shear like $\begin{psmallmatrix}1&1\\0&1\end{psmallmatrix}$ also has only one eigenvalue ($\lambda = 1$, repeated) but its eigenspace is a single line — the shear has a deficiency of invariant directions, and such "defective" matrices are the source of the subtleties in Chapter 24 (and the reason Jordan form exists, in Chapter 36).
Math-Major Sidebar — The dimension of the eigenspace $E_\lambda$ is called the geometric multiplicity of $\lambda$; the number of times $\lambda$ appears as a root of the characteristic polynomial is its algebraic multiplicity. A foundational theorem (proved in Chapter 24) states that for every eigenvalue, $1 \le \text{(geometric multiplicity)} \le \text{(algebraic multiplicity)}$. When the two are equal for every eigenvalue, the matrix is diagonalizable (Chapter 25): there are enough independent eigenvectors to form a basis, and the matrix is just a diagonal matrix in disguise. When geometric multiplicity falls short — as for the shear, where the root $\lambda=1$ has algebraic multiplicity $2$ but only a one-dimensional eigenspace — the matrix is defective and cannot be diagonalized. This single inequality, geometric $\le$ algebraic, governs the entire fine structure of Part V. Eigenspaces are also why eigenvectors belonging to distinct eigenvalues are automatically linearly independent: a nonzero vector cannot lie in both $E_3$ and $E_1$, because applying $A$ would have to scale it by $3$ and by $1$ at once.
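Since $E_\lambda = N(A - \lambda I)$, the geometric multiplicity can be read off from a rank: $\dim E_\lambda = n - \operatorname{rank}(A - \lambda I)$. The sketch below applies this to the three matrices discussed so far. The helper `geometric_multiplicity` is a hypothetical name introduced here, and it assumes the eigenvalue is supplied exactly; with a rounded eigenvalue the rank test becomes tolerance-sensitive.

```python
import numpy as np

def geometric_multiplicity(A, lam):
    """dim N(A - lam*I), computed as n - rank(A - lam*I)."""
    n = A.shape[0]
    return n - np.linalg.matrix_rank(A - lam * np.eye(n))

S = np.array([[2.0, 0.0], [0.0, 2.0]])       # uniform scaling, 2I
shear = np.array([[1.0, 1.0], [0.0, 1.0]])   # the defective shear
home = np.array([[2.0, 1.0], [1.0, 2.0]])    # the home matrix

print("2I,    lambda=2:", geometric_multiplicity(S, 2.0))      # 2 (whole plane)
print("shear, lambda=1:", geometric_multiplicity(shear, 1.0))  # 1 (one line)
print("home,  lambda=3:", geometric_multiplicity(home, 3.0))   # 1 (one line)
```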
23.8 Why do eigenvalues reveal what a matrix really does?
We have reached the conceptual payoff, recurring theme #6 of this book, and it deserves a section of its own. Why are eigenvalues and eigenvectors the right way to understand a matrix? Why not just stare at the grid of numbers? The answer has three layers, and together they explain why eigen-analysis is the most powerful single idea in linear algebra.
First: the eigenvectors are the matrix's own coordinate system. A matrix's entries depend on the basis you wrote it in — change coordinates (Chapter 16) and the same transformation gets a completely different grid of numbers. The eigenvectors, by contrast, are intrinsic: they are invariant directions of the transformation itself, not artifacts of your axes. If the eigenvectors supply enough independent directions to serve as axes (as they do for our home matrix), adopting them makes the matrix diagonal: every off-diagonal number, every confusing cross-term, vanishes, and you are left with just the eigenvalues down the diagonal. So eigen-analysis answers "what does this matrix really do, independent of how I happened to write it?" The eigenvalues are the coordinate-free truth; the matrix entries are one coordinate-dependent shadow of it.
Geometric Intuition — Imagine handing the same transformation to ten students, each using different axes. They would write down ten different matrices — different numbers everywhere. But if you asked all ten for the eigenvalues, they would all report the same numbers (say, $3$ and $1$ for our home transformation). The eigenvalues are what all ten matrices have in common — the invariant essence beneath ten different coordinate descriptions. That is the precise sense in which eigenvalues "reveal what a matrix really does": they survive the change of coordinates that scrambles the entries.
Second: eigenvalues control what happens under repetition. Most uses of matrices in science apply them over and over: a population evolves year after year, a Markov chain steps again and again, a web-surfer clicks link after link, a dynamical system marches forward in time. When you apply $A$ repeatedly to an eigenvector, the eigenvalues simply exponentiate: $A^k\mathbf{v} = \lambda^k\mathbf{v}$, because each application multiplies by $\lambda$ once more. So the long-run fate of any vector is dictated by the eigenvalues — directions with $|\lambda| > 1$ explode, directions with $|\lambda| < 1$ die out, and the largest-$|\lambda|$ direction comes to dominate everything. This is why the dominant eigenvector (the one with the biggest $|\lambda|$) is so important: it is the direction that wins in the long run. Hold that phrase; it is the whole idea of PageRank.
This long-run principle is why eigenvalues show up wherever a process repeats. An economist modeling an economy's growth as a year-on-year multiplication by a matrix reads the dominant eigenvalue as the economy's asymptotic growth rate and the dominant eigenvector as the proportions toward which its sectors settle — the Leontief input–output models of mathematical economics are exactly this kind of eigenvalue problem. A demographer's age-structured population (the Leslie matrix) grows in the long run at a rate equal to its dominant eigenvalue, with the stable age distribution given by the corresponding eigenvector, independent of the starting population. Across ecology, economics, and epidemiology, the single number $|\lambda_{\max}|$ separates runaway growth from inevitable decay, and the dominant eigenvector tells you the shape the system relaxes into. One concept, the dominant eigenpair, answers "how fast?" and "what shape?" for every linear process that iterates.
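The "largest $|\lambda|$ wins" claim is easy to watch numerically. This sketch applies the home matrix twenty times to the generic start $(5,-1)$ and reports the per-step growth factor and the final direction. The starting vector and the step count are arbitrary choices made here for illustration.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])
x = np.array([5.0, -1.0])        # generic start: 2*(1,1) - 3*(-1,1)

for _ in range(20):
    x_next = A @ x
    growth = np.linalg.norm(x_next) / np.linalg.norm(x)  # per-step stretch
    x = x_next

direction = x / np.linalg.norm(x)
print("last growth factor:", growth)    # close to the dominant eigenvalue 3
print("final direction:", direction)    # close to (1,1)/sqrt(2)
```

The $\lambda = 1$ component is still present after twenty steps; it has simply been dwarfed by the $\lambda = 3$ component, which has grown by a factor of $3^{20}$.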
Third: eigenvalues are an invariant fingerprint. Two matrices that represent the same transformation in different coordinates (Chapter 16 called them similar matrices, $B = P^{-1}AP$) always share the same eigenvalues — and therefore the same trace and determinant, which are built from the eigenvalues. In fact the trace equals the sum of the eigenvalues and the determinant equals their product (you saw $\operatorname{tr} = a+d$ and $\det = ad-bc$ surface as coefficients of the characteristic polynomial in §23.6; for $\begin{psmallmatrix}4&1\\2&3\end{psmallmatrix}$ we have $\operatorname{tr} = 7 = 5+2$ and $\det = 10 = 5\times2$). These quantities are invariants: they label the transformation itself, not its coordinate representation. The eigenvalues are the most complete such fingerprint.
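The trace and determinant identities can be spot-checked on the matrices already met in this chapter. The sketch below is a numerical check on three examples, not a proof; the dictionary of labelled matrices is just a convenience introduced here.

```python
import numpy as np

matrices = {
    "[[4,1],[2,3]]": np.array([[4.0, 1.0], [2.0, 3.0]]),
    "home [[2,1],[1,2]]": np.array([[2.0, 1.0], [1.0, 2.0]]),
    "Fibonacci [[1,1],[1,0]]": np.array([[1.0, 1.0], [1.0, 0.0]]),
}

checks = []
for name, M in matrices.items():
    w = np.linalg.eigvals(M)
    trace_ok = np.isclose(np.trace(M), w.sum())        # tr = sum of eigenvalues
    det_ok = np.isclose(np.linalg.det(M), w.prod())    # det = product
    checks.append((trace_ok, det_ok))
    print(f"{name}: trace = sum? {trace_ok}; det = product? {det_ok}")
```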
The Key Insight — Eigenvalues and eigenvectors strip a transformation down to its essential action: a set of invariant directions, each with a single stretch factor. Everything else about the matrix — the specific entries, the cross-terms, the apparent complexity — is coordinate-system packaging. The eigen-structure is what is left when you peel the packaging away. This is the deepest reason the subject exists.
23.8.1 Seeing coordinate-invariance with numbers
The claim that "eigenvalues survive a change of coordinates" is easy to state and easy to doubt, so let us watch it happen. Take our home transformation $A = \begin{psmallmatrix}2&1\\1&2\end{psmallmatrix}$ (eigenvalues $3$ and $1$) and rewrite it in a different basis. Chapter 16 taught us that re-expressing a transformation in new coordinates produces a similar matrix $B = P^{-1}AP$, where the columns of $P$ are the new basis vectors. Choose, say, the skewed basis given by $P = \begin{psmallmatrix}1&1\\0&1\end{psmallmatrix}$. Then
$$B = P^{-1}AP = \begin{bmatrix} 1 & 0 \\ 1 & 3 \end{bmatrix}.$$
The entries of $B$ are completely different from those of $A$ — different numbers in every slot. Yet compute its eigenvalues and you get $3$ and $1$ again, the very same pair. Its trace is $1 + 3 = 4 = \operatorname{tr}(A)$ and its determinant is $3 = \det(A)$, both unchanged. Two different grids of numbers, one transformation, one set of eigenvalues. The eigenvalues did not care which basis we used to write the matrix down, because they describe the transformation, not its bookkeeping. This is the precise content of "eigenvalues reveal what a matrix really does."
# Similar matrices (same transformation, different basis) share eigenvalues.
import numpy as np
A = np.array([[2.0, 1.0], [1.0, 2.0]])
P = np.array([[1.0, 1.0], [0.0, 1.0]]) # a change of basis
B = np.linalg.inv(P) @ A @ P
print("B =\n", B) # [[1. 0.] [1. 3.]]
print("eig(A):", np.linalg.eig(A)[0]) # [3. 1.]
print("eig(B):", np.linalg.eig(B)[0]) # [3. 1.] -- identical
And the matrix-power consequence of §23.8 is just as concrete. Because $A\mathbf{v} = 3\mathbf{v}$ for $\mathbf{v} = (1,1)$, applying $A$ five times multiplies by $3$ five times: $A^5(1,1) = 3^5(1,1) = 243\cdot(1,1) = (243, 243)$. No matrix multiplication needed — the eigenvalue did all the work. If instead you adopt the two eigenvectors as your axes (an orthonormal pair here, since the matrix is symmetric — a Chapter 27 gift), the matrix becomes the diagonal matrix $\begin{psmallmatrix}3&0\\0&1\end{psmallmatrix}$ of its eigenvalues, and raising it to the fifth power is just $\begin{psmallmatrix}3^5&0\\0&1^5\end{psmallmatrix} = \begin{psmallmatrix}243&0\\0&1\end{psmallmatrix}$. The tangle of cross-terms that made $A$ hard to iterate has dissolved into independent scalings. That dissolving is diagonalization, the payoff Chapter 25 collects in full — but you can already see why it matters: in the right coordinates, the eigenvalues' coordinates, a matrix is as easy to work with as a list of numbers.
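The same shortcut can be sketched for the whole matrix rather than a single eigenvector. Assuming the eigenvectors form an invertible matrix `V` (true for the home matrix), the fifth power is obtained by raising only the eigenvalues to the fifth power and changing coordinates back. This previews the Chapter 25 factorization and is offered as an illustration rather than a recommended numerical method.

```python
import numpy as np

A = np.array([[2.0, 1.0], [1.0, 2.0]])
w, V = np.linalg.eig(A)                    # eigenvalues and eigenvector columns

# Power the eigenvalues only, then return to the original coordinates.
A5_via_eigen = V @ np.diag(w ** 5) @ np.linalg.inv(V)
A5_direct = np.linalg.matrix_power(A, 5)   # five ordinary multiplications

print(np.round(A5_via_eigen, 8))
print(A5_direct)
print("A^5 (1,1) =", A5_direct @ np.array([1.0, 1.0]))   # 243 * (1, 1)
```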
23.9 Where do eigenvectors take us? PageRank and PCA on the horizon
The reason eigenvalues sit at the center of this book is that the same invariant-direction idea, applied in different fields, becomes some of the most consequential algorithms ever written. Two of them are the destinations of this Part and the next, and both reduce to a single eigenvector. We seed them now so you can watch them grow.
PageRank — the dominant eigenvector that organized the web. We first met Google's PageRank informally back in Chapter 3, as a system of linear equations. Now we can name what it really is. Imagine the entire web as a giant matrix: a column for each page, recording where its links point, scaled so the columns behave like probabilities (a column-stochastic matrix, which we build in Chapter 29). A random surfer clicking links forever is applying this matrix over and over — and by the long-run argument of §23.8, the distribution of where the surfer ends up converges to the matrix's dominant eigenvector, the invariant direction with eigenvalue $\lambda = 1$. That eigenvector, with one number per page, is the PageRank score. Google's original ranking of the entire internet was the computation of a single dominant eigenvector — found by repeatedly applying the matrix, a method called power iteration that you will implement in this chapter's Build Your Toolkit and study fully in Chapter 29. The search engine that organized the web is, at its core, this chapter's idea at planetary scale.
To make the $\lambda = 1$ steady-state idea fully concrete with numbers, take a tiny two-state version. Suppose a metropolitan area is split between a city and its suburbs, and each year $90\%$ of city residents stay while $10\%$ move out, and $20\%$ of suburbanites move in while $80\%$ stay. If $\mathbf{x} = (\text{city share}, \text{suburb share})$ is the population distribution, one year of migration is the matrix
$$P = \begin{bmatrix} 0.9 & 0.2 \\ 0.1 & 0.8 \end{bmatrix}, \qquad \mathbf{x}_{\text{next}} = P\,\mathbf{x},$$
whose columns each sum to $1$ (everyone goes somewhere) — a column-stochastic matrix, exactly the PageRank shape in miniature. Where does the population settle in the long run? At a distribution that no longer changes: a vector $\mathbf{x}^\star$ with $P\mathbf{x}^\star = \mathbf{x}^\star$. But that is the eigen-equation with $\lambda = 1$! The steady state is the eigenvector for eigenvalue $1$. Solving $(P - I)\mathbf{x}^\star = \mathbf{0}$ gives $-0.1\,x_1 + 0.2\,x_2 = 0$, so $x_1 = 2x_2$; normalizing so the shares sum to $1$ yields $\mathbf{x}^\star = (\tfrac23, \tfrac13)$. In the long run two-thirds of people live in the city and one-third in the suburbs, regardless of where they started — and you find that equilibrium by computing a single eigenvector. (Every column-stochastic matrix has $\lambda = 1$ as an eigenvalue, and for a primitive one (the Perron–Frobenius setting of Chapter 29) all other $|\lambda| < 1$, which is exactly why the steady state exists and why power iteration converges to it.) PageRank is this calculation with billions of states instead of two.
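The city-and-suburb steady state can be confirmed two ways in a few lines. The sketch below is illustrative only: the starting split `(0.1, 0.9)` is an arbitrary assumption, chosen to show that the starting point does not matter.

```python
import numpy as np

P = np.array([[0.9, 0.2],
              [0.1, 0.8]])

# Route 1: pick out the eigenvector whose eigenvalue is (numerically) 1.
eigvals, eigvecs = np.linalg.eig(P)
k = np.argmin(np.abs(eigvals - 1.0))
steady = np.real(eigvecs[:, k])
steady = steady / steady.sum()      # rescale so the shares sum to 1

# Route 2: start from an arbitrary split and let the years go by.
x = np.array([0.1, 0.9])            # assumed starting shares (illustrative)
for _ in range(100):
    x = P @ x

print(steady)   # about [0.667, 0.333]
print(x)        # the same distribution, reached by repetition
```

Dividing by the sum also takes care of the sign: numpy may return the eigenvector pointing either way along its line, and the normalization lands on the same shares in both cases.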
PCA — eigenvectors as the natural axes of data. The second destination lives in data science. Given a cloud of data — thousands of points in high-dimensional space — which directions matter most? Principal Component Analysis answers this by forming the data's covariance matrix (a symmetric matrix capturing how the features vary together) and computing its eigenvectors. Those eigenvectors, the principal components, are the invariant directions along which the data varies most; the eigenvalues measure how much variance lies along each. Projecting onto the top few eigenvectors compresses the data to its essential directions while discarding noise — the same "find the natural axes" move we made for a single matrix, now applied to a whole dataset. We develop the full method in Chapter 32, but the engine is exactly the eigenvectors of this chapter. When a recommender system, a face-recognition pipeline, or a gene-expression study "finds the principal directions," it is finding eigenvectors.
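As a preview of that engine, here is a minimal sketch on made-up data: a synthetic two-dimensional cloud that we deliberately stretch along the $(1,1)$ direction, so we know in advance which axis PCA ought to find. The data, the seed, and all variable names are assumptions for illustration; Chapter 32's treatment may organize the computation differently.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed so the run is repeatable

# Synthetic cloud: wide spread along (1, 1), narrow spread along (1, -1).
t = rng.normal(0.0, 3.0, size=500)
s = rng.normal(0.0, 0.5, size=500)
X = (np.outer(t, [1.0, 1.0]) + np.outer(s, [1.0, -1.0])) / np.sqrt(2)

Xc = X - X.mean(axis=0)                   # center each feature
C = np.cov(Xc, rowvar=False)              # 2x2 symmetric covariance matrix

# eigh is numpy's routine for symmetric matrices; eigenvalues come back ascending.
variances, components = np.linalg.eigh(C)
order = np.argsort(variances)[::-1]       # largest variance first
variances, components = variances[order], components[:, order]

top = components[:, 0]                    # first principal component
projected = Xc @ top                      # the cloud compressed to one number per point

print("top direction:", top)              # close to +/-(0.707, 0.707)
print("variance along each axis:", variances)
```

The variance of `projected` equals the top eigenvalue, which is the precise sense in which the eigenvalues measure how much variance lies along each direction.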
Real-World Application — Quantum mechanics: observables are eigenvalue problems. The most profound appearance of eigenvalues in all of science is in quantum theory. Every measurable quantity — energy, momentum, spin — is represented by a matrix (an operator), and the only values you can ever measure are its eigenvalues; the states in which the system has a definite value of that quantity are the corresponding eigenvectors, called eigenstates. When physicists speak of the discrete "energy levels" of an atom, they are naming the eigenvalues of its energy operator, and the word spectrum, which we used in §23.2 for the set of a matrix's eigenvalues, echoes the spectral lines of light an atom emits when it jumps between those energy levels. The deep link between symmetric/Hermitian matrices and real, measurable eigenvalues is the Spectral Theorem of Chapter 27, and it underlies the whole formalism of observables and eigenstates in quantum mechanics. The universe, at its smallest scale, is solving eigen-equations.
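To see "Hermitian matrix, real eigenvalues" in action, the sketch below uses a standard textbook example that does not appear elsewhere in this chapter: the Pauli-Y matrix, one of the $2\times 2$ matrices used to describe spin. Its entries are complex, yet its eigenvalues come out real.

```python
import numpy as np

# Pauli-Y: complex entries, but equal to its own conjugate transpose (Hermitian).
H = np.array([[0.0, -1.0j],
              [1.0j, 0.0]])
assert np.allclose(H, H.conj().T)

# eigh is numpy's routine for Hermitian/symmetric input; it returns real
# eigenvalues in ascending order and orthonormal eigenvectors as columns.
levels, states = np.linalg.eigh(H)
print(levels)                           # [-1.  1.]

# Check the eigen-equation column by column: H @ state = level * state.
residual = H @ states - states * levels
print(np.allclose(residual, 0.0))       # True
```

The two real numbers $-1$ and $+1$ are the only outcomes a measurement of this observable can produce, and the columns of `states` are its eigenstates.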
23.9.1 Build the dominant eigenvector yourself
Before the formal machinery of Chapter 24, you can already compute the most important eigenvector — the dominant one — with an idea so simple it is almost cheating: just apply the matrix over and over. By §23.8, repeated application makes the largest-$|\lambda|$ direction grow fastest, so it eventually swamps all the others. Renormalize after each step to keep the vector from exploding, and you converge to the dominant eigenvector. That is power iteration, and it is your toolkit contribution for this chapter.
Here is why it works, made precise with the eigen-decomposition trick we just practiced. Suppose the eigenvectors span the space, so any starting vector splits as $\mathbf{x}_0 = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n$ along the eigen-directions, and order the eigenvalues so that $|\lambda_1| > |\lambda_2| \ge \cdots \ge |\lambda_n|$ — $\lambda_1$ is strictly the biggest in magnitude, the dominant eigenvalue. Applying $A$ a total of $k$ times scales each piece by its eigenvalue to the $k$-th power:
$$A^k\mathbf{x}_0 = c_1\lambda_1^k\mathbf{v}_1 + c_2\lambda_2^k\mathbf{v}_2 + \cdots + c_n\lambda_n^k\mathbf{v}_n = \lambda_1^k\!\left(c_1\mathbf{v}_1 + c_2\Bigl(\tfrac{\lambda_2}{\lambda_1}\Bigr)^{\!k}\mathbf{v}_2 + \cdots\right).$$
Every ratio $\lambda_j/\lambda_1$ has magnitude less than $1$, so every term except the first decays to zero as $k$ grows. Provided $c_1 \neq 0$ (the start has some component along $\mathbf{v}_1$, which a random start has with probability $1$), after enough steps $A^k\mathbf{x}_0$ points essentially along $\mathbf{v}_1$ — the dominant eigenvector — and the renormalization simply rescales it to unit length so the $\lambda_1^k$ blow-up does not overflow. The dominant direction wins because its eigenvalue out-grows all the others, term by term. That is the entire convergence argument, and it is nothing more than §23.8's repetition principle written out. (It also shows the catch: if the two largest eigenvalues are tied in magnitude, $|\lambda_2| = |\lambda_1|$, the ratio does not shrink and power iteration stalls — a genuine limitation we revisit in Chapter 29.)
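Both halves of that argument can be watched numerically. The sketch below uses a hypothetical helper `iterate` (our own name, not part of the toolkit) on two matrices: the chapter's $\begin{psmallmatrix}4&1\\2&3\end{psmallmatrix}$, where the error should shrink by about $|\lambda_2/\lambda_1| = 2/5$ per step, and a swap matrix whose eigenvalues $+1$ and $-1$ are tied in magnitude.

```python
import numpy as np

def iterate(A, x, steps):
    """Return the normalized power-iteration iterates x_1, ..., x_steps."""
    out = []
    for _ in range(steps):
        x = A @ x
        x = x / np.linalg.norm(x)
        out.append(x)
    return out

# Case 1: eigenvalues 5 and 2, so the error should shrink by ~0.4 each step.
A = np.array([[4.0, 1.0], [2.0, 3.0]])
v1 = np.array([1.0, 1.0]) / np.sqrt(2)            # unit dominant eigenvector
errors = [np.linalg.norm(x - v1) for x in iterate(A, np.array([1.0, 0.0]), 12)]
ratios = [errors[k + 1] / errors[k] for k in range(5, 11)]
print(np.round(ratios, 3))                        # each close to 0.4

# Case 2: eigenvalues +1 and -1 are tied in magnitude, so nothing decays.
S = np.array([[0.0, 1.0], [1.0, 0.0]])
flip = iterate(S, np.array([1.0, 0.0]), 4)
for x in flip:
    print(x)                                      # alternates between [0. 1.] and [1. 0.]
```

In the tied case the iterates bounce between two vectors forever instead of settling, which is the failure mode the parenthetical above warns about.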
Build Your Toolkit — Implement power_iteration(A, num_iters=1000, tol=1e-10) in toolkit/eigen.py. Starting from a random unit vector $\mathbf{x}_0$, repeatedly compute $\mathbf{x}_{k+1} = A\mathbf{x}_k / \lVert A\mathbf{x}_k\rVert$; the iterates converge to the dominant eigenvector, and the Rayleigh quotient $\mathbf{x}^{\mathsf{T}}A\mathbf{x} / (\mathbf{x}^{\mathsf{T}}\mathbf{x})$ converges to its eigenvalue $\lambda$. Write it in pure Python (no numpy in the implementation — only your own toolkit's vector operations from Chapter 2), then verify against np.linalg.eig that you recover the eigenvalue of largest magnitude and a scalar multiple of its eigenvector. On $A = \begin{psmallmatrix}4&1\\2&3\end{psmallmatrix}$ your routine should return $\lambda \approx 5$ and a vector along $(1,1)$. This is the seed of qr_algorithm (Chapter 29) and the literal engine of PageRank.
Here is a numpy sketch of the idea (the real toolkit version is pure Python), with output you can verify against the by-hand eigenvalue $5$ we found in §23.6:
# Power iteration: find the dominant eigenpair by repeated application of A.
import numpy as np
A = np.array([[4.0, 1.0], [2.0, 3.0]])
x = np.array([1.0, 0.0]) # any nonzero start
for _ in range(50):
    x = A @ x
    x = x / np.linalg.norm(x) # renormalize each step
lam = x @ (A @ x) / (x @ x) # Rayleigh quotient -> eigenvalue
print("dominant eigenvector:", x) # [0.707107 0.707107] (the (1,1) line)
print("dominant eigenvalue: ", lam) # 5.0
print("numpy eigenvalues: ", np.linalg.eig(A)[0]) # [5. 2.]
The iterate locks onto $(0.7071, 0.7071)$ — our $(1,1)$ eigen-line — and the Rayleigh quotient converges to $5.0$, matching both our hand computation and np.linalg.eig. You have just computed an eigenvector with nothing but repeated matrix multiplication. Scale this up to a matrix with one row and column per web page, and you have the core of Google's original algorithm.
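For comparison, here is one possible shape for the pure-Python toolkit version, offered as a sketch and not as the reference solution. The helpers `mat_vec` and `dot` are stand-ins for whatever your Chapter 2 toolkit calls its vector operations, and the extra `seed` parameter is our own addition so that the random start is repeatable.

```python
import math
import random

def mat_vec(A, x):
    """Multiply a matrix (list of rows) by a vector (list of numbers)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(u, v):
    """Dot product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def power_iteration(A, num_iters=1000, tol=1e-10, seed=0):
    """Estimate the dominant eigenpair (lam, x) of a square matrix A."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in A]          # random start, one entry per row
    length = math.sqrt(dot(x, x))
    x = [xi / length for xi in x]
    lam = dot(x, mat_vec(A, x))                   # Rayleigh quotient (x has unit length)
    for _ in range(num_iters):
        y = mat_vec(A, x)
        length = math.sqrt(dot(y, y))
        if length == 0.0:
            raise ValueError("iterate collapsed to zero; try a different start")
        x = [yi / length for yi in y]             # renormalize each step
        new_lam = dot(x, mat_vec(A, x))
        converged = abs(new_lam - lam) < tol      # stop once the estimate settles
        lam = new_lam
        if converged:
            break
    return lam, x

lam, x = power_iteration([[4.0, 1.0], [2.0, 3.0]])
print(lam)    # close to 5
print(x)      # close to +/-(0.7071, 0.7071)
```

Stopping when the Rayleigh quotient changes by less than `tol` is one reasonable convergence test among several; comparing successive iterates is another, though it needs care with sign flips when the dominant eigenvalue is negative.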
23.10 Putting it together: what to carry forward
Stand back and look at what this chapter built, because Part V will lean on all of it. We began with a picture — most arrows rotate under a matrix, a few hold their line — and we named those few the invariant directions, the eigenvectors, whose stretch factors are the eigenvalues. We wrote the picture as the eigen-equation $A\mathbf{v} = \lambda\mathbf{v}$, learned to read $\lambda$ as a stretch factor (grow, shrink, fix, flip, collapse), understood that an eigenvector is really a direction so scaling it changes nothing, and watched the whole story unfold in the visualizer as the dashed eigen-lines that stayed on themselves. We saw eigenspaces as the null spaces $N(A - \lambda I)$ — honest subspaces — and we took a first, geometry-first look at finding eigenvalues by demanding $\det(A - \lambda I) = 0$, a condition we derived rather than memorized. We met the honest caveat that rotations have no real eigenvectors, and we glimpsed the destinations: PageRank's dominant eigenvector and PCA's principal components, both reachable by power iteration.
Notice, too, how this chapter rhymes with the four fundamental subspaces of Part III (recurring theme #5). The eigenspace of $\lambda$ is a null space, $N(A - \lambda I)$ — so the entire apparatus of column spaces, null spaces, rank, and nullity that we built in Chapters 13 and 14 is exactly what we use to find and describe eigenvectors. The eigenvalue $0$ is special precisely because its eigenspace is the ordinary null space $N(A)$, tying "zero is an eigenvalue" to "the matrix is singular." Eigen-analysis is not a new continent; it is the four-subspaces framework applied to the shifted matrices $A - \lambda I$, one shift per eigenvalue. The structure you already know reorganizes itself around the invariant directions.
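The claim that an eigenspace is just a null space can be made literal. The sketch below computes $N(A - \lambda I)$ numerically from the singular value decomposition; the function name `eigenspace_basis` and its tolerance are our own assumptions, and the by-hand row reduction of Chapters 13 and 14 gives the same subspace.

```python
import numpy as np

def eigenspace_basis(A, lam, tol=1e-10):
    """Orthonormal basis (as columns) for the null space N(A - lam*I)."""
    M = A - lam * np.eye(A.shape[0])
    _, s, Vt = np.linalg.svd(M)
    # Rows of Vt whose singular value is (numerically) zero span the null space.
    return Vt[s < tol].T

A = np.array([[4.0, 1.0], [2.0, 3.0]])
print(eigenspace_basis(A, 5.0))           # one column along the (1, 1) line
print(eigenspace_basis(A, 4.0).shape)     # (2, 0): 4 is not an eigenvalue, so only 0 solves it

# Eigenvalue 0 means the eigenspace is the ordinary null space N(Z): Z is singular.
Z = np.array([[1.0, 2.0], [2.0, 4.0]])
print(eigenspace_basis(Z, 0.0))           # one column along the (2, -1) line
```

An empty basis is the numerical signature of "not an eigenvalue": $A - \lambda I$ is invertible, so its null space contains only the zero vector.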
Above all, we earned recurring theme #6: eigenvalues and eigenvectors reveal what a matrix really does — its essential action, freed of coordinate-system artifacts. A matrix is a transformation; its eigenvectors are the transformation's own natural axes; its eigenvalues are what it does along them. Everything in the rest of Part V is the working-out of this single idea. Chapter 24 gives the reliable method for finding eigenvalues and confronts the subtlety of repeated roots (the algebraic-versus-geometric multiplicity we previewed with the shear). Chapter 25 cashes in the eigenvectors to diagonalize a matrix and make its powers trivial. Chapter 26 rescues the rotations that seemed to have no eigenvectors at all. Chapter 27 reveals the special grace of symmetric matrices, whose eigenvectors come out perpendicular. And Chapter 29 turns the dominant eigenvector loose on the entire web. You now hold the key idea of the heart of the book; turn the page and we make it precise.