50 min read

> Learning paths. Math majors — read everything, especially the convergence argument for the series in §37.2, the derivation of $\frac{d}{dt}e^{At}=Ae^{At}$ in §37.4, and the defective-case construction via Jordan form in §37.6; this chapter is...

Prerequisites

  • chapter-25-diagonalization
  • chapter-26-complex-eigenvalues

Learning Objectives

  • Define the matrix exponential e^{At} as the convergent power series sum (At)^k/k! and explain why the series always converges for any square matrix.
  • Compute e^{At} for a diagonalizable matrix via e^{At}=Pe^{Dt}P^{-1}, and for a defective matrix via the Jordan form, recognizing where the polynomial-times-exponential terms come from.
  • Solve the linear system x'=Ax with the single formula x(t)=e^{At}x(0), and verify it both by hand (eigenvector expansion) and against scipy's expm and solve_ivp.
  • Read the long-run fate of a linear dynamical system from the eigenvalues of A: decay when all Re(lambda)<0, growth when any Re(lambda)>0, oscillation when eigenvalues are complex.
  • Classify the phase portrait of a 2x2 system — node, saddle, spiral, center — from the signs and types of the eigenvalues, and use the trace-determinant plane as a shortcut.
  • State precisely why e^{A+B} need not equal e^A e^B, and identify the commuting condition AB=BA under which it does.

Linear Algebra Meets Calculus: The Matrix Exponential, Systems of Differential Equations, and Stability

Learning paths. Math majors — read everything, especially the convergence argument for the series in §37.2, the derivation of $\frac{d}{dt}e^{At}=Ae^{At}$ in §37.4, and the defective-case construction via Jordan form in §37.6; this chapter is where the eigenvalue theory of Part V and the Jordan form of Chapter 36 cash out as the qualitative theory of differential equations. CS / Data Science — focus on the Geometric Intuition callouts, the closed-form solution in §37.4, the eigenvalue-stability dictionary in §37.7, and the scipy code; the convergence proof is optional, but the phase-portrait classification is not. Physics / Engineering — focus on the phase portraits of §37.8, the stability conditions, and the two case studies (coupled tanks and the RLC circuit); the matrix exponential is the linear-systems tool you will reach for constantly. This chapter assumes diagonalization $A=PDP^{-1}$ from Chapter 25, complex eigenvalues and their real canonical form from Chapter 26, and — for the hardest case — the Jordan normal form of Chapter 36.

Every chapter of Part VII has taken a comfortable idea and asked what happens when we push it further. This one pushes the eigenvalue. In Part V we learned that the eigenvalues of a matrix reveal what it really does — its essential action, stripped of coordinate artifacts. We used them to take powers ($A^k = PD^kP^{-1}$, Chapter 25), to rank web pages (Chapter 29), to diagnose curvature (Chapter 28). Now we let time flow continuously, and the eigenvalues do something they have only hinted at before: they predict the future. Given a system whose state changes according to $\mathbf{x}' = A\mathbf{x}$ — a population, a circuit, a chemical mixture, a vibrating structure — the eigenvalues of $A$ decide everything about its fate. Whether the system settles to rest, blows up, or oscillates forever is written in the signs of their real parts and the presence of their imaginary parts. This is recurring theme #6 of the book, eigenvalues reveal what a matrix really does, reaching its most dramatic statement: eigenvalues become prophecy.

The bridge that carries us there is one object, and it is the matrix exponential, written $e^{At}$. You already know its one-dimensional ancestor intimately. The simplest differential equation in all of science is $x' = ax$ — "the rate of change is proportional to the amount present" — and its solution, the one fact everyone remembers from calculus, is $x(t) = e^{at}x(0)$. Growth when $a>0$, decay when $a<0$, the exponential curve that models radioactive decay and compound interest and bacterial growth. The astonishing claim of this chapter — the matrix exponential explained in a single sentence — is that the exact same formula solves the matrix system $\mathbf{x}' = A\mathbf{x}$, once we make sense of "$e$ to the power of a matrix." The solution is $\mathbf{x}(t) = e^{At}\mathbf{x}(0)$, and it is not an analogy. It is literally the same object: the exponential of a linear operator, defined by the same Taylor series, behaving by the same rules. To study a system of differential equations with linear algebra is to compute one matrix exponential and read its eigenvalues.

True to the book's method, we begin with the picture. Before any series, before any formula, we will look at what a linear system does to the plane — the swirling, expanding, or collapsing flow of trajectories called a phase portrait — and let that picture tell us what the eigenvalues must be. Then we define $e^{At}$, prove it solves the system, compute it three ways (diagonalization for the clean case, Jordan form for the defective case of Chapter 36, the raw series when all else fails), and finally read stability straight off the spectrum. The anchor for the whole chapter is a single linear dynamical system and its phase portrait: we will plot its trajectories, classify it by its eigenvalues, and watch the abstract theory predict every curve.

37.1 What does a system of differential equations look like geometrically?

Let us begin with the picture, because the book always begins with the picture. Forget formulas for a moment and imagine a fluid flowing across the plane. At every point $\mathbf{x} = (x_1, x_2)$ there is a little arrow telling a particle sitting there which way to move and how fast. A particle dropped into the fluid traces out a curve, following the arrows — speeding up where they are long, turning where they bend. That is exactly what a system of differential equations is. The equation $\mathbf{x}' = A\mathbf{x}$ assigns to each point $\mathbf{x}$ a velocity vector $A\mathbf{x}$, and a solution is the path a particle takes when it always moves with the prescribed velocity.

A linear dynamical system is the special, beautiful case where that velocity field is linear: the arrow at $\mathbf{x}$ is simply the matrix $A$ applied to $\mathbf{x}$. So the velocity field is the very transformation we have studied for thirty chapters — but now reinterpreted as a flow rather than a single snapshot. The matrix $A$ stops being "where does $\mathbf{x}$ land?" and becomes "which way, and how fast, does $\mathbf{x}$ move?" Every geometric fact you know about $A$ — its eigenvectors, its determinant, its rotation-or-stretch character — turns into a fact about the flow.

Geometric Intuition — A linear system $\mathbf{x}'=A\mathbf{x}$ is a velocity field on the plane: an arrow $A\mathbf{x}$ planted at each point $\mathbf{x}$. Picture iron filings showing the lines of a magnetic field, except here the field tells particles how to move, not how to align. A solution curve, called a trajectory, is the streamline a particle follows. The whole family of trajectories — the complete flow pattern — is the phase portrait. Reading a phase portrait is reading the matrix $A$ as motion. And the special directions of that motion, the ones along which the flow points straight in or straight out without turning, are exactly the eigenvectors of $A$ (Chapter 23). The eigenvectors are the "grain" of the flow.

The origin holds a privileged place. Since $A\mathbf{0} = \mathbf{0}$, the velocity at the origin is zero: a particle placed exactly there never moves. We call such a point an equilibrium (or fixed point, or steady state) of the system — a state that, once reached, persists forever. For a linear system the origin is always an equilibrium, and (when $A$ is invertible) the only one. The central question of this entire chapter is about that equilibrium: is it stable? If we nudge the system slightly away from the origin, does it drift back, run away, or circle forever? That single question — stability — is what the eigenvalues will answer.

The Key Insight — A system of differential equations $\mathbf{x}'=A\mathbf{x}$ is a geometry problem in disguise: it asks how the linear transformation $A$, read as a velocity field, moves points through time. The trajectories are streamlines of that field, the origin is the equilibrium, and the question "what happens in the long run?" is the question "do trajectories approach the origin, flee it, or orbit it?" We will answer it without solving a single hard integral — purely from the eigenvalues of $A$.

FAQ: How is a system of differential equations different from one equation, and why use a matrix?

A single differential equation like $x' = ax$ describes one quantity changing over time in isolation. But the world is coupled: the rate at which a predator population grows depends on how many prey there are; the current through an inductor depends on the voltage across a capacitor; the amount of salt in tank A depends on how much flows in from tank B. When several quantities each change at a rate that depends on all of them, we get a system: $$ x_1' = a_{11}x_1 + a_{12}x_2, \qquad x_2' = a_{21}x_1 + a_{22}x_2. $$ Stack the quantities into a state vector $\mathbf{x} = (x_1, x_2)$ and the coefficients into a matrix $A = [a_{ij}]$, and the entire coupled system collapses into one clean line, $\mathbf{x}' = A\mathbf{x}$. This is the same compression we performed for systems of algebraic equations back in Chapter 3 — many scalar equations becoming one matrix equation $A\mathbf{x} = \mathbf{b}$. The payoff is identical: once it is a matrix equation, the full machinery of linear algebra applies. The matrix is not a notational convenience; it is the object whose eigenvalues control the answer.

37.2 What is the matrix exponential, and why does the series converge?

Now the picture demands a tool. To solve $\mathbf{x}' = A\mathbf{x}$ — to find the trajectory through a given starting point — we need a matrix version of $e^{at}$. So we ask the most natural question in mathematics: the scalar exponential has a famous power series, $$ e^{x} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots = \sum_{k=0}^{\infty} \frac{x^k}{k!}, $$ a series that converges for every real or complex $x$. What if we feed it a matrix? Everything in that series makes sense for a square matrix $A$: we can take powers $A^k$ (matrix multiplication, Chapter 8), scale them by $1/k!$, and add them up. So we simply define the matrix exponential by the same series.

For any square matrix $A$, the matrix exponential is $$ e^{A} = I + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots = \sum_{k=0}^{\infty} \frac{A^k}{k!}. $$ The constant term is $I$ (the matrix analogue of $1 = x^0$, since $A^0 = I$), and the result is another matrix of the same size. When we want the exponential to depend on time, we replace $A$ by $At$ and write $$ e^{At} = I + At + \frac{(At)^2}{2!} + \frac{(At)^3}{3!} + \cdots = \sum_{k=0}^{\infty} \frac{A^k t^k}{k!}. $$ This $e^{At}$ — a matrix-valued function of the scalar $t$ — is the central object of the chapter. At $t=0$ it equals $I$ (every term but the first vanishes), and as $t$ varies it traces out a path through the space of matrices.

Geometric Intuition — The scalar $e^{at}$ describes pure exponential stretching of a number line: each instant multiplies the value by a factor slightly more than 1 (if $a>0$) or less than 1 (if $a<0$). The matrix $e^{At}$ does the same thing to all of space at once, but the stretching factor can differ along different directions, and space can rotate while it stretches. Picture the unit square of the recurring visualizer from Chapter 1: $e^{At}$ for a fixed $t$ is a single transformation matrix, deforming that square. As $t$ increases from $0$, the square starts as itself ($e^{A\cdot 0}=I$) and smoothly morphs — expanding along expanding eigendirections, shrinking along shrinking ones, spinning if there is rotation. The matrix exponential is "continuous transformation in time," and its eigenvalues set the rates.

But a definition by infinite series is worthless if the series does not converge. For numbers, $\sum x^k/k!$ converges because the factorials in the denominator eventually crush the powers in the numerator. We need the same guarantee for matrices, and it requires only one idea from Chapter 18: the norm of a matrix measures its size, and the size of a product is no bigger than the product of the sizes.

The Key Insight — The matrix exponential series $\sum A^k t^k / k!$ converges for every square matrix $A$ and every $t$, with no conditions at all. So $e^{At}$ always exists and is always a single well-defined matrix. This is what makes it a universal tool: unlike diagonalization, which can fail (the defective case of Chapter 24), the exponential never fails to be defined. We may struggle to compute it, but it is always there.

Here is why convergence is automatic. Pick any matrix norm $\lVert\cdot\rVert$ that is submultiplicative, meaning $\lVert AB\rVert \le \lVert A\rVert\,\lVert B\rVert$ (the operator norm of Chapter 18 is one such; the Frobenius norm is another). Then $\lVert A^k\rVert \le \lVert A\rVert^k$, and so the size of the $k$-th term of the series is bounded: $$ \left\lVert \frac{A^k t^k}{k!} \right\rVert \le \frac{\lVert A\rVert^k\, |t|^k}{k!} = \frac{(\lVert A\rVert\,|t|)^k}{k!}. $$ The right-hand side is the general term of the ordinary scalar exponential series for $e^{\lVert A\rVert |t|}$, which we know converges. Since the sizes of our matrix terms are dominated, term by term, by a convergent series of positive numbers, the matrix series converges absolutely. (The technical name is the comparison test, applied entrywise; the deeper fact is that the space of $n\times n$ matrices is complete — every absolutely convergent series converges — the same completeness idea that defined Hilbert space in Chapter 34.)

Math-Major Sidebar. "Converges" here means in the matrix norm, equivalently entry by entry: the $(i,j)$ entry of the partial sum $\sum_{k=0}^{N} A^k t^k / k!$ converges to the $(i,j)$ entry of $e^{At}$ as $N\to\infty$. Because all norms on the finite-dimensional space $\mathbb{R}^{n\times n}$ are equivalent (Chapter 35's abstract-space ideas guarantee this), convergence in any norm is convergence in every norm, and is the same as entrywise convergence. The bound $\lVert e^{At}\rVert \le e^{\lVert A\rVert |t|}$ that falls out of the proof is itself useful: it tells you the exponential cannot grow faster than the scalar exponential of the matrix's size. Completeness of $\mathbb{R}^{n\times n}$ — every Cauchy sequence of matrices converges to a matrix — is what lets "absolutely convergent" upgrade to "convergent," exactly as for real numbers.

It is worth pausing on why we are entitled to manipulate this infinite sum as freely as we will — factoring matrices out of it in §37.3, differentiating it term by term in §37.4. Absolute convergence is precisely the license for both. A series that converges absolutely may be rearranged, regrouped, and (when its terms are differentiable functions of a parameter, uniformly on compact sets) differentiated term by term, with the result still converging to the derivative of the sum. The matrix series clears this bar on every bounded interval of $t$, because the dominating scalar series $\sum (\lVert A\rVert |t|)^k/k!$ does. So every formal manipulation we perform on $e^{At}$ in this chapter — treating it like the familiar scalar exponential it generalizes — is rigorously justified, not merely suggestive. This is the recurring discipline of the book: we earn the right to compute, rather than assuming it.

Historical Note. The idea of substituting a matrix into a power series traces to the founders of matrix theory in the mid-nineteenth century. Arthur Cayley, who introduced matrix algebra in his Memoir on the Theory of Matrices (1858), already considered functions of matrices, and the systematic theory of functions of a matrix — including the exponential — was developed over the following decades, with the Cayley–Hamilton theorem (every matrix satisfies its own characteristic equation, Chapter 24) providing the key that any power series in a matrix collapses to a polynomial in it. The use of the matrix exponential to solve linear systems of differential equations became standard through the work on continuous groups by Sophus Lie in the 1880s, for whom $e^{At}$ is the one-parameter group generated by $A$. [verify] (The precise lines of development are tangled across Cayley, Sylvester, Lie, and later Weyr and Frobenius; the broad arc — Cayley's matrix algebra, functions of matrices via Cayley–Hamilton, Lie's one-parameter groups — is reliable, the exact attributions and dates approximate.)

A first computation: the exponential of a diagonal matrix

The series pays an immediate dividend on the easiest possible matrix. Let $D = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}$ be diagonal. Powers of a diagonal matrix are trivial — you just power the diagonal entries — so $D^k = \begin{bmatrix} \lambda_1^k & 0 \\ 0 & \lambda_2^k \end{bmatrix}$. Plug this into the series and the two diagonal slots decouple completely: $$ e^{Dt} = \sum_{k=0}^{\infty} \frac{t^k}{k!}\begin{bmatrix} \lambda_1^k & 0 \\ 0 & \lambda_2^k \end{bmatrix} = \begin{bmatrix} \sum_k \frac{(\lambda_1 t)^k}{k!} & 0 \\ 0 & \sum_k \frac{(\lambda_2 t)^k}{k!} \end{bmatrix} = \begin{bmatrix} e^{\lambda_1 t} & 0 \\ 0 & e^{\lambda_2 t} \end{bmatrix}. $$ The exponential of a diagonal matrix is the diagonal matrix of the exponentials. Each diagonal entry follows its own scalar exponential, governed by its own $\lambda_i$. This is the entire chapter in miniature: along the special directions, the matrix exponential is just a collection of independent scalar exponentials $e^{\lambda_i t}$, and the $\lambda_i$ are the eigenvalues. Everything that follows is about reducing the general case to this one.

Check Your Understanding — Compute $e^{At}$ for $A = \begin{bmatrix} 2 & 0 \\ 0 & -1 \end{bmatrix}$, and describe what it does to a starting vector as $t$ grows.

Answer $A$ is diagonal, so $e^{At} = \begin{bmatrix} e^{2t} & 0 \\ 0 & e^{-t} \end{bmatrix}$. Applied to $\mathbf{x}(0) = (x_1, x_2)$, it gives $\mathbf{x}(t) = (e^{2t}x_1,\, e^{-t}x_2)$. The first coordinate grows exponentially (eigenvalue $+2$), the second decays to zero (eigenvalue $-1$). A particle is flung outward along the $x_1$-axis while being pulled in toward it along the $x_2$-axis — the signature of a saddle, which we will meet again in §37.8. At $t=1$, numpy/scipy give $\begin{bmatrix} e^2 & 0 \\ 0 & e^{-1}\end{bmatrix} = \begin{bmatrix} 7.389056 & 0 \\ 0 & 0.367879\end{bmatrix}$, matching scipy.linalg.expm.

37.3 How do you compute the matrix exponential by diagonalization?

The diagonal case was easy. The deep idea of Chapter 25 was that a diagonalizable matrix is "secretly diagonal" — it is a diagonal matrix wearing a change-of-basis costume, $A = PDP^{-1}$, where the columns of $P$ are the eigenvectors and $D$ holds the eigenvalues. That costume passes straight through the exponential, and it is the single most useful computational fact in the chapter.

Suppose $A = PDP^{-1}$ is diagonalizable, as in Chapter 25. Then for any power $k$, the inner factors telescope: $A^k = (PDP^{-1})(PDP^{-1})\cdots(PDP^{-1}) = PD^kP^{-1}$, because each adjacent $P^{-1}P$ collapses to $I$. Substitute this into the exponential series and factor the constant matrices $P$ and $P^{-1}$ out of the sum (they do not depend on $k$): $$ e^{At} = \sum_{k=0}^{\infty} \frac{A^k t^k}{k!} = \sum_{k=0}^{\infty} \frac{PD^k P^{-1} t^k}{k!} = P\left(\sum_{k=0}^{\infty} \frac{D^k t^k}{k!}\right)P^{-1} = P\,e^{Dt}\,P^{-1}. $$ And we just computed $e^{Dt}$ — it is the diagonal matrix of $e^{\lambda_i t}$. So the recipe is complete:

$$ \boxed{\,A = PDP^{-1} \quad\Longrightarrow\quad e^{At} = P\,e^{Dt}\,P^{-1} = P\begin{bmatrix} e^{\lambda_1 t} & & \\ & \ddots & \\ & & e^{\lambda_n t} \end{bmatrix}P^{-1}.\,} $$

This is the matrix-exponential twin of the formula $A^k = PD^kP^{-1}$ for powers, and the logic is identical — diagonalize, do the easy thing to the eigenvalues, change back. Where powers raised each $\lambda_i$ to the $k$, the exponential exponentiates each $\lambda_i t$. The eigenvalues are exactly the rates of the independent exponential modes.

Geometric Intuition — Read $e^{At} = P\,e^{Dt}\,P^{-1}$ right-to-left as three geometric steps, exactly as in Chapter 25. $P^{-1}$ rewrites your vector in the eigenvector coordinate system (the "natural" axes of the flow). $e^{Dt}$ then stretches each eigen-coordinate independently — by $e^{\lambda_i t}$, growing or shrinking at its own eigenvalue's rate. Finally $P$ translates back to standard coordinates. So evolving a linear system in time is: change to eigen-coordinates, let each mode exponentiate on its own schedule, change back. The eigenvectors are the directions along which the flow is pure stretch with no mixing — the grain of the wood — and the eigenvalues are how fast each one grows or decays.

Hand computation: the matrix exponential of a 2×2 matrix

Let us run the recipe on the matrix that will anchor this chapter: $$ A = \begin{bmatrix} -2 & 1 \\ 1 & -2 \end{bmatrix}. $$ Step 1 — eigenvalues. The characteristic polynomial (Chapter 24) is $\det(A - \lambda I) = (-2-\lambda)^2 - 1 = \lambda^2 + 4\lambda + 3 = (\lambda+1)(\lambda+3)$, so the eigenvalues are $\lambda_1 = -1$ and $\lambda_2 = -3$. Both are negative. Pause on that; it is the whole reason this system will turn out to be stable.

Step 2 — eigenvectors. For $\lambda_1 = -1$: solve $(A + I)\mathbf{v} = \mathbf{0}$, i.e. $\begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix}\mathbf{v} = \mathbf{0}$, giving $\mathbf{v}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$. For $\lambda_2 = -3$: solve $\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\mathbf{v} = \mathbf{0}$, giving $\mathbf{v}_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$. These are orthogonal — no accident, since $A$ is symmetric and the Spectral Theorem of Chapter 27 guarantees orthogonal eigenvectors.

Step 3 — assemble. With $P = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$ and $D = \begin{bmatrix} -1 & 0 \\ 0 & -3 \end{bmatrix}$, we have $P^{-1} = \tfrac12\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}$, and $$ e^{At} = P\begin{bmatrix} e^{-t} & 0 \\ 0 & e^{-3t} \end{bmatrix}P^{-1} = \frac12\begin{bmatrix} e^{-t}+e^{-3t} & e^{-t}-e^{-3t} \\ e^{-t}-e^{-3t} & e^{-t}+e^{-3t} \end{bmatrix}. $$ Multiply it out and check at $t=0$: the diagonal entries become $\tfrac12(1+1)=1$ and the off-diagonal $\tfrac12(1-1)=0$, so $e^{A\cdot 0}=I$, as it must. At $t=1$, using $e^{-1}=0.367879$ and $e^{-3}=0.049787$, the diagonal entries are $\tfrac12(0.367879+0.049787)=0.208833$ and the off-diagonal $\tfrac12(0.367879-0.049787)=0.159046$. Hold those numbers; we verify them against scipy next.

numpy verification

# Matrix exponential by diagonalization, e^{At} = P e^{Dt} P^{-1}, vs scipy.linalg.expm.
import numpy as np
from scipy.linalg import expm                 # the reference implementation
A = np.array([[-2., 1.], [1., -2.]])
lam, P = np.linalg.eig(A)                      # eigenvalues, eigenvectors as columns
print(np.round(lam, 6))                        # [-1. -3.]  (numpy may order them either way)
t = 1.0
E_diag = P @ np.diag(np.exp(lam * t)) @ np.linalg.inv(P)   # P e^{Dt} P^{-1}
print(np.round(E_diag.real, 6))
# [[0.208833 0.159046]
#  [0.159046 0.208833]]
print(np.round(expm(A * t), 6))                # scipy: same matrix
# [[0.208833 0.159046]
#  [0.159046 0.208833]]
print(np.max(np.abs(E_diag.real - expm(A * t))))   # ~1e-15: agree to machine precision

The hand result and both code paths agree: $e^{A\cdot 1} = \begin{bmatrix} 0.208833 & 0.159046 \\ 0.159046 & 0.208833 \end{bmatrix}$, with the diagonalization formula and scipy's expm differing only at the level of floating-point round-off ($\sim 10^{-15}$). The .real is there because np.linalg.eig returns complex arrays by default even for real eigenvalues — a Computational Note we return to in §37.5.

Computational Notenp.linalg.eig returns eigenvalues in no guaranteed order, and the eigenvectors as the columns of P (remember: numpy indexes from 0, so P[:,0] is the eigenvector for lam[0]). It also returns complex dtype even when the eigenvalues are real, hence the .real cleanup. For actually computing $e^{At}$ in production, do not hand-roll the diagonalization — call scipy.linalg.expm, which uses a numerically robust scaling-and-squaring method (Chapter 38) that works even when $A$ is defective or nearly so, where the eigenvector matrix $P$ is ill-conditioned and the diagonalization formula quietly loses accuracy.

37.4 Why does $\mathbf{x}(t)=e^{At}\mathbf{x}(0)$ solve the system?

We have a tool, $e^{At}$, and a claim: that it solves $\mathbf{x}' = A\mathbf{x}$. Now we earn the claim. This is the theorem that justifies the entire chapter, and its proof is a single beautiful line — the matrix exponential was built to make it work.

Why we care. This theorem reduces every linear system of differential equations — any size, any coupling — to one formula, $\mathbf{x}(t)=e^{At}\mathbf{x}(0)$. No clever integrating factors, no guessing solution forms, no case analysis. Compute one matrix exponential and you have the trajectory through any starting point at once. It is the differential-equations analogue of "$A^{-1}\mathbf{b}$ solves $A\mathbf{x}=\mathbf{b}$" — a closed-form solution that turns a calculus problem into a linear-algebra computation.

Key idea. Differentiating the matrix exponential gives back $A$ times itself, $\frac{d}{dt}e^{At} = A\,e^{At}$ — exactly the scalar fact $\frac{d}{dt}e^{at}=a\,e^{at}$, with the matrix riding along. So $e^{At}\mathbf{x}(0)$ differentiates to $A$ times itself, which is precisely what $\mathbf{x}'=A\mathbf{x}$ demands.

Proof. First we establish the key derivative fact. Differentiate the series for $e^{At}$ term by term in $t$ — legitimate because the series converges uniformly on bounded $t$-intervals (§37.2): $$ \frac{d}{dt}e^{At} = \frac{d}{dt}\left(I + At + \frac{A^2 t^2}{2!} + \frac{A^3 t^3}{3!} + \cdots\right) = A + A^2 t + \frac{A^3 t^2}{2!} + \cdots. $$ Factor $A$ out of every surviving term (it is a constant matrix, so it pulls through the derivative and the sum): $$ \frac{d}{dt}e^{At} = A\left(I + At + \frac{A^2 t^2}{2!} + \cdots\right) = A\,e^{At}. $$ (We could equally factor $A$ out on the right, giving $\frac{d}{dt}e^{At}=e^{At}A$; since every term is a power of $A$, the exponential commutes with $A$.) Now let $\mathbf{x}(t) = e^{At}\mathbf{x}(0)$, where $\mathbf{x}(0)$ is a constant vector. Differentiate, using the product rule with the constant $\mathbf{x}(0)$: $$ \mathbf{x}'(t) = \frac{d}{dt}\big(e^{At}\big)\mathbf{x}(0) = A\,e^{At}\mathbf{x}(0) = A\,\mathbf{x}(t). $$ So $\mathbf{x}(t)$ satisfies the differential equation. And at $t=0$, $\mathbf{x}(0) = e^{A\cdot 0}\mathbf{x}(0) = I\,\mathbf{x}(0) = \mathbf{x}(0)$, so it satisfies the initial condition. Existence is proved. Uniqueness — that this is the only solution — follows from the standard existence-and-uniqueness theorem for linear ODEs (a Lipschitz-continuity argument); intuitively, the velocity field $A\mathbf{x}$ assigns exactly one direction to each point, so a trajectory cannot branch. $\blacksquare$

What this means. The matrix $e^{At}$ is the flow of the system: feed it any initial state $\mathbf{x}(0)$ and it transports that state forward by time $t$. It is sometimes called the fundamental matrix or state-transition matrix, because its columns are the trajectories starting from the standard basis vectors, and every trajectory is a linear combination of those. Geometrically, $e^{At}$ is "the universe's time-$t$ map": apply it to the whole plane and you see where every particle has flowed. This is recurring theme #1 of the book — a matrix is a transformation — extended to time: $e^{At}$ is the transformation that is the passage of time.

The eigenvector form: the solution as a sum of modes

Combining the solution formula with the diagonalization of §37.3 gives the most illuminating way to see a trajectory. If $A$ is diagonalizable with eigenpairs $(\lambda_i, \mathbf{v}_i)$, write the initial condition in the eigenvector basis (Chapter 16): $\mathbf{x}(0) = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n$. Then, because $e^{At}\mathbf{v}_i = e^{\lambda_i t}\mathbf{v}_i$ (the exponential acts on each eigenvector by the scalar $e^{\lambda_i t}$ — this is the eigen-equation $A\mathbf{v}=\lambda\mathbf{v}$ passed through the series), the solution decouples into independent modes: $$ \boxed{\,\mathbf{x}(t) = c_1 e^{\lambda_1 t}\mathbf{v}_1 + c_2 e^{\lambda_2 t}\mathbf{v}_2 + \cdots + c_n e^{\lambda_n t}\mathbf{v}_n.\,} $$ This is the formula to understand the system by. Each eigenvector $\mathbf{v}_i$ is a mode — a fixed direction — and along it the system simply grows or decays like the pure scalar exponential $e^{\lambda_i t}$. The trajectory is a superposition of these modes (recurring theme: linearity is superposition, Chapter 1). The eigenvalue $\lambda_i$ is the rate of mode $i$: positive $\lambda_i$ means that mode grows, negative means it decays, and (next section) complex means it oscillates. To know the system's fate is to know which modes survive.

The mode picture also explains timescales, which matter enormously in practice. When the eigenvalues are all negative but of very different sizes — say $\lambda_1 = -1$ and $\lambda_2 = -3$ in our anchor — the modes decay at very different rates. The fast mode ($e^{-3t}$) is essentially gone after a time of order $1/3$, while the slow mode ($e^{-t}$) lingers three times as long. So a trajectory has two phases: a brief transient in which the fast mode dies and the motion swings toward the slow eigenvector, followed by a long relaxation in which the system slides into equilibrium along that slow direction. The ratio of the largest to the smallest $|\operatorname{Re}\lambda|$ is the system's stiffness, and a large ratio (modes decaying on wildly different clocks) is exactly what makes a system numerically hard to integrate — a "stiff" ODE that defeats naive step-by-step methods and forces specialized solvers. The eigenvalues do not merely decide whether the system settles; they set how fast, on every timescale at once.

Math-Major Sidebar — the inhomogeneous system. What if the system has a forcing term, $\mathbf{x}' = A\mathbf{x} + \mathbf{f}(t)$ — an external input driving the dynamics (a voltage source on a circuit, a vaccination rate on an epidemic)? The matrix exponential solves this too, by the variation-of-parameters idea you met for scalar equations in calculus. Multiply through by the integrating factor $e^{-At}$ and use $\frac{d}{dt}(e^{-At}\mathbf{x}) = e^{-At}(\mathbf{x}' - A\mathbf{x}) = e^{-At}\mathbf{f}(t)$ (the product rule, valid because $e^{-At}$ commutes with $A$). Integrating from $0$ to $t$ and multiplying back by $e^{At}$ gives the closed form $$ > \mathbf{x}(t) = \underbrace{e^{At}\mathbf{x}(0)}_{\text{free response}} + \underbrace{\int_0^t e^{A(t-s)}\mathbf{f}(s)\,ds}_{\text{forced response}}. > $$ The first term is the homogeneous solution of this section — the system's natural response to its initial state — and the second is a convolution of the forcing with the exponential, the system's response to the input. This formula, called Duhamel's principle, is the backbone of linear-systems theory in engineering, and it shows that even the driven case reduces to knowing $e^{At}$. The eigenvalues still govern: if all $\operatorname{Re}\lambda < 0$, the free response decays and the system's long-run behavior is dictated entirely by the forcing — a stable system forgets its initial condition and tracks its input.

Hand computation: solving the 2×2 system

Take our anchor matrix $A = \begin{bmatrix} -2 & 1 \\ 1 & -2 \end{bmatrix}$ with initial state $\mathbf{x}(0) = \begin{bmatrix} 3 \\ 1 \end{bmatrix}$. We found $\lambda_1=-1, \mathbf{v}_1=(1,1)$ and $\lambda_2=-3, \mathbf{v}_2=(1,-1)$. First expand $\mathbf{x}(0)$ in the eigenbasis: solve $c_1(1,1) + c_2(1,-1) = (3,1)$. Adding and subtracting the two component equations gives $c_1 = \tfrac{3+1}{2} = 2$ and $c_2 = \tfrac{3-1}{2} = 1$. The solution is therefore $$ \mathbf{x}(t) = 2\,e^{-t}\begin{bmatrix} 1 \\ 1 \end{bmatrix} + 1\,e^{-3t}\begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 2e^{-t} + e^{-3t} \\ 2e^{-t} - e^{-3t} \end{bmatrix}. $$ Both exponents are negative, so both modes decay and $\mathbf{x}(t) \to \mathbf{0}$ as $t\to\infty$: the system returns to equilibrium. The fast mode ($e^{-3t}$, eigenvalue $-3$) dies first, so after a short time the trajectory aligns with the slow mode $\mathbf{v}_1 = (1,1)$ before sliding into the origin along that direction. Evaluate at $t=1$: $\mathbf{x}(1) = 2(0.367879)(1,1) + (0.049787)(1,-1) = (0.785546,\, 0.685972)$.

numpy verification — three ways must agree

# Solve x' = Ax three ways: closed form, e^{At}x0, and a numerical ODE integrator.
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp
A  = np.array([[-2., 1.], [1., -2.]])
x0 = np.array([3., 1.])

# (1) eigenvector/mode formula: x(t) = 2 e^{-t}(1,1) + 1 e^{-3t}(1,-1)
def x_closed(t):
    return 2*np.exp(-t)*np.array([1., 1.]) + 1*np.exp(-3*t)*np.array([1., -1.])

# (2) matrix exponential
def x_expm(t):
    return expm(A * t) @ x0

# (3) numerical integration of the ODE itself
sol = solve_ivp(lambda t, x: A @ x, [0, 1], x0, t_eval=[1.0], rtol=1e-10, atol=1e-12)

print(np.round(x_closed(1.0), 6))     # [0.785546 0.685972]
print(np.round(x_expm(1.0),   6))     # [0.785546 0.685972]
print(np.round(sol.y[:, -1],  6))     # [0.785546 0.685972]

All three outputs read [0.785546 0.685972]. The hand-derived closed form, the matrix-exponential solution, and scipy.integrate.solve_ivp's independent numerical integration agree to six decimals — three different routes (algebra, the exponential, brute-force numerics) converging on the same trajectory point, recurring theme #3 in action: computation validates theory.

Real-World Application — drug dosing and pharmacokinetics (medicine / biology). When a drug moves between the bloodstream and the tissues, the concentrations $\mathbf{x}(t)=(\text{blood}, \text{tissue})$ obey a linear system $\mathbf{x}'=A\mathbf{x}$ whose off-diagonal entries are transfer rates between compartments and whose diagonal entries include elimination by the kidneys and liver. Because elimination makes all eigenvalues negative, the drug clears — and the slowest (least negative) eigenvalue sets the drug's effective half-life, the number printed on the label that tells a doctor how often to redose. Pharmacologists literally read dosing schedules off the eigenvalues of $A$. The same compartment-model mathematics appears in epidemiology, ecology (nutrient flow between species), and chemical engineering.

37.5 What happens when the eigenvalues are complex?

So far our eigenvalues have been real and the modes have purely grown or decayed. But Chapter 26 taught us that a real matrix can have complex eigenvalues, and that they are not exotic — they are rotation in disguise. In a dynamical system, complex eigenvalues are where oscillation comes from. They are the reason a struck bell rings, a suspension bridge sways, a predator-prey population cycles, and an electrical circuit hums.

Recall the mechanism from Chapter 26: for a real matrix, complex eigenvalues always come in conjugate pairs $\lambda = \alpha \pm \beta i$ (the characteristic polynomial has real coefficients, so its complex roots are conjugates). Feed such an eigenvalue into a mode $e^{\lambda t}$ and use Euler's formula, the bridge between exponentials and oscillation that you met in calculus and in Chapter 26: $$ e^{\lambda t} = e^{(\alpha + \beta i)t} = e^{\alpha t}\,e^{i\beta t} = e^{\alpha t}\big(\cos\beta t + i\sin\beta t\big). $$ Read this slowly, because it contains the entire qualitative theory of oscillation. The factor $e^{\alpha t}$ — driven by the real part $\alpha$ — is the envelope: it grows ($\alpha>0$), decays ($\alpha<0$), or stays constant ($\alpha=0$). The factor $\cos\beta t + i\sin\beta t$ — driven by the imaginary part $\beta$ — is the oscillation: it spins around the unit circle at angular frequency $\beta$, completing a cycle every $2\pi/\beta$ units of time. A complex eigenvalue packages a decay rate and a frequency into one number.

The Key Insight — A complex eigenvalue $\lambda = \alpha \pm \beta i$ tells you two things at once. Its real part $\alpha$ sets whether the oscillation grows or dies (the envelope $e^{\alpha t}$), and its imaginary part $\beta$ sets how fast it oscillates (the frequency, through $\cos\beta t$ and $\sin\beta t$). Real part is fate; imaginary part is rhythm. A bell with $\alpha<0$ rings and fades; a system with $\alpha>0$ oscillates with ever-growing amplitude (resonance disaster); a system with $\alpha=0$ oscillates forever at constant amplitude.

The complex modes are useful for analysis, but the real solution must be real — the physical state is real numbers. The fix is exactly the real canonical form of Chapter 26: the imaginary parts of the two conjugate modes cancel when you add them, leaving real solutions built from $e^{\alpha t}\cos\beta t$ and $e^{\alpha t}\sin\beta t$. We do not need to redo that algebra; we need only its conclusion. The general real solution of a $2\times2$ system with eigenvalues $\alpha\pm\beta i$ is a combination of $e^{\alpha t}\cos\beta t$ and $e^{\alpha t}\sin\beta t$ — an oscillation at frequency $\beta$ inside an exponential envelope $e^{\alpha t}$.

Geometric Intuition — A real eigenvalue stretches along a fixed line; a complex pair rotates while it scales. Picture the flow when $\lambda=\alpha\pm\beta i$: a particle does not move along a straight eigen-direction (there are no real eigenvectors to move along) — instead it spirals. The imaginary part $\beta$ spins it around the origin; the real part $\alpha$ simultaneously pushes it outward ($\alpha>0$, expanding spiral) or inward ($\alpha<0$, collapsing spiral toward the origin). When $\alpha=0$ exactly, the in-out push vanishes and the particle traces a closed ellipse forever — a pure rotation, the "center" of §37.8. Complex eigenvalues are the algebra of spiraling.

Hand computation: a stable spiral

Take $B = \begin{bmatrix} -1 & -2 \\ 2 & -1 \end{bmatrix}$. The characteristic polynomial is $(-1-\lambda)^2 + 4 = \lambda^2 + 2\lambda + 5$, with roots $\lambda = \tfrac{-2 \pm \sqrt{4-20}}{2} = -1 \pm 2i$. So $\alpha = -1$ (negative real part — the spiral decays) and $\beta = 2$ (the oscillation frequency). The solution oscillates at angular frequency $2$ — period $2\pi/2 = \pi$ — inside a shrinking envelope $e^{-t}$. A particle started at $(2,0)$ spirals inward toward the origin, completing a half-turn every $\pi/2$ time units while its distance from the origin shrinks by a factor of $e^{-\pi}\approx 0.043$ each full loop. This is a stable spiral.

numpy verification

# Complex eigenvalues => spiral. Real part = decay/growth, imaginary part = frequency.
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp
B  = np.array([[-1., -2.], [2., -1.]])
print(np.round(np.linalg.eigvals(B), 6))    # [-1.+2.j -1.-2.j]  -> alpha=-1, beta=2
x0 = np.array([2., 0.])
print(np.round((expm(B) @ x0), 6))          # [-0.306184  0.669024]  (position at t=1)
sol = solve_ivp(lambda t, x: B @ x, [0, 1], x0, t_eval=[1.0], rtol=1e-10, atol=1e-12)
print(np.round(sol.y[:, -1], 6))            # [-0.306184  0.669024]  -> integrator agrees

The eigenvalues come back as -1.+2.j and -1.-2.j — real part $-1$, imaginary part $\pm 2$ — confirming the hand analysis, and the two solution methods agree on the position at $t=1$. The negative real part guarantees the spiral collapses; we plot its full trajectory in §37.8.

Common Pitfall — A complex eigenvalue $\alpha + \beta i$ does not mean the system is unstable, and the magnitude $|\lambda|=\sqrt{\alpha^2+\beta^2}$ is not what governs stability. Only the real part $\alpha$ decides growth versus decay. A student who sees $\lambda = -1 + 2i$ and reasons "$|\lambda| = \sqrt5 > 1$, so it grows" has imported the discrete-system rule (where $|\lambda|>1$ means growth of $A^k$, Chapter 29) into the continuous world, where it is flatly wrong. For $\mathbf{x}'=A\mathbf{x}$, stability is about the sign of $\operatorname{Re}\lambda$, full stop. (The two rules connect cleanly: $|e^{\lambda}| = e^{\operatorname{Re}\lambda}$, so $\operatorname{Re}\lambda < 0$ for the continuous system corresponds exactly to $|e^{\lambda}|<1$ for the time-step map $e^{A}$ — the bridge between continuous and discrete dynamics.)

FAQ: Why must oscillation come from complex eigenvalues — can a real eigenvalue ever oscillate?

No, and the reason is worth making explicit because it pins down exactly where oscillation lives in the algebra. A single real eigenvalue $\lambda$ contributes a mode $e^{\lambda t}\mathbf{v}$ that moves monotonically along the fixed line spanned by $\mathbf{v}$ — it grows or shrinks, but it never turns around, because $e^{\lambda t}$ is a monotone function of $t$ for real $\lambda$. Turning around — going up, then down, then up — requires the periodic functions $\cos\beta t$ and $\sin\beta t$, and those enter the solution only through Euler's formula $e^{i\beta t}=\cos\beta t + i\sin\beta t$, which only appears when an eigenvalue has a nonzero imaginary part $\beta$. So oscillation in a linear system is precisely the fingerprint of complex eigenvalues; a system with only real eigenvalues cannot oscillate, no matter how the modes are combined. This is why a heavily damped door (real eigenvalues) eases shut without bouncing, while a lightly damped one (complex eigenvalues) swings past and back: damping is what pushes the eigenvalues off the real axis and back onto it. The imaginary part is the frequency, and without it there is no rhythm to the motion.

37.6 How do you exponentiate a defective matrix?

Diagonalization handled the well-behaved case, and most matrices you meet are diagonalizable. But Chapter 24 warned us, and Chapter 36 confronted it head-on: some matrices are defective — they do not have enough independent eigenvectors to form a basis, so there is no $P$ with $A=PDP^{-1}$. The diagonalization recipe breaks. Yet $e^{At}$ still exists (the series always converges, §37.2), so we must be able to compute it. The answer is the Jordan normal form of Chapter 36, and it reveals where a genuinely new kind of term — a polynomial multiplying the exponential — comes from.

The model defective matrix is a single Jordan block (Chapter 36), $$ J = \begin{bmatrix} \lambda & 1 \\ 0 & \lambda \end{bmatrix}, $$ which has the one eigenvalue $\lambda$ with algebraic multiplicity 2 but only a one-dimensional eigenspace — the defect that blocks diagonalization. To exponentiate it, use the trick of Chapter 36: split it into its diagonal and nilpotent parts, $J = \lambda I + N$ where $N = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$. The matrix $N$ is nilpotent: $N^2 = \mathbf{0}$ (check it). The two pieces $\lambda I$ and $N$ commute (anything commutes with a multiple of $I$), which — by the rule we are about to state in §37.7 — lets us split the exponential of the sum into a product: $$ e^{Jt} = e^{(\lambda I + N)t} = e^{\lambda I t}\,e^{Nt} = e^{\lambda t}\,e^{Nt}. $$ Now $e^{Nt}$ is easy because $N$ is nilpotent — its series terminates after two terms, since $N^2 = N^3 = \cdots = \mathbf{0}$: $$ e^{Nt} = I + Nt + \frac{(Nt)^2}{2!} + \cdots = I + Nt = \begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix}. $$ Therefore $$ \boxed{\,e^{Jt} = e^{\lambda t}\begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} e^{\lambda t} & t\,e^{\lambda t} \\ 0 & e^{\lambda t} \end{bmatrix}.\,} $$ There it is — the new ingredient. A defective eigenvalue produces a term $t\,e^{\lambda t}$: a polynomial in $t$ times the exponential. The repeated eigenvalue, lacking a second eigenvector, "borrows" a factor of $t$ from the generalized eigenvector of Chapter 36. For a larger Jordan block of size $m$, the entries climb to $t^{m-1}e^{\lambda t}/(m-1)!$ — the deficiency in eigenvectors is paid for in powers of $t$.

Geometric Intuition — In the diagonalizable case each mode is pure exponential, $e^{\lambda t}\mathbf{v}$, moving rigidly along a fixed eigen-direction. A defective eigenvalue has a missing direction, and the flow compensates with a shear that grows linearly in time — the $t\,e^{\lambda t}$ term. Picture the visualizer's shear from Chapter 1, but with the shear strength increasing as the clock ticks: the generalized eigenvector slides steadily alongside the true eigenvector. Crucially, the stability is still set by $\lambda$: if $\operatorname{Re}\lambda<0$, the exponential $e^{\lambda t}$ eventually crushes the polynomial $t$ (exponentials beat polynomials), so the mode still decays — just more slowly, with a transient bump before it dies.

Hand computation and verification: the defective exponential

For $J = \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}$ (eigenvalue $\lambda = 2$, defective), the formula gives $e^{Jt} = e^{2t}\begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix}$. At $t=1$: $e^{J} = e^{2}\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$, and with $e^2 = 7.389056$ that is $\begin{bmatrix} 7.389056 & 7.389056 \\ 0 & 7.389056 \end{bmatrix}$.

# Defective (non-diagonalizable) matrix: exponentiate via Jordan, get a t*e^{lt} term.
import numpy as np
from scipy.linalg import expm
J = np.array([[2., 1.], [0., 2.]])
print(np.round(expm(J), 6))                 # scipy handles the defective case directly
# [[7.389056 7.389056]
#  [0.       7.389056]]
t = 1.0
hand = np.exp(2*t) * np.array([[1., t], [0., 1.]])   # e^{2t}[[1,t],[0,1]]
print(np.round(hand, 6))                    # [[7.389056 7.389056] [0. 7.389056]]
# WARNING: diagonalization fails here -- np.linalg.eig returns a near-singular P.
lam, P = np.linalg.eig(J)
print(np.round(np.linalg.cond(P), 2))       # huge condition number -> P is (nearly) non-invertible

The hand formula and scipy's expm match exactly, $\begin{bmatrix} 7.389056 & 7.389056 \\ 0 & 7.389056\end{bmatrix}$ — and note that scipy gets it right without diagonalizing. The last line shows why you cannot diagonalize: np.linalg.eig returns an eigenvector matrix P with an enormous condition number (it tries to return two eigenvectors but they are nearly parallel), so $P^{-1}$ is numerically garbage. This is the practical face of "defective," and the reason §37.3 told you to call expm rather than rolling your own.

Warning

— The diagonalization formula $e^{At}=Pe^{Dt}P^{-1}$ is valid only when $A$ is diagonalizable — when it has a full set of $n$ linearly independent eigenvectors (Chapter 25's condition). For a defective matrix the formula is not just inaccurate, it is undefined: $P$ is not invertible, so $P^{-1}$ does not exist. You must use the Jordan form (this section) or, in practice, scipy's expm. Always check diagonalizability — equivalently, check that the geometric multiplicity equals the algebraic multiplicity for every eigenvalue (Chapter 24) — before reaching for $Pe^{Dt}P^{-1}$.

37.7 What is the stability rule, and why is $e^{A+B} \ne e^A e^B$?

We can now state the chapter's central result — the one the whole structure has been building toward — and it is short enough to memorize. Gather everything: in the diagonalizable case the solution is $\sum_i c_i e^{\lambda_i t}\mathbf{v}_i$; complex eigenvalues contribute $e^{\alpha t}(\text{oscillation})$ with $\alpha = \operatorname{Re}\lambda$; defective eigenvalues contribute $t^{m}e^{\lambda t}$ but the exponential still dominates the polynomial. In every case, each mode is bounded above and below by $e^{(\operatorname{Re}\lambda)t}$ (up to polynomial factors), and a polynomial cannot reverse the verdict of an exponential. So the long-run fate of the whole system is decided by the single most positive real part among the eigenvalues.

The Key Insight — the stability dictionary. For the linear system $\mathbf{x}'=A\mathbf{x}$, the equilibrium at the origin is: - Asymptotically stable (every trajectory decays to $\mathbf{0}$) $\iff$ all eigenvalues have $\operatorname{Re}\lambda < 0$. - Unstable (some trajectory grows without bound) $\iff$ some eigenvalue has $\operatorname{Re}\lambda > 0$. - Marginally stable / borderline when the largest real part is exactly $0$ (and those eigenvalues are non-defective): trajectories neither grow nor decay — they settle into bounded oscillation or constant motion. (A defective eigenvalue with $\operatorname{Re}\lambda=0$ is unstable, because the $t$-factor grows.)

Real part negative ⇒ decay. Real part positive ⇒ growth. Real part zero ⇒ borderline. Stability lives entirely in the signs of the real parts of the eigenvalues. This is recurring theme #6 at its sharpest: the eigenvalues are the destiny of the system.

State the conditions precisely, because the borderline cases are where careless reasoning fails. "All $\operatorname{Re}\lambda<0$" is the clean, robust condition for asymptotic stability — small perturbations die out completely. The strict inequality matters: a single eigenvalue with $\operatorname{Re}\lambda=0$ is enough to break asymptotic stability, and if that borderline eigenvalue is defective, the borrowed factor of $t$ makes the system outright unstable even though $\operatorname{Re}\lambda$ is not positive. This is why a structural engineer fears eigenvalues on the imaginary axis, not just to the right of it: at exactly zero damping, resonance can grow linearly.

Real-World Application — control systems and the stability of feedback (engineering / robotics). When an engineer designs a cruise control, an autopilot, a drone stabilizer, or a thermostat, the closed-loop system is a linear system $\mathbf{x}'=A\mathbf{x}$ whose matrix $A$ depends on the feedback gains they choose. The entire job of control design is to place the eigenvalues of $A$ in the left half-plane ($\operatorname{Re}\lambda<0$) so that disturbances die out and the system holds its setpoint. Push an eigenvalue across the imaginary axis and the controller oscillates or runs away — a crashed drone, a hunting thermostat, a wobbling robot arm. "Pole placement" and the Routh–Hurwitz stability criterion are entire chapters of control theory devoted to keeping $\operatorname{Re}\lambda<0$ without ever computing the eigenvalues explicitly. The RLC circuit of Case Study 2 is the simplest instance.

Why the exponential law $e^{A+B}=e^Ae^B$ usually fails

There is one seductive shortcut you must not take, and it is the most common error with matrix exponentials. For numbers, $e^{a+b}=e^a e^b$ always — it is the law of exponents. For matrices, this is false in general. The reason is the reason for half of linear algebra's surprises: matrix multiplication does not commute (Chapter 8). Watch where commuting is needed. Expanding $e^{A+B}$ by the series produces, at second order, the term $\tfrac12(A+B)^2 = \tfrac12(A^2 + AB + BA + B^2)$, while expanding $e^Ae^B$ produces $\tfrac12 A^2 + AB + \tfrac12 B^2$. These match only if $AB+BA = 2AB$, i.e. only if $\mathbf{AB = BA}$. When $A$ and $B$ commute, every term lines up and the law holds; when they do not, the exponential of a sum is genuinely different from the product of the exponentials.

The exponential law for matrices (state the condition!). $e^{A+B} = e^A e^B$ if and only if $A$ and $B$ commute, $AB = BA$. In particular $e^{A}e^{-A} = e^{A-A} = e^{0} = I$ always holds (a matrix commutes with itself), so $e^{At}$ is always invertible, with $(e^{At})^{-1} = e^{-At}$ — which makes physical sense, since the flow can always be run backward in time. But you may not split $e^{A+B}$ into $e^A e^B$ unless you have checked that $AB=BA$.

# The exponential law e^{A+B} = e^A e^B FAILS when A and B do not commute.
import numpy as np
from scipy.linalg import expm
A = np.array([[0., 1.], [0., 0.]])
B = np.array([[0., 0.], [1., 0.]])
print(np.allclose(A @ B, B @ A))            # False -> A and B do NOT commute
print(np.round(expm(A + B), 6))             # e^{A+B}
# [[1.543081 1.175201]
#  [1.175201 1.543081]]
print(np.round(expm(A) @ expm(B), 6))       # e^A e^B  -- a DIFFERENT matrix
# [[2. 1.]
#  [1. 1.]]
print(round(np.linalg.norm(expm(A+B) - expm(A) @ expm(B)), 6))   # 0.751733 -> not equal

The two matrices differ substantially — the Frobenius norm of their difference is $0.751733$, nowhere near zero. The non-commuting $A$ and $B$ make $e^{A+B}\neq e^Ae^B$, exactly as the algebra predicted. (The precise correction term is the content of the Baker–Campbell–Hausdorff formula, which expresses $\log(e^Ae^B)$ in terms of nested commutators $[A,B]=AB-BA$ — and it is central to quantum mechanics and Lie theory.)

Common Pitfall — Do not write $e^{A+B}=e^Ae^B$ for matrices without checking $AB=BA$. This trap is everywhere: it sneaks into "operator splitting" numerical schemes, into quantum time-evolution (where $e^{-iHt}$ for a sum of non-commuting Hamiltonians cannot be split naively — the source of the famous Trotter–Suzuki approximation), and into hand derivations. The safe facts are the commuting ones: $e^{At}e^{As}=e^{A(t+s)}$ (same matrix at different times — the flow composes, $A\cdot t$ and $A\cdot s$ commute), and $e^{A}e^{-A}=I$. Anything mixing two different matrices needs the commuting check first.

37.8 How do eigenvalues classify the phase portrait?

Now we collect the harvest: the complete dictionary translating eigenvalues into the shape of the phase portrait. This is the geometry-first payoff of the whole chapter — every type of long-run behavior a $2\times2$ linear system can exhibit, drawn as a picture and labeled by its eigenvalues. The anchor of this chapter is exactly this: take a system, compute its eigenvalues, and classify the portrait.

There are four fundamental types for a $2\times2$ system with an isolated equilibrium at the origin, set by the two eigenvalues $\lambda_1, \lambda_2$:

  1. Node — eigenvalues real, same sign. Both negative ⇒ a stable node: every trajectory flows into the origin, fast along the more-negative eigenvector, then sliding in along the less-negative one. Both positive ⇒ an unstable node: the same picture with all arrows reversed, everything flowing out.
  2. Saddle — eigenvalues real, opposite signs ($\lambda_1 < 0 < \lambda_2$). Trajectories come in along the stable (negative-$\lambda$) eigenvector and go out along the unstable (positive-$\lambda$) one, making the hyperbola-like saddle shape. Always unstable — almost every trajectory eventually escapes along the unstable direction.
  3. Spiral (focus) — eigenvalues complex with nonzero real part, $\alpha\pm\beta i$, $\alpha\neq0$. Trajectories spiral. $\alpha<0$ ⇒ stable spiral (inward); $\alpha>0$ ⇒ unstable spiral (outward). The frequency is $\beta$.
  4. Center — eigenvalues purely imaginary, $\pm\beta i$ ($\alpha=0$). Trajectories are closed ellipses orbiting the origin forever — neither growing nor decaying. Marginally stable; the special case of undamped oscillation.

Geometric Intuition — The four portraits are four answers to "what does the flow do near equilibrium?" A node is pure radial motion (in or out) — trajectories are like water down a drain (stable) or a fountain (unstable). A saddle is a mountain pass — flow funnels in from two sides and out the other two. A spiral is a whirlpool with drift — round and round while sinking or rising. A center is a planetary orbit — eternal closed loops. Each shape is a direct fingerprint of the eigenvalues: real-same-sign drains, real-opposite-sign passes, complex spirals, imaginary orbits.

The trace-determinant shortcut

For a $2\times2$ matrix you often do not even need to find the eigenvalues — two numbers you can read at a glance, the trace and determinant, place the system on a map. Recall from Chapter 24 that for $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, the eigenvalues satisfy $\lambda^2 - (\operatorname{tr} A)\lambda + \det A = 0$, where $\operatorname{tr} A = a+d = \lambda_1+\lambda_2$ and $\det A = ad-bc = \lambda_1\lambda_2$. The discriminant is $\Delta = (\operatorname{tr} A)^2 - 4\det A$. From just these you can classify:

  • $\det A < 0$ ⇒ eigenvalues are real with opposite signs ⇒ saddle (always).
  • $\det A > 0$ and $\Delta > 0$ ⇒ real, same sign ⇒ node, stable if $\operatorname{tr} A < 0$, unstable if $\operatorname{tr} A > 0$.
  • $\det A > 0$ and $\Delta < 0$ ⇒ complex eigenvalues ⇒ spiral, stable if $\operatorname{tr} A < 0$, unstable if $\operatorname{tr} A > 0$.
  • $\det A > 0$ and $\operatorname{tr} A = 0$ ⇒ purely imaginary ⇒ center.

The pattern compresses to a slogan: the trace controls stability (it is the sum of the real parts, so $\operatorname{tr} A < 0$ pulls toward decay), and the determinant and discriminant control the type (sign of $\det A$ separates saddle from node/spiral; sign of $\Delta$ separates node from spiral). This is the trace-determinant plane, the one-picture summary of all $2\times2$ linear dynamics.

Check Your Understanding — Classify each system from its trace and determinant, without finding eigenvalues. (a) $A=\begin{bmatrix}-2&1\\1&-2\end{bmatrix}$. (b) $A=\begin{bmatrix}1&2\\2&1\end{bmatrix}$. (c) $A=\begin{bmatrix}-1&-2\\2&-1\end{bmatrix}$.

Answer (a) $\operatorname{tr}=-4<0$, $\det=4-1=3>0$, $\Delta=16-12=4>0$ ⇒ real same-sign, trace negative ⇒ stable node (matches our $\lambda=-1,-3$). (b) $\operatorname{tr}=2$, $\det=1-4=-3<0$ ⇒ saddle (eigenvalues $3,-1$). (c) $\operatorname{tr}=-2<0$, $\det=1+4=5>0$, $\Delta=4-20=-16<0$ ⇒ complex, trace negative ⇒ stable spiral (matches $\lambda=-1\pm2i$). All three classified with no eigenvalue computation.

Drawing the phase portraits (the anchor)

Here is the anchor of the chapter made visible. We plot the phase portrait of three representative systems — a stable node, a saddle, and a stable spiral — each with its velocity field (the arrows $A\mathbf{x}$) and a few sample trajectories, and we read its classification straight off the eigenvalues. The figures are produced by the matplotlib code below; each uses the same conventions so all three look like pages of one atlas.

# Figure 37.1-37.3: phase portraits of a stable node, a saddle, and a stable spiral.
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import solve_ivp

systems = {
    "Stable node  (lam = -1, -3)":  np.array([[-2.,  1.], [ 1., -2.]]),
    "Saddle  (lam = 3, -1)":        np.array([[ 1.,  2.], [ 2.,  1.]]),
    "Stable spiral  (lam = -1+-2i)":np.array([[-1., -2.], [ 2., -1.]]),
}
fig, axes = plt.subplots(1, 3, figsize=(15, 5))
g = np.linspace(-3, 3, 21)
X, Y = np.meshgrid(g, g)
for ax, (title, A) in zip(axes, systems.items()):
    U, V = A[0, 0]*X + A[0, 1]*Y, A[1, 0]*X + A[1, 1]*Y     # velocity field A x
    ax.streamplot(X, Y, U, V, density=1.1, color="0.6", linewidth=0.7)
    for x0 in [(2.5, 0.5), (-2.5, -0.5), (0.5, 2.5), (-0.5, -2.5), (2.5, -2.5), (-2.5, 2.5)]:
        sol = solve_ivp(lambda t, x: A @ x, [0, 4], x0, t_eval=np.linspace(0, 4, 300))
        ax.plot(sol.y[0], sol.y[1], "C0", lw=1.6)            # a trajectory
        ax.plot(*x0, "ko", ms=3)                              # its start
    ax.plot(0, 0, "C3*", ms=14)                              # the equilibrium
    ax.set_xlim(-3, 3); ax.set_ylim(-3, 3); ax.set_aspect("equal")
    ax.set_title(title); ax.grid(alpha=0.3)
plt.tight_layout(); plt.savefig("phase_portraits.png", dpi=110)

Figure 37.1–37.3. Three phase portraits, classified by eigenvalues. Left — the stable node $A=\begin{bmatrix}-2&1\\1&-2\end{bmatrix}$ (eigenvalues $-1,-3$): all trajectories flow into the origin, sliding in along the slow eigenvector $(1,1)$. Center — the saddle $A=\begin{bmatrix}1&2\\2&1\end{bmatrix}$ (eigenvalues $3,-1$): trajectories approach along the stable direction $(1,-1)$ and escape along the unstable direction $(1,1)$, the hyperbolic mountain-pass shape. Right — the stable spiral $A=\begin{bmatrix}-1&-2\\2&-1\end{bmatrix}$ (eigenvalues $-1\pm2i$): trajectories spiral inward to the origin, oscillating at frequency $2$ inside a decaying envelope. The red star marks the equilibrium; gray streamlines are the velocity field $A\mathbf{x}$; blue curves are trajectories from the black starting dots.

These three pictures are the eigenvalue dictionary. In the node, two negative real eigenvalues make every arrow point inward — confirmed by our computation $\lambda=-1,-3$. In the saddle, the opposite signs $\lambda=3,-1$ split the flow into incoming and outgoing directions. In the spiral, the complex pair $\lambda=-1\pm2i$ produces rotation (from the imaginary part $2$) inside decay (from the real part $-1$). You did not need to draw the curves to know their shape — the eigenvalues told you in advance. That is the chapter's promise: eigenvalues are prophecy, and the phase portrait is the prophecy fulfilled.

FAQ: What about repeated eigenvalues and the line on the trace-determinant chart?

The four types — node, saddle, spiral, center — cover the generic cases, the ones you hit when the eigenvalues are distinct and off the imaginary axis. Two boundaries deserve a word. The curve $\Delta = (\operatorname{tr} A)^2 - 4\det A = 0$ on the trace-determinant plane separates nodes (real eigenvalues, $\Delta>0$) from spirals (complex eigenvalues, $\Delta<0$); on that curve the discriminant is zero, so the eigenvalues are real and repeated, $\lambda_1=\lambda_2$. What happens there depends on the defectiveness of Chapter 36. If the repeated eigenvalue still has two independent eigenvectors (the matrix is a scalar multiple of $I$, a "star node"), every direction is an eigen-direction and trajectories are straight rays into or out of the origin. If it is defective — only one eigenvector — the missing direction forces the $t\,e^{\lambda t}$ term of §37.6, and you get an improper (degenerate) node: trajectories all become tangent to the single eigenvector as they enter or leave, swept along by the linear-in-$t$ shear. Either way the stability is unchanged — a repeated negative eigenvalue is still stable — so for the practical question "does it decay?" the sign of the real part remains the whole story. The boundary cases refine the shape of the portrait, not its fate. The other boundary, the positive $\det$-axis where $\operatorname{tr} A = 0$, is the center: the knife-edge between stable and unstable spirals, where the slightest change in trace tips eternal orbits into inward or outward spirals — which is why centers are structurally fragile and rarely survive when a model is made even slightly more realistic.

Real-World Application — population ecology and the fate of two interacting species (biology / ecology). Linearize a two-species model — two competitors, or a predator and its prey — near an equilibrium and you get exactly $\mathbf{x}'=A\mathbf{x}$, where $\mathbf{x}$ is the deviation from equilibrium populations. The eigenvalues of the linearization (the Jacobian, the calculus tool that turns a nonlinear model into a local linear one) classify the equilibrium: a stable node or spiral means the two populations return to coexistence after a disturbance; a saddle means coexistence is precarious and the system tips toward one species winning; a center is the textbook predator-prey cycle, populations orbiting forever out of phase. Ecologists read the long-term outcome of competition off the same eigenvalues we just computed. We meet a full coupled-population model in Case Study 1.

37.9 Building the matrix exponential from scratch

We close the exposition by turning the chapter's central computation into code you own, and then verifying it against the professionals. The diagonalization formula $e^{At}=Pe^{Dt}P^{-1}$ is, line for line, a recipe a computer can follow — and writing it yourself is the surest way to feel that the matrix exponential is just eigenvalues exponentiated.

The plan is exactly §37.3: eigendecompose $A$ to get eigenvalues $\boldsymbol\lambda$ and eigenvector matrix $P$, exponentiate the eigenvalues into the diagonal matrix $e^{Dt}$, and sandwich, $P\,e^{Dt}\,P^{-1}$. (For the toolkit version you implement the eigendecomposition itself from scratch via the QR algorithm of Chapter 29; here, to isolate the exponential idea, we lean on the eigensolver and focus on assembling the exponential.) Then we solve $\mathbf{x}'=A\mathbf{x}$ for a given $A, \mathbf{x}(0)$ and check the result against both scipy.linalg.expm and the independent integrator scipy.integrate.solve_ivp.

# matrix_exponential(A, t) via eigendecomposition: e^{At} = P e^{Dt} P^{-1}.
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

def matrix_exponential(A, t):
    """e^{At} for a DIAGONALIZABLE matrix A, via P e^{Dt} P^{-1}. (For defective A, use scipy.linalg.expm.)"""
    A = np.asarray(A, dtype=float)
    lam, P = np.linalg.eig(A)                 # A = P diag(lam) P^{-1}
    Dt = np.diag(np.exp(lam * t))             # e^{Dt}: exponentiate each eigenvalue*t
    return (P @ Dt @ np.linalg.inv(P)).real   # sandwich; .real since real A -> real e^{At}

def solve_linear_ode(A, x0, t):
    """Solve x' = A x with x(0)=x0, returning x(t) = e^{At} x0."""
    return matrix_exponential(A, t) @ np.asarray(x0, dtype=float)

A  = np.array([[-2., 1.], [1., -2.]])
x0 = np.array([3., 1.])
print(np.round(matrix_exponential(A, 1.0), 6))      # matches expm:
# [[0.208833 0.159046]
#  [0.159046 0.208833]]
print(np.max(np.abs(matrix_exponential(A, 1.0) - expm(A))))   # ~1e-15

x_mine = solve_linear_ode(A, x0, 1.0)
x_ref  = solve_ivp(lambda t, x: A @ x, [0, 1], x0,
                   t_eval=[1.0], rtol=1e-10, atol=1e-12).y[:, -1]
print(np.round(x_mine, 6), np.round(x_ref, 6))      # [0.785546 0.685972] [0.785546 0.685972]

The from-scratch matrix_exponential reproduces scipy.linalg.expm to machine precision ($\sim10^{-15}$), and the ODE solution it produces, $\mathbf{x}(1)=(0.785546, 0.685972)$, matches the independent numerical integrator. Three methods, one answer — and you built one of them.

Build Your Toolkit. Add the continuous-time engine to your toolkit and make the matrix exponential a first-class operation — pure Python in the implementation, numpy/scipy only to verify. - In a new toolkit/matrix_exp.py, implement matrix_exponential(A, t) via the eigendecomposition $e^{At}=Pe^{Dt}P^{-1}$, reusing the eig/qr_algorithm you built in toolkit/eigen.py (Chapter 29) so the implementation calls no numpy linear algebra — only your own eigendecomposition and your own matmul/inverse from Chapters 8–9. Exponentiate each eigenvalue with the scalar Taylor series (or Python's math.exp on the real part times Euler's formula on the imaginary part, for complex $\lambda$). - Add solve_linear_ode(A, x0, t) returning $e^{At}\mathbf{x}(0)$ — the complete solver for $\mathbf{x}'=A\mathbf{x}$. - Add classify_2x2(A) that returns the string "stable node" / "unstable node" / "saddle" / "stable spiral" / "unstable spiral" / "center" from the trace, determinant, and discriminant of §37.8. - Verify all three: matrix_exponential against scipy.linalg.expm; solve_linear_ode against scipy.integrate.solve_ivp on the anchor system $A=\begin{bmatrix}-2&1\\1&-2\end{bmatrix},\ \mathbf{x}(0)=(3,1)$, confirming $\mathbf{x}(1)\approx(0.785546,0.685972)$; and classify_2x2 against the eigenvalues for the node, saddle, and spiral above. Note in a comment that for defective $A$ your eigendecomposition route fails (singular $P$) and scipy.linalg.expm — using scaling-and-squaring, Chapter 38 — is the robust choice. The point you are proving: the matrix exponential is nothing more than eigenvalues, exponentiated and rotated back into place.

37.10 Summary and the road ahead

We let time flow, and watched linear algebra meet calculus. The simplest equation in science, $x'=ax$ with solution $x(t)=e^{at}x(0)$, has an exact matrix twin: the system $\mathbf{x}'=A\mathbf{x}$ is solved by $\mathbf{x}(t)=e^{At}\mathbf{x}(0)$, where the matrix exponential $e^{At}=\sum_k (At)^k/k!$ is defined by the same Taylor series and always converges, for every square matrix. We learned to compute it: by diagonalization, $e^{At}=Pe^{Dt}P^{-1}$ (Chapter 25), which exponentiates each eigenvalue into an independent mode $e^{\lambda_i t}$; and in the defective case, via the Jordan form (Chapter 36), where a missing eigenvector forces a polynomial-times-exponential term $t^m e^{\lambda t}$. We proved that $e^{At}$ solves the system by the one-line fact $\frac{d}{dt}e^{At}=Ae^{At}$, and we read the solution as a superposition of eigenmodes, $\mathbf{x}(t)=\sum_i c_i e^{\lambda_i t}\mathbf{v}_i$. Complex eigenvalues $\alpha\pm\beta i$ split into a decay rate $\alpha$ and an oscillation frequency $\beta$ (Euler's formula, Chapter 26), producing spirals. And it all converged on the stability rule: the equilibrium is asymptotically stable exactly when all eigenvalues have negative real part, unstable when any has positive real part — eigenvalues as the destiny of the system. We classified every $2\times2$ phase portrait — node, saddle, spiral, center — from the eigenvalues, with the trace-determinant plane as a shortcut (trace controls stability, determinant and discriminant control type), and drew the portraits to see the dictionary made visible. Finally, we flagged the trap that $e^{A+B}\neq e^Ae^B$ unless $AB=BA$ — the non-commutativity of matrices reaching all the way into the exponential.

So what is the single thing to remember from this chapter? That the eigenvalues of $A$ are the fate of $\mathbf{x}'=A\mathbf{x}$. Negative real parts mean decay, positive mean growth, imaginary parts mean oscillation — and the phase portrait's shape (node, saddle, spiral, center) is nothing but a picture of those eigenvalues. The matrix exponential $e^{At}$ is the machine that turns the eigenvalues into the flow of time, but you can read the system's destiny without ever computing it: just find the eigenvalues and look at their real parts. Eigenvalues stopped being roots of a polynomial and became prophecy.

Where this goes. We have computed $e^{At}$ as if arithmetic were exact — but Chapter 38, Numerical Linear Algebra, delivers the reality check. We saw it twice already: the diagonalization route quietly fails for defective matrices because $P$ becomes ill-conditioned, and scipy's expm succeeds because it uses a fundamentally different, numerically stable algorithm (scaling-and-squaring) rather than the eigenvalues. Why one method is trustworthy on a finite machine and another is not — how the condition number measures a problem's sensitivity to round-off, and why algorithmic stability is a separate virtue from mathematical correctness — is the subject of the next and final chapter of Part VII. The matrix exponential is also the doorway to two deep cross-roads: the connection between differential equations and linear algebra that this chapter built is the foundation of dynamical systems theory, and the operator $e^{-iHt}$ — the matrix exponential of a Hermitian matrix times imaginary time — is precisely the rule for time evolution in quantum mechanics, where the eigenvalues of the Hamiltonian set the energy levels and frequencies of the universe. The qubit of Chapters 5, 21, 27, and 34 evolves by a matrix exponential; the same $e^{At}$ that returns a perturbed bridge to rest also drives the dynamics of every quantum computer.

The Key Insight — Eigenvalues govern dynamics. The matrix exponential $e^{At}$ solves $\mathbf{x}'=A\mathbf{x}$ exactly, but its behavior is entirely dictated by the eigenvalues of $A$: their real parts decide growth or decay, their imaginary parts decide oscillation, and together they classify the phase portrait into node, saddle, spiral, or center. Master this and you can look at any linear dynamical system — a circuit, a population, a control loop, a vibrating structure, a quantum state — and predict its fate from the spectrum alone. This is recurring theme #6, eigenvalues reveal what a matrix really does, in its most consequential form: in a world that evolves in time, the eigenvalues are the future.