Quiz — Chapter 20: Gram-Schmidt and QR Decomposition

Twelve conceptual checks. Try each before opening the answer. One-line explanations follow each answer.


Q1. Gram-Schmidt is best described as which earlier operation, applied repeatedly?

Answer **Orthogonal projection** (Chapter 19). Each step subtracts a vector's projection onto the directions already built, leaving a perpendicular remainder. *Gram-Schmidt is repeated projection — it contains no new mathematics beyond the projection theorem.*

Q2. What is the precise condition on the input vectors for Gram-Schmidt to succeed (produce an orthonormal basis of the same size)?

Answer The vectors must be **linearly independent**. *If they are dependent, some remainder $\mathbf{u}_k$ is exactly $\mathbf{0}$, and you cannot normalize the zero vector — the process signals the dependence by failing at that step.*

Q3. In the factorization $A = QR$, what two properties define $Q$ and $R$?

Answer $Q$ has **orthonormal columns** (so $Q^{\mathsf{T}}Q = I$) and $R$ is **upper-triangular**. *In the canonical form, $R$ additionally has a positive diagonal, which makes the factorization unique.*

Q4. Why is $R$ upper-triangular and not lower-triangular?

Answer Because Gram-Schmidt processes vectors *in order*, and the $k$-th original vector $\mathbf{a}_k$ is rebuilt using only $\mathbf{q}_1, \dots, \mathbf{q}_k$ — never a later $\mathbf{q}$. *"No looking ahead" puts zeros below the diagonal; the sequential structure is the source of triangularity.*

Q5. You compute QR by hand and get a positive diagonal in $R$; numpy gives one negative diagonal entry. Who is right?

Answer **Both.** QR is unique only up to the signs of $Q$'s columns (and matching rows of $R$); numpy makes no positive-diagonal promise. *Flip the sign of any $Q$-column whose $R_{kk}$ is negative — and flip that row of $R$ — to reach the canonical form and match.*

Q6. What is the QR-based formula for the least-squares solution, and how is it solved?

Answer $R\hat{\mathbf{x}} = Q^{\mathsf{T}}\mathbf{b}$, solved by **back substitution** (one upward pass, since $R$ is upper-triangular). *Same projection-onto-the-column-space answer as Chapter 17, but computed without forming $A^{\mathsf{T}}A$.*

Q7. Why is solving least squares via QR more numerically stable than the normal equations $A^{\mathsf{T}}A\hat{\mathbf{x}} = A^{\mathsf{T}}\mathbf{b}$?

Answer Forming $A^{\mathsf{T}}A$ **squares the condition number** ($\kappa \to \kappa^2$), amplifying roundoff; QR never forms it, so its conditioning stays at $\kappa$. *For correlated predictors, $\kappa^2$ can destroy half your significant digits before you even solve.*

Q8. For a tall $m \times n$ matrix ($m > n$), what are the shapes of $Q$ and $R$ in the reduced QR (numpy's default)?

Answer $Q$ is $m \times n$ (orthonormal columns) and $R$ is $n \times n$ (square, upper-triangular). *The **full** QR instead gives an $m \times m$ orthogonal $Q$ and an $m \times n$ $R$; ask numpy for `mode='complete'` to get it.*

Q9. Classical and modified Gram-Schmidt produce the same answer in exact arithmetic. Why prefer modified in practice?

Answer In floating point, **classical GS loses orthogonality** when the inputs are nearly dependent (catastrophic cancellation); modified GS — subtracting each shadow from the *running* remainder — stays accurate. *Same formula, different order of operations, vastly different stability.*

Q10. What does subtracting a vector's projection onto the all-ones direction $\mathbf{1} = (1,1,\dots,1)$ do to that vector, statistically?

Answer It **centers** the vector (subtracts its mean), so the result sums to zero. *Orthogonality to $\mathbf{1}$ means $\mathbf{1}\cdot\mathbf{u} = 0$, i.e. the entries sum to zero — the geometric and statistical statements coincide, which is why PCA centers data first.*

Q11. Iterating QR — factor $A = QR$, then form $RQ$, factor again, repeat — converges (under mild conditions) to a matrix whose diagonal reveals what?

Answer The **eigenvalues** of $A$. *This is the QR algorithm (Chapters 24, 29); each step $RQ = Q^{\mathsf{T}}AQ$ is a similarity transformation, so eigenvalues are preserved while the matrix is rotated toward upper-triangular form.*

Q12. Does every finite-dimensional subspace of $\mathbb{R}^m$ have an orthonormal basis? How do you know?

Answer **Yes.** Take any basis (Chapter 15 guarantees one) and run Gram-Schmidt. *The process is constructive — it not only proves existence but hands you the basis, which is why later results like the Spectral Theorem and SVD may freely assume orthonormal bases exist.*