Chapter 18 Exercises — Dot Products, Norms, and the Geometry of Angles
How to use these. Work the ⭐ problems first to lock in the vocabulary of dot product, norm, orthogonality, and angle (no computation needed). The ⭐⭐ problems are by-hand calculations: dot products, lengths, angles, and cosine similarities you can do with a pencil. The ⭐⭐⭐ problems split into proofs (the A track) and coding with numpy (the C track); do the ones that match your path, but the strongest students do both. The ⭐⭐⭐⭐ problems are applied: find the geometry of angle and similarity hiding in a real system. Tags: [hand] = pencil only, [code] = needs numpy, [proof] = rigorous argument, [essay] = written explanation.
For several problems, use the running pair from the chapter, $\mathbf{u}=\begin{bmatrix}1\\2\\3\end{bmatrix}$ and $\mathbf{v}=\begin{bmatrix}4\\5\\6\end{bmatrix}$, for which $\mathbf{u}\cdot\mathbf{v}=32$, $\lVert\mathbf{u}\rVert=\sqrt{14}$, $\lVert\mathbf{v}\rVert=\sqrt{77}$, and the angle is about $12.93^\circ$.
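If you want to confirm the running-pair values before relying on them, a minimal pure-Python check might look like the following. It only reproduces the numbers quoted above; the variable names are illustrative and not taken from the chapter.

```python
import math

# Running pair from the chapter.
u = [1, 2, 3]
v = [4, 5, 6]

dot_uv = sum(a * b for a, b in zip(u, v))      # 1*4 + 2*5 + 3*6
norm_u = math.sqrt(sum(a * a for a in u))      # sqrt(14)
norm_v = math.sqrt(sum(b * b for b in v))      # sqrt(77)

# Angle from the geometric formula, converted to degrees.
cos_theta = dot_uv / (norm_u * norm_v)
theta_deg = math.degrees(math.acos(cos_theta))

print(dot_uv)               # 32
print(round(theta_deg, 2))  # 12.93
```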
Tier ⭐ — Conceptual (what is / why)
18.1 [hand] State the two formulas for the dot product $\mathbf{u}\cdot\mathbf{v}$ — the algebraic one and the geometric one — and say in one sentence what each is good for. Which one needs to know the angle in advance, and which one does not?
18.2 [hand] Is the dot product of two vectors a scalar or a vector? Explain why this is exactly what we want, given that we will use it to measure length and angle.
18.3 [hand] Write the norm $\lVert\mathbf{v}\rVert$ in terms of the dot product. Why does this single definition let us reuse everything we prove about dot products to reason about lengths?
18.4 [hand] State the algebraic condition for two vectors to be orthogonal. What is the geometric meaning, and why does $\cos 90^\circ$ make the condition come out the way it does?
18.5 [hand] List the four defining properties of a norm (positivity, definiteness, absolute homogeneity, triangle inequality). Which three are easy to check directly for the Euclidean norm, and which one required Cauchy–Schwarz to prove?
18.6 [hand] State the Cauchy–Schwarz inequality including its conditions. For which pairs of vectors does equality hold? Why does the inequality guarantee that the angle formula $\arccos\frac{\mathbf{u}\cdot\mathbf{v}}{\lVert\mathbf{u}\rVert\lVert\mathbf{v}\rVert}$ returns a real number for every pair of nonzero vectors?
18.7 [hand] Define cosine similarity and state its range. Explain in one sentence why it is length-invariant and why that is the property that makes it useful for comparing documents of different lengths.
18.8 [hand] Explain the difference between a dot product and an inner product as the terms are used in this book. When you see the abstract notation $\langle\mathbf{u},\mathbf{v}\rangle$, what does it signal?
18.9 [hand] True or false, with a one-sentence reason: (a) the zero vector is orthogonal to every vector; (b) cosine similarity is a kind of distance; (c) two random vectors in $\mathbb{R}^{1000}$ are usually nearly parallel.
Tier ⭐⭐ — Computation by hand
18.10 [hand] Compute $\mathbf{a}\cdot\mathbf{b}$ for $\mathbf{a}=\begin{bmatrix}2\\-1\\3\end{bmatrix}$ and $\mathbf{b}=\begin{bmatrix}1\\4\\0\end{bmatrix}$. Are they orthogonal?
18.11 [hand] Find the norm of $\begin{bmatrix}1\\2\\2\end{bmatrix}$ and the norm of $\begin{bmatrix}2\\3\\6\end{bmatrix}$. (Both are Pythagorean — the answers are whole numbers.) Then write the unit vector in the direction of the second.
18.12 [hand] Find the angle between $\mathbf{u}=\begin{bmatrix}1\\0\end{bmatrix}$ and $\mathbf{v}=\begin{bmatrix}1\\\sqrt3\end{bmatrix}$ by computing $\cos\theta$ exactly. (You should get a familiar angle.)
18.13 [hand] Determine which of the following pairs are orthogonal: (a) $\begin{bmatrix}3\\4\end{bmatrix},\begin{bmatrix}4\\-3\end{bmatrix}$; (b) $\begin{bmatrix}1\\1\\1\end{bmatrix},\begin{bmatrix}1\\-2\\1\end{bmatrix}$; (c) $\begin{bmatrix}2\\0\\1\\0\end{bmatrix},\begin{bmatrix}0\\5\\0\\3\end{bmatrix}$.
18.14 [hand] For the running pair $\mathbf{u}=(1,2,3)$, $\mathbf{v}=(4,5,6)$, verify the Cauchy–Schwarz inequality by computing $|\mathbf{u}\cdot\mathbf{v}|$ and $\lVert\mathbf{u}\rVert\lVert\mathbf{v}\rVert$ and confirming the first is no larger. How close are they, and what does the closeness tell you about the angle?
18.15 [hand] Compute the cosine similarity between the document vectors $\mathbf{d}_1=\begin{bmatrix}3\\2\\2\\0\\0\\0\end{bmatrix}$ and $\mathbf{d}_2=\begin{bmatrix}1\\4\\3\\0\\0\\0\end{bmatrix}$ from §18.9. Then compute it between $\mathbf{d}_1$ and $5\mathbf{d}_1$, and explain why the second answer is exactly $1$ regardless of the factor $5$.
18.16 [hand] Find the scalar projection of $\mathbf{v}=\begin{bmatrix}5\\1\end{bmatrix}$ onto the direction $\mathbf{u}=\begin{bmatrix}3\\4\end{bmatrix}$ (use $\operatorname{comp}_{\mathbf{u}}\mathbf{v}=\frac{\mathbf{u}\cdot\mathbf{v}}{\lVert\mathbf{u}\rVert}$). Interpret the sign.
18.17 [hand] Verify the triangle inequality for $\mathbf{u}=\begin{bmatrix}3\\0\end{bmatrix}$ and $\mathbf{v}=\begin{bmatrix}0\\4\end{bmatrix}$ by computing $\lVert\mathbf{u}+\mathbf{v}\rVert$ and $\lVert\mathbf{u}\rVert+\lVert\mathbf{v}\rVert$. Is the inequality strict or an equality here? Why? (What would you need for equality?)
18.18 [hand] Compute the $\ell^1$, $\ell^2$, and $\ell^\infty$ norms of $\mathbf{v}=\begin{bmatrix}2\\-3\\6\end{bmatrix}$. Which is largest, which smallest, and which one is the only one that comes from a dot product?
18.19 [hand] A unit vector $\hat{\mathbf{u}}$ and a vector $\mathbf{v}$ satisfy $\hat{\mathbf{u}}\cdot\mathbf{v}=3$. What does this number represent geometrically? If you also know $\lVert\mathbf{v}\rVert=5$, what is the angle between $\hat{\mathbf{u}}$ and $\mathbf{v}$?
Tier ⭐⭐⭐ — Proofs (A track)
18.20 [proof] Prove the reconciliation of the two dot-product formulas, $\lVert\mathbf{u}\rVert\lVert\mathbf{v}\rVert\cos\theta=\sum_i u_i v_i$, by measuring $\lVert\mathbf{u}-\mathbf{v}\rVert^2$ two ways (with the law of cosines and with the componentwise expansion) and equating them (the argument of §18.3). State what justifies each step.
18.21 [proof] Prove the Cauchy–Schwarz inequality $|\mathbf{u}\cdot\mathbf{v}|\le\lVert\mathbf{u}\rVert\lVert\mathbf{v}\rVert$ by considering the nonnegative quadratic $f(t)=\lVert\mathbf{u}-t\mathbf{v}\rVert^2$ and using that its discriminant must be $\le 0$ (the argument of §18.7). State the condition for equality and prove it.
18.22 [proof] Using Cauchy–Schwarz, prove the triangle inequality $\lVert\mathbf{u}+\mathbf{v}\rVert\le\lVert\mathbf{u}\rVert+\lVert\mathbf{v}\rVert$. Identify exactly where the inequality $\mathbf{u}\cdot\mathbf{v}\le|\mathbf{u}\cdot\mathbf{v}|\le\lVert\mathbf{u}\rVert\lVert\mathbf{v}\rVert$ enters.
18.23 [proof] Prove the parallelogram law: $\lVert\mathbf{u}+\mathbf{v}\rVert^2+\lVert\mathbf{u}-\mathbf{v}\rVert^2=2\lVert\mathbf{u}\rVert^2+2\lVert\mathbf{v}\rVert^2$. (Expand both squared norms using $\lVert\mathbf{w}\rVert^2=\mathbf{w}\cdot\mathbf{w}$ and the bilinearity of the dot product.) What does this say geometrically about the diagonals of a parallelogram?
18.24 [proof] Prove the Pythagorean theorem for vectors: if $\mathbf{u}\perp\mathbf{v}$ then $\lVert\mathbf{u}+\mathbf{v}\rVert^2=\lVert\mathbf{u}\rVert^2+\lVert\mathbf{v}\rVert^2$. Show the converse also holds (if the squared-length identity holds then the vectors are orthogonal). Which property of the dot product makes the cross term vanish?
18.25 [proof] Prove that the dot product is linear in its first argument: $(a\mathbf{u}+b\mathbf{w})\cdot\mathbf{v}=a(\mathbf{u}\cdot\mathbf{v})+b(\mathbf{w}\cdot\mathbf{v})$ for all scalars $a,b$ and vectors $\mathbf{u},\mathbf{w},\mathbf{v}$, working from the component definition. Explain why symmetry then gives linearity in the second argument, and hence bilinearity, for free.
Tier ⭐⭐⭐ — Coding (C track)
Use numpy. Where a problem says "from scratch," do not call np.dot, @, or np.linalg.norm in the body; implement the logic with loops or comprehensions and use numpy only to check.
18.26 [code] Implement dot(u, v), norm(v), angle(u, v), and cosine_similarity(u, v) from scratch in pure Python (the Build-Your-Toolkit task of §18.9), with angle clamping its cosine to $[-1,1]$ before math.acos. Verify each against numpy (np.array(u) @ np.array(v), np.linalg.norm, np.arccos(np.clip(...))) on at least five vector pairs, including a pair of identical vectors (to confirm the clamp prevents a domain error) and an orthogonal pair.
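For the checking half of 18.26, one possible numpy reference is sketched below. It is not the from-scratch solution: ref_angle and check_angle are hypothetical helper names, and the default tolerance is an assumption chosen loosely because arccos is sensitive to rounding when the cosine is close to $\pm 1$.

```python
import numpy as np

def ref_angle(u, v):
    """Reference angle in radians via numpy, for checking a from-scratch version."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    cos_theta = (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Rounding can push the ratio slightly outside [-1, 1]; clip before arccos.
    return float(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

def check_angle(angle_fn, pairs, tol=1e-6):
    """Return True if angle_fn agrees with the numpy reference on every pair."""
    return all(abs(angle_fn(u, v) - ref_angle(u, v)) < tol for u, v in pairs)

# A starting set of test pairs; the exercise asks for at least five.
pairs = [
    ([1, 2, 3], [4, 5, 6]),  # running pair
    ([1, 2, 3], [1, 2, 3]),  # identical vectors: angle should be 0
    ([2, 1], [-1, 2]),       # orthogonal pair: angle should be pi/2
]
```

Once your own angle(u, v) exists, check_angle(angle, pairs) should return True.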
18.27 [code] Write is_orthogonal(u, v, tol=1e-9) that returns True when abs(dot(u, v)) < tol. Test it on $(2,1)\perp(-1,2)$, on the 4D pair $(1,0,0,1)$ and $(0,1,1,0)$, and on a non-orthogonal pair. Explain in a comment why you compare to a tolerance instead of testing == 0.
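As a hint for the tolerance question in 18.27 (not a solution to it), the sketch below shows a pair that is orthogonal on paper but whose computed dot product is not exactly zero. The vectors are chosen for illustration only and do not come from the chapter.

```python
# On paper, (1, 0.3) and (0.3, -1) are orthogonal: 0.3 - 0.3 = 0.
# Here the second component of u is built as 0.1 + 0.2, which is not exactly
# 0.3 in binary floating point, so the computed dot product is not exactly 0.
u = [1.0, 0.1 + 0.2]
v = [0.3, -1.0]

d = sum(a * b for a, b in zip(u, v))

print(d == 0)         # False
print(abs(d) < 1e-9)  # True
```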
18.28 [code] Build the three document vectors of §18.9 and compute the full $3\times 3$ matrix of pairwise cosine similarities. Confirm the diagonal is all ones and that the baking document is orthogonal (similarity $0$) to both linear-algebra documents. Then append a fourth document that mixes both topics and report its similarities to the other three.
18.29 [code] Reproduce the high-dimensional near-orthogonality experiment of §18.6: for $n\in\{2,10,100,1000\}$, sample 2000 random pairs of vectors, compute the angle between each pair, and report the mean and standard deviation of the angles. Confirm the mean stays near $90^\circ$ while the spread shrinks. In a comment, explain via the dot-product formula why the spread collapses as $n$ grows.
18.30 [code] Show numerically that Pearson correlation equals cosine similarity of the mean-centered vectors. Generate two related numpy arrays (e.g. y = 2*x + noise), compute np.corrcoef(x, y)[0,1], and independently compute the cosine similarity of x - x.mean() and y - y.mean(). Confirm they agree to floating-point tolerance, then repeat with an anti-correlated pair and confirm both give a value near $-1$.
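For reference, the identity that 18.30 tests can be written out as follows. The notation is an assumption made here, not the chapter's: $\bar{x}$ and $\bar{y}$ denote the sample means, and $\tilde{\mathbf{x}}$, $\tilde{\mathbf{y}}$ the mean-centered vectors with entries $x_i-\bar{x}$ and $y_i-\bar{y}$.

$$
r_{xy}
=\frac{\sum_i (x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_i (x_i-\bar{x})^2}\,\sqrt{\sum_i (y_i-\bar{y})^2}}
=\frac{\tilde{\mathbf{x}}\cdot\tilde{\mathbf{y}}}{\lVert\tilde{\mathbf{x}}\rVert\,\lVert\tilde{\mathbf{y}}\rVert}
$$

Any normalizing factor such as $\frac{1}{n-1}$ in the covariance and the standard deviations cancels between numerator and denominator, which is why the two computations should agree up to rounding.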
18.31 [code] Write nearest_neighbor(query, vectors) that returns the index of the vector in a list most similar to query by cosine similarity. Test it on a small set of embedding-like vectors where you can predict the answer, and contrast it with a version that uses smallest Euclidean distance — construct an example (one candidate much longer than the others) where the two methods disagree, and explain which is "right" for direction-based similarity.
18.32 [code] Verify the two orthogonal-complement relationships of §18.11 numerically for a $3\times 4$ matrix of your choice: get a null-space basis with scipy.linalg.null_space(A) and a left-null basis with null_space(A.T), then confirm np.allclose(A @ nullA, 0) and np.allclose(A.T @ nullAT, 0). State which fundamental-subspace orthogonality each check confirms.
Tier ⭐⭐⭐⭐ — Application / short essay
18.33 [essay] Cosine similarity in semantic search (NLP / data science). A search engine embeds a query and a corpus of documents as vectors and ranks documents by cosine similarity to the query. In 150–250 words, explain why cosine similarity (an angle) is preferred over raw Euclidean distance for this task, what the length-invariance buys you when documents vary wildly in length, and how Cauchy–Schwarz guarantees every similarity score lands in a comparable $[-1,1]$ range. Connect your answer to word embeddings and the broader family of similarity measures.
18.34 [essay] Why centering matters in recommender systems (data science). A memory-based recommender compares users by the similarity of their rating vectors. In 150–250 words, explain why raw cosine similarity can be misleading when all ratings are positive, why subtracting each user's mean rating (giving the Pearson correlation) fixes this, and what a centered similarity of $-1$ tells you about two users' tastes. Use the Alice/Carol example from §18.9 as your worked illustration.
18.35 [essay] Work and the dot product (physics / engineering). The work done by a constant force $\mathbf{F}$ over a displacement $\mathbf{d}$ is $W=\mathbf{F}\cdot\mathbf{d}$. In 150–250 words, explain why only the component of the force along the motion does work, why a force perpendicular to the motion does zero work (relate this to orthogonality), and why a force opposing the motion does negative work. Give a concrete example (e.g. carrying a bag horizontally, or friction) and compute the work for specific $\mathbf{F}$ and $\mathbf{d}$ of your choosing.
18.36 [essay] Signal correlation and matched filtering (signals). A receiver decides whether a known pattern $\mathbf{p}$ is present in a noisy measurement $\mathbf{x}$ by computing the normalized dot product (cosine similarity) between them. In 150–250 words, explain why a high cosine similarity indicates the pattern is present and aligned, why orthogonal noise contributes little to the score (tie this to §18.6's near-orthogonality of random high-dimensional vectors), and why this "matched filter" is the same operation as the projection that will appear in Chapter 19. Mention how the orthogonality of distinct codes lets CDMA separate users.