Chapter 12 Exercises — Computer Graphics

Work in homogeneous coordinates throughout: a 2D point is $(x, y, 1)$, a 3D point is $(x, y, z, 1)$, transforms are $3\times 3$ or $4\times 4$ matrices, and points are column vectors multiplied on the left by the transform (so in a product the rightmost matrix acts first). Tiers: ⭐ conceptual · ⭐⭐ hand computation · ⭐⭐⭐ proof / coding · ⭐⭐⭐⭐ application. Code exercises are marked [code]; numpy is for checking, not replacing, the hand work.
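As a quick illustration of the column-vector convention, the sketch below applies a scale and a translation in both orders. The helper names `to_h` and `from_h` and the two matrices are assumptions chosen for illustration; they are not taken from the chapter or from any exercise.

```python
import numpy as np

def to_h(p):
    """Append the homogeneous 1 to a 2D or 3D point (hypothetical helper)."""
    return np.append(np.asarray(p, dtype=float), 1.0)

def from_h(ph):
    """Divide by the homogeneous coordinate, then drop it (hypothetical helper)."""
    return ph[:-1] / ph[-1]

# Illustrative matrices only, not the ones used in the exercises.
S = np.array([[2.0, 0.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 1.0]])   # scale by 2 in x and y
T = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])   # translate by (1, 0)

p = to_h((1.0, 1.0))

# The matrix written closest to the point acts first.
scale_then_translate = from_h(T @ S @ p)   # S acts first, then T
translate_then_scale = from_h(S @ T @ p)   # T acts first, then S

print("T S p =", scale_then_translate)
print("S T p =", translate_then_scale)
```

Reading the product right to left reproduces the order in which the operations hit the point, which is the habit every exercise below relies on.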


⭐ Conceptual (warm-ups)

1. In one sentence, why can no $2\times 2$ matrix translate the plane? What property of linear maps does translation violate?

2. A 2D point $(x, y)$ is written in homogeneous coordinates as $(x, y, 1)$. What goes in the third column of a $3\times 3$ homogeneous transformation matrix, and what does the bottom row $\begin{bmatrix}0 & 0 & 1\end{bmatrix}$ accomplish?

3. Why are 3D transformation matrices $4\times 4$ rather than $3\times 3$? Answer in terms of where translation needs to live.

4. State the rendering pipeline's four coordinate spaces in order, and name the matrix that takes you between each consecutive pair.

5. True or false, with a one-line reason each: (a) A homogeneous rotation matrix has determinant $1$. (b) Orthographic and perspective projection differ only by a scalar. (c) The view matrix is the inverse of the camera's placement matrix. (d) $TRS$ and $SRT$ produce the same model matrix.

6. What is the perspective divide, and why is it needed after multiplying by the perspective projection matrix rather than being part of the matrix itself?

7. Explain, geometrically, why two parallel rails appear to converge to a point under perspective projection but stay a fixed distance apart under orthographic projection.
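
Once Exercises 6 and 7 are answered in words, a numerical check along the following lines may help. It is only a sketch: the matrix `P` is an assumed minimal perspective matrix for focal length $d = 1$ (it copies $z$ into the homogeneous slot), and the rail positions are made up for illustration. The chapter's own projection matrix may carry extra terms.

```python
import numpy as np

# Assumed minimal perspective matrix (d = 1): the last row copies z into w,
# so the divide that follows yields (x/z, y/z).
P = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])

def perspective(point):
    clip = P @ np.append(point, 1.0)   # matrix step: linear, no division yet
    return clip[:2] / clip[3]          # perspective divide by w = z

def orthographic(point):
    return np.asarray(point, dtype=float)[:2]   # simply drop z

# Two hypothetical rails at x = -1 and x = +1, sampled at increasing depth.
depths = [1.0, 10.0, 100.0]
persp_gap, ortho_gap = [], []
for z in depths:
    left = np.array([-1.0, -1.0, z])
    right = np.array([1.0, -1.0, z])
    persp_gap.append(perspective(right)[0] - perspective(left)[0])
    ortho_gap.append(orthographic(right)[0] - orthographic(left)[0])

print("perspective gap: ", persp_gap)    # shrinks like 2 / z
print("orthographic gap:", ortho_gap)    # stays constant
```

The division happens outside the matrix product because dividing by a coordinate of the input is not a linear operation; the matrix can only stage the divisor in $w$.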


⭐⭐ Hand computation

8. Write the $3\times 3$ homogeneous matrix for a translation by $(5, -2)$. Apply it by hand to the points $(0,0)$ and $(1, 3)$.

9. Write the $3\times 3$ homogeneous matrix for a $90°$ counterclockwise rotation. Apply it by hand to $(2, 0)$ and to $(0, 3)$, and confirm the homogeneous coordinate stays $1$.

10. Build the model matrix $M = T(4,0)\,R(90°)\,S(2,2)$ by multiplying the three $3\times 3$ matrices by hand (right to left). What is the image of the point $(1, 0)$ under $M$? Describe in words what $M$ does to a shape sitting at the origin.

11. For $R = R(90°)$ and $T = T(4, 0)$ (both $3\times 3$ homogeneous), compute $TR$ and $RT$ by hand. Show they differ, and apply each to $(1,0)$ to get two different points.

12. Write the three $4\times 4$ rotation matrices $R_x(90°)$, $R_y(90°)$, $R_z(90°)$. For each, state which coordinate is left unchanged and where the basis vectors $\mathbf e_1, \mathbf e_2, \mathbf e_3$ go.

13. A 3D point $(2, 3, 4)$ is projected by the gentle perspective rule $(x, y, z) \mapsto (x/z, y/z)$ (focal length $d = 1$). Where does it land on screen? Where does $(2, 3, 8)$ land? Which appears closer to the screen's center, and why?

14. A unit cube has a corner at $(1, 1, 1)$. Apply $R_z(90°)$ to that corner by hand (homogeneous $4\times 4$), then project orthographically (drop $z$). Give the screen coordinates.

15. Compute $\det$ of the $3\times 3$ homogeneous matrices for (a) a translation $T(t_x, t_y)$, (b) a scaling $S(s_x, s_y)$, (c) a rotation $R(\theta)$. Which preserve area? (Use the block-triangular structure from Chapter 11.)
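
For checking the hand-written matrices of Exercise 12, a sketch like the one below can be useful. It assumes the common right-handed, counterclockwise-positive convention for the three axis rotations (the chapter's sign convention for $R_y$ should be confirmed against §12 before relying on it), and it deliberately uses $30°$ rather than $90°$ so that it tests structure without giving the answers away. The names `rot_x`, `rot_y`, `rot_z` are hypothetical.

```python
import numpy as np

def rot_x(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0, 0.0],
                     [0.0,   c,  -s, 0.0],
                     [0.0,   s,   c, 0.0],
                     [0.0, 0.0, 0.0, 1.0]])

def rot_y(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[  c, 0.0,   s, 0.0],
                     [0.0, 1.0, 0.0, 0.0],
                     [ -s, 0.0,   c, 0.0],
                     [0.0, 0.0, 0.0, 1.0]])

def rot_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[  c,  -s, 0.0, 0.0],
                     [  s,   c, 0.0, 0.0],
                     [0.0, 0.0, 1.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0]])

theta = np.radians(30.0)
for name, R, axis in [("R_x", rot_x(theta), 0),
                      ("R_y", rot_y(theta), 1),
                      ("R_z", rot_z(theta), 2)]:
    e = np.zeros(4)
    e[axis] = 1.0                                  # the axis as a direction (w = 0)
    orthogonal = np.allclose(R.T @ R, np.eye(4))   # columns are orthonormal
    det_one = np.isclose(np.linalg.det(R), 1.0)    # no flip, no volume change
    axis_fixed = np.allclose(R @ e, e)             # the rotation axis does not move
    print(name, orthogonal, det_one, axis_fixed)
```

Each line should print three `True` values; a hand-derived matrix that fails one of the checks usually has a misplaced sign on a sine entry.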


⭐⭐⭐ Proof and coding

16. (proof) Prove that the product of two $3\times 3$ homogeneous matrices of the form $\begin{bmatrix} A & \mathbf b \\ \mathbf 0^{\mathsf T} & 1\end{bmatrix}$ (with $A$ a $2\times 2$ block and $\mathbf b$ a $2\times 1$ column) is again of that form, and identify the resulting $A$-block and $\mathbf b$-column. Conclude that affine maps are closed under composition.

17. (proof) Show that the homogeneous translation matrices form a commutative group under multiplication: $T(\mathbf u)\,T(\mathbf v) = T(\mathbf v)\,T(\mathbf u) = T(\mathbf u + \mathbf v)$, and the inverse of $T(\mathbf v)$ is $T(-\mathbf v)$. (Translations commute even though general transforms do not — explain why this does not contradict §12.6.)

18. (proof) Prove that the "rotate about a pivot $\mathbf p$" recipe $M = T(\mathbf p)\,R(\theta)\,T(-\mathbf p)$ fixes the point $\mathbf p$ (i.e. $M\mathbf p = \mathbf p$ in homogeneous coordinates) and that its top-left $2\times 2$ block equals the rotation $R(\theta)$ itself.

19. [code] Implement translation(tx, ty), rotation(theta), and scaling(sx, sy) returning $3\times 3$ homogeneous matrices (this is the Build Your Toolkit task). Verify against numpy that translation(3,1) @ rotation(pi/4) @ scaling(2,2) matches your hand result from §12.5, $\begin{bmatrix}1.414 & -1.414 & 3\\ 1.414 & 1.414 & 1 \\ 0 & 0 & 1\end{bmatrix}$.

20. [code] Write a function model_matrix(tx, ty, theta, sx, sy) that returns $T R S$. Apply it to the four corners of the unit square and print the transformed corners. Confirm that the corner originally at $(0,0)$ lands exactly at $(t_x, t_y)$.

21. [code] Demonstrate non-commutativity numerically: build $T(3,1)$ and $R(90°)$, print $TR$ and $RT$, and confirm they differ. Apply both to $(1,0)$ and report the two distinct landing points.

22. [code] Implement the gentle perspective projection: a function project(point, d=1.0) taking a 3D point and returning $(d\,x/z,\ d\,y/z)$. Project the eight corners of a unit cube translated to be centered at depth $z = 5$, and confirm distant corners land closer to the origin than near ones.

23. [code] Reproduce Figure 12.1: build the 8 cube corners, rotate by $R_x(20°)\,R_y(30°)$, project orthographically (drop $z$), and draw the 12 edges with matplotlib. Then change the rotation angles and observe how the wireframe changes.
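
One possible starting point for Exercises 19 and 20 is sketched below. The function names follow the exercise statements; the internal layout, and the choice to store the unit square's corners as columns, are assumptions of this sketch rather than the book's reference solution.

```python
import numpy as np

def translation(tx, ty):
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

def rotation(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[  c,  -s, 0.0],
                     [  s,   c, 0.0],
                     [0.0, 0.0, 1.0]])

def scaling(sx, sy):
    return np.array([[ sx, 0.0, 0.0],
                     [0.0,  sy, 0.0],
                     [0.0, 0.0, 1.0]])

def model_matrix(tx, ty, theta, sx, sy):
    # Scale first, then rotate, then translate (rightmost acts first).
    return translation(tx, ty) @ rotation(theta) @ scaling(sx, sy)

# Exercise 19: compare with the matrix quoted from the hand computation.
M = model_matrix(3, 1, np.pi / 4, 2, 2)
expected = np.array([[1.414, -1.414, 3.0],
                     [1.414,  1.414, 1.0],
                     [0.0,    0.0,   1.0]])
print(np.allclose(M, expected, atol=1e-3))

# Exercise 20: unit-square corners as homogeneous columns.
corners = np.array([[0.0, 1.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0, 1.0],
                    [1.0, 1.0, 1.0, 1.0]])
moved = M @ corners
print(moved[:2].T)   # one transformed corner per row
```

Storing the corners as columns lets a single matrix product transform the whole shape, which is the same trick Exercises 23 to 25 need for the cube's eight corners.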


⭐⭐⭐⭐ Application

24. [code] Render a frame. Build a small 2D "scene" of two shapes — a square and a triangle — each with its own model matrix (different position, rotation, and scale). Transform every vertex by its model matrix and plot both shapes on one figure. Then apply a single shared view translation (simulating a camera pan) to both and re-plot; confirm both shapes shift together.

25. [code] Perspective vs. orthographic. Take the rotated cube of Exercise 23, but now (a) translate it to be centered at depth $z = 4$ in front of the camera and (b) project it with perspective, $(x, y, z) \mapsto (x/z, y/z)$, instead of orthographically. Plot the orthographic and perspective wireframes side by side. Describe how the near face appears larger than the far face under perspective but the same size under orthographic projection: this is the visible signature of the $1/z$ shrink.

26. (application, hand + reasoning) The clock hand. A clock's hour hand is a line segment from its pivot at world position $(3, 5)$ to a tip $2$ units along the $+x$ direction. Using the pivot-rotation sandwich $M = T(3,5)\,R(\theta)\,T(-3,-5)$, find where the tip lands after a $90°$ rotation. Verify the pivot itself does not move. Why would naively applying $R(90°)$ alone (without the translations) send the hand to the wrong place?

27. (application, open-ended) Why $4\times 4$ everywhere? A junior engineer proposes storing 3D rotations and scales as $3\times 3$ matrices and translations as separate vectors, "to save memory." Explain, using the composition argument of §12.5–§12.6, what breaks: in particular, how composing a parent's and child's transforms in a scene graph becomes awkward, and why unifying everything as $4\times 4$ homogeneous matrices makes the scene-graph product clean. (Connect to the skeletal-animation Real-World Application.)
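
To make the comparison in Exercise 27 concrete, the sketch below composes a parent and a child transform in both representations. Everything in it is an illustrative assumption: the helper names `pack` and `compose_pairs`, and the particular parent and child values, are hypothetical and do not come from the chapter.

```python
import numpy as np

def pack(A, b):
    """Assemble a 4x4 homogeneous matrix from a 3x3 block A and a translation b."""
    M = np.eye(4)
    M[:3, :3] = A
    M[:3, 3] = b
    return M

def compose_pairs(parent, child):
    """Compose (A, b) pairs by hand: x -> A1 (A2 x + b2) + b1."""
    A1, b1 = parent
    A2, b2 = child
    return A1 @ A2, A1 @ b2 + b1

# Hypothetical parent: rotate 30 degrees about z, then shift by (1, 2, 0).
c, s = np.cos(np.radians(30.0)), np.sin(np.radians(30.0))
A_parent = np.array([[  c,  -s, 0.0],
                     [  s,   c, 0.0],
                     [0.0, 0.0, 1.0]])
b_parent = np.array([1.0, 2.0, 0.0])

# Hypothetical child: uniform scale by 2, then shift by (0, 1, 0).
A_child = np.diag([2.0, 2.0, 2.0])
b_child = np.array([0.0, 1.0, 0.0])

# Separate storage needs a special two-part rule at every level of the graph.
A_world, b_world = compose_pairs((A_parent, b_parent), (A_child, b_child))

# Homogeneous storage needs only the ordinary matrix product.
M_world = pack(A_parent, b_parent) @ pack(A_child, b_child)

print(np.allclose(M_world, pack(A_world, b_world)))
```

Both routes give the same world transform, but the pair version hard-codes the rule $(A_1 A_2,\ A_1\mathbf b_2 + \mathbf b_1)$ from Exercise 16, while the $4\times 4$ version gets it for free from matrix multiplication and extends to any chain length with a plain product.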