Chapter 12 Quiz — Computer Graphics

DataField.Dev

Twelve conceptual checks on homogeneous coordinates, transform composition, projection, and the rendering pipeline. Try each before opening the answer — these test understanding, not arithmetic speed.

Q1. Why can a linear transformation — and therefore an ordinary $n\times n$ matrix — never perform a translation?

Answer

Because every linear map fixes the origin: $T(\mathbf 0) = \mathbf 0$ (set the scalar $c$ to zero in the homogeneity rule $T(c\,\mathbf v) = c\,T(\mathbf v)$). A nonzero translation sends the origin to $\mathbf t \neq \mathbf 0$, so it cannot be linear. Translation is *affine* (linear-plus-shift), and a bare matrix has no slot for the constant shift.
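A quick numerical check of this argument, as a minimal sketch: the matrix `A` and the shift `t` are arbitrary illustrative values, not anything from the chapter.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 3.0]])      # any 2x2 matrix (illustrative values)
t = np.array([4.0, -1.0])       # a nonzero shift (illustrative values)

origin = np.zeros(2)

linear_image = A @ origin       # a matrix always sends the origin to itself
translated = origin + t         # a translation moves the origin to t

print(linear_image)             # [0. 0.]
print(translated)               # [ 4. -1.]
```

Whatever entries `A` holds, `A @ origin` stays at zero, so no choice of `A` can reproduce the shift.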

Q2. What is the homogeneous-coordinates trick, in one sentence, and why does it make translation a matrix multiply?

Answer

Append a $1$ to every point (lift $(x,y)$ to $(x,y,1)$), and work with $(n{+}1)\times(n{+}1)$ matrices; translation then becomes the last column of the matrix, because that column gets multiplied by the pinned-to-$1$ last coordinate, producing a *constant* shift in the output. Geometrically, translation in $n$ dimensions is a *shear* in $n{+}1$ dimensions.
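The 2D case can be sketched in NumPy as follows. The translation amounts and the sample point are illustrative assumptions; the second multiply uses a vector off the $w = 1$ slice only to make the shear visible.

```python
import numpy as np

tx, ty = 3.0, -2.0                    # illustrative translation amounts
T = np.array([[1.0, 0.0, tx],
              [0.0, 1.0, ty],
              [0.0, 0.0, 1.0]])

p = np.array([5.0, 7.0, 1.0])         # the 2D point (5, 7) lifted to w = 1
moved = T @ p                         # [8. 5. 1.]: shifted by (tx, ty)

# The same matrix on the slice w = 2 shifts by 2 * (tx, ty).
# A shift proportional to the last coordinate is exactly a shear.
q = np.array([5.0, 7.0, 2.0])
sheared = T @ q                       # [11.  3.  2.]
```

On the $w = 1$ slice the shear looks like a constant shift, which is why pinning the last coordinate to $1$ is enough.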

Q3. In a $3\times 3$ homogeneous matrix, where do the translation amounts $(t_x, t_y)$ live, and what is the role of the bottom row $\begin{bmatrix}0 & 0 & 1\end{bmatrix}$?

Answer

The translation amounts occupy the **third column** (entries $(0,2)$ and $(1,2)$ in 0-indexed numpy). The bottom row keeps the homogeneous coordinate equal to $1$ after the multiply, so the result is again a valid 2D point on the $w = 1$ slice. Relaxing that bottom row is exactly what produces *perspective* (Q9).

Q4. You compose a model matrix $M = TRS$. Which transformation is applied to a vertex first, and why is the translation written on the left?

Answer

The **rightmost** matrix, $S$ (scaling), is applied first, because $M\mathbf p = T(R(S\mathbf p))$ — $S$ touches $\mathbf p$ first, then $R$, then $T$. Translation $T$ is on the left because it acts *last*. This is Chapter 8's "rightmost acts first" rule for compositions, with column vectors and left-multiplication.
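A small sketch can confirm the order numerically. The helper functions `translate`, `rotate`, and `scale` are hypothetical names written for this example, and the specific amounts are arbitrary.

```python
import numpy as np

def translate(tx, ty):
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

def rotate(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c,  -s,  0.0],
                     [s,   c,  0.0],
                     [0.0, 0.0, 1.0]])

def scale(sx, sy):
    return np.diag([sx, sy, 1.0])

T, R, S = translate(10.0, 0.0), rotate(np.pi / 2), scale(2.0, 2.0)
M = T @ R @ S

p = np.array([1.0, 0.0, 1.0])          # the point (1, 0)

scaled = S @ p                         # (2, 0): scaling acts first
rotated = R @ scaled                   # (0, 2) up to rounding: then rotation
placed = T @ rotated                   # (10, 2): translation acts last

all_at_once = M @ p                    # same result as the three steps
```

Because matrix multiplication is associative, `M @ p` and the step-by-step chain agree up to floating-point rounding.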

Q5. Does $TR$ equal $RT$ for a translation $T$ and a rotation $R$? What does this mean for placing objects?

Answer

No. Matrix multiplication is not commutative (Chapter 8). "Rotate then translate" ($TR$) rotates the object about the world origin, where a model centered there simply turns in place, and then shifts it; "translate then rotate" ($RT$) moves it out first and then swings *that position* on a wide arc about the origin. They give different matrices and different final positions. Get the order wrong and your object lands in the wrong place every frame.
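To see the two orders disagree, one can track where the center of an object that starts at the origin ends up. This is a sketch with an assumed 90° rotation and a shift of 10 along $x$.

```python
import numpy as np

theta = np.pi / 2
c, s = np.cos(theta), np.sin(theta)
R = np.array([[c,  -s,  0.0],
              [s,   c,  0.0],
              [0.0, 0.0, 1.0]])
T = np.array([[1.0, 0.0, 10.0],
              [0.0, 1.0,  0.0],
              [0.0, 0.0,  1.0]])

center = np.array([0.0, 0.0, 1.0])         # object center at the origin

rotate_then_translate = T @ R @ center     # ends at (10, 0)
translate_then_rotate = R @ T @ center     # ends at (0, 10), up to rounding

same_matrix = np.allclose(T @ R, R @ T)    # False
```

With $TR$ the center is unaffected by the rotation and simply moves to $(10, 0)$; with $RT$ it is moved out first and then carried a quarter turn around the origin.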

Q6. Two graphics systems disagree on conventions: one uses column vectors with $\mathbf p' = M\mathbf p$, the other uses row vectors with $\mathbf p' = \mathbf p\, M$. How are their matrices related, and what happens if you mix the conventions?

Answer

The two systems' matrices are **transposes** of each other, and the row-vector convention *reverses* the multiplication order (first transform on the left). Both are correct internally. Mixing them — feeding a column-convention matrix into a row-convention multiply — silently transposes your transforms, producing objects that rotate the wrong way or fly off-screen. Pick one convention and never cross it.
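The relationship can be checked directly. In this sketch the column-convention matrix is "rotate, then translate", with illustrative values; `mixed` shows what the wrong pairing does.

```python
import numpy as np

theta = np.pi / 2
c, s = np.cos(theta), np.sin(theta)
R = np.array([[c,  -s,  0.0],
              [s,   c,  0.0],
              [0.0, 0.0, 1.0]])
T = np.array([[1.0, 0.0, 10.0],
              [0.0, 1.0,  0.0],
              [0.0, 0.0,  1.0]])

p = np.array([1.0, 0.0, 1.0])

# Column convention: p' = M p, rightmost factor acts first.
M_col = T @ R
out_col = M_col @ p                     # (10, 1)

# Row convention: p' = p M, with the transposed matrix.
# Transposing also reverses the product: (T R)^T = R^T T^T.
M_row = M_col.T
out_row = p @ M_row                     # same point

# Mixing conventions: a column-convention matrix in a row-style multiply.
mixed = p @ M_col                       # last coordinate is no longer 1
```

Here `mixed` comes out as roughly `[0, -1, 11]`, so even the homogeneous coordinate is corrupted, which is one way such a bug shows up.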

Q7. What is the difference between orthographic and perspective projection, geometrically?

Answer

**Orthographic** projects along parallel rays (depth simply discarded), so objects keep their size regardless of distance and parallel lines stay parallel — right for CAD/engineering drawings. **Perspective** divides by depth ($x/z, y/z$), so farther objects shrink and parallel lines converge to a vanishing point — right for mimicking human vision and photographs. The difference is the $1/z$ factor.
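A minimal comparison, assuming two sample points that share the same $(x, y)$ but sit at different depths:

```python
import numpy as np

# Rows are (x, y, z); same x and y, the second point is twice as far away.
points = np.array([[1.0, 1.0, 2.0],
                   [1.0, 1.0, 4.0]])

ortho = points[:, :2]                     # drop z: both land on (1, 1)
persp = points[:, :2] / points[:, 2:3]    # divide by depth

print(ortho)   # [[1. 1.]
               #  [1. 1.]]
print(persp)   # [[0.5  0.5 ]
               #  [0.25 0.25]]
```

Under orthographic projection the two points coincide on screen; under perspective the farther one lands closer to the center, which is the shrink with distance.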

Q8. Orthographic projection (dropping the $z$-coordinate) is what kind of matrix from earlier chapters, and what is the consequence?

Answer

It is a **singular** (non-invertible) projection — the same flattening map as Chapter 7's projection onto a line, now in 3D, with $\det = 0$ as a map of 3D space (Chapter 11). The consequence is irreversibility: you cannot recover an object's depth from its projected position. Depth is stored separately (in a depth buffer) precisely because the screen position alone has lost it.
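Both claims can be checked in a few lines. The sketch writes the projection as a $3\times 3$ map that zeroes $z$, and the two sample points are illustrative.

```python
import numpy as np

# Orthographic projection as a map of 3D space: keep x and y, zero out z.
P_ortho = np.diag([1.0, 1.0, 0.0])

det = np.linalg.det(P_ortho)              # 0.0, so no inverse exists

near = np.array([1.0, 2.0, 3.0])
far = np.array([1.0, 2.0, 30.0])

# Two points at different depths collapse to the same image,
# so the depth cannot be recovered from the image alone.
same_image = np.allclose(P_ortho @ near, P_ortho @ far)   # True
```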

Q9. How does the perspective projection matrix produce the $1/z$ shrink, given that division is not a linear operation?

Answer

The perspective matrix's bottom row copies $z$ into the homogeneous coordinate $w$ (instead of leaving $w = 1$). So $(x,y,z,1)$ becomes $(x,y,z,z)$, with $w = z$. The **perspective divide** — dividing all coordinates by $w$, which homogeneous coordinates always require to read off the real point — then yields $(x/z, y/z)$. The "impossible" division is the normalization step we were going to do anyway.
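Here is a stripped-down sketch of that mechanism. The matrix `P` below only copies $z$ into $w$, as described above; real projection matrices also encode field of view and depth range, which are left out here.

```python
import numpy as np

# Bottom row (0, 0, 1, 0) copies z into w; everything else passes through.
P = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])

p = np.array([2.0, 4.0, 8.0, 1.0])     # the point (2, 4, 8) with w = 1

clip = P @ p                           # [2. 4. 8. 8.]: now w = z
ndc = clip / clip[3]                   # perspective divide

print(ndc[:2])                         # [0.25 0.5 ]  i.e. (x/z, y/z)
```

The matrix multiply itself is linear; the only nonlinear step is the final division by $w$.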

Q10. Name the five coordinate spaces of the rendering pipeline in order, and the matrix between each pair.

Answer

**Model (local) space** $\xrightarrow{\text{model matrix } M}$ **world space** $\xrightarrow{\text{view matrix } V}$ **camera space** $\xrightarrow{\text{projection matrix } P\ (+\text{ divide})}$ **clip / NDC space** $\xrightarrow{\text{viewport}}$ **screen space**. The whole journey is the composition $V_{port}\, P\, V\, M$ applied to each vertex, with a perspective divide after $P$.
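The chain can be traced for a single vertex. Everything concrete in this sketch is an assumption made for illustration: the `translate` helper, the placement values, the simplified projection that only sets $w = z$, and an 800×600 viewport with $y$ flipped so that screen $y$ grows downward.

```python
import numpy as np

def translate(tx, ty, tz):
    """4x4 homogeneous translation (hypothetical helper for this sketch)."""
    T = np.eye(4)
    T[:3, 3] = [tx, ty, tz]
    return T

M = translate(1.0, 0.0, 0.0)            # model -> world
C = translate(0.0, 0.0, -4.0)           # camera placement in the world
V = np.linalg.inv(C)                    # world -> camera
P = np.array([[1.0, 0.0, 0.0, 0.0],     # camera -> clip (simplified: w = z)
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])

width, height = 800, 600                # assumed screen size
V_port = np.array([[width / 2, 0.0,         0.0, width / 2],
                   [0.0,       -height / 2, 0.0, height / 2],
                   [0.0,       0.0,         1.0, 0.0],
                   [0.0,       0.0,         0.0, 1.0]])

vertex = np.array([1.0, 1.0, 0.0, 1.0]) # model-space vertex

clip = P @ V @ M @ vertex               # [2. 1. 4. 4.]
ndc = clip / clip[3]                    # perspective divide after P
screen = V_port @ ndc                   # pixel coordinates

print(screen[:2])                       # [600. 225.]
```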

Q11. Why is the view (camera) matrix the inverse of the camera's placement in the world?

Answer

Seeing the world *from* the camera means undoing the act of placing the camera *in* the world (Chapter 9's inverse). If $C$ positions the camera, then $V = C^{-1}$ drags the whole world into the camera's frame: a camera at $(0,0,5)$ corresponds to translating the world by $(0,0,-5)$. There is no separate "camera" in the math — moving the camera right is the same as moving the world left.
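The example in the answer can be reproduced numerically. This sketch uses a translation-only camera and a hypothetical `translate` helper.

```python
import numpy as np

def translate(tx, ty, tz):
    """4x4 homogeneous translation (hypothetical helper for this sketch)."""
    T = np.eye(4)
    T[:3, 3] = [tx, ty, tz]
    return T

C = translate(0.0, 0.0, 5.0)            # place the camera at (0, 0, 5)
V = np.linalg.inv(C)                    # view matrix

# The inverse is a translation of the world by (0, 0, -5).
is_opposite_shift = np.allclose(V, translate(0.0, 0.0, -5.0))   # True

world_origin = np.array([0.0, 0.0, 0.0, 1.0])
in_camera_space = V @ world_origin      # [ 0.  0. -5.  1.]
```

From the camera's point of view, the world origin sits 5 units away along $-z$, exactly as if the world had been moved instead of the camera.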

Q12. A homogeneous triple $(X, Y, 0)$ has last coordinate zero, so it represents no finite 2D point. What does it represent, and where does it show up in graphics?

Answer

It is a **point at infinity**: a pure *direction* rather than a location, since the usual normalization $(X/0, Y/0)$ is undefined and yields no finite point. These are the vanishing points of perspective: the place where parallel lines "meet at infinity" is exactly such a point, which the perspective transform maps to a finite spot on the screen. They are the entryway to projective geometry.
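Both properties can be illustrated in a short sketch. The first part uses the 2D triple from the question; the second moves to the 3D analogue $(X, Y, Z, 0)$ with a simplified perspective matrix that only sets $w = z$. All numeric values are illustrative.

```python
import numpy as np

# Part 1: a translation leaves a w = 0 triple unchanged, as a direction should be.
T = np.array([[1.0, 0.0, 3.0],
              [0.0, 1.0, -2.0],
              [0.0, 0.0, 1.0]])
direction_2d = np.array([1.0, 2.0, 0.0])
unchanged = T @ direction_2d                     # [1. 2. 0.]

# Part 2: perspective maps a 3D direction to a finite vanishing point.
P = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0, 0.0]])
d = np.array([1.0, 0.0, 2.0, 0.0])               # direction with w = 0
vp = P @ d                                       # [1. 0. 2. 2.]
vanishing_point = vp[:2] / vp[3]                 # [0.5 0. ]

# Two parallel lines with direction d, followed very far out,
# project to points that approach the same vanishing point.
starts = [np.array([0.0, 0.0, 1.0, 1.0]),
          np.array([3.0, -2.0, 1.0, 1.0])]
far_images = []
for a in starts:
    q = P @ (a + 1e6 * d)
    far_images.append(q[:2] / q[3])
```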