Case Study 15.1 — Color Spaces: One Color, Many Coordinate Lists

DataField.Dev

A color is a vector, and a "color space" is a basis

Open any image editor and you will find a color described by three numbers, usually red, green, and blue, each from 0 to 255. That triple is not arbitrary: it is a coordinate vector. The hidden basis is the set of three primary lights $\{\mathbf{R}, \mathbf{G}, \mathbf{B}\}$, and a pixel's color is the linear combination "this much red, this much green, this much blue." Human color vision is, to an excellent approximation, three-dimensional, because our retinas carry three cone types. Colors can therefore be modeled as vectors in a three-dimensional space (physically realizable colors fill only part of it, since light intensities cannot be negative), and any three independent primaries form a basis for it. RGB is just the most familiar choice of rulers. This case study makes the abstract machinery of Chapter 15 tangible: we will take a single fixed color and read its coordinates against several different bases, watching the numbers change while the color does not.

The thread to Chapter 15. A color is a vector in a 3D space. "RGB," "the opponent process," "YIQ," "YCbCr" — these are different bases for the same space. Converting between color spaces is exactly the change-of-coordinates computation of §15.5 and §15.8: the color is invariant, only its coordinate list changes.

The standard basis: RGB

Take a warm orange whose RGB coordinates are $\mathbf{c} = (200, 150, 50)$. In the language of Chapter 15, the standard basis here is $$\mathbf{R} = (1, 0, 0), \qquad \mathbf{G} = (0, 1, 0), \qquad \mathbf{B} = (0, 0, 1),$$ and the coordinates are the entries, for exactly the reason §15.6 gave: with $B = I$, the coordinate equation $I\mathbf{x} = \mathbf{c}$ has the solution $\mathbf{x} = \mathbf{c}$. So this color is "200 of red, 150 of green, 50 of blue." Nothing to compute: RGB is the basis in which the address and the arrow coincide.

But RGB is a poor basis for many tasks. It mixes brightness and color together: dim every channel and you darken the pixel, but brightness is smeared across all three numbers rather than isolated in one. For compression, for adjusting contrast without shifting hue, and for matching how the human visual system actually encodes color, engineers switch to a basis that separates brightness from chromatic content.

A second basis: an opponent-process frame

Human vision does not send raw red/green/blue to the brain; the retina re-encodes the signal into one luminance (light–dark) channel and two opponent color channels (roughly red-vs-green and blue-vs-yellow). We can model a simplified opponent basis whose three vectors, written in RGB coordinates, are $$\mathbf{p}_1 = (1, 1, 1), \qquad \mathbf{p}_2 = (1, -1, 0), \qquad \mathbf{p}_3 = (0, 1, -1).$$ The first is the achromatic "all channels together" direction (the gray axis); the second is a red-minus-green contrast; the third a green-minus-blue contrast. Are these a basis? Stack them as columns and check the determinant: it is $3 \neq 0$, so the three vectors are independent, and three independent vectors in $\mathbb{R}^3$ automatically span (§15.6). They form a basis $\mathcal{P}$ of color space.
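For readers who want to see where the $3$ comes from, here is one way to get it by hand: a cofactor expansion along the first row of $P$, whose columns are $\mathbf{p}_1, \mathbf{p}_2, \mathbf{p}_3$. Any other valid method gives the same value.

$$\det P = \begin{vmatrix} 1 & 1 & 0 \\ 1 & -1 & 1 \\ 1 & 0 & -1 \end{vmatrix} = 1\begin{vmatrix} -1 & 1 \\ 0 & -1 \end{vmatrix} - 1\begin{vmatrix} 1 & 1 \\ 1 & -1 \end{vmatrix} + 0 = 1\cdot(1) - 1\cdot(-2) = 3.$$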

Now find the orange's coordinates in $\mathcal{P}$ by solving $P\mathbf{c}_{\mathcal{P}} = \mathbf{c}$, where $P = [\mathbf{p}_1 \mid \mathbf{p}_2 \mid \mathbf{p}_3]$ and $\mathbf{c} = (200, 150, 50)$ is the RGB color. This is the standard coordinate computation of §15.5.

# One color, two bases: RGB coordinates -> opponent-basis coordinates.
import numpy as np
c_rgb = np.array([200., 150., 50.])             # the orange, in RGB
P = np.column_stack([[1.,1,1], [1.,-1,0], [0.,1,-1]])  # opponent basis (cols in RGB)
print("det P =", round(np.linalg.det(P), 3))    # -> 3.0  (so P is a basis)
c_opp = np.linalg.solve(P, c_rgb)               # coordinates in the opponent basis
print("coords in P :", np.round(c_opp, 4))      # -> [133.3333  66.6667  83.3333]
print("reconstruct :", np.round(P @ c_opp, 4))  # -> [200. 150.  50.]  (= c_rgb)

Output:

det P = 3.0
coords in P : [133.3333  66.6667  83.3333]
reconstruct : [200. 150.  50.]

The same orange now reads as $\mathbf{c}_{\mathcal{P}} \approx (133.33,\ 66.67,\ 83.33)$ — three completely different numbers from $(200, 150, 50)$, yet they reconstruct the identical color. The first coordinate, $\approx 133.3$, is the projection onto the gray axis: it measures roughly how bright the color is, now cleanly separated into a single number. The other two encode the chromatic content. This is precisely the §15.8 phenomenon — one vector, two bases, two coordinate lists, same underlying object — now wearing the clothes of digital imaging.
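The same numbers can be obtained by hand, which also shows why the first coordinate tracks brightness. Writing the unknown coordinates as $(x_1, x_2, x_3)$ (symbols introduced here only for this derivation) and a general RGB color as $(R, G, B)$, the system $P\mathbf{c}_{\mathcal{P}} = \mathbf{c}$ reads

$$\begin{aligned} x_1 + x_2 &= R, \\ x_1 - x_2 + x_3 &= G, \\ x_1 - x_3 &= B. \end{aligned}$$

Adding the three equations cancels $x_2$ and $x_3$, leaving $3x_1 = R + G + B$. Back-substituting into the first and third equations gives

$$x_1 = \frac{R + G + B}{3}, \qquad x_2 = R - x_1 = \frac{2R - G - B}{3}, \qquad x_3 = x_1 - B = \frac{R + G - 2B}{3}.$$

For the orange, $(R, G, B) = (200, 150, 50)$ yields $x_1 = 400/3 \approx 133.33$, $x_2 = 200/3 \approx 66.67$, and $x_3 = 250/3 \approx 83.33$, matching the computed output. The first coordinate is simply the mean of the three channels. It coincides with the orthogonal projection onto the gray axis because $\mathbf{p}_2$ and $\mathbf{p}_3$ are both orthogonal to $\mathbf{p}_1$, even though they are not orthogonal to each other.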

A nice sanity check sharpens the intuition. Feed a pure gray, $(120, 120, 120)$, through the same conversion. Its opponent coordinates come out $(120, 0, 0)$: all the content sits in the luminance axis and zero in the two chromatic axes, exactly as it should for a colorless pixel. The opponent basis was designed so that "grayness" is a single coordinate; RGB hides that fact by spreading gray equally across all three numbers.
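The gray check is easy to run. The sketch below reuses the opponent matrix $P$ from the text. The helper name `to_opponent` is an assumption made for this illustration, not something defined in Chapter 15.

```python
import numpy as np

# Opponent basis vectors as columns, written in RGB coordinates (from the text).
P = np.column_stack([[1., 1, 1], [1., -1, 0], [0., 1, -1]])

def to_opponent(rgb):
    """Hypothetical helper: coordinates of an RGB color in the opponent basis P."""
    return np.linalg.solve(P, np.asarray(rgb, dtype=float))

gray = to_opponent([120, 120, 120])    # a colorless pixel
orange = to_opponent([200, 150, 50])   # the running example

print("gray   in P:", np.round(gray, 4))
print("orange in P:", np.round(orange, 4))
```

The gray pixel should put all of its content in the first coordinate and nothing in the other two (up to floating-point rounding), while the orange keeps nonzero chromatic coordinates.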

Why this matters. JPEG compression begins by converting RGB to a luminance-plus-chrominance basis (YCbCr, a close cousin of our opponent frame) and then throws away resolution in the chrominance coordinates — because human eyes are far more sensitive to brightness detail than color detail. That trick is only possible because the new basis isolates brightness in one coordinate. The compression is a statement about which coordinates you can afford to coarsen, and it is meaningless until you have chosen the right basis. The same logic — re-coordinatize, then discard the unimportant coordinates — is the seed of low-rank approximation and PCA in Part VI.
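A toy version of "coarsen the chrominance coordinates" can be sketched with the opponent basis from earlier in this case study. This is not JPEG's actual procedure, which uses YCbCr and reduces chroma resolution across neighboring pixels. Here a single pixel's two chromatic coordinates are snapped to a coarse grid, and the step size `STEP` is an arbitrary value chosen for illustration.

```python
import numpy as np

P = np.column_stack([[1., 1, 1], [1., -1, 0], [0., 1, -1]])  # opponent basis from the text
c_rgb = np.array([200., 150., 50.])                          # the orange, in RGB

c_opp = np.linalg.solve(P, c_rgb)   # (luminance, chroma 1, chroma 2)

# Toy "compression": keep the luminance coordinate exact and snap both
# chromatic coordinates to the nearest multiple of STEP.
STEP = 16.0                         # hypothetical step size, not a JPEG constant
c_coarse = c_opp.copy()
c_coarse[1:] = STEP * np.round(c_opp[1:] / STEP)

rgb_back = P @ c_coarse             # reconstruct an RGB color from the coarse coordinates

print("coarsened coords :", np.round(c_coarse, 4))
print("reconstructed RGB:", np.round(rgb_back, 4))
print("channel mean before/after:", c_rgb.mean(), rgb_back.mean())
```

The reconstructed color shifts slightly in hue, but its channel mean (the luminance coordinate in this basis) is untouched, because the two chromatic basis vectors each have entries summing to zero. In raw RGB there is no single coordinate that could be protected this way.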

A third basis: a real engineering standard (YIQ)

Color-space conversions used in actual broadcast and video are also just change-of-basis matrices, with carefully chosen (non-integer) coefficients. The classic NTSC television standard used the YIQ basis, related to RGB by a fixed $3\times 3$ matrix $M$: $$\begin{bmatrix} Y \\ I \\ Q \end{bmatrix} = \underbrace{\begin{bmatrix} 0.299 & 0.587 & 0.114 \\ 0.596 & -0.274 & -0.322 \\ 0.211 & -0.523 & 0.312 \end{bmatrix}}_{M}\begin{bmatrix} R \\ G \\ B \end{bmatrix}.$$ Here $M$ plays the role of the change-of-coordinates matrix $B^{-1}$ from §15.5 — it converts RGB coordinates directly into YIQ coordinates. The first row computes $Y$, the luminance, as a weighted average of R, G, B (green is weighted most heavily because the eye is most sensitive to it). For our orange,

# RGB -> YIQ is a change of coordinates by a fixed 3x3 matrix M.
import numpy as np
M = np.array([[0.299, 0.587, 0.114],
              [0.596,-0.274,-0.322],
              [0.211,-0.523, 0.312]])
c_rgb = np.array([200., 150., 50.])
print("YIQ coords:", np.round(M @ c_rgb, 3))    # -> [153.55  62.   -20.65]
print("det M    :", round(np.linalg.det(M), 5)) # -> -0.25389  (nonzero => invertible)

Output:

YIQ coords: [153.55  62.   -20.65]
det M    : -0.25389

The orange is $(153.55,\ 62.0,\ -20.65)$ in YIQ, a third coordinate list for the same color. The determinant of $M$ is nonzero, so $M$ is invertible and YIQ is a genuine coordinate system: you can always recover RGB by applying $M^{-1}$, whose columns are the YIQ basis vectors written in RGB coordinates. The famous backward compatibility of color TV with black-and-white sets was a coordinate trick: black-and-white receivers simply read the first coordinate $Y$ and ignored $I$ and $Q$. They displayed the luminance coordinate of a color they could not fully represent, which is possible only because the YIQ basis isolates brightness in one slot.
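Both claims, invertibility and the black-and-white trick, can be checked numerically. The sketch below uses the matrix $M$ from the text. Modeling a black-and-white receiver as "set $I$ and $Q$ to zero" is a simplification assumed here for illustration, not a description of the actual broadcast hardware.

```python
import numpy as np

# RGB -> YIQ matrix from the text.
M = np.array([[0.299, 0.587, 0.114],
              [0.596, -0.274, -0.322],
              [0.211, -0.523, 0.312]])
c_rgb = np.array([200., 150., 50.])

c_yiq = M @ c_rgb                      # YIQ coordinates of the orange
rgb_back = np.linalg.solve(M, c_yiq)   # equivalent to applying M^{-1}, without forming it

# Simplified black-and-white receiver: keep Y, discard I and Q.
bw_yiq = np.array([c_yiq[0], 0.0, 0.0])
bw_rgb = np.linalg.solve(M, bw_yiq)    # what that signal looks like in RGB

print("recovered RGB:", np.round(rgb_back, 3))
print("B&W as RGB   :", np.round(bw_rgb, 3))
```

The round trip returns the original orange, and the luminance-only signal comes back as a gray with all three channels equal to $Y$. That works because the first row of $M$ sums to 1 while the second and third rows each sum to 0, so $M$ sends the gray direction $(1, 1, 1)$ to $(1, 0, 0)$.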

The takeaway

A color never changes; its coordinate list changes with every choice of basis. RGB, the opponent frame, YIQ, YCbCr — all describe the same three-dimensional space of colors, and every conversion among them is the change-of-coordinates computation of Chapter 15: form the matrix of new-basis vectors, solve (or multiply by the precomputed inverse), and verify by reconstruction. The dimension is always three, because color perception has three degrees of freedom; which three rulers you lay down is an engineering choice, made to put the information you care about — usually brightness — into coordinates you can read, keep, or discard at will. When you reach Chapter 16, you will recognize every color-space conversion as a worked instance of change of basis, and when you reach Part VI you will see that "discard the unimportant coordinates" is the whole idea behind compression.
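As a closing illustration, conversions between two non-RGB bases can be composed into a single matrix. The sketch below builds a matrix, called `T` here purely for illustration, that takes opponent coordinates straight to YIQ coordinates by first mapping to RGB with $P$ and then applying $M$. It assumes only the two matrices already given in this case study.

```python
import numpy as np

P = np.column_stack([[1., 1, 1], [1., -1, 0], [0., 1, -1]])  # opponent basis (columns in RGB)
M = np.array([[0.299, 0.587, 0.114],
              [0.596, -0.274, -0.322],
              [0.211, -0.523, 0.312]])                        # RGB -> YIQ, from the text
c_rgb = np.array([200., 150., 50.])

c_opp = np.linalg.solve(P, c_rgb)   # orange in opponent coordinates

# Opponent -> RGB is multiplication by P; RGB -> YIQ is multiplication by M.
T = M @ P                           # hypothetical name for the composed conversion
c_yiq_direct = T @ c_opp            # opponent -> YIQ in one step
c_yiq_via_rgb = M @ c_rgb           # the route used in the text

print("direct  :", np.round(c_yiq_direct, 3))
print("via RGB :", np.round(c_yiq_via_rgb, 3))
print("T[:, 0] :", np.round(T[:, 0], 3))
```

Both routes should agree up to rounding. The first column of `T` is the YIQ image of the gray axis $\mathbf{p}_1$, and it lands on the $Y$ axis, which is one way to see that the two bases agree about which direction is "colorless."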