Case Study 2 — Encoding as a Linear Map: Error-Correcting Codes Through Kernel and Image

DataField.Dev

Case Study 2 — Encoding as a Linear Map: Error-Correcting Codes Through Kernel and Image

Field: computer science and communications. Every time a file is read from a scratched DVD, a message survives a noisy wireless channel, or a QR code is scanned at an angle, an error-correcting code is doing the work. At the heart of the most important family of these codes — the linear codes — is a single linear transformation: the encoding map. This case study shows that the design and analysis of linear codes is pure Chapter 35 material. The encoder is an injective linear map; its image is the set of valid codewords; error detection is membership in a kernel; and rank–nullity quantifies exactly how much redundancy buys how much protection.

Encoding is a linear map into a bigger space

Suppose we want to transmit messages that are blocks of three symbols, modeled as vectors $\mathbf{m}\in\mathbb{R}^3$. (Real codes work over the binary field $\{0,1\}$, but the linear-algebra structure is identical, and working over $\mathbb{R}$ lets us reuse every tool from this book without new machinery — we flag the binary specifics where they matter.) A raw three-symbol message has no protection: flip one symbol in transit and the receiver has no way to know. The idea of a code is to add redundancy by mapping each message to a longer codeword before sending it, so that valid codewords are spread far apart and a small corruption lands closer to the intended codeword than to any other.

The encoder is a linear map $E:\mathbb{R}^3\to\mathbb{R}^6$, given by a $6\times 3$ generator matrix $G$, with $E(\mathbf{m}) = G\mathbf{m}$. We use a systematic generator, which copies the message into the first three coordinates and appends three computed parity coordinates: $$ G = \begin{bmatrix} I_3 \\ P \end{bmatrix} = \begin{bmatrix} 1&0&0\\ 0&1&0\\ 0&0&1\\ 1&1&0\\ 0&1&1\\ 1&0&1 \end{bmatrix}. $$ The top block $I_3$ reproduces the message; the bottom block $P$ produces three parity checks, each a linear combination of the message symbols. This is a linear map by construction — matrix multiplication is the prototype linear transformation of Chapter 7 — so everything Chapter 35 says about abstract linear maps applies, now read as facts about the code.

Why the encoder must be injective: distinct messages, distinct codewords

The first requirement of any code is that different messages produce different codewords — otherwise the receiver could not tell them apart even with no noise at all. In the language of Chapter 35, the encoding map must be injective, and injectivity is exactly the condition $\ker E = \{\mathbf{0}\}$. For a matrix, that means $G$ has trivial null space, i.e. full column rank.

# The encoding map E(m) = G m for a systematic [6,3] generator matrix.
import numpy as np
P = np.array([[1,1,0],[0,1,1],[1,0,1]], float)
G = np.vstack([np.eye(3), P])             # 6x3 generator: I_3 on top, parity P below
print(G.astype(int))
print("rank:", np.linalg.matrix_rank(G))  # 3 -> full column rank -> injective encoder
m = np.array([1, 0, 1.])
print("encode (1,0,1) ->", (G @ m).astype(int))   # [1 0 1 1 1 2]

The rank is $3$, equal to the number of message symbols, so $G$ has trivial null space and the encoder is injective: no two messages collide. The message $(1,0,1)$ encodes to the codeword $(1,0,1,1,1,2)$ — the message in the first three slots, the three parity values $(1,1,2)$ in the last three. (Over the binary field those parities would reduce mod $2$ to $(1,1,0)$; the linear structure is the same.) Rank–nullity here reads $\dim\ker E + \dim\operatorname{im}E = 3$, and since $\dim\ker E=0$, the image has dimension $3$: the encoder faithfully reproduces all three dimensions of message space inside the six-dimensional codeword space, losing nothing.

The image is the code: a 3-dimensional subspace of a 6-dimensional space

The set of all valid codewords is exactly the image of the encoding map, $\operatorname{im}E = \{G\mathbf{m} : \mathbf{m}\in\mathbb{R}^3\}$ — the column space of $G$. This image is a three-dimensional subspace of $\mathbb{R}^6$, and that gap between dimensions is the whole point of coding: there are six coordinates of room but only three dimensions of legitimate messages, so the valid codewords occupy a thin slice of the ambient space. Anything off that slice is, by definition, not a codeword — it could only have arisen from corruption.

This reframes error detection as a beautifully linear question: a received vector is a valid codeword if and only if it lies in the image of $E$. Membership in a subspace is something linear algebra answers instantly, and the standard device is the complementary map that vanishes precisely on the image.

Error detection as membership in a kernel: the parity-check map

To test whether a received vector lies in the code, we build a second linear map whose kernel is the code. The parity-check matrix $H$ is a $3\times 6$ matrix designed so that $HG = 0$ — every codeword is annihilated by $H$. For our systematic generator, $H = [\,-P \mid I_3\,]$ does the job: $$ H = \begin{bmatrix} -1&-1&0&1&0&0\\ 0&-1&-1&0&1&0\\ -1&0&-1&0&0&1 \end{bmatrix}. $$ The product $H\mathbf{r}$ of the parity-check matrix with a received vector $\mathbf{r}$ is called the syndrome. The defining identity $HG=0$ says exactly that the image of the encoder sits inside the kernel of the parity check — and a rank count shows the two coincide.

# Parity-check map H : codewords are exactly its kernel. Syndrome detects errors.
import numpy as np
H = np.hstack([-P, np.eye(3)])            # 3x6 parity-check matrix [-P | I_3]
print("H @ G == 0 ?", np.allclose(H @ G, 0))    # True -> every codeword is killed by H
print("rank H:", np.linalg.matrix_rank(H))      # 3 -> ker H has dim 6 - 3 = 3 = im E

c = G @ np.array([1, 0, 1.])              # a genuine codeword
print("syndrome of codeword:", (H @ c).astype(int))     # [0 0 0] -> in the code

r = c.copy(); r[3] += 1                   # corrupt one coordinate in transit
print("syndrome of corrupted:", (H @ r).astype(int))    # [1 0 0] -> nonzero: error flagged

The output confirms the entire structure. First, $HG=0$: every codeword has zero syndrome. Second, $H$ has rank $3$, so by rank–nullity its kernel has dimension $6-3=3$ — the same dimension as the image of $E$. Two three-dimensional subspaces of $\mathbb{R}^6$, one nested in the other, of equal dimension, must be equal: $\operatorname{im}E = \ker H$. The code is simultaneously "the image of the encoder" and "the kernel of the parity check," and that double description is the engine of decoding. A clean codeword has zero syndrome; the corrupted vector $\mathbf{r}$ (one coordinate bumped) has syndrome $(1,0,0)\neq\mathbf{0}$, so the receiver knows an error occurred without ever knowing the original message. The nonzero syndrome even points at the error: in a binary Hamming code, the syndrome is the binary index of the flipped bit, so the receiver can correct a single-bit error by reading the syndrome as an address.

Rank–nullity as the fundamental trade-off of coding

The numbers $6$, $3$, and $3$ are not arbitrary; they are the parameters of the code, and rank–nullity is the law relating them. A linear code is specified as an $[n,k]$ code: $k$ message symbols encoded into $n$ codeword symbols. The encoding map $E:\mathbb{R}^k\to\mathbb{R}^n$ is injective (so $\dim\ker E=0$ and $\dim\operatorname{im}E=k$), the code is a $k$-dimensional subspace of $\mathbb{R}^n$, and the parity-check map $H:\mathbb{R}^n\to\mathbb{R}^{n-k}$ has the code as its kernel, so $\dim\ker H = k$ forces $\operatorname{rank}H = n-k$. The number of parity checks is $n-k$, the redundancy of the code — and it is precisely the rank of the parity-check map, dictated by rank–nullity.

This makes the central engineering trade-off a linear-algebra identity. More redundancy ($n-k$ large) spreads codewords farther apart and corrects more errors, but wastes more channel capacity; less redundancy is efficient but fragile. The dimension of the code, $k$, plus the redundancy, $n-k$, must equal the block length $n$ — there is no free lunch, and the "lunch" is literally accounted for by $\dim\ker H + \dim\operatorname{im}H = n$. Designing a good code is choosing a $k$-dimensional subspace of $\mathbb{R}^n$ (equivalently, a generator $G$ or a parity-check $H$) whose codewords are as far apart as possible for the redundancy spent — an optimization over subspaces, which is to say, over the images and kernels of linear maps.

The Chapter 35 dictionary, applied

Step back and read off how cleanly the abstract vocabulary maps onto the concrete technology:

Encoding = a linear map $E=G$ from message space to codeword space.
"Distinct messages stay distinct" = $E$ is injective = $\ker E=\{\mathbf{0}\}$ = $G$ has full column rank.
The set of valid codewords = the image of $E$ = the column space of $G$ = a $k$-dimensional subspace of $\mathbb{R}^n$.
Error detection = testing membership in the image = testing membership in the kernel of the parity-check map $H$, via the syndrome $H\mathbf{r}$.
Redundancy $n-k$ = the rank of $H$, fixed by rank–nullity $\dim\ker H+\operatorname{rank}H=n$.

Not one of these statements required an idea from outside this chapter. The encoding map is a linear transformation; its kernel and image are the null space and column space of Chapter 13; rank–nullity governs the trade-off between message size and protection. A computer scientist designing a code and a calculus student solving a differential equation (Case Study 1) are working the same structure — preimages, kernels, and images of linear maps — which is exactly the unifying claim the chapter set out to prove. The abstraction earns its keep: master linear maps once, and error-correcting codes, differential equations, and quantum observables are all the same subject seen from different windows.