Case Study 1 — The Equalizer in Your Pocket: Sound as a Vector in a Function Space

DataField.Dev

Slide the bass up on your music app and the song gets warmer; pull the treble down and the cymbals soften. That little row of sliders — a graphic equalizer — feels like a piece of audio magic, but it is, top to bottom, an exercise in the chapter you just read. A sound is a function, the space of all sounds is a vector space, and the equalizer does nothing more exotic than take linear combinations of building-block functions. This case study traces how the abstract idea "functions are vectors" turns into the most-used tool in audio software, and why the vector-space axioms are exactly what make it work. It is a story with no physics in it beyond "air wiggles" — the content is signal processing and the linear algebra of Chapter 5.

A sound is a function, and functions are vectors

Strike a tuning fork and the air pressure at your eardrum rises and falls smoothly over time. Record it, and you have captured a function: pressure as a function of time, $p(t)$. A whole song is a more complicated function — a wildly wiggling curve — but a function nonetheless. Digitally, we don't store the continuous curve; we sample it, measuring the pressure tens of thousands of times per second (CD audio uses 44,100 samples per second). Each second of mono audio becomes a vector of 44,100 numbers, exactly the sampling idea from Section 5.6 of the chapter, just with a very fine grid.
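
As a minimal sketch of that sampling step, the snippet below turns one second of a pure tone into a vector of 44,100 numbers. The 440 Hz test frequency and the variable names are assumptions chosen for illustration; they do not come from the chapter.

```python
import numpy as np

SAMPLE_RATE = 44_100      # CD-audio rate: samples per second
FREQ_HZ = 440.0           # assumed test tone, chosen only for illustration

# Sample instants for one second: 0, 1/44100, 2/44100, ...
t = np.arange(SAMPLE_RATE) / SAMPLE_RATE

# The continuous function p(t) = sin(2*pi*f*t), evaluated on the grid
p = np.sin(2 * np.pi * FREQ_HZ * t)

print(p.shape)            # (44100,)
```

The continuous curve is gone; what remains is an ordinary vector with one entry per sample instant.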

Now recall the punchline of Section 5.4: the set of all such functions, under pointwise addition and scalar multiplication, is a vector space. That single fact licenses everything an audio engineer does. "Playing two tracks at once" is vector addition — the combined waveform is the sum of the two waveforms, sample by sample. "Turning up the volume" is scalar multiplication — multiply every sample by a gain factor. "Silence" is the zero vector — the function that is zero at every instant. "Inverting a signal" (the trick behind noise-cancelling headphones) is taking the additive inverse $-p(t)$, the waveform that cancels $p(t)$ to silence. Every operation on the mixing board is one of the two vector-space operations, and the axioms are the guarantee that they behave sanely — that the order you add tracks in doesn't matter (commutativity), that turning the master volume up by $2$ then by $3$ equals turning it up by $6$ once (Axiom 6), and so on. Sound engineers rely on these properties every minute without ever naming them; the chapter named them.

# Two tracks add as vectors; "volume" is scalar multiplication. (Axioms 0, 1, 6.)
import numpy as np
t  = np.linspace(0, 1, 8, endpoint=False)   # 8 samples of one second (toy grid)
s1 = np.sin(2*np.pi*1*t)                     # a 1 Hz tone, sampled -> a vector
s2 = 0.5*np.sin(2*np.pi*2*t)                 # a 2 Hz tone at half volume -> a vector
mix = s1 + s2                                # "play both at once" = vector addition
print(np.round(mix, 3))   # [ 0.    1.207 1.    0.207 0.   -0.207 -1.   -1.207]
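print(np.allclose(s1 + s2, s2 + s1))    # True  -- track order doesn't matter (commutativity)
print(np.allclose(3*(2*mix), 6*mix))    # True  -- gain 2 then gain 3 equals gain 6 (Axiom 6)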
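
Silence and signal inversion can be checked the same way. The sketch below uses the same toy grid; the names `silence`, `anti`, and `cancelled` are illustrative, and a real noise-cancelling system also has to estimate the incoming sound, which is not modeled here.

```python
import numpy as np

t = np.linspace(0, 1, 8, endpoint=False)   # same toy grid as above
p = np.sin(2 * np.pi * 1 * t)              # a sampled 1 Hz tone

silence = np.zeros_like(p)                 # the zero vector: zero at every instant
anti = -p                                  # the additive inverse of p

cancelled = p + anti                       # signal plus its inverse
print(np.allclose(cancelled, silence))     # True: the pair cancels to silence
print(np.allclose(p + silence, p))         # True: adding silence changes nothing
```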

The key move: a sound is a sum of pure tones

Here is where the vector-space view stops being a relabeling and starts doing real work. In the early nineteenth century, Joseph Fourier made an astonishing claim: any reasonable periodic sound can be written as a sum of pure sine waves of different frequencies, each scaled by its own amplitude. A clarinet note, a struck piano string, a sung vowel: each is a particular weighted combination of a fundamental frequency and its harmonics. In the language of this chapter, the pure tones $\sin(2\pi f t)$ and $\cos(2\pi f t)$ are a set of building-block vectors, and every sound in the space is a linear combination of them. Writing only the sine terms (cosine terms enter the same way, each with its own amplitude): $$p(t) = a_1 \sin(2\pi f_1 t) + a_2 \sin(2\pi f_2 t) + a_3 \sin(2\pi f_3 t) + \cdots.$$ The amplitudes $a_1, a_2, a_3, \dots$ are the coordinates of the sound in the "frequency basis." This is precisely the move that Chapter 6 will call span and Chapter 16 will call change of basis: the same vector (the sound) can be written in the time domain (its samples) or in the frequency domain (its amplitudes), and the two descriptions are equally complete. The chapter's promise that "the geometric intuition of arrows transfers to functions" pays off here: picking out a sound's frequency content is the function-space version of reading off a vector's components along a chosen set of axes.
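
To make "coordinates in the frequency basis" concrete, here is a small sketch that builds a sound from known amplitudes and then recovers them with NumPy's real FFT. The frequencies, amplitudes, and the 64-sample grid are hypothetical values picked so that each tone fits a whole number of cycles into the window; with other choices the recovered numbers would be smeared across neighbouring bins.

```python
import numpy as np

N = 64                                   # samples in one second (toy grid)
t = np.arange(N) / N

# Hypothetical sound: known amplitudes at 3 Hz, 5 Hz and 9 Hz
true_amps = {3: 1.0, 5: 0.5, 9: 0.25}
p = sum(a * np.sin(2 * np.pi * f * t) for f, a in true_amps.items())

# Time domain -> frequency domain
spectrum = np.fft.rfft(p)

# For a real sinusoid of amplitude a at integer frequency k, |spectrum[k]| = a * N / 2
amps = 2 * np.abs(spectrum) / N

for f, a in true_amps.items():
    print(f, np.isclose(amps[f], a))     # each recovered amplitude matches the one used
```

The vector `p` and the vector `amps` describe the same sound; only the choice of building blocks differs.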

Why does this decomposition matter? Because once a sound is broken into its frequency pieces, you can treat each piece separately and add the results back — and the distributive axioms (7) and (8) are exactly what guarantee the reassembly is faithful. That "take apart, process the pieces, recombine" pattern is the superposition principle from Chapter 1, and it is the soul of all linear signal processing.
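
The "take apart, process, recombine" pattern can be checked numerically for any linear processor. In the sketch below, `smooth` is a hypothetical stand-in (a two-sample circular average), not a filter taken from the chapter; the point is only that processing the whole sound and processing the pieces give the same result.

```python
import numpy as np

def smooth(x):
    """Hypothetical linear processor: average each sample with its circular predecessor."""
    return 0.5 * (x + np.roll(x, 1))

t = np.linspace(0, 1, 16, endpoint=False)
low = np.sin(2 * np.pi * 1 * t)            # low-frequency piece
high = 0.3 * np.sin(2 * np.pi * 5 * t)     # high-frequency piece

whole = smooth(low + high)                 # process the full sound at once
pieces = smooth(low) + smooth(high)        # process each piece, then recombine

print(np.allclose(whole, pieces))                         # True: additivity
print(np.allclose(smooth(2.0 * low), 2.0 * smooth(low)))  # True: scaling passes through
```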

The equalizer is just rescaling coordinates

Now the equalizer falls out almost for free. A graphic equalizer presents a few sliders, each governing a band of frequencies — "bass," "low mids," "high mids," "treble." When you push the bass slider up, the software multiplies the amplitudes of the low-frequency building blocks by a number bigger than $1$, leaves the others alone, and adds everything back together. In vector-space terms: express the sound in the frequency basis, scale some coordinates, and reconstruct. That is it. Boosting the bass by a factor of $2$ and the treble by a factor of $0.5$ is a single linear operation on the sound vector — and because the space is closed under scaling and addition (Axiom 0), the result is guaranteed to be another genuine sound in the space, never some illegal object the speaker can't play.

# A toy 2-band equalizer: scale the low and high partials, then recombine.
import numpy as np
t   = np.linspace(0, 1, 8, endpoint=False)
low  = np.sin(2*np.pi*1*t)        # "bass" building block (1 Hz)
high = np.sin(2*np.pi*3*t)        # "treble" building block (3 Hz)
bass_gain, treble_gain = 2.0, 0.5
eq_out = bass_gain*low + treble_gain*high     # rescale coordinates, then add back
print(np.round(eq_out, 3))   # the equalized signal -- still a valid sound vector

An FFT-based equalizer works on the full sampled signal: it uses the Fast Fourier Transform to find the amplitudes, adjusts them band by band, and inverts the transform. (Many practical equalizers instead use banks of digital filters, which apply the same per-band scaling without an explicit transform.) Either way, the conceptual skeleton is exactly the toy above: a linear combination of frequency building blocks, with the slider positions as the scalars. The MP3 file that holds the song uses a related idea for the opposite purpose, discarding or coarsening the building blocks your ear can't hear to save space; JPEG does the same for images, which are functions of two variables. All of it rests on sounds and images living in a vector space where decomposition-and-recombination is legitimate.
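
A transform-based equalizer can be sketched in a few lines. The function name `equalize`, the band edges, and the toy sample rate are all assumptions made for illustration, and production equalizers process audio in overlapping blocks with smoother band shapes than the hard cut-offs used here.

```python
import numpy as np

def equalize(signal, sample_rate, bands):
    """Scale frequency coordinates band by band (hypothetical helper).

    bands: list of (low_hz, high_hz, gain); bins with low_hz <= f < high_hz get gain.
    """
    spectrum = np.fft.rfft(signal)                         # to the frequency basis
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    for low_hz, high_hz, gain in bands:
        mask = (freqs >= low_hz) & (freqs < high_hz)
        spectrum[mask] *= gain                             # rescale the coordinates
    return np.fft.irfft(spectrum, n=len(signal))           # back to samples

sample_rate = 64                                           # toy rate, one second of audio
t = np.arange(sample_rate) / sample_rate
bass = np.sin(2 * np.pi * 2 * t)                           # 2 Hz "bass"
treble = np.sin(2 * np.pi * 20 * t)                        # 20 Hz "treble"

out = equalize(bass + treble, sample_rate, [(0, 10, 2.0), (10, 33, 0.5)])
print(np.allclose(out, 2.0 * bass + 0.5 * treble))         # True
```

Because both tones sit exactly on FFT bins, the output equals the hand-built combination `2.0 * bass + 0.5 * treble`; for general audio the agreement is only approximate near band edges.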

Why the abstraction earns its keep here

Step back and notice that nothing in this story required the sound to be an arrow in $\mathbb{R}^3$. The vectors were functions; the scalars were real gains; the "basis" was a set of sine waves. Yet every move (addition, scaling, decomposition into building blocks, recombination) was identical to what you would do with ordinary arrows, because functions and arrows both live in vector spaces and the axioms are all the machinery needs. An engineer who internalized linear algebra only as "stuff you do to columns of numbers" would struggle to explain why the equalizer works; an engineer who learned the abstract definition of Chapter 5 sees instantly that audio is just linear algebra in a function space, and that the same tools transfer to images, radio, seismic data, and medical scans without modification.

That transfer is the entire payoff the chapter promised. When you reach the Fourier-series chapter (Chapter 22), you will learn that the sine-wave building blocks are not just a basis but an orthogonal one — perpendicular vectors in the function space — which is what makes finding the coordinates (the amplitudes) a matter of projection (Chapter 19) rather than solving a giant system. Return to this case study then. You will recognize the equalizer in your pocket as a projection onto an orthogonal basis of a function space, and you will understand it not as audio magic but as the linear algebra you started building right here, the moment you accepted that a sound is a vector.
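
As a preview of that projection idea, the sketch below reads off amplitudes with dot products instead of solving a linear system. It assumes sampled sine waves at whole-number frequencies on a 64-point grid, where the sines are exactly orthogonal; the helper name `amplitude` and the test signal are hypothetical.

```python
import numpy as np

N = 64
t = np.arange(N) / N
p = 1.0 * np.sin(2 * np.pi * 3 * t) + 0.5 * np.sin(2 * np.pi * 5 * t)

def amplitude(signal, f):
    """Coordinate of `signal` along the sampled sine at frequency f (projection)."""
    basis = np.sin(2 * np.pi * f * t)
    return np.dot(signal, basis) / np.dot(basis, basis)

# Different sine waves are perpendicular: their dot product is (numerically) zero
print(np.isclose(np.dot(np.sin(2 * np.pi * 3 * t), np.sin(2 * np.pi * 5 * t)), 0.0))  # True

print(np.isclose(amplitude(p, 3), 1.0))   # True: the 3 Hz coordinate
print(np.isclose(amplitude(p, 5), 0.5))   # True: the 5 Hz coordinate
print(np.isclose(amplitude(p, 4), 0.0))   # True: no 4 Hz content
```

Each amplitude comes from one dot product with one building block, which is what orthogonality buys.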

Forward references: building-block combinations and span (Chapter 6); the time-domain ↔ frequency-domain switch as change of basis (Chapter 16); sine waves as an orthogonal basis and amplitudes as projections (Chapters 19, 22); the finite-precision realities of sampling (Chapter 38).