Fourier Series


Introduction

The central idea of Fourier series is that a wide class of periodic functions can be represented as an infinite sum of sine and cosine functions. This remarkable result, developed by Joseph Fourier in his 1822 work "Théorie analytique de la chaleur" (The Analytical Theory of Heat) while studying heat conduction, has profound implications across mathematics, physics, engineering, and modern computer science.

Definition: Fourier Series Consider a function \(f: \mathbb{R} \to \mathbb{R}\) that is periodic with period \(2L\), meaning \(f(x + 2L) = f(x)\) for all \(x \in \mathbb{R}\). Under suitable conditions, such as \(f\) being piecewise smooth or of bounded variation, the Fourier series of \(f\) is: \[ f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left( a_n \cos\left(\frac{n\pi x}{L}\right) + b_n \sin\left(\frac{n\pi x}{L}\right) \right) \] where the Fourier coefficients are given by: \[ \begin{align*} a_0 &= \frac{1}{L} \int_{-L}^{L} f(x) \, dx \\\\ a_n &= \frac{1}{L} \int_{-L}^{L} f(x)\cos\left(\frac{n\pi x}{L}\right) \, dx, \quad n \geq 1 \\\\ b_n &= \frac{1}{L} \int_{-L}^{L} f(x)\sin\left(\frac{n\pi x}{L}\right) \, dx, \quad n \geq 1 \end{align*} \]

The constant term \(\frac{a_0}{2}\) represents the average value of the function over one period. Each term in the series represents a harmonic component: the \(n\)-th term oscillates \(n\) times over one period of \(f\). This decomposition reveals the frequency content of the signal, a perspective that proves invaluable in both theoretical analysis and practical computation.

Insight: Fourier Decomposition in CS

While Fourier series may seem purely theoretical, the related discrete methods, covered under the Fourier transform and the FFT, are used throughout modern technology.

  • Audio and image compression:
    MP3, AAC, and JPEG use discrete cosine transforms (closely related to Fourier series) to compress data by identifying which frequency components humans are least sensitive to.
  • Speech recognition and synthesis:
    Modern speech systems convert audio waveforms into Fourier-based spectrograms, which serve as input features for neural networks.
  • Music information retrieval:
    Applications like Shazam identify songs by analyzing their frequency signatures using Fast Fourier Transforms (FFT).
  • Time series forecasting:
    Machine learning models for financial data, weather prediction, and sensor data often use Fourier features to capture periodic patterns (daily, weekly, seasonal cycles).

The fundamental idea of decomposing complex signals into simple periodic components remains at the heart of how modern systems process audio, images, and temporal data. Understanding Fourier series provides the mathematical foundation for these ubiquitous technologies.

Orthogonality of Trigonometric Functions

The Fourier series formula asserts that a suitable periodic function can be decomposed into sines and cosines. But how do we extract the individual coefficients? The answer lies in the same principle that makes orthogonal projections work in finite-dimensional spaces: inner products with orthogonal basis functions isolate each component. We define an inner product on the space of functions over \([-L, L]\) by: \[ \langle f, g \rangle = \int_{-L}^{L} f(x)g(x) \, dx \]

The trigonometric functions satisfy the following orthogonality relations:

Theorem: Orthogonality relations of Trigonometric Functions

\[ \begin{align*} \int_{-L}^{L} \cos\left(\frac{m\pi x}{L}\right)\cos\left(\frac{n\pi x}{L}\right) \, dx &= \begin{cases} 0 & \text{if } m \neq n \\ L & \text{if } m = n \neq 0 \\ 2L & \text{if } m = n = 0 \end{cases} \\\\ \int_{-L}^{L} \sin\left(\frac{m\pi x}{L}\right)\sin\left(\frac{n\pi x}{L}\right) \, dx &= \begin{cases} 0 & \text{if } m \neq n \\ L & \text{if } m = n \neq 0 \\ 0 & \text{if } m = n = 0 \end{cases} \\\\ \int_{-L}^{L} \cos\left(\frac{m\pi x}{L}\right)\sin\left(\frac{n\pi x}{L}\right) \, dx &= 0 \end{align*} \] where \(m\) and \(n\) are any nonnegative integers.

Note that when \(m = n = 0\), \(\cos(0) = 1\), giving us the constant function whose self-inner product is \(2L\).

Proof:

All three identities reduce to integrating \(\cos\frac{k\pi x}{L}\) or \(\sin\frac{k\pi x}{L}\) for an integer \(k\). A direct computation gives, for any integer \(k \neq 0\), \[ \int_{-L}^{L} \cos\frac{k\pi x}{L} \, dx = \frac{L}{k\pi}\sin\frac{k\pi x}{L}\bigg|_{-L}^{L} = 0, \qquad \int_{-L}^{L} \sin\frac{k\pi x}{L} \, dx = 0, \] the second integral vanishing by odd symmetry. For \(k = 0\) the cosine integral equals \(2L\) and the sine integral equals \(0\).

Cosine-cosine. Using the product-to-sum identity \(\cos A \cos B = \tfrac{1}{2}[\cos(A-B) + \cos(A+B)]\), \[ \int_{-L}^{L} \cos\frac{m\pi x}{L}\cos\frac{n\pi x}{L} \, dx = \frac{1}{2}\int_{-L}^{L}\!\!\cos\frac{(m-n)\pi x}{L}\,dx + \frac{1}{2}\int_{-L}^{L}\!\!\cos\frac{(m+n)\pi x}{L}\,dx. \] For \(m \neq n\) both integers \(m-n\) and \(m+n\) are nonzero, so both integrals vanish. For \(m = n \geq 1\) the first integral equals \(2L\) and the second vanishes (since \(m+n \geq 2\)), yielding \(L\). For \(m = n = 0\) both integrands are constant \(1\), giving \(\tfrac{1}{2}(2L + 2L) = 2L\).

Sine-sine. Using \(\sin A \sin B = \tfrac{1}{2}[\cos(A-B) - \cos(A+B)]\): if either \(m = 0\) or \(n = 0\) the integrand is identically zero; otherwise for \(m, n \geq 1\) the same case analysis as above gives \(0\) when \(m \neq n\) and \(L\) when \(m = n\).

Sine-cosine. The identity \(\sin A \cos B = \tfrac{1}{2}[\sin(A+B) + \sin(A-B)]\) reduces the integral to a sum of sine integrals, each of which vanishes on the symmetric interval \([-L, L]\) by odd symmetry. \(\blacksquare\)

These relations show that the set \[ \left\{1, \cos\left(\frac{\pi x}{L}\right), \sin\left(\frac{\pi x}{L}\right), \cos\left(\frac{2\pi x}{L}\right), \sin\left(\frac{2\pi x}{L}\right), \ldots\right\} \] forms an orthogonal system for the space of square-integrable functions on \([-L, L]\).

This system is moreover complete — meaning every square-integrable function can be approximated arbitrarily well in the \(L^2\) norm by finite linear combinations of these functions — so that it serves as an orthogonal basis of \(L^2[-L, L]\). Completeness combines the Riesz–Fischer theorem (developed in the Lp spaces chapter) with a density argument; we expand on this in the convergence section below.

This orthogonality is analogous to the orthogonality of vectors in Euclidean space, but now we're working in an infinite-dimensional function space. Just as we can decompose a vector into components along orthogonal basis vectors, we can decompose a periodic function into components along orthogonal trigonometric functions.
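The orthogonality relations can also be checked numerically. The following is a minimal Python sketch rather than part of the proof; the helper names `inner`, `cos_mode`, and `sin_mode`, the half-period `L = 1.5`, and the midpoint-rule sample count are all assumptions chosen for illustration.

```python
import math

def inner(f, g, L, samples=2000):
    """Approximate <f, g> = integral of f(x) g(x) over [-L, L] by the midpoint rule."""
    h = 2 * L / samples
    total = 0.0
    for k in range(samples):
        x = -L + (k + 0.5) * h  # midpoint of the k-th subinterval
        total += f(x) * g(x)
    return total * h

def cos_mode(n, L):
    """Return the function x -> cos(n*pi*x/L)."""
    return lambda x: math.cos(n * math.pi * x / L)

def sin_mode(n, L):
    """Return the function x -> sin(n*pi*x/L)."""
    return lambda x: math.sin(n * math.pi * x / L)

L = 1.5  # illustrative half-period (assumption)

cos_cos_distinct = inner(cos_mode(2, L), cos_mode(5, L), L)  # theorem: 0
cos_cos_equal = inner(cos_mode(3, L), cos_mode(3, L), L)     # theorem: L
const_const = inner(cos_mode(0, L), cos_mode(0, L), L)       # theorem: 2L
sin_sin_equal = inner(sin_mode(4, L), sin_mode(4, L), L)     # theorem: L
cos_sin = inner(cos_mode(3, L), sin_mode(3, L), L)           # theorem: 0

print(cos_cos_distinct, cos_cos_equal, const_const, sin_sin_equal, cos_sin)
```

The printed values should agree with the cases of the theorem up to floating-point rounding, since a uniform midpoint rule integrates low-order trigonometric polynomials over a full period essentially exactly.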

Fourier Coefficients

With the orthogonality relations established, we can now derive the coefficient formulas rigorously. The strategy is the same as in finite-dimensional linear algebra: to find the component along a basis vector, take the inner product of the function with that basis vector and divide by the basis vector's squared norm.

A note on term-by-term integration. The derivations below interchange the infinite sum with the integral, which is not automatic for all convergent series. This step is justified, for instance, when the Fourier series converges uniformly on \([-L, L]\) (as is the case when \(f\) is continuously differentiable with \(f(-L) = f(L)\)), or more generally under the \(L^2\)-convergence framework developed in the convergence section below: there the inner products \(\langle f, \cos(n\pi x/L)\rangle\) and \(\langle f, \sin(n\pi x/L)\rangle\) are taken directly without series manipulations, and yield the same coefficient formulas. We proceed formally here and defer the analytic justification.

To find \(a_0\), integrate both sides of the Fourier series: \[ \begin{align*} \int_{-L}^{L} f(x) \, dx &= \int_{-L}^{L} \frac{a_0}{2} \, dx + \sum_{n=1}^{\infty} \left( a_n \int_{-L}^{L} \cos\left(\frac{n\pi x}{L}\right) \, dx + b_n \int_{-L}^{L} \sin\left(\frac{n\pi x}{L}\right) \, dx \right) \\\\ &= L a_0 \end{align*} \] Therefore: \[ a_0 = \frac{1}{L} \int_{-L}^{L} f(x) \, dx \]

To find \(a_m\) for a fixed \(m \geq 1\), multiply both sides by \(\cos\left(\frac{m\pi x}{L}\right)\) and integrate: \[ \begin{align*} \int_{-L}^{L} f(x)\cos\left(\frac{m\pi x}{L}\right) \, dx &= \int_{-L}^{L} \frac{a_0}{2}\cos\left(\frac{m\pi x}{L}\right) \, dx \\\\ &\quad + \sum_{n=1}^{\infty} \left( a_n \int_{-L}^{L} \cos\left(\frac{n\pi x}{L}\right)\cos\left(\frac{m\pi x}{L}\right) \, dx + b_n \int_{-L}^{L} \sin\left(\frac{n\pi x}{L}\right)\cos\left(\frac{m\pi x}{L}\right) \, dx \right) \\\\ &= L a_m \end{align*} \] By the orthogonality relations, the \(a_0\) integral and every sine-cosine integral vanish, and among the cosine-cosine integrals only the \(n = m\) term survives, contributing \(L a_m\). Renaming \(m\) as \(n\): \[ a_n = \frac{1}{L} \int_{-L}^{L} f(x)\cos\left(\frac{n\pi x}{L}\right) \, dx, \quad n \geq 1 \]

Similarly, multiplying by \(\sin\left(\frac{m\pi x}{L}\right)\) and integrating gives: \[ b_n = \frac{1}{L} \int_{-L}^{L} f(x)\sin\left(\frac{n\pi x}{L}\right) \, dx, \quad n \geq 1 \]

Note: The formulas for \(a_0\) and \(a_n\) can be unified as: \[ a_n = \frac{1}{L} \int_{-L}^{L} f(x)\cos\left(\frac{n\pi x}{L}\right) \, dx, \quad n \geq 0 \] where for \(n = 0\), we have \(\cos(0) = 1\), giving \(a_0 = \frac{1}{L}\int_{-L}^{L} f(x) \, dx\).

The appearance of \(\frac{a_0}{2}\) in the series (rather than \(a_0\)) is a convention that lets this single formula cover \(n = 0\) as well. These formulas allow us to compute the Fourier series representation of any suitable periodic function.

Example: Square Wave

Consider the square wave function with period \(2L\): \[ f(x) = \begin{cases} 1 & \text{if } 0 < x < L \\ -1 & \text{if } -L < x < 0 \end{cases} \]

First, compute \(a_0\): \[ a_0 = \frac{1}{L} \int_{-L}^{L} f(x) \, dx = \frac{1}{L}\left(\int_{-L}^{0} (-1) \, dx + \int_{0}^{L} 1 \, dx\right) = \frac{1}{L}(-L + L) = 0 \]

For \(n \geq 1\), since \(f\) is an odd function and \(\cos\left(\frac{n\pi x}{L}\right)\) is even, their product is odd: \[ a_n = \frac{1}{L} \int_{-L}^{L} f(x)\cos\left(\frac{n\pi x}{L}\right) \, dx = 0. \]

For the sine coefficients (\(f\) is odd, \(\sin\left(\frac{n\pi x}{L}\right)\) is odd, so their product is even): \[ \begin{align*} b_n &= \frac{1}{L} \int_{-L}^{L} f(x)\sin\left(\frac{n\pi x}{L}\right) \, dx \\\\ &= \frac{1}{L}\left(\int_{-L}^{0} (-1)\sin\left(\frac{n\pi x}{L}\right) \, dx + \int_{0}^{L} \sin\left(\frac{n\pi x}{L}\right) \, dx\right) \\\\ &= \frac{1}{L}\left[\frac{L}{n\pi}\cos\left(\frac{n\pi x}{L}\right)\bigg|_{-L}^{0} - \frac{L}{n\pi}\cos\left(\frac{n\pi x}{L}\right)\bigg|_{0}^{L}\right] \\\\ &= \frac{1}{n\pi}\left[(\cos(0) - \cos(-n\pi)) - (\cos(n\pi) - \cos(0))\right] \\\\ &= \frac{1}{n\pi}\left[(1 - \cos(n\pi)) - (\cos(n\pi) - 1)\right] \\\\ &= \frac{2}{n\pi}(1 - \cos(n\pi)). \end{align*} \] Since \(\cos(n\pi) = (-1)^n\): \[ b_n = \begin{cases} \frac{4}{n\pi} & \text{if } n \text{ is odd} \\ 0 & \text{if } n \text{ is even} \end{cases} \]

Therefore, the Fourier series is: \[ \begin{align*} f(x) &= \frac{4}{\pi}\sum_{k=0}^{\infty} \frac{\sin\left(\frac{(2k+1)\pi x}{L}\right)}{2k+1} \\\\ &= \frac{4}{\pi}\left(\sin\left(\frac{\pi x}{L}\right) + \frac{\sin\left(\frac{3\pi x}{L}\right)}{3} + \frac{\sin\left(\frac{5\pi x}{L}\right)}{5} + \cdots\right). \end{align*} \]
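As a sanity check on this example, the sine coefficients can be approximated by numerical integration and compared with \(\frac{4}{n\pi}\). This is a minimal Python sketch under stated assumptions: the function names, the choice \(L = 1\), the midpoint rule, and the sample counts are illustrative and not part of the original text.

```python
import math

def square_wave(x, L):
    """Odd square wave of period 2L: +1 on (0, L) and -1 on (-L, 0).

    The value returned exactly at a jump is arbitrary and never sampled below.
    """
    x = (x + L) % (2 * L) - L  # reduce x to the base interval [-L, L)
    return 1.0 if x > 0 else -1.0

def sine_coefficient(f, n, L, samples=4000):
    """Midpoint-rule approximation of b_n = (1/L) * integral of f(x) sin(n*pi*x/L)."""
    h = 2 * L / samples
    total = 0.0
    for k in range(samples):
        x = -L + (k + 0.5) * h
        total += f(x, L) * math.sin(n * math.pi * x / L)
    return total * h / L

def partial_sum(x, L, terms):
    """Sum of the first `terms` nonzero (odd) harmonics of the square-wave series."""
    return (4 / math.pi) * sum(
        math.sin((2 * k + 1) * math.pi * x / L) / (2 * k + 1) for k in range(terms)
    )

L = 1.0  # illustrative half-period (assumption)
numeric_b = [sine_coefficient(square_wave, n, L) for n in range(1, 7)]
exact_b = [4 / (n * math.pi) if n % 2 == 1 else 0.0 for n in range(1, 7)]
value_at_midpoint = partial_sum(L / 2, L, terms=2000)  # should approach f(L/2) = 1

for n, (num, exact) in enumerate(zip(numeric_b, exact_b), start=1):
    print(n, round(num, 6), round(exact, 6))
print(value_at_midpoint)
```

At \(x = L/2\) the series reduces to \(\frac{4}{\pi}\left(1 - \frac{1}{3} + \frac{1}{5} - \cdots\right)\), which converges slowly; this is why the sketch uses a large number of terms there.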

Complex Exponential Form

The real trigonometric form of the Fourier series involves separate cosine and sine coefficients, which can be notationally cumbersome. By combining them into complex exponentials via Euler's formula, we obtain a mathematically powerful representation that treats positive and negative frequencies symmetrically. This complex form is the standard starting point for the Fourier transform.

Assume \(f\) is periodic with \(f(-L) = f(L)\); the derivation below is a formal manipulation of the series, and the analytic hypotheses needed for convergence are the same as in the real form and are discussed in the convergence section. Using Euler's formula: \[ e^{i\theta} = \cos(\theta) + i\sin(\theta) \] we can express trigonometric functions as: \[ \cos(\theta) = \frac{e^{i\theta} + e^{-i\theta}}{2}, \quad \sin(\theta) = \frac{e^{i\theta} - e^{-i\theta}}{2i} \]

We can rewrite the Fourier series: \[ \begin{align*} f(x) &= \frac{a_0}{2} + \sum_{n=1}^{\infty} \left( a_n \cos\left(\frac{n\pi x}{L}\right) + b_n \sin\left(\frac{n\pi x}{L}\right) \right) \\\\ &= \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \left( \frac{e^{in\pi x/L} + e^{-in\pi x/L}}{2} \right) + \sum_{n=1}^{\infty} b_n \left( \frac{e^{in\pi x/L} - e^{-in\pi x/L}}{2i} \right) \\\\ &= \frac{a_0}{2} + \frac{1}{2}\sum_{n=1}^{\infty} (a_n - ib_n) e^{in\pi x/L} + \frac{1}{2}\sum_{n=1}^{\infty} (a_n + ib_n) e^{-in\pi x/L} \end{align*} \]

We now reindex the first summation by the dummy-variable substitution \(n \mapsto -n\). Under this substitution, \(e^{in\pi x/L}\) becomes \(e^{-i n\pi x/L}\), the summation range \(n = 1, 2, 3, \ldots\) becomes \(n = -1, -2, -3, \ldots\), and the coefficient \((a_n - i b_n)\) becomes \((a_{-n} - i b_{-n})\) (here \(-n\) is positive, so \(a_{-n}, b_{-n}\) are already-defined real Fourier coefficients): \[ \begin{align*} f(x) &= \frac{a_0}{2} + \frac{1}{2}\sum_{n = -1}^{-\infty} (a_{-n} - i b_{-n})\, e^{-i n\pi x/L} + \frac{1}{2}\sum_{n=1}^{\infty} (a_n + i b_n)\, e^{-i n\pi x/L}. \end{align*} \]

The two sums now share the same exponential \(e^{-in\pi x/L}\) but have different coefficient formulas depending on the sign of \(n\). To unify them into a single summation \(\sum_{n=-\infty}^{\infty}\), we define the complex Fourier coefficients \(c_n\) by cases — one formula for each regime of \(n\):

\[ c_n = \begin{cases} \frac{1}{2}(a_n + ib_n) & \text{if } n > 0, \\ \frac{a_0}{2} & \text{if } n = 0, \\ \frac{1}{2}(a_{-n} - ib_{-n}) & \text{if } n < 0. \end{cases} \]

This ensures that for real-valued functions, the coefficients satisfy conjugate symmetry: \(c_{-n} = \bar{c}_n\). Substituting these, we obtain the complex form of the Fourier series:

Definition: Complex Form of the Fourier Series

\[ f(x) = \sum_{n= -\infty}^{\infty} c_n e^{\frac{-i n \pi x}{L}} \] where the complex Fourier coefficients are, for \(n \geq 1\): \[ \begin{align*} c_n &= \frac{1}{2}(a_n + ib_n) \\\\ &= \frac{1}{2L} \int_{-L}^{L} f(x) \left( \cos \left(\frac{n \pi x}{L}\right) + i \sin \left(\frac{n \pi x}{L}\right) \right) \, dx \\\\ &= \frac{1}{2L} \int_{-L}^{L} f(x)e^{\frac{i n\pi x}{L}} \, dx \end{align*} \] The integral formula in the last line holds for every integer \(n\), including \(n = 0\) and \(n < 0\). Note that if \(f(x)\) is real, \(c_{-n} = \overline{c_n}\).

Derivation using Orthogonality:

A complex function \(\phi(x)\) is orthogonal to another complex function \(\psi(x)\) over an interval \(a \leq x \leq b\) if \[ \int_a^b \overline{\phi}\psi \, dx = 0 \] where \(\overline{\phi}\) is the complex conjugate of \(\phi\).

For every integer \(n\), the functions \(e^{\frac{-i n \pi x}{L}}\) form an orthogonal set, as the following integration shows: \[ \int_{-L}^L \left(\overline{e^{\frac{- i m\pi x}{L}}}\right) e^{\frac{ - i n\pi x}{L}} \, dx = \begin{cases} 0 & \text{if } m \neq n \\ 2L & \text{if } m = n \end{cases} \] because \(\left(\overline{e^{\frac{- i m\pi x}{L}}}\right) = e^{\frac{i m\pi x}{L}} \).

Here, we multiply the complex Fourier series by \(e^{\frac{i m \pi x}{L}}\) and integrate from \(-L\) to \(L\): \[ \int_{-L}^L f(x) e^{\frac{i m \pi x}{L}} \, dx = \sum_{n= -\infty}^{\infty} c_n \int_{-L}^L e^{\frac{-i n \pi x}{L}} e^{\frac{i m \pi x}{L}} \, dx. \] Using the complex orthogonality condition, only the \(m = n\) term survives, and thus we obtain the complex Fourier coefficients: \[ \begin{align*} &\int_{-L}^L f(x) e^{\frac{i m \pi x}{L}} \, dx = 2Lc_m \\\\ &\Longrightarrow c_m = \frac{1}{2L} \int_{-L}^{L} f(x)e^{\frac{i m\pi x}{L}} \, dx. \end{align*} \]
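The coefficient formula and the conjugate symmetry can be illustrated on a small test signal. The sketch below is an assumption-laden example, not the text's own construction: the signal \(f(x) = 1 + 2\cos(\pi x/L) + 3\sin(2\pi x/L)\), the half-period `L = 2.0`, and the helper names are chosen only for illustration. For this signal \(a_0 = 2\), \(a_1 = 2\), \(b_2 = 3\), so the definitions above predict \(c_0 = 1\), \(c_{\pm 1} = 1\), \(c_2 = \tfrac{3}{2}i\), and \(c_{-2} = -\tfrac{3}{2}i\).

```python
import cmath
import math

def complex_coefficient(f, n, L, samples=2000):
    """Midpoint-rule approximation of c_n = (1/(2L)) * integral of f(x) e^{+i n pi x / L}."""
    h = 2 * L / samples
    total = 0j
    for k in range(samples):
        x = -L + (k + 0.5) * h
        total += f(x) * cmath.exp(1j * n * math.pi * x / L)
    return total * h / (2 * L)

L = 2.0  # illustrative half-period (assumption)

def f(x):
    """Test signal with a_0 = 2, a_1 = 2, b_2 = 3 and all other coefficients zero."""
    return 1 + 2 * math.cos(math.pi * x / L) + 3 * math.sin(2 * math.pi * x / L)

# Coefficients for n = -3, ..., 3 in the text's convention (plus sign in the integral).
c = {n: complex_coefficient(f, n, L) for n in range(-3, 4)}

def reconstruct(x):
    """Evaluate the sum of c_n e^{-i n pi x / L} over the computed indices."""
    return sum(c[n] * cmath.exp(-1j * n * math.pi * x / L) for n in c)

print(c[2], c[-2])                # expected close to 1.5j and -1.5j
print(reconstruct(0.7), f(0.7))   # imaginary part of the reconstruction should be ~0
```

Because the signal is real, the computed coefficients satisfy \(c_{-n} = \overline{c_n}\), and the reconstructed value has a negligible imaginary part.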

Notation / Sign Convention

Convention used in this text (Mathematical / PDE form):

We adopt the following complex-exponential convention for the Fourier series: \[ \boxed{ f(x) = \sum_{n=-\infty}^{\infty} c_n e^{-\frac{i n\pi x}{L}}, \qquad c_n = \frac{1}{2L}\int_{-L}^{L} f(x)\,e^{+\frac{i n\pi x}{L}}\,dx } \] This convention is standard in mathematical analysis and the theory of partial differential equations. Note the opposite signs in the exponentials: negative in the series, positive in the coefficient integral. This choice has several mathematical advantages:

Mathematical properties:

The basis functions \(e^{-\frac{i n\pi x}{L}}\) are eigenfunctions of the derivative operator: \[ \frac{d}{dx}\,e^{-\frac{i n\pi x}{L}} = -\frac{i n\pi}{L}\,e^{-\frac{i n\pi x}{L}}, \] so differentiation acts on the series by multiplying \(c_n\) by \(-\frac{i n\pi}{L}\). Applying the derivative twice gives the eigenvalue \(-\left(\frac{n\pi}{L}\right)^2 \leq 0\), which reflects the negative semidefinite nature of the Laplacian in PDEs.

These basis functions satisfy the orthogonality relation: \[ \int_{-L}^{L} e^{-\frac{i n\pi x}{L}}\,e^{+\frac{i m\pi x}{L}}\,dx = \begin{cases} 2L, & m = n, \\[4pt] 0, & m \neq n. \end{cases} \] The conjugate relationship \(\overline{e^{-\frac{i n\pi x}{L}}} = e^{+\frac{i n\pi x}{L}}\) ensures that multiplying the series by \(e^{+\frac{i m\pi x}{L}}\) and integrating isolates the coefficient \(c_m\) directly.

Parseval's identity (energy conservation):

With the normalized inner product \(\langle f, g \rangle = \frac{1}{2L}\int_{-L}^{L} f(x)\,\overline{g(x)}\,dx\), the complex exponential system forms an orthonormal basis of \(L^2[-L, L]\), yielding: \[ \frac{1}{2L}\int_{-L}^{L} |f(x)|^2\,dx = \sum_{n=-\infty}^{\infty} |c_n|^2. \] This shows that the transformation \(f \mapsto \{c_n\}\) is unitary, preserving the \(L^2\) norm (energy) of the function.

Relation to Engineering and Physics convention:

In engineering, physics, and signal processing, the opposite sign convention is typically used: \[ \boxed{ f(x) = \sum_{n=-\infty}^{\infty} \tilde{c}_n e^{+\frac{i n\pi x}{L}}, \qquad \tilde{c}_n = \frac{1}{2L}\int_{-L}^{L} f(x)\,e^{-\frac{i n\pi x}{L}}\,dx } \] The two conventions are related by the simple transformation \(\tilde{c}_n = c_{-n}\). Since this is just a re-indexing, all mathematical properties (orthogonality, completeness, Parseval's identity) remain valid in both conventions.
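The re-indexing behind \(\tilde{c}_n = c_{-n}\) can be written out explicitly. Substituting \(n = -m\) in the series used in this text (the dummy index \(m\) is introduced only for this step): \[ f(x) = \sum_{n=-\infty}^{\infty} c_n\, e^{-\frac{i n\pi x}{L}} = \sum_{m=-\infty}^{\infty} c_{-m}\, e^{+\frac{i m\pi x}{L}}, \] and comparing with the engineering form term by term gives \(\tilde{c}_m = c_{-m}\). The coefficient integrals agree with this: \[ c_{-m} = \frac{1}{2L}\int_{-L}^{L} f(x)\,e^{-\frac{i m\pi x}{L}}\,dx = \tilde{c}_m. \]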

The same sign distinction appears in the Fourier transform:

Mathematical / PDE convention (used in this text):
\[ \widehat{f}(\xi) = \int_{-\infty}^{\infty} f(x)\,e^{+i x\xi}\,dx, \qquad f(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \widehat{f}(\xi)\,e^{-i x\xi}\,d\xi \] Advantages:

  • The derivative becomes multiplication by \(-i\xi\): \(\widehat{f'}(\xi) = -i\xi\widehat{f}(\xi)\)
  • Aligns with spectral theory where the Laplacian \(-\Delta\) is positive definite
  • Natural for studying PDEs and harmonic analysis

Engineering / Physics convention:
\[ F(\omega) = \int_{-\infty}^{\infty} f(t)\,e^{-i\omega t}\,dt, \qquad f(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} F(\omega)\,e^{+i\omega t}\,d\omega \] Advantages:

  • Plane waves \(e^{i(kx - \omega t)}\) propagate in the positive \(x\)-direction
  • Positive frequencies correspond to counterclockwise rotation in the complex plane
  • Aligns with the time-evolution operator \(e^{-iHt/\hbar}\) in quantum mechanics
  • Natural for causal systems and signal processing

Both conventions are mathematically equivalent and internally consistent; the choice depends on the field and application.

Throughout this text, we consistently use the mathematical convention. When consulting other sources or implementing algorithms, always verify which convention is being used to ensure correct results.

The complex form is often preferred because it simplifies many operations. Note that for a real-valued function \(f(x)\), the coefficients satisfy conjugate symmetry: \(c_{-n} = \bar{c}_n\). This symmetry ensures that the imaginary parts cancel out when summing, resulting in a real signal.

Parseval's Identity

In finite-dimensional linear algebra, the Pythagorean theorem states that the squared norm of a vector equals the sum of squares of its components in an orthonormal basis. Parseval's identity is the infinite-dimensional analogue: it asserts that the total "energy" of a function (its \(L^2\) norm) equals the sum of the squared magnitudes of its Fourier coefficients. This result is fundamental because it guarantees that no information is lost or created when passing between the time and frequency domains.

Notation: Throughout this section, \(|\cdot|\) denotes the complex modulus. For a complex number \(z = x + iy\), we have \(|z|^2 = x^2 + y^2\). For real numbers, this reduces to the ordinary absolute value.

Theorem: Parseval's Identity

Parseval's identity relates the total energy of a signal in the time domain to its energy in the frequency domain. For a real-valued function \(f \in L^2[-L, L]\) with Fourier coefficients \(a_n\) and \(b_n\), it states: \[ \frac{1}{L}\int_{-L}^{L} |f(x)|^2 \, dx = \frac{a_0^2}{2} + \sum_{n=1}^{\infty} (a_n^2 + b_n^2). \]

In the complex exponential form, this becomes: \[ \frac{1}{2L}\int_{-L}^{L} |f(x)|^2 \, dx = \sum_{n=-\infty}^{\infty} |c_n|^2. \]

This identity is a generalization of the Pythagorean theorem to infinite-dimensional function spaces. The left side represents the "energy" or total power of the signal, while the right side shows that this energy is distributed across the frequency components.

Insight: Parseval's Identity in Signal Processing and ML

  • Energy Conservation:
    The total "energy" (\(L^2\) norm) of a signal is identical whether measured in the time domain or the frequency domain. This ensures that the transformation itself does not distort the information content of the data.
  • Data Compression (e.g., JPEG, MP3):
    Because total energy is preserved, we can truncate small Fourier coefficients that contribute little to the total energy. This allows us to discard "insignificant" data with minimal loss of perceived quality.
  • Feature Selection:
    Parseval's identity allows us to identify which frequency components contain the majority of the signal's energy, providing a rigorous way to reduce dimensionality in machine learning tasks.
  • Noise Filtering:
    Signals typically concentrate energy in a few specific coefficients, whereas white noise tends to spread its energy across all frequencies. This energy distribution analysis is the basis for spectral denoising.

Proof:

We prove the complex form first. A fully rigorous argument proceeds through the partial sums \(S_N f(x) = \sum_{|n|\leq N} c_n e^{-in\pi x/L}\): for each finite \(N\) the double sum below is finite, so interchanging sum and integral is elementary, and the identity \(\|S_N\|_2^2 = \sum_{|n|\leq N} |c_n|^2\) follows. Taking \(N \to \infty\) and using \(L^2\)-convergence \(S_N \to f\) in \(L^2[-L,L]\) (established in the convergence section and built on Riesz-Fischer in the \(L^p\) chapter) yields the full identity. We present the calculation in the suggestive infinite-sum form below, with this limiting procedure understood.

Starting with the Fourier series \[ f(x) = \sum_{n=-\infty}^{\infty} c_n e^{-i\frac{n\pi x}{L}}, \] so that \(\overline{f(x)} = \sum_{m=-\infty}^{\infty} \overline{c_m}\, e^{+i\frac{m\pi x}{L}}\), we compute: \[ \begin{align*} \frac{1}{2L}\int_{-L}^{L} |f(x)|^2 \, dx &= \frac{1}{2L}\int_{-L}^{L} f(x) \overline{f(x)} \, dx \\\\ &= \frac{1}{2L}\int_{-L}^{L} \left(\sum_{n=-\infty}^{\infty} c_n e^{-i\frac{n\pi x}{L}}\right) \left(\sum_{m=-\infty}^{\infty} \overline{c_m} e^{+i\frac{m\pi x}{L}}\right) dx \\\\ &= \frac{1}{2L}\sum_{n=-\infty}^{\infty}\sum_{m=-\infty}^{\infty} c_n \overline{c_m} \int_{-L}^{L} e^{i\frac{(m-n)\pi x}{L}} \, dx. \end{align*} \] By the orthonormality of \(\left\{e^{-i\frac{n\pi x}{L}}\right\}\) with respect to the inner product \(\langle f, g \rangle = \frac{1}{2L}\int_{-L}^{L} f\overline{g} \, dx\): \[ \int_{-L}^{L} e^{i\frac{(m-n)\pi x}{L}} \, dx = \begin{cases} 2L & \text{if } n = m \\ 0 & \text{if } n \neq m \end{cases} \] Therefore: \[ \frac{1}{2L}\int_{-L}^{L} |f(x)|^2 \, dx = \sum_{n=-\infty}^{\infty} |c_n|^2. \]

To obtain the real form, we use the relationships between real and complex coefficients. For \(n \geq 1\): \[ \begin{align*} |c_n|^2 + |c_{-n}|^2 &= \left|\frac{a_n + ib_n}{2}\right|^2 + \left|\frac{a_n - ib_n}{2}\right|^2 \\\\ &= \frac{a_n^2 + b_n^2}{4} + \frac{a_n^2 + b_n^2}{4} = \frac{a_n^2 + b_n^2}{2} \end{align*} \] and \(|c_0|^2 = \left|\frac{a_0}{2}\right|^2 = \frac{a_0^2}{4}\). Thus: \[ \begin{align*} \sum_{n=-\infty}^{\infty} |c_n|^2 &= |c_0|^2 + \sum_{n=1}^{\infty} (|c_n|^2 + |c_{-n}|^2) \\\\ &= \frac{a_0^2}{4} + \sum_{n=1}^{\infty} \frac{a_n^2 + b_n^2}{2}. \end{align*} \] Since the left side equals \(\frac{1}{2L}\int_{-L}^{L} |f(x)|^2 \, dx\), multiplying both sides by 2 gives: \[ \frac{1}{L}\int_{-L}^{L} |f(x)|^2 \, dx = \frac{a_0^2}{2} + \sum_{n=1}^{\infty} (a_n^2 + b_n^2). \]
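Parseval's identity can be illustrated with the square wave from the earlier example. There \(|f(x)| = 1\) almost everywhere, so the left side is \(\frac{1}{L}\cdot 2L = 2\), while the right side is \(\sum_{n \text{ odd}} \frac{16}{n^2\pi^2}\), which equals \(2\) because \(\sum_{k \geq 0} \frac{1}{(2k+1)^2} = \frac{\pi^2}{8}\). The short Python sketch below, with a hypothetical helper name and arbitrarily chosen truncation points, shows the partial sums approaching that value.

```python
import math

def square_wave_energy_sum(max_harmonic):
    """Partial sum of a_0^2/2 + sum(a_n^2 + b_n^2) for the square wave.

    Only the odd sine coefficients b_n = 4/(n*pi) are nonzero.
    """
    return sum((4 / (n * math.pi)) ** 2 for n in range(1, max_harmonic + 1, 2))

# Left side of Parseval: (1/L) * integral of |f|^2 over [-L, L].
# Since |f(x)| = 1 almost everywhere, this equals (1/L) * 2L = 2 for any L.
time_domain_energy = 2.0

# Truncation points are arbitrary illustrative choices.
partial_energies = [square_wave_energy_sum(N) for N in (1, 11, 101, 100001)]

for energy in partial_energies:
    print(energy, time_domain_energy - energy)
```

The gap shrinks only like \(1/N\), reflecting the slow \(1/n\) decay of the coefficients of a discontinuous function.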

Convergence Properties

Having derived the Fourier coefficients and established Parseval's identity, a fundamental question remains: in what sense does the Fourier series actually converge to the original function \(f\)? The answer is surprisingly nuanced and depends on the regularity of \(f\). Three aspects arise naturally, each with different assumptions and implications: pointwise convergence, mean-square convergence, and the Gibbs phenomenon at jump discontinuities.

1. Pointwise Convergence:

Theorem: Pointwise Convergence (Dirichlet-Jordan)

If \(f\) is periodic and of bounded variation on \([-L, L]\), then at every point \(x\), the Fourier series converges to the average of the left and right limits: \[ \frac{a_0}{2} + \sum_{n=1}^{\infty} \left(a_n \cos\tfrac{n\pi x}{L} + b_n \sin\tfrac{n\pi x}{L}\right) = \frac{f(x^+) + f(x^-)}{2} \] where \(f(x^+) = \lim_{h \to 0^+} f(x+h)\) and \(f(x^-) = \lim_{h \to 0^-} f(x+h)\). At points of continuity, this equals \(f(x)\).

Functions of bounded variation include most functions encountered in practice, such as piecewise smooth and piecewise monotone functions.

Proof Outline:

A complete proof is beyond our present scope, but the main line of argument is instructive and relies only on tools already developed in this text. We outline the four key steps.

Step 1 — Dirichlet kernel representation.
Substituting the integral formula for \(c_n\) into the partial sum \(S_N f(x) = \sum_{|n|\leq N} c_n e^{-in\pi x/L}\) and interchanging the finite sum with the integral gives \[ S_N f(x) = \frac{1}{2L}\int_{-L}^{L} f(y)\, D_N(y - x)\, dy, \qquad D_N(t) = \sum_{n=-N}^{N} e^{in\pi t/L}. \] A geometric-series computation puts \(D_N\) in closed form: \[ D_N(t) = \frac{\sin\!\left((N+\tfrac{1}{2})\pi t/L\right)}{\sin(\pi t/(2L))}. \] Note that \(D_N\) is an even function of \(t\) and is periodic with period \(2L\); both facts will be used below.

Step 2 — Kernel mass.
Integrating \(D_N\) term-by-term, every term with \(n \neq 0\) integrates to zero over a full period, and the \(n = 0\) term gives \[ \frac{1}{2L}\int_{-L}^{L} D_N(t)\, dt = 1 \quad \text{for every } N. \] This is the mass condition one would expect of an approximate identity. The obstruction to finishing the argument by a naive approximate-identity estimate is that \(D_N\) is not positive: its \(L^1\) norm \(\frac{1}{2L}\int_{-L}^{L}|D_N(t)|\,dt\) grows like \(\log N\), so cancellation rather than concentration of mass is what drives convergence.
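Steps 1 and 2 lend themselves to a quick numerical check. The following Python sketch compares the defining sum of \(D_N\) with its closed form, confirms the unit mass, and shows the normalized \(L^1\) norm increasing with \(N\). The helper names, the choice \(L = 1\), the test points, and the values of \(N\) are assumptions for illustration only.

```python
import math

def dirichlet_sum(t, N, L):
    """D_N(t) from the defining sum, using e^{i s} + e^{-i s} = 2 cos(s)."""
    return 1.0 + 2.0 * sum(math.cos(n * math.pi * t / L) for n in range(1, N + 1))

def dirichlet_closed(t, N, L):
    """D_N(t) from the closed form; valid when t is not a multiple of 2L."""
    return math.sin((N + 0.5) * math.pi * t / L) / math.sin(math.pi * t / (2 * L))

def normalized_integral(g, L, samples=4000):
    """Midpoint-rule approximation of (1/(2L)) * integral of g over [-L, L]."""
    h = 2 * L / samples
    return sum(g(-L + (k + 0.5) * h) for k in range(samples)) * h / (2 * L)

L = 1.0  # illustrative half-period (assumption)

# Largest discrepancy between the two expressions for D_7 at a few sample points.
closed_form_gap = max(
    abs(dirichlet_sum(t, 7, L) - dirichlet_closed(t, 7, L)) for t in (0.1, 0.37, -0.8)
)

# Kernel mass (should be 1 for every N) and normalized L^1 norm (should grow with N).
masses = [normalized_integral(lambda t: dirichlet_sum(t, N, L), L) for N in (4, 16, 64)]
l1_norms = [normalized_integral(lambda t: abs(dirichlet_sum(t, N, L)), L) for N in (4, 16, 64)]

print(closed_form_gap, masses, l1_norms)
```

Each fourfold increase in \(N\) should raise the \(L^1\) norm by a roughly constant amount, which is the logarithmic growth mentioned above.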

Step 3 — Symmetrization and reduction via bounded variation.
Substituting \(y = x + t\) in the kernel representation and using periodicity together with evenness of \(D_N\), the partial sum takes the symmetrized form \[ S_N f(x) = \frac{1}{2L}\int_0^L \bigl[f(x+t) + f(x-t)\bigr]\, D_N(t)\, dt. \] Combined with the normalization \(\frac{1}{2L}\int_0^L D_N(t)\,dt = \tfrac{1}{2}\) (half of the full-period mass, by evenness and Step 2), this gives \[ S_N f(x) - \tfrac{1}{2}\bigl(f(x^+) + f(x^-)\bigr) = \frac{1}{2L}\int_0^L \Big(\bigl[f(x+t) - f(x^+)\bigr] + \bigl[f(x-t) - f(x^-)\bigr]\Big)\,D_N(t)\,dt. \] The goal is to show this tends to \(0\) as \(N \to \infty\). Applying the Jordan decomposition \(f = f_1 - f_2\) of a bounded-variation function into monotone pieces, it suffices to bound each of four integrals of the form \[ I_{j,\pm}(N) = \frac{1}{2L}\int_0^L \bigl[f_j(x \pm t) - f_j(x^\pm)\bigr]\,D_N(t)\,dt, \] where \(j \in \{1,2\}\) and the sign is chosen accordingly. For each \(I_{j,\pm}\), the integrand \(g(t) := f_j(x \pm t) - f_j(x^\pm)\) is monotone in \(t\) with \(g(0^+) = 0\). One then fixes a small \(\delta > 0\) and splits the integral at \(\delta\):

  • Tail \([\delta, L]\): here \(\sin(\pi t/(2L))\) is bounded away from \(0\), so the integrand \(g(t)/\sin(\pi t/(2L))\) is bounded and integrable. By the Riemann-Lebesgue lemma (Step 4), \(\int_\delta^L g(t) D_N(t)\,dt \to 0\) as \(N \to \infty\).
  • Near-origin \([0, \delta]\): the second mean-value theorem for integrals, applied to the monotone function \(g\) on \([0, \delta]\), yields an expression of the form \(g(\delta^-) \cdot \int_\xi^\delta D_N(t)\,dt\) for some \(\xi \in [0, \delta]\). A classical computation shows that partial integrals of the Dirichlet kernel are bounded uniformly in \(N\) and in the endpoints: there exists a constant \(C\) such that \(\bigl|\int_a^b D_N(t)\,dt\bigr| \leq C\) for all \(0 \le a < b \le L\) and all \(N\). Since \(g(\delta^-) \to 0\) as \(\delta \to 0^+\), this piece can be made arbitrarily small by choosing \(\delta\) small, uniformly in \(N\).

Given \(\varepsilon > 0\), first fix \(\delta\) small enough to control the near-origin piece, then let \(N \to \infty\) to eliminate the tail. This yields \(I_{j,\pm}(N) \to 0\), and summing the four contributions gives the claimed limit.

Step 4 — The Riemann-Lebesgue lemma.
The vanishing of the tail integrals in Step 3 as \(N \to \infty\) rests on the Riemann-Lebesgue lemma: for any function \(g\) integrable on a bounded interval, the oscillatory integrals \(\int g(t)\, e^{\pm iNt}\,dt\) tend to \(0\) as \(N \to \infty\). In our application the relevant tail integrand is bounded, hence square-integrable on \([\delta, L]\), and the \(L^2\) form of the lemma is what we need — this follows directly from Bessel's inequality and will be proved in the forthcoming page on Fourier analysis in Hilbert spaces. The extension to general integrable functions, obtained via density, is also established there.

Putting the four steps together yields the stated pointwise limit.

2. Mean Square (L²) Convergence:

For any square-integrable function \(f \in L^2[-L, L]\) (the space of functions where \(\int_{-L}^{L} |f(x)|^2 \, dx < \infty\)), the Fourier series converges in the mean square sense: \[ \lim_{N \to \infty} \int_{-L}^{L} \left|f(x) - \left(\frac{a_0}{2} + \sum_{n=1}^{N} \left(a_n\cos\left(\frac{n\pi x}{L}\right) + b_n\sin\left(\frac{n\pi x}{L}\right)\right)\right)\right|^2 dx = 0 \] This mean-square convergence rests on the fact that the trigonometric system is a complete orthogonal basis of \(L^2[-L, L]\). Two ingredients combine to establish this. First, the Riesz-Fischer theorem, established in the upcoming page on Lp spaces, guarantees that \(L^2[-L, L]\) is complete, and hence a Hilbert space, under the inner product \(\langle f, g\rangle = \frac{1}{2L}\int_{-L}^{L} f(x)\overline{g(x)}\,dx\). Second, the density of trigonometric polynomials in \(L^2[-L, L]\) follows from the Weierstrass approximation theorem combined with the density of continuous periodic functions in \(L^2\). Once these are in hand, partial-sum convergence in \(L^2\) is a standard consequence of general Hilbert-space theory applied to an orthonormal basis, as developed in the Banach and Hilbert spaces chapter. The role of Lebesgue integration here is to supply the \(L^2\) framework itself; without it, the space of "square-integrable functions" would not be complete and the theorem would fail.
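Mean-square convergence can be observed numerically on the square wave, even though its partial sums never converge uniformly. The sketch below is illustrative only: it fixes \(L = 1\), uses hypothetical helper names, and approximates the error integral by the midpoint rule. By Parseval's identity the exact squared error is \(L\sum_{n > N}(a_n^2 + b_n^2)\), so for \(N = 1\) it should be close to \(2 - \frac{16}{\pi^2}\).

```python
import math

def square_wave(x):
    """Odd square wave on [-1, 1): +1 for x > 0, -1 otherwise (half-period L = 1)."""
    return 1.0 if x > 0 else -1.0

def partial_sum(x, N):
    """Fourier partial sum S_N of the square wave, keeping harmonics n <= N."""
    return sum(4 / (n * math.pi) * math.sin(n * math.pi * x) for n in range(1, N + 1, 2))

def mean_square_error(N, samples=2000):
    """Midpoint-rule approximation of the integral of |f - S_N|^2 over [-1, 1]."""
    h = 2.0 / samples
    total = 0.0
    for k in range(samples):
        x = -1.0 + (k + 0.5) * h
        total += (square_wave(x) - partial_sum(x, N)) ** 2
    return total * h

# Truncation levels are arbitrary illustrative choices.
errors = [mean_square_error(N) for N in (1, 5, 25, 125)]
print(errors)
```

The errors decrease toward zero as \(N\) grows, in line with the limit stated above.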


3. The Gibbs Phenomenon:

At jump discontinuities, the partial sums of the Fourier series exhibit persistent oscillations near the discontinuity. If \(f\) has a jump discontinuity of magnitude \(J\), the partial sums overshoot by approximately \(0.0895 \cdot J\) (about 9% of the jump magnitude) on each side of the discontinuity. As \(N \to \infty\), this overshoot does not disappear but becomes increasingly localized near the discontinuity while maintaining its relative amplitude. This behavior is known as the Gibbs phenomenon.

For example, consider the square wave that jumps from -1 to +1 at \(x=0\). The jump magnitude is \(J = 2\), so the overshoot is approximately \(0.09 \times 2 \approx 0.18\). Thus, the partial sums reach approximately \(1.18\) near the positive side of the jump (instead of +1) and approximately \(-1.18\) near the negative side (instead of -1).
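The overshoot can be estimated numerically from the square-wave partial sums. The following Python sketch scans a uniform grid just to the right of the jump and records the largest partial-sum value; the helper names, the grid size, the choice \(L = 1\), and the numbers of terms are assumptions made for illustration.

```python
import math

def partial_sum(x, terms, L=1.0):
    """Square-wave partial sum using the first `terms` odd harmonics."""
    return (4 / math.pi) * sum(
        math.sin((2 * k + 1) * math.pi * x / L) / (2 * k + 1) for k in range(terms)
    )

def peak_near_jump(terms, L=1.0, grid=4000):
    """Largest partial-sum value on a uniform grid over (0, L/2]."""
    return max(partial_sum(j * (L / 2) / grid, terms, L) for j in range(1, grid + 1))

# Peak value and overshoot as a fraction of the jump J = 2, for several truncations.
peaks = {terms: peak_near_jump(terms) for terms in (10, 40, 100)}
overshoot_fractions = {terms: (peak - 1.0) / 2.0 for terms, peak in peaks.items()}

for terms in peaks:
    print(terms, peaks[terms], overshoot_fractions[terms])
```

Adding more terms moves the peak closer to the jump but leaves its height essentially unchanged, at roughly 9% of the jump above the target value.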

Insight: Gibbs Phenomenon in Signal Processing and ML

The Gibbs phenomenon explains why simply truncating a Fourier series introduces ringing artifacts near sharp edges, a critical consideration in image compression (JPEG) and audio processing (MP3). In practice, window functions (Hamming, Hann, Blackman) taper the coefficients to suppress ringing at the cost of frequency resolution. In machine learning, the Gibbs phenomenon appears when using Fourier features to approximate discontinuous functions: models like Fourier Neural Operators can struggle near sharp interfaces in PDE solutions, motivating hybrid spectral-spatial architectures that handle discontinuities locally.