The Heat Equation¶
The heat equation is the canonical partial differential equation describing diffusion. It plays a central role in probability theory, stochastic processes, and mathematical finance, serving as the prototype for all parabolic PDEs.
The Equation¶
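In one spatial dimension, the heat equation is:
\[\frac{\partial u}{\partial t} = \frac{1}{2}\frac{\partial^2 u}{\partial x^2}, \qquad t > 0,\; x \in \mathbb{R},\]
with initial condition:
\[u(0,x) = f(x).\]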
Here \(u(t,x)\) represents:
- Physics: Temperature at position \(x\) and time \(t\)
- Probability: Density of particles diffusing from initial distribution \(f\)
- Finance: Value function for certain derivative contracts
Why the Factor 1/2?¶
The coefficient \(\frac{1}{2}\) is chosen to align with standard Brownian motion \(B_t\), whose variance satisfies:
\[\operatorname{Var}(B_t) = \mathbb{E}[B_t^2] = t.\]
With this normalization:
- The heat kernel equals the density of \(B_t\)
- The generator of Brownian motion is \(\mathcal{L} = \frac{1}{2}\frac{\partial^2}{\partial x^2}\)
- Feynman-Kac formulas take their simplest form
Alternative convention: Physics texts often write \(u_t = \kappa u_{xx}\) where \(\kappa\) is the thermal diffusivity. Setting \(\kappa = \frac{1}{2}\) gives our normalization.
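The claim that the density of \(B_t\) solves \(u_t = \frac{1}{2}u_{xx}\) can be checked numerically. The following is a minimal sketch, assuming central finite differences with a hand-picked step `h`; the function names are illustrative only.

```python
import math

def heat_kernel(t, x):
    """Density of N(0, t), i.e. the heat kernel for u_t = (1/2) u_xx."""
    return math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def pde_residual(t, x, h=1e-3):
    """Central-difference estimate of u_t - (1/2) u_xx for the kernel."""
    u_t = (heat_kernel(t + h, x) - heat_kernel(t - h, x)) / (2.0 * h)
    u_xx = (heat_kernel(t, x + h) - 2.0 * heat_kernel(t, x)
            + heat_kernel(t, x - h)) / (h * h)
    return u_t - 0.5 * u_xx
```

The residual is not exactly zero because the differences carry an \(O(h^2)\) truncation error, but it should be small compared with the size of \(u_t\) itself.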
Physical Derivation¶
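The heat equation arises from two physical principles:
1. Conservation of energy: The temperature at a point changes only through the net flow of heat into or out of it:
\[\frac{\partial u}{\partial t} + \frac{\partial q}{\partial x} = 0,\]
where \(q\) is the heat flux.
2. Fourier's law: Heat flows from hot to cold proportionally to the temperature gradient:
\[q = -\kappa\frac{\partial u}{\partial x}.\]
Combining these:
\[\frac{\partial u}{\partial t} = \kappa\frac{\partial^2 u}{\partial x^2}.\]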
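The two principles translate directly into a discrete update. The sketch below is one possible explicit scheme, not a recommended solver: it assumes a uniform grid, insulated ends (zero flux), and a time step satisfying \(\kappa\,\Delta t/\Delta x^2 \leq \frac{1}{2}\) for stability. The name `diffuse_step` is hypothetical.

```python
def diffuse_step(u, kappa, dx, dt):
    """One explicit step of u_t = -q_x with q = -kappa * u_x.

    Assumes zero flux at both ends (insulated rod).
    """
    # Fourier's law on the interfaces between neighbouring cells
    interior = [-kappa * (u[i + 1] - u[i]) / dx for i in range(len(u) - 1)]
    q = [0.0] + interior + [0.0]  # no heat crosses the two ends
    # Conservation of energy: each cell changes by inflow minus outflow
    return [u[i] - dt / dx * (q[i + 1] - q[i]) for i in range(len(u))]

# A unit of heat in the middle cell spreads to its neighbours
profile = diffuse_step([0.0, 0.0, 1.0, 0.0, 0.0], kappa=0.5, dx=1.0, dt=0.5)
```

Because every interior flux is subtracted from one cell and added to its neighbour, the sum of the cell values is unchanged by the step.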
Classification: Parabolic PDEs¶
The heat equation is the prototype of parabolic PDEs.
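General second-order linear PDE in two independent variables \(x\) and \(y\):
\[A\,u_{xx} + 2B\,u_{xy} + C\,u_{yy} + \text{lower-order terms} = 0\]
For the heat equation the second variable is time, and only the \(u_{xx}\) coefficient is nonzero.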
Classification (by discriminant \(B^2 - AC\)):
| Type | Condition | Example | Behavior |
|---|---|---|---|
| Elliptic | \(B^2 - AC < 0\) | Laplace: \(u_{xx} + u_{yy} = 0\) | Equilibrium |
| Parabolic | \(B^2 - AC = 0\) | Heat: \(u_t = u_{xx}\) | Diffusion |
| Hyperbolic | \(B^2 - AC > 0\) | Wave: \(u_{tt} = u_{xx}\) | Propagation |
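The classification rule in the table is mechanical enough to write down as a function. This is a small sketch assuming the second-order part is written as \(A u_{xx} + 2B u_{xy} + C u_{yy}\) with constant coefficients; `classify_pde` is an illustrative name.

```python
def classify_pde(A, B, C):
    """Classify A u_xx + 2B u_xy + C u_yy + (lower order) = 0 by B^2 - AC."""
    discriminant = B * B - A * C
    if discriminant < 0:
        return "elliptic"
    if discriminant == 0:
        return "parabolic"
    return "hyperbolic"

# The three rows of the table, with t playing the role of y where needed
laplace = classify_pde(1, 0, 1)    # u_xx + u_yy = 0
heat = classify_pde(1, 0, 0)       # u_t = u_xx has no u_tt or u_xt term
wave = classify_pde(1, 0, -1)      # u_xx - u_tt = 0
```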
Key Qualitative Properties¶
1. Smoothing (Regularization)¶
Even if the initial data \(f\) is rough (e.g., discontinuous), the solution \(u(t,\cdot)\) becomes infinitely differentiable for any \(t > 0\).
Intuition: Diffusion averages out irregularities.
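A concrete case: for the step initial condition \(f(x) = \mathbf{1}_{\{x > 0\}}\), the solution of \(u_t = \frac{1}{2}u_{xx}\) is \(u(t,x) = \Phi(x/\sqrt{t})\), the standard normal distribution function evaluated at \(x/\sqrt{t}\). The sketch below evaluates it with `math.erf` and estimates the slope at the former jump, which is finite for every \(t > 0\) and equals \(1/\sqrt{2\pi t}\). Function names are illustrative.

```python
import math

def step_solution(t, x):
    """u(t, x) = Phi(x / sqrt(t)) for the initial step f = 1_{x > 0}."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0 * t)))

def slope_at_origin(t, h=1e-4):
    """Central-difference estimate of u_x(t, 0)."""
    return (step_solution(t, h) - step_solution(t, -h)) / (2.0 * h)
```

As \(t \to 0\) the slope \(1/\sqrt{2\pi t}\) blows up, recovering the discontinuity of the initial data.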
2. Infinite Speed of Propagation¶
If \(f \geq 0\) has compact support and is not identically zero, then \(u(t,x) > 0\) for all \(x \in \mathbb{R}\) and \(t > 0\).
Contrast with wave equation: Information travels at finite speed for hyperbolic equations.
3. Conservation of Mass¶
If \(\int_{\mathbb{R}} f(x)\,dx = M\), then:
\[\int_{\mathbb{R}} u(t,x)\,dx = M \quad \text{for all } t > 0.\]
Total "heat" (or probability mass) is conserved.
4. Positivity Preservation¶
If \(f(x) \geq 0\) for all \(x\), then \(u(t,x) \geq 0\) for all \(t > 0\) and \(x \in \mathbb{R}\).
Probabilistic meaning: Densities remain non-negative.
5. Decay of Maximum¶
The maximum temperature never increases over time (in the absence of sources): \(\sup_x u(t,x) \leq \sup_x f(x)\) for all \(t > 0\).
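Properties 2 through 5 can be observed together on one example. The following sketch assumes a box-shaped initial profile on \([-1, 1]\) and approximates \(u(t,x) = \int f(y)\,G(t, x-y)\,dy\) by a Riemann sum on a uniform grid; the grid size and the helper names are arbitrary choices for illustration.

```python
import math

def heat_kernel(t, x):
    """Density of N(0, t)."""
    return math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def solve_by_convolution(f_vals, grid, t):
    """Riemann-sum approximation of u(t, x_i) = sum_j f(x_j) G(t, x_i - x_j) dx."""
    dx = grid[1] - grid[0]
    return [dx * sum(fj * heat_kernel(t, xi - xj) for fj, xj in zip(f_vals, grid))
            for xi in grid]

dx = 0.05
grid = [(i - 200) * dx for i in range(401)]               # points in [-10, 10]
f_vals = [1.0 if abs(i - 200) <= 20 else 0.0 for i in range(401)]  # box on [-1, 1]
u_vals = solve_by_convolution(f_vals, grid, t=0.5)

mass_before = dx * sum(f_vals)
mass_after = dx * sum(u_vals)        # conservation of mass
lowest, highest = min(u_vals), max(u_vals)  # positivity and decay of the maximum
far_value = u_vals[260]              # x = 3, outside the support of f
```

Here `far_value` is small but strictly positive, which is the discrete trace of infinite propagation speed.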
The Heat Equation in Higher Dimensions¶
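In \(\mathbb{R}^d\):
\[\frac{\partial u}{\partial t} = \frac{1}{2}\Delta u, \qquad \Delta = \sum_{i=1}^{d}\frac{\partial^2}{\partial x_i^2}\]
The fundamental solution becomes:
\[G(t,x) = \frac{1}{(2\pi t)^{d/2}}\exp\left(-\frac{|x|^2}{2t}\right)\]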
This is the density of \(d\)-dimensional Brownian motion \(B_t \in \mathbb{R}^d\).
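Since the coordinates of \(d\)-dimensional Brownian motion are independent \(N(0,t)\) variables, the density \((2\pi t)^{-d/2}\exp(-|x|^2/(2t))\) factorizes into one-dimensional kernels. A short sketch of that check, with illustrative function names:

```python
import math

def heat_kernel_1d(t, x):
    """Density of N(0, t) in one dimension."""
    return math.exp(-x * x / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def heat_kernel_nd(t, x):
    """Density of d-dimensional Brownian motion at time t; x is a sequence of length d."""
    d = len(x)
    r2 = sum(xi * xi for xi in x)
    return (2.0 * math.pi * t) ** (-d / 2.0) * math.exp(-r2 / (2.0 * t))

point = (0.3, -1.1, 0.5)
joint = heat_kernel_nd(0.7, point)
product = math.prod(heat_kernel_1d(0.7, xi) for xi in point)
```

The two values `joint` and `product` agree up to floating-point rounding.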
Boundary Value Problems¶
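On a bounded domain \(\Omega \subset \mathbb{R}^d\), the equation is posed together with an initial condition and a condition on the boundary \(\partial\Omega\):
\[\frac{\partial u}{\partial t} = \frac{1}{2}\Delta u \quad \text{in } \Omega,\; t > 0, \qquad u(0,x) = f(x) \quad \text{for } x \in \Omega,\]
with \(\Delta\) the Laplacian.
Dirichlet conditions: Specify \(u\) on the boundary (fixed boundary temperature).
Neumann conditions: Specify \(\frac{\partial u}{\partial n}\) on the boundary; \(\frac{\partial u}{\partial n} = 0\) models an insulated boundary.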
Connection to Brownian Motion¶
The heat equation is the analytical counterpart of Brownian motion:
| Probabilistic | Analytical |
|---|---|
| \(B_t \sim N(0,t)\) | \(G(t,x) = \frac{1}{\sqrt{2\pi t}}e^{-x^2/2t}\) |
| \(\mathbb{E}[f(B_t)]\) | \(u(t,0) = \int f(x)G(t,x)\,dx\) |
| Generator \(\mathcal{L} = \frac{1}{2}\frac{d^2}{dx^2}\) | Heat operator \(\partial_t - \frac{1}{2}\partial_{xx}\) |
| Martingale \(f(B_t) - \int_0^t \mathcal{L}f(B_s)\,ds\) | \(u_t = \frac{1}{2}u_{xx}\) |
This connection, formalized by the Feynman-Kac theorem, is the foundation for probabilistic methods in PDE theory.
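The second row of the table extends to any starting point: \(u(t,x) = \mathbb{E}[f(x + B_t)]\). The sketch below tests this by Monte Carlo for \(f = \cos\), where the exact solution of \(u_t = \frac{1}{2}u_{xx}\) is \(e^{-t/2}\cos x\). The sample size and seed are arbitrary assumptions, and the estimate carries statistical error of order \(1/\sqrt{n}\).

```python
import math
import random

def expected_value_mc(f, x, t, n_samples=50_000, seed=0):
    """Monte Carlo estimate of E[f(x + B_t)] with B_t ~ N(0, t)."""
    rng = random.Random(seed)
    sd = math.sqrt(t)
    total = 0.0
    for _ in range(n_samples):
        total += f(x + rng.gauss(0.0, sd))
    return total / n_samples

def exact_cosine_solution(t, x):
    """Solution of u_t = (1/2) u_xx with u(0, x) = cos(x)."""
    return math.exp(-t / 2.0) * math.cos(x)

estimate = expected_value_mc(math.cos, x=0.3, t=1.0)
exact = exact_cosine_solution(1.0, 0.3)
```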
Historical Note¶
- Joseph Fourier (1822): Derived the heat equation and introduced Fourier series to solve it
- Norbert Wiener (1923): Constructed Brownian motion rigorously
- Andrey Kolmogorov (1931): Connected diffusions to parabolic PDEs
- Mark Kac (1949): Probabilistic interpretation of PDE solutions
Summary¶
The heat equation describes:
- Diffusion of heat, particles, or probability
- Smoothing of initial irregularities
- Conservation of total mass
- The analytical side of Brownian motion
The heat equation is the simplest parabolic PDE and the gateway to understanding diffusion processes.
QuantPie Derivation¶
Derivation from Particle Conservation¶
The heat equation emerges naturally from modeling particle diffusion. Consider particles randomly moving in space, where the position change \(\Delta\) has probability density \(\phi(\Delta)\).
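Number of particles at time \(t + \tau\): Let \(f(x,t)\) denote the particle density. A particle found at \(x\) at time \(t + \tau\) was at \(x - \Delta\) at time \(t\), so:
\[f(x, t+\tau) = \int_{-\infty}^{\infty} f(x-\Delta, t)\,\phi(\Delta)\,d\Delta\]
Taylor expansion (first order in \(\tau\), second order in \(\Delta\)):
- Time: \(f(x, t+\tau) = f(x, t) + \frac{\partial f}{\partial t}\tau\)
- Position: \(f(x-\Delta, t) = f(x, t) - \frac{\partial f}{\partial x}\Delta + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}\Delta^2\)
Substituting and using the fact that \(\phi\) is even (symmetric distribution):
\[f + \frac{\partial f}{\partial t}\tau = f\int \phi(\Delta)\,d\Delta - \frac{\partial f}{\partial x}\int \Delta\,\phi(\Delta)\,d\Delta + \frac{1}{2}\frac{\partial^2 f}{\partial x^2}\int \Delta^2\phi(\Delta)\,d\Delta\]
Since \(\int \phi(\Delta)d\Delta = 1\) and \(\int \Delta\phi(\Delta)d\Delta = 0\) (by symmetry):
\[\frac{\partial f}{\partial t}\tau = \frac{1}{2}\frac{\partial^2 f}{\partial x^2}\int \Delta^2\phi(\Delta)\,d\Delta\]
Defining the diffusion coefficient \(D = \frac{1}{2\tau}\int \Delta^2\phi(\Delta)d\Delta\) (half the variance of the displacement per unit time):
\[\frac{\partial f}{\partial t} = D\frac{\partial^2 f}{\partial x^2}\]
Standard Brownian motion has unit variance per unit time, so \(D = \frac{1}{2}\) and this reduces to \(u_t = \frac{1}{2}u_{xx}\).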
Physical interpretation: If neighboring regions have more particles on average, particles diffuse in to increase the local density. Conversely, if neighbors are depleted, the region loses particles.
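The particle picture can be simulated directly. In the sketch below each particle takes independent steps \(\Delta = \pm 1\) with equal probability, so \(\phi\) is symmetric with \(\int \Delta^2\phi\,d\Delta = 1\); after \(n\) steps the positions should have mean near \(0\) and variance near \(n\), the linear-in-time growth that characterizes diffusion. The walker count, step count, and seed are arbitrary assumptions.

```python
import random

def simulate_walkers(n_walkers, n_steps, seed=0):
    """Final positions of independent symmetric +/-1 random walks started at 0."""
    rng = random.Random(seed)
    positions = []
    for _ in range(n_walkers):
        x = 0
        for _ in range(n_steps):
            x += rng.choice((-1, 1))
        positions.append(x)
    return positions

def sample_mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, variance

mean, variance = sample_mean_and_variance(simulate_walkers(4000, 100))
```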
Fundamental Solution via Similarity Method¶
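The heat equation admits a self-similar solution. Consider the scaling transformation:
- \(z = e^a x\)
- \(s = e^b t\)
- \(v(z, s) = e^c u(x, t)\)
For the heat equation to remain invariant (so that \(v\) solves \(\partial_s v = D\,\partial_{zz}v\) whenever \(u\) solves \(\partial_t u = D\,\partial_{xx}u\)), we need \(b = 2a\). Requiring in addition that the total mass \(\int u\,dx\) is preserved gives \(c = -a\), hence \(c/b = -1/2\).
Similarity solution ansatz: The combinations \(x/\sqrt{t}\) and \(u\,t^{-c/b}\) are unchanged by the scaling, which suggests:
\[u(x,t) = t^{c/b}\,y(\xi)\]
where \(\xi = \frac{x}{\sqrt{t}}\) is the similarity variable.
Substituting into the heat equation with \(c/b = -1/2\):
\[-\frac{1}{2}y - \frac{1}{2}\xi y' = D\,y''\]
This simplifies to:
\[D\,y'' + \frac{1}{2}(\xi y)' = 0\]
Integrating once, with \(y, y' \to 0\) as \(|\xi| \to \infty\), gives \(D\,y' + \frac{1}{2}\xi y = 0\), so \(y(\xi) = C e^{-\xi^2/(4D)}\). Normalizing to unit mass fixes \(C = (4\pi D)^{-1/2}\).
Fundamental solution (Green's function) for a unit point source at \(x_0\):
\[G(x,t;x_0) = \frac{1}{\sqrt{4\pi D t}}\exp\left(-\frac{(x-x_0)^2}{4Dt}\right)\]
General solution by superposition:
\[u(x,t) = \int_{-\infty}^{\infty} u(x_0,0)\,G(x,t;x_0)\,dx_0\]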
The fundamental solution is the Gaussian kernel - the transition density of diffusion processes. This connects the PDE theory to probability: the solution represents how an initial distribution \(u(x, 0)\) spreads according to Brownian motion.
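Self-similarity means that \(\sqrt{t}\,G\), viewed as a function of \(\xi = x/\sqrt{t}\), does not depend on \(t\). The sketch below checks this for the Gaussian kernel \(G(x,t;x_0) = (4\pi D t)^{-1/2}\exp(-(x-x_0)^2/(4Dt))\); the default \(D = \frac{1}{2}\) and the function names are assumptions made for illustration.

```python
import math

def green(x, t, x0=0.0, D=0.5):
    """Gaussian kernel for u_t = D u_xx with a unit point source at x0."""
    return math.exp(-(x - x0) ** 2 / (4.0 * D * t)) / math.sqrt(4.0 * math.pi * D * t)

def rescaled_profile(xi, t, D=0.5):
    """sqrt(t) * G evaluated at x = xi * sqrt(t); should not depend on t."""
    return math.sqrt(t) * green(xi * math.sqrt(t), t, D=D)

early = rescaled_profile(0.7, t=0.1)
late = rescaled_profile(0.7, t=25.0)
```

Both values equal \(e^{-\xi^2/(4D)}/\sqrt{4\pi D}\) at \(\xi = 0.7\), up to rounding.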
Exercises¶
Exercise 1. Write the one-dimensional heat equation \(\partial_t u = \frac{1}{2}\partial_{xx}u\) and verify that \(u(x, t) = e^{-\alpha^2 t/2}\sin(\alpha x)\) is a solution for any constant \(\alpha\). What initial condition does this correspond to?
Solution to Exercise 1
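The one-dimensional heat equation is:
\[\partial_t u = \frac{1}{2}\partial_{xx}u\]
Let \(u(x,t) = e^{-\alpha^2 t/2}\sin(\alpha x)\). We compute each side.
Left side:
\[\partial_t u = -\frac{\alpha^2}{2}e^{-\alpha^2 t/2}\sin(\alpha x)\]
Right side: First, \(\partial_x u = \alpha e^{-\alpha^2 t/2}\cos(\alpha x)\). Then:
\[\partial_{xx}u = -\alpha^2 e^{-\alpha^2 t/2}\sin(\alpha x)\]
So:
\[\frac{1}{2}\partial_{xx}u = -\frac{\alpha^2}{2}e^{-\alpha^2 t/2}\sin(\alpha x) = \partial_t u\]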
The initial condition is \(u(x,0) = \sin(\alpha x)\). This is a single Fourier mode that decays exponentially in time at rate \(\alpha^2/2\). Higher-frequency modes (larger \(\alpha\)) decay faster, which is the smoothing property of the heat equation.
Exercise 2. Use the superposition formula \(u(x, t) = \int_{-\infty}^{\infty}u(x_0, 0)\,G(x, t; x_0)\,dx_0\) to solve the heat equation with initial condition \(u(x, 0) = e^{-x^2}\). Express your answer in closed form.
Solution to Exercise 2
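Using the superposition formula with \(u(x_0, 0) = e^{-x_0^2}\) and \(G(x,t;x_0) = (4\pi D t)^{-1/2}\exp(-(x-x_0)^2/(4Dt))\) with \(D = 1/2\):
\[u(x,t) = \frac{1}{\sqrt{2\pi t}}\int_{-\infty}^{\infty} e^{-x_0^2}\exp\left(-\frac{(x-x_0)^2}{2t}\right)dx_0\]
Combine the exponents:
\[-x_0^2 - \frac{(x-x_0)^2}{2t} = -\left(1 + \frac{1}{2t}\right)x_0^2 + \frac{x}{t}x_0 - \frac{x^2}{2t}\]
Let \(A = 1 + \frac{1}{2t} = \frac{2t+1}{2t}\). Completing the square in \(x_0\):
\[-A x_0^2 + \frac{x}{t}x_0 - \frac{x^2}{2t} = -A\left(x_0 - \frac{x}{2tA}\right)^2 + \frac{x^2}{4t^2 A} - \frac{x^2}{2t}\]
The residual term simplifies: \(\frac{x^2}{4t^2 A} - \frac{x^2}{2t} = \frac{x^2}{2t}\left(\frac{1}{2tA} - 1\right) = -\frac{x^2}{2t+1}\).
The Gaussian integral in \(x_0\) evaluates to \(\sqrt{\pi/A} = \sqrt{2\pi t/(2t+1)}\). Combining:
\[u(x,t) = \frac{1}{\sqrt{2\pi t}}\sqrt{\frac{2\pi t}{2t+1}}\,e^{-x^2/(2t+1)} = \frac{1}{\sqrt{1+2t}}\exp\left(-\frac{x^2}{1+2t}\right)\]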
At \(t = 0\): \(u(x,0) = e^{-x^2}\), confirming the initial condition. The solution is a Gaussian that broadens over time, with variance growing from \(1/2\) to \((2t+1)/2\).
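The closed form \(u(x,t) = (1+2t)^{-1/2}\exp(-x^2/(1+2t))\) can be cross-checked against a direct numerical evaluation of the superposition integral. This is a sketch assuming a midpoint rule on a truncated interval; the truncation width and number of nodes are arbitrary choices.

```python
import math

def closed_form(x, t):
    """u(x, t) = (1 + 2t)^(-1/2) * exp(-x^2 / (1 + 2t))."""
    return math.exp(-x * x / (1.0 + 2.0 * t)) / math.sqrt(1.0 + 2.0 * t)

def by_quadrature(x, t, half_width=12.0, n=4000):
    """Midpoint-rule approximation of the integral of exp(-x0^2) * G(x, t; x0) over x0."""
    h = 2.0 * half_width / n
    total = 0.0
    for k in range(n):
        x0 = -half_width + (k + 0.5) * h
        kernel = math.exp(-(x - x0) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)
        total += math.exp(-x0 * x0) * kernel
    return total * h
```

For example, `closed_form(0.8, 0.5)` and `by_quadrature(0.8, 0.5)` should agree to many digits, since the integrand is smooth and decays rapidly.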
Exercise 3. The heat equation describes the diffusion of heat. In one dimension, if heat is initially concentrated at \(x = 0\), describe qualitatively how the temperature profile evolves over time. Relate this to the spreading Gaussian kernel.
Solution to Exercise 3
At \(t = 0\), the heat is concentrated at \(x = 0\), represented by the Dirac delta \(\delta(x)\) (or approximately by a very narrow, tall Gaussian).
As \(t\) increases, the temperature profile evolves as the Gaussian kernel \(G(t,x) = (2\pi t)^{-1/2}\exp(-x^2/(2t))\):
- The peak height decreases as \(1/\sqrt{t}\)
- The width (standard deviation) grows as \(\sqrt{t}\)
- The total area (total heat) is conserved at \(1\)
- The profile remains symmetric about \(x = 0\)
Physically, heat flows from hot regions to cold regions. The initial concentration spreads outward in both directions. The spreading is self-similar: the profile at any time is just a rescaled version of the profile at any other time, with the characteristic width proportional to \(\sqrt{t}\).
Exercise 4. Classify the heat equation \(\partial_t u = D\,\partial_{xx}u\) in terms of the PDE classification (parabolic, elliptic, hyperbolic). Explain why the parabolic type is associated with diffusion rather than wave propagation.
Solution to Exercise 4
The heat equation \(\partial_t u = D\,\partial_{xx}u\) has the form \(Au_{xx} + 2Bu_{xt} + Cu_{tt} + \text{lower order} = 0\) where we identify \(A = D\), \(B = 0\), \(C = 0\) (there is no \(u_{tt}\) term). The discriminant is:
\[B^2 - AC = 0^2 - D \cdot 0 = 0\]
Since \(B^2 - AC = 0\), the equation is parabolic.
Why parabolic = diffusion, not waves: The parabolic type has only a first-order time derivative, meaning information propagates instantaneously (infinite speed of propagation) but with exponential decay of high-frequency modes. In contrast:
- Hyperbolic equations (\(B^2 - AC > 0\), e.g., the wave equation \(u_{tt} = c^2 u_{xx}\)) have a second-order time derivative, supporting wave propagation at finite speed \(c\)
- Elliptic equations (\(B^2 - AC < 0\), e.g., Laplace's equation) describe equilibrium states with no time evolution
The heat equation describes an irreversible process: initial irregularities are smoothed out and cannot be recovered, reflecting the dissipative nature of diffusion.
Exercise 5. Consider the heat equation on a finite interval \([0, L]\) with \(u(0, t) = u(L, t) = 0\). Using separation of variables \(u(x, t) = X(x)T(t)\), find the general solution as a Fourier sine series. What happens to the solution as \(t \to \infty\)?
Solution to Exercise 5
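Using separation of variables \(u(x,t) = X(x)T(t)\), substitute into \(\partial_t u = \frac{1}{2}\partial_{xx}u\):
\[X(x)T'(t) = \frac{1}{2}X''(x)T(t) \quad\Longrightarrow\quad \frac{T'(t)}{T(t)} = \frac{X''(x)}{2X(x)} = -\lambda\]
where \(\lambda\) is the separation constant; the common value \(-\lambda\) is taken negative (\(\lambda > 0\)) so that solutions decay in time.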
Spatial equation: \(X'' + 2\lambda X = 0\) with \(X(0) = X(L) = 0\).
For nontrivial solutions, \(2\lambda > 0\) and \(X(x) = \sin(n\pi x/L)\) with \(2\lambda_n = (n\pi/L)^2\), giving \(\lambda_n = n^2\pi^2/(2L^2)\) for \(n = 1, 2, 3, \ldots\)
Temporal equation: \(T'(t) = -\lambda_n T(t)\), so \(T_n(t) = e^{-\lambda_n t} = e^{-n^2\pi^2 t/(2L^2)}\).
General solution by superposition:
\[u(x,t) = \sum_{n=1}^{\infty} b_n\,e^{-n^2\pi^2 t/(2L^2)}\sin\left(\frac{n\pi x}{L}\right)\]
where \(b_n = \frac{2}{L}\int_0^L f(x)\sin(n\pi x/L)\,dx\) are the Fourier sine coefficients of the initial data.
As \(t \to \infty\): Every exponential \(e^{-n^2\pi^2 t/(2L^2)} \to 0\). The \(n = 1\) mode decays slowest, so the solution approaches zero, with the decay rate dominated by \(e^{-\pi^2 t/(2L^2)}\). Physically, the heat leaks out through the fixed-temperature boundaries until the rod reaches equilibrium at \(u = 0\).
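The series \(u(x,t) = \sum_n b_n e^{-n^2\pi^2 t/(2L^2)}\sin(n\pi x/L)\) is straightforward to evaluate once the coefficients are known. The sketch below assumes the coefficients are computed by a midpoint rule and the series is truncated after a fixed number of modes; both choices, and the function names, are illustrative.

```python
import math

def sine_coefficients(f, L, n_modes, n_quad=2000):
    """b_n = (2/L) * integral_0^L f(x) sin(n pi x / L) dx, by the midpoint rule."""
    h = L / n_quad
    coeffs = []
    for n in range(1, n_modes + 1):
        s = sum(f((k + 0.5) * h) * math.sin(n * math.pi * (k + 0.5) * h / L)
                for k in range(n_quad))
        coeffs.append(2.0 / L * s * h)
    return coeffs

def series_solution(coeffs, L, x, t):
    """Truncated sine series for u_t = (1/2) u_xx with u(0, t) = u(L, t) = 0."""
    return sum(b * math.exp(-(n * math.pi / L) ** 2 * t / 2.0)
               * math.sin(n * math.pi * x / L)
               for n, b in enumerate(coeffs, start=1))

# Single-mode initial data: only b_1 should be nonzero
L = 2.0
coeffs = sine_coefficients(lambda x: math.sin(math.pi * x / L), L, n_modes=5)
midpoint_value = series_solution(coeffs, L, x=1.0, t=1.0)
```

With this initial condition the value at the midpoint is \(e^{-\pi^2 t/(2L^2)}\), which shows the slowest decay rate directly.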
Exercise 6. The diffusion coefficient \(D = \sigma^2/2\) determines the rate of spreading. For \(D = 0.045\) (corresponding to \(\sigma = 0.30\)), compute the standard deviation of the Gaussian kernel after \(t = 1\) year and \(t = 4\) years. Verify that the standard deviation grows as \(\sqrt{t}\).
Solution to Exercise 6
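The standard deviation of the Gaussian kernel at time \(t\) is \(\sigma_{\text{kernel}} = \sqrt{2Dt}\). With \(D = 0.045\):
After \(t = 1\) year:
\[\sigma_{\text{kernel}} = \sqrt{2 \times 0.045 \times 1} = \sqrt{0.09} = 0.30\]
After \(t = 4\) years:
\[\sigma_{\text{kernel}} = \sqrt{2 \times 0.045 \times 4} = \sqrt{0.36} = 0.60\]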
Verification of \(\sqrt{t}\) growth: The ratio of standard deviations is \(0.60/0.30 = 2 = \sqrt{4/1} = \sqrt{t_2/t_1}\), confirming that the standard deviation grows as \(\sqrt{t}\). Doubling the standard deviation requires quadrupling the time, which is the hallmark of diffusive (rather than ballistic) spreading.
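The same arithmetic in code, as a small sketch using the formula \(\sigma_{\text{kernel}} = \sqrt{2Dt}\); `kernel_std` is an illustrative name.

```python
import math

def kernel_std(D, t):
    """Standard deviation sqrt(2 D t) of the Gaussian kernel for u_t = D u_xx."""
    return math.sqrt(2.0 * D * t)

std_1y = kernel_std(0.045, 1.0)   # about 0.30
std_4y = kernel_std(0.045, 4.0)   # about 0.60
ratio = std_4y / std_1y           # about 2 = sqrt(4 / 1)
```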
Exercise 7. Explain why the heat equation is a good model for option pricing through the Black-Scholes PDE. After the change of variables that transforms the Black-Scholes PDE into the heat equation, what do the spatial variable, time variable, and initial condition represent in financial terms?
Solution to Exercise 7
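The Black-Scholes PDE for a European option with price \(V(S,t)\) is:
\[\frac{\partial V}{\partial t} + \frac{1}{2}\sigma^2 S^2\frac{\partial^2 V}{\partial S^2} + rS\frac{\partial V}{\partial S} - rV = 0\]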
This is a parabolic PDE (like the heat equation) because the underlying stock price follows geometric Brownian motion, which is a diffusion process.
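Change of variables: Set \(x = \log S\) (so \(S = e^x\)), \(\tau = T - t\) (time to maturity), and \(V(S,t) = e^{-r\tau}v(x,\tau)\) (remove discounting). After substitution, the PDE becomes:
\[\frac{\partial v}{\partial \tau} = \frac{\sigma^2}{2}\frac{\partial^2 v}{\partial x^2} + \left(r - \frac{\sigma^2}{2}\right)\frac{\partial v}{\partial x}\]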
A further substitution \(v(x,\tau) = e^{\alpha x + \beta \tau}w(x,\tau)\) with appropriate \(\alpha, \beta\) eliminates the first-order and zeroth-order terms, yielding the standard heat equation \(\partial_\tau w = \frac{\sigma^2}{2}\partial_{xx}w\).
Financial interpretation of the transformed variables:
- Spatial variable \(x = \log S\): the log-price of the underlying asset
- Time variable \(\tau = T - t\): time to maturity (running forward from \(0\) at expiry)
- Initial condition \(w(x, 0)\): a transformed version of the option payoff \(g(e^x)\) at expiry, e.g., \(\max(e^x - K, 0)\) for a call option
The heat equation is the right model because the log-price of a stock under geometric Brownian motion is a diffusion process, and the option price is an expected value of the payoff -- exactly the convolution with the heat kernel.
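The "payoff convolved with a Gaussian kernel" description can be tested numerically for a European call. The sketch below integrates the discounted payoff against the Gaussian density of the log-price at expiry, with mean \(\log S + (r - \sigma^2/2)\tau\) and variance \(\sigma^2\tau\), and compares the result with the standard Black-Scholes closed-form formula. That formula is not derived in this section and is quoted here as a known result; the quadrature settings and function names are assumptions.

```python
import math

def call_by_kernel(S, K, r, sigma, tau, n=20000, width=10.0):
    """Discounted call payoff integrated against the Gaussian density of log S_T."""
    m = math.log(S) + (r - 0.5 * sigma ** 2) * tau   # mean of log S_T
    s = sigma * math.sqrt(tau)                       # std of log S_T
    h = 2.0 * width * s / n
    total = 0.0
    for k in range(n):
        y = m - width * s + (k + 0.5) * h            # midpoint node in log-price
        density = math.exp(-(y - m) ** 2 / (2.0 * s * s)) / (s * math.sqrt(2.0 * math.pi))
        total += max(math.exp(y) - K, 0.0) * density
    return math.exp(-r * tau) * total * h

def call_closed_form(S, K, r, sigma, tau):
    """Standard Black-Scholes call price (quoted formula, not derived here)."""
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * math.sqrt(tau))
    d2 = d1 - sigma * math.sqrt(tau)
    norm_cdf = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)

numeric = call_by_kernel(100.0, 100.0, 0.05, 0.2, 1.0)
analytic = call_closed_form(100.0, 100.0, 0.05, 0.2, 1.0)
```

The two prices should agree closely; the remaining gap comes from the midpoint rule and from truncating the integral at ten standard deviations.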