# The diffusion equation

Let us now return to the one-dimensional case. The question we would like to answer now is the following: in the continuum limit, if we make many different independent particles start from the origin at the same instant, supposing that they undergo a random walk where the length of each step is regulated by a probability distribution ${\displaystyle \chi (\ell )}$, how many particles will there be at time ${\displaystyle t}$ in the interval ${\displaystyle [x,x+dx]}$? We will see that the answer to this question leads us to the diffusion equation.

Let us suppose that ${\displaystyle \left\langle \ell \right\rangle =0}$ and ${\displaystyle \left\langle \ell ^{2}\right\rangle =a^{2}}$, with ${\displaystyle a}$ finite (which is rather reasonable[1]). The position of the particle at time ${\displaystyle t+\Delta t}$ given the one at time ${\displaystyle t}$ will be:

${\displaystyle x(t+\Delta t)=x(t)+\ell (t)}$
If we call ${\displaystyle \rho (x,t)}$ the particle density, then at time ${\displaystyle t+\Delta t}$ (with ${\displaystyle \Delta t}$ a small time interval, and we will later take the limit ${\displaystyle \Delta t\to 0}$) we will have:
${\displaystyle \rho (x,t+\Delta t)=\int \rho (x',t)\chi (x-x')dx'}$
In fact, a particle starting from ${\displaystyle x'}$ at time ${\displaystyle t}$ will be in ${\displaystyle x}$ at time ${\displaystyle t+\Delta t}$ with probability ${\displaystyle \chi (x-x')}$, and so integrating over all the possible initial positions of the particles (given by the particle density at time ${\displaystyle t}$) we get exactly the new particle distribution. Defining the variable ${\displaystyle \ell =x-x'}$:
${\displaystyle \rho (x,t+\Delta t)=\int \chi (\ell )\rho (x-\ell ,t)d\ell }$
Let us now suppose that the step size ${\displaystyle \ell }$ is small compared to the scales on which ${\displaystyle \rho }$ varies (which is another reasonable assumption); this way we can expand ${\displaystyle \rho }$ in a Taylor series[2]:
${\displaystyle \rho (x-\ell ,t)=\rho (x,t)-\ell {\frac {\partial }{\partial x}}\rho (x,t)+{\frac {\ell ^{2}}{2}}{\frac {\partial ^{2}}{\partial x^{2}}}\rho (x,t)-{\frac {\ell ^{3}}{3!}}{\frac {\partial ^{3}}{\partial x^{3}}}\rho (x,t)+\cdots }$
so that we can write:
{\displaystyle {\begin{aligned}\rho (x,t+\Delta t)=\int \chi (\ell )\left(\rho -\ell {\frac {\partial \rho }{\partial x}}+{\frac {\ell ^{2}}{2}}{\frac {\partial ^{2}\rho }{\partial x^{2}}}-{\frac {\ell ^{3}}{3!}}{\frac {\partial ^{3}\rho }{\partial x^{3}}}+\cdots \right)d\ell =\\{}\\=\rho -{\frac {\partial \rho }{\partial x}}\left\langle \ell \right\rangle +{\frac {\partial ^{2}\rho }{\partial x^{2}}}{\frac {\left\langle \ell ^{2}\right\rangle }{2}}-{\frac {\partial ^{3}\rho }{\partial x^{3}}}{\frac {\left\langle \ell ^{3}\right\rangle }{3!}}+\cdots \end{aligned}}}
If we now also expand ${\displaystyle \rho }$ for small ${\displaystyle \Delta t}$ and suppose for simplicity that ${\textstyle \left\langle \ell ^{3}\right\rangle =ca^{3}}$ with ${\displaystyle c}$ a constant[3], then:
${\displaystyle \Delta t{\frac {\partial \rho }{\partial t}}+{\frac {\Delta t^{2}}{2}}{\frac {\partial ^{2}\rho }{\partial t^{2}}}={\frac {a^{2}}{2}}{\frac {\partial ^{2}\rho }{\partial x^{2}}}-c{\frac {a^{3}}{6}}{\frac {\partial ^{3}\rho }{\partial x^{3}}}+\cdots }$
and dividing by ${\displaystyle \Delta t}$:
${\displaystyle {\frac {\partial \rho }{\partial t}}+{\frac {\Delta t}{2}}{\frac {\partial ^{2}\rho }{\partial t^{2}}}={\frac {a^{2}}{2\Delta t}}{\frac {\partial ^{2}\rho }{\partial x^{2}}}-{\frac {ca^{3}}{6\Delta t}}{\frac {\partial ^{3}\rho }{\partial x^{3}}}+\cdots }$
We are now interested in the continuum limit, namely ${\displaystyle \Delta t\to 0}$ and ${\displaystyle a\to 0}$; this means that in general the limit of ${\displaystyle a^{2}/(2\Delta t)}$ is not well determined. In particular, we could have:

• ${\displaystyle a^{2}/\Delta t\to 0}$, in which case ${\displaystyle \partial \rho /\partial t=0}$: there is no evolution
• ${\displaystyle a^{2}/\Delta t\to \infty }$, in which case the evolution would be instantaneous
• ${\displaystyle a^{2}/\Delta t\to {\text{const.}}}$

The most interesting case is the last one: in fact, if we require ${\displaystyle a^{2}/2\Delta t}$ to be always (so also in the limits ${\displaystyle a,\Delta t\to 0}$) equal to a constant ${\displaystyle D}$, called diffusion constant, then the terms proportional to ${\displaystyle \partial ^{2}\rho /\partial t^{2}}$ and ${\displaystyle \partial ^{3}\rho /\partial x^{3}}$ and all the other terms in the expansion vanish[4]. Therefore, we find that ${\displaystyle \rho (x,t)}$ must satisfy the so called diffusion equation[5]:

${\displaystyle {\frac {\partial }{\partial t}}\rho (x,t)=D{\frac {\partial ^{2}}{\partial x^{2}}}\rho (x,t)}$
which in three dimensions can be rewritten as:
${\displaystyle {\frac {\partial }{\partial t}}\rho ({\vec {x}},t)=D\nabla ^{2}\rho ({\vec {x}},t)}$

## Diffusion and continuity equation

We have until now not considered a very important property of ${\displaystyle \rho }$: since it is a particle density, it must satisfy a continuity equation. This is an equation that expresses the fact that particles cannot "disappear" and "reappear" in different points of space, but must continuously flow from point to point. To make things more clear let us consider a region of volume ${\displaystyle V}$ of our system: this will contain a certain number of particles, which will change with time because some particles will go out of ${\displaystyle V}$ while others will enter into the volume from the outside, due to the continuous random motion they are subjected to. Therefore, in terms of ${\displaystyle \rho ({\vec {r}},t)}$ the number of particles contained in the volume at time ${\displaystyle t+\Delta t}$ will be:

${\displaystyle \int _{V}\rho ({\vec {r}},t+\Delta t)d{\vec {r}}=\int _{V}\rho ({\vec {r}},t)d{\vec {r}}+{\text{in}}-{\text{out}}}$
where by "in" and "out" we mean the number of particles that have entered or gone out of ${\displaystyle V}$. If we call ${\displaystyle {\vec {J}}}$ the flow of particles[6], and ${\displaystyle {\hat {n}}}$ the unit vector orthogonal to the surface ${\displaystyle S}$ enclosing ${\displaystyle V}$ pointing outward, we can rewrite this equation as:
${\displaystyle \int _{V}\rho ({\vec {r}},t+\Delta t)d{\vec {r}}=\int _{V}\rho ({\vec {r}},t)d{\vec {r}}-\int _{S}{\vec {J}}\cdot {\hat {n}}\Delta tdS}$
(where the last term is the net outgoing flow of particles). Dividing by ${\displaystyle \Delta t}$:
${\displaystyle \int _{V}{\frac {\rho ({\vec {r}},t+\Delta t)-\rho ({\vec {r}},t)}{\Delta t}}d{\vec {r}}=-\int _{S}{\vec {J}}\cdot {\hat {n}}dS}$
Taking the limit ${\displaystyle \Delta t\to 0}$ and using Gauss theorem we get:
${\displaystyle \int _{V}{\frac {\partial \rho }{\partial t}}d{\vec {r}}=-\int _{V}{\vec {\nabla }}\cdot {\vec {J}}d{\vec {r}}}$
Since the volume ${\displaystyle V}$ is arbitrary, the integrands must be always equal:
${\displaystyle {\frac {\partial }{\partial t}}\rho ({\vec {r}},t)=-{\vec {\nabla }}\cdot {\vec {J}}({\vec {r}},t)}$
This is the continuity equation associated to ${\displaystyle \rho }$, with flow ${\displaystyle {\vec {J}}}$.

Now, how are continuity and diffusion equations related? It is immediate to see that they are equivalent if:

${\displaystyle {\vec {J}}({\vec {r}},t)=-D{\vec {\nabla }}\rho ({\vec {r}},t)}$
In fact, this way:
${\displaystyle {\frac {\partial \rho }{\partial t}}=-{\vec {\nabla }}\cdot \left(-D{\vec {\nabla }}\rho \right)=D\nabla ^{2}\rho }$

The physical meaning of equation ${\textstyle {\vec {J}}=-D{\vec {\nabla }}\rho }$ is that a set of particles subjected to a random walk will move from areas of high to areas of low concentration. In other words, the diffusion of particles tends to "flatten" concentration inhomogeneities[7].

## Random walks and central limit theorem

We now want to show a particular property of random walks, and so also of diffusive phenomena: we want to see that they are coherent with the central limit theorem. In fact, this states that if ${\textstyle x_{N}=\sum _{i}\ell _{i}}$ is a sum of independent random variables, then ${\displaystyle x_{N}}$ for large ${\displaystyle N}$ is distributed along a Gaussian: what we want to show is that this is indeed the case. To be more precise, let us reconsider the one-dimensional discrete random walk with constant step length ${\displaystyle a}$. Let us call ${\displaystyle x}$ the position of the particle at the ${\displaystyle N}$-th step, ${\displaystyle n_{+}}$ the number of steps that it has done to the right and ${\displaystyle n_{-}}$ to the left; of course ${\displaystyle N=n_{+}+n_{-}}$, and we also call ${\displaystyle m=n_{+}-n_{-}}$, so that ${\displaystyle m=x/a}$. What we want to do is to determine the probability ${\displaystyle P(m,N)}$ that the particle is at position ${\displaystyle m}$ after ${\displaystyle N}$ steps, and see how it behaves for large ${\displaystyle N}$.

Now, since the probabilities for a step to be done to the right or to the left are the same, and equal to ${\displaystyle 1/2}$, the probability ${\displaystyle P(m,N)}$ will be a binomial one[8]:

${\displaystyle P(m,N)={\begin{pmatrix}N\\n_{+}\end{pmatrix}}{\frac {1}{2^{n_{+}}}}{\frac {1}{2^{N-n_{+}}}}={\begin{pmatrix}N\\{\frac {N+m}{2}}\end{pmatrix}}{\frac {1}{2^{N}}}={\frac {N!}{\left({\frac {N+m}{2}}\right)!\left({\frac {N-m}{2}}\right)!}}}$
It is easier to handle this expression if we take its logarithm:
${\displaystyle \ln P(m,N)=\ln N!-\ln \left({\frac {N+m}{2}}\right)!-\ln \left({\frac {N-m}{2}}\right)!}$
Using Stirling's approximation (see the appendix Stirling's approximation):
${\displaystyle \ln N!\sim N\ln N-N+{\frac {1}{2}}\ln(2\pi N)}$
and supposing that ${\displaystyle m}$ is small compared to ${\displaystyle N}$, we can use the logarithm expansion ${\displaystyle \ln(1+x)\sim x-x^{2}/2+\cdots }$; in the end we get:
${\displaystyle \ln P(m,N)=-{\frac {m^{2}}{2N}}+\ln {\frac {1}{\sqrt {2\pi N}}}}$
Therefore, exponentiating we have that:
${\displaystyle P(m,N)={\frac {1}{\sqrt {2\pi N}}}e^{-{\frac {m^{2}}{2N}}}\quad \qquad N\quad {\text{ large}}}$
In other words, the probability of finding a particle in a given position for large times is distributed along a Gaussian, which is what we expected from the central limit theorem. Note that the result we have just found is completely general: we have not solved the diffusion equation in order to find it, so we can argue that the density of diffusing particles will be a Gaussian for large enough times independently of the initial conditions. In other words no matters what the initial shape of ${\displaystyle \rho }$ was, for large enough times it will have become a Gaussian.

A final remark: the expression of ${\displaystyle P}$ for large ${\displaystyle N}$ is formally defined also for ${\displaystyle m>N}$. However, this is clearly impossible: if the particle has done ${\displaystyle N}$ steps on the right its final position can't be ${\displaystyle m>N}$. Should we care about this problem? Not really much, in reality: it is in fact true that formally we should also have non-null probabilities to find the particles for ${\displaystyle m>N}$, but in practice these are ridiculously small (we are considering events beyond ${\displaystyle N}$ sigmas from the mean of the Gaussian!), so they do not constitute a real problem.

## The solution of the diffusion equation

We now want to solve the diffusion equation. For the sake of simplicity, we will do that in one dimension, and the first approach we will use is that of a Fourier decomposition.

We begin by noting that the operators ${\displaystyle \partial ^{2}/\partial x^{2}}$ and the translation operator ${\displaystyle T_{a}}$, defined as ${\displaystyle (T_{a}f)(x)=f(x+a)}$, commute; in fact:

${\displaystyle \left({\frac {\partial ^{2}}{\partial x^{2}}}T_{a}f\right)(x)={\frac {\partial ^{2}}{\partial x^{2}}}f(x+a)\quad \qquad \left(T_{a}{\frac {\partial ^{2}}{\partial x^{2}}}f\right)(x)={\frac {\partial ^{2}}{\partial x^{2}}}f(x+a)}$
Therefore, since ${\displaystyle \partial ^{2}/\partial x^{2}}$ and ${\displaystyle T_{a}}$ commute they have a common basis of eigenfunctions. Thus, if we find the eigenfunctions of ${\displaystyle T_{a}}$:
${\displaystyle T_{a}\varphi =\lambda \varphi \quad \Rightarrow \quad \varphi (x+a)=\lambda \varphi _{x}\quad \Rightarrow \quad \varphi _{k}(x)=e^{ikx}\quad {\text{and}}\quad \lambda =e^{ika}}$
we have also found the eigenfunctions of ${\displaystyle \partial ^{2}/\partial x^{2}}$:
${\displaystyle {\frac {\partial ^{2}}{\partial x^{2}}}\varphi _{k}(x)=-k^{2}e^{-ikx}=-k^{2}\varphi _{k}(x)}$
These are plain waves, and in order to make them orthonormal we redefine them as:
${\displaystyle \varphi _{k}(x)={\frac {1}{\sqrt {2\pi }}}e^{ikx}}$
so that we have indeed:
${\displaystyle \int _{-\infty }^{+\infty }\varphi _{k}(x)\varphi _{k}^{*}(y)dk={\frac {1}{2\pi }}\int _{-\infty }^{+\infty }e^{ik(x-y)}dk=\delta (x-y)}$

Therefore, we can expand any given ${\displaystyle \rho }$ in terms of plane waves:

${\displaystyle \rho (x,t)={\frac {1}{2\pi }}\int \rho _{k}(t)e^{ikx}dk}$
where ${\displaystyle \rho _{k}(t)}$ are the Fourier coefficients, given by:
${\displaystyle \rho _{k}(t)=\int e^{-ikx}\rho (x,t)dx}$
This way we have:
${\displaystyle {\frac {\partial }{\partial t}}\rho (x,t)={\frac {1}{2\pi }}\int {\frac {\partial \rho _{k}}{\partial t}}e^{ikx}dk\quad \qquad D{\frac {\partial ^{2}}{\partial x^{2}}}\rho (x,t)={\frac {1}{2\pi }}\int \left(-Dk^{2}\right)\rho _{k}(t)e^{ikx}dk}$
and the validity of the diffusion equation implies that the integrands must be equal, i.e.:
${\displaystyle {\frac {\partial }{\partial t}}\rho _{k}(t)=-Dk^{2}\rho _{k}(t)\quad \Rightarrow \quad \rho _{k}(t)=e^{-Dk^{2}(t-t_{0})}\rho _{k}(t_{0})}$
Note that this means that the various Fourier modes of the initial conditions are "muffled" as time passes, and the larger is ${\displaystyle k}$ the the more muffled ${\displaystyle \rho _{k}(t)}$ will be. Therefore, substituting in ${\displaystyle \rho (x,t)}$:
${\displaystyle \rho (x,t)={\frac {1}{2\pi }}\int \rho _{k}(t)e^{ikx}dk={\frac {1}{2\pi }}\int \rho _{k}(t_{0})e^{ikx-Dk^{2}(t-t_{0})}dk}$

Everything we have stated so far is abslutely general; we now consider a very special case: let us suppose ${\displaystyle t_{0}=0}$ for simplicity and set ${\displaystyle \rho (x,0)=\delta (x)}$, namely at the beginning all the particles are located at the origin. We have:

${\displaystyle \rho _{k}(0)=\int e^{-ikx}\delta (x)dx=1\quad \Rightarrow \quad \rho (x,t)={\frac {1}{2\pi }}\int e^{-Dk^{2}t+ikx}dk}$
This is a Fresnel integral, which can be computed with complex analysis using Cauchy's theorem; however, we use a "trick" to determine ${\displaystyle \rho }$ without explicitly calculating it, based on the fact that:
${\displaystyle \int _{-\infty }^{+\infty }e^{-z^{2}}dz={\sqrt {\pi }}}$
Deriving ${\displaystyle \rho ({\vec {r}},t)}$ with respect to ${\displaystyle x}$, we have:
{\displaystyle {\begin{aligned}{\frac {\partial \rho }{\partial x}}={\frac {i}{2\pi }}\int _{-\infty }^{+\infty }\underbrace {ke^{-Dk^{2}t}} _{-{\frac {1}{2Dt}}{\frac {\partial }{\partial k}}e^{-Dk^{2}t}}e^{ikx}dk=-{\frac {i}{4\pi Dt}}\int _{-\infty }^{+\infty }\left({\frac {\partial }{\partial k}}e^{-Dk^{2}t}\right)e^{ikx}dk=\\=-{\frac {i}{4\pi Dt}}\left(\underbrace {\left.e^{-Dk^{2}t+ikx}\right|_{-\infty }^{+\infty }} _{0}-\underbrace {\int _{-\infty }^{+\infty }ixe^{-Dk^{2}t+ikx}dk} _{ix2\pi \rho (x,t)}\right)=-{\frac {x}{2Dt}}\rho (x,t)\end{aligned}}}
Therefore:
${\displaystyle \rho (x,t)={\text{const.}}\cdot e^{-{\frac {x^{2}}{4Dt}}}}$
and the constant can be determined from the normalization condition, which leads to:
${\displaystyle \rho (x,t)={\frac {1}{\sqrt {4\pi Dt}}}e^{-{\frac {x^{2}}{4Dt}}}}$
This is the solution of the diffusion equation for the initial condition ${\displaystyle \rho (x,0)=\delta (x)}$. Let us see its properties:

• After ${\displaystyle t=0}$, ${\displaystyle \rho }$ is a Gaussian of height ${\displaystyle 1/{\sqrt {4\pi Dt}}}$ and width ${\displaystyle {\sqrt {2Dt}}}$: as time passes this Gaussian lowers and widens

• As ${\displaystyle t\to 0}$, ${\displaystyle \rho (x,t)\to \delta (x)}$ (comprehensibly)

• The mean value ${\displaystyle \left\langle x\right\rangle _{t}}$ of ${\displaystyle x}$ at time ${\displaystyle t}$ is null:

${\displaystyle \left\langle x\right\rangle _{t}=\int \rho (x,t)xdx=0}$
since the integrand is odd. Then calling ${\displaystyle \alpha =1/4Dt}$ for simplicity, we have[9]:
{\displaystyle {\begin{aligned}\sigma _{t}^{2}=\left\langle x_{t}\right\rangle _{t}=\int \rho (x,t)x^{2}dx=\int {\frac {x^{2}e^{-\alpha x^{2}}}{\sqrt {\pi /\alpha }}}dx=-{\sqrt {\frac {\alpha }{\pi }}}{\frac {\partial }{\partial \alpha }}\int e^{-\alpha x^{2}}dx=\\=-{\sqrt {\frac {\alpha }{\pi }}}{\frac {\partial }{\partial \alpha }}{\sqrt {\frac {\pi }{\alpha }}}=-{\frac {\partial }{\partial \alpha }}\ln {\sqrt {\frac {\pi }{\alpha }}}={\frac {1}{2\alpha }}=2Dt\end{aligned}}}
Therefore ${\displaystyle \sigma _{t}={\sqrt {2Dt}}}$, a result we already knew from the general properties of random walks.

If we use as initial condition ${\displaystyle \rho (x,0)=\delta (x-y)}$, then by definition ${\displaystyle \rho (x,t)}$ is the Green's function of the diffusion equation:

${\displaystyle \rho (x,t)={\frac {1}{\sqrt {4\pi Dt}}}e^{-{\frac {(x-y)^{2}}{4Dt}}}:=G(x-y,t)}$
Green's functions can be used to solve differential equations, like the diffusion one, with an approach different from the Fourier decomposition. In fact, if we can (at least formally) write ${\displaystyle \rho }$ as a superposition in the following form:
${\displaystyle \rho (x,0)=\int \rho (y,0)\delta (x-y)dy}$
then the general solution of the diffusion equation is:
${\displaystyle \rho (x,t)=\int G(x-y,t)\rho (y,0)dy}$

Of course, both Fourier's and Green's methods are equivalent, but depending on the situation and the initial conditions considered one can result more convenient than the other.

1. The fact that ${\displaystyle \left\langle \ell \right\rangle =0}$ of course means that the random walk is unbiased. As we will later see in Fokker-Planck and Langevin's equations, we can also consider cases where the random walk is actually biased by an external force. Note also that, however, there can be cases where ${\displaystyle \chi (\ell )}$ does not have a finite variance, for example power law distributions. In this case the resulting motion is called Lévy flight, and the main difference with a "normal" random walk is that in a Lévy flight the particles sometimes make very long steps. Such systems have been also used to model the movement of animal herds.
2. We also keep the third order in the expansion, to make explicit that all the terms beyond the second order vanish in the limits we will take.
3. An example of probability distribution that satisfies this requirement is the Poisson distribution.
4. In particular, the term proportional to ${\displaystyle \partial ^{3}\rho /\partial x^{3}}$ vanishes because:
${\displaystyle {\frac {ca^{3}}{6\Delta t}}={\frac {c}{6}}aD\to 0}$
5. A fun fact: the diffusion equation with an imaginary time is the Schro"dinger equation for a free particle (of course provided that ${\displaystyle \rho }$ is interpreted as a wave function), with a diffusion constant equal to ${\displaystyle D=\hbar /2m}$.
6. Remember that in general the flow of particles ${\displaystyle {\vec {J}}}$ is defined as a vector such that ${\displaystyle |{\vec {J}}|}$ is the number of particles that pass through the surface orthogonal to ${\displaystyle {\vec {J}}}$ per unit time and unit surface. More in general, if ${\displaystyle {\hat {n}}}$ is a unit vector orthogonal to a surface, then ${\displaystyle {\vec {J}}\cdot {\hat {n}}}$ is the number of particles passing through the surface along the direction of ${\displaystyle {\hat {n}}}$ per unit time and surface. If ${\displaystyle {\vec {J}}\cdot {\hat {n}}<0}$, this means that the particles are passing through the surface along the direction of ${\displaystyle -{\hat {n}}}$.
7. However, there are situations were diffusion (despite the presence of the negative sign in equation ${\displaystyle {\vec {J}}=-D{\vec {\nabla }}\rho }$) can actually highlight and increase concentration inhomogeneities, spontaneously bringing a system from a homogeneous to an inhomogeneous configuration. These are called patterning phenomena.
8. We can also justify this thinking that the choice of the direction of the step is regulated by the flips of a coin.
9. This result could have been obtained much more easily, without doing any computation, from the fact that ${\displaystyle \rho (x,t)}$ is a Gaussian.