# Entropy and the arrow of time

The evolution of the entropy defined in the section Entropy as ignorance: information entropy is determined only by the microscopic laws of motion, which, as we have seen in the section Time and entropy, are time-reversal invariant. This means that ${\displaystyle S}$ as defined there is time independent, so strictly speaking the entropy of a system should never increase. To see this explicitly, let us show that in general:

${\displaystyle {\frac {d}{dt}}\int f(\rho (\mathbb {Q} ,\mathbb {P} ,t))d\Gamma =0}$
where ${\displaystyle f}$ is a generic function such that ${\displaystyle f(0)=0}$. In our case ${\displaystyle f(x)=-x\ln x}$, which formally is not well defined for ${\displaystyle x=0}$; however, since ${\displaystyle \lim _{x\to 0}x\ln x=0}$ we can extend the function continuously and define ${\displaystyle 0\cdot \ln 0}$ to be zero. Therefore:
${\displaystyle {\frac {d}{dt}}\int f(\rho (\mathbb {Q} ,\mathbb {P} ,t))d\Gamma =\int {\frac {\partial f}{\partial \rho }}{\frac {\partial \rho }{\partial t}}d\Gamma =-\int {\frac {\partial f}{\partial \rho }}{\vec {\nabla }}\cdot (\mathbb {V} \rho )d\Gamma }$
where in the last step we have used the fact that, since ${\displaystyle \rho }$ is a probability density, it must satisfy the continuity equation ${\displaystyle {\dot {\rho }}=-{\vec {\nabla }}\cdot (\mathbb {V} \rho )}$ (see also the discussion of Liouville's theorem), with ${\textstyle \mathbb {V} =({\dot {\mathbb {Q} }},{\dot {\mathbb {P} }})}$. Using Hamilton's equations, we can now easily see that ${\textstyle {\vec {\nabla }}\cdot (\mathbb {V} \rho )=\mathbb {V} \cdot {\vec {\nabla }}\rho }$, because the mixed second derivatives of ${\displaystyle {\mathcal {H}}}$ cancel:
${\displaystyle {\begin{aligned}{\vec {\nabla }}\cdot (\mathbb {V} \rho )=\sum _{i=1}^{3N}\left[{\frac {\partial }{\partial q_{i}}}({\dot {q}}_{i}\rho )+{\frac {\partial }{\partial p_{i}}}({\dot {p}}_{i}\rho )\right]=\sum _{i=1}^{3N}\left[{\frac {\partial }{\partial q_{i}}}\left({\frac {\partial {\mathcal {H}}}{\partial p_{i}}}\rho \right)-{\frac {\partial }{\partial p_{i}}}\left({\frac {\partial {\mathcal {H}}}{\partial q_{i}}}\rho \right)\right]=\\=\sum _{i=1}^{3N}\left[{\frac {\partial {\mathcal {H}}}{\partial p_{i}}}{\frac {\partial \rho }{\partial q_{i}}}-{\frac {\partial {\mathcal {H}}}{\partial q_{i}}}{\frac {\partial \rho }{\partial p_{i}}}\right]=\sum _{i=1}^{3N}\left({\dot {q}}_{i}{\frac {\partial \rho }{\partial q_{i}}}+{\dot {p}}_{i}{\frac {\partial \rho }{\partial p_{i}}}\right)=\mathbb {V} \cdot {\vec {\nabla }}\rho \end{aligned}}}$
Therefore:
${\displaystyle {\frac {d}{dt}}\int f(\rho (\mathbb {Q} ,\mathbb {P} ,t))d\Gamma =-\int {\frac {\partial f}{\partial \rho }}{\vec {\nabla }}\rho \cdot \mathbb {V} d\Gamma =-\int {\vec {\nabla }}f\cdot \mathbb {V} d\Gamma }$
where by the chain rule ${\displaystyle {\vec {\nabla }}f=(\partial f/\partial \rho ){\vec {\nabla }}\rho }$. Now, we have that ${\displaystyle {\vec {\nabla }}f\cdot \mathbb {V} ={\vec {\nabla }}\cdot (f\mathbb {V} )}$, which can be shown exactly as we have done for ${\displaystyle \rho }$. Thus, using Gauss's theorem:
${\displaystyle {\frac {d}{dt}}\int f(\rho (\mathbb {Q} ,\mathbb {P} ,t))d\Gamma =-\int {\vec {\nabla }}\cdot (f\mathbb {V} )d\Gamma =-\int _{\Sigma _{\infty }}f(\rho )\mathbb {V} \cdot d{\vec {\Sigma }}_{\infty }}$
where ${\displaystyle \Sigma _{\infty }}$ is the surface that encloses the phase space volume[1]; since ${\displaystyle \rho }$ is a probability density it will vanish on ${\displaystyle \Sigma _{\infty }}$[2] and so will ${\displaystyle f(\rho )}$, since we are supposing ${\displaystyle f(0)=0}$. Therefore the last integral is null:
${\displaystyle {\frac {d}{dt}}\int f(\rho (\mathbb {Q} ,\mathbb {P} ,t))d\Gamma =0}$
From this we get:
${\displaystyle {\frac {dS}{dt}}=0}$
Note that, given the way we have obtained it, this relation is always valid. This means that, in principle, if we consider a system undergoing an irreversible transformation, like the adiabatic expansion of a gas, its entropy remains constant; however, we know that in such cases entropy always increases. Where does this discrepancy come from?
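
The conservation law just derived can be checked on a toy model. The sketch below is only an illustration under stated assumptions: phase space is replaced by an `n` by `n` grid, and the Hamiltonian flow by the map (q, p) -> (2q + p, q + p) mod n, chosen here because it has unit Jacobian determinant and is a bijection on the grid. All function names are hypothetical. Since such a map only moves the values of the density from cell to cell, the discrete analogue of the integral of f(rho) cannot change, whatever f is.

```python
import math

def f_entropy(x):
    """Integrand f(x) = -x ln x, extended continuously with f(0) = 0."""
    return -x * math.log(x) if x > 0.0 else 0.0

def functional(rho, f):
    """Discrete stand-in for the phase-space integral of f(rho)."""
    return math.fsum(f(value) for row in rho for value in row)

def cat_map_step(rho):
    """Transport rho by (q, p) -> (2q + p, q + p) mod n (assumed toy flow).

    The map is a bijection of the grid onto itself, so it only relabels
    cells: a discrete analogue of an incompressible phase-space flow.
    """
    n = len(rho)
    new = [[0.0] * n for _ in range(n)]
    for q in range(n):
        for p in range(n):
            new[(2 * q + p) % n][(q + p) % n] = rho[q][p]
    return new

def make_blob(n):
    """Normalized, non-uniform density concentrated in one corner."""
    rho = [[0.0] * n for _ in range(n)]
    for q in range(n // 4):
        for p in range(n // 4):
            rho[q][p] = 1.0 + q + 2.0 * p  # arbitrary illustrative weights
    total = math.fsum(v for row in rho for v in row)
    return [[v / total for v in row] for row in rho]

def conservation_demo(n=32, steps=10):
    """Return (sum of -rho ln rho, sum of rho**2) at every step."""
    rho = make_blob(n)
    history = []
    for _ in range(steps + 1):
        history.append((functional(rho, f_entropy),
                        functional(rho, lambda x: x * x)))
        rho = cat_map_step(rho)
    return history
```

Calling `conservation_demo()` returns a list of pairs that should all coincide, even though the density itself is scrambled over the grid at every step. The second entry of each pair uses f(x) = x², to stress that the result does not rely on the specific form of the entropy.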

What we want to show now is that this discrepancy comes from the fact that, in reality, entropy is not a property of a given system but of our knowledge about it. Let us first see this in a rather intuitive way. Suppose we are numerically integrating the equations of motion of a system made of ${\displaystyle N}$ particles enclosed in a fixed volume, and that we choose very unusual initial conditions: for example, we set the initial positions of the particles in only one half of the volume (we are thus simulating the adiabatic expansion of a gas). We let the system evolve for some time, then we stop, invert the velocities of all the particles and restart the integration; this is essentially equivalent to letting the system evolve for some time and then "rewinding" it. We would therefore expect the particles to come back to their initial configuration, since we are just "rewinding" the process; however, this does not occur, and the gas keeps evolving as a normal ideal gas. This happens because computers have finite precision: the position and momentum of every particle are stored with a fixed number of significant figures, and as time passes we lose information about the system, because the computer discards many significant figures that (mathematically) should be present. In order to actually see the gas go back to its original configuration we would thus need a computer with infinite precision, which would not lose information.
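
The rewinding experiment can be imitated with a much simpler system than a gas. The following is a minimal sketch, not the simulation described above: as an assumption, each "particle" follows the Chirikov standard map, a chaotic map whose inverse is known in closed form, so applying the inverse plays the role of inverting the velocities. The kick strength `K`, the function names and the `digits_kept` parameter are all hypothetical choices made for illustration. The loss of information is modelled in two ways: by the ordinary rounding of double-precision arithmetic, and by explicitly rounding the state once at the moment of reversal.

```python
import math
import random

K = 6.0  # kick strength (assumed value); large K makes the map strongly chaotic

def forward(q, p):
    """One step of the standard map; exactly invertible on paper."""
    p_new = p + K * math.sin(q)
    return q + p_new, p_new

def backward(q, p):
    """Algebraic inverse of `forward`: one step of the 'rewind'."""
    q_old = q - p
    return q_old, p - K * math.sin(q_old)

def rewind_error(n_steps, digits_kept=None, n_particles=200, seed=0):
    """Evolve n_steps forward, then n_steps backward.

    Returns, for every particle, the distance between the final state and
    the initial one. If digits_kept is given, the state is rounded to that
    many decimals at the moment of reversal, mimicking discarded figures.
    """
    rng = random.Random(seed)
    errors = []
    for _ in range(n_particles):
        # unusual initial condition: every particle starts in one half
        q0 = rng.uniform(0.0, math.pi)
        p0 = rng.uniform(-1.0, 1.0)
        q, p = q0, p0
        for _ in range(n_steps):
            q, p = forward(q, p)
        if digits_kept is not None:
            q, p = round(q, digits_kept), round(p, digits_kept)
        for _ in range(n_steps):
            q, p = backward(q, p)
        errors.append(math.hypot(q - q0, p - p0))
    return errors
```

With these assumed parameters, `rewind_error(10)` should give errors that are negligible, while `rewind_error(10, digits_kept=3)` and `rewind_error(80)` should give errors comparable to the size of the region explored by the particles: a tiny amount of discarded information is amplified exponentially during the rewind, so the initial configuration is not recovered.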

Let us now see this concept in a more formal way. When we consider a system from a microscopic point of view[3], the probability density ${\displaystyle \rho }$ carries complete information about the system, since it will be of the form:

${\displaystyle \rho (\mathbb {Q} ,\mathbb {P} ,t)=\delta (\mathbb {Q} -\mathbb {Q} (t))\delta (\mathbb {P} -\mathbb {P} (t))}$

Consider now a situation where we have less information about the system: suppose, for example, that we have a probability density ${\displaystyle \rho (\mathbb {Q} ,t)}$ on configuration space that carries no information about the momenta of the particles[4]. Then, as ${\displaystyle \rho }$ evolves, it will satisfy a diffusion-like equation in configuration space[5]:

${\displaystyle {\dot {\rho }}(\mathbb {Q} ,t)=D\nabla ^{2}\rho (\mathbb {Q} ,t)}$
If the entropy of the system is:
${\displaystyle S=-k_{S}\int \rho (\mathbb {Q} ,t)\ln \rho (\mathbb {Q} ,t)d\mathbb {Q} }$
then we have:
${\displaystyle {\frac {dS}{dt}}=-k_{S}\int {\frac {\partial \rho }{\partial t}}\ln \rho d\mathbb {Q} -k_{S}\int {\frac {\partial \rho }{\partial t}}d\mathbb {Q} }$
The second integral is null:
${\displaystyle \int {\frac {\partial \rho }{\partial t}}d\mathbb {Q} ={\frac {\partial }{\partial t}}\int \rho d\mathbb {Q} ={\frac {\partial }{\partial t}}1=0}$
so, inserting the diffusion equation into the remaining term, integrating by parts and using Gauss's theorem:
${\displaystyle {\begin{aligned}{\frac {dS}{dt}}=-k_{S}\int {\frac {\partial \rho }{\partial t}}\ln \rho d\mathbb {Q} =-k_{S}D\int \nabla ^{2}\rho \ln \rho d\mathbb {Q} =\\=-k_{S}D\int _{\Sigma _{\infty }}\ln \rho {\vec {\nabla }}\rho \cdot d{\vec {\Sigma }}_{\infty }+k_{S}D\int {\vec {\nabla }}\rho \cdot {\vec {\nabla }}\ln \rho d\mathbb {Q} \end{aligned}}}$
where ${\displaystyle \Sigma _{\infty }}$ is the surface that encloses the system in configuration space. Assuming that ${\displaystyle \rho ,{\vec {\nabla }}\rho \to 0}$ on ${\displaystyle \Sigma _{\infty }}$, then since ${\displaystyle {\vec {\nabla }}\ln \rho ={\vec {\nabla }}\rho /\rho }$ we have:
${\displaystyle {\frac {dS}{dt}}=k_{S}D\int {\frac {|{\vec {\nabla }}\rho |^{2}}{\rho }}d\mathbb {Q} \geq 0}$
Therefore, we have found that now the entropy of the system really increases, and this follows only from our lack of knowledge about the momenta of the particles.
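
The inequality above can also be observed numerically. What follows is a rough sketch under several assumptions that are not part of the derivation: a single spatial dimension, a box of unit length with reflecting (zero-flux) walls, units in which the constant in front of the entropy is 1, and an explicit finite-difference scheme whose parameter `alpha` stands for D dt / dx². The function names are hypothetical. The initial density is uniform in the left half of the box and zero elsewhere, as in the expansion of a gas.

```python
import math

def entropy(rho, dx):
    """S = -sum(rho ln rho) dx, with the convention 0 ln 0 = 0."""
    return -math.fsum(r * math.log(r) for r in rho if r > 0.0) * dx

def diffuse(rho, alpha):
    """One explicit step of the 1D diffusion equation.

    alpha = D * dt / dx**2 should not exceed 0.5 (stability condition).
    Reflecting walls keep the total probability constant.
    """
    n = len(rho)
    new = []
    for i in range(n):
        left = rho[i - 1] if i > 0 else rho[i]
        right = rho[i + 1] if i < n - 1 else rho[i]
        new.append(rho[i] + alpha * (left - 2.0 * rho[i] + right))
    return new

def expansion_entropy(n_cells=50, n_steps=3000, alpha=0.25):
    """Return the final density and the entropy recorded at every step."""
    dx = 1.0 / n_cells
    # gas initially confined to the left half of the box, normalized to 1
    rho = [2.0 if i < n_cells // 2 else 0.0 for i in range(n_cells)]
    history = [entropy(rho, dx)]
    for _ in range(n_steps):
        rho = diffuse(rho, alpha)
        history.append(entropy(rho, dx))
    return rho, history
```

In this setup the entropy starts at -ln 2, which is the value for a uniform density of height 2 on half of the box, and should grow monotonically towards 0, the value for the uniform density on the whole box. The total increase approaches ln 2, as expected for a volume that doubles.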

We can thus see how entropy really emerges when we do not have perfect knowledge of the system, or in other words when we start ignoring or excluding some degrees of freedom.
1. Note: in momentum space this is always a surface at infinity, while in configuration space it can also be a finite surface depending on the properties of the system (obviously, if the particles can occupy a limited space then ${\displaystyle \Sigma _{\infty }}$ in configuration space will be finite).
2. Intuitively, this can be justified from the fact that ${\displaystyle \rho }$ must be normalized:
${\displaystyle \int \rho (\mathbb {Q} ,\mathbb {P} )d\Gamma =1}$
and this can happen only if ${\displaystyle \rho }$ tends to zero at infinity.
3. Which is what we have done in the proof of the fact that ${\displaystyle {\dot {S}}=0}$, since we have used Hamilton's equations.
4. For example, this can be obtained from the previous probability density by integrating over the momenta and then renormalizing.
5. In fact, as we have seen in the section The diffusion equation, the diffusion and continuity equations are equivalent if ${\displaystyle {\vec {J}}=-D{\vec {\nabla }}\rho }$. Therefore, since ${\displaystyle \rho }$ satisfies a continuity equation (being a probability density), if its current has this form it will also satisfy a diffusion equation with diffusion constant ${\displaystyle D}$.