# A niftier framework for the statistical ensembles

The way we have obtained the canonical and grand canonical partition functions from the microcanonical ensemble in *The canonical ensemble* and *The grand canonical ensemble* is a rather "classic" one, and maybe also the most intuitive.

However, this is not the only possible route: in fact, as we now want to show, it is possible to obtain *all* the ensembles (including the microcanonical one) from the *principle of maximum entropy*, where the entropy is defined in the most general way, i.e.:

$$ S = -k_B \sum_i p_i \ln p_i $$

with $p_i$ the probability that the system is found in its $i$-th state.

In other words, what we want to see is that by maximizing the entropy of a system, written in this form, with appropriately chosen constraints, it is possible to determine both the canonical and the grand canonical ensembles.

Let us consider a very different but simple and intuitive example to understand how this can be possible.
Suppose we have a normal six-sided die; if we know nothing about it (namely, we don't know whether it has been fixed or not) then all the possible rolls have the same probability, i.e. $p_i = 1/6$ for $i = 1, \dots, 6$. This fact can also be obtained from the maximization of Shannon's entropy (we remove any proportionality constant for simplicity):

$$ S = -\sum_{i=1}^{6} p_i \ln p_i $$

with the normalization constraint $\sum_i p_i = 1$; introducing a Lagrange multiplier $\lambda_0$, the maximization gives:

$$ \frac{\partial}{\partial p_i} \left[ -\sum_j p_j \ln p_j - \lambda_0 \left( \sum_j p_j - 1 \right) \right] = -\ln p_i - 1 - \lambda_0 = 0 $$

so all the $p_i$ are equal to the same constant, and the normalization fixes $p_i = 1/6$.

Now, suppose that the die has been fixed so that the mean outcome of a roll is some given value $\overline{n} \neq 7/2$; in order to find the new probabilities we now have to maximize $S$ with the additional constraint $\sum_{i=1}^{6} i \, p_i = \overline{n}$. Therefore:

$$ \frac{\partial}{\partial p_i} \left[ -\sum_j p_j \ln p_j - \lambda_0 \left( \sum_j p_j - 1 \right) - \lambda_1 \left( \sum_j j \, p_j - \overline{n} \right) \right] = 0 \quad \Rightarrow \quad p_i = \frac{e^{-\lambda_1 i}}{\sum_{j=1}^{6} e^{-\lambda_1 j}} $$

where $\lambda_1$ is determined by the constraint on the mean value.
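As a quick numerical illustration, here is a minimal Python sketch of this computation; the multiplier is found by bisection on the mean roll, and the target mean $4.5$ for the fixed die is just an illustrative assumption:

```python
import math

def maxent_die(target_mean, faces=6, tol=1e-12):
    """Maximum-entropy probabilities p_i ∝ exp(-lam * i) for a die whose
    mean roll is constrained to target_mean; lam is found by bisection."""
    def mean(lam):
        w = [math.exp(-lam * i) for i in range(1, faces + 1)]
        return sum(i * wi for i, wi in zip(range(1, faces + 1), w)) / sum(w)

    lo, hi = -50.0, 50.0          # mean(lam) decreases monotonically in lam
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean(mid) > target_mean:
            lo = mid              # mean too high -> need a larger multiplier
        else:
            hi = mid
    lam = (lo + hi) / 2
    w = [math.exp(-lam * i) for i in range(1, faces + 1)]
    z = sum(w)
    return [wi / z for wi in w]

p_fair = maxent_die(7 / 2)   # the fair-die mean: the multiplier vanishes
p_fixed = maxent_die(4.5)    # a hypothetical fixed die with mean roll 4.5
```

For the fair mean $7/2$ the multiplier vanishes and the uniform distribution $p_i = 1/6$ is recovered; for a larger mean the exponential weights tilt toward the high faces.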
Let us now see this more in general: suppose we have a system which can be found in $N$ different states (for simplicity we now consider the discrete case), each with probability $p_i$. Let us also suppose that we have put some constraints on the mean values of some observables $O_1, \dots, O_M$ defined on this system, i.e.:

$$ \langle O_k \rangle = \sum_{i=1}^{N} p_i \, O_k(i) = O_k \qquad k = 0, 1, \dots, M $$

where $O_k(i)$ is the value of the $k$-th observable in the $i$-th state and the $O_k$ on the right-hand side are the *given* values of the observables. We have also put the normalization condition in the same form as the other constraints (with $O_0(i) = 1$ and $O_0 = 1$) in order to have a more general notation. As we know, the entropy of the system will be given by (we again drop any constant in front of $S$):

$$ S = -\sum_{i=1}^{N} p_i \ln p_i $$

Maximizing $S$ with a Lagrange multiplier $\lambda_k$ for each constraint, exactly as in the example of the die, we find:

$$ p_i = \frac{1}{Z} e^{-\sum_{k=1}^{M} \lambda_k O_k(i)} \qquad \text{where} \qquad Z := \sum_{i=1}^{N} e^{-\sum_{k=1}^{M} \lambda_k O_k(i)} $$

(the multiplier $\lambda_0$ of the normalization constraint has been absorbed into $Z$). Now, in order to solve the problem we still have to impose all the other constraints $\langle O_k \rangle = O_k$. These can be written as:

$$ \langle O_k \rangle = -\frac{\partial \ln Z}{\partial \lambda_k} = O_k \qquad k = 1, \dots, M $$

a system of $M$ equations that determines the multipliers $\lambda_k$. Differentiating once more, the second derivatives of $\ln Z$ give the *covariance* of $O_j$ and $O_k$:

$$ \frac{\partial^2 \ln Z}{\partial \lambda_j \partial \lambda_k} = \langle O_j O_k \rangle - \langle O_j \rangle \langle O_k \rangle $$

In general, the $(j,k)$-th element of the *covariance matrix* $C$ is exactly defined as the covariance between $O_j$ and $O_k$:

$$ C_{jk} := \langle O_j O_k \rangle - \langle O_j \rangle \langle O_k \rangle $$
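These identities are easy to verify numerically. The following Python sketch builds a small toy system with made-up observable values and checks, by central finite differences, that the first derivatives of $\ln Z$ give (minus) the mean values and that a mixed second derivative gives the covariance:

```python
import math
import random

random.seed(0)
N, M = 5, 2                       # 5 states, 2 toy observables (arbitrary values)
O = [[random.uniform(-1.0, 1.0) for _ in range(N)] for _ in range(M)]

def log_Z(lams):
    return math.log(sum(math.exp(-sum(lams[k] * O[k][i] for k in range(M)))
                        for i in range(N)))

def probs(lams):
    w = [math.exp(-sum(lams[k] * O[k][i] for k in range(M))) for i in range(N)]
    z = sum(w)
    return [wi / z for wi in w]

lams = [0.3, -0.7]                # arbitrary values of the multipliers
p = probs(lams)
mean = [sum(p[i] * O[k][i] for i in range(N)) for k in range(M)]
cov01 = sum(p[i] * O[0][i] * O[1][i] for i in range(N)) - mean[0] * mean[1]

h = 1e-5                          # step for the finite differences

def d_log_Z(k):                   # numerical d(ln Z)/d(lambda_k)
    up = [l + (h if j == k else 0.0) for j, l in enumerate(lams)]
    dn = [l - (h if j == k else 0.0) for j, l in enumerate(lams)]
    return (log_Z(up) - log_Z(dn)) / (2 * h)

def d2_log_Z():                   # numerical d^2(ln Z)/d(lambda_0)d(lambda_1)
    f = lambda a, b: log_Z([lams[0] + a, lams[1] + b])
    return (f(h, h) - f(h, -h) - f(-h, h) + f(-h, -h)) / (4 * h * h)
```

Here `-d_log_Z(k)` reproduces `mean[k]`, and `d2_log_Z()` reproduces `cov01`.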
Let us now briefly see what happens in the continuous case, so that we can use what we have seen in the framework of ensemble theory.
Since we are now dealing with continuous probability densities $\rho$, they will not depend on a discrete index $i$ but on the "continuous index" $\Gamma$ (a point in phase space), and of course the summations over $i$ must be replaced with integrations in phase space. In other words, the entropy of the system will be:

$$ S = -\int \rho(\Gamma) \ln \rho(\Gamma) \, d\Gamma $$

The probability density will be of the form:

$$ \rho(\Gamma) = \frac{1}{Z} e^{-\sum_{k=1}^{M} \lambda_k O_k(\Gamma)} \qquad Z = \int e^{-\sum_{k=1}^{M} \lambda_k O_k(\Gamma)} \, d\Gamma $$
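For completeness, the discrete relations between the multipliers and the given mean values carry over unchanged; a sketch in the continuous notation:

```latex
% Constraints in the continuous case, for k = 1, ..., M:
\langle O_k \rangle = \int \rho(\Gamma)\, O_k(\Gamma)\, d\Gamma = O_k ,
\qquad
\langle O_k \rangle = -\frac{\partial \ln Z}{\partial \lambda_k} ,
\qquad
\frac{\partial^2 \ln Z}{\partial \lambda_j \partial \lambda_k}
  = \langle O_j O_k \rangle - \langle O_j \rangle \langle O_k \rangle .
```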
Let us now apply all this to ensemble theory.
In the microcanonical ensemble we only have the normalization constraint:

$$ \int_{\mathcal{H}(\Gamma) = E} \rho(\Gamma) \, d\Gamma = 1 $$

where $E$ is the *given* value of the energy (the integration is restricted to the hypersurface of constant energy). In this case, therefore, the only non-null "observable" is $O_0$, which as we have seen is a "fictitious" one (defined so that also the normalization condition can be put in the form of a constraint on the mean value of a given observable). In other words, referring to our notation we have $M = 0$, and the probability density has indeed the form:

$$ \rho(\Gamma) = \frac{1}{Z} = \text{const.} $$

i.e. the distribution is uniform on the constant-energy hypersurface^{[1]}.

In the grand canonical ensemble, then, we have, besides the constraint on the mean value of the energy, the additional constraint of having the mean value of the number of particles fixed, namely $\langle N \rangle = \overline{N}$. Explicitly, we have that the entropy of the system is:

$$ S = -\sum_{N=0}^{\infty} \int \rho_N(\Gamma) \ln \rho_N(\Gamma) \, d\Gamma $$

to be maximized with the constraints:

$$ \sum_{N=0}^{\infty} \int \rho_N(\Gamma) \, d\Gamma = 1 \qquad \sum_{N=0}^{\infty} \int \mathcal{H}(\Gamma) \, \rho_N(\Gamma) \, d\Gamma = E \qquad \sum_{N=0}^{\infty} \int N \, \rho_N(\Gamma) \, d\Gamma = \overline{N} $$
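Carrying out the constrained maximization exactly as in the discrete case, one finds, as a sketch (the identification of the multipliers with $\beta = 1/k_B T$ and the chemical potential $\mu$ is taken from the known form of the grand canonical distribution, not derived here):

```latex
% Maximizing S with multipliers for the three constraints gives an
% exponential density, as in the discrete case:
\rho_N(\Gamma) = \frac{1}{\mathcal{Z}}
    e^{-\lambda_1 \mathcal{H}(\Gamma) - \lambda_2 N} ,
\qquad
\mathcal{Z} = \sum_{N} \int e^{-\lambda_1 \mathcal{H}(\Gamma) - \lambda_2 N}\, d\Gamma .
% Comparing with the grand canonical distribution one identifies
% \lambda_1 = \beta and \lambda_2 = -\beta\mu, so that:
\rho_N(\Gamma) = \frac{1}{\mathcal{Z}} e^{-\beta(\mathcal{H}(\Gamma) - \mu N)} .
```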
We conclude with an observation.
We have determined the properties of the ensembles by fixing the values of the first moments of the observables (i.e., $\langle \mathcal{H} \rangle$ and $\langle N \rangle$); we can ask: why haven't we fixed also other moments (namely $\langle \mathcal{H}^2 \rangle$, $\langle N^2 \rangle$, etc.)?
In general it can happen that those additional moments are redundant; let us see a simple example in order to understand this.
Suppose $x$ is a stochastic variable distributed along a probability distribution $p(x)$; imagine that we are given $N$ values $x_i$ of $x$ without knowing $p$, and that we want to understand what $p$ is from the $x_i$-s. How can we proceed? We could try to guess $p$ with a procedure similar to the one we have just seen. For example, we could compute the $k$-th moments of $x$ with $k = 1, \dots, M$, namely:

$$ \langle x^k \rangle = \frac{1}{N} \sum_{i=1}^{N} x_i^k $$

and use them as the constraints of a maximum entropy problem. Then, our guess for $p$ would be:

$$ p(x) = \frac{1}{Z} e^{-\sum_{k=1}^{M} \lambda_k x^k} $$

where the multipliers $\lambda_k$ are fixed by the constraints on the moments^{[2]} in order to determine $p$.
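A small Python sketch of this guessing procedure on a discrete grid (the support grid and the target moments $\langle x \rangle = 0.5$, $\langle x^2 \rangle = 1.2$ are made-up illustrative choices; the multipliers are found by gradient descent on the convex dual problem):

```python
import math

def maxent_from_moments(targets, xs, lr=0.05, steps=5000):
    """Fit p(x) ∝ exp(-sum_k lam_k x^k) on the grid xs so that its first
    len(targets) moments match `targets`; returns (probabilities, multipliers)."""
    M = len(targets)
    lams = [0.0] * M
    for _ in range(steps):
        w = [math.exp(-sum(l * x ** (k + 1) for k, l in enumerate(lams)))
             for x in xs]
        z = sum(w)
        p = [wi / z for wi in w]
        moms = [sum(pi * x ** (k + 1) for pi, x in zip(p, xs)) for k in range(M)]
        # gradient of the dual  ln Z + sum_k lam_k * target_k  is target - moment
        lams = [l - lr * (t - m) for l, t, m in zip(lams, targets, moms)]
    return p, lams

grid = [-3.0 + 6.0 * i / 120 for i in range(121)]
p, lams = maxent_from_moments([0.5, 1.2], grid)   # hypothetical target moments
```

With two fixed moments the guess is a (discretized) Gaussian-like density $\propto e^{-\lambda_1 x - \lambda_2 x^2}$; matching more moments simply adds higher powers of $x$ to the exponent.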

However, in ensemble theory something slightly different happens. Suppose in fact that we had fixed the first two moments of $\mathcal{H}$ in the canonical ensemble; then we would have:

$$ \rho(\Gamma) = \frac{1}{Z} e^{-\lambda_1 \mathcal{H}(\Gamma) - \lambda_2 \mathcal{H}^2(\Gamma)} $$

which is not the Boltzmann distribution that we know describes the canonical ensemble. In other words, by fixing only the first moments we are implicitly *assuming* that the moments different from the first are not actually significant, since strictly speaking there is nothing that would prevent us from fixing also their values. It is the incredible agreement of the predictions made by statistical mechanics with the experimental results that confirms that this is a reasonable assumption.

- ↑ At this point there is no way to understand that, and we are about to see something similar also in the grand canonical ensemble. This is the disadvantage of this way of deducing the statistical ensembles: it is elegant and mathematically consistent, but not really physically intuitive. The "classical" way we have used to derive the ensembles is surely less formal and rigorous, but it allows one to understand *physically* what happens.
- ↑ One could then ask what we can do if we know *absolutely nothing* about $p$. In this case there is nothing that can help besides experience: one tries to get some insight into the problem and then tries different "recipes", from which something new can be learned about $p$.