
I am reading an introduction to information theory and entropy.

There are $N$ particles, which can move among a million boxes. What does the author mean by the following?

Consider the number of configurations with the particles distributed almost equally, except that half the boxes are short by one particle, and the rest have an extra. The number of such configurations is:

$${10^6 \choose 10^6/2} = {10^6! \over (10^6/2)! (10^6-10^6/2)!}$$

Why is a binomial coefficient used here? I know that it appears in yes/no experiments: if I run a million yes/no trials, it counts the ways to pick 500k of them. What does that have to do with this problem?

Evidently, this counts the number of ways to occupy half of the container. However, what if $N < 10^6/2$, so that there are not enough particles to fill the configuration? Should I assume $N \gg 10^6$?

Update: I see that the author never uses $N$ to compute the probabilities of the particle distribution. As I understand it, he wanted to show that the most probable state is the one in which the molecules occupy only half of the container, following the binomial and normal distributions, which peak in the center. However, I intuitively feel that, given trillions of molecules ($N \gg 10^6$), they will fill the whole container. That is, the overwhelmingly likely configuration must be the one in which all $10^6$ cells are occupied, rather than only half of them.

Is there an example in which a uniform distribution converges to a normal one as we increase the configuration resolution?

Response from the author

the "choosing" is not which boxes will have any molecules in them, but which boxes will have (either) one extra molecule, or, alternatively, will have one less molecule. So: 500,000 boxes will have (N / 10^6) + 1 molecules, and 500,000 boxes will have (N / 10^6) - 1 molecules. The "choosing" is exactly which 500,000 boxes will have an extra molecule. There are many ways to choose exactly which 500,000 will have an extra (and the other 500,000, whichever they are, will have one less . . .).

The point is that each of these configurations (with very close to uniform distributions) is an example of a "high entropy" configuration. The fact that there are so many "high entropy" configurations means that if we observe the box at some random time, we are almost certain to see it in a "high entropy" configuration . . .

It is also worth noting that I have only described one very specific example of "high entropy" configurations (those with exactly 500,000 small boxes with one extra molecule, and exactly 500,000 with one less molecule). There are many, many, many other "high entropy" configurations with "almost uniform" distributions of the molecules among the $10^6$ small boxes. Each one of these also contributes to the likelihood that we will observe the system in a "high entropy" configuration. In particular, the probability that we will observe the system in a "high entropy" configuration is so close to 1 that we might just as well call it 1.
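The "choosing" described in the response can be checked by brute force on a toy system. The sketch below is only an illustration, not the author's code: the helper names `near_uniform_configs` and `log10_binomial`, and the toy sizes (6 boxes, average of 10 particles per box), are assumptions made for this example. It lists every occupancy pattern in which half the boxes hold one extra particle and the other half hold one fewer, confirms that the count equals the binomial coefficient, and then estimates the size of ${10^6 \choose 10^6/2}$ through logarithms.

```python
import math
from itertools import combinations

def near_uniform_configs(num_boxes, avg):
    """Return every occupancy tuple in which half the boxes hold avg + 1
    particles and the other half hold avg - 1 (num_boxes must be even)."""
    configs = []
    # Choosing WHICH boxes get the extra particle fixes the whole configuration.
    for plus_boxes in combinations(range(num_boxes), num_boxes // 2):
        chosen = set(plus_boxes)
        configs.append(tuple(avg + 1 if i in chosen else avg - 1
                             for i in range(num_boxes)))
    return configs

def log10_binomial(n, k):
    """log10 of C(n, k) via log-gamma, usable when C(n, k) itself is huge."""
    log_value = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_value / math.log(10)

# Toy system (assumed sizes): 6 boxes, N = 60 particles, average 10 per box.
small = near_uniform_configs(6, 10)
print(len(small), math.comb(6, 3))  # both counts agree

# Number of decimal digits (roughly) of C(10^6, 10^6 / 2).
big_log10 = log10_binomial(10**6, 10**6 // 2)
print(big_log10)
```

Note that $N$ only enters through the average occupancy: the count of such patterns depends on the number of boxes alone, which is why $N$ does not appear in the binomial coefficient.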

I still do not understand why these extra molecules are distributed binomially.

Here is a follow-up from Dec 2013

1.) We have $1$ "big" box made up of $10^6$ small boxes.

2.) We have $N$ identical particles (all contained within the big box), and, sensibly, we assume $N$ is much larger than $10^6$. For simplicity, we could take $N=10^{24}$ (i.e., roughly Avogadro's number . . .).

3.) A "configuration" of the particles is an arrangement of the $N$ particles in the "big" box, and, hence, within the $10^6$ small boxes.

4.) Thus, a "configuration" can be thought of as a list of the number of particles in each of the small boxes -- i.e., a list of $10^6$ numbers, say $n_i$ for $i = 1, 2, ... , 10^6$, and the sum of the $n_i$ is equal to $N$ (i.e., every particle is in some small box). The number $n_i$ tells how many particles are in the $i$'th small box. Note that since the particles are all identical to each other, we can't tell exactly which particles are in a small box, but only how many particles are there.

Now, we would like to learn something about the space of "configurations". A first observation we can make is that there are many, many, many possible configurations . . . In particular, there are $10^{24} + 10^6 - 1 \choose 10^6 - 1$ configurations. A quick-and-dirty estimate of $n \choose k$ is ${n \choose k} \geq (n / k)^k$, so in our case, we have the quick estimate that the total number of possible configurations is $\geq \left(10^{24} + 10^6 - 1 \over 10^6 - 1\right)^{10^6 - 1}$, which is approximately $\left(10^{24} / 10^6\right)^{10^6} = (10^{18})^{10^6} = 10^{18 * 10^6}$ (i.e., $1$ with $18$ million $0$'s after it) ...
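The count and the quick estimate above can be sanity-checked numerically. The following is a minimal sketch under stated assumptions: the helper names `count_configs_brute` and `log10_binomial_exact_sum` are invented for this illustration, and the toy sizes (5 particles, 3 boxes) are arbitrary. It verifies the formula ${N + M - 1 \choose M - 1}$ by enumeration on the toy case, then compares $\log_{10}$ of the full count with $\log_{10}$ of the lower bound $(n/k)^k$ for $N = 10^{24}$ and $M = 10^6$. The logarithm is accumulated term by term because subtracting two log-gamma values of size about $10^{25}$ would lose all precision.

```python
import math
from itertools import product

def count_configs_brute(num_particles, num_boxes):
    """Count occupancy lists (n_1, ..., n_M) with sum N by direct enumeration."""
    return sum(1 for occ in product(range(num_particles + 1), repeat=num_boxes)
               if sum(occ) == num_particles)

def log10_binomial_exact_sum(n, k):
    """log10 of C(n, k), using C(n, k) = prod_{i=1..k} (n - k + i) / i."""
    return math.fsum(math.log10(n - k + i) - math.log10(i)
                     for i in range(1, k + 1))

# Toy check (assumed sizes): 5 identical particles in 3 boxes.
toy_brute = count_configs_brute(5, 3)
toy_formula = math.comb(5 + 3 - 1, 3 - 1)
print(toy_brute, toy_formula)

# Full-size numbers from the text, handled through logarithms only.
N, M = 10**24, 10**6
n, k = N + M - 1, M - 1
log10_exact = log10_binomial_exact_sum(n, k)   # about one million terms
log10_lower = k * math.log10(n / k)            # log10 of (n / k)^k
print(log10_lower, log10_exact)
```

The lower bound reproduces the "18 million zeros" figure, and the full count comes out somewhat larger, as a lower bound should allow.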

So, what can we say about the probability that we will observe the system in a configuration with some particular property? We can estimate this by counting the number of configurations having a particular property. So, we do some counting.

1.) One property that a configuration might have is that all the particles are in exactly one of the small boxes. In other words, the configuration would look like $(0, 0, 0, 0, \ldots, 0, 0, 10^{24}, 0, 0, \ldots, 0, 0, 0)$ (i.e., $10^6 - 1$ small boxes have $0$ particles in them, and $1$ small box has $10^{24}$ particles in it). There are $10^6$ different configurations that have this precise property (one configuration for each small box). Thus, we estimate that the probability of finding the system in one of these specific configurations will be less than $10^6 / 10^{18 * 10^6}$ (a very tiny number!).

2.) A second property that a configuration might have is that the particles are exactly uniformly scattered across all the small boxes; in other words, every small box has exactly $10^{18}$ particles in it. This is actually a relatively uninteresting case, because there is exactly one way for this to be the case, and hence this specific configuration is extremely unlikely to occur.
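The two counts above (one configuration per small box for "all in one box", and a single configuration for "exactly uniform") can be reproduced on a toy system by enumerating every occupancy list. This is a sketch only: the sizes `NUM_BOXES = 4` and `NUM_PARTICLES = 12` and the function `classify` are assumptions chosen so the enumeration finishes instantly, and treating each occupancy list as one equally weighted configuration follows the counting model used in the follow-up.

```python
from collections import Counter
from itertools import product

NUM_BOXES = 4        # toy stand-in for the 10^6 small boxes
NUM_PARTICLES = 12   # toy stand-in for N; average of 3 per box

def classify(occupancy):
    """Label an occupancy tuple by the property it satisfies."""
    avg = sum(occupancy) // len(occupancy)
    half = len(occupancy) // 2
    if max(occupancy) == sum(occupancy):
        return "all in one box"
    if all(n == avg for n in occupancy):
        return "exactly uniform"
    if sorted(occupancy) == [avg - 1] * half + [avg + 1] * half:
        return "half +1, half -1"
    return "other"

# Enumerate every list of box counts that sums to NUM_PARTICLES.
tally = Counter(
    classify(occ)
    for occ in product(range(NUM_PARTICLES + 1), repeat=NUM_BOXES)
    if sum(occ) == NUM_PARTICLES
)
total = sum(tally.values())
for label, count in tally.items():
    print(f"{label}: {count} of {total}")
```

With only four boxes the "half +1, half -1" family is still small; the toy is meant to confirm the counting rules, not the large-scale dominance of near-uniform configurations.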

Val
  • Well, I'm not wholly sure, but I would think that you are right about the binomial coefficient being used in the "$10^{6}$ choose $10^{6}/2$" sense. The probability of any given particle going into a box is $10^{6}$ choose $10^{6}/2$ because one half of the boxes have a particle going into them. – Juan Sebastian Lozano Oct 27 '13 at 19:02
  • Wait, probability is something between 0 and 1. ${10^6 \choose 10^6/2}$ is much larger. It cannot be the probability. What does it mean that I have a configuration of ${10^6 \choose 0}=1$ and ${10^6 \choose 10^6}=1$? – Val Oct 27 '13 at 19:28
  • I think the author is implicitly assuming N is a multiple of 10^6. – awkward Oct 27 '13 at 22:30
  • I'm sorry, I was rather ambiguous. I meant that since the probability of a desired event is "$\frac{\text{desired}}{\text{possible}}$", then ${10^{6} \choose 10^{6}/2}$ is the total number of possible ways of choosing $10^{6}/2$ boxes, which might be what the author meant. I don't know, but it seems from the link that it is something along these lines. Like I said, I don't really know (sorry for how little help that was). – Juan Sebastian Lozano Oct 28 '13 at 00:22
  • @JuanSebastianLozanoMuñoz I think that $\text{possible} = 2^{10^6}$ and the binomial coefficient is the $\text{desired}$ part, since the binomial coefficient gives the number of $k$-element groups out of $n$ items, whereas the sum of those counts for $k$ ranging over $0, \dots, n$ is $2^n$. The largest count is near the center, $k = n/2$. But I do not understand what it has to do with the probability of this configuration. Intuitively, I feel that when $N \gg 10^6$, having the molecules occupy half of the container is much less likely than having them occupy every cell. Yet the binomial distribution has its maximum at the center. – Val Oct 28 '13 at 11:14
  • @Val I agree with your intuition about the distribution of particles, but reviewing the relevant part of the book (which was not linked when I had previously commented), it seems that they do not use the binomial coefficient in either probability or in distribution. The author is very simply calculating the number of possible configurations. The $10^6$ comes from the example he had done directly before, and the $\frac{10^6}{2}$ is the number of boxes that have to be filled with one extra particle. The author even states that they are nearly evenly distributed. – Juan Sebastian Lozano Oct 29 '13 at 14:23
  • About my earlier comment: the binomial coefficient gives the number of possible groups (disregarding order) of size $k$ chosen from a population of $n$, so what I meant earlier was that this was the number of possible configurations of boxes with one extra particle, and your "desired" count would be 1 (because I was assuming you wanted a specific configuration). – Juan Sebastian Lozano Oct 29 '13 at 14:33
  • @JuanSebastianLozanoMuñoz do you mean that I have popped $10^6 \choose 10^6/2$ out of my head? – Val Oct 29 '13 at 14:39
  • No, not at all. I was simply trying to explain where the author got the values for ${10^6 \choose \frac{10^6}{2}}$, and why he uses a binomial coefficient. I am pretty much just extending what the answer below says. – Juan Sebastian Lozano Oct 29 '13 at 14:50

1 Answer


The only reasonable interpretation of this number is that a "configuration" is a mapping from boxes to their number of particles (in the sense that all ways to assign particular particles to boxes that result in the same numbers for each box are considered to be the same; one is not distinguishing particles but one is distinguishing boxes). Then a configuration of the indicated kind is determined by a choice of which $500\,000$ boxes get an extra particle (the remaining ones are automatically one particle short). I am in the dark about why this number would be relevant to the discussion; you did not give enough details for that. But in any case, any counting of actual microscopic configurations (which is usually what one does for entropy) would involve $N$, and would give a much larger number than the given binomial coefficient.
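To illustrate the distinction drawn here, the following sketch compares the two ways of counting on a toy system. It is an illustration under assumptions, not part of the book: the function name `microstates` and the toy sizes (4 boxes, 12 particles) are invented for the example, and the multinomial coefficient $N!/\prod_i n_i!$ is the standard count of assignments of distinguishable particles that produce a given list of box counts.

```python
import math

def microstates(occupancy):
    """Number of ways to assign distinguishable particles to boxes so that
    box i receives occupancy[i] particles (a multinomial coefficient)."""
    ways = math.factorial(sum(occupancy))
    for n_i in occupancy:
        ways //= math.factorial(n_i)  # each division is exact
    return ways

# Toy system (assumed sizes): 4 boxes, 12 particles, average 3 per box.
num_boxes = 4
patterns = math.comb(num_boxes, num_boxes // 2)   # which boxes get the extra
per_pattern = microstates((4, 4, 2, 2))           # same for every such pattern
all_assignments = num_boxes ** 12                 # every particle picks a box

print(patterns, per_pattern, patterns * per_pattern, all_assignments)
```

Even at this scale the microscopic count per pattern dwarfs the binomial coefficient, and it depends on the particle number, unlike the count of patterns.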

  • I have fixed the reference to the appropriate section of the book. Perhaps you, being more experienced, can extract more relevant information from it. – Val Oct 28 '13 at 12:08