Is entropy "interpolative" (in the sense defined below...)?

Question

Define the usual Shannon $H(p)=-\sum_i p_i\log_2(p_i)$ for discrete pdf (probability distribution function) $p=(p_i,i=1\ldots n)$. Now consider three pdf's: $p_a,p_b$ are given (with the same cardinality $n$), and $p_c$ is a convex combination $p_c=\alpha p_a+(1-\alpha)p_b$ for a given $0\leq\alpha\leq1$.

Then, is it true that $H(p_a)\leq H(p_c)\leq H(p_b)$? I've tried a variety of numerical cases (wrote a small program to try them for me), and it's been true for those cases. But I haven't been able to conjure up a general proof. So (a) can you conjure one up?, and (b) even more concretely, can you (is it even possible to) derive a closed-form formula for $H(p_c)=f(\alpha,H(p_a),H(p_b))$ in terms of some $f(\cdot)$?

P.S. Also, as an auxiliary question, does the same interpolation property hold (or not hold) for continuous pdf's (with the usual sum$\to$integral for $H$)?

E d i t
--------------
Thanks to @IvanNeretin for answering the question, even if the answer has to be "no".
So let me ask a related question: Given $p_a$ and $p_b$ as above, and supposing $H(p_a)\leq H(p_b)$ (just flip $a\leftrightarrow b$ if not), is there any way to construct a $p_c$ such that $H(p_a)\leq H(p_c)\leq H(p_b)$? Ivan cleverly demonstrated that my not-so-clever convex combination isn't such a construction. But that doesn't necessarily rule out some construction that does satisfy the interpolative condition.

E d i t # 2
-------------------
Re Ivan's comment/question beneath his answer, "How did you produce your numerical cases?", the C language program is below. Save it in file entinterp.c (or whatever you like), and compile it as
cc entinterp.c -lm -o entinterp
and run it as, e.g.,
./entinterp .5 ".1,.2,.3,.4&.5,.6,.7,.8"
The first .5 arg is $\alpha$. Then a space followed by the two pdf's, $p_a$ and $p_b$, all enclosed in quotes. Each $p_i,i=1\ldots n,$ is separated by a comma, and the two pdf's are separated by an ampersand. Note that each pdf is normalized by the program, so you don't have to worry about $\sum_i p_i=1$. But you do have to input the same number $n$ of $p_i$'s both before and after the ampersand.

Output from the example illustrated above is...

bash-5.0$ ./entinterp .5 ".1,.2,.3,.4&.5,.6,.7,.8"
H(0.100,0.200,0.300,0.400) = 1.8464
H(0.192,0.231,0.269,0.308) = 1.9785
H(0.146,0.215,0.285,0.354) = 1.9289
entinterp> end-of-job

where the first two output lines are $H(p_a)$ and $H(p_b)$, and the third line is $H(p_c)$.

Now, the above example, like all I originally tried, is indeed interpolative. But after Ivan's answer, I tried some more and found that's it's real, real easy to come up with examples that don't work. So my first comment under Ivan's answer is pretty much irrelevant, based on my mistaken assumption that most cases work.

Anyway, here's the code...

/* --- is entropy interpolative? --- */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
/* --- entry point --- /
int main ( int argc, char argv[] ) {
  /* --- command-line args --- /
  double  alpha = ( argc>1? atof(argv[1]) : 0.5 ); / convexity multiplier /
  char probstr = ( argc>2? argv[2] : ".1,.2,.3,.4&.5,.6,.7,.8" );
  /* --- other variables --- /
  char probstr1=probstr, amp=strchr(probstr,'&'), probstr2=amp+1;
  double probs1=NULL, probs2=NULL, getprobs(), probs3[99];
  double H=0.0, entropy();
  int nprobs = 0;
  / --- get pdf's --- /
  if ( amp == NULL ) goto end_of_job;
  probs1 = getprobs(probstr1);
  probs2 = getprobs(probstr2);
  while ( 1 ) {
    if ( probs1[nprobs]<0.0 || probs2[nprobs]<0.0 ) break;
    probs3[nprobs] = alphaprobs1[nprobs] + (1.0-alpha)probs2[nprobs];
    nprobs++; }
  probs3[nprobs] = (-1.0);
  H = entropy(probs1);
  H = entropy(probs2);
  H = entropy(probs3);
  end_of_job:
    printf("entinterp> end-of-job\n"); exit(0);
  } / --- end-of-function main() --- */
/* --- entry point --- /
double getprobs( char probstr ) {
  double probs = (double )malloc(99sizeof(double));
  int    iprob=0, nprobs=0;
  char   thisptr=probstr, endptr=NULL;
  double thisprob=0.0, totprob=0.0;
  if ( probs==NULL || probstr==NULL ) return(NULL);
  while ( 1 ) {
    while ( thisptr!='\000' && strchr(" ,",thisptr)!=NULL ) thisptr++;
    if ( thisptr=='\000' || thisptr=='&' ) break;
    thisprob = strtod(thisptr,&endptr);
    if ( thisprob < 0.0 ) break;
    probs[nprobs++] = thisprob;
    totprob += thisprob;
    thisptr = endptr;
    } /* --- end-of-while() --- /
  if ( nprobs>0 && totprob>0.0 )
    for ( iprob=0; iprob<nprobs; iprob++ ) probs[iprob] /= totprob;
  end_of_job:
    probs[nprobs] = (-1.0);
    return ( probs );
  } / --- end-of-function getprobs() --- */
/* --- entry point --- /
double entropy( double probs ) {
  double H = 0.0, log2=log(2.0);
  int    nprobs = 0;
  FILE   prtfile = stdout; / NULL for no print /
  if ( probs == NULL ) { H=(-1.0); goto end_of_job; }
  if ( prtfile != NULL ) fprintf(prtfile,"H(");
  while ( probs[nprobs] >= 0.0 ) {
    if ( prtfile != NULL )
      fprintf(prtfile,"%c%.3lf",(nprobs>0?',':'\000'),probs[nprobs]);
    if ( probs[nprobs] > 0.0 ) H -= probs[nprobs](log(probs[nprobs])/log2);
    nprobs++; }
  if ( prtfile != NULL ) fprintf(prtfile,") = %.4lf\n",H);
  end_of_job:
    return ( H );
  } /* --- end-of-function entropy() --- /
/ --- end-of-file entinterp.c --- */

I've been thinking in a different direction. If $p_b$ is the *uniform* distribution (which happens to have the maximum possible entropy), then... your convexity condition is true! Perhaps this "special" case is not so special, after all. — Ivan Neretin, Jul 05 '20 at 07:10
@IvanNeretin Sure, the very first example I tried (with a calculator before programming it) was the uniform distribution $p_a=.5,.5$ (you guessed it, $H=1$) and $p_b=0,1$ ($H=0$), which obviously (even "obvious" to me at the time:) had to work. I think the question, as re-stated in the first Edit, should have all along been: is there any function of $p_a$ and $p_b$ giving a $p_c$ whose entropy is interpolative with respect to them? I naively and incorrectly focused on a convexity combination, which you quickly demonstrated is wrong. — John Forkosh, Jul 05 '20 at 10:20

score 1 · Accepted Answer · answered Jul 04 '20 at 08:24

1

This is obviously not true.

Say, $p_a(i)={2\over n}$ for $i\in[1,{n\over2}]$ and $0$ elsewhere. Also, $p_b(i)={2\over n}$ for $i\in[{n\over2}+1,n]$ and $0$ elsewhere, and $\alpha={1\over2}$. Now $H(p_a)=H(p_b)=\log_2{n\over2}\mathop{\color{red}{\mathbf<}}H(p_c)=\log_2n$.

When you mix two gases, the entropy increases - that's the physical intuition which kinda justifies the "obviously" in the first phrase.

So it goes.

answered Jul 04 '20 at 08:24

Ivan Neretin

12,835

Thanks, Ivan. Duh! Of course that's right, and also obvious... now that you've said it (I've got an ms in physics, not a PhD in phys chem, whereby your intuition didn't cross my mind -- but my grad courses in thermo and stat mech made it obvious after you said it). So let me ask a follow-up question: I tried maybe a dozen or more different numerical cases with my small program, and those all worked, every last one. So I'm guessing it works "most of the time", for some value of "most". What's the condition???, i.e., what relation between $p_a$ and $p_b$ must be satisfied for it to work? – John Forkosh Jul 05 '20 at 04:27
Rather than the follow-up question above, please see the edit on the original question for the more relevant follow-up question (which occurred to me after writing the preceding comment, and which was too long for this comment). But feel free to answer both questions -- I'm still a bit perplexed why all the various numerical cases I tried worked. – John Forkosh Jul 05 '20 at 04:57
That surprises me too; I wouldn't even expect it to work "most of the time" in any sense. How did you produce your numerical cases? – Ivan Neretin Jul 05 '20 at 06:22
Thanks again, Ivan. Re "How...?", see Edit#2 to original question, which includes the program code. Re "...surprises me, too", Edit#2 also remarks that, subsequent to your answer, I can now -- more often than not -- input examples that fail to be interpolative. I must have originally been unconsciously typing in examples that were related in some way so that they worked. I tried too many examples for it to just be coincidence. But I didn't write them down, so I guess we'll never know for sure. – John Forkosh Jul 05 '20 at 10:09

Is entropy "interpolative" (in the sense defined below...)?

1 Answers1