Suppose that I want to calculate the joint entropy $H(A,B)$ of two discrete random variables of the form:

$A=\{-1,1,1,-1,-1,-1,1,1\}$ and $B=\{1,-1,1,1,-1,-1,-1,1\}$.

If the goal were just to calculate the entropy of $A$ or $B$, then, for example, I would have $H(A)=-\sum p\log_2(p)$, where the probability mass function $p$ is estimated from the observed frequencies of $-1$ and $1$. This means that $H(A)=-\left[\frac{1}{2}\log_2\left(\frac{1}{2}\right)+\frac{1}{2}\log_2\left(\frac{1}{2}\right)\right]$. But what about the joint entropy, and what should I do if I had more than two discrete random variables (of the same form, with elements $-1$ and $1$)?

Gregory Grant
  • 14,874
john
  • 67
  • When you say "of the form", do you mean "characterized by the following samples"? – zoli May 16 '15 at 17:01
  • I mean that a third random variable could be $C=\{-1,-1,-1,1,-1,1,1,1\}$, so I want to calculate $H(A,B,C)$. – john May 16 '15 at 17:04
  • But you showed the calculation for $A$. Do you know how to do it for the pair $A,B$? Why are you saying that you want to do the calculation for more than two random variables? – zoli May 16 '15 at 17:07
  • I do not know how to calculate the joint probability $p(A,B)$ of $A$ and $B$, because I cannot just multiply the two marginal probabilities $P(A)$ and $P(B)$ ($A$ and $B$ may not be independent). Is there a specific way to calculate it? For example, if I use $p(A,B)=p(A|B)P(B)$, how can I calculate $p(A|B)$? – john May 16 '15 at 17:12
  • Ah! I know what you mean. See my answer. If you want to estimate the probability of the joint occurrence of, say, $1,1$, then you count the joint occurrences of $1,1$ in the sample. Your sample is $\{(-1,1),(1,-1),(1,1),(-1,1),(-1,-1),(-1,-1),(1,-1),(1,1)\}$. However, I had to assume that the $A$ samples and the $B$ samples correspond position by position. If I cannot assume that, then there is no way to calculate the joint entropy. – zoli May 16 '15 at 17:50
  • Now, what else do you want to know? If you are satisfied then accept my answer. If you are not, ask further questions. – zoli May 16 '15 at 18:16
  • Thanks a lot for your answers. – john May 16 '15 at 18:30
  • Then, please hit the check mark button. – zoli May 16 '15 at 18:40

1 Answer

The possible values of the triplet $(A,B,C)$ are $\{(1,1,1),(1,1,-1),\cdots,(-1,-1,-1)\}.$ Based on a sample, the probabilities of the $8$ different outcomes can be estimated. Let those probabilities be denoted by $p_{1,1,1},p_{1,1,-1},\cdots,p_{-1,-1,-1}$.

The entropy of $(A,B,C)$, by definition, is

$$H(A,B,C)=-\left(p_{1,1,1}\log (p_{1,1,1})+p_{1,1,-1}\log(p_{1,1,-1})+\cdots+p_{-1,-1,-1}\log (p_{-1,-1,-1})\right).$$

Or, in general, if $\{p_1,p_2,\cdots p_n\}$ is the pmf of a discrete random variable then the corresponding entropy is

$$H=-\sum_{i=1}^np_i\log(p_i).$$ (The base of $\log$ is taken to be $2$ in this context.)
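As a rough illustration of the general formula, here is a minimal Python sketch. The function name `entropy_bits` is only a hypothetical helper chosen for this example; it assumes the pmf is passed as a plain list of probabilities and that terms with $p_i=0$ are skipped, following the usual convention $0\log 0=0$.

```python
import math

def entropy_bits(pmf):
    """Entropy in bits (log base 2) of a pmf given as a list of probabilities.

    Hypothetical helper for illustration; assumes the probabilities sum to 1.
    """
    # Skip zero-probability outcomes: p*log(p) tends to 0 as p tends to 0.
    return -sum(p * math.log2(p) for p in pmf if p > 0)

# Marginal pmf of A from the question: -1 and 1 each occur 4 times out of 8.
print(entropy_bits([0.5, 0.5]))  # 1.0
```

The same function applies unchanged to a joint pmf, since a joint distribution over pairs or triplets is still just a list of probabilities.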

Edited

Example for $A,B$:

For instance, the estimate for $p_{1,1}$ is $\frac{2}{8}=\frac{1}{4}$, because in the given sample of $8$ elements the pair $(1,1)$ occurs $2$ times. Likewise, $p_{1,-1}=\frac{1}{4}$, $p_{-1,1}=\frac{1}{4}$, and $p_{-1,-1}=\frac{1}{4}$. So

$$H(A,B)=-4\cdot\frac{1}{4}\log\left(\frac{1}{4}\right)=-\log\left(\frac{1}{4}\right)=2.$$ But this is only a very poor estimate, because the sample is small.
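The counting procedure can be sketched in Python for any number of variables. This is only a plug-in estimate under the assumption that the samples correspond position by position; the name `joint_entropy_bits` is hypothetical, and the sequence `C` is the one the asker gave in the comments.

```python
import math
from collections import Counter

def joint_entropy_bits(*samples):
    """Plug-in estimate of the joint entropy (in bits) from aligned samples.

    Hypothetical helper: each argument is one variable's sample, and the
    i-th entries of all arguments are assumed to be observed together.
    """
    if len({len(s) for s in samples}) != 1:
        raise ValueError("all samples must have the same length")
    n = len(samples[0])
    # Count how often each joint outcome (tuple of values) occurs.
    counts = Counter(zip(*samples))
    # Relative frequencies serve as the estimated joint pmf.
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

A = [-1, 1, 1, -1, -1, -1, 1, 1]
B = [1, -1, 1, 1, -1, -1, -1, 1]
C = [-1, -1, -1, 1, -1, 1, 1, 1]  # from the asker's comment

print(joint_entropy_bits(A, B))     # 2.0
print(joint_entropy_bits(A, B, C))  # 3.0
```

For $(A,B,C)$ all $8$ observed triplets are distinct, so each gets estimated probability $\frac{1}{8}$ and the estimate equals $3$ bits. With only $8$ observations this is again a very rough estimate.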

zoli
  • 20,452
  • Do you want these probabilities as calculated based on the sample? – zoli May 16 '15 at 17:24
  • Why do you need that? – zoli May 16 '15 at 17:42
  • Because I can calculate $P(A,B)$ by using $P(A,B)=P(A|B) P(B)$. – john May 16 '15 at 17:45
  • Again, see my other comment: If you may assume that the samples correspond then you can estimate the joint probabilities. There is no special (royal) way through the conditional probabilities. Unless the conditional probabilities are given. But this is not the case. – zoli May 16 '15 at 17:52
  • If you insist. For, again, $P(A=1|B=1)$: you take those cases when $B=1$. There are $4$ such cases: $\{(-1,1),(1,1),(-1,1),(1,1)\}$. Among them there are $2$ of the form $(1,1)$. So $P(A=1|B=1)=\frac{1}{2}.$ Also, $P(B=1)=\frac{1}{2}$. So... – zoli May 16 '15 at 18:03