
I am reading https://xyang35.github.io/2017/04/14/variational-lower-bound/, specifically the second derivation (for KL divergence).

If you check the equation, you will see that at the end it becomes: $$ = -L + \log P(X) $$ But I cannot understand how $\log P(X)$ was isolated, since it is multiplied by the sum over $Q(z)$ in the row before.

I looked into this, and the explanation given is that the sum over $Q(z)$ is just 1. OK, but why didn't we use 1 in the first term as well, which would give: $$ = \log \frac{P(x,z)}{Q(z)} + \log P(x) $$ How can $Q(z)$ just disappear in the second term?

nullgeppetto
  • 3,006
  • q is the probability density function of a probability distribution, so its integral over its space is 1 = 100%. This is Kolmogorov's second axiom (the one of unit measure: https://en.wikipedia.org/wiki/Probability_axioms ) – mathreadler Sep 26 '19 at 10:59
  • @mathreadler I understand, but in the first term, "L", we still use q(Z) – filtertips Sep 26 '19 at 11:08
  • Have you taken a probability theory course? It can be a good idea to do so before you go into neural networks with Bayesian priors. – mathreadler Sep 26 '19 at 11:18

1 Answer


I assume you are asking how they get:

$$-\int_Z q(Z)\log\left(\frac{P(X,Z)}{q(Z)}\right)dZ + \log(p(X))\int_Z q(Z)\,dZ = -L+\log(p(X))$$

Look at the second term:

$$\log(p(X))\cdot\underset{=1}{\underbrace{\int_Z q(Z)\,dZ}} = \log(p(X)) \cdot 1 = \log(p(X))$$ This follows from the unit measure property of probability. The density function $f(t)$ of any probability measure must fulfill:

$$\int_{\Omega} f(t)dt=1$$

where $\Omega$ is its event space.
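To show where each factor goes, here is a sketch of the whole step with the integration variable written out. It assumes the linked page defines $L = \int_Z q(Z)\log\frac{P(X,Z)}{q(Z)}\,dZ$ and uses $P(Z\mid X) = P(X,Z)/P(X)$:

$$
\begin{aligned}
KL\big(q(Z)\,\|\,P(Z\mid X)\big)
&= -\int_Z q(Z)\log\frac{P(Z\mid X)}{q(Z)}\,dZ \\
&= -\int_Z q(Z)\left[\log\frac{P(X,Z)}{q(Z)} - \log P(X)\right]dZ \\
&= -\underbrace{\int_Z q(Z)\log\frac{P(X,Z)}{q(Z)}\,dZ}_{=\,L} + \log P(X)\underbrace{\int_Z q(Z)\,dZ}_{=\,1} \\
&= -L + \log P(X)
\end{aligned}
$$

In the second integral $\log P(X)$ does not depend on $Z$, so it can be moved in front and what remains integrates to 1. In the first integral the logarithm depends on $Z$, so $q(Z)$ acts as a weight and cannot be integrated away separately.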

mathreadler
  • 25,824
  • Thank you very much. But can we say for the first term as well that the summation over Q(z) is one? Is it right to say log(P(x,z)/Q(z)) + log P(x)? Because Q(z) sums to one, we can cancel it in both terms, right? – filtertips Sep 26 '19 at 11:20
  • Maybe take a probability course. And vote up responses which you find helpful. It will help you get more help in the future. – mathreadler Sep 26 '19 at 12:54
  • But if you look at the first term and check the integral of Q(z), it stays in the equation. It's not considered as 1. This is my problem, I don't understand why... L != log(P(x,z)/Q(z)), L = integral of Q(z) * log(P(x,z)/Q(z)) – filtertips Oct 01 '19 at 16:30
  • @Stenga there is no integral of Q(Z); there is an integral of (Q(Z) * log(...)), where the log expression depends on the integration variable Z. I agree that the notation they are using on the site is quite sloppy and unclear; they have been lazy and skipped "dz" in the integral. The reason we could lift out "log(p(x))" is that the integral is over z, and p(x), and therefore also log(p(x)), is not a function of z. – mathreadler Oct 01 '19 at 17:30
  • Oh, yes, thank you very much. The "dz" makes a lot of sense to me now, as I recalled the basics of integral calculus. I understand the independence now. I appreciate it. Thank you. – filtertips Oct 01 '19 at 18:03
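The point settled in the comments can also be checked numerically. Below is a minimal sketch that replaces the integral over z with a sum over a discrete latent variable. The arrays `joint` and `q` are made-up illustrative values, not taken from the linked page:

```python
import numpy as np

# Hypothetical values of P(x, z) for one fixed observation x and five states of z.
joint = np.array([0.02, 0.05, 0.01, 0.04, 0.03])
p_x = joint.sum()            # P(x) = sum_z P(x, z)
posterior = joint / p_x      # P(z | x)

# Hypothetical approximating distribution Q(z); it sums to 1.
q = np.array([0.10, 0.30, 0.20, 0.25, 0.15])

# First term: Q(z) weights a log that changes with z, so it cannot be dropped.
L = np.sum(q * np.log(joint / q))

# Second term: log P(x) is constant in z, so it factors out of the sum.
second_term = np.sum(q * np.log(p_x))

# KL(Q || P(z|x)) computed directly from its definition.
kl = np.sum(q * np.log(q / posterior))

# What the first term would be if Q(z) were (incorrectly) removed.
unweighted = np.sum(np.log(joint / q))

print("KL             :", kl)
print("-L + log P(x)  :", -L + np.log(p_x))
print("second term    :", second_term, "vs log P(x):", np.log(p_x))
print("L vs unweighted:", L, unweighted)
```

The first two printed values agree, and the second term equals log P(x) exactly because Q(z) sums to 1. The last line shows that dropping the Q(z) weight from the first term gives a different number.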