Maximum Entropy Continuous Distribution

Question

In Pattern Recognition and Machine Learning, Ch. 1.6, the author derives the distribution which maximises the differential entropy:

$$H(\textbf{x})=-\int p(\textbf{x}) \ln (p(\textbf{x})) \, d\textbf{x}$$

To do so, the author imposes three constraints:

$$\int_{-\infty}^{\infty} p(x) \, dx = 1$$

$$\int_{-\infty}^{\infty} x p(x) \, dx = \mu$$

$$\int_{-\infty}^{\infty} (x-\mu)^2 p(x) \, dx = \sigma^2$$

This results in the Lagrangian functional:

$$F(p)=-\int_{-\infty}^{\infty} p(x) \ln(p(x)) dx + \lambda_1(\int_{-\infty}^{\infty} p(x) dx - 1) + \lambda_2 (\int_{-\infty}^{\infty} x p(x) dx - \mu) + \lambda_3(\int_{-\infty}^{\infty} (x-\mu)^2 p(x) dx - \sigma^2)$$

Taking the derivative of this functional using the calculus of variations and setting it equal to zero gives:

$$p(x)=\exp(-1+\lambda_1+\lambda_2 x + \lambda_3 (x-\mu)^2)$$
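For reference, the variational step can be written out explicitly. This is a sketch using the standard pointwise functional derivative; the notation $\delta F/\delta p(x)$ is an assumption and does not appear in the post above. Differentiating the integrand of $F$ with respect to $p(x)$ at fixed $x$ gives

$$\frac{\delta F}{\delta p(x)} = -\ln p(x) - 1 + \lambda_1 + \lambda_2 x + \lambda_3 (x-\mu)^2 = 0 \quad\Longrightarrow\quad \ln p(x) = -1+\lambda_1+\lambda_2 x+\lambda_3 (x-\mu)^2,$$

and exponentiating both sides recovers the expression for $p(x)$.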

The author states that you can find the Lagrange multipliers by back substitution of this result into the three constraint equations, leading to the conclusion that $p(x)$ is a normal density.

I'm wondering how to derive this last step, specifically how to find the Lagrange multipliers. If we substitute back into the constraints we get three integral equations with three unknowns. How would I go about solving these equations?

score 0 · Accepted Answer · 2020-10-18T10:25:30.570

Assume that $\mu=0$ and $\sigma=1$, and let $z:=\sqrt{\pi}e^{-1+\lambda_1}e^{-\lambda_2^2/(4\lambda_3)}$. Then, assuming that $\lambda_3<0$ (otherwise the integrals diverge), the three constraints become $$ I_1:=e^{-1+\lambda_1}\int_{-\infty}^{\infty} e^{\lambda_2x+\lambda_3x^2}\,dx=\frac{z}{(-\lambda_3)^{1/2}}=1, $$ $$ I_2:=e^{-1+\lambda_1}\int_{-\infty}^{\infty} xe^{\lambda_2x+\lambda_3x^2}\,dx=\frac{z\lambda_2}{2(-\lambda_3)^{3/2}}=0, \quad\text{and} $$ $$ I_3:=e^{-1+\lambda_1}\int_{-\infty}^{\infty} x^2e^{\lambda_2x+\lambda_3x^2}\,dx=\frac{z\lambda_2^2}{4(-\lambda_3)^{5/2}}+\frac{z}{2(-\lambda_3)^{3/2}}=1. $$ Substituting $z=(-\lambda_3)^{1/2}$ from the first equation into the other two, we get $$ \frac{\lambda_2}{-2\lambda_3}=0\quad\text{and}\quad \frac{\lambda_2^2}{4\lambda_3^2}+\frac{1}{-2\lambda_3}=1, $$ so that $\lambda_2=0$ and $\lambda_3=-1/2$. Finally, with these values the relation $z=(-\lambda_3)^{1/2}$ reads $\sqrt{\pi}e^{-1+\lambda_1}=1/\sqrt{2}$, which gives $\lambda_1=1-\ln \sqrt{2\pi}$.

Therefore, $$ p(x)=e^{-\ln \sqrt{2\pi}-x^2/2}=\frac{1}{\sqrt{2\pi}}e^{-x^2/2}. $$
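As a sanity check, the multipliers found above can be plugged back into the three constraints numerically. The following is a minimal sketch using only the Python standard library; the helper `integrate`, the truncation of the real line to $[-12, 12]$, and the step count are assumptions chosen for illustration, not part of the answer.

```python
import math

def integrate(f, lo=-12.0, hi=12.0, n=20000):
    """Composite Simpson's rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        # Simpson weights: 4 at odd nodes, 2 at even interior nodes.
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

# Multipliers from the answer (case mu = 0, sigma = 1).
lam1 = 1 - math.log(math.sqrt(2 * math.pi))
lam2 = 0.0
lam3 = -0.5

def p(x):
    # Stationary density exp(-1 + lam1 + lam2*x + lam3*x^2) with mu = 0.
    return math.exp(-1 + lam1 + lam2 * x + lam3 * x * x)

I1 = integrate(p)                        # normalisation, should be close to 1
I2 = integrate(lambda x: x * p(x))       # mean, should be close to 0
I3 = integrate(lambda x: x * x * p(x))   # variance, should be close to 1
print(I1, I2, I3)
```

The tails beyond $|x|=12$ are negligible for this density, so the truncated integrals should agree with the constraint values $1$, $0$, and $1$ up to numerical error.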

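For the general case, substitute $y=(x-\mu)/\sigma$. If $q(y)=\sigma\,p(\mu+\sigma y)$ denotes the density of $y$, then $$ -\int p(x)\ln(p(x))\,dx=-\int q(y)\ln(q(y))\, dy+\ln\sigma, $$ so maximising over $p$ with mean $\mu$ and variance $\sigma^2$ is equivalent to maximising over $q$ with mean $0$ and variance $1$.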
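Undoing the substitution $y=(x-\mu)/\sigma$ in the standard-normal result gives the general density. The multiplier values below are worked out here rather than quoted from the answer, by matching against the form $p(x)=\exp(-1+\lambda_1+\lambda_2 x+\lambda_3(x-\mu)^2)$:

$$ p(x)=\frac{1}{\sqrt{2\pi\sigma^2}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), \qquad \lambda_1=1-\ln\sqrt{2\pi\sigma^2},\quad \lambda_2=0,\quad \lambda_3=-\frac{1}{2\sigma^2}. $$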

Evaluation of $I_1$, $I_2$, and $I_3$:

First, recall that for $c>0$, $$ \int_{-\infty}^\infty e^{-cx^2}\,dx=\sqrt{\frac{\pi}{c}}, $$ and notice that $$ bx-cx^2=-c\left(\frac{b}{2c}-x\right)^2+\frac{b^2}{4c}. $$ Thus, letting $\lambda_1=a$, $\lambda_2=b$, and $\lambda_3=-c$, $$ I_1=e^{-1+a}e^{b^2/(4c)}\int_{-\infty}^\infty e^{-c(b/(2c)-x)^2}\,dx=e^{-1+a}e^{b^2/(4c)}\times \sqrt{\frac{\pi}{c}}. $$ As for the second integral, notice that the integrand below is odd about $x=b/(2c)$, so $$ \int_{-\infty}^\infty \left(x-\frac{b}{2c}\right)e^{-c(b/(2c)-x)^2}\,dx=0, $$ and hence $I_2=I_1b/(2c)$. Finally, differentiating under the integral sign, $$ \frac{d}{dc}\int_{-\infty}^\infty e^{-c(b/(2c)-x)^2}\,dx =\int_{-\infty}^\infty \left(\frac{b^2}{4c^2}-x^2\right)e^{-c(b/(2c)-x)^2}\,dx. $$ Therefore, $$ I_3=I_1\frac{b^2}{4c^2}-e^{-1+a}e^{b^2/(4c)}\times\frac{d}{dc}\sqrt{\frac{\pi}{c}}. $$ Since $\frac{d}{dc}\sqrt{\pi/c}=-\frac{1}{2c}\sqrt{\pi/c}$, this simplifies to $I_3=I_1\left(\frac{b^2}{4c^2}+\frac{1}{2c}\right)$.
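The closed forms for $I_1$, $I_2$, and $I_3$ can be compared against direct numerical integration. The sketch below uses only the standard library; the function names `closed_forms` and `numeric`, the test values of $a$, $b$, $c$, and the integration window are hypothetical choices for illustration.

```python
import math

def integrate(f, lo, hi, n=20000):
    """Composite Simpson's rule on [lo, hi]; n must be even."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def closed_forms(a, b, c):
    """I1, I2, I3 from the completed-square formulas (requires c > 0)."""
    I1 = math.exp(-1 + a + b * b / (4 * c)) * math.sqrt(math.pi / c)
    I2 = I1 * b / (2 * c)
    I3 = I1 * (b * b / (4 * c * c) + 1 / (2 * c))
    return I1, I2, I3

def numeric(a, b, c, half_width=12.0):
    """Same integrals by quadrature on a window centred at the peak b/(2c)."""
    centre = b / (2 * c)
    lo, hi = centre - half_width, centre + half_width
    w = lambda x: math.exp(-1 + a + b * x - c * x * x)
    return (integrate(w, lo, hi),
            integrate(lambda x: x * w(x), lo, hi),
            integrate(lambda x: x * x * w(x), lo, hi))

a, b, c = 0.3, 0.7, 1.2   # arbitrary test values, not taken from the answer
exact = closed_forms(a, b, c)
approx = numeric(a, b, c)
for name, e, n in zip(("I1", "I2", "I3"), exact, approx):
    print(name, e, n)
```

Completing the square keeps every integrand a real Gaussian when $c>0$, which is why no error-function limits are needed here.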

@tail_recursion I added the limits of integration for clarity. — , Oct 17 '20 at 10:42
Would be useful if you could add some more detail on how to do the integrals. I'm getting limits involving the imaginary error function erfi, where the argument is going to $\pm \infty$ so the limits don't exist. — tail_recursion, Oct 17 '20 at 11:11
@tail_recursion https://math.stackexchange.com/questions/628681/how-to-compute-moments-of-log-normal-distribution — , Oct 17 '20 at 11:13
I'm still not sure what you did there. I'm not clear how you got from the second step to the last step. I was however able to do the first integral using a formula given here; https://en.wikipedia.org/wiki/Gaussian_integral — tail_recursion, Oct 18 '20 at 05:37
I was able to figure out the other integrals using a formula at the bottom of this page; https://mathworld.wolfram.com/GaussianIntegral.html — tail_recursion, Oct 18 '20 at 07:40
