
I saw this equation in a paper published in the proceedings of a very competitive conference (so I assume it is not flawed):

$$Pr(Y|X)= \frac{e^{-E(X,Y)}}{g(X)},$$

where $X,Y$ are two random variables and $g$ is a function of $X$.

I don't know how to calculate $E(X,Y)$; I even thought it did not exist.

Do you know how to calculate $E(X,Y)$ given that we already have the joint probability density function $f(X,Y)$? Also, what is its intuitive meaning?

Thank you!

user3209698
  • 1,742
  • 2
    Also, could you provide more context, please? I don't understand the question. The notation looks incorrect – Chill2Macht Apr 23 '16 at 14:26
  • Thanks William. This is equation 1 in this paper: link – rudky martin Apr 23 '16 at 14:35
  • You are right, though, that (X,Y) is not a (one-dimensional) random variable, although it could be a random vector, but in that case its "expectation" is actually the expectation vector with entries E(X) and E(Y). Or perhaps the paper was referencing E(XY) (i.e. the expectation of the product)? Also, that formula might sort of make sense for probability densities, but the notation Pr(Y|X) usually denotes a conditional distribution, which does not seem to be indicated at all by the RHS – Chill2Macht Apr 23 '16 at 14:38
  • 1
    The (misleading) notation in the paper means that $E$ is some function defined on the product target space of $(X,Y)$. In the discrete case, the equation should be read as $$P(Y=y\mid X=x)=\frac{e^{-E(x,y)}}{Z(x)}$$ for every $(x,y)$, where, for every $x$, $$Z(x)=\sum_ye^{-E(x,y)}.$$ The formula in the continuous case is in the paper. – Did Apr 23 '16 at 14:44
  • That formula has nothing to do with expected values. The text clearly states that $E$ is an energy function and $g$ (well, they call it $Z$) is a partition function...using the terminology of statistical thermodynamics. – lulu Apr 23 '16 at 14:44
  • In the continuous case, the joint PDF is then $$f(x,y)=h(x)e^{-E(x,y)}$$ for some function $h$, and the PDF of $X$ is $$f_X(x)=h(x)Z(x).$$ – Did Apr 23 '16 at 14:48
  • Oh my bad, thanks guys for pointing it out. – rudky martin Apr 23 '16 at 14:51

1 Answer


If you read further in the paper, $E(X,Y)$ denotes the "energy function" of the random variables $X$ and $Y$, not an expectation. It is therefore a scalar-valued function of two inputs, and there is no inconsistency. Also note that in this paper $Pr(y|x)$ denotes the density of the conditional distribution, not the conditional distribution itself; that is, this function must be integrated over the appropriate domain to obtain the values of the probability measures defined by the conditional distribution (which is itself a family of probability measures depending on the value of $X$). $g(X)$ is the partition function. The basic idea is to model this image recognition problem with the Boltzmann framework from equilibrium statistical mechanics; the resulting model is called a Gibbs measure or a log-linear model.

https://en.wikipedia.org/wiki/Conditional_random_field

https://en.wikipedia.org/wiki/Gibbs_measure

https://en.wikipedia.org/wiki/Log-linear_model

https://en.wikipedia.org/wiki/Boltzmann_distribution
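
As for computing $E$ from a known joint density: following the discrete form given in the comments, $P(Y=y\mid X=x)=e^{-E(x,y)}/Z(x)$ with $Z(x)=\sum_y e^{-E(x,y)}$, one valid choice is $E(x,y)=-\log f(x,y)$, and $E$ is only determined up to adding an arbitrary function of $x$. The snippet below is a minimal sketch of that observation, not the paper's method; the function name `conditional_from_energy` and the numbers in the `joint` table are made up for illustration.

```python
import math

def conditional_from_energy(E):
    """Given rows E[x][y], return P(Y=y | X=x) = exp(-E[x][y]) / Z(x)."""
    cond = []
    for row in E:
        weights = [math.exp(-e) for e in row]
        Z = sum(weights)  # partition function Z(x) for this value of x
        cond.append([w / Z for w in weights])
    return cond

# Hypothetical joint pmf f(x, y): 2 values of x (rows), 3 values of y (columns).
joint = [[0.10, 0.20, 0.10],
         [0.30, 0.05, 0.25]]

# One valid energy: E(x, y) = -log f(x, y).
E_from_joint = [[-math.log(p) for p in row] for row in joint]
P_energy = conditional_from_energy(E_from_joint)

# The conditional computed directly as f(x, y) / f_X(x), for comparison.
P_direct = [[p / sum(row) for p in row] for row in joint]

# Adding any function of x to the energy leaves the conditional unchanged,
# because the shift cancels between the numerator and Z(x).
shift = [3.0, -1.0]  # arbitrary illustrative values, one per x
E_shifted = [[e + s for e in row] for row, s in zip(E_from_joint, shift)]
P_shifted = conditional_from_energy(E_shifted)
```

Here `P_energy`, `P_direct`, and `P_shifted` agree up to floating-point error. Intuitively, low energy corresponds to high conditional probability, and $Z(x)$ only normalizes each row so that it sums to 1.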

Chill2Macht
  • 20,920