How to read this notation?

Question

Under the policy $\pi(\phi,a)$, the sequence of loss functions \begin{equation} L_i(\theta_i) = \mathbb{E}_{\phi,a\sim \pi(.)}[(y_i - Q_i)^2], \end{equation} is minimized, in order to train the Q-network.

How do I read the $\mathbb{E}_{\phi,a\sim \pi(.)}$ part? Is it the expectation over $\phi,a$ drawn according to the policy $\pi$?

score 1 · Accepted Answer · answered Jun 01 '17 at 13:50


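After talking with a colleague, we determined that it reads as:

The expected value of $(y_i - Q_i)^2$, where $\phi,a$ are sampled from the policy $\pi$.

In other words, the subscript names the random variables and the distribution they are drawn from. The quantity being averaged is the squared error, not $\phi,a$ themselves.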
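To make that reading concrete, here is a minimal sketch of how such an expectation could be approximated by sampling. Everything in it is an assumption made for illustration: the toy policy, the constant target `toy_target`, and the stand-in `toy_q` are hypothetical and do not come from the paper. The only point is that $(\phi, a)$ pairs are drawn from $\pi$ and the squared error is averaged over those draws.

```python
import random

def estimate_loss(sample_from_policy, target, q_value, n_samples=10_000, seed=0):
    """Monte Carlo estimate of E_{(phi, a) ~ pi}[(y - Q(phi, a))^2]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        phi, a = sample_from_policy(rng)  # draw one (phi, a) pair from the policy
        total += (target(phi, a) - q_value(phi, a)) ** 2
    return total / n_samples  # average squared error over the sampled pairs

# Hypothetical toy setup: two states, two actions.
def toy_policy(rng):
    phi = rng.choice([0, 1])            # state picked uniformly
    a = 0 if rng.random() < 0.9 else 1  # action 0 is chosen 90% of the time
    return phi, a

def toy_target(phi, a):
    return 1.0  # stand-in for y_i

def toy_q(phi, a):
    return 1.0 if a == 0 else 0.0  # stand-in for Q_i: exact for a=0, off by 1 for a=1

loss = estimate_loss(toy_policy, toy_target, toy_q)
print(loss)
```

In this toy setup the squared error is 0 whenever action 0 is drawn and 1 whenever action 1 is drawn, so the exact expectation is 0.1 and the printed estimate should land close to that. Changing the policy's action probabilities changes the loss even though the target and Q stand-ins are unchanged, which is exactly what the $\phi,a\sim \pi(.)$ subscript expresses.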


BlueMoon93

