Under the policy $\pi(\phi,a)$, the Q-network is trained by minimizing the sequence of loss functions \begin{equation} L_i(\theta_i) = \mathbb{E}_{\phi,a\sim \pi(.)}[(y_i - Q_i)^2]. \end{equation}
How should I read the $\mathbb{E}_{\phi,a\sim \pi(.)}$ part? Does it mean the expectation of the bracketed term, taken over pairs $\phi,a$ drawn according to the policy $\pi$?
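
To make my tentative reading concrete, here is a minimal sketch. It assumes $\pi(\phi,a)$ can be treated as a joint probability over a small discrete set of $(\phi,a)$ pairs, which may not match the intended setting. Under that assumption the expectation is a $\pi$-weighted sum of the squared error, and it can be approximated by averaging over pairs sampled from $\pi$. The table values and the names `pi`, `y`, `q`, `expected_loss_exact` and `expected_loss_sampled` are hypothetical, chosen only for illustration.

```python
import random

def expected_loss_exact(pi, y, q):
    """Weighted sum: each (phi, a) pair contributes its squared error times pi(phi, a)."""
    return sum(p * (y[k] - q[k]) ** 2 for k, p in pi.items())

def expected_loss_sampled(pi, y, q, n_samples, seed=0):
    """Monte Carlo estimate: draw (phi, a) pairs from pi and average the squared error."""
    rng = random.Random(seed)
    keys = list(pi)
    weights = [pi[k] for k in keys]
    draws = rng.choices(keys, weights=weights, k=n_samples)
    return sum((y[k] - q[k]) ** 2 for k in draws) / n_samples

# Hypothetical toy problem: three (phi, a) pairs with made-up probabilities.
pi = {("phi0", "left"): 0.5, ("phi0", "right"): 0.25, ("phi1", "left"): 0.25}
# Made-up targets y_i and network outputs Q_i for each pair.
y = {("phi0", "left"): 1.0, ("phi0", "right"): 0.0, ("phi1", "left"): 2.0}
q = {("phi0", "left"): 0.5, ("phi0", "right"): 0.5, ("phi1", "left"): 1.0}

exact = expected_loss_exact(pi, y, q)
approx = expected_loss_sampled(pi, y, q, n_samples=100_000)
print(exact, approx)
```

If this reading is right, the subscript $\phi,a\sim\pi(.)$ only says which distribution the pairs are drawn from, and the quantity being averaged is the squared error $(y_i - Q_i)^2$, not $\phi$ or $a$ themselves.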