In this lecture on the method of moments, we have:
Why is the gradient of $\psi^{-1}$ a $d \times d$ matrix?
The $k$-th moment $m_k$ is defined as $\mathbb{E}[X^k]$ and, by the Law of Large Numbers, can be estimated by the sample average, denoted here by $\hat m_k$.
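
To make the estimate concrete, here is a minimal sketch of computing the first $d$ empirical moments, $\hat m_k = \frac{1}{n}\sum_{i=1}^n X_i^k$, from a sample. The function name `empirical_moments` and the Gaussian test distribution are my own assumptions for illustration, not something taken from the lecture.

```python
import random

def empirical_moments(samples, d):
    """Return [m_hat_1, ..., m_hat_d], where m_hat_k is the average of x**k.

    Hypothetical helper for illustration; not from the lecture.
    """
    n = len(samples)
    return [sum(x ** k for x in samples) / n for k in range(1, d + 1)]

# Assumed example: X ~ N(mean=1, std=2), so m_1 = 1 and m_2 = 2**2 + 1**2 = 5.
random.seed(0)
samples = [random.gauss(1.0, 2.0) for _ in range(100_000)]

m_hat = empirical_moments(samples, d=2)
print(m_hat)  # a list of d = 2 numbers, close to [1, 5] for a large sample
```

The result is a vector of length $d$, which is the vector of moments referred to in the question.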
My understanding is that the inverse function $\psi^{-1}$ takes the vector of moments, which has size $d$, so why isn't the gradient of size $d$?

