For a matrix product $Y(X) = U(X)V(X)$, is there a general formula for $\nabla_X Y(X) = \nabla_X \big(U(X)V(X)\big)$? I guess it is something like $\left(\frac{\partial U}{\partial X}\right)^T V(X) + \frac{\partial V}{\partial X} U(X)^T$ (denominator layout). However, when I try to derive it with the chain rule, I get:
$\begin{align*} \frac{\partial Y}{\partial X} &= \frac{\partial U}{\partial X} \frac{\partial Y}{\partial U} + \frac{\partial V}{\partial X} \frac{\partial Y}{\partial V} \\ &= \frac{\partial U}{\partial X} V(X) + \frac{\partial V}{\partial X} U(X)^T \end{align*}$
which obviously doesn't even produce compatible dimensions for the matrix multiplication. What is the proper way to differentiate it (in denominator layout)?
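
For reference, here is a sketch of one standard way to make the dimensions work, via differentials and vectorization. The sizes and notation are assumptions not stated above: $U(X) \in \mathbb{R}^{m \times n}$, $V(X) \in \mathbb{R}^{n \times p}$, $\operatorname{vec}$ stacks columns, $\otimes$ is the Kronecker product, and the denominator-layout derivative $\frac{\partial \operatorname{vec}(Y)}{\partial \operatorname{vec}(X)}$ is taken to be the transpose of the usual Jacobian, so it has one row per entry of $X$ and one column per entry of $Y$.
$\begin{align*} dY &= (dU)\,V + U\,(dV) \\ \operatorname{vec}(dY) &= (V^T \otimes I_m)\operatorname{vec}(dU) + (I_p \otimes U)\operatorname{vec}(dV) \\ \frac{\partial \operatorname{vec}(Y)}{\partial \operatorname{vec}(X)} &= \frac{\partial \operatorname{vec}(U)}{\partial \operatorname{vec}(X)}\,(V \otimes I_m) + \frac{\partial \operatorname{vec}(V)}{\partial \operatorname{vec}(X)}\,(I_p \otimes U^T) \end{align*}$
Under these assumptions, if $X$ has $k$ entries, the first term is $(k \times mn)(mn \times mp)$ and the second is $(k \times np)(np \times mp)$, so both are $k \times mp$. The Kronecker factors take the place of the bare $V(X)$ and $U(X)^T$ in my attempt.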