$
\def\H{{\cal H}}
\def\e{\,e}\def\o{{\tt1}}\def\p{\partial}
\def\LR#1{\left(#1\right)}
\def\D#1{\operatorname{Diag}\LR{#1}}
\def\d#1{\operatorname{diag}\LR{#1}}
\def\trace#1{\operatorname{Tr}\LR{#1}}
\def\grad#1#2{\frac{\p #1}{\p #2}}
\def\gradLR#1#2{\LR{\frac{\p #1}{\p #2}}}
$
Since a vector-by-matrix gradient is a third-order tensor, the result does not fit neatly into standard matrix notation. In this case, a component-wise gradient is the simplest approach.
But first we need to know the gradient of a matrix
with respect to its own components, i.e.
$$\grad{A}{A_{ij}} = E_{ij} = \e_i\e_j^T$$
where $\{E_{ij}\}$ are the single-entry matrices, whose elements are all zero except for the $(i,j)$ element, which equals $\o$. These matrices form the standard matrix basis, just as the vectors $\{e_k\}$, whose elements are all zero except for a $\o$ in the $k^{th}$ component, form the standard vector basis.
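As a small worked illustration (the $2\times 3$ size is an assumption chosen only for this example, so that $e_1\in{\mathbb R}^2$ and $e_2\in{\mathbb R}^3$), one single-entry matrix and the resulting expansion of $A$ in the matrix basis look like
$$
E_{12} \;=\; e_1e_2^T
\;=\; \begin{pmatrix}1\\0\end{pmatrix}\begin{pmatrix}0&1&0\end{pmatrix}
\;=\; \begin{pmatrix}0&1&0\\0&0&0\end{pmatrix},
\qquad
A \;=\; \sum_{i=1}^{2}\sum_{j=1}^{3} A_{ij}\,E_{ij}
$$
Differentiating the expansion with respect to a single $A_{ij}$ leaves only the matching basis matrix, which is the identity stated above.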
Write the tensor-valued gradient either as a set of vector-valued gradients (one for each component of $A$, with $x_{ij}$ denoting the $(i,j)$ component of $X$)
$$\eqalign{
y &= \LR{X\odot A}\o \\
\grad{y}{A_{ij}} &= \LR{X\odot E_{ij}}\o
= x_{ij} {E_{ij}\o}
= x_{ij} \e_i\e_j^T\o
= x_{ij}\,\e_i \\
}$$
or as a set of matrix-valued gradients (one for each component of $y$)
$$\eqalign{
y &= \LR{X\odot A}\o \\
y_k &= \e_k^T\LR{X\odot A}\o \\
&= \LR{\e_k\o^T}:\LR{X\odot A} \\
&= \LR{X\odot\e_k\o^T}:A \\
&= \LR{E_{kk}X}:A \\
\grad{y_k}{A} &= E_{kk}X \\
}$$
where a colon denotes the Frobenius product, which is a concise
notation for the trace
$$\eqalign{
A:B &= \sum_{i=1}^m\sum_{j=1}^n A_{ij}B_{ij} \;=\; \trace{A^TB} \\
A:A &= \|A\|^2_F \\
}$$
and commutes with the Hadamard product
$$\eqalign{
A:\LR{B\odot C} &= \LR{A\odot B}:C \\
}$$
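The two component-wise formulas above can be sanity-checked numerically. The following is a minimal sketch in plain Python (nested lists, no external libraries); the helper names `y_of`, `dy_dAij`, `dyk_dA` and the integer test matrices are assumptions made for this illustration, and indices are 0-based whereas the text is 1-based.

```python
# Check of the two component-wise gradients of y = (X Hadamard A) 1.
# All names and test data below are illustrative assumptions.

def y_of(X, A):
    """Row sums of the Hadamard product: y_k = sum_j X[k][j] * A[k][j]."""
    return [sum(x * a for x, a in zip(xr, ar)) for xr, ar in zip(X, A)]

def dy_dAij(X, i, j):
    """Vector-valued gradient dy/dA_ij = x_ij * e_i."""
    return [X[i][j] if k == i else 0 for k in range(len(X))]

def dyk_dA(X, k):
    """Matrix-valued gradient dy_k/dA = E_kk X (row k of X, zeros elsewhere)."""
    return [list(row) if r == k else [0] * len(row) for r, row in enumerate(X)]

X = [[1, 2, 3], [4, 5, 6]]   # arbitrary integer test data
A = [[7, 8, 9], [1, 0, 2]]
m, n = len(X), len(X[0])

# y is linear in A, so bumping A_ij by 1 changes y by exactly dy/dA_ij.
i, j = 1, 2
A_bumped = [row[:] for row in A]
A_bumped[i][j] += 1
delta = [b - a for a, b in zip(y_of(X, A), y_of(X, A_bumped))]
print(delta == dy_dAij(X, i, j))

# Both forms store the same numbers: (dy/dA_rc)_k == (dy_k/dA)_rc.
consistent = all(
    dy_dAij(X, r, c)[k] == dyk_dA(X, k)[r][c]
    for k in range(m) for r in range(m) for c in range(n)
)
print(consistent)
```

Integer data is used so that the comparison is exact; with floating-point inputs a tolerance-based comparison would be the safer choice.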
If you need the full tensor-valued gradient, you can construct it by taking dyadic products $(\star)$ with the basis elements and summing over the components
$$\eqalign{
\grad{y}{A}
&= \sum_{k}\e_k\star\gradLR{y_k}{A} \\
&= \sum_{i}\sum_{j}\gradLR{y}{A_{ij}}\star E_{ij} \\
}$$
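To see that the two sums build the same third-order tensor, here is a minimal plain-Python sketch. It assumes the storage convention `G[k][i][j]` for the derivative of $y_k$ with respect to $A_{ij}$; the helpers `outer`, `add`, `unit`, `single_entry` and the test matrix are hypothetical names and data introduced only for this example (0-based indices).

```python
# Build the full gradient tensor from both sums and compare them.
# Convention (an assumption): G[k][i][j] = d y_k / d A_ij.

def outer(u, M):
    """Dyadic product of vector u and matrix M: T[k][i][j] = u[k] * M[i][j]."""
    return [[[uk * mij for mij in row] for row in M] for uk in u]

def add(T, S):
    """Entry-wise sum of two third-order tensors stored as nested lists."""
    return [[[a + b for a, b in zip(tr, sr)] for tr, sr in zip(tm, sm)]
            for tm, sm in zip(T, S)]

def unit(k, m):
    """Standard basis vector e_k of length m."""
    return [1 if r == k else 0 for r in range(m)]

def single_entry(i, j, m, n):
    """Single-entry matrix E_ij of shape m x n."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(m)]

X = [[1, 2, 3], [4, 5, 6]]   # arbitrary test data
m, n = len(X), len(X[0])
zero = [[[0] * n for _ in range(m)] for _ in range(m)]

# First sum: sum_k e_k (star) (dy_k/dA), with dy_k/dA = E_kk X.
G1 = zero
for k in range(m):
    EkkX = [X[r][:] if r == k else [0] * n for r in range(m)]
    G1 = add(G1, outer(unit(k, m), EkkX))

# Second sum: sum_ij (dy/dA_ij) (star) E_ij, with dy/dA_ij = x_ij e_i.
G2 = zero
for i in range(m):
    for j in range(n):
        dy = [X[i][j] * e for e in unit(i, m)]
        G2 = add(G2, outer(dy, single_entry(i, j, m, n)))

print(G1 == G2)
```

The slice `G1[k]` is the matrix-valued gradient of $y_k$, so each slice contains row $k$ of $X$ and zeros elsewhere.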
One last way to write this is to define a tensor
$$\H=\sum_k\e_k\star E_{kk}=\sum_k\e_k\star\e_k\star\e_k$$
with components
$$\H_{ijk} = \begin{cases}
\o\quad{\rm if}\;\;i=j=k \\
0\quad{\rm otherwise} \\
\end{cases}
$$
which extends the Kronecker delta symbol $(\delta_{ij})$
to a third-order tensor.
This gives us several equivalent ways to write the function
$$\eqalign{
y
\;=\; \LR{X\odot A}\o
\;=\; \d{AX^T}
\;=\; \H:\LR{AX^T}
\;=\; \LR{\H X}:A
}$$
Given the last form, the gradient calculation is trivial
$$\eqalign{
\grad{y}{A} &= \H X \\
}$$
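The equivalent forms of $y$ and the final gradient can also be checked numerically. This is a minimal plain-Python sketch under two stated assumptions: the product ${\cal H}X$ contracts the last index of ${\cal H}$ with the first index of $X$, and the colon contracts the last two indices of a third-order tensor with both indices of a matrix. Variable names and test data are illustrative only (0-based indices).

```python
# Check y = (X Hadamard A) 1 = diag(A X^T) = H : (A X^T) = (H X) : A.
# Contraction conventions and all names here are assumptions for illustration.

X = [[1, 2, 3], [4, 5, 6]]   # arbitrary integer test data
A = [[7, 8, 9], [1, 0, 2]]
m, n = len(X), len(X[0])

# H[i][j][k] = 1 if i == j == k else 0 (third-order analogue of the Kronecker delta).
H = [[[1 if i == j == k else 0 for k in range(m)] for j in range(m)]
     for i in range(m)]

# (H X)[i][j][k] = sum_l H[i][j][l] * X[l][k]
HX = [[[sum(H[i][j][l] * X[l][k] for l in range(m)) for k in range(n)]
       for j in range(m)] for i in range(m)]

# (A X^T)[i][j] = sum_l A[i][l] * X[j][l]
AXt = [[sum(A[i][l] * X[j][l] for l in range(n)) for j in range(m)]
       for i in range(m)]

y_rowsum = [sum(X[i][j] * A[i][j] for j in range(n)) for i in range(m)]
y_diag = [AXt[i][i] for i in range(m)]
y_H = [sum(H[i][j][k] * AXt[j][k] for j in range(m) for k in range(m))
       for i in range(m)]
y_HX = [sum(HX[i][j][k] * A[j][k] for j in range(m) for k in range(n))
        for i in range(m)]

print(y_rowsum == y_diag == y_H == y_HX)
```

Because $y$ is linear in $A$, the tensor `HX` is itself the gradient: slice `HX[k]` holds row $k$ of $X$ and zeros elsewhere, matching $E_{kk}X$.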