In index notation (with an implied sum over the repeated index $k$), your question can be dealt with in three lines:
$$\eqalign{
y_i &= w_kX_{ki} + b_i \cr
dy_i &= dw_kX_{ki} \cr
\frac{\partial y_i}{\partial w_p} &= \delta_{pk}\,X_{ki} = X_{pi} = X^T_{ip} \cr
}$$
with no need to philosophize about how one should interpret $\,y$-vs-$y^T\,$ or $\,w$-vs-$w^T$: each quantity carries a single index, so talking about its transpose is meaningless. The quantity $X$, on the other hand, is a second-order tensor, i.e. a matrix. It carries two indices, so it is meaningful to talk about its transpose.
If you want to work in matrix notation without getting confused, then you must meticulously write equations in which all of the vectors are column vectors.
Then the lines above can be translated directly and unambiguously as
$$\eqalign{
y &= X^Tw + b \cr
dy &= X^T\,dw \cr
\frac{\partial y}{\partial w} &= X^T \cr
}$$
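
If you want to convince yourself numerically, here is a minimal sketch (the sizes `n`, `m`, the random data, and the helper `f` are my own assumptions, not part of your question). It builds the Jacobian $\partial y_i/\partial w_p$ column by column with central differences and compares it to $X^T$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 3                      # assumed sizes: w has n entries, y has m
X = rng.standard_normal((n, m))  # X_{ki}
w = rng.standard_normal(n)       # w_k
b = rng.standard_normal(m)       # b_i

def f(w):
    # y_i = w_k X_{ki} + b_i, i.e. y = X^T w + b with column vectors
    return np.einsum("k,ki->i", w, X) + b

# Central-difference Jacobian: J[i, p] approximates d y_i / d w_p
eps = 1e-6
J = np.empty((m, n))
for p in range(n):
    dw = np.zeros(n)
    dw[p] = eps
    J[:, p] = (f(w + dw) - f(w - dw)) / (2 * eps)

# The Jacobian should match X^T, which has shape (m, n)
print(np.allclose(J, X.T, atol=1e-6))
```

The `einsum` string `"k,ki->i"` is a literal transcription of the index expression $w_kX_{ki}$, which is why no transposes appear in the code either.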
Also, your assertion that
$$\frac{\partial y}{\partial X} = w^T$$
is incorrect.
A vector-by-matrix derivative generates a third-order tensor, which carries three indices. There's no way to write this in standard matrix notation, but in index notation it is straightforward to calculate
$$\eqalign{
y_i &= w_kX_{ki} + b_i \cr
dy_i &= w_k\,dX_{ki} \cr
\frac{\partial y_i}{\partial X_{pq}}
&= w_k \delta_{kp} \delta_{iq} = w_p \delta_{iq} \cr
}$$
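
The same kind of numerical check works for this third-order result. The sketch below is again under assumed sizes and random data, with a hypothetical helper `y_of`. It stores $\partial y_i/\partial X_{pq}$ in an array with three axes and compares the formula $w_p\,\delta_{iq}$ against central differences.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 3                      # assumed sizes: X is n-by-m
X = rng.standard_normal((n, m))  # X_{ki}
w = rng.standard_normal(n)       # w_k
b = rng.standard_normal(m)       # b_i

def y_of(X):
    # y = X^T w + b, i.e. y_i = w_k X_{ki} + b_i
    return X.T @ w + b

# Analytic result: T[i, p, q] = w_p * delta_{iq}
T = np.einsum("p,iq->ipq", w, np.eye(m))

# Central-difference estimate of d y_i / d X_{pq}, one entry of X at a time
eps = 1e-6
T_fd = np.empty((m, n, m))
for p in range(n):
    for q in range(m):
        dX = np.zeros((n, m))
        dX[p, q] = eps
        T_fd[:, p, q] = (y_of(X + dX) - y_of(X - dX)) / (2 * eps)

print(T.shape)                           # three axes: (i, p, q)
print(np.allclose(T, T_fd, atol=1e-6))
```

Note that the result has shape `(m, n, m)`, so it cannot be equal to $w^T$, which has only `n` entries.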