Simplifying this least squares cost function

Question

I am beginning to study normal equations and am looking at the cost functions.

I want to get from this equation, where $w$ is the weight vector and $x_i$ is the feature vector of the $i$-th sample,

$g(w) = \frac{1}{N} \sum_{i=1}^N (y_i-w^Tx_i)^2$

To this

$g(w) = \frac{1}{N}(y-Xw)^T (y-Xw)$

I can see that multiplying out the brackets is the first step, but I can't figure out what comes next.

$ (y_i-w^Tx_i)(y_i-w^Tx_i)$

What rule am I missing to reach the second equation, and why is the $w$ term not transposed in the second bracket?

score 0 · Accepted Answer · answered Oct 09 '22 at 11:40


Let z be an arbitrary vector $z \in \mathbb{R}^n$. Notice that $\sum_{i=1}^n z_i^2=z^T z$.

Now what you are looking for follows from choosing $z= y-Xw$.
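A quick numerical check may make this concrete. The sketch below assumes the usual convention that row $i$ of $X$ is $x_i^T$ (the question does not state this explicitly), so the $i$-th entry of $y - Xw$ is $y_i - x_i^T w = y_i - w^T x_i$; the two products are the same scalar, which is why no transpose on $w$ is needed in the matrix form. The sizes `N`, `d` and the random data are made up for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 5, 3                      # hypothetical sample count and feature count
X = rng.normal(size=(N, d))      # assumption: row i of X is x_i^T
y = rng.normal(size=N)
w = rng.normal(size=d)

# Sum form: (1/N) * sum_i (y_i - w^T x_i)^2
g_sum = sum((y[i] - w @ X[i]) ** 2 for i in range(N)) / N

# Matrix form: (1/N) * z^T z with z = y - Xw
z = y - X @ w
g_mat = (z @ z) / N

print(np.isclose(g_sum, g_mat))

# With an explicit column vector, the order of the transpose matters:
z_col = z.reshape(-1, 1)         # shape (N, 1)
inner = z_col.T @ z_col          # shape (1, 1): the sum of squares
outer = z_col @ z_col.T          # shape (N, N): an outer product, not a scalar
print(inner.shape, outer.shape)
```

For a column vector $z$, transposing the first factor gives the scalar $z^T z$ that the cost function needs, while transposing the second gives the $N \times N$ matrix $z z^T$, which is a different object.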


Schneidl

I coded this in Colab and can see that it is impossible to simply multiply the matrix by itself because of the row/column mismatch; it needs to be transposed. However, is it always the first matrix that is transposed? – Tam Oct 12 '22 at 06:59
