I came across this paragraph (confusing transpose signs) in a matrix calculus paper. I'm confused by the double transpose notation of the vectors, in particular $w^T$. I thought $\hat{w}$ should be $[w, b]^T$ instead of $[w^T, b]^T$, and $\hat{x}$ should be $[x, 1]$ instead of $[x^T, 1]$. Could you please confirm whether the highlighted transpose signs in the attached screenshot are correct? Thanks

Nemo

2 Answers

Vectors are usually written as columns, i.e. as $n\times1$ matrices. Take for example the $3\times1$ matrix (or column vector) $w = \left(\begin{smallmatrix}x_1\\x_2\\x_3\end{smallmatrix}\right)$. Then $w^T$ is the $1\times3$ matrix (or row vector) $(x_1, x_2, x_3)$, and if we define $\hat{w}= (w^T, b)^T$ then that is equal to $(x_1,x_2,x_3,b)^T=\left(\begin{smallmatrix}x_1\\x_2\\x_3\\b\end{smallmatrix}\right)$.
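
If it helps to see the stacking numerically, here is a minimal sketch in plain Python. The values of `w`, `x` and `b` and the helper `dot` are made up for illustration and do not come from the paper; the only thing being checked is that appending `b` to `w` and `1` to `x` turns $w \cdot x + b$ into a single dot product.

```python
# Hypothetical example values; none of these numbers come from the paper.
w = [2.0, -1.0, 0.5]   # weight vector
x = [1.0, 3.0, 4.0]    # input vector
b = 0.25               # bias


def dot(u, v):
    """Dot product of two equal-length vectors stored as flat lists."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(ui * vi for ui, vi in zip(u, v))


# Augmented vectors: w_hat = (w^T, b)^T and x_hat = (x^T, 1)^T.
# A flat list has no row/column orientation, so only the stacking is modelled.
w_hat = w + [b]
x_hat = x + [1.0]

lhs = dot(w_hat, x_hat)   # w_hat . x_hat
rhs = dot(w, x) + b       # w . x + b
print(lhs, rhs)
```

Both printed numbers should agree, because the trailing `1` in `x_hat` multiplies the `b` at the end of `w_hat`.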

Tom
  • Hi Tom. Thanks for your explanation. Just to confirm, did you mean $\hat{w}= (w^T b)^T$ instead of $\hat{w}= (w^T b)$? – Nemo Apr 02 '19 at 23:52
  • Yes, I edited my answer, and I think it correctly reflects the notation in your screenshot now. – Tom Apr 02 '19 at 23:56
  • If $\hat{w} = (x_1,x_2,x_3,b)^T=\left(\begin{smallmatrix}x_1\\x_2\\x_3\\b\end{smallmatrix}\right)$, then won't $\hat{w} \cdot \hat{x}$ fail to equal $w \cdot x + b$? – Nemo Apr 03 '19 at 00:05
  • I believe the writer made an error here, because the matrix sizes are not compatible in $w\cdot x + b$: an $n\times 1$ matrix cannot be multiplied by an $n\times 1$ matrix. I think he means $w^T \cdot x +b$. But it would be better for your understanding if you just think of it as a dot product of vectors. – Tom Apr 03 '19 at 00:40
  • Thanks for confirming Yesfun's comment (and my original thought) about the writer's typo. And sorry for all the confusion: the operation between the two vectors throughout this question is the dot product, not matrix multiplication. – Nemo Apr 03 '19 at 00:45
  • Taking a dot product is equivalent to matrix multiplication of a row vector with a column vector. – Tom Apr 03 '19 at 04:06

In my opinion, the first is correct, while the second one appears to contain a typo.

\begin{align*} \mathbf{w}^T&=[w_1,...,w_n]\\ \hat{\mathbf{w}}^T&=[\mathbf{w}^T,b]\\ &=[w_1,...,w_n,b], \end{align*} and after applying transpose on the above row vectors, we have $ \hat{\mathbf{w}}=[\mathbf{w}^T,b]^T$, a column vector.

Similarly, \begin{align*} \mathbf{x}^T&=[x_1,...,x_n]\\ \hat{\mathbf{x}}^T&=[\mathbf{x}^T,1]\\ &=[x_1,...,x_n,1], \end{align*} and then we have $\hat{\mathbf{x}}=[\mathbf{x}^T,1]^T$, a column vector.
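
As a quick check, assuming the dot product of two column vectors $\mathbf{u}\cdot\mathbf{v}$ is read as $\mathbf{u}^T\mathbf{v}$ (this is my reading of the paper's notation, so treat it as an assumption), the two augmented column vectors recover the affine function: \begin{align*} \hat{\mathbf{w}}\cdot\hat{\mathbf{x}}&=\hat{\mathbf{w}}^T\hat{\mathbf{x}}\\ &=[\mathbf{w}^T,b]\,[\mathbf{x}^T,1]^T\\ &=w_1x_1+\dots+w_nx_n+b\cdot 1\\ &=\mathbf{w}\cdot\mathbf{x}+b. \end{align*}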

  • The dot product here is a different operator from the matrix product. If $x$, $w$ are two column vectors, I guess $x\cdot w$ is defined to be $(x^T)w$, following the definition of the notation on pages 23-24 of your reference paper. – Yesfun Yeh Apr 03 '19 at 00:19
  • Sorry for misunderstanding your point. I think both $\hat{x}$ and $\hat{w}$ should denote column vectors, and then $\hat{w} \cdot \hat{x}$ would be ${\hat{w}}^T \hat{x}$, which is $w^Tx+b=w \cdot x+b$, which is what we want. – Yesfun Yeh Apr 03 '19 at 00:27
  • Thanks Yesfun. I think I was confusing you. I think you are correct in saying that both $\hat{w}$ and $\hat{x}$ should be column vectors. Then their dot product results in $w \cdot x + b$. Consequently, you were right that the second one contains a typo; that is, it should be $\hat{x} = (x^T, 1)^T$. – Nemo Apr 03 '19 at 00:39
  • Exactly! I am glad that we reached a consensus haha. – Yesfun Yeh Apr 03 '19 at 00:41
  • I must say I'm so very happy to have the community helping out in this puzzle as I was sleepless for 2 nights trying to twist my understanding to conform to the text! – Nemo Apr 03 '19 at 00:48