Let $A \in \mathbb{R}^{m \times n}$ be a matrix of rank $n$. I need to show that $\| A(A^TA)^{-1}A^T\|_2 = 1$. Can anyone give me a hint about the direction I should take to solve this problem? Should I use a decomposition of the matrix $A$ to show the result?
-
You can use either QR or SVD for decomposing your matrix. – J. M. ain't a mathematician Sep 06 '11 at 05:37
-
I suppose I should have asked: are you familiar with either of those decompositions? – J. M. ain't a mathematician Sep 06 '11 at 05:46
-
@J.M. I learnt about the eigenvalue decomposition, but neither the SVD nor the QR decomposition – Learner Sep 06 '11 at 05:55
-
How about QR? – J. M. ain't a mathematician Sep 06 '11 at 05:56
-
@J.M. Just edited my previous comment – Learner Sep 06 '11 at 05:58
-
What does the subscript 2 imply here in the question title? – Bhargav Sep 06 '11 at 07:10
-
@168: It's the matrix 2-norm. – J. M. ain't a mathematician Sep 06 '11 at 07:20
-
@Learner: SVD is definitely something you'll want to know if you want to do applied statistics or any of various sorts of applied math. – Michael Hardy Sep 06 '11 at 21:00
4 Answers
I will explain it from a geometric perspective.
The matrix $P=A(A^TA)^{-1}A^T$ is an orthogonal projection matrix satisfying $P^2=P$ and $P^T=P$. For any $x\in\mathbb{R}^m$, $Px$ is the orthogonal projection of $x$ onto the column space of $A$.
- Consider an orthogonal basis $\{x_i\}_{i=1}^n$ of $\mathrm{Range}(A)$. Then $Px_i=x_i$.
- Consider an orthogonal basis $\{y_i\}_{i=1}^{m-n}$ of the orthogonal complement of $\mathrm{Range}(A)$. Then $Py_i=0$.
Therefore, $P$ has $n$ eigenvalues equal to 1 and $m-n$ eigenvalues equal to 0. Since $P$ is symmetric positive semi-definite, its singular values are equal to its eigenvalues. As a result, $\|P\|_2=\sigma_{\max}(P)=1$.
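The decomposition route suggested in the comments leads to the same conclusion. Here is a sketch, assuming a thin QR factorization $A = QR$ in which $Q\in\mathbb{R}^{m\times n}$ has orthonormal columns ($Q^TQ=I_n$) and $R\in\mathbb{R}^{n\times n}$ is invertible because $A$ has rank $n$. The symbols $Q$ and $R$ are introduced here only for illustration.

$$ A^TA = R^TQ^TQR = R^TR, \qquad P = QR\,(R^TR)^{-1}R^TQ^T = QR\,R^{-1}R^{-T}R^TQ^T = QQ^T. $$

Then $\|Px\|_2 = \|QQ^Tx\|_2 = \|Q^Tx\|_2 \le \|x\|_2$ for every $x$, with equality when $x = Qw$ for some $w$, so again $\|P\|_2 = 1$.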
An argument without geometry goes like this:
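As Shiyu said, $P^2=P$, and hence $\|P\| = \|P^2\|\leq \|P\|^2$. Since $P \neq 0$, dividing by $\|P\|$ gives $1\leq \|P\|$. Moreover, $P^T=P$, and hence $\|Px\|^2 = \langle Px,Px\rangle = \langle P^TPx, x\rangle = \langle Px, x\rangle \leq \|Px\|\|x\|$ by the Cauchy-Schwarz inequality. This gives $\|Px\| \leq \|x\|$ for every $x$, that is, $\|P\| \leq 1$.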
Let $H = A(A^TA)^{-1}A^T$. To see that $x\mapsto Hx$ (for $x\in\mathbb{R}^m$) is the orthogonal projection onto the column space of $A$, it suffices to prove two things:
- If $x$ is in the column space of $A$, then $Hx=x$.
- If $x$ is orthogonal to all columns of $A$, then $Hx=0$.
To prove the second statement, notice that if $x$ is orthogonal to all columns of $A$, then $A^T x = 0$. Therefore $A(A^TA)^{-1}A^Tx = 0$.
To prove the first statement, notice that $x$ is in the column space of $A$ iff $x = Aw$, for some $w$. Therefore $$ Hx = HAw = \Big(A(A^TA)^{-1}A^T\Big) Aw = A(A^TA)^{-1}\Big(A^T A\Big)w = Aw = x. $$
Now let $x$ be any vector in $\mathbb{R}^m$. Decompose $x$ into a component in the column space of $A$ and a component orthogonal to the column space of $A$. The component in the column space of $A$ is $u=Hx$. The component orthogonal to the column space of $A$ is $v=(I-H)x$. What then is $\|Hx\|_2$? It is $\|u\|_2 \le \|u+v\|_2 = \|x\|_2$, where the inequality holds because $u \perp v$ gives $\|u+v\|_2^2 = \|u\|_2^2 + \|v\|_2^2$. Since $\|Hx\|_2 \le \|x\|_2$, we have $\|H\|_2 \le 1$. But since $\|Hu\|_2= \|u\|_2$ for any nonzero $u$ in the column space of $A$, we have $\|H\|_2\ge1$.
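The decomposition $x = u + v$ can be checked exactly on a small case. The following is only a sketch: the $3\times 2$ matrix `A`, the vector `x`, and the helper functions are hypothetical choices made for illustration, and exact rational arithmetic is used so that the equalities are not blurred by rounding.

```python
from fractions import Fraction


def transpose(X):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*X)]


def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    Yt = transpose(Y)
    return [[sum(a * b for a, b in zip(row, col)) for col in Yt] for row in X]


def inverse_2x2(M):
    """Invert a 2x2 matrix with the adjugate formula (assumes det != 0)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def apply(M, x):
    """Matrix-vector product M x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def sq_norm(x):
    """Squared Euclidean norm of a vector."""
    return sum(xi * xi for xi in x)


# Hypothetical example with m = 3, n = 2 and rank 2.
A = [[Fraction(1), Fraction(0)],
     [Fraction(1), Fraction(1)],
     [Fraction(0), Fraction(2)]]
At = transpose(A)
H = matmul(matmul(A, inverse_2x2(matmul(At, A))), At)

x = [Fraction(1), Fraction(2), Fraction(3)]
u = apply(H, x)                          # component in the column space of A
v = [xi - ui for xi, ui in zip(x, u)]    # component (I - H) x

print(matmul(H, H) == H, transpose(H) == H)    # H is idempotent and symmetric
print(apply(At, v) == [0, 0])                  # v is orthogonal to the columns of A
print(sq_norm(u) + sq_norm(v) == sq_norm(x))   # Pythagoras, hence ||Hx|| <= ||x||
print(apply(H, u) == u)                        # H fixes u, hence ||H|| >= 1
```

The last two checks mirror the two inequalities in the argument: the first bounds $\|Hx\|_2$ by $\|x\|_2$, and the second exhibits a nonzero vector whose norm $H$ preserves.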
-
BTW, it is conventional in statistics to use the letter $H$ for this matrix and to call it the "hat matrix" (hence the "$H$") because of the meaning it has in statistics. – Michael Hardy Sep 07 '11 at 14:07
I'm probably missing something, but since $(AB)^{-1}=B^{-1}A^{-1}$, why not just
$$A(A^TA)^{-1}A^T = A(A^{-1}(A^T)^{-1})A^T = (AA^{-1})((A^T)^{-1}A^T) = II = I$$ ?
-
"Pure" mathematicians often tacitly assume matrices are square, it seems. – Michael Hardy Sep 06 '11 at 13:53