
If $X$ is the design matrix, show that:

$$SS_\text{reg}=y^T(X(X^T X)^{-1}X^T-X_2(X_2^T X_2)^{-1}X_2^T)y$$

where $X_2$ is an $n \times 1$ matrix with all entries equal to one.

So far, what I have is $SS_\text{reg}=\sum_{i=1}^n (y_i-\bar{y})^2=\sum_{i=1}^n\hat{y}_i^2-\frac{\sum_{i=1}^n(y_i)^2}{n}$.

I have found that $\sum_{i=1}^n\hat{y}_i^2=y^T X(X^T X)^{-1}X^T y$. However, I am stuck on finding what $\frac{\sum_{i=1}^n(y_i)^2}{n}$ is. Any clue how I might approach it?

  • A mathematician who is not familiar with the mathematics of linear regression will probably be able to follow this question after learning two things: (1) the matrix $X$ has more rows than columns and its columns are linearly independent, and similarly for $X_2$; and (2) the column space of $X_2$ is contained in the column space of $X.$ The additional information about $X_2$ given in the question is not essential to the answer. $\qquad$ – Michael Hardy Sep 14 '16 at 18:18
  • I fixed a typo in the question, where it said $\sum_{i=1}^n\hat{y}_i^2=y^T X(X^T X)^{-1}X^T$ (without the final $y$) but you needed $\sum_{i=1}^n\hat{y}_i^2=y^T X(X^T X)^{-1}X^T y. \qquad$ – Michael Hardy Sep 14 '16 at 18:23
  • The total corrected sum of squares is $$SS_\text{total} = \sum_{i=1}^n (y_i - \bar y)^2.$$ The sum of squares due to regression is $$SS_\text{reg} = \sum_{i=1}^n (\hat y_i -\bar y)^2.$$ The residual sum of squares is $$SS_\text{residual} = \sum_{i=1}^n (y_i -\hat y_i)^2.$$ One has $$SS_\text{total} = SS_\text{reg} + SS_\text{residual}$$ because $$ \sum_{i=1}^n (y_i - \hat y_i)(\hat y_i - \bar y) = 0,$$ i.e. the residuals are uncorrelated with the fitted values. So your expression for $SS_\text{reg}$ is mistaken; that sum is actually the total corrected sum of squares. $\qquad$ – Michael Hardy Sep 14 '16 at 19:09
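
One possible route to the identity, sketched under the definition $SS_\text{reg} = \sum_{i=1}^n (\hat y_i - \bar y)^2$ and the two assumptions that $X$ has linearly independent columns and that the column of ones lies in the column space of $X$. The symbols $H$, $H_2$ and $\mathbf 1$ are notation introduced here for illustration; they do not appear in the original question. Write

$$H = X(X^T X)^{-1}X^T, \qquad H_2 = X_2(X_2^T X_2)^{-1}X_2^T = \frac{1}{n}\mathbf 1 \mathbf 1^T,$$

where the last equality uses $X_2 = \mathbf 1$ and $X_2^T X_2 = n$. Then $\hat y = Hy$ and $H_2 y = \bar y\,\mathbf 1$. Both matrices are symmetric and idempotent, and because $H$ leaves every vector in the column space of $X$ unchanged, $H H_2 = H_2$, hence also $H_2 H = (H H_2)^T = H_2$. Therefore

$$SS_\text{reg} = \|Hy - H_2 y\|^2 = y^T (H - H_2)^T (H - H_2)\, y = y^T\left(H - H H_2 - H_2 H + H_2\right) y = y^T (H - H_2)\, y.$$

The same computation shows that the subtracted term is $y^T H_2 y = \frac{1}{n}\left(\sum_{i=1}^n y_i\right)^2 = n\bar y^2$.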
