
Question

The answer provided for the last part is: "The sum of square of residuals is minimum for points lying on the regression line and so cannot be less than 8.8 for any other line."

Can somebody please explain what this means?

It is almost evident that these points don't lie on the given regression line. So if I were to provide a much more accurate regression line, wouldn't the sum of squares be much smaller?

mathnoob123

2 Answers

1

One way to look at the line produced by a linear regression is that it is the line obtained by minimizing the sum of squared residuals of the points (to visualize, this is the sum of the squared vertical distances from the points to the line).

Thus no line other than the regression line can have a smaller sum of squared residuals.

EDIT

I think you may be misunderstanding what a regression line is. A regression line is not a line to which you assign arbitrary parameters $a$ and $b$. Instead, it is the line whose parameters are calculated so that the sum of squared residuals is the smallest among all lines. Hope this helps.
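As a rough illustration of the point above, here is a minimal Python sketch. The data points are made up for the example (they are not the pairs from the question), and the line is assumed to be written as $y = a + bx$; the function names are hypothetical as well. It computes the least-squares line from the usual formulas and then checks that nudging the intercept or slope only increases the sum of squared residuals.

```python
# Illustrative data only: these points are invented, NOT the pairs from the question.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [14.2, 13.9, 13.1, 12.0, 11.8, 11.1, 10.0, 9.7]


def least_squares_line(xs, ys):
    """Return (a, b) for the line y = a + b*x minimizing the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # intercept
    return a, b


def ssr(a, b, xs, ys):
    """Sum of squared vertical distances from the points to the line y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


a, b = least_squares_line(xs, ys)
best = ssr(a, b, xs, ys)
print("least-squares line: intercept", a, "slope", b, "SSR", best)

# Try a few other lines: each one has a larger sum of squared residuals.
for da, db in [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.1), (0.0, -0.1), (0.3, -0.05)]:
    other = ssr(a + da, b + db, xs, ys)
    print("shift", (da, db), "SSR", other)
    assert other > best
```

Whatever shifts you try, the assertion holds, which is the sense in which no other line can be "more accurate" under this criterion.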

Jay Zha
  • But what if I provide an equation which is basically a much more accurate regression line than the one provided? – mathnoob123 May 21 '17 at 23:42
  • @FaiqRaees : They're saying there isn't any that is more accurate. – Michael Hardy May 21 '17 at 23:44
  • @FaiqRaees define "accurate" - if you mean a smaller sum of squared residuals, then you cannot, because the regression line you get always has the smallest such sum. – Jay Zha May 21 '17 at 23:44
  • @MichaelHardy Is that an assumption I have to make? Because I used the formulas and I got a much more accurate regression line $y=-0.7049180..x+15.18032787$ – mathnoob123 May 21 '17 at 23:46
  • @YujieZha Yes, I am aware that the parameters a, b are not given by us. What I meant was that, using a calculator and its regression function, I was able to calculate a much more accurate regression line that fits the given points than the one provided. – mathnoob123 May 21 '17 at 23:50
  • @FaiqRaees : It appears that your proposed more accurate line is correct for the seven pairs given. But whatever the correct numbers, what the book ought to say is that for the least squares line, the sum of squares of residuals is smaller than for all other lines. However, notice that it says "Seven of the pairs". That means it's not giving you all of them. There are more than seven. – Michael Hardy May 21 '17 at 23:51
  • Okay thank you very much – mathnoob123 May 21 '17 at 23:52
  • @MichaelHardy The eighth pair was supposed to be calculated by us. It's (13,6). Even if we include it, the line is $y=-0.704830053x+15.17710197$. But I got what you're trying to say. – mathnoob123 May 21 '17 at 23:58
1

The sum of square of residuals is minimum for points lying on the regression line and so cannot be less than $8.8$ for any other line.

This is misleadingly stated. It says "for points lying on the regression line". What it ought to say is that for the line with the specified slope and intercept, the sum of squares of residuals is smaller than it is for any other slope or intercept.
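In symbols, the corrected statement can be sketched as follows, assuming the line is written as $y = a + bx$ and the data are $n$ pairs $(x_i, y_i)$ (this notation is an assumption, not taken from the book):

$$S(a,b)=\sum_{i=1}^{n}\bigl(y_i-a-bx_i\bigr)^2,\qquad S(\hat a,\hat b)\le S(a,b)\ \text{ for all } a,\,b,$$

where $(\hat a,\hat b)$ are the least-squares values

$$\hat b=\frac{\sum_{i=1}^{n}(x_i-\bar x)(y_i-\bar y)}{\sum_{i=1}^{n}(x_i-\bar x)^2},\qquad \hat a=\bar y-\hat b\,\bar x .$$

If the book's line is the least-squares line for its full data set, then $8.8$ is the value of $S(\hat a,\hat b)$, and no other choice of slope and intercept can give a smaller sum.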

Note that only seven of the eight pairs are given. You are asked to find the eighth pair.