I am stepping through the derivation of linear regression curve fitting with the ordinary least squares method. Everything looks fine, except that I am puzzled by how multiple sources make the jump from step 1 to step 2 shown below.

Step 1:

$$m=\frac{\sum_i(\overline{y}-y_i)}{\sum_i(\overline{x}-x_i)}$$

Step 2:

$$m=\frac{\sum_i((\overline{y}-y_i)\cdot(\overline{x}-x_i))}{\sum_i(\overline{x}- x_i)^2}$$

Source1 | Source2

When I calculate the slope with step $1$, which in theory should be the same as step $2$, I get a divide-by-$0$ error because $\sum_i(\overline{x}-x_i)$ always equals $0$. So how can these two equations be equal when one yields a divide-by-$0$ error and the other returns the correct slope for a cluster of points?

UPDATE: the derivations shown in the sources are incorrect and misleading! Step 2 is in fact the correct answer, but both sources claim that step 2 follows from step 1, which is incorrect. The erroneous step is shown below, along with its corrected derivation.

$$\sum_i(y_i\cdot x_i-\overline{y}\cdot x_i+m\cdot\overline{x}\cdot x_i-m\cdot x_i^2)=0$$

The incorrect step in the sources was to factor out $x_i$ and divide both sides by $x_i$ to remove it from the equation. This is invalid because $x_i$ is not a constant: it varies with $i$ and cannot be pulled out of the summation.

The correct step is to split the sum into separate sums and solve for $m$:

$$m=\frac{\sum_i(\overline{y}\cdot x_i-y_i\cdot x_i)}{\sum_i(\overline{x}\cdot x_i-x_i^2)}$$
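As a sketch of why this last expression agrees with step 2 (my own expansion, not taken from either source), it is enough to use the facts that $\sum_i(\overline{y}-y_i)=0$ and $\sum_i(\overline{x}-x_i)=0$, and that $\overline{x}$ is a constant that can be pulled out of a sum:

$$\sum_i(\overline{y}-y_i)\cdot(\overline{x}-x_i)=\overline{x}\sum_i(\overline{y}-y_i)-\sum_i(\overline{y}-y_i)\cdot x_i=-\sum_i(\overline{y}\cdot x_i-y_i\cdot x_i)$$

$$\sum_i(\overline{x}-x_i)^2=\overline{x}\sum_i(\overline{x}-x_i)-\sum_i(\overline{x}-x_i)\cdot x_i=-\sum_i(\overline{x}\cdot x_i-x_i^2)$$

Dividing the first line by the second, the two minus signs cancel, which leaves the step 2 formula for $m$.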

Reference.

PydPiper

2 Answers

Your first expression yields $\frac 00$ because the sum of the $\overline y$s is the same as the sum of the $y_i$ by definition of $\overline y$, similarly for $x$.

Your second expression is not equivalent to the first. It looks like you have just multiplied numerator and denominator by $(\overline x -x_i)$, but that is inside the sum and the term that multiplies it also depends on $i$. The denominator is now a sum of squares and is positive unless all the $x_i$ are identical.
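A minimal numeric sketch of the point above, using four made-up points on the line $y = 2x + 1$ (the data and the helper names `step1_terms` and `step2_slope` are assumptions chosen for illustration only):

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def step1_terms(xs, ys):
    """Return the numerator and denominator of the 'step 1' expression."""
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum(y_bar - y for y in ys)
    denominator = sum(x_bar - x for x in xs)
    return numerator, denominator


def step2_slope(xs, ys):
    """Return the slope from the 'step 2' expression."""
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((y_bar - y) * (x_bar - x) for x, y in zip(xs, ys))
    denominator = sum((x_bar - x) ** 2 for x in xs)
    return numerator / denominator


# Made-up data lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

num1, den1 = step1_terms(xs, ys)
print("step 1 numerator, denominator:", num1, den1)  # both sums vanish
print("step 2 slope:", step2_slope(xs, ys))
```

With these points both sums in step 1 come out as zero, so the ratio is undefined, while step 2 recovers the slope of the line the points were drawn from.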

Ross Millikan
  • Thank you for taking the time to respond, Ross! I might be missing something, because the derivation from https://medium.com/analytics-vidhya/ordinary-least-square-ols-method-for-linear-regression-ef8ca10aadfc looks okay to me (a few steps are skipped there, but I was able to write them out and follow it) until I got to my posted steps. Do you think you could take a look at the link and fill in the gaps? – PydPiper Aug 14 '21 at 22:28
  • I didn't go through the derivation. The second expression looks like what I would expect, so I think the first is mistakenly "derived" from the second by canceling the common factor. You could make up some data, four points are enough, and manually calculate each of the sums to see that the first is $\frac 00$ and the second is not. – Ross Millikan Aug 14 '21 at 22:29
  • Right, I made up some points to check whether the "step 1" equation gives me the correct slope, and I got 0/0 like you said. However, I am not following where the derivation from https://medium.com/analytics-vidhya/ordinary-least-square-ols-method-for-linear-regression-ef8ca10aadfc goes wrong by substituting $$b=\overline{y}-m\overline{x}$$ Here is another source that shows the same step: https://towardsdatascience.com/understanding-the-ols-method-for-simple-linear-regression-e0a4e8f692cc – PydPiper Aug 14 '21 at 22:38
  • Also, in your posted answer I am not following why step 1 does not equal step 2. I should be able to do: $$m=\frac{\sum_i(\overline{y}-y_i)}{\sum_i(\overline{x}-x_i)}\cdot\frac{\sum_i(\overline{x}-x_i)}{\sum_i(\overline{x}-x_i)}$$ and then expand the denominator to be just a square. – PydPiper Aug 14 '21 at 22:48
  • Because $\left(\sum_i(\overline x -x_i)\right)^2 \neq \sum_i(\overline x -x_i)^2$. On the left you sum to zero and then square; on the right you square each term before adding them and get something larger than zero. – Ross Millikan Aug 14 '21 at 23:04
  • Ah, perfect! That is the fundamental point I was missing here. I am still unsure how the derivations I linked jump from step 1 to step 2, but I'll keep working on it and fill this question out a bit more. Appreciate the help, Ross! – PydPiper Aug 14 '21 at 23:12
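The distinction drawn in the comments between squaring the sum and summing the squares can be checked with a short sketch; the four $x$ values are made up for illustration and are not from the thread:

```python
# Made-up sample of x values; any values that are not all identical would do.
xs = [1.0, 2.0, 3.0, 4.0]
x_bar = sum(xs) / len(xs)

# Deviations of each point from the mean; by definition of the mean they sum to zero.
deviations = [x_bar - x for x in xs]

square_of_sum = sum(deviations) ** 2               # sum first, then square
sum_of_squares = sum(d ** 2 for d in deviations)   # square each term, then sum

print("square of the sum:", square_of_sum)
print("sum of the squares:", sum_of_squares)
```

The first quantity is zero for any data set, whereas the second is strictly positive unless every $x_i$ equals the mean, which is why only the second is usable as a denominator.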
After an admittedly quick look, I think the derivations you cite are convoluted at best, and faulty at worst. I believe the step they make to get to what you are calling "Step 1" is incorrect. That expression does not follow from the prior step.

The OLS derivation on Wikipedia is sound.

  • Hey Tim, yes, unfortunately that is what I am beginning to figure out. I was just a bit surprised, since I found the same step 1 to step 2 jump in multiple sources, so I still think I am missing how they get the final correct solution of step 2 from step 1. Thanks for providing the wiki link. I have to admit its notation is a bit over my head, so I was trying to find a detailed expansion/derivation (which is why I went looking for the sources I found). I'll keep digging and see if I can figure something out. There must be a method to this madness. – PydPiper Aug 14 '21 at 23:35
  • Hi Tim, you are correct: the sources were misleading and incorrect! I updated the question with the correction. – PydPiper Aug 15 '21 at 00:23