I am trying to step through the derivation of linear regression curve fitting with the ordinary least squares method. Everything looks fine, except that I am puzzled by how multiple sources make the jump from step 1 to step 2 shown below.
Step 1:
$$m=\frac{\sum_i(\overline{y}-y_i)}{\sum_i(\overline{x}-x_i)}$$
Step 2:
$$m=\frac{\sum_i((\overline{y}-y_i)\cdot(\overline{x}-x_i))}{\sum_i(\overline{x}- x_i)^2}$$
When I calculate the slope with step $1$, which in theory should give the same result as step $2$, I get a divide-by-zero error because $\sum_i(\overline{x}-x_i)$ always equals $0$. So how can these two equations be equal when one yields a divide-by-zero error and the other returns the correct slope for a cluster of points?
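To make the problem concrete, here is a minimal Python sketch of both steps. The sample points `xs` and `ys` are made up purely for illustration; any data set should show the same behaviour, up to floating-point rounding in the step 1 denominator.

```python
# Hypothetical sample points, made up for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Step 1 denominator: deviations from the mean cancel, so this is 0
# (or within rounding error of 0 for less convenient data).
step1_den = sum(x_bar - x for x in xs)

# Step 2: sum of products of deviations over sum of squared x deviations.
step2_num = sum((y_bar - y) * (x_bar - x) for x, y in zip(xs, ys))
step2_den = sum((x_bar - x) ** 2 for x in xs)
slope = step2_num / step2_den

print("step 1 denominator:", step1_den)
print("step 2 slope:", slope)
```

Dividing by `step1_den` is what triggers the divide-by-zero error, while the step 2 ratio is well defined as long as the $x_i$ are not all identical.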
UPDATE: The derivations shown in the sources are incorrect and misleading! Step 2 is in fact the correct answer, but both sources claim that step 2 follows from step 1, which is incorrect. The erroneous step is shown below, along with its corrected derivation.
$$\sum_i(y_i*x_i-\overline{y}*x_i+m*\overline{x}*x_i-m*x_i^2)=0$$
The incorrect step in the sources was to factor out $x_i$ and divide both sides by $x_i$ to remove it from the equation. This is invalid because $x_i$ is not a constant and cannot be pulled out of the summation.
The correct step would have been to split the sum and solve for $m$:
$$m=\frac{\sum_i(\overline{y}*x_i-y_i*x_i)}{\sum_i(\overline{x}*x_i-x_i^2)}$$
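As a sanity check, this expression appears to agree with step 2. The sketch below relies only on the fact that deviations from the mean sum to zero, i.e. $\sum_i(\overline{x}-x_i)=0$ and $\sum_i(\overline{y}-y_i)=0$. Expanding the numerator and denominator of step 2:

$$\sum_i(\overline{y}-y_i)\cdot(\overline{x}-x_i)=\overline{x}\sum_i(\overline{y}-y_i)-\sum_i(\overline{y}-y_i)\cdot x_i=-\sum_i(\overline{y}\cdot x_i-y_i\cdot x_i)$$

$$\sum_i(\overline{x}-x_i)^2=\overline{x}\sum_i(\overline{x}-x_i)-\sum_i(\overline{x}-x_i)\cdot x_i=-\sum_i(\overline{x}\cdot x_i-x_i^2)$$

The two minus signs cancel in the ratio, so step 2 reduces to the expression above.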