Basic question on linear regression

Question

I am trying to understand linear regression. The typical model takes the form $$y_{i}=ax_{i} +b + \epsilon_{i}, \ \ \ i=1,\dots,N$$ where the $\epsilon_{i}$ are i.i.d. Gaussian random variables. The objective is to minimize $$\sum_{i=1}^{N} (y_{i} - ax_{i} - b - \epsilon_{i})^{2}.$$

Computing the gradient yields, up to a constant factor of 2: $$\frac{\partial}{\partial a} = -\sum_{i=1}^{N} y_{i} x_{i} + a\sum_{i=1}^{N} x_{i}^{2} +b \sum_{i=1}^{N} x_{i} + \sum_{i=1}^{N} x_{i}\epsilon_{i}$$ $$\frac{\partial}{\partial b} = -\sum_{i=1}^{N} y_{i} + a\sum_{i=1}^{N} x_{i} +bN + \sum_{i=1}^{N} \epsilon_{i}$$ My question concerns the terms involving $\epsilon_{i}$. What arguments allow us to state that these terms are equal to zero?

They are not zero, but they vanish upon taking expectations of both sides. This is appropriate because you are looking for an unbiased estimator for $a,b$ anyway. (Look up the Gauss-Markov theorem, which is really the connection between "linear algebra" least squares and "statistics" least squares.) — Ian, Sep 08 '16 at 11:59
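To spell out the expectation step mentioned in the comment above, here is a short sketch. It assumes the $x_{i}$ are treated as fixed (non-random) and that $E[\epsilon_{i}]=0$; both are assumptions added here, not stated in the question. By linearity of expectation, $$E\left[\sum_{i=1}^{N} x_{i}\epsilon_{i}\right] = \sum_{i=1}^{N} x_{i}E[\epsilon_{i}] = 0, \qquad E\left[\sum_{i=1}^{N} \epsilon_{i}\right] = \sum_{i=1}^{N} E[\epsilon_{i}] = 0.$$ For any single sample the two sums are generally nonzero; only their expectations vanish.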

score 0 · Answer 1 · answered Nov 01 '16 at 15:56


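Usually we assume $\epsilon_i\sim N(0,\sigma^2)$, so $E[\epsilon_i]=0$ and therefore $E\left[\sum_{i=1}^{N}\epsilon_i\right] = NE[\epsilon_i]=0$. The sum itself is not exactly zero, but when the sample size is large enough the average $\frac{1}{N}\sum_{i=1}^{N}\epsilon_i$ is close to $0$ by the law of large numbers.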
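A quick numerical check may help show how these noise sums behave in practice. The sketch below assumes NumPy is available. The design points `x`, the values `a_true`, `b_true`, `sigma`, and the trial counts are arbitrary choices for illustration and do not come from the question.

```python
import numpy as np

# Hypothetical settings chosen only for illustration.
a_true, b_true, sigma = 2.0, -1.0, 1.0
n_points, n_trials = 50, 20000

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, n_points)              # fixed design points

# Each row is one independent sample of the noise vector (eps_1, ..., eps_N).
eps = rng.normal(0.0, sigma, size=(n_trials, n_points))

# The two noise terms from the gradient, computed once per trial.
sum_eps = eps.sum(axis=1)
sum_x_eps = (eps * x).sum(axis=1)

# Ordinary least squares fit for every trial at once:
# np.polyfit accepts a 2-D y with one data set per column.
y = a_true * x + b_true + eps
a_hat, b_hat = np.polyfit(x, y.T, 1)

print("one trial:   sum eps =", sum_eps[0], " sum x*eps =", sum_x_eps[0])
print("trial means: sum eps =", sum_eps.mean(), " sum x*eps =", sum_x_eps.mean())
print("mean a_hat =", a_hat.mean(), " mean b_hat =", b_hat.mean())
```

In a single trial the noise sums are typically far from zero, while their averages over many trials should be close to zero, and the averaged estimates should land near `a_true` and `b_true`. This is consistent with the terms vanishing in expectation rather than being zero for a given sample.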


wenxi

