I want to do a linear least-squares regression on my data. However, I have known experimental errors on the $y$ values of my data points and relatively few points, so I would like to use the measured errors instead of the usual residual-based method. It seems like a simple question, but I can't find an answer online. Does anyone have any idea how to do this?
You can use weighted residuals to increase the "priority" of some points. – AnilB Jun 29 '15 at 21:48
Measurement error in $y$ variables does not bias regression coefficients (i.e. in $x$). So, at least in theory, it should not make a difference. If that result is affected by a small sample size, I can't say for sure. – Greg Jul 01 '15 at 03:29
1 Answer
As Occupy commented, using weighted least squares is the solution.
Let me take something similar from data validation and data reconciliation. In this area, what we want to minimize is $$\Phi=\sum_{i=1}^n \Big(\frac{y_i^*-y_i}{\sigma_i}\Big)^2$$ where $y_i^*$, $y_i$ and $\sigma_i$ are respectively the reconciled value, the measured value and the standard deviation for the $i^{th}$ measurement.
In your case $y_i^*=f(a,b,c,\cdots,x_i)$ which is the model to fit and you can assume that the $\sigma_i$'s are just proportional to the error $\Delta y_i$'s you know for each $y_i$ data point. So, just minimize $$\Phi=\sum_{i=1}^n \Big(\frac{y_i^*-y_i}{\Delta y_i}\Big)^2$$ which is the same as $$\sum_{i=1}^n w_i\Big({y_i^*-y_i}\Big)^2$$ with $w_i=\frac{1}{(\Delta y_i)^2}$.
The procedure is very similar to an ordinary least-squares fit (linear or nonlinear).
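For a straight line, the weighted fit can be sketched in a few lines of NumPy. This is only a minimal illustration, assuming the model $y^* = a + b\,x$ and assuming the $\Delta y_i$ are absolute one-sigma errors; the function name `weighted_line_fit` and the sample data are made up for the example. Under those assumptions the parameter covariance is $(X^T W X)^{-1}$ with $W=\mathrm{diag}(w_i)$, so the parameter errors come from the known $\Delta y_i$ and not from the scatter of the residuals.

```python
import numpy as np

def weighted_line_fit(x, y, dy):
    """Fit y = a + b*x by minimising sum(((a + b*x_i - y_i) / dy_i)**2).

    Hypothetical helper: dy is assumed to hold absolute 1-sigma errors on y.
    Returns (params, errors) with params = [a, b].
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = 1.0 / np.asarray(dy, dtype=float) ** 2   # w_i = 1 / (dy_i)^2

    X = np.column_stack([np.ones_like(x), x])    # design matrix: columns 1, x
    XtWX = X.T @ (w[:, None] * X)                # X^T W X
    XtWy = X.T @ (w * y)                         # X^T W y

    cov = np.linalg.inv(XtWX)                    # parameter covariance, NOT rescaled by residuals
    params = cov @ XtWy                          # solves the weighted normal equations
    errors = np.sqrt(np.diag(cov))               # 1-sigma errors on a and b
    return params, errors

# Made-up data, for illustration only
x = [0.0, 1.0, 2.0, 3.0]
y = [1.1, 2.9, 5.2, 6.8]
dy = [0.1, 0.2, 0.1, 0.3]

(a, b), (da, db) = weighted_line_fit(x, y, dy)
print(f"a = {a:.3f} +/- {da:.3f}")
print(f"b = {b:.3f} +/- {db:.3f}")
```

Note that scaling all the $\Delta y_i$ by a common factor leaves $a$ and $b$ unchanged but scales their errors by the same factor, which is why the $\Delta y_i$ need to be absolute (not just relative) if the parameter errors are to be meaningful. If SciPy is available, `scipy.optimize.curve_fit` with `sigma=dy` and `absolute_sigma=True` should give the same covariance.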
Hello, thanks very much for this information. I should have put in that in particular I'm interested in calculating the errors on the parameters of the fit line. The reason why I'd rather do this than use the residuals to estimate the errors is I have a relatively small number of points on my line, thus I feel there's a good chance I might happen to underestimate my noise. – user2551700 Jun 30 '15 at 14:36