In regression, we assume that $(X,Y)$ are random variables following some distribution. How would the problem change if we did not assume that $(X,Y)$ are random? Why can't we just have $Y=f(X,\epsilon)$, where $X$ is non-random and $\epsilon$ is a random quantity?
1 Answer
Regression has nothing to do with randomness. Regression means fitting a parametrized function or curve to a set of points.
That means choosing a metric that says whether the function or curve given by one set of parameter values fits the points better or worse than the functions or curves given by other parameter values. The result is the set of parameters whose function or curve fits best according to that metric.
Whether these points are "random" or not is not the concern of regression.
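To make this concrete, here is a minimal sketch in plain Python. It assumes a straight line as the parametrized function and the sum of squared errors as the metric; both are my choices for illustration, as are the names `fit_line` and `sum_squared_error` and the four hard-coded points.

```python
def fit_line(xs, ys):
    """Return (intercept, slope) of the line minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form least-squares solution for a straight line.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_error(xs, ys, intercept, slope):
    """The metric: how badly the line intercept + slope * x fits the points."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical, hand-picked points: nothing here comes from a random source.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 2.0, 2.0, 4.0]

intercept, slope = fit_line(xs, ys)
best_sse = sum_squared_error(xs, ys, intercept, slope)

# Any other parameter choice scores no better on the same metric.
other_sse = sum_squared_error(xs, ys, 0.0, 1.0)
```

The fit only ever sees the numbers in `xs` and `ys`; how those numbers were produced plays no role in the computation.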
null
-
I beg to differ. I think we need the data $(X,Y)$ to be random so that we have a probabilistic structure in which to measure how good our statistics are (bias, consistency, efficiency, and so on) – bankrip Sep 12 '15 at 00:29
-
Regressors are often considered to be random if data are observational but sometimes can be considered deterministic (if e.g. data are experimental). However, I think the outcome variable $Y$ is always modelled as random. I've never seen a model with the dependent variable treated as non-random. – MerylStreep Sep 13 '15 at 19:14
-
@bankrip no, the data does not need to be random. As I said, you just plug in some numbers, it doesn't matter if they come from a "random" source or not. The algorithm can be applied regardless. – null Sep 13 '15 at 19:26