Proof that a and b in linear regression are random variables

Question

Does anyone know how to prove that the variables $a$ and $b$ used in linear regression are random variables? My assumption is that they depend on the values of $x$ and $y$, which are themselves random variables that take different values as different observations are made across experiments, and that as a consequence $a$ and $b$ are also random variables. Is there a deeper way of thinking that proves $a$ and $b$ are random?

Maybe if $x$ and $y$ are random variables then $a$ and $b$ are random, but it's very hard to make sense of this question. What exactly do you mean when you write, "$a$ and $b$ are random"? — Gerry Myerson, Apr 07 '13 at 09:27
$a$ and $b$ in the true model are nonrandom. The estimators $\hat a$ and $\hat b$ are random in the sense that conditioning on $x$, the error term $e$ is random. — , Apr 07 '13 at 09:41
A random variable is a measurable function on a suitable $\sigma$-algebra, the probability space. Even constant functions are measurable, though that is probably not what you want to call random ... — Hagen von Eitzen, Apr 07 '13 at 09:41

score 2 · Accepted Answer · 2013-04-07T10:40:04.723

We can write the model in matrix form, where $a$ and $b$ are both contained in $\beta$. The true model underlying OLS is the linear projection model $$ y=a+bx_1+e, E[x_1e]=0, E[e]=0 $$ but now we write it as $$ y=x'\beta+e, E[xe]=0 $$ with the column vectors $x=(1,x_1)'$ and $\beta=(a,b)'$. It can be proved that the coefficient of the true best linear predictor is $\beta=E[xx']^{-1}E[xy]$, and hence $a$ and $b$ are just numbers and nonrandom.
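
To see why this formula yields plain numbers, it may help to expand it in the two-coefficient case. The following is a sketch that assumes $x=(1,x_1)'$ is a column vector, that $\beta=(a,b)'$, and that $Var(x_1)>0$ so the inverse exists:

$$
\begin{align}
E[xx'] &= \begin{pmatrix} 1 & E[x_1] \\ E[x_1] & E[x_1^2] \end{pmatrix}, \qquad
E[xy] = \begin{pmatrix} E[y] \\ E[x_1y] \end{pmatrix}, \\
E[xx']^{-1} &= \frac{1}{Var(x_1)} \begin{pmatrix} E[x_1^2] & -E[x_1] \\ -E[x_1] & 1 \end{pmatrix}, \\
b &= \frac{E[x_1y]-E[x_1]E[y]}{Var(x_1)} = \frac{Cov(x_1,y)}{Var(x_1)}, \\
a &= \frac{E[x_1^2]E[y]-E[x_1]E[x_1y]}{Var(x_1)} = E[y]-bE[x_1].
\end{align}
$$

Every term on the right-hand side is a population moment, so $a$ and $b$ are fixed numbers that do not change from sample to sample.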

Assume now we have $n$ realizations (observations) of $x$ and $y$. We arrange them into a dataset $X,Y$ with one row per observation; the OLS estimator is $$ \hat\beta=(X'X)^{-1}X'Y. $$ If we want to talk about the probabilistic properties of $\hat\beta$, such as unbiasedness, we use the expectation conditional on $X$. Hence the $x$ part of the true model is now treated as nonrandom; however, there is still an $e$ that is random. For example, the conditional expectation of $\hat\beta$ is \begin{align} &E[(X'X)^{-1}X'Y|X]\\ =&E[(X'X)^{-1}X'(X\beta+e)|X]\\ =&(X'X)^{-1}X'X\beta+(X'X)^{-1}X'E[e|X]\\ =&\beta, \end{align} where the last step uses $E[e|X]=0$, an assumption stronger than $E[xe]=0$. Note that $\hat\beta$, i.e. $\hat a$ and $\hat b$, is random because of $e$.
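
A small simulation may make the distinction concrete. The sketch below is only an illustration under assumed values: the true coefficients `a_true` and `b_true`, the sample size, the uniform design for $x_1$ and the standard normal errors are all hypothetical choices, not part of the question. It holds the regressors fixed, redraws only $e$, and recomputes the OLS estimate each time.

```python
import numpy as np


def ols(X, y):
    """OLS estimate (X'X)^{-1} X'y, computed with a linear solve."""
    return np.linalg.solve(X.T @ X, X.T @ y)


rng = np.random.default_rng(0)

# Hypothetical true model: y = a + b * x1 + e (values chosen for illustration).
a_true, b_true = 1.0, 2.0
n, n_samples = 50, 5000

# The regressors are drawn once and then held fixed (conditioning on X).
x1 = rng.uniform(0.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x1])

estimates = np.empty((n_samples, 2))
for s in range(n_samples):
    e = rng.normal(0.0, 1.0, size=n)   # only the error term is redrawn
    y = a_true + b_true * x1 + e
    estimates[s] = ols(X, y)

mean_est = estimates.mean(axis=0)   # average of (a_hat, b_hat) over samples
std_est = estimates.std(axis=0)     # spread of (a_hat, b_hat) over samples

print("mean of (a_hat, b_hat):", mean_est)
print("std  of (a_hat, b_hat):", std_est)
```

In a run of this sketch, the estimates differ from sample to sample (nonzero spread) while their average sits close to the fixed pair $(a,b)$, which is the sense in which $\hat a$ and $\hat b$ are random and $a$ and $b$ are not.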

edited Apr 07 '13 at 10:40

answered Apr 07 '13 at 10:34

This is deep. Do you have a simpler way of putting it across? – iOSAndroidWindowsMobileAppsDev Apr 07 '13 at 10:45
This seems to be based on the definitions, but it too is complicated. http://www.lse.ac.uk/Depts/economics/pdf/ch3new.pdf – iOSAndroidWindowsMobileAppsDev Apr 07 '13 at 14:05
@JqueryNinja The problem with these lecture notes is that they do not distinguish between the true model and the estimated model. That level suffices for running regressions in software, but to understand the finite-sample or asymptotic properties you have to go further. In these notes the estimator is $b=cov(x,y)/var(x)$, which is very misleading: covariance and variance are just numbers, so this implies that $b$ is only a constant. In reality the estimator is a random variable. – Apr 07 '13 at 16:01
