
Here is the question:
A surveyor conducts research on 4148 households, randomly selected. An estimate of $7.9\%$ unemployment was obtained from a survey of $4148$ people.
Construct a $95\%$ confidence interval with this data.
Solution:
An approximate $95\%$ confidence interval for the true unemployment rate $p$ is $$\left(\hat{p} - z_{2.5}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\hat{p} - z_{97.5}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right)\\=\left(0.079 - 1.96\sqrt{\frac{0.079(1-0.079)}{4148}},0.079 + 1.96\sqrt{\frac{0.079(1-0.079)}{4148}}\right)\\=(0.071,0.087)$$

What I'm mainly confused about is: how did they jump immediately to their first line?

I feel as if they assumed that each person's response is Bernoulli distributed, and then that the standardized sample average is normally distributed?
Are these assumptions valid to make?

1 Answer


Note the small typo in the first line of your formula (the upper limit of the interval should be $+$).

It's a standard result that an approximate $100(1-\alpha)\%$ CI for a proportion $p$, obtained by observing $x$ successes in a sequence of $n$ independent Bernoulli trials each with success probability $p$ is $$ \left(\hat{p} - z\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\hat{p} + z\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right) $$ where $\hat{p}=x/n$ is the estimate of $p$ and $z$ is the $(1-\alpha/2)$ quantile of the standard normal distribution.
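A sketch of where this comes from, assuming the $n$ responses $X_1,\dots,X_n$ are i.i.d. Bernoulli($p$) (the symbols $X_i$ and $Z$ are introduced here only for illustration). The estimate $\hat{p}=\frac{1}{n}\sum_{i=1}^{n}X_i$ has mean $p$ and variance $p(1-p)/n$, so by the central limit theorem, for large $n$, $$ \frac{\hat{p}-p}{\sqrt{p(1-p)/n}} \approx Z \sim N(0,1). $$ Replacing the unknown $p$ in the denominator by $\hat{p}$ (reasonable for large $n$, since $\hat{p}$ is consistent for $p$) and using $P(-z \le Z \le z) = 1-\alpha$ gives $$ P\left(-z \le \frac{\hat{p}-p}{\sqrt{\hat{p}(1-\hat{p})/n}} \le z\right) \approx 1-\alpha, $$ and solving the two inequalities for $p$ yields the interval stated above.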

So yes, assumptions have been made about independent Bernoulli trials.
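To check the arithmetic in the question, here is a minimal Python sketch of the interval above. The function name `wald_interval` and its parameters are hypothetical, chosen for illustration; it assumes independent Bernoulli trials and a sample large enough for the normal approximation.

```python
from math import sqrt
from statistics import NormalDist  # standard library, Python 3.8+


def wald_interval(p_hat, n, confidence=0.95):
    """Approximate CI for a proportion (hypothetical helper, not from the post).

    Assumes n independent Bernoulli trials and a large-sample
    normal approximation for the estimate p_hat = x / n.
    """
    alpha = 1.0 - confidence
    # z is the (1 - alpha/2) quantile of the standard normal, about 1.96 for 95%.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Values from the question: 7.9% unemployment among 4148 respondents.
lower, upper = wald_interval(0.079, 4148)
print(f"({lower:.3f}, {upper:.3f})")  # (0.071, 0.087)
```

The same critical value $z$ is used for both limits; only the sign changes, which is the point of the typo noted above.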

PM.
  • 5,249
  • Thanks. Do similar results hold for other distributions? The normal case certainly, but what about Poisson, etc.?
    That question, and the one I'm about to ask, may touch on the derivation of your result.
    If $\hat{p}$ is an estimator of $p$, is your result a consequence of $\frac{\hat{p} - p}{\sqrt{Var(\hat{p})}} \sim N(0,1)$ (since, by the CLT, the LHS is approximately standard normal)?
    – Twenty-six colours Jun 07 '17 at 14:46
  • where $\hat{p}$ is also an "asymptotically normal" estimator, which happens to be the MLE in this case.
    Will this result always work if we pick the MLE of any parameter (since MLEs are asymptotically normal), assuming it is based on i.i.d. random variables?
    Sorry to ask so many questions in short succession.
    – Twenty-six colours Jun 07 '17 at 14:48
  • @Twenty-sixcolours There seem to be multiple new questions here. Also, the above is not my result, but it is standard. For deeper discussion and CIs applicable in other contexts, I'd suggest referring to an appropriate text (maybe Hogg & Craig, Introduction to Mathematical Statistics). – PM. Jun 07 '17 at 16:31