I am taking an introductory stats class and I am a bit confused by an assignment question this week. The problem has to do with finding the minimum sample size for a given confidence level and margin of error. The problem asks:

I want you to determine the minimum sample size required to obtain a confidence interval for an opinion poll with a specified margin of error using the formula you learned this week (Chapter 16).

You will calculate 9 different sample sizes corresponding to a margin of error of ±5%, ±3%, and ±1% at each of the confidence levels 90%, 95% and 99%. For all these calculations use a "worst case scenario" value of p = 0.5 (which translates to σ = 0.5) and remember that percentages must be converted to decimals.

As I understand it, the problem requires finding $n$ at the given confidence level and margin of error. From what I read, we are supposed to set p-hat and q-hat to .5 each, which multiply to .25 for the worst case scenario.

Can you please explain how this value (p = 0.5) translates to sigma = 0.5? I thought that p-hat being .5 would translate to q-hat being .5, since it is the complement of p-hat. I don't understand how sigma plays a role in this example. Thank you for any help!

Anthony

1 Answer

The formula for an asymptotic $100(1-\alpha)\%$ confidence interval for a binomial proportion is $$\hat p \pm z^*_{\alpha/2} \sqrt{\frac{\hat p (1-\hat p)}{n}},$$ where $\hat p$ is the point estimate for the proportion, $n$ is the sample size, and $z^*_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal (e.g., for a $95\%$ confidence interval, $\alpha = 0.05$ and $z^*_{0.025} \approx 1.96$).

The quantity $$ME = z^*_{\alpha/2} \sqrt{\frac{\hat p (1-\hat p)}{n}}$$ is known as the margin of error for the interval. Since this margin depends on $\hat p$, which can vary from $0$ to $1$, the "worst case" scenario, in which the margin is as large as possible for a given fixed sample size $n$ and confidence level $100(1-\alpha)\%$, corresponds to $\hat p = 0.5$, because $\hat p(1-\hat p)$ attains its maximum value of $1/4$ there. Thus, if $$n \ge \left\lceil \frac{1}{4} \left(\frac{z_{\alpha/2}^*}{ME}\right)^2 \right\rceil,$$ we are guaranteed that this sample size will be sufficient to be within the desired margin of error $ME$ for any possible point estimate $\hat p$ you might observe. So all that is left is to evaluate this formula for combinations of $\alpha$ and $ME$; e.g., when $\alpha = 0.05$ and $ME = 2\% = 0.02$, we have $$n \ge \left\lceil \frac{1}{4} \left(\frac{1.96}{0.02}\right)^2 \right\rceil = 2401.$$ Of course, I selected a value that is not one of the nine you must compute.
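
If you want to check hand calculations, the worst-case computation can be sketched in a few lines of Python. This is only an illustration, not part of the assignment's method: the function name `min_sample_size` and its parameters are my own, and it uses the unrounded normal quantile from the standard library's `statistics.NormalDist` rather than a table value such as $1.96$, so in borderline cases it could differ by one from a result based on a rounded $z^*$.

```python
import math
from statistics import NormalDist


def min_sample_size(confidence: float, margin: float, p: float = 0.5) -> int:
    """Smallest n whose normal-approximation margin of error is at most `margin`.

    `min_sample_size` is a hypothetical helper name chosen for this sketch.
    `p = 0.5` is the worst-case proportion, since p * (1 - p) is largest there.
    """
    alpha = 1.0 - confidence
    # Upper alpha/2 quantile of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Solve margin = z * sqrt(p * (1 - p) / n) for n, then round up.
    return math.ceil(z**2 * p * (1.0 - p) / margin**2)


# The example used in this answer: 95% confidence, margin of error 2%.
print(min_sample_size(0.95, 0.02))
```

Calling the same function with each pairing of confidence level and margin of error from the assignment would give the nine required values.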

So what does this $\sigma = 0.5$ mean? Well, the confidence interval for the mean of a normally distributed population, when the population standard deviation is known, is $$\bar x \pm z_{\alpha/2}^* \frac{\sigma}{\sqrt{n}},$$ where $\sigma$ is the population standard deviation. Here, $\bar x$ is the sample mean. So we can see that the two types of interval estimates are analogous, with $\hat p$ playing the role of $\bar x$, and $\sqrt{\hat p(1-\hat p)}$ playing the role of $\sigma$. This is not a coincidence; it is the result of using a normal approximation to the binomial distribution. But whereas $\sigma$ is fixed and known, $\sqrt{\hat p(1-\hat p)}$ is an estimate of the standard deviation, because $\hat p$ is calculated from the data itself. However, again in the "worst case" scenario, we set $\hat p = 0.5$, which then gives $$\sigma = \sqrt{(0.5)(1 - 0.5)} = 0.5.$$ This is where that assertion comes from. The corresponding sample size formula then becomes $$n \ge \left\lceil \frac{\sigma^2 (z^*_{\alpha/2})^2}{ME^2} \right\rceil,$$ which might look more familiar to you.
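
In case the algebra is not obvious, here is a short sketch of how a sample size bound follows from the margin of error of the known-$\sigma$ interval. It only rearranges the expressions already given above and introduces no new assumptions: $$ME = z^*_{\alpha/2} \frac{\sigma}{\sqrt{n}} \quad\Longrightarrow\quad \sqrt{n} = \frac{z^*_{\alpha/2}\, \sigma}{ME} \quad\Longrightarrow\quad n = \left(\frac{z^*_{\alpha/2}\, \sigma}{ME}\right)^2.$$ Substituting the worst-case value $\sigma = 0.5$ gives $$n = \frac{(z^*_{\alpha/2})^2}{4\, ME^2},$$ and rounding up to the next whole number gives the minimum sample size. For instance, with $z^*_{0.025} \approx 1.96$ and $ME = 0.02$, this is $\frac{(1.96)^2}{4 (0.02)^2} = 2401$.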

heropup
  • So what you are saying is that there are 2 formulas to arrive at the same answer for a sample size: one that is used for the sampling distribution of a mean, and one that is used for the sampling distribution of a proportion. I'm afraid I still don't completely understand. There are 2 sample size formulas I can use. Are they both the same in all cases? – Anthony Nov 09 '21 at 23:53
  • @Anthony Your question specifically pertains to a proportion; therefore, you use the first formula I showed you. All you have to do is choose the appropriate $\alpha$ and $ME$, as I also showed you. The second formula applies to a confidence interval for the mean of a normally distributed population with known standard deviation. I only included it because I am explaining why your question said $\sigma = 0.5$. When this choice is made, both formulas become the same. In general, there are various possible sample size formulas, because the formula depends on how the interval is constructed. – heropup Nov 10 '21 at 02:34