I calculated (and verified) the confidence interval for the mean of a population, from this sample:

$n=100$, $x_1 = ...=x _6= 36$, $x_7 = ... = x_{17} = 37$, $x_{18}=...=x_{43}= 38$, $x_{44} = ... = x_{75} = 39$, $x_{76} = ... = x_{89} = 40$, $x_{90} = ... = x_{100} = 41$.

I found that the sample mean is $\bar{x} = 38.7$ and the sample standard deviation is $s_x = 1.3153$. The $95$ percent confidence interval turned out to be $(38.44, 38.96)$, using the usual method when $\sigma^2$ is unknown (through the standard normal variable $(\bar{X} - \mu)/(s_x / \sqrt{n-1})$). I carried out the calculations again and got the same result.
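For readers who want to reproduce these figures, here is a minimal sketch in Python. It assumes the frequencies read off the sample above and the biased (divide-by-$n$) definition of $s_x$; the variable names are illustrative only. It also checks that $s_x/\sqrt{n-1}$ and $\hat\sigma/\sqrt{n}$ give the same standard error.

```python
import math
from statistics import NormalDist

# Value -> frequency, read off the sample description above.
counts = {36: 6, 37: 11, 38: 26, 39: 32, 40: 14, 41: 11}

n = sum(counts.values())
mean = sum(v * c for v, c in counts.items()) / n

# Sum of squared deviations from the sample mean.
ss = sum(c * (v - mean) ** 2 for v, c in counts.items())

s_biased = math.sqrt(ss / n)          # s_x, divides by n
s_unbiased = math.sqrt(ss / (n - 1))  # sigma-hat, divides by n - 1

# Standard error of the mean; both forms are algebraically identical.
se = s_biased / math.sqrt(n - 1)
se_alt = s_unbiased / math.sqrt(n)

# Two-sided 95% interval using the standard normal quantile (about 1.96).
z = NormalDist().inv_cdf(0.975)
lower, upper = mean - z * se, mean + z * se

print(f"n = {n}, mean = {mean:.1f}, s_x = {s_biased:.4f}")
print(f"standard error = {se:.4f} (alternative form: {se_alt:.4f})")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
```

The interval is narrow because the standard error shrinks like $1/\sqrt{n}$: it describes the uncertainty about the mean, which is much smaller than the spread of the individual values.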

I'm confused because this doesn't seem to make sense. $57$ percent of the sample are at least $39$. How does that interval give us that level of confidence?

In general, what is a good rule of thumb to check that the bounds of your interval are reasonable?

George
  • Why did you divide $s_x$ by $\sqrt{n-1}$ and not by $\sqrt n$ ? – callculus42 Aug 22 '15 at 13:03
  • @calculus because it's either $\hat \sigma / \sqrt n$ or $s_x/ \sqrt{n-1}$ where $\hat \sigma ^2$ is the unbiased estimator of the variance given by $\hat \sigma ^2 = \frac{n}{n-1} s_x ^2$ – George Aug 22 '15 at 13:25
  • My thoughts were more or less similar. If $\hat \sigma_x=s_x$ is the unbiased estimator of $\sigma$, then you have to divide $s_x$ only by $\sqrt{n}$. If $s_x$ is the biased estimator, then you are right. In your case $s_x=\sqrt{\frac{1}{n}\cdot \sum_{i=1}^n (x_i-\overline x ) ^2}$, is that right? – callculus42 Aug 22 '15 at 14:31
  • yes that's right @calculus – George Aug 22 '15 at 14:34
  • OK. I just wanted to make sure that there are no misunderstandings. – callculus42 Aug 22 '15 at 14:46

1 Answer

Your calculations are correct. It may not be obvious, but this set of values has a slight negative skew, which is why the confidence interval has slightly lower boundary values than you expected. The mode and the median are both $39$, which is higher than the mean of $38.7$, indicating negative skewness. As a rule, roughly 50% of values lie above the median, but here more than 50% are higher than the mean.
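The skewness claim can be checked numerically. The following is a minimal sketch, assuming the frequencies given in the question and using the moment-based skewness coefficient $m_3 / m_2^{3/2}$, which is one common definition among several. The variable names are illustrative only.

```python
from statistics import mean, median, mode

# Value -> frequency, as given in the question.
counts = {36: 6, 37: 11, 38: 26, 39: 32, 40: 14, 41: 11}

# Expand the frequency table into the full list of 100 observations.
data = [v for v, c in counts.items() for _ in range(c)]
n = len(data)

m = mean(data)
med = median(data)
mo = mode(data)

# Central moments (dividing by n) and the moment-based skewness coefficient.
m2 = sum((x - m) ** 2 for x in data) / n
m3 = sum((x - m) ** 3 for x in data) / n
skewness = m3 / m2 ** 1.5

# Fraction of observations strictly above the sample mean.
share_above_mean = sum(x > m for x in data) / n

print(f"mean = {m}, median = {med}, mode = {mo}")
print(f"skewness = {skewness:.4f}")
print(f"share above mean = {share_above_mean:.2f}")
```

With this sample the skewness coefficient comes out slightly negative, and 57% of the values lie above the mean.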

David Quinn