Precise mathematical translation of the 68–95–99.7 rule? (Not a proof!)

Question

The rule:

In statistics, the 68–95–99.7 rule, also known as the three-sigma rule or empirical rule, states that nearly all values lie within 3 standard deviations of the mean in a normal distribution.

About 68.27% of the values lie within 1 standard deviation of the mean. Similarly, about 95.45% of the values lie within 2 standard deviations of the mean. Nearly all (99.73%) of the values lie within 3 standard deviations of the mean.

So suppose that I have a set of values (measurements) which has the normal distribution property. Let's call it S.

When they say "about 68.27% of the values", what values do they mean? Do they mean that the standard deviation of any 68.27 % of the elements of S is smaller than 1? Do they mean something more? Could someone give me a precise mathematical statement that is equivalent to this "68–95–99.7 rule"?

I've posted this on math.stackexchange because I would like a mathematical answer.

https://en.wikipedia.org/wiki/Empirical_rule – zyx Sep 14 '13 at 02:36

score 11 · Answer 1 · answered Sep 14 '13 at 14:23

The mathematical statement of the "within one standard deviation" rule is that

$$\Pr(\mu-\sigma < X < \mu + \sigma) =\frac{1}{\sqrt{2 \pi} \sigma} \int_{\mu - \sigma}^{\mu + \sigma} \exp \left( - \frac{(x-\mu)^2}{2 \sigma^2} \right) \; dx = \frac{1}{\sqrt{2 \pi}} \int_{-1}^1 \exp \left( - \frac{u^2}{2} \right) \; du \approx 0.682689$$

(In the integral, just make the substitution $u = (x-\mu)/\sigma$.)

Is that what you had in mind? The other statements are similar, just replacing $\sigma$ with $2 \sigma$ or $3 \sigma$.
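
As a quick numerical check of the integral above, here is a minimal Python sketch. It relies on the standard identity $\frac{1}{\sqrt{2\pi}}\int_{-k}^{k} e^{-u^2/2}\,du = \operatorname{erf}(k/\sqrt{2})$; the function name `prob_within` is made up for this illustration and is not part of the answer.

```python
import math


def prob_within(k: float) -> float:
    """P(mu - k*sigma < X < mu + k*sigma) for a normal X.

    After the substitution u = (x - mu) / sigma the result no longer
    depends on mu or sigma, so only k is needed.
    """
    return math.erf(k / math.sqrt(2.0))


if __name__ == "__main__":
    # Print the three probabilities behind the 68-95-99.7 rule.
    for k in (1, 2, 3):
        print(f"within {k} sigma: {prob_within(k):.6f}")
```

Evaluating it at $k = 1, 2, 3$ gives the three percentages quoted in the question.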

score 5 · Answer 2 · answered Sep 14 '13 at 00:59

Precisely, they mean that if you could observe an "infinite" number of values from your normal distribution, 68% would be within 1 standard deviation of the mean, as specified by the parameters of your distribution, 95% would be within 2 standard deviations, and 99.7% within 3 standard deviations.

Of course, you cannot take an infinite number of observations. But the larger the finite number of observations you take, the closer your results will be to the infinite case. If you take only a few observations, the results could be very different from the ideal you ask about. But if you had hundreds of observations, you would be surprisingly close. This is the content of the Law of Large Numbers.
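
To see this behaviour in practice, here is a small simulation sketch in Python. It assumes a standard normal distribution by default and a fixed seed for reproducibility; the helper name `fraction_within` and its parameters are chosen for this illustration only.

```python
import random


def fraction_within(n: int, k: float = 1.0, mu: float = 0.0,
                    sigma: float = 1.0, seed: int = 0) -> float:
    """Draw n normal observations and return the fraction that fall
    strictly within k standard deviations of the mean."""
    rng = random.Random(seed)  # fixed seed (an assumption) so runs repeat
    hits = sum(
        1 for _ in range(n)
        if abs(rng.gauss(mu, sigma) - mu) < k * sigma
    )
    return hits / n


if __name__ == "__main__":
    # Small samples can be far from 68%; large ones settle near it.
    for n in (10, 100, 1_000, 100_000):
        print(n, fraction_within(n))
```

With small `n` the observed fraction can wander well away from 68%, while for large `n` it should settle close to it.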

What percentage of points are in the interval $(0,0.68)$ compared to the interval $(0,1)$? Both are infinite! – Sep 14 '13 at 01:07
What I said: the interval $(0,0.68)$ has $68\%$ of the points in the interval $(0,1)$. There are others, but this is an obvious one. $(0.17,0.83)$ also has $68\%$ of the points of $(0,1)$. In this case, for the uniform distribution on $(0,1)$, length works to define percentages. In general, you need to use a probability measure, of which length on a uniform distribution is a special case. – Sep 14 '13 at 01:12
Ok, thanks for the answers, but what would be the answer to this question: "Do they mean that the standard deviation of any 68.27 % of the elements of S is smaller than 1?" I haven't read probability theory yet, so things like "uniform distributions" are unknown to me. – shooting-squirrel Sep 14 '13 at 01:17
You should read some probability theory, and statistics, as this will help you a lot. The problem is that the question you want answered is ill-posed. Individual observations (what you call elements) don't have standard deviations. Random variables (like normal or uniform distributions) do have standard deviations, because they describe a random process: what values can occur, and how often (with what probability)? It's hard to go much further than this without more background in probability. – Sep 14 '13 at 01:21
Ok, I will check that. – shooting-squirrel Sep 14 '13 at 01:24

score 4 · Accepted Answer · edited Sep 14 '13 at 01:30

If $X$ is a normally distributed random variable, then $$\Pr\left(\left|\frac{X-\mu}{\sigma}\right|\le 1\right)\approx 0.6826.$$ Here $\mu$ is the population mean, and $\sigma$ is the population standard deviation.

Similar facts hold for the other two numbers you mentioned.

If we do repeated independent sampling, that can be represented as a sequence $X_1,X_2,X_3,\dots, X_n$ of independent random variables, each with the same normal distribution as $X$ (and hence the same mean $\mu$ and variance $\sigma^2$). If $n$ is large, then with reasonable probability the proportion of the sample results that lie between $\mu-\sigma$ and $\mu+\sigma$ will be not far from $68\%$. However, even with $n$ around $1000$, we can only be about $95\%$ sure that the experimental proportion will be between $65\%$ and $71\%$.
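
The $65\%$ to $71\%$ range can be reproduced with a rough calculation. The sketch below assumes the usual normal approximation to the binomial distribution of the count of observations within one standard deviation, with $p \approx 0.6827$ and the standard two-sided $95\%$ multiplier $1.96$; the function name `proportion_interval` is hypothetical.

```python
import math


def proportion_interval(n: int, p: float = 0.6827,
                        z: float = 1.96) -> tuple[float, float]:
    """Approximate range that the sample proportion falls in with about
    95% probability (for z = 1.96), using the normal approximation
    to the binomial: p +/- z * sqrt(p * (1 - p) / n)."""
    se = math.sqrt(p * (1.0 - p) / n)  # standard error of the proportion
    return p - z * se, p + z * se


if __name__ == "__main__":
    low, high = proportion_interval(1000)
    print(f"n = 1000: roughly {low:.3f} to {high:.3f}")
```

Because the standard error shrinks like $1/\sqrt{n}$, halving the width of this range takes about four times as many observations.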
