2

Edit with Context: Book says the % of data captured within k standard deviations $= 1 - \dfrac{1}{k^2}$. Dug a bit deeper and found it was derived using Chebyshev's but no direct derivation found$\ldots$

How many multiples of the standard deviation are needed to capture at least $\boldsymbol{75}$% of the data in a distribution with mean $\boldsymbol\mu$?

I derived the formula below and got that $k$ must be less than or equal to $2$. This doesn't make sense to me, as a larger $k$ would capture more and more data, so it should be the other way around.

Work is shown below.

The inequality derivation:

Let $v = |X-\mu|$. Let $y = k\sigma$. Then $$P(v \geq y) \leq \frac{1}{k^2} \iff 1 - P(v < y) \leq \frac{1}{k^2}.$$

So $$k \leq \sqrt\frac{1}{1-P(v<y)}.$$

Rócherz
  • 3,976
Mid
  • 466
  • I did not know that, I thought it was an equivalent definition so no need to reverse inequalities? – Mid Dec 11 '16 at 06:32
  • No problem, I am still a novice at this stuff so I had to double check. – Mid Dec 11 '16 at 06:37

2 Answers

3

Your intuition about what Chebyshev's inequality says is correct. There is just a minor confusion in the algebra.

On the RHS of the inequality on your last line, $y = k\sigma$ contains $k$ as well, so $k$ appears on both sides. You cannot directly interpret or solve it in that fashion (see the edit below for more details).

If you take the probability $P(v<y)$ as given, that is $P(v < k \sigma) = p$ with everything (e.g. the density and $\sigma$) known, then in principle one can directly solve for the value of $k$.

For example, if $X$ is normally distributed and $P(v < k \sigma) = p = 0.800$ is given, the equation is equivalent to $\Phi(k) = \frac{1+p}2 = 0.900$ and the solution is $k \approx 1.28155$.
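As a quick numerical check of the normal case, here is a minimal Python sketch using the standard library's `statistics.NormalDist`. The variable names are only illustrative, and it assumes a standard normal after centering and scaling $X$.

```python
from statistics import NormalDist

p = 0.800  # desired probability P(|X - mu| < k*sigma)

# For a normal X, P(|X - mu| < k*sigma) = 2*Phi(k) - 1,
# so the equation to solve is Phi(k) = (1 + p) / 2.
k = NormalDist().inv_cdf((1 + p) / 2)

print(round(k, 5))  # approximately 1.28155
```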

However, this is often not easy, so instead we bound the probability of interest $P(v < y)$ (which is an increasing function of $k$) from below by the $1 - \frac1{k^2}$ on the RHS (which is also an increasing function of $k$).

Thus, the inequality to solve as per the question statement to capture at least 75% of the data becomes

$$ 1 - \frac1{k^2} \geq \frac{3}4 \qquad \textbf{so as to guarantee} \qquad Pr\bigg\{~ |X- \mu| < k\, \sigma ~\bigg\} \geq 1 - \frac1{k^2} \geq \frac{3}4 $$

and this gives the desired correct direction of the inequality for $k$, namely $k \geq 2$.
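For completeness, the algebra behind that inequality can be spelled out as follows, assuming $k > 0$ so that taking square roots preserves the direction:

$$ 1 - \frac1{k^2} \geq \frac34 \iff \frac1{k^2} \leq \frac14 \iff k^2 \geq 4 \iff k \geq 2. $$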

----------- Below is esp. in response to the comment----------

If Chebyshev's inequality is written this way: $$ Pr\bigg\{~ |X- \mu| < k\, \sigma ~\bigg\} \geq 1 -\frac{1}{k^2} \tag*{Eq.(1)}$$

then from the original question statement to capture at least 75% of the data, the correct inequality to solve is

$$ Pr\bigg\{~ |X- \mu| < k\, \sigma ~\bigg\} \geq \frac{3}4 \qquad \textbf{but NOT} \qquad \frac{3}4 \geq 1 -\frac{1}{k^2} \quad \text{(which gives $k \leq 2$)}$$

The same goes for the complement statement, failing to capture at most 25% of the data, when it is applied directly to $P( v > y ) \leq 1/k^2$.

In conclusion, the counter-intuitive $k \leq \sqrt{1/(1-p)}$ stems from the deceptively inviting direct application of the inequality. I hope this answers your question of "how this happened".
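The correct direction can also be sketched in code. The helper below is hypothetical (the name `chebyshev_min_k` is not from any library) and simply solves $1 - 1/k^2 \geq p$ for the smallest $k$, which is $k = 1/\sqrt{1-p}$.

```python
import math

def chebyshev_min_k(p):
    """Smallest k for which the Chebyshev bound 1 - 1/k**2 is at least p.

    Hypothetical helper for illustration; requires 0 <= p < 1.
    """
    if not 0 <= p < 1:
        raise ValueError("p must satisfy 0 <= p < 1")
    return 1 / math.sqrt(1 - p)

# Any k at or above this value guarantees coverage of at least p.
print(chebyshev_min_k(0.75))  # 2.0
```

Larger target coverage $p$ gives a larger minimum $k$, which matches the intuition in the question.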

  • Thanks for the answer, though I am still confused on the "no longer an inequality aspect". Wouldn't it just be $1 - \frac{1}{k^2}$ less than or equal p? I get your point on since k is used to solve for the probability p in the first place, but I am confused on the algebra rules behind it. Could you enlighten me please? – Mid Dec 11 '16 at 07:39
  • 1
    @Mid Does the revision address your question better now? – Lee David Chung Lin Dec 11 '16 at 09:56
  • Yes. I cannot solve the left and middle part of the inequality because they are tied together by k, but by bounding it from below by 75% then by solving k in the middle equation I am finding the minimum solution to the question I was trying to answer. It is crystal clear now, this is exactly the understanding I needed. Thanks for the followup! – Mid Dec 11 '16 at 10:48
1

It seems like you got some great help from @Lee; maybe I can add some more insight by demonstrating the use of this theorem in a made-up problem.

Given something like:

$\mu =124$

and

$\sigma =7$

One could find the minimum probability that a value falls between 110 and 138. Note that these endpoints are 2 standard deviations away from the mean. For the sake of the problem, let's say that 124 represents the mean number of crimes per week in some neighborhood.

The formula for this theorem looks like this:

$P(\mu-k \sigma <x<k \sigma +\mu)\geq 1-\frac{1}{k^2}$

where $k$ is the number of standard deviations. Since the endpoints 110 and 138 are 2 standard deviations from the mean, we will use $k = 2$.

We can plug in the values we have above:

$P(124-2 \cdot 7 <x<2 \cdot 7 +124)\geq 1-\frac{1}{2^2}$

that is,

$P(110 <x< 138)\geq 0.75$

We would conclude that there is at least a 75% chance that the number of crimes per week is between 110 and 138.
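The arithmetic of this example can be reproduced with a short sketch. The function name `chebyshev_interval` is made up for illustration; it just returns the interval $\mu \pm k\sigma$ together with the bound $1 - 1/k^2$.

```python
def chebyshev_interval(mu, sigma, k):
    """Return (lower, upper, bound) for the interval mu +/- k*sigma.

    bound is the Chebyshev lower bound 1 - 1/k**2 on the probability of
    landing strictly inside the interval. Hypothetical helper.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return mu - k * sigma, mu + k * sigma, 1 - 1 / k**2

lower, upper, bound = chebyshev_interval(mu=124, sigma=7, k=2)
print(lower, upper, bound)  # 110 138 0.75
```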

You can use this theorem for any number of deviations you like, such as 1.5 or 2.5 and so on (the bound is only informative for $k > 1$). It is also appropriate for non-normal data, which is pretty convenient.
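To see the bound hold on non-normal data, here is a small simulation sketch. The exponential distribution, the seed, and the sample size are arbitrary choices made for illustration and are not part of the problem above.

```python
import random
from statistics import fmean, pstdev

random.seed(0)  # fixed seed so the run is repeatable

# Exponential samples: a right-skewed, clearly non-normal distribution.
data = [random.expovariate(1.0) for _ in range(10_000)]

mu = fmean(data)
sigma = pstdev(data)  # population standard deviation of the sample

results = {}
for k in (1.5, 2, 2.5):
    # Fraction of observations strictly within k standard deviations of the mean.
    inside = sum(abs(x - mu) < k * sigma for x in data) / len(data)
    bound = 1 - 1 / k**2
    results[k] = (inside, bound)
    print(f"k={k}: fraction inside = {inside:.3f}, Chebyshev bound = {bound:.3f}")
```

The observed fractions should sit well above the bound, which illustrates that Chebyshev's inequality is a guarantee rather than a tight estimate.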

Brandon
  • 684
  • 1
    Great example. I had in the back of my mind the Normal distribution, but I was surprised to learn that the power of the inequality is due to the generality. – Mid Dec 11 '16 at 17:18