I've got data in a range-and-frequency format, e.g. 0-5: 3, 6-10: 70, etc., up to 31-35. How do I get the most accurate mean for such data?
1 Answer
There is no way to recover the exact mean, since some information was lost in the grouping.
E.g. your grouped data could represent 3 data items with value 0, and 70 items with value 6, and no other items. Then the mean would be $$\frac{3 \times 0 + 70 \times 6}{3 + 70} \approx 5.75.$$
But your grouped data could equally well represent 3 data items with value 5, and 70 items with value 10, and no other items. Then the mean would be $$\frac{3 \times 5 + 70 \times 10}{3 + 70} \approx 9.79.$$
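These two cases are the extremes, so the true mean has to lie somewhere between them. As a rough sketch, the same bounds can be computed in Python. The `groups` list and the `mean_bounds` function are my own names for illustration, and I have only used the two groups quoted in the question, since the others are not given.

```python
# Each group is (lowest value, highest value, frequency).
# Only the two groups quoted in the question are included here;
# the remaining groups up to 31-35 are not given, so they are left out.
groups = [(0, 5, 3), (6, 10, 70)]


def mean_bounds(groups):
    """Return the smallest and largest mean consistent with the grouped data."""
    total = sum(freq for _, _, freq in groups)
    # Smallest possible mean: every item sits at the bottom of its group.
    lowest = sum(lo * freq for lo, _, freq in groups) / total
    # Largest possible mean: every item sits at the top of its group.
    highest = sum(hi * freq for _, hi, freq in groups) / total
    return lowest, highest


low, high = mean_bounds(groups)
print(f"{low:.2f} <= mean <= {high:.2f}")  # 5.75 <= mean <= 9.79
```

Adding the rest of your groups to the list should give the bounds for the full data set.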
So you have to take some view on how the groupings were made. It looks like your data can take whole-number values 0, 1, 2, ... If you assume you know nothing about how the values are spread within a group, so that an item in the 0-5 group is equally likely to be any of the six values 0, 1, 2, 3, 4, 5, then its expected (average) value is $$\frac{1}{6} \times \left( 0 + 1 + 2 + 3 + 4 + 5 \right) = 2.5.$$
Similarly the expected value of the (5 possible values in the) 6-10 group is $$\frac{1}{5} \times \left( 6 + 7 + 8 + 9 + 10 \right) = 8.$$
So with this (uniform) assumption about how the groups were made, and assuming there are no values above 10, the mean would be $$\frac{3 \times 2.5 + 70 \times 8}{3 + 70} \approx 7.77.$$
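For what it's worth, here is a minimal Python sketch of that calculation. It relies on the fact that the average of the consecutive whole numbers from `lo` to `hi` is the midpoint `(lo + hi) / 2`. The names `groups` and `grouped_mean` are just illustrative, and again only the two groups quoted in the question are used.

```python
# Each group is (lowest value, highest value, frequency).
# Assumption: values are whole numbers, uniformly spread within each group,
# and there are no values above 10 (only the two quoted groups are used).
groups = [(0, 5, 3), (6, 10, 70)]


def grouped_mean(groups):
    """Estimate the mean by placing every item at the midpoint of its group."""
    total = sum(freq for _, _, freq in groups)
    # The mean of the whole numbers lo, lo+1, ..., hi is (lo + hi) / 2.
    weighted = sum((lo + hi) / 2 * freq for lo, hi, freq in groups)
    return weighted / total


estimate = grouped_mean(groups)
print(f"{estimate:.2f}")  # 7.77
```

If the values are not spread uniformly within the groups, this estimate can be off by as much as the gap between the lowest and highest possible means.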
Thanks oks. I'd thought someone would suggest one out of the (many) ways of calculating means (isn't percentile one?). – Stephen Igwue Feb 18 '14 at 08:03