
I am trying to understand how to average certain data, interpret the result of that, and potentially convert the results to percentages. I have four categories for the data:

1: (0-50%): Not Well
2: (50-75%): Okay
3: (75-90%): Well Done
4: (90-100%): Excellent

This is being used in a rating system. Say, for instance, we have 3 ratings of category 4 and 7 ratings of category 3: {3, 3, 3, 3, 3, 3, 3, 4, 4, 4}. I don't believe a straight arithmetic mean will be useful in this case, since the categories cover ranges of different widths: category 1 is wider than category 2, category 2 is wider than category 3, and category 3 is wider than category 4. For instance, what would a result of 3.3 mean if we took the arithmetic mean? I'm trying to understand how to interpret the results of the data, but am at a loss for how to do this. Is there a way to convert the results back to one of the 4 categories or a percentage?

Edit: I think I've made some progress, but I could be off the mark. Taking a few different examples...

{1,1,1,1,1,1,1,1,4,4}.

They got a rating of 1 80% of the time and a rating of 4 20% of the time: .8*1+.2*4 = 1.6, possibly a Not Well result?

{1,1,1,3,3,4,4,4,4,4}: .3*1+.5*4+.2*3 = 2.9, maybe an Okay result?

{1,1,1,1,2,2,2,4,4,4}: .4*1+.3*2+.3*4 = 2.2, maybe an Okay result?

The problem with using that method is that Excellent will only come up if every rating is a 4. I am not sure how to fix that or if what I've tried is the correct method.
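The calculation in the edit above can be sketched in Python to make the limitation visible. This is only an illustration: the function name `weighted_code_mean` is made up here, and it treats the codes 1 to 4 as plain numbers, which is exactly the assumption being questioned.

```python
from collections import Counter

def weighted_code_mean(ratings):
    """Sum of (relative frequency * category code) over the ratings.

    This is the same number as the plain arithmetic mean of the codes,
    so it can only reach 4 when every single rating is a 4.
    """
    counts = Counter(ratings)
    n = len(ratings)
    return sum(code * count / n for code, count in counts.items())

# The first example from the edit: eight 1's and two 4's.
print(round(weighted_code_mean([1, 1, 1, 1, 1, 1, 1, 1, 4, 4]), 2))  # 1.6
```

Because the result lives on the 1 to 4 code scale rather than on the percentage scale, it ignores the fact that the categories span ranges of different widths.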

  • It might be better to look at the median of the sample, rather than a mean. – Paul Mar 14 '15 at 23:51
  • I've been looking into that as well. But, for instance, in the example {1,1,1,1,1,4,4,4,4,4,4,4}, the median is 4, even though 1 is a much larger category and should have a higher weight. If we tried to weight the median by the category size, using the previous example, we would have 50*5 = 250 1's and 10*7 = 70 4's. 1 would be the median in this case, but I would rather end up in category 2 in this example, which wouldn't be a possibility if using the median. – ChaosKExtreme Mar 15 '15 at 00:15

1 Answer


The standard way to calculate such a mean is to multiply the midpoint of each interval by the frequency, sum these and divide by the total frequency.

{1,1,1,1,1,1,1,1,4,4}: $\frac {8 \times 0.25 + 2 \times 0.95}{10}=0.39$: Not Well

{1,1,1,3,3,4,4,4,4,4}: $\frac {3 \times 0.25 + 2 \times 0.825 + 5\times 0.95}{10}=0.715$: Okay

{1,1,1,1,2,2,2,4,4,4}: $\frac {4 \times 0.25 + 3 \times 0.625 + 3\times 0.95}{10}=0.5725$: Okay

{3,3,3,4,4,4,4,4,4,4}: $\frac {3 \times 0.825+ 7\times 0.95}{10}=0.9125$: Excellent
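If you want to automate this, here is a minimal Python sketch of the midpoint calculation. The names `BOUNDS`, `LABELS`, `grouped_mean` and `to_category` are just illustrative, and sending a score that lands exactly on a boundary (such as 0.75) to the higher category is my assumption, since the question's ranges overlap at their endpoints.

```python
# Category ranges as given in the question, written as fractions of 100%.
BOUNDS = {1: (0.00, 0.50), 2: (0.50, 0.75), 3: (0.75, 0.90), 4: (0.90, 1.00)}
LABELS = {1: "Not Well", 2: "Okay", 3: "Well Done", 4: "Excellent"}

def grouped_mean(ratings):
    """Estimate the mean percentage by replacing each code with its interval midpoint."""
    midpoints = {code: (lo + hi) / 2 for code, (lo, hi) in BOUNDS.items()}
    return sum(midpoints[r] for r in ratings) / len(ratings)

def to_category(score):
    """Map a score in [0, 1] back to a category code.

    Assumption: a score exactly on a boundary belongs to the higher category.
    """
    for code in sorted(BOUNDS, reverse=True):
        if score >= BOUNDS[code][0]:
            return code
    raise ValueError("score must be between 0 and 1")

sample = [3, 3, 3, 4, 4, 4, 4, 4, 4, 4]
score = grouped_mean(sample)
print(round(score, 4), LABELS[to_category(score)])  # 0.9125 Excellent
```

Floating-point rounding could matter for a score that falls almost exactly on a boundary, so you may want to round the score before classifying it.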

This method is usually called "estimating the mean of a grouped frequency table" because the coded values 1, 2, 3 and 4 hide the true percentage values. There is thus a margin of error in each calculation.

It is possible to give a margin of error for your mean, too, which you could use to decide whether the overall result was truly "excellent" or not.

{1,1,1,1,2,2,2,4,4,4}: lower bound is $\frac {4 \times 0 + 3 \times 0.5+ 3\times 0.9}{10}=0.42$

{1,1,1,1,2,2,2,4,4,4}: upper bound is $\frac {4 \times 0.5 + 3 \times 0.75+ 3\times 1}{10}=0.725$

{3,3,3,4,4,4,4,4,4,4}: lower bound is $\frac {3 \times 0.75 + 7 \times 0.9}{10}=0.855$

{3,3,3,4,4,4,4,4,4,4}: upper bound is $\frac {3 \times 0.9 + 7\times 1}{10}=0.97$
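The bounds can be computed the same way in code. The following Python sketch (the function name `mean_with_bounds` is only illustrative) averages the lower and upper ends of each rating's interval. The midpoint estimate then sits halfway between the two bounds, and half the gap between them is the margin of error.

```python
# Category ranges as given in the question, written as fractions of 100%.
BOUNDS = {1: (0.00, 0.50), 2: (0.50, 0.75), 3: (0.75, 0.90), 4: (0.90, 1.00)}

def mean_with_bounds(ratings):
    """Return (lower, estimate, upper, margin) for the grouped mean.

    lower/upper assume every rating sat at the bottom/top of its interval;
    estimate is the midpoint-based mean; margin is half the gap between bounds.
    """
    n = len(ratings)
    lower = sum(BOUNDS[r][0] for r in ratings) / n
    upper = sum(BOUNDS[r][1] for r in ratings) / n
    estimate = (lower + upper) / 2
    margin = (upper - lower) / 2
    return lower, estimate, upper, margin

lower, estimate, upper, margin = mean_with_bounds([1, 1, 1, 1, 2, 2, 2, 4, 4, 4])
print(f"{estimate:.2%} +/- {margin:.2%}")  # 57.25% +/- 15.25%
```

Note that this margin is the worst-case range implied by the category widths, not a statistical confidence interval.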

tomi
  • Hmm, let me try that on a few examples: {1,1,1,3,3,4,4,4,4,4}: (.25*3+.825*2+.95*5)/10 = .715 = Okay. – ChaosKExtreme Mar 15 '15 at 00:50
  • Is it possible to get Excellent this way (other than just all 4's)? It seems like it would be. – ChaosKExtreme Mar 15 '15 at 00:54
  • Should be possible. – tomi Mar 15 '15 at 00:57
  • This definitely seems like what I was looking for! Thanks, so much! I've been struggling with this most of the day. – ChaosKExtreme Mar 15 '15 at 00:59
  • Can't upvote, not enough rep. So, just picture me giving you a thumbs up, because I am. – ChaosKExtreme Mar 15 '15 at 00:59
  • Thanks. I've now given an example that gets a result of 91.25%=excellent. – tomi Mar 15 '15 at 01:01
  • Thanks. That margin of error you mention seems like it could come in handy for my purposes. How would you go about calculating that? – ChaosKExtreme Mar 15 '15 at 01:13
  • Added examples to my answer. – tomi Mar 15 '15 at 01:27
  • Note that you don't have to have ten values; any number will do... – tomi Mar 15 '15 at 01:28
  • Thanks. Given your examples, that would make {1,1,1,1,2,2,2,4,4,4}: 57.25% +/- 15.25% and {3,3,3,4,4,4,4,4,4,4}: 91.25% +/- 5.75%. That will definitely help. – ChaosKExtreme Mar 15 '15 at 01:43
  • Good. You can restrict yourself to awarding the lower bound as the mean score if you want to ensure that you only award a "well done" to the truly deserving, but that might be too harsh... – tomi Mar 15 '15 at 01:55
  • Hi, in the first example, how did you derive 0.25 and 0.95? I think the midpoint is what I don't know how to calculate. Thanks. – asyncwait May 27 '16 at 14:08
  • You said category "1" was 0% to 50%, so the middle of that interval is ${0.00 + 0.50 \over 2}=0.25$. Category "4" was 90% to 100%, so the middle of that interval is ${0.90 + 1.00 \over 2}=0.95$. – tomi May 27 '16 at 14:17