I'm struggling with the following problem. I have a table showing the % of the population that likes (say) bananas in three locations, the population of each location, and the number of people who like bananas in each location (the previous two columns multiplied by each other).
[Column 1] % who like bananas: (1) 13% (2) 11% (3) 17%
[Column 2] Population: (1) 100 (2) 125 (3) 90
[Column 3] Pop who like bananas: (1) 13 (2) 13.75 (3) 15.3
When you sum the third column you get a total of 42.05 who like bananas.
But if you average the percentages shown in the first column you get 13.67%. Multiply this by the sum of the population (315) and you get 43.05, which is 1.0 higher than the sum of the third column.
It seems using the average for the three rows skews the final result compared to finding the population who like bananas for each row and then adding these up.
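For reference, here is a rough Python sketch that reproduces both figures from the table. The variable names are just my own. It also works out the average weighted by population (third column total divided by total population, about 13.35%), which by construction gives back the row-by-row total, whereas the plain average counts each location equally regardless of its size.

```python
# Figures taken from the table above; all names here are illustrative.
pct = [0.13, 0.11, 0.17]   # share who like bananas in each location
pop = [100, 125, 90]       # population of each location

# Method 1: work out each row, then add the rows up.
per_row = [p * n for p, n in zip(pct, pop)]
total_by_row = sum(per_row)

# Method 2: plain average of the three percentages, times total population.
simple_avg = sum(pct) / len(pct)
total_by_simple_avg = simple_avg * sum(pop)

# Population-weighted average: each percentage counts in proportion
# to the size of its location.
weighted_avg = sum(per_row) / sum(pop)
total_by_weighted_avg = weighted_avg * sum(pop)

print(round(total_by_row, 2))           # 42.05
print(round(total_by_simple_avg, 2))    # 43.05
print(round(weighted_avg * 100, 2))     # 13.35
print(round(total_by_weighted_avg, 2))  # 42.05
```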
What's a simple way to explain why this is?
Obviously just a theoretical exercise - in real life more people like bananas!
Thanks