When calculating vaccine effectiveness against hospitalization from publicly available data, I came across a strange (mathematical) problem that I do not know how to interpret. The problem occurs with real-world data, but for this question I hand-crafted the numbers to illustrate it more clearly.
I calculate the vaccine effectiveness against hospitalization by comparing, for vaccinated and unvaccinated people, the number of hospitalized among those who tested positive. Here are my data:
Note that while the effectiveness of the vaccine for the whole population is negative, it is positive for each individual age range. Mathematically this is clear: young people often test positive but are rarely hospitalized, while old people rarely test positive (because more of them are vaccinated) but are hospitalized more often.
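The effect can be reproduced with a minimal sketch in Python. The counts below are invented purely for illustration and are not the data from this question. The formula, one minus the ratio of the hospitalization rate among vaccinated positives to the rate among unvaccinated positives, is an assumed form of the comparison described above, and the names `vaccine_effectiveness` and `strata` are hypothetical.

```python
def vaccine_effectiveness(pos_vax, hosp_vax, pos_unvax, hosp_unvax):
    """Assumed formula: 1 - (hospitalization rate among vaccinated positives)
    / (hospitalization rate among unvaccinated positives)."""
    risk_vax = hosp_vax / pos_vax
    risk_unvax = hosp_unvax / pos_unvax
    return 1 - risk_vax / risk_unvax


# Hypothetical counts, NOT the data from the question.
strata = {
    "young": dict(pos_vax=200, hosp_vax=1, pos_unvax=1000, hosp_unvax=10),
    "old": dict(pos_vax=500, hosp_vax=50, pos_unvax=100, hosp_unvax=20),
}

# Effectiveness within each age range.
ve_by_stratum = {
    name: vaccine_effectiveness(**counts) for name, counts in strata.items()
}

# Effectiveness for the pooled population: sum the counts first, then apply
# the same formula.
keys = ("pos_vax", "hosp_vax", "pos_unvax", "hosp_unvax")
totals = {k: sum(counts[k] for counts in strata.values()) for k in keys}
ve_total = vaccine_effectiveness(**totals)

for name, ve in ve_by_stratum.items():
    print(f"{name}: {ve:.0%}")
print(f"total: {ve_total:.0%}")
```

With these made-up numbers each age range gives 50%, while the pooled figure is about -167%, because the vaccinated positives are mostly old (high baseline hospitalization rate) and the unvaccinated positives are mostly young (low baseline rate).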
Question 1: How should I interpret the fact that the effectiveness is negative for the whole population, while it is positive for each of its parts?
When I categorize by something other than age range, I can get completely different effectiveness values for the parts:
Note that the total numbers are the same as before; only the distribution of the hospitalized among the positive is different. My impression is that by carefully choosing the categories, I can get any result I want.
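A minimal sketch of this sensitivity, again in Python and again with invented counts that are not the data from this question: the same hypothetical totals (700 vaccinated positives with 51 hospitalized, 1100 unvaccinated positives with 30 hospitalized) are split by an arbitrary label instead of by age. The effectiveness formula is the same assumed one, and the names `ve`, `by_colour`, and the labels `"red"` and `"blue"` are hypothetical.

```python
def ve(pos_vax, hosp_vax, pos_unvax, hosp_unvax):
    """Assumed formula: 1 - rate among vaccinated / rate among unvaccinated."""
    return 1 - (hosp_vax / pos_vax) / (hosp_unvax / pos_unvax)


# Hypothetical split of the same totals by an arbitrary label ("colour").
by_colour = {
    "red": dict(pos_vax=350, hosp_vax=30, pos_unvax=550, hosp_unvax=10),
    "blue": dict(pos_vax=350, hosp_vax=21, pos_unvax=550, hosp_unvax=20),
}

# The pooled counts are unchanged by the choice of categories.
keys = ("pos_vax", "hosp_vax", "pos_unvax", "hosp_unvax")
colour_totals = {k: sum(c[k] for c in by_colour.values()) for k in keys}

# Per-category effectiveness under this alternative split.
ve_by_colour = {name: ve(**counts) for name, counts in by_colour.items()}
ve_pooled = ve(**colour_totals)

for name, value in ve_by_colour.items():
    print(f"{name}: {value:.0%}")
print(f"pooled: {ve_pooled:.0%}")
```

Here both categories come out negative, even though the pooled counts are identical to a split in which every category is positive. The pooled figure is fixed by the totals; the per-category figures depend entirely on how the hospitalized are distributed across the chosen categories.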
Question 2: I am sure this phenomenon is well known and studied in statistics. Can you point me to the right topic I can look at?
Question 3: As shown above, the calculated vaccine effectiveness depends substantially on the chosen division into categories, and different divisions can give completely different results. Why is categorization by age considered better (and correct) than categorization by, e.g., colour (as in my example)? IMO, it's just wishful thinking. For instance, how can we be sure that if we subdivide the age ranges more finely (either by individual years or by other criteria, e.g. type of vaccine, factory where it was made, region where the patients live, etc.), the effectiveness won't be negative again?

