
Been struggling with some probabilities. Even though I might be able to solve this the hard way, that seems like too much of a struggle to make sense.

A test is made up of 5 history and 6 geography questions. A student correctly answers history questions 60% of the time and geography questions 40% of the time. What is the probability that there will be more correctly answered history questions than geography questions?

So basically you need $H>G$, where $H$ and $G$ are the numbers of correctly answered history and geography questions. For example, to answer 5 history questions out of 5 AND 4 geography questions out of 6, the formula should be something along the lines of $(C^5_5 \cdot 0.6^5 \cdot 0.4^0)\cdot(C^4_6 \cdot 0.4^4 \cdot 0.6^2)$,

however, this only takes into account the one situation of $5H - 4G$, and you're left with the other situations:

5 - 4 as in example
5 - 3
5 - 2
5 - 1
5 - 0
4 - 3
4 - 2
4 - 1
4 - 0
3 - 2
3 - 1
3 - 0
2 - 1
2 - 0
1 - 0

Are you supposed to calculate each situation individually and add up the answers, or is there a better way to figure the answer out? (I am guessing there is.)

woohoos
  • Is the $60$% for the history questions, or for all questions? – Paul Apr 16 '20 at 17:54
  • fixed, thanks for noticing. it's for history questions – woohoos Apr 16 '20 at 17:55
  • That's one way to do it, but you can aggregate some of the cases. For example, instead of your first 5 cases, you can reduce it to 2: 5 and (not 5 or 6). – Paul Apr 16 '20 at 18:11

2 Answers

1

There are ways to simplify it a bit, but you are going to have a lot of cases to evaluate.

Two things could be done to reduce the number of cases. First, consider the complement. Second, condition on the $G$ variable instead of the $H$.

I'll show two scenarios, depending on which variable we condition.

Condition on $H$

Like you did, we regroup the cases with respect to the $H$.

If $H=5$:
$$P(H=5)\left(P(G=0)+P(G=1)+P(G=2)+P(G=3)+P(G=4)\right)$$
Here it is shorter to consider the complement.
$$P(H=5)\left(1-P(G=5)-P(G=6)\right)\tag{H5}$$

If $H=4$:
$$P(H=4)\left(P(G=0)+P(G=1)+P(G=2)+P(G=3)\right)$$
Here it is also shorter to consider the complement.
$$P(H=4)\left(1-P(G=4)-P(G=5)-P(G=6)\right)\tag{H4}$$

For the other values, the direct version is shorter.

For $H=3$:
$$P(H=3)\left(P(G=0)+P(G=1)+P(G=2)\right)\tag{H3}$$

For $H=2$:
$$P(H=2)\left(P(G=0)+P(G=1)\right)\tag{H2}$$

For $H=1$:
$$P(H=1)P(G=0)\tag{H1}$$

For $H=0$, the probability is $0$.

Then, the probability is the sum of each:
$$P(H>G)=(H5)+(H4)+(H3)+(H2)+(H1)$$
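
As a compact restatement of the five terms above (nothing new, just the same sum written with the cumulative probability of $G$ and the binomial probabilities from the question, $H\sim\text{Bin}(5,0.6)$ and $G\sim\text{Bin}(6,0.4)$):

$$P(H>G)=\sum_{h=1}^{5}P(H=h)\,P(G\le h-1)=\sum_{h=1}^{5}\binom{5}{h}0.6^{h}\,0.4^{5-h}\sum_{g=0}^{h-1}\binom{6}{g}0.4^{g}\,0.6^{6-g}$$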

Condition on $G$

It will be shorter if we regroup according to $G$.

If $G=6$ or $G=5$, the probability is $0$.

If $G=4$, then
$$P(G=4)P(H=5)\tag{G4}$$

If $G=3$, then
$$P(G=3)\left(P(H=5)+P(H=4)\right)\tag{G3}$$

If $G=2$, then
$$P(G=2)\left(P(H=5)+P(H=4)+P(H=3)\right)\tag{G2}$$

For the last two cases, it is shorter to consider the complement.

If $G=1$, then
$$P(G=1)\left(1-P(H=0)-P(H=1)\right)\tag{G1}$$

If $G=0$, then
$$P(G=0)\left(1-P(H=0)\right)\tag{G0}$$

Then, you want
$$P(H>G)=(G4)+(G3)+(G2)+(G1)+(G0)$$

No matter how you do it, there will be a lot of computation.
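
If you just want to check the arithmetic, here is a minimal Python sketch of the grouping by $G$. The helper names `binom_pmf` and `p_more_history` are made up for illustration, and it assumes $H\sim\text{Bin}(5,0.6)$ and $G\sim\text{Bin}(6,0.4)$ are independent, as in the question.

```python
from math import comb

def binom_pmf(n, p, k):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Parameters taken from the question.
N_H, P_H = 5, 0.6   # history: 5 questions, 60% success
N_G, P_G = 6, 0.4   # geography: 6 questions, 40% success

def p_more_history():
    """Sum over g of P(G = g) * P(H > g)."""
    total = 0.0
    for g in range(N_G + 1):
        # For g = 5 or 6 the inner range is empty, so the term is 0.
        p_h_greater = sum(binom_pmf(N_H, P_H, h) for h in range(g + 1, N_H + 1))
        total += binom_pmf(N_G, P_G, g) * p_h_greater
    return total

print(round(p_more_history(), 4))  # 0.5328
```

The inner sum uses the direct form for every $g$ rather than switching to the complement for $G=1$ and $G=0$; that shortcut only saves work when computing by hand.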

1

$H$ and $G$ are independent random variables each having a binomial probability distribution but with different parameters $n$ (number of experiments) and $p$ (probability of success).

You wish to find the probability distribution of $H-G$. This would be relatively easy if $H$ and $G$ were normally distributed, or if $n$ were large enough for the normal approximation to be used: then the random variable $Z=H-G$ would also be normally distributed, and the required probability $P(Z\gt 0)$ could be found from tables. That is not the case here.

$H+G$ has a binomial distribution if $p_H=p_G$, in which case tables can be used. But $H-G$ does not have a binomial distribution even in this case.

However, note that $p_H=1-p_G$, so $\bar{G}=n_G-G$ (the number of incorrectly answered geography questions) has the same probability of success as $H$.

Therefore, since $H$ and $\bar{G}$ are independent, the sum $Z=H+\bar{G}=n_G+H-G$ has a binomial distribution with parameters $n_H+n_G$ and $p_H=1-p_G$. So the required probability $P(H\gt G)=P(Z \gt n_G)$ can be found from binomial distribution tables.
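
With the numbers from the question ($n_H=5$, $n_G=6$, $p_H=0.6$) this means $Z\sim\text{Bin}(11,0.6)$ and the required probability is $P(Z\ge 7)$. If no tables are at hand, a short Python sketch of that tail sum might look like this (the function name `binom_tail` is just an illustrative choice):

```python
from math import comb

def binom_tail(n, p, k_min):
    """P(Z >= k_min) for Z ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(k_min, n + 1))

# Parameters taken from the question.
n_h, n_g, p_h = 5, 6, 0.6

# P(H > G) = P(Z > n_g) = P(Z >= n_g + 1) with Z ~ Binomial(n_h + n_g, p_h)
p = binom_tail(n_h + n_g, p_h, n_g + 1)
print(round(p, 4))  # 0.5328
```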

See: Difference of two binomial random variables.