A student has 7 pairs of socks of 7 different colours. During 7 days he randomly picks 2 socks from the drawer (not necessarily of the same colour) and then doesn't put them back. Find the expected value of the number of days, in which 2 socks of the same colour are picked.

My work so far: There are 7 colours, so on the first day the student has a $\frac{1}{7}$ chance to pick the first sock, and $\frac{1}{13}$ chance to pick another sock of the same colour, giving $\frac{1}{91}$ chance to pick a pair of matching socks on the first day. Since drawing of all socks is equally likely, this can be multiplied by 7 to give $\frac{7}{91}$. Multiplying it by the seven days of the week, we have $\frac{49}{91}$, which comes out to $0.5385$ days (question asks for 4 decimal places). Are there any flaws in my logic, and is the answer correct (or close)?

Edit: corrected wrong decimal approximation

John Doe
  • 502
  • 2
    The phrasing is a bit off, though I think the substance is solid. Whatever you draw first, there is a $\frac 1{13}$ chance that the second will be the match, so there is a $\frac 1{13}$ chance of picking a pair of matching socks on the first day (not $\frac 1 {91}$ as you state; that's the probability of getting a specific pair on a given day). Then independence lets you multiply by the number of days (as each couple of socks is equally likely to be the ones chosen on that day). – lulu Jan 31 '23 at 12:43
  • @lulu Thanks! So the answer would be 7/13, i.e. 0.5385 days? This seems wrong however, logically? Would mean that on average it would take half a day for the student to pick the same coloured socks, but there are a lot more ways to not get a matching pair (e.g. if student picked green and red on the first day, he now has another green and red single sock in the drawer, which can never get a match, so probability of getting a match, assuming there was no match already, decreases every day). So 7/13 would be the probability of getting matching socks in the 7 days, not the expected value? – John Doe Jan 31 '23 at 12:56
  • 2
    I'd avoid the word "independence" here, and would opt instead for the word "symmetry." Other than that, I fully agree with @lulu's first comment. Whether or not you got a matching pair on one day is a dependent event with respect to whether or not you got a matching pair on another day (seen trivially by noting that if you got matching pairs the first six days then the seventh day is guaranteed to be a matching pair). The multiplication by seven (the number of days) is correct however and can be explained by "Linearity of Expectation" – JMoravitz Jan 31 '23 at 12:57
  • Expected value of the number of days in which a matching pair was taken out, I mean. – John Doe Jan 31 '23 at 12:58
  • 1
    And so the final answer is that the expected number of days where you get matching socks will be $\frac{7}{13}$. Note how linearity of expectation lets you completely ignore all of the dependence going on and lets you completely ignore having to break into cases like "if student picked green and red on the first day, then..." I have no idea where your value of "3.7692" comes from in your post, but that is not $\frac{49}{91}$. I think you multiplied by seven one too many times. – JMoravitz Jan 31 '23 at 13:02
  • @JMoravitz Yes, you are right, I multiplied the 49/91 by 7 again. And 49/91 is indeed 7/13. – John Doe Jan 31 '23 at 13:11
  • @JMoravitz Also see where my confusion came from - I assumed we need to find the expected number of days in which 2 matching socks were picked. So after how many days he'd pick a matching pair, not the expected value of the number of days. Those are 2 different values, right? – John Doe Jan 31 '23 at 13:16
  • 1
    "The expected total number of days in which 2 matching socks were picked out of all days" is not the same as "the expected number of days that we need to wait until having gotten a matching pair." The question asked for the first, and the first can (as alluded to previously) be found trivially with linearity of expectation. The second is a much harder question, and it is also unclear how to handle the case where there were no matches at all throughout the week. That second question doesn't really make sense for this problem, where we take socks without replacement. – JMoravitz Jan 31 '23 at 13:20
  • @JMoravitz Thank you, that clears it up! – John Doe Jan 31 '23 at 13:24
  • 1
    I agree that "independence" was a poor choice of words. I was referring to the fact that Expectation is linear, independent of the dependence of the random variables in question. Of course the results on a given day depend on those of other days (the last day's result is fully determined by the prior days, for instance). – lulu Jan 31 '23 at 13:36
  • I can't understand the question. If I have a matching pair of socks, aren't they both of the same colour? – P. Lawrence Jan 31 '23 at 13:37
  • @P.Lawrence Yes, but you can pick socks of different colours on a given day, hence the wording I guess. – John Doe Jan 31 '23 at 14:23

1 Answer

Thanks everyone for comments! For clarity, I will write an answer that summarizes all the comments written.

We call a day "good" if two socks of the same color are chosen on that day. As far as I understand, we are required to calculate the expected number of "good" days. The indicator method works well here. Let $\xi$ be the random variable that counts the good days, and introduce the events
$$A_j = \left\{ j\text{-th day was good} \right\}, \quad j=1,\dots,7.$$
Then
$$\xi = I_{A_1} + I_{A_2} + \dots + I_{A_7}, \quad \text{where } I_{A_j} = \begin{cases} 1, & \text{if event } A_j \text{ happened},\\ 0, & \text{otherwise.} \end{cases}$$
Hence
$$E \xi = E\left(\sum\limits_{j=1}^{7} I_{A_j} \right)= \sum\limits_{j=1}^{7} E\left(I_{A_j} \right) = \sum\limits_{j=1}^{7} P(A_j) = 7 \cdot \frac{C_7^1}{C_{14}^2} = 7 \cdot \frac{7}{91} = \frac{7}{13}, \quad \left(\text{here } C_n^k = \binom{n}{k}\right).$$

A good question here is why $P(A_j) = \frac{1}{13}$ for every $j \in \left\{1,\dots,7 \right\}$. It follows from symmetry, as @JMoravitz wrote, or we can verify it using conditional probability (but this takes a long time in general). Still, the probability $P(A_2)$ can be calculated quickly:
$$P(A_2) = P(A_2|A_1) \cdot P(A_1) + P(A_2|\overline{A_1})\cdot P(\overline{A_1}) =$$
$$ = \underbrace{\frac{C_6^1}{C_{12}^2} }_{12 \text{ socks left, of which there are } 6 \text{ complete pairs}} \cdot \frac{1}{13} + \underbrace{\frac{C_5^1}{C_{12}^2}}_{12 \text{ socks left, of which there are } 5 \text{ complete pairs}} \cdot \frac{12}{13} = $$
$$ = \frac{1}{11}\cdot \frac{1}{13} + \frac{10}{12\cdot 11} \cdot \frac{12}{13} = \frac{1}{11}\cdot \frac{1}{13} + \frac{10}{11\cdot 13}= \frac{1}{13}.$$
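As a sanity check on the value $\frac{7}{13}$, here is a minimal Python sketch that computes the expectation exactly and also estimates it by simulation. It is only an illustration, not part of the original argument: the function names (`exact_expected_good_days`, `p_second_day_good`, `simulate_good_days`) and the modelling choice of shuffling all 14 socks and reading them off two per day are my own assumptions.

```python
import random
from fractions import Fraction
from math import comb


def exact_expected_good_days(pairs: int = 7) -> Fraction:
    """Expected number of good days via linearity of expectation."""
    # By symmetry, the two socks drawn on any fixed day form a uniformly
    # random 2-subset of all socks, so P(A_j) = pairs / C(2 * pairs, 2).
    p_good_day = Fraction(pairs, comb(2 * pairs, 2))
    return pairs * p_good_day


def p_second_day_good() -> Fraction:
    """P(A_2) for 7 pairs, conditioning on whether day 1 was good."""
    p_a1 = Fraction(1, 13)
    p_a2_given_a1 = Fraction(6, comb(12, 2))      # 6 complete pairs remain
    p_a2_given_not_a1 = Fraction(5, comb(12, 2))  # 5 complete pairs remain
    return p_a2_given_a1 * p_a1 + p_a2_given_not_a1 * (1 - p_a1)


def simulate_good_days(pairs: int = 7, trials: int = 200_000, seed: int = 0) -> float:
    """Monte Carlo estimate of the expected number of good days."""
    rng = random.Random(seed)
    # Two socks of each colour; colours are labelled 0 .. pairs - 1.
    socks = [colour for colour in range(pairs) for _ in range(2)]
    total_good = 0
    for _ in range(trials):
        rng.shuffle(socks)
        # Positions (0, 1), (2, 3), ... are the socks drawn on days 1, 2, ...
        total_good += sum(
            socks[i] == socks[i + 1] for i in range(0, 2 * pairs, 2)
        )
    return total_good / trials


if __name__ == "__main__":
    print(exact_expected_good_days())   # 7/13
    print(p_second_day_good())          # 1/13
    print(simulate_good_days())         # should be close to 0.5385
```

The simulation never uses independence between days; it only averages the count of good days, which is why its estimate should agree with the linearity-of-expectation result.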

greyls
  • 1,273
  • Thank you for the detailed write up! I have one question about the E(Xi) equation - why is the sum of the expected values of Indicator {A_j} the same as the sum of probabilities of {A_j}? And what was the advantage of using the indicator function, as opposed to just writing E(Xi) as the sum of the probabilities of {A_j} to begin with? – John Doe Feb 02 '23 at 16:51
  • 1
    @John Doe, the sum of the expected values of the indicators of {A_j} is the same as the sum of the probabilities of {A_j} because, by the definitions of expectation and of an indicator, we have $EI_{A_j}=1\cdot P(A_j)+ 0\cdot (1-P(A_j)) = P(A_j)$. I would not call it an advantage; in my opinion, the indicator method just explains better where this comes from. The method is also useful to know because it lets you solve many similar problems in which it is difficult to find the distribution of a random variable directly, but you still need to calculate its characteristics. – greyls Feb 02 '23 at 17:00