Alice has two bags, having random values in [0, 1]. She reveals one bag to Bob. Bob picks one bag, Alice gets the other. What is the optimal strategy?

Question

Alice is given two bags. The values of the contents of the bags are each uniformly (and independently) distributed over $[0, 1]$. She looks in both bags and finds out their value. She then reveals the value of one of the bags to Bob. Bob then picks either the revealed bag or the hidden bag. Bob and Alice are both trying to maximize their expected value. What are their optimal strategies? Assume that Bob knows Alice's strategy when he chooses his strategy.

It's clear that Bob can always achieve an expected value of $0.5$ by just picking either the revealed bag or the hidden bag uniformly at random.

One proposed strategy for Alice is to reveal the bag having value closer to $0.5$. In this case, after a bag of value $x$ is revealed to Bob, his posterior distribution over the value of the hidden bag is uniform over $[0, 0.5 - t] \cup [0.5 + t, 1]$, where $t = |x - 0.5|$. Since this distribution is symmetric about 0.5, Bob's expected value for the hidden bag is always $0.5$. Accordingly, Bob will choose the revealed bag with value $x$ if $x \ge 0.5$, and will otherwise prefer the hidden bag having expected value $0.5$.

Since the distribution of the value $x$ of the revealed bag is symmetric about $0.5$, the probability of $x \ge 0.5$ is $0.5$. The probability density of $x$ conditioned on $x \ge 0.5$ is proportional to $p(x) = 1 - 2(x - 0.5) = 2 - 2x$; normalization gives us $p(x) = 8 - 8x$. Bob's expected value when $x \ge 0.5$ is therefore $\int_{0.5}^{1} x (8 - 8x) \,dx = 2/3$, so his overall expected value is

$$ \frac{1}{2} \cdot \frac{1}{2} + \frac{1}{2} \cdot \frac{2}{3} = \frac{7}{12} = 0.5 + \frac{1}{12}. $$

As we can see, this strategy still gives Bob $1/12$ more expected value than the proven lower bound of $0.5$. My questions are:

Is there a better strategy for Alice?
I suspect the only way to bring Bob's expected value to 0.5 would be if Alice had a strategy where, after she revealed a bag of value $x$ to Bob, Bob's expected value of the hidden bag is $x$. Only then is Bob unable to gain any advantage out of his ability to choose. Is such a strategy possible?
Can we prove a better lower bound? Can we prove the above strategy optimal?

"It's clear that Bob can always achieve an expected value of 0.5 by just picking either the revealed bag or the hidden bag uniformly at random." Well.... if so that's a better strategy for Alice, isn't it? If she randomly chooses the bag to show then Bob's expected return will be $0.5 < 0.5 +\frac 1{12}$. — fleablood, Jun 02 '21 at 15:26
Does Alice know the contents of each bag when Alice chooses which one to reveal to Bob? — Theoretical Economist, Jun 02 '21 at 15:42
@fleablood I don't think Bob's expected return there is 0.5, since his strategy will again be "take the bag you show me if it is larger than 0.5, and gamble on the other one otherwise". I think in that case, his expected returns are 1/2 * 3/4 + 1/2 * 1/2 = 5/8. — E-A, Jun 02 '21 at 15:51
@fleablood, if Alice randomly chooses the bag to show, that still gives Bob information that can help him win. If Bob rejects <0.5 and accepts greater than >0.5, then Bob's expected return is $>0.5$. — aTree, Jun 02 '21 at 15:51
@Theoretical Economist yes, she knows the values of the bags. I've edited to clarify this. — Superty, Jun 02 '21 at 16:35
I would guess that Alice’s optimal strategy would be to reveal as little as possible to Bob when showing Bob one of the bags. This means that Alice’s strategy for choosing which bag to reveal does not depend on the contents of either bag (or else this reveals additional information to Bob). Of course, there are many strategies that have this property. — Theoretical Economist, Jun 02 '21 at 17:20
@TheoreticalEconomist if it doesn't depend on the contents of the bag, isn't that equivalent to what fleablood said in an earlier comment, which does worse than the strategy already presented by the OP? Also, if you have time, if you can skim over my answer that would be great; based on your username, you probably know more about game theory than I do. — E-A, Jun 02 '21 at 17:47
@E-A you’re right; that was an off-the-cuff guess from the (probably mistaken) intuition that Alice doesn’t want to reveal any information to Bob, as Bob can only use this to their (Bob’s) advantage, and this is bad for Alice. I’d be interested to know where this intuition breaks down, so I’m quite interested in understanding your answer. I’ll have a look tonight. — Theoretical Economist, Jun 02 '21 at 17:50
I don't think it's explicitly mentioned in the question, but probably worth saying: the aim is to pick as high a number as possible, right? — psmears, Jun 02 '21 at 21:18

score 8 · Accepted Answer · edited Jun 02 '21 at 22:05

8

Here, I will expand the argument given by @QuesterZen in his last paragraph, and argue that Alice has no strategy that helps her in any way against Bob's strategy of "take anything $\geq 0.5$ and switch otherwise", and show that $7/12$ is indeed the best.

There are essentially 3 cases we have to consider:

$a < b < 0.5$: Alice can ensure that Bob gets the $a$ by showing $b$.
$a < 0.5 < b$: Alice has no choice but let Bob get the larger one.
$0.5 < a < b$: Alice can ensure that Bob gets $a$ by first showing $a$.

In none of these strategies did Alice do anything suboptimal; the crux of Bob's gains are coming from the fact that whenever the two numbers are separated by the threshold Bob chose (0.5), Bob is guaranteed to make big bucks. (this is similar to the two envelopes puzzle if you have heard of it)

The expected gains of Bob under this strategy matches your $7/12$ as:

$\frac{1}{4} \frac{1}{6} + \frac{1}{2} \frac{3}{4} + \frac{1}{4} (\frac{1}{2} + \frac{1}{6}) = \frac{7}{12}$,

which shows that Bob can indeed always achieve 7/12.

edited Jun 02 '21 at 22:05

Tanner Swett

10,624

answered Jun 02 '21 at 17:18

E-A

5,987

1

Makes sense to me. Thanks! – Superty Jun 02 '21 at 20:50
One difficulty here is that given the strategy you've specified for Alice, the conjectured strategy for Bob (i.e. take anything larger than $1/2$, switch otherwise) may not be optimal. For example, whenever Alice shows Bob a bag whose contents are greater than $1/2$, it might be optimal for Bob to actually take the other one. (Depending on which bag Alice shows in case 2, and possibly also on how large the current value Alice sees is.) – Theoretical Economist Jun 04 '21 at 01:03
The issue with my original comment is that I was being sloppy: I was talking about optimal strategies when I was actually thinking about equilibrium strategies. – Theoretical Economist Jun 04 '21 at 01:04
@TheoreticalEconomist OP already has a strategy in the original post that shows that Bob cannot do any better than 7/12, unless you don't think Bob is being optimal in that scenario? So you suspect Bob can do better than 7/12? I believe I showed the lower bound here for Bob. – E-A Jun 04 '21 at 02:02
The problem is that asking what Bob's optimal strategy is is ill-defined if we don't have a theory of how Alice plays. Said differently, Bob's optimal strategy depends on Alice's strategy. For example, if Alice's strategy were to always show the smaller bag (or always show the larger one), then Bob has a strategy that secures a payoff of $3/4$ in expectation, and this strategy is strictly better than the one where Bob switches when the contents of the bag is less than $1/2$ and takes the bag otherwise. – Theoretical Economist Jun 04 '21 at 16:01
@TheoreticalEconomist I specified in the question that Bob knows Alice's strategy when he decides his. I believe that this is equivalent to saying that Bob's and Alice's strategy should be in equilibrium, but I'm not 100% sure. – Superty Jun 04 '21 at 18:17
Looking for their equilibrium strategies is fine. However, if that's what you want, then it's not clear that the strategies specified so far are an equilibrium. If we take Alice's strategy as in @E-A's answer, then I believe Bob has a profitable deviation. – Theoretical Economist Jun 04 '21 at 18:26
@TheoreticalEconomist Let me formalize the proof of E-A: Bob's optimal response to Alice's "reveal the bag closer to 0.5" establishes that Bob cannot do better than 7/12 if Alice picks her optimal strategy.
Now we prove that Bob can always achieve at least 7/12 utility no matter what strategy Alice picks, by always using the strategy "take the revealed bag if has at least 0.5 value and the hidden bag otherwise". From the answer, we know that even Alice's optimal strategy in this case gives Bob 7/12 expected value.

Thus Bob's maximum expected value is exactly 7/12.
– Superty Jun 04 '21 at 18:31
That's not the same as an equilibrium. An equilibrium is a specification of strategies for Alice and Bob, with the requirement that a) Alice's strategy is optimal given Bob's equilibrium strategy, and b) Bob's strategy is optimal given Alice's equilibrium strategy. What are Alice and Bob's equilibrium strategies here? – Theoretical Economist Jun 04 '21 at 18:40
I'd guess that Bob's strategy of switching whenever the bag is less than $1/2$ and staying otherwise actually is an equilibrium strategy for Bob. However, what should Alice do? We'd also need to check that given what Alice does, Bob's strategy is indeed optimal. – Theoretical Economist Jun 04 '21 at 18:41
1

As I mentioned, when I stated the question I said that Alice chooses her strategy, and then Bob chooses his strategy with knowledge of Alice's strategy. Perhaps that should have been clearer. I said in my comment that I think the strategies they pick would be an equilibrium pair but I wasn't sure.
Anyway, it appears that Alice's strategy of revealing the bag closer to 0.5 and Bob's strategy of picking the revealed bag iff it is above 0.5 are in equilibrium; E-A's answer proves that Alice cannot do better by unilaterally switching, and I proved in the question that Bob cannot do better.
– Superty Jun 04 '21 at 18:46
@Superty is the strategy for Alice that E-A uses in this answer the same as the one that you proposed? It seems to me that they are different, which is why I don't think you have shown that this is an equilibrium. – Theoretical Economist Jun 05 '21 at 13:32
@TheoreticalEconomist They are indeed different; the strategy for Alice in E-A's answer is only used for the proof and may not be in equilibrium with Bob's strategy. E-A shows that if Bob uses his strategy, the best Alice can do is 7/12, by showing that the best-case strategy against Bob's strategy achieves only 7/12. Therefore, Alice cannot gain by unilaterally switching from her equilibrium strategy described in my question, since that also already achieves 7/12. We have already proven the other side in my question, so the two strategies in the question are in equilibrium. – Superty Jun 06 '21 at 15:30

QuesterZen · Answer 2 · 2021-06-02T16:07:44.993

Perhaps we can build up to the continuous case with smaller finite cases?

With only {0,1}, Bob always wins by accepting 1 and rejecting 0. His expected return is $\frac{3}{4}$.

With only {0,0.5,1}, the only interesting case is if Alice has either (0.5,0) or (0.5,1), when she has to show the 0.5 either way. Bob can accept it or chose randomly with equal results. His expected return is $\frac{11}{18}$.

With {0,0.33,0.67,1} Alice gains nothing by showing the 0 or 1. In all other cases Bob can’t do better than when he sees 0.33 he swaps and if 0.67 he holds and Alice can’t make any better choices even knowing this. His expected return is $\frac{5}{8}$.

Five and six values are basically the same.

So far it is looking very likely you are correct.

For the continuous case, I’m fairly sure Alice can’t beat Bob’s obvious strategy or Bob beat Alice’s. This would mean your pair of strategies is a Nash equilibrium for the game.

To see that neither has a better strategy in the face of the others’ obvious strategy, take for example the situation where Alice has two low values a,b with a<b<0.5. If she shows value a, she expects Bob to swap it, so she can’t beat showing b hoping to “trick” Bob into swapping. But for each such case there is a case where she has b,(1-a) with equal probability. Again she has to show b as 1-a>0.5 so Bob would take it and win. For Bob if he sees the value b, it is equally likely that Alice has a or 1-a in her bag since either way she would show b, so he should always swap, gaining an expected value of 0.5>b.

Thanks for your answer. I decided to accept the other answer since that gave a more understandable (to me) proof that Alice really can't do better. — Superty, Jun 02 '21 at 20:51

Alice has two bags, having random values in [0, 1]. She reveals one bag to Bob. Bob picks one bag, Alice gets the other. What is the optimal strategy?

2 Answers2