I am working through the book Statistical Inference by Casella and Berger. While I understand that most of probability theory is done heuristically, on a first pass through the material I like to be more formal, just to get a clear idea of things.
In one of their examples on conditional probability, they discuss the Three Prisoners problem. It's supposed to show how conditional probabilities can be miscalculated, but I'm having trouble understanding and formalizing the correct version as opposed to the misguided one.
I put the pages below for reference.
To highlight my confusion, I'll introduce my 'naive' attempt to work through it. On a first pass, one could model this problem by taking the sample space $S$ to be $\{a, b, c\}$. Then the $\sigma$-algebra $F$ is the power set of $S$, and $p$ is the uniform probability measure, so $p(\{e\}) = 1/3$ for each $e \in S$. Here, an outcome $e \in S$ would be interpreted as 'prisoner $e$ was pardoned'.
Now, for a prisoner $e \in S$, we can define various events. Let $L_e$ denote the event that $e$ is picked to be pardoned and lives, and let $D_e$ denote the event that prisoner $e$ dies. With this formalism, $L_e = \{e\}$ and $D_e = S - \{e\} = \{f, g\}$, where $f$ and $g$ are the other two prisoners.
If we let $A$ denote the event that prisoner $a$ lives, well then of course $A = L_a = \{a\}$ and $p(A) = 1/3$. Let $W$ denote the event that 'the warden says that $b$ dies'.
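To make the naive model concrete, here is a minimal Python sketch of it. The names `S`, `p`, `L`, `D` and `A` mirror my notation above; writing them as Python functions is only for illustration.

```python
from fractions import Fraction

# Naive sample space: outcome e means "prisoner e was pardoned".
S = frozenset({"a", "b", "c"})

def p(event):
    """Uniform probability of an event, i.e. a subset of S."""
    return Fraction(len(event & S), len(S))

def L(e):
    """Event that prisoner e lives: the singleton {e}."""
    return frozenset({e})

def D(e):
    """Event that prisoner e dies: the complement of {e} in S."""
    return S - L(e)

A = L("a")          # prisoner a lives
print(p(A))         # 1/3
print(sorted(D("b")))  # ['a', 'c']
```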
Now, one of the pitfalls the authors warn against is assuming that $W = D_b = \{a,c\}$. If one assumes this, one ends up with the erroneous calculation mentioned in the text, namely $p(A \mid W) = p(\{a\})/p(\{a,c\}) = 1/2$. I can't think of a way to use the same sample space to model the 'correct' way to do this. Part of the problem statement is that the warden gives this information to prisoner $a$. How can one modify the space to reflect this? I'm also not exactly sure how they get the calculation $p(W \cap A) = 1/6$. Writing $C = L_c$ for the event that prisoner $c$ lives, how would $p(W \cap A)$ be different from $p(W \cap C)$? Thanks in advance for any assistance.
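For concreteness, here is my tentative guess at what an enlarged space might look like, as a minimal Python sketch. An outcome is a pair (pardoned prisoner, prisoner the warden names). The dictionary `prob` and its weights are my assumptions, not something I can confirm from the text: I assume the warden never names $a$, never names the pardoned prisoner, and chooses between $b$ and $c$ with equal probability when $a$ is the one pardoned. Here `C` stands for the event that prisoner $c$ lives.

```python
from fractions import Fraction

# Hypothetical enlarged sample space:
# (pardoned prisoner, prisoner the warden says will die).
# Assumption: the warden never names a or the pardoned prisoner,
# and picks b or c with probability 1/2 each when a is pardoned.
prob = {
    ("a", "b"): Fraction(1, 3) * Fraction(1, 2),
    ("a", "c"): Fraction(1, 3) * Fraction(1, 2),
    ("b", "c"): Fraction(1, 3),
    ("c", "b"): Fraction(1, 3),
}

def p(event):
    """Probability of an event, i.e. a set of outcome pairs."""
    return sum((prob[w] for w in event), Fraction(0))

A = {w for w in prob if w[0] == "a"}  # prisoner a is pardoned
C = {w for w in prob if w[0] == "c"}  # prisoner c is pardoned
W = {w for w in prob if w[1] == "b"}  # warden says b dies

print(p(W & A))         # 1/6
print(p(W & C))         # 1/3
print(p(W & A) / p(W))  # 1/3
```

Under these assumptions $W$ is no longer the same set as 'b dies': the outcome where $a$ is pardoned and the warden names $c$ lies in 'b dies' but not in $W$, which would account for $p(W \cap A)$ and $p(W \cap C)$ being different.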

