home *** CD-ROM | disk | FTP | other *** search
- Newsgroups: rec.puzzles
- Path: sparky!uunet!elroy.jpl.nasa.gov!decwrl!cache.crc.ricoh.com!james
- From: james@crc.ricoh.com (James Allen)
- Subject: Re: 52-card Monty
- Message-ID: <1992Nov21.200205.1998@crc.ricoh.com>
- Sender: james@crc.ricoh.com (James Allen)
- Organization: RICOH California Research Center
- Date: Sat, 21 Nov 92 20:02:05 GMT
- Lines: 223
-
-
- I lost the original posting by Jim Saxe; here is the problem reworded:
-
- ===============================================================
- Monty and Waldo play a game with N closed boxes. Monty hides a
- dollar in one box; the others are empty. Monty opens the empty boxes
- one by one. When there are only two boxes left Waldo opens either box;
- he wins if it contains the dollar. Prior to each of the N-2 box
- openings Waldo chooses one box and locks it, preventing Monty from opening
- it next. That box is then unlocked and cannot be so locked twice in a row.
-
- What are the optimal strategies for Monty and Waldo and what is the
- fair price for Waldo to pay to play the game?
- ===============================================================
-
- I posted a poor effort on this earlier; let me try to make amends.
- The fair price for large N is $0.6321205588285576784; I will offer
- a proof along with optimal strategies.
-
- Denote the game as G_N(). After (N-M) rounds of play, the game will have
- the same form as G_M(). Depending on the strategies each of the M boxes
- will have a probability p_i of containing the dollar. Let Waldo lock
- the M'th box (renumbering the boxes if necessary). Denote the game and
- Waldo's expected winnings in the game by
- G_M(p_1, p_2, ..., p_M) [1]
- In the solution we will need consider only the case
- p_2 = p_3 = p_4 = ... = p_M
- We abbreviate this case by
- G_M(b) = G_M(1 + b - Mb, b, b, b, b, ..., b) [2]
- Of course the probabilities must sum to one; also since
- probabilities are not negative:
- 1 + b - Mb >= 0, or
- 0 <= b <= 1 / (M-1) [3]
- In fact we will need to consider only the case
- b >= 1 / M [4]
- We will refer to constraints [2] and [4] below.
-
- Induction is used for one of the theorems, so we'd better solve G_2 and G_3
- for our basis.
- G_2(p_1, p_2) = max (p_1, p_2)
- G_3(p_1, p_2, p_3) = max (p_1 + p_2, p_3)
- since after Monty opens box #1, box #2 will have probability (p_1 + p_2)
- and vice versa. With constraints [2], [4]
- G_2(b) = G_2(1-b, b) = b
- G_3(b) = G_3(1-2b, b, b) = 1 - b
-
- The proof of Theorem 1 is based on the probability p_i that box #i
- contains the dollar. (Of course this is Waldo's perceived probability:
- Monty's probability would be 0 or 1.) It may seem wrong for Waldo to
- "forget" the game history and remember only the computed p_i. For
- example he may have previously locked some but not all of the boxes
- and this fact is ignored except in the calculation of p_i. Or Monty may
- have some higher level "plan" which mightn't seem to translate directly
- into probabilities. But probability algebra obeys some simplifying
- linearity rules and, if Monty keeps a "poker face", the probability model
- is the only thing Waldo has to act on.
-
- Especially paradoxical is the derivation of Waldo's p_i in his trivial
- strategy below: he can adopt inferior but "correct" p_i to simplify the
- analysis.
-
- Theorem 1)
- If b >= 1/M then
- G_M(b) = G_[M-1]( (1-b) / (M-2) ) [5]
-
- Proof)
- We will show that Monty and Waldo each have a strategy in G_M(b) to
- reduce the game to G_[M-1](b, q, q, ..., q) where q = (1-b) / (M-2)
- and where the boxes have been renumbered so that box #1 was box #M
- (the one Waldo locked) from the prior round and the new box #(M-1)
- is the one Waldo locks next. Note that if Monty indeed arranges
- the probability mixture G_[M-1](b, q, q, q, q, ..., q) it won't
- matter which box Waldo locks (Box #1 has the only non-equal
- probability but Waldo cannot lock the same box twice in a row);
- this is a typical property of "saddlepoint" strategy.
-
- The following applies to any G_M(p_1, p_2, p_3, ..., p_M).
- To achieve the saddlepoint Monty need simply assure that each box
- available for Waldo to lock has equal probability of containing
- the dollar. Using the numbering of G_M() let R(i,j) be the
- joint probability that Box #i contains the dollar and Box #j
- is opened by Monty in G_M(). Clearly,
- R(i, j) >= 0
- R(i, i) = 0 [ M equations ]
- R(i, M) = 0, i < M [ M-1 equations ]
- sum_over_j R(i,j) = p_i [ M equations ]
- and to achieve constraint [2] (nearly all probabilities equal)
- in G_[M-1],
- R(1, j) = R(k, j) [ M^2 - 5M + 6 equations]
- for 1 < j,k < M and j != k
- R(2, 1) = R(k, 1) [ M-3 equations ]
- for 2 < k < M
- and to make G_[M-1] be independent of Monty's play
- R(M, j)/R(1, j) = R(M, 2)/R(1, 2) [ M-3 equations ]
- for 2 < j < M
- R(M, 2)/R(1, 2) = R(M, 1)/R(2, 1) [ 1 equation ]
-
- Viewing R(i, j) as M*M unknowns, the above gives M*M linear
- equations so there is a simple unique solution:
-
- R(i, j) = (1 - p_M) / (M - 2) - p_j [6]
- for i,j < M and i != j
- R(M, j) = p_M - p_j * p_M / (1 - p_M) [7]
- for j < M
-
- Monty can arrange for this solution if and only if R(i,j) >= 0;
- as seen that is equivalent to
- p_j * (M-2) + p_M <= 1 , from [6]
- p_j + p_M <= 1 , from [7]
-
- When constraint [2] is satisfied (p_M = b and p_j = b or (1+b-Mb) )
- this becomes,
- b <= 1 / (M - 1)
- b >= (M - 3) / (M^2 - 3M + 1) >= 1 / M
- The first inequality was shown above and the second is somewhat
- weaker than constraint [4] ( b >= 1/M ). Hence Monty can transpose
- G_M(b) into G_[M-1]( (1-b) / (M-2) ) whenever b >= 1/M and M >= 3.
-
- And of course
- R(2,1) / sum R(i,1) = (1-b) / (M-2)
- so after the next round the result is always
- G_[M-1]( (1-b) / (M-2) )
-
- It should be easy to argue that this strategy is optimal for Monty,
- but we want to derive Waldo's best strategy anyway and if it
- guarantees the same value we know we're at the "saddlepoint".
- If Waldo knows Monty has a non-optimal strategy he can take
- advantage of it, but we will just derive a strategy good enough
- to achieve the saddle-point value.
-
- In the game G_[M+1](p_1, p_2, ..., p_M, Q) Waldo has locked
- box #(M+1) which has probability Q. After Monty's next move
- Waldo has the choice of (M-1) boxes; he should simply lock
- each box with probability 1/(M-1). That box will contain the
- dollar with mean probability (1-Q)/(M-1). The other (M-2)
- available boxes must have the same average probability (1-Q)/(M-1)
- since the box Waldo just unlocked still has probability Q.
- Thus Waldo can take the view that he has constructed the game
- G_M(Q, r, r, r, ..., r), for r = (1-Q)/(M-1)
- which is precisely the game Monty constructed with his optimal
- strategy!
-
- In the above Waldo set his probability estimates to each be
- p_2 = (p_2 + p_3 + ... + p_[M-1]) / (M - 2)
- p_3 = (p_2 + p_3 + ... + p_[M-1]) / (M - 2)
- et cetera
- If Waldo "knows" more than this, he can pretend he doesn't!
- For example he can ask Monty to secretly shuffle the boxes.
-
- Thus either player can reduce
- G_M(b), b >= 1/M
- to
- G_[M-1]((1-b)/(M-2))
- so this must be the saddlepoint.
- Q.E.D.
-
- Theorem 2)
- If b >= 1/M then
- G_M(b) = 1 - 1/2! + 1/3! - ... - (1-b)(-1)^M/(M-2)!
- = sum (-1)^i/i! - (1-b)(-1)^M/(M-2)! [8]
- where the sum is over i = 1, 2, 3, ..., M-3
-
- Proof)
- The proof is by induction. We know the theorem holds for M = 3
- and we will assume it holds for (M-1). Set
- c = (1-b) / (M-2)
- and note that since b <= 1/(M-1) from [3], we have
- c = (1-b)/(M-2) >= (1 - 1/(M-1)) / (M-2)
- or simply
- c >= 1/(M-1)
- so the condition of the inductive hypothesis is satisfied and
- G_[M-1](c) = 1 - 1/2! + 1/3! - ... - (1-c)(-1)^M/(M-3)!
- But from Theorem 1
- G_M(b) = G_[M-1](c)
- and from the definition of c,
- c/(M-3)! = (1-b)/(M-2)!
- which establishes the theorem.
-
- Theorem 3)
- G_M(1/M) = G_M(1/M, ..., 1/M) = 1 - 1/2! + 1/3! - ... -(-1)^M/M!
-
- Proof)
- This follows directly from Theorem 2 and the observation that
- (1/M)/(M-2)! = 1/(M-1)! - 1/M!
-
- For large M, G_M(1/M) approaches (1 - 1/e). It will be a little bigger
- when M is odd and a little smaller when M is even. I've appended the
- numeric values below.
-
- Obligatory puzzle: What property does this poem have?
-
- Ouija ionic, cameo aeons,
- Usual ennui there.
- Miaou.
-
- James
-
- % dc
- [[Solution for M =]Plb1+pdsb]sy
- 65k1sa1sblyx2sc[la1lc/-dsaplclyx*scla1lc/+dsaplclyx*sclzx]dszx
- Solution for M =2
- 0.50000000000000000000000000000000000000000000000000000000000000000
- Solution for M =3
- 0.66666666666666666666666666666666666666666666666666666666666666666
- Solution for M =4
- 0.62500000000000000000000000000000000000000000000000000000000000000
- Solution for M =5
- 0.63333333333333333333333333333333333333333333333333333333333333333
- Solution for M =6
- 0.63194444444444444444444444444444444444444444444444444444444444445
- Solution for M =7
- 0.63214285714285714285714285714285714285714285714285714285714285714
- Solution for M =8
- 0.63211805555555555555555555555555555555555555555555555555555555556
- Solution for M =9
- 0.63212081128747795414462081128747795414462081128747795414462081129
- Solution for M =10
- 0.63212053571428571428571428571428571428571428571428571428571428572
- . . .
- Solution for M =52
- 0.63212055882855767840447622983853913255418886896823216549216319831
- ^C
-
-