|  | B) Two players, P and T, independently select a number between 0 and 9
inclusive.   If they agree to within �1, then P gains, while T loses.
If they disagree then T wins, while P loses.  What is the winning strategy?
P's optimal strategy is to choose from {1,4,5,8} with equal probabilities,
while T's optimal strategy is to randomly choose from {0,3,6,9}.  P's strategy
ensures P will win at least 1/4 of the time, and T's strategy ensures T a win
3/4 of the time.
To see this, form the payoff matrix, and note that P always finds 1 better
than 0, and 8 better than 9, while T prefers 0 over 1, and 9 over 8.  Crossing
out these columns and rows yields a smaller payoff matrix.  Continuing in
this way, the payoff matrix (for P) is eventually reduced to:
	    t_0 t_3 t_6 t_9
	p_1  +1  -1  -1  -1
	p_4  -1  +1  -1  -1
	p_5  -1  -1  +1  -1
	p_8  -1  -1  -1  +1
At this point, the problem yields to solution by a symmetry argument.
 | 
|  | re .2, .-1
P chooses i with probability p_i, and T chooses j with probability t_j.
Here's the payoff matrix; the value (to P) for each possible combination:
	                      T
	     (0) (1) (2) (3) (4) (5) (6) (7) (8) (9)
	(0)  +1  +1  -1  -1  -1  -1  -1  -1  -1  -1
	(1)  +1  +1  +1  -1  -1  -1  -1  -1  -1  -1
	(2)  -1  +1  +1  +1  -1  -1  -1  -1  -1  -1
	(3)  -1  -1  +1  +1  +1  -1  -1  -1  -1  -1
    P	(4)  -1  -1  -1  +1  +1  +1  -1  -1  -1  -1
	(5)  -1  -1  -1  -1  +1  +1  +1  -1  -1  -1
	(6)  -1  -1  -1  -1  -1  +1  +1  +1  -1  -1
	(7)  -1  -1  -1  -1  -1  -1  +1  +1  +1  -1
	(8)  -1  -1  -1  -1  -1  -1  -1  +1  +1  +1
	(9)  -1  -1  -1  -1  -1  -1  -1  -1  +1  +1
On each play, P expects to win Sum p_i * t_j * M[i,j].
                               i,j
Suppose P has the strategy {p_0, p_1, ..., p_9}.  If p_0 is non-zero, P can
do better (or at least as well) with the strategy {0, p_0+p_1, p_2, ..., p_9}
(notice that M[1,j] >= M[0,j] for all j).  Thus, we see that P's optimal
strategy has p_0 = 0.  Similarly, p_9 is zero, because P should prefer playing
8 rather than playing 9.  Because P (being so smart) will never play 0 or 9,
we can remove those two rows from the payoff matrix.
Also, T always prefers playing 9 rather than 8, and 0 over 1 (remember that
T wants to *minimize* the payoff, and note that M[i,0] <= M[i,1] for all i).
So we may also remove those two columns, viz:
	                  T
	     (0) (2) (3) (4) (5) (6) (7) (9)
	(1)  +1  +1  -1  -1  -1  -1  -1  -1
	(2)  -1  +1  +1  -1  -1  -1  -1  -1
	(3)  -1  +1  +1  +1  -1  -1  -1  -1
    P	(4)  -1  -1  +1  +1  +1  -1  -1  -1
	(5)  -1  -1  -1  +1  +1  +1  -1  -1
	(6)  -1  -1  -1  -1  +1  +1  +1  -1
	(7)  -1  -1  -1  -1  -1  +1  +1  -1
	(8)  -1  -1  -1  -1  -1  -1  +1  +1
P's optimal strategy must be prepared to compete with T's optimal strategy,
so P considers this new payoff matrix, and sees that M[3,j] >= M[2,j] and
M[6,j] >= M[7,j], so P can wisely decide to never play 2 or 7.
For T's part, he sees that M[i,0] <= M[i,2], and M[i,9] <= M[i,7], so T
wisely decides to never play 2 or 7.  This reduces the payoff matrix to:
	              T
	     (0) (3) (4) (5) (6) (9)
	(1)  +1  -1  -1  -1  -1  -1
	(3)  -1  +1  +1  -1  -1  -1
    P	(4)  -1  +1  +1  +1  -1  -1
	(5)  -1  -1  +1  +1  +1  -1
	(6)  -1  -1  -1  +1  +1  -1
	(8)  -1  -1  -1  -1  -1  +1
Continuing; because M[4,j] >= M[3,j], P need never consider playing 3;
because M[5,j] >= M[6,j], P never plays 6; because M[i,3] <= M[i,4],
T never plays 4; because M[i,6] <= M[i,5], T never plays 5.  We now have:
	            T
	     (0) (3) (6) (9)
	(1)  +1  -1  -1  -1
    P	(4)  -1  +1  -1  -1
	(5)  -1  -1  +1  -1
	(8)  -1  -1  -1  +1
P now chooses values for p_1, p_4, p_5, and p_8 (which must sum to 1).
P could choose p_1 = p_4 = p_5 = 1/3, and p_8 = 0, but that allows T
the very profitable strategy of always choosing 9.  Instead, P wants to
choose a strategy that maximizes his payoff, regardless of what strategy
T may choose.
P's expectation E = Sum p_i * t_j * M[i,j], which can be rewritten as:
                    i,j
E = -1 + 2*( p_1 * t_0 + p_4 * t_3 + p_5 * t_6 + p_8 * t_9 )
Suppose that the above t_j values are optimal (for T).  This means that
for the p_i values to be optimal, any small adjustment in the t_j values
would cause P to do better; i.e.,
E' = -1 + 2*( p_1 * (t_0+d) + p_4 * (t_3-d) + p_5 * t_6 + p_8 * t_9 ) => E
So, expanding this on both sides, and cancelling, we see that
	2 * p_1 * d - 2 * p_4 * d >= 0  (or t_3 = 0, or t_0 = 1)
Now this same idea holds, even if we adjust t_0 and t_3 the other way:
E' = -1 + 2*( p_1 * (t_0-d) + p_4 * (t_3+d) + p_5 * t_6 + p_8 * t_9 ) => E
So,
	-2 * p_1 * d + 2 * p_4 * d >= 0  (or t_3 = 1, or t_0 = 0)
Thus we see that (unless t_0 or t_3 are at a boundary),
	2 * p_1 * d - 2 * p_4 * d = 0,  so that  p_1 - p_4 = 0
Proceeding in this way, we can determine that p_1 = p_4 = p_5 = p_8.  Since
these sum to 1, p_1 = p_4 = p_5 = p_8 = 1/4.  Similarly for the t_j values
(since T wants to protect himself from a clever P), we have t_0 = t_3 = t_6
= t_9 = 1/4.
 | 
|  | >A) Two players, P and T, independently select a number between 0 and 9 
>inclusive.   If they agree to within �1, they each gain, otherwise they lose.  
>What is the winning strategy?
	Previous discussion focussed on game B), but A) yields to a similar
approach, even though it's not a zero-sum game.   The payoff matrix is:
 	  0123456789
	 +----------
	0|1100000000
	1|1110000000
	2|0111000000
	3|0011100000
	4|0001110000
	5|0000111000
	6|0000011100
	7|0000001110
	8|0000000111
	9|0000000011
	No player will say 0 or 9, since by saying 1 or 8 they cover a strict
superset of possible guesses by the other player.
 	  12345678
	 +--------
	1|11000000
	2|11100000
	3|01110000
	4|00111000
	5|00011100
	6|00001110
	7|00000111
	8|00000011
	But if 0 and 9 are impossible, then 1 is dominated by 2, and 8 by 7.
Continuing this process, for a couple more iterations, the matrix becomes:
	  45
	 +--
	4|11
	5|11
and so the two players, if playing rationally and if they believe that the
other is playing rationally, are certain to win, since they will both say 4 or 
5.
C) A third game we could imagine is the opposite of A).   Both players are 
trying to *avoid* matching or being adjacent to the other player's guess.   
Here, 0 dominates 1, and 2 dominates 8.
 	  0123456789
	 +----------
	0|0011111111
	1|0001111111
	2|1000111111
	3|1100011111
	4|1110001111
	5|1111000111
	6|1111100011
	7|1111110001
	8|1111111000
	9|1111111100
reduces immediately to
 	  02345679
	 +--------
	0|01111111
	2|10011111
	3|10001111
	4|11000111
	5|11100011
	6|11110001
	7|11111001
	9|11111110
	Similarly, 2 dominates 3, and 7 dominates 6:
 	  024579
	 +------
	0|011111
	2|101111
	4|110011
	5|110011
	7|111101
	9|111110
	This is as far as the simplification goes, but now each player can
guarantee himself a win 80% of the time, by picking evenly from 
{0, 2, (4or5), 7, 9}.   In practice, if two players are playing this game
repeatedly, they can each stabilize on a single response, as soon as they
have won the game once, and thus approach 100% wins.   However, if each player
is being told neither his opponents guess, nor indeed whether he won or not, 
80% is the best 'blind' payoff one can get.
Regards,
Andrew.
 |