Computing a saddle point

The security plan for $ {{\rm P}_1}$ constitutes a valid solution to the game under the alternating play model. $ {{\rm P}_2}$ has only to choose an optimal response to the plan of $ {{\rm P}_1}$ at each stage. Under the stage-by-stage model, the ``solution'' concept is a saddle point, which occurs when the upper and lower values coincide. The procedure just described could be used to determine the value and corresponding plans; however, what happens when the values do not coincide? In this case, randomized security plans should be developed for the players. As in the case of a single-stage game, a randomized upper value $ \overline{\cal L}^*$ and a randomized lower value $ \underline{\cal L}^*$ are obtained. In the space of randomized plans, it turns out that a saddle point always exists. This implies that the game always has a randomized value, $ {\cal L}^*= \underline{\cal L}^*= \overline{\cal L}^*$. This saddle point can be computed from the bottom up, in a manner similar to the method just used to compute security plans.

Return to the example in Figure 10.13. This game actually has a deterministic saddle point, as indicated previously. It still, however, serves as a useful illustration of the method because any deterministic plan can once again be interpreted as a special case of a randomized plan (all of the probability mass is placed on a single action). Consider the bottom four subtrees of Figure 10.13, which are obtained by using only the last two levels of decision vertices. In each case, $ {{\rm P}_1}$ and $ {{\rm P}_2}$ must act in parallel to end the sequential game. Each subtree can be considered as a matrix game because the costs are immediately obtained after the two players act.

Figure 10.17: Under the stage-by-stage model, the game in Figure 10.13 can instead be represented as a tree in which each player acts once, and then they play a matrix game to determine the cost.

Figure 10.18: Each matrix in Figure 10.17 can be replaced by its randomized value. This clips one level from the original tree. For this particular example, the randomized value is also a deterministic value. Note that these are exactly the costs that appeared in Figure 10.14c. This occurred because each of the matrix games has a deterministic value; if they do not, then the costs will not coincide.
This leads to an alternative way to depict the game in Figure 10.13, which is shown in Figure 10.17. The bottom two layers of decision vertices are replaced by matrix games. Now compute the randomized value for each game and place it at the corresponding leaf vertex, as shown in Figure 10.18. In the example, there are only two layers of decision vertices remaining. This can be represented as the game

$\displaystyle \begin{tabular}{cc} & $V$  $U$ & \begin{tabular}{\vert c\vert c\vert}\hline 0 & 1  \hline 2 & 3  \hline \end{tabular} \end{tabular} ,$ (10.104)

which has a value of $ 1$ and occurs if $ {{\rm P}_1}$ applies $ L$ and $ {{\rm P}_2}$ applies $ R$. Thus, the solution to the original sequential game has been determined by solving matrix games as an alternative to the method applied to obtain the security plans. The benefit of the new method is that if any matrix does not have a deterministic saddle point, its randomized value can instead be computed. A randomized strategy must be played by the players if the corresponding decision vertex is reached during execution.

Steven M LaValle 2012-04-20