A saddle point will be obtained once again by defining security strategies for each player. Each player treats the other as nature, and if the same worst-case value is obtained, then the result is a saddle point for the game. If the values are different, then a randomized plan is needed to close the gap between the upper and lower values.

Upper and lower values now depend on the initial state, $x_1 \in X$. There was no equivalent for this in Section 10.5.1 because the root of the game tree is the only possible starting point.

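If sequences, $\tilde{u}_K$ and $\tilde{v}_K$, of actions are applied from $x_1$, then the state history, $\tilde{x}_F$, can be derived by repeatedly using the state transition function, $f$. The *upper value* from $x_1$ is defined as

$$\overline{L}^*(x_1) = \min_{u_1} \max_{v_1} \min_{u_2} \max_{v_2} \cdots \min_{u_K} \max_{v_K} \Big\{ L(\tilde{x}_F, \tilde{u}_K, \tilde{v}_K) \Big\}, \tag{10.108}$$

which is identical to (10.33) if $P_2$ is replaced by nature. Also, (10.108) generalizes (9.44) to multiple stages. The *lower value* from $x_1$, which results from the security strategy of $P_2$, is

$$\underline{L}^*(x_1) = \max_{v_1} \min_{u_1} \max_{v_2} \min_{u_2} \cdots \max_{v_K} \min_{u_K} \Big\{ L(\tilde{x}_F, \tilde{u}_K, \tilde{v}_K) \Big\}.$$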

If $\overline{L}^*(x_1) = \underline{L}^*(x_1)$, then a deterministic saddle point exists from $x_1$. This implies that the order of $\max$ and $\min$ can be swapped inside of every stage.
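To make the dependence on the initial state concrete, the following is a minimal brute-force sketch in Python, not an efficient algorithm from the text. It evaluates the nested min/max expressions directly by recursing over action histories. The function names (`game_value`, `upper_value`, `lower_value`) and the toy game (integer states, transition $x + u - v$, cost equal to the absolute value of the final state) are assumptions made only for illustration.

```python
from typing import Callable, Iterable, Tuple

History = Tuple[int, ...]


def game_value(
    x1: int,
    K: int,
    U: Callable[[int], Iterable[int]],
    V: Callable[[int], Iterable[int]],
    f: Callable[[int, int, int], int],
    L: Callable[[History, History, History], float],
    p1_first: bool,
) -> float:
    """Evaluate K nested stages of min/max by exhaustive recursion.

    p1_first=True  gives min over u_k outside max over v_k (upper value).
    p1_first=False gives max over v_k outside min over u_k (lower value).
    """

    def rec(xs: History, us: History, vs: History) -> float:
        if len(us) == K:
            # All K stages are decided; xs holds x_1, ..., x_F.
            return L(xs, us, vs)
        x = xs[-1]

        def outcome(u: int, v: int) -> float:
            # Apply the state transition function and continue.
            return rec(xs + (f(x, u, v),), us + (u,), vs + (v,))

        if p1_first:
            return min(max(outcome(u, v) for v in V(x)) for u in U(x))
        return max(min(outcome(u, v) for u in U(x)) for v in V(x))

    return rec((x1,), (), ())


def upper_value(x1, K, U, V, f, L):
    return game_value(x1, K, U, V, f, L, p1_first=True)


def lower_value(x1, K, U, V, f, L):
    return game_value(x1, K, U, V, f, L, p1_first=False)


# Hypothetical toy game, used only for illustration.
def toy_U(x):  # actions available to P1 (the minimizer)
    return (0, 1)


def toy_V(x):  # actions available to P2 (the maximizer)
    return (0, 1)


def toy_f(x, u, v):  # state transition function
    return x + u - v


def toy_L(xs, us, vs):  # cost depends only on the final state
    return abs(xs[-1])


if __name__ == "__main__":
    for x1 in (0, 1):
        up = upper_value(x1, 1, toy_U, toy_V, toy_f, toy_L)
        lo = lower_value(x1, 1, toy_U, toy_V, toy_f, toy_L)
        print(x1, up, lo, "saddle point" if up == lo else "gap")
```

In this toy game with a single stage, the upper and lower values from $x_1 = 0$ are 1 and 0, so there is a gap and no deterministic saddle point. From $x_1 = 1$ both values equal 1, so a deterministic saddle point exists. The running time grows exponentially with $K$, which is why this sketch is suitable only for checking the definitions on small examples.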

Steven M LaValle 2012-04-20