The policy iteration method may alternatively be applied to the probabilistic discounted-cost problem. Recall the method given in Figure 10.4. The general approach remains the same: A search is conducted over the space of plans, and a linear system of equations is solved in each iteration to evaluate the current plan. In Step 2, (10.53) is replaced by

$$G_\pi(x) = l(x,\pi(x)) + \alpha \sum_{x' \in X} G_\pi(x') \, P(x' \mid x,\pi(x)),$$

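which is a special form of (10.76) for evaluating a fixed plan. In Step 3, (10.54) is replaced by

$$\pi'(x) = \operatorname*{argmin}_{u \in U(x)} \Big\{ l(x,u) + \alpha \sum_{x' \in X} G_\pi(x') \, P(x' \mid x,u) \Big\}.$$
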
Using these alterations, the policy iteration algorithm proceeds in the same way as in Section 10.2.2.
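
The two altered steps can be sketched in code. The following is a minimal illustration, not the book's implementation. It assumes a finite state space with states and actions indexed by integers, a hypothetical table `cost[x][u]` for the stage cost, a hypothetical table `P[x][u][x2]` for the transition probabilities, and a discount factor `alpha` strictly between 0 and 1. All function names are illustrative. Keeping the current action on ties is a detail added here so that the loop terminates cleanly.

```python
def solve_linear(A, b):
    """Solve A y = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [[float(v) for v in A[i]] + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0.0:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def lookahead(x, u, G, cost, P, alpha):
    """One-stage cost of action u at state x plus discounted expected cost-to-go."""
    return cost[x][u] + alpha * sum(p * g for p, g in zip(P[x][u], G))


def evaluate_plan(plan, cost, P, alpha):
    """Step 2 (altered): solve (I - alpha * P_pi) G = l_pi for a fixed plan."""
    n = len(plan)
    A = [[(1.0 if i == j else 0.0) - alpha * P[i][plan[i]][j] for j in range(n)]
         for i in range(n)]
    b = [cost[i][plan[i]] for i in range(n)]
    return solve_linear(A, b)


def improve_plan(plan, G, cost, P, alpha, tol=1e-12):
    """Step 3 (altered): choose the action minimizing the discounted lookahead."""
    new_plan = []
    for x in range(len(plan)):
        best = min(range(len(cost[x])),
                   key=lambda u: lookahead(x, u, G, cost, P, alpha))
        current = lookahead(x, plan[x], G, cost, P, alpha)
        # Assumption: switch only on strict improvement, so ties keep the old action.
        if lookahead(x, best, G, cost, P, alpha) < current - tol:
            new_plan.append(best)
        else:
            new_plan.append(plan[x])
    return new_plan


def policy_iteration(cost, P, alpha):
    """Alternate evaluation and improvement until the plan stops changing."""
    plan = [0] * len(cost)  # arbitrary initial plan
    while True:
        G = evaluate_plan(plan, cost, P, alpha)
        new_plan = improve_plan(plan, G, cost, P, alpha)
        if new_plan == plan:
            return plan, G
        plan = new_plan


# Hypothetical two-state example (made-up numbers, for illustration only).
cost = [[1.0, 2.0],   # state 0: action 0 stays put, action 1 tries to move
        [0.0, 5.0]]   # state 1: action 0 stays for free, action 1 returns to 0
P = [[[1.0, 0.0], [0.2, 0.8]],
     [[0.0, 1.0], [1.0, 0.0]]]
plan, G = policy_iteration(cost, P, alpha=0.9)
print(plan, G)  # plan is [1, 0]; G[0] is 2 / 0.82, G[1] is 0
```

Because `alpha` is less than 1 and each row of the transition matrix sums to 1, the matrix in `evaluate_plan` is nonsingular, so the linear system has a unique solution for every plan. For large state spaces a library solver would normally replace the hand-written elimination.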

Steven M LaValle 2012-04-20