Since this chapter considers sequential versions of single-stage decision problems, the suggested reading at the end of Chapter 9 is also relevant here. The probabilistic formulation in Section 10.1 is a basic problem of stochastic control theory [95,564]. The framework is also popular in artificial intelligence [79,267,471,839]. For an early, influential work on stochastic control, see [109], in which the notion of sequential games against nature is developed. The forward projection and backprojection topics are not as common in control theory and are instead inspired from [281,313,659]. The nondeterministic formulation is obtained by eliminating probabilities from the formulation; worst-case analysis also appears extensively in control theory [57,58,301]. A case for using randomized strategies in robotics is made in [314].

Section 10.2 is based on classical dynamic
programming work, but with emphasis on the *stochastic
shortest-path problem*. For more reading on value and policy
iteration in this context, see [95]. Section
10.2.3 is based on extending Dijkstra's algorithm. For
convergence issues due to approximations of continuous problems, see
[92,567,720]. For complexity results for games against
nature, see [764,767].

Section 10.3 was inspired by coverage in [95]. For further reading on reinforcement learning, the subject of Section 10.4, see [19,74,97,895].

Section 10.5 was based on material in [59], but
with an emphasis on unifying concepts from previous sections. Also
contained in [59] are sequential game formulations on
continuous spaces and even in continuous time. In continuous time,
these are called *differential games*, and they are introduced in
Section 13.5. Dynamic programming principles extend
nicely into game theory. Furthermore, they extend to
Pareto optimality [242].

The main purpose of Section 10.6 is to return to motion planning by considering continuous state spaces. Few works exist on combining stochastic optimal control with motion planning. The presented material is based mainly on [599,605,613,867,868].

Steven M LaValle 2012-04-20