14.6.3.4 Computing optimal solutions via dynamic programming

Dynamic programming with interpolation, as covered in Section 14.5, can be applied to solve the problem once it is formulated in terms of the path-constrained phase space ${S} \subset {\mathbb{R}}^2$ . The domain of $\tau$ provides the constraint $0 \leq s \leq 1$ . Assume that only forward progress along the path is needed; moving in the reverse direction should not be necessary. This implies that ${\dot s}> 0$ . To make

bounded, an upper bound, ${\dot s}_{max}$ , is usually assumed, beyond which it is known that the speed is too high to follow the path.

**Figure 14.27:** (a) Planning occurs in the path-constrained phase space. (b) Due to the forward-progress assumption, value iteration can be reduced to a quick wavefront propagation across regularly spaced vertical lines in .
$\begin{figure}\begin{center} \begin{tabular}{ccc} \psfig{file=figs/trajplan.eps,... ...5in} \\ (a) & \hspace*{0.25truein} & (b) \end{tabular} \end{center}\end{figure}$

This results in the planning problem shown in Figure 14.27a. The system is expressed as ${\ddot s}= u'$ , in which $u' \in U'(s,{\dot s})$ . The initial phase in

is $(0,{\dot s}_i)$ and the goal phase is $(1,{\dot s}_g)$ . Typically, ${\dot s}_i = {\dot s}_g = 0$ . The region shown in Figure 14.27 is contained in the first quadrant of the phase space because only positive values of

and ${\dot s}$ are allowed (in Figure 14.13,

and ${\dot q}$ could be positive or negative). This implies that all motions are to the right. The actions determine whether accelerations or decelerations will occur.

Backward value iteration with interpolation can be easily applied by discretizing

and $U'(s,{\dot s})$ . Due to the constraint ${\dot s}> 0$ , making a Dijkstra-like version of the algorithm is straightforward. A simple wavefront propagation can even be performed, starting at

and progressing backward in vertical waves until

is reached. See Figure 14.27b. The backprojection (14.29) can be greatly simplified. Suppose that the

-axis is discretized into

regularly spaced values

, $\ldots$ ,

at every $\Delta s$ , for some fixed $\Delta s > 0$ . Thus, $s_k = (k \Delta s)/m$ . The index

can be interpreted as the stage. Starting at

, the final cost-to-go $G^*_m(s_m,{\dot s}_m)$ is defined as 0 if the corresponding phase represents the goal, and $\infty$ otherwise. At each

, the ${\dot s}$ values are sampled, and the cost-to-go function is represented using one-dimensional linear interpolation along the vertical axis. At each stage, the dynamic programming computation

$\displaystyle G^*_k(s_k,{\dot s}_k) = \min_{u' \in U'(s_k,{\dot s}_k)} \Big\{ l'_d(s_k,{\dot s}_k,u') + G^*_{k+1}(s_{k+1},{\dot s}_{k+1}) \Big\}$

(14.46)

The dynamic programming approach is so general that it can even be extended to path-constrained trajectory planning in the presence of higher order constraints [880]. For example, if a system is specified as $q^{(3)} = h(q,{\dot q},{\ddot q},u)$ , then a 3D path-constrained phase space results, in which each element is expressed as $(s,{\dot s},{\ddot s})$ . The actions in this space are jerks, yielding $s^{(3)} = u'$ for $u' \in U'(s,{\dot s},{\ddot s})$ .