9.2.2 Nondeterministic vs. Probabilistic Models

What is the best decision for the robot, given that it is engaged in a game against nature? This depends on what information the robot has regarding how nature chooses its actions. It will always be assumed that the robot does not know the precise nature action to be chosen; otherwise, it is pointless to define nature. Two alternative models that the robot can use for nature will be considered. From the robot's perspective, the possible models are

  1. [] Nondeterministic: I have no idea what nature will do.
  2. [] Probabilistic: I have been observing nature and gathering statistics.
Under both models, it is assumed that the robot knows $ \Theta$ in Formulation 9.3 or $ \Theta(u)$ for all $ u \in U$ in Formulation 9.4. The nondeterministic and probabilistic terminology are borrowed from Erdmann [313]. In some literature, the term possibilistic is used instead of nondeterministic. This is an excellent term, but it is unfortunately too similar to probabilistic in English.

Assume first that Formulation 9.3 is used and that $ U$ and $ \Theta$ are finite. Under the nondeterministic model, there is no additional information. One reasonable approach in this case is to make a decision by assuming the worst. It can even be imagined that nature knows what action the robot will take, and it will spitefully choose a nature action that drives the cost as high as possible. This pessimistic view is sometimes humorously referred to as Murphy's Law (``If anything can go wrong, it will.'') [111] or Sod's Law. In this case, the best action, $ u^* \in U$, is selected as

$\displaystyle u^* = \operatornamewithlimits{argmin}_{u \in U} \Big\{ \max_{\theta \in \Theta} \Big\{ L(u,\theta) \Big\} \Big\}.$ (9.14)

The action $ u^*$ is the lowest cost choice using worst-case analysis. This is sometimes referred to as a minimax solution because of the $ \min$ and $ \max$ in (9.14). If $ U$ or $ \Theta$ is infinite, then the $ \min$ or $ \max$ may not exist and should be replaced by $ \inf$ or $ \sup$, respectively.

Worst-case analysis may seem too pessimistic in some applications. Perhaps the assumption that all actions in $ \Theta$ are equally likely may be preferable. This can be handled as a special case of the probabilistic model, which is described next.

Under the probabilistic model, it is assumed that the robot has gathered enough data to reliably estimate $ P(\theta)$ (or $ p(\theta)$ if $ \Theta$ is continuous). In this case, it is imagined that nature applies a randomized strategy, as defined in Section 9.1.3. It assumed that the applied nature actions have been observed over many trials, and in the future they will continue to be chosen in the same manner, as predicted by the distribution $ P(\theta)$. Instead of worst-case analysis, expected-case analysis is used. This optimizes the average cost to be received over numerous independent trials. In this case, the best action, $ u^* \in U$, is

$\displaystyle u^* = \operatornamewithlimits{argmin}_{u \in U} \Big\{ E_\theta \Big[ L(u,\theta) \Big] \Big\},$ (9.15)

in which $ E_\theta$ indicates that the expectation is taken according to the probability distribution (or density) over $ \theta $. Since $ \Theta$ and $ P(\theta)$ together form a probability space, $ L(u,\theta)$ can be considered as a random variable for each value of $ u$ (it assigns a real value to each element of the sample space).9.3 Using $ P(\theta)$, the expectation in (9.15) can be expressed as

$\displaystyle E_\theta[ L(u,\theta) ] = \sum_{\theta \in \Theta} L(u,\theta) P(\theta) .$ (9.16)

Example 9..9 (Nondeterministic vs. Probabilistic)   Return to Example 9.8. Let $ U = \{u_1,u_2,u_3\}$ represent the robot actions, and let $ \Theta =
\{\theta_1,\theta_2,\theta_3\}$ represent the nature actions.

Under the nondeterministic model of nature, $ u^* = u_1$, which results in $ L(u^*,\theta) = 1$ in the worst case using (9.14). Under the probabilistic model, let $ P(\theta_1) = 1/5$, $ P(\theta_2) =
1/5$, and $ P(\theta_3) = 3/5$. To find the optimal action, (9.15) can be used. This involves computing the expected cost for each action:

\begin{displaymath}\begin{split}E_\theta[L(u_1,\theta)] & = (1)1/5 + (-1)1/5 + (...
...(u_3,\theta)] & = (2)1/5 + (-1)1/5 + (1)3/5 = 4/5 . \end{split}\end{displaymath} (9.17)

The best action is $ u^* = u_2$, which produces the lowest expected cost, $ -1$.

If the probability distribution had instead been $ P = [1/10 4/5\
1/10]$, then $ u^* = u_1$ would have been obtained. Hence the best decision depends on $ P(\theta)$; if this information is statistically valid, then it enables more informed decisions to be made. If such information is not available, then the nondeterministic model may be more suitable.

It is possible, however, to assign $ P(\theta)$ as a uniform distribution in the absence of data. This means that all nature actions are equally likely; however, conclusions based on this are dangerous; see Section 9.5. $ \blacksquare$

In Formulation 9.4, the nature action space $ \Theta(u)$ depends on $ u \in U$, the robot action. Under the nondeterministic model, (9.14) simply becomes

$\displaystyle u^* = \operatornamewithlimits{argmin}_{u \in U} \Big\{ \max_{\theta \in \Theta(u)} L(u,\theta) \Big\}.$ (9.18)

Unfortunately, these problems do not have a nice matrix representation because the size of $ \Theta(u)$ can vary for different $ u \in U$. In the probabilistic case, $ P(\theta)$ is replaced by a conditional probability distribution $ P(\theta \vert u)$. Estimating this distribution requires observing numerous independent trials for each possible $ u \in U$. The behavior of nature can now depend on the robot action; however, nature is still characterized by a randomized strategy. It does not adapt its strategy across multiple trials. The expectation in (9.16) now becomes

$\displaystyle E_\theta\Big[ L(u,\theta) \Big] = \sum_{\theta \in \Theta(u)} L(u,\theta) P(\theta\vert u) ,$ (9.19)

which replaces $ P(\theta)$ by $ P(\theta \vert u)$.

Steven M LaValle 2012-04-20