Optimal strategies

Now consider finding the optimal strategy, denoted by $\pi^*$, under the nondeterministic model. The sets $Y(\theta)$ for each $\theta \in \Theta$ must be used to determine which nature actions are possible for each observation, $y \in Y$. Let $\Theta(y)$ denote this set of possible nature actions, which is obtained as

$\displaystyle \Theta(y) = \{\theta \in \Theta \;\vert\; y \in Y(\theta) \} .$

(9.23)

The optimal strategy, $\pi^*$, is defined by setting

$\displaystyle \pi^*(y) = \operatornamewithlimits{argmin}_{u \in U} \Big\{ \max_{\theta \in \Theta(y)} \Big\{ L(u,\theta) \Big\} \Big\},$

(9.24)

for each $y \in Y$. Compare this to (9.14), in which the maximum was taken over all of $\Theta$. The advantage of having the observation, $y$, is that the set is restricted to $\Theta(y) \subseteq \Theta$.
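The computation in (9.23) and (9.24) can be sketched for finite sets as follows. This is only an illustrative sketch: the action names, the sets $Y(\theta)$, and the loss values are hypothetical and were chosen to keep the arithmetic small.

```python
# Hypothetical finite problem; every name and number here is an assumption.
U = ["u1", "u2", "u3"]        # robot actions
THETA = ["t1", "t2", "t3"]    # nature actions

# Y_OF[theta] plays the role of Y(theta): observations possible under theta.
Y_OF = {
    "t1": {"y1"},
    "t2": {"y1", "y2"},
    "t3": {"y2"},
}

# LOSS[(u, theta)] plays the role of L(u, theta).
LOSS = {
    ("u1", "t1"): 0, ("u1", "t2"): 4, ("u1", "t3"): 9,
    ("u2", "t1"): 3, ("u2", "t2"): 2, ("u2", "t3"): 5,
    ("u3", "t1"): 8, ("u3", "t2"): 3, ("u3", "t3"): 1,
}


def consistent_nature_actions(y):
    """Theta(y) as in (9.23): nature actions under which y can be observed."""
    return {theta for theta in THETA if y in Y_OF[theta]}


def worst_case_loss(u, thetas):
    """Largest loss that nature can force for action u within thetas."""
    return max(LOSS[(u, theta)] for theta in thetas)


def optimal_action(y):
    """pi*(y) as in (9.24): minimize the worst-case loss over Theta(y)."""
    thetas = consistent_nature_actions(y)
    if not thetas:
        raise ValueError(f"observation {y!r} is impossible under every theta")
    return min(U, key=lambda u: worst_case_loss(u, thetas))


def optimal_action_without_observation():
    """Worst-case choice when the maximum is taken over all of Theta."""
    return min(U, key=lambda u: worst_case_loss(u, THETA))
```

With these made-up numbers, observing `"y1"` restricts nature to `{"t1", "t2"}` and the selected action guarantees a loss of at most 3, whereas without any observation the best guarantee is 5.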

Under the probabilistic model, an operation analogous to (9.23) must be performed. This involves computing $P(\theta\vert y)$ from $P(y\vert\theta)$ to determine the information that $y$ contains regarding $\theta$. Using Bayes' rule, (9.9), with marginalization in the denominator, the result is

$\displaystyle P(\theta\vert y) = {P(y\vert\theta) P(\theta) \over \displaystyle\strut \sum_{\theta' \in \Theta} P(y\vert\theta') P(\theta')} .$

(9.25)
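As a small numerical illustration of (9.25), the following sketch computes the posterior for a finite $\Theta$. The prior and the observation model are hypothetical values, and exact fractions are used only so that the results are easy to check by hand.

```python
from fractions import Fraction as F

# Hypothetical two-element example; all numbers are illustrative assumptions.
THETA = ["t1", "t2"]
PRIOR = {"t1": F(1, 4), "t2": F(3, 4)}            # P(theta)
LIKELIHOOD = {                                     # LIKELIHOOD[theta][y] = P(y | theta)
    "t1": {"y1": F(4, 5), "y2": F(1, 5)},
    "t2": {"y1": F(1, 3), "y2": F(2, 3)},
}


def posterior(y):
    """P(theta | y) via Bayes' rule, with marginalization in the denominator."""
    joint = {theta: LIKELIHOOD[theta][y] * PRIOR[theta] for theta in THETA}
    evidence = sum(joint.values())                 # P(y), the denominator
    if evidence == 0:
        raise ValueError(f"observation {y!r} has zero probability")
    return {theta: p / evidence for theta, p in joint.items()}
```

For the observation `"y1"`, the numerators are 1/5 and 1/4, the denominator is 9/20, and the posterior is therefore 4/9 for `"t1"` and 5/9 for `"t2"`.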

To see the connection between the nondeterministic and probabilistic cases, define a probability distribution, $P(y\vert\theta)$, that is nonzero if and only if $y \in Y(\theta)$, and use a uniform distribution for $P(\theta)$. In this case, (9.25) assigns nonzero probability to precisely the elements of $\Theta(y)$ as given in (9.23). Thus, (9.25) is just the probabilistic version of (9.23). The optimal strategy, $\pi^*$, is specified for each $y \in Y$ as

$\displaystyle \pi^*(y) = \operatornamewithlimits{argmin}_{u \in U} \Big\{ E_\theta \Big[ L(u,\theta) \;\Big\vert\; y \Big] \Big\} = \operatornamewithlimits{argmin}_{u \in U} \left\{ \sum_{\theta \in \Theta} L(u,\theta) P(\theta\vert y) \right\} .$

(9.26)

This differs from (9.15) and (9.16) by replacing $P(\theta)$ with $P(\theta\vert y)$. For each $y$, the expectation in (9.26) is called the conditional Bayes' risk. The optimal strategy, $\pi^*$, always selects the action that minimizes this risk. Note that $P(\theta\vert y)$ in (9.26) can be expressed using (9.25), for which the denominator represents $P(y)$ and does not depend on $u$; therefore, it does not affect the optimization. Due to this, $P(y\vert\theta) P(\theta)$ can be used in the place of $P(\theta\vert y)$ in (9.26), and the same $\pi^*$ will be obtained. If the spaces are continuous, then probability densities are used in the place of all probability distributions, and the method otherwise remains the same.
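The claim that the normalizing denominator can be dropped is easy to check numerically. The sketch below, assuming finite sets and hypothetical values for the prior, observation model, and losses, evaluates (9.26) once with $P(\theta\vert y)$ and once with the unnormalized weights $P(y\vert\theta) P(\theta)$.

```python
from fractions import Fraction as F

# Hypothetical finite problem; all names and numbers are illustrative assumptions.
U = ["u1", "u2"]
THETA = ["t1", "t2"]
OBSERVATIONS = ["y1", "y2"]
PRIOR = {"t1": F(1, 4), "t2": F(3, 4)}            # P(theta)
LIKELIHOOD = {                                     # LIKELIHOOD[theta][y] = P(y | theta)
    "t1": {"y1": F(4, 5), "y2": F(1, 5)},
    "t2": {"y1": F(1, 3), "y2": F(2, 3)},
}
LOSS = {                                           # LOSS[(u, theta)] = L(u, theta)
    ("u1", "t1"): 0, ("u1", "t2"): 4,
    ("u2", "t1"): 6, ("u2", "t2"): 0,
}


def joint_weights(y):
    """Unnormalized weights P(y | theta) P(theta) for each theta."""
    return {theta: LIKELIHOOD[theta][y] * PRIOR[theta] for theta in THETA}


def posterior(y):
    """P(theta | y): the joint weights divided by their sum, P(y)."""
    joint = joint_weights(y)
    evidence = sum(joint.values())
    return {theta: p / evidence for theta, p in joint.items()}


def weighted_loss(u, weights):
    """Sum over theta of L(u, theta) times the given weight on theta."""
    return sum(LOSS[(u, theta)] * w for theta, w in weights.items())


def optimal_action(y, normalize=True):
    """Minimizer of the weighted loss, using the posterior or the raw joint weights."""
    weights = posterior(y) if normalize else joint_weights(y)
    return min(U, key=lambda u: weighted_loss(u, weights))
```

With `normalize=True` the minimized quantity is the conditional Bayes' risk; with `normalize=False` every candidate value is scaled by the same positive factor $P(y)$, so the selected action is unchanged. In this made-up instance the strategy selects `"u1"` after observing `"y1"` and `"u2"` after observing `"y2"` in both modes.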

Steven M LaValle 2012-04-20