As in Section 9.2, outputs from sensors are combined over time by a filtering method to maintain the estimate. In the current setting, the pose can be maintained by combining both visibility information and outputs of an IMU. For the orientation component of the pose, the complementary filter from (9.10) could be used. The camera provides an additional source for detecting orientation drift error. The camera optical axis is a straightforward reference for yaw error estimation detection, which makes it a clear replacement for the magnetometer. If the camera tilt is known, then the camera can also provide accurate tilt error estimation.

The IMU was crucial for obtaining highly accurate orientation tracking because of accurate, high-frequency estimates of angular velocity provided by the gyroscope. If the frame rate for a camera or lighthouse system is very high, then sufficient sensor data may exist for accurate position tracking; however, it is preferable to directly measure derivatives. Unfortunately, IMUs do not measure linear velocity. However, the output of the linear accelerometer could be used as suggested in the beginning of this section. Suppose that the accelerometer estimates the body acceleration as

$\displaystyle \hat{a}[k] = (\hat{a}_x[k],\hat{a}_y[k],\hat{a}_z[k])$ (9.24)

in the world frame (this assumes the gravity component has been subtracted from the accelerometer output).

By numerical integration, the velocity $ \hat{v}[k]$ can be estimated from $ \hat{a}[k]$. The position $ \hat{p}[k]$ is estimated by integrating the velocity estimate. The update equations using simple Euler integration are

\begin{displaymath}\begin{array}{l} \hat{v}[k] = \hat{a}[k] \Delta t + \hat{v}[k...
... \hat{p}[k] = \hat{v}[k] \Delta t + \hat{p}[k-1] . \end{array}\end{displaymath} (9.25)

Note that each equation actually handles three components, $ x$, $ y$, and $ z$, at the same time. The accuracy of the second equation can be further improved by adding $ \frac{1}{2} \hat{a}[k] \Delta t^2$ to the right side.

As stated earlier, double integration of the acceleration leads to rapidly growing position drift error, denoted by $ \hat{d}_p[k]$ . The error detected from PnP solutions provide an estimate of $ \hat{d}_p[k]$, but perhaps at a much lower rate than the IMU produces observations. For example, a camera might take pictures at $ 60$ FPS and the IMU might report accelerations at $ 1000$ FPS.

The complementary filter from (9.10) can be extended to the case of double integration to obtain

\begin{displaymath}\begin{array}{l} p_c[k] = \hat{p}[k] - \alpha_p \hat{d}_p[k]  v_c[k] = \hat{v}[k] - \alpha_v \hat{d}_p[k] . \end{array}\end{displaymath} (9.26)

Above, $ p_c[k]$ and $ v_c[k]$ are the corrected position and velocity, respectively, which are each calculated by a complementary filter. The estimates $ \hat{p}[k]$ and $ \hat{v}[k]$ are calculated using (9.25). The parameters $ \alpha_p$ and $ \alpha_v$ control the amount of importance given to the drift error estimate in comparison to IMU updates.

Equation (9.26) is actually equivalent to a Kalman filter, which is the optimal filter (providing the most accurate estimates possible) for the case of a linear dynamical system with Gaussian noise, and sensors that also suffer from Gaussian noise. Let $ \omega_d^2$ represent the variance of the estimated Gaussian noise in the dynamical system, and let $ \omega_s^2$ represent the sensor noise variance. The complementary filter (9.26) is equivalent to the Kalman filter if the parameters are chosen as $ \alpha_p = \sqrt{2 \omega_d / \omega_s}$ and $ \alpha_v = \omega_d / \omega_s$ [120]. A large variety of alternative filtering methods exist; however, the impact of using different filtering methods is usually small relative to calibration, sensor error models, and dynamical system models that are particular to the setup. Furthermore, the performance requirements are mainly perceptually based, which could be different than the classical criteria around which filtering methods were designed [169].

Once the filter is running, its pose estimates can be used to aid the PnP problem. The PnP problem can be solved incrementally by perturbing the pose estimated by the filter, using the most recent accelerometer outputs, so that the observed features are perfectly matched. Small adjustments can be made to the pose so that the sum-of-squares error is reduced to an acceptable level. In most case, this improves reliability when there are so few features visible that the PnP problem has ambiguous solutions. Without determining the pose incrementally, a catastrophic jump to another PnP solution might occur.

Steven M LaValle 2020-01-06