The power of prediction

For the rest of this section, we consider how to live with whatever latency remains. As another thought experiment, imagine that a fortune teller can accurately predict the future. With such an oracle, it should be possible to eliminate all latency problems. We would want to ask the fortune teller the following:

  1. At what future time will the pixels be switching?
  2. What will be the positions and orientations of all virtual world models at that time?
  3. Where will the user be looking at that time?
Let $t_s$ be the answer to the first question. We need to ask the VWG to produce a frame for time $t_s$ and then perform visual rendering for the user's viewpoint at time $t_s$. When the pixels are switched at time $t_s$, the stimulus will be presented to the user at the exact time and place it is expected. In this case, there is zero effective latency.
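As a rough illustration of the first question, the following sketch estimates $t_s$ under assumptions that are not part of the text: the display switches pixels at a fixed refresh period, all times come from one shared clock, and the time needed to produce and render a frame is known. The function name `next_switch_time` and its parameters are hypothetical.

```python
import math

def next_switch_time(now, last_switch, period, pipeline_time):
    """Estimate t_s, the earliest pixel-switching time a new frame can still reach.

    All arguments are in seconds on one shared clock (an assumption):
      now           -- current time
      last_switch   -- time of the most recent pixel switch
      period        -- fixed interval between pixel switches
      pipeline_time -- assumed time to produce and render one frame
    """
    if period <= 0:
        raise ValueError("period must be positive")
    earliest_ready = now + pipeline_time
    # Whole refresh periods after last_switch needed to reach earliest_ready;
    # at least one, because last_switch itself has already passed.
    k = max(1, math.ceil((earliest_ready - last_switch) / period))
    return last_switch + k * period

# Illustrative values only: 10 ms refresh period, 5 ms pipeline time.
now = 0.003
t_s = next_switch_time(now, last_switch=0.0, period=0.010, pipeline_time=0.005)
prediction_horizon = t_s - now  # how far ahead the frame and viewpoint must be predicted
```

The difference `t_s - now` is the interval over which the world state and the viewpoint would have to be predicted.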

Now consider what happens in practice. First, note that using information from all three questions above requires significant time synchronization across the VR system: all operations must have access to a common clock. For the first question, determining $t_s$ should be feasible if the computer is powerful enough and the operating system gives the VR system enough control to ensure that VWG frames are consistently produced and rendered at the frame rate.

The second question is easy for a static virtual world. In a dynamic world, it might be straightforward for all bodies that move according to predictable physical laws. However, it is difficult to predict what humans will do in the virtual world, which complicates the answers to both the second and third questions.

Fortunately, the latency is so small that momentum and inertia play a significant role; see Chapter 8. Bodies in the matched zone follow physical laws of motion from the real world, and their motions are sensed and tracked according to methods covered in Chapter 9. Although it might be hard to predict where you will be looking in $5$ seconds, it is possible to predict with very high accuracy where your head will be positioned and oriented in $20$ ms. You have no free will on the scale of $20$ ms! Instead, momentum dominates and the head motion can be accurately predicted. Some body parts, especially fingers, have much less inertia and are therefore more difficult to predict; however, predicting them is less important than predicting head motion. The viewpoint depends only on the head motion, and latency reduction is most critical in this case to avoid perceptual problems that lead to fatigue and VR sickness.
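To make the idea of momentum-based prediction concrete, here is a minimal sketch that extrapolates head orientation a few milliseconds ahead. It assumes the angular velocity stays constant over the prediction interval, which is one simple model and not necessarily the method used by any particular system. The names `quat_multiply` and `predict_orientation`, the `(w, x, y, z)` quaternion layout, and the choice of expressing angular velocity in the world frame are all assumptions made for illustration.

```python
import math

def quat_multiply(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def predict_orientation(q, omega, dt):
    """Extrapolate orientation q by dt seconds at constant angular velocity.

    q     -- current unit quaternion (w, x, y, z)
    omega -- angular velocity in rad/s, world frame (assumption)
    dt    -- prediction interval in seconds, e.g. 0.020
    """
    speed = math.sqrt(sum(w * w for w in omega))
    angle = speed * dt
    if angle < 1e-12:
        return q  # effectively no rotation over this interval
    axis = tuple(w / speed for w in omega)
    s = math.sin(angle / 2.0)
    dq = (math.cos(angle / 2.0), axis[0] * s, axis[1] * s, axis[2] * s)
    # World-frame rotation, so the increment multiplies on the left.
    return quat_multiply(dq, q)

# Illustrative values only: head turning at 180 degrees per second about
# the y axis, predicted 20 ms ahead from the identity orientation.
q_now = (1.0, 0.0, 0.0, 0.0)
q_pred = predict_orientation(q_now, (0.0, math.pi, 0.0), 0.020)
```

Over such a short interval the constant-velocity assumption leans on the inertia argument above; a longer horizon would need a richer motion model.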

Steven M LaValle 2020-01-06