We briefly discuss the dynamics of the Structure from Motion (SfM) problem. As shown earlier, it is often the case (e.g. in cinematographic post-production, robotics, etc.) that cameras do not teleport around the scene and objects do not move about too suddenly. These bodies are governed by physical dynamics, so it makes sense to constrain the possible configurations of the camera to change smoothly over a causal time sequence. For instance, we consider the typical dynamic system:
$$
\begin{aligned}
x_{t+1} &= \Phi\, x_t + \xi_t, & \xi_t &\sim \mathcal{N}(0, Q) \\
y_t &= h(x_t) + \eta_t, & \eta_t &\sim \mathcal{N}(0, R_t)
\end{aligned}
\tag{4}
$$
Here, the observations are the 2D features (in u,v coordinates), which are concatenated into an observation vector $y_t$ for each moment in time. The observations are caused by the internal state of the system $x_t$, which contains the scene's 3D structure, the relative 3D motion between the camera and the scene, and the camera's internal geometry. The mapping from $x_t$ to $y_t$, written here as the measurement function $h$, is tricky in SfM since it is nonlinear (its local linearization varies with $x_t$) and is also corrupted by noise. Here, the noise $\eta_t$ is represented as an additive Gaussian (normal) process with zero mean and time-varying covariance $R_t$. The matrix $R_t$ probabilistically encodes the accuracy of the measured 2D feature coordinates, and it can represent features that are missing in certain frames when large variances are imputed into $R_t$ appropriately.
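To make the role of $R_t$ concrete, the following is a minimal sketch of how missing features could be encoded through large variances. It assumes a diagonal $R_t$ with features stacked as (u1, v1, u2, v2, ...). The function names, the pixel standard deviations and the one-dimensional gain formula are illustrative assumptions, not the implementation used in this work.

```python
def measurement_variances(visible, sigma_px=0.5, missing_sigma_px=1e3):
    """Return the diagonal of R_t for features stacked as (u1, v1, u2, v2, ...).

    visible: one boolean per feature, False when it was not tracked in this frame.
    sigma_px, missing_sigma_px: assumed standard deviations (hypothetical values).
    """
    diag = []
    for seen in visible:
        sigma = sigma_px if seen else missing_sigma_px
        diag.extend([sigma ** 2, sigma ** 2])  # one entry for u, one for v
    return diag


def scalar_kalman_gain(prior_var, meas_var):
    """Gain of a one-dimensional Kalman update with a direct measurement."""
    return prior_var / (prior_var + meas_var)


# Three features, the second one missing in this frame.
r_diag = measurement_variances([True, False, True])
# With unit prior variance, the gain shows how much each coordinate is trusted.
gains = [scalar_kalman_gain(1.0, r) for r in r_diag]
```

In this sketch the gain for the missing feature is close to zero, so its (arbitrary) coordinates barely influence the state estimate, while tracked features keep a substantial gain.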
In addition, the dynamics of the internal state are constrained. The 3D structure, 3D motion and camera geometry do not vary wildly but depend linearly (via the transition matrix $\Phi$) on their values at the previous time step, plus Gaussian noise. This noise process is additive with zero mean and covariance $Q$. For generality, we assume that the motion of the camera through the scene is not known a priori, and thus $\Phi$ is set to identity. Therefore, the internal state varies only through a Gaussian random noise process. This can be seen as a 'random walk' type of internal state space. In other words, the state vector $x_t$ varies randomly but smoothly, with small deltas from its past values.
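The random-walk behaviour can be illustrated with a short simulation. This is only a sketch under stated assumptions: the transition matrix is the identity, the noise covariance is taken to be isotropic ($Q = \sigma^2 I$, with `q_std` playing the role of $\sigma$), and the three-component initial state is an arbitrary placeholder rather than the actual SfM state vector.

```python
import random


def random_walk(x0, q_std, steps, seed=0):
    """Simulate x_{t+1} = x_t + noise, i.e. identity dynamics plus Gaussian noise.

    x0: initial state as a list of floats (placeholder values).
    q_std: assumed per-component noise standard deviation (Q = q_std**2 * I).
    """
    rng = random.Random(seed)  # seeded for reproducibility
    states = [list(x0)]
    for _ in range(steps):
        prev = states[-1]
        # The transition is the identity, so only the noise moves the state.
        states.append([xi + rng.gauss(0.0, q_std) for xi in prev])
    return states


trajectory = random_walk([0.0, 0.0, 5.0], q_std=0.01, steps=100)
# Largest single-step change of any component along the trajectory.
max_delta = max(
    abs(b - a)
    for prev, curr in zip(trajectory, trajectory[1:])
    for a, b in zip(prev, curr)
)
```

Each step changes the state by a small random amount, so the trajectory drifts over time but never jumps, which is the smoothness constraint described above.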
This dynamic system encodes the causal and dynamic nature of the SfM problem and allows an elegant integration of multiple frames from image sequences. It is also a probabilistic framework for representing uncertainty. These dynamical systems have been extensively studied and are routinely solved via reliable Kalman Filtering (KF) techniques. In our nonlinear case, an Extended Kalman Filter (EKF) is utilized, which linearizes the measurement function around the current state estimate at each time step.
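As a rough illustration of the EKF cycle, the sketch below runs a scalar filter on a toy problem: the state is the depth of a single point with a known lateral offset, and the measurement is its perspective projection. The toy model, the function names and the noise values are assumptions chosen for readability, and the observations are noise-free; the actual SfM state is far richer.

```python
def project(depth, lateral=1.0, focal=1.0):
    """Nonlinear measurement: image coordinate of a point at the given depth."""
    return focal * lateral / depth


def project_jacobian(depth, lateral=1.0, focal=1.0):
    """Derivative of project() with respect to depth (the EKF linearization)."""
    return -focal * lateral / depth ** 2


def ekf_step(x, p, y, q, r):
    """One predict/update cycle for a scalar state with identity dynamics."""
    # Predict: identity dynamics keep the mean; the variance grows by q.
    x_pred = x
    p_pred = p + q
    # Linearize the measurement function around the predicted state.
    h_jac = project_jacobian(x_pred)
    # Update using the innovation y - h(x_pred).
    s = h_jac * p_pred * h_jac + r      # innovation variance
    k = p_pred * h_jac / s              # Kalman gain
    x_new = x_pred + k * (y - project(x_pred))
    p_new = (1.0 - k * h_jac) * p_pred
    return x_new, p_new


true_depth = 4.0          # hypothetical ground truth
x_est, p_est = 2.0, 1.0   # deliberately poor initial guess
for _ in range(40):
    y_obs = project(true_depth)  # noise-free observation, for clarity
    x_est, p_est = ekf_step(x_est, p_est, y_obs, q=1e-4, r=1e-6)
```

Because the Jacobian is re-evaluated at every step, the filter keeps refining its linearization as the depth estimate approaches the true value.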
The representation of the measurement vector is simply the concatenation of the 2D feature point measurements. We now turn our attention to the representation of the internal state of the unknowns of the system: the 3D structure, 3D motion and internal camera geometry. This step is critical since the effectiveness of the Kalman filtering framework depends strongly on the representation.