In standard perspective projection, the mapping from a 3D coordinate onto the image plane is accomplished via the projection Equation 11.
However, we instead use the central projection representation depicted in Figure 7. Here, the coordinate system's origin is fixed at the image plane instead of at the center of projection (COP). In addition, the focal length is parameterized by its inverse, 1/f. This camera model has long been used in the photogrammetry community and has also been adopted by Szeliski and Kang in their nonlinear least squares formulation. The projection equation thus becomes Equation 12.
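Since Equations 11 and 12 are not reproduced in this passage, the following is a reconstruction from the description above rather than a quotation of the original equations. It assumes image coordinates (u, v), camera-frame coordinates (X_C, Y_C, Z_C), and writes the inverse focal length as \beta = 1/f, a symbol chosen here for illustration. With the origin on the image plane and the COP a distance f behind it, similar triangles give:

```latex
% Standard perspective projection: origin at the COP, depth measured from the COP
u = f\,\frac{X_C}{Z_C}, \qquad v = f\,\frac{Y_C}{Z_C}

% Central projection: origin on the image plane, so a point at depth Z_C
% lies at distance Z_C + f from the COP
u = f\,\frac{X_C}{Z_C + f} = \frac{X_C}{1 + \beta Z_C}, \qquad
v = f\,\frac{Y_C}{Z_C + f} = \frac{Y_C}{1 + \beta Z_C}, \qquad \beta = \frac{1}{f}
```

In this form the focal length enters only through the product \beta Z_C in the denominator.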
Note how this projection decouples the camera focal length (f) from the depth of the point (ZC). In the traditional projection of Equation 11, if ZC is fixed and f is altered, the imaging geometry remains the same while only the scale of the image changes. In other words, the cone of perspective rays remains fixed while the focal plane translates along the optical (Z) axis. In the standard projection model, the imaging geometry (i.e. the perspective rays) is altered only by varying the depth ZC. Thus, f acts only as a scaling factor, and both the imaging geometry and the depth are encoded in ZC.
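The scaling argument can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and the two projection formulas are the textbook forms assumed to correspond to Equations 11 and 12 (depth measured from the COP in the standard model and from the image plane in the central model). It compares the image positions of a near and a far point under each model.

```python
def project_standard(X, Y, Z, f):
    """Standard perspective projection (assumed form of Equation 11).
    Z is measured from the center of projection."""
    return f * X / Z, f * Y / Z


def project_central(X, Y, Z, inv_f):
    """Central projection (assumed form of Equation 12).
    Z is measured from the image plane; inv_f is the inverse focal length 1/f."""
    d = 1.0 + inv_f * Z
    return X / d, Y / d


near = (1.0, 1.0, 2.0)  # hypothetical 3D points at two depths
far = (1.0, 1.0, 4.0)


def near_far_ratio(project, param):
    """Ratio of the near point's u coordinate to the far point's."""
    u_near, _ = project(*near, param)
    u_far, _ = project(*far, param)
    return u_near / u_far


# Standard model: changing f rescales the image but leaves the ratio at 2.0.
print(near_far_ratio(project_standard, 1.0), near_far_ratio(project_standard, 2.0))

# Central model: changing 1/f changes the ratio itself (about 1.667 vs 1.5).
print(near_far_ratio(project_central, 1.0), near_far_ratio(project_central, 0.5))
```

Under the standard model the relative positions of the two image points are unaffected by f, whereas under the central model the inverse focal length changes them without any change in depth.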
In our representation, however, the inverse focal length alters the imaging geometry independently of the depth value ZC. Decoupling of state variables is known to be critical in Kalman filtering frameworks, and it applies here because we plan to place both the camera's internal geometry and the structure ZC into the internal hidden state.
Another critical property of the inverse focal length, as opposed to f, is that it does not exhibit numerical ill-conditioning. It can span the wide range of perspective projection as well as the special case of orthographic projection, which occurs when the focal length is set to infinity and all rays project orthogonally onto the image plane. Under orthographic projection the inverse focal length is simply zero, which does not 'blow up' and maintains numerical stability in KF frameworks. We can thus combine perspective and orthographic projection in the same so-called central projection framework without any numerical instabilities (this is demonstrated experimentally in the next section). This flexibility is uncommon in traditional computer vision approaches, where perspective and orthographic projection must be treated quite differently. We now begin building our internal state vector with this well-behaved parameter, as in Equation 13.
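The orthographic limit can be illustrated with a short sketch. It is not taken from the original implementation: the function names are hypothetical, and the projection formula is the central projection form assumed above, with depth measured from the image plane. The same projection is written once in terms of f and once in terms of its inverse.

```python
import math


def project_with_f(X, Y, Z, f):
    """Central projection written in terms of f; breaks down as f goes to infinity."""
    return f * X / (Z + f), f * Y / (Z + f)


def project_with_inverse_f(X, Y, Z, inv_f):
    """The same projection written in terms of the inverse focal length 1/f."""
    d = 1.0 + inv_f * Z
    return X / d, Y / d


point = (1.0, 2.0, 5.0)  # hypothetical 3D point

# Orthographic projection as f = infinity: inf / inf evaluates to NaN.
u_bad, v_bad = project_with_f(*point, math.inf)
print(math.isnan(u_bad), math.isnan(v_bad))

# Orthographic projection as 1/f = 0: depth drops out and (X, Y) is returned.
print(project_with_inverse_f(*point, 0.0))

# A long but finite focal length approaches the orthographic result smoothly.
print(project_with_inverse_f(*point, 1.0 / 1e6))
```

Parameterized by f, the orthographic case requires an infinite value that floating-point arithmetic cannot carry through the division, whereas the inverse parameterization reaches it at the ordinary finite value zero.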