Traditionally, the models of these transformations have been affine, that is, of the form $\mathbf{x}' = \mathbf{A}\mathbf{x} + \mathbf{b}$, where $\mathbf{A} \in \mathbb{R}^{2 \times 2}$ and $\mathbf{b} \in \mathbb{R}^{2 \times 1}$. The affine model describes rotation about the optical axis, zoom, translation and shear. Cameras, however, do not produce a shear type of motion. Unfortunately, the affine model cannot describe pan and tilt, which show up in the image as ``keystoning'' and ``chirping''. Trying to describe the actual camera motions with the affine model therefore gives a poor fit that is also susceptible to noise.
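To make this limitation concrete, the following minimal sketch applies an affine map to the corners of a unit square and checks that opposite edges stay parallel, which is why no choice of affine parameters can produce keystoning. The helper names and the example values of `A` and `b` are illustrative assumptions, not taken from the original work.

```python
def affine(A, b, p):
    """Apply the affine map x' = A x + b to a 2-D point p."""
    x, y = p
    return (A[0][0] * x + A[0][1] * y + b[0],
            A[1][0] * x + A[1][1] * y + b[1])

def edge(p, q):
    """Vector from point p to point q."""
    return (q[0] - p[0], q[1] - p[1])

def cross(u, v):
    """2-D cross product; zero when u and v are parallel."""
    return u[0] * v[1] - u[1] * v[0]

# Corners of a unit square, listed counter-clockwise.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

# Arbitrary example parameters (assumed): rotation, zoom, shear, translation.
A = [[1.2, 0.3], [-0.1, 0.9]]
b = [2.0, -1.0]

c0, c1, c2, c3 = [affine(A, b, p) for p in square]
bottom = edge(c0, c1)
top = edge(c3, c2)

# Opposite edges remain parallel up to floating-point rounding.
affine_parallel = abs(cross(bottom, top)) < 1e-12
print("opposite edges still parallel:", affine_parallel)
```

Whatever values are chosen for `A` and `b`, the square maps to a parallelogram, never to the trapezoid that a panned or tilted camera produces.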
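However, the projective model is able to describe all of the possible camera motions exactly. It has 8 parameters and is of the form $\mathbf{x}' = \frac{\mathbf{A}\mathbf{x} + \mathbf{b}}{\mathbf{c}^T\mathbf{x} + 1}$, where $\mathbf{A} \in \mathbb{R}^{2 \times 2}$ and $\mathbf{b}, \mathbf{c} \in \mathbb{R}^{2 \times 1}$. This is what we learned in assignment 1. In :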
Because the parameters of the projective coordinate transformation had traditionally been thought to be too difficult to solve, most researchers have used the simpler affine model or other approximations to the projective model. ... we propose and demonstrate the featureless estimation of the parameters of the ``exact'' projective model ...
Thus, the projective model properly expresses the pan and tilt motions, which result in ``keystoning'' and ``chirping'' of the image. To express the difference between two images properly, we therefore need to solve for the 8 parameters. This is not as simple as setting up eight equations and solving them for the eight unknowns.
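As an illustration of what the 8 parameters do, the sketch below applies a projective map to a unit square and to equally spaced points. It assumes the standard 8-parameter form $\mathbf{x}' = (\mathbf{A}\mathbf{x} + \mathbf{b}) / (\mathbf{c}^T\mathbf{x} + 1)$, with 4 parameters in `A`, 2 in `b` and 2 in `c`. The function name and the example parameter values are hypothetical. It only evaluates the model for given parameters and does not attempt to estimate them.

```python
def projective(A, b, c, p):
    """Apply x' = (A x + b) / (c^T x + 1) to a 2-D point p."""
    x, y = p
    w = c[0] * x + c[1] * y + 1.0  # the denominator that the affine model lacks
    return ((A[0][0] * x + A[0][1] * y + b[0]) / w,
            (A[1][0] * x + A[1][1] * y + b[1]) / w)

# Example parameters (assumed): identity A, zero b, and a non-zero c.
A = [[1.0, 0.0], [0.0, 1.0]]
b = [0.0, 0.0]
c = [0.0, 0.5]

# Keystoning: the top edge of a unit square shrinks relative to the bottom.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
c0, c1, c2, c3 = [projective(A, b, c, p) for p in square]
bottom_width = c1[0] - c0[0]
top_width = c2[0] - c3[0]
print("bottom width:", bottom_width, "top width:", top_width)

# Chirping: equally spaced points map to unequally spaced points.
ys = [projective(A, b, c, (0.0, float(k)))[1] for k in range(4)]
spacings = [ys[k + 1] - ys[k] for k in range(3)]
print("spacings:", spacings)

# With c = 0 the denominator is 1 and the model reduces to the affine case.
same_as_affine = projective(A, b, [0.0, 0.0], (1.0, 1.0))
```

With `c` set to zero the square stays a square here, so it is the two entries of `c` that account for the keystoning and chirping an affine model cannot express.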