Affine Homography
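We first look at the case when the transformation between
two views is affine. The third row of the homography matrix
in Equation 1 has the special form $[0\;\,0\;\,1]$
for affine transformations. We can write
Equation 1 in this case as
\begin{displaymath}
{\bf x}^l[i] = {\bf A}\, {\bf x}^0[i] + {\bf b}
\end{displaymath}
(2)
where ${\bf x}^l[i]$ is the $i$th boundary point in view $l$,
${\bf A}$ is a $2 \times 2$ matrix,
${\bf b}$ is a translation vector
and $l$ is the view index, with $l = 0$
being considered to be the
reference view. We can discard the effect of vector ${\bf b}$
by discarding the DC component, that is, the Fourier coefficient
corresponding to $k = 0$
(or by shifting the origin to the
centroid of the shape). The affine transformation can now be
written in terms of the Fourier coefficients as
\begin{displaymath}
{\bf\bar{X}}^l[k] = {\bf A}\, {\bf\bar{X}}^0[k]
\end{displaymath}
(3)
without any loss in generality.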
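The claim that the translation only touches the DC coefficient can be checked numerically. The following is a minimal sketch, assuming NumPy; the test shape, the matrix `A`, the translation `b` and the helper `fourier_descriptors` are illustrative choices and are not taken from the paper.

```python
import numpy as np

def fourier_descriptors(points):
    """DFT of an (N, 2) boundary along the point index; returns a (2, N) complex array."""
    return np.fft.fft(np.asarray(points, dtype=float).T, axis=1)

# Hypothetical closed curve sampled at N boundary points.
N = 64
t = 2 * np.pi * np.arange(N) / N
shape0 = np.stack([np.cos(t) + 0.3 * np.cos(3 * t),
                   np.sin(t) + 0.2 * np.sin(2 * t)], axis=1)

A = np.array([[1.2, 0.4], [-0.3, 0.9]])   # illustrative affine matrix
b = np.array([5.0, -2.0])                 # illustrative translation
shape1 = shape0 @ A.T + b                 # each point mapped to A x + b

X0 = fourier_descriptors(shape0)
X1 = fourier_descriptors(shape1)

# The translation contributes N * b to the k = 0 coefficient only.
dc_differs = not np.allclose(X1[:, 0], A @ X0[:, 0])
rest_matches = np.allclose(X1[:, 1:], A @ X0[:, 1:])
```

Dropping column 0 of both descriptor arrays therefore leaves a purely linear relation between the two views.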
In general, correspondence between images, i.e., information
as to which image points in different views are projections
of the same 3D point, is not available. This implies that
when the boundary is seen in two views, the description may
not start from the same point. In other words, there is an
unknown shift between the sequences of boundary points in
different views. This shift in the spatial domain translates
into a rotation in the Fourier domain. Equation 3
can now be written as
![\begin{displaymath}
{\bf\bar{X}}^l[k] = {\bf A} {\bf\bar{X}}^0[k]\,
e^{\omega_k\lambda_l}
\end{displaymath}](img27.png) |
(4) |
where $\lambda_l$
is the unknown shift in view $l$,
$N$
is the number of points on the boundary of the shape, and
$\omega_k = j 2 \pi k / N$.
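Let us define a measure called the cross-conjugate
product (CCP) on the Fourier representations of two views
as
\begin{displaymath}
\psi(0, l) = {\bf\bar{X}}^0[k]^{*T}\, {\bf\bar{X}}^l[k]
= {\bf\bar{X}}^0[k]^{*T}\, {\bf A}\, {\bf\bar{X}}^0[k]\, e^{\omega_k\lambda_l}
\end{displaymath}
(5)
The matrix ${\bf A}$
can be expressed as a sum
of a symmetric matrix and a skew symmetric matrix as
${\bf A} = {\bf S} + {\bf Q}$,
where ${\bf S} = ({\bf A} + {\bf A}^T)/2$
and ${\bf Q} = ({\bf A} - {\bf A}^T)/2$.
The skew symmetric matrix
reduces to
\begin{displaymath}
{\bf Q} = c \left[ \begin{array}{cr} 0 & 1 \\
-1\;\; & 0 \end{array} \right]
\end{displaymath}
where $c = (a_{12} - a_{21})/2$
is half the difference of the
off-diagonal elements of ${\bf A}$.
We now have
\begin{displaymath}
\psi(0, l) = {\bf\bar{X}}^0[k]^{*T}\, {\bf S}\, {\bf\bar{X}}^0[k]\, e^{\omega_k\lambda_l}
+ c\, {\bf\bar{X}}^0[k]^{*T} \left[ \begin{array}{cr} 0 & 1 \\
-1\;\; & 0 \end{array} \right] {\bf\bar{X}}^0[k]\, e^{\omega_k\lambda_l}
\end{displaymath}
(6)
The term ${\bf\bar{X}}^0[k]^{*T}\, {\bf S}\, {\bf\bar{X}}^0[k]$
of the
above equation is purely real, since ${\bf S}$ is real and symmetric, and the term
$c\, {\bf\bar{X}}^0[k]^{*T}\, [0\;\,1;\,-1\;\,0]\, {\bf\bar{X}}^0[k]$
is purely
imaginary. The phases of
the two summands of Equation 6 therefore
depend
only on the shift $\lambda_l$.
Thus, $\lambda_l$
can be
recovered from the inverse Fourier transform of
either summand,
if known. However, we can only compute
$\psi(0, l)$, a combination of
the two summands,
which is not directly useful to recover the shift.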
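The real and imaginary character of the two parts can be verified with a few lines of NumPy. This is a sketch under stated assumptions: the names `S`, `Q`, `c` and `J` are labels chosen here for the symmetric part, the skew symmetric part, its scale factor and the matrix with rows (0, 1) and (-1, 0); the matrix `A` and the vector `X` are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(2, 2))                        # stand-in affine matrix
X = rng.normal(size=2) + 1j * rng.normal(size=2)   # stand-in Fourier coefficient vector

S = (A + A.T) / 2                  # symmetric part of A
Q = (A - A.T) / 2                  # skew symmetric part of A
c = (A[0, 1] - A[1, 0]) / 2        # half the difference of the off-diagonal elements
J = np.array([[0.0, 1.0], [-1.0, 0.0]])

sym_term = X.conj() @ S @ X        # Hermitian form with a real symmetric matrix
skew_term = X.conj() @ Q @ X       # same form with a real skew symmetric matrix
```

Here `sym_term` has a vanishing imaginary part and `skew_term` a vanishing real part, up to rounding, and their sum equals `X.conj() @ A @ X`.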
We observe that the effect of the transformation matrix ${\bf A}$
on the purely imaginary term
is restricted to a scaling by a scalar factor $c$.
We
can define a new measure $\kappa$,
ignoring scale, for the
sequence ${\bf\bar{X}}^l$
as
![\begin{displaymath}
\kappa(l) = {\bf\bar{X}}^l[k]^{*T} \left[ \begin{array}{cr} 0 & 1 \\
-1\;\; & 0 \end{array} \right] {\bf\bar{X}}^l[k].
\end{displaymath}](img49.png) |
(7) |
It can be shown that
\begin{displaymath}
\kappa(l) = |{\bf A}|\, \kappa(0)
\end{displaymath}
(8)
where $|{\bf A}|$ is the determinant of ${\bf A}$.
Equation 8 gives a necessary condition for the
sequences ${\bf\bar{X}}^0$
and ${\bf\bar{X}}^l$
to be two
different views of the same planar shape: the
values of the measure $\kappa$
in the two views
should be scaled versions of each other. This extends to
multiple views also, where it can be expressed differently.
Consider the
matrix $\Theta$ formed by the coefficients
of the $\kappa$
measures for $M$ different views, with one row per view.
The necessary condition for matching of the planar shape
in $M$
views then reduces to
\begin{displaymath}
\mathrm{rank}(\Theta) = 1
\end{displaymath}
(9)
It should be noted that this recognition condition does not
require correspondence between views and is valid for any
number of views.
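The rank condition can be illustrated numerically. The sketch below assumes NumPy and builds synthetic views with hypothetical affine matrices, translations and starting-point shifts; the helpers `kappa` and `make_view` and the name `Theta` are illustrative. It relies on the identity that, for any 2 x 2 matrix A and the matrix J with rows (0, 1) and (-1, 0), the product of A transposed, J and A equals det(A) times J, which is why the measure of Equation 7 only picks up a determinant factor.

```python
import numpy as np

J = np.array([[0.0, 1.0], [-1.0, 0.0]])

def kappa(points):
    """Measure of Equation 7 at every non-DC frequency of an (N, 2) boundary."""
    X = np.fft.fft(np.asarray(points, dtype=float).T, axis=1)[:, 1:]
    # For each frequency k: conj(X[:, k])^T J X[:, k]
    return np.einsum('ik,ij,jk->k', X.conj(), J, X)

def make_view(shape, A, b, shift):
    """Affine view of the shape whose boundary description starts `shift` points later."""
    return np.roll(shape @ A.T + b, -shift, axis=0)

N = 64
t = 2 * np.pi * np.arange(N) / N
shape0 = np.stack([np.cos(t) + 0.3 * np.cos(3 * t),
                   np.sin(t) + 0.2 * np.sin(2 * t)], axis=1)

A1 = np.array([[1.2, 0.4], [-0.3, 0.9]])   # illustrative values
A2 = np.array([[0.8, -0.5], [0.6, 1.1]])
views = [shape0,
         make_view(shape0, A1, np.array([5.0, -2.0]), 7),
         make_view(shape0, A2, np.array([-1.0, 3.0]), 20)]

Theta = np.array([kappa(v) for v in views])            # one row per view
singular_values = np.linalg.svd(Theta, compute_uv=False)
```

For views of the same shape, only the first singular value is significantly different from zero. In practice a tolerance on the ratio of the second to the first singular value would be needed, since real boundaries are noisy.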
Can we also estimate the shift $\lambda_l$
that would align
corresponding points in two views? The answer is yes, using
a measure $\kappa_{mod}(l, p)$
for a fixed $p$
given by
![\begin{displaymath}
\kappa_{mod}(l, p) = (X^l[k])^{*T} \left[ \begin{array}{cl} 0 &
1 \\ -1\;\; & 0 \end{array} \right] X^l[p]
\end{displaymath}](img60.png) |
(10) |
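The measure $\kappa_{mod}(l, p)$
correlates each Fourier descriptor coefficient
with a fixed one within each view. Following a chain of reasoning
similar to the one above,
we can show that
\begin{displaymath}
\kappa_{mod}(l, p) = |{\bf A}|\, \kappa_{mod}(0, p)\, e^{(\omega_p - \omega_k)\lambda_l}
\end{displaymath}
(11)
where $|{\bf A}|$ is the determinant of ${\bf A}$.
Equation 11 states that the phases of $\kappa_{mod}(l, p)$
and $\kappa_{mod}(0, p)$
differ by a value
proportional to the shift $\lambda_l$
and the differential
frequency $p - k$.
Therefore, the ratio $\kappa_{mod}(l, p) / \kappa_{mod}(0, p)$,
taken as a series in $k$,
will be a complex sinusoid. The value
of $\lambda_l$
can be computed from the inverse Fourier
transform of the quotient series. Thus, we can compute the
correspondence between points on the shape boundary across views by
determining $\lambda_l$
as above, starting with no prior knowledge.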
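The shift recovery can be sketched as follows, assuming NumPy. The helpers `kappa_mod` and `estimate_shift`, the random stand-in boundary and the affine parameters are all hypothetical. Two coefficients are left out of the quotient: k = 0, which is removed by centring the shape, and k = N - p, where the measure of Equation 10 vanishes for any real-valued boundary because the coefficient at N - p is the conjugate of the one at p.

```python
import numpy as np

J = np.array([[0.0, 1.0], [-1.0, 0.0]])

def kappa_mod(points, p):
    """Measure of Equation 10 for all k, computed on the centred boundary."""
    pts = np.asarray(points, dtype=float)
    X = np.fft.fft((pts - pts.mean(axis=0)).T, axis=1)
    # For each frequency k: conj(X[:, k])^T J X[:, p]
    return np.einsum('ik,ij,j->k', X.conj(), J, X[:, p])

def estimate_shift(view0, view_l, p=1):
    """Locate the peak of the inverse DFT of the quotient series."""
    k0 = kappa_mod(view0, p)
    kl = kappa_mod(view_l, p)
    n = len(k0)
    valid = np.ones(n, dtype=bool)
    valid[[0, (n - p) % n]] = False        # degenerate coefficients, see above
    quotient = np.zeros_like(kl)
    quotient[valid] = kl[valid] / k0[valid]
    return int(np.argmax(np.abs(np.fft.ifft(quotient))))

rng = np.random.default_rng(1)
N = 64
view0 = rng.normal(size=(N, 2))            # stand-in boundary with a full spectrum
A = np.array([[1.2, 0.4], [-0.3, 0.9]])    # illustrative affine matrix
true_shift = 11
view_l = np.roll(view0 @ A.T + np.array([5.0, -2.0]), -true_shift, axis=0)

recovered = estimate_shift(view0, view_l)
```

With this sign convention, point `i` of `view_l` corresponds to point `(i + recovered) % N` of `view0`. Shapes whose spectrum has many near-zero coefficients would need those coefficients masked out of the quotient as well.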
The measure $\kappa_{mod}$
can also be adopted for the purpose
of recognition. However, this approach is not
discussed in this paper for want of space.
The general projective homography relating two different views of the same planar shape
can be reasonably approximated by an affine homography.
This approximation is a practical one, as most real-life configurations of imaging a
scene from multiple viewpoints possess structure
that is very close to that of an affine homography. This assumption is also validated by the
results for general homographies, which are presented in the next section.
2002-10-09