Since we are searching specifically for mouths, the connected component
analysis can be further constrained. Not only is the symmetry data projected
onto the horizontal, it is to be linked horizontally as well. We are searching
specifically for connected horizontal limbs and not an arbitrary cloud of
adjacent points from the images in Figure . The extraction will
produce a set of 3D curves that flow through the symmetry data in
Figure . Figure displays the limb axes
(shown with dashed lines) that should approximate the clouds (or manifolds) of
symmetry points (enclosed with continuous lines) in the search space.

Furthermore, these 3D curves are to be as horizontal as possible and should
not meander excessively. The mouth is a smooth limb so its axis should be
a simple straight line or a slight curve. Furthermore, axial symmetry curves
should not undulate excessively in scale-space, *r* (the third dimension).
This prevents the vertical thickness of the mouth limb from varying wildly. Thus,
mouth limbs which are maximally straight will be favored.
Figure displays limb extractions which do not satisfy
these criteria and hence are probably not mouths. Such limbs should be
attenuated during extraction so that they will not interfere with mouth
detection. This attenuation agrees with the Gestalt psychology notion of
``Pragnanz'' which identifies a correspondence between simplicity in image
structures and their perceptual significance [3].

Connected component analysis begins at a given point (*p*,*r*) in the 3D search
space. This 3D search space covers values *r*=1 to *r*=6 and the (*x*,*y*)region depicted in Figure . This starting point ((*p*,*r*) or
(*x*,*y*,*r*)) will be called the ``seed''. One of the requirements we impose is
that the seed has a significant horizontal symmetry magnitude (25% of the
maximum possible magnitude). From this starting point, we form a limb or 3D
curve by propagating along two trajectories: to the left and to the right.

By left propagation, we seek a path towards the left of the current pixel
(*p*,*r*) which flows through the strongest horizontal symmetry points in our 3D
search space. Along the left propagation, we utilize a kernel that covers all
the adjacent neighbours of the current point
(*p*,*r*)=(*x*,*y*,*r*) to the left of
the current pixel. This kernel is depicted in Figure . This
kernel contains 9 cells of which we must pick one. This new cell becomes the
new ``current cell'' and we repeat the process, tracing out the next step in
the path from the 9 possible choices in the kernel. This propagation is
continued and the trajectory flows through the 3D space as it repeatedly
selects from the 9 possible cells in the kernel.

The symmetry magnitudes are scaled by the weights depicted in the kernel in
Figure . We thus measure a weighted symmetry value from the
horizontal projection data for each cell in the kernel. The pixel location to
the immediate left is favored most and its symmetry magnitude is consequently
scaled by .
We simultaneously disfavor diagonally positioned pixels
by scaling their magnitude by .
We begin by only considering the
spatial neighbourhood at scale *r*. The peak weighted response from the three
cells at *r* is determined. If it is greater than the threshold (25%) then we
move (propagate) to the strongest cell's position and re-compute the kernel
from there. If on the other hand, the scale at *r* has only weak symmetry, we
repeat the analysis at *r*+1 and *r*-1 and propagate to the strongest of those
6 cells. Thus, we seek a path towards the left of the current pixel which
flows through the strongest horizontal symmetry. However, we favor paths that
are horizontal (limiting spatial meandering) and paths that are at the same
scale (limiting meandering in scale). Equation illustrates the
weighted computation of the peak-response of the strongest cell in the kernel.
The cell generating the peak-response, *peak* is the one we move to next. It
becomes the ``current cell'' and we repeat the kernel computation from there.
The process stops when a dead end is reached and none of the 9 possible cells
contain an output greater than the threshold (25%). In
Equation , the horizontal symmetry values are labelled *S* where
*S*_{(x,y,r)}=*S*_{rhorizontal}(*x*,*y*).

A similar propagation is performed to the right of the seed (or starting cell) and the two trajectories are merged into one. This single trajectory in the spatial and scale domain represents the extracted limb as a whole.

Figure shows (in black) the (*x*,*y*) region in the image
where the mouth limb starting points will be selected. The dimensions of this
triangular search region are defined in the figure. Note that the triangle
does not extend beyond the face mask so that the mouth search is performed
exclusively within the face contour. Any part of the mouth that intersects
this triangular search space will trigger limb extraction. Thus, the mouth
does not have to be perfectly centered upon the face's mid-line. This can
happen if the face has an unusual pose or the eye detection is slightly
inaccurate. The triangular search space is superimposed upon the rotated
intensity image in Figure . Observe how the mouth is
not perfectly aligned with the face's mid-line in
Figure . However, it still falls within the generous
triangular search region and is thus detectable.

Note that this triangular search space is rather redundant. Say a horizontal
limb is detected at a seed position (*x*,*y*,*r*) on the image. A very similar
limb will probably be detected when we start the limb extraction at
(*x*+1,*y*,*r*). Thus, we do not need to start the limb extraction process at
every point in the triangle. We merely start the limb extraction once for
every value of *y* in the triangle. For each value of *y* we select the seeding point
(*x*,*y*,*r*) by varying *x* and *r* to maximize
*S*_{rhorizontal}(*x*,*y*). Then,
we begin the limb extraction and extract via propagation the trajectory of the
limb towards the left and the right. This is repeated for each value of *y* in
the triangle resulting in a collection of limb axes.

The resulting horizontal limb axes are stored as 3D curves (or trajectories)
in the spatial-scale domain as shown in Figure . The height in
this graph represents the scale or the dimension *r* of the trajectory. This
value of *r* corresponds directly to the thickness of the limb being
extracted. Now, we must test these limbs to determine which of them is the
mouth. Some particularly short limbs are present in Figure
despite the Gaussian filtering. These limbs will be discarded later since they
are too short.