Conclusions

A face recognition system must be able to recognize a face in many different imaging situations. The appearance of a face in a 2D image is influenced not only by its identity but also by other variables such as lighting, background clutter, and pose. This thesis has described a method for detecting a face despite these variations, as well as a 3D-based normalization algorithm to compensate for them. This normalization removes the large nonlinear effects of pose, lighting and background clutter, eliminating much of the variance these variables generate. In essence, it synthesizes a mug-shot from an arbitrary, uncontrived image containing a face, permitting the use of holistic linear recognition techniques (which require frontal mug-shots). Thus, the 3D normalization acts as a bridge connecting the detection algorithm (which can handle arbitrary input images) to the recognition engine.

The system was motivated by a number of criteria. The algorithm had to find faces efficiently, without exhaustively searching the image. Thus, we proposed the use of low-level perceptual mechanisms based on contrast, symmetry and scale, which allow us to focus computational resources on perceptually significant objects in a scene. We discussed the development of attentional mechanisms and the symmetry transform, which quickly pinpoint interesting objects. The symmetry transform was then adapted to search selectively for the symmetry found in facial contours.

We then described a framework for using these low-level mechanisms to find faces and facial features in an arbitrary image. The proposed hierarchical search is efficient and robust and proceeds in a coarse-to-fine manner. We begin by finding face-like blobs in the image. We then identify the facial contour that encloses the face and use it to restrict the search for eyes. Blobs that resemble eyes are detected next, and we proceed to search for a mouth. Once the mouth is located, the position of the nose is approximated using signature analysis.
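
The coarse-to-fine strategy above can be sketched in miniature as follows. This is only an illustration, not the thesis implementation: the pooling factor, the blob score (here simply the brightest pixel rather than a contrast/symmetry measure), and the refinement radius are all placeholder choices.

```python
import numpy as np

def downsample(img, f):
    """Average-pool the image by factor f (the coarse pyramid level)."""
    h, w = img.shape
    return img[:h - h % f, :w - w % f].reshape(h // f, f, w // f, f).mean(axis=(1, 3))

def best_blob(img):
    """Stand-in blob score: location of the brightest pixel."""
    return np.unravel_index(np.argmax(img), img.shape)

def coarse_to_fine(img, f=4, radius=6):
    """Locate a candidate on the coarse level, then refine the search
    only inside a small neighborhood at full resolution."""
    cy, cx = best_blob(downsample(img, f))
    y0, x0 = cy * f, cx * f
    ys, ye = max(0, y0 - radius), min(img.shape[0], y0 + f + radius)
    xs, xe = max(0, x0 - radius), min(img.shape[1], x0 + f + radius)
    ry, rx = best_blob(img[ys:ye, xs:xe])
    return ys + ry, xs + rx

# toy image: a single bright Gaussian blob centered at (40, 25)
yy, xx = np.mgrid[0:64, 0:64]
img = np.exp(-((yy - 40) ** 2 + (xx - 25) ** 2) / 20.0)
loc = coarse_to_fine(img)
```

The point of the two-stage search is that the expensive fine-scale evaluation touches only a small window around the coarse candidate instead of the whole image.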

Subsequently, we developed a 3D normalization technique that compensates for facial pose using the positions of the eyes, the nose and the mouth. This normalization uses a deformable 3D model of the average human head to synthesize a mug-shot from the localized face in an arbitrary input image, effectively compensating for pose variations and background clutter. The lighting is then normalized using a mixture of windowed histogram-fitting transfer functions. The effectiveness of the normalization technique was illustrated with several examples.
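
A much-simplified version of the windowed histogram-fitting step can be sketched as follows. This toy fits each window's histogram to a flat target and applies each tile's transfer function independently; the thesis instead blends a mixture of neighboring transfer functions, which this sketch omits.

```python
import numpy as np

def equalize_window(window, levels=256):
    """Histogram-fitting transfer function for one window: map
    intensities so their histogram approximates a flat target."""
    hist, _ = np.histogram(window, bins=levels, range=(0, levels))
    cdf = hist.cumsum() / window.size            # empirical CDF of the window
    transfer = np.round(cdf * (levels - 1)).astype(np.uint8)
    return transfer[window.astype(np.intp)]      # apply the transfer function

def normalize_lighting(image, tile=8):
    """Apply an independent transfer function per tile (no blending,
    unlike the mixture of transfer functions used in the thesis)."""
    out = np.empty_like(image)
    h, w = image.shape
    for y in range(0, h, tile):
        for x in range(0, w, tile):
            out[y:y + tile, x:x + tile] = equalize_window(image[y:y + tile, x:x + tile])
    return out

# toy example: a dark-left / bright-right illumination gradient
img = np.tile(np.linspace(30, 220, 64).astype(np.uint8), (64, 1))
flat = normalize_lighting(img)
```

After normalization the left and right halves of the toy image have the same mean intensity, i.e. the simulated illumination gradient has been removed.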

Next, the details of the Karhunen-Loève-based statistical detection and recognition algorithm were presented. The KL transform is used to compress the face data by over two orders of magnitude. We also adapted the KL transform to function as a detection mechanism, improving the localization of faces and discarding incorrect localizations. Finally, the KL representation provided a convenient way of matching a probe image to a database of faces using distance measurements.
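
The compression-and-matching role of the KL transform can be sketched as follows. This is a minimal eigenface-style illustration on synthetic data, not the thesis code: the gallery of random vectors, the subspace dimension, and the plain Euclidean nearest-neighbor match are all toy assumptions.

```python
import numpy as np

def kl_basis(faces, k):
    """Compute a k-dimensional Karhunen-Loeve (PCA) basis from
    flattened face vectors (one face per row)."""
    mean = faces.mean(axis=0)
    centered = faces - mean
    # SVD of the centered data yields the KL eigenvectors directly.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:k]                      # rows are orthonormal

def project(face, mean, basis):
    """Compress a face to its k KL coefficients."""
    return basis @ (face - mean)

def nearest_face(probe_coeffs, gallery_coeffs):
    """Match a probe to the gallery entry with the smallest distance."""
    dists = np.linalg.norm(gallery_coeffs - probe_coeffs, axis=1)
    return int(np.argmin(dists))

rng = np.random.default_rng(0)
gallery = rng.normal(size=(10, 256))         # 10 toy "faces", 16x16 flattened
mean, basis = kl_basis(gallery, k=4)         # compress 256 dims -> 4
coeffs = np.array([project(f, mean, basis) for f in gallery])
probe = gallery[3] + rng.normal(scale=0.01, size=256)  # noisy copy of face 3
match = nearest_face(project(probe, mean, basis), coeffs)
```

Matching is performed entirely in the compressed coefficient space, which is what makes distance-based recognition over a large database tractable.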

The complete implementation was then described, including the assembly of the individual software modules and the user interface. We tested the system on arbitrary input images to determine its effectiveness at localizing faces, and computed recognition rates on a standard database. We also performed a sensitivity analysis to determine the effect of localization accuracy on recognition.

The localization and recognition results demonstrate the algorithm's ability to handle a variety of pose changes, illumination conditions and background clutter. This robustness is due to the 3D normalization stage, which links the localization output to the recognition engine. Thus, we overcome the constraints of holistic linear recognition, which requires mug-shot input images. The 3D normalization and illumination correction act as an intermediate step linking the robust feature detection stage to the precise linear recognition stage.



 
Tony Jebara
2000-06-23