Computer Vision Talks at Columbia University

3D Structure based facial video compression

Kanad Biswas

IIT Delhi, India

Monday, June 25, 11 AM

Interschool Lab, 7th floor, CEPSR 

Host: Prof. Shree Nayar 

 

Abstract 

The talk is about compression of facial video sequences of the kind that typically arise in video telephony, online classrooms, video conferencing, etc. To start, the face is triangulated, and the vertices serve as control points. Instead of sending all the points for each frame to reconstruct the face at the other end, we compute the 3D affine structure of the face and send it once at the beginning. The control points are then tracked robustly in each frame, and we compute the global motion vector. For each frame we need to transmit only 8 parameters (this can be reduced further), from which the face can be reconstructed at the other end. The second part is the computation of the local motion of the lips and the eyes. This is done by computing the 2D affine structure of the moving blocks. Alternatively, we make use of the fact that a finite set of lip movements keeps repeating in normal conversation. These can be stored in a database and indexed using snake-based contouring of the lips, so that only the relevant index has to be sent. Some exploratory work on global reconstruction using view morphing will also be presented.
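As a rough illustration of the global-motion step, the sketch below fits per-frame parameters to tracked control points by least squares. The abstract does not say what the 8 parameters are; the sketch assumes they are a 2x3 affine projection matrix plus a 2D translation applied to the 3D affine structure sent at the start. The function names (solve_linear, fit_frame_parameters, reconstruct) are hypothetical and are not taken from the speaker's work.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate structure: too few or coplanar points")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_frame_parameters(structure, tracked):
    """Fit the assumed 8 per-frame parameters by least squares.

    structure: list of (X, Y, Z) control points, sent once.
    tracked:   list of (u, v) image positions of the same points in one frame.
    Returns two rows [m1, m2, m3, t], one per image axis (8 numbers in total).
    """
    rows = [[x, y, z, 1.0] for x, y, z in structure]
    # Normal equations: (A^T A) p = A^T b, solved once per image axis.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    params = []
    for axis in range(2):
        atb = [sum(r[i] * p[axis] for r, p in zip(rows, tracked))
               for i in range(4)]
        params.append(solve_linear(ata, atb))
    return params


def reconstruct(structure, params):
    """Receiver side: rebuild 2D control points from structure and parameters."""
    return [tuple(sum(c * v for c, v in zip(row, (x, y, z, 1.0)))
                  for row in params)
            for x, y, z in structure]
```

Under this assumption the sender would call fit_frame_parameters once per frame and transmit only the returned 8 numbers, and the receiver would call reconstruct with the structure it received at the start. At least four non-coplanar control points are needed for the fit to be well posed.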

SHORT BIOGRAPHICAL SKETCH OF THE SPEAKER:

Prof. Biswas graduated in Electrical Engineering from IIT Madras. He received his master's degree in control engineering from IIT Delhi. His PhD, completed in 1974, was also from IIT Delhi, in the area of signal estimation. He has been on the faculty of IIT Delhi since then and is currently a professor in the Department of Computer Science and Engineering. He previously worked on control theory and blackboard-based architectures. His current interests are video compression, automatic segmentation and classification of broadcast video, and medical image compression.