The PubFig database is a large, real-world face dataset consisting of 58,797 images of 200 people collected from the internet. Unlike most other existing face datasets, these images are taken in completely uncontrolled situations with non-cooperative subjects. Thus, there is large variation in pose, lighting, expression, scene, camera, imaging conditions and parameters, etc. The PubFig dataset is similar in spirit to the Labeled Faces in the Wild (LFW) dataset created at UMass-Amherst, although there are some significant differences between the two.

We have created a face verification benchmark on this dataset that tests the ability of algorithms to classify a pair of images as being of the same person or not. Importantly, the people in a test pair must never have been seen by the algorithm during training. In the future, we hope to create recognition benchmarks as well.
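To illustrate the idea of verification on unseen people, here is a minimal sketch in Python. It is not the official benchmark protocol: the function names, the 40% test fraction, and the toy file names are all hypothetical, chosen only to show how identities can be split so that no person appears in both training and test pairs.

```python
import itertools
import random


def split_identities(names, test_fraction=0.2, seed=0):
    """Partition identities so that no person is in both train and test.

    Hypothetical helper: the fraction and seed are illustrative only.
    """
    rng = random.Random(seed)
    shuffled = sorted(names)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return set(shuffled[n_test:]), set(shuffled[:n_test])


def make_pairs(images_by_person, people):
    """Build (image_a, image_b, same_person) tuples using only `people`."""
    people = sorted(people)
    pairs = []
    # Positive pairs: two different images of the same person.
    for person in people:
        for a, b in itertools.combinations(images_by_person[person], 2):
            pairs.append((a, b, True))
    # Negative pairs: one image from each of two different people.
    for p, q in itertools.combinations(people, 2):
        for a in images_by_person[p]:
            for b in images_by_person[q]:
                pairs.append((a, b, False))
    return pairs


# Toy stand-in for the real image lists (file names are made up).
images_by_person = {
    "person_a": ["a1.jpg", "a2.jpg"],
    "person_b": ["b1.jpg", "b2.jpg"],
    "person_c": ["c1.jpg", "c2.jpg"],
    "person_d": ["d1.jpg", "d2.jpg"],
    "person_e": ["e1.jpg", "e2.jpg"],
}

train_people, test_people = split_identities(images_by_person, test_fraction=0.4)
train_pairs = make_pairs(images_by_person, train_people)
test_pairs = make_pairs(images_by_person, test_people)
```

Because the split is made over people rather than over images, every image in a test pair belongs to someone the algorithm has never trained on, which is the property the benchmark description asks for.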


The database is made available only for non-commercial use. If you use this dataset, please cite the following paper:

"Attribute and Simile Classifiers for Face Verification,"
Neeraj Kumar, Alexander C. Berg, Peter N. Belhumeur, and Shree K. Nayar,
International Conference on Computer Vision (ICCV), 2009.
author = {N. Kumar and A. C. Berg and P. N. Belhumeur and S. K. Nayar},
title = {{A}ttribute and {S}imile {C}lassifiers for {F}ace {V}erification},
booktitle = {IEEE International Conference on Computer Vision (ICCV)},
month = {Oct},
year = {2009}


Related Projects