Face Recognition for Web-Scale Datasets

E. G. Ortiz and B. C. Becker.  "Face Recognition for Web-Scale Datasets". ELSEVIER Computer Vision and Image Understanding, 2013.

Related Publication:
B. C. Becker and E. G. Ortiz. “Evaluating Open-Universe Face Identification on the Web”. IEEE CVPR Workshop on Analysis and Modeling of Faces and Gestures, 2013. [Project Page]

Open-Universe Face Identification


With the increasing pervasiveness of digital cameras, the Internet, and social networking, there is a growing need to catalog and analyze large collections of photos. Because photo interest is largely determined by who appears in the picture, labeling photos with identities is particularly important. In fact, popular social networks such as Facebook allow users to place tags on photos to label people, encouraging collaboratively organized photo albums amongst friends. Imagine millions of social network users needing to tag their photos: such web-scale labeling problems present a real challenge and fascinating opportunity for automation by face recognition.

Linearly Approximated Sparse Representation-based Classification (LASRC)

Method details coming soon.



SRC-based methods outperform all other methods in terms of precision and recall. Notably, our method performs comparably to standard SRC while running about 100x faster, as seen in the timing diagram below.
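To make the precision and recall comparison concrete, here is a minimal sketch of how the two measures could be computed in an open-universe setting, where some test faces belong to people outside the gallery. The definitions are assumptions following one common convention, not necessarily the exact protocol of the paper: a prediction is accepted when its confidence meets a threshold, precision is the fraction of accepted predictions that are correct, and recall is the fraction of known-identity faces that are correctly identified. The names `Prediction` and `precision_recall` and the sample data are hypothetical.

```python
from typing import List, Optional, Tuple

# (predicted_label, confidence, true_label); true_label is None for a
# distractor, i.e. a face whose identity is not in the gallery (assumption).
Prediction = Tuple[str, float, Optional[str]]


def precision_recall(preds: List[Prediction], threshold: float) -> Tuple[float, float]:
    """Return (precision, recall) after rejecting low-confidence predictions."""
    accepted = [p for p in preds if p[1] >= threshold]
    # A prediction is correct only if the face is known and the label matches.
    correct = sum(1 for label, _, truth in accepted
                  if truth is not None and label == truth)
    known = sum(1 for _, _, truth in preds if truth is not None)
    # Convention chosen here: precision is 1.0 when nothing is accepted.
    precision = correct / len(accepted) if accepted else 1.0
    recall = correct / known if known else 0.0
    return precision, recall


# Hypothetical toy data, for illustration only.
preds = [
    ("alice", 0.9, "alice"),  # correct
    ("bob", 0.8, "bob"),      # correct
    ("alice", 0.6, None),     # distractor wrongly labeled
    ("bob", 0.4, "alice"),    # known face, wrong label
]

for t in (0.0, 0.5, 0.7):
    p, r = precision_recall(preds, t)
    print(f"threshold={t:.1f} precision={p:.2f} recall={r:.2f}")
```

Sweeping the threshold traces out a precision-recall curve: raising it rejects the distractor and the misidentified face, which raises precision without changing recall in this toy example.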


Timeline of all steps in the entire face recognition system. All times are reported for a single core of a 2.27 GHz machine.

Facebook and PubFig+LFW Datasets

To allow researchers to improve and build upon our work, we have released:

  • Facebook Dataset Features [4.3GB]: 800,000 faces in an anonymized HOG+Gabor+LBP feature representation, allowing direct comparison of new classification algorithms on preprocessed, aligned, and feature-extracted faces in a real-world, web-scale scenario.
  • PubFig+LFW Raw Images [3.7GB] and PubFig+LFW Features [1.6GB]: Both raw images and feature vectors for the open-universe scenario, allowing any improvements to be directly compared to our results.
  • Raw Intermediate Results (Facebook Results [4.8GB] and PubFig+LFW Results [0.9GB]): We have released the raw classification results used to generate our figures and tables; this data can be loaded into our MATLAB toolbox for further analysis.

Facebook Face Downloader

(For Research Purposes Only)

For researchers who need raw images, our Facebook Face Downloader (35 MB) downloads images, tags, and metadata from Facebook at a rate of 20,000 photos/hour and matches, extracts, and aligns faces at a rate of 5,000 faces/hour. This approach provides much more data and freedom than if we had simply released our raw face images. Researchers can directly compare to all methods in the paper by introducing their data and methods into our evaluation toolbox.
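As a rough planning aid, the quoted rates can be turned into time estimates. The sketch below assumes both rates are constant averages, which may not hold in practice, and the collection targets are hypothetical values chosen only for illustration.

```python
# Rates quoted on this page, treated as constant averages (an assumption).
DOWNLOAD_PHOTOS_PER_HOUR = 20_000
EXTRACT_FACES_PER_HOUR = 5_000


def hours_to_download(num_photos: int) -> float:
    """Estimated hours to download the given number of photos."""
    return num_photos / DOWNLOAD_PHOTOS_PER_HOUR


def hours_to_extract(num_faces: int) -> float:
    """Estimated hours to match, extract, and align the given number of faces."""
    return num_faces / EXTRACT_FACES_PER_HOUR


# Hypothetical targets, for illustration only.
print(hours_to_download(100_000))  # 5.0
print(hours_to_extract(100_000))   # 20.0
```

At these rates, face extraction is the slower step whenever the number of faces to process is more than a quarter of the number of photos to download.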

This video shows how easy it is for researchers to create their own datasets from Facebook.

MATLAB Face Recognition Toolbox

To foster future research and improvements, we are releasing a full MATLAB Face Recognition Evaluator (25 MB) that includes our LASRC algorithm as well as all the others we compared against in this study: NN, SVM, SVM-KNN, SRC, MTJSRC, LLC, KNN-SRC, LRC, L2, and CRC_RLS.

  • fbCreateFaceDatasets: Generates datasets from raw images downloaded from Facebook or any other source by extracting features and creating correct data splits for input to the experimental stage.
  • fbRunExperiments: Runs all specified algorithms on the data generated in the previous stage. A sampling of algorithms is shown in the figure below.
  • fbReportResults: Generates graphs and tables for the specified algorithms run during the previous stage.
We have included a small subset of the PubFig+LFW dataset we created for demonstration purposes. Please see runme.m in the matlab directory.

FRE Framework



Matlab Face Recognition Toolbox (FRT) [2.7MB]
(Download data below. To download PF83+LFW, go to the Project Page.)


PubFig+LFW Raw Images[3.7GB]
PubFig+LFW Features [1.6GB]
PubFig+LFW Results [0.9GB]
