
A Comprehensive Survey of Face Databases for
Constrained and Unconstrained Environments

Siddheshwar S. Gangonda, Research Scholar, Dept. of E&TC, SKN Sinhgad COE, Pandharpur, Maharashtra, India.
Prashant P. Patavardhan, Professor, Dept. of ECE, KLS Gogte Institute of Technology, Belagavi, Karnataka, India.
Kailash J. Karande, Professor, Dept. of E&TC, SKN Sinhgad COE, Pandharpur, Maharashtra, India.

Abstract—Face recognition has received considerable attention due to its numerous applications in fields such as computer vision, security, pattern recognition and computer graphics, yet it remains a challenging and active research area. In this paper, we present a comprehensive survey of face databases for constrained and unconstrained environments. Face databases are used for testing face detection and recognition algorithms, and they have been designed to evaluate the effectiveness of such algorithms. The paper focuses mostly on novel databases that are freely available for research purposes. Most of the popular face databases are briefly introduced and compared.

Keywords—face recognition, face database, expression, occlusion.

I. INTRODUCTION

Over the last few years, research in face recognition has moved from 2D to 3D. The need for 3D face data has resulted in the need for 3D databases. In this paper, we first give an introduction to publicly available 2D and 3D face databases for constrained and unconstrained environments. The existence of many databases demands a quantitative comparison, in order to compare more accurately the performances of the various algorithms available in the literature [8], [9], [11], [15], [16]. The development of algorithms robust to changes in illumination, pose, facial expression, age and occlusion requires databases of sufficient size that include carefully controlled variations of these factors. Common databases are also required to comparatively analyze algorithms.
Presently, there are many databases utilized for facial recognition that vary in lighting conditions, size, pose, expressions, the number of imaged subjects and occlusions. The earliest facial databases mostly consisted of frontal images, such as the local data set acquired from 115 people at Brown University, used in early work in 1987. Later facial databases captured variations in pose, lighting, imaging angle, ethnicity, gender and facial expression. Some of the most recent databases capture variations in image size, compression and occlusion, and are gathered from sources such as social media and the web [11], [12].
In this paper, we present a comprehensive survey of face databases for constrained and unconstrained environments. Section II gives an overview of various face databases, focusing mostly on novel databases that are freely available for research purposes. Section III describes some of the recent face databases. Section IV compares the various popular face databases. Finally, Section V concludes the paper.

II. FACE DATABASES

Over the past few decades, a large number of face databases have been designed to analyze the effectiveness of face recognition algorithms. A brief introduction to selected databases follows. In most cases, a link to download the database is provided.

A. The AR database

The AR database [13] is one of the very few publicly available databases that contain real occlusions. It consists of more than 4,000 color images of the faces of 126 people (70 men and 56 women). These images exhibit different variations in facial expression, lighting conditions and occlusion (i.e., sunglasses and scarves). They were captured under strictly controlled conditions. No restrictions on wear (clothes, glasses, etc.), makeup, hair style, etc. were imposed on the subjects. For each subject, 26 images in total were captured in two sessions (two weeks apart) [1].
The limitations of the AR database are that it contains only two types of occlusion, i.e., sunglasses and scarf, and the location of the occlusion is either on the upper face or the lower face. This database can be downloaded from the link http://rvl1.ecn.purdue.edu/~aleix/aleix_face_DB.html [13].

Fig. 1 Sample images of two sessions from the AR database [2].

B. The Extended Yale B database

It consists of 2,414 frontal face images of 38 persons under 64 different lighting conditions. For every subject in a particular pose, an image with ambient (background) illumination was also captured. The images are grouped into subsets according to the lighting angle with respect to the camera axis. Subsets 1 and 2 cover the angular range 0° to 25°, Subset 3 covers 25° to 50°, Subset 4 covers 50° to 77°, and Subset 5 covers angles larger than 78°. In order to simulate various levels of contiguous occlusion, the most commonly used scheme is to replace a randomly located square patch in each test image with a baboon image, whose texture is similar to that of a human face. The location of the occlusion is randomly chosen, and the size of the synthetic occlusion varies in the range of 10% to 50% of the original image [2], [14].

Fig. 2 Sample images from the Extended Yale B database with randomly located occlusions: a) Subset 1, b) Subset 2, c) Subset 3, d) Subset 4 and e) Subset 5 [2].
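The synthetic-occlusion scheme described above can be sketched in a few lines of code. This is a minimal illustration only: a random-noise patch stands in for the baboon occluder image the survey mentions, and the function name and crop size are illustrative assumptions.

```python
# Sketch of the synthetic-occlusion scheme: replace a randomly located
# square patch of a test image with an occluder. The survey pastes a
# baboon image; here a random-noise patch stands in (an assumption made
# to keep the example self-contained).
import numpy as np

def occlude(image, occlusion_ratio, rng=None):
    """Return a copy of `image` with a random square patch covered.

    `occlusion_ratio` is the fraction of the image area to occlude
    (the survey varies it from 0.10 to 0.50).
    """
    rng = rng if rng is not None else np.random.default_rng()
    h, w = image.shape[:2]
    # Side length of a square whose area is `occlusion_ratio` of the image.
    side = min(int(round(np.sqrt(occlusion_ratio * h * w))), h, w)
    # Random top-left corner such that the patch fits inside the image.
    top = rng.integers(0, h - side + 1)
    left = rng.integers(0, w - side + 1)
    out = image.copy()
    # Stand-in occluder texture; a real experiment would paste a baboon crop.
    out[top:top + side, left:left + side] = rng.integers(
        0, 256, size=(side, side), dtype=image.dtype)
    return out

# Commonly used Extended Yale B crop size (192x168).
face = np.zeros((192, 168), dtype=np.uint8)
occluded = occlude(face, 0.30, rng=np.random.default_rng(0))
print(occluded.shape)  # (192, 168)
```

The original image is left untouched; only the returned copy carries the occlusion, so the same test image can be reused at several occlusion ratios.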

C. The FRGC database

The most popular 3D expression databases are the “Face Recognition Grand Challenge” (FRGC) databases. The Grand Challenge probably had a large impact on the advancement of face recognition algorithms, so it is also considered a reference database for the validation of 3D face recognition algorithms. The FRGC database contains 8,014 images from 466 subjects captured in different sessions. For each subject in each session, there are four controlled still images, two uncontrolled still images, and one 3D image. The still images contain variations such as lighting and expression changes, time lapse, etc.
The unconstrained images were captured under varying lighting conditions, e.g., in hallways or outdoors. Each set of unconstrained images contains two expressions, smiling and neutral. To simulate randomly located occlusions, one can replace a randomly located square patch in each image with a black block. The location of the occlusion is randomly chosen, and the size of the black block varies in the range of 10% to 50% of the original image [2]. This database can be downloaded from the link https://www.idiap.ch/software/bob.

D. The LFW database

The Labeled Faces in the Wild (LFW) [8] database is a database of face photographs designed for studying the problem of unconstrained face recognition. It contains 13,233 face images of 5,749 people collected from the Internet. These images were captured in uncontrolled environments and contain large variations in pose, expression, illumination, time lapse and various types of occlusion. The only constraint on these faces is that they were detected by the Viola-Jones face detector. Each face has been labelled with the name of the subject pictured; 1,680 of the subjects pictured have two or more distinct images in the database. The aim of face verification under the LFW protocol is to determine whether a pair of face images belongs to the same subject or not. The images are available as 250 by 250 pixel JPEG images. Most images are in color, although a few are grayscale only [4].

Fig. 3 Sample images from the LFW database: first and second row: six matched pairs from six subjects, third and fourth row: six non-matched pairs from twelve subjects [2].
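The pair-verification protocol can be sketched as follows: an algorithm maps each image of a pair to a feature vector and thresholds the distance between the two vectors. The raw-pixel "features" and the threshold value below are placeholder assumptions for illustration, not any published LFW baseline.

```python
# Sketch of the LFW pair-verification protocol: decide whether two face
# images show the same subject by thresholding the distance between their
# feature vectors. Real systems use learned embeddings; the raw-pixel
# features and the 0.5 threshold here are placeholder assumptions.
import numpy as np

def extract_features(image):
    # Placeholder feature extractor: flatten and L2-normalize raw pixels.
    v = image.astype(np.float64).ravel()
    return v / (np.linalg.norm(v) + 1e-12)

def same_subject(image_a, image_b, threshold=0.5):
    """Return True if the pair is judged a match (distance below threshold)."""
    d = np.linalg.norm(extract_features(image_a) - extract_features(image_b))
    return bool(d < threshold)

# LFW images are 250x250 pixels; an identical pair trivially matches.
a = np.full((250, 250), 128, dtype=np.uint8)
b = a.copy()
print(same_subject(a, b))  # True (distance is 0)
```

Under the protocol, an algorithm is scored by how often this same/different decision agrees with the ground-truth labels over the provided matched and non-matched pairs.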

E. CAS-PEAL Database

It consists of images of 1,040 subjects (595 men, 445 women) in seven categories: pose, expression, accessory, time, lighting, background, and distance. For the pose subset, nine cameras distributed in a semicircle around the subject were used. Images were recorded sequentially within a short time period (2 seconds) [2]. This database can be downloaded from the link http://www.jdl.ac.cn/peal/index.html.

Fig. 4 Pose variation in the CAS-PEAL database [2].

F. FERET

This database collection was a collaborative effort between Dr. Wechsler and Dr. Phillips. The images were collected under semi-controlled conditions. In order to maintain a degree of uniformity across the database, the same physical setup was used in each photography session. As the equipment had to be reassembled for each session, there is some minor variation among images collected on different dates. The database was collected in 15 sessions between August 1993 and July 1996.
It has 1,564 sets of images, for a total of 14,126 images, comprising 1,199 individuals and 365 duplicate sets of images. A duplicate set is a second set of images of a person already in the database, generally taken on a different day. This database can be downloaded from the link http://www.nist.gov/humanid/feret/. The color FERET dataset can be downloaded from the link http://www.nist.gov/humanid/colorferet/.

Fig. 5 Pose variation images from the FERET database [6].

G. Korean Face Database (KFDB)

It consists of facial imagery of a large number of Korean subjects collected under carefully controlled conditions. Images with varying pose, lighting, and facial expression were recorded. The subjects were imaged in the middle of an octagonal frame, with cameras placed up to 45° off frontal in both directions at 15° increments.

Fig. 6 Pose variation in the Korean face database [2].

H. Yale Face Database B

It was collected for the systematic testing of face recognition methods under large variations in lighting and pose. The subjects were imaged inside a geodesic dome fitted with 64 computer-controlled xenon strobes. Images of 10 people were recorded under 64 lighting conditions in nine poses. This database can be downloaded from the link http://cvc.yale.edu/projects/yalefacesB/yalefacesB.html.

Fig. 7 Yale Face Database B images from the 64 illumination conditions [6].

I. Yale Face Database

It consists of 11 images each of 15 people under different conditions: with and without glasses, changes in facial expression, and lighting variation [2]. This database can be downloaded from the link http://cvc.yale.edu/projects/yalefaces/yalefaces.html.

J. CMU Pose, Illumination, and Expression (PIE) Database

This database systematically samples a large number of pose and lighting conditions together with different facial expressions. It has made an impact on algorithm development for face recognition across pose. It consists of 41,368 images of 68 people. The RGB color images are 640×480 in size [3]. This database can be downloaded from the link http://www.ri.cmu.edu/projects/project_418.html.

Fig. 8 Illumination variation images from the PIE face database [3].

K. SCface Database

This database consists of static images of human faces. Images were taken under uncontrolled indoor conditions using five video surveillance cameras of different qualities. It has 4,160 static images of 130 people. Images from cameras of different quality recreate real-world conditions, which helps in testing the robustness of face recognition algorithms. It is freely available to the research community [3].

L. Georgia Tech Face Database

It consists of images of 50 people, stored in JPEG format. Most of the images were taken in two different sessions to capture changes in lighting conditions, facial expression, and appearance. The faces were also captured at different scales and orientations. Each image is manually labeled to indicate the position of the face within it.

M. Japanese Female Facial Expression (JAFFE) Database

This database consists of 213 images of 7 facial expressions (6 basic facial expressions + 1 neutral) posed by 10 Japanese female models. Each image has been rated on 6 emotion adjectives by 60 Japanese subjects [2]. This database can be downloaded from the link http://www.mis.atr.co.jp/~mlyons/jaffe.html.

Fig. 9 Expression variation images from the JAFFE database [2].

N. Indian Face Database

This database consists of eleven different images of each of 40 different people. All the images are stored in JPEG format. The size of each image is 640×480 pixels, with 256 grey levels per pixel. The images are arranged in two main categories, males and females. The face orientations included are: looking front, looking left, looking right, looking up, looking up towards the left, looking up towards the right, and looking down; the available emotions are: neutral, smile, laughter, and sad/disgust [3]. This database can be downloaded from the link http://www.cs.umass.edu/~vidit/facedatabase.

Fig. 10 Pose variation images from the Indian Face database [3].

O. FEI Face Database

It is a Brazilian face database consisting of a set of face images captured at the Artificial Intelligence Laboratory of FEI in Brazil. There are 14 images of each of the 200 individuals, 2,800 images in total.

P. The Bosphorus database

It is a newer 3D face database with a rich set of expressions, systematic variations of pose and various types of occlusion. It is very useful for the development and evaluation of algorithms for face recognition under adverse conditions, facial expression analysis and facial expression synthesis.

Q. FaceScrub Database

The database was collected from images available on the Internet. An automatic procedure verifies that each image belongs to the right person. It contains 107,818 images of 530 people in total. The images are provided together with name and gender annotations.

III. RECENT FACIAL DATABASES

While the early databases focused on facial detection for subject identification, more recent databases are tuned towards capturing changes in imaging modalities, facial expressions, and obscuration due to makeup. Some of the latest facial databases are as follows [7]:

A. Labelled Wikipedia Faces (LWF)
It consists of images mined from more than 0.5 million biographic entries in the Wikipedia Living People pages, and contains 8,500 faces of 1,500 subjects. The YouTube Faces Database (YFD) has 3,425 videos of 1,595 different subjects (2.15 videos per subject), with video clips ranging from 48 to 6,070 frames. It was designed to provide a collection of videos and labels for subject identification from video and for benchmarking video pair-matching techniques.

B. YouTube Makeup Dataset (YMD)

It contains images of 151 subjects (Caucasian females) from YouTube makeup tutorials, before and after subtle to heavy makeup is applied. Four shots are taken of each subject (two before and two after makeup is applied). This database has consistent lighting, but it demonstrates the challenges that changes in makeup pose for facial recognition.

C. Indian Movie Face Database (IMFD)

It contains 34,512 images of 100 Indian actors collected from about 100 videos and cropped to include variations in pose, expression, lighting, resolution, occlusion and makeup.

IV. COMPARISON OF FACE DATABASES

Different face databases have been built for the analysis of face images subject to a single one of these changes or a combination of them. The different types of face image databases are given in Table 1.

Table 1. Different types of face image databases [5]

Face Database        | Image Type (RGB/Gray Scale) | Image Size | Types of conditions
FERET                | Gray, RGB                   | 256 x 384  | i, e, p, I/O, t
The Yale Face B      | Gray Scale                  | 640 x 480  | i, p
AR Faces             | RGB                         | 576 x 768  | i, o, t, e
CMU-PIE              | RGB                         | 640 x 486  | i, e, p
The Yale Face        | Gray Scale                  | 320 x 243  | i, e
Asian face database  | RGB                         | 640 x 480  | p, e, i, o
Indian face database | RGB                         | 640 x 480  | e, p

The image conditions are denoted by p: pose, o: occlusion, i: illumination, e: expression, t: time delay, I/O: indoor/outdoor conditions.

The image size, image type and other specifications describe the complexity of a face database, which in turn indicates the robustness required of face recognition algorithms. The different face databases were created to evaluate the effect of changes in the several types of imaging conditions. AR Faces, FERET, CMU-PIE, and the Asian and Indian face databases are the most widely used 2D face image databases. Each database provides a platform to assess particular challenges of uncontrolled conditions. For example, CMU-PIE is used chiefly for illumination and pose changes. FERET provides a good testing platform for large probe and gallery sets. AR Faces provides naturally occluded face images. The Asian face database consists of 2D face images of females and males with pose, illumination, expression and occlusion variations. The Indian face database comprises face images with variations in expression and pose.

V. CONCLUSION
Face recognition in unconstrained situations is still a challenging research domain. In this paper, we have presented a comprehensive survey of face databases for constrained and unconstrained environments, focusing mostly on novel databases that are freely available for research purposes. Most of the popular face databases have been briefly introduced and compared. The purpose of this review is to assist young researchers in the area of face recognition by compiling the most widely used face databases, together with links to download them, so as to support their further research.

REFERENCES
[1] Martinez, A. M., "The AR face database", CVC Technical Report, 24, 1998.
[2] Gross, R., "Face Databases", in Handbook of Face Recognition, S. Z. Li and A. K. Jain (Eds.), New York: Springer, ISBN 0-387-40595-X, 2005.
[3] Face Recognition Homepage, www.face-rec.org/databases.
[4] Ladislav Lenc, Pavel Kral, "Unconstrained Facial Images: Database for Face Recognition under Real-world Conditions", October 2015, DOI: 10.1007/978-3-319-27101-9_26.
[5] W. Zhao, R. Chellappa, P. J. Phillips, "Face Recognition: A Literature Survey", ACM Computing Surveys (CSUR), December 2003.
[6] Siddheshwar S. Gangonda, Prashant P. Patavardhan, Kailash J. Karande, "Analysis of Face Recognition Algorithms for Uncontrolled Environments", Third International Conference on Computing, Communication and Signal Processing (ICCASP-2018), Lonere, Raigad, Maharashtra. DOI: 10.1007/978-981-13-1513-8_93.
[7] Siddheshwar S. Gangonda, Prashant P. Patavardhan, Kailash J. Karande, "An Extensive Survey of Prominent Researches in Face Recognition under Different Conditions", Fourth International Conference on Computing Communication Control and Automation (ICCUBEA), Pune, 2018. 978-1-5386-5257-2/18/$31.00 ©2018 IEEE.
[8] Gary B. Huang, Marwan Mattar, Tamara Berg, and Erik Learned-Miller, "Labeled Faces in the Wild: A Database for Studying Face Recognition in Unconstrained Environments", http://vis-www.cs.umass.edu/lfw/.
[9] Gabriel Castaneda and Taghi M. Khoshgoftaar, "A Review of Performance Evaluation on 2D Face Databases", IEEE Third International Conference on Big Data Computing Service and Applications, 2017. DOI: 10.1109/BigDataService.2017.38.
[10] S. Z. Li, R. F. Chu, S. C. Liao, and L. Zhang, "Illumination invariant face recognition using near-infrared images", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 29, No. 4, pp. 1-13, April 2007.
[11] Shwetank Arya, Neeraj Pratap, Karamjit Bhatia, "Future of Face Recognition: A Review", Second International Symposium on Computer Vision and the Internet (VisionNet'15), 2015.
[12] S. Z. Li, R. F. Chu, S. C. Liao, and L. Zhang, "Illumination invariant face recognition using near-infrared images", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 29, No. 4, pp. 1-13, April 2007.
[13] http://rvl1.ecn.purdue.edu/~aleix/aleix_face_DB.html
[14] http://cvc.yale.edu/projects/yalefaces/yalefaces.html
[15] Muhammad Sharif, Farah Naz, Mussarat Yasmin, Muhammad Alyas Shahid and Amjad Rehman, "Face Recognition: A Survey", Journal of Engineering Science and Technology Review, 10(2), pp. 166-177, 2017. ISSN: 1791-2377.
[16] M. Hassaballah, Saleh Aly, "Face recognition: challenges, achievements and future directions", IET Computer Vision, ISSN 1751-9632, DOI: 10.1049/iet-cvi.2014.0084.