Deep Convolutional Neural Network using Triplet of Faces, Deep Ensemble, and Score-level Fusion for Face Recognition

IEEE 2017 Conference on Computer Vision and Pattern Recognition
Bong-Nam Kang*, Yonghyun Kim, and Daijin Kim
* Dept. of Creative IT Engineering, Dept. of Computer Science & Engineering
{bnkang, gkyh0805, dkim}@postech.ac.kr

Introduction
Face recognition in unconstrained environments is a very challenging problem.
Inter-class variations
  [Figure: face pairs captioned "Same person?", labeled Hugh Grant, Jennifer Carpenter, Phil Mickelson, and Hope Solo]
  Facial poses, expressions, and illumination changes can cause faces of different identities to be misidentified as the same identity.
Intra-class variations
  Such variations within the same identity can overwhelm the variations due to identity differences, which makes face recognition challenging.

Overview
[Figure: pipeline diagram with two stages]
Training for Features
  Triplets of Faces -> Deep Neural Network (DNN), Feature Learning -> Discriminative Features
Test
  Input Face Images -> Multiple DNNs, Feature Extraction -> TL-Joint Bayesian Classifier, Classification -> Same or Not?

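Discriminative Feature Learning (1/2)
Joint Loss Function for Feature Learning
  $L_{total} = L_{triplets} + L_{pairs} + L_{identity}$
Triplet Loss $L_{triplets}$
  $L_{triplets} = \max\left(0,\ 1 - \frac{\|F(I_a) - F(I_n)\|_2}{\|F(I_a) - F(I_p)\|_2 + m}\right)$
  The output of the network is represented by $F(I) \in \mathbb{R}^d$; $I_a$, $I_p$, and $I_n$ denote the anchor, positive, and negative faces of a triplet.
  $m$ is a margin: it defines the minimum ratio between the distances of the negative pairs and the positive pairs in the Euclidean space.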
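
A minimal sketch of the ratio-based triplet loss on this slide, in plain Python. It assumes the form max(0, 1 - d(anchor, negative) / (d(anchor, positive) + m)) with Euclidean distances. The function names, the toy 2-D features, and the margin value m = 1.0 are illustrative assumptions, not values from the slides.

```python
import math


def l2_distance(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_ratio_loss(anchor, positive, negative, m):
    """Loss for one triplet: zero once d_neg / (d_pos + m) reaches 1."""
    d_pos = l2_distance(anchor, positive)
    d_neg = l2_distance(anchor, negative)
    return max(0.0, 1.0 - d_neg / (d_pos + m))


# Toy features (hypothetical): the negative is far enough in the first case
# and too close in the second.
anchor = [0.0, 0.0]
positive = [3.0, 4.0]   # distance 5 from the anchor
far_negative = [6.0, 8.0]   # distance 10 from the anchor
near_negative = [0.0, 3.0]  # distance 3 from the anchor

print(triplet_ratio_loss(anchor, positive, far_negative, m=1.0))
print(triplet_ratio_loss(anchor, positive, near_negative, m=1.0))
```

Because the loss depends only on the ratio of the two distances, scaling all features by a constant changes it only through the margin term.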

Discriminative Feature Learning (2/2)
Pairwise Loss $L_{pairs}$
  Minimize the absolute distances between the positive data in the triplets $T$.
  $L_{pairs} = \sum_{(I_a, I_p) \in T} \|F(I_a) - F(I_p)\|_2^2$
Identity Loss $L_{identity}$
  Use the negative log-likelihood loss with softmax.
  Reflect the characteristics of each identity.
  Encourage the separability of features.
  $L_{identity} = -\sum_{i=1}^{m} \log \frac{e^{F_i(I_i)}}{\sum_{j=1}^{n} e^{F_j(I_i)}}$
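
A minimal sketch of the pairwise and identity terms and of their unweighted sum, in plain Python. It assumes the pairwise term sums squared Euclidean distances over the positive pairs and that the identity term is the usual softmax negative log-likelihood over per-identity scores. All names and toy values are hypothetical; the triplet term is passed in as a precomputed number.

```python
import math


def pairwise_loss(positive_pairs):
    """Sum of squared L2 distances over (anchor, positive) feature pairs."""
    return sum(
        sum((a - p) ** 2 for a, p in zip(f_anchor, f_positive))
        for f_anchor, f_positive in positive_pairs
    )


def identity_loss(scores_per_image, labels):
    """Softmax negative log-likelihood summed over images.

    scores_per_image[i][j] is the network score of image i for identity j,
    labels[i] is the index of the true identity of image i.
    """
    total = 0.0
    for scores, label in zip(scores_per_image, labels):
        top = max(scores)  # subtract the max for numerical stability
        log_norm = top + math.log(sum(math.exp(s - top) for s in scores))
        total -= scores[label] - log_norm
    return total


def total_loss(l_triplets, l_pairs, l_identity):
    """Joint loss as an unweighted sum of the three terms."""
    return l_triplets + l_pairs + l_identity


pairs = [([0.0, 0.0], [3.0, 4.0])]   # one positive pair, squared distance 25
scores = [[0.0, 0.0]]                # one image, two identities, equal scores
print(total_loss(0.5, pairwise_loss(pairs), identity_loss(scores, [0])))
```

The pairwise term is what constrains absolute distances; the ratio-based triplet term alone leaves the overall scale of the distances free.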

Description using Deep Ensemble (1/2)
Features for Ensemble
  [Figure: Net. #1 and Net. #2, each with fully connected layers f1 and f2]
  In conventional applications of DCNNs, only the output of the last fully connected layer, f2, is used as a feature.
  Use DNN features taken from both the f1 and f2 fully connected layers.
  -> Multi-scale feature effect

Description using Deep Ensemble (2/2)
Deep Ensemble
  Concatenation
  Dimensionality reduction by PCA
[Figure: feature extraction from multiple neural networks and deep ensemble generation]
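
A minimal sketch of the two steps named on this slide: concatenating the f1 and f2 features of several networks, then reducing the dimension with PCA. To stay dependency-free, PCA is done here by power iteration with deflation on the covariance matrix, which stands in for whatever PCA routine the authors used. The feature sizes, the number of networks, and all names are toy assumptions.

```python
import math


def concat_features(per_network_features):
    """Concatenate the (f1, f2) feature vectors of every network."""
    ensemble = []
    for f1, f2 in per_network_features:
        ensemble.extend(f1)
        ensemble.extend(f2)
    return ensemble


def fit_pca(rows, k, iters=500):
    """Return the mean and the top-k principal directions of the rows."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(x[a] * x[b] for x in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    components = []
    for c in range(k):
        v = [1.0 / (j + 1 + c) for j in range(d)]  # deterministic start vector
        for _ in range(iters):  # power iteration toward the top eigenvector
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        eigval = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d))
                     for a in range(d))
        components.append(v)
        # Deflation: remove the direction just found before the next pass.
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    return mean, components


def project(vec, mean, components):
    """Coordinates of a vector in the reduced PCA space."""
    return [sum((x - m) * c for x, m, c in zip(vec, mean, comp))
            for comp in components]


# Four toy faces, two networks each, with f1 of size 2 and f2 of size 1.
faces = [
    [([1.0, 0.0], [0.0]), ([0.0, 1.0], [0.0])],
    [([2.0, 0.0], [1.0]), ([0.0, 1.0], [1.0])],
    [([3.0, 1.0], [0.0]), ([1.0, 0.0], [2.0])],
    [([4.0, 1.0], [3.0]), ([1.0, 2.0], [0.0])],
]
rows = [concat_features(face) for face in faces]
mean, components = fit_pca(rows, k=2)
print(len(rows[0]), "->", len(project(rows[0], mean, components)))
```

In practice a library PCA would replace `fit_pca`; the point of the sketch is only the order of operations, concatenate first and reduce afterwards.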

Fusion
Score-level fusion
  Use the similarities of the DCNN ensemble and the similarities of high-dim. LBP as features.
  Use a Support Vector Machine (SVM) as the classifier (recognizer).
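
A minimal sketch of score-level fusion: each face pair is described by a short vector of similarity scores, and a linear SVM decides "same" or "not same". The tiny hand-written SVM below (subgradient descent on the regularized hinge loss) is a stand-in for a library SVM. The slides do not state the kernel or hyperparameters, and the two-score toy data, learning rate, and regularization strength are assumptions.

```python
def train_linear_svm(samples, labels, lr=0.1, reg=0.01, epochs=200):
    """Fit weights and bias by subgradient descent on the hinge loss."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):  # y is +1 (same) or -1 (different)
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score < 1:  # margin violated: hinge term is active
                w = [wi - lr * (reg * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:  # only the regularizer contributes
                w = [wi - lr * reg * wi for wi in w]
    return w, b


def predict_same(w, b, x):
    """True if the fused score says the pair shows the same identity."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b > 0


# Hypothetical similarity scores per pair: [DCNN ensemble, high-dim. LBP].
train_scores = [[0.9, 0.8], [0.8, 0.9], [0.85, 0.7],
                [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]
train_labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(train_scores, train_labels)
print(predict_same(w, b, [0.95, 0.9]), predict_same(w, b, [0.05, 0.05]))
```

Fusing at the score level keeps the two descriptors independent: each produces its own similarity, and only those few numbers reach the final classifier.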

Experimental Results (1/2)
Training data
  4,048 subjects with 10 or more images each (198,018 images in total).
  396,036 face images (including horizontally flipped copies) are used to generate about 4M triplets of faces for training.
Test data - LFW (Labeled Faces in the Wild)
  Each of the 10 folds consists of 300 intra (same-identity) pairs and 300 extra (different-identity) pairs (total: 6,000 pairs).
  10-fold cross validation.
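
A minimal sketch of how a 10-fold verification protocol of this kind can be scored. It assumes each pair has a single similarity score and that, for every held-out fold, the decision threshold is picked on the other nine folds; the slides do not say how the per-fold decision rule was fit, so this is one common choice rather than the authors' procedure. The toy folds hold 6 pairs each instead of the real 600.

```python
def accuracy(pairs, threshold):
    """Fraction of (score, is_same) pairs classified correctly."""
    correct = sum((score >= threshold) == is_same for score, is_same in pairs)
    return correct / len(pairs)


def best_threshold(pairs):
    """Lowest observed score that maximizes accuracy on the given pairs."""
    candidates = sorted({score for score, _ in pairs})
    return max(candidates, key=lambda t: accuracy(pairs, t))


def cross_validate(folds):
    """Mean held-out accuracy, with the threshold fit on the other folds."""
    accuracies = []
    for k, held_out in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != k for p in fold]
        accuracies.append(accuracy(held_out, best_threshold(train)))
    return sum(accuracies) / len(accuracies)


# Toy data (hypothetical): 10 folds with 3 same and 3 different pairs each.
# The real protocol has 10 folds x 600 pairs = 6,000 pairs.
folds = [
    [(0.7, True), (0.8 + 0.01 * k, True), (0.9, True),
     (0.1, False), (0.2 + 0.005 * k, False), (0.3, False)]
    for k in range(10)
]
print(cross_validate(folds))
```

Fitting the threshold only on the training folds keeps the held-out accuracy an honest estimate.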

Experimental Results (2/2)
Results on LFW

Method                                No. of images  No. of DNNs  Feature dim.  Accuracy (%)
Human                                 -              -            -             97.53
Joint Bayesian                        99,773         -            8,000         92.42
Fisher vector face                    N/A            -            256           93.03
Tom-vs-Pete classifier                20,639         -            5,000         93.30
High-dim. LBP                         99,773         -            2,000         95.17
TL-Joint Bayesian                     99,773         -            2,000         96.23
DeepFace                              4M             9            4,096 x 4     97.25
DeepID                                202,599        120          150 (PCA)     97.45
DeepID3                               300,000        50           300 x 100     99.53
FaceNet                               200M           1            128           99.63
Learning from Scratch                 494,414        2            320           97.73
Proposed Method (+Joint Bayesian)     198,018        4            1,024 (PCA)   96.23
Proposed Method (+TL-Joint Bayesian)  198,018        4            1,024 (PCA)   98.33
Proposed Method (Fusion)              198,018        4            6             99.08

Discussion & Conclusion
The joint loss function is effective for learning discriminative features.
The proposed method is more efficient:
  Small amount of data: only 198,018 training images
  Only 4 different deep network models used
  Accuracy: 99.08% (score-level fusion)
The proposed method is useful when the amount of training data is insufficient to train deep neural networks.

Thank you!!!

Appendix

Deep Convolution Neural Network Architecture
Multi-scale Convolution Layer Block (MCLB)
  Consists of 1x1, 3x3, and 5x5 convolution layers and a 3x3 max pooling layer.
Structure of the Deep Convolution Neural Network
  Constructed by stacking MCLBs on top of each other (24 layers deep).
  All convolution and fully connected layers use the ReLU non-linear activation.
  Average pooling takes the average of each feature map and sums out the spatial information.
  Dropout is applied only to the last fully connected layer for regularization.

Discriminative Feature Learning
After training with only the triplet loss, we observed that the distances between the data of each pair did not stay within a fixed range.
Although the ratio of the distances stayed within a certain range, the absolute distances did not.
[Figure: two triplets, each with a distance ratio of 1.86]
The ratios of the distances are the same.
A pair is judged positive (same identity) using a distance threshold of 1.25.
These two images have the same identity. Acceptable?
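
A small worked example of the problem described on this slide. Two triplets share the ratio 1.86 quoted there, but their absolute distances differ, so a fixed distance threshold treats their positive pairs differently. The four distance values are hypothetical, and reading the 1.25 threshold as "at or below 1.25 means same identity" is an assumption, since the comparison symbol did not survive in the slide text.

```python
THRESHOLD = 1.25  # from the slide; the "at or below" direction is assumed

# Hypothetical (positive distance, negative distance) for two triplets.
triplets = {
    "triplet A": (0.50, 0.93),
    "triplet B": (1.50, 2.79),
}


def ratio(d_pos, d_neg):
    """Ratio of negative-pair distance to positive-pair distance."""
    return d_neg / d_pos


def accepted_as_same(distance, threshold=THRESHOLD):
    """Decision of a fixed-threshold verifier on one pair distance."""
    return distance <= threshold


for name, (d_pos, d_neg) in triplets.items():
    print(name, round(ratio(d_pos, d_neg), 2), accepted_as_same(d_pos))
```

Both triplets satisfy a ratio-only criterion equally well, yet the positive pair of triplet B would be rejected by the threshold. This is the gap the pairwise loss is meant to close by penalizing large absolute distances between positive pairs.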

Experimental Results
Results of Joint Loss Function on Validation Set
  55,747 face images are used as a validation set.

Method                                         Accuracy (%)  Error reduction
DNN + L_{identity} (baseline)                  88.17         -
DNN + L_{triplets} + L_{identity}              91.32         26.62%
DNN + L_{triplets} + L_{pairs} + L_{identity}  93.45         44.63%
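
A short check of the "Error reduction" column, assuming it means the relative drop in error rate (100 minus accuracy) with respect to the identity-only baseline. The function name is hypothetical. Under that assumption the recomputed values should land within a few hundredths of a percentage point of the reported ones; small differences can come from rounding of the published accuracies.

```python
def error_reduction(baseline_acc, new_acc):
    """Relative error-rate reduction in percent, given accuracies in percent."""
    baseline_err = 100.0 - baseline_acc
    new_err = 100.0 - new_acc
    return 100.0 * (baseline_err - new_err) / baseline_err


# Accuracies taken from the validation-set table.
print(round(error_reduction(88.17, 91.32), 2))  # triplet + identity
print(round(error_reduction(88.17, 93.45), 2))  # triplet + pairs + identity
```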