Deep Learning Tutorial: Unsupervised Feature Learning
Joana Frontera-Pons
4th September 2017, Workshop on Dictionary Learning on Manifolds
OUTLINE
- Introduction
- Representation Learning
- TensorFlow
- Examples
DEEP LEARNING
Deep Learning: unsupervised learning methods that can learn invariant feature hierarchies.
Non-linear representations obtained with deep layer structures bring out complex relationships and disentangle the factors of variation in the inputs.
How? AutoEncoders, ConvNets, Deep Belief Networks.
What kind of representations can the model extract?
DEEP LEARNING
[Figure: deep network architecture with Hidden layer 1, Hidden layer 2, Hidden layer 3]
MOTIVATION
Linear representations are frequently used to model images:

$x = D\alpha$

where $x$ is the original signal or image, $D$ is a transformation matrix (e.g. Discrete Cosine Transform, wavelet transform, Principal Component Analysis, K-SVD dictionary), and $\alpha$ is the feature vector.
In this context, linear representations are the most widely spread approaches for denoising, compression, inverse problem solving, ...
They fail at capturing some common variations in the data such as translation, rotation, and zoom.
New adaptive non-linear representations are needed.
AUTOENCODERS
Autoencoders are artificial neural networks capable of learning efficient representations of the input data without supervision. They are powerful feature extractors.
They work by learning to copy their inputs to their outputs.
We have to constrain the network to prevent the model from learning the identity: limit the size of the representation, add noise, ...
They find their purpose in: dimensionality reduction, feature extraction, unsupervised pretraining, or as generative models.
AUTOENCODERS
Basic autoencoders: a parametric encoding function maps inputs to their representations, and a decoding function maps them back to input space. The model is trained to reconstruct the input as accurately as possible.
- Input: $x \in \mathbb{R}^d$
- Encoder: $h = f(x) = \sigma(W_f^T x + b_f)$, with hidden representation $h \in \mathbb{R}^{n_{\mathrm{hid}}}$
- Decoder: $\hat{x} = g(h) = \sigma(W_g^T h + b_g)$, with reconstruction $\hat{x} \in \mathbb{R}^d$
- Reconstruction error $L(x, \hat{x})$: mean squared error or binary cross-entropy.
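To make the encoder/decoder pair concrete, here is a minimal sketch in TensorFlow 1.x, the API current when this tutorial was given. The layer sizes, the weight initialisation, and the choice of Adam as optimizer are illustrative assumptions, not taken from the slides:

```python
import tensorflow as tf

d, n_hid = 784, 128  # input and hidden sizes (e.g. flattened MNIST digits)

X = tf.placeholder(tf.float32, shape=[None, d])

# Encoder: h = f(x) = sigmoid(W_f^T x + b_f)
W_f = tf.Variable(tf.random_normal([d, n_hid], stddev=0.01))
b_f = tf.Variable(tf.zeros([n_hid]))
h = tf.nn.sigmoid(tf.matmul(X, W_f) + b_f)

# Decoder: x_hat = g(h) = sigmoid(W_g^T h + b_g)
W_g = tf.Variable(tf.random_normal([n_hid, d], stddev=0.01))
b_g = tf.Variable(tf.zeros([d]))
X_hat = tf.nn.sigmoid(tf.matmul(h, W_g) + b_g)

# Reconstruction error L(x, x_hat): mean squared error
loss = tf.reduce_mean(tf.square(X_hat - X))
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
```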
WEIGHT FILTERS
[Figure: weight filters learned by PCA vs. a denoising autoencoder (DAE)]
STACKED AUTOENCODERS
Greedy layer-wise initialization:
- Pre-train all the layers using unsupervised feature learning,
- Add one level at a time,
- Build a hierarchy of representations.
Stacking layers of autoencoders (see the sketch below):
- Pairs of encoders/decoders are combined to form a global encoder and a global decoder,
- The deep autoencoder is jointly trained and optimized for an overall reconstruction error.
[Figure: stacked network with Inputs, Hidden 1, Hidden 2, Hidden 3, Outputs]
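A minimal sketch of the jointly trained stack, again assuming TensorFlow 1.x; the greedy layer-wise pre-training step is omitted, and all layer sizes are illustrative:

```python
import tensorflow as tf

X = tf.placeholder(tf.float32, shape=[None, 784])

# Global encoder: three stacked hidden layers
h1 = tf.layers.dense(X, 300, activation=tf.nn.sigmoid)   # Hidden 1
h2 = tf.layers.dense(h1, 150, activation=tf.nn.sigmoid)  # Hidden 2
h3 = tf.layers.dense(h2, 50, activation=tf.nn.sigmoid)   # Hidden 3 (the code)

# Global decoder: mirror of the encoder
d2 = tf.layers.dense(h3, 150, activation=tf.nn.sigmoid)
d1 = tf.layers.dense(d2, 300, activation=tf.nn.sigmoid)
X_hat = tf.layers.dense(d1, 784, activation=tf.nn.sigmoid)

# Joint training: one overall reconstruction error for the whole stack
loss = tf.reduce_mean(tf.square(X_hat - X))
train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)
```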
REGULARIZED AUTOENCODERS
Basic autoencoders may learn the identity function to minimize the reconstruction error.
Impose some constraint on the code to force the representation to be insensitive to local variations: regularization.
Examples: contractive AE, sparse AE, DAE, etc. (a sparse-penalty sketch follows after this slide).

DENOISING AUTOENCODERS [Vincent 2008]
Modify the training objective to retrieve a clean input from an artificially corrupted version of it.
Make the transformation robust to small random perturbations in the input.
Corruption noise: Gaussian, salt-and-pepper, masking, or an adaptive corruption process.
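As one concrete example of the regularized variants listed above, a sparse autoencoder can be obtained by adding an L1 penalty on the code to the loss of the basic autoencoder sketch from earlier (reusing its X, h and X_hat); the penalty weight is an illustrative assumption:

```python
# Sparse AE: penalize the code h so that few hidden units activate per input.
# Continues the basic autoencoder graph above (X, h, X_hat already defined).
sparsity_weight = 1e-3                     # illustrative hyperparameter
recon_loss = tf.reduce_mean(tf.square(X_hat - X))
sparsity_loss = tf.reduce_mean(tf.abs(h))  # L1 penalty on the representation
loss = recon_loss + sparsity_weight * sparsity_loss
```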
DENOISING AUTOENCODERS
[Figure: clean input $x$, its corrupted version $\tilde{x}$, and the reconstruction $\hat{x}$]
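The corruption step itself is easy to express in the graph. Here is a sketch of two of the noise types listed above, with illustrative noise levels, assuming TensorFlow 1.x:

```python
import tensorflow as tf

X = tf.placeholder(tf.float32, shape=[None, 784])  # clean input x

# Masking noise: randomly zero out a fraction of the input entries
keep_prob = 0.7  # illustrative corruption level
mask = tf.cast(tf.random_uniform(tf.shape(X)) < keep_prob, tf.float32)
X_tilde_masked = X * mask

# Gaussian noise: add white noise to the input
X_tilde_gauss = X + tf.random_normal(tf.shape(X), stddev=0.3)

# The encoder is fed the corrupted X_tilde, but the reconstruction error
# still compares x_hat to the *clean* input X:
#     loss = tf.reduce_mean(tf.square(X_hat - X))
```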
BASICS TENSORFLOW
TensorFlow is an open-source software library for numerical computation developed by the Google Brain team.
The user defines a graph of computations in Python, and then TF takes that graph and runs it efficiently using optimised C++.
[Figure: computation graph for $f(x, y) = x^2 y + y + 2$, built from Variable nodes $x$ and $y$, a Constant node $2$, and Operation nodes]
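The graph from the figure can be built and evaluated as follows, assuming TensorFlow 1.x:

```python
import tensorflow as tf

# Construction phase: define the graph for f(x, y) = x^2 * y + y + 2
x = tf.Variable(3, name="x")
y = tf.Variable(4, name="y")
f = x * x * y + y + 2

# Execution phase: run the graph in a session
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(f))  # 3*3*4 + 4 + 2 = 42
```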
BASICS TENSORFLOW
TensorFlow programs are typically split into two parts:
- Construction phase: builds a computation graph representing the machine learning model and the computations required to train it.
- Execution phase: runs a loop that evaluates a training step repeatedly (a skeleton follows below).
[TensorBoard example]
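A minimal execution-phase skeleton for the autoencoder sketched earlier, reusing its X placeholder, train_op and loss; X_train stands for any (n_samples, 784) array, and the epoch and batch sizes are illustrative:

```python
n_epochs, batch_size = 20, 128  # illustrative values

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # construction is done; now execute
    for epoch in range(n_epochs):
        # one training step per mini-batch
        for i in range(0, X_train.shape[0], batch_size):
            sess.run(train_op, feed_dict={X: X_train[i:i + batch_size]})
        print("epoch", epoch, "loss", sess.run(loss, feed_dict={X: X_train}))
```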
EXPERIMENTAL RESULTS
The Modified National Institute of Standards and Technology database (MNIST) is a large database of handwritten digits.
The training set contains 60,000 samples and the test set 10,000 samples.
BUILD FEATURES THAT CAPTURE THE VARIATIONS ALONG THE DIFFERENT DIGITS
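At the time of this tutorial, TensorFlow 1.x shipped a small helper for downloading MNIST; the path below is an illustrative choice, and note that the helper splits the 60,000 training images into 55,000 train and 5,000 validation samples:

```python
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("/tmp/mnist")  # downloads on first use
X_train = mnist.train.images  # shape (55000, 784), pixel values in [0, 1]
X_test = mnist.test.images    # shape (10000, 784)
```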
EXPERIMENTAL RESULTS
[Figure: autoencoder pipeline on MNIST: input digit $x$ encoded to $h$, decoded to reconstruction $\hat{x}$]
Thank you for your attention!