Lecture #17: Autoencoders and Random Forests with R. Mat Kallada Introduction to Data Mining with R



1 Lecture #17: Autoencoders and Random Forests with R Mat Kallada Introduction to Data Mining with R

2 Assignment 4 Posted last Sunday Due next Monday!

3 Autoencoders in R Firstly, what is an autoencoder?

4 Autoencoders Re-visited. A better explanation than last time: we're doing non-linear dimensionality reduction. By the universal approximation theorem, a sufficiently large network can in principle map the feature vector into almost any representation. Is there a better way to represent this data?

5 To be a bit more clear, the goal is: [Diagram: Height, Width, Weight → Autoencoder → re-represented data] We're learning latent factors of the data automatically. In theory, the universal approximation theorem means an autoencoder can learn any representation.

6 Multi-layer Perceptron Re-visited

7 Equations from that Neural Network Graph. We have these equations:
prob_region_class_1 = logistic(height * weight_1 + width * weight_2 + weight_0)
prob_region_class_2 = logistic(height * weight_3 + width * weight_4 + weight_5)
final_probability = logistic(prob_region_class_1 * weight_6 + prob_region_class_2 * weight_7 + weight_8)

8 Equations from that Neural Network Graph. We have these equations (weight_0, weight_5 and weight_8 are the weights of the bias neurons):
prob_region_class_1 = logistic(height * weight_1 + width * weight_2 + weight_0)
prob_region_class_2 = logistic(height * weight_3 + width * weight_4 + weight_5)
final_probability = logistic(prob_region_class_1 * weight_6 + prob_region_class_2 * weight_7 + weight_8)
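The three equations can be written out as a small forward pass. This is only a sketch: the weight values are made up for illustration, and it assumes the final neuron combines the outputs of both hidden neurons (region 1 and region 2).

```python
import math

def logistic(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def mlp_forward(height, width, w):
    """Forward pass of the 2-input, 2-hidden-neuron, 1-output network.

    `w` is a list of nine weights; w[0], w[5] and w[8] are the bias weights.
    """
    region_1 = logistic(height * w[1] + width * w[2] + w[0])
    region_2 = logistic(height * w[3] + width * w[4] + w[5])
    return logistic(region_1 * w[6] + region_2 * w[7] + w[8])

# Hypothetical weights, chosen only so the example runs.
weights = [0.1, 0.5, -0.3, -0.4, 0.8, 0.2, 1.5, -1.0, 0.05]
probability = mlp_forward(height=2.0, width=1.0, w=weights)
print(round(probability, 3))
```

Training the network means searching for the nine weights that make `final_probability` match the class labels as closely as possible.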

9 Some Questions about the MLP Neural Network: Why not build deep neural networks? Why should we go deep again? That is, have meta-combiners for the combiners? How many hidden neurons? How do we solve multi-class problems?

10 The Autoencoder Neural Network [Diagram: inputs Width, Height, Tail Length, Tail Width, Teeth Count; outputs the same five features]

11 The Autoencoder Neural Network [Diagram: inputs Width, Height, Tail Length, Tail Width, Teeth Count; outputs the same five features] Wait - what is the point of predicting the same thing?

12 The Autoencoder Neural Network [Diagram: inputs Width, Height, Tail Length, Tail Width, Teeth Count; outputs the same five features; the layer in between is labelled "Compressed Representation of the Input Data"]

13 Autoencoder Networks do Dimensionality Reduction [Diagram: Width, Height, Tail Length, Tail Width, Teeth Count → encoding layer → X1, X2, X3] Only pass through the encoding layer of the network.
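To make the idea concrete, here is a minimal sketch of an autoencoder written from scratch. Everything specific in it is an assumption made for illustration: the six data rows are invented (five features scaled to 0..1), the code layer has three units, the output layer is linear, and training is plain stochastic gradient descent on squared reconstruction error.

```python
import math
import random

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def encode(x, W_enc, b_enc):
    """Pass one input row through the encoding layer only."""
    return [logistic(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(W_enc, b_enc)]

def decode(h, W_dec, b_dec):
    """Map the compressed code back to the input features (linear output)."""
    return [sum(w * hi for w, hi in zip(row, h)) + b
            for row, b in zip(W_dec, b_dec)]

def reconstruction_error(data, params):
    """Mean squared reconstruction error per row."""
    W_enc, b_enc, W_dec, b_dec = params
    total = 0.0
    for x in data:
        y = decode(encode(x, W_enc, b_enc), W_dec, b_dec)
        total += sum((yi - xi) ** 2 for yi, xi in zip(y, x))
    return total / len(data)

def train_autoencoder(data, n_hidden, epochs=200, lr=0.1, seed=0):
    rng = random.Random(seed)
    n_in = len(data[0])
    W_enc = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b_enc = [0.0] * n_hidden
    W_dec = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    b_dec = [0.0] * n_in
    for _ in range(epochs):
        for x in data:
            h = encode(x, W_enc, b_enc)
            y = decode(h, W_dec, b_dec)
            err = [yi - xi for yi, xi in zip(y, x)]  # the target is the input itself
            # Back-propagate the error to the hidden layer before updating W_dec.
            grad_h = [sum(err[i] * W_dec[i][j] for i in range(n_in)) * h[j] * (1 - h[j])
                      for j in range(n_hidden)]
            for i in range(n_in):
                for j in range(n_hidden):
                    W_dec[i][j] -= lr * err[i] * h[j]
                b_dec[i] -= lr * err[i]
            for j in range(n_hidden):
                for k in range(n_in):
                    W_enc[j][k] -= lr * grad_h[j] * x[k]
                b_enc[j] -= lr * grad_h[j]
    return W_enc, b_enc, W_dec, b_dec

# Invented rows: [width, height, tail_length, tail_width, teeth_count], scaled to 0..1.
data = [
    [0.2, 0.3, 0.8, 0.1, 0.5],
    [0.3, 0.4, 0.7, 0.2, 0.5],
    [0.8, 0.9, 0.2, 0.6, 0.9],
    [0.7, 0.8, 0.3, 0.7, 0.8],
    [0.5, 0.5, 0.5, 0.4, 0.6],
    [0.1, 0.2, 0.9, 0.1, 0.4],
]
params = train_autoencoder(data, n_hidden=3)
codes = [encode(x, params[0], params[1]) for x in data]  # the new 3-number representation
```

After training, only `encode` is needed: each five-feature row becomes a three-number code, which is the re-represented data.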

14 Hierarchical Representations Could we train another autoencoder on the new representation to get yet another representation? What could we get out of that?

15 Another Autoencoder for our Newly Derived Representation [Diagram: inputs X1, X2, X3 → Z1 → outputs X1, X2, X3] This second encoding, Z, is another representation of representation 1 (the X encoding).

16 Autoencoder Networks do Dimensionality Reduction [Diagram: Width, Height, Tail Length, Tail Width, Teeth Count → encoding layer of Autoencoder 1 → X1, X2, X3 → encoding layer of Autoencoder 2 → Z1...]

17 Autoencoder Networks do Dimensionality Reduction [Diagram: Width, Height, Tail Length, Tail Width, Teeth Count → encoding layer of Autoencoder 1 → X1, X2, X3 → encoding layer of Autoencoder 2 → Z1...] Two Levels of Feature Extraction.
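The two-level pass can be sketched as two encoding layers applied one after the other. The weights below are invented stand-ins for two already-trained autoencoders, and the layer sizes (5 → 3 → 1) are an assumption read off the diagrams.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def encoding_layer(inputs, weights, biases):
    """Apply one encoding layer: one logistic neuron per row of weights."""
    return [logistic(sum(w * v for w, v in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

# Hypothetical encoder weights (not learned from any real data).
ENC1_W = [
    [0.4, -0.2, 0.1, 0.3, -0.5],
    [-0.3, 0.6, 0.2, -0.1, 0.4],
    [0.1, 0.1, -0.7, 0.5, 0.2],
]
ENC1_B = [0.0, 0.1, -0.1]
ENC2_W = [[0.7, -0.4, 0.3]]
ENC2_B = [0.05]

def extract_features(row):
    """Return both levels of representation for one input row."""
    x = encoding_layer(row, ENC1_W, ENC1_B)  # level 1: X1, X2, X3
    z = encoding_layer(x, ENC2_W, ENC2_B)    # level 2: Z1
    return x, z

# Invented row: [width, height, tail_length, tail_width, teeth_count]
x_code, z_code = extract_features([0.2, 0.3, 0.8, 0.1, 0.5])
print(x_code, z_code)
```

The second autoencoder never sees the raw features; it is trained on, and applied to, the codes produced by the first.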

18 Autoencoders. They do dimensionality reduction: we try to compress or re-represent the input data.

19 Illustrative Example of the Goal of Dimensionality Reduction

20 The goal is to learn effective hierarchical representations

21 Autoencoders with R DataJoy Link:

22 Random Forests in R. Wait - what are random forests again?

23 Random Forests in a Nutshell. Bagging: create bootstrap samples to re-simulate the sampling procedure. Random Sub-Space Method: at each split, choose from a random subset of M features. M is a hyperparameter.

24 Goal of Ensemble Learning in a Nutshell. Make use of two techniques to create an ensemble. What is ensemble learning?

25 What is ensemble learning? A general framework for supervised data mining where we learn multiple models using the same base learner. Random Forests is a type of ensemble learning. How is the team/ensemble built in Random Forests again?

26 Let's go through how Random Forests works again, in case we forgot.

27 Step #1: Bootstrap Sample Generation. Re-simulate the entire experiment by bootstrapping.

28 Bootstrapping N different times. Construct N different bootstrap samples, train N decision trees.
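A bootstrap sample is drawn with replacement and has the same size as the original data. The sketch below uses invented rows and hypothetical helper names; the tree-training step is left out.

```python
import random

def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement from the original data."""
    return [rng.choice(rows) for _ in rows]

def make_bootstrap_samples(rows, n_samples, seed=0):
    """Build N bootstrap samples; each one would train one decision tree."""
    rng = random.Random(seed)
    return [bootstrap_sample(rows, rng) for _ in range(n_samples)]

# Invented rows: (height, width, label)
rows = [(1.0, 2.0, "a"), (1.5, 1.8, "a"), (3.0, 0.5, "b"), (2.8, 0.7, "b"), (2.0, 1.0, "a")]
samples = make_bootstrap_samples(rows, n_samples=4)
for sample in samples:
    print(sample)
```

Because sampling is with replacement, some rows appear several times in a sample and others not at all, which is what makes the N trees differ.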

29 Random Sub-space Method in Decision Trees. Only consider M features at EVERY SPLIT (for example, M=3). This is done at every split of every tree, and results in a very diverse ensemble of learners.
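The per-split feature selection can be sketched as a fresh random draw of M feature names each time a split is made. The feature names and the helper `candidate_features` are illustrative assumptions; choosing the best split among the candidates is omitted.

```python
import random

FEATURES = ["width", "height", "tail_length", "tail_width", "teeth_count"]

def candidate_features(features, m, rng):
    """Pick the M features this particular split is allowed to consider."""
    return rng.sample(features, m)

rng = random.Random(42)
for split in range(3):
    # Each split gets its own draw, so different splits see different features.
    print("split", split, "may use", candidate_features(FEATURES, 3, rng))
```

A smaller M makes the trees more different from one another; a larger M makes each individual tree closer to an ordinary decision tree.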

30 Okay, we have N models. How do ensembles predict?

31 Okay, we have N models. How do ensembles predict? Let's just Majority Vote or Average: easy peasy.
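Both ways of combining the N predictions fit in a few lines. This is a sketch with invented predictions: majority vote for classification, averaging for regression.

```python
from collections import Counter
from statistics import mean

def majority_vote(predictions):
    """Classification: return the label predicted by the most trees."""
    return Counter(predictions).most_common(1)[0][0]

def average(predictions):
    """Regression: return the mean of the trees' numeric predictions."""
    return mean(predictions)

# Invented outputs from five trees for a single test point.
class_votes = ["cat", "dog", "cat", "cat", "dog"]
numeric_predictions = [1.0, 2.0, 3.0]

print(majority_vote(class_votes))
print(average(numeric_predictions))
```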

32 Why are decision trees used as the base learner? They are unstable learners: small changes in the data can change the entire decision tree model.

33 Feature Importance. We know that the top-level splits are the most important. Robust aggregation over all of the decision trees in the ensemble.

34 Random Forests in R DataJoy:
