Large-Scale Evolution of Image Classifiers


Esteban Real (1), Sherry Moore (1), Andrew Selle (1), Saurabh Saxena (1), Yutaka Leon Suematsu (2), Jie Tan (1), Quoc V. Le (1), Alexey Kurakin (1)
(1) Google Brain, (2) Google Research
ICML, 2017
Presenter: Tianlu Wang

Outline
  1 Introduction: Motivation, Backgrounds
  2 Related Work: Neuro-evolution, Non-evolutionary
  3 Methods: Algorithm Overview, Encoding and Mutations, More Details
  4 Results: Progress of experiments, Comparisons, Meta-parameters
  5 Summary

Motivation
  AlexNet, GoogLeNet, VGG, ResNet, ...
  Designing neural network architectures can be challenging
  Goal: discover network architectures automatically

Backgrounds
  Achievements: the evolutionary algorithm outputs a fully trained model with no human participation
  Drawbacks: significant computation
  Task: image classification on CIFAR-10 and CIFAR-100

Neuro-evolution
  Weight evolution: evolve only the weights of a fixed architecture (weights are now usually trained by back-propagation instead)
  Weight and architecture evolution: the NEAT algorithm (mutates nodes and connections)

Non-evolutionary
  Bayesian optimization
  Reinforcement learning
  Q-learning

Algorithm Overview
  Input: a population of models; each model is a trained single-layer, non-convolutional model with learning rate 0.1
  Measurement: accuracy on the validation dataset
  [Diagram: population (model1, model2, model3, ...); a worker takes model X and model Y, checks them against the validation dataset, and applies a mutation]
  When to stop?
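The loop in the diagram can be sketched roughly as follows. This is a minimal illustration, not the presenters' implementation: it assumes the worker picks two models at random, keeps the one with the higher validation accuracy, and overwrites the other with a mutated copy of the winner. The names `evolve`, `toy_fitness`, and `toy_mutate` are hypothetical, and a "model" here is just a number so that the example runs without any training.

```python
import random


def evolve(population, fitness, mutate, steps, rng):
    """Run `steps` pairwise tournaments over a copy of `population`."""
    population = list(population)
    for _ in range(steps):
        # The "worker" samples two distinct individuals (model X, model Y).
        i, j = rng.sample(range(len(population)), 2)
        # Make i the index of the fitter individual (ties keep i).
        if fitness(population[i]) < fitness(population[j]):
            i, j = j, i
        # Assumption: the weaker one is replaced by a mutated copy
        # of the stronger one.
        population[j] = mutate(population[i], rng)
    return population


# Toy stand-ins (hypothetical): fitness plays the role of validation
# accuracy and peaks when the "model" equals 10.
def toy_fitness(x):
    return -abs(x - 10.0)


def toy_mutate(x, rng):
    return x + rng.uniform(-1.0, 1.0)


rng = random.Random(0)
initial = [0.0] * 20
final = evolve(initial, toy_fitness, toy_mutate, steps=2000, rng=rng)
best_before = max(toy_fitness(x) for x in initial)
best_after = max(toy_fitness(x) for x in final)
```

Because the fitter of the two sampled individuals is never overwritten, the best fitness in the population cannot decrease from one tournament to the next. The slide's open question, when to stop, is reduced here to a fixed `steps` budget.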

Model Encoding
  Each individual model is encoded as a graph:
  Vertices
    rank-3 tensors (image width x image height x channels)
    activations (batch normalization with ReLU, or a plain linear layer)
  Edges
    identity connections
    convolutions
  Inconsistent inputs at a vertex:
    pick one incoming edge as the primary one and keep it unchanged
    reshape the non-primary ones (interpolation / truncation / padding)
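A small sketch of such an encoding, under stated assumptions: the class names (`Vertex`, `Edge`, `ModelGraph`) and helper functions are hypothetical, activations are reduced to flat lists with one value per channel, and summing the aligned inputs is an assumption of this sketch rather than something the slide states. Only the channel case (truncation / padding) is shown; spatial interpolation is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    activation: str  # "bn_relu" or "linear"


@dataclass
class Edge:
    src: int
    dst: int
    kind: str  # "identity" or "conv"
    primary: bool = True


@dataclass
class ModelGraph:
    vertices: list = field(default_factory=list)
    edges: list = field(default_factory=list)

    def incoming(self, vertex_index):
        """Return all edges that end at the given vertex."""
        return [e for e in self.edges if e.dst == vertex_index]


def fit_channels(channels, target):
    """Truncate or zero-pad a per-channel list to exactly `target` entries."""
    return channels[:target] + [0.0] * max(0, target - len(channels))


def merge_inputs(primary, others):
    """Align non-primary inputs to the primary's channel count, then sum."""
    out = list(primary)
    for other in others:
        fitted = fit_channels(other, len(primary))
        out = [a + b for a, b in zip(out, fitted)]
    return out


# A three-vertex graph with one convolution chain and one skip connection.
graph = ModelGraph(
    vertices=[Vertex("linear"), Vertex("bn_relu"), Vertex("bn_relu")],
    edges=[
        Edge(0, 1, "conv"),
        Edge(1, 2, "conv"),
        Edge(0, 2, "identity", primary=False),  # skip connection
    ],
)
merged = merge_inputs([1.0, 1.0], [[1.0], [2.0, 2.0, 2.0]])
```

The primary input fixes the output shape, so a one-channel input is zero-padded and a three-channel input is truncated before being combined.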

Mutations
  The worker picks a mutation at random from a set:
    ALTER-LEARNING-RATE
    IDENTITY (effectively means "keep training")
    RESET-WEIGHTS
    INSERT/REMOVE CONVOLUTION
    ALTER-STRIDE
    ALTER-NUMBER-OF-CHANNELS
    FILTER-SIZE
    INSERT-ONE-TO-ONE
    INSERT/REMOVE SKIP
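As a rough illustration of the random pick, the sketch below chooses uniformly from the mutation names on the slide (uniform sampling is an assumption here, and the INSERT/REMOVE entries are split into separate names). The dictionary-based "model", the scaling range for the learning rate, and the `last_mutation` field are hypothetical; the structural mutations are left as no-ops.

```python
import copy
import random

MUTATIONS = [
    "ALTER-LEARNING-RATE",
    "IDENTITY",
    "RESET-WEIGHTS",
    "INSERT-CONVOLUTION",
    "REMOVE-CONVOLUTION",
    "ALTER-STRIDE",
    "ALTER-NUMBER-OF-CHANNELS",
    "FILTER-SIZE",
    "INSERT-ONE-TO-ONE",
    "INSERT-SKIP",
    "REMOVE-SKIP",
]


def mutate(model, rng):
    """Return a mutated copy of `model`; the parent is left untouched."""
    child = copy.deepcopy(model)
    name = rng.choice(MUTATIONS)  # assumption: uniform over the set
    if name == "ALTER-LEARNING-RATE":
        # Hypothetical range: scale by a random factor in [0.5, 2.0].
        child["learning_rate"] *= rng.uniform(0.5, 2.0)
    elif name == "RESET-WEIGHTS":
        child["weights"] = None  # signals "re-initialise before training"
    # IDENTITY and the structural mutations change nothing in this sketch.
    child["last_mutation"] = name
    return child


parent = {"learning_rate": 0.1, "weights": [1.0, 2.0], "last_mutation": None}
child = mutate(parent, random.Random(1))
```

Copying before mutating keeps the parent available in the population, which matters because the same parent may be selected again later.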

More Details
  Poor initial conditions (see the Algorithm Overview slide)
  Data split: 45,000 training; 5,000 validation; test
  Training: SGD with momentum of 0.9, batch size 50, weight decay
  Computation cost: measured in floating-point operations
  Children inherit the parent's weights whenever possible
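For reference, one common form of the SGD-with-momentum update can be sketched as below. The momentum of 0.9 comes from the slide and the learning rate of 0.1 from the initial conditions on the Algorithm Overview slide; the weight-decay value is not given, so it defaults to 0.0 here as a placeholder. The function name and the toy objective f(w) = w^2 are hypothetical.

```python
def sgd_momentum_step(w, v, grad, lr=0.1, momentum=0.9, weight_decay=0.0):
    """Apply one SGD-with-momentum update and return the new (w, v)."""
    # Weight decay adds a pull toward zero to each gradient component.
    g = [gi + weight_decay * wi for gi, wi in zip(grad, w)]
    # Velocity accumulates a decaying sum of past (negative) gradients.
    v = [momentum * vi - lr * gi for vi, gi in zip(v, g)]
    # Parameters move along the velocity.
    w = [wi + vi for wi, vi in zip(w, v)]
    return w, v


# Toy objective f(w) = w^2 with gradient 2w, starting from w = 1.
w, v = [1.0], [0.0]
first_w, first_v = sgd_momentum_step(w, v, [2.0 * w[0]])
for _ in range(300):
    w, v = sgd_momentum_step(w, v, [2.0 * w[0]])
```

On this toy problem the iterate oscillates around the minimum at 0 with shrinking amplitude, which is the typical behaviour of momentum on a simple quadratic.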

Progress of an evolution experiment

Repeatability of results and controls

Compared to hand-designed networks

Compared to auto-discovered networks

Improve the method
  Larger population size
  More training steps
  Increase the mutation rate
  Reset all weights

Summary
  Neuro-evolution starts from trivial initial conditions and yields fully trained models
  It constructs large, accurate networks for two challenging and popular image classification benchmarks
  Limitations: large search space and high computation cost
