MoonRiver: Deep Neural Network in C++

Chung-Yi Weng
Computer Science & Engineering, University of Washington
chungyi@cs.washington.edu

Abstract

Artificial intelligence is resurging thanks to dramatic improvements in recognition and prediction, and the driving force is the deep neural network. Although there are many existing packages, such as Caffe, TensorFlow, PyTorch, and MXNet, that help people apply neural network techniques to their problems, the algorithms running behind them remain obscure. To overcome this, we decided to implement a deep neural network framework in C++ from scratch, called MoonRiver. The goal is not only to uncover the secrets behind deep neural networks but also to offer a good code base that helps us pursue more essential advances in the future. Several goals guided the development of MoonRiver. The first is independence: MoonRiver should be lightweight, should not depend on any third-party libraries, and should therefore compile with nothing more than a standard C++ compiler, which makes it possible to port MoonRiver to any OS and machine. The second is scalability: MoonRiver should let users easily design any network and scale to large ones with minimal effort. We demonstrate the effectiveness of MoonRiver by training and testing an auto-encoder and LeNet. The code used to define these networks is concise, and the experimental results are promising.

1 Overview

A deep neural network is a black box. The neural network packages, such as Caffe, TensorFlow, PyTorch, and MXNet, are further black boxes. To uncover the secrets inside these boxes, we want to implement a deep neural network framework in C++ from scratch, called MoonRiver. The purpose is not only to comprehensively understand the algorithms running inside a deep neural network but also to offer a good code base for improving the quality and performance of the learning technique in the future. Several properties are desired in designing MoonRiver:

1. Independence: MoonRiver is lightweight. It should not depend on any third-party libraries and should compile with a standard C++ compiler alone.
2. Portability: MoonRiver should be easily ported to any OS and machine.
3. Convenience: MoonRiver should make it easy for users to design any network they want.
4. Scalability: MoonRiver should scale to large networks with minimal effort.

The report is organized as follows. Sec. 2 describes the features MoonRiver supports. Sec. 3 explains how to use MoonRiver to design and train a customized network, taking LeNet as a running example. Sec. 4 demonstrates the effectiveness of MoonRiver by showing the training settings and testing results on two networks: an auto-encoder for image generation and LeNet for image classification. Possible extensions and future work are discussed in Sec. 5.

2 Supported Features

Here we describe the features MoonRiver currently supports. In a nutshell, MoonRiver implements the layers and activation functions needed for image classification and generation, and also offers several effective optimizers for updating the trainable parameters. Here is a summary of all features MoonRiver supports:

1. Layers:
   (a) Convolutional Layer
   (b) Fully Connected Layer
   (c) Max Pooling Layer
   (d) Softmax Layer
   (e) Flatten Layer
2. Activations:
   (a) Linear
   (b) ReLU
   (c) Tanh
   (d) Sigmoid
3. Optimizers:
   (a) SGD
   (b) Momentum
   (c) RMSprop
   (d) Adam
4. Cost Functions:
   (a) Mean Square Error
   (b) Negative Log Likelihood
5. Misc:
   (a) MNIST Data Loader
   (b) Mini-batch Random Sampler
   (c) Network Saving and Loading

Note that Dropout and Batch Normalization are not supported right now; they should be added in the future.
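To make the optimizer list above concrete, the following is a minimal sketch of the classic momentum update applied to a flat parameter vector. It illustrates only the update rule itself; the struct and function names (MomentumOptimizer, step) are assumptions made for this example and are not MoonRiver's actual interface.

#include <cstddef>
#include <vector>

// Minimal sketch of the classic momentum update:
//   v = mu * v - lr * grad
//   w = w + v
// Names and structure are illustrative only, not MoonRiver's real API.
struct MomentumOptimizer {
    float lr;                      // learning rate
    float mu;                      // momentum coefficient, e.g. 0.9
    std::vector<float> velocity;   // one entry per trainable parameter

    MomentumOptimizer(std::size_t numParams, float lr, float mu)
        : lr(lr), mu(mu), velocity(numParams, 0.0f) {}

    // Apply one update step given the current gradient.
    void step(std::vector<float>& params, const std::vector<float>& grads) {
        for (std::size_t i = 0; i < params.size(); ++i) {
            velocity[i] = mu * velocity[i] - lr * grads[i];
            params[i] += velocity[i];
        }
    }
};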

Figure 1: The network architecture of LeNet.

3 Usage

Here we explain how to use MoonRiver to design a customized network and how to train it. We take LeNet as a running example because of its elegant architecture and its fame in image classification. Although most readers probably already know the architecture of LeNet, for completeness of the report we illustrate this famous architecture in Figure 1.

If we use MoonRiver to implement LeNet, the code we have to write amounts to only a few lines, as sketched below. Notice how easily an activation function is set for each layer; if no activation function is set, the output of the layer is fed directly into the next layer without passing through any activation.
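As an illustration of this style, a LeNet definition might look roughly like the following sketch. The class and method names (Network, Conv, MaxPool, FullyConnected, Softmax, Activation) and the exact layer sizes are assumptions made for the example and may differ from MoonRiver's actual API; the sizes assume 28x28 MNIST inputs without padding.

// Illustrative sketch with an assumed MoonRiver-like API (names are hypothetical).
Network net;
net.add(Conv(/*inChannels=*/1,  /*outChannels=*/6,  /*kernelSize=*/5), Activation::Tanh);
net.add(MaxPool(/*size=*/2));                        // 24x24 -> 12x12
net.add(Conv(/*inChannels=*/6,  /*outChannels=*/16, /*kernelSize=*/5), Activation::Tanh);
net.add(MaxPool(/*size=*/2));                        // 8x8 -> 4x4
net.add(Flatten());                                  // 16 * 4 * 4 = 256 values
net.add(FullyConnected(/*in=*/256, /*out=*/120), Activation::Tanh);
net.add(FullyConnected(/*in=*/120, /*out=*/84),  Activation::Tanh);
net.add(FullyConnected(/*in=*/84,  /*out=*/10));     // no activation: output fed directly to the next layer
net.add(Softmax());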

Next, we explain how to train LeNet using the training data from the MNIST dataset. The training code is just as concise, with notes explaining how each step works (an illustrative sketch is given at the end of this subsection). So that's all! Things become super easy when using MoonRiver to create a new network and train it efficiently.

4 Experimental Results

In this section we demonstrate the effectiveness of MoonRiver by training and testing two networks: LeNet and an auto-encoder. The former is for image classification, whereas the latter is for image generation, or, viewed from another perspective, image dimensionality reduction.

4.1 LeNet

The goal of LeNet is to predict, for each input handwritten image, which digit it belongs to, so it can be considered an image classification problem. Here is our setting for training LeNet:

Training Set: 60,000 MNIST images
Mini-batch Size: 64
Total Epochs: 10
Optimizer: Momentum
Cost Function: Negative Log Likelihood

We test the trained model with the 10,000 testing images from the MNIST dataset, which are not included in the training set. The testing/classification results are as follows:

Negative Log Likelihood Loss: 0.038
Classification Accuracy: 98.97% (9,897/10,000)
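For reference, a training loop consistent with the setting above (60,000 MNIST training images, mini-batch size 64, 10 epochs, the Momentum optimizer, and the negative log likelihood cost) might look roughly like the following sketch. The class names, method names, file names, and hyperparameter values (learning rate, momentum coefficient) are assumptions for the example, not MoonRiver's actual interface.

// Illustrative training-loop sketch (hypothetical names and hyperparameters).
MNISTLoader trainSet("train-images-idx3-ubyte", "train-labels-idx1-ubyte");
MiniBatchSampler sampler(trainSet, /*batchSize=*/64);    // mini-batch random sampler (Sec. 2)
Momentum optimizer(/*learningRate=*/0.01f, /*momentum=*/0.9f);
NegativeLogLikelihood cost;

for (int epoch = 0; epoch < 10; ++epoch) {
    while (sampler.hasNext()) {
        Batch batch = sampler.next();                    // 64 images with their labels
        Tensor prediction = net.forward(batch.images);   // forward pass through LeNet
        Tensor gradient = cost.backward(prediction, batch.labels);
        net.backward(gradient);                          // backpropagate gradients
        optimizer.update(net);                           // apply the momentum update
    }
    sampler.reset();                                     // reshuffle for the next epoch
}
net.save("lenet.model");                                 // network saving (Sec. 2)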

Figure 2: The network architecture of Auto-Encoder.

4.2 Auto-Encoder

The other network we used to test MoonRiver is an auto-encoder. The main goal of an auto-encoder is to encode an image as a code that usually has a lower dimension than the original image (note that the term "code" here means the low-dimensional representation of an image, not the program code discussed before). The idea is to learn an encoding function that reduces the image from a high dimension to a low one, and a decoding function that reconstructs the high-dimensional image from the low-dimensional code. The encoding and decoding functions are trained together through end-to-end neural network training. The auto-encoder architecture we designed is shown in Figure 2. Note that the yellow layer in the middle of the network is the code layer, which has only two dimensions.

Here is the setting for training the auto-encoder:

Training Set: 60,000 MNIST images
Mini-batch Size: 64
Total Epochs: 10
Optimizer: Adam
Cost Function (variant 1): Mean Square Error
Cost Function (variant 2): Mean Square Error + Code L2 Norm Regularization

Notice that we use two different cost functions to train the auto-encoder. Below is the result of using only mean square error. We visualize the result by drawing the codes, which are only 2-dimensional, on a 2D image called the code map. It shows a clear clustering effect: codes belonging to the same digit are clustered together in the code map.

Figure 3: The code map from the training where we use only mean square error as the cost function.
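Before turning to the effect of the cost function, an encoder-decoder definition in the same assumed style as the LeNet sketch might look as follows. Only the 2-dimensional code layer is taken from the architecture above; the intermediate layer widths and activations are assumptions for the example, as is the API itself.

// Illustrative auto-encoder sketch (hypothetical API; layer widths are assumed,
// only the 2-dimensional code layer follows the architecture above).
Network ae;
// Encoder: 28x28 image -> 2-dimensional code
ae.add(Flatten());                                               // 784 values
ae.add(FullyConnected(/*in=*/784, /*out=*/128), Activation::Relu);
ae.add(FullyConnected(/*in=*/128, /*out=*/32),  Activation::Relu);
ae.add(FullyConnected(/*in=*/32,  /*out=*/2));                   // code layer (yellow layer in Figure 2)
// Decoder: 2-dimensional code -> reconstructed 784-value image
ae.add(FullyConnected(/*in=*/2,   /*out=*/32),  Activation::Relu);
ae.add(FullyConnected(/*in=*/32,  /*out=*/128), Activation::Relu);
ae.add(FullyConnected(/*in=*/128, /*out=*/784), Activation::Sigmoid);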

Notice, however, that the codes can be spread widely across the code map when only mean square error is used as the cost function, because the codes can take any values without penalty. So we try the second cost function, which additionally regularizes the L2 norm of the codes (i.e., mean square error plus a penalty on the squared length of the code vector), hoping to get a map where all codes are centered around the origin. It works! The result is shown in the following image, in which the codes are more evenly distributed in the 2D space.

Figure 4: The code map where we use mean square error plus code regularization as the cost function.

Finally, we show some sample image reconstruction results in Figure 5. These images are generated by passing the input images (top row) through the trained auto-encoder and retrieving the output at the end of the network. The reconstructed images should be as close as possible to the input images (note that this criterion is exactly the mean square error cost function described before).

Figure 5: The sample reconstruction results. (Top row) Ground truth. (Bottom row) Reconstructed images.

5 Conclusion and Future Work

We designed a new deep neural network framework, MoonRiver, from scratch and implemented it in C++. It is lightweight and runs on any OS that has a standard C++ compiler. It also lets users easily create a customized network and train it efficiently. There are still many extensions we can pursue in the future; here we list some of them:

- Support GPU acceleration
- Support recurrent neural networks, such as LSTM
- Support GANs
- Convert existing trained networks, such as AlexNet, VGG-Net, or ResNet, into a MoonRiver-accepted network format

MoonRiver is the start rather than the end. We hope that in the future we can pursue more essential advances in deep neural networks based on the good code base we have already developed.