Deep Learning With Noise


Yixin Luo
Computer Science Department
Carnegie Mellon University

Fan Yang
Department of Mathematical Sciences
Carnegie Mellon University

Abstract

Recent works have shown that, by allowing some inaccuracy when training deep neural networks, not only the training performance but also the accuracy of the model can be improved. Our work, taking those previous works as examples and guidance, studies the impact of introducing different types of noise into different components of training a deep neural network. We experiment with noise types including Binomial noise, Gaussian noise, and Gamma noise, and we study the effects of noise in different parts of the model, including neurons and network links in the input, hidden, and output layers, as well as matrix multiplication and gradient computation in the backward propagation process.

1 Introduction

Large-scale deep neural network models have become increasingly popular for solving hard classification problems and have demonstrated significant improvements in accuracy. Compared to traditional statistical machine learning methods, which require a human domain expert to construct a good set of input features, deep learning models do not require a hand-crafted feature set to begin with, and hence are more powerful and suitable for hard AI tasks such as speech recognition or visual object classification. Without any hand crafting of the raw input data, a deep neural network can learn a hierarchy of features by itself in the first several layers of the model. Then, in the deepest layer of the model, a set of features is selected and weighted for each output to generate a prediction. By avoiding the inevitable human error in feature selection, deep learning often outperforms traditional approaches on these hard classification problems in terms of accuracy.

In order to train a more complicated model that includes feature selection capability, a deep neural network is typically trained with more data than a traditional machine learning method. Due to the scale of the deep neural network (with multiple layers of neurons) and the scale of the input data set, the performance of these models, in addition to their accuracy, has become a significant factor in such implementations. Recent work [1, 2] on large-scale machine learning systems proposes to significantly improve performance by relaxing consistency when training neural network models (e.g., weights are not updated in every iteration). One interesting observation in these papers is that this relaxation surprisingly improves the accuracy of the deep learning model on the test data. However, the effect of noise on deep learning models has never been systematically studied, nor has the underlying reason for the improved accuracy been explained. One hypothesis for this observation is that relaxing consistency introduces stochastic noise into the training process [1], which implicitly mitigates over-fitting and helps the model generalize better to test data. Another hypothesis is that the introduced noise eliminates the memorization effect of a deep neural network, and hence allows the model to capture general properties of the training data that transfer well to the test data.

Our work, taking previous works as examples and guidance, systematically studies the effect of introducing different noise into different components of different types of deep learning neural networks. We observe that a reasonable amount, and a reasonable magnitude, of noise, when introduced into a deep learning model, can improve both the accuracy and the convergence rate of the model. We hope that our work can provide insights into future methods of approximate deep learning, and inspire and motivate more work to take advantage of the benefits of adding noise to deep learning models.

2 Background and Related Work

In this section, we first introduce several common neural network models: the Logistic Regression (single neuron) model, the Multi-Layer Perceptron (MLP) model, and the Convolutional Neural Network (LeNet) model. Then we summarize and compare our work to several related works that introduce noise into those models to improve accuracy.

2.1 Neural Network Models Explained

The simplest form of a neural network, which is also the primary component of any neural network model, is a single neuron. Figure 1a illustrates a single-neuron neural network, which is also known as the Logistic Regression (LR) model. The neuron shown in the figure consumes a vector of numbers (X) as input, and produces a single number as its output, which typically represents the prediction made by the model. The neuron stores a vector of weights (W), with each weight representing how positively or negatively each input affects the output. The output of the neuron and the update to the neuron's weights can be computed as follows:

$$\mathrm{Output} = \tanh(W \cdot X)$$

$$W_{\mathrm{new}} = W_{\mathrm{old}} - \mathrm{learningrate} \cdot \nabla_{W_{\mathrm{old}}} \mathrm{cost}(W)$$

A Multi-Layer Perceptron (MLP) model is essentially multiple layers of neurons connected by a network. Figure 1b illustrates a simple example of such a model, composed of three layers. The first layer is the input layer, which provides the raw inputs to the next layer. The second layer is the hidden layer, whose input is fully connected to the input layer and whose output is fully connected to the output layer. The hidden layer is known to be capable of extracting features from the input. The third layer is the output layer, which outputs the prediction results for the data. Note that the figure shows only one example of such a model; the model can become deeper, and extract more implicit features from the raw input, if we add more hidden layers between the input and output layers, each fully connected to its neighboring layers.

Figure 1: Illustration of two neural network models. (a) Single-layer neural network (LR) model. (b) Multi-Layer Perceptron (MLP) model.

A Convolutional Neural Network (LeNet) model adds multiple convolution layers on top of the MLP model. Figure 2 illustrates an example of this model. In the convolution layer, multiple steps are performed. First, the input is transformed into a two-dimensional array. Then a sliding window which contains a small two-dimensional weight vector is applied to the input. The sliding window is capable of extracting 2-D features from inputs such as images. Finally, the processed input is downsampled by a 2x2 matrix, which reduces the size of the input by a factor of 4. The figure shows an example which contains two convolution layers and a hidden layer. We can also build a more complex model by adding more convolution layers or more hidden layers, which allows the model to extract more implicit features.
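To make the single-neuron computation above concrete, the following minimal NumPy sketch implements the forward pass and one gradient-descent weight update for a tanh neuron. The squared-error cost and the function and variable names are our own illustrative assumptions, not the exact code used in our experiments.

```python
import numpy as np

def neuron_forward(W, X):
    """Forward pass of a single tanh neuron: Output = tanh(W . X)."""
    return np.tanh(np.dot(W, X))

def neuron_update(W, X, y, learning_rate=0.13):
    """One gradient-descent step: W_new = W_old - learningrate * grad cost(W).

    For illustration we assume a squared-error cost 0.5 * (output - y)^2,
    whose gradient w.r.t. W is (output - y) * (1 - output^2) * X.
    """
    output = neuron_forward(W, X)
    grad = (output - y) * (1.0 - output ** 2) * X   # d cost / d W
    return W - learning_rate * grad

# Tiny usage example with random data (illustrative only).
rng = np.random.RandomState(0)
W = rng.randn(5) * 0.01
X = rng.randn(5)
y = 1.0
for _ in range(10):
    W = neuron_update(W, X, y)
print(neuron_forward(W, X))  # the output moves toward the target y
```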

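As a rough illustration of the convolution-layer steps described above (sliding a small 2-D weight window over the input, then 2x2 downsampling), the sketch below applies a single filter to a 2-D array and reduces each 2x2 block to one value. It is a simplified, single-channel illustration under our own assumptions (including the choice of max-pooling for the downsample), not the Theano implementation used in our experiments.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Slide a small 2-D weight window over the image ('valid' convolution)."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def downsample_2x2(feature_map):
    """Reduce each 2x2 block to a single value, shrinking the map by a factor of 4."""
    h, w = feature_map.shape
    h, w = h - h % 2, w - w % 2            # drop odd edges for simplicity
    blocks = feature_map[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))         # max-pooling; mean-pooling is also common

# Usage: a 28x28 input, a 5x5 window, and a 2x2 downsample (the sizes used later in Section 4.1).
image = np.random.rand(28, 28)
kernel = np.random.randn(5, 5) * 0.1
fmap = np.tanh(convolve2d_valid(image, kernel))    # 24x24 feature map
pooled = downsample_2x2(fmap)                      # 12x12 after downsampling
print(fmap.shape, pooled.shape)
```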
Figure 2: Convolutional Neural Network (LeNet) model.

2.2 Comparison with Related Works

We summarize three recent works that explain and explore three mechanisms for introducing noise into a multi-layer neural network (MLP). Dropout proposes to regularize fully connected neural networks by probabilistically dropping an output (setting it to zero) of a hidden layer neuron [3] (i.e., with a low probability (1 - p), one of the outputs of a hidden layer neuron is set to 0 in the forward propagation process). This can effectively decrease test error rates by preventing over-fitting of the model. Inspired by Dropout [3], DropConnect proposes to probabilistically drop a weight of a hidden layer neuron (as opposed to an output of a hidden layer neuron in Dropout) [2]. Maxout extends Dropout and DropConnect by probabilistically setting an output or a weight of a hidden layer neuron to its maximum value [4]. While these works explore ideas similar to ours, we believe our work is much more comprehensive, as we systematically and experimentally explore various noise models, various noise locations, and various neural networks.

3 Proposed Method

In this section, we give an overview of all types of noise that we have introduced into each model (LR, MLP, and LeNet) in our experiments.

3.1 Adding Noise into Logistic Regression

We first introduce noise into the gradient descent component of Logistic Regression. To be more specific, in a noise-free Logistic Regression model, weights are updated in the following way:

$$W_{\mathrm{new}} = W_{\mathrm{old}} - \mathrm{learningrate} \cdot \nabla_{W_{\mathrm{old}}} \mathrm{cost}(W)$$

In a noise-added Logistic Regression model, weights are updated as:

$$W_{\mathrm{new}} = W_{\mathrm{old}} - \mathrm{learningrate} \cdot \left(\mathrm{mask} \odot \nabla_{W_{\mathrm{old}}} \mathrm{cost}(W)\right)$$

or

$$W_{\mathrm{new}} = W_{\mathrm{old}} - \mathrm{mask}_{\mathrm{Gau}} \odot \nabla_{W_{\mathrm{old}}} \mathrm{cost}(W)$$

where learningrate is a scalar and mask is a vector that has the same dimension as W. We generate mask as a random vector from the Binomial distribution Bin(1, 0.5), the Gaussian distribution N(learningrate, 2 · learningrate), the Rayleigh distribution Rayleigh(1), or the Gamma distribution Gamma(1, 1).

3.2 Adding Noise into Multi-layer Logistic Regression

Secondly, we introduce noise into the weights between layers. In our Multi-layer Logistic Regression model, there are three layers: an input layer, a hidden layer, and an output layer. Each layer consists of neurons, and neurons in different layers are connected by weights. During a noise-free training process, the weights between layers are transmitted and updated without any loss of information or variance. However, during a noise-added training process, the weights between layers are subject to some variation. To be more specific, let W_input be the matrix of weights between the input layer and the hidden layer, and W_output be the matrix of weights between the hidden layer and the output layer. In a noise-added training process, we apply a combination of the following steps:

$$W_{\mathrm{input}} = W_{\mathrm{input}} \odot \mathrm{mask}$$
$$W_{\mathrm{output}} = W_{\mathrm{output}} \odot \mathrm{mask}$$
$$W_{\mathrm{input}} = W_{\mathrm{input}} + \mathrm{mask}$$

where mask is a matrix of the same dimension as W_input or W_output. We generate mask as a random matrix from the Binomial distribution Bin(1, 0.99) or the Gaussian distribution N(0, 0.01).
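The following NumPy sketch illustrates the noise-added gradient update of Section 3.1: a random mask drawn from one of the listed distributions is applied element-wise to the gradient before the weight update. It is a simplified standalone illustration under our own assumptions (the gradient is passed in as a precomputed vector, and the second parameter of the Gaussian is treated as its scale), not the Theano code used in our experiments.

```python
import numpy as np

rng = np.random.RandomState(0)

def make_mask(shape, kind="binomial", learning_rate=0.13):
    """Draw a noise mask from one of the distributions listed in Section 3.1."""
    if kind == "binomial":
        return rng.binomial(1, 0.5, size=shape)                           # Bin(1, 0.5)
    if kind == "gaussian":
        # N(learningrate, 2 * learningrate); we assume the second argument is the scale.
        return rng.normal(learning_rate, 2 * learning_rate, size=shape)
    if kind == "rayleigh":
        return rng.rayleigh(1.0, size=shape)                              # Rayleigh(1)
    if kind == "gamma":
        return rng.gamma(1.0, 1.0, size=shape)                            # Gamma(1, 1)
    raise ValueError(kind)

def noisy_update(W, grad, kind="binomial", learning_rate=0.13):
    """Noise-added update: W_new = W_old - learningrate * (mask * grad)."""
    mask = make_mask(W.shape, kind, learning_rate)
    if kind == "gaussian":
        # Second form from Section 3.1: the Gaussian mask replaces the learning rate.
        return W - mask * grad
    return W - learning_rate * (mask * grad)

# Usage with a dummy weight vector and gradient.
W = rng.randn(10)
grad = rng.randn(10)
W = noisy_update(W, grad, kind="rayleigh")
```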

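Similarly, the masking of the inter-layer weight matrices in Section 3.2 can be sketched as follows. The multiplicative Bin(1, 0.99) mask corresponds to the DropConnect/Dropout-style configurations, and the additive Gaussian mask to the noise-variation configuration described later in Section 4.3; the helper names and the matrix sizes are hypothetical, and this is not our actual Theano implementation.

```python
import numpy as np

rng = np.random.RandomState(1)

def multiplicative_mask(W, keep_prob=0.99):
    """W = W * mask, mask ~ Bin(1, 0.99): zeroes out roughly 1% of the connections."""
    return W * rng.binomial(1, keep_prob, size=W.shape)

def additive_mask(W, std=0.1):
    """W = W + mask, mask ~ N(0, 0.01): perturbs every weight slightly.

    We assume the 0.01 in N(0, 0.01) denotes the variance, i.e. std = 0.1.
    """
    return W + rng.normal(0.0, std, size=W.shape)

# Example: a 784-500 input-to-hidden matrix and a 500-10 hidden-to-output matrix,
# i.e. an MNIST MLP with 500 hidden units (see Section 4.1).
W_input = rng.randn(784, 500) * 0.01
W_output = rng.randn(500, 10) * 0.01

W_input = multiplicative_mask(W_input)     # "dropconnect" configuration
W_output = multiplicative_mask(W_output)   # "dropout" configuration
W_input = additive_mask(W_input)           # "noise-variation" configuration
```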
3.3 Adding Noise into Convolutional Neural Network

Last, we introduce noise into the feature mapping component of the model. The difference between the Convolutional Neural Network (LeNet) and Multi-layer Logistic Regression (MLP) is that LeNet has a feature mapping process before the MLP. Feature mapping is a process in which a small window moves along the image to extract local features. In other words, the window, acting as a function, computes a linear combination of the underlying pixels. In a noise-added feature mapping process, the extracted feature is subject to some variation.

4 Experiments

In this section, we first present the datasets we use in our experiments as well as the parameters for each model. Note that we fine-tuned these parameters to achieve the best possible outcomes before adding our modifications to the code. Next, we present the results and the findings from our experiments, including negative results and the lessons learned in this project.

4.1 Dataset and Implementation Parameters

We experiment on three datasets: a hand-written digit dataset (MNIST) and two tiny image datasets (CIFAR-10 and CIFAR-100). Specifications of the datasets are summarized in Table 1.

Dataset     Description           Classes   Training Set Size   Testing Set Size
MNIST       hand-written digits   10        60,000              10,000
CIFAR-10    32x32 RGB images      10        50,000              10,000
CIFAR-100   32x32 RGB images      100       50,000              10,000

Table 1: Datasets: MNIST, CIFAR-10, CIFAR-100

We preprocess CIFAR-10 and CIFAR-100 by grey-scaling every image: every pixel in the image becomes a linear combination of its original R, G, and B values. These two datasets are preprocessed due to technical implementation limitations (which will be fixed after the deadline), not for machine learning theory reasons.

Our neural network models are implemented using the Python Theano library. The starter code is from DeepLearning.net. Parameters of each neural network model are summarized in Table 2.

Model                                     Parameters
Logistic Regression (LR)                  learning rate = 0.13
Multi-layer Logistic Regression (MLP)     LR + hidden units = 500
Convolutional Neural Network              MLP + window size = 5x5, downsample = 2x2

Table 2: Parameters of Neural Network Models

We use stochastic logistic regression with a learning rate of 0.13. In Multi-layer Logistic Regression, there are 500 neurons in the hidden layer. During feature mapping in the Convolutional Neural Network, windows are of size 5 by 5 and the downsample is of size 2 by 2.

In our experiments, different models may run for different numbers of iterations. This is because we set a threshold on the accuracy increase when training the model: if the model's accuracy increase is less than the threshold, we stop training. Hence some models run more iterations, as long as their accuracy increases stay above the threshold.
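A minimal sketch of this stopping criterion is shown below. The threshold value and the train_one_iteration/accuracy helpers are hypothetical placeholders standing in for one pass of stochastic gradient descent and held-out evaluation; they are not the exact routines or values used in our experiments.

```python
def train_until_converged(model, data, threshold=1e-3, max_iterations=1000):
    """Keep training while the per-iteration accuracy improvement exceeds `threshold`."""
    previous_accuracy = 0.0
    for iteration in range(max_iterations):
        model.train_one_iteration(data)        # hypothetical: one SGD pass over the data
        accuracy = model.accuracy(data)        # hypothetical: accuracy on a held-out set
        if accuracy - previous_accuracy < threshold:
            break                              # improvement too small: stop training
        previous_accuracy = accuracy
    return model
```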

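Going back to the preprocessing step described in Section 4.1, the grey-scaling can be sketched as follows. The coefficients shown are the common luma weights, used here only as an illustrative choice of linear combination, since the exact coefficients we used are not listed above.

```python
import numpy as np

def grey_scale(images_rgb, weights=(0.299, 0.587, 0.114)):
    """Collapse an (N, H, W, 3) RGB batch to (N, H, W) grey-scale images.

    Each output pixel is a linear combination of its R, G, and B values;
    the default weights are an example of such a combination, not
    necessarily the ones used in our experiments.
    """
    w = np.asarray(weights, dtype=np.float32)
    return images_rgb.astype(np.float32) @ w

# Usage with a dummy CIFAR-shaped batch of 32x32 RGB images.
batch = np.random.randint(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)
grey = grey_scale(batch)      # shape (8, 32, 32)
```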
4.2 Adding Noise into Logistic Regression

Figure 3 shows the test error rate using noise-free and noise-added Logistic Regression on MNIST.

Figure 3: Logistic Regression with Noise on MNIST

In Figure 3, the vertical axis is the test error rate (%) and the horizontal axis is the number of iterations. The experiments all run on MNIST. The noise-free line shows the test error rate using a noise-free Logistic Regression model. The noise(gaussian) line shows the test error rate when mask_Gau is applied during gradient descent. The noise(binomial) line shows the test error rate when a mask generated from Bin(1, 0.5) is applied during gradient descent. The noise(rayleigh) line shows the test error rate when a mask generated from Rayleigh(1) is applied during gradient descent. The noise(gamma) line shows the test error rate when a mask generated from Gamma(1, 1) is applied during gradient descent.

Finding 1: A reasonable amount, and a reasonable amplitude, of noise improves a deep neural network model's accuracy, while noise that is too significant does not.

As shown in Figure 3, the noise-added models achieve better accuracy than the noise-free model. The noise(binomial) model has the lowest test error rate (7.156%) among the five experiments. However, it also has the lowest convergence rate. This is a phenomenon we have observed throughout the project: though adding noise can improve accuracy, the side effect is that it takes longer to train the model, hence decreasing the convergence rate.

4.3 Adding Noise into Multi-layer Logistic Regression

Figure 4 shows the test error rate using noise-free and noise-added Multi-layer Logistic Regression on MNIST.

Figure 4: Multi-layer Logistic Regression with Noise on MNIST

In Figure 4, the vertical axis is the test error rate (%) and the horizontal axis is the number of iterations. The experiments all run on MNIST. The noise-free line shows the test error rate using a noise-free Multi-layer Logistic Regression. The dropconnect line shows the test error rate when a mask generated from Bin(1, 0.99) is applied to W_input. The dropout line shows the test error rate when a mask generated from Bin(1, 0.99) is applied to W_output. The dropconnect&out line shows the test error rate when masks generated from Bin(1, 0.99) are applied to both W_input and W_output. The noise-variation line shows the test error rate when a mask generated from the Gaussian N(0, 0.01) is added to W_input.

Finding 2: Deep learning models with noise perform no worse than the noise-free model.

As shown in Figure 4, the noise-added models perform no worse than the noise-free model. Since the test error rate of the noise-free model is already quite low (2.63%), it is difficult for the noise-added models to significantly improve accuracy. We notice that the dropout model and the dropconnect model perform better than the dropconnect&out model and the noise-variation model. It is difficult to provide a conclusive explanation for this observation at the moment because we have not finished fine-tuning our noise-added models. It is possible that noise from certain distributions is more likely to prevent overfitting and hence improve accuracy.

Figure 5 shows the test error rate using noise-free and noise-added Multi-layer Logistic Regression on CIFAR-10.

Figure 5: Multi-layer Logistic Regression with Noise on CIFAR-10

In Figure 5, the vertical axis is the test error rate (%) and the horizontal axis is the number of iterations. The experiments all run on CIFAR-10. The noise-free line, dropconnect line, and dropout line use the same models as the experiments in Figure 4, respectively.

Finding 3: Deep learning models with noise can take more iterations to converge, as the test error fluctuates due to noise.

As shown in Figure 5, the noise-added models perform much better than the noise-free model, though it takes longer to train the noise-added models. An interesting observation is that, as training iterations increase, the test error rate of the noise-added models fluctuates. This is another side effect of adding noise to the model.

Finding 4: Noise added to an earlier stage of a deep learning model can be better integrated and generates less fluctuation.

From the above experiments using MLP, we observe that models with noise added between the input layer and the hidden layer outperform the other noise-added models. An intuitive explanation for this phenomenon is that noise added to an earlier stage of the model can be better integrated, while noise added to a later stage of the model tends to cause more fluctuation.

4.4 Adding Noise into Convolutional Neural Network

Figure 6 shows the test error rate using noise-free and noise-added Convolutional Neural Networks on MNIST.

Figure 6: Convolutional Neural Network with Noise on MNIST

In Figure 6, the vertical axis is the test error rate (%) and the horizontal axis is the number of iterations. The experiments all run on MNIST. The noise-free line shows the test error rate when using a noise-free Convolutional Neural Network. The noise@downsample line shows the test error rate when noise is added during the downsample process. The noise-before-hidden-layer line shows the test error rate when noise is added right before the hidden layer.

Finding 5: The convergence rate is faster for deep learning models with noise.

As shown in Figure 6, the three models perform equally well. We observe that, as the number of iterations increases, the noise-added models converge slightly faster. This phenomenon is interesting because it is unexpected. A similar phenomenon appears in Figure 7 as well.

Figure 7 shows the test error rate using noise-free and noise-added Convolutional Neural Networks on CIFAR-10.

Figure 7: Convolutional Neural Network with Noise on CIFAR-10

In Figure 7, the vertical axis is the test error rate (%) and the horizontal axis is the number of iterations. The experiments all run on CIFAR-10. The noise-free line shows the test error rate when using a noise-free Convolutional Neural Network. The convo-dropconnect line shows the test error rate when the MLP part of the model has noise added between the input layer and the hidden layer. The convo-dropout line shows the test error rate when the MLP part of the model has noise added between the hidden layer and the output layer.

Finding 6: Noise improves both accuracy and convergence rate more with complex deep learning models.

As shown in Figure 7, the three models achieve the same lowest test error rate. However, throughout the training process, the noise-added model (convo-dropconnect) converges faster than the noise-free model. The intuition behind this phenomenon is that, since the Convolutional Neural Network is a complicated model, noise is better integrated and absorbed. We conjecture that noise added to complicated deep learning models can improve not only accuracy but also convergence rate.

4.5 Negative Results

Lesson Learned: Complex deep learning models can integrate noise better than simple models.

Figure 8 shows some negative results from our experiments. We experiment with noise-free and noise-added Logistic Regression on CIFAR-100. The noise-added model performs much worse than the noise-free model. The explanation for this result is that the Logistic Regression model is too simple to integrate noise when running on CIFAR-100. This agrees with our previous conjecture that complicated models are better at integrating noise.

Figure 8: Negative Results on CIFAR-100

5 Conclusions

In this project, we systematically perform experiments studying the effect of adding noise to deep learning neural networks. We conduct experiments on adding different noise to different components of neural network models. The experimental results show that adding noise almost always improves accuracy. Our main observations are: (1) noise added during an early stage of the model can be better integrated, while noise added during a late stage of the model tends to cause fluctuation of accuracy; (2) complicated neural network models can integrate and absorb noise better than simple neural network models; (3) sometimes adding noise can improve not only accuracy but also convergence rate. We hope that this experimental study can provide insights into the future design of deep learning neural network models and machine learning hardware. Next-generation machine learning hardware can fully exploit the result that deep learning models tolerate, and even benefit from, a moderate amount of noise.

Beyond this project, we hope to pursue three major research directions: (1) conduct more thorough experiments that quantitatively analyze the effect of noise on deep learning models; (2) provide a theoretical explanation for the effect of noise on deep learning models based on our experimental results and findings; (3) design and explore more efficient computer hardware and systems for deep learning models.

References

[1] T. Chilimbi, Y. Suzue, J. Apacible, and K. Kalyanaraman, "Project Adam: Building an efficient and scalable deep learning training system," in OSDI, 2014.

[2] L. Wan, M. Zeiler, S. Zhang, Y. L. Cun, and R. Fergus, "Regularization of neural networks using DropConnect," in ICML, 2013.

[3] G. E. Hinton, N. Srivastava, A. Krizhevsky, I. Sutskever, and R. R. Salakhutdinov, "Improving neural networks by preventing co-adaptation of feature detectors," arXiv preprint arXiv:1207.0580, 2012.

[4] I. J. Goodfellow, D. Warde-Farley, M. Mirza, A. C. Courville, and Y. Bengio, "Maxout networks," in ICML, 2013.
