Convolutional Neural Networks (CNNs) for Power System Big Data Analysis


Siby Jose Plathottam, Hossein Salehfar, Prakash Ranganathan
Electrical Engineering, University of North Dakota, Grand Forks, USA

Abstract: The concept of automated power system data analysis using Deep Neural Networks (as part of the routine tasks normally performed by Independent System Operators) is explored and developed in this paper. Specifically, we propose to use the widely used deep neural network architecture known as the Convolutional Neural Network (CNN). To this end, a 2-D representation of power system data is developed and proposed. To show the relevance of the proposed concept, a multi-class, multi-label classification problem is presented as an application example. Midcontinent ISO (MISO) data sets on wind power and load are used for this purpose. TensorFlow, an open-source machine learning platform, is used to construct and train the CNN. The results are discussed and compared with those from standard Feed Forward Networks for the same data.

Index Terms: Deep Learning, Machine Learning, Convolutional NN, Feed Forward NN, wind power generation, Artificial Intelligence

I. INTRODUCTION

The capabilities of Artificial Intelligence (AI) programs have grown in unforeseen ways during the last 3 to 4 years. These include tasks like computer vision with above 90% accuracy [1], [2], playing computer games with human-level skill [3], and defeating the reigning world champion in the ancient board game of Go [4], [5]. The last task was not expected to be accomplished by a computer until the next decade. Much of this progress can be attributed to a class of Machine Learning (ML) algorithms called Deep Neural Networks (DNNs), also known as Deep Learning (DL) [6]. It must be noted that the game-playing AI programs also used an ML concept known as Reinforcement Learning (RL) to perform their tasks. The architecture of DNNs is not task-specific; hence the same general learning algorithm may be repurposed for other tasks. One example is how Google used the same algorithm that learned to play Go to optimize the operation of their data center cooling systems, improving efficiency by 40% [7].

The success that AI applications using DNNs have achieved in solving tasks once thought to be solvable only by human experts lends hope that these techniques can also be applied to complex power system problems. One potential application is in the Independent System Operator (ISO) domain. ISOs are entities that coordinate the generation and transmission of electric power within their control area [8]. The human operators of an ISO make decisions, such as dispatching generation, scheduling tie-line interchanges, and fixing spot energy prices, every few minutes to ensure system stability, power quality, and fairness to all utilities. They are aided in these decisions by optimization algorithms such as Security Constrained Economic Dispatch and Optimal Power Flow programs. Thus, the final decision has a human in the loop. As distributed generation resources in the form of wind turbine generators and solar photovoltaics (PVs) are continually added, power system operation and control is becoming increasingly complex. In other words, a paradigm shift is happening in the electric power systems domain [9].
It is prudent that advanced computational tools be developed for ISOs to ensure that tomorrow's power grids are more reliable and cost-effective than those of today. An AI platform using ML that resides within an ISO's data center to actively monitor and react in real time to a continuous stream of data from the grid is such a tool. This is a feasible goal, given the continual improvements in ML algorithms and the ever-increasing capability of distributed computing techniques. However, as with most promising new technologies, success is not guaranteed. A truly successful ISO AI decision system would require multiple iterations of concepts and algorithms. One possible approach is to use an ensemble consisting of DNNs of different architectures with specific expertise; this may also require the use of RL. The various functionalities that an ISO AI decision system would comprise are illustrated in Fig. 1.

The objective of this paper is to use a type of DNN architecture called the Convolutional Neural Network (CNN) to classify large power system data sets. CNNs can be used as one of the building blocks of an ISO AI decision system, performing analysis of large volumes of historical data. The next section gives a brief background on CNNs and how they are useful in processing data sets whose data points are sequentially and spatially related to each other.

This work has been supported by the NSF and North Dakota EPSCoR Program through grant #IIA

Figure 1. Functionalities inside an ISO AI decision system.

II. CNN ARCHITECTURE

A. Convolutional layer operation

The simplest form of a DNN may be that of an ordinary Feed Forward Neural Network (FFNN) with 2 or more hidden layers. However, many of the best recent results using DNNs have come from the use of CNNs, originally proposed by LeCun et al. in 1998 [10]. An illustration of how the CNN architecture fits within the wider world of AI programs is shown in Fig. 2.

Figure 2. The CNN within the AI domain.

CNNs share many similarities with FFNNs. The main conceptual difference between the two is that CNNs preserve the spatial relationship between data points, while FFNNs do not. This is one reason why CNNs have found such success in image recognition and other similarly complex tasks. An image is a 2-dimensional (2-D) array of pixel values having fixed height, width, and color channels. A CNN can analyze an image as a 2-D array of numbers (or as a 3-D array, if there is more than one color channel) having the same shape as the original image. In the case of an FFNN, however, the 2-D array would need to be flattened into a 1-D array, potentially destroying any spatial relationship between data points. Another major difference is that, unlike in FFNNs, an individual neuron in a CNN is not connected to every pixel in the image at the same time. Instead, each neuron in a CNN has a window of specific height and width (i.e., a weight matrix) through which it analyzes an image patch having the same height and width. This window is known as the convolution filter, and it works by sliding over the entire area of the image one patch at a time. The convolution operation (i.e., the elementwise multiplication and summation of pixel values with the convolution filter weights) produces one value for each image patch. Sliding over the entire image matrix produces a feature map whose dimensions depend on both the dimensions of the image and those of the convolution filter. An illustration of the convolution operation is given in Fig. 3.

Figure 3. An illustration of the difference between operations in a convolutional layer and a feed forward layer.

The shape of the feature map, i.e., its width (FM_W) and height (FM_H), may be calculated using Equations (1) and (2), respectively, where I_W and I_H are the width and height of the 2-D array, CF_W and CF_H are the convolution filter width and height, and S_W and S_H are the strides of the sliding window along the width and height:

FM_W = (I_W - CF_W) / S_W + 1    (1)

FM_H = (I_H - CF_H) / S_H + 1    (2)

The size, number, and stride of the convolution filters are hyperparameters of the CNN that must be fine-tuned for each application.
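As a quick numerical check of Equations (1) and (2), the following sketch computes the feature-map shape for the 2 x 24 input arrays used later in this work. The helper function and example dimensions are illustrative, not from the original paper:

```python
def feature_map_shape(i_w, i_h, cf_w, cf_h, s_w=1, s_h=1):
    """Feature-map width and height per Equations (1) and (2):
    FM = (I - CF) / S + 1 along each dimension (valid padding)."""
    fm_w = (i_w - cf_w) // s_w + 1
    fm_h = (i_h - cf_h) // s_h + 1
    return fm_w, fm_h

# Example: a 2 x 24 input convolved with a 2 x 6 filter at stride 1
# yields a 1 x 19 feature map.
print(feature_map_shape(i_w=24, i_h=2, cf_w=6, cf_h=2))  # -> (19, 1)
```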

The input to a CNN is in no way limited to pixel values from image data; a CNN can be applied to any type of data with sequential information. One field where CNNs have been highly successful is computational biology, where they have been used to classify DNA sequences [11] and to predict the specificity of DNA-protein binding [12]. In the present paper, the authors extend the use of CNNs to processing the large sets of sequential data collected by ISOs, namely 24-hr power generation and load data. A recent work used a related Deep Learning technique, Auto Encoders, to predict solar PV power generation [13].

B. Training and inference using CNNs

Using an ML architecture like a CNN to write an AI program is conceptually different from traditional programming. Rather than writing instructions for each step of a task, one must specify a learning algorithm for the network and provide many training samples of input-output pairs. The process through which the weights and biases of the CNN adjust themselves using the learning algorithm and data samples is known as training. Almost all learning algorithms use some variation of the backpropagation algorithm [14]. In this work, a mini-batch gradient descent technique is used, where the training data set is split into multiple mini-batches and the network is trained on each batch consecutively. The learning algorithm uses a function known as the loss function to measure the difference between the output produced by the CNN and the actual targeted output. The selection of the loss function depends primarily on whether the neural network is performing a classification or a regression task. In this work, since the neural network is performing a classification task, a cross-entropy loss function [15] is used.

Training a network is generally a time-consuming process; it may take many hours, days, or even weeks to fully train a CNN from scratch, depending on the number of training data samples and the complexity of the CNN. The process in which input data is fed into an already trained network to produce outputs is known as inference. Inference can be performed quickly by any properly trained neural network and may take only a few milliseconds or less, because the underlying mathematical operations in the trained network are simple and are performed in parallel.

To test how well the network is learning during (or after) training, it is necessary to measure the loss on input data that was not part of the training data set. For this, a portion of the training data is separated before the start of training; this is referred to as the validation data. Only when the loss on the validation data decreases in tandem with the loss on the training data can the neural network be said to be learning. If the opposite happens, i.e., the training error decreases while the validation error increases, the network is said to be memorizing, and it will not be able to generalize and produce good results during inference.
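The mini-batch and validation bookkeeping described above can be sketched as follows. This is a minimal illustration assuming NumPy arrays; all array names, shapes, and the size of the held-out split are chosen for illustration rather than taken from the paper:

```python
import numpy as np

def minibatches(inputs, targets, batch_size, rng):
    """Yield shuffled mini-batches, as in mini-batch gradient descent."""
    order = rng.permutation(len(inputs))
    for start in range(0, len(inputs), batch_size):
        idx = order[start:start + batch_size]
        yield inputs[idx], targets[idx]

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2, 24))                       # placeholder (days, sources, hours)
Y = rng.integers(0, 2, size=(1000, 8)).astype("float32") # placeholder one-hot labels

# Hold out a slice of the data as validation data before training begins.
n_val = 100  # illustrative split, not the paper's
X_train, Y_train = X[:-n_val], Y[:-n_val]
X_val, Y_val = X[-n_val:], Y[-n_val:]

for x_batch, y_batch in minibatches(X_train, Y_train, batch_size=32, rng=rng):
    pass  # one optimizer step per mini-batch would go here
```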
III. CLASSIFICATION USING CNNs

The various tasks performed by ML algorithms can be broadly classified into two areas, namely classification and regression. CNNs can perform both of these tasks, though classification is the more widely used application. In this work, the CNN is trained to perform a multi-class, multi-label classification task, as illustrated in Fig. 4. Each input sample can have features from multiple classes, but within each class it can be assigned only one label.

Figure 4. A multi-class and multi-label classification task.

This work uses the one-hot encoding concept to represent the labels, whereby each neuron in the output layer corresponds to a unique label [16]. Correspondingly, a sample of the output data used for training is a vector with a size equal to the total number of labels, and the values within this vector may be either 0 or 1. It is possible to force the output of a neuron to take a value of either 0 or 1 using a discrete step function, but this severely limits the learning ability of the network. Instead, it is more advantageous to compute a probability value for each output using a continuously differentiable function like the Sigmoid activation function. The Sigmoid squeezes the output of a neuron to a value between 0 and 1 [17], [18], which can represent the probability of a decision.

A. Classification using class-separated Softmax activation

Probabilities can also be calculated using the Softmax function (4), which takes advantage of the fact that the labels within a class are mutually exclusive:

y_i = e^(x_i) / Σ_{j=1}^{n} e^(x_j)    (4)

where y_i is the probability that the input being classified belongs to the i-th label, x_i is the output of the neuron corresponding to the i-th label, and n is the number of labels within the class. Also,

Σ_{i=1}^{n} y_i = 1    (5)

In this work, there is a separate Softmax activation for each class. Hence, the loss function is calculated separately for each class, and the per-class losses are added together, as in (6):

loss = Σ_{k=1}^{m} Σ_{i=1}^{n_k} xentropy(y_i^k, ŷ_i^k)    (6)

where ŷ_i^k is the neural network output corresponding to the i-th label in the k-th class, y_i^k is the actual output corresponding to the i-th label in the k-th class, m is the number of classes, and n_k is the total number of labels in the k-th class.
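A possible TensorFlow rendering of the class-separated Softmax loss of Equations (4)-(6) is sketched below. The class sizes (3, 3, 2) follow the label layout of Fig. 5 later in this paper; the function name and the use of a batch mean are our own choices, not specified in the original:

```python
import tensorflow as tf

def class_separated_softmax_loss(logits, labels, class_sizes=(3, 3, 2)):
    """Apply a separate Softmax per class (Eq. 4) and sum the
    per-class cross-entropy losses over all classes (Eq. 6)."""
    loss, start = 0.0, 0
    for n_k in class_sizes:
        # Slice out the n_k labels belonging to class k and compute
        # softmax cross-entropy over that slice only.
        loss += tf.reduce_mean(
            tf.nn.softmax_cross_entropy_with_logits(
                labels=labels[:, start:start + n_k],
                logits=logits[:, start:start + n_k]))
        start += n_k
    return loss
```

Because the Softmax is applied per slice, the probabilities within each class sum to 1 as required by Equation (5), while labels in different classes remain independent.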

Both Softmax and Sigmoid activation functions are used separately in this work, and their performances are then compared.

IV. ELECTRIC POWER SYSTEM DATA STREAMS

In an electric power system, the sum of power generated, power consumed, power losses, and energy stored must be equal to zero at every time instant.

A. Training data for the CNN

This paper uses wind power generation and load data from the Midcontinent ISO (MISO) for analysis by a CNN. Each data sample corresponds to the 24-hr wind power generation and actual load for one day; hence, one sample of input data consists of 48 unique generation and load values. In this work, as an example, the CNN is trained to extract 3 different features from the 24-hr data and predict their labels (classification). The features are the mean wind power, the standard deviation of the wind power, and the fraction of total load that is served by wind power generators. The vector corresponding to an output data sample is illustrated in Fig. 5.

Figure 5. Sample of training output: a one-hot label vector spanning three classes, wind power strength (LOW/MED/HIGH), wind power variability (LOW/MED/HIGH), and wind power load share (LOW/HIGH), with MW and percentage thresholds separating the labels within each class.

V. CNN APPLICATION TO POWER SYSTEM DATA STREAM

As with pixel data from images, the CNN processes a 2-D array containing power data without flattening that data. To do this, the data is arranged in the form of a stack of 2-D arrays. The width of the array corresponds to the number of time blocks (24 in this case), and the height of the array corresponds to the number of data sources (wind generation and load in this case). This concept is illustrated in Fig. 6. The size of the 2-D array grows with the number of data sources and time blocks in the problem.

Figure 6. Stacking 2-D arrays of power system generation and load data.

The operations within the CNN that are used to process the 24-hr power data arranged as stacks of 2-D arrays are explained below. The 1st convolutional layer in the CNN processes a 2-D array of size 2 x 24 using a filter of size 2 x CF_W. Some number n of such filters may be used, and each filter produces a feature map of size 1 x FM_W. A non-linear activation function like ReLU is applied to each element of the feature map [19]. A second convolutional layer may be added to work on the feature maps produced by the previous convolutional layer, and there may be m such convolutional layers in total. Using a larger number of layers results in a finer representation of the input data features; however, it also increases the computational time as well as the memory required. The feature maps output by the last convolutional layer are flattened and given as input to an FFNN. Finally, in the output layer, the probabilities of the different classes are calculated using a Softmax or Sigmoid activation.

A. Implementing the CNN computational graph using TensorFlow

In order to train the CNN using the power system data, a computational graph was developed in Google's TensorFlow machine learning library [20], [21]. One of the many advantages of using TensorFlow is the possibility of visualizing the computational graph of the algorithm after coding has been performed. The computational graph used to implement the CNN of this work is shown in Fig. 7, and the details of the layers are given in Table I and Table II. A cross-entropy function [15] is used to calculate the loss, or error, between the CNN output and the actual labels during the training process. The Adam optimization algorithm [22] was used in this work to train the CNN weights. A stochastic gradient descent method was implemented by changing the size of the training batch in each epoch. For comparison, the same data set was also processed using a standard feed forward network (FFNN) with a single hidden layer. The number of neurons in the hidden layer was chosen such that the number of parameters in the FFNN would be roughly equal to the number of parameters in the CNN. The details of the layers in the FFNN are given in Table III.

Figure 7. Computational graph for the CNN, generated using TensorFlow.
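For illustration, a modern tf.keras re-sketch of the computational graph of Fig. 7 is given below, using the layer sizes as reconstructed in Tables I and II. The original work used the TensorFlow 1.x graph API, the ReLU on the fully connected layer is an assumption, and the sigmoid-style binary cross-entropy shown here is only one of the two output configurations the paper compares:

```python
import tensorflow as tf

# Input: one day of data as a 2 x 24 array (sources x hourly blocks),
# treated as a single-channel "image".
inputs = tf.keras.Input(shape=(2, 24, 1))
# Conv layer 1: four 2 x 6 filters -> four 1 x 19 feature maps (Table I).
x = tf.keras.layers.Conv2D(4, (2, 6), activation="relu")(inputs)
# Conv layer 2: four 1 x 6 filters -> four 1 x 14 feature maps.
x = tf.keras.layers.Conv2D(4, (1, 6), activation="relu")(x)
x = tf.keras.layers.Flatten()(x)        # 4 x 14 = 56 inputs (Table II)
x = tf.keras.layers.Dense(8, activation="relu")(x)
outputs = tf.keras.layers.Dense(8)(x)   # logits for the 8 output labels
model = tf.keras.Model(inputs, outputs)

# Sigmoid-output variant; the learning rate follows the value reported
# in Section VI.
model.compile(optimizer=tf.keras.optimizers.Adam(1e-6),
              loss=tf.keras.losses.BinaryCrossentropy(from_logits=True))
```

With these layer sizes, model.count_params() matches the roughly 680-parameter budget discussed with Tables I-III below.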

TABLE I. HYPERPARAMETERS - CONVOLUTIONAL LAYERS IN CNN

Layer               | Conv layer 1    | Conv layer 2
Filter size         | 2 x 6           | 1 x 6
Number of filters   | 4               | 4
Feature map size    | 1 x 19          | 1 x 14
Weights             | 2 x 6 x 4 = 48  | 6 x 4 x 4 = 96
Biases              | 4               | 4

TABLE II. HYPERPARAMETERS - FULLY CONNECTED LAYERS IN CNN

Layer                     | Fully connected layer | Output layer
Number of inputs          | 4 x 14 = 56           | 8
Number of neurons/outputs | 8                     | 8
Weights                   | 56 x 8 = 448          | 8 x 8 = 64
Biases                    | 8                     | 8

TABLE III. HYPERPARAMETERS FOR FFNN

Layer                     | First layer    | Output layer
Number of inputs          | 48             | 12
Number of neurons/outputs | 12             | 8
Weights                   | 48 x 12 = 576  | 12 x 8 = 96
Biases                    | 12             | 8

Tables I-III indicate that the CNN has four layers while the FFNN has only two, even though the number of unique parameters (weights and biases) in both is nearly the same (680 for the CNN and 692 for the FFNN). If another power data source were added as an input, the number of weights would increase by 288 for the FFNN; for the CNN, only 24 additional weights would be required.

VI. RESULTS

The training inputs of the CNN were obtained from MISO data from 2015, 2016, and 2017 [23]. Each day of wind power generation and load data yields one sample containing 48 values. The majority of the days were used as training data, with the remaining days set aside as validation data. The learning rate was set at 10^-6 for all epochs. Training and inference were done on an Intel Core i7 PC with 8 GB of RAM. The training time was about 4 hours for the CNN and about 2.5 hours for the standard FFNN. The change in cross-entropy loss with respect to the training epochs is shown in Fig. 8. The improvement in average classification accuracy during training is shown in Fig. 9, and the classification accuracy for each class is given in Fig. 10.

Figure 8. Cross-entropy loss w.r.t. epochs for the CNN, for training and validation data with Sigmoid and Softmax output layers.

Figure 9. Average accuracy w.r.t. epochs for the CNN, for training and validation data with Sigmoid and Softmax output layers.

Figure 10. Accuracy for each of the three classes using Softmax activation, w.r.t. epochs.

Changes in the values of the weights in the 1st convolutional layer as training progresses can be visualized as a histogram, as shown in Fig. 11, using the TensorFlow visualization tool known as TensorBoard. Here the Y-axis represents the epochs, the X-axis is the spread of weight values, and the Z-axis is the number of parameters taking a particular value. The change in the weight values over the epochs indicates that the CNN is learning during the training process.

Figure 11. Evolution of the weights in the 1st convolutional layer w.r.t. epochs, visualized as histograms for Softmax (left) and Sigmoid (right) output layers.

The FFNN was also trained with the same data set and optimization algorithm as the CNN. The changes in cross-entropy loss and average classification accuracy as training progressed are shown in Fig. 12 and Fig. 13, respectively.

Figure 12. Cross-entropy loss w.r.t. epochs for the FFNN.

Figure 13. Average accuracy w.r.t. epochs for the FFNN.
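The histogram logging behind Fig. 11 can be reproduced with TensorBoard's summary API. The sketch below uses the TensorFlow 2.x interface rather than the 1.x summaries available when this paper was written, and the layer indexing assumes the model sketch given earlier:

```python
import tensorflow as tf

writer = tf.summary.create_file_writer("logs/cnn")

def log_conv_weights(model, epoch):
    """Record the 1st convolutional layer's weights once per epoch so that
    TensorBoard can render per-epoch histograms like those in Fig. 11."""
    conv1 = model.layers[1]  # layers[0] is the Input layer in the earlier sketch
    with writer.as_default():
        tf.summary.histogram("conv1/weights", conv1.weights[0], step=epoch)
```

Calling log_conv_weights(model, epoch) at the end of each training epoch and then running tensorboard pointed at the logs directory produces the stacked-histogram view described above.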

TABLE IV. AVERAGE CLASSIFICATION ACCURACY ON VALIDATION DATA FOR THE CNN AND FFNN

Neural network type    | Validation accuracy
CNN (Softmax output)   | 79%
CNN (Sigmoid output)   | 77%
FFNN (Softmax output)  | 78%
FFNN (Sigmoid output)  | 72%

A. Discussion of results

The final classification accuracies obtained are given in Table IV. From the results, it can be observed that training a CNN using 24-hr power data sets is feasible. Since mini-batch gradient descent is used, the plots corresponding to the training data are noisy. The cross-entropy loss as well as the classification accuracy improves for both the training and validation data sets, which indicates that the CNN is generalizing and not just memorizing. For the output layers, the Softmax can be said to be marginally better than the Sigmoid in terms of average classification accuracy. It can also be seen that the trends for loss and accuracy have not entirely plateaued in the case of the CNN. For the FFNN, there is a noticeable difference in performance on the validation data, which would indicate that the FFNN is less effective at generalizing.

VII. CONCLUSION

This work has proposed the use of Machine Learning algorithms such as Convolutional Neural Networks (CNNs) to develop an ISO AI decision system that can aid, or even replace, human operators in efficiently controlling the complex power grids of tomorrow. The operation of the CNN and the concept of feeding the CNN with power data in the form of stacks of 2-D arrays were introduced. The CNN was trained using power data from MISO to perform multi-class, multi-label classification. The utility of TensorFlow for training and analyzing neural networks was also discussed. Up to 90% accuracy was obtained on the training data set, and 79% accuracy on the validation data set was observed using a Softmax classifier. The results underscore the feasibility of using CNNs for big data analysis in power systems. There is still significant scope for improvement by using bigger data sets, experimenting with activation functions such as the Sigmoid or the Exponential Linear Unit (ELU), and fine-tuning the CNN hyperparameters.

REFERENCES

[1] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems 25, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, Eds. Curran Associates, Inc., 2012, pp. 1097-1105.
[2] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," Sep. 2014.
[3] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and M. Riedmiller, "Playing Atari with deep reinforcement learning," Dec. 2013.
[4] D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. van den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, "Mastering the game of Go with deep neural networks and tree search," Nature, vol. 529, no. 7587, pp. 484-489, Jan. 2016.
[5] S. Byford, "Google's AlphaGo AI beats Lee Se-dol again to win Go series 4-1," The Verge. [Online].
[6] Y. LeCun, Y. Bengio, and G. Hinton, "Deep learning," Nature, vol. 521, no. 7553, pp. 436-444, May 2015.
[7] R. Evans and J. Gao, "DeepMind AI reduces energy used for cooling Google data centers by 40%," Google Green Blog, 2016. [Online].
[8] "Role of ISOs and RTOs," ISO/RTO Resource Council. [Online]. [Accessed: Mar-2017].
[9] M. Sarwar and B. Asad, "A review on future power systems; technologies and research for smart grids," in 2016 International Conference on Emerging Technologies (ICET), 2016.
[10] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-based learning applied to document recognition," Proc. IEEE, vol. 86, no. 11, pp. 2278-2324, 1998.
[11] R. Rizzo, A. Fiannaca, M. La Rosa, and A. Urso, "A deep learning approach to DNA sequence classification," in 12th International Meeting on Computational Intelligence Methods for Bioinformatics and Biostatistics, 2016.
[12] H. Zeng, M. D. Edwards, G. Liu, and D. K. Gifford, "Convolutional neural network architectures for predicting DNA-protein binding," Bioinformatics, vol. 32, no. 12, pp. i121-i127, Jun. 2016.
[13] A. Gensler, J. Henze, B. Sick, and N. Raabe, "Deep learning for solar power forecasting: an approach using AutoEncoder and LSTM neural networks," in 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2016.
[14] S. S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed. Prentice Hall, 1999.
[15] P. Golik, P. Doetsch, and H. Ney, "Cross-entropy vs. squared error training: a theoretical and experimental comparison," in Interspeech, 2013.
[16] A. Géron, Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems, 1st ed. O'Reilly Media, Inc., 2017.
[17] "The power of approximating: a comparison of activation functions," in Proceedings of the 5th International Conference on Neural Information Processing Systems, S. J. Hanson, J. D. Cowan, and C. L. Giles, Eds., 1992.
[18] H. N. Mhaskar and C. A. Micchelli, "How to choose an activation function," Morgan Kaufmann Publishers Inc., 1993.
[19] B. Xu, N. Wang, T. Chen, and M. Li, "Empirical evaluation of rectified activations in convolutional network," May 2015.
[20] M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mane, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viegas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, "TensorFlow: Large-scale machine learning on heterogeneous distributed systems," Mar. 2016.
[21] J. Dean and R. Monga, "TensorFlow - Google's latest machine learning system, open sourced for everyone," Google Research Blog, 2015. [Online].
[22] D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv, Dec. 2014.
[23] "Market Reports," MISO. [Online]. Available: orts.aspx. [Accessed: 7-Apr-2017].
