Introduction to Neural Networks

1 ECE 5775 (Fall 17) High-Level Digital Design Automation
Introduction to Neural Networks
Ritchie Zhao, Zhiru Zhang
School of Electrical and Computer Engineering

2 Rise of the Machines
Neural networks have revolutionized the world: self-driving vehicles, advanced image recognition, game AI.

3 Rise of the Machines
Neural networks have revolutionized research.

4 A Brief History
- 1950s: First artificial neural networks created, based on biological structures in the human visual cortex.
- 1980s-2000s: NNs considered inferior to other, simpler algorithms (e.g. SVM, logistic regression).
- Mid 2000s: NN research considered dead; machine learning conferences outright reject most NN papers.
- NNs then begin winning large-scale image and document classification contests, beating other methods.
- 2012-Now: NNs prove themselves in many industrial applications (web search, translation, image analysis).

5 Classification Problems
We'll discuss neural networks for solving supervised classification problems.
- Given a training set consisting of labeled inputs (e.g. images labeled Cat, Bird)
- Learn a function which maps inputs to labels
- This function is used to predict the labels of new inputs in the test set (e.g. an unlabeled Person image)

6 A Simple Classification Problem
- Inputs: pairs of numbers (x_1, x_2)
- Labels: 0 or 1
This is a binary decision problem. Real-life analogue: Label = Raining or Not Raining, x_1 = relative humidity, x_2 = cloud coverage. [Slide shows a training-set table with columns x_1, x_2, Label.]

7 Visualizing the Data
[Slide shows the training-set table alongside a plot of the data points in the (x_1, x_2) plane.]

8 Decision Function
In this case, the data points can be classified with a linear decision boundary. [Slide shows the plot with the decision boundary drawn.]

9 The Perceptron
The perceptron is the simplest possible neural network, containing only one neuron. It is described by the following equation:

    y = σ( Σ_i w_i x_i + b )

where the w_i are the weights, b is the bias, and σ is the activation function.
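
To make the equation concrete, here is a minimal C sketch of a perceptron forward pass (the function names are ours; the slide gives only the math):

#include <math.h>

/* Sigmoid activation: squashes any real input into the interval (0, 1). */
static double sigmoid(double s) {
    return 1.0 / (1.0 + exp(-s));
}

/* Perceptron: y = sigmoid(sum_i w[i]*x[i] + b), for n inputs
 * (n = 2 in the examples that follow). */
double perceptron(const double *x, const double *w, double b, int n) {
    double s = b;
    for (int i = 0; i < n; i++)
        s += w[i] * x[i];
    return sigmoid(s);
}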

10 Visualizing the Perceptron
A 2-input, 1-output perceptron: inputs x_1, x_2 are scaled by weights w_1, w_2, summed together with the bias b, and passed through σ to produce y.
- The sum Wx + b is a linear function of the input x = (x_1, x_2).
- The non-linear activation squashes the output to the interval (0, 1).
- The output y is the probability that the label should be 1.

11 Activation Function
The activation function σ is non-linear. Common choices:
- Unit Step: 0 for x < 0, 1 for x ≥ 0. Discontinuous at the origin, zero derivative everywhere.
- Sigmoid: 1 / (1 + e^(-x)). Continuous and differentiable everywhere.
- ReLU: max(0, x). Continuous but non-differentiable at the origin.
The first perceptrons used Unit Step. Early neural networks used Sigmoid. Modern deep networks use ReLU.
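
The three activations translate directly into C; a small sketch (names are ours):

double unit_step(double x) { return x < 0.0 ? 0.0 : 1.0; }   /* discontinuous at 0 */
double sigmoid(double x)   { return 1.0 / (1.0 + exp(-x)); } /* smooth, output in (0, 1); needs <math.h> */
double relu(double x)      { return x > 0.0 ? x : 0.0; }     /* kink at 0, identity for x > 0 */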

12 The Perceptron Decision Boundary
Let's plot the 2-input perceptron (sigmoid activation): y = σ(w_1 x_1 + w_2 x_2 + b).
Three settings are shown: (w_1, w_2) = (10, 10), (5, 20), and (20, 5).
The ratio of the weights changes the direction of the decision boundary.

13 The Perceptron Decision Boundary
Let's plot the 2-input perceptron (sigmoid activation): y = σ(w_1 x_1 + w_2 x_2 + b).
Three settings are shown: (w_1, w_2) = (10, 10), (5, 5), and (2, 2).
The magnitude of the weights changes the steepness of the decision boundary.

14 The Perceptron Decision Boundary
Let's plot the 2-input perceptron (sigmoid activation): y = σ(w_1 x_1 + w_2 x_2 + b).
Three settings are shown: b = 0, b = 5, and b = -5.
The bias moves the boundary away from the origin.

15 Finding the Weights and Bias
With the right weights and bias we can create any (linear) decision boundary we want!
But how do we compute the weights and bias? Solution: backpropagation training.

16 Backpropagation
Let's take the 2-input perceptron and initialize the weights and bias randomly. [Slide shows the perceptron diagram with concrete random values; the bias is initialized to b = 1.4.]
Inspired by course slides from L. Fei-Fei, J. Johnson, S. Yeung, CS231n at Stanford University.

17 Backpropagation
Send one input pair through and calculate the output. [Slide traces one training pair through the network: input x_1 = 0.7, Output: 0.95, Target: 0.]
Inspired by course slides from L. Fei-Fei, J. Johnson, S. Yeung, CS231n at Stanford University.

18 Backpropagation Training
How do we change {w_1, w_2, b} to move y towards the target? Answer: partial derivatives ∂y/∂w_1, ∂y/∂w_2, ∂y/∂b.
Inspired by course slides from L. Fei-Fei, J. Johnson, S. Yeung, CS231n at Stanford University.

19 Backpropagation
We can calculate the derivatives function-by-function (green = partial derivative in the slide).
At the output node, y = σ(s), so ∂y/∂s = e^(-s) / (1 + e^(-s))^2 = y(1 - y), and the seed gradient is ∂y/∂y = 1.0.
Inspired by course slides from L. Fei-Fei, J. Johnson, S. Yeung, CS231n at Stanford University.

20 Backpropagation
Propagate partial derivatives backwards. Since s = w_1 x_1 + w_2 x_2 + b:
∂s/∂b = 1, ∂s/∂w_1 = x_1, ∂s/∂w_2 = x_2.
Inspired by course slides from L. Fei-Fei, J. Johnson, S. Yeung, CS231n at Stanford University.

21 Backpropagation
What we've done is apply the chain rule of derivatives:
∂y/∂w_1 = (∂y/∂s) · (∂s/∂w_1), with ∂s/∂w_1 = x_1 = 0.7 and ∂y/∂s = y(1 - y) = 0.95 · 0.05 ≈ 0.0475.
Inspired by course slides from L. Fei-Fei, J. Johnson, S. Yeung, CS231n at Stanford University.

22 Backpropagation
In general, any operator f(x, y) in a neural network defines a forward pass and a backward pass. Given the gradient ∂E/∂f flowing in from above:
∂E/∂x = (∂E/∂f) · (∂f/∂x) and ∂E/∂y = (∂E/∂f) · (∂f/∂y).
Backpropagation allows us to build complex networks and still be able to update the weights.
Inspired by course slides from L. Fei-Fei, J. Johnson, S. Yeung, CS231n at Stanford University.

23 Weight Update
We now know how to compute ∂y/∂w_1, ∂y/∂w_2, ∂y/∂b.
Parameter update rule (for minimizing y): w = w - η · ∂y/∂w, where η is the learning rate or step size.
This is optimization by gradient descent: we step in the direction of steepest descent, against the gradient.
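
Putting slides 16-23 together, here is a runnable sketch of one backpropagation step for the 2-input perceptron. Only the input x_1 = 0.7 and the bias b = 1.4 come from the slides; x_2, the initial weights, and the learning rate are placeholders:

#include <math.h>
#include <stdio.h>

static double sigmoid(double s) { return 1.0 / (1.0 + exp(-s)); }

int main(void) {
    double x1 = 0.7, x2 = 0.5;     /* x2 is a placeholder */
    double w1 = 0.3, w2 = -0.2;    /* placeholder random weights */
    double b  = 1.4;               /* bias from the slides */
    double eta = 0.1;              /* learning rate (placeholder) */

    /* Forward pass: s = w1*x1 + w2*x2 + b, then y = sigmoid(s). */
    double s = w1 * x1 + w2 * x2 + b;
    double y = sigmoid(s);

    /* Backward pass (chain rule). As in the slides we minimize y
     * directly (target label t = 0), so the seed gradient dy/dy = 1. */
    double dy_ds  = y * (1.0 - y);  /* sigmoid derivative: y(1 - y) */
    double dy_dw1 = dy_ds * x1;     /* ds/dw1 = x1 */
    double dy_dw2 = dy_ds * x2;     /* ds/dw2 = x2 */
    double dy_db  = dy_ds;          /* ds/db = 1 */

    /* Gradient descent update: step against the gradient. */
    w1 -= eta * dy_dw1;
    w2 -= eta * dy_dw2;
    b  -= eta * dy_db;

    printf("y = %f; after one step: w1 = %f, w2 = %f, b = %f\n", y, w1, w2, b);
    return 0;
}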

24 Practical Differences
- For simplicity we minimized y directly, since the label t = 0. In practice we use a loss function L(y, t):
  Square loss: L(y, t) = (y - t)^2
  Cross entropy: L(y, t) = -t ln(y) - (1 - t) ln(1 - y)
- For simplicity we dealt with 1 training sample. In practice we process multiple training samples (a minibatch) and sum the gradients before updating the weights.
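
The two losses in C, together with their derivatives with respect to the prediction y (the derivatives are standard calculus, not shown on the slide):

#include <math.h>

/* Square loss and its gradient w.r.t. y. */
double square_loss(double y, double t)    { return (y - t) * (y - t); }
double square_loss_dy(double y, double t) { return 2.0 * (y - t); }

/* Cross-entropy loss and its gradient w.r.t. y (t is the 0/1 label). */
double cross_entropy(double y, double t)    { return -t * log(y) - (1.0 - t) * log(1.0 - y); }
double cross_entropy_dy(double y, double t) { return -t / y + (1.0 - t) / (1.0 - y); }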

25 Deep Neural Network
The perceptron's linear decision boundary is not very useful for complex problems. A deep neural network (DNN) consists of many layers, each containing multiple neurons. [Slide shows a fully connected network: input layer, hidden layers, output layer.]
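
Structurally, a fully connected layer is just many perceptrons sharing the same inputs; a sketch (layout and names are ours):

#include <math.h>

static double sigmoid(double s) { return 1.0 / (1.0 + exp(-s)); }

/* One fully connected layer: every output neuron sees every input.
 * W is row-major with shape [n_out][n_in]. */
void fc_layer(const double *in, int n_in, double *out, int n_out,
              const double *W, const double *bias) {
    for (int o = 0; o < n_out; o++) {
        double s = bias[o];
        for (int i = 0; i < n_in; i++)
            s += W[o * n_in + i] * in[i];
        out[o] = sigmoid(s);   /* one non-linear activation per unit */
    }
}

A DNN forward pass then chains such layers, feeding each layer's out into the next layer's in.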

26 What Happens in a DNN?
Each hidden unit learns a hidden feature. Later features depend on earlier features in the network, so they get more complex: simple features in early layers, more complex features in later layers.

27 Deep Neural Network
A deep neural network creates highly non-linear decision boundaries. [Slide shows boundaries learned with 2, 3, and 4 hidden units.]

28 Convolutional Neural Network
A convolutional neural network (CNN) is a neural network specifically built to analyze images, and is state-of-the-art in image classification. Key difference: the layers at the front are convolutional layers, followed by fully-connected layers.

29 The Convolutional Layer
Fully-connected layer:
1. An output neuron is connected to each input neuron.
Convolutional layer:
1. Inputs and outputs are vectors of 2D feature maps.
2. An output pixel is connected to its neighboring region on each input feature map.
3. All pixels on an output feature map use the same weights.

30 The Convolutional Layer
The conv layer moves a weight filter across the input map, doing a dot product at each location. The weight filter detects local features in the input map. [Slide shows an input feature map and the resulting output feature map.]

31 Intuition on CNNs
Convolutional layers detect visual features. The slide shows features in the input image which cause high activity, across 3 different conv layers in a CNN.
Image credit: H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng, Unsupervised Learning of Hierarchical Representations with Convolutional Deep Belief Networks, CACM Oct. 2011.

32 Accelerating CNNs on FPGA
Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks
Chen Zhang (1), Peng Li (2), Guangyu Sun (1,3), Yijin Guan (1), Bingjun Xiao (2), Jason Cong (2,3,1)
(1) Center for Energy-Efficient Computing and Applications, Peking University
(2) Computer Science Department, University of California, Los Angeles
(3) PKU/UCLA Joint Research Institute in Science and Engineering
FPGA'15, Feb. 2015

33 Convolutional Layer Code
[Figure: N input feature maps, M output feature maps of size R x C, K x K filters.]

for (row = 0; row < R; row++) {
  for (col = 0; col < C; col++) {          // loop over output pixels
    for (to = 0; to < M; to++) {           // loop over output feature maps
      for (ti = 0; ti < N; ti++) {         // loop over input feature maps
        for (i = 0; i < K; i++) {
          for (j = 0; j < K; j++) {        // loop over filter pixels
            output_fm[to][row][col] +=
              weights[to][ti][i][j] * input_fm[ti][s*row+i][s*col+j];
}}}}}}

34 Challenges for FPGA
We can't just unroll all the loops due to limited FPGA resources; we must choose the right pragmas and code transformations. [Figure: compute PEs, interconnect, on-chip memory buffers, off-chip bus to external memory.]
- Limited compute and routing resources. Solution: loop pipelining.
- Limited on-chip storage. Solution: loop tiling.
- Limited off-chip bandwidth. Solution: data reuse, compute-memory balance.

35 Loop Tiling
The untiled loop nest again, for reference:

for (row = 0; row < R; row++) {
  for (col = 0; col < C; col++) {          // loop over pixels in an output map
    for (to = 0; to < M; to++) {
      for (ti = 0; ti < N; ti++) {
        for (i = 0; i < K; i++) {
          for (j = 0; j < K; j++) {
            output_fm[to][row][col] +=
              weights[to][ti][i][j] * input_fm[ti][s*row+i][s*col+j];
}}}}}}

36 Loop Tiling
Tiling the row and col loops with tile factors Tr and Tc:

for (row = 0; row < R; row += Tr) {
  for (col = 0; col < C; col += Tc) {      // loop over tiles in the output map
    for (to = 0; to < M; to++) {
      for (ti = 0; ti < N; ti++) {
        for (trr = row; trr < min(row+Tr, R); trr++) {    // loops over pixels
          for (tcc = col; tcc < min(col+Tc, C); tcc++) {  // in each tile
            for (i = 0; i < K; i++) {
              for (j = 0; j < K; j++) {
                output_fm[to][trr][tcc] +=
                  weights[to][ti][i][j] * input_fm[ti][s*trr+i][s*tcc+j];
}}}}}}}}

37 Code with Loop Tiling
All four outer loops tiled; the host (software) moves tiles while the FPGA computes on them:

for (row = 0; row < R; row += Tr) {
  for (col = 0; col < C; col += Tc) {
    for (to = 0; to < M; to += Tm) {
      for (ti = 0; ti < N; ti += Tn) {
        // software: write output feature map
        // software: write input feature map + filters
        // (software portion: maximize memory reuse)
        for (trr = row; trr < min(row+Tr, R); trr++) {
          for (tcc = col; tcc < min(col+Tc, C); tcc++) {
            for (too = to; too < min(to+Tm, M); too++) {
              for (tii = ti; tii < min(ti+Tn, N); tii++) {
                // (FPGA portion: maximize computational performance)
                for (i = 0; i < K; i++) {
                  for (j = 0; j < K; j++) {
                    output_fm[too][trr][tcc] +=
                      weights[too][tii][i][j] * input_fm[tii][s*trr+i][s*tcc+j];
}}}}}}
        // software: read output feature map
}}}}

38 Optimizing for On-Chip Performance
Reorder the two filter loops (i, j) to the top level, then pipeline the tile pixel loop and unroll the too / tii loops:

for (i = 0; i < K; i++) {
  for (j = 0; j < K; j++) {
    for (trr = row; trr < min(row+Tr, R); trr++) {
      for (tcc = col; tcc < min(col+Tc, C); tcc++) {
#pragma HLS pipeline
        for (too = to; too < min(to+Tm, M); too++) {
#pragma HLS unroll
          for (tii = ti; tii < min(ti+Tn, N); tii++) {
#pragma HLS unroll
            output_fm[too][trr][tcc] +=
              weights[too][tii][i][j] * input_fm[tii][s*trr+i][s*tcc+j];
}}}}}}

[Figure: the generated hardware, with Tn multipliers feeding Tm accumulators.]
- Performance is determined by the tile factors Tm and Tn.
- The amount of data transfer is determined by Tr and Tc.

39 Design Space Complexity
Challenge: the number of available optimizations presents a huge space of possible designs. Optimal loop order? Tile size for each loop?
Implementing and testing each design by hand would be slow and error-prone, and some designs exceed the on-chip resources.
Solution: performance modeling + automated design space exploration.

40 Performance Modeling
We calculate the following design metrics (ignoring resource constraints for now):
- Total number of operations (FLOP): depends on the CNN model parameters.
- Total external memory access (bytes): depends on the CNN weight and activation size.
- Total execution time (s): depends on the hardware architecture (e.g. tile factors Tm and Tn).

Execution Time = Number of Cycles x Clock Period
Number of Cycles ≈ (M / Tm) x (N / Tn) x R x C x K^2

41 Performance Modeling
From these metrics we derive:
- Computational Throughput (GFLOP/s) = Total number of operations / Total execution time
- Computation-to-Communication (CTC) Ratio (FLOP / byte accessed) = Total number of operations / Total external memory access
- Required Bandwidth (GByte/s) = Computational Throughput / CTC Ratio
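
Translated directly into code, the three formulas look like this (a sketch; the totals here are placeholder values, and in the real flow they come from the model on the previous slide):

#include <stdio.h>

int main(void) {
    double total_ops   = 8.4e7;    /* FLOP, placeholder */
    double total_bytes = 8.4e6;    /* external memory traffic in bytes, placeholder */
    double exec_time_s = 5.5e-4;   /* seconds, placeholder */

    double throughput  = total_ops / exec_time_s;   /* FLOP/s */
    double ctc         = total_ops / total_bytes;   /* FLOP per byte accessed */
    double required_bw = throughput / ctc;          /* bytes/s */

    printf("throughput  = %.2f GFLOP/s\n", throughput / 1e9);
    printf("CTC ratio   = %.1f FLOP/byte\n", ctc);
    printf("required BW = %.2f GByte/s\n", required_bw / 1e9);
    return 0;
}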

42 Roofline Method
[Figure: a design plotted as a point, with Computational Throughput (GFLOP/s) on the y-axis and CTC Ratio (FLOP/byte) on the x-axis; the slope from the origin to the point is its Required Bandwidth (GByte/s).]

43 Roofline Method
[Figure: the roofline plot. The computational roof is the maximum throughput of the on-chip resources; the bandwidth roof is set by the device memory bandwidth. A design above the bandwidth roof is bandwidth-bottlenecked: its real throughput falls below its theoretical throughput. A design above the computational roof exceeds the device.]

44 Design Space Exploration
[Figure: candidate designs plotted against the bandwidth roof and the computational roof; two attainable designs, A and B, are highlighted.]
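
The exploration itself can be as simple as enumerating tile factors, scoring each design with the cycle model from slide 40, and clipping by the rooflines. A toy sketch: the bandwidth roof and the constant CTC value are placeholders (the paper models CTC as a function of the tile factors), while the 448-PE budget is taken from the results slide:

#include <stdio.h>

static long ceil_div(long a, long b) { return (a + b - 1) / b; }

int main(void) {
    long M = 48, N = 48, R = 27, C = 27, K = 5;    /* placeholder layer shape */
    double clock_hz = 100e6;                       /* 100 MHz, as in the evaluation */
    double bw_roof  = 4.5e9;                       /* bytes/s, placeholder */
    double best = 0.0; long best_tm = 0, best_tn = 0;

    for (long tm = 1; tm <= M; tm++) {
        for (long tn = 1; tn <= N; tn++) {
            if (tm * tn > 448) continue;           /* PE budget = computational roof */
            double ops = 2.0 * M * N * R * C * K * K;        /* multiply + add per MAC */
            long cycles = ceil_div(M, tm) * ceil_div(N, tn) * R * C * K * K;
            double tput = ops / (cycles / clock_hz);         /* FLOP/s */
            double ctc = 10.0;                     /* FLOP/byte, placeholder */
            if (tput > bw_roof * ctc) tput = bw_roof * ctc;  /* bandwidth roof */
            if (tput > best) { best = tput; best_tm = tm; best_tn = tn; }
        }
    }
    printf("best design: Tm=%ld Tn=%ld at %.1f GFLOP/s\n", best_tm, best_tn, best / 1e9);
    return 0;
}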

45 Experimental Evaluation
The accelerator design is implemented with Vivado HLS on the VC707 board, which contains a Virtex-7 FPGA.

Hardware resource utilization:
  Resource     DSP   BRAM   LUT   FF
  Utilization  80%   50%    61%   34%

Comparison of theoretical and real performance (Layer 2):
  Theoretical   Measured   Difference
  5.11 ms       5.25 ms    ~5%

46 Experimental Results
[Figure: energy and speedup bar charts comparing 1 thread -O3, 16 threads -O3, and the FPGA design.]
- CPU: Xeon E (32nm), 16 cores, 2.2 GHz, gcc -O3 with OpenMP 3.0
- FPGA: Virtex7-485t (28nm), 448 PEs, 100 MHz, Vivado + Vivado HLS

47 FIN
