Neural Networks. Robot Image Credit: Viktoriya Sukhanova 123RF.com
Transcription
1 Neural Networks These slides were assembled by Eric Eaton, with grateful acknowledgement of the many others who made their course materials freely available online. Feel free to reuse or adapt these slides for your own academic purposes, provided that you include proper attribution. Please send comments and corrections to Eric. Robot Image Credit: Viktoriya Sukhanova 123RF.com
2 Neural Function Brain function (thought) occurs as the result of the firing of neurons. Neurons connect to each other through synapses, which propagate action potential (electrical impulses) by releasing neurotransmitters. Synapses can be excitatory (potential-increasing) or inhibitory (potential-decreasing), and have varying activation thresholds. Learning occurs as a result of the synapses' plasticity: they exhibit long-term changes in connection strength. There are about 10^11 neurons and about 10^14 synapses in the human brain! Based on slide by T. Finin, M. desJardins, L. Getoor, R. Parr
3 Biology of a Neuron [figure: anatomy of a biological neuron]
4 Brain Structure Different areas of the brain have different functions. Some areas seem to have the same function in all humans (e.g., Broca's region for motor speech); the overall layout is generally consistent. Some areas are more plastic, and vary in their function; also, the lower-level structure and function vary greatly. We don't know how different functions are assigned or acquired: partly the result of the physical layout / connection to inputs (sensors) and outputs (effectors), and partly the result of experience (learning). We really don't understand how this neural structure leads to what we perceive as consciousness or thought. Based on slide by T. Finin, M. desJardins, L. Getoor, R. Parr
5 The One Learning Algorithm Hypothesis [figures: Auditory Cortex, Somatosensory Cortex] Auditory cortex learns to see [Roe et al., 1992]. Somatosensory cortex learns to see [Metin & Frost, 1989]. Based on slide by Andrew Ng
6 Sensor Representations in the Brain Seeing with your tongue. Human echolocation (sonar). Haptic belt: direction sense. Implanting a 3rd eye. [BrainPort; Welsh & Blasch, 1997; Nagel et al., 2005; Constantine-Paton & Law, 2009] Slide by Andrew Ng
7 Comparison of computing power
INFORMATION CIRCA 2012    Computer                          Human Brain
Computation Units         10-core Xeon: 10^9 gates          10^11 neurons
Storage Units             10^9 bits RAM, 10^12 bits disk    10^11 neurons, 10^14 synapses
Cycle time                10^-9 sec                         10^-3 sec
Bandwidth                 10^9 bits/sec                     10^14 bits/sec
Computers are way faster than neurons, but there are a lot more neurons than we can reasonably model in modern digital computers, and they all fire in parallel. Neural networks are designed to be massively parallel. The brain is effectively a billion times faster.
8 Neural Networks Origins: algorithms that try to mimic the brain. Very widely used in the 80s and early 90s; popularity diminished in the late 90s. Recent resurgence: state-of-the-art technique for many applications. Artificial neural networks are not nearly as complex or intricate as the actual brain structure. Based on slide by Andrew Ng
9 Neural networks [figure: layered feed-forward network with input units, hidden units, and output units] Neural networks are made up of nodes or units, connected by links. Each link has an associated weight and activation level. Each node has an input function (typically summing over weighted inputs), an activation function, and an output. Based on slide by T. Finin, M. desJardins, L. Getoor, R. Parr
10 Neuron Model: Logistic Unit [figure: inputs x_1, x_2, x_3 plus bias unit x_0 = 1 feeding a single sigmoid unit]
x = [x_0; x_1; x_2; x_3], θ = [θ_0; θ_1; θ_2; θ_3], with x_0 = 1 (bias unit)
h_θ(x) = g(θ^T x) = 1 / (1 + e^(-θ^T x))
Sigmoid (logistic) activation function: g(z) = 1 / (1 + e^(-z))
Based on slide by Andrew Ng
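As a quick sanity check, here is a minimal NumPy sketch of this logistic unit (the weight and input values below are arbitrary placeholders, not from the slides):

```python
import numpy as np

def sigmoid(z):
    """Logistic activation: g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

def logistic_unit(x, theta):
    """h_theta(x) = g(theta^T x); x already includes the bias entry x0 = 1."""
    return sigmoid(theta @ x)

# Example: 3 input features plus the bias unit x0 = 1.
x = np.array([1.0, 0.5, -1.2, 3.0])        # [x0, x1, x2, x3]
theta = np.array([-1.0, 2.0, 0.5, 1.0])    # placeholder weights
print(logistic_unit(x, theta))
```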
11 Neural Network [figure: three-layer network with bias units; input x feeds a^(2), which feeds h_Θ(x)] Layer 1 (Input Layer), Layer 2 (Hidden Layer), Layer 3 (Output Layer). Slide by Andrew Ng
12 Feed-Forward Process Input layer units are set by some exterior function (think of these as sensors), which causes their output links to be activated at the specified level. Working forward through the network, the input function of each unit is applied to compute the input value: usually this is just the weighted sum of the activations on the links feeding into this node. The activation function transforms this input function into a final value: typically this is a nonlinear function, often a sigmoid function corresponding to the threshold of that node. Based on slide by T. Finin, M. desJardins, L. Getoor, R. Parr
13 Neural Network a_i^(j) = activation of unit i in layer j. Θ^(j) = weight matrix controlling the function mapping from layer j to layer j+1. If the network has s_j units in layer j and s_{j+1} units in layer j+1, then Θ^(j) has dimension s_{j+1} × (s_j + 1). Here, Θ^(1) ∈ R^{3×4} and Θ^(2) ∈ R^{1×4}. Slide by Andrew Ng
14 Vectorization
a_1^(2) = g(z_1^(2)) = g(Θ_10^(1) x_0 + Θ_11^(1) x_1 + Θ_12^(1) x_2 + Θ_13^(1) x_3)
a_2^(2) = g(z_2^(2)) = g(Θ_20^(1) x_0 + Θ_21^(1) x_1 + Θ_22^(1) x_2 + Θ_23^(1) x_3)
a_3^(2) = g(z_3^(2)) = g(Θ_30^(1) x_0 + Θ_31^(1) x_1 + Θ_32^(1) x_2 + Θ_33^(1) x_3)
h_Θ(x) = g(Θ_10^(2) a_0^(2) + Θ_11^(2) a_1^(2) + Θ_12^(2) a_2^(2) + Θ_13^(2) a_3^(2)) = g(z^(3))
Feed-Forward Steps:
z^(2) = Θ^(1) x
a^(2) = g(z^(2))
Add a_0^(2) = 1
z^(3) = Θ^(2) a^(2)
h_Θ(x) = a^(3) = g(z^(3))
Based on slide by Andrew Ng
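A minimal NumPy sketch of these feed-forward steps for one hidden layer (layer sizes and weight values here are arbitrary placeholders):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, Theta1, Theta2):
    """Feed-forward for one hidden layer.
    Theta1: (s2 x (d+1)), Theta2: (K x (s2+1)), x: (d,)."""
    a1 = np.concatenate(([1.0], x))             # add bias unit a0^(1) = 1
    z2 = Theta1 @ a1
    a2 = np.concatenate(([1.0], sigmoid(z2)))   # add bias unit a0^(2) = 1
    z3 = Theta2 @ a2
    return sigmoid(z3)                          # h_Theta(x) = a^(3)

# Example with 3 inputs, 3 hidden units, 1 output (random placeholder weights).
rng = np.random.default_rng(0)
Theta1 = rng.normal(size=(3, 4))
Theta2 = rng.normal(size=(1, 4))
print(forward(np.array([0.2, -0.4, 1.0]), Theta1, Theta2))
```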
15 Other Network Architectures s = [3, 3, 2, 1] [figure: Layer 1, Layer 2, Layer 3, Layer 4] L denotes the number of layers. s ∈ N^L contains the number of nodes at each layer (not counting bias units). Typically, s_0 = d (# input features) and s_{L-1} = K (# classes).
16 Multiple Output Units: One-vs-Rest [figure: network with four output units for Pedestrian, Car, Motorcycle, Truck] h_Θ(x) ∈ R^K. We want: h_Θ(x) ≈ [1; 0; 0; 0] when pedestrian, [0; 1; 0; 0] when car, [0; 0; 1; 0] when motorcycle, [0; 0; 0; 1] when truck. Slide by Andrew Ng
17 Multiple Output Units: One-vs-Rest Given {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)}, we must convert the labels to a 1-of-K representation: e.g., y_i = [0; 0; 1; 0] when motorcycle, y_i = [0; 1; 0; 0] when car, etc. Based on slide by Andrew Ng
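A small NumPy sketch of this 1-of-K conversion (the class ordering pedestrian/car/motorcycle/truck is just an assumed example):

```python
import numpy as np

def one_of_k(labels, K):
    """Convert integer class labels (0..K-1) to 1-of-K target vectors."""
    Y = np.zeros((len(labels), K))
    Y[np.arange(len(labels)), labels] = 1.0
    return Y

# Classes: 0=pedestrian, 1=car, 2=motorcycle, 3=truck (assumed ordering).
print(one_of_k([2, 1], K=4))
# [[0. 0. 1. 0.]   <- motorcycle
#  [0. 1. 0. 0.]]  <- car
```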
18 Neural Network Classification Given: {(x_1, y_1), (x_2, y_2), ..., (x_n, y_n)}. s ∈ N^L contains the # of nodes at each layer; s_0 = d (# features). Binary classification: y = 0 or 1, 1 output unit (s_{L-1} = 1). Multi-class classification (K classes): y ∈ R^K, e.g., [1;0;0;0], [0;1;0;0], [0;0;1;0], [0;0;0;1] for pedestrian, car, motorcycle, truck; K output units (s_{L-1} = K). Slide by Andrew Ng
19 Understanding Representations
20 Representing Boolean Functions Simple example: AND. [figure: logistic/sigmoid function g(z)]
h_Θ(x) = g(-30 + 20·x_1 + 20·x_2)
x_1  x_2  h_Θ(x)
0    0    g(-30) ≈ 0
0    1    g(-10) ≈ 0
1    0    g(-10) ≈ 0
1    1    g(10) ≈ 1
Based on slide and example by Andrew Ng
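The truth table above can be verified directly; a short NumPy sketch using the slide's weights (-30, +20, +20):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Weights from the slide: theta = [-30, 20, 20] computes x1 AND x2.
theta = np.array([-30.0, 20.0, 20.0])

for x1 in (0, 1):
    for x2 in (0, 1):
        h = sigmoid(theta @ np.array([1.0, x1, x2]))  # bias unit x0 = 1
        print(x1, x2, round(float(h), 4))
# (0,0)->g(-30)~0, (0,1)->g(-10)~0, (1,0)->g(-10)~0, (1,1)->g(10)~1
```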
21 Representing Boolean Functions AND: h_Θ(x) = g(-30 + 20·x_1 + 20·x_2). OR: h_Θ(x) = g(-10 + 20·x_1 + 20·x_2). NOT: h_Θ(x) = g(10 - 20·x_1). (NOT x_1) AND (NOT x_2): h_Θ(x) = g(10 - 20·x_1 - 20·x_2).
22 Combining Representations to Create Non-Linear Functions Combining (NOT x_1) AND (NOT x_2) with AND and OR yields NOT(XOR): the first layer computes a_1 = x_1 AND x_2 (weights -30, +20, +20) and a_2 = (NOT x_1) AND (NOT x_2) (weights +10, -20, -20); the output layer computes a_1 OR a_2 (weights -10, +20, +20). [figure: positive examples lie in quadrants I and III, so the output fires when the input is in I or III, i.e., when x_1 = x_2] Based on example by Andrew Ng
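A sketch of the resulting two-layer XNOR network, using the Boolean-unit weights from the previous slide:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Layer 1: a1 = x1 AND x2, a2 = (NOT x1) AND (NOT x2); layer 2: OR.
Theta1 = np.array([[-30.0,  20.0,  20.0],    # AND
                   [ 10.0, -20.0, -20.0]])   # (NOT x1) AND (NOT x2)
Theta2 = np.array([-10.0, 20.0, 20.0])       # OR

def xnor(x1, x2):
    a = sigmoid(Theta1 @ np.array([1.0, x1, x2]))
    return sigmoid(Theta2 @ np.concatenate(([1.0], a)))

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(float(xnor(x1, x2)), 4))  # ~1 iff x1 == x2
```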
23 Layering Representations [figure: handwritten-digit images and the unrolled input vector x_1, ..., x_400] 20 × 20 pixel images, d = 400, 10 classes. Each image is unrolled into a vector x of pixel intensities.
24 Layering Representations [figure: Input Layer x_1, ..., x_d → Hidden Layer → Output Layer (digit classes 0-9); visualization of the hidden layer]
25 LeNet-5 Demonstration:
26 Neural Network Learning
27 Perceptron Learning Rule θ ← θ + α (y - h(x)) x. Equivalent to the intuitive rules: if the output is correct, don't change the weights; if the output is low (h(x) = 0, y = 1), increment the weights for all inputs which are 1; if the output is high (h(x) = 1, y = 0), decrement the weights for all inputs which are 1. Perceptron Convergence Theorem: if there is a set of weights that is consistent with the training data (i.e., the data is linearly separable), the perceptron learning algorithm will converge [Minsky & Papert, 1969].
28 Batch Perceptron
Given training data {(x^(i), y^(i))}, i = 1, ..., n
Let θ ← [0, 0, ..., 0]
Repeat:
  Let Δ ← [0, 0, ..., 0]
  for i = 1 ... n, do
    if y^(i) (x^(i) · θ) ≤ 0   // prediction for the i-th instance is incorrect
      Δ ← Δ + y^(i) x^(i)
  Δ ← Δ / n   // compute the average update
  θ ← θ + α Δ
Until ||Δ||_2 < ε
Simplest case: α = 1 and don't normalize Δ, which yields the fixed increment perceptron. Each iteration of the outer loop is called an epoch.
Based on slide by Alan Fern
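A NumPy sketch of this batch perceptron, assuming labels y ∈ {-1, +1} and a leading column of 1s in X (the stopping tolerance and epoch cap are assumed defaults, not from the slide):

```python
import numpy as np

def batch_perceptron(X, y, alpha=1.0, tol=1e-6, max_epochs=1000):
    """X: (n, d+1) with a leading column of 1s; y: (n,) labels in {-1, +1}."""
    theta = np.zeros(X.shape[1])
    for _ in range(max_epochs):          # each outer iteration is one epoch
        delta = np.zeros_like(theta)
        for xi, yi in zip(X, y):
            if yi * (xi @ theta) <= 0:   # prediction for this instance is wrong
                delta += yi * xi
        delta /= len(y)                  # average update
        theta += alpha * delta
        if np.linalg.norm(delta) < tol:
            break
    return theta
```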
29 Learning in NN: Backpropagation Similar to the perceptron learning algorithm, we cycle through our examples: if the output of the network is correct, no changes are made; if there is an error, weights are adjusted to reduce the error. The trick is to assess the blame for the error and divide it among the contributing weights. Based on slide by T. Finin, M. desJardins, L. Getoor, R. Parr
30 Cost Function
Logistic Regression:
J(θ) = -(1/n) Σ_{i=1..n} [y_i log h_θ(x_i) + (1 - y_i) log(1 - h_θ(x_i))] + (λ/2n) Σ_{j=1..d} θ_j^2
Neural Network: h_Θ ∈ R^K, (h_Θ(x))_i = i-th output
J(Θ) = -(1/n) Σ_{i=1..n} Σ_{k=1..K} [y_ik log(h_Θ(x_i))_k + (1 - y_ik) log(1 - (h_Θ(x_i))_k)] + (λ/2n) Σ_{l=1..L-1} Σ_{i=1..s_l} Σ_{j=1..s_{l+1}} (Θ_ji^(l))^2
The first term penalizes predicting "not k" when the k-th class is true; the second penalizes predicting k when the k-th class is not true.
Based on slide by Andrew Ng
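A NumPy sketch of this cost J(Θ), assuming the network outputs and 1-of-K targets are given as matrices and that bias weights sit in column 0 of each Θ matrix (so they are excluded from regularization):

```python
import numpy as np

def nn_cost(H, Y, Thetas, lam):
    """Regularized cross-entropy cost.
    H: (n, K) network outputs in (0, 1); Y: (n, K) 1-of-K targets;
    Thetas: list of weight matrices with bias weights in column 0."""
    n = Y.shape[0]
    data_term = -np.sum(Y * np.log(H) + (1 - Y) * np.log(1 - H)) / n
    reg_term = (lam / (2 * n)) * sum(np.sum(T[:, 1:] ** 2) for T in Thetas)
    return data_term + reg_term
```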
31 Optimizing the Neural Network
J(Θ) = -(1/n) Σ_{i=1..n} Σ_{k=1..K} [y_ik log(h_Θ(x_i))_k + (1 - y_ik) log(1 - (h_Θ(x_i))_k)] + (λ/2n) Σ_{l=1..L-1} Σ_{i=1..s_l} Σ_{j=1..s_{l+1}} (Θ_ji^(l))^2
Solve via gradient descent. J(Θ) is not convex, so gradient descent on a neural net yields a local optimum; but it tends to work well in practice. Need code to compute J(Θ) and the partial derivatives ∂J(Θ)/∂Θ_ij^(l).
Based on slide by Andrew Ng
32 Forward Propagation Given one labeled training instance (x, y):
a^(1) = x
z^(2) = Θ^(1) a^(1)
a^(2) = g(z^(2))   [add a_0^(2) = 1]
z^(3) = Θ^(2) a^(2)
a^(3) = g(z^(3))   [add a_0^(3) = 1]
z^(4) = Θ^(3) a^(3)
a^(4) = h_Θ(x) = g(z^(4))
Based on slide by Andrew Ng
33 Backpropagation Intuition Each hidden node j is responsible for some fraction of the error δ_j^(l) in each of the output nodes to which it connects. δ_j^(l) is divided according to the strength of the connection between the hidden node and the output node. Then, the blame is propagated back to provide the error values for the hidden layer. Based on slide by T. Finin, M. desJardins, L. Getoor, R. Parr
34 Backpropagation Intuition [figure: network with δ values at layers 2, 3, 4] δ_j^(l) = error of node j in layer l. Formally, δ_j^(l) = (∂/∂z_j^(l)) cost(x_i), where cost(x_i) = y_i log h_Θ(x_i) + (1 - y_i) log(1 - h_Θ(x_i)). Based on slide by Andrew Ng
35 Backpropagation Intuition [figure] δ_1^(4) = a_1^(4) - y. δ_j^(l) = error of node j in layer l; formally, δ_j^(l) = (∂/∂z_j^(l)) cost(x_i), where cost(x_i) = y_i log h_Θ(x_i) + (1 - y_i) log(1 - h_Θ(x_i)). Based on slide by Andrew Ng
36 Backpropagation Intuition [figure] δ_2^(3) = Θ_12^(3) δ_1^(4). (Same definition of δ_j^(l) as above.) Based on slide by Andrew Ng
37 Backpropagation Intuition [figure] δ_1^(3) = Θ_11^(3) δ_1^(4) and δ_2^(3) = Θ_12^(3) δ_1^(4). (Same definition of δ_j^(l) as above.) Based on slide by Andrew Ng
38 Backpropagation Intuition [figure] δ_2^(2) = Θ_12^(2) δ_1^(3) + Θ_22^(2) δ_2^(3). (Same definition of δ_j^(l) as above.) Based on slide by Andrew Ng
39 Backpropagation: Gradient Computation Let δ_j^(l) = error of node j in layer l (# layers L = 4).
Backpropagation:
δ^(4) = a^(4) - y
δ^(3) = (Θ^(3))^T δ^(4) .* g'(z^(3))
δ^(2) = (Θ^(2))^T δ^(3) .* g'(z^(2))
(No δ^(1).)
Here .* is the element-wise product, and g'(z^(3)) = a^(3) .* (1 - a^(3)), g'(z^(2)) = a^(2) .* (1 - a^(2)).
∂J(Θ)/∂Θ_ij^(l) = a_j^(l) δ_i^(l+1)   (ignoring λ; exact if λ = 0)
Based on slide by Andrew Ng
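A NumPy sketch of these delta computations for the four-layer case (the function and variable names are ours, not from the slides):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def deltas(x, y, Theta1, Theta2, Theta3):
    """delta^(l) for a 4-layer network (L = 4), one training instance (x, y)."""
    a1 = np.concatenate(([1.0], x))                     # a^(1), with bias
    a2 = np.concatenate(([1.0], sigmoid(Theta1 @ a1)))  # a^(2), with bias
    a3 = np.concatenate(([1.0], sigmoid(Theta2 @ a2)))  # a^(3), with bias
    a4 = sigmoid(Theta3 @ a3)                           # a^(4) = h_Theta(x)
    d4 = a4 - y
    # (Theta^T delta), dropping the bias row, times g'(z) = a .* (1 - a):
    d3 = (Theta3.T @ d4)[1:] * a3[1:] * (1 - a3[1:])
    d2 = (Theta2.T @ d3)[1:] * a2[1:] * (1 - a2[1:])
    # Gradient: dJ/dTheta(l)[i,j] = a(l)[j] * delta(l+1)[i], e.g. np.outer(d4, a3).
    return d2, d3, d4   # no delta^(1) for the input layer
```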
40 Backpropagation
Set Δ_ij^(l) = 0 for all l, i, j   (used to accumulate the gradient)
For each training instance (x_i, y_i):
  Set a^(1) = x_i
  Compute {a^(2), ..., a^(L)} via forward propagation
  Compute δ^(L) = a^(L) - y_i
  Compute errors {δ^(L-1), ..., δ^(2)}
  Compute gradients Δ_ij^(l) = Δ_ij^(l) + a_j^(l) δ_i^(l+1)
Compute the average regularized gradient:
  D_ij^(l) = (1/n) (Δ_ij^(l) + λ Θ_ij^(l))   if j ≠ 0
  D_ij^(l) = (1/n) Δ_ij^(l)                  otherwise
D^(l) is the matrix of partial derivatives of J(Θ).
Note: Δ_ij^(l) = Δ_ij^(l) + a_j^(l) δ_i^(l+1) can be vectorized as Δ^(l) = Δ^(l) + δ^(l+1) (a^(l))^T.
Based on slide by Andrew Ng
41 Training a Neural Network via Gradient Descent with Backprop
Given: training set {(x_1, y_1), ..., (x_n, y_n)}
Initialize all Θ^(l) randomly (NOT to 0!)
Loop:   // each iteration is called an epoch
  Set Δ_ij^(l) = 0 for all l, i, j   (used to accumulate the gradient)
  For each training instance (x_i, y_i):   // backpropagation
    Set a^(1) = x_i
    Compute {a^(2), ..., a^(L)} via forward propagation
    Compute δ^(L) = a^(L) - y_i
    Compute errors {δ^(L-1), ..., δ^(2)}
    Compute gradients Δ_ij^(l) = Δ_ij^(l) + a_j^(l) δ_i^(l+1)
  Compute the average regularized gradient D_ij^(l) (as on the previous slide)
  Update weights via the gradient step: Θ_ij^(l) = Θ_ij^(l) - α D_ij^(l)
Until weights converge or the max # of epochs is reached
Based on slide by Andrew Ng
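Putting the pieces together, a self-contained NumPy sketch of this training loop for a single hidden layer (L = 3); the hyperparameters are assumed defaults and may need tuning:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train(X, Y, s_hidden, alpha=0.5, lam=0.0, epochs=5000, seed=0):
    """Gradient descent with backprop, one hidden layer (L = 3).
    X: (n, d), Y: (n, K) 1-of-K targets."""
    n, d = X.shape
    K = Y.shape[1]
    rng = np.random.default_rng(seed)
    Theta1 = rng.uniform(-0.12, 0.12, (s_hidden, d + 1))  # NOT zeros!
    Theta2 = rng.uniform(-0.12, 0.12, (K, s_hidden + 1))
    for _ in range(epochs):                       # one epoch per iteration
        D1, D2 = np.zeros_like(Theta1), np.zeros_like(Theta2)
        for x, y in zip(X, Y):
            a1 = np.concatenate(([1.0], x))       # forward propagation
            a2 = np.concatenate(([1.0], sigmoid(Theta1 @ a1)))
            a3 = sigmoid(Theta2 @ a2)
            d3 = a3 - y                           # output error delta^(3)
            d2 = (Theta2.T @ d3)[1:] * a2[1:] * (1 - a2[1:])
            D1 += np.outer(d2, a1)                # accumulate gradients
            D2 += np.outer(d3, a2)
        D1 /= n
        D2 /= n
        D1[:, 1:] += (lam / n) * Theta1[:, 1:]    # regularize j != 0 only
        D2[:, 1:] += (lam / n) * Theta2[:, 1:]
        Theta1 -= alpha * D1                      # gradient step
        Theta2 -= alpha * D2
    return Theta1, Theta2

# Example: learn XOR (not linearly separable); may need alpha/epoch tuning.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
Y = np.array([[0.], [1.], [1.], [0.]])
T1, T2 = train(X, Y, s_hidden=3, alpha=2.0, epochs=5000)
A2 = sigmoid(np.hstack([np.ones((4, 1)), X]) @ T1.T)
print(sigmoid(np.hstack([np.ones((4, 1)), A2]) @ T2.T).round(2))
```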
42 Backprop Issues "Backprop is the cockroach of machine learning. It's ugly, and annoying, but you just can't get rid of it." - Geoff Hinton. Problems: black box; local minima.
43 Implementation Details
44 Random Initialization Important to randomize the initial weight matrices. We can't have uniform initial weights, as in logistic regression; otherwise, all updates will be identical and the net won't learn.
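A minimal sketch of such random initialization; the interval half-width eps = 0.12 is a common choice, not something the slide specifies:

```python
import numpy as np

def rand_init(s_in, s_out, eps=0.12):
    """Weight matrix with entries drawn uniformly from [-eps, eps].
    s_in inputs (plus a bias column), s_out outputs."""
    return np.random.uniform(-eps, eps, size=(s_out, s_in + 1))

Theta1 = rand_init(400, 25)   # e.g., 400 inputs -> 25 hidden units
Theta2 = rand_init(25, 10)    # 25 hidden units -> 10 outputs
```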
45 Implementation Details For convenience, compress all parameters into θ: unroll Θ^(1), Θ^(2), ..., Θ^(L-1) into one long vector θ. E.g., if Θ^(1) is 10 × 11, then the first 110 entries of θ contain the values in Θ^(1). Use the reshape command to recover the original matrices, e.g., Theta1 = reshape(theta[0:110], (10, 11)). Each step, check to make sure that J(θ) decreases. Implement a gradient-checking procedure to ensure that the gradient is correct...
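A NumPy sketch of the unroll/reshape round trip, using the 10 × 11 example size from above:

```python
import numpy as np

Theta1 = np.zeros((10, 11))
Theta2 = np.zeros((1, 11))

# Unroll all parameters into one long vector theta.
theta = np.concatenate([Theta1.ravel(), Theta2.ravel()])

# Recover the original matrices with reshape.
Theta1_rec = theta[:110].reshape(10, 11)
Theta2_rec = theta[110:].reshape(1, 11)
```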
46 Gradient Checking Idea: estimate the gradient numerically to verify the implementation, then turn off gradient checking.
∂J(θ)/∂θ_i ≈ [J(θ^(i+c)) - J(θ^(i-c))] / (2c)
where θ^(i+c) = [θ_1, θ_2, ..., θ_{i-1}, θ_i + c, θ_{i+1}, ...] and c ≈ 1E-4. Change ONLY the i-th entry in θ, increasing (or decreasing) it by c.
Based on slide by Andrew Ng
47 Gradient Checking θ ∈ R^m is an unrolled version of Θ^(1), Θ^(2), ...; θ = [θ_1, θ_2, θ_3, ..., θ_m].
∂J(θ)/∂θ_1 ≈ [J([θ_1 + c, θ_2, θ_3, ..., θ_m]) - J([θ_1 - c, θ_2, θ_3, ..., θ_m])] / (2c)
∂J(θ)/∂θ_2 ≈ [J([θ_1, θ_2 + c, θ_3, ..., θ_m]) - J([θ_1, θ_2 - c, θ_3, ..., θ_m])] / (2c)
...
∂J(θ)/∂θ_m ≈ [J([θ_1, θ_2, θ_3, ..., θ_m + c]) - J([θ_1, θ_2, θ_3, ..., θ_m - c])] / (2c)
Put these in a vector called gradApprox, and check that the approximate numerical gradient matches the entries in the D matrices.
Based on slide by Andrew Ng
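A NumPy sketch of this numerical estimate; J is any function mapping the unrolled parameter vector to the cost:

```python
import numpy as np

def grad_approx(J, theta, c=1e-4):
    """Two-sided numerical estimate of the gradient of J at theta."""
    approx = np.zeros_like(theta)
    for i in range(len(theta)):
        plus, minus = theta.copy(), theta.copy()
        plus[i] += c             # change ONLY the i-th entry
        minus[i] -= c
        approx[i] = (J(plus) - J(minus)) / (2 * c)
    return approx

# Check against the backprop gradient (DVec), e.g.:
# assert np.allclose(grad_approx(J, theta), DVec, atol=1e-6)
```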
48 Implementation Steps Implement backprop to compute DVec (the unrolled {D^(1), D^(2), ...} matrices). Implement numerical gradient checking to compute gradApprox. Make sure DVec has similar values to gradApprox. Then turn off gradient checking and use the backprop code for learning. Important: be sure to disable your gradient-checking code before training your classifier; if you run the numerical gradient computation on every iteration of gradient descent, your code will be very slow. Based on slide by Andrew Ng
49 Putting It All Together
50 Training a Neural Network Pick a network architecture (connectivity pattern between nodes): # input units = # of features in the dataset; # output units = # classes. Reasonable default: 1 hidden layer, or if > 1 hidden layer, have the same # of hidden units in every layer (usually the more the better). Based on slide by Andrew Ng
51 Training a Neural Network 1. Randomly initialize the weights. 2. Implement forward propagation to get h_Θ(x_i) for any instance x_i. 3. Implement code to compute the cost function J(Θ). 4. Implement backprop to compute the partial derivatives ∂J(Θ)/∂Θ_ij^(l). 5. Use gradient checking to compare the gradient computed using backpropagation vs. the numerical gradient estimate; then disable the gradient-checking code. 6. Use gradient descent with backprop to fit the network. Based on slide by Andrew Ng