CS 4510/9010 Applied Machine Learning: Neural Nets. Paula Matuszek, Fall 2016. Copyright Paula Matuszek 2016.


CS 4510/9010 Applied Machine Learning: Neural Nets. Paula Matuszek, Fall 2016.

Neural Nets, the very short version. A neural net consists of layers of nodes, or neurons, each of which has an activation level. Nodes in each layer receive inputs from the previous layer; these are combined according to a set of weights. If the activation level is reached, the node fires and sends inputs to the next layer. The initial layer is the data from the cases; the final layer is the expected outcomes. Learning is accomplished by modifying the weights to reduce the prediction error.

Connectionist Systems. A neural net is an example of a connectionist system; we are looking at the connections among the neurons. Neurons are also known as perceptrons; the Weka book calls these MultiLayer Perceptron systems. The origin of NN systems is modeling human neurons. A recent research topic is deep learning systems, which are layered NNs in which earlier NNs provide the inputs to later ones. They are being explored as an approach to modeling a richer, more trainable knowledge space or model.

How the Human Brain Learns. In the human brain, a typical neuron collects signals from others through a host of fine structures called dendrites. The neuron sends out spikes of electrical activity through a long, thin strand known as an axon, which splits into thousands of branches. At the end of each branch, a structure called a synapse converts the activity from the axon into electrical effects that inhibit or excite activity in the connected neurons. https://eclass.teicrete.gr/modules/document/file.php/tpold101/Neural%20Networks/NeuralNets_ch1-2_intro_Eng.ppt

A Typical Neuron. ANNs incorporate the two fundamental components of biological neural nets: neurons become nodes, and synapses become weights. [Diagram: inputs P1, P2, P3, each with an associated weight w1, w2, w3, feed a summation Σ and an activation function f that produce the output.] Each neuron within the network is usually a simple processing unit which takes one or more inputs and produces an output. At each neuron, every input has an associated weight which modifies the strength of that input. The neuron simply adds together all the weighted inputs and calculates an output to be passed on. https://eclass.teicrete.gr/modules/document/file.php/tpold101/Neural%20Networks/NeuralNets_ch1-2_intro_Eng.ppt
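A neuron like this is easy to sketch in code. The following is a minimal illustration, not anything taken from the slides: the inputs, weights, and step threshold are made-up values, and the simple step function stands in for whatever activation function f a given network uses.

```python
# Minimal sketch of one artificial neuron: weighted sum of inputs, then activation.
# All values here are illustrative.
inputs  = [0.5, 0.3, 0.9]     # P1, P2, P3
weights = [0.4, -0.2, 0.7]    # w1, w2, w3

def step_activation(x, threshold=0.5):
    """Fire (output 1) only if the combined input reaches the threshold."""
    return 1 if x >= threshold else 0

weighted_sum = sum(p * w for p, w in zip(inputs, weights))
print(step_activation(weighted_sum))
```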

A Typical Neural Network. Neural computing requires a number of neurons to be connected together into a neural network. Neurons are arranged in layers. [Diagram: input nodes connected by weights to hidden and output nodes.] There are always an input layer and an output layer; there may also be one or more hidden layers. https://eclass.teicrete.gr/modules/document/file.php/tpold101/Neural%20Networks/NeuralNets_ch1-2_intro_Eng.ppt

Network Layers. Input layer: the activity of the input units represents the raw information that is fed into the network. Hidden layer: the activity of each hidden unit is determined by the activities of the input units and the weights on the connections between the input and the hidden units. Output layer: the behavior of the output units depends on the activity of the hidden units and the weights between the hidden and output units. The weights between the input and hidden units determine when each hidden unit is active, and so by modifying these weights a hidden unit can choose what it represents. Each layer can have a different number of nodes. www.d.umn.edu/~alam0026/neuralnetwork.ppt
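To make the layer-by-layer picture concrete, here is a small hand-written forward pass for a fully connected network with one hidden layer. The weight values and the choice of a sigmoid activation are illustrative assumptions, not values from the lecture.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weight_matrix):
    """Each row of weight_matrix holds the incoming weights of one unit in the layer."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weight_matrix]

# Illustrative weights: 3 input nodes -> 2 hidden nodes -> 1 output node.
hidden_weights = [[0.2, -0.5, 0.8],
                  [0.7,  0.1, -0.3]]
output_weights = [[0.6, -0.4]]

x = [1.0, 0.0, 1.0]                        # one input case
hidden = layer_forward(x, hidden_weights)  # hidden layer activities
output = layer_forward(hidden, output_weights)
print(hidden, output)
```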

Training Basics. So now we have a network; how do we learn with it? The most basic method of training a neural network is trial and error: set the initial weights randomly; if the network isn't matching the outputs in the training instances, change the weighting of a random link by a random amount; if the accuracy of the network declines, undo the change and make a different one. Time-consuming, but it does learn. www.d.umn.edu/~alam0026/neuralnetwork.ppt
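As a rough sketch of that trial-and-error loop (assuming a single linear neuron, made-up training data, and a squared-error measure, none of which come from the slides):

```python
import random

def total_error(weights, cases):
    """Sum of squared errors of a one-neuron net over the training cases."""
    err = 0.0
    for inputs, target in cases:
        prediction = sum(w * x for w, x in zip(weights, inputs))
        err += (target - prediction) ** 2
    return err

# Made-up training instances: three binary inputs and a numeric target.
cases = [([1, 0, 1], 1.0), ([0, 1, 0], 0.0), ([1, 1, 1], 1.0)]
weights = [random.uniform(-1, 1) for _ in range(3)]    # set initial weights randomly

for _ in range(1000):
    before = total_error(weights, cases)
    i = random.randrange(len(weights))                 # pick a random link
    old = weights[i]
    weights[i] += random.uniform(-0.1, 0.1)            # change it by a random amount
    if total_error(weights, cases) > before:           # accuracy declined?
        weights[i] = old                               # undo and try something else
```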

Training, Better. The typical method of modifying the weights is backpropagation: success or failure at the output node is propagated back through the nodes which contributed to that output. Backprop consists of the repeated application of the following two passes. Forward pass: the network is activated on one example and the error of (each neuron of) the output layer is computed. Backward pass: the network error is used for updating the weights; starting at the output layer, the error is propagated backwards through the network, layer by layer. www.d.umn.edu/~alam0026/neuralnetwork.ppt

Back Propagation. The back-propagation training algorithm alternates a forward step (network activation) with a backward step (error propagation). Backprop adjusts the weights of the NN in order to minimize the network's total mean squared error. www.d.umn.edu/~alam0026/neuralnetwork.ppt
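The sketch below shows the two passes for the simplest possible case, a single sigmoid output neuron trained on made-up data with a squared-error criterion; in a real multilayer network the backward pass also propagates the error signal through the hidden layers. The learning rate, number of epochs, and initial weights are illustrative assumptions.

```python
import math, random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Made-up training data: learn OR of two binary inputs (third input is a fixed bias of 1).
cases = [([0, 0, 1], 0), ([0, 1, 1], 1), ([1, 0, 1], 1), ([1, 1, 1], 1)]
weights = [random.uniform(-1, 1) for _ in range(3)]
learning_rate = 0.5

for epoch in range(5000):
    for inputs, target in cases:
        # Forward pass: activate the network on one example.
        output = sigmoid(sum(w * x for w, x in zip(weights, inputs)))
        # Backward pass: error times the derivative of the sigmoid gives the update signal.
        delta = (target - output) * output * (1 - output)
        # Each weight moves in proportion to its input and that error signal.
        weights = [w + learning_rate * delta * x for w, x in zip(weights, inputs)]

# After training, the outputs should be close to the targets 0, 1, 1, 1.
print([round(sigmoid(sum(w * x for w, x in zip(weights, i))), 2) for i, _ in cases])
```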

More. Choosing the number of hidden nodes and layers is complicated: too many means overfitting. Typical practice is to try several and evaluate. More than about two hidden layers has generally not been useful in practice, because of the vanishing gradient problem. The number of connections can also be tweaked; we have been showing fully-connected networks, and there is no really good algorithm for this either. These are feed-forward networks; there are no loops or cycles.
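"Try several and evaluate" can be as simple as training on the same data with different numbers of hidden nodes and comparing held-out accuracy. A sketch assuming scikit-learn is available, with synthetic data and arbitrary candidate sizes:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic data standing in for a real training set.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for hidden_nodes in (2, 5, 10, 20):                  # candidate hidden-layer sizes
    net = MLPClassifier(hidden_layer_sizes=(hidden_nodes,), max_iter=2000, random_state=0)
    net.fit(X_train, y_train)
    print(hidden_nodes, net.score(X_test, y_test))   # accuracy on held-out cases
```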

Some NN Advantages and Disadvantages. Advantages: can learn complex patterns; works well for large problems involving pattern recognition; good for multiple classes; relatively insensitive to irrelevant attributes. Disadvantages: can be very slow; needs a lot of examples to work well; very black box; a lot of heuristics, so results are not identical every time.

Example from AISpace. We are using the NN applet at aispace.org. Find the sample file (Mail) and load it, set the properties, initialize the parameters, and solve. How do we use it? Calculate the output.

Example: Which class to take? Inputs? Outputs? Sample data.

Some Examples. Example 1: 3 inputs, 1 output, all binary. Example 2: same inputs, output inverted.

Getting the right inputs. Example 3: same inputs as 1 and 2, same output as 1, but outcomes reversed for half the cases.

Getting the right inputs. Example 3 again: same inputs as 1 and 2, same output as 1, outcomes reversed for half the cases. The network is not converging; the output here cannot be predicted from these inputs. Whatever is determining whether to take the class, we haven't captured it.

Unordered values. Example 4: the input variables here include the professor. This is non-numeric and can't be ordered, but we still need numeric values. The solution is to treat n possible values as n separate binary values; the applet does this for us.
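This is the usual one-hot encoding trick. The applet handles it automatically; done by hand it might look like the following sketch, with made-up professor names:

```python
# One-hot encoding: one binary indicator per possible value of an unordered attribute.
professors = ["Smith", "Jones", "Lee"]        # hypothetical values for the professor attribute

def one_hot(value, possible_values):
    return [1 if value == v else 0 for v in possible_values]

print(one_hot("Jones", professors))           # [0, 1, 0]
```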

Variables with more values. Example 5: GPA and number of classes taken are integer values. This takes considerably longer to solve; it looks for a while like it's not converging, and then it gets it.

And Reals. Example 6: GPA is a real. Examples 5 and 6, without the "is it a prerequisite" attribute, and with interval data, depend more on the number of hidden nodes.

And multiple outputs. The small Car database from AIspace: for any given input case, you will get a value for each possible outcome. This is typical for, for instance, character recognition.
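The usual decision rule with multiple outputs is to pick the outcome whose output node has the largest value. A tiny illustration with made-up labels and output values:

```python
# One network output per possible outcome; report the outcome with the highest value.
outcomes = ["unacceptable", "acceptable", "good", "very good"]   # illustrative labels
outputs  = [0.05, 0.72, 0.18, 0.05]                              # made-up network outputs

best = max(range(len(outcomes)), key=lambda i: outputs[i])
print(outcomes[best])   # "acceptable"
```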

Training and Test Cases. The basic training approach will fit the training data as closely as possible, but we really want something that will generalize to other cases. This is why we have test cases: the training cases are used to compute the weights, and the test cases tell us how well they generalize. Both training and test cases should represent the overall population as well as possible.
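One common way to hold cases out, sketched here with scikit-learn on placeholder data; the split proportion and the stratify option are assumptions, not something prescribed in the slides:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Placeholder data standing in for the real cases.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Stratifying on y keeps the class mix of the overall population in both sets.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, stratify=y, random_state=0)
# Weights are computed from (X_train, y_train); (X_test, y_test) is used only
# to check how well the trained network generalizes.
```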

So: as for any classifier, getting a good NN involves understanding your domain and capturing knowledge about it, choosing the right inputs and outputs, and choosing representative training and test sets. You can represent any kind of variable: numeric or not, ordered or not; non-binary attributes become multiple yes-no attributes. Not every set of variables and training cases will produce a net that can be trained.

Once it's trained... When your NN is trained, you can feed it a specific set of inputs and get one or more outputs. These outputs are typically interpreted as some decision: "take the class," "this is probably a 5," "this car is most likely acceptable." The network itself is a black box. If the situation changes (new variables, new values for some variables, new patterns of cases), the NN should be retrained.

One last note. These have all been simple cases, shown as examples. Most of my examples could in fact be predicted much more easily and cleanly with a decision tree, or even a couple of IF statements. A more typical use for any connectionist system has many more inputs and many more training cases.