Introduction to Neural Networks


What are connectionist neural networks?
Connectionism refers to a computer modeling approach to computation that is loosely based upon the architecture of the brain.
Many different models exist, but all include:
- Multiple, individual nodes or units that operate at the same time (in parallel)
- A network that connects the nodes together
- Information stored in a distributed fashion among the links that connect the nodes
- Learning that occurs through gradual changes in connection strength

History of Neural Networks (1)
Attempts to mimic the human brain date back to work in the 1930s, 1940s, and 1950s by Alan Turing, Warren McCulloch, Walter Pitts, Donald Hebb and John von Neumann
1943 McCulloch-Pitts: neuron as computing element
1948 Wiener: cybernetics
1949 Hebb: learning rule
1957 Rosenblatt at Cornell developed the Perceptron, a hardware neural net for character recognition
1959 Widrow and Hoff at Stanford developed Adaline for adaptive control of noise on telephone lines
1960 Widrow-Hoff: least mean square (LMS) algorithm

History of Neural Networks (2)
Recession
1969 Minsky-Papert: showed the limitations of the perceptron model (linear separability in perceptrons)

History of Neural Networks (3)
Revival: mathematically tied together many of the ideas from previous research
1982 Hopfield: recurrent network model
1982 Kohonen: self-organizing maps
1986 Rumelhart et al.: backpropagation, universal approximation
Since then, growth has exploded: over 80% of Fortune 500 companies have neural net R&D programs, thousands of research papers, commercial software applications

Applications of Neural Networks
Forecasting/market prediction: finance and banking
Manufacturing: quality control, fault diagnosis
Medicine: analysis of electrocardiogram data, RNA & DNA sequencing, drug development without animal testing
Pattern/image recognition: handwriting recognition, airport bomb detection
Optimization (without the Simplex method)
Control: process control, robotics

Comparison of Brains and Traditional Computers

Brain:
- Billions of neurons, trillions of synapses
- Element size: 10^-6 m
- Energy use: ~20 W
- Processing speed: ~100 Hz
- Parallel, distributed
- Fault tolerant
- Learns: yes
- Intelligent/conscious: usually

Traditional computer:
- Billions of bytes of RAM, trillions of bytes on disk
- Element size: 10^-9 m
- Energy use: 30-90 W (CPU)
- Processing speed: ~10^9 Hz
- Serial, centralized
- Generally not fault tolerant
- Learns: some
- Intelligent/conscious: generally no

Biological Inspiration
Idea: to make the computer more robust, intelligent, and able to learn, let's model our computer software (and/or hardware) after the brain.
"My brain: it's my second favorite organ." - Woody Allen, from the movie Sleeper

Neurons in the Brain
Although heterogeneous, at a low level the brain is composed of neurons
A neuron receives input from other neurons (generally thousands) through its synapses
Inputs are approximately summed
When the input exceeds a threshold, the neuron sends an electrical spike that travels from the body, down the axon, to the next neuron(s)

Biological Neuron
3 major functional units: dendrites, cell body, axon (plus the synapse, the junction between neurons)
The amount of signal passing through a neuron depends on:
- The intensity of the signal from the feeding neurons
- Their synaptic strengths
- The threshold of the receiving neuron
Hebb rule (plays a key part in learning): a synapse which repeatedly triggers the activation of a postsynaptic neuron will grow in strength; others will gradually weaken
Neurons learn by adjusting the magnitudes of their synaptic strengths
[Figure: neuron model with inputs x_1, ..., x_n, weights w_1, ..., w_n, summed input ξ, activation g(ξ), and output y]

Learning in the Brain
Brains learn by:
- Altering the strength of connections between neurons
- Creating/deleting connections
Hebb's Postulate (Hebbian learning): "When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased."
Long-Term Potentiation (LTP)
- Cellular basis for learning and memory
- LTP is the long-lasting strengthening of the connection between two nerve cells in response to stimulation
- Discovered in many regions of the cortex

Artificial Neurons (basic computational entities of an ANN)
Analogy between artificial and biological concepts (connection weights represent synapses)
In 1958 Rosenblatt introduced the mechanics of the perceptron
Input to output: y = g(Σ_i w_i x_i)
Only when the weighted sum exceeds the threshold limit will the neuron fire
Weights can enhance or inhibit the inputs
The collective behaviour of neurons is what is interesting for intelligent data processing
[Figure: unit with inputs x_1, x_2, x_3, weights w_1, w_2, w_3, and output y = g(w·x)]

Model of a Neuron

Activation Function
[Figure: step function f(a) and sigmoid function f(a), each plotted against the activation a]
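
A minimal sketch of these two activation functions in Python with NumPy; the function names and the default threshold are illustrative choices, not values from the slides:

import numpy as np

def step(a, threshold=0.0):
    # Step (threshold) activation: fires 1 when the activation exceeds the threshold, else 0
    return np.where(a > threshold, 1.0, 0.0)

def sigmoid(a):
    # Smooth, differentiable alternative to the step function; output lies in (0, 1)
    return 1.0 / (1.0 + np.exp(-a))

if __name__ == "__main__":
    a = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
    print(step(a))      # [0. 0. 0. 1. 1.]
    print(sigmoid(a))   # values between 0 and 1, equal to 0.5 at a = 0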

Perceptrons
Can be trained on a set of examples using a special learning rule (process)
Weights are changed in proportion to the difference (error) between the target output and the perceptron output for each example
Minimize the summed squared error function E = 1/2 Σ_p Σ_i (o_i(p) - t_i(p))^2 with respect to the weights
The error is a function of all the weights and forms an irregular, multidimensional, complex hypersurface with many peaks, saddle points and minima
The error is minimized by finding the set of weights that corresponds to the global minimum
This is done with the gradient descent method: weights are incrementally updated in proportion to ∂E/∂w_ij
The update reads: w_ij(t+1) = w_ij(t) + Δw_ij
The aim is to produce a true mapping for all patterns
[Figure: perceptron unit with inputs x_j, weights w_ij, summed input ξ, threshold, and outputs o_i = g(ξ)]

Perceptron Structure

Learning for Perceptron
1. Initialize w_ij with random values.
2. Repeat until w_ij(t+1) ≈ w_ij(t):
   - Pick a pattern p from the training set
   - Feed the input to the network and calculate the output
   - Update the weights according to w_ij(t+1) = w_ij(t) + Δw_ij, where Δw_ij = -η ∂E/∂w_ij
3. When no change (within some accuracy) occurs, the weights are frozen and the network is ready to use on data it has never seen
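
As a rough illustration of this procedure, here is a small Python/NumPy sketch of perceptron training with the delta rule; the learning rate, epoch count, bias convention and example data are illustrative assumptions, not taken from the slides:

import numpy as np

def train_perceptron(X, t, eta=0.1, epochs=50):
    # X: one input pattern per row; t: target outputs (0 or 1).
    # A constant 1 is prepended to each pattern so w[0] plays the role of the threshold (bias).
    X = np.hstack([np.ones((X.shape[0], 1)), X])
    w = np.random.uniform(-0.5, 0.5, X.shape[1])     # step 1: random initial weights
    for _ in range(epochs):                          # step 2: repeat (here for a fixed number of epochs)
        for x, target in zip(X, t):
            o = 1.0 if np.dot(w, x) > 0 else 0.0     # threshold unit output
            w += eta * (target - o) * x              # weight change proportional to the error
    return w

if __name__ == "__main__":
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    t_and = np.array([0, 0, 0, 1], dtype=float)
    print(train_perceptron(X, t_and))   # learned (bias, w1, w2) for the AND function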

Example: AND and OR

AND:              OR:
x1 x2 | t         x1 x2 | t
 0  0 | 0          0  0 | 0
 0  1 | 0          0  1 | 1
 1  0 | 0          1  0 | 1
 1  1 | 1          1  1 | 1

The perceptron learns these rules easily (i.e., sets appropriate weights and threshold), e.g.
w = (w0, w1, w2) = (-1.5, 1.0, 1.0) for AND and (-0.5, 1.0, 1.0) for OR, where w0 corresponds to the threshold term
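
A quick Python check, under the same bias convention assumed above (input augmented by a constant 1), that these weights reproduce the AND and OR truth tables:

import numpy as np

X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]])  # leading 1 multiplies the bias w0
w_and = np.array([-1.5, 1.0, 1.0])
w_or  = np.array([-0.5, 1.0, 1.0])

print((X @ w_and > 0).astype(int))  # [0 0 0 1]  -> AND
print((X @ w_or  > 0).astype(int))  # [0 1 1 1]  -> OR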

Problem & Solution
Perceptrons can only classify accurately when the classes are linearly separable: a linear hyperplane can place one class of objects on one side of the plane and the other class on the other side
Because of this limitation, ANN research was put on hold for about 20 years
Solution: additional (hidden) layers of neurons, the MLP architecture, which is able to solve non-linear classification problems
[Figure: a linearly separable case in the (x1, x2) plane vs. a non-linearly-separable case]

Multilayer Perceptrons (MLPs)
The learning procedure is an extension of the simple perceptron algorithm
Response function: o_i = g(Σ_j w_ij g(Σ_k w_jk x_k)), which is non-linear, so the network is able to perform non-linear mappings
Theory tells us that a neural network with at least 1 hidden layer can represent any function
A vast number of ANN types exist
[Figure: MLP with inputs x_k, hidden units h_j, weights w_jk and w_ij, and outputs o_i]
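
A minimal Python/NumPy sketch of this response function for one hidden layer with sigmoid activations; the layer sizes and random weight initialization are illustrative assumptions:

import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def mlp_forward(x, W_hidden, W_out):
    # o = g(W_out @ g(W_hidden @ x)): one layer of non-linear hidden units
    h = sigmoid(W_hidden @ x)   # hidden activations h_j = g(sum_k w_jk x_k)
    return sigmoid(W_out @ h)   # outputs o_i = g(sum_j w_ij h_j)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W_hidden = rng.normal(size=(4, 3))   # 3 inputs -> 4 hidden units
    W_out = rng.normal(size=(2, 4))      # 4 hidden units -> 2 outputs
    print(mlp_forward(np.array([0.5, -1.0, 2.0]), W_hidden, W_out))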

MLP Structure

Geometric Interpretation of Perceptron Learning

Backpropagation ANNs
The most widely used type of network
Feedforward
Supervised (learns a mapping from one data space to another using examples)
The error is propagated backwards
Versatile: used for data modelling, classification, forecasting, data and image compression, and pattern recognition

BP Learning Algorithm
Like the perceptron, it uses gradient descent to minimize the error (generalized to the case with hidden layers)
Each iteration constitutes two sweeps: a forward pass and a backward pass
To minimize the error we need ∂E/∂w_ij but also ∂E/∂w_jk (which we get using the chain rule)
Training an MLP with BP can be thought of as a walk in weight space along an energy surface, trying to find the global minimum and avoid local minima
Unlike for the perceptron, there is no guarantee that the global minimum will be reached, but in most cases the energy landscape is smooth

Backpropagation Net Structure

BP Learning Algorithm
1. Initialize w_ij and w_jk with random values.
2. Repeat until w_ij and w_jk have converged or the desired performance level is reached:
   - Pick a pattern p from the training set
   - Present the input and calculate the output
   - Update the weights according to:
     w_ij(t+1) = w_ij(t) + Δw_ij
     w_jk(t+1) = w_jk(t) + Δw_jk
     where Δw = -η ∂E/∂w (and similarly for extra hidden layers)
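
A compact Python/NumPy sketch of this loop for a single-hidden-layer MLP with sigmoid units and squared error; the network sizes, learning rate and the tiny XOR-style data set are illustrative assumptions, not from the slides:

import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_bp(X, T, n_hidden=4, eta=0.5, epochs=5000, seed=0):
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(n_hidden, X.shape[1]))  # input -> hidden weights w_jk
    W2 = rng.normal(scale=0.5, size=(T.shape[1], n_hidden))  # hidden -> output weights w_ij
    for _ in range(epochs):
        for x, t in zip(X, T):
            # forward sweep
            h = sigmoid(W1 @ x)
            o = sigmoid(W2 @ h)
            # backward sweep: deltas from the chain rule, with E = 1/2 * sum (o - t)^2
            delta_o = (o - t) * o * (1 - o)
            delta_h = (W2.T @ delta_o) * h * (1 - h)
            # gradient descent updates: w(t+1) = w(t) - eta * dE/dw
            W2 -= eta * np.outer(delta_o, h)
            W1 -= eta * np.outer(delta_h, x)
    return W1, W2

if __name__ == "__main__":
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    T = np.array([[0], [1], [1], [0]], dtype=float)   # XOR: not linearly separable
    W1, W2 = train_bp(X, T)
    for x in X:
        print(x, sigmoid(W2 @ sigmoid(W1 @ x)))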

Training
Generalization: the network's performance on a set of test patterns it has never seen before (lower than on the training set)
The training set is used to let the ANN capture the features in the data, or the mapping
The initial large drop in error is due to learning, but the subsequent slow reduction is due to:
1. Network memorization (too many training cycles used)
2. Overfitting (too many hidden nodes)
[Figure: training and testing error (e.g. SSE) vs. number of hidden nodes or training cycles; the optimum network lies where the testing error is lowest, beyond which the network learns individual training examples and loses its generalization ability]
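
One common way to pick the optimum point on such a curve is early stopping: monitor the error on a held-out test/validation set and keep the weights from the point where it was lowest. A hedged Python sketch of that idea; train_epoch and valid_error are caller-supplied callables (assumptions, not functions defined in these notes):

import copy

def train_with_early_stopping(net, train_epoch, valid_error, max_epochs=1000, patience=20):
    # train_epoch(net): runs one pass of training over the training set (supplied by the caller)
    # valid_error(net): returns the error (e.g. SSE) on a held-out test/validation set
    best_err, best_net, stall = float("inf"), copy.deepcopy(net), 0
    for _ in range(max_epochs):
        train_epoch(net)
        err = valid_error(net)
        if err < best_err:
            best_err, best_net, stall = err, copy.deepcopy(net), 0
        else:
            stall += 1
            if stall >= patience:
                break   # test error stopped improving: memorization/overfitting has set in
    return best_net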

Other Popular ANNs
Some problems can be solved using a variety of ANN types, some only via a specific one (it depends on the problem's logistics)
Hopfield networks: optimization; presented with an incomplete/noisy pattern, the network responds by retrieving the internally stored pattern it most closely resembles
Kohonen networks (self-organizing): trained in an unsupervised manner to form clusters in the data; used for pattern classification and data compression

Summary of ANN Learning
Artificial Neural Networks
- Feedforward
  - Unsupervised: Kohonen, Hebbian
  - Supervised: MLP, RBF
- Recurrent
  - Unsupervised: ART
  - Supervised: Elman, Jordan, Hopfield

Hopfield Network: Structure and Update Rule
Constraints: w_ij = w_ji, w_ii = 0, I_i, O_i ∈ {0, 1}
Update rule:
NET_j = Σ_i w_ij O_i + I_j
O_j(t+1) = 1 if NET_j > T_j
O_j(t+1) = O_j(t) if NET_j = T_j
O_j(t+1) = 0 if NET_j < T_j
[Figure: network structure]

Hopfield Network: Properties and Purpose
Properties: a recurrent network with feedback; a dynamic network
Purpose: output the stored pattern closest to the input
Application areas: associative memory, optimization

Example (1)
Problem: store two pattern vectors x1 and x2
Learning: the connection weights are obtained from the stored patterns (Hebbian outer-product rule), W = x1 x1^T + x2 x2^T
[Slide shows the two pattern vectors and the resulting weight matrix W]

Example (2)
Recall experiment 1: the ability to recover the training data
[Slide shows W x evaluated for one of the stored patterns]

Example (3)
The ability to recover incomplete data
[Slide shows W x evaluated for a corrupted input, followed by the hard-limiting activation f_h(x)]

Practical Examples and Problems
The patterns to be stored should have low similarity to one another
Network capacity: about 15% of the number of nodes
Example: for 10 patterns, at least 70 nodes are needed, requiring about 5,000 connections
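
A small Python/NumPy sketch of this kind of Hopfield-style associative recall, storing two bipolar (±1) patterns with the outer-product rule and recovering a corrupted input with a hard-limiting update; the specific patterns and the ±1 coding are illustrative assumptions:

import numpy as np

def store(patterns):
    # Hebbian outer-product rule: W = sum_p x_p x_p^T, with zero diagonal and symmetric weights
    W = sum(np.outer(p, p) for p in patterns)
    np.fill_diagonal(W, 0)
    return W

def recall(W, x, steps=10):
    # Synchronous hard-limiting update: O = sign(W O); repeat until the state settles
    for _ in range(steps):
        x = np.where(W @ x >= 0, 1, -1)
    return x

if __name__ == "__main__":
    x1 = np.array([ 1, -1,  1, -1,  1, -1])
    x2 = np.array([ 1,  1, -1, -1,  1,  1])
    W = store(np.array([x1, x2]))
    noisy = x1.copy()
    noisy[0] = -noisy[0]                  # flip one element to corrupt the pattern
    print(recall(W, noisy))               # recovers x1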

Boltzmann Machine
Simulated annealing: at temperature T, the output value is determined stochastically by the Boltzmann distribution, with a carefully designed annealing schedule
Boltzmann distribution: P(E_i) = α e^(-E_i / (βT))
Properties:
- A neural network that operates statistically, via simulated annealing and related techniques
- A network capable of global optimization
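
A minimal Python sketch of how such a stochastic unit can be simulated: the probability of the unit turning on follows a Boltzmann/sigmoid form in its energy gap, and the temperature is lowered according to an annealing schedule. The energy-gap value and the geometric cooling schedule are illustrative assumptions:

import math
import random

def stochastic_output(energy_gap, T):
    # Probability that the unit outputs 1 at temperature T (Boltzmann form)
    p_on = 1.0 / (1.0 + math.exp(-energy_gap / T))
    return 1 if random.random() < p_on else 0

def annealing_schedule(T0=10.0, alpha=0.9, steps=50):
    # Geometric cooling: T decreases gradually so the network can escape local minima early on
    T = T0
    for _ in range(steps):
        yield T
        T *= alpha

if __name__ == "__main__":
    for T in annealing_schedule():
        print(T, stochastic_output(energy_gap=1.0, T=T))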

Energy Curve
[Figure: energy curve]

Self-Organizing Map
Self-organizing map (SOM)
- Unsupervised learning
- Preserves the topology of the data
- Widely used in data visualization and topology-preserving mapping
Selection of the winner: ||x - m_c|| = min_i { ||x - m_i|| }
Weight update: m_i(t+1) = m_i(t) + α(t) n_ci(t) [x(t) - m_i(t)]
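
A brief Python/NumPy sketch of one SOM training step built directly from these two formulas; the Gaussian neighborhood, the exponential decay constants and the map size are illustrative assumptions, not values from the slides:

import numpy as np

def som_step(x, m, t, alpha0=0.5, sigma0=2.0, tau=100.0, grid=None):
    # m: (n_nodes, dim) weight vectors; grid: (n_nodes, 2) node coordinates on the map
    if grid is None:
        side = int(np.sqrt(len(m)))
        grid = np.array([[i, j] for i in range(side) for j in range(side)], dtype=float)
    c = np.argmin(np.linalg.norm(x - m, axis=1))        # winner: ||x - m_c|| = min_i ||x - m_i||
    alpha = alpha0 * np.exp(-t / tau)                   # decaying learning rate alpha(t)
    sigma = sigma0 * np.exp(-t / tau)                   # shrinking neighborhood radius
    d2 = np.sum((grid - grid[c])**2, axis=1)
    n_ci = np.exp(-d2 / (2 * sigma**2))                 # Gaussian neighborhood function n_ci(t)
    return m + alpha * n_ci[:, None] * (x - m)          # m_i(t+1) = m_i(t) + alpha(t) n_ci(t) [x(t) - m_i(t)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m = rng.random((16, 3))                             # 4x4 map, 3-dimensional inputs
    for t in range(200):
        m = som_step(rng.random(3), m, t)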

SOM Structure

SASOM
1. Start with a basic SOM (4x4 map)
2. Train the current network with Kohonen's algorithm
3. Calibrate the network using known I/O patterns to determine whether a node should be replaced with a submap of several nodes (2x2 map) or should be deleted
4. Unless every node represents a unique class, go to step 2

Learning Procedure
Input data → initialize the map as 4x4 → train with Kohonen's algorithm → structure adaptation:
- Find nodes whose hit_ratio is less than 95.0%
- Split those nodes into 2x2 submaps
- Train the split nodes with the LVQ algorithm
- Remove nodes that did not participate in learning
If the stop condition is satisfied, the map is generated; otherwise repeat the training and adaptation steps

Kohonen's Learning
Initialization: a 4x4 rectangular map, trained with Kohonen's learning algorithm
Learning:
- Winner node: ||x - m_c|| = min_i { ||x - m_i|| }
- Kohonen's learning rule: m_i(t+1) = m_i(t) + α(t) n_ci(t) [x(t) - m_i(t)]
  where n_ci(t) is the neighborhood function and α(t) is the learning rate

Dynamic Node-Splitting
Determining whether a node is to be split or not
Hit ratio: hit_ratio_i = max_j P(c_j | n_i), where i = 1, ..., M and j = 1, ..., N
Nodes with a hit ratio of less than 95.0% are split

Initial Weight of Split Nodes
C: child node
P: parent node
S: weights of the neighbors
N_c: total number of nodes that participate in weight initialization
Each child node's weight C is initialized from the parent's weight P and the neighboring weights S, averaged over the N_c participating nodes
[Figure: arrangement of parent (P) and child (C) nodes used in weight initialization]

LVQ Learning for the Modified Map
m_i(t+1) = m_i(t) + α(t) n_ci(t) h_ci(t) [x(t) - m_i(t)]
where h_ci(t) = 1 if x(t) and m_i(t) belong to the same class, and h_ci(t) = -1 if x(t) and m_i(t) belong to different classes
The neighborhood function is used to preserve the topological order
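
Relative to the SOM step sketched earlier, the only change is the class-dependent factor h_ci(t); a hedged Python fragment of just that modification (the same_class flag and the way the winner's class is determined are illustrative assumptions):

import numpy as np

def lvq_update(x, m_i, same_class, alpha, n_ci):
    # h_ci = +1 moves the node toward x (same class), -1 pushes it away (different class)
    h_ci = 1.0 if same_class else -1.0
    return m_i + alpha * n_ci * h_ci * (x - m_i)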

Homework #1
1. Explain the principle of MLP learning based on Information Geometry, and survey methodologies for improving its learning performance.
2. Survey practical tips for applying MLPs to real-world problems, organized into network structure, learning algorithm, and training-data preprocessing.