Training of Neural Networks. Q.J. Zhang, Carleton University
1 Training of Neural Networks
2 Notation:
x: input of the original modeling problem or the neural network
y: output of the original modeling problem or the neural network
w: internal weights/parameters of the neural network
m: number of outputs of the model
y = f(x, w): neural network model
d: data for y (e.g., training data)
3 Define Model Input-Output
Define the model input-output (x, y), for example:
x: physical/geometrical parameters of the component
y: S-parameters of the component
4 Data Generation:
(a) Generate (x, y) samples (x_k, y_k), k = 1, 2, ..., P, such that the finished NN best (i.e., accurately) represents the original x~y problem.
(b) Data generator:
Measurement: for each given x_k, measure the values of y_k, k = 1, 2, ..., P
Simulation: for each given x_k, use a simulator to calculate y_k, k = 1, 2, ..., P
5 Comparison of Neural Network Based Microwave Model Development Using Data from Two Types of Data Generators

Availability of Problem Theory/Equations:
Measurement data: The model can be developed even if the theory/equations are not known, or are difficult to implement in CAD.
Simulation data: The model can be developed only for problems whose theory is implemented in a simulator.

Assumptions:
Measurement data: No assumptions are involved, and the model can include all the effects, e.g., 3D full-wave effects, fringing effects, etc.
Simulation data: Often involves assumptions, and the model will be limited by the assumptions made by the simulator, e.g., 2.5D EM.

Input Parameter Sweep:
Measurement data: Data generation can be expensive or infeasible if a geometrical parameter, e.g., transistor gate length, needs to be sampled/changed.
Simulation data: Relatively easy to sweep any parameter in the simulator, because the changes are numerical, not physical/manual.
6 Comparison of Neural Network Based Microwave Model Development Using Data from Two Types of Data Generators (continued)

Sources of Small and Large/Gross Errors:
Measurement data: Equipment limitations and tolerances.
Simulation data: Accuracy limitations and non-convergence of simulations.

Feasibility of Getting Desired Output:
Measurement data: Development of models is possible for measurable responses only. For example, the drain charge of an FET may not be easy to measure.
Simulation data: Any response can be modeled as long as it can be computed by the simulator.
7 Data Generation: (c) Range of x to be sampled
For testing data and validation data: [x_min, x_max] should represent the user-intended range in which the NN is to be used.
For training data: the default range of x samples should equal the user-intended range or, if feasible, extend slightly beyond it.
8 Data Generation: (d) Distribution of x samples (where data should be sampled)
Uniform grid distribution
Non-uniform grid distribution
Design of Experiments (DOE) methodology: central-composite design, 2^n factorial design
Star distribution
Random distribution
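To make the sampling options concrete, here is a minimal sketch of the uniform grid and random distributions for a 3-dimensional input x; the ranges and sample counts are illustrative, not from the slides:

```python
import numpy as np

# Illustrative ranges for a 3-dimensional input x (assumed values).
x_min = np.array([0.1, 1.0, 10.0])
x_max = np.array([0.5, 5.0, 50.0])

# Uniform grid distribution: 5 levels per input -> 5**3 = 125 samples.
axes = [np.linspace(lo, hi, 5) for lo, hi in zip(x_min, x_max)]
grid = np.stack(np.meshgrid(*axes, indexing="ij"), axis=-1).reshape(-1, 3)

# Random (uniform) distribution: 125 samples scattered over the same box.
rng = np.random.default_rng(0)
random_samples = x_min + rng.random((125, 3)) * (x_max - x_min)
```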
9 Data Generation (continued): (e) Number of samples P
Theoretical factors:
For the grid distribution case: Shannon's theorem
For the random distribution case: statistical confidence
10 Input / Output Scaling
The orders of magnitude of the various x and d values in microwave applications can be very different from one another. Scaling of the training data is desirable for efficient neural network training. The data can be scaled such that the various x (or d) values have a similar order of magnitude.
11 Input / Output Scaling: Notation:
x and y -- original x and y
x̃ and ỹ -- scaled x and y
x_min, x_max -- obtained from the data
x̃_min, x̃_max -- dictated by the NN trainer

Linear scale:
Scale formula: x̃ = x̃_min + (x − x_min) / (x_max − x_min) · (x̃_max − x̃_min)
De-scale formula: x = x_min + (x̃ − x̃_min) / (x̃_max − x̃_min) · (x_max − x_min)

Log scale:
Scale formula: x̃ = x̃_min + ln(x / x_min) / ln(x_max / x_min) · (x̃_max − x̃_min)
De-scale formula: x = x_min · exp[(x̃ − x̃_min) / (x̃_max − x̃_min) · ln(x_max / x_min)]
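A direct transcription of these formulas as a sketch; the trainer-dictated range [x̃_min, x̃_max] is assumed to be [−1, 1], and the log-scale form follows the standard reconstruction above since the slide's version is only partially legible:

```python
import numpy as np

def linear_scale(x, x_min, x_max, xt_min=-1.0, xt_max=1.0):
    return xt_min + (x - x_min) * (xt_max - xt_min) / (x_max - x_min)

def linear_descale(xt, x_min, x_max, xt_min=-1.0, xt_max=1.0):
    return x_min + (xt - xt_min) * (x_max - x_min) / (xt_max - xt_min)

def log_scale(x, x_min, x_max, xt_min=-1.0, xt_max=1.0):
    # Assumes 0 < x_min <= x; useful when x spans orders of magnitude.
    return xt_min + np.log(x / x_min) / np.log(x_max / x_min) * (xt_max - xt_min)

def log_descale(xt, x_min, x_max, xt_min=-1.0, xt_max=1.0):
    # Inverse of log_scale: log_descale(log_scale(x)) == x.
    return x_min * np.exp((xt - xt_min) / (xt_max - xt_min) * np.log(x_max / x_min))
```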
12 Illustration of Data Scaling
(Flow diagram: data generation → data scaling → neural network training → finished model for the user. The training data (x, d) are scaled before training; the trained network's output y is de-scaled for the user.)
13 Divide Data into Training Set, Validation Set and Testing Set
Notation:
P -- total number of data samples generated
D -- set of all data, D = {1, 2, ..., P}
Tr -- training data set
V -- validation data set
Te -- test data set
Ideally, each data set (Tr, V, Te) should be an adequate representation of the original y = f(x) problem over the entire range x_min ~ x_max, and the three sets should have no overlap.
14 Divide Data into Training Set, Validation Set and Testing Set
Case 1: When the original data is quite sufficient, split D into non-overlapping sets.
Case 2: When data is very limited, duplicate D, such that Tr = V = Te = D.
Case 3: Split the data D into 2 sets.
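A sketch of the Case 1 split, assuming P = 1000 samples and an illustrative 60/20/20 division:

```python
import numpy as np

# Shuffle the index set D = {0, ..., P-1} and split it into
# non-overlapping training / validation / test index sets.
rng = np.random.default_rng(0)
P = 1000
D = rng.permutation(P)
Tr, V, Te = np.split(D, [int(0.6 * P), int(0.8 * P)])
```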
15 Training / Validation and Testing

Training error:
E_Tr(w) = [ (1/size(Tr)) Σ_{k∈Tr} (1/m) Σ_{j=1}^{m} | (y_j(x_k, w) − d_jk) / (y_max,j − y_min,j) |^q ]^{1/q}

Validation error:
E_V(w) = [ (1/size(V)) Σ_{k∈V} (1/m) Σ_{j=1}^{m} | (y_j(x_k, w) − d_jk) / (y_max,j − y_min,j) |^q ]^{1/q}

Test error:
E_Te(w) = [ (1/size(Te)) Σ_{k∈Te} (1/m) Σ_{j=1}^{m} | (y_j(x_k, w) − d_jk) / (y_max,j − y_min,j) |^q ]^{1/q}
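A sketch of the training-error formula as reconstructed above, with q = 2 and a placeholder y_fn standing in for the network's feedforward computation:

```python
import numpy as np

def training_error(w, x_data, d_data, y_fn, y_min, y_max, q=2):
    errs = []
    for xk, dk in zip(x_data, d_data):          # k runs over the data set
        yk = y_fn(xk, w)                        # m model outputs
        e = np.abs((yk - dk) / (y_max - y_min)) # per-output scaled error
        errs.append(np.mean(e ** q))            # (1/m) * sum over j
    return np.mean(errs) ** (1.0 / q)           # (1/size) sum over k, then ^(1/q)
```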
16 Where to Use Each Error Criterion
Training error: The training error E_Tr(w) and its derivative ∂E_Tr(w)/∂w are used to determine how to update w during training.
Validation error: The validation error E_V(w) is used as a stopping criterion during training, i.e., to determine whether training is sufficient.
Test error: The test error E_Te(w) is used after training has finished to provide a final assessment of the quality of the trained neural network. The test error is not involved during training.
17 Flow-chart Showing Neural Network Training, Validation and Testing
START → Select a neural network structure, e.g., MLP → Assign random initial values to all the weight parameters → Perform feedforward computation for all samples in the training set → Evaluate the training error → Compute the derivatives of the training error w.r.t. the ANN internal weights → Update the neural network weight parameters using a gradient-based algorithm (e.g., BP (backpropagation), quasi-Newton) → Perform feedforward computation for all samples in the validation set → Evaluate the validation error → Desired accuracy achieved? If no, repeat from the training-set feedforward step; if yes, STOP training → Perform feedforward computation for all samples in the test set → Evaluate the test error as an independent quality measure of the trained neural network's reliability.
18 Initial Values of NN Weights Before Training
MLP: small random values
RBF/Wavelet: estimate the centers and widths of the RBFs, or the translations and dilations of the wavelets
Knowledge-Based NN: use physical/electrical experience
19 Overlearning
Definition (strict): E_Tr ≈ 0, E_V >> E_Tr
Observation: the NN has memorized the training data but cannot generalize well.
Possible reasons: (a) too many hidden neurons; (b) not enough training data.
Actions: (a) add training data; (b) delete hidden neurons; (c) back up / retrieve a previous solution.
20 Neural Network Over-Learning
(Figure: plot of output (y) versus input (x) showing the neural network curve, the training data, and the validation data.)
21 Underlearning
Definition (strict): E_Tr >> 0
Observation: the NN cannot even represent the problem at the training points.
Possible reasons: (a) not enough hidden neurons; (b) training stuck at a local solution; (c) not enough training.
Actions: (a) add hidden neurons; (b) train more; (c) perturb the solution, then train.
22 Neural Network Under-Learning
(Figure: plot of output (y) versus input (x) showing the neural network curve, the training data, and the validation data.)
23 Perfect Learning
Definition (strict): E_Tr ≈ E_V ≈ 0
Observation: the NN generalizes well.
24 Perfect Learning of Neural Networks
(Figure: plot of output (y) versus input (x) showing the neural network curve, the training data, and the validation data.)
25 Types of Training
Sample-by-sample (or online) training: ANN weights are updated every time a training sample is presented to the network, i.e., the weight update is based on the training error from that sample.
Batch-mode (or offline) training: ANN weights are updated after each epoch, i.e., the weight update is based on the training error from all the samples in the training data set. An epoch is defined as a stage of ANN training that involves presentation of all the samples in the training data set to the neural network once for the purpose of learning.
Supervised training: uses (x and y) data for training.
Unsupervised training: uses only x data for training.
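The two update schedules differ only in how much data feeds each weight update; a minimal sketch, where grad_fn is a placeholder returning ∂E_Tr/∂w for the given sample(s):

```python
import numpy as np

def train_online(w, samples, grad_fn, eta=0.01):
    for s in samples:                      # update after EVERY sample
        w = w - eta * grad_fn(w, [s])
    return w                               # one pass over samples = one epoch

def train_batch(w, samples, grad_fn, eta=0.01):
    return w - eta * grad_fn(w, samples)   # one update per epoch
```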
26 Neural Network Training
The error between the training data and the neural network outputs is fed back to the neural network to guide the internal weight update of the network.
(Figure: training data d and network output y are compared to form the training error, which drives the update of the internal weights w of the neural network with input x.)
27 Training Problem Statement:
Given: training data (x_k, d_k), k ∈ Tr; validation data (x_k, d_k), k ∈ V; NN model y(x, w).
Find the values of w such that the validation error is minimized:
min_epoch E_V(epoch)
where
E_V(epoch) = [ (1/(P_V · m)) Σ_{k∈V} Σ_{j=1}^{m} | (y_j(x_k, w(epoch)) − d_jk) / (y_max,j − y_min,j) |^q ]^{1/q}
P_V = size(V)
w(epoch) = w(epoch − 1) + Δw(epoch − 1)
w(epoch = 0) = user's / software's initial guess
Δw(epoch − 1) is the update determined by the optimization algorithm (training algorithm), which minimizes the training error.
28 Steps of Gradient-Based Training Algorithms:
Step 1: w = initial guess, epoch = 0.
Step 2: If E_V(epoch) < ε (given accuracy criterion) or epoch > max_epoch (maximum number of epochs), stop.
Step 3: Calculate E_Tr(w) and ∂E_Tr(w)/∂w using part or all of the training data.
Step 4: Use the optimization algorithm to find Δw; then w ← w + Δw.
Step 5: If all training data have been used, then epoch = epoch + 1 and go to Step 2; else go to Step 3.
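A minimal sketch of Steps 1-5 as a loop, using plain gradient descent for Step 4; error_fn and grad_fn are placeholders for the feedforward error evaluation and its derivative:

```python
def train(w, error_fn, grad_fn, Tr, V, eps=1e-3, max_epoch=1000, eta=0.05):
    epoch = 0                                             # Step 1
    while error_fn(w, V) >= eps and epoch <= max_epoch:   # Step 2
        g = grad_fn(w, Tr)     # Step 3: dE_Tr/dw over the training data
        w = w - eta * g        # Step 4: gradient-descent update of w
        epoch += 1             # Step 5: all training data used this epoch
    return w, epoch
```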
29 Update of w in Gradient-Based Methods
Δw = η h
where h is the direction of the update of w, and η is the step size of the update.
Gradient-based methods use information from E_Tr(w) and ∂E_Tr(w)/∂w to determine the update direction of w.
The step size η is determined by:
a small fixed constant set by the user,
an adaptive constant during training, or
a line minimization method to find the best value of η.
30 Line Minimization
Problem statement: Let a scalar function of one variable be defined as φ(η) = E_Tr(w + η h), given the present value of w and direction h. Find η such that φ(η) is minimized.
Solution methods (1-dimensional optimization methods):
Sectioning methods: golden section method, Fibonacci method, bisection method
Interpolation methods: quadratic method, cubic method
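A sketch of the golden section method for this 1-D problem; the bracket [a, b] and tolerance are illustrative:

```python
import numpy as np

def golden_section(phi, a, b, tol=1e-6):
    """Minimize phi(eta) = E_Tr(w + eta*h) on the bracket [a, b]."""
    r = (np.sqrt(5) - 1) / 2                  # golden ratio ~ 0.618
    c, d = b - r * (b - a), a + r * (b - a)   # two interior points
    while b - a > tol:
        if phi(c) < phi(d):                   # minimum lies in [a, d]
            b, d = d, c
            c = b - r * (b - a)
        else:                                 # minimum lies in [c, b]
            a, c = c, d
            d = a + r * (b - a)
    return (a + b) / 2

# Example: step size along h for a simple quadratic error profile.
eta_best = golden_section(lambda eta: (eta - 0.3) ** 2, 0.0, 1.0)
```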
31 Back-Propagation (BP) (Rumelhart, Hinton & Williams, 1986)
We use the negative gradient direction h = −∂E_Tr(w)/∂w for Δw = η h.
The neural network weights are updated during training as
w ← w − η ∂E_Tr(w)/∂w
or, with a momentum term,
Δw = −η ∂E_Tr(w)/∂w + α Δw(epoch − 1)
where η is called the learning rate and α is called the momentum factor.
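A sketch of the BP update with momentum as written above; grad_fn stands in for the derivative ∂E_Tr/∂w computed by back-propagation:

```python
import numpy as np

def bp_update(w, grad_fn, epochs=100, eta=0.1, alpha=0.9):
    dw = np.zeros_like(w)
    for _ in range(epochs):
        dw = -eta * grad_fn(w) + alpha * dw   # momentum reuses the last step
        w = w + dw
    return w
```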
32 Determining η and α for BP:
Set η and α as fixed constants, or let them adapt during training:
η = c / epoch, where c is a constant.
STC (search-then-converge) schedule (Darkens, 1992):
η = η_0 · [1 + (c/η_0)(epoch/τ)] / [1 + (c/η_0)(epoch/τ) + τ (epoch/τ)²]
where η_0, τ, c are user-defined.
Delta-bar-delta (Jacobs, 1988):
(a) a separate η_i for each weight w_i of the NN;
(b) each η_i is adjusted during training using present and previous information of ∂E_Tr(w)/∂w_i;
Δw_i = −η_i ∂E_Tr(w)/∂w_i
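A sketch of the STC schedule as reconstructed above (the formula in the original slide is partially illegible, so this follows the standard search-then-converge form; η_0, c, τ values are illustrative):

```python
def stc_eta(epoch, eta0=0.1, c=1.0, tau=100.0):
    # Roughly constant eta for epoch << tau, then eta ~ c/epoch afterwards.
    u = epoch / tau
    return eta0 * (1 + (c / eta0) * u) / (1 + (c / eta0) * u + tau * u ** 2)
```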
33 Conjugate Gradient Method
Δw = η h. Let ∇E = ∂E_Tr(w)/∂w. Then
h(epoch) = −∇E + γ h(epoch − 1), with h = −∇E and γ = 0 initially,
where
γ = ||∇E(epoch)||² / ||∇E(epoch − 1)||²  (Fletcher-Reeves)
γ = ∇E(epoch)ᵀ (∇E(epoch) − ∇E(epoch − 1)) / ||∇E(epoch − 1)||²  (Polak-Ribière)
η is determined by a line minimization method or a trust region method.
Speed: generally faster than BP.
Memory: a few vectors of length N_W, where N_W is the total number of NN weights/parameters in w.
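A sketch of the conjugate gradient loop with the Fletcher-Reeves choice of γ; line_min is a placeholder for a 1-D minimization along h, e.g., the golden section method above:

```python
import numpy as np

def cg_train(w, grad_fn, line_min, epochs=50):
    g = grad_fn(w)
    h = -g                                 # initial direction: -gradient
    for _ in range(epochs):
        eta = line_min(w, h)               # 1-D minimization along h
        w = w + eta * h
        g_new = grad_fn(w)
        gamma = np.dot(g_new, g_new) / np.dot(g, g)   # Fletcher-Reeves
        h = -g_new + gamma * h             # conjugate direction update
        g = g_new
    return w
```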
34 Quasi-Newton Method
Let H be the Hessian matrix of E_Tr w.r.t. w, and B be the inverse of H.
Weight update: Δw = η h, where h = −B ∂E_Tr/∂w.
Use the information in Δw and Δg to approximate B:
B(epoch = 0) = I
B(epoch) = B(epoch − 1) + (Δw Δwᵀ)/(Δwᵀ Δg) − (B(epoch − 1) Δg Δgᵀ B(epoch − 1))/(Δgᵀ B(epoch − 1) Δg)  (DFP formula)
where Δw = w(epoch) − w(epoch − 1) and Δg = ∇E(epoch) − ∇E(epoch − 1).
Speed: fast.
Main memory needed: N_W² (large).
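A sketch of one DFP update of B (the inverse-Hessian approximation), matching the formula above; dw and dg are the weight and gradient changes between epochs:

```python
import numpy as np

def dfp_update(B, dw, dg):
    Bg = B @ dg
    return (B
            + np.outer(dw, dw) / np.dot(dw, dg)   # rank-one correction
            - np.outer(Bg, Bg) / np.dot(dg, Bg))  # curvature correction

# Each epoch: h = -B @ grad, w += eta * h, then update B with the new dw, dg.
```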
35 Levenberg-Marquardt Method
Obtain Δw by solving the linear equations
(Jᵀ J + µ I) Δw = −Jᵀ e
where e = [e_1, e_2, ..., e_Ne]ᵀ, J is the Jacobian, J = ∂e/∂wᵀ, and each e_i corresponds to
e_jk = (y_j(x_k, w) − d_jk) / (y_max,j − y_min,j).
µ = δ > 0: typical Levenberg-Marquardt; µ = 0: Gauss-Newton.
This method is good if e can be very small, e.g., small-residue problems.
Computation needs LU decomposition.
Main memory needed: N_W² (large).
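A sketch of one Levenberg-Marquardt step; np.linalg.solve performs the LU-based solve noted above:

```python
import numpy as np

def lm_step(J, e, mu=1e-3):
    """Solve (J^T J + mu*I) dw = -J^T e for the weight update dw.

    J is the Ne-by-Nw Jacobian of the error vector e w.r.t. the weights.
    """
    Nw = J.shape[1]
    A = J.T @ J + mu * np.eye(Nw)
    return np.linalg.solve(A, -J.T @ e)   # LU-based linear solve
```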
36 Other Training Methods
Huber-Quasi-Newton: Similar to quasi-Newton, except that the error function for training is based on the Huber function rather than the conventional least-squares error function. The Huber formulation allows the training algorithm to robustly handle both small random errors and accidental large errors in the training data.
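The slide does not give the Huber function itself; for reference, its standard form is quadratic for small errors and linear for large ones, which is what limits the influence of gross errors (the threshold δ here is illustrative):

```python
import numpy as np

def huber(e, delta=1.0):
    small = np.abs(e) <= delta
    return np.where(small,
                    0.5 * e ** 2,                       # quadratic region
                    delta * (np.abs(e) - 0.5 * delta))  # linear region
```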
37 Other Training Methods (continued)
Simplex Method: uses information from E_Tr(w) only.
The method starts with several initial guesses of w. These initial points form a simplex in the w space. The method then iteratively updates the simplex using basic moves such as reflection, expansion, and contraction, according to the values of E_Tr(w) at the vertices of the simplex. The error E_Tr(w) generally decreases as the simplex is updated.
38 Other Training Methods (continued)
Genetic Algorithm: uses information from E_Tr(w) only; searches for the global minimum.
The algorithm starts with several initial points of w, called a population. A fitness value is defined for each w such that a w with lower error E_Tr(w) has higher fitness. w points with high fitness values are more likely to be selected as parents, from whom new points of w, called offspring, are produced. This process continues, and the fitness of the population improves.
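A minimal sketch of one generation, using the simplest textbook selection, crossover, and mutation operators rather than any specific variant from the slides; err_fn stands in for E_Tr:

```python
import numpy as np

def ga_generation(pop, err_fn, rng, mut_scale=0.1):
    # Lower error -> higher fitness -> more likely chosen as a parent.
    fitness = -np.array([err_fn(w) for w in pop])
    probs = np.exp(fitness - fitness.max())
    probs /= probs.sum()
    n, dim = pop.shape
    idx = rng.choice(n, size=(n, 2), p=probs)   # pick parent pairs
    mask = rng.random((n, dim)) < 0.5           # uniform crossover
    offspring = np.where(mask, pop[idx[:, 0]], pop[idx[:, 1]])
    return offspring + rng.normal(0, mut_scale, (n, dim))   # mutation
```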
39 Other Training Methods (continued)
Simulated Annealing Algorithm: uses information from E_Tr(w) only; also searches for the global minimum.
The method uses a temperature parameter to control the training process. In the early stages of training, the temperature is high, and new w points with higher error values can still be accepted, thus avoiding being trapped in a local minimum. In the late stages of training, the temperature is low, and only w points leading to a reduction in error are accepted.
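A sketch of an acceptance rule that realizes this behavior; the Metropolis criterion is a standard choice, though the slide does not specify the exact rule:

```python
import numpy as np

def accept(dE, T, rng):
    """Accept a new w given the error change dE at temperature T."""
    return dE < 0 or rng.random() < np.exp(-dE / T)

# Early training: T large -> exp(-dE/T) near 1, uphill moves often accepted.
# Late training:  T small -> only error-reducing moves survive.
```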
40 Qualitative Comparison of Different Algorithms
From fastest convergence rate (and most memory / implementation effort) to slowest (and least):
Levenberg-Marquardt (for small-residue problems)
Quasi-Newton
Conjugate Gradient
BP
41 Comparison of Training Algorithms for 3-Conductor Microstrip Line Example (5 input neurons, 28 hidden neurons, 5 output neurons)
Columns: Training Algorithm, No. of Epochs, Training Error (%), Avg. Test Error (%), CPU (in sec)
Algorithms compared: Adaptive Backpropagation, Conjugate Gradient, Quasi-Newton, Levenberg-Marquardt
42 Comparison of Training Algorithms for MESFET Example (4 input neurons, 60 hidden neurons, 8 output neurons)
Columns: Training Algorithm, No. of Epochs, Training Error (%), Avg. Test Error (%), CPU (in sec)
Algorithms compared: Adaptive Backpropagation, Conjugate Gradient, Quasi-Newton, Levenberg-Marquardt