Experimental Data and Training
- Rebecca Lee
- 5 years ago
Transcription
1 Modeling and Control of Dynamic Systems. Experimental Data and Training. Mihkel Pajusalu, Alo Peets. Tartu,
2 Overview: experimental data; designing the input signal; preparing data for modeling; training; criterion; training methods; issues with training.
3 Experimental data: The primary purpose of an experiment is to produce a set of examples of how the dynamic system to be identified responds to various controls. Sufficiency of a linear model: superposition check; homogeneity check; frequency response check (the output signal must trace the input signal over frequency and amplitude).
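The superposition and homogeneity checks on this slide can be sketched numerically. The two test systems below (`first_order_lag` and `saturating_lag`) are hypothetical stand-ins for a real experiment, not systems from the lecture; on real data the deviations would have to be judged against the noise level.

```python
import numpy as np

def first_order_lag(u, a=0.9, b=0.1):
    """Hypothetical linear test system: y(t) = a*y(t-1) + b*u(t-1)."""
    u = np.asarray(u, dtype=float)
    y = np.zeros_like(u)
    for t in range(1, len(u)):
        y[t] = a * y[t - 1] + b * u[t - 1]
    return y

def saturating_lag(u):
    """Hypothetical nonlinear test system: the same lag followed by a tanh saturation."""
    return np.tanh(3.0 * first_order_lag(u))

def linearity_errors(system, u1, u2, alpha=2.0):
    """Largest deviation from superposition and from homogeneity."""
    y1, y2 = system(u1), system(u2)
    superposition = np.max(np.abs(system(u1 + u2) - (y1 + y2)))
    homogeneity = np.max(np.abs(system(alpha * u1) - alpha * y1))
    return superposition, homogeneity

rng = np.random.default_rng(0)
u1, u2 = rng.standard_normal(200), rng.standard_normal(200)
lin_err = linearity_errors(first_order_lag, u1, u2)     # both deviations near zero
nonlin_err = linearity_errors(saturating_lag, u1, u2)   # both deviations clearly non-zero
```

If both deviations stay at the level of the measurement noise, a linear model may be sufficient.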
4 Experiment design: Sampling frequency: a compromise between identification and controller design. Curse of dimensionality: all combinations of frequencies and amplitudes. Designing the input signal: N-samples-constant; level change at random instances; chirp signal.
5 N-samples-constant
6 Level change at random instances
7 Chirp signal
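The three input signal types named on the slides above could be generated along the following lines. This is a sketch only: the function names, the uniform amplitude distribution, the switching probability `p_change` and the linear frequency sweep are assumptions, since the slide figures are not preserved in the transcript.

```python
import numpy as np

def n_samples_constant(length, n_hold, low=-1.0, high=1.0, rng=None):
    """Random level held constant for n_hold samples at a time."""
    rng = np.random.default_rng(rng)
    n_levels = -(-length // n_hold)              # ceiling division
    levels = rng.uniform(low, high, n_levels)
    return np.repeat(levels, n_hold)[:length]

def random_level_changes(length, p_change=0.1, low=-1.0, high=1.0, rng=None):
    """Level switches to a new random value with probability p_change per sample."""
    rng = np.random.default_rng(rng)
    u = np.empty(length)
    level = rng.uniform(low, high)
    for t in range(length):
        if rng.random() < p_change:
            level = rng.uniform(low, high)
        u[t] = level
    return u

def chirp(length, f_start, f_end, fs, amplitude=1.0):
    """Sine wave whose frequency sweeps linearly from f_start to f_end (Hz)."""
    t = np.arange(length) / fs
    sweep_rate = (f_end - f_start) / (length / fs)
    phase = 2.0 * np.pi * (f_start * t + 0.5 * sweep_rate * t ** 2)
    return amplitude * np.sin(phase)

u_hold = n_samples_constant(103, n_hold=5, rng=0)
u_rand = random_level_changes(200, p_change=0.1, rng=1)
u_chirp = chirp(1000, f_start=0.1, f_end=5.0, fs=100.0)
```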
8 Experiment in closed loop: When a system is unstable or poorly damped, keeping it inside its intended operating range requires one of the following: 1. a stabilizing controller; 2. a manually tuned PID controller; 3. a human operator controlling the system.
9 Preparing data: Filtering can be used to remove noise, periodic disturbances, offsets, and the effects of uninteresting dynamics from the measured signals. Redundancy and outliers should be removed from the data set. Scaling is highly recommended: remove the mean and scale all signals to the same variance.
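The scaling step can be sketched as follows, assuming the signals are stored column-wise in an array. The helper names are illustrative; the point is that the mean and standard deviation are kept so that model outputs can be converted back to physical units.

```python
import numpy as np

def fit_scaler(x):
    """Return per-signal mean and standard deviation (columns are signals)."""
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    std = np.where(std == 0.0, 1.0, std)   # leave constant signals unscaled
    return mean, std

def scale(x, mean, std):
    """Remove the mean and scale to unit variance."""
    return (np.asarray(x, dtype=float) - mean) / std

def unscale(x_scaled, mean, std):
    """Map scaled values back to the original units."""
    return np.asarray(x_scaled, dtype=float) * std + mean

rng = np.random.default_rng(0)
raw = rng.normal(loc=[5.0, -200.0], scale=[0.1, 30.0], size=(500, 2))
mean, std = fit_scaler(raw)
scaled = scale(raw, mean, std)
```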
10 Training: Definition: a mapping from the data set to the set of candidate models, i.e. selecting the best model among the candidates. Criterion: mean square error. Searching for the minimum: Prediction Error Method.
11 Taylor expansion: second-order Taylor expansion of the criterion, with the gradient and the Hessian.
12 Searching for minimum: Minimum conditions: the gradient equals zero and the Hessian matrix is positive definite (v^T H v > 0 for all non-zero vectors v). Search: update rule; converges at a local minimum.
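The equations of the two slides above are not preserved in the transcript. A standard form, with assumed symbols (V for the criterion, \theta for the weights, G for the gradient, H for the Hessian, \mu for the step size and f for the search direction), would be:

```latex
% Second-order Taylor expansion of the criterion V around \theta^*
V(\theta) \approx V(\theta^*) + (\theta - \theta^*)^T G(\theta^*)
          + \tfrac{1}{2} (\theta - \theta^*)^T H(\theta^*) (\theta - \theta^*)

G(\theta^*) = \left. \frac{dV}{d\theta} \right|_{\theta = \theta^*}, \qquad
H(\theta^*) = \left. \frac{d^2 V}{d\theta^2} \right|_{\theta = \theta^*}

% Sufficient conditions for a local minimum at \theta^*
G(\theta^*) = 0, \qquad v^T H(\theta^*) v > 0 \quad \text{for all } v \neq 0

% Iterative search: step size \mu^{(i)} along search direction f^{(i)}
\theta^{(i+1)} = \theta^{(i)} + \mu^{(i)} f^{(i)}
```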
13 Training methods: First order: use only gradient information about the criterion (gradient method). Second order: also use second-order derivative information, the Hessian (the Newton method, the Levenberg-Marquardt method). Recursive: on-line training, using only the latest input-output pair (recursive versions of the gradient and Gauss-Newton methods).
14 First-order methods: Gradient method, a.k.a. steepest descent method: search direction opposite to the gradient; back-propagation algorithm; step size is very important; slow convergence.
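A minimal sketch of the gradient method with a constant step size, applied to a hypothetical quadratic criterion V = 0.5*theta^T A theta - b^T theta (the matrix `A`, vector `b` and step size are illustrative choices, not values from the lecture):

```python
import numpy as np

def gradient_descent(grad, theta0, step=0.1, max_iter=1000, tol=1e-8):
    """Move against the gradient until it (almost) vanishes."""
    theta = np.asarray(theta0, dtype=float).copy()
    for _ in range(max_iter):
        g = grad(theta)
        if np.linalg.norm(g) < tol:
            break
        theta = theta - step * g          # search direction is -gradient
    return theta

A = np.array([[3.0, 1.0], [1.0, 2.0]])    # positive definite, so one minimum
b = np.array([1.0, 1.0])
grad = lambda theta: A @ theta - b
theta_gd = gradient_descent(grad, [0.0, 0.0])   # minimum solves A theta = b
```

With too large a step the iteration diverges and with too small a step it crawls, which is why the slide stresses the step size.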
15 Second order: Newton method: also uses the Hessian; using the criterion; derivation.
16 Second order: Newton method: update rule; search direction. Must be complemented with a line search (damped Newton method). Convergence is better when the gradient method is used as a first stage. The Hessian may be ill-conditioned, which causes numerical problems when calculating the search direction. Might not converge if the criterion [...]
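A damped Newton iteration might look like the sketch below. The backtracking (Armijo) line search, the fallback to the negative gradient and the test criterion V = exp(theta_0) - theta_0 + (theta_1 - 2)^2 are assumptions made for illustration.

```python
import numpy as np

def damped_newton(V, grad, hess, theta0, max_iter=50, tol=1e-10):
    """Newton search direction combined with a backtracking line search."""
    theta = np.asarray(theta0, dtype=float).copy()
    for _ in range(max_iter):
        g = grad(theta)
        if np.linalg.norm(g) < tol:
            break
        d = np.linalg.solve(hess(theta), -g)    # Newton direction: H d = -g
        if g @ d >= 0.0:                        # not a descent direction
            d = -g                              # fall back to steepest descent
        mu = 1.0
        while V(theta + mu * d) > V(theta) + 1e-4 * mu * (g @ d) and mu > 1e-10:
            mu *= 0.5                           # halve the step until V decreases enough
        theta = theta + mu * d
    return theta

V = lambda th: np.exp(th[0]) - th[0] + (th[1] - 2.0) ** 2
grad = lambda th: np.array([np.exp(th[0]) - 1.0, 2.0 * (th[1] - 2.0)])
hess = lambda th: np.array([[np.exp(th[0]), 0.0], [0.0, 2.0]])
theta_newton = damped_newton(V, grad, hess, [3.0, -1.0])   # minimum at (0, 2)
```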
17 Quasi-Newton methods: The Newton method has quadratic convergence but is computationally expensive. A computationally cheaper alternative: the Hessian can be approximated (quasi-Newton methods). BFGS (Broyden-Fletcher-Goldfarb-Shanno) algorithm: approximates the Hessian from previous iterates and gradients.
18 Gauss-Newton method: Convergence of quasi-Newton methods is poor. The Gauss-Newton method approximates the prediction error. The resulting Hessian is different: less expensive to compute (first-order derivatives only) and positive semidefinite by definition. Called the damped Gauss-Newton method if combined with a line search.
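A sketch of a damped Gauss-Newton iteration for a least-squares criterion V = (1/2N) * sum(e^2). The 1/(2N) scaling, the simple step-halving line search and the saturating-exponential test model are assumptions; only first-order derivatives of the prediction (`psi`) are needed.

```python
import numpy as np

def gauss_newton(residual, jacobian, theta0, max_iter=50, tol=1e-10):
    """Damped Gauss-Newton: Hessian approximated by psi^T psi / N."""
    theta = np.asarray(theta0, dtype=float).copy()
    for _ in range(max_iter):
        e = residual(theta)                 # prediction errors y - y_hat
        psi = jacobian(theta)               # d y_hat / d theta, shape (N, p)
        g = -psi.T @ e / len(e)             # gradient of the criterion
        if np.linalg.norm(g) < tol:
            break
        R = psi.T @ psi / len(e)            # Gauss-Newton Hessian (positive semidefinite)
        d = np.linalg.solve(R, -g)
        V0, mu = 0.5 * np.mean(e ** 2), 1.0
        while 0.5 * np.mean(residual(theta + mu * d) ** 2) > V0 and mu > 1e-10:
            mu *= 0.5                       # halve the step if V would increase
        theta = theta + mu * d
    return theta

# Hypothetical model y_hat = theta_0 * (1 - exp(-theta_1 * x)), noise-free data
x = np.linspace(0.0, 10.0, 50)
theta_true = np.array([2.0, 0.5])
y = theta_true[0] * (1.0 - np.exp(-theta_true[1] * x))
residual = lambda th: y - th[0] * (1.0 - np.exp(-th[1] * x))
jacobian = lambda th: np.column_stack([1.0 - np.exp(-th[1] * x),
                                       th[0] * x * np.exp(-th[1] * x)])
theta_gn = gauss_newton(residual, jacobian, [1.5, 0.8])
```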
19 Pseudo-Newton method: neglects off-diagonal elements in the Gauss-Newton Hessian; search direction is cheaper to calculate; consumes less memory; overcomes ill-conditioning of the Hessian; convergence is slower.
20 Levenberg-Marquardt method: The search direction of the Gauss-Newton method is not optimal: the approximation is valid only in the immediate neighborhood of the current iterate. The minimum is searched only within a given radius. Can be scaled.
21 Levenberg-Marquardt method: the most obvious choice for neural network training; fast and robust convergence.
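A simplified Levenberg-Marquardt sketch is given below. It uses a common heuristic (divide the damping parameter `lam` by 10 after a successful step, multiply by 10 after a rejected one) rather than whatever update rule the lecture used, and the exponential-decay model is a hypothetical test case.

```python
import numpy as np

def levenberg_marquardt(residual, jacobian, theta0, lam=1e-2, max_iter=100, tol=1e-10):
    """Gauss-Newton step with damping lam added to the Hessian diagonal."""
    theta = np.asarray(theta0, dtype=float).copy()
    e = residual(theta)
    V = 0.5 * np.mean(e ** 2)
    for _ in range(max_iter):
        psi = jacobian(theta)
        g = -psi.T @ e / len(e)
        if np.linalg.norm(g) < tol:
            break
        R = psi.T @ psi / len(e)
        d = np.linalg.solve(R + lam * np.eye(len(theta)), -g)
        e_new = residual(theta + d)
        V_new = 0.5 * np.mean(e_new ** 2)
        if V_new < V:                      # accept: trust the model more
            theta, e, V = theta + d, e_new, V_new
            lam = max(lam / 10.0, 1e-12)
        else:                              # reject: shrink the trust region
            lam *= 10.0
    return theta

# Hypothetical model y_hat = theta_0 * exp(-theta_1 * x), noise-free data
x = np.linspace(0.0, 5.0, 40)
theta_true = np.array([3.0, 0.7])
y = theta_true[0] * np.exp(-theta_true[1] * x)
residual = lambda th: y - th[0] * np.exp(-th[1] * x)
jacobian = lambda th: np.column_stack([np.exp(-th[1] * x),
                                       -th[0] * x * np.exp(-th[1] * x)])
theta_lm = levenberg_marquardt(residual, jacobian, [2.0, 0.3])
```

A large `lam` makes the step short and close to the gradient direction, while a small `lam` approaches the Gauss-Newton step.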
22 Recursive algorithms: New data is added during training. Sometimes on-line training is needed: adaptive control of time-varying systems; calculation of batch methods requires too much time and old information becomes obsolete. Useful also in off-line training: implementation is simpler, less memory is needed, and redundancy is used more effectively for convergence. A new criterion is needed. Schemes for discarding past information.
23 Recursive Gauss-Newton method: P is a covariance matrix, initially P(0) = cI, where c is a large number. For an ARX network, the method degenerates to the Recursive Least Squares (RLS) algorithm.
24 Exponential forgetting: An exponential forgetting factor can be used for discarding past information. It must be high enough to avoid covariance blow-up.
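For a linear-in-parameters (ARX) model, the recursive update with exponential forgetting reduces to the familiar RLS recursion sketched here. The forgetting factor `lam = 0.99`, the initial covariance scale `c = 1e4` and the first-order test system are illustrative assumptions.

```python
import numpy as np

def rls(phi_seq, y_seq, lam=0.99, c=1e4):
    """Recursive least squares with exponential forgetting factor lam."""
    p = phi_seq.shape[1]
    theta = np.zeros(p)
    P = c * np.eye(p)                          # P(0) = c*I with c large
    for phi, y in zip(phi_seq, y_seq):
        e = y - phi @ theta                    # prediction error for the newest pair
        K = P @ phi / (lam + phi @ P @ phi)    # gain vector
        theta = theta + K * e
        P = (P - np.outer(K, phi @ P)) / lam   # dividing by lam discounts old data
    return theta, P

# Hypothetical ARX system: y(t) = 0.8*y(t-1) + 0.5*u(t-1)
rng = np.random.default_rng(0)
u = rng.standard_normal(300)
y = np.zeros(300)
for t in range(1, 300):
    y[t] = 0.8 * y[t - 1] + 0.5 * u[t - 1]
phi = np.column_stack([y[:-1], u[:-1]])        # regressors [y(t-1), u(t-1)]
theta_rls, P_final = rls(phi, y[1:])
```

With `lam = 1` nothing is forgotten; lowering `lam` shortens the memory but lets P grow when the input is not exciting enough, which is the covariance blow-up mentioned on the slide.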
25 Exponential Forgetting and Resetting Algorithm
26 Recursive gradient method: obtained by setting [...]; a.k.a. incremental or on-line back-propagation.
27 Generalization: Training data vs. reality: errors lead to overfitting. Generalization error: cannot be found exactly; test and validation sets; Final Prediction Error estimate. Generalization error contributions: the bias error (insufficient model structure) and the variance error (specific data set).
28 Bias vs. variance dilemma: A neural network never describes a system completely. Bias can be reduced by expanding the network architecture, but this causes the variance of the weights to increase.
29 Bias vs. variance
30 Regularization: A possible way to deal with the bias/variance dilemma. The criterion can be augmented with a regularization (or complexity) term. Simple weight decay term, where D is a diagonal matrix that denotes the weight decay.
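For a model that is linear in its parameters, the effect of a weight decay term can be shown in closed form. The sketch assumes the regularized criterion (1/2N)*sum(e^2) + (1/2N)*theta^T D theta with D = alpha*I; this specific form and the random regressors are assumptions, as the slide's equation is not preserved.

```python
import numpy as np

def fit_with_weight_decay(phi, y, alpha):
    """Minimizer of the regularized criterion: (phi^T phi + D) theta = phi^T y."""
    D = alpha * np.eye(phi.shape[1])       # diagonal weight decay matrix
    return np.linalg.solve(phi.T @ phi + D, phi.T @ y)

rng = np.random.default_rng(0)
phi = rng.standard_normal((50, 3))         # hypothetical regressor matrix
theta_true = np.array([1.0, -2.0, 0.5])
y = phi @ theta_true                       # noise-free outputs
theta_plain = fit_with_weight_decay(phi, y, alpha=0.0)
theta_decay = fit_with_weight_decay(phi, y, alpha=10.0)
```

Increasing `alpha` pulls the weights toward zero, trading a little bias for lower variance.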
31 Effect of regularization
32 Effect of regularization
33 Effect of weight decay constant
34 Effects of regularization: improves generalization; local minima are gradually eliminated as the decay constant increases; increases smoothness of the criterion; the minimum is reached faster. Selection of the weight decay is important; the trial-and-error method is too demanding for large networks.
35 Implementation issues: computing gradients for different model structures; how to decide when to terminate the training; how to handle systems with multiple inputs and outputs. Multiple inputs can be used with practically no modifications. Multiple outputs can be solved with multiple models or better methods.
36 Computing gradients: Except for the full Newton method (which requires second-order derivative information), only the derivative of the prediction with respect to the weights is required. The derivative can be calculated using the structure of the network.
37 Examples of derivatives
38 Back propagation: a.k.a. the generalized delta rule; formally, a way of calculating gradients; a popular method.
39 Back propagation: 1. Present a training sample to the neural network. 2. Compare the network's output to the desired output from that sample. Calculate the error in each output neuron. 3. For each neuron, calculate what the output should have been, and a scaling factor, how much lower or higher the output must be adjusted to match the desired output. This is the local error. 4. Adjust the weights of each neuron to lower the local error. 5. Assign "blame" for the local error to neurons at the previous level, giving greater responsibility to neurons connected by stronger weights. 6. Repeat from step 3 on the neurons at the previous level, using each one's "blame" as its error.
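The steps above can be sketched for a small network with one tanh hidden layer and a linear output layer. The architecture, the criterion 0.5*mean(sum(e^2)) and all variable names are assumptions for illustration; the finite-difference comparison at the end is a common sanity check on the computed gradient.

```python
import numpy as np

def forward(x, W1, b1, W2, b2):
    h = np.tanh(x @ W1 + b1)                 # hidden layer outputs
    return h, h @ W2 + b2                    # linear output layer

def loss(x, y, W1, b1, W2, b2):
    _, y_hat = forward(x, W1, b1, W2, b2)
    return 0.5 * np.mean(np.sum((y - y_hat) ** 2, axis=1))

def backprop(x, y, W1, b1, W2, b2):
    """Gradients of the loss with respect to all weights and biases."""
    n = x.shape[0]
    h, y_hat = forward(x, W1, b1, W2, b2)
    delta_out = (y_hat - y) / n              # local error at the output neurons
    gW2 = h.T @ delta_out
    gb2 = delta_out.sum(axis=0)
    # "blame" passed back through W2, scaled by the tanh derivative 1 - h^2
    delta_hid = (delta_out @ W2.T) * (1.0 - h ** 2)
    gW1 = x.T @ delta_hid
    gb1 = delta_hid.sum(axis=0)
    return gW1, gb1, gW2, gb2

rng = np.random.default_rng(0)
x, y = rng.standard_normal((20, 3)), rng.standard_normal((20, 2))
W1, b1 = rng.standard_normal((3, 4)), np.zeros(4)
W2, b2 = rng.standard_normal((4, 2)), np.zeros(2)
gW1, gb1, gW2, gb2 = backprop(x, y, W1, b1, W2, b2)

# Central finite difference on one weight as a check of the analytic gradient
eps = 1e-6
W1_plus, W1_minus = W1.copy(), W1.copy()
W1_plus[0, 0] += eps
W1_minus[0, 0] -= eps
numeric = (loss(x, y, W1_plus, b1, W2, b2) - loss(x, y, W1_minus, b1, W2, b2)) / (2 * eps)
```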
40 Stopping criteria: maximum number of iterations; upper bound for the gradient; upper bound for the weight change (stop if the biggest weight change is below a certain value); upper bound for the criterion (rarely known beforehand); lower bound for the trust region (maximum value of λ as a stopping criterion); early stopping (an additional data set is used to find when the generalization error is smallest).
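Early stopping combined with a maximum iteration count can be sketched as below. The `patience` parameter and the toy `step` and `val_error` functions are hypothetical: the toy training step drifts toward 1.0 while the validation error is smallest near 0.6, mimicking a model that starts to overfit.

```python
def train_with_early_stopping(step, val_error, theta0, max_iter=1000, patience=5):
    """Keep the weights with the lowest validation error seen so far."""
    theta = theta0
    best_theta, best_err = theta, val_error(theta)
    bad_steps = 0
    iterations = 0
    for iterations in range(1, max_iter + 1):
        theta = step(theta)                  # one training iteration
        err = val_error(theta)               # error on the additional data set
        if err < best_err:
            best_theta, best_err, bad_steps = theta, err, 0
        else:
            bad_steps += 1
            if bad_steps >= patience:        # validation error keeps rising
                break
    return best_theta, best_err, iterations

step = lambda th: th + 0.1 * (1.0 - th)      # toy training update
val_error = lambda th: (th - 0.6) ** 2       # toy validation error
best_theta, best_err, n_iter = train_with_early_stopping(step, val_error, 0.0)
```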
41 Multiple outputs: Simplest: a separate model for each output. Better strategy: model the system as a whole. The criterion changes; Γ is a covariance matrix. For Levenberg-Marquardt.
42 Multiple outputs: recursive Gauss-Newton algorithm
43 Iterated Generalized Least Squares (IGLS): the covariance matrix is unknown in practice
44 The End
More informationDr. Qadri Hamarsheh Supervised Learning in Neural Networks (Part 1) learning algorithm Δwkj wkj Theoretically practically
Supervised Learning in Neural Networks (Part 1) A prescribed set of well-defined rules for the solution of a learning problem is called a learning algorithm. Variety of learning algorithms are existing,
More informationAkarsh Pokkunuru EECS Department Contractive Auto-Encoders: Explicit Invariance During Feature Extraction
Akarsh Pokkunuru EECS Department 03-16-2017 Contractive Auto-Encoders: Explicit Invariance During Feature Extraction 1 AGENDA Introduction to Auto-encoders Types of Auto-encoders Analysis of different
More informationThe Curse of Dimensionality
The Curse of Dimensionality ACAS 2002 p1/66 Curse of Dimensionality The basic idea of the curse of dimensionality is that high dimensional data is difficult to work with for several reasons: Adding more
More informationCOMPUTATIONAL INTELLIGENCE (CS) (INTRODUCTION TO MACHINE LEARNING) SS16. Lecture 2: Linear Regression Gradient Descent Non-linear basis functions
COMPUTATIONAL INTELLIGENCE (CS) (INTRODUCTION TO MACHINE LEARNING) SS16 Lecture 2: Linear Regression Gradient Descent Non-linear basis functions LINEAR REGRESSION MOTIVATION Why Linear Regression? Regression
More informationIntroduction to optimization methods and line search
Introduction to optimization methods and line search Jussi Hakanen Post-doctoral researcher jussi.hakanen@jyu.fi How to find optimal solutions? Trial and error widely used in practice, not efficient and
More informationCHAPTER VI BACK PROPAGATION ALGORITHM
6.1 Introduction CHAPTER VI BACK PROPAGATION ALGORITHM In the previous chapter, we analysed that multiple layer perceptrons are effectively applied to handle tricky problems if trained with a vastly accepted
More informationLOESS curve fitted to a population sampled from a sine wave with uniform noise added. The LOESS curve approximates the original sine wave.
LOESS curve fitted to a population sampled from a sine wave with uniform noise added. The LOESS curve approximates the original sine wave. http://en.wikipedia.org/wiki/local_regression Local regression
More informationData Mining Chapter 8: Search and Optimization Methods Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University
Data Mining Chapter 8: Search and Optimization Methods Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University Search & Optimization Search and Optimization method deals with
More informationFeature Selection. CE-725: Statistical Pattern Recognition Sharif University of Technology Spring Soleymani
Feature Selection CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Dimensionality reduction Feature selection vs. feature extraction Filter univariate
More informationCPSC 340: Machine Learning and Data Mining. Robust Regression Fall 2015
CPSC 340: Machine Learning and Data Mining Robust Regression Fall 2015 Admin Can you see Assignment 1 grades on UBC connect? Auditors, don t worry about it. You should already be working on Assignment
More informationCharacterizing Improving Directions Unconstrained Optimization
Final Review IE417 In the Beginning... In the beginning, Weierstrass's theorem said that a continuous function achieves a minimum on a compact set. Using this, we showed that for a convex set S and y not
More information