
DAta Mining Exploration Project

General Purpose Multi Layer Perceptron Neural Network
(trained by Back Propagation & Quasi-Newton)

Data Mining Model User Manual

Doc.: DAME-MAN-NA-0008
Issue: 1.0
Date: September 02, 2010
Prepared by: M. Brescia, 02/09/2010
Released by: G. Longo, 02/09/2010

Revision Matrix

Issue    Author        Date          Section/Paragraph affected    Reason/Initiation/Documents/Remarks
0.1      M. Brescia    02/09/2010    All                           First draft release

INDEX

1 Reference & Applicable Documents
2 Abbreviations & Acronyms
3 Introduction
3.1 Design Issues
3.1.1 The MLP implementation overview
3.1.2 The BP implementation
3.1.3 The QNA implementation
4 System Architectural Design
4.1 Chosen System Architecture
4.2 System Interface description
    Wrapping design & implementations requirements
4.3 User Interface description
    Input dataset format
    MLP-BP wrapping requirements and execution details
        TRAIN USE CASE
        TEST USE CASE
        RUN USE CASE
    MLP-QNA wrapping requirements and execution details
        TRAIN USE CASE
        TEST USE CASE
        RUN USE CASE
        STATISTICAL TRAIN USE CASE
5 APPENDIX Scientific case test with MLP-QNA
5.1 The Science case
5.2 Test procedure and results

TABLE INDEX

Tab. 1 Reference & Applicable Documents
Tab. 2 Abbreviations and acronyms
Tab. 3 Test results

FIGURE INDEX

Fig. 1 MLP architecture
Fig. 2 bipolar sigmoid activation function
Fig. 3 Execution time comparison between the 4 tests
Fig. 4 Training iterations comparison between the 4 tests


1 Reference & Applicable Documents

1. sdd_template_voneural-sdd-na-0000-rel0.1, Software Design Description Document Guidelines, M. Brescia
2. SuiteDesign_VONEURAL-PDD-NA-0001-Rel2.0, Suite Project Description Document, VO-Neural team
3. DMPlugins_DAME-TRE-NA-0016-Rel0.3, Deployed Model-Functionality DMPlugins Description report, A. Di Guido, M. Brescia
4. dm-model_voneural-sdd-na-0008-rel1.2, Data Mining Model Component Software Design Description, S. Cavuoti, A. Di Guido
5. framework_voneural-sdd-na-0005-rel1.0, Framework Component Software Design Description, O. Laurino, M. Fiore
6. R. C. Eberhart, R. W. Dobbins, Neural Networks PC Tools: A Practical Guide, Academic Press
7. Nguyen, D., Widrow, B., "Improving the Learning Speed of 2-Layer Neural Networks by Choosing Initial Values of the Adaptive Weights", IJCNN, USA
8. Fernández-Redondo, M., Hernández-Espinosa, C., "A Comparison among Weight Initialization Methods for Multilayer Feedforward Networks", IJCNN, Italy
9. Byrd, R. H., Nocedal, J., Schnabel, R. B., "Representations of Quasi-Newton Matrices and their use in Limited Memory Methods", Mathematical Programming, 63, 4

Tab. 1 Reference & Applicable Documents

2 Abbreviations & Acronyms

Abbreviation    Meaning
A & A           Abbreviations & Acronyms
BFGS            Broyden-Fletcher-Goldfarb-Shanno
BP              Back Propagation
CE              Cross Entropy
CSV             Comma Separated Value
DAME            Data Mining & Exploration
DM              Data Mining
DMM             Data Mining Model
GP              General Purpose
GRID            Global Resource Information Database
L-BFGS          Limited memory BFGS
MLP             Multi Layer Perceptron
MSE             Mean Square Error
NN              Neural Network
OOP             Object Oriented Programming
QNA             Quasi Newton Algorithm
SA              Stand Alone
SVM             Support Vector Machine
TS              Tournament Selection
TBC             To Be Completed
UML             Unified Modeling Language
VO              Virtual Observatory
XML             eXtensible Markup Language

Tab. 2 Abbreviations and acronyms

3 Introduction

This document describes a data mining (DM) model used to solve non-linear optimization problems. It is based on the design of a general purpose feed-forward neural network architecture, in order to obtain a Soft Computing instrument implementing supervised learning. The classical MLP architecture is used, associated with two types of learning algorithm:

the standard gradient descent weight update rule, named Back Propagation (BP);
the statistical Quasi-Newton Algorithm (QNA).

In the MLP-BP case, both batch and on-line learning modes are available, while in the MLP-QNA case only batch learning is provided. Hereinafter the term MLP-GP indicates the general model implemented, while MLP-BP indicates the MLP-GP associated with the BP algorithm and MLP-QNA the MLP associated with the QNA learning rule.

3.1 Design Issues

The model described here is intended to become one of the DM models officially integrated into the DAME Suite. To achieve this goal a set of standardization rules is followed, in order to make the package compliant with the specific environment specifications, [2, 3, 4, 5]. These guidelines are basically related to the input/output data format, compiling and execution dependencies, and DMM wrapper conditions and requirements.

3.1.1 The MLP implementation overview

The MLP architecture is one of the most typical feed-forward neural network models. The term feed-forward identifies the basic behavior of such neural models, in which the impulse is always propagated in the same direction, i.e. from the input layer towards the output layer, through one or more hidden layers (the network brain), by computing at each neuron (all neurons except those of the input layer) a weighted sum of its inputs. The neurons are organized in layers, each with its own role. The input signal, simply propagated through the neurons of the input layer, is used to stimulate the next hidden and output neuron layers. The output of each neuron is obtained by applying an activation function to the weighted sum of its inputs. Different shapes of activation function can be used, from the simplest linear one up to a sigmoid. The number of hidden layers represents the degree of complexity achieved for the energy solution space in which the network output moves looking for the best solution. As an example, in a typical classification problem, the number of hidden layers determines the number of hyper-planes used to split the parameter space (i.e. the number of possible classes) in order to classify each input pattern.

What differs among such neural network architectures is typically the learning algorithm used to train the network. There is a dichotomy between supervised and unsupervised learning methods.

Fig. 1 MLP architecture

In the first case (supervised learning), the network must first be trained (training phase): the input patterns are submitted to the network as couples (input, desired known output). The feed-forward pass is then executed and, at the end of the input submission, the network output is compared with the corresponding desired output in order to quantify the learning quality. The comparison can be performed in a batch way (after the submission of the entire input pattern set) or incrementally (the comparison is done after each input pattern submission); also the metric used to measure the distance between desired and obtained outputs can be chosen according to problem-specific requirements (in MLP-BP the MSE, Mean Square Error, is used). After each comparison, and until the desired error distance is reached (typically the error tolerance is a pre-calculated value or a constant imposed by the user), the weights of the hidden layers are changed according to a particular law or learning technique.

After the training phase is finished (or arbitrarily stopped), the network should be able not only to produce the correct output for each input already used in the training set, but also to achieve a certain degree of generalization, i.e. to give the correct output for inputs never used before to train it. The degree of generalization obviously varies depending on how good the learning phase has been. This important feature is obtained because the network does not associate a single input with an output, but discovers the relationship behind their association. After training, such a neural network can be seen as a black box able to perform a particular function (input-output correlation) whose analytical shape is not known a priori.

In order to obtain the best training, the training set must be as homogeneous as possible and able to describe a great variety of samples. The bigger the training set, the higher the network generalization capability will be. Despite these considerations, it should always be taken into account that neural networks are usually best applied to problems requiring high flexibility (quantitative result) more than high precision (qualitative results).

Concerning the hidden layer choice, it is possible to define zero hidden layers (SLP, Single Layer Perceptron, able to solve only linear separations of the parameter space), 1 or 2 hidden layers, depending on the complexity the user wants to introduce in the non-linear problem-solving experiment.

The second learning type (unsupervised) basically refers to neural models able to classify/cluster patterns into several categories, based on their common features, by submitting training inputs without the related desired outputs. This is not the learning case approached with the MLP architecture, so no further information is added in this document.
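As a purely illustrative aid to the feed-forward propagation just described, the following C++ fragment (a generic sketch with hypothetical names, not code taken from the MLP-GP library) shows how the output of one fully connected layer is computed: each neuron applies an activation function to the weighted sum of its inputs plus a bias.

#include <cmath>
#include <vector>

// Generic sigmoid-like activation used only for this sketch; the actual
// function adopted by MLP-BP is the bipolar sigmoid described in the next section.
double activate(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Propagate an input vector through one layer: weights[j][i] connects input i
// to neuron j, bias[j] is the bias of neuron j.
std::vector<double> forward_layer(const std::vector<double>& input,
                                  const std::vector<std::vector<double>>& weights,
                                  const std::vector<double>& bias)
{
    std::vector<double> output(weights.size(), 0.0);
    for (std::size_t j = 0; j < weights.size(); ++j) {
        double sum = bias[j];
        for (std::size_t i = 0; i < input.size(); ++i)
            sum += weights[j][i] * input[i];
        output[j] = activate(sum);  // neuron output = activation of the weighted sum
    }
    return output;
}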

3.1.2 The BP implementation

In the feed-forward process, the network calculates the output based on the given input. The bipolar logistic (sigmoid) function is used as the activation function in the hidden and output layers, while in the input layer the identity function is used. Choosing an appropriate activation function can also contribute to a much faster learning; theoretically, a sigmoid function with a lower saturation speed will give a better result:

f_{sigmoid}(x) = \frac{2}{1 + e^{-\sigma x}} - 1

f'_{sigmoid}(x) = \frac{2\sigma e^{-\sigma x}}{(e^{-\sigma x} + 1)^2}

Fig. 2 bipolar sigmoid activation function

The slope σ can be adjusted to see how it affects the learning speed: a larger slope makes the weight values move faster towards the saturation region (faster convergence), while a smaller slope makes the weight values move slower but allows a more refined weight adjustment.

Next, the calculated output is compared to the desired output to compute the error; the following task is to minimize this error, and the method chosen for minimizing it also determines the learning speed. The gradient descent method is the most common one. Finally, the weights are updated with the standard gradient descent rule

w_{ij}(t+1) = w_{ij}(t) - \eta \, \frac{\partial E}{\partial w_{ij}}

where η is the learning rate and E is the network error.
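The bipolar sigmoid above and its derivative can be coded directly from the two formulas; the following C++ fragment is an illustrative sketch (the function names are hypothetical and do not belong to the MLP-BP sources).

#include <cmath>

// Bipolar sigmoid: output in (-1, +1); sigma controls the slope.
double bipolar_sigmoid(double x, double sigma)
{
    return 2.0 / (1.0 + std::exp(-sigma * x)) - 1.0;
}

// Derivative of the bipolar sigmoid, used in the BP delta rule.
double bipolar_sigmoid_derivative(double x, double sigma)
{
    const double e = std::exp(-sigma * x);
    return 2.0 * sigma * e / ((e + 1.0) * (e + 1.0));
}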

Besides the plain gradient descent method, there are several other methods that guarantee a faster learning speed. In this case, the classical BP learning process can be made much faster by adding a momentum term or by using an adaptive learning rate. The feed-forward network error is calculated with the standard MSE function.

In momentum learning, the weight update at time (t+1) contains a momentum term proportional to the previous update, so the previous values of error and output must be kept:

\Delta w_{ij}(t+1) = -\eta \, \frac{\partial E}{\partial w_{ij}} + \alpha \, \Delta w_{ij}(t)

The variable α is the momentum value; it should be greater than zero and smaller than one.

In the MLP-BP implementation, the momentum is not the only improvement made to the standard BP algorithm. An adaptive learning rule, [7], has also been implemented, described in the following. For adaptive learning, the idea is to change the learning rate automatically based on the current and previous error: the last two errors are observed and the learning rate is adjusted in the direction that would have reduced the second error. The variables E and Ei are the current and previous error, while the parameter A determines how rapidly the learning rate is adjusted; A should be less than one and greater than zero. Another possible method is to multiply the current learning rate by a factor greater than one if the current error is smaller than the previous one, and by a factor less than one if the current error is bigger than the previous one. In the literature it is also suggested to discard the weight changes when the error is increasing; this leads to a better result. The adaptive learning routine is in the function ann_train_network_from_file, where the learning rate update is performed either in on-line mode (updated after each single pattern presentation) or in batch mode (updated after a whole dataset presentation).

Moreover, concerning the MLP network weight initialization, several alternative methods have been implemented, [8]. It is known that the initialization values influence the speed of convergence, and several methods are available for this purpose. The most common one is to initialize the weights at random with uniform distribution inside a certain small range; in MLP-BP this method is called HARD_RANDOM. A better method bounds the random values to the interval [-1, +1]; this method is called simply RANDOM.
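The momentum update and the simpler of the two adaptive strategies described above (multiply the learning rate by a factor greater than one when the error decreases, smaller than one when it increases) can be sketched as follows. This is an illustrative C++ fragment with hypothetical names and factor values, not the code of ann_train_network_from_file.

#include <vector>

struct BpUpdater {
    double eta = 0.1;          // learning rate
    double alpha = 0.9;        // momentum, 0 < alpha < 1
    double prev_error = 1e30;  // error of the previous epoch

    // One weight update with momentum: grad[i] = dE/dw[i];
    // delta_prev stores the previous weight change.
    void update(std::vector<double>& w,
                const std::vector<double>& grad,
                std::vector<double>& delta_prev) const
    {
        for (std::size_t i = 0; i < w.size(); ++i) {
            const double delta = -eta * grad[i] + alpha * delta_prev[i];
            w[i] += delta;
            delta_prev[i] = delta;
        }
    }

    // Simple adaptive learning rate: grow eta if the error decreased,
    // shrink it otherwise (the factors 1.05 and 0.7 are only examples).
    void adapt(double current_error)
    {
        eta *= (current_error < prev_error) ? 1.05 : 0.7;
        prev_error = current_error;
    }
};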

A widely known and very good weight initialization method is the Nguyen-Widrow method, called here NGUYEN. The Nguyen-Widrow weight initialization can be summarized in the following steps: first, random numbers between -1 and 1 are assigned to all hidden-node weights; next, the norm of these generated random numbers is calculated by calling the function get_norm_of_weight; finally, having all the necessary data, the weights are rescaled according to the Nguyen-Widrow formula of [7]. All the weight initialization routines are located in the function initialize_weights.

It is also possible to resume a trained network and perform a further training session, starting from the previously stored final weight setup. In this case the user should select the FROM_FILE method and specify the stored weight file name as input.
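The rescaling step is the standard Nguyen-Widrow rule of [7]; the following C++ fragment is an illustrative sketch of it (hypothetical names, not the initialize_weights or get_norm_of_weight routines of the library).

#include <cmath>
#include <cstdlib>
#include <vector>

// weights[j][i]: weight from input i to hidden neuron j; bias[j]: bias of neuron j.
void nguyen_widrow_init(std::vector<std::vector<double>>& weights,
                        std::vector<double>& bias)
{
    const std::size_t n_hidden = weights.size();
    const std::size_t n_input  = weights[0].size();
    // Scale factor beta = 0.7 * H^(1/N), with H hidden neurons and N inputs.
    const double beta = 0.7 * std::pow(double(n_hidden), 1.0 / double(n_input));

    for (std::size_t j = 0; j < n_hidden; ++j) {
        double norm = 0.0;
        for (std::size_t i = 0; i < n_input; ++i) {
            // Step 1: random values in [-1, +1].
            weights[j][i] = 2.0 * (std::rand() / double(RAND_MAX)) - 1.0;
            norm += weights[j][i] * weights[j][i];
        }
        norm = std::sqrt(norm);
        // Step 2: rescale so that each hidden-node weight vector has norm beta.
        for (std::size_t i = 0; i < n_input; ++i)
            weights[j][i] = beta * weights[j][i] / norm;
        // Step 3: bias drawn uniformly in [-beta, +beta].
        bias[j] = beta * (2.0 * (std::rand() / double(RAND_MAX)) - 1.0);
    }
}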

3.1.3 The QNA implementation

In this case, the BP algorithm is completely replaced by an adapted version of the classical Newton method for optimization problems. The Newton method is the general basis for a whole family of so-called Quasi-Newton methods; one of those methods, implemented here, is the L-BFGS algorithm, [9]. More rigorously, the QNA is an optimization of the learning rule, because, as described below, the implementation is based on a statistical approximation of the Hessian through cyclic gradient calculations, the gradient being, as said in the previous section, at the base of the BP method.

As is well known, the classical Newton method uses the Hessian of a function. The step of the method is defined as the product of the inverse Hessian matrix and the function gradient. If the function is a positive definite quadratic form, the minimum is reached in one step; in case of an indefinite quadratic form (which has no minimum), the method reaches a maximum or a saddle point. In short, the method finds the stationary point of a quadratic form. In practice, we usually deal with functions which are not quadratic forms; if such a function is smooth, it is sufficiently well described by a quadratic form in the neighborhood of the minimum. However, the Newton method can converge both to a minimum and to a maximum (taking a step in the direction in which the function increases).

Quasi-Newton methods solve this problem as follows: they use a positive definite approximation instead of the Hessian. If the Hessian is positive definite, the step is made using the Newton method. If the Hessian is indefinite, it is first modified to make it positive definite, and then the step is performed using the Newton method; the step is always performed in the direction of the function decrement. In case of a positive definite Hessian, it is used to generate a quadratic surface approximation, which should improve convergence. If the Hessian is indefinite, the method just moves to where the function decreases.

Some modifications of Quasi-Newton methods perform a precise linear minimum search along the indicated line, but it has been proved that it is enough to sufficiently decrease the function value, and not necessary to find a precise minimum. The L-BFGS algorithm tries to perform a step using the Newton method; if it does not lead to a decrease of the function value, it reduces the step length until a smaller function value is found.

Up to here it seems quite simple, but it is not. The Hessian of a function is not always available and in many cases it is too complicated to compute; often only the function gradient can be calculated. Therefore, the following strategy is used: the Hessian of the function is generated on the basis of N consecutive gradient calculations, and then the quasi-Newton step is performed. There is a special formula which allows one to iteratively obtain a Hessian approximation; at each approximation step the matrix remains positive definite.

The algorithm uses the L-BFGS update scheme. BFGS stands for Broyden-Fletcher-Goldfarb-Shanno (more precisely, this scheme generates not the Hessian but its inverse matrix, so no time is wasted inverting a Hessian). The letter L in the scheme name comes from the words "Limited memory". In case of big dimensions, the amount of memory required to store a Hessian (N^2) is too big, along with the machine time required to process it. Therefore, instead of using N gradient values to generate a Hessian, a smaller number M of values can be used, which requires a memory capacity of order N·M. In practice, M is usually chosen between 3 and 7; in difficult cases it is reasonable to increase this constant up to 20. Of course, the result is not the Hessian but its approximation: on the one hand, the convergence slows down; on the other hand, the overall performance can even improve. At first sight this statement is paradoxical, but it contains no contradiction: the convergence is measured by the number of iterations, whereas the performance depends on the amount of processor time spent to calculate the result. As a matter of fact, this method was designed to optimize functions of a large number of arguments (hundreds or thousands), because in that case it is worth accepting an increased iteration number, due to the lower approximation precision, since the overhead per iteration becomes much lower. This is particularly useful in astrophysical data mining problems, where usually the parameter space is dimensionally huge and confused by a low signal-to-noise ratio. These methods can be used for small-dimension problems too: the main advantage of the method is scalability, because it provides high performance when solving high-dimensionality problems while still allowing small-dimension problems to be solved. A minimal sketch of the limited-memory update step is given at the end of this section.

From the implementation point of view, in the MLP-QNA case the following features are available for the end user:

only batch learning mode is available;
strict separation between classification and regression functionality modes;
for classification mode, the Cross Entropy method is available to compare output and target network values. It is alternatively possible to use the standard MSE rule, which is mandatory for regression mode;
K-fold cross validation method to improve training performance and to avoid overfitting problems;
resume of training from past experiments, by using the weights stored in an external file at the end of the training phase;
confusion matrix calculated and stored in an external file for both classification and regression modes (in the latter case an adapted version is provided). It is useful after training and test sessions to evaluate the model performance.
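As announced above, the following C++ fragment gives a minimal sketch of the limited-memory update in its generic textbook two-loop form (see [9]); it is NOT the code of the MLP-QNA implementation and all names are hypothetical. Given the current gradient g and the last M stored correction pairs s_k = x_{k+1} - x_k and y_k = g_{k+1} - g_k, it returns the quasi-Newton search direction without ever forming the Hessian.

#include <vector>

static double dot(const std::vector<double>& a, const std::vector<double>& b)
{
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

std::vector<double> lbfgs_direction(const std::vector<double>& g,
                                    const std::vector<std::vector<double>>& s,
                                    const std::vector<std::vector<double>>& y)
{
    const std::size_t m = s.size();          // number of stored corrections (M)
    std::vector<double> q = g;
    std::vector<double> alpha(m), rho(m);

    // First loop: from the newest to the oldest correction pair.
    for (std::size_t k = m; k-- > 0; ) {
        rho[k]   = 1.0 / dot(y[k], s[k]);
        alpha[k] = rho[k] * dot(s[k], q);
        for (std::size_t i = 0; i < q.size(); ++i) q[i] -= alpha[k] * y[k][i];
    }

    // Initial inverse-Hessian scaling H0 = gamma * I.
    const double gamma = (m > 0) ? dot(s[m - 1], y[m - 1]) / dot(y[m - 1], y[m - 1]) : 1.0;
    for (double& qi : q) qi *= gamma;

    // Second loop: from the oldest to the newest correction pair.
    for (std::size_t k = 0; k < m; ++k) {
        const double beta = rho[k] * dot(y[k], q);
        for (std::size_t i = 0; i < q.size(); ++i) q[i] += s[k][i] * (alpha[k] - beta);
    }

    // Descent direction: -H * g.
    for (double& qi : q) qi = -qi;
    return q;
}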

4 System Architectural Design

4.1 Chosen System Architecture

The choice of the MLP-GP system architecture is not free, but bounded by the specific requirements issued by the DAME Suite environment. The MLP-GP is one of the supported DM models to be integrated into the Suite infrastructure, in terms of I/O data format, XML parameter description, functionality association (design pattern integration as specified in [4]) and DMPlugin package constraints, as specified in [3] and [5].

4.2 System Interface description

Wrapping design & implementations requirements

In order to wrap the MLP-GP model (a library implemented in C++) into the DAME Suite, a Java class called MLPGP.java has to be created. The DAME Suite has a class interface called DMMInterface, which represents a generic data mining model that can be added to the Suite. Therefore the MLPGP Java class must implement the DMMInterface class and its specified use cases:

Train;
Test;
Run;
Full (as a sequential combination of the previous ones).

This wrapping phase is foreseen to be provided as soon as possible.

4.3 User Interface description

In order to be integrated into the DAME Suite, the code of MLP-GP has been structured taking into account the DMM design pattern requirements and the distinction into different functionality and use case constraints. The MLP-GP has to be placed into the supervised model hierarchy, associated with specific functionality modes. The MLP-GP, in particular with the QNA learning rule, can be used in the following functionality modes:

Classification;
Regression.

Also the use case requirements mentioned in the previous section have been strictly followed (train, test, run and full use cases allowed). At run time the program can be executed under the form of a formatted command line.

Depending on the use case and functionality of the current experiment, the following are the details of the command line parameters to be specified to execute the MLP-GP. In principle the program is executed by composing a command line as a suffix string for the executable program mycnn_bp.exe. In all following cases the command line must be respected in terms of number and order of parameters.

Input dataset format

The input dataset file (with target columns for the train and test cases, without target columns for the run case) accepted by the MLP-BP program is exclusively in CSV format, with no header or special character at the beginning of the file. IMPORTANT: the file MUST be provided WITHOUT any header and with NO CARRIAGE RETURN after the last pattern row!!!

MLP-BP wrapping requirements and execution details

Here the details about the launch of the MLP with BP are reported.

TRAIN USE CASE

Training command line parameters (ALL parameters are required, in a strict sequential order):

1. Functionality case [integer]:
   a. 10 CLASSIFICATION;
   b. 20 REGRESSION;
2. Use case [integer]:
   a. 0 TRAIN_BP (training case);
3. Learning rate [double]. The BP learning rate, in the range [0, 1];
4. Momentum factor [double]. The momentum value, in the range [0, 1];
5. Learning changing factor [double]. Used for the adaptive learning rule, in the range ]0, 1[;
6. Slope sigma argument of the bipolar sigmoid activation function [double];
7. Number of input neurons [integer]. It must match the number of input dataset columns;
8. Number of output neurons [integer]. It must match the number of target dataset columns;
9. Number of hidden layers [integer]. It may be 0, 1 or 2;
10. Number of first hidden layer neurons [integer]. If parameter 9 is 0, then this field is not considered;
11. Number of second hidden layer neurons [integer]. If parameter 9 is < 2, then this field is not considered;
12. Weight initializing rule [integer]:

   a. 701 HARD_RANDOM. Random values within the user-specified range [-m_init_val, +m_init_val];
   b. 702 RANDOM. Random values within the range [-1, +1];
   c. 703 NGUYEN. First scales into [-1, +1], then applies the specific rule (see section 3.1.2);
   d. 704 FROM_FILE. An already created weight file is used. This is the case of resumed training or of test/run sessions, where an already trained network must be used;
13. Range of the user-defined weight initialization [double]. Used only if the rule selected as parameter 12 is HARD_RANDOM, otherwise it is not considered;
14. Training input dataset file name (with full relative path) [character string];
15. Training log file name (with full relative path) [character string];
16. Partial training error file name (with full relative path) [character string];
17. Training network weight file name (with full relative path) [character string];
18. Number of training iterations [integer]. Stop condition;
19. Learning MSE error threshold [double]. Stop condition;
20. Training input dataset internal column order [integer]:
   a. 705 INPUT_FIRST. The file contains the input columns first and then the target columns;
   b. 706 OUTPUT_FIRST. The file contains the target columns first and then the input columns;
21. Training mode [integer]:
   a. 300 BATCH;
   b. 301 INCREMENTAL (on-line mode to update the network weights).

As an example of training command line:

mycnn_bp input.txt trainlog.txt trainpartialerror.txt trainedweights.txt

Training Output

The following output files are automatically generated at the end of the execution:

errorlog.txt: error report file, containing details about any incorrect condition or exception that caused an abnormal exit from the execution. This file is not created if the program ends normally;
<log file name>: user-defined log file with information about the experiment results;
<partial training error file name>: user-defined file with the partial error value at each training iteration. Useful to obtain a graphical view of the learning process;
<trained weights file name>: final network weights frozen at the end of training. It can be used in a new training experiment to restore the old one.

All these files have to be registered as official output files of the experiment.

TEST USE CASE

Test command line parameters (ALL parameters are required, in a strict sequential order):

1. Functionality case [integer]:
   a. 10 CLASSIFICATION;
   b. 20 REGRESSION;
2. Use case [integer]:
   a. 1 TEST_BP (test case);
3. Slope sigma argument of the bipolar sigmoid activation function [double];
4. Number of input neurons [integer]. It must match the number of input dataset columns;
5. Number of output neurons [integer]. It must match the number of target dataset columns;
6. Number of hidden layers [integer]. It may be 0, 1 or 2;
7. Number of first hidden layer neurons [integer]. If parameter 6 is 0, then this field is not considered;
8. Number of second hidden layer neurons [integer]. If parameter 6 is < 2, then this field is not considered;
9. Input weight file name (with full relative path) [character string];
10. Test input dataset file name (with full relative path) [character string];
11. Test output log file name (with full relative path) [character string];
12. Test input dataset internal column order [integer]:
   a. 705 INPUT_FIRST. The file contains the input columns first and then the target columns;
   b. 706 OUTPUT_FIRST. The file contains the target columns first and then the input columns.

As an example of test command line:

mycnn_bp trainedweights.txt input.txt testlog.txt 705

Test Output

The following output files are automatically generated at the end of the execution:

errorlog.txt: error report file, containing details about any incorrect condition or exception that caused an abnormal exit from the execution. This file is not created if the program ends normally;
<log file name>: user-defined log file with information about the experiment results;

All these files have to be registered as official output files of the experiment.

RUN USE CASE

Run command line parameters (ALL parameters are required, in a strict sequential order):

1. Functionality case [integer]:
   a. 10 CLASSIFICATION;
   b. 20 REGRESSION;
2. Use case [integer]:
   a. 2 RUN_BP (run case);
3. Slope sigma argument of the bipolar sigmoid activation function [double];
4. Number of input neurons [integer]. It must match the number of input dataset columns;
5. Number of output neurons [integer]. It must match the number of target dataset columns;
6. Number of hidden layers [integer]. It may be 0, 1 or 2;
7. Number of first hidden layer neurons [integer]. If parameter 6 is 0, then this field is not considered;
8. Number of second hidden layer neurons [integer]. If parameter 6 is < 2, then this field is not considered;
9. Input weight file name (with full relative path) [character string];
10. Run input dataset file name (with full relative path) [character string];
11. Run output log file name (with full relative path) [character string].

As an example of run command line:

mycnn_bp trainedweights.txt run.txt runlog.txt

Run Output

The following output files are automatically generated at the end of the execution:

errorlog.txt: error report file, containing details about any incorrect condition or exception that caused an abnormal exit from the execution. This file is not created if the program ends normally;
<log file name>: user-defined log file with information about the experiment results.

The files to be registered as official output files of the experiment are those underlined in the previous list.

MLP-QNA wrapping requirements and execution details

Here the details about the launch of the MLP with QNA are reported.

TRAIN USE CASE

Training command line parameters (ALL parameters are required, in a strict sequential order):

1. Functionality case [integer]:
   a. 10 CLASSIFICATION;
   b. 20 REGRESSION;
2. Use case [integer]:
   a. 3 TRAIN_QNA (training case);
3. Decay [double]. Weight decay constant (>= 0.001). The decay term Decay*||Weights||^2 is added to the error function. Default value = 0.001;
4. Restarts [integer]. Number of restarts from a random position, > 0. If you don't know what Restarts value to choose, use 2 (THIS IS THE NUMBER OF MAX TRAINING CYCLES PERFORMED ANYWAY). Default value = 20;
5. Wstep [double]. Stopping criterion: the algorithm stops if the step size is less than WStep (a zero step size means stopping after MaxIts iterations). Default value = 0.01;
6. MaxIts [integer]. Stopping criterion: the algorithm stops after MaxIts iterations (NOT gradient calculations). Zero MaxIts means stopping when the step is sufficiently small (use Wstep). Default value = 1500;
7. Number of input neurons [integer]. It must match the number of input dataset columns;
8. Number of output neurons [integer]. It must match the number of target dataset columns;
9. Number of hidden layers [integer]. It may be 0, 1 or 2;
10. Number of first hidden layer neurons [integer]. If parameter 9 is 0, then this field is not considered;
11. Number of second hidden layer neurons [integer]. If parameter 9 is < 2, then this field is not considered;
12. Cross Entropy flag [integer]. Used in classification mode only; in regression mode this parameter is not considered. Default value = 0:
   a. 0 not used (standard MSE used);
   b. 1 used;
13. Name of the input dataset file (with full relative path) [character string];
14. K-fold cross validation flag [integer]. Default value = 0:
   a. 0 not used (standard training without validation);
   b. 1 used (training with validation sequence);
15. K-fold cross validation k value [integer]. If parameter 14 is 0 then this parameter is not considered. Default value = 5;

16. Confusion matrix calculation mode [integer]. This is an internal parameter used to define the calculation of the confusion matrix (which diagonal is to be considered as positive cases). To be used only in classification mode; in case of regression it is not considered. Default value = 1:
   a. 0 reverse mode (positive cases are on the secondary diagonal of the matrix);
   b. 1 standard mode (positive cases are on the primary diagonal of the matrix);
17. Weight initialization choice [integer]. It states how to initialize the network weights; it is possible to resume a previous training phase:
   a. 702 RANDOM initialization between [-1, +1];
   b. 704 FROM_FILE, to be used in case of a past training resume;
18. Name of the weight file (with full relative path if loaded from a different directory) to be loaded to initialize the network weights [character string]. To be used in case parameter 17 is set to FROM_FILE; if parameter 17 is RANDOM, this is not considered.

Examples of command lines follow.

Classification mode training command line:

mycnn_bp datasets/agn_7_stat_full.txt none

In this case:
- network with 1 hidden layer of 15 neurons;
- it uses cross entropy (flag set to 1);
- it uses cross validation (flag set to 1) with k = 10;
- weights are initialized randomly (702), with weight file name none (not used).

mycnn_bp datasets/agn_7_stat_full.txt experiments/regression/trainedweights.txt

In this case:
- network with 2 hidden layers of, respectively, 15 and 6 neurons;
- it does not use cross entropy (flag set to 0), but the simple MSE error;
- it does not use cross validation (flag set to 0), so k = 10 is not considered;
- weights are initialized by restoring a past training (704), with weight file name experiments/regression/trainedweights.txt.

Regression mode training command line:

mycnn_bp datasets/agn_7_stat_full.txt experiments/regression/trainedweights.txt

Training Output

When executed under the training use case, the output is composed of the following files, stored into a pre-defined directory sub-tree. This sub-tree starts from the execution directory and branches into two different sub-trees, depending on the functionality domain of the current execution:

- ./experiments/classification for the classification case
- ./experiments/regression for the regression case

In one of such directories the following output files are automatically generated at the end of the execution:

errorlog.txt: error report file, containing details about any incorrect condition or exception that caused an abnormal exit from the execution. This file is not created if the program ends normally;
trainlog.txt: log file with detailed information about the experiment configuration, main results and parameter setup;
trainpartialerror.txt: ASCII (space separated) file with partial values at each training iteration of the QNA algorithm. Useful to obtain a graphical view of the learning process. Each row is composed of three columns:
   o training step;
   o number of iterations of the current step (number of Hessian approximations <= MaxIts);
   o current step batch error (MSE, or Cross Entropy value if selected in classification mode);
trainedweights.txt: final network weights frozen at the end of batch training. It can be used in a new training experiment to restore the old one;
frozen_train_net.txt: internal network node values as frozen at the end of training, to be given as network input file in test/run cases;
traintestoutlog.txt: output values as calculated after training, with the respective target values. It can be used to evaluate the network output for each input pattern. It corresponds to an embedded test session done by submitting the training dataset as test dataset;
temptrash.txt: ASCII file with network outputs and related targets for all input patterns (a simplified, not verbose, version of traintestoutlog.txt, for internal use only);
traintestconfmatrix.txt: confusion matrix calculated at the end of training. It results from the values stored into the traintestoutlog.txt file. Useful to obtain a simple statistical evaluation of the whole training results. In the case of regression an adapted version is provided.

Some of the above files, as described, are not very useful for the end user, being created for internal use only. In particular, the main files to be registered as official output files are those underlined in the previous list.
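To clarify the role of the confusion matrix calculation mode parameter described above, the following C++ fragment is a purely illustrative sketch (hypothetical names and data, not the MLP-QNA code) of how a two-class confusion matrix can be accumulated from (predicted, target) pairs such as those stored in traintestoutlog.txt, and how the cases counted as positive are read from the primary or the secondary diagonal depending on the selected mode.

#include <array>
#include <cstdio>
#include <vector>

int main()
{
    // Hypothetical predicted / target class labels (0 or 1) for a few patterns.
    const std::vector<int> predicted = {1, 0, 1, 1, 0};
    const std::vector<int> target    = {1, 0, 0, 1, 1};

    // cm[t][p] counts patterns with target class t and predicted class p.
    std::array<std::array<int, 2>, 2> cm{};
    for (std::size_t i = 0; i < predicted.size(); ++i)
        ++cm[target[i]][predicted[i]];

    const bool standard_mode = true;  // parameter value 1; false corresponds to reverse mode (0)
    const int positives = standard_mode ? cm[0][0] + cm[1][1]   // primary diagonal
                                        : cm[0][1] + cm[1][0];  // secondary diagonal
    std::printf("confusion matrix [[%d %d][%d %d]], positive cases = %d of %zu\n",
                cm[0][0], cm[0][1], cm[1][0], cm[1][1],
                positives, predicted.size());
    return 0;
}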

TEST USE CASE

Test command line parameters (ALL parameters are required, in a strict sequential order):

1. Functionality case [integer]:
   a. 10 CLASSIFICATION;
   b. 20 REGRESSION;
2. Use case [integer]:
   a. 4 TEST_QNA (test case);
3. Number of input neurons [integer]. It must match the number of input dataset columns;
4. Number of output neurons [integer]. It must match the number of target dataset columns;
5. Number of hidden layers [integer]. It may be 0, 1 or 2;
6. Number of first hidden layer neurons [integer]. If parameter 5 is 0, then this field is not considered;
7. Number of second hidden layer neurons [integer]. If parameter 5 is < 2, then this field is not considered;
8. Name of the input dataset file (with full relative path) [character string];
9. Confusion matrix calculation mode [integer]. This is an internal parameter used to define the calculation of the confusion matrix (which diagonal is to be considered as positive cases). To be used only in classification mode; in case of regression it is not considered. Default value = 1:
   a. 0 reverse mode (positive cases are on the secondary diagonal of the matrix);
   b. 1 standard mode (positive cases are on the primary diagonal of the matrix);
10. Name of the weight file (with full relative path if loaded from a different directory) to be loaded to initialize the network weights [character string];
11. Name of the file (with full relative path if loaded from a different directory) with the internal network node values as frozen at the end of the training phase [character string].

Examples of command lines follow.

Classification mode test command line:

mycnn_bp datasets/test_agn_ridotto.txt 1 experiments/classification/trainedweights.txt experiments/classification/frozen_train_net.txt

Regression mode test command line:

mycnn_bp datasets/test_agn_ridotto.txt 1 experiments/regression/trainedweights.txt experiments/regression/frozen_train_net.txt

Test Output

When executed under the test use case, the output is composed of the following files, stored into a pre-defined directory sub-tree. This sub-tree starts from the execution directory and branches into two different sub-trees, depending on the functionality domain of the current execution:

- ./experiments/classification for the classification case
- ./experiments/regression for the regression case

In one of such directories the following output files are automatically generated at the end of the execution:

errorlog.txt: error report file, containing details about any incorrect condition or exception that caused an abnormal exit from the execution. This file is not created if the program ends normally;
testoutlog.txt: output values as calculated after the test, with the respective target values. It can be used to evaluate the network output for each input pattern;
temptrash.txt: ASCII file with network outputs and related targets for all input patterns (a simplified, not verbose, version of testoutlog.txt, for internal use only);
testconfmatrix.txt: confusion matrix calculated at the end of the test. It results from the values stored into the testoutlog.txt file. Useful to obtain a simple statistical evaluation of the whole test results. In the case of regression an adapted version is provided.

The files to be registered as official output files of the experiment are those underlined in the previous list.

RUN USE CASE

Run command line parameters (ALL parameters are required, in a strict sequential order):

1. Functionality case [integer]:
   a. 10 CLASSIFICATION;
   b. 20 REGRESSION;
2. Use case [integer]:
   a. 4 RUN_QNA (run case);
3. Number of input neurons [integer]. It must match the number of input dataset columns;
4. Number of output neurons [integer]. It must match the number of target dataset columns;
5. Number of hidden layers [integer]. It may be 0, 1 or 2;
6. Number of first hidden layer neurons [integer]. If parameter 5 is 0, then this field is not considered;

7. Number of second hidden layer neurons [integer]. If parameter 5 is < 2, then this field is not considered;
8. Name of the input dataset file (with full relative path) [character string];
9. Confusion matrix calculation mode [integer]. This is an internal parameter used to define the calculation of the confusion matrix (which diagonal is to be considered as positive cases). To be used only in classification mode; in case of regression it is not considered. Default value = 1:
   a. 0 reverse mode (positive cases are on the secondary diagonal of the matrix);
   b. 1 standard mode (positive cases are on the primary diagonal of the matrix);
10. Name of the weight file (with full relative path if loaded from a different directory) to be loaded to initialize the network weights [character string];
11. Name of the file (with full relative path if loaded from a different directory) with the internal network node values as frozen at the end of the training phase [character string].

Examples of command lines follow.

Classification mode run command line:

mycnn_bp datasets/test_agn_ridotto.txt 1 experiments/classification/trainedweights.txt experiments/classification/frozen_train_net.txt

Regression mode run command line:

mycnn_bp datasets/test_agn_ridotto.txt 1 experiments/regression/trainedweights.txt experiments/regression/frozen_train_net.txt

Run Output

When executed under the run use case, the output is composed of the following files, stored into a pre-defined directory sub-tree. This sub-tree starts from the execution directory and branches into two different sub-trees, depending on the functionality domain of the current execution:

- ./experiments/classification for the classification case
- ./experiments/regression for the regression case

In one of such directories the following output files are automatically generated at the end of the execution:

errorlog.txt: error report file, containing details about any incorrect condition or exception that caused an abnormal exit from the execution. This file is not created if the program ends normally;

RunOutLog.txt: output values as calculated by the network for the run input dataset. It can be used to evaluate the network output for each input pattern.

The files to be registered as official output files of the experiment are those underlined in the previous list.

STATISTICAL TRAIN USE CASE

Statistical training command line parameters (ALL parameters are required, in a strict sequential order):

1. Functionality case [integer]:
   a. 10 CLASSIFICATION;
   b. 20 REGRESSION;
2. Use case [integer]:
   a. 6 STAT_TRAIN_QNA (statistical training case). It generates a verbose log file reporting a step-by-step training procedure with an incremental dimension of the input dataset (see the related section below for details);
3. Decay [double]. Weight decay constant (>= 0.001). The decay term Decay*||Weights||^2 is added to the error function. Default value = 0.001;
4. Restarts [integer]. Number of restarts from a random position, > 0. If you don't know what Restarts value to choose, use 2 (THIS IS THE NUMBER OF MAX TRAINING CYCLES PERFORMED ANYWAY). Default value = 20;
5. Wstep [double]. Stopping criterion: the algorithm stops if the step size is less than WStep (a zero step size means stopping after MaxIts iterations). Default value = 0.01;
6. MaxIts [integer]. Stopping criterion: the algorithm stops after MaxIts iterations (NOT gradient calculations). Zero MaxIts means stopping when the step is sufficiently small (use Wstep). Default value = 1500;
7. Number of input neurons [integer]. It must match the number of input dataset columns;
8. Number of output neurons [integer]. It must match the number of target dataset columns;
9. Number of hidden layers [integer]. It may be 0, 1 or 2;
10. Number of first hidden layer neurons [integer]. If parameter 9 is 0, then this field is not considered;
11. Number of second hidden layer neurons [integer]. If parameter 9 is < 2, then this field is not considered;
12. Cross Entropy flag [integer]. Used in classification mode only; in regression mode this parameter is not considered. Default value = 0:
   a. 0 not used (standard MSE used);
   b. 1 used;
13. Name of the input dataset file (with full relative path) [character string];
14. K-fold cross validation flag [integer]. Default value = 0:

   a. 0 not used (standard training without validation);
   b. 1 used (training with validation sequence);
15. K-fold cross validation k value [integer]. If parameter 14 is 0 then this parameter is not considered. Default value = 5;
16. Confusion matrix calculation mode [integer]. This is an internal parameter used to define the calculation of the confusion matrix (which diagonal is to be considered as positive cases). To be used only in classification mode; in case of regression it is not considered. Default value = 1:
   a. 0 reverse mode (positive cases are on the secondary diagonal of the matrix);
   b. 1 standard mode (positive cases are on the primary diagonal of the matrix).

Examples of command lines follow.

Classification mode statistical training command line:

mycnn_bp datasets/agn_7_stat_full.txt

Regression mode statistical training command line:

mycnn_bp datasets/agn_7_stat_full.txt

Statistical Training Output

When executed under the statistical training use case, the output is composed of the following files, stored into a pre-defined directory sub-tree. This sub-tree starts from the execution directory and branches into two different sub-trees, depending on the functionality domain of the current execution:

- ./experiments/classification for the classification case
- ./experiments/regression for the regression case

In one of such directories the following output files are automatically generated at the end of the execution:

errorlog.txt: error report file, containing details about any incorrect condition or exception that caused an abnormal exit from the execution. This file is not created if the program ends normally;
trainlog.txt: log file with detailed information about the experiment configuration, main results and parameter setup;
trainpartialerror.txt: ASCII (space separated) file with partial values at each training iteration of the QNA algorithm. Useful to obtain a graphical view of the learning process. Each row is composed of three columns:

   o training step;
   o number of iterations of the current step (number of Hessian approximations <= MaxIts);
   o current step batch error (MSE, or Cross Entropy value if selected in classification mode);
trainedweights.txt: final network weights frozen at the end of batch training. It can be used in a new training experiment to restore the old one;
frozen_train_net.txt: internal network node values as frozen at the end of training, to be given as network input file in test/run cases;
traintestoutlog.txt: output values as calculated after training, with the respective target values. It can be used to evaluate the network output for each input pattern. It corresponds to an embedded test session done by submitting the training dataset as test dataset;
temptrash.txt: ASCII file with network outputs and related targets for all input patterns (a simplified, not verbose, version of traintestoutlog.txt, for internal use only);
traintestconfmatrix.txt: confusion matrix calculated at the end of training. It results from the values stored into the traintestoutlog.txt file. Useful to obtain a simple statistical evaluation of the whole training results;
stat.txt: complete log with statistical information about both the method and the algorithm performances.

Some of the above files, as described, are not very useful for the end user, being created for internal use only. In particular, the main files to be registered as official output files are those underlined in the previous list.

5 APPENDIX Scientific case test with MLP-QNA

In the following, details about statistical training/test use cases performed with the MLP-QNA algorithm are reported. The described experiments were done for a commissioned scientific case; both classification and regression were used as functionality modes. The scope of these tests is to verify the correctness of the algorithm, its efficiency and the preliminary scientific results (to be investigated in more detail with the specialists of the team).

5.1 The Science case

The scientific problem has two main goals:

1) to determine the accuracy of recognizing globular clusters in nearby galaxies (< 20 Mpc) from single-band images taken with HST, by separating these sources from background stellar contaminants, compact galaxies and AGNs. Usually such recognition is done through color selection, possibly integrated with morphological parameters able to measure the angular extension of single sources. In our case, as a preliminary step, we consider photometric data only;

2) to extract from the parameter space those parameters which influence the formation of X-ray binary sources (LMXB) in the globular clusters. To investigate this goal, however, both X-ray and optical data are required.

Moreover, an intrinsically important result would be to prove the capability and robustness of neural networks as an automatic and easy way to reach the mentioned goals, instead of more complex traditional methods. Last but not least, if the proposed method is able to reach the two goals by using only photometric parameters, another important consequence would be that the morphological (structural) information of the sources can be considered as secondary in the recognition process. But this is a matter for deeper investigation in the near future.

The dataset used in these preliminary tests consists of source catalogues obtained from HST images of the galaxy NGC 1399, in the broad V band (F606W) of HST. For these sources we have the photometric parameters reported below. At the moment M. Paolillo is checking the possibility to obtain a more precise catalogue, with color information for a larger amount of sources, together with a more precise information set about the morphological parameters of all considered sources. Concerning traditional methods, for example by considering the magnitude and stellar attributes of SExtractor, M. Paolillo is able to obtain an accuracy of 92% with +/- 10% of contamination, within m_v ~ 24.5 (by comparing with the C-R color classification only), on a dataset of ~2700 sources.

The photometric information (input columns of our dataset) used is:

1) mag_iso ("Isophotal magnitude")
2) mag_aper1 ("Fixed aperture magnitude vector")
3) mag_aper2
4) mag_aper3
5) kron_radius ("Kron apertures")
6) ellipticity ("1 - B_IMAGE/A_IMAGE")
7) fwhm_image ("FWHM assuming a gaussian core")


11/14/2010 Intelligent Systems and Soft Computing 1 Lecture 7 Artificial neural networks: Supervised learning Introduction, or how the brain works The neuron as a simple computing element The perceptron Multilayer neural networks Accelerated learning in

More information

Louis Fourrier Fabien Gaie Thomas Rolf

Louis Fourrier Fabien Gaie Thomas Rolf CS 229 Stay Alert! The Ford Challenge Louis Fourrier Fabien Gaie Thomas Rolf Louis Fourrier Fabien Gaie Thomas Rolf 1. Problem description a. Goal Our final project is a recent Kaggle competition submitted

More information

Neural Network Optimization and Tuning / Spring 2018 / Recitation 3

Neural Network Optimization and Tuning / Spring 2018 / Recitation 3 Neural Network Optimization and Tuning 11-785 / Spring 2018 / Recitation 3 1 Logistics You will work through a Jupyter notebook that contains sample and starter code with explanations and comments throughout.

More information

LECTURE NOTES Professor Anita Wasilewska NEURAL NETWORKS

LECTURE NOTES Professor Anita Wasilewska NEURAL NETWORKS LECTURE NOTES Professor Anita Wasilewska NEURAL NETWORKS Neural Networks Classifier Introduction INPUT: classification data, i.e. it contains an classification (class) attribute. WE also say that the class

More information

Lecture 2 Notes. Outline. Neural Networks. The Big Idea. Architecture. Instructors: Parth Shah, Riju Pahwa

Lecture 2 Notes. Outline. Neural Networks. The Big Idea. Architecture. Instructors: Parth Shah, Riju Pahwa Instructors: Parth Shah, Riju Pahwa Lecture 2 Notes Outline 1. Neural Networks The Big Idea Architecture SGD and Backpropagation 2. Convolutional Neural Networks Intuition Architecture 3. Recurrent Neural

More information

Classification and Regression using Linear Networks, Multilayer Perceptrons and Radial Basis Functions

Classification and Regression using Linear Networks, Multilayer Perceptrons and Radial Basis Functions ENEE 739Q SPRING 2002 COURSE ASSIGNMENT 2 REPORT 1 Classification and Regression using Linear Networks, Multilayer Perceptrons and Radial Basis Functions Vikas Chandrakant Raykar Abstract The aim of the

More information

CS281 Section 3: Practical Optimization

CS281 Section 3: Practical Optimization CS281 Section 3: Practical Optimization David Duvenaud and Dougal Maclaurin Most parameter estimation problems in machine learning cannot be solved in closed form, so we often have to resort to numerical

More information

The exam is closed book, closed notes except your one-page cheat sheet.

The exam is closed book, closed notes except your one-page cheat sheet. CS 189 Fall 2015 Introduction to Machine Learning Final Please do not turn over the page before you are instructed to do so. You have 2 hours and 50 minutes. Please write your initials on the top-right

More information

Data Mining. Neural Networks

Data Mining. Neural Networks Data Mining Neural Networks Goals for this Unit Basic understanding of Neural Networks and how they work Ability to use Neural Networks to solve real problems Understand when neural networks may be most

More information

CS 4510/9010 Applied Machine Learning. Neural Nets. Paula Matuszek Fall copyright Paula Matuszek 2016

CS 4510/9010 Applied Machine Learning. Neural Nets. Paula Matuszek Fall copyright Paula Matuszek 2016 CS 4510/9010 Applied Machine Learning 1 Neural Nets Paula Matuszek Fall 2016 Neural Nets, the very short version 2 A neural net consists of layers of nodes, or neurons, each of which has an activation

More information

Dr. Qadri Hamarsheh Supervised Learning in Neural Networks (Part 1) learning algorithm Δwkj wkj Theoretically practically

Dr. Qadri Hamarsheh Supervised Learning in Neural Networks (Part 1) learning algorithm Δwkj wkj Theoretically practically Supervised Learning in Neural Networks (Part 1) A prescribed set of well-defined rules for the solution of a learning problem is called a learning algorithm. Variety of learning algorithms are existing,

More information

Akarsh Pokkunuru EECS Department Contractive Auto-Encoders: Explicit Invariance During Feature Extraction

Akarsh Pokkunuru EECS Department Contractive Auto-Encoders: Explicit Invariance During Feature Extraction Akarsh Pokkunuru EECS Department 03-16-2017 Contractive Auto-Encoders: Explicit Invariance During Feature Extraction 1 AGENDA Introduction to Auto-encoders Types of Auto-encoders Analysis of different

More information

GLOBULAR CLUSTERS CLASSIFICATION WITH GPU-BASED DATA MINING METHODS

GLOBULAR CLUSTERS CLASSIFICATION WITH GPU-BASED DATA MINING METHODS GLOBULAR CLUSTERS CLASSIFICATION WITH GPU-BASED DATA MINING METHODS S. Cavuoti (1), M. Garofalo (2), M. Brescia (3), M. Paolillo (1), G. Longo (1,4), A. Pescapè (2), G. Ventre (2) and DAME Working Group

More information

6. NEURAL NETWORK BASED PATH PLANNING ALGORITHM 6.1 INTRODUCTION

6. NEURAL NETWORK BASED PATH PLANNING ALGORITHM 6.1 INTRODUCTION 6 NEURAL NETWORK BASED PATH PLANNING ALGORITHM 61 INTRODUCTION In previous chapters path planning algorithms such as trigonometry based path planning algorithm and direction based path planning algorithm

More information

Natural Language Processing CS 6320 Lecture 6 Neural Language Models. Instructor: Sanda Harabagiu

Natural Language Processing CS 6320 Lecture 6 Neural Language Models. Instructor: Sanda Harabagiu Natural Language Processing CS 6320 Lecture 6 Neural Language Models Instructor: Sanda Harabagiu In this lecture We shall cover: Deep Neural Models for Natural Language Processing Introduce Feed Forward

More information

Ensemble methods in machine learning. Example. Neural networks. Neural networks

Ensemble methods in machine learning. Example. Neural networks. Neural networks Ensemble methods in machine learning Bootstrap aggregating (bagging) train an ensemble of models based on randomly resampled versions of the training set, then take a majority vote Example What if you

More information

A Data Classification Algorithm of Internet of Things Based on Neural Network

A Data Classification Algorithm of Internet of Things Based on Neural Network A Data Classification Algorithm of Internet of Things Based on Neural Network https://doi.org/10.3991/ijoe.v13i09.7587 Zhenjun Li Hunan Radio and TV University, Hunan, China 278060389@qq.com Abstract To

More information

Artificial neural networks are the paradigm of connectionist systems (connectionism vs. symbolism)

Artificial neural networks are the paradigm of connectionist systems (connectionism vs. symbolism) Artificial Neural Networks Analogy to biological neural systems, the most robust learning systems we know. Attempt to: Understand natural biological systems through computational modeling. Model intelligent

More information

Network Traffic Measurements and Analysis

Network Traffic Measurements and Analysis DEIB - Politecnico di Milano Fall, 2017 Sources Hastie, Tibshirani, Friedman: The Elements of Statistical Learning James, Witten, Hastie, Tibshirani: An Introduction to Statistical Learning Andrew Ng:

More information

Hash Tables. Hashing Probing Separate Chaining Hash Function

Hash Tables. Hashing Probing Separate Chaining Hash Function Hash Tables Hashing Probing Separate Chaining Hash Function Introduction In Chapter 4 we saw: linear search O( n ) binary search O( log n ) Can we improve the search operation to achieve better than O(

More information

Deep Learning for Visual Computing Prof. Debdoot Sheet Department of Electrical Engineering Indian Institute of Technology, Kharagpur

Deep Learning for Visual Computing Prof. Debdoot Sheet Department of Electrical Engineering Indian Institute of Technology, Kharagpur Deep Learning for Visual Computing Prof. Debdoot Sheet Department of Electrical Engineering Indian Institute of Technology, Kharagpur Lecture - 05 Classification with Perceptron Model So, welcome to today

More information

Tested Paradigm to Include Optimization in Machine Learning Algorithms

Tested Paradigm to Include Optimization in Machine Learning Algorithms Tested Paradigm to Include Optimization in Machine Learning Algorithms Aishwarya Asesh School of Computing Science and Engineering VIT University Vellore, India International Journal of Engineering Research

More information

A *69>H>N6 #DJGC6A DG C<>C::G>C<,8>:C8:H /DA 'D 2:6G, ()-"&"3 -"(' ( +-" " " % '.+ % ' -0(+$,

A *69>H>N6 #DJGC6A DG C<>C::G>C<,8>:C8:H /DA 'D 2:6G, ()-&3 -(' ( +-   % '.+ % ' -0(+$, The structure is a very important aspect in neural network design, it is not only impossible to determine an optimal structure for a given problem, it is even impossible to prove that a given structure

More information

Pattern Recognition. Kjell Elenius. Speech, Music and Hearing KTH. March 29, 2007 Speech recognition

Pattern Recognition. Kjell Elenius. Speech, Music and Hearing KTH. March 29, 2007 Speech recognition Pattern Recognition Kjell Elenius Speech, Music and Hearing KTH March 29, 2007 Speech recognition 2007 1 Ch 4. Pattern Recognition 1(3) Bayes Decision Theory Minimum-Error-Rate Decision Rules Discriminant

More information

Perceptron: This is convolution!

Perceptron: This is convolution! Perceptron: This is convolution! v v v Shared weights v Filter = local perceptron. Also called kernel. By pooling responses at different locations, we gain robustness to the exact spatial location of image

More information

A Systematic Overview of Data Mining Algorithms

A Systematic Overview of Data Mining Algorithms A Systematic Overview of Data Mining Algorithms 1 Data Mining Algorithm A well-defined procedure that takes data as input and produces output as models or patterns well-defined: precisely encoded as a

More information

COMPUTATIONAL INTELLIGENCE

COMPUTATIONAL INTELLIGENCE COMPUTATIONAL INTELLIGENCE Fundamentals Adrian Horzyk Preface Before we can proceed to discuss specific complex methods we have to introduce basic concepts, principles, and models of computational intelligence

More information

CS6220: DATA MINING TECHNIQUES

CS6220: DATA MINING TECHNIQUES CS6220: DATA MINING TECHNIQUES Image Data: Classification via Neural Networks Instructor: Yizhou Sun yzsun@ccs.neu.edu November 19, 2015 Methods to Learn Classification Clustering Frequent Pattern Mining

More information

The exam is closed book, closed notes except your one-page (two-sided) cheat sheet.

The exam is closed book, closed notes except your one-page (two-sided) cheat sheet. CS 189 Spring 2015 Introduction to Machine Learning Final You have 2 hours 50 minutes for the exam. The exam is closed book, closed notes except your one-page (two-sided) cheat sheet. No calculators or

More information

Simulation of Zhang Suen Algorithm using Feed- Forward Neural Networks

Simulation of Zhang Suen Algorithm using Feed- Forward Neural Networks Simulation of Zhang Suen Algorithm using Feed- Forward Neural Networks Ritika Luthra Research Scholar Chandigarh University Gulshan Goyal Associate Professor Chandigarh University ABSTRACT Image Skeletonization

More information

Unsupervised learning in Vision

Unsupervised learning in Vision Chapter 7 Unsupervised learning in Vision The fields of Computer Vision and Machine Learning complement each other in a very natural way: the aim of the former is to extract useful information from visual

More information

Classification Lecture Notes cse352. Neural Networks. Professor Anita Wasilewska

Classification Lecture Notes cse352. Neural Networks. Professor Anita Wasilewska Classification Lecture Notes cse352 Neural Networks Professor Anita Wasilewska Neural Networks Classification Introduction INPUT: classification data, i.e. it contains an classification (class) attribute

More information

Graphical User Interface User Manual

Graphical User Interface User Manual Graphical User Interface User Manual DAME-MAN-NA-0010 Issue: 1.3 Date: September 04, 2013 Author: M. Brescia, S. Cavuoti Doc. : GUI_UserManual_DAME-MAN-NA-0010-Rel1.3 1 DAME we make science discovery happen

More information

Deep Learning. Practical introduction with Keras JORDI TORRES 27/05/2018. Chapter 3 JORDI TORRES

Deep Learning. Practical introduction with Keras JORDI TORRES 27/05/2018. Chapter 3 JORDI TORRES Deep Learning Practical introduction with Keras Chapter 3 27/05/2018 Neuron A neural network is formed by neurons connected to each other; in turn, each connection of one neural network is associated

More information

Multilayer Feed-forward networks

Multilayer Feed-forward networks Multi Feed-forward networks 1. Computational models of McCulloch and Pitts proposed a binary threshold unit as a computational model for artificial neuron. This first type of neuron has been generalized

More information

Neural Networks. Single-layer neural network. CSE 446: Machine Learning Emily Fox University of Washington March 10, /10/2017

Neural Networks. Single-layer neural network. CSE 446: Machine Learning Emily Fox University of Washington March 10, /10/2017 3/0/207 Neural Networks Emily Fox University of Washington March 0, 207 Slides adapted from Ali Farhadi (via Carlos Guestrin and Luke Zettlemoyer) Single-layer neural network 3/0/207 Perceptron as a neural

More information

Machine Learning in Biology

Machine Learning in Biology Università degli studi di Padova Machine Learning in Biology Luca Silvestrin (Dottorando, XXIII ciclo) Supervised learning Contents Class-conditional probability density Linear and quadratic discriminant

More information

Lecture 20: Neural Networks for NLP. Zubin Pahuja

Lecture 20: Neural Networks for NLP. Zubin Pahuja Lecture 20: Neural Networks for NLP Zubin Pahuja zpahuja2@illinois.edu courses.engr.illinois.edu/cs447 CS447: Natural Language Processing 1 Today s Lecture Feed-forward neural networks as classifiers simple

More information

DAMEWARE. Data Mining & Exploration Web Application Resource

DAMEWARE. Data Mining & Exploration Web Application Resource arxiv:1603.00720v2 [astro-ph.im] 16 Mar 2016 DAMEWARE Data Mining & Exploration Web Application Resource Issue: 1.5 Date: March 1, 2016 Authors: M. Brescia, S. Cavuoti, F. Esposito, M. Fiore, M. Garofalo,

More information

EE 589 INTRODUCTION TO ARTIFICIAL NETWORK REPORT OF THE TERM PROJECT REAL TIME ODOR RECOGNATION SYSTEM FATMA ÖZYURT SANCAR

EE 589 INTRODUCTION TO ARTIFICIAL NETWORK REPORT OF THE TERM PROJECT REAL TIME ODOR RECOGNATION SYSTEM FATMA ÖZYURT SANCAR EE 589 INTRODUCTION TO ARTIFICIAL NETWORK REPORT OF THE TERM PROJECT REAL TIME ODOR RECOGNATION SYSTEM FATMA ÖZYURT SANCAR 1.Introductıon. 2.Multi Layer Perception.. 3.Fuzzy C-Means Clustering.. 4.Real

More information

Deep Learning With Noise

Deep Learning With Noise Deep Learning With Noise Yixin Luo Computer Science Department Carnegie Mellon University yixinluo@cs.cmu.edu Fan Yang Department of Mathematical Sciences Carnegie Mellon University fanyang1@andrew.cmu.edu

More information

CPSC 340: Machine Learning and Data Mining. Deep Learning Fall 2018

CPSC 340: Machine Learning and Data Mining. Deep Learning Fall 2018 CPSC 340: Machine Learning and Data Mining Deep Learning Fall 2018 Last Time: Multi-Dimensional Scaling Multi-dimensional scaling (MDS): Non-parametric visualization: directly optimize the z i locations.

More information

Radial Basis Function Networks: Algorithms

Radial Basis Function Networks: Algorithms Radial Basis Function Networks: Algorithms Neural Computation : Lecture 14 John A. Bullinaria, 2015 1. The RBF Mapping 2. The RBF Network Architecture 3. Computational Power of RBF Networks 4. Training

More information

Random Search Report An objective look at random search performance for 4 problem sets

Random Search Report An objective look at random search performance for 4 problem sets Random Search Report An objective look at random search performance for 4 problem sets Dudon Wai Georgia Institute of Technology CS 7641: Machine Learning Atlanta, GA dwai3@gatech.edu Abstract: This report

More information

5 Machine Learning Abstractions and Numerical Optimization

5 Machine Learning Abstractions and Numerical Optimization Machine Learning Abstractions and Numerical Optimization 25 5 Machine Learning Abstractions and Numerical Optimization ML ABSTRACTIONS [some meta comments on machine learning] [When you write a large computer

More information

This leads to our algorithm which is outlined in Section III, along with a tabular summary of it's performance on several benchmarks. The last section

This leads to our algorithm which is outlined in Section III, along with a tabular summary of it's performance on several benchmarks. The last section An Algorithm for Incremental Construction of Feedforward Networks of Threshold Units with Real Valued Inputs Dhananjay S. Phatak Electrical Engineering Department State University of New York, Binghamton,

More information

Index. Umberto Michelucci 2018 U. Michelucci, Applied Deep Learning,

Index. Umberto Michelucci 2018 U. Michelucci, Applied Deep Learning, A Acquisition function, 298, 301 Adam optimizer, 175 178 Anaconda navigator conda command, 3 Create button, 5 download and install, 1 installing packages, 8 Jupyter Notebook, 11 13 left navigation pane,

More information

An Algorithm For Training Multilayer Perceptron (MLP) For Image Reconstruction Using Neural Network Without Overfitting.

An Algorithm For Training Multilayer Perceptron (MLP) For Image Reconstruction Using Neural Network Without Overfitting. An Algorithm For Training Multilayer Perceptron (MLP) For Image Reconstruction Using Neural Network Without Overfitting. Mohammad Mahmudul Alam Mia, Shovasis Kumar Biswas, Monalisa Chowdhury Urmi, Abubakar

More information

A neural network that classifies glass either as window or non-window depending on the glass chemistry.

A neural network that classifies glass either as window or non-window depending on the glass chemistry. A neural network that classifies glass either as window or non-window depending on the glass chemistry. Djaber Maouche Department of Electrical Electronic Engineering Cukurova University Adana, Turkey

More information

PARALLEL TRAINING OF NEURAL NETWORKS FOR SPEECH RECOGNITION

PARALLEL TRAINING OF NEURAL NETWORKS FOR SPEECH RECOGNITION PARALLEL TRAINING OF NEURAL NETWORKS FOR SPEECH RECOGNITION Stanislav Kontár Speech@FIT, Dept. of Computer Graphics and Multimedia, FIT, BUT, Brno, Czech Republic E-mail: xkonta00@stud.fit.vutbr.cz In

More information

A NEW EFFICIENT VARIABLE LEARNING RATE FOR PERRY S SPECTRAL CONJUGATE GRADIENT TRAINING METHOD

A NEW EFFICIENT VARIABLE LEARNING RATE FOR PERRY S SPECTRAL CONJUGATE GRADIENT TRAINING METHOD 1 st International Conference From Scientific Computing to Computational Engineering 1 st IC SCCE Athens, 8 10 September, 2004 c IC SCCE A NEW EFFICIENT VARIABLE LEARNING RATE FOR PERRY S SPECTRAL CONJUGATE

More information

Combine the PA Algorithm with a Proximal Classifier

Combine the PA Algorithm with a Proximal Classifier Combine the Passive and Aggressive Algorithm with a Proximal Classifier Yuh-Jye Lee Joint work with Y.-C. Tseng Dept. of Computer Science & Information Engineering TaiwanTech. Dept. of Statistics@NCKU

More information

(1) Department of Physics University Federico II, Via Cinthia 24, I Napoli, Italy (2) INAF Astronomical Observatory of Capodimonte, Via

(1) Department of Physics University Federico II, Via Cinthia 24, I Napoli, Italy (2) INAF Astronomical Observatory of Capodimonte, Via (1) Department of Physics University Federico II, Via Cinthia 24, I-80126 Napoli, Italy (2) INAF Astronomical Observatory of Capodimonte, Via Moiariello 16, I-80131 Napoli, Italy To measure the distance

More information

Efficient Iterative Semi-supervised Classification on Manifold

Efficient Iterative Semi-supervised Classification on Manifold . Efficient Iterative Semi-supervised Classification on Manifold... M. Farajtabar, H. R. Rabiee, A. Shaban, A. Soltani-Farani Sharif University of Technology, Tehran, Iran. Presented by Pooria Joulani

More information

Review on Methods of Selecting Number of Hidden Nodes in Artificial Neural Network

Review on Methods of Selecting Number of Hidden Nodes in Artificial Neural Network Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 11, November 2014,

More information

Artificial Neural Networks MLP, RBF & GMDH

Artificial Neural Networks MLP, RBF & GMDH Artificial Neural Networks MLP, RBF & GMDH Jan Drchal drchajan@fel.cvut.cz Computational Intelligence Group Department of Computer Science and Engineering Faculty of Electrical Engineering Czech Technical

More information

Logistic Regression. Abstract

Logistic Regression. Abstract Logistic Regression Tsung-Yi Lin, Chen-Yu Lee Department of Electrical and Computer Engineering University of California, San Diego {tsl008, chl60}@ucsd.edu January 4, 013 Abstract Logistic regression

More information

COMBINING NEURAL NETWORKS FOR SKIN DETECTION

COMBINING NEURAL NETWORKS FOR SKIN DETECTION COMBINING NEURAL NETWORKS FOR SKIN DETECTION Chelsia Amy Doukim 1, Jamal Ahmad Dargham 1, Ali Chekima 1 and Sigeru Omatu 2 1 School of Engineering and Information Technology, Universiti Malaysia Sabah,

More information

Module 1 Lecture Notes 2. Optimization Problem and Model Formulation

Module 1 Lecture Notes 2. Optimization Problem and Model Formulation Optimization Methods: Introduction and Basic concepts 1 Module 1 Lecture Notes 2 Optimization Problem and Model Formulation Introduction In the previous lecture we studied the evolution of optimization

More information

M. Sc. (Artificial Intelligence and Machine Learning)

M. Sc. (Artificial Intelligence and Machine Learning) Course Name: Advanced Python Course Code: MSCAI 122 This course will introduce students to advanced python implementations and the latest Machine Learning and Deep learning libraries, Scikit-Learn and

More information

Rapid growth of massive datasets

Rapid growth of massive datasets Overview Rapid growth of massive datasets E.g., Online activity, Science, Sensor networks Data Distributed Clusters are Pervasive Data Distributed Computing Mature Methods for Common Problems e.g., classification,

More information

A Neural Network Model Of Insurance Customer Ratings

A Neural Network Model Of Insurance Customer Ratings A Neural Network Model Of Insurance Customer Ratings Jan Jantzen 1 Abstract Given a set of data on customers the engineering problem in this study is to model the data and classify customers

More information

CHAPTER VI BACK PROPAGATION ALGORITHM

CHAPTER VI BACK PROPAGATION ALGORITHM 6.1 Introduction CHAPTER VI BACK PROPAGATION ALGORITHM In the previous chapter, we analysed that multiple layer perceptrons are effectively applied to handle tricky problems if trained with a vastly accepted

More information

Traffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers

Traffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers Traffic Signs Recognition using HP and HOG Descriptors Combined to MLP and SVM Classifiers A. Salhi, B. Minaoui, M. Fakir, H. Chakib, H. Grimech Faculty of science and Technology Sultan Moulay Slimane

More information

Clustering algorithms and autoencoders for anomaly detection

Clustering algorithms and autoencoders for anomaly detection Clustering algorithms and autoencoders for anomaly detection Alessia Saggio Lunch Seminars and Journal Clubs Université catholique de Louvain, Belgium 3rd March 2017 a Outline Introduction Clustering algorithms

More information

A Brief Look at Optimization

A Brief Look at Optimization A Brief Look at Optimization CSC 412/2506 Tutorial David Madras January 18, 2018 Slides adapted from last year s version Overview Introduction Classes of optimization problems Linear programming Steepest

More information

2. Neural network basics

2. Neural network basics 2. Neural network basics Next commonalities among different neural networks are discussed in order to get started and show which structural parts or concepts appear in almost all networks. It is presented

More information

Neural Networks Laboratory EE 329 A

Neural Networks Laboratory EE 329 A Neural Networks Laboratory EE 329 A Introduction: Artificial Neural Networks (ANN) are widely used to approximate complex systems that are difficult to model using conventional modeling techniques such

More information

Hashing. Hashing Procedures

Hashing. Hashing Procedures Hashing Hashing Procedures Let us denote the set of all possible key values (i.e., the universe of keys) used in a dictionary application by U. Suppose an application requires a dictionary in which elements

More information

Unsupervised Image Segmentation with Neural Networks

Unsupervised Image Segmentation with Neural Networks Unsupervised Image Segmentation with Neural Networks J. Meuleman and C. van Kaam Wageningen Agricultural University, Department of Agricultural, Environmental and Systems Technology, Bomenweg 4, 6703 HD

More information

More on Learning. Neural Nets Support Vectors Machines Unsupervised Learning (Clustering) K-Means Expectation-Maximization

More on Learning. Neural Nets Support Vectors Machines Unsupervised Learning (Clustering) K-Means Expectation-Maximization More on Learning Neural Nets Support Vectors Machines Unsupervised Learning (Clustering) K-Means Expectation-Maximization Neural Net Learning Motivated by studies of the brain. A network of artificial

More information

What is machine learning?

What is machine learning? Machine learning, pattern recognition and statistical data modelling Lecture 12. The last lecture Coryn Bailer-Jones 1 What is machine learning? Data description and interpretation finding simpler relationship

More information