Input and Structure Selection for k-NN Approximator

Antti Sorjamaa, Nima Reyhani and Amaury Lendasse
Neural Network Research Centre, Helsinki University of Technology, P.O. Box 5400, Espoo, Finland

Abstract. This paper presents k-NN as an approximator for time series prediction problems. The main advantage of this approximator is its simplicity. Despite the simplicity, k-NN can be used to perform input selection for nonlinear models and it also provides accurate approximations. Three model structure selection methods are presented: Leave-one-out, Bootstrap and Bootstrap 632. We will show that both Bootstraps provide a good estimate of the number of neighbors k, where Leave-one-out fails.  Results of the methods are presented with the Electric load from Poland data set.

Keywords: k-NN, Time Series Prediction, Bootstrap, Leave-one-out and Model Structure Selection.

1 Introduction

In any function approximation, system identification, classification or prediction task, one usually wants to find the best possible model and the best possible parameters in order to obtain good performance. The selected model must generalize well enough while still preserving accuracy and reliability, without unnecessary complexity, which increases the computational load and thus the calculation time. Optimal parameters must be determined for every model in order to rank the models according to their performances. Furthermore, in order to select the best model or model class, one also needs to determine the best set of inputs to use in the determination of the best parameters for each structure. When too many inputs are selected, it is possible to get even worse results than with a smaller input set containing more accurate and valid information. Vice versa, with only a few inputs the accuracy of the model may not be sufficient, and the results are poor and unreliable.

The problems mentioned above occur simultaneously, so it is difficult to find the right combination of correct attributes, and it can be a very tedious and time-consuming procedure to go through every possible structure to select the best one. In this paper we focus on finding the optimal structure for the k-Nearest Neighbors (k-NN) approximator. At the same time, we consider the problem of selecting the most necessary and optimal inputs for the k-NN as well as selecting the structure of the approximator. The k-NN method is presented in Section 2; Section 3 describes the selection of the inputs with an exhaustive search and Section 4 the selection of the structure using Leave-one-out (LOO) and the Bootstraps. In Section 5 we show some experimental results with an electric load time series, and in Section 6 we draw conclusions from the results.

J. Cabestany, A. Prieto and F. Sandoval (Eds.): IWANN 2005, LNCS 3512, pp. 985-992. Springer-Verlag, Berlin Heidelberg (2005).

2 k-Nearest Neighbors

The k-Nearest Neighbors approximation method is a very simple but powerful method. It has been used in many different applications, particularly in classification tasks [1]. The key idea behind the k-NN is that similar input data vectors have similar output values. One has to look for a certain number of nearest neighbors, according to the Euclidean distance [1], and their corresponding output values to get the output approximation. The estimate of the output is calculated as the average of the outputs of the neighbors in the neighborhood. If the pairs (x_i, y_i) represent the data, with x_i an n-dimensional input vector and y_i a scalar output value, the k-NN approximation is

$$\hat{y}_i = \frac{1}{k} \sum_{j=1}^{k} y_{P(j)}, \qquad (1)$$

where ŷ_i represents the output estimate, P(j) is the index of the j-th nearest neighbor of the input x_i and k is the number of neighbors that are used. We use the same neighborhood size for every data point, so we use a global k, which must be determined.

3 Input Selection

In order to select the best set of inputs, all 2^n - 1 possible input sets (n being the maximum number of inputs) are built and evaluated. For each input set, the globally optimal number of neighbors is determined and the generalization error estimate (defined in Section 4) of the set is calculated as the mean of the errors over all data points. In this way it is possible to compare all the different input sets and to take the best one for the final k-NN approximation. Adding one input doubles the required calculation time, so we have to make a compromise between the maximum number of inputs to use and the calculation time available. This kind of exhaustive search for the best inputs is usually not preferred because of its huge computational load; with k-NN, however, the computations can be performed in a reasonable time thanks to the simplicity of the k-NN.
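For concreteness, here is a minimal sketch of the approximator in Eq. (1) in Python with NumPy. The function name `knn_approximate` and the array layout are illustrative choices, not code from the paper.

```python
import numpy as np

def knn_approximate(X_train, y_train, x, k):
    """Estimate the output for the input vector x as the average of the
    outputs of its k nearest neighbors, cf. Eq. (1)."""
    # Euclidean distances from x to every stored training input
    dists = np.linalg.norm(X_train - x, axis=1)
    # indices P(1), ..., P(k) of the k nearest neighbors
    nearest = np.argsort(dists)[:k]
    # average of the neighbors' outputs
    return y_train[nearest].mean()
```

Since "training" amounts to storing the data, evaluating a candidate k for a data point costs a single distance sort, which is what makes the exhaustive procedures of Sections 3 and 4 affordable.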

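The exhaustive search of Section 3 can then be sketched as follows, assuming NumPy arrays and a generic `estimate_gen_error(X, y, k)` stand-in for whichever generalization error estimator of Section 4 is used; all names are hypothetical.

```python
from itertools import combinations

def select_inputs(X, y, k_max, estimate_gen_error):
    """Score every non-empty subset of the n candidate inputs together
    with every neighborhood size k, and return the best combination."""
    n = X.shape[1]
    best_err, best_subset, best_k = float("inf"), None, None
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            for k in range(1, k_max + 1):
                err = estimate_gen_error(X[:, subset], y, k)
                if err < best_err:
                    best_err, best_subset, best_k = err, subset, k
    return best_subset, best_k, best_err
```

The loop structure makes the 2^n - 1 growth explicit: each additional candidate input doubles the number of subsets to score.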
4 Model Structure Selection

We consider the problem of determining the model which approximates an unknown function g(.) as accurately as possible. This approximation is chosen among a set of several possible models, denoted here by

$$h_q(\mathbf{x}, \theta(q)), \qquad (2)$$

where q represents the q-th model in the set, θ(q) are the parameters of the q-th model and x is an n-dimensional input vector. The parameters that define a set of possible models are called hyper-parameters; they are not estimated by the learning algorithm but by some external procedure [4]. In a typical learning procedure the parameters θ(q) are optimized to minimize the approximation error on the learning set; the structure is determined by minimizing the generalization error, defined as

$$E_{\mathrm{gen}}(q) = \lim_{M \to \infty} \frac{1}{M} \sum_{i=1}^{M} \bigl(h_q(\mathbf{x}_i, \theta(q)) - y_i\bigr)^2, \qquad (3)$$

where the x_i are n-dimensional input vectors to the model and the y_i the corresponding scalar expected outputs. According to definition (3), the generalization error is the mean square error of the model computed on an infinite-sized test set. Such a set is not available in practice, so we must approximate the generalization error. The best model structure q is the structure that minimizes this approximation of the generalization error. In our case, selecting the model structure consists in selecting the number of neighbors to use in the k-NN approximation. We have used two different methods to select the globally optimal number of neighbors, Leave-one-out (LOO) and Bootstrap 632. Leave-one-out is a common method used for many statistical evaluation purposes, and we want to show that Bootstrap 632 is better than LOO in this case, even though [3] claims that Bootstrap 632 does not work for selecting the number of neighbors.

4.1 Leave-One-Out

Leave-one-out [3] is a special case of the k-fold cross-validation resampling method. In k-fold cross-validation the training data is divided into k approximately equal-sized sets; the model is then trained using all but one of the sets, and the left-out set is used for validation. The generalization error estimate of k-fold cross-validation is the mean of the k validation results. The LOO procedure is the same as k-fold cross-validation with k equal to the size of the training set. So, for each neighborhood size, the LOO procedure calculates the generalization error estimate by removing one data point at a time from the training set, building the model with the rest of the training data and computing the validation error on the point that was taken out. This is done for every data point in the training set, and the estimate of the generalization error is the mean of all N validation errors,

$$\hat{E}_{\mathrm{gen}}^{\mathrm{LOO}}(q) = \frac{1}{N} \sum_{i=1}^{N} \bigl(h_q(\mathbf{x}_i, \theta_i^*(q)) - y_i\bigr)^2, \qquad (4)$$

where x_i is the i-th input vector from the training set, y_i is the corresponding output, N is the size of the training set and θ_i^*(q) denotes the model parameters obtained without using (x_i, y_i) in the training.
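A sketch of the LOO estimate (4) for the k-NN approximator: since a k-NN model has no fitted parameters beyond the stored data, leaving a point out simply means excluding it from the neighbor search. Names are illustrative.

```python
import numpy as np

def loo_error(X, y, k):
    """Leave-one-out estimate (4) of the generalization error for one k."""
    n = len(X)
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                       # leave (x_i, y_i) out
        dists = np.linalg.norm(X[keep] - X[i], axis=1)
        nearest = np.argsort(dists)[:k]
        errors[i] = (y[keep][nearest].mean() - y[i]) ** 2
    return errors.mean()

# the global k is the one minimizing the LOO estimate:
# k_best = min(range(1, k_max + 1), key=lambda k: loo_error(X, y, k))
```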

Because we want to use the globally optimal size of the neighborhood, we have to calculate the LOO error for each data point and for each size of the neighborhood. After that, we can take the mean over all data points to find out which number of neighbors is globally optimal, and we select the number of neighbors that gives the smallest generalization error estimate.

4.2 Bootstrap and Bootstrap 632

The Bootstrap [4] is a resampling technique developed to estimate statistical parameters (like the mean of a population, its variance, etc.). In the case of model structure selection, the parameter to be estimated is the generalization error. When using the bootstrap, the generalization error is not estimated directly. Rather, the bootstrap estimates the difference between the generalization error and the training error, or apparent error according to Efron [2]. This difference is called the optimism. The estimate of the generalization error is then the sum of the training error and the estimated optimism. The training error is computed using all the available data in the training set,

$$E(\theta^*(q), I) = \frac{1}{N} \sum_{i=1}^{N} \bigl(h_q(\mathbf{x}_i^I, \theta^*(q)) - y_i^I\bigr)^2, \qquad (5)$$

where h_q is the q-th model, I denotes the training set, θ*(q) are the model parameters after learning, x_i^I is the i-th input vector from the training set, y_i^I is the corresponding output and N is the number of elements in the training set.

The optimism is estimated using a resampling technique based on drawing with replacement from the training set. Each bootstrap set is as large as the training set, with its elements drawn randomly from the training set. Each model is trained using a bootstrap set, and the optimism is calculated as the difference between the learning error (6) and the validation error (7). The learning error is calculated on the bootstrap set with the model trained on that same bootstrap set,

$$E(\theta_j^*(q), A_j) = \frac{1}{N} \sum_{i=1}^{N} \bigl(h_q(\mathbf{x}_i^{A_j}, \theta_j^*(q)) - y_i^{A_j}\bigr)^2, \qquad (6)$$

where A_j is the j-th bootstrap set, θ_j^*(q) are the model parameters after learning on A_j, x_i^{A_j} is the i-th input vector from the bootstrap set and y_i^{A_j} is the corresponding output. The validation error is calculated on the initial training set with the model trained on the same bootstrap set as for the learning error,

$$E(\theta_j^*(q), I) = \frac{1}{N} \sum_{i=1}^{N} \bigl(h_q(\mathbf{x}_i^{I}, \theta_j^*(q)) - y_i^I\bigr)^2. \qquad (7)$$

The optimism calculation described above is repeated for as many rounds as possible, considering the linearly increasing computation time.
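A hedged sketch of this bootstrap procedure for the k-NN case, with the optimism and final estimate of Eqs. (8) and (9) just below marked in the comments; the helper `knn_error`, which fits on one set and returns the mean squared error on another, is an illustrative construction, not code from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def knn_error(X_fit, y_fit, X_eval, y_eval, k):
    """Mean squared error on (X_eval, y_eval) of the k-NN approximator
    that uses (X_fit, y_fit) as its stored training data."""
    preds = np.array([y_fit[np.argsort(np.linalg.norm(X_fit - x, axis=1))[:k]].mean()
                      for x in X_eval])
    return float(np.mean((preds - y_eval) ** 2))

def bootstrap_gen_error(X, y, k, J=100):
    """Bootstrap estimate of the generalization error, Eqs. (5), (8), (9)."""
    n = len(X)
    training_error = knn_error(X, y, X, y, k)        # training error, Eq. (5)
    optimism = 0.0
    for _ in range(J):
        idx = rng.integers(0, n, size=n)             # draw with replacement
        Xb, yb = X[idx], y[idx]                      # bootstrap set A_j
        learning = knn_error(Xb, yb, Xb, yb, k)      # learning error, Eq. (6)
        validation = knn_error(Xb, yb, X, y, k)      # validation error, Eq. (7)
        optimism += validation - learning            # summand of Eq. (8)
    return training_error + optimism / J             # Eqs. (8) and (9)
```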

The optimism of a model is then calculated as the mean of the differences between the two error functions described above,

$$\widehat{\mathrm{optimism}}(q) = \frac{1}{J} \sum_{j=1}^{J} \Bigl( E(\theta_j^*(q), I) - E(\theta_j^*(q), A_j) \Bigr), \qquad (8)$$

where J is the number of bootstrap rounds done. The final generalization error estimate is the sum of the training error and the optimism,

$$\hat{E}_{\mathrm{gen}}(q) = E(\theta^*(q), I) + \widehat{\mathrm{optimism}}(q). \qquad (9)$$

Bootstrap 632 [5] is a modified version of the original Bootstrap. Where the original Bootstrap gives a biased estimate of the generalization error (3), Bootstrap 632 is not biased [4] and is thus more comparable with other methods for estimating the generalization error. Bootstrap 632 converges towards the correct generalization error in a reasonable amount of calculation time. The main difference between the standard Bootstrap and Bootstrap 632 is the estimation of the optimism. In the original Bootstrap the optimism is calculated as the difference between the errors on two sets, the initial training set and the randomly drawn bootstrap set (8). Bootstrap 632 estimates the optimism using the data points not drawn into the bootstrap set (10): the model is trained using this set of unselected data points and its error is evaluated on the bootstrap set,

$$\widehat{\mathrm{optimism}}_{632}(q) = \frac{1}{J} \sum_{j=1}^{J} E(\theta_{\bar{A}_j}^*(q), A_j), \qquad (10)$$

where the error function is the same as the learning error (6), except that the model is trained using Ā_j, the complement of the bootstrap set A_j. The generalization error estimate of Bootstrap 632 is calculated as a weighted sum of the training error and the new optimism (10),

$$\hat{E}_{\mathrm{gen}}^{632}(q) = 0.368\, E(\theta^*(q), I) + 0.632\, \widehat{\mathrm{optimism}}_{632}(q). \qquad (11)$$

From equation (11) it is quite clear where the name of Bootstrap 632 comes from: the weighting coefficient 0.632 of the optimism term is the probability of a given sample being drawn from the training set into a bootstrap set [2, 5], and the training error term is weighted by the complementary 0.368 = 1 - 0.632.
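Bootstrap 632 then differs only in how the optimism is formed. A sketch under the same assumptions, reusing the hypothetical `knn_error` helper from the previous fragment:

```python
import numpy as np

rng = np.random.default_rng(1)

def bootstrap632_gen_error(X, y, k, J=100):
    """Bootstrap 632 estimate of the generalization error, Eqs. (10)-(11)."""
    n = len(X)
    training_error = knn_error(X, y, X, y, k)         # training error, Eq. (5)
    optimism = 0.0
    for _ in range(J):
        idx = rng.integers(0, n, size=n)              # bootstrap set A_j
        out = np.setdiff1d(np.arange(n), idx)         # complement of A_j
        # fit on the points left out of A_j, evaluate on A_j
        optimism += knn_error(X[out], y[out], X[idx], y[idx], k)
    optimism /= J                                     # Eq. (10)
    return 0.368 * training_error + 0.632 * optimism  # Eq. (11)
```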

5 Experimental Results

5.1 Time Series Prediction

Time series forecasting [6] is a challenge in many fields. In finance, one forecasts stock exchange rates or stock market indices; data processing specialists forecast the flow of information on their networks; producers of electricity forecast the load of the following day. The common point of these problems is the following: how can one analyse and use the past to predict the future? The next section demonstrates how k-NN can be used to predict future values of a time series.

5.2 Results on the Electric Load Data Set

The data set used in the experiments is a benchmark in the field of time series prediction: the Poland Electricity Dataset. It represents the electric load of Poland during 1500 days in the 1990s (Fig. 1). In our experiments we used the first half of the data set as a training set and the other half as a test set, in order to compare the performance of the models selected by the different methods.

Fig. 1. Electric load time series from Poland

Figures 2 and 3 show the generalization error estimates of the different methods according to the number of neighbors, when using the best selected input set.

Fig. 2. The generalization error estimate using Leave-one-out according to the number of neighbors

Table 1 shows the results of the experiments with the three methods for selecting the inputs and the number of neighbors. We have used n = 8 as the maximum number of inputs and J = 100 bootstrap rounds for both Bootstrap and Bootstrap 632.
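A sketch of how such an experiment can be set up, assuming the load series is available as a one-dimensional NumPy array; the file name and the helper below are hypothetical. The n = 8 candidate inputs of each target value are the 8 values of the series preceding it.

```python
import numpy as np

def build_regressors(series, n_lags=8):
    """Turn a scalar series into input rows of the n_lags previous values
    and the corresponding next value as output."""
    X = np.column_stack([series[i:len(series) - n_lags + i]
                         for i in range(n_lags)])
    y = series[n_lags:]
    return X, y

# load = np.loadtxt("poland_electricity.txt")   # hypothetical file name
# X, y = build_regressors(load)
# half = len(X) // 2                            # first half for training,
# X_train, y_train = X[:half], y[:half]         # second half for testing
# X_test, y_test = X[half:], y[half:]
```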

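The error-versus-k curves of Figures 2 and 3 can then be traced by evaluating each estimator over a range of neighborhood sizes. A usage sketch, assuming the `loo_error`, `bootstrap_gen_error` and `bootstrap632_gen_error` fragments above and the training half (X_train, y_train) from the data-preparation sketch:

```python
import numpy as np

# evaluate each generalization error estimator for a range of k
k_values = np.arange(1, 31)                 # illustrative range of k
loo_curve = [loo_error(X_train, y_train, k) for k in k_values]
boot_curve = [bootstrap_gen_error(X_train, y_train, k) for k in k_values]
b632_curve = [bootstrap632_gen_error(X_train, y_train, k) for k in k_values]

# each method selects the k that minimizes its own estimate
k_loo = int(k_values[np.argmin(loo_curve)])
k_boot = int(k_values[np.argmin(boot_curve)])
k_b632 = int(k_values[np.argmin(b632_curve)])
```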
Fig. 3. The generalization error estimate using Bootstrap (solid line) and Bootstrap 632 (dashed line) according to the number of neighbors

Table 1. The results using the Electric load data set and the methods described in this paper

Method          Selected inputs       k     Ê_gen    Test error
LOO             t - {…, 5, 7, 8}      …     …        …
Bootstrap       t - {…, 5, 7, 8}      …     …        …
Bootstrap 632   t - {…, 5, 7, 8}      …     …        …

All three methods select the same inputs but a different number of neighbors. According to the test error, both Bootstrap and Bootstrap 632 select the best k. On the other hand, Bootstrap and Bootstrap 632 use J times more computation time than LOO. In Fig. 4 we have used the model selected by the bootstrap to predict the first 300 values of the test set.

Fig. 4. Test set from the Electric load data, with the first 300 values predicted and plotted. The solid line represents the real values and the dashed line the prediction

6 Conclusion

We have shown that all the methods, Leave-one-out and the Bootstraps, select the same inputs, but that the number of neighbors is selected more efficiently by the Bootstraps according to the test error, even though some studies have claimed otherwise [3].

It has also been shown that k-NN is a good approximator for time series. We have tested k-NN on a couple of other time series as well and obtained the same results. In conclusion, we suggest using Leave-one-out for input selection and Bootstrap or Bootstrap 632 for the selection of k.

Acknowledgements

Part of the work of A. Sorjamaa, N. Reyhani and A. Lendasse is supported by the project New Information Processing Principles of the Academy of Finland.

References

1. Bishop, C.M.: Neural Networks for Pattern Recognition. Oxford University Press (1995).
2. Efron, B., Tibshirani, R.J.: An Introduction to the Bootstrap. Chapman & Hall (1993).
3. Kohavi, R.: A Study of Cross-Validation and Bootstrap for Accuracy Estimation and Model Selection. In: Proc. of the 14th Int. Joint Conf. on Artificial Intelligence, Montréal (1995).
4. Lendasse, A., Wertz, V., Verleysen, M.: Model Selection with Cross-Validations and Bootstraps - Application to Time Series Prediction with RBF Models. In: Kaynak, O., Alpaydin, E., Oja, E., Xu, L. (eds.): Artificial Neural Networks and Neural Information Processing - ICANN/ICONIP 2003. Lecture Notes in Computer Science 2714, Springer-Verlag, Berlin (2003).
5. Efron, B., Tibshirani, R.J.: Improvements on Cross-Validation: The .632+ Bootstrap Method. J. Amer. Statist. Assoc. 92 (1997).
6. Weigend, A.S., Gershenfeld, N.A.: Time Series Prediction: Forecasting the Future and Understanding the Past. Addison-Wesley, Reading, MA (1994).
