COMBINED METHOD TO VISUALISE AND REDUCE DIMENSIONALITY OF THE FINANCIAL DATA SETS
Toomas Kirt
Supervisor: Leo Võhandu
Tallinn Technical University

Abstract: For the visualisation of multidimensional financial data sets we have used the Self-Organising Map (SOM) of T. Kohonen [6,8]. The SOM is a very useful neural computing method for analysing and visualising multidimensional data. To achieve better computational speed, the dimensionality of the original data has to be reduced during the data pre-processing stage. One of the best-known methods for this is Principal Component Analysis (PCA). In our research project we have used an alternative, effective method for dimensionality reduction, called the Peeling method (Võhandu, Krusberg, 1977). The main difference between the two methods is that PCA finds principal components that are optimised linear combinations of all original variables (so no original variable is actually dropped), whereas the Peeling method first finds the most important variables, those that describe the correlations of the system in the best possible way using only some (or most) of the variables. To compare the results of the two dimensionality reduction methods, we have used financial data of Estonian banks. Our reduction method makes it possible to remove more than a third of the original variables (six of sixteen in our case study) and afterwards to get practically the same SOM results as with all the data.

Key words: Self-Organising Maps, Neural Networks, Dimensionality Reduction, Data Mining, Peeling Method
1. INTRODUCTION

The Self-Organising Map (SOM) is a very useful neural computing method for analysing and visualising multidimensional data. The SOM can translate multidimensional financial data into simple two-dimensional maps: input data vectors that are near each other in the input space are mapped to nearby map units. The SOM can thus be used as a clustering tool.

To achieve better computational speed, the dimensionality of the original data can be reduced during data pre-processing. One of the well-known methods for dimensionality reduction is Principal Component Analysis (PCA). In this paper we also use another dimensionality reduction technique, the Peeling method. Our goal is to compare these two methods by comparing the maps that the SOM produces from the original data with the maps it produces from data of reduced dimensionality. To illustrate the two methods we use financial data of Estonian banks.

This paper is divided into four parts. In the first part we give a short overview of the SOM algorithm and its visualisation methods. In the second part we give an overview of the main properties of PCA. In the third part we introduce the Peeling method, and in the fourth part we apply the methods described above to the financial data of Estonian banks and compare the results.

2. SELF-ORGANISING MAPS (SOM)

A self-organising map is a feedforward neural network that uses an unsupervised training algorithm and, through a process called self-organisation, configures the output units into a topological representation of the original data [2,5]. The SOM belongs to a general class of neural network methods, which are non-linear regression techniques that can be trained to learn or find relationships between inputs and outputs, or to organise data so as to disclose previously unknown patterns or structures [1]. The algorithm is based on unsupervised, competitive learning.
The algorithm provides a topology-preserving mapping from a high-dimensional space to map units. The map units, or neurons, usually form a two-dimensional grid, so the mapping goes from a high-dimensional space onto a plane. Topology preservation means that the SOM groups similar input data vectors on neurons: points that are near each other in the input space are mapped to nearby map units in the SOM. The SOM can thus serve as a clustering tool as well as a tool for visualising high-dimensional data.
The process of creating a self-organising map requires two layers of processing units. The first is an input layer containing one processing unit for each element of the input vector; the second is an output layer, a grid of processing units that is fully connected to the units of the input layer. The user can define the size of the output layer, depending on how the map will be used.

The learning process goes as follows. First, the output grid is initialised, for example with random values from the input space. One sample is then taken from the input data and presented to the output grid of the map. All the neurons in the output layer compete with each other to become the winner. The winner is the output node whose weight vector is closest to the sample vector according to the Euclidean distance. The weights of the winning neuron are moved closer to the sample vector, in the direction of the input sample. The weights of the neurons in the neighbourhood of the winning unit are changed as well. During learning, the learning rate becomes smaller and the neighbourhood around the winning neuron shrinks, so the rate of change declines. At the end of the training only the winning unit is adjusted. As a result of the self-organising process, similar input data vectors are mapped to nearby map units in the SOM.

The unified distance matrix (U-matrix) method is usually used to visualise the structure of the input space of a self-organising feature map. The U-matrix method gives an impression of otherwise invisible structures in a multidimensional data space and allows data sets to be classified into groups of similar data points (self-organised classification or clustering). One of the simplest U-matrix methods is to sum up, for each neuron, the distances between its weight vector and the weight vectors of the adjacent neurons on the feature map.
A U-matrix gives a picture of the topology of the unit layer and therefore also of the topology of the input space: altitude in the U-matrix encodes dissimilarity in the input space. Valleys in the U-matrix (i.e. low altitudes) correspond to input vectors that are similar [10]. The clusters in a multidimensional data set can therefore be identified by grouping together all the points that fall into the same valley of the U-matrix. Furthermore, the height of the walls or hills in a U-matrix gives a hint of how much the classes differ from each other. Finally, the properties of self-organising maps ensure that similar groups are situated close to each other in a U-matrix.
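The learning process and the simple U-matrix described in this section can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch, not the software used for the experiments in this paper: the function names `train_som` and `u_matrix`, the Gaussian neighbourhood function, the linear decay of the learning rate and of the neighbourhood width, and all default parameter values are assumptions made for the example.

```python
import numpy as np

def train_som(data, rows=6, cols=6, n_iter=2000, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols SOM; returns weights of shape (rows, cols, dim)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    if sigma0 is None:
        sigma0 = max(rows, cols) / 2.0  # assumed initial neighbourhood width
    # Initialise the output grid with random samples from the input space.
    picks = rng.integers(0, len(data), rows * cols)
    weights = data[picks].reshape(rows, cols, -1).copy()
    # Grid coordinates of every map unit, shape (rows, cols, 2).
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                indexing="ij"), axis=-1)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                 # learning rate decays
        sigma = sigma0 * (1.0 - frac) + 1e-3    # neighbourhood shrinks
        x = data[rng.integers(len(data))]       # present one sample
        # Winner: the unit whose weight vector is closest (Euclidean).
        dist = np.linalg.norm(weights - x, axis=2)
        winner = np.unravel_index(np.argmin(dist), dist.shape)
        # Gaussian neighbourhood around the winner on the map grid.
        grid_d2 = ((grid - np.array(winner)) ** 2).sum(axis=2)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
        # Move the winner and its neighbours towards the sample.
        weights += lr * h[..., None] * (x - weights)
    return weights

def u_matrix(weights):
    """Sum of distances between each unit and its 4 adjacent units."""
    rows, cols, _ = weights.shape
    u = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    u[r, c] += np.linalg.norm(weights[r, c] - weights[rr, cc])
    return u
```

Calling `u_matrix(train_som(data))` on a data matrix with one row per report gives an array that can be drawn as a grey-scale image; high values then mark the walls between clusters and low values mark the valleys. Because the neighbourhood width shrinks almost to zero, only the winning unit is effectively adjusted at the end of training, as described above.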
3. PRINCIPAL COMPONENT ANALYSIS

Principal Component Analysis (PCA) is a technique commonly used for data reduction in statistical pattern recognition and signal processing. It is also known as the Karhunen-Loève transform [3,9]. In PCA each component of the projected vector is a linear combination of the components of the original data item. The projection is formed by multiplying each component by a certain fixed scalar coefficient and adding the results together. Mathematical methods exist for finding the optimal coefficients, such that the variance of the data after the projection is as large as possible and therefore as close as possible to the variance of the original data.

Let $X \in \mathbb{R}^N$ be a random N-dimensional vector representing the environment of interest. Our goal is to generate features that are mutually uncorrelated, that is $E[y(i)y(j)] = 0$ for $i \neq j$. Let $Y = A^T X$. From the definition of the correlation matrix we have

\[ R_Y \equiv E[YY^T] = E[A^T X X^T A] = A^T R_X A. \qquad (3.1) \]

However, $R_X$ is a symmetric matrix, and hence its eigenvectors are mutually orthogonal. Thus, if the matrix $A$ is chosen so that its columns are the orthonormal eigenvectors $a_1, a_2, \ldots, a_N$ of $R_X$, then $R_Y$ is diagonal:

\[ R_Y = A^T R_X A = \Lambda, \qquad (3.2) \]

where $\Lambda$ is the diagonal matrix having as elements on its diagonal the respective eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_N$ of $R_X$. Let the eigenvalues be arranged in decreasing order, $\lambda_1 > \lambda_2 > \ldots > \lambda_j > \ldots > \lambda_N$, so that $\lambda_1 = \lambda_{\max}$.

An important property of PCA is mean square error approximation. If in

\[ \hat{x} = \sum_{i=1}^{m} y(i)\, a_i \qquad (3.3) \]

we choose the eigenvectors corresponding to the $m$, $m \leq N$, largest eigenvalues of the correlation matrix, then the mean square error (MSE) is minimised, being the sum of the $N - m$ smallest eigenvalues.

Another property of PCA is the property of total variance. Let $E[X]$ be zero and let $y$ be the PCA-transformed vector of $X$. The eigenvalues of the input correlation matrix are equal to the variances of the transformed features:

\[ \sigma^2_{y(i)} \equiv E[y^2(i)] = \lambda_i. \qquad (3.4) \]

Thus, selecting those features $y(i) = a_i^T x$ that correspond to the $m$ largest eigenvalues makes their total variance $\sum_{i=1}^{m} \lambda_i$ maximum. These properties allow us to choose $m$ principal components that retain most of the total variance associated with the original random variables [7].
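The two properties above translate directly into a short NumPy sketch. It is an illustration under stated assumptions rather than the implementation used for the experiments: the name `pca_reduce` and the parameter `var_threshold` are hypothetical, and the data are centred first so that the matrix $R_X$ is estimated by the sample covariance matrix (the zero-mean case assumed for (3.4)).

```python
import numpy as np

def pca_reduce(X, var_threshold=0.95):
    """Project X (samples x variables) onto the leading principal components.

    Returns (Y, A, eigvals, m): projected data, the m leading eigenvectors
    as columns of A, all eigenvalues in decreasing order, and m itself.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                  # enforce E[X] = 0
    R = Xc.T @ Xc / (len(X) - 1)             # sample estimate of R_X
    eigvals, eigvecs = np.linalg.eigh(R)     # symmetric eigen-decomposition
    order = np.argsort(eigvals)[::-1]        # decreasing eigenvalues
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Smallest m whose eigenvalues retain the requested share of variance.
    explained = np.cumsum(eigvals) / eigvals.sum()
    m = min(int(np.searchsorted(explained, var_threshold)) + 1, len(eigvals))
    A = eigvecs[:, :m]                       # columns a_1 ... a_m
    Y = Xc @ A                               # y(i) = a_i^T x for every sample
    return Y, A, eigvals, m
```

The variances of the columns of `Y` equal the first `m` eigenvalues, as in (3.4), and the squared error of the reconstruction `Y @ A.T`, summed over variables and averaged over samples with the same `n - 1` divisor, equals the sum of the discarded eigenvalues, as stated for (3.3).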
4. PEELING METHOD

The Peeling method [11] first finds the most important variables, those that describe the correlations of the system in the best possible way using only some (or most) of the variables. The algorithm works as follows:

1. For a correlation matrix

\[ R = \begin{pmatrix} 1 & r_{12} & \cdots & r_{1m} \\ r_{21} & 1 & \cdots & r_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ r_{m1} & r_{m2} & \cdots & 1 \end{pmatrix} \qquad (4.1) \]

we calculate for every column $j$

\[ S_j = \sum_{i=1}^{m} \frac{r_{ij}^2}{r_{jj}}. \qquad (4.2) \]

2. Find

\[ S^{(k)} = \max_j S_j. \qquad (4.3) \]

The maximising index $j$ is the number of the variable that is on average the most important in the system. The superscript shows the number of the iteration ($k = 1, \ldots, r$; $r \leq m$).

3. The correlation coefficients of the maximal variable are divided by the square root of the diagonal element $r_{jj}$ of the matrix $R$. The transformed column vector $b_1$ is the first vector of the new factor matrix $B$.

4. Find the residual matrix

\[ R^{(1)} = R - b_1 b_1'. \qquad (4.4) \]

5. Repeat the process $r$ times, where $r \leq m$ is the rank of $R$.

According to the elimination order we take the first $r$ variables and use them in the subsequent analysis.

5. CASE STUDY: SOM OF ESTONIAN BANKS

To illustrate the two methods we have used financial reports of Estonian banks. We used 92 different reports from the period [...], and each report consists of 16 variables. This is not a large amount of data, but our goal is only to compare the visual results achieved by the different methods. Every node on the map represents one or more reports of the banks. First we created a U-matrix of the original data; the result is given in Figure 1.
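Before turning to the maps, the Peeling steps of Section 4 can be sketched as follows. This is a sketch of the steps as they are written in this paper, not the original implementation of [11]: the function name `peel`, the parameter `n_select` and the tolerance `tol` (used to skip variables whose residual diagonal element has already dropped to zero) are assumptions introduced for the example.

```python
import numpy as np

def peel(R, n_select, tol=1e-10):
    """Select up to n_select variables from the correlation matrix R.

    Returns (selected, B): the indices of the chosen variables in
    elimination order and the factor matrix B with one column per step.
    """
    R = np.array(R, dtype=float)             # work on a copy
    m = R.shape[0]
    selected, columns = [], []
    for _ in range(n_select):
        diag = np.diag(R)
        valid = diag > tol                   # variables not yet exhausted
        if not valid.any():                  # rank of R has been reached
            break
        # (4.2): S_j = sum_i r_ij^2 / r_jj for every remaining column.
        S = np.full(m, -np.inf)
        S[valid] = (R[:, valid] ** 2).sum(axis=0) / diag[valid]
        j = int(np.argmax(S))                # (4.3): most important variable
        b = R[:, j] / np.sqrt(diag[j])       # step 3: new column of B
        selected.append(j)
        columns.append(b)
        R = R - np.outer(b, b)               # (4.4): residual matrix
    B = np.column_stack(columns) if columns else np.empty((m, 0))
    return selected, B
```

For example, `peel(np.corrcoef(X, rowvar=False), 10)` would return the indices of ten variables of a data matrix `X` in the order in which they were peeled off. When the loop runs until the rank of `R` is reached, `B @ B.T` reproduces the original correlation matrix, so the procedure behaves like a pivoted Cholesky factorisation with the pivot chosen by (4.3).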
Figure 1. SOM of Estonian banks, 16 variables

Secondly, we applied PCA to the original data and selected nine linear combinations that describe more than 95% of the variation. After creating a SOM from the reduced data we got the result shown in Figure 2.

Figure 2. SOM of Estonian banks, PCA, 9 variables

In the third attempt we used the Peeling method. We eliminated six variables and used the ten remaining original variables. The result is shown in Figure 3.
Figure 3. SOM of Estonian banks, Peeling method, 10 variables

As we can see, the structures of the three maps are similar. This means that we can get practically the same results without using all of the data. Calculating the SOM took 17, 10 and 11 seconds respectively, so the process was approximately 44% and 38% faster when we used 9 and 10 variables instead of 16.

6. CONCLUSION

We have introduced two possible ways of reducing dimensionality and compared them. As the maps show, there are only small differences between the maps made from data with reduced dimensionality and the map made from the original data. Despite these results we should take into account that calculating a correlation matrix and its eigenvalues is computationally expensive. Therefore, in further research we would like to turn our attention to the random mapping method suggested by Sami Kaski [4].

REFERENCES

[1] Deboeck G, Kohonen T, Visual exploration in finance: with self-organising maps, Springer, Berlin, 1998
[2] Haykin S, Neural Networks, Prentice Hall, New Jersey, 1999
[3] Jobson J D, Applied Multivariate Data Analysis, Volume II, Springer, New York, 1992
[4] Kaski S, Dimensionality Reduction by Random Mapping: Fast Similarity Computation for Clustering, IEEE International Joint Conference on Neural Networks, Anchorage, Alaska, May
[5] Kohonen T, Kaski S, Lagus K, Salojärvi J, Honkela J, Paatero V, Saarela A, Self organisation of a massive document collection, IEEE Transactions on Neural Networks, vol. 11, no. 3, May 2000
[6] Kohonen T, Self-organising maps, Third edition, Springer, Berlin, 2000
[7] Liu C, Wechsler H, Face Recognition Using Shape and Texture, CVPR 99, Fort Collins, Colorado, June
[8] Oja E, Kaski S, Kohonen Maps, Elsevier, Amsterdam, 1999
[9] Theodoridis S, Koutroumbas K, Pattern Recognition, Academic Press, San Diego, 1998
[10] Ultsch A, Unified Matrix (U-matrix) Methods,
[11] Võhandu L, Krusberg H, A Direct Factor Analysis Method, The Proceedings of TTU, 426, 1977, pp. 11-21
More informationBasics of Multivariate Modelling and Data Analysis
Basics of Multivariate Modelling and Data Analysis Kurt-Erik Häggblom 9. Linear regression with latent variables 9.1 Principal component regression (PCR) 9.2 Partial least-squares regression (PLS) [ mostly
More informationCartographic Selection Using Self-Organizing Maps
1 Cartographic Selection Using Self-Organizing Maps Bin Jiang 1 and Lars Harrie 2 1 Division of Geomatics, Institutionen för Teknik University of Gävle, SE-801 76 Gävle, Sweden e-mail: bin.jiang@hig.se
More informationCHAPTER 6 COUNTER PROPAGATION NEURAL NETWORK IN GAIT RECOGNITION
75 CHAPTER 6 COUNTER PROPAGATION NEURAL NETWORK IN GAIT RECOGNITION 6.1 INTRODUCTION Counter propagation network (CPN) was developed by Robert Hecht-Nielsen as a means to combine an unsupervised Kohonen
More informationImproving A Trajectory Index For Topology Conserving Mapping
Proceedings of the 8th WSEAS Int. Conference on Automatic Control, Modeling and Simulation, Prague, Czech Republic, March -4, 006 (pp03-08) Improving A Trajectory Index For Topology Conserving Mapping
More informationMachine Learning : Clustering, Self-Organizing Maps
Machine Learning Clustering, Self-Organizing Maps 12/12/2013 Machine Learning : Clustering, Self-Organizing Maps Clustering The task: partition a set of objects into meaningful subsets (clusters). The
More informationPrincipal Component Image Interpretation A Logical and Statistical Approach
Principal Component Image Interpretation A Logical and Statistical Approach Md Shahid Latif M.Tech Student, Department of Remote Sensing, Birla Institute of Technology, Mesra Ranchi, Jharkhand-835215 Abstract
More informationApplied Neuroscience. Columbia Science Honors Program Fall Machine Learning and Neural Networks
Applied Neuroscience Columbia Science Honors Program Fall 2016 Machine Learning and Neural Networks Machine Learning and Neural Networks Objective: Introduction to Machine Learning Agenda: 1. JavaScript
More informationTime Series Clustering Ensemble Algorithm Based on Locality Preserving Projection
Based on Locality Preserving Projection 2 Information & Technology College, Hebei University of Economics & Business, 05006 Shijiazhuang, China E-mail: 92475577@qq.com Xiaoqing Weng Information & Technology
More informationCSE 255 Lecture 5. Data Mining and Predictive Analytics. Dimensionality Reduction
CSE 255 Lecture 5 Data Mining and Predictive Analytics Dimensionality Reduction Course outline Week 4: I ll cover homework 1, and get started on Recommender Systems Week 5: I ll cover homework 2 (at the
More informationAlternative Statistical Methods for Bone Atlas Modelling
Alternative Statistical Methods for Bone Atlas Modelling Sharmishtaa Seshamani, Gouthami Chintalapani, Russell Taylor Department of Computer Science, Johns Hopkins University, Baltimore, MD Traditional
More informationOn Classification: An Empirical Study of Existing Algorithms Based on Two Kaggle Competitions
On Classification: An Empirical Study of Existing Algorithms Based on Two Kaggle Competitions CAMCOS Report Day December 9th, 2015 San Jose State University Project Theme: Classification The Kaggle Competition
More informationPerformance Degradation Assessment and Fault Diagnosis of Bearing Based on EMD and PCA-SOM
Performance Degradation Assessment and Fault Diagnosis of Bearing Based on EMD and PCA-SOM Lu Chen and Yuan Hang PERFORMANCE DEGRADATION ASSESSMENT AND FAULT DIAGNOSIS OF BEARING BASED ON EMD AND PCA-SOM.
More informationPrincipal Component Analysis (PCA) is a most practicable. statistical technique. Its application plays a major role in many
CHAPTER 3 PRINCIPAL COMPONENT ANALYSIS ON EIGENFACES 2D AND 3D MODEL 3.1 INTRODUCTION Principal Component Analysis (PCA) is a most practicable statistical technique. Its application plays a major role
More informationDI TRANSFORM. The regressive analyses. identify relationships
July 2, 2015 DI TRANSFORM MVstats TM Algorithm Overview Summary The DI Transform Multivariate Statistics (MVstats TM ) package includes five algorithm options that operate on most types of geologic, geophysical,
More informationRecognizing Handwritten Digits Using the LLE Algorithm with Back Propagation
Recognizing Handwritten Digits Using the LLE Algorithm with Back Propagation Lori Cillo, Attebury Honors Program Dr. Rajan Alex, Mentor West Texas A&M University Canyon, Texas 1 ABSTRACT. This work is
More informationInfluence of Neighbor Size for Initial Node Exchange of SOM Learning
FR-E3-3 SCIS&ISIS2006 @ Tokyo, Japan (September 20-24, 2006) Influence of Neighbor Size for Initial Node Exchange of SOM Learning MIYOSHI Tsutomu Department of Information and Knowledge Engineering, Tottori
More informationPERFORMANCE OF THE DISTRIBUTED KLT AND ITS APPROXIMATE IMPLEMENTATION
20th European Signal Processing Conference EUSIPCO 2012) Bucharest, Romania, August 27-31, 2012 PERFORMANCE OF THE DISTRIBUTED KLT AND ITS APPROXIMATE IMPLEMENTATION Mauricio Lara 1 and Bernard Mulgrew
More informationWAVELET USE FOR IMAGE CLASSIFICATION. Andrea Gavlasová, Aleš Procházka, and Martina Mudrová
WAVELET USE FOR IMAGE CLASSIFICATION Andrea Gavlasová, Aleš Procházka, and Martina Mudrová Prague Institute of Chemical Technology Department of Computing and Control Engineering Technická, Prague, Czech
More informationSelf-Organizing Map. presentation by Andreas Töscher. 19. May 2008
19. May 2008 1 Introduction 2 3 4 5 6 (SOM) aka Kohonen Network introduced by Teuvo Kohonen implements a discrete nonlinear mapping unsupervised learning Structure of a SOM Learning Rule Introduction
More informationCHAPTER 3 PRINCIPAL COMPONENT ANALYSIS AND FISHER LINEAR DISCRIMINANT ANALYSIS
38 CHAPTER 3 PRINCIPAL COMPONENT ANALYSIS AND FISHER LINEAR DISCRIMINANT ANALYSIS 3.1 PRINCIPAL COMPONENT ANALYSIS (PCA) 3.1.1 Introduction In the previous chapter, a brief literature review on conventional
More informationThe Curse of Dimensionality
The Curse of Dimensionality ACAS 2002 p1/66 Curse of Dimensionality The basic idea of the curse of dimensionality is that high dimensional data is difficult to work with for several reasons: Adding more
More informationUnsupervised Learning
Unsupervised Learning Unsupervised learning Until now, we have assumed our training samples are labeled by their category membership. Methods that use labeled samples are said to be supervised. However,
More informationFeature selection. Term 2011/2012 LSI - FIB. Javier Béjar cbea (LSI - FIB) Feature selection Term 2011/ / 22
Feature selection Javier Béjar cbea LSI - FIB Term 2011/2012 Javier Béjar cbea (LSI - FIB) Feature selection Term 2011/2012 1 / 22 Outline 1 Dimensionality reduction 2 Projections 3 Attribute selection
More informationCSE 40171: Artificial Intelligence. Learning from Data: Unsupervised Learning
CSE 40171: Artificial Intelligence Learning from Data: Unsupervised Learning 32 Homework #6 has been released. It is due at 11:59PM on 11/7. 33 CSE Seminar: 11/1 Amy Reibman Purdue University 3:30pm DBART
More informationIMPLEMENTATION OF RBF TYPE NETWORKS BY SIGMOIDAL FEEDFORWARD NEURAL NETWORKS
IMPLEMENTATION OF RBF TYPE NETWORKS BY SIGMOIDAL FEEDFORWARD NEURAL NETWORKS BOGDAN M.WILAMOWSKI University of Wyoming RICHARD C. JAEGER Auburn University ABSTRACT: It is shown that by introducing special
More information