COMBINED METHOD TO VISUALISE AND REDUCE DIMENSIONALITY OF THE FINANCIAL DATA SETS


Toomas Kirt
Supervisor: Leo Võhandu
Tallinn Technical University

Abstract: For the visualisation of multidimensional financial data sets we have used Self-Organising Maps (SOM) by T. Kohonen [6,8]. The SOM is a very useful neural computing method for analysing and visualising multidimensional data. To achieve better computational speed, one can reduce the dimensionality of the original data during the data pre-processing stage. One of the best-known methods for that is Principal Component Analysis (PCA). In our research project we have used an alternative, effective method for dimensionality reduction, called the Peeling method (Võhandu, Krusberg, 1977). The main difference between the two methods is that PCA finds principal components that are optimised linear combinations of all original variables (so every original variable is still needed and there is no real reduction of the data), whereas the Peeling method first finds the most important variables, those that describe the system of correlations in the best possible way using only some (or most) of the variables. To compare the results of the different dimensionality reduction methods, we have used financial data of Estonian banks. Our reduction method makes it possible to remove almost half of the original variables and still obtain practically the same SOM results as with all the data.

Key words: Self-Organising Maps, Neural Networks, Dimensionality Reduction, Data Mining, Peeling Method

1. INTRODUCTION

The Self-Organising Map (SOM) is a very useful neural computing method for analysing and visualising multidimensional data. A SOM can be used for translating multidimensional financial data into simple two-dimensional maps. It maps input data vectors that are near each other in the input space to nearby map units, and can thus be used as a clustering tool. To achieve better computational speed, it is possible to reduce the dimensionality of the original data during pre-processing. One of the well-known methods for dimensionality reduction is Principal Component Analysis (PCA). In this paper we also use another dimensionality reduction technique, the Peeling method. Our goal is to compare these two dimensionality reduction methods. To illustrate them we use financial data of Estonian banks and compare the results created by the SOM from the original data and from data with reduced dimensionality.

This paper is divided into four parts. In the first part we give a short overview of the SOM algorithm and visualisation methods. In the second part we give an overview of the main properties of PCA. In the third part we introduce the Peeling method, and in the fourth part we apply the methods described above to the financial data of Estonian banks and compare the results.

2. SELF-ORGANISING MAPS (SOM)

A self-organising map is a feedforward neural network that uses an unsupervised training algorithm and, through a process called self-organisation, configures the output units into a topological representation of the original data [2,5]. The SOM belongs to a general class of neural network methods, which are non-linear regression techniques that can be trained to learn or find relationships between inputs and outputs, or to organise data so as to disclose previously unknown patterns or structures [1]. The algorithm is based on unsupervised, competitive learning.

The algorithm provides a topology-preserving mapping from a high-dimensional space to map units. The map units, or neurons, usually form a two-dimensional grid, so the mapping is from a high-dimensional space onto a plane. Topology preservation means that the SOM groups similar input data vectors on the same or neighbouring neurons: points that are near each other in the input space are mapped to nearby map units in the SOM. The SOM can thus serve as a clustering tool as well as a tool for visualising high-dimensional data.

The process of creating a self-organising map requires two layers of processing units. The first is an input layer containing one processing unit for each element of the input vector; the second is an output layer, a grid of processing units that is fully connected with those of the input layer. The user can define the size of the output layer depending on how the map will be used.

The learning process goes as follows. First, the output grid is initialised with initial values, which can be random values from the input space. One sample is then taken from the input data and presented to the output grid of the map. All the neurons in the output layer compete with each other to become the winner. The winner is the output node that is closest to the sample vector according to the Euclidean distance. The weights of the winning neuron are moved closer to the sample vector, in the direction of the input sample. The weights of the neurons in the neighbourhood of the winning unit are changed as well. During learning the learning rate becomes smaller, and the rate of change applied to the neighbourhood of the winning neuron also declines. At the end of the training only the winning unit is adjusted. As a result of the self-organising process, similar input data vectors are mapped to nearby map units in the SOM.

The unified distance matrix (U-matrix) method is usually used to visualise the structure of the input space of a self-organising feature map. The U-matrix method can be used for getting an impression of otherwise invisible structures in a multidimensional data space, and it allows data sets to be classified into groups of similar data points (self-organised classification or clustering). One of the simplest U-matrix methods is to sum up the distances between the weight vectors of adjacent neurons on a feature map.

A U-matrix gives a picture of the topology of the unit layer and therefore also of the topology of the input space: altitude in the U-matrix encodes dissimilarity in the input space. Valleys in the U-matrix (i.e. low altitudes) correspond to input vectors that are similar [10]. The clusters in a multidimensional data set can therefore be identified by grouping together all the points that fall into the same valley of the U-matrix. Furthermore, the height of the walls or hills on a U-matrix gives a hint of how much the classes differ from each other. Finally, the properties of Self-Organising Maps ensure that similar groups are situated near each other in a U-matrix.
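The learning process and the simplest U-matrix variant described in this section can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the implementation used for the maps in this paper. The Gaussian neighbourhood function, the linear decay of the learning rate and neighbourhood width, the rectangular grid with four neighbours per unit, and the function names `best_matching_unit`, `train_som` and `u_matrix` are all assumptions made for the example.

```python
import numpy as np


def best_matching_unit(weights, x):
    """Grid index (row, col) of the unit closest to x in Euclidean distance."""
    dists = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(dists), dists.shape)


def train_som(data, rows, cols, n_iter=2000, lr0=0.5, seed=0):
    """Train a SOM with sequential updates; returns weights of shape (rows, cols, dim)."""
    rng = np.random.default_rng(seed)
    n, dim = data.shape
    # Initialise the output grid with random samples from the input space.
    weights = data[rng.integers(0, n, size=rows * cols)].astype(float)
    weights = weights.reshape(rows, cols, dim)
    # Grid position of every unit, used for neighbourhood distances on the map.
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)
    sigma0 = max(rows, cols) / 2.0
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)               # learning rate becomes smaller (assumed linear decay)
        sigma = sigma0 * (1.0 - frac) + 0.05  # neighbourhood shrinks until only the winner moves
        x = data[rng.integers(0, n)]          # present one sample to the map
        winner = best_matching_unit(weights, x)
        grid_d2 = np.sum((grid - np.array(winner)) ** 2, axis=-1)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))      # assumed Gaussian neighbourhood
        weights += lr * h[:, :, None] * (x - weights)  # move winner and neighbours towards x
    return weights


def u_matrix(weights):
    """Sum of distances between each unit's weight vector and those of its grid neighbours."""
    rows, cols, _ = weights.shape
    u = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    u[r, c] += np.linalg.norm(weights[r, c] - weights[rr, cc])
    return u


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic stand-in data (not the bank reports): two separated groups in 4 dimensions.
    data = np.vstack([rng.normal(0.0, 0.1, (50, 4)), rng.normal(5.0, 0.1, (50, 4))])
    som = train_som(data, rows=6, cols=6)
    print(np.round(u_matrix(som), 2))  # high values mark the wall between the two groups
```

Because this variant sums rather than averages the distances, units on the border of the grid have fewer neighbours and therefore tend to show lower values; averaging over the available neighbours is a common alternative.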

3. PRINCIPAL COMPONENT ANALYSIS

Principal Component Analysis (PCA) is a technique commonly used for data reduction in statistical pattern recognition and signal processing. It is also known as the Karhunen-Loève transform [3,9]. In PCA each component of the projected vector is a linear combination of the components of the original data item. The projection is formed by multiplying each component by a fixed scalar coefficient and adding the results together. Mathematical methods exist for finding the optimal coefficients, such that the variance of the data after the projection is as close as possible to the variance of the original data.

Let $X \in \mathbb{R}^N$ be a random N-dimensional vector representing the environment of interest. Our goal is to generate features that are mutually uncorrelated, that is $E[y(i)y(j)] = 0$ for $i \neq j$. Let $Y = A^T X$. From the definition of the correlation matrix we have

\[ R_Y \equiv E[YY^T] = E[A^T X X^T A] = A^T R_X A. \qquad (3.1) \]

However, $R_X$ is a symmetric matrix, and hence its eigenvectors are mutually orthogonal. Thus, if the matrix $A$ is chosen so that its columns are the orthonormal eigenvectors $a_1, a_2, \ldots, a_N$ of $R_X$, then $R_Y$ is diagonal:

\[ R_Y = A^T R_X A = \Lambda, \qquad (3.2) \]

where $\Lambda$ is the diagonal matrix having on its diagonal the respective eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_N$ of $R_X$. Let the eigenvalues be arranged in decreasing order, $\lambda_1 > \lambda_2 > \ldots > \lambda_j > \ldots > \lambda_N$, so that $\lambda_1 = \lambda_{\max}$.

An important property of PCA is the mean square error (MSE) approximation. If we choose in

\[ \hat{X} = \sum_{i=1}^{m} y(i)\, a_i \qquad (3.3) \]

the eigenvectors corresponding to the $m$, $m \le N$, largest eigenvalues of the correlation matrix, then the MSE is minimised, being the sum of the $N - m$ smallest eigenvalues.

Another property of PCA is the total variance property. Let $E[X]$ be zero and $Y$ be the PCA-transformed vector of $X$. The eigenvalues of the input correlation matrix are equal to the variances of the transformed features:

\[ \sigma^2_{y(i)} \equiv E[y^2(i)] = \lambda_i. \qquad (3.4) \]

Thus, selecting those features $y(i) = a_i^T X$ that correspond to the $m$ largest eigenvalues makes the sum of their variances, $\sum_i \lambda_i$, maximal. These properties allow choosing $m$ principal components that retain most of the total variance associated with the original random variables [7].
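Properties (3.1) to (3.4) can be checked numerically. The sketch below is an illustration under stated assumptions, not the code used in this paper. It standardises the variables so that $R_X$ is the sample correlation matrix, uses `numpy.linalg.eigh` for the eigen-decomposition, and keeps the smallest $m$ whose eigenvalues reach a chosen share of the total variance. The function name `pca_from_correlation`, the parameter `var_threshold` and the synthetic data are hypothetical.

```python
import numpy as np


def pca_from_correlation(X, var_threshold=0.95):
    """PCA via eigen-decomposition of the correlation matrix (rows of X = observations)."""
    # Standardise every variable so that R = Z^T Z / n is the correlation matrix R_X.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    R = (Z.T @ Z) / Z.shape[0]
    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]          # rearrange in decreasing order
    eigvals, A = eigvals[order], eigvecs[:, order]
    # Smallest m whose leading eigenvalues explain at least var_threshold of the variance.
    explained = np.cumsum(eigvals) / eigvals.sum()
    m = min(int(np.searchsorted(explained, var_threshold)) + 1, len(eigvals))
    Y = Z @ A[:, :m]                           # y(i) = a_i^T x for the m leading eigenvectors
    return Y, eigvals, A, m


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in with the same shape as the case study (92 reports, 16 variables).
    X = rng.normal(size=(92, 5)) @ rng.normal(size=(5, 16)) + 0.1 * rng.normal(size=(92, 16))
    Y, eigvals, A, m = pca_from_correlation(X)
    # (3.2)/(3.4): the transformed features are uncorrelated with variances lambda_i.
    print(np.allclose(Y.T @ Y / len(Y), np.diag(eigvals[:m])))
    # (3.3): the reconstruction error equals the sum of the N - m smallest eigenvalues.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    mse = np.mean(np.sum((Z - Y @ A[:, :m].T) ** 2, axis=1))
    print(m, mse, eigvals[m:].sum())           # the last two values agree up to rounding
```

Note that every column of `Y` is computed from all 16 columns of `Z`, which is the sense in which PCA does not remove any of the original variables.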

4. PEELING METHOD

The Peeling method [11] first finds the most important variables, those that describe the system of correlations in the best possible way using only some (or most) of the variables. The algorithm works as follows:

1. For a correlation matrix

\[ R = \begin{pmatrix} 1 & r_{12} & \cdots & r_{1m} \\ r_{21} & 1 & \cdots & r_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ r_{m1} & r_{m2} & \cdots & 1 \end{pmatrix} \qquad (4.1) \]

we calculate for every column

\[ S_j = \sum_{i=1}^{m} \frac{r_{ij}^2}{r_{jj}}. \qquad (4.2) \]

2. Find

\[ S^{(k)} = \max_j S_j. \qquad (4.3) \]

(This gives the index of the variable that is, on average, the most important in the system. The superscript shows the number of the iteration, $k = 1, \ldots, r$, $r \le m$.)

3. The correlation coefficients of the maximal variable are divided by the square root of the diagonal element $r_{jj}$ of the matrix $R$. The transformed column vector $b_1$ is the first vector of the new factor matrix $B$.

4. Find the residual matrix

\[ R^{(1)} = R - b_1 b_1'. \qquad (4.4) \]

5. Repeat the process $r$ times, where $r \le m$ is the rank of $R$.

According to the elimination order we take the first $r$ variables and use them in the following activities.

5. CASE STUDY: SOM OF ESTONIAN BANKS

To illustrate the two methods we have used financial reports of Estonian banks. We used 92 different reports from the period and each report consists of 16 variables. This is not a very large amount of data, but our goal is only to compare the visual results achieved by the different methods. Every node on the map represents one or more bank reports. First we created a U-matrix of the original data; the result is given in Figure 1.

Figure 1. SOM of Estonian banks, 16 variables

Secondly, we applied PCA to the original data and selected nine linear combinations that describe more than 95% of the variation. The SOM created from the reduced data is shown in Figure 2.

Figure 2. SOM of Estonian banks, PCA, 9 variables

In the third attempt we used the Peeling method. We eliminated six variables and used the ten remaining original variables. The result can be seen in Figure 3.
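The variable selection used for this third map follows steps 1 to 5 of Section 4, which might be sketched as below. This is a hedged illustration rather than the original program. The paper does not state how the number of retained variables was chosen, so the parameter `n_select` is an assumption, as are the function name `peeling_order`, the tolerance `tol` used to skip exhausted columns, and the synthetic data.

```python
import numpy as np


def peeling_order(R, n_select, tol=1e-10):
    """Return the indices of the variables in peeling order and the final residual matrix."""
    R = np.array(R, dtype=float)      # work on a copy that becomes the residual matrix
    selected = []
    for _ in range(n_select):
        diag = np.diag(R)
        valid = diag > tol            # columns whose residual variance is not exhausted
        if not valid.any():
            break                     # rank of R reached, nothing left to peel
        # (4.2): S_j = sum_i r_ij^2 / r_jj for every remaining column j
        S = np.full(R.shape[0], -np.inf)
        S[valid] = (R[:, valid] ** 2).sum(axis=0) / diag[valid]
        j = int(np.argmax(S))         # (4.3): the most important variable on average
        b = R[:, j] / np.sqrt(diag[j])  # step 3: next column of the factor matrix B
        R = R - np.outer(b, b)        # (4.4): residual matrix
        selected.append(j)
    return selected, R


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in with the same shape as the case study (92 reports, 16 variables).
    X = rng.normal(size=(92, 5)) @ rng.normal(size=(5, 16)) + 0.5 * rng.normal(size=(92, 16))
    R = np.corrcoef(X, rowvar=False)
    keep, _ = peeling_order(R, n_select=10)
    X_reduced = X[:, keep]            # ten original variables, ready to be fed to a SOM
    print(keep, X_reduced.shape)
```

Unlike the PCA projection, the reduced data here consists of unmodified original columns, so the six eliminated variables do not need to be collected at all.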

Figure 3. SOM of Estonian banks, Peeling method, 10 variables

As we can see, the structures of the three maps are similar. This means that we can get practically the same results without using all the data. Calculating the SOM took 17, 10 and 11 seconds respectively, which means that the process was approximately 44% and 38% faster when we used 9 and 10 variables instead of 16.

6. CONCLUSION

We have introduced two possible approaches to dimensionality reduction and compared them. As the maps show, there are only small differences between the maps made from data with reduced dimensionality and the map made from the original data. Despite these results, we should take into account that calculating a correlation matrix and its eigenvalues is itself computationally expensive. Therefore, in further research we would like to turn our attention to the random mapping method suggested by Sami Kaski [4].

REFERENCES

[1] Deboeck G, Kohonen T, Visual exploration in finance: with self-organising maps, Springer, Berlin, 1998
[2] Haykin S, Neural Networks, Prentice Hall, New Jersey, 1999
[3] Jobson J.D, Applied Multivariate Data Analysis, Volume II, Springer, New York, 1992
[4] Kaski S, Dimensionality Reduction by Random Mapping: Fast Similarity Computation for Clustering, IEEE International Joint Conference on Neural Networks, Anchorage, Alaska, May
[5] Kohonen T, Kaski S, Lagus K, Salojärvi J, Honkela J, Paatero V and Saarela A, Self organisation of a massive document collection, IEEE Transactions on Neural Networks, vol. 11, no. 3, May 2000
[6] Kohonen T, Self-organising maps, Third edition, Springer, Berlin, 2000
[7] Liu C, Wechsler H, Face Recognition Using Shape and Texture, CVPR 99, Fort Collins, Colorado, June
[8] Oja E, Kaski S, Kohonen Maps, Elsevier, Amsterdam, 1999
[9] Theodoridis S, Koutroumbas K, Pattern Recognition, Academic Press, San Diego, 1998
[10] Ultsch A, Unified Matrix (U-matrix) Methods,
[11] Võhandu L, Krusberg H, A Direct Factor Analysis Method, The Proceedings of TTU, 426, 1977, pp. 11-21


More information

Linear Discriminant Analysis in Ottoman Alphabet Character Recognition

Linear Discriminant Analysis in Ottoman Alphabet Character Recognition Linear Discriminant Analysis in Ottoman Alphabet Character Recognition ZEYNEB KURT, H. IREM TURKMEN, M. ELIF KARSLIGIL Department of Computer Engineering, Yildiz Technical University, 34349 Besiktas /

More information

Applying Kohonen Network in Organising Unstructured Data for Talus Bone

Applying Kohonen Network in Organising Unstructured Data for Talus Bone 212 Third International Conference on Theoretical and Mathematical Foundations of Computer Science Lecture Notes in Information Technology, Vol.38 Applying Kohonen Network in Organising Unstructured Data

More information

What is a receptive field? Why a sensory neuron has such particular RF How a RF was developed?

What is a receptive field? Why a sensory neuron has such particular RF How a RF was developed? What is a receptive field? Why a sensory neuron has such particular RF How a RF was developed? x 1 x 2 x 3 y f w 1 w 2 w 3 T x y = f (wx i i T ) i y x 1 x 2 x 3 = = E (y y) (y f( wx T)) 2 2 o o i i i

More information

CSE 258 Lecture 5. Web Mining and Recommender Systems. Dimensionality Reduction

CSE 258 Lecture 5. Web Mining and Recommender Systems. Dimensionality Reduction CSE 258 Lecture 5 Web Mining and Recommender Systems Dimensionality Reduction This week How can we build low dimensional representations of high dimensional data? e.g. how might we (compactly!) represent

More information

SGN (4 cr) Chapter 10

SGN (4 cr) Chapter 10 SGN-41006 (4 cr) Chapter 10 Feature Selection and Extraction Jussi Tohka & Jari Niemi Department of Signal Processing Tampere University of Technology February 18, 2014 J. Tohka & J. Niemi (TUT-SGN) SGN-41006

More information

Modelling and Visualization of High Dimensional Data. Sample Examination Paper

Modelling and Visualization of High Dimensional Data. Sample Examination Paper Duration not specified UNIVERSITY OF MANCHESTER SCHOOL OF COMPUTER SCIENCE Modelling and Visualization of High Dimensional Data Sample Examination Paper Examination date not specified Time: Examination

More information

Stability Assessment of Electric Power Systems using Growing Neural Gas and Self-Organizing Maps

Stability Assessment of Electric Power Systems using Growing Neural Gas and Self-Organizing Maps Stability Assessment of Electric Power Systems using Growing Gas and Self-Organizing Maps Christian Rehtanz, Carsten Leder University of Dortmund, 44221 Dortmund, Germany Abstract. Liberalized competitive

More information

Initialization of Self-Organizing Maps: Principal Components Versus Random Initialization. A Case Study

Initialization of Self-Organizing Maps: Principal Components Versus Random Initialization. A Case Study Initialization of Self-Organizing Maps: Principal Components Versus Random Initialization. A Case Study Ayodeji A. Akinduko 1 and Evgeny M. Mirkes 2 1 University of Leicester, UK, aaa78@le.ac.uk 2 Siberian

More information

Lorentzian Distance Classifier for Multiple Features

Lorentzian Distance Classifier for Multiple Features Yerzhan Kerimbekov 1 and Hasan Şakir Bilge 2 1 Department of Computer Engineering, Ahmet Yesevi University, Ankara, Turkey 2 Department of Electrical-Electronics Engineering, Gazi University, Ankara, Turkey

More information

A novel firing rule for training Kohonen selforganising

A novel firing rule for training Kohonen selforganising A novel firing rule for training Kohonen selforganising maps D. T. Pham & A. B. Chan Manufacturing Engineering Centre, School of Engineering, University of Wales Cardiff, P.O. Box 688, Queen's Buildings,

More information

Advanced visualization techniques for Self-Organizing Maps with graph-based methods

Advanced visualization techniques for Self-Organizing Maps with graph-based methods Advanced visualization techniques for Self-Organizing Maps with graph-based methods Georg Pölzlbauer 1, Andreas Rauber 1, and Michael Dittenbach 2 1 Department of Software Technology Vienna University

More information

Data Preprocessing. Chapter 15

Data Preprocessing. Chapter 15 Chapter 15 Data Preprocessing Data preprocessing converts raw data and signals into data representation suitable for application through a sequence of operations. The objectives of data preprocessing include

More information

Gray-Level Reduction Using Local Spatial Features

Gray-Level Reduction Using Local Spatial Features Computer Vision and Image Understanding 78, 336 350 (2000) doi:10.1006/cviu.2000.0838, available online at http://www.idealibrary.com on Gray-Level Reduction Using Local Spatial Features Nikos Papamarkos

More information

Basics of Multivariate Modelling and Data Analysis

Basics of Multivariate Modelling and Data Analysis Basics of Multivariate Modelling and Data Analysis Kurt-Erik Häggblom 9. Linear regression with latent variables 9.1 Principal component regression (PCR) 9.2 Partial least-squares regression (PLS) [ mostly

More information

Cartographic Selection Using Self-Organizing Maps

Cartographic Selection Using Self-Organizing Maps 1 Cartographic Selection Using Self-Organizing Maps Bin Jiang 1 and Lars Harrie 2 1 Division of Geomatics, Institutionen för Teknik University of Gävle, SE-801 76 Gävle, Sweden e-mail: bin.jiang@hig.se

More information

CHAPTER 6 COUNTER PROPAGATION NEURAL NETWORK IN GAIT RECOGNITION

CHAPTER 6 COUNTER PROPAGATION NEURAL NETWORK IN GAIT RECOGNITION 75 CHAPTER 6 COUNTER PROPAGATION NEURAL NETWORK IN GAIT RECOGNITION 6.1 INTRODUCTION Counter propagation network (CPN) was developed by Robert Hecht-Nielsen as a means to combine an unsupervised Kohonen

More information

Improving A Trajectory Index For Topology Conserving Mapping

Improving A Trajectory Index For Topology Conserving Mapping Proceedings of the 8th WSEAS Int. Conference on Automatic Control, Modeling and Simulation, Prague, Czech Republic, March -4, 006 (pp03-08) Improving A Trajectory Index For Topology Conserving Mapping

More information

Machine Learning : Clustering, Self-Organizing Maps

Machine Learning : Clustering, Self-Organizing Maps Machine Learning Clustering, Self-Organizing Maps 12/12/2013 Machine Learning : Clustering, Self-Organizing Maps Clustering The task: partition a set of objects into meaningful subsets (clusters). The

More information

Principal Component Image Interpretation A Logical and Statistical Approach

Principal Component Image Interpretation A Logical and Statistical Approach Principal Component Image Interpretation A Logical and Statistical Approach Md Shahid Latif M.Tech Student, Department of Remote Sensing, Birla Institute of Technology, Mesra Ranchi, Jharkhand-835215 Abstract

More information

Applied Neuroscience. Columbia Science Honors Program Fall Machine Learning and Neural Networks

Applied Neuroscience. Columbia Science Honors Program Fall Machine Learning and Neural Networks Applied Neuroscience Columbia Science Honors Program Fall 2016 Machine Learning and Neural Networks Machine Learning and Neural Networks Objective: Introduction to Machine Learning Agenda: 1. JavaScript

More information

Time Series Clustering Ensemble Algorithm Based on Locality Preserving Projection

Time Series Clustering Ensemble Algorithm Based on Locality Preserving Projection Based on Locality Preserving Projection 2 Information & Technology College, Hebei University of Economics & Business, 05006 Shijiazhuang, China E-mail: 92475577@qq.com Xiaoqing Weng Information & Technology

More information

CSE 255 Lecture 5. Data Mining and Predictive Analytics. Dimensionality Reduction

CSE 255 Lecture 5. Data Mining and Predictive Analytics. Dimensionality Reduction CSE 255 Lecture 5 Data Mining and Predictive Analytics Dimensionality Reduction Course outline Week 4: I ll cover homework 1, and get started on Recommender Systems Week 5: I ll cover homework 2 (at the

More information

Alternative Statistical Methods for Bone Atlas Modelling

Alternative Statistical Methods for Bone Atlas Modelling Alternative Statistical Methods for Bone Atlas Modelling Sharmishtaa Seshamani, Gouthami Chintalapani, Russell Taylor Department of Computer Science, Johns Hopkins University, Baltimore, MD Traditional

More information

On Classification: An Empirical Study of Existing Algorithms Based on Two Kaggle Competitions

On Classification: An Empirical Study of Existing Algorithms Based on Two Kaggle Competitions On Classification: An Empirical Study of Existing Algorithms Based on Two Kaggle Competitions CAMCOS Report Day December 9th, 2015 San Jose State University Project Theme: Classification The Kaggle Competition

More information

Performance Degradation Assessment and Fault Diagnosis of Bearing Based on EMD and PCA-SOM

Performance Degradation Assessment and Fault Diagnosis of Bearing Based on EMD and PCA-SOM Performance Degradation Assessment and Fault Diagnosis of Bearing Based on EMD and PCA-SOM Lu Chen and Yuan Hang PERFORMANCE DEGRADATION ASSESSMENT AND FAULT DIAGNOSIS OF BEARING BASED ON EMD AND PCA-SOM.

More information

Principal Component Analysis (PCA) is a most practicable. statistical technique. Its application plays a major role in many

Principal Component Analysis (PCA) is a most practicable. statistical technique. Its application plays a major role in many CHAPTER 3 PRINCIPAL COMPONENT ANALYSIS ON EIGENFACES 2D AND 3D MODEL 3.1 INTRODUCTION Principal Component Analysis (PCA) is a most practicable statistical technique. Its application plays a major role

More information

DI TRANSFORM. The regressive analyses. identify relationships

DI TRANSFORM. The regressive analyses. identify relationships July 2, 2015 DI TRANSFORM MVstats TM Algorithm Overview Summary The DI Transform Multivariate Statistics (MVstats TM ) package includes five algorithm options that operate on most types of geologic, geophysical,

More information

Recognizing Handwritten Digits Using the LLE Algorithm with Back Propagation

Recognizing Handwritten Digits Using the LLE Algorithm with Back Propagation Recognizing Handwritten Digits Using the LLE Algorithm with Back Propagation Lori Cillo, Attebury Honors Program Dr. Rajan Alex, Mentor West Texas A&M University Canyon, Texas 1 ABSTRACT. This work is

More information

Influence of Neighbor Size for Initial Node Exchange of SOM Learning

Influence of Neighbor Size for Initial Node Exchange of SOM Learning FR-E3-3 SCIS&ISIS2006 @ Tokyo, Japan (September 20-24, 2006) Influence of Neighbor Size for Initial Node Exchange of SOM Learning MIYOSHI Tsutomu Department of Information and Knowledge Engineering, Tottori

More information

PERFORMANCE OF THE DISTRIBUTED KLT AND ITS APPROXIMATE IMPLEMENTATION

PERFORMANCE OF THE DISTRIBUTED KLT AND ITS APPROXIMATE IMPLEMENTATION 20th European Signal Processing Conference EUSIPCO 2012) Bucharest, Romania, August 27-31, 2012 PERFORMANCE OF THE DISTRIBUTED KLT AND ITS APPROXIMATE IMPLEMENTATION Mauricio Lara 1 and Bernard Mulgrew

More information

WAVELET USE FOR IMAGE CLASSIFICATION. Andrea Gavlasová, Aleš Procházka, and Martina Mudrová

WAVELET USE FOR IMAGE CLASSIFICATION. Andrea Gavlasová, Aleš Procházka, and Martina Mudrová WAVELET USE FOR IMAGE CLASSIFICATION Andrea Gavlasová, Aleš Procházka, and Martina Mudrová Prague Institute of Chemical Technology Department of Computing and Control Engineering Technická, Prague, Czech

More information

Self-Organizing Map. presentation by Andreas Töscher. 19. May 2008

Self-Organizing Map. presentation by Andreas Töscher. 19. May 2008 19. May 2008 1 Introduction 2 3 4 5 6 (SOM) aka Kohonen Network introduced by Teuvo Kohonen implements a discrete nonlinear mapping unsupervised learning Structure of a SOM Learning Rule Introduction

More information

CHAPTER 3 PRINCIPAL COMPONENT ANALYSIS AND FISHER LINEAR DISCRIMINANT ANALYSIS

CHAPTER 3 PRINCIPAL COMPONENT ANALYSIS AND FISHER LINEAR DISCRIMINANT ANALYSIS 38 CHAPTER 3 PRINCIPAL COMPONENT ANALYSIS AND FISHER LINEAR DISCRIMINANT ANALYSIS 3.1 PRINCIPAL COMPONENT ANALYSIS (PCA) 3.1.1 Introduction In the previous chapter, a brief literature review on conventional

More information

The Curse of Dimensionality

The Curse of Dimensionality The Curse of Dimensionality ACAS 2002 p1/66 Curse of Dimensionality The basic idea of the curse of dimensionality is that high dimensional data is difficult to work with for several reasons: Adding more

More information

Unsupervised Learning

Unsupervised Learning Unsupervised Learning Unsupervised learning Until now, we have assumed our training samples are labeled by their category membership. Methods that use labeled samples are said to be supervised. However,

More information

Feature selection. Term 2011/2012 LSI - FIB. Javier Béjar cbea (LSI - FIB) Feature selection Term 2011/ / 22

Feature selection. Term 2011/2012 LSI - FIB. Javier Béjar cbea (LSI - FIB) Feature selection Term 2011/ / 22 Feature selection Javier Béjar cbea LSI - FIB Term 2011/2012 Javier Béjar cbea (LSI - FIB) Feature selection Term 2011/2012 1 / 22 Outline 1 Dimensionality reduction 2 Projections 3 Attribute selection

More information

CSE 40171: Artificial Intelligence. Learning from Data: Unsupervised Learning

CSE 40171: Artificial Intelligence. Learning from Data: Unsupervised Learning CSE 40171: Artificial Intelligence Learning from Data: Unsupervised Learning 32 Homework #6 has been released. It is due at 11:59PM on 11/7. 33 CSE Seminar: 11/1 Amy Reibman Purdue University 3:30pm DBART

More information

IMPLEMENTATION OF RBF TYPE NETWORKS BY SIGMOIDAL FEEDFORWARD NEURAL NETWORKS

IMPLEMENTATION OF RBF TYPE NETWORKS BY SIGMOIDAL FEEDFORWARD NEURAL NETWORKS IMPLEMENTATION OF RBF TYPE NETWORKS BY SIGMOIDAL FEEDFORWARD NEURAL NETWORKS BOGDAN M.WILAMOWSKI University of Wyoming RICHARD C. JAEGER Auburn University ABSTRACT: It is shown that by introducing special

More information