COMBINED METHOD TO VISUALISE AND REDUCE DIMENSIONALITY OF THE FINANCIAL DATA SETS

Toomas Kirt
Supervisor: Leo Võhandu
Tallinn Technical University
Toomas.Kirt@mail.ee

Abstract: For the visualisation of multidimensional financial data sets we have used the Self-Organising Map (SOM) by T. Kohonen [6,8]. The SOM is one of the most useful neural computing methods for analysing and visualising multidimensional data. To achieve better computational speed, the dimensionality of the original data has to be reduced during the pre-processing stage. One of the best-known methods for this is Principal Component Analysis (PCA). In our research project we have used an alternative, effective method for dimensionality reduction, called the Peeling Method (Võhandu, Krusberg, 1977). The main difference between the two methods is that PCA produces principal components that are optimised linear combinations of all original variables (so no variables are actually removed), whereas the Peeling Method selects the most important original variables, those that describe the correlation structure of the system as well as possible using only a subset of the variables. To compare the dimensionality reduction methods we have used financial data of Estonian banks. Our reduction method makes it possible to remove almost half of the original variables and still obtain practically the same SOM results as with the full data.

Key words: Self-Organising Maps, Neural Networks, Dimensionality Reduction, Data Mining, Peeling Method

1. INTRODUCTION

The Self-Organising Map (SOM) is a very useful neural computing method for analysing and visualising multidimensional data. The SOM can translate multidimensional financial data into simple two-dimensional maps: input data vectors that are near each other in the input space are grouped onto nearby map units, so the SOM can also be used as a clustering tool. To achieve better computational speed, the dimensionality of the original data can be reduced during pre-processing. One of the well-known methods for dimensionality reduction is Principal Component Analysis (PCA). In this paper we also use another dimensionality reduction technique, the Peeling Method. Our goal is to compare these two dimensionality reduction methods, and in particular to compare the maps the SOM produces from the original data and from the data with reduced dimensionality. As test material we use financial data of Estonian banks.

The paper is divided into four parts. The first part gives a short overview of the SOM algorithm and its visualisation methods. The second part summarises the main properties of PCA. The third part introduces the Peeling Method, and the fourth part applies the methods described above to the financial data of Estonian banks and compares the results.

2. SELF-ORGANISING MAPS (SOM)

A self-organising map is a feedforward neural network that uses an unsupervised training algorithm; through a process called self-organisation it configures its output units into a topological representation of the original data [2,5]. The SOM belongs to a general class of neural network methods, non-linear regression techniques that can be trained to learn relationships between inputs and outputs, or to organise data so as to disclose previously unknown patterns or structures [1].

The algorithm is based on unsupervised, competitive learning and provides a topology-preserving mapping from a high-dimensional space to map units. The map units, or neurons, usually form a two-dimensional grid, so the mapping is from a high-dimensional space onto a plane. Topology preservation means that the SOM groups similar input data vectors onto neurons: points that are near each other in the input space are mapped to nearby map units in the SOM. The SOM can thus serve as a clustering tool as well as a tool for visualising high-dimensional data.

The process of creating a self-organising map requires two layers of processing units. The first is an input layer containing one processing unit for each element of the input vector; the second is an output layer, a grid of processing units fully connected to those of the input layer. The user can define the size of the output layer depending on how the map will be used.

The learning process works as follows. First the output grid is initialised, for example with random values from the input space. One sample is then taken from the input data and presented to the output grid of the map. All the neurons in the output layer compete to become the winner: the winner is the output node closest to the sample vector according to the Euclidean distance. The weights of the winning neuron are moved towards the sample vector, and the weights of the neurons in the neighbourhood of the winner are changed as well. During learning, the learning rate becomes smaller and the neighbourhood around the winning neuron shrinks, so that at the end of training only the winning unit is adjusted. As a result of the self-organising process, similar input data vectors are mapped to nearby map units in the SOM.

The unified distance matrix (U-matrix) method is usually used to visualise the structure of the input space of a self-organising feature map. The U-matrix gives an impression of otherwise invisible structures in a multidimensional data space and allows data sets to be classified into groups of similar data points (self-organised classification or clustering). One of the simplest U-matrix methods is to sum up the distances between the weight vectors of adjacent neurons on the feature map. A U-matrix gives a picture of the topology of the unit layer, and therefore also of the topology of the input space: altitude in the U-matrix encodes dissimilarity in the input space, so valleys (low altitudes) correspond to input vectors that are similar [10]. The clusters in a multidimensional data set can thus be identified by grouping together all the points falling into the same valley of the U-matrix. Furthermore, the height of the walls or hills in a U-matrix gives a hint of how much the classes differ from each other. Finally, the properties of self-organising maps ensure that similar groups are situated near each other in the U-matrix.
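To make the training procedure and the simple U-matrix described above concrete, the following Python/NumPy sketch shows one possible minimal implementation. It is only an illustration of the idea, not the SOM implementation used in our experiments; the function names (train_som, u_matrix) and the default parameters (grid size, learning rate, number of iterations) are our own assumptions.

```python
import numpy as np

def train_som(data, grid_h=10, grid_w=10, n_iter=2000, lr0=0.5, radius0=None, seed=0):
    """Train a minimal rectangular SOM on data of shape (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    n, dim = data.shape
    radius0 = radius0 or max(grid_h, grid_w) / 2.0
    # Initialise the codebook with random samples drawn from the input space.
    weights = data[rng.integers(0, n, grid_h * grid_w)].astype(float).reshape(grid_h, grid_w, dim)
    # Grid coordinates of every map unit, used by the neighbourhood function.
    coords = np.stack(np.meshgrid(np.arange(grid_h), np.arange(grid_w), indexing="ij"), axis=-1)

    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                 # learning rate decays towards zero
        radius = radius0 * (1.0 - frac) + 1e-3  # neighbourhood shrinks towards the winner only
        x = data[rng.integers(0, n)]            # present one random sample to the map
        # Winner = unit whose weight vector is closest in Euclidean distance.
        dists = np.linalg.norm(weights - x, axis=2)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)
        # Gaussian neighbourhood around the winner on the grid.
        grid_dist2 = np.sum((coords - np.array(bmu)) ** 2, axis=2)
        h = np.exp(-grid_dist2 / (2.0 * radius ** 2))
        # Move the winner and its neighbours towards the sample.
        weights += lr * h[..., None] * (x - weights)
    return weights

def u_matrix(weights):
    """Sum of distances from each unit to its 4-connected neighbours (simple U-matrix)."""
    gh, gw, _ = weights.shape
    u = np.zeros((gh, gw))
    for i in range(gh):
        for j in range(gw):
            neigh = [(i + di, j + dj) for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                     if 0 <= i + di < gh and 0 <= j + dj < gw]
            u[i, j] = sum(np.linalg.norm(weights[i, j] - weights[a, b]) for a, b in neigh)
    return u
```

High values of u_matrix(weights) would correspond to the "walls" between clusters and low values to the "valleys" of similar input vectors discussed above.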

3. PRINCIPAL COMPONENT ANALYSIS

Principal Component Analysis (PCA) is a technique commonly used for data reduction in statistical pattern recognition and signal processing. It is also known as the Karhunen-Loève Transform [3,9]. In PCA each component of the projected vector is a linear combination of the components of the original data item. The projection is formed by multiplying each component by a certain fixed scalar coefficient and adding the results together. Mathematical methods exist for finding the optimal coefficients such that the variance of the data after the projection is preserved, i.e. it stays as close as possible to the variance of the original data.

Let $X \in R^N$ be a random $N$-dimensional vector representing the environment of interest. Our goal is to generate features that are optimally uncorrelated, that is $E[y(i)y(j)] = 0$ for $i \neq j$. Let $Y = A^T X$. From the definition of the correlation matrix we have

$R_Y = E[YY^T] = E[A^T X X^T A] = A^T R_X A$.  (3.1)

However, $R_X$ is a symmetric matrix, and hence its eigenvectors are mutually orthogonal. Thus, if the matrix $A$ is chosen so that its columns are the orthonormal eigenvectors $a_1, a_2, \ldots, a_N$ of $R_X$, then $R_Y$ is diagonal,

$R_Y = A^T R_X A = \Lambda$,  (3.2)

where $\Lambda$ is the diagonal matrix having on its diagonal the respective eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_N$ of $R_X$. Let the eigenvalues be arranged in decreasing order, $\lambda_1 > \lambda_2 > \ldots > \lambda_N$, so that $\lambda_1 = \lambda_{\max}$.

An important property of PCA is mean square error approximation. If in

$\hat{x} = \sum_{i=1}^{m} y(i)\, a_i$  (3.3)

we choose the eigenvectors corresponding to the $m$ ($m \leq N$) largest eigenvalues of the correlation matrix, then the MSE is minimised, being equal to the sum of the $N-m$ smallest eigenvalues.

Another property of PCA concerns the total variance. Let $E[X]$ be zero and $Y$ be the PCA-transformed vector of $X$. The eigenvalues of the input correlation matrix are equal to the variances of the transformed features,

$\sigma_Y^2(i) = E[y(i)^2] = \lambda_i$.  (3.4)

Thus, selecting the features $y(i) = a_i^T x$ corresponding to the $m$ largest eigenvalues makes the sum of their variances $\sum_i \lambda_i$ maximal. These properties allow choosing $m$ principal components that retain most of the total variance associated with the original random variables [7].
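The selection of $m$ components by retained variance can be sketched in a few lines of NumPy. This is a minimal illustration of the standard eigendecomposition route described above, not the code used in our case study; the function name pca_reduce and the 95% default threshold are our own choices (the threshold matches the one applied in Section 5).

```python
import numpy as np

def pca_reduce(X, var_threshold=0.95):
    """Project X (n_samples x n_features) onto the principal components that
    together retain at least var_threshold of the total variance."""
    Xc = X - X.mean(axis=0)                # centre the data, so E[X] = 0
    R = np.cov(Xc, rowvar=False)           # sample covariance matrix of the features
    eigval, eigvec = np.linalg.eigh(R)     # symmetric matrix -> orthonormal eigenvectors
    order = np.argsort(eigval)[::-1]       # arrange eigenvalues in decreasing order
    eigval, eigvec = eigval[order], eigvec[:, order]
    cum = np.cumsum(eigval) / eigval.sum() # cumulative share of total variance
    m = int(np.searchsorted(cum, var_threshold)) + 1
    A = eigvec[:, :m]                      # first m eigenvectors as the columns of A
    return Xc @ A, eigval[:m]              # projected data Y = A^T x per sample, plus variances
```

Note that every projected component remains a linear combination of all original variables, which is the key difference from the Peeling Method introduced next.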

4. PEELING METHOD

The Peeling Method [11] selects the most important variables, those that describe the correlation structure of the system in the best possible way using only a subset of the variables. The algorithm works as follows (a small code sketch is given after the steps):

1. For the correlation matrix

$R = \begin{pmatrix} 1 & r_{12} & \ldots & r_{1m} \\ r_{21} & 1 & \ldots & r_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ r_{m1} & r_{m2} & \ldots & 1 \end{pmatrix}$  (4.1)

we calculate for every column

$S_j = \sum_{i=1}^{m} \frac{r_{ij}^2}{r_{jj}}$.  (4.2)

2. Find

$S^{(k)} = \max_j S_j$,  (4.3)

that is, the column of the variable that is on average the most important in the system. The superscript denotes the number of the iteration ($k = 1, \ldots, r \leq m$).

3. The correlation coefficients of the selected column are divided by the square root of the diagonal element $r_{jj}$ of the matrix $R$. The transformed column vector $b_1$ becomes the first vector of the new factor matrix $B$.

4. Find the residual matrix

$R^{(1)} = R - b_1 b_1^T$.  (4.4)

5. Repeat the process $r$ times, where $r \leq m$ is the rank of $R$.

According to the elimination order we take the first $r$ variables and use them in the subsequent analysis.
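The following NumPy sketch follows steps 1-5 above; it is only an illustration under our reading of the algorithm, not the original implementation. The function name peeling_order, its arguments, and the small numerical guard on the diagonal are our own additions.

```python
import numpy as np

def peeling_order(X, n_select=None):
    """Rank variables with the Peeling Method: repeatedly pick the column of the
    (residual) correlation matrix with the largest S_j = sum_i r_ij^2 / r_jj."""
    R = np.corrcoef(X, rowvar=False)        # start from the correlation matrix (4.1)
    m = R.shape[0]
    n_select = n_select or m
    order, factors = [], []
    for _ in range(n_select):
        diag = np.diag(R).copy()
        diag[diag <= 1e-12] = np.inf        # variables already peeled get a zero score
        S = (R ** 2).sum(axis=0) / diag     # importance score (4.2) for every column
        j = int(np.argmax(S))               # most important remaining variable (4.3)
        order.append(j)
        b = R[:, j] / np.sqrt(R[j, j])      # normalised column = next factor vector
        factors.append(b)
        R = R - np.outer(b, b)              # residual matrix (4.4) after peeling variable j
    return order, np.array(factors)
```

The first r indices in order would then be the original variables kept for the SOM, as done in the case study below.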

5. CASE STUDY: SOM OF ESTONIAN BANKS

To illustrate the two methods we have used financial reports of Estonian banks: 92 reports from the period 1997-1998, each consisting of 16 variables. This is not a very large amount of data, but our goal is simply to compare the visual results achieved by the different methods. Every node on the map represents one or more bank reports.

First we created a U-matrix of the original data; the result is given in Figure 1.

Figure 1. SOM of Estonian banks, 16 variables

Secondly we applied PCA to the original data and selected nine linear combinations that describe more than 95% of the variation. Creating a SOM from the reduced data gave the result shown in Figure 2.

Figure 2. SOM of Estonian banks, PCA, 9 variables

In the third attempt we used the Peeling Method: we eliminated six variables and kept ten original variables. The result is shown in Figure 3.

Figure 3. SOM of Estonian banks, Peeling Method, 10 variables

As we can see, the structures of the three maps are similar. This means that we can get practically the same results without using all of the data. Calculating the SOM took 17, 10 and 11 seconds respectively, so the process was approximately 44% and 38% faster when we used 9 and 10 variables instead of 16.

6. CONCLUSION

We have introduced and compared two possible approaches to dimensionality reduction. As the maps show, there are only small differences between the maps made from the data with reduced dimensionality and the map made from the original data. Despite these results we should take into account that calculating a correlation matrix and its eigenvalues is a computationally demanding activity. In further research we would therefore like to turn our attention to the random mapping method suggested by Sami Kaski [4].

REFERENCES

[1] Deboeck G, Kohonen T, Visual Explorations in Finance: with Self-Organizing Maps, Springer, Berlin, 1998
[2] Haykin S, Neural Networks, Prentice Hall, New Jersey, 1999
[3] Jobson J.D, Applied Multivariate Data Analysis, Volume II, Springer, New York, 1992

[4] Kaski S, Dimensionality Reduction by Random Mapping: Fast Similarity Computation for Clustering, IEEE International Joint Conference on Neural Networks, Anchorage, Alaska, May 4-9, 1998
[5] Kohonen T, Kaski S, Lagus K, Salojärvi J, Honkela J, Paatero V, Saarela A, Self organization of a massive document collection, IEEE Transactions on Neural Networks, vol. 11, no. 3, May 2000
[6] Kohonen T, Self-Organizing Maps, Third edition, Springer, Berlin, 2000
[7] Liu C, Wechsler H, Face Recognition Using Shape and Texture, CVPR'99, Fort Collins, Colorado, June 23-25, 1999
[8] Oja E, Kaski S, Kohonen Maps, Elsevier, Amsterdam, 1999
[9] Theodoridis S, Koutroumbas K, Pattern Recognition, Academic Press, San Diego, 1998
[10] Ultsch A, Unified Matrix (U-matrix) Methods, http://www.mathematik.uni-marburg.de/~ultsch/umatrix/umatrix.html, 1999
[11] Võhandu L, Krusberg H, A Direct Factor Analysis Method, The Proceedings of TTU, 426, 1977, pp. 11-21