Learning algorithms and temporal structures for SOM


Learning algorithms and temporal structures for SOM

Qian Wu

November 2013

A report submitted for the degree of Master of Computing of the Australian National University

Supervisors: Prof. Tom Gedeon & Leana Copeland

Acknowledgements

I would like to sincerely thank my supervisors Professor Tom Gedeon and Leana Copeland for their constant guidance and assistance throughout the research process. Their expertise, patience, enthusiasm and friendship were highly appreciated and encouraged me greatly.

Abstract

Self-Organising Maps (SOMs) have been shown to be an effective unsupervised learning method for analysing complex data sets, especially high dimensional ones. SOMs are widely used to perform clustering, dimensionality reduction and topological mapping, thus enabling visualisation and analysis of multi-dimensional data. The Kangas Map is an approach that narrows down the search space of a SOM and can ideally improve the performance and outcome of the original SOM. This research aims to apply the Kangas Map to SOMs so that time-dependent data sets can be clustered, visualised and analysed more efficiently and effectively. It proposes a novel approach for time-series learning combining Multiple Spherical SOMs and the Kangas Map. A unified testing framework is used to thoroughly test the learning capabilities. The experimental results show that the Kangas map with radius R provides some improvement in some cases, with further work required to identify the interactions with other parameter settings such as the metric used and the sizes and numbers of spheres.

Key words: Self-Organising Map, spherical, time-series learning, Kangas Map

List of Abbreviations

SOM: Self-Organizing Map
SSOM: Spherical Self-Organizing Map
MSSOM: Multiple Spherical Self-Organizing Map
BMU: Best Matching Unit
QE: Quantisation Error
TE: Topological Error

Contents

Acknowledgements
Abstract
List of Abbreviations
1. Introduction
  1.1 Project Motivation
  1.2 Project Objectives
  1.3 Contribution
2. Background
  2.1 Unsupervised Learning and Neural Networks
  2.2 Kohonen's SOM
    2.2.1 SOM Algorithm
  2.3 Kangas Map
  2.4 SSOM
    2.4.1 Neighbourhood Structure
    2.4.2 Algorithm
    2.4.3 Error Measures for Testing
3. Method
  3.1 Initializing
  3.2 Restructuring
    3.2.1 Self Metric
    3.2.2 Distance Metric
    3.2.3 Integer Metric
  3.3 Training
    3.3.1 Data Order and Sphere Targeting
    3.3.2 Kangas Map
  3.4 Testing
4. Experiments and Results
  4.1 ECSH Data Set Description
  4.2 Experiment
    4.2.1 Quantization Error
    4.2.2 Topological Error
  4.3 Problems existing in previous work
    4.3.1 Redundant Neighbours
    4.3.2 Missing Neighbours
5. Conclusion and Future Work
  5.1 Conclusion
  5.2 Future Work
References

1. Introduction

1.1 Project Motivation

As a type of unsupervised learning, self-organising maps have been widely and effectively used as a tool for a range of tasks such as cluster analysis and visualisation of high dimensional data spaces. Researchers have proposed different approaches to analysing time-dependent datasets using SOMs. One approach, investigated in [1], [2], [3], is to apply time-series analysis to multiple Spherical Self-Organizing Maps for visualisation of time-dependent and high dimensional datasets. The previous research has shown limited but promising results. One of many possible improvements is to enhance the multiple Spherical Self-Organizing Maps with Kangas maps [14], which limit the search space when selecting the best matching units.

1.2 Project Objectives

The objectives of this project are to re-implement and expand the existing work of [1], [2], [3] with a better time-series approach, and to test the performance of the learning algorithm(s) using existing testing methods to see how much improvement has been achieved.

1.3 Contribution

The work of [1], [2], [3] has been revised, modified and expanded to incorporate the Kangas Map approach, facilitating the analysis of the proposed MSSOM time-series learning algorithm.

2. Background

2.1 Unsupervised Learning and Neural Networks

Machine learning algorithms are typically subdivided into three main classes, namely supervised learning, unsupervised learning and reinforcement learning [5]. In supervised learning, input data is presented to the models along with known outputs. The models adjust their settings during the process of categorisation and prediction of patterns against known outputs as a form of learning. The algorithm seeks to build predictor models that generate reasonable predictions for the response to new data. Reinforcement learning differs from supervised learning in that an agent is connected to its environment via perception and action. The agent learns from the consequences of its actions rather than from previously known target

outputs, and it selects its actions based on its experiences along with new choices. The agent seeks to learn to select actions that maximise the accumulated reward over time, guided by a reinforcement signal which encodes the success of an action's outcome. The third class is unsupervised learning, where no previously known outputs are needed during training. Unsupervised algorithms seek out inherent similarities among input patterns in order to categorise them into groups, which can be viewed as a process of developing classification labels automatically. Unsupervised learning is often used to reduce the dimensionality of complex patterns, to cluster high dimensional datasets and to facilitate visualisation tasks.

Neural Networks (NNs) are a computational model inspired by certain human brain structures and functions. They are constructed from interconnected groups of artificial neurons. By adjusting the neuron connections or weights, the model changes its structure based on previously known output patterns and intrinsic correlations hidden in the input patterns themselves. A network normally consists of two or more layers.

2.2 Kohonen's SOM

The term Self-Organising Map (SOM) normally refers to the Kohonen Self-Organising Map, first proposed by Teuvo Kohonen in 1982 [6]. It is a mature unsupervised learning algorithm with multiple functions such as clustering, dimensionality reduction, data visualisation, sampling and vector quantisation. It has been widely used to solve practical problems across disciplines and industry domains. The Self-Organising Map performs data clustering and visualisation by mapping the data into a lower dimensional feature space, normally of fewer than three dimensions. Provided with a similarity measure or distance metric, a SOM ensures that patterns that are close together in the input space of a dataset remain close together in the feature space. Thus a SOM preserves much of the neighbourhood relationships of the neurons and performs a topological mapping of the data at a lower dimension that is readily and easily visualised. Figure 1 shows the most common 2-D map.

Figure 1: 2D Self-Organising Map showing both the input and feature spaces

Figure 1 displays a typical 2-D, 2-layer SOM. It consists of a layer of input nodes and a feature layer. Each node in the input layer represents an input data pattern vector, and each node in the feature map contains a weight vector of the same dimension as the input patterns. It should be noted that the neurons in the feature space can be arranged in grid, hexagonal, random and other topologies [8].

2.2.1 SOM Algorithm

This part presents the basic algorithm of a conventional SOM. To begin with, the weight vectors of the neurons in the feature space are usually randomly initialized with small values and slight variations. The input patterns are typically fed to the feature map in random order, one after another. The learning process is then made up of three main phases: competition, cooperation and adaptation.

Competition

Using a distance metric, the distance between an input vector and each output neuron is calculated. Although Euclidean distance is most often adopted, the choice of distance metric varies between problems. The winning neuron or best matching unit (BMU) is defined as the neuron closest to the input pattern and is calculated as follows:

i(x) = \arg\min_j \| x - w_j \|    (2.1)

Here i(x) represents the index of the BMU given an input vector x, and w_j is the j-th weight vector.
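As a concrete illustration of the competition phase, the following sketch (illustrative Python/NumPy, not the project's own code) selects the BMU by Euclidean distance as in equation (2.1):

```python
import numpy as np

def find_bmu(x, weights):
    """Competition phase (equation 2.1): return the index of the
    best matching unit for one input pattern.

    x       -- input pattern, shape (d,)
    weights -- neuron weight vectors, shape (n_neurons, d)
    """
    # Euclidean distance from the pattern to every neuron's weight vector
    distances = np.linalg.norm(weights - x, axis=1)
    return int(np.argmin(distances))

# Toy example: 100 neurons with 7-dimensional weights
rng = np.random.default_rng(0)
weights = rng.normal(size=(100, 7))
print(find_bmu(rng.normal(size=7), weights))
```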

Cooperation

After the winning neuron is identified, the neurons within a certain distance of the BMU form its neighbourhood, which is normally set by a distance function. The distance function given below calculates the proximity of every neuron relative to the BMU, with higher values representing neurons closer to the BMU. The function must be symmetric and decrease monotonically [9]:

h_{j,i(x)} = \exp\left( -\frac{d_{j,i(x)}^2}{2\sigma^2} \right)    (2.2)

Here h_{j,i(x)} defines the topological area centred around the winning neuron, d_{j,i(x)} is the lateral distance between the winning neuron and the cooperating neuron, and σ is the radius of influence.

Adaptation

It is in this phase that the weights of the BMU itself and all neurons in its neighbourhood are updated so that they resemble the input vector more closely. It should be noted that the amount of adaptation is proportional to the distance function. The following formula states how the weights of each neuron in the neighbourhood of the BMU are updated:

w_j(t+1) = w_j(t) + \eta(t) \, h_{j,i(x)}(t) \, (x - w_j(t))    (2.3)

Here i(x) is the index of the winning neuron, w_j(t+1) and w_j(t) are the new and old weights of neuron j respectively, and η is a user supplied learning rate. These three phases are repeated during the training process until the updates to the neurons become small enough to stop training.

2.3 Kangas Map

The conventional Kohonen map as discussed above does not incorporate time-dependent learning. The Kangas map is one of many approaches to modifying the original Kohonen activation function so that temporal data can be processed with a SOM. In a Kangas map, instead of considering all the units as candidates for the BMU in each iteration, only those which are in the neighbourhood of the last best match are considered [14]. This approach is expected to be much faster than a basic SOM, because only selected units have their distances to the input calculated. The Kangas map uses the core SOM concepts of neighbourhood and topological ordering to encode temporal dependency [14].
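The restriction can be sketched as follows (a minimal Python reconstruction, not code from this project; the `neighbours` mapping is an assumed structure standing in for the stored neighbourhood lists):

```python
import numpy as np

def kangas_bmu(x, weights, neighbours, last_bmu=None):
    """Kangas-map BMU selection: restrict the candidate set to the
    previous BMU and the units inside its neighbourhood, instead of
    searching the whole map.

    neighbours -- dict mapping a neuron index to a list of the indices
                  within the chosen searching distance (assumed layout)
    last_bmu   -- BMU of the previous input pattern, or None at the start
    """
    if last_bmu is None:
        candidates = np.arange(len(weights))   # first pattern: global search
    else:
        candidates = np.array([last_bmu] + list(neighbours[last_bmu]))
    d = np.linalg.norm(weights[candidates] - x, axis=1)
    return int(candidates[np.argmin(d)])
```

Because consecutive patterns in a time series tend to be similar, the previous BMU's neighbourhood is usually where the next BMU lies, which is what makes the restricted search both faster and temporally informed.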

2.4 SSOM

Conventional Kohonen Self-Organising Maps come with some disadvantages. One of them is that the neurons on the edge of the map have fewer neighbours than interior neurons, which is known as the border effect. This causes the neurons on the borders to have fewer chances to be updated. One solution to this problem is the Spherical SOM (SSOM), first introduced by Ritter [10]. SSOMs effectively eliminate the border effect because all the neurons receive equal geometrical treatment. Spherical maps also improve visualisation, because human beings are used to reading meaning (e.g. maps) from the surfaces of spheres.

Several types of SSOM have been proposed and experimented with on different datasets. GeoSOM, S-SOM, 3D-SOM and H-SOM are the most acknowledged SSOM topologies. The advantages and disadvantages of these spherical SOMs are discussed in detail by Wu & Takatsuka [7]. This research is mainly based on the S-SOM proposed by Sangole & Leontitsis [6]. In an S-SOM every grid unit stores the list of its immediate neighbours. The size of the data structure used to store the adjacency information scales poorly with the sphere size (number of recursive subdivisions), but the lookup time is improved as a result [10].

2.4.1 Neighbourhood Structure

In an S-SOM the output space is formed by a set of predefined grid units generated by recursive subdivision of an icosahedron. The number of neurons in the output space grows exponentially with the number of recursive subdivisions d:

N = 10 \cdot 4^d + 2    (3.1)

The twelve original neurons of the icosahedron structure have 5 direct neighbours, whereas all other units have 6 adjacent points. All directly adjacent neurons are regarded as being separated by distance 1 in terms of the SOM algorithm. An icosahedron, as well as one recursively subdivided 1 and 4 times respectively, is shown below in figure 2.

Figure 2: Icosahedron subdivided 0, 1 and 4 times

As mentioned earlier in this section, the data structure of the map is encoded beforehand. Firstly, an array is created to store the Cartesian coordinates of the neurons, with entry i holding the coordinate of the neuron with index i. Then a 2D cell array is created to encode the neighbourhood relationships, with cell {i, d} storing all the neighbours at distance d from the neuron with index i.
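A sketch of the neuron count and of building such distance rings (an assumed Python reconstruction; the project itself stores the rings in cell arrays):

```python
def neurons_after_subdivision(d):
    """Vertices of an icosahedron recursively subdivided d times
    (equation 3.1): 12, 42, 162, 642, ... for d = 0, 1, 2, 3."""
    return 10 * 4 ** d + 2

def distance_rings(adjacency, i, max_dist):
    """Build rings[r] = all neurons exactly r steps from neuron i, by
    breadth-first search over the immediate-neighbour lists. This
    mirrors the 2D cell array {i, r} described above."""
    rings = {0: [i]}
    seen = {i}
    frontier = [i]
    for r in range(1, max_dist + 1):
        nxt = []
        for u in frontier:
            for v in adjacency[u]:
                if v not in seen:
                    seen.add(v)
                    nxt.append(v)
        rings[r] = nxt
        frontier = nxt
    return rings

print([neurons_after_subdivision(d) for d in range(4)])  # [12, 42, 162, 642]
```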

2.4.2 Algorithm

Having discussed the topology structure of the S-SOM in the last section, it is necessary to present the training algorithm.

1. Initialize the weight vector for each neuron and set the number of training epochs. Values in each dimension of the vectors are assigned normal random numbers in accordance with the range of the corresponding input dimension.

2. The distances of all the neurons on the map to a randomly selected input pattern are calculated with a chosen distance metric. By comparing all the neurons' distances to the input vector, the best matching unit (BMU), the neuron closest to the input pattern, is located.

3. The weights of the BMU and its neighbours are updated according to the following equations:

w_i(t+1) = w_i(t) + \eta(t) \, h(r, t) \, (x - w_i(t))    (3.2)
h(r, t) = \exp\left( -\frac{r^2}{2\sigma(t)^2} \right)    (3.3)

Here w_i(t) is the original weight of the neuron with index i, x is the current pattern vector and η is the learning rate. h(r, t) is the distance function introduced in section 2.2, where r is the distance between the neuron with index i and the BMU, and σ determines the limit on the fraction of the sphere that can comprise the neighbourhood of a node.

4. By repeating steps 1-3 so that all input patterns are fed to the map and the neurons adapted, an epoch of training is completed. Normally multiple epochs of training are executed to optimise performance.

2.4.3 Error Measures for Testing

As mentioned in section 2.2, the SOM can perform a multitude of functions, and as such multiple error measures are required to quantify learning performance. Two distinct but complementary error measures are introduced here to quantify the quality of a SOM. Error measures for use in SOMs are discussed more completely by [12] and [13].

The Quantisation Error (QE) represents the average difference between each pattern vector and its best matching unit and is calculated as follows:

QE = \frac{1}{N} \sum_{k=1}^{N} \| x_k - w_{c(x_k)} \|    (3.4)

This measure does not, however, take into account the quality of the topological mapping, and as such a second error measure, the Topological Error (TE), is introduced. The TE is the fraction of patterns for which the first and second best matching units are not adjacent on the map:

TE = \frac{1}{N} \sum_{k=1}^{N} u(x_k), \quad u(x_k) = \begin{cases} 1 & \text{if the first and second BMUs of } x_k \text{ are not adjacent} \\ 0 & \text{otherwise} \end{cases}    (3.5)
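Both measures are straightforward to compute; the sketch below is a hedged Python/NumPy reading of equations (3.4) and (3.5), where `neighbours[i]` is assumed to list the immediate neighbours of neuron i:

```python
import numpy as np

def quantisation_error(patterns, weights):
    """QE (equation 3.4): mean distance from each pattern to its BMU."""
    d = np.linalg.norm(patterns[:, None, :] - weights[None, :, :], axis=2)
    return float(d.min(axis=1).mean())

def topological_error(patterns, weights, neighbours):
    """TE (equation 3.5): fraction of patterns whose first and second
    BMUs are not directly adjacent on the map."""
    d = np.linalg.norm(patterns[:, None, :] - weights[None, :, :], axis=2)
    order = np.argsort(d, axis=1)             # neuron indices, nearest first
    misses = [0 if order[k, 1] in neighbours[order[k, 0]] else 1
              for k in range(len(patterns))]
    return float(np.mean(misses))
```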

3. Method

The methods that use SOMs to visualise and analyse time-dependent data can be grouped into three categories. The first category includes studies that analyse and then modify the original datasets to incorporate the time characteristics. This can be viewed as a form of pre-processing of the dataset; no change is made to the map itself. The second category refers to those that modify the settings during the training process, such as changing the activation function and learning rule to take the time factor into consideration. The third group includes all the methods that modify the topology of SOMs, by using multiple SOMs or through some appropriate use of feedback connectivity.

This project is a continuing effort on the novel approach to time-series learning using multiple S-SOMs (MSSOM) based on the work done by [1], [2] and [3]. Both the learning algorithm and the network topology are modified. Part of the effort of this research is to apply the Kangas map, a technique that decreases the search space for each BMU, to the existing MSSOM structure. The general architecture of an MSSOM computation is composed of four modules: the initialization module, restructuring module, training module and test suite module.

3.1 Initializing

All input patterns and possible single S-SOM structures are saved as variables in the workspace before the experiments start. In this module both the input data and the S-SOM structures are loaded.

3.2 Restructuring

The MSSOM structure is formed from several single S-SOMs. The restructuring process first modifies the indexes and coordinates of the units on each of the S-SOMs. Then the neighbourhood of each neuron is updated. Three inter-sphere connection metrics, namely Self Connection, Distance Connection and Integer Connection, are discussed in detail later in this section. The different connection metrics, along with various other training parameters, are expected to bring about different learning behaviours.

3.2.1 Self Metric

Of the three inter-sphere connection metrics, the self metric is the one with the fewest connections between different spheres. Besides its neighbours on the same sphere, every neuron is also connected to the equivalent neurons on a certain number of other spheres. The distances of all inter-sphere connections are assumed to be 1. The topology of the spheres can be interpreted as multiple concentric spheres [1]. The parameter of this connection metric is the number of spheres being connected, which must be an even number between 2 and the number of spheres minus 1. Other than the number of spheres being connected, the parameter can also be expressed as a fraction of the total number of spheres.

3.2.2 Distance Metric

The distance metric was proposed by Songwen Zha [2] as a method to broaden the neighbourhood boundary of a neuron. Although the self metric is easy to understand and implement, it limits the search mainly to one sphere. The distance metric proposes that a more principled distance between spheres should be defined, so that a neuron can have a different number of neighbours from other spheres depending on how close those spheres are. The solution Zha presented is to make the number of steps by which a neuron traverses the hemisphere exactly equal to the number of steps needed to traverse half of the spheres on the Multiple Spheres SOM:

d_{sphere} = \frac{rsize}{\lfloor n/2 \rfloor}    (4.1)

Here rsize is the number of steps needed to traverse the hemisphere and n is the number of spheres. The distance metric enables a decreasing neighbourhood and effectively avoids the wrap-around problems encountered by [1]. Figure 3 below shows an example of a 7-sphere MSSOM with the fourth sphere at the centre.

Figure 3: 7 spheres connected with the distance metric. The red circles show the neighbours of the node at the centre.
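The effect of equation (4.1), reconstructed here under the assumption that crossing half of the spheres costs as many steps as traversing one hemisphere, can be sketched as follows (illustrative Python, not the project's code):

```python
def sphere_step(rsize, n_spheres):
    """Assumed reading of equation (4.1): moving to an adjacent sphere
    costs rsize / (n_spheres // 2) units of map distance."""
    return rsize / (n_spheres // 2)

def remaining_radius(rsize, n_spheres, sphere_offset):
    """Neighbourhood radius left on a sphere that is sphere_offset
    spheres away from the neuron's own sphere: the further the sphere,
    the fewer rings of neighbours it contributes."""
    return max(0.0, rsize - sphere_offset * sphere_step(rsize, n_spheres))

# On a 7-sphere MSSOM with a 9-step hemisphere, a sphere one step away
# keeps a 6-step neighbourhood, two steps away keeps 3, three keeps 0.
print([remaining_radius(9, 7, k) for k in range(4)])  # [9.0, 6.0, 3.0, 0.0]
```

This decreasing contribution per sphere is what gives the metric its shrinking cross-sphere neighbourhood without wrapping around.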

3.2.3 Integer Metric

The integer metric was the initial exploration of inter-sphere connectivity proposed by [1], and can now be regarded as a special case of the distance metric where the inter-sphere distance is set to 1. However, the wrap-around effect, in which the neighbours of a neuron overlap when searching in different directions, will occur if either the sphere size or the number of spheres is not set properly. The unnecessary computation and duplicate neighbours being stored can slow down the training process significantly and potentially cause errors that stop the training. Nevertheless, this metric is included for completeness.

3.3 Training

The training starts after the MSSOM topology has been formed. There are several parameters to be set for training: data order, sphere targeting and searching distance.

3.3.1 Data Order and Sphere Targeting

Data order specifies whether input patterns should be presented to the MSSOM in order or randomly. It can be 'preserved', which means the input patterns are shown to the MSSOM in order, or 'random' otherwise. Sphere targeting determines whether all the neighbours of a neuron should come from the same sphere or can be drawn from multiple spheres. 'Specific' means the neighbours of a neuron are selected from the same sphere, and 'any' means the neighbourhood can be expanded to nearby spheres. If targeting is 'specific', the targeted sphere is rotated in order as the patterns are presented. With 'any', all spheres are candidates from which to select a winning neuron [3].

3.3.2 Kangas Map

As discussed in section 2.3, the Kangas Map is an approach to limiting the search space. It can benefit time-series learning for several reasons. The first is topology preservation. Input patterns next to each other are often similar because of the time characteristic of time-dependent data, so the corresponding BMUs for neighbouring input patterns are also expected to be close if the map truly carries meaning. The Kangas Map favours topology preservation because of its context learning, as discussed in section 2.3. The second possible benefit is increased learning efficiency. To successfully implement the Kangas Map, a suitable searching distance must be defined. A sketch of a training loop combining these parameters is given below.
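The following is a hedged Python sketch with hypothetical parameter names, not the project's implementation; sphere targeting and the full neighbourhood update are omitted for brevity:

```python
import numpy as np

def train_epoch(patterns, weights, neighbours, lr=0.1,
                data_order="preserved", kangas_radius=None):
    """One illustrative training epoch.

    data_order    -- "preserved" keeps the time order of the patterns,
                     "random" shuffles them each epoch
    kangas_radius -- None: search the whole map for every BMU;
                     otherwise restrict candidates to neighbours of the
                     last BMU within this many rings (Kangas map)
    neighbours    -- assumed layout: neighbours[i][r] lists the neurons
                     r rings away from neuron i
    """
    order = np.arange(len(patterns))
    if data_order == "random":
        np.random.shuffle(order)
    last_bmu = None
    for k in order:
        x = patterns[k]
        if kangas_radius is None or last_bmu is None:
            candidates = np.arange(len(weights))
        else:
            ring_ids = [v for r in range(1, kangas_radius + 1)
                        for v in neighbours[last_bmu].get(r, [])]
            candidates = np.array([last_bmu] + ring_ids)
        dist = np.linalg.norm(weights[candidates] - x, axis=1)
        bmu = int(candidates[np.argmin(dist)])
        weights[bmu] += lr * (x - weights[bmu])   # BMU-only update shown
        last_bmu = bmu
    return weights
```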

3.4 Testing

In this project, the performance of the modified MSSOM is evaluated against the Unified Testing Framework proposed by [3]. The framework calculates the Quantisation Error and Topological Error for each combination of the data order and targeting parameters. It is expected that the combination 'preserved, specific' should be the most effective time-series learning algorithm; conversely, 'random, any' is not expected to have sufficient time-series learning capabilities. With different numbers of SSOMs, various connection metrics and many possible searching distances, there are enough experiments and tests for the more suitable learning parameters to be established in the context of this project.

4. Experiments and Results

This section presents all the experiments done with different settings and parameters. First, the dataset is briefly described. Then a set of experiments, along with their performance, are shown in tables and analysed. Finally, some errors existing in the previous framework are pointed out and analysed.

4.1 ECSH Data Set Description

The ECSH (Easy Calm Stressful Hard) data set was produced as part of a larger, as yet unpublished work and has been used previously in [1], [2], [3]. The data represents measurements taken while a person reads four different paragraphs of text intended to be of the types indicated by the title of the data set [3]. There are 3,6 input patterns altogether, each with seven attributes: the x and y coordinates of the eye gaze, the pupil diameters of the left and right eyes, electrocardiography (ECG), galvanic skin response (GSR) and blood pressure readings. The dataset was chosen because it exhibits distinct time-series characteristics and is considered of sufficient size and complexity to be analysed by the MSSOM.

4.2 Experiment

The experiments are carried out on various MSSOMs with different sphere sizes and numbers. All the inter-sphere connection metrics discussed in section 3.2 are covered by training for each MSSOM. There could be many possible experiments, since the combinations of data order, targeting and searching distance are unlimited; however, due to time constraints only the most significant settings are experimented with. The Quantisation Error and Topological Error are calculated in each experiment. The experimental performance can be compared against that of similar parameter settings presented by [3], and the effectiveness of the Kangas Map approach can be tested with this comparison. Table 1 shows the different cases of sphere numbers and sizes which were experimented with by Paget [3].

Table 1: Cases of sphere numbers and sizes considered in this experiment (columns: Case, Number of spheres, Sphere structure, Neurons per sphere, Total neurons)

Here each sphere structure refers to an icosahedron with a given number of recursive subdivisions. The total number of neurons, which is the product of the number of spheres and the neurons per sphere, is about the same for each case. As one of the emphases of this research is to see how well the Kangas Map works on the ECSH data set, it is important that different searching distances are experimented with. What Paget did [3] was, in each iteration, select the BMU from all units on the MSSOM, update the weights for the BMU and its neighbours, and then search for the next BMU among all the units again in the next iteration. The Kangas Map, however, requires the BMU to be selected from the neighbours of the last BMU, so in each iteration the neighbours of the BMU must be stored. Two different neighbourhood areas are experimented with for the Kangas Map: R and R/2, where R is the neighbourhood boundary of each MSSOM. In the tables of the following sections, the best QE and TE for each configuration within each metric are bolded, and the best QE and TE for each configuration over all metrics are in bold italics and coloured red.

4.2.1 Quantization Error

Each table reports the QE for the four combinations of data order (preserved/random) and sphere targeting (any/specific), across the 4-, 5-, 6- and 12-sphere cases.

Table 2: Self Metric, 2 connected spheres, without Kangas Map

Table 3: Self Metric, 2 connected spheres, Kangas Map radius R

Table 4: Self Metric, 2 connected spheres, Kangas Map radius R/2

The results on 4 spheres are consistent in that 'preserved specific' is best, with the absolute best result on 4 spheres coming from the Kangas map with radius R, which suggests that with the Kangas map extension the time-series property of the data is being used. The best result on 4 spheres with the Kangas map not used is also 'preserved specific', so the result is likely to be partly due to the properties of that case. The Kangas map with R/2 performs poorly.

Table 5: Self Metric, 4 connected spheres, without Kangas Map

Table 6: Self Metric, 4 connected spheres, Kangas Map radius R

Table 7: Self Metric, 4 connected spheres, Kangas Map radius R/2

The best 'preserved specific' result, on 5 spheres, comes from the Kangas map with radius R, which suggests that with the Kangas map extension the time-series property of the data is being used. Considering that the best result on 5 spheres with the Kangas map not used is 'random any', the Kangas map should be mostly credited for the time-series learning. The Kangas map with R/2 performs poorly.

Table 8: Self Metric, s/2 connected spheres, without Kangas Map

Table 9: Self Metric, s/2 connected spheres, Kangas Map radius R

Table 10: Self Metric, s/2 connected spheres, Kangas Map radius R/2

With the self metric on s/2 connected spheres, the Kangas map (with radius R) appears to make a difference, in that the best result overall, on 5 spheres, is 'preserved specific', and the best for Kangas map radius R on 12 spheres is also 'preserved specific'. As the Kangas map with R/2 consistently performs worse than with R, the QE for the 12-sphere case is not calculated.

Table 11: Self Metric, all connected spheres, without Kangas Map

Table 12: Self Metric, all connected spheres, Kangas Map radius R

Table 13: Self Metric, all connected spheres, Kangas Map radius R/2

The self metric on all connected spheres is generally best on 'random any', suggesting that the time-series properties are not being learnt. Overall, the result for the Kangas map on the self metric is that radius R works well for some numbers of spheres and numbers of connections; this indicates that there may be some interaction between these parameter settings that could be investigated. As the Kangas map with R/2 consistently performs worse than with R, the QE for the 12-sphere case is not calculated, due to the time limit.

Table 14: Distance Metric, without Kangas Map

Table 15: Distance Metric, Kangas Map radius R

For the distance metric with the Kangas map not used, the best result is consistently 'random any', where the time-series properties are not learnt. For the Kangas map with radius R, the best result for 4 spheres is 'preserved specific', which fits with the self metric on the 2-sphere connection. The

results are otherwise inconsistent.

Table 16: Integer Metric, without Kangas Map

Table 17: Integer Metric, Kangas Map radius R

For the integer metric with the Kangas map not used, the best result is consistently 'random any', where the time-series properties are not learnt. For the Kangas map with radius R, the best result for 4 spheres is 'preserved specific', which fits with the self metric on the 2-sphere connection. The results are otherwise inconsistent.

4.2.2 Topological Error

Each table reports the TE for the four combinations of data order (preserved/random) and sphere targeting (any/specific), across the 4-, 5-, 6- and 12-sphere cases.

Table 18: Self Metric, 2 connected spheres, without Kangas Map

Table 19: Self Metric, 2 connected spheres, Kangas Map radius R

Table 20: Self Metric, 2 connected spheres, Kangas Map radius R/2

Table 21: Self Metric, 4 connected spheres, without Kangas Map

Table 22: Self Metric, 4 connected spheres, Kangas Map radius R

Table 23: Self Metric, 4 connected spheres, Kangas Map radius R/2

Table 24: Self Metric, s/2 connected spheres, without Kangas Map

Table 25: Self Metric, s/2 connected spheres, Kangas Map radius R

Table 26: Self Metric, s/2 connected spheres, Kangas Map radius R/2

Table 27: Self Metric, all connected spheres, without Kangas Map

Table 28: Self Metric, all connected spheres, Kangas Map radius R

Table 29: Self Metric, all connected spheres, Kangas Map radius R/2

The results of the various self metric trials are either inconsistent or favour 'random any', suggesting that the time-series properties are not being learnt.

Table 30: Distance Metric, without Kangas Map

Table 31: Distance Metric, Kangas Map radius R

Table 32: Integer Metric, without Kangas Map

Table 33: Integer Metric, Kangas Map radius R

The Topological Error results are either inconsistent for the Kangas map or weakly show that 'random any' is best. Any benefits of the Kangas map visible in the Quantisation Error are not visible in the Topological Error results. The MSSOMs with a Kangas Map of radius R are no better than those without the Kangas Map, whereas those with a Kangas Map of radius R/2 perform badly. This is as expected based on the previous results by Paget [3].

4.3 Problems existing in previous work

When the experiments were executed, some problems in the previous work were discovered. Despite substantial effort, there was not enough time to solve them, owing to the time available and the lateness of detecting these relatively subtle but significant problems. They are presented and discussed in this section, as this will be valuable for future research.

4.3.1 Redundant Neighbours

The first problem is the duplicate storage of neighbours in the distance and integer connection metrics. For example, in one of the structures discussed in section 4.2, the MSSOM has 4 spheres, each with 642 neurons. As a sphere can only be connected to an even number of other spheres, each unit can only have neighbours from the two spheres next to it in a 4-sphere MSSOM. The correct connection pattern is shown in figure 4, whose numbers are the indexes of the starting neurons on the different spheres.

Figure 4: Example of correct connection

C{1,2}, the set of neighbours at distance 2 from neuron 1, is (3,, 6, 9, 50, 83, 86, 26, 269, 2, 826, 905, 908, 909, 90, 20, 289, 292, 293, 29, 285, 285). This is clearly not correct

because neuron 285 is stored twice. Since neuron 1 should not be connected with any neuron on the 4th sphere, neuron 285 is not supposed to be its neighbour at all, yet here it is stored as a neighbour of neuron 1, and saved twice. The reason is that the 1st sphere is wrongly connected with the 4th sphere, so neuron 285 is counted as a distance-2 neighbour of neuron 1 when searching from both the left and the right direction. The actual connections from the previous work are shown in figure 5.

Figure 5: Example of wrong connection

The above is a very simple case of the problem, but the neighbourhood range can exceed one hundred in some structures, producing a great deal of redundancy if each neuron carries unnecessary and wrong neighbours at all distances. This caused errors during training using the Kangas map. As the Kangas Map requires all the neighbours of a BMU to be saved in each iteration, it is important to ensure that no neighbour is saved repeatedly. Thus the index of each potential neighbour is compared with the indexes of all saved neighbours and is only added to the list when it has not been added before. This large number of comparisons slows down the training significantly, and would not have been necessary if the neighbourhood structure were correct.

4.3.2 Missing Neighbours

In some cases the neighbours stored in the MSSOM structure are incomplete. For example, in one of the structures, Cnew{1,7}, the set of neighbours at distance 7 from neuron 1, is (7, 8, 88, 233, 23, 289, 290, 292, 3, 37, 38, 35, 36, 353, 356, 357, 359, 387, 389, 390, 30, 99, 50, 505, 506, 522, 529, 530, 53, 58, 582, 583, 589, 592, 593). All the neighbours listed are from the same sphere; the neighbours from the other two spheres are missing. As the distance between neuron 1 and neuron 643 is 1, all the neighbours at distance 6 from neuron 643 on the 2nd sphere should be at distance 7 from neuron 1. Similarly, all the neighbours at distance 6 from neuron 1927 on the 4th sphere should be at distance 7 from neuron 1 as well. There are in fact many more neighbours missing from Cnew{1,7}. The missing neighbours certainly affect experimental performance significantly, as not all neurons are updated in each epoch. This is the more serious problem: the results of the previous work should be redone after the network structure is corrected, as conclusions about whether, and how well, the various distance metrics work may be incorrect. Nevertheless, comparisons between the same settings (same number of spheres and same distance metric) remain meaningful, as both the previous and the current implementations share this fault. A mechanical check for both faults is sketched below.
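This is a hedged Python sketch, not the project's code; it assumes a hypothetical layout where C[i][r] lists the neighbours of neuron i at ring distance r, mirroring the cell arrays described above:

```python
def check_neighbourhood(C):
    """Flag the two problems of section 4.3: duplicated entries
    (redundant neighbours) and asymmetric entries (a missing reverse
    link implies missing neighbours on one side)."""
    problems = []
    for i, rings in C.items():
        for r, ring in rings.items():
            if len(ring) != len(set(ring)):
                problems.append((i, r, "duplicate neighbours"))
            for j in ring:
                if i not in C[j].get(r, []):
                    problems.append((i, r, f"neuron {j} lacks reverse link"))
    return problems

# Tiny example reproducing the redundancy: neuron 0 stores neuron 1 twice.
C = {0: {1: [1, 1]}, 1: {1: [0]}}
print(check_neighbourhood(C))  # [(0, 1, 'duplicate neighbours')]
```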

5. Conclusion and Future Work

5.1 Conclusion

This research is an effort to explore the effectiveness of analysing complex time-series data with an MSSOM and the Kangas map. In this project the previous work of Paget [3] has been modified to incorporate the Kangas map algorithm. The MSSOM structures and testing framework proposed by Paget [3] have been reused and tested against in this work to facilitate comparison with his previous experimental results. The experiments explored two possible searching distances for the Kangas Map. Compared with MSSOMs without the Kangas Map algorithm, the ones with a Kangas map of neighbourhood R occasionally produce a lower QE on 'preserved specific', but have shown no improvement in TE at all. With a Kangas map of neighbourhood R/2, the performance is always worse than that of MSSOMs without the Kangas Map. Overall, the Kangas map has not proved effective when learning the ECSH dataset. Two major problems in the previous work were discovered during the experiment phase, and they are likely to have had a detrimental impact on performance. Although there was not enough time to solve both of the problems identified, it is imperative that they be fixed in future research in order to obtain more sensible and convincing results.

5.2 Future Work

Future work should first of all fix the redundant and missing neighbourhood problems. It is difficult for any work to be taken seriously if the neighbourhood structure is not correct, so any future work based on these MSSOM structures should fix these two problems before moving on to new research. As an initial exploration of the Kangas Map algorithm, this research experimented with only two Kangas map searching distances; with more suitable parameters, the Kangas map may be more effective for time-series learning. In addition, given the limited time, this research did not take into consideration the number of epochs required to reach convergence. Later work should use more epochs in each experiment.

References

[1] H. Wu, Spherical Topology Self-Organising Map Neuron Network for Visualisation of Complex Data, report submitted for the degree of Master of Computing, Australian National University (2011).

[2] S. Zha, Multiple Spheres SOM, report submitted for the degree of Master of Computing, Australian National University (2012).

[3] L. Paget, Distance Metrics and Learning Algorithms for Time-Series Analysis using Multiple Sphere Self Organising Maps, report submitted for the degree of Master of Computing, Australian National University (2012).

[4] A. Sangole, G.K. Knopf, Visualization of randomly ordered numeric data sets using spherical self-organizing feature maps, Computers & Graphics 27 (2003).

[5] S.J. Russell, P. Norvig, E. Davis, Artificial Intelligence: A Modern Approach, Prentice Hall, Upper Saddle River, NJ, 2010.

[6] T. Kohonen, Self-organized formation of topologically correct feature maps, Biological Cybernetics 43 (1982) 59-69.

[7] T. Kohonen, Self-Organizing Maps, Springer Series in Information Sciences, Springer, Berlin, 1995.

[8] T. Kohonen, J. Hynninen, J. Kangas, J. Laaksonen, SOM_PAK: The self-organizing map program package, Report A31, Helsinki University of Technology, Laboratory of Computer and Information Science (1996).

[9] S. Haykin, Neural Networks: A Comprehensive Foundation (1998).

[10] H. Ritter, Self-organizing maps on non-Euclidean spaces, in: Kohonen Maps (1999).

[11] Y. Sugii, H. Satoh, D. Yu, Y. Matsuura, H. Tokutaka, M. Seno, Spherical self-organizing map as a helpful tool to identify category-specific cell surface markers, Biochemical and Biophysical Research Communications 376 (2008).

[12] J.S. Kirk, J.M. Zurada, Algorithms for improved topology preservation in self-organizing maps, in: Proceedings of the 1999 IEEE International Conference on Systems, Man, and Cybernetics (SMC'99), IEEE, 1999.

[13] E.A. Uriarte, F.D. Martín, Topology preservation in SOM, International Journal of Applied Mathematics and Computer Sciences 1 (2005) 19-22.

[14] G. Guimarães, V.S. Lobo, F. Moura-Pires, A taxonomy of self-organizing maps for temporal sequence processing, Intelligent Data Analysis 7(4) (2003).


More information

INTERNATIONAL CONFERENCE ON ENGINEERING DESIGN ICED 05 MELBOURNE, AUGUST 15-18, 2005

INTERNATIONAL CONFERENCE ON ENGINEERING DESIGN ICED 05 MELBOURNE, AUGUST 15-18, 2005 INTERNATIONAL CONFERENCE ON ENGINEERING DESIGN ICED MELBOURNE, AUGUST -, METHOD USING A SELF-ORGANISING MAP FOR DRIVER CLASSIFI- CATION AS A PRECONDITION FOR CUSTOMER ORIENTED DESIGN Albert Albers and

More information

A method for comparing self-organizing maps: case studies of banking and linguistic data

A method for comparing self-organizing maps: case studies of banking and linguistic data A method for comparing self-organizing maps: case studies of banking and linguistic data Toomas Kirt 1, Ene Vainik 2, Leo Võhandu 3 1 Institute of Cybernetics at Tallinn University of Technology, Akadeemia

More information

Rapid Simultaneous Learning of Multiple Behaviours with a Mobile Robot

Rapid Simultaneous Learning of Multiple Behaviours with a Mobile Robot Rapid Simultaneous Learning of Multiple Behaviours with a Mobile Robot Koren Ward School of Information Technology and Computer Science University of Wollongong koren@uow.edu.au www.uow.edu.au/~koren Abstract

More information

Advanced visualization techniques for Self-Organizing Maps with graph-based methods

Advanced visualization techniques for Self-Organizing Maps with graph-based methods Advanced visualization techniques for Self-Organizing Maps with graph-based methods Georg Pölzlbauer 1, Andreas Rauber 1, and Michael Dittenbach 2 1 Department of Software Technology Vienna University

More information

More Efficient Classification of Web Content Using Graph Sampling

More Efficient Classification of Web Content Using Graph Sampling More Efficient Classification of Web Content Using Graph Sampling Chris Bennett Department of Computer Science University of Georgia Athens, Georgia, USA 30602 bennett@cs.uga.edu Abstract In mining information

More information

CHAPTER FOUR NEURAL NETWORK SELF- ORGANIZING MAP

CHAPTER FOUR NEURAL NETWORK SELF- ORGANIZING MAP 96 CHAPTER FOUR NEURAL NETWORK SELF- ORGANIZING MAP 97 4.1 INTRODUCTION Neural networks have been successfully applied by many authors in solving pattern recognition problems. Unsupervised classification

More information

Cluster Analysis using Spherical SOM

Cluster Analysis using Spherical SOM Cluster Analysis using Spherical SOM H. Tokutaka 1, P.K. Kihato 2, K. Fujimura 2 and M. Ohkita 2 1) SOM Japan Co-LTD, 2) Electrical and Electronic Department, Tottori University Email: {tokutaka@somj.com,

More information

Applying Neural Network Architecture for Inverse Kinematics Problem in Robotics

Applying Neural Network Architecture for Inverse Kinematics Problem in Robotics J. Software Engineering & Applications, 2010, 3: 230-239 doi:10.4236/jsea.2010.33028 Published Online March 2010 (http://www.scirp.org/journal/jsea) Applying Neural Network Architecture for Inverse Kinematics

More information

Simulation of WSN in NetSim Clustering using Self-Organizing Map Neural Network

Simulation of WSN in NetSim Clustering using Self-Organizing Map Neural Network Simulation of WSN in NetSim Clustering using Self-Organizing Map Neural Network Software Recommended: NetSim Standard v11.1 (32/64bit), Visual Studio 2015/2017, MATLAB (32/64 bit) Project Download Link:

More information

Lecture-17: Clustering with K-Means (Contd: DT + Random Forest)

Lecture-17: Clustering with K-Means (Contd: DT + Random Forest) Lecture-17: Clustering with K-Means (Contd: DT + Random Forest) Medha Vidyotma April 24, 2018 1 Contd. Random Forest For Example, if there are 50 scholars who take the measurement of the length of the

More information

Simulation of WSN in NetSim Clustering using Self-Organizing Map Neural Network

Simulation of WSN in NetSim Clustering using Self-Organizing Map Neural Network Simulation of WSN in NetSim Clustering using Self-Organizing Map Neural Network Software Recommended: NetSim Standard v11.0, Visual Studio 2015/2017, MATLAB 2016a Project Download Link: https://github.com/netsim-tetcos/wsn_som_optimization_v11.0/archive/master.zip

More information

Exploratory Data Analysis using Self-Organizing Maps. Madhumanti Ray

Exploratory Data Analysis using Self-Organizing Maps. Madhumanti Ray Exploratory Data Analysis using Self-Organizing Maps Madhumanti Ray Content Introduction Data Analysis methods Self-Organizing Maps Conclusion Visualization of high-dimensional data items Exploratory data

More information

Stability Assessment of Electric Power Systems using Growing Neural Gas and Self-Organizing Maps

Stability Assessment of Electric Power Systems using Growing Neural Gas and Self-Organizing Maps Stability Assessment of Electric Power Systems using Growing Gas and Self-Organizing Maps Christian Rehtanz, Carsten Leder University of Dortmund, 44221 Dortmund, Germany Abstract. Liberalized competitive

More information

Affine Arithmetic Self Organizing Map

Affine Arithmetic Self Organizing Map Affine Arithmetic Self Organizing Map Tony Bazzi Department of Electrical and Systems Engineering Oakland University Rochester, MI 48309, USA Email: tbazzi [AT] oakland.edu Jasser Jasser Department of

More information

A Topography-Preserving Latent Variable Model with Learning Metrics

A Topography-Preserving Latent Variable Model with Learning Metrics A Topography-Preserving Latent Variable Model with Learning Metrics Samuel Kaski and Janne Sinkkonen Helsinki University of Technology Neural Networks Research Centre P.O. Box 5400, FIN-02015 HUT, Finland

More information

Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques

Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques 24 Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques Ruxandra PETRE

More information

Data Mining with Oracle 10g using Clustering and Classification Algorithms Nhamo Mdzingwa September 25, 2005

Data Mining with Oracle 10g using Clustering and Classification Algorithms Nhamo Mdzingwa September 25, 2005 Data Mining with Oracle 10g using Clustering and Classification Algorithms Nhamo Mdzingwa September 25, 2005 Abstract Deciding on which algorithm to use, in terms of which is the most effective and accurate

More information

Line Simplification Using Self-Organizing Maps

Line Simplification Using Self-Organizing Maps Line Simplification Using Self-Organizing Maps Bin Jiang Division of Geomatics, Dept. of Technology and Built Environment, University of Gävle, Sweden. Byron Nakos School of Rural and Surveying Engineering,

More information

Modification of the Growing Neural Gas Algorithm for Cluster Analysis

Modification of the Growing Neural Gas Algorithm for Cluster Analysis Modification of the Growing Neural Gas Algorithm for Cluster Analysis Fernando Canales and Max Chacón Universidad de Santiago de Chile; Depto. de Ingeniería Informática, Avda. Ecuador No 3659 - PoBox 10233;

More information

Artificial neural networks are the paradigm of connectionist systems (connectionism vs. symbolism)

Artificial neural networks are the paradigm of connectionist systems (connectionism vs. symbolism) Artificial Neural Networks Analogy to biological neural systems, the most robust learning systems we know. Attempt to: Understand natural biological systems through computational modeling. Model intelligent

More information

What is a receptive field? Why a sensory neuron has such particular RF How a RF was developed?

What is a receptive field? Why a sensory neuron has such particular RF How a RF was developed? What is a receptive field? Why a sensory neuron has such particular RF How a RF was developed? x 1 x 2 x 3 y f w 1 w 2 w 3 T x y = f (wx i i T ) i y x 1 x 2 x 3 = = E (y y) (y f( wx T)) 2 2 o o i i i

More information

Unsupervised Recursive Sequence Processing

Unsupervised Recursive Sequence Processing Unsupervised Recursive Sequence Processing Marc Strickert, Barbara Hammer Dept. of Math./Comp. Science, University of Osnabrück, Germany e-mail: {marc,hammer}@informatik.uni-osnabrueck.de Abstract. We

More information

An Experiment in Visual Clustering Using Star Glyph Displays

An Experiment in Visual Clustering Using Star Glyph Displays An Experiment in Visual Clustering Using Star Glyph Displays by Hanna Kazhamiaka A Research Paper presented to the University of Waterloo in partial fulfillment of the requirements for the degree of Master

More information

Visualization and Statistical Analysis of Multi Dimensional Data of Wireless Sensor Networks Using Self Organising Maps

Visualization and Statistical Analysis of Multi Dimensional Data of Wireless Sensor Networks Using Self Organising Maps Visualization and Statistical Analysis of Multi Dimensional Data of Wireless Sensor Networks Using Self Organising Maps Thendral Puyalnithi #1, V Madhu Viswanatham *2 School of Computer Science and Engineering,

More information

Data analysis and inference for an industrial deethanizer

Data analysis and inference for an industrial deethanizer Data analysis and inference for an industrial deethanizer Francesco Corona a, Michela Mulas b, Roberto Baratti c and Jose Romagnoli d a Dept. of Information and Computer Science, Helsinki University of

More information

Images Reconstruction using an iterative SOM based algorithm.

Images Reconstruction using an iterative SOM based algorithm. Images Reconstruction using an iterative SOM based algorithm. M.Jouini 1, S.Thiria 2 and M.Crépon 3 * 1- LOCEAN, MMSA team, CNAM University, Paris, France 2- LOCEAN, MMSA team, UVSQ University Paris, France

More information

Indian Institute of Technology Kanpur. Visuomotor Learning Using Image Manifolds: ST-GK Problem

Indian Institute of Technology Kanpur. Visuomotor Learning Using Image Manifolds: ST-GK Problem Indian Institute of Technology Kanpur Introduction to Cognitive Science SE367A Visuomotor Learning Using Image Manifolds: ST-GK Problem Author: Anurag Misra Department of Computer Science and Engineering

More information

Cartographic Selection Using Self-Organizing Maps

Cartographic Selection Using Self-Organizing Maps 1 Cartographic Selection Using Self-Organizing Maps Bin Jiang 1 and Lars Harrie 2 1 Division of Geomatics, Institutionen för Teknik University of Gävle, SE-801 76 Gävle, Sweden e-mail: bin.jiang@hig.se

More information

Evolving SQL Queries for Data Mining

Evolving SQL Queries for Data Mining Evolving SQL Queries for Data Mining Majid Salim and Xin Yao School of Computer Science, The University of Birmingham Edgbaston, Birmingham B15 2TT, UK {msc30mms,x.yao}@cs.bham.ac.uk Abstract. This paper

More information

Applying Supervised Learning

Applying Supervised Learning Applying Supervised Learning When to Consider Supervised Learning A supervised learning algorithm takes a known set of input data (the training set) and known responses to the data (output), and trains

More information

K-Means Clustering Using Localized Histogram Analysis

K-Means Clustering Using Localized Histogram Analysis K-Means Clustering Using Localized Histogram Analysis Michael Bryson University of South Carolina, Department of Computer Science Columbia, SC brysonm@cse.sc.edu Abstract. The first step required for many

More information

Instantaneously trained neural networks with complex inputs

Instantaneously trained neural networks with complex inputs Louisiana State University LSU Digital Commons LSU Master's Theses Graduate School 2003 Instantaneously trained neural networks with complex inputs Pritam Rajagopal Louisiana State University and Agricultural

More information

Machine Learning Methods in Visualisation for Big Data 2018

Machine Learning Methods in Visualisation for Big Data 2018 Machine Learning Methods in Visualisation for Big Data 2018 Daniel Archambault1 Ian Nabney2 Jaakko Peltonen3 1 Swansea University 2 University of Bristol 3 University of Tampere, Aalto University Evaluating

More information

Image Theft Detection with Self-Organising Maps

Image Theft Detection with Self-Organising Maps Image Theft Detection with Self-Organising Maps Philip Prentis, Mats Sjöberg, Markus Koskela, and Jorma Laaksonen Czech Technical University in Prague, Helsinki University of Technology prentphi@fjfi.cvut.cz,

More information

Mineral Exploation Using Neural Netowrks

Mineral Exploation Using Neural Netowrks ABSTRACT I S S N 2277-3061 Mineral Exploation Using Neural Netowrks Aysar A. Abdulrahman University of Sulaimani, Computer Science, Kurdistan Region of Iraq aysser.abdulrahman@univsul.edu.iq Establishing

More information

A Population Based Convergence Criterion for Self-Organizing Maps

A Population Based Convergence Criterion for Self-Organizing Maps A Population Based Convergence Criterion for Self-Organizing Maps Lutz Hamel and Benjamin Ott Department of Computer Science and Statistics, University of Rhode Island, Kingston, RI 02881, USA. Email:

More information

Cluster Analysis. Mu-Chun Su. Department of Computer Science and Information Engineering National Central University 2003/3/11 1

Cluster Analysis. Mu-Chun Su. Department of Computer Science and Information Engineering National Central University 2003/3/11 1 Cluster Analysis Mu-Chun Su Department of Computer Science and Information Engineering National Central University 2003/3/11 1 Introduction Cluster analysis is the formal study of algorithms and methods

More information

Feature weighting using particle swarm optimization for learning vector quantization classifier

Feature weighting using particle swarm optimization for learning vector quantization classifier Journal of Physics: Conference Series PAPER OPEN ACCESS Feature weighting using particle swarm optimization for learning vector quantization classifier To cite this article: A Dongoran et al 2018 J. Phys.:

More information

Review on Methods of Selecting Number of Hidden Nodes in Artificial Neural Network

Review on Methods of Selecting Number of Hidden Nodes in Artificial Neural Network Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 11, November 2014,

More information

Knowledge-Defined Networking: Towards Self-Driving Networks

Knowledge-Defined Networking: Towards Self-Driving Networks Knowledge-Defined Networking: Towards Self-Driving Networks Albert Cabellos (UPC/BarcelonaTech, Spain) albert.cabellos@gmail.com 2nd IFIP/IEEE International Workshop on Analytics for Network and Service

More information

Validation for Data Classification

Validation for Data Classification Validation for Data Classification HILARIO LÓPEZ and IVÁN MACHÓN and EVA FERNÁNDEZ Departamento de Ingeniería Eléctrica, Electrónica de Computadores y Sistemas Universidad de Oviedo Edificio Departamental

More information

turning data into dollars

turning data into dollars turning data into dollars Tom s Ten Data Tips November 2008 Neural Networks Neural Networks (NNs) are sometimes considered the epitome of data mining algorithms. Loosely modeled after the human brain (hence

More information