
JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 25, (2009)

Multi-Clustering Centers Approach to Enhancing the Performance of SOM Clustering Ability

CHING-HWANG WANG AND CHIH-HAN KAO*
Department of Construction Engineering, National Taiwan University of Science and Technology, Taipei, 106 Taiwan
*Department of Construction Engineering, National Kinmen Institute of Technology, Kinmen, 892 Taiwan

This paper modifies the weight-adjusting mechanism of the Self-Organizing Mapping network (SOM) to solve the problems of topology preserving and of clarifying the cluster boundaries of the clustering graph in clustering analysis. The modified SOM is named the Multiple Clustering Centers SOM (MCC-SOM). The MCC-SOM changes the winner-takes-all competitive learning mechanism to allow more than one clustering center, so that a graph of neighboring clusters with blurred boundaries is refocused on each cluster center and the boundary of each cluster is highlighted. The mechanism also sets the output units of the topology automatically, without manual tuning, and promotes the topology preserving of the graph through the consistency between the weight-adjustment standard and the case. The case studies show that the MCC-SOM improves the performance of the SOM model. With the MCC-SOM, analysts can use the output topology to produce a more precise classification of cases and to raise the accuracy of subsequent predicting or classifying models.

Keywords: clustering analysis, output topology, topology preserving, multiple clustering centers, self-organizing mapping

1. INTRODUCTION

Clustering analysis is an effective data-extraction tool in the data mining process. Most early clustering analyses use the k-means method. In practice, however, the k-means method filters background noise poorly and requires the number of clusters to be preset before calculation [1].
These problems cause clusters with blurred boundaries to be disregarded in the topology, and they make the output imprecise. Several modified clustering algorithms have therefore been proposed, but they still cannot solve the problems of presetting the calculation and of pre-processing the data for the clustering analysis [2]. Vesanto [3] noted that subsequent clustering research adopted an unsupervised neural network, the Self-Organizing Mapping network (denoted as SOM in this paper), as a clustering algorithm. The SOM model is fault tolerant and does not require the number of clusters to be preset. It is more flexible than the other clustering algorithms, and it becomes one of the major algorithms. However, when the SOM model is used for clustering analysis, it still has to solve the problems of topology preserving and of clarifying the boundaries of the clustering graph [4, 5].

Received September 10, 2007; revised April 9 & July 18, 2008; accepted August 22. Communicated by Chung-Yu Wu. *Corresponding author.

When the SOM is applied to construct a predicting model for decision making in construction engineering, the historical construction cases do not share consistent background conditions or the same evaluating standards. The SOM model is usually used to create the desired outputs of the training data; in other words, it pre-processes the data for constructing the predicting model. However, because of the SOM's fault tolerance, its flexible network structure can twist the topology and destroy topology preserving. Moreover, the analysts must use a precise classification result when the training process of the supervised neural network simplifies the calculation. For this reason, the boundary between neighboring clusters needs to be more specific than in other applications of the SOM model. Lo [6] proposed optimizing the learning parameters of the SOM to deal with these problems. The modified mechanisms he cited optimize the learning environment of the SOM's weight adjustment, and they share the goal of promoting the convergence of network learning. Thus, modified weight-adjustment mechanisms are directly related to the promotion of the SOM's topology preserving. To overcome the shortcomings mentioned above, this paper develops a modified SOM model with a new weight-adjustment mechanism. The modified SOM model improves the topology preserving of the SOM and also clarifies the shape of the clustering graph.
Furthermore, it raises the clustering accuracy of the SOM.

2. BASIC CONCEPT OF MODELING

Following the discussion above, this paper surveys the related literature on three issues in order to construct the basic concept of the modified SOM model: the relation between weight adjustment and topology preserving, the feasibility of multiple clustering centers, and the modification of the competitive theory.

First, consider the relation between weight adjustment and topology preserving. Dittenbach [7] proposed a topology that adopts the difference between the feature values of cases and the weights of output units as the evaluating standard for adjusting the structure: the distribution of output units in the topology is adjusted by adding or deleting units. This mechanism makes the distribution of the cases' output points match the original shape of the mapping graph of the input features. Output units are deleted to reduce the influence of regions of the topology that no imported case has touched, and output units are flexibly added to increase the data-handling capacity of units located in high-density zones of the original topology. This mechanism of adjusting the output units positively influences the correctness of the weight adjustment, and likewise improves topology preserving.

Second, attention is directed to the feasibility of multiple clustering centers. From the evaluation model of topology preserving in Vesanto's research [3], this paper concludes the following: toward the goal of promoting topology preserving, the evaluation equation of topology preserving uses multiple benchmarks for measuring topology distance (clustering centers), which shows that multiple clustering centers are possible within a single cluster. In addition, Martinetz [8, 9] proposed the soft-max concept of the fuzzy c-means cluster model. The fuzzy c-means model uses fuzzy theory to set a flexible threshold for electing the clustering center: the clustering center is not limited by the rule that the unit chosen as the center must be absolutely the most similar to the case. This enhances the ability of the clustering center defined by the fuzzy c-means model to represent the original information. The discussion above also shows that the clustering center can be diversified.

Finally, this paper discusses the modification of the competitive theory. DeSieno [11] proposed the conscience mechanism as a modification of the competitive theory. He claimed that the weight adjustment needs to attend to the possibility that the other output units in the same cluster (besides the clustering center) may also represent cases. The weight-adjusting mechanism of the SOM should restrain the influence of the cluster center on the weight adjustment of the other output units, and should increase the influence of all output units. Thus, the standard of the weight-adjustment equation does not come only from the clustering center. Moreover, Si [12] proposed the concept of winner-take-quota, which likewise claims that the treatment of the winner unit needs to be amended.
This avoids ignoring potential output units that also need to adjust their weights. From the algorithms above, this paper confirms that a single cluster with multiple clustering centers can reach the optimized weight combination for the best topology preserving and the optimum output topology. Accordingly, this paper confirms the feasibility of improving the SOM's performance by adopting a multiple-clustering-centers mechanism, and the authors implement this basic concept to construct the Multi-Clustering Center Self-Organizing Mapping model (MCC-SOM). As detailed in Fig. 1, the MCC-SOM model replaces the original competitive theory: the mechanism that adjusts the number of output units is replaced by a mechanism with a fixed number of output units but a flexible number and changeable locations of clustering centers in the topology space. It therefore reduces the influence of the single clustering center on the weight adjustment, and attends to the influence of the other output units. More precisely, this paper constructs the modified SOM model by replacing the SOM's weight-adjustment rule, winner-takes-all. In the original competitive learning theory, the winning output unit is the single clustering center, and this center is the standard against which the topology distance of the other output units in the neighborhood area is measured. This distance-based weight adjustment pulls all output units in the neighborhood area toward the clustering center (as shown in Fig. 2).

Fig. 1. The basic concept of the MCC-SOM model: identify the clustering centers, add the weight-adjusting influence of the other output units with representative ability, reduce the influence strength of the main clustering center, and apply the weight-adjusting mechanism for each location combination of the clustering centers.

Fig. 2. The neighboring area of the single clustering center (main cluster j in topology space A).

Fig. 3. The neighboring area of the multiple clustering centers (main cluster j and sub-cluster j1 in topology space A).

The MCC-SOM model uses the difference between the weights of the output unit and the feature values of cases as the electing threshold of the sub-clustering center. The output units that are the sub best matching units, whose weight values are close to the feature values of cases, add a further reference for adjusting the weights of the other output units in the neighborhood area (as shown in Fig. 3). The modified competitive theory with multiple clustering centers makes the weight adjustment of the output units conform better to the characteristics of the cases. Moreover, the modified weight-adjusting mechanism has the advantage of a higher level of integration among the output units, and it avoids adding too many output units. Excessive output units increase the computing cost and spread the output points of the cases widely, leading to clusters with blurred boundaries. The mechanism of multiple clustering centers calculates the multiple standards of weight adjustment at the same time.

3. PROCESS OF MODEL CONSTRUCTION

This paper establishes the modified competitive theory with multiple clustering centers as the new basic concept of the SOM's weight adjustment, so as to construct the MCC-SOM model. Fig. 4 shows the calculation process of the MCC-SOM model:

1. Beginning: the initial setting of the parameters (number of dimensions, number of output units, shape of topology, initial weight values, radius of the neighborhood area).
2. Identification of the clustering center: calculate the EI and net of each output unit, examine the threshold, cite the clustering center, and elect the sub-clustering centers.
3. Determine the relative locations of the clustering centers: either calculate the topology distance to each clustering center, or calculate the location of the virtual clustering center and its topology distance.
4. Weight adjusting: calculate the topology distance and the weight adjustment; reduce the radius of the neighborhood area.
5. Examine the termination criterion: stop when the epoch equals the preset value.
6. Analyze unknown cases and output the recognized clusters.

Fig. 4. The flow chart of calculation of the MCC-SOM model.

3.1 Initial Setting Items of the MCC-SOM

In the first process, the MCC-SOM model sets the output type of the topology; the parameters are the number of dimensions and the coordinate system of the topology. The topology space is usually presented in two dimensions, and this paper uses the two-dimensional coordinate (m, n) as the code of an output unit in the topology. Additionally, the model selects the shape of the topology, the number of output units, etc. Next, the model sets the radius σ and the reducing function of the neighborhood to converge the learning of the weight adjustment. The model also selects the measuring function of topology distance as an important variable of the weight adjustment. Finally, the model sets the initial value of the kth weight of the output unit at coordinate (m, n) (denoted as w_kmn in this paper). The authors use W_mn = (w_1mn, w_2mn, …, w_kmn) to present the weight group of the output unit. Moreover, the model normalizes the input values of cases into feature vectors (denoted as X = [x_1, x_2, …, x_k]). Those feature vectors are imported into the calculation of the MCC-SOM model. The initial parameters of the MCC-SOM model are shown in Table 1; they are the basis of the following calculations of the MCC-SOM model.

Table 1. The initial parameters of the MCC-SOM model.
Item                                                   Value
Amount of the inputting layer                          1
Amount of feature vector                               12
Amount of adding vector                                0 ~ 1
Amount of the outputting layer                         1
Range of the output                                    0 ~ 1
Topology function                                      Grid topology
Amount of the hidden layer                             0
Topology distance function                             Euclidean
Radius of neighboring area                             1
Amount of the output unit                              6 * 6
Reducing function of the radius of neighboring area    Gauss
Threshold of sub-clustering center electing            1.25 * net
Amount of weight                                       36 * 12
Learning rate                                          0.9
Amount of iteration

3.2 Electing Terms of the Clustering Center and Sub-Clustering Center

This paper develops the MCC-SOM model by modifying the electing rule of the clustering center in the competitive theory. The model imports the feature values of the samples and calculates the difference between the weights of each output unit and the feature values (denoted as net in this paper). The sum of the net values of one output unit is used as an index (denoted as EI in this paper) for filtering the clustering center. The output unit with the lowest EI value is chosen as the Best Matching Unit (denoted as B.M.U. in this paper). The equation of the clustering center election is expressed in Eq. (1):

(i_1, j_1) = arg min_{m,n} Σ_k |x_k − w_kmn|   (B.M.U.).   (1)

The basic concept of the clustering center is that it is the output unit most similar to the characteristics of the samples: the model selects the output unit with the smallest EI value as the clustering center. Moreover, the model extends this concept to allow more clustering centers among the output units that have the next-lowest EI values. Therefore, the MCC-SOM model selects as sub-clustering centers the output units whose EI values are close to the lowest EI value and below the threshold. The sub-clustering centers are authorized to influence the weight adjustment of the other output units in the same neighborhood area. (To discriminate easily between the clustering center and the sub-clustering center, this paper names the clustering center the main clustering center in the following description.) The equation of the sub-clustering center election is expressed in Eq. (2):

(i_2, j_2) = arg min_{(m,n)≠(i_1,j_1)} Σ_k |x_k − w_kmn|;
if net_k(i_2, j_2) ≤ (1/S) · net_k(i_1, j_1) then Y(i_2, j_2) = 1   (sub-B.M.U.).   (2)

The value of Y in Eq. (2) is a binary index indicating whether the output unit is a clustering center or not. The topology distance between the clustering center and the other output units is the important variable of the following weight adjustment of the MCC-SOM model.

3.3 Weight Adjustment by the Combination of the Main Clustering Center and Sub-Clustering Center

After the clustering centers have been identified, the model calculates the topology distances between each output unit of a cluster and the main clustering center. The equations of the weight adjustment of the output units are then modified to use multiple topology distances when the single clustering center of a cluster is transferred to multiple clustering centers; this is the extending algorithm of the multiple clustering centers.
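As a concrete sketch of the election step of section 3.2 (Eqs. (1) and (2)), the following Python/NumPy fragment computes the EI of every output unit, elects the B.M.U., and elects the sub-clustering centers. The function name is hypothetical, and the reading of the threshold as "EI within a factor 1/S of the minimum" (which matches the 1.25 * net entry of Table 1 when S = 0.8) is an assumption of this illustration, not the authors' code.

```python
import numpy as np

def elect_centers(x, W, S=0.8):
    """Elect the main clustering center (Eq. 1) and sub-clustering centers (Eq. 2).

    x : feature vector of the imported case, shape (K,)
    W : weight grid, shape (M, N, K); W[m, n] is the weight vector of unit (m, n)
    S : sub-center threshold ratio (0.8 in the case study)
    """
    # EI of each output unit: summed absolute difference |x_k - w_kmn|.
    EI = np.abs(W - x).sum(axis=2)
    main = np.unravel_index(int(EI.argmin()), EI.shape)   # B.M.U., Eq. (1)
    # Sub-centers: units whose EI stays within a factor 1/S of the minimum
    # (assumed reading; with S = 0.8 this is the 1.25 * net threshold of Table 1).
    is_sub = EI <= EI[main] / S
    is_sub[main] = False                                  # exclude the main center
    subs = [tuple(idx) for idx in zip(*np.nonzero(is_sub))]
    return main, subs, EI
```

With a 6 * 6 grid and 12-dimensional normalized feature vectors, this step would be repeated for every imported case in every epoch.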
The neighborhood area of a cluster is defined as the area covered by the radius (denoted as σ in this paper) around the main clustering center or the sub-clustering center. The neighborhood area determines whether an output unit needs to adjust its weights or not. The MCC-SOM model evaluates two variables to select the equation of weight adjustment: the topology distance between the two types of clustering centers (denoted as r_ij in this paper), and the radius of the neighborhood area σ. Once the relative locations of the combinations of clustering centers have been defined, there are two types of intersection between the neighborhood area of the main clustering center and that of the sub-clustering center: one where the neighborhood areas almost fully overlap, and one where they do not overlap. The equations for adjusting the weights of the output units differ between the two.

1. Neighborhood Areas that Almost Fully Overlap

When the locations of the main clustering center and the sub-clustering center are very close, this paper defines the standard of the overlapping latitude as a topology distance between the two centers lower than 0.1σ. Because the neighborhood areas of the two clustering centers almost overlap, the difference between the topology distances from the same output unit to each clustering center is small. The model therefore uses the proportional relation of the distances between each clustering center and the same output unit to average the weights of each dimension. These averaged weights define a virtual clustering center that represents the combination of the main clustering center and the sub-clustering center. The weight of the virtual clustering center and the weight adjustments of the output units are expressed in Eq. (3.1):

If [Σ_k (w_k,i1j1 − w_k,i2j2)²]^{1/2} ≤ 0.1σ then
    r_ij = S_1 r_i1j1 + S_2 r_i2j2
    If r_ij ≤ σ then
        W_ij = S_1 W_i1,j1 + S_2 W_i2,j2
        H_ij = exp(− r_ij² / 2σ²)
        W_mn(t + 1) = W_mn(t) + η H_ij [X − W_ij(t)]
    Else W_mn(t + 1) = W_mn(t)
Else go to Eq. (3.2)   (3.1)

In each iteration of the learning process, the model refers to the location of the virtual clustering center to calculate the topology distance and the range of weight adjustment of the output units by Eq. (3.1). When the topology distance between the two types of clustering centers is higher than the preset threshold for selecting the virtual clustering center, the topology distance and the range of weight adjustment are instead calculated by Eq. (3.2), developed in the next section.

2. Neighborhood Areas that are Not Overlapping

When most of the neighborhood areas of the main clustering center and the sub-clustering center are separated, the weight adjustments of the output units in the neighborhood area of a clustering center of the MCC-SOM are the same as in the original SOM. However, the candidate output units of the sub-clustering center are located in the neighborhood area of the main clustering center, so the weight adjustment also needs to consider the influence of the sub-clustering center on the output units located in the overlapping part of the two neighborhood areas. When the model adjusts the weights of output units located in the non-overlapping part of the neighborhood areas, it treats the main clustering center or the sub-clustering center as an independent system. The equations for the main clustering center and the sub-clustering center are expressed in Eq. (3.2):

If r1_ij ≤ σ and r2_ij ≤ σ then W_mn(t + 1) = W_mn(t) + S_1 η_1 H_i1,j1 [X − W_i1,j1(t)] + S_2 η_2 H_i2,j2 [X − W_i2,j2(t)]
If r1_ij ≤ σ and r2_ij > σ then W_mn(t + 1) = W_mn(t) + S_1 η_1 H_i1,j1 [X − W_i1,j1(t)]
If r1_ij > σ and r2_ij ≤ σ then W_mn(t + 1) = W_mn(t) + S_2 η_2 H_i2,j2 [X − W_i2,j2(t)]
Else W_mn(t + 1) = W_mn(t)   (3.2)

When the topology distance between the two types of clustering centers and the radius of the neighborhood area have been evaluated, the MCC-SOM model can select the appropriate equation of weight adjustment.

3.4 Examining Termination Criteria

Because the MCC-SOM model is an unsupervised neural network, the decision to proceed to the next iteration is made by the preset epoch. The termination criterion is expressed in Eq. (4):

If t > t_termin then the learning procedure is terminated.   (4)

If the present epoch is smaller than the preset epoch threshold, the MCC-SOM model randomly imports the feature values of the next case. Additionally, as the epoch increases, the model reduces the radius of the neighborhood area and the learning rate for the next iteration, which helps the learning of the MCC-SOM model converge. The neighborhood function of the next iteration is also recalculated, as expressed in Eq. (5):

H_mn = exp(− r_mn² / 2σ²).   (5)

The reduced range of the neighborhood function is calculated by Eq. (5): the Gaussian of the ratio between the topology distance of the output unit and the radius of the neighborhood determines the reduced range of the neighborhood area.

3.5 Clustering Analysis of Unknown Cases

After the weights of the MCC-SOM model have been adjusted, the feature values of an unknown case are imported into the model. The feature values are compared with the weights of all output units in the topology to find the output unit with the smallest EI value.
The location of the selected output unit is the location where the unknown case is mapped into the topology space, and the output unit indicates the features of the unknown case. The cluster of the unknown case can be identified from the location of this output unit. Additionally, the distributed density of all cases can be presented by the output topology.
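The weight-update mechanics of sections 3.3 and 3.4 can be sketched as follows, applying Eq. (3.1) when the two centers are within 0.1σ of each other and Eq. (3.2) otherwise. All function names and the choice S_1 = S_2 = 0.5 are assumptions of this illustration; the paper leaves the mixing proportions as model parameters.

```python
import numpy as np

def gauss(r, sigma):
    # Neighborhood function of Eq. (5): H = exp(-r^2 / 2*sigma^2).
    return np.exp(-r**2 / (2 * sigma**2))

def adjust_weights(W, x, main, sub, sigma, eta, S1=0.5, S2=0.5):
    """One weight-update step for a main/sub clustering-center pair.
    Chooses Eq. (3.1) (virtual center) or Eq. (3.2) (independent centers)."""
    M, N, _ = W.shape
    w_main, w_sub = W[main].copy(), W[sub].copy()   # snapshot center weights
    overlap = np.linalg.norm(w_main - w_sub) <= 0.1 * sigma
    virt = S1 * np.asarray(main, float) + S2 * np.asarray(sub, float)
    w_virt = S1 * w_main + S2 * w_sub
    for m in range(M):
        for n in range(N):
            p = np.array([m, n], float)
            if overlap:                      # Eq. (3.1): one virtual center
                r = np.linalg.norm(p - virt)
                if r <= sigma:
                    W[m, n] += eta * gauss(r, sigma) * (x - w_virt)
            else:                            # Eq. (3.2): each center acts alone
                r1 = np.linalg.norm(p - np.asarray(main, float))
                r2 = np.linalg.norm(p - np.asarray(sub, float))
                if r1 <= sigma:
                    W[m, n] += S1 * eta * gauss(r1, sigma) * (x - w_main)
                if r2 <= sigma:
                    W[m, n] += S2 * eta * gauss(r2, sigma) * (x - w_sub)
    return W
```

Per section 3.4, a full training run would call this once per imported case and reduce σ and η after each epoch until the preset epoch t_termin of Eq. (4) is reached.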

3.6 Evaluated Indexes of Clustering Performance

To evaluate whether the MCC-SOM model satisfies the data-processing requirements of clustering, this paper needs objective evaluation indexes of clustering performance to measure how much the MCC-SOM model promotes clustering ability. When the SOM model is applied to clustering analysis, there are two requirements: topology preserving and the focusing status of the topology graph of the clusters. First, topology preserving represents the accuracy of the output: the output layer is connected by the weight combination, and the weight combination makes sure that the information is not twisted in the mapping process. This evaluates whether the MCC-SOM model satisfies the topology-preserving requirement of the clustering analysis. Second, the focusing status of the topology graph of a cluster presents the precise location identification of the cluster; it avoids confusion in classifying output points located in the zone between clusters.

This paper implements the AEI to evaluate the topology preserving of the MCC-SOM model. The AEI is the average EI value of all output units in the topology, as expressed in Eq. (6):

AEI = Σ_{m,n} Σ_k |x_k − w_kmn| / (m × n).   (6)

Besides, this paper follows the suggestions of Vesanto [2] and Dimitriadou [14] in implementing the Davies-Bouldin index [15] to evaluate the focusing status of the topology graph of the clusters from the MCC-SOM model. The Davies-Bouldin index is calculated from the topology distance within a cluster and the distance separating clusters. The internal topology distance of a cluster (L) and the separating topology distance between clusters (D) are expressed in Eqs. (7) and (8):
L(C) = Σ_{(x,y)∈C} ||W_xy − W_i1j1|| / (N_c − 1)   (7)
D(C1, C2) = ||W_i1j1^{C1} − W_i1j1^{C2}||   (8)

The Davies-Bouldin index is then calculated from these two types of topology distance:

Davies-Bouldin index = (1/N_c) Σ_c max_{C1≠C2} {[L(C1) + L(C2)] / D(C1, C2)}.   (9)

4. CASE STUDY AND RESULT ANALYSIS

To prove that the MCC-SOM model improves on the SOM model with the single-clustering-center weight-adjusting mechanism, in both the correctness and the explicitness of the clustering result, this paper uses Kohonen's SOM model as the benchmark of the performance comparison, and adopts the financial ratios of large-scale construction contractors in Taiwan as the case study. The number of contractors' financial ratios in the training data is 868. After the case study has been calculated, this paper analyzes the distribution of the output topology of the case study.
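The evaluation indexes of section 3.6 can be sketched as follows. The function names are hypothetical, and two simplifications are assumed: the AEI is averaged over all imported cases as well as over the grid, and the within-cluster distance L is taken as the mean member-to-center distance rather than the 1/(N_c − 1) normalization of Eq. (7).

```python
import numpy as np

def aei(cases, W):
    """Average EI (Eq. 6): the EI of every output unit on the m x n grid,
    averaged over the grid and (an assumption of this sketch) over all cases."""
    m, n, _ = W.shape
    return float(np.mean([np.abs(W - x).sum() for x in cases])) / (m * n)

def davies_bouldin(centers, members):
    """Davies-Bouldin index (Eqs. 7-9). `centers` holds the clustering-center
    weight vectors; `members[c]` holds the member weight vectors of cluster c."""
    Nc = len(centers)
    # L(C): mean distance from each member to its cluster center (cf. Eq. 7).
    L = [float(np.mean([np.linalg.norm(w - c) for w in ws]))
         for c, ws in zip(centers, members)]
    worst = []
    for a in range(Nc):
        # Worst (largest) similarity ratio against every other cluster (Eq. 9).
        worst.append(max((L[a] + L[b]) / np.linalg.norm(centers[a] - centers[b])
                         for b in range(Nc) if b != a))
    return sum(worst) / Nc
```

Lower values of both indexes are better: a low AEI means the weights fit the cases, and a low Davies-Bouldin index means tight, well-separated clusters.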

4.1 Comparison of EI of Output Topology for Choosing the Threshold Value

As mentioned in section 3.2, the threshold value of the sub-clustering center (the S ratio in Eq. (2)) decides the number of sub-clustering centers. Because the adjusting range of an output unit's weight in the SOM is decided by the topology distance between the output unit and the cluster center, the number of sub-clustering centers affects the learning convergence of the MCC-SOM, and the convergence status is shown in the AEI. Hence, a suitable threshold value of the sub-clustering center must be chosen to make the MCC-SOM converge well. This paper uses an arithmetic progression to set the threshold value, ranging from 0.9 to 0.5. The authors use these different threshold values as the variable of a sensitivity analysis of the MCC-SOM, and employ the lowest AEI of the MCC-SOM as the criterion for choosing the suitable threshold value. Table 2 shows the result of the sensitivity analysis, and the suitable threshold value is found to be 0.8, which has the lowest AEI of all.

Table 2. The result of the sensitivity analysis of the MCC-SOM for the threshold value.

Threshold value    0.9    0.8    0.7    0.6    0.5
AEI

According to the result in Table 2, this paper explains the trend of the AEI with different threshold values as follows. When the number of cluster centers is insufficient, the MCC-SOM acts like the weight adjustment of the original SOM, which has fewer sub-clustering centers. When the number of cluster centers is in surplus, the unnecessary weight-adjusting roles in the MCC-SOM make learning hard to converge. Therefore, the MCC-SOM in the following case study uses 0.8 as the threshold value of the sub-clustering center.
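The selection rule of section 4.1 amounts to a simple sweep: train one MCC-SOM per candidate threshold and keep the threshold with the lowest AEI. In this sketch, `train` and `evaluate_aei` are assumed stand-in callables for a full training run and the Eq. (6) evaluation; they are not part of the paper.

```python
def choose_threshold(train, evaluate_aei, candidates=(0.9, 0.8, 0.7, 0.6, 0.5)):
    """Train one model per candidate threshold S and return the S whose
    trained model yields the lowest AEI, together with all scores."""
    scores = {S: evaluate_aei(train(S)) for S in candidates}
    best = min(scores, key=scores.get)
    return best, scores
```

The same sweep pattern applies to any scalar hyperparameter of the model, at the cost of one full training run per candidate.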
4.2 Comparison of EI of Output Topology for the Benchmark

After the financial ratios of the contractors have been imported into the three types of SOM models for the clustering analysis, the trained weights of the SOM models are obtained. This paper uses the trained weights to calculate the EI of every output unit in the topology; the EI values are shown in Figs. 5 to 7. Because the EI value represents the consistency between the weight of an output unit and the feature values of the cases, the output unit with the smallest EI is the main clustering center, and the output unit with the largest EI lies on the boundary between two neighboring clusters. To present the shape of the clusters, this paper sets the boundary through the center of such an output unit. According to this mechanism, this paper sets the boundaries of the clusters in the topologies of the three types of SOM models and elects the clustering centers from the output units. The case study is first calculated by the MCC-SOM model; as Fig. 5 shows, there are seven clusters in the output topology. The same case study is calculated by the

Fig. 5. The output EI distribution of the program of the MCC-SOM (coordinates X1-X6 by Y1-Y6).

Fig. 6. The output EI distribution of the program of Kohonen's SOM (coordinates X1-X6 by Y1-Y6).

Fig. 7. The output EI distribution of the program of Si's SOM (coordinates X1-X6 by Y1-Y6).

Kohonen's SOM model; the output topology is detailed in Fig. 6, showing six clusters. Additionally, the output topology of Si's SOM is shown in Fig. 7, also with six clusters. From these results, it can be seen that the MCC-SOM model appropriately adds clustering centers to increase the number of clusters. A single fused cluster in the topology is output by Kohonen's SOM or Si's SOM, the models that adopt the competitive theory of the single clustering center; the learning mechanism of the multiple clustering centers can precisely divide such a topology into a few clusters. Thus, the clustering boundary of the MCC-SOM model is clearer.

4.3 Comparison of Training Correctness for the Benchmark

This paper compares the AEIs of the clustering centers from the MCC-SOM, Si's SOM, and Kohonen's SOM. The AEI is 1.64 for the MCC-SOM and 2.03 for Si's SOM; the MCC-SOM model eliminates about 16.33% of the AEI of Kohonen's SOM model. Thus, the MCC-SOM model's approach of adding clustering centers adjusts the weights of the output units suitably: the weight combination maps the information to fit the characteristics of the cases. However, the AEI of Si's SOM model is a little larger than the AEI of Kohonen's SOM model. Si's SOM model preserves the mechanism of the single clustering center; it reduces the influence of the main clustering center on the other output units, but does not attend to the influence of the other output units that represent cases.

4.4 Comparison of the Davies-Bouldin Index for the Benchmark

To prove the influence of the clarified output topology of the MCC-SOM model, this paper uses the Davies-Bouldin index as the evaluating standard.
The Davies-Bouldin indexes in Tables 3 to 5 are calculated from the output topologies of the three types of SOM models. Because the cluster counts of the three models differ, this paper uses the average Davies-Bouldin index over pairs of neighboring clusters as the comparison standard, which provides a consistent comparison base. According to Tables 3 to 5, the MCC-SOM model reduces the average Davies-Bouldin index of Kohonen's SOM (1.14) by about 6.8%. Si's SOM model reduces the same average from 1.14 to 1.08, an improvement of about 5.2%. This paper therefore concludes the following. Because the multiple clustering centers are appropriately identified, an output unit on a clustering boundary no longer refers to a single standard of weight adjustment when adjusting its weight direction. In a single-center model, by contrast, the distance between an output unit on the clustering boundary and the clustering center can be too large, so the influence of the clustering center becomes insufficient. These conditions cause a situation in which the output unit lies

Table 3. The result of the Davies-Bouldin index by the MCC-SOM.
(Columns: code of the clustering center pair; inner topology distance L(C); separating topology distance D(C1, C2); Davies-Bouldin index.)

Table 4. The result of the Davies-Bouldin index by the Kohonen SOM.
(Same columns; average Davies-Bouldin index: 1.14.)

Table 5. The result of the Davies-Bouldin index by Si's SOM.
(Same columns; average Davies-Bouldin index: 1.08.)

on the fuzzy zone between the two neighboring clusters. As Fig. 7 shows, there is a shaded zone between two neighboring clusters in the center of the output topology of Si's SOM model. It cannot be determined which cluster the output units in this shaded zone belong to, which is evidence of the insufficient influence of the cluster centers mentioned above.
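As a rough illustration of how multiple clustering centers reduce this boundary ambiguity, the following toy training step updates a small SOM grid around the two best-matching units rather than a single winner, so that a boundary unit is pulled by more than one center. This is an illustrative reading of the mechanism, not the authors' published algorithm: the function name, the learning rates, and the rule of taking the two closest units as centers are all assumptions.

```python
import numpy as np

def mcc_som_step(weights, x, lr=0.5, second_lr=0.25, sigma=1.0):
    """One toy training step where the two closest units both act as
    clustering centers (illustrative sketch, not the paper's algorithm)."""
    rows, cols, _ = weights.shape
    # Distance from the input vector to every unit's weight vector.
    dists = np.linalg.norm(weights - x, axis=2)
    # Grid positions of the best and second-best matching units.
    flat = np.argsort(dists, axis=None)[:2]
    centers = [np.unravel_index(i, (rows, cols)) for i in flat]
    # Grid coordinates, used by the Gaussian neighborhood function.
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                indexing="ij"), axis=-1)
    for rate, (ci, cj) in zip((lr, second_lr), centers):
        # Gaussian neighborhood centered on each clustering center.
        g = np.exp(-np.sum((grid - (ci, cj)) ** 2, axis=2) / (2 * sigma ** 2))
        # Pull every unit toward the input, weighted by its neighborhood value.
        weights += rate * g[..., None] * (x - weights)
    return weights

rng = np.random.default_rng(0)
w = rng.random((4, 4, 2))
w = mcc_som_step(w, np.array([0.9, 0.1]))
```

Because the second center also attracts the units between the two winners, an input near a cluster boundary reinforces both nearby centers instead of only one, which is the intuition behind the sharper boundaries reported above.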

In addition, the mechanism by which Si's SOM model reduces the influence of the clustering center on the other output units is similar to the mechanism of multiple clustering centers. However, Si's SOM model lacks the correct leading direction of weight adjustment that sub-clustering centers provide to the output units. This is why the Davies-Bouldin indexes of Kohonen's SOM model and Si's SOM model are larger than that of the MCC-SOM model.

5. CONCLUSION

This paper develops the multiple-clustering-centers mechanism of competitive theory and proposes the MCC-SOM model. The mechanism effectively enhances the mapping ability between the input and output layers of the SOM model through the connected weights; it reduces the error rate of the trained weights and increases the topology-preserving ability of the SOM model.

Next, this paper allows more than one clustering center in a cluster. This reduces the number of output points of cases that fall in the fuzzy zone between two neighboring clusters. The cluster boundaries of the output topology are therefore clearer, the recognizability of clusters is enhanced, and the chance of missing a cluster during identification is reduced.

Furthermore, this paper develops the multiple-clustering-centers mechanism of weight adjustment to clarify the cluster graph. A cluster hidden inside a neighboring cluster of the topology can be isolated by the MCC-SOM model. This feature strengthens the advantage that the SOM model does not need the number of clusters to be preset before the model is constructed. It also improves the ability of the MCC-SOM model to explain information in clustering analysis.
Finally, for the construction process of the MCC-SOM, this paper suggests the following: to enhance the topology-preserving ability of the MCC-SOM model, future research should construct an objective threshold for electing sub-clustering centers, under the twin premises of reasonable computing cost and acceptable topology preservation. Additionally, the combined neighborhood area of the multiple clustering centers is not circular, so finding the fitting dimension and shape of the topology deserves further attention.

REFERENCES

1. H. Ritter and K. Schulten, On the stationary state of Kohonen's self-organizing sensory mapping, Biological Cybernetics, Vol. 54, 1986.
2. S. Haykin, Neural Networks: A Comprehensive Foundation, 2nd ed., Prentice Hall, New Jersey, 1999.
3. J. Vesanto and E. Alhoniemi, Clustering of the self-organizing map, IEEE Transactions on Neural Networks, Vol. 11, 2000.
4. A. Baraldi and P. Blonda, A survey of fuzzy clustering algorithms for pattern recognition Part II, IEEE Transactions on Systems, Man, and Cybernetics, Vol. 29, 1999.

5. T. Villmann, R. Der, M. Herrmann, and T. M. Martinetz, Topology preservation in self-organizing feature maps: Exact definition and measurement, IEEE Transactions on Neural Networks, Vol. 8, 1997.
6. Z. P. Lo and B. Bavarian, Improved rate of convergence in Kohonen neural networks, in Proceedings of the International Joint Conference on Neural Networks, Vol. 2, 1991.
7. M. Dittenbach, D. Merkl, and A. Rauber, The growing hierarchical self-organizing map, in Proceedings of the IEEE International Joint Conference on Neural Networks, Vol. 6, 2000.
8. T. M. Martinetz, S. G. Berkovich, and K. Schulten, Neural-gas network for vector quantization and its application to time-series prediction, IEEE Transactions on Neural Networks, Vol. 4, 1993.
9. T. Martinetz and K. Schulten, Topology representing networks, Neural Networks, Vol. 7, 1994.
10. D. DeSieno, Adding a conscience to competitive learning, in Proceedings of the IEEE International Conference on Neural Networks, Vol. 1, 1988.
11. J. Si, S. Lin, and M. A. Vuong, Dynamic topology representing networks, Neural Networks, Vol. 13, 2000.
12. E. Dimitriadou, S. Dolnicar, and A. Weingessel, An examination of indexes for determining the number of clusters in binary data sets, Psychometrika, Vol. 67, 2002.
13. D. L. Davies and D. W. Bouldin, A cluster separation measure, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. PAMI-1, 1979.

Ching-Hwang Wang received the M.S. degree (1985) and the Ph.D. degree in Civil Engineering from the University of Washington, Seattle, U.S.A. He is a Professor in the Department of Construction Engineering at National Taiwan University of Science and Technology, Taipei, Taiwan. His research focuses on construction management and economics, simulation of construction schedule and cost, building investment, and modeling technologies.

Chih-Han Kao received the B.S. degree, the M.S. degree, and the Ph.D.
degree in Construction Engineering from National Taiwan University of Science and Technology, Taipei, Taiwan (B.S. 1990, M.S. 1992). He is an Assistant Professor in the Department of Construction Engineering of National Kinmen Institute of Technology, Kinmen, Taiwan. His research interests include construction management and data mining.


More information

Neural networks for variable star classification

Neural networks for variable star classification Neural networks for variable star classification Vasily Belokurov, IoA, Cambridge Supervised classification Multi-Layer Perceptron (MLP) Neural Networks for Pattern Recognition by C. Bishop Unsupervised

More information

Machine Learning Based Autonomous Network Flow Identifying Method

Machine Learning Based Autonomous Network Flow Identifying Method Machine Learning Based Autonomous Network Flow Identifying Method Hongbo Shi 1,3, Tomoki Hamagami 1,3, and Haoyuan Xu 2,3 1 Division of Physics, Electrical and Computer Engineering, Graduate School of

More information

Evaluation of the Performance of O(log 2 M) Self-Organizing Map Algorithm without Neighborhood Learning

Evaluation of the Performance of O(log 2 M) Self-Organizing Map Algorithm without Neighborhood Learning 04 IJCSNS International Journal of Computer Science and Network Security, VOL.6 No.0, October 006 Evaluation of the Performance of O(log M) Self-Organizing Map Algorithm without Neighborhood Learning Hiroki

More information

Traffic Signal Control Based On Fuzzy Artificial Neural Networks With Particle Swarm Optimization

Traffic Signal Control Based On Fuzzy Artificial Neural Networks With Particle Swarm Optimization Traffic Signal Control Based On Fuzzy Artificial Neural Networks With Particle Swarm Optimization J.Venkatesh 1, B.Chiranjeevulu 2 1 PG Student, Dept. of ECE, Viswanadha Institute of Technology And Management,

More information

Block-Based Connected-Component Labeling Algorithm Using Binary Decision Trees

Block-Based Connected-Component Labeling Algorithm Using Binary Decision Trees Sensors 2015, 15, 23763-23787; doi:10.3390/s150923763 Article OPEN ACCESS sensors ISSN 1424-8220 www.mdpi.com/journal/sensors Block-Based Connected-Component Labeling Algorithm Using Binary Decision Trees

More information

AN IMPROVED MULTI-SOM ALGORITHM

AN IMPROVED MULTI-SOM ALGORITHM AN IMPROVED MULTI-SOM ALGORITHM ABSTRACT Imen Khanchouch 1, Khaddouja Boujenfa 2 and Mohamed Limam 3 1 LARODEC ISG, University of Tunis kh.imen88@gmail.com 2 LARODEC ISG, University of Tunis khadouja.boujenfa@isg.rnu.tn

More information

PATTERN RECOGNITION USING NEURAL NETWORKS

PATTERN RECOGNITION USING NEURAL NETWORKS PATTERN RECOGNITION USING NEURAL NETWORKS Santaji Ghorpade 1, Jayshree Ghorpade 2 and Shamla Mantri 3 1 Department of Information Technology Engineering, Pune University, India santaji_11jan@yahoo.co.in,

More information

Expectation and Maximization Algorithm for Estimating Parameters of a Simple Partial Erasure Model

Expectation and Maximization Algorithm for Estimating Parameters of a Simple Partial Erasure Model 608 IEEE TRANSACTIONS ON MAGNETICS, VOL. 39, NO. 1, JANUARY 2003 Expectation and Maximization Algorithm for Estimating Parameters of a Simple Partial Erasure Model Tsai-Sheng Kao and Mu-Huo Cheng Abstract

More information

Data Mining. Covering algorithms. Covering approach At each stage you identify a rule that covers some of instances. Fig. 4.

Data Mining. Covering algorithms. Covering approach At each stage you identify a rule that covers some of instances. Fig. 4. Data Mining Chapter 4. Algorithms: The Basic Methods (Covering algorithm, Association rule, Linear models, Instance-based learning, Clustering) 1 Covering approach At each stage you identify a rule that

More information

Cse634 DATA MINING TEST REVIEW. Professor Anita Wasilewska Computer Science Department Stony Brook University

Cse634 DATA MINING TEST REVIEW. Professor Anita Wasilewska Computer Science Department Stony Brook University Cse634 DATA MINING TEST REVIEW Professor Anita Wasilewska Computer Science Department Stony Brook University Preprocessing stage Preprocessing: includes all the operations that have to be performed before

More information

An explicit feature control approach in structural topology optimization

An explicit feature control approach in structural topology optimization th World Congress on Structural and Multidisciplinary Optimisation 07 th -2 th, June 205, Sydney Australia An explicit feature control approach in structural topology optimization Weisheng Zhang, Xu Guo

More information

Title. Author(s)Liu, Hao; Kurihara, Masahito; Oyama, Satoshi; Sato, Issue Date Doc URL. Rights. Type. File Information

Title. Author(s)Liu, Hao; Kurihara, Masahito; Oyama, Satoshi; Sato, Issue Date Doc URL. Rights. Type. File Information Title An incremental self-organizing neural network based Author(s)Liu, Hao; Kurihara, Masahito; Oyama, Satoshi; Sato, CitationThe 213 International Joint Conference on Neural Ne Issue Date 213 Doc URL

More information

Methods for Intelligent Systems

Methods for Intelligent Systems Methods for Intelligent Systems Lecture Notes on Clustering (II) Davide Eynard eynard@elet.polimi.it Department of Electronics and Information Politecnico di Milano Davide Eynard - Lecture Notes on Clustering

More information

A NEW ALGORITHM FOR OPTIMIZING THE SELF- ORGANIZING MAP

A NEW ALGORITHM FOR OPTIMIZING THE SELF- ORGANIZING MAP A NEW ALGORITHM FOR OPTIMIZING THE SELF- ORGANIZING MAP BEN-HDECH Adil, GHANOU Youssef, EL QADI Abderrahim Team TIM, High School of Technology, Moulay Ismail University, Meknes, Morocco E-mail: adilbenhdech@gmail.com,

More information

Semi-Supervised Clustering with Partial Background Information

Semi-Supervised Clustering with Partial Background Information Semi-Supervised Clustering with Partial Background Information Jing Gao Pang-Ning Tan Haibin Cheng Abstract Incorporating background knowledge into unsupervised clustering algorithms has been the subject

More information

INCREASING CLASSIFICATION QUALITY BY USING FUZZY LOGIC

INCREASING CLASSIFICATION QUALITY BY USING FUZZY LOGIC JOURNAL OF APPLIED ENGINEERING SCIENCES VOL. 1(14), issue 4_2011 ISSN 2247-3769 ISSN-L 2247-3769 (Print) / e-issn:2284-7197 INCREASING CLASSIFICATION QUALITY BY USING FUZZY LOGIC DROJ Gabriela, University

More information

A Topography-Preserving Latent Variable Model with Learning Metrics

A Topography-Preserving Latent Variable Model with Learning Metrics A Topography-Preserving Latent Variable Model with Learning Metrics Samuel Kaski and Janne Sinkkonen Helsinki University of Technology Neural Networks Research Centre P.O. Box 5400, FIN-02015 HUT, Finland

More information

1. INTRODUCTION. AMS Subject Classification. 68U10 Image Processing

1. INTRODUCTION. AMS Subject Classification. 68U10 Image Processing ANALYSING THE NOISE SENSITIVITY OF SKELETONIZATION ALGORITHMS Attila Fazekas and András Hajdu Lajos Kossuth University 4010, Debrecen PO Box 12, Hungary Abstract. Many skeletonization algorithms have been

More information

A Population Based Convergence Criterion for Self-Organizing Maps

A Population Based Convergence Criterion for Self-Organizing Maps A Population Based Convergence Criterion for Self-Organizing Maps Lutz Hamel and Benjamin Ott Department of Computer Science and Statistics, University of Rhode Island, Kingston, RI 02881, USA. Email:

More information

Color reduction by using a new self-growing and self-organized neural network

Color reduction by using a new self-growing and self-organized neural network Vision, Video and Graphics (2005) E. Trucco, M. Chantler (Editors) Color reduction by using a new self-growing and self-organized neural network A. Atsalakis and N. Papamarkos* Image Processing and Multimedia

More information

Chapter 7 UNSUPERVISED LEARNING TECHNIQUES FOR MAMMOGRAM CLASSIFICATION

Chapter 7 UNSUPERVISED LEARNING TECHNIQUES FOR MAMMOGRAM CLASSIFICATION UNSUPERVISED LEARNING TECHNIQUES FOR MAMMOGRAM CLASSIFICATION Supervised and unsupervised learning are the two prominent machine learning algorithms used in pattern recognition and classification. In this

More information

Identifying Layout Classes for Mathematical Symbols Using Layout Context

Identifying Layout Classes for Mathematical Symbols Using Layout Context Rochester Institute of Technology RIT Scholar Works Articles 2009 Identifying Layout Classes for Mathematical Symbols Using Layout Context Ling Ouyang Rochester Institute of Technology Richard Zanibbi

More information

The k-means Algorithm and Genetic Algorithm

The k-means Algorithm and Genetic Algorithm The k-means Algorithm and Genetic Algorithm k-means algorithm Genetic algorithm Rough set approach Fuzzy set approaches Chapter 8 2 The K-Means Algorithm The K-Means algorithm is a simple yet effective

More information

BBS654 Data Mining. Pinar Duygulu. Slides are adapted from Nazli Ikizler

BBS654 Data Mining. Pinar Duygulu. Slides are adapted from Nazli Ikizler BBS654 Data Mining Pinar Duygulu Slides are adapted from Nazli Ikizler 1 Classification Classification systems: Supervised learning Make a rational prediction given evidence There are several methods for

More information

Motion Detection Algorithm

Motion Detection Algorithm Volume 1, No. 12, February 2013 ISSN 2278-1080 The International Journal of Computer Science & Applications (TIJCSA) RESEARCH PAPER Available Online at http://www.journalofcomputerscience.com/ Motion Detection

More information

A Novel Approach for Minimum Spanning Tree Based Clustering Algorithm

A Novel Approach for Minimum Spanning Tree Based Clustering Algorithm IJCSES International Journal of Computer Sciences and Engineering Systems, Vol. 5, No. 2, April 2011 CSES International 2011 ISSN 0973-4406 A Novel Approach for Minimum Spanning Tree Based Clustering Algorithm

More information

Cursive Handwriting Recognition System Using Feature Extraction and Artificial Neural Network

Cursive Handwriting Recognition System Using Feature Extraction and Artificial Neural Network Cursive Handwriting Recognition System Using Feature Extraction and Artificial Neural Network Utkarsh Dwivedi 1, Pranjal Rajput 2, Manish Kumar Sharma 3 1UG Scholar, Dept. of CSE, GCET, Greater Noida,

More information