Fuzzy Signature Neural Network


Kun He

1st June 2012

A report submitted for the degree of Master of Computing of the Australian National University

Supervisor: Prof. Tom Gedeon

Acknowledgements

Thanks to my supervisor Tom Gedeon, and to Dingyun Zhu, for their recommendations and kind support on this project. Thanks also to the course coordinator Weifa Liang for his support in developing report-writing skills, and to Wei Fan for his suggestions on fuzzy signature neural networks. Finally, thanks to my family and my friends for their encouragement.

Abstract

In this report we first introduce the background of neural networks and fuzzy signatures, and then focus on the fuzzy signature neural network. Fuzzy signatures are used in fuzzy rule based systems to reduce the rule explosion issue. The neural networks we consider use Radial Basis Functions as the activation function; these are real-valued functions whose value depends only on the distance from the centroid of the function. We modify the previous approach to improve the method of choosing aggregation functions and the method of creating the structure of the fuzzy signature neural network. The new method reduces the risk that the results depend heavily on the manual choice of aggregation function and the manual choice of fuzzy signature structure. Finally, this report presents an experimental evaluation of the fuzzy signature neural network through three experiments. The first and second experiments compare the two fuzzy signature based RBF neural networks, and show that our approach is better than the previous fuzzy signature based RBF neural network when the datasets have significant numbers of missing values. The third experiment compares our work with other neural networks. The results demonstrate that our approach is viable and worth further investigation.

List of Abbreviations

FSNN    Fuzzy signature neural network
NN      Neural network
ANN     Artificial neural network
RBF     Radial basis function
Cascor  Cascade correlation neural network
sNN     Symmetric nearest neighbor

Contents

Acknowledgements
Abstract
List of Abbreviations
List of Figures
List of Tables
1 Introduction
  1.1 Motivation
  1.2 Objectives
  1.3 Contribution
  1.4 Preview
2 Background and relevant knowledge
  2.1 Neural Networks
  2.2 Radial Basis Function
  2.3 Fuzzy Rule Based Systems
  2.4 Fuzzy Signature
3 Fuzzy Signature Neural Network
  3.1 Basic structure and process of fuzzy signature neural network
  3.2 Improvement of fuzzy signature neural network
4 Design and Implementation of Fuzzy Signature Neural Network
  4.1 Description
  4.2 Construction of fuzzy signature neural network
    4.2.1 Damaged data
    4.2.2 Clustering
    4.2.3 Create fuzzy signature
      4.2.3.1 Structure of fuzzy signature
      4.2.3.2 Obtain fuzzy signature information
      4.2.3.3 Aggregation
    4.2.4 Create neural network
    4.2.5 Training the Fuzzy Signature Neural Network
  4.3 Testing
    4.3.1 Testing network
    4.3.2 Extracting network information
5 Experiments and Evaluation
  5.1 Description of the dataset
  5.2 Hardware and software environment information
  5.3 Experiment 1: Datasets experiments with no missing data
    5.3.1 Purpose of the experiment
    5.3.2 Description of the experiment
    5.3.3 Experiment Process and Discussion of Results
  5.4 Experiment 2: Datasets experiments with missing data
    5.4.1 Purpose of the experiment
    5.4.2 Description of the experiment

    5.4.3 Experiment Process and Discussion of Results
  5.5 Experiment 3: Benchmarks comparison between fuzzy signature neural network and other approaches
    5.5.1 Purpose of the experiment
    5.5.2 Description of the experiment
    5.5.3 Experiment Process and Discussion of Results
6 Conclusion and Future Work
  6.1 Conclusion
  6.2 Future Work
References
Appendix A

List of Figures

Figure 1: Example of a basic NN
Figure 2: Example of a neural network
Figure 3: Example of structure of fuzzy signature of SARS patient
Figure 4: Example of aggregation of SARS patient
Figure 5: Example of fuzzy signature based radial basis function neural network
Figure 6: Construction of fuzzy signature neural network & testing suite
Figure 7: Example of agglomerative hierarchical clustering
Figure 8: Examples of structure of fuzzy signature for SARS
Figure 9: Example structures of fuzzy signature according to Figure 8
Figure 10: Example of aggregation function selection
Figure 11: Manhattan distance

List of Tables

Table 1: Information of files
Table 2: Dataset details
Table 3: Hardware and software environment information
Table 4: Number of clusters selected
Table 5: Benchmark for our approach
Table 6: Benchmark for Fan's approach
Table 7: Benchmark for our approach with missing values
Table 8: Benchmark for Fan's approach with missing values
Table 9: Parameters of our approach used in experiment 3
Table 10: Results of 3 different neural networks

1. Introduction

1.1 Motivation

Human decision making is a comprehensible hierarchical process, in which cognitive processes lead to the selection of a set of actions among many alternatives, so bio-inspired techniques are used as a foundation for modelling human decision making. The design of intelligent systems, in Artificial Intelligence or its descendant Computational Intelligence, is a problem of identifying approximate models to describe a real world scenario [1]. Nowadays AI is applied in a wide variety of fields, such as business, mathematics, industry, medical science, and so on. However, if those systems involve very complex structured high dimensional data, sometimes with interdependent features and missing components, conventional AI systems are not adequate [1]. Therefore, how to handle such datasets correctly and effectively, and how to design structures that reflect the real world, have become key issues in decision making under uncertainty.

An efficiency issue called rule explosion affects fuzzy rule based systems, which are conventional AI systems [1]: the number of rules grows exponentially with the number of input dimensions. For this reason the fuzzy signature method was introduced by Gedeon et al. [2]; it can be used in fuzzy rule based systems while mitigating the rule explosion issue.

Neural networks have very strong nonlinear fitting ability, can represent arbitrarily complex nonlinear mapping relationships, and also have strong robustness, memory and learning ability [3]. However, the neural network approach does not produce easily comprehensible results, especially for networks trained on very complex structured high dimensional data [4]. The fuzzy signature based neural network approach has been used to combine the benefits of neural networks and fuzzy signatures. In our project, we choose the Radial Basis Function as the activation function, and each fuzzy signature is treated as a hidden neuron of the RBF network. Thus a fuzzy signature neural network gains the advantages of neural networks while at the same time addressing the rule explosion efficiency issue of fuzzy rule based systems.

1.2 Objectives

The aim of this project is to implement an improved fuzzy signature neural network based on Fan's code [3], to provide a detailed investigation, and then to evaluate this approach on test data and compare it with the previous approach.

1.3 Contribution

The contribution of this project involves the three following areas. Firstly, the FSNN code written by Fan is simplified and modified in order to implement and improve the FSNN. Secondly, sequenced training code is designed for the FSNN and is used to compare it with other neural networks, including the previous FSNN. Finally, the feasibility and future work of such a neural network is discussed, based on the results of the experiments.

1.4 Preview

Chapter 2 gives an overview of the relevant techniques and basic concepts, including neural networks, radial basis function neural networks, fuzzy rule based systems and fuzzy signatures, which are helpful for understanding the FSNN and its evaluation. Following that, Chapter 3 introduces the fuzzy signature neural network itself. Chapter 4 demonstrates the techniques and methodologies of basic fuzzy signature neural networks, describes the test suite for the implementation, and then presents the design, implementation and testing. Chapter 5 evaluates this approach based on the implementation in Chapter 4; we used 3 different experiments to evaluate our approach in various aspects. Finally, Chapter 6 concludes the report and indicates the weaknesses as well as suggestions for future work.

2. Background and relevant knowledge

2.1 Neural Networks

An artificial neural network, also called a neural network, is a mathematical or computational model that is inspired by the structure and/or functional aspects of biological neural networks. It consists of an interconnected group of artificial neurons, and processes information using a connectionist approach to computation [5]. Modern neural networks are non-linear statistical data modeling tools, which means they have the ability to solve problems that do not have a known statistical model [3]. They are usually used to model complex relationships between inputs and outputs or to find patterns in data. The following figure shows the basic principle of NNs.

Figure 1: Example of a basic NN (inputs x_1 ... x_n with weights w_1 ... w_n feeding an activation function that produces the output y)

According to Figure 1, each x represents an input, either from the original data or from the output of other neurons; the strength of the connection between an input and the neuron is called its weight. Finally, the activation function converts the neuron's weighted input into its output activation. The activation function can be a non-linear function such as the Gaussian or sigmoid function. The activation function formula is shown below [5]:

y = f( Σ_{i=1}^{n} w_i x_i )

where
  y is the output of the neuron,
  n is the number of inputs,
  x_i is the value of input i of the neuron,
  w_i is the weight of input i,
  f is the activation function, e.g. the sigmoid f(x) = 1 / (1 + e^(-x)).

Equation 1: basic equation of a neural network

However, a single layer neural network cannot handle complex problems because its structure is too simple, so we need to create multiple layer neural networks. Such a network consists of a number of neurons, enabling it to solve more complex problems. It contains 3 different kinds of neurons, namely input neurons, hidden neurons and output neurons, each located at a different layer called the input layer, hidden layer and output layer respectively [5]. Figure 2 shows an example neural network.

Figure 2: Example of a neural network (an input layer x_1 ... x_n connected through a weight matrix to a hidden layer, which connects through a second weight matrix to the output layer)
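To make Equation 1 concrete, the following MATLAB sketch computes the output of a single sigmoid neuron; the function name neuronForward and the example values are purely illustrative and not part of the original implementation.

    % Forward pass of a single sigmoid neuron (Equation 1); illustrative sketch.
    function y = neuronForward(x, w)
        a = sum(w .* x);         % weighted sum of the inputs
        y = 1 / (1 + exp(-a));   % sigmoid activation f(a) = 1/(1+e^(-a))
    end

For example, neuronForward([0.5 0.2], [1.0 -0.4]) computes the sigmoid of 0.42, which is about 0.60.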

In order to create a neural network model, we need two procedures: training and testing. The first necessary procedure is training the neural network. Through training, the neural network is able to find suitable values of the weight matrix which make the actual outputs match the desired outputs more closely. During this process, neurons learn the weights iteratively by being given a number of training data. This process is finished when the network has stabilized. The second necessary procedure is testing the neural network, using the testing data and the trained weight matrix to compare actual outputs with desired outputs. Then, according to the testing accuracy rate, we can conclude whether our neural network model was successful or not.

2.2 Radial Basis Function

A radial basis function (RBF) is a real-valued function whose value depends only on the distance between the point x and some other point c [6]. A radial basis function can be represented as

φ(x, c) = φ(‖x − c‖).

Any function satisfying the property

φ(x) = φ(‖x‖)

is also called a radial function. Different distance measures can be used in the radial basis function, such as the Euclidean distance, the Lukaszyk-Karmowski metric and the taxicab distance. In our project, we choose the Euclidean distance as the distance measure for the RBF. The definition of the Euclidean distance is:

d(q, p) = sqrt( Σ_{i=1}^{n} (q_i − p_i)^2 )

where p = (p_1, p_2, ..., p_n) and q = (q_1, q_2, ..., q_n) are two points in Euclidean n-space.

Equation 1.1: Euclidean distance

There are two obvious advantages of RBF neural networks: firstly, RBF neural networks train faster than other multiple layer neural networks; secondly, it is claimed that their hidden layer is easier to interpret than the hidden layer in an MLP.
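The Gaussian RBF used later in this report combines these two ingredients: the Euclidean distance to a centre, passed through a Gaussian. A minimal MATLAB sketch follows; the function name and the width parameter sigma are assumptions, since the report does not state how the width is set.

    % Gaussian radial basis function: similarity between input x and centre c.
    function phi = gaussianRBF(x, c, sigma)
        d = norm(x - c);                   % Euclidean distance ||x - c||
        phi = exp(-d^2 / (2 * sigma^2));   % 1 at the centre, decaying to 0 far away
    end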

2.3 Fuzzy Rule Based Systems

Fuzzy rule based systems are linguistic IF-THEN constructions of the general form "IF A THEN B", where A and B are (collections of) propositions containing linguistic variables; A is called the premise and B the consequence of the rule. In effect, the use of linguistic variables and fuzzy IF-THEN rules exploits the tolerance for imprecision and uncertainty. In this respect, fuzzy logic mimics the crucial ability of the human mind to summarize data and focus on decision-relevant information. Fuzzy rule based systems are very successful and popular in control system applications. They outperform the conventional method of modeling non-linear control systems, which is based on solving high order partial differential equations, by the simplicity of their inference.

However, such a fuzzy rule based system is also called a dense fuzzy rule based system, because it suffers from a serious issue called rule explosion [1]. Rule explosion is caused by the exponential growth of the number of rules needed with respect to the number of fuzzy sets per input dimension and the number of inputs. Equation 1.2 below shows the number of rules required for a system with k input variables and T fuzzy sets per input dimension [7]:

R = O(T^k)

Equation 1.2: Rule explosion

As k or T increases, the number of rules grows sharply; for example, with T = 5 fuzzy sets and k = 10 inputs, a dense rule base needs on the order of 5^10 ≈ 9.8 million rules. In order to solve this problem, there are 4 possible ways to model systems that have a high number of inputs and/or fuzzy subsets within those inputs: sparse fuzzy rule based systems, hierarchical fuzzy rule based systems, sparse hierarchical fuzzy rule based systems, and fuzzy signatures. In our approach, we choose fuzzy signatures to address this issue.

2.4 Fuzzy Signature

Computational Intelligence research focuses mainly on identifying approximate models for decision support or classification where analytically unknown systems exist, especially systems consisting of very complex structured and/or high dimensional data, even with interdependent features [3]. Traditional fuzzy logic approaches such as fuzzy rule based systems have become popular in Computational Intelligence research because of their ability to assign linguistic labels and to model uncertainty in many decision making and classification problems. However, conventional fuzzy rule based systems suffer from high computational time complexity, so in most cases applications of fuzzy rule based systems remain restricted to conditions with few input dimensions and relatively simple structured data, even if there is complex behavior in the system being modeled [1]. The aggregation of information in rule based fuzzy systems, including sparse hierarchical fuzzy rule based systems, is generally performed by min, max and average. This is a restriction on conventional fuzzy systems, as it neglects other membership values of the same input.

A fuzzy signature is a Vector Valued Fuzzy Set (VVFS), where each vector component is another VVFS (a branch) or an atomic value (a leaf). It can be described as [7]:

A : X → [a_i]_{i=1}^{k},  where  a_i = [a_{ij}]_{j=1}^{k_i} if a_i is a branch, and a_i ∈ [0, 1] if a_i is a leaf.

Equation 1.3: VVFS

If each fuzzy signature has the same structure and aggregation method, then a fuzzy signature can be described as a vector; normally the min, max and average methods are used as aggregation functions [2]. Figure 3 below represents a fuzzy signature example for a SARS patient, and Figure 4 shows an aggregation method based on max, min and then average. According to the figures, these aggregation functions transfer the membership values of the fuzzy signature into a single fuzzy signature and eventually a single value.

Figure 3: Example of structure of fuzzy signature of SARS patient

Figure 4: Example of aggregation of SARS patient

There are 3 important advantages of fuzzy signatures in fuzzy rule based systems. Firstly, they are able to reduce the high computation cost. Secondly, fuzzy signatures have the ability to handle noisy and missing values by using specific aggregation functions. Thirdly, new information or features can be added without redesigning the structure of the data representation. However, in actual use, the definition of the structure and the choice of aggregation function are based on professional knowledge, and in the presence of uncertain data it is difficult to define and choose these. In order to address this issue, in our project we try to automate, and hence improve, the definition of the structure and the selection of the aggregation function, to make fuzzy signatures more general. In the next chapter, we discuss the improvements to structure definition and aggregation function selection in more detail.

3. Fuzzy Signature Neural Network

3.1 Basic structure and process of fuzzy signature neural network

A fuzzy signature neural network is a type of neural network; in our project, the radial basis function is used as the activation function in the neurons [8]. Firstly, a number of inputs, either from the original data or from the output of other neurons, are taken as input, and the Euclidean distance is calculated from the evaluation point (each input is a point in a multi-dimensional input space) to the sample point in each neuron. Additionally, each hidden neuron (fuzzy signature neuron) in the network has a specific fuzzy signature associated with it, and the output of the hidden neuron is the similarity between the input vector and the fuzzy signature, based on the specific aggregation function. Thirdly, we calculate the output, and then use the equation below to modify the weights:

Δw_{ij} = α (t_i − y_i) x_j

where
  α is a small constant called the learning rate,
  t_i is the value of the desired output at dimension i,
  y_i is the value of the actual output at dimension i,
  x_j is the value of input j (i.e., the output of hidden neuron j) feeding weight w_{ij}.

Equation 2.1: weight update rule

The strengths of the connections between the input neurons and the hidden neurons are constants, which means they do not change with training. The whole training task is performed only by the weight matrix between the hidden neurons (fuzzy signature neurons) and the output neurons. Therefore, the time to train these networks should be reduced, since the weight matrix is smaller and there are fewer layers. Figure 5 shows the general architecture of the fuzzy signature based radial basis function neural network.

Figure 5: Example of a fuzzy signature based radial basis function neural network (input neurons connected to fuzzy signature hidden neurons, whose weighted outputs feed the output neurons)

3.2 Improvement of fuzzy signature neural network

In our project, we work on a network similar to the one presented by Fan; however, Fan's application has some limitations. Firstly, users must choose the aggregation function themselves, so the results depend heavily on their choice. Secondly, each fuzzy signature structure must be set by the users; in actual use this choice must be based on professional or practical knowledge, so selecting the structure of a fuzzy signature is quite difficult for ordinary users. The third weakness is that, once chosen, the same fuzzy signature structure and aggregation function are used in constructing all fuzzy signatures in the network. This is likely to miss some features in the data if there are significant differences in the important data in different clusters, which is probable for highly complex data with significantly different sub-structures. Therefore, this implementation significantly extends the previous work, and re-designs and constructs the fuzzy signature based RBF network in a different and more general way.

4. Design and Implementation of Fuzzy Signature Neural Network

4.1 Description

This chapter describes the techniques and methodology used to implement fuzzy signature neural networks. The programming language used to implement the FSNN is Matlab. The implementation of a fuzzy signature neural network can be divided into five modules, and the testing into four modules. The fundamental architecture of the implementation and testing is demonstrated in Figure 6. The construction of the fuzzy signature neural network consists of damaging the data with missing values, clustering, obtaining the fuzzy signature information, and creating and training the network. The testing suite contains three modules: extracting network information, testing the network, and collating the results. The construction of the fuzzy signature neural network is embedded into the test suite, which means it is part of the test suite.

Figure 6: Construction of fuzzy signature neural network & testing suite (construction: damaged input, clustering input, obtain fuzzy signature, create neural network, train neural network; testing suite: extracting network information, clustering input from testing data, testing neural network, collecting the results)

4.2 Construction of fuzzy signature neural network

This section describes the procedure to construct fuzzy signature neural networks in detail. The whole network procedure is divided into five different modules, introduced as follows.

4.2.1 Dealing with damaged data

In the real world, recorded data is not always perfect; missing data commonly arises under uncertain conditions. We add this module in order to simulate this situation. The damaged data module is optional: we use it only when we need to simulate data with missing values. The pseudocode below demonstrates the algorithm and process. The damaged function rearranges the data so that it contains some missing values.

Input: data (the data we need to handle), rate (proportion of missing values in the whole data)
Output: damageddata

damageddata = damaged(data, rate)
1. inputcol = number of columns(data)
2. inputrow = number of rows(data)
3. total = round(rate * inputcol * inputrow)   // total number of missing values in the whole data
4. col = random(1, total, [1, inputcol])       // random column positions
5. row = random(total, 1, [1, inputrow])       // random row positions
6. for i = 1 to total
7.     set data(col(1, i), row(1, i)) to Not-a-Number   // mark this position as missing
8. end loop
9. damageddata = data
10. end

Following the pseudocode: first we set the rate of missing values with which we want to damage the data; then we calculate the total number of missing values; after that we randomly choose positions in the matrix and set the numbers at those positions to non-numbers. The damaged data set is then ready.
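As a concrete sketch, the pseudocode above can be realised in MATLAB as follows. The function name damageData is hypothetical; unlike the pseudocode, it draws positions with randperm, so the same cell is never damaged twice and the requested rate is met exactly.

    % Replace a given fraction of entries with NaN to simulate missing data.
    function data = damageData(data, rate)
        [nRows, nCols] = size(data);
        total = round(rate * nRows * nCols);   % number of entries to knock out
        idx = randperm(nRows * nCols);         % shuffle all linear positions
        data(idx(1:total)) = NaN;              % mark the first 'total' as missing
    end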

4.2.2 Clustering

Clustering is the pre-processing method used to obtain the fuzzy signature neurons in this implementation. Clustering is a credible unsupervised method; using a clustering technique means that users do not need to consider how to construct the fuzzy signatures themselves. A second reason is that extracting fuzzy signatures manually is time consuming and can be difficult, especially when the input data set contains large numbers of records.

There are many different clustering methods. In our approach, we use agglomerative hierarchical clustering, which has some obvious advantages compared with other methods: first, users do not need to specify the number of clusters; second, the algorithm is deterministic, which is useful for research purposes; and third, the outputs are more informative than in flat clustering [9]. Agglomerative hierarchical clustering is a bottom-up hierarchical clustering method in which each cluster can be another cluster's sub-cluster. It begins with each single object in a separate cluster, and then agglomerates similar clusters based on a similarity criterion until all data merge into one cluster. Figure 7 shows an example of hierarchical clustering.

Figure 7: Example of agglomerative hierarchical clustering (objects a, b, c, d, e, f merged step by step into intermediate clusters such as bc and de, and finally into a single cluster)

In the hierarchical clustering, the similarity criterion is based on the Ward linkage method, since it is efficient. It uses the incremental sum of squares to calculate the distance, where distance means the Euclidean distance. The sum of squares measure is defined as:

d(r, s) = sqrt( 2 n_r n_s / (n_r + n_s) ) · ‖x̄_r − x̄_s‖

where
  x̄_r and x̄_s are the centroids of clusters r and s,
  ‖x̄_r − x̄_s‖ is the Euclidean distance between x̄_r and x̄_s,
  n_r and n_s are the numbers of elements in clusters r and s.

Equation 2.2: Ward linkage distance

In our application, Matlab provides a package in the Statistics Toolbox for agglomerative hierarchical clustering with the Ward linkage method. The function linkage(inputdata, 'ward', 'euclidean') takes the input data set inputdata and the specific method names as arguments and creates a hierarchical cluster tree; the string arguments select the Ward linkage method and the Euclidean distance. Furthermore, the function cluster(Z, 'maxclust', n) constructs clusters from a hierarchical cluster tree Z, with the number of clusters n as an argument. The clustering module in this implementation uses both functions to cluster the formatted data set.
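Putting the two toolbox calls together, the clustering module reduces to a few lines; the sketch below also extracts the cluster centroids, which later become the fuzzy signature neurons. Variable names are illustrative.

    % Sketch of the clustering step (Statistics Toolbox).
    % X: formatted training matrix (rows = observations); n: number of clusters.
    Z = linkage(X, 'ward', 'euclidean');    % agglomerative tree, Ward criterion
    labels = cluster(Z, 'maxclust', n);     % cut the tree into n clusters
    centroids = zeros(n, size(X, 2));       % one centroid per cluster
    for k = 1:n
        centroids(k, :) = mean(X(labels == k, :), 1);
    end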

4.2.3 Create fuzzy signature

Before creating a fuzzy signature, we need to obtain the structure of the fuzzy signature. In our application, each fuzzy signature has a different structure, which is created automatically. The fuzzy signature neurons are based on the selected clusters (as described in Section 4.2.2); after that we obtain the membership values as described above, and finally our code chooses the best aggregation function to find the final membership value of the hidden neuron (fuzzy signature neuron). The next 3 sections explain this part of the application in detail.

4.2.3.1 Structure of fuzzy signature

Each fuzzy signature has a different structure; this has the advantage of reducing the risk of the results being affected by manual construction. In our project, we use a function called CreateS(n, input) to handle this part. The pseudocode below describes the algorithm and the process for creating the structure of the fuzzy signature.

Input: input (the number of input dimensions), n (the number of fuzzy signatures we need to create)
Output: structure (a matrix which contains the structure of the fuzzy signatures)

structure = CreateS(n, input)
1. for i = 1 to n              // handle one fuzzy signature at a time
2.     for level = 1 to 10     // in our application the structure is limited to a maximum of 10 levels
3.         create a structure at each level until the number of input dimensions is reached
4.     end loop
5. end loop
6. return the structure

In our project this is actually divided into three parts: two functions provide the loops, and the last function creates the structure of a single fuzzy signature. After creation is done, we obtain the structure. As an example, consider the structures of fuzzy signatures for SARS, which has 8 input dimensions, where we want to create 2 fuzzy signatures (Figure 8). The first fuzzy signature has two levels: the first level covers the second to the seventh dimension, and the second level covers the 1st dimension to the end. The second fuzzy signature has 3 levels: the first covers the 1st to the 3rd dimension; after the aggregation function is applied to the first level, the second level covers the 1st to the 6th dimension, and the third level covers the 1st to the 8th dimension. The two vectors in Figure 9 present the detailed structures of these two fuzzy signatures.

Figure 8: Examples of structure of fuzzy signature for SARS

Figure 9: Example structures of fuzzy signature according to Figure 8
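The report does not give the exact randomisation scheme inside CreateS, but the examples in Figures 8 and 9 suggest a sequence of nested [start, end] index ranges that widen level by level until the full input range is covered. The sketch below is one hypothetical way to generate such a structure for a single fuzzy signature; all names and the randomisation details are assumptions.

    % Hypothetical sketch of structure creation for one fuzzy signature:
    % a list of [start, end] index ranges, one per level, each containing
    % the previous one, capped at 10 levels as in the report.
    function s = createStructure(nInput)
        maxLevels = 10;
        lo = randi(nInput);                 % innermost range, chosen at random
        hi = randi([lo, nInput]);
        s = zeros(0, 2);
        for level = 1:maxLevels
            s(end+1, :) = [lo, hi];         % record this level's range
            if lo == 1 && hi == nInput, break; end
            lo = randi(lo);                 % widen downwards (new start <= old start)
            hi = randi([hi, nInput]);       % widen upwards (new end >= old end)
        end
        if s(end, 1) ~= 1 || s(end, 2) ~= nInput
            s(end+1, :) = [1, nInput];      % ensure the top level spans all inputs
        end
    end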

22 method is applied on the formatted data set. It is essential to create a neural network that contains the cluster s information such as the centroid. The detailed structure information of this matrix is shown below: c 11 c 1n [ c 21 c 2n ] c j1 c jn Where j is the number of clusters n is the number of dimensions of input data Cjn is the coordinate of centroid point at dimension n in cluster j Equation 3.1: Information of the cluster Aggregation Before we do the aggregation, we need select the aggregation functions for each fuzzy signature. In our project, we have 3 aggregations function method, max, min, and average. In Fan s application, we must select it manually, which has the disadvantage of the risk of worse selection of aggregation functions. Therefore, in our application, we improve this, and following Mendis work, [10] we used the standard deviation comparison to select the aggregation function. The standard deviation comparison has 4 processes; firstly we calculate all possible membership values of the fuzzy signature. Then we find the average value for all membership values, thirdly we find the standard deviation according to the average membership values, finally we can find the smallest standard deviation value s position, then we can find the aggregation function we select. The figure below shows the process of finding the aggregation function for a 2 level structure fuzzy signature ave max min ave max min fun Average = 0.42 Smallest sd = 0.04 Aggregation function = min/average Figure 10: Example of aggregation function selection Page 21

4.2.4 Create neural network

Before the neural network is created, we need to set some parameters as initial conditions. The following parameters need to be specified for the creation process:

1. the number of fuzzy signature neurons,
2. the number of training epochs.

Both parameters are determined by the user at the beginning of the program; the second is the number of training passes over the dataset. The approach also provides some parameters automatically. The size of the weight matrix is based on the number of hidden neurons and the number of output dimensions. Furthermore, the neural network weight matrix values are initially set to small random values; this gives the training procedure a good starting point from which to work towards a solution. For this, a function in the Matlab Statistics Toolbox called random is embedded into the implementation: it takes the numbers of columns and rows, combined with a mean and standard deviation, as arguments, and generates a matrix of random values. After all parameters have been determined, a data container called handles packs all of them and sends the package to the training module.

4.2.5 Training the Fuzzy Signature Neural Network

The training procedure helps us to find suitable values of the weight matrix which make the actual outputs model the desired outputs more closely. During this process, neurons learn the weights iteratively by being given a number of training data. Therefore, providing a suitable training method is an essential pre-condition for solving a specific task such as a classification problem. In our approach, we use the training module to handle this.

The training module starts the neural network when training data is input. After receiving the training data, we calculate the Manhattan distance between the centroid of a cluster and the given input at each dimension, as in Figure 11, and the Gaussian function is then applied to those distances. The Gaussian function restricts the results to the range between 0 and 1, representing the similarity between the given input and the centroid of the cluster: the value 1 means the given input is the same as, or very close to, the centroid of the cluster, while the value 0 means the given input is far from the centroid. After the hidden neurons finalize the similarity computations in all input dimensions, the results are encapsulated by the structure, that is, a fuzzy signature.

Figure 11: Manhattan distance (the distance between the input data point and the centroid of the cluster is the sum of the horizontal and vertical distances)

In order to get a membership value of the fuzzy signature, a specific function called bestaggreatedata(f, s) takes the fuzzy signature f and the structure s as arguments, and produces the membership value. Each membership value is then used to compute the actual output value. The formula below demonstrates the calculation of the actual value:

y = Σ_{i=1}^{n} w_i h_i

where
  y is the output of the neural network,
  n is the number of hidden neurons,
  h_i is the value of hidden neuron i,
  w_i is the weight of hidden neuron i.

After we calculate the actual values, the neural network compares the difference between the actual values and the desired outputs, and then updates the weight matrix values based on the delta rule. The delta rule is a gradient descent update rule for tuning the weight matrix of the neurons. The training module in this implementation uses the delta rule to minimize the error between actual output and desired output. The learning rate can be determined by the user; the default is a small value. The formula below shows the simplified form of the update function:

Δw_{ij} = α (t_i − y_i) x_j

where
  α is a small constant called the learning rate,
  t_i is the value of the desired output at dimension i,
  y_i is the value of the actual output at dimension i,
  x_j is the value of input j (i.e., the output of hidden neuron j).
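One full training step, as we read this section, can be sketched in MATLAB as follows. The sketch applies a Gaussian to the Manhattan distance for each hidden neuron, computes the output as a weighted sum, and updates the weights with the delta rule; for simplicity it skips the fuzzy signature aggregation step and treats the Gaussian responses directly as hidden neuron values. All names and the width parameter sigma are assumptions.

    % One training step: W is nOut x nHidden, x is the input vector (column),
    % t is the desired output vector (column), centroids is nHidden x nDims.
    function [W, y] = trainStep(W, x, t, centroids, sigma, alpha)
        nHidden = size(centroids, 1);
        h = zeros(nHidden, 1);
        for k = 1:nHidden
            d = sum(abs(x(:) - centroids(k, :)'));   % Manhattan distance
            h(k) = exp(-d^2 / (2 * sigma^2));        % Gaussian similarity in (0, 1]
        end
        y = W * h;                       % actual outputs
        W = W + alpha * (t - y) * h';    % delta rule update
    end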

When all the training data has been used in this module, one training epoch is complete, but this is not the end of the training phase: training is not complete until all the epochs have finished.

4.3 Testing

The testing part has three modules: testing the network, collecting results, and extracting network information. The two sections below describe them separately.

4.3.1 Testing network

Before testing the network, the dataset needs to be reorganized to create more accurate benchmarks, since the training set and testing set must be kept separate. The reorganization in this test suite follows the k-fold cross validation scheme. For example, if the variable k is set to 4 by the user, both the input data set and the desired data set are divided into 4 subsets; three of them are randomly chosen as the training set and the remaining one is treated as the test set, and the total number of iterations is 4. The advantage of this is that the user can choose how large each test set is and how many independent iterations are averaged over.

After reorganizing the raw data, the network starts the testing part. In the dataset, each desired output is a vector of binary values that represents the class to which the observation belongs. On the other hand, the actual result from the neural network is usually a vector of decimal numbers. Therefore, a specific mapping function is applied to produce an optimized output. Firstly, we obtain the actual values from the neural network. Secondly, we find the position of the maximum value. Thirdly, we create a new vector of the same size as the output vector and set all its elements to 0. Fourthly, we set the element at the position found in the second step to 1. Finally, we return this vector as the output.

Once the optimized output has been generated, the module compares it with the desired output and produces the accuracy and mean squared error values, one by one. Furthermore, the final benchmarks, including the average accuracy rate and average mean squared error, are generated at the end of the procedure. This module also shows the trend of the benchmarks in diagrams, so users can easily judge the quality of the network from those benchmarks.
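The mapping from the network's real-valued output to a one-hot class vector, described in the five steps above, is only a few lines in MATLAB; the function name is illustrative.

    % Convert a real-valued output vector to a one-hot class vector.
    function out = toOneHot(y)
        out = zeros(size(y));
        [~, k] = max(y);   % position of the largest activation
        out(k) = 1;        % winning class gets 1, all others stay 0
    end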

4.3.2 Extracting network information

Once the neural network has been trained and tested, all information that directly relates to the network has been determined. This module collects the network information as a kind of benchmark and stores it in files. There are 5 files: Cluster_detail.txt, Traning_detail.txt, Testing_detail.txt, Final_result.txt, and Structure.txt. Table 1 below describes the files and the benchmarks associated with them. Appendix A at the end of this report also shows an example of network information based on the Wine data set, associated with the experiments and evaluation chapter.

File name           Description
Cluster_detail.txt  Describes the cluster information in detail, including the class distribution, centroid value, minimum value and maximum value for each cluster.
Traning_detail.txt  Contains training information.
Testing_detail.txt  Shows testing information based on the trained network, including the desired output, optimized output and actual output for unmatched results.
Structure.txt       Shows the detailed structure information for each cluster, e.g. for 8 input dimensions: 1,8; 1,2 3,7 1 8;
Final_result.txt    Shows the general result benchmarks, including the accuracy rate and mean squared error per iteration, as well as the average accuracy rate and mean squared error.

Table 1: Information of files

5. Experiments and Evaluation

This chapter introduces three experiments, their evaluation, and comparison with other results. Experiments 1 and 2 use the same data sets; however, in experiment 2 we damage the datasets with some missing values, since we wish to evaluate the advantages of our approach when the dataset has missing values. In experiment 3 our approach is compared with other neural networks, whose results are taken from the relevant academic papers. Before the experiments and evaluation, we describe the basic information for the experiments.

5.1 Description of the dataset

There are 10 data sets, described in this section, from the University of California Irvine (UCI) machine learning data set repository. Table 2 below shows the general information for these data sets [12].

Data set     Input dimensions   Output dimensions   Observations
Wine         13                 3                   178
Thyroid      21                 3                   7200
SARS         8                  4                   4000
Ionosphere   34                 2                   351
Horse        58                 3                   364
Heart        35                 2                   920
Heartc       35                 2                   303
Diabetes     8                  2                   768
Card         51                 2                   690
Cancer       9                  2                   699

Table 2: Dataset details

Descriptions for all data sets except SARS are listed at the UCI website; SARS is from Wong [2]. A brief synopsis of each data set is as follows:

Wine
This data set concerns a chemical classification of three different types of wine grown in the same region in Italy. It consists of 13 input dimensions, which describe the wine, and three output dimensions, which indicate the three types of wine.

Thyroid
This data set refers to a classification task that diagnoses a patient's thyroid condition. There are 21 inputs and 7200 examples. The desired output has 3 categories, indicating whether the patient's thyroid is over-functioning, functioning normally, or under-functioning.

SARS
This data set describes severe acute respiratory syndrome (SARS) patient information, such as fever temperature at different times, blood pressure, nausea and abdominal pain, using 8 input dimensions. The 4 outputs indicate 4 types of patients: SARS, normal, pneumonia and hypertension. The total number of observations is 4,000 (Wong et al. [2]).

Ionosphere

This data set refers to radar data collected by a system in Goose Bay, Labrador. It is used to determine whether some type of structure exists in the ionosphere. There are 34 input dimensions that indicate values of the electromagnetic signal, and 2 desired output dimensions that show whether the structure exists or not; this is therefore a binary classification task. The total number of observations is 351.

Horse
This data set concerns a classification task which predicts the fate of a colic horse: whether the horse will survive, will die or will be euthanized, based on the results of a veterinary examination.

Heart
This data set represents the prediction of heart disease. There are 35 input dimensions associated with personal data such as age, sex, smoking habits, subjective patient pain descriptions, and the results of various medical examinations such as blood pressure and electrocardiogram results. The 2 output dimensions determine whether at least one of four vessels is reduced in diameter by more than 50%.

Heartc
This is an alternate version of the Heart data set. The structure of the inputs and outputs is the same as the Heart data set; the difference is that Heartc comes from a different source, the Cleveland Clinic Foundation.

Diabetes
This data set concerns a classification task that diagnoses Pima Indian diabetes. The aim is to determine whether a Pima Indian individual is diabetes positive or not.

Card
This data set refers to a binary classification task that predicts whether a customer's credit card application should be approved or not. There are 51 inputs representing a real credit card application, and the 2 outputs show the decision on the application.

Cancer
This data set concerns a classification task that diagnoses breast cancer: the main task is to indicate whether a tumor is benign or malignant.

5.2 Hardware and software environment information

We used the same Matlab version and computer to test both our approach and Fan's, to avoid different hardware or software leading to different results. Table 3 below shows the details of the hardware and software:

MATLAB version       7.8.0 (R2009) x64
Operating system     Windows 7 Ultimate x64
CPU                  Intel Core i7-2630QM, 2.00 GHz
Installed memory     6.00 GB RAM
Hard disk            750 GB

Table 3: Hardware and software environment information

5.3 Experiment 1: Datasets experiments with no missing data

5.3.1 Purpose of the experiment

The aim of this experiment is to discover the feasibility and performance of fuzzy signature based neural networks on a range of data sets. This experiment determines the results of our approach and Fan's on datasets with no missing data; by comparing ours with Fan's, we try to find the advantages and disadvantages of our fuzzy signature neural network. Finally, we analyze the reasons for the differences in the results.

5.3.2 Description of the experiment

This experiment produces the related benchmarks on the data sets described in Section 5.1 and compares them with Fan's classification approach. In order to avoid results varying due to initial conditions, we ran the experiment 5 times for each dataset and used the same number of hidden neurons (fuzzy signature neurons) each time. Additionally, in Fan's approach, we used the average method as the aggregation function. Both approaches were limited to 100 training epochs, with 20% of the data as the testing dataset and 80% as the training dataset. Table 4 below shows the number of fuzzy signature neurons.

Data set     Input dimensions   Hidden neurons
Wine         13                 –
Thyroid      21                 –
SARS         8                  –
Ionosphere   34                 –
Horse        58                 –
Heart        35                 –
Heartc       35                 –
Diabetes     8                  –
Card         51                 –
Cancer       9                  –

Table 4: Number of clusters selected

5.3.3 Experiment Process and Discussion of Results

As mentioned in the last section, various numbers of fuzzy neurons were used to find the maximum accuracy rate and minimum mean squared error for each dataset. The locally optimized results for all data sets are listed in Table 5. More specifically, each row in Table 5 represents the benchmark of one data set, and includes the mean accuracy rate (the result of 5 fold cross validation) and the mean squared error for both the training and testing data sets; for each benchmark we use the average over the 5 folds.

Data set     Training mean (%)   Training MSE   Testing mean (%)   Testing MSE
Wine         –                   –              –                  –
Thyroid      –                   –              –                  –
SARS         –                   –              –                  –
Ionosphere   –                   –              –                  –
Horse        –                   –              –                  –
Heart        –                   –              –                  –
Heartc       –                   –              –                  –
Diabetes     –                   –              –                  –
Card         –                   –              –                  –
Cancer       –                   –              –                  –
Average      –                   –              –                  –

Table 5: Benchmark for our approach

In the table above, there is no generally optimal number of fuzzy neurons in this experiment; moreover, each data set's accuracy rate is different. Table 6 below shows Fan's results, which are not very different.

Data set     Training mean (%)   Training MSE   Testing mean (%)   Testing MSE
Wine         –                   –              –                  –
Thyroid      –                   –              –                  –
SARS         –                   –              –                  –
Ionosphere   –                   –              –                  –
Horse        –                   –              –                  –
Heart        –                   –              –                  –
Heartc       –                   –              –                  –
Diabetes     –                   –              –                  –
Card         –                   –              –                  –
Cancer       –                   –              –                  –
Average      –                   –              –                  –

Table 6: Benchmark for Fan's approach

From the two tables above, we can see that overall there is only a small difference between our fuzzy signature neural network and Fan's fuzzy signature RBF neural network. Although the accuracy rates of our approach on the training and testing datasets are higher than Fan's, and the mean squared error of our approach on the testing dataset is lower than Fan's, the differences are small, so we do not have any clear evidence that our algorithm has advantages over Fan's approach. Additionally, on specific datasets, the accuracy rates for the wine, thyroid, SARS, diabetes, card and cancer datasets are a little higher than Fan's, while the accuracy rates on the other datasets are a little lower. So there are 6 datasets where our result is higher than Fan's and 4 datasets where our result is slightly lower. Based on these benchmarks, we still cannot conclude which algorithm is better. Therefore, we performed a second experiment, in which datasets with missing values are used to evaluate the differences between our algorithm and Fan's.

5.4 Experiment 2: Datasets experiments with missing data

5.4.1 Purpose of the experiment

In experiment 1, we did not find any significant difference between our approach and Fan's in the benchmarks. Therefore, we now test our approach and Fan's on datasets with missing data. The aim of this experiment is still to discover the feasibility and performance of fuzzy signature based neural networks on a range of data sets. Through the experiment, and comparison between ours and Fan's, we try to find the advantages and disadvantages of our fuzzy signature neural network.

5.4.2 Description of the experiment

In this experiment, we damage the datasets with 20% missing values; the method for introducing the missing values was introduced in Section 4.2.1. For Fan's approach, we need to handle the missing values: Fan's approach uses the average value of the dimension in place of the missing values. After dealing with the missing values, we used otherwise exactly the same data and numbers of hidden neurons as in experiment 1. All this information can be found in Section 5.3.2.

5.4.3 Experiment Process and Discussion of Results

As in experiment 1, various numbers of fuzzy neurons were used to find the maximum accuracy rate and minimum mean squared error for each dataset. Table 7 below shows the average accuracy rates and mean squared errors for our approach.

Data set     Training mean (%)   Training MSE   Testing mean (%)   Testing MSE
Wine         –                   –              –                  –
Thyroid      –                   –              –                  –
SARS         –                   –              –                  –
Ionosphere   –                   –              –                  –
Horse        –                   –              –                  –
Heart        –                   –              –                  –
Heartc       –                   –              –                  –
Diabetes     –                   –              –                  –
Card         –                   –              –                  –
Cancer       –                   –              –                  –
Average      –                   –              –                  –

Table 7: Benchmark for our approach with missing values

Because the datasets have missing values, all the benchmark results decreased slightly compared with experiment 1. However, the decrease is slight and the results remain generally stable; therefore, our approach works quite well on datasets with missing values. After that, we tested Fan's approach under the same conditions, with 20% missing values. Table 8 below shows the benchmark for Fan's approach.

Data set     Training mean (%)   Training MSE   Testing mean (%)   Testing MSE
Wine         –                   –              –                  –
Thyroid      –                   –              –                  –
SARS         –                   –              –                  –
Ionosphere   –                   –              –                  –
Horse        –                   –              –                  –
Heart        –                   –              –                  –
Heartc       –                   –              –                  –
Diabetes     –                   –              –                  –
Card         –                   –              –                  –
Cancer       –                   –              –                  –
Average      –                   –              –                  –

Table 8: Benchmark for Fan's approach with missing values

From Fan's benchmark, the accuracy rates of both the training and testing datasets show a larger decrease, with the accuracy rate on the training datasets dropping to 78.70 and the accuracy rate on the testing datasets showing a similar drop. Compared with our approach, this represents worse results in two respects. The first is that all of the benchmarks of our approach are higher than Fan's. The second is that the effect of missing values on our approach is smaller than on Fan's. On two datasets, wine and SARS, the results of Fan's approach decreased sharply: the accuracy rate on the wine dataset decreased from 93% to 57%, whereas in our approach the effect is slight and the result stays almost the same at 93%. The same effect occurs on the SARS dataset.

According to the results above, we conclude that Fan's fuzzy signature RBF neural network has an obvious limitation: it depends heavily on data integrity. Our approach, on the other hand, is less affected when the datasets have missing values. Because the aggregation method and the fuzzy signature structures of our approach are suited to more extreme cases, it is less affected when the data is incomplete. In future work we should examine eliminating the 20% of data values first and using those damaged datasets to construct the fuzzy signatures for both our approach and Fan's.

5.5 Experiment 3: Benchmarks comparison between fuzzy signature neural network and other approaches

5.5.1 Purpose of the experiment

The purpose of this experiment is to evaluate the performance of our approach compared with other neural networks, so that we can find ways to improve it.

5.5.2 Description of the experiment

In this experiment, our approach is compared with other neural networks, namely Cascor and k-sNN. To make this comparison, we first found the benchmarks on which both Cascor and k-sNN have been tested by other researchers [13, 14]: the heart, horse, diabetes and cancer datasets have been tested with Cascor and k-sNN. We then took the most optimized results from each approach. Table 9 below shows the values of our parameters, namely the number of fuzzy neurons and the number of epochs.

Dataset    Fuzzy neurons   Epochs
Heart      –               –
Horse      –               –
Diabetes   –               –
Cancer     –               –

Table 9: Parameters of our approach used in experiment 3

5.5.3 Experiment Process and Discussion of Results

We used the parameters from the last section to test our approach 5 times, found the accuracy rates on the training and testing datasets, calculated the average accuracy rates, and collected the other neural networks' accuracy rates. Table 10 below shows this information.

Dataset    Fuzzy signature neural network   Cascor   k-sNN
Heart      –                                –        –
Horse      –                                –        –
Diabetes   –                                –        –
Cancer     –                                –        –

Table 10: Results of 3 different neural networks

From the table above, the best overall performance is Cascor's; our approach is slightly lower than Cascor, and k-sNN is the worst. On the horse dataset our approach has the worst performance, though on the other hand, on the heart and diabetes datasets our approach has the best accuracy rates. From the viewpoint of processing speed, the fuzzy signature neural network is the fastest, and the others are slower.

6. Conclusion and Future Work

6.1 Conclusion

We have implemented our approach (the fuzzy signature neural network) and Fan's approach (the fuzzy signature RBF neural network), and compared their advantages and disadvantages. The techniques and methodologies for implementing such a network, and suitable test cases, have been demonstrated. We used real world data sets from UCI, and benchmarks were compared across different network parameters and different classification methods. Our approach achieved stable and good results; especially on datasets with missing values, it still provides stable and good benchmarks. Therefore the approach is viable and worth further investigation.

6.2 Future Work

In this report, our approach (the fuzzy signature neural network) shows good performance on classification datasets; however, it still has room for improvement. We therefore give 3 major suggestions for future work, beyond the few minor suggestions already mentioned in earlier sections of this report.

Firstly, for the aggregation functions we only chose 3 methods, max, min and average, to handle the aggregation process. This is somewhat simplistic; we could add a weights matrix during the aggregation process. For the weighted method, we might use the weight learning introduced by Mendis, 2008 [1]. It is a more complex aggregation method, and we would compare benchmarks to see the result, which we expect would improve.

Secondly, our approach uses a 3 layer neural network, which means only one hidden layer is used. This has the disadvantage that it may miss some features in the fuzzy signature neural network. Based on this, we suggest adding another hidden layer and then using the back propagation algorithm to modify the weights, which may find new features in the dataset.

Finally, this approach has a limitation for general real world tasks, because it is suited to classification datasets and has not been tested on regression datasets. In order to extend the types of dataset, for the fuzzy part we could use polymorphic fuzzy signatures instead of our fuzzy signatures, and at the same time the neural network could use the Cascor neural network instead of the RBF neural network [15]. Because polymorphic fuzzy signatures and the Cascor neural network each have better performance individually, we expect that combining these two methods into our approach would give better performance [1, 15].

References

[1] B.S.U. Mendis, Fuzzy Signatures: Hierarchical Fuzzy Systems and Applications, PhD thesis, Department of Computer Science, The Australian National University, Australia, March 2008.
[2] K.W. Wong, T.D. Gedeon, L.T. Kóczy, "Construction of Fuzzy Signature from Data: An Example of SARS Pre-clinical Diagnosis System", Proceedings of the IEEE International Conference on Fuzzy Systems FUZZ-IEEE 2004, 2004, pp.
[3] W. Fan, Fuzzy Signature based Radial Basis Neural Network, Master thesis, Department of Computer Science, The Australian National University, Australia, Nov 2008.
[4] T. Gedeon, K. Wong, and D. Tikk, "Constructing hierarchical fuzzy rule bases for classification", in Fuzzy Systems, The 10th IEEE International Conference, 2001, pp.
[5] M. Minsky and S. Papert, Perceptrons, Cambridge, MA: MIT Press, 1969, pp.
[6] M.D. Buhmann, Radial Basis Functions: Theory and Implementations, 1st edn, Cambridge University Press, 2003.
[7] B.S.U. Mendis, T.D. Gedeon, and L.T. Kóczy, "Flexibility and robustness of hierarchical fuzzy signature structures with perturbed input data", in International Conference on Information Processing and Management of Uncertainty in Knowledge Based Systems (IPMU), 2006, pp.
[8] P. Strumillo, W. Kaminski, "Radial Basis Function Neural Networks: Theory and Applications", in Proceedings of the Sixth International Conference on Neural Networks and Soft Computing, Zakopane, Poland, 2006, pp.
[9] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning (2nd ed.), New York: Springer, 2009, pp.
[10] B.S.U. Mendis and T.D. Gedeon, "Complex Structured Decision Making Model: A hierarchical framework for complex structured data", Information Sciences, vol. 194, pp., July 2011.
[11] G. Karypis, E.-H. Han, and V. Kumar, "Chameleon: Hierarchical Clustering Using Dynamic Modeling", Computer, vol. 32, no. 8, pp., 1999.
[12] The UCI Machine Learning Repository, 1987, Center for Machine Learning and Intelligent Systems, viewed 15 Feb 2011, <

[13] N.K. Treadgold and T.D. Gedeon, Exploring constructive cascade networks, Industrial Electronics Society, IECON '01: The 27th Annual Conference of the IEEE, vol. 1, 2001, pp.
[14] R. Nock, M. Sebban, and D. Bernard, A simple locally adaptive nearest neighbor rule with application to pollution forecasting, International Journal of Pattern Recognition and Artificial Intelligence, 2003, pp.
[15] S. Fahlman and C. Lebiere, The Cascade-Correlation Learning Architecture, Advances in Neural Information Processing Systems 2, D.S. Touretzky, ed., 1990, pp.

Appendix A

The files below show an example of the wine dataset's text benchmarks with 6 fuzzy neurons and 100 epochs, with 20% missing values.

File: cluster_detail.txt

Details for 6 clusters
Details for cluster number 1 with number of elements 14 are:
Percentage of member is class 1 are
Percentage of member is class 2 are
Percentage of member is class 3 are
Cluster centroid
Minimum values of each argument
Maximum values of each argument
Details for cluster number 2 with number of elements 44 are:
Percentage of member is class 1 are
Percentage of member is class 2 are
Percentage of member is class 3 are
Cluster centroid
Minimum values of each argument
Maximum values of each argument
Details for cluster number 3 with number of elements 28 are:
Percentage of member is class 1 are

Percentage of member is class 2 are
Percentage of member is class 3 are
Cluster centroid
Minimum values of each argument
Maximum values of each argument
Details for cluster number 4 with number of elements 44 are:
Percentage of member is class 1 are
Percentage of member is class 2 are
Percentage of member is class 3 are
Cluster centroid
Minimum values of each argument
Maximum values of each argument
Details for cluster number 5 with number of elements 28 are:
Percentage of member is class 1 are
Percentage of member is class 2 are
Percentage of member is class 3 are
Cluster centroid
Minimum values of each argument
Maximum values of each argument

Details for cluster number 6 with number of elements 20 are:
Percentage of member is class 1 are
Percentage of member is class 2 are
Percentage of member is class 3 are
Cluster centroid
Minimum values of each argument
Maximum values of each argument

File: training_detail.txt

weight matrix learnt after run number
weight matrix learnt after run number
weight matrix learnt after run number
weight matrix learnt after run number
weight matrix learnt after run number

File: testing_detail.txt

Failed results in run number 1
Observation 69 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Observation 82 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Observation 89 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Observation 138 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Observation 145 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Observation 172 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Failed results in run number 2
Observation 39 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)

Observation 62 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Observation 147 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Observation 159 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Observation 161 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Failed results in run number 3
Observation 36 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Observation 69 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Observation 145 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Observation 171 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Observation 172 does not match the desired result

Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Failed results in run number 4
Observation 22 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Observation 26 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Observation 28 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Observation 63 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Observation 64 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Observation 147 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Failed results in run number 5
Observation 24 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Observation 36 does not match the desired result

Actual classification
Neural net classification (rounded)
Neural net classification (actual)
Observation 97 does not match the desired result
Actual classification
Neural net classification (rounded)
Neural net classification (actual)

File: structure.txt

There are 6 clusters in it.
Cluster 1 : 2,5 6,11 1,13,2
Cluster 2 : 2,4 5,12 1,13,2
Cluster 3 : 2,3 4,7 8,12 1,13
Cluster 4 : 2,4 5,11 1,13
Cluster 5 : 2,7 8,10 11,12 1,13
Cluster 6 : 2,5 6,11 1,13

File: final_result.txt

Final result for iteration 1
training data set success rate =
testing data set success rate =
training data set mean square error =
testing data set mean square error =
Final result for iteration 2
training data set success rate =
testing data set success rate =
training data set mean square error =
testing data set mean square error =
Final result for iteration 3
training data set success rate =
testing data set success rate =
training data set mean square error =
testing data set mean square error =
Final result for iteration 4
training data set success rate =
testing data set success rate =
training data set mean square error =
testing data set mean square error =
Final result for iteration 5
training data set success rate =
testing data set success rate =
training data set mean square error =
testing data set mean square error =
Final benchmarks
Average training success rate =
Average testing success rate =
Average training mean square error =
Average testing mean square error =
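For reference, the two figures reported for each iteration in final_result.txt can be reproduced from the network outputs as in the minimal sketch below. It assumes, as in testing_detail.txt above, that the rounded network output is compared against the integer class label; the function name is ours.

    import numpy as np

    def benchmarks(predicted, desired):
        # Success rate: fraction of rounded network outputs that match the
        # desired class. Mean square error: computed on the raw outputs.
        predicted = np.asarray(predicted, dtype=float)
        desired = np.asarray(desired, dtype=float)
        success_rate = float(np.mean(np.round(predicted) == desired))
        mse = float(np.mean((predicted - desired) ** 2))
        return success_rate, mse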

The screenshot for the example
