CHAPTER 7

MASS LOSS PREDICTION USING ARTIFICIAL NEURAL NETWORK (ANN)

Various mathematical techniques, such as regression analysis, and software tools have been used to develop models that explain the input-output relationship with minimum error. Depending upon the complexity of the problem, either a mathematical technique or a software tool can be selected. Neural Networks (NN) can represent many complicated functions because of their sophisticated nature, and this technique has provided convenient solutions to a wide range of problems in almost every technological field (Laurene 1994; Galushkin 2010). Owing to the self-learning nature of NN, however, their behavior can sometimes be unpredictable and unexpected.

7.1 INTRODUCTION TO ANN

An ANN is a mathematical or computational model based on biological neural networks. The first model of basic neurons was proposed by McCulloch and Pitts in 1943. The basic model of the artificial neuron was derived from the functionality of a biological neuron in the human brain. The human brain contains more than ten billion interconnected neurons, and each neuron is a cell that uses biochemical reactions to receive, process and transmit information. The dendrites (tree-like networks of nerve fibers) are connected to the soma or cell body, where the cell nucleus is positioned. A single long fiber called the axon extends from the cell body and is connected to
other neurons through synapses or synaptic terminals. Figure 7.1(a-b) shows a simplified biological and artificial neuron. When a neuron is excited, the soma receives inputs from other neurons through adaptive synaptic connections to the dendrites. The nerve impulses from the soma are then transmitted along the axon to the synapses of other neurons. Artificial neurons are similar to their biological counterparts. The basic building block of an ANN is the artificial neuron, and such a model follows three simple rules, namely multiplication, summation and activation.

Figure 7.1 Model of a) Biological neuron b) Artificial neuron
Every input value is multiplied by an individual weight at the entrance, and the weighted input values are summed to determine the strength of the output. The sum of the weighted inputs and the bias, passed through the activation function, constitutes the transfer function. The activation function gives an output varying between 0 (for low input values) and 1 (for high input values), or between -1 and +1. Mathematically, this process is described in Figure 7.1b. The result of this function is then passed on as the input to the next layer, and the weights determine the behavior of the network.

One of the advantages of ANNs is their ability to learn from their environment. This is useful in applications where the complexity of the environment makes other types of solution impractical. As such, ANNs can be used for a variety of tasks such as data processing, regulation, decision making, classification, clustering, robotics, compression and function approximation.

Taskin & Caligulu (2006) and Mustafa et al (2008) have modeled the adhesive wear resistance of Al-Si-Mg/SiCp composites using a Back Propagation Neural Network (BPNN). At the end of the training and testing process, the results were compared with the experimental test results. It was found that the overall performance of the model was quite satisfactory and low error fraction values were obtained. John & Kingsly (2008) have attempted to predict the wear loss of A390 aluminium alloy. The results showed a satisfactory agreement between the experimental and ANN results, and it was suggested that ANN can be an efficient prediction tool in the area of material characterization and tribology. Ahmet et al (2009) have estimated the wear loss of Al2024 and Al6063 alloys at different temperatures, aging times and applied loads. It was suggested that the overall performance of the model was quite satisfactory and that this prediction technique may be applied to other manufacturing processes.
Dobrzanski et al (2007) have attempted to predict the mechanical properties of the Al-Si-Cu alloy using the NN approach. The predicted results showed good agreement with the experimental data. Dobrzanski et al (2008) have also investigated the prediction of the hardness of various magnesium alloys at different temperatures, solution heat treatments, aging times and percentages of aluminium content using ANN. The optimal heat treatment conditions and times were obtained from a well-trained model, which helped to attain the best mechanical properties. Tang et al (2009) have developed a NN model with smaller errors, which helped to improve the accuracy of the prediction results; the predicted results were found to be in good agreement with the experimental data. Zhang et al (2006) have developed an ANN model for the tribological behavior of SiC-filled PEEK coatings based on the influence of sliding velocity and applied load, and the developed models were found to be relatively satisfactory. Singh et al (2006) have predicted the tool flank wear of High Speed Steel (HSS) drill bits over copper. The models were trained using BPNN and, when the results were compared with the experimental values, it was observed that the NN was able to learn the wear model efficiently. Palanisamy et al (2008) have attempted to predict the flank wear of the cutting tool used in an end-milling operation using regression and ANN models; the results revealed that the NN model was a better prediction tool than the regression method. Mustafa et al (2008) have tested the accuracy of an ANN model of SiC-reinforced Al-alloy Metal Matrix Composite (MMC); the obtained results exhibited low error fraction values, which confirmed the performance of the model. Ugur et al (2008) have investigated the effects of various burnishing parameters such as burnishing force, number of passes, feed rate and burnishing speed on the surface roughness of AA7075 Al-alloy. The
predictions of the ANN model coincided with the test results, which helped to determine the average surface roughness value in a short time.

Prediction of the friction and wear properties of the developed alloys is significant, as it can save both cost and time. ANNs have the capacity to eliminate the need for expensive and difficult experimental investigations in testing and manufacturing processes. In recent years, neural network models have been widely used in different metallurgical applications.

7.1.1 Statistical Analysis of ANN

From the artificial neuron model, the internal activity of the neuron can be shown to be

v_k = \sum_{j=1}^{p} w_{kj} x_j                                          (7.1)

In general, three types of activation functions are used in ANNs, namely the threshold function, the piecewise-linear function and the sigmoid function. The sigmoid function is most commonly used as the activation function; an example is the hyperbolic tangent form given by Equation (7.2). Similarly, the bipolar sigmoid activation function, which is used to estimate the output of a neuron that receives input from other neurons (except neurons in the input layer), is given by Equation (7.3) (Laurene Fausett 1994).

\varphi(v) = \tanh(v/2) = \frac{1 - e^{-v}}{1 + e^{-v}}                  (7.2)

g(x) = \frac{1 - e^{-x}}{1 + e^{-x}}                                     (7.3)
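The weighted summation of Equation (7.1) followed by the bipolar sigmoid of Equation (7.3) can be sketched in Python as follows; the inputs, weights and bias are arbitrary illustrative values, not data from this study:

```python
import math

def neuron_output(x, w, bias=0.0):
    # Internal activity v_k: weighted sum of the inputs (Eq. 7.1)
    v = sum(wj * xj for wj, xj in zip(w, x)) + bias
    # Bipolar sigmoid activation (Eq. 7.3): output lies in (-1, 1)
    return (1 - math.exp(-v)) / (1 + math.exp(-v))

# Three inputs with arbitrary weights, for illustration only
y = neuron_output([0.5, -0.2, 0.8], [0.4, 0.3, -0.1])
print(y)
```

A zero weighted sum gives an output of exactly 0, and large positive or negative sums saturate towards +1 or -1, as expected of a bipolar activation.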
The outputs of the aforementioned layers can be determined as shown in Equations (7.4) and (7.5):

h = F(Mx)                                                                (7.4)

o = F(Nh)                                                                (7.5)

The internal error of the output layer is calculated and then back-propagated to the hidden layer. The weights of the links are adjusted to minimize this error, and the error is calculated again with the new weights. This cycle is called an epoch, and training of the net stops when any one of the stopping criteria is met. The percentage prediction error of the ANN system is calculated using Equation (7.6) (Pendse & Joshi 2004):

\%e_{pr} = \frac{\text{actual} - \text{computed}}{\text{actual}} \times 100   (7.6)

The layers used in the NN are the input layer (first layer), the output layer (final layer) and the hidden layers (intermediate layers). Each input value is weighted and compared against a threshold value; if the weighted sum exceeds the threshold, the unit fires. The error is calculated from the difference between the desired output of the net for a given input pattern and the actual output. This error value is for that particular pattern; the pattern errors are combined to find the total error for the network. The weights are repeatedly adjusted in order to minimize the error between the actual and required outputs.

The feed-forward back propagation technique has a two-stage learning process involving two passes: a forward one and a backward one. In the forward pass, the information moves in only one direction, forward, from the input nodes, through the hidden nodes and to the output nodes; there are no cycles or loops in the network. In the backward pass, the direction is reversed: starting at the output layer and armed with the actual and
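Equation (7.6) translates directly into code; a minimal sketch with made-up actual and computed values:

```python
def prediction_error_percent(actual, computed):
    # Percentage prediction error of Eq. 7.6
    return (actual - computed) / actual * 100

# Hypothetical values: actual mass loss 2.0 g, computed 1.5 g
e = prediction_error_percent(2.0, 1.5)
print(e)  # 25.0
```

A positive value indicates under-prediction and a negative value over-prediction, which is why a signed mean (MPE, introduced later) and an absolute mean (APE) are both reported.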
required output patterns, an error value can be found for each output unit. The procedure works backwards through the layers, and the error is used to apply the appropriate weight changes to each unit in the network.

7.2 MODELING USING ANN

The performance of the ANN model is evaluated by separating the data into two sets: the training set and the testing set. During training, the parameters of the network are calculated; the learning process is stopped when the error goal is reached, and finally the network is evaluated with the data from the testing set. The network consists of a large number of simple synchronous processing elements called neurons, assembled in different layers such as an input layer, hidden layers and an output layer, as shown in Figure 7.2.

Figure 7.2 Feed-forward neural network architecture

The ANN is built with a systematic approach to optimize a performance criterion or to follow some implicit internal constraint, which is commonly referred to as the learning rule. In supervised learning, an input
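The supervised loop just described (present an input, compare the output with the target, feed the error back, adjust the parameters, repeat) can be illustrated with a deliberately tiny single-weight example. This is a generic delta-rule sketch on an invented task, not the network used in this study:

```python
# Hypothetical task: learn y = 2x from two example pairs
w = 0.0     # single adjustable parameter
lr = 0.1    # learning rate
for _ in range(200):                       # repeated presentations (epochs)
    for x, target in [(1.0, 2.0), (2.0, 4.0)]:
        output = w * x                     # system output (forward pass)
        error = target - output            # desired response minus output
        w += lr * error * x                # adjustment based on the error
print(round(w, 4))  # converges towards 2.0
```

Each pass shrinks the remaining error by a constant factor here, so the weight settles at the value that makes the output match the targets, which is exactly the behavior the learning rule is meant to produce.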
is presented to the neural network and a corresponding target is set at the output. The difference between the desired response and the system output is calculated as the error. From this error, information is fed back to the system, and systematic adjustments of the system parameters are made based on the learning rule. The process is repeated until the performance of the net is acceptable.

The effectiveness of the ANN is ensured by normalizing the data, which confines the values between certain limits and also makes all the input parameters equally important in the training of the neural network. This is done by mapping each term to a value between -1 and +1, or between 0 and 1, using Equation (7.7). The normalized value of a parameter is

y_{norm} = \frac{2(y - y_{min})}{y_{max} - y_{min}} - 1                  (7.7)

The input layer receives input from the external environment, and the output layer communicates the output of the system to the user or external environment; there are usually a number of hidden layers between these two layers. The process continues until the network outputs fit the targets. Once the network is trained, it may be used to calculate the output for any arbitrary set of input data through the fixed weight factors, and the errors are also calculated. ANN has the potential to minimize the need for expensive experimental investigation and/or inspection of aluminium alloys used in various applications, hence resulting in large economic benefits for organizations. The training phase can be finished in a few minutes, whereas the experimental study lasts for a number of days.

The numbers of neurons in the input and output layers are decided by the numbers of input parameters and output responses, respectively. There is no exact theory for choosing the
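A direct sketch of the [-1, +1] mapping of Equation (7.7), applied to an arbitrary 0-10 range for illustration:

```python
def normalize(y, y_min, y_max):
    # Eq. 7.7: map y linearly from [y_min, y_max] into [-1, +1]
    return 2 * (y - y_min) / (y_max - y_min) - 1

print(normalize(0.0, 0.0, 10.0))   # -1.0  (minimum maps to -1)
print(normalize(10.0, 0.0, 10.0))  # 1.0   (maximum maps to +1)
print(normalize(5.0, 0.0, 10.0))   # 0.0   (midpoint maps to 0)
```

Because every input is mapped onto the same interval, no single parameter dominates the weighted sums purely by virtue of its physical units.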
number of neurons in the hidden layer (Shang and Sun 2008). The initial number of neurons in the hidden layer, n_h, can be estimated from Equation (7.8), and the modeling error, measured by the Root Mean Square Error (RMSE), Mean Percentage Error (MPE), Absolute Percentage Error (APE) and absolute fraction of variance (R²) values of Equations (7.9)-(7.12), is used for making comparisons:

n_h = \sqrt{n_i + n_o} + m                                               (7.8)

APE\,(\%) = \left| \frac{\text{predicted value} - \text{experimental value}}{\text{experimental value}} \right| \times 100   (7.9)

RMSE = \sqrt{\frac{1}{m} \sum_{i=1}^{m} \left( \frac{y_i - \hat{y}_i}{y_i} \right)^2}   (7.10)

MPE = \frac{1}{n} \sum_{j} \frac{a_j - p_j}{a_j} \times 100              (7.11)

R^2 = 1 - \frac{\sum_j (a_j - p_j)^2}{\sum_j (p_j)^2}                    (7.12)

7.3 RESULTS AND DISCUSSIONS

In ANN modeling, the designer chooses the network topology, the performance function, the learning rule and the criterion to stop the training phase, and the system then automatically adjusts the parameters. For every set of input data, the error between the result at the output layer and the actual value is calculated, and the weights between the nodes are adjusted to
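The error measures of Equations (7.9)-(7.12), as reconstructed above, can be sketched as follows; `a` and `p` below are hypothetical actual and predicted values, not results from this study:

```python
import math

def ape(actual, predicted):
    # Absolute percentage error per sample (Eq. 7.9)
    return [abs(p - a) / a * 100 for a, p in zip(actual, predicted)]

def rmse(actual, predicted):
    # Relative root-mean-square error (Eq. 7.10)
    m = len(actual)
    return math.sqrt(sum(((a - p) / a) ** 2
                         for a, p in zip(actual, predicted)) / m)

def mpe(actual, predicted):
    # Mean percentage error (Eq. 7.11); the sign shows over-/under-prediction
    n = len(actual)
    return sum((a - p) / a for a, p in zip(actual, predicted)) / n * 100

def r_squared(actual, predicted):
    # Absolute fraction of variance (Eq. 7.12)
    num = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    den = sum(p ** 2 for p in predicted)
    return 1 - num / den

a = [1.0, 2.0, 4.0]   # hypothetical experimental values
p = [1.0, 2.0, 4.0]   # perfect predictions, for illustration
print(rmse(a, p), mpe(a, p), r_squared(a, p))  # 0.0 0.0 1.0
```

With perfect predictions, RMSE and MPE are 0 and R² is 1; values close to these therefore indicate a well-trained network, which is how Table 7.1 is read later.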
minimize this error. This is done by correcting the weights from the output layer back to the input layer via the hidden layers, hence the name back propagation.

Out of 36 experimental data sets, 28 training data sets are used for both networks to compare their performances, and the remaining 8 data sets outside the training data are selected for testing the neural networks. In this work, two hidden layers are used, each containing 15 nodes. The neural network models are designed and trained using the MATLAB 7.5.0.342 package, and the back propagation algorithm is used for predicting the mass loss under lubricated conditions. Input selection is a very important aspect of NN modeling (Nalbant et al 2008). In this work, the network has three neurons in the input layer (load, sliding distance and alloying element), one neuron in the output layer (mass loss) and 15 neurons in each hidden layer, so the architecture of the ANN is 3:15:15:1, as shown in Figure 7.3.

Figure 7.3 ANN architecture for this study

7.3.1 Wear Curves of Developed Alloys

The mass loss of the developed alloys for three different applied loads of 50 N, 60 N and 70 N is determined through tribological wear tests. The tests are conducted under lubricated conditions with an oil temperature of 80 °C. The total sliding distance is 54 km at a constant sliding speed of 1 m/s. Before and after the wear test, the weight of
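The work above uses MATLAB, but the forward pass of a 3:15:15:1 network of this kind can be sketched equivalently in Python with NumPy. The weights below are random placeholders that training would adjust, and tanh stands in for the sigmoid-type transfer function; none of these values come from the trained model:

```python
import numpy as np

rng = np.random.default_rng(0)
sizes = [3, 15, 15, 1]   # input, hidden 1, hidden 2, output

# Random placeholder weights and zero biases for each layer
W = [rng.standard_normal((m, n)) * 0.1
     for n, m in zip(sizes[:-1], sizes[1:])]
b = [np.zeros(m) for m in sizes[1:]]

def forward(x):
    # x = [load, sliding distance, alloying element], already normalized
    a = np.asarray(x, dtype=float)
    for Wi, bi in zip(W, b):
        a = np.tanh(Wi @ a + bi)   # sigmoid-type activation at each layer
    return a                        # single output: predicted mass loss

out = forward([0.5, -0.3, 0.8])
print(out.shape)  # (1,)
```

The layer loop makes the 3:15:15:1 structure explicit: two weight matrices of shape (15, 3) and (15, 15) feed the hidden layers, and a final (1, 15) matrix produces the single mass loss output.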
the discs is measured using an electronic balance with an accuracy of 10⁻⁴ g, and the mass loss of the developed alloys is determined. The results clearly show that at the maximum applied load the mass loss is significantly higher than at the other loads. The mass loss also increases with increasing sliding distance. The reasons for the higher wear, and the testing procedures, are explained in detail in Sections 6.4 and 6.5 of Chapter 6. The relations between sliding distance and mass loss for the AlTSi and AlTSiH alloys are presented in Figures 7.4-7.5.

Figure 7.4 Wear graph for AlTSi alloy
Figure 7.5 Wear graph for AlTSiH alloy

All the input and output values are normalized between 0.1 and 0.9 using linear scaling. After selecting the final network structure 3:15:15:1, the sigmoid activation function is selected as the transfer function, and the learning rate and momentum are both set to 0.8. After fixing the momentum and learning rate, trials are continued to find the optimal number of epochs, and the training process is ended after 15000 epochs. Figure 7.6 shows the Normalized Standard Error (NSE) over the training cycles, which decreases with the increasing number of iterations and reaches 3.01578e-005. The testing process is then carried out in order to verify that the ANN makes good predictions.
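The 0.1-0.9 linear scaling mentioned above is a simple variant of the mapping in Equation (7.7); a sketch, using the 0-54 km sliding distance range from the tests as the example:

```python
def scale_01_09(y, y_min, y_max, lo=0.1, hi=0.9):
    # Linear scaling of y from [y_min, y_max] into [0.1, 0.9]
    return lo + (hi - lo) * (y - y_min) / (y_max - y_min)

# The range midpoint (27 km of the 0-54 km sliding distance) maps to 0.5
print(scale_01_09(27.0, 0.0, 54.0))
```

Keeping the targets inside 0.1-0.9 rather than 0-1 avoids driving a sigmoid output unit into its saturated extremes, where learning slows down.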
Figure 7.6 ANN training performance graph

Using Equations (7.10)-(7.12), the statistical values between the network predictions and the experimental values for the training and testing data have been calculated, and the results are presented in Table 7.1.

Table 7.1 Statistical values of the mass loss of the developed alloys

                               Training performance    Testing performance
  RMS                          0.000190                0.006597
  R²                           0.999987                0.992678
  MPE                          -0.000876               0.125696
  Error, %   AlTSi alloy       0.175                   0.151
             AlTSiH alloy      6.911                   7.498
7.3.2 Prediction by ANN Model

The experimental values are compared with the predicted values to test the performance of the trained network, and the results are shown in Figures 7.7-7.8. It is evident that the mass loss values derived from the trained ANN closely match the experimental values.

Figure 7.7 Comparison of mass loss at training stage

From this comparison, the prediction accuracies of the network are calculated. In the learning stage, the mean errors obtained for the AlTSi alloy and the AlTSiH alloy are 0.175% and 6.911% respectively. This can still be improved by training the ANN with a larger number of experimental results.
Figure 7.8 Comparison of mass loss at testing stage

The mean errors in the testing stage are found to be 0.151% for the AlTSi alloy and 7.498% for the AlTSiH alloy. Using this trained network, one can now predict the mass loss of the alloys for any combination of the chosen parameters within the range of values studied. These values are within acceptable limits, which confirms the reliability of the ANN training and testing stages; a summary of the proposed model is given in Table 7.2. The trained neural network performs very well, and the predicted mass loss of the alloys is in good agreement with the experimental values.
Table 7.2 Summary of ANN model

  Object model                      Mass loss prediction
  Total number of layers            4
  Number of hidden layers           2
  Number of neurons per layer       Input: 3; Hidden 1: 15; Hidden 2: 15; Output: 1
  Network type                      Feed-forward back propagation
  Transfer function                 Log-sigmoid
  Training function                 Trainlm
  Learning function                 Learngdm
  Learning rate, lr                 0.8
  Momentum constant, mc             0.8
  Acceptable mean square error      0.0001
  MSE at the end of training        3.01578e-005

7.4 CONCLUSIONS

Speed, the ability to learn from experimental results and ease of use are the advantages of ANN compared with classical methods, and it can also reduce the extent of experimental study required; for these reasons, ANN is chosen. This approach is emerging as a dominant tool in materials engineering and can be used efficiently as a prediction technique in the area of material characterization and tribology. In this work, a feed-forward BPNN is developed and used to calculate the mass loss of the developed alloys. The experimental mass loss values of the alloys are used for both training and testing. The error between the predicted and experimental values is small, i.e., there is good agreement with the experimental values, and the network also saves much time. The overall performance of the model is quite satisfactory, and it can be used to predict the mass loss with high accuracy.