A Hybrid Intelligent System for Fault Detection in Power Systems

Hiroyuki Mori, Hikaru Aoyama
Dept. of Electrical and Electronics Eng., Meiji University
Tama-ku, Kawasaki 214-8571 Japan

Toshiyuki Yamanaka, Shoichi Urano
Power Engineering R&D Center, Tokyo Electric Power Co., Inc.
Tsurumi-ku, Yokohama 230-8510 Japan

Abstract- This paper proposes a method for fault detection with a preconditioned artificial neural network. The proposed method makes use of the fast Fourier transform (FFT) and deterministic annealing (DA) clustering as precondition techniques. The proposed method is tested in a sample system.

I. INTRODUCTION

In this paper, a hybrid intelligent system is proposed to handle fault detection in power systems. The proposed intelligent system consists of precondition and inference functions. The former corresponds to clustering to extract data features, while the latter estimates the location and type of fault with a multilayer perceptron (MLP) artificial neural network (ANN). In recent years, power systems have become more complicated under competitive and deregulated environments. As a result, more advanced security control is required for smooth power system operation and planning. It is recognized that power system security control is one of the main concerns. Within the framework of security control, fault detection is one of the important tasks. Specifically, it is required that power system operators at the control center appropriately handle information on faults and detect faults effectively. In other words, more sophisticated fault detection techniques are necessary to maintain secure power systems. The conventional studies on fault detection may be classified into the following:
a) circuit-theory-based methods [1, 2]
b) traveling-wave-based methods [3]
c) intelligent systems [4-11]
Method a) detects a fault through nodal voltages, line currents and impedance changes. Method b) identifies the fault location using the return time of a pulse wave.
As far as Method c) is concerned, several approaches such as expert systems, fuzzy logic and artificial neural networks (ANNs) have been developed. Compared with fuzzy logic and ANNs, expert systems are inferior in terms of solution accuracy. ANNs and fuzzy logic have good nonlinear approximation capability. In this paper, an ANN-based method is considered because it is very complicated to evaluate the optimal fuzzy membership functions in fuzzy logic. Tanaka, et al. proposed a method with MLP for power system fault diagnosis [6]. They tried to estimate the fault location with information on relays and circuit breakers. Afterwards, Kim and Park developed a hierarchical artificial neural network for fault diagnosis in power systems [7]. They used MLP as the ANN to construct a hierarchical scheme of fault detection. Since then, many studies have been made to handle fault detection [8-11]. In this paper, a new method is proposed for fault detection. The differences between the proposed and conventional ANN-based methods may be described as follows:
a) This paper uses a precondition technique for the ANN. The typical ANN has the configuration in Fig. 1(a), where x and z are the input and output vectors. In this paper, a precondition is installed in front of the ANN. The precondition helps the ANN improve the performance of inference.
b) In this paper, two precondition techniques are employed. One is to use the FFT for the measured fault currents. The FFT is a good tool to extract features of current waveforms in the frequency domain. The other is to apply deterministic annealing (DA) clustering [12, 13] to the results obtained by the FFT. It is very effective for classifying data into clusters in a sense of global data clustering. In other words, it gives better results than conventional local clustering such as the k-means method. DA clustering has the advantage that the obtained results are not affected by the initial conditions.
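As an illustration of the FFT precondition in b), the sketch below extracts harmonic magnitudes from a synthetic fault-current waveform. The sampling rate, fundamental frequency, amplitudes and decay constant are invented for the example; only the use of harmonic orders 1 to 15 as the per-phase feature vector follows the paper.

```python
import numpy as np

fs = 1500.0                 # sampling frequency in Hz (assumed)
f0 = 50.0                   # fundamental frequency in Hz (assumed)
N = 150                     # samples in a 100 ms observation window
t = np.arange(N) / fs

# Toy fault current: fundamental, a decaying DC offset and a 3rd harmonic.
i_fault = (10.0 * np.sin(2 * np.pi * f0 * t)
           + 3.0 * np.exp(-t / 0.02)
           + 1.5 * np.sin(2 * np.pi * 3 * f0 * t))

# FFT of the waveform; rfft keeps the non-negative frequencies only.
spectrum = np.fft.rfft(i_fault)
freqs = np.fft.rfftfreq(N, 1 / fs)

# Harmonic magnitudes at orders 1..15 form the feature vector for one phase.
orders = np.arange(1, 16)
idx = [int(np.argmin(np.abs(freqs - n * f0))) for n in orders]
features = 2.0 * np.abs(spectrum[idx]) / N
```

The fundamental entry dominates (about 10) and the 3rd-harmonic entry is about 1.5, so the waveform's frequency-domain signature is captured compactly; stacking such vectors for the three phases of two measured currents yields 15 x 3 x 2 = 90 input variables, matching the input dimension used later in the simulations.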
Therefore, this paper estimates the location and type of fault after preprocessing fault current waveforms with the FFT and DA clustering. The proposed method is successfully applied to a sample system.

Fig. 1 Inference model with a preconditioning technique: (a) conventional model (x -> ANN -> z); (b) preconditioned ANN (x -> precondition -> ANN -> z).

A comparison between the proposed and
conventional methods is made to examine the effectiveness of the proposed method.

II. CLUSTERING

2.1 Clustering

Clustering means to classify given data into clusters with a classification criterion or data similarity. Mathematically, it may be expressed as data processing that assigns m samples x_i (i = 1, 2, ..., m) to k clusters with center vectors y_j (j = 1, 2, ..., k), where m is the number of input data and k is the number of clusters. Each center vector implies the data features of its cluster. In other words, clustering may be represented as an optimization problem that minimizes the distance between samples x_i and center vectors y_j. Thus, the problem has the following cost function:

d = Σ_{i=1}^{m} Σ_{j=1}^{k} v_{i,j} ||x_i - y_j||^2    (1)

where
d: cost function
x_i: i-th datum of the given input data
y_j: center vector of Cluster j
v_{i,j}: association of x_i with Cluster j, such that

v_{i,j} = 1 (if x_i belongs to Cluster j), 0 (otherwise)    (2)

The optimal solution of Eqn. (1) is obtained by determining the associations v_{i,j} and the center vectors y_j. From the standpoint of nonlinear systems, Eqn. (1) has many local minima. Thus, it is necessary to evaluate Eqn. (1) with a global optimization technique. In this paper, deterministic annealing (DA) is used as a global optimization technique for clustering.

2.2 Concept of Deterministic Annealing (DA)

Deterministic annealing (DA), developed by Rose, et al., is effective for clustering. DA is similar to simulated annealing (SA) in the following features:
a) DA and SA come from an analogy with a heat bath. The idea of thermodynamics is introduced into the algorithms.
b) Both techniques make use of a parameter called temperature. It controls the solution search process from high temperature to low temperature.
On the other hand, DA is different from SA in the following aspects:
i) DA changes the form of the cost function although SA has a fixed cost function. DA starts with a simple cost function. At high temperature, DA easily obtains the globally optimal solution.
As temperature goes down, the cost function approaches the original cost function.
ii) DA is based on deterministic solution search while SA makes use of probabilistic solution search. In other words, SA employs random numbers in the algorithm although DA does not.
iii) DA is tailored for clustering only. However, SA is one of the more general optimization techniques.
Fig. 2 shows the difference in solution search between DA and SA from high to low temperature. Figs. 2(a) and 2(b) correspond to DA and SA, respectively. At high temperature, DA readily evaluates the minimum of the quadratic function, while SA moves around freely toward better solutions. At low temperature, DA evaluates the globally optimal solution after transforming the cost function into the original one, while SA has a low probability that the state variable moves to other states. It turns out that SA converges to a solution.

Fig. 2 Solution search of DA and SA (from high to low temperature): (a) DA; (b) SA.

2.3 DA Clustering

The theoretical background of DA clustering is outlined here. DA has the advantage that it expresses association as a probability and
permits a sample to be associated with several clusters. That allows an optimal solution to be found with more flexibility. Eventually, the solution search process deterministically assigns each sample to a certain cluster. Therefore, DA allows global clustering to be carried out. However, conventional clustering like the k-means method gives us results of local clustering. The results are heavily influenced by the initial solutions. Thus, it is clear that DA is better than the k-means method in terms of clustering. Eqn. (1) may be rewritten as

d = Σ_{i=1}^{m} Σ_{j=1}^{k} P(x_i ∈ C_j) ||x_i - y_j||^2    (3)

where
P(x_i ∈ C_j): probability that x_i is associated with Cluster C_j

DA makes use of the maximum entropy principle to determine the association probability. The association probability in Eqn. (3) may be rewritten as

P(x_i ∈ C_j) = exp(-β ||x_i - y_j||^2) / Σ_{j'=1}^{k} exp(-β ||x_i - y_{j'}||^2)    (4)

where
β: temperature parameter

Parameter β is equal to 1/T, where T is temperature. Each time β is updated, the cost function is optimized. DA starts with β ≈ 0, where the input data are uniformly associated with all clusters. As β is increased, each datum comes to belong to a cluster step by step. When parameter β becomes infinity, each datum is assigned to a cluster with probability one. According to the equilibrium state in statistical mechanics, the center vector may be written as

y_j^{(t+1)} = Σ_{i=1}^{m} x_i P(x_i ∈ C_j^{(t)}) / Σ_{i=1}^{m} P(x_i ∈ C_j^{(t)})    (5)

where
t: iteration count
y_j^{(t+1)}: center vector of Cluster j at iteration count t+1
P(x_i ∈ C_j^{(t+1)}): probability that x_i belongs to Cluster C_j at iteration count t+1, such that

P(x_i ∈ C_j^{(t+1)}) = exp(-β ||x_i - y_j^{(t+1)}||^2) / Σ_{j'=1}^{k} exp(-β ||x_i - y_{j'}^{(t+1)}||^2)    (6)

It can be seen that DA optimizes the cost function of clustering each time parameter β is updated. The algorithm of DA clustering may be summarized as follows:

Step 1: Set the initial conditions: initial temperature parameter β_0, final temperature parameter β_max, convergence criterion ε and iteration count t = 0.
Step 2: Select the initial center vectors from the given data.
Step 3: Set t = t+1 and compute center vector y_j^{(t+1)} with Eqn. (5).
Step 4: Evaluate the association probability in Eqn. (6).
Step 5: Calculate the cost function d^{(t+1)} and go to the next step if the convergence criterion is satisfied. Otherwise, return to Step 3.
Step 6: Stop if β > β_max. Otherwise, update β and return to Step 3.

III. PROPOSED METHOD

This paper proposes an efficient method for fault detection with a preconditioned MLP. The objectives of fault detection are to estimate the location of a fault and to identify the fault type. It is assumed here that the input variables are fault currents and the output variables are the location and type of fault. In this paper, two precondition techniques are used to extract features of the input data. One is the FFT, which transforms current waveforms in the time domain into those in the frequency domain. That allows the features of the fault current to be extracted in terms of frequencies. The other is clustering, which classifies data into clusters with a criterion of data similarity. By classifying data into clusters and constructing an MLP at each cluster, we can make the MLPs learn efficiently (see Fig. 3). That is because the data similarity within each cluster improves the performance of the MLP. Specifically, this paper presents DA clustering as the clustering technique. In the proposed method, the input vector x consists of fault current waveforms while the output vector z consists of the location of fault (z_L) and the type of fault (z_T). Now, consider k clusters for the given input data, and define the following methods:

Method A: hybrid method of the FFT and MLP
Method B: hybrid method of the FFT, the k-means method and MLP
Method C: hybrid method of the FFT, DA clustering and MLP (proposed method)

Figs. 3(a), (b) and (c) show Methods A, B and C, respectively. Also, C_i and MLP_i (i = 1, 2, ..., k) denote the i-th cluster and the MLP at Cluster i, respectively. It can be seen that Method A makes use of all input data in the frequency domain to construct the MLP while Methods B and C use the similar data assigned to each cluster. The algorithm of Method C may be written as follows:

Step 1: Carry out the FFT for the input fault currents.
Step 2: Classify the FFT results in the frequency domain into clusters with DA clustering.
Step 3: Construct an MLP at each cluster and estimate the output variables.
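The DA clustering algorithm of Section II, which carries out Step 2 above, can be sketched as follows. This is a minimal numpy sketch under stated assumptions: the β schedule is multiplicative for brevity (the paper updates β additively with Δβ = 0.001), and the parameter values and the synthetic two-blob demo data are illustrative, not the paper's.

```python
import numpy as np

def da_clustering(X, k, beta0=0.01, beta_max=1e3, growth=1.2, eps=1e-4, seed=0):
    """Deterministic annealing clustering (Rose et al.) -- a minimal sketch.

    X: (m, n) data matrix, k: number of clusters. beta = 1/T anneals from
    beta0 up to beta_max (multiplicative schedule here for brevity).
    """
    rng = np.random.default_rng(seed)
    # Step 2: select the initial center vectors from the given data.
    Y = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    beta = beta0
    while beta <= beta_max:
        prev_cost = np.inf
        while True:
            # Squared distances ||x_i - y_j||^2, shape (m, k).
            D = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
            # Association probabilities, Eqns. (4)/(6), computed stably.
            logits = -beta * D
            logits -= logits.max(axis=1, keepdims=True)
            P = np.exp(logits)
            P /= P.sum(axis=1, keepdims=True)
            # Center update, Eqn. (5): probability-weighted means.
            Y = (P.T @ X) / np.maximum(P.sum(axis=0), 1e-12)[:, None]
            cost = (P * D).sum()  # soft cost function, Eqn. (3)
            # Convergence test at this temperature, in the spirit of Eqn. (10).
            if abs(prev_cost - cost) <= eps * max(abs(cost), 1e-12):
                break
            prev_cost = cost
        beta *= growth  # lower the temperature (raise beta)
    # As beta -> infinity, each sample belongs to one cluster with probability 1.
    D = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return Y, D.argmin(axis=1)

# Usage on two well-separated synthetic blobs:
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.5, (20, 2)), rng.normal(10.0, 0.5, (20, 2))])
centers, labels = da_clustering(X, k=2)
```

At low β both centers collapse toward the global mean; as β grows they split, which is why the result is insensitive to the initial center choice, in contrast to k-means.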
Fig. 3 Concept of Methods A, B and C: (a) Method A; (b) Method B; (c) Method C.

Fig. 4 Sample system (one-machine infinite-bus system; fault locations L_1-L_5; measured line currents I_1 and I_2).

IV. SIMULATION

4.1 Simulation Conditions

1) The proposed method is applied to the sample system shown in Fig. 4. The one-machine infinite-bus system has a couple of lines. The down-side line is subjected to the following faults:
a) single-phase line-to-ground fault
b) two-phase line-to-ground fault
c) three-phase line-to-ground fault
d) two-phase short-circuit fault
e) three-phase short-circuit fault

2) The line was sectionalized into six parts so that fault locations L_1-L_5 were placed along the line. The fault location for the learning data was selected from L_1 to L_5. The relationship between the fault location and the target signal is given in Table 1. The line distance from the generator to the infinite node was normalized to unity and decomposed into six parts. For example, z_L = 0.167 means that the fault under consideration occurred at Location L_1.

3) Next, the relationship between the fault type and the target signal is shown in Table 2. The section from 0 to 1 was decomposed into five parts, and the midpoint of each part was assigned to a fault type as in Table 2. Moreover, the fault type was determined by examining the following:

    single-phase line-to-ground fault   (0 < z_T ≤ 0.2)
    two-phase line-to-ground fault      (0.2 < z_T ≤ 0.4)
    three-phase line-to-ground fault    (0.4 < z_T ≤ 0.6)    (7)
    two-phase short-circuit fault       (0.6 < z_T ≤ 0.8)
    three-phase short-circuit fault     (0.8 < z_T ≤ 1)

where z_T is the actual output regarding the fault type. The recognition rate η is defined as

η = η_c / η_T    (8)

where
η_c: number of correctly recognized data
η_T: total number of data

4) The input variables are the line currents I_1 and I_2, measured at the generator side and the infinite-node side, respectively. Line currents I_1 and I_2 were observed for 50 (ms) with a sampling time of 10 (ms). A fault is cleared 70 (ms) after its occurrence.
5) The number of input variables is 90: the FFT considers frequencies of order 1 to 15, and two sets of three-phase currents are measured. In this paper, 100 learning data and 25 test data were prepared with the TEPCO power system simulator. To generalize the learning of the MLP, the cross-validation technique is used so that five groups of learning and test data are created.

6) DA clustering employs numbers of clusters from 2 to 5. Fifty initial conditions are prepared for the center vectors to examine the influence of the initial conditions on the results. The parameter scheduling is given as follows:

β^{(t)} = β^{(t-1)} + Δβ    (9)

where
β_0: initial parameter, β_0 = 0.001
Δβ: updating term, Δβ = 0.001
t: iteration count

Also, the following convergence criterion is employed at each temperature parameter:

| d^{(t+1)} - d^{(t)} | / d^{(t)} ≤ 10^{-4}    (10)
TABLE 1 TARGET SIGNALS OF FAULT LOCATION
Fault Location   Target Signal
L_1              0.167
L_2              0.333
L_3              0.5
L_4              0.667
L_5              0.833

TABLE 2 TARGET SIGNALS OF FAULT TYPES
Fault Type                          Target Signal
single-phase line-to-ground fault   0.1
two-phase line-to-ground fault      0.3
three-phase line-to-ground fault    0.5
two-phase short-circuit fault       0.7
three-phase short-circuit fault     0.9

TABLE 3 PARAMETERS OF MLP
Method   No. of Clusters   Learning Rate   Momentum Rate   No. of Hidden Units   No. of Iterations
A        -                 0.8             0.9             9                     70000
B        2                 0.8             0.9             9                     40000
B        3                 0.7             0.9             9                     30000
B        4                 0.7             0.9             9                     30000
B        5                 0.7             0.9             9                     30000
C        2                 0.8             0.9             9                     40000
C        3                 0.7             0.9             9                     30000
C        4                 0.7             0.9             9                     30000
C        5                 0.7             0.9             9                     30000

The whole algorithm is repeated until β becomes 1000.

7) Table 3 shows the parameters of the MLP. They were determined from preliminary simulation results.

8) The computation is performed on the FUJITSU workstation S-7/7000U model 45.

4.2 Simulation Results

Table 4 shows a comparison between the cost functions of the k-means method and DA clustering. It should be noted that the k-means method and DA clustering have the cost functions given by Eqns. (1) and (3), respectively. In the table, the best, the worst, the average and the standard deviation (SD) of the cost function are given to examine the performance of the k-means method and DA clustering for cases from 2 to 5 clusters. It can be observed that DA clustering is better than the k-means method in terms of all aspects.

TABLE 4 COST FUNCTION OF EACH METHOD
Method    No. of Clusters   Best     Worst    Average   SD
k-means   2                 7.5066   7.9665   7.7881    2.53x10^-1
k-means   3                 4.6589   5.6943   4.8086    2.3x10^-1
k-means   4                 3.0853   4.318    3.1638    1.41x10^-1
k-means   5                 2.1598   3.338    2.370     1.18x10^-1
DA        2                 7.1657   7.166    7.1658    1.76x10^-3
DA        3                 4.0071   4.0076   4.0073    1.53x10^-3
DA        4                 2.693    2.6937   2.6935    1.16x10^-3
DA        5                 2.19     2.198    2.196     1.13x10^-3
TABLE 5 INFERENCE RESULTS OF FAULT LOCATION

TABLE 6 INFERENCE RESULTS OF FAULT TYPES

It is noteworthy that the SD of DA clustering is much smaller than that of the k-means method. For example, the SD of the k-means method is about 143.7 times larger than that of DA clustering in the case of two clusters. That implies that DA clustering is not affected by the initial solutions although the k-means method is. Also, the k-means method and DA clustering obtain their best results in the case of five clusters. Thus, the simulation results have shown that DA clustering is much better than the k-means method.
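As a quick worked check of the ratio quoted above, using the SD values from the two-cluster rows of Table 4:

```python
# SD of the cost function over 50 initial conditions, two-cluster case (Table 4).
sd_kmeans = 2.53e-1  # k-means
sd_da = 1.76e-3      # DA clustering
ratio = sd_kmeans / sd_da
print(ratio)  # about 143.75, consistent with "about 143.7 times larger"
```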
Table 5 gives the results of fault location for Methods A, B and C. Methods B and C gave better results than Method A due to the effectiveness of the precondition technique for the MLP. Methods B and C performed best in the case of three clusters, although five clusters gave the best results for clustering alone, as shown in Table 4. That is because too many clusters deteriorate the performance of the MLP, since enough learning data is not available. Table 6 shows the results of fault types, where the recognition rate and the variance of errors are shown for each number of clusters. As the number of clusters increases, the recognition rate increases until it reaches its maximum. Methods B and C have the largest recognition rate and the smallest error variance in the case of three clusters. Also, Method C has better results than Method B in terms of the recognition rate and the error variance. That case provided a recognition rate of 96.6% and an error variance of 0.0030 for Method C. Therefore, the simulation results have indicated that DA clustering is superior to the k-means method as a precondition technique for MLP.

V. CONCLUSIONS

(1) In this paper, a new hybrid method has been proposed to handle the fault detection problem that estimates the location and type of fault. The proposed method makes use of DA clustering and MLP. DA clustering is more effective in the sense of global clustering, which is not affected by the initial solutions. DA clustering plays an important role in classifying the input data of the MLP into clusters. An MLP has been constructed at each cluster. That allows the MLPs to learn efficiently because of the data similarity.

(2) The proposed method has applied the FFT to measured current waveforms in three phases to extract fault features. The FFT results in the frequency domain are given to DA clustering as input variables. The FFT itself also works as a precondition technique for the MLP.

(3) The proposed method was applied to a sample system. A comparison was made between the proposed and conventional methods.
The simulation results have shown that the proposed method is much better than the conventional ones. Compared with the conventional method, the proposed method has reduced the average error by 30% and the maximum error by 17.1% in terms of the fault location. Also, it has contributed to improving the recognition rate by 7.2% in terms of the fault type.

ACKNOWLEDGEMENTS

The authors acknowledge the support provided by the Meiji University High Technology Research Center.

REFERENCES

[1] T. Takagi, et al., "Development of a New Type Fault Locator Using the One-terminal Voltage and Current Data," IEEE Trans. on PAS, Vol. PAS-101, No. 8, pp. 2892-2898, Aug. 1982.
[2] L. Eriksson, et al., "An Accurate Fault Locator with Compensation for Apparent Reactance in the Fault Resistance Resulting from Remote-end Infeed," IEEE Trans. on PAS, Vol. PAS-104, No. 2, pp. 424-436, Feb. 1985.
[3] A. O. Ibe and B. J. Cory, "A Traveling Wave-based Fault Locator for Two and Three Terminal Networks," IEEE Trans. on Power Systems, Vol. 1, No. 2, pp. 283-288, 1986.
[4] M. Pfau-Wagenbauer and H. Brugger, "Model and Rule Based Intelligent Alarm Processing," Proc. of the Third Symposium on Expert Systems Application to Power Systems, pp. 27-32, Tokyo-Kobe, Japan, April 1-5, 1991.
[5] E. Handschin, et al., "Knowledge Based Alarm Handling and Fault Location in Distribution Networks," IEEE Trans. on Power Systems, Vol. 7, No. 2, pp. 770-776, May 1992.
[6] H. Tanaka, et al., "Design and Evaluation of Neural Network for Fault Diagnosis," Proc. of ESAP '91, pp. 378-384, Seattle, WA, July 1989.
[7] K.-H. Kim and J.-K. Park, "Application of Hierarchical Neural Network to Fault Diagnosis of Power System," Proc. of ESAP '91, pp. 33-37, Tokyo/Kobe, Japan, April 1991.
[8] D. Novosel, et al., "Algorithms for Locating Faults on Series Compensated Lines Using Neural Network and Deterministic Methods," IEEE Trans. on Power Delivery, Vol. 11, No. 4, pp. 1728-1736, 1996.
[9] H. J.
Altuve, et al., "Neural-network-based Fault Location Estimator for Transmission Line Protection," Journal of Intelligent and Fuzzy Systems, No. 7, pp. 159-171, 1999.
[10] A. J. Mazon, et al., "A New Approach to Fault Location in Two-terminal Transmission Lines Using Artificial Neural Networks," Electric Power Systems Research, Vol. 56, No. 3, pp. 61-66, Dec. 2000.
[11] G. K. Purushothama, et al., "ANN Applications in Fault Locators," Electrical Power and Energy Systems, Vol. 23, No. 6, pp. 491-506, Aug. 2001.
[12] K. Rose, et al., "A Deterministic Annealing Approach to Clustering," Pattern Recognition Letters, Vol. 11, No. 9, pp. 589-594, Sep. 1990.
[13] K. Rose, et al., "Constrained Clustering as an Optimization Method," IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 15, No. 8, pp. 785-794, Aug. 1993.