Prediction of Time Series Using RBF Neural Networks: A New Approach of Clustering

The International Arab Journal of Information Technology, Vol. 6, No. 2, April 2009

Mohammed Awad (2), Héctor Pomares (1), Ignacio Rojas (1), Osama Salameh (2), and Mai Hamdon (2)
(1) Department of Computer Architecture and Technology, University of Granada, Spain
(2) Faculty of Information Technology, Arab American University, Palestine

Abstract: In this paper, we deal with the problem of time series prediction from a given set of input/output data. This problem consists of predicting future values based on past and/or present data. We present a new method for predicting time series data using radial basis functions. The approach is based on a new, efficient method of clustering the centers of the radial basis function neural network. It uses the error committed in every cluster, computed from the real output of the radial basis function neural network, to concentrate more clusters in those input regions where the error is larger, and it moves the clusters according to this error instead of according to the input values of the I/O data alone. This clustering method improves the performance of the resulting time series prediction system compared with other methods derived from traditional algorithms.

Keywords: Clustering, time series prediction, RBF neural networks.

Received April 21, 2007; accepted December 14, 2007

1. Introduction

Time series are widely used in many aspects of our lives; examples include daily temperature, electrical load, and river flood forecasting [21]. The problem consists of predicting the next value of a series known up to a specific time, using the known past values of the series. Basically, time series prediction can be considered a modeling problem. The first step is establishing a mapping between inputs and outputs. Usually, the mapping is nonlinear and chaotic. After such a mapping is set up, future values are predicted based on past and current observations [16, 21]. Radial Basis Function Neural Networks (RBFNNs) are characterized by a transfer function in the hidden unit layer having radial symmetry with respect to a center [9].
The basic architecture of an RBFNN is a 3-layer network, as in Figure 1. The output of the net is given by the following expression:

$$F(x, \Phi, w) = \sum_{i=1}^{m} \phi_i(x)\, w_i \tag{1}$$

where $\Phi = \{\phi_i : i = 1, \ldots, m\}$ is the set of basis functions and $w_i$ is the weight associated with the $i$-th Radial Basis Function (RBF). The basis function $\phi_i$ can be calculated as a Gaussian function using the following expression:

$$\phi(x, c_i, r_i) = \exp\left(-\frac{\|x - c_i\|^2}{r_i^2}\right) \tag{2}$$

where $c_i$ is the central point of the function $\phi_i$, $r_i$ is its radius, and $x$ is the input vector.

Figure 1. Radial basis function network. [Diagram: the inputs x feed the hidden units φ_1, φ_2, ..., φ_m, whose outputs are weighted by w_1, w_2, ..., w_m and summed to give F(x).]

A common learning method for RBFNNs is clustering. Every cluster has a center, which can be chosen as the center of a new RBF. The RBF centers can be obtained by many clustering algorithms. These algorithms are classified as unsupervised clustering algorithms, such as k-means [6], fuzzy c-means [2], and enhanced LBG [19], and supervised clustering algorithms, such as the Clustering for Function Approximation method (CFA) [3], the Conditional Fuzzy Clustering algorithm (CFC) [11], and the Alternating Cluster Estimation method (ACE) [18]. The clustering algorithm obtains the cluster centers by attempting to minimize the total squared error incurred in representing the data set by the m cluster centers. However, the clustering algorithm can only achieve a locally optimal solution, which depends on the initial locations of the cluster centers. A consequence of this local optimality is that some initial centers can become stuck in regions of the input domain with few or no input patterns. This wastes resources and results in a locally optimal network.
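As an illustration of equations (1) and (2), the following minimal sketch evaluates the output of a small RBFNN in plain Python. It assumes the standard Gaussian form exp(-||x - c||^2 / r^2) for the basis function; the function names and the example centers, radii, and weights are hypothetical and chosen only for illustration.

```python
import math


def gaussian_rbf(x, center, radius):
    """Gaussian basis function, assumed here to be exp(-||x - c||^2 / r^2)."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / radius ** 2)


def rbfnn_output(x, centers, radii, weights):
    """Weighted sum of the m basis functions, following equation (1)."""
    return sum(w * gaussian_rbf(x, c, r)
               for c, r, w in zip(centers, radii, weights))


# Hypothetical network with m = 2 RBFs on a 2-dimensional input.
centers = [[0.0, 0.0], [1.0, 1.0]]
radii = [1.0, 0.5]
weights = [2.0, -1.0]
y = rbfnn_output([0.0, 0.0], centers, radii, weights)
```

At the first center the first basis function equals 1, so the output is dominated by the first weight, which shows the local character of each RBF.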

RBFNNs are universal approximators and are thus well suited to function approximation problems. In general, an approximator is said to be universal if it can approximate any continuous function on a compact set to a desired degree of precision. Finding a suitable number of radial functions is a complex task, since we must be careful not to produce excessively large networks, which are inefficient, sensitive to over-fitting, and exhibit poor performance. Figure 2 presents a functional approximation using several RBFs with different values of the radius, where r is the radius of an RBF.

Figure 2. Function approximation using RBFNNs.

In this paper we present a new, efficient clustering method for initializing the centers of the RBF network for time series prediction. The method uses the target output of the RBFNN to migrate and fine-tune the clusters instead of just the input values of the I/O data. It calculates the error committed in every cluster using the real output of the RBFNN and tries to concentrate more clusters in those input regions where the error is larger, thus attempting to homogenize the contribution of every cluster to the error.

The organization of the rest of this paper is as follows. Section 2 presents an overview of the proposed algorithm. In section 3, we present in detail the proposed algorithm for the determination of the pseudo-optimal RBF parameters. Then, in section 4 we show some results that confirm the performance of the proposed methodology. Some final conclusions are drawn in section 5.

2. Proposed Approach

As mentioned before, the problem of time series prediction consists of predicting future values based on past and/or present values. A time series is a sequence of vectors, x(t), t = 0, 1, ..., where t represents elapsed time. Theoretically, x may be a value which varies continuously with t, such as temperature.
In practice, for any given system, x will be sampled to give a series of discrete data points, equally spaced in time [7]. Formally, this can be stated as: find a function f that gives an estimate of x at time t + d from the N time steps back from time t, so that:

$$x(t+d) = f\big(x(t), x(t-1), \ldots, x(t-N+1)\big) \tag{3}$$

The accuracy of the prediction process is measured by a cost function which takes into account the error between the output of the RBFNN and the real output. In this paper, the cost function we use is the so-called Normalized Root Mean Squared Error (NRMSE). This performance index is defined as:

$$\mathrm{NRMSE} = \sqrt{\frac{\sum_{i=1}^{P} \big(y_i - f(x_i)\big)^2}{\sum_{i=1}^{P} \big(y_i - \bar{y}\big)^2}} \tag{4}$$

where $\bar{y}$ is the mean of the real output and P is the number of data points. The objective of our algorithm is to increase the density of clusters in those areas of the input domain where the error committed in every cluster, computed using the real output of the RBFNN, is larger.

The RBFNN is completely specified by choosing the following parameters: the number m of radial basis functions, the center c of every RBF, the radius r, and the weights w. The number of RBFs is a critical choice. In our algorithm we use a simple incremental method to determine the number of RBFs: we stop adding new RBFs when the time series prediction error falls below a certain target error. As to the rest of the parameters of the RBFNN, section 3 presents a new clustering technique. Figure 5 presents a flowchart with the general description of the proposed approach.

Figure 3. The distortion before the migration.

Figure 4. The distortion after the migration.

3. Parameter Adjustment of the RBFNN

The locality property inherent to the RBF allows us to use a clustering algorithm to obtain the RBF centers. Clustering algorithms may get stuck in a local minimum, ignoring a better placement of some of the clusters; that is, the algorithm is trapped in a local minimum which is not the global one. For this reason we need a clustering algorithm capable of solving this local minimum problem. To avoid this problem we endow our supervised algorithm with a migration technique. This modification allows the algorithm to escape from local minima and to obtain a prototype allocation independent of the initial configuration.

To optimize the other parameters of the RBFNN (the radius r and the weights w) we use well-known heuristics: the k-nearest neighbours technique (KNN) [10] for the initialization of the radius of each RBF, and Singular Value Decomposition (SVD) [13] to directly optimize the weights. Finally, the Levenberg-Marquardt algorithm is used to fine-tune the obtained RBFNN [8]. Therefore, in this section we concentrate on the proposed clustering algorithm.

Figure 5 shows a flowchart representing the general description of our clustering algorithm. As can be seen from this figure, the initial values of the clusters are calculated using the k-means clustering algorithm, followed by a local displacement process which locally minimizes the distortion (D) within each cluster. Figure 3 shows the initial distortion distribution for the case of 6 equally distributed RBFs, which is the first configuration whose approximation error falls under the target error. Figure 4 represents the same information when the clustering process has ended. We can now see the advantage we expect from making each cluster contribute equally to the total distortion, which is the objective of the proposed clustering algorithm. The distortion is defined as:

$$D = \frac{\sum_{j=1}^{m} \sum_{x_i \in C_j} \|x_i - c_j\|^2\, E_{ij}}{\sum_{j=1}^{m} \sum_{x_i \in C_j} E_{ij}} \tag{5}$$

where m is the number of RBFs (clusters), $c_j$ is the center of cluster $C_j$, and $E_{ij}$ is the error committed by the net when the input vector $x_i$ belongs to cluster $C_j$:

$$E_{ij} = \big| y_i - F(x_i, \Phi, w) \big| \tag{6}$$

In the local displacement of the cluster centers, we start by making a hard partition of the training set, just as in the k-means algorithm.
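The hard partition and the error-weighted distortion of equation (5) can be sketched as follows. This is a minimal illustration in plain Python, not the authors' MATLAB implementation; the function names and the toy data are hypothetical, and the per-pattern errors are passed in as given numbers rather than computed from a trained network.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def hard_partition(points, centers):
    """Assign every point to the index of its nearest center, as in k-means."""
    return [min(range(len(centers)), key=lambda j: sq_dist(x, centers[j]))
            for x in points]


def distortion(points, errors, centers, labels):
    """Error-weighted distortion: sum(||x - c||^2 * E) / sum(E)."""
    num = sum(sq_dist(x, centers[j]) * e
              for x, e, j in zip(points, errors, labels))
    den = sum(errors)
    return num / den if den > 0 else 0.0


# Toy 1-D example: three patterns, two clusters, hypothetical network errors.
points = [[0.0], [1.0], [10.0]]
errors = [1.0, 1.0, 2.0]
centers = [[0.5], [10.0]]
labels = hard_partition(points, centers)
D = distortion(points, errors, centers, labels)
```

Because each squared distance is weighted by the network error of its pattern, regions that are poorly approximated contribute more to D than well-approximated regions at the same distance from their center.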
The second step of the local displacement process is the calculation of the error of the RBFNN, using the k-nearest neighbours algorithm to initialize the radii and the singular value decomposition to calculate the weights of the RBFs. This is carried out by an iterative process that updates each cluster center as the error-weighted mean of the training data belonging to that cluster, and we repeat this process until the total distortion of the net reaches a minimum:

$$c_m = \frac{\sum_{x_i \in C_m} E_{im}\, x_i}{\sum_{x_i \in C_m} E_{im}} \tag{7}$$

Figure 5. General description of the proposed clustering algorithm. [Flowchart: initialize the clusters using k-means; perform local displacement of the clusters; store the current distortion as D_ant; perform migration of the clusters; perform local displacement of the clusters; calculate the distortion D; if the relative change between D_ant and D is below ε, return the final clusters C_j, otherwise repeat.]

After this process we must update the cluster centers in order to minimize the total distortion. The algorithm stops when the relative change in the distortion falls below a threshold ε. Figure 6 presents a flowchart with the general description of the local displacement process.

Figure 6. Local displacement of the clusters. [Flowchart: perform the partition of the training set; calculate the error of the RBFNN, using KNN to initialize the radii and SVD to calculate the weights; store the current distortion as D_ant; update the clusters; calculate the error of the RBFNN again with KNN and SVD; perform the partition of the training set; calculate the distortion D; if the relative change between D_ant and D is below ε, return the new clusters C_j, otherwise repeat.]

The migration process migrates clusters from the better zones toward those zones where the error is worse, thus attempting to equalize their contribution to the total distortion. Our main hypothesis is that the best initial cluster configuration will be the one that equalizes the error committed by every cluster. The probability of choosing a given cluster is inversely proportional to what we call the utility of that cluster, which is defined as:

$$U_j = \frac{D_j}{\bar{D}}, \quad j = 1, \ldots, m \tag{8}$$

where $D_j$ is the distortion of cluster $C_j$ and $\bar{D}$ is the mean distortion over all clusters. In this way, the proposed algorithm selects one cluster that has a utility less than one and moves it to the zone near a newly selected cluster that has a utility greater than one, as shown in Figure 7. This migration step is necessary because the local displacement of clusters only moves clusters in a local manner.

Figure 7. The migration process. [Flowchart: calculate the distortion D_j and the utility U_j; select all clusters with U < 1 and, if any exist, pick one by roulette wheel selection, where the cluster with minimum U has maximum probability; select all clusters with U > 1 and, if any exist, pick one by roulette wheel selection, where the cluster with maximum U has maximum probability; move the selected cluster with U < 1 to the zone of the selected cluster with U > 1; use k-means to repartition the data; perform local displacement of the clusters; calculate the distortion D_j; if the distortion has improved, confirm the migration, otherwise reject it; stop the migration.]

4. Example of the Proposed Procedure

Experiments have been performed to test the proposed algorithm. The system is simulated in MATLAB 7.0 under Windows XP on a Pentium IV processor running at 2.4 GHz. In this section we attempt short-term and long-term prediction with the algorithm presented in the previous section on the Mackey-Glass time series data [5]. The Mackey-Glass time series is commonly used to test the performance of neural networks [5]. The series is chaotic, which makes it a good representation of the nonlinear oscillations of many physiological processes [17]. To allow comparison with earlier work, we chose the parameters presented in [4]. Figure 8 shows the Mackey-Glass time series. Tables 1 and 2 compare the prediction accuracy of different computational paradigms presented in the bibliography for this benchmark problem (including our proposed approach). The Mackey-Glass time series is generated with the following expression:

$$\frac{ds(t)}{dt} = a\,\frac{s(t-\tau)}{1 + s^{10}(t-\tau)} - b\,s(t) \tag{9}$$

where s(t) is the value of the time series at time t. The time series was constructed with parameter values a = 0.2 and b = 0.1. The initial conditions used in our test bench are s(0) = 1.2 and s(t) = 0 for t < 0, with τ = 17. The 1000 samples of the Mackey-Glass time series are depicted in Figure 8. The first 500 points are used as the training set and the last 500 as the test set. The tables present the normalized root mean squared error (NRMSE) obtained on the 500 test points after the application of the Levenberg-Marquardt method. As can be seen from Tables 1 and 2, the proposed algorithm reaches a prediction error equal to or lower than that of the RBF-based methods of González [7] and Rivas [15] for every network size tested.

Figure 8. Mackey-Glass time series.

4.1. Short-Term Prediction

Following the conventions established for short-term prediction of this time series, the algorithm searches for networks that predict the value s(t+6) from the current value s(t) and the past values s(t-6), s(t-12), and s(t-18), using training patterns of the form:

$$[\,s(t-18),\ s(t-12),\ s(t-6),\ s(t);\ s(t+6)\,] \tag{10}$$

The NRMSE of the points predicted by the algorithms is shown in Table 1. The proposed algorithm predicts the time series in the short term with an accuracy comparable to or better than that of the other algorithms listed.

Figure 9. Prediction step 85, with 20 RBF.
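The benchmark data can be reproduced approximately with the sketch below, which integrates equation (9) and then builds patterns of the form (10). It is only an illustration under stated assumptions: the paper does not say which numerical integrator was used, so a simple Euler scheme with 10 sub-steps per time unit is assumed here, and the function names `mackey_glass` and `make_patterns` are hypothetical.

```python
def mackey_glass(n_samples, a=0.2, b=0.1, tau=17, s0=1.2, steps_per_unit=10):
    """Sample the Mackey-Glass series at t = 0, 1, ..., n_samples - 1.

    Euler integration of ds/dt = a*s(t-tau)/(1 + s(t-tau)**10) - b*s(t),
    with s(0) = s0 and s(t) = 0 for t < 0 (integrator choice is an assumption).
    """
    h = 1.0 / steps_per_unit
    delay = tau * steps_per_unit
    history = [s0]
    for k in range(n_samples * steps_per_unit):
        # Delayed value; zero before the start of the series.
        s_tau = history[k - delay] if k >= delay else 0.0
        s = history[k]
        history.append(s + h * (a * s_tau / (1.0 + s_tau ** 10) - b * s))
    return history[::steps_per_unit][:n_samples]


def make_patterns(series, lags=(18, 12, 6, 0), horizon=6):
    """Build ([s(t-18), s(t-12), s(t-6), s(t)], s(t+horizon)) pairs."""
    patterns = []
    for t in range(max(lags), len(series) - horizon):
        inputs = [series[t - lag] for lag in lags]
        patterns.append((inputs, series[t + horizon]))
    return patterns


series = mackey_glass(1000)
patterns = make_patterns(series)             # prediction step 6
patterns_85 = make_patterns(series, horizon=85)  # prediction step 85
```

Changing only the `horizon` argument switches between the two prediction steps used in the experiments, since the input lags are the same in both cases.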

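The error measure used in the comparison tables, the NRMSE of equation (4), can be computed as in the following minimal sketch. The function name and the toy values are hypothetical.

```python
import math


def nrmse(targets, predictions):
    """Root of the squared error divided by the squared deviation from the mean."""
    mean_y = sum(targets) / len(targets)
    sq_error = sum((y - f) ** 2 for y, f in zip(targets, predictions))
    sq_dev = sum((y - mean_y) ** 2 for y in targets)
    return math.sqrt(sq_error / sq_dev)


targets = [1.0, 2.0, 3.0]
perfect = nrmse(targets, [1.0, 2.0, 3.0])   # exact predictions
baseline = nrmse(targets, [2.0, 2.0, 2.0])  # always predicting the mean
```

A perfect predictor gives 0, while a predictor that always outputs the mean of the targets gives 1, so values well below 1 indicate that the model captures structure beyond the mean.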
Table 1. Comparison of the prediction error of different methods for prediction step 6 (500 test data).

Method                                 m                RMSE test
Linear model prediction                -                0.55
Auto-regressive model                  -                0.19
Cascade correlation NN                 -                0.06
6th-order polynomial                   -                0.04
Back-prop NN                           -                0.02
Kim and Kim (GA & Fuzzy System) [3]    5 MFs            0.049
                                       7 MFs            0.042
                                       9 MFs            0.038
ANFIS & Fuzzy System [3]               16 rules         0.007
New RBFNs Structure [16]               12 RBF           0.003
Pomares [13]                           3 x 3 x 3 x 3    0.011
                                       3 x 4 x 4 x 4    0.007
                                       4 x 4 x 5 x 5    0.006
González [7]                           4                0.015 ± 0.0019
                                       7                0.007 ± 0.0009
                                       10               0.005 ± 0.0010
                                       13               0.004 ± 0.0011
                                       16               0.004 ± 0.0002
Rivas [15]                             4                0.014 ± 0.0021
                                       7                0.009 ± 0.0008
                                       10               0.007 ± 0.0009
                                       13               0.006 ± 0.0011
                                       16               0.005 ± 0.0003
Our Approach                           4                0.012 ± 0.0080
                                       7                0.007 ± 0.0008
                                       10               0.005 ± 0.0006
                                       13               0.004 ± 0.0012
                                       16               0.003 ± 0.0006

4.2. Long-Term Prediction

For long-term prediction, the algorithm searches for networks that predict the value s(t+85) from the current value s(t) and the past values s(t-6), s(t-12), and s(t-18), using training patterns of the form:

$$[\,s(t-18),\ s(t-12),\ s(t-6),\ s(t);\ s(t+85)\,] \tag{11}$$

5. Conclusion

In this paper, a new modified approach is presented to predict chaotic time series. We have proposed a clustering algorithm especially suited to function approximation problems. This method calculates the error committed in every cluster using the real output of the RBFNN, and not just an approximate value of that output, and tries to concentrate more clusters in those input regions where the approximation error is larger, thus attempting to homogenize the contribution of every cluster to the error. The simulations in this paper show that the proposed method produces more accurate predictions than the RBF-based methods it was compared with. The algorithm is easy to implement and, in our experiments, compares favorably with other algorithms in both performance and computation time.

Table 2. Comparison of the prediction error of different methods for prediction step 85 (500 test data).
Method                 m      NRMSE test
RAN-P-GQRD [1]         14     0.206
                       24     0.174
                       31     0.160
                       38     0.183
Fuzzy system [1]       10     0.108
                       11     0.109
                       12     0.103
                       13     0.223
                       14     0.159
                       15     0.103
Whitehead [20]         25     0.29
                       50     0.18
                       75     0.11
                       125    0.05
González [7]           5      0.389 ± 0.0194
                       10     0.251 ± 0.0246
                       14     0.198 ± 0.0164
                       17     0.147 ± 0.0178
                       20     0.126 ± 0.0174
Rivas [15]             5      0.397 ± 0.0238
                       10     0.249 ± 0.0207
                       14     0.168 ± 0.0210
                       17     0.128 ± 0.0091
                       20     0.113 ± 0.0125
Our Approach           5      0.388 ± 0.0227
                       10     0.243 ± 0.0130
                       14     0.150 ± 0.0303
                       17     0.111 ± 0.0156
                       20     0.097 ± 0.0074

References

[1] Bersini H., Duchateau A., and Bradshaw N., Using Incremental Learning Algorithms in the Search for Minimal Effective Fuzzy Models, in Proceedings of 6th International Conference on Fuzzy Systems, Barcelona, pp. 819-825, 1997.
[2] Bezdek C., Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum, New York, 1981.
[3] Duda O. and Hart E., Pattern Classification and Scene Analysis, Wiley, New York, 1973.
[4] Frank J., Davey N., and Hunt P., Time Series Prediction and Neural Networks, Hatfield, UK, 1997.
[5] Gonzalez J., Rojas I., and Pomares H., A New Clustering Technique for Function Approximation, IEEE Transactions on Neural Networks, vol. 13, no. 1, pp. 132-142, 2002.
[6] González J., Identificación y Optimización de Redes de Funciones de Base Radiales Para Aproximación Funcional, PhD Thesis, University of Granada, 2001.
[7] Kanjilal P. and Banerjee N., On the Application of Orthogonal Transformation for the Design and Analysis of Feedforward Networks, IEEE Transactions on Neural Networks, vol. 6, no. 2, pp. 1061-1070, 1995.
[8] Kim D. and Kim C., Forecasting Time Series with Genetic Fuzzy Predictor Ensemble, IEEE Transactions on Fuzzy Systems, vol. 5, no. 4, pp. 523-535, 1997.
[9] Mackey C. and Glass L., Oscillation and Chaos in Physiological Control Systems, Science Report, 1977.
[10] Marquardt W., An Algorithm for Least-Squares Estimation of Nonlinear Parameters, SIAM Journal on Applied Mathematics, vol. 11, no. 4, pp. 431-441, 1963.
[11] Moody E. and Darken C., Fast Learning in Networks of Locally Tuned Processing Units, Computer Journal Neural Computation, vol. 2, no. 1, pp. 281-294, 1989.
[12] Orr L., Regularization in the Selection of Radial Basis Function Centers, Computer Journal Neural Computation, vol. 7, no. 3, pp. 606-623, 1995.
[13] Pedrycz W., Conditional Fuzzy C-means, Pattern Recognition Letters, vol. 17, no. 2, pp. 625-632, 1996.
[14] Pomares H., Nueva Metodología Para El Diseño Automático De Sistemas Difusos, PhD Thesis, University of Granada, 2000.
[15] Rivas A., Diseño Y Optimización de Redes de Funciones de Base Radial Mediante Técnicas Bioinspiradas, PhD Thesis, University of Granada, 2003.
[16] Rojas I., Pomares H., Gonzalez J., and Ros A., A New Radial Basis Function Networks Structure: Application to Time Series Prediction, in Proceedings of IEEE-INNS-ENNS International Joint Conference on Neural Networks (IJCNN-2000), Italy, pp. 449-445, 2000.
[17] Rosipal R., Koska M., and Farkas I., Prediction of Chaotic Time-Series with a Resource Allocating RBF Network, Computer Journal Neural Processing Letters, vol. 7, no. 3, pp. 1-13, 1998.
[18] Runkler A. and Bezdek C., Alternating Cluster Estimation: A New Tool for Clustering and Function Approximation, IEEE Transactions on Fuzzy Systems, vol. 7, no. 3, pp. 377-393, 1999.
[19] Russo M. and Patanè G., Improving the LBG Algorithm, Lecture Notes in Computer Science, vol. 1606, no. 4, pp. 621-630, 1999.
[20] Whitehead A. and Choate D., Cooperative-Competitive Genetic Evolution of Radial Basis Function Centers and Widths for Time Series Prediction, IEEE Transactions on Neural Networks, vol. 7, no. 4, pp. 869-80, 1996.
[21] Xiaoyu L., Bing K., and Simon Y., Time Series Prediction Based on Fuzzy Principles, Heidelberg, Florida, 2003.

Mohammed Awad received the BSc degree in industrial automation engineering in 2000 from the Palestine Polytechnic University and the PhD degree in 2005 from the University of Granada, Spain. He is currently an assistant professor in the Faculty of Information Technology and Chair of the Department of Computer Information Technology at the Arab American University, Palestine.

Héctor Pomares received the MSc degree in electronic engineering in 1995, the MSc degree in physics in 1997, and the PhD degree in 2000, all from the University of Granada, Granada, Spain. He is currently an associate professor in the Department of Computer Architecture and Computer Technology at the University of Granada.

Ignacio Rojas received the MSc degree in physics and electronics in 1992 and the PhD degree in 1996, both from the University of Granada, Spain. He was at the University of Dortmund, Germany, as an invited researcher from 1993 to 1995.

Osama Salameh received the MSc degree in computer engineering in 1990 and the PhD degree in 1996 from Odessa State Polytechnic University, Ukraine. He is currently an assistant professor in the Faculty of Information Technology and acting dean of the same faculty at the Arab American University, Palestine. His current areas of research interest include artificial neural networks and aspect-oriented software development.

Mai Hamdon is currently a student at the Arab American University in the Department of Computer Science. Her areas of interest include algorithms and artificial intelligence.
