Parallelization of a Series of Extreme Learning Machine Algorithms Based on Spark

Tiantian Liu, Zhiyi Fang, Chen Zhao, Yingmin Zhou
College of Computer Science and Technology, Jilin University (JLU), Changchun, China
e-mail: lutt1992x@sina.com

Abstract—With the development of the Internet, traditional big data computing platforms gradually lose their competitive advantage as a result of high latency. On the contrary, a fast, easy-to-use and generic big data computing framework called Spark draws more and more attention. At the same time, the integrated solution offered by Spark, which is based on RDD (Resilient Distributed Datasets), makes the applications of Spark in actual projects broader and broader. The non-iterative ELM (Extreme Learning Machine) algorithm generates the hidden layer weights randomly and determines the output layer weights analytically. Using this method to reduce learning time as much as possible brings much convenience to many time-sensitive applications. In this article we put forward a feedforward neural network parallel algorithm based on the Spark platform, establish a VMware vSphere platform, and carry out experiments on it. Our experimental results show that this algorithm can increase the analysis speed of the ELM algorithm.

Keywords—spark; feedforward neural network; ELM; parallel algorithm

I. INTRODUCTION

As the Internet becomes popular, we are now in an era of explosive growth of big data. Big data infrastructure platforms place more and more demands on the storage capacity, management capability and computing power of an enterprise, and only by meeting these demands can enterprises satisfy the needs of users. Therefore better parallel processing, higher computing density, advanced virtualization capabilities, modular memory design, virtual machines with better kernels and solid-state storage gradually become basic features of enterprise-class servers. With the decrease of memory prices, the trend in big data processing is computation based on memory rather than on disk or drum. The Hadoop data processing platform, which is based on the MapReduce architecture, gradually becomes unpopular because of its disadvantage of high latency, while the Spark distributed big data processing platform, which is based on the RDD (Resilient Distributed Datasets) architecture, gradually becomes the mainstream platform for data processing.

II. SPARK DISTRIBUTED BIG DATA PROCESSING FRAMEWORK

Spark is a cluster computing platform based on in-memory computing [1]. Fig.1 shows the task scheduling process of Spark. In Fig.1 we can see the DAG (directed acyclic graph) generated by the RDD objects. The next phase is the high-level DAG scheduling phase, which works on stages. The DAG scheduler divides the DAG into groups of tasks, and every group of tasks is a stage; a new stage is created only where a shuffle occurs. There are three stages in Fig.1. The job of the DAG scheduler is to record the actions on RDDs, seek the optimal scheduling of tasks and monitor failures produced by shuffle output.

Fig. 1. Spark task scheduling

A. Spark Programming Interface

Spark uses the Scala language to implement the RDD API. When programming on the Spark distributed computing platform, developers first need to write a driver and connect it to the cluster to run the workers. The programming interfaces are shown in Fig.2. The driver defines one or more RDDs and calls actions on them; the workers partition the RDDs and cache the Java objects in memory. Fig.2 shows that when Spark is running, the user starts more than one worker through the driver program. A worker's job is to read data blocks from the distributed file system and to cache in memory the RDD partitions that have already been calculated.

B. RDD Distributed Functional Programming

An RDD is a distributed read-only collection object based on in-memory computing. The RDD is a parallel data structure with fault tolerance: it allows users to store data in memory and on disk explicitly, and it can also control the partitioning of data.
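The lazy-recording behavior described here can be illustrated with a small toy sketch in plain Python. This is NOT the real Spark API; the `ToyRDD` class and its methods are invented for illustration only, to show how transformations can be recorded as metadata and executed only when an action is called:

```python
# Toy illustration of the RDD idea: transformations are recorded lazily
# as metadata and only executed when an action such as collect() runs.
# This is a simplified sketch, NOT the real Spark API.

class ToyRDD:
    def __init__(self, data, ops=None):
        self.data = list(data)    # the underlying partition of records
        self.ops = ops or []      # recorded metadata: pending transformations

    def map(self, f):
        # Lazy: only record the transformation, do not run it yet.
        return ToyRDD(self.data, self.ops + [("map", f)])

    def filter(self, pred):
        return ToyRDD(self.data, self.ops + [("filter", pred)])

    def collect(self):
        # Action: the recorded pipeline actually executes now.
        out = self.data
        for kind, fn in self.ops:
            if kind == "map":
                out = [fn(x) for x in out]
            else:
                out = [x for x in out if fn(x)]
        return out

rdd = ToyRDD(range(10)).map(lambda x: x * x).filter(lambda x: x % 2 == 0)
# Nothing has been computed yet; collect() triggers the evaluation.
print(rdd.collect())  # [0, 4, 16, 36, 64]
```

Chaining `map` and `filter` only builds up the `ops` list; `collect` plays the role of a Spark action that forces the pipeline to run.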
An RDD, which uses Lineage for recomputation, differs from distributed shared memory systems, which must pay the high cost of checkpoint and rollback mechanisms. The RDD is the cornerstone of the Spark abstraction; the entire Spark program is based on operations on RDDs, and a transformation from one RDD to another only happens inside the RDD space. Lazy evaluation is very important in the Spark distributed computing platform: no computation actually occurs during the transformation process, which is only a continuous recording of metadata. An RDD is essentially a read-only partitioned collection of records.

978-1-5090-0806-3/16/$31.00 copyright 2016 IEEE. ICIS 2016, June 26-29, 2016, Okayama, Japan

Fig. 2. Programming interface

III. EVALUATION AND CALCULATION BEFORE VIRTUALIZATION

A. Introduction of the Virtualization Platform Based on VMware

VMware vSphere is the most widely deployed packaged virtualization platform; it optimizes and manages industry-standard IT environments from the desktop to the data center through virtualization technology. VMware vSphere virtually aggregates the underlying physical hardware resources of many systems and offers the data center rich virtual resources. VMware vSphere consists of basic architecture services, application services, VMware vCenter Server and the client component layer. VMware vSphere can act as the seamless and dynamic operating environment management of a large basic architecture, and it can also manage a complex data center at the same time.

B. VMware vSphere Working Principle

VMware vSphere virtualizes and aggregates industry-standard servers into a unified resource pool. The complete environment of the operating system and the applications is encapsulated in a virtual machine that is independent of the hardware. A group of virtualized and distributed basic architecture services for the virtual machines brings more flexibility, serviceability and effectiveness; centrally managing and monitoring the virtual machines automatically simplifies the deployment of resources; intelligently and dynamically allocating the available resources among multiple virtual machines greatly increases hardware utilization and coordinates IT resources with business priorities; all of this lowers the cost of providing applications a higher level of service.

C.
Evaluation of Efficiency

Getting a comprehensive understanding of the usage efficiency of a server that is running normally is the prerequisite for carrying out a virtualization implementation. Because the virtualization process converts physical machines into virtual machines, if the performance of a physical machine is inadequate or a virtual machine uses excessive resources, other physical or virtual machines in the same IT environment can be affected. Therefore it is necessary to quantify the efficiency evaluation. Since the required memory size can be calculated from the physical memory and the network interface card is determined by the server, our hardware evaluation is mainly an evaluation of CPU efficiency. We can record this clearly through the performance counters in Windows, and the PAL tool helps us obtain a more precise evaluation of the server's indexes.

IV. FEEDFORWARD NEURAL NETWORK MODEL

A. Single-Layer Neural Network

Let us begin with the analysis of a neural network that consists of one neuron. Fig.3 shows the one-neuron model. This neuron is an arithmetic unit that takes x and the intercept "+1" as input values; its output is h_{w,b}(x). This article chooses the sigmoid function as the activation function, hence the mapping relation between the input of this neuron and its output is a logistic regression.

Fig. 3. A single neuron model
Fig. 4. Multi-neuron model

B. Multilayer Neural Network

Many single neurons coupled together form a multi-neuron neural network, in which the output of one neuron is the input of another [2]. Fig.4 shows a simple multi-neuron neural network. The feedforward neural network is a classical hierarchical neural network: information enters the network from the input layer, moves forward layer by layer, and finally reaches the output layer. Neural networks with different features are formed by adjusting the hidden layer node number, the weight adjustment rule and the neuron transfer function of the feedforward network [3].

C. ELM Algorithm Model

ELM (Extreme Learning Machine) [2] is a new neural network algorithm put forward by Huang G. B.
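The single-neuron model above can be written directly in NumPy. This is a minimal sketch: the weight and input values are invented for the example, and `neuron_output` is a hypothetical helper name:

```python
import numpy as np

def sigmoid(z):
    """Logistic (sigmoid) activation used as the neuron's transfer function."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron_output(w, b, x):
    """Output h_{w,b}(x) of a single neuron: the weighted sum of the
    inputs plus the intercept term b, passed through the sigmoid."""
    return sigmoid(np.dot(w, x) + b)

w = np.array([0.4, -0.6])   # example weights (illustrative values)
x = np.array([1.0, 2.0])    # example input
# 0.4*1.0 + (-0.6)*2.0 + 0.8 = 0.0, and sigmoid(0) = 0.5
print(neuron_output(w, b=0.8, x=x))  # 0.5
```

The "+1" intercept unit in Fig.3 corresponds to the bias term b added to the weighted sum.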
Compared to the SVM (support vector machine) [6] and traditional neural algorithms, this algorithm has features such as fast training speed, less need for manual intervention and strong generalization ability towards heterogeneous data [4]. The ELM algorithm trains a single hidden layer feedforward neural network by randomly initializing the input weights and biases and then analytically computing the corresponding output weights; it also performs better than a parallel incremental extreme SVM classifier [6]. ELM is a simple SLFN (Single-hidden Layer Feedforward Neural Network) [7]. Fig.5 is the schematic diagram of an SLFN.

Fig. 5. SLFN schematic diagram

This SLFN includes three layers: input layer, hidden layer and output layer. The hidden layer includes L hidden neurons, and normally L is far less than the number of training samples N. The output of the output layer is an m-dimensional vector; for binary classification problems this vector is one-dimensional. For a training data sample, taking only the hidden layer neurons between the input and the output layer, we get the output function expression of the neural network:

    f_L(x) = Σ_{i=1}^{L} β_i · G(a_i, b_i, x)

where a_i and b_i are the parameters of the i-th hidden layer node, β_i represents the connection weights between the i-th hidden layer neuron and the output neurons (in other words, it is an m-dimensional weight vector), and G(a_i, b_i, x) is the output of the i-th hidden layer neuron.

V. PARALLEL ELM ALGORITHM BASED ON SPARK COMPUTING PLATFORM

The algorithm input is a set of N training samples, the hidden layer node number L and the hidden layer output function; the algorithm output is the accuracy and efficiency of learning and training. The algorithm runs in four steps:
a. Randomly generate the hidden layer node parameters.
b. Calculate the hidden layer output matrix H.
c. Compute the optimal network weights.
d. Calculate the accuracy and efficiency of learning and training, then analyze and summarize.

If the feedforward neural network can predict the training samples without any error, then the weights between the hidden layer and the output layer must have a solution. In particular, when the number of hidden layer nodes L equals the number of training samples N, a solution must exist.
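The output function f_L(x) and steps a–c above can be sketched in a few lines of NumPy. This is a serial toy illustration, not the paper's Java implementation: the sigmoid is assumed as the hidden output function G, and the node count L and the regression target are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_train(X, T, L):
    """Train a single-hidden-layer ELM.
    X: (N, d) inputs, T: (N, m) targets, L: hidden node count."""
    N, d = X.shape
    # Step a: randomly generate the hidden node parameters (a_i, b_i).
    A = rng.standard_normal((d, L))
    b = rng.standard_normal(L)
    # Step b: hidden layer output matrix H, H[n, i] = G(a_i, b_i, x_n).
    H = sigmoid(X @ A + b)
    # Step c: output weights beta by least squares on H @ beta = T.
    beta, *_ = np.linalg.lstsq(H, T, rcond=None)
    return A, b, beta

def elm_predict(X, A, b, beta):
    # f_L(x) = sum_i beta_i * G(a_i, b_i, x)
    return sigmoid(X @ A + b) @ beta

# Toy regression problem (invented): fit y = sin(3x) on [-1, 1].
X = np.linspace(-1, 1, 200).reshape(-1, 1)
T = np.sin(3 * X)
A, b, beta = elm_train(X, T, L=50)
mse = float(np.mean((elm_predict(X, A, b, beta) - T) ** 2))
print(mse)  # very small training error
```

Note that only beta is learned; the input-to-hidden parameters A and b stay at their random values, which is exactly why no iterative training is needed.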
However, in practical applications the hidden layer node number L is far less than the number of training samples N, so the weight vector does not necessarily have an exact solution; that is, there may be errors between the actual values and the network output. To find the optimal weights, the loss function J is minimized, and the ELM algorithm distinguishes two kinds of solutions: if the matrix H has full rank, the optimal weights are found by least squares; if H does not have full rank, the optimal weights are calculated through the singular value decomposition of H. Unlike other algorithms, in which the weights of all layers are updated by gradient descent, the ELM algorithm sets the weights between the input layer and the hidden layer randomly, so its training speed is very fast; it obtains the weights from the hidden layer to the output layer by the least squares method. In this paper we parallelize the ELM algorithm: since the hidden layer node parameters are generated randomly, the hidden layer can be generated in a short time, after which the computation of the hidden layer output matrix H and the network optimization can be carried out; to improve the efficiency of the algorithm, we achieve this through multithreaded parallelization.

VI. ELM ALGORITHM ON THE SPARK PLATFORM AND THE RESULTS ANALYSIS

A. Experimental Environment

The software used in this experiment is VMware Workstation 9.0.2 for Windows, the Ubuntu system, Spark and Java; the language used is Java, and the operation mode is local-cluster. This paper uses the same data sets as Huang G. B. [5]; they are artificial benchmark data sets. Data set 1 and data set 2 are shown in Table I and Table II, respectively.

TABLE I. DATA SET 1; NODE NUMBER = 579; HIDDEN LAYER = 2 (sample rows)
1.00000000 0.17647100 0.81500000 0.57377000 0.18000000 0.12411300 0.47093900 0.08112720 0.11666700
1.00000000 0.41176500 0.57000000 0.52459000 0.00000000 0.00000000 0.40834600 0.27924900 0.21666700
TABLE II. DATA SET 2; NODE NUMBER = 5000; HIDDEN LAYER = 2 (sample row)
-0.23785760 -4.28429561 -0.11624735 8.62498104

The design idea can be expressed by a flow chart, which is shown in Fig.6.

Fig. 6. Design ideas flow chart

B. Performance Testing

In this experiment we select different numbers of hidden layer nodes to carry out the tests. The hidden node numbers are set at intervals of 10, and the unit of training and learning time is ms. Tables III, IV and V list the training accuracy, training time, learning accuracy and learning time measured for each hidden layer node number: Table III covers 10 to 100 hidden nodes, Table IV covers 110 to 200, and Table V covers 210 to 270.

TABLE III. ACCURACY AND TIME FOR 10-100 HIDDEN NODES
Nodes  Train Acc. (%)  Train Time (ms)  Learn Acc. (%)  Learn Time (ms)
10     11.30           1.38             10.70           1.28
20     14.90           2.97             12.80           2.45
30     20.00           4.00             18.40           3.79
40     29.50           5.12             23.80           4.14
50     37.00           6.15             28.80           4.98
60     40.90           6.78             37.30           5.62
70     49.30           7.49             48.20           5.93
80     55.80           7.99             53.70           6.82
90     64.80           8.63             60.00           7.31
100    72.00           9.01             67.90           7.85

TABLE IV. ACCURACY AND TIME FOR 110-200 HIDDEN NODES
Nodes  Train Acc. (%)  Train Time (ms)  Learn Acc. (%)  Learn Time (ms)
110    83.50           11.34            72.90           8.37
120    69.70           13.08            69.30           9.03
130    65.20           15.63            61.50           10.63
140    62.80           17.54            57.00           12.74
150    59.40           25.71            50.40           20.45
160    50.30           34.09            53.30           28.90
170    51.70           49.73            49.80           34.09
180    49.00           53.80            48.20           40.51
190    50.60           59.61            51.90           47.83
200    48.10           64.06            52.60           54.60

TABLE V. ACCURACY AND TIME FOR 210-270 HIDDEN NODES
Nodes  Train Acc. (%)  Train Time (ms)  Learn Acc. (%)  Learn Time (ms)
210    51.60           69.38            49.10           60.74
220    50.00           69.49            50.30           69.49
230    47.90           81.53            53.10           80.30
240    49.20           98.06            49.60           94.61
250    50.70           109.15           51.50           100.64
260    51.30           135.47           47.40           125.38
270    43.80           153.73           51.90           149.06

Fig.7 shows the relationship between the number of hidden nodes and the training accuracy. From Fig.7 we can see that different numbers of hidden nodes lead to different training and learning accuracy. After the training and learning accuracy rises to a certain value, it gradually decreases again: as the hidden nodes increase, the phenomenon of over-fitting may appear. As the hidden nodes increase, the learning and training time also gradually increases. Fig.8 is a fitted curve of the learning time.

Fig. 7. Curve diagram of accuracy and hidden layer node number
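The multithreaded generation of the hidden layer output matrix H described in Section V can be sketched as follows. This is a hypothetical NumPy/stdlib illustration, not the paper's Java code: the sample sizes and the `hidden_matrix_parallel` helper are invented, and the point is only that each chunk of samples can be processed by a separate thread because every row of H depends only on its own sample and the shared random parameters:

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hidden_matrix_parallel(X, A, b, workers=4):
    """Compute H = sigmoid(X @ A + b) chunk of rows by chunk of rows in
    parallel. The chunks are independent, so the work splits cleanly."""
    chunks = np.array_split(X, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(pool.map(lambda C: sigmoid(C @ A + b), chunks))
    return np.vstack(parts)

X = rng.standard_normal((1000, 8))   # invented sample data
A = rng.standard_normal((8, 64))     # random input-to-hidden weights
b = rng.standard_normal(64)          # random hidden biases

H_par = hidden_matrix_parallel(X, A, b)
H_ser = sigmoid(X @ A + b)
print(np.allclose(H_par, H_ser))  # True: parallel and serial results agree
```

Because the rows of H never interact, no synchronization beyond the final stacking step is needed, which is what makes this step of ELM easy to parallelize.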
Fig. 8. Fitted curve of the learning time

VII. CONCLUSION

This paper introduces the Spark big data processing framework and RDD distributed functional programming, and states the background of the ELM algorithm and the ELM algorithm model. The VMware vSphere platform is established, and the parallel experiments are carried out on it; the ELM algorithm based on Spark is implemented. The experimental results show that the algorithm can improve the analysis speed of the ELM algorithm.

REFERENCES

[1] Zaharia M, Chowdhury M, Franklin M J, et al. Spark: cluster computing with working sets. HotCloud, 2010.
[2] Lan Y, Soh Y C, Huang G B. Constructive hidden nodes selection of extreme learning machine for regression. Neurocomputing, 2010, 73(16-18):3191-3199.
[3] He Q, Shang T, Zhuang F, et al. Parallel extreme learning machine for regression based on MapReduce. Neurocomputing, 2013, 102:52-58.
[4] Huang G B, Zhu Q Y, Siew C K. Extreme learning machine: theory and applications. Neurocomputing, 2006, 70(1-3):489-501.
[5] Huang G B, Zhu Q Y, Siew C K. Extreme learning machine: a new learning scheme of feedforward neural networks. Proc. Int. Joint Conf. Neural Networks, 2004, 2:985-990.
[6] He Q, Du C, Wang Q, et al. A parallel incremental extreme SVM classifier. Neurocomputing, 2011, 74(16):2532-2540.
[7] Huang G B. An insight into extreme learning machines: random neurons, random features and kernels. Cognitive Computation, 2014, 6(3):376-390.