Remote Sensing Textual Image Classification based on Ensemble Learning


I.J. Image, Graphics and Signal Processing, 2016, 12, 21-29
Published Online December 2016 in MECS (http://www.mecs-press.org/)
DOI: 10.5815/ijigsp.2016.12.03

Ye Zhiwei 1, Yang Juan 1, Zhang Xu 1, Hu Zhengbing 2
1 School of Computer Science, Hubei University of Technology, Wuhan, China
2 School of Educational Information Technology, Central China Normal University, Wuhan, China

Abstract: Remote sensing textual image classification has been among the hottest topics in the field of remote sensing. Texture is the most helpful cue for image classification. Terrain types are commonly complex, so multiple texture features are extracted for classification; in addition, remote sensing images contain noise, and a single classifier can hardly obtain optimal classification results. Integrating multiple classifiers makes good use of the characteristics of the different classifiers and improves classification accuracy to the largest extent. In this paper, based on a diversity measurement of the base classifiers, the J48, IBk, sequential minimal optimization (SMO), Naive Bayes, and multilayer perceptron (MLP) classifiers are selected for ensemble learning. To evaluate the influence of the proposed method, it is compared with the five base classifiers in terms of average classification accuracy. Experiments on five UCI data sets and on remote sensing image data sets are performed to test the effectiveness of the proposed method.

Index Terms: Remote Sensing, Textual Image Classification, Ensemble Learning, Bagging.

I. INTRODUCTION

Remote sensing image mining, or classification, is one of the most important methods of extracting land cover information on the Earth [1]. Unlike standard alphanumeric mining, image mining or classification is very difficult because image data are unstructured [2]. There are two main image classification techniques: unsupervised and supervised image classification. In supervised image classification, the user first selects representative samples, called the training set, for each land cover class; then a classifier is trained on the given training set; finally, the trained classifier is applied in practice. Each training sample consists of a low-level feature vector and its class label, and the trained classifier assigns unknown low-level feature vectors to one of the trained classes. Several classifiers such as the Maximum Likelihood Classifier and the Minimum Distance Classifier have been used for image classification [3].

With the development of remote sensing technology, the spatial and spectral resolution of remote sensing images has been getting higher and higher [4]. This presents new challenges to remote sensing image classification and requires the development of new classification methods. Many new methods, such as spectral information divergence and the object-oriented paradigm, have appeared [5]. To a certain extent, these classifiers or classification strategies can improve classification accuracy; however, different classifiers have their own characteristics, and their performance is not identical across applications [6]. Some samples wrongly classified by one classifier may be correctly labeled by another, which indicates that there is complementarity between classifiers.
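As a concrete illustration of the supervised workflow just described (label a training set, fit a classifier, apply it to unseen feature vectors), here is a minimal sketch in Python with scikit-learn; the random feature matrix and labels are placeholders rather than data from the paper.

```python
# Minimal supervised-classification sketch: train on labeled feature vectors,
# then label unseen ones. X and y below are synthetic placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 19))      # placeholder low-level texture features
y = rng.integers(0, 4, size=200)    # placeholder land-cover class labels

# 1) representative labeled samples form the training set
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
# 2) a learning classifier is trained on the labeled samples
clf = DecisionTreeClassifier().fit(X_train, y_train)
# 3) the trained classifier labels unknown feature vectors
print("held-out accuracy:", clf.score(X_test, y_test))
```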
It is difficult to design a powerful model for classifying remote sensing images, because the model should not only capture the main discriminative information of the image but also be robust to its variations. As a result, merely improving traditional methods to achieve robust classification is not always feasible. In 1998, Duin et al. proposed combining multiple classifiers to enhance the performance of a single classifier [7]. That is, a combination of classifiers is able to amend the errors made by a single classifier on distinct parts of the input space. It is conceivable that the performance of a combination of classifiers is better than that of any base classifier used in isolation [8]. The emergence of ensemble learning provides a new research direction for dealing with the strong correlation and redundancy that exist among the bands. Hansen et al. first proposed the concept of neural network ensembles [9]. They proved that the generalization ability of a learning system can be significantly improved by training multiple neural networks. In 2011, multiple-classifier ensembles were applied to face recognition [10]. In the same year, the support vector machine (SVM) was used as the base classifier to recognize facial expressions [11].

As is known, texture is a vital characteristic for remote sensing image interpretation. However, texture often changes in orientation, scale, or other aspects of visual appearance, so it is hard to describe accurately with a single mathematical model. Generally, several descriptors are utilized for classifying textures, which may improve classification accuracy but also increases classification difficulty. In this paper, based on a diversity measurement of the base classifiers, the J48, IBk, sequential minimal optimization (SMO), Naive Bayes, and multilayer perceptron (MLP) classifiers are selected for ensemble learning.

These classifiers implement the C4.5 decision tree, k-Nearest Neighbors (k-NN), support vector machine, Naive Bayes, and artificial neural network (ANN) classification algorithms, respectively. To evaluate the influence of the proposed method, it is compared with the five base classifiers in terms of average classification accuracy.

The remainder of this paper is organized as follows. Section 2 briefly reviews ensemble learning. In Section 3, the selection of base classifiers and the proposed method are described in detail. The effectiveness of the proposed method is demonstrated in Section 4 by experiments on several public data sets from the UCI machine learning repository and on real remote sensing images. Finally, Section 5 draws conclusions from the experimental results.

II. OVERVIEW OF ENSEMBLE LEARNING

In a narrow sense, ensemble learning uses learners of the same type to learn the same problem; for example, all learners may be support vector machines or neural network classifiers. In a broad sense, applying a variety of learners to solve the problem can also be considered ensemble learning. The idea of ensemble learning is to integrate multiple individual learners when learning new examples and to determine the result by combining the results of the multiple learners, in order to achieve better performance than a single learner [12]. If a single learner is viewed as one decision maker, ensemble learning corresponds to a decision made jointly by a number of decision makers.

By combining k base classifiers M_1, M_2, \ldots, M_k, an improved composite classification model M^* is created. A given data set D is used to create k training data sets D_1, D_2, \ldots, D_k, where D_i (1 \le i \le k) is devoted to generating classifier M_i. The ensemble result is a class prediction based on the votes of the base classifiers. The flow chart of ensemble learning is shown in Fig. 1.

Fig. 1. The flow chart of ensemble learning

Three methods are considered in the theoretical guidance of ensemble learning: 1) sample set reconstruction; 2) feature level reconstruction; 3) output variable reconstruction. Employing ensemble learning usually takes two steps. The first step is to obtain individual models by producing several training subsets. The second step is to use a synthesis technique to derive the final result from the individual outputs.

Dietterich expounded three reasons why an ensemble learner is superior to a single model [13]. From a statistical perspective, the hypothesis space to be searched is very large, yet only a few training samples are available compared with the real samples in the world, so the target hypothesis cannot be learned exactly; learning therefore yields a series of hypotheses that fit the training sets with approximate accuracy. These hypotheses may fit the training sets well but perform poorly in practice, so choosing only one classifier carries a large risk. Fortunately, this risk can be reduced by considering multiple hypotheses at the same time.
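The voting scheme sketched in Fig. 1 can be written down compactly. The following sketch assumes scikit-learn stand-ins for the five Weka classifiers used later in the paper (DecisionTree for J48/C4.5, KNeighbors for IBk, SVC for SMO, GaussianNB for Naive Bayes, MLPClassifier for MLP); it is an illustration, not the authors' implementation.

```python
# Hard-voting ensemble of five heterogeneous base learners: each trained
# model casts one vote per sample and the majority label wins.
from sklearn.ensemble import VotingClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

base_learners = [
    ("j48", DecisionTreeClassifier()),
    ("ibk", KNeighborsClassifier(n_neighbors=3)),
    ("smo", SVC()),
    ("bayes", GaussianNB()),
    ("mlp", MLPClassifier(max_iter=1000)),
]
ensemble = VotingClassifier(estimators=base_learners, voting="hard")
# usage: ensemble.fit(X_train, y_train); ensemble.predict(X_test)
```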
III. MULTIPLE CLASSIFIERS ENSEMBLE BASED ON BAGGING ALGORITHM

A. Selection of Base Classifiers

As discussed above, an important reason for the success of ensemble classification algorithms is that a group of different base classifiers is employed. Diversity among a team of classifiers is deemed a key issue in classifier ensembles [14]. However, measuring diversity is not straightforward, since there is no widely accepted formal definition. In 1989, Littlewood and Miller proposed that diversity be recognized as a very important characteristic in classifier combination [15]. Still, there is no rigorous definition of what is meant by dependence, diversity, or orthogonality of classifiers. Many measures of the connection between two classifier outputs can be derived from the statistical literature, such as the Q statistic and the correlation coefficient. There are formulas, methods, and ideas aimed at quantifying diversity when three or more classifiers are concerned, but little of this rests on a strict or systematic basis, due to the lack of a definition. The general expectation is that diversity measures can help in designing the base classifiers and the combination technique.

In this paper, we measure the diversity between five types of supervised classifiers using non-pairwise diversity measures such as the entropy measure, the Kappa measure, and the Kohavi-Wolpert variance. Let D = \{D_1, D_2, \ldots, D_L\} be a set of base classifiers and \Omega = \{\omega_1, \omega_2, \ldots, \omega_c\} a set of class labels, where x is a vector with n features to be labeled [16]. Writing l(x_j) for the number of base classifiers that correctly label sample x_j, the entropy measure E is defined as

E = \frac{1}{N} \sum_{j=1}^{N} \frac{1}{L - \lceil L/2 \rceil} \min\{\, l(x_j),\; L - l(x_j) \,\}   (1)

E varies between 0 and 1, where E = 0 indicates no difference between the base classifiers and E = 1 indicates the highest possible diversity; the larger the entropy value, the higher the diversity.

The Kohavi-Wolpert variance uses the classifier-specific quantity v_x = \frac{1}{2}\bigl(1 - P(y = 1 \mid x)^2 - P(y = 0 \mid x)^2\bigr) to express the diversity of the predicted class label y for x across training samples, where P(y \mid x) is estimated as an average over different data sets. Averaging over the entire set Z, the KW measure of diversity is defined as

KW = \frac{1}{N L^2} \sum_{j=1}^{N} l(z_j)\,\bigl(L - l(z_j)\bigr)   (2)

Let \bar{p} be the average accuracy of the individual classifiers, i.e., \bar{p} = \frac{1}{NL} \sum_{j=1}^{N} \sum_{i=1}^{L} y_{j,i}; then the Kappa measure is defined as

\kappa = 1 - \frac{\frac{1}{L} \sum_{j=1}^{N} l(x_j)\,\bigl(L - l(x_j)\bigr)}{N (L - 1)\,\bar{p}\,(1 - \bar{p})}   (3)

From Eq. (3), the Kappa value increases as the correlation between classifiers increases.

In this paper, the J48, IBk, SMO, Naive Bayes, and MLP classifiers are selected as the base classifiers. First, the five supervised classifiers are introduced briefly; then the diversity between them is measured using the non-pairwise diversity measures, as in the sketch below.
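A compact sketch of Eqs. (1)-(3), assuming the usual oracle-output representation (a 0/1 matrix recording whether each classifier labels each sample correctly); this is our reading of the formulas, not code from the paper.

```python
# Non-pairwise diversity measures over an N x L oracle matrix:
# oracle[j, i] = 1 iff classifier i labels sample j correctly.
import numpy as np

def diversity_measures(oracle):
    N, L = oracle.shape
    l = oracle.sum(axis=1)                 # l(x_j): correct votes per sample
    # Eq. (1): entropy measure, 0 = identical outputs, 1 = maximal diversity
    E = np.mean(np.minimum(l, L - l)) / (L - np.ceil(L / 2))
    # Eq. (2): Kohavi-Wolpert variance
    KW = np.sum(l * (L - l)) / (N * L**2)
    # Eq. (3): interrater-agreement kappa, with p_bar the mean accuracy
    p_bar = l.sum() / (N * L)
    kappa = 1 - (np.sum(l * (L - l)) / L) / (N * (L - 1) * p_bar * (1 - p_bar))
    return E, KW, kappa
```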
1) J48 Classifier

Decision tree learning constructs a predictive model in the form of a decision tree, mapping observations about an item to conclusions about its target value. Ross Quinlan developed the decision tree algorithm called C4.5, an extension of the earlier ID3 algorithm [17]. Like ID3, C4.5 builds decision trees from training data using the concept of information entropy. The construction of decision trees was first derived from Hunt's method, which comprises two steps [18]. In the first step, if only one class is present, the node becomes a leaf node; otherwise the procedure enters the second step, which searches for a variable that divides the data into two or more subsets of higher purity according to a condition on that variable. That is, the variable is selected according to local optimality and the procedure returns to the first step. J48 is an open-source Java implementation of the C4.5 algorithm in Weka. The classification rules of C4.5 are easy to understand and its accuracy is high. Its main drawback is that it must repeatedly scan and sort the data set while constructing the decision tree, which lowers the efficiency of the algorithm.

2) IBk

The second classifier chosen is k-NN. The input of k-NN comprises the k closest training examples in the feature space, and the output is a class membership. An object is classified by the majority of its neighbors: it is assigned to the class most common among its k nearest neighbors (for k = 1, simply the class of the single nearest neighbor). The training examples, each with a class label, are vectors in a multidimensional feature space. In the classification phase, k is a user-defined constant, and an unlabeled vector is classified by assigning it the label most frequent among the k training samples nearest to that unknown point. When there are few training samples, the training set can simply be used as the reference set; when there are many, an existing selection can be used or prototypes of the reference set can be computed. The k-NN algorithm adapts well to test samples from strongly overlapping domains. A commonly used distance metric for continuous variables is the Euclidean distance [19], the length of the line segment connecting points i and j. In Cartesian coordinates, with i = (x_{i1}, x_{i2}, \ldots, x_{in}) and j = (x_{j1}, x_{j2}, \ldots, x_{jn}) two points in Euclidean n-space, the distance d from i to j (or from j to i) is given by

d(i, j) = \sqrt{(x_{i1} - x_{j1})^2 + (x_{i2} - x_{j2})^2 + \cdots + (x_{in} - x_{jn})^2}   (4)

k-NN is a non-parametric classification technique [20] with comparatively high accuracy on unknown and non-normal distributions. Its advantages are intuitive reasoning, high feasibility, and clear concepts. It directly uses the relationships between samples, which can reduce the error probability of classification and avoid unnecessary complications. Of course, it is a lazy learning method, with the disadvantages of slow classification speed and strong dependence on sample size.
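To make Eq. (4) and the neighbor-voting rule concrete, here is a minimal k-NN sketch in plain NumPy (an illustration, not Weka's IBk):

```python
# k-NN by hand: Euclidean distances to every training point (Eq. 4),
# then a majority vote among the labels of the k nearest points.
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    d = np.sqrt(((X_train - x) ** 2).sum(axis=1))   # d(i, j) per Eq. (4)
    nearest = np.argsort(d)[:k]                     # indices of k closest points
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]                # most frequent neighbor label
```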

3) Sequential Minimal Optimization Classifier

SMO takes the idea of decomposition algorithms to its extreme for solving the optimization problem [21]. It is an iterative algorithm that breaks the problem into the smallest possible sub-problems, its most prominent feature being that the optimization problem for two data points can be solved analytically. Therefore, there is no need to include a quadratic programming optimizer as part of the algorithm. Each step of SMO selects two elements to optimize [22]: the optimal values of two parameters are found and the corresponding vectors updated, on the premise that all other parameters stay fixed. Although more iterations are needed to converge, the overall speed increases because the work per iteration is very small.

Based on the principle of structural risk minimization and the VC-dimension theory of statistical learning theory, the support vector machine (SVM) finds the best balance between learning ability and model complexity for a given sample set, so as to obtain the best generalization ability. Compared with traditional artificial neural networks, SVM has the advantage of a simple structure. It improves considerably on generalization performance and avoids the local optimum problem that cannot be avoided in neural networks. SVM can handle small sample sets, high dimensionality, and nonlinearity. It has many special properties that ensure better generalization ability during learning, and at the same time it averts the curse of dimensionality.

4) Naive Bayes

The Naive Bayes classifier is a simple probabilistic classifier based on Bayes' theorem with a strong independence assumption [23]. Starting from the known prior probability of each class, it uses the Bayes formula to calculate the posterior probability, and the class with the largest posterior probability is selected as the class of the object. In theory, Naive Bayes is a conditional probability model: suppose the sample space of experiment E is S, with B_1, B_2, \ldots, B_n representing n classes, A an event of E with P(A) > 0, and P(B_i) > 0 for i = 1, 2, \ldots, n. Using Bayes' theorem, the conditional probability is calculated as

P(B_i \mid A) = \frac{P(A \mid B_i)\, P(B_i)}{\sum_{j=1}^{n} P(A \mid B_j)\, P(B_j)}, \quad i = 1, 2, \ldots, n   (5)

A clear distinction between Naive Bayes and other learning methods is that it does not explicitly search the space of possible hypotheses. The Naive Bayes algorithm takes little time and its logic is relatively simple; it also has a high degree of feasibility and high stability.

5) Multilayer Perceptron

An artificial neural network is a simulation of a biological neural network system, used to evaluate or approximate functions; it consists of a large number of simple computing units connected in some form into a network [24]. In the learning stage, the network establishes the correspondence between input samples and correct outputs by adjusting the weights. A neural network has a strong ability to identify and classify input samples; in effect, it finds segmentation regions of the sample space, each belonging to one class, that meet the classification requirements. An MLP is a feedforward ANN model, which can be regarded as a mapping F: R^d \to R^M. What distinguishes an MLP from other neural networks is that a number of its neurons use a nonlinear activation function. Learning happens in the perceptron by altering the weights between neurons after each training sample is processed, based on the comparison of the output error with the expected result. Compared with other algorithms, neural networks tolerate noisy data well and perform very well in classifying the training data.
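To make the Naive Bayes rule of Eq. (5) in the subsection above concrete, here is a small numeric sketch of the posterior computation; the priors and likelihoods are invented for illustration only.

```python
# Eq. (5) worked numerically: posterior P(B_i | A) from class priors P(B_i)
# and class-conditional likelihoods P(A | B_i). Values are illustrative.
import numpy as np

priors = np.array([0.5, 0.3, 0.2])          # assumed P(B_i)
likelihoods = np.array([0.10, 0.40, 0.25])  # assumed P(A | B_i) for observed A
joint = likelihoods * priors
posterior = joint / joint.sum()             # normalize over all classes, Eq. (5)
print(posterior, "-> predict class", posterior.argmax())
```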
Different numbers and types of classifiers are used to measure their diversity. The results are shown in Table 1.

Table 1. Kappa measurement of different classifiers

Number  Classifiers                     Kappa
2       (J48, IBk)                      0.5952
2       (J48, SMO)                      0.885
2       (J48, Bayes)                    0.8745
2       (J48, MLP)                      0.767
2       (IBk, SMO)                      0.9492
2       (IBk, Bayes)                    0.8751
2       (IBk, MLP)                      0.7868
2       (SMO, Bayes)                    0.885
2       (SMO, MLP)                      0.7787
2       (Bayes, MLP)                    0.7768
3       (J48, IBk, SMO)                 0.9595
3       (J48, IBk, Bayes)               0.9476
3       (J48, IBk, MLP)                 0.9158
3       (J48, SMO, Bayes)               0.9473
3       (J48, SMO, MLP)                 0.9132
3       (J48, Bayes, MLP)               0.9116
3       (IBk, SMO, Bayes)               0.9573
3       (IBk, SMO, MLP)                 0.9232
3       (IBk, Bayes, MLP)               0.9149
3       (SMO, Bayes, MLP)               0.9143
4       (J48, IBk, SMO, Bayes)          0.9627
4       (J48, IBk, SMO, MLP)            0.958
4       (J48, IBk, Bayes, MLP)          0.9548
4       (J48, SMO, Bayes, MLP)          0.9542
4       (IBk, SMO, Bayes, MLP)          0.9572
5       (J48, SMO, MLP, IBk, Bayes)     0.9671

As can be seen from Table 1, if we choose two classifiers out of the five, the best choice is the IBk and SMO classifiers. Similarly, the best choice of three classifiers is J48, IBk, and SMO, and the best choice of four classifiers is J48, IBk, SMO, and Bayes. To obtain still better results, all five classifiers are needed. Therefore, the J48, IBk, SMO, Naive Bayes, and MLP classifiers are chosen for ensemble learning.
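The selection step implied by Table 1 (for each ensemble size, pick the combination the paper identifies as best, i.e., the one with the highest measured Kappa) amounts to a one-line search; the dictionary below holds a few of the Table 1 values for illustration.

```python
# Pick the highest-Kappa combination per ensemble size from measured values.
kappas = {
    ("IBk", "SMO"): 0.9492,
    ("J48", "IBk", "SMO"): 0.9595,
    ("J48", "IBk", "SMO", "Bayes"): 0.9627,
    ("J48", "SMO", "MLP", "IBk", "Bayes"): 0.9671,
}
for size in (2, 3, 4, 5):
    best = max((c for c in kappas if len(c) == size),
               key=kappas.get, default=None)
    if best:
        print(size, best, kappas[best])
```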

B. Multiple Classifiers Ensemble based on Bagging Algorithm

The bagging algorithm, proposed by Breiman in 1994, is an ensemble learning method that improves classification by combining classifiers trained on randomly selected training sets [25]. Given a training set of size m, m random instances are drawn from it with replacement; a model is learned from these m instances, and the process can be repeated several times. Because the draw is with replacement, each such sample contains some duplicates and omissions compared with the initial training set. Each cycle of the process yields one classifier, and the forecast of each constructed classifier becomes a vote that influences the final forecast.

Algorithm: Bagging
Input:
  1. D = \{(x_1; y_1), (x_2; y_2), \ldots, (x_m; y_m)\}, a set of m training tuples;
  2. T, the number of models in the ensemble;
  3. L, a classification learning scheme (J48, IBk, SMO, Naive Bayes, or MLP).
Output: the ensemble, a composite model H(x) = \arg\max_{y \in Y} \sum_{t=1}^{T} l(y = h_t(x)), where l(\cdot) is 1 when the proposition in parentheses is true and 0 otherwise.
Method:
  for t = 1 to T do
    create a bootstrap sample D_t = Bootstrap(D) by sampling D with replacement;
    use D_t and the learning scheme L to derive a model h_t;
  endfor
To classify a tuple X, let each of the T models classify X and return the majority vote.

Given a training set D = \{(x_1; y_1), (x_2; y_2), \ldots, (x_m; y_m)\} of size m, the bagging algorithm generates T new training sets D_t, each of size m, by sampling from D uniformly with replacement. For each sample set, the probability that a particular example is drawn at least once is 1 - (1 - 1/m)^m; for large m, about 1 - 1/e \approx 63.2\% of the examples are unique and the rest are duplicates. The T models are fitted on these T bootstrap samples and combined by casting votes. The highest attainable correct classification rate of a single predictor is r^* = \int \max_j P(j \mid x)\, P_X(dx). Based on the bagging algorithm, the probability of correct classification is

r_A = \int_{x \in C} \max_j P(j \mid x)\, P_X(dx) + \int_{x \in C'} \Bigl[\sum_j I(\phi_A(x) = j)\, P(j \mid x)\Bigr] P_X(dx)   (6)

where C denotes the set of inputs on which the aggregated predictor \phi_A is order-correct and C' its complement. From this correct rate it can be seen that the result of the bagging algorithm is better than the result obtained by a single prediction function.
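A from-scratch sketch of the algorithm box above: T bootstrap samples drawn with replacement (so each sample covers about 63.2% of the unique examples, per the note above), one model per sample, and a majority vote at prediction time. The base learner and integer-coded labels are assumptions; scikit-learn's BaggingClassifier packages the same procedure.

```python
# Bagging by hand: bootstrap resampling + per-sample models + majority vote.
import numpy as np
from sklearn.base import clone
from sklearn.tree import DecisionTreeClassifier

def bagging_fit(X, y, T=10, base=DecisionTreeClassifier(), seed=0):
    rng = np.random.default_rng(seed)
    m = len(X)
    models = []
    for _ in range(T):
        idx = rng.integers(0, m, size=m)        # D_t: sample D with replacement
        models.append(clone(base).fit(X[idx], y[idx]))
    return models

def bagging_predict(models, X):
    # H(x) = argmax_y sum_t 1[h_t(x) = y]: column-wise majority vote.
    # Assumes integer-coded class labels (required by np.bincount).
    votes = np.stack([mdl.predict(X) for mdl in models])   # T x n label matrix
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)
```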
IV. SIMULATION RESULTS AND DISCUSSION

A. Experiments on Public Data Sets

To evaluate the performance of the multiple-classifier ensemble based on the bagging algorithm, five public data sets from the UCI machine learning repository, named Image segment, german_credit, hepatitis, ionosphere, and soybean, are used in this part. For example, the Image segmentation data set has 19 continuous attributes, 210 training samples, and 2100 test samples. Its instances were randomly drawn from a database of 7 outdoor images, and each instance is a 3x3 region; the images were segmented to create a classification for each pixel. The classes of the Image segmentation data set are brickface, cement, foliage, sky, path, window, and grass. General information about the other data sets, such as the number of instances, attributes, and classes, is shown in Table 2.

Table 2. General information on the public data sets from UCI (http://archive.ics.uci.edu/ml/datasets.html)

Data Sets      Instances  Attributes  Classes
segment        2310       19          7
german_credit  1000       20          2
hepatitis      155        19          2
ionosphere     351        34          2
soybean        683        35          19

Before calculating classification accuracy, some data cleaning procedures are applied. Cleaning not only ensures the uniformity and accuracy of the data set, but also makes it more amenable to the mining process by changing the internal structure and content of the data file. Preprocessing improves both the quality of the data sample set and the quality of the data mining algorithm, and it reduces running time. For the neural network backpropagation algorithm, normalizing the input values of each attribute speeds up the learning phase. For distance-based methods, normalization helps prevent attributes with originally large ranges from outweighing attributes with originally smaller ranges. Considering the classifiers chosen, we use min-max normalization to preprocess the data.

26 Remote Sensng Textual Image Classfcaton based on Ensemble Learnng densty-5, short-lne-densty-2, vedge-mean, vegde-sd, hedge-mean, hedge-sd, ntensty-mean, rawred-mean, rawblue-mean, rawgreen-mean, exred-mean, exbluemean, exgreen-mean, value-mean, saturaton-mean and hue-mean are [1,254], [11,251], [9,9], [0,0.333], [0,0.222], [0,29.222], [0,991.718], [0,44.722], [0,1386.33], [0,143.444], [0,137.111], [0,150.889], [0,142.556], [-49.667,9.889],[-12.444,82],[- 33.889,24.667], [0,150.889], [0,1], [-3.044,2.912], respectvely. It s clear that the 19 numerc attrbutes are not n the same range, so they need to be unfed to a certan extent. The experments use Normalze, whch s an unsupervsed flter n Weka. By mn-max normalzaton, suppose that mn A and max A are the mnmum and maxmum values of an attrbute, A. The value v s mapped by the value v of A n the range [0,1] by mnmax normalzaton, computng as Eq.(7) Fg.3. Vsualzaton of the german_credt data set usng a scatter-plot matrx wth part of attrbutes v mn A v' max mn A A (7) Fg.2, Fg, 3, Fg, 4, Fg.5 and Fg. 6 are the vsualzaton of the above fve data sets usng a scatterplot matrx. Fg.4. Vsualzaton of the hepatts data set usng a scatter-plot matrx wth part of attrbutes Fg.2. Vsualzaton of the Image Segmentaton data set wth part of attrbutes Fg.5. Vsualzaton of the onosphere data set usng a scatter-plot matrx wth part of attrbutes

In the experiment, the C4.5, k-NN, SVM, Naive Bayes, ANN, and ensemble learning algorithms are used to classify the test data sets separately. For the five public data sets, Table 3 shows the average classification accuracy of the J48, IBk, SMO, Naive Bayes, MLP, and bagging classifiers.

Table 3. Comparison of classification accuracy for five public data sets

Data Sets      J48   IBk   SMO   Naive Bayes  MLP   Bagging
segment        96.9  97.1  93.0  80.2         96.2  97.7
german_credit  70.5  72    75.1  75.6         72    76.4
hepatitis      83.8  80.6  85.1  84.5         80    85.8
ionosphere     91.4  86.3  88.6  82.6         91.1  92.0
soybean        91.5  91.2  93.8  92.9         93.4  94.4

As can be seen from Table 3, for the Image segment data set the average classification accuracy of the IBk classifier is significantly higher than that of the other four base classifiers, reaching 97.1; for the german_credit data set, the average classification accuracy of Naive Bayes reaches 75.6; for the hepatitis and soybean data sets, the SMO classifier outperforms the other four base classifiers, reaching 85.1 and 93.8 respectively; in the ionosphere experiment, the J48 classifier reaches 91.4, while the IBk, SMO, Naive Bayes, and MLP algorithms reach 86.3, 88.6, 82.6, and 91.1, respectively. The classification accuracies thus differ across data sets because the performance of the base classifiers differs; no single classifier holds an absolute advantage. This is precisely the point of the experiment: based on the differences and complementary information between the base classifiers, the bagging algorithm combines the different classifiers and gives full play to the advantages of each. From the experimental data in Table 3, the average classification accuracy of the ensemble learning algorithm based on the five classifiers is higher than the average classification accuracy of any base classifier used separately; the results for the five data sets using the bagging algorithm are 97.2, 76.1, 85.8, 92.0, and 94.7, respectively.

B. Experiments on Remote Sensing Image Data Sets

To further illustrate the performance of the method on real remote sensing images, some real remote sensing images were selected as training data. There are 684 instances divided into four classes: resident, paddy field, water, and vegetation area. Some training samples of each class are shown in Figs. 7-10.

Fig. 7. Resident training samples
Fig. 8. Paddy field training samples
Fig. 9. Water training samples
Fig. 10. Vegetation training samples

As with the five public data sets, the remote sensing image data set is preprocessed with min-max normalization over the original data's 22 attributes, including variance, skewness, prominence, energy, absolute value, and the texture energy of each order. Fig. 11 shows a visualization of the remote sensing image data set as a scatter-plot matrix. Then the C4.5, k-NN, SVM, Naive Bayes, ANN, and ensemble learning algorithms are used to classify the remote sensing image data set separately.
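A sketch of the comparison protocol behind Tables 3 and 4: score each base classifier and a bagged ensemble on the same data. Cross-validation and the scikit-learn stand-ins are assumptions on our part; the paper reports average classification accuracy but does not spell out this exact harness.

```python
# Score five base classifiers and a bagged ensemble with 10-fold CV accuracy.
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

candidates = {
    "J48": DecisionTreeClassifier(),
    "IBk": KNeighborsClassifier(),
    "SMO": SVC(),
    "NaiveBayes": GaussianNB(),
    "MLP": MLPClassifier(max_iter=1000),
    "Bagging": BaggingClassifier(DecisionTreeClassifier(), n_estimators=10),
}

def compare(X, y):
    for name, clf in candidates.items():
        scores = cross_val_score(clf, X, y, cv=10)   # 10-fold CV accuracy
        print(f"{name}: {100 * scores.mean():.1f}")
```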

Fig. 11. Visualization of the remote sensing image data set as a scatter-plot matrix with part of the attributes

Table 4 shows the average classification accuracy of the J48, IBk, SMO, Naive Bayes, MLP, and bagging classifiers on the remote sensing image data set.

Table 4. Comparison of classification accuracy for the remote sensing image data set

Data Set              J48   IBk   SMO   Naive Bayes  MLP   Bagging
Remote Sensing Image  81.2  78.6  86.5  85.1         86.9  89.1

From Table 4, the average classification accuracies of the J48, IBk, SMO, Naive Bayes, MLP, and bagging classifiers on the remote sensing image data set are 81.2, 78.6, 86.5, 85.1, 86.9, and 89.1, respectively. The bagging result rises to 89.1, nearly 2 points higher than MLP, the best-performing single base classifier. It may be deduced that, for texture image classification, ensemble learning is a promising approach that can achieve satisfactory results in practice.

V. CONCLUSION

To improve the classification accuracy of remote sensing images, our method uses ensemble learning to combine the J48, IBk, sequential minimal optimization, Naive Bayes, and multilayer perceptron classifiers, which classify the data sets by straight voting. Five public data sets and real remote sensing images were selected to verify the results. The experimental results show that a multiple-classifier ensemble can effectively improve the classification accuracy of textural remote sensing images. In this paper, however, the classifiers are integrated in a simple manner; in the future, better combination schemes will be employed.

ACKNOWLEDGMENT

This work is funded by the National Natural Science Foundation of China under Grant No. 41301371 and by the State Key Laboratory of Geo-Information Engineering, No. SKLGIE2014-M-3-3.
REFERENCES

[1] Ghassemian H. A review of remote sensing image fusion methods [J]. Information Fusion, 2016, 32(PA): 75-89.
[2] Tsai C F. Image mining by spectral features: A case study of scenery image classification [J]. Expert Systems with Applications, 2007, 32(1): 135-142.
[3] Goel S, Gaur M, Jain E. Nature inspired algorithms in remote sensing image classification [J]. Procedia Computer Science, 2015, 57: 377-384.
[4] Xu M, Zhang L, Du B. An image-based endmember bundle extraction algorithm using both spatial and spectral information [J]. IEEE Journal of Selected Topics in Applied Earth Observations & Remote Sensing, 2015, 8(6): 2607-2617.
[5] Platt R V, Rapoza L. An evaluation of an object-oriented paradigm for land use/land cover classification [J]. Professional Geographer, 2008, 60(1): 87-100.
[6] Wolpert D H. The supervised learning no-free-lunch theorems [C]. Proceedings of the 6th Online World Conference on Soft Computing in Industrial Applications, 2001.
[7] Kittler J, Hatef M, Duin R P W, et al. On combining classifiers [J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 1998, 20(3): 226-239.
[8] Doan H T, Foody G M. Increasing soft classification accuracy through the use of an ensemble of classifiers [J]. International Journal of Remote Sensing, 2007, 28(20): 4606-4623.
[9] Hansen L K, Salamon P. Neural network ensembles [J]. IEEE Transactions on Pattern Analysis & Machine Intelligence, 1990, 12(10): 993-1001.
[10] Lei Z, Liao S, Pietikainen M, et al. Face recognition by exploring information jointly in space, scale and orientation [J]. IEEE Transactions on Image Processing, 2011, 20(1): 247-256.
[11] Mountrakis G, Im J, Ogole C. Support vector machines in remote sensing: A review [J]. ISPRS Journal of Photogrammetry & Remote Sensing, 2011, 66(3): 247-259.
[12] Rokach L. Ensemble-based classifiers [J]. Artificial Intelligence Review, 2010, 33(1-2): 1-39.
[13] Dietterich T G. Ensemble methods in machine learning [C]// International Workshop on Multiple Classifier Systems. Springer-Verlag, 2000: 1-15.
[14] Dietterich T G. An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization [J]. Machine Learning, 2000, 40(2): 139-158.
[15] Littlewood B, Miller D R. Conceptual modeling of coincident failures in multiversion software [J]. IEEE Transactions on Software Engineering, 1989, 15(12): 1596-1614.
[16] Kuncheva L, Whitaker C J. Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy [J]. Machine Learning, 2003, 51(2): 181-207.
[17] Quinlan J R. Improved use of continuous attributes in C4.5 [J]. Journal of Artificial Intelligence Research, 1996, 4(1): 77-90.
[18] Hunt E B, Marin J, Stone P J. Experiments in induction [J]. American Journal of Psychology, 1967, 80(4): 17-19.
[19] Luxburg U V. A tutorial on spectral clustering [J]. Statistics & Computing, 2007, 17(17): 395-416.

[20] Yang J F. A novel template reduction k-nearest neighbor classification method based on weighted distance [J]. Dianzi Yu Xinxi Xuebao / Journal of Electronics & Information Technology, 2011, 33(10): 2378-2383.
[21] Chen P H, Fan R E, Lin C J. A study on SMO-type decomposition methods for support vector machines [J]. IEEE Transactions on Neural Networks, 2006, 17(4): 893-908.
[22] Karatzoglou A, Smola A, Hornik K, et al. kernlab - An S4 package for kernel methods in R [J]. Journal of Statistical Software, 2004, 11(9): 721-729.
[23] Hameg S, Lazri M, Ameur S. Using naive Bayes classifier for classification of convective rainfall intensities based on spectral characteristics retrieved from SEVIRI [J]. Journal of Earth System Science, 2016: 1-11.
[24] Roy M, Routaray D, Ghosh S, et al. Ensemble of multilayer perceptrons for change detection in remotely sensed images [J]. IEEE Geoscience & Remote Sensing Letters, 2014, 11(11): 49-53.
[25] Wolpert D H, Macready W G. An efficient method to estimate bagging's generalization error [C]// Santa Fe Institute, 1999: 41-55.

Authors' Profiles

Ye Zhiwei, born in Hubei, China. He is an associate professor in the School of Computer Science, Hubei University of Technology, Wuhan, China. His major research interests include image processing, swarm intelligence, and machine learning.

How to cite this paper: Ye Zhiwei, Yang Juan, Zhang Xu, Hu Zhengbing, "Remote Sensing Textual Image Classification based on Ensemble Learning", International Journal of Image, Graphics and Signal Processing (IJIGSP), Vol.8, No.12, pp. 21-29, 2016. DOI: 10.5815/ijigsp.2016.12.03