A HYBRID FEATURE SELECTION METHOD BASED ON FISHER SCORE AND GENETIC ALGORITHM

Size: px
Start display at page:

Download "A HYBRID FEATURE SELECTION METHOD BASED ON FISHER SCORE AND GENETIC ALGORITHM"

Transcription

1 Journa of Mathematica Sciences: Advances and Appications Voume 37, 2016, Pages Avaiabe at DOI: A HYBRID FEATURE SELECTION METHOD BASED ON FISHER SCORE AND GENETIC ALGORITHM Schoo of Information Science and Technoogy Jinan University Guangzhou P. R. China e-mai: @163.com Abstract Fisher score and genetic agorithm are widey used for feature seection. However, some redundant features wi be seected by Fisher score, and the convergence properties may be worse if the initia popuation of genetic agorithm is generated by a random manner. To improve the performance of feature seection by Fisher score and genetic agorithm, we propose a hybrid feature seection method, which merging the advantages of Fisher score and genetic agorithm together. It aims at utiizing the features' Fisher score to generate the initia popuation of genetic agorithm. To begin with, the Fisher scores of a the features wi be mapped into a specific interva by a inear function, and then the rescaed Fisher scores wi be utiized to generate the initia popuation of genetic agorithm. Finay, the initia popuation wi be used in the subsequent procedure of genetic agorithm to perform feature seection with eitist strategy for reference. In this paper, we choose four data sets of Sonar, WDBC, Arrhythmia, and Hepatitis to test the performance of our agorithm. Feature subsets of the four data sets wi be seected by our agorithm, and then the dimensionaity of data sets wi be reduced according to the seected feature subsets, respectivey. 1-NN cassifier is used to cassify the dimensionaity reduced data sets, and respectivey, achieving the cassification 2010 Mathematics Subject Cassification: 68T10. Keywords and phrases: feature seection, Fisher score, genetic agorithm, eitist strategy. Received January 21, Scientific Advances Pubishers

2 52 accuracy of 72.36%, 95.64%, 72.04%, and 87.83% with ten-fod cross vaidation method. The experiment resuts show that, compared to the performance of Fisher score, genetic agorithm and Fisher score genetic agorithm, our agorithm is fit for eiminate redundant features, and can seect discriminative features. Above a, our method is effective in feature seection. 1. Introduction In many research fieds such as machine earnin pattern recognition, biomedicine and so on, researchers need to anaysis vast amounts of data, which is described by high-dimensiona vectors. Every component of a vector represents a kind of feature of the highdimensiona data. In reaity, there are some meaningess and redundant features in the high-dimensiona vectors. To improve the efficiency of subsequent procedure (such as cassification and prediction) and save the dedicated space of device, we need to eiminate the meaningess and/or redundant features from those high-dimensiona vectors, and seect the features which are most reevant for our probems to form the optima feature subset, the procedure is caed to be feature seection, or feature subset seection (FSS) [1, 2]. Superficiay, exhaustive search can seect the optima feature subset, however, it is a NP-hard probem that we adopt exhaustive search to seect features, because each dataset has 2 n feature subsets. In the past decades, a number of feature seection methods have been proposed, and they are generay divided into two categories: fiter-based methods and wrapper-based methods. Fiter-based methods rank the features according to a predefined criterion independent of the actua generaization performance of the earning machine, and seect those features with high ranking scores, so a faster speed can usuay be obtained. Mutua information (MI) [3]; Fisher score (FS) [4]; Reieff [5]; Lapacian score [6]; Hibert Schmidt independence criterion (HSIC) [7]; and Trace ratio criterion [8] or so can be regarded as the criterion for fiter-based feature seection, among which Fisher score is one of the most widey used criteria for fiter-based feature seection due to its genera good performance. The specific process of feature seection method based on Fisher score is ike this: cacuating

3 A HYBRID FEATURE SELECTION METHOD 53 the Fisher scores of a the features, and for a given threshod θ, feature f i is seected if F ( f i ) > θ, otherwise, if F ( f i ) θ, feature f i wi not be seected. Seecting features by Fisher score can improve the accuracy of subsequent procedure (such as cassification and prediction), and the process is simpe, feasibe and time saving. However, there are aso some probems about Fisher score: (1) How to set the threshod θ such that an optima feature subset is seected? (2) We cannot eiminate redundant features via Fisher score. For exampe, both the scores of feature f i and feature f j are very high, but they are highy correated. In this case, the fiter-based agorithm wi seect both f i and f j, whie either f i or f j shoud be eiminated without any oss in the subsequent cassification performance. (3) Since the fiter-based agorithm computes the Fisher score of each feature individuay, it negects the combination of features, which means evauating two or more than two features together. For instance, it coud be the case that the scores of feature f i and feature f j are both ow, but the score of the combination of f i and f j is very high. In this case, the fiter-based agorithm wi discard both f i and f j, athough they shoud be seected. Wrapper-based methods use a predictor as a back box and the predictor performance as the objective function to evauate the feature subset. A wide range of wrapper-based methods have been used incuding forward seection [9]; backward eimination [10]; hi-cimbing [11]; branch and bound agorithms [12]; simuated anneaing and genetic agorithms (GAs) [13, 14, 15, 16]. Kudo and Skansky [17] made a comparison among many of the feature seection agorithms and expicity recommended that GAs shoud be used for arge-scae probems with more than 50 candidate variabes. In many practica appications, the genetic agorithm can be reformed in different ways according to different situations. However, when the GA is adopted to seect features, it is unreasonabe that the initia popuation is generated by a random manner. Because every feature has the same chance to be seected into the initia popuation, the convergence properties of GA may be worse.

4 54 Yet, we just want to seect the best features rapidy. Since our purpose is to seect the best features rapidy, why not seect the features of high Fisher score by setting a threshod θ to generate the initia popuation, and then adopt the genetic agorithm to achieve the optima feature subset? It is reasonabe to a extent, however, features of owest Fisher score may have itte chance to be seected. To overcome the above probems, we consider that if the features of higher Fisher score have more probabiities to be seected to generate the initia popuation than features of ower Fisher score, and some chromosomes of best fitness are reserved to the next generation without any genetic operations, the cassification accuracy after feature seection wi be improved. The basic idea of this paper is just about this. In this paper, we present a hybrid feature seection method based on Fisher score and genetic agorithm. Our method utiize features rescaed Fisher score to generate the initia popuation of genetic agorithm. In the first pace, the Fisher scores of a the features wi be mapped into a specific interva by a inear function, and then we utiize the rescaed Fisher scores to generate the initia popuation of genetic agorithm. Finay, the initia popuation wi be used in the subsequent procedure of genetic agorithm to perform feature seection with eitist strategy for reference. Experiments on four benchmark data sets of Sonar, WDBC, Arrhythmia, and Hepatitis indicate that the proposed method outperforms Fisher score method and genetic agorithm, and it does we in eiminating redundant features. Features seected by our method are more discriminative than that some state of the art feature seection methods. The remainder of this paper is organized as foows. In Section 2, we briefy review Fisher score and genetic agorithm. We present the hybrid feature seection method based on Fisher score and genetic agorithm in Section 3. The experiments on four data sets of Sonar, WDBC, Arrhythmia, and Hepatitis are demonstrated in Section 4. Finay, we draw a concusion in Section 5.

5 A HYBRID FEATURE SELECTION METHOD A Brief Review of Fisher Score and Genetic Agorithm 2.1. Fisher score, N i y i i= 1 Given dataset { x }, where M x i R and y i { 1, 2,, c} represents the cass which x i beongs to. N and M are the number of sampes and features, respectivey. f 1, f2,, fm denote the M features. Redundant and meaningess features may be incuded in the feature set when M is arge, thus we want to seect a subset from the M features in order to make the data set more productive. For exampe, we hope the seected features improve the cassification accuracy for a cassification probem. If a feature is discriminative, the between-cass variance of the feature shoud be arge, whie the within-cass variance of the feature shoud be sma. Let µ denote the mean of feature f, and f i mean of feature f i in cass k. Fisher score [4] is defined as foows: i k µ be the f i F ( f ) = i c nk k = 1 c k = 1 yj = k k ( µ µ ) fi k ( f µ ) j, i fi fi 2 2, (2.1) where n k is the number of sampes in cass k, and f j, i is the vaue of feature f i in sampe x j. From the Equation (2.1), we can concude that features with higher Fisher score are more discriminative to some extent Genetic agorithm The genetic agorithm (GA) [18] was originay deveoped by Hoand [3] and his associates. The basic idea of GA is to construct a fitness function according to the objective function of the probem, and utiize the fitness function to evauate a set of candidate soutions (every soution responds to a chromosome) caed a popuation. Based on the Darwinian

6 56 principe of surviva of the fittest, the GA obtains the optima soution after a series of iterative computations. GA generates successive popuations of aternate soutions that are represented by a chromosome, i.e., a soution to the probem, unti acceptabe resut is obtained. Genetic agorithm is a method that based on group optimization, as described beow: Step 1: Generating initia popuation. The generating of initia popuation is random, and the specific way reays on the encoding of chromosomes. The 0-1 codes are generated as beow: et 0, I R denotes the -th chromosome of the initia popuation, for 0, 0, 0, 0, I = ( I, I,, I ), generating a random foating number 1 2 M ξ i U( 0, 1), if ξ i > 0.5, 0, then I i = 1; otherwise, if ξ i 0.5, then 0, Ii = 0 ( i = 1, 2,, M ). The setting of popuation size N reays on the computing abiity of a computer and the compexity of a agorithm. Generay, N is set to be It is the 0-th generation at present. Step 2: Estimating stopping criterion. We adopt the number of maximum generation as the stopping criterion. The agorithm stops when the number of generations reaches the preset maximum generation. Step 3: Computing the fitness. Choosing a proper fitness function f ( I ) based on the objective function of optimizing issue, and cacuating the fitness of every chromosome, respectivey. The higher fitness a chromosome have, the more probabiity it wi be seected to take part in the subsequent genetic operation. Step 4: Seecting the chromosome with high fitness. The rouette-whee seection scheme wi be used to seect N chromosomes to generate a new popuation. Step 5: Genetic operation. The seected chromosomes need to undergo genetic operations, such as crossover and mutation. (1) Crossover: On the basis of the crossover rate P c, the crossover operation generates new M

7 A HYBRID FEATURE SELECTION METHOD 57 chromosomes (offspring) out of their parents. (2) Mutation: The mutation is appied to the offspring. According to the mutation rate P m, we perform the 1-0 and 0-1 conversions. Step 6: Updating the popuation. We obtain the offspring after the crossover and mutation process, and the agorithm enters to the next generation. Return to Step A Hybrid Feature Seection Method Based on Fisher Score and Genetic Agorithm To overcome the probems of Fisher score and genetic agorithm, we propose a hybrid feature seection method based on Fisher score and F f F f,, F f genetic agorithm. We cacuate the Fisher score [ ( ) ( ) ( )] 1, 2 of every feature, and then map a the Fisher scores into a cosed interva [ ε 1 ε], by a inear function, where ε is a reativey sma positive rea vaue. We utiize the rescaed Fisher scores to generate the initia popuation of genetic agorithm, and make a itte modification in the subsequent operator of genetic agorithm. We aim to seect a more optima subset of features than Fisher score and genetic agorithm Generating initia popuation Suppose we have Fisher scores [ F ( f ), F ( f2 ),, F ( fm )] features, utiize a inear function L( F ( f )) to rescae ( ) [ ε 1 ε] i M 1 for a F f into range, for i = 1, 2,, M, and the inear function L( F ( f )) is given as beow: 1 2ε ε( Fmax + Fmin ) Fmin L( F ( f i )) = F ( fi ) +, (3.1) F F F F max min where ε is a reativey sma positive rea vaue, 0 < ε < 0.1, and we ascertain the vaue of ε according to the practica situation of probems. Fmax = max { F ( f1 ), F ( f2 ),, F ( fm )}, Fmin = min { F ( f1 ), F ( f2 ),, F ( fm )}. max min i i

8 58 We adopt 0-1 coding scheme to represent individua in every generation. Let M I R denote the -th individua in the g-th generation. Then I = ( I1, I2,, I M ), where I i = 1 if feature f is seected, otherwise, I = 0, for i = 1, 2,, M. i i Unike genetic agorithm, we mainy utiize the rescaed Fisher scores L( F ( f )) to generate initia popuation in GA agorithm. i Generating M rea vaue of η i by computer with a random manner, where η i U( 0, 1), if ( ( )) 1 L F f i > η i, 0, then I i = 1; otherwise, if L( F ( f )) η, then I = 0, for i = 1, 2,, M. i 3.2. Fitness function i 0, i The seection of a fitness function has a direct infuence on the convergence rate of GA, and it aso concerns whether the optima soution wi be found. Generay speakin the fitness function reates to the objective function of a optimization probem. Therefore, the fitness function denoted by f ( I ) is defined as cassification accuracy of 1-NN cassifier with 10-fods cross vaidation, and there is an additiona term of M g I, i i= 1 λ to penaize the number of features. Then f ( I ) can be expressed as: M i i= 1 f ( I ) = accuracy( I ) λ I, (3.2) where λ is a positive rea parameter Eitism strategy In traditiona genetic agorithm, after a series of seection, crossover and mutation procedures, we get N (where N means popuation size) new chromosomes, and the N new chromosomes wi repace a of their parents to form the next generation. Different from the traditiona GA,

9 A HYBRID FEATURE SELECTION METHOD 59 eitism strategy [19] considers two situations: if the new chromosome with best fitness is superior to the best fitness individua in the ast popuation, a the individuas in the ast popuation wi be repaced; otherwise, the most inferior new chromosome is repaced by the best chromosome in the ast generation. Eitism strategy guarantees the consistency of the optima soution at present and in history, and makes the agorithm to be goba convergence [20]. We take the idea of eitism strategy for reference, seecting ( N m) chromosomes from the ast popuation to participate in genetic operation, and retain those chromosomes corresponding to top m fitness vaues in the next generation without participating in any genetic operations. The specific operations are exempified in Figure 3.1. This operator does not obstruct the creation of new chromosomes, and ensures the best fitness chromosome in the next generation never inferior to the one in the ast generation, the process of evoution becomes a optimizing process. Figure 3.1. The eitism strategy operation in this paper Seection strategy Let P( I ) denotes the probabiity that define P( I ) as foows: I is seected. Then we f ( I ) P( I ) =, (3.3) N = 1 f ( I ) where N means popuation size. Rouette whee seection scheme is appied to seect individuas for genetic operations. Let

10 60 PP 0 = 0, (3.4) PP j = P( I ). (3.5) j= 1 Generate a random variabe ξ k ~ U( 0, 1), for k = 1, 2,, N; if PP < 1 ξk PP, then the individua process is repeated ( N m) times Genetic operator and stopping criterion I is seected. The seection Genetic operator mainy incudes two procedures: crossover and mutation. The ( N m) seected chromosomes wi take part in the crossover and mutation procedure according to the probabiity of crossover and mutation, respectivey. The crossover operation generates two new chromosomes (offspring) out of their parents, and the mutation operation sighty perturbs the offspring. The GA stops when the number of generations reaches the preset maximum generation Maxiters Crossover The crossover operator is performed according to the crossover probabiity denoted by P c. The probabiity of crossover contros the frequency of crossover operator, and a higher probabiity is good for open up a new searching area, whie a ower probabiity may ead to a buntness state of GA agorithm. To get a considerabe resut, in this paper, we set the crossover probabiity as a function of the iteration times, and the expression of the crossover probabiity is given beow: 1 P c = n( iters) + 0.7, (3.6) 10 n( Maxiters) where iters denotes the current iteration times and Maxiters is the maximum iteration times. Figure 3.2 shows the curve of crossover probabiity as a function of iteration times in 500 times of iteration.

11 A HYBRID FEATURE SELECTION METHOD 61 Figure 3.2. P c changing with iteration times. We adopt the singe-point crossover manner, and seect two chromosomes t t t t of = ( I, I,, I ) I and = 1 2 M t+1 t+ 1 t+ 1 t+ 1 I ( 1, I2,, I M ) I by Rouette whee, choosing a cutting-point randomy and exchanging the part on the right of cutting-point to get two new chromosomes of g + 1, t g + 1, t g + 1, t ( I, I,, I ) g + 1, t g + 1, t+ 1 g + 1, t+ 1 g + 1, t+ 1 I = 1 2 M and I = ( I1, I2,, g + 1, t+ 1 I M ). A specific exampe of singe-point crossover is exempified in Figure 3.3. Figure 3.3. A specific exampe of singe-point crossover Mutation The mutation operator is performed according to the mutation probabiity denoted by P m. The mainy purpose of mutation is to

12 62 maintain the diversities of popuation. In genera, a ower mutation probabiity means ess ikey to ose important genes; whie a higher mutation probabiity wi ead the agorithm to be a random research process, so we need to carefuy contro the number of 1-0 and 0-1 conversions. In this paper, we set the mutation probabiity as a function of the iteration times, and the expression of the crossover probabiity is given beow: ( 1) 0.04 cos π iters P m = , (3.7) 2( Maxiters 1) where iters denotes the current iteration times and Maxiters is the maximum iteration times. Figure 3.4 shows the curve of mutation probabiity as a function of iteration times in 500 iterations. Figure 3.4. P m changing with iteration times. Based on the mutation probabiity, we seect three gene sites, and perform the 1-0 and 0-1 conversions. A specific exampe of mutation is exempified in Figure Figure 3.5. A specific exampe of mutation.

13 A HYBRID FEATURE SELECTION METHOD Experiments 4.1. Datasets To test the efficiency of our proposed agorithm adequatey, we use a subset of UCI machine earning benchmark data set [21], and they are Sonar, WDBC, Arrhythmia, and Hepatitis dataset, respectivey. Tabe 4.1 summarizes the characteristics of the data sets used in our experiments. A datasets are standardized to be zero-mean and normaized by standard deviation for each dimension. Tabe 4.1. Description of the data sets used in our experiments Data sets Size Number of features Casses Sonar WDBC Arrhythmia Hepatitis Sonar dataset contains 208 vectors, each vector incudes 60 components, which means the dataset has 60 features. We acquire the data from the feedback information of sonar. There are two casses in Sonar dataset, data of rock and data of minera. After feature seection, 1-nearest neighbour (1-NN) cassifier is used for cassification, and ascertain whether a vector beongs to minera according to the resut of cassification. WDBC dataset contains 569 vectors, each vector incudes 30 components, which means the dataset has 30 features. There are two casses in WDBC dataset, benign breast cancer and maignant breast cancer. After feature seection, 1-nearest neighbour (1-NN) cassifier is used for cassification, and ascertain whether a vector beongs to benign or maignant according to the resut of cassification.

14 64 Arrhythmia dataset contains 452 vectors, each vector incudes 279 components, which means the dataset has 279 features. There are two casses in Arrhythmia dataset, norma heart rate and abnorma heart rate. After feature seection, 1-nearest neighbour (1-NN) cassifier is used for cassification, and ascertain whether a vector beongs to norma heart rate or abnorma heart rate according to the resut of cassification. Hepatitis dataset contains 155 vectors, each vector incudes 19 components, which means the dataset has 19 features. There are two casses in Hepatitis dataset, hepatitis patients and heathy person. After feature seection, 1-nearest neighbour (1-NN) cassifier is used for cassification, and ascertain whether a vector beongs to hepatitis patients or not according to the resut of cassification Experiments and parameters In our experiments, we compare the proposed method to the state-ofart feature seection methods: Fisher score (FS), Genetic agorithm (GA), and Fisher score genetic agorithm (FSGA). We perform four experiments on every dataset respectivey, and the parameters of 16 experiments are summarized in Tabe 4.2.

15 A HYBRID FEATURE SELECTION METHOD 65 Tabe 4.2. The parameters in 16 experiments Data sets Sonar WDBC Arrhythmia Hepatitis Parameters Agorithms λ N Maxiters θ ε m FS 0.01 GA FSGA * * HFSGA * FS 0.01 GA FSGA HFSGA * FS 0.01 GA FSGA HFSGA FS GA FSGA HFSGA Experiment 1. Feature seection by Fisher score (FS). Whether a feature is seected or not is just reated to the feature s Fisher score and the setting of threshod. When a threshod θ is settin C θ features of F ( f i ) > θ wi be seected to construct a feature subset A, and then we reduce the dimensionaity of origina data set according to the seected feature subset A. 1-NN cassifier is used to cassify the dimensionaity reduced data set. To ensure the comparabiity with other three methods, we aso construct a fitness function of f 0 ( A), which is expressed as the subtraction of the cassification accuracy accuracy(a) with ten-fod cross vaidation method and a penaty item of λc θ. The fitness of a feature subset A is defined as f ( A) = accuracy( A) λc, (4.1) 0 θ

16 66 where parameter λ is the same as that in formua (3.2). In the experiment of Sonar, WDBC, and Hepatitis, we set λ = 0.01, whie in the experiment of Arrhythmia data set which has the most features, we set λ = From formua (4.1), we can concude that the feature subset with higher fitness is better. Experiment 2. Feature seection by genetic agorithm (GA). When adopt GA to perform feature seection in the four data sets respectivey, we set the popuation size N = 100, and λ = in the experiments of Sonar, WDBC and Hepatitis, whie λ = in the experiment of Arrhythmia. In the experiments of Sonar, WDBC, and Arrhythmia, we set Maxiters = 500, and Maxiters = 1000 in the Hepatitis data set experiment. Experiment 3. Feature seection by Fisher score genetic agorithm (FSGA). The specific process of FSGA is ike this: in the first pace, set a threshod θ to seect features whose F ( f ) > θ, then utiize the seected features to generate initia popuation of genetic agorithm by random manner (the features whose Fisher score is ower than threshod θ wi not be seected to generate initia popuation), finay, the feature subset is seected by traditiona genetic agorithm. We set the threshod to be 0.15, 1.5, 0.5, and 0.28, respectivey, in data sets of Sonar, WDBC, Arrhythmia, Hepatitis, and the setting of parameter N, λ, and Maxiters are the same as Experiment 2. Experiment 4. Feature seection by the proposed hybrid feature seection method based on Fisher score and genetic agorithm (HFSGA). We set N = 100, ε = 0.05, m = 2 in experiments of data sets Sonar, WDBC, Arrhythmia, and Hepatitis, and the setting of parameter N, λ, and Maxiters are the same as Experiment 2. i

17 A HYBRID FEATURE SELECTION METHOD Experimenta resuts anaysis The resuts of four experiments in Sonar, WDBC, Arrhythmia, and Hepatitis data sets are shown in Figure 4.1, Figure 4.2, Figure 4.3, and Figure 4.4, respectivey, and Tabe 4.3 shows the cassification accuracy by ten-fod cross vaidation method Experimenta resuts on Sonar From Figure 4.1 and Tabe 4.3, we can concude that our method achieved the highest cassification accuracy of 72.36% in Sonar data set, and seected 4 features, so it can save the storage space and operation time in subsequent processing. In addition, our method reached the best fitness of just requiring 40 iterations, whie the FSGA needs 75 iterations to get the best fitness of , and the GA needs 338 iterations to get the best fitness of 0.6, thus, our method has the fastest convergence rate. In terms of cassification accuracy, number of seected features and convergence rate, our proposed method is most effective for Sonar data set to perform feature seection.

18 68 (a) Resuts of feature seection by FS (b) Resuts of feature seection by GA

19 A HYBRID FEATURE SELECTION METHOD 69 (c) Resuts of feature seection by FSGA (d) Resuts of feature seection by HFSGA Figure 4.1. Four experimenta resuts on Sonar data set Experimenta resuts on WDBC Figure 4.2 and Tabe 4.3 show that the cassification accuracy of performing feature seection on WDBC data set by the four methods are 92.70%, 94.10%, 94.05%, and 95.66%, respectivey, which indicates that the discrimination between reevant features and irreevant features is obvious. The FS seected 5 features, whie other three methods seected 3

20 70 features respectivey, we can infer that there may be 2 redundant features with high Fisher score. Our method achieved the highest cassification accuracy of 95.66%, which demonstrated that the proposed method does we in eiminating redundant features. (a) Resuts of feature seection by FS (b) Resuts of feature seection by GA

21 A HYBRID FEATURE SELECTION METHOD 71 (c) Resuts of feature seection by FSGA (d) Resuts of feature seection by HFSGA Figure 4.2. Four experimenta resuts on WDBC data set Experimenta resuts on Arrhythmia From Figure 4.3(a), when Fisher score is adopted to feature seection, in the beginnin the fitness becomes higher and higher with the increase of threshod, and when the threshod is higher than 0.5, the fitness becomes ower and ower with the increase of threshod. We can infer that, on Arrhythmia data set, there are more redundant features and ess

22 72 meaningess features. The cassification accuracy of proposed method is 70.54%, ower than the FSGA method of 72.34%, whie higher than genetic agorithm and Fisher score of 66.57% and 67.49%, respectivey. This resut indicates that, on the one hand, Fisher score is a favorabe criterion to determine the reevance of features for Arrhythmia data set; on the other hand, the FAGA method does we in eiminating irreevant features, whie the proposed HFAGA method is good at eiminating redundant features. (a) Resuts of feature seection by FS (b) Resuts of feature seection by GA

23 A HYBRID FEATURE SELECTION METHOD 73 (c) Resuts of feature seection by FSGA (d) Resuts of feature seection by HFSGA Figure 4.3. Four experimenta resuts on Arrhythmia data set Experimenta resuts on Hepatitis From Figure 4.4 and Tabe 4.3, we can concude that our method achieved the highest cassification accuracy of 87.83% on Hepatitis data set, and seected 5 features, so it can save the storage space and

24 74 operation time in subsequent processing. Furthermore, our method has the fastest convergence rate. Above a, our proposed method is most effective for Hepatitis data set to perform feature seection. (a) Resuts of feature seection by FS (b) Resuts of feature seection by GA

25 A HYBRID FEATURE SELECTION METHOD 75 (c) Resuts of feature seection by FSGA (d) Resuts of feature seection by HFSGA Figure 4.4. Four experimenta resuts on Hepatitis data set.

26 76 Tabe 4.3. Cassification accuracy and number of seected features of 16 experiments Data sets Agorithms Best fitness Cassification Accuracy Number of seected features FS Sonar (60) GA FSGA HFSGA FS WDBC (30) GA FSGA HFSGA FS Arrhythmia (279) GA FSGA HFSGA FS Hepatitis (19) GA FSGA HFSGA Discussion The proposed HFSGA method is superior to other three feature seection methods of FS, GA, and FSGA methods in terms of cassification accuracy, and it does we in eiminating redundant features with a fast convergence rate. From the experiments on Arrhythmia data set, we can discover that the FSGA method is good at eiminating irreevant features. In a word, our method is effective in feature seection. Certainy, our proposed HFSGA method requires to set many parameters, so we can do some further researches to expore the infuence of parameters on experiment resuts, and then determine the

27 A HYBRID FEATURE SELECTION METHOD 77 optima parameters. In addition, the seection of cassifier aso infuences the cassification accuracy, thus, the match of cassifier and feature seection method is worthy of further research. 5. Concusion In this paper, we presented a hybrid feature seection method based on Fisher score and genetic agorithm. It utiizes features' rescaed Fisher score to generate the initia popuation of genetic agorithm. On the one hand, our method avoids the conundrum of how to set a threshod, and it does we in eiminating redundant features, thus, it can seect the feature subset with better performance. On the other hand, unike genetic agorithm generating initia popuation with a random manner, our method has a faster convergence rate. Contrast experiments on four benchmark data sets of Sonar, WDBC, Arrhythmia, and Hepatitis indicate that the proposed method outperforms Fisher score method and genetic agorithm, and it is effective in feature seection. References [1] I. Guyon and A. Eisseeff, An introduction to variabe and feature seection, Journa of Machine Learning Research 3 (2003), [2] Y. Yang and J. O. Pedersen, A comparative study on feature seection in text categorization, In ICML (1997), [3] D. Koer and M. Sahami, Toward optima feature seection, In ICML (1996), [4] P. E. H. R. O. Duda and D. G. Stork, Pattern Cassification, Wiey-Inter Science Pubication, [5] M. Robnik-Sikonja and I. Kononenko, Theoretica and Empirica Anaysis of ReiefF and RReiefF, Machine Learning 53(1-2) (2003), [6] X. He, D. Cai and P. Niyogi, Lapacian score for feature seection, In NIPS, [7] L. Son A. J. Smoa, A. Gretton, K. M. Borgwardt and J. Bedo, Supervised feature seection via dependence estimation, In ICML (2007), [8] F. Nie, S. Xian Y. Jia, C. Zhang and S. Yan, Trace ratio criterion for feature seection, In AAAI (2008),

28 78 [9] R. Battiti, Using mutua information for seecting features in supervised neura net earnin IEEE Trans. Neura Networks 5(4) (1994), [10] C. M. Bishop, Neura Networks for Pattern Recognition, Oxford University Press, [11] R. Caruana and D. Freita Greedy Attribute Seection, In: Proc. of the 11th Internet. Conf. on Machine Learn., New Brunswick, NJ, USA, (1994), [12] P. Somo, P. Pudi and J. Kitter, Fast branch and bound agorithms for optima feature seection, IEEE Trans. Pattern Ana. Machine Inte. 26(7) (2004), [13] J. H. Yang and V. Honavar, Feature subset seection using a genetic agorithm, IEEE Inte. Systems 13(2) (1998), [14] M. L. Raymer, W. F. Punch, E. D. Goodman, L. A. Kuhn and A. K. Jain, Dimensionaity reduction using genetic agorithms, IEEE Trans. Evout. Comput. 4(2) (2000), [15] B. Bhanu and Y. Lin, Genetic agorithm based feature seection for target detection in SAR images, Image Vision Comput. 21(7) (2003), [16] F. Zhu and S. Guan, Feature seection for moduar GA-based cassification, App. Soft Comput. J. 4(4) (2004), [17] M. Kudo and J. Skansky, Comparison of agorithms that seect features for pattern cassifiers, Pattern Recognition 33(1) (2000), [18] John H. Hoand, Adaptation in Natura and Artificia Systems, [19] K. A. De Jon An Anaysis of the Behavior of a Cass of Generic Adaptive Systems, University of Michigan, [20] B. Foge David, Evoutionary Computation: Toward a New Phiosophy of Machine Inteigence, 2nd Edition, Wiey-IEEE Press, [21] A. Asuncion and D. Newman, UCI Machine Learning Repository, g

Language Identification for Texts Written in Transliteration

Language Identification for Texts Written in Transliteration Language Identification for Texts Written in Transiteration Andrey Chepovskiy, Sergey Gusev, Margarita Kurbatova Higher Schoo of Economics, Data Anaysis and Artificia Inteigence Department, Pokrovskiy

More information

Nearest Neighbor Learning

Nearest Neighbor Learning Nearest Neighbor Learning Cassify based on oca simiarity Ranges from simpe nearest neighbor to case-based and anaogica reasoning Use oca information near the current query instance to decide the cassification

More information

AN EVOLUTIONARY APPROACH TO OPTIMIZATION OF A LAYOUT CHART

AN EVOLUTIONARY APPROACH TO OPTIMIZATION OF A LAYOUT CHART 13 AN EVOLUTIONARY APPROACH TO OPTIMIZATION OF A LAYOUT CHART Eva Vona University of Ostrava, 30th dubna st. 22, Ostrava, Czech Repubic e-mai: Eva.Vona@osu.cz Abstract: This artice presents the use of

More information

Application of Intelligence Based Genetic Algorithm for Job Sequencing Problem on Parallel Mixed-Model Assembly Line

Application of Intelligence Based Genetic Algorithm for Job Sequencing Problem on Parallel Mixed-Model Assembly Line American J. of Engineering and Appied Sciences 3 (): 5-24, 200 ISSN 94-7020 200 Science Pubications Appication of Inteigence Based Genetic Agorithm for Job Sequencing Probem on Parae Mixed-Mode Assemby

More information

Mobile App Recommendation: Maximize the Total App Downloads

Mobile App Recommendation: Maximize the Total App Downloads Mobie App Recommendation: Maximize the Tota App Downoads Zhuohua Chen Schoo of Economics and Management Tsinghua University chenzhh3.12@sem.tsinghua.edu.cn Yinghui (Catherine) Yang Graduate Schoo of Management

More information

Formulation of Loss minimization Problem Using Genetic Algorithm and Line-Flow-based Equations

Formulation of Loss minimization Problem Using Genetic Algorithm and Line-Flow-based Equations Formuation of Loss minimization Probem Using Genetic Agorithm and Line-Fow-based Equations Sharanya Jaganathan, Student Member, IEEE, Arun Sekar, Senior Member, IEEE, and Wenzhong Gao, Senior member, IEEE

More information

Research of Classification based on Deep Neural Network

Research of  Classification based on Deep Neural Network 2018 Internationa Conference on Sensor Network and Computer Engineering (ICSNCE 2018) Research of Emai Cassification based on Deep Neura Network Wang Yawen Schoo of Computer Science and Engineering Xi

More information

A New Supervised Clustering Algorithm Based on Min-Max Modular Network with Gaussian-Zero-Crossing Functions

A New Supervised Clustering Algorithm Based on Min-Max Modular Network with Gaussian-Zero-Crossing Functions 2006 Internationa Joint Conference on Neura Networks Sheraton Vancouver Wa Centre Hote, Vancouver, BC, Canada Juy 16-21, 2006 A New Supervised Custering Agorithm Based on Min-Max Moduar Network with Gaussian-Zero-Crossing

More information

A Memory Grouping Method for Sharing Memory BIST Logic

A Memory Grouping Method for Sharing Memory BIST Logic A Memory Grouping Method for Sharing Memory BIST Logic Masahide Miyazai, Tomoazu Yoneda, and Hideo Fuiwara Graduate Schoo of Information Science, Nara Institute of Science and Technoogy (NAIST), 8916-5

More information

Image Segmentation Using Semi-Supervised k-means

Image Segmentation Using Semi-Supervised k-means I J C T A, 9(34) 2016, pp. 595-601 Internationa Science Press Image Segmentation Using Semi-Supervised k-means Reza Monsefi * and Saeed Zahedi * ABSTRACT Extracting the region of interest is a very chaenging

More information

Optimization and Application of Support Vector Machine Based on SVM Algorithm Parameters

Optimization and Application of Support Vector Machine Based on SVM Algorithm Parameters Optimization and Appication of Support Vector Machine Based on SVM Agorithm Parameters YAN Hui-feng 1, WANG Wei-feng 1, LIU Jie 2 1 ChongQing University of Posts and Teecom 400065, China 2 Schoo Of Civi

More information

Chapter Multidimensional Direct Search Method

Chapter Multidimensional Direct Search Method Chapter 09.03 Mutidimensiona Direct Search Method After reading this chapter, you shoud be abe to:. Understand the fundamentas of the mutidimensiona direct search methods. Understand how the coordinate

More information

Hiding secrete data in compressed images using histogram analysis

Hiding secrete data in compressed images using histogram analysis University of Woongong Research Onine University of Woongong in Dubai - Papers University of Woongong in Dubai 2 iding secrete data in compressed images using histogram anaysis Farhad Keissarian University

More information

A probabilistic fuzzy method for emitter identification based on genetic algorithm

A probabilistic fuzzy method for emitter identification based on genetic algorithm A probabitic fuzzy method for emitter identification based on genetic agorithm Xia Chen, Weidong Hu, Hongwen Yang, Min Tang ATR Key Lab, Coege of Eectronic Science and Engineering Nationa University of

More information

A Petrel Plugin for Surface Modeling

A Petrel Plugin for Surface Modeling A Petre Pugin for Surface Modeing R. M. Hassanpour, S. H. Derakhshan and C. V. Deutsch Structure and thickness uncertainty are important components of any uncertainty study. The exact ocations of the geoogica

More information

Lecture Notes for Chapter 4 Part III. Introduction to Data Mining

Lecture Notes for Chapter 4 Part III. Introduction to Data Mining Data Mining Cassification: Basic Concepts, Decision Trees, and Mode Evauation Lecture Notes for Chapter 4 Part III Introduction to Data Mining by Tan, Steinbach, Kumar Adapted by Qiang Yang (2010) Tan,Steinbach,

More information

Automatic Grouping for Social Networks CS229 Project Report

Automatic Grouping for Social Networks CS229 Project Report Automatic Grouping for Socia Networks CS229 Project Report Xiaoying Tian Ya Le Yangru Fang Abstract Socia networking sites aow users to manuay categorize their friends, but it is aborious to construct

More information

Neural Network Enhancement of the Los Alamos Force Deployment Estimator

Neural Network Enhancement of the Los Alamos Force Deployment Estimator Missouri University of Science and Technoogy Schoars' Mine Eectrica and Computer Engineering Facuty Research & Creative Works Eectrica and Computer Engineering 1-1-1994 Neura Network Enhancement of the

More information

Binarized support vector machines

Binarized support vector machines Universidad Caros III de Madrid Repositorio instituciona e-archivo Departamento de Estadística http://e-archivo.uc3m.es DES - Working Papers. Statistics and Econometrics. WS 2007-11 Binarized support vector

More information

Sensitivity Analysis of Hopfield Neural Network in Classifying Natural RGB Color Space

Sensitivity Analysis of Hopfield Neural Network in Classifying Natural RGB Color Space Sensitivity Anaysis of Hopfied Neura Network in Cassifying Natura RGB Coor Space Department of Computer Science University of Sharjah UAE rsammouda@sharjah.ac.ae Abstract: - This paper presents a study

More information

Genetic Algorithms for Parallel Code Optimization

Genetic Algorithms for Parallel Code Optimization Genetic Agorithms for Parae Code Optimization Ender Özcan Dept. of Computer Engineering Yeditepe University Kayışdağı, İstanbu, Turkey Emai: eozcan@cse.yeditepe.edu.tr Abstract- Determining the optimum

More information

An Indian Journal FULL PAPER ABSTRACT KEYWORDS. Trade Science Inc.

An Indian Journal FULL PAPER ABSTRACT KEYWORDS. Trade Science Inc. [Type text] [Type text] [Type text] ISSN : 0974-7435 Voume 10 Issue 16 BioTechnoogy 014 An Indian Journa FULL PAPER BTAIJ, 10(16), 014 [999-9307] Study on prediction of type- fuzzy ogic power system based

More information

FIRST BEZIER POINT (SS) R LE LE. φ LE FIRST BEZIER POINT (PS)

FIRST BEZIER POINT (SS) R LE LE. φ LE FIRST BEZIER POINT (PS) Singe- and Muti-Objective Airfoi Design Using Genetic Agorithms and Articia Inteigence A.P. Giotis K.C. Giannakogou y Nationa Technica University of Athens, Greece Abstract Transonic airfoi design probems

More information

Automatic Hidden Web Database Classification

Automatic Hidden Web Database Classification Automatic idden Web atabase Cassification Zhiguo Gong, Jingbai Zhang, and Qian Liu Facuty of Science and Technoogy niversity of Macau Macao, PRC {fstzgg,ma46597,ma46620}@umac.mo Abstract. In this paper,

More information

Improvement of Nearest-Neighbor Classifiers via Support Vector Machines

Improvement of Nearest-Neighbor Classifiers via Support Vector Machines From: FLAIRS-01 Proceedings. Copyright 2001, AAAI (www.aaai.org). A rights reserved. Improvement of Nearest-Neighbor Cassifiers via Support Vector Machines Marc Sebban and Richard Nock TRIVIA-Department

More information

Sequential Approximate Multiobjective Optimization using Computational Intelligence

Sequential Approximate Multiobjective Optimization using Computational Intelligence The Ninth Internationa Symposium on Operations Research and Its Appications (ISORA 10) Chengdu-Jiuzhaigou, China, August 19 23, 2010 Copyright 2010 ORSC & APORC, pp. 1 12 Sequentia Approximate Mutiobjective

More information

IJOART. Copyright 2013 SciResPub. Shri Venkateshwara University, Institute of Computer & Information Science. India. Uttar Pradesh, India

IJOART. Copyright 2013 SciResPub. Shri Venkateshwara University, Institute of Computer & Information Science. India. Uttar Pradesh, India Internationa Journa of Advancements in Research & Technoogy, Voume, Issue 6, June-04 ISSN 78-7763 89 Pattern assification for andwritten Marathi haracters using Gradient Descent of Distributed error with

More information

A Fast Block Matching Algorithm Based on the Winner-Update Strategy

A Fast Block Matching Algorithm Based on the Winner-Update Strategy In Proceedings of the Fourth Asian Conference on Computer Vision, Taipei, Taiwan, Jan. 000, Voume, pages 977 98 A Fast Bock Matching Agorithm Based on the Winner-Update Strategy Yong-Sheng Chenyz Yi-Ping

More information

Fuzzy Equivalence Relation Based Clustering and Its Use to Restructuring Websites Hyperlinks and Web Pages

Fuzzy Equivalence Relation Based Clustering and Its Use to Restructuring Websites Hyperlinks and Web Pages Fuzzy Equivaence Reation Based Custering and Its Use to Restructuring Websites Hyperinks and Web Pages Dimitris K. Kardaras,*, Xenia J. Mamakou, and Bi Karakostas 2 Business Informatics Laboratory, Dept.

More information

WATERMARKING GIS DATA FOR DIGITAL MAP COPYRIGHT PROTECTION

WATERMARKING GIS DATA FOR DIGITAL MAP COPYRIGHT PROTECTION WATERMARKING GIS DATA FOR DIGITAL MAP COPYRIGHT PROTECTION Shen Tao Chinese Academy of Surveying and Mapping, Beijing 100039, China shentao@casm.ac.cn Xu Dehe Institute of resources and environment, North

More information

A Column Generation Approach for Support Vector Machines

A Column Generation Approach for Support Vector Machines A Coumn Generation Approach for Support Vector Machines Emiio Carrizosa Universidad de Sevia (Spain). ecarrizosa@us.es Beén Martín-Barragán Universidad de Sevia (Spain). bemart@us.es Doores Romero Moraes

More information

Layout Optimization of Binocular Stereo Vision Measuring System

Layout Optimization of Binocular Stereo Vision Measuring System Sensors & Transducers 013 by IFSA http://www.sensorsporta.com Layout Optimization of Binocuar Stereo Vision Measuring System Zhenyuan JIA, Mingxing LI, Wei LIU, Yang LIU and Zhiiang SHANG Key Laboratory

More information

As Michi Henning and Steve Vinoski showed 1, calling a remote

As Michi Henning and Steve Vinoski showed 1, calling a remote Reducing CORBA Ca Latency by Caching and Prefetching Bernd Brügge and Christoph Vismeier Technische Universität München Method ca atency is a major probem in approaches based on object-oriented middeware

More information

Distance Weighted Discrimination and Second Order Cone Programming

Distance Weighted Discrimination and Second Order Cone Programming Distance Weighted Discrimination and Second Order Cone Programming Hanwen Huang, Xiaosun Lu, Yufeng Liu, J. S. Marron, Perry Haaand Apri 3, 2012 1 Introduction This vignette demonstrates the utiity and

More information

Backing-up Fuzzy Control of a Truck-trailer Equipped with a Kingpin Sliding Mechanism

Backing-up Fuzzy Control of a Truck-trailer Equipped with a Kingpin Sliding Mechanism Backing-up Fuzzy Contro of a Truck-traier Equipped with a Kingpin Siding Mechanism G. Siamantas and S. Manesis Eectrica & Computer Engineering Dept., University of Patras, Patras, Greece gsiama@upatras.gr;stam.manesis@ece.upatras.gr

More information

MACHINE learning techniques can, automatically,

MACHINE learning techniques can, automatically, Proceedings of Internationa Joint Conference on Neura Networks, Daas, Texas, USA, August 4-9, 203 High Leve Data Cassification Based on Network Entropy Fiipe Aves Neto and Liang Zhao Abstract Traditiona

More information

A NEW APPROACH FOR BLOCK BASED STEGANALYSIS USING A MULTI-CLASSIFIER

A NEW APPROACH FOR BLOCK BASED STEGANALYSIS USING A MULTI-CLASSIFIER Internationa Journa on Technica and Physica Probems of Engineering (IJTPE) Pubished by Internationa Organization of IOTPE ISSN 077-358 IJTPE Journa www.iotpe.com ijtpe@iotpe.com September 014 Issue 0 Voume

More information

SOFT SENSOR DEVELOPMENT AND OPTIMIZATION OF THE COMMERCIAL PETROCHEMICAL PLANT INTEGRATING SUPPORT VECTOR REGRESSION AND GENETIC ALGORITHM

SOFT SENSOR DEVELOPMENT AND OPTIMIZATION OF THE COMMERCIAL PETROCHEMICAL PLANT INTEGRATING SUPPORT VECTOR REGRESSION AND GENETIC ALGORITHM From the SeectedWorks of Dr. Sandip Kumar Lahiri 2009 SOFT SENSOR DEVELOPMENT AND OPTIMIZATION OF THE COMMERCIAL PETROCHEMICAL PLANT INTEGRATING SUPPORT VECTOR REGRESSION AND GENETIC ALGORITHM sandip k

More information

Research on the overall optimization method of well pattern in water drive reservoirs

Research on the overall optimization method of well pattern in water drive reservoirs J Petro Expor Prod Techno (27) 7:465 47 DOI.7/s322-6-265-3 ORIGINAL PAPER - EXPLORATION ENGINEERING Research on the overa optimization method of we pattern in water drive reservoirs Zhibin Zhou Jiexiang

More information

Comparative Analysis of Relevance for SVM-Based Interactive Document Retrieval

Comparative Analysis of Relevance for SVM-Based Interactive Document Retrieval Comparative Anaysis for SVM-Based Interactive Document Retrieva Paper: Comparative Anaysis of Reevance for SVM-Based Interactive Document Retrieva Hiroshi Murata, Takashi Onoda, and Seiji Yamada Centra

More information

Resource Optimization to Provision a Virtual Private Network Using the Hose Model

Resource Optimization to Provision a Virtual Private Network Using the Hose Model Resource Optimization to Provision a Virtua Private Network Using the Hose Mode Monia Ghobadi, Sudhakar Ganti, Ghoamai C. Shoja University of Victoria, Victoria C, Canada V8W 3P6 e-mai: {monia, sganti,

More information

A Method for Calculating Term Similarity on Large Document Collections

A Method for Calculating Term Similarity on Large Document Collections $ A Method for Cacuating Term Simiarity on Large Document Coections Wofgang W Bein Schoo of Computer Science University of Nevada Las Vegas, NV 915-019 bein@csunvedu Jeffrey S Coombs and Kazem Taghva Information

More information

A Design Method for Optimal Truss Structures with Certain Redundancy Based on Combinatorial Rigidity Theory

A Design Method for Optimal Truss Structures with Certain Redundancy Based on Combinatorial Rigidity Theory 0 th Word Congress on Structura and Mutidiscipinary Optimization May 9 -, 03, Orando, Forida, USA A Design Method for Optima Truss Structures with Certain Redundancy Based on Combinatoria Rigidity Theory

More information

Quality of Service Evaluations of Multicast Streaming Protocols *

Quality of Service Evaluations of Multicast Streaming Protocols * Quaity of Service Evauations of Muticast Streaming Protocos Haonan Tan Derek L. Eager Mary. Vernon Hongfei Guo omputer Sciences Department University of Wisconsin-Madison, USA {haonan, vernon, guo}@cs.wisc.edu

More information

Response Surface Model Updating for Nonlinear Structures

Response Surface Model Updating for Nonlinear Structures Response Surface Mode Updating for Noninear Structures Gonaz Shahidi a, Shamim Pakzad b a PhD Student, Department of Civi and Environmenta Engineering, Lehigh University, ATLSS Engineering Research Center,

More information

GPU Implementation of Parallel SVM as Applied to Intrusion Detection System

GPU Implementation of Parallel SVM as Applied to Intrusion Detection System GPU Impementation of Parae SVM as Appied to Intrusion Detection System Sudarshan Hiray Research Schoar, Department of Computer Engineering, Vishwakarma Institute of Technoogy, Pune, India sdhiray7@gmai.com

More information

FREE-FORM ANISOTROPY: A NEW METHOD FOR CRACK DETECTION ON PAVEMENT SURFACE IMAGES

FREE-FORM ANISOTROPY: A NEW METHOD FOR CRACK DETECTION ON PAVEMENT SURFACE IMAGES FREE-FORM ANISOTROPY: A NEW METHOD FOR CRACK DETECTION ON PAVEMENT SURFACE IMAGES Tien Sy Nguyen, Stéphane Begot, Forent Ducuty, Manue Avia To cite this version: Tien Sy Nguyen, Stéphane Begot, Forent

More information

Utility-based Camera Assignment in a Video Network: A Game Theoretic Framework

Utility-based Camera Assignment in a Video Network: A Game Theoretic Framework This artice has been accepted for pubication in a future issue of this journa, but has not been fuy edited. Content may change prior to fina pubication. Y.LI AND B.BHANU CAMERA ASSIGNMENT: A GAME-THEORETIC

More information

A Robust Sign Language Recognition System with Sparsely Labeled Instances Using Wi-Fi Signals

A Robust Sign Language Recognition System with Sparsely Labeled Instances Using Wi-Fi Signals A Robust Sign Language Recognition System with Sparsey Labeed Instances Using Wi-Fi Signas Jiacheng Shang, Jie Wu Center for Networked Computing Dept. of Computer and Info. Sciences Tempe University Motivation

More information

Community-Aware Opportunistic Routing in Mobile Social Networks

Community-Aware Opportunistic Routing in Mobile Social Networks IEEE TRANSACTIONS ON COMPUTERS VOL:PP NO:99 YEAR 213 Community-Aware Opportunistic Routing in Mobie Socia Networks Mingjun Xiao, Member, IEEE Jie Wu, Feow, IEEE, and Liusheng Huang, Member, IEEE Abstract

More information

Semi-Supervised Learning with Sparse Distributed Representations

Semi-Supervised Learning with Sparse Distributed Representations Semi-Supervised Learning with Sparse Distributed Representations David Zieger dzieger@stanford.edu CS 229 Fina Project 1 Introduction For many machine earning appications, abeed data may be very difficut

More information

Lecture outline Graphics and Interaction Scan Converting Polygons and Lines. Inside or outside a polygon? Scan conversion.

Lecture outline Graphics and Interaction Scan Converting Polygons and Lines. Inside or outside a polygon? Scan conversion. Lecture outine 433-324 Graphics and Interaction Scan Converting Poygons and Lines Department of Computer Science and Software Engineering The Introduction Scan conversion Scan-ine agorithm Edge coherence

More information

Testing Whether a Set of Code Words Satisfies a Given Set of Constraints *

Testing Whether a Set of Code Words Satisfies a Given Set of Constraints * JOURNAL OF INFORMATION SCIENCE AND ENGINEERING 6, 333-346 (010) Testing Whether a Set of Code Words Satisfies a Given Set of Constraints * HSIN-WEN WEI, WAN-CHEN LU, PEI-CHI HUANG, WEI-KUAN SHIH AND MING-YANG

More information

FACE RECOGNITION WITH HARMONIC DE-LIGHTING. s: {lyqing, sgshan, wgao}jdl.ac.cn

FACE RECOGNITION WITH HARMONIC DE-LIGHTING.  s: {lyqing, sgshan, wgao}jdl.ac.cn FACE RECOGNITION WITH HARMONIC DE-LIGHTING Laiyun Qing 1,, Shiguang Shan, Wen Gao 1, 1 Graduate Schoo, CAS, Beijing, China, 100080 ICT-ISVISION Joint R&D Laboratory for Face Recognition, CAS, Beijing,

More information

Modification of Support Vector Machine for Microarray Data Analysis

Modification of Support Vector Machine for Microarray Data Analysis Goba Journa of Computer Science and Technoogy Hardware & Computation Voume 13 Issue 1 Version 1.0 Year 2013 Type: Doube Bind Peer Reviewed Internationa Research Journa Pubisher: Goba Journas Inc. (USA)

More information

Research on UAV Fixed Area Inspection based on Image Reconstruction

Research on UAV Fixed Area Inspection based on Image Reconstruction Research on UAV Fixed Area Inspection based on Image Reconstruction Kun Cao a, Fei Wu b Schoo of Eectronic and Eectrica Engineering, Shanghai University of Engineering Science, Abstract Shanghai 20600,

More information

JOINT IMAGE REGISTRATION AND EXAMPLE-BASED SUPER-RESOLUTION ALGORITHM

JOINT IMAGE REGISTRATION AND EXAMPLE-BASED SUPER-RESOLUTION ALGORITHM JOINT IMAGE REGISTRATION AND AMPLE-BASED SUPER-RESOLUTION ALGORITHM Hyo-Song Kim, Jeyong Shin, and Rae-Hong Park Department of Eectronic Engineering, Schoo of Engineering, Sogang University 35 Baekbeom-ro,

More information

A Novel Method for Early Software Quality Prediction Based on Support Vector Machine

A Novel Method for Early Software Quality Prediction Based on Support Vector Machine A Nove Method for Eary Software Quaity Prediction Based on Support Vector Machine Fei Xing 1,PingGuo 1;2, and Michae R. Lyu 2 1 Department of Computer Science Beijing Norma University, Beijing, 1875, China

More information

A Novel Linear-Polynomial Kernel to Construct Support Vector Machines for Speech Recognition

A Novel Linear-Polynomial Kernel to Construct Support Vector Machines for Speech Recognition Journa of Computer Science 7 (7): 99-996, 20 ISSN 549-3636 20 Science Pubications A Nove Linear-Poynomia Kerne to Construct Support Vector Machines for Speech Recognition Bawant A. Sonkambe and 2 D.D.

More information

1682 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 22, NO. 6, DECEMBER Backward Fuzzy Rule Interpolation

1682 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 22, NO. 6, DECEMBER Backward Fuzzy Rule Interpolation 1682 IEEE TRANSACTIONS ON FUZZY SYSTEMS, VOL. 22, NO. 6, DECEMBER 2014 Bacward Fuzzy Rue Interpoation Shangzhu Jin, Ren Diao, Chai Que, Senior Member, IEEE, and Qiang Shen Abstract Fuzzy rue interpoation

More information

Development of a hybrid K-means-expectation maximization clustering algorithm

Development of a hybrid K-means-expectation maximization clustering algorithm Journa of Computations & Modeing, vo., no.4, 0, -3 ISSN: 79-765 (print, 79-8850 (onine Scienpress Ltd, 0 Deveopment of a hybrid K-means-expectation maximization custering agorithm Adigun Abimboa Adebisi,

More information

Real-Time Feature Descriptor Matching via a Multi-Resolution Exhaustive Search Method

Real-Time Feature Descriptor Matching via a Multi-Resolution Exhaustive Search Method 297 Rea-Time Feature escriptor Matching via a Muti-Resoution Ehaustive Search Method Chi-Yi Tsai, An-Hung Tsao, and Chuan-Wei Wang epartment of Eectrica Engineering, Tamang University, New Taipei City,

More information

The Classification of Stored Grain Pests based on Convolutional Neural Network

The Classification of Stored Grain Pests based on Convolutional Neural Network 2017 2nd Internationa Conference on Mechatronics and Information Technoogy (ICMIT 2017) The Cassification of Stored Grain Pests based on Convoutiona Neura Network Dexian Zhang1, Wenun Zhao*, 1 1 Schoo

More information

Joint disparity and motion eld estimation in. stereoscopic image sequences. Ioannis Patras, Nikos Alvertos and Georgios Tziritas y.

Joint disparity and motion eld estimation in. stereoscopic image sequences. Ioannis Patras, Nikos Alvertos and Georgios Tziritas y. FORTH-ICS / TR-157 December 1995 Joint disparity and motion ed estimation in stereoscopic image sequences Ioannis Patras, Nikos Avertos and Georgios Tziritas y Abstract This work aims at determining four

More information

A Comparison of a Second-Order versus a Fourth- Order Laplacian Operator in the Multigrid Algorithm

A Comparison of a Second-Order versus a Fourth- Order Laplacian Operator in the Multigrid Algorithm A Comparison of a Second-Order versus a Fourth- Order Lapacian Operator in the Mutigrid Agorithm Kaushik Datta (kdatta@cs.berkeey.edu Math Project May 9, 003 Abstract In this paper, the mutigrid agorithm

More information

Layer-Specific Adaptive Learning Rates for Deep Networks

Layer-Specific Adaptive Learning Rates for Deep Networks Layer-Specific Adaptive Learning Rates for Deep Networks arxiv:1510.04609v1 [cs.cv] 15 Oct 2015 Bharat Singh, Soham De, Yangmuzi Zhang, Thomas Godstein, and Gavin Tayor Department of Computer Science Department

More information

Bilevel Optimization based on Iterative Approximation of Multiple Mappings

Bilevel Optimization based on Iterative Approximation of Multiple Mappings Bieve Optimization based on Iterative Approximation of Mutipe Mappings arxiv:1702.03394v2 [math.oc] 5 May 2017 Ankur Sinha 1, Zhichao Lu 2, Kayanmoy Deb 2 and Pekka Mao 3 1 Production and Quantitative

More information

Design of IP Networks with End-to. to- End Performance Guarantees

Design of IP Networks with End-to. to- End Performance Guarantees Design of IP Networks with End-to to- End Performance Guarantees Irena Atov and Richard J. Harris* ( Swinburne University of Technoogy & *Massey University) Presentation Outine Introduction Mutiservice

More information

Solving Large Double Digestion Problems for DNA Restriction Mapping by Using Branch-and-Bound Integer Linear Programming

Solving Large Double Digestion Problems for DNA Restriction Mapping by Using Branch-and-Bound Integer Linear Programming The First Internationa Symposium on Optimization and Systems Bioogy (OSB 07) Beijing, China, August 8 10, 2007 Copyright 2007 ORSC & APORC pp. 267 279 Soving Large Doube Digestion Probems for DNA Restriction

More information

A Novel Congestion Control Scheme for Elastic Flows in Network-on-Chip Based on Sum-Rate Optimization

A Novel Congestion Control Scheme for Elastic Flows in Network-on-Chip Based on Sum-Rate Optimization A Nove Congestion Contro Scheme for Eastic Fows in Network-on-Chip Based on Sum-Rate Optimization Mohammad S. Taebi 1, Fahimeh Jafari 1,3, Ahmad Khonsari 2,1, and Mohammad H. Yaghmae 3 1 IPM, Schoo of

More information

A VIDEO-BASED APPROACH FOR RAILWAY BRIDGE SURFACE CRACK DETECTION

A VIDEO-BASED APPROACH FOR RAILWAY BRIDGE SURFACE CRACK DETECTION 6 th Internationa Symposium on Mobie Mapping Technoogy, Presidente Prudente, São Pauo, Brazi, Juy 21-24, 2009 A VIDEO-BASED APPROACH FOR RAILWAY BRIDGE SURFACE CRACK DETECTION LI Qingquan a,c, CHEN Long

More information

Fast Methods for Kernel-based Text Analysis

Fast Methods for Kernel-based Text Analysis Proceedings of the 41st Annua Meeting of the Association for Computationa Linguistics, Juy 2003, pp. 24-31. Fast Methods for Kerne-based Text Anaysis Taku Kudo and Yuji Matsumoto Graduate Schoo of Information

More information

file://j:\macmillancomputerpublishing\chapters\in073.html 3/22/01

file://j:\macmillancomputerpublishing\chapters\in073.html 3/22/01 Page 1 of 15 Chapter 9 Chapter 9: Deveoping the Logica Data Mode The information requirements and business rues provide the information to produce the entities, attributes, and reationships in ogica mode.

More information

University of Illinois at Urbana-Champaign, Urbana, IL 61801, /11/$ IEEE 162

University of Illinois at Urbana-Champaign, Urbana, IL 61801, /11/$ IEEE 162 oward Efficient Spatia Variation Decomposition via Sparse Regression Wangyang Zhang, Karthik Baakrishnan, Xin Li, Duane Boning and Rob Rutenbar 3 Carnegie Meon University, Pittsburgh, PA 53, wangyan@ece.cmu.edu,

More information

Massively Parallel Part of Speech Tagging Using. Min-Max Modular Neural Networks.

Massively Parallel Part of Speech Tagging Using. Min-Max Modular Neural Networks. assivey Parae Part of Speech Tagging Using in-ax oduar Neura Networks Bao-Liang Lu y, Qing a z, ichinori Ichikawa y, & Hitoshi Isahara z y Lab. for Brain-Operative Device, Brain Science Institute, RIEN

More information

Indexed Block Coordinate Descent for Large-Scale Linear Classification with Limited Memory

Indexed Block Coordinate Descent for Large-Scale Linear Classification with Limited Memory Indexed Bock Coordinate Descent for Large-Scae Linear Cassification with Limited Memory Ian E.H. Yen Chun-Fu Chang Nationa Taiwan University Nationa Taiwan University r0092207@csie.ntu.edu.tw r99725033@ntu.edu.tw

More information

Providing Hop-by-Hop Authentication and Source Privacy in Wireless Sensor Networks

Providing Hop-by-Hop Authentication and Source Privacy in Wireless Sensor Networks The 31st Annua IEEE Internationa Conference on Computer Communications: Mini-Conference Providing Hop-by-Hop Authentication and Source Privacy in Wireess Sensor Networks Yun Li Jian Li Jian Ren Department

More information

Outline. Introduce yourself!! What is Machine Learning? What is CAP-5610 about? Class information and logistics

Outline. Introduce yourself!! What is Machine Learning? What is CAP-5610 about? Class information and logistics Outine Introduce yoursef!! What is Machine Learning? What is CAP-5610 about? Cass information and ogistics Lecture Notes for E Apaydın 2010 Introduction to Machine Learning 2e The MIT Press (V1.0) About

More information

Subgrade Cumulative Plastic Deformation under the Bridge in the Transitional Period of Orbit Dynamics Analysis

Subgrade Cumulative Plastic Deformation under the Bridge in the Transitional Period of Orbit Dynamics Analysis 499 A pubication of CHEMICAL ENGINEERING TRANSACTIONS VOL. 59, 2017 Guest Editors: Zhuo Yang, Junjie Ba, Jing Pan Copyright 2017, AIDIC Servizi S.r.. ISBN 978-88-95608-49-5; ISSN 2283-9216 The Itaian Association

More information

CLOUD RADIO ACCESS NETWORK WITH OPTIMIZED BASE-STATION CACHING

CLOUD RADIO ACCESS NETWORK WITH OPTIMIZED BASE-STATION CACHING CLOUD RADIO ACCESS NETWORK WITH OPTIMIZED BASE-STATION CACHING Binbin Dai and Wei Yu Ya-Feng Liu Department of Eectrica and Computer Engineering University of Toronto, Toronto ON, Canada M5S 3G4 Emais:

More information

ACTIVE LEARNING ON WEIGHTED GRAPHS USING ADAPTIVE AND NON-ADAPTIVE APPROACHES. Eyal En Gad, Akshay Gadde, A. Salman Avestimehr and Antonio Ortega

ACTIVE LEARNING ON WEIGHTED GRAPHS USING ADAPTIVE AND NON-ADAPTIVE APPROACHES. Eyal En Gad, Akshay Gadde, A. Salman Avestimehr and Antonio Ortega ACTIVE LEARNING ON WEIGHTED GRAPHS USING ADAPTIVE AND NON-ADAPTIVE APPROACHES Eya En Gad, Akshay Gadde, A. Saman Avestimehr and Antonio Ortega Department of Eectrica Engineering University of Southern

More information

Endoscopic Motion Compensation of High Speed Videoendoscopy

Endoscopic Motion Compensation of High Speed Videoendoscopy Endoscopic Motion Compensation of High Speed Videoendoscopy Bharath avuri Department of Computer Science and Engineering, University of South Caroina, Coumbia, SC - 901. ravuri@cse.sc.edu Abstract. High

More information

MULTITASK MULTIVARIATE COMMON SPARSE REPRESENTATIONS FOR ROBUST MULTIMODAL BIOMETRICS RECOGNITION. Heng Zhang, Vishal M. Patel and Rama Chellappa

MULTITASK MULTIVARIATE COMMON SPARSE REPRESENTATIONS FOR ROBUST MULTIMODAL BIOMETRICS RECOGNITION. Heng Zhang, Vishal M. Patel and Rama Chellappa MULTITASK MULTIVARIATE COMMON SPARSE REPRESENTATIONS FOR ROBUST MULTIMODAL BIOMETRICS RECOGNITION Heng Zhang, Visha M. Pate and Rama Cheappa Center for Automation Research University of Maryand, Coage

More information

Digital Image Watermarking Algorithm Based on Fast Curvelet Transform

Digital Image Watermarking Algorithm Based on Fast Curvelet Transform J. Software Engineering & Appications, 010, 3, 939-943 doi:10.436/jsea.010.310111 Pubished Onine October 010 (http://www.scirp.org/journa/jsea) 939 igita Image Watermarking Agorithm Based on Fast Curveet

More information

Forgot to compute the new centroids (-1); error in centroid computations (-1); incorrect clustering results (-2 points); more than 2 errors: 0 points.

Forgot to compute the new centroids (-1); error in centroid computations (-1); incorrect clustering results (-2 points); more than 2 errors: 0 points. Probem 1 a. K means is ony capabe of discovering shapes that are convex poygons [1] Cannot discover X shape because X is not convex. [1] DBSCAN can discover X shape. [1] b. K-means is prototype based and

More information

Identifying and Tracking Pedestrians Based on Sensor Fusion and Motion Stability Predictions

Identifying and Tracking Pedestrians Based on Sensor Fusion and Motion Stability Predictions Sensors 2010, 10, 8028-8053; doi:10.3390/s100908028 OPEN ACCESS sensors ISSN 1424-8220 www.mdpi.com/journa/sensors Artice Identifying and Tracking Pedestrians Based on Sensor Fusion and Motion Stabiity

More information

Extracting semistructured data from the Web: An XQuery Based Approach

Extracting semistructured data from the Web: An XQuery Based Approach EurAsia-ICT 2002, Shiraz-Iran, 29-31 Oct. Extracting semistructured data from the Web: An XQuery Based Approach Gies Nachouki Université de Nantes - Facuté des Sciences, IRIN, 2, rue de a Houssinière,

More information

Further Optimization of the Decoding Method for Shortened Binary Cyclic Fire Code

Further Optimization of the Decoding Method for Shortened Binary Cyclic Fire Code Further Optimization of the Decoding Method for Shortened Binary Cycic Fire Code Ch. Nanda Kishore Heosoft (India) Private Limited 8-2-703, Road No-12 Banjara His, Hyderabad, INDIA Phone: +91-040-3378222

More information

An Alternative Approach for Solving Bi-Level Programming Problems

An Alternative Approach for Solving Bi-Level Programming Problems American Journa of Operations Research, 07, 7, 9-7 http://www.scirp.org/ourna/aor ISSN Onine: 60-889 ISSN rint: 60-880 An Aternative Approach for Soving Bi-Leve rogramming robems Rashmi Bira, Viay K. Agarwa,

More information

On-Chip CNN Accelerator for Image Super-Resolution

On-Chip CNN Accelerator for Image Super-Resolution On-Chip CNN Acceerator for Image Super-Resoution Jung-Woo Chang and Suk-Ju Kang Dept. of Eectronic Engineering, Sogang University, Seou, South Korea {zwzang91, sjkang}@sogang.ac.kr ABSTRACT To impement

More information

AUTOMATIC gender classification based on facial images

AUTOMATIC gender classification based on facial images SUBMITTED TO IEEE TRANSACTIONS ON NEURAL NETWORKS 1 Gender Cassification Using a Min-Max Moduar Support Vector Machine with Incorporating Prior Knowedge Hui-Cheng Lian and Bao-Liang Lu, Senior Member,

More information

Multi-task hidden Markov modeling of spectrogram feature from radar high-resolution range profiles

Multi-task hidden Markov modeling of spectrogram feature from radar high-resolution range profiles http://asp.eurasipjournas.com/content/22//86 RESEARCH Open Access Muti-task hidden Markov modeing of spectrogram feature from radar high-resoution range profies Mian Pan, Lan Du *, Penghui Wang, Hongwei

More information

fastruct: model-based clustering made faster

fastruct: model-based clustering made faster fastruct: mode-based custering made faster Chibiao Chen Forence Forbes Oivier François INRIA Rhone-Apes, 655 Avenue de Europe, Montbonnot, 38334 St Ismier France TIMC-TIMB, Dept Math Bioogy, Facuty of

More information

Self-Control Cyclic Access with Time Division - A MAC Proposal for The HFC System

Self-Control Cyclic Access with Time Division - A MAC Proposal for The HFC System Sef-Contro Cycic Access with Time Division - A MAC Proposa for The HFC System S.M. Jiang, Danny H.K. Tsang, Samue T. Chanson Hong Kong University of Science & Technoogy Cear Water Bay, Kowoon, Hong Kong

More information

Sparse Representation based Face Recognition with Limited Labeled Samples

Sparse Representation based Face Recognition with Limited Labeled Samples Sparse Representation based Face Recognition with Limited Labeed Sampes Vijay Kumar, Anoop Namboodiri, C.V. Jawahar Center for Visua Information Technoogy, IIIT Hyderabad, India Abstract Sparse representations

More information

A Two-Step Approach to Hallucinating Faces: Global Parametric Model and Local Nonparametric Model

A Two-Step Approach to Hallucinating Faces: Global Parametric Model and Local Nonparametric Model A Two-Step Approach to aucinating Faces: Goba Parametric Mode and Loca Nonparametric Mode Ce Liu eung-yeung Shum Chang-Shui Zhang State Key Lab of nteigent Technoogy and Systems, Dept. of Automation, Tsinghua

More information

Split Restoration with Wavelength Conversion in WDM Networks*

Split Restoration with Wavelength Conversion in WDM Networks* Spit Reoration with aveength Conversion in DM Networks* Yuanqiu Luo and Nirwan Ansari Advanced Networking Laborator Department of Eectrica and Computer Engineering New Jerse Initute of Technoog Universit

More information

Transformation Invariance in Pattern Recognition: Tangent Distance and Propagation

Transformation Invariance in Pattern Recognition: Tangent Distance and Propagation Transformation Invariance in Pattern Recognition: Tangent Distance and Propagation Patrice Y. Simard, 1 Yann A. Le Cun, 2 John S. Denker, 2 Bernard Victorri 3 1 Microsoft Research, 1 Microsoft Way, Redmond,

More information

DETERMINING INTUITIONISTIC FUZZY DEGREE OF OVERLAPPING OF COMPUTATION AND COMMUNICATION IN PARALLEL APPLICATIONS USING GENERALIZED NETS

DETERMINING INTUITIONISTIC FUZZY DEGREE OF OVERLAPPING OF COMPUTATION AND COMMUNICATION IN PARALLEL APPLICATIONS USING GENERALIZED NETS DETERMINING INTUITIONISTIC FUZZY DEGREE OF OVERLAPPING OF COMPUTATION AND COMMUNICATION IN PARALLEL APPLICATIONS USING GENERALIZED NETS Pave Tchesmedjiev, Peter Vassiev Centre for Biomedica Engineering,

More information

Fastest-Path Computation

Fastest-Path Computation Fastest-Path Computation DONGHUI ZHANG Coege of Computer & Information Science Northeastern University Synonyms fastest route; driving direction Definition In the United states, ony 9.% of the househods

More information