Multi-task Joint Feature Selection for Multi-label Classification


Chinese Journal of Electronics, Vol.24, No.2, Apr. 2015

Multi-task Joint Feature Selection for Multi-label Classification

HE Zhifen(1,2), YANG Ming(1,2) and LIU Huidong(2)

(1. School of Mathematical Sciences, Nanjing Normal University, Nanjing 210023, China)
(2. School of Computer Science and Technology, Nanjing Normal University, Nanjing 210023, China)

Abstract  In multi-label learning, each instance may be associated with a set of class labels simultaneously. We propose a novel multi-label classification approach named MFSM (Multi-task joint feature selection for multi-label classification). MFSM first computes an asymmetric label correlation matrix in the label space. The multi-label learning problem is then formulated as a joint optimization problem with two regularization terms: one exploits the label correlations, while the other selects the similar sparse features shared among the multiple classification tasks (one task per label). The model can be reformulated as an equivalent smooth convex optimization problem, which is solved by Nesterov's method. Experiments on sixteen benchmark multi-label data sets demonstrate that our method outperforms state-of-the-art multi-label learning algorithms.

Key words  Multi-label learning, Multi-task learning, Feature selection, Label correlations.

I. Introduction

Multi-label learning (MLL) is a hot research topic in machine learning, pattern recognition and related areas. In MLL, each instance is represented by a feature vector and may belong to multiple labels; the task is to predict the set of class labels of an unseen instance. In single-label learning (SLL), which covers both two-class and multi-class learning, each object is represented by a single instance and is associated with exactly one class label, regardless of how many classes there are. Thus, SLL is essentially a degenerate version of MLL obtained by restricting each instance to a single class label. The concept of MLL originated in research on document categorization [1,2], where a document may be assigned to several predefined categories at the same time, such as diet and health. In recent years, MLL has also arisen in many other real-world application domains. For example, in automatic image annotation, an image may be annotated with a set of semantic concepts [3,4], such as sunset, beach and sea. In bioinformatics, a gene sequence may be correlated with multiple functional classes [5,6], such as transcription, metabolism and protein synthesis. In music categorization, a music clip may be tagged with more than one concept label [7], such as happy and joyful. Numerous algorithms have been put forward for MLL, such as the boosting-based text categorization algorithm BoosTexter [1], the kernel method RankSvm [6], the multi-label neural network BPMLL [5], the multi-label lazy learner MLkNN [8], multi-label naive Bayes MLNB [9] and the multi-label core vector machine Rank-CVM [10]. The most common and direct approach is to transform the MLL problem into a series of independent binary classification problems [5,6,10,11]; however, this ignores the label correlations. It is widely accepted that exploiting label correlations is an important issue in MLL [5,6,12-15]. For instance, a document tagged with Olympic Games and Basketball would likely also be labeled Sports, while a music clip tagged happy would be unlikely to be labeled sad.
There are many approaches to exploiting label correlations; according to the order of the correlations considered, they can be roughly categorized into first-order, second-order and high-order strategies [12,16]. For example, Fürnkranz et al. [17] considered pairwise correlations between labels. Huang et al. [14] tried to discover label correlations automatically and concluded that the correlations among labels are usually asymmetric. As in SLL, MLL may also suffer from the curse of dimensionality, since it often involves high-dimensional data sets. Zhang et al. [9] adapted the naive Bayes algorithm to MLL, incorporating feature extraction based on principal component analysis and feature selection based on a genetic algorithm. Ji et al. [18] considered multi-label classification and dimensionality reduction simultaneously in the objective function.

Manuscript Received Nov. 2013; Accepted June 2014. This work is supported by the National Natural Science Foundation of China (No. , No.60036), the Natural Science Foundation of Jiangsu Province of China (No.BK20782), and the Key (Major) Program of the Natural Science Foundation of Jiangsu Province of China (No.BK20005). © 2015 Chinese Institute of Electronics. DOI:10.1049/cje

Nevertheless, the label correlations are not taken into consideration. Zhang et al. [19] studied dimensionality reduction in MLL, maximizing the dependence between instances and their corresponding labels via the Hilbert-Schmidt independence criterion in a lower-dimensional feature space. However, this method only performs dimensionality reduction, without addressing the construction of a multi-label classifier. There are therefore three major challenges in MLL: a) how to build an effective multi-label classifier that predicts the label set of an unknown instance; b) how to effectively exploit label correlations; and c) how to reduce the dimensionality of high-dimensional data so as to improve the generalization performance of the MLL system. In this paper, we address all three problems. We introduce a novel multi-label classification algorithm named MFSM. The high-order asymmetric label correlations are first obtained by l1 sparse coding in the label space. The MLL problem is then formulated as a joint optimization problem that incorporates a label correlation term and an l2,1-norm regularization term for selecting the common features shared among the multiple classifiers. Finally, since this optimization problem is convex but non-smooth, we reformulate it as an equivalent smooth convex problem [20], which is solved by Nesterov's method [21]. In summary, the major contributions of this paper are threefold: a) MFSM enriches MLL research with a novel multi-label classifier; b) the technique of multi-task feature learning is successfully applied to MLL; c) the proposed formulation incorporates the correlations among class labels.

II. Related Works

Over the past decade, MLL, which deals with instances having multiple labels, has received much attention in machine learning, data mining and related fields. A number of MLL approaches have been developed, and they can be roughly divided into two categories [12]:

1) Problem transformation methods: These methods convert the MLL problem into other well-established learning problems, such as binary classification, multi-class classification and label ranking. Boutell et al. [3] introduced the Binary relevance (BR) method for MLL. BR methods can be parallelized, but they have notable drawbacks: label correlations are not taken into account, and large label spaces become problematic. Read et al. [22] developed the Classifier chains (CC) algorithm on the basis of BR. CC does consider label correlations, but the correlations it captures are arbitrary because of the randomly permuted chain; furthermore, errors may propagate along the chain if the first one or more classifiers perform poorly. To overcome these shortcomings, an Ensemble of classifier chains (ECC) [22] was proposed. The Label powerset (LP) method treats each distinct label subset occurring in the training set as a separate class value and then builds a multi-class classifier. LP takes label correlations into account, but it suffers from the large number of resulting class values and cannot predict unseen label sets. To deal with these problems, Tsoumakas et al. [23] built an ensemble of LP classifiers (RAkEL). First, a number of small subsets are randomly picked from the initial set of labels.
Multiple multi-class classifiers are then constructed using the LP method, and for each instance the outputs of these classifiers are combined by majority voting or thresholding to obtain the predicted label set. The basic idea of Label ranking (LR) [24] is to convert the MLL problem into a label ranking problem via pairwise comparison. Nevertheless, it is difficult to determine the threshold needed to correctly estimate the label set of a predicted instance. On the basis of LR, Fürnkranz et al. [17] therefore added to the label set of each instance an extra virtual class label that acts as a bi-partition point between the relevant and irrelevant labels.

2) Algorithm adaptation methods: These methods extend well-known learning algorithms so that they handle multi-label data directly. Schapire et al. [1] extended the popular AdaBoost algorithm to text categorization, proposing AdaBoost.MH and AdaBoost.MR, which minimize hamming loss and ranking loss, respectively. Zhang et al. [8] adapted the classic k-nearest neighbor algorithm to multi-label data. Many approaches are variants of the classic support vector machine, such as RankSvm [6], Rank-CVM [10], OVR-ESVM [11] and MLLOC [13]. Zhang et al. [5] designed a multi-label algorithm based on the traditional back-propagation neural network.

Multi-task learning (MTL) [20,25-27] learns multiple related tasks simultaneously rather than learning each task independently. It aims to exploit the information shared across related tasks to improve the performance of the MTL system; thus, exploiting the relatedness among tasks is a key issue in MTL [27]. In this paper, we decompose the MLL problem into multiple binary problems and treat each binary classification problem as a learning task over the same input features. Meanwhile, we consider high-order label correlations and use the l2,1-norm to capture the similar sparse structures shared among the tasks. Our proposed approach is presented in the next section.

III. The Proposed Framework

We begin by introducing some notation. Given a training set D = {(x_1, Y_1), (x_2, Y_2), ..., (x_n, Y_n)} with n instances, x_i ∈ R^d is a single instance and Y_i ∈ {+1, -1}^K is its binary label vector, where Y_ij = +1 if x_i belongs to the jth label and -1 otherwise, and K denotes the number of class labels. We denote by X = [x_1, x_2, ..., x_n]^T ∈ R^{n×d} and Y = [Y_1, Y_2, ..., Y_n]^T ∈ R^{n×K} the data matrix and the class label indicator matrix, respectively.

1. Constructing the label correlation matrix

Sparse representation (SR) was first used for signal representation and compression, and has since been widely applied in signal processing, machine learning and other areas.

Given a signal c ∈ R^n and a matrix C = [c_1, c_2, ..., c_K] ∈ R^{n×K}, SR aims to represent c using as few elements of C as possible [28]. The objective function can be defined as

  min_s ||s||_0   s.t.  c = Cs                                (1)

where s ∈ R^K is the sparse coefficient vector. Unfortunately, the problem in Eq.(1) is not convex. Some studies [28,29] showed that if the solution of Eq.(1) is sparse, Eq.(1) can be transformed into

  min_s ||s||_1   s.t.  c = Cs                                (2)

In general, the constraint in Eq.(2) does not hold exactly because of noise. A robust extension adds a noise term [30] ε ∈ R^n, i.e. c = Cs + ε. The constrained optimization problem can then be rewritten as an unconstrained one by introducing a tradeoff parameter, which gives

  min_s̃ (1/2)||c - C̃s̃||_2^2 + α||s̃||_1                        (3)

where C̃ = [C, I_n] ∈ R^{n×(K+n)}, I_n is the n×n identity matrix, s̃ = [s^T, ε^T]^T ∈ R^{K+n}, and α is the regularization parameter that trades off the two terms.

In MLL, label correlations are usually asymmetric [14]. Moreover, not all labels are related [13]; in many cases only some of the labels are relevant, especially when the number of class labels is very large. We therefore compute a sparse representation of each label vector via Eq.(3), characterizing the correlations of each label against the remaining labels. In this work, we assume that the self-label-correlation coefficient of each label is zero. The label correlation matrix W, whose diagonal elements are all zero, is then constructed from the sparse representation vector of each label. The complete procedure is given in Algorithm 1.

Algorithm 1  Learning the label correlation matrix
Input: the label indicator matrix Y of the training set
Output: label correlation matrix W ∈ R^{K×K} with all diagonal elements zero
1: Set C = [c_1, c_2, ..., c_K] ∈ R^{n×K}, where c_ik = 1 if Y_ik = +1 and 0 otherwise; then normalize each column c_k of C to c_k/||c_k||, k = 1, 2, ..., K
2: For k = 1 to K
3:   Obtain the sparse representation s ∈ R^{K+n} of the label vector c_k, with the kth element fixed at zero, by solving
       min_s (1/2)||c_k - C̃s||_2^2 + α||s||_1   s.t.  s_k = 0
4:   Set W_lk = s_l for 1 ≤ l ≤ K
5: End for

2. The MFSM approach

In this work, we learn a set of K linear functions {f_1, f_2, ..., f_K} (one per label), where f_k(x) = a_k^T x + b_k, and a_k ∈ R^d and b_k ∈ R denote the weight vector and the bias for the kth class label, k = 1, 2, ..., K. We assume that both the data matrix X and the label indicator matrix Y are centered, so that all bias terms {b_k} are zero. We treat the construction of each linear classifier as a learning task and train the K classifiers jointly in a single optimization problem. Furthermore, to effectively exploit label correlations, a regularization term is added, giving the objective

  min_{a_k} (1/2) Σ_{i=1}^n Σ_{k=1}^K (a_k^T x_i - Y_ik)^2 + (λ_1/2) Σ_{k=1}^K ||Xa_k - Σ_{j=1}^K W_jk Xa_j||_2^2      (4)

where λ_1 is a regularization parameter balancing the two terms. The first term is the least-squares loss; the second is a reconstruction error term which ensures that each label can be linearly represented by the other, related, class labels. As noted above, the curse of dimensionality may also occur in MLL: in a high-dimensional feature space, usually only a small subset of features is useful for building the classifier.
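Since Algorithm 1 above amounts to one l1-regularized regression per label, it is small enough to sketch directly. The following Python sketch is a hypothetical illustration, not the authors' implementation: the function name is ours, and it delegates Eq.(3) to scikit-learn's Lasso (whose objective is Eq.(3) rescaled by 1/n), while the constraint s_k = 0 is enforced simply by zeroing the kth dictionary column.

```python
import numpy as np
from sklearn.linear_model import Lasso

def label_correlation_matrix(Y, alpha=0.5):
    """Sketch of Algorithm 1: asymmetric label correlation matrix W
    via l1 sparse coding in the label space (Eq.(3)).
    Y is the (n, K) label matrix with entries in {+1, -1}."""
    n, K = Y.shape
    C = (Y == 1).astype(float)                      # step 1: c_ik = 1 iff Y_ik = +1
    C = C / np.maximum(np.linalg.norm(C, axis=0), 1e-12)  # unit-norm columns
    C_tilde = np.hstack([C, np.eye(n)])             # [C, I_n]; I_n absorbs the noise term
    W = np.zeros((K, K))
    for k in range(K):                              # steps 2-5: one sparse code per label
        D = C_tilde.copy()
        D[:, k] = 0.0                               # enforces s_k = 0 (zero diagonal of W)
        # sklearn's Lasso minimizes (1/(2n))||y - Ds||^2 + a||s||_1,
        # so a = alpha/n matches the scaling of Eq.(3).
        s = Lasso(alpha=alpha / n, fit_intercept=False,
                  max_iter=10000).fit(D, C[:, k]).coef_
        W[:, k] = s[:K]                             # step 4: W_lk = s_l, 1 <= l <= K
    return W
```

Zeroing column k is equivalent to the constraint s_k = 0: the kth coefficient then has no effect on the residual and only incurs the l1 penalty, so the optimum sets it to zero.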
In this paper, multiple tasks sharing the same instances and feature space are trained simultaneously, so there is an intrinsic relationship among the tasks. We add an l2,1-norm penalty to select the common discriminative features across the multiple linear classifiers. MFSM can then be expressed as

  min_{a_k} (1/2) Σ_{i=1}^n Σ_{k=1}^K (a_k^T x_i - Y_ik)^2 + (λ_1/2) Σ_{k=1}^K ||Xa_k - Σ_{j=1}^K W_jk Xa_j||_2^2 + λ_2 ||A||_{2,1}      (5)

where A = [a_1, a_2, ..., a_K] = [a^1, a^2, ..., a^d]^T ∈ R^{d×K}, ||A||_{2,1} = Σ_{i=1}^d ||a^i||_2 is the l2,1-norm of A, which enforces similar sparsity patterns across the K binary classifiers, and λ_1 and λ_2 are regularization parameters balancing the three terms. Eq.(5) can be rewritten as

  min_{A ∈ R^{d×K}} q(A) + λ_2 ||A||_{2,1}                      (6)

where q(A) = (1/2)||XA - Y||_F^2 + (λ_1/2) tr(XAMA^T X^T), M = (I_K - W)(I_K - W)^T, W is the label correlation matrix, I_K is the K×K identity matrix, and tr(·) denotes the trace of a matrix.

3. Problem solution by Nesterov's method

The problem in Eq.(6) is a non-smooth convex problem because ||A||_{2,1} is non-differentiable. Adopting the method of Ref.[20], it can be reformulated as the equivalent constrained smooth convex optimization problem

  min_{(u,A) ∈ Ω} g(u, A) = q(A) + λ_2 Σ_{i=1}^d u_i             (7)

where u = [u_1, u_2, ..., u_d]^T and Ω = {(u, A) : ||a^i||_2 ≤ u_i, i = 1, 2, ..., d}, which can then be solved by Nesterov's method [21]. Since q(A) is a smooth convex function, the optimization problem in Eq.(7) is closed and convex [20]. Nesterov's method maintains a sequence of approximate solutions {(u_t, A_t)} and a sequence of search points {(v_t, S_t)}, where t denotes the tth iteration.

The search point (v_t, S_t) is defined as

  (v_t, S_t) = (u_t + α_t(u_t - u_{t-1}), A_t + α_t(A_t - A_{t-1}))        (8)

where α_t is the combination coefficient. The approximate solution (u_{t+1}, A_{t+1}) is computed as

  (u_{t+1}, A_{t+1}) = π_Ω(v_t - γ_t ∇_{v_t} g(v_t, S_t), S_t - γ_t ∇_{S_t} g(v_t, S_t))        (9)

where γ_t is the step size and π_Ω(v, S) is the Euclidean projection [32] of (v, S) onto the convex set Ω:

  π_Ω(v, S) = argmin_{(u,A) ∈ Ω} (1/2)||A - S||_F^2 + (1/2)||u - v||_2^2        (10)

The partial derivative of g(v, S) with respect to v is

  ∇_v g(v, S) = λ_2 1                                           (11)

where 1 ∈ R^d is the vector of all ones, and the partial derivative of g(v, S) with respect to S is

  ∇_S g(v, S) = X^T(XS - Y) + λ_1 X^T XSM                        (12)

The procedure for solving Eq.(7) by Nesterov's method is given in Algorithm 2, where g_{γ,v,S}(u, A) is defined by

  g_{γ,v,S}(u, A) = g(v, S) + ⟨∇_v g(v, S), u - v⟩ + Σ_{i=1}^d Σ_{j=1}^K (∇_S g(v, S))_ij (A - S)_ij + (1/2γ)||u - v||_2^2 + (1/2γ)||A - S||_F^2        (13)

The time complexity of solving Eq.(7) by Nesterov's method is O((ndK + dK)/√ε), where n, d and K denote the numbers of instances, features and class labels, respectively, and ε is the error tolerance; see Ref.[21] for the detailed analysis.

Algorithm 2  Problem solution of MFSM
Input: g(·,·), Ω, γ_0 > 0, (u_0, A_0)
Output: (u, A), where u ∈ R^d and A ∈ R^{d×K}
1: Initialize (u_1, A_1) = (u_0, A_0), β_{-1} = 0, β_0 = 1
2: For t = 1, 2, ... do
3:   Set α_t = (β_{t-2} - 1)/β_{t-1}
4:   Compute (v_t, S_t) using Eq.(8)
5:   For i = 0, 1, ... do
6:     Set γ = 2^{-i} γ_{t-1}, v'_t = v_t - γ∇_{v_t} g(v_t, S_t), S'_t = S_t - γ∇_{S_t} g(v_t, S_t)
7:     Compute (u_{t+1}, A_{t+1}) = π_Ω(v'_t, S'_t)
8:     If g(u_{t+1}, A_{t+1}) ≤ g_{γ,v_t,S_t}(u_{t+1}, A_{t+1}) then
9:       Set γ_t = γ and break
10:    End if
11:  End for
12:  Set β_t = (1 + sqrt(1 + 4β_{t-1}^2))/2
13:  If converged then
14:    Set (u, A) = (u_t, A_t) and terminate
15:  End if
16: End for

In the testing phase, given an unseen instance x, its label set is predicted as

  h(x) = {k | f_k(x) ≥ t(x), 1 ≤ k ≤ K}                          (14)

where t(x) is a threshold function. There are several ways to choose t(x) [3,5,6,8]; a simple one is to set t(x) to a constant [3,8]. Motivated by Refs.[5,6], however, we here learn t(x) with a linear regression function.

IV. Experiments

1. Experimental setting

To validate the effectiveness of the proposed approach, we compare MFSM with eight state-of-the-art MLL algorithms: MLkNN [8], RankSVM [6], BPMLL [5], MLNB [9], MDDM [19], ECC [22], RAkEL [23] and MAHR [14], where ECC and RAkEL are run on the MULAN library [33] and the other algorithms are implemented in Matlab. The algorithms are evaluated on sixteen multi-label data sets, of which eleven belong to the Yahoo collection. Table 1 summarizes the data sets, reporting for each the numbers of instances, training and test examples, features and labels, as well as the domain: Medical, Slashdot, Langlog and Yahoo are text data, while Human and Plant are biology data.

In the experiments, five commonly-used multi-label evaluation criteria are employed: Average precision, One-Error, Micro-F1, Macro-F1 and Macro-AUC. The first two are instance-based and the last three are label-based; their detailed definitions can be found in Ref.[12].

2. Parameter selection

The parameters of the eight comparison methods are set as suggested in the corresponding literature. MFSM has three parameters to tune: α, λ_1 and λ_2. In this paper, λ_1 is selected from {0.0001, 0.001, 0.01, 0.1, 0.2, ..., 1}, while α and λ_2 are varied from 0 to 1 with a step of 0.1. We tune MFSM on some of the data sets by grid search with five-fold cross-validation.
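To make the training procedure concrete, the following Python sketch assembles Eqs.(8)-(13) into Algorithm 2. It is a minimal sketch under stated assumptions, not the paper's Matlab code: the function names, the initialization γ_0 = 1, the iteration cap and the convergence test are our choices, and the projection of Eq.(10) is computed row by row in closed form as a projection onto the second-order cone.

```python
import numpy as np

def project_onto_omega(v, S):
    """Euclidean projection of (v, S) onto Omega = {(u, A): ||a^i||_2 <= u_i}
    (Eq.(10)), row by row onto the second-order cone."""
    u, A = np.zeros_like(v), np.zeros_like(S)
    for i in range(S.shape[0]):
        w = np.linalg.norm(S[i])
        if w <= v[i]:                          # already feasible: keep the point
            u[i], A[i] = v[i], S[i]
        elif w > -v[i]:                        # partial projection onto the cone
            u[i] = (w + v[i]) / 2.0
            A[i] = (u[i] / w) * S[i]
        # else: the projection is the origin, u[i] = 0 and A[i] = 0
    return u, A

def mfsm_train(X, Y, W, lam1=0.001, lam2=0.7, max_iter=500, tol=1e-6):
    """Sketch of Algorithm 2: Nesterov's accelerated gradient with a
    backtracking line search for Eq.(7). X: (n, d); Y: (n, K) in {+1, -1};
    W: (K, K) label correlation matrix from Algorithm 1."""
    n, d = X.shape
    K = Y.shape[1]
    M = (np.eye(K) - W) @ (np.eye(K) - W).T    # M = (I_K - W)(I_K - W)^T
    XtX, XtY = X.T @ X, X.T @ Y

    def q(A):                                  # smooth part of Eq.(6)
        R = X @ A
        return 0.5 * np.linalg.norm(R - Y, 'fro') ** 2 \
             + 0.5 * lam1 * np.trace(R @ M @ R.T)
    def g(u, A):                               # objective of Eq.(7)
        return q(A) + lam2 * u.sum()
    def grad_S(S):                             # Eq.(12)
        return XtX @ S - XtY + lam1 * XtX @ S @ M
    grad_v = lam2 * np.ones(d)                 # Eq.(11)

    u = u_prev = np.zeros(d)
    A = A_prev = np.zeros((d, K))
    beta_prev2, beta_prev = 0.0, 1.0           # beta_{-1} and beta_0
    gamma = 1.0                                # assumed gamma_0
    for t in range(max_iter):
        alpha = (beta_prev2 - 1.0) / beta_prev
        v = u + alpha * (u - u_prev)           # search point, Eq.(8)
        S = A + alpha * (A - A_prev)
        gS, g_vS = grad_S(S), g(v, S)
        for i in range(60):                    # line search: shrink the step
            step = gamma / 2 ** i
            u_new, A_new = project_onto_omega(v - step * grad_v, S - step * gS)
            du, dA = u_new - v, A_new - S      # bound below evaluates Eq.(13)
            bound = g_vS + grad_v @ du + np.sum(gS * dA) \
                  + (du @ du + np.sum(dA * dA)) / (2 * step)
            if g(u_new, A_new) <= bound:
                gamma = step
                break
        if np.linalg.norm(A_new - A, 'fro') <= tol * max(1.0, np.linalg.norm(A, 'fro')):
            return u_new, A_new                # converged
        u_prev, A_prev, u, A = u, A, u_new, A_new
        beta_prev2 = beta_prev
        beta_prev = (1.0 + np.sqrt(1.0 + 4.0 * beta_prev ** 2)) / 2.0
    return u, A
```

Prediction then follows Eq.(14): an unseen x receives every label k with a_k^T x ≥ t(x). In this sketch one could simply take t(x) = 0 on centered data, whereas the paper fits t(x) by linear regression.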
Experimental results show that MFSM achieves stable performance with α = 0.5, λ_1 = 0.001 and λ_2 = 0.7; hence MFSM is run with these values on all data sets in the experiments. Furthermore, the common features shared among the multiple classifiers are learned through the l2,1-norm regularization: a feature is selected when the corresponding row of the matrix A is not identically zero. The number of selected features is thus not a separate parameter but is controlled by λ_2.

3. Experimental results

Tables 2-6 present the experimental results of MFSM and the eight compared approaches on the sixteen multi-label data sets in terms of the five evaluation metrics, where the best result on each data set among the nine algorithms is highlighted in boldface. Due to space limits, we only list the average results over the eleven Yahoo data sets.

The experimental results in Tables 2 and 3 show that the proposed MFSM performs significantly better than the compared methods on all data sets in terms of the two instance-based evaluation criteria; in particular, MFSM outperforms the high-order algorithms (ECC, RAkEL and MAHR). The comparison results for the three label-based criteria are summarized in Tables 4-6. In Table 4, MFSM ranks second on the Medical data set, but the gap to MAHR is very small, and it achieves the best performance on all remaining data sets. As shown in Table 5, MFSM outperforms BPMLL on most data sets, with a Macro-F1 difference of less than 0.01 on Slashdot; on Medical it is inferior only to MAHR, and on the other data sets it is clearly superior to all compared algorithms. In Table 6, there is little difference between MFSM and MAHR on Medical; although MFSM ranks third on Langlog, its gap to the best algorithm is not significant, and it beats the other six methods.

Table 2. Experimental results in terms of Average precision (the bigger the better)
Table 3. Experimental results in terms of One-Error (the smaller the better)
Table 4. Experimental results in terms of Micro-F1 (the bigger the better)
Table 5. Experimental results in terms of Macro-F1 (the bigger the better)
Table 6. Experimental results in terms of Macro-AUC (the bigger the better)
(Each table covers the Medical, Slashdot, Langlog, Human, Plant and Yahoo data sets.)

For a further statistical comparison, we compare all methods against each other over the multiple data sets with the Friedman test [34] at the 5% significance level. For each evaluation criterion, the null hypothesis that all algorithms perform equally is rejected, so post-hoc tests are needed to determine which algorithms differ significantly. We use the Nemenyi test, under which the performance of two algorithms is significantly different if the difference between their average ranks is not less than the critical difference (CD) [16,34].
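The CD value used in Fig.1 can be reproduced from the formula in Ref.[34]; taking the critical value q_0.05 = 3.102 for nine algorithms from Demšar's table gives

  CD = q_α sqrt(k(k+1)/(6N)) = 3.102 × sqrt(9×10/(6×16)) ≈ 3.00

with k = 9 algorithms and N = 16 data sets.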

Fig.1 gives the statistical comparisons of all algorithms against each other over the multiple data sets for the five evaluation criteria. In each subfigure, the critical difference (CD = 3.00, for 9 algorithms and 16 data sets) is drawn above the axis, the average rank of each algorithm is plotted on the axis (higher ranks to the left), and groups of algorithms that are not significantly different are connected by a bold line. Each algorithm takes part in 40 comparisons (8 comparing algorithms, 5 criteria). MFSM achieves statistically comparable performance in 30% of the cases and better performance in 70%, and it is never inferior, i.e. no algorithm outperforms MFSM. These experimental results and statistical analyses indicate that MFSM outperforms the other state-of-the-art MLL algorithms.

Fig.1. Performance comparisons of all algorithms against each other in terms of each evaluation criterion: (a) Average precision; (b) One-Error; (c) Micro-F1; (d) Macro-F1; (e) Macro-AUC

V. Conclusions

This paper presents a novel and effective multi-label classification method named MFSM, which considers multi-label classifier construction, label correlations and feature selection simultaneously. First, the high-order asymmetric label correlations are obtained by l1 sparse coding in the label space. Second, by taking label correlations into account, the major shortcoming of converting the MLL problem into multiple independent binary classification problems is overcome. Third, the l2,1-norm regularization term selects similar sparsity patterns across the multiple tasks. Finally, the resulting joint optimization formulation is reformulated as an equivalent smooth convex optimization problem and solved by Nesterov's method. The experimental results on sixteen data sets verify the effectiveness of the proposed method.

References
[1] R.E. Schapire and Y. Singer, BoosTexter: A boosting-based system for text categorization, Machine Learning, Vol.39, No.2/3, pp.135-168, 2000.
[2] Y.Y. Jiang, P. Li and Q. Wang, Labeled LDA model based on shared background topics, Acta Electronica Sinica, Vol.41, No.9, 2013. (in Chinese)
[3] M.R. Boutell, J. Luo, et al., Learning multi-label scene classification, Pattern Recognition, Vol.37, No.9, pp.1757-1771, 2004.
[4] C. Wang, S. Yan, et al., Multi-label sparse coding for automatic image annotation, IEEE Conference on Computer Vision and Pattern Recognition, Miami, Florida, USA, 2009.
[5] M.L. Zhang and Z.H. Zhou, Multilabel neural networks with applications to functional genomics and text categorization, IEEE Transactions on Knowledge and Data Engineering, Vol.18, No.10, pp.1338-1351, 2006.
[6] A. Elisseeff and J. Weston, A kernel method for multi-labelled classification, Proc. of the Fifteenth Conference on Neural Information Processing Systems, Vancouver, British Columbia, Canada, 2001.
[7] K. Trohidis, G. Tsoumakas, et al., Multilabel classification of music into emotions, Proc. of the 9th International Conference on Music Information Retrieval, Philadelphia, PA, USA, 2008.
[8] M.L. Zhang and Z.H. Zhou, ML-KNN: A lazy learning approach to multi-label learning, Pattern Recognition, Vol.40, No.7, pp.2038-2048, 2007.
[9] M.L. Zhang, J.M. Peña and V. Robles, Feature selection for multi-label naive Bayes classification, Information Sciences, Vol.179, No.19, pp.3218-3229, 2009.
[10] J.H. Xu, Fast multi-label core vector machine, Pattern Recognition, Vol.46, No.3, pp.885-898, 2013.
[11] J.H. Xu, An extended one-versus-rest support vector machine for multi-label classification, Neurocomputing, Vol.74, No.17, 2011.
[12] M.L. Zhang and Z.H. Zhou, A review on multi-label learning algorithms, IEEE Transactions on Knowledge and Data Engineering, Vol.26, No.8, pp.1819-1837, 2014.
[13] S.J. Huang and Z.H. Zhou, Multi-label learning by exploiting label correlations locally, Proc. of the 26th AAAI Conference on Artificial Intelligence, Toronto, Canada, 2012.

[14] S.J. Huang, Y. Yu and Z.H. Zhou, Multi-label hypothesis reuse, Proc. of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Beijing, China, 2012.
[15] M.L. Zhang and K. Zhang, Multi-label learning by exploiting label dependency, Proc. of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Washington, D.C., USA, 2010.
[16] M.L. Zhang, LIFT: Multi-label learning with label-specific features, Proc. of the Twenty-Second International Joint Conference on Artificial Intelligence, Barcelona, Spain, 2011.
[17] J. Fürnkranz, E. Hüllermeier, et al., Multilabel classification via calibrated label ranking, Machine Learning, Vol.73, No.2, pp.133-153, 2008.
[18] S.W. Ji and J.P. Ye, Linear dimensionality reduction for multi-label classification, Proc. of the 21st International Joint Conference on Artificial Intelligence, California, USA, 2009.
[19] Y. Zhang and Z.H. Zhou, Multilabel dimensionality reduction via dependence maximization, ACM Transactions on Knowledge Discovery from Data, Vol.4, No.3, pp.1-21, 2010.
[20] J. Liu, S.W. Ji and J.P. Ye, Multi-task feature learning via efficient l2,1-norm minimization, Proc. of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, Montreal, Canada, pp.339-348, 2009.
[21] Y. Nesterov, Introductory Lectures on Convex Optimization: A Basic Course, Kluwer Academic Publishers, Holland, 2004.
[22] J. Read, B. Pfahringer, G. Holmes, et al., Classifier chains for multi-label classification, Machine Learning, Vol.85, No.3, pp.333-359, 2011.
[23] G. Tsoumakas, I. Katakis and I. Vlahavas, Random k-labelsets for multilabel classification, IEEE Transactions on Knowledge and Data Engineering, Vol.23, No.7, pp.1079-1089, 2011.
[24] E. Hüllermeier, J. Fürnkranz, et al., Label ranking by learning pairwise preferences, Artificial Intelligence, Vol.172, No.16, pp.1897-1916, 2008.
[25] R. Caruana, Multitask learning, Machine Learning, Vol.28, No.1, pp.41-75, 1997.
[26] A. Argyriou, T. Evgeniou and M. Pontil, Multi-task feature learning, Proc. of the Twenty-First Annual Conference on Neural Information Processing Systems, Vancouver, B.C., Canada, pp.41-48, 2007.
[27] S. Ben-David and R. Schuller, Exploiting task relatedness for multiple task learning, Proc. of the 16th Annual Conference on Learning Theory, Washington, D.C., USA, 2003.
[28] L.S. Qiao, S.C. Chen and X.Y. Tan, Sparsity preserving projections with applications to face recognition, Pattern Recognition, Vol.43, No.1, pp.331-341, 2010.
[29] D.L. Donoho, Compressed sensing, IEEE Transactions on Information Theory, Vol.52, No.4, pp.1289-1306, 2006.
[30] J. Wright, A.Y. Yang, et al., Robust face recognition via sparse representation, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.31, No.2, pp.210-227, 2009.
[31] Y.H. Guo and W. Xue, Probabilistic multi-label classification with sparse feature learning, Proc. of the Twenty-Third International Joint Conference on Artificial Intelligence, Beijing, China, 2013.
[32] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, Cambridge, UK, 2004.
[33] G. Tsoumakas, E.S. Xioufis, J. Vilcek, et al., Mulan: A Java library for multi-label learning, The Journal of Machine Learning Research, Vol.12, pp.2411-2414, 2011.
[34] J. Demšar, Statistical comparisons of classifiers over multiple data sets, The Journal of Machine Learning Research, Vol.7, pp.1-30, 2006.

HE Zhifen was born in 1988. She is currently pursuing the Ph.D. degree at the School of Mathematical Sciences, Nanjing Normal University.
Her research interests include machine learning and pattern recognition. (Email: hzfnjnu@gmail.com)

YANG Ming (corresponding author) was born in 1964. He received the Ph.D. degree from Southeast University. He is currently a professor at the School of Computer Science and Technology, Nanjing Normal University. His research interests include data mining, pattern recognition and machine learning. (Email: m.yang@njnu.edu.cn)

LIU Huidong was born in 1987. He received the M.S. degree from Nanjing Normal University in 2013. His research interests include machine learning, pattern recognition and their applications to face recognition, image processing, etc.


CONTENT ADAPTIVE SCREEN IMAGE SCALING

CONTENT ADAPTIVE SCREEN IMAGE SCALING CONTENT ADAPTIVE SCREEN IMAGE SCALING Yao Zhai (*), Qifei Wang, Yan Lu, Shipeng Li University of Science and Technology of China, Hefei, Anhui, 37, China Microsoft Research, Beijing, 8, China ABSTRACT

More information

Discriminative sparse model and dictionary learning for object category recognition

Discriminative sparse model and dictionary learning for object category recognition Discriative sparse model and dictionary learning for object category recognition Xiao Deng and Donghui Wang Institute of Artificial Intelligence, Zhejiang University Hangzhou, China, 31007 {yellowxiao,dhwang}@zju.edu.cn

More information

International Journal of Scientific Research & Engineering Trends Volume 4, Issue 6, Nov-Dec-2018, ISSN (Online): X

International Journal of Scientific Research & Engineering Trends Volume 4, Issue 6, Nov-Dec-2018, ISSN (Online): X Analysis about Classification Techniques on Categorical Data in Data Mining Assistant Professor P. Meena Department of Computer Science Adhiyaman Arts and Science College for Women Uthangarai, Krishnagiri,

More information

The Un-normalized Graph p-laplacian based Semi-supervised Learning Method and Speech Recognition Problem

The Un-normalized Graph p-laplacian based Semi-supervised Learning Method and Speech Recognition Problem Int. J. Advance Soft Compu. Appl, Vol. 9, No. 1, March 2017 ISSN 2074-8523 The Un-normalized Graph p-laplacian based Semi-supervised Learning Method and Speech Recognition Problem Loc Tran 1 and Linh Tran

More information

Diagonal Principal Component Analysis for Face Recognition

Diagonal Principal Component Analysis for Face Recognition Diagonal Principal Component nalysis for Face Recognition Daoqiang Zhang,2, Zhi-Hua Zhou * and Songcan Chen 2 National Laboratory for Novel Software echnology Nanjing University, Nanjing 20093, China 2

More information

HFCT: A Hybrid Fuzzy Clustering Method for Collaborative Tagging

HFCT: A Hybrid Fuzzy Clustering Method for Collaborative Tagging 007 International Conference on Convergence Information Technology HFCT: A Hybrid Fuzzy Clustering Method for Collaborative Tagging Lixin Han,, Guihai Chen Department of Computer Science and Engineering,

More information

Stepwise Nearest Neighbor Discriminant Analysis

Stepwise Nearest Neighbor Discriminant Analysis Stepwise Nearest Neighbor Discriminant Analysis Xipeng Qiu and Lide Wu Media Computing & Web Intelligence Lab Department of Computer Science and Engineering Fudan University, Shanghai, China xpqiu,ldwu@fudan.edu.cn

More information

Random projection for non-gaussian mixture models

Random projection for non-gaussian mixture models Random projection for non-gaussian mixture models Győző Gidófalvi Department of Computer Science and Engineering University of California, San Diego La Jolla, CA 92037 gyozo@cs.ucsd.edu Abstract Recently,

More information

Multi-Label Classification with Conditional Tree-structured Bayesian Networks

Multi-Label Classification with Conditional Tree-structured Bayesian Networks Multi-Label Classification with Conditional Tree-structured Bayesian Networks Original work: Batal, I., Hong C., and Hauskrecht, M. An Efficient Probabilistic Framework for Multi-Dimensional Classification.

More information

Noise-based Feature Perturbation as a Selection Method for Microarray Data

Noise-based Feature Perturbation as a Selection Method for Microarray Data Noise-based Feature Perturbation as a Selection Method for Microarray Data Li Chen 1, Dmitry B. Goldgof 1, Lawrence O. Hall 1, and Steven A. Eschrich 2 1 Department of Computer Science and Engineering

More information

Rough Set Approach to Unsupervised Neural Network based Pattern Classifier

Rough Set Approach to Unsupervised Neural Network based Pattern Classifier Rough Set Approach to Unsupervised Neural based Pattern Classifier Ashwin Kothari, Member IAENG, Avinash Keskar, Shreesha Srinath, and Rakesh Chalsani Abstract Early Convergence, input feature space with

More information

An Efficient Probabilistic Framework for Multi-Dimensional Classification

An Efficient Probabilistic Framework for Multi-Dimensional Classification An Efficient Probabilistic Framework for Multi-Dimensional Classification Iyad Batal Computer Science Dept. University of Pittsburgh iyad@cs.pitt.edu Charmgil Hong Computer Science Dept. University of

More information

MSA220 - Statistical Learning for Big Data

MSA220 - Statistical Learning for Big Data MSA220 - Statistical Learning for Big Data Lecture 13 Rebecka Jörnsten Mathematical Sciences University of Gothenburg and Chalmers University of Technology Clustering Explorative analysis - finding groups

More information

An R Package flare for High Dimensional Linear Regression and Precision Matrix Estimation

An R Package flare for High Dimensional Linear Regression and Precision Matrix Estimation An R Package flare for High Dimensional Linear Regression and Precision Matrix Estimation Xingguo Li Tuo Zhao Xiaoming Yuan Han Liu Abstract This paper describes an R package named flare, which implements

More information

Applying Supervised Learning

Applying Supervised Learning Applying Supervised Learning When to Consider Supervised Learning A supervised learning algorithm takes a known set of input data (the training set) and known responses to the data (output), and trains

More information

KBSVM: KMeans-based SVM for Business Intelligence

KBSVM: KMeans-based SVM for Business Intelligence Association for Information Systems AIS Electronic Library (AISeL) AMCIS 2004 Proceedings Americas Conference on Information Systems (AMCIS) December 2004 KBSVM: KMeans-based SVM for Business Intelligence

More information

A New Method For Forecasting Enrolments Combining Time-Variant Fuzzy Logical Relationship Groups And K-Means Clustering

A New Method For Forecasting Enrolments Combining Time-Variant Fuzzy Logical Relationship Groups And K-Means Clustering A New Method For Forecasting Enrolments Combining Time-Variant Fuzzy Logical Relationship Groups And K-Means Clustering Nghiem Van Tinh 1, Vu Viet Vu 1, Tran Thi Ngoc Linh 1 1 Thai Nguyen University of

More information

An Empirical Comparison of Spectral Learning Methods for Classification

An Empirical Comparison of Spectral Learning Methods for Classification An Empirical Comparison of Spectral Learning Methods for Classification Adam Drake and Dan Ventura Computer Science Department Brigham Young University, Provo, UT 84602 USA Email: adam drake1@yahoo.com,

More information

Leave-One-Out Support Vector Machines

Leave-One-Out Support Vector Machines Leave-One-Out Support Vector Machines Jason Weston Department of Computer Science Royal Holloway, University of London, Egham Hill, Egham, Surrey, TW20 OEX, UK. Abstract We present a new learning algorithm

More information