An Empirical Study on Lazy Multilabel Classification Algorithms


1 An Empirical Study on Lazy Multilabel Classification Algorithms
Eleftherios Spyromitros, Grigorios Tsoumakas and Ioannis Vlahavas
Machine Learning & Knowledge Discovery Group, Department of Informatics
Aristotle University of Thessaloniki, Greece

2 What is Multilabel Classification?
Single-label classification
- Examples are associated with a single label $\lambda$ from a set of disjoint labels $L$
- If $|L| = 2$: binary classification
- If $|L| > 2$: multi-class classification
Multilabel classification
- Examples are associated with a set of labels $Y \subseteq L$


3 Data With Multilabel Nature
Traditional
- Text classification: a web article concerning the Antikythera Mechanism Research Project can be categorized into both categories { Science_Technology, History_Culture }
- Medical diagnosis: multiple diseases for a patient { Obesity, Hypertension }
Modern
- Gene function classification: a gene usually has multiple functions { Protein Synthesis, Cellular Biogenesis, Cellular Transport }
- Classification of music into emotions: a song can make you feel { Sad_Lonely, Quiet_Still }
- Semantic scene analysis: { Mountain, Trees, Lake }

4 Types of Multilabel Classification Methods
Problem transformation methods
- Transform the learning problem into one (LP) or more (BR) single-label classification or label ranking problems
- Algorithm independent
Algorithm adaptation methods
- Extend specific algorithms to handle multi-label data
- SVM, decision tree, neural network, lazy, Bayesian, boosting

5 The Binary Relevance (BR) Method
How it works
- Learns one binary classifier $h_\lambda : X \to \{\neg\lambda, \lambda\}$ for each different label $\lambda \in L$
- The original dataset $D$ is transformed into $|L|$ datasets $D_\lambda$
- $D_\lambda$ contains all examples of $D$, labeled as $\lambda$ if they are associated with $\lambda$ and as $\neg\lambda$ otherwise
Criticism
- Label correlations are not considered
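
The transformation described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `binary_relevance_transform` and the representation of a dataset as a list of `(features, label_set)` pairs are assumptions made for the example.

```python
from typing import Dict, List, Sequence, Set, Tuple

# Hypothetical representation: each example is (feature vector, set of labels).
Example = Tuple[Sequence[float], Set[str]]


def binary_relevance_transform(
    dataset: List[Example], labels: Sequence[str]
) -> Dict[str, List[Tuple[Sequence[float], bool]]]:
    """Build one binary dataset D_lambda per label.

    In D_lambda every original example is kept; its target is True when the
    example is associated with the label and False otherwise.
    """
    transformed = {}
    for label in labels:
        transformed[label] = [(x, label in y) for x, y in dataset]
    return transformed


if __name__ == "__main__":
    toy = [([0.1], {"a", "b"}), ([0.9], {"b"})]
    for label, binary_dataset in binary_relevance_transform(toy, ["a", "b", "c"]).items():
        print(label, binary_dataset)
```

Each of the resulting datasets can then be passed to any single-label learner, which is why the method is algorithm independent and also why correlations between labels are lost.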

6 The Label Powerset (LP) Method
How it works
- Considers each different subset of $L$ as a single label
- It learns one single-label classifier $h : X \to P(L)$
Criticism
- Large number of label subsets (up to $2^{|L|}$)
- Most of these are associated with very few examples
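
As a rough sketch of the LP transformation, the snippet below maps every distinct label set to one class identifier. The function name `label_powerset_transform` and the list-of-pairs dataset layout are assumptions for illustration only.

```python
from typing import Dict, FrozenSet, List, Sequence, Set, Tuple

# Hypothetical representation: each example is (feature vector, set of labels).
Example = Tuple[Sequence[float], Set[str]]


def label_powerset_transform(
    dataset: List[Example],
) -> Tuple[List[Tuple[Sequence[float], int]], Dict[FrozenSet[str], int]]:
    """Turn a multi-label dataset into a single-label one.

    Every distinct label set seen in the data becomes one class; classes are
    numbered in order of first appearance.
    """
    class_of: Dict[FrozenSet[str], int] = {}
    transformed = []
    for x, y in dataset:
        key = frozenset(y)
        if key not in class_of:
            class_of[key] = len(class_of)
        transformed.append((x, class_of[key]))
    return transformed, class_of


if __name__ == "__main__":
    toy = [([0.1], {"a", "b"}), ([0.5], {"b"}), ([0.9], {"b", "a"})]
    single_label, mapping = label_powerset_transform(toy)
    print(single_label)
    print(mapping)
```

Only label sets that occur in the training data become classes, so the classifier can never predict an unseen combination, and rare combinations end up as classes with very few examples.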

7 The BRkNN Algorithm
Origin
- Equivalent to using the BR method in conjunction with the kNN algorithm
Refinement
- $|L|$ times faster than BR + kNN in prediction
Benefit
- Avoids the redundant calculation of the k nearest neighbors in each one of the transformed datasets $D_\lambda$
- A single k nearest neighbors search is followed by independent predictions for each label
- Better suited to domains with a large number of labels and examples that require low response times

8 How it works
Confidence scores
- BRkNN is based on the calculation of a confidence score $c_\lambda$ for each label $\lambda \in L$
- The confidence of a label is the percentage of the k nearest neighbors that include it
- A label is included in the predicted label set when this percentage is at least 50%
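
The prediction rule can be illustrated with a small Python sketch. It is not the MULAN implementation: the function names are hypothetical, plain (unnormalized) Euclidean distance is used for brevity, and the training set is assumed to be a list of `(features, label_set)` pairs.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple

Example = Tuple[Sequence[float], Set[str]]


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain Euclidean distance (no attribute normalization in this sketch)."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def k_nearest_label_sets(train: List[Example], query: Sequence[float], k: int) -> List[Set[str]]:
    """One neighbor search, shared by all labels."""
    ranked = sorted(train, key=lambda example: euclidean(example[0], query))
    return [y for _, y in ranked[:k]]


def brknn_confidences(neighbor_label_sets: List[Set[str]], labels: Sequence[str]) -> Dict[str, float]:
    """Confidence of a label = fraction of the k neighbors that carry it."""
    k = len(neighbor_label_sets)
    return {lab: sum(lab in y for y in neighbor_label_sets) / k for lab in labels}


def brknn_predict(train: List[Example], query: Sequence[float], labels: Sequence[str], k: int) -> Set[str]:
    """Keep every label whose confidence is at least 50%."""
    conf = brknn_confidences(k_nearest_label_sets(train, query, k), labels)
    return {lab for lab, c in conf.items() if c >= 0.5}


if __name__ == "__main__":
    train = [((0.0,), {"a"}), ((0.1,), {"a", "b"}), ((0.2,), {"b"}), ((5.0,), {"c"})]
    print(brknn_predict(train, (0.05,), ["a", "b", "c"], k=3))
```

Because each label is thresholded on its own, nothing prevents every confidence from falling below 50%, in which case the function returns an empty set.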

9 Independent Predictions
The weakness
- The empty set is a possible overall output
- Arises when none of the labels reaches a confidence of 50%
The reason
- Independent predictions for each label, a general disadvantage of the BR method
Is this common in BRkNN?
[Figure: percentage of instances where the empty set is output, plotted against the number of nearest neighbors, for the scene, yeast and emotions datasets]

10 The Proposed Extensions
Addressing the aforementioned problem
BRkNN-a
- Checks whether BRkNN outputs the empty set
- In that case, outputs the label with the highest confidence
BRkNN-b
- 1st step: calculates the average size $s$ of the label sets of the k nearest neighbors, $s = \frac{1}{k} \sum_{j=1}^{k} |Y_j|$
- 2nd step: outputs the $[s]$ (nearest integer of $s$) labels with the highest confidence
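
A hedged sketch of both extensions follows, operating directly on the label sets of the k nearest neighbors. The function names are hypothetical, and two details are assumptions not stated on the slide: ties in confidence are broken by the order of `labels`, and halves are rounded up when taking the nearest integer of $s$.

```python
from typing import Dict, List, Sequence, Set


def confidences(neighbor_label_sets: List[Set[str]], labels: Sequence[str]) -> Dict[str, float]:
    """Fraction of the k neighbors that carry each label."""
    k = len(neighbor_label_sets)
    return {lab: sum(lab in y for y in neighbor_label_sets) / k for lab in labels}


def brknn_a(neighbor_label_sets: List[Set[str]], labels: Sequence[str]) -> Set[str]:
    """BRkNN, falling back to the single most confident label on an empty output."""
    conf = confidences(neighbor_label_sets, labels)
    predicted = {lab for lab, c in conf.items() if c >= 0.5}
    if not predicted:
        # Assumption: the first label in `labels` wins a tie.
        predicted = {max(labels, key=lambda lab: conf[lab])}
    return predicted


def brknn_b(neighbor_label_sets: List[Set[str]], labels: Sequence[str]) -> Set[str]:
    """Output the [s] most confident labels, s = mean label-set size of the neighbors."""
    conf = confidences(neighbor_label_sets, labels)
    s = sum(len(y) for y in neighbor_label_sets) / len(neighbor_label_sets)
    n = int(s + 0.5)  # Assumption: nearest integer with halves rounded up.
    ranked = sorted(labels, key=lambda lab: conf[lab], reverse=True)  # stable sort
    return set(ranked[:n])


if __name__ == "__main__":
    neighbors = [{"a"}, {"b"}, {"c"}]  # plain BRkNN would output the empty set here
    print(brknn_a(neighbors, ["a", "b", "c"]))
    print(brknn_b(neighbors, ["a", "b", "c"]))
```

BRkNN-a only changes the output when plain BRkNN would return nothing, whereas BRkNN-b replaces the 50% threshold entirely with an estimate of how many labels to output.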

11 The MLkNN and LPkNN Algorithms
Two more lazy multi-label classification methods
LPkNN
- The pairing of the LP problem transformation method with the kNN algorithm
- Little discussed in the past
MLkNN
- An adaptation of kNN for multi-label data
- Main difference from BRkNN: uses prior and posterior probabilities estimated from the training set
- Extended with an option for min-max normalization

12 Evaluation Measures
Example-based
- Calculate the difference between the actual and predicted label sets for each example
- Average the results over all examples of the test set
Label-based
- Calculate a binary evaluation measure separately for each label
- Apply micro- or macro-averaging over all labels

13 Example Based Measures
Notation
- Let $(x, Y)$ be a multi-label example, $Y \subseteq L$
- Let $h$ be a multi-label classifier
- Let $Z = h(x)$ be the set of labels predicted by $h$ for $x$
Hamming loss
- $\frac{|Y \,\Delta\, Z|}{|L|}$, where $\Delta$ is the symmetric difference of two sets
Classification accuracy or subset accuracy
- $1$ if $Y = Z$, $0$ if $Y \neq Z$
IR-inspired measures
- Precision $= \frac{|Y \cap Z|}{|Z|}$, Recall $= \frac{|Y \cap Z|}{|Y|}$, F-measure $= \frac{2\,|Y \cap Z|}{|Z| + |Y|}$
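
The per-example measures translate almost directly into Python set operations. The sketch below is illustrative only: the function names are hypothetical, and returning 0.0 when a denominator is zero is an assumed convention that the slide does not specify.

```python
from typing import Callable, List, Set, Tuple


def hamming_loss(y: Set[str], z: Set[str], num_labels: int) -> float:
    """|Y symmetric-difference Z| / |L| for one example."""
    return len(y ^ z) / num_labels


def subset_accuracy(y: Set[str], z: Set[str]) -> float:
    """1 if the predicted set equals the true set exactly, else 0."""
    return 1.0 if y == z else 0.0


def precision(y: Set[str], z: Set[str]) -> float:
    return len(y & z) / len(z) if z else 0.0  # assumption: 0 for an empty prediction


def recall(y: Set[str], z: Set[str]) -> float:
    return len(y & z) / len(y) if y else 0.0  # assumption: 0 for an empty true set


def f_measure(y: Set[str], z: Set[str]) -> float:
    denom = len(y) + len(z)
    return 2 * len(y & z) / denom if denom else 0.0


def average_over_examples(measure: Callable[[Set[str], Set[str]], float],
                          pairs: List[Tuple[Set[str], Set[str]]]) -> float:
    """Example-based evaluation: average the per-example score over the test set."""
    return sum(measure(y, z) for y, z in pairs) / len(pairs)


if __name__ == "__main__":
    true_set, predicted_set = {"a", "b"}, {"b", "c"}
    print(hamming_loss(true_set, predicted_set, num_labels=4))
    print(f_measure(true_set, predicted_set))
```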

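14 Label Based Measures
- Any binary evaluation measure $M$ can be used: accuracy, area under the ROC curve, precision, recall, etc.
- Operations for averaging across all labels: macro-averaging and micro-averaging
$M_{macro} = \frac{1}{|L|} \sum_{\lambda=1}^{|L|} M(tp_\lambda, fp_\lambda, tn_\lambda, fn_\lambda)$
$M_{micro} = M\left(\sum_{\lambda=1}^{|L|} tp_\lambda, \sum_{\lambda=1}^{|L|} fp_\lambda, \sum_{\lambda=1}^{|L|} tn_\lambda, \sum_{\lambda=1}^{|L|} fn_\lambda\right)$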
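
The difference between the two averaging operations is easiest to see in code. In this sketch the per-label contingency counts are assumed to be given as `(tp, fp, tn, fn)` tuples, F-measure is used as the binary measure $M$, and all function names are hypothetical.

```python
from typing import Callable, List, Tuple

Counts = Tuple[int, int, int, int]  # (tp, fp, tn, fn) for one label


def f1(tp: int, fp: int, tn: int, fn: int) -> float:
    """Binary F-measure; tn is accepted but unused so every M has the same signature."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0  # assumption: 0 when undefined


def macro_average(measure: Callable[..., float], counts: List[Counts]) -> float:
    """Compute M per label, then average the |L| results."""
    return sum(measure(*c) for c in counts) / len(counts)


def micro_average(measure: Callable[..., float], counts: List[Counts]) -> float:
    """Sum the counts over all labels, then compute M once."""
    totals = [sum(column) for column in zip(*counts)]
    return measure(*totals)


if __name__ == "__main__":
    per_label = [(8, 2, 85, 5), (1, 0, 90, 9)]  # a frequent and a rare label
    print(macro_average(f1, per_label))
    print(micro_average(f1, per_label))
```

Macro-averaging gives every label the same weight, so poorly predicted rare labels pull the score down, while micro-averaging is dominated by frequent labels.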

15 Datasets
Dataset    Examples  Numeric Attr.  Discrete Attr.  Labels  Distinct Subsets  Label Cardinality  Label Density
Scene      2,407     294            0               6       15                …                  …
Emotions   …         …              …               …       …                 …                  …
Yeast      2,…       …              …               …       …                 …                  …
Datasets
- Scene: semantic indexing of still images
- Emotions: classification of songs into 6 classes of emotion
- Yeast: gene function classification
Multi-label statistics
- Distinct Subsets is the number of different label sets
- Label Cardinality is the average number of labels per example
- Label Density is equal to Label Cardinality divided by $|L|$
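
The three multi-label statistics defined on this slide can be computed as follows. This is a small sketch on toy data; the function name `label_statistics` and the dictionary keys are made up for the example.

```python
from typing import Dict, List, Set


def label_statistics(label_sets: List[Set[str]], num_labels: int) -> Dict[str, float]:
    """Distinct subsets, label cardinality and label density of a dataset.

    `label_sets` holds the label set of every example; `num_labels` is |L|.
    """
    cardinality = sum(len(y) for y in label_sets) / len(label_sets)
    return {
        "distinct_subsets": len({frozenset(y) for y in label_sets}),
        "label_cardinality": cardinality,
        "label_density": cardinality / num_labels,
    }


if __name__ == "__main__":
    toy = [{"a"}, {"a", "b"}, {"a"}, {"c"}]
    print(label_statistics(toy, num_labels=3))
```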

16 Evaluation Methodology
Multi-label algorithms evaluated
- BRkNN, BRkNN-a / BRkNN-b
- MLkNN
- LPkNN
- Varying number of nearest neighbors: k ranged from 1 to 30
- Distance function: normalized Euclidean
Evaluation
- Example-based: Hamming loss, accuracy, F-measure, subset accuracy
- Label-based: micro and macro versions of F-measure
- 10-fold cross-validation
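
For readers unfamiliar with the protocol, a 10-fold cross-validation loop might look like the sketch below. It is an assumption-laden illustration rather than the experimental code: the function names are hypothetical, folds are formed by striding over example indices without shuffling or stratification, and `evaluate` stands for any user-supplied callback that trains on one index list and scores on the other.

```python
from typing import Callable, Iterator, List, Tuple


def k_fold_indices(n_examples: int, n_folds: int = 10) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) pairs; fold sizes differ by at most one."""
    indices = list(range(n_examples))
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for held_out, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != held_out for j in fold]
        yield train_idx, test_idx


def cross_validate(n_examples: int,
                   evaluate: Callable[[List[int], List[int]], float],
                   n_folds: int = 10) -> float:
    """Average the score returned by `evaluate` over all folds."""
    scores = [evaluate(train, test) for train, test in k_fold_indices(n_examples, n_folds)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Dummy callback: report the test-fold size instead of a real metric.
    print(cross_validate(25, lambda train, test: float(len(test))))
```

Repeating such a loop for each k from 1 to 30 and averaging the per-k results would give figures of the kind reported as "average performance across all 30 values of k".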

17 Do the Proposed Extensions Improve BRkNN?
BRkNN against its extensions BRkNN-a and BRkNN-b
Average performance across all 30 values of k
Metric            scene                       emotions                    yeast
                  base     ext-a    ext-b     base     ext-a    ext-b     base     ext-a    ext-b
Hamming loss      0.0950   0.0938   0.0941    0.1976   0.1982   0.2175    0.1974   0.1975   0.2082
Accuracy          0.6256   0.7226   0.7218    0.5215   0.5441   0.5430    0.5062   0.5080   0.5346
F-measure         0.6495   0.7539   0.7538    0.6275   0.6576   0.6590    0.5777   0.5795   0.6652
Subset accuracy   0.6281   0.7251   0.7230    0.2895   0.2971   0.2759    0.1958   0.1959   0.1766
micro F-measure   0.6386   0.7392   0.7381    0.6499   0.6577   0.6509    …        …        …
macro F-measure   0.5993   0.6889   0.6886    0.6224   0.6303   …         …        …        …
#wins (#better)   0        6 (6)    0 (6)     1        4 (5)    1 (4)     1        1 (5)    4 (4)

18 Do the Proposed Extensions Improve BRkNN?
Remarks
- Both extensions outperform the base algorithm in more than half of the metrics in all datasets
- The performance pattern correlates with dataset label cardinality
- BRkNN-a dominates in scene and emotions (cardinality 1.074 and 1.868), where the probability that BRkNN outputs the empty set is increased
- BRkNN-b dominates in yeast (cardinality 4.237), where it acts as a mechanism to predict the number of labels

19 Comparison of BRkNN, LPkNN and MLkNN
Best extension of BRkNN against LPkNN and MLkNN
Average performance across all 30 values of k
Metric            scene                       emotions                    yeast
                  ext-a    LPkNN    MLkNN     ext-a    LPkNN    MLkNN     ext-b    LPkNN    MLkNN
Hamming loss      0.0938   0.0955   0.0884    0.1982   0.2094   0.2003    0.2082   0.2143   0.1950
Accuracy          0.7226   0.7181   0.6720    0.5441   0.5600   0.5233    0.5346   0.5280   0.5105
F-measure         0.7392   0.7343   0.6944    0.6576   0.6662   0.6352    0.6652   0.6375   0.5823
Subset accuracy   0.6889   0.6854   0.6272    0.2971   0.3287   0.2780    0.1766   0.2452   0.1780
micro F-measure   0.7296   0.7249   0.7316    0.6577   0.6649   0.6509    0.6567   0.6415   0.6422
macro F-measure   0.7363   0.7323   0.7341    0.6303   0.6505   0.6110    0.4261   0.4322   0.3701
#wins             …        …        …         …        …        …         …        …        …

20 Comparison of BRkNN, LPkNN and MLkNN
Remarks
- BRkNN-a dominates in scene
- LPkNN dominates in emotions
- BRkNN-b performs slightly better in yeast
- Possible correlation between LPkNN performance and label density

21 Summary and Future Work
Use of kNN for multi-label classification
- BRkNN: an efficient implementation of BR plus kNN
- Extensions that enhance BRkNN's performance
- Additional comparative experiments with LPkNN and MLkNN
Main contribution
- Which method is most suitable for a dataset, depending on certain dataset characteristics
Future work
- Additional lazy multi-label classification approaches
- Experiments with additional multi-label datasets

22 Resources
The MUlti-LAbel classification (MULAN) library
- Open source software for multi-label classification
- Several problem transformation and algorithm adaptation methods
- Example/label/ranking based measures
- Multi-label statistics
- Built on top of Weka
- Also hosted by Sourceforge (integrated with SVN)
Multi-label classification datasets (.arff format)
- delicious, emotions, genbase, mediamill, rcv1v2, scene, tmc2007, yeast
Active multi-label classification bibliography

23 End of Presentation


An Empirical Study of Lazy Multilabel Classification Algorithms E. Spyromitros and G. Tsoumakas and I. Vlahavas Department of Informatics, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece