3 Virtual attribute subsetting
Portions of this chapter were previously presented at the 19th Australian Joint Conference on Artificial Intelligence (Horton et al., 2006).

Virtual attribute subsetting is a meta-classification technique which will be shown in chapter 6 to improve the performance of the confidence mapping technique described there. It was created in this work as a generic technique, and this chapter describes its implementation and application to generic classification problems. It is an extension of the attribute subsetting technique described in chapter 2. Unlike standard attribute subsetting, virtual attribute subsetting uses a single base classifier, but classifies an instance by passing multiple copies of it to that classifier; each copy has some of its attribute values set to unknown. This is shown here to retain some of the accuracy benefits of standard attribute subsetting while needing less training time, since virtual attribute subsetting trains only a single base classifier. It also means that virtual attribute subsetting can be applied when the base classifier already exists, whereas standard attribute subsetting base classifiers must be trained with attribute subsetting in mind. This possibility of improving the performance of an existing classifier without needing any training data or further training time is very unusual in classifier learning.

3.1 Unknown attributes in classification

Virtual attribute subsetting requires base classifiers that can return a classification for an instance with unknown attribute values. Many types of classifier can handle such missing values, but their means of doing so differ. Three such classifiers are considered here.

1. Naïve Bayesian classifiers can easily handle missing values by omitting the term for that attribute when the probabilities are multiplied (Kohavi et al., 1997).

2. Decision tree classifiers can deal with missing values at classification time in several different ways (Quinlan, 1989). In trees learnt by the commonly-used C4.5 algorithm, if a node in the tree depends on an attribute whose value is unknown, all subtrees of that node are checked and their class distributions summed (Quinlan, 1993).
3. Many rule induction algorithms can handle missing values, but not all do so in a manner useful to virtual attribute subsetting. For example, in ripper (Repeated Incremental Pruning to Produce Error Reduction), rules return negative if they require a missing value (Cohen, 1995). This means that making attributes unknown can easily result in all rules evaluating to negative, so that the default classification is returned. Early tests showed that this made virtual attribute subsetting perform poorly, but that the part rule set inductor, which trains decision nodes with the C4.5 decision tree learning algorithm (Frank & Witten, 1998), was more promising.

3.2 The virtual attribute subsetting algorithm

Bay (1998) applied attribute subsetting to a single nearest-neighbour base classifier. The only training step for nearest-neighbour classifiers is reading the training data. There is no benefit in reading the training data multiple times, so they were read only once. At classification time, multiple nearest-neighbour measurements were made on this single dataset, each measuring a subset of the attributes. This saved training time and storage space, while the results were identical to those obtainable by reading the data multiple times to create multiple base classifiers.

This approach can be generalized to other base classifiers, creating virtual attribute subsetting. At training time, a single base classifier is learnt on all of the training data and multiple attribute subsets are created, using one of the algorithms discussed below. To classify an instance, one copy of the instance is created for each subset. In each copy, the attributes missing from its subset are given the value unknown. The copies are then passed to the base classifier, and its predictions are combined to give an overall prediction. The process of copying instances and erasing their values at classification time is shown in fig. 3.1; this may be compared with attribute subsetting at training time using the same subsets, shown in an earlier figure. For most base classifiers, virtual attribute subsetting may be less accurate than standard attribute subsetting, but will use less training time and storage: specifically, it needs only the time and space required by the single base classifier.
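The classification step described above can be sketched in Python. This is a minimal illustration, not the implementation used in the experiments; the function names, the use of None for unknown values, and the toy Naïve Bayes base classifier are all assumptions made for the sketch.

```python
def classify_virtual(base_predict_proba, instance, subsets, n_classes):
    """Classify one instance under virtual attribute subsetting:
    pass one masked copy per subset to the single base classifier,
    then sum and normalize the returned class distributions."""
    combined = [0.0] * n_classes
    for subset in subsets:
        # Attributes outside this subset become unknown (None).
        masked = [v if i in subset else None for i, v in enumerate(instance)]
        probs = base_predict_proba(masked)
        combined = [c + p for c, p in zip(combined, probs)]
    total = sum(combined)
    return [c / total for c in combined]

def make_naive_bayes(priors, likelihoods):
    """Toy Naïve Bayes over discrete attributes that simply omits the
    term for any attribute whose value is unknown (None), as described
    in section 3.1."""
    def predict_proba(x):
        scores = []
        for c, prior in enumerate(priors):
            score = prior
            for i, v in enumerate(x):
                if v is not None:
                    score *= likelihoods[c][i][v]
            scores.append(score)
        z = sum(scores)
        return [s / z for s in scores]
    return predict_proba
```

Note that with a Naïve Bayes base classifier, masking an attribute has the same effect as never having trained on it, which is why standard and virtual attribute subsetting coincide in that case.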
Figure 3.1: Virtual attribute subsetting example: 3 copies of a 4-attribute instance created with attribute values erased

Three parameters of a virtual attribute subsetting classifier must be chosen: the subsets, the type of base classifier, and the means of combining predictions.

Subset choice

There are many ways to select multiple subsets of attributes. Some subsets have been chosen by hand, based on domain knowledge (Cherkauer, 1996; Sutton et al., 2005). Pseudorandom subsets have also been used, as in Ho (1998), where each subset contained 50% of the attributes. Pseudorandom subsets may be optimized by introspective subset choice: a portion of the training data is set aside for evaluation, and only those subsets which lead to accurate classifiers on the evaluation data are used. Introspective subset choice may be based on learning the optimal proportion of attributes for each dataset (Bay, 1998), or on generating many subsets and using only those which lead to accurate classifiers (Bryll et al., 2003). However, virtual attribute subsetting is intended to be a fast, generic technique, so the subsets should not be based on domain knowledge (which is not generic) or introspective subset choice (which is time-consuming). For this experiment, four types of pseudorandom subset choice were tested: random, classifier balanced, attribute balanced and both balanced. Sample subsets generated by all four subset choice algorithms are shown in table 3.1; the shaded cells show where balancing has been enforced.
Each subset generation algorithm receives three parameters: a, the number of attributes; s, the number of subsets to generate; and p, the desired proportion of attributes per subset.

Table 3.1: Examples of subsets created by different algorithms with a = 4, s = 5 and p = 0.7: (a) no balancing, (b) classifier balanced, (c) attribute balanced, (d) both balanced

Random subsets

This is the simplest algorithm. It iterates through the a·s attribute/subset pairs, randomly selecting a·s·p of them to include in the subsets.

Classifier balanced subsets

This algorithm chooses subsets such that each subset contains approximately a·p attributes. Since a·p may not be an integer, it rounds some subsets up and some down to bring the total number of attributes used as close to a·s·p as possible.
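The first two subset choice algorithms can be sketched as follows. This is a hedged illustration of the descriptions above, not the original implementation; the function names, parameter order and the use of Python's random module are assumptions.

```python
import random

def random_subsets(a, s, p, rng=random):
    """No balancing: pick round(a*s*p) of the a*s attribute/subset
    pairs uniformly at random."""
    pairs = [(attr, sub) for attr in range(a) for sub in range(s)]
    chosen = rng.sample(pairs, round(a * s * p))
    subsets = [set() for _ in range(s)]
    for attr, sub in chosen:
        subsets[sub].add(attr)
    return [sorted(ss) for ss in subsets]

def classifier_balanced_subsets(a, s, p, rng=random):
    """Classifier balanced: each subset gets about a*p attributes,
    with sizes rounded up or down so the total is as close to
    a*s*p as possible."""
    total = round(a * s * p)
    base, extra = divmod(total, s)          # `extra` subsets are one larger
    sizes = [base + 1] * extra + [base] * (s - extra)
    rng.shuffle(sizes)                      # distribute the rounding randomly
    return [sorted(rng.sample(range(a), k)) for k in sizes]
```

With a = 4, s = 5 and p = 0.7, both functions allocate 14 attribute slots in total, but only the second guarantees that every subset has 2 or 3 attributes.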
Attribute balanced subsets

This algorithm chooses subsets such that each attribute appears in approximately s·p subsets. Since s·p may not be an integer, it rounds some counts up and some down to bring the total number of attributes used as close to a·s·p as possible.

Both balanced subsets

Creating subsets that satisfy both the classifier and attribute balancing specified above is slightly more difficult. The algorithm can be illustrated by associating a line segment with each attribute; the length of each segment is the number of further times that attribute needs to be added to a subset. It is described in pseudocode below, and the process is illustrated in fig. 3.2. Note that this algorithm may create duplicate subsets; a version that created only unique subsets was tested, but this had no significant effect on accuracy.

    Determine the number of times each attribute should appear; if
    achieving the correct proportion requires varied attribute counts
    (some must be rounded up and some rounded down), randomly
    distribute the counts
    For each subset:
        Randomly arrange the attributes in a line
        selectorlength <- number of subsets remaining
        selectorpos <- random(0..selectorlength-1)
        While selectorpos lies adjacent to an attribute:
            Add that attribute to the subset
            selectorpos <- selectorpos + selectorlength
        Reduce the lengths of the chosen attributes by 1
Figure 3.2: Visualisation of steps in the both balanced subsets algorithm
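The both balanced pseudocode above can be made runnable. The following Python sketch follows the line-segment description: attribute i occupies a segment whose length is the number of further times it must be used, and a selector steps along the line with a stride equal to the number of subsets remaining. The names and the count-distribution details are assumptions, not the original implementation.

```python
import random

def both_balanced_subsets(a, s, p, rng=random):
    """Generate s subsets of a attributes, balanced so that subset
    sizes and per-attribute appearance counts each differ by at most
    one, totalling about a*s*p attribute slots."""
    total = round(a * s * p)
    # Each attribute should appear about s*p times; distribute the
    # rounding up/down randomly so the counts sum to `total`.
    base = total // a
    extra = total - base * a
    counts = [base + 1] * extra + [base] * (a - extra)
    rng.shuffle(counts)
    remaining = counts[:]        # further appearances needed per attribute
    subsets = []
    for subsets_left in range(s, 0, -1):
        # Lay the segments out in random order along a line.
        order = list(range(a))
        rng.shuffle(order)
        starts, pos = {}, 0
        for attr in order:
            starts[attr] = (pos, pos + remaining[attr])
            pos += remaining[attr]
        line_len = pos
        subset = set()
        sel = rng.randrange(subsets_left)
        while sel < line_len:
            # Add the attribute whose segment contains the selector.
            for attr in order:
                lo, hi = starts[attr]
                if lo <= sel < hi:
                    subset.add(attr)
                    break
            sel += subsets_left
        for attr in subset:
            remaining[attr] -= 1
        subsets.append(sorted(subset))
    return subsets
```

Because every segment is no longer than the stride, each attribute is added at most once per subset, and an attribute that must appear in every remaining subset (segment length equal to the stride) is guaranteed to be selected.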
Base classifiers

The effect of virtual attribute subsetting will vary depending upon the base classifier used. Three types of base classifier were tested in this experiment: Naïve Bayes, C4.5 decision trees, and part rule sets. Making an attribute value unknown makes Naïve Bayes ignore it, as if that entire attribute were not in the training data, so there should be no difference in output between standard and virtual attribute subsetting if both use Naïve Bayes as the base classifier. For C4.5 and part base classifiers, virtual attribute subsetting is unlikely to match the accuracy of standard attribute subsetting, as the classifiers used for each subset will not be appropriately independent. However, virtual attribute subsetting may still be more accurate than a single base classifier.

Combining predictions

There are many ways to combine the predictions of multiple classifiers. The method chosen for virtual attribute subsetting was to sum and normalize the class probability distributions of the base classifiers, giving an overall class probability distribution.

3.3 Method

As mentioned in section 2.3.5, all experiments here were carried out in weka, the Waikato Environment for Knowledge Analysis. Weka conversions of 31 datasets from the UCI Machine Learning Repository (Newman et al., 1998) were selected for testing. The dataset names are listed in tables 3.8, 3.9 and 3.10. Each test of a classifier on a dataset involved 10 repetitions of 10-fold cross-validation. Both standard and virtual attribute subsetting have three adjustable settings: subset choice algorithm (no balancing, classifier balanced, attribute balanced or both balanced), attribute proportion (a floating-point number from 0.0 to 1.0, although exactly 1.0 results in every subset containing every attribute) and number of subsets (any positive integer).
Preliminary tests showed that reasonable default settings are balancing = both balanced, attribute proportion = 0.8 and number of subsets = 10.
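The evaluation protocol above (repeated 10-fold cross-validation) can be sketched generically. This is an illustrative sketch only, not weka's evaluation code; the `train_fn(train) -> predict` interface and the (instance, label) data format are assumptions.

```python
import random

def cross_val_accuracy(train_fn, data, folds=10, repeats=10, rng=None):
    """Mean accuracy over `repeats` repetitions of `folds`-fold
    cross-validation, as in the experiments described above."""
    rng = rng or random.Random(0)
    accs = []
    for _ in range(repeats):
        order = data[:]
        rng.shuffle(order)
        for f in range(folds):
            test = order[f::folds]   # every `folds`-th instance is held out
            train = [d for i, d in enumerate(order) if i % folds != f]
            predict = train_fn(train)
            correct = sum(predict(x) == y for x, y in test)
            accs.append(correct / len(test))
    return sum(accs) / len(accs)
```

In the experiments this loop would be run once for the single base classifier and once for each attribute subsetting variant, and the per-dataset accuracies compared.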
The decision tree classifier used was J4.8, the weka implementation of C4.5. The rule set classifier was the weka implementation of part. All base classifier settings were left at their defaults; for both J4.8 and part, this meant pruning with a confidence factor of 0.25 and requiring a minimum of two instances per leaf.

3.4 Results

A virtual attribute subsetting classifier with a given base classifier may be considered to succeed if it is more accurate than a single classifier of the same type and settings: that is, if it improves accuracy for no significant increase in storage space or training time. This is the main comparison made here; standard attribute subsetting results are also shown, as they provide a probable upper bound on accuracy. Results are shaded if there are more wins than losses and the probability that this is due to chance, based on a one-tailed Fisher sign test (Weisstein, 1999), is less than 0.05.

Naïve Bayes

The ability of virtual attribute subsetting to yield exactly the same results as standard attribute subsetting when Naïve Bayes is the base classifier was verified, so the results shown apply to both standard and virtual attribute subsetting. Attribute subsetting usually only improved the accuracy of a Naïve Bayesian classifier when attributes were balanced, as in table 3.2. The best attribute proportion was 0.9, as shown in table 3.3.

Table 3.2: Wins/draws/losses for standard/virtual attribute subsetting with varying subset choice algorithms compared with a single Naïve Bayesian classifier

Table 3.3: Wins/draws/losses for standard/virtual attribute subsetting with varying proportion compared with a single Naïve Bayesian classifier
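The one-tailed sign test used for the shading criterion above is small enough to sketch directly. This is a generic illustration (draws are discarded, as is usual for the sign test); the function name is an assumption.

```python
from math import comb

def sign_test_p(wins, losses):
    """One-tailed sign test: the probability of observing at least
    `wins` wins out of wins + losses fair coin flips (draws excluded).
    A small value means the win count is unlikely to be chance."""
    n = wins + losses
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n

# e.g. sign_test_p(8, 2) -> 0.0546875, just above the 0.05 threshold
```

So, for example, 8 wins against 2 losses over the 31 datasets (with 21 draws) would narrowly fail the 0.05 criterion, while 9 wins against 1 loss would pass it.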
Decision trees

With J4.8 as the base classifier, standard attribute subsetting performed well with all subset algorithms, but virtual attribute subsetting was only effective when both attributes and classifiers were balanced, as shown in table 3.4. Both attribute subsetting classifiers were commonly more accurate with attribute proportions of 0.8 or 0.9, as in table 3.5. The accuracies achieved with the best settings are listed in table 3.9. The accuracy gains made were usually small.

Two additional experiments with J4.8 were undertaken but are not reported in detail here. The number of subsets was varied from 2 to 40, with no significant improvement in virtual attribute subsetting once the number of subsets was at least 6. Unpruned decision trees were also tested; virtual attribute subsetting using unpruned J4.8 as the base classifier performed well against a single unpruned J4.8 classifier, but poorly against a single pruned J4.8 classifier.

Table 3.4: Wins/draws/losses for standard/virtual attribute subsetting with varying subset choice algorithms compared with a single J4.8 classifier

Table 3.5: Wins/draws/losses for standard/virtual attribute subsetting with varying proportion compared with a single J4.8 classifier
Rule learning

Results for the part algorithm are similar to those for J4.8. Standard attribute subsetting gave good results, while virtual attribute subsetting was only effective when the attributes were balanced; balancing both attributes and classifiers brought no further improvement, as shown in table 3.6. Of the attribute proportions tested, attribute subsetting using part was commonly more accurate with proportions of 0.8 or 0.9, as in table 3.7. The accuracies achieved with the best settings are listed in table 3.10. Once again, the actual accuracy gains made were small.

Table 3.6: Wins/draws/losses for standard/virtual attribute subsetting with varying subset choice algorithms compared with a single part classifier

Table 3.7: Wins/draws/losses for standard/virtual attribute subsetting with varying proportion compared with a single part classifier
Training time

Since training for virtual attribute subsetting is limited to building subsets and training one classifier on the original training data, it was expected to take a similar training time to a single classifier. The standard attribute subsetting algorithm builds a classifier for each subset. Since the training data for each classifier have some attributes removed, the individual classifiers may take less time to learn than a single classifier (as there are fewer attributes to consider) or more time (as more steps may be needed to build an accurate classifier). To test this, the ratios of attribute subsetting training time to single-classifier training time were measured. As expected, virtual attribute subsetting took comparable training time to a single classifier, while standard attribute subsetting needed at least six times longer than both.

Classifier size

The size of a decision tree or rule set provides some measure of its complexity. Small classifiers may be too simple and underfit, while large classifiers may be needlessly complex and overfit. Under attribute subsetting, leaving out some attributes may lead the classifier learner to simpler classifiers, or may force it into complexity. The number of nodes in the J4.8 decision trees and the number of rules in the part rule sets were therefore compared in table 3.8. The numbers are averages across the 10-fold cross-validation; each fold's standard attribute subsetting result is also the mean of its 10 base classifiers. Virtual attribute subsetting is not listed, as it uses the single classifier.
The table shows that for J4.8 neither method leads to consistently larger trees, while attribute subsetting with part generally does learn larger rule sets.

Accuracy tables

The accuracies returned by the most accurate attribute subsetting and virtual attribute subsetting classifiers using J4.8 and part base classifiers are shown in tables 3.9 and 3.10 respectively, along with their win/draw/loss totals.
Table 3.8: Sizes of classifiers trained on the entire dataset and under standard attribute subsetting. "Greater than" and "less than" results are based on a simple count comparison and were measured before these figures were truncated for presentation.
Table 3.9: Percentage accuracy over the 31 datasets for J4.8 and standard/virtual attribute subsetting using J4.8 base classifiers. Wins and losses are based on a simple accuracy comparison and were measured before these figures were truncated for presentation.
Table 3.10: Percentage accuracy over the 31 datasets for part and standard/virtual attribute subsetting using part base classifiers. Wins and losses are based on a simple accuracy comparison and were measured before these figures were truncated for presentation.
3.5 Conclusions

This chapter introduced a new meta-classification technique, virtual attribute subsetting. Tests show that it is effective at improving the accuracy of decision tree and rule set classifiers on a variety of common classification datasets. If blind subset selection is used, the results suggest that the attribute subsets should be chosen with the both balanced subsets algorithm described above, with a subset proportion of 0.8.

This technique is discussed further in chapter 6. There, the confidence measures from Haar Classifier Cascades are modified by applying virtual attribute subsetting to the cascade stages, obtaining multiple classifications from a single cascade. The results presented there show the benefits of doing so.
More informationRipple Down Rule learner (RIDOR) Classifier for IRIS Dataset
Ripple Down Rule learner (RIDOR) Classifier for IRIS Dataset V.Veeralakshmi Department of Computer Science Bharathiar University, Coimbatore, Tamilnadu veeralakshmi13@gmail.com Dr.D.Ramyachitra Department
More informationExam Advanced Data Mining Date: Time:
Exam Advanced Data Mining Date: 11-11-2010 Time: 13.30-16.30 General Remarks 1. You are allowed to consult 1 A4 sheet with notes written on both sides. 2. Always show how you arrived at the result of your
More informationMODELLING DOCUMENT CATEGORIES BY EVOLUTIONARY LEARNING OF TEXT CENTROIDS
MODELLING DOCUMENT CATEGORIES BY EVOLUTIONARY LEARNING OF TEXT CENTROIDS J.I. Serrano M.D. Del Castillo Instituto de Automática Industrial CSIC. Ctra. Campo Real km.0 200. La Poveda. Arganda del Rey. 28500
More informationLogical Decision Rules: Teaching C4.5 to Speak Prolog
Logical Decision Rules: Teaching C4.5 to Speak Prolog Kamran Karimi and Howard J. Hamilton Department of Computer Science University of Regina Regina, Saskatchewan Canada S4S 0A2 {karimi,hamilton}@cs.uregina.ca
More informationInvestigation into the use of PCA with Machine Learning for the Identification of Narcotics based on Raman Spectroscopy
Investigation into the use of PCA with Machine Learning for the Identification of Narcotics based on Raman Spectroscopy Tom Howley, Michael G. Madden, Marie-Louise O Connell and Alan G. Ryder National
More information7. Decision or classification trees
7. Decision or classification trees Next we are going to consider a rather different approach from those presented so far to machine learning that use one of the most common and important data structure,
More informationSemi-Supervised Clustering with Partial Background Information
Semi-Supervised Clustering with Partial Background Information Jing Gao Pang-Ning Tan Haibin Cheng Abstract Incorporating background knowledge into unsupervised clustering algorithms has been the subject
More informationAssignment 4 (Sol.) Introduction to Data Analytics Prof. Nandan Sudarsanam & Prof. B. Ravindran
Assignment 4 (Sol.) Introduction to Data Analytics Prof. andan Sudarsanam & Prof. B. Ravindran 1. Which among the following techniques can be used to aid decision making when those decisions depend upon
More informationBack-to-Back Stem-and-Leaf Plots
Chapter 195 Back-to-Back Stem-and-Leaf Plots Introduction This procedure generates a stem-and-leaf plot of a batch of data. The stem-and-leaf plot is similar to a histogram and its main purpose is to show
More informationA Lazy Approach for Machine Learning Algorithms
A Lazy Approach for Machine Learning Algorithms Inés M. Galván, José M. Valls, Nicolas Lecomte and Pedro Isasi Abstract Most machine learning algorithms are eager methods in the sense that a model is generated
More informationEfficient Pairwise Classification
Efficient Pairwise Classification Sang-Hyeun Park and Johannes Fürnkranz TU Darmstadt, Knowledge Engineering Group, D-64289 Darmstadt, Germany {park,juffi}@ke.informatik.tu-darmstadt.de Abstract. Pairwise
More information.. Cal Poly CSC 466: Knowledge Discovery from Data Alexander Dekhtyar.. for each element of the dataset we are given its class label.
.. Cal Poly CSC 466: Knowledge Discovery from Data Alexander Dekhtyar.. Data Mining: Classification/Supervised Learning Definitions Data. Consider a set A = {A 1,...,A n } of attributes, and an additional
More informationPractical Data Mining COMP-321B. Tutorial 1: Introduction to the WEKA Explorer
Practical Data Mining COMP-321B Tutorial 1: Introduction to the WEKA Explorer Gabi Schmidberger Mark Hall Richard Kirkby July 12, 2006 c 2006 University of Waikato 1 Setting up your Environment Before
More informationMachine Learning. Cross Validation
Machine Learning Cross Validation Cross Validation Cross validation is a model evaluation method that is better than residuals. The problem with residual evaluations is that they do not give an indication
More informationA Fast Decision Tree Learning Algorithm
A Fast Decision Tree Learning Algorithm Jiang Su and Harry Zhang Faculty of Computer Science University of New Brunswick, NB, Canada, E3B 5A3 {jiang.su, hzhang}@unb.ca Abstract There is growing interest
More informationClassification. Slide sources:
Classification Slide sources: Gideon Dror, Academic College of TA Yaffo Nathan Ifill, Leicester MA4102 Data Mining and Neural Networks Andrew Moore, CMU : http://www.cs.cmu.edu/~awm/tutorials 1 Outline
More informationEvaluating Classifiers
Evaluating Classifiers Charles Elkan elkan@cs.ucsd.edu January 18, 2011 In a real-world application of supervised learning, we have a training set of examples with labels, and a test set of examples with
More informationNetwork Traffic Measurements and Analysis
DEIB - Politecnico di Milano Fall, 2017 Sources Hastie, Tibshirani, Friedman: The Elements of Statistical Learning James, Witten, Hastie, Tibshirani: An Introduction to Statistical Learning Andrew Ng:
More informationHow Learning Differs from Optimization. Sargur N. Srihari
How Learning Differs from Optimization Sargur N. srihari@cedar.buffalo.edu 1 Topics in Optimization Optimization for Training Deep Models: Overview How learning differs from optimization Risk, empirical
More informationPerformance Analysis of Data Mining Classification Techniques
Performance Analysis of Data Mining Classification Techniques Tejas Mehta 1, Dr. Dhaval Kathiriya 2 Ph.D. Student, School of Computer Science, Dr. Babasaheb Ambedkar Open University, Gujarat, India 1 Principal
More informationData Mining. 3.5 Lazy Learners (Instance-Based Learners) Fall Instructor: Dr. Masoud Yaghini. Lazy Learners
Data Mining 3.5 (Instance-Based Learners) Fall 2008 Instructor: Dr. Masoud Yaghini Outline Introduction k-nearest-neighbor Classifiers References Introduction Introduction Lazy vs. eager learning Eager
More informationUnsupervised Discretization using Tree-based Density Estimation
Unsupervised Discretization using Tree-based Density Estimation Gabi Schmidberger and Eibe Frank Department of Computer Science University of Waikato Hamilton, New Zealand {gabi, eibe}@cs.waikato.ac.nz
More informationAssignment 1: CS Machine Learning
Assignment 1: CS7641 - Machine Learning Saad Khan September 18, 2015 1 Introduction I intend to apply supervised learning algorithms to classify the quality of wine samples as being of high or low quality
More informationAssociation Rule Mining and Clustering
Association Rule Mining and Clustering Lecture Outline: Classification vs. Association Rule Mining vs. Clustering Association Rule Mining Clustering Types of Clusters Clustering Algorithms Hierarchical:
More informationProbabilistic Abstraction Lattices: A Computationally Efficient Model for Conditional Probability Estimation
Probabilistic Abstraction Lattices: A Computationally Efficient Model for Conditional Probability Estimation Daniel Lowd January 14, 2004 1 Introduction Probabilistic models have shown increasing popularity
More informationComparing Univariate and Multivariate Decision Trees *
Comparing Univariate and Multivariate Decision Trees * Olcay Taner Yıldız, Ethem Alpaydın Department of Computer Engineering Boğaziçi University, 80815 İstanbul Turkey yildizol@cmpe.boun.edu.tr, alpaydin@boun.edu.tr
More informationModel Selection and Assessment
Model Selection and Assessment CS4780/5780 Machine Learning Fall 2014 Thorsten Joachims Cornell University Reading: Mitchell Chapter 5 Dietterich, T. G., (1998). Approximate Statistical Tests for Comparing
More informationMachine Learning Techniques for Data Mining
Machine Learning Techniques for Data Mining Eibe Frank University of Waikato New Zealand 10/25/2000 1 PART V Credibility: Evaluating what s been learned 10/25/2000 2 Evaluation: the key to success How
More informationUnivariate Margin Tree
Univariate Margin Tree Olcay Taner Yıldız Department of Computer Engineering, Işık University, TR-34980, Şile, Istanbul, Turkey, olcaytaner@isikun.edu.tr Abstract. In many pattern recognition applications,
More informationClassification Using Unstructured Rules and Ant Colony Optimization
Classification Using Unstructured Rules and Ant Colony Optimization Negar Zakeri Nejad, Amir H. Bakhtiary, and Morteza Analoui Abstract In this paper a new method based on the algorithm is proposed to
More informationPart I. Instructor: Wei Ding
Classification Part I Instructor: Wei Ding Tan,Steinbach, Kumar Introduction to Data Mining 4/18/2004 1 Classification: Definition Given a collection of records (training set ) Each record contains a set
More informationData Mining and Knowledge Discovery: Practice Notes
Data Mining and Knowledge Discovery: Practice Notes Petra Kralj Novak Petra.Kralj.Novak@ijs.si 2016/11/16 1 Keywords Data Attribute, example, attribute-value data, target variable, class, discretization
More informationI211: Information infrastructure II
Data Mining: Classifier Evaluation I211: Information infrastructure II 3-nearest neighbor labeled data find class labels for the 4 data points 1 0 0 6 0 0 0 5 17 1.7 1 1 4 1 7.1 1 1 1 0.4 1 2 1 3.0 0 0.1
More informationRank Measures for Ordering
Rank Measures for Ordering Jin Huang and Charles X. Ling Department of Computer Science The University of Western Ontario London, Ontario, Canada N6A 5B7 email: fjhuang33, clingg@csd.uwo.ca Abstract. Many
More informationFeature Selection with Decision Tree Criterion
Feature Selection with Decision Tree Criterion Krzysztof Grąbczewski and Norbert Jankowski Department of Computer Methods Nicolaus Copernicus University Toruń, Poland kgrabcze,norbert@phys.uni.torun.pl
More informationData Mining. Practical Machine Learning Tools and Techniques. Slides for Chapter 3 of Data Mining by I. H. Witten, E. Frank and M. A.
Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 3 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Output: Knowledge representation Tables Linear models Trees Rules
More informationNearest Neighbor Classifiers
Nearest Neighbor Classifiers TNM033 Data Mining Techniques Linköping University 2009-12-04 When I see a bird that walks like a duck and swims like a duck and quacks like a duck, I call that bird a duck.
More informationA Two Stage Zone Regression Method for Global Characterization of a Project Database
A Two Stage Zone Regression Method for Global Characterization 1 Chapter I A Two Stage Zone Regression Method for Global Characterization of a Project Database J. J. Dolado, University of the Basque Country,
More informationImproving Naïve Bayes Classifier for Software Architecture Reconstruction
Improving Naïve Bayes Classifier for Software Architecture Reconstruction Zahra Sadri Moshkenani Faculty of Computer Engineering Najafabad Branch, Islamic Azad University Isfahan, Iran zahra_sadri_m@sco.iaun.ac.ir
More informationAlgorithms: Decision Trees
Algorithms: Decision Trees A small dataset: Miles Per Gallon Suppose we want to predict MPG From the UCI repository A Decision Stump Recursion Step Records in which cylinders = 4 Records in which cylinders
More informationCSC411/2515 Tutorial: K-NN and Decision Tree
CSC411/2515 Tutorial: K-NN and Decision Tree Mengye Ren csc{411,2515}ta@cs.toronto.edu September 25, 2016 Cross-validation K-nearest-neighbours Decision Trees Review: Motivation for Validation Framework:
More informationWeka ( )
Weka ( http://www.cs.waikato.ac.nz/ml/weka/ ) The phases in which classifier s design can be divided are reflected in WEKA s Explorer structure: Data pre-processing (filtering) and representation Supervised
More informationEnhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques
24 Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques Enhancing Forecasting Performance of Naïve-Bayes Classifiers with Discretization Techniques Ruxandra PETRE
More informationImpact of Boolean factorization as preprocessing methods for classification of Boolean data
Impact of Boolean factorization as preprocessing methods for classification of Boolean data Radim Belohlavek, Jan Outrata, Martin Trnecka Data Analysis and Modeling Lab (DAMOL) Dept. Computer Science,
More informationUsing Google s PageRank Algorithm to Identify Important Attributes of Genes
Using Google s PageRank Algorithm to Identify Important Attributes of Genes Golam Morshed Osmani Ph.D. Student in Software Engineering Dept. of Computer Science North Dakota State Univesity Fargo, ND 58105
More informationName Period Date. REAL NUMBER SYSTEM Student Pages for Packet 3: Operations with Real Numbers
Name Period Date REAL NUMBER SYSTEM Student Pages for Packet : Operations with Real Numbers RNS. Rational Numbers Review concepts of experimental and theoretical probability. a Understand why all quotients
More information