Weka ( )
- Kelley Higgins
- 6 years ago
1 Weka ( )
The phases into which a classifier's design can be divided are reflected in the structure of WEKA's Explorer:
- Data pre-processing (filtering) and representation
- Supervised learning/classification, or unsupervised learning/clustering, or rule-based classification
- Visualization
2 Introduction to Weka
Online course on WEKA ( ): 5 lessons of 6 modules each (7-10 minutes + exercises), about 15 hours of total engagement including time for assignments. Lessons are also accessible on YouTube.
3 Weka
WEKA lets one choose among a large number of modules that implement the main steps needed to develop a classification/data mining application (filters, classifiers, clustering algorithms, graphical analysis of results).
Modules can be accessed:
- using a command-line interface
- through one of the GUIs provided by Weka
- within one's own code, as functions of a Java package
4 Weka
WEKA includes three GUIs:
- Explorer: interactive pre-processing, data classification and algorithm configuration
- Experimenter: development of applications as cascaded modules, and environment for organizing multiple executions of algorithms (on different data sets or using different random seeds), with final generation of a summary of the results
- Knowledge Flow: GUI for the visual building of an application
as well as a Command-Line Interface from which the different modules can be run as independent applications.
5 Introduction to Weka
Explorer
- Main functions and options
- An example: classification of 2-dimensional data (easy to visualize!)
Bayes Classifiers
- Naïve Bayes: though it is based on the assumption that the features are independent of one another, which is very rarely true, it often performs reasonably well and is very often taken as the «baseline reference» for more sophisticated classifiers.
6 An example
Problem: distinguish the points belonging to:
1. the circle of radius 1, centered at (0,0)
2. the square with sides of length L=2.5, centered at (0,0), from which region 1 has been removed.
Classify the dataset saved in the files circletrain.arff (training set) and circletest.arff (test set).
This is a binary problem: one needs to distinguish between patterns belonging to 2 classes.
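A minimal sketch of how such a data set could be generated, assuming points are drawn uniformly inside the square. The function names, the seed and the sample size are illustrative assumptions; this is not the procedure used to build circletrain.arff or circletest.arff.

```python
import math
import random

SIDE = 2.5      # side of the square, centred at (0, 0)
RADIUS = 1.0    # radius of the circle, centred at (0, 0)

def label(x, y):
    """Return 'circle' for points inside the circle, 'square' otherwise."""
    return "circle" if x * x + y * y <= RADIUS ** 2 else "square"

def sample(n, seed=0):
    """Draw n points uniformly in the square and label each one (assumed sampling scheme)."""
    rng = random.Random(seed)
    half = SIDE / 2
    points = []
    for _ in range(n):
        x, y = rng.uniform(-half, half), rng.uniform(-half, half)
        points.append((x, y, label(x, y)))
    return points

# Expected circle/square ratio: area of the circle over the remaining area of the square.
expected_ratio = math.pi * RADIUS ** 2 / (SIDE ** 2 - math.pi * RADIUS ** 2)

data = sample(100)
n_circle = sum(1 for _, _, c in data if c == "circle")
print(f"expected circle/square ratio = {expected_ratio:.2f}")
print(f"sampled circle/square counts = {n_circle}/{len(data) - n_circle}")
```

Changing the seed changes the sampled class counts, which is one way to see how much a single random drawing can deviate from the expected ratio.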
7 N-fold Crossvalidation
Recalling what we said in the previous lesson, the classical learning scheme based on training/(validation)/test sets may still provide very inaccurate estimates of a classifier's performance, because of:
- Too few data
- Data unevenly distributed among classes and between the training and test sets, with some over- or under-represented class
- Presence of outliers
8 N-fold Crossvalidation
In the circle/square example the training and test sets have been sampled under the best possible conditions:
- Samples drawn randomly according to a uniform distribution for both the x and y coordinates => x and y are statistically independent (that is why the naïve Bayes classifier works reasonably well)
- No noise added to the features (x and y)
- No wrong/noisy expected output/label (aka teaching input)
9 N-fold Crossvalidation
Still, we have 53/47 (circle/square) training patterns and 60/40 (circle/square) test patterns, versus an expected ratio of $\pi/(6.25-\pi) \approx 3.14/3.11 \approx 1.01$.
- It is no surprise that we obtain different results on the training and test sets.
- Different random drawings of the training and test sets also lead to rather different estimates of the classifier's performance.
10 N-fold Crossvalidation
What improvements can we introduce?
- Stratification: the drawing is made such that the distribution of samples of the different classes is the same in each set.
- This solves one problem, but cannot do much when data are few and different drawings can apparently tell different stories.
- In particular, the smaller the sample size, the stronger the impact of a single outlier on the estimates.
11 N-fold Crossvalidation
After all:
- We have just a small data set.
- If we use all data for training, we will definitely overfit the data.
So what can we do if we want to get the most from our data?
12 N-fold Crossvalidation
- Divide the data set into N subsets.
- N times: leave one subset out and use it as the test set, training a classifier on the other N-1 subsets as the training set.
- Estimate the performance of the classifier over the union of the results obtained on the N disjoint test sets.
- Extreme case: leave-one-out, in which one has as many folds as samples.
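The procedure can be sketched in a few lines of plain Python. Everything below is an illustrative assumption rather than WEKA's implementation: the helper names, the toy one-dimensional data and the threshold learner are hypothetical, and no stratification is applied.

```python
import random

def n_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle the sample indices and split them into n_folds disjoint folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[k::n_folds] for k in range(n_folds)]

def cross_validate(X, y, n_folds, train_fn, predict_fn, seed=0):
    """Return the accuracy computed over the union of the N disjoint test folds."""
    folds = n_fold_indices(len(X), n_folds, seed)
    correct = 0
    for k, test_idx in enumerate(folds):
        # Train on the other N-1 folds, test on the fold left out.
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        model = train_fn([X[i] for i in train_idx], [y[i] for i in train_idx])
        correct += sum(predict_fn(model, X[i]) == y[i] for i in test_idx)
    return correct / len(X)

# Toy learner (hypothetical): 1-D threshold halfway between the two class means.
def train_threshold(X, y):
    x0 = [x for x, c in zip(X, y) if c == 0]
    x1 = [x for x, c in zip(X, y) if c == 1]
    return (sum(x0) / len(x0) + sum(x1) / len(x1)) / 2

def predict_threshold(threshold, x):
    return 1 if x > threshold else 0

X = [0.1, 0.3, 0.2, 0.4, 1.6, 1.8, 1.7, 1.9, 0.0, 2.0]
y = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]

accuracy_5_fold = cross_validate(X, y, 5, train_threshold, predict_threshold)
accuracy_loo = cross_validate(X, y, len(X), train_threshold, predict_threshold)  # leave-one-out
print(accuracy_5_fold, accuracy_loo)
```

Setting `n_folds` equal to the number of samples gives the leave-one-out case. Note that N models are trained and discarded: only the pooled test results are kept.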
13 N-fold Crossvalidation
Observations:
- All data are used for testing, each pattern by a classifier that was not trained on it.
- The procedure can be used only to evaluate the method (estimate a classifier's error, compare it to others), not to design a classifier: we have produced N different classifiers, none of which has a reason for being preferred to the others.
14 Quality criteria for binary classifiers (see also Prof. Bononi's notes on Bayesian Prediction)
The decision taken by a binary classifier produces one of four possible outcomes:
- True Positive (TP): decision = 1, correct decision C = 1
- True Negative (TN): decision = 0, correct decision C = 0
- False Positive (FP): decision = 1, correct decision C = 0
- False Negative (FN): decision = 0, correct decision C = 1
15 Quality criteria for binary classifiers (see also Prof. Bononi's notes on Bayesian Prediction)
Given a data set with $N_d$ patterns to which one applies a binary classifier, $TP + TN + FP + FN = N_d$.
- Accuracy: $100 \cdot (TP+TN)/(TP+TN+FP+FN) = 100 \cdot (TP+TN)/N_d$. Estimate of the probability of a correct decision (correct decision rate).
- Recall / Sensitivity / TP rate: $100 \cdot TP/(TP+FN)$. Estimate of $P(1 \mid C=1)$: correct decision rate when C = 1.
- Specificity = 1 - FP rate: $100 \cdot TN/(TN+FP)$. Estimate of $P(0 \mid C=0)$: correct decision rate when C = 0.
- Precision / Positive predictivity: $100 \cdot TP/(TP+FP)$. Estimate of $P(C=1 \mid 1)$: fraction of positive decisions that are correct.
- F-Measure: $2 \cdot Precision \cdot Recall / (Precision + Recall)$. Global performance index; F-Measure $\in [0,1]$ when Precision and Recall are expressed as fractions rather than percentages.
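As a worked example of these definitions, the sketch below computes the indices from the four counts. The function name and the counts are hypothetical; the values are returned as fractions in [0, 1], to be multiplied by 100 for percentages.

```python
def binary_metrics(tp, tn, fp, fn):
    """Quality indices of a binary classifier, as fractions in [0, 1].

    Assumes every denominator is non-zero (at least one positive pattern,
    one negative pattern and one positive decision).
    """
    n_d = tp + tn + fp + fn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / n_d,
        "recall": recall,                     # sensitivity, TP rate
        "specificity": tn / (tn + fp),        # 1 - FP rate
        "precision": precision,
        "f_measure": 2 * precision * recall / (precision + recall),
    }

# Hypothetical counts: 50 positive and 50 negative patterns.
m = binary_metrics(tp=40, tn=45, fp=5, fn=10)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```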
16 Quality criteria for binary classifiers (see also Prof. Bononi's notes on Bayesian Prediction)
The Confusion Matrix (CM) associated with an N-class classifier is a square NxN matrix whose element $A_{ij}$ represents the number (or the frequency, if normalized by the number of samples of class i) of patterns belonging to class i that are classified as belonging to class j.
Properties:
- The CM associated with an ideal classifier is diagonal.
- $\sum_j A_{ij}$ equals the number of patterns belonging to class i (or 1, if frequencies are used).
- Using the elements $A_{ij}$ it is possible to compute the quality indices for a binary classifier whose decision is "i / not i":
  $TP_i = A_{ii}$ (Recall, if frequencies are used)
  $FN_i = \sum_j A_{ij} (1 - \delta_{ij})$
  (Homework: derive $TN_i$ and $FP_i$)
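The two indices given explicitly above can be read off a confusion matrix as in the following sketch. The 3-class matrix of counts and the function names are hypothetical; $TN_i$ and $FP_i$ are deliberately left out, since they are the homework.

```python
def per_class_tp_fn(cm):
    """For each class i return (TP_i, FN_i) from a confusion matrix of counts.

    cm[i][j] = number of patterns of class i classified as class j.
    """
    result = []
    for i, row in enumerate(cm):
        tp = row[i]                # TP_i = A_ii
        fn = sum(row) - row[i]     # FN_i = sum_j A_ij (1 - delta_ij)
        result.append((tp, fn))
    return result

def row_normalize(cm):
    """Turn counts into frequencies by dividing each row by its class size."""
    return [[a / sum(row) for a in row] for row in cm]

# Hypothetical counts for a 3-class problem.
cm = [
    [50, 3, 2],
    [4, 40, 6],
    [1, 2, 42],
]
tp_fn = per_class_tp_fn(cm)
freq = row_normalize(cm)
print(tp_fn)
print([round(freq[i][i], 3) for i in range(len(cm))])  # per-class recall
```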
17-18 Confusion Matrix (counts): rows = true class (C1 to C4), columns = decision (C1 to C4), last column = $N_i$ (number of patterns of class i).
19 Confusion Matrix (%)
          Decision
Class     C1     C2     C3     C4
C1        95%    3%     1%     1%
C2        4%     89%    2%     5%
C3        2%     2%     96%    0%
C4        4%     4%     1%     91%
20 Confusion Matrix (Binary Classifier)
          Decision
Class     1      0
1         TP     FN
0         FP     TN
Normalizing row-wise by Np = TP + FN and Nn = FP + TN, respectively, one obtains the TP (FN, FP, TN) rates.
21 ROC (receiver operating characteristic) curve
The general goal for a classifier is to maximize accuracy. However, that single figure does not tell the whole story: only if accuracy is close to 1 can we be sure that the classifier is actually good.
As an example, consider a classifier that must recognize the digit 8 (positive) against all other digits (negatives). If the sample is uniform over the ten digits, the priors are P(n) = 0.9 and P(p) = 0.1.
So a classifier that always decides 0 has an accuracy of 90%, but it is definitely not a good classifier!
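The point can be checked numerically with a small sketch. The sample of 1000 patterns is hypothetical and simply reproduces the 0.1/0.9 priors.

```python
# Hypothetical sample reproducing the priors: 10% positives, 90% negatives.
labels = [1] * 100 + [0] * 900
decisions = [0] * len(labels)          # classifier that always decides 0

tp = sum(d == 1 and c == 1 for d, c in zip(decisions, labels))
tn = sum(d == 0 and c == 0 for d, c in zip(decisions, labels))
fp = sum(d == 1 and c == 0 for d, c in zip(decisions, labels))
fn = sum(d == 0 and c == 1 for d, c in zip(decisions, labels))

accuracy = (tp + tn) / len(labels)
tp_rate = tp / (tp + fn)               # sensitivity / recall
specificity = tn / (tn + fp)

print(f"accuracy = {accuracy:.2f}, TP rate = {tp_rate:.2f}, specificity = {specificity:.2f}")
```

The accuracy looks respectable, while the TP rate shows that not a single positive is ever recognized.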
22 ROC (receiver operating characteristic) curve
In terms of optimization, designing a classifier is a multi-objective optimization problem in which two criteria must be maximized, Sensitivity (Recall) and Specificity, i.e.:
- Maximize TP (minimize FN)
- Minimize FP (maximize TN)
23 ROC (receiver operating characteristic) curve
ROC curves are used in signal detection to show the tradeoff between hit rate (TP rate) and false alarm rate (FP rate) over noisy channels. They were defined when radar theory was first developed.
- The y axis represents the percentage of true positives in the sample.
- The x axis represents the percentage of false positives in the sample.
The curve is an estimate of the actual ROC curve, for which:
- the y axis represents $P(\hat{y}=1 \mid y=1)$ (Sensitivity / TP rate)
- the x axis represents $P(\hat{y}=1 \mid y=0)$ (FP rate)
- the slope is equal to the likelihood ratio used in the Bayes decision rule (see Prof. Bononi's slides, page B13)
24 (R)OC (receiver operating characteristic) curve
What we obtain, formally, is not the ROC, but the so-called OC (operating characteristic) curve.
In fact, there are cases (see Duda-Hart, page ), e.g., with multi-modal distributions, in which using a single threshold is not the right way to go for receiver/classifier design.
Still, a curve (which may not even be convex) can always be built according to the same rules.
25 (R)OC (receiver operating characteristic) curve
Imagine the following processing chain is used:
signal -> Preprocessing -> V -> Thresholding -> 0/1
The curve is obtained by plotting the points (FP rate, TP rate) one would obtain by sweeping the threshold from $+\infty$ ($\hat{y} = 0$ for all patterns, FP rate = TP rate = 0) to $-\infty$ ($\hat{y} = 1$ for all patterns, FP rate = TP rate = 1).
26 (R)OC (receiver operating characteristic) curve
How can we estimate the curve if we cannot set the threshold and have no explicit model of the classifier's behaviour? We can use the results obtained on the test set.
Supposing the classifier (as most do) outputs, for each pattern, a value V that is closer to 1 the more likely the pattern is to be positive, one can:
1. Sort the values $V_i$ of all test patterns in descending order
2. Starting from (0,0), for each pattern i in that order:
   - move up by 1/Np if $y_i = 1$
   - move right by 1/Nn if $y_i = 0$
where Np and Nn are the numbers of positive and negative patterns in the test set, so that the curve ends at (1,1).
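The two steps can be sketched as follows. The function name, the scores and the labels are hypothetical, and tied scores are not treated specially: each pattern produces its own step.

```python
def roc_points(scores, labels):
    """Return the (FP rate, TP rate) points of the empirical curve.

    scores[i] is the classifier output V_i, labels[i] is 1 (positive) or 0 (negative).
    Assumes at least one positive and one negative pattern.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Step 1: sort the patterns by decreasing score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Step 2: move up for each positive, right for each negative.
    tp, fp = 0, 0
    points = [(0.0, 0.0)]
    for i in order:
        if labels[i] == 1:
            tp += 1
        else:
            fp += 1
        points.append((fp / n_neg, tp / n_pos))
    return points

# Hypothetical classifier outputs on a six-pattern test set.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
for fp_rate, tp_rate in points:
    print(f"FP rate = {fp_rate:.2f}, TP rate = {tp_rate:.2f}")
```

Counting steps and dividing at the end, rather than accumulating 1/Np and 1/Nn, keeps the last point exactly at (1, 1).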
27 (R)OC (receiver operating characteristic) curve If we use data from a single test set, we will obtain a jagged curve; if we use cross-validation the curve should be smoother.
28 (R)OC (receiver operating characteristic) curve
Using, for instance, logistic regression, the estimated probability that an observation is positive is the corresponding value of the logistic function. All observations with a value > 0.5 (located to the right of the dotted line) are classified as 1.
29 (R)OC (receiver operating characteristic) curve
All points to the right of the dotted line are classified as positive, all points to its left are classified as negative.
- o's to the left = TN
- x's to the right = TP
- o's to the right = FP
- x's to the left = FN
30 (R)OC (receiver operating characteristic) curve If we shift the logistic function to the extreme right the dotted line will shift with it and the classifier will always output 0. If we then start sweeping the curve leftwards, the likeliest observations will start to be classified as positive, so the OC curve will grow vertically whenever a positive (x) crosses the dotted line and horizontally whenever a negative (o) does.
31 (R)OC (receiver operating characteristic) curve Reasoning with distributions:
32 (R)OC (receiver operating characteristic) curve
The area under the curve (AUC) thus constructed is a quality index for the classifier: the closer it is to one, the better.
In fact, an ideal classifier separates positives from negatives perfectly. In that case, the OC curve follows two sides of the unit square: it first grows only vertically until it reaches (FP rate, TP rate) = (0,1), then only horizontally until it reaches (1,1), so that the area under it is 1.
Homework: write a simple program (very easy in Matlab) to compute the AUC for a classifier.
33 Quality criteria for binary classifiers
Kappa statistic (used by WEKA)
Given two confusion matrices for a 3-class problem: actual predictor (left) vs. random predictor (right)
- Number of successes: sum of the entries on the diagonal (D)
- Random predictor: $P(a \mid x) = P(a)$, $P(b \mid x) = P(b)$, $P(c \mid x) = P(c)$
- Kappa statistic: $\kappa = (D_{observed} - D_{random}) / (D_{perfect} - D_{random})$, which measures the relative improvement over the random predictor
- Most probably, the same value is obtained if AUC is used instead of D.
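A minimal sketch of the computation, under one loudly stated assumption: the random predictor is taken to assign each class as often as the actual predictor does (same column totals), so that its expected diagonal entry for class i is (row total i x column total i) / N. The function name and the counts are illustrative and are not recovered from the slide.

```python
def kappa(cm):
    """Kappa statistic from a confusion matrix of counts.

    cm[i][j] = number of patterns of class i classified as class j.
    Assumption: the random predictor keeps the column totals of cm.
    """
    k = len(cm)
    n = sum(sum(row) for row in cm)
    d_observed = sum(cm[i][i] for i in range(k))
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(row[j] for row in cm) for j in range(k)]
    d_random = sum(r * c for r, c in zip(row_totals, col_totals)) / n
    d_perfect = n                      # a perfect predictor gets every pattern right
    return (d_observed - d_random) / (d_perfect - d_random)

# Illustrative counts for a 3-class problem with 200 patterns.
cm = [
    [88, 10, 2],
    [14, 40, 6],
    [18, 10, 12],
]
k_value = kappa(cm)
print(f"kappa = {k_value:.3f}")
```

With these counts, 140 patterns lie on the diagonal against 82 expected for the random predictor, so kappa is 58/118.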
34 Quality criteria for binary classifiers (used by WEKA)
Difference: error measures
Actual target values: $a_1, a_2, \ldots, a_n$
Predicted target values: $p_1, p_2, \ldots, p_n$
- Mean Squared error: $\frac{(p_1 - a_1)^2 + \ldots + (p_n - a_n)^2}{n}$
- Root Mean Squared (RMS) error: $\sqrt{\frac{(p_1 - a_1)^2 + \ldots + (p_n - a_n)^2}{n}}$
Mostly relevant for regression problems (continuous target values).
35 Quality criteria for binary classifiers (used by WEKA)
Difference: error measures
Actual target values: $a_1, a_2, \ldots, a_n$
Predicted target values: $p_1, p_2, \ldots, p_n$
- Mean Absolute error: $\frac{|p_1 - a_1| + \ldots + |p_n - a_n|}{n}$
Less sensitive to outliers.
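36 Quality criteria for binary classifiers (used by WEKA)
Relative errors ($\bar{a}$ = average of $a_1, \ldots, a_n$)
- Relative Absolute error: $\frac{|p_1 - a_1| + \ldots + |p_n - a_n|}{|\bar{a} - a_1| + \ldots + |\bar{a} - a_n|}$
- Relative Squared error: $\frac{(p_1 - a_1)^2 + \ldots + (p_n - a_n)^2}{(\bar{a} - a_1)^2 + \ldots + (\bar{a} - a_n)^2}$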
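The difference-based and relative error measures can be computed together, as in this sketch. The function name and the four target/prediction pairs are hypothetical, and the relative errors are returned as plain ratios rather than percentages.

```python
import math

def error_measures(predicted, actual):
    """Difference-based error measures for numeric targets.

    Assumes the actual values are not all identical, otherwise the
    denominators of the relative errors are zero.
    """
    n = len(actual)
    a_mean = sum(actual) / n
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    absolute = sum(abs(p - a) for p, a in zip(predicted, actual))
    return {
        "mse": squared / n,
        "rmse": math.sqrt(squared / n),
        "mae": absolute / n,
        # Relative errors compare the predictor with one that always outputs the mean.
        "rae": absolute / sum(abs(a_mean - a) for a in actual),
        "rse": squared / sum((a_mean - a) ** 2 for a in actual),
    }

# Hypothetical actual and predicted target values.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 5.0]
errors = error_measures(predicted, actual)
for name, value in errors.items():
    print(f"{name}: {value:.4f}")
```

A relative error below 1 means the predictor does better than always predicting the mean of the actual values.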
37 Suggested readings/videos
- First two units of the online course (6+6 modules) on Weka: they include a revision of some concepts we introduced in the previous lesson.
- Slides associated with chapter 5 of the book Data Mining by Witten, Frank, and Hall, available online on our course site.
More informationMini-project 2 CMPSCI 689 Spring 2015 Due: Tuesday, April 07, in class
Mini-project 2 CMPSCI 689 Spring 2015 Due: Tuesday, April 07, in class Guidelines Submission. Submit a hardcopy of the report containing all the figures and printouts of code in class. For readability
More informationCSE 158. Web Mining and Recommender Systems. Midterm recap
CSE 158 Web Mining and Recommender Systems Midterm recap Midterm on Wednesday! 5:10 pm 6:10 pm Closed book but I ll provide a similar level of basic info as in the last page of previous midterms CSE 158
More informationSupervised vs unsupervised clustering
Classification Supervised vs unsupervised clustering Cluster analysis: Classes are not known a- priori. Classification: Classes are defined a-priori Sometimes called supervised clustering Extract useful
More informationML4Bio Lecture #1: Introduc3on. February 24 th, 2016 Quaid Morris
ML4Bio Lecture #1: Introduc3on February 24 th, 216 Quaid Morris Course goals Prac3cal introduc3on to ML Having a basic grounding in the terminology and important concepts in ML; to permit self- study,
More informationModel s Performance Measures
Model s Performance Measures Evaluating the performance of a classifier Section 4.5 of course book. Taking into account misclassification costs Class imbalance problem Section 5.7 of course book. TNM033:
More informationThe Explorer. chapter Getting started
chapter 10 The Explorer Weka s main graphical user interface, the Explorer, gives access to all its facilities using menu selection and form filling. It is illustrated in Figure 10.1. There are six different
More informationChuck Cartledge, PhD. 23 September 2017
Introduction K-Nearest Neighbors Na ıve Bayes Hands-on Q&A Conclusion References Files Misc. Big Data: Data Analysis Boot Camp Classification with K-Nearest Neighbors and Na ıve Bayes Chuck Cartledge,
More informationAssignment 4 (Sol.) Introduction to Data Analytics Prof. Nandan Sudarsanam & Prof. B. Ravindran
Assignment 4 (Sol.) Introduction to Data Analytics Prof. andan Sudarsanam & Prof. B. Ravindran 1. Which among the following techniques can be used to aid decision making when those decisions depend upon
More informationECE 5470 Classification, Machine Learning, and Neural Network Review
ECE 5470 Classification, Machine Learning, and Neural Network Review Due December 1. Solution set Instructions: These questions are to be answered on this document which should be submitted to blackboard
More informationEvaluating Machine Learning Methods: Part 1
Evaluating Machine Learning Methods: Part 1 CS 760@UW-Madison Goals for the lecture you should understand the following concepts bias of an estimator learning curves stratified sampling cross validation
More informationLecture 7: Decision Trees
Lecture 7: Decision Trees Instructor: Outline 1 Geometric Perspective of Classification 2 Decision Trees Geometric Perspective of Classification Perspective of Classification Algorithmic Geometric Probabilistic...
More informationPractice Exam Sample Solutions
CS 675 Computer Vision Instructor: Marc Pomplun Practice Exam Sample Solutions Note that in the actual exam, no calculators, no books, and no notes allowed. Question 1: out of points Question 2: out of
More informationCPSC 340: Machine Learning and Data Mining. Non-Parametric Models Fall 2016
CPSC 340: Machine Learning and Data Mining Non-Parametric Models Fall 2016 Assignment 0: Admin 1 late day to hand it in tonight, 2 late days for Wednesday. Assignment 1 is out: Due Friday of next week.
More informationS2 Text. Instructions to replicate classification results.
S2 Text. Instructions to replicate classification results. Machine Learning (ML) Models were implemented using WEKA software Version 3.8. The software can be free downloaded at this link: http://www.cs.waikato.ac.nz/ml/weka/downloading.html.
More informationEvaluating Machine-Learning Methods. Goals for the lecture
Evaluating Machine-Learning Methods Mark Craven and David Page Computer Sciences 760 Spring 2018 www.biostat.wisc.edu/~craven/cs760/ Some of the slides in these lectures have been adapted/borrowed from
More informationTutorial on Machine Learning Tools
Tutorial on Machine Learning Tools Yanbing Xue Milos Hauskrecht Why do we need these tools? Widely deployed classical models No need to code from scratch Easy-to-use GUI Outline Matlab Apps Weka 3 UI TensorFlow
More informationInformation Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 20: 10/12/2015 Data Mining: Concepts and Techniques (3 rd ed.) Chapter
More informationCSE 190, Spring 2015: Midterm
CSE 190, Spring 2015: Midterm Name: Student ID: Instructions Hand in your solution at or before 7:45pm. Answers should be written directly in the spaces provided. Do not open or start the test before instructed
More informationPart II: A broader view
Part II: A broader view Understanding ML metrics: isometrics, basic types of linear isometric plots linear metrics and equivalences between them skew-sensitivity non-linear metrics Model manipulation:
More informationCross-validation for detecting and preventing overfitting
Cross-validation for detecting and preventing overfitting Note to other teachers and users of these slides. Andrew would be delighted if ou found this source material useful in giving our own lectures.
More informationIntroduction to Information Retrieval
Introduction to Information Retrieval http://informationretrieval.org IIR 16: Flat Clustering Hinrich Schütze Institute for Natural Language Processing, Universität Stuttgart 2009.06.16 1/ 64 Overview
More informationData Mining: Classifier Evaluation. CSCI-B490 Seminar in Computer Science (Data Mining)
Data Mining: Classifier Evaluation CSCI-B490 Seminar in Computer Science (Data Mining) Predictor Evaluation 1. Question: how good is our algorithm? how will we estimate its performance? 2. Question: what
More informationNoise-based Feature Perturbation as a Selection Method for Microarray Data
Noise-based Feature Perturbation as a Selection Method for Microarray Data Li Chen 1, Dmitry B. Goldgof 1, Lawrence O. Hall 1, and Steven A. Eschrich 2 1 Department of Computer Science and Engineering
More information7. Decision or classification trees
7. Decision or classification trees Next we are going to consider a rather different approach from those presented so far to machine learning that use one of the most common and important data structure,
More informationUnsupervised Learning : Clustering
Unsupervised Learning : Clustering Things to be Addressed Traditional Learning Models. Cluster Analysis K-means Clustering Algorithm Drawbacks of traditional clustering algorithms. Clustering as a complex
More informationData Mining and Knowledge Discovery Practice notes: Numeric Prediction, Association Rules
Keywords Data Mining and Knowledge Discovery: Practice Notes Petra Kralj Novak Petra.Kralj.Novak@ijs.si Data Attribute, example, attribute-value data, target variable, class, discretization Algorithms
More information