Data Mining: Model Evaluation

April 16, 2013

Issues: Evaluating Classification Methods

- Accuracy
  - classifier accuracy: predicting class label
  - predictor accuracy: guessing the value of predicted attributes
- Speed
  - time to construct the model (training time)
  - time to use the model (classification/prediction time)
- Robustness: handling noise and missing values
- Scalability: efficiency in disk-resident databases
- Interpretability: understanding and insight provided by the model
- Other measures, e.g., goodness of rules, such as decision tree size or compactness of classification rules

Predictor Error Measures

Measure predictor accuracy: how far off the predicted value is from the actual known value.

Loss function: measures the error between the actual value $y_i$ and the predicted value $y_i'$:
- Absolute error: $|y_i - y_i'|$
- Squared error: $(y_i - y_i')^2$

Test error (generalization error): the average loss over the test set of size $d$:
- Mean absolute error: $\frac{1}{d}\sum_{i=1}^{d}|y_i - y_i'|$
- Mean squared error: $\frac{1}{d}\sum_{i=1}^{d}(y_i - y_i')^2$
- Relative absolute error: $\sum_{i=1}^{d}|y_i - y_i'| \,/\, \sum_{i=1}^{d}|y_i - \bar{y}|$
- Relative squared error: $\sum_{i=1}^{d}(y_i - y_i')^2 \,/\, \sum_{i=1}^{d}(y_i - \bar{y})^2$

The mean squared error exaggerates the presence of outliers. Popularly used are the (square) root mean squared error and, similarly, the root relative squared error.
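To make these definitions concrete, here is a minimal sketch in plain Python (an illustration, not code from the slides):

```python
# Predictor error measures: MAE, MSE, RMSE, RAE, RSE, RRSE as defined above.
import math

def error_measures(y_true, y_pred):
    d = len(y_true)
    y_bar = sum(y_true) / d                                    # mean of the actual values
    abs_err = [abs(y - yp) for y, yp in zip(y_true, y_pred)]
    sq_err = [(y - yp) ** 2 for y, yp in zip(y_true, y_pred)]
    mae = sum(abs_err) / d                                     # mean absolute error
    mse = sum(sq_err) / d                                      # mean squared error
    rae = sum(abs_err) / sum(abs(y - y_bar) for y in y_true)   # relative absolute error
    rse = sum(sq_err) / sum((y - y_bar) ** 2 for y in y_true)  # relative squared error
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse),
            "RAE": rae, "RSE": rse, "RRSE": math.sqrt(rse)}

print(error_measures([3.0, 5.0, 2.5, 7.0], [2.5, 5.0, 4.0, 8.0]))
```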

Evaluating the Accuracy of a Classifier or Predictor (I)

Holdout method
- Given data is randomly partitioned into two independent sets: a training set (e.g., 2/3) for model construction and a test set (e.g., 1/3) for accuracy estimation
- Random sampling: a variation of holdout; repeat holdout k times, accuracy = avg. of the accuracies obtained

Cross-validation (k-fold, where k = 10 is most popular)
- Randomly partition the data into k mutually exclusive subsets, each of approximately equal size
- At the i-th iteration, use D_i as the test set and the others as the training set
- Leave-one-out: k folds where k = # of tuples, for small-sized data
- Stratified cross-validation: folds are stratified so that the class distribution in each fold is approximately the same as that in the initial data
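A minimal sketch of the k-fold partitioning described above, assuming numpy is available; the train/evaluate calls in the usage comment are hypothetical placeholders:

```python
import numpy as np

def k_fold_indices(n, k, seed=0):
    """Randomly partition n example indices into k mutually exclusive folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), k)
    for i in range(k):                       # at the i-th iteration, fold i is the test set
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, test

# Hypothetical usage with k = 10, as on the slide:
# accs = [evaluate(train(X[tr], y[tr]), X[te], y[te])
#         for tr, te in k_fold_indices(len(X), 10)]
# print(sum(accs) / len(accs))
```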

Evaluating the Accuracy of a Classifier or Predictor (II)

Bootstrap
- Works well with small data sets
- Samples the given training tuples uniformly with replacement, i.e., each time a tuple is selected, it is equally likely to be selected again and re-added to the training set

There are several bootstrap methods; a common one is the .632 bootstrap:
- Suppose we are given a data set of d tuples. The data set is sampled d times, with replacement, resulting in a training set of d samples. The data tuples that did not make it into the training set end up forming the test set. About 63.2% of the original data will end up in the bootstrap sample, and the remaining 36.8% will form the test set (since $(1 - 1/d)^d \approx e^{-1} = 0.368$)
- Repeat the sampling procedure k times; overall accuracy of the model:

$acc(M) = \sum_{i=1}^{k} \left( 0.632 \times acc(M_i)_{\text{test\_set}} + 0.368 \times acc(M_i)_{\text{train\_set}} \right)$
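A minimal sketch of the .632 bootstrap as described above, assuming numpy is available; train_fn and acc_fn are caller-supplied placeholders, and the slide's plain sum is read as an average over the k repetitions:

```python
import numpy as np

def bootstrap_632(X, y, train_fn, acc_fn, k=10, seed=0):
    rng = np.random.default_rng(seed)
    d = len(y)
    total = 0.0
    for _ in range(k):
        train_idx = rng.integers(0, d, size=d)            # sample d tuples with replacement
        test_idx = np.setdiff1d(np.arange(d), train_idx)  # tuples never drawn form the test set
        model = train_fn(X[train_idx], y[train_idx])
        total += (0.632 * acc_fn(model, X[test_idx], y[test_idx])
                  + 0.368 * acc_fn(model, X[train_idx], y[train_idx]))
    # The slide writes the estimate as a plain sum over k; dividing by k
    # turns it into the usual average.
    return total / k
```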

Model Evaluation

- Metrics for Performance Evaluation: how to evaluate the performance of a model?
- Methods for Performance Evaluation: how to obtain reliable estimates?
- Methods for Model Comparison: how to compare the relative performance among competing models?

Metrics for Performance Evaluation

Focus on the predictive capability of a model, rather than how fast it classifies or builds models, scalability, etc.

Confusion matrix:

                         PREDICTED CLASS
                         Class=Yes   Class=No
ACTUAL    Class=Yes      a (TP)      b (FN)
CLASS     Class=No       c (FP)      d (TN)

a: TP (true positive), b: FN (false negative), c: FP (false positive), d: TN (true negative)

Metrics for Performance Evaluation

With the confusion-matrix counts a (TP), b (FN), c (FP), d (TN) above, the most widely-used metric is:

$\text{Accuracy} = \frac{a + d}{a + b + c + d} = \frac{TP + TN}{TP + TN + FP + FN}$
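A minimal sketch (not from the slides) that builds the 2x2 counts and computes accuracy from them:

```python
def confusion_counts(y_true, y_pred, positive="Yes"):
    tp = fn = fp = tn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive: tp += 1    # a: true positive
            else:             fn += 1    # b: false negative
        else:
            if p == positive: fp += 1    # c: false positive
            else:             tn += 1    # d: true negative
    return tp, fn, fp, tn

tp, fn, fp, tn = confusion_counts(["Yes", "Yes", "No", "No"],
                                  ["Yes", "No", "No", "Yes"])
accuracy = (tp + tn) / (tp + fn + fp + tn)   # (a + d) / (a + b + c + d)
print(tp, fn, fp, tn, accuracy)
```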

Classifier Accuracy Measures

                                   Predicted class
                                   buy_computer = yes   buy_computer = no   total   recognition (%)
Actual    buy_computer = yes       6954                 46                  7000    99.34
class     buy_computer = no        412                  2588                3000    86.27
          total                    7366                 2634                10000   95.42

- Accuracy of a classifier M, acc(M): the percentage of test set tuples that are correctly classified by the model M
- Error rate (misclassification rate) of M = 1 − acc(M)
- Given m classes, CM_{i,j}, an entry in a confusion matrix, indicates the # of tuples in class i that are labeled by the classifier as class j
- Alternative accuracy measures (e.g., for cancer diagnosis):
  - sensitivity = TP/(TP + FN)   /* true positive recognition rate */
  - specificity = TN/(TN + FP)   /* true negative recognition rate */
- This model can also be used for cost-benefit analysis
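A quick numeric check of the table, reading rows as actual classes and columns as predicted:

```python
tp, fn = 6954, 46      # actual buy_computer = yes (7000 tuples)
fp, tn = 412, 2588     # actual buy_computer = no  (3000 tuples)

print(round(tp / (tp + fn), 4))      # sensitivity: 6954/7000  = 0.9934
print(round(tn / (tn + fp), 4))      # specificity: 2588/3000  = 0.8627
print(round((tp + tn) / 10000, 4))   # accuracy:    9542/10000 = 0.9542
```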

Limitation of Accuracy

Consider a 2-class problem:
- Number of Class 0 examples = 9990
- Number of Class 1 examples = 10

If the model predicts everything to be class 0, accuracy is 9990/10000 = 99.9%. Accuracy is misleading because the model does not detect any class 1 example.
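The degenerate case above in a few lines of Python (an illustration, not from the slides):

```python
y_true = [0] * 9990 + [1] * 10
y_pred = [0] * 10000                      # "predict everything to be class 0"

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
class1_hits = sum(p == 1 for t, p in zip(y_true, y_pred) if t == 1)
print(accuracy)       # 0.999 -- looks excellent
print(class1_hits)    # 0     -- yet no class 1 example is detected
```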

Cost Matrix

                         PREDICTED CLASS
C(i|j)                   Class=Yes     Class=No
ACTUAL    Class=Yes      C(Yes|Yes)    C(No|Yes)
CLASS     Class=No       C(Yes|No)     C(No|No)

C(i|j): the cost of misclassifying a class j example as class i

Computing Cost of Classification

Cost matrix C(i|j):

                    PREDICTED CLASS
                    +      -
ACTUAL     +        -1     100
CLASS      -        1      0

Model M1:                            Model M2:

           PREDICTED CLASS                      PREDICTED CLASS
           +      -                             +      -
ACTUAL +   150    40                 ACTUAL +   250    45
CLASS  -   60     250                CLASS  -   5      200

Model M1: Accuracy = 80%, Cost = 3910
Model M2: Accuracy = 90%, Cost = 4255
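The sketch below (an illustration, not from the slides) reproduces both accuracy and cost by pairing each confusion-matrix cell with its cost-matrix entry:

```python
def cost_and_accuracy(confusion, cost):
    """confusion/cost: {(actual, predicted): value} over classes '+' and '-'."""
    total = sum(confusion.values())
    correct = confusion[("+", "+")] + confusion[("-", "-")]
    total_cost = sum(confusion[k] * cost[k] for k in confusion)
    return correct / total, total_cost

cost = {("+", "+"): -1, ("+", "-"): 100, ("-", "+"): 1, ("-", "-"): 0}
m1 = {("+", "+"): 150, ("+", "-"): 40, ("-", "+"): 60, ("-", "-"): 250}
m2 = {("+", "+"): 250, ("+", "-"): 45, ("-", "+"): 5, ("-", "-"): 200}

print(cost_and_accuracy(m1, cost))   # (0.8, 3910)
print(cost_and_accuracy(m2, cost))   # (0.9, 4255)
```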

Cost vs. Accuracy

Count:

                         PREDICTED CLASS
                         Class=Yes   Class=No
ACTUAL    Class=Yes      a           b
CLASS     Class=No       c           d

Cost:

                         PREDICTED CLASS
                         Class=Yes   Class=No
ACTUAL    Class=Yes      p           q
CLASS     Class=No       q           p

Accuracy is proportional to cost if:
1. C(Yes|No) = C(No|Yes) = q
2. C(Yes|Yes) = C(No|No) = p

With N = a + b + c + d and Accuracy = (a + d)/N:

$\text{Cost} = p(a + d) + q(b + c) = p(a + d) + q(N - a - d) = qN - (q - p)(a + d) = N\,[\,q - (q - p) \times \text{Accuracy}\,]$

For example, with p = 0 and q = 1, Cost = N × (1 − Accuracy), i.e., simply the number of errors.

Cost-Sensitive Measures

$\text{Precision } (p) = \frac{a}{a + c}$

$\text{Recall } (r) = \frac{a}{a + b}$

$\text{F-measure } (F) = \frac{2rp}{r + p} = \frac{2a}{2a + b + c}$

- Precision is biased towards C(Yes|Yes) & C(Yes|No)
- Recall is biased towards C(Yes|Yes) & C(No|Yes)
- F-measure is biased towards all except C(No|No)

$\text{Weighted Accuracy} = \frac{w_1 a + w_4 d}{w_1 a + w_2 b + w_3 c + w_4 d}$
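A minimal sketch (not from the slides) of these measures, with a = TP, b = FN, c = FP, d = TN as in the confusion matrix; with w = (1, 1, 1, 1) the weighted accuracy reduces to plain accuracy:

```python
def cost_sensitive_measures(a, b, c, d, w=(1, 1, 1, 1)):
    precision = a / (a + c)
    recall = a / (a + b)
    f_measure = 2 * a / (2 * a + b + c)   # harmonic mean of precision and recall
    w1, w2, w3, w4 = w
    weighted_acc = (w1 * a + w4 * d) / (w1 * a + w2 * b + w3 * c + w4 * d)
    return precision, recall, f_measure, weighted_acc

# Using Model M1's counts from the cost example above:
print(cost_sensitive_measures(150, 40, 60, 250))
```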

Model Evaluation

- Metrics for Performance Evaluation: how to evaluate the performance of a model?
- Methods for Performance Evaluation: how to obtain reliable estimates?
- Methods for Model Comparison: how to compare the relative performance among competing models?

Methods for Performance Evaluation

How to obtain a reliable estimate of performance? The performance of a model may depend on factors other than the learning algorithm:
- Class distribution
- Cost of misclassification
- Size of training and test sets

Learning Curve

A learning curve shows how accuracy changes with varying sample size. It requires a sampling schedule for creating the curve:
- Arithmetic sampling (Langley et al.)
- Geometric sampling (Provost et al.)

Effect of small sample size:
- Bias in the estimate
- Variance of the estimate
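A minimal sketch of a learning-curve loop with a geometric sampling schedule; train_eval_fn is a hypothetical placeholder that trains on the given number of examples and returns test accuracy:

```python
import numpy as np

def learning_curve_points(n_total, train_eval_fn, start=50, ratio=2):
    """Evaluate accuracy at sample sizes start, start*ratio, start*ratio^2, ..."""
    sizes, accs = [], []
    size = start
    while size <= n_total:
        sizes.append(size)
        accs.append(train_eval_fn(size))   # accuracy when training on `size` examples
        size *= ratio                      # geometric schedule (arithmetic: size += step)
    return sizes, accs

# Usage with a toy stand-in for train_eval_fn:
sizes, accs = learning_curve_points(1600, lambda n: 1 - 1 / np.sqrt(n))
print(list(zip(sizes, accs)))
```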

Holdout Methods of Estimation

- Holdout: reserve 2/3 for training and 1/3 for testing
- Random subsampling: repeated holdout
- Cross-validation: partition data into k disjoint subsets
  - k-fold: train on k − 1 partitions, test on the remaining one
  - Leave-one-out: k = n
- Stratified sampling: oversampling vs. undersampling
- Bootstrap: sampling with replacement

Model Evaluation

- Metrics for Performance Evaluation: how to evaluate the performance of a model?
- Methods for Performance Evaluation: how to obtain reliable estimates?
- Methods for Model Comparison: how to compare the relative performance among competing models?

ROC (Receiver Operating Characteristic)

- Developed in the 1950s for signal detection theory, to analyze noisy signals
- Characterizes the trade-off between positive hits and false alarms
- The ROC curve plots TP (on the y-axis) against FP (on the x-axis)
- The performance of each classifier is represented as a point on the ROC curve; changing the threshold of the algorithm, the sample distribution, or the cost matrix changes the location of the point

ROC Curve

[Figure: score distributions for a 1-dimensional data set containing 2 classes (positive and negative); any point located at x > t is classified as positive.]

At threshold t: TP = 0.5, FN = 0.5, FP = 0.12, TN = 0.88

ROC Curve

Interpreting (TP, FP) points:
- (0, 0): declare everything to be the negative class
- (1, 1): declare everything to be the positive class
- (1, 0): ideal
- Diagonal line: random guessing
- Below the diagonal line: prediction is opposite of the true class
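A minimal sketch (not from the slides) that traces out such points by sweeping the decision threshold over classifier scores:

```python
def roc_points(scores, labels):
    """scores: classifier scores; labels: 1 = positive, 0 = negative."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = []
    for t in [float("-inf")] + sorted(set(scores)):   # sweep the decision threshold
        tp = sum(1 for s, l in zip(scores, labels) if s > t and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s > t and l == 0)
        points.append((fp / neg, tp / pos))           # (FP rate on x, TP rate on y)
    return sorted(points)                             # from (0, 0) up to (1, 1)

print(roc_points([0.9, 0.8, 0.7, 0.6, 0.55, 0.4], [1, 1, 0, 1, 0, 0]))
```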

Using ROC for Model Comparison

[Figure: ROC curves for two models, M1 and M2.]

In general, no model consistently outperforms the other:
- M1 is better for small FPR
- M2 is better for large FPR
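The slide compares the curves by eye; a standard scalar summary for such comparisons, though not named on the slide, is the area under the ROC curve, computed here with the trapezoid rule over (FPR, TPR) points like those produced by the sketch above:

```python
def auc(points):
    """Trapezoid-rule area under (FPR, TPR) points sorted by FPR."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area

# For a curve through these points the area is 0.875; a model whose curve
# dominates everywhere has the larger AUC, while crossing curves (as with
# M1 and M2 above) are averaged over all FPR regions.
print(auc([(0.0, 0.0), (0.0, 0.5), (0.5, 1.0), (1.0, 1.0)]))
```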