Quick review: Data Mining Tasks... Classification [Predictive] Regression [Predictive] Clustering [Descriptive] Association Rule Discovery [Descriptive]


1 Evaluation

2 Quick review: Data Mining Tasks... Classification [Predictive] Regression [Predictive] Clustering [Descriptive] Association Rule Discovery [Descriptive]

3 Classification: Definition Given a collection of records (training set) Each record contains a set of attributes; one of the attributes is the class. Find a model for the class attribute as a function of the values of the other attributes. Goal: previously unseen records should be assigned a class as accurately as possible. A test set is used to determine the accuracy of the model. Usually, the given data set is divided into training and test sets, with the training set used to build the model and the test set used to validate it.

4 Decision Tree Classification Task [Diagram: a training set of records (Tid, Attrib1, Attrib2, Attrib3, Class; Tids 1-10) is fed to a tree induction algorithm (Induction / Learn Model) to produce a decision tree model; the model is then applied (Deduction / Apply Model) to a test set (Tids 11-15, class unknown).]

5 Examples of Classification Task Prediction: predicting stock prices Predicting tumor cells as benign or malignant Recognizing anomalies: classifying credit card transactions as legitimate or fraudulent Classifying secondary structures of protein as alpha-helix, beta-sheet, or random coil Categorizing news stories as finance, weather, entertainment, sports, etc.

6 Change of orientation It is very hard to write programs to solve problems like recognizing a three-dimensional object in a scenario where lighting conditions change We do not know how it is done in our brain Even if we had an idea how to do it, the program might be horrendously complicated It is hard to write a program to compute the probability that a credit card transaction is fraudulent There may not be any rules that are both simple and reliable Fraud is a moving target, so the program needs to keep changing

7 Model Evaluation Metrics for Performance Evaluation: How to evaluate the performance of a model? Methods for Performance Evaluation: How to obtain reliable estimates? Methods for Model Comparison: How to compare the relative performance among competing models?

8 Model Evaluation Metrics for Performance Evaluation: How to evaluate the performance of a model? Methods for Performance Evaluation: How to obtain reliable estimates? Methods for Model Comparison: How to compare the relative performance among competing models?

9 Difference in Error costs For any prediction scheme there will be successes and failures (correct predictions and errors) Two kinds of successes: Correct predictions of positive, true positives, TP Correct predictions of negative, true negatives, TN Quite often, the cost (i.e., the benefit) of the two kinds of successes is taken to be the same You can deal with TP + TN together Two kinds of errors: Incorrect predictions of positive, false positives, FP Incorrect predictions of negative, false negatives, FN In virtually every applied situation the costs of false positives and false negatives materially differ

10 Difference in Error costs Consider a mailing to a predicted potential customer who doesn't respond Which type is it? What is the cost? Consider a mailing that was never sent to what would have been a customer Which type is it? What is the cost?

11 Confusion Matrix A graphical way of summarizing information about successes and failures in prediction The entries in the matrix are the counts of the different kinds of successes and failures for a given situation

                          PREDICTED CLASS
                          Class=Yes   Class=No
ACTUAL CLASS  Class=Yes       a           b
              Class=No        c           d

a: TP (true positive)  b: FN (false negative)  c: FP (false positive)  d: TN (true negative)

12 Confusion Matrix: Example

                     Predicted class
                     a     b     c   total
Actual class   a    88    10     2     100
               b    14    40     6      60
               c    18    10    12      40
        total      120    60    20     200

13 Evaluating performance with a confusion matrix The idea is that performance can be evaluated on the basis of how much better it is than what would be achieved by random results

14 Comparing vs random classifier Consider the previous table Is this a fair measure of overall success? How many agreements would you get by chance? The previous predictor predicted 120 a's, 60 b's and 20 c's What if you had a random predictor that predicted the same total number of each of the three classes but kept the original proportions? E.g. it divides the 100 actual a's keeping those proportions: 100*120/200 = 60, 100*60/200 = 30, 100*20/200 = 10

15 Evaluating performance: random classifier

Actual classifier:
                     Predicted class
                     a     b     c   total
Actual class   a    88    10     2     100
               b    14    40     6      60
               c    18    10    12      40
        total      120    60    20     200

Random classifier (same predicted totals, each row split in the proportion 120:60:20):
                     Predicted class
                     a     b     c   total
Actual class   a    60    30    10     100
               b    36    18     6      60
               c    24    12     4      40
        total      120    60    20     200

16 Evaluating performance: random classifier The random predictor is the point of comparison Can you make a quantitative comparison between the performance of the actual classification scheme and the random predictor? If your classification scheme isn't better than random, you've got a big problem. Your predictor is sort of an anti-predictor.

17 Comparing performances Sum the counts on the main diagonal of the matrix for the actual scheme It got 88 + 40 + 12 = 140 correct Do the same for the random predictor It got 60 + 18 + 4 = 82 correct Clearly the actual predictor is better Can this be quantified? There are 200 instances altogether The random predictor left 200 - 82 = 118 remaining incorrectly predicted The actual predictor got 140 - 82 = 58 more correct than the random predictor The actual predictor got 58/118 = 49.2% of those remaining instances correct

18 Kappa statistic This computation is known as the Kappa statistic. 0 means no better than random 1 means perfect prediction What would happen if the predictor is worse than a random predictor? Kappa statistic = (observed correct - correct expected by chance) / (total - correct expected by chance), e.g. (140 - 82) / (200 - 82) = 58/118 ≈ 0.49
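
A minimal R sketch of this computation, using the confusion matrix counts as reconstructed in the worked example above (the cell values are assumptions consistent with the stated totals):

```r
# Kappa from the confusion matrix of the previous slides (illustrative sketch).
conf <- matrix(c(88, 10, 2,
                 14, 40, 6,
                 18, 10, 12),
               nrow = 3, byrow = TRUE,
               dimnames = list(actual = c("a", "b", "c"),
                               predicted = c("a", "b", "c")))

n        <- sum(conf)                                 # 200 instances
observed <- sum(diag(conf))                           # 140 correct predictions
# Correct predictions expected by chance: row totals x column totals / n, summed over the diagonal
expected <- sum(rowSums(conf) * colSums(conf) / n)    # 82
kappa    <- (observed - expected) / (n - expected)    # (140 - 82) / (200 - 82) ~ 0.49
kappa
```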

19 Imbalance problem Data sets with imbalanced class distributions are quite common in many real applications (credit card fraud detection, manufacturing inspection, ...) The accuracy measure may not be well suited Several solutions to handle these problems are presented in the following slides

20 Exercise Analyze the results of the balance dataset with the J48 algorithm

21 Classification with costs Two cost matrices: one to include cost in the classifier, one to evaluate its performance Success rate is replaced by average cost per prediction To assign a class to a leaf: instead of selecting the majority, take into account the cost of successful and unsuccessful predictions Cost is given by the appropriate entry in the cost matrix When you use the default cost matrices you are simply counting

22 Exercise Repeat the analysis of the balance.arff data set by adding cost to the wrong classifications of class b. Keep increasing the cost until the misclassifications of b change significantly. Use the CostSensitiveClassifier

23 Bias to cost If you want accurate performance estimates, the distribution of classifications in the test set should match the distribution of classifications in the training set If there was a mismatch between training and test set some test set instances might not classify correctly The mismatch in the sets means that the training set was biased If it was biased against certain classifications, this suggests that it was biased in favor of other classifications Use sampling to bias the classification towards the underrepresented class

24 Bias to cost Suppose that false positives are more costly than false negatives Over-represent "no" instances in the training set When you run the classification algorithm on this training set, it will overtrain on "no" That means the rules derived by the algorithm will be more likely to reach a conclusion of "no" Incidentally, this will increase the number of false negatives However, it will decrease the number of false positives for these classifications In this way, cost has been taken into account in the rules derived, reducing the likelihood of expensive FP results

25 Sampling Oversampling: Intentionally bias the training set by increasing the representation of some classifications Replicate the minority examples The algorithm will produce rules more likely to correctly predict these classifications Some noise records may be replicated many times: overfitting Or create new records according to some criterion: distribution of values, neighbors, ... Undersampling: fewer records are considered for training Some useful negative examples may not be chosen Hybrid approach: ??
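
A minimal R sketch of random oversampling and undersampling (not SMOTE's synthetic generation); the data frame `d` and its columns are hypothetical example names, not from the slides:

```r
# Random over/undersampling on a hypothetical imbalanced data frame d with a factor column "class".
set.seed(1)
d <- data.frame(x = rnorm(100),
                class = factor(c(rep("maj", 90), rep("min", 10))))

min_rows <- which(d$class == "min")
maj_rows <- which(d$class == "maj")

# Oversampling: replicate minority examples (sampling with replacement)
over  <- d[c(maj_rows, sample(min_rows, length(maj_rows), replace = TRUE)), ]

# Undersampling: keep only a random subset of the majority class
under <- d[c(sample(maj_rows, length(min_rows)), min_rows), ]

table(over$class); table(under$class)
```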

26 Undersampling and oversampling

27 Weights in Weka As of Weka 3.5.8 or later, a weight can be associated with an instance in a standard ARFF file by appending it to the end of the line for that instance and enclosing the value in curly braces, e.g.: 0, X, 0, Y, "class A", {5}

28 SMOTE Normally used in combination with random undersampling Undersampling the majority class Oversampling the minority class Uses k nearest neighbors to generate new values (k=5) In the SMOTE paper they analyze several combinations of undersampling and oversampling There is no golden rule for how much oversampling and undersampling to apply

29 SMOTE

30 SMOTE

31 Exercise Apply undersampling of x% to the majority classes of the balance dataset and then apply SMOTE to the minority class Use Resample to sample without replacement, biasing towards uniform classes (1.0) and generating a sample of 70% of the original size. Use SMOTE to oversample the minority class up to approximately the same number of instances as the majority classes

32 Lift charts In practice, costs are rarely known Decisions are usually made by comparing possible scenarios Example: promotional mailout to 1,000,000 households Mail to all; 1,000 respond (0.1%) Identify a subset of the 400,000 most promising; here the number of responses is 800 (0.2%) A lift chart allows a visual comparison in order to identify subpopulation samples with a greater likelihood of "yes" Needs a learning scheme that outputs probabilities The increase in the response rate is known as the lift factor

33 Lift factor This idea is summarized in the following table:

Sample       Yeses   Response Rate   Lift Factor
1,000,000    1,000        0.1%            1
  400,000      800        0.2%            2
  100,000      400        0.4%            4

34 Finding lift factors Given a learning scheme that outputs probabilities for the predicted class Rank all of the instances by their probability and keep track of their actual class Note that we're operating under the (reasonable) assumption that the data mining algorithm really works The higher the predicted probability, the more likely an instance really is to take on a certain value You can find the lift factor for a given sample size, or you can find the sample size for a given lift factor
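
A small R sketch of this ranking procedure; `prob` and `actual` are hypothetical example vectors standing in for a learner's scores and the true responses:

```r
# Lift factor for the top fraction of instances ranked by predicted probability of "yes".
set.seed(1)
n      <- 10000
actual <- rbinom(n, 1, 0.1)                          # 1 = yes response
prob   <- runif(n) * 0.5 + actual * runif(n) * 0.5   # scores loosely related to the class

lift_at <- function(prob, actual, fraction) {
  ord <- order(prob, decreasing = TRUE)              # rank by predicted probability
  top <- ord[seq_len(round(fraction * length(ord)))]
  mean(actual[top]) / mean(actual)                   # response rate in sample / overall rate
}

lift_at(prob, actual, 0.4)   # lift factor for the most promising 40%
```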

35 Finding lift factors

36 Lift Chart

37 Lift Chart If the algorithm is any good at all, the curve should be above the diagonal Otherwise, the algorithm is determining the "yes" probability for instances with a success rate lower than a random sample In general, the closer the lift chart curve comes to the upper left hand corner, the better A minimal sample size and a maximum response rate is good The hypothetical ideal would be a mailing only to those who would respond yes, namely a 100% success rate, with no one left out (no false negatives)

38 Cost Benefit Curves The costs and benefits of differing sample sizes/mailing scenarios will differ E.g. the mailing scenario: The cost of interest now is the cost of mailing an individual item We'll assume that the cost is constant per item The benefit of interest is the value of the business generated per "yes" response We'll assume that the benefit is constant per positive response It is now possible to form a cost/benefit curve across the same domain (percent of sample size) as the lift chart

39 Cost Benefit curves

40 Exercise With the balance data set, analyze the cost/benefit for the L class if the benefit of achieving a true positive would be 15 and the cost of a false positive 7

41 ROC Curves Stands for receiver operating characteristic Used to show the tradeoff between hit rate and false alarm rate over a noisy channel Similar to lift charts It provides a way of drawing conclusions about one data mining algorithm Requires that the algorithm returns a numeric value Analyzes the evolution of the TPR and FPR with different thresholds Differences to the lift chart: y axis shows the percentage of true positives in the sample rather than the absolute number, TPR = 100*TP/(TP+FN); x axis shows the percentage of false positives in the sample rather than the sample size, FPR = 100*FP/(FP+TN)

42 ROC Curve Points (TPR, FPR): (0,0): declare everything to be the negative class (1,1): declare everything to be the positive class (1,0): ideal Diagonal line: random guessing Below the diagonal line: the prediction is the opposite of the true class

43 How to construct a ROC curve Rank the instances by their predicted score; for each threshold, predict positive for instances with score >= threshold, count TP, FP, TN and FN, and compute TPR and FPR; plotting the (FPR, TPR) pairs over all thresholds yields the ROC curve
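
A minimal R sketch of this construction; `score` and `class` are hypothetical example vectors (class: 1 = positive), not the values from the slide:

```r
# TPR and FPR at each score threshold.
score <- c(0.95, 0.93, 0.87, 0.85, 0.85, 0.76, 0.53, 0.43, 0.25)
class <- c(1, 1, 0, 1, 0, 1, 0, 0, 1)

roc_points <- t(sapply(sort(unique(score), decreasing = TRUE), function(thr) {
  pred <- as.numeric(score >= thr)           # predict positive at or above the threshold
  TP <- sum(pred == 1 & class == 1); FP <- sum(pred == 1 & class == 0)
  FN <- sum(pred == 0 & class == 1); TN <- sum(pred == 0 & class == 0)
  c(threshold = thr, TPR = TP / (TP + FN), FPR = FP / (FP + TN))
}))
roc_points
plot(roc_points[, "FPR"], roc_points[, "TPR"], type = "b",
     xlab = "FPR", ylab = "TPR")              # the ROC curve
```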

44 How is the ROC Curve of this data?

45 ROC Curve The area under the curve represents the probability that a randomly chosen negative example will have a smaller estimated probability of belonging to the positive class than a randomly chosen positive example

46 ROC Curve

47 Exercise Obtain the ROC curve for the L class value and the following algorithms: IBK (k=3), NaiveBayes and J48 Visualize the B class also and compare results

48 Recall and precision The general idea behind lift charts and ROC curves is trade-off They are measures of good outcomes vs. unsuccessful outcomes Trying to measure the tradeoff between desirable and undesirable outcomes occurs in many different problem domains In the area of information retrieval, two measures are used: recall and precision

49 Recall and precision Recall = TP/(TP+FN) Precision = TP/(TP+FP) General idea: You can increase the number of relevant documents you retrieve by increasing the total number of documents you retrieve But as you do so, the proportion of relevant documents falls

50 Recall and precision Consider the extreme case How would you be guaranteed to always retrieve all relevant documents?

51 Sensitivity and specificity Medical testing domain Similar idea For a given medical test: Sensitivity = proportion of people with the disease who test positive Specificity = proportion of people without the disease who test negative Sensitivity = TP/(TP+FN) (Recall/TPR) Specificity = TN/(TN+FP) For both measures high is good

52 Measures explained Sensitivity/recall/TPR (TP/(TP+FN)): How good a test is at detecting the positives. How can a test cheat this measure? Specificity/1-FPR (TN/(TN+FP)): How good a test is at avoiding false alarms. How can a test cheat this measure? Precision (TP/(TP+FP)): How many of the positively classified were relevant. How can a test cheat this measure? False positive rate/1-Specificity (FP/(FP+TN)): How many negatives are detected as positive. How can a test cheat this measure? Accuracy: (TP+TN)/(TP+TN+FP+FN) or (TP+TN)/N Error Rate: (FP+FN)/(FP+FN+TP+TN) or (FP+FN)/N
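
These measures are simple ratios of the four confusion-matrix counts; a short R sketch with hypothetical example counts:

```r
# The measures on this slide computed from raw TP/FP/TN/FN counts (hypothetical values).
TP <- 70; FN <- 30; FP <- 20; TN <- 80
N  <- TP + FN + FP + TN

sensitivity <- TP / (TP + FN)   # recall / TPR
specificity <- TN / (TN + FP)
precision   <- TP / (TP + FP)
fpr         <- FP / (FP + TN)   # 1 - specificity
accuracy    <- (TP + TN) / N
error_rate  <- (FP + FN) / N

c(sensitivity = sensitivity, specificity = specificity, precision = precision,
  FPR = fpr, accuracy = accuracy, error_rate = error_rate)
```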

53 One figure measures In addition to 2-dimensional graphs there are also techniques for trying to express the goodness of a scheme in a single number by combining several measures For example, in information retrieval there is the concept of average recall measures: averages of precision over several recall values Three-point average recall is the average precision for recall figures of 20%, 50%, and 80% Eleven-point average recall is the average precision for recall figures of 0%-100% in steps of 10% F-measure: 1/F = (1/2)(1/recall + 1/precision), i.e. F = 2*precision*recall / (precision + recall) = 2TP / (2TP + FN + FP) Success rate/accuracy: (TP+TN)/(TP+FN+TN+FP) Error Rate: (FP+FN)/(FP+FN+TP+TN) or (FP+FN)/N

54 Area under ROC Curve (AUC) To summarize ROC curves in a single quantity Roughly speaking, the larger the better It represents the probability that a randomly chosen negative example will have a smaller estimated probability of belonging to the positive class than a randomly chosen positive example Perfect value = 1.0 Random guessing = 0.5
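
That probabilistic reading of the AUC can be computed directly by comparing positive and negative scores pairwise; a small R sketch with hypothetical example data:

```r
# AUC estimated as the probability that a randomly chosen positive gets a
# higher score than a randomly chosen negative (ties count 1/2).
set.seed(1)
class <- rbinom(200, 1, 0.4)                # 1 = positive
score <- runif(200) + 0.5 * class           # scores loosely correlated with the class

pos <- score[class == 1]
neg <- score[class == 0]
pairs <- outer(pos, neg, ">") + 0.5 * outer(pos, neg, "==")
auc <- mean(pairs)
auc                                         # ~0.5 = random guessing, 1.0 = perfect ranking
```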

55 Model Evaluation Metrics for Performance Evaluation: How to evaluate the performance of a model? Methods for Performance Evaluation: How to obtain reliable estimates? Methods for Model Comparison: How to compare the relative performance among competing models?

56 Training and testing Natural performance measure for classification problems: error rate Success: instance's class is predicted correctly Error: instance's class is predicted incorrectly Error rate: proportion of errors made over the whole set of instances Resubstitution error: error rate obtained from training data Resubstitution error is (hopelessly) optimistic!

57 Training and testing II Test set: independent instances that have played no part in the formation of the classifier Assumption: both training data and test data are representative samples of the underlying problem Test and training data may differ in nature

58 Splitting the data Once evaluation is complete, all the data can be used to build the final classifier Generally, the larger the training data the better the classifier (but returns diminish) The larger the test data the more accurate the error estimate Holdout procedure: method of splitting original data into training and test set Dilemma: ideally both training set and test set should be large!

59 Holdout estimation What to do if the amount of data is limited? The holdout method reserves a certain amount for testing and uses the remainder for training Usually: one third for testing, the rest for training Problem: the samples might not be representative Example: a class might be missing in the test data Advanced version uses stratification Ensures that each class is represented with approximately equal proportions in both subsets
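
A minimal R sketch of a stratified holdout split (one third for testing), sampling within each class so the class proportions are preserved; the built-in iris data is used only as a convenient stand-in dataset:

```r
# Stratified holdout: draw one third of each class for the test set.
set.seed(1)
test_idx <- unlist(lapply(split(seq_len(nrow(iris)), iris$Species),
                          function(rows) sample(rows, length(rows) %/% 3)))
test  <- iris[test_idx, ]
train <- iris[-test_idx, ]
table(train$Species); table(test$Species)   # roughly equal class proportions in both subsets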

60 Repeated holdout method Holdout estimate can be made more reliable by repeating the process with different subsamples In each iteration, a certain proportion is randomly selected for training (possibly with stratification) The error rates on the different iterations are averaged to yield an overall error rate This is called the repeated holdout method Still not optimal: the different test sets overlap Can we prevent overlapping?

61 Cross-validation Cross-validation avoids overlapping test sets First step: split data into k subsets of equal size Second step: use each subset in turn for testing, the remainder for training Called k-fold cross-validation

62 10 fold cross-validation Standard method for evaluation: stratified ten-fold cross-validation The data is divided randomly into 10 parts in which the class is represented in approximately the same proportions Why ten? Extensive experiments have shown that this is the best choice to get an accurate estimate There is also some theoretical evidence for this Stratification reduces the estimate's variance Lots of data? Use percentage split Else stratified 10-fold cross-validation
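
A minimal R sketch of (unstratified) 10-fold cross-validation of an error rate; the 1-nearest-neighbour classifier from the class package and the iris data are stand-ins, any learning scheme could be plugged in:

```r
# 10-fold cross-validation of an error rate.
library(class)                                        # provides knn()
set.seed(1)
k     <- 10
folds <- sample(rep(1:k, length.out = nrow(iris)))    # random fold assignment

errs <- sapply(1:k, function(f) {
  test  <- iris[folds == f, ]
  train <- iris[folds != f, ]
  pred  <- knn(train[, 1:4], test[, 1:4], cl = train$Species, k = 1)
  mean(pred != test$Species)                          # error rate on the held-out fold
})
mean(errs)                                            # cross-validated error estimate
```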

63 Leave-one-out cross-validation This is basically n-fold cross-validation taken to the max For a data set with n instances, you hold out one for testing and train on the remaining (n - 1) You do this for each of the instances and then average the results Advantages: It's deterministic: there's no sampling In a sense, you maximize the information you can squeeze out of the data set Disadvantages: It's computationally intensive By definition, a holdout of 1 can't be stratified

64 X-fold CV vs leave-one-out The Elements of Statistical Learning, Chapter 7: "On the other hand, leave-one-out cross-validation has low bias but can have high variance. Overall, five- or tenfold cross-validation are recommended as a good compromise: see Breiman and Spector (1992) and Kohavi (1995)"

65 The bootstrap This term comes from the phrase "pulling oneself up by one's bootstraps", a metaphor for accomplishing a task without any outside help Originally used to estimate a parameter from only a sample Idea: Take a bootstrap sample (a random sample taken with replacement from the original sample, of the same size) Calculate the bootstrap statistic on the bootstrap sample Repeat these steps many times to create a bootstrap distribution

66 The bootstrap CV uses sampling without replacement The same instance, once selected, cannot be selected again for a particular training/test set The bootstrap uses sampling with replacement to form the training set Sample a dataset of n instances n times with replacement to form a new dataset of n instances Use this data as the training set Use the instances from the original dataset that don't occur in the new training set for testing

67 Bootstrap A particular instance has a probability of 1 - 1/n of not being picked in a single draw Thus its probability of ending up in the test data (i.e., never being picked for the training set) is (1 - 1/n)^n ≈ e^(-1) ≈ 0.368 This means the training data will contain approximately 63.2% of the instances
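
A quick numerical check of this figure in R:

```r
# The chance that an instance is never picked in n draws with replacement converges to e^-1.
n <- 1000
(1 - 1/n)^n       # ~0.368: probability of ending up in the test data
exp(-1)
1 - (1 - 1/n)^n   # ~0.632: probability of appearing in the bootstrap training set
```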

68 The bootstrap The error estimate on the test data will be very pessimistic: the model was trained on just ~63% of the instances Therefore, combine it with the resubstitution error: err = 0.632 * e(test instances) + 0.368 * e(training instances) The resubstitution error gets less weight than the error on the test data Repeat the process several times with different replacement samples; average the results

69 More on the bootstrap Probably the best way of estimating performance for very small datasets (<30) However, it has some problems Compared to basic CV, the bootstrap increases the variance that can occur in each fold [Efron and Tibshirani, 1993] It could be argued that this is desirable since it is more realistic Consider a random dataset with two classes of equal size True error rate is 50% for any prediction rule A perfect memorizer will achieve 0% resubstitution error and ~50% error on test data Bootstrap estimate for this classifier: err = 0.632*50% + 0.368*0% = 31.6% Misleadingly optimistic

70 Model Evaluation Metrics for Performance Evaluation: How to evaluate the performance of a model? Methods for Performance Evaluation: How to obtain reliable estimates? Methods for Model Comparison: How to compare the relative performance among competing models?

71 Comparing models Frequent question: which of two learning schemes performs better? Note: this is domain dependent! Obvious way: compare 10-fold CV estimates Generally sufficient in applications (we don't lose much if the chosen method is not truly better) However, what about machine learning research? Need to show convincingly that a particular method works better

72 Comparing models Want to show that scheme A is better than scheme B in a particular domain For a given amount of training data On average, across all possible training sets However, just using the mean values is not enough

73 Hypothesis testing In inferential statistics sample data are employed in two ways to draw inferences about one or more populations: hypothesis testing and estimation of population parameters Hypothesis testing is a procedure in which sample data are employed to evaluate a hypothesis In order to evaluate a research hypothesis, it is restated within the framework of two statistical hypotheses Null hypothesis (H0): statement of no effect or no difference Alternative hypothesis (H1): statistical statement indicating the presence of an effect or a difference

74 Hypothesis testing These types of tests allow a researcher to determine whether or not the result of a study is statistically significant This implies that one is determining whether an obtained difference in an experiment is likely to be due to chance or due to the presence of a genuine experimental effect Normally a statistic (a measure/characteristic of a sample) is obtained and compared in reference to a sampling distribution: the theoretical distribution of all the possible values the test statistic can assume

75 Analyzing the statistic value

76 Hypothesis testing Scientific convention has established that in order to declare a difference statistically significant there should be no more than a 5% likelihood that the difference is due to chance (1% in medical domains) Within the framework of hypothesis testing it is possible for a researcher to commit two types of errors Type I: when a true null hypothesis is rejected One concludes that there are differences when in reality there are none Represented by α Type II: when a false null hypothesis is retained One concludes that a true alternative hypothesis is false Represented by β Power = 1 - β They are inversely related: as one decreases the other one increases

77

78 α vs β What does the graphical representation of α vs β look like?

79 α vs β

80 Paired t-test In practice we have limited data and a limited number of estimates for computing the mean Student's t-test tells whether the means of two samples are significantly different In our case the samples are cross-validation estimates for different datasets from the domain Use a paired t-test because the individual samples are paired We test the null hypothesis H0: µ1 = µ2 against H1: µ1 ≠ µ2

81 T-test Used with interval/ratio data Used when the researcher does not know the value of the population standard deviation and must estimate it by computing the sample standard deviation, as opposed to the z-test The t distribution is used, in contrast to the normal distribution of the z-test Assumptions: The sample has been randomly selected from the population it represents The distribution of the data in the underlying population the sample represents is normal Homoscedasticity of variances: the relationship between the X and Y variables is of equal strength across the whole range

82 T-test for paired samples Used when there are two samples that have been matched or paired If we are comparing two models constructed from the same dataset the samples are matched We are going to compute a statistic based on the difference between the samples

83 Standard Error of the mean Sampling distribution of the sample mean: the distribution of the mean of a sample over all possible samples of a given size from the same population The standard error of the mean is the standard deviation of the sample-mean estimate of a population mean It is usually estimated by using the sample standard deviation divided by the square root of the sample size: SE = ŝ/√n, where SD = √((ΣX² - (ΣX)²/n)/n) and the estimated population standard deviation is ŝ = SD·√(n/(n-1))

84 Distribution of the differences Let m_d = m_x - m_y The difference of the means (m_d) has a Student's distribution with k - 1 degrees of freedom Let σ_d² be the variance of the difference The standardized version of m_d is called the t-statistic: t = (m_d - 0) / √(σ_d² / k) We use t to perform the t-test
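
A short R sketch computing this paired t statistic by hand and comparing it with t.test(); x and y are hypothetical per-fold accuracy estimates for two schemes:

```r
# Paired t statistic by hand vs. t.test(..., paired = TRUE).
x <- c(0.81, 0.79, 0.84, 0.78, 0.82, 0.80, 0.83, 0.77, 0.80, 0.82)  # scheme A
y <- c(0.78, 0.77, 0.80, 0.76, 0.81, 0.79, 0.80, 0.75, 0.78, 0.79)  # scheme B

d      <- x - y
k      <- length(d)
t_stat <- mean(d) / sqrt(var(d) / k)     # t = m_d / sqrt(sigma_d^2 / k)
t_stat
2 * pt(-abs(t_stat), df = k - 1)         # two-sided p-value with k - 1 degrees of freedom

t.test(x, y, paired = TRUE)              # reports the same t and p-value
```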

85 Student's distribution With small samples (k < 100) the mean follows Student's distribution with k - 1 degrees of freedom With infinite degrees of freedom the distribution is the same as the normal distribution

86 Student's distribution

87 Performing the test Fix a significance level If a difference is significant at the a% level, there is a (100-a)% chance that the true means differ Look up the value for z that corresponds to the significance level If t ≥ z or t ≤ -z then the difference is significant, i.e. the null hypothesis (that the difference is zero) can be rejected

88 Non-parametric tests What to do when the distribution does not satisfy the t-test requirements? Normality: Shapiro test Homoscedasticity of variances (the relationship between the X and Y variables is of equal strength across the whole range; the values are not spread equally): Levene test If the assumptions do not hold, the probability of obtaining a spuriously significant result may be greater Use non-parametric tests: they do not assume any distribution

89 Wilcoxon test Good results when no distribution can be assumed Null hypothesis: the median of the difference scores equals zero, H0: θ_D = 0, H1: θ_D ≠ 0 Directional alternative hypotheses: H1: θ_D > 0 or H1: θ_D < 0

90 Wilcoxon test: Computation First, compute the differences d_i between the performance scores of the two classifiers on the i-th out of N data sets The differences are ranked according to their absolute values Average ranks are assigned in case of ties Let R+ be the sum of ranks for the data sets on which the second algorithm outperformed the first Let R- be the sum of ranks for the opposite Ranks of d_i = 0 are split evenly (if there is an odd number one is ignored) Let T be the smaller of the sums, T = min(R+, R-)

91 Wilcoxon test For N up to 25 there are tables of exact critical values; T must be equal to or less than the tabled critical value For larger N, the statistic z = (T - N(N+1)/4) / √(N(N+1)(2N+1)/24) is distributed approximately normally
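
In R the whole procedure is available as wilcox.test; a sketch on the same hypothetical paired accuracy vectors as above:

```r
# Wilcoxon signed-rank test on paired performance scores.
x <- c(0.81, 0.79, 0.84, 0.78, 0.82, 0.80, 0.83, 0.77, 0.80, 0.82)
y <- c(0.78, 0.77, 0.80, 0.76, 0.81, 0.79, 0.80, 0.75, 0.78, 0.79)

# Note: the R function is wilcox.test; with tied differences it warns and
# falls back to the normal approximation.
wilcox.test(x, y, paired = TRUE)
```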

92 Wilcoxon test: Example

93 Wilcoxon test: Example

94 Analysis of variance In a set of k independent samples, do at least two of the samples represent populations with different mean values? If the computed test statistic is significant, it indicates there is a significant difference between at least two of the sample means in the set of k means: there is a high likelihood that at least two of the samples represent populations with different mean values To compute the test statistic, the total variability is divided into between-groups variability and within-groups variability Between-groups: a measure of the variability of the means of the k samples Within-groups: (the average of the variance within each group) is attributable to chance factors (random error)

95 Analysis of variance: computation

96 Example

97 Analysis of variance: computation Mean square between groups (MSB) Mean square within groups (MSW) F ratio = MSB / MSW, compared against the F distribution

98 Analysis of variance assumptions The sample has been randomly selected from the population it represents The distribution of the data in the underlying population the sample represents is normal Homoscedasticity of variances: the relationship between the X and Y variables is of equal strength across the whole range What to do when the assumptions are not satisfied? An alternative is the non-parametric Friedman test

99 Friedman Test Non-parametric equivalent of the ANOVA test Based on rank performances rather than the actual performance estimates All classifiers are ranked according to their performance in ascending order for each data set and the mean rank of a classifier i, AR_i = (1/K) Σ_{j=1..K} r_i^j, is computed across all data sets. The test statistic of the Friedman test is calculated as χ²_F = (12K / (L(L+1))) [ Σ_{i=1..L} AR_i² - L(L+1)²/4 ], with K representing the overall number of data sets, L the number of classifiers and r_i^j the rank of classifier i on data set j. The statistic is distributed according to the chi-square distribution with L-1 degrees of freedom
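
In R this is friedman.test; a sketch on a hypothetical K x L matrix of accuracy estimates (rows = data sets, columns = classifiers), matching the notation above:

```r
# Friedman test on a matrix of performance estimates (hypothetical values).
acc <- matrix(c(0.81, 0.78, 0.75,
                0.79, 0.80, 0.72,
                0.84, 0.81, 0.77,
                0.78, 0.76, 0.74,
                0.82, 0.83, 0.79,
                0.80, 0.78, 0.73),
              ncol = 3, byrow = TRUE,
              dimnames = list(paste0("dataset", 1:6),
                              c("clf1", "clf2", "clf3")))

friedman.test(acc)   # chi-squared statistic with L - 1 = 2 degrees of freedom
```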

100 Multiple tests comparisons When conducting several pairwise tests to obtain a global conclusion (e.g. Model 1 is better than Model 2, Model 3, ...) the family-wise error rate should be taken into account If there are 10 models and each of the comparisons Model 1 vs Model 2, Model 1 vs Model 3, ... has a 5% probability of affirming that there is a significant difference when there is none (Type I error), what is the probability of committing a Type I error if we say that Model 1 is better than the remaining 9 models? Probability ≈ 1 - 0.95^9 ≈ 37% We need to correct for it; in the example the corrected significance threshold would be 0.05/9 ≈ 0.0056 (simple Bonferroni correction) Other alternatives: Bonferroni-Dunn, Holm, Hochberg, ...
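
In R these corrections are provided by p.adjust; a sketch with hypothetical p-values for the 9 pairwise comparisons:

```r
# Adjusting pairwise p-values for the family-wise error rate (hypothetical p-values).
p_raw <- c(0.001, 0.004, 0.012, 0.020, 0.030, 0.041, 0.049, 0.060, 0.200)

p.adjust(p_raw, method = "bonferroni")
p.adjust(p_raw, method = "holm")
p.adjust(p_raw, method = "hochberg")

1 - 0.95^9   # ~0.37: family-wise Type I error with 9 uncorrected tests at alpha = 0.05
```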

101 Tests comparisons: R t.test wilcox.test Checking normality: shapiro.test Checking homoscedasticity: leveneTest (requires library car) aov (ANOVA) friedman.test

102 Some R examples
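
The code on this slide did not survive transcription; below is a minimal illustrative sketch exercising the functions listed on the previous slide, on hypothetical paired performance estimates (x, y, z), assuming the car package is installed:

```r
# Worked examples for the tests listed on the previous slide.
library(car)                      # provides leveneTest()

set.seed(1)
x <- rnorm(10, mean = 0.80, sd = 0.02)   # model A estimates
y <- rnorm(10, mean = 0.78, sd = 0.02)   # model B estimates

shapiro.test(x - y)               # normality of the paired differences
leveneTest(c(x, y), factor(rep(c("A", "B"), each = 10)))   # homoscedasticity

t.test(x, y, paired = TRUE)       # parametric paired comparison
wilcox.test(x, y, paired = TRUE)  # non-parametric alternative

# Three or more models: ANOVA and its non-parametric counterpart
z   <- rnorm(10, mean = 0.75, sd = 0.02)
acc <- data.frame(value   = c(x, y, z),
                  model   = factor(rep(c("A", "B", "C"), each = 10)),
                  dataset = factor(rep(1:10, times = 3)))
summary(aov(value ~ model, data = acc))
friedman.test(value ~ model | dataset, data = acc)
```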


More information

Predic'ng ALS Progression with Bayesian Addi've Regression Trees

Predic'ng ALS Progression with Bayesian Addi've Regression Trees Predic'ng ALS Progression with Bayesian Addi've Regression Trees Lilly Fang and Lester Mackey November 13, 2012 RECOMB Conference on Regulatory and Systems Genomics The ALS Predic'on Prize Challenge: Predict

More information

I211: Information infrastructure II

I211: Information infrastructure II Data Mining: Classifier Evaluation I211: Information infrastructure II 3-nearest neighbor labeled data find class labels for the 4 data points 1 0 0 6 0 0 0 5 17 1.7 1 1 4 1 7.1 1 1 1 0.4 1 2 1 3.0 0 0.1

More information

Tree-based methods for classification and regression

Tree-based methods for classification and regression Tree-based methods for classification and regression Ryan Tibshirani Data Mining: 36-462/36-662 April 11 2013 Optional reading: ISL 8.1, ESL 9.2 1 Tree-based methods Tree-based based methods for predicting

More information

The Prac)cal Applica)on of Knowledge Discovery to Image Data: A Prac))oners View in The Context of Medical Image Mining

The Prac)cal Applica)on of Knowledge Discovery to Image Data: A Prac))oners View in The Context of Medical Image Mining The Prac)cal Applica)on of Knowledge Discovery to Image Data: A Prac))oners View in The Context of Medical Image Mining Frans Coenen (http://cgi.csc.liv.ac.uk/~frans/) 10th Interna+onal Conference on Natural

More information

Model Assessment and Selection. Reference: The Elements of Statistical Learning, by T. Hastie, R. Tibshirani, J. Friedman, Springer

Model Assessment and Selection. Reference: The Elements of Statistical Learning, by T. Hastie, R. Tibshirani, J. Friedman, Springer Model Assessment and Selection Reference: The Elements of Statistical Learning, by T. Hastie, R. Tibshirani, J. Friedman, Springer 1 Model Training data Testing data Model Testing error rate Training error

More information

CS 584 Data Mining. Classification 1

CS 584 Data Mining. Classification 1 CS 584 Data Mining Classification 1 Classification: Definition Given a collection of records (training set ) Each record contains a set of attributes, one of the attributes is the class. Find a model for

More information

Search Engines. Informa1on Retrieval in Prac1ce. Annotations by Michael L. Nelson

Search Engines. Informa1on Retrieval in Prac1ce. Annotations by Michael L. Nelson Search Engines Informa1on Retrieval in Prac1ce Annotations by Michael L. Nelson All slides Addison Wesley, 2008 Retrieval Models Provide a mathema1cal framework for defining the search process includes

More information

Evaluating Classifiers

Evaluating Classifiers Evaluating Classifiers Charles Elkan elkan@cs.ucsd.edu January 18, 2011 In a real-world application of supervised learning, we have a training set of examples with labels, and a test set of examples with

More information

The Prac)cal Applica)on of Knowledge Discovery to Image Data: A Prac))oners View in the Context of Medical Image Diagnos)cs

The Prac)cal Applica)on of Knowledge Discovery to Image Data: A Prac))oners View in the Context of Medical Image Diagnos)cs The Prac)cal Applica)on of Knowledge Discovery to Image Data: A Prac))oners View in the Context of Medical Image Diagnos)cs Frans Coenen (http://cgi.csc.liv.ac.uk/~frans/) University of Mauri0us, June

More information

Large Scale Data Analysis Using Deep Learning

Large Scale Data Analysis Using Deep Learning Large Scale Data Analysis Using Deep Learning Machine Learning Basics - 1 U Kang Seoul National University U Kang 1 In This Lecture Overview of Machine Learning Capacity, overfitting, and underfitting

More information

Pa#ern Recogni-on for Neuroimaging Toolbox

Pa#ern Recogni-on for Neuroimaging Toolbox Pa#ern Recogni-on for Neuroimaging Toolbox Pa#ern Recogni-on Methods: Basics João M. Monteiro Based on slides from Jessica Schrouff and Janaina Mourão-Miranda PRoNTo course UCL, London, UK 2017 Outline

More information

Analytical model A structure and process for analyzing a dataset. For example, a decision tree is a model for the classification of a dataset.

Analytical model A structure and process for analyzing a dataset. For example, a decision tree is a model for the classification of a dataset. Glossary of data mining terms: Accuracy Accuracy is an important factor in assessing the success of data mining. When applied to data, accuracy refers to the rate of correct values in the data. When applied

More information

Data Mining: Classifier Evaluation. CSCI-B490 Seminar in Computer Science (Data Mining)

Data Mining: Classifier Evaluation. CSCI-B490 Seminar in Computer Science (Data Mining) Data Mining: Classifier Evaluation CSCI-B490 Seminar in Computer Science (Data Mining) Predictor Evaluation 1. Question: how good is our algorithm? how will we estimate its performance? 2. Question: what

More information

CLASSIFICATION JELENA JOVANOVIĆ. Web:

CLASSIFICATION JELENA JOVANOVIĆ.   Web: CLASSIFICATION JELENA JOVANOVIĆ Email: jeljov@gmail.com Web: http://jelenajovanovic.net OUTLINE What is classification? Binary and multiclass classification Classification algorithms Naïve Bayes (NB) algorithm

More information