
Slide 1/56: Classification: Decision Trees. Huiping Cao

Slide 2/56: Example of a Decision Tree. Training data:

Tid | Refund | Marital Status | Taxable Income | Cheat
1   | Yes    | Single         | 125K           | No
2   | No     | Married        | 100K           | No
3   | No     | Single         | 70K            | No
4   | Yes    | Married        | 120K           | No
5   | No     | Divorced       | 95K            | Yes
6   | No     | Married        | 60K            | No
7   | Yes    | Divorced       | 220K           | No
8   | No     | Single         | 85K            | Yes
9   | No     | Married        | 75K            | No
10  | No     | Single         | 90K            | Yes

Attribute types: Refund (categorical), Marital Status (categorical), Taxable Income (continuous), Cheat (class label).

Slides 3/56-8/56: Example of a Decision Tree (cont.). The model is built one splitting attribute at a time over the same training data:
- Root: Refund (Yes / No). Refund = Yes leads to leaf NO.
- Refund = No leads to MarSt (Married / Single, Divorced). MarSt = Married leads to leaf NO.
- MarSt = Single or Divorced leads to TaxInc (< 80K / >= 80K). TaxInc < 80K leads to leaf NO; TaxInc >= 80K leads to leaf YES.

Slide 9/56: Another Example of a Decision Tree. Over the same training data, a different tree also fits:
- Root: MarSt (Married / Single, Divorced). MarSt = Married leads to leaf NO.
- MarSt = Single or Divorced leads to Refund (Yes / No). Refund = Yes leads to leaf NO.
- Refund = No leads to TaxInc (< 80K / >= 80K), with leaves NO and YES.
There could be more than one tree that fits the same data!

Slide 10/56: Decision Tree Classification Task.

Training Set:
Tid | Attrib1 | Attrib2 | Attrib3 | Class
1   | Yes     | Large   | 125K    | No
2   | No      | Medium  | 100K    | No
3   | No      | Small   | 70K     | No
4   | Yes     | Medium  | 120K    | No
5   | No      | Large   | 95K     | Yes
6   | No      | Medium  | 60K     | No
7   | Yes     | Large   | 220K    | No
8   | No      | Small   | 85K     | Yes
9   | No      | Medium  | 75K     | No
10  | No      | Small   | 90K     | Yes

Test Set:
Tid | Attrib1 | Attrib2 | Attrib3 | Class
11  | No      | Small   | 55K     | ?
12  | Yes     | Medium  | 80K     | ?
13  | Yes     | Large   | 110K    | ?
14  | No      | Small   | 95K     | ?
15  | No      | Large   | 67K     | ?

Induction: a tree-induction algorithm learns a model (the decision tree) from the training set. Deduction: the model is then applied to the test set.

Slides 11/56-16/56: Example: Apply Model to Test Data.

Test record: Refund = No, Marital Status = Married, Taxable Income = 80K, Cheat = ?

Start from the root of the tree and follow the branches that match the record:
- Refund = No, so take the No branch to the MarSt node.
- Marital Status = Married, so take the Married branch, which is a leaf.
- Assign Cheat = No to the test record.

Slide 17/56: Decision Tree Classification Task (recap). The same training set, test set, and learn-model / apply-model workflow as on Slide 10/56.

Slide 18/56: Decision Tree Induction.
- How many possible trees? Exponential in the number of attributes.
- Many algorithms find reasonably accurate, albeit suboptimal, trees in a reasonable amount of time:
  - Hunt's Algorithm (the basis of many others)
  - CART (Classification and Regression Trees), a book by Breiman et al.
  - ID3 and C4.5, by Quinlan

Slide 19/56: Hunt's Algorithm.
Let D_t be the set of training records that reach a node t, and let y = {y_1, y_2, ..., y_c} be the class labels.
General procedure:
- If D_t contains only records that belong to the same class y_t, then t is a leaf node labeled y_t.
- If D_t contains records that belong to more than one class, use an attribute test to split the data into smaller subsets, and recursively apply the procedure to each subset.
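
A minimal sketch of this recursion in R, assuming categorical attributes only; the function name hunt and the "take the first remaining attribute" test are illustrative placeholders, not the lecture's implementation (a real induction step would pick the split that optimizes an impurity measure, as described on the following slides):

# Minimal sketch of Hunt's algorithm (categorical attributes only).
hunt <- function(D, target, attrs) {
  labels <- D[[target]]
  # Leaf: all records share one class, or no attributes are left to test
  if (length(unique(labels)) == 1 || length(attrs) == 0) {
    return(list(leaf = TRUE, class = names(which.max(table(labels)))))
  }
  a <- attrs[1]                                   # illustrative attribute test
  parts <- split(D, D[[a]], drop = TRUE)          # one subset per attribute value
  children <- lapply(parts, hunt, target = target, attrs = setdiff(attrs, a))
  list(leaf = FALSE, attribute = a, children = children)
}
# e.g. hunt(df, target = "Cheat", attrs = c("Refund", "MaritalStatus"))  # df is a hypothetical data frame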

Slide 20/56: Decision Tree Induction Algorithms: Design Issues.
- How should the training records be split?
  - How to specify the attribute test condition?
  - How to determine the best split? Greedy strategy: split the records based on an attribute test that optimizes a certain criterion.
- When to stop splitting?
  - Naive: stop when (1) all the records have identical attribute values, or (2) all the records belong to the same class.
  - Is there a better way? Early stopping?

Slide 21/56: How to Specify the Attribute Test Condition?
- Depends on the attribute type: nominal, ordinal, or continuous.
- Depends on the number of ways to split: 2-way split or multi-way split.

Slide 22/56: Splitting Based on Nominal Attributes.
- Multi-way split: use as many partitions as there are distinct values, e.g. CarType into {Family}, {Sports}, {Luxury}.
- Binary split: divide the values into two subsets; need to find the optimal partitioning, e.g. CarType into {Sports, Luxury} vs. {Family}, or {Family, Luxury} vs. {Sports}.

Slide 23/56: Splitting Based on Ordinal Attributes.
- Multi-way split: use as many partitions as there are distinct values, e.g. Size into {Small}, {Medium}, {Large}.
- Binary split: divide the values into two subsets; find the optimal partitioning while preserving the order among attribute values, e.g. Size into {Small, Medium} vs. {Large}, or {Small} vs. {Medium, Large}.
- What about the split {Small, Large} vs. {Medium}? It breaks the order among the values, so it is not allowed for an ordinal attribute.

Slide 24/56: Splitting Based on Continuous Attributes.
- Discretization to form an ordinal categorical attribute:
  - Static: discretize once at the beginning.
  - Dynamic: ranges can be found by equal-interval bucketing, equal-frequency bucketing (percentiles), or clustering.
- Binary decision: (A < v) or (A >= v):
  - Consider all possible splits and find the best cut.
  - Can be more compute intensive.

Slide 25/56: Splitting Based on Continuous Attributes: Example.
(i) Binary split: Taxable Income > 80K? Yes / No.
(ii) Multi-way split: Taxable Income in < 10K, [10K, 25K), [25K, 50K), [50K, 80K), >= 80K.
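
A small R sketch of both options on the Taxable Income values from the earlier training data (the variable names and bin edges follow the slide; the code itself is illustrative):

# Taxable Income values (in K) from the training data on Slide 2/56
income <- c(125, 100, 70, 120, 95, 60, 220, 85, 75, 90)

# (i) Binary decision at v = 80K
binary_split <- income > 80
table(binary_split)

# (ii) Multi-way split via static discretization into ordered buckets
buckets <- cut(income,
               breaks = c(-Inf, 10, 25, 50, 80, Inf),
               labels = c("<10K", "[10K,25K)", "[25K,50K)", "[50K,80K)", ">=80K"),
               right = FALSE, ordered_result = TRUE)
table(buckets)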

Slide 26/56: How to Determine the Best Split.
Before splitting: 10 records of class C0 and 10 records of class C1. Three candidate test conditions:
- Own Car? Yes: C0: 6, C1: 4; No: C0: 4, C1: 6.
- Car Type? Family: C0: 1, C1: 3; Sports: C0: 8, C1: 0; Luxury: C0: 1, C1: 7.
- Student ID? c1 ... c10: each with C0: 1, C1: 0; c11 ... c20: each with C0: 0, C1: 1.
Which test condition is the best?

Slide 27/56: Determine the Best Split: Node Impurity.
Splitting criterion: the splitting attribute, plus a splitting point (continuous) or splitting subset (categorical).
Ideally, the resulting partitions at each branch are as pure as possible, so we need a measure of node impurity:
- The smaller the degree of impurity, the more skewed the class distribution, and the better the split.
- A node with class distribution (0, 1) has zero impurity.
- A node with class distribution (0.5, 0.5) has the highest impurity.

Slide 28/56: Measures of Node Impurity.
- Gini index: Gini(t) = 1 - sum over i of [p(i|t)]^2, i = 0..c-1, where p(i|t) is the probability that an instance in D_t belongs to class C_i, estimated by n_i / |D_t|, with n_i the number of instances in D_t with class label C_i.
- Entropy: Entropy(t) = - sum over i of p(i|t) log_2 p(i|t), i = 0..c-1. The log uses base 2: information is encoded in bits.
- Classification error: Error(t) = 1 - max_i [p(i|t)].
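
A small R sketch of the three measures, each operating on a vector of class counts at a node (the function names are my own, not from the lecture); the example calls reproduce the worked numbers on the later slides:

# Impurity measures over a vector of class counts at a node
gini <- function(counts) {
  p <- counts / sum(counts)
  1 - sum(p^2)
}
entropy <- function(counts) {
  p <- counts / sum(counts)
  p <- p[p > 0]                 # convention: 0 * log2(0) = 0
  -sum(p * log2(p))
}
class_error <- function(counts) {
  p <- counts / sum(counts)
  1 - max(p)
}

gini(c(0, 6)); gini(c(1, 5)); gini(c(2, 4)); gini(c(3, 3))
# 0.000 0.278 0.444 0.500
entropy(c(0, 6)); entropy(c(1, 5)); entropy(c(2, 4)); entropy(c(3, 3))
# 0.00 0.65 0.92 1.00
class_error(c(0, 6)); class_error(c(1, 5)); class_error(c(2, 4)); class_error(c(3, 3))
# 0.000 0.167 0.333 0.500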

Slide 29/56: Determine the Best Split: Information Gain.
Before splitting, the parent node N has class counts C0: N00, C1: N01 and impurity M0 = I(N).
- Test A? splits N into nodes N1 (C0: N10, C1: N11) and N2 (C0: N20, C1: N21), with impurities M1 and M2; M12 is their weighted combination.
- Test B? splits N into nodes N3 (C0: N30, C1: N31) and N4 (C0: N40, C1: N41), with impurities M3 and M4; M34 is their weighted combination.
Gain = I(parent) - sum over j = 1..k of (N(v_j)/N) * I(v_j). Compare Gain = M0 - M12 vs. M0 - M34 and choose the larger.

Slide 30/56: Measures of Node Impurity: Gini.
Gini(t) = 1 - sum over i of [p(i|t)]^2, i = 0..c-1, where p(i|t) is the relative frequency of class i at node t.
- Maximum: (1 - 1/n_c), when records are equally distributed among all classes, implying the least interesting information.
- Minimum: 0.0, when all records belong to one class, implying the most interesting information.
- First used in CART, which allows only binary splits.

Slide 31/56: Measures of Node Impurity: Gini Examples.
- C1: 0, C2: 6: Gini = 1 - (0/6)^2 - (6/6)^2 = 0
- C1: 1, C2: 5: Gini = 1 - (1/6)^2 - (5/6)^2 = 0.278
- C1: 2, C2: 4: Gini = 1 - (2/6)^2 - (4/6)^2 = 0.444
- C1: 3, C2: 3: Gini = 1 - (3/6)^2 - (3/6)^2 = 0.500

Slide 32/56: Binary Attributes: Computing the Gini Index.
Splitting into two partitions; the weighting of the partitions matters: larger and purer partitions are sought.
Parent node: C1: 6, C2: 6, Gini(Parent) = 0.5.
Split on B?: node N1 (Yes): C1: 5, C2: 2; node N2 (No): C1: 1, C2: 4.
- Gini(N1) = 1 - (5/7)^2 - (2/7)^2 = 0.408
- Gini(N2) = 1 - (1/5)^2 - (4/5)^2 = 0.320
- Gini(Children) = (7/12) * Gini(N1) + (5/12) * Gini(N2) = 0.371
- Gain = Gini(Parent) - Gini(Children) = 0.5 - 0.371 = 0.129
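
The same computation in R, reusing a counts-based gini() helper (a sketch; the counts are those on the slide):

gini <- function(counts) { p <- counts / sum(counts); 1 - sum(p^2) }

parent <- c(6, 6)                        # C1, C2 at the parent
n1 <- c(5, 2); n2 <- c(1, 4)             # children of the B? split
w  <- c(sum(n1), sum(n2)) / sum(parent)  # partition weights 7/12 and 5/12

gini_children <- w[1] * gini(n1) + w[2] * gini(n2)   # 0.371
gain <- gini(parent) - gini_children                 # 0.129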

Slide 33/56: Categorical Attributes: Computing the Gini Index.
For each distinct value, gather the counts for each class in the dataset, and use the count matrix to make decisions.
- Multi-way split on CarType: Family (C1: 1, C2: 4), Sports (C1: 2, C2: 1), Luxury (C1: 1, C2: 1).
- Two-way split (find the best partition of values):
  - {Sports, Luxury} (C1: 3, C2: 2) vs. {Family} (C1: 1, C2: 4)
  - {Sports} (C1: 2, C2: 1) vs. {Family, Luxury} (C1: 2, C2: 5)
Gini calculation example on the next slide.

Slide 34/56: Categorical Attributes: Computing the Gini Index (cont.).
Multi-way split on CarType:
- Gini(Family) = 1 - (1/25) - (16/25) = 8/25
- Gini(Sports) = 1 - (4/9) - (1/9) = 4/9
- Gini(Luxury) = 1/2
- Gini(all) = (5/10)(8/25) + (3/10)(4/9) + (2/10)(1/2) = 4/25 + 2/15 + 1/10 = (24 + 20 + 15)/150 = 59/150 = 0.393
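
The same multi-way computation, sketched in R for an arbitrary number of partitions (the count matrix is the one from the previous slide; the variable names are my own):

gini <- function(counts) { p <- counts / sum(counts); 1 - sum(p^2) }

# Columns = CarType values; rows = classes C1, C2
counts <- cbind(Family = c(1, 4), Sports = c(2, 1), Luxury = c(1, 1))

weights  <- colSums(counts) / sum(counts)           # 5/10, 3/10, 2/10
gini_all <- sum(weights * apply(counts, 2, gini))   # 59/150 = 0.393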

Slide 35/56: Continuous Attributes: Computing the Gini Index.
- Use binary decisions based on one value. There are several choices for the splitting value: the number of possible splitting values equals the number of distinct values.
- Each splitting value v has a count matrix associated with it: the class counts in each of the two partitions, A < v and A >= v.
- Simple method to choose the best v: for each v, scan the database to gather the count matrix and compute its Gini index. Computationally inefficient! It repeats work.

Slide 36/56: Continuous Attributes: Computing the Gini Index (cont.). The Taxable Income column of the training data from Slide 2/56 is used as the running example.

Slide 37/56: Continuous Attributes: Computing the Gini Index (cont.).
For efficient computation, for each attribute:
- Sort the attribute values.
- Linearly scan these values, each time updating the count matrix and computing the Gini index.
- Choose the split position that has the least Gini index.
Sorted Taxable Income values: 60, 70, 75, 85, 90, 95, 100, 120, 125, 220, with Cheat labels No, No, No, Yes, Yes, Yes, No, No, No, No. Candidate split positions are the midpoints between adjacent sorted values; the best split here is at a threshold between 95 and 100 (e.g. 97), giving a weighted Gini of 0.300.
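
A sketch of this search in R on the Taxable Income column (for clarity it recomputes the counts at each candidate cut rather than updating the count matrix incrementally as the slide describes; the helper names are my own):

gini <- function(counts) { p <- counts / sum(counts); 1 - sum(p^2) }

income <- c(125, 100, 70, 120, 95, 60, 220, 85, 75, 90)   # Taxable Income (K)
cheat  <- c("No","No","No","No","Yes","No","No","Yes","No","Yes")

best_split <- function(x, y) {
  v    <- sort(unique(x))
  cuts <- (head(v, -1) + tail(v, -1)) / 2                 # candidate midpoints
  wg <- sapply(cuts, function(cut) {
    left <- y[x <= cut]; right <- y[x > cut]
    (length(left) / length(y)) * gini(table(left)) +
      (length(right) / length(y)) * gini(table(right))
  })
  list(cut = cuts[which.min(wg)], gini = min(wg))
}
best_split(income, cheat)   # cut = 97.5, weighted Gini = 0.3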

Slide 38/56: Entropy.
Entropy(t) = - sum over i of p(i|t) log_2 p(i|t), i = 0..c-1. Measures the homogeneity of a node.
- Maximum: log_2(n_c), when records are equally distributed among all classes, implying the least information.
- Minimum: 0.0, when all records belong to one class, implying the most information.

Slide 39/56: Examples for Computing Entropy.
Entropy(t) = - sum over i of p(i|t) log_2 p(i|t)
- C1: 0, C2: 6: Entropy = -0 log_2 0 - 1 log_2 1 = 0 (a probability of 0 means the event does not happen; we take 0 log 0 = 0)
- C1: 1, C2: 5: Entropy = -(1/6) log_2(1/6) - (5/6) log_2(5/6) = 0.65
- C1: 2, C2: 4: Entropy = -(2/6) log_2(2/6) - (4/6) log_2(4/6) = 0.92
- C1: 3, C2: 3: Entropy = -(3/6) log_2(3/6) - (3/6) log_2(3/6) = 1.00

Slide 40/56: Splitting Based on Information Gain.
Information Gain = Entropy(p) - sum over j = 1..k of (N(v_j)/N) Entropy(v_j). Used in ID3.
Disadvantage: tends to prefer splits that result in a large number of partitions, each being small but pure. Consider a partition on ID: Entropy(children) = 0, so the gain is maximal even though the split is useless for prediction.

Slide 41/56: Splitting Based on Gain Ratio.
Gain Ratio = Information Gain / Split Info, where Split Info = - sum over j = 1..k of (n_j/n) log_2(n_j/n).
The parent node p is split into k partitions; n_j is the number of records in partition j.
Higher-entropy partitioning (a large number of small partitions) is penalized.
Used in C4.5, the successor of ID3; designed to overcome the disadvantage of Information Gain.

Slide 42/56: Classification Error.
Error(t) = 1 - max_i p(i|t). Measures the misclassification error made by a node.
- Maximum: (1 - 1/n_c), when records are equally distributed among all classes, implying the least interesting information.
- Minimum: 0.0, when all records belong to one class, implying the most interesting information.

Slide 43/56: Examples for Computing Classification Error.
Error(t) = 1 - max_i p(i|t)
- C1: 0, C2: 6: Error = 1 - max(0, 1) = 0
- C1: 1, C2: 5: Error = 1 - max(1/6, 5/6) = 1 - 5/6 = 1/6
- C1: 2, C2: 4: Error = 1 - max(2/6, 4/6) = 1 - 4/6 = 1/3
- C1: 3, C2: 3: Error = 1 - max(3/6, 3/6) = 0.5

Slide 44/56: Comparison among Splitting Criteria. For a 2-class problem (figure plotting Entropy, the Gini index, and Misclassification error against the fraction of records in one class).

Slide 45/56: Stopping Criteria for Tree Induction.
- Stop expanding a node when all the records belong to the same class.
- Stop expanding a node when all the records have similar attribute values.
- Early termination: use a threshold.

Slide 46/56: Examples of Algorithms.
- ID3: uses Entropy and Information Gain.
- C4.5: uses Entropy and Normalized Information Gain (Gain Ratio). Download the software from:
- CART: uses the Gini Index; only binary splits.
Which measure is the best? The time complexity of DT induction increases exponentially with tree height, which favors shallower trees; however, shallow trees tend to have a large number of leaves and higher error rates.

Slide 47/56: Example: C4.5.
- Simple depth-first construction.
- Uses Information Gain.
- Sorts continuous attributes at each node.
- Needs the entire dataset to fit in memory; unsuitable for large datasets.
- Download the software from:

Slide 48/56: DT Example (1).

RID | age    | income | student | credit_rating | buys_computer
1   | <= 30  | high   | no      | fair          | no
2   | <= 30  | high   | no      | excellent     | no
3   | 31..40 | high   | no      | fair          | yes
4   | > 40   | medium | no      | fair          | yes
5   | > 40   | low    | yes     | fair          | yes
6   | > 40   | low    | yes     | excellent     | no
7   | 31..40 | low    | yes     | excellent     | yes
8   | <= 30  | medium | no      | fair          | no
9   | <= 30  | low    | yes     | fair          | yes
10  | > 40   | medium | yes     | fair          | yes
11  | <= 30  | medium | yes     | excellent     | yes
12  | 31..40 | medium | no      | excellent     | yes
13  | 31..40 | high   | yes     | fair          | yes
14  | > 40   | medium | no      | excellent     | no

Class label attribute: buys_computer.
Preprocess the age attribute: <= 30: young; 31..40: middle_aged; > 40: senior.

Slide 49/56: DT Example (2).

RID | age         | income | student | credit_rating | buys_computer
1   | young       | high   | no      | fair          | no
2   | young       | high   | no      | excellent     | no
3   | middle_aged | high   | no      | fair          | yes
4   | senior      | medium | no      | fair          | yes
5   | senior      | low    | yes     | fair          | yes
6   | senior      | low    | yes     | excellent     | no
7   | middle_aged | low    | yes     | excellent     | yes
8   | young       | medium | no      | fair          | no
9   | young       | low    | yes     | fair          | yes
10  | senior      | medium | yes     | fair          | yes
11  | young       | medium | yes     | excellent     | yes
12  | middle_aged | medium | no      | excellent     | yes
13  | middle_aged | high   | yes     | fair          | yes
14  | senior      | medium | no      | excellent     | no

Slide 50/56: DT Example (3): Information Gain Based on the Gini Index.
Number of classes: C1 (yes): 9 records; C2 (no): 5 records.
Gini(D) = 1 - (9/14)^2 - (5/14)^2 = 0.459
Consider using the attribute income as the splitting attribute, with the binary partition {low, medium} vs. {high}:
- D1: income in {low, medium} (10 records: 7 yes, 3 no): Gini(D1) = 1 - (7/10)^2 - (3/10)^2 = 0.42
- D2: income in {high} (4 records: 2 yes, 2 no): Gini(D2) = 1 - (2/4)^2 - (2/4)^2 = 1/2
- Gini(children) = (10/14) * Gini(D1) + (4/14) * Gini(D2) = 0.443
- Gain = Gini(D) - Gini(children) = 0.459 - 0.443 = 0.016
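
These numbers can be checked directly against the table on Slide 49/56; a short R sketch (reusing the counts-based gini() helper):

gini <- function(counts) { p <- counts / sum(counts); 1 - sum(p^2) }

income <- c("high","high","high","medium","low","low","low",
            "medium","low","medium","medium","medium","high","medium")
buys   <- c("no","no","yes","yes","yes","no","yes",
            "no","yes","yes","yes","yes","yes","no")

in_d1 <- income %in% c("low", "medium")
gini(table(buys))                                    # Gini(D) = 0.459
g_children <- mean(in_d1)  * gini(table(buys[in_d1])) +
              mean(!in_d1) * gini(table(buys[!in_d1]))
g_children                                           # 0.443
gini(table(buys)) - g_children                       # Gain = 0.016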

Slide 51/56: DT Example (4): Information Gain Based on Entropy.
Entropy(D) = -(9/14) log_2(9/14) - (5/14) log_2(5/14) = 0.94
Compute the expected entropy for each attribute. Start with the attribute age, considering a multi-way (3-way) split:
- D_young: 2 yes, 3 no; D_middle_aged: 4 yes, 0 no; D_senior: 3 yes, 2 no.
- Average entropy = (5/14)(0.971) + (4/14)(0) + (5/14)(0.971) = 0.694
- Gain(age) = 0.94 - 0.694 = 0.246
Compute the gain for the other attributes: Gain(income) = 0.029, Gain(student) = 0.151, Gain(credit_rating) = 0.048.
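
A quick R check of Gain(age) from the preprocessed table (a sketch, with entropy() defined as on Slide 28/56):

entropy <- function(counts) { p <- counts / sum(counts); p <- p[p > 0]; -sum(p * log2(p)) }

age  <- c("young","young","middle_aged","senior","senior","senior","middle_aged",
          "young","young","senior","young","middle_aged","middle_aged","senior")
buys <- c("no","no","yes","yes","yes","no","yes",
          "no","yes","yes","yes","yes","yes","no")

info_age <- sum(sapply(split(buys, age), function(part)
  length(part) / length(buys) * entropy(table(part))))
entropy(table(buys)) - info_age    # Gain(age) = 0.246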

Slide 52/56: DT Example (5): Gain Ratio.
Consider the attribute income: low: 4 records, medium: 6 records, high: 4 records.
Split Info(D) = -(4/14) log_2(4/14) - (6/14) log_2(6/14) - (4/14) log_2(4/14) = 1.557
Gain ratio(income) = Gain(income) / Split Info(D) = 0.029 / 1.557 = 0.019
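
And the corresponding R check (a sketch; the income vector is the one from the buys_computer table):

income <- c("high","high","high","medium","low","low","low",
            "medium","low","medium","medium","medium","high","medium")
n_j <- table(income)                                        # high: 4, low: 4, medium: 6
split_info <- -sum(n_j / sum(n_j) * log2(n_j / sum(n_j)))   # 1.557
0.029 / split_info                                          # Gain ratio(income) ~ 0.019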

Slide 53/56: Decision Tree R Packages.
> install.packages("tree")
> library(tree)
> require(tree)
> help(tree)
> install.packages("rpart")
> library(rpart)
> require(rpart)
> help(rpart)
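
For instance, a minimal rpart run on the buys_computer data from Slides 49-52 might look like the following (a sketch; the data-frame construction and the control settings are my own choices, not from the lecture):

library(rpart)

# buys_computer training data (Slide 49/56)
df <- data.frame(
  age           = c("young","young","middle_aged","senior","senior","senior","middle_aged",
                    "young","young","senior","young","middle_aged","middle_aged","senior"),
  income        = c("high","high","high","medium","low","low","low",
                    "medium","low","medium","medium","medium","high","medium"),
  student       = c("no","no","no","no","yes","yes","yes","no","yes","yes","yes","no","yes","no"),
  credit_rating = c("fair","excellent","fair","fair","fair","excellent","excellent",
                    "fair","fair","fair","excellent","excellent","fair","excellent"),
  buys_computer = c("no","no","yes","yes","yes","no","yes","no","yes","yes","yes","yes","yes","no"),
  stringsAsFactors = TRUE
)

# Grow a classification tree using entropy ("information") as the split criterion;
# minsplit is lowered because this toy dataset has only 14 rows.
fit <- rpart(buys_computer ~ ., data = df, method = "class",
             parms = list(split = "information"),
             control = rpart.control(minsplit = 2, cp = 0.01))
print(fit)
predict(fit, df, type = "class")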

Slide 54/56: Decision Tree Based Classification: Advantages.
- Inexpensive to construct.
- Extremely fast at classifying unknown records.
- Easy to interpret for small-sized trees.
- Accuracy is comparable to other classification techniques for many simple data sets.

Slide 55/56: Decision Boundary.
- Decision boundary: the border line between two neighboring regions of different classes.
- The decision boundary is parallel to the axes because each test condition involves a single attribute at a time.
- Oblique decision tree: the test condition may involve multiple attributes, e.g. x + y < 1. A more expressive representation, but finding the optimal test condition is computationally expensive.
- Constructive induction: create composite attributes so that the data can be partitioned into homogeneous, non-rectangular regions.

Slide 56/56: Tree Replication.
(Figure: a decision tree over tests P, Q, R, S with 0/1 leaves, in which the same subtree appears on more than one branch, illustrating the tree-replication problem.)

More information