Data Mining. Covering algorithms. Covering approach At each stage you identify a rule that covers some of instances. Fig. 4.
|
|
- Kristian Hampton
- 6 years ago
- Views:
Transcription
1 Data Mining Chapter 4. Algorithms: The Basic Methods (Covering algorithm, Association rule, Linear models, Instance-based learning, Clustering) 1 Covering approach At each stage you identify a rule that covers some of instances. Fig. 4.6 (a) 2 1
2 A set of rules covering the a s if x > 1.2 and y > 2.6 then class = a if x > 1.4 and y > 2.4 then class = a A set of rules covering the b s if x 1.2 then class = b if x > 1.2 and y 2.6 then class = b 3 Covering algorithm Choosing an attribute-value pair to maximize the probability of the desired classification Including as many instances of the desired class as possible Excluding as many instances of other classes as possible Weka rules: PRISM method 4 2
3 A basic rule learner Maximizing the accuracy p(positive examples) / t(instances) 5 Contact lens problem IF? THEN recommendation = hard Age = young, 2/8 Age = pre-presbyopic, 1/8 Age = presbyopic, 1/8 Spectacle prescription = myope, 3/12 Spectacle prescription = hypermetrope, 1/12 Astigmatism = no, 0/12 Astigmatism = yes, 4/12 Tear production rate = reduced, 0/12 Tear production rate = normal, 4/12 6 3
4 Accuracy p / t Break ties by choosing the condition with the largest p Selecting largest fraction : 4/12 (at random) IF astigmatism = yes THEN recommendation = hard 7 Refinement IF astigmatism = yes AND?, THEN recommendation = hard Age = young, 2/4 Age = pre-presbyopic, 1/4 Age = presbyopic, 1/4 Spectacle prescription = myope, 3/6 Spectacle prescription = hypermetrope, 1/6 Tear production rate = reduced, 0/6 Tear production rate = normal, 4/6 IF astigmatism=yes AND tear production rate=normal 8 4
5 Exact rules Age=young, 2/2 Age=pre-presbyopic, 1/2 Age=presbyopic, 1/2 Spectacle prescription=myope, 3/3 greater coverage Spectacle prescription=hypermetrope, 1/3 IF astigmatism=yes AND tear production rate=normal AND spectacle prescription=myope, THEN recommendation = hard 9 Checking the coverage rate The above rule covers ¾. (4: hard ) delete! (3 instances) Looking for another rule IF? THEN recommendation = hard best choice : age=young (coverage: 7) IF age=young AND astigmatism=yes AND tear production rate=normal, 1/1 Covering 2 out of the original instances! 10 5
6 For another class For soft and none PRISM Adding clauses to each rule until it s perfect only correct rules. 11 Rules vs trees Tree : taking all classes into account Rule : one class at a time (more compact!) Decision tree Applied in order Execution stops as soon as one rule applies. Order-independent rules Independent nuggets of knowledge Disadvantage Being not clear what to do when conflicting rules apply e.g.) rules different classes 12 6
7 Mining association rules Association rules Weka Apriori algorithm Coverage support Accuracy confidence Association rules with high coverage Attribute-value pair: item Item sets Table 4.10 Item sets for weather data with coverage 2 or greater One-item sets, Two-item sets, Three-item sets 6 Four-item sets 13 Mining association rules Association rules A three-item set with a coverage of 4 (Table 4.10): humidity=normal, windy=false, play=yes 7 potential rules IF humidity = normal and windy = false, THEN play = yes 4/4 IF humidity = normal and play = yes, THEN windy = false 4/6 IF windy = false and play = yes, THEN humidity = normal 4/6 IF humidity = normal, THEN windy = false and play = yes 4/7 IF windy = false, THEN humidity = normal and play = yes 4/8 IF play = yes, THEN humidity = normal and windy = false 4/9 IF, THEN humidity = normal and windy = false and play = yes 4/14 4: coverage, 4/4: accuracy Assuming that minimum specified accuracy is 100%, then 1 st rule! Table 4.11: the final rule set 14 7
8 Linear models Numeric prediction: Linear regression Class and all attributes: numeric Expressing the class as a linear combination of the attributes x = w 0 + w 1 a 1 +w 2 a 2 + +w k a k where x: class, w k : weights, a k : attribute values 15 Linear models Minimizing the sum of the squared differences Predicted value w 0 a (1) 0 + w 1 a (1) w k a (1) k k = j=0 w j a j The sum of the squared differences» n : training instances» x (i) : ith instance s actual class (1) (1): 1st instance n k i=1 ( x(i) - j=0 w j a (i) j )
9 Linear models Disadvantages Linearity For a nonlinear dependency the best-fitting straight line the least mean-squared difference Linear models serve well as building blocks for more complex learning methods. 17 Linear classification using the perceptron From a biological viewpoint, a mathematical model for the operation of brain a method of representing functions using networks 18 9
10 Linear classification using the perceptron Neural networks Input units (nodes), hidden units, output units Links, (numeric) weights Network structure : feed-forward (unidirectional, no cycles) Input function in i, activation function g I 1 w 13 H 3 w 35 w 23 w 14 O 5 I 2 H 4 w 45 w Linear classification using the perceptron a i g in i = g( w j,i a j ) j Input links in i g a i output links a 5 = g(w 3,5 a 3 + w 4,5 a 4 ) = g(w 3,5 g(w 1,3 a 1 + w 2,3 a 2 ) + w 4,5 g(w 1,4 a 1 + w 2,4 a 2 )) A complex nonlinear function 20 10
11 Linear classification using the perceptron a 1 a 2 w 1 w 2 +/- a n threshold w n T If a 1 w 1 + a 2 w a n w n > T then positive examples else negative examples 21 Linear classification using the perceptron activation function g 1(firing): when the input is greater than its threshold 0(no firing): otherwise hard threshold 1: positive 0 or -1: negative Sigmoid function : smooth transition To determine a predicted value anywhere between 0 and
12 Linear classification using the perceptron Perceptrons Layered feed-forward networks A single-layer: no hidden layer Weight update rule Predicted output for the single output unit: O Correct output: T Error: T-O If error is positive, increase O. If it s negative, decrease O. Learning rate(gain factor) W j W j + I J Error Activation of input I j 23 Linear classification using the perceptron Example) classification errors lead to changes in weights When the misclassified instance is positive, w i = ƞv i When the misclassified instance is negative, w i = ƞv i 24 12
13 Linear classification using the perceptron Initial hypothesis 1.0 Height Girth 1.5 ƞ=0.04, Instance(Girth, Height) = {(1.75,6.0), (2.0,5.0), (2.5,5.0), (3.0,6.25)} Positives = {(1.75,6.0), (2.0,5.0)} Negatives = {(2.5,5.0), (3.0,6.25)} The misclassified instance is positive: (2.0,5.0) w for threshold = =0.04 w for girth = =0.08 w for height = = H G Linear classification using the perceptron The misclassified instance is negative: (3.0,6.25) w for threshold = 0.04 w for girth = =0.12 w for height = = H G 1.5 Final revised hypothesis 1.15Height Girth 1.54 if the training set is linearly separable, it is guaranteed to converge in a finite number of iterations. useful approximations even when the target concepts are not linearly separable 26 13
14 Once the nearest training instance has been located, its class is predicted for test instance. Distance function Determining which member of the training set is closest to an unknown test instance Euclidian distance Distance between an instance with a 1 (1), a 2 (1),.. a k (1) and one with values a 1 (2), a 2 (2),, a k (2) k: attributes, (#): instances 27 the sum of squares (a 1 (1) a 1 2 ) 2 + (a 2 1 a 2 2 ) (a k 1 a k 2 ) 2 Normalization ([0..1]) a i = v i min v i max v i min v i Nominal attributes If the values are the same, difference: 0. Otherwise, difference:
15 Finding nearest neighbors efficiently Finding which member of training set is closest to an unknown test instance Calculating the distance from every member of the training set and selecting the smallest Being linear in the number of training instances Representing the training set as a tree kd-tree Storing a set of points in k-dimensional space» k-dimensional space: the number of attributes 29 Root (horizontally) (vertically) 30 15
16 Speeding up nearest-neighbor calculations ; h 2 ; v 5 ; v closest Good first approximation! (log 2 n ) where n: depth of the tree 32 16
17 Using hyperspheres, not hyperrectangles Squares are not the best shape, because of their corners Ball tree Fig 4.14 The nodes of actual ball trees: the center and radius of their ball The leaf nodes: the points they contain 33 Splitting method Choose the point in the ball that is farthest from its center. Then choose a 2 nd point that is farthest from the 1 st one. Assign all data points in the ball to the closest one of these two cluster centers. Then compute the centroid of each cluster and the minimum radius required for it to enclose all the data points
18 35 To use a ball tree to find the nearest neighbor to a given target Traversing the tree from the top down to locate the leaf that contains the target and find the closest point to the target in that ball If the distance from target to the sibling s center exceeds its radius plus the current upper bound, it cannot possibly contain a close point. Otherwise, the sibling must be examined by descending the tree further. Ruling out (Fig. 4.15) 36 18
19 37 Clustering techniques Clustering When there is no class to be predicted but rather when the instances are to be divided into natural groups (unsupervised learning) Iterative distance-based clustering k-means Specifying in advance how many clusters are being sought: k k points are chosen at random as cluster centers. All instances are assigned to their closest cluster center according to Euclidean distance metric. The mean of the instances in each cluster is calculated
20 Clustering These means are taken to be new center values for their respective clusters. Repeated until the cluster centers have stabilized. Minimizing V V = k i=1 j S i x j - u i 2 where» k: clusters» S i for i = 1, 2,, k» U i : the mean point of the points x j S i 39 Clustering The overall effect Minimizing the total squared distance from all points to their cluster centers The minimum: local optimum not a global optimum Final clusters: being sensitive to the initial cluster centers Running the algorithm several times with different initial choices and choosing the best final result The one with the smallest total squared distance 40 20
21
Chapter 4: Algorithms CS 795
Chapter 4: Algorithms CS 795 Inferring Rudimentary Rules 1R Single rule one level decision tree Pick each attribute and form a single level tree without overfitting and with minimal branches Pick that
More informationData Mining Algorithms: Basic Methods
Algorithms: The basic methods Inferring rudimentary rules Data Mining Algorithms: Basic Methods Chapter 4 of Data Mining Statistical modeling Constructing decision trees Constructing rules Association
More informationChapter 4: Algorithms CS 795
Chapter 4: Algorithms CS 795 Inferring Rudimentary Rules 1R Single rule one level decision tree Pick each attribute and form a single level tree without overfitting and with minimal branches Pick that
More informationData Mining Part 4. Tony C Smith WEKA Machine Learning Group Department of Computer Science University of Waikato
Data Mining Part 4 Tony C Smith WEKA Machine Learning Group Department of Computer Science University of Waikato Algorithms: The basic methods Inferring rudimentary rules Statistical modeling Constructing
More informationInstance-Based Representations. k-nearest Neighbor. k-nearest Neighbor. k-nearest Neighbor. exemplars + distance measure. Challenges.
Instance-Based Representations exemplars + distance measure Challenges. algorithm: IB1 classify based on majority class of k nearest neighbors learned structure is not explicitly represented choosing k
More informationHomework 1 Sample Solution
Homework 1 Sample Solution 1. Iris: All attributes of iris are numeric, therefore ID3 of weka cannt be applied to this data set. Contact-lenses: tear-prod-rate = reduced: none tear-prod-rate = normal astigmatism
More informationData Mining and Machine Learning: Techniques and Algorithms
Instance based classification Data Mining and Machine Learning: Techniques and Algorithms Eneldo Loza Mencía eneldo@ke.tu-darmstadt.de Knowledge Engineering Group, TU Darmstadt International Week 2019,
More informationRepresenting structural patterns: Reading Material: Chapter 3 of the textbook by Witten
Representing structural patterns: Plain Classification rules Decision Tree Rules with exceptions Relational solution Tree for Numerical Prediction Instance-based presentation Reading Material: Chapter
More informationInput: Concepts, Instances, Attributes
Input: Concepts, Instances, Attributes 1 Terminology Components of the input: Concepts: kinds of things that can be learned aim: intelligible and operational concept description Instances: the individual,
More informationMachine Learning and Pervasive Computing
Stephan Sigg Georg-August-University Goettingen, Computer Networks 17.12.2014 Overview and Structure 22.10.2014 Organisation 22.10.3014 Introduction (Def.: Machine learning, Supervised/Unsupervised, Examples)
More informationSummary. Machine Learning: Introduction. Marcin Sydow
Outline of this Lecture Data Motivation for Data Mining and Learning Idea of Learning Decision Table: Cases and Attributes Supervised and Unsupervised Learning Classication and Regression Examples Data:
More informationData Mining. 3.5 Lazy Learners (Instance-Based Learners) Fall Instructor: Dr. Masoud Yaghini. Lazy Learners
Data Mining 3.5 (Instance-Based Learners) Fall 2008 Instructor: Dr. Masoud Yaghini Outline Introduction k-nearest-neighbor Classifiers References Introduction Introduction Lazy vs. eager learning Eager
More informationFor Monday. Read chapter 18, sections Homework:
For Monday Read chapter 18, sections 10-12 The material in section 8 and 9 is interesting, but we won t take time to cover it this semester Homework: Chapter 18, exercise 25 a-b Program 4 Model Neuron
More informationPattern Recognition. Kjell Elenius. Speech, Music and Hearing KTH. March 29, 2007 Speech recognition
Pattern Recognition Kjell Elenius Speech, Music and Hearing KTH March 29, 2007 Speech recognition 2007 1 Ch 4. Pattern Recognition 1(3) Bayes Decision Theory Minimum-Error-Rate Decision Rules Discriminant
More informationUnsupervised Learning Hierarchical Methods
Unsupervised Learning Hierarchical Methods Road Map. Basic Concepts 2. BIRCH 3. ROCK The Principle Group data objects into a tree of clusters Hierarchical methods can be Agglomerative: bottom-up approach
More informationData Mining Practical Machine Learning Tools and Techniques
Algorithms: The basic methods Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter of Data Mining by I. H. Witten and. Frank Inferring rudimentary rules Statistical modeling Constructing
More informationAdvanced learning algorithms
Advanced learning algorithms Extending decision trees; Extraction of good classification rules; Support vector machines; Weighted instance-based learning; Design of Model Tree Clustering Association Mining
More informationMachine Learning. Unsupervised Learning. Manfred Huber
Machine Learning Unsupervised Learning Manfred Huber 2015 1 Unsupervised Learning In supervised learning the training data provides desired target output for learning In unsupervised learning the training
More informationData Mining and Knowledge Discovery Practice notes 2
Keywords Data Mining and Knowledge Discovery: Practice Notes Petra Kralj Novak Petra.Kralj.Novak@ijs.si Data Attribute, example, attribute-value data, target variable, class, discretization Algorithms
More informationHierarchical Clustering 4/5/17
Hierarchical Clustering 4/5/17 Hypothesis Space Continuous inputs Output is a binary tree with data points as leaves. Useful for explaining the training data. Not useful for making new predictions. Direction
More informationData Mining and Knowledge Discovery: Practice Notes
Data Mining and Knowledge Discovery: Practice Notes Petra Kralj Novak Petra.Kralj.Novak@ijs.si 8.11.2017 1 Keywords Data Attribute, example, attribute-value data, target variable, class, discretization
More informationAssociation Rule Mining and Clustering
Association Rule Mining and Clustering Lecture Outline: Classification vs. Association Rule Mining vs. Clustering Association Rule Mining Clustering Types of Clusters Clustering Algorithms Hierarchical:
More informationWhat is Learning? CS 343: Artificial Intelligence Machine Learning. Raymond J. Mooney. Problem Solving / Planning / Control.
What is Learning? CS 343: Artificial Intelligence Machine Learning Herbert Simon: Learning is any process by which a system improves performance from experience. What is the task? Classification Problem
More information6.034 Quiz 2, Spring 2005
6.034 Quiz 2, Spring 2005 Open Book, Open Notes Name: Problem 1 (13 pts) 2 (8 pts) 3 (7 pts) 4 (9 pts) 5 (8 pts) 6 (16 pts) 7 (15 pts) 8 (12 pts) 9 (12 pts) Total (100 pts) Score 1 1 Decision Trees (13
More informationData Mining and Machine Learning. Instance-Based Learning. Rote Learning k Nearest-Neighbor Classification. IBL and Rule Learning
Data Mining and Machine Learning Instance-Based Learning Rote Learning k Nearest-Neighbor Classification Prediction, Weighted Prediction choosing k feature weighting (RELIEF) instance weighting (PEBLS)
More informationData Mining. Lesson 9 Support Vector Machines. MSc in Computer Science University of New York Tirana Assoc. Prof. Dr.
Data Mining Lesson 9 Support Vector Machines MSc in Computer Science University of New York Tirana Assoc. Prof. Dr. Marenglen Biba Data Mining: Content Introduction to data mining and machine learning
More informationSupervised vs unsupervised clustering
Classification Supervised vs unsupervised clustering Cluster analysis: Classes are not known a- priori. Classification: Classes are defined a-priori Sometimes called supervised clustering Extract useful
More informationCluster Analysis. Mu-Chun Su. Department of Computer Science and Information Engineering National Central University 2003/3/11 1
Cluster Analysis Mu-Chun Su Department of Computer Science and Information Engineering National Central University 2003/3/11 1 Introduction Cluster analysis is the formal study of algorithms and methods
More informationNeural Networks. CE-725: Statistical Pattern Recognition Sharif University of Technology Spring Soleymani
Neural Networks CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Biological and artificial neural networks Feed-forward neural networks Single layer
More information1 Case study of SVM (Rob)
DRAFT a final version will be posted shortly COS 424: Interacting with Data Lecturer: Rob Schapire and David Blei Lecture # 8 Scribe: Indraneel Mukherjee March 1, 2007 In the previous lecture we saw how
More informationCMPT 882 Week 3 Summary
CMPT 882 Week 3 Summary! Artificial Neural Networks (ANNs) are networks of interconnected simple units that are based on a greatly simplified model of the brain. ANNs are useful learning tools by being
More informationData Mining and Knowledge Discovery Practice notes: Numeric Prediction, Association Rules
Keywords Data Mining and Knowledge Discovery: Practice Notes Petra Kralj Novak Petra.Kralj.Novak@ijs.si Data Attribute, example, attribute-value data, target variable, class, discretization Algorithms
More informationFeature Extractors. CS 188: Artificial Intelligence Fall Nearest-Neighbor Classification. The Perceptron Update Rule.
CS 188: Artificial Intelligence Fall 2007 Lecture 26: Kernels 11/29/2007 Dan Klein UC Berkeley Feature Extractors A feature extractor maps inputs to feature vectors Dear Sir. First, I must solicit your
More informationMidterm Examination CS 540-2: Introduction to Artificial Intelligence
Midterm Examination CS 54-2: Introduction to Artificial Intelligence March 9, 217 LAST NAME: FIRST NAME: Problem Score Max Score 1 15 2 17 3 12 4 6 5 12 6 14 7 15 8 9 Total 1 1 of 1 Question 1. [15] State
More informationClassification Lecture Notes cse352. Neural Networks. Professor Anita Wasilewska
Classification Lecture Notes cse352 Neural Networks Professor Anita Wasilewska Neural Networks Classification Introduction INPUT: classification data, i.e. it contains an classification (class) attribute
More informationBased on Raymond J. Mooney s slides
Instance Based Learning Based on Raymond J. Mooney s slides University of Texas at Austin 1 Example 2 Instance-Based Learning Unlike other learning algorithms, does not involve construction of an explicit
More informationAnalytical model A structure and process for analyzing a dataset. For example, a decision tree is a model for the classification of a dataset.
Glossary of data mining terms: Accuracy Accuracy is an important factor in assessing the success of data mining. When applied to data, accuracy refers to the rate of correct values in the data. When applied
More informationData Mining. Practical Machine Learning Tools and Techniques. Slides for Chapter 3 of Data Mining by I. H. Witten, E. Frank and M. A.
Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 3 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Output: Knowledge representation Tables Linear models Trees Rules
More information5 Learning hypothesis classes (16 points)
5 Learning hypothesis classes (16 points) Consider a classification problem with two real valued inputs. For each of the following algorithms, specify all of the separators below that it could have generated
More informationData Mining. Practical Machine Learning Tools and Techniques. Slides for Chapter 3 of Data Mining by I. H. Witten, E. Frank and M. A.
Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 3 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Input: Concepts, instances, attributes Terminology What s a concept?
More informationClustering in Data Mining
Clustering in Data Mining Classification Vs Clustering When the distribution is based on a single parameter and that parameter is known for each object, it is called classification. E.g. Children, young,
More informationUnsupervised Learning
Outline Unsupervised Learning Basic concepts K-means algorithm Representation of clusters Hierarchical clustering Distance functions Which clustering algorithm to use? NN Supervised learning vs. unsupervised
More informationINF4820. Clustering. Erik Velldal. Nov. 17, University of Oslo. Erik Velldal INF / 22
INF4820 Clustering Erik Velldal University of Oslo Nov. 17, 2009 Erik Velldal INF4820 1 / 22 Topics for Today More on unsupervised machine learning for data-driven categorization: clustering. The task
More informationMachine Learning and Data Mining. Clustering (1): Basics. Kalev Kask
Machine Learning and Data Mining Clustering (1): Basics Kalev Kask Unsupervised learning Supervised learning Predict target value ( y ) given features ( x ) Unsupervised learning Understand patterns of
More informationUnsupervised Learning and Clustering
Unsupervised Learning and Clustering Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2009 CS 551, Spring 2009 c 2009, Selim Aksoy (Bilkent University)
More informationKernels + K-Means Introduction to Machine Learning. Matt Gormley Lecture 29 April 25, 2018
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Kernels + K-Means Matt Gormley Lecture 29 April 25, 2018 1 Reminders Homework 8:
More informationMachine Learning Chapter 2. Input
Machine Learning Chapter 2. Input 2 Input: Concepts, instances, attributes Terminology What s a concept? Classification, association, clustering, numeric prediction What s in an example? Relations, flat
More informationWhat to come. There will be a few more topics we will cover on supervised learning
Summary so far Supervised learning learn to predict Continuous target regression; Categorical target classification Linear Regression Classification Discriminative models Perceptron (linear) Logistic regression
More informationCOMP33111: Tutorial and lab exercise 7
COMP33111: Tutorial and lab exercise 7 Guide answers for Part 1: Understanding clustering 1. Explain the main differences between classification and clustering. main differences should include being unsupervised
More informationThe exam is closed book, closed notes except your one-page (two-sided) cheat sheet.
CS 189 Spring 2015 Introduction to Machine Learning Final You have 2 hours 50 minutes for the exam. The exam is closed book, closed notes except your one-page (two-sided) cheat sheet. No calculators or
More informationCOSC 6397 Big Data Analytics. Fuzzy Clustering. Some slides based on a lecture by Prof. Shishir Shah. Edgar Gabriel Spring 2015.
COSC 6397 Big Data Analytics Fuzzy Clustering Some slides based on a lecture by Prof. Shishir Shah Edgar Gabriel Spring 215 Clustering Clustering is a technique for finding similarity groups in data, called
More informationUnsupervised Learning: Clustering
Unsupervised Learning: Clustering Vibhav Gogate The University of Texas at Dallas Slides adapted from Carlos Guestrin, Dan Klein & Luke Zettlemoyer Machine Learning Supervised Learning Unsupervised Learning
More informationGeometric data structures:
Geometric data structures: Machine Learning for Big Data CSE547/STAT548, University of Washington Sham Kakade Sham Kakade 2017 1 Announcements: HW3 posted Today: Review: LSH for Euclidean distance Other
More informationData Mining Part 5. Prediction
Data Mining Part 5. Prediction 5.4. Spring 2010 Instructor: Dr. Masoud Yaghini Outline Using IF-THEN Rules for Classification Rule Extraction from a Decision Tree 1R Algorithm Sequential Covering Algorithms
More informationLECTURE NOTES Professor Anita Wasilewska NEURAL NETWORKS
LECTURE NOTES Professor Anita Wasilewska NEURAL NETWORKS Neural Networks Classifier Introduction INPUT: classification data, i.e. it contains an classification (class) attribute. WE also say that the class
More informationArtificial Neural Networks (Feedforward Nets)
Artificial Neural Networks (Feedforward Nets) y w 03-1 w 13 y 1 w 23 y 2 w 01 w 21 w 22 w 02-1 w 11 w 12-1 x 1 x 2 6.034 - Spring 1 Single Perceptron Unit y w 0 w 1 w n w 2 w 3 x 0 =1 x 1 x 2 x 3... x
More informationData Mining and Analytics
Data Mining and Analytics Aik Choon Tan, Ph.D. Associate Professor of Bioinformatics Division of Medical Oncology Department of Medicine aikchoon.tan@ucdenver.edu 9/22/2017 http://tanlab.ucdenver.edu/labhomepage/teaching/bsbt6111/
More informationCase-Based Reasoning. CS 188: Artificial Intelligence Fall Nearest-Neighbor Classification. Parametric / Non-parametric.
CS 188: Artificial Intelligence Fall 2008 Lecture 25: Kernels and Clustering 12/2/2008 Dan Klein UC Berkeley Case-Based Reasoning Similarity for classification Case-based reasoning Predict an instance
More informationCS 188: Artificial Intelligence Fall 2008
CS 188: Artificial Intelligence Fall 2008 Lecture 25: Kernels and Clustering 12/2/2008 Dan Klein UC Berkeley 1 1 Case-Based Reasoning Similarity for classification Case-based reasoning Predict an instance
More informationData Mining Practical Machine Learning Tools and Techniques. Slides for Chapter 6 of Data Mining by I. H. Witten and E. Frank
Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 6 of Data Mining by I. H. Witten and E. Frank Implementation: Real machine learning schemes Decision trees Classification
More informationLecture on Modeling Tools for Clustering & Regression
Lecture on Modeling Tools for Clustering & Regression CS 590.21 Analysis and Modeling of Brain Networks Department of Computer Science University of Crete Data Clustering Overview Organizing data into
More informationData Mining. 3.3 Rule-Based Classification. Fall Instructor: Dr. Masoud Yaghini. Rule-Based Classification
Data Mining 3.3 Fall 2008 Instructor: Dr. Masoud Yaghini Outline Using IF-THEN Rules for Classification Rules With Exceptions Rule Extraction from a Decision Tree 1R Algorithm Sequential Covering Algorithms
More informationCluster Analysis. Prof. Thomas B. Fomby Department of Economics Southern Methodist University Dallas, TX April 2008 April 2010
Cluster Analysis Prof. Thomas B. Fomby Department of Economics Southern Methodist University Dallas, TX 7575 April 008 April 010 Cluster Analysis, sometimes called data segmentation or customer segmentation,
More informationNetwork Traffic Measurements and Analysis
DEIB - Politecnico di Milano Fall, 2017 Introduction Often, we have only a set of features x = x 1, x 2,, x n, but no associated response y. Therefore we are not interested in prediction nor classification,
More informationInstructor: Jessica Wu Harvey Mudd College
The Perceptron Instructor: Jessica Wu Harvey Mudd College The instructor gratefully acknowledges Andrew Ng (Stanford), Eric Eaton (UPenn), David Kauchak (Pomona), and the many others who made their course
More informationArtificial Intelligence. Programming Styles
Artificial Intelligence Intro to Machine Learning Programming Styles Standard CS: Explicitly program computer to do something Early AI: Derive a problem description (state) and use general algorithms to
More informationData Mining Practical Machine Learning Tools and Techniques
Input: Concepts, instances, attributes Data ining Practical achine Learning Tools and Techniques Slides for Chapter 2 of Data ining by I. H. Witten and E. rank Terminology What s a concept z Classification,
More informationINF4820, Algorithms for AI and NLP: Hierarchical Clustering
INF4820, Algorithms for AI and NLP: Hierarchical Clustering Erik Velldal University of Oslo Sept. 25, 2012 Agenda Topics we covered last week Evaluating classifiers Accuracy, precision, recall and F-score
More informationCOMPUTATIONAL INTELLIGENCE (INTRODUCTION TO MACHINE LEARNING) SS18. Lecture 2: Linear Regression Gradient Descent Non-linear basis functions
COMPUTATIONAL INTELLIGENCE (INTRODUCTION TO MACHINE LEARNING) SS18 Lecture 2: Linear Regression Gradient Descent Non-linear basis functions LINEAR REGRESSION MOTIVATION Why Linear Regression? Simplest
More informationThe exam is closed book, closed notes except your one-page cheat sheet.
CS 189 Fall 2015 Introduction to Machine Learning Final Please do not turn over the page before you are instructed to do so. You have 2 hours and 50 minutes. Please write your initials on the top-right
More informationData mining. Classification k-nn Classifier. Piotr Paszek. (Piotr Paszek) Data mining k-nn 1 / 20
Data mining Piotr Paszek Classification k-nn Classifier (Piotr Paszek) Data mining k-nn 1 / 20 Plan of the lecture 1 Lazy Learner 2 k-nearest Neighbor Classifier 1 Distance (metric) 2 How to Determine
More informationMidterm Examination CS540-2: Introduction to Artificial Intelligence
Midterm Examination CS540-2: Introduction to Artificial Intelligence March 15, 2018 LAST NAME: FIRST NAME: Problem Score Max Score 1 12 2 13 3 9 4 11 5 8 6 13 7 9 8 16 9 9 Total 100 Question 1. [12] Search
More informationBBS654 Data Mining. Pinar Duygulu. Slides are adapted from Nazli Ikizler
BBS654 Data Mining Pinar Duygulu Slides are adapted from Nazli Ikizler 1 Classification Classification systems: Supervised learning Make a rational prediction given evidence There are several methods for
More informationProblem 1: Complexity of Update Rules for Logistic Regression
Case Study 1: Estimating Click Probabilities Tackling an Unknown Number of Features with Sketching Machine Learning for Big Data CSE547/STAT548, University of Washington Emily Fox January 16 th, 2014 1
More informationLOGISTIC REGRESSION FOR MULTIPLE CLASSES
Peter Orbanz Applied Data Mining Not examinable. 111 LOGISTIC REGRESSION FOR MULTIPLE CLASSES Bernoulli and multinomial distributions The mulitnomial distribution of N draws from K categories with parameter
More informationUnsupervised Learning and Clustering
Unsupervised Learning and Clustering Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Spring 2008 CS 551, Spring 2008 c 2008, Selim Aksoy (Bilkent University)
More informationInstance-based Learning
Instance-based Learning Nearest Neighbor 1-nearest neighbor algorithm: Remember all your data points When prediction needed for a new point Find the nearest saved data point Return the answer associated
More informationStatistical Learning Part 2 Nonparametric Learning: The Main Ideas. R. Moeller Hamburg University of Technology
Statistical Learning Part 2 Nonparametric Learning: The Main Ideas R. Moeller Hamburg University of Technology Instance-Based Learning So far we saw statistical learning as parameter learning, i.e., given
More informationCOMP 551 Applied Machine Learning Lecture 13: Unsupervised learning
COMP 551 Applied Machine Learning Lecture 13: Unsupervised learning Associate Instructor: Herke van Hoof (herke.vanhoof@mail.mcgill.ca) Slides mostly by: (jpineau@cs.mcgill.ca) Class web page: www.cs.mcgill.ca/~jpineau/comp551
More informationMIT 801. Machine Learning I. [Presented by Anna Bosman] 16 February 2018
MIT 801 [Presented by Anna Bosman] 16 February 2018 Machine Learning What is machine learning? Artificial Intelligence? Yes as we know it. What is intelligence? The ability to acquire and apply knowledge
More informationMachine Learning Classifiers and Boosting
Machine Learning Classifiers and Boosting Reading Ch 18.6-18.12, 20.1-20.3.2 Outline Different types of learning problems Different types of learning algorithms Supervised learning Decision trees Naïve
More informationSlides for Data Mining by I. H. Witten and E. Frank
Slides for Data Mining by I. H. Witten and E. Frank 7 Engineering the input and output Attribute selection Scheme-independent, scheme-specific Attribute discretization Unsupervised, supervised, error-
More informationModule 1 Lecture Notes 2. Optimization Problem and Model Formulation
Optimization Methods: Introduction and Basic concepts 1 Module 1 Lecture Notes 2 Optimization Problem and Model Formulation Introduction In the previous lecture we studied the evolution of optimization
More informationCluster Analysis. Ying Shen, SSE, Tongji University
Cluster Analysis Ying Shen, SSE, Tongji University Cluster analysis Cluster analysis groups data objects based only on the attributes in the data. The main objective is that The objects within a group
More informationCOMP 465: Data Mining Still More on Clustering
3/4/015 Exercise COMP 465: Data Mining Still More on Clustering Slides Adapted From : Jiawei Han, Micheline Kamber & Jian Pei Data Mining: Concepts and Techniques, 3 rd ed. Describe each of the following
More informationCS 229 Midterm Review
CS 229 Midterm Review Course Staff Fall 2018 11/2/2018 Outline Today: SVMs Kernels Tree Ensembles EM Algorithm / Mixture Models [ Focus on building intuition, less so on solving specific problems. Ask
More informationIntroduction to ANSYS DesignXplorer
Lecture 4 14. 5 Release Introduction to ANSYS DesignXplorer 1 2013 ANSYS, Inc. September 27, 2013 s are functions of different nature where the output parameters are described in terms of the input parameters
More informationThe Explorer. chapter Getting started
chapter 10 The Explorer Weka s main graphical user interface, the Explorer, gives access to all its facilities using menu selection and form filling. It is illustrated in Figure 10.1. There are six different
More informationLearning. Learning agents Inductive learning. Neural Networks. Different Learning Scenarios Evaluation
Learning Learning agents Inductive learning Different Learning Scenarios Evaluation Slides based on Slides by Russell/Norvig, Ronald Williams, and Torsten Reil Material from Russell & Norvig, chapters
More informationClassification. Vladimir Curic. Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University
Classification Vladimir Curic Centre for Image Analysis Swedish University of Agricultural Sciences Uppsala University Outline An overview on classification Basics of classification How to choose appropriate
More information11/2/2017 MIST.6060 Business Intelligence and Data Mining 1. Clustering. Two widely used distance metrics to measure the distance between two records
11/2/2017 MIST.6060 Business Intelligence and Data Mining 1 An Example Clustering X 2 X 1 Objective of Clustering The objective of clustering is to group the data into clusters such that the records within
More informationData Mining. Neural Networks
Data Mining Neural Networks Goals for this Unit Basic understanding of Neural Networks and how they work Ability to use Neural Networks to solve real problems Understand when neural networks may be most
More informationDATA MINING LECTURE 10B. Classification k-nearest neighbor classifier Naïve Bayes Logistic Regression Support Vector Machines
DATA MINING LECTURE 10B Classification k-nearest neighbor classifier Naïve Bayes Logistic Regression Support Vector Machines NEAREST NEIGHBOR CLASSIFICATION 10 10 Illustrating Classification Task Tid Attrib1
More informationMachine Learning. Topic 5: Linear Discriminants. Bryan Pardo, EECS 349 Machine Learning, 2013
Machine Learning Topic 5: Linear Discriminants Bryan Pardo, EECS 349 Machine Learning, 2013 Thanks to Mark Cartwright for his extensive contributions to these slides Thanks to Alpaydin, Bishop, and Duda/Hart/Stork
More informationCS513-Data Mining. Lecture 2: Understanding the Data. Waheed Noor
CS513-Data Mining Lecture 2: Understanding the Data Waheed Noor Computer Science and Information Technology, University of Balochistan, Quetta, Pakistan Waheed Noor (CS&IT, UoB, Quetta) CS513-Data Mining
More informationGene Clustering & Classification
BINF, Introduction to Computational Biology Gene Clustering & Classification Young-Rae Cho Associate Professor Department of Computer Science Baylor University Overview Introduction to Gene Clustering
More informationData Mining Practical Machine Learning Tools and Techniques
Output: Knowledge representation Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter of Data Mining by I. H. Witten and E. Frank Decision tables Decision trees Decision rules
More informationCHAPTER 4: CLUSTER ANALYSIS
CHAPTER 4: CLUSTER ANALYSIS WHAT IS CLUSTER ANALYSIS? A cluster is a collection of data-objects similar to one another within the same group & dissimilar to the objects in other groups. Cluster analysis
More informationCLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS
CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS CHAPTER 4 CLASSIFICATION WITH RADIAL BASIS AND PROBABILISTIC NEURAL NETWORKS 4.1 Introduction Optical character recognition is one of
More informationLinear Regression and K-Nearest Neighbors 3/28/18
Linear Regression and K-Nearest Neighbors 3/28/18 Linear Regression Hypothesis Space Supervised learning For every input in the data set, we know the output Regression Outputs are continuous A number,
More information