Data Mining in Bioinformatics Day 1: Classification
1 Data Mining in Bioinformatics, Day 1: Classification

Karsten Borgwardt
February 18 to March 1, 2013
Machine Learning & Computational Biology Research Group
Max Planck Institute Tübingen and Eberhard Karls Universität Tübingen
2 Our course

Schedule
- February 18 to March 1
- Lecture from 10:00 to 12:30; tutorials are project-based
- Oral exam on March 5 to get the certificate

Structure
- 1 week of algorithmics, 1 week of bioinformatics applications
- Key topics: classification, clustering, feature selection, text & graph mining
- The lecture provides an introduction to each topic plus a discussion of important papers
3 What is data mining?

Data Mining
- Extracting knowledge from large amounts of data (Han and Kamber, 2006)
- Often used as a synonym for Knowledge Discovery, although other definitions disagree

Knowledge Discovery comprises the steps
- Data cleaning
- Data integration
- Data selection
- Data transformation
- Data mining
- Pattern evaluation
- Knowledge presentation
4 What is classification?

Problem
- Given an object, which class of objects does it belong to?
- Given object $x$, predict its class label $y$.

Examples
- Computer vision: Is this object a chair?
- Credit cards: Is this customer to be trusted?
- Marketing: Will this customer buy/like our product?
- Function prediction: Is this protein an enzyme?
- Gene finding: Does this sequence contain a splice site?
5 What is classification?

Setting
- Classification is usually performed in a supervised setting: we are given a training dataset.
- A training dataset is a dataset of pairs $(x_i, y_i)$, that is, objects and their known class labels.
- The test set is a dataset of test points $x$ with unknown class labels.
- The task is to predict the class label $y$ of $x$.

Role of y
- If $y \in \{0, 1\}$, we are dealing with a binary classification problem.
- If $y \in \{1, \ldots, n\}$ ($n \in \mathbb{N}$), a multiclass classification problem.
- If $y \in \mathbb{R}$, a regression problem.
6 Classifiers in a nutshell

Nearest Neighbour
- Key idea: if $x_1$ is most similar to $x_2$, then $y_1 = y_2$
- Classification by looking at the nearest neighbour

Naive Bayes
- A simple probabilistic classifier based on applying Bayes' theorem with strong (naive) independence assumptions

Decision trees
- A series of decisions has to be taken to classify an object, based on its attributes
- The hierarchy of these decisions is ordered as a tree, a decision tree
7 Classifiers in a nutshell

Support Vector Machine
- Key idea: draw a line (plane, hyperplane) that separates two classes of data
- Maximise the distance between the hyperplane and the points closest to it (the margin)
- A test point is predicted to belong to the class whose halfspace it lies in

Criteria for a good classifier
- Accuracy
- Runtime and scalability
- Interpretability
- Flexibility
8 Nearest Neighbour

The actual classification
- Given $x_i$, we predict its label $y_i$ via its nearest neighbour:

  $x_j = \arg\min_{x \in D} \|x - x_i\|^2, \qquad y_i = y_j$   (1)

- $x_i$'s predicted label is that of the point closest to it, that is, its nearest neighbour (a minimal code sketch follows below).

Runtime
- Naively, one has to compute the distance to all $N$ points in the dataset: $O(N)$ for one point, $O(N^2)$ for the entire dataset.
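A minimal sketch of the naive rule in Eq. (1), assuming NumPy arrays for the training points and labels; the function name `predict_1nn` is illustrative, not from the slides:

```python
import numpy as np

def predict_1nn(X_train, y_train, x):
    """Predict the label of x as the label of its nearest training point.

    Naive O(N) scan: compute the squared Euclidean distance to every
    training point and take the argmin, as in Eq. (1).
    """
    dists = np.sum((X_train - x) ** 2, axis=1)  # squared distances to all N points
    j = np.argmin(dists)                        # index of the nearest neighbour
    return y_train[j]

# Toy usage: two classes in 2D
X_train = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])
print(predict_1nn(X_train, y_train, np.array([0.8, 0.9])))  # -> 1
```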
9 Nearest Neighbour

How to speed NN up
- Exploit the triangle inequality:

  $d(x_1, x_2) + d(x_2, x_3) \geq d(x_1, x_3)$   (2)

- This holds for any metric $d$.

Metric
A distance function $d$ is a metric iff
1. $d(x_1, x_2) \geq 0$
2. $d(x_1, x_2) = 0$ if and only if $x_1 = x_2$
3. $d(x_1, x_2) = d(x_2, x_1)$
4. $d(x_1, x_3) \leq d(x_1, x_2) + d(x_2, x_3)$
10 Nearest Neighbour

How to speed NN up
- Rewrite the triangle inequality:

  $d(x_1, x_2) \geq d(x_1, x_3) - d(x_2, x_3)$   (3)

- That means if you know $d(x_1, x_3)$ and $d(x_2, x_3)$, you can provide a lower bound on $d(x_1, x_2)$.
- If you already know a point that is closer to $x_1$ than $d(x_1, x_3) - d(x_2, x_3)$, then you can avoid computing $d(x_1, x_2)$ altogether, because $x_2$ cannot be the nearest neighbour.
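One way to sketch this pruning, assuming a single fixed pivot point $x_3$ whose distances to all training points have been precomputed (the slides do not prescribe a concrete index structure):

```python
import numpy as np

def nn_with_pivot(X_train, q, pivot):
    """1-NN search that skips points via the triangle-inequality bound
    d(q, x) >= |d(q, pivot) - d(x, pivot)| from Eq. (3)."""
    d_qp = np.linalg.norm(q - pivot)
    d_xp = np.linalg.norm(X_train - pivot, axis=1)  # precomputable offline
    best_j, best_d = -1, np.inf
    for j, x in enumerate(X_train):
        if abs(d_qp - d_xp[j]) >= best_d:
            continue  # lower bound already beats nothing: skip the exact distance
        d = np.linalg.norm(q - x)
        if d < best_d:
            best_j, best_d = j, d
    return best_j, best_d
```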
11 Naive Bayes

Bayes' rule

  $P(C \mid x) = \frac{P(x \mid C)\,P(C)}{P(x)}$   (4)

Naive Bayes classification
- Classify $x$ into one of $m$ classes $C_1, \ldots, C_m$:

  $\arg\max_{C_i} P(C_i \mid x) = \arg\max_{C_i} \frac{P(x \mid C_i)\,P(C_i)}{P(x)}$   (5)
12 Naive Bayes

Three simplifications
- $P(x)$ is the same for all classes; ignore this term.
- We further assume that $P(C_i)$ is constant for all classes $1 \leq i \leq m$; ignore this term as well. That means

  $P(C_i \mid x) \propto P(x \mid C_i)$   (6)

- If $x$ is multidimensional, that is, if $x$ contains $n$ features $x = (x_1, \ldots, x_n)$, we further assume that

  $P(x \mid C_i) = \prod_{j=1}^{n} P(x_j \mid C_i)$   (7)
13 Naive Bayes

The actual classification
- The actual classification is performed by computing

  $P(C_i \mid x) \propto \prod_{j=1}^{n} P(x_j \mid C_i)$   (8)

- The three simplifications are that
  - all classes have the same marginal probability,
  - all data points have the same marginal probability,
  - all features of an object are independent of each other.
- Alternative name: Simple Bayes Classifier

Runtime
- $O(Nmn)$, where $N$ is the number of data points, $m$ the number of classes, and $n$ the number of features.
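A minimal sketch for categorical features, following the slide's simplifications (class priors dropped, features treated as independent); the frequency estimates with add-one smoothing and the log-space product are assumptions of this sketch, not part of the lecture:

```python
import numpy as np

def naive_bayes_predict(X_train, y_train, x):
    """Pick argmax_i prod_j P(x_j | C_i), Eq. (8), for a categorical test point x,
    estimating P(x_j = v | C_i) by smoothed relative frequency within class C_i."""
    classes = np.unique(y_train)
    best_c, best_logp = None, -np.inf
    for c in classes:
        Xc = X_train[y_train == c]           # training points of class c
        logp = 0.0
        for j, v in enumerate(x):            # naive assumption: independence given the class
            # add-one smoothing; denominator +2 assumes roughly binary feature values
            logp += np.log((np.sum(Xc[:, j] == v) + 1) / (len(Xc) + 2))
        if logp > best_logp:                 # work in log space to avoid underflow
            best_c, best_logp = c, logp
    return best_c

X_train = np.array([["sunny", "hot"], ["sunny", "mild"], ["rain", "mild"], ["rain", "cool"]])
y_train = np.array([0, 0, 1, 1])
print(naive_bayes_predict(X_train, y_train, np.array(["rain", "mild"])))  # -> 1
```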
14 Decision Tree

Key idea
- Recursively split the data space into regions that contain a single class only.

(Figure: a two-dimensional data space, with axes x and y, recursively split into class-pure regions)
15 Decision Tree

Concept
A decision tree is a flowchart-like tree structure with
- a root: the uppermost node
- internal nodes: these represent tests on an attribute
- branches: these represent outcomes of a test
- leaf nodes: these hold a class label
16 Decision Tree

Classification
Given a test point x (a sketch of this traversal follows below):
1. Perform the test on the attributes of x at the root.
2. Follow the branch that corresponds to the outcome of this test.
3. Repeat this procedure until you reach a leaf node.
4. Predict the label of x to be the label of that leaf node.
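The traversal maps directly onto a recursive node structure. A minimal sketch, assuming binary threshold tests (the slides allow arbitrary attribute tests):

```python
class Node:
    def __init__(self, label=None, feature=None, threshold=None, left=None, right=None):
        self.label = label          # set on leaf nodes only
        self.feature = feature      # index of the attribute tested at this node
        self.threshold = threshold  # test: x[feature] <= threshold ?
        self.left, self.right = left, right

def predict(node, x):
    """Follow the branch matching each test outcome until a leaf is reached."""
    while node.label is None:       # internal node: perform the test
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label               # leaf: predict its class label

# Toy tree: "x[0] <= 0.5 -> class 0, else class 1"
tree = Node(feature=0, threshold=0.5, left=Node(label=0), right=Node(label=1))
print(predict(tree, [0.7]))  # -> 1
```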
17 Decision Tree

Popularity
- requires no domain knowledge
- easy to interpret
- construction and prediction are fast

But how do we construct a decision tree?
18 Decision Tree

Construction
- requires a splitting criterion at each internal node
- this splitting criterion tells us which attribute to test at node v
- we would like to use the attribute that best separates the classes on the training dataset
19 Decision Tree

Information gain
- ID3 uses information gain as its attribute selection measure.
- The information content is defined as

  $\text{Info}(D) = -\sum_{i=1}^{m} p(C_i) \log_2(p(C_i))$,   (9)

  where $p(C_i)$ is the probability that an arbitrary tuple in $D$ belongs to class $C_i$, estimated by $|C_{i,D}| / |D|$.
- This is also known as the Shannon entropy of $D$.
20 Decision Tree

Information gain
- Assume that attribute $A$ was used to split $D$ into $v$ partitions or subsets $\{D_1, D_2, \ldots, D_v\}$, where $D_j$ contains those tuples in $D$ that have outcome $a_j$ of $A$.
- Ideally, the $D_j$ would provide a perfect classification, but they seldom do.
- How much more information do we need to arrive at an exact classification? This is quantified by

  $\text{Info}_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|} \text{Info}(D_j)$.   (10)
21 Decision Tree

Information gain
- The information gain is the loss of entropy (increase in information) caused by splitting with respect to attribute $A$:

  $\text{Gain}(A) = \text{Info}(D) - \text{Info}_A(D)$   (11)

- We pick the attribute $A$ that maximises this gain.
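A minimal sketch of Eqs. (9)-(11) for a categorical attribute, computing entropies from label counts (function names are illustrative):

```python
import numpy as np

def info(labels):
    """Shannon entropy Info(D) of a label array, Eq. (9)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log2(p))

def info_gain(attribute, labels):
    """Gain(A) = Info(D) - Info_A(D), Eqs. (10) and (11)."""
    n = len(labels)
    info_a = sum((labels[attribute == a].size / n) * info(labels[attribute == a])
                 for a in np.unique(attribute))
    return info(labels) - info_a

attribute = np.array(["sunny", "sunny", "rain", "rain"])
labels = np.array([0, 0, 1, 1])
print(info_gain(attribute, labels))  # 1.0: this attribute separates the classes perfectly
```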
22 Decision Tree

Gain ratio
- The information gain is biased towards attributes with a large number of values.
- For example, an ID attribute maximises the information gain!
- Hence C4.5 uses an extension of information gain: the gain ratio.
- The gain ratio is based on the split information

  $\text{SplitInfo}_A(D) = -\sum_{j=1}^{v} \frac{|D_j|}{|D|} \log_2\left(\frac{|D_j|}{|D|}\right)$   (12)

  and is defined as

  $\text{GainRatio}(A) = \frac{\text{Gain}(A)}{\text{SplitInfo}_A(D)}$   (13)
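Extending the previous sketch with Eqs. (12) and (13), reusing `info_gain` and `numpy` from above:

```python
def gain_ratio(attribute, labels):
    """GainRatio(A) = Gain(A) / SplitInfo_A(D), Eqs. (12) and (13)."""
    n = len(labels)
    fractions = np.array([np.sum(attribute == a) / n for a in np.unique(attribute)])
    split_info = -np.sum(fractions * np.log2(fractions))   # Eq. (12)
    return info_gain(attribute, labels) / split_info       # unstable as split_info -> 0
```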
23 Decision Tree

Gain ratio
- The attribute with maximum gain ratio is selected as the splitting attribute.
- The ratio becomes unstable as the split information approaches zero.
- A constraint is therefore added to ensure that the information gain of the selected test is at least as great as the average gain over all tests examined.
24 Decision Tree

Gini index
- Attribute selection measure in the CART system.
- The Gini index measures class impurity as

  $\text{Gini}(D) = 1 - \sum_{i=1}^{m} p_i^2$   (14)

- If we split via attribute $A$ into partitions $\{D_1, D_2, \ldots, D_v\}$, the Gini index of this partitioning is defined as

  $\text{Gini}_A(D) = \sum_{j=1}^{v} \frac{|D_j|}{|D|} \text{Gini}(D_j)$   (15)

  and the reduction in impurity by a split on $A$ is

  $\Delta\text{Gini}(A) = \text{Gini}(D) - \text{Gini}_A(D)$   (16)
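The analogous sketch for Eqs. (14)-(16):

```python
import numpy as np

def gini(labels):
    """Gini(D) = 1 - sum_i p_i^2, Eq. (14)."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def gini_reduction(attribute, labels):
    """Delta Gini(A) = Gini(D) - Gini_A(D), Eqs. (15) and (16)."""
    n = len(labels)
    gini_a = sum((labels[attribute == a].size / n) * gini(labels[attribute == a])
                 for a in np.unique(attribute))
    return gini(labels) - gini_a
```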
25 Support Vector Machines

Hyperplane classifiers
- Vapnik et al. define a family of classifiers for binary classification problems: the class of hyperplanes in some dot product space $\mathcal{H}$,

  $\langle w, x \rangle + b = 0$,   (17)

  where $w \in \mathcal{H}$, $b \in \mathbb{R}$.
- These correspond to decision functions ("classifiers"):

  $f(x) = \operatorname{sgn}(\langle w, x \rangle + b)$   (18)

- Vapnik et al. proposed a learning algorithm for determining this $f$ from the training dataset.
26 Support Vector Machines

The optimal hyperplane
- The optimal hyperplane maximises the margin of separation between any training point and the hyperplane:

  $\max_{w \in \mathcal{H},\, b \in \mathbb{R}} \min \{ \|x - x_i\| \mid x \in \mathcal{H},\ \langle w, x \rangle + b = 0,\ i = 1, \ldots, m \}$   (19)
27 Support Vector Machines

Optimisation problem

  $\min_{w \in \mathcal{H},\, b \in \mathbb{R}} \ \tau(w) = \frac{1}{2}\|w\|^2$   (20)

  subject to $y_i(\langle w, x_i \rangle + b) \geq 1$ for all $i = 1, \ldots, m$.

- Why minimise $\frac{1}{2}\|w\|^2$? The size of the margin is $\frac{2}{\|w\|}$: the smaller $\|w\|$, the larger the margin.
- Why do we have to obey the constraints $y_i(\langle w, x_i \rangle + b) \geq 1$? They ensure that all training data points of the same class are on the same side of the hyperplane and outside the margin.
28 Support Vector Machines

The Lagrangian
- We form the Lagrangian:

  $L(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{m} \alpha_i \big( y_i(\langle x_i, w \rangle + b) - 1 \big)$   (21)

- The Lagrangian is minimised with respect to the primal variables $w$ and $b$, and maximised with respect to the dual variables $\alpha_i$.
29 Support Vector Machines

Support Vectors
- At optimality,

  $\frac{\partial}{\partial b} L(w, b, \alpha) = 0 \quad \text{and} \quad \frac{\partial}{\partial w} L(w, b, \alpha) = 0$,   (22)

  such that

  $\sum_{i=1}^{m} \alpha_i y_i = 0 \quad \text{and} \quad w = \sum_{i=1}^{m} \alpha_i y_i x_i$.   (23)

- Hence the solution vector $w$, the crucial parameter of the SVM classifier, has an expansion in terms of the training points and their labels.
- Those training points with $\alpha_i > 0$ are the Support Vectors.
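In practice the resulting quadratic program is handed to a library solver. A sketch using scikit-learn's `SVC` (an assumption of this sketch; the lecture does not prescribe a toolkit), where a large `C` approximates the hard-margin problem above:

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[0.0, 0.0], [0.3, 0.2], [1.0, 1.0], [1.2, 0.9]])
y = np.array([-1, -1, 1, 1])

clf = SVC(kernel="linear", C=1e6)  # large C: (almost) hard margin
clf.fit(X, y)

print(clf.support_vectors_)        # the training points with alpha_i > 0
print(clf.predict([[0.9, 0.8]]))   # sign of <w, x> + b, as in Eq. (18)
```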
30 Support Vector Machines

The dual problem
- Plugging (23) into the Lagrangian (21), we obtain the dual optimisation problem that is solved in practice:

  $\max_{\alpha \in \mathbb{R}^m} \ W(\alpha) = \sum_{i=1}^{m} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{m} \alpha_i \alpha_j y_i y_j \langle x_i, x_j \rangle$   (24)

  (subject to $\alpha_i \geq 0$ and the constraint $\sum_{i=1}^{m} \alpha_i y_i = 0$ from (23)).

The kernel trick
- The key insight is that (24) accesses the training data only in terms of inner products $\langle x_i, x_j \rangle$.
- We can plug in an inner product of our choice here! This is referred to as a kernel $k$:

  $k(x_i, x_j) = \langle x_i, x_j \rangle$   (25)
31 Support Vector Machines

Some prominent kernels
- Linear kernel:

  $k(x_i, x_j) = \sum_{l=1}^{n} x_{il} x_{jl} = x_i^\top x_j$   (26)

- Polynomial kernel:

  $k(x_i, x_j) = (x_i^\top x_j + c)^d$,   (27)

  where $c, d \in \mathbb{R}$.
- Gaussian RBF kernel:

  $k(x_i, x_j) = \exp\left( -\frac{1}{2\sigma^2} \|x_i - x_j\|^2 \right)$,   (28)

  where $\sigma \in \mathbb{R}$.
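A minimal sketch of kernels (26)-(28); any of them can replace the plain inner product in the dual (24), for example when assembling the Gram matrix:

```python
import numpy as np

def linear_kernel(xi, xj):
    return np.dot(xi, xj)                                     # Eq. (26)

def polynomial_kernel(xi, xj, c=1.0, d=3):
    return (np.dot(xi, xj) + c) ** d                          # Eq. (27)

def gaussian_rbf_kernel(xi, xj, sigma=1.0):
    return np.exp(-np.sum((xi - xj) ** 2) / (2 * sigma**2))   # Eq. (28)

# Gram matrix K with K[i, j] = k(x_i, x_j): all the dual problem needs to see
X = np.random.rand(5, 3)
K = np.array([[gaussian_rbf_kernel(xi, xj, sigma=0.5) for xj in X] for xi in X])
```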
32 References and further reading

References
[1] B. Schölkopf and A. Smola. Learning with Kernels. MIT Press, 2002.
[2] J. Han and M. Kamber. Data Mining: Concepts and Techniques. Elsevier/Morgan Kaufmann, 2006.
33 The end

See you tomorrow! Next topic: Clustering