Instance and case-based reasoning

ML for NLP
Lecturer: Kevin Koidl
Assist. Lecturer: Alfredo Maldonado

Instance-based Learning

Instance-based learning approximates real-valued or discrete-valued target functions.
In k-nearest neighbour, for example, new instances are related to similar instances stored in memory.
Key difference: a different approximation is built for each query instance. Only local approximations are constructed, not one for the entire instance space.
Instance-based learning can use complex, symbolic representations. Typical applications are help desks, reasoning about legal cases, complex scheduling...

Instance-based Learning - disadvantages

High classification cost: all stored instances must be considered for comparison with the new one (specifically in k-NN).
Indexing approaches therefore become very important.

The importance of being lazy

Lazy learning: generalising beyond the training examples is postponed until a new instance must be classified.
Instead of estimating the target function once for the whole instance space, estimate it locally and differently for each new instance.
A family of related techniques:
k-nearest neighbour
Locally weighted regression
Radial basis functions
Case-based reasoning
Lazy vs. eager learning

Instance-Based Learning

k-nearest neighbours doesn't learn an explicit mapping $f$ from the training data. Two classification approaches:

Nearest neighbour: given a query instance $x_q$, first locate the nearest training example $x_n$, then estimate

$$\hat{f}(x_q) \leftarrow f(x_n)$$

k-nearest neighbour: given $x_q$, take a vote among its $k$ nearest neighbours if the target function is discrete-valued:

$$\hat{f}(x_q) \leftarrow \arg\max_{v \in V} \sum_{i=1}^{k} \delta(v, f(x_i))$$

Take the mean of the $f$ values of the $k$ nearest neighbours if the target function is real-valued:

$$\hat{f}(x_q) \leftarrow \frac{\sum_{i=1}^{k} f(x_i)}{k}$$

Representation

All instances correspond to points in the n-dimensional space $\mathbb{R}^n$.
As before, an instance $x$ is described by a feature vector:

$$\langle a_1(x), a_2(x), \ldots, a_n(x) \rangle$$

Nearest neighbours can be defined in terms of the standard Euclidean distance (but other measures are possible):

$$d(x_i, x_j) = \sqrt{\sum_{r=1}^{n} \left(a_r(x_i) - a_r(x_j)\right)^2}$$

The k-nearest neighbour algorithm

Consider learning a discrete-valued function with signature $f : \mathbb{R}^n \to V$, for a finite $V = \{v_1, \ldots, v_n\}$.

Training algorithm:
For each training example $\langle x, f(x) \rangle$, add the example to tlist.

Classification algorithm:
Input: $x_q$, a query instance to be classified.
Let $x_1, \ldots, x_k$ be the $k$ instances in tlist nearest to $x_q$.
Return

$$\hat{f}(x_q) \leftarrow \arg\max_{v \in V} \sum_{i=1}^{k} \delta(v, f(x_i))$$

where $\delta(a, b) = 1$ if $a = b$, and $0$ otherwise (the Kronecker delta).
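The training and classification steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`euclidean`, `k_nearest`, `knn_classify`, `knn_regress`) and the toy data are made up for the example, ties in the vote are broken by whichever label was seen first, and the linear scan over `tlist` is exactly the high classification cost the notes warn about.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Standard Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((ar - br) ** 2 for ar, br in zip(a, b)))

def k_nearest(tlist, x_q, k):
    """Return the k stored (x, f(x)) pairs closest to the query x_q."""
    return sorted(tlist, key=lambda ex: euclidean(ex[0], x_q))[:k]

def knn_classify(tlist, x_q, k):
    """Discrete-valued target: majority vote among the k nearest neighbours."""
    votes = Counter(label for _, label in k_nearest(tlist, x_q, k))
    return votes.most_common(1)[0][0]

def knn_regress(tlist, x_q, k):
    """Real-valued target: mean of f over the k nearest neighbours."""
    neighbours = k_nearest(tlist, x_q, k)
    return sum(value for _, value in neighbours) / len(neighbours)

# "Training" only stores the examples (hypothetical toy data).
tlist = [((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.9, 1.1), "a"),
         ((4.0, 4.0), "b"), ((4.2, 3.9), "b")]

print(knn_classify(tlist, (1.1, 1.0), k=3))  # prints "a"
```

All the work happens at query time, which is what makes the method lazy: nothing is generalised until `knn_classify` or `knn_regress` is called.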

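Decision Surfaces

[Figure: a query instance $x_q$ among the training examples]

How would 1-nearest neighbour classify $x_q$?
What would a 5-nearest neighbour algorithm do?
What does the decision surface (for the 1-NN classifier) look like?

Voronoi Diagrams

For 1-NN the decision surface is made up of boundaries of the Voronoi cells of the training examples: each cell contains the points that are closer to its training example than to any other.

Distance-Weighted k-NN

One might want to weight nearer neighbours more heavily...
For the discrete case:

$$\hat{f}(x_q) \leftarrow \arg\max_{v \in V} \sum_{i=1}^{k} w_i \, \delta(v, f(x_i)) \qquad (1)$$

where

$$w_i \equiv \frac{1}{d(x_q, x_i)^2}$$

and $d(x_q, x_i)$ is the distance between $x_q$ and $x_i$. If $d(x_q, x_i) = 0$, assign $\hat{f}(x_q) = f(x_i)$ by definition.

For real-valued target functions:

$$\hat{f}(x_q) \leftarrow \frac{\sum_{i=1}^{k} w_i f(x_i)}{\sum_{i=1}^{k} w_i} \qquad (2)$$

With distance weighting we could now use all training examples instead of just the $k$ nearest. Using only the nearest examples is a local method; using all of them is a global method (with rule (2) this is known as Shepard's method).

Locally Weighted Regression

k-NN forms a local approximation to $f$ for each query point $x_q$.
So why not form an explicit approximation $\hat{f}(x)$ for the region surrounding $x_q$?
Ways in which this could be done:
Fit a linear function to the $k$ nearest neighbours
Fit a quadratic, ...
This produces a piecewise approximation to $f$.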
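As an illustration of the distance-weighted rules given earlier in this section, here is a minimal Python sketch. The function name `weighted_knn`, its `k=None` convention for "use every stored example" (the global variant), and the toy data are assumptions made for the example, not part of the lecture.

```python
import math
from collections import defaultdict

def euclidean(a, b):
    """Standard Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((ar - br) ** 2 for ar, br in zip(a, b)))

def weighted_knn(tlist, x_q, k=None, discrete=True):
    """Distance-weighted k-NN with w_i = 1 / d(x_q, x_i)^2.

    k=None uses every stored example (the global variant).
    """
    ranked = sorted(((euclidean(x, x_q), fx) for x, fx in tlist),
                    key=lambda pair: pair[0])
    neighbours = ranked if k is None else ranked[:k]
    # Exact match: d = 0 would divide by zero, so return f(x_i) directly.
    if neighbours[0][0] == 0.0:
        return neighbours[0][1]
    weighted = [(1.0 / d ** 2, fx) for d, fx in neighbours]
    if discrete:
        totals = defaultdict(float)
        for w, v in weighted:
            totals[v] += w          # sum of w_i * delta(v, f(x_i)) per class v
        return max(totals, key=totals.get)
    # Real-valued target: weighted mean of the neighbours' f values.
    return sum(w * fx for w, fx in weighted) / sum(w for w, _ in weighted)

# Hypothetical data: an unweighted 3-NN vote would pick "b" (two votes to one),
# but the single very close "a" dominates once votes are distance-weighted.
examples = [((0.0,), "a"), ((3.0,), "b"), ((4.0,), "b")]
print(weighted_knn(examples, (1.0,), k=3))  # prints "a"
```

Here the weights are 1 for the "a" example and 1/4 + 1/9 for the two "b" examples, so the nearer neighbour wins the vote.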

(N.B.: Locally Weighted Regression:
Local: based only on data near $x_q$
Weighted: the contribution of each instance is weighted by its distance to $x_q$
Regression: approximates real-valued functions)

A global approximation

Consider approximating $f$ near $x_q$ by a linear function

$$\hat{f}(x) = w_0 + w_1 a_1(x) + \cdots + w_n a_n(x)$$

One could use gradient descent to find the coefficients that minimise the error in fitting $\hat{f}$ to the training set $D$:

$$E \equiv \frac{1}{2} \sum_{x \in D} \left(f(x) - \hat{f}(x)\right)^2$$

The gradient descent training rule:

$$\Delta w_j = \eta \sum_{x \in D} \left(f(x) - \hat{f}(x)\right) a_j(x)$$

(Recall the LMS algorithm from Lecture 3.)

Other ways of minimising error

Gradient descent isn't the only way to find the coefficients for, say,

$$\hat{f}(x) = w_0 + w_1 a_1(x) + \cdots + w_n a_n(x)$$

One could also use a variety of search methods such as simulated annealing, genetic algorithms, etc.
But first... the global approximation given by gradient descent (or GA, etc.) needs to be adapted so that it fits only the region around $x_q$...

Lazy and Eager Learning

Lazy: wait for a query before generalising
k-nearest neighbour, case-based reasoning
Eager: generalise before seeing the query
Radial basis function (RBF) networks, ID3, Backpropagation, Naive Bayes, ...

Does it matter?
An eager learner must create a single global approximation.
A lazy learner can create many local approximations.
If both use the same hypothesis space H, the lazy learner can represent more complex functions (e.g., consider H = linear functions).
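Returning to locally weighted regression: the notes leave open how the global error is adapted to a query. One common choice, assumed here purely for illustration, is to sum the error over the $k$ nearest neighbours of $x_q$ only and to scale each term by a kernel $K$ that decreases with distance. The Gaussian kernel, the function name `locally_weighted_predict`, and the default values of `eta`, `steps` and `sigma` are all assumptions of this sketch and would need tuning on real data.

```python
import math

def euclidean(a, b):
    """Standard Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((ar - br) ** 2 for ar, br in zip(a, b)))

def locally_weighted_predict(D, x_q, k=3, eta=0.05, steps=3000, sigma=1.0):
    """Fit f_hat(x) = w0 + w1*a1(x) + ... + wn*an(x) near x_q; return f_hat(x_q)."""
    # Local: keep only the k training examples nearest to the query.
    nearest = sorted(D, key=lambda ex: euclidean(ex[0], x_q))[:k]
    # Weighted: Gaussian kernel of the distance to x_q (an assumed choice of K).
    kernel = [math.exp(-euclidean(x, x_q) ** 2 / (2 * sigma ** 2))
              for x, _ in nearest]
    w = [0.0] * (len(x_q) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for (x, fx), K in zip(nearest, kernel):
            a = (1.0,) + tuple(x)        # a_0(x) = 1 carries the intercept w0
            err = fx - sum(wj * aj for wj, aj in zip(w, a))
            for j, aj in enumerate(a):
                grad[j] += K * err * aj  # kernel-weighted (f - f_hat) * a_j
        # Gradient descent step: w_j <- w_j + eta * sum(K * (f - f_hat) * a_j)
        w = [wj + eta * gj for wj, gj in zip(w, grad)]
    return sum(wj * aj for wj, aj in zip(w, (1.0,) + tuple(x_q)))

# Hypothetical data drawn from f(x) = 2x + 1, so a local linear fit is exact.
D = [((float(x),), 2.0 * x + 1.0) for x in range(5)]
print(locally_weighted_predict(D, (2.0,)))
```

A fresh set of coefficients is fitted for every query and then discarded, which is why this counts as a lazy method even though each individual fit looks like ordinary linear regression.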

Presentation based on (Mitchell, 1997, ch. 5).

References

Mitchell, T. M. (1997). Machine Learning. McGraw-Hill.
Sycara, K., Navin Chandra, D., Guttal, R., Koning, J., and Narasimhan, S. (1992). CADET: a case-based synthesis tool for engineering design. International Journal for Expert Systems, 4(2).
Yang, Y. (1994). Expert network: effective and efficient learning from human decisions in text categorisation and retrieval. In Proceedings of SIGIR-94, 17th ACM International Conference on Research and Development in Information Retrieval, pages 13-22, Dublin, Ireland. ACM Press.


Lecture 6 Classification and Prediction Lecture 6 Classification and Prediction (Part B) Zhou Shuigeng April 9, 2006 2006-4-16 Data Mining: Tech. & Appl. 1 Outline Instance-Based Method Classification based on concepts from association rule

More information

Using Machine Learning to Identify Security Issues in Open-Source Libraries. Asankhaya Sharma Yaqin Zhou SourceClear

Using Machine Learning to Identify Security Issues in Open-Source Libraries. Asankhaya Sharma Yaqin Zhou SourceClear Using Machine Learning to Identify Security Issues in Open-Source Libraries Asankhaya Sharma Yaqin Zhou SourceClear Outline - Overview of problem space Unidentified security issues How Machine Learning

More information

Kernels + K-Means Introduction to Machine Learning. Matt Gormley Lecture 29 April 25, 2018

Kernels + K-Means Introduction to Machine Learning. Matt Gormley Lecture 29 April 25, 2018 10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Kernels + K-Means Matt Gormley Lecture 29 April 25, 2018 1 Reminders Homework 8:

More information

K-Nearest Neighbors. Jia-Bin Huang. Virginia Tech Spring 2019 ECE-5424G / CS-5824

K-Nearest Neighbors. Jia-Bin Huang. Virginia Tech Spring 2019 ECE-5424G / CS-5824 K-Nearest Neighbors Jia-Bin Huang ECE-5424G / CS-5824 Virginia Tech Spring 2019 Administrative Check out review materials Probability Linear algebra Python and NumPy Start your HW 0 On your Local machine:

More information

Large Scale Data Analysis Using Deep Learning

Large Scale Data Analysis Using Deep Learning Large Scale Data Analysis Using Deep Learning Machine Learning Basics - 1 U Kang Seoul National University U Kang 1 In This Lecture Overview of Machine Learning Capacity, overfitting, and underfitting

More information

Data Mining and Machine Learning. Instance-Based Learning. Rote Learning k Nearest-Neighbor Classification. IBL and Rule Learning

Data Mining and Machine Learning. Instance-Based Learning. Rote Learning k Nearest-Neighbor Classification. IBL and Rule Learning Data Mining and Machine Learning Instance-Based Learning Rote Learning k Nearest-Neighbor Classification Prediction, Weighted Prediction choosing k feature weighting (RELIEF) instance weighting (PEBLS)

More information

Instance-based Learning

Instance-based Learning Instance-based Learning Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University October 15 th, 2007 2005-2007 Carlos Guestrin 1 1-Nearest Neighbor Four things make a memory based learner:

More information

Classification: Feature Vectors

Classification: Feature Vectors Classification: Feature Vectors Hello, Do you want free printr cartriges? Why pay more when you can get them ABSOLUTELY FREE! Just # free YOUR_NAME MISSPELLED FROM_FRIEND... : : : : 2 0 2 0 PIXEL 7,12

More information

Instance-Based Learning: Nearest neighbor and kernel regression and classificiation

Instance-Based Learning: Nearest neighbor and kernel regression and classificiation Instance-Based Learning: Nearest neighbor and kernel regression and classificiation Emily Fox University of Washington February 3, 2017 Simplest approach: Nearest neighbor regression 1 Fit locally to each

More information

SUPERVISED LEARNING METHODS. Stanley Liang, PhD Candidate, Lassonde School of Engineering, York University Helix Science Engagement Programs 2018

SUPERVISED LEARNING METHODS. Stanley Liang, PhD Candidate, Lassonde School of Engineering, York University Helix Science Engagement Programs 2018 SUPERVISED LEARNING METHODS Stanley Liang, PhD Candidate, Lassonde School of Engineering, York University Helix Science Engagement Programs 2018 2 CHOICE OF ML You cannot know which algorithm will work

More information

Supervised Learning: Nearest Neighbors

Supervised Learning: Nearest Neighbors CS 2750: Machine Learning Supervised Learning: Nearest Neighbors Prof. Adriana Kovashka University of Pittsburgh February 1, 2016 Today: Supervised Learning Part I Basic formulation of the simplest classifier:

More information

CSE 573: Artificial Intelligence Autumn 2010

CSE 573: Artificial Intelligence Autumn 2010 CSE 573: Artificial Intelligence Autumn 2010 Lecture 16: Machine Learning Topics 12/7/2010 Luke Zettlemoyer Most slides over the course adapted from Dan Klein. 1 Announcements Syllabus revised Machine

More information

Basis Functions. Volker Tresp Summer 2017

Basis Functions. Volker Tresp Summer 2017 Basis Functions Volker Tresp Summer 2017 1 Nonlinear Mappings and Nonlinear Classifiers Regression: Linearity is often a good assumption when many inputs influence the output Some natural laws are (approximately)

More information

Classification and Regression

Classification and Regression Classification and Regression Announcements Study guide for exam is on the LMS Sample exam will be posted by Monday Reminder that phase 3 oral presentations are being held next week during workshops Plan

More information

Classification and K-Nearest Neighbors

Classification and K-Nearest Neighbors Classification and K-Nearest Neighbors Administrivia o Reminder: Homework 1 is due by 5pm Friday on Moodle o Reading Quiz associated with today s lecture. Due before class Wednesday. NOTETAKER 2 Regression

More information

Data Preprocessing. Supervised Learning

Data Preprocessing. Supervised Learning Supervised Learning Regression Given the value of an input X, the output Y belongs to the set of real values R. The goal is to predict output accurately for a new input. The predictions or outputs y are

More information

A Soft-Computing Approach to Knowledge Flow Synthesis and Optimization

A Soft-Computing Approach to Knowledge Flow Synthesis and Optimization A Soft-Computing Approach to Knowledge Flow Synthesis and Optimization Tomáš Řehořek Pavel Kordík Computational Intelligence Group (CIG), Faculty of Information Technology (FIT), Czech Technical University

More information

Introduction to Machine Learning. Xiaojin Zhu

Introduction to Machine Learning. Xiaojin Zhu Introduction to Machine Learning Xiaojin Zhu jerryzhu@cs.wisc.edu Read Chapter 1 of this book: Xiaojin Zhu and Andrew B. Goldberg. Introduction to Semi- Supervised Learning. http://www.morganclaypool.com/doi/abs/10.2200/s00196ed1v01y200906aim006

More information

A Lazy Approach for Machine Learning Algorithms

A Lazy Approach for Machine Learning Algorithms A Lazy Approach for Machine Learning Algorithms Inés M. Galván, José M. Valls, Nicolas Lecomte and Pedro Isasi Abstract Most machine learning algorithms are eager methods in the sense that a model is generated

More information

Announcements:$Rough$Plan$$

Announcements:$Rough$Plan$$ UVACS6316 Fall2015Graduate: MachineLearning Lecture16:K@nearest@neighbor Classifier/Bias@VarianceTradeoff 10/27/15 Dr.YanjunQi UniversityofVirginia Departmentof ComputerScience 1 Announcements:RoughPlan

More information

Image Registration Lecture 4: First Examples

Image Registration Lecture 4: First Examples Image Registration Lecture 4: First Examples Prof. Charlene Tsai Outline Example Intensity-based registration SSD error function Image mapping Function minimization: Gradient descent Derivative calculation

More information

Non-Parametric Modeling

Non-Parametric Modeling Non-Parametric Modeling CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Introduction Non-Parametric Density Estimation Parzen Windows Kn-Nearest Neighbor

More information