Nearest neighbor classifiers


Nearest Neighbor Classification (chapter e-6: distance-based methods)

NN(image):
1. Find the image in the training data which is closest to the query image.
2. Return its label.
[Figure: a query image and its closest training image.]

Measuring distance. Both supervised learning methods (such as nearest neighbor methods) and clustering (k-means and hierarchical clustering) need a way to measure closeness. How do we measure closeness? Distance measures for continuous data:

The Euclidean distance: $d_2(x, x') = \|x - x'\|_2 = \sqrt{(x - x')^\top (x - x')} = \sqrt{\sum_{i=1}^{d} (x_i - x'_i)^2}$ (based on the 2-norm).
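A minimal sketch of this NN rule in Python with NumPy (the names train_X, train_y, and query are illustrative, not from the slides):

    import numpy as np

    def nn_classify(train_X, train_y, query):
        """Return the label of the training point closest to the query (Euclidean distance)."""
        dists = np.linalg.norm(train_X - query, axis=1)  # distance from the query to every training point
        return train_y[np.argmin(dists)]                 # label of the closest training point

    # toy usage
    train_X = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
    train_y = np.array(["cat", "cat", "dog"])
    print(nn_classify(train_X, train_y, np.array([4.5, 4.8])))  # -> "dog"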

More distance measures.

The Manhattan distance: $d_1(x, x') = \|x - x'\|_1 = \sum_{i=1}^{d} |x_i - x'_i|$. This is the distance if we can only travel along the coordinate axes.

The Minkowski distance of order p: $d_p(x, x') = \|x - x'\|_p = \left( \sum_{i=1}^{d} |x_i - x'_i|^p \right)^{1/p}$. This is based on the p-norm (sometimes called the $L_p$ norm). The Euclidean and Manhattan distances are the special cases p = 2 and p = 1.

What happens when p goes to infinity? $d_\infty(x, x') = \max_i |x_i - x'_i|$. We get the Chebyshev distance.
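A small sketch of these distances (the vectors are toy values; minkowski below reduces to the Manhattan distance for p = 1, the Euclidean distance for p = 2, and approaches the Chebyshev distance as p grows):

    import numpy as np

    def minkowski(x, y, p):
        return np.sum(np.abs(x - y) ** p) ** (1.0 / p)

    def chebyshev(x, y):
        return np.max(np.abs(x - y))

    x, y = np.array([1.0, 2.0]), np.array([4.0, 6.0])
    print(minkowski(x, y, 1))   # Manhattan: 7.0
    print(minkowski(x, y, 2))   # Euclidean: 5.0
    print(minkowski(x, y, 50))  # already close to the Chebyshev distance
    print(chebyshev(x, y))      # 4.0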

The Minkowski distance. [Figure: the unit sphere for various values of p.]

Properties of a distance. A function d(x, x') is called a distance function if it satisfies the following conditions:
i. d(x, x) = 0 (the distance of a point to itself is zero)
ii. d(x, x') > 0 if x ≠ x' (all other distances are non-zero)
iii. d(x, x') = d(x', x) (distances are symmetric)
iv. d(x, x'') ≤ d(x, x') + d(x', x'') (detours make distances larger)
The last condition is called the triangle inequality. The Minkowski distance with p < 1 does not satisfy the triangle inequality.

Data normalization is very important in this context!

The Mahalanobis distance. Sometimes it is useful to use different scales for different coordinates. Therefore: use an ellipse rather than a circle to identify the points that are a fixed distance away, and also consider rotating the ellipse. [Figure: with a 45-degree rotation matrix R and a diagonal scaling matrix S, the rotated ellipse $x^\top R^\top S R x = 1/4$, the axis-parallel ellipse $x^\top S x = 1/4$, and the circle $x^\top x = 1/4$.]
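Since the slides stress that normalization matters here, the following is one common way to do it (a sketch, not prescribed by the slides): rescale each coordinate to zero mean and unit variance using statistics estimated from the training set, and apply the same transformation to queries.

    import numpy as np

    def standardize(train_X):
        """Return a function that z-scores data using the training-set statistics."""
        mu = train_X.mean(axis=0)
        sigma = train_X.std(axis=0)
        sigma[sigma == 0] = 1.0                      # guard against constant features
        return lambda X: (X - mu) / sigma

    train_X = np.array([[1.0, 100.0], [2.0, 300.0], [3.0, 200.0]])
    scale = standardize(train_X)
    print(scale(train_X))                            # each column now has mean 0 and unit variance
    print(scale(np.array([[2.5, 250.0]])))           # queries are rescaled with the *training* statistics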

The Mahalanobis distance. The shape of the ellipse can be estimated from data as the inverse of the covariance matrix $\Sigma$: $\mathrm{Dis}_M(x, y) = \sqrt{(x - y)^\top \Sigma^{-1} (x - y)}$. (The inverse of the covariance matrix has the effect of decorrelating and normalizing the features.)

The nearest neighbor rule, more formally. NN(image): 1. Find the image in the training data which is closest to the query image. 2. Return its label. In other words, classify a given test example to the class of the nearest training example. Given data $D = (x_1, y_1), \ldots, (x_N, y_N)$, reorder the data according to its similarity to an input x (breaking ties if needed), writing the reordered pairs as $(x_{[n]}(x), y_{[n]}(x))$, i.e. $d(x, x_{[1]}) \le d(x, x_{[2]}) \le \cdots \le d(x, x_{[N]})$. The prediction is $g(x) = y_{[1]}(x)$, the label of the nearest point to x in the data set.

Voronoi diagrams. The Voronoi diagram with respect to a collection of points $x_1, \ldots, x_N$: the Voronoi cell associated with point $x_i$ is the set of points that are closer to $x_i$ than to every other point in the collection.
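A sketch of the Mahalanobis distance estimated from data (the use of np.cov with rowvar=False for row-per-example data, and of a pseudo-inverse to guard against a singular covariance matrix, are implementation choices of this sketch):

    import numpy as np

    def mahalanobis(x, y, data):
        cov = np.cov(data, rowvar=False)   # estimate the covariance matrix from the data
        cov_inv = np.linalg.pinv(cov)      # its (pseudo-)inverse decorrelates and normalizes features
        d = x - y
        return np.sqrt(d @ cov_inv @ d)

    data = np.random.default_rng(0).normal(size=(200, 3))
    print(mahalanobis(data[0], data[1], data))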

Voronoi diagrams. The Voronoi diagram depends on the distance measure that is used: a Voronoi diagram computed from the Euclidean distance (L2 norm) differs from one computed from the Manhattan distance (L1 norm). [Figure: Voronoi diagrams of the same point set under the two distances.]

The decision boundary of the nearest neighbor classifier is the result of fusing adjacent Voronoi cells that are associated with the same class.

What is the accuracy of the nearest neighbor classifier when it is tested on the training set? (i.e., what is E_in?) It is perfect: each training point is its own nearest neighbor, so E_in = 0.

Property of the nearest neighbor classifier: $E_{\mathrm{out}} \le 2 E^*_{\mathrm{out}}$, where $E^*_{\mathrm{out}}$ is the error of an optimal classifier. More precisely: Theorem: For any δ > 0 there is a sufficiently large N such that, with probability at least 1 − δ, the resulting nearest neighbor classifier has $E_{\mathrm{out}} \le 2 E^*_{\mathrm{out}}$.

k-NN. Use the closest k neighbors to make a decision instead of a single nearest neighbor, and choose the label that occurs among the majority of the k nearest neighbors. Why do you expect this to work better? It can also produce confidence scores. How? Other refinements: an example's vote can be made inversely proportional to its distance from the query.

1-NN vs k-NN. Decision boundary of 1-NN vs k-NN on the digits data. [Figure 6.1 in chapter e-6: the 1-NN rule and a k-NN rule with a larger k, classifying a random sample of 500 digits, one digit class (blue circles) vs all other digits.] For the 1-NN rule, the in-sample error is zero, resulting in a complicated decision boundary with islands of red and blue regions. For the k-NN rule with a larger k, the in-sample error is not zero and the decision boundary is simpler.
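A sketch of the k-NN rule with plain majority voting and with the distance-weighted refinement mentioned above (names are illustrative):

    import numpy as np
    from collections import Counter

    def knn_classify(train_X, train_y, query, k=3, weighted=False):
        dists = np.linalg.norm(train_X - query, axis=1)
        idx = np.argsort(dists)[:k]                      # indices of the k nearest neighbors
        if not weighted:
            return Counter(train_y[idx]).most_common(1)[0][0]   # plain majority vote
        votes = {}
        for i in idx:                                    # each vote is inversely proportional to distance
            votes[train_y[i]] = votes.get(train_y[i], 0.0) + 1.0 / (dists[i] + 1e-12)
        return max(votes, key=votes.get)

One answer to the confidence-score question above: report the fraction of the k neighbors (or of the total vote weight) that agrees with the returned label.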

How to choose k? The value of k can be chosen using cross-validation, like any other classifier parameter. [Figure 6.3 in chapter e-6: E_out (%) versus the number of data points N for several choices of k, including the cross-validated choice.]

Interim conclusions. Properties of nearest neighbor classifiers:
- Simple and easy to implement
- No training required
- Expressive: can achieve zero training error
- Easy to explain the result
But:
- Running time can be an issue
- Not the best in terms of generalization

Running time for testing an example when the dataset has N examples? O(N). Expensive when dealing with large datasets.
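A sketch of choosing k by cross-validation, here using scikit-learn's KNeighborsClassifier and cross_val_score (the tooling and the candidate values of k are choices of this sketch; the slides only say that k is chosen like any other parameter):

    from sklearn.datasets import load_digits
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_digits(return_X_y=True)
    scores = {k: cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
              for k in [1, 3, 5, 7, 9, 15]}
    best_k = max(scores, key=scores.get)       # k with the highest cross-validated accuracy
    print(scores, "best k:", best_k)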

Running time for testing an example when the dataset has N examples is O(N). Solutions:
- Condense the dataset. [Figure: condensed data for 1-NN and for k-NN.]
- Efficient nearest neighbor search (KD-trees, ball trees, vantage-point trees).

Algorithm: KD-tree
- Cycle through the coordinates.
- Insert a node that corresponds to the median of the given coordinate, and put all other points in the left/right subtree on the basis of that coordinate.
[Figure: a KD-tree for the set of points (2,3), (5,4), (9,6), (4,7), (8,1), (7,2).]
A KD-tree can be used for implementing nearest neighbor search in O(log N) time. It is not effective for high-dimensional data (use a ball tree or a vantage-point tree instead). A sketch of KD-tree-based search follows below.

Nearest neighbors in high dimensions. Distance functions lose their usefulness in high dimensions. Consider the Euclidean distance, $d_2(x, x') = \sqrt{\sum_{i=1}^{d} (x_i - x'_i)^2}$: we expect that if d is large, many of the features won't be relevant, and so the signal contained in the informative dimensions can easily be corrupted by the noise. This can lead to low accuracy of a nearest neighbor classifier.

The curse of dimensionality: an umbrella term for the issues that can arise in high-dimensional data. Solutions: feature selection, dimensionality reduction.
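A sketch of efficient nearest neighbor search with a KD-tree, using SciPy's cKDTree as one possible implementation (the six points are the slide's example set):

    import numpy as np
    from scipy.spatial import cKDTree

    points = np.array([[2, 3], [5, 4], [9, 6], [4, 7], [8, 1], [7, 2]], dtype=float)
    tree = cKDTree(points)                     # build the tree once
    dist, idx = tree.query([9.0, 2.0], k=1)    # nearest neighbor of the query point (9, 2)
    print(dist, points[idx])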

The curse of dimensionality. Some of our intuition from low-dimensional spaces breaks in high dimensions. Example: in high dimensions, most of the volume of the unit sphere is very close to its surface. Let's compute the fraction of the volume that lies between r = 1 − ε and r = 1. Since the volume of a radius-r sphere in d dimensions is $V_d(r) = k_d r^d$, the required fraction is $\frac{V_d(1) - V_d(1-\varepsilon)}{V_d(1)} = 1 - (1-\varepsilon)^d$, which tends to 1 as d grows (for example, with ε = 0.01 and d = 1000 it is already $1 - 0.99^{1000} \approx 0.99996$). Related fact: the ratio of the volume of the unit sphere to the volume of the unit cube tends to 0 as d goes to infinity.

Multi-class problems. The nearest neighbor algorithm works much the same way for multi-class problems; in fact, nearest neighbor methods are easily adaptable to any ML problem. With multiclass data, the natural way to combine the classes of the nearest neighbors into the class of a test input is some form of majority voting. [Figure 6.6 in chapter e-6: (a) multiclass digits data and (b) 1-NN decision regions, with symmetry and average intensity as the two features.] On this data the multiclass success rate is well above random performance (10% for ten classes), but it is significantly below the two-class version (one digit vs all others), where the success rate was upwards of 98%; multiclass problems are typically much harder than two-class problems. Symmetry and intensity carry enough information to distinguish one digit from the others, but they are not powerful enough to solve the multiclass problem; better features would certainly help. One can also break the problem up into a sequence of two-class tasks and tailor the features to each two-class problem using domain knowledge.

Non-parametric vs parametric methods. Non-parametric methods don't have any parameters that are learned from the data, while parametric methods fix a specific form that the learned model will have. [Figure 6.5 in chapter e-6: the decision boundary of the flexible nonparametric nearest neighbor rule molds itself to the data, whereas the rigid parametric linear model always gives a linear separator.] The k-nearest neighbor method is also considered nonparametric (once the parameter k has been specified): there are no explicit parameters being learned, so nearest neighbor regression, below, is a nonparametric regression technique.

Nearest neighbor for regression. When the output is a real number (y ∈ R), the natural way to combine the outputs of the nearest neighbors is some form of averaging. The simplest way to extend the k-nearest-neighbor algorithm to regression is to take the average of the target values of the k nearest neighbors: $g(x) = \frac{1}{k} \sum_{i=1}^{k} y_{[i]}(x)$. [Figure 6.7 in chapter e-6: k-NN regression on a noisy toy data set for several values of k, with the target function shown in light gray.]

A general convergence result holds for the k-NN rule: under mild regularity conditions, no matter what the target f is, we can recover it as N grows, provided that k is chosen appropriately; this establishes the universal consistency of the k-NN rule. That is quite a powerful statement about such a simple learning model. Such convergence results under mild assumptions on f are a trademark of nonparametric methods, which has led to the folklore that nonparametric methods are, in some sense, more powerful than their parametric cousins: for the parametric linear model, one only gets such convergence to f with larger N if the target f happens to lie in the linearly parameterized hypothesis set. The distinction blurs, however, once we allow non-linear feature transforms (e.g. the polynomial transform): as the polynomial order increases, the number of parameters to be estimated grows and the hypothesis set H becomes more and more expressive, and one can let the order grow with N (but not too quickly) so that H eventually captures any fixed target.
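A sketch of the k-NN regression rule $g(x) = \frac{1}{k} \sum_{i=1}^{k} y_{[i]}(x)$ described above (names and the toy data are illustrative):

    import numpy as np

    def knn_regress(train_X, train_y, query, k=3):
        dists = np.linalg.norm(train_X - query, axis=1)
        idx = np.argsort(dists)[:k]        # k nearest neighbors of the query
        return train_y[idx].mean()         # average their target values

    train_X = np.linspace(0, 1, 20).reshape(-1, 1)
    train_y = np.sin(2 * np.pi * train_X).ravel()
    print(knn_regress(train_X, train_y, np.array([0.25]), k=3))  # roughly sin(pi/2) = 1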

Distances and kernels. The Euclidean distance can be expressed in terms of dot products: $\mathrm{Dis}_2(x, y) = \|x - y\| = \sqrt{(x - y)^\top (x - y)} = \sqrt{x \cdot x - 2\, x \cdot y + y \cdot y}$. Replacing the dot products with kernels gives $\mathrm{Dis}_K(x, y) = \sqrt{K(x, x) - 2 K(x, y) + K(y, y)}$. If we consider a kernel that satisfies K(x, x) = 1, then nearest neighbor classification with kernels or with distances is equivalent. As an alternative, consider kernels as measures of similarity: rather than looking for the closest points, look for the most similar points, and use the kernel directly.

Summary. Pros: simple and easy to implement; no training involved; one method that does it all. Cons: accuracy suffers in high dimensions; testing speed is an issue for large datasets.
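A sketch of the kernel-induced distance Dis_K and of using the kernel directly as a similarity (the RBF kernel here is just one example choice; for it K(x, x) = 1, so ranking by kernel similarity and ranking by Dis_K agree, as stated above):

    import numpy as np

    def rbf_kernel(x, y, gamma=1.0):
        return np.exp(-gamma * np.sum((x - y) ** 2))

    def kernel_distance(x, y, kernel):
        return np.sqrt(kernel(x, x) - 2 * kernel(x, y) + kernel(y, y))

    def most_similar_label(train_X, train_y, query, kernel):
        sims = np.array([kernel(x, query) for x in train_X])
        return train_y[np.argmax(sims)]    # most similar point instead of closest point

    train_X = np.array([[0.0, 0.0], [3.0, 3.0]])
    train_y = np.array(["a", "b"])
    q = np.array([2.5, 2.5])
    print(kernel_distance(q, train_X[1], rbf_kernel))
    print(most_similar_label(train_X, train_y, q, rbf_kernel))   # -> "b"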
