Nearest Neighbor Methods


Nearest Neighbor Methods
Nicholas Ruozzi, University of Texas at Dallas
Based on the slides of Vibhav Gogate and David Sontag

Nearest Neighbor Methods: Learning
Store all training examples. To classify a new point x, find the training example (x^(i), y^(i)) such that x^(i) is closest (for some notion of "close") to x, and classify x with the label y^(i).
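A minimal sketch of this 1-NN rule (the function name and the choice of Euclidean distance via NumPy are my own assumptions; the slides leave the notion of "close" open):

```python
import numpy as np

def nearest_neighbor_classify(X_train, y_train, x):
    """1-NN: return the label of the stored training example closest to x."""
    # Squared Euclidean distances from x to every stored training example
    dists = np.sum((X_train - x) ** 2, axis=1)
    return y_train[np.argmin(dists)]
```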

Nearest Neighbor Methods
[Slides 3-6: figures only; no text content.]

Nearest Neighbor Methods
k-nearest neighbor methods look at the k closest points in the training set and take a majority vote (k is usually chosen to be odd so that the vote cannot tie in the binary case).
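A hedged sketch of the majority-vote version (the brute-force distance computation and the helper name knn_classify are my own choices, not from the slides):

```python
from collections import Counter

import numpy as np

def knn_classify(X_train, y_train, x, k=3):
    """k-NN: majority vote over the labels of the k closest training examples."""
    dists = np.sum((X_train - x) ** 2, axis=1)   # squared Euclidean distances
    nearest = np.argsort(dists)[:k]              # indices of the k closest points
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]            # most frequent label among the k
```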

Nearest Neighbor Methods
Applies to data sets with points in R^d; best for large data sets with only a few (< 20) attributes.
Advantages:
- Learning is easy
- Can learn complicated decision boundaries
Disadvantages:
- Classification is slow (need to keep the entire training set around)
- Easily fooled by irrelevant attributes

Practical Challenges
- How to choose the right measure of closeness? Euclidean distance is popular, but there are many other possibilities.
- How to pick k? Too small and the estimates are noisy; too large and the accuracy suffers.
- What if the nearest neighbor is really far away?

Choosing the Distance
Euclidean distance makes sense when each of the features is roughly on the same scale. If the features are very different (e.g., height and age), then Euclidean distance makes less sense, as height would be less significant than age simply because age has a larger range of possible values. To correct for this, feature vectors are often scaled by the standard deviation over the training set.

Normalization
Sample mean: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x^{(i)}$
Sample variance: $\sigma_k^2 = \frac{1}{n} \sum_{i=1}^{n} \left( x_k^{(i)} - \bar{x}_k \right)^2$
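A small sketch of this normalization step, assuming the data is stored as a NumPy array with one row per example (the helper name standardize is hypothetical):

```python
import numpy as np

def standardize(X_train, X_test):
    """Scale each feature by its training-set standard deviation (zero mean, unit variance)."""
    mean = X_train.mean(axis=0)     # sample mean of each feature
    std = X_train.std(axis=0)       # sample standard deviation of each feature
    std[std == 0.0] = 1.0           # avoid dividing by zero for constant features
    return (X_train - mean) / std, (X_test - mean) / std
```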

Irrelevant Attributes
Consider the nearest neighbor problem in one dimension.

Irrelevant Attributes
Now, add a new attribute that is just random noise...

K-Dimensional Trees
In order to do classification, we can compute the distances between all points in the training set and the point we are trying to classify. If the data is d-dimensional, this takes O(nd) time per query for Euclidean distance. It is possible to do better if we do some preprocessing on the training data.

K-Dimensional Trees
k-d trees provide a data structure that can help simplify the classification task by constructing a tree that partitions the search space:
- Starting with the entire training set, choose some dimension i.
- Select an element of the training data whose i-th dimension has the median value among all elements of the training set.
- Divide the training set into two pieces, depending on whether their i-th attribute is smaller or larger than the median.
- Repeat this partitioning process on each of the two new pieces separately.
A sketch of this construction is given below.
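A rough sketch of the construction, assuming points and labels are NumPy arrays and that the splitting dimension simply cycles with depth (the slides leave the choice of dimension open):

```python
import numpy as np

class KDNode:
    """One node of a k-d tree: a splitting point, its label, the split axis, and two subtrees."""
    def __init__(self, point, label, axis, left=None, right=None):
        self.point, self.label, self.axis = point, label, axis
        self.left, self.right = left, right

def build_kdtree(points, labels, depth=0):
    """Recursively split the training data at the median along one dimension."""
    if len(points) == 0:
        return None
    axis = depth % points.shape[1]           # cycle through the dimensions
    order = np.argsort(points[:, axis])      # sort along the chosen dimension
    points, labels = points[order], labels[order]
    m = len(points) // 2                     # median element becomes this node
    return KDNode(points[m], labels[m], axis,
                  left=build_kdtree(points[:m], labels[:m], depth + 1),
                  right=build_kdtree(points[m + 1:], labels[m + 1:], depth + 1))
```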

K-Dimensional Trees
[Images from slides by Mehryar Mohri]

K-Dimensional Trees
- Start at the root of the k-d tree and traverse it to a leaf based on where the point to classify, x, should fall.
- Once a leaf node is reached, it is selected to be the current closest point to x.
- Follow the path, in the opposite direction, from the leaf back to the root. If the current node along the path is closer to x than the current closest point, it becomes the new closest point.
- Before moving up the tree, the algorithm checks whether there could be any points in the opposite partition that are closer to x than the current closest point. If so, the closest point in that subtree is computed recursively; otherwise, the parent of the current node along the path becomes the new current node.
A sketch of this search is given below.
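A sketch of the search under the same assumptions as the construction code above (squared Euclidean distances; the names are hypothetical):

```python
import numpy as np

def kdtree_nearest(node, x, best=None, best_dist=float("inf")):
    """Return (closest node found so far, its squared distance to x), pruning far subtrees."""
    if node is None:
        return best, best_dist
    d = np.sum((node.point - x) ** 2)        # squared distance to this node's point
    if d < best_dist:
        best, best_dist = node, d
    # Descend first into the side of the split that contains x
    diff = x[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best, best_dist = kdtree_nearest(near, x, best, best_dist)
    # Visit the opposite partition only if it could contain a closer point
    if diff ** 2 < best_dist:
        best, best_dist = kdtree_nearest(far, x, best, best_dist)
    return best, best_dist
```

For a query point x, kdtree_nearest(root, x)[0].label would then serve as the 1-NN prediction.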

K-Dimensional Trees
[Images from slides by Mehryar Mohri]

K-Dimensional Trees
By design, the constructed k-d tree is bushy (balanced). The idea is that if new points to classify are evenly distributed throughout the space, then the expected cost of classification is approximately O(d log n) operations.

Summary
- k-NN is fast and easy to implement
- No training required
- Can be good in practice (where applicable)