K Nearest Neighbor Wrap Up + K-Means Clustering. Slides adapted from Prof. Carpuat


K Nearest Neighbor classification. Classification of a test instance with unknown class in {-1, +1} is based on the training data. K: number of neighbors considered.

Components of a k-NN Classifier. Distance metric: How do we measure distance between instances? Determines the layout of the example space. The k hyperparameter: How large a neighborhood should we consider? Determines the complexity of the hypothesis space.

K=1 and Voronoi Diagrams. Imagine we are given a bunch of training examples. Find the regions in feature space which are closest to each training example. Algorithm: if our test point is in the region corresponding to a given input point, return its label.
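A minimal sketch of this prediction rule in Python (the names knn_predict, X_train, etc. are illustrative, not from the slides): measure the distance from the test point to every training example, keep the K closest, and return the majority label. With K = 1 this is exactly the Voronoi rule above.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_test, k=1):
    """Predict the label of x_test by majority vote among its k nearest training examples.

    X_train: (N, d) array of training points; y_train: length-N array of labels.
    """
    # Euclidean distance from the test point to every training example
    dists = np.linalg.norm(X_train - x_test, axis=1)
    # Indices of the k closest training examples
    nearest = np.argsort(dists)[:k]
    # Majority vote over their labels; with k=1 this is the Voronoi rule
    return Counter(np.asarray(y_train)[nearest]).most_common(1)[0][0]
```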

Decision Boundaries for 1-NN

Decision Boundaries change with the distance function

Decision Boundaries change with K

The k hyperparameter. Tunes the complexity of the hypothesis space. If k = 1, every training example has its own neighborhood. If k = N, the entire feature space is one neighborhood! Higher k yields smoother decision boundaries. How would you set k in practice?
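One common answer, sketched below under the assumption that a held-out dev set is available and reusing the hypothetical knn_predict above: try several candidate values of k and keep the one with the best dev-set accuracy.

```python
import numpy as np

def choose_k(X_train, y_train, X_dev, y_dev, candidates=(1, 3, 5, 9, 15)):
    """Pick k by accuracy on a held-out dev set."""
    def dev_accuracy(k):
        preds = [knn_predict(X_train, y_train, x, k) for x in X_dev]
        return np.mean([p == y for p, y in zip(preds, y_dev)])
    return max(candidates, key=dev_accuracy)
```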

What is the inductive bias of k-NN? Nearby instances should have the same label. All features are equally important. Complexity is tuned by the k parameter.

Variations on k-NN: Weighted voting. Default: all neighbors have equal weight. Extension: weight neighbors by (inverse) distance.
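A possible sketch of the inverse-distance extension (again with illustrative names, not the slides' own code): each neighbor contributes a vote of 1/(distance + small constant) rather than 1.

```python
import numpy as np

def knn_predict_weighted(X_train, y_train, x_test, k=3, eps=1e-8):
    """k-NN where each neighbor's vote is weighted by inverse distance."""
    dists = np.linalg.norm(X_train - x_test, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = {}
    for i in nearest:
        label = y_train[i]
        # eps avoids division by zero when the test point coincides with a training point
        votes[label] = votes.get(label, 0.0) + 1.0 / (dists[i] + eps)
    return max(votes, key=votes.get)
```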

Variations on k-NN: Epsilon Ball Nearest Neighbors. Same general principle as k-NN, but change the method for selecting which training examples vote. Instead of using the K nearest neighbors, use all examples x' such that distance(x, x') ≤ ε.

Exercise: How would you modify KNN-Predict to perform Epsilon Ball NN?
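One possible answer, sketched under the same assumptions and illustrative names as the earlier snippets: replace the "k smallest distances" selection with "all examples within distance ε", and decide what to return if the ball is empty.

```python
import numpy as np
from collections import Counter

def epsilon_ball_predict(X_train, y_train, x_test, epsilon, default_label=None):
    """Vote over all training examples within distance epsilon of x_test."""
    dists = np.linalg.norm(X_train - x_test, axis=1)
    inside = np.where(dists <= epsilon)[0]
    if len(inside) == 0:
        # The ball may be empty: fall back to a default label (or, e.g., to 1-NN)
        return default_label
    return Counter(np.asarray(y_train)[inside]).most_common(1)[0][0]
```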

Exercise: When are DT vs. kNN appropriate? For each of the following properties of a classification problem, can Decision Trees handle it? Can K-NN handle it? Binary features; numeric features; categorical features; robust to noisy training examples; fast classification is crucial; many irrelevant features; relevant features have very different scales.

Exercise: When are DT vs. kNN appropriate?
Property of classification problem | Decision Trees | K-NN
Binary features | yes | yes
Numeric features | yes | yes
Categorical features | yes | yes
Robust to noisy training examples | no (for default algorithm) | yes (when k > 1)
Fast classification is crucial | yes | no
Many irrelevant features | yes | no
Relevant features have very different scales | yes | no

Recap. Nearest Neighbors (NN) algorithms for classification: K-NN, Epsilon Ball NN. They take a geometric view of learning. Fundamental Machine Learning Concepts: the decision boundary visualizes predictions over the entire feature space, characterizes the complexity of the learned model, and indicates overfitting/underfitting.

K-Means: an example of unsupervised learning

When applying a learning algorithm, some things are properties of the problem you are trying to solve, and some things are up to you to choose as the ML programmer. Which of the following are properties of the problem? The data-generating distribution; the train/dev/test split; the learning model; the loss function.

Today's Topics. A new algorithm: K-Means Clustering. Fundamental Machine Learning Concepts: unsupervised vs. supervised learning; decision boundary.

Clustering. Goal: automatically partition examples into groups of similar examples. Why? It is useful for automatically organizing data, understanding hidden structure in data, and preprocessing for further analysis.

What can we cluster in practice? News articles or web pages by topic; protein sequences by function, or genes according to expression profile; users of social networks by interest; customers according to purchase history; galaxies or nearby stars.

Clustering. Input: a set S of n points in a feature space, and a distance measure specifying the distance d(x_i, x_j) between pairs (x_i, x_j). Output: a partition {S_1, S_2, ..., S_k} of S.

Supervised Machine Learning as Function Approximation. Problem setting: a set of possible instances X; an unknown target function f: X → Y; a set of function hypotheses H = {h | h: X → Y}. Input: training examples {(x_1, y_1), ..., (x_N, y_N)} of the unknown target function f. Output: a hypothesis h ∈ H that best approximates the target function f.

Supervised vs. unsupervised learning. Clustering is an example of unsupervised learning: we are not given examples of classes y; instead we have to discover classes in the data.

2 datasets with very different underlying structure!

The K-Means Algorithm. Inputs: training data, and K, the number of clusters to discover.
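The slides present the algorithm pictorially; below is a minimal Python sketch of the standard loop (function and variable names are illustrative): pick K initial centers, then alternate between assigning each point to its nearest center and moving each center to the mean of its assigned points.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    """Minimal K-Means. X: (N, d) array. Returns (centers, cluster assignments)."""
    rng = np.random.default_rng(seed)
    # Initialize centers by picking k distinct training points at random
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)  # shape (N, k)
        assign = dists.argmin(axis=1)
        # Update step: each center moves to the mean of its assigned points
        new_centers = np.array([X[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
                                for j in range(k)])
        if np.allclose(new_centers, centers):
            break  # centers stopped moving: converged
        centers = new_centers
    return centers, assign
```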

Example: using K-Means to discover 2 clusters in data

Example: using K-Means to discover 2 clusters in data

K-Means properties. Time complexity: O(KNL), where K is the number of clusters, N is the number of examples, and L is the number of iterations. K is a hyperparameter: it needs to be set in advance (or learned on a dev set). Different initializations yield different results! K-Means doesn't necessarily converge to the best partition. Global view of data: it revisits all examples at every iteration.
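Because different initializations yield different results, a common remedy (sketched here, assuming the kmeans function above) is to run the algorithm several times from different random seeds and keep the partition with the smallest within-cluster sum of squared distances.

```python
import numpy as np

def kmeans_restarts(X, k, n_restarts=10):
    """Run kmeans() from several seeds and keep the lowest-cost clustering."""
    def cost(centers, assign):
        # Within-cluster sum of squared distances to each assigned center
        return sum(np.sum((X[assign == j] - centers[j]) ** 2) for j in range(k))
    runs = [kmeans(X, k, seed=s) for s in range(n_restarts)]
    return min(runs, key=lambda run: cost(*run))
```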

Impact of initialization

Impact of initialization