Experimental Design + k-Nearest Neighbors

10-601 Introduction to Machine Learning
Machine Learning Department, School of Computer Science, Carnegie Mellon University

Experimental Design + k-Nearest Neighbors

Matt Gormley
Lecture 3, January 25, 2016

KNN Readings: Mitchell 8.2; HTF 13.3; Bishop 2.5.2 (none in Murphy)
Probability Readings (next lecture): lecture notes from 10-600 (see Piazza post for the pointers); Murphy 2; Bishop 2 (none in HTF or Mitchell)

Reminders
- Background Exercises (Homework 1): released Wed, Jan. 25; due Mon, Jan. 30 at 5:30pm
- Website updates: office hours Google calendar on "People"; readings on "Schedule"
- Meet the AIs: Sarah, Daniel, Brynn

Outline
- k-Nearest Neighbors (KNN)
  - Special cases
  - Choosing k
  - Case Study: KNN on Fisher Iris Data
  - Case Study: KNN on 2D Gaussian Data
- Experimental Design
  - Train error vs. test error
  - Train / validation / test splits
  - Cross-validation
- Function Approximation View of ML

K-NEAREST NEIGHBORS

k-Nearest Neighbors
Whiteboard: Special cases; Choosing k
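
The whiteboard content is not transcribed. As an illustrative aside (not from the slides), a minimal KNN classifier can be sketched in a few lines of Python/NumPy; the function name, the Euclidean distance metric, and the tie-breaking behavior are assumptions here, not choices made in the lecture. Note how the special cases fall out: k = 1 gives the nearest-neighbor classifier, and k = N gives a majority vote over the whole training set.

    import numpy as np
    from collections import Counter

    def knn_predict(X_train, y_train, x_query, k):
        """Classify x_query by a majority vote among its k nearest training points."""
        # Euclidean distance from the query to every training point (assumed metric)
        dists = np.linalg.norm(X_train - x_query, axis=1)
        # Indices of the k closest training points
        nearest = np.argsort(dists)[:k]
        # Majority vote over their labels; ties broken by Counter's insertion order
        return Counter(y_train[nearest]).most_common(1)[0][0]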

KNN ON FISHER IRIS DATA

Fisher Iris Dataset
Fisher (1936) used 150 measurements of flowers from 3 different species, collected by Anderson (1936): Iris setosa (0), Iris virginica (1), Iris versicolor (2).

    Species | Sepal Length | Sepal Width | Petal Length | Petal Width
    0       | 4.3          | 3.0         | 1.1          | 0.1
    0       | 4.9          | 3.6         | 1.4          | 0.1
    0       | 5.3          | 3.7         | 1.5          | 0.2
    1       | 4.9          | 2.4         | 3.3          | 1.0
    1       | 5.7          | 2.8         | 4.1          | 1.3
    1       | 6.3          | 3.3         | 4.7          | 1.6
    1       | 6.7          | 3.0         | 5.0          | 1.7

Full dataset: https://en.wikipedia.org/wiki/iris_flower_data_set
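
As a hedged aside (not part of the slides), this case study is easy to reproduce with scikit-learn's load_iris and KNeighborsClassifier; restricting to two features so the decision regions can be drawn in 2D is an assumption, since the transcription does not say which features the plots use.

    from sklearn.datasets import load_iris
    from sklearn.neighbors import KNeighborsClassifier

    iris = load_iris()
    X, y = iris.data[:, 2:4], iris.target   # petal length and width only (assumption)
    clf = KNeighborsClassifier(n_neighbors=5).fit(X, y)
    print(clf.score(X, y))                  # training accuracy; see cross-validation below for honest error

Sweeping n_neighbors here is presumably the knob being varied across the plots that follow.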

[Figure series: KNN decision boundaries on the Fisher Iris data as k varies, including the special cases Nearest Neighbor and Majority Vote; plots not preserved in the transcription.]

KNN ON GAUSSIAN DATA

[Figure series: KNN on the 2D Gaussian data; plots not preserved in the transcription.]
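
The Gaussian case study figures cannot be reconstructed from the transcription, but the sketch below shows how such plots are typically produced; the class means, covariances, sample sizes, and grid are illustrative assumptions, not the lecture's values.

    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier

    rng = np.random.default_rng(0)
    # Two classes, each drawn from a 2D Gaussian (parameters are assumptions)
    X0 = rng.multivariate_normal(mean=[0, 0], cov=[[1, 0], [0, 1]], size=100)
    X1 = rng.multivariate_normal(mean=[2, 2], cov=[[1, 0], [0, 1]], size=100)
    X = np.vstack([X0, X1])
    y = np.repeat([0, 1], 100)

    clf = KNeighborsClassifier(n_neighbors=5).fit(X, y)

    # Predict on a dense grid; the class change in Z traces the decision boundary
    xx, yy = np.meshgrid(np.linspace(-3, 5, 200), np.linspace(-3, 5, 200))
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    # Plot Z with, e.g., matplotlib's plt.contourf(xx, yy, Z) plus a scatter of X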

CHOOSING THE NUMBER OF NEIGHBORS

F-Fold Cross-Validation
(Name changed from K-Fold Cross-Validation to avoid confusion with KNN)

Key idea: rather than just a single validation set, use many! (The error estimate is more stable, but computation is slower.)

Divide the data $\mathcal{D} = \{(x^{(1)}, y^{(1)}), \ldots, (x^{(N)}, y^{(N)})\}$ into folds, e.g. 4:
1. Train on folds {1,2,3} and predict on {4}
2. Train on folds {1,2,4} and predict on {3}
3. Train on folds {1,3,4} and predict on {2}
4. Train on folds {2,3,4} and predict on {1}
Then concatenate all the predictions and evaluate the error.
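
A sketch of this recipe in Python (not from the slides) follows; the round-robin fold assignment is an assumption (in practice the data is usually shuffled first), and knn_predict is the illustrative function sketched earlier in these notes.

    import numpy as np

    def f_fold_cv_error(X, y, k, num_folds=4):
        """Estimate KNN misclassification error with F-fold cross-validation."""
        N = len(y)
        fold_of = np.arange(N) % num_folds    # assumption: round-robin fold assignment
        preds = np.empty(N, dtype=y.dtype)
        for f in range(num_folds):
            train, test = fold_of != f, fold_of == f
            # Train on all other folds, predict on fold f
            preds[test] = [knn_predict(X[train], y[train], x, k) for x in X[test]]
        # Concatenate all the predictions and evaluate the error
        return np.mean(preds != y)

Choosing the number of neighbors then reduces to computing this error for each candidate k and keeping the k with the lowest cross-validation error.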

Math as Code
How to implement $y_{\max} = \operatorname{argmax}_{y \in \mathcal{Y}} f(y)$?
It depends on how large the set $\mathcal{Y}$ is! If it's a small enumerable set, $\mathcal{Y} = \{1, 2, \ldots, 77\}$, then:

    def argmax_over_Y(f):
        ymax, fmax = None, float('-inf')
        for y in range(1, 78):    # y in {1, 2, ..., 77}
            if f(y) > fmax:
                ymax, fmax = y, f(y)
        return ymax

Math as Code
How to implement $v_{\max} = \max_{y \in \mathcal{Y}} f(y)$?
It depends on how large the set $\mathcal{Y}$ is! If it's a small enumerable set, $\mathcal{Y} = \{1, 2, \ldots, 77\}$, then:

    def max_over_Y(f):
        vmax = float('-inf')
        for y in range(1, 78):    # y in {1, 2, ..., 77}
            if f(y) > vmax:
                vmax = f(y)
        return vmax

Function Approximation View of ML
Whiteboard

Beyond the Scope of This Lecture
- k-Nearest Neighbors (KNN) for regression
- Distance-weighted KNN
- Cover & Hart (1967) Bayes error rate bound
- KNN for facial recognition (see Eigenfaces in the PCA lecture)

Takeaways
- k-Nearest Neighbors requires a careful choice of k (the number of neighbors).
- Experimental design can be just as important as the learning algorithm itself.
- Function Approximation View:
  - Assumption: inputs are sampled from some unknown distribution.
  - Assumption: outputs come from a fixed unknown function (e.g. a human annotator).
  - Goal: learn a hypothesis which closely approximates that function.