Learning from Data. COMP61011 : Machine Learning and Data Mining. Dr Gavin Brown, Machine Learning and Optimization Research Group


1 Learning from Data COMP61011 : Machine Learning and Data Mining Dr Gavin Brown, Machine Learning and Optimization Research Group

2 Learning from Data Data is recorded from some real-world phenomenon. What might we want to do with that data? Prediction - what can we predict about this phenomenon? Description - how can we describe/understand this phenomenon in a new way? Optimization - how can we control and optimize this phenomenon for our own objectives?

3 COMP61011 Machine Learning & Data Mining - Period 1 (Oct/Nov) - Prediction - Lecturer: Dr Gavin Brown. COMP61021 Modeling & Visualization of High Dimensional Data - Period 2 (Nov/Dec). COMP61032 Optimization for Learning, Planning & Problem Solving - Period 3 (Feb/Mar).

4 Machine Learning and Data Mining Medical Records / Novel Drugs What characteristics of a patient indicate they may react well/badly to a new drug? How can we predict whether it will potentially hurt rather than help them? AstraZeneca Project Research Bursaries Limited number of eligible MSc projects, announced Dec 2011

5 Machine Learning and Data Mining Handwriting Recognition Google Books is currently digitizing millions of books. Smartphones need to process non-European handwriting to tap into the Asian market. How can we recognize handwritten digits in a huge variety of handwriting styles, in real-time?

6 Learning from Data Where does all this fit? [Venn diagram: Artificial Intelligence, Statistics / Mathematics, Data Mining, Computer Vision, Robotics, with Learning from Data at their intersection] (No definition of a field is perfect - the diagram above is just one interpretation, mine ;-) )

7 Learn your trade

8 Learning from Data.. Prerequisites MATHEMATICS This is a mathematical subject. You must be comfortable with probabilities and algebra. Maths primer on course website for reviewing. PROGRAMMING You must be able to program, and pick up a new language relatively easily. We use Matlab for the first 2 modules. In the 3rd module, you may use any language. Module codes in this theme: COMP61011 (prediction), COMP61021 (description), COMP61032 (optimization).

9 COMP61011 topic structure Week 1: Some Data and Simple Predictors Week 2: Support Vector Machines / Model Selection Week 3: Decision Trees / Feature Selection Week 4: Bayes Theorem / Probabilistic Classifiers Week 5: Ensemble Methods / Industry Guest Lectures Week 6: No lecture.

10 COMP61011 assessment structure 50% January exam 50% coursework, broken down as 10% + 10% lab exercises (weeks 2,3) 30% mini-project (weeks 4,5,6) Lab exercises will be marked at the START of the following lab session. You should NOT be still working on the previous week's work. Extensions will require a medical note.

11 Matlab MATrix LABoratory Interactive scripting language Interpreted (i.e. no compiling) Objects possible, not compulsory Dynamically typed Flexible GUI / plotting framework Large libraries of tools Highly optimized for maths Available free from Uni, but usable only when connected to our network (e.g. via VPN) Module-specific software supported on school machines only.

12 Books (not compulsory purchase, but recommended) Introduction to Machine Learning by Ethem Alpaydin - Technical. Contains all necessary material for modules 1+2 of this theme. Very Short Introduction to Statistics by David Hand - Not technical at all. More of a motivational, big-picture read.

13 Some Data, and Simple Predictors

14 A Problem to Solve Distinguish rugby players from ballet dancers. You are provided with some data. Fallowfield rugby club (16). Rusholme ballet troupe (10). Task Generate a program which will correctly classify ANY player/dancer in the world. Hint We shouldn't fine-tune our system too much so it only works on the local clubs.

15 Taking measurements. We have to process the people with a computer, so the data needs to be in a computer-readable form. What are the distinguishing characteristics? 1. Height 2. Weight 3. Shoe size 4. Sex

16 Taking measurements.
Person  Weight  Height
1       63kg    190cm
2       55kg    185cm
3       75kg    202cm
4       50kg    180cm
5       57kg    174cm
6       85kg    150cm
7       93kg    145cm
8       75kg    130cm
9       99kg    163cm
10      100kg   171cm
[Scatter plot of the data: height vs weight]

17 The Nearest Neighbour Rule TRAINING DATA: the Person/Weight/Height table from slide 16, plus a Class column (player or dancer) for each person. [Scatter plot: height vs weight] TESTING DATA: Who's this guy - player or dancer? height = 180cm, weight = 78kg

18 The Nearest Neighbour Rule TRAINING DATA: as on slide 17. [Scatter plot: height vs weight] For the test point (height = 180cm, weight = 78kg): 1. Find nearest neighbour 2. Assign the same class

19 The K-Nearest Neighbour Classifier Testing point x: For each training datapoint x', measure distance(x, x'). End. Sort distances. Select K nearest. Assign most common class! TRAINING DATA: as on slide 17. [Scatter plot: height vs weight]

20 Quick reminder: Pythagoras' theorem - how do we measure distance(x, x')? For a right triangle with sides a, b and hypotenuse c: a^2 + b^2 = c^2, so c = sqrt(a^2 + b^2). Applied to our features, this is the Euclidean distance: distance(x, x') = sqrt( sum_i (x_i - x'_i)^2 )
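A minimal MATLAB sketch of this distance computation (the variable names and values here are illustrative, not from the lecture data):

    % Euclidean distance between a test point and one training point
    x  = [180 78];                 % test point: height (cm), weight (kg)
    xp = [190 63];                 % one training point
    d  = sqrt(sum((x - xp).^2));   % equivalent to norm(x - xp)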

21 The K-Nearest Neighbour Classifier Testing point x: For each training datapoint x', measure distance(x, x'). End. Sort distances. Select K nearest. Assign most common class! TRAINING DATA: as on slide 17. Seems sensible. But what are the disadvantages?

22 The K-Nearest Neighbour Classifier TRAINING DATA: as on slide 17. [Scatter plot: height vs weight] Here I chose k=3. What would happen if I chose k=5? What would happen if I chose k=26? (Note: 16 + 10 = 26 training points in total.)

23 The K-Nearest Neighbour Classifier TRAINING DATA: as on slide 17. [Scatter plot: height vs weight] Any point on the left of this boundary is closer to the red circles. Any point on the right of this boundary is closer to the blue crosses. This is called the decision boundary.

24 Where's the decision boundary? [Scatter plot: height vs weight] Not always a simple straight line!

25 Where's the decision boundary? [Scatter plot: height vs weight] Not always contiguous!

26 So, we have our first machine learning algorithm: The K-Nearest Neighbour Classifier. Testing point x: For each training datapoint x', measure distance(x, x'). End. Sort distances. Select K nearest. Assign most common class! Make your own notes on its advantages / disadvantages.
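As a concrete (hedged) illustration, here is a minimal k-NN sketch in MATLAB; the function name and the assumption of numeric class labels (e.g. +1 = player, -1 = dancer) are mine, not the course's lab code:

    % Save as knn_classify.m: classify test point x from training data
    function yhat = knn_classify(Xtrain, ytrain, x, K)
        % Xtrain: N-by-d feature matrix, ytrain: N-by-1 numeric labels
        d = sqrt(sum((Xtrain - x).^2, 2));   % Euclidean distance to every training point
        [~, idx] = sort(d);                  % sort distances, ascending
        yhat = mode(ytrain(idx(1:K)));       % most common class among the K nearest
    end

(Note the model is just the stored training data; `Xtrain - x` uses implicit expansion, so this needs MATLAB R2016b or later.)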

27 The most important concept in Machine Learning

28 The most important concept in Machine Learning Looks good so far

29 The most important concept in Machine Learning Looks good so far Oh no! Mistakes! What happened?

30 The most important concept in Machine Learning Looks good so far Oh no! Mistakes! What happened? We didn't have all the data. We can never assume that we do. This is called OVER-FITTING to the small dataset.

31 Overfitting Overfitting happens when the classifier is too flexible for the problem. If we'd drawn a simpler decision boundary below, maybe a straight line, we may have gotten lower error.

32 Break for 10 mins Possible uses of your break: 1. Ensure you have a working login for the computer lab this afternoon. 2. Talk to me or a demonstrator about the material. 3. Read ahead in the notes. 4. Go get a coffee.

33 A simpler, more compact rule? [Scatter plot: height vs weight, with a vertical threshold at θ] if (weight > θ) then "player" else "dancer"

34 What's an algorithm to find a good threshold? θ = 40; while (num_mistakes != 0) { θ = θ + 1; num_mistakes = error(θ) } [Scatter plot: height vs weight] if (weight > θ) then "player" else "dancer"

35 We have our second Machine Learning procedure. The threshold classifier (also known as a "Decision Stump"): if (weight > θ) then "player" else "dancer". Learned by: θ = 40; while (num_mistakes != 0) { θ = θ + 1; num_mistakes = error(θ) }
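A runnable MATLAB sketch of this procedure, under my own assumptions (made-up toy data, numeric labels +1 = player / -1 = dancer, and a step of 1kg per iteration):

    % Decision stump: search for a weight threshold with zero training error
    weights = [55 57 63 75 85 93 99];    % made-up training weights (kg)
    y       = [-1 -1 -1 -1 +1 +1 +1];    % made-up class labels
    theta = 40;
    num_mistakes = inf;
    while num_mistakes ~= 0
        theta = theta + 1;               % try the next threshold
        pred = -ones(size(weights));     % default: "dancer" (-1)
        pred(weights > theta) = +1;      % weight > theta: "player" (+1)
        num_mistakes = sum(pred ~= y);   % training errors at this theta
    end

(Beware: exactly as on the slide, this loop never terminates if no perfect threshold exists - which is precisely the problem slide 39 runs into.)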

36 Three ingredients of a Machine Learning procedure Model The final product, the thing you have to package up and send to a customer. A piece of code with some parameters that need to be set. Error function The performance criterion: the function you use to judge how well the parameters of the model are set. Learning algorithm The algorithm that optimises the model parameters, using the error function to judge how well it is doing.

37 Three ingredients of a Threshold Classifier Model: if (weight > θ) then "player" else "dancer". Error function: num_mistakes = error(θ). Learning algorithm: θ = 40; while (num_mistakes != 0) { θ = θ + 1; num_mistakes = error(θ) }

38 What's the model for the k-NN classifier? For the k-NN, the model is the training data itself! - very good accuracy :) - very computationally intensive! :( [Scatter plot: height vs weight] Testing point x: For each training datapoint x', measure distance(x, x'). End. Sort distances. Select K nearest. Assign most common class!

39 New data: what's an algorithm to find a good threshold? [Scatter plot: height vs weight, with threshold θ] if (weight > θ) then "player" else "dancer" - the best threshold still makes 1 mistake. Our model does not match the problem!

40 New data: what's an algorithm to find a good threshold? [Scatter plot: height vs weight, with a sloped boundary separating the classes] A sloped boundary would work :) But our current model, if (weight > θ) then "player" else "dancer", cannot represent this :(

41 We need a more sophisticated model. Old rule: if (weight > θ) then "player" else "dancer". New rule: if (f(x) > θ) then "player" else "dancer", where x_1 = height (cm) and x_2 = weight (kg). The Linear Classifier: f(x) = (w_1 * x_1) + (w_2 * x_2) = sum_{i=1..d} w_i x_i [Scatter plot: height vs weight]

42 The Linear Classifier if f(x) > θ then "player" else "dancer", with f(x) = (w_1 * x_1) + (w_2 * x_2) = sum_{i=1..d} w_i x_i [Two scatter plots: height vs weight, each with a different decision boundary] Changing w_1, w_2 and θ changes the position of the DECISION BOUNDARY.
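A small MATLAB sketch of this decision rule (all parameter values here are illustrative, not learned):

    % Linear classifier: f(x) = w1*x1 + w2*x2, compared against threshold theta
    w = [0.5 1.0];            % one weight per feature [height weight]
    theta = 150;              % decision threshold
    x = [180 78];             % test point: height 180cm, weight 78kg
    f = sum(w .* x);          % f(x) = w1*x1 + w2*x2
    if f > theta, disp('player'), else, disp('dancer'), end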

43 Geometry of the Linear Classifier (1) if f(x) > θ then +1 ("player") else -1 ("dancer"). In 2-d, the decision boundary is a line; in higher dimensions, it is a decision hyper-plane. Any point on the plane evaluates to 0. Points not on the plane evaluate to +/-. [Figure: regions f(x) > 0 and f(x) < 0 either side of the line f(x) = 0, with weight vector [w_1, w_2]] The decision boundary is always ORTHOGONAL to the weight vector. See if you can prove this for yourself before going to the notes.

44 Geometry of the Linear Classifier (2) We can rearrange the decision rule: if sum_{i=1..d} w_i x_i > θ then +1 else -1. Equivalently: sum_{i=1..d} w_i x_i - θ > 0, i.e. sum_{i=1..d} w_i x_i + (-1 * w_0) > 0 with w_0 = θ. Defining an extra input x_0 = -1, the rule becomes: if sum_{i=0..d} w_i x_i > 0 then +1 else -1.
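In code, this threshold-absorbing trick is one line each way; a sketch with made-up numbers:

    % Fold the threshold into the weight vector via a constant extra input
    w = [0.5 1.0]; theta = 150;   % illustrative values, as before
    x = [180 78];
    w0 = [theta, w];              % w_0 = theta, prepended
    x0 = [-1, x];                 % x_0 = -1, prepended
    f  = sum(w0 .* x0);           % equals sum(w.*x) - theta
    % decision rule is now simply: f > 0 => +1, else -1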

45 Geometry of the Linear Classifier (3) On the plane: f(x) = sum_{i=1..d} w_i x_i - θ = 0. In 2 dimensions: f(x) = w_1 x_1 + w_2 x_2 - θ = 0, so (w_1/w_2) x_1 + x_2 = θ/w_2, i.e. x_2 = -(w_1/w_2) x_1 + θ/w_2. [Figure: regions f(x) > 0 and f(x) < 0 either side of f(x) = 0, with weight vector [w_1, w_2]] This now follows the geometry of a straight line y = mx + c, with m = -(w_1/w_2) and c = θ/w_2.
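That y = mx + c form makes the boundary easy to draw; a MATLAB sketch with illustrative parameters:

    % Plot the decision boundary x2 = m*x1 + c
    w = [0.5 1.0]; theta = 150;          % illustrative, not learned
    x1 = linspace(120, 210, 100);        % x1 = height (cm), per slide 41
    x2 = -(w(1)/w(2)).*x1 + theta/w(2);  % m = -w1/w2, c = theta/w2
    plot(x1, x2); xlabel('x_1 (height)'); ylabel('x_2 (weight)');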

46 The Linear Classifier Model: if sum_{i=0..d} w_i x_i > 0 then ŷ = +1 else ŷ = -1. Error function: e = (1/2)(f(x) - y)^2. Learning algorithm: ???... need to optimise the w values... [Scatter plot: height vs weight] Note the terminology: x = inputs, y = class. See notes for details!!

47 Gradient Descent With e = (1/2)(f(x) - y)^2, the chain rule gives de/dw_i = (de/df)(df/dw_i) = (f(x) - y) x_i. Follow the NEGATIVE gradient!

48 Stochastic Gradient Descent
initialise weight values to random numbers in range -1 to +1
for n = 1 to NUM_ITERATIONS
  for each training example (x,y)
    calculate f(x)
    for each weight i: w_i = w_i - α(f(x) - y) x_i
  end
end
α = a small constant, the learning rate.
Convergence theorem: If the data is linearly separable, then application of the learning rule will find a separating decision boundary, within a finite number of iterations.
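Here is that pseudocode as a runnable MATLAB sketch; the toy data, labels in {-1,+1}, and the tiny learning rate (the features are unscaled, so a larger α would diverge) are all my assumptions:

    % Stochastic gradient descent for the linear classifier, squared error
    X = [190 63; 185 55; 174 57; 150 85; 145 93];   % made-up [height weight] rows
    y = [-1; -1; -1; +1; +1];                       % made-up class labels
    X = [-ones(size(X,1),1) X];                     % absorb threshold: x_0 = -1
    w = 2*rand(1, size(X,2)) - 1;                   % random weights in [-1, +1]
    alpha = 1e-6;                                   % learning rate
    for n = 1:1000                                  % NUM_ITERATIONS
        for j = 1:size(X,1)                         % each training example (x,y)
            f = sum(w .* X(j,:));                   % calculate f(x)
            w = w - alpha*(f - y(j))*X(j,:);        % w_i = w_i - alpha*(f(x)-y)*x_i
        end
    end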

49 A problem initialise weight values to random numbers in range -1 to +1... [Scatter plot: height vs weight, with several valid boundaries] Depending on the random initialisation, the linear classifier will converge to one of the valid boundaries - but randomly!

50 Break for 30 mins Possible uses of your break: 1. Ensure you have a working login for the computer lab this afternoon. 2. Talk to me or a demonstrator about the material. 3. Read ahead in the notes. 4. Go get a coffee.

51 Another model: logistic regression Our model f(x) has range plus/minus INFINITY! Is this really necessary? What is the confidence of our decisions? Can we estimate PROBABILITIES? Logistic regression estimates p(y=1 | x). Output in range [0,1]. Sigmoid function: p(y=1 | x) = f(x) = 1 / (1 + e^(-(w^T x - θ)))
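A short MATLAB sketch of the sigmoid output (parameter values illustrative, as before):

    % Sigmoid squashes the linear score into a probability in [0,1]
    w = [0.5 1.0]; theta = 150; x = [180 78];   % illustrative values
    sigmoid = @(a) 1 ./ (1 + exp(-a));
    p = sigmoid(sum(w .* x) - theta);           % estimated p(y = 1 | x)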

52 Another error: cross entropy e = - sum_{j=1..N} [ y_j ln f(x_j) + (1 - y_j) ln(1 - f(x_j)) ] Above we assume y is either 0 or 1. Derived from the statistical principle of Likelihood. We'll see this again in a few weeks.
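As a MATLAB sketch (fx holds the model outputs f(x_j) and y the 0/1 labels; both made up here):

    % Cross-entropy error over a whole dataset
    fx = [0.1; 0.2; 0.3; 0.9; 0.8];                % illustrative model outputs
    y  = [0; 0; 0; 1; 1];                          % labels, 0 or 1
    e  = -sum( y.*log(fx) + (1-y).*log(1-fx) );    % smaller is better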

53 Gradient Descent For the cross-entropy error e = - sum_{j=1..N} [ y_j ln f(x_j) + (1 - y_j) ln(1 - f(x_j)) ], the chain rule again gives de/dw_i = (de/df)(df/dw_i) = (f(x) - y) x_i. Follow the NEGATIVE gradient! SAME update as for squared error!

54 Stochastic Gradient Descent
initialise weight values to random numbers in range -1 to +1
for n = 1 to NUM_ITERATIONS
  for each training example (x,y)
    calculate f(x)
    for each weight i: w_i = w_i - α(f(x) - y) x_i
  end
end
α = a small constant, the learning rate
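Since only f changes, the earlier SGD sketch needs just one modified line; here it is in full, again with made-up data (note the labels are now 0/1):

    % Stochastic gradient descent for logistic regression
    X = [190 63; 185 55; 174 57; 150 85; 145 93];   % made-up [height weight] rows
    y = [0; 0; 0; 1; 1];                            % labels now in {0,1}
    X = [-ones(size(X,1),1) X];                     % absorb threshold: x_0 = -1
    w = 2*rand(1, size(X,2)) - 1;                   % random weights in [-1, +1]
    alpha = 1e-4;                                   % sigmoid bounds f, so alpha can be larger
    for n = 1:1000
        for j = 1:size(X,1)
            f = 1/(1 + exp(-sum(w .* X(j,:))));     % sigmoid: p(y=1|x)
            w = w - alpha*(f - y(j))*X(j,:);        % identical update rule
        end
    end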

55 A natural pairing of error function to model Cross-entropy error e = - sum_{j=1..N} [ y_j ln f(x_j) + (1 - y_j) ln(1 - f(x_j)) ] pairs with the sigmoid model f(x) = 1 / (1 + e^(-(w^T x - θ))). Squared error e = (1/2)(f(x) - y)^2 pairs with the linear model f(x) = sum_{i=1..d} w_i x_i - θ. In both cases: de/dw_i = (de/df)(df/dw_i) = (f(x) - y) x_i.

56 Still a problem initialise weight values to random numbers in range -1 to +1... [Scatter plot: height vs weight, with several valid boundaries] Depending on the random initialisation, the logistic regression classifier will converge to one of the valid boundaries - but randomly!

57 Geometry of Linear Models (see notes)

58 Another problem - new data: non-linearly separable. [Scatter plot: height vs weight] Our model does not match the problem! We'll deal with this next week!

59 End of Day 1 Now read the notes. Read the "Surrounded by Statistics" chapter in the handouts. The fog will clear. This afternoon: learn MATLAB. This week's exercise is unassessed, but you are highly advised to get as much practice in as you can.
