Learning from Data. COMP61011: Machine Learning and Data Mining. Dr Gavin Brown, Machine Learning and Optimization Research Group
1 Learning from Data COMP61011: Machine Learning and Data Mining. Dr Gavin Brown, Machine Learning and Optimization Research Group
2 Learning from Data
Data is recorded from some real-world phenomenon. What might we want to do with that data?
Prediction - what can we predict about this phenomenon?
Description - how can we describe/understand this phenomenon in a new way?
Optimization - how can we control and optimize this phenomenon for our own objectives?
3 COMP61011 Machine Learning & Data Mining (Period 1, Oct/Nov) - Prediction. Lecturer: Dr Gavin Brown
COMP61021 Modeling & Visualization of High Dimensional Data (Period 2, Nov/Dec)
COMP61032 Optimization for Learning, Planning & Problem Solving (Period 3, Feb/Mar)
4 Machine Learning and Data Mining
Medical Records / Novel Drugs: What characteristics of a patient indicate they may react well/badly to a new drug? How can we predict whether it will potentially hurt rather than help them?
AstraZeneca Project Research Bursaries: Limited number of eligible MSc projects, announced Dec 2011
5 Machine Learning and Data Mining
Handwriting Recognition: Google Books is currently digitizing millions of books. Smartphones need to process non-European handwriting to tap into the Asian market. How can we recognize handwritten digits in a huge variety of handwriting styles, in real-time?
6 Learning from Data
Where does all this fit? (Diagram: Artificial Intelligence, Statistics / Mathematics, Data Mining, Computer Vision, Robotics, with Learning from Data at the overlap.)
(No definition of a field is perfect; the diagram above is just one interpretation, mine ;-) )
7 Learn your trade
8 Learning from Data: Prerequisites
MATHEMATICS: This is a mathematical subject. You must be comfortable with probabilities and algebra. Maths primer on course website for reviewing.
PROGRAMMING: You must be able to program, and pick up a new language relatively easily. We use Matlab for the first 2 modules. In the 3rd module, you may use any language.
Module codes in this theme: COMP61011 (prediction), COMP61021 (description), COMP61032 (optimization)
9 COMP61011 topic structure
Week 1: Some Data and Simple Predictors
Week 2: Support Vector Machines / Model Selection
Week 3: Decision Trees / Feature Selection
Week 4: Bayes Theorem / Probabilistic Classifiers
Week 5: Ensemble Methods / Industry Guest Lectures
Week 6: No lecture.
10 COMP61011 assessment structure
50% January exam
50% coursework, broken down as 10% + 10% lab exercises (weeks 2,3) and 30% mini-project (weeks 4,5,6)
Lab exercises will be marked at the START of the following lab session. You should NOT be still working on the previous week's work. Extensions will require a medical note.
11 Matlab ("MATrix LABoratory")
Interactive scripting language. Interpreted (i.e. no compiling). Objects possible, not compulsory. Dynamically typed. Flexible GUI / plotting framework. Large libraries of tools. Highly optimized for maths.
Available free from Uni, but usable only when connected to our network (e.g. via VPN). Module-specific software supported on school machines only.
12 Books (not compulsory purchase, but recommended)
"Introduction to Machine Learning" by Ethem Alpaydin - Technical. Contains all necessary material for modules 1+2 of this theme.
"Very Short Introduction to Statistics" by David Hand - Not technical at all. More of a motivational, big-picture read.
13 Some Data, and Simple Predictors
14 A Problem to Solve
Distinguish rugby players from ballet dancers.
You are provided with some data: Fallowfield rugby club (16), Rusholme ballet troupe (10).
Task: Generate a program which will correctly classify ANY player/dancer in the world.
Hint: We shouldn't fine-tune our system too much so it only works on the local clubs.
15 Taking measurements.
We have to process the people with a computer, so the data needs to be in a computer-readable form. What are the distinguishing characteristics?
1. Height
2. Weight
3. Shoe size
4. Sex
16 Taking measurements.
Person  Weight  Height
1       63kg    190cm
2       55kg    185cm
3       75kg    202cm
4       50kg    180cm
5       57kg    174cm
6       85kg    150cm
7       93kg    145cm
8       75kg    130cm
9       99kg    163cm
10      100kg   171cm
(Scatter plot: height against weight.)
17 The Nearest Neighbour Rule
TRAINING DATA: the Person / Weight / Height table from the previous slide, with a Class column added, plotted as height against weight.
TESTING DATA: Who's this guy - player or dancer? height = 180cm, weight = 78kg
18 The Nearest Neighbour Rule
TRAINING DATA as before. Testing point: height = 180cm, weight = 78kg.
1. Find nearest neighbour
2. Assign the same class
19 The K-Nearest Neighbour Classifier
Testing point x
For each training datapoint x'
    measure distance(x, x')
End
Sort distances
Select K nearest
Assign most common class!
(TRAINING DATA: the Person / Weight / Height / Class table, plotted as height against weight.)
20 Quick reminder: Pythagoras' theorem
a^2 + b^2 = c^2, so c = sqrt(a^2 + b^2)
a.k.a. Euclidean distance:
distance(x, x') = sqrt( sum_i (x_i - x'_i)^2 )
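The distance formula above is easy to turn into code. A minimal Python sketch (the course labs use Matlab; this is purely for illustration):

```python
import math

def euclidean_distance(x, x_prime):
    # Square the per-feature differences, sum them, take the square root.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_prime)))

# Pythagoras in 2-d: a 3-4-5 right triangle.
print(euclidean_distance([0, 0], [3, 4]))  # 5.0
```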
21 The K-Nearest Neighbour Classifier
Testing point x
For each training datapoint x'
    measure distance(x, x')
End
Sort distances
Select K nearest
Assign most common class!
Seems sensible. But what are the disadvantages?
22 The K-Nearest Neighbour Classifier
(TRAINING DATA plotted as height against weight.)
Here I chose k=3. What would happen if I chose k=5? What would happen if I chose k=26?
23 The K-Nearest Neighbour Classifier
(TRAINING DATA plotted as height against weight.)
Any point on the left of this boundary is closer to the red circles. Any point on the right of this boundary is closer to the blue crosses. This is called the decision boundary.
24 Where's the decision boundary? Not always a simple straight line!
25 Where's the decision boundary? Not always contiguous!
26 So, we have our first machine learning algorithm.
The K-Nearest Neighbour Classifier
Testing point x
For each training datapoint x'
    measure distance(x, x')
End
Sort distances
Select K nearest
Assign most common class!
Make your own notes on its advantages / disadvantages.
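The pseudocode translates almost line for line into a runnable sketch. Python here rather than the course's Matlab, and the training set below is invented for illustration (the slides' class labels appear only on the plots):

```python
import math
from collections import Counter

def knn_classify(test_point, training_data, k):
    # k-NN: measure the distance to every training point, sort,
    # take the k nearest, and return the most common class among them.
    distances = []
    for features, label in training_data:
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(test_point, features)))
        distances.append((d, label))
    distances.sort(key=lambda pair: pair[0])
    k_nearest = [label for _, label in distances[:k]]
    return Counter(k_nearest).most_common(1)[0][0]

# (weight kg, height cm) -> class; labels invented for this example.
training = [((63, 190), "player"), ((55, 185), "dancer"),
            ((85, 150), "dancer"), ((99, 163), "player"),
            ((93, 145), "dancer"), ((100, 171), "player")]
print(knn_classify((78, 180), training, k=3))  # player
```

With k=3 the three nearest neighbours of this test point vote 2-1, so the majority class is returned.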
27 The most important concept in Machine Learning
28 The most important concept in Machine Learning Looks good so far
29 The most important concept in Machine Learning Looks good so far Oh no! Mistakes! What happened?
30 The most important concept in Machine Learning Looks good so far Oh no! Mistakes! What happened? We didn't have all the data. We can never assume that we do. This is called OVER-FITTING to the small dataset.
31 Overfitting
Overfitting happens when the classifier is too flexible for the problem. If we'd drawn a simpler decision boundary below, maybe a straight line, we may have gotten lower error.
32 Break for 10 mins
Possible uses of your break:
1. Ensure you have a working login for the computer lab this afternoon.
2. Talk to me or a demonstrator about the material.
3. Read ahead in the notes.
4. Go get a coffee.
33 A simpler, more compact rule?
if (weight > θ) then "player" else "dancer"
(Plot: height against weight, with threshold θ marked on the weight axis.)
34 What's an algorithm to find a good threshold?
θ = 40
while ( numMistakes != 0 ) {
    θ = θ + Δθ
    numMistakes = error(θ)
}
if (weight > θ) then "player" else "dancer"
35 We have our second Machine Learning procedure.
The threshold classifier (also known as a "Decision Stump")
if (weight > θ) then "player" else "dancer"
θ = 40
while ( numMistakes != 0 ) {
    θ = θ + Δθ
    numMistakes = error(θ)
}
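A decision stump can also be fitted by trying candidate thresholds and keeping the one with the fewest mistakes, which sidesteps the infinite loop the θ-incrementing pseudocode risks when no perfect threshold exists. A Python sketch on invented data (not the slides' dataset):

```python
def stump_error(theta, data):
    # Count mistakes of the rule: if weight > theta then "player" else "dancer".
    return sum(1 for weight, label in data
               if ("player" if weight > theta else "dancer") != label)

def fit_stump(data):
    # Try each observed weight as a candidate threshold; keep the best one.
    candidates = sorted(w for w, _ in data)
    return min(candidates, key=lambda theta: stump_error(theta, data))

data = [(50, "dancer"), (55, "dancer"), (63, "dancer"),
        (85, "player"), (93, "player"), (100, "player")]
theta = fit_stump(data)
print(theta, stump_error(theta, data))  # 63 0 : a threshold with zero mistakes
```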
36 Three ingredients of a Machine Learning procedure
Model: The final product, the thing you have to package up and send to a customer. A piece of code with some parameters that need to be set.
Error function: The performance criterion: the function you use to judge how well the parameters of the model are set.
Learning algorithm: The algorithm that optimises the model parameters, using the error function to judge how well it is doing.
37 Three ingredients of a Threshold Classifier
Model: if (weight > θ) then "player" else "dancer"
Error function: numMistakes = error(θ)
Learning algorithm: θ = 40; while ( numMistakes != 0 ) { θ = θ + Δθ; numMistakes = error(θ) }
38 What's the model for the k-NN classifier?
For the k-NN, the model is the training data itself!
- very good accuracy :)
- very computationally intensive! :(
Testing point x
For each training datapoint x'
    measure distance(x, x')
End
Sort distances
Select K nearest
Assign most common class!
39 New data: what's an algorithm to find a good threshold?
Our model does not match the problem!
if (weight > θ) then "player" else "dancer"
1 mistake
40 New data: what's an algorithm to find a good threshold?
(Plot shows a boundary that would work :) but our current model cannot represent it :( )
if (weight > θ) then "player" else "dancer"
41 We need a more sophisticated model
if (weight > θ) then "player" else "dancer"
becomes
if ( f(x) > θ ) then "player" else "dancer"
where x_1 = height (cm), x_2 = weight (kg)
The Linear Classifier:
f(x) = (w_1 * x_1) + (w_2 * x_2) = sum_{i=1}^{d} w_i x_i
42 The Linear Classifier
if f(x) > θ then "player" else "dancer"
f(x) = (w_1 * x_1) + (w_2 * x_2) = sum_{i=1}^{d} w_i x_i
Changing w_1, w_2 and θ changes the position of the DECISION BOUNDARY.
43 Geometry of the Linear Classifier (1)
if f(x) > θ then +1 ("player") else -1 ("dancer")
In 2-d, this is a line. In higher dimensions, it is a decision hyper-plane.
Any point on the plane evaluates to 0. Points not on the plane evaluate to + or -.
The decision boundary is always ORTHOGONAL to the weight vector [w_1, w_2].
See if you can prove this for yourself before going to the notes.
44 Geometry of the Linear Classifier (2)
We can rearrange the decision rule:
if sum_{i=1}^{d} w_i x_i > θ then +1 else -1
sum_{i=1}^{d} w_i x_i - θ > 0
sum_{i=1}^{d} w_i x_i + (-1 · w_0) > 0    (absorbing the threshold as an extra weight: w_0 = θ, with a fixed input x_0 = -1)
if sum_{i=0}^{d} w_i x_i > 0 then +1 else -1
45 Geometry of the Linear Classifier (3)
On the plane: f(x) = sum_{i=1}^{d} w_i x_i - θ = 0
In 2 dimensions: f(x) = w_1 x_1 + w_2 x_2 - θ = 0
(w_1 / w_2) x_1 + x_2 = θ / w_2
x_2 = -(w_1 / w_2) x_1 + θ / w_2
This now follows the geometry of a straight line y = mx + c, with m = -(w_1 / w_2) and c = θ / w_2.
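The rearrangement can be checked numerically: pick any w_1, w_2, θ, generate points on the boundary from the line form x_2 = -(w_1/w_2) x_1 + θ/w_2, and confirm that f(x) evaluates to zero there. A quick Python check with made-up weights:

```python
w1, w2, theta = 2.0, 4.0, 8.0
m, c = -w1 / w2, theta / w2   # slope and intercept of the decision boundary

for x1 in [0.0, 1.0, 5.0]:
    x2 = m * x1 + c           # a point on the boundary line
    f = w1 * x1 + w2 * x2 - theta
    print(x1, x2, f)          # f is 0 on the boundary
```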
46 The Linear Classifier
Model: if sum_{i=0}^{d} w_i x_i > 0 then ŷ = +1 else ŷ = -1
Error function: e = (1/2) ( f(x) - y )^2
Learning algorithm: ??? ... need to optimise the w values ...
x = inputs, y = class. Note the terminology! See notes for details!
47 Gradient Descent
e = (1/2) ( f(x) - y )^2
∂e/∂w_i = (∂e/∂f)(∂f/∂w_i) = ( f(x) - y ) x_i
Follow the NEGATIVE gradient!
48 Stochastic Gradient Descent
initialise weight values to random numbers in range -1 to +1
for n = 1 to NUM_ITERATIONS
    for each training example (x,y)
        calculate f(x)
        for each weight i
            w_i = w_i - α ( f(x) - y ) x_i
        end
    end
end
α = a small constant, the learning rate
Convergence theorem: If the data is linearly separable, then application of the learning rule will find a separating decision boundary, within a finite number of iterations.
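The pseudocode above, written out as runnable Python (the labs use Matlab; the toy dataset and learning rate below are invented, with x_0 = -1 prepended to each example so that w_0 plays the role of θ as on slide 44):

```python
import random

random.seed(0)

def f(w, x):
    # Linear score: the weighted sum of the inputs.
    return sum(wi * xi for wi, xi in zip(w, x))

def train(data, alpha=0.01, num_iterations=200):
    # Stochastic gradient descent: w_i <- w_i - alpha * (f(x) - y) * x_i
    w = [random.uniform(-1, 1) for _ in range(len(data[0][0]))]
    for _ in range(num_iterations):
        for x, y in data:
            fx = f(w, x)
            w = [wi - alpha * (fx - y) * xi for wi, xi in zip(w, x)]
    return w

# Each x is (x0=-1, x1, x2), so w0 acts as the threshold; y is +1 or -1.
data = [((-1, 2, 3), 1), ((-1, 3, 4), 1), ((-1, -2, -1), -1), ((-1, -3, -2), -1)]
w = train(data)
predictions = [1 if f(w, x) > 0 else -1 for x, _ in data]
print(predictions)  # all training points classified correctly
```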
49 A problem
initialise weight values to random numbers in range -1 to +1 ...
Depending on the random initialisation, the linear classifier will converge to one of the valid boundaries, but randomly!
50 Break for 30 mins
Possible uses of your break:
1. Ensure you have a working login for the computer lab this afternoon.
2. Talk to me or a demonstrator about the material.
3. Read ahead in the notes.
4. Go get a coffee.
51 Another model: logistic regression
Our model f(x) has range plus/minus INFINITY! Is this really necessary? What is the confidence of our decisions? Can we estimate PROBABILITIES?
Logistic regression estimates p( y=1 | x ). Output in range [0,1].
Sigmoid function: p(y=1 | x) = f(x) = 1 / ( 1 + e^(-(w'x - θ)) )
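The sigmoid squashes the unbounded linear score into [0,1]. A quick Python illustration (the weights and θ here are arbitrary example values):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def p_y1_given_x(w, x, theta):
    # Logistic regression output: p(y=1|x) = sigmoid(w'x - theta).
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) - theta)

print(sigmoid(0.0))    # 0.5: a score exactly on the boundary gives probability 0.5
print(sigmoid(10.0))   # close to 1
print(sigmoid(-10.0))  # close to 0
```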
52 Another error: cross entropy
e = - sum_{j=1}^{N} [ y_j ln f(x_j) + (1 - y_j) ln( 1 - f(x_j) ) ]
Above we assume y is either 0 or 1. Derived from the statistical principle of Likelihood. We'll see this again in a few weeks.
53 Gradient Descent
e = - sum_{j=1}^{N} [ y_j ln f(x_j) + (1 - y_j) ln( 1 - f(x_j) ) ]
∂e/∂w_i = (∂e/∂f)(∂f/∂w_i) = ( f(x) - y ) x_i
Follow the NEGATIVE gradient! SAME update as for squared error!
54 Stochastic Gradient Descent
initialise weight values to random numbers in range -1 to +1
for n = 1 to NUM_ITERATIONS
    for each training example (x,y)
        calculate f(x)
        for each weight i
            w_i = w_i - α ( f(x) - y ) x_i
        end
    end
end
α = a small constant, the learning rate
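Since the update rule is unchanged, turning the earlier trainer into logistic regression only means replacing the linear f(x) with the sigmoid and using 0/1 targets. A Python sketch on invented data (θ is held at 0 for brevity), checking that the cross-entropy error falls during training:

```python
import math
import random

random.seed(0)

def f(w, x, theta):
    # Sigmoid of the linear score: p(y=1|x).
    return 1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) - theta)))

def cross_entropy(w, theta, data):
    return -sum(y * math.log(f(w, x, theta)) + (1 - y) * math.log(1 - f(w, x, theta))
                for x, y in data)

# Toy data: x = (x1, x2), y in {0, 1}.
data = [((2, 3), 1), ((3, 4), 1), ((-2, -1), 0), ((-3, -2), 0)]
w, theta, alpha = [random.uniform(-1, 1) for _ in range(2)], 0.0, 0.1

e_before = cross_entropy(w, theta, data)
for _ in range(200):
    for x, y in data:
        fx = f(w, x, theta)
        w = [wi - alpha * (fx - y) * xi for wi, xi in zip(w, x)]  # same update rule
e_after = cross_entropy(w, theta, data)
print(e_before > e_after)  # True: training reduces the cross-entropy error
```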
55 A natural pairing of error function to model
Cross entropy: e = - sum_{j=1}^{N} [ y_j ln f(x_j) + (1 - y_j) ln(1 - f(x_j)) ]  pairs with the sigmoid  f(x) = 1 / ( 1 + e^(-(w'x - θ)) )
Squared error: e = (1/2) ( f(x) - y )^2  pairs with the linear  f(x) = sum_{i=1}^{d} w_i x_i - θ
Both give the same update: ∂e/∂w_i = (∂e/∂f)(∂f/∂w_i) = ( f(x) - y ) x_i
56 Still a problem
initialise weight values to random numbers in range -1 to +1 ...
Depending on the random initialisation, the logistic regression classifier will converge to one of the valid boundaries, but randomly!
57 Geometry of Linear Models (see notes)
58 Another problem - new data: non-linearly separable.
Our model does not match the problem! We'll deal with this next week!
59 End of Day 1
Now read the notes. Read the "Surrounded by Statistics" chapter in the handouts. The fog will clear.
This afternoon: learn MATLAB. This week's exercise is unassessed, but you are highly advised to get as much practice in as you can.
STA 4273H: Sta-s-cal Machine Learning
STA 4273H: Sta-s-cal Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! h0p://www.cs.toronto.edu/~rsalakhu/ Lecture 3 Parametric Distribu>ons We want model the probability
More informationML4Bio Lecture #1: Introduc3on. February 24 th, 2016 Quaid Morris
ML4Bio Lecture #1: Introduc3on February 24 th, 216 Quaid Morris Course goals Prac3cal introduc3on to ML Having a basic grounding in the terminology and important concepts in ML; to permit self- study,
More informationWeek 3: Perceptron and Multi-layer Perceptron
Week 3: Perceptron and Multi-layer Perceptron Phong Le, Willem Zuidema November 12, 2013 Last week we studied two famous biological neuron models, Fitzhugh-Nagumo model and Izhikevich model. This week,
More informationNearest Neighbor Classification
Nearest Neighbor Classification Professor Ameet Talwalkar Professor Ameet Talwalkar CS260 Machine Learning Algorithms January 11, 2017 1 / 48 Outline 1 Administration 2 First learning algorithm: Nearest
More informationGradient Descent. Michail Michailidis & Patrick Maiden
Gradient Descent Michail Michailidis & Patrick Maiden Outline Mo4va4on Gradient Descent Algorithm Issues & Alterna4ves Stochas4c Gradient Descent Parallel Gradient Descent HOGWILD! Mo4va4on It is good
More informationNetwork Traffic Measurements and Analysis
DEIB - Politecnico di Milano Fall, 2017 Sources Hastie, Tibshirani, Friedman: The Elements of Statistical Learning James, Witten, Hastie, Tibshirani: An Introduction to Statistical Learning Andrew Ng:
More informationLecture #11: The Perceptron
Lecture #11: The Perceptron Mat Kallada STAT2450 - Introduction to Data Mining Outline for Today Welcome back! Assignment 3 The Perceptron Learning Method Perceptron Learning Rule Assignment 3 Will be
More informationCAP5415-Computer Vision Lecture 13-Support Vector Machines for Computer Vision Applica=ons
CAP5415-Computer Vision Lecture 13-Support Vector Machines for Computer Vision Applica=ons Guest Lecturer: Dr. Boqing Gong Dr. Ulas Bagci bagci@ucf.edu 1 October 14 Reminders Choose your mini-projects
More informationPerceptron Introduction to Machine Learning. Matt Gormley Lecture 5 Jan. 31, 2018
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Perceptron Matt Gormley Lecture 5 Jan. 31, 2018 1 Q&A Q: We pick the best hyperparameters
More informationMachine Learning. Topic 5: Linear Discriminants. Bryan Pardo, EECS 349 Machine Learning, 2013
Machine Learning Topic 5: Linear Discriminants Bryan Pardo, EECS 349 Machine Learning, 2013 Thanks to Mark Cartwright for his extensive contributions to these slides Thanks to Alpaydin, Bishop, and Duda/Hart/Stork
More informationCS 5614: (Big) Data Management Systems. B. Aditya Prakash Lecture #19: Machine Learning 1
CS 5614: (Big) Data Management Systems B. Aditya Prakash Lecture #19: Machine Learning 1 Supervised Learning Would like to do predicbon: esbmate a func3on f(x) so that y = f(x) Where y can be: Real number:
More informationMachine Learning Crash Course: Part I
Machine Learning Crash Course: Part I Ariel Kleiner August 21, 2012 Machine learning exists at the intersec
More informationCSCI567 Machine Learning (Fall 2014)
CSCI567 Machine Learning (Fall 2014) Drs. Sha & Liu {feisha,yanliu.cs}@usc.edu September 9, 2014 Drs. Sha & Liu ({feisha,yanliu.cs}@usc.edu) CSCI567 Machine Learning (Fall 2014) September 9, 2014 1 / 47
More informationLogis&c Regression. Aar$ Singh & Barnabas Poczos. Machine Learning / Jan 28, 2014
Logis&c Regression Aar$ Singh & Barnabas Poczos Machine Learning 10-701/15-781 Jan 28, 2014 Linear Regression & Linear Classifica&on Weight Height Linear fit Linear decision boundary 2 Naïve Bayes Recap
More information6.034 Quiz 2, Spring 2005
6.034 Quiz 2, Spring 2005 Open Book, Open Notes Name: Problem 1 (13 pts) 2 (8 pts) 3 (7 pts) 4 (9 pts) 5 (8 pts) 6 (16 pts) 7 (15 pts) 8 (12 pts) 9 (12 pts) Total (100 pts) Score 1 1 Decision Trees (13
More informationThe exam is closed book, closed notes except your one-page (two-sided) cheat sheet.
CS 189 Spring 2015 Introduction to Machine Learning Final You have 2 hours 50 minutes for the exam. The exam is closed book, closed notes except your one-page (two-sided) cheat sheet. No calculators or
More informationStructure and Support Vector Machines. SPFLODD October 31, 2013
Structure and Support Vector Machines SPFLODD October 31, 2013 Outline SVMs for structured outputs Declara?ve view Procedural view Warning: Math Ahead Nota?on for Linear Models Training data: {(x 1, y
More informationCPSC 340: Machine Learning and Data Mining. Principal Component Analysis Fall 2017
CPSC 340: Machine Learning and Data Mining Principal Component Analysis Fall 2017 Assignment 3: 2 late days to hand in tonight. Admin Assignment 4: Due Friday of next week. Last Time: MAP Estimation MAP
More informationCS 6140: Machine Learning Spring Final Exams. What we learned Final Exams 2/26/16
Logis@cs CS 6140: Machine Learning Spring 2016 Instructor: Lu Wang College of Computer and Informa@on Science Northeastern University Webpage: www.ccs.neu.edu/home/luwang Email: luwang@ccs.neu.edu Assignment
More informationCS 6140: Machine Learning Spring 2016
CS 6140: Machine Learning Spring 2016 Instructor: Lu Wang College of Computer and Informa?on Science Northeastern University Webpage: www.ccs.neu.edu/home/luwang Email: luwang@ccs.neu.edu Logis?cs Assignment
More informationIntroduction to Machine Learning. Xiaojin Zhu
Introduction to Machine Learning Xiaojin Zhu jerryzhu@cs.wisc.edu Read Chapter 1 of this book: Xiaojin Zhu and Andrew B. Goldberg. Introduction to Semi- Supervised Learning. http://www.morganclaypool.com/doi/abs/10.2200/s00196ed1v01y200906aim006
More information06: Logistic Regression
06_Logistic_Regression 06: Logistic Regression Previous Next Index Classification Where y is a discrete value Develop the logistic regression algorithm to determine what class a new input should fall into
More informationADVANCED CLASSIFICATION TECHNIQUES
Admin ML lab next Monday Project proposals: Sunday at 11:59pm ADVANCED CLASSIFICATION TECHNIQUES David Kauchak CS 159 Fall 2014 Project proposal presentations Machine Learning: A Geometric View 1 Apples
More informationMachine Learning (CSE 446): Unsupervised Learning
Machine Learning (CSE 446): Unsupervised Learning Sham M Kakade c 2018 University of Washington cse446-staff@cs.washington.edu 1 / 19 Announcements HW2 posted. Due Feb 1. It is long. Start this week! Today:
More informationPerceptron: This is convolution!
Perceptron: This is convolution! v v v Shared weights v Filter = local perceptron. Also called kernel. By pooling responses at different locations, we gain robustness to the exact spatial location of image
More informationAll lecture slides will be available at CSC2515_Winter15.html
CSC2515 Fall 2015 Introduc3on to Machine Learning Lecture 9: Support Vector Machines All lecture slides will be available at http://www.cs.toronto.edu/~urtasun/courses/csc2515/ CSC2515_Winter15.html Many
More informationAbout the Course. Reading List. Assignments and Examina5on
Uppsala University Department of Linguis5cs and Philology About the Course Introduc5on to machine learning Focus on methods used in NLP Decision trees and nearest neighbor methods Linear models for classifica5on
More informationNearest Neighbor Classification. Machine Learning Fall 2017
Nearest Neighbor Classification Machine Learning Fall 2017 1 This lecture K-nearest neighbor classification The basic algorithm Different distance measures Some practical aspects Voronoi Diagrams and Decision
More informationCMPT 882 Week 3 Summary
CMPT 882 Week 3 Summary! Artificial Neural Networks (ANNs) are networks of interconnected simple units that are based on a greatly simplified model of the brain. ANNs are useful learning tools by being
More informationk-nearest Neighbors + Model Selection
10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University k-nearest Neighbors + Model Selection Matt Gormley Lecture 5 Jan. 30, 2019 1 Reminders
More informationCSC411 Fall 2014 Machine Learning & Data Mining. Ensemble Methods. Slides by Rich Zemel
CSC411 Fall 2014 Machine Learning & Data Mining Ensemble Methods Slides by Rich Zemel Ensemble methods Typical application: classi.ication Ensemble of classi.iers is a set of classi.iers whose individual
More informationCSE 158. Web Mining and Recommender Systems. Midterm recap
CSE 158 Web Mining and Recommender Systems Midterm recap Midterm on Wednesday! 5:10 pm 6:10 pm Closed book but I ll provide a similar level of basic info as in the last page of previous midterms CSE 158
More informationWeka ( )
Weka ( http://www.cs.waikato.ac.nz/ml/weka/ ) The phases in which classifier s design can be divided are reflected in WEKA s Explorer structure: Data pre-processing (filtering) and representation Supervised
More informationClustering. Mihaela van der Schaar. January 27, Department of Engineering Science University of Oxford
Department of Engineering Science University of Oxford January 27, 2017 Many datasets consist of multiple heterogeneous subsets. Cluster analysis: Given an unlabelled data, want algorithms that automatically
More informationLecture 3: Linear Classification
Lecture 3: Linear Classification Roger Grosse 1 Introduction Last week, we saw an example of a learning task called regression. There, the goal was to predict a scalar-valued target from a set of features.
More informationCSE 546 Machine Learning, Autumn 2013 Homework 2
CSE 546 Machine Learning, Autumn 2013 Homework 2 Due: Monday, October 28, beginning of class 1 Boosting [30 Points] We learned about boosting in lecture and the topic is covered in Murphy 16.4. On page
More informationThe Mathematics Behind Neural Networks
The Mathematics Behind Neural Networks Pattern Recognition and Machine Learning by Christopher M. Bishop Student: Shivam Agrawal Mentor: Nathaniel Monson Courtesy of xkcd.com The Black Box Training the
More information10-701/15-781, Fall 2006, Final
-7/-78, Fall 6, Final Dec, :pm-8:pm There are 9 questions in this exam ( pages including this cover sheet). If you need more room to work out your answer to a question, use the back of the page and clearly
More informationFMA901F: Machine Learning Lecture 3: Linear Models for Regression. Cristian Sminchisescu
FMA901F: Machine Learning Lecture 3: Linear Models for Regression Cristian Sminchisescu Machine Learning: Frequentist vs. Bayesian In the frequentist setting, we seek a fixed parameter (vector), with value(s)
More informationCPSC 340: Machine Learning and Data Mining. Non-Parametric Models Fall 2016
CPSC 340: Machine Learning and Data Mining Non-Parametric Models Fall 2016 Assignment 0: Admin 1 late day to hand it in tonight, 2 late days for Wednesday. Assignment 1 is out: Due Friday of next week.
More informationLecture 25: Review I
Lecture 25: Review I Reading: Up to chapter 5 in ISLR. STATS 202: Data mining and analysis Jonathan Taylor 1 / 18 Unsupervised learning In unsupervised learning, all the variables are on equal standing,
More informationMachine Learning Classifiers and Boosting
Machine Learning Classifiers and Boosting Reading Ch 18.6-18.12, 20.1-20.3.2 Outline Different types of learning problems Different types of learning algorithms Supervised learning Decision trees Naïve
More information2. Linear Regression and Gradient Descent
Pattern Recognition And Machine Learning - EPFL - Fall 2015 Emtiyaz Khan, Timur Bagautdinov, Carlos Becker, Ilija Bogunovic & Ksenia Konyushkova 2. Linear Regression and Gradient Descent 2.1 Goals The
More informationCS535 Big Data Fall 2017 Colorado State University 10/10/2017 Sangmi Lee Pallickara Week 8- A.
CS535 Big Data - Fall 2017 Week 8-A-1 CS535 BIG DATA FAQs Term project proposal New deadline: Tomorrow PA1 demo PART 1. BATCH COMPUTING MODELS FOR BIG DATA ANALYTICS 5. ADVANCED DATA ANALYTICS WITH APACHE
More informationChakra Chennubhotla and David Koes
MSCBIO/CMPBIO 2065: Support Vector Machines Chakra Chennubhotla and David Koes Nov 15, 2017 Sources mmds.org chapter 12 Bishop s book Ch. 7 Notes from Toronto, Mark Schmidt (UBC) 2 SVM SVMs and Logistic
More informationThe exam is closed book, closed notes except your one-page (two-sided) cheat sheet.
CS 189 Spring 2015 Introduction to Machine Learning Final You have 2 hours 50 minutes for the exam. The exam is closed book, closed notes except your one-page (two-sided) cheat sheet. No calculators or
More informationIntroduction to Machine Learning CMU-10701
Introduction to Machine Learning CMU-10701 Clustering and EM Barnabás Póczos & Aarti Singh Contents Clustering K-means Mixture of Gaussians Expectation Maximization Variational Methods 2 Clustering 3 K-
More informationMore on Neural Networks. Read Chapter 5 in the text by Bishop, except omit Sections 5.3.3, 5.3.4, 5.4, 5.5.4, 5.5.5, 5.5.6, 5.5.7, and 5.
More on Neural Networks Read Chapter 5 in the text by Bishop, except omit Sections 5.3.3, 5.3.4, 5.4, 5.5.4, 5.5.5, 5.5.6, 5.5.7, and 5.6 Recall the MLP Training Example From Last Lecture log likelihood
More informationEvaluation. Evaluate what? For really large amounts of data... A: Use a validation set.
Evaluate what? Evaluation Charles Sutton Data Mining and Exploration Spring 2012 Do you want to evaluate a classifier or a learning algorithm? Do you want to predict accuracy or predict which one is better?
More informationCS273 Midterm Exam Introduction to Machine Learning: Winter 2015 Tuesday February 10th, 2014
CS273 Midterm Eam Introduction to Machine Learning: Winter 2015 Tuesday February 10th, 2014 Your name: Your UCINetID (e.g., myname@uci.edu): Your seat (row and number): Total time is 80 minutes. READ THE
More information12 Classification using Support Vector Machines
160 Bioinformatics I, WS 14/15, D. Huson, January 28, 2015 12 Classification using Support Vector Machines This lecture is based on the following sources, which are all recommended reading: F. Markowetz.
More informationA Brief Look at Optimization
A Brief Look at Optimization CSC 412/2506 Tutorial David Madras January 18, 2018 Slides adapted from last year s version Overview Introduction Classes of optimization problems Linear programming Steepest
More informationInf2B assignment 2. Natural images classification. Hiroshi Shimodaira and Pol Moreno. Submission due: 4pm, Wednesday 30 March 2016.
Inf2B assignment 2 (Ver. 1.2) Natural images classification Submission due: 4pm, Wednesday 30 March 2016 Hiroshi Shimodaira and Pol Moreno This assignment is out of 100 marks and forms 12.5% of your final
More informationCPSC 340: Machine Learning and Data Mining. Logistic Regression Fall 2016
CPSC 340: Machine Learning and Data Mining Logistic Regression Fall 2016 Admin Assignment 1: Marks visible on UBC Connect. Assignment 2: Solution posted after class. Assignment 3: Due Wednesday (at any
More informationCSC 4510 Machine Learning
5: Mul'variate Regression CSC 4510 Machine Learning Dr. Mary Angela Papalaskari Department of CompuBng Sciences Villanova Course website: www.csc.villanova.edu/~map/4510/ The slides in this presentabon
More information[1] CURVE FITTING WITH EXCEL
1 Lecture 04 February 9, 2010 Tuesday Today is our third Excel lecture. Our two central themes are: (1) curve-fitting, and (2) linear algebra (matrices). We will have a 4 th lecture on Excel to further
More informationComputational Machine Learning, Fall 2015 Homework 4: stochastic gradient algorithms
Computational Machine Learning, Fall 2015 Homework 4: stochastic gradient algorithms Due: Tuesday, November 24th, 2015, before 11:59pm (submit via email) Preparation: install the software packages and
More informationGradient Descent - Problem of Hiking Down a Mountain
Gradient Descent - Problem of Hiking Down a Mountain Udacity Have you ever climbed a mountain? I am sure you had to hike down at some point? Hiking down is a great exercise and it is going to help us understand
More informationMath 2250 Lab #3: Landing on Target
Math 2250 Lab #3: Landing on Target 1. INTRODUCTION TO THE LAB PROGRAM. Here are some general notes and ideas which will help you with the lab. The purpose of the lab program is to expose you to problems
More informationIntroduc)on to Probabilis)c Latent Seman)c Analysis. NYP Predic)ve Analy)cs Meetup June 10, 2010
Introduc)on to Probabilis)c Latent Seman)c Analysis NYP Predic)ve Analy)cs Meetup June 10, 2010 PLSA A type of latent variable model with observed count data and nominal latent variable(s). Despite the
More informationMachine Learning using Matlab. Lecture 3 Logistic regression and regularization
Machine Learning using Matlab Lecture 3 Logistic regression and regularization Presentation Date (correction) 10.07.2017 11.07.2017 17.07.2017 18.07.2017 24.07.2017 25.07.2017 Project proposals 13 submissions,
More informationCOMP6237 Data Mining Introduction to Data Mining
COMP6237 Data Mining Introduction to Data Mining Jonathon Hare jsh2@ecs.soton.ac.uk Markus Brede mb8@ecs.soton.ac.uk Teaching Staff Jonathon Hare jsh2@ecs.soton.ac.uk 1/2003 Markus Brede mb8@ecs.soton.ac.uk
More informationPartitioning Data. IRDS: Evaluation, Debugging, and Diagnostics. Cross-Validation. Cross-Validation for parameter tuning
Partitioning Data IRDS: Evaluation, Debugging, and Diagnostics Charles Sutton University of Edinburgh Training Validation Test Training : Running learning algorithms Validation : Tuning parameters of learning
More informationRecommender Systems Collabora2ve Filtering and Matrix Factoriza2on
Recommender Systems Collaborave Filtering and Matrix Factorizaon Narges Razavian Thanks to lecture slides from Alex Smola@CMU Yahuda Koren@Yahoo labs and Bing Liu@UIC We Know What You Ought To Be Watching
More informationLecture 9. Support Vector Machines
Lecture 9. Support Vector Machines COMP90051 Statistical Machine Learning Semester 2, 2017 Lecturer: Andrey Kan Copyright: University of Melbourne This lecture Support vector machines (SVMs) as maximum
More informationMath 5320, 3/28/18 Worksheet 26: Ruler and compass constructions. 1. Use your ruler and compass to construct a line perpendicular to the line below:
Math 5320, 3/28/18 Worksheet 26: Ruler and compass constructions Name: 1. Use your ruler and compass to construct a line perpendicular to the line below: 2. Suppose the following two points are spaced
Planar data classification with one hidden layer Welcome to your week 3 programming assignment. It's time to build your first neural network, which will have a hidden layer. You will see a big difference
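A network with one hidden layer, as this assignment preview describes, can be sketched in plain Python (this is not the assignment's code; the XOR-style data, four hidden units, learning rate, and epoch count are all assumptions for illustration):

```python
import math, random

# XOR-style planar data: not linearly separable, so a hidden layer is needed.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

rng = random.Random(1)
H = 4  # number of hidden units (an assumed size)
W1 = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(H)]
b1 = [0.0] * H
W2 = [rng.uniform(-1, 1) for _ in range(H)]
b2 = 0.0

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x):
    h = [sigmoid(sum(w * xi for w, xi in zip(W1[j], x)) + b1[j]) for j in range(H)]
    out = sigmoid(sum(w * hj for w, hj in zip(W2, h)) + b2)
    return h, out

def loss():
    total = 0.0
    for x, t in zip(X, y):
        _, o = forward(x)
        total -= t * math.log(o) + (1 - t) * math.log(1 - o)
    return total

lr = 1.0
before = loss()
for _ in range(2000):                      # stochastic gradient descent, one point at a time
    for x, t in zip(X, y):
        h, o = forward(x)
        d_out = o - t                      # gradient of log-loss wrt output logit
        for j in range(H):
            d_h = d_out * W2[j] * h[j] * (1 - h[j])   # backprop through hidden unit j
            W2[j] -= lr * d_out * h[j]
            for i in range(2):
                W1[j][i] -= lr * d_h * x[i]
            b1[j] -= lr * d_h
        b2 -= lr * d_out
after = loss()
```

The "big difference" the assignment promises comes from the hidden layer: logistic regression alone cannot separate XOR, while this two-layer network can bend its decision boundary around it.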
CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu [Kumar et al. 99] 2/13/2013
Ensemble methods in machine learning Bootstrap aggregating (bagging): train an ensemble of models based on randomly resampled versions of the training set, then take a majority vote. Example: What if you
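The bagging recipe in this preview — resample with replacement, fit one model per resample, then take a majority vote — can be sketched like this (a minimal illustration; the 1-nearest-neighbour base learner, the 1-D toy data, and the ensemble size are my assumptions, not from the slides):

```python
import random
from collections import Counter

def nn_classifier(train):
    """1-nearest-neighbour learner fitted to one bootstrap resample."""
    def predict(x):
        xv, label = min(train, key=lambda p: (p[0] - x) ** 2)
        return label
    return predict

def bagging(data, n_models=11, seed=0):
    rng = random.Random(seed)
    models = []
    for _ in range(n_models):
        resample = [rng.choice(data) for _ in data]   # sample n points with replacement
        models.append(nn_classifier(resample))
    def vote(x):
        # Majority vote over the ensemble (odd n_models avoids ties).
        return Counter(m(x) for m in models).most_common(1)[0][0]
    return vote

data = [(0.1, 0), (0.3, 0), (0.4, 0), (0.7, 1), (0.8, 1), (0.95, 1)]
ensemble = bagging(data)
```

Because each model sees a slightly different resample, their individual errors tend to differ, and the vote averages much of that variance away.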
Data Mining and Knowledge Discovery: Practice Notes Petra Kralj Novak Petra.Kralj.Novak@ijs.si 2016/01/12 1 Keywords Data Attribute, example, attribute-value data, target variable, class, discretization
Knowledge Discovery and Data Mining Lecture 13 - Neural Nets Tom Kelsey School of Computer Science University of St Andrews http://tom.home.cs.st-andrews.ac.uk twk@st-andrews.ac.uk Tom Kelsey ID5059-13-NN
6. Linear Discriminant Functions Assumption: we know the proper forms for the discriminant functions, and use the samples to estimate the values of parameters of the classifier
Lecture 27: Review Reading: All chapters in ISLR. STATS 202: Data mining and analysis December 6, 2017 1 / 16 Final exam: Announcements Tuesday, December 12, 8:30-11:30 am, in the following rooms: Last
CPSC 340: Machine Learning and Data Mining More Linear Classifiers Fall 2017 Admin Assignment 3: Due Friday of next week. Midterm: Can view your exam during instructor office hours next week, or after
Neural Networks CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Biological and artificial neural networks Feed-forward neural networks Single layer
CPSC 340: Machine Learning and Data Mining Deep Learning Fall 2018 Last Time: Multi-Dimensional Scaling Multi-dimensional scaling (MDS): Non-parametric visualization: directly optimize the z i locations.
COMP6237 Data Mining Data Mining & Machine Learning with Big Data Jonathon Hare jsh2@ecs.soton.ac.uk Contents Going to look at two case-studies looking at how we can make machine-learning algorithms work
Mathematics of Data INFO-4604, Applied Machine Learning University of Colorado Boulder September 5, 2017 Prof. Michael Paul Goals In the intro lecture, every visualization was in 2D What happens when we
Using Machine Learning to Optimize Storage Systems Dr. Kiran Gunnam 1 Outline 1. Overview 2. Building Flash Models using Logistic Regression. 3. Storage Object classification 4. Storage Allocation recommendation
Exercise: Training Simple MLP by Backpropagation. Using Netlab. Petr Pošík December, 27 File list This document is an explanation text to the following script: demomlpklin.m script implementing the backpropagation
Image Analysis - Lecture 1 Magnus Oskarsson General Research Image models Repetition Lecture 1 Administrative things What is image analysis? Examples of image analysis
1+ Dulwich College YEAR 9 ENTRANCE AND SCHOLARSHIP EXAMINATION SAMPLE PAPER Mathematics 1 HOUR 0 MINUTES Use a calculator where appropriate. Answer all the questions. Show all your working. Marks for parts
Linear Regression and K-Nearest Neighbors 3/28/18 Linear Regression Hypothesis Space Supervised learning For every input in the data set, we know the output Regression Outputs are continuous A number,
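The k-nearest-neighbour side of this preview is short enough to show directly (a hedged sketch, not the lecture's code; the toy 2-D data, k=3, and squared Euclidean distance are assumptions):

```python
from collections import Counter

def knn_predict(train, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    def dist(point):
        return sum((ai - xi) ** 2 for ai, xi in zip(point, x))
    neighbours = sorted(train, key=lambda p: dist(p[0]))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]

# Two well-separated clusters with labels 'a' and 'b'
train = [([0, 0], 'a'), ([0, 1], 'a'), ([1, 0], 'a'),
         ([5, 5], 'b'), ([5, 6], 'b'), ([6, 5], 'b')]
```

Unlike linear regression, kNN has no training phase and no fixed hypothesis form: the prediction is computed directly from the stored data at query time, and its output for classification is a discrete label rather than a continuous number.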
Virginia Tech. Computer Science CS 5614 (Big) Data Management Systems Fall 2014, Prakash Homework 4: Clustering, Recommenders, Dim. Reduction, ML and Graph Mining (due November 19 th, 2014, 2:30pm, in
DS 4400 Machine Learning and Data Mining I Alina Oprea Associate Professor, CCIS Northeastern University January 24 2019 Logistics HW 1 is due on Friday 01/25 Project proposal: due Feb 21 1 page description
Generative and discriminative classification techniques Machine Learning and Category Representation 2013-2014 Jakob Verbeek, December 13+20, 2013 Course website: http://lear.inrialpes.fr/~verbeek/mlcr.13.14
Math 2250 Lab #3: Landing on Target 1. INTRODUCTION TO THE LAB PROGRAM. Here are some general notes and ideas which will help you with the lab. The purpose of the lab program is to expose you to problems
10.4 Linear interpolation method The next best thing one can do is the linear interpolation method, also known as the double false position method. This method works similarly to the bisection method by
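The linear interpolation (double false position) method described here differs from bisection in one step: instead of taking the midpoint, it takes the zero crossing of the chord through the two bracketing points. A minimal sketch (the function x² − 2 and the tolerance are my own illustrative choices):

```python
def false_position(f, a, b, tol=1e-10, max_iter=100):
    """Root finding by linear interpolation (regula falsi).

    Like bisection, it keeps a sign change bracketed on [a, b], but the
    next guess is where the straight line through (a, f(a)) and (b, f(b))
    crosses zero, rather than the midpoint.
    """
    fa, fb = f(a), f(b)
    assert fa * fb < 0, "need a sign change on [a, b]"
    for _ in range(max_iter):
        c = b - fb * (b - a) / (fb - fa)   # chord's zero crossing
        fc = f(c)
        if abs(fc) < tol:
            return c
        if fa * fc < 0:                    # root lies in [a, c]
            b, fb = c, fc
        else:                              # root lies in [c, b]
            a, fa = c, fc
    return c

root = false_position(lambda x: x * x - 2, 0.0, 2.0)
```

On smooth functions the interpolated guess usually lands much closer to the root than the midpoint does, which is why this is "the next best thing" over plain bisection.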
CS249: ADVANCED DATA MINING Classification Evaluation and Practical Issues Instructor: Yizhou Sun yzsun@cs.ucla.edu April 24, 2017 Homework 2 out Announcements Due May 3 rd (11:59pm) Course project proposal
Classification and Regression Trees David S. Rosenberg New York University April 3, 2018 David S. Rosenberg (New York University) DS-GA 1003 / CSCI-GA 2567 April 3, 2018 1 / 51 Contents 1 Trees 2 Regression
CS446: Machine Learning Spring 2017 Problem Set 3 Handed Out: February 15th, 2017 Due: February 27th, 2017 Feel free to talk to other members of the class in doing the homework. I am more concerned that
Supervised Learning Classification Algorithms Comparison Aditya Singh Rathore B.Tech, J.K. Lakshmipat University
CPSC 340 Final (Fall 2015) Name: Student Number: Please enter your information above, turn off cellphones, space yourselves out throughout the room, and wait until the official start of the exam to begin.
Introduction to the Weebly Toolkit for Building Websites Maureen Pratchett July 2015 1 Objective The purpose of this workshop is not to teach you how to design or even build a website, but rather to introduce
Minimum Redundancy and Maximum Relevance Feature Selection Hang Xiao Background Feature: a feature is an individual measurable heuristic property of a phenomenon being observed In character recognition: horizontal
Practice EXAM: SPRING 2012 CS 6375 INSTRUCTOR: VIBHAV GOGATE The exam is closed book. You are allowed four pages of double-sided cheat sheets. Answer the questions in the spaces provided on the question sheets.
CSE 158 Lecture 2 Web Mining and Recommender Systems Supervised learning Regression Supervised versus unsupervised learning Learning approaches attempt to model data in order to solve a problem Unsupervised
Missing Data Where did it go? 1 Learning Objectives High-level discussion of some techniques Identify type of missingness Single vs Multiple Imputation My favourite technique 2 Problem Uh data are missing
Neural Nets: Many possible refs e.g., Mitchell Chapter 4 Simple Model Selection Cross Validation Regularization Neural Networks Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University February
http://poloclub.gatech.edu/cse6242 CSE6242 / CX4242: Data & Visual Analytics Classification Key Concepts Duen Horng (Polo) Chau Assistant Professor Associate Director, MS Analytics Georgia Tech 1 How will