CS 237: Probability in Computing

Size: px
Start display at page:

Download "CS 237: Probability in Computing"

Transcription

1 CS 237: Probability in Computing Wayne Snyder Computer Science Department Boston University Lecture 26: Logistic Regression 2 Gradient Descent for Linear Regression Gradient Descent for Logistic Regression Example: Logistic Regression for Heights Gender Putting it all together: Multivariate Logistic Regression

2 Recall: we were searching for the parameters using the technique of gradient descent: for this linear regression problem

3 Basic idea: Define a cost or loss function J(...) which gives the cost or penalty for a given choice of parameters, then search for the parameters which minimize this cost. So let s pretend we didn t have the formulae at the upper right, suppose we needed to find them by gradient descent. In linear regression this would mean finding values for which minimize the MSE, in other words minimize the cost function: Cost Function = MSE

4 Partial Derivatives: To find a partial derivative of a function with multiple parameters, we find the rate at which each parameter varies its derivative -- in isolation, considering each of the other parameters to effectively be a constant: We will use b m instead of will represent points as column vectors:

5 To find the minimum value along one axis we will work with only one of the partial derivatives as a time, say the y-intercept: Step One: Choose an initial point b 0. Step Two: Choose a step size or learning rate! threshold of accuracy ". Step Three: Move that distance along the axis, in the decreasing direction (the negative of the slope), repeat until the distance moved is less than ". Step Four: Output b n+1 as the minimum. Let s look at a notebook showing this...

6 Gradient Descent for Linear Regression: To find a point in multiple dimensions, we simply do all dimensions at the same time. Here is the algorithm from the reading: def update(m,b,x,y,lam): N = len(x) m_deriv = 0 b_deriv = 0 for i in range(n): m_deriv += -2*X[i]*(Y[i]-(m*X[i]+b)) b_deriv += -2*(Y[i]-(m*X[i]+b)) m_deriv /= float(n) b_deriv /= float(n) m -= m_deriv * lam b -= b_deriv * lam return (m, b) def gradient_descent(m,b,x,y,lam,epsilon): while True: (m1, b1) = update(m,b,x,y,lam) if abs(m1 m) + abs(b1 b) < epsilon: return (m1,b1) m = m1 b = b1

7 As the parameters are tuned to minimize the cost (= measuring how well the parameters fit the model) you get a better better fit between the model the data. You can run the gradient descent model as long as you wish to get a better fit. Obviously, defining the cost function picking the learning rate threshold are critical decisions, much research has been devoted to different cost models different approaches to gradient descent.

8 There are dozens of different cost functions that have been defined even more varients of algorithms that perform minimization of a given cost function available in stard Python libraries:

9 Logistic Regression: The Cost Function Ok, back to logistic regression! A common cost function used in logistic regression is Cross-Entropy or Log-Loss: If Y is the actual value (0 or 1, from the actual data) is the predicted probability P(Y=1) output by logistic regression, then the cost of this prediction is:

10 Logistic Regression: The Cost Function Thus, in our simple example (predicting gender from height) we want to predict the y- intercept b slope m such that the regression line, after being transformed by the sigmoid function s(...), minimizes the following cost: Predicted probability that P(y i = 1)

11 Logistic Regression: Putting it All Together In the case of m multiple independent variables X binary dependent variable Y: where the prediction for the sample (row) using the parameters is denoted, then we use the same gradient descent algorithm to find the values of the parameters such that the following cost function is minimized:

12 Logistic Regression: Supervised Learning Even if we have successfully minimized the cost function, surely the cost will not be 0, which means that our model will not be perfect. How do we evaluate our model? We can think of this algorithm as trying to learn the categories (0 or 1) that the independent variables belong to, use our data itself to test the results. The basic idea is to take a rom selection (say 80%) of our data set, find the best fitting parameters (called training ) then test it on the remaining 20%. The percentage of the remaining data that is successfully classified is the accuracy of our model: Train Test

CS 237: Probability in Computing

CS 237: Probability in Computing CS 237: Probability in Computing Wayne Snyder Computer Science Department Boston University Lecture 25: Logistic Regression Motivation: Why Logistic Regression? Sigmoid functions the logit transformation

More information

CS273 Midterm Exam Introduction to Machine Learning: Winter 2015 Tuesday February 10th, 2014

CS273 Midterm Exam Introduction to Machine Learning: Winter 2015 Tuesday February 10th, 2014 CS273 Midterm Eam Introduction to Machine Learning: Winter 2015 Tuesday February 10th, 2014 Your name: Your UCINetID (e.g., myname@uci.edu): Your seat (row and number): Total time is 80 minutes. READ THE

More information

Logistic Regression. May 28, Decision boundary is a property of the hypothesis and not the data set e z. g(z) = g(z) 0.

Logistic Regression. May 28, Decision boundary is a property of the hypothesis and not the data set e z. g(z) = g(z) 0. Logistic Regression May 28, 202 Logistic Regression. Decision Boundary Decision boundary is a property of the hypothesis and not the data set. sigmoid function: h (x) = g( x) = P (y = x; ) suppose predict

More information

Week 3: Perceptron and Multi-layer Perceptron

Week 3: Perceptron and Multi-layer Perceptron Week 3: Perceptron and Multi-layer Perceptron Phong Le, Willem Zuidema November 12, 2013 Last week we studied two famous biological neuron models, Fitzhugh-Nagumo model and Izhikevich model. This week,

More information

CPSC 340: Machine Learning and Data Mining. Logistic Regression Fall 2016

CPSC 340: Machine Learning and Data Mining. Logistic Regression Fall 2016 CPSC 340: Machine Learning and Data Mining Logistic Regression Fall 2016 Admin Assignment 1: Marks visible on UBC Connect. Assignment 2: Solution posted after class. Assignment 3: Due Wednesday (at any

More information

CS 179 Lecture 16. Logistic Regression & Parallel SGD

CS 179 Lecture 16. Logistic Regression & Parallel SGD CS 179 Lecture 16 Logistic Regression & Parallel SGD 1 Outline logistic regression (stochastic) gradient descent parallelizing SGD for neural nets (with emphasis on Google s distributed neural net implementation)

More information

CSE 546 Machine Learning, Autumn 2013 Homework 2

CSE 546 Machine Learning, Autumn 2013 Homework 2 CSE 546 Machine Learning, Autumn 2013 Homework 2 Due: Monday, October 28, beginning of class 1 Boosting [30 Points] We learned about boosting in lecture and the topic is covered in Murphy 16.4. On page

More information

Multinomial Regression and the Softmax Activation Function. Gary Cottrell!

Multinomial Regression and the Softmax Activation Function. Gary Cottrell! Multinomial Regression and the Softmax Activation Function Gary Cottrell Notation reminder We have N data points, or patterns, in the training set, with the pattern number as a superscript: {(x 1,t 1 ),

More information

Network Traffic Measurements and Analysis

Network Traffic Measurements and Analysis DEIB - Politecnico di Milano Fall, 2017 Sources Hastie, Tibshirani, Friedman: The Elements of Statistical Learning James, Witten, Hastie, Tibshirani: An Introduction to Statistical Learning Andrew Ng:

More information

Stat 342 Exam 3 Fall 2014

Stat 342 Exam 3 Fall 2014 Stat 34 Exam 3 Fall 04 I have neither given nor received unauthorized assistance on this exam. Name Signed Date Name Printed There are questions on the following 6 pages. Do as many of them as you can

More information

Regularization and model selection

Regularization and model selection CS229 Lecture notes Andrew Ng Part VI Regularization and model selection Suppose we are trying select among several different models for a learning problem. For instance, we might be using a polynomial

More information

Programming Exercise 4: Neural Networks Learning

Programming Exercise 4: Neural Networks Learning Programming Exercise 4: Neural Networks Learning Machine Learning Introduction In this exercise, you will implement the backpropagation algorithm for neural networks and apply it to the task of hand-written

More information

06: Logistic Regression

06: Logistic Regression 06_Logistic_Regression 06: Logistic Regression Previous Next Index Classification Where y is a discrete value Develop the logistic regression algorithm to determine what class a new input should fall into

More information

Weka ( )

Weka (  ) Weka ( http://www.cs.waikato.ac.nz/ml/weka/ ) The phases in which classifier s design can be divided are reflected in WEKA s Explorer structure: Data pre-processing (filtering) and representation Supervised

More information

Lesson 20: Every Line is a Graph of a Linear Equation

Lesson 20: Every Line is a Graph of a Linear Equation Student Outcomes Students know that any non vertical line is the graph of a linear equation in the form of, where is a constant. Students write the equation that represents the graph of a line. Lesson

More information

CS6375: Machine Learning Gautam Kunapuli. Mid-Term Review

CS6375: Machine Learning Gautam Kunapuli. Mid-Term Review Gautam Kunapuli Machine Learning Data is identically and independently distributed Goal is to learn a function that maps to Data is generated using an unknown function Learn a hypothesis that minimizes

More information

Homework 2. Due: March 2, 2018 at 7:00PM. p = 1 m. (x i ). i=1

Homework 2. Due: March 2, 2018 at 7:00PM. p = 1 m. (x i ). i=1 Homework 2 Due: March 2, 2018 at 7:00PM Written Questions Problem 1: Estimator (5 points) Let x 1, x 2,..., x m be an i.i.d. (independent and identically distributed) sample drawn from distribution B(p)

More information

Practice Exam Sample Solutions

Practice Exam Sample Solutions CS 675 Computer Vision Instructor: Marc Pomplun Practice Exam Sample Solutions Note that in the actual exam, no calculators, no books, and no notes allowed. Question 1: out of points Question 2: out of

More information

Lecture 1 Notes. Outline. Machine Learning. What is it? Instructors: Parth Shah, Riju Pahwa

Lecture 1 Notes. Outline. Machine Learning. What is it? Instructors: Parth Shah, Riju Pahwa Instructors: Parth Shah, Riju Pahwa Lecture 1 Notes Outline 1. Machine Learning What is it? Classification vs. Regression Error Training Error vs. Test Error 2. Linear Classifiers Goals and Motivations

More information

node2vec: Scalable Feature Learning for Networks

node2vec: Scalable Feature Learning for Networks node2vec: Scalable Feature Learning for Networks A paper by Aditya Grover and Jure Leskovec, presented at Knowledge Discovery and Data Mining 16. 11/27/2018 Presented by: Dharvi Verma CS 848: Graph Database

More information

HMC CS 158, Fall 2017 Problem Set 3 Programming: Regularized Polynomial Regression

HMC CS 158, Fall 2017 Problem Set 3 Programming: Regularized Polynomial Regression HMC CS 158, Fall 2017 Problem Set 3 Programming: Regularized Polynomial Regression Goals: To open up the black-box of scikit-learn and implement regression models. To investigate how adding polynomial

More information

Natural Language Processing CS 6320 Lecture 6 Neural Language Models. Instructor: Sanda Harabagiu

Natural Language Processing CS 6320 Lecture 6 Neural Language Models. Instructor: Sanda Harabagiu Natural Language Processing CS 6320 Lecture 6 Neural Language Models Instructor: Sanda Harabagiu In this lecture We shall cover: Deep Neural Models for Natural Language Processing Introduce Feed Forward

More information

Lecture 20: Neural Networks for NLP. Zubin Pahuja

Lecture 20: Neural Networks for NLP. Zubin Pahuja Lecture 20: Neural Networks for NLP Zubin Pahuja zpahuja2@illinois.edu courses.engr.illinois.edu/cs447 CS447: Natural Language Processing 1 Today s Lecture Feed-forward neural networks as classifiers simple

More information

More on Neural Networks. Read Chapter 5 in the text by Bishop, except omit Sections 5.3.3, 5.3.4, 5.4, 5.5.4, 5.5.5, 5.5.6, 5.5.7, and 5.

More on Neural Networks. Read Chapter 5 in the text by Bishop, except omit Sections 5.3.3, 5.3.4, 5.4, 5.5.4, 5.5.5, 5.5.6, 5.5.7, and 5. More on Neural Networks Read Chapter 5 in the text by Bishop, except omit Sections 5.3.3, 5.3.4, 5.4, 5.5.4, 5.5.5, 5.5.6, 5.5.7, and 5.6 Recall the MLP Training Example From Last Lecture log likelihood

More information

Ensemble methods in machine learning. Example. Neural networks. Neural networks

Ensemble methods in machine learning. Example. Neural networks. Neural networks Ensemble methods in machine learning Bootstrap aggregating (bagging) train an ensemble of models based on randomly resampled versions of the training set, then take a majority vote Example What if you

More information

APPLICATION OF SOFTMAX REGRESSION AND ITS VALIDATION FOR SPECTRAL-BASED LAND COVER MAPPING

APPLICATION OF SOFTMAX REGRESSION AND ITS VALIDATION FOR SPECTRAL-BASED LAND COVER MAPPING APPLICATION OF SOFTMAX REGRESSION AND ITS VALIDATION FOR SPECTRAL-BASED LAND COVER MAPPING J. Wolfe a, X. Jin a, T. Bahr b, N. Holzer b, * a Harris Corporation, Broomfield, Colorado, U.S.A. (jwolfe05,

More information

1. What specialist uses information obtained from bones to help police solve crimes?

1. What specialist uses information obtained from bones to help police solve crimes? Mathematics: Modeling Our World Unit 4: PREDICTION HANDOUT VIDEO VIEWING GUIDE H4.1 1. What specialist uses information obtained from bones to help police solve crimes? 2.What are some things that can

More information

A Brief Look at Optimization

A Brief Look at Optimization A Brief Look at Optimization CSC 412/2506 Tutorial David Madras January 18, 2018 Slides adapted from last year s version Overview Introduction Classes of optimization problems Linear programming Steepest

More information

I211: Information infrastructure II

I211: Information infrastructure II Data Mining: Classifier Evaluation I211: Information infrastructure II 3-nearest neighbor labeled data find class labels for the 4 data points 1 0 0 6 0 0 0 5 17 1.7 1 1 4 1 7.1 1 1 1 0.4 1 2 1 3.0 0 0.1

More information

The exam is closed book, closed notes except your one-page (two-sided) cheat sheet.

The exam is closed book, closed notes except your one-page (two-sided) cheat sheet. CS 189 Spring 2015 Introduction to Machine Learning Final You have 2 hours 50 minutes for the exam. The exam is closed book, closed notes except your one-page (two-sided) cheat sheet. No calculators or

More information

Understanding Andrew Ng s Machine Learning Course Notes and codes (Matlab version)

Understanding Andrew Ng s Machine Learning Course Notes and codes (Matlab version) Understanding Andrew Ng s Machine Learning Course Notes and codes (Matlab version) Note: All source materials and diagrams are taken from the Coursera s lectures created by Dr Andrew Ng. Everything I have

More information

Logistic Regression and Gradient Ascent

Logistic Regression and Gradient Ascent Logistic Regression and Gradient Ascent CS 349-02 (Machine Learning) April 0, 207 The perceptron algorithm has a couple of issues: () the predictions have no probabilistic interpretation or confidence

More information

Multi-Class Logistic Regression and Perceptron

Multi-Class Logistic Regression and Perceptron Multi-Class Logistic Regression and Perceptron Instructor: Wei Xu Some slides adapted from Dan Jurfasky, Brendan O Connor and Marine Carpuat MultiClass Classification Q: what if we have more than 2 categories?

More information

Lecture #11: The Perceptron

Lecture #11: The Perceptron Lecture #11: The Perceptron Mat Kallada STAT2450 - Introduction to Data Mining Outline for Today Welcome back! Assignment 3 The Perceptron Learning Method Perceptron Learning Rule Assignment 3 Will be

More information

1.5 Equations of Lines and Planes in 3-D

1.5 Equations of Lines and Planes in 3-D 1.5. EQUATIONS OF LINES AND PLANES IN 3-D 55 Figure 1.16: Line through P 0 parallel to v 1.5 Equations of Lines and Planes in 3-D Recall that given a point P = (a, b, c), one can draw a vector from the

More information

DS Machine Learning and Data Mining I. Alina Oprea Associate Professor, CCIS Northeastern University

DS Machine Learning and Data Mining I. Alina Oprea Associate Professor, CCIS Northeastern University DS 4400 Machine Learning and Data Mining I Alina Oprea Associate Professor, CCIS Northeastern University January 24 2019 Logistics HW 1 is due on Friday 01/25 Project proposal: due Feb 21 1 page description

More information

CS 224N: Assignment #1

CS 224N: Assignment #1 Due date: assignment) 1/25 11:59 PM PST (You are allowed to use three (3) late days maximum for this These questions require thought, but do not require long answers. Please be as concise as possible.

More information

model order p weights The solution to this optimization problem is obtained by solving the linear system

model order p weights The solution to this optimization problem is obtained by solving the linear system CS 189 Introduction to Machine Learning Fall 2017 Note 3 1 Regression and hyperparameters Recall the supervised regression setting in which we attempt to learn a mapping f : R d R from labeled examples

More information

Evaluation. Evaluate what? For really large amounts of data... A: Use a validation set.

Evaluation. Evaluate what? For really large amounts of data... A: Use a validation set. Evaluate what? Evaluation Charles Sutton Data Mining and Exploration Spring 2012 Do you want to evaluate a classifier or a learning algorithm? Do you want to predict accuracy or predict which one is better?

More information

Machine Learning Lecture-1

Machine Learning Lecture-1 Machine Learning Lecture-1 Programming Club Indian Institute of Technology Kanpur pclubiitk@gmail.com February 25, 2016 Programming Club (IIT Kanpur) ML-1 February 25, 2016 1 / 18 Acknowledgement This

More information

DATA MINING AND MACHINE LEARNING. Lecture 6: Data preprocessing and model selection Lecturer: Simone Scardapane

DATA MINING AND MACHINE LEARNING. Lecture 6: Data preprocessing and model selection Lecturer: Simone Scardapane DATA MINING AND MACHINE LEARNING Lecture 6: Data preprocessing and model selection Lecturer: Simone Scardapane Academic Year 2016/2017 Table of contents Data preprocessing Feature normalization Missing

More information

3.5 Day 1 Warm Up. Graph each line. 3.4 Proofs with Perpendicular Lines

3.5 Day 1 Warm Up. Graph each line. 3.4 Proofs with Perpendicular Lines 3.5 Day 1 Warm Up Graph each line. 1. y = 4x 2. y = 3x + 2 3. y = x 3 4. y = 4 x + 3 3 November 2, 2015 3.4 Proofs with Perpendicular Lines Geometry 3.5 Equations of Parallel and Perpendicular Lines Day

More information

CSE Data Mining Concepts and Techniques STATISTICAL METHODS (REGRESSION) Professor- Anita Wasilewska. Team 13

CSE Data Mining Concepts and Techniques STATISTICAL METHODS (REGRESSION) Professor- Anita Wasilewska. Team 13 CSE 634 - Data Mining Concepts and Techniques STATISTICAL METHODS Professor- Anita Wasilewska (REGRESSION) Team 13 Contents Linear Regression Logistic Regression Bias and Variance in Regression Model Fit

More information

CS 224N: Assignment #1

CS 224N: Assignment #1 Due date: assignment) 1/25 11:59 PM PST (You are allowed to use three (3) late days maximum for this These questions require thought, but do not require long answers. Please be as concise as possible.

More information

Predict the Likelihood of Responding to Direct Mail Campaign in Consumer Lending Industry

Predict the Likelihood of Responding to Direct Mail Campaign in Consumer Lending Industry Predict the Likelihood of Responding to Direct Mail Campaign in Consumer Lending Industry Jincheng Cao, SCPD Jincheng@stanford.edu 1. INTRODUCTION When running a direct mail campaign, it s common practice

More information

Neural Networks (pp )

Neural Networks (pp ) Notation: Means pencil-and-paper QUIZ Means coding QUIZ Neural Networks (pp. 106-121) The first artificial neural network (ANN) was the (single-layer) perceptron, a simplified model of a biological neuron.

More information

3.1. Exponential Functions. Copyright 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley

3.1. Exponential Functions. Copyright 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley 3.1 Exponential Functions Copyright 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Quick Review Evaluate the expression without using a calculator. 3 1. -125 2. 3 27 64 3. 27 4/3 Rewrite

More information

DS Machine Learning and Data Mining I. Alina Oprea Associate Professor, CCIS Northeastern University

DS Machine Learning and Data Mining I. Alina Oprea Associate Professor, CCIS Northeastern University DS 4400 Machine Learning and Data Mining I Alina Oprea Associate Professor, CCIS Northeastern University January 22 2019 Outline Practical issues in Linear Regression Outliers Categorical variables Lab

More information

SEMANTIC COMPUTING. Lecture 8: Introduction to Deep Learning. TU Dresden, 7 December Dagmar Gromann International Center For Computational Logic

SEMANTIC COMPUTING. Lecture 8: Introduction to Deep Learning. TU Dresden, 7 December Dagmar Gromann International Center For Computational Logic SEMANTIC COMPUTING Lecture 8: Introduction to Deep Learning Dagmar Gromann International Center For Computational Logic TU Dresden, 7 December 2018 Overview Introduction Deep Learning General Neural Networks

More information

Assignment 4 (Sol.) Introduction to Data Analytics Prof. Nandan Sudarsanam & Prof. B. Ravindran

Assignment 4 (Sol.) Introduction to Data Analytics Prof. Nandan Sudarsanam & Prof. B. Ravindran Assignment 4 (Sol.) Introduction to Data Analytics Prof. andan Sudarsanam & Prof. B. Ravindran 1. Which among the following techniques can be used to aid decision making when those decisions depend upon

More information

(Multinomial) Logistic Regression + Feature Engineering

(Multinomial) Logistic Regression + Feature Engineering -6 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University (Multinomial) Logistic Regression + Feature Engineering Matt Gormley Lecture 9 Feb.

More information

Package hiernet. March 18, 2018

Package hiernet. March 18, 2018 Title A Lasso for Hierarchical Interactions Version 1.7 Author Jacob Bien and Rob Tibshirani Package hiernet March 18, 2018 Fits sparse interaction models for continuous and binary responses subject to

More information

2 Second Derivatives. As we have seen, a function f (x, y) of two variables has four different partial derivatives: f xx. f yx. f x y.

2 Second Derivatives. As we have seen, a function f (x, y) of two variables has four different partial derivatives: f xx. f yx. f x y. 2 Second Derivatives As we have seen, a function f (x, y) of two variables has four different partial derivatives: (x, y), (x, y), f yx (x, y), (x, y) It is convenient to gather all four of these into

More information

Introduction to Machine Learning. Xiaojin Zhu

Introduction to Machine Learning. Xiaojin Zhu Introduction to Machine Learning Xiaojin Zhu jerryzhu@cs.wisc.edu Read Chapter 1 of this book: Xiaojin Zhu and Andrew B. Goldberg. Introduction to Semi- Supervised Learning. http://www.morganclaypool.com/doi/abs/10.2200/s00196ed1v01y200906aim006

More information

MDM 4UI: Unit 8 Day 2: Regression and Correlation

MDM 4UI: Unit 8 Day 2: Regression and Correlation MDM 4UI: Unit 8 Day 2: Regression and Correlation Regression: The process of fitting a line or a curve to a set of data. Coefficient of Correlation(r): This is a value between and allows statisticians

More information

Neural Network Neurons

Neural Network Neurons Neural Networks Neural Network Neurons 1 Receives n inputs (plus a bias term) Multiplies each input by its weight Applies activation function to the sum of results Outputs result Activation Functions Given

More information

Package automl. September 13, 2018

Package automl. September 13, 2018 Type Package Title Deep Learning with Metaheuristic Version 1.0.5 Author Alex Boulangé Package automl September 13, 2018 Maintainer Alex Boulangé Fits from

More information

CS 112 Introduction to Computing II. Wayne Snyder Computer Science Department Boston University

CS 112 Introduction to Computing II. Wayne Snyder Computer Science Department Boston University CS 112 Introduction to Computing II Wayne Snyder Department Boston University Today: Administrivia and Motivation Administrative Matters: Review of course design and course policies Motivation: Two Algorithms

More information

Writing and Graphing Linear Equations. Linear equations can be used to represent relationships.

Writing and Graphing Linear Equations. Linear equations can be used to represent relationships. Writing and Graphing Linear Equations Linear equations can be used to represent relationships. Linear equation An equation whose solutions form a straight line on a coordinate plane. Collinear Points that

More information

CS145: INTRODUCTION TO DATA MINING

CS145: INTRODUCTION TO DATA MINING CS145: INTRODUCTION TO DATA MINING 08: Classification Evaluation and Practical Issues Instructor: Yizhou Sun yzsun@cs.ucla.edu October 24, 2017 Learnt Prediction and Classification Methods Vector Data

More information

Function Algorithms: Linear Regression, Logistic Regression

Function Algorithms: Linear Regression, Logistic Regression CS 4510/9010: Applied Machine Learning 1 Function Algorithms: Linear Regression, Logistic Regression Paula Matuszek Fall, 2016 Some of these slides originated from Andrew Moore Tutorials, at http://www.cs.cmu.edu/~awm/tutorials.html

More information

AKA: Logistic Regression Implementation

AKA: Logistic Regression Implementation AKA: Logistic Regression Implementation 1 Supervised classification is the problem of predicting to which category a new observation belongs. A category is chosen from a list of predefined categories.

More information

Nearest neighbor classification DSE 220

Nearest neighbor classification DSE 220 Nearest neighbor classification DSE 220 Decision Trees Target variable Label Dependent variable Output space Person ID Age Gender Income Balance Mortgag e payment 123213 32 F 25000 32000 Y 17824 49 M 12000-3000

More information

CS535 Big Data Fall 2017 Colorado State University 10/10/2017 Sangmi Lee Pallickara Week 8- A.

CS535 Big Data Fall 2017 Colorado State University   10/10/2017 Sangmi Lee Pallickara Week 8- A. CS535 Big Data - Fall 2017 Week 8-A-1 CS535 BIG DATA FAQs Term project proposal New deadline: Tomorrow PA1 demo PART 1. BATCH COMPUTING MODELS FOR BIG DATA ANALYTICS 5. ADVANCED DATA ANALYTICS WITH APACHE

More information

Predicting Popular Xbox games based on Search Queries of Users

Predicting Popular Xbox games based on Search Queries of Users 1 Predicting Popular Xbox games based on Search Queries of Users Chinmoy Mandayam and Saahil Shenoy I. INTRODUCTION This project is based on a completed Kaggle competition. Our goal is to predict which

More information

Your Name: Section: INTRODUCTION TO STATISTICAL REASONING Computer Lab #4 Scatterplots and Regression

Your Name: Section: INTRODUCTION TO STATISTICAL REASONING Computer Lab #4 Scatterplots and Regression Your Name: Section: 36-201 INTRODUCTION TO STATISTICAL REASONING Computer Lab #4 Scatterplots and Regression Objectives: 1. To learn how to interpret scatterplots. Specifically you will investigate, using

More information

CS5670: Computer Vision

CS5670: Computer Vision CS5670: Computer Vision Noah Snavely Lecture 33: Recognition Basics Slides from Andrej Karpathy and Fei-Fei Li http://vision.stanford.edu/teaching/cs231n/ Announcements Quiz moved to Tuesday Project 4

More information

Univariate and Multivariate Decision Trees

Univariate and Multivariate Decision Trees Univariate and Multivariate Decision Trees Olcay Taner Yıldız and Ethem Alpaydın Department of Computer Engineering Boğaziçi University İstanbul 80815 Turkey Abstract. Univariate decision trees at each

More information

Graphing Linear Equations

Graphing Linear Equations Graphing Linear Equations Question 1: What is a rectangular coordinate system? Answer 1: The rectangular coordinate system is used to graph points and equations. To create the rectangular coordinate system,

More information

Supervised Learning Classification Algorithms Comparison

Supervised Learning Classification Algorithms Comparison Supervised Learning Classification Algorithms Comparison Aditya Singh Rathore B.Tech, J.K. Lakshmipat University -------------------------------------------------------------***---------------------------------------------------------

More information

Data Mining Issues. Danushka Bollegala

Data Mining Issues. Danushka Bollegala Data Mining Issues Danushka Bollegala Missing values What if we do not know the value of a particular feature of an instance? eg. the height of the 10-th student is missing although other feature values

More information

Lecture 25: Review I

Lecture 25: Review I Lecture 25: Review I Reading: Up to chapter 5 in ISLR. STATS 202: Data mining and analysis Jonathan Taylor 1 / 18 Unsupervised learning In unsupervised learning, all the variables are on equal standing,

More information

CPSC 340: Machine Learning and Data Mining. Principal Component Analysis Fall 2017

CPSC 340: Machine Learning and Data Mining. Principal Component Analysis Fall 2017 CPSC 340: Machine Learning and Data Mining Principal Component Analysis Fall 2017 Assignment 3: 2 late days to hand in tonight. Admin Assignment 4: Due Friday of next week. Last Time: MAP Estimation MAP

More information

Scatterplot: The Bridge from Correlation to Regression

Scatterplot: The Bridge from Correlation to Regression Scatterplot: The Bridge from Correlation to Regression We have already seen how a histogram is a useful technique for graphing the distribution of one variable. Here is the histogram depicting the distribution

More information

Frameworks in Python for Numeric Computation / ML

Frameworks in Python for Numeric Computation / ML Frameworks in Python for Numeric Computation / ML Why use a framework? Why not use the built-in data structures? Why not write our own matrix multiplication function? Frameworks are needed not only because

More information

Logistic Regression: Probabilistic Interpretation

Logistic Regression: Probabilistic Interpretation Logistic Regression: Probabilistic Interpretation Approximate 0/1 Loss Logistic Regression Adaboost (z) SVM Solution: Approximate 0/1 loss with convex loss ( surrogate loss) 0-1 z = y w x SVM (hinge),

More information

Semi-Supervised Learning of Named Entity Substructure

Semi-Supervised Learning of Named Entity Substructure Semi-Supervised Learning of Named Entity Substructure Alden Timme aotimme@stanford.edu CS229 Final Project Advisor: Richard Socher richard@socher.org Abstract The goal of this project was two-fold: (1)

More information

Machine Learning. Topic 5: Linear Discriminants. Bryan Pardo, EECS 349 Machine Learning, 2013

Machine Learning. Topic 5: Linear Discriminants. Bryan Pardo, EECS 349 Machine Learning, 2013 Machine Learning Topic 5: Linear Discriminants Bryan Pardo, EECS 349 Machine Learning, 2013 Thanks to Mark Cartwright for his extensive contributions to these slides Thanks to Alpaydin, Bishop, and Duda/Hart/Stork

More information

Computational Intelligence (CS) SS15 Homework 1 Linear Regression and Logistic Regression

Computational Intelligence (CS) SS15 Homework 1 Linear Regression and Logistic Regression Computational Intelligence (CS) SS15 Homework 1 Linear Regression and Logistic Regression Anand Subramoney Points to achieve: 17 pts Extra points: 5* pts Info hour: 28.04.2015 14:00-15:00, HS i12 (ICK1130H)

More information

Predict the box office of US movies

Predict the box office of US movies Predict the box office of US movies Group members: Hanqing Ma, Jin Sun, Zeyu Zhang 1. Introduction Our task is to predict the box office of the upcoming movies using the properties of the movies, such

More information

14.2 The Regression Equation

14.2 The Regression Equation 14.2 The Regression Equation Tom Lewis Fall Term 2009 Tom Lewis () 14.2 The Regression Equation Fall Term 2009 1 / 12 Outline 1 Exact and inexact linear relationships 2 Fitting lines to data 3 Formulas

More information

Backpropagation + Deep Learning

Backpropagation + Deep Learning 10-601 Introduction to Machine Learning Machine Learning Department School of Computer Science Carnegie Mellon University Backpropagation + Deep Learning Matt Gormley Lecture 13 Mar 1, 2018 1 Reminders

More information

15-780: Problem Set #2

15-780: Problem Set #2 15-780: Problem Set #2 February 19, 2014 1. Constraint satisfaction problem (CSP) [20pts] A common problem at universities is to schedule rooms for exams. The basic structure of this problem is to divide

More information

DS Machine Learning and Data Mining I. Alina Oprea Associate Professor, CCIS Northeastern University

DS Machine Learning and Data Mining I. Alina Oprea Associate Professor, CCIS Northeastern University DS 4400 Machine Learning and Data Mining I Alina Oprea Associate Professor, CCIS Northeastern University September 18 2018 Logistics HW 1 is on Piazza and Gradescope Deadline: Friday, Sept. 28, 2018 Office

More information

CSE 158 Lecture 2. Web Mining and Recommender Systems. Supervised learning Regression

CSE 158 Lecture 2. Web Mining and Recommender Systems. Supervised learning Regression CSE 158 Lecture 2 Web Mining and Recommender Systems Supervised learning Regression Supervised versus unsupervised learning Learning approaches attempt to model data in order to solve a problem Unsupervised

More information

Increasing and Decreasing Functions. MATH 1003 Calculus and Linear Algebra (Lecture 20) Increasing and Decreasing Functions

Increasing and Decreasing Functions. MATH 1003 Calculus and Linear Algebra (Lecture 20) Increasing and Decreasing Functions Increasing and Decreasing Functions MATH 1003 Calculus and Linear Algebra (Lecture 20) Maosheng Xiong Department of Mathematics, HKUST Suppose y = f (x). 1. f (x) is increasing on an interval a < x < b,

More information

Partitioning Data. IRDS: Evaluation, Debugging, and Diagnostics. Cross-Validation. Cross-Validation for parameter tuning

Partitioning Data. IRDS: Evaluation, Debugging, and Diagnostics. Cross-Validation. Cross-Validation for parameter tuning Partitioning Data IRDS: Evaluation, Debugging, and Diagnostics Charles Sutton University of Edinburgh Training Validation Test Training : Running learning algorithms Validation : Tuning parameters of learning

More information

Machine Learning / Jan 27, 2010

Machine Learning / Jan 27, 2010 Revisiting Logistic Regression & Naïve Bayes Aarti Singh Machine Learning 10-701/15-781 Jan 27, 2010 Generative and Discriminative Classifiers Training classifiers involves learning a mapping f: X -> Y,

More information

Introduction to Automated Text Analysis. bit.ly/poir599

Introduction to Automated Text Analysis. bit.ly/poir599 Introduction to Automated Text Analysis Pablo Barberá School of International Relations University of Southern California pablobarbera.com Lecture materials: bit.ly/poir599 Today 1. Solutions for last

More information

Linear Regression Implementation

Linear Regression Implementation Linear Regression Implementation 1 When you experience regression, you go back in some way. The process of regressing is to go back to a less perfect or less developed state. Modeling data is focused on

More information

Problem Set #6 Due: 11:30am on Wednesday, June 7th Note: We will not be accepting late submissions.

Problem Set #6 Due: 11:30am on Wednesday, June 7th Note: We will not be accepting late submissions. Chris Piech Pset #6 CS09 May 26, 207 Problem Set #6 Due: :30am on Wednesday, June 7th Note: We will not be accepting late submissions. For each of the written problems, explain/justify how you obtained

More information

Linear Classification and Perceptron

Linear Classification and Perceptron Linear Classification and Perceptron INFO-4604, Applied Machine Learning University of Colorado Boulder September 7, 2017 Prof. Michael Paul Prediction Functions Remember: a prediction function is the

More information

Universität Freiburg Lehrstuhl für Maschinelles Lernen und natürlichsprachliche Systeme. Machine Learning (SS2012)

Universität Freiburg Lehrstuhl für Maschinelles Lernen und natürlichsprachliche Systeme. Machine Learning (SS2012) Universität Freiburg Lehrstuhl für Maschinelles Lernen und natürlichsprachliche Systeme Machine Learning (SS2012) Prof. Dr. M. Riedmiller, Manuel Blum Exercise Sheet 5 Exercise 5.1: Spam Detection Suppose

More information

Supervised Learning (contd) Linear Separation. Mausam (based on slides by UW-AI faculty)

Supervised Learning (contd) Linear Separation. Mausam (based on slides by UW-AI faculty) Supervised Learning (contd) Linear Separation Mausam (based on slides by UW-AI faculty) Images as Vectors Binary handwritten characters Treat an image as a highdimensional vector (e.g., by reading pixel

More information

Introduction to Artificial Intelligence

Introduction to Artificial Intelligence Introduction to Artificial Intelligence COMP307 Evolutionary Computing 3: Genetic Programming for Regression and Classification Yi Mei yi.mei@ecs.vuw.ac.nz 1 Outline Statistical parameter regression Symbolic

More information

Exploring Slope. We use the letter m to represent slope. It is the ratio of the rise to the run.

Exploring Slope. We use the letter m to represent slope. It is the ratio of the rise to the run. Math 7 Exploring Slope Slope measures the steepness of a line. If you take any two points on a line, the change in y (vertical change) is called the rise and the change in x (horizontal change) is called

More information

Deep neural networks II

Deep neural networks II Deep neural networks II May 31 st, 2018 Yong Jae Lee UC Davis Many slides from Rob Fergus, Svetlana Lazebnik, Jia-Bin Huang, Derek Hoiem, Adriana Kovashka, Why (convolutional) neural networks? State of

More information

Solution 1 (python) Performance: Enron Samples Rate Recall Precision Total Contribution

Solution 1 (python) Performance: Enron Samples Rate Recall Precision Total Contribution Summary Each of the ham/spam classifiers has been tested against random samples from pre- processed enron sets 1 through 6 obtained via: http://www.aueb.gr/users/ion/data/enron- spam/, or the entire set

More information

Planar data classification with one hidden layer

Planar data classification with one hidden layer Planar data classification with one hidden layer Welcome to your week 3 programming assignment. It's time to build your first neural network, which will have a hidden layer. You will see a big difference

More information

Data Mining: Classifier Evaluation. CSCI-B490 Seminar in Computer Science (Data Mining)

Data Mining: Classifier Evaluation. CSCI-B490 Seminar in Computer Science (Data Mining) Data Mining: Classifier Evaluation CSCI-B490 Seminar in Computer Science (Data Mining) Predictor Evaluation 1. Question: how good is our algorithm? how will we estimate its performance? 2. Question: what

More information