
Machine Learning Lecture-1
Programming Club, Indian Institute of Technology Kanpur
pclubiitk@gmail.com
February 25, 2016

Acknowledgement

This presentation is meant only for teaching purposes, as part of an IIT Kanpur club activity. Most of the information used here is taken from different sources available online, and hence the contents of this presentation should NOT be used anywhere without proper consent from the original sources.

What is Machine Learning?

Machine Learning is the science of getting computers to learn without being explicitly programmed. Some applications:

Search
Spam Filters
Recommendations
Chat-Bots
Speech Recognition
Face Detection and Recognition
Robotics
And a LOT more!

Supervised and Unsupervised Learning

Supervised Learning: A supervised learning algorithm analyses the training data and produces an inferred function, which can be used for mapping new examples. The right answers for the training data are given to us. There are two types of supervised learning:

Classification
Regression

Unsupervised Learning: Unsupervised learning is the machine learning task of inferring a function to describe hidden structure from unlabeled data.

Linear Regression with One Variable

Linear regression with one variable is also known as univariate linear regression. Regression predicts real-valued outputs. Univariate linear regression is used when you want to predict a single output value from a single input value. We're doing supervised learning here, so we already have an idea of what the input/output cause and effect should be.

Model Representation

The data is usually categorized into three parts:

Training Set
Validation Set
Testing Set

Some usual notations:

m - number of training examples
x - input features/variables
y - output features/variables
x^(i) - the ith example of the input
y^(i) - the ith example of the output

Hypothesis Function

[Diagram: Training Data -> Learning Algorithm -> Hypothesis; a new Input fed to the Hypothesis produces the Output.]

The Hypothesis Function and Cost Function

The Hypothesis Function: h_θ(x) = θ_0 + θ_1·x

We are trying to find a function of this linear form that will reliably map our input data (x) to our output data (y).

The Cost Function: J(θ_0, θ_1) = (1/2m) · Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))^2

We can measure the accuracy of our hypothesis function by using a cost function. It takes an average of the squared differences between the hypothesis's outputs on the inputs x and the actual outputs y. This function is otherwise called the squared error function, or mean squared error.
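A minimal sketch of this cost function in Python/NumPy (the function name and the toy data are our own, not from the slides):

    import numpy as np

    def compute_cost(theta0, theta1, x, y):
        """Squared-error cost J(theta0, theta1) for univariate linear regression."""
        m = len(x)                           # number of training examples
        predictions = theta0 + theta1 * x    # h_theta(x^(i)) for every example
        return np.sum((predictions - y) ** 2) / (2 * m)

    # Toy data generated from y = 2x, so theta0 = 0, theta1 = 2 gives zero cost.
    x = np.array([1.0, 2.0, 3.0])
    y = np.array([2.0, 4.0, 6.0])
    print(compute_cost(0.0, 2.0, x, y))  # 0.0
    print(compute_cost(0.0, 1.0, x, y))  # (1 + 4 + 9) / 6 ≈ 2.33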

Gradient Descent

Gradient descent is a way to automatically improve our hypothesis function. Consider the graph that has θ_0 on the x-axis, θ_1 on the z-axis, and J(θ_0, θ_1) on the vertical y-axis. Intuitively, the most reliable hypothesis comes from the value of (θ_0, θ_1) at the lowest point of the bottommost pit on this graph. We try to reach that point by taking the derivative of our cost function: the slope of the tangent is the derivative at that point, and it gives us a direction to move in. We step down along that derivative, with the step size controlled by the parameter α, called the learning rate.

Gradient Descent for Linear Regression

The gradient descent equation, repeated until convergence:

θ_j := θ_j − α · (∂/∂θ_j) J(θ_0, θ_1), for j = 0 and j = 1

For linear regression this reduces to, repeated until convergence:

θ_0 := θ_0 − α · (1/m) · Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i))
θ_1 := θ_1 − α · (1/m) · Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) · x^(i)

Note that we have separated out the two cases for θ_0 and θ_1, because the derivative introduces an extra multiplying factor of x^(i) for θ_1.
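These updates translate almost line for line into NumPy. A hedged sketch (the learning rate, iteration count, and simultaneous-update style are our choices, not prescribed by the slides):

    import numpy as np

    def gradient_descent(x, y, alpha=0.1, iters=1000):
        """Batch gradient descent for h(x) = theta0 + theta1 * x."""
        m = len(x)
        theta0, theta1 = 0.0, 0.0
        for _ in range(iters):
            error = (theta0 + theta1 * x) - y    # h_theta(x^(i)) - y^(i)
            grad0 = np.sum(error) / m            # derivative w.r.t. theta0
            grad1 = np.sum(error * x) / m        # extra x^(i) factor for theta1
            # Update both parameters simultaneously.
            theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1
        return theta0, theta1

    x = np.array([1.0, 2.0, 3.0])
    y = np.array([2.0, 4.0, 6.0])
    print(gradient_descent(x, y))  # should approach (0.0, 2.0)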

Logistic Regression

Logistic Model: builds a classification/regression model for problems with yes-no (negative-positive) outcomes.

Logistic Regression Model: h_θ(x) = g(θ^T x), where g(τ) = 1 / (1 + e^(−τ)), i.e. g is the sigmoid function, whose value lies in the range (0, 1).
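A one-line sigmoid in NumPy, just to make the definition concrete (the function name is our own):

    import numpy as np

    def sigmoid(tau):
        """g(tau) = 1 / (1 + e^(-tau)); squashes any real number into (0, 1)."""
        return 1.0 / (1.0 + np.exp(-tau))

    print(sigmoid(0.0))    # 0.5  -> the decision threshold
    print(sigmoid(10.0))   # ~1.0 (confidently positive)
    print(sigmoid(-10.0))  # ~0.0 (confidently negative)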

Logistic Regression Model - Decision Boundary

We predict y = 1 if h_θ(x) ≥ 0.5, which happens exactly when θ^T x ≥ 0.
We predict y = 0 if h_θ(x) < 0.5, which happens exactly when θ^T x < 0.

Consider a model with a linear argument, where h_θ(x) = g(θ_0 + θ_1·x_1 + … + θ_n·x_n). To predict whether the output is positive or negative, it is sufficient to check the sign of the argument. This makes for a very clean analysis.
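As a concrete illustration (the numbers here are our own, not from the slides): with θ = (−3, 1, 1), we have h_θ(x) = g(−3 + x_1 + x_2), so we predict y = 1 exactly when x_1 + x_2 ≥ 3. The line x_1 + x_2 = 3 is the decision boundary, and it is a property of the hypothesis, not of the data set.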

Logistic Regression - Cost Function

J(θ) = (1/m) · Σ_{i=1}^{m} cost(h_θ(x^(i)), y^(i))

cost(h_θ(x), y) = −y·log(h_θ(x)) − (1 − y)·log(1 − h_θ(x))

The intuition behind this cost function comes from the fact that the output is either 1 or 0. Moreover, this function is convex, which enables the gradient descent procedure.
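A sketch of this cost in NumPy (the vectorized form and the small epsilon guard against log(0) are our additions):

    import numpy as np

    def logistic_cost(theta, X, y, eps=1e-12):
        """J(theta) = -(1/m) * sum(y*log(h) + (1-y)*log(1-h))."""
        m = len(y)
        h = 1.0 / (1.0 + np.exp(-X @ theta))  # h_theta(x) = g(theta^T x), row-wise
        h = np.clip(h, eps, 1 - eps)          # avoid log(0) at extreme predictions
        return -np.sum(y * np.log(h) + (1 - y) * np.log(1 - h)) / m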

Logistic Regression - Gradient Descent

The gradient descent equation, repeated until convergence:

θ_j := θ_j − α · (1/m) · Σ_{i=1}^{m} (h_θ(x^(i)) − y^(i)) · x_j^(i)

This looks very similar to the update for linear regression, but remember that the hypothesis here is different. Some optimizations that can be done:

Feature Scaling (see the sketch below)
Choice of the Learning Rate α
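Feature scaling is only a couple of lines in code; a common choice (ours here, not specified in the slides) is mean normalization:

    import numpy as np

    def scale_features(X):
        """Mean-normalize each column: (x - mean) / std. Speeds up gradient descent."""
        mu = X.mean(axis=0)
        sigma = X.std(axis=0)   # assumes no constant (zero-variance) columns
        # Keep mu and sigma so new inputs can be scaled the same way at prediction time.
        return (X - mu) / sigma, mu, sigma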

Some Pointers to Master ML

Almost everything you would want to do will already have a package built by somebody; DO NOT start everything from scratch. While implementing code in Octave/MATLAB, loop only if there is NO other alternative (vectorize instead). Some useful toolkits you may want to refer to: scikit-learn, TensorFlow.
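As an illustration of "don't start from scratch", logistic regression in scikit-learn takes a few lines (a sketch; the dataset and parameters are our choices):

    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # Load a built-in toy dataset and hold out a test split.
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    print(clf.score(X_test, y_test))  # held-out accuracy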

Question

Try this question for sure. It will help you understand the concepts better and, most importantly, retain the material we covered in a hurry. You will need it when you get into tougher concepts.

References

Andrew Ng, Stanford University, coursera.org - Machine Learning (online course)
Wikipedia - for definitions

The End