CS4619: Artificial Intelligence II
Overfitting and Underfitting
Derek Bridge
School of Computer Science and Information Technology
University College Cork

Initialization

In [1]:
%load_ext autoreload
%autoreload 2
%matplotlib inline

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [3]:
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.model_selection import validation_curve
from sklearn.model_selection import learning_curve
from sklearn.model_selection import cross_validate

Acknowledgements

The book was helpful again: A. Géron: Hands-On Machine Learning with Scikit-Learn & TensorFlow, O'Reilly, 2017.
I based some of this notebook on some resources from Jake VanderPlas.

Introduction

You are building an estimator but its performance is not good enough:
- In the case of a regressor, its validation error is too high
- In the case of a classifier, its validation accuracy is too low
What should you do? The options include:
- Gather more training examples (see also Data Augmentation, later)
- Remove noise in the training examples
- Add more features or remove features
- Change model: move to a more complex model or maybe to a less complex model
- Stick with your existing model but add constraints to it to reduce its complexity, or remove constraints to increase its complexity
Surprisingly, gathering more training examples may not help; adding more features may in some cases worsen the performance; changing to a more complex model may in some cases worsen the performance. It all depends on what is causing the poor performance (underfitting or overfitting).
This lecture shows you how to diagnose the problem and choose remedies that suit the diagnosis.

Defining Underfitting and Overfitting

To illustrate the concepts, we will use an artificial dataset.
So that we can plot things in 2D, the dataset will have just one feature: a numeric-valued feature whose values range from 0 to 1.
The target will also be numeric-valued and will be a non-linear function of the feature.
But we'll add a bit of noise to the dataset too.

In [4]:
# Functions for creating the dataset
def make_dataset(m, func, error):
    X = np.random.random(m)
    y = func(X, error)
    return X.reshape(m, 1), y

def f(x, error=1.0):
    y = 10 - 1 / (x + 0.1)
    if error > 0:
        y = np.random.normal(y, error)
    return y

In [5]:
# Call the functions to create a training set
X_train, y_train = make_dataset(50, f, 1.0)
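For reference, a minimal extra cell (assumed here, not part of the original notes) plots the noise-free target function, so you can see the curve that the noisy training examples are sampled from:

# Hypothetical extra cell: plot the noise-free target function f
xs = np.linspace(0, 1, 200)
plt.xlabel("feature")
plt.ylabel("y")
plt.ylim(-4, 14)
plt.plot(xs, f(xs, error=0.0), color="black")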

In [6]:
# Plot it so you can see what it looks like
plt.xlabel("feature")
plt.ylabel("y")
plt.ylim(-4, 14)
plt.scatter(X_train, y_train, color="green")

In [7]:
vals = np.linspace(-0.1, 1.1, 500).reshape(500, 1)

Fitting a Linear Model to the Data

We'll use OLS Linear Regression to fit a linear model.
And we'll plot the model that it learns.

In [8]:
# Fit the model
linear_estimator = LinearRegression()
linear_estimator.fit(X_train, y_train)

Out[8]:
LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)

In [9]:
# Get its predictions
y_predicted = linear_estimator.predict(vals)

In [10]:
# Plot the training set and also the predictions made by the model
plt.xlabel("feature")
plt.ylabel("y")
plt.ylim(-4, 14)
plt.scatter(X_train, y_train, color="green")
plt.plot(vals, y_predicted, color="blue")

It's easy to see that a linear model is a poor choice.
It underfits the data: the model is not complex enough; it is too simple to capture the underlying structure of the data.

Fitting a Quadratic Model to the Data

What happens if we try to fit a more complex model such as a quadratic function?

In [11]:
# Fit the model
quadratic_estimator = Pipeline([
    ("polyfeatures", PolynomialFeatures(degree=2, include_bias=False)),
    ("estimator", LinearRegression())
])
quadratic_estimator.fit(X_train, y_train)

Out[11]:
Pipeline(memory=None,
         steps=[('polyfeatures', PolynomialFeatures(degree=2, include_bias=False, interaction_only=False)),
                ('estimator', LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False))])

In [12]:
# Get its predictions
y_predicted = quadratic_estimator.predict(vals)
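As an aside (this cell is not in the original notes), you can see what PolynomialFeatures is doing by transforming a single feature value by hand; with degree=2 and include_bias=False it simply appends the square of the feature:

# Hypothetical illustration of PolynomialFeatures
pf = PolynomialFeatures(degree=2, include_bias=False)
print(pf.fit_transform(np.array([[0.5]])))   # [[0.5  0.25]]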

In [13]:
# Plot the training set and also the predictions made by the model on the validation set
plt.xlabel("feature")
plt.ylabel("y")
plt.ylim(-4, 14)
plt.scatter(X_train, y_train, color="green")
plt.plot(vals, y_predicted, color="blue")

This fits the training data much better: it underfits less.
We could now try a cubic model.
But let's skip all that and try something much more complex.

Fitting a Much Higher Degree Polynomial to the Data

So what happens if we fit a polynomial of degree 30?

In [14]:
poly_estimator = Pipeline([
    ("polyfeatures", PolynomialFeatures(degree=30, include_bias=False)),
    ("estimator", LinearRegression())
])
poly_estimator.fit(X_train, y_train)

Out[14]:
Pipeline(memory=None,
         steps=[('polyfeatures', PolynomialFeatures(degree=30, include_bias=False, interaction_only=False)),
                ('estimator', LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False))])

In [15]:
# Get its predictions
y_predicted = poly_estimator.predict(vals)

In [16]:
# Plot the training set and also the predictions made by the model on the validation set
plt.xlabel("feature")
plt.ylabel("y")
plt.ylim(-4, 14)
plt.scatter(X_train, y_train, color="green")
plt.plot(vals, y_predicted, color="blue")

While a model of this complexity fits the training set really well, it seems clear that this model is a poor choice.
It is not capturing the target function; it is fitting to the noise in the training set.
It overfits the data: the model is too complex relative to the amount of training data and the noisiness of the training data.

Fitting Models of Different Complexities to the Data

We can plot complexity along the x-axis and error on the y-axis.
In our case, we plot the degree of the polynomial along the x-axis.
In fact, we'll plot two lines: training error and validation error.

In [17]:
# I'll make a larger dataset than the one I used above because I want to split this one
# into training and validation sets
X, y = make_dataset(200, f, 1.0)

In [18]:
degrees = np.arange(1, 30)
estimator = Pipeline([
    ("polyfeatures", PolynomialFeatures(include_bias=False)),
    ("estimator", LinearRegression())
])
mses_train, mses_val = validation_curve(
    estimator, X, y, "polyfeatures__degree", degrees, cv=10,
    scoring="neg_mean_squared_error")
mean_mses_train = np.mean(np.abs(mses_train), axis=1)
mean_mses_val = np.mean(np.abs(mses_val), axis=1)
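If you want a single 'best' degree out of this, a minimal sketch (assumed here, not part of the original notes) reads off the degree whose mean validation MSE is lowest, even before we plot the curves:

# Hypothetical follow-up: the degree with the lowest mean validation MSE
best_degree = degrees[np.argmin(mean_mses_val)]
print(best_degree)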

In [19]:
plt.xlabel("degree")
plt.ylabel("mse")
plt.ylim(0, 2)
plt.plot(degrees, mean_mses_train, label="training error", color="red")
plt.plot(degrees, mean_mses_val, label="validation error", color="gold")
plt.legend()

We might get different results each time we run the code, but typically:
- Training error starts high and gets ever lower: the more complex models can wiggle their way through the noise in the data
- Validation error starts high, gets lower, and then grows again (but somewhat erratically)
The simpler models to the left underfit, so validation error (and training error) are high.
The more complex models to the right overfit:
- The training error is low: the more complex models can wiggle their way through the noise in the data
- The validation error is high (but variable): the models don't generalise from the training data to the validation data
Between the two, the complexity is 'just right'.

In [20]:
quartic_estimator = Pipeline([
    ("polyfeatures", PolynomialFeatures(degree=4, include_bias=False)),
    ("estimator", LinearRegression())
])

In summary,
- A model underfits the training set if there is a more complex model with lower validation error
- A model overfits the training set if there is a less complex model with lower validation error

Diagnosis

How do you tell whether a particular model is underfitting or overfitting? We'll look at two methods:
- Compare training error and validation error
- Plot a learning curve

Compare training error and validation error

The simplest method is to compute the training error and validation error:
- If a model has high training error and high validation error, then it is underfitting
- If a model has low training error but high validation error, then it is overfitting

In [21]:
# Underfitting
scores = cross_validate(linear_estimator, X, y, cv=10,
                        scoring="neg_mean_squared_error", return_train_score=True)
print("Training error: ", np.mean(np.abs(scores["train_score"])))
print("Validation error: ", np.mean(np.abs(scores["test_score"])))

Training error:
Validation error:

In [22]:
# Overfitting
scores = cross_validate(poly_estimator, X, y, cv=10,
                        scoring="neg_mean_squared_error", return_train_score=True)
print("Training error: ", np.mean(np.abs(scores["train_score"])))
print("Validation error: ", np.mean(np.abs(scores["test_score"])))

Training error:
Validation error:

In [23]:
# Just right
scores = cross_validate(quartic_estimator, X, y, cv=10,
                        scoring="neg_mean_squared_error", return_train_score=True)
print("Training error: ", np.mean(np.abs(scores["train_score"])))
print("Validation error: ", np.mean(np.abs(scores["test_score"])))

Training error:
Validation error:

Plot a learning curve

Learning curves plot training error and validation error against the number of examples in the training set.
But they are expensive to produce.

In [24]:
train_set_sizes = np.linspace(0.1, 1.0, 10)

In [25]:
# Underfitting
train_sizes, mses_train, mses_val = learning_curve(
    linear_estimator, X, y, train_sizes=train_set_sizes, cv=10,
    scoring="neg_mean_squared_error")
mean_mses_train = np.mean(np.abs(mses_train), axis=1)
mean_mses_val = np.mean(np.abs(mses_val), axis=1)

In [26]:
plt.xlabel("num. training examples")
plt.ylabel("mse")
plt.ylim(0, 3)
plt.plot(train_sizes, mean_mses_train, label="training error", color="purple")
plt.plot(train_sizes, mean_mses_val, label="validation error", color="orange")
plt.legend()

Training error:
- When there are just a few training examples, the model can fit them near perfectly, which is why the curve starts low
- As more examples are used for training, it becomes impossible for the model to fit the data, both because of the noise and because the model isn't complex enough
- The curve goes up and eventually plateaus
Validation error:
- When there are few training examples, the model cannot generalize well, so validation error is high
- As more examples are used for training, the model is better, so validation error comes down
- But, since the model isn't complex enough, eventually validation error plateaus, very close to the training error

In [27]:
# Overfitting
train_sizes, mses_train, mses_val = learning_curve(
    poly_estimator, X, y, train_sizes=train_set_sizes, cv=10,
    scoring="neg_mean_squared_error")
mean_mses_train = np.mean(np.abs(mses_train), axis=1)
mean_mses_val = np.mean(np.abs(mses_val), axis=1)

In [28]:
plt.xlabel("num. training examples")
plt.ylabel("mse")
plt.ylim(0, 3)
plt.plot(train_sizes, mean_mses_train, label="training error", color="purple")
plt.plot(train_sizes, mean_mses_val, label="validation error", color="orange")
plt.legend()

These curves have a similar shape to the case of underfitting, except:
- Training error: this is much lower, because the model can wiggle its way through the noise
- Validation error: there remains a big gap between training error and validation error (although they may get closer if we had even more training examples)

In [29]:
# Just right
train_sizes, mses_train, mses_val = learning_curve(
    quartic_estimator, X, y, train_sizes=train_set_sizes, cv=10,
    scoring="neg_mean_squared_error")
mean_mses_train = np.mean(np.abs(mses_train), axis=1)
mean_mses_val = np.mean(np.abs(mses_val), axis=1)

In [30]:
plt.xlabel("num. training examples")
plt.ylabel("mse")
plt.ylim(0, 3)
plt.plot(train_sizes, mean_mses_train, label="training error", color="purple")
plt.plot(train_sizes, mean_mses_val, label="validation error", color="orange")
plt.legend()

The same kind of shape again.
But, this time, the gap narrows and they should converge.

Solutions

After the diagnosis come the solutions!
If your model underfits:
- Gathering more training examples will not help
- Your main options are:
  - Change model: move to a more complex model
  - Add better features (feature engineering)
  - Stick with your existing model but remove constraints (if you can) to increase its complexity
If your model overfits, your main options are:
- Gather more training data (or use Data Augmentation)
- Remove noise in the training examples
- Change model: move to a less complex model
- Simplify by reducing the number of features
- Stick with your existing model but add constraints (if you can) to reduce its complexity
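For the last option, 'adding constraints' usually means regularization. A minimal sketch, assuming ridge regression as the constraint (this cell is not from the notes; Ridge comes from sklearn.linear_model), reuses the degree-30 features but penalises large coefficients:

# Hypothetical example: constrain the degree-30 model with ridge regularisation
from sklearn.linear_model import Ridge

ridge_estimator = Pipeline([
    ("polyfeatures", PolynomialFeatures(degree=30, include_bias=False)),
    ("estimator", Ridge(alpha=1.0))
])

# Compare its cross-validated errors with those of poly_estimator above
scores = cross_validate(ridge_estimator, X, y, cv=10,
                        scoring="neg_mean_squared_error", return_train_score=True)
print("Training error: ", np.mean(np.abs(scores["train_score"])))
print("Validation error: ", np.mean(np.abs(scores["test_score"])))

Whether this beats the plain degree-4 model depends on the choice of alpha, which you would normally tune on the validation set.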
