Divide and Conquer Kernel Ridge Regression

Slide 1: Divide and Conquer Kernel Ridge Regression
Yuchen Zhang, John Duchi, Martin Wainwright
University of California, Berkeley
COLT 2013

Slides 2-3: Problem set-up

Goal: solve the following problem:

    minimize $\mathbb{E}[(f(x) - y)^2]$ subject to $f \in \mathcal{H}$,

where $(x, y)$ is sampled from a joint distribution $P$, and $\mathcal{H}$ is a Reproducing Kernel Hilbert Space (RKHS).

$\mathcal{H}$ is defined by a kernel function $k : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$. Formally,

    $\mathcal{H} = \{ f : f = \sum_{i=1}^{\infty} \alpha_i \, k(x_i, \cdot), \; x_i \in \mathcal{X} \}$.

Slides 4-7: Kernel trick review

1. Linear regression $f(x) = \theta^T x$ only fits linear problems.
2. For non-linear problems, the trick is to map $x$ to a high-dimensional feature space, $x \mapsto \phi(x)$, and learn a model $f(x) = \theta^T \phi(x)$, so that $f$ is a non-linear function of $x$.
3. Usually $\phi(x)$ is an infinite-dimensional vector or is expensive to compute. We can reformulate the learning algorithm so that the input vectors enter only through the inner product $k(x, x') = \phi(x)^T \phi(x')$.
4. $k$ is called the kernel function and should be easy to compute. Examples:
   - Polynomial kernel: $k(x, x') = (1 + x^T x')^d$.
   - Gaussian kernel: $k(x, x') = \exp\!\left( -\frac{\|x - x'\|^2}{2\sigma^2} \right)$.
   - Sobolev kernel on $\mathbb{R}^1$: $k(x, x') = 1 + \min(x, x')$.
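
For concreteness, here is a small NumPy sketch of the three example kernels above. The function names and vectorized signatures are illustrative choices of mine, not code from the talk.

```python
import numpy as np

def polynomial_kernel(X, Z, d=3):
    """k(x, x') = (1 + x^T x')^d, for all pairs of rows of X and Z."""
    return (1.0 + X @ Z.T) ** d

def gaussian_kernel(X, Z, sigma=1.0):
    """k(x, x') = exp(-||x - x'||^2 / (2 sigma^2))."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Z**2, axis=1)[None, :]
        - 2.0 * X @ Z.T
    )
    return np.exp(-sq_dists / (2.0 * sigma**2))

def sobolev_kernel(x, z):
    """First-order Sobolev kernel on R^1: k(x, x') = 1 + min(x, x')."""
    return 1.0 + np.minimum(x[:, None], z[None, :])
```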

Slides 8-9: Kernel ridge regression

Given $N$ samples $(x_1, y_1), \ldots, (x_N, y_N)$, we want to compute the empirical minimizer

    $\hat{f} = \arg\min_{f \in \mathcal{H}} \; \frac{1}{N} \sum_{i=1}^{N} (f(x_i) - y_i)^2 + \lambda \|f\|_{\mathcal{H}}^2$

as an estimate of $f^* = \arg\min_{f \in \mathcal{H}} \mathbb{E}[(f(x) - y)^2]$.

This minimization problem has a closed-form solution:

    $\hat{f} = \sum_{i=1}^{N} \alpha_i \, k(x_i, \cdot)$, where $\alpha = (K + \lambda N I)^{-1} y$,

and $K$ is the $N \times N$ kernel matrix defined by $K_{ij} = k(x_i, x_j)$.
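
A direct translation of this closed form into NumPy could look like the sketch below (solving the linear system rather than forming the inverse explicitly). The helper names and the generic `kernel(A, B)` interface are assumptions for illustration; the $O(N^3)$ cost of this solve is what the rest of the talk addresses.

```python
import numpy as np

def fit_krr(X, y, kernel, lam):
    """Return the dual coefficients alpha = (K + lambda*N*I)^{-1} y."""
    N = X.shape[0]
    K = kernel(X, X)                                   # N x N kernel matrix
    alpha = np.linalg.solve(K + lam * N * np.eye(N), y)
    return alpha

def predict_krr(X_train, alpha, kernel, X_test):
    """Evaluate f_hat(x) = sum_i alpha_i k(x_i, x) at the test points."""
    return kernel(X_test, X_train) @ alpha
```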

Slides 10-12: Think about large datasets

The matrix inversion $\alpha = (K + \lambda N I)^{-1} y$ takes $O(N^3)$ time and $O(N^2)$ memory, which can be prohibitively expensive when $N$ is large.

Fast approaches to compute kernel ridge regression:
1. Low-rank matrix approximation:
   - Kernel PCA.
   - Incomplete Cholesky decomposition.
   - Nyström sampling.
2. Iterative optimization algorithms:
   - Gradient descent.
   - Conjugate gradient methods.

However, none of these methods has been shown to achieve the same level of accuracy as the exact algorithm.

Slides 13-15: Our main idea

Only keep the diagonal blocks of the kernel matrix, so that the matrix inversion is fast.

[Figure: the full kernel matrix $K$, partitioned into blocks $K_{ij}$, is randomly shuffled (rows and columns permuted together) and then block-diagonalized: only the diagonal blocks are kept and all off-diagonal blocks are dropped.]
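
As a toy illustration of the picture above (the names are mine, and Fast-KRR never actually materializes the full $N \times N$ matrix), one can shuffle the kernel matrix and zero out its off-diagonal blocks:

```python
import numpy as np

def block_diagonal_approx(K, m, rng):
    """Shuffle K symmetrically and keep only m diagonal blocks."""
    N = K.shape[0]
    perm = rng.permutation(N)
    K_shuf = K[np.ix_(perm, perm)]          # random shuffle of rows and columns
    K_block = np.zeros_like(K_shuf)
    for idx in np.array_split(np.arange(N), m):
        K_block[np.ix_(idx, idx)] = K_shuf[np.ix_(idx, idx)]
    return K_block
```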

Slides 16-19: Fast kernel ridge regression (Fast-KRR)

We propose a divide-and-conquer approach:

1. Divide the set of samples $\{(x_1, y_1), \ldots, (x_N, y_N)\}$ evenly and uniformly at random into $m$ disjoint subsets $S_1, \ldots, S_m \subset \mathcal{X} \times \mathbb{R}$.
2. For each $i = 1, 2, \ldots, m$, compute the local KRR estimate

       $\hat{f}_i := \arg\min_{f \in \mathcal{H}} \Big\{ \underbrace{\tfrac{1}{|S_i|} \sum_{(x,y) \in S_i} (f(x) - y)^2}_{\text{local risk}} + \underbrace{\lambda \|f\|_{\mathcal{H}}^2}_{\text{under-regularized}} \Big\}$.

3. Average the local estimates and output $\bar{f} = \frac{1}{m} \sum_{i=1}^{m} \hat{f}_i$.

Computation time: $O(N^3 / m^2)$; memory space: $O(N^2 / m^2)$.
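
Under the same assumptions as the exact-KRR sketch earlier (a generic `kernel(A, B)` helper; illustrative names), the divide-and-conquer procedure might be sketched as follows. Each partition solves a small $n \times n$ system with $n = N/m$, which is where the $O(N^3/m^2)$ time and $O(N^2/m^2)$ memory come from.

```python
import numpy as np

def fit_fast_krr(X, y, kernel, lam, m, rng):
    """Split the data into m random subsets and fit a local KRR on each."""
    N = X.shape[0]
    parts = np.array_split(rng.permutation(N), m)
    models = []
    for idx in parts:
        Xi, yi = X[idx], y[idx]
        n = len(idx)
        Ki = kernel(Xi, Xi)                                  # small n x n block
        alpha_i = np.linalg.solve(Ki + lam * n * np.eye(n), yi)
        models.append((Xi, alpha_i))
    return models

def predict_fast_krr(models, kernel, X_test):
    """Average the m local predictors: f_bar(x) = (1/m) * sum_i f_hat_i(x)."""
    preds = [kernel(X_test, Xi) @ alpha_i for Xi, alpha_i in models]
    return np.mean(preds, axis=0)
```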

Slides 20-24: Theoretical result

Theorem. With $m$ splits, Fast-KRR achieves the mean square error

    $\mathbb{E}\big[ \|\bar{f} - f^*\|_2^2 \big] \le C \Big( \underbrace{\lambda \|f^*\|_{\mathcal{H}}^2}_{\text{squared bias}} + \underbrace{\tfrac{\gamma(\lambda)}{N}}_{\text{variance}} \Big) + T(\lambda, m)$.

- $\lambda \|f^*\|_{\mathcal{H}}^2$ is the bias introduced by regularization.
- $T(\lambda, m)$ becomes a higher-order, negligible term when $m$ is below the threshold $m \lesssim N / \gamma^2(\lambda)$.
- $\gamma(\lambda)$ is the effective dimensionality: let $\mu_1 \ge \mu_2 \ge \ldots$ be the sequence of eigenvalues in kernel $k$'s eigen-expansion; then

      $\gamma(\lambda) = \sum_{k=1}^{\infty} \frac{\mu_k}{\lambda + \mu_k}$.
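
To get a feel for $\gamma(\lambda)$, one can estimate it from data. Using the eigenvalues of the empirical kernel matrix $K/N$ as plug-in estimates of the kernel operator's eigenvalues is my own shortcut here, not something stated in the talk:

```python
import numpy as np

def effective_dimensionality(K, lam):
    """Estimate gamma(lambda) = sum_k mu_k / (lambda + mu_k) from an N x N kernel matrix."""
    N = K.shape[0]
    mu = np.linalg.eigvalsh(K / N)      # empirical eigenvalue estimates
    mu = np.clip(mu, 0.0, None)         # guard against tiny negative values
    return float(np.sum(mu / (lam + mu)))
```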

Slides 25-27: Apply to specific kernels

Corollary (polynomial kernel). If $m \le c N / \log N$, then

    $\mathbb{E}\big[ \|\bar{f} - f^*\|_2^2 \big] = O\!\left( \tfrac{1}{N} \right)$ (minimax optimal rate).

Time: $O(N^3) \to O(N \log^2 N)$. Space: $O(N^2) \to O(\log^2 N)$.

Corollary (Gaussian kernel). If $m \le c N / \log^2 N$, then

    $\mathbb{E}\big[ \|\bar{f} - f^*\|_2^2 \big] = O\!\left( \tfrac{\log N}{N} \right)$ (minimax optimal rate).

Time: $O(N^3) \to O(N \log^4 N)$. Space: $O(N^2) \to O(\log^4 N)$.

Corollary (Sobolev kernel of smoothness $\nu$). If $m \le c N^{\frac{2\nu - 1}{2\nu + 1}} / \log N$, then

    $\mathbb{E}\big[ \|\bar{f} - f^*\|_2^2 \big] = O\!\left( N^{-\frac{2\nu}{2\nu + 1}} \right)$ (minimax optimal rate).

Time: $O(N^3) \to O(N^{\frac{2\nu + 5}{2\nu + 1}} \log^2 N)$. Space: $O(N^2) \to O(N^{\frac{4}{2\nu + 1}} \log^2 N)$.

Slide 28: Simulation study

Data $(x, y)$ is generated by $y = \min(x, 1 - x) + \varepsilon$, where $\varepsilon \sim N(0, 0.2)$.

[Figure: scatter plot of the simulated data, $y$ versus $x$.]
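
A sketch of the data-generating process on this slide; drawing $x$ uniformly on $[0, 1]$ is my assumption, since the slide only specifies the regression function and the noise:

```python
import numpy as np

def make_data(N, rng):
    """Simulate y = min(x, 1 - x) + noise, with x uniform on [0, 1] (assumed)."""
    x = rng.uniform(0.0, 1.0, size=N)
    y = np.minimum(x, 1.0 - x) + rng.normal(0.0, 0.2, size=N)
    return x, y

rng = np.random.default_rng(0)
x_train, y_train = make_data(1024, rng)
```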

Slides 29-30: Compare Fast-KRR and exact KRR

We use a Sobolev kernel of smoothness 1 to fit the data.

[Figure: mean square error versus total number of samples $N$, for $m = 1$ (exact KRR), $4$, $16$, $64$.]

Fast-KRR's performance is very close to exact KRR for $m \le 16$.

Slides 31-32: Threshold for data partitioning

Mean square error is plotted for varied choices of $m$.

[Figure: mean square error versus $\log(\text{\# of partitions}) / \log(\text{\# of samples})$, for $N = 256, 512, 1024, 2048, 4096, \ldots$]

As long as $m \lesssim N^{0.45}$, the accuracy is not hurt.

Slides 33-35: Conclusion

- We propose a divide-and-conquer approach to kernel ridge regression that leads to substantial reductions in computation time and memory space.
- The proposed algorithm achieves the optimal convergence rate for the full sample size $N$, as long as the partition number $m$ is not too large.
- As concrete examples, our theory guarantees that $m$ may grow polynomially in $N$ for Sobolev spaces, and nearly linearly in $N$ for finite-rank kernels and Gaussian kernels.
