Instructor: Dr. Benjamin Thompson Lecture 20: 31 March 2009

1 Instructor: Dr. Benjamin Thompson Lecture 20: 31 March 2009

2 Announcements
MIDTERMS ARE GRADED! Yay! I will return them AFTER the lecture, so that you pay attention to the lecture. I'm sadistic like that.
Some statistics:
- Mean score: 134 (out of 160 total points)
- Median:
- Standard deviation: 36.3 points
- Skew:
- Kurtosis:
Oh, and: NO CLASS ON THURSDAY

3 Last Time
You were taking the midterm, remember? I hope you do. Maybe you've blocked it out. Painful memories are like that.

4 Today
Chapter 6: Support Vector Machines
- Motivation: Optimal Separation of Classes
- SVMs and You
- Constrained Optimization
- The Method of Lagrange Multipliers
- The Optimal Hyperplane for Nonseparable Patterns
Midterm Debriefing

5

6 After 11 a.m. versus never is a pretty optimal separation of classes for many of us

7 Rosenblatt Refresher
Recall that, for the Rosenblatt Perceptron, the goal was linear hyperplane separation of two classes.
The RP was a "good enough" approach:
- The learning process terminated once all training patterns were successfully classified
- Many possible solutions (see next slide)
- No measure of how one hyperplane is better/worse than any other, so long as all patterns are properly classified
In other words: the RP is a sufficient classifier, but not an optimal classifier.

8 Non-Uniqueness of the RP Hyperplane
How much do these points contribute to the overall position/orientation of the hyperplane? How about these? That's a hint about where we're going...

9 So You Want Optimality?
So we want a hyperplane that is somehow better than all the other possible hyperplanes. Subject to it correctly classifying all the points, of course.
"Better" is a loaded term. Any suggestions?
We'll need a few tools to develop a clear idea of the "best" hyperplane.

10 Our Dear Old Friend
Recall the definition of the separating (linear) hyperplane: $w \cdot x + b = 0$
Now, let's assume we only have two classes we're interested in separating, $C_1$ and $C_2$. Further, let's assume fixed outputs for each class:
- Each member of class $C_1$ should yield a +1
- Each member of class $C_2$ should yield a -1
Putting those together, we have:
$w \cdot x_i + b \geq 0$ for $d_i = +1$
$w \cdot x_i + b < 0$ for $d_i = -1$
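
A minimal numeric illustration of that decision rule (a sketch in NumPy; the weight vector, bias, and points are made-up values, not from the lecture):

```python
import numpy as np

w = np.array([2.0, -1.0])   # hypothetical weight vector
b = 0.5                     # hypothetical bias

def classify(x):
    """Return +1 (class C1) or -1 (class C2) per the sign convention above."""
    return 1 if np.dot(w, x) + b >= 0 else -1

print(classify(np.array([1.0, 1.0])))    # w.x + b = 1.5  -> +1 (C1)
print(classify(np.array([-1.0, 1.0])))   # w.x + b = -2.5 -> -1 (C2)
```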

11 On the Margin
Consider a point $x_j$ very close to the hyperplane. In this case, $w \cdot x_j + b \approx 0$.
Suppose the approximation is exact: that is, suppose the point yields an output that is exactly zero. This will classify as class $C_1$, but only because we defined it that way.
Now extend that to something that is only slightly non-zero. It's clear from the hyperplane equation that a correspondingly small change in $b$ or $w$ would fudge that result as well.
In other words: points very close to the hyperplane may result in confusion if there is any noise in the system.
So: if points close to the decision boundary cause us problems, we should construct the hyperplane so that all the points in either class are as far away from the hyperplane as possible!

12 A Cautionary Tale.

13 Who Knew?
The function defined by $g(x_j) = w \cdot x_j + b$ is known as the discriminant function, and is a measure of the algebraic distance from the point $x_j$ to the hyperplane defined by $w$.
To understand this, first we need to take a look at some arbitrary vector $x$ and break it up into two components:
- $x_p$, the normal projection of $x$ onto the hyperplane defined by $w$
- $x_r$, the component of $x$ perpendicular to the hyperplane (i.e., along $w$), starting from $x_p$

14 In Pictures
This is easiest to picture when $b = 0$. [Figure: the hyperplane $w \cdot x = 0$ with a point $x_j$, its projection $x_{j,p}$ onto the hyperplane, and the perpendicular component $x_{j,r}$.]
So it should be clear that $x_r$ is somehow a measure of what we're looking for: or, more specifically, the length of $x_r$!

15 In Pictures
But this changes in non-obvious ways when $b \neq 0$. [Figure: the shifted hyperplane $w \cdot x + b = 0$ with $x_j$, $x_{j,p}$, and $x_{j,r}$.]
Note: since I just shifted everything up by some value $b$, $x_p$ changed, but $x_r$ did not!

16 Insight Into the Inner Product
Recall that $w \cdot x$ is an inner product between the two vectors $w$ and $x$.
Suppose that one of the vectors (let's assume $w$) has unit length. That is, $\|w\| = 1$. This is known as a unit vector.
When this condition holds, the inner product of $x$ with that unit vector tells you exactly how much of $x$ lies along the direction of $w$.
Note: if neither vector is a unit vector, you can make it a unit vector by setting: $w_u = \frac{w}{\|w\|}$
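
A quick sketch of that projection property in NumPy (the vectors here are arbitrary examples):

```python
import numpy as np

w = np.array([3.0, 4.0])
x = np.array([2.0, 1.0])

w_u = w / np.linalg.norm(w)   # unit vector: ||w_u|| = 1
along_w = np.dot(x, w_u)      # how much of x lies along the direction of w

print(np.linalg.norm(w_u))    # 1.0
print(along_w)                # (2*3 + 1*4) / 5 = 2.0
```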

17 A Little Math Never Hurt Anybody
Because $x_p$ lies on the hyperplane defined by $w$, $g(x_p)$ is, by definition, zero.
Additionally, the direction of a vector normal to (perpendicular to) some plane defined by $w$ is just $w$!
So we may rewrite an arbitrary vector $x$ as $x = x_p + x_r$, and we may rewrite $x_r$ as some vector of length $r$ pointing in the direction of $w$, or:
$x = x_p + r \frac{w}{\|w\|}$
Remember, the direction of $w$ is perpendicular to the plane defined by $w$.

18 A Little More Math Never Hurt Anybody, Except Small Woodland Creatures and the Very Young
Just for laughs, let's evaluate $g(x)$ using $x = x_p + r\frac{w}{\|w\|}$ and $g(x_p) = 0$:
$g(x) = w \cdot x + b = w \cdot x_p + b + r\frac{w \cdot w}{\|w\|} = g(x_p) + r\|w\| = r\|w\|$
which just simplifies to:
$r = \frac{g(x)}{\|w\|}$
In other words: the perpendicular distance from the hyperplane defined by $w$ to any point $x$ is just the discriminant function $g(x)$ divided by the norm of the weight vector $w$!
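
A small numeric check of that distance formula (NumPy; $w$, $b$, and $x$ are invented values):

```python
import numpy as np

w = np.array([3.0, 4.0])
b = -5.0
x = np.array([2.0, 3.0])

g = np.dot(w, x) + b               # discriminant function g(x)
r = g / np.linalg.norm(w)          # signed perpendicular distance to the hyperplane

# Sanity check: stepping back by r along w/||w|| should land on the hyperplane
x_p = x - r * w / np.linalg.norm(w)
print(r)                           # (6 + 12 - 5) / 5 = 2.6
print(np.dot(w, x_p) + b)          # ~0.0
```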

19 At Last, a Goal!
So now the goal becomes clear: we want to find a hyperplane defined by some $w$ and $b$ that maximizes this value $r$, thus maximizing the margin of separation $\rho_o$, over all the patterns to be classified!
Note: $\rho_o$ is actually $2r$, since $\rho_o$ is defined as the worst-case minimum distance between any two data points from different classes.
In other words, suppose we have some $\{x_j\}$ that are closest to the hyperplane defined by $w$: we want to find the $w$ and $b$ that maximize the distance $\rho_o$ for the minimally separated patterns $\{x_j\}$.
From now on, we will refer to these best parameters as $\{w_o, b_o\}$.

20 A Bit of a Trick
We may cleverly (or cheaply, depending on your mood) phrase the conditions of the previous slide as a pair of constraints on our choice of $\{w_o, b_o\}$:
$w_o \cdot x_i + b_o \geq +1$ for $d_i = +1$
$w_o \cdot x_i + b_o \leq -1$ for $d_i = -1$
Some discussion of these equations on the next slide.
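
Note that the pair collapses into the single condition $d_i (w_o \cdot x_i + b_o) \geq 1$. A minimal vectorized check of that combined form (NumPy; the toy data set is invented):

```python
import numpy as np

w, b = np.array([1.0, 1.0]), -3.0
X = np.array([[3.0, 2.0],    # class +1
              [4.0, 1.0],    # class +1
              [1.0, 1.0],    # class -1
              [0.0, 2.0]])   # class -1
d = np.array([1, 1, -1, -1])

margins = d * (X @ w + b)     # d_i * (w.x_i + b) for every pattern
print(margins)                # [2. 2. 1. 1.]
print(np.all(margins >= 1))   # True: both constraints hold for all patterns
```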

21 Remember When?
Recall that the hyperplane of separation is defined by the equation $w_o \cdot x + b_o = 0$.
So if we multiply both sides of that equation by an arbitrary constant, the dividing plane doesn't change at all, while $w_o$ and $b_o$ are both scaled by that constant.
- Yes, this does imply that the optimal solution is still non-unique
- But the separating plane defined by that optimal solution is unchanged!
So the constraints are really saying: we require that our classifier always produce some positive constant (or greater) for Class 1 inputs, and some negative constant (or less) for Class 2 inputs. Then we just scale the optimal parameters so that the positive and negative constants become exactly +1 and -1.

22 A Veritable Dichotomy!
Given the constraints
$w_o \cdot x_i + b_o \geq +1$ for $d_i = +1$
$w_o \cdot x_i + b_o \leq -1$ for $d_i = -1$
there are two types of vectors in our data set $\{x_i, d_i\}$:
- Those that fulfill the strict inequality
- Those for which the equality holds
The vectors that fulfill the equality above are those that lie closest to the separating boundary. Indeed, these are the points that define the boundary, given the above constraints.

23 Support Vectors, Ahoy!
As our previous demonstration hints at, the vectors that fulfill the strict inequality are irrelevant! All that matters (to define our boundary) is the vectors that fit the exact equality condition:
$w_o \cdot x^{(s)} + b_o = +1$ for $d^{(s)} = +1$
$w_o \cdot x^{(s)} + b_o = -1$ for $d^{(s)} = -1$
The data points for which this holds are known as support vectors and are denoted as $x^{(s)}$.

24 Immediate Ramifications
Because of the equality in the constraint, for any support vector our discriminant function becomes:
$g(x^{(s)}) = w_o \cdot x^{(s)} + b_o = \pm 1$
so we may write:
$r = \frac{g(x^{(s)})}{\|w_o\|} = \pm\frac{1}{\|w_o\|}$
and therefore
$\rho_o = 2r = \frac{2}{\|w_o\|}$
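
To see support vectors and this margin come out of an off-the-shelf solver, here is a sketch using scikit-learn's SVC (not part of the lecture; a large C value approximates the hard-margin problem, and the toy data is invented):

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[3.0, 2.0], [4.0, 1.0], [1.0, 1.0], [0.0, 2.0]])
d = np.array([1, 1, -1, -1])

clf = SVC(kernel="linear", C=1e6).fit(X, d)   # large C ~ hard margin
w_o, b_o = clf.coef_[0], clf.intercept_[0]

print(clf.support_vectors_)       # the x^(s) lying on the margin
print(2 / np.linalg.norm(w_o))    # rho_o = 2 / ||w_o||
```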

25 The Consequence!
Remember, the goal was to maximize the margin of separation,
$\rho_o = \frac{2}{\|w_o\|}$
Clearly, we can maximize $\rho_o$ by minimizing the norm of the parameter vector $w_o$!
Of course, this must be subject to the constraints
$w_o \cdot x^{(s)} + b_o = +1$ for $d^{(s)} = +1$
$w_o \cdot x^{(s)} + b_o = -1$ for $d^{(s)} = -1$
which of course leads us to...

26 Like unconstrained optimization, only less prone to violent outbursts.

27 Optimization We've Seen
So far, the only optimization problems we've seen have been unconstrained optimization problems. That is: given some objective function $E(w)$, find the minimum (or maximum) of that function with respect to every possible choice of $w$.
Sometimes, the choice of $w$ which truly optimizes the objective function may not be a useful/realistic/allowable value. That is, there are constraints on the choices of $w$ that we may use to optimize our cost function.

28 Optimization with Constraints
Simple example: minimize the cost function
$E(w) = w \cdot w$
subject to the constraint $w \cdot \alpha \geq \beta$.
Here, the unconstrained optimization of $E(w)$ is trivial: this is just a scaled $\|w\|^2$, which takes a minimum value of zero at $w = 0$. However, the boundary of the constraint, $w \cdot \alpha = \beta$, defines a plane (this time parameterized by the vector $\alpha$ and scalar $\beta$).
Next slide: visualizing this for a 2-D parameter vector.
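
A hedged sketch of this toy problem with SciPy ($\alpha$ and $\beta$ are arbitrary example values; for this quadratic cost with one active linear constraint, the analytic answer is the projection $w = \beta \alpha / \|\alpha\|^2$):

```python
import numpy as np
from scipy.optimize import minimize

alpha, beta = np.array([1.0, 2.0]), 4.0

E = lambda w: np.dot(w, w)    # cost function E(w) = w.w
cons = {"type": "ineq",       # w.alpha - beta >= 0, i.e. w.alpha >= beta
        "fun": lambda w: np.dot(w, alpha) - beta}

res = minimize(E, x0=np.zeros(2), constraints=[cons])
print(res.x)                                # ~[0.8, 1.6]
print(beta * alpha / np.dot(alpha, alpha))  # analytic solution: [0.8, 1.6]
```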

29 Constrained Optimization in Pictures
- Where is the solution if it is a minimization problem subject to a "greater than" constraint?
- Where is the solution if it is a minimization problem subject to a "less than" constraint?
- Where is the solution if it is a maximization problem subject to a "greater than" constraint?
- Where is the solution if it is a maximization problem subject to a "less than" constraint?
