Mathematical Themes in Economics, Machine Learning, and Bioinformatics


Matt Bogard, Western Kentucky University. From the SelectedWorks of Matt Bogard, 2010. Available at: https://works.bepress.com/matt_bogard/7/

Graduate students in economics are often introduced to some very useful mathematical tools that many outside the discipline may not associate with training in economics. This essay looks at some of these tools and concepts, including constrained optimization, separating hyperplanes, supporting hyperplanes, and duality, and explores applications of these tools in machine learning and bioinformatics.

Nearly all economics students are introduced to the utility maximization problem (UMP), in which a utility function, say $U(x,y)$, is maximized subject to a constraint such as the budget constraint $I - P_x X - P_y Y = 0$. They are also introduced to the expenditure minimization problem (EMP), in which total expenditure, say $E = P_x X + P_y Y$, is minimized subject to a utility constraint $U(x,y) = u$. By the Lagrange multiplier method it turns out that the optimal consumption vector in the UMP is the same as in the EMP; in other words, the UMP and the EMP are dual problems. The demand function is derived from the UMP, and the expenditure function is derived from the EMP.

Typically, first-semester graduate students see the second-order conditions for the UMP and the EMP stated in terms of the bordered Hessian matrix: for a maximum in a constrained optimization problem the Hessian of the Lagrangian must be negative definite on the constraint set, while for a constrained minimization problem it must be positive definite.
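To make the duality concrete, here is a minimal numerical sketch, assuming a Cobb-Douglas utility $U(x,y) = x^{0.5} y^{0.5}$ with made-up prices and income (illustrative choices, not from the essay). Solving the UMP and then feeding the maximized utility into the EMP as its constraint recovers the same bundle:

```python
# A minimal sketch of UMP/EMP duality, assuming Cobb-Douglas utility
# U(x, y) = x^0.5 * y^0.5 and made-up prices/income (illustrative
# choices, not from the essay). SLSQP handles the equality constraints.
from scipy.optimize import minimize

px, py, income = 2.0, 3.0, 12.0
U = lambda b: b[0] ** 0.5 * b[1] ** 0.5

# UMP: maximize U(x, y) subject to the budget constraint I - px*x - py*y = 0.
ump = minimize(lambda b: -U(b), x0=[1.0, 1.0],
               constraints={"type": "eq",
                            "fun": lambda b: income - px * b[0] - py * b[1]},
               bounds=[(1e-6, None)] * 2)
u_star = U(ump.x)  # maximized utility

# EMP: minimize expenditure E = px*x + py*y subject to U(x, y) = u*.
emp = minimize(lambda b: px * b[0] + py * b[1], x0=[1.0, 1.0],
               constraints={"type": "eq", "fun": lambda b: U(b) - u_star},
               bounds=[(1e-6, None)] * 2)

# The dual problems share the same optimal bundle, here roughly (3, 2).
print(ump.x, emp.x)
```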

Later in graduate school, if economics students use Mas-Colell, Whinston, and Green's Microeconomic Theory, they will be introduced to a different concept of duality. The book's mathematical appendix states two important theorems as follows:

Separating Hyperplane Theorem (image adapted from Green, 1995): Suppose $A, B \subset \mathbb{R}^N$ and $A \cap B = \emptyset$. Suppose $B$ is a convex set and closed, and that $x \notin B$. Then there is a $p \in \mathbb{R}^N$ with $p \neq 0$, and a value $c \in \mathbb{R}$, such that $p \cdot x > c$ and $p \cdot y < c$ for every $y \in B$. That is, there is a hyperplane that separates $A$ and $B$, leaving them on opposite sides. (Green, 1995)

Supporting Hyperplane Theorem (image adapted from Green, 1995): Suppose $B \subset \mathbb{R}^N$ is convex and that $x$ is not an element of the interior of $B$ (i.e., $x \notin \operatorname{Int} B$). Then there is a $p \in \mathbb{R}^N$ with $p \neq 0$ such that $p \cdot x \geq p \cdot y$ for every $y \in B$; that is, the hyperplane $H_{p,c}$ with $c = p \cdot x$ supports the set $B$. (Green, 1995)

APPLICATION: Consumer Choice Theory from Economics

As Mas-Colell states, a consequence of the separating hyperplane theorem is that if $K$ is a closed convex set and $x \notin K$, then there is a half space (defined by a normal vector and a hyperplane) containing $K$ and excluding $x$.
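A minimal sketch of the construction behind the theorem, with the closed unit ball standing in for the convex set $B$ (my choice, since its projection map has a closed form): projecting the outside point onto $B$ yields the normal vector $p$, and placing the hyperplane through the midpoint separates strictly:

```python
# A sketch of the separating hyperplane construction, with the closed
# unit ball standing in for B (an illustrative choice: its projection
# has the closed form x / max(||x||, 1)).
import numpy as np

def separate_from_unit_ball(x):
    proj = x / max(np.linalg.norm(x), 1.0)  # projection of x onto the ball
    p = x - proj                            # normal vector; p != 0 since x lies outside B
    c = p @ (x + proj) / 2.0                # evaluate the functional at the midpoint
    return p, c

x = np.array([3.0, 4.0])                    # a point outside the unit ball
p, c = separate_from_unit_ball(x)
print(p @ x > c)                            # True: p.x > c on one side
# Every y in the ball satisfies p.y <= ||p|| < c here, so p.y < c on the other side.
```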

(Image adapted from Green, 1995.) That is, a closed convex set can be dually described as the intersection of the half spaces that contain it. A support function for $K$ is then defined as

$\mu_K(p) = \inf \{ p \cdot x : x \in K \}$, with $K = \{ x \in \mathbb{R}^L : p \cdot x \geq \mu_K(p) \text{ for every } p \}$.

It turns out that the expenditure function $e(p,u)$ derived from the expenditure minimization problem in microeconomics is the support function for the set $\{ x : u(x) \geq u \}$.
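As a small illustration of the support function and the dual description of $K$, here is a sketch for a compact convex polytope (the unit square, an illustrative choice); a linear functional attains its infimum over a polytope at a vertex, so minimizing over the vertex list computes $\mu_K$ exactly:

```python
# A sketch of the support function mu_K(p) = inf{ p.x : x in K } for a
# compact convex polytope: a linear functional attains its infimum over
# K at a vertex, so minimizing over the vertex list suffices. The unit
# square is an illustrative choice of K.
import numpy as np

vertices = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)

def mu_K(p):
    return float(np.min(vertices @ p))

# Dual description: x lies in K iff p.x >= mu_K(p) for every p. Checking
# finitely many directions gives an outer approximation of K.
def in_K(x, directions):
    return all(d @ x >= mu_K(d) - 1e-12 for d in directions)

angles = np.linspace(0.0, 2.0 * np.pi, 64)
dirs = [np.array([np.cos(t), np.sin(t)]) for t in angles]
print(in_K(np.array([0.5, 0.5]), dirs))  # True: interior point
print(in_K(np.array([1.5, 0.5]), dirs))  # False: cut off by some half space
```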

APPLICATION: Support Vector Machines

SVMs select, from among all separating hyperplanes, the hyperplane that maximizes the margin of separation between two classes. The SVM classifies inputs into two classes using a hyperplane in a high-dimensional space (Wu, 2006). Ultimately, this machine learning or data mining classifier makes use of the same concepts from mathematical economics that many economics graduate students are familiar with: supporting hyperplanes, separating hyperplanes, and constrained optimization.

According to Fletcher, an SVM involves deriving $w$ and $b$ and using them to describe the training data $\{x_i, y_i\}$ as follows:

$x_i \cdot w + b \geq +1$ for $y_i = +1$ and $x_i \cdot w + b \leq -1$ for $y_i = -1$, or, combined, $y_i (x_i \cdot w + b) - 1 \geq 0$.

The margin is defined using vector geometry to be $1/\|w\|$, where $w$ is a vector normal to the separating hyperplane, which is equidistant from the supporting hyperplanes (described by the support vectors) that support the two classes of observations in the training data. Maximizing the margin becomes a constrained optimization problem solved with the Lagrange multiplier method familiar to economics students. In setting up the Lagrangian, the term $\alpha^T H \alpha$ arises, where $H_{ij} = y_i y_j \, x_i \cdot x_j$. In this context, the linear kernel is simply the dot product, $k(x_i, x_j) = x_i^T x_j$. If the data are not linearly separable, alternative kernel functions are used to map them into a feature space where they are linearly separable; examples include radial basis function, polynomial, and sigmoid kernels. The kernel matrix must be positive semidefinite.
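A brief sketch of the hard-margin setup on toy separable data, using scikit-learn's SVC with a very large $C$ to approximate the hard margin (the data and the value of $C$ are illustrative choices, not from Fletcher): the fitted $w$ and $b$ satisfy the constraints above, and the matrix $H$ from the dual Lagrangian is positive semidefinite:

```python
# A sketch of the hard-margin formulation on toy separable data, using
# scikit-learn's SVC with a very large C to approximate the hard margin
# (the data and C value are illustrative choices). The margin 1/||w||
# and the matrix H_ij = y_i y_j x_i . x_j from the dual Lagrangian are
# computed explicitly.
import numpy as np
from sklearn.svm import SVC

X = np.array([[1.0, 1.0], [2.0, 2.5], [0.5, 2.0],         # class +1
              [-1.0, -1.0], [-2.0, -1.5], [-0.5, -2.0]])  # class -1
y = np.array([1, 1, 1, -1, -1, -1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)
w, b = clf.coef_[0], clf.intercept_[0]

print("margin 1/||w||:", 1.0 / np.linalg.norm(w))
print("constraints y_i(x_i.w + b) - 1 >= 0:", np.all(y * (X @ w + b) - 1 >= -1e-6))
print("support vectors:", clf.support_vectors_)

H = np.outer(y, y) * (X @ X.T)  # H_ij = y_i y_j x_i . x_j
print("H positive semidefinite:", bool(np.all(np.linalg.eigvalsh(H) >= -1e-9)))
```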

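And a sketch of kernel substitution on data that are not linearly separable, using sklearn's concentric-circles generator (an illustrative dataset): the linear kernel fails, while the radial basis function kernel $k(x, x') = \exp(-\gamma \|x - x'\|^2)$ succeeds, and its Gram matrix passes the positive semidefiniteness (Mercer) check:

```python
# A sketch of kernel substitution: concentric circles are not linearly
# separable, but the RBF kernel k(x, x') = exp(-gamma * ||x - x'||^2)
# implicitly maps them into a feature space where a hyperplane works.
import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC

X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf", gamma=1.0).fit(X, y)

print("linear kernel training accuracy:", linear.score(X, y))  # near chance
print("RBF kernel training accuracy:", rbf.score(X, y))        # near 1.0

# Mercer check: the RBF Gram matrix is positive semidefinite.
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-1.0 * sq)
print("Gram matrix PSD:", bool(np.all(np.linalg.eigvalsh(K) >= -1e-9)))
```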
APPLICATION: Bioinformatics

In "Comparing Kernels for Predicting Protein Binding Sites from Amino Acid Sequence," Wu et al. recognize that kernel functions in SVMs allow practitioners to capitalize on domain-specific knowledge. They compare three kernel types, the identity kernel, a sequence alignment kernel, and an amino acid substitution matrix kernel, for predicting protein-protein, protein-DNA, and protein-RNA binding sites. They are careful to ensure that their kernel formulations satisfy the Mercer conditions, which require kernel matrices to be positive semidefinite.

References:

Mas-Colell, A., Whinston, M., and Green, J. Microeconomic Theory. Oxford University Press, 1995.

Wu, F., Olson, B., Dobbs, D., and Honavar, V. "Comparing Kernels for Predicting Protein Binding Sites from Amino Acid Sequence." IEEE Joint Conference on Neural Networks, Vancouver, Canada. IEEE Press, 2006.

Fletcher, Tristan. Support Vector Machines Explained. March 1, 2009.