Manuel Oviedo de la Fuente and Manuel Febrero Bande
1 Supervised classification methods with the fda.usc package. Manuel Oviedo de la Fuente and Manuel Febrero Bande. Universidade de Santiago de Compostela. CNTG (Centro de Novas Tecnoloxías de Galicia), Santiago de Compostela, October 3-4, 2014.
2 Introduction. Multivariate real data example: iris data. This data set gives the measurements in centimeters of the variables sepal length and width and petal length and width, respectively, for 50 flowers from each of 3 species of iris. The species are Iris setosa, versicolor and virginica. Variables: Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, Species. October, 2014 2 / 29
3 Introduction. Multivariate simulation data example: scatterplot of two uniform samples.
4 Introduction. Functional real data example: Tecator data. 215 spectrometric curves of meat samples with fat, water and protein contents. Goal: explain the fat content through the spectrometric curves.
data(tecator)
par(mfrow = c(1, 3))
fat15 <- ifelse((y <- tecator$y$fat) < 15, 2, 4)
boxplot(y, main = "Fat")
plot((x <- tecator$absorp), col = fat15, main = "Spectrometric: X")
plot((x.d <- fdata.deriv(tecator$absorp, 1)), col = fat15, main = "Derivative: X.d")
[Figure: boxplot of fat content; spectrometric curves (absorbances vs. wavelength); derivative curves.]
5 Introduction. Functional simulation data example. [Figure: three panels (Model 1, Model 2, Model 3) of simulated curves X(t) versus t.] A sample of 40 functions (Ornstein-Uhlenbeck process with Gaussian error) for each simulation model, along with the means of the two sub-groups, G1 (black lines) and G2 (red lines).
6 Introduction. Some definitions of functional data. Functional data analysis is a branch of statistics that analyzes data providing information about curves, surfaces or anything else varying over a continuum. The continuum is often time, but may also be spatial location, wavelength, probability, etc. Functional data analysis is a branch of statistics concerned with analysing data in the form of functions [Ferraty and Vieu, 2006].
Definition 1.1. A random variable X is called a functional variable if it takes values in a functional space E, a complete normed (or seminormed) space.
Definition 1.2. A functional dataset {X_1, ..., X_n} is the observation of n functional variables X_1, ..., X_n identically distributed as X.
7 Introduction. Example of a functional dataset in fda.usc. Load the library fda.usc [Febrero-Bande and Oviedo de la Fuente, 2012]. An object of class fdata is a list with the following components:
- data: typically a matrix of dimension (n x m) which contains a set of n curves discretized in m points (argvals).
- argvals: locations of the discretization points, by default {t_1 = 1, ..., t_m = m}.
- rangeval: range of the discretization points.
- names: list with an overall title (main), a title for the x axis (xlab) and a title for the y axis (ylab).
library(fda.usc)
data(tecator)
class(tecator$absorp.fdata)
[1] "fdata"
names(tecator$absorp.fdata)
[1] "data"     "argvals"  "rangeval" "names"
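The list layout above can be mimicked in plain base R. The following is an illustrative sketch of an fdata-like object built by hand, assuming only the component names listed on the slide; the real constructor is fda.usc::fdata.

```r
# Sketch of an fdata-like list in base R (illustrative only; the real
# constructor is fda.usc::fdata).
n <- 5; m <- 100
argvals <- seq(0, 1, length.out = m)
# n curves discretized at the m grid points, stored row-wise
X <- t(sapply(1:n, function(i) sin(2 * pi * argvals) + rnorm(m, sd = 0.1)))
fd <- list(data     = X,
           argvals  = argvals,
           rangeval = range(argvals),
           names    = list(main = "Simulated curves", xlab = "t", ylab = "X(t)"))
dim(fd$data)  # n curves discretized at m points
```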
8 Introduction. Example of image data: yoga dataset. The dataset was obtained by capturing two actors transitioning between yoga poses in front of a green screen. It has been shown recently that in many domains it can be useful to convert images into pseudo time series. Therefore we have converted the motion capture data into time series by a well-known technique, see [Keogh et al., 2011].
9 Introduction. From image data to functional data: the motion capture images are converted into pseudo time series by the well-known technique of [Keogh et al., 2011].
10 Supervised classification. Functional supervised classification. Aim: how to predict the class Y of a functional variable χ.
Bayes rule: given a sample X, the aim is to estimate the posterior probability of belonging to each group g:
p_g(X) = P(Y = g | χ = X) = E(1_{Y=g} | χ = X)
The classification rule: Ŷ = arg max_g p̂_g(X)
The estimate of the posterior probability p_g(X) can be calculated using logistic regression or nonparametric regression. The package allows the estimation of the groups in a training set of functional data by:
- k-nearest neighbour classifier: classif.knn
- kernel classifier: classif.kernel
- logistic classifier (linear model): classif.glm
- logistic classifier (additive model): classif.gsam and classif.gkam
- distance classifier: classif.dist
- DD classifier: classif.DD
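The nonparametric estimate of the posterior behind the k-nearest neighbour classifier can be sketched in a few lines of base R: p̂_g(X) is the proportion of the k training curves nearest to X (in L2 distance on a common grid) that belong to group g. This is an illustrative toy version, not the package's implementation; the function name knn_posterior is invented for this example.

```r
# Toy estimate of p_g(X) by k nearest neighbours (illustrative, not fda.usc):
# the posterior for group g is the fraction of the k nearest training curves
# (squared L2 distance on a shared grid) with label g.
knn_posterior <- function(Xtrain, y, xnew, k = 5) {
  d2 <- rowSums(sweep(Xtrain, 2, xnew)^2)   # distance to every training curve
  nn <- order(d2)[seq_len(k)]               # indices of the k nearest curves
  prop.table(table(factor(y[nn], levels = unique(y))))
}
set.seed(1)
tt <- seq(0, 1, length.out = 50)
g1 <- t(replicate(20, sin(2 * pi * tt) + rnorm(50, sd = 0.3)))
g2 <- t(replicate(20, cos(2 * pi * tt) + rnorm(50, sd = 0.3)))
Xtrain <- rbind(g1, g2); y <- rep(c("A", "B"), each = 20)
p <- knn_posterior(Xtrain, y, sin(2 * pi * tt), k = 7)
names(which.max(p))  # Ŷ = arg max_g p̂_g(X)
```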
11 Classification via regression models. Generalized functional linear model. The scalar response y is estimated from the functional covariates {X^q(t)}_{q=1}^{Q} and also the non-functional covariates Z = {Z_j}_{j=1}^{J} by:
y_i = g( α + Z_i β + Σ_{q=1}^{Q} ⟨ X_i^q(t), β_q(t) ⟩ ) + ε_i
where g(·) is the inverse link function and the ε_i are random errors with mean zero and finite variance σ².
- [Ramsay and Silverman, 2005] use a fixed basis representation of X(t) and β(t): B-spline, Fourier, wavelets (create.basis).
- [Cardot et al., 1999] use so-called functional principal components regression (FPC), create.pc.basis.
- [Preda et al., 2007] use so-called functional partial least squares regression (FPLS), create.pls.basis.
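The FPC idea of [Cardot et al., 1999] can be illustrated with base R only: project the discretized curves on their leading principal components and fit a logistic GLM on the scores. This is a hedged sketch on simulated data, not the fda.usc implementation (which works through create.pc.basis).

```r
# Illustrative FPC logistic regression in base R (not the fda.usc code):
# curves are reduced to principal-component scores, then a GLM is fitted.
set.seed(2)
tt <- seq(0, 1, length.out = 60)
X <- rbind(t(replicate(30, sin(2 * pi * tt) + rnorm(60, sd = 0.5))),
           t(replicate(30, 1.5 * sin(2 * pi * tt) + rnorm(60, sd = 0.5))))
y <- rep(0:1, each = 30)
pc <- prcomp(X, center = TRUE)    # principal components of the curves
scores <- pc$x[, 1:3]             # keep the first 3 FPC scores
fit <- glm(y ~ scores, family = binomial)
acc <- mean((predict(fit, type = "response") > 0.5) == y)
acc  # training accuracy of the FPC classifier
```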
12 Classification via regression models. Generalized functional linear model.
ldata <- list("df" = tecator$y, "absorp.fdata" = tecator$absorp.fdata,
              "absorp.d" = fdata.deriv(tecator$absorp.fdata))
ldata$df$fat15 <- factor(ifelse(tecator$y$fat < 15, 0, 1))
res.glm <- classif.glm(fat15 ~ absorp.d, data = ldata)
res.glm
-Call: classif.glm(formula = fat15 ~ absorp.d, data = ldata)
-Probability of correct classification:
res.glm$fit[[1]]
Call: glm(formula = pf)
Coefficients: (Intercept)  absorp.d.bspl4.1  absorp.d.bspl4.2  absorp.d.bspl4.3  absorp.d.bspl4.4  absorp.d.bspl4.5
Degrees of Freedom: 214 Total (i.e. Null); 209 Residual
Null Deviance: 298
Residual Deviance: 28.5   AIC: 40.5
13 Classification via regression models. Generalized functional additive model. Functional generalized spectral additive model (FGSAM) [Müller and Yao, 2008]:
y_i = g( α + Σ_{j=1}^{J} f_j(Z_j) + Σ_{q=1}^{Q} s_q(X_i^q(t)) ) + ε_i
where f(·) and s(·) are smooth functions.
res.gsam <- classif.gsam(fat15 ~ s(absorp.d), data = ldata)
res.gsam
-Call: classif.gsam(formula = fat15 ~ s(absorp.d), data = ldata)
-Probability of correct classification:
res.gsam$fit[[1]]
Family: binomial
Link function: logit
Formula:
[1] "fat15~+s(absorp.d.bspl4.1,k=-1)+s(absorp.d.bspl4.2,k=-1)+s(absorp.d.bspl4.3,k=-1)+s(absorp.d.bs
Estimated degrees of freedom: total = 8.5
UBRE score:
14 Classification via regression models. Generalized functional additive model. Functional generalized kernel additive model (FGKAM) [Febrero-Bande and González-Manteiga, 2013]:
y_i = g( α + Σ_{q=1}^{Q} K_q(X_i^q(t)) ) + ε_i
where K(·) is a kernel estimator (extends the kNN classifier).
res.gkam <- classif.gkam(fat15 ~ absorp.d, data = ldata)
res.gkam
-Call: classif.gkam(formula = fat15 ~ absorp.d, data = ldata)
-Probability of correct classification:
res.gkam$fit[[1]]
Family: binomial
Link function: logit
alpha = -8.9   n = 215
              h   cor(f(X),eta)   edf
f(absorp.d)
edf: equivalent degrees of freedom
AIC = 69.3   Deviance explained = 89.8%   R-sq. = 0.93   R-sq.(adj) = 0.89
15 Classification by depth functions. Classification by DD-classifier. The group classification of a training dataset using the DD-classifier proceeds in the following steps.
Step 1. The function computes the selected depth measure of the points in x (multivariate data or multivariate functional data) with respect to a subsample of each of the G level groups.
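Step 1 can be illustrated with a simplified Fraiman-Muniz-type integrated depth in base R: the depth of a curve is the average over the grid of its pointwise univariate depth. This is an assumption made for illustration; fda.usc provides its own depth functions (e.g. depth.FM).

```r
# Simplified integrated (Fraiman-Muniz-type) functional depth in base R
# (illustrative only, not the fda.usc implementation).
fm_depth <- function(X) {
  n <- nrow(X)
  # pointwise univariate depth: 1 - |1/2 - empirical cdf value|
  Fn <- apply(X, 2, function(col) rank(col) / n)
  rowMeans(1 - abs(0.5 - Fn))        # integrate (average) over the grid
}
set.seed(3)
tt <- seq(0, 1, length.out = 40)
X <- t(replicate(15, sin(2 * pi * tt) + rnorm(40, sd = 0.2)))
X <- rbind(X, sin(2 * pi * tt) + 3)  # add one clearly outlying curve
d <- fm_depth(X)
which.min(d)  # the shifted curve receives the smallest depth
```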
16 Classification by depth functions. DD plot. Step 2. The function calculates the misclassification rate based on the data depth computed in step 1, using the following classifiers:
- "MaxD": maximum depth.
- "DD1", "DD2", "DD3": search for the best separating polynomial of degree 1, 2, 3.
[Figure: four DD plots (depth vs. depth) of the HS data.] From left to right, top to bottom: DD plot using the DD1, DD2, DD3 and maximum depth classifiers. The one-dimensional depth in all cases is the Tukey depth.
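A toy version of the DD-plot coordinates and the maximum-depth rule, using the univariate Tukey (halfspace) depth named in the caption, can be sketched in base R; this is illustrative only and is not classif.DD.

```r
# Illustrative base-R sketch of the DD-plot idea (not fda.usc's classif.DD):
# each observation is mapped to (depth w.r.t. group 1, depth w.r.t. group 2)
# and, under the maximum-depth rule, assigned to the deeper group.
tukey1d <- function(x, sample) {
  Fn <- mean(sample <= x)
  min(Fn, 1 - Fn + mean(sample == x))  # univariate halfspace depth
}
set.seed(4)
g1 <- rnorm(100, mean = 0)
g2 <- rnorm(100, mean = 3)
x_new <- 0.1
d1 <- tukey1d(x_new, g1)   # first DD-plot coordinate
d2 <- tukey1d(x_new, g2)   # second DD-plot coordinate
cls <- c(1, 2)[which.max(c(d1, d2))]
cls  # maximum-depth classification of x_new
```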
17 Classification by depth functions. DD classifier for multivariate data.
- "glm", "gam": generalized linear (or additive) models.
- "lda", "qda": linear (or quadratic) discriminant analysis.
- "knn", "np": non-parametric k-nearest neighbour (or kernel) classifier.
[Figure: six DD plots (depth vs. depth) of the HS data.] From left to right, top to bottom: DD plot using the LDA, QDA, kNN, GLM, GAM and tree classifiers. The one-dimensional depth in all cases is the Tukey depth.
18 Classification by depth functions. DD classifier for functional data.
par(mfrow = c(1, 2))
out1 = classif.DD(ldata$df$fat15, ldata$absorp.fdata, classif = "gam")
out2 = classif.DD(ldata$df$fat15, ldata$absorp.d, classif = "gam")
[Figure: two DD plots (FM depth, GAM classifier), for the original curves and for their derivatives.]
out1$misclassification; out2$misclassification
[1] ...
[1] 0.07
19 Conclusions.
- Functional discrimination extends logistic regression (GLM, GAM) using a basis representation of the functional data (B-spline, Fourier, wavelets, PC, PLS, ...).
- The DD^G classifier converts the functional data into a multivariate dataset whose columns are constructed using depths, and the new classifiers are classical multivariate classifiers based on discrimination procedures (LDA, QDA) or regression ones (kNN, NP, GLM, GAM). More classifiers could be considered here (SVM, neural networks, ...).
- Several depth procedures can be taken into account at the same time, either to improve the classification or to diagnose whether a depth contains useful information for the classification process.
- The DD^G classifier "trick" is especially interesting in a functional or high-dimensional framework because it changes the dimension of the classification problem from infinite (or large) dimension to G, where G depends only on the number of groups and the number of depths employed.
- The functions needed to perform this presentation are freely available on CRAN in the fda.usc package [Febrero-Bande and Oviedo de la Fuente, 2012], for versions higher than 1.2.0.
20 References
Cardot, H., Ferraty, F. and Sarda, P. (1999). Functional linear model. Statistics and Probability Letters, 45(1), 11-22.
Cuesta-Albertos, J.A., Febrero-Bande, M. and Oviedo de la Fuente, M. The DD^G-classifier in the functional setting. Submitted.
Cuevas, A., Febrero, M. and Fraiman, R. (2007). Robust estimation and classification for functional data via projection-based depth notions. Computational Statistics, 22(3).
Fraiman, R. and Muniz, G. (2001). Trimmed means for functional data. Test, 10(2).
Febrero-Bande, M. and Oviedo de la Fuente, M. (2012). Statistical computing in functional data analysis: The R package fda.usc. Journal of Statistical Software, 51(4), 1-28.
Febrero-Bande, M. and González-Manteiga, W. (2013). Generalized additive models for functional data. TEST, 22(2), 278-292.
Ferraty, F. and Vieu, P. (2006). Nonparametric Functional Data Analysis. Springer Series in Statistics, New York.
Keogh, E., Zhu, Q., Hu, B., Hao, Y., Xi, X., Wei, L. and Ratanamahatana, C.A. (2011). The UCR Time Series Classification/Clustering data/
Müller, H.G. and Stadtmüller, U. (2005). Generalized functional linear models. Annals of Statistics, 33.
Müller, H.G. and Yao, F. (2008). Functional additive models. Journal of the American Statistical Association, 103.
Preda, C., Saporta, G. and Lévéder, C. (2007). PLS classification of functional data. Computational Statistics, 22(2).
Ramsay, J. and Silverman, B. (2005). Functional Data Analysis. Springer.
Ripley, B. (1996). Pattern Recognition and Neural Networks. Cambridge University Press, Cambridge.
Wood, S.N. (2004). Stable and efficient multiple smoothing parameter estimation for generalized additive models. Journal of the American Statistical Association, 99(467).
FMA901F: Machine Learning Lecture 3: Linear Models for Regression Cristian Sminchisescu Machine Learning: Frequentist vs. Bayesian In the frequentist setting, we seek a fixed parameter (vector), with value(s)
More informationThis is called a linear basis expansion, and h m is the mth basis function For example if X is one-dimensional: f (X) = β 0 + β 1 X + β 2 X 2, or
STA 450/4000 S: February 2 2005 Flexible modelling using basis expansions (Chapter 5) Linear regression: y = Xβ + ɛ, ɛ (0, σ 2 ) Smooth regression: y = f (X) + ɛ: f (X) = E(Y X) to be specified Flexible
More informationPackage DTRlearn. April 6, 2018
Type Package Package DTRlearn April 6, 2018 Title Learning Algorithms for Dynamic Treatment Regimes Version 1.3 Date 2018-4-05 Author Ying Liu, Yuanjia Wang, Donglin Zeng Maintainer Ying Liu
More informationThe mrmr variable selection method: a comparative study for functional data
To appear in the Journal of Statistical Computation and Simulation Vol. 00, No. 00, Month 20XX, 1 17 The mrmr variable selection method: a comparative study for functional data J.R. Berrendero, A. Cuevas
More informationSupport Vector Machines + Classification for IR
Support Vector Machines + Classification for IR Pierre Lison University of Oslo, Dep. of Informatics INF3800: Søketeknologi April 30, 2014 Outline of the lecture Recap of last week Support Vector Machines
More informationMachine Learning for. Artem Lind & Aleskandr Tkachenko
Machine Learning for Object Recognition Artem Lind & Aleskandr Tkachenko Outline Problem overview Classification demo Examples of learning algorithms Probabilistic modeling Bayes classifier Maximum margin
More informationSTA 414/2104 S: February Administration
1 / 16 Administration HW 2 posted on web page, due March 4 by 1 pm Midterm on March 16; practice questions coming Lecture/questions on Thursday this week Regression: variable selection, regression splines,
More informationData mining with Support Vector Machine
Data mining with Support Vector Machine Ms. Arti Patle IES, IPS Academy Indore (M.P.) artipatle@gmail.com Mr. Deepak Singh Chouhan IES, IPS Academy Indore (M.P.) deepak.schouhan@yahoo.com Abstract: Machine
More informationNon-Parametric Modeling
Non-Parametric Modeling CE-725: Statistical Pattern Recognition Sharif University of Technology Spring 2013 Soleymani Outline Introduction Non-Parametric Density Estimation Parzen Windows Kn-Nearest Neighbor
More informationInstance-Based Representations. k-nearest Neighbor. k-nearest Neighbor. k-nearest Neighbor. exemplars + distance measure. Challenges.
Instance-Based Representations exemplars + distance measure Challenges. algorithm: IB1 classify based on majority class of k nearest neighbors learned structure is not explicitly represented choosing k
More informationWork 2. Case-based reasoning exercise
Work 2. Case-based reasoning exercise Marc Albert Garcia Gonzalo, Miquel Perelló Nieto November 19, 2012 1 Introduction In this exercise we have implemented a case-based reasoning system, specifically
More informationPackage FWDselect. December 19, 2015
Title Selecting Variables in Regression Models Version 2.1.0 Date 2015-12-18 Author Marta Sestelo [aut, cre], Nora M. Villanueva [aut], Javier Roca-Pardinas [aut] Maintainer Marta Sestelo
More informationOn Bias, Variance, 0/1 - Loss, and the Curse of Dimensionality
RK April 13, 2014 Abstract The purpose of this document is to summarize the main points from the paper, On Bias, Variance, 0/1 - Loss, and the Curse of Dimensionality, written by Jerome H.Friedman1997).
More informationSupport Vector Machines - Supplement
Support Vector Machines - Supplement Prof. Dan A. Simovici UMB 1 / 1 Outline 2 / 1 Building an SVM Classifier for the Iris data set Data Set Description Attribute Information: sepal length in cm sepal
More informationGenerative and discriminative classification techniques
Generative and discriminative classification techniques Machine Learning and Object Recognition 2015-2016 Jakob Verbeek, December 11, 2015 Course website: http://lear.inrialpes.fr/~verbeek/mlor.15.16 Classification
More informationNearest Neighbor Classification
Nearest Neighbor Classification Professor Ameet Talwalkar Professor Ameet Talwalkar CS260 Machine Learning Algorithms January 11, 2017 1 / 48 Outline 1 Administration 2 First learning algorithm: Nearest
More informationMachine Learning. A. Supervised Learning A.7. Decision Trees. Lars Schmidt-Thieme
Machine Learning A. Supervised Learning A.7. Decision Trees Lars Schmidt-Thieme Information Systems and Machine Learning Lab (ISMLL) Institute for Computer Science University of Hildesheim, Germany 1 /
More informationMachine Learning: k-nearest Neighbors. Lecture 08. Razvan C. Bunescu School of Electrical Engineering and Computer Science
Machine Learning: k-nearest Neighbors Lecture 08 Razvan C. Bunescu School of Electrical Engineering and Computer Science bunescu@ohio.edu Nonparametric Methods: k-nearest Neighbors Input: A training dataset
More informationLecture 17: Smoothing splines, Local Regression, and GAMs
Lecture 17: Smoothing splines, Local Regression, and GAMs Reading: Sections 7.5-7 STATS 202: Data mining and analysis November 6, 2017 1 / 24 Cubic splines Define a set of knots ξ 1 < ξ 2 < < ξ K. We want
More informationNonparametric Methods Recap
Nonparametric Methods Recap Aarti Singh Machine Learning 10-701/15-781 Oct 4, 2010 Nonparametric Methods Kernel Density estimate (also Histogram) Weighted frequency Classification - K-NN Classifier Majority
More informationOrange3 Educational Add-on Documentation
Orange3 Educational Add-on Documentation Release 0.1 Biolab Jun 01, 2018 Contents 1 Widgets 3 2 Indices and tables 27 i ii Widgets in Educational Add-on demonstrate several key data mining and machine
More informationEquation to LaTeX. Abhinav Rastogi, Sevy Harris. I. Introduction. Segmentation.
Equation to LaTeX Abhinav Rastogi, Sevy Harris {arastogi,sharris5}@stanford.edu I. Introduction Copying equations from a pdf file to a LaTeX document can be time consuming because there is no easy way
More informationSupervised Learning for Image Segmentation
Supervised Learning for Image Segmentation Raphael Meier 06.10.2016 Raphael Meier MIA 2016 06.10.2016 1 / 52 References A. Ng, Machine Learning lecture, Stanford University. A. Criminisi, J. Shotton, E.
More informationA Method for Comparing Multiple Regression Models
CSIS Discussion Paper No. 141 A Method for Comparing Multiple Regression Models Yuki Hiruta Yasushi Asami Department of Urban Engineering, the University of Tokyo e-mail: hiruta@ua.t.u-tokyo.ac.jp asami@csis.u-tokyo.ac.jp
More information