Species distribution modelling for combined data sources
1 Species distribution modelling for combined data sources Ian Renner and Olivier Gimenez. oaggimenez oliviergimenez.github.io
2 Ian Renner - Australia 1
3 Outline Background (Species Distribution Models) Combining Data Sources LASSO Regularisation More to Explore! Ian W. Renner SDM with combined data sources EURING / 48
4 Species Distribution Models Species Data e.g. Reported locations of Eurasian lynx near the Jura mountains in France
5 Species Distribution Models Species Distribution Modelling
6 Species Distribution Models SDM methods Different species distribution modelling methods are appropriate for different sources of species data. Data source → SDM method: presence-only → point process model (PPM); systematic survey → logistic regression; repeated surveys → occupancy modelling.
7 Species Distribution Models Poisson point process models Simplest useful model: an inhomogeneous Poisson point process model with intensity $\mu(s)$ defined over region $A$, fitted to the presence locations $s_P = \{s_1, \dots, s_m\}$. Intensity is modelled as a log-linear function of environmental variables: $\ln \mu(s) = \beta_0 + \beta_1\,\mathrm{rain}(s) + \beta_2\,\mathrm{temp}(s) - \alpha_1\,\mathrm{dist\_road}(s) + \dots$ Maximise the log-likelihood (using GLM software): $l_{ppm}(\beta, \alpha; s_P) = \sum_{i=1}^{m} \ln \mu(s_i) - \int_{s \in A} \mu(s)\,ds$
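The log-likelihood on this slide can be sketched numerically by replacing the integral over $A$ with a weighted sum over quadrature points. This is a minimal sketch, not the authors' implementation: the function name `ppm_loglik`, the design matrices and the quadrature weights are all illustrative assumptions.

```python
import numpy as np

def ppm_loglik(coef, X_pres, X_quad, quad_weights):
    """Approximate Poisson point process log-likelihood.

    coef         : (p,) coefficients, intercept first
    X_pres       : (m, p) design matrix at the m presence locations
    X_quad       : (q, p) design matrix at q quadrature points covering A
    quad_weights : (q,) area represented by each quadrature point
    """
    log_mu_pres = X_pres @ coef          # ln mu(s_i) at the presence points
    mu_quad = np.exp(X_quad @ coef)      # mu(s) at the quadrature points
    # sum_i ln mu(s_i) minus a weighted-sum approximation of the integral
    return log_mu_pres.sum() - np.sum(quad_weights * mu_quad)

# Toy check: intercept-only model, 8 presences, region made of 4 unit cells
X_pres = np.ones((8, 1))
X_quad = np.ones((4, 1))
weights = np.ones(4)
ll = ppm_loglik(np.array([np.log(2.0)]), X_pres, X_quad, weights)
```

In the toy check the intensity is constant at 2 points per unit area, which is the maximiser for 8 points in an area of 4. Writing the integral as a weighted sum is also what makes it plausible to fit the model with weighted Poisson GLM software, as the slide suggests.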
8 Species Distribution Models What is the intensity measuring? Intensity is not a probability; it is related to abundance, but abundance of what? What we want: (figure) What we get: (figure)
9 Species Distribution Models Occupancy Modelling Occupancy models have been developed to account for imperfect detection. They rely on repeated visits to a set of sites, with presence/non-detection recorded at each site on each visit.
10 Species Distribution Models Occupancy Data Detection of species across all sites during Visit 1:
11 Species Distribution Models Occupancy Data Detection of species across all sites during Visit 2:
12 Species Distribution Models Occupancy Data Detection of species across all sites during Visit 3:
13 Species Distribution Models Occupancy Data Detection of species across all sites during Visit 4:
14 Species Distribution Models Occupancy Data Detection of species across all sites during Visit 5:
15 Species Distribution Models Occupancy Data Total detections of species across all sites during all visits: Problem: We don't know whether a site with 0 detections means the species is absent or present but undetected.
16 Species Distribution Models Occupancy Model Fit by maximizing $l_{occ}(\alpha_O, \beta) = \ln \prod_{i=1}^{N} P(Y_i = y_i)$ What we want: (figure) What we get (more or less): (figure)
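The slide does not spell out $P(Y_i = y_i)$. A minimal sketch, assuming the basic single-season form in which an occupied site yields independent detections with per-visit probability `p` and an unoccupied site can only yield an all-zero history, might look as follows. The names `occupancy_loglik`, `psi` and `p` are illustrative, and how `psi` depends on $\beta$ and `p` on $\alpha_O$ is left out.

```python
import numpy as np

def occupancy_loglik(psi, p, Y):
    """Log-likelihood of detection histories under a basic occupancy model.

    psi : (N,) occupancy probability per site
    p   : scalar or (N,) per-visit detection probability, given occupancy
    Y   : (N, J) array of 0/1 detections over J visits
    """
    Y = np.asarray(Y)
    psi = np.asarray(psi, dtype=float)
    p = np.broadcast_to(np.asarray(p, dtype=float), psi.shape)
    n_det = Y.sum(axis=1)                 # detections per site
    n_vis = Y.shape[1]                    # number of visits J
    # Probability of the observed history if the site is occupied
    p_hist = p ** n_det * (1.0 - p) ** (n_vis - n_det)
    never_detected = (n_det == 0)
    # An unoccupied site contributes only when the history is all zeros
    site_prob = psi * p_hist + (1.0 - psi) * never_detected
    return np.log(site_prob).sum()

# Two sites, two visits: one detection at site 1, none at site 2
Y_toy = np.array([[1, 0], [0, 0]])
ll_occ = occupancy_loglik(np.array([0.5, 0.5]), 0.5, Y_toy)
```

The second term in `site_prob` is what separates "absent" from "present but undetected" for sites with no detections.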
17 Combining Data Sources Multiple sources In many situations, there is more than one source of data: 364 sightings in the wild ($s_W$), 242 domestic interferences ($s_D$), 73 camera traps ($y_O$).
18 Combining Data Sources One-source models Common approach: choose only one set of data. Available covariates: altitude (alt); forest cover (fc%); distance to nearest water source (d.wat); distance to nearest urban area (d.urb); distance to nearest road (d.rd); distance to nearest farm (d.farm); human population density (h.dens).
19 Combining Data Sources Point process model for wild sightings Maximise $l_{ppm}(\alpha_W, \beta; s_W)$ using: $\beta$: linear, quadratic, and interaction terms of {alt, fc%, d.wat, d.urb}; $\alpha_W$: d.rd. Output $\mu_W$: intensity of wild reports per unit area.
20 Combining Data Sources Point process model for domestic sightings Maximise $l_{ppm}(\alpha_D, \beta; s_D)$ using: $\beta$: linear, quadratic, and interaction terms of {alt, fc%, d.wat, d.urb}; $\alpha_D$: d.farm. Output $\mu_D$: intensity of domestic reports per unit area.
21 Combining Data Sources Occupancy model for camera traps Maximise $l_{occ}(\alpha_O, \beta; y_O)$ using: $\beta$: linear, quadratic, and interaction terms of {alt, fc%, d.wat, d.urb}; $\alpha_O$: h.dens. Output $\mu_{occ}$: intensity of species per unit area.
22 Combining Data Sources Combined Approach How might we build a model using multiple sources of data? Presence-only and presence-absence: $l(\alpha, \beta, \gamma, \delta) = l_{ppm}(\alpha, \beta, \gamma, \delta) + l_{PA}(\beta, \gamma)$. Presence-only and occupancy: $l(\alpha_{PO}, \alpha_{Occ}, \beta, \gamma) = l_{ppm}(\alpha_{PO}, \beta) + l_{Occ}(\alpha_{Occ}, \beta)$. References: Fithian, W., Elith, J., Hastie, T., & Keith, D.A. (2015) Bias correction in species distribution models: pooling survey and collection data for multiple species. Methods in Ecology and Evolution 6. Dorazio, R.M. (2014) Accounting for imperfect detection and survey bias in statistical analysis of presence-only data. Global Ecology and Biogeography 23.
23 Combining Data Sources Combined model Maximise $l_{ppm}(\alpha_W, \beta; s_W) + l_{ppm}(\alpha_D, \beta; s_D) + l_{occ}(\alpha_O, \beta; y_O)$ using: $\beta$: linear, quadratic, and interaction terms of {alt, fc%, d.wat, d.urb}; $\alpha_W$: d.rd; $\alpha_D$: d.farm; $\alpha_O$: h.dens. Output $\mu_{combined}$: ?
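The structure of the combined objective, one shared $\beta$ and a separate $\alpha$ per data source, can be sketched as a sum over per-source log-likelihood functions. Everything here is hypothetical scaffolding: `combined_loglik` is an illustrative name, and the three components are toy concave functions standing in for $l_{ppm}$ and $l_{occ}$, not real likelihoods.

```python
def combined_loglik(beta, alphas, components):
    """Sum source-specific log-likelihoods that share beta.

    beta       : shared environmental coefficient(s)
    alphas     : dict mapping source name -> source-specific coefficient(s)
    components : dict mapping source name -> function(alpha, beta) -> float
    """
    total = 0.0
    for name, loglik in components.items():
        # Each source sees its own alpha but the same beta
        total += loglik(alphas[name], beta)
    return total

# Toy stand-ins for the three log-likelihoods (purely illustrative)
components = {
    "wild":     lambda a, b: -(a - 1.0) ** 2 - (b - 2.0) ** 2,
    "domestic": lambda a, b: -(a + 1.0) ** 2 - (b - 4.0) ** 2,
    "camera":   lambda a, b: -(a - 0.5) ** 2 - (b - 3.0) ** 2,
}
alphas = {"wild": 1.0, "domestic": -1.0, "camera": 0.5}
value = combined_loglik(3.0, alphas, components)
```

In the toy example each source alone would prefer a different value of the shared coefficient (2, 4 and 3), and the summed objective settles on a compromise at 3, which is the sense in which the sources jointly inform $\beta$.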
24 Combining Data Sources Comparing models
25 LASSO Regularisation Regularisation with the LASSO LASSO: Least Absolute Shrinkage and Selection Operator. $\hat{\beta} = \operatorname{argmax}_{\beta} \left\{ l(\beta) - \lambda \sum_{j=1}^{p} |\beta_j| \right\}$
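One common way to maximise a penalised objective of this form is proximal gradient ascent, where each gradient step on $l(\beta)$ is followed by soft-thresholding. The sketch below is a generic illustration, not necessarily the algorithm used for the talk. The function names, step size and toy log-likelihood are assumptions, and all coefficients are penalised here, although an intercept is usually left out of the penalty.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink each entry of z towards zero by t, setting small entries to 0."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_prox_gradient(grad_loglik, beta0, lam, step=0.1, n_iter=500):
    """Maximise l(beta) - lam * sum(|beta_j|) by proximal gradient ascent.

    grad_loglik : function(beta) -> gradient of the log-likelihood
    """
    beta = np.array(beta0, dtype=float)
    for _ in range(n_iter):
        # Gradient ascent step on l, then the L1 proximal (shrinkage) step
        beta = soft_threshold(beta + step * grad_loglik(beta), step * lam)
    return beta

# Toy log-likelihood l(beta) = -0.5 * ||beta - b||^2, with gradient b - beta
b = np.array([3.0, -0.5, 1.0])
beta_hat = lasso_prox_gradient(lambda beta: b - beta, np.zeros(3), lam=1.0)
```

For this toy objective the exact solution is `soft_threshold(b, 1.0)`: the large coefficient is shrunk and the two small ones are set exactly to zero, which is the selection behaviour the acronym refers to.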
26 Lasso vs. ridge regression, graphically
27 LASSO Regularisation The LASSO in Action: Regularization Paths Regularization paths for the three individual models: The occupancy model appears to be greatly overfitted with 15 covariates.
28 LASSO Regularisation Regularized Individual Models
29 LASSO Regularisation Regularized Combined Model
30 Future Work Weighted Likelihood The combined model puts presence-only and survey data on equal footing. One way to acknowledge the superior quality of survey data: weighted likelihood. Table: RSS (survey data) by model, with rows Occupancy, Wild P-O, Domestic P-O (values not preserved in the transcription).
31 Future Work Residual-weighted Combined Model Maximise $w_W\, l_{ppm}(\alpha_W, \beta; s_W) + w_D\, l_{ppm}(\alpha_D, \beta; s_D) + w_O\, l_{occ}(\alpha_O, \beta; y_O)$.
32 Future Work Model checking There are many tools for diagnostics of point process models. K-envelopes (to diagnose conditional independence of point locations):
33 Future Work Model checking Spatial residual plots:
34 Future Work More to explore Some next steps: other weighting approaches; developing diagnostic tools; combinations involving non-Poisson PPMs. Please come see me if you are interested in contributing!
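The idea behind a K-envelope can be sketched as follows: compute an estimate of Ripley's K for the observed points, repeat the calculation on many simulated patterns, and see whether the observed curve leaves the simulated range. This sketch makes strong simplifying assumptions that the talk does not state: simulations come from complete spatial randomness on the unit square rather than from a fitted inhomogeneous model, and no edge correction is applied. The names `ripley_k` and `k_envelope` are illustrative.

```python
import numpy as np

def ripley_k(points, radii, area):
    """Naive Ripley's K estimate (no edge correction)."""
    n = len(points)
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)        # exclude self-pairs
    # Number of ordered pairs within each radius
    counts = np.array([(dist <= r).sum() for r in radii])
    return area * counts / (n * (n - 1))

def k_envelope(n_points, radii, n_sim=99, seed=0):
    """Pointwise min/max of K over simulated uniform patterns on the unit square."""
    rng = np.random.default_rng(seed)
    sims = np.array([
        ripley_k(rng.uniform(size=(n_points, 2)), radii, area=1.0)
        for _ in range(n_sim)
    ])
    return sims.min(axis=0), sims.max(axis=0)

# Compare a (here simulated) observed pattern against the envelope
radii = np.linspace(0.05, 0.25, 5)
observed = np.random.default_rng(1).uniform(size=(50, 2))
k_obs = ripley_k(observed, radii, area=1.0)
lo, hi = k_envelope(len(observed), radii)
outside = (k_obs < lo) | (k_obs > hi)     # radii where K leaves the envelope
```

Radii flagged in `outside` would suggest clustering or inhibition beyond what the simulated model produces. For a fitted point process model, the simulations would instead be drawn from that model.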
Monocular Human Motion Capture with a Mixture of Regressors Ankur Agarwal and Bill Triggs GRAVIR-INRIA-CNRS, Grenoble, France IEEE Workshop on Vision for Human-Computer Interaction, 21 June 2005 Visual
More informationHow to carry out secondary validation of climatic data
World Bank & Government of The Netherlands funded Training module # SWDP -17 How to carry out secondary validation of climatic data New Delhi, November 1999 CSMRS Building, 4th Floor, Olof Palme Marg,
More informationModel selection Outline for today
Model selection Outline for today The problem of model selection Choose among models by a criterion rather than significance testing Criteria: Mallow s C p and AIC Search strategies: All subsets; stepaic
More informationChapter 7: Numerical Prediction
Ludwig-Maximilians-Universität München Institut für Informatik Lehr- und Forschungseinheit für Datenbanksysteme Knowledge Discovery in Databases SS 2016 Chapter 7: Numerical Prediction Lecture: Prof. Dr.
More informationCollaborative Filtering Applied to Educational Data Mining
Collaborative Filtering Applied to Educational Data Mining KDD Cup 200 July 25 th, 200 BigChaos @ KDD Team Dataset Solution Overview Michael Jahrer, Andreas Töscher from commendo research Dataset Team
More informationLast time... Bias-Variance decomposition. This week
Machine learning, pattern recognition and statistical data modelling Lecture 4. Going nonlinear: basis expansions and splines Last time... Coryn Bailer-Jones linear regression methods for high dimensional
More informationMEDICAL IMAGE COMPUTING (CAP 5937) LECTURE 4: Pre-Processing Medical Images (II)
SPRING 2016 1 MEDICAL IMAGE COMPUTING (CAP 5937) LECTURE 4: Pre-Processing Medical Images (II) Dr. Ulas Bagci HEC 221, Center for Research in Computer Vision (CRCV), University of Central Florida (UCF),
More informationClassification and Detection in Images. D.A. Forsyth
Classification and Detection in Images D.A. Forsyth Classifying Images Motivating problems detecting explicit images classifying materials classifying scenes Strategy build appropriate image features train
More informationConquering Massive Clinical Models with GPU. GPU Parallelized Logistic Regression
Conquering Massive Clinical Models with GPU Parallelized Logistic Regression M.D./Ph.D. candidate in Biomathematics University of California, Los Angeles Joint Statistical Meetings Vancouver, Canada, July
More informationOptimization Models for Machine Learning: A Survey
Optimization Models for Machine Learning: A Survey arxiv:1901.05331v1 [math.oc] 16 Jan 2019 Claudio Gambella 1 Bissan Ghaddar 2 Joe Naoum-Sawaya 2 1 IBM Research Ireland, Mulhuddart, Dublin 15, Ireland
More information