Available online at ScienceDirect. Procedia Environmental Sciences 26 (2015 )


Procedia Environmental Sciences 26 (2015) 109-114

Spatial Statistics 2015: Emerging Patterns

Calibrating a Geographically Weighted Regression Model with Parameter-Specific Distance Metrics

Binbin Lu a,*, Paul Harris b, Martin Charlton c, Chris Brunsdon c

a School of Remote Sensing and Information Engineering, Wuhan University, 129 Luoyu Road, Wuhan 430079, China
b Rothamsted Research, North Wyke, Okehampton, Devon, UK
c National Centre for Geocomputation, Maynooth University, Maynooth, Co. Kildare, Ireland

Abstract

Geographically Weighted Regression (GWR) is a local technique that models spatially varying relationships, where Euclidean distance is traditionally used as the default in its calibration. However, empirical work has shown that the use of non-Euclidean distance metrics in GWR can improve model performance, at least in terms of predictive fit. Furthermore, the relationships between the dependent and each independent variable may have their own distinctive response to the weighting computation, which is reflected by the choice of distance metric. Thus, we propose a back-fitting approach to calibrate a GWR model with parameter-specific distance metrics. To objectively evaluate this new approach, a simple simulation experiment is carried out that enables an assessment not only of prediction accuracy, but also of parameter accuracy. The results show that the approach can provide both more accurate predictions and more accurate parameter estimates than standard GWR. Accurate localised parameter estimation is crucial to GWR's main use as a method to detect and assess relationship non-stationarity.

© 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Peer-review under responsibility of Spatial Statistics 2015: Emerging Patterns committee.
Keywords: Non-stationarity; GWR; Parameter-Specific Distance Metrics; Simulation Experiment

* Binbin Lu. Tel.: +86-27-68770771; fax: +86-27-68778086. E-mail address: binbinlu@whu.edu.cn

1878-0296. doi:10.1016/j.proenv.2015.05.011

1. Introduction

A number of localized regression techniques have been proposed to account for spatial non-stationarity or spatial heterogeneity in data relationships, one of which is geographically weighted regression (GWR) [1]. Key to GWR is a "bump of influence" around each local regression point, where nearer observations have more influence in estimating the local set of parameters than do observations farther away [2]. This is described by a kernel weighting function based on distances between model calibration points and observation points. Euclidean distance (ED) is traditionally used as the default in calibrating a GWR model. However, empirical work has shown that the use of non-Euclidean distance metrics (like network distance and travel time metrics) in GWR can improve model fit [3, 4]. Furthermore, the relationship between the dependent and each independent variable may have its own distinctive response to the weighting computation.

Some related and important studies have been done in this respect, where the bandwidth of the kernel function is allowed to vary across relationships. Brunsdon et al. [5] introduced mixed GWR, which considers some data relationships as global (or fixed), and the rest as local (but each at the same spatial scale). Yang [6] generalizes the mixed GWR model by allowing each data relationship to operate at its own (and commonly different) spatial scale. In this study, we extend both studies by also allowing the choice of distance metric to vary over different parameter estimates in the same model. We hypothesize that each independent/dependent variable pair in the GWR model may correspond to a different optimal distance metric, and then calibrate GWR with parameter-specific distance metrics (PSDM-GWR). A back-fitting approach inherited from mixed GWR is adjusted for the PSDM-GWR model calibration. PSDM-GWR is evaluated via a simple simulation experiment. All of the modelling functions used in this article can be found in the GWmodel package [7, 8] in R [9], which is an integrated framework for handling spatially-varying structures via a wide range of geographically weighted models.

2. Methodology

GWR estimates a localized set of regression parameters in order to assess the possibility of spatially-varying relationships.
The basic formula of a GWR model can be written as:

$$ y_i = \beta_{i0} + \sum_{k=1}^{m} \beta_{ik} x_{ik} + \varepsilon_i \qquad (1) $$

where $y_i$ is the dependent variable at location $i$, $x_{ik}$ is the value of the $k$th explanatory variable at location $i$, $\beta_{i0}$ is the intercept parameter at location $i$, $\beta_{ik}$ is the local regression parameter (or coefficient) for the $k$th explanatory variable at location $i$, and $\varepsilon_i$ is the random error at location $i$. At each location $i$, the model is calibrated by a weighted least squares approach, of which the matrix expression is:

$$ \hat{\beta}_i = (X^T W_i X)^{-1} X^T W_i y \qquad (2) $$

where $W_i$ is the diagonal matrix denoting the geographical weightings for each observation in the data (sub-)set for regression point $i$. In a standard GWR calibration, $W_i$ is calculated via a kernel function whose bandwidth is customarily selected via a leave-one-out cross-validation (CV) approach [10] or an Akaike Information Criterion (AIC) approach [11].

For this study, the GWR technique is extended to PSDM-GWR, where the back-fitting algorithm used in mixed GWR [5] and (similarly) in flexible bandwidth GWR [6] is adjusted for PSDM-GWR calibration. If we assume that the specific distance metrics $DM_0, DM_1, \ldots, DM_m$ are respectively used for estimating their corresponding parameters $\beta_0, \beta_1, \ldots, \beta_m$, and the hat matrix for each parameter estimate is defined as $S_j$, then eq. (1) can be re-written as:

$$ \hat{y} = \sum_{j=0}^{m} \hat{y}_j = \sum_{j=0}^{m} S_j y \qquad (3) $$

where $\hat{y}_j$ is the part of the fitted values contributed by the $j$th term. The back-fitting procedure to calibrate PSDM-GWR can then be carried out in the following steps:

Step 1. Initialize the values of $\hat{y}_j^{(0)}$, with $j = 0, 1, \ldots, m$;
Step 2. Set the iteration counter $i = 1$;

Step 3. Calculate $\hat{y}_j^{(i)} = S_j \left( y - \sum_{k \neq j} \mathrm{Latestyhat}\left(\hat{y}_k^{(i)}, \hat{y}_k^{(i-1)}\right) \right)$, where the function Latestyhat is defined in eq. (4), and $S_j$ is calculated using the $j$th distance metric $DM_j$ and a given bandwidth;

$$ \mathrm{Latestyhat}\left(\hat{y}_k^{(i)}, \hat{y}_k^{(i-1)}\right) = \begin{cases} \hat{y}_k^{(i)}, & \text{if } \hat{y}_k^{(i)} \text{ exists} \\ \hat{y}_k^{(i-1)}, & \text{otherwise} \end{cases} \qquad (4) $$

Step 4. Repeat Step 3 for $j$ from 0 to $m$;
Step 5. Calculate the residual sum of squares $RSS^{(i)}$ between $y$ and $\hat{y}^{(i)}$, and set $i = i + 1$;
Step 6. Return to Step 3 unless $RSS^{(i)}$ converges to $RSS^{(i-1)}$.

In this procedure, the choice of initial guesses is open. Here we use the results from a standard GWR calibration (eq. (2)) as starting values in Step 1. The sensitivity of the back-fitting algorithm to different initial guesses is currently under consideration, but poor initial guesses will undoubtedly affect the speed of convergence.

3. Case study with simulated data

As an introductory assessment of the PSDM-GWR model, we use simulated data. For this basic simulation experiment, a point data set of size 25 × 25 is generated on a square grid, of which the coordinates in both dimensions range from 10 to 100. For each cell, two predictor variables $x_1$ and $x_2$ are independently drawn from a uniform distribution, each as a random numeric vector ranging from 1 to 100, as shown in Fig. 1.

Fig. 1. (a) Surface for the random predictor $x_1$; (b) Surface for the random predictor $x_2$.

The process to generate each realisation of this simulation experiment is defined as follows:

$$ y = \beta_1 x_1 + \beta_2 x_2 \qquad (5) $$

$$ \beta_1 = 2, \quad \beta_2 = \log(u + v) \qquad (6) $$

where the dependent variable $y$ is generated from eq. (5), which itself consists of a stationary (single) parameter $\beta_1$ and a non-stationary parameter $\beta_2$, as given by the equations in (6), with $(u, v)$ denoting the coordinates of each location. It is a fairly simple case study, but it represents clearly different relationships between $y$ and $x_1$, and between $y$ and $x_2$. Observe that we do not simulate an intercept parameter, $\beta_0$. The corresponding surfaces of $\beta_2$ and $y$ are visualized in Fig. 2.
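The data-generating process of eqs. (5) and (6) and the local weighted least squares estimator of eq. (2) can be sketched in a few lines of Python. This is a minimal illustration, not the GWmodel implementation: the function names `simulate` and `gwr_at` are hypothetical, eq. (6) is read as $\beta_1 = 2$ and $\beta_2 = \log(u + v)$, the kernel is assumed to be Gaussian, and the bandwidth of 10 is an arbitrary choice rather than a value taken from the paper.

```python
import math
import random


def simulate(n_side=25, lo=10.0, hi=100.0, seed=0):
    """Simulate y = beta1*x1 + beta2*x2 (no intercept) on an n_side x n_side grid."""
    rng = random.Random(seed)
    step = (hi - lo) / (n_side - 1)
    data = []
    for a in range(n_side):
        for b in range(n_side):
            u, v = lo + a * step, lo + b * step
            # Predictors drawn independently from a uniform distribution on [1, 100].
            x1, x2 = rng.uniform(1, 100), rng.uniform(1, 100)
            # Assumed reading of eq. (6): stationary beta1, non-stationary beta2.
            beta1, beta2 = 2.0, math.log(u + v)
            data.append({"u": u, "v": v, "x1": x1, "x2": x2,
                         "beta1": beta1, "beta2": beta2,
                         "y": beta1 * x1 + beta2 * x2})
    return data


def gwr_at(point, data, bandwidth):
    """Eq. (2) at one regression point for the two-predictor, no-intercept model.

    Accumulates X^T W X and X^T W y with Gaussian kernel weights based on
    Euclidean distance, then solves the resulting 2 x 2 system directly.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for obs in data:
        d = math.hypot(obs["u"] - point["u"], obs["v"] - point["v"])
        w = math.exp(-0.5 * (d / bandwidth) ** 2)
        a11 += w * obs["x1"] ** 2
        a12 += w * obs["x1"] * obs["x2"]
        a22 += w * obs["x2"] ** 2
        b1 += w * obs["x1"] * obs["y"]
        b2 += w * obs["x2"] * obs["y"]
    det = a11 * a22 - a12 * a12
    return (a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det


data = simulate()
centre = data[len(data) // 2]            # grid point at (55, 55)
est1, est2 = gwr_at(centre, data, bandwidth=10.0)
print(f"beta1: true {centre['beta1']:.3f}, local estimate {est1:.3f}")
print(f"beta2: true {centre['beta2']:.3f}, local estimate {est2:.3f}")
```

Calling `gwr_at` for every grid point gives the full surface of local estimates; note that the same Euclidean weights are applied to both parameters here, which is the standard GWR setting.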

Fig. 2. (a) Surface for the coefficient $\beta_2$; (b) Surface for the dependent variable $y$.

Using one realisation of the simulation, we calibrate the model shown in eq. (5) via both standard GWR and PSDM-GWR. For standard GWR, ED is used to estimate both $\beta_1$ and $\beta_2$, which is the standard approach. For PSDM-GWR, however, we use a zero distance matrix (i.e. assuming the distance between any pair of points is zero, a simple non-ED metric) to estimate $\beta_1$ and an ED matrix to estimate $\beta_2$. With all distances set to zero, every observation receives the same kernel weight, so $\beta_1$ is in effect estimated as a single global coefficient. This represents a simple form of PSDM-GWR and is chosen to demonstrate its potential. For an objective comparison, we use the same fixed bandwidth for both GWR calibrations, which is selected by an AIC approach using the standard GWR model. The results are presented in Table 1, where a reduction in RSS indicates that PSDM-GWR provides more accurate predictions than standard GWR.

Fig. 3 plots the estimated parameters $\beta_1$ and $\beta_2$ from both calibrations. As would be expected, PSDM-GWR provides a highly accurate estimate of the stationary (constant) parameter $\beta_1$; whilst, similarly as expected, standard GWR provides a non-constant estimate of $\beta_1$ and, as such, is relatively inaccurate. In terms of $\beta_2$, both models provide similar estimates, but the estimates from PSDM-GWR appear slightly closer to the real values than those from standard GWR. Tentatively, this simple experiment suggests that PSDM-GWR can also provide more accurate parameter estimates than standard GWR.

Table 1. Model calibrations via standard GWR and PSDM-GWR

| Model | Distance metric(s) | Kernel function | Bandwidth | RSS |
|---|---|---|---|---|
| Standard GWR | ED for estimating both $\beta_1$ and $\beta_2$ | Gaussian function with a fixed bandwidth selected by the AICc approach in a standard way | 3.54 | 446.11 |
| PSDM-GWR | Zero distance matrix for estimating $\beta_1$; ED matrix for estimating $\beta_2$ | As for standard GWR | 3.54 | 418.20 |
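The PSDM-GWR calibration used in this case study, with a zero distance matrix for $\beta_1$ and an ED matrix for $\beta_2$, can be sketched with the back-fitting steps of Section 2. The code below is a hedged illustration under stated assumptions, not the authors' implementation: the helper names (`kernel_weights`, `local_fit`, `backfit`) are hypothetical, the grid is reduced to 10 × 10 for speed, the bandwidth of 15 is arbitrary, $\beta_2 = \log(u + v)$ is an assumed reading of eq. (6), and the starting values are zeros rather than a standard GWR fit (the paper notes that the choice of initial guesses is open).

```python
import math
import random


def kernel_weights(coords, bandwidth, zero_distance=False):
    """n x n Gaussian kernel weights. A zero distance matrix gives all-ones weights."""
    n = len(coords)
    w = [[1.0] * n for _ in range(n)]
    if zero_distance:
        return w
    for i, (ui, vi) in enumerate(coords):
        for k, (uk, vk) in enumerate(coords):
            d = math.hypot(ui - uk, vi - vk)
            w[i][k] = math.exp(-0.5 * (d / bandwidth) ** 2)
    return w


def local_fit(x, target, w):
    """Locally weighted, no-intercept regression of target on a single predictor x.

    Returns one coefficient per location (the single-term version of eq. (2)).
    """
    betas = []
    for wi in w:
        num = sum(wk * xk * tk for wk, xk, tk in zip(wi, x, target))
        den = sum(wk * xk * xk for wk, xk in zip(wi, x))
        betas.append(num / den)
    return betas


def backfit(y, xs, ws, tol=1e-8, max_iter=200):
    """Back-fitting with one weight matrix (distance metric) per term."""
    n, m = len(y), len(xs)
    yhat = [[0.0] * n for _ in range(m)]      # Step 1: zero start (an assumption)
    betas = [[0.0] * n for _ in range(m)]
    rss_old = float("inf")
    for _ in range(max_iter):
        for j in range(m):                    # Steps 3-4: sweep over the terms
            # yhat is updated in place, so the other terms always contribute
            # their latest available fit (the role of Latestyhat in eq. (4)).
            partial = [y[i] - sum(yhat[k][i] for k in range(m) if k != j)
                       for i in range(n)]
            betas[j] = local_fit(xs[j], partial, ws[j])
            yhat[j] = [b * x for b, x in zip(betas[j], xs[j])]
        # Step 5: residual sum of squares of the current overall fit.
        rss = sum((y[i] - sum(yhat[j][i] for j in range(m))) ** 2 for i in range(n))
        if abs(rss_old - rss) < tol * max(rss, 1.0):   # Step 6: convergence check
            break
        rss_old = rss
    return betas, rss


random.seed(1)
coords = [(u, v) for u in range(10, 101, 10) for v in range(10, 101, 10)]
x1 = [random.uniform(1, 100) for _ in coords]
x2 = [random.uniform(1, 100) for _ in coords]
y = [2.0 * a + math.log(u + v) * b for (u, v), a, b in zip(coords, x1, x2)]

w_zero = kernel_weights(coords, 15.0, zero_distance=True)   # metric for beta1
w_ed = kernel_weights(coords, 15.0)                         # metric for beta2
betas, rss = backfit(y, [x1, x2], [w_zero, w_ed])
print(f"beta1 estimate (constant over space): {betas[0][0]:.3f}, RSS: {rss:.2f}")
```

Because the zero distance matrix produces identical weights at every location, the estimate of $\beta_1$ is the same everywhere, which mirrors the constant estimate reported for PSDM-GWR in this section. Replacing `w_zero` or `w_ed` with weights built from any other distance matrix leaves `backfit` unchanged.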

Fig. 3. (a) Real values of $\beta_1$ and estimations from standard GWR and PSDM-GWR; (b) Real values of $\beta_2$ and estimations from standard GWR and PSDM-GWR.

4. Concluding remarks

In this study, we proposed a back-fitting algorithm for PSDM-GWR. Via a simulation study, we have shown that PSDM-GWR can provide more accurate predictions and parameter estimates than standard GWR. However, these can only be considered preliminary findings, as:

• The form of the PSDM-GWR model used in this study is just a specific case of a mixed GWR model. In this respect, a more involved simulation study is required using (novel) PSDM-GWR specifications that do not mimic existing GWR constructions.
• The way to define or select a distance metric for an independent variable within a given PSDM-GWR model is key and requires refinement.
• PSDM-GWR also needs to demonstrate its practical worth within an empirical case study.
• The approach could be meshed with that of Yang [6], where bandwidths vary across relationships.

Acknowledgements

Research presented in this paper is funded by the National Natural Science Foundation of China (NSFC: 41401455). The authors gratefully acknowledge this support.

References

[1] Brunsdon, C., A.S. Fotheringham, and M.E. Charlton, Geographically Weighted Regression: A Method for Exploring Spatial Nonstationarity. Geographical Analysis, 1996. 28(4): p. 281-298.
[2] Fotheringham, A.S., M.E. Charlton, and C. Brunsdon, Geographically weighted regression: a natural evolution of the expansion method for spatial data analysis. Environment and Planning A, 1998. 30(11): p. 1905-1927.
[3] Lu, B., M. Charlton, and A.S. Fotheringham, Geographically Weighted Regression Using a Non-Euclidean Distance Metric with a Study on London House Price Data. Procedia Environmental Sciences, 2011. 7(0): p. 92-97.
[4] Lu, B., et al., Geographically weighted regression with a non-Euclidean distance metric: a case study using hedonic house price data. International Journal of Geographical Information Science, 2014. 28(4): p. 660-681.
[5] Brunsdon, C., A.S. Fotheringham, and M. Charlton, Some Notes on Parametric Significance Tests for Geographically Weighted Regression. Journal of Regional Science, 1999. 39(3): p. 497-524.
[6] Yang, W., An Extension of Geographically Weighted Regression with Flexible Bandwidths, in Centre for GeoInformatics. 2014, University of St Andrews: St Andrews, UK.
[7] Gollini, I., et al., GWmodel: an R Package for Exploring Spatial Heterogeneity using Geographically Weighted Models. Journal of Statistical Software, 2015. 63(17): p. 1-50.
[8] Lu, B., et al., The GWmodel R package: further topics for exploring spatial heterogeneity using geographically weighted models. Geo-spatial Information Science, 2014. 17(2): p. 85-101.
[9] R Development Core Team, R: A Language and Environment for Statistical Computing. 2013, R Foundation for Statistical Computing: Vienna, Austria.
[10] Farber, S. and A. Páez, A systematic investigation of cross-validation in GWR model estimation: empirical analysis and Monte Carlo simulations. Journal of Geographical Systems, 2007. 9(4): p. 371-396.
[11] Fotheringham, A.S., C. Brunsdon, and M. Charlton, Geographically Weighted Regression: the analysis of spatially varying relationships. 2002, Chichester: Wiley.