Available online at www.sciencedirect.com
ScienceDirect
Procedia Environmental Sciences 26 (2015) 109-114

Spatial Statistics 2015: Emerging Patterns

Calibrating a Geographically Weighted Regression Model with Parameter-Specific Distance Metrics

Binbin Lu a,*, Paul Harris b, Martin Charlton c, Chris Brunsdon c

a School of Remote Sensing and Information Engineering, Wuhan University, 129 Luoyu Road, Wuhan 430079, China
b Rothamsted Research, North Wyke, Okehampton, Devon, UK
c National Centre for Geocomputation, Maynooth University, Maynooth, Co. Kildare, Ireland

Abstract

Geographically Weighted Regression (GWR) is a local technique that models spatially varying relationships, where Euclidean distance is traditionally used as the default in its calibration. However, empirical work has shown that the use of non-Euclidean distance metrics in GWR can improve model performance, at least in terms of predictive fit. Furthermore, the relationships between the dependent and each independent variable may have their own distinctive response to the weighting computation, which is reflected by the choice of distance metric. Thus, we propose a back-fitting approach to calibrate a GWR model with parameter-specific distance metrics. To objectively evaluate this new approach, a simple simulation experiment is carried out that enables an assessment not only of prediction accuracy, but also of parameter accuracy. The results show that the approach can provide both more accurate predictions and more accurate parameter estimates than standard GWR. Accurate localised parameter estimation is crucial to GWR's main use as a method to detect and assess relationship non-stationarity.

© 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Peer-review under responsibility of Spatial Statistics 2015: Emerging Patterns committee.
Keywords: Non-stationarity, GWR, Parameter-Specific Distance Metrics, Simulation Experiment

* Binbin Lu. Tel.: +86-27-68770771; fax: +86-27-68778086. E-mail address: binbinlu@whu.edu.cn

1878-0296
doi:10.1016/j.proenv.2015.05.011

1. Introduction

A number of localized regression techniques have been proposed to account for spatial non-stationarity or spatial heterogeneity in data relationships, one of which is geographically weighted regression (GWR) [1]. Key to GWR is a "bump of influence" around each local regression point, where nearer observations have more influence in estimating
the local set of parameters than do observations farther away [2]. This is described by a kernel weighting function based on distances between model calibration points and observation points. Euclidean distance (ED) is traditionally used as the default in calibrating a GWR model. However, empirical work has shown that the use of non-Euclidean distance metrics (such as network distance and travel time metrics) in GWR can improve model fit [3, 4]. Furthermore, the relationship between the dependent and each independent variable may have its own distinctive response to the weighting computation. Some related and important studies have been done in this respect, where the bandwidth of the kernel function is allowed to vary across relationships. Brunsdon et al. [5] introduced mixed GWR, which considers some data relationships as global (or fixed), and the rest as local (but each at the same spatial scale). Yang [6] generalizes the mixed GWR model by allowing each data relationship to operate at its own (and commonly different) spatial scale.

In this study, we extend both approaches by also allowing the choice of distance metric to vary over different parameter estimates in the same model. We hypothesize that each independent/dependent variable pair in the GWR model may correspond to a different optimal distance metric, and then calibrate GWR with parameter-specific distance metrics (PSDM-GWR). A back-fitting approach inherited from mixed GWR is adjusted for the PSDM-GWR model calibration. PSDM-GWR is evaluated via a simple simulation experiment. All of the modelling functions used in this article can be found in the GWmodel package [7, 8] in R [9], which is an integrated framework for handling spatially-varying structures via a wide range of geographically weighted models.

2. Methodology

GWR estimates a localized set of regression parameters in order to assess the possibility of spatially-varying relationships.
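As an illustration of the kernel weighting described in the introduction, the following minimal Python sketch computes weights for one regression point under two distance metrics. It is not taken from the GWmodel package: the function names, the Gaussian form exp(-0.5 (d/b)^2), the bandwidth value, and the use of Manhattan distance as a stand-in for a non-Euclidean metric are all assumptions made for illustration.

```python
import math

def euclidean(p, q):
    """Straight-line (Euclidean) distance between two 2-D points."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def manhattan(p, q):
    """Manhattan distance, used here only as a simple non-Euclidean stand-in."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def gaussian_weights(point, observations, bandwidth, metric):
    """Assumed Gaussian kernel: w = exp(-0.5 * (d / bandwidth)^2)."""
    return [
        math.exp(-0.5 * (metric(point, obs) / bandwidth) ** 2)
        for obs in observations
    ]

# Hypothetical regression point and observation locations.
regression_point = (0.0, 0.0)
observations = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]

w_ed = gaussian_weights(regression_point, observations, 5.0, euclidean)
w_md = gaussian_weights(regression_point, observations, 5.0, manhattan)
```

With the same bandwidth, the two metrics assign different weights to the same observations, which is why the choice of metric can change the local parameter estimates.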
The basic formula of a GWR model can be written as:

$$ y_i = \beta_{i0} + \sum_{k=1}^{m} \beta_{ik} x_{ik} + \varepsilon_i \qquad (1) $$

where $y_i$ is the dependent variable at location $i$, $x_{ik}$ is the value of the $k$th explanatory variable at location $i$, $\beta_{i0}$ is the intercept parameter at location $i$, $\beta_{ik}$ is the local regression parameter (or coefficient) for the $k$th explanatory variable at location $i$, and $\varepsilon_i$ is the random error at location $i$.

At each location $i$, the model is calibrated by a weighted least squares approach, of which the matrix expression is:

$$ \hat{\beta}_i = \left( X^T W_i X \right)^{-1} X^T W_i y \qquad (2) $$

where $W_i$ is the diagonal matrix denoting the geographical weighting of each observation in the data (sub-)set for regression point $i$. In a standard GWR calibration, $W_i$ is calculated via a kernel function whose bandwidth is customarily selected via a leave-one-out cross-validation (CV) approach [10] or an Akaike Information Criterion (AIC) approach [11].

For this study, the GWR technique is extended to PSDM-GWR, where the back-fitting algorithm used in mixed GWR [5] and (similarly) in flexible bandwidth GWR [6] is adjusted for PSDM-GWR calibration. If we assume that the specific distance metrics $DM_0, DM_1, \ldots, DM_m$ are respectively used for estimating their corresponding parameters $\beta_0, \beta_1, \ldots, \beta_m$, and the hat matrix for each parameter estimate is defined as $S_j$, then eq. (1) can be re-written as:

$$ \hat{y} = \sum_{j=0}^{m} \hat{y}_j = \sum_{j=0}^{m} S_j y \qquad (3) $$

Then the back-fitting procedure to calibrate PSDM-GWR can be carried out in the following steps:

Step 1. Initialize values of $\hat{y}_j^{(0)}$, with $j = 0, 1, \ldots, m$;
Step 2. Set $k = 1$;
Step 3. Calculate $\hat{y}_j^{(k)} = S_j \left( y - \sum_{l \neq j} \mathrm{Latestyhat}\left( \hat{y}_l^{(k)}, \hat{y}_l^{(k-1)} \right) \right)$, where the function Latestyhat is defined in eq. (4), and $S_j$ is calculated using $DM_j$ and a given bandwidth $bw$;

$$ \mathrm{Latestyhat}\left( \hat{y}_l^{(k)}, \hat{y}_l^{(k-1)} \right) = \begin{cases} \hat{y}_l^{(k)}, & \text{if } \hat{y}_l^{(k)} \text{ exists} \\ \hat{y}_l^{(k-1)}, & \text{otherwise} \end{cases} \qquad (4) $$

Step 4. Repeat Step 3 for $j$ from 0 to $m$;
Step 5. Calculate the residual sum of squares $RSS^{(k)}$ between $y$ and $\hat{y}^{(k)}$, and set $k = k + 1$;
Step 6. Return to Step 3 unless $RSS^{(k)}$ converges to $RSS^{(k-1)}$.

In this procedure, the choice of initial guesses is open. Here we use the results from a standard GWR calibration (eq. (2)) as starting values in Step 1. The sensitivity of the back-fitting algorithm to different initial guesses is currently under consideration, but poor initial guesses will undoubtedly affect the speed of convergence.

3. Case study with simulated data

As an introductory assessment of the PSDM-GWR model, we use simulated data. For this basic simulation experiment, a point data set of size 25*25 is generated on a square grid, of which the coordinates in two dimensions range from 10 to 100. For each cell, two predictor variables $x_1$ and $x_2$ are independently drawn from a uniform distribution as a random numeric vector ranging from 1 to 100, as shown in Fig. 1.

Fig. 1. (a) Surface for the random predictor $x_1$; (b) Surface for the random predictor $x_2$.

The process to generate each realisation of this simulation experiment is defined as follows:

$$ y_i = \beta_{i1} x_{i1} + \beta_{i2} x_{i2} \qquad (5) $$

$$ \beta_{i1} = 2, \quad \beta_{i2} = \log(\,\cdot\,) \qquad (6) $$

where the argument of the logarithm in eq. (6) is a function of the coordinates $(u_i, v_i)$ of location $i$. The dependent variable $y$ is generated from eq. (5), which itself consists of a stationary (single) parameter $\beta_1$ and a non-stationary parameter $\beta_2$, as found from the equations in (6). It is a fairly simple case study, but it represents clearly different relationships between $y$ and each of $x_1$ and $x_2$. Observe that we do not simulate an intercept parameter, $\beta_0$. The corresponding surfaces of $\beta_2$ and $y$ are visualized in Fig. 2.
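The data-generating process described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' R code: the function name `simulate`, the seed, and the noise-free response are assumptions, and the coefficient surface is assumed here to be log(u + v) purely as a stand-in for eq. (6).

```python
import math
import random

def simulate(n_side=25, lo=10.0, hi=100.0, seed=0):
    """Generate one hypothetical realisation on an n_side x n_side grid."""
    rng = random.Random(seed)
    step = (hi - lo) / (n_side - 1)
    data = []
    for row in range(n_side):
        for col in range(n_side):
            u = lo + col * step           # grid coordinates in [lo, hi]
            v = lo + row * step
            x1 = rng.uniform(1.0, 100.0)  # independent uniform predictors
            x2 = rng.uniform(1.0, 100.0)
            beta1 = 2.0                   # stationary parameter
            beta2 = math.log(u + v)       # ASSUMED non-stationary surface
            y = beta1 * x1 + beta2 * x2   # no intercept, no noise (assumed)
            data.append({"u": u, "v": v, "x1": x1, "x2": x2,
                         "beta1": beta1, "beta2": beta2, "y": y})
    return data

data = simulate()
```

Keeping the true coefficients alongside each observation makes it straightforward to compare estimated and real parameter surfaces afterwards, which is the purpose of the simulation.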
Fig. 2. (a) Surface for the coefficient $\beta_2$; (b) Surface for the dependent variable $y$.

Using one realisation of the simulation, we calibrate the model shown in eq. (5) via both standard GWR and PSDM-GWR. For standard GWR, ED is used to estimate both $\beta_1$ and $\beta_2$, which is the standard approach. For PSDM-GWR, however, we use a zero distance matrix (i.e. assuming the distance between any pair of points is zero, a simple non-ED metric) to estimate $\beta_1$ and an ED matrix to estimate $\beta_2$. Thus it represents a simple form of PSDM-GWR and is chosen to demonstrate its potential. For an objective comparison, we use the same fixed bandwidth for both GWR calibrations, which is selected by an AIC approach using the standard GWR model.

The results are presented in Table 1, where a reduction in RSS indicates that PSDM-GWR provides more accurate predictions than standard GWR. Fig. 3 plots the estimated parameters $\hat{\beta}_1$ and $\hat{\beta}_2$ from both calibrations. As would be expected, PSDM-GWR provides a highly accurate, constant estimate of the stationary parameter $\beta_1$, whilst, similarly as expected, standard GWR provides a non-constant estimate of $\beta_1$ and as such is relatively inaccurate. In terms of $\beta_2$, both models provide similar estimates, but the estimates from PSDM-GWR appear slightly closer to the real values than those from standard GWR. Tentatively, this simple experiment suggests that PSDM-GWR can also provide more accurate parameter estimates than standard GWR.

Table 1. Model calibrations via standard GWR and PSDM-GWR

| Model | Distance metric(s) | Kernel function | Bandwidth | RSS |
|---|---|---|---|---|
| Standard GWR | ED for estimating both $\beta_1$ and $\beta_2$ | Gaussian function with a fixed bandwidth selected by the AICc approach in a standard way | 3.54 | 446.11 |
| PSDM-GWR | Zero distance matrix for estimating $\beta_1$; ED matrix for estimating $\beta_2$ | Gaussian function with a fixed bandwidth selected by the AICc approach in a standard way | 3.54 | 418.20 |
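To make the calibration compared above more concrete, the following Python sketch runs a back-fitting loop with one weight matrix per parameter, using a zero distance matrix for $\beta_1$ and Euclidean distances for $\beta_2$. It is a simplified illustration under several assumptions and not the GWmodel implementation: each term is fitted by a single-predictor weighted least squares smoother, starting values are zeros rather than standard GWR estimates, and the small grid, the bandwidth of 1.5, the log(u + v) surface, and all function names are hypothetical.

```python
import math
import random

def gaussian_weight_matrix(dist, bandwidth):
    """Assumed Gaussian kernel applied to a full distance matrix."""
    return [[math.exp(-0.5 * (d / bandwidth) ** 2) for d in row] for row in dist]

def local_coefficients(x, r, weights):
    """Weighted least squares of r on one predictor x at every location."""
    n = len(x)
    betas = []
    for i in range(n):
        num = sum(weights[i][k] * x[k] * r[k] for k in range(n))
        den = sum(weights[i][k] * x[k] * x[k] for k in range(n))
        betas.append(num / den)
    return betas

def backfit(y, xs, weight_mats, tol=1e-8, max_iter=200):
    """Back-fitting with one weight matrix (distance metric) per parameter."""
    n, m = len(y), len(xs)
    betas = [[0.0] * n for _ in range(m)]
    yhat = [[0.0] * n for _ in range(m)]   # Step 1: zero starting values (assumed)
    rss_old = float("inf")
    for _ in range(max_iter):
        for j in range(m):                 # Steps 3-4: update each term in turn
            # yhat always holds the latest fitted terms, as in eq. (4)
            r = [y[i] - sum(yhat[l][i] for l in range(m) if l != j)
                 for i in range(n)]
            betas[j] = local_coefficients(xs[j], r, weight_mats[j])
            yhat[j] = [betas[j][i] * xs[j][i] for i in range(n)]
        rss = sum((y[i] - sum(yhat[j][i] for j in range(m))) ** 2
                  for i in range(n))       # Step 5
        if abs(rss_old - rss) <= tol * max(rss, 1.0):
            break                          # Step 6: RSS has converged
        rss_old = rss
    return betas, rss

# Hypothetical small example: 6 x 6 grid, noise-free response.
rng = random.Random(1)
coords = [(float(u), float(v)) for u in range(1, 7) for v in range(1, 7)]
n = len(coords)
x1 = [rng.uniform(1.0, 100.0) for _ in range(n)]
x2 = [rng.uniform(1.0, 100.0) for _ in range(n)]
beta2_true = [math.log(u + v) for u, v in coords]   # ASSUMED surface
y = [2.0 * x1[i] + beta2_true[i] * x2[i] for i in range(n)]

zero_dist = [[0.0] * n for _ in range(n)]
eucl_dist = [[math.hypot(p[0] - q[0], p[1] - q[1]) for q in coords]
             for p in coords]
bw = 1.5                                            # hypothetical bandwidth
weight_mats = [gaussian_weight_matrix(zero_dist, bw),
               gaussian_weight_matrix(eucl_dist, bw)]
betas, rss = backfit(y, [x1, x2], weight_mats)
```

Because every weight derived from the zero distance matrix equals one, the estimate of $\beta_1$ is identical at all locations, so this configuration behaves like a model with one global and one local term.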
Fig. 3. (a) Real values of $\beta_1$ and estimations from standard GWR and PSDM-GWR; (b) Real values of $\beta_2$ and estimations from standard GWR and PSDM-GWR.

4. Concluding remarks

In this study, we proposed a back-fitting algorithm for PSDM-GWR. Via a simulation study, we have shown that PSDM-GWR can provide more accurate predictions and parameter estimates than standard GWR. However, these can only be considered preliminary findings, as:

• The form of the PSDM-GWR model used in this study is just a specific case of a mixed GWR model. In this respect, a more involved simulation study is required using (novel) PSDM-GWR specifications that do not mimic existing GWR constructions.
• The way to define or select a distance metric for an independent variable within a given PSDM-GWR model is key and requires refinement.
• PSDM-GWR also needs to demonstrate its practical worth within an empirical case study.
• The approach could be meshed with that of Yang [6], where bandwidths vary across relationships.

Acknowledgements

Research presented in this paper is funded by the National Natural Science Foundation of China (NSFC: 41401455). The authors gratefully acknowledge this support.

References

[1] Brunsdon, C., A.S. Fotheringham, and M.E. Charlton, Geographically Weighted Regression: A Method for Exploring Spatial Nonstationarity.
Geographical Analysis, 1996. 28(4): p. 281-298.
[2] Fotheringham, A.S., M.E. Charlton, and C. Brunsdon, Geographically weighted regression: a natural evolution of the expansion method for spatial data analysis. Environment and Planning A, 1998. 30(11): p. 1905-1927.
[3] Lu, B., M. Charlton, and A.S. Fotheringham, Geographically Weighted Regression Using a Non-Euclidean Distance Metric with a Study on London House Price Data. Procedia Environmental Sciences, 2011. 7(0): p. 92-97.
[4] Lu, B., et al., Geographically weighted regression with a non-Euclidean distance metric: a case study using hedonic house price data. International Journal of Geographical Information Science, 2014. 28(4): p. 660-681.
[5] Brunsdon, C., A.S. Fotheringham, and M. Charlton, Some Notes on Parametric Significance Tests for Geographically Weighted Regression. Journal of Regional Science, 1999. 39(3): p. 497-524.
[6] Yang, W., An Extension of Geographically Weighted Regression with Flexible Bandwidths, in Centre for GeoInformatics. 2014, University of St Andrews: St Andrews, UK.
[7] Gollini, I., et al., GWmodel: an R Package for Exploring Spatial Heterogeneity using Geographically Weighted Models. Journal of Statistical Software, 2015. 63(17): p. 1-50.
[8] Lu, B., et al., The GWmodel R package: further topics for exploring spatial heterogeneity using geographically weighted models. Geo-spatial Information Science, 2014. 17(2): p. 85-101.
[9] R Development Core Team, R: A Language and Environment for Statistical Computing. 2013, R Foundation for Statistical Computing: Vienna, Austria.
[10] Farber, S. and A. Páez, A systematic investigation of cross-validation in GWR model estimation: empirical analysis and Monte Carlo simulations. Journal of Geographical Systems, 2007. 9(4): p. 371-396.
[11] Fotheringham, A.S., C. Brunsdon, and M. Charlton, Geographically Weighted Regression: the analysis of spatially varying relationships. 2002, Chichester: Wiley.