Methodology of optimal sampling planning based on VoI for soil contamination investigation

Japanese Geotechnical Society Special Publication
The 15th Asian Regional Conference on Soil Mechanics and Geotechnical Engineering

Ikumasa Yoshida i)

i) Professor, Department of Urban and Civil Engineering, Tokyo City University, 1-28-1 Tamazutsumi, Setagaya-ku, Tokyo 158-8557, Japan

ABSTRACT

This paper proposes a method for optimal sampling planning, i.e., choosing the number and placement of additional samples, for land of arbitrary shape with existing sampling data, under the assumption that a characteristic parameter follows a Gaussian random field. A set of optimal locations for additional sampling is obtained as the solution of an optimization problem whose objective function is the Value of Information (VoI); the optimization method is Particle Swarm Optimization. The optimal number of samples is also evaluated from the total cost, i.e., the sum of observation cost and VoI. The balance between penalties and observation cost determines the optimal number of additional samples.

Keywords: optimal sampling placement, risk, observation, Kriging, random field

1 INTRODUCTION

Many researchers have studied placement algorithms that plan observations in a cost-effective way while maximizing the benefits. The optimal observation placement problem has two aspects: minimization of the relevant uncertainties (maximization of accuracy) and minimization of total cost. Traditional measures of uncertainty, such as the covariance matrix or information entropy, however, do not capture the significance of the uncertainty unless the consequences of that uncertainty are considered. Raiffa & Schlaifer (1961) describe in depth the theory of Value of Information (VoI) in decision making under uncertainty. VoI can be interpreted as the expected risk reduction, or benefit, obtained from the information. Nojima & Sugito (1999) propose a Bayes decision procedure model based on the VoI concept that optimizes post-earthquake emergency response under highly uncertain conditions, so as to prevent secondary damage through emergency shut-off of lifeline services.
Straub (2013) describes in depth the concept and application of VoI to maintenance problems of infrastructure. Pozzi & Der Kiureghian (2012) discuss the application of a VoI-based method to structural health monitoring.

Soil contamination is one of the issues that modern society has to cope with. The need for contamination remediation measures is judged after soil is sampled and tested at several sites. The Guidelines of the Ministry of the Environment, Japan, specify a detailed investigation scheme for identifying contamination, in which a basic sampling placement (sampling grid) is introduced. This placement is given for square land without any existing sampling information. In practice, the land sometimes has a complicated shape and prior information on contamination is available. This paper proposes a method to obtain an optimal sampling plan, i.e., the number and placement of new additional samples, for land of arbitrary shape with existing sampling information, under the assumption that a characteristic parameter follows a Gaussian random field. The proposed method is applied to optimal sampling planning in one- and two-dimensional problems.

2 VALUE OF INFORMATION FOR OPTIMAL SAMPLING PLACEMENT

2.1 Risk of decision making

It is assumed that observation is performed to obtain information for making a decision by comparing an estimator $x$ with a threshold limit value $x_0$, e.g., to classify soil as contaminated or ordinary by comparing the concentration of a poisonous material $x$ with its threshold limit value $x_0$. A statistical test has two kinds of error. A type I error (error of the first kind) is the incorrect rejection of a true null hypothesis. A type II error (error of the second kind) is the failure to reject a false null hypothesis. With reference to these error types, we define two types of false decision making.

http://do.org/.38/jgssp.jpn-36

1) Decision error type 1: judge $x < x_0$ when the true $x > x_0$ (e.g., judge that a contamination countermeasure is not necessary when it actually is).
2) Decision error type 2: judge $x > x_0$ when the true $x < x_0$ (e.g., judge that a contamination countermeasure is necessary when it actually is not).

The probabilities of decision error types 1 and 2 are denoted $P_1$ and $P_2$ $(= 1 - P_1)$. The risk of each decision error is the product of its probability and the corresponding penalty per unit area, $C_1$ or $C_2$. Naturally, the decision with the lower risk should be taken:

$$J_i = L_i = \min\left( C_1 P_{1,i},\ C_2 (1 - P_{1,i}) \right) \tag{1}$$

The first and second arguments of the minimum are referred to as risk 1 and risk 2. The suffix $i$ indicates a region for which the risk is estimated. The total risk is calculated by summing the risk over the area under evaluation.

As an example, suppose the estimator is $x = 3$ and the threshold limit value is $x_0 = 3$. The estimator involves uncertainty, and its mean is 3. The penalties for error types 1 and 2 are assumed to be 10 and 2, respectively. These values are for illustration only and have no physical meaning. If the estimator is judged to be less than the threshold value, the probability of error is 0.5 and the risk is 5. If the estimator is judged to be larger than the threshold value, the probability of error is also 0.5 and the risk is 1. The former and the latter are called risk 1 and risk 2, respectively. Since the smaller risk should be taken, we take risk 2. Fig. 1 shows the risk to be taken for estimators whose mean ranges from [?] to 4. The estimator is assumed to be Gaussian with a standard deviation of 0.4.

Fig. 1. Risk of decision error types 1 and 2 versus the mean of the estimator (standard deviation of estimator = 0.4, penalties $C_1$ = 10, $C_2$ = 2). Horizontal axis: mean of estimate $x$; vertical axis: risk (expected loss). The judgement threshold $x_c$, the limit value $x_0$ and the safety margin between them are marked.

When the mean of the estimator is 3, risks 1 and 2 are plotted at 5 and 1, respectively. As the mean decreases, risk 1 decreases while risk 2 increases. The point $x_c$ at which risk 1 equals risk 2 is the threshold value for judgement under uncertainty.
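The illustrative example above can be checked numerically. The following is a minimal sketch, not the author's implementation: the function names `risks` and `judgement_threshold` are hypothetical, and the penalties (10 and 2) and standard deviation (0.4) are assumed values chosen to be consistent with the risks of 5 and 1 quoted in the text. It uses the fact that risk 1 equals risk 2 when $\Phi\bigl((x_c - x_0)/\sigma\bigr) = C_2 / (C_1 + C_2)$.

```python
from statistics import NormalDist

def risks(mean, sigma, x0, c1, c2):
    """Return (risk 1, risk 2) for a Gaussian estimator N(mean, sigma^2)."""
    # P1 = P(true x > x0): probability of error when we judge x < x0
    p1 = 1.0 - NormalDist(mean, sigma).cdf(x0)
    return c1 * p1, c2 * (1.0 - p1)

def judgement_threshold(sigma, x0, c1, c2):
    """Mean x_c at which risk 1 equals risk 2."""
    # c1 * Phi(beta) = c2 * (1 - Phi(beta))  =>  Phi(beta) = c2 / (c1 + c2)
    beta = NormalDist().inv_cdf(c2 / (c1 + c2))
    return x0 + sigma * beta

# Assumed illustrative values (penalties 10 and 2, standard deviation 0.4)
r1, r2 = risks(mean=3.0, sigma=0.4, x0=3.0, c1=10.0, c2=2.0)
xc = judgement_threshold(sigma=0.4, x0=3.0, c1=10.0, c2=2.0)
margin = 3.0 - xc  # safety margin between x0 and x_c

print(f"risk 1 = {r1:.3f}, risk 2 = {r2:.3f}")
print(f"x_c = {xc:.3f}, safety margin = {margin:.3f}")
```

Because the penalty for missing contamination is the larger one in this setting, $x_c$ falls below $x_0$: the soil is judged contaminated even when the estimated mean is somewhat below the limit value.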
The estimator should be judged to be larger than the threshold limit value $x_0$ when its mean is larger than the judgement threshold $x_c$. The difference between $x_c$ and $x_0$ expresses a safety margin. The judgement threshold $x_c$ is determined by the uncertainty of the estimator and the ratio of the penalties $C_1$ and $C_2$.

2.2 Quantification of VoI in a Gaussian random field

In general, VoI is difficult to compute, so Monte Carlo (MC) approaches have been proposed (Liu et al. 2012; Pozzi and Der Kiureghian 2012; Straub 2013). VoI can, however, be computed easily when a Gaussian random field is updated by Kriging, which is a probabilistic interpolation method (e.g. Christakos 1992; Cressie 1991). It is assumed that observation data at new locations are obtained at each observation step:

$$Z^k = \left\{ z^{1T},\ z^{2T},\ \cdots,\ z^{kT} \right\}^T \tag{2}$$

where $z^k$ and $Z^k$ represent the observation data at step $k$ and up to step $k$, respectively. The mean vectors at three types of places are obtained as follows:

$$\begin{Bmatrix} x_1^k \\ x_2^k \\ x_3^k \end{Bmatrix} = \begin{Bmatrix} x_1 \\ x_2 \\ x_3 \end{Bmatrix} + \begin{bmatrix} M_{11} \\ M_{21} \\ M_{31} \end{bmatrix} \left[ M_{11} + R \right]^{-1} \left\{ Z^k - x_1 \right\} \tag{3}$$

where $x_1$ is the mean vector at the places where the observation $Z^k$ is given; $x_2$ is the mean vector at the places where the new observation $z^{k+1}$ will be given; $x_3$ is the mean vector over the area where the decision error risk is evaluated; $M_{ij}$ is the covariance matrix between $x_i$ and $x_j$ before updating; and $R$ is the covariance matrix of the observation error. The updated covariance matrices of $x_i$ and $x_j$ are given by

$$M_{ij}^k = M_{ij} - M_{i1} \left( M_{11} + R \right)^{-1} M_{j1}^T \tag{4}$$

Note that the locations of $x_2$ are those of the observation points at step $k+1$, for which the observation data have not yet been obtained. As mentioned above, a penalty is imposed on false decision making. The risk is evaluated as the product of the probability of false decision making and the penalty:

$$L(x^k_{3,i}, \sigma^k_{3,i}) = \min\left( C_1 P^k_{1,i},\ C_2 (1 - P^k_{1,i}) \right) \tag{5}$$

where

$$P^k_{1,i} = \Phi(\beta^k_i), \qquad \beta^k_i = \frac{x^k_{3,i} - x_0}{\sigma^k_{3,i}}$$

Here $\Phi$ is the standard normal (Gaussian) cumulative distribution function, and $\sigma^k_{3,i}$ is the standard deviation of $x^k_{3,i}$, which is obtained from the diagonal components of the covariance matrix $M^k_{33}$ given by Eq. (4). The total risk over the decision-making area is given by

$$J^k = \sum_i L(x^k_{3,i}, \sigma^k_{3,i}) \tag{6}$$

The decision error risk is reduced by the new information $z^{k+1}$. Once the observation vector $z^{k+1}$ is obtained, the mean and covariance matrix of $x_3$ are updated as

$$x_3^{k+1} = x_3^k + M_{32}^k \left[ M_{22}^k + R \right]^{-1} \left\{ z^{k+1} - x_2^k \right\} \tag{7}$$

$$M_{33}^{k+1} = M_{33}^k - M_{32}^k \left( M_{22}^k + R \right)^{-1} M_{32}^{kT} \tag{8}$$

The value of the new information $z^{k+1}$ is, of course, not yet known. Therefore the random variable $x_2$ at the new observation locations is used in place of $z^{k+1}$ in Eq. (7). The expectation of the risk reduction is defined as VoI:

$$\mathrm{VoI} = E\left[ J^{k+1} - J^k \right] = E\left[ J^{k+1} \right] - J^k \tag{9}$$

The expected risk, taking into account the observation data of the next step $z^{k+1}$, is

$$E\left[ J^{k+1} \right] = \int \sum_i L(x^{k+1}_{3,i}, \sigma^{k+1}_{3,i})\, p(x_2)\, dx_2 \tag{10}$$

Integration with respect to $x_2$ is required, but it cannot be performed analytically. When the dimension of $x_2$ is high, direct numerical integration is not practical. Thanks to the reproductive property of the Gaussian distribution, however, the integration can always be reduced to a one-dimensional numerical integration. Consequently, VoI can be calculated easily even if the dimension of $z^{k+1}$ (the number of additional observation points) is large, e.g., more than [?].

Fig. 2. Existing observation information and area for evaluation (construction): (1) cross-sectional view; (2) plane view.

Fig. 3. Risk reduction by additional sampling information (location: [?] m): (1) mean and standard deviation of the estimate (curves: Estimate, St.Dev.(prior), St.Dev.(post)); (2) risk of decision error (curves: Risk(prior), Risk(post)), where the colored area indicates the "Value of Information". Horizontal axis: location.

2.3 Optimization of VoI with respect to location of new observation

When the dimension of the vector $z^{k+1}$ is low, it is not difficult to optimize the locations of the new observations: the optimal locations can be determined by evaluating VoI at every possible combination of locations. When the dimension of $z^{k+1}$ is high, however, this exhaustive evaluation becomes difficult because of the curse of dimensionality.
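The VoI computation of Section 2.2 and the exhaustive search over candidate locations can be sketched for the simplest case of one new observation. This is a minimal sketch under stated assumptions, not the paper's implementation: it starts from the prior field without earlier data, it assumes an exponential covariance model (the paper does not state its autocorrelation function), and all names and parameter values are hypothetical. Before $z$ is known, the updated mean at each evaluation point is itself Gaussian, so the expected posterior risk at that point is a one-dimensional integral.

```python
import math

def phi_cdf(t):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def loss(mean, sigma, x0, c1, c2):
    """Risk of Eq. (5): min(C1*P1, C2*(1-P1)), P1 = Phi((mean - x0)/sigma)."""
    p1 = phi_cdf((mean - x0) / max(sigma, 1e-12))
    return min(c1 * p1, c2 * (1.0 - p1))

def cov(a, b, s, corr_len):
    """ASSUMPTION: exponential covariance model (not stated in the paper)."""
    return s * s * math.exp(-abs(a - b) / corr_len)

def voi_single(u, grid, mu, s, corr_len, r, x0, c1, c2, n=401, half_width=6.0):
    """VoI = E[J_post] - J_prior for one new observation at u, no earlier data."""
    dt = 2.0 * half_width / (n - 1)
    nodes = [-half_width + j * dt for j in range(n)]
    weights = [math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi) * dt
               for t in nodes]
    j_prior = j_post = 0.0
    for xi in grid:
        c_iu = cov(xi, u, s, corr_len)
        # Variance of the updated mean at xi, as seen before z is observed
        tau2 = c_iu ** 2 / (s * s + r * r)
        # Posterior standard deviation; it does not depend on the value of z
        sig_post = math.sqrt(max(s * s - tau2, 0.0))
        tau = math.sqrt(tau2)
        j_prior += loss(mu, s, x0, c1, c2)
        # One-dimensional integral over the standardized updated mean t ~ N(0, 1)
        j_post += sum(w * loss(mu + tau * t, sig_post, x0, c1, c2)
                      for t, w in zip(nodes, weights))
    return j_post - j_prior

# Hypothetical parameters, chosen only for illustration
grid = [float(x) for x in range(0, 51, 5)]  # evaluation points, also candidates
vois = {u: voi_single(u, grid, mu=2.5, s=1.0, corr_len=10.0, r=0.1,
                      x0=3.0, c1=10.0, c2=2.0) for u in grid}
best = min(vois, key=vois.get)  # most negative VoI = largest expected reduction
```

With a single new point the search is a loop over candidates; with several points the number of combinations grows quickly, which is the situation in which an optimizer is needed.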
In this paper, PSO (Particle Swarm Optimization) is introduced to optimize the set of locations of new observations with respect to VoI. PSO is a global optimization method proposed by Kennedy et al. (1995). It is regarded as a simple method with only a few user-specified parameters, yet efficient for optimization over real-valued variables.

3 ONE DIMENSIONAL OPTIMAL PLACEMENT

It is assumed that there are four existing sampling data, as shown in Fig. 2. The values shown in the figure indicate the contamination level. The mean and standard deviation of the random field are [?] and [?]. The autocorrelation distance is 5.[?] m in all directions. The area for evaluation of VoI is shown as "construction" with the blue line in Fig. 2. The threshold $x_0$ for decision making on soil contamination is 3.[?]. The penalties $C_1$ and $C_2$ are [?] and [?]. The standard deviation of the observation error is [?].

Based on the existing data, the contamination level and its standard deviation along the construction line are estimated by Kriging, as shown in Fig. 3(1). The distribution of the false-decision risk based on the existing data is shown as "Risk(prior)" in Fig. 3(2); it corresponds to the second term on the right-hand side of Eq. (9). VoI is evaluated for a new observation point at [?] m. The distribution of the posterior standard deviation is shown as "St.Dev.(post)"; as expected, the standard deviation is reduced around the new observation. The posterior risk is evaluated reflecting not only the standard deviation but also the observed value; it corresponds to the first term on the right-hand side of Eq. (9). The risk is reduced around the new observation point. The difference between the prior and posterior risk is the VoI, indicated by the colored area in the figure.

Optimal placements of additional observation points are determined such that the VoI is minimized with respect to the positions of the additional sampling points. Since VoI in Eq. (9) is the expected posterior risk minus the prior risk, a smaller (more negative) value means a larger expected risk reduction. PSO is used for the minimization. Fig. 4 shows the optimal placement when four additional samplings are performed. In the same way, optimal placements are determined for [?] to 7 additional sampling points.

Fig. 4. Optimal locations for four additional samplings (curves: Risk(prior), Risk(post); horizontal axis: location).
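For readers unfamiliar with PSO, the following is a minimal sketch of a standard global-best variant, not the implementation used in the paper. The function `pso_minimize`, its parameter values (inertia 0.7298, acceleration coefficients 1.4962, 30 particles) and the quadratic `toy_objective` that stands in for the VoI of a set of locations are all assumptions made for illustration.

```python
import random

def pso_minimize(f, bounds, n_particles=30, n_iter=300,
                 w=0.7298, c_cog=1.4962, c_soc=1.4962, seed=0):
    """Global-best PSO: minimize f over the box given by bounds [(lo, hi), ...]."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # best position seen by each particle
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward own best + pull toward swarm best
                vel[i][d] = (w * vel[i][d]
                             + c_cog * r1 * (pbest[i][d] - pos[i][d])
                             + c_soc * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the admissible interval
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def toy_objective(locs):
    """Hypothetical stand-in for VoI(locations); its minimum is at (12, 37)."""
    return (locs[0] - 12.0) ** 2 + (locs[1] - 37.0) ** 2

# Two sampling locations, each restricted to an assumed interval of 0 to 50
best_locs, best_val = pso_minimize(toy_objective, bounds=[(0.0, 50.0)] * 2)
```

In an actual application the objective would return the VoI of Eq. (9) for the candidate set of locations, so each particle encodes one complete sampling plan and the swarm searches over plans rather than over single points.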
The total cost, which is the sum of the observation cost and VoI, can be used to determine the optimal number of additional points. Fig. 5 shows the total cost when the observation cost is assumed to be 5 or 3. The optimal number is [?] for observation cost 5 and 6 for observation cost 3.

Fig. 5. Optimal number of sampling: (1) observation cost = 3; (2) observation cost = 5. Each panel plots VoI, observation cost and total cost against the number of observations.

4 TWO DIMENSIONAL OPTIMAL PLACEMENT

An example of two-dimensional observation planning is shown. The area for investigation is a square of 3m X 3m. The mean and standard deviation of the random field are [?] and [?].5. The autocorrelation distance is 5m in all directions. The standard deviation of the observation error is [?]. The threshold $x_0$ for decision making on soil contamination is [?]. The penalties $C_1$ and $C_2$ are [?] and [?]. Optimal placements for 3, 5 and 6 samplings are shown in Fig. 6. The Guideline of the Ministry of the Environment, Japan, shows a placement that is rotated by 45 degrees from Fig.3(). When there is no existing sampling information, the placement has a regular geometric shape that can be anticipated intuitively. When existing sampling data are available, however, it is difficult to identify the optimal placement by intuition. Fig. 7 shows the optimal additional sampling placement when there are three existing samples. Three cases of observed values at the same existing sampling locations are considered. The optimal placements differ depending on the observed values.

Fig. 6. Optimal sampling placement when there is no existing (prior) sampling information: (1) 3 samplings; (2) 5 samplings; (3) 6 samplings.

Fig. 7. Optimal additional sampling locations depending on existing (prior) sampling data: (1) Case 1; (2) Case 2; (3) Case 3.

Areas around existing samples whose observed value is close to [?] are given more weight, because the threshold value is [?].

5 CONCLUSIONS

This paper proposes an efficient method to obtain the optimal sampling (observation, boring) placement based on Value of Information (VoI) in a Gaussian random field. VoI includes the cost attributable to the uncertainty, so it assesses the usefulness of observation information in light of the consequences of that uncertainty. The proposed method is applied to additional sampling placement in one- and two-dimensional problems. The optimal number of samples is also evaluated from the total cost, i.e., the sum of observation cost and VoI. The balance between penalties and observation cost determines the optimal number, although the penalties are difficult to determine rationally. One of the difficulties in practical applications of VoI lies in determining parameters such as the penalties; this remains a topic for future discussion. The guideline of the Ministry of the Environment, Japan, specifies a sampling plan for soil contamination in a square space without existing sampling. The proposed method, which is consistent with the guideline, can determine the optimal sampling plan for land of any shape and with any existing sampling.

REFERENCES

1) Christakos, G. (1992): Random Field Models in Earth Sciences, Academic Press Inc.
2) Cressie, N. (1991): Statistics for Spatial Data, John Wiley & Sons.
3) Liu, X., Lee, J., Kitanidis, P., Parker, J. & Kim, U. (2012): Value of Information as a Context-Specific Measure of Uncertainty in Groundwater Remediation, Water Resources Management, 26(6), 1513-1535.
4) Kennedy, J. & Eberhart, R. (1995): Particle swarm optimization, Proc. of IEEE Int. Conf. on Neural Networks, Vol.4, 1942-1948.
5) Nojima, N. & Sugito, M. (1999): Bayes Decision Procedure Model for Post-Earthquake Emergency Response, Optimizing Post-Earthquake Lifeline System Reliability, Proc. of the 5th U.S. Conference on Lifeline Earthquake Engineering, 7-6.
6) Pozzi, M. & Der Kiureghian, A. (2012): Assessing the Value of Alternative Bridge Health Monitoring Systems, 6th International Conference on Bridge Maintenance, Safety and Management, IABMAS: CRC Press.
7) Raiffa, H. & Schlaifer, R. (1961): Applied statistical decision theory. Boston: Clinton Press, Inc.
8) Straub, D. (2013): Value of Information Analysis with Structural Reliability Methods, Structural Safety, special issue in the honor of Prof. Wilson Tang.
9) Sun, N-Z. (1994): Inverse problems in groundwater modelling, Kluwer Academic Publishers.