Research Article Time Series Outlier Detection Based on Sliding Window Prediction

Size: px
Start display at page:

Download "Research Article Time Series Outlier Detection Based on Sliding Window Prediction"

Transcription

1 Mathematcal Problems n Engneerng, Artcle ID , 14 pages Research Artcle Tme Seres Outler Detecton Based on Sldng Wndow Predcton Yufeng Yu, Yuelong Zhu, Shjn L, and Dngsheng Wan CollegeofComputer&Informaton,HohaUnversty,Nanjng210098,Chna Correspondence should be addressed to Yufeng Yu; hhuheyun@126.com Receved 18 July 2014; Accepted 15 September 2014; Publshed 30 October 2014 Academc Edtor: Jun Jang Copyrght 2014 Yufeng Yu et al. Ths s an open access artcle dstrbuted under the Creatve Commons Attrbuton Lcense, whch permts unrestrcted use, dstrbuton, and reproducton n any medum, provded the orgnal work s properly cted. In order to detect outlers n hydrologcal tme seres data for mprovng data qualty and decson-makng qualty related to desgn, operaton, and management of water resources, ths research develops a tme seres outler detecton method for hydrologc data that can be used to dentfy data that devate from hstorcal patterns. The method frst bult a forecastng model on the hstory data and then used t to predct future values. Anomales are assumed to take place f the observed values fall outsde a gven predcton confdence nterval (PCI), whch can be calculated by the predcted value and confdence coeffcent. The use of PCI as threshold s manly on the fact that t consders the uncertanty n the data seres parameters n the forecastng model to address the sutable threshold selecton problem. The method performs fast, ncremental evaluaton of data as t becomes avalable, scales to large quanttes of data, and requres no preclassfcaton of anomales. Experments wth dfferent hydrologc real-world tme seres showed that the proposed methods are fast and correctly dentfy abnormal data and can be used for hydrologc tme seres analyss. 1. Introducton As the fundamental resources for water resources management and plannng, long-term hydrologcal data are sets of dscrete record values of hydrologcal elements that are collected wth tme and have been frequently analyzed n the feld such as flood and drought control, water resources management, and water envronment protecton. Wth the development of data acquston technology and data transmsson technology, hydrologcal departments collected everncreasng amounts of tme seres data from automatc montorng systems va loggers and telemetry systems. Wthn these datasets, hydrologc tme seres analyss becomes workable and credble for buldng mathematcal model to generate synthetc hydrologc records, to forecast hydrologc events, to detect trends and shfts n hydrologc records, and to fll n mssng data and extend records [1]. However, a hydrologc tme seres s generally composed of a stochastc component supermposed on a determnstc component and usually shows stochastc, fuzzy, nonlnear, nonstatonary, and multtemporal scale characterstcs [2]. So, t s a challengng task to process and nterpret the orgnal hydrologcal tme seres due to the followng: () the large volumes of data, () the parameter pattern beng specfc and changng to dfferent hydrology acquston system due to multtemporal scale characterstc, () abnormal events or dsturbances that create spurous effects n the data seres and result n unexpected patterns, (v) naccuraces n hydrologcal models due to mprecse and outdated nformaton, logger and communcatons falures, poor calbraton, and lack of system feedback. Consequences of such stuatons n hydrologcal nformaton systems may result n the DRQP (data rch, but qualty poor) phenomenon. Consequently, the orgnal montorng data (.e., precptaton, dscharge, and water levels) should undergo a preprocessng step to elmnate the negatve nfluence caused by ncorrect or abnormal data due to

2 2 Mathematcal Problems n Engneerng nstrumentaton faults, data nherent change, operaton error, or other possble nfluencng factors [3]. Therefore, outler detecton usually becomes a vtal step for hydrologc tme seres analyss based on the montorng data. Regardng small montorng datasets, data managers can detect and deal wth outlers drectly wth a smple graphcal or manual process. However, for massve datasets or data stream, an automatc and objectve technque for effectvely detectng and treatng outlers s necessary. Ths study develops a real-tme outler detecton method that employs a wndow-based forecastng model for hydrologc tme seres collected from automatc montorng systems.themethodbuldsaforecastngmodelfromasequence of hstorcal pont values wth a gven wndow to predct future values. If the observed value dffers from the predcted valuebeyondacertanthreshold,anoutlerwouldbendcated. The method uses predcton confdence nterval (PCI) as threshold n consderaton of uncertanty n the data seres parameters n the forecastng model. Data are classfed as anomalous/nonanomalous based on whether or not they fall outsde a gven PCI. Thus, the method provdes a prncpled framework for selectng a threshold. Ths method does not requre any preclassfed examples of data, scales well to large volumes of data, and allows for fast ncremental evaluaton of data as t becomes avalable. In order to evaluate the proposed method, t was appled to two dfferent hydrologcal varables, water level and daly flow, from Huayuankou (hereafter HYK, N, E) and Lanzhou (hereafter LZ, N, E) statons obtaned from natonal hydrology database of MWR, Chna. The results show that the proposed method can exactly detect the outlers n the hydrologcal tme seres wth near neglgble false postve rate. Furthermore, the algorthm s effcency sanalyzedbasedonthedetectonresults. The rest of the paper s organzed as follows. In the next secton (Secton 2) we present the related work to ths area of research. In Secton 3 we present detals about the proposed algorthm for outler detecton n tme seres based on predcton confdence nterval. A number of experments wth the proposed method usng real-world hydrologcal tme seres are reported n Secton 4.Fnally,Secton 5 gves conclusons and suggestons for further research. 2. Related Work 2.1. Tme Seres Analyss. A tme seres (TS) X={x(t) 1 t m}s a sequence of d-dmensonal observatons vector x(t) = (x 1 (t), x 2 (t),...x d (t)) ordered n tme. Mostly these observatons are collected at equally spaced, dscrete tme ntervals. It s called a unvarate (or sngle) tme seres when d s equal to 1 and a multvarate tme seres when d s equal to or greater than 2 [4]. Generally, a tme seres can be regarded as a sample realzaton from an nfnte populaton of such tme seres generated by a stochastc process, whch can be statonary or nonstatonary. In addton, the understandng of the structure and dependence of tme seres s acheved through tme seres analyss. Tme seres analyss s the nvestgaton of a temporally dstrbutedsequenceofdataorthesynthessofamodel for predcton wheren tme s an ndependent varable; as a consequence, the nformaton obtaned from tme seres analyss can be appled to forecastng, process control, outler detecton, and other applcatons [5]. A basc assumpton n tme seres analyss s that some aspects of the past pattern wll contnue to reman n the future. Also under ths setup, often the tme seres process s assumed to be based on past values of the man varable but not on explanatory varables whch may affect the varable/system. In hydrology, tme seres analyss s one of fronter scentfc ssues because t can detect and descrbe quanttatvely each of the hydrologc processes underlyng a gven sequence of observatons. Moreover, hydrologc tme seres analyss can also be used for buldng mathematcal models to generate synthetchydrologcrecords,toforecasthydrologcevents,to detect trends and shfts n hydrologc records, and to fll n mssng data and extend records. Consequently, tme seres analyss has become a vtal tool n hydrologcal scences and ts mportance has been dramatcally enhanced n the recent past due to ever-ncreasng nterest n the scentfc understandng of clmate change [6]. Inthetmeseresanalyss,tsassumedthatthedata (observatons) consst of a systematc pattern and stochastc component; the former s determnstc n nature, whereas the latter accounts for the random error and usually makes the pattern dffcult to be dentfed. Prevous research usually equates stochastc component to system error and then smply dscards t so as to not complcate the statstcal analyses. However, the stochastc component potentally ncludes nterestng and meanngful nformaton; t must be treated wth cauton. It s for ths reason that outler detecton becomes a hotspot research ssue n recent years Outler Detecton. An outler can be defned as observaton that devates so much from other observatons as to arouse suspcon that t was generated by a dfferent mechansm [7], or patternsndatathatdonotconformtoa well-defned noton of normal behavor [8]. Generally, an outler may be ncorrect and/or the crcumstances around themeasurementmayhavechangedovertme.thus,the dentfcaton and treatment of outlers consttute an mportant component of the tme seres analyss before modelng because outlers can have negatve mpacts on the selecton of the approprate model as well as on the estmaton of the assocated parameters. Outler detecton, also known as anomaly detecton n some lteratures, s an mportant long-standng research problem n the domans of data mnng and statstcs. The major objectve of outler detecton s to dentfy data objects that are markedly dfferent from, or nconsstent wth, the remanng set of data [9, 10]. In recent decades, ths research problem s attractng sgnfcant research attentons n the felds of statstcal analyss, machne learnng, and artfcal ntellgence due to ts mportant applcatons n a wde range of areas n busness, securty, nsurance, health care, and engneerng, to name a few. Chandola et al. [8]provdeacomprehensve classfcaton of the outler detecton technques

3 Mathematcal Problems n Engneerng 3 and classfy the technques nto sx categores: classfcatonbased [11], nearest-neghbor-based [12], clusterng-based [13], statstcal [14], nformaton theory-based [15], and spectral theory-based [16]. In addton to the abovementoned algorthms, other approaches have been adopted to solve the outler detecton problem. For nstance, sgnal processng technques such as wavelet transform [17] and Fourer transform [18] have been used to detect outler regons n meteorologcal data. Outler detecton s a very broad feld and has been studed n the context of a large number of applcaton domans where many detecton methods have been proposed accordng to the dfferent data characterstcs. Recently, there has been sgnfcant nterest n detectng outlers n tme seres. Generally, methods for tme seres outler detecton should consder the sequence nature of data and operate ether on a sngle tme seres or on a tme seres database. The goal of outler detecton on a sngle tme seres s to fnd an anomalous subregon, whle the goal of the latter s to dentfy a few sequences as outlers or to dentfy a subsequence n a test sequence as an outler. In some cases, a sngle tme seres s converted to a tme seres database through the use of a sldng wndow [19]. Gven a sngle tme seres, one can fnd partcular elements (or tme ponts) wthn the tme seres as outlers or fnd subsequence outlers. Fox defnes two types of outlers (type I/addtve and type II/nnovatve) based on the data assocated wth an ndvdual object across tme, gnorng the communty aspect completely [20]. Snce then, a large number of t predcton models and profle based models have been proposed to fnd pont outler wthn a tme seres. A straghtforward method for outler detecton n tme seres s based on forecastng [21 23]. Ths approach frst bulds a predcton model from the hstorcal values andsthenusedtopredctfuturevalues.ifapredcted value dffers from the observed value beyond a certan threshold, an outler would be ndcated. The defnton of the value of the threshold to be used for detectng outlers s the man problem of outler detecton based on predcton. The dffcultes of forecastng-based outler detecton have motvated the proposal of anomaly subsequences detecton technques, based on outler score calculated wth respect to smlar measure between subsequences [24 27]. Keogh et al. [24] proposed usng one nearest-neghbor approach to detect maxmal dfferent subsequences wthn a longer sequence (called dscords). Subsequence comparsons can be smartly ordered for effectve prunng usng varous methods lke heurstc reorderng of canddate subsequences [25], localty senstve hashng [26], Haar wavelet [27], and SAX wth augmented tres [28]. The problem of outler detecton n tme seres database manly focuses on how to fnd all anomalous tme seres. It sassumedthatmostofthetmeseresnthedatabaseare normal whle a few are anomalous. Smlar to the tradtonal outler detecton, the usual recpe of solvng such problems s to frst learn a model based on all the tme seres sequences nthedatabaseandthencomputeanoutlerscoreforeach sequence wth respect to the model [29]. Statstcs-based methods are stll conducted to detect outlers n tme seres database because of ther effcent structure. In 2012, Zhang developedanaverage-basedmethodologybasedontme seres analyss and geostatstcs, whch acheved satsfactory detecton results n short snapshots [30]. However, t wll fal f the outlers gather together closely n the same short tme slot. Hence, recent researches have manly focused on nonparametrc outler detecton methods such as Bayesan method and dscrete wavelet transform (DWT). Freda proposed a Bayesan approach to outler modelng, approxmatng the posteror dstrbuton of the model parameters by applcaton of a componentwse Metropols-Hastngs algorthm [31]. Apart from the Bayesan method, Grané and Vega [32] dentfed the outlers as those observatons n the orgnal seres whose detal coeffcents are greater than a certanthresholdbasedonwavelettransform.theyterated the process of DWT and outler correcton untl all detal coeffcents are lower than the threshold. Based on real-world fnancal tme seres, ther method acheved a lower average number of false outlers than Blen and Huzurbazar s [33]. However,ntheseworks,thresholdsaremanlysetsubjectvely, whch makes these methods neffcent and nsenstve when there are several dfferent knds of outlers appearng n thesametmeseres Outler Detecton n Hydrologcal Tme Seres. Hydrologc systems nvolvng outlers nvarably represent complex dynamcal systems. The current state and future evolutons of such dynamcal systems depend on countless propertes and nteractons nvolvng numerous hghly varable physcal elements. The representaton of such dynamcal systems n ther correspondng models s complcated because certan relatonshps can only be developed through analyses. Outler detecton n hydrologc data s a common problem whch has receved consderable attenton n the unvarate framework. In the multvarate settng, the problem s well establshed n statstcs. However, n the hydrologc feld, the concepts are much less establshed. A poneerng work n ths drecton was recently presented by Chebana and Ouarda [34]. Moreover, many outler detecton technques, such as Chauvenet s method, Dxon-Thompson outler test, and Rosner s test [35], are statstcs based on the prncple of hypothess testng wth the underlyng assumpton of log-pearson type III (LP3) dstrbuton [36], whch may not always be readly avalable for hydrologcal tme seres. Hyndman and Shang s [37] methods, on the bass of real data, are graphcal and consst frst n vsualzng functonal data through the ranbow plot and then n dentfyng functonal hydrologcal outlers usng the functonal bag-plot and the functonal hghest-densty regon box-plot. The methods can detect outler curves that may le outsde the range of the majorty of the data or may be wthn the range of the data but have a very dfferent shape. However, as ndcated by Chebana and Ouarda [34], the ponts outsde the fence of the bag-plot or box-plot are consdered as extremes rather than outlers. On ths bass, Chebana et al. proposed a nongraphcal outler detectonmethodbasedonthebvaratescoreponts,whch were obtaned from the frst two prncpal components score vectors generated by functonal prncpal component analyss, to detect outlers n flood frequency analyss [38].

4 4 Mathematcal Problems n Engneerng Ng et al. [39] found that chaotc approach can determne the level of complexty of a system after they appled the chaotc analytcal technques to daly hydrologc seres comprsng of outler. It adopts box-plot [40] as the outler detecton toolsbasedonts(1)statstcalpopularty,(2)abltyto handle and detect multple outlers smultaneously wthout precedng to an teratve detecton, and (3) ablty to detect outlers by block detecton where no maskng effects are nvolved before conductng the chaotc analyss. The method can effectvely detect numerous outlers n the orgnal daly dscharge dataset but may understate the actual number of outlers because the data s usually clustered around the zero mean and consequently results n less skewness n the data. Although many outler detecton methods exst n the lterature, there s a lack of dscusson on the selecton of a proper detecton method for hydrologcal outlers. It s manly because of the fact that most of outler detecton methods belong to statstcal approaches and demanded that thedatamustfollowsomedstrbutons,andtheselectonof a sutable outler detecton method s crtcally determned bythententofanalystandthentendeduseoftheresults. Analysts have to consder several techncal aspects n ts decsons-makng such as the tradeoff between accurate and effcent, the evaluaton of consequences subject (.e., maskng and swampng), the desgn assumptons and the lmtaton of dfferent methods, and the preference on parametrc or nonparametrc approach. Wthout a thorough understandng of outler phenomena, t s dffcult to determne a sutable outler detecton method. Faced wth such challenges, ths work proposes a new method to detect outlers that splts gven hstorcal hydrologcal tme seres nto subsequences by a sldng-wndow andthenanautoregressve(ar) predctonmodeloftme seres and predcton confdence nterval (PCI) calculated from nearest-neghbor hstorcal data to dentfy tme seres anomales. The method used an autoregressve predcton model, whch belongs to data-drven tme seres model essentally, rather than a physcs-based tme seres model, due to thefactthattssmplertodevelopandcanrapdlyproduce accurate short forecast horzon predctons. Data are classfed as anomalous/nonanomalous based on whether or not they fall outsde a gven PCI.ThePCI whch s employed n ths study not only accounts for the fact that the correlatons between adjacent data ponts n the tme seres are hgher than those farther away but also avods fallng nto neffcent and nsenstve dlemma caused by a subjectve-settng threshold. Moreover, the PCI canbecalculateddynamcally accordng to dfferent nearest-neghbor wndows sze and confdence coeffcent of dfference users, whch make t sutable for dfferent varables of hydrologc tme seres outler detecton for dfferent user s demand. In concluson, ths method does not requre any preclassfed examples of data, scales well to large volumes of data, and allows for fast ncremental evaluaton of data as t becomes avalable. 3. Wndow-Based Outler Detecton In ths secton, we formulate the outler detecton problem and gve a formal defnton of some concepts whch are used n the proposed algorthm. And then we wll ntroduce thealgorthmdetectngoutlersntmeseresbasedonthe sldng-wndow predcton model. In addton, we also menton effcent strateges to choose the optmal parameters to meet the users requrements. Next s the formal formulaton of the contextual outler detecton algorthm Problem Defnton. Hydrology s a tme-varyng phenomenon, the change of whch s referred to hydrologcal processes. As mportant scentfc data resources, hydrologcal data are the dscrete records of hydrologcal processes and couldbedvdedntoflow,waterlevel,ranfall,evaporaton, and other hydrologc tme seres accordng to the physcal quanttes of ts representaton. Defnton 1 (hydrologcal tme seres). A hydrologcal tme seres T s a set of real-valued data n successve order, occurrng unform tme nterval. In ths work, T = d 1 =(V 1,t 1 ), d 2 =(V 2,t 2 ),...d m =(V m,t m ),wherems the length of the tme seres and pont d = (V,t ) stands for the observaton V at the moment t ;moreover,t s strctly ncreased. For outler detecton purposes, we are typcally not nterestednanyoftheglobalpropertesofatmeseres;rather,we are nterested n local subsectons of the tme seres, whch are called subsequences. Defnton 2 (subsequence). Gven a tme seres T of length m, asubsequencec of T s a samplng of length n mof contguous poston from p; thats,c= d p =(V p,t p ), d p+1 = (V p+1,t p+1 ),...d p+n 1 = (V p+n 1,t p+n 1 ) for 1 p m n+1.partcularly,subsequencec may contan only one data pont on the condton that n=m. Snce all subsequences may potentally be abnormal, any algorthm wll eventually have to extract all of them; ths can be acheved by use of a sldng wndow. Defnton 3 (sldng wndow). Gven a tme seres T of length m and a user-defned subsequence length of n, all possble subsequences can be extracted by sldng wndow of sze n across T and consderng each subsequence C p. Generally, the frst problem of tme seres outler detecton s to defne what knd of data n a gven dataset s abnormal. That s, the defnton of outler determnes the outler detectons goals. In hydrologc tme seres, tme sequences whch are composed of dfferent physcal quanttes show great dfference anomaly characterstcs; therefore, t s dffcult to gve a unform defnton of abnormalty. In ths paper, we dentfy a subsequence to be outler based on ts nearestneghbor. Defnton 4 (k-nearest-neghbor). Gven a tme seres T of length m and a data pont d ( < m),thek-nearest-neghbor of d s a samplng of length 2k of contguous poston η (k)

5 Mathematcal Problems n Engneerng 5 from kto +kand does not contan ; thats,η (k) = {d k,...d 1,d +1,...d +k } for +1 k m. Defnton 5 (tme seres outler). Gven a tme seres T of length m and the k-nearest-neghbor η (k) of d, d wll be dentfed as an outler f the observed value of d falls outsde a PCI calculated by confdence coeffcent p and predcted value accordng to η (k). From the above defnton, one can see that the nearestneghbors wndow sze k and confdence coeffcent p become key parameters of outler detecton. Therefore, t can dynamcally adjust k and p for dfferent users and dfferent hydrologcal elements to acheve the optmal detecton result Algorthm Descrpton. After studyng the current stuaton and challenge of hydrologcal tme seres and ts outlers, ths study proposes a new outler detecton method that uses a sldng wndow of hydrologcal tme seres T m = d 1 =(V 1,t 1 ), d 2 =(V 2,t 2 ),...d m =(V m,t m ), where pont d = (V,t ) stands for the measurement V at the moment t, to classfy a partcular data pont as anomalous ornot.adatapontd s classfed as anomalous f ts measurement devates sgnfcantly from ts k-nearest-neghbor (KNN) predcton value calculated usng ts neghborng pont set η (k) as nput. Upon ntalzaton, the method flls the wndow wth the most recent measurements and commences classfcaton wth the next measurement taken by the tme seres. Inbref,themethodconsstsofthefollowngstepsbegnnng at tme wthn a gven hydrologcal tme seres T. Step 1. Defne k-nearest-neghbor wndow η (k) for the pont d, based on the data source, from where outlers are to be detected (hstory data ncludng complete sequence or only past data to forecast new ncomngs); the η (k) can be dvded nto one-sded-wndows and two-sded-wndows types. Step 2. Buld a nearest-neghbor-wndow predcton model that takes η (k) as nput to predct V +1,theexpectedvalueofthe pont d +1 ; n addton, calculate the upper and lower bounds of the range wthn whch the measurement should le (.e., the predcton confdence nterval, PCI). Step 3. Compare the actual measurement at tme +1wth the range calculated n Step 2 and classfy t as anomalous f t falls outsde the range; otherwse, classfy t as nonanomalous. Step 4. Modfy T by removng d k+1 from the back of the wndow and addng d +1 whch holds the value V +1 to the front of the wndow to create T +1 ; slde one step and modfy the η (k) to create η (k) +1 f the measurement s classfed as anomalous; else, modfy T by removng d k+1 from the back of the wndow and addng d +1 whch takes ts orgnal value to the front of the wndow to create T +1 ;sldeonestepand modfy the η (k) to create η (k) +1. Step 5. Repeat Steps 1 4, untl all sequence has been detected. Tme seres dataset T m ={ d 1,d 2,d 3,...d k+1...d,...d m } Defne d s k-nearest-neghbor wndow η (k) ={d 2k, d 2k+1,...d 1 } Predct Calculate PCI Compare wth PCI extends PCI? Yes Mark d as outlers and correct t End Fgure 1: Dagram of the proposed outler detecton method. The detecton process of data pont d s llustrated wth a flow chart n Fgure 1. The remander of ths secton descrbes these steps n detal Wndow Defnton. The frst step of ths outler detecton process, the KNN wndow of the test pont d n tme seres data, s defned to llustrate the relatons between the data pont and ts nearest-neghbor. And then, the predcton model can use only the test pont s KNN wndow to predct the measurement of d forthepurposeofsmplfyngthe computatonal complexty. Generally, there are two types of hydrologcal data from where outlers are to be detected: hstory tme seres data or real-tme data. The dfference between them s prmarly based on the fact that the former uses the prevous and subsequent neghbor wndow as nput parameters to detect the outler whle the latter only uses the prevous neghbor wndow as nput parameters. Then, the neghbor wndow can be dvded nto one-sded and two-sded types. (1) Two-Sded Neghbor Wndows. The two-sded-wndows outler detecton method uses a data ponts prevous (left) and subsequent (rght) neghbor s data pont to determne whether ths data pont s outler or not. Gven a water No

6 6 Mathematcal Problems n Engneerng level tme seres T m = d 1 =(V 1,t 1 ), d 2 =(V 2,t 2 ),...d m = (V m,t m ), the two-sded-wndows neghborng pont sets η (k) for the pont d can be defned as follows: η (k) ={d k,...d 1,d +1,...d +k }. (1) Note that 2k s the sze of the neghborhood wndow, startng at kand endng at +k(not ncludng ). (2) One-Sded Neghbor Wndows. As the rght neghbors may contan outlers that had not been detected, therefore, n the real-tme detecton applcaton, only left neghbor can be avalable. So, one-sded-wndows outler detecton method only chooses left neghbors whch removed (or revsed) the outlers had been detected wll make the detecton results more meanngful. Other than the two-sded-wndows method, one-sded-wndows method outler detecton algorthms defne neghborhood pont set η (k) for the pont d,as follows: η (k) ={d 2k,d 2k+1,...d 1 }. (2) Here, 2k s the sze of the neghborhood wndow for the pont d,startngat 2kand endng at Model Selecton. In ths step, an autoregressve (AR) predcton model s bult to forecast the partcular pont s valueusngtsneghborhoodpontsetsasnput.thear model forecasts future measurements n tme seres datasets usng only a specfed set of observatons, that s, η (k),from thesamedschargeste;theyareusedbecausetheyavod complcatons caused by dfferent samplng frequences that can arse f a heterogeneous set of tme seres data was used; moreover, the use of AR model reduces the number of predctons that cannot be made due to nsuffcent data caused by the embedded telemetry equpment that went offlne. The neghborhood pont sets η (k) areusedasnputto the AR model of the tme seres to predct the followng observaton: d =M(η (k) ), (3) where M( ) s the model. Ths method assumes that the behavor of the processes at tme t+1canbedescrbedby a fnte set of k prevous measurements; thus t mplctly assumes that the tme seres s an order k Markov process. Lterature [22] compares natve, nearest cluster (NC), and sngle-layer lnear network (LN) and multlayer perceptron (MLP) on dfferent dataset and concludes that LN and MLP would obtan better detecton result than other models. Based on ths work, we use LN as predcton model and assume that observaton V s a lnear combnaton of two-sded neghbors wndows data pont: V = ( k j=1 w jv j + k j=1 w +jv +j ) ( k j=1 w j + k j=1 w, (4) +j) where w k,...,w 1,w +1,w +k stands for the weght of neghborhood, whch defnes the relatonshp between the two-sded-wndows neghborhood {d k,...d 1,d +1,... d +k } and the expected value of d. Generally, the weght of the neghborng pont d j s nversely proportonal to the dstance between the pont d and d j ;thats,thelargerthe dstance s, the small the weght w s. For smplcty, t assgns the weght vector w k,...,w 1,w +1,w +k wth the values 1,2,...k,k,...2,1. Generally, two-sded neghbor wndows need ponts prevous and subsequent neghbors; however, the rght neghbors may contan outlers that had not been detected, whch may affect detecton results subsequently. So, n some applcaton felds, only prevous (left) neghbor-wndow data can be used to predct and dentfy forthcomng outlers. Therefore, t often uses a smple modfcaton of the twosded-wndows model to predct the measurement at tme : j=1 w jv j V = 2k 2k j=1 w j. (5) Smlar to two-sded neghbors wndows, the weght vector w 2k,w 2k+1,...w 1 stands for the weght of neghborhood and s assgned wth the values 1,2,...2k Outlers Identfcaton. Gven the model predcton from Secton based on test pont s neghbor,the data pont can then be classfed as anomalous usng a confdence boundary PCI calculated va predcted value and confdence coeffcent. The PCI gves the range of plausble values that the test measurement can take; the confdence coeffcent (p = 100(1 α)) ndcates the expected frequency wth whch measurements wll actually fall n ths range. If t s assumed that the model resduals have a zero-mean Gaussan dstrbuton, the p% PCI canbecalculatedasfollows[22]: PCI = V t+1 ±t α/2,2k 1 s k, (6) where V t+1 s the predcton value of the test pont, t α/2,n 1 s the pth percentle of a Student s t-dstrbuton wth 2k 1 degrees of freedom, s s the standard devaton of the model resdual, and k s the wndow sze used to calculate s. Ths type of PCI s a type of t-nterval because t reles on Student s t-dstrbuton. If the test pont s actual measurement falls wthn the bounds of the PCI, then the pont s classfed as nonanomalous; otherwse, t s classfed as anomalous. Thus, the PCI represents a threshold for acceptance or rejecton of a data pont. The beneft of usng the PCI nstead of an arbtrary threshold s that the predcton level gudes the selecton of the nterval wdth. The two-sded-wndows approach for outler detectng s llustrated wth a smple example n Fgure 2 wth a neghborhood wndow wdth of k=4. It should be noted that the area between the two green lnes n Fgure 2 coverng the neghborhood of data ponts ndcates the confdences bound (PCI). In ths example, d 7 s an outler n the current neghborhood of ponts and proposes to be replaced by the

7 Mathematcal Problems n Engneerng Experments and Analyss Varable 2 1 η 7 (4) ={d 3,d 4,d 5,d 6,d 8,d 9,d 10,d 11 } Predcted value Tme Confdence bound Outler To demonstrate the effcacy of the outler detecton methods developed n ths study for data QA/QC, t wll be appled to hydrologcal data seres from natonal hydrology database of MWR, Chna. In the followng, the real data are descrbed and are functonal and results are presented anddscussed.moreprecsely,tfrstusestheprevously presented approaches to dentfy outlers; then, some performanceevaluatonwllbedscussedandnterpretedonthe bass of hydrologcal data; and some results usng multvarate approaches for comparson purposes wll be provded at last. Fgure 2: Outler detecton example, where the raw data, predcted value, confdence bound, and outler are dentfed by dfferent symbols. In ths example, d 7 s an outler n the current neghborhood of ponts and has been detected correctly. predcted value V 7 for analyss and modelng of the tme seres Parameter Optmzaton. In order to detect the outlers n the tme seres, the proposed methods should calculate the plausble values range PCI va predcted value and confdence coeffcent based on the test pont s k-nearestneghbor. Hence, the values of the two parameters, k and p, become the key ssues for mprovng the outler detecton methods mentoned n the prevous secton. In ths case, we can use the followng experments to choose a proper value of those two parameters. (1) The wndow wdth (k) of the methods controls how many neghborng ponts are ncluded n the calculaton of thepredctonvalue.thelargerthewndowwdths,themore ponts are ncluded to compute the predcton value. It vares k from 3 to 15 n ncrements of one; that s, k = {3, 4,..., 15}. (2) The confdence coeffcent p calculates the plausble values range PCI toclassfyadatapontasanoutler.the larger the confdence coeffcent s, the more plausble the predcton value s. Therefore, t vares p from 85% to 99% n ncrements of one percent. To tune the best combnaton of algorthm parameters that maxmzes the rato of detecton, the cross-valdaton scheme s appled. The complete sample s splt nto two segments: a tranng dataset and a testng dataset. Ths s a way of cross-valdatng whether the parameters found durng the frst perod, the tranng phase, are consstent and stll vald n a dfferent perod, the testng phase. The prncple of cross-valdaton s a generc resource to valdate statstcal procedures and t has been appled n dfferent contexts. Furthermore, n tranng phase, the tranng set s dvded nto 10 nonntersectng subsets of equal sze, chosen by random samplng. The model s then traned 10 tmes, each tme reservng one of the subsets as a valdaton set on whch the model error s evaluated whle fttng the model parameters usng the remanng nne subsets. The model parameters wth the lowest mean squared error among the 10 tranng models arethenselectedforthefnalmodel[41] Study Area and Data. In ths subsecton we report on a number of experments usng two dfferent hydrologc elements, water level (m) and daly flow (m 3 s 1 ), from LZ and HYK statons wth reference numbers and , respectvely, obtaned from natonal hydrology database of MWR, Chna. The LZ and HYK statons are the mportant flood preventon and control statons and are generally known as typcal hydrologcal statons of upper and mddle reaches of the Yellow Rver. They play an mportant role n downstream of Yellow Rver n the felds such as hydrologcal data collecton and forecastng, flood control schedulng, rver control experments, and water resources development. Fgure 3 ndcates the geographcal locaton of LZ and HYKstatons. The data downloaded from the natonal hydrology database of MWR, Chna, were avalable as raw data. In addton, we followed the procedures descrbed n the prevous secton. Fgures 4 and 5 llustrate the whole dataset wthn a gven tme seres; t shows that the dataset s nearly perodc and has some suspcous data pont obvously Experment Result. Snce the data used n ths study were subjectedtomanualqualtycontrolbeforebengarchved to the natonal hydrology database, t was expected that the detectors would not dentfy many data outlers n the archve. However, we can easly see that some data ponts devate from ther neghbor. And then, we apply our methods to detect the outlers n the gven hydrologcal tme seres wth the wndow sze k=6and probablty p=95%. And the detecton results fortheproposedmethodsareshownnfgures6, 7, 8,and9. Fgures 6 9 depct real and predcted values wth the proposed outler detected methods n the daly flow and water level tme seres of LZ and HYKstatons, respectvely. In each of these graphcs the PCI for the predctons s also depcted. These graphcs show that most of the real values are very near the respectve predcted value, whle a spot of them les outsde the PCI boundares for the predctons. An outler detecton method based on PCI wouldndcateanoutlerf the real value les outsde the confdence bounds, as descrbed n Secton 3.2. The experments reported here showed that these bounds, bult from cross-valdaton scheme on tranng and valdaton sets, can correctly bnd the regon of normalty. Furthermore, these ntervals could be used to ndcate the level of suspcon assocated wth a gven pont n the future.

8 8 Mathematcal Problems n Engneerng N Staton Fgure 3: Geographcal locaton of LZ and HYK statonsn Yellow Rver. Water level (m) Staton ( ) raw water level data: from 1998/1/1 to 1999/12/31 Daly flow (m 3 /s) Staton ( ) raw daly flow data: from 1998/1/1 to 1999/12/31 Tme (day) Tme (day) (a) Water level (b) Daly flow Fgure 4: Raw hydrologcal tme seres of LZ staton Experment Evaluaton Parameters of Qualty. The detecton results for the proposed methods wth the two dfferent hydrologc elements from LZ and HYK statons llustrate that the outler detecton methods are promsng and can successfully detect a consderable amount of outlers from the hydrologc dataset. However, we also note that there are some vald data ponts whch had been detected as outlers and mputed n error. The objectve of the evaluaton analyss descrbed n ths secton s to assess the effectveness of method; we can classfy the results from our experment nto four categores (see Table 1). The categores n Table 1 correspond to the four possble outcomes of one expermental run, whch conssts of usng both methods wth a partcular combnaton of wndow wdth (k) and probablty (p). CategoresA and D are deal stuatons n whch a pont can be detected correctly, whle categores B and C are undesred, because the methods are not able to dstngush between outlers and exactness. Accordng to these defntons, the Senstvty s the probablty that the proposed methods dscovered a real outler.itsformulasdefnedasfollows: Senstvty = TP (TP + FN). (7) Truth Outler Not an outler Table 1: Assessment of both methods. Outler TruepostvesorTP(A) data ponts that are outlers and dentfy as outlers False postves or FP (B) data ponts that are normal but dentfy as outlers Detecton Not an outler False negatves or FN (C) data ponts that are outlers but dentfy as normal True negatves or TN (D) data ponts that are normal and dentfy as normal Another relevant parameter s the Specfcty,the proporton of negatve test results among the normal; the mathematcal expresson s as follows: Specfcty = TN (TN + FP). (8) The postve predctve value (PPV)stheprobabltythat a detected outler s ndeed a real one. Its formula s as follows: PPV = TP (TP + FP). (9)

9 Mathematcal Problems n Engneerng 9 Staton ( ) raw water level data: from 2006/1/1 to 2007/12/31 Staton ( ) raw daly flow data: from 2006/1/1 to 2007/12/31 Water level (m) Daly flow (m 3 /s) Tme (day) Tme (day) (a) Water level (b) Daly flow Fgure 5: Raw hydrologcal tme seres of HYK staton. Staton ( ) raw water level data: from 1998/1/1 to 1999/12/31 Staton ( ) raw water level data: from 1998/1/1 to 1999/12/31 Water level (m) Tme (day) Water level (m) Tme (day) Predcted value Confdence bound Outler Predcted value Confdence bound Outler (a) Two-sded wndow detecton results (b) One-sded wndow detecton results Fgure6: Detecton resultsoverlz water level. Staton ( ) detecton result: from 1998/1/1 to 1999/12/31 Staton ( ) detecton result: from 1998/1/1 to 1999/12/ Daly flow (m 3 /s) Daly flow (m 3 /s) Tme (day) 0 Tme (day) Predcted value Confdence bound Outler Predcted value Confdence bound Outler (a) Two-sded-wndow results (b) One-sded-wndow detecton results Fgure 7: Detecton results over LZ daly flow.

10 10 Mathematcal Problems n Engneerng Water level (m) Staton ( ) detecton result: from 2006/1/1 to 2007/12/31 Tme (day) Water level (m) Staton ( ) detecton result: from 2006/1/1 to 2007/12/31 Tme (day) Predcted value Confdence bound Outler Predcted value Confdence bound Outler (a) Two-sded-wndow detecton results (b) One-sded-wndow detecton results Fgure8: DetectonresultsoverHYK water level. Staton ( ) detecton result: from 2006/1/1 to 2007/12/31 Staton ( ) detecton result: from 2006/1/1 to 2007/12/ Daly flow (m 3 /s) Daly flow (m 3 /s) Tme (day) 0 Tme (day) Predcted value Confdence bound Outler Predcted value Confdence bound Outler (a) Two-sded-wndow detecton results (b) One-sded-wndow detecton results Fgure 9: Detecton results over HYK daly flow. Fnally, the negatve predctve value (NPV) s the proporton of nonoutlers among subjects wth a negatve test result. Its formula s as follows: NPV = TN (TN + FN). (10) Quantfyng the Outler Occurrences. Astatstcalmeasureoftheaccuracysprovdednthssecton.Theparameters used are the ones descrbed n Secton and summarzedntables2 and 3. Note that all parameters refer to the whole year from 2006 to 2007 of HYK staton and 1998 to 1999 of LZ staton. And just for smplcty, only the detecton result wth three pars of parameters and the explanaton for the water level where (k, p) takes of the value (6, 0.95) s provded, snce the reasonng for the remanng hydrologc elements and parameters s analogous. Note that the optmal values for the parameters (k, p) would be(6,0.95)on water level tme seres of HYK staton accordng to parameter optmzaton method descrbed n Secton 3.3. As t s shown n Table 2, durng the one-sded methods detectng process on water level tme seres of HYK staton wth (k, p) takng the value of (6, 0.95), there were 14 outlers properly detected; that s, TP = 14. On the other hand, there were two occasons whch orgnally represent normal event but are to be consdered an outler by the methods; therefore, FP = 2. Furthermore, there was one day whch was outler and was not detected by the methods; that s, FN=1.And the remanng 713 days were correctly judged as normal event. Thus, TN = 713 ( =730forecasteddaysfrom 2006 to 2007). The results show great accuracy for both features. Partcularly remarkable are the values reached by the specfcty. In partcular, t exceeds 99% n all stuatons. These values mean that when the approach classfes the day to be predcted as normal,tdoestwthhghrelablty.

11 Mathematcal Problems n Engneerng 11 Table 2: Statstcal analyss of one-sded methods wth dfferent parameters of HYK staton. Parameters Water level Daly flow (5, 0.95) (6, 0.95) (6, 0.96) (7, 0.95) (5, 0.95) (5, 0.96) (6, 0.96) (5, 0.97) TP TN FP FN Senstvty 80.00% 93.33% 80.00% 73.33% 83.33% 88.89% 77.78% 72.22% Specfcty 99.72% 99.72% 99.44% 99.58% 99.30% 99.72% 99.44% 99.58% PPV 85.71% 87.50% 75.00% 78.57% 75.00% 88.89% 77.78% 81.25% NPV 99.58% 99.86% 99.58% 99.44% 99.58% 99.72% 99.44% 99.30% Table 3: Statstcal analyss of both methods wth optmal parameters of gven dataset. LZ staton HYK staton Parameters Water level Daly flow Water level Daly flow One-sded (6, 0.96) Two-sded (6, 0.95) One-sded (7, 0.96) Two-sded (6, 0.95) One-sded (6, 0.95) Two-sded (6, 0.96) One-sded (5, 0.96) Two-sded (6, 0.95) TP TN FP FN Senstvty 90.91% 81.82% 85.71% 90.48% 93.33% 86.67% 94.44% 94.44% Specfcty 99.44% 99.44% 99.58% 99.29% 99.72% 99.30% 99.72% 99.44% PPV 83.33% 81.82% 85.71% 79.17% 87.50% 72.22% 89.47% 80.95% NPV 99.72% 99.44% 99.58% 99.72% 99.86% 99.72% 99.86% 99.86% As for the senstvty, all the results reached values greater than 73% (except for daly flow wth the stuaton (k, p) takng thevalueof(5,0.97),nwhchtreaches72.22%)andobtans 81.6% on average. The PPV reached values smlar to those of the senstvty, n partcular, slghtly greater (82.8% on average). Therefore, t can be stated that when the proposed method determnes that there s an upcomng outler, t s hghly relable. Fnally, NPV provdes smlar values for all the stuatons, reachng 99.58% on average; that s, the rate of real outlers not found by the approach cannot be consdered sgnfcant. As for Table 3, t can be easy to draw a concluson that dfferent hydrologc elements of the same staton and the same hydrologc elements of dfferent staton may take dfferent optmal parameters values for (k, p) from experments descrbed n Secton 3.3. Moreover, we can see that the detecton accuracy of one-sded method s better than twosded one as a whole. For example, there were 15 outlers n water level dataset of HYK staton; one-sded method can correctly detect 14 abnormal and 713 normal data ponts; smultaneously, t masks one outler as normal and swamps two normal samples as outlers. As a comparson, two-sded methods can correctly detect 13 abnormal and 710 normal data ponts; correspondngly, the maskng and swampng effects reach to 2 and 5, respectvely. The reason for ths dfference may le n that the latter method needs both sdes neghbors whch may contan outlers, whch may lead to the fact that maskng and swampng event occurs Analyss and Dscusson. We compared our methods wth other methods such as SVM (support vector machne) [42], box-plot technques [40], and medan method[23] on thesametestdatasets.thecomparsonresultswlldsplay n the recever operatng characterstc (ROC) [43] curves, whch wll be shown later. By conventon, the ROC curve dsplays senstvty (TPR) on the vertcal axs aganst the complement of specfcty (1-specfcty or FPR) on the horzontal axs. The ROC curve then demonstrates the characterstc recprocal relatonshp between senstvty and specfcty, expressed as a tradeoff between the TPR and FPR. Ths confguraton of the curve also facltates calculaton of the area beneath t as a summary ndex of overall test performance. Therefore, the larger the area under the ROC curve, the better the performance of the technque. Fgure 10 reveals the ROC curves obtaned by proposed methods and other technques. For these datasets, the performances of proposed methods are satsfactory and stable. For the water level dataset of LZ, our methods obtan smlar results; ther ROC curves show better performance than those of SVM and box-plot methods, whle medan method shows second-better ROC curves. For the daly flow dataset of LZ, our methods and SVM show better performance than medan method and box-plot method; moreover, the performance of one-sded-wndow method s slghtly better than that of two-sded-wndow. For the water level of HYK,ourmethods show preferable results, whle SVM method obtans relatvely better results than box-plot and medan methods. For the

12 12 Mathematcal Problems n Engneerng Table 4: Comparsons of area under ROC curves. AUCs Medan Box-plot SVM Two-sded One-sded LZ water level LZ daly flow HYK water level HYK daly flow Medan Box-plot SVM Two-sded One-sded Medan Box-plot SVM Two-sded One-sded (a) ROC of LZ water level (b) ROC of LZ daly flow Medan Box-plot SVM Two-sded One-sded (c) ROC of HYK water level Medan Box-plot SVM Two-sded One-sded (d) ROC of HYK daly flow Fgure 10: ROC curves for dfferent datasets. Fve methods are compared: medan method, box-plot method, SVM method, and our method, whch are descrbed usng dfferent colored lnes. daly flow ofhyk,the ROC curves nterlace wth each other, so t s dffcult to evaluate the performance only accordng totheobservaton.wehavealsocalculatedtheaucs for the ROC curves to compare the results (see Table 4). Note that there s a remarkable mprovement n the detecton performance. The average AUCs of our method are and 0.942, whch are the hghest two among these fve methods. The average AUCs of medan, box-plot, and SVM methods are 0.892, 0.878, and 0.897, respectvely. For our method, t can be nferred that the anomales canbeeffectvelydetectedbythewndow-basedforecastng model, whch s constructed usng AR predcton model and dynamcally boundary movement strateges. The results suggest that our methods mprove the robustness of the overall decson, whle the other methods show dfferent performances on dfferent datasets. The other three technques have the smlar characterstc; they acheve better or worse results on dfferent datasets. Results ndcate that the proposed framework can fnd robust PCI as a means for defnng the thresholds for detectng noveltes n tme seres and can therefore mprove performance of forecastng-based tme seres novelty detecton. Moreover, our method s robust

13 Mathematcal Problems n Engneerng 13 to detect anomales n dfferent datasets, whch means that t s more data ndependent. It s mportant to menton that we have also valdated our method on dfferent datasets. The results were smlar to those reported n ths secton. Furthermore, ths paper used PCI rather than an arbtrary, user-defned threshold, whch s manly because of the fact that PCI can provde gudance (n the form of the confdence coeffcent) for the calculaton of the normal boundary regon wthout requrng any knowledge of the process varables beng measured. The p% PCI ndcates the regon lkely to contan at least p% of the possble sensor measurements, and we expect that approxmately (1 p)% of the nonanomalous data wll be msclassfed as anomalous. Ths ndcates that the normal assumpton mplct n the PCI calculaton s reasonable for the hydrologcal tme seres data. One lmtaton of our methods s that we dentfy a data pont to be an outler or not by comparng new observed value to PCI calculated dynamcally accordng to dfferent nearest-neghbor-wndows sze and confdence coeffcent of dfference users, whch may cost a certan amount of tme complexty to calculate optmzaton parameter for best detecton results. Notwthstandng that lmtaton, ths study does suggest that the proposed method can mprove the performance of outler detecton. Results confrm that our methods can acheve a more robust performance than other outler detecton technques. 5. Conclusons Outler detecton, one of the classcal topcs of data mnng, has generated a great deal of research n recent years owng to the new challenges posed by large hgh-dmensonal data. In the meantme, outlers n hydrologcal tme seres have many practcal applcatons, such as data QA/QC, adaptve samplng, and anomalous event detecton. Ths research developed a tme seres outler detecton method that employs a wndow-based forecastng mode n conjuncton wth PCI to detect noveltes n hydrologcal tme seres. The method frst splts gven hstorcal hydrologcal tme seres nto subsequences by a sldng-wndow, and then an autoregressve predcton model of tme seres was bult from ts nearestneghbor-wndow to predct future values. Anomales are assumed to take place f the observed values fall outsde a gven PCI, whchcanbecalculateddynamcallyaccordng to dfferent nearest-neghbor-wndows sze and confdence coeffcent of dfferent user. Moreover, experments wth two dfferent hydrologcal varables, water level and daly flow, from LZ and HYK statons obtaned from natonal hydrology database of MWR, Chna, were performed to examne the effects of the method. In the experments, sngle-layer lnear networks were used for forecastng model; confdence coeffcent and wndow sze are assgned to 95% and 6 respectvely n the ntalzaton phase. Moreover, the best combnaton of parameters confdence coeffcent and wndow sze can be tuned accordng to dfferent user s requrements and tme seres features. The case study results suggest that the proposed outler detecton methods developed n ths study are useful tools for dentfyng anomales n hydrologcal tme seres. Snce these methods only requre a tme-seres model of the tme seres, they can be easly appled to many real-tme hydrologcal tme seres. However, t should be noted that, whle the LN produced the best model for the water level and daly flow tme seres consdered n the case study, ths model may not be the most approprate choce for other types of hydrologcal data. Furthermore, the optmzaton combnaton of parameters confdence coeffcent and wndow sze may cost a certan amount of tme complexty for dfferent hydrologcal element and dfferent user s requrements. In spte of ths, these methods do not requre any preclassfed examples of data, scale well to large volumes of data, and allow for fast ncremental evaluaton of data, whch make them an deal choce for correctly and effcently hydrologcal tme seres outler detecton, especally for cases n whch there s lttle nformaton about how to set the threshold value. Conflct of Interests The authors declare that there s no conflct of nterests regardng the publcaton of ths paper. Acknowledgments ThsworkssupportedbytheNaturalScenceFoundaton of Chna (nos , , and ) and the Natonal Scence and Technology Infrastructure of Chna (no. 2005DKA32000). References [1] J. D. Salas, Analyss and modelng of hydrologc tme seres, n Handbook of Hydrology,vol.19,pp.1 72,1993. [2] W. Gujer, Systems Analyss for Water Technology, Sprnger, Berln, Germany, [3] N. Lauzon, Water resources data qualty assessment and descrpton of natural processes usng artfcal ntellgence technques [Ph.D. thess],unverstyofbrtshcolumba,2003. [4] K. Yang and C. Shahab, A PCA-based smlarty measure for multvarate tme seres, n Proceedngs of the 2nd ACM Internatonal Workshop on Multmeda Databases, pp , ACM, November [5] G.E.P.Box,G.M.Jenkns,andG.C.Rensel,Tme Seres Analyss: Forecastng and Control, John Wley & Sons, New York, NY, USA, [6] D. Machwal and M. K. Jha, Hydrologc Tme Seres Analyss: Theory and Practce, Sprnger, New York, NY, USA, [7] D. M. Hawkns, Identfcaton of Outlers, vol. 11, Chapman &Hall, London, UK, [8] V. Chandola, A. Banerjee, and V. Kumar, Anomaly detecton: a survey, ACM Computng Surveys,vol.41,no.3,artcle15,2009. [9]M.Gupta,J.Gao,C.Aggarwal,andJ.Han,Outler Detecton for Temporal Data, Synthess Lectures on Data Mnng and KnowledgeDscovery,Morgan&Claypool,2014. [10] V. J. Hodge and J. Austn, A survey of outler detecton methodologes, Artfcal Intellgence Revew,vol.22,no.2,pp , 2004.

14 14 Mathematcal Problems n Engneerng [11] K. Das and J. Schneder, Detectng anomalous records n categorcal datasets, n Proceedngs of the 13th ACM SIGKDD Internatonal Conference on Knowledge Dscovery and Data Mnng, pp , ACM, August [12] M. M. Breunq, H.-P. Kregel, R. T. Ng, and J. Sander, LOF: dentfyng densty-based local outlers, ACM Sgmod Record, vol. 29, no. 2, pp , [13] Z. He, X. Xu, and S. Deng, Dscoverng cluster-based local outlers, Pattern Recognton Letters, vol. 24, no. 9-10, pp , [14] C. C. Aggarwal and P. S. Yu, Outler detecton wth uncertan data, n Proceedngs of the 8th SIAM Internatonal Conference on Data Mnng, pp , Aprl [15] S. Ando, Clusterng needles n a haystack: An nformaton theoretc analyss of mnorty and outler detecton, n Proceedngs of the 7th IEEE Internatonal Conference on Data Mnng (ICDM 07), pp , October [16] A. Agovc, B. Arndam, G. Auroop, and P. Vladmr, Anomaly detecton usng manfold embeddng and ts applcatons n transportaton corrdors, Intellgent Data Analyss, vol.13,no. 3, pp , [17]S.BaruaandR.Alhajj, Aparallelmult-scaleregonoutler mnng algorthm for meteorologcal data, n Proceedngs of the 15th ACM Internatonal Symposum on Advances n Geographc Informaton Systems (GIS 07), pp , ACM, November [18] F.Rasheed,P.Peng,R.Alhajj,andJ.Rokne, Fourertransform based spatal outler mnng, n Intellgent Data Engneerng and Automated Learnng IDEAL 2009,vol.5788ofLecture Notes n Computer Scence,pp ,Sprnger,Berln,Germany,2009. [19] U. Rebbapragada, P. Protopapas, C. E. Brodley, and C. Alcock, Fndng anomalous perodc tme seres : an applcaton to catalogs of perodc varable stars, Machne Learnng, vol. 74, no. 3, pp , [20] A. J. Fox, Outlers n tme seres, JournaloftheRoyalStatstcal Socety. Seres B (Methodologcal),vol.34,pp ,1972. [21] J. Ma and S. Perkns, Onlne novelty detecton on temporal sequences, n Proceedngs of the 9th ACM SIGKDD Internatonal Conference on Knowledge Dscovery and Data Mnng (KDD 03), pp , ACM, August [22] D. J. Hll and B. S. Mnsker, Anomaly detecton n streamng envronmental sensor data: a data-drven modelng approach, Envronmental Modellng and Software, vol.25,no.9,pp , [23] A. L. I. Olvera and S. R. L. Mera, Detectng noveltes n tme seres through neural networks forecastng wth robust confdence ntervals, Neurocomputng, vol. 70, no. 1 3, pp , [24] E. Keogh, J. Ln, A. W. Fu, and H. Van Herle, Fndng unusual medcal tme-seres subsequences: algorthms and applcatons, IEEE Transactons on Informaton Technology n Bomedcne,vol.10,no.3,pp ,2006. [25] E. Keogh, J. Ln, and A. Fu, HOT SAX: effcently fndng the most unusual tme seres subsequence, n Proceedngs of the 5th IEEE Internatonal Conference on Data Mnng (ICDM 05),pp , Houston, Tex, USA, November [26] L. We, E. Keogh, and X. X, SAXually explct mages: fndng unusual shapes, n Proceedngs of the 6th Internatonal Conference on Data Mnng (ICDM 06), pp , Hong Kong, December [27] Y. Bu, T.-W. Leung, A. W.-C. Fu, E. J. Keogh, J. Pe, and S. Meshkn, WAT: fndng top-k dscords n tme seres database, n Proceedngs of the 7th SIAM Internatonal Conference on Data Mnng (SDM 07), pp , Aprl [28] J. Ln, E. Keogh, A. Fu, and H. van Herle, Approxmatons to magc: fndng unusual medcal tme seres, n Proceedngs of the 18th IEEE Symposum on Computer-Based Medcal Systems, pp , June [29] M. Gupta, J. Gao, C. C. Aggarwal, and J. Han, Outler detecton for temporal data: a survey, IEEE Transactons on Knowledge anddataengneerng,vol.26,no.9,p.1,2014. [30] Y. Zhang, N. A. S. Hamm, N. Meratna, A. Sten, M. van de Voort,andP.J.M.Havnga, Statstcs-basedoutlerdetecton for wreless sensor networks, Internatonal Geographcal Informaton Scence,vol.26,no.8,pp ,2012. [31] R. Freda, I. Agueusopa, B. Bornkampb et al., Bayesan outler detecton n INGARCH tme seres, Sonderforschungsberech (SFB) 823, [32] A. Grané and H. Vega, Wavelet-based detecton of outlers n fnancal tme seres, Computatonal Statstcs & Data Analyss, vol.54,no.11,pp ,2010. [33] C. Blen and S. Huzurbazar, Wavelet-based detecton of outlers n tme seres, Computatonal and Graphcal Statstcs,vol.11,no.2,pp ,2002. [34] F. Chebana and T. B. M. J. Ouarda, Depth-based multvarate descrptve statstcs wth hydrologcal applcatons, Geophyscal Research: Atmospheres, vol. 116, no. D10, [35] R. H. McCuen, Modelng Hydrologc Change: Statstcal Methods, CRC Press, New York, NY, USA, [36] Interagency Advsory Commttee on Water Data, Gudelnes for Determnng Flood Flow Frequency: Bulletn 17B, U.S. Geologcal Survey, Offce of Water Data Coordnaton, Reston, Va, USA, [37] R. J. Hyndman and H. L. Shang, Ranbow plots, bagplots, and boxplots for functonal data, Computatonal and Graphcal Statstcs,vol.19,no.1,pp.29 45,2010. [38] F. Chebana, S. Dabo-Nang, and T. B. M. J. Ouarda, Exploratory functonal flood frequency analyss and outler detecton, Water Resources Research, vol.48,no.4,artcleidw04514, [39] W.W.Ng,U.S.Panu,andW.C.Lennox, ChaosbasedAnalytcal technques for daly extreme hydrologcal observatons, Hydrology,vol.342,no.1-2,pp.17 41,2007. [40] J.M.Chambers,W.S.Cleveland,B.Klener,andP.A.Tukey, Graphcal Methods for Data Analyss, Wadsworth,Belmont, Calf, USA, [41] J. Han and M. Kamber, Data Mnng: Concepts and Technques, Morgan Kaufmann, San Francsco, Calf, USA, [42] J. Ma and S. Perkns, Tme-seres novelty detecton usng oneclass support vector machnes, n Proceedngs of the Internatonal Jont Conference on Neural Networks,vol.3,pp , July [43] T. Fawcett, An ntroducton to ROC analyss, Pattern Recognton Letters,vol.27,no.8,pp ,2006.

15 Advances n Operatons Research Advances n Decson Scences Appled Mathematcs Algebra Probablty and Statstcs The Scentfc World Journal Internatonal Dfferental Equatons Submt your manuscrpts at Internatonal Advances n Combnatorcs Mathematcal Physcs Complex Analyss Internatonal Mathematcs and Mathematcal Scences Mathematcal Problems n Engneerng Mathematcs Dscrete Mathematcs Dscrete Dynamcs n Nature and Socety Functon Spaces Abstract and Appled Analyss Internatonal Stochastc Analyss Optmzaton

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

X- Chart Using ANOM Approach

X- Chart Using ANOM Approach ISSN 1684-8403 Journal of Statstcs Volume 17, 010, pp. 3-3 Abstract X- Chart Usng ANOM Approach Gullapall Chakravarth 1 and Chaluvad Venkateswara Rao Control lmts for ndvdual measurements (X) chart are

More information

TN348: Openlab Module - Colocalization

TN348: Openlab Module - Colocalization TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages

More information

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points; Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

Wishing you all a Total Quality New Year!

Wishing you all a Total Quality New Year! Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

Feature Reduction and Selection

Feature Reduction and Selection Feature Reducton and Selecton Dr. Shuang LIANG School of Software Engneerng TongJ Unversty Fall, 2012 Today s Topcs Introducton Problems of Dmensonalty Feature Reducton Statstc methods Prncpal Components

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,

More information

Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

Simulation: Solving Dynamic Models ABE 5646 Week 11 Chapter 2, Spring 2010

Simulation: Solving Dynamic Models ABE 5646 Week 11 Chapter 2, Spring 2010 Smulaton: Solvng Dynamc Models ABE 5646 Week Chapter 2, Sprng 200 Week Descrpton Readng Materal Mar 5- Mar 9 Evaluatng [Crop] Models Comparng a model wth data - Graphcal, errors - Measures of agreement

More information

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr) Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

Some Advanced SPC Tools 1. Cumulative Sum Control (Cusum) Chart For the data shown in Table 9-1, the x chart can be generated.

Some Advanced SPC Tools 1. Cumulative Sum Control (Cusum) Chart For the data shown in Table 9-1, the x chart can be generated. Some Advanced SP Tools 1. umulatve Sum ontrol (usum) hart For the data shown n Table 9-1, the x chart can be generated. However, the shft taken place at sample #21 s not apparent. 92 For ths set samples,

More information

CS 534: Computer Vision Model Fitting

CS 534: Computer Vision Model Fitting CS 534: Computer Vson Model Fttng Sprng 004 Ahmed Elgammal Dept of Computer Scence CS 534 Model Fttng - 1 Outlnes Model fttng s mportant Least-squares fttng Maxmum lkelhood estmaton MAP estmaton Robust

More information

Classifier Selection Based on Data Complexity Measures *

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster

More information

y and the total sum of

y and the total sum of Lnear regresson Testng for non-lnearty In analytcal chemstry, lnear regresson s commonly used n the constructon of calbraton functons requred for analytcal technques such as gas chromatography, atomc absorpton

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics Introducton G10 NAG Fortran Lbrary Chapter Introducton G10 Smoothng n Statstcs Contents 1 Scope of the Chapter... 2 2 Background to the Problems... 2 2.1 Smoothng Methods... 2 2.2 Smoothng Splnes and Regresson

More information

A New Approach For the Ranking of Fuzzy Sets With Different Heights

A New Approach For the Ranking of Fuzzy Sets With Different Heights New pproach For the ankng of Fuzzy Sets Wth Dfferent Heghts Pushpnder Sngh School of Mathematcs Computer pplcatons Thapar Unversty, Patala-7 00 Inda pushpndersnl@gmalcom STCT ankng of fuzzy sets plays

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Mathematics 256 a course in differential equations for engineering students

Mathematics 256 a course in differential equations for engineering students Mathematcs 56 a course n dfferental equatons for engneerng students Chapter 5. More effcent methods of numercal soluton Euler s method s qute neffcent. Because the error s essentally proportonal to the

More information

The Research of Support Vector Machine in Agricultural Data Classification

The Research of Support Vector Machine in Agricultural Data Classification The Research of Support Vector Machne n Agrcultural Data Classfcaton Le Sh, Qguo Duan, Xnmng Ma, Me Weng College of Informaton and Management Scence, HeNan Agrcultural Unversty, Zhengzhou 45000 Chna Zhengzhou

More information

Machine Learning: Algorithms and Applications

Machine Learning: Algorithms and Applications 14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of

More information

BioTechnology. An Indian Journal FULL PAPER. Trade Science Inc.

BioTechnology. An Indian Journal FULL PAPER. Trade Science Inc. [Type text] [Type text] [Type text] ISSN : 0974-74 Volume 0 Issue BoTechnology 04 An Indan Journal FULL PAPER BTAIJ 0() 04 [684-689] Revew on Chna s sports ndustry fnancng market based on market -orented

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Problem Set 3 Solutions

Problem Set 3 Solutions Introducton to Algorthms October 4, 2002 Massachusetts Insttute of Technology 6046J/18410J Professors Erk Demane and Shaf Goldwasser Handout 14 Problem Set 3 Solutons (Exercses were not to be turned n,

More information

Empirical Distributions of Parameter Estimates. in Binary Logistic Regression Using Bootstrap

Empirical Distributions of Parameter Estimates. in Binary Logistic Regression Using Bootstrap Int. Journal of Math. Analyss, Vol. 8, 4, no. 5, 7-7 HIKARI Ltd, www.m-hkar.com http://dx.do.org/.988/jma.4.494 Emprcal Dstrbutons of Parameter Estmates n Bnary Logstc Regresson Usng Bootstrap Anwar Ftranto*

More information

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation Intellgent Informaton Management, 013, 5, 191-195 Publshed Onlne November 013 (http://www.scrp.org/journal/m) http://dx.do.org/10.36/m.013.5601 Qualty Improvement Algorthm for Tetrahedral Mesh Based on

More information

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

Determining the Optimal Bandwidth Based on Multi-criterion Fusion

Determining the Optimal Bandwidth Based on Multi-criterion Fusion Proceedngs of 01 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 5 (01) (01) IACSIT Press, Sngapore Determnng the Optmal Bandwdth Based on Mult-crteron Fuson Ha-L Lang 1+, Xan-Mn

More information

3D vector computer graphics

3D vector computer graphics 3D vector computer graphcs Paolo Varagnolo: freelance engneer Padova Aprl 2016 Prvate Practce ----------------------------------- 1. Introducton Vector 3D model representaton n computer graphcs requres

More information

A Gradient Difference based Technique for Video Text Detection

A Gradient Difference based Technique for Video Text Detection A Gradent Dfference based Technque for Vdeo Text Detecton Palaahnakote Shvakumara, Trung Quy Phan and Chew Lm Tan School of Computng, Natonal Unversty of Sngapore {shva, phanquyt, tancl }@comp.nus.edu.sg

More information

Support Vector Machines

Support Vector Machines Support Vector Machnes Decson surface s a hyperplane (lne n 2D) n feature space (smlar to the Perceptron) Arguably, the most mportant recent dscovery n machne learnng In a nutshell: map the data to a predetermned

More information

SVM-based Learning for Multiple Model Estimation

SVM-based Learning for Multiple Model Estimation SVM-based Learnng for Multple Model Estmaton Vladmr Cherkassky and Yunqan Ma Department of Electrcal and Computer Engneerng Unversty of Mnnesota Mnneapols, MN 55455 {cherkass,myq}@ece.umn.edu Abstract:

More information

A Gradient Difference based Technique for Video Text Detection

A Gradient Difference based Technique for Video Text Detection 2009 10th Internatonal Conference on Document Analyss and Recognton A Gradent Dfference based Technque for Vdeo Text Detecton Palaahnakote Shvakumara, Trung Quy Phan and Chew Lm Tan School of Computng,

More information

Lecture 5: Multilayer Perceptrons

Lecture 5: Multilayer Perceptrons Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Proceedngs of the Wnter Smulaton Conference M E Kuhl, N M Steger, F B Armstrong, and J A Jones, eds A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Mark W Brantley Chun-Hung

More information

A Semi-parametric Regression Model to Estimate Variability of NO 2

A Semi-parametric Regression Model to Estimate Variability of NO 2 Envronment and Polluton; Vol. 2, No. 1; 2013 ISSN 1927-0909 E-ISSN 1927-0917 Publshed by Canadan Center of Scence and Educaton A Sem-parametrc Regresson Model to Estmate Varablty of NO 2 Meczysław Szyszkowcz

More information

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z.

TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS. Muradaliyev A.Z. TECHNIQUE OF FORMATION HOMOGENEOUS SAMPLE SAME OBJECTS Muradalyev AZ Azerbajan Scentfc-Research and Desgn-Prospectng Insttute of Energetc AZ1012, Ave HZardab-94 E-mal:aydn_murad@yahoocom Importance of

More information

An Entropy-Based Approach to Integrated Information Needs Assessment

An Entropy-Based Approach to Integrated Information Needs Assessment Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology

More information

Fast Computation of Shortest Path for Visiting Segments in the Plane

Fast Computation of Shortest Path for Visiting Segments in the Plane Send Orders for Reprnts to reprnts@benthamscence.ae 4 The Open Cybernetcs & Systemcs Journal, 04, 8, 4-9 Open Access Fast Computaton of Shortest Path for Vstng Segments n the Plane Ljuan Wang,, Bo Jang

More information

APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT

APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT 3. - 5. 5., Brno, Czech Republc, EU APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT Abstract Josef TOŠENOVSKÝ ) Lenka MONSPORTOVÁ ) Flp TOŠENOVSKÝ

More information

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng

More information

Virtual Machine Migration based on Trust Measurement of Computer Node

Virtual Machine Migration based on Trust Measurement of Computer Node Appled Mechancs and Materals Onlne: 2014-04-04 ISSN: 1662-7482, Vols. 536-537, pp 678-682 do:10.4028/www.scentfc.net/amm.536-537.678 2014 Trans Tech Publcatons, Swtzerland Vrtual Machne Mgraton based on

More information

Reducing Frame Rate for Object Tracking

Reducing Frame Rate for Object Tracking Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg

More information

A Simple and Efficient Goal Programming Model for Computing of Fuzzy Linear Regression Parameters with Considering Outliers

A Simple and Efficient Goal Programming Model for Computing of Fuzzy Linear Regression Parameters with Considering Outliers 62626262621 Journal of Uncertan Systems Vol.5, No.1, pp.62-71, 211 Onlne at: www.us.org.u A Smple and Effcent Goal Programmng Model for Computng of Fuzzy Lnear Regresson Parameters wth Consderng Outlers

More information

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters

Proper Choice of Data Used for the Estimation of Datum Transformation Parameters Proper Choce of Data Used for the Estmaton of Datum Transformaton Parameters Hakan S. KUTOGLU, Turkey Key words: Coordnate systems; transformaton; estmaton, relablty. SUMMARY Advances n technologes and

More information

SHAPE RECOGNITION METHOD BASED ON THE k-nearest NEIGHBOR RULE

SHAPE RECOGNITION METHOD BASED ON THE k-nearest NEIGHBOR RULE SHAPE RECOGNITION METHOD BASED ON THE k-nearest NEIGHBOR RULE Dorna Purcaru Faculty of Automaton, Computers and Electroncs Unersty of Craoa 13 Al. I. Cuza Street, Craoa RO-1100 ROMANIA E-mal: dpurcaru@electroncs.uc.ro

More information

Edge Detection in Noisy Images Using the Support Vector Machines

Edge Detection in Noisy Images Using the Support Vector Machines Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona

More information

Hermite Splines in Lie Groups as Products of Geodesics

Hermite Splines in Lie Groups as Products of Geodesics Hermte Splnes n Le Groups as Products of Geodescs Ethan Eade Updated May 28, 2017 1 Introducton 1.1 Goal Ths document defnes a curve n the Le group G parametrzed by tme and by structural parameters n the

More information

FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK

FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK L-qng Qu, Yong-quan Lang 2, Jng-Chen 3, 2 College of Informaton Scence and Technology, Shandong Unversty of Scence and Technology,

More information

Research and Application of Fingerprint Recognition Based on MATLAB

Research and Application of Fingerprint Recognition Based on MATLAB Send Orders for Reprnts to reprnts@benthamscence.ae The Open Automaton and Control Systems Journal, 205, 7, 07-07 Open Access Research and Applcaton of Fngerprnt Recognton Based on MATLAB Nng Lu* Department

More information

Review of approximation techniques

Review of approximation techniques CHAPTER 2 Revew of appromaton technques 2. Introducton Optmzaton problems n engneerng desgn are characterzed by the followng assocated features: the objectve functon and constrants are mplct functons evaluated

More information

The Shortest Path of Touring Lines given in the Plane

The Shortest Path of Touring Lines given in the Plane Send Orders for Reprnts to reprnts@benthamscence.ae 262 The Open Cybernetcs & Systemcs Journal, 2015, 9, 262-267 The Shortest Path of Tourng Lnes gven n the Plane Open Access Ljuan Wang 1,2, Dandan He

More information

Type-2 Fuzzy Non-uniform Rational B-spline Model with Type-2 Fuzzy Data

Type-2 Fuzzy Non-uniform Rational B-spline Model with Type-2 Fuzzy Data Malaysan Journal of Mathematcal Scences 11(S) Aprl : 35 46 (2017) Specal Issue: The 2nd Internatonal Conference and Workshop on Mathematcal Analyss (ICWOMA 2016) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES

More information

Classifying Acoustic Transient Signals Using Artificial Intelligence

Classifying Acoustic Transient Signals Using Artificial Intelligence Classfyng Acoustc Transent Sgnals Usng Artfcal Intellgence Steve Sutton, Unversty of North Carolna At Wlmngton (suttons@charter.net) Greg Huff, Unversty of North Carolna At Wlmngton (jgh7476@uncwl.edu)

More information

BIN XIA et al: AN IMPROVED K-MEANS ALGORITHM BASED ON CLOUD PLATFORM FOR DATA MINING

BIN XIA et al: AN IMPROVED K-MEANS ALGORITHM BASED ON CLOUD PLATFORM FOR DATA MINING An Improved K-means Algorthm based on Cloud Platform for Data Mnng Bn Xa *, Yan Lu 2. School of nformaton and management scence, Henan Agrcultural Unversty, Zhengzhou, Henan 450002, P.R. Chna 2. College

More information

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1)

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1) Secton 1.2 Subsets and the Boolean operatons on sets If every element of the set A s an element of the set B, we say that A s a subset of B, or that A s contaned n B, or that B contans A, and we wrte A

More information

Concurrent Apriori Data Mining Algorithms

Concurrent Apriori Data Mining Algorithms Concurrent Apror Data Mnng Algorthms Vassl Halatchev Department of Electrcal Engneerng and Computer Scence York Unversty, Toronto October 8, 2015 Outlne Why t s mportant Introducton to Assocaton Rule Mnng

More information

An Image Fusion Approach Based on Segmentation Region

An Image Fusion Approach Based on Segmentation Region Rong Wang, L-Qun Gao, Shu Yang, Yu-Hua Cha, and Yan-Chun Lu An Image Fuson Approach Based On Segmentaton Regon An Image Fuson Approach Based on Segmentaton Regon Rong Wang, L-Qun Gao, Shu Yang 3, Yu-Hua

More information

REFRACTIVE INDEX SELECTION FOR POWDER MIXTURES

REFRACTIVE INDEX SELECTION FOR POWDER MIXTURES REFRACTIVE INDEX SELECTION FOR POWDER MIXTURES Laser dffracton s one of the most wdely used methods for partcle sze analyss of mcron and submcron sze powders and dspersons. It s quck and easy and provdes

More information

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints Australan Journal of Basc and Appled Scences, 2(4): 1204-1208, 2008 ISSN 1991-8178 Sum of Lnear and Fractonal Multobjectve Programmng Problem under Fuzzy Rules Constrants 1 2 Sanjay Jan and Kalash Lachhwan

More information

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes SPH3UW Unt 7.3 Sphercal Concave Mrrors Page 1 of 1 Notes Physcs Tool box Concave Mrror If the reflectng surface takes place on the nner surface of the sphercal shape so that the centre of the mrror bulges

More information

Data Mining: Model Evaluation

Data Mining: Model Evaluation Data Mnng: Model Evaluaton Aprl 16, 2013 1 Issues: Evaluatng Classfcaton Methods Accurac classfer accurac: predctng class label predctor accurac: guessng value of predcted attrbutes Speed tme to construct

More information

Improved Methods for Lithography Model Calibration

Improved Methods for Lithography Model Calibration Improved Methods for Lthography Model Calbraton Chrs Mack www.lthoguru.com, Austn, Texas Abstract Lthography models, ncludng rgorous frst prncple models and fast approxmate models used for OPC, requre

More information

Performance Assessment and Fault Diagnosis for Hydraulic Pump Based on WPT and SOM

Performance Assessment and Fault Diagnosis for Hydraulic Pump Based on WPT and SOM Performance Assessment and Fault Dagnoss for Hydraulc Pump Based on WPT and SOM Be Jkun, Lu Chen and Wang Zl PERFORMANCE ASSESSMENT AND FAULT DIAGNOSIS FOR HYDRAULIC PUMP BASED ON WPT AND SOM. Be Jkun,

More information

A Statistical Model Selection Strategy Applied to Neural Networks

A Statistical Model Selection Strategy Applied to Neural Networks A Statstcal Model Selecton Strategy Appled to Neural Networks Joaquín Pzarro Elsa Guerrero Pedro L. Galndo joaqun.pzarro@uca.es elsa.guerrero@uca.es pedro.galndo@uca.es Dpto Lenguajes y Sstemas Informátcos

More information

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning Outlne Artfcal Intellgence and ts applcatons Lecture 8 Unsupervsed Learnng Professor Danel Yeung danyeung@eee.org Dr. Patrck Chan patrckchan@eee.org South Chna Unversty of Technology, Chna Introducton

More information

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach Angle Estmaton and Correcton of Hand Wrtten, Textual and Large areas of Non-Textual Document Images: A Novel Approach D.R.Ramesh Babu Pyush M Kumat Mahesh D Dhannawat PES Insttute of Technology Research

More information

Load Balancing for Hex-Cell Interconnection Network

Load Balancing for Hex-Cell Interconnection Network Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information

Optimal Workload-based Weighted Wavelet Synopses

Optimal Workload-based Weighted Wavelet Synopses Optmal Workload-based Weghted Wavelet Synopses Yoss Matas School of Computer Scence Tel Avv Unversty Tel Avv 69978, Israel matas@tau.ac.l Danel Urel School of Computer Scence Tel Avv Unversty Tel Avv 69978,

More information

Meta-heuristics for Multidimensional Knapsack Problems

Meta-heuristics for Multidimensional Knapsack Problems 2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,

More information

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique //00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy

More information

Parameter estimation for incomplete bivariate longitudinal data in clinical trials

Parameter estimation for incomplete bivariate longitudinal data in clinical trials Parameter estmaton for ncomplete bvarate longtudnal data n clncal trals Naum M. Khutoryansky Novo Nordsk Pharmaceutcals, Inc., Prnceton, NJ ABSTRACT Bvarate models are useful when analyzng longtudnal data

More information

Synthesizer 1.0. User s Guide. A Varying Coefficient Meta. nalytic Tool. Z. Krizan Employing Microsoft Excel 2007

Synthesizer 1.0. User s Guide. A Varying Coefficient Meta. nalytic Tool. Z. Krizan Employing Microsoft Excel 2007 Syntheszer 1.0 A Varyng Coeffcent Meta Meta-Analytc nalytc Tool Employng Mcrosoft Excel 007.38.17.5 User s Gude Z. Krzan 009 Table of Contents 1. Introducton and Acknowledgments 3. Operatonal Functons

More information

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation

An Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed

More information

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS ARPN Journal of Engneerng and Appled Scences 006-017 Asan Research Publshng Network (ARPN). All rghts reserved. NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS Igor Grgoryev, Svetlana

More information

Modular PCA Face Recognition Based on Weighted Average

Modular PCA Face Recognition Based on Weighted Average odern Appled Scence odular PCA Face Recognton Based on Weghted Average Chengmao Han (Correspondng author) Department of athematcs, Lny Normal Unversty Lny 76005, Chna E-mal: hanchengmao@163.com Abstract

More information

Querying by sketch geographical databases. Yu Han 1, a *

Querying by sketch geographical databases. Yu Han 1, a * 4th Internatonal Conference on Sensors, Measurement and Intellgent Materals (ICSMIM 2015) Queryng by sketch geographcal databases Yu Han 1, a * 1 Department of Basc Courses, Shenyang Insttute of Artllery,

More information

Data Mining For Multi-Criteria Energy Predictions

Data Mining For Multi-Criteria Energy Predictions Data Mnng For Mult-Crtera Energy Predctons Kashf Gll and Denns Moon Abstract We present a data mnng technque for mult-crtera predctons of wnd energy. A mult-crtera (MC) evolutonary computng method has

More information

A classification scheme for applications with ambiguous data

A classification scheme for applications with ambiguous data A classfcaton scheme for applcatons wth ambguous data Thomas P. Trappenberg Centre for Cogntve Neuroscence Department of Psychology Unversty of Oxford Oxford OX1 3UD, England Thomas.Trappenberg@psy.ox.ac.uk

More information

CMPS 10 Introduction to Computer Science Lecture Notes

CMPS 10 Introduction to Computer Science Lecture Notes CPS 0 Introducton to Computer Scence Lecture Notes Chapter : Algorthm Desgn How should we present algorthms? Natural languages lke Englsh, Spansh, or French whch are rch n nterpretaton and meanng are not

More information

A Deflected Grid-based Algorithm for Clustering Analysis

A Deflected Grid-based Algorithm for Clustering Analysis A Deflected Grd-based Algorthm for Clusterng Analyss NANCY P. LIN, CHUNG-I CHANG, HAO-EN CHUEH, HUNG-JEN CHEN, WEI-HUA HAO Department of Computer Scence and Informaton Engneerng Tamkang Unversty 5 Yng-chuan

More information

A Background Subtraction for a Vision-based User Interface *

A Background Subtraction for a Vision-based User Interface * A Background Subtracton for a Vson-based User Interface * Dongpyo Hong and Woontack Woo KJIST U-VR Lab. {dhon wwoo}@kjst.ac.kr Abstract In ths paper, we propose a robust and effcent background subtracton

More information

The Man-hour Estimation Models & Its Comparison of Interim Products Assembly for Shipbuilding

The Man-hour Estimation Models & Its Comparison of Interim Products Assembly for Shipbuilding Internatonal Journal of Operatons Research Internatonal Journal of Operatons Research Vol., No., 9 4 (005) The Man-hour Estmaton Models & Its Comparson of Interm Products Assembly for Shpbuldng Bn Lu and

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

Available online at Available online at Advanced in Control Engineering and Information Science

Available online at   Available online at   Advanced in Control Engineering and Information Science Avalable onlne at wwwscencedrectcom Avalable onlne at wwwscencedrectcom Proceda Proceda Engneerng Engneerng 00 (2011) 15000 000 (2011) 1642 1646 Proceda Engneerng wwwelsevercom/locate/proceda Advanced

More information

Biostatistics 615/815

Biostatistics 615/815 The E-M Algorthm Bostatstcs 615/815 Lecture 17 Last Lecture: The Smplex Method General method for optmzaton Makes few assumptons about functon Crawls towards mnmum Some recommendatons Multple startng ponts

More information

Signature and Lexicon Pruning Techniques

Signature and Lexicon Pruning Techniques Sgnature and Lexcon Prunng Technques Srnvas Palla, Hansheng Le, Venu Govndaraju Centre for Unfed Bometrcs and Sensors Unversty at Buffalo {spalla2, hle, govnd}@cedar.buffalo.edu Abstract Handwrtten word

More information

Optimizing Document Scoring for Query Retrieval

Optimizing Document Scoring for Query Retrieval Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng

More information