A Generalized Methodology for Data Analysis


Plamen Angelov, Fellow, IEEE, Xiaowei Gu, and Jose Principe, Fellow, IEEE

Manuscript received July 2016. This work was partially supported by The Royal Society grant IE141329/2014 "Novel Machine Learning Paradigms to address Big Data Streams". Plamen P. Angelov and Xiaowei Gu are with School of Computing and Communications, Lancaster University, Lancaster, LA1 4WA. (e-mail: {p.angelov, x.gu3}@lancaster.ac.uk). José C. Príncipe is with Computational NeuroEngineering Laboratory, Department of Electrical and Computer Engineering, University of Florida, USA. (e-mail: principe@cnel.ufl.edu)

Abstract: Based on a critical analysis of data analytics and its foundations, we propose a functional approach to estimate data ensemble properties, which is based entirely on the empirical observations of discrete data samples and the relative proximity of these points in the data space and is hence named empirical data analysis (EDA). The ensemble functions include the non-parametric square centrality (a measure of closeness used in graph theory) and typicality (an empirically derived quantity which resembles probability). A distinctive feature of the proposed new functional approach to data analysis is that it does not assume randomness or determinism of the empirically observed data, nor independence. The typicality is derived from the discrete data directly, in contrast to the traditional approach where a continuous probability density function (pdf) is assumed a priori. The typicality is expressed in a closed analytical form that can be calculated recursively and, thus, is computationally very efficient. The proposed non-parametric estimators of the ensemble properties of the data can also be interpreted as a discrete form of the information potential (known from the information theoretic learning theory as well as the Parzen windows). Therefore, EDA is very suitable for the current move to a data-rich environment where the understanding of the underlying phenomena behind the available vast amounts of data is often not clear. We also present an extension of EDA for inference. The areas of applications of the new methodology of the EDA are wide because it concerns the very foundation of data analysis. Preliminary tests show its good performance in comparison to traditional techniques.

Index Terms: data mining and analysis, machine learning, pattern recognition, probability, statistics.

I. INTRODUCTION

CURRENTLY, there is a growing demand in Machine Learning, Pattern Recognition, Statistics, Data Mining and a number of related disciplines broadly called Data Science, for new concepts and methods that are centered on the actual data, the evidence collected from the real world, rather than on theoretical prior assumptions which need to be further confirmed with the experimental data (e.g. the Gaussian assumption). The core of the statistical approach is the definition of a random variable, i.e. a functional measure from the space of events to the real line, which defines the probability law [] [4]. The probability density function (pdf) is, by definition, the derivative of the cumulative distribution function (cdf). It is well known that differentiation can create numerical problems in both practical and theoretical aspects and is a challenge for functions which are not analytically defined or are complex. In reality, we usually do not have independent and identically distributed (iid) events, but we do have correlated, interdependent (albeit in a complex and often unknown manner) data from different experiments, which complicates the procedure. The appeal of the traditional statistical approach is its solid mathematical foundation and the ability to provide guarantees of performance when data is plenty ($N \to \infty$) and created from the same distribution that was hypothesized in the probability law. The actual data is usually discrete (or discretized), which in traditional probability theory and statistics is modeled as a realization of the random variable, but one does not know a priori its distribution.
If the prior data generation hypothesis is verified, good results can be expected; otherwise this opens the door for many failures. Even in the case that the hypothesized measure meets the realizations, one has to address the difference of working with realizations and random variables, which brings the issue of choosing estimators of the statistical quantities necessary for data analysis. This is not a trivial problem, and is seldom discussed in data analysis. The simple determination of the probability law (the measure of the random variable) that explains the collected data is a hard problem as studied in density estimation [] [3]. Moreover, if we are interested in statistical inference, for instance, similarity between two random variables using mutual information, the problem gets even harder because different estimators may provide different results [5]. The reason is that very likely the functional properties of the chosen estimator do not preserve all the properties embodied in the statistical quantity. Therefore, they behave differently in the finite (and even in the infinite) sample case.

An alternative approach is to proceed from the realizations to the random variables, which is the reverse direction of the statistical approach. The literature has several excellent examples of this approach in the area of measures of association. For instance, Pearson's correlation coefficient is perfectly well defined in realizations, as well as in random variables. Likewise, Spearman's $\rho$ [6] and Kendall's $\tau$ [7] are other examples of measures of association well defined in both the realizations and the random variables. However, the problem with this approach is that the statistical properties of the measures in the random variables are not directly known, and may not be easily obtained. A good example of the latter is the generalized measure of association, which is well defined in the realizations, but not all of its properties are known in the random variables [8].

Therefore, there are advantages and disadvantages in each approach, but from a practical point of view, the non-parametric approach is very appealing because we can go beyond the framework of statistical reasoning to define new operators and still cross-validate the solutions with the available data using non-parametric hypothesis tests. A good example is least squares versus regression. One can always apply least squares to any data type, deterministic or stochastic. If the data is stochastic the solution is called regression, but the result will be the same, because the autocorrelation function is a property of the data, independent of its type. The difference shows up only in the interpretation of the solution; most importantly, the statistical significance of the result can only be assessed using regression. A more recent alternative is to approximate the distributions using non-parametric, data-centered functions, such as particle filters [9], entropy-based information-theoretic learning [5], etc.

On the other hand, partially trying to address the same problems, in 1965 L. Zadeh introduced fuzzy sets theory [0], which completely departed from objective observations and moved (similarly to the belief-based theory [8] introduced a bit later) to the subjectivist definition of uncertainty. A later strand of fuzzy set theory (the data-driven approach developed mainly in the 1990s) attempted to define the membership functions based on experimental data. It stands in between probabilistic and fuzzy representations []; however, this approach requires an assumption on the type of membership function. An important challenge is the posterior distribution approximation. Approximate inference can be done employing maximum a posteriori criteria, which requires complex optimization schemes involving, for example, the expectation maximization algorithm [] [3].
In this paper, we present a systematic methodology of non-parametric estimators recently introduced in [] [5] for discrete sets using ensemble statistical properties of the data derived entirely from the experimental discrete observations and extend them to continuous spaces. These include the cumulative proximity ($q$), centrality ($C$), square centrality ($q^{-1}$), standardized eccentricity ($\varepsilon$), density ($D$) as well as typicality ($\tau$), which can be extended to continuous spaces, resembling the information potential obtained from Parzen windows [] [4] in Information Theoretic Learning (ITL) [5]. Its discrete version sums up to 1 while its continuous version integrates to 1 and is always positive; however, its values are always less than 1, unlike the pdf values that can be greater than 1. Additionally, the typicality is only defined for feasible values of the independent variable while the pdf can extend to infeasible values, e.g. negative height, distance, weight, absolute temperature, etc., unless specifically constrained [] [5]. We further consider discrete local ($\tau_N$) and global ($\tau_N^G$) versions. Then, we introduce an automatic procedure for identifying the local modes/maxima of $\tau_N^G$ as well as a procedure for reducing the amount of the local maxima/modes, and extend the non-parametric estimators to the continuous domain by introducing the continuous global density, $D_N^{CG}$, and typicality, $\tau_N^{CG}$, which further involves an integral for normalization. Furthermore, we demonstrate that the continuous global typicality does integrate to exactly 1 as the traditional pdf (while being free from the restrictions the latter has). This is a new and significant result which makes continuous global typicality an alternative to the pdf. This strengthens the ability of the empirical data analysis (EDA) framework for objectively investigating the unknown data pattern behind the data and opens up the framework for inference. The methodology is exemplified with a Naïve EDA classifier based on $\tau_N^{CG}$.

II. THEORETICAL BASIS - DISCRETE SETS

In this section, we start by presenting EDA foundations in discrete sets []-[5] for completeness and further clarity.
Firstly, let us consider a real metric space $\mathbf{R}^K$ and assume a particular data set or stream $\{x\}_N = \{x_1, x_2, \ldots, x_N\} \in \mathbf{R}^K$, with $x_i = [x_{i,1}, x_{i,2}, \ldots, x_{i,K}]^T$; $i = 1, 2, \ldots, N$, where subscripts denote data samples (for a set) or the time instances when they arrive (for a stream). Within the data set/stream, some data samples may repeat more than once, namely $\exists\, x_i = x_j$, $i \neq j$. The set of sorted unique data samples, denoted by $\{u\}_{L_N} = \{u_1, u_2, \ldots, u_{L_N}\}$ (where $\{u\}_{L_N} \subseteq \{x\}_N$, $L_N \le N$), and the numbers of occurrence, denoted by $\{f\}_{L_N} = \{f_1, f_2, \ldots, f_{L_N}\}$, can be determined automatically based on the data. With $\{u\}_{L_N}$ and $\{f\}_{L_N}$, the primary data set/stream can be reconstructed. In the remainder of this paper, all the derivations are conducted at the $N$th time instance except when specifically declared otherwise.

The most obvious choice of $\mathbf{R}^K$ is the Euclidean space with the Euclidean distance, but we can also extend EDA definitions to Hilbert spaces and Reproducing Kernel Hilbert spaces. We can, moreover, consider different types of distances within these spaces, motivated by the purposes of the analysis, that exploit information available from the source that generated the samples or definitions that are appropriate for data analysis.

Within EDA, we introduce:
a) cumulative proximity, $q$ [] [5];
b) square centrality, $q^{-1}$;
c) eccentricity, $\xi$ [] [5];
d) standardized eccentricity, $\varepsilon$ [] [5];
e) discrete local density, $D$ [] [5];
f) discrete local typicality, $\tau$ [4], [5];
g) discrete global typicality, $\tau^G$ [4], [5];
h) continuous local density, $D^{CL}$;
i) continuous global density, $D^{CG}$, and
j) continuous global typicality, $\tau^{CG}$.
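The extraction of the unique samples and their numbers of occurrence described above can be sketched in a few lines. This is only an illustration, not the authors' implementation: the function names `unique_with_frequencies` and `reconstruct` and the toy data are assumptions introduced here, and the reconstruction recovers the samples only up to their arrival order.

```python
from collections import Counter

def unique_with_frequencies(samples):
    """Return the sorted unique samples {u} and their occurrence counts {f}."""
    counts = Counter(tuple(s) for s in samples)
    unique = sorted(counts)
    freqs = [counts[u] for u in unique]
    return unique, freqs

def reconstruct(unique, freqs):
    """Rebuild the samples from {u} and {f} (arrival order is not recoverable)."""
    return [u for u, f in zip(unique, freqs) for _ in range(f)]

# Hypothetical two-dimensional samples, one of which repeats.
data = [(1.0, 2.0), (0.5, 1.0), (1.0, 2.0), (3.0, 0.0)]
u, f = unique_with_frequencies(data)
rebuilt = reconstruct(u, f)
```

Samples are stored as tuples so that they can be counted and sorted; any hashable representation would serve equally well.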

The discrete global typicality, $\tau^G$, addresses the global properties of the data and will be introduced in the next section. For inference, the continuous local density ($D^{CL}$), the continuous global density ($D^{CG}$) and the continuous global typicality ($\tau^{CG}$) will be described in detail in section IV.

A. Cumulative Proximity and Square Centrality

For every point $x_i \in \{x\}_N$; $i = 1, 2, \ldots, N$, one may want to quantify how close or similar this point is to all other data points from $\{x\}_N$. In graph theory, centrality is used to indicate the most important vertices within a graph. A measure of centrality [6], [7] is defined through the inverse of the sum of distances from a point to all other points:

$$c_N(x_i) = \frac{1}{\sum_{j=1}^{N} d(x_i, x_j)}; \quad x_i \in \{x\}_N \tag{1}$$

where $d(x_i, x_j)$ is the distance/similarity between $x_i$ and $x_j$, which can be, but is not limited to, Euclidean, Mahalanobis, cosine, etc. Its importance comes from the fact that it provides centrality information about each data sample in a scalar or vector form.

We previously defined [] [5] the cumulative proximity $q_N(x_i)$ as

$$q_N(x_i) = \sum_{j=1}^{N} d^2(x_i, x_j); \quad x_i \in \{x\}_N \tag{2}$$

which can be seen as inverse centrality with a square distance.

Cumulative proximity [] [5] is a very important association measure derived empirically from the observed data without making any prior assumptions about their generation model and plays a fundamental role in deriving other EDA quantities.

The complexity for computing the cumulative proximities of all samples in $\{x\}_N$ is $O(N^2)$. As a result, the computational complexity of other EDA quantities for $\{x\}_N$, which can be derived directly from cumulative proximity, is $O(N^2)$ as well. For many types of distance/similarity, e.g. Euclidean distance, Mahalanobis distance, cosine similarity, etc., with which the cumulative proximity can be calculated recursively [4], the complexity for calculating the cumulative proximities of all the samples in $\{x\}_N$ is reduced to $O(N)$.

In a very similar manner, we can consider square centrality as the inverse of the cumulative proximity, defined as follows:

$$q_N^{-1}(x_i) = \frac{1}{\sum_{j=1}^{N} d^2(x_i, x_j)}; \quad x_i \in \{x\}_N \tag{3}$$

B. Eccentricity

The eccentricity, $\xi$, defined as a normalized cumulative proximity, is another very important association measure derived empirically from the observed data without making any prior assumptions about their generation model [] [5]. It quantifies data samples away from the mode, and is useful to represent distribution tails and anomalies/outliers.
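As a rough illustration of the cumulative proximity and square centrality of Section II.A, the sketch below evaluates the cumulative proximity twice: pairwise in $O(N^2)$, and in $O(N)$ through the mean and the mean squared norm of the samples. The second route relies on the standard identity $\sum_j \|x_i - x_j\|^2 = N(\|x_i - \mu\|^2 + X - \|\mu\|^2)$, which holds for the Euclidean distance only. The function names and the toy points are assumptions made for this example.

```python
def sq_euclidean(a, b):
    """Squared Euclidean distance between two equally long sequences."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def cumulative_proximity_direct(samples):
    """q(x_i) = sum_j d^2(x_i, x_j), evaluated pairwise: O(N^2)."""
    return [sum(sq_euclidean(xi, xj) for xj in samples) for xi in samples]

def cumulative_proximity_euclidean(samples):
    """Same quantity via the mean mu and the mean squared norm X: O(N)."""
    n = len(samples)
    dim = len(samples[0])
    mu = [sum(x[k] for x in samples) / n for k in range(dim)]
    big_x = sum(sum(v * v for v in x) for x in samples) / n
    mu_sq = sum(m * m for m in mu)
    return [n * (sq_euclidean(x, mu) + big_x - mu_sq) for x in samples]

def square_centrality(q_values):
    """Inverse of the cumulative proximity (requires q > 0)."""
    return [1.0 / q for q in q_values]

# Hypothetical 2-D points.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 1.0)]
q_direct = cumulative_proximity_direct(points)
q_fast = cumulative_proximity_euclidean(points)
centrality = square_centrality(q_direct)
```

Because $\mu$ and $X$ can themselves be updated one sample at a time, the second function is the form that lends itself to streaming use.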
It is derived by normalizing $q$ and taking into account all possible data samples. It plays an important role in anomaly detection [4], [5] as well as in the estimation of the typicality, as will be detailed below. The eccentricity ($\xi$) of a particular data sample $x_i$ in the set $\{x\}_N$ ($N > 1$) is calculated as follows [] [5]:

$$\xi_N(x_i) = \frac{2 q_N(x_i)}{\sum_{j=1}^{N} q_N(x_j)} = \frac{2 \sum_{j=1}^{N} d^2(x_i, x_j)}{\sum_{h=1}^{N} \sum_{j=1}^{N} d^2(x_h, x_j)}; \quad x_i \in \{x\}_N \tag{4}$$

where the coefficient 2 is included to normalize eccentricity between 0 and 1, i.e.:

$$0 \le \xi_N(x_i) \le 1 \tag{5}$$

Here, we also introduce standardized eccentricity, $\varepsilon$, which does not decrease as fast as eccentricity with the increase of the amount of data, and is calculated as follows:

$$\varepsilon_N(x_i) = N \xi_N(x_i) = \frac{2 N q_N(x_i)}{\sum_{j=1}^{N} q_N(x_j)}; \quad x_i \in \{x\}_N \tag{6}$$

Based on the expression of the standardized eccentricity (namely, equation (6)) one can see that the data samples which are far away from the majority tend to have higher standardized eccentricity values compared with others. Thus, the standardized eccentricity can serve as an effective measure of the tail of the data distribution without the need of clustering the data in advance. Combining the standardized eccentricity with the well-known Chebyshev inequality [8], which describes the probability that a certain data sample is more than $n\sigma$ ($\sigma$ denotes the standard deviation) distance away from the mean, we get the EDA version of the Chebyshev inequality as follows [], [4]:

$$P\left(\varepsilon_N(x_i) \le n^2 + 1\right) \ge 1 - \frac{1}{n^2} \tag{7}$$

The Chebyshev inequality expressed by the standardized eccentricity provides a more elegant form for anomaly detection. For example, if $\varepsilon_N(x_i) > 10$, $x_i$ has exceeded the $3\sigma$ limit and can be categorized as an anomaly.

C. Discrete Local Density

Discrete local density is defined as the inverse of standardized eccentricity and plays an important role in data analysis using EDA ($i = 1, 2, \ldots, N$; $N > 1$):

$$D_N(x_i) = \varepsilon_N^{-1}(x_i) = \frac{\sum_{j=1}^{N} q_N(x_j)}{2 N q_N(x_i)} = \frac{\sum_{j=1}^{N} \sum_{l=1}^{N} d^2(x_j, x_l)}{2 N \sum_{l=1}^{N} d^2(x_i, x_l)} \tag{8}$$

For example, if the Euclidean distance is used, the density can be expressed as ($i = 1, 2, \ldots, N$; $N > 1$):

$$D_N(x_i) = \frac{1}{1 + \dfrac{\|x_i - \mu_N\|^2}{X_N - \|\mu_N\|^2}} \tag{9}$$

where $\mu_N$ is the mean of $\{x\}_N$; $X_N$ is the mean of $\{x^T x\}_N$; $\mu_N$ and $X_N$ can be updated recursively using [9]:

$$\mu_k = \frac{k-1}{k}\mu_{k-1} + \frac{1}{k}x_k; \quad \mu_1 = x_1; \quad k = 1, 2, \ldots, N$$

$$X_k = \frac{k-1}{k}X_{k-1} + \frac{1}{k}x_k^T x_k; \quad X_1 = x_1^T x_1; \quad k = 1, 2, \ldots, N$$

As we can see from equation (9), the discrete local density itself can be viewed as a univariate Cauchy function, while there is no assumption or any pre-defined parameter involved in the derivation besides the definition of the distance function (Euclidean distance used here).

D. Discrete Local Typicality

Discrete local typicality was firstly introduced in [3], and called unimodal typicality. In this paper, it is redefined as the normalized local density ($i = 1, 2, \ldots, N$; $N > 1$):

$$\tau_N(x_i) = \frac{D_N(x_i)}{\sum_{j=1}^{N} D_N(x_j)} = \frac{q_N^{-1}(x_i)}{\sum_{j=1}^{N} q_N^{-1}(x_j)} \tag{10}$$

The discrete local typicality resembles the traditional unimodal probability mass function (pmf), but it is automatically defined in the data support, unlike the pmf which may have non-zero values for infeasible values of the random variable unless specifically constrained. The discrete local density resembles membership functions of a fuzzy set, having the value of 1 for $x = \mu$, while the discrete local typicality resembles the pmf, with the sum of values being equal to 1 and values for both $D$ and $\tau$ being from the interval [0,1]. As an example, the square centrality, standardized eccentricity, discrete local density and typicality of a real climate dataset (wind chill and wind gust) measured in Manchester, UK for the period [0] are presented in Supplementary Fig. 1. In these examples, Euclidean distance is used.

III. THEORETICAL BASIS: DISCRETE GLOBAL TYPICALITY

In this section, we will consider the more realistic case when data distributions are multimodal. Traditionally, this requires identifying local peaks/modes by clustering, expectation maximization, optimization, etc. [] [3], [] [3]. Within EDA, the discrete global typicality ($\tau^G$) is derived automatically from the data with no user input and can quantify multimodality. It is based on the local cumulative proximity, square centrality, eccentricity and standardized eccentricity. The only requirements to define the discrete global typicality are the raw data and the type of distance metric (which can be any).

A. Discrete Global Typicality

Expressions (9)-(10) provide definitions of local operators that are very appropriate to quantify the peak point of unimodal discrete functions. Moreover, if the peak coincides with the global mean ($\mu_N$), then the value of the local density is equal to 1: $D_N(\mu_N) = 1$. A similar property (having a maximum, though its value is generally not equal to 1) is also valid for the traditional probability by definition and according to the central limit theorem [] [3]. In reality, data distributions are usually multimodal [] [4], therefore the local description should be improved. In order to address this issue, the traditional probability theory often involves a mixture of unimodal distributions, which requires estimation of the number of modes, and this is not easy [4]. Within the EDA framework, we provide the discrete global typicality, $\tau^G$, directly from the dataset, which provides multimodal distributions automatically without the need of user decisions and only requires a threshold for robustness against outliers.

The discrete global typicality of a unique data sample is expressed as a combination of the normalized discrete local density weighted by the corresponding frequency of occurrence of this unique data sample ($i = 1, 2, \ldots, L_N$; $L_N > 1$):

$$\tau_N^G(u_i) = \frac{f_i D_{L_N}(u_i)}{\sum_{j=1}^{L_N} f_j D_{L_N}(u_j)} = \frac{f_i\, q_{L_N}^{-1}(u_i)}{\sum_{j=1}^{L_N} f_j\, q_{L_N}^{-1}(u_j)} \tag{11}$$

where $q_{L_N}^{-1}(u_i)$ and $D_{L_N}(u_i)$ are the square centrality and the discrete local density of a particular data sample $u_i$, calculated from $\{u\}_{L_N}$ only.

This expression is very fundamental because, in fact, it combines information about repeated data values and the scattering across the data space, and resembles the well-known membership functions of fuzzy sets. We further explain this link in a publication that is currently under review [5].

Fig. 1. Histogram and discrete global typicality of the real climate data [0] using Euclidean distance: (a) histogram; (b) discrete global typicality.
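The chain from cumulative proximity to the discrete global typicality can be sketched compactly for the Euclidean distance. The code below is a minimal illustration under stated assumptions rather than the authors' implementation: all function names are hypothetical, the quantities follow the forms $\varepsilon = 2Nq/\sum q$, $D = 1/\varepsilon$, local typicality as normalized density, and global typicality as frequency-weighted density computed from the unique samples only, and the anomaly rule uses the threshold $n^2 + 1$ from the EDA form of the Chebyshev inequality.

```python
from collections import Counter

def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def cumulative_proximity(points):
    return [sum(sq_dist(p, r) for r in points) for p in points]

def standardized_eccentricity(points):
    """epsilon(x_i) = 2 N q(x_i) / sum_j q(x_j); needs at least two distinct points."""
    q = cumulative_proximity(points)
    n, total = len(points), sum(q)
    return [2.0 * n * qi / total for qi in q]

def local_density(points):
    """Inverse of the standardized eccentricity."""
    return [1.0 / e for e in standardized_eccentricity(points)]

def local_typicality(points):
    """Local density normalized so that the values sum to 1."""
    d = local_density(points)
    total = sum(d)
    return [di / total for di in d]

def global_typicality(samples):
    """Frequency-weighted density of each unique sample, normalized to sum to 1."""
    counts = Counter(samples)
    unique = sorted(counts)
    density = local_density(unique)  # computed from the unique samples only
    weighted = [counts[u] * d for u, d in zip(unique, density)]
    total = sum(weighted)
    return {u: w / total for u, w in zip(unique, weighted)}

def anomalies(points, n_sigma=3):
    """Samples whose standardized eccentricity exceeds n_sigma^2 + 1."""
    threshold = n_sigma ** 2 + 1
    eps = standardized_eccentricity(points)
    return [p for p, e in zip(points, eps) if e > threshold]

# Hypothetical 1-D data: two distinct readings observed 20 and 30 times.
readings = [(10.0,)] * 20 + [(20.0,)] * 30
tau_global = global_typicality(readings)

# Hypothetical 1-D data with one far-away sample.
series = [(float(i),) for i in range(20)] + [(1000.0,)]
flagged = anomalies(series)
```

With only two unique values the local densities are equal, so the global typicality reduces to the relative frequencies 0.4 and 0.6. Note that the standardized eccentricity cannot exceed $N$, so the $3\sigma$ rule can only fire when more than 10 samples are available.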

One can easily appreciate from Fig. 1 the differences between $\tau^G$ and the histogram with a quantization step equal to 5 for both dimensions. Note that the histogram requires the selection of one parameter (the quantization step) per dimension, while none is needed for the discrete global typicality. For a large number of dimensions ($K$), this can be a big problem. The size of the grid/axis is a user-specified parameter. The histogram takes only values from a finite set $\{0; 1/N; 2/N; \ldots; 1\}$, while $\tau^G$ can take any real value.

The discrete global typicality has the following properties:
i) sums up to 1;
ii) the value is within [0, 1];
iii) provides a closed analytic form, equation (11);
iv) there is no requirement for prior assumptions as well as any user- or problem-specific threshold and parameters;
v) is free from some peculiarities of traditional probability theory (its value never exceeds 1 and is never non-zero for infeasible values [4], [5]);
vi) can be recursively calculated for various types of metrics.

When all the data samples in the dataset have different values ($f_i = 1$; $\forall i$), and the histogram quantization step parameter is not properly set, the histogram is unable to show any useful information, while the discrete global typicality can still show the mutual distribution information of the dataset, see Fig. 2(a) and (b). This is a major advantage of discrete global typicality because it is parameter free. Here the figures are based on the unique data samples of the same climate dataset. As we can see, the data samples which are closer to the mean of the dataset will have a higher value of global typicality and vice versa.

Fig. 2. Histogram and discrete global typicality for the unique data samples: (a) histogram with very small quantization step; (b) discrete global typicality.

It is also interesting to notice that for equally distant data, the discrete global typicality is exactly the same as the frequentistic form of probability. Then equation (11) reduces to $\tau_N^G(u_i) = f_i / \sum_{j=1}^{L_N} f_j$. Supplementary Fig. 2 shows a simple example of the discrete global typicality and pmf of an artificial climate dataset $\{x\}_{50}$ with only data of wind chill, which has two unique data samples, $\{u\}_2 = \{u_1; u_2\}$ (°C), while $\{f\}_2 = \{20; 30\}$. Obviously, $q_2(u_1) = q_2(u_2) = d^2(u_1, u_2)$, and $\tau_{50}^G(u_1) = 0.4$; $\tau_{50}^G(u_2) = 0.6$.
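The reduction to relative frequencies can be verified in one step. As a sketch, assume all $L_N$ unique samples are mutually equidistant with a common distance $d_0$ (the symbol $d_0$ is introduced here for illustration only). Every unique sample then has the same cumulative proximity, which cancels in the ratio that defines the discrete global typicality:

```latex
q_{L_N}(u_i) = \sum_{j=1}^{L_N} d^2(u_i, u_j) = (L_N - 1)\, d_0^2
\quad \text{for every } i
\qquad\Longrightarrow\qquad
\tau_N^G(u_i)
  = \frac{f_i\, q_{L_N}^{-1}(u_i)}{\sum_{j=1}^{L_N} f_j\, q_{L_N}^{-1}(u_j)}
  = \frac{f_i}{\sum_{j=1}^{L_N} f_j}.
```

For two unique samples observed 20 and 30 times this gives $20/50 = 0.4$ and $30/50 = 0.6$; two points are always equidistant, so the condition holds trivially in that case.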
Indeed, if the first wind chill value is observed 20 times and the second one 30 times, the likelihood of the first value will be 40% and that of the second value will be 60%, respectively.

The discrete global typicality of the outcome of repeatedly throwing a die is presented in Supplementary Fig. 3 as an additional illustrative example. In this experiment, for the outcome "1" we can use $u_1 = [1; 0; 0; 0; 0; 0]^T$, for the outcome "2" we can use $u_2 = [0; 1; 0; 0; 0; 0]^T$, etc. Let the outcome of throwing the die $N$ times be $\{f\}_6 = \{f_1; f_2; f_3; f_4; f_5; f_6\}$; the values of the discrete global typicality of the six outcomes are then equal to their corresponding relative frequencies, see Supplementary Fig. 3.

B. Identifying Local Modes of Discrete Global Typicality

In this sub-section, an automatic procedure for identifying all local maxima of the discrete global typicality defined in the previous sub-section will be described. It results in the formation of data clouds (samples associated with the local maxima) [9], [6]. Data clouds are free shape, while clusters are usually hyper-spherical or hyper-ellipsoidal. This data partitioning resembles Voronoi tessellation [7]. Data clouds are also used in the AnYa type neuro-fuzzy predictive models [9], [6], classifiers and controllers. The illustrative figures in this section are based on the same climate dataset [0] that was used earlier in Fig. 1, which has two features/attributes: wind chill (°C) and wind gust (mph). In all cases, the Euclidean distance is used, though the principle is valid for any metric. The proposed algorithm can be summarized as follows:

Step 1: Identifying the global maximum of the discrete global typicality

For every unique data sample of the dataset $\{u\}_{L_N}$, its discrete global typicality $\tau_N^G(u_i)$ ($i = 1, 2, \ldots, L_N$) can be calculated using equation (11). The data sample with the highest $\tau_N^G$ is selected as the reference data sample in the ranked collection $\{r\}$:

$$r_1 = \underset{u_i \in \{u\}_{L_N}}{\arg\max}\ \tau_N^G(u_i) \tag{12}$$

where $r_1$ is the data sample with the highest value of discrete global typicality (in fact, the global maximum), and we set $r_m \leftarrow r_1$. In case there is more than one maximum, we can start with any one of them.

Step 2: Ranking the discrete global typicality

Then, we find the unique data sample that is nearest to $r_m$, denoted by $r_2$, and put it into $\{r\}$; meanwhile, we remove it from $\{u\}_{L_N}$. $r_2$ is set to be the new reference ("global maximum") $r_m \leftarrow r_2$. The ranking operation continues by finding the next data sample which is closest to $r_m$, putting it into $\{r\}$, removing it from $\{u\}_{L_N}$ and setting it as the new reference $r_m$. By applying the ranking operation until $\{u\}_{L_N}$ becomes empty, we can finally get the ranked unique data samples, denoted as $\{r\} = \{r_1, r_2, \ldots, r_{L_N}\}$, and their corresponding ranked discrete global typicality collection: $\{\tau_N^G(r)\} = \{\tau_N^G(r_1), \tau_N^G(r_2), \ldots, \tau_N^G(r_{L_N})\}$.

Step 3: Identifying all local maxima

The ranked discrete global typicality is filtered using equation (13) to detect all local maxima of $\tau_N^G$:

IF $\left(\tau_N^G(r_j) > \tau_N^G(r_{j-1})\right)$ AND $\left(\tau_N^G(r_j) > \tau_N^G(r_{j+1})\right)$ THEN ($r_j$ is a local maximum of $\tau_N^G$) (13)

We denote the set of the local maxima of $\tau_N^G$ (which can be used as a basis for forming data clouds and, further, AnYa type fuzzy rule-based models [9], [6]) as the set $\{u^*\}_P = \{u^*_1, u^*_2, \ldots, u^*_P\}$; $P$ is the number of the identified local maxima and $P \le L_N$. The ranked discrete global typicality is depicted in Fig. 3(a); the corresponding local maxima are depicted in Fig. 3(b).

Fig. 3. Identifying local maxima of the discrete global typicality: (a) ranked discrete global typicality; (b) local maxima/peaks/modes.

Step 4: Forming data clouds

Each local maximum, $u^*_i$, is then set as a prototype of a data cloud. All other data points are assigned to the nearest prototype (local maximum), forming data clouds using equation (14):

$$\text{winning label} = \underset{j = 1, 2, \ldots, P}{\arg\min}\ d(x_i, u^*_j) \tag{14}$$

Data clouds can be used to form AnYa models [9], [6]. After all the data samples within $\{x\}_N$ are assigned to the data clouds, the center (mean) $\mu_i$, the standard deviation $\sigma_i$ and the support $S_i$ ($i = 1, 2, \ldots, P$) per cloud can be calculated.

Step 5: Selecting the main local maxima of the discrete global typicality

We then calculate $\tau_N^G$ at the data cloud centers, denoted by $\{\mu\}_P$, using equation (11) with the corresponding supports as their frequencies. Then, we use the following operation to take out the less prominent local maxima.
For each center, we check the condition ($i, j = 1, 2, \ldots, P$; $i \neq j$):

IF $\left(\exists j:\ \tau_N^G(\mu_j) > \tau_N^G(\mu_i)\right)$ AND $\left(d(\mu_i, \mu_j) \le 2\sigma_i\right)$ THEN $\left(\mu_i \in \{\mu\}_R\right)$ (15)

This condition means that if there is another center with higher $\tau_N^G$ located within the $2\sigma$ area of $\mu_i$, this new, more prominent center replaces the existing one. This condition guarantees that the influence areas of neighboring data clouds will not overlap significantly (it is well known that, according to the Chebyshev inequality for an arbitrary distribution, the majority of the data samples (>75%) lie within $2\sigma$ distance from the mean [] [3]). By finding all the centers satisfying the above condition and assigning them to $\{\mu\}_R$, we get the filtered data cloud centers, denoted by $\{\mu\}_{P_1}$, by excluding $\{\mu\}_R$ from $\{\mu\}_P$, where $P_1$ is the number of remaining centers. After that, we set $P \leftarrow P_1$, $\{\mu\}_P \leftarrow \{\mu\}_{P_1}$ and repeat Steps 4-5 until the data cloud centers do not change any more. Finally, we can get the composed result, re-named as $\{\mu^o\}$, and use $\{\mu^o\}$ as the prototypes to build data clouds using equation (14).

Fig. 4. Final filtering result (the black markers denote the centers of the data clouds; the data samples from different data clouds are plotted with different colors).

The final data cloud centers for each selection round are presented in the Supplementary Video, which can also be downloaded from: _Vdeo.ppt?dl=0. The final result is presented in Fig. 4. Compared with Fig. 3(b), in the final round there are only two main modes left, broadly corresponding to the two main seasons in Northern England, and all the details are filtered out.

Even if $f_i = 1$, $\forall i$, the discrete global typicality can still be extracted successfully from the data samples, despite the fact that the result may not be exactly the same because of the changing data structure; see Supplementary Fig. 4, which uses the same real climate dataset as Fig. 4.

The summary of the automatic mode identification algorithm is as follows.

Automatic mode identification algorithm:
i. Calculate $\tau_N^G(u_i)$, $i = 1, 2, \ldots, L_N$ using equation (11);
ii. Find the unique data sample $r_1$ with the global maximum of $\tau_N^G$ using equation (12);
iii. Send $r_1$ and $\tau_N^G(r_1)$ into the ranked collections $\{r\}$ and $\{\tau_N^G(r)\}$ and delete $r_1$ from $\{u\}_{L_N}$;
iv. $r_m \leftarrow r_1$;
v. While $\{u\}_{L_N} \neq \emptyset$
    Find the unique data sample(s) which is/are nearest to $r_m$;
    Send the data sample(s) and the corresponding $\tau_N^G$ into $\{r\}$ and $\{\tau_N^G(r)\}$;
    Delete the data sample(s) from $\{u\}_{L_N}$;
    Set the latest element in $\{r\}$ as $r_m$;
vi. End While
vii. Filter $\{\tau_N^G(r)\}$ using equation (13) and obtain $\{u^*\}_P$ as the centers $\{\mu\}_P$ of data clouds;
viii. While $\{\mu\}_P$ are not fixed
    Use $\{\mu\}_P$ and form the data clouds from $\{x\}_N$ using equation (14);
    Obtain the new centers $\{\mu\}_P$, supports $\{S\}_P$ and standard deviations $\{\sigma\}_P$ of the data clouds;
    Calculate $\tau_N^G(\mu_i)$, $i = 1, 2, \ldots, P$ using equation (11);
    Find $\{\mu\}_R$ satisfying equation (15);
    Exclude $\{\mu\}_R$ from $\{\mu\}_P$ and obtain $\{\mu\}_{P_1}$;
    $P \leftarrow P_1$; $\{\mu\}_P \leftarrow \{\mu\}_{P_1}$;
ix. End While
x. $\{\mu^o\} \leftarrow \{\mu\}_P$;
xi. Build the data clouds with $\{\mu^o\}$ using equation (14);

C. Properties of EDA Operators

Having introduced the basic EDA operators, we will now outline their properties.
They are entirely based on the empirically observed experimental data and their mutual distribution in the data space;
They do not require any user- or problem-specific thresholds and parameters to be pre-specified;
They do not require any model of data generation (random or deterministic), only the type of distance metric used (however, it can be any);
The individual data samples (observations) do not need to be independent or identically distributed (iid); on the contrary, their mutual dependence is taken into account directly through the mutual distance between the data samples;
The method does not require an infinite number of observations and can work with just a few exemplars;
Within EDA, we still can consider cross validation and non-parametric statistical tests based on the realizations of experimentally observed data, similarly to the significance tests utilized on the random variable assumed in the traditional probability theory and statistics.

As a conclusion, EDA can be seen as an advanced data analysis framework which can work efficiently with any feasible data and any type of distance or similarity metric.

IV. THEORETICAL BASIS - CONTINUOUS DENSITY AND TYPICALITY

Up to this point, all EDA definitions are useful to describe data sets or data streams made up of a discrete number of observations. However, they cannot be used for inference because they are only defined on points where samples occur (discrete spaces). In this section, we define the continuous local and global density and global typicality which can be used for inference on the continuous domain of the variable $x$. At this stage, we depart from the entirely data-based and assumption-free approach we used so far; however, this is done after we identified the local modes, formed data clouds around these focal points and obtained the support of these data clouds. Therefore, the extension to the continuous domain is inherently local (per data cloud). We assume that the local mode considered as the mean and the support considered as frequency, plus the deviation of the empirical data, do provide the triplet of parameters ($\mu$, $X$, $S$). We do recognize that these triplets are conditional on the specific data samples observed and associated with the particular data cloud, but this will be updated when new data is available.

Fig. 5. The process of extracting distribution from data in EDA.

Now, having this triplet of parameters, we firstly define the continuous local density, $D^{CL}$, of the $i$th data cloud as:

$$D_{N,i}^{CL}(x) = \frac{\sum_{j=1}^{S_{N,i}} q_{N,i}(x_{i,j})}{2 S_{N,i}\, q_{N,i}(x)}; \quad x \in \mathbf{R}^K \tag{15}$$

where $x_{i,j}$ denotes the $j$th member of the $i$th data cloud and $q_{N,i}$ is the cumulative proximity calculated within that cloud.

Like equation (9), for the case of Euclidean distance, the continuous local density is simplified to a continuous Cauchy type function over any feasible value of the variable $x$, with the parameters $\mu$ and $X$ extracted from the available data samples as described earlier:

$$D_{N,i}^{CL}(x) = \frac{1}{1 + \dfrac{\|x - \mu_{N,i}\|^2}{X_{N,i} - \|\mu_{N,i}\|^2}}; \quad i = 1, 2, \ldots, C \tag{16}$$

where $\mu_{N,i}$ and $X_{N,i}$ are the mean and the average value of scalar products of the data samples within the $i$th data cloud; $C$ is the number of data clouds; the superscript $CL$ means that the local densities are defined in the continuous space for each local maximum per data cloud. It is obvious that with more data samples observed, the parameters will change and have to be updated regularly. Note that equation (16) is defined based on Euclidean distance. The expression of the continuous local density varies with the type of distance used. Nonetheless, in general, the continuous local density of the data can be expressed in the same form as the discrete local density but in the continuous space. The continuous local density is defined on the modes of the data ($\mu$) and near the peaks it is a very good approximation of the discrete density, but it will deviate progressively from it in trough regions.

Furthermore, we introduce the continuous global density, $D^{CG}$, as a weighted sum of the local density of each data cloud with weights being the support (number of data samples) of the respective data cloud. Finally, we introduce the continuous global typicality, $\tau^{CG}$, based on $D^{CG}$. The continuous global density and typicality play a similar role to the mixture of pdfs. However, the questions of how many distributions are in the mixture, what their parameters are and what type of distributions they are (see Fig. 5) are all answered from the data directly, free from any user- or problem-specific pre-defined parameters, prior assumptions, knowledge or pre-processing techniques like the cases of clustering, EM, etc.

A. Continuous Global Density

Continuous global density is a mixture that arises simply from the metric of the space used to measure sample distance and the density of samples that exist in the space. However, it works for all types of distance/similarity metric. As we can see from equation (16), the local density is Cauchy type when the Euclidean distance is employed; therefore, the simplest of the procedures is to define the continuous global density as a mixture of Cauchy distributions. The continuous global density enables inference of new samples anywhere in the space. For any $x$ and any type of distance used, we define the continuous global density in a general form very much like the mixture distributions, as a weighted combination of continuous local densities:

$$D_N^{CG}(x) = \frac{1}{N} \sum_{i=1}^{C} S_{N,i}\, D_{N,i}^{CL}(x); \quad x \in \mathbf{R}^K \tag{17}$$

where $D_{N,i}^{CL}(x)$ is the local density of $x$ in the $i$th data cloud; $C$ is the number of data clouds at the $N$th time instance; $S_{N,i}$ is the support (number of members) of the $i$th data cloud based on the available experimental/actual data. For normalization, we impose the condition $\sum_{i=1}^{C} S_{N,i} = N$. The continuous global density is defined non-parametrically from each of the observed data samples. As an example, the global density for the same climate dataset used before [0] is presented in Fig. 6(a).

Fig. 6. Continuous global density and global typicality of the real climate dataset [0] using Euclidean type distance: (a) continuous global density; (b) continuous global typicality.
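A support-weighted mixture of Cauchy-type local densities of this kind can be sketched as follows for the Euclidean distance. This is a hedged illustration, not the authors' code: the function names and the two toy clouds are assumptions, each cloud is summarized by its mean, its variance $X - \|\mu\|^2$ and its support, and the division by the total support $N$ is assumed from the normalization condition that the supports sum to $N$.

```python
def cloud_parameters(cloud):
    """Mean, variance (X - ||mu||^2) and support of one data cloud."""
    support = len(cloud)
    dim = len(cloud[0])
    mu = [sum(p[k] for p in cloud) / support for k in range(dim)]
    big_x = sum(sum(v * v for v in p) for p in cloud) / support
    variance = big_x - sum(m * m for m in mu)
    return mu, variance, support

def cauchy_local_density(x, mu, variance):
    """Cauchy-type local density: 1 / (1 + ||x - mu||^2 / variance)."""
    dist2 = sum((xi - mi) ** 2 for xi, mi in zip(x, mu))
    return 1.0 / (1.0 + dist2 / variance)

def continuous_global_density(x, clouds):
    """Support-weighted average of the local densities over all clouds."""
    params = [cloud_parameters(c) for c in clouds]
    n = sum(support for _, _, support in params)
    weighted = sum(s * cauchy_local_density(x, mu, var) for mu, var, s in params)
    return weighted / n

# Two hypothetical 1-D data clouds (each needs at least two distinct members).
clouds = [[(0.0,), (2.0,)], [(9.0,), (10.0,), (11.0,)]]
at_small_mode = continuous_global_density((1.0,), clouds)
at_large_mode = continuous_global_density((10.0,), clouds)
in_trough = continuous_global_density((5.5,), clouds)
```

The value at the center of the larger cloud is the highest and the value in the gap between the clouds is the lowest, which is the behavior the weighted mixture is meant to capture.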

Compared with the discrete local density introduced in section II, which is discrete and unimodal by definition, $D^{CG}$ is more effective in detecting the natural multimodal data structure, such as abnormal data samples, because only the data samples that are close to the larger data clouds, which can be viewed as the main modes of the data patterns, can have higher values of continuous global density. This feature is clearly depicted by the value of $D^{CG}$ of those data samples located in the space between the two main modes in the figures below, while for the local density, see Supplementary Fig. 1(c), it is exactly the opposite case.

B. Continuous Global Typicality

Having introduced the continuous global density, we can also define the continuous global typicality, $\tau^{CG}$. It is also defined as a normalized form of the density (similarly to the weighted typicality, $\tau^G$, equation (11)) but with the use of an integral instead of the sum. As stated in section II, the weighted typicality, $\tau^G$, is discrete and sums to 1. The global typicality is expressed as follows:

$$\tau_N^{CG}(x) = \frac{D_N^{CG}(x)}{\int_x D_N^{CG}(x)\,dx} = \frac{\sum_{i=1}^{C} S_{N,i}\, D_{N,i}^{CL}(x)}{\int_x \sum_{i=1}^{C} S_{N,i}\, D_{N,i}^{CL}(x)\,dx} \tag{18}$$

It is important to notice that equation (18) is general and valid for any type of distance/similarity metric. For a general multivariate case, it is important to normalize the mixture of continuous local densities to make $\tau_N^{CG}$ integrate to 1. By finding the integral of the continuous global density within the metric space and dividing $D_N^{CG}$ by its integral, one can always guarantee unit integral, regardless of the type of distance/similarity metric used.

As we said before, we consider the well-known expression of the multivariate Cauchy distribution [] [3] to transform the continuous local density without loss of generality:

$$f(x) = \frac{\Gamma\left(\frac{K+1}{2}\right)}{\pi^{\frac{K+1}{2}}\, \sigma_0^{K} \left(1 + \dfrac{(x - \mu)^T (x - \mu)}{\sigma_0^2}\right)^{\frac{K+1}{2}}} \tag{19}$$

where $x = [x_1, x_2, \ldots, x_K]^T$; $\mu = E[x]$; $\pi$ is the well-known mathematical constant and $\Gamma(\cdot)$ is the gamma function; $\sigma_0$ is the scalar parameter. This guarantees that:

$$\int_{x_1} \cdots \int_{x_K} f(x_1, \ldots, x_K)\, dx_1 \cdots dx_K = 1 \tag{20}$$

Based on (17)-(19), we introduce the normalized continuous local density as follows:

$$\bar{D}_{N,i}^{CL}(x) = \frac{\Gamma\left(\frac{K+1}{2}\right)}{\pi^{\frac{K+1}{2}}\, \sigma_{N,i}^{K}} \left(D_{N,i}^{CL}(x)\right)^{\frac{K+1}{2}} \tag{21}$$

Here, $\sigma_{N,i}^2 = X_{N,i} - \mu_{N,i}^T \mu_{N,i}$ for the Euclidean distance.

We can, finally, get the expression of the continuous global typicality, $\tau_N^{CG}$, in terms of the normalized continuous local density as:

$$\tau_N^{CG}(x) = \frac{\sum_{i=1}^{C} S_{N,i}\, \bar{D}_{N,i}^{CL}(x)}{\int_x \sum_{i=1}^{C} S_{N,i}\, \bar{D}_{N,i}^{CL}(x)\,dx} = \frac{1}{N} \sum_{i=1}^{C} S_{N,i}\, \bar{D}_{N,i}^{CL}(x) \tag{22}$$

For the Euclidean distance, equation (22) becomes

$$\tau_N^{CG}(x) = \frac{\Gamma\left(\frac{K+1}{2}\right)}{N\, \pi^{\frac{K+1}{2}}} \sum_{i=1}^{C} \frac{S_{N,i}}{\sigma_{N,i}^{K} \left(1 + \dfrac{\|x - \mu_{N,i}\|^2}{\sigma_{N,i}^2}\right)^{\frac{K+1}{2}}} \tag{23}$$

The continuous global typicality of the real climate dataset with Euclidean distance is presented in Fig. 6(b). The comparisons between the continuous global typicality (the modes are extracted by the approach introduced in section III), discrete global typicality, histogram and traditional pdf are presented in 1D form for visual clarity in Fig. 7 using the same real climate dataset [0].

Fig. 7. Comparison between the continuous global typicality, discrete global typicality, histogram and traditional pdf: (a) wind chill (°C); (b) wind gust (mph).

As shown in Fig. 7, compared with the traditional pdf using a Gaussian model, the global typicality derived directly from the dataset without any prior assumption about the number of local modes or type of distribution represents very well the two modes in the data pattern and gives results very close to what a histogram would give and significantly better than what a single unimodal distribution would provide.

In summary, the proposed continuous global typicality has the following properties, many of which it shares with the discrete global typicality introduced in section III:

i) integrates to 1;
ii) provides a closed analytic form;
iii) no requirement for prior assumptions or for any user- or problem-specific thresholds and parameters; these are derived from the data entirely;
iv) can be recursively calculated for various types of metrics.

V. APPLICATIONS

A. Examples

In this subsection, we give several examples of the continuous global typicality of different datasets, with the modes extracted by the proposed automatic mode identification algorithm. The continuous global typicality of the Seeds dataset [28], the Combined Cycle Power Plant dataset [29] and the Wine Quality dataset [30] with Euclidean distance is presented in Fig. 8. As the dimensionality of the original datasets is greater than 2, for a better visualization we use the principal component analysis (PCA) method [31] to reduce the dimensionality and use the first 2 principal components in the figures as the x-axis and y-axis. Supplementary Fig. 5(a) and (b) present the continuous global typicality derived from the first 1/3 and the first 2/3 of the Wine Quality dataset. Supplementary Fig. 5(c) depicts the continuous global typicality derived after scrambling the order of the data samples. The continuous global typicality of the two-dimensional A- and S-series benchmark datasets [32] is also presented in Supplementary Fig. 6.

If we want more details from the continuous global typicality, we can also stop the automatic mode identification algorithm described in section III early, i.e. before the final iteration, and build the continuous global typicality based on more detailed data partitioning results. The Supplementary Video referred to in section III.B also depicts the evolution of the continuous global typicality based on the results of different iterations of the proposed mode identification algorithm.

B. Inference Primer

Assume there are 3 arbitrary non-integer values of wind chill (°C), x = {7.5; .5; 4.7}, which do not exist in the dataset. We can quickly obtain the corresponding continuous global typicality using equation (18), and the inferences made are presented in Fig. 9. Here we only consider the two main modes. That means that a wind chill of -7.5 °C is less likely, while a wind chill of .5 °C is more likely.
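The inference step above can be sketched in a few lines for one-dimensional data. This is a minimal illustration, assuming the Euclidean case in which each data cloud contributes a unit-integral Cauchy kernel weighted by its support; the function name `continuous_global_typicality` and the cloud parameters (means, spreads, supports) are hypothetical and are not taken from the climate dataset.

```python
import math


def continuous_global_typicality(x, means, sigmas, supports):
    """Evaluate a 1-D continuous global typicality at x.

    Assumption: Euclidean distance, so each data cloud contributes a
    Cauchy-type local density 1 / (1 + ((x - mu) / sigma)^2), which is
    divided by its integral (pi * sigma) and weighted by the cloud support.
    """
    total_support = sum(supports)  # normalisation: supports sum to K
    value = 0.0
    for mu, sigma, support in zip(means, sigmas, supports):
        local_density = 1.0 / (1.0 + ((x - mu) / sigma) ** 2)
        value += support * local_density / (math.pi * sigma)
    return value / total_support


# Hypothetical two-mode summary (illustrative numbers only).
means = [-5.0, 10.0]    # cloud means
sigmas = [3.0, 4.0]     # sqrt(average scalar product - mean^2) per cloud
supports = [400, 600]   # number of members per cloud

for query in (-7.5, 2.5, 12.5):
    print(query, continuous_global_typicality(query, means, sigmas, supports))
```

Queries that fall near a heavily populated mode receive a higher value than queries in the trough between modes, which is the behaviour the inference example relies on.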
In addition, if we want to know the continuous global typicality of all the values larger than $t$, we can integrate as follows:

$$T(x > t) = \int_{t}^{\infty} \tau^{G}_{K}(x)\,dx \tag{24}$$

For example, when the Euclidean distance is used, and considering only one-dimensional data for a simpler derivation, equation (24) can be re-written as:

$$T(x > t) = \frac{1}{K}\sum_{i=1}^{C} S_{K,i}\int_{t}^{\infty} \frac{dx}{\pi\,\sigma_{K,i}\left(1 + \frac{(x-\mu_{K,i})^{2}}{\sigma_{K,i}^{2}}\right)} = \frac{1}{K}\sum_{i=1}^{C} S_{K,i}\left(\frac{1}{2} - \frac{1}{\pi}\arctan\frac{t-\mu_{K,i}}{\sigma_{K,i}}\right) \tag{25}$$

Let us continue the example in Fig. 9. If we want to know the continuous global typicality of all the data samples above 0 °C, which is the green area of this figure, we can calculate the value using equation (25). That means that the likelihood of a value being equal to or greater than 0 °C is 4.47%. One can see that the continuous global typicality can serve as a form of probability.

C. Naïve EDA Classifier

In this sub-section, we borrow the concept of naïve Bayes classifiers [1]-[3] and propose a new version of the naïve EDA classifier. In contrast with the original naïve EDA classifier proposed in [15], which relies for inference on the discrete global typicality and linear interpolation and/or extrapolation, the naïve EDA classifier in this paper uses the continuous global typicality instead, which is based on the local modes of the discrete global typicality identified by an automatic procedure as described in section III.B. This procedure is more effective in reflecting the ensemble features of the distribution of the data samples of different classes in the data space.

As the proposed approach accommodates various types of distance/similarity metrics, one can use the current knowledge in the area to choose the desired distance measure for a reasonable approximation that simplifies the processing. Moreover, one can change to other distance measures easily and compare the results obtained by the classifier with different types of measures. For consistency, in the following numerical examples, we use the Euclidean distance.

Fig. 8. Continuous global typicality of the Seeds dataset [28], Combined Cycle Power Plant dataset [29] and Wine Quality dataset [30]: (a) Seeds dataset; (b) Combined Cycle Power Plant dataset; (c) Wine Quality dataset.

Fig. 9. Continuous global typicality of wind chill data and simple inferences.

Let us assume $H$ classes at the $K$th time instance, where some classes may have many data clouds. The continuous global typicality per class can be defined as ($j = 1, 2, \ldots, H$):

$$\tau^{G}_{K,j}(x) = \frac{\sum_{i=1}^{W_{j}} S_{K,j,i}\, D^{L}_{K,j,i}(x)}{\int_{x}\sum_{i=1}^{W_{j}} S_{K,j,i}\, D^{L}_{K,j,i}(x)\,dx} \tag{26}$$

where $W_{j}$ is the number of data clouds sharing the same $j$th class label, with $\sum_{j=1}^{H} W_{j} = C$; $S_{K,j,i}$ is the support of the $i$th data cloud having the $j$th class label; and $D^{L}_{K,j,i}(x)$ is the corresponding continuous local density. For any unlabeled data sample $x$, its label is decided by the following expression:

$$\text{label}(x) = \arg\max_{j = 1, 2, \ldots, H}\ \tau^{G}_{K,j}(x) \tag{27}$$

The plots (wind chill and wind gust) of the continuous global typicality with Euclidean type of distance of the real climate dataset are given in Supplementary Fig. 7.

The performance of the proposed naïve EDA classifier is further tested on the following problems:
i) Banknote Authentication dataset [33];
ii) Pima dataset [34];
iii) Climate dataset [20];
iv) Pen-Based Handwritten Digits Recognition dataset [35];
v) Madelon dataset [36];
vi) Optical Handwritten Digits Recognition dataset [37];
vii) Occupancy Detection dataset [38].

TABLE I. CLASSIFICATION PERFORMANCE - 3 PRINCIPAL COMPONENTS CONSIDERED. Overall accuracy per dataset for the naïve EDA classifier, the SVM classifier and the naïve Bayes classifier. Datasets: Banknote, Pima, Climate, Pendigit, Madelon, Optdigit, Occupancy detection (two testing sets).

The proposed naïve EDA classifier is compared with an SVM classifier with a Gaussian radial basis function and with a naïve Bayes classifier in terms of their performance. The details of the datasets used in the classification are given in Supplementary Section B. In the experiments, PCA [31] is applied as a pre-processing step to reduce the dimensionality and balance the variances of the datasets. It has to be stressed that PCA is not a part of the proposed method and is not necessary for simpler problems. For the Banknote Authentication, Pima and Climate datasets, we randomly select 70% of the data for training and use the rest for validation.
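The tail integral and the arg-max labelling rule described above can be sketched for one-dimensional data as follows. This is a hedged illustration rather than the authors' implementation: it assumes the Euclidean case with unit-integral Cauchy kernels per data cloud, and the helper names (`tail_typicality`, `class_typicality`, `naive_eda_predict`), the class labels and all cloud parameters are hypothetical.

```python
import math


def cauchy_kernel(x, mu, sigma):
    """Unit-integral 1-D Cauchy kernel centred on a data-cloud mean."""
    return 1.0 / (math.pi * sigma * (1.0 + ((x - mu) / sigma) ** 2))


def tail_typicality(t, clouds):
    """Typicality mass of all values above t (closed form with arctan).

    clouds is a list of (mean, sigma, support) tuples.
    """
    total_support = sum(support for _, _, support in clouds)
    mass = sum(
        support * (0.5 - math.atan((t - mu) / sigma) / math.pi)
        for mu, sigma, support in clouds
    )
    return mass / total_support


def class_typicality(x, clouds):
    """Support-weighted mixture of Cauchy kernels for one class."""
    total_support = sum(support for _, _, support in clouds)
    return sum(
        support * cauchy_kernel(x, mu, sigma) for mu, sigma, support in clouds
    ) / total_support


def naive_eda_predict(x, clouds_by_class):
    """Return the label whose class typicality at x is largest."""
    return max(
        clouds_by_class,
        key=lambda label: class_typicality(x, clouds_by_class[label]),
    )


# Hypothetical data clouds per class: (mean, sigma, support).
clouds_by_class = {
    "low": [(-5.0, 3.0, 400)],
    "high": [(10.0, 4.0, 450), (18.0, 2.0, 150)],
}

print(tail_typicality(0.0, clouds_by_class["low"]))
print(naive_eda_predict(1.0, clouds_by_class))
print(naive_eda_predict(9.0, clouds_by_class))
```

Because each class typicality integrates to one, the comparison in `naive_eda_predict` does not favour a class merely for having more training samples; a prior-weighted variant would be a separate design choice.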
The performance is evaluated after 10-fold cross-validation. For the Pen-Based Digits, Madelon, Optical Digits and Occupancy Detection datasets, we train the classifiers with the training sets and conduct the validation with the testing/validation sets. The overall performance of the 3 classifiers is tabulated in Table I, where we consider the first 3 principal components for classification. Considering the first 5 principal components, the overall results obtained by the classifiers are tabulated in Table II. As shown in Tables I and II, the proposed naïve EDA classifier outperforms the SVM classifier and the naïve Bayes classifier in the majority of the numerical examples, giving the best overall performance of the three. In addition, it is worth noting that the classification conducted by the naïve EDA classifier is totally free from unrealistic assumptions, restrictions or prior knowledge.

TABLE II. CLASSIFICATION PERFORMANCE - 5 PRINCIPAL COMPONENTS CONSIDERED. Overall accuracy per dataset for the naïve EDA classifier, the SVM classifier and the naïve Bayes classifier. Datasets: Pima, Climate, Pendigit, Madelon, Optdigit, Occupancy detection (two testing sets).

VI. CONCLUSION AND FUTURE DIRECTION

In this paper, we propose a new systematic approach to derive ensemble properties of data without any prior assumptions about data sources, amount of data and user- or problem-specific parameters. The EDA (Empirical Data Analytics) framework considers only the relative position of data in a metric space and extracts from the raw experimental discrete observations a series of measures of their ensemble properties, such as the cumulative proximity ($q$), centrality ($C$), square centrality ($q^{-1}$), standardized eccentricity ($\varepsilon$), density ($D$) as well as typicality ($\tau$). The local and global versions of the typicality are both considered, originally in discrete form and then in continuous form, approximating the actual data-driven discrete estimators by a mixture of local functions. It was demonstrated that, when the distance metric used is Euclidean, the density (both in its discrete form, which describes the actual data exactly, and in its continuous form, which approximates the density of the entire data space) takes the form of a Cauchy function. Importantly, however, this is not an assumption made a priori, but is driven and parameterized by the data and the selected metric.

Furthermore, we propose an autonomous algorithm for identifying all local modes/maxima of the global discrete typicality, as well as for filtering out the main local maxima based on the closeness of each local maximum. Finally, we present a number of numerical examples aiming to verify the methodology and demonstrate its advantages. We introduce a new type of classifier, which we call naïve EDA, for investigating the unknown data pattern behind the large amount of data in a data-rich environment.

In conclusion, the proposed EDA framework and methodology provides an efficient alternative that is entirely based on the experimental data and the evidence. It touches the very foundations of data mining and analysis and, thus, has a wide area of applications, especially in the era of big data and data streams, where handcrafting offline methods and making detailed assumptions is often not an option. Nonetheless, we have to admit that the bottlenecks of the proposed methodology are the lack of theoretical confidence levels for the analysis and of a theoretical treatment of reliability and generalization, which are inherited limitations of nonparametric approaches. In this paper, we only provide the preliminary algorithms and results on data partitioning, analysis, inference and classification.
As future work, we will focus on developing more advanced algorithms within the EDA framework for various applications in different areas, including, but not limited to, high frequency trading data processing, foreign currency trading, handwritten digits recognition, remote sensing, etc.

REFERENCES

[1] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Berlin: Springer, 2009.
[2] C. M. Bishop, Pattern Recognition. New York: Springer, 2006.
[3] R. O. Duda, P. E. Hart, and D. G. Stork, Pattern Classification, 2nd ed. Chichester, West Sussex, UK: Wiley-Interscience, 2000.
[4] T. Bayes, "An essay towards solving a problem in the doctrine of chances," Philos. Trans. R. Soc., vol. 53, p. 370, 1763.
[5] J. Principe, Information Theoretic Learning: Renyi's Entropy and Kernel Perspectives. Springer, 2010.
[6] C. Spearman, "The proof and measurement of association between two things," Am. J. Psychol., vol. 15, pp. 72-101, 1904.
[7] M. G. Kendall, "A new measure of rank correlation," Biometrika, vol. 30, pp. 81-93, 1938.
[8] L. A. Goodman and W. H. Kruskal, "Measures of association for cross classifications," J. Am. Stat. Assoc., vol. 49, no. 268, 1954.
[9] P. Del Moral, "Nonlinear filtering: interacting particle resolution," Comptes Rendus de l'Académie des Sciences - Series I - Mathematics, vol. 325, no. 6, 1997.
[10] L. A. Zadeh, "Fuzzy sets," Inf. Control, vol. 8, no. 3, 1965.
[11] M. Chen and D. A. Linkens, "Rule-base self-generation and simplification for data-driven fuzzy models," Fuzzy Sets Syst., vol. 142, 2004.
[12] P. P. Angelov, "Anomaly detection based on eccentricity analysis," in 2014 IEEE Symposium Series in Computational Intelligence, IEEE Symposium on Evolving and Autonomous Learning Systems, EALS, SSCI 2014, 2014, pp. 1-8.
[13] P. Angelov, "Outside the box: an alternative data analytics framework," J. Autom. Mob. Robot. Intell. Syst., vol. 8, 2014.
[14] P. Angelov, X. Gu, and D. Kangin, "Empirical data analytics," Int. J. Intell. Syst., DOI 10.1002/int.21899, 2017.
[15] P. P. Angelov, X. Gu, J. Principe, and D. Kangin, "Empirical data analysis - a new tool for data analytics," in IEEE International Conference on Systems, Man, and Cybernetics, 2016.
[16] G. Sabidussi, "The centrality index of a graph," Psychometrika, vol. 31, no. 4, 1966.
[17] L. C. Freeman, "Centrality in social networks conceptual clarification," Soc. Networks, vol. 1, no. 3, pp. 215-239, 1979.
[18] J. G. Saw, M. C. K. Yang, and T. S. E. C. Mo, "Chebyshev inequality with estimated mean and variance," Am. Stat., vol. 38, pp. 130-132, 1984.
[19] P. Angelov, Autonomous Learning Systems: From Data Streams to Knowledge in Real Time. John Wiley & Sons, Ltd., 2012.
[20] Climate Dataset in Manchester.
[21] S. Nadarajah and S. Kotz, "Probability integrals of the multivariate t distribution," Can. Appl. Math. Q., vol. 13, 2005.
[22] C. Lee, "Fast simulated annealing with a multivariate Cauchy distribution and the configuration's initial temperature," J. Korean Phys. Soc., vol. 66, no. 10, 2015.
[23] S. Y. Shatskikh, "Multivariate Cauchy distributions as locally Gaussian distributions," J. Math. Sci., vol. 78, pp. 102-108, 1996.
[24] A. Corduneanu and C. M. Bishop, "Variational Bayesian model selection for mixture distributions," Proc. Eighth Int. Conf. Artif. Intell. Stat., pp. 27-34, 2001.
[25] P. P. Angelov and X. Gu, "Empirical fuzzy sets," under review, 2017.
[26] P. Angelov and R. Yager, "A new type of simplified fuzzy rule-based system," Int. J. Gen. Syst., vol. 41, 2012.
[27] A. Okabe, B. Boots, K. Sugihara, and S. N. Chiu, Spatial Tessellations: Concepts and Applications of Voronoi Diagrams, 2nd ed. Chichester, England: John Wiley & Sons, 1999.
[28] Seeds Dataset.
[29] Combined Cycle Power Plant Dataset.
[30] Wine Quality Dataset.
[31] I. Jolliffe, Principal Component Analysis. John Wiley & Sons, Ltd., 2002.
[32] Clustering datasets.
[33] Banknote Authentication Dataset.
[34] Pima Indians Diabetes Dataset.
[35] Pen-Based Recognition of Handwritten Digits Dataset.
[36] Madelon Dataset.
[37] Optical Recognition of Handwritten Digits Dataset.
[38] Occupancy Detection Dataset.

Plamen P. Angelov (F'16, SM'04, M'99) is a Chair Professor in Intelligent Systems with the School of Computing and Communications, Lancaster University, UK. He obtained his PhD (1993) and his DSc (2015) from the Bulgarian Academy of Science. He is the Vice President of the International Neural Networks Society, a member of the Board of Governors of the Systems, Man and Cybernetics Society of the IEEE, and a Distinguished Lecturer of IEEE. He is Editor-in-Chief of the Evolving Systems journal (Springer) and Associate Editor of IEEE Transactions on Fuzzy Systems as well as of IEEE Transactions on Cybernetics and several other journals. He has received various awards and is internationally recognized for pioneering results in on-line and evolving methodologies and algorithms for knowledge extraction in the form of human-intelligible fuzzy rule-based systems and autonomous machine learning. He holds a wide portfolio of research projects and leads the Data Science group at Lancaster.

Xiaowei Gu received the B.E. and M.E. degrees from the Hangzhou Dianzi University, Hangzhou, China. He is currently pursuing the Ph.D. degree in computer science with Lancaster University, UK.

José C. Príncipe (F'00) is a Distinguished Professor of Electrical and Computer Engineering at the University of Florida. He is also the Eckis Professor and Founding Director of the Computational NeuroEngineering Laboratory (CNEL), University of Florida. His primary research interests are advanced signal processing with information theoretic criteria (entropy and mutual information), adaptive models in the reproducing kernel Hilbert spaces (RKHS) and the application of these advanced algorithms in Brain Machine Interfaces (BMI). Dr. Príncipe is a Fellow of the IEEE, ABME and AIBME. He is the past Editor-in-Chief of the IEEE Transactions on Biomedical Engineering, past Chair of the Technical Committee on Neural Networks of the IEEE Signal Processing Society and past President of the International Neural Network Society. He received the IEEE EMBS Career Award and the IEEE Neural Network Pioneer Award.


Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

Obstacle Avoidance by Using Modified Hopfield Neural Network

Obstacle Avoidance by Using Modified Hopfield Neural Network bstacle Avodance by Usng Modfed Hopfeld Neral Network Panrasee Rtthpravat Center of peraton for Feld Robotcs Development (FIB), Kng Mongkt s Unversty of Technology Thonbr. 91 Sksawas road Tongkr Bangkok

More information

Collaboratively Regularized Nearest Points for Set Based Recognition

Collaboratively Regularized Nearest Points for Set Based Recognition Academc Center for Computng and Meda Studes, Kyoto Unversty Collaboratvely Regularzed Nearest Ponts for Set Based Recognton Yang Wu, Mchhko Mnoh, Masayuk Mukunok Kyoto Unversty 9/1/013 BMVC 013 @ Brstol,

More information

Classifying Acoustic Transient Signals Using Artificial Intelligence

Classifying Acoustic Transient Signals Using Artificial Intelligence Classfyng Acoustc Transent Sgnals Usng Artfcal Intellgence Steve Sutton, Unversty of North Carolna At Wlmngton (suttons@charter.net) Greg Huff, Unversty of North Carolna At Wlmngton (jgh7476@uncwl.edu)

More information

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics Introducton G10 NAG Fortran Lbrary Chapter Introducton G10 Smoothng n Statstcs Contents 1 Scope of the Chapter... 2 2 Background to the Problems... 2 2.1 Smoothng Methods... 2 2.2 Smoothng Splnes and Regresson

More information

Fusion of Static and Dynamic Body Biometrics for Gait Recognition

Fusion of Static and Dynamic Body Biometrics for Gait Recognition Fson of Statc and Dynamc Body Bometrcs for Gat Recognton Lang Wang, Hazhong Nng, Ten Tan, Wemng H Natonal Laboratory of Pattern Recognton (NLPR) Insttte of Atomaton, Chnese Academy of Scences, Bejng, P.

More information

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,

More information

SHAPE RECOGNITION METHOD BASED ON THE k-nearest NEIGHBOR RULE

SHAPE RECOGNITION METHOD BASED ON THE k-nearest NEIGHBOR RULE SHAPE RECOGNITION METHOD BASED ON THE k-nearest NEIGHBOR RULE Dorna Purcaru Faculty of Automaton, Computers and Electroncs Unersty of Craoa 13 Al. I. Cuza Street, Craoa RO-1100 ROMANIA E-mal: dpurcaru@electroncs.uc.ro

More information

Meta-heuristics for Multidimensional Knapsack Problems

Meta-heuristics for Multidimensional Knapsack Problems 2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,

More information

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes SPH3UW Unt 7.3 Sphercal Concave Mrrors Page 1 of 1 Notes Physcs Tool box Concave Mrror If the reflectng surface takes place on the nner surface of the sphercal shape so that the centre of the mrror bulges

More information

Unsupervised Learning

Unsupervised Learning Pattern Recognton Lecture 8 Outlne Introducton Unsupervsed Learnng Parametrc VS Non-Parametrc Approach Mxture of Denstes Maxmum-Lkelhood Estmates Clusterng Prof. Danel Yeung School of Computer Scence and

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

Intra-Parametric Analysis of a Fuzzy MOLP

Intra-Parametric Analysis of a Fuzzy MOLP Intra-Parametrc Analyss of a Fuzzy MOLP a MIAO-LING WANG a Department of Industral Engneerng and Management a Mnghsn Insttute of Technology and Hsnchu Tawan, ROC b HSIAO-FAN WANG b Insttute of Industral

More information

Lecture 5: Multilayer Perceptrons

Lecture 5: Multilayer Perceptrons Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009.

Assignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009. Farrukh Jabeen Algorthms 51 Assgnment #2 Due Date: June 15, 29. Assgnment # 2 Chapter 3 Dscrete Fourer Transforms Implement the FFT for the DFT. Descrbed n sectons 3.1 and 3.2. Delverables: 1. Concse descrpton

More information

Some Advanced SPC Tools 1. Cumulative Sum Control (Cusum) Chart For the data shown in Table 9-1, the x chart can be generated.

Some Advanced SPC Tools 1. Cumulative Sum Control (Cusum) Chart For the data shown in Table 9-1, the x chart can be generated. Some Advanced SP Tools 1. umulatve Sum ontrol (usum) hart For the data shown n Table 9-1, the x chart can be generated. However, the shft taken place at sample #21 s not apparent. 92 For ths set samples,

More information

Three supervised learning methods on pen digits character recognition dataset

Three supervised learning methods on pen digits character recognition dataset Three supervsed learnng methods on pen dgts character recognton dataset Chrs Flezach Department of Computer Scence and Engneerng Unversty of Calforna, San Dego San Dego, CA 92093 cflezac@cs.ucsd.edu Satoru

More information

Problem Set 3 Solutions

Problem Set 3 Solutions Introducton to Algorthms October 4, 2002 Massachusetts Insttute of Technology 6046J/18410J Professors Erk Demane and Shaf Goldwasser Handout 14 Problem Set 3 Solutons (Exercses were not to be turned n,

More information

SVM-based Learning for Multiple Model Estimation

SVM-based Learning for Multiple Model Estimation SVM-based Learnng for Multple Model Estmaton Vladmr Cherkassky and Yunqan Ma Department of Electrcal and Computer Engneerng Unversty of Mnnesota Mnneapols, MN 55455 {cherkass,myq}@ece.umn.edu Abstract:

More information

Empirical Distributions of Parameter Estimates. in Binary Logistic Regression Using Bootstrap

Empirical Distributions of Parameter Estimates. in Binary Logistic Regression Using Bootstrap Int. Journal of Math. Analyss, Vol. 8, 4, no. 5, 7-7 HIKARI Ltd, www.m-hkar.com http://dx.do.org/.988/jma.4.494 Emprcal Dstrbutons of Parameter Estmates n Bnary Logstc Regresson Usng Bootstrap Anwar Ftranto*

More information

SUPPLEMENTARY INFORMATION

SUPPLEMENTARY INFORMATION The crrent I NS as a fncton of the bas oltage V throgh a N/S pont contact nterface can be descrbed by the BTK theory [3] n whch the nterfacal barrer s represented by a fncton wth a dmensonless strength

More information

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach Angle Estmaton and Correcton of Hand Wrtten, Textual and Large areas of Non-Textual Document Images: A Novel Approach D.R.Ramesh Babu Pyush M Kumat Mahesh D Dhannawat PES Insttute of Technology Research

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

High Dimensional Data Clustering

High Dimensional Data Clustering Hgh Dmensonal Data Clusterng Charles Bouveyron 1,2, Stéphane Grard 1, and Cordela Schmd 2 1 LMC-IMAG, BP 53, Unversté Grenoble 1, 38041 Grenoble Cede 9, France charles.bouveyron@mag.fr, stephane.grard@mag.fr

More information

Range images. Range image registration. Examples of sampling patterns. Range images and range surfaces

Range images. Range image registration. Examples of sampling patterns. Range images and range surfaces Range mages For many structured lght scanners, the range data forms a hghly regular pattern known as a range mage. he samplng pattern s determned by the specfc scanner. Range mage regstraton 1 Examples

More information

Machine Learning. Topic 6: Clustering

Machine Learning. Topic 6: Clustering Machne Learnng Topc 6: lusterng lusterng Groupng data nto (hopefully useful) sets. Thngs on the left Thngs on the rght Applcatons of lusterng Hypothess Generaton lusters mght suggest natural groups. Hypothess

More information

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices Internatonal Mathematcal Forum, Vol 7, 2012, no 52, 2549-2554 An Applcaton of the Dulmage-Mendelsohn Decomposton to Sparse Null Space Bases of Full Row Rank Matrces Mostafa Khorramzadeh Department of Mathematcal

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

Modular PCA Face Recognition Based on Weighted Average

Modular PCA Face Recognition Based on Weighted Average odern Appled Scence odular PCA Face Recognton Based on Weghted Average Chengmao Han (Correspondng author) Department of athematcs, Lny Normal Unversty Lny 76005, Chna E-mal: hanchengmao@163.com Abstract

More information

Person Identity Clustering in TV Show Videos

Person Identity Clustering in TV Show Videos Person Identty Clsterng n TV Show Vdeos Yna Han*, Gzhong L* *School of Electrcal and Informaton Engneerng, X an Jaotong Unversty, X an, P.R.Chna Emal:yan@malst.xjt.ed.cn, lgz@mal.xjt.ed.cn Keywords: Identty

More information

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints Australan Journal of Basc and Appled Scences, 2(4): 1204-1208, 2008 ISSN 1991-8178 Sum of Lnear and Fractonal Multobjectve Programmng Problem under Fuzzy Rules Constrants 1 2 Sanjay Jan and Kalash Lachhwan

More information

Module Management Tool in Software Development Organizations

Module Management Tool in Software Development Organizations Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,

More information

BioTechnology. An Indian Journal FULL PAPER. Trade Science Inc.

BioTechnology. An Indian Journal FULL PAPER. Trade Science Inc. [Type text] [Type text] [Type text] ISSN : 0974-74 Volume 0 Issue BoTechnology 04 An Indan Journal FULL PAPER BTAIJ 0() 04 [684-689] Revew on Chna s sports ndustry fnancng market based on market -orented

More information

A Novel Adaptive Descriptor Algorithm for Ternary Pattern Textures

A Novel Adaptive Descriptor Algorithm for Ternary Pattern Textures A Novel Adaptve Descrptor Algorthm for Ternary Pattern Textures Fahuan Hu 1,2, Guopng Lu 1 *, Zengwen Dong 1 1.School of Mechancal & Electrcal Engneerng, Nanchang Unversty, Nanchang, 330031, Chna; 2. School

More information

FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK

FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK L-qng Qu, Yong-quan Lang 2, Jng-Chen 3, 2 College of Informaton Scence and Technology, Shandong Unversty of Scence and Technology,

More information

Unsupervised Learning and Clustering

Unsupervised Learning and Clustering Unsupervsed Learnng and Clusterng Why consder unlabeled samples?. Collectng and labelng large set of samples s costly Gettng recorded speech s free, labelng s tme consumng 2. Classfer could be desgned

More information

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION

MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION MULTISPECTRAL IMAGES CLASSIFICATION BASED ON KLT AND ATR AUTOMATIC TARGET RECOGNITION Paulo Quntlano 1 & Antono Santa-Rosa 1 Federal Polce Department, Brasla, Brazl. E-mals: quntlano.pqs@dpf.gov.br and

More information

A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION

A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION 1 THE PUBLISHING HOUSE PROCEEDINGS OF THE ROMANIAN ACADEMY, Seres A, OF THE ROMANIAN ACADEMY Volume 4, Number 2/2003, pp.000-000 A PATTERN RECOGNITION APPROACH TO IMAGE SEGMENTATION Tudor BARBU Insttute

More information

Optimizing Document Scoring for Query Retrieval

Optimizing Document Scoring for Query Retrieval Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng

More information

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide

Lobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide Lobachevsky State Unversty of Nzhn Novgorod Polyhedron Quck Start Gude Nzhn Novgorod 2016 Contents Specfcaton of Polyhedron software... 3 Theoretcal background... 4 1. Interface of Polyhedron... 6 1.1.

More information

A Comparative Study of Constraint-Handling Techniques in Evolutionary Constrained Multiobjective Optimization

A Comparative Study of Constraint-Handling Techniques in Evolutionary Constrained Multiobjective Optimization A omparatve Stdy of onstrant-handlng Technqes n Evoltonary onstraned Mltobectve Optmzaton Ja-Peng L, Yong Wang, Member, IEEE, Shengxang Yang, Senor Member, IEEE, and Zxng a, Senor Member, IEEE Abstract

More information

3D vector computer graphics

3D vector computer graphics 3D vector computer graphcs Paolo Varagnolo: freelance engneer Padova Aprl 2016 Prvate Practce ----------------------------------- 1. Introducton Vector 3D model representaton n computer graphcs requres

More information

BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION

BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION BAYESIAN MULTI-SOURCE DOMAIN ADAPTATION SHI-LIANG SUN, HONG-LEI SHI Department of Computer Scence and Technology, East Chna Normal Unversty 500 Dongchuan Road, Shangha 200241, P. R. Chna E-MAIL: slsun@cs.ecnu.edu.cn,

More information

Approximating MAP using Local Search

Approximating MAP using Local Search UAI21 PARK & DARWICHE 43 Approxmatng MAP sng Local Search James D Park and Adnan Darwche Compter Scence Department Unversty of Calforna Los Angeles, CA 995 {jd,darwche }@csclaed Abstract MAP s the problem

More information

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster

More information

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET

BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET 1 BOOSTING CLASSIFICATION ACCURACY WITH SAMPLES CHOSEN FROM A VALIDATION SET TZU-CHENG CHUANG School of Electrcal and Computer Engneerng, Purdue Unversty, West Lafayette, Indana 47907 SAUL B. GELFAND School

More information

Announcements. Supervised Learning

Announcements. Supervised Learning Announcements See Chapter 5 of Duda, Hart, and Stork. Tutoral by Burge lnked to on web page. Supervsed Learnng Classfcaton wth labeled eamples. Images vectors n hgh-d space. Supervsed Learnng Labeled eamples

More information

Reducing Frame Rate for Object Tracking

Reducing Frame Rate for Object Tracking Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg

More information

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching

A Fast Visual Tracking Algorithm Based on Circle Pixels Matching A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng

More information