A Symbolic Representation of Time Series, with Implications for Streaming Algorithms


Jessica Lin, Eamonn Keogh, Stefano Lonardi, Bill Chiu
University of California - Riverside
Computer Science & Engineering Department
Riverside, CA 92521, USA
{jessica, eamonn, stelo, bill}@cs.ucr.edu

ABSTRACT
The parallel explosions of interest in streaming data, and data mining of time series have had surprisingly little intersection. This is in spite of the fact that time series data are typically streaming data. The main reason for this apparent paradox is the fact that the vast majority of work on streaming data explicitly assumes that the data is discrete, whereas the vast majority of time series data is real valued.

Many researchers have also considered transforming real valued time series into symbolic representations, noting that such representations would potentially allow researchers to avail of the wealth of data structures and algorithms from the text processing and bioinformatics communities, in addition to allowing formerly batch-only problems to be tackled by the streaming community. While many symbolic representations of time series have been introduced over the past decades, they all suffer from three fatal flaws. Firstly, the dimensionality of the symbolic representation is the same as the original data, and virtually all data mining algorithms scale poorly with dimensionality. Secondly, although distance measures can be defined on the symbolic approaches, these distance measures have little correlation with distance measures defined on the original time series. Finally, most of these symbolic approaches require one to have access to all the data, before creating the symbolic representation. This last feature explicitly thwarts efforts to use the representations with streaming algorithms.

In this work we introduce a new symbolic representation of time series. Our representation is unique in that it allows dimensionality/numerosity reduction, and it also allows distance measures to be defined on the symbolic approach that lower bound corresponding distance measures defined on the original series.
As we shall demonstrate, this latter feature is particularly exciting because it allows one to run certain data mining algorithms on the efficiently manipulated symbolic representation, while producing identical results to the algorithms that operate on the original data. Finally, our representation allows the real valued data to be converted in a streaming fashion, with only an infinitesimal time and space overhead. We will demonstrate the utility of our representation on the classic data mining tasks of clustering, classification, query by content and anomaly detection.

Keywords
Time Series, Data Mining, Data Streams, Symbolic, Discretize.

1. INTRODUCTION
The parallel explosions of interest in streaming data [4, 8,, 8], and data mining of time series [6, 7, 9, 2, 2, 24, 26, 34] have had surprisingly little intersection. This is in spite of the fact that time series data are typically streaming data, for example, stock value, medical and meteorological data [3]. The main reason for this apparent paradox is the fact that the vast majority of work on streaming data explicitly assumes that the data is discrete, whereas the vast majority of time series data is real valued [23].

Many high level representations of time series have been proposed for data mining. Figure 1 illustrates a hierarchy of all the various time series representations in the literature [2, 7, 4, 6, 2, 22, 25, 3, 3, 35]. One representation that the data mining community has not considered in detail is the discretization of the original data into symbolic strings. At first glance this seems a surprising oversight. In addition to allowing the framing of time series problems as streaming problems, there is an enormous wealth of existing algorithms and data structures that allow the efficient manipulations of symbolic representations. Such algorithms have received decades of attention in the text retrieval community, and more recent attention from the bioinformatics community [3, 3, 7, 29, 32, 33]. Some simple examples of tools that are not defined for real-valued sequences but are defined for symbolic approaches include hashing, Markov models, suffix trees, decision trees etc.
As a more concrete example, consider the Jaccard coefficient [3], a distance measure beloved by streaming researchers. The Jaccard coefficient is only well defined for discrete data (such as web clicks or individual keystrokes) and thus cannot be used with real-valued time series.

There is a simple explanation for the data mining community's lack of interest in symbolic manipulation as a supporting technique for mining time series. If the data are transformed into virtually any of the other representations depicted in Figure 1, then it is possible to measure the similarity of two time series in that representation space, such that the distance is guaranteed to lower bound the true distance between the time series in the original space. This simple fact is at the core of almost all algorithms in time series data mining and indexing [4]. However, in spite of the fact that there are dozens of techniques for producing different variants of the symbolic representation [2,, 2], there is no known method for calculating the distance in the symbolic space, while providing the lower bounding guarantee.

In addition to allowing the creation of lower bounding distance measures, there is one other highly desirable property of any time series representation, including a symbolic one. Almost all time series datasets are very high dimensional. This is a challenging fact because all non-trivial data mining and indexing algorithms degrade exponentially with dimensionality. For example, above 16-20 dimensions, index structures degrade to sequential scanning [9].

[Figure 1 node labels, in extraction order: Time Series Representations; Data Adaptive; Non Data Adaptive; Sorted Coefficients; Piecewise Linear Approximation; Interpolation; Regression; Piecewise Polynomial; Singular Value Decomposition; Adaptive Piecewise Constant Approximation; Symbolic; Natural Language; Lower Bounding; Strings; Trees; Non-Lower Bounding; Haar; Wavelets; Random Mappings; Orthonormal; Bi-Orthonormal; Daubechies dbn, n > 1; Coiflets; Symlets; Spectral; Discrete Fourier Transform; Piecewise Aggregate Approximation; Discrete Cosine Transform]

Figure 1: A hierarchy of all the various time series representations in the literature. The leaf nodes refer to the actual representation, and the internal nodes refer to the classification of the approach. The contribution of this paper is to introduce a new representation, the lower bounding symbolic approach

None of the symbolic representations that we are aware of allow dimensionality reduction [2,, 2]. There is some reduction in the storage space required, since fewer bits are required for each value; however, the intrinsic dimensionality of the symbolic representation is the same as the original data.

In [4], Babcock et al. ask if there is a need for database researchers to develop fundamental and general-purpose models for data streams. The opinion of the authors is affirmative. In this work we take a step towards this goal by introducing a representation of time series that is suitable for streaming algorithms. It is dimensionality reducing, lower bounding and can be obtained in a streaming fashion.

As we shall demonstrate, the lower bounding feature is particularly exciting because it allows one to run certain data mining algorithms on the efficiently manipulated symbolic representation, while producing identical results to the algorithms that operate on the original data. In particular, we will demonstrate the utility of our representation on the classic data mining tasks of clustering [2], classification [6], indexing [, 4, 22, 35], and anomaly detection [9, 24, 3].

The rest of this paper is organized as follows. Section 2 briefly discusses background material on time series data mining and related work.
Section 3 introduces our novel symbolic approach, and discusses its dimensionality reduction, numerosity reduction and lower bounding abilities. Section 4 contains an experimental evaluation of the symbolic approach on a variety of data mining tasks. Finally, Section 5 offers some conclusions and suggestions for future work.

2. BACKGROUND AND RELATED WORK
Time series data mining has attracted enormous attention in the last decade. The review below is necessarily brief; we refer interested readers to [3, 23] for a more in depth review.

2.1 Time Series Data Mining Tasks
While making no pretence to be exhaustive, the following list summarizes the areas that have seen the majority of research interest in time series data mining.

Indexing: Given a query time series Q, and some similarity/dissimilarity measure D(Q,C), find the most similar time series in database DB [, 7, 4,22, 35].
Clustering: Find natural groupings of the time series in database DB under some similarity/dissimilarity measure D(Q,C) [2,25].
Classification: Given an unlabeled time series Q, assign it to one of two or more predefined classes [6].
Summarization: Given a time series Q containing n datapoints where n is an extremely large number, create a (possibly graphic) approximation of Q which retains its essential features but fits on a single page, computer screen, executive summary, etc. [26].
Anomaly Detection: Given a time series Q, and some model of normal behavior, find all sections of Q which contain anomalies, or surprising/interesting/unexpected/novel behavior [9, 24, 3].

Since the datasets encountered by data miners typically don't fit in main memory, and disk I/O tends to be the bottleneck for any data mining task, a simple generic framework for time series data mining has emerged [4]. The basic approach is outlined in Table 1.

1. Create an approximation of the data, which will fit in main memory, yet retains the essential features of interest.
2. Approximately solve the task at hand in main memory.
3. Make (hopefully very few) accesses to the original data on disk to confirm the solution obtained in Step 2, or to modify the solution so it agrees with the solution we would have obtained on the original data.
Table 1: A generic time series data mining approach

It should be clear that the utility of this framework depends heavily on the quality of the approximation created in Step 1. If the approximation is very faithful to the original data, then the solution obtained in main memory is likely to be the same as, or very close to, the solution we would have obtained on the original data. The handful of disk accesses made in Step 3 to confirm or slightly modify the solution will be inconsequential compared to the number of disk accesses required had we worked on the original data. With this in mind, there has been great interest in approximate representations of time series, which we consider below.

2.2 Time Series Representations
As with most problems in computer science, the suitable choice of representation greatly affects the ease and efficiency of time series data mining. With this in mind, a great number of time series representations have been introduced, including the Discrete Fourier Transform (DFT) [4], the Discrete Wavelet Transform (DWT) [7], Piecewise Linear, and Piecewise Constant models (PAA) [22], (APCA) [6, 22], and Singular Value Decomposition (SVD) [22]. Figure 2 illustrates the most commonly used representations.

Figure 2: The most common representations for time series data mining. Each can be visualized as an attempt to approximate the signal with a linear combination of basis functions

Recent work suggests that there is little to choose between the above in terms of indexing power [23]; however, the representations have other features that may act as strengths or weaknesses. As a simple example, wavelets have the useful multiresolution property, but are only defined for time series that are an integer power of two in length [7].

One important feature of all the above representations is that they are real valued. This limits the algorithms, data structures and definitions available for them. For example, in anomaly detection we cannot meaningfully define the probability of observing any particular set of wavelet coefficients, since the probability of observing any real number is zero [27]. Such limitations have led researchers to consider using a symbolic representation of time series.

While there are literally hundreds of papers on discretizing (symbolizing, tokenizing, quantizing) time series [2, 2] (see [] for an extensive survey), none of the techniques allows a distance measure that lower bounds a distance measure defined on the original time series. For this reason, the generic time series data mining approach illustrated in Table 1 is of little utility, since the approximate solution to the problem created in main memory may be arbitrarily dissimilar to the true solution that would have been obtained on the original data. If, however, one had a symbolic approach that allowed lower bounding of the true distance, one could take advantage of the generic time series data mining model, and of a host of other algorithms, definitions and data structures which are only defined for discrete data, including hashing, Markov models, and suffix trees. This is exactly the contribution of this paper. We call our symbolic representation of time series SAX (Symbolic Aggregate approXimation), and define it in the next section.

3. SAX: OUR SYMBOLIC APPROACH
SAX allows a time series of arbitrary length n to be reduced to a string of arbitrary length w, (w < n, typically w << n).
The alphabet size is also an arbitrary integer a, where a > 2. Table 2 summarizes the major notation used in this and subsequent sections.

C     A time series C = c_1, …, c_n
C̄     A Piecewise Aggregate Approximation of a time series, C̄ = c̄_1, …, c̄_w
Ĉ     A symbol representation of a time series, Ĉ = ĉ_1, …, ĉ_w
w     The number of PAA segments representing time series C
a     Alphabet size (e.g., for the alphabet = {a,b,c}, a = 3)

Table 2: A summarization of the notation used in this paper

[Figure 2 panels: Discrete Fourier Transform; Piecewise Linear Approximation; Haar Wavelet; Adaptive Piecewise Constant Approximation]

Our discretization procedure is unique in that it uses an intermediate representation between the raw time series and the symbolic strings. We first transform the data into the Piecewise Aggregate Approximation (PAA) representation and then symbolize the PAA representation into a discrete string. There are two important advantages to doing this:

Dimensionality Reduction: We can use the well-defined and well-documented dimensionality reduction power of PAA [22, 35], and the reduction is automatically carried over to the symbolic representation.
Lower Bounding: Proving that a distance measure between two symbolic strings lower bounds the true distance between the original time series is non-trivial. The key observation that allows us to prove lower bounds is to concentrate on proving that the symbolic distance measure lower bounds the PAA distance measure. Then we can prove the desired result by transitivity by simply pointing to the existing proofs for the PAA representation itself [35].

We will briefly review the PAA technique before considering the symbolic extension.

3.1 Dimensionality Reduction Via PAA
A time series C of length n can be represented in a w-dimensional space by a vector C̄ = c̄_1, …, c̄_w. The i-th element of C̄ is calculated by the following equation:

$$\bar{c}_i = \frac{w}{n} \sum_{j=\frac{n}{w}(i-1)+1}^{\frac{n}{w} i} c_j \qquad (1)$$

Simply stated, to reduce the time series from n dimensions to w dimensions, the data is divided into w equal sized frames. The mean value of the data falling within a frame is calculated and a vector of these values becomes the data-reduced representation.
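The frame-averaging step of Eq. 1 can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function name `paa` is hypothetical, and the sketch assumes that n is an exact multiple of w, as in the examples in this paper.

```python
def paa(series, w):
    """Piecewise Aggregate Approximation: the mean of each of w equal-sized frames.

    Assumption of this sketch: len(series) is an exact multiple of w.
    """
    n = len(series)
    if w <= 0 or n % w != 0:
        raise ValueError("this sketch requires n to be a positive multiple of w")
    frame = n // w  # number of points per frame, n / w
    return [sum(series[i * frame:(i + 1) * frame]) / frame for i in range(w)]


# Eight points reduced to four frame means.
reduced = paa([1, 2, 3, 4, 5, 6, 7, 8], 4)
print(reduced)
```

Each output coefficient is the mean of n/w consecutive input points, which is exactly the (w/n)-weighted sum in Eq. 1.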
The representation can be visualized as an attempt to approximate the original time series with a linear combination of box basis functions as shown in Figure 3.

Figure 3: The PAA representation can be visualized as an attempt to model a time series with a linear combination of box basis functions. In this case, a sequence of length 128 is reduced to 8 dimensions

The PAA dimensionality reduction is intuitive and simple, yet has been shown to rival more sophisticated dimensionality reduction techniques like Fourier transforms and wavelets [22, 23, 35].

We normalize each time series to have a mean of zero and a standard deviation of one before converting it to the PAA representation, since it is well understood that it is meaningless to compare time series with different offsets and amplitudes [23].

3.2 Discretization
Having transformed a time series database into PAA, we can apply a further transformation to obtain a discrete representation. It is desirable to have a discretization technique that will produce symbols with equiprobability [3, 28]. This is easily achieved since normalized time series have a Gaussian distribution [27]. To illustrate this, we extracted subsequences of length 128 from 8 different time series and plotted a normal probability plot of the data as shown in Figure 4.

Figure 4: A normal probability plot of the cumulative distribution of values from subsequences of length 128 from 8 different datasets. The highly linear nature of the plot strongly suggests that the data came from a Gaussian distribution

Given that the normalized time series have highly Gaussian distribution, we can simply determine the breakpoints that will produce a equal-sized areas under the Gaussian curve [27].

Definition 1. Breakpoints: breakpoints are a sorted list of numbers Β = β_1, …, β_{a-1} such that the area under a N(0,1) Gaussian curve from β_i to β_{i+1} = 1/a (β_0 and β_a are defined as -∞ and ∞, respectively).

These breakpoints may be determined by looking them up in a statistical table. For example, Table 3 gives the breakpoints for values of a from 3 to 10.

β_i \ a     3       4       5       6       7       8       9       10
β_1       -0.43   -0.67   -0.84   -0.97   -1.07   -1.15   -1.22   -1.28
β_2        0.43    0      -0.25   -0.43   -0.57   -0.67   -0.76   -0.84
β_3                0.67    0.25    0      -0.18   -0.32   -0.43   -0.52
β_4                        0.84    0.43    0.18    0      -0.14   -0.25
β_5                                0.97    0.57    0.32    0.14    0
β_6                                        1.07    0.67    0.43    0.25
β_7                                                1.15    0.76    0.52
β_8                                                        1.22    0.84
β_9                                                                1.28

Table 3: A lookup table that contains the breakpoints that divide a Gaussian distribution in an arbitrary number (from 3 to 10) of equiprobable regions

Once the breakpoints have been obtained we can discretize a time series in the following manner. We first obtain a PAA of the time series. All PAA coefficients that are below the smallest breakpoint are mapped to the symbol a, all coefficients greater than or equal to the smallest breakpoint and less than the second smallest breakpoint are mapped to the symbol b, etc. Figure 5 illustrates the idea.

Figure 5: A time series is discretized by first obtaining a PAA approximation and then using predetermined breakpoints to map the PAA coefficients into SAX symbols.
In the example above, with n = 128, w = 8 and a = 3, the time series is mapped to the word baabccbc

Note that in this example the 3 symbols, a, b, and c are approximately equiprobable as we desired. We call the concatenation of symbols that represent a subsequence a word.

Definition 2. Word: A subsequence C of length n can be represented as a word Ĉ = ĉ_1, …, ĉ_w as follows. Let alpha_i denote the i-th element of the alphabet, i.e., alpha_1 = a and alpha_2 = b. Then the mapping from a PAA approximation C̄ to a word Ĉ is obtained as follows:

$$\hat{c}_i = \mathrm{alpha}_j, \quad \text{if} \quad \beta_{j-1} \le \bar{c}_i < \beta_j \qquad (2)$$

We have now defined SAX, our symbolic representation (the PAA representation is merely an intermediate step required to obtain the symbolic representation).

3.3 Distance Measures
Having introduced the new representation of time series, we can now define a distance measure on it. By far the most common distance measure for time series is the Euclidean distance [23, 29]. Given two time series Q and C of the same length n, Eq. 3 defines their Euclidean distance, and Figure 6.A illustrates a visual intuition of the measure.

$$D(Q, C) \equiv \sqrt{\sum_{i=1}^{n} (q_i - c_i)^2} \qquad (3)$$

If we transform the original subsequences into PAA representations, Q̄ and C̄, using Eq. 1, we can then obtain a lower bounding approximation of the Euclidean distance between the original subsequences by:

$$DR(\bar{Q}, \bar{C}) \equiv \sqrt{\frac{n}{w}} \sqrt{\sum_{i=1}^{w} (\bar{q}_i - \bar{c}_i)^2} \qquad (4)$$

This measure is illustrated in Figure 6.B. A proof that DR(Q̄, C̄) lower bounds the true Euclidean distance appears in [22] (an alternative proof appears in [35]).

If we further transform the data into the symbolic representation, we can define a MINDIST function that returns the minimum distance between the original time series of two words:

$$MINDIST(\hat{Q}, \hat{C}) \equiv \sqrt{\frac{n}{w}} \sqrt{\sum_{i=1}^{w} \left( dist(\hat{q}_i, \hat{c}_i) \right)^2} \qquad (5)$$

The function resembles Eq. 4 except for the fact that the distance between the two PAA coefficients has been replaced with the sub-function dist(). The dist() function can be implemented using a table lookup as illustrated in Table 4.
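The discretization step of Section 3.2 (Definitions 1 and 2) can be sketched as follows. This is an illustrative sketch, not the authors' code: the function names `breakpoints` and `sax_word` are hypothetical, the breakpoints are computed from the inverse Gaussian CDF rather than read from Table 3, and the input is assumed to be a list of PAA coefficients from an already normalized series.

```python
from bisect import bisect_right
from statistics import NormalDist
import string


def breakpoints(a):
    """Return beta_1 .. beta_{a-1}, which cut N(0,1) into a equiprobable regions."""
    return [NormalDist().inv_cdf(i / a) for i in range(1, a)]


def sax_word(paa_coeffs, a):
    """Map each PAA coefficient to a symbol using the rule of Eq. 2.

    bisect_right counts the breakpoints that are <= the coefficient, so a value
    below beta_1 maps to 'a', a value in [beta_1, beta_2) maps to 'b', and so on.
    """
    betas = breakpoints(a)
    return "".join(string.ascii_lowercase[bisect_right(betas, v)] for v in paa_coeffs)


# Five hypothetical PAA coefficients, alphabet size 3 (breakpoints near -0.43 and 0.43).
print(sax_word([-1.0, -0.2, 0.0, 0.5, 1.2], 3))
```

Computing the breakpoints on demand gives the same values as the lookup table up to rounding; a fixed table may be preferable when a is known in advance.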

        a       b       c       d
a       0       0       0.67    1.34
b       0       0       0       0.67
c       0.67    0       0       0
d       1.34    0.67    0       0

Table 4: A lookup table used by the MINDIST function. This table is for an alphabet of cardinality of 4, i.e. a=4. The distance between two symbols can be read off by examining the corresponding row and column. For example, dist(a,b) = 0 and dist(a,c) = 0.67.

The value in cell (r,c) for any lookup table can be calculated by the following expression.

$$cell_{r,c} = \begin{cases} 0, & \text{if } |r - c| \le 1 \\ \beta_{\max(r,c)-1} - \beta_{\min(r,c)}, & \text{otherwise} \end{cases} \qquad (6)$$

For a given value of the alphabet size a, the table needs only be calculated once, then stored for fast lookup. The MINDIST function can be visualized in Figure 6.C.

Figure 6: A visual intuition of the three representations discussed in this work, and the distance measures defined on them. A) The Euclidean distance between two time series can be visualized as the square root of the sum of the squared differences of each pair of corresponding points. B) The distance measure defined for the PAA approximation can be seen as the square root of the sum of the squared differences between each pair of corresponding PAA coefficients, multiplied by the square root of the compression rate. C) The distance between two SAX representations of a time series requires looking up the distances between each pair of symbols, squaring them, summing them, taking the square root and finally multiplying by the square root of the compression rate
[Figure 6 panel C shows the words baabccbc and babcacca]

There is one issue we must address if we are to use a symbolic representation of time series. If we wish to approximate a massive dataset in main memory, the parameters w and a have to be chosen in such a way that the approximation makes the best use of the primary memory available. There is a clear tradeoff between the parameter w controlling the number of approximating elements, and the value a controlling the granularity of each approximating element.

It is infeasible to determine the best tradeoff analytically, since it is highly data dependent. We can, however, empirically determine the best values with a simple experiment.
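The MINDIST function of Eq. 5 and the lookup table of Eq. 6 can be sketched together. This is a hedged illustration, not the authors' implementation: `dist_table` and `mindist` are hypothetical names, symbols are assumed to be the lowercase letters a, b, c, … in order, and the breakpoints are computed from the inverse Gaussian CDF instead of being read from a stored table.

```python
import math
from statistics import NormalDist


def dist_table(a):
    """Build the a-by-a lookup table of Eq. 6, using 0-based symbol indices."""
    betas = [NormalDist().inv_cdf(i / a) for i in range(1, a)]  # beta_1 .. beta_{a-1}
    table = [[0.0] * a for _ in range(a)]
    for r in range(a):
        for c in range(a):
            if abs(r - c) > 1:
                # With 1-based symbols this is beta_{max(r,c)-1} - beta_{min(r,c)};
                # the index shift below accounts for the 0-based r, c and list.
                table[r][c] = betas[max(r, c) - 1] - betas[min(r, c)]
    return table


def mindist(word_q, word_c, n, a):
    """Eq. 5: lower-bounding distance between two SAX words of equal length w."""
    if len(word_q) != len(word_c):
        raise ValueError("words must have the same length")
    table = dist_table(a)
    w = len(word_q)
    total = sum(
        table[ord(q) - ord("a")][ord(c) - ord("a")] ** 2
        for q, c in zip(word_q, word_c)
    )
    return math.sqrt(n / w) * math.sqrt(total)


# Two illustrative eight-symbol words over a 3-letter alphabet, original length 128.
print(mindist("baabccbc", "babcacca", n=128, a=3))
```

Adjacent symbols contribute zero, so only the positions where the two words differ by more than one symbol add to the distance. In practice the table would be built once per alphabet size and reused, as the text notes.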
Since we wish to achieve the tightest possible lower bounds, we can simply estimate the lower bounds over all possible feasible parameters, and choose the best settings.

$$\text{Tightness of Lower Bound} = \frac{MINDIST(\hat{Q}, \hat{C})}{D(Q, C)} \qquad (7)$$

We performed such a test with a concatenation of 50 time series databases taken from the UCR time series data mining archive. For every combination of parameters we averaged the result of 100,000 experiments on subsequences of length 256. Figure 7 shows the results.

[Figure 7 axes: tightness of lower bound; word size w; alphabet size a]

Figure 7: The empirically estimated tightness of lower bounds over the cross product of a = [3 … ] and w = [2 … 8]. The darker histogram bars illustrate combinations of parameters that require approximately equal space to store every possible word (approximately 4 megabytes)

The results suggest that using a low value for a results in weak bounds, but that there are diminishing returns for large values of a. The results also suggest that the parameters are not too critical; an alphabet size in the range of 5 to 8 seems to be a good choice.

3.4 Numerosity Reduction
We have seen that, given a single time series, our approach can significantly reduce its dimensionality. In addition, our approach can reduce the numerosity of the data for some applications.

Most applications assume that we have one very long time series T, and that manageable subsequences of length n are extracted by use of a sliding window, then stored in a matrix for further manipulation [7, 4, 22, 35]. Figure 8 illustrates the idea.

Figure 8: An illustration of the notation introduced in this section: A time series T of length 128, the subsequence C_67 of length n = 16, and the first 8 subsequences extracted by a sliding window. Note that the sliding windows are overlapping

When performing sliding windows subsequence extraction, with any of the real-valued representations, we must store all |T| - n + 1 extracted subsequences (in dimensionality reduced form). However, imagine for a moment that we are using our proposed approach. If the first word extracted is aabbcc, and the window is shifted to discover that the second word is also aabbcc, we can reasonably decide not to include the second occurrence of the word in the sliding windows matrix. If we ever need to retrieve all occurrences of aabbcc, we can go to the location pointed to by the first occurrence, and remember to slide to the right, testing to see if the next window is also mapped to the same word. We can stop testing as soon as the word changes. This simple idea is very similar to the run-length-encoding data compression algorithm.

The utility of this optimization depends on the parameters used and the data itself, but it typically yields a numerosity reduction factor of two or three. However, many datasets are characterized by long periods of little or no movement, followed by bursts of activity (seismological data is an obvious example). On these datasets the numerosity reduction factor can be huge. Consider the example shown in Figure 9.

[Figure 9 panel title: Space Shuttle STS-57 Telemetry]

Figure 9: Sliding window extraction on Space Shuttle Telemetry data, with n = 32. At time point 61, the extracted word is aabbcc, and the next 401 subsequences also map to this word. Only a pointer to the first occurrence must be recorded, thus producing a large reduction in numerosity

There is only one special case we must consider. As we noted in Section 3.1, we normalize each time series (including subsequences) to have a mean of zero and a standard deviation of one. However, if the subsequence contains only one value, the standard deviation is not defined. More troublesome is the case where the subsequence is almost constant, perhaps 31 zeros and a single 0.0001. If we normalize this subsequence, the single differing element will have its value exploded to 5.48. This situation occurs quite frequently. For example, the last 200 time units of the data in Figure 9 appear to be constant, but actually contain tiny amounts of noise.
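Two points from this section can be checked with a short sketch: the run-length style numerosity reduction, and the blow-up caused by normalizing an almost constant subsequence. The helper name `numerosity_reduce` and the sample word list are hypothetical, and the normalization assumes the sample standard deviation (n - 1 in the denominator).

```python
from statistics import mean, stdev


def numerosity_reduce(words):
    """Keep (position, word) only where the word differs from the previous window's word.

    Later occurrences in the same run can be recovered by sliding right from the
    recorded position until the word changes.
    """
    kept = []
    previous = None
    for position, word in enumerate(words):
        if word != previous:
            kept.append((position, word))
            previous = word
    return kept


# Hypothetical words from six consecutive sliding windows.
windows = ["aabbcc", "aabbcc", "aabbcc", "aabccb", "aabccb", "aabbcc"]
print(numerosity_reduce(windows))

# The near-constant special case: 31 zeros and a single 0.0001.
flat = [0.0] * 31 + [0.0001]
z_last = (flat[-1] - mean(flat)) / stdev(flat)
print(round(z_last, 2))
```

Note that the reduction only collapses consecutive repeats; a word that reappears after a different word is recorded again, which keeps every occurrence recoverable.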
If we were to normalize subsequences extracted from this area, the normalization would magnify the noise to large meaningless patterns. We can easily deal with this problem: if the standard deviation of the sequence before normalization is below an epsilon ε, we simply assign the entire word to the middle-ranged alphabet (e.g. cccccc if a = 5).

We end this section with a visual comparison between SAX and the four most used representations in the literature (Figure 10).

[Figure 10 panels: DFT; PLA; Haar; APCA; SAX symbols a to f]

Figure 10: A visual comparison of SAX and the four most common time series data mining representations. A raw time series of length 128 is transformed into the word ffffffeeeddcbaabceedcbaaaaacddee. This is a fair comparison since the number of bits in each representation is the same

4. EXPERIMENTAL VALIDATION OF SAX
We performed various data mining tasks using SAX. For clustering and classification, we compared the results with the classic Euclidean distance, and with other previously proposed symbolic approaches. Note that none of these other approaches use dimensionality reduction. In the next paragraph we summarize the strawman representations that we compare SAX to. We chose these two approaches since they are typical representatives of symbolic approaches in the literature.

André-Jönsson and Badal [2] proposed the SDA algorithm, which computes the changes between values from one instance to the next, and divides the range into user-predefined regions. The disadvantages of this approach are obvious: prior knowledge of the data distribution of the time series is required in order to set the breakpoints; and the discretized time series does not conserve the general shape or distribution of the data values.

Huang and Yu proposed the IMPACTS algorithm, which uses the change ratio between one time point and the next to discretize the time series [2]. The range of change ratios is then divided into equal-sized sections and mapped into symbols. The time series is converted to a discretized collection of change ratios. As with SAX, the user needs to define the cardinality of symbols.

4.1 Clustering
Clustering is one of the most common data mining tasks, being useful in its own right as an exploratory tool, and also as a subroutine in more complex algorithms [2,5, 2].

4.1.1 Hierarchical Clustering
Comparing hierarchical clusterings is a very good way to compare and contrast similarity measures, since a dendrogram of size N summarizes O(N^2) distance calculations [23]. The evaluation is typically subjective; we simply adjudge which distance measure appears to create the most natural groupings of the data. However, if we know the data labels in advance we can also make objective statements of the quality of the clustering. In Figure 11 we clustered nine time series from the Control Chart dataset, three each from the decreasing trend, upward shift and normal classes.

[Figure 11 panels: Euclidean; SAX; IMPACTS (alphabet=8); SDA]

Figure 11: A comparison of the four distance measures' ability to cluster members of the Control Chart dataset. Complete linkage was used as the agglomeration technique

In this case we can objectively state that SAX is superior, since it correctly assigns each class to its own subtree. This is simply a side effect due to the smoothing effect of dimensionality reduction. More generally, we observed that SAX closely mimics Euclidean distance on various datasets.

4.1.2 Partitional Clustering
Although hierarchical clustering is a good sanity check for any proposed distance measure, it has limited utility for data mining because of its poor scalability. The most commonly used data mining clustering algorithm is k-means [5], so for completeness we will consider it here. We performed k-means on both the original raw data, and our symbolic representation. Figure 12 shows a typical run of k-means on a space telemetry dataset. Both algorithms converge after 11 iterations on average.

The results here are quite unintuitive and surprising; working with an approximation of the data gives better results than working with the original data. Fortunately, a recent paper offers a suggestion as to why this might be so. It has been shown that initializing the cluster centers on a low dimension approximation of the data can improve the quality [2]; this is what clustering with SAX implicitly does.

[Figure 12 axes: objective function vs. number of iterations; series: raw data, our symbolic approach (SAX)]

Figure 12: A comparison of the k-means clustering algorithm using SAX and the raw data. The dataset was Space Shuttle telemetry, 1,000 subsequences of length 512. Surprisingly, working with the symbolic approximation produces better results than working with the original data

4.2 Classification
Classification of time series has attracted much interest from the data mining community. The high dimensionality, high feature correlation, and typically high levels of noise found in time series provide an interesting research problem [23].
Although special-purpose algorithms have been proposed [25], we will consider only the two most common classification algorithms for brevity, clarity of presentation and to facilitate independent confirmation of our findings.

4.2.1 Nearest Neighbor Classification
To compare different distance measures on 1-nearest-neighbor classification, we used leaving-one-out cross validation. We compare SAX with Euclidean distance, IMPACTS, SDA, and LP∞. Two classic synthetic datasets are used: the Cylinder-Bell-Funnel (CBF) dataset has 50 instances of time series for each of the three clusters, and the Control Chart (CC) has 100 instances for each of the six clusters [23].

Since SAX allows dimensionality and alphabet size as user input, and IMPACTS allows variable alphabet size, we ran the experiments on different combinations of dimensionality reduction and alphabet size. For the other approaches we applied the simple dimensionality reduction technique of skipping data points at a fixed interval. In Figure 13, we show the results with a dimensionality reduction of 4 to 1. Similar results were observed for other levels of dimensionality reduction. Once again, SAX's ability to beat Euclidean distance is probably due to the smoothing effect of dimensionality reduction; nevertheless, this experiment does show the superiority of SAX over the other approaches proposed in the literature.

4.2.2 Decision Tree Classification
Due to its poor scalability, Nearest Neighbor is unsuitable for most data mining applications; instead, decision trees are the most common choice of classifier. While decision trees are defined for real data, attempting to classify time series using the raw data would clearly be a mistake, since the high dimensionality and noise levels would result in a deep, bushy tree with poor accuracy.

[Figure 13 panels: Cylinder-Bell-Funnel; Control Chart. Axes: error rate vs. alphabet size. Legend: Impacts; SDA; Euclidean; LP max; SAX]

Figure 13: A comparison of five distance measures' utility for nearest neighbor classification. We tested different alphabet sizes for SAX and IMPACTS. SDA's alphabet size is fixed at 5.

In an attempt to overcome this problem, Geurts [6] suggests representing the time series as a Regression Tree (RT) (this representation is essentially the same as APCA [22], see Figure 2), and training the decision tree directly on this representation. The technique shows great promise.

We compared SAX to the Regression Tree (RT) on two datasets; the results are in Table 5.

Dataset    SAX            Regression Tree
CC         3.4 ± […]      […] ± 2.
CBF        .97 ± .4       .4 ± .2

Table 5: A comparison of SAX with the specialized Regression Tree approach for decision tree classification. Our approach used an alphabet size of 6; both approaches used a dimensionality of 8

Note that while our results are competitive with the RT approach, the RT representation is undoubtedly superior in terms of interpretability [6]. Once again, our point is simply that our black box approach can be competitive with specialized solutions.

4.3 Query by Content (Indexing)
The majority of work on time series data mining appearing in the literature has addressed the problem of indexing time series for fast retrieval [3]. Indeed, it is in this context that most of the representations enumerated in Figure 1 were introduced [7, 4, 22, 35]. Dozens of papers have introduced techniques to do indexing with a symbolic approach [2, 2], but without exception, the answer set retrieved by these techniques can be very different to the answer set that would be retrieved by the true Euclidean distance. It is only by using a lower bounding technique that one can guarantee retrieving the full answer set, with no false dismissals [4].

To perform query by content, we built an index using SAX, and compared it to an index built using the Haar wavelet approach [7]. Since the datasets we use are large and disk-resident, and the reduced dimensionality could still be potentially high (or at least high enough such that the performance degenerates to sequential scan if an R-tree were used [9]), we use the Vector Approximation (VA) file as our indexing algorithm. We note, however, that SAX could also be indexed by classic string indexing techniques such as suffix trees.

To compare performance, we measure the percentage of disk I/Os required in order to retrieve the one-nearest neighbor to a randomly extracted query, relative to the number of disk I/Os required for sequential scan. Since it has been forcibly shown that the choice of dataset can make a significant difference in the relative indexing ability of a representation, we tested on more than 50 datasets from the UCR Time Series Data Mining Archive. In Figure 14 we show 4 representative examples.

[Figure 14 legend: DWT Haar; SAX. Datasets: Ballbeam; Chaotic; Memory; Winding]

Figure 14: A comparison of the indexing ability of wavelets versus SAX. The Y-axis is the percentage of the data that must be retrieved from disk to answer a 1-NN query of length 256, when the dimensionality reduction ratio is 32 to 1 for both approaches

Once again we find our representation competitive with existing approaches.

4.4 Taking Advantage of the Discrete Nature of our Representation
In the previous sections we showed examples of how our proposed representation can compete with real-valued representations and the original data. In this section we illustrate examples of data mining algorithms that take explicit advantage of the discrete nature of our representation.

4.4.1 Detecting Novel/Surprising/Anomalous Behavior
A simple idea for detecting anomalous behavior in time series is to examine previously observed normal data and build a model of it. Data obtained in the future can be compared to this model and any lack of conformity can signal an anomaly [9]. In order to achieve this, in [24] we combined a statistically sound scheme with an efficient combinatorial approach. The statistical scheme is based on Markov chains and normalization. Markov chains are used to model the normal behavior, which is inferred from the previously observed data.
The time- and space-efficiency of the algorithm comes from the use of a suffix tree as the main data structure. Each node of the suffix tree represents a pattern. The tree is annotated with a score obtained by comparing the support of a pattern observed in the new data with the support recorded in the Markov model. This apparently simple strategy turns out to be very effective in discovering surprising patterns. In the original work we use a simple symbolic approach, similar to IMPACTS [20]; here we revisit the work using SAX.

For completeness, we will compare SAX to two highly referenced anomaly detection algorithms that are defined on real valued representations, the TSA-tree Wavelet based approach of Shahabi et al. [31] and the Immunology (IMM) inspired work of Dasgupta and Forrest [9]. We also include the Markov technique using IMPACTS and SDA in order to discover how much of the difference can be attributed directly to the representation. Figure 15 contains an experiment comparing all 5 techniques.
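
The general idea in this subsection, learning symbol statistics from normal data and then scoring patterns by how far their observed support departs from the expected support, can be illustrated with a small sketch. It is not the suffix-tree algorithm of [24]: it assumes a first-order Markov chain with add-one smoothing, a fixed pattern length, brute-force window enumeration, and a raw difference in place of the normalized score. The function names `train_markov`, `expected_count` and `surprise_scores` are hypothetical.

```python
from collections import Counter


def train_markov(train, alphabet):
    """Estimate a first-order Markov model from a symbol string (add-one smoothing)."""
    pair_counts = Counter(zip(train, train[1:]))
    symbol_counts = Counter(train)
    k = len(alphabet)
    # prior[a] approximates P(a); smoothing keeps unseen symbols above zero.
    prior = {a: (symbol_counts[a] + 1) / (len(train) + k) for a in alphabet}
    # trans[(a, b)] approximates P(b | a); each row sums to 1.
    trans = {}
    for a in alphabet:
        row_total = sum(pair_counts[(a, c)] for c in alphabet) + k
        for b in alphabet:
            trans[(a, b)] = (pair_counts[(a, b)] + 1) / row_total
    return prior, trans


def expected_count(pattern, n_windows, prior, trans):
    """Expected support of `pattern` among n_windows windows under the model."""
    p = prior[pattern[0]]
    for a, b in zip(pattern, pattern[1:]):
        p *= trans[(a, b)]
    return p * n_windows


def surprise_scores(test, length, prior, trans):
    """Map each pattern seen in `test` to observed support minus expected support."""
    windows = [test[i:i + length] for i in range(len(test) - length + 1)]
    observed = Counter(windows)
    return {
        w: observed[w] - expected_count(w, len(windows), prior, trans)
        for w in observed
    }


if __name__ == "__main__":
    prior, trans = train_markov("abc" * 50, "abc")
    scores = surprise_scores("abc" * 5 + "aaa" + "abc" * 5, 3, prior, trans)
    # Print patterns from most to least surprising.
    for pattern, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(pattern, round(score, 3))
```

In this toy run the model is trained on a repeating `abc` string, so the inserted `aaa` run in the test string receives the largest score: the model has almost never seen an `a` followed by an `a`, and its expected support is close to zero.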

[Figure 15 plot: seven stacked panels, I) to VII), described in the caption.]

Figure 15: A comparison of five anomaly detection algorithms on the same task. I) The training data, a slightly noisy sine wave of length 1,000. II) The time series to be examined for anomalies is a noisy sine wave that was created with the same parameters as the training sequence, then an assortment of anomalies were introduced at time periods 250, 500 and 750. III) and IIII) The Markov Model technique using the IMPACTS and SDA representation did not clearly discover the anomalies, and reported some false alarms. V) The IMM anomaly detection algorithm appears to have discovered the first anomaly, but it also reported many false alarms. VI) The TSA-Tree approach is unable to detect the anomalies. VII) The Markov model-based technique using SAX clearly finds the anomalies, with no false alarms.

The results on this simple experiment are impressive. Since suffix trees and Markov models can be used only on discrete data, this offers a motivation for our symbolic approach.

4.4.2 Motif discovery

It is well understood in bioinformatics that overrepresented DNA sequences often have biological significance [3, 13, 29]. A substantial body of literature has been devoted to techniques to discover such patterns [17, 32, 33]. In a previous work, we defined the related concept of time series motif [26]. Time series motifs are close analogues of their discrete cousins, although the definitions must be augmented to prevent certain degenerate solutions. The naïve algorithm to discover the motifs is quadratic in the length of the time series. In [26], we demonstrated a simple technique to mitigate the quadratic complexity by a large constant factor; nevertheless, this time complexity is clearly untenable for most real datasets.

The symbolic nature of SAX offers a unique opportunity to avail of the wealth of bioinformatics research in this area. In particular, recent work by Tompa and Buhler holds great promise [33]. The authors show that many previously unsolvable motif discovery problems can be solved by hashing subsequences into buckets using a random subset of their features as a key, then doing some post-processing search on the hash buckets. They call their algorithm PROJECTION. (Of course, this description greatly understates the contributions of this work. We urge the reader to consult the original paper.)

We carefully reimplemented the random projection algorithm of Tompa and Buhler, making minor changes in the post-processing step to allow for the fact that although we are hashing random projections of our symbolic representation, we actually wish to discover motifs defined on the original raw data. Figure 16 shows an example of a motif discovered in an industrial dataset [5] using this technique.

[Figure 16 plot: Winding Dataset (angular speed of reel), with the two motif occurrences labeled A and B, shown in the full series above and aligned below.]

Figure 16: Above, a motif discovered in a complex dataset by the modified PROJECTION algorithm. Below, the motif is best visualized by aligning the two subsequences and "zooming in". The similarity of the two subsequences is striking, and hints at unexpected regularity.

Apart from the attractive scalability of the algorithm, there is another important advantage over other approaches. The PROJECTION algorithm is able to discover motifs even in the presence of noise. Our extension of the algorithm inherits this robustness to noise.

5. CONCLUSIONS AND FUTURE DIRECTIONS

In this work we introduced the first dimensionality reduction, lower bounding, streaming symbolic approach in the literature. We have shown that our representation is competitive with, or superior to, other representations on a wide variety of classic data mining problems, and that its discrete nature allows us to tackle emerging tasks such as anomaly detection and motif discovery.

A host of future directions suggest themselves. In addition to use with streaming algorithms, there is an enormous wealth of useful definitions, algorithms and data structures in the bioinformatics literature that can be exploited by our representation [3, 13, 17, 28, 29, 32, 33]. It may be possible to create a lower bounding approximation of Dynamic Time Warping [6], by slightly modifying the classic string edit distance. Finally, there may be utility in extending our work to multidimensional time series [34].

6. REFERENCES

[1] Agrawal, R., Psaila, G., Wimmers, E. L. & Zait, M. (1995). Querying Shapes of Histories. In proceedings of the 21st Int'l Conference on Very Large Databases. Zurich, Switzerland, Sept 11-15.

[2] André-Jönsson, H. & Badal, D. (1997). Using Signature Files for Querying Time-Series Data. In proceedings of Principles of Data Mining and Knowledge Discovery, 1st European Symposium. Trondheim, Norway, Jun.

[3] Apostolico, A., Bock, M. E. & Lonardi, S. (2002). Monotony of Surprise and Large-Scale Quest for Unusual Words. In proceedings of the 6th Int'l Conference on Research in Computational Molecular Biology. Washington, DC, April 18-21.

[4] Babcock, B., Babu, S., Datar, M., Motwani, R. & Widom, J. (2002). Models and Issues in Data Stream Systems. Invited Paper in proceedings of the 2002 ACM Symp. on Principles of Database Systems. June 3-5, Madison, WI.

[5] Bastogne, T., Noura, H., Richard, A. & Hittinger, J. M. (2002). Application of Subspace Methods to the Identification of a Winding Process. In proceedings of the 4th European Control Conference, Vol. 5, Brussels.

[6] Berndt, D. & Clifford, J. (1994). Using Dynamic Time Warping to Find Patterns in Time Series. In proceedings of the Workshop on Knowledge Discovery in Databases, at the 12th Int'l Conference on Artificial Intelligence. July 31-Aug 4, Seattle, WA.

[7] Chan, K. & Fu, A. W. (1999). Efficient Time Series Matching by Wavelets. In proceedings of the 15th IEEE Int'l Conference on Data Engineering. Sydney, Australia, Mar.

[8] Cortes, C., Fisher, K., Pregibon, D., Rogers, A. & Smith, F. (2000). Hancock: a Language for Extracting Signatures from Data Streams. In proceedings of the 6th ACM SIGKDD Int'l Conference on Knowledge Discovery and Data Mining. Aug 20-23, Boston, MA. pp 9-17.

[9] Dasgupta, D. & Forrest, S. (1996). Novelty Detection in Time Series Data using Ideas from Immunology. In proceedings of The International Conference on Intelligent Systems. June 19-21.

[10] Datar, M. & Muthukrishnan, S. (2002). Estimating Rarity and Similarity over Data Stream Windows. In proceedings of the 10th European Symposium on Algorithms. Sep 17-21, Rome, Italy.

[11] Daw, C. S., Finney, C. E. A. & Tracy, E. R. (2001). Symbolic Analysis of Experimental Data. Review of Scientific Instruments.

[12] Ding, C., He, X., Zha, H. & Simon, H. (2002). Adaptive Dimension Reduction for Clustering High Dimensional Data. In proceedings of the 2nd IEEE International Conference on Data Mining. Dec 9-12. Maebashi, Japan.

[13] Durbin, R., Eddy, S., Krogh, A. & Mitchison, G. (1998). Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press.

[14] Faloutsos, C., Ranganathan, M. & Manolopoulos, Y. (1994). Fast Subsequence Matching in Time-Series Databases. In proceedings of the ACM SIGMOD Int'l Conference on Management of Data. May 24-27, Minneapolis, MN.

[15] Fayyad, U., Reina, C. & Bradley, P. (1998). Initialization of Iterative Refinement Clustering Algorithms. In proceedings of the 4th International Conference on Knowledge Discovery and Data Mining. New York, NY, Aug.

[16] Geurts, P. (2001). Pattern Extraction for Time Series Classification. In proceedings of the 5th European Conference on Principles of Data Mining and Knowledge Discovery. Sep 3-7, Freiburg, Germany.

[17] Gionis, A. & Mannila, H. (2003). Finding Recurrent Sources in Sequences. In proceedings of the 7th International Conference on Research in Computational Molecular Biology. Apr 10-13, Berlin, Germany. To Appear.

[18] Guha, S., Mishra, N., Motwani, R. & O'Callaghan, L. (2000). Clustering Data Streams. In proceedings of the 41st Symposium on Foundations of Computer Science. Nov 12-14, Redondo Beach, CA.

[19] Hellerstein, J. M., Papadimitriou, C. H. & Koutsoupias, E. (1997). Towards an Analysis of Indexing Schemes. In proceedings of the 16th ACM Symposium on Principles of Database Systems. May 12-14, Tucson, AZ.

[20] Huang, Y. & Yu, P. S. (1999). Adaptive Query Processing for Time-Series Data. In proceedings of the 5th Int'l Conference on Knowledge Discovery and Data Mining. San Diego, CA, Aug 15-18.

[21] Kalpakis, K., Gada, D. & Puttagunta, V. (2001). Distance Measures for Effective Clustering of ARIMA Time-Series. In proceedings of the 2001 IEEE International Conference on Data Mining, San Jose, CA, Nov 29-Dec 2.

[22] Keogh, E., Chakrabarti, K., Pazzani, M. & Mehrotra, S. (2001). Locally Adaptive Dimensionality Reduction for Indexing Large Time Series Databases. In proceedings of ACM SIGMOD Conference on Management of Data. Santa Barbara, CA, May.

[23] Keogh, E. & Kasetty, S. (2002). On the Need for Time Series Data Mining Benchmarks: A Survey and Empirical Demonstration. In proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. July 23-26, 2002. Edmonton, Alberta, Canada. pp 102-111.

[24] Keogh, E., Lonardi, S. & Chiu, W. (2002). Finding Surprising Patterns in a Time Series Database in Linear Time and Space. In the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. July 23-26, 2002. Edmonton, Alberta, Canada.

[25] Keogh, E. & Pazzani, M. (1998). An Enhanced Representation of Time Series Which Allows Fast and Accurate Classification, Clustering and Relevance Feedback. In proceedings of the 4th Int'l Conference on Knowledge Discovery and Data Mining. New York, NY, Aug.

[26] Lin, J., Keogh, E., Lonardi, S. & Patel, P. (2002). Finding Motifs in Time Series. In proceedings of the 2nd Workshop on Temporal Data Mining, at the 8th ACM SIGKDD Int'l Conference on Knowledge Discovery and Data Mining. Edmonton, Alberta, Canada, July.

[27] Larsen, R. J. & Marx, M. L. (1986). An Introduction to Mathematical Statistics and Its Applications. Prentice Hall, Englewood Cliffs, N.J. 2nd Edition.

[28] Lonardi, S. (2001). Global Detectors of Unusual Words: Design, Implementation, and Applications to Pattern Discovery in Biosequences. PhD thesis, Department of Computer Sciences, Purdue University, August, 2001.

[29] Reinert, G., Schbath, S. & Waterman, M. S. (2000). Probabilistic and Statistical Properties of Words: An Overview. Journal of Computational Biology. Vol. 7, pp 1-46.

[30] Roddick, J. F., Hornsby, K. & Spiliopoulou, M. (2001). An Updated Bibliography of Temporal, Spatial and Spatio-Temporal Data Mining Research. In Post-Workshop Proceedings of the International Workshop on Temporal, Spatial and Spatio-Temporal Data Mining. Berlin, Springer. Lecture Notes in Artificial Intelligence. Roddick, J. F. and Hornsby, K., Eds.

[31] Shahabi, C., Tian, X. & Zhao, W. (2000). TSA-tree: A Wavelet-Based Approach to Improve the Efficiency of Multi-Level Surprise and Trend Queries. In proceedings of the 12th Int'l Conference on Scientific and Statistical Database Management.

[32] Staden, R. (1989). Methods for Discovering Novel Motifs in Nucleic Acid Sequences. Computer Applications in Biosciences. Vol. 5(5).

[33] Tompa, M. & Buhler, J. (2001). Finding Motifs Using Random Projections. In proceedings of the 5th Int'l Conference on Computational Molecular Biology. Montreal, Canada, Apr.

[34] Vlachos, M., Kollios, G. & Gunopulos, G. (2002). Discovering Similar Multidimensional Trajectories. In proceedings of the 18th International Conference on Data Engineering. Feb 26-Mar 1, San Jose, CA.

[35] Yi, B. K. & Faloutsos, C. (2000). Fast Time Sequence Indexing for Arbitrary Lp Norms. In proceedings of the 26th Int'l Conference on Very Large Databases. Sep 10-14, Cairo, Egypt.


More information

Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information

Microprocessors and Microsystems

Microprocessors and Microsystems Mroproessors and Mrosystems 36 (2012) 96 109 Contents lsts avalable at SeneDret Mroproessors and Mrosystems journal homepage: www.elsever.om/loate/mpro Hardware aelerator arhteture for smultaneous short-read

More information

News. Recap: While Loop Example. Reading. Recap: Do Loop Example. Recap: For Loop Example

News. Recap: While Loop Example. Reading. Recap: Do Loop Example. Recap: For Loop Example Unversty of Brtsh Columba CPSC, Intro to Computaton Jan-Apr Tamara Munzner News Assgnment correctons to ASCIIArtste.java posted defntely read WebCT bboards Arrays Lecture, Tue Feb based on sldes by Kurt

More information

Time Synchronization in WSN: A survey Vikram Singh, Satyendra Sharma, Dr. T. P. Sharma NIT Hamirpur, India

Time Synchronization in WSN: A survey Vikram Singh, Satyendra Sharma, Dr. T. P. Sharma NIT Hamirpur, India Internatonal Journal of Enhaned Researh n Sene Tehnology & Engneerng, ISSN: 2319-7463 Vol. 2 Issue 5, May-2013, pp: (61-67), Avalable onlne at: www.erpublatons.om Tme Synhronzaton n WSN: A survey Vkram

More information

Some Advanced SPC Tools 1. Cumulative Sum Control (Cusum) Chart For the data shown in Table 9-1, the x chart can be generated.

Some Advanced SPC Tools 1. Cumulative Sum Control (Cusum) Chart For the data shown in Table 9-1, the x chart can be generated. Some Advanced SP Tools 1. umulatve Sum ontrol (usum) hart For the data shown n Table 9-1, the x chart can be generated. However, the shft taken place at sample #21 s not apparent. 92 For ths set samples,

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

Measurement and Calibration of High Accuracy Spherical Joints

Measurement and Calibration of High Accuracy Spherical Joints 1. Introduton easurement and Calbraton of Hgh Auray Spheral Jonts Ale Robertson, Adam Rzepnewsk, Alexander Sloum assahusetts Insttute of Tehnolog Cambrdge, A Hgh auray robot manpulators are requred for

More information

Active Contours/Snakes

Active Contours/Snakes Actve Contours/Snakes Erkut Erdem Acknowledgement: The sldes are adapted from the sldes prepared by K. Grauman of Unversty of Texas at Austn Fttng: Edges vs. boundares Edges useful sgnal to ndcate occludng

More information

Computing Cloud Cover Fraction in Satellite Images using Deep Extreme Learning Machine

Computing Cloud Cover Fraction in Satellite Images using Deep Extreme Learning Machine Computng Cloud Cover Fraton n Satellte Images usng Deep Extreme Learnng Mahne L-guo WENG, We-bn KONG, Mn XIA College of Informaton and Control, Nanjng Unversty of Informaton Sene & Tehnology, Nanjng Jangsu

More information

Multiscale Heterogeneous Modeling with Surfacelets

Multiscale Heterogeneous Modeling with Surfacelets 759 Multsale Heterogeneous Modelng wth Surfaelets Yan Wang 1 and Davd W. Rosen 2 1 Georga Insttute of Tehnology, yan.wang@me.gateh.edu 2 Georga Insttute of Tehnology, davd.rosen@me.gateh.edu ABSTRACT Computatonal

More information

Graph-based Clustering

Graph-based Clustering Graphbased Clusterng Transform the data nto a graph representaton ertces are the data ponts to be clustered Edges are eghted based on smlarty beteen data ponts Graph parttonng Þ Each connected component

More information

Parallel Numerics. 1 Preconditioning & Iterative Solvers (From 2016)

Parallel Numerics. 1 Preconditioning & Iterative Solvers (From 2016) Technsche Unverstät München WSe 6/7 Insttut für Informatk Prof. Dr. Thomas Huckle Dpl.-Math. Benjamn Uekermann Parallel Numercs Exercse : Prevous Exam Questons Precondtonng & Iteratve Solvers (From 6)

More information

Optimizing Document Scoring for Query Retrieval

Optimizing Document Scoring for Query Retrieval Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng

More information

A Real-Time Detecting Algorithm for Tracking Community Structure of Dynamic Networks

A Real-Time Detecting Algorithm for Tracking Community Structure of Dynamic Networks A Real-Tme Detetng Algorthm for Trakng Communty Struture of Dynam Networks Jaxng Shang*, Lanhen Lu*, Feng Xe, Zhen Chen, Jaa Mao, Xueln Fang, Cheng Wu* Department of Automaton, Tsnghua Unversty, Beng,,

More information

Unsupervised Learning

Unsupervised Learning Pattern Recognton Lecture 8 Outlne Introducton Unsupervsed Learnng Parametrc VS Non-Parametrc Approach Mxture of Denstes Maxmum-Lkelhood Estmates Clusterng Prof. Danel Yeung School of Computer Scence and

More information

Semi-analytic Evaluation of Quality of Service Parameters in Multihop Networks

Semi-analytic Evaluation of Quality of Service Parameters in Multihop Networks U J.T. (4): -4 (pr. 8) Sem-analyt Evaluaton of Qualty of Serve arameters n Multhop etworks Dobr tanassov Batovsk Faulty of Sene and Tehnology, ssumpton Unversty, Bangkok, Thaland bstrat

More information

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between

More information

FULLY AUTOMATIC IMAGE-BASED REGISTRATION OF UNORGANIZED TLS DATA

FULLY AUTOMATIC IMAGE-BASED REGISTRATION OF UNORGANIZED TLS DATA FULLY AUTOMATIC IMAGE-BASED REGISTRATION OF UNORGANIZED TLS DATA Martn Wenmann, Bors Jutz Insttute of Photogrammetry and Remote Sensng, Karlsruhe Insttute of Tehnology (KIT) Kaserstr. 12, 76128 Karlsruhe,

More information

CS246: Mining Massive Datasets Jure Leskovec, Stanford University

CS246: Mining Massive Datasets Jure Leskovec, Stanford University CS46: Mnng Massve Datasets Jure Leskovec, Stanford Unversty http://cs46.stanford.edu /19/013 Jure Leskovec, Stanford CS46: Mnng Massve Datasets, http://cs46.stanford.edu Perceptron: y = sgn( x Ho to fnd

More information

USING GRAPHING SKILLS

USING GRAPHING SKILLS Name: BOLOGY: Date: _ Class: USNG GRAPHNG SKLLS NTRODUCTON: Recorded data can be plotted on a graph. A graph s a pctoral representaton of nformaton recorded n a data table. t s used to show a relatonshp

More information

Optimal shape and location of piezoelectric materials for topology optimization of flextensional actuators

Optimal shape and location of piezoelectric materials for topology optimization of flextensional actuators Optmal shape and loaton of pezoeletr materals for topology optmzaton of flextensonal atuators ng L 1 Xueme Xn 2 Noboru Kkuh 1 Kazuhro Satou 1 1 Department of Mehanal Engneerng, Unversty of Mhgan, Ann Arbor,

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

Collaboratively Regularized Nearest Points for Set Based Recognition

Collaboratively Regularized Nearest Points for Set Based Recognition Academc Center for Computng and Meda Studes, Kyoto Unversty Collaboratvely Regularzed Nearest Ponts for Set Based Recognton Yang Wu, Mchhko Mnoh, Masayuk Mukunok Kyoto Unversty 9/1/013 BMVC 013 @ Brstol,

More information

Clustering Algorithm of Similarity Segmentation based on Point Sorting

Clustering Algorithm of Similarity Segmentation based on Point Sorting Internatonal onference on Logstcs Engneerng, Management and omputer Scence (LEMS 2015) lusterng Algorthm of Smlarty Segmentaton based on Pont Sortng Hanbng L, Yan Wang*, Lan Huang, Mngda L, Yng Sun, Hanyuan

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

All-Pairs Shortest Paths. Approximate All-Pairs shortest paths Approximate distance oracles Spanners and Emulators. Uri Zwick Tel Aviv University

All-Pairs Shortest Paths. Approximate All-Pairs shortest paths Approximate distance oracles Spanners and Emulators. Uri Zwick Tel Aviv University Approxmate All-Pars shortest paths Approxmate dstance oracles Spanners and Emulators Ur Zwck Tel Avv Unversty Summer School on Shortest Paths (PATH05 DIKU, Unversty of Copenhagen All-Pars Shortest Paths

More information

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1)

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1) Secton 1.2 Subsets and the Boolean operatons on sets If every element of the set A s an element of the set B, we say that A s a subset of B, or that A s contaned n B, or that B contans A, and we wrte A

More information

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 15

CS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 15 CS434a/541a: Pattern Recognton Prof. Olga Veksler Lecture 15 Today New Topc: Unsupervsed Learnng Supervsed vs. unsupervsed learnng Unsupervsed learnng Net Tme: parametrc unsupervsed learnng Today: nonparametrc

More information

Optimal Workload-based Weighted Wavelet Synopses

Optimal Workload-based Weighted Wavelet Synopses Optmal Workload-based Weghted Wavelet Synopses Yoss Matas School of Computer Scence Tel Avv Unversty Tel Avv 69978, Israel matas@tau.ac.l Danel Urel School of Computer Scence Tel Avv Unversty Tel Avv 69978,

More information

Determining the Optimal Bandwidth Based on Multi-criterion Fusion

Determining the Optimal Bandwidth Based on Multi-criterion Fusion Proceedngs of 01 4th Internatonal Conference on Machne Learnng and Computng IPCSIT vol. 5 (01) (01) IACSIT Press, Sngapore Determnng the Optmal Bandwdth Based on Mult-crteron Fuson Ha-L Lang 1+, Xan-Mn

More information

OSSM Ordered Sequence Set Mining for Maximal Length Frequent Sequences A Hybrid Bottom-Up-Down Approach

OSSM Ordered Sequence Set Mining for Maximal Length Frequent Sequences A Hybrid Bottom-Up-Down Approach Global Journal of Computer Sene and Tehnology Volume 12 Issue 7 Verson 1.0 Aprl 2012 Type: Double Blnd Peer Revewed Internatonal Researh Journal Publsher: Global Journals In. (USA) Onlne ISSN: 0975-4172

More information

A Deflected Grid-based Algorithm for Clustering Analysis

A Deflected Grid-based Algorithm for Clustering Analysis A Deflected Grd-based Algorthm for Clusterng Analyss NANCY P. LIN, CHUNG-I CHANG, HAO-EN CHUEH, HUNG-JEN CHEN, WEI-HUA HAO Department of Computer Scence and Informaton Engneerng Tamkang Unversty 5 Yng-chuan

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Support Vector Machines. CS534 - Machine Learning

Support Vector Machines. CS534 - Machine Learning Support Vector Machnes CS534 - Machne Learnng Perceptron Revsted: Lnear Separators Bnar classfcaton can be veed as the task of separatng classes n feature space: b > 0 b 0 b < 0 f() sgn( b) Lnear Separators

More information

Clustering incomplete data using kernel-based fuzzy c-means algorithm

Clustering incomplete data using kernel-based fuzzy c-means algorithm Clusterng noplete data usng ernel-based fuzzy -eans algorth Dao-Qang Zhang *, Song-Can Chen Departent of Coputer Sene and Engneerng, Nanjng Unversty of Aeronauts and Astronauts, Nanjng, 210016, People

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information