A Comparison of Top-k Temporal Keyword Querying over Versioned Text Collections

Size: px
Start display at page:

Download "A Comparison of Top-k Temporal Keyword Querying over Versioned Text Collections"

Transcription

1 A Comparson of Top-k Temporal Keyword Queryng over Versoned Text Collectons Wenyu Huo and Vassls J. Tsotras Department of Computer Scence and Engneerng Unversty of Calforna, Rversde Rversde, CA, USA Abstract. As the web evolves over tme, the amount of versoned text collectons ncreases rapdly. Most web search engnes wll answer a query by rankng all known documents at the (current) tme the query s posed. There are applcatons however (for example customer behavor analyss, crme nvestgaton, etc.) that would need to effcently query these sources as of some past tme, that s, retreve the results as f the user was posng the query n a past tme nstant, thus accessng data known as of that tme. Rankng and searchng over versoned documents consders not only keyword constrants but also the tme dmenson, most commonly, a tme pont or tme range of nterest. In ths paper, we deal wth top-k query evaluatons wth both keyword and temporal constrants over versoned textual documents. In addton to consderng prevous solutons, we propose novel data organzaton and ndexng solutons: the frst one parttons data along rankng postons, whle the other mantans the full rankng order through the use of a multverson ordered lst. We present an expermental comparson for both tme pont and tme nterval constrants. For tme-nterval constrants, dfferent queryng defntons, such as aggregaton functons and consstent top-k queres are evaluated. Expermental evaluatons on large real world datasets demonstrate the advantages of the newly proposed data organzaton and ndexng approaches. 1 Introducton Versoned text collectons are textual documents that retan multple versons as tme evolves. Numerous such collectons are avalable today and a well-known example s the collaboratve authorng envronment, such as Wkpeda [1], where textual content s explctly verson-controlled. Smlarly, web archvng applcatons such as the Internet Archve [2] and the European Archve [3] store regular crawls (over tme) of web pages on a large scale. Other tme-stamped textual nformaton such as, weblogs, mcro-blogs, even feeds and tags, as also create versoned text collectons. If a text collecton does not retan past documents, then a search query ranks only the documents as of the most current tme. If the collecton contans versoned documents, a search typcally consders each verson of a document as a separate document and the rankng s taken over all documents ndependently to the document s verson (creaton tme). There are applcatons however, where ths approach s not S.W. Lddle et al. (Eds.): DEXA 2012, Part II, LNCS 7447, pp , Sprnger-Verlag Berln Hedelberg 2012

2 A Comparson of Top-k Temporal Keyword Queryng over Versoned Text Collectons 361 adequate. Consder the followng example: n order for a company to analyze consumer comments on a specfc product before some event occurred (new product, advertsement campagn etc.), a temporal constrant may be very useful. For example, to vew opnons on phone4, a tme-wndow wthn 06/07/2010 (announce date) and 10/04/2011 (announce date of phone4s) could be a far choce. Many nvestgaton scenaros also requre combnng the keyword search wth a tme-wndow of nterest. For example, whle consderng a fnancal crme, an nvestgator may need to dentfy what nformaton was avalable to the accused as of a specfc tme nstant n the past. Provdng as-of queres s a challengng problem. Frst s the data volume. Document collectons lke Wkpeda and Internet Archve, are already huge even f only ther most recent snapshot s consdered. When searchng n ther evolutonary hstory, we are faced wth even larger data volumes. Moreover, how to quckly return the top-k temporally ranked canddates s another new challenge. Note that returnng all qualfed results wthout temporal constrants would not be effcent snce two extra steps are requred: () flterng out results later than the query specfed tme constrant, and, () rankng the remanng results so as to provde the top-k answers. In ths paper we present an expermental evaluaton of the top-k query over versoned text collectons, comparng prevously proposed as well novel approaches. In partcular the key contrbutons can be summarzed as: 1. Prevous methods related to versoned text keyword search are sutably extended for top-k temporal queres. 2. Novel approaches are proposed n order to accelerate top-k temporal queres. The frst approach parttons the temporal data based on ther rankng postons, whle the other mantans the full rank order usng a multverson ordered lst. 3. In addton to top-k tme-pont keyword based search, we also consder two tmenterval (or tme-range) varants, namely aggregaton rankng and consstent top-k queryng. 4. Expermental evaluatons wth large-scale real-world datasets are performed on both the prevous and newly proposed methods. The rest of the paper s organzed as follows. Prelmnares and related work are ntroduced n secton 2. Our novel approaches appear n secton 3. Dfferent query defntons of tme-nterval top-k queres are presented n secton 4. All technques are comprehensvely evaluated and compared n a seres of experments n secton 5 whle the conclusons appear n secton 6. 2 Prelmnares and Related Work 2.1 Defntons The data model for versoned document collectons was formally ntroduced n [10], and used by later works [9, 5, 6]. Let D be a set of n documents d 1,d 2,,d n where 1 2 m each document d s a sequence of m versons: d = { d, d,..., d }. Each verson has a sem-closed valdty tme-nterval (or lfespan) lfe( d j ) = [ t, t ). Moreover, t s s e

3 362 W. Huo and V.J. Tsotras docid d5 d4 d3 d2 d tme t0 t1 t2 t3 t4 t5 t6 t7 t8 score-order d4, 1, t0, t1 d2, 0.95, t1, t4 d2, 0.9, t4, t8 d2, 0.7, t0, t1 d4, 0.7, t1, t3 d4, 0.4, t3, t6 d4, 0.25, t6, t8 ts-order d2, 0.7, t0, t1 d4, 1, t0, t1 d2, 0.95, t1, t4 d4, 0.7, t1, t3 d4, 0.4, t3, t6 d2, 0.9, t4, t8 d4, 0.25, t6, t8 Fg. 1. Example of versoned documents wth scores for one term assumed that dfferent versons of the same document have dsjont lfe spans. An example of fve documents and ther versons appears n Fg. 1; each document corresponds to a colored lne, whle segments represent dfferent versons of a document. The nverted fle ndex s the standard technque of text ndexng for keyword queres, deployed n many search engnes. Assumng a vocabulary V, for each term v n V, the ndex contans an nverted lst L v consstng of postngs of the form (d, s) where d s a document-dentfer and s s the so-called payload score. There are numerous exstng relevance scorng functons, such as tf-df [7], language models [13] and Okap BM25 [14]. The actual scorng functon s not mportant for our purposes; for smplcty we assume that the payload score contans the term frequency of v n d. In order to support temporal queres, the nverted fle ndex must also contan temporal nformaton. Thus [10] proposed addng the temporal lfespan explctly n the ndex postngs. Each postng ncludes the valdty tme-nterval of the correspondng document verson: (d, s, t s, t e ) where the document d had payload score s durng the tme nterval [t s, t e ). If the document evoluton contans few changes over tme, the assocated score of most terms s unchanged between adjacent versons. In order to reduce the number of postngs n an ndex lst, [10] coalesces temporally adjacent postngs belongng to the same document that have dentcal (or approxmate dentcal) scores. A general keyword search query Q conssts of a set of x terms q = (v 1, v 2,,v x ) and a temporal nterval [lb, rb]. Wthout loss of generalty, we use the aggregated score of a document verson for keyword query q s the sum of the scores from each term v. The tme-nterval [lb, rb] restrcts the canddate document versons as a subset of the [ lb, rb ] j j orgnal collecton: D = { d D [ lb, rb] lfe( d ) }. When lb = rb holds, the query tme nterval collapses nto a sngle tme pont t. For smplcty we frst concentrate on tme-pont query and more complex tme-nterval queres are dscussed n secton 4 wth related varatons. The answer R to a Top-K Tme-Pont keyword query TKTP = (q, t, k) over collecton D s a set of k document versons satsfyng: { d j R ( v q: v d j ) ( d j D t ) ( ( t j d D R): s( d ) s( d ))} where D t = { d j D t lfe( d j )}. The frst condton presents the keyword constrant, the second condton the temporal constrant, whle

4 A Comparson of Top-k Temporal Keyword Queryng over Versoned Text Collectons 363 the thrd mples that the top-k scored document versons are returned. Now we present how to answer query TKTP usng prevous methods based on temporal nverted ndexes. 2.2 Prevous Methods The straghtforward way (referred to as basc) to solve query TKTP uses exactly one nverted lst for each vocabulary term v wth the postng (d, s, t s, t e ). To answer the top-k queres, correspondng nverted lsts are traversed and postngs are fetched. When a postng s scanned, t s also verfed for the tme pont specfed n TKTP. The sort-order of the ndex lsts s also mportant. One natural choce s to sort each lst n score order. Ths method (score-order) enables the classcal top-k algorthms [11] to stop early after havng dentfed the k hghest scores wth qualfed lfespan. Another sutable sortng choce s to order the lsts frst by the start tme t s and then by score (t s -order) whch s benefcal for checkng the temporal constrant. However, ths approach s not effcent for top-k queryng, especally when the query ncludes multple terms. Fg. 1 shows the score-order and ts-order lsts for a specfc term. Note that the effcency of processng a top-k temporal query s nfluenced adversely by the wasted I/O due to read but skpped postngs. We proceed wth varous materalzaton deas of the slce the whole lst of a term nto several sub-lsts or parttons thus mprovng processng costs. Interval Based Slcng splts each term lst along the tme-axs nto several sub-lsts, each of whch corresponds to a contguous sub-nterval of the tme spanned by the full lst. Each of these sub-lsts contans all coalesced postngs that overlap wth the correspondng tme nterval. Note that ndex entres whose valdty tme-nterval spans across the slcng boundares are replcated n each of the spanned sub-lsts. The selecton of the correspondng tme-ntervals where the slces are created s vtal as dscussed n [9, 5]. One obvous strategy s to eagerly slce sub-lsts for all possble tme nstants (and adjacent dentcal lsts can be merged). Ths wll create one sub-lst per tme nstant; ths wll provde deal query performance for a TKTP query snce only the postngs n the sub-lst for the query tme pont wll be accesses. We refer to ths method as elementary. Note that the basc and elementary methods are two extremes: the former requres mnmal space but requres more processng at query tme snce many entres rrelevant to the temporal constrant are accessed; the latter provdes the best possble performance (for tme-pont query) but s not space-effcent (due to copyng of entres among sub lsts). To explore the trade-off between space and performance, [5] employs a smple but practcal approach (referred to as Fx) n whch a partton boundary s placed after a fxed tme wndow. The wndow sze can be a week, a month, a year, or other flexble choces. Fg. 2 shows the Fx-2 and Fx-4 sub-lsts of our runnng example from Fg. 1, wth the partton tme wndow sze as 2 and 4 tme nstants respectvely. Nevertheless, all varatons of the nterval based slcng suffer from an ndex-sze blowup snce entres whose vald-tme nterval spans across the slcng boundares are replcated.

5 364 W. Huo and V.J. Tsotras Fx-2: Fx-4: [t0, t2): d4, 1, t0, t1 d2, 0.95, t1, t4 d2, 0.7, t0, t1 d4, 0.7, t1, t3 [t2, t4): d2, 0.95, t1, t4 d4, 0.7, t1, t3 d4, 0.4, t3, t6 [t4, t6): d2, 0.9, t4, t8 d4, 0.4, t3, t6 [t6, t8): d2, 0.9, t4, t8 d4, 0.25, t6, t8 [t0, t4): d4, 1, t0, t1 d2, 0.95, t1, t4 d2, 0.7, t0, t1 d4, 0.7, t1, t3 d4, 0.4, t3, t6 [t4, t8): d2, 0.9, t4, t8 d4, 0.4, t3, t6 d4, 0.25, t6, t8 Fg. 2. Tme Interval Based Slcng sub-lst examples Stencl Based Parttonng. Another ndex parttonng method along the tme-axs was proposed n [12]. It s dstngushed from the nterval based slcng by usng a mult-level herarchcal (vertcal) parttonng of the lfespan. The nverted lst of term v, at level L 0 contans the entre lfespan of ths lst, whle level L +1 s obtaned from L by parttonng each nterval n L nto b sub-ntervals. Such a parttonng s called a stencl; each ndex postng s placed nto the deepest nterval n the mult-level parttonng that fts ts range. A stencl-based partton of three levels wth b = 2 for the runnng example (from Fg. 1) s shown n Fg. 3. Comparng to the tme nterval based slcng, the stencl based parttonng has sgnfcant advantage n space because each postng falls nto a sngle lst, the deepest sub-nterval that t fts. Nevertheless, for a tme-pont query stencl based parttonng has to fetch multple sub-lsts, one from each level. d1, 0.6 d2, 0.95 d4, 0.7 d3, 0.5 d4, 0.4 d5, 0.75 d2, 0.9 L 0 L 1 d2, 0.7 d4, 1 d4, 0.25 t0 t1 t2 t3 t4 t5 t6 t7 t8 Fg. 3. Stencl-based parttonng wth 3 levels and b = 2 The sort-order of each sub-lst s agan mportant. Snce the temporal parttonng already shreds one full lst nto several sub-lsts along the tme-axs, a more approprate choce for top-k queres s score-orderng. Temporal Shardng. The approach proposed n [6] s to shard (or horzontally partton) each term lst along the document dentfers nstead of tme. Entres n a term lst are thus dstrbuted over dsjont sub-lsts called shards, and entres n a shard are ordered accordng to ther start tmes t s. So as to elmnate wasteful reads, wthn a shard g, entres satsfy a starcase property: pq, g, tsp ( ) tsq ( ) tep ( ) teq ( ). An optmal greedy algorthm for creatng ths parttonng s gven n [6]; an example of temporal shardng for the term lst from Fg. 1, s shown n Fg. 4. L 2 tme

6 A Comparson of Top-k Temporal Keyword Queryng over Versoned Text Collectons 365 docid d5 d4 d3 d2 d1 e5 e2 e1 e3 e7 e4 e1 e2 e6 e3 t0 t1 t2 t3 t4 t5 t6 t7 t8 tme Shard_1: e1: d2, 0.7, t0, t1 e2: d4, 1, t0, t1 e3: e4: e5: e6: d2, 0.9, t4, t8 e7: d4, 0.25, t6, t8 Shard_2: e1: d4, 0.7, t1, t3 e2: d2, 0.95, t1, t4 e3: d4, 0.4, t3, t6 Fg. 4. Temporal shardng example As wth the stencl based approach, the space usage for temporal shardng s optmal snce there are no replcatons of ndex entres. However, for query processng, all shards for each term need to be accessed, resultng n multple sub-lst readngs. Moreover, the entres n each shard can only be tme-ordered (based on start tme t s ). Thus the beneft of score-orderng for ranked queres cannot be acheved, because all temporal vald entres have to be fetched. 3 Novel Approaches A common characterstc of exstng works s that they only consder the versoned documents on the tme- and docid-axes, and try to partton the data along ether drecton. Instead, we vew the ndex entres from a new angle -- namely, ther score over tme, and create ndex organzatons to mprove the performance of top-k queryng. The score-tme vew of the example from Fg. 1 s shown n Fg. 5. docid d d1 d2 d3 d4 d5 score d4 d3 d2 d tme tme t0 t1 t2 t3 t4 t5 t6 t7 t8 t0 t1 t2 t3 t4 t5 t6 t7 t8 Fg. 5. The Score-Tme vew of the versoned documents Recall that to answer a TKTP query we should be able to quckly fnd the top-k scores of a term at a gven tme nstant. The man dea behnd the score-tme vew s to mantan an ndex that wll provde the top scores per term at each tme nstant. For example, at tme t 0, the term depcted n Fg.5 had scores 1 (from d 4 ), 0.7 (from d 2 ), 0.6 (from d 1 ) and 0.5 (from d 3 ). These orderngs change as tme proceeds; for example at tme t 2, the top score s 0.95 from d 2, etc. In rank-based parttonng (secton 3.1), we frst dscuss a smplstc approach (SPR) where an ndex s created for each rank poston of a term. For example, there s an ndex that mantans the top score over tme,

7 366 W. Huo and V.J. Tsotras then one for the second top score, etc. More practcal s the group rankng approach (GR) where an ndex s created to mantan the group of the top-g scores (g s a constant), then the next top-g etc. We also consder temporal ndexng methods (secton 3.2). One soluton s to use the Multverson B-tree and mantan the whole ranked lst n order over tme. We realze however that these ranked lsts are always accessed n order, so a better soluton s provded (multverson lst) that lnks approprately the data pages of the temporal ndex, wthout overhead of the ndex nodes. 3.1 Rank Based Parttonng The Sngle Poston Rankng (SPR) approach creates a separate temporal ndex for each rankng poston of a term. Thus, for the -th rankng poston ( = 1,2, ), a sublst s mantaned that contans all the entres that ever exsted on poston over tme. Together wth each entry we mantan the tme nterval durng whch ths entry occuped that poston. All sub-lst entres are ordered based on ther recorded startng tme; a B+tree bult on the start tmes can easly locate the approprate entry at a gven tme. The SPR of our runnng example (from Fg. 1) s shown n Fg. 6(a). Space can be saved by usng only the start tme of each entry but for smplcty we show the end tmes as well (the end tme s needed only f there s no entry n a partcular poston, but ths s true only at the last poston). Usng the SPR approach, to process a TKTP query about tme t, the frst k sub-lsts have to be accessed for each relevant term; from each sublst the B+tree wll provde the approprate score (and document d) of ths term at tme t. If each sub-lst has m tems on average, the estmated tme complexty s O(k log B m) (here B corresponds to the page sze n records). Many sub-lst accesses can degrade queryng performance; moreover, n ths smple SPR method the same postng can be duplcated n multple rankng poston sub-lsts. Ths unavodable replcaton may result n storage overhead. 1 SPR d4, 1, t0, t1 d2, 0.95, t1, t4 d2, 0.9, t4, t8 (a) 2 d2, 0.7, t0, t1 d4, 0.7, t1, t3 3 4 d4, 0.25, t6, t8 5 d4, 0.4, t3, t6 GR (g = 2) gr1 d4, 1, t0, t1 d2, 0.7, t0, t1 d2, 0.95, t1, t4 d4, 0.7, t1, t3 d2, 0.9, t4, t8 gr2 d4, 0.25, t6, t8 (b) gr3 d4, 0.4, t3, t6 Fg. 6. Rankng poston based parttonng Group Rankng (GR). In order to save space and mprove queryng performance, GR mantans an ndex not for a sngle rankng poston, but for a group of postons. Let the group sze be g. For example, the frst g ranked elements are n group gr 1, the next g ranked elements are n group gr 2, etc. Thus, compared to the n sub-lsts mantaned n SPR for n rankng postons, GR uses nstead n/g sub-lsts. Wth respect to the I/O of top-k queryng, we only need k/g random accesses (each of them stll logarthmc).

8 A Comparson of Top-k Temporal Keyword Queryng over Versoned Text Collectons 367 As wth the SPR each member wthn a group also records the tme nterval that the member was n the group. For example, assume that group gr mantans rankng postons (-1)g+1 through g. If at tme t s the score of a partcular term falls wthn these postons, ths score s added to the group, wth an nterval startng at t s. As long as ths score falls wthn the rankng postons of ths group, t s consdered part of the group; f at tme t e t falls out of the group, the end tme of ts nterval s updated to t e. To save on update tme, wthn each group we do not mantan the rank order. That s, each group s treated as an unordered set of scores that evolves over tme. To answer a TKTP query that nvolves a partcular group gr at tme t, we need to dentfy what members group gr had at tme t. Snce the sze of the group s fxed, we can easly sort these member scores and provde them to the TKTP result n rank order. However, t s guaranteed that gven tme t, the members n gr have no lower scores than those n group gr j where 1 < j ( n/ g). The GR approach for the above example (from Fg. 1) wth g = 2 s shown n Fg. 6(b). An nterestng queston s what ndex to employ for mantanng each group over tme. Dfferent than SPR, each group at a gven tme may contan multple entres; thus a B+ndex on the temporal start tmes s not enough. Instead, temporal ndex structures that mantan and reconstruct effcently an evolvng set over tme, lke the snapshot ndex [15] can be used to accelerate temporal queryng. Note that when mplementng GR n practce, each group may have a dfferent sze g. It s preferable to use smaller g for the top groups and larger g for the lower groups (snce the focus s on top-k, the few top groups wll be accessed more frequently and thus we prefer to gve faster access). For smplcty however, we use the same g for all groups. 3.2 Usng a Multverson Lst Consder the ordered lst of scores that a term has over all documents at tme t; as tme evolves, ths lst changes (new scores are added, scores are promoted, demoted or even removed, etc). Temporal ndexng methods have addressed a more general problem: how to mantan an evolvng set of keys over tme. Ths set s allowed to change by addng, deletng or updatng keys; the man temporal query supported s the so called: temporal-range query: gven t, provde the keys that were n the set at tme t, and are wthn key range r. The Multverson B-tree (MVBT) proposed n [8], s an asymptotcally optmal (n terms of I/O accesses under lnear space) soluton to the temporal range query. Assumng that there were a total of n changes that occurred n the set evoluton, then the MVBT uses lnear space (O(n/B)). Consder a range temporal query that specfes range r and tme t, and let a t denote the number of keys that were wthn range r at tme t (.e., the number of keys that satsfy the query); the MVBT answers the above query usng O(log B n + a t /B) page I/Os, whch s optmal n lnear space [8]. In order for the MVBT to mantan order among the keys, t uses a B+tree to ndex the set. As the set evolves, so does the B+-tree. Conceptually the MVBT contans all B+-trees over tme; for a gven query tme t the MVBT provdes access to the root of the approprate B+-tree, etc. Of course, the MVBT does not copy all B+-trees (as ths would result n quadratc space). Instead t uses clever page update polces.

9 368 W. Huo and V.J. Tsotras In partcular, when a key k s added to the evolvng set at tme t a record s nserted n the (leaf) data page whose range contans k; ths record stores key k and a tme nterval of the form: [t, *). The * denotes that key k has not been updated yet. If later at tme t ths key s removed from the set, ts record s not physcally deleted. Instead ths change s represented by changng the * to t n ths record s nterval. A record s called alve for all tme nstants n ts nterval. Gven a query about tme t, the MVBT tree dentfes all data pages that contan alve records for that tme t. In contrast to a regular B+-tree that deals wth pages that get underutlzed due to record deletons, the MVBT pages cannot get underutlzed because no record s ever deleted. Lke the B+-tree pages can get full of records and need to be splt (page overflow). However, the MVBT needs to also guarantee that the number of alve records n a page do not fall below a lower threshold l (weak verson underflow) and also do not go over an upper threshold u (strong verson overflow)- note that l and u are O(B). If a page overflows, a tme-splt occurs, that copes the alve records of the overflown page (at the tme of the overflow) to a new page. If there are too few alve records, the page s merged wth a sblng page that s also frst tme-splt. If there are too many alve records, a key splt s frst appled (among the alve records) [8]. Usng the MVBT for our purposes means that the scores play the role of keys. That s, the MVBT wll mantan the order of scores over tme. Snce however term records are accessed by the docid they belong to, a hashng ndex s also needed that, for a gven docid, t provdes the leaf page that holds the record wth ths term s current score. Ths hashng scheme need only mantan the most current scores (.e., t does not need to mantan past postons). Nevertheless, the above MVBT approach has a sgnfcant overhead. In partcular, t s bult to answer queres about any range of scores. Ths s acheved by startng from an approprate root of the MVBT and follow ndex nodes untl the leaf data pages n the query range are accessed. For top-k processng however, we only access scores n decreasng order, startng wth the largest score at a partcular tme nstant. Thus, what we actually need, s a way to access the leaf page that has the hghest scores at a partcular tme, and then follow to ts sblng leaf page (wth the next lower scores) at that tme, etc. We stll mantan the splt polces among the leaf pages, but we do not use the MVBT s ndex nodes. Effectvely we mantan a multverson lst (MLst),.e., of the leaf data pages over tme. To access the leaf data page that has the hghest scores at a gven tme, we mantan an array A wth records of the form (t, p) where t s a tme nstant and p s a ponter to the leaf page wth the hghest scores at tme t. If later at tme t another page p becomes the leaf page wth the hghest scores, array A s updated wth a record (t, p ). If ths array becomes too large for man memory, t can easly be ndexed by a B+-tree on the (ordered) tme attrbute. For the above lst of leaf pages dea to work, each leaf page needs to remember the next sblng leaf page (wth lower scores) at each tme. (Note: the MVBT does not requre the sblng ponters, snce access to sblngs s done through the parent ndex nodes). One could stll use the array approach (one array responsble to keep access to the second leaf page, one for the thrd etc.) but ths would requre many array lookups at query tme (each such lookup takng O(log B n) page I/Os. Instead, we propose

10 A Comparson of Top-k Temporal Keyword Queryng over Versoned Text Collectons 369 to embed these arrays wthn the page structure. That s, wthn each leaf page, we allocate a space of c records (where c s a constant) for the sblng page ponter records (also of the form (t,p)). As a result, each leaf page has now space for B-c score records. Snce however, the sblng page can change over tme, t s possble that for a leaf page p the sblng wll change more than c tmes. If ths happens at tme t, page p s tme splt, that s, a new leaf page p s created contanng only the currently alve records of page p and wth an empty array for sblng ponters. Moreover, p replaces p n the lst. If before t, the lst of leaf pages contaned pages (n that order) m p v, a new record (t, p ) s added n the array of page m, and the array of page p s ntalzed wth a record (t,v). If p was the frst page, the record (t,p ) s added to array A. The advantage of the Mlst approach s apparent at query processng tme. A search s frst performed wthn array A for tme t (n O(log B n) page I/Os). Ths wll provde access to the page wth the hghest scores at tme t. Fnd the next sblng page at tme t however wll be provded by lookng among the c records of ths page, etc. That s, the top-k scores at tme t wll be accessed n O(log B n + k/b) page I/Os. The justfcaton s that after the access to array A, each leaf page (except possbly the last one) wll provde O(B) of the top-k scores (snce we are usng the MVBT splttng polces wthn the B-c space of each leaf page and c s a constant, each page s guaranteed to provde at least l=o(b) scores that were vald at the query tme t. 4 Top-k Tme Interval Queres Untl now we focused on the top-k tme pont (TKTP) queryng, and analyzed dfferent ndex structures for solvng t. We proceed wth the tme nterval top-k query. The man dfference s that n the TKTP, each document has at most one vald verson at the gven tme pont t; whle for an nterval queryng, each document may have multple versons vald durng the gven tme nterval [lb, rb]. As a result, there are dfferent varatons, dependng on how the top-k s defned (whch of the vald scores per document partcpate n the top-k computaton). Here, we summarze the dfferent defntons of top-k tme-nterval queres and dscuss how to process them effcently wthn the proposed ndex structures. 4.1 Classc Top-k Tme-Interval Query Ths query defnton s a straght forward extenson from the top-k tme pont query. For a Top-K Tme Interval keyword query TKTI = (q, lb, rb, k) over collecton D, we requre the answer R be a set of k document versons satsfyng:{ j j d R ( v q : v d ) where [, ] [, ] ( d j D lb rb ) ( d ( D lb rb R): s( d j ) s( d ))} D = { d D [ lb, rb] lfe( d ) } [, ] j j. Ths defnton only changes the tme constrants from a tme pont t to a tme range [lb, rb]. The returned top-k answers are dfferent versons, whch may be from the same document, that s, we consder each document verson as an ndependent object. Processng a TKTI query s smlar to processng a TKTP query. For some of the descrbed ndex methods, multple sub-lsts have to be accessed nstead of one.

11 370 W. Huo and V.J. Tsotras For example n tme nterval based slcng and stencl based parttonng, all the sublsts (or stencls) overlappng wth the query tme-nterval should be checked n order to fnd the correct top-k results. The multple parallel sub-lsts can be accessed n a round-robn fashon whch s compatble wth top-k algorthms. 4.2 Document Aggregated Top-k Tme-Interval Query Another possblty s to treat each document as one object, that s, a document appears at most once n the result. There are varous approaches n aggregatng relevance scores of the document versons that exsted at any pont n the temporal constrant [lb, rb] to obtan a document relevance score drs(d, lb, rb). Three aggregaton relevance models are mentoned n [10]: MIN. Ths model judges the relevance of a document based on the mnmum score. It s formally defned as: (,, ) mn{ ( j j drs d lb rb = s d ) [ lb, rb] lfe( d ) }. The MIN scores of our fve-document example for nterval [t 0, t 8 ) are d 2 =0.7, d 3 =0.5, d 4 =0.25, d 1 =0, d 5 =0. MAX. In contrast, ths model takes the maxmum score as an ndcator. It s formally j j defned as: drs( d, lb, rb) = max{ s( d ) [ lb, rb] lfe( d ) }. MAX scores of our fve-document example for nterval [t 0, t 8 ) are d 4 =1, d 2 =0.95, d 5 =0.75, d 1 =0.6, d 3 =0.5. TAVG. Fnally, the TAVG model assgns the score to each document usng a temporal average among all ts vald versons. Snce score sd ( j ) s pecewse-constant n tme, drs(d, lb, rb) can be effcently computed as a weghted summaton of these segments. TAVG scores of our fve-document example for nterval [t 0, t 8 ) are d 2 =0.89, d 4 =0.51, d 3 =0.5, d 5 =0.47, d 1 =0.45. After the aggregaton mechansm has been defned, one can consder the Aggregated Top-K Tme-Interval keyword query TKTI A = (q, lb, rb, k) over collecton D, that fnds the top k documents wth aggregated scores over all ther vald document versons. To process the aggregated top-k tme-nterval query, we need to extend the tradtonal top-k algorthms (such as TA and NRA) by recordng the bookkeepng nformaton and computng the scores and thresholds wth canddates at documentlevel. The relevance score of a document n the query temporal-context depends on the scores of ts verson that are vald durng ths perod. 4.3 Consstent Top-k Tme Range Query The consstent top-k search fnds a set of documents that are consstently n the top-k results of a query throughout a gven tme nterval. The result of ths query has sze 0 to k; queres can have empty results f k s small or the rankngs change drastcally. A relaxng consstent top-k query utlzes a relax factor r, 0 < r <= 1, and seeks for documents that are n the top-k for at least r h(rb lb) tme n the [lb, rb] nterval. For a Consstent Top-K Tme Interval keyword query TKTI C = (q, lb, rb, k) over collecton D, the documents n the answer R are n the top-k for at least r h(rb lb) tme n the [lb, rb] nterval. The consstent top-3 query of our fve-doc example for tme-nterval [t 0, t 8 ) has only one result as d 2 f r = 1, and has three results as d 1, d 2 and d 5 f r = 0.6.

12 A Comparson of Top-k Temporal Keyword Queryng over Versoned Text Collectons 371 In [16] several algorthms were ntroduced to answer the consstent top-k query; the most effcent ones are based on the assumpton that there s a lst contanng all versons satsfyng the keyword and tme nterval constrants and the lst s ordered by score. Ths assumpton concdes wth the purpose of our proposed ndex structures, thus we can access the qualfed entres and execute the consstent top-k tme nterval query usng the proposed approaches n [16]. 5 Expermental Evaluatons Dataset Descrpton and Methods Implemented: We used news-lke artcles as our prmary versoned document collecton. We collected US and world-wde Englsh newspaper webstes and treated each URL as a sngle document. Then ther hstorcal homepage versons were retreved by crawlng the Internet Archve [2] from untl We created two dfferent datasets wth daly unt tme granularty. The US based news had many frequent updates. The sze of raw data s about 0.2 TB, wth 12,649 documents and 1,542,893 versons; thus on average there are 122 versons per document n the US dataset. For the world-wde news webstes, the sze of raw data s about 50 GB, wth 5,046 documents and 275,981 versons, so on average there are 55 versons per document. Prevous related works create query workloads by extractng frequent queres from the AOL query logs. In addton to ths tradtonal query workload, we use popular keywords (such as twtter, phone, lady gaga etc.) from the Google Zetgest [4] annual reports from 2001 untl Overall, we formed 200 queres wth 265 terms for both classc and popular keywords. We organze the data nto term nverted lst(s) usng the prevous and novel approaches. In the basc method wth score-orderng (referred to as Basc-s) we create one nverted lst per term. The second method s elementary tme-nterval slcng wth a mergng of adjacent dentcal sub-lsts (Ele). For the Fx approach we used a tmewndow length of 30 days (Fx-30). The stencl based parttonng was mplemented wth 3 levels and b = 4 (Stencl). Temporal shardng s referred as Shard, whle the sngle poston rankng model appears as SPR. Two group rankng methods were mplemented wth group szes of 25 and 50 (GR-25 and GR-50). For comparson purposes we also ncluded the MVBT ndex (wth the approprate hashng secondary ndex).the multverson lst approach (MLst) uses a factor a = c / B to present the rato of the number of ponter records to the number of all records n a page. Comparson Results: Frst, the space usage for all mplemented methods on both the US-news and World-news datasets s presented n Table 1. The page sze s 4 Kbytes whle B = 100 records. The table presents the space consumed (n GB) to mplement the ndex methods for the 256 terms used n our experments. Clearly, the elementary tme-nterval slcng has a huge space overhead whle the Stencl and Shard methods present substantal space savngs. As expected, the Basc-s approach has the mnmal space requrements. Fx-30 uses more space snce a record may appear n more parttons whle n Stencl and Shard, each record appears once. The addtonal space that Stencl and Shard use wrt Basc-s s due to the addtonal structures they utlze. Among the rank-based parttonng methods, SPR uses more space than the GR approaches; ths s because the SPR approach has to mantan one ndex per ranked poston. GR-25 uses more space than GR-50 snce t uses more ndex structures

13 372 W. Huo and V.J. Tsotras (one per group). For the MLst method, we show the results of a = 7% and a = 10% (referred to MLst-7 and MLst-10). The MLst approaches also use lnear space (but due to the copyng of records at page splts, the space s more than the Stencl and Shard approaches). MLst uses slghtly more space than the MVBT because of the use of sblng ponters and the splts they create. Table 1. The space usage (n GB) for the 256 terms used n the queres Methods Basc-s Ele Fx-30 Stencl Shard SPR GR-25 GR-50 MVBT MLst-7 MLst-10 US World The top-k temporal queres nclude both tme-pont (n our dataset ths corresponds to one day) and tme-nterval queres. For each temporal keyword query, we randomly choose 50 tme constrants from the 15-year lfespan from 1997 to 2011, and record the average performance. For TKTI A, we use TAVG scorng; for TKTI C, we use r = 1. The page I/O costs for top-20 queres usng the US-news dataset are shown n Table 2 (the best performance for each query s shown n bold). For tme nterval queres, the tme-nterval lengths used were 15 days, 30 days, and 60 days. We also present the I/O costs for top-100 queres on both US-news and World-news datasets n Table 3 for both tme-pont query and 30-day tme-nterval queres. Table 2. The page I/O cost of top-20 temporal keyword queres for US news Methods Basc-s Ele Fx-30 Stencl Shard SPR GR-25 GR-50 MVBT MLst-7 MLst-10 TKTP TKTI TKTI TKTI TKTIA TKTIA TKTIA TKTIC TKTIC TKTIC Table 3. The page I/O cost of top-100 temporal keyword queres for US and World news US Basc-s Ele Fx-30 Stencl Shard SPR GR-25 GR-50 MVBT MLst-7 MLst-10 TKTP TKTI TKTIA TKTIC World Basc-s Ele Fx-30 Stencl Shard SPR GR-25 GR-50 MVBT MLst-7 MLst-10 TKTP TKTI TKTIA TKTIC The elementary tme-nterval slcng has the best snapshot queryng performance for both top-20 and top-100 queres. Ths s to be expected snce the answer s bascally prepared for each tme nstant (at the cost of huge storage requrements). Among the other methods, the newly proposed approaches (GR, MLst) outperform

14 A Comparson of Top-k Temporal Keyword Queryng over Versoned Text Collectons 373 the prevous methods (Stencl and Shard). The best performance s provded by the MLst-10 method. It has better performance than the MVBT gven t accesses the answer faster (by avodng the MVBT ndex traversal). Consderng ts low space requrements, ths approach provdes the overall best performance for TKTP queres. For tme-nterval queres, the Ele method s performance degrades drastcally, especally for longer tme-nterval. The group rankng method s performance s related to ts group sze g as t relates to k. For top-20 queryng, a group sze of 25 works better than a group sze of 50 (the answer can be found by accessng the frst group only); whle for top-100 queryng, GR-50 s a better choce (only two groups need to be accessed nstead of four for GR-25, thus less ndex accesses). For top-20 nterval queres, the MLst-10 had consstently the best performance for each query US/TKTP US/TKTI Page I/O Page I/O % 7% 10% 13% 16% a 0 4% 7% 10% 13% 16% a World/TKTP World/TKTI Page I/O Page I/O % 7% 10% 13% 16% a 0 4% 7% 10% 13% 16% a Fg. 7. The Multverson lst method for dfferent rato a usng the US and World news datasets Interestngly, for the top-100 nterval queres the MLst-7 shows better performance for the World-news dataset. The reason for that s that ths dataset has fewer updates. As a result, there wll be fewer ponter changes n the ordered lst, thus a smaller a wll provde enough space to hold the ponter structure. Ths can also be seen n the space requrements for ths dataset: the fewer ponter splts mean that MLst-7 uses less space than MLst-10 (and thus the lsts are shorter and the query performance better). The above observaton mples that the performance of the multverson lst method s related to the value of a. There are two opposng factors affectng the query performance wth respect to a. For a gven page sze, a small a mples that the area allocated to sblng ponters s small; thus few sblng page changes can cause the page to splt. More splts use more space and the query tme ncreases. On the other hand, a large a mples that the space allocated for the regular records n a page s small, thus the page can splt faster due to the record updates. Ths also ncreases space and query tme. The optmzed value of a depends on the dataset characterstcs. Fgure 7 depcts the page I/O for the top-100 results returned by pont

15 374 W. Huo and V.J. Tsotras (TKTP) and nterval (TKTI-30) queres for the US and World-news datasets. For the US-news dataset, a = 10% has the best average performance for both tme-pont queryng (TKTP) and 30-day tme-nterval queryng (TKTI-30) whle for the World-news dataset (whch has less update frequency), the performance s optmzed for a = 7%. 6 Concluson We presented an expermental comparson of ndexng methods over versoned text collectons for top-k temporal keyword queres. In addton to prevous methods, we proposed novel solutons that partton the data along the score-tme axes. Among all methods, the multverson lst provded the most robust performance consderng space usage and query tme effcency for both tme-pont and tme-nterval queres. We examned varatons of the tme-nterval queres, ncludng the document-level aggregated top-k queres and consstent top-k queres. The performance of the multverson lst s affected by the value of a, the percentage of a data page allocated to hold sblng ponters. As future work, we plan to devse a model that can optmze the value of a based on the frequency of updates, the sze of the page and other factors. Acknowledgements. Ths work s partally supported by NSF grant IIS References [1] Wkpeda, [2] Internet Archve, [3] European Archve, [4] Google Zetgest, [5] Anand, A., Bedathur, S., Berberch, K., Schenkel, R.: Effcent Temporal Keyword Queres over Versoned Text. In: CIKM (2010) [6] Anand, A., Bedathur, S., Berberch, K., Schenkel, R.: Temporal Index Shardng for Space-Tme Effcency n Archve Search. In: SIGIR (2011) [7] Baeza-Yates, R., Rbero-Neto, B.: Modern Informaton Retreval. Addson-Wesley (1999) [8] Becker, B., Gschwnd, S., Ohler, T., Seeger, B., Wdmayer, P.: An asymptotcally optmal multverson B-tree. VLDB Journal (1996) [9] Berberch, K., Bedathur, S., Neumann, T., Wekum, G.: A Tme Machne for Text Search. In: SIGIR (2007) [10] Berberch, K., Bedathur, S., Wekum, G.: Effcent Tme-Travel on Versoned Text Collectons. In: BTW (2007) [11] Fagn, R., Lotem, A., Naor, M.: Optmal Aggregaton Algorthms for Mddleware. J. Comput. Syst. Sc. 66(4), (2003) [12] He, J., Suel, T.: Faster Temporal Range Queres over Versoned Text. In: SIGIR (2011) [13] Ponte, J.M., Croft, W.B.: A Language Modelng Approach to Informaton Retreval. In: SIGIR (1998) [14] Robertson, S.E., Walker, S.: Okap/keenbow at TREC-8. In: TREC (1999) [15] Tsotras, V.J., Kangelars, N.: The Snapshot Index: an I/O Optmal Access Method for Snapshot Queres. Informaton System 20(3), (1995) [16] U, L.H., Mamouls, N., Berberch, K., Bedathur, S.: Durable Top-k Search n Document Archves. In: SIGMOD (2010)

Parallelism for Nested Loops with Non-uniform and Flow Dependences

Parallelism for Nested Loops with Non-uniform and Flow Dependences Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr

More information

Problem Set 3 Solutions

Problem Set 3 Solutions Introducton to Algorthms October 4, 2002 Massachusetts Insttute of Technology 6046J/18410J Professors Erk Demane and Shaf Goldwasser Handout 14 Problem Set 3 Solutons (Exercses were not to be turned n,

More information

A Binarization Algorithm specialized on Document Images and Photos

A Binarization Algorithm specialized on Document Images and Photos A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a

More information

Sequential search. Building Java Programs Chapter 13. Sequential search. Sequential search

Sequential search. Building Java Programs Chapter 13. Sequential search. Sequential search Sequental search Buldng Java Programs Chapter 13 Searchng and Sortng sequental search: Locates a target value n an array/lst by examnng each element from start to fnsh. How many elements wll t need to

More information

Performance Evaluation of Information Retrieval Systems

Performance Evaluation of Information Retrieval Systems Why System Evaluaton? Performance Evaluaton of Informaton Retreval Systems Many sldes n ths secton are adapted from Prof. Joydeep Ghosh (UT ECE) who n turn adapted them from Prof. Dk Lee (Unv. of Scence

More information

An Optimal Algorithm for Prufer Codes *

An Optimal Algorithm for Prufer Codes * J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,

More information

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique

The Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique //00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy

More information

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour

6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour 6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the

More information

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz

Compiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster

More information

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance

Tsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for

More information

Programming in Fortran 90 : 2017/2018

Programming in Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Exercse 1 : Evaluaton of functon dependng on nput Wrte a program who evaluate the functon f (x,y) for any two user specfed values

More information

CSCI 104 Sorting Algorithms. Mark Redekopp David Kempe

CSCI 104 Sorting Algorithms. Mark Redekopp David Kempe CSCI 104 Sortng Algorthms Mark Redekopp Davd Kempe Algorthm Effcency SORTING 2 Sortng If we have an unordered lst, sequental search becomes our only choce If we wll perform a lot of searches t may be benefcal

More information

Machine Learning: Algorithms and Applications

Machine Learning: Algorithms and Applications 14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of

More information

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task

Term Weighting Classification System Using the Chi-square Statistic for the Classification Subtask at NTCIR-6 Patent Retrieval Task Proceedngs of NTCIR-6 Workshop Meetng, May 15-18, 2007, Tokyo, Japan Term Weghtng Classfcaton System Usng the Ch-square Statstc for the Classfcaton Subtask at NTCIR-6 Patent Retreval Task Kotaro Hashmoto

More information

The Codesign Challenge

The Codesign Challenge ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.

More information

Optimizing Document Scoring for Query Retrieval

Optimizing Document Scoring for Query Retrieval Optmzng Document Scorng for Query Retreval Brent Ellwen baellwe@cs.stanford.edu Abstract The goal of ths project was to automate the process of tunng a document query engne. Specfcally, I used machne learnng

More information

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms

Course Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms Course Introducton Course Topcs Exams, abs, Proects A quc loo at a few algorthms 1 Advanced Data Structures and Algorthms Descrpton: We are gong to dscuss algorthm complexty analyss, algorthm desgn technques

More information

Can We Beat the Prefix Filtering? An Adaptive Framework for Similarity Join and Search

Can We Beat the Prefix Filtering? An Adaptive Framework for Similarity Join and Search Can We Beat the Prefx Flterng? An Adaptve Framework for Smlarty Jon and Search Jannan Wang Guolang L Janhua Feng Department of Computer Scence and Technology, Tsnghua Natonal Laboratory for Informaton

More information

GSLM Operations Research II Fall 13/14

GSLM Operations Research II Fall 13/14 GSLM 58 Operatons Research II Fall /4 6. Separable Programmng Consder a general NLP mn f(x) s.t. g j (x) b j j =. m. Defnton 6.. The NLP s a separable program f ts objectve functon and all constrants are

More information

Mathematics 256 a course in differential equations for engineering students

Mathematics 256 a course in differential equations for engineering students Mathematcs 56 a course n dfferental equatons for engneerng students Chapter 5. More effcent methods of numercal soluton Euler s method s qute neffcent. Because the error s essentally proportonal to the

More information

UB at GeoCLEF Department of Geography Abstract

UB at GeoCLEF Department of Geography   Abstract UB at GeoCLEF 2006 Mguel E. Ruz (1), Stuart Shapro (2), June Abbas (1), Slva B. Southwck (1) and Davd Mark (3) State Unversty of New York at Buffalo (1) Department of Lbrary and Informaton Studes (2) Department

More information

Learning-Based Top-N Selection Query Evaluation over Relational Databases

Learning-Based Top-N Selection Query Evaluation over Relational Databases Learnng-Based Top-N Selecton Query Evaluaton over Relatonal Databases Lang Zhu *, Wey Meng ** * School of Mathematcs and Computer Scence, Hebe Unversty, Baodng, Hebe 071002, Chna, zhu@mal.hbu.edu.cn **

More information

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization

Problem Definitions and Evaluation Criteria for Computational Expensive Optimization Problem efntons and Evaluaton Crtera for Computatonal Expensve Optmzaton B. Lu 1, Q. Chen and Q. Zhang 3, J. J. Lang 4, P. N. Suganthan, B. Y. Qu 6 1 epartment of Computng, Glyndwr Unversty, UK Faclty

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Load Balancing for Hex-Cell Interconnection Network

Load Balancing for Hex-Cell Interconnection Network Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,

More information

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr) Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute

More information

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and

More information

USING GRAPHING SKILLS

USING GRAPHING SKILLS Name: BOLOGY: Date: _ Class: USNG GRAPHNG SKLLS NTRODUCTON: Recorded data can be plotted on a graph. A graph s a pctoral representaton of nformaton recorded n a data table. t s used to show a relatonshp

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

Self-tuning Histograms: Building Histograms Without Looking at Data

Self-tuning Histograms: Building Histograms Without Looking at Data Self-tunng Hstograms: Buldng Hstograms Wthout Lookng at Data Ashraf Aboulnaga Computer Scences Department Unversty of Wsconsn - Madson ashraf@cs.wsc.edu Surajt Chaudhur Mcrosoft Research surajtc@mcrosoft.com

More information

Related-Mode Attacks on CTR Encryption Mode

Related-Mode Attacks on CTR Encryption Mode Internatonal Journal of Network Securty, Vol.4, No.3, PP.282 287, May 2007 282 Related-Mode Attacks on CTR Encrypton Mode Dayn Wang, Dongda Ln, and Wenlng Wu (Correspondng author: Dayn Wang) Key Laboratory

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

Design and Analysis of Algorithms

Design and Analysis of Algorithms Desgn and Analyss of Algorthms Heaps and Heapsort Reference: CLRS Chapter 6 Topcs: Heaps Heapsort Prorty queue Huo Hongwe Recap and overvew The story so far... Inserton sort runnng tme of Θ(n 2 ); sorts

More information

Sorting. Sorting. Why Sort? Consistent Ordering

Sorting. Sorting. Why Sort? Consistent Ordering Sortng CSE 6 Data Structures Unt 15 Readng: Sectons.1-. Bubble and Insert sort,.5 Heap sort, Secton..6 Radx sort, Secton.6 Mergesort, Secton. Qucksort, Secton.8 Lower bound Sortng Input an array A of data

More information

CMPS 10 Introduction to Computer Science Lecture Notes

CMPS 10 Introduction to Computer Science Lecture Notes CPS 0 Introducton to Computer Scence Lecture Notes Chapter : Algorthm Desgn How should we present algorthms? Natural languages lke Englsh, Spansh, or French whch are rch n nterpretaton and meanng are not

More information

CHARUTAR VIDYA MANDAL S SEMCOM Vallabh Vidyanagar

CHARUTAR VIDYA MANDAL S SEMCOM Vallabh Vidyanagar CHARUTAR VIDYA MANDAL S SEMCOM Vallabh Vdyanagar Faculty Name: Am D. Trved Class: SYBCA Subject: US03CBCA03 (Advanced Data & Fle Structure) *UNIT 1 (ARRAYS AND TREES) **INTRODUCTION TO ARRAYS If we want

More information

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision

SLAM Summer School 2006 Practical 2: SLAM using Monocular Vision SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,

More information

Sorting. Sorted Original. index. index

Sorting. Sorted Original. index. index 1 Unt 16 Sortng 2 Sortng Sortng requres us to move data around wthn an array Allows users to see and organze data more effcently Behnd the scenes t allows more effectve searchng of data There are MANY

More information

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes SPH3UW Unt 7.3 Sphercal Concave Mrrors Page 1 of 1 Notes Physcs Tool box Concave Mrror If the reflectng surface takes place on the nner surface of the sphercal shape so that the centre of the mrror bulges

More information

CSE 326: Data Structures Quicksort Comparison Sorting Bound

CSE 326: Data Structures Quicksort Comparison Sorting Bound CSE 326: Data Structures Qucksort Comparson Sortng Bound Steve Setz Wnter 2009 Qucksort Qucksort uses a dvde and conquer strategy, but does not requre the O(N) extra space that MergeSort does. Here s the

More information

PYTHON IMPLEMENTATION OF VISUAL SECRET SHARING SCHEMES

PYTHON IMPLEMENTATION OF VISUAL SECRET SHARING SCHEMES PYTHON IMPLEMENTATION OF VISUAL SECRET SHARING SCHEMES Ruxandra Olmd Faculty of Mathematcs and Computer Scence, Unversty of Bucharest Emal: ruxandra.olmd@fm.unbuc.ro Abstract Vsual secret sharng schemes

More information

Learning the Kernel Parameters in Kernel Minimum Distance Classifier

Learning the Kernel Parameters in Kernel Minimum Distance Classifier Learnng the Kernel Parameters n Kernel Mnmum Dstance Classfer Daoqang Zhang 1,, Songcan Chen and Zh-Hua Zhou 1* 1 Natonal Laboratory for Novel Software Technology Nanjng Unversty, Nanjng 193, Chna Department

More information

Chapter 6 Programmng the fnte element method Inow turn to the man subject of ths book: The mplementaton of the fnte element algorthm n computer programs. In order to make my dscusson as straghtforward

More information

CSE 326: Data Structures Quicksort Comparison Sorting Bound

CSE 326: Data Structures Quicksort Comparison Sorting Bound CSE 326: Data Structures Qucksort Comparson Sortng Bound Bran Curless Sprng 2008 Announcements (5/14/08) Homework due at begnnng of class on Frday. Secton tomorrow: Graded homeworks returned More dscusson

More information

Analysis of Continuous Beams in General

Analysis of Continuous Beams in General Analyss of Contnuous Beams n General Contnuous beams consdered here are prsmatc, rgdly connected to each beam segment and supported at varous ponts along the beam. onts are selected at ponts of support,

More information

Query Clustering Using a Hybrid Query Similarity Measure

Query Clustering Using a Hybrid Query Similarity Measure Query clusterng usng a hybrd query smlarty measure Fu. L., Goh, D.H., & Foo, S. (2004). WSEAS Transacton on Computers, 3(3), 700-705. Query Clusterng Usng a Hybrd Query Smlarty Measure Ln Fu, Don Hoe-Lan

More information

Virtual Machine Migration based on Trust Measurement of Computer Node

Virtual Machine Migration based on Trust Measurement of Computer Node Appled Mechancs and Materals Onlne: 2014-04-04 ISSN: 1662-7482, Vols. 536-537, pp 678-682 do:10.4028/www.scentfc.net/amm.536-537.678 2014 Trans Tech Publcatons, Swtzerland Vrtual Machne Mgraton based on

More information

Classifier Selection Based on Data Complexity Measures *

Classifier Selection Based on Data Complexity Measures * Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.

More information

Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices

Steps for Computing the Dissimilarity, Entropy, Herfindahl-Hirschman and. Accessibility (Gravity with Competition) Indices Steps for Computng the Dssmlarty, Entropy, Herfndahl-Hrschman and Accessblty (Gravty wth Competton) Indces I. Dssmlarty Index Measurement: The followng formula can be used to measure the evenness between

More information

Information Retrieval

Information Retrieval Anmol Bhasn abhasn[at]cedar.buffalo.edu Moht Devnan mdevnan[at]cse.buffalo.edu Sprng 2005 #$ "% &'" (! Informaton Retreval )" " * + %, ##$ + *--. / "#,0, #'",,,#$ ", # " /,,#,0 1"%,2 '",, Documents are

More information

Description of NTU Approach to NTCIR3 Multilingual Information Retrieval

Description of NTU Approach to NTCIR3 Multilingual Information Retrieval Proceedngs of the Thrd NTCIR Workshop Descrpton of NTU Approach to NTCIR3 Multlngual Informaton Retreval Wen-Cheng Ln and Hsn-Hs Chen Department of Computer Scence and Informaton Engneerng Natonal Tawan

More information

CE 221 Data Structures and Algorithms

CE 221 Data Structures and Algorithms CE 1 ata Structures and Algorthms Chapter 4: Trees BST Text: Read Wess, 4.3 Izmr Unversty of Economcs 1 The Search Tree AT Bnary Search Trees An mportant applcaton of bnary trees s n searchng. Let us assume

More information

Feature Reduction and Selection

Feature Reduction and Selection Feature Reducton and Selecton Dr. Shuang LIANG School of Software Engneerng TongJ Unversty Fall, 2012 Today s Topcs Introducton Problems of Dmensonalty Feature Reducton Statstc methods Prncpal Components

More information

Cluster Analysis of Electrical Behavior

Cluster Analysis of Electrical Behavior Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School

More information

Priority queues and heaps Professors Clark F. Olson and Carol Zander

Priority queues and heaps Professors Clark F. Olson and Carol Zander Prorty queues and eaps Professors Clark F. Olson and Carol Zander Prorty queues A common abstract data type (ADT) n computer scence s te prorty queue. As you mgt expect from te name, eac tem n te prorty

More information

Available online at Available online at Advanced in Control Engineering and Information Science

Available online at   Available online at   Advanced in Control Engineering and Information Science Avalable onlne at wwwscencedrectcom Avalable onlne at wwwscencedrectcom Proceda Proceda Engneerng Engneerng 00 (2011) 15000 000 (2011) 1642 1646 Proceda Engneerng wwwelsevercom/locate/proceda Advanced

More information

Load-Balanced Anycast Routing

Load-Balanced Anycast Routing Load-Balanced Anycast Routng Chng-Yu Ln, Jung-Hua Lo, and Sy-Yen Kuo Department of Electrcal Engneerng atonal Tawan Unversty, Tape, Tawan sykuo@cc.ee.ntu.edu.tw Abstract For fault-tolerance and load-balance

More information

A New Approach For the Ranking of Fuzzy Sets With Different Heights

A New Approach For the Ranking of Fuzzy Sets With Different Heights New pproach For the ankng of Fuzzy Sets Wth Dfferent Heghts Pushpnder Sngh School of Mathematcs Computer pplcatons Thapar Unversty, Patala-7 00 Inda pushpndersnl@gmalcom STCT ankng of fuzzy sets plays

More information

ELEC 377 Operating Systems. Week 6 Class 3

ELEC 377 Operating Systems. Week 6 Class 3 ELEC 377 Operatng Systems Week 6 Class 3 Last Class Memory Management Memory Pagng Pagng Structure ELEC 377 Operatng Systems Today Pagng Szes Vrtual Memory Concept Demand Pagng ELEC 377 Operatng Systems

More information

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices

An Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices Internatonal Mathematcal Forum, Vol 7, 2012, no 52, 2549-2554 An Applcaton of the Dulmage-Mendelsohn Decomposton to Sparse Null Space Bases of Full Row Rank Matrces Mostafa Khorramzadeh Department of Mathematcal

More information

Today s Outline. Sorting: The Big Picture. Why Sort? Selection Sort: Idea. Insertion Sort: Idea. Sorting Chapter 7 in Weiss.

Today s Outline. Sorting: The Big Picture. Why Sort? Selection Sort: Idea. Insertion Sort: Idea. Sorting Chapter 7 in Weiss. Today s Outlne Sortng Chapter 7 n Wess CSE 26 Data Structures Ruth Anderson Announcements Wrtten Homework #6 due Frday 2/26 at the begnnng of lecture Proect Code due Mon March 1 by 11pm Today s Topcs:

More information

SAO: A Stream Index for Answering Linear Optimization Queries

SAO: A Stream Index for Answering Linear Optimization Queries SAO: A Stream Index for Answerng near Optmzaton Queres Gang uo Kun-ung Wu Phlp S. Yu IBM T.J. Watson Research Center {luog, klwu, psyu}@us.bm.com Abstract near optmzaton queres retreve the top-k tuples

More information

Biostatistics 615/815

Biostatistics 615/815 The E-M Algorthm Bostatstcs 615/815 Lecture 17 Last Lecture: The Smplex Method General method for optmzaton Makes few assumptons about functon Crawls towards mnmum Some recommendatons Multple startng ponts

More information

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS ARPN Journal of Engneerng and Appled Scences 006-017 Asan Research Publshng Network (ARPN). All rghts reserved. NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS Igor Grgoryev, Svetlana

More information

Estimating Costs of Path Expression Evaluation in Distributed Object Databases

Estimating Costs of Path Expression Evaluation in Distributed Object Databases Estmatng Costs of Path Expresson Evaluaton n Dstrbuted Obect Databases Gabrela Ruberg, Fernanda Baão, and Marta Mattoso Department of Computer Scence COPPE/UFRJ P.O.Box 685, Ro de Janero, RJ, 2945-970

More information

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems

A Unified Framework for Semantics and Feature Based Relevance Feedback in Image Retrieval Systems A Unfed Framework for Semantcs and Feature Based Relevance Feedback n Image Retreval Systems Ye Lu *, Chunhu Hu 2, Xngquan Zhu 3*, HongJang Zhang 2, Qang Yang * School of Computng Scence Smon Fraser Unversty

More information

Hermite Splines in Lie Groups as Products of Geodesics

Hermite Splines in Lie Groups as Products of Geodesics Hermte Splnes n Le Groups as Products of Geodescs Ethan Eade Updated May 28, 2017 1 Introducton 1.1 Goal Ths document defnes a curve n the Le group G parametrzed by tme and by structural parameters n the

More information

Fast Computation of Shortest Path for Visiting Segments in the Plane

Fast Computation of Shortest Path for Visiting Segments in the Plane Send Orders for Reprnts to reprnts@benthamscence.ae 4 The Open Cybernetcs & Systemcs Journal, 04, 8, 4-9 Open Access Fast Computaton of Shortest Path for Vstng Segments n the Plane Ljuan Wang,, Bo Jang

More information

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1)

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1) Secton 1.2 Subsets and the Boolean operatons on sets If every element of the set A s an element of the set B, we say that A s a subset of B, or that A s contaned n B, or that B contans A, and we wrte A

More information

Efficient Distributed File System (EDFS)

Efficient Distributed File System (EDFS) Effcent Dstrbuted Fle System (EDFS) (Sem-Centralzed) Debessay(Debsh) Fesehaye, Rahul Malk & Klara Naherstedt Unversty of Illnos-Urbana Champagn Contents Problem Statement, Related Work, EDFS Desgn Rate

More information

Wishing you all a Total Quality New Year!

Wishing you all a Total Quality New Year! Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma

More information

Adaptive Load Shedding for Windowed Stream Joins

Adaptive Load Shedding for Windowed Stream Joins Adaptve Load Sheddng for Wndowed Stream Jons Bu gra Gedk College of Computng, GaTech bgedk@cc.gatech.edu Kun-Lung Wu, Phlp Yu T.J. Watson Research, IBM {klwu,psyu}@us.bm.com Lng Lu College of Computng,

More information

Summarizing Data using Bottom-k Sketches

Summarizing Data using Bottom-k Sketches Summarzng Data usng Bottom-k Sketches Edth Cohen AT&T Labs Research 8 Park Avenue Florham Park, NJ 7932, USA edth@research.att.com Ham Kaplan School of Computer Scence Tel Avv Unversty Tel Avv, Israel

More information

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning

Outline. Type of Machine Learning. Examples of Application. Unsupervised Learning Outlne Artfcal Intellgence and ts applcatons Lecture 8 Unsupervsed Learnng Professor Danel Yeung danyeung@eee.org Dr. Patrck Chan patrckchan@eee.org South Chna Unversty of Technology, Chna Introducton

More information

Accounting for the Use of Different Length Scale Factors in x, y and z Directions

Accounting for the Use of Different Length Scale Factors in x, y and z Directions 1 Accountng for the Use of Dfferent Length Scale Factors n x, y and z Drectons Taha Soch (taha.soch@kcl.ac.uk) Imagng Scences & Bomedcal Engneerng, Kng s College London, The Rayne Insttute, St Thomas Hosptal,

More information

X- Chart Using ANOM Approach

X- Chart Using ANOM Approach ISSN 1684-8403 Journal of Statstcs Volume 17, 010, pp. 3-3 Abstract X- Chart Usng ANOM Approach Gullapall Chakravarth 1 and Chaluvad Venkateswara Rao Control lmts for ndvdual measurements (X) chart are

More information

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;

Subspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points; Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features

More information

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics

NAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics Introducton G10 NAG Fortran Lbrary Chapter Introducton G10 Smoothng n Statstcs Contents 1 Scope of the Chapter... 2 2 Background to the Problems... 2 2.1 Smoothng Methods... 2 2.2 Smoothng Splnes and Regresson

More information

An Efficient Garbage Collection for Flash Memory-Based Virtual Memory Systems

An Efficient Garbage Collection for Flash Memory-Based Virtual Memory Systems S. J and D. Shn: An Effcent Garbage Collecton for Flash Memory-Based Vrtual Memory Systems 2355 An Effcent Garbage Collecton for Flash Memory-Based Vrtual Memory Systems Seunggu J and Dongkun Shn, Member,

More information

Module Management Tool in Software Development Organizations

Module Management Tool in Software Development Organizations Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,

More information

Private Information Retrieval (PIR)

Private Information Retrieval (PIR) 2 Levente Buttyán Problem formulaton Alce wants to obtan nformaton from a database, but she does not want the database to learn whch nformaton she wanted e.g., Alce s an nvestor queryng a stock-market

More information

CS 534: Computer Vision Model Fitting

CS 534: Computer Vision Model Fitting CS 534: Computer Vson Model Fttng Sprng 004 Ahmed Elgammal Dept of Computer Scence CS 534 Model Fttng - 1 Outlnes Model fttng s mportant Least-squares fttng Maxmum lkelhood estmaton MAP estmaton Robust

More information

Insertion Sort. Divide and Conquer Sorting. Divide and Conquer. Mergesort. Mergesort Example. Auxiliary Array

Insertion Sort. Divide and Conquer Sorting. Divide and Conquer. Mergesort. Mergesort Example. Auxiliary Array Inserton Sort Dvde and Conquer Sortng CSE 6 Data Structures Lecture 18 What f frst k elements of array are already sorted? 4, 7, 1, 5, 1, 16 We can shft the tal of the sorted elements lst down and then

More information

Transaction-Consistent Global Checkpoints in a Distributed Database System

Transaction-Consistent Global Checkpoints in a Distributed Database System Proceedngs of the World Congress on Engneerng 2008 Vol I Transacton-Consstent Global Checkponts n a Dstrbuted Database System Jang Wu, D. Manvannan and Bhavan Thurasngham Abstract Checkpontng and rollback

More information

3D vector computer graphics

3D vector computer graphics 3D vector computer graphcs Paolo Varagnolo: freelance engneer Padova Aprl 2016 Prvate Practce ----------------------------------- 1. Introducton Vector 3D model representaton n computer graphcs requres

More information

Optimal Workload-based Weighted Wavelet Synopses

Optimal Workload-based Weighted Wavelet Synopses Optmal Workload-based Weghted Wavelet Synopses Yoss Matas School of Computer Scence Tel Avv Unversty Tel Avv 69978, Israel matas@tau.ac.l Danel Urel School of Computer Scence Tel Avv Unversty Tel Avv 69978,

More information

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following.

Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following. Complex Numbers The last topc n ths secton s not really related to most of what we ve done n ths chapter, although t s somewhat related to the radcals secton as we wll see. We also won t need the materal

More information

Concurrent Apriori Data Mining Algorithms

Concurrent Apriori Data Mining Algorithms Concurrent Apror Data Mnng Algorthms Vassl Halatchev Department of Electrcal Engneerng and Computer Scence York Unversty, Toronto October 8, 2015 Outlne Why t s mportant Introducton to Assocaton Rule Mnng

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints

Sum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints Australan Journal of Basc and Appled Scences, 2(4): 1204-1208, 2008 ISSN 1991-8178 Sum of Lnear and Fractonal Multobjectve Programmng Problem under Fuzzy Rules Constrants 1 2 Sanjay Jan and Kalash Lachhwan

More information

2x x l. Module 3: Element Properties Lecture 4: Lagrange and Serendipity Elements

2x x l. Module 3: Element Properties Lecture 4: Lagrange and Serendipity Elements Module 3: Element Propertes Lecture : Lagrange and Serendpty Elements 5 In last lecture note, the nterpolaton functons are derved on the bass of assumed polynomal from Pascal s trangle for the fled varable.

More information

Reducing Frame Rate for Object Tracking

Reducing Frame Rate for Object Tracking Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg

More information

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,

More information

CS221: Algorithms and Data Structures. Priority Queues and Heaps. Alan J. Hu (Borrowing slides from Steve Wolfman)

CS221: Algorithms and Data Structures. Priority Queues and Heaps. Alan J. Hu (Borrowing slides from Steve Wolfman) CS: Algorthms and Data Structures Prorty Queues and Heaps Alan J. Hu (Borrowng sldes from Steve Wolfman) Learnng Goals After ths unt, you should be able to: Provde examples of approprate applcatons for

More information

Run-Time Operator State Spilling for Memory Intensive Long-Running Queries

Run-Time Operator State Spilling for Memory Intensive Long-Running Queries Run-Tme Operator State Spllng for Memory Intensve Long-Runnng Queres Bn Lu, Yal Zhu, and lke A. Rundenstener epartment of Computer Scence, Worcester Polytechnc Insttute Worcester, Massachusetts, USA {bnlu,

More information

S1 Note. Basis functions.

S1 Note. Basis functions. S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type

More information

ON SOME ENTERTAINING APPLICATIONS OF THE CONCEPT OF SET IN COMPUTER SCIENCE COURSE

ON SOME ENTERTAINING APPLICATIONS OF THE CONCEPT OF SET IN COMPUTER SCIENCE COURSE Yordzhev K., Kostadnova H. Інформаційні технології в освіті ON SOME ENTERTAINING APPLICATIONS OF THE CONCEPT OF SET IN COMPUTER SCIENCE COURSE Yordzhev K., Kostadnova H. Some aspects of programmng educaton

More information