Summarizing Data using Bottom-k Sketches


Edith Cohen
AT&T Labs Research
180 Park Avenue
Florham Park, NJ 07932, USA
edith@research.att.com

Haim Kaplan
School of Computer Science
Tel Aviv University
Tel Aviv, Israel
haimk@cs.tau.ac.il

ABSTRACT

A bottom-k sketch is a summary of a set of items with nonnegative weights that supports approximate query processing. A sketch is obtained by associating with each item in a ground set an independent random rank drawn from a probability distribution that depends on the weight of the item and including the k items with smallest rank value. Bottom-k sketches are an alternative to k-mins sketches [9], which consist of the k minimum ranked items in k independent rank assignments, and to min-hash [5] sketches, where hash functions replace random rank assignments. Sketches support approximate aggregations, including weight and selectivity of a subpopulation. Coordinated sketches of multiple subsets over the same ground set support subset-relation queries such as Jaccard similarity or the weight of the union. All-distances sketches are applicable for datasets where items lie in some metric space such as data streams (time) or networks. These sketches compactly encode the respective plain sketches of all neighborhoods of a location. These sketches support queries posed over time windows or neighborhoods and time/spatially decaying aggregates.

An important advantage of bottom-k sketches, established in a line of recent work, is much tighter estimators for several basic aggregates. To materialize this benefit, we must adapt traditional k-mins applications to use bottom-k sketches. We propose all-distances bottom-k sketches and develop and analyze data structures that incrementally construct bottom-k sketches and all-distances bottom-k sketches.

Another advantage of bottom-k sketches is that when the data is represented explicitly, they can be obtained much more efficiently than k-mins sketches. We show that k-mins sketches can be derived from respective bottom-k sketches, which enables the use of bottom-k sketches with off-the-shelf k-mins estimators.
(In fact, we obtain tighter estimators since each bottom-k sketch is a distribution over k-mins sketches.)

Categories and Subject Descriptors: E.2 Data Storage Representations; G.3: probabilistic algorithms; E.1 Data Structures

General Terms: Algorithms, Measurement, Performance, Theory

Keywords: all-distances sketches, data streams, bottom-k sketches

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. PODC'07, August 12-15, 2007, Portland, Oregon, USA. Copyright 2007 ACM .../07/0008 ...$5.00.

1. INTRODUCTION

Sketching or sampling is an extremely useful tool for storage and queries on massive data sets. Sketches allow us to process approximate queries on the original data sets while occupying a fraction of the storage space required for the full data set and using a fraction of the computation resources required for the exact answer. The value of a sketching method depends on the efficiency of its implementation, its versatility in terms of the operations supported, and the quality of the estimates obtained.

Bottom-k and k-mins sketches are summaries of a set of items with positive weights. k-mins sketches (the min-rank method [9]) are obtained by assigning independent random ranks to items where the distribution used for each item depends on the weight of the item. We retain the minimum rank of an item in the set. This is repeated with k independent rank assignments for some integer k and we obtain a k-vector of independent minimum ranks and k independent weighted samples.

Bottom-k sketches are an emerging alternative to k-mins sketches. Bottom-k sketches are constructed using a single rank assignment. The bottom-k sketch of a subset contains the k items with smallest ranks in the subset. Bottom-k sketches were mentioned, without analysis, in [9, 22].
The sketch supports approximate query processing over the original data set and subpopulations of this dataset. Basic aggregations include the weight of the set or the selectivity of a subpopulation (subset) of the set and derived aggregations include approximate quantiles, average weight, and variance and higher moments []. The sketch of a set is a weighted random sample. When used with exponentially distributed ranks, bottom-k sketches are a weighted sample without replacement (WS-sketches) whereas k-mins sketches are a weighted sample with replacement (WSR-sketches).

In applications where there are multiple subsets that are defined over the same ground set of items, a sketch is produced for each subset. The sketches of different subsets are coordinated, sharing the same rank assignments to the items of the ground set, and support queries over subset relations, such as the weight of the union or intersection, their weight ratio, and resemblance or Jaccard similarity coefficient. A useful property of coordinated sketches is that the sketch of a union can be computed from the sketches of the subsets. Therefore, given sketches of subsets, we can perform aggregations on unions of subsets.

An example of an application with multiple subsets is when items are associated with nodes of a directed graph and we compute k-mins sketches for the reachability set of each node. These sketches can be computed in Õ(km) time (and storage) whereas an explicit representation of the subsets requires O(mn) time [9]. Other applications include maintaining a sketch of influencing events for each process in a computer system [5], where, when a process A affects process B, the new sketch of B becomes the sketch of the union; and aggregations on gossip networks [2], where k-mins sketches were used based on the property that the sketches reduce the approximate sum problem to that of finding a minimum.

Other applications with multiple subsets where sketches support fast computation of subset relations are near-duplicate detection for Web pages [5] (a sketch is produced for each Web page), study of similar Web sites [2], mining of association rules [22] from market basket data, and eliminating redundant network traffic [23]. In these applications, a variant termed min-hash sketches substitutes random rank assignments with random hash functions (families of min-wise independent hash functions or ɛ-min-wise functions [5, 6]). With random hash functions, the rank assignment of an item depends on the item identifier, and it has the property that all copies of the same item across different subsets obtain the same rank, without additional bookkeeping or coordination between all occurrences of each item. This allows for efficient aggregations over distinct occurrences (see [9]) and supports subset-relation queries.

Bottom-k sketches encode more information than k-mins sketches. (Intuitively, sampling without replacement is more informative than sampling with replacement.) A line of recent work showed that bottom-k sketches are superior to k-mins sketches in terms of estimate quality. Estimators for subpopulation weight using priority ranks (PRI-sketches) were provided in [, 24] and estimators for general families of rank functions were provided in [, 2]. The improvement in estimate quality is significant on weight distributions and values of k such that items are likely to be sampled multiple times in a k-sample drawn with replacement, such as skewed Zipf-like distributions that often arise in practice.
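To make the contrast between sampling with and without replacement concrete, the following Python sketch draws a k-mins sample and a bottom-k sample over a Zipf-like weight vector, using exponentially distributed ranks. It is only an illustration under stated assumptions: the function names, the weights 1/i, the seed, and the values of k and the set size are all chosen for this example and do not come from the paper.

```python
import math
import random


def exp_rank(weight, rng):
    """Draw an exponentially distributed rank with parameter `weight`."""
    # 1 - rng.random() lies in (0, 1], so the logarithm is always defined.
    return -math.log(1.0 - rng.random()) / weight


def k_mins_items(weights, k, rng):
    """Minimum-rank item under each of k independent rank assignments."""
    sample = []
    for _ in range(k):
        ranks = {item: exp_rank(w, rng) for item, w in weights.items()}
        sample.append(min(ranks, key=ranks.get))
    return sample


def bottom_k_items(weights, k, rng):
    """The k smallest-rank items under a single rank assignment."""
    ranks = {item: exp_rank(w, rng) for item, w in weights.items()}
    return sorted(ranks, key=ranks.get)[:k]


rng = random.Random(7)                            # illustrative seed
weights = {i: 1.0 / i for i in range(1, 1001)}    # assumed Zipf-like weights
k = 20

with_replacement = k_mins_items(weights, k, rng)
without_replacement = bottom_k_items(weights, k, rng)
distinct_wsr = len(set(with_replacement))
distinct_ws = len(set(without_replacement))
print("distinct items, k-mins:", distinct_wsr, "bottom-k:", distinct_ws)
```

On skewed weights the k-mins sample tends to repeat the heaviest items, so it usually reveals fewer distinct items than the bottom-k sample, which always holds k distinct items when the set has at least k items.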
For subset relations such as the weight of the intersection or union, bottom-k sketches improve over k-mins sketches even when weights are uniform [, 2]: carefully designed estimators are applied to the combined bottom-k sketches, which reveal more members of the union and intersection than two corresponding k-mins sketches.

Our contributions

We facilitate the use of bottom-k sketches by developing and analyzing data structures that construct these sketches. Our results allow applications that use k-mins sketches to use the superior bottom-k sketches. An inherent difference we had to tackle is that k-mins sketches are obtained using k independent rank functions, which allows for k independent copies of the same simple data structure to be used, whereas bottom-k entries are dependent.

Sketches are constructed incrementally as items are processed. The sketch is manipulated through two basic operations: a test operation, which tests if the sketch has to be updated, and an update operation, which inserts the new item if the sketch indeed has to be updated. We make this distinction since test operations can be performed much more efficiently than update operations. The number of update operations depends on the order in which items are processed and on the weight distribution of the data. The number of test operations is typically larger than the number of updates. The extent to which it is larger, however, highly depends on the application.

We distinguish between applications with explicit representation [3, 2, 22, 23] or implicit representation [9, 3, 5] of the data. In applications with an explicit representation, item-subset pairs are provided explicitly. The dataset could be distributed, presented as a data stream, or in external memory, but the pairs are explicitly provided and are all processed to produce the sketches. In applications with implicit representation, the subsets are specified as neighborhoods in a graph or some metric space. With explicit representation, the number of test operations is much larger than the number of update operations.
In Section 3 we analyze the number of test and update operations and how it depends on the way the data is presented and on the distribution of the item weights.

All-distances sketches are a generalization of plain sketches that are used when the underlying dataset has items associated with locations in some metric space, and subsets are specified by neighborhoods of a location. All-distances k-mins sketches were used for data streams (where aggregation is over windows of elapsed time to the present time) [4], the Euclidean plane (where we are presented with a query point and distance) [3, 2], a graph (the query is a node and distance) [9], or distributed spatial aggregation over a network [9, 3]. An all-distances sketch is a compact encoding of the plain sketches of all neighborhoods of a certain location q. For a given distance d, the sketch for the d-neighborhood of the location can be constructed from the all-distances sketch. All-distances sketches also support time-decaying and spatially-decaying aggregates using arbitrary decay functions [4, 3].

In Section 4 we define bottom-k all-distances sketches and present efficient data structures for maintaining both all-distances k-mins sketches and all-distances bottom-k sketches. We analyze the number of operations required to construct all-distances sketches under different arrival orders of the items.

In Section 6 we provide a method to derive WSR-sketches (k-mins with exponential ranks) from WS-sketches (bottom-k with exponential ranks). This mimicking process provides a general method of applying estimators designed for WSR-sketches to WS-sketches. This process enables us to use bottom-k sketches in applications (such as those with explicit representation of the data) where they can be obtained much more efficiently than k-mins sketches and use readily available WSR-sketches estimators. In fact, since each WS-sketch corresponds to a distribution over WSR-sketches, we obtain estimators with smaller variance than the underlying WSR-sketches estimators. This reduction also shows that WS-sketches are strictly superior to WSR-sketches.
We provide examples of applications of the mimicking process.

2. PRELIMINARIES

Let $I$ be a ground set of items, where item $i \in I$ has weight $w(i)$. A rank assignment maps each item $i$ to a random rank $r(i)$. The ranks of items are drawn independently using a family of distributions $f_w$ ($w \ge 0$), where the rank of an item with weight $w(i)$ is drawn according to $f_{w(i)}$. We use random rank assignments to obtain sketches of subsets as follows. For a subset $J$ of items and a rank assignment $r$ we define $B_1(r, J) = \arg\min_{j \in J} r(j)$ to be the item in $J$ with smallest rank according to $r$. For $i \in \{1, \ldots, |J|\}$, we define $B_i(r, J)$ to be the item in $J$ with $i$th smallest rank according to $r$ and $r_i(J) \equiv r(B_i(r, J))$ to be the $i$th smallest rank value in $J$ according to $r$.

Definition 2.1. k-mins sketches are produced from $k$ independent rank assignments, $r^{(1)}, \ldots, r^{(k)}$. The k-mins sketch of a subset $J$ is the k-vector $(r_1^{(1)}(J), r_1^{(2)}(J), \ldots, r_1^{(k)}(J))$. To support some queries, we may need to include with each entry an identifier or some other attributes such as the weight of the items $B_1(r^{(j)}, J)$ ($j = 1, \ldots, k$).

Definition 2.2. Bottom-k sketches are produced from a single rank assignment $r$. The bottom-k sketch $s(r, J)$ of the subset $J$ is a list of entries $(r_i(J), w(B_i(r, J)))$ for $i = 1, \ldots, k$. The list is ordered by rank, from smallest to largest.

The bottom-k sketch of a subset is therefore a list with up to k entries. The size of the list is the minimum of k and the number

of items in the subset. For a single item $i$ (a subset of size 1), the bottom-k sketch is a list with a single entry $(r_1(J), w(B_1(r, J)))$. To support queries, in addition to the weight, entries in the sketch may include an identifier and attribute values of items $B_i(r, J)$ ($i = 1, \ldots, k$).

Bottom-k and k-mins sketches have the following useful property: the sketch of a union of two sets can be generated from the sketches of the two sets. Let $J$ and $H$ be two subsets. For any rank assignment $r$, $r_1(J \cup H) = \min\{r_1(J), r_1(H)\}$. Therefore, for k-mins sketches we have

$(r_1^{(1)}(J \cup H), \ldots, r_1^{(k)}(J \cup H)) = (\min\{r_1^{(1)}(J), r_1^{(1)}(H)\}, \ldots, \min\{r_1^{(k)}(J), r_1^{(k)}(H)\})$.

For bottom-k sketches, the k smallest ranks in the union $J \cup H$ are contained in the union of the sets of the k smallest ranks in each of $J$ and $H$. That is, $s(r, J \cup H) \subseteq s(r, J) \cup s(r, H)$. Therefore, the bottom-k sketch of $J \cup H$ can be computed by taking the entries with k smallest ranks in the combined sketches of $J$ and $H$.

To support sketch-based set operations and queries, we need to store the rank values of items. To perform sketch-based queries on a single subset, however, we do not need all rank values. With bottom-k sketches, it is sufficient to store the $(k+1)$st smallest rank value, $r_{k+1}$: we (re)draw random rank values for each item $i$ in the sketch using $f_{w(i)}$ conditioned on the rank being smaller than $r_{k+1}$. This is just like (re)drawing a random bottom-k sketch from the probability subspace where the minimum rank of items not in the sketch is equal to $r_{k+1}$ and all items in the sketch have ranks smaller than $r_{k+1}$.

Beyond reduced storage, this observation often enables us to obtain tighter estimators. The unbiased rank conditioning estimator for subpopulation weight [, 2] is applied to the value $r_{k+1}$ and the weights of the items in the (unordered) sketch. In some cases, however, it is easier to derive an estimator that is applied to the ordered sketch with rank values (the mimicking process in Section 6 is applied to an ordered bottom-k sketch).
In this case, instead of applying an estimator to the original sketch and rank values, we take its expectation over re-drawn sketches or its average over multiple draws (if the expectation is hard to compute). This results in an estimator with at most the same variance and often smaller variance. Correctness follows from a basic property of variances:

Lemma 2.3. Let $a_1$ and $a_2$ be two random variables over $\Omega$. Suppose there is a partition of $\Omega$ such that the value of $a_2$ on each part is equal to the expectation of $a_1$ on that part. Then $\mathrm{VAR}(a_2) \le \mathrm{VAR}(a_1)$.

The choice of which family of random rank functions to use matters only when items are weighted. Otherwise, sketches produced using one rank function can be transformed to any other rank function.

WS-sketches and WSR-sketches. A convenient choice for the rank function $f_w$ is an exponential distribution with parameter $w$ [9]. The density function of this distribution is $f_w(x) = w e^{-wx}$, and its cumulative distribution function is $F_w(x) = 1 - e^{-wx}$. We refer to k-mins sketches with these ranks as WSR-sketches and to bottom-k sketches with these ranks as WS-sketches. The minimum rank $r_1(J)$ of an item in a subset $J \subseteq I$ is exponentially distributed with parameter $w(J) = \sum_{i \in J} w(i)$. This follows from the fact that the minimum of independent random variables, each drawn from an exponential distribution, is also an exponentially distributed random variable with parameter equal to the sum of the parameters of these distributions. The item with the minimum rank, $B_1(r, J)$, is a weighted random sample from $J$: the probability that an item $i \in J$ is the minimum rank item is $w(i)/w(J)$. (We assume, to simplify the analysis, that all random values are distinct.) Therefore we can conclude that a WSR-sketch of size k of a subset $J$ is a weighted random sample of size k, drawn with replacement from $J$ (hence the term WSR-sketches). The ranks of these items are a set of k independent samples from an exponential distribution with parameter $w(J)$. Hence, if the weight $w(J)$ is provided and we do not use subset-relation queries, rank values are redundant.
If $w(J)$ is not provided, the rank values can be used in unbiased estimators for both $w(J)$ and the inverse weight $1/w(J)$ [9].² On the other hand, the items in a WS-sketch are samples drawn without replacement from $J$:

Lemma 2.4. A WS-sketch of size k of a subset $J$ is a sample of size k drawn without replacement from $J$.

PROOF. The probability that item $i \in J$ is $B_1(r, J)$ is $w(i)/w(J)$. Conditioned on the bottom-$j$ ranked items in $J$ being $i_1, \ldots, i_j$, $B_{j+1}(r, J)$ is $i \in J \setminus \{i_1, \ldots, i_j\}$ with probability $w(i)/(w(J) - \sum_{h=1}^{j} w(i_h))$.

If the weight $w(J)$ is provided and we do not use the sketches for subset-relation queries, it suffices to store the unordered set of items in $s(r, J)$. This information allows us to draw at random a bottom-k sketch from the probability subspace that contains all sketches where the set of the bottom-k ranked items is $s(r, J)$.

PRI-sketches. With priority ranks [8, ] the rank value of an item with weight $w$ is selected uniformly at random from $[0, 1/w]$. This is equivalent to choosing rank value $r/w$, where $r \in U[0, 1]$ is selected from the uniform distribution on the interval $[0, 1]$. It is well known that if $r \in U[0, 1]$ then $-\ln(r)/w$ is an exponential random variable with parameter $w$. Therefore exponential ranks correspond to using rank values $-\ln r / w$ where $r \in U[0, 1]$.

Choice of a rank function. The appeal of PRI-sketches is estimators that (nearly) minimize $\sum_{i \in I} \mathrm{VAR}(\tilde{w}(i))$ [24]. More precisely, Szegedy showed that the sum of per-item variances using PRI-sketches of size k is no larger than the smallest sum of variances attainable by an estimator that uses sketches with average size k.³ WS-sketches offer several other distinct advantages. First, they support unbiased estimators for selectivity (subpopulation fraction). Second, the estimators for selectivity, and for subpopulation weight when the weight of the set is known (as in data streams), feature negative covariances between different items. Therefore, selectivity and weight estimators for larger subpopulations are much tighter than with the known estimator for PRI-sketches [2].
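The two rank families and the union property of coordinated bottom-k sketches can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the toy weights, the subsets J and H, and the choice to share one uniform seed per item between both rank families are assumptions made for the example.

```python
import heapq
import math
import random


def exponential_rank(u, w):
    """Rank -ln(u)/w: exponential with parameter w when u is uniform on (0, 1]."""
    return -math.log(u) / w


def priority_rank(u, w):
    """Rank u/w: uniform on [0, 1/w] when u is uniform on [0, 1]."""
    return u / w


def bottom_k(ranks, weights, items, k):
    """Bottom-k sketch of `items`: up to k (rank, item, weight) entries,
    ordered from smallest to largest rank."""
    return heapq.nsmallest(k, ((ranks[i], i, weights[i]) for i in items))


def merge(sketch_a, sketch_b, k):
    """Sketch of the union: the k smallest-rank entries of two coordinated sketches."""
    return sorted(set(sketch_a) | set(sketch_b))[:k]


rng = random.Random(1)                                # illustrative seed
weights = {i: 1.0 + (i % 5) for i in range(200)}      # assumed toy weights
seeds = {i: 1.0 - rng.random() for i in weights}      # one shared u in (0, 1] per item
J, H = set(range(0, 120)), set(range(80, 200))        # two overlapping subsets
k = 10

results = {}
for name, rank_fn in (("exponential", exponential_rank), ("priority", priority_rank)):
    ranks = {i: rank_fn(seeds[i], weights[i]) for i in weights}
    merged = merge(bottom_k(ranks, weights, J, k), bottom_k(ranks, weights, H, k), k)
    direct = bottom_k(ranks, weights, J | H, k)
    results[name] = (merged, direct)
```

For either rank family, the sketch merged from the two subset sketches coincides with the sketch computed directly on the union, because both subsets are sketched under the same rank assignment.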
Unbiased subpopulation weight estimators exist for bottom-k sketches obtained using arbitrary rank functions [2]. These estimators are useful when we want to obtain good estimators with respect to multiple weight functions (e.g., for IP flows datasets we are interested in count of distinct flows and total bandwidth).

² Estimators for the inverse weight are useful for obtaining unbiased estimates for quantities where the weight appears in the denominator. These include weight ratio of two different subsets, set resemblance of two subsets, and average weight of a subset.
³ Szegedy's proof applies only to estimators based on adjusted weight assignments. It also does not apply to estimators on the weight of subpopulations.

3. MAINTAINING SKETCHES

Sketches are produced for each subset of interest in a collection of subsets over a ground set of items. The algorithms for constructing sketches are application-dependent, but on a high level,

sketches are constructed using an incremental process, where a current sketch is maintained for each subset of interest, and the sketch is updated when new information (item, or item and rank value) is presented. We identify two operations on the current sketch: a test operation that checks whether incorporating the new information causes a modification of the current sketch, and an update operation, which is a modification of the current sketch. We make the distinction between test and update because as a general rule, applications require more tests than updates, and in some applications, updates are costlier than tests.

We consider the time bounds of constructing k-mins and bottom-k sketches for two representative classes of applications. We show that when subsets are represented explicitly (each occurrence of an item in a subset is specified), it is much more efficient to construct bottom-k sketches. This point, for uniform weights, was already noted in [4, 22]. We review it and extend the analysis for weighted items. For implicit representation of the subsets, via a graph, we show that the time bounds for generating the two types of sketches are comparable.

3.1 Explicit representation of subsets

Examples of applications with explicit specification are [3, 2, 22, 23]. Among these are market-basket data, Web duplicate analysis and more. To construct a k-mins sketch for a subset, we maintain a current sketch $(m_1, \ldots, m_k)$ of the smallest rank value observed so far for each of the k rank functions (along with attributes of the items with smallest rank). Initially, $m_j = +\infty$ for $j = 1, \ldots, k$. When an item $i$ is processed we compute $r^{(1)}(i), r^{(2)}(i), \ldots, r^{(k)}(i)$. We then update the sketch so that $m_j \leftarrow \min\{m_j, r^{(j)}(i)\}$. Therefore, the processing time for each occurrence of an item in a subset is $\Theta(k)$ (it is $\Theta(k)$ time for both the test and update operations). To construct a bottom-k sketch, we use a current sketch that contains the k smallest rank values observed so far, $m_1 < m_2 < \cdots < m_k$, as a sorted list.
When an item $i$ is processed, we compute $r(i)$, which is compared to $m_k$ (test operation). If $r(i) < m_k$, the rank value $m_k$ (and corresponding item) is deleted from the list and $r(i)$ is inserted (update operation). A test operation takes $O(1)$ time and an update takes $O(\log k)$ time. Therefore, the time bound for generating a sketch for a subset of size $s$ is $O(sk)$ for a k-mins sketch and $O(s \log k)$ for a bottom-k sketch.

We next show that for uniform weights the expected number of update operations while constructing a bottom-k sketch of a set of size $s$ is $O(k \log s)$. This implies a better bound of $O(s + k \log s \log k)$ on the expected running time to generate a bottom-k sketch.

Lemma 3.1. If items have uniform weights then the expected number of updates to a bottom-k sketch of a set of size $s$ is at most about $k \ln s$.

PROOF. A presented item triggers an update of the current sketch if and only if it has one of the bottom-k ranks among items presented so far. If $j$ items were presented so far, the probability of that happening is $\min\{1, k/j\}$. Summing over all positions in the presentation order we obtain that the expected number of updates is at most $\sum_{j=1}^{s} k/j \approx k \ln s$.

For weighted items we consider two cases. First is the case where items are presented in an order determined by a random permutation.

Lemma 3.2. If items are presented in random order then the expected number of updates to a bottom-k sketch of a set of size $s$ is at most about $k \ln s$.

PROOF. Fix the rank assignment. The probability that the $j$th item in the presentation order has one of the k smallest ranks of the first $j$ items is $\min\{1, k/j\}$. Continue as in the proof of Lemma 3.1.

From Lemma 3.2 it follows that if items are weighted and are presented in random order, the bottom-k sketch is constructed in $O(s + k \log k \log s)$ expected time. To bound the number of updates when items are presented in an arbitrary order we need the rank assignment to define a close to random permutation of the items if weights are, say, within a factor of two from each other. This will hold if the rank functions satisfy the following property.

Definition 3.3.
A family of rank functions is c-moderate if for any $w > 0$ and $0 < w' \le 2w$, there is probability at least $1/c$ that an item drawn according to $f_{w'}$ has a larger rank than an item drawn according to $f_w$.

If the family of rank functions is c-moderate for some constant $c$ and the weights of all items are within a factor of two from each other, then the probability that the rank of a particular item, say $i$, is among the k smallest ranks is at most $ck/j$, where $j$ is the number of items.⁴ One can check that exponential ranks are 3-moderate and priority ranks are 4-moderate.

Lemma 3.4. If items are weighted and presented in arbitrary (worst-case) order, and the family of rank functions is c-moderate for some constant $c$, then the expected number of updates of the bottom-k sketch of a set of size $s$ is $O(k \log(\max_i w(i)/\min_i w(i)) \log s)$.

PROOF. Consider a partition of the items into $\log(\max_i w(i)/\min_i w(i))$ groups according to the weight, so that items of weight in $[2^{\ell} \min_i w(i), 2^{\ell+1} \min_i w(i)]$ are in the same group. We bound the number of updates within one group. From the fact that the rank assignment is c-moderate it follows that the probability of the $j$th presented item in a group to be within the bottom-k items presented so far from its group is at most $ck/j$, and hence the expected number of updates within a group is at most $ck \ln s$. The statement of the lemma follows by summing over all groups.

From Lemma 3.4 it follows that if weighted items are presented in arbitrary order, and the set of rank functions is c-moderate for some constant $c$, then we build the bottom-k sketch in $O(s + k \log(\max_i w(i)/\min_i w(i)) \log s \log k)$ expected time.

3.2 Graph representation of subsets

In some applications, items and locations are embedded in a graph or a metric space and subsets correspond to all items in a certain neighborhood or the reachability set of a node [9, 3, 5].
The computation of the sketches is performed concurrently for all subsets, with items and ranks being propagated in a controlled way such that an item is tested for a subset only if it is fairly likely to occur in the sketch of the subset and the number of test operations is much smaller than with an explicit representation.

⁴ To see that, replace item $i$ by $c$ duplicates, consider a random permutation of the new set of items and the probability that one of the duplicates is among the bottom-k. This probability is smaller than $ck/j$ and larger than the probability that item $i$ is among the bottom k.
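The test and update operations for an explicitly represented subset (Section 3.1) can be sketched in Python as follows. This is an illustrative sketch under assumptions: it keeps the current sketch in a binary max-heap instead of the sorted list described in the text (both give an O(1) test and an O(log k) update), and the class name, the counters, and the simulation parameters are hypothetical.

```python
import heapq
import math
import random


class BottomK:
    """Current bottom-k sketch. A max-heap over negated ranks exposes the
    kth smallest rank seen so far at the root."""

    def __init__(self, k):
        self.k = k
        self.heap = []      # entries (-rank, item)
        self.tests = 0
        self.updates = 0

    def process(self, item, rank):
        self.tests += 1
        # Test: one comparison against the current kth smallest rank.
        if len(self.heap) == self.k and rank >= -self.heap[0][0]:
            return False
        # Update: one O(log k) heap operation.
        self.updates += 1
        if len(self.heap) < self.k:
            heapq.heappush(self.heap, (-rank, item))
        else:
            heapq.heapreplace(self.heap, (-rank, item))
        return True

    def sketch(self):
        """Entries (rank, item) ordered from smallest to largest rank."""
        return sorted((-neg, item) for neg, item in self.heap)


def average_updates(s, k, trials, seed=0):
    """Mean number of updates over `trials` runs with uniform weights."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        current = BottomK(k)
        for item in range(s):
            current.process(item, rng.random())   # uniform weights: any i.i.d. ranks
        total += current.updates
    return total / trials


s, k = 2000, 16
avg_updates = average_updates(s, k, trials=20)
bound = k * math.log(s)
print(f"average updates: {avg_updates:.1f}, k ln s: {bound:.1f}")
```

Every item costs one test, but only a small fraction of items trigger an update, which is the behavior that Lemma 3.1 quantifies for uniform weights.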

We review the computation of sketches for reachability sets of nodes in a graph [9]. In this application each node is an item. Each node computes the sketch of its reachability set. Rank values (and associated information) are propagated using a graph traversal method such as breadth-first or depth-first search. When a rank value does not result in an update at a node, the propagation of the rank value is halted at that node. Therefore, the number of test operations is at most $(m/n)$ times the number of update operations, where $m$ is the number of edges and $n$ the number of nodes.

For k-mins sketches, each item and a rank value associated with it are propagated separately (therefore, k truncated traversals are performed for each item). If, within each rank assignment, items are propagated in increasing rank order, then the combined number of updates for all subsets is $n$. Therefore, the total number of updates, for all k rank assignments and subsets, is $O(kn)$ and the number of tests (and total time) is $O(km)$ [9].

Bottom-k sketches are computed by propagating each item and its associated rank using a truncated graph traversal (note that in contrast to k-mins sketches, one traversal is performed for each item). The current sketch at a node is updated when an item arrives and its rank value is smaller than the kth smallest current rank at the node. The traversal is halted at nodes where the item did not result in an update of the current sketch. When items are presented in increasing rank order, items can only be appended to bottom-k sketches and it is never necessary to remove an item. Therefore, the total number of updates is $O(kn)$ and the total number of tests (and total time) is $O(km)$. These bounds are the same as the bounds obtained for k-mins sketches.

Arbitrary order. When items are not presented ordered by their ranks [3], the number of update operations increases. Similarly to Lemma 3.1 and Lemma 3.4 we prove that

Lemma 3.5. Suppose we maintain the minimum rank in a subset of size $s$.
Then
• if items have uniform weights and are presented in a fixed but arbitrary order, or if items are weighted and presented in a random order, the expected number of updates to the minimum rank is at most about $\ln s$;
• if items are weighted and presented in a fixed but arbitrary order and the family of rank functions is c-moderate, the expected number of updates is $O(\log(\max_i w(i)/\min_i w(i)) \log s)$.

It follows that the total number of updates when computing k-mins sketches of all reachability sets is $O(kn \log n)$ for uniform weights and for weighted items presented in random order, and $O(kn \log(\max_i w(i)/\min_i w(i)) \log n)$ for weighted items presented in arbitrary order. We perform a test or update in $O(1)$ time and the number of tests is at most $m/n$ times the number of updates. Therefore, the total time is $m/n$ times the number of updates.

The number of updates for bottom-k sketches is given in Lemmas 3.1, 3.2, and 3.4. Each update takes $O(\log k)$ time, and a test takes $O(1)$ time. The number of tests is $m/n$ times the number of updates. Therefore, the total time is $O(\log k + m/n)$ times the number of updates given in each of these lemmas.

4. ALL-DISTANCES SKETCHES

An all-distances sketch is an encoding of plain sketches of all neighborhoods of a certain location q. For a given distance d, the sketch for the d-neighborhood of the location can be retrieved from the all-distances sketch.

We review k-mins all-distances sketches and introduce bottom-k all-distances sketches. We consider the size of the all-distances sketches, their construction time, and the time it takes to retrieve the sketch of a particular distance. We consider incremental construction, where current all-distances sketches are maintained and updated upon the arrival of new information (item, distance, rank). The operations we consider are a test that determines if the current sketch needs to be modified when new information arrives, an update of the current sketch, and a distance query issued to the final sketch. The distance query retrieves from the all-distances sketch the plain sketch for the neighborhood of the location q specified by the query distance.
We show that the expected size of the representation of the all-distances bottom-k sketch matches that of the k-mins sketch. When subsets are represented explicitly, the computation time of the all-distances bottom-k sketches is about a factor of k faster than that of the all-distances k-mins sketches. When subsets are represented via a graph, the construction times are comparable.

All-distances k-mins sketches: We review all-distances k-mins sketches. Consider a single rank assignment. An MV/D list of a location q (Minimum Value/Distance List) encodes the minimum rank in any neighborhood (query distance) of q in a compact way. It is a list of triples where each triple contains an item e, its rank, and its distance from q. An item e is in the MV/D list of q if there is no item with smaller rank closer to q. The MV/D list is sorted in increasing distance and decreasing rank order. For a query distance d, the smallest-rank item on the MV/D list of q at distance at most d from q is the item of smallest rank among the items in the d-neighborhood of q. The expected size of the list depends on the rank function and on the weight distribution of the items.

Lemma 4.1. The size of an MV/D list of $n$ weighted items from a location q is bounded as follows:
1. When weights are uniform, the expected size is $O(\log n)$ [9].
2. If weights are arbitrary but items are assigned to locations at random, then the expected size over assignments of items to locations, and over rank assignments, is $O(\log n)$.
3. If items have arbitrary weights and are placed in arbitrary locations, and ranks are assigned using a c-moderate family of rank functions for some constant $c$, then the expected size is $O(\log(\max_i w(i)/\min_i w(i)) \log n)$.

PROOF. Fix the rank assignment. Order the locations in increasing distance from q. The assignment of items to locations defines a random permutation of the ranks. Therefore, the probability that the rank value in location $j$ is smaller than the rank values in all closer locations (and therefore the item occurs on the MV/D list) is $1/j$.
By summing over all positions, we obtain that the expected size of the MV/D list is $\sum_{j=1}^{n} 1/j \approx \ln n$.

If the relation of the weights and the locations of items is arbitrary, the expected size of the MV/D lists depends on the location of items: if item weights are decreasing with distance then the expected size of the MV/D list is smaller, and if item weights are increasing with distance, then the expected size is larger (it can be linear in the worst case). The worst-case size of the MV/D list, however, can be bounded by the weight distribution of the items. The proof of the following lemma is similar to that of Lemma 3.4.

Lemma 4.2. If items have arbitrary weights and are placed in arbitrary locations, and ranks are assigned using a c-moderate family of rank functions for some constant $c$, the expected size of the MV/D list is $O(\log(\max_i w(i)/\min_i w(i)) \log n)$.

PROOF. Let $w_0 = \min_i w(i)$. Consider a partition of the items so that all items with weight in $[w_0 2^i, w_0 2^{i+1})$ are in group i, for $i = 0, \ldots, \log_2(\max_i w(i)/\min_i w(i))$. By the property of c-moderate rank functions, the expected number of items from each group that appear on the MV/D list is logarithmic in the size of the group. Therefore, the total expected number of items on the MV/D list is bounded by $2 \ln n \, (1 + \ln(\max_i w(i)/\min_i w(i)))$.

The MV/D list can be constructed incrementally: when presented with a new item, its rank, and its distance, the list is updated only if the new item has smaller rank than all items on the list that have the same or smaller distance. If items are presented in order of increasing rank (or increasing (distance, rank) in lexicographic order), then items are never removed from the list during updates [9]. Other orders of presenting items were analyzed in [3]. We summarize and extend these results in the following lemma.

Lemma 4.3. Assume that we construct an MV/D list of a location q, and there are n weighted items. Then:
1. When items are presented in random order and weights are uniform, the expected number of updates is $O(\log^2 n)$ [3].
2. If items are assigned to locations at random, the expected number of updates to the MV/D list, over assignments of items to locations, rank assignments, and presentation order of items, is $O(\log^2 n)$.
3. If ranks are assigned using a c-moderate family of rank functions for some constant c, then the expected number of updates to the MV/D list, over rank assignments and presentation order of items, is $O(\log(\max_i w(i)/\min_i w(i)) \log^2 n)$.

All-distances bottom-k sketches: An all-distances bottom-k sketch encodes the bottom-k items in a neighborhood defined by any query distance from a location q. The all-distances bottom-k sketch is a data structure that generalizes a single MV/D list. An item i, its rank value r(i), and its distance d(i) are represented in the sketch if and only if the item has one of the bottom-k ranks in the d(i)-neighborhood of the location.
It is convenient to think of the all-distances bottom-k sketch as a list of lists arranged by increasing distance. For each distance d where the set of bottom-k items within distance d changes, we record the list of bottom-k items within this distance. This list is valid until the next distance for which there is a change. The list-of-lists representation, however, is not storage efficient, since all but one item are repeated in two consecutive lists. The sketch can be represented more compactly if we only record the changes to the list. In Section 5 we discuss compact representations for an all-distances bottom-k sketch that require storage proportional to the number of distances where the bottom-k set changes.

We bound the number of distances for which the bottom-k list changes. These bounds imply that the storage for an all-distances bottom-k sketch is comparable to the storage for the k MV/D lists in an all-distances k-mins sketch.

Lemma 4.4. Consider an all-distances bottom-k sketch for n items of a location q. We bound the expected number of distances from q where the set of bottom-k items changes.
1. For uniform weights, the expected number of distances is $O(k \log n)$.
2. For a set of items with arbitrary weights that are randomly assigned to locations, the expected number of distances (over assignments of items to locations and over rank assignments) is $O(k \log n)$.
3. If items have arbitrary weights and are placed in arbitrary locations, and ranks are assigned using a c-moderate family of rank functions for some constant c, the expected number of distances is $O(k \log(\max_i w(i)/\min_i w(i)) \log n)$.

PROOF. Order the items by increasing distance from q. Let d(j) be the distance of the jth item in this order from q. The jth item is in the bottom-k set of items within distance d(j) from q if it has one of the k smallest ranks among the j closest items to q. Since weights are uniform, the ranks define a random permutation of the items which is independent of their distances to q. So the jth item is among the smallest k with probability $\min\{k/j, 1\}$.
Summing over all items, we obtain that the expected number of items which are among the k smallest items within their distance from q is at most

$$\sum_{j=1}^{n} \frac{k}{j} \approx k \ln n .$$

As in Lemma 3.1 and Lemma 3.4, for weighted items we can show the following.

Lemma 4.5.
1. For a set of items with arbitrary weights and a set of locations, the expected number of distances from a location q where the set of bottom-k items changes, over assignments of items to locations and over rank assignments, is $O(k \log n)$.
2. If items have arbitrary weights and are placed in arbitrary locations, and ranks are assigned using a c-moderate family of rank functions for some constant c, the expected number of distances from a location q where the set of bottom-k items changes is $O(k \log(\max_i w(i)/\min_i w(i)) \log n)$.

If items are presented in order of increasing distance from q, we can obtain the bottom-k list for the current distance from the bottom-k list of the previous distance by doing an insertion and a deletion. Similarly, if items arrive sorted by rank value, then the number of updates to the bottom-k sketch is proportional to the size (number of breakpoint distances) of the sketch. We can also bound the number of updates performed if items arrive in a random order.

Lemma 4.6. Consider the expected number of updates performed in an incremental construction of an all-distances bottom-k sketch of a location q when items are presented in a random order (the order is a random permutation).
1. When item weights are uniform, the expected number of updates is $O(k \log^2 n)$.
2. When items have arbitrary weights, the expected number of updates, over assignments of weights to locations, rank assignments, and arrival order, is $O(k \log^2 n)$.
3. When items have arbitrary weights and the family of rank functions is c-moderate, the expectation over rank assignments and arrival orders of the number of updates is $O(k \log(\max_i w(i)/\min_i w(i)) \log^2 n)$.

PROOF. Consider uniform weights (Part 1). An item results in an update if, at the time it is presented, it has one of the k smallest ranks among the items already presented that are at least as close to q. Consider the jth closest item to q. It has probability $1/j$ of having the ith smallest rank among all items that are at least as close to the location. We now calculate the probability that the item results in an update given that it has the ith smallest rank. Consider the $i-1$ items that have smaller ranks and are at least as close. The probability that at most $k-1$ of them are presented before our item is that of being in one of the first k positions in a random permutation of i items, which is $\min\{k/i, 1\}$. We obtain that the expected number of updates for the jth closest item is

$$\sum_{i=1}^{j} \frac{\min\{k/i, 1\}}{j} \le \frac{1}{j} \sum_{i=1}^{j} \frac{k}{i} \approx \frac{k}{j} \ln j .$$

Summing over all n items, we obtain that the expected number of updates is

$$\sum_{j=1}^{n} \frac{k}{j} \ln j \le k \ln^2 n .$$

The proofs of Part 2 and Part 3 follow by the same arguments as for Lemma 3.1 and Lemma 3.4.

As in the case of a single sketch in Section 3, the number of test operations depends on the representation of the subsets. If this representation is explicit, then, since a k-mins sketch consists of k independent MV/D lists, the number of tests required for a k-mins sketch is larger by a factor of k than for a bottom-k sketch. In a graph representation, the number of tests is at most (m/n) times the number of updates for both kinds of sketches. In Section 5 we discuss representations of sketches that allow efficient implementations of test and update operations.

5. REPRESENTATIONS OF SKETCHES

We consider possible representations for k-mins sketches and bottom-k sketches. We are interested in bounding the size of the data structure that encodes the sketch, and the time required to incrementally construct the sketch when items are presented in sorted or other orders. For all-distances sketches we also consider the time it takes to find the sketch for a particular query distance.

Representation of an MV/D list: An efficient data structure for MV/D list construction and querying was not explicitly discussed in earlier works.
If items arrive sorted, by increasing rank value or increasing distance, we represent an MV/D list sorted by increasing distances (and decreasing ranks) as a binary search tree. With this representation we can support distance queries in expected $O(\log M)$ time, where M is the expected size of the list. If items do not arrive in a sorted order, we represent the current MV/D list as a dynamic binary search tree. Test operations then require expected $O(\log M)$ time. An update is performed in $O(\log M)$ expected amortized time: each item requires an insertion into the tree if it has the smallest rank within its distance from the query location, and possibly a series of deletions of items which are further away from the query location and of larger rank. Since each item can be deleted at most once, we can charge each deletion to the respective insertion.

The all-distances k-mins sketch consists of k independent MV/D lists, one for each rank assignment. Therefore, for any query distance, we can obtain the min-rank sketch over the items that lie within that distance in $O(k \log M)$ time, by searching independently in each of the k lists. The query time can be improved to $O(k + \log M)$ using fractional cascading [7]. Using fractional cascading, we perform a binary search on only one list and use links between items to find the position in the next list in $O(1)$ time. Another approach to obtain an $O(k + \log M)$ bound per query is to use an interval tree or a segment tree (see, e.g., [6]) to represent the kM intervals defined by consecutive points on the same list. We can then do stabbing queries to find the k intervals of a query distance, which correspond to the min-rank in that neighborhood in each of the k rank functions.

Constructing and querying the bottom-k sketch: A natural representation for a single bottom-k sketch is a list of the items sorted by increasing ranks, represented as a search tree, as mentioned in Section 3.
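As an illustration of the MV/D list maintenance and the $O(k \log M)$ query over k independent lists described above, here is a minimal Python sketch. It is not taken from the paper: the class name `MVDList`, the function `kmins_query`, and the use of plain sorted Python lists with `bisect` in place of a balanced search tree are assumptions made for brevity (list splicing is linear time, so only the search steps are logarithmic). Ranks are assumed to be distinct.

```python
import bisect


class MVDList:
    """Hypothetical MV/D list of one location (illustrative names only).

    Entries are kept sorted by increasing distance; the invariant
    maintained by insert() makes their ranks strictly decreasing.
    """

    def __init__(self):
        self.dists = []
        self.ranks = []
        self.items = []

    def insert(self, item, rank, dist):
        """Offer an item; return True if the list was updated."""
        # Entries [0, pos) have distance <= dist.
        pos = bisect.bisect_right(self.dists, dist)
        # The smallest rank within distance `dist` sits at pos - 1.
        if pos > 0 and self.ranks[pos - 1] <= rank:
            return False  # test failed: some item at least as close has smaller rank
        # Delete entries that are at least as far away and have larger rank.
        start = bisect.bisect_left(self.dists, dist)
        end = pos
        while end < len(self.ranks) and self.ranks[end] > rank:
            end += 1
        self.dists[start:end] = [dist]
        self.ranks[start:end] = [rank]
        self.items[start:end] = [item]
        return True

    def min_rank_item(self, d):
        """Item of smallest rank within distance d, or None if there is none."""
        pos = bisect.bisect_right(self.dists, d)
        return None if pos == 0 else self.items[pos - 1]


def kmins_query(lists, d):
    """k-mins sketch at distance d: one binary search per MV/D list."""
    return [lst.min_rank_item(d) for lst in lists]
```

Each deleted entry was inserted exactly once, which is the charging argument used above for the amortized update bound.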
However, for an all-distances bottom-k sketch one needs to be more careful, so that the size of the representation is proportional to the number of distances where the list changes, as mentioned in Section 4. We suggest possible efficient representations for an all-distances bottom-k sketch.

Ordered insertion of items: When items are presented in an order related to their distances or ranks, we can use the following data structures. If items are presented in order of increasing distance from q, we can obtain the bottom-k list for the current distance from the bottom-k list of the previous distance by doing an insertion and a deletion. If we use a persistent list [7] to represent each bottom-k list, then we can update a bottom-k list to obtain the next one in $O(k)$ time while consuming only $O(1)$ space. We can reduce the update time to $O(\log k)$ by using persistent search trees instead of persistent lists; the space required per operation is still $O(1)$.

We can also construct the all-distances bottom-k sketch if items are presented in order of increasing ranks, so that it takes space proportional to the number of updates. We construct the first list after the k items with smallest ranks are presented. This list is associated with the distance of the item among these k which is furthest from the query location q. When the next item arrives, say item j, if item j is closer to q than any of the already seen items, we construct a new bottom-k list $L'$. Assume that the previous list L which we constructed was associated with distance $d > d(j)$. We construct $L'$ from L by deleting from L the item at distance d from q and adding item j instead. The distance associated with $L'$ is the distance of the furthest item in $L'$ from q. Using persistent lists or persistent search trees to represent the bottom-k lists, we construct all lists in space which is proportional to the number of updates. The update time is $O(k)$ with persistent lists and $O(\log k)$ with persistent trees (we keep the items in each list sorted by increasing distances from q).
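The increasing-distance construction above can be sketched as follows. This is a simplified illustration, not the paper's data structure: instead of persistent lists or trees it records a plain change log (one record per distance at which the bottom-k set changes) and answers a query by replaying the log, which takes time linear in the log rather than $O(\log M + k)$. The function names and the `(item, rank, dist)` tuple layout are hypothetical.

```python
import heapq


def build_change_log(items, k):
    """Build an all-distances bottom-k change log for one location.

    items: iterable of (item_id, rank, dist) tuples.
    Returns a list of (dist, added_item, evicted_item_or_None) records,
    ordered by increasing distance.
    """
    changes = []
    heap = []  # max-heap on rank via negation: entries are (-rank, item_id)
    for item_id, rank, dist in sorted(items, key=lambda t: t[2]):
        if len(heap) < k:
            heapq.heappush(heap, (-rank, item_id))
            changes.append((dist, item_id, None))
        elif rank < -heap[0][0]:
            # New item beats the current largest bottom-k rank:
            # one insertion and one deletion.
            _, evicted = heapq.heapreplace(heap, (-rank, item_id))
            changes.append((dist, item_id, evicted))
    return changes


def bottom_k_at(changes, d):
    """Replay the change log up to distance d and return the bottom-k set."""
    current = set()
    for dist, added, evicted in changes:
        if dist > d:
            break
        current.add(added)
        current.discard(evicted)
    return current
```

The length of the returned log is the quantity bounded in Lemma 4.4; a persistent structure would keep the same number of versions while making each one directly searchable.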
Insertion of items in arbitrary order: To support arbitrary insertion order, we can think of the all-distances bottom-k sketch as a set of intervals on a line. Each item corresponds to an interval over the range of distances in which it is a bottom-k item. Let D be the current set of intervals. A query is a point stabbing query: the bottom-k list consists of the set of intervals in D intersecting the query point.

When a new item z arrives at distance d, we should figure out whether the sketch should be updated. Let $I_1 = [d_1, d_2)$ be the interval spanning distance d with the largest rank. We should update the sketch if the rank of z is smaller than the rank of the item corresponding to $I_1$. We update the sketch as follows. We replace $I_1$ with $I_1' = [d_1, d)$. Then we find the interval $I_2 = [d_2, d_3)$ with largest rank at distance $d_2$. If the rank of $I_2$ is larger than the rank of z, we delete $I_2$, and we continue in the same way, finding for $i > 2$ the interval $I_i$ of largest rank at distance $d_i$, and deleting $I_i$ if the rank of the corresponding item is larger than the rank of z. Let $d_j$ be the right endpoint of the last interval which we deleted. We insert the interval $[d, d_j)$ corresponding to item z.

Since each interval is inserted and deleted once, the total number of insertions and deletions of intervals is proportional to the number of intervals. An interval I may be split many times. However, each split of I is associated with a newly inserted interval immediately following I. Since each inserted interval may cause at most one split, the total number of splits is also proportional to the total number of intervals.
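The update test for arbitrary insertion order can be illustrated with a deliberately naive Python sketch. It does not implement the interval bookkeeping described above; as a stated simplification, it stores only the set of sketch members and uses linear scans where the text uses stabbing queries, so each insertion costs time quadratic in the sketch size. It relies on the observation that an item belongs to the sketch exactly when fewer than k items that are at least as close have smaller rank. All names are hypothetical.

```python
def dominators(members, rank, dist):
    """Count sketch members at distance <= dist that have smaller rank."""
    return sum(1 for (_, r, d) in members if d <= dist and r < rank)


def offer(members, k, item, rank, dist):
    """Offer a new item in arbitrary order; return True if the sketch changed.

    members: list of (item, rank, dist) tuples, modified in place.
    """
    # Test: the item enters only if it is a bottom-k item at its own distance.
    if dominators(members, rank, dist) >= k:
        return False
    members.append((item, rank, dist))
    # Update: drop members that now have k or more dominators.
    # The right-hand side is evaluated on the list that includes the new item.
    members[:] = [m for m in members if dominators(members, m[1], m[2]) < k]
    return True


def bottom_k_within(members, k, d):
    """Stabbing query: the bottom-k items within distance d."""
    near = sorted((m for m in members if m[2] <= d), key=lambda m: m[1])
    return [m[0] for m in near[:k]]
```

With k = 1 the member set coincides with the MV/D list, which gives a quick sanity check of the sketch.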

To support these interval operations, we can maintain the intervals either in a dynamic interval tree or in a dynamic segment tree [8]. Let M denote the number of intervals in the tree. A dynamic interval tree takes $O(M)$ space, and using it we can report the k intervals stabbed at a particular distance in $O(\log(M) \log(k) + k)$ time. We can update an interval tree in $O(\log(M) \log(k))$ amortized time. A dynamic segment tree requires $O(M \log M)$ space and supports queries in $O(\log(M) + k)$ time and updates in $O(\log(M) \log(k))$ amortized time. By a standard modification to an interval tree, in which we store at every secondary node the item of maximum rank in its subtree, we can find the interval of maximum rank stabbed by a query distance in $O(\log(M) \log(k))$ time. Similarly, by maintaining at each node of a segment tree the maximum-rank interval that it contains, we can find the maximum-rank interval stabbed by a query distance in $O(\log(M))$ time. This allows us to test whether the bottom-k sketch changes when a new item arrives in polylogarithmic time. (This is in contrast with $O(k \log(n))$ time for the k independent MV/D lists that form a k-mins all-distances sketch.)

6. MIMICKED SAMPLING WITH REPLACEMENT

We present a randomized procedure that uses a WS-sketch (weighted sampling without replacement until k items are obtained) to emulate weighted sampling with replacement. Using this process, we can derive a size-k WSR-sketch from a size-k WS-sketch. By mimicking we mean that the probability of obtaining a particular sketch by first obtaining a WS-sketch and then applying the procedure is the same as when directly obtaining a WSR-sketch.

The process is described as generating a sequence of items (and rank values). The process is randomized, and therefore every WS-sketch b corresponds to a distribution M(b) over such sequences. If we stop the process after k samples, we obtain a WSR-sketch. We can use a different stopping rule and continue until the $(k+1)$st distinct item is sampled. We refer to a weighted sample with replacement with this stopping rule as a WSRD-sketch.
The WSRD-sketch contains the same set of items as the WS-sketch, but also has a count for each item that corresponds to the number of times the item is sampled until the process is stopped.

Mimicking allows us to apply an estimator $\nu$ designed for WSR-sketches or WSRD-sketches to WS-sketches. A WS-sketch estimator can be obtained by drawing a mimicked sketch $s \sim M(b)$ using this process and returning $\nu(s)$. This estimator is equivalent to using the estimator $\nu$ on WSR-sketches or WSRD-sketches. The estimator $\nu'(b) = E(\nu(s) \mid s \sim M(b))$ has lower variance (a consequence of Lemma 2.3). It can be approximated by taking the average of $\nu(s)$ over multiple draws of $s \sim M(b)$; this approximation preserves unbiasedness. An estimator with still lower variance (another consequence of Lemma 2.3) is obtained by considering the subspace $L(b)$ of WS-sketches with the same subset of items as b and, if w(j) is not provided, the same rank value $r_{k+1}$. $L(b)$ defines an equivalence relation that partitions the sample space. The estimator $\nu''(b) = E(\nu'(b') \mid b' \in L(b))$ can be approximated by averaging $\nu'(b')$ over multiple draws of $b' \in L(b)$.

We first provide a mimicking process for the case where the total weight $w(I)$ of the ground set is known. Let $i_1, \ldots, i_k$ be the items in the WS-sketch b, ordered by increasing ranks. The first item in the mimicked sample is $i_1$. We then select $i_1$ with probability $w(i_1)/w(I)$ and $i_2$ otherwise, and repeat this until we have k samples or until $i_2$ is selected. In phase j, after outputting at least one sample of each of $i_1, \ldots, i_j$, we select $i_l$ with probability $w(i_l)/w(I)$ (for $l \le j$) and $i_{j+1}$ otherwise. Each phase can be simulated efficiently using the geometric distribution to determine the number of samples until the next item from b is sampled, and the multinomial distribution to determine the number of times each item is sampled.

We now provide a mimicking procedure for the case where $w(I)$ is not known. The procedure is applied to an ordered sketch where all items have rank values. We use properties of the exponential distribution and the ranks of the items in the WS-sketch.
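The mimicking process for known $w(I)$ described above can be sketched in Python as follows. This is an illustrative, sample-by-sample simulation rather than the efficient geometric/multinomial simulation mentioned in the text; the function name `mimic_wsr` and its parameters are hypothetical, and the sketch is assumed to hold at least k items.

```python
import random


def mimic_wsr(ws_items, weights, total_weight, k, rng=random):
    """Draw a size-k mimicked WSR sample from a WS-sketch.

    ws_items: sketch items ordered by increasing rank (at least k of them).
    weights: dict mapping each sketch item to its weight w(i).
    total_weight: w(I), the total weight of the ground set.
    rng: object with a random() method (the random module by default).
    """
    sample = [ws_items[0]]  # the first mimicked sample is i_1
    seen = 1                # phase j: i_1, ..., i_j have been output
    while len(sample) < k:
        u = rng.random() * total_weight
        acc = 0.0
        chosen = None
        # Select an already seen item i_l with probability w(i_l) / w(I).
        for it in ws_items[:seen]:
            acc += weights[it]
            if u < acc:
                chosen = it
                break
        if chosen is None:
            # Otherwise output the next sketch item and start the next phase.
            chosen = ws_items[seen]
            seen += 1
        sample.append(chosen)
    return sample
```

Stopping instead when the $(k+1)$st distinct item would be drawn, and keeping a count per item, would give the WSRD variant.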
We first establish a few lemmas about the distribution of the differences between the ranks of the items in a WS-sketch. The first lemma follows from the memoryless nature of the exponential distribution.

Lemma 6.1. Consider a subspace of rank assignments where the order of the items according to rank values is fixed, say $i_1, \ldots, i_n$, and the rank values of the first j items are fixed. Let $r(i_{j+1})$ be the random variable that is the $(j+1)$st smallest rank. The conditional distribution of $r(i_{j+1}) - r(i_j)$ is exponential with parameter $\sum_{h=j+1}^{n} w(i_h)$.

PROOF. Since rank values of different items are independent, the probability density for the event that items $i_1, \ldots, i_j$ have the bottom-j ranks with the values $r(i_1) < \cdots < r(i_j)$ and items $i_{j+1}, \ldots, i_n$ have the next $n - j$ smallest ranks in that order is the product $p_1 p_2$, where

$$p_1 = w(i_1) \exp(-r(i_1) w(i_1)) \, w(i_2) \exp(-r(i_2) w(i_2)) \cdots w(i_j) \exp(-r(i_j) w(i_j))$$

is the probability density that the items $i_1, \ldots, i_j$ have the rank values $r(i_1), \ldots, r(i_j)$, and

$$p_2 = \int_{r(i_j)}^{\infty} w(i_{j+1}) \exp(-x_{j+1} w(i_{j+1})) \int_{x_{j+1}}^{\infty} w(i_{j+2}) \exp(-x_{j+2} w(i_{j+2})) \cdots \int_{x_{n-1}}^{\infty} w(i_n) \exp(-x_n w(i_n)) \, dx_n \cdots dx_{j+2} \, dx_{j+1}$$

is the probability that items $i_{j+1}, \ldots, i_n$ have rank values in that order and all larger than $r(i_j)$. Performing the integration, we obtain

$$p_2 = p_3 \exp\Big(-r(i_j) \sum_{h=j+1}^{n} w(i_h)\Big),$$

where

$$p_3 = \frac{w(i_{j+1})}{\sum_{h=j+1}^{n} w(i_h)} \cdot \frac{w(i_{j+2})}{\sum_{h=j+2}^{n} w(i_h)} \cdots \frac{w(i_{n-1})}{w(i_{n-1}) + w(i_n)} .$$

($p_3$ is the probability that the rank values of items $i_{j+1}, \ldots, i_n$ are in that order, and $\exp(-r(i_j) \sum_{h=j+1}^{n} w(i_h))$ is the probability that the minimum rank among $i_{j+1}, \ldots, i_n$ is at least $r(i_j)$.) Therefore, the probability density is

$$p_1 p_2 = p_1 p_3 \exp\Big(-r(i_j) \sum_{h=j+1}^{n} w(i_h)\Big). \qquad (1)$$

We next calculate the probability density for the following event: items $i_1, \ldots, i_n$ have increasing ranks, the bottom-j ranks are equal to $r(i_1) < \cdots < r(i_j)$, and the $(j+1)$st rank has value $r(i_j) + d$. It follows from independence of the rank values that the probability density is


Insertion Sort. Divide and Conquer Sorting. Divide and Conquer. Mergesort. Mergesort Example. Auxiliary Array

Insertion Sort. Divide and Conquer Sorting. Divide and Conquer. Mergesort. Mergesort Example. Auxiliary Array Inserton Sort Dvde and Conquer Sortng CSE 6 Data Structures Lecture 18 What f frst k elements of array are already sorted? 4, 7, 1, 5, 1, 16 We can shft the tal of the sorted elements lst down and then

More information

A New Approach For the Ranking of Fuzzy Sets With Different Heights

A New Approach For the Ranking of Fuzzy Sets With Different Heights New pproach For the ankng of Fuzzy Sets Wth Dfferent Heghts Pushpnder Sngh School of Mathematcs Computer pplcatons Thapar Unversty, Patala-7 00 Inda pushpndersnl@gmalcom STCT ankng of fuzzy sets plays

More information

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur

FEATURE EXTRACTION. Dr. K.Vijayarekha. Associate Dean School of Electrical and Electronics Engineering SASTRA University, Thanjavur FEATURE EXTRACTION Dr. K.Vjayarekha Assocate Dean School of Electrcal and Electroncs Engneerng SASTRA Unversty, Thanjavur613 41 Jont Intatve of IITs and IISc Funded by MHRD Page 1 of 8 Table of Contents

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS

A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Proceedngs of the Wnter Smulaton Conference M E Kuhl, N M Steger, F B Armstrong, and J A Jones, eds A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Mark W Brantley Chun-Hung

More information

Reducing Frame Rate for Object Tracking

Reducing Frame Rate for Object Tracking Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg

More information

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration

Improvement of Spatial Resolution Using BlockMatching Based Motion Estimation and Frame. Integration Improvement of Spatal Resoluton Usng BlockMatchng Based Moton Estmaton and Frame Integraton Danya Suga and Takayuk Hamamoto Graduate School of Engneerng, Tokyo Unversty of Scence, 6-3-1, Nuku, Katsuska-ku,

More information

Module Management Tool in Software Development Organizations

Module Management Tool in Software Development Organizations Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,

More information

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes

R s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes SPH3UW Unt 7.3 Sphercal Concave Mrrors Page 1 of 1 Notes Physcs Tool box Concave Mrror If the reflectng surface takes place on the nner surface of the sphercal shape so that the centre of the mrror bulges

More information

Ramsey numbers of cubes versus cliques

Ramsey numbers of cubes versus cliques Ramsey numbers of cubes versus clques Davd Conlon Jacob Fox Choongbum Lee Benny Sudakov Abstract The cube graph Q n s the skeleton of the n-dmensonal cube. It s an n-regular graph on 2 n vertces. The Ramsey

More information

Non-Split Restrained Dominating Set of an Interval Graph Using an Algorithm

Non-Split Restrained Dominating Set of an Interval Graph Using an Algorithm Internatonal Journal of Advancements n Research & Technology, Volume, Issue, July- ISS - on-splt Restraned Domnatng Set of an Interval Graph Usng an Algorthm ABSTRACT Dr.A.Sudhakaraah *, E. Gnana Deepka,

More information

ELEC 377 Operating Systems. Week 6 Class 3

ELEC 377 Operating Systems. Week 6 Class 3 ELEC 377 Operatng Systems Week 6 Class 3 Last Class Memory Management Memory Pagng Pagng Structure ELEC 377 Operatng Systems Today Pagng Szes Vrtual Memory Concept Demand Pagng ELEC 377 Operatng Systems

More information

Lecture #15 Lecture Notes

Lecture #15 Lecture Notes Lecture #15 Lecture Notes The ocean water column s very much a 3-D spatal entt and we need to represent that structure n an economcal way to deal wth t n calculatons. We wll dscuss one way to do so, emprcal

More information

Some Advanced SPC Tools 1. Cumulative Sum Control (Cusum) Chart For the data shown in Table 9-1, the x chart can be generated.

Some Advanced SPC Tools 1. Cumulative Sum Control (Cusum) Chart For the data shown in Table 9-1, the x chart can be generated. Some Advanced SP Tools 1. umulatve Sum ontrol (usum) hart For the data shown n Table 9-1, the x chart can be generated. However, the shft taken place at sample #21 s not apparent. 92 For ths set samples,

More information

Programming in Fortran 90 : 2017/2018

Programming in Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Exercse 1 : Evaluaton of functon dependng on nput Wrte a program who evaluate the functon f (x,y) for any two user specfed values

More information

Parallel matrix-vector multiplication

Parallel matrix-vector multiplication Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more

More information

Concurrent Apriori Data Mining Algorithms

Concurrent Apriori Data Mining Algorithms Concurrent Apror Data Mnng Algorthms Vassl Halatchev Department of Electrcal Engneerng and Computer Scence York Unversty, Toronto October 8, 2015 Outlne Why t s mportant Introducton to Assocaton Rule Mnng

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information

3D vector computer graphics

3D vector computer graphics 3D vector computer graphcs Paolo Varagnolo: freelance engneer Padova Aprl 2016 Prvate Practce ----------------------------------- 1. Introducton Vector 3D model representaton n computer graphcs requres

More information

USING GRAPHING SKILLS

USING GRAPHING SKILLS Name: BOLOGY: Date: _ Class: USNG GRAPHNG SKLLS NTRODUCTON: Recorded data can be plotted on a graph. A graph s a pctoral representaton of nformaton recorded n a data table. t s used to show a relatonshp

More information

Synthesizer 1.0. User s Guide. A Varying Coefficient Meta. nalytic Tool. Z. Krizan Employing Microsoft Excel 2007

Synthesizer 1.0. User s Guide. A Varying Coefficient Meta. nalytic Tool. Z. Krizan Employing Microsoft Excel 2007 Syntheszer 1.0 A Varyng Coeffcent Meta Meta-Analytc nalytc Tool Employng Mcrosoft Excel 007.38.17.5 User s Gude Z. Krzan 009 Table of Contents 1. Introducton and Acknowledgments 3. Operatonal Functons

More information

y and the total sum of

y and the total sum of Lnear regresson Testng for non-lnearty In analytcal chemstry, lnear regresson s commonly used n the constructon of calbraton functons requred for analytcal technques such as gas chromatography, atomc absorpton

More information

TN348: Openlab Module - Colocalization

TN348: Openlab Module - Colocalization TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages

More information

Lecture 5: Probability Distributions. Random Variables

Lecture 5: Probability Distributions. Random Variables Lecture 5: Probablty Dstrbutons Random Varables Probablty Dstrbutons Dscrete Random Varables Contnuous Random Varables and ther Dstrbutons Dscrete Jont Dstrbutons Contnuous Jont Dstrbutons Independent

More information

Priority queues and heaps Professors Clark F. Olson and Carol Zander

Priority queues and heaps Professors Clark F. Olson and Carol Zander Prorty queues and eaps Professors Clark F. Olson and Carol Zander Prorty queues A common abstract data type (ADT) n computer scence s te prorty queue. As you mgt expect from te name, eac tem n te prorty

More information

F Geometric Mean Graphs

F Geometric Mean Graphs Avalable at http://pvamu.edu/aam Appl. Appl. Math. ISSN: 1932-9466 Vol. 10, Issue 2 (December 2015), pp. 937-952 Applcatons and Appled Mathematcs: An Internatonal Journal (AAM) F Geometrc Mean Graphs A.

More information

FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK

FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK FINDING IMPORTANT NODES IN SOCIAL NETWORKS BASED ON MODIFIED PAGERANK L-qng Qu, Yong-quan Lang 2, Jng-Chen 3, 2 College of Informaton Scence and Technology, Shandong Unversty of Scence and Technology,

More information

RAP. Speed/RAP/CODA. Real-time Systems. Modeling the sensor networks. Real-time Systems. Modeling the sensor networks. Real-time systems:

RAP. Speed/RAP/CODA. Real-time Systems. Modeling the sensor networks. Real-time Systems. Modeling the sensor networks. Real-time systems: Speed/RAP/CODA Presented by Octav Chpara Real-tme Systems Many wreless sensor network applcatons requre real-tme support Survellance and trackng Border patrol Fre fghtng Real-tme systems: Hard real-tme:

More information

Adaptive Load Shedding for Windowed Stream Joins

Adaptive Load Shedding for Windowed Stream Joins Adaptve Load Sheddng for Wndowed Stream Jons Bu gra Gedk College of Computng, GaTech bgedk@cc.gatech.edu Kun-Lung Wu, Phlp Yu T.J. Watson Research, IBM {klwu,psyu}@us.bm.com Lng Lu College of Computng,

More information

Smoothing Spline ANOVA for variable screening

Smoothing Spline ANOVA for variable screening Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory

More information

Analysis of Collaborative Distributed Admission Control in x Networks

Analysis of Collaborative Distributed Admission Control in x Networks 1 Analyss of Collaboratve Dstrbuted Admsson Control n 82.11x Networks Thnh Nguyen, Member, IEEE, Ken Nguyen, Member, IEEE, Lnha He, Member, IEEE, Abstract Wth the recent surge of wreless home networks,

More information

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr)

Helsinki University Of Technology, Systems Analysis Laboratory Mat Independent research projects in applied mathematics (3 cr) Helsnk Unversty Of Technology, Systems Analyss Laboratory Mat-2.08 Independent research projects n appled mathematcs (3 cr) "! #$&% Antt Laukkanen 506 R ajlaukka@cc.hut.f 2 Introducton...3 2 Multattrbute

More information

Image Representation & Visualization Basic Imaging Algorithms Shape Representation and Analysis. outline

Image Representation & Visualization Basic Imaging Algorithms Shape Representation and Analysis. outline mage Vsualzaton mage Vsualzaton mage Representaton & Vsualzaton Basc magng Algorthms Shape Representaton and Analyss outlne mage Representaton & Vsualzaton Basc magng Algorthms Shape Representaton and

More information

Sorting. Sorting. Why Sort? Consistent Ordering

Sorting. Sorting. Why Sort? Consistent Ordering Sortng CSE 6 Data Structures Unt 15 Readng: Sectons.1-. Bubble and Insert sort,.5 Heap sort, Secton..6 Radx sort, Secton.6 Mergesort, Secton. Qucksort, Secton.8 Lower bound Sortng Input an array A of data

More information

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers

Content Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth

More information

CSE 326: Data Structures Quicksort Comparison Sorting Bound

CSE 326: Data Structures Quicksort Comparison Sorting Bound CSE 326: Data Structures Qucksort Comparson Sortng Bound Steve Setz Wnter 2009 Qucksort Qucksort uses a dvde and conquer strategy, but does not requre the O(N) extra space that MergeSort does. Here s the

More information

GSLM Operations Research II Fall 13/14

GSLM Operations Research II Fall 13/14 GSLM 58 Operatons Research II Fall /4 6. Separable Programmng Consder a general NLP mn f(x) s.t. g j (x) b j j =. m. Defnton 6.. The NLP s a separable program f ts objectve functon and all constrants are

More information

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields

A mathematical programming approach to the analysis, design and scheduling of offshore oilfields 17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and

More information

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data

A Fast Content-Based Multimedia Retrieval Technique Using Compressed Data A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,

More information

Greedy Technique - Definition

Greedy Technique - Definition Greedy Technque Greedy Technque - Defnton The greedy method s a general algorthm desgn paradgm, bult on the follong elements: confguratons: dfferent choces, collectons, or values to fnd objectve functon:

More information

Kent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming

Kent State University CS 4/ Design and Analysis of Algorithms. Dept. of Math & Computer Science LECT-16. Dynamic Programming CS 4/560 Desgn and Analyss of Algorthms Kent State Unversty Dept. of Math & Computer Scence LECT-6 Dynamc Programmng 2 Dynamc Programmng Dynamc Programmng, lke the dvde-and-conquer method, solves problems

More information

Self-tuning Histograms: Building Histograms Without Looking at Data

Self-tuning Histograms: Building Histograms Without Looking at Data Self-tunng Hstograms: Buldng Hstograms Wthout Lookng at Data Ashraf Aboulnaga Computer Scences Department Unversty of Wsconsn - Madson ashraf@cs.wsc.edu Surajt Chaudhur Mcrosoft Research surajtc@mcrosoft.com

More information

Today s Outline. Sorting: The Big Picture. Why Sort? Selection Sort: Idea. Insertion Sort: Idea. Sorting Chapter 7 in Weiss.

Today s Outline. Sorting: The Big Picture. Why Sort? Selection Sort: Idea. Insertion Sort: Idea. Sorting Chapter 7 in Weiss. Today s Outlne Sortng Chapter 7 n Wess CSE 26 Data Structures Ruth Anderson Announcements Wrtten Homework #6 due Frday 2/26 at the begnnng of lecture Proect Code due Mon March 1 by 11pm Today s Topcs:

More information

Simulation Based Analysis of FAST TCP using OMNET++

Simulation Based Analysis of FAST TCP using OMNET++ Smulaton Based Analyss of FAST TCP usng OMNET++ Umar ul Hassan 04030038@lums.edu.pk Md Term Report CS678 Topcs n Internet Research Sprng, 2006 Introducton Internet traffc s doublng roughly every 3 months

More information

1 Dynamic Connectivity

1 Dynamic Connectivity 15-850: Advanced Algorthms CMU, Sprng 2017 Lecture #3: Dynamc Graph Connectvty algorthms 01/30/17 Lecturer: Anupam Gupta Scrbe: Hu Han Chn, Jacob Imola Dynamc graph algorthms s the study of standard graph

More information

A Geometric Approach for Multi-Degree Spline

A Geometric Approach for Multi-Degree Spline L X, Huang ZJ, Lu Z. A geometrc approach for mult-degree splne. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 27(4): 84 850 July 202. DOI 0.007/s390-02-268-2 A Geometrc Approach for Mult-Degree Splne Xn L

More information

Adaptive Load Shedding for Windowed Stream Joins

Adaptive Load Shedding for Windowed Stream Joins Adaptve Load Sheddng for Wndowed Stream Jons Buğra Gedk, Kun-Lung Wu, Phlp S. Yu, Lng Lu College of Computng, Georga Tech Atlanta GA 333 {bgedk,lnglu}@cc.gatech.edu IBM T. J. Watson Research Center Yorktown

More information

Sorting: The Big Picture. The steps of QuickSort. QuickSort Example. QuickSort Example. QuickSort Example. Recursive Quicksort

Sorting: The Big Picture. The steps of QuickSort. QuickSort Example. QuickSort Example. QuickSort Example. Recursive Quicksort Sortng: The Bg Pcture Gven n comparable elements n an array, sort them n an ncreasng (or decreasng) order. Smple algorthms: O(n ) Inserton sort Selecton sort Bubble sort Shell sort Fancer algorthms: O(n

More information

Report on On-line Graph Coloring

Report on On-line Graph Coloring 2003 Fall Semester Comp 670K Onlne Algorthm Report on LO Yuet Me (00086365) cndylo@ust.hk Abstract Onlne algorthm deals wth data that has no future nformaton. Lots of examples demonstrate that onlne algorthm

More information

Random Variables and Probability Distributions

Random Variables and Probability Distributions Random Varables and Probablty Dstrbutons Some Prelmnary Informaton Scales on Measurement IE231 - Lecture Notes 5 Mar 14, 2017 Nomnal scale: These are categorcal values that has no relatonshp of order or

More information

2x x l. Module 3: Element Properties Lecture 4: Lagrange and Serendipity Elements

2x x l. Module 3: Element Properties Lecture 4: Lagrange and Serendipity Elements Module 3: Element Propertes Lecture : Lagrange and Serendpty Elements 5 In last lecture note, the nterpolaton functons are derved on the bass of assumed polynomal from Pascal s trangle for the fled varable.

More information

Reading. 14. Subdivision curves. Recommended:

Reading. 14. Subdivision curves. Recommended: eadng ecommended: Stollntz, Deose, and Salesn. Wavelets for Computer Graphcs: heory and Applcatons, 996, secton 6.-6., A.5. 4. Subdvson curves Note: there s an error n Stollntz, et al., secton A.5. Equaton

More information

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS

NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS ARPN Journal of Engneerng and Appled Scences 006-017 Asan Research Publshng Network (ARPN). All rghts reserved. NUMERICAL SOLVING OPTIMAL CONTROL PROBLEMS BY THE METHOD OF VARIATIONS Igor Grgoryev, Svetlana

More information

Feature Reduction and Selection

Feature Reduction and Selection Feature Reducton and Selecton Dr. Shuang LIANG School of Software Engneerng TongJ Unversty Fall, 2012 Today s Topcs Introducton Problems of Dmensonalty Feature Reducton Statstc methods Prncpal Components

More information

Can We Beat the Prefix Filtering? An Adaptive Framework for Similarity Join and Search

Can We Beat the Prefix Filtering? An Adaptive Framework for Similarity Join and Search Can We Beat the Prefx Flterng? An Adaptve Framework for Smlarty Jon and Search Jannan Wang Guolang L Janhua Feng Department of Computer Scence and Technology, Tsnghua Natonal Laboratory for Informaton

More information

Analysis of Continuous Beams in General

Analysis of Continuous Beams in General Analyss of Contnuous Beams n General Contnuous beams consdered here are prsmatc, rgdly connected to each beam segment and supported at varous ponts along the beam. onts are selected at ponts of support,

More information

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science

EECS 730 Introduction to Bioinformatics Sequence Alignment. Luke Huan Electrical Engineering and Computer Science EECS 730 Introducton to Bonformatcs Sequence Algnment Luke Huan Electrcal Engneerng and Computer Scence http://people.eecs.ku.edu/~huan/ HMM Π s a set of states Transton Probabltes a kl Pr( l 1 k Probablty

More information