Approximate Joins for Data Centric XML


Nikolaus Augsten 1, Michael Böhlen 1, Curtis Dyreson 2, Johann Gamper 1
1 Faculty of Computer Science, Free University of Bozen-Bolzano, Dominikanerplatz 3, Bozen, Italy, {augsten,boehlen,gamper}@inf.unibz.it
2 Department of Computer Science, Utah State University, Logan, UT, U.S.A., curtis.dyreson@usu.edu

Abstract

In data integration applications, a join matches elements that are common to two data sources. Often, however, elements are represented slightly differently in each source, so an approximate join must be used. For XML data, most approximate join strategies are based on some ordered tree matching technique. But in data-centric XML the order is irrelevant: two elements should match even if their subelement order varies. In this paper we give a solution for the approximate join of unordered trees. Our solution is based on windowed pq-grams. We develop an efficient technique to systematically generate windowed pq-grams in a three-step process: sorting the unordered tree, extending the sorted tree with dummy nodes, and computing the windowed pq-grams on the extended tree. The windowed pq-gram distance between two sorted trees approximates the tree edit distance between the respective unordered trees. The approximate join algorithm based on windowed pq-grams is implemented as an equality join on strings and avoids evaluating the distance between every pair of input trees. Our experiments with synthetic and real world data confirm the analytic results and suggest that our technique is both useful and scalable.

I. INTRODUCTION

The amount of data that is stored and exchanged in XML is increasing. When XML data from different sources is integrated in a single data collection, data items that correspond to the same real world object must be matched. But exact matches often fail due to inconsistent representations and missing global keys, so approximate matching techniques must be applied. For instance, when companies merge, their customer data will need to be integrated, but the companies may have different ways to represent customer data.
As another example, an internet shop may want to enrich its product description with data provided by third parties, which each have slightly different descriptions for the same product.

One way to approximately match a pair of XML documents is to compute the minimal edit distance between the documents [1]-[3]. An XML document can be modeled as an ordered, labeled tree. The edit distance between two such trees is the minimum number of node insertions, deletions, and/or renamings that transform one tree into the other. Though order is important in document-centric scenarios (e.g., paragraph tags in XHTML), most data-centric applications ignore the sibling order when considering whether two data items are the same. Data-centric XML items are usually modeled as unordered, labeled trees. While the minimal tree edit distance between ordered trees can be computed in polynomial time, the problem has been shown to be NP-complete for unordered trees [4].

This paper develops an efficient approximate join between data-centric XML. Our solution is based on windowed pq-grams, which are small subtrees of a specific shape. We develop a technique to systematically generate the set of windowed pq-grams in a three-step process: sorting the unordered tree, extending the sorted tree with dummy nodes, and computing the windowed pq-grams on the extended tree. Intuitively, two unordered trees are similar if their sorted trees have many windowed pq-grams in common. The windowed pq-gram distance between sorted trees approximates the tree edit distance between the respective unordered trees. The tree sorting approach is not applicable to other common distances between ordered trees. In particular, it is not possible to approximate the edit distance between unordered trees with the edit distance between sorted trees.

A windowed pq-gram consists of a stem and a base. The stems are invariant to order, and the main challenge is to compute the bases. Our bases satisfy the following core properties: all non-root nodes appear in the same number of bases; the Jaccard distance between two sibling sets is preserved; and node moves to other parents are detected.
We provide an algorithm to compute the windowed pq-gram distance in $O(n \log n)$ time (n is the number of tree nodes) and approximately join unordered trees using windowed pq-grams. Most joins based on distance measures, such as the edit distance, must evaluate the distance between every pair of input trees. There is no effective way to sort sets of trees or partition them into buckets with a hash function. A nested loop join must be applied. Our algorithm reduces the approximate join to an equality join on strings (windowed pq-grams are serialized and represented as strings) that takes advantage of well-known join optimization techniques.

The rest of the paper is organized as follows. Section II presents related work. We motivate the approximate join of data-centric XML in Section III and we discuss the impact of the sibling order on the tree distance computation in Section IV. Windowed pq-grams are introduced in Section V. Section VI discusses core properties of windowed pq-grams, and we tune windowed pq-grams to optimize these properties

in Section VII. Section VIII provides algorithms, which are experimentally evaluated in Section IX. In Section X we draw conclusions and point to future work.

II. RELATED WORK

Most papers that compare similar XML documents represent the XML data as trees. Labels or (label,value)-pairs are assigned to the tree nodes. Tree matching techniques are applied to compute the similarity between trees. A well known distance function for trees is the tree edit distance, which is defined as the minimum number of edit operations (node insertion, node deletion, and renaming) that transforms one tree into another [5]. The best known tree edit distance algorithms [6]-[9] for ordered trees have at least $O(n^3)$ runtime for trees with n nodes. The problem is NP-complete for unordered trees [4].

Guha et al. [2] present an approximate XML join based on the tree edit distance between ordered trees. They give upper and lower bounds for the tree edit distance that can be computed in $O(n^2)$ time and use reference sets to take advantage of the fact that the tree edit distance is a metric, thus reducing the actual number of distances to compute in a join. Guha et al. [2] do not address joins of unordered XML.

Garofalakis and Kumar [10] discuss approximate joins in the context of data streaming applications. They focus on performing a match in a limited amount of space and present an efficient approximation of the tree edit distance; but their approximation assumes ordered trees.

pq-grams were introduced by Augsten et al. [11] as an effective and efficient approximation of the tree edit distance between ordered trees. In this paper, we present windowed pq-grams that extend pq-grams to approximate the edit distance between unordered trees, and we develop an efficient join technique based on windowed pq-grams.

In change detection scenarios two versions of the same document are given and the difference is computed. Most research in this area assumes that the trees are ordered [1], [3], [12]. Cobéna et al. [1] take advantage of existing element IDs, which can not be assumed for joins of data from different sources. Chawathe et al.
[13] present a heuristic solution for unordered trees that runs in $O(n^3)$ time and for many cases in $O(n^2)$. The X-Diff algorithm by Wang et al. [14] allows leaf and subtree insertion and deletion, and node renaming. To achieve $O(n^2 f_{max} \log f_{max})$ runtime ($f_{max}$ is the maximum fanout of the nodes) they match only nodes with the same path to the root node. Our windowed pq-gram distance has $O(n \log n)$ runtime complexity.

The distance measures proposed for change detection are evaluated between pairs of documents. Used as a join predicate there is no obvious way to avoid an expensive nested loop join. We transform the distance-based join to an equality join on windowed pq-grams and can apply well-known join optimization techniques.

Weis and Naumann [15] propose an XML similarity measure for a duplicate detection framework. In the worst case, all pairs of elements must be compared. Puhlmann et al. [16] improve the efficiency by applying the Sorted Neighborhood method to nested objects. Both approaches assume a known, common schema of the matched documents and require a configuration step. No join algorithm using the proposed similarity measure is presented.

Sanz et al. [17] develop a similarity-based inverted index to identify regions of XML documents that are similar to a given pattern. Adjacent regions are merged into new regions if the new region better matches the pattern than each of the merged regions. The merging algorithm assumes ordered trees. Joins are not addressed.

A core operation in XML query processing is to find all occurrences of a twig pattern [18], [19], which, in common with approximate join techniques, concerns identifying patterns in a tree. But we split the tree into subtrees in order to calculate the distance between trees, not to answer queries.

Several papers deal with the related, but different problem of detecting the structural similarity between XML documents [20]-[22]. Two documents are considered structurally similar if they are valid for a similar DTD. The text content of the elements and the values of the attributes are ignored.

III. MOTIVATION

In our application scenario we consider building an online database about music CDs that integrates data from two sources: a song lyric store and a CD warehouse (see footnote 1). The integrated database will store the artists and songs of an album, information about individual songs such as the lyrics, guitar tabs, and information about the artists.

Example 3.1: Figure 1 shows tree representations of two different XML documents. Intuitively, both represent data about the same song album. Yet exact ordered tree matching would not consider the items as the same for a number of reasons. The song lyric store has an element year that is absent from the CD warehouse. The CD warehouse has a price for the album. For one track the databases list different artists. Also the document order of elements differs, i.e., the two documents have different sibling orders.

One way to match items from the two sources is to join the documents. The join attribute is (the part of) the XML document that represents the album. Two albums match if they are similar. The join condition can not be equality, as the data items representing the same album in the different databases may not match exactly. The following XQuery expression returns all album pairs that are within distance $tau. The distance function, dist, is a user-defined function that returns the distance between a pair of XML documents.

    for $a in doc("lyricstore.xml")//album,
        $b in doc("warehouse.xml")//album
    where dist($a,$b) <= $tau
    return <match>{$a}{$b}</match>

Footnote 1: We do not assume that the sources use a common schema, but we assume a common vocabulary to describe the data; the problem of integrating data vocabularies or ontologies is separate from matching the data. Terms in one source can be converted to the vocabulary of the second source prior to matching. We focus on the data matching problem.
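The XQuery expression above evaluates dist for every pair of albums, i.e., it is a nested loop join. The following Python sketch mirrors that evaluation strategy under simplifying assumptions: the names `approximate_join` and `jaccard_distance` are hypothetical, documents are reduced to sets of element names, and the toy distance is only a stand-in for a real tree distance.

```python
from itertools import product
from typing import Callable, FrozenSet, List, Sequence, Tuple

Doc = FrozenSet[str]  # assumption: a document is reduced to its set of element names


def jaccard_distance(a: Doc, b: Doc) -> float:
    """Toy stand-in for dist: 1 - |a ∩ b| / |a ∪ b| (not a tree distance)."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def approximate_join(
    left: Sequence[Doc],
    right: Sequence[Doc],
    dist: Callable[[Doc, Doc], float],
    tau: float,
) -> List[Tuple[Doc, Doc]]:
    """Nested loop join: evaluate dist on every pair and keep pairs within tau."""
    return [(a, b) for a, b in product(left, right) if dist(a, b) <= tau]


if __name__ == "__main__":
    lyric_store = [frozenset({"title", "artist", "year"})]
    warehouse = [frozenset({"title", "artist", "price"}), frozenset({"isbn"})]
    # The first warehouse document is at distance 0.5, the second at distance 1.
    print(approximate_join(lyric_store, warehouse, jaccard_distance, 0.5))
```

The sketch makes the cost visible: the distance function is called once per pair of input documents, regardless of how dissimilar most pairs are.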

Fig. 1. Two XML Trees Representing the Same Album: (a) Song Lyric Store Data ($T_{LS}$), (b) CD Warehouse Data ($T_{WH}$).

In the XQuery expression, $a and $b are bound to elements of the sets doc("lyricstore.xml")//album and doc("warehouse.xml")//album, respectively. Each album element is a (small) XML document itself. We define the approximate XML join between two sets of XML documents as follows [2].

Definition 3.1 (Approximate XML Join): Given two sets of XML documents, $F_1$ and $F_2$, a distance measure, $dist(T_i, T_j)$, between the documents $T_i \in F_1$ and $T_j \in F_2$, and a threshold $\tau$. The approximate XML join computes all pairs $(T_i, T_j) \in F_1 \times F_2$ such that $dist(T_i, T_j) \le \tau$.

Our goal is to find a distance function for unordered trees that is effective for data-centric XML and can be computed efficiently. We use this function as the basis of a scalable approximate join.

IV. ORDERED VS. UNORDERED TREE MATCHING

In this section we introduce basic concepts and discuss the impact of the sibling order on the tree distance computation.

a) XML and Trees: In order to make approximate tree matching applicable for XML, we represent an XML document as a rooted, labeled tree. The tree is unordered in the case of data-centric XML. Each node in the tree is a triple (i, l, v), where i is the node index, l is the node label, and v is the node's value. A node in the tree represents an XML element (or attribute). The node index is any number that identifies the node in the document, such as the ordinal position of the element (or attribute) in document order. The node is labeled with the name of the element (or attribute). The value of a node represents the text content of the corresponding element (or the value of the corresponding attribute). If the corresponding element contains only sub-elements and no content, then the node value is the empty string, ε. An edge connects an element node with each of its subelements (or attributes).
The function λ(n) maps a node n = (i, l, v) to the pair (l, v) of label and value. While nodes are unique within a tree, the (label,value)-pairs are not. To simplify the discussion we refer to a node by its label and omit node indexes and empty values.

b) Ordered and Unordered Trees: In an ordered tree the children of a node form a sequence; in an unordered tree the children of a node are not ordered and form a set. Ordered trees that differ only in the sibling order are permutations of each other. In a sorted tree the siblings are lexicographically ordered by their node labels and values. An unordered tree is transformed to an ordered tree by ordering the siblings. Graphically we represent an unordered tree as a set consisting of a node and the subtrees rooted in the node's children.

Example 4.1: Two of the trees in Figure 2 are ordered (but not sorted) and are permutations of each other; the trees $T_1$ and $T_2$ are unordered. The unordered tree $T_1$ can be transformed to either of the ordered trees by choosing the appropriate sibling order. $T_1$ differs from $T_2$ in that the node with label g is moved to a different parent. There is no sibling order that transforms $T_2$ to either of the ordered trees.

Fig. 2. Ordered and Unordered Trees.

c) Approximating the Edit Distance between Unordered Trees: The edit distance between two trees (ordered or unordered) is the minimum number of node edit operations that transform one tree into the other. The problem is substantially harder for unordered trees (NP-complete [4]) as the distance algorithm can not rely on sibling order and must consider all sibling permutations.

The edit distance between two unordered trees is approximated by the smallest edit distance between permutations of the respective ordered trees. Transforming an unordered tree to an ordered tree is straightforward. The key issue is to find the permutations of two ordered trees that yield the smallest edit distance. This is non-trivial as illustrated below. It is obviously not feasible to compute the edit distance between all permutations of two ordered trees.
Alternatively, consider an approach that sorts the siblings of both trees by their string labels and values. This heuristic fails for the tree edit distance between ordered trees. When the siblings of a tree are sorted, the subtrees rooted in the siblings are sorted as well. But two subtrees that should match may appear in different order in the two sorted trees. For example, if a subtree root is renamed between two trees, it may result in a different sort position. The edit distance between ordered trees then moves back permuted subtrees node by node. A subtree rooted in a sibling can be of size O(n), where n is the number of tree nodes. Even if the edit distance between two unordered trees is zero or a small constant (for example, a single renamed node), the edit distance between the respective sorted trees may be O(n).

Example 4.2: Consider the two trees in Figure 3. The children of the root node have the same label such that

the label sort is not unique. Although both trees are sorted, the subtrees $t_1$ and $t_2$ are permuted between the two trees. The edit distance between the respective unordered trees is zero as they differ only in the sibling order. The edit distance between the ordered trees is about the tree size as the subtrees $t_1$ and $t_2$ must be moved back node by node.

Fig. 3. Edit Distance and Windowed pq-grams Distance (unordered edit distance = 0, ordered edit distance = O(n), windowed pq-gram distance = 0).

By contrast, sorting trees is a valid approach for the windowed pq-grams introduced in the following section. We show that the permutation of a constant number of siblings changes only a constant number of windowed pq-grams. Thus, we can sort the unordered trees and compute the windowed pq-grams on the sorted trees. The resulting distance approximates the edit distance between the unordered trees, and the windowed pq-gram distance computed from identical unordered trees is always zero (see Figure 3).

V. WINDOWED pq-GRAMS

In this section we introduce windowed pq-grams. We define properties that we require for our solution, and we show that sorting trees is a valid approach for windowed pq-grams. Proofs are omitted due to space constraints.

A. Requirements for Windowed pq-grams

pq-grams were introduced by Augsten et al. [11] as an approximation of the edit distance between ordered, labeled trees. Intuitively, a pq-gram is a small subtree of a specific shape composed of two parts: a stem that consists of an anchor node with p − 1 ancestors and a base that consists of q consecutive children of the anchor node. For example, consider the ordered tree in Figure 2 in which node c has the children (k, j). The stem (a, c) with anchor node c and the base (k, j) form a pq-gram with p = q = 2.

Stems are node chains of length p. They are invariant to order, and the strategy for choosing stems in ordered trees carries over to unordered trees.

The bases in ordered trees are formed by consecutive siblings. This strategy is not applicable to unordered trees, since no sibling order is defined. A different strategy is required. A sibling set is the set of all children of a tree node. In an unordered tree the bases are formed from subsets of sibling sets.
A strategy to choose all possible sibling subsets of size q weights nodes differently. There are $\binom{f}{q}$ subsets of size q in a sibling set of f siblings. pq-grams produced from large sibling sets disproportionally contribute to the total number of pq-grams. Changes covered by these pq-grams are amplified, other changes are disregarded.

Bases that consist of a single node ignore the sibling order. However, pq-grams with such bases fail to detect sibling moves to another parent if the ancestors in the old and the new position have identical labels and values. For example, they cannot distinguish between the trees $T_1$ and $T_2$ in Figure 2. The ancestors of the moved node g have identical labels (and empty values), resulting in identical stems, (a, b). Ancestors with identical labels and values are frequent in data-centric XML (e.g., all title elements have the ancestors track and album in the XML of Figure 1). Larger bases encode sibling information and can detect sibling moves, as nodes with homonymous ancestors may have siblings with different labels and/or values. In our example, g has a sibling i in $T_2$ but not in $T_1$. A base (g, i) exists only in $T_2$ and distinguishes it from $T_1$.

A sibling order may be given implicitly, for example, by the XML document order. This order is random for data-centric XML. Bases formed over randomly ordered sibling sets may be very different even for identical sibling sets.

In our approach we sort the trees and use a window to control the computation of the bases. We seek to build bases with the following properties:

P1: Equal Base-Node Frequency. Each non-root node of the tree appears in the same number of bases, independent of the number of siblings.

P2: Preservation of the Sibling Distance. For bases built from two different sibling sets the percentage of overlap between the bases is equal to the percentage of overlap between the sibling sets. In Figure 2 there is 50% overlap between the two sibling sets that contain node g, hence also 50% of the bases should match.

P3: Detection of Node Moves to Other Parents. In Fig. 2, node g is moved to another parent with the same label (b) and value (ε). All pq-grams with anchor node b have the same stem.
To distinguish $T_1$ from $T_2$ the bases must differ.

B. Solution

We introduce windowed pq-grams that have the required properties for the bases. We proceed in three steps: a) sort the unordered tree, b) extend the sorted tree, and c) compute the windowed pq-grams on the extended tree.

a) Sorting the Unordered Tree: In the first step we sort the trees by imposing a horizontal order among siblings. The siblings are sorted by node label and value. Due to nodes with identical (label,value)-pairs an unordered tree can be sorted in different ways. But all possible sorts of the same unordered tree yield identical pq-grams and so are equivalent for our purpose. Figure 4(a) shows $T_1^{sort}$, the sorted example tree $T_1$.

Definition 5.1 (Sorted Tree): A tree T is sorted if its siblings are ordered and for each sibling pair, n = (i, l, v) and n' = (i', l', v'), the order satisfies $l < l' \lor (l = l' \land v < v') \Rightarrow n < n'$.

b) Extending the Sorted Tree: The next step extends the sorted tree with dummy nodes ($\bullet$). Dummy nodes have a special (label,value)-pair, which is the same for all dummy nodes, $\lambda(\bullet_i) = (*,*) = \lambda(\bullet_j)$ for all i, j. The number of

dummy nodes depends on the size of the window that we shift over the tree for a systematic generation of the windowed pq-grams. Figure 4(b) shows the extended tree $T_1^{ext}$ for w = 3 and p = q = 2.

Fig. 4. (a) Sorted Tree $T_1^{sort}$, (b) Extended Tree $T_1^{ext}$ for w = 3 and p = q = 2, and (c) Windowed pq-grams: Window Use and 2,2-Grams with Anchor Node c.

Definition 5.2 (Extended Tree): Let $T^{sort}$ be a sorted tree, p > 0 and q > 0 be the parameters determining the shape of the windowed pq-grams, $w \ge q$ be the window size, and f denote the fanout of a node. The extended tree, $T^{ext}$, is defined as $T^{sort}$ extended with dummy nodes as follows:
root: p − 1 ancestors are prepended to the root node;
leaves: q children are added to each leaf node;
siblings: w − f siblings are appended to each sibling sequence $(c_1, \ldots, c_f)$ of size 0 < f < w, yielding $(c_1, \ldots, c_f, \bullet_1, \ldots, \bullet_{w-f})$.

c) Computing the pq-grams: We give a definition of windowed pq-grams based on the extended tree.

Definition 5.3 (Windowed pq-grams): Let T be an unordered tree with extended tree $T^{ext}$, n be a node of T, $c_i$ be the i-th child of n in $T^{ext}$ ($1 \le i \le f^{ext}$), and window $W_i = (c_i, c_{i+1}, \ldots, c_{i+w-1})$, with $c_k = c_{((k-1) \bmod f^{ext}) + 1}$, be a node sequence of length $w \ge q$ that is wrapped around the right border. A windowed pq-gram (p > 0, q > 0) of T with anchor node n is defined as an ordered subtree of $T^{ext}$ that is composed of a stem and a base. The stem is a node chain $(a_{p-1}, \ldots, a_1, n)$, where $a_k$ is n's ancestor at distance k. The base is a sequence of mutually different siblings $(c_i, b_2, \ldots, b_q)$ chosen from window $W_i$ preserving the node order; $c_i$ is the first node of the window, $\{b_2, \ldots, b_q\} \subseteq \{c_{i+1}, \ldots, c_{i+w-1}\}$. If n is a leaf in T, the base is formed by q dummy nodes.

Each base that satisfies these constraints produces a windowed pq-gram with anchor node n. The set of windowed pq-grams of all nodes of T is called the windowed pq-gram profile of T.

We use a linear encoding and represent a windowed pq-gram as a tuple $g = (a_{p-1}, \ldots, a_1, n, b_1, \ldots, b_q)$.
With $\lambda(g) = (\lambda(a_{p-1}), \ldots, \lambda(n), \ldots, \lambda(b_q))$ we denote a pq-gram's node labels and values, called its label-value tuple. While a windowed pq-gram is unique within a tree, different windowed pq-grams may yield identical label-value tuples.

The bases are systematically computed by producing for each window $W_i$ only the bases that contain the first window node. For each window position, $\binom{w-1}{q-1}$ bases are produced.

Example 5.1: Figure 4(c) shows all windowed pq-grams for p = q = 2 that can be formed in $T_1^{ext}$ for the anchor node with label c. Initially, the window covers the nodes j, k, *, which according to the above procedure yields two bases of size 2 and produces the first two windowed pq-grams with label-value tuples ((a, ε), (c, ε), (j, ε), (k, ε)) and ((a, ε), (c, ε), (j, ε), (*,*)). Next, the window is moved right and covers k, *, j. Notice that the window is wrapped around. Two other windowed pq-grams are produced. The final position of the window covers *, j, k.

Theorem 5.1 (Profile Size): If T is an unordered tree with n nodes, then the size of its windowed pq-gram profile, $|P(T)|$, q > 1, is linear in the tree size, $|P(T)| \le nq\binom{w}{q}$. If T has l leaves, and all other nodes have fanout $f \ge w$, then $|P(T)| = (n-1)\binom{w-1}{q-1} + l$.

Definition 5.4 (Windowed pq-gram Index): Let T be an unordered tree with windowed pq-gram profile P(T), p > 0, q > 0. The windowed pq-gram index, I, of tree T is the bag of all label-value tuples of T, i.e., $I(T) = \biguplus_{g \in P(T)} \lambda(g)$.

The windowed pq-gram distance is computed from the number of windowed pq-grams that the indexes of the compared trees have in common. For two unordered trees, T and T', the windowed pq-gram distance is

$dist(T, T') = 1 - 2\,\frac{|I(T) \cap I(T')|}{|I(T) \uplus I(T')|}$.

The windowed pq-gram distance is 1 if two trees share no windowed pq-grams, and 0 if they have the same windowed pq-gram index, which does not necessarily imply that the trees are equal.

Sorting trees involves subtree permutations. The windowed pq-gram distance is independent of the size of the permuted subtrees. Only the windowed pq-grams that contain the root nodes of the permuted subtrees in the bases change.
This core feature qualifies windowed pq-grams for our approach and sets it apart from other distance measures such as the edit distance between ordered trees.

Theorem 5.2 (Local Effect of Subtree Permutations): Given a sorted tree T (index I) that is transformed to a tree T' (index I') by permuting the order of the $f \ge w$ children of a node n, then the permutation affects only O(f) windowed pq-grams: $|I \setminus I'| = O(f)$.

VI. PROPERTIES OF WINDOWED pq-GRAM BASES

We discuss the base properties of windowed pq-grams. We denote sibling sets with S, the respective bags of (label,value)-pairs with L, and bases formed over L with B.

P1: Equal Base-Node Frequency. Dummy nodes, windows, and window wrapping guarantee that each node of a tree is in the same number of bases, thus giving each node the

same weight. Dummy nodes prevent a node from appearing twice in the same window when the window is wrapped. Due to the window wrapping each node appears in all w positions of a window exactly once, independent of the number of left and right siblings. Only bases within windows are formed, thus each node is in the same number of bases.

P2: Preservation of the Sibling Distance. We analyze the windowed pq-grams of two anchor nodes that have the same stem and differ only in the bases. The bases represent the sibling sets formed by the children of the anchor nodes. The distance between the bases should approximate the distance between the sibling sets.

Sibling Distance. Let $L_1$ and $L_2$ be the bags of (label,value)-pairs of two sibling sets. We use the Jaccard distance [23] (modified for bags) between $L_1$ and $L_2$ to compute the distance between the respective sibling sets.

$J(L_1, L_2) = 1 - 2\,\frac{|L_1 \cap L_2|}{|L_1 \uplus L_2|} \quad (L_1 \neq \emptyset \text{ or } L_2 \neq \emptyset)$

The sibling distance is 1 if all siblings are different, and 0 if $L_1$ and $L_2$ have identical (label,value)-pairs.

Base Error. Let $B_1$ and $B_2$ be the bases formed over $L_1$ and $L_2$, respectively. We define the base error,

$\varepsilon(L_1, L_2, B_1, B_2) = |J(L_1, L_2) - J(B_1, B_2)|$, (1)

where $J(B_1, B_2)$ is the Jaccard distance between the bases. The base error ε ranges between 0 and 1; ε = 0 means that the base distance is equivalent to the sibling distance.

Example 6.1: Let $L_1$ = {a, c, d, f, g, i} and $L_2$ = {a, b, c, d, e, f, g, h, i}. For q = 2 and w = 3, we get $B_1$ = {ac, ad, cd, cf, df, dg, fg, fi, gi, ga, ia, ic} and $B_2$ = {ab, ac, bc, bd, cd, ce, de, df, ef, eg, fg, fh, gh, gi, hi, ha, ia, ib}. With $|B_1 \cap B_2| = 6$, $|B_1 \uplus B_2| = 30$, and $|L_1 \cap L_2| = 6$ the base error is ε = 2/5. For q = w = 3 no bases match, and ε = 4/5.

P3: Detection of Node Moves to Other Parents. We define base recall and base precision to measure the sensitivity of the bases to node moves. A node move is detected if at least one of the bases changes. We consider bases of size q = 2 and discuss larger bases in the next section. A base without dummy nodes encodes exactly one sibling pair. Due to the window wrapping, the same sibling pair may be encoded twice. Two bases formed from the same sibling pair are called duplicates, regardless of the node order.
Bases with dummy nodes give no sibling information. Let #pairs(S, B) denote the number of unique sibling pairs of S encoded by the bases B, i.e., only bases without dummy nodes and only one copy of each duplicate are counted.

Base Recall. For a sibling set S with f nodes, $\binom{f}{2} = \frac{f(f-1)}{2}$ pairs can be formed. Given the respective bases B, we define the base recall, ρ, as the ratio of sibling pairs encoded by the bases to the number of possible pairs.

$\rho(S, B) = \frac{2\,\#pairs(S, B)}{f(f-1)}, \quad f = |S|$ (2)

ρ = 1 if all possible pairs of S are in B, ρ = 0 if none of the possible pairs is encoded. Bases with low recall may not encode relevant sibling pairs and thus miss node moves.

Base Precision. Given a sibling set S and the respective set of bases B, the base precision is the ratio of sibling pairs encoded by the bases to the total number of bases:

$\pi(S, B) = \frac{\#pairs(S, B)}{|B|}$. (3)

π = 1 if the bases contain no duplicates/dummy nodes. In the original tree there are no dummy nodes. A low precision, i.e., many bases with dummy nodes, decreases the weight of the original nodes.

Example 6.2: Let B over siblings S be the bases in Figure 4(c) (q = 2, w = 3). (j, k) and (k, j) are duplicates, all other bases contain dummy nodes, thus #pairs(S, B) = 1. Base recall ρ(S, B) = 1 (all pairs of S are encoded by B), base precision π(S, B) = 1/6 (only 1 of 6 bases is relevant for detecting node moves).

Footnote 2 (to Example 6.1): To simplify the notation of this example, we represent a (label,value)-pair by its label and a base by the concatenation of its node labels, e.g., the (label,value)-pair (a, ε) is denoted as a, the base ((a, ε), (b, ε)) as ab.

VII. OPTIMAL WINDOWED pq-GRAMS

In this section we discuss the choice of the base size q and the window size w. Specifically, bases of size q = 2 have smaller base error than larger bases (Lemma 7.1), but can detect exactly the same sibling moves (Lemma 7.2). For q = 2 we provide base recall and precision (Lemma 7.3). We choose a window size w that optimizes both recall and precision, and we show that all nodes in the resulting bases have equal weight (Theorem 7.4).
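The base error of Section VI can be checked numerically. The sketch below recomputes Example 6.1 under stated assumptions: the helper names (`window_bases`, `bag_jaccard`, `base_error`) are hypothetical, labels stand in for (label,value)-pairs, and the label sets $L_1$ = {a, c, d, f, g, i} and $L_2$ = {a, b, c, d, e, f, g, h, i} are taken as the inputs of the example.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

DUMMY = "*"  # stand-in label for dummy nodes


def window_bases(labels, w, q):
    """Bag of bases of size q over the sorted, dummy-padded, wrapped sibling list."""
    sibs = sorted(labels)
    sibs += [DUMMY] * max(0, w - len(sibs))  # pad only if fewer than w siblings
    n = len(sibs)
    bases = Counter()
    for i in range(n):
        # Window W_i starts at position i and wraps around the right border.
        rest = [sibs[(i + d) % n] for d in range(1, w)]
        # Every base contains the first window node plus q - 1 later nodes, in order.
        for tail in combinations(rest, q - 1):
            bases[(sibs[i],) + tail] += 1
    return bases


def bag_jaccard(x, y):
    """Jaccard distance for bags: 1 - 2 |x ∩ y| / |x ⊎ y|."""
    shared = sum((x & y).values())
    total = sum(x.values()) + sum(y.values())
    return 1 - Fraction(2 * shared, total)


def base_error(l1, l2, w, q):
    """|J(L1, L2) - J(B1, B2)| for bases built with window size w and base size q."""
    sibling_dist = bag_jaccard(Counter(l1), Counter(l2))
    base_dist = bag_jaccard(window_bases(l1, w, q), window_bases(l2, w, q))
    return abs(sibling_dist - base_dist)


if __name__ == "__main__":
    print(base_error("acdfgi", "abcdefghi", w=3, q=2))  # base error for q = 2
    print(base_error("acdfgi", "abcdefghi", w=3, q=3))  # base error for q = w = 3
```

Exact fractions are used instead of floats so that the results can be compared directly with the values stated in the example.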
Lemma 7.1 (Optimal Base Size): Let S and S' be sibling sets with the bags of (label,value)-pairs L and L', respectively, let S be transformed to S' by one of the following edit sequences:
a) k insertions of new nodes with (label,value)-pairs not in L;
b) k renamings of nodes with new (label,value)-pairs not in L ($k \le |S|$);
c) k node deletions ($k \le |S|$).
For a given window size $w \le \min(|S|, |S'|)$, small bases of size 2 ($B^{q=2}$, $B'^{q=2}$) have equal or smaller base error than larger bases ($B^{q>2}$, $B'^{q>2}$):

$\varepsilon(L, L', B^{q=2}, B'^{q=2}) \le \varepsilon(L, L', B^{q>2}, B'^{q>2})$

Lemma 7.2 (Sibling Move Detection): Given the sibling sets $S_1$ and $S_1'$ with the bases $B_1$ and $B_1'$. We move a node n from $S_1$ to $S_1'$ and get the sibling sets $S_2$ and $S_2'$ with the bases $B_2$ and $B_2'$. For a given window size w, if the sibling move is detected for bases with q > 2, i.e., $B_1 \uplus B_1' \neq B_2 \uplus B_2'$, then it is also detected for bases with q = 2.

Lemma 7.3 (Recall and Precision): Let S be a sibling set with f nodes, B be the bases of size q = 2 formed over S with window size w. Base recall, ρ(S, B), and base precision, π(S, B), are

$\rho = \begin{cases} \frac{2(w-1)}{f-1} & w < \frac{f+1}{2} \\ 1 & w \ge \frac{f+1}{2} \end{cases} \qquad \pi = \begin{cases} 1 & w < \frac{f+1}{2} \\ \frac{f-1}{2(w-1)} & w \ge \frac{f+1}{2} \end{cases}$
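The closed forms of Lemma 7.3 can be compared with a brute-force count of the sibling pairs encoded by the bases. The sketch below is an illustration under assumptions: the function names are hypothetical, siblings are represented by distinct integers, and the comparison is restricted to window sizes $2 \le w \le f$, where no dummy nodes are needed.

```python
from fractions import Fraction

DUMMY = "*"  # stand-in for dummy nodes


def recall_precision(f, w):
    """Brute-force base recall and precision for q = 2, f siblings, window size w."""
    sibs = list(range(f)) + [DUMMY] * max(0, w - f)
    n = len(sibs)
    # For q = 2 each window W_i yields the bases (c_i, c_j), j = i+1, ..., i+w-1 (wrapped).
    bases = [(sibs[i], sibs[(i + d) % n]) for i in range(n) for d in range(1, w)]
    # #pairs: bases without dummy nodes, counting duplicates such as (j,k)/(k,j) once.
    pairs = {frozenset(b) for b in bases if DUMMY not in b}
    recall = Fraction(2 * len(pairs), f * (f - 1))
    precision = Fraction(len(pairs), len(bases))
    return recall, precision


def closed_form(f, w):
    """Recall and precision as stated in Lemma 7.3 (assumed here for 2 <= w <= f)."""
    if 2 * w < f + 1:
        return Fraction(2 * (w - 1), f - 1), Fraction(1)
    return Fraction(1), Fraction(f - 1, 2 * (w - 1))


if __name__ == "__main__":
    for f in range(2, 10):
        for w in range(2, f + 1):
            assert recall_precision(f, w) == closed_form(f, w)
    # Example 6.2 (f = 2, w = 3) needs one dummy node: recall 1, precision 1/6.
    print(recall_precision(2, 3))
```

For odd f the two branches meet at w = (f+1)/2, where recall and precision are both 1; this is the window size chosen in Theorem 7.4.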

Theorem 7.4 (Optimal Windowed pq-grams): Given an unordered tree with fixed fanout f for the non-leaf nodes. For base size q = 2 and window size $w = \frac{f+1}{2}$ we get windowed pq-grams with the following properties:
(a) Each non-root node appears in exactly 2w − 2 bases.
(b) The base error ε is bounded, with one bound each for k renames, k insertions, and k deletions; each bound is a fraction that depends only on k and f.
(c) ρ = 1 for $w = \frac{f+1}{2}$.
(d) π = 1 for $w = \frac{f+1}{2}$.

The optimal window size w depends on the fanout f. For a degenerated tree (consisting only of the root node and n − 1 leaves) w = O(n). Even in this case, the windowed pq-gram profile can not grow larger than $O(n^2)$ (Theorem 5.1, $f \ge w$, q = 2).

VIII. ALGORITHMS

A. Building the pq-gram Index

Algorithm 1 computes the windowed pq-gram profile P for q = 2 by recursively traversing the tree T in preorder. The algorithm is initialized with the root node n of T, the window size w, a stem of dummy nodes $(\bullet_1, \ldots, \bullet_p)$, and the empty profile $P = \emptyset$. Whenever the last sibling (in document order) of a sibling set is reached, the siblings are sorted (dummy nodes to the end), and the windowed pq-grams are produced. The runtime is $O(n + f_{max} \log f_{max})$ for documents with n nodes, maximal fanout of $f_{max}$, and constant window size. Our experiments confirm the analytic runtime result.

The index, I, is computed by aggregating and counting the label-value tuples of the windowed pq-grams in the profile P(treeId, pqg): $I \leftarrow \Gamma_{treeId,\ \lambda(pqg) \to pqg,\ COUNT(*) \to cnt}(P)$. The runtime is $O(n \log n)$ (sorting the profile of size O(n)). The index of a forest is the union of the indexes of its trees.

To deal with node labels and values of different length, such as element names and text values in XML documents, we use a fingerprint hash function (e.g., the Karp-Rabin fingerprint function [24]) that maps a string s to a hash value h(s) of fixed length that is unique with high probability. Instead of storing the label-value tuples of windowed pq-grams, we store the concatenation of the hashed labels and values. Note that the only operation we need to perform on the (label,value)-pairs is to check equality.
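The profile and index computation for q = 2 can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the tree encoding as nested `(label, value, children)` tuples and the names `profile` and `index` are assumptions, the empty string stands in for ε, and label-value tuples are kept as plain tuples instead of fingerprint hashes.

```python
from collections import Counter

DUMMY = ("*", "*")  # (label, value)-pair shared by all dummy nodes


def profile(node, w, p=2, stem=None, out=None):
    """Windowed pq-gram profile for q = 2; node = (label, value, children)."""
    label, value, children = node
    if stem is None:
        stem = (DUMMY,) * p  # initial stem of p dummy nodes
    stem = stem[1:] + ((label, value),)  # drop the oldest ancestor, append this node
    if out is None:
        out = []
    if not children:
        out.append(stem + (DUMMY, DUMMY))  # a leaf gets a base of q = 2 dummy nodes
        return out
    for child in children:
        profile(child, w, p, stem, out)
    # Sort the siblings by (label, value) and pad with dummy nodes up to the window size.
    sibs = sorted((c[0], c[1]) for c in children)
    sibs += [DUMMY] * max(0, w - len(sibs))
    n = len(sibs)
    for i in range(n):
        for j in range(i + 1, i + w):  # window W_i, wrapped around the right border
            out.append(stem + (sibs[i], sibs[j % n]))
    return out


def index(node, w, p=2):
    """Bag of label-value tuples (the windowed pq-gram index) of one tree."""
    return Counter(profile(node, w, p))


if __name__ == "__main__":
    tree = ("a", "", [("c", "", [("j", "", []), ("k", "", [])])])
    for gram, count in sorted(index(tree, w=3).items()):
        print(count, gram)
```

For the anchor node c with children j and k and w = 3, the sketch produces the six bases (j,k), (j,*), (k,*), (k,j), (*,j), (*,k), in line with Example 5.1. Replacing each (label, value) tuple by a fixed-length fingerprint would give the compact representation described above.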
Example 8.1: Figure 5 shows an example hash function and part of the windowed pq-gram indexes of the two XML documents in Figure 1, the music albums from the song lyric store ($T_{LS}$) and the CD warehouse ($T_{WH}$). We choose p = q = 2, w = 3, $\lambda(\bullet) = (*,*)$. The label-value tuple ((*,*), (album, ε), (track, ε), (track, ε)) appears twice in the index of $T_{LS}$ and has two matches in the other index. The label-value tuple ((album, ε), (year, v), (*,*), (*,*)), where v is the value of the year element, appears only once in the index of $T_{LS}$ and has no match in the index of $T_{WH}$.

Algorithm 1: getpqgrams(T, n, w, stem, P)
    stem ← dequeue-first-element(stem) ∘ n;
    if n is a leaf then return P ∪ {(T, stem ∘ (•, •))};
    C ← ∅;
    foreach child c of n do
        C ← C ∘ {c};
        P ← P ∪ getpqgrams(T, c, w, stem, P);
    end
    C ← C ∘ {•_1, ..., •_(w−f)};
    sort-by-label-value(C);
    for i ← 0 to |C| − 1 do
        for j ← i + 1 to i + w − 1 do
            P ← P ∪ {(T, stem ∘ C[i] ∘ C[j mod |C|])};
        end
    end
    return P;

B. Approximate XML Join

Algorithm 2 computes the approximate join of two sets of unordered trees, $F_1$ and $F_2$, given their windowed pq-gram indexes, $I_1$ and $I_2$, and the threshold, τ. All pairs $(T_i, T_j) \in F_1 \times F_2$ that satisfy $dist(T_i, T_j) \le \tau < 1$ are returned. $PS_i$ is initialized with the profile sizes for the trees in forest $F_i$.

Algorithm 2: pqgramjoin(I_1, I_2, τ)
    foreach I_i do
        I_i ← ρ_{treeId/treeId_i, cnt/cnt_i}(I_i);
        PS_i ← Γ_{treeId_i, SUM(cnt_i)→size_i}(I_i);
    end
    return π_{treeId_1, treeId_2}(σ_{1 − 2·cnt/(size_1 + size_2) ≤ τ}(Γ_{treeId_1, treeId_2, SUM(min(cnt_1, cnt_2))→cnt}(I_1 ⋈ I_2) ⋈ PS_1 ⋈ PS_2))

As pointed out by Guha et al. [2], hash and sort-merge joins do not carry over to approximate tree joins that use the edit distance, since the distance function must be evaluated between every input pair. There is no effective way to sort trees or partition them into buckets with a hash function. The only approach readily applicable is the nested loop join [2]. This does not hold for the windowed pq-gram distance. For the calculation of the windowed pq-gram distance a tree is represented by its windowed pq-gram index. Instead of computing the distance between each pair of trees directly, we check for each windowed pq-gram in which pairs of trees it appears.
We transform the distance-based join to an equality join on all windowed pq-grams represented as strings. We can apply well-known techniques to optimize this join (e.g., sort-merge and hash join). The approximate join is computed by counting windowed pq-grams in the join result.
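To make the equality-join idea concrete, here is a minimal Python sketch of the join under some assumptions: each forest is given as a dictionary from a tree id to a `collections.Counter` of pq-gram keys (strings or hashes), and the distance is taken as 1 − 2·cnt/(size_1 + size_2), as in the selection condition of Algorithm 2. The function name `pq_gram_join` and the in-memory hash-join strategy are illustrative choices, not the paper's implementation.

```python
from collections import Counter, defaultdict


def pq_gram_join(index1, index2, tau):
    """Return all (tree1, tree2) pairs whose pq-gram distance is <= tau.

    index1, index2: dict mapping tree id -> Counter(pq-gram key -> count).
    """
    # Profile sizes (the PS relations): total pq-gram count per tree.
    size1 = {t: sum(c.values()) for t, c in index1.items()}
    size2 = {t: sum(c.values()) for t, c in index2.items()}

    # Build side of a hash join: pq-gram key -> [(tree id, count), ...].
    postings = defaultdict(list)
    for t2, idx in index2.items():
        for gram, cnt2 in idx.items():
            postings[gram].append((t2, cnt2))

    # Probe side: only tree pairs that share a pq-gram are ever touched.
    overlap = Counter()
    for t1, idx in index1.items():
        for gram, cnt1 in idx.items():
            for t2, cnt2 in postings.get(gram, ()):
                overlap[(t1, t2)] += min(cnt1, cnt2)

    # Pairs without a common pq-gram have distance 1 and never qualify,
    # because tau < 1.
    return sorted(
        pair for pair, cnt in overlap.items()
        if 1 - 2 * cnt / (size1[pair[0]] + size2[pair[1]]) <= tau
    )
```

The dictionary `postings` plays the role of the hash table in a hash join; a sort-merge variant would sort both indexes by pq-gram key instead.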

Fig. 5. Implementation of the Windowed pq-gram Index. (a) Hash Function. (b) pq-gram Index of the Song Lyric Store. (c) pq-gram Index of the CD Warehouse.

In the worst case the joined forests consist of identical copies of the same tree. Let N be the cardinality of the forest, n the number of nodes per tree. The indexes are of size O(Nn) for constant window size. In a sort-merge join the complexity of sorting the relations is O(Nn log(Nn)). Each windowed pq-gram in one index matches O(N) tuples in the other index. The overall complexity is O(Nn(N + log n)). Note that for this worst case scenario the join result is of size $O(N^2)$, thus no algorithm can improve on the quadratic runtime.

Different from the nested loop join, our join algorithm can take advantage of the diversity of trees in a forest. In the best case, when no two trees in the forest share windowed pq-grams, the runtime is O(Nn log(Nn)) for the index size O(Nn). In our experiments we show the performance advantages of the optimized join for large forests.

IX. EXPERIMENTS

A. Profile and Index Computation.

We analyze the scalability of the windowed pq-gram index computation. Our test data are XML documents that range between 1kB and 1.GB (k to M nodes), p = q = 2 and w = 3. The index computation in Figure 6(a) includes the profile computation (Algorithm 1) and the aggregation of duplicate pq-grams within each tree. The index computation scales to very large trees. The test documents are generated with xmlgen, provided by the XML benchmark project XMark.

B. Approximate Join Based on Windowed pq-grams.

We compare the scalability of the optimized join (Algorithm 2) with the scalability of a join that computes the windowed pq-gram distance between each pair of documents ("nested loop join"). We join two sets of synthetic XML documents. Each set consists of 1 documents with 1 to 17 nodes and stores 58MB of data.
The documents within a set are different; each document has a match in the other set. Figure 6(b) shows the results. The optimized join computes only the distance between documents that have pq-grams in common. Unlike the nested loop join, it can take advantage of the diversity of the trees, which results in a small join result set. The runtime is close to linear.

Fig. 6. Index Creation and Join Scalability. (a) Windowed pq-gram Index Computation (time [sec] over number of nodes; 2,2-gram index and 2,2-gram profile, w = 3). (b) Approximate Join Based on Windowed pq-grams (time [sec] over number of trees and number of nodes; nested-loop join and optimized join).

C. Quality of Matches.

We use real world XML data sets and add noise (spelling mistakes and missing elements). We approximately join the original and the noisy set.

The Data Sets. We use the DBLP (bibliography), the SwissProt (protein sequence database), and the Treebank (part-of-speech tagged English sentences) XML databases. We split each database into a set of (sub)documents by deleting the root node, and we randomly choose a subset of the resulting documents for our experiments (requiring their size to be larger than the number of errors we introduce). The resulting document sets are structurally very different: DBLP contains small and flat documents (15 nodes and depth 1.9 on average) with about ten times more elements than attributes; the SwissProt documents are larger and deeper with almost the same number of attributes and elements (14 nodes and depth 3.5 on average); the Treebank documents have a deep recursive structure (49 nodes and depth 6.9 on average, with a maximum depth of 3).

Adding Noise. We modify the original documents by deleting and renaming random nodes. Node deletions simulate missing elements or attributes and modify the document structure. Renamed nodes represent different tag names or spelling mistakes in the text values. The resulting noisy document is the match of the original document; all other noisy documents are non-matches. In our figures we show the percentage of changed nodes.
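The noise model above can be sketched as follows. This is only an assumed reading of it: a deletion removes a random non-root node and reattaches its children to its parent (the usual tree-edit semantics), and a rename appends a marker to the label of a random node. The names `Node`, `non_root_nodes`, and `add_noise`, and the `"_x"` suffix, are hypothetical.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    value: str = ""
    children: list = field(default_factory=list)


def non_root_nodes(root):
    """Yield (parent, node) for every non-root node of the tree."""
    stack = [root]
    while stack:
        parent = stack.pop()
        for child in parent.children:
            yield parent, child
            stack.append(child)


def add_noise(root, n_delete, n_rename, rng):
    """Delete n_delete and rename n_rename random non-root nodes in place."""
    for _ in range(n_delete):
        pairs = list(non_root_nodes(root))
        if not pairs:
            break
        parent, victim = rng.choice(pairs)
        # Find the victim by identity and splice its children in its place.
        pos = next(i for i, c in enumerate(parent.children) if c is victim)
        parent.children[pos:pos + 1] = victim.children
    pairs = list(non_root_nodes(root))
    for _, node in rng.sample(pairs, min(n_rename, len(pairs))):
        node.label += "_x"  # stands in for a different tag name or a typo
    return root
```

Passing a seeded `random.Random` instance makes the generated noise reproducible across runs.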

Fig. 7. (a)-(c) Distance between Matches and Non-Matches (pq-gram distance of the match and of the closest non-match): (a) DBLP, (b) SwissProt, (c) Treebank. (d) 1:1 Matches for SwissProt (precision / recall [%]).

Distance between Matches and Non-Matches. Each original document has exactly one match. Figures 7(a)-7(c) show the average distance of the original documents to their match and to the closest non-match. The SwissProt documents are more similar to each other than the DBLP and Treebank documents. The windowed pq-gram distance to the matches is almost linear in the number of modified nodes. It effectively approximates the edit distance. All documents are modified, thus the distance to the non-matches also increases with the number of changed nodes.

Precision and Recall. Our join algorithm matches each original document to one or more noisy documents. We count correct and incorrect matches. With possible we denote the maximum number of correct matches that exist for a dataset. In our setting, possible is equal to the number of documents in the correct dataset. We compute

$precision = \frac{correct}{correct + incorrect} \cdot 100\%$ and $recall = \frac{correct}{possible} \cdot 100\%$.

The precision is high if the returned matches are correct; the recall is high if the algorithm does not miss correct matches. Figure 8 shows precision and recall for different thresholds τ. Moving up the threshold decreases the precision and increases the recall. Precision and recall for DBLP and Treebank are almost 100%, even for very noisy documents.

For SwissProt the precision drops as we increase the threshold. The SwissProt documents are clustered into groups of very similar documents (protein variants). For example, two documents with 64 elements have exactly the same structure and vary only in 6 text values. The clustering of the data is evident from the precision values in Figure 8(b) for 0% changed nodes (approximate self join): already for a small threshold τ many documents match documents other than themselves.

We improve the result for SwissProt using a variable threshold. Each document is matched to its nearest neighbor. If a document has more than one nearest neighbor, no match is returned.
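The nearest-neighbor rule and the two quality measures can be illustrated with a short Python sketch. It assumes the pairwise distances are already available as a nested dictionary and that the true match of each original document is known; the names `nearest_neighbor_matches` and `precision_recall` are hypothetical, and returning a precision of 0 when nothing is matched is a convention chosen here, not something the text specifies.

```python
def nearest_neighbor_matches(dist):
    """Match each original document to its unique nearest noisy document.

    dist: dict original id -> dict noisy id -> distance.
    Documents whose smallest distance is shared by several candidates
    get no match at all.
    """
    matches = {}
    for orig, row in dist.items():
        if not row:
            continue
        best = min(row.values())
        nearest = [doc for doc, d in row.items() if d == best]
        if len(nearest) == 1:
            matches[orig] = nearest[0]
    return matches


def precision_recall(matches, truth):
    """Return (precision, recall) in percent.

    matches: dict original id -> matched noisy id (1:1 setting).
    truth:   dict original id -> correct noisy id; len(truth) is 'possible'.
    """
    correct = sum(1 for o, m in matches.items() if truth.get(o) == m)
    incorrect = len(matches) - correct
    precision = 100.0 * correct / (correct + incorrect) if matches else 0.0
    recall = 100.0 * correct / len(truth)
    return precision, recall
```

A tie between two candidates lowers the recall (the document is left unmatched) but cannot lower the precision, which is the trade-off the variable threshold makes.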
Figure 7(d) shows the results for the SwissProt database. The algorithm returns precise matches, and even for errors of …% we miss only about 1% of the matches.

Fig. 8. Matching with Different Thresholds. (a) DBLP. (b) SwissProt. (c) Treebank. (Axes: precision [%] and recall [%]; one curve per threshold τ.)

X. CONCLUSION

When XML data from different sources is integrated in a single data collection, data items that represent the same real world object must be recognized. Exact matches, however, often fail in such applications (elements may be missing in one database, content values may not match due to different coding conventions and spelling mistakes, and the data may be arranged in a different structure). Approximate matching techniques must be applied. Previous research developed approximate join operations based on ordered tree matching, but for data-centric XML applications the order of siblings should not matter. Data-centric XML can be represented as unordered trees.

In this paper we propose an approximate join technique for data-centric XML based on windowed pq-grams. pq-grams were developed for the approximate matching of ordered trees [11]. We introduce windowed pq-grams to approximately match unordered trees in a three-step process: sorting the tree, extending the sorted tree with dummy nodes, and computing the windowed pq-grams on the extended tree. Windowed pq-grams consist of a stem and a base. The stems are invariant to order, and the main challenge is to compute the bases. Our bases enjoy the following important properties:

all non-root nodes appear in the same number of bases; the Jaccard distance between sibling sets is preserved; and node moves to other parents are detected.

The windowed pq-gram distance between sorted trees approximates the tree edit distance between the respective unordered trees. We show that the permutation of a constant number of siblings changes only a constant number of windowed pq-grams. This core feature makes windowed pq-grams eligible for our tree sorting approach and rules out other common distances such as the edit distance between ordered trees. We provide an efficient algorithm for the approximate join of unordered trees, which is implemented as an equality join on windowed pq-grams and can take advantage of well-known join optimization techniques.

To the best of our knowledge, this is the first work to address the problem of approximately joining data-centric XML, where the distance algorithm cannot take advantage of a predefined document order. Extensive experiments on both synthetic and real world data confirm the analytic results and suggest that our technique is both useful and scalable.

Future work includes the investigation of persistent, updatable index structures for the windowed pq-gram join. As windowed pq-grams store local information, a document modification (e.g., an altered text value) affects only a limited number of windowed pq-grams. The index should be updated incrementally by substituting the affected pq-grams only, thus avoiding the recomputation of all windowed pq-grams from scratch. Further, we plan to combine our approximate join on data-centric XML with approximate string matching techniques. The string values of some elements or attributes may be particularly important to identify a data item; for example, the title of an article is very significant in an XML database about publications. We would like to include both the similarity of the XML structure and the similarity of selected string values into our approximate join.

ACKNOWLEDGEMENTS

The work has been done in the framework of the project eBZ Digital City, which is funded by the Municipality of Bolzano-Bozen.
We wish to thank our colleagues at the municipality, in particular Franco Barducci, Walter Costanzi, Roberto Loperfido, and Danila Sartori.

REFERENCES

[1] G. Cobéna, S. Abiteboul, and A. Marian, "Detecting changes in XML documents," in Proceedings of the International Conference on Data Engineering (ICDE). San Jose, California: IEEE Computer Science Press, 2002.
[2] S. Guha, H. V. Jagadish, N. Koudas, D. Srivastava, and T. Yu, "Approximate XML joins," in Proceedings of the ACM SIGMOD International Conference on Management of Data. Madison, Wisconsin: ACM Press, 2002.
[3] K.-H. Lee, Y.-C. Choy, and S.-B. Cho, "An efficient algorithm to compute differences between structured documents," IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 16, no. 8, Aug. 2004.
[4] K. Zhang, R. Statman, and D. Shasha, "On the editing distance between unordered labeled trees," Information Processing Letters, vol. 42, no. 3, 1992.
[5] K.-C. Tai, "The tree-to-tree correction problem," Journal of the ACM (JACM), vol. 26, no. 3, July 1979.
[6] W. Chen, "New algorithm for ordered tree-to-tree correction problem," Journal of Algorithms, vol. 40, no. 2, Aug. 2001.
[7] E. D. Demaine, S. Mozes, B. Rossman, and O. Weimann, "An optimal decomposition algorithm for tree edit distance," in Proceedings of the 34th International Colloquium on Automata, Languages and Programming (ICALP 2007), Wroclaw, Poland, 2007.
[8] P. N. Klein, "Computing the edit-distance between unrooted ordered trees," in Proceedings of the 6th European Symposium on Algorithms, ser. Lecture Notes in Computer Science. Venice, Italy: Springer, 1998.
[9] K. Zhang and D. Shasha, "Simple fast algorithms for the editing distance between trees and related problems," SIAM Journal on Computing, vol. 18, no. 6, 1989.
[10] M. Garofalakis and A. Kumar, "XML stream processing using tree-edit distance embeddings," ACM Transactions on Database Systems, vol. 30, no. 1, 2005.
[11] N. Augsten, M. Böhlen, and J. Gamper, "Approximate matching of hierarchical data using pq-grams," in Proceedings of the International Conference on Very Large Databases (VLDB). Trondheim, Norway: Morgan Kaufmann Publishers Inc., Sept. 2005.
[12] S. S. Chawathe, A. Rajaraman, H. Garcia-Molina, and J. Widom, "Change detection in hierarchically structured information," in Proceedings of the ACM SIGMOD International Conference on Management of Data. Montreal, Canada: ACM Press, June 1996.
[13] S. S. Chawathe and H. Garcia-Molina, "Meaningful change detection in structured data," in Proceedings of the ACM SIGMOD International Conference on Management of Data. Tucson, Arizona, United States: ACM Press, May 1997.
[14] Y. Wang, D. J. DeWitt, and J. Cai, "X-Diff: An effective change detection algorithm for XML documents," in Proceedings of the International Conference on Data Engineering (ICDE). Bangalore, India: IEEE Computer Science Press, Mar. 2003.
[15] M. Weis and F. Naumann, "DogmatiX tracks down duplicates in XML," in Proceedings of the ACM SIGMOD International Conference on Management of Data. Baltimore, Maryland, USA: ACM Press, June 2005.
[16] S. Puhlmann, M. Weis, and F. Naumann, "XML duplicate detection using sorted neighborhoods," in Proceedings of the International Conference on Extending Database Technology (EDBT), ser. Lecture Notes in Computer Science. Munich, Germany: Springer, Mar. 2006.
[17] I. Sanz, M. Mesiti, G. Guerrini, and R. Berlanga, "Fragment-based approximate retrieval in highly heterogeneous XML collections," Data & Knowledge Engineering, vol. 64, no. 1, Jan. 2008.
[18] N. Bruno, N. Koudas, and D. Srivastava, "Holistic twig joins: Optimal XML pattern matching," in Proceedings of the ACM SIGMOD International Conference on Management of Data. Madison, Wisconsin: ACM Press, June 2002.
[19] H. Jiang, W. Wang, H. Lu, and J. X. Yu, "Holistic twig joins on indexed XML documents," in Proceedings of the International Conference on Very Large Databases (VLDB). Berlin, Germany: Morgan Kaufmann Publishers Inc., Sept. 2003.
[20] T. Dalamagas, T. Cheng, K.-J. Winkel, and T. Sellis, "A methodology for clustering XML documents by structure," Information Systems, vol. 31, no. 3, May 2006.
[21] S. Flesca, G. Manco, E. Masciari, L. Pontieri, and A. Pugliese, "Fast detection of XML structural similarity," IEEE Transactions on Knowledge and Data Engineering (TKDE), vol. 17, no. 2, Feb. 2005.
[22] A. Nierman and H. V. Jagadish, "Evaluating structural similarity in XML documents," in Proceedings of the Fifth International Workshop on the Web and Databases (WebDB 2002), Madison, Wisconsin, USA, June 2002.
[23] C. J. van Rijsbergen, Information Retrieval, 2nd ed. Butterworth-Heinemann, Mar. 1979, ch. 3.
[24] R. M. Karp and M. O. Rabin, "Efficient randomized pattern-matching algorithms," IBM Journal of Research and Development, vol. 31, no. 2, pp. 249-260, Mar. 1987.


More information

Single-Layer Trunk Routing Using 45-Degree Lines within Critical Areas for PCB Routing

Single-Layer Trunk Routing Using 45-Degree Lines within Critical Areas for PCB Routing SASIMI 2010 Proeedings (R3-8) Single-Lyer Trunk Routing Using 45-Degree Lines within Critil Ares for PCB Routing Kyosuke SHINODA Yukihide KOHIRA Atsushi TAKAHASHI Tokyo Institute of Tehnology Dept. of

More information

CS321 Languages and Compiler Design I. Winter 2012 Lecture 5

CS321 Languages and Compiler Design I. Winter 2012 Lecture 5 CS321 Lnguges nd Compiler Design I Winter 2012 Lecture 5 1 FINITE AUTOMATA A non-deterministic finite utomton (NFA) consists of: An input lphet Σ, e.g. Σ =,. A set of sttes S, e.g. S = {1, 3, 5, 7, 11,

More information

COSC 6374 Parallel Computation. Non-blocking Collective Operations. Edgar Gabriel Fall Overview

COSC 6374 Parallel Computation. Non-blocking Collective Operations. Edgar Gabriel Fall Overview COSC 6374 Prllel Computtion Non-loking Colletive Opertions Edgr Griel Fll 2014 Overview Impt of olletive ommunition opertions Impt of ommunition osts on Speedup Crtesin stenil ommunition All-to-ll ommunition

More information

A matching algorithm for measuring the structural similarity between an XML document and a DTD and its applications $

A matching algorithm for measuring the structural similarity between an XML document and a DTD and its applications $ Informtion Systems 29 (2004) 23 46 A mthing lgorithm for mesuring the struturl similrity etween n XML oument n DTD n its pplitions $ Elis Bertino, Giovnn Guerrini, Mro Mesiti, * Diprtimento i Informti

More information

Research Article Determining Sensor Locations in Wireless Sensor Networks

Research Article Determining Sensor Locations in Wireless Sensor Networks Interntionl Journl of Distriuted Sensor Networks Volume 2015, Artile ID 914625, 6 pges http://dx.doi.org/10.1155/2015/914625 Reserh Artile Determining Sensor Lotions in Wireless Sensor Networks Zimo Li

More information

COMPUTATION AND VISUALIZATION OF REACHABLE DISTRIBUTION NETWORK SUBSTATION VOLTAGE

COMPUTATION AND VISUALIZATION OF REACHABLE DISTRIBUTION NETWORK SUBSTATION VOLTAGE 24 th Interntionl Conferene on Eletriity Distriution Glsgow, 12-15 June 2017 Pper 0615 COMPUTATION AND VISUALIZATION OF REACHABLE DISTRIBUTION NETWORK SUBSTATION VOLTAGE Mihel SANKUR Dniel ARNOLD Lun SCHECTOR

More information

Final Exam Review F 06 M 236 Be sure to look over all of your tests, as well as over the activities you did in the activity book

Final Exam Review F 06 M 236 Be sure to look over all of your tests, as well as over the activities you did in the activity book inl xm Review 06 M 236 e sure to loo over ll of your tests, s well s over the tivities you did in the tivity oo 1 1. ind the mesures of the numered ngles nd justify your wor. Line j is prllel to line.

More information

SMALL SIZE EDGE-FED SIERPINSKI CARPET MICROSTRIP PATCH ANTENNAS

SMALL SIZE EDGE-FED SIERPINSKI CARPET MICROSTRIP PATCH ANTENNAS Progress In Eletromgnetis Reserh C, Vol. 3, 195 22, 28 SMALL SIZE EDGE-FED SIERPINSKI CARPET MICROSTRIP PATCH ANTENNAS W.-L. Chen nd G.-M. Wng Rdr Engineering Deprtment Missile Institute of Air Fore Engineering

More information

FASTEST METHOD TO FIND ALTERNATIVE RE-ROUTE

FASTEST METHOD TO FIND ALTERNATIVE RE-ROUTE INTERNATIONAL JOURNAL OF RESEARCH IN COMPUTER APPLICATIONS AND ROBOTICS ISSN 2320-7345 FASTEST METHOD TO FIND ALTERNATIVE RE-ROUTE 1 M.JothiLkshmi, M.S., M.Phil. 2 C.Theeendr, M.S., M.Phil. 3 M.K.Pvithr,

More information

COSC 6374 Parallel Computation. Communication Performance Modeling (II) Edgar Gabriel Fall Overview. Impact of communication costs on Speedup

COSC 6374 Parallel Computation. Communication Performance Modeling (II) Edgar Gabriel Fall Overview. Impact of communication costs on Speedup COSC 6374 Prllel Computtion Communition Performne Modeling (II) Edgr Griel Fll 2015 Overview Impt of ommunition osts on Speedup Crtesin stenil ommunition All-to-ll ommunition Impt of olletive ommunition

More information

IMAGE COMPRESSION USING HIRARCHICAL LINEAR POLYNOMIAL CODING

IMAGE COMPRESSION USING HIRARCHICAL LINEAR POLYNOMIAL CODING Rsh Al-Tmimi et l, Interntionl Journl of Computer Siene nd Mobile Computing, Vol.4 Issue.1, Jnury- 015, pg. 11-119 Avilble Online t www.ijsm.om Interntionl Journl of Computer Siene nd Mobile Computing

More information

and vertically shrinked by

and vertically shrinked by 1. A first exmple 1.1. From infinite trnsltion surfe mp to end-periodi mp. We begin with n infinite hlf-trnsltion surfe M 0 desribed s in Figure 1 nd n ffine mp f 0 defined s follows: the surfe is horizontlly

More information

McAfee Web Gateway

McAfee Web Gateway Relese Notes Revision C MAfee We Gtewy 7.6.2.11 Contents Aout this relese Enhnement Resolved issues Instlltion instrutions Known issues Additionl informtion Find produt doumenttion Aout this relese This

More information

LINX MATRIX SWITCHERS FIRMWARE UPDATE INSTRUCTIONS FIRMWARE VERSION

LINX MATRIX SWITCHERS FIRMWARE UPDATE INSTRUCTIONS FIRMWARE VERSION Overview LINX MATRIX SWITCHERS FIRMWARE UPDATE INSTRUCTIONS FIRMWARE VERSION 4.4.1.0 Due to the omplex nture of this updte, plese fmilirize yourself with these instrutions nd then ontt RGB Spetrum Tehnil

More information

An Algorithm for Enumerating All Maximal Tree Patterns Without Duplication Using Succinct Data Structure

An Algorithm for Enumerating All Maximal Tree Patterns Without Duplication Using Succinct Data Structure , Mrch 12-14, 2014, Hong Kong An Algorithm for Enumerting All Mximl Tree Ptterns Without Dupliction Using Succinct Dt Structure Yuko ITOKAWA, Tomoyuki UCHIDA nd Motoki SANO Astrct In order to extrct structured

More information

UNIT 11. Query Optimization

UNIT 11. Query Optimization UNIT Query Optimiztion Contents Introduction to Query Optimiztion 2 The Optimiztion Process: An Overview 3 Optimiztion in System R 4 Optimiztion in INGRES 5 Implementing the Join Opertors Wei-Png Yng,

More information

GENG2140 Modelling and Computer Analysis for Engineers

GENG2140 Modelling and Computer Analysis for Engineers GENG4 Moelling n Computer Anlysis or Engineers Letures 9 & : Gussin qurture Crete y Grn Romn Joles, PhD Shool o Mehnil Engineering, UWA GENG4 Content Deinition o Gussin qurture Computtion o weights n points

More information

Class Overview. Database Design. Database Design Process. Database Design. Introduction to Data Management CSE 414

Class Overview. Database Design. Database Design Process. Database Design. Introduction to Data Management CSE 414 Introution to Dt Mngement CSE 44 Unit 6: Coneptul Design E/R Digrms Integrity Constrints BCNF Introution to Dt Mngement CSE 44 E/R Digrms ( letures) CSE 44 Autumn 08 Clss Overview Dtse Design Unit : Intro

More information

Solving Problems by Searching. CS 486/686: Introduction to Artificial Intelligence Winter 2016

Solving Problems by Searching. CS 486/686: Introduction to Artificial Intelligence Winter 2016 Solving Prolems y Serching CS 486/686: Introduction to Artificil Intelligence Winter 2016 1 Introduction Serch ws one of the first topics studied in AI - Newell nd Simon (1961) Generl Prolem Solver Centrl

More information

Algorithm Design (5) Text Search

Algorithm Design (5) Text Search Algorithm Design (5) Text Serch Tkshi Chikym School of Engineering The University of Tokyo Text Serch Find sustring tht mtches the given key string in text dt of lrge mount Key string: chr x[m] Text Dt:

More information

INTEGRATED WORKFLOW ART DIRECTOR

INTEGRATED WORKFLOW ART DIRECTOR ART DIRECTOR Progrm Resoures INTEGRATED WORKFLOW PROGRAM PLANNING PHASE In this workflow phse proess, you ollorte with the Progrm Mnger, the Projet Mnger, nd the Art Speilist/ Imge Led to updte the resoures

More information

Asymmetric Visual Hierarchy Comparison with Nested Icicle Plots

Asymmetric Visual Hierarchy Comparison with Nested Icicle Plots symmetri Visul Hierrhy Comprison with Nested Iile Plots Fin ek 1, Frnz-Josef Wiszniewsky 2, Mihel urh 1, Stephn Diehl 2, nd Dniel Weiskopf 1 1 VISUS, University of Stuttgrt, Germny 2 University of Trier,

More information

Suffix trees, suffix arrays, BWT

Suffix trees, suffix arrays, BWT ALGORITHMES POUR LA BIO-INFORMATIQUE ET LA VISUALISATION COURS 3 Rluc Uricru Suffix trees, suffix rrys, BWT Bsed on: Suffix trees nd suffix rrys presenttion y Him Kpln Suffix trees course y Pco Gomez Liner-Time

More information

Presentation Martin Randers

Presentation Martin Randers Presenttion Mrtin Rnders Outline Introduction Algorithms Implementtion nd experiments Memory consumption Summry Introduction Introduction Evolution of species cn e modelled in trees Trees consist of nodes

More information

Taming Subgraph Isomorphism for RDF Query Processing

Taming Subgraph Isomorphism for RDF Query Processing Tming Sugrph Isomorphism for RDF Query Proessing Jinh Kim # jinh.kim@orle.om Hyungyu Shin hgshin@dl.posteh..kr Wook-Shin Hn wshn@posteh..kr Sungpk Hong # Hssn Chfi # {sungpk.hong, hssn.hfi}@orle.om POSTECH,

More information

UTMC APPLICATION NOTE UT1553B BCRT TO INTERFACE PSEUDO-DUAL-PORT RAM ARCHITECTURE INTRODUCTION ARBITRATION DETAILS DESIGN SELECTIONS

UTMC APPLICATION NOTE UT1553B BCRT TO INTERFACE PSEUDO-DUAL-PORT RAM ARCHITECTURE INTRODUCTION ARBITRATION DETAILS DESIGN SELECTIONS UTMC APPLICATION NOTE UT1553B BCRT TO 80186 INTERFACE INTRODUCTION The UTMC UT1553B BCRT is monolithi CMOS integrte iruit tht provies omprehensive Bus Controller n Remote Terminl funtions for MIL-STD-

More information

Containers: Queue and List

Containers: Queue and List Continers: Queue n List Queue A ontiner in whih insertion is one t one en (the til) n eletion is one t the other en (the he). Also lle FIFO (First-In, First-Out) Jori Cortell n Jori Petit Deprtment of

More information

A METHOD FOR CHARACTERIZATION OF THREE-PHASE UNBALANCED DIPS FROM RECORDED VOLTAGE WAVESHAPES

A METHOD FOR CHARACTERIZATION OF THREE-PHASE UNBALANCED DIPS FROM RECORDED VOLTAGE WAVESHAPES A METHOD FOR CHARACTERIZATION OF THREE-PHASE UNBALANCED DIPS FROM RECORDED OLTAGE WAESHAPES M.H.J. Bollen, L.D. Zhng Dept. Eletri Power Engineering Chlmers University of Tehnology, Gothenurg, Sweden Astrt:

More information

A distributed edit-compile workflow

A distributed edit-compile workflow Time Synhroniztion nd Logil Cloks Tody 1. The need for time synhroniztion 2. Wll lok time synhroniztion 3. Logil Time: Lmport Cloks COS 418: Distriuted Systems Leture 4 Kyle Jmieson 2 A distriuted edit-ompile

More information

CS481: Bioinformatics Algorithms

CS481: Bioinformatics Algorithms CS481: Bioinformtics Algorithms Cn Alkn EA509 clkn@cs.ilkent.edu.tr http://www.cs.ilkent.edu.tr/~clkn/teching/cs481/ EXACT STRING MATCHING Fingerprint ide Assume: We cn compute fingerprint f(p) of P in

More information

Tracking Hidden Agents Through Shadow Information Spaces

Tracking Hidden Agents Through Shadow Information Spaces Trking Hidden Agents Through Shdow Informtion Spes Jingjin Yu Steven M. LVlle jyu@uiu.edu lvlle@uiu.edu Deprtment of Computer Siene University of Illinois Urn, IL 601 USA Astrt This pper ddresses prolems

More information

If you are at the university, either physically or via the VPN, you can download the chapters of this book as PDFs.

If you are at the university, either physically or via the VPN, you can download the chapters of this book as PDFs. Lecture 5 Wlks, Trils, Pths nd Connectedness Reding: Some of the mteril in this lecture comes from Section 1.2 of Dieter Jungnickel (2008), Grphs, Networks nd Algorithms, 3rd edition, which is ville online

More information

Ma/CS 6b Class 1: Graph Recap

Ma/CS 6b Class 1: Graph Recap M/CS 6 Clss 1: Grph Recp By Adm Sheffer Course Detils Adm Sheffer. Office hour: Tuesdys 4pm. dmsh@cltech.edu TA: Victor Kstkin. Office hour: Tuesdys 7pm. 1:00 Mondy, Wednesdy, nd Fridy. http://www.mth.cltech.edu/~2014-15/2term/m006/

More information

Enterprise Digital Signage Create a New Sign

Enterprise Digital Signage Create a New Sign Enterprise Digitl Signge Crete New Sign Intended Audiene: Content dministrtors of Enterprise Digitl Signge inluding stff with remote ess to sign.pitt.edu nd the Content Mnger softwre pplition for their

More information

4.3 Balanced Trees. let us assume that we can manipulate them conveniently and see how they can be put together to form trees.

4.3 Balanced Trees. let us assume that we can manipulate them conveniently and see how they can be put together to form trees. 428 T FOU 4.3 Blned Trees T BT GOIT IN T VIOU setion work well for wide vriety of pplitions, ut they hve poor worst-se performne. s we hve noted, files lredy in order, files in reverse order, files with

More information

Troubleshooting. Verify the Cisco Prime Collaboration Provisioning Installation (for Advanced or Standard Mode), page

Troubleshooting. Verify the Cisco Prime Collaboration Provisioning Installation (for Advanced or Standard Mode), page Trouleshooting This setion explins the following: Verify the Ciso Prime Collortion Provisioning Instlltion (for Advned or Stndrd Mode), pge 1 Upgrde the Ciso Prime Collortion Provisioning from Smll to

More information

COMBINATORIAL PATTERN MATCHING

COMBINATORIAL PATTERN MATCHING COMBINATORIAL PATTERN MATCHING Genomic Repets Exmple of repets: ATGGTCTAGGTCCTAGTGGTC Motivtion to find them: Genomic rerrngements re often ssocited with repets Trce evolutionry secrets Mny tumors re chrcterized

More information

12/9/14. CS151 Fall 20124Lecture (almost there) 12/6. Graphs. Seven Bridges of Königsberg. Leonard Euler

12/9/14. CS151 Fall 20124Lecture (almost there) 12/6. Graphs. Seven Bridges of Königsberg. Leonard Euler CS5 Fll 04Leture (lmost there) /6 Seven Bridges of Königserg Grphs Prof. Tny Berger-Wolf Leonrd Euler 707-783 Is it possile to wlk with route tht rosses eh ridge e Seven Bridges of Königserg Forget unimportnt

More information

An Efficient Code Update Scheme for DSP Applications in Mobile Embedded Systems

An Efficient Code Update Scheme for DSP Applications in Mobile Embedded Systems An Effiient Code Updte Sheme for DSP Applitions in Moile Emedded Systems Weiji Li, Youto Zhng Computer Siene Deprtment,University of Pittsurgh,Pittsurgh, PA 526 {weijili,zhngyt}@s.pitt.edu Astrt DSP proessors

More information

String comparison by transposition networks

String comparison by transposition networks String omprison y trnsposition networks Alexnder Tiskin (Joint work with Peter Krushe) Deprtment of Computer Siene University of Wrwik http://www.ds.wrwik..uk/~tiskin (inludes n extended version of this

More information

s 1 t 4 s 2 4 t 2 a b r 2 r 8 r10 g 4

s 1 t 4 s 2 4 t 2 a b r 2 r 8 r10 g 4 k-pirs Non-Crossing Shortest Pths in Simple Polgon Evnthi Ppdopoulou Northwestern Universit, Evnston, Illinois 60208, USA Astrt. This pper presents n O(n + k) time lgorithm to ompute the set of k non-rossing

More information

Information Retrieval and Organisation

Information Retrieval and Organisation Informtion Retrievl nd Orgnistion Suffix Trees dpted from http://www.mth.tu.c.il/~himk/seminr02/suffixtrees.ppt Dell Zhng Birkeck, University of London Trie A tree representing set of strings { } eef d

More information