A matching algorithm for measuring the structural similarity between an XML document and a DTD and its applications


Information Systems 29 (2004)

Elisa Bertino, Giovanna Guerrini, Marco Mesiti

Dipartimento di Informatica e Comunicazione, Università degli Studi di Milano, Via Comelico 39/41, Milano, Italy
Dipartimento di Informatica, Università degli Studi di Pisa, Via Buonarroti 2, Pisa, Italy

A preliminary version of this paper appeared in Proceedings of the 13th International Symposium on Methodologies for Intelligent Systems, 2002, with the title Matching an XML Document against a Set of DTDs.
Corresponding author: M. Mesiti, Dipartimento di Informatica e Comunicazione, Università degli Studi di Milano, Via Comelico 39/41, Milano, Italy. E-mail address: mesiti@disi.unige.it.

Abstract

In this paper we propose a matching algorithm for measuring the structural similarity between an XML document and a DTD. The matching algorithm, by comparing the document structure against the one the DTD requires, is able to identify commonalities and differences. Differences can be due to the presence of extra elements with respect to those the DTD requires and to the absence of required elements. The evaluation of commonalities and differences gives rise to a numerical rank of the structural similarity. Moreover, in the paper, some applications of the matching algorithm are discussed. Specifically, the matching algorithm is exploited for the classification of XML documents against a set of DTDs, the evolution of the DTD structure, the evaluation of structural queries, the selective dissemination of XML documents, and the protection of XML document contents. © 2003 Elsevier Science Ltd. All rights reserved.

Keywords: Structural similarity; Document classification; Structure evolution; Structural queries; Selective dissemination of documents; Document protection

1. Introduction

Similarity plays a crucial role in many research fields. Similarity serves as an organization principle by which individuals classify objects, form concepts, and make generalizations [1]. Similarity can be computed at different layers of abstraction: at the data layer (i.e. similarity between data), at the type layer (i.e. similarity between types, also referred to as schema, models, or structures, depending on the application domain) or between the two layers (i.e. similarity between data and types). Evaluating similarity among data is relevant for creating clusters of information related to the same topic. For example, in the image field, the similarity measure can be exploited for grouping together images containing the same subject. Evaluating similarity between types is relevant for the integration of schemas describing the same kind of information but using different structures [2] and for schema clustering [3]. Evaluating similarity between data and types is relevant for identifying

the data generator, and thus, applying to data the properties specified for the type. Moreover, with respect to this classification, similarity can be focused on the contents or on the structures of the data involved.

In the XML [4] arena, the possibility of evaluating similarity has been receiving a lot of attention because more and more information exchanged on the Web is adhering to this format and applications need to retrieve, access, and handle XML documents imposing relaxed conditions and returning approximate results. At the data layer, many approaches have been developed for measuring the similarity among XML documents in order to cluster together documents dealing with the same topic. Standard approaches consider the textual content of the documents [5], whereas, recently, some new approaches also consider the structure of documents [6,7]. As far as structural similarity is concerned, many approaches rely on the hierarchical structures of documents, exploiting evaluation functions based on the tree edit distance [8]. At the type layer, other approaches have been developed for the integration of schemas that represent the same kind of data [9-11] and for schema clustering [3].

Despite this huge activity at the data and type layers and the attractive potential applications in many fields, no efforts have been devoted to the computation of structural similarity between an XML document (the data) and a schema (the type). In this paper we introduce a matching algorithm for computing the structural similarity between an XML document and a DTD, which is the simplest means by which structural properties of an XML document can be specified. In matching a document against a DTD, some attributes and subelements specified for an element in the DTD can be missing from the corresponding element of the document, and, vice versa, the document can contain some additional attributes and subelements not appearing in the DTD. Moreover, since we are focusing on data-centric documents, elements/attributes in the document can follow a different order w.r.t. the one specified in the DTD. Finally, document and DTD tags may not be exactly the same, provided they are stems or are similar enough according to a given Thesaurus. Therefore, tag similarity rather than tag equality is supported.
In matching a document against a DTD the goal is then to quantify, through an appropriate measure, the structural similarity between the document and the DTD. Though our technique handles all features of XML documents, in the paper we focus on the most meaningful core of the approach, thus we restrict ourselves to a subset of XML documents and to tag equality. We refer the interested reader to [12,13] for the general case.

Many applications can be devised for the matching algorithm. For example, in the exchange of XML documents on the Web it is not always possible to force a database to adhere or to integrate its schema with other schemas describing the same kind of data. Therefore, the matching algorithm can be employed for computing the similarity between documents arriving at a given XML database and the local schema. As another example, the possibility to exploit the structure of documents for their retrieval is pushing the need for query engines able to evaluate structural queries (i.e. queries in which conditions are imposed on the structure of the required documents). The query engines can employ the matching algorithm for evaluating the similarity between a document (a possible answer to the query) and a structural query represented as a schema plus content conditions. By means of this, the query engine can filter and rank answers to the query.

In this paper we focus on five applications of the algorithm: (1) the classification of XML documents against a set of DTDs; (2) the generation of a new schema for a DTD by extracting structural information during the classification of XML documents; (3) the development of an XML-based search engine able to answer approximate structural queries; (4) the selective dissemination of XML documents; (5) the protection of the contents of documents classified against a set of DTDs of a database, by propagating the authorization policies specified at DTD level.

The remainder of the paper is organized as follows. Section 2 presents our tree representation for XML documents and DTDs. Section 3 discusses the basic principles underlying the behavior of the matching algorithm. Section 4 discusses in detail the matching algorithm,

whereas Section 5 presents the matching algorithm applications. Section 6 discusses related work, and, finally, Section 7 concludes the work and outlines future research directions.

2. Documents and DTDs as trees

We focus on a subset of XML documents. Specifically, we only consider elements (that can have a nested structure), disregarding attributes (that can be seen as a particular case of elements). Since we disregard attributes, we only consider nonempty elements. However, empty elements can be simply handled as elements with the constraint to have null content. In the matching process, we represent both DTDs and XML documents through labeled trees. The document representation is compliant with the tree representation of DOM [14]. By contrast, the DTD representation makes the description of the algorithms easy.

2.1. Tree representation of documents

An XML document is represented as a labeled tree. This representation only relies on information determined from the structure of the document. Our definition is based on the classical definition of labeled tree. We recall that, given a set $N$ of nodes, a tree is defined by induction as follows: $v \in N$ is a tree; if $T_1, \ldots, T_n$ are trees, then $(v, [T_1, \ldots, T_n])$ is a tree (see footnote 1). Let $N(T) \subseteq N$ denote the set of nodes of a tree $T$, and given a set $A$ of labels, a labeled tree is a pair $(T, \varphi)$, where $T$ is a tree, and $\varphi$ is a labeling function s.t. $\forall v \in N(T)$, $\varphi(v) \in A$.

Footnote 1: In the remainder of the paper, for the sake of simplicity, we denote with $C$ the subtrees of $T$ (i.e. $C = [T_1, \ldots, T_n]$) when it is only relevant to know that $T$ is an internal subtree of a DTD.

In our representation of documents each node represents an element tag or a value. The labels used to label the tree belong to a set of element tags ($EN$) and to a set of values that the data contents of an element can assume ($V$). In each tree representing a document the label of the root belongs to $EN$ (it is the name of the document element). Moreover, leaves of the tree are labeled by values in $V$.

A key feature of XML is represented by the various options one has available when modeling document subelements. We illustrate those options by means of the document and DTD reported respectively in Figs. 1 and 2. The DTD in the figure shows that for each subelement it is possible to specify whether it is optional (?), whether it may occur several times (* for 0 or more times, and + for 1 or more times), whether some subelements are alternative with respect to each other (|) or are grouped in a sequence (,).

Fig. 1. An example of XML document.
Fig. 2. An example of DTD.

Definition 1 (XML document). An XML document is a labeled tree $(D, \varphi_D)$ defined on the set of labels $EN \cup V$ with the following properties:
1. $D = (v, C)$ with $\varphi_D(v) \in EN$;
2. for each subtree $(v, C)$ of $D$, $\varphi_D(v) \in EN$; and
3. for each (leaf) subtree $v$ of $D$, $\varphi_D(v) \in V$.

For the sake of simplicity, in the graphical representation we omit the explicit direction of edges. All edges are oriented downward. Fig. 3 shows the tree representation of the XML document in Fig. 1.

Fig. 3. Tree representation of the XML document in Fig. 1.

2.2. Tree representation of DTDs

A DTD is also represented as a labeled tree. In the tree representation, in order to represent optional elements, repeatable elements, sequences and alternatives of elements, the set of operators $OP = \{?, *, +, \mathrm{AND}, \mathrm{OR}\}$ is introduced. The AND operator represents a sequence of elements, the OR operator represents an alternative of elements (exactly one of the alternatives must be selected), the ? operator represents an optional element, whereas the * and + operators represent repeatable elements (0 or more times, 1 or more times, respectively). In the matching process we do not consider sequences of unary operators (that is, ?, *, +) because a concise and equivalent representation with a single operator always exists.

In our representation of DTDs each node corresponds to an element, or to an element type, or to an operator. In each tree representing a DTD the label of the root belongs to $EN$ (it is the name of the main element of documents described by the DTD) and there is a single edge outgoing from the root. Moreover, there can be more than one edge outgoing from a node only if the node is labeled by AND or OR. Finally, all nodes labeled by types are leaves of the tree. Let $ET$ be the set of possible basic types for elements ($ET = \{\#PCDATA, ANY\}$).

Definition 2 (DTD). A DTD is a labeled tree $(T, \varphi_T)$ defined on the set of labels $EN \cup ET \cup OP$ with the following properties:
1. $T$ is of the form $(v, [T'])$ with $\varphi_T(v) \in EN$;
2. for each subtree $(v, C)$ of $T$, $\varphi_T(v) \in EN \cup OP$;
3. for each (leaf) subtree $v$ of $T$, $\varphi_T(v) \in ET$;
4. for each subtree $(v, C)$ of $T$, if $\varphi_T(v) \in \{\mathrm{OR}, \mathrm{AND}\}$, then $C = [T_1, \ldots, T_n]$, $n > 1$; and
5. for each subtree $(v, C)$ of $T$, if $\varphi_T(v) \in \{?, *, +\} \cup EN$, then $C = [T']$.

Fig. 4 shows the tree representation of the DTD in Fig. 2.

Fig. 4. Tree representation of DTD in Fig. 2.

We remark that the introduction of operators $OP = \{\mathrm{AND}, \mathrm{OR}, ?, *, +\}$ allows us to represent the structure of all kinds of DTDs. The introduction of the AND operator is required in order to distinguish between an element containing an alternative between an element and a sequence (e.g. <!ELEMENT a (b|(c1,c2))>) and an element containing the alternative between all the elements in the sequence (e.g. <!ELEMENT a (b|c1|c2)>). The two different tree representations are shown in Fig. 5, together with a document that is valid with respect to one of the two DTDs but not with respect to the other.

3. Principles in matching an XML document against a DTD

In this section we introduce by means of some examples the behavior of the matching algorithm

for the evaluation of the similarity between an XML document and a DTD. In particular we discuss the most relevant issues in this match and how the algorithm addresses them. We remark that we have chosen simple examples that allow us to focus on the behavior of the algorithm in common cases. The matching algorithm is complete enough to be used in the similarity evaluation of arbitrary documents and DTDs, characterized by any combination of the features discussed in this section.

Fig. 5. Example of DTDs motivating the introduction of AND labeled nodes.

3.1. Matching a document against a set of documents

Two different approaches can be devised for measuring the structural similarity between an XML document and a DTD: the DTD can be exploited either as a generator of document structures (extensional approach) or as a set of rules constraining the content of each element (intensional approach).

According to the extensional approach, the set of possible document structures of documents valid for the DTD is considered (see footnote 2). By considering one document structure at a time, existing algorithms for measuring the structural similarity between XML documents [6,7] can be applied. The match resulting in the highest similarity value is considered as the best match and its evaluation as the structural similarity degree between the document and the DTD.

Footnote 2: Note that this set can be infinite. Taking the document being matched against the DTD into account allows one to consider only a finite, though potentially big, set of document structures.

According to the intensional approach, by contrast, the structural similarity measure is computed by means of a matching algorithm that compares the document structure to the DTD. The rules constraining the element contents are exploited for determining the best match. The set of document structures the DTD describes is not computed. Rather, the best structure for an element specification, for elements containing alternatives or repetitions, is locally determined as soon as the structure of its subelements in the document is known.

Since the extensional approach can result in exponential complexity even for very common cases, we present a matching algorithm based on the intensional approach.
Note that the intensional approach also has, in the general case, exponential complexity. However, in a significant subset of cases, the most common in practice, the algorithm is polynomial as we show in Section 4.4.

3.2. Common, plus, and minus elements

The matching algorithm relies on the identification and proper evaluation of:
elements appearing both in the document and in the DTD, referred to as common elements;
elements appearing in the document but not in the DTD, referred to as plus elements;
elements appearing in the DTD but not in the document, referred to as minus elements.

Fig. 6. Identification of plus, minus, and common elements.

Example 1. Consider the document D and the two DTDs in Fig. 6, referred to below as the first and the second DTD. The matching algorithm identifies that D and the first DTD have the same tag for the document element but some of

the subelements are different. In particular, D and the first DTD share two subelements, whereas D contains two subelements (one of them is f) not appearing in the first DTD, and the first DTD contains elements g and h not appearing in D. Thus, the algorithm detects that the two structures have two common elements, two minus elements, and two plus elements. Consider now the second DTD. The matching algorithm determines that D and the second DTD share three subelements, whereas D contains element f not appearing in the second DTD, and the second DTD contains elements g, h, and i not appearing in D. Thus, the two structures have three common elements, one plus element, and three minus elements.

In the two examples the identified common, plus, and minus elements have to be properly evaluated in order to identify the best DTD between the two. Obviously, to achieve the best similarity, plus and minus elements should be minimized and common elements should be maximized. If we consider the absence of an element equivalent to the presence of an additional element, D is more similar to the second DTD because they have more common elements. However, there are situations in which plus and minus elements cannot be considered equivalent. For this reason we introduce $\alpha$ and $\beta$, two non-negative real numbers that allow us to properly weight plus and minus elements, as we will discuss in Section 4.

The evaluation of such elements is performed by taking into account two main factors. First, the matching algorithm assigns a weight according to the level at which common elements are detected in the hierarchical structure of the two tree representations. Elements at higher levels in the document structure are more relevant than subelements deeply nested in the document structure. Then, the evaluation takes into account the structure of plus and minus elements. Complex elements have a greater impact on the evaluation than simpler ones.
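At a single level, the identification of common, plus, and minus elements amounts to set operations on the subelement tags of a document element and of the corresponding DTD element. The following Python sketch is only an illustration under stated assumptions: the function name classify_children and the tag names are hypothetical, the tags are chosen only so that the counts agree with Example 1, and levels, weights, and DTD operators are ignored.

```python
from typing import NamedTuple, Set


class Split(NamedTuple):
    common: Set[str]
    plus: Set[str]
    minus: Set[str]


def classify_children(doc_tags: Set[str], dtd_tags: Set[str]) -> Split:
    """Partition the subelement tags of one element into common, plus, minus."""
    return Split(
        common=doc_tags & dtd_tags,  # in both the document and the DTD
        plus=doc_tags - dtd_tags,    # only in the document
        minus=dtd_tags - doc_tags,   # only in the DTD
    )


# Hypothetical tags: only the counts are taken from Example 1.
doc = {"b", "c", "d", "f"}
first_dtd = {"b", "c", "g", "h"}
second_dtd = {"b", "c", "d", "g", "h", "i"}

first = classify_children(doc, first_dtd)
second = classify_children(doc, second_dtd)
print(len(first.common), len(first.plus), len(first.minus))     # 2 2 2
print(len(second.common), len(second.plus), len(second.minus))  # 3 1 3
```

With equal relevance for plus and minus elements, the second DTD is the better candidate because it shares more elements with the document.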
In the remainder of the section we discuss how the matching algorithm determines the number of levels of a document/DTD, and we define the function Weight used for determining the structural complexity of an element.

3.2.1. Level of an element

The similarity measure catches the intuition that elements at a higher level in a document are more relevant than elements at a lower level.

Example 2. Consider the documents and the DTD in Fig. 7. Element f, a subelement of one element of the DTD, is missing in one of the documents. By contrast, a subelement of a different element is missing in the other document. The document whose missing element lies at the lower level is the one more similar to the DTD.

Fig. 7. Documents and DTD of Example 2.

We thus introduce the notion of level of an element, related to the depth of the corresponding tree. Given a tree $T$ representing a document, the level of $T$ is its depth as a tree, that is, the number of nodes along the longest maximal path (that is, a path from the root to a leaf) in $T$. By contrast, given a tree $T$ representing a DTD, its level is the number of nodes, not labeled by an operator, along the longest maximal path in $T$. This is because nodes labeled by operators in DTD trees only influence the breadth of the corresponding document trees, not their depths. These notions are formalized by the following definition.

Definition 3 (Function Level). Let $T = (v, [T_1, \ldots, T_n])$ be a subtree of a document or a DTD. Function Level is defined as follows:

\[
\mathit{Level}(T) =
\begin{cases}
1 + \max_{i=1}^{n} \mathit{Level}(T_i) & \text{if } \varphi(v) \in EN,\\
\max_{i=1}^{n} \mathit{Level}(T_i) & \text{if } \varphi(v) \in OP,\\
0 & \text{otherwise.}
\end{cases}
\]

Example 3. Let $T$ denote the DTD in Fig. 4, then $\mathit{Level}(T) = 3$.

In computing the level of a tree, leaves are not considered. This is because we are interested in the number of nested elements and leaves only have data contents.

Now, the matching algorithm can assign a different weight to elements at different levels of the tree. Let $l = \mathit{Level}(T)$ be the level of a document/DTD $T$ and $\gamma$ be the factor of relevance of a level with respect to the underlying level. The root of $T$ will have weight $\gamma^{l}$, and the weight is then divided by $\gamma$ when going down a level to its children. Thus, for a generic level $i$ of $T$, $\gamma^{l-i}$ is the corresponding weight. Such weight is multiplied by the number of common, plus, and minus elements identified at that level in order to take the level into account in the match of the two structures.

3.2.2. Weight of an element

In the evaluation of plus and minus elements the matching algorithm considers their structures, as shown in the following example.

Example 4. Consider the documents and DTDs in Fig. 8. Matching one of the documents against one of the DTDs, we can see that the document lacks an element and the corresponding value. By contrast, matching another document against another DTD, we can see that the document lacks an element and the corresponding subtree. The lack of the element must be evaluated differently in the two cases, since in the first case it has simple data content, whereas in the second one it has a complex substructure. Consider now a further document and DTD pair in Fig. 8: the DTD specifies simple data content for an element, whereas in the document the same element has a more complex substructure.

Fig. 8. Documents and DTDs of Example 4.

The example above shows that the matching algorithm should take into account the structure of plus and minus elements. In the case of minus elements, however, the structure is not fixed. Consider a DTD element in Fig. 8 that has an optional subelement and an alternative of subelements (an element tagged f or an element tagged g). Our idea is to consider, as the structure of the minus elements, the simplest document structure that can be generated from that portion of the DTD. Thus, the measure should not take into account optional or repeatable elements and, in the case of alternative elements, the measure should take into account only one of the alternative elements (reasonably the one with the simplest structure). We thus introduce function Weight to evaluate a subtree of a document or of a DTD.

Definition 4 (Function Weight). Let $T$ be a subtree of a document or DTD $(D, \varphi)$, and $w_l$

be the weight associated with the level of $T$ in $D$. Function Weight is defined as follows (see footnote 3):

\[
\mathit{Weight}(T, w_l) =
\begin{cases}
w_l & \text{if } label(T) \in V \cup ET,\\
0 & \text{if } label(T) \in \{*, ?\},\\
\mathit{Weight}(T', w_l) & \text{if } label(T) = + \text{ and } T = (v, [T']),\\
\sum_{i=1}^{n} \mathit{Weight}(T_i, w_l) & \text{if } label(T) = \mathrm{AND} \text{ and } T = (v, [T_1, \ldots, T_n]),\\
\min_{i=1}^{n} \mathit{Weight}(T_i, w_l) & \text{if } label(T) = \mathrm{OR} \text{ and } T = (v, [T_1, \ldots, T_n]),\\
\sum_{i=1}^{n} \mathit{Weight}\left(T_i, \frac{w_l}{\gamma}\right) + w_l & \text{otherwise, where } T = (v, [T_1, \ldots, T_n]).
\end{cases}
\]

Footnote 3: Given $T = v$ or $T = (v, C)$, $label(T) = \varphi(v)$.

Given a subtree of the document and a weight $w_l$, function Weight multiplies the number of elements in each level by the weight associated with the level. The weight of the level is $w_l$ for the first level, $w_l/2$ for the second level, $w_l/4$ for the third level, and so on (for $\gamma = 2$). The resulting values are then summed. Given a subtree of the DTD and a weight $w_l$, function Weight works as on a document, but it takes into account only mandatory elements in the DTD. That is, the function does not consider optional elements or repeatable elements labeled by *. Moreover, in the case of OR labeled nodes, the weights associated with the possible alternatives are evaluated and the minimal value is chosen. The choice of the minimal value corresponds to selecting the subtree with the simplest structure.

Example 5. Let $T$ be the DTD in Fig. 4, and assume $\gamma = 2$; $\mathit{Weight}(T, 8) = 27$. Note that, in this case, the weight 8 is $2^3$, where 3 is the number of levels of $T$. Moreover, the elements that contribute to the weight of $T$ are the mandatory name and description elements and the urls element, because it is repeatable from 1 to many times (thus an occurrence is mandatory). The total weight of these elements is 19. The others do not contribute because they are optional or repeatable from 0 to many times. Note that the OR subtree does not contribute to the weight because one of the alternatives it bounds is repeatable from 0 to many times. The weight of the root is 8, therefore the total weight is 8 + 19 = 27.

3.3. Optional and repeatable elements

In the case of repeatable elements, the similarity measure must identify the best number of repetitions, that is, the one that maximizes common elements and minimizes plus and minus elements. Note that a higher number of repetitions can result in every element in the document matching an element in the DTD (no plus) but, by contrast, it can increase the number of unmatched elements in the DTD (minus). Optional elements can be considered as special cases of repeatable elements with a constraint on the maximal number of repetitions.

Fig. 9. Document and DTD of Example 6.

Example 6. Consider the document D and the DTD T in Fig. 9. The possibility of repeating an arbitrary number of times the sequence of three elements in T allows us to map each element in D to a corresponding element in T. However, since D contains three elements with the same tag, the sequence in T must be repeated three times, resulting in a total of nine

elements: three present in D and six missing from D (three occurrences of each of the other two elements of the sequence). If, by contrast, we had repeated the sequence twice, we would have obtained two common elements and four minus elements. The situation is summarized in Table 1.

Table 1. Measuring the similarity between the document and the DTD of Example 6 (columns: Repetitions, Common, Plus, Minus).

The matching algorithm handles repeatable elements in the following way. The algorithm matches all the elements (at the current level) against the repeatable element in order to determine the evaluation of common, minus, and plus elements. After that, it determines the best number of repetitions by applying the evaluation function and choosing the maximal value. The evaluation is more complicated when a sequence or an alternative of elements has to be handled. The behavior of the algorithm in these situations is shown in the following section.

3.4. Sequences and alternatives of elements

The evaluation of sequences of elements is performed in two steps. The first step identifies the presence or absence of the single elements of the sequence (i.e. it identifies the minus and common elements). Minus and common elements are evaluated as described above. Then, the sequence of elements is evaluated by summing up the evaluations obtained for the single minus and common elements. The evaluation is more complicated when the sequence is repeatable. In this situation, indeed, the algorithm should identify the possible repetitions of the sequence. The evaluation of each sequence corresponds to the sum of the evaluation of common and missing elements, and the best number of repetitions of the sequence is determined by exploiting the evaluation function.

Example 7. Consider the document and the DTD of Example 6. In the evaluation of the AND operator (which is repeatable), the algorithm first finds that one element of the sequence in the DTD has 3 matches in the document, whereas the other two elements of the sequence have no match in the document. Taking into account the evaluation obtained for the single elements, the possible repetitions of the sequence are computed.
Zero repetitions of the sequence means that the three elements of the document are plus, and there are no common and minus elements. One repetition of the sequence means that one of the three elements is common, the other two elements are plus, and the two unmatched elements of the sequence are minus. The other repetitions of the sequence are computed in a similar way. Note that a new repetition of the sequence is considered as long as a common element remains to be matched. Therefore, in this case four repetition counts (from zero to three) are considered. The best one is then selected by means of the evaluation function.

The matching algorithm handles alternatives in a similar way. However, in this case the evaluations obtained are not summed up; rather, the best one, that is, the evaluation corresponding to the best alternative among the possible ones, is chosen.

3.5. Role and setting of parameters

The behavior and results of the matching algorithm rely on some parameters previously outlined and reported in Table 2. A user sets these parameters depending on the application domain in which the matching algorithm is used. Some examples will be shown in Section 5 when we discuss some applications of the matching algorithm.

Table 2. Parameters of the matching algorithm

Parameter    Description
$\alpha$     Weight of plus elements ($\geq 0$)
$\beta$      Weight of minus elements ($\geq 0$)
$\gamma$     Relevance factor of a level ($\gamma \in \mathbb{N}$)

Depending on the values assigned to $\alpha$ and $\beta$, the matching algorithm gives more relevance to plus elements with respect to minus elements, or vice versa. For example, if $\alpha = 0$ and $\beta = 1$, plus elements are not taken into account in measuring similarity. Therefore, a document with only extra elements with respect to the ones specified in the DTD has similarity degree equal to 1. By contrast, if $\alpha = 1$ and $\beta = 0$, the minus elements are not taken into account in the similarity measure. In the following examples we assume that $\alpha = \beta = 1$, thus giving the same relevance to plus and minus elements.

Depending on the value assigned to $\gamma \in \mathbb{N}$, the matching algorithm gives more relevance to common elements at higher levels in the document with respect to others at lower levels. By taking $\gamma = 1$ all the information is considered equally relevant, and thus the fact that elements appear at different levels in the nested structure is not taken into account. By contrast, taking $\gamma = 2$, elements at a given level have double relevance with respect to their children. In what follows, we consider $\gamma = 2$.

4. The matching algorithm

In the previous section we have outlined the behavior of the matching algorithm in the most relevant cases. In this section we point out some details of the developed algorithm.

4.1. Evaluation function

In order to obtain the best match between the two structures, common elements must be maximized, whereas plus and minus elements must be minimized. However, we want to obtain a numeric value that quantifies the similarity between the document and the DTD. Thus, we assume plus, minus, and common elements to be evaluated to three natural values $p$, $m$, $c$, taking into account the levels and the weights, as discussed in Section 3. These three values are combined through an evaluation function for determining an overall similarity evaluation. The evaluation function we choose is function $E$, formally defined in the following, which is based on the ratio model [1].
This function computes the ratio between the evaluation $c$ of the common elements between the two structures (i.e., elements in the intersection between the two structures) and the evaluation $p + m + c$ of all the elements in the two structures (i.e., elements in the union of the two structures). The evaluations of plus and minus elements are weighted according to the $\alpha$ and $\beta$ parameters. The obtained similarity value is a real number in the range [0,1].

Definition 5 (Function E). Let $(p, m, c)$ be a triple of natural numbers and $\alpha$, $\beta$ be real numbers s.t. $\alpha, \beta \geq 0$. Function $E$ is defined as

\[
E(p, m, c) =
\begin{cases}
0 & \text{if } (p, m, c) = (0, 0, 0),\\
\dfrac{c}{\alpha p + c + \beta m} & \text{otherwise.}
\end{cases}
\]

Relying on function $E$, an order relationship on triples has been defined. This order is exploited for selecting, among a set of matches (represented as $(p, m, c)$ triples), the optimal ones (i.e. the maximal triple). Details on this order can be found in [12].

Example 8. Consider the document and the DTD in Fig. 10. Since the document contains only one element, if we choose the right branch of the OR we have one common element and 39 missing elements. By contrast, if we choose the left branch of the OR, we have no common elements, but only a plus and a minus element.

Fig. 10. Tree representations of the document and DTD of Example 8.
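As an illustration of how function E ranks candidate matches, the following Python sketch evaluates the four repetition counts of Examples 6 and 7. It is a sketch under stated assumptions rather than the authors' implementation: the name evaluate is hypothetical, the triples are plain unweighted counts (the algorithm itself uses level-weighted values), and the behavior for a vanishing denominator outside the null triple is a choice made here, since Definition 5 does not cover it.

```python
def evaluate(p: float, m: float, c: float,
             alpha: float = 1.0, beta: float = 1.0) -> float:
    """Ratio-model evaluation E(p, m, c) = c / (alpha*p + c + beta*m)."""
    if alpha < 0 or beta < 0:
        raise ValueError("alpha and beta must be non-negative")
    denominator = alpha * p + c + beta * m
    # Definition 5 returns 0 for the null triple. Returning 0 whenever the
    # denominator vanishes (e.g. alpha = 0 with only plus elements) is an
    # assumption of this sketch.
    if denominator == 0:
        return 0.0
    return c / denominator


# Unweighted (p, m, c) counts for 0..3 repetitions of the sequence,
# derived from the description in Examples 6 and 7.
candidates = {0: (3, 0, 0), 1: (2, 2, 1), 2: (1, 4, 2), 3: (0, 6, 3)}

scores = {reps: evaluate(*triple) for reps, triple in candidates.items()}
best = max(scores, key=scores.get)
print(best)  # 3

# With alpha = 0, plus elements do not lower the similarity degree.
print(evaluate(p=5, m=0, c=8, alpha=0.0, beta=1.0))  # 1.0
```

Under these unweighted counts and $\alpha = \beta = 1$, three repetitions score 3/9, ahead of 2/7 for two repetitions and 1/5 for one.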

4.2. A sketch of the matching algorithm

An algorithm, named Match, that allows one to assign a $(p, m, c)$ triple to a pair of trees (document, DTD) has been defined. Such an algorithm is based on the idea of locally determining the best structure for a DTD element, for elements containing alternatives or repetitions, as soon as the information on the structure of its subelements in the document is known.

The algorithm is general enough to evaluate the similarity between any kind of XML documents and DTDs. In this paper, however, we focus on the most meaningful core of the algorithm, based on the assumption that, in the declaration of an element, two subelements with the same tag are forbidden. That is, element declarations such as <!ELEMENT a (b*, (b|c))> are not considered. Details on the general version of the algorithm can be found in [12,13].

Given a document $D$ and a DTD $T$, algorithm Match first checks whether the root labels of the two trees are equal. If not, then the two structures do not have common parts, and a null triple is returned. If the root labels are equal, the maximal level $l$ between the levels of the two structures is determined, and the recursive function $M$ is called on:
1. the root of the document,
2. the first (and only) child of the DTD,
3. the level weight ($\gamma^{l-1}$), taking into account that function $M$ is called on the second level of the DTD structure, and
4. a flag indicating that the current element (the root element) is not repeatable.

Function $M$ recursively visits the document and the DTD, at the same time, from the root to the leaves, to match common elements. Specifically, two distinct phases can be distinguished:
1. in the first phase, moving down in the trees from the roots, the parts of the trees to visit through recursive calls are determined, but no evaluation is performed;
2. when a terminal case is reached, on return from the recursive calls and going up in the trees, the various alternatives are evaluated and the best one is selected.

Intuitively, in the first phase the DTD is used as a guide to detect the common elements between the document and the DTD, disregarding the operators that bind together subelements of an element.
In the second phase, by contrast, the DTD operators are considered in order to verify which elements are bound as prescribed by the DTD, and to define an evaluation of the missing or exceeding parts of the document with respect to the DTD.

Terminal cases are the following: a leaf of the DTD is reached, or an element of the DTD not present in the document is found. In these cases a $(p, m, c)$ triple is returned. Then, the second phase starts and the evaluation of internal nodes is performed, driven by their labels.

4.3. An illustrative example

We now illustrate the behavior of function $M$ on the document and the DTD in Fig. 11. For the sake of clarity, in the discussion of the algorithm, we denote by $x_D$ the element of the document labeled by a tag $x$, and by $x_T$ the element of the DTD labeled by $x$.

During the first phase, function $M$, driven by the label of the current DTD node, is called on subtrees of the document and the DTD. For example, on the first call of $M$, on the document root and the AND node that is the only child of the DTD root, recursive calls are performed on the document root and each subtree of the AND node. Recursive calls are performed disregarding the operators in the DTD and moving down only when an element declared in the DTD is found in the document as a child of the current node. Moreover, in such cases, the level weight is divided by $\gamma$ in order to determine the level weight of the underlying level.

Fig. 11 shows the performed recursive calls. An edge $(v, v')$ of the tree is bold if a recursive call of function $M$ has been made on the subtree rooted at $v'$. Note that no recursive calls have been made on three elements of the DTD (among them $g_T$) because such elements are missing in the document. Note also that $m_D$ has not been visited by function $M$, because this element is not required in the DTD.
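The level weights used by these recursive calls, and the evaluation of a missing DTD subtree, rest on functions Level (Definition 3) and Weight (Definition 4). The following Python sketch shows one way to write them. The Node class, the function names, and the sample DTD are assumptions introduced for illustration; the sketch also assumes that a node without children is a leaf (a value or a basic type) and that no element is tagged like an operator.

```python
from dataclasses import dataclass, field
from typing import List

OPERATORS = {"?", "*", "+", "AND", "OR"}


@dataclass
class Node:
    label: str
    children: List["Node"] = field(default_factory=list)


def level(t: Node) -> int:
    """Number of nested elements; operator nodes and leaves do not count."""
    if not t.children:  # leaf: a value or a basic type
        return 0
    deepest = max(level(child) for child in t.children)
    return deepest if t.label in OPERATORS else 1 + deepest


def weight(t: Node, w: float, gamma: int = 2) -> float:
    """Weight of the simplest structure described by t, at level weight w."""
    if not t.children:
        return w
    if t.label in ("*", "?"):  # optional or repeatable from 0: not mandatory
        return 0
    if t.label == "+":  # at least one occurrence is mandatory
        return weight(t.children[0], w, gamma)
    if t.label == "AND":
        return sum(weight(child, w, gamma) for child in t.children)
    if t.label == "OR":  # keep the simplest alternative
        return min(weight(child, w, gamma) for child in t.children)
    # Element node: its own weight plus its content one level down.
    return w + sum(weight(child, w / gamma, gamma) for child in t.children)


def pcdata(tag: str) -> Node:
    """Hypothetical helper: an element whose content is a #PCDATA leaf."""
    return Node(tag, [Node("#PCDATA")])


# Hypothetical DTD: <!ELEMENT a (b, c?, (d | e))>, <!ELEMENT e (f, g)>
dtd = Node("a", [Node("AND", [
    pcdata("b"),
    Node("?", [pcdata("c")]),
    Node("OR", [pcdata("d"), Node("e", [Node("AND", [pcdata("f"), pcdata("g")])])]),
])])

print(level(dtd))                  # 3
print(weight(dtd, 2 ** level(dtd)))  # 20.0
print(weight(pcdata("b"), 4))      # 6.0
```

In this hypothetical DTD the root contributes 8, element b contributes 4 + 2, the optional c contributes nothing, and the OR node contributes the cheaper alternative d (4 + 2), for a total of 20.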

Fig. 11. Execution of function M.

When a terminal case is reached, a (p, m, c) triple is produced. For example, when function M is called on $f^D$, the triple (0, 0, 1) is generated, because the DTD requires data content for $f^D$ and this element actually has textual content. By contrast, when function M is called on a DTD element that is required but missing in the document, the triple (0, 6, 0) is generated: function Weight is called on the missing DTD element and, since the current level weight is 4, the value 6 is returned as the weight of the missing subtree.

On return from the recursive calls, the operators and the repeatability of the node are considered in order to select the best choice among the possible ones for binding together subelements. For example, returning from the evaluation of the subtrees of the OR element, which is not repeatable, the triples (0, 0, 0) and (0, 0, 6) obtained for the evaluation of the subtrees are considered. The best one is selected relying on the E evaluation function. By contrast, returning from the evaluation of the subtrees of an AND element, which is not repeatable, the obtained evaluations are summed in order to determine the evaluation of the sequence of elements.

The behavior of the algorithm is much more articulated when elements are repeatable. In such cases not a single triple but a list of triples is generated. The lists of triples are then combined in order to evaluate internal nodes.

The intermediate evaluations are reported in Fig. 11(c). If an edge is bold, the label is followed by the (p, m, c) triple obtained from the evaluation of the corresponding subtree. If an edge is not bold, but the label is followed by a (p, m, c) triple, it represents the evaluation of the minus elements of the subtree. The triple associated with the main element of the DTD (i.e.
(6, 9, 21)) is obtained by the Match algorithm by summing up the evaluation returned by function M ((0, 9, 13)), the evaluation of the plus element m ((6, 0, 0)) and the identification of the common root label ((0, 0, 8)).

Algorithm complexity

The running time of the Match algorithm depends on the running time of function M. Let M be the number of nodes of the document, N the number of nodes of the DTD, and $\Gamma$ the maximal number of edges outgoing from a node of the document; the running time of function M is $O(\Gamma^2 (N + M))$ [12]. This complexity deeply depends on the assumption we started with, that is, that in the declaration of an element two subelements with the same tag are forbidden. Relying on this assumption, in the first phase of the algorithm, recursive calls are performed only as long as common elements between the two structures are detected. In this phase no wrong matches are determined, because at most one match is possible between an element of the document and an element of the DTD. Therefore, this phase has running time linear in the number of nodes of the two structures. Then, in the second

phase, the matching algorithm evaluates the DTD operators. For each common element, this phase has running time quadratic in the number of edges outgoing from the node of the document. Combining the two results, the above complexity is obtained.

In the general version of the algorithm [12] the above assumption does not hold. In such a case, wrong matches can arise during the first phase. For example, consider an element of the document that matches n elements with the same tag in the DTD. In order to identify the best match, the second phase should be performed n times. Each time the element of the document is considered in common with one of the n elements of the DTD and, at the end, the match that maximizes the evaluation function is chosen. When the number of elements with the same tag, either in the document or in the DTD, increases, the complexity of the general version of function M changes from polynomial to exponential. We would like to remark, however, that the presence in the DTD of elements with the same tag is often due to a wrong design of the DTD. Moreover, in [12] some techniques have been proposed for reducing the execution time of the matching algorithm, even if, in the worst case, the complexity is still exponential.

Similarity measure

The similarity measure between a document and a DTD is defined as follows.

Definition 6 (Similarity measure). Let D be a document and T a DTD. The similarity measure between D and T is defined as follows: $S(D, T) = E(\mathit{Match}(D, T))$.

Example 9. Let D and T be the document and the DTD in Fig. 11(a,b). Their similarity degree is $S(D, T) = E(\mathit{Match}(D, T)) = E((6, 9, 21)) = 0.58$.

The following proposition states the relationship between the notion of validity and our similarity measure.

Proposition 1. Let D be a document, T a DTD, and $\alpha$, $\beta$ the parameters of function E. If $\alpha, \beta \neq 0$ the following properties hold:
* if D is valid with respect to T, then $S(D, T) = 1$; and
* if $S(D, T) = 1$, then D is valid with respect to T, disregarding the order of elements.

Proof (Sketch).
The first assertion follows from the fact that if a document D is valid for a DTD T, its structure is exactly one of the structures described by the DTD. Thus, the document neither contains elements not appearing in the DTD (thus, plus = 0), nor misses elements required by the DTD (thus, minus = 0). Therefore, when function E is applied, the ratio between $c$ and $\alpha \cdot 0 + \beta \cdot 0 + c$ is computed, thus obtaining 1. The second assertion holds since the similarity value can be 1 only if the two values of which we compute the ratio are equal. Since $\alpha$ and $\beta$ are not null, and the p, m, c values are natural numbers, thus non-negative, this can happen only if p = m = 0. This means that the document neither contains elements not appearing in the DTD, nor misses elements required by the DTD. Thus, according to the notion of validity, if we disregard the order of elements, the document is valid for the DTD. □

5. Applications

In this section we discuss applications of the matching algorithm we are investigating.

Classification of documents

A first application of the matching algorithm is the classification of XML documents gathered from the Web against a set of DTDs declared in an XML database. The scenario we refer to is characterized by a number of heterogeneous databases of XML documents able to exchange documents among each other. Each database stores and indexes the local documents according to a set of local DTDs. An XML document entering a database is matched, by means of the matching algorithm, against the local DTDs. If

a DTD exists to which the document conforms according to the usual notion, then the document is accepted as valid for this DTD. Otherwise, the proposed algorithm is used for selecting the DTD, among the ones in the database, that best describes the structure of the document. In this scenario, a similarity threshold should be fixed. Such threshold represents the minimal degree of similarity required for binding an XML document to a DTD. The DTD for which the similarity degree is the highest, and above the fixed threshold, is selected. Whenever the similarity degree is not above the threshold for any DTD of the database, the document is considered unclassified and stored in a repository of unclassified documents. For the retrieval, protection and indexing of such documents none of the facilities specified at DTD level can be applied.

Example 10. Consider the document D and the two DTDs $T_1$ and $T_2$ in Fig. 12. The similarity degree between D and $T_1$ is $S(D, T_1) = 0.62$, whereas the similarity degree between D and $T_2$ is $S(D, T_2) = 0.52$. Document D is more similar to DTD $T_1$ than to $T_2$, because $S(D, T_1) > S(D, T_2)$. If we set the similarity threshold to 0.6, document D is classified in $T_1$. By contrast, if we set the similarity threshold to 0.8, document D cannot be classified in $T_1$ and, thus, it is stored in the repository of unclassified documents.

Several experiments have been carried out in order to assess the similarity measure and the matching algorithm from both the correctness and the efficiency viewpoint. First, we considered both real and synthetic data and classified them against a set of DTDs in order to verify that the algorithm correctly ranks documents according to the similarity measure. In both experiments, we obtained that for each document D, and for each pair of DTDs $T_1$, $T_2$ such that D is valid neither for $T_1$ nor for $T_2$, whenever $S(D, T_1) > S(D, T_2)$, D actually is more similar to $T_1$ than to $T_2$ [12]. Then, some performance evaluations have been carried out in order to show that the matching algorithm is reasonably efficient to be used in practice.
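The numbers of Example 9 and the threshold rule of Example 10 can be reproduced with a short Python sketch. It rests on an assumption: function E is taken to be the ratio of the common component to the weighted sum of the three components, with both weights set to 1, which is consistent with the proof sketch of Proposition 1 and yields the 0.58 of Example 9. The names `add_triples`, `evaluate`, `classify`, `plus_weight` and `minus_weight` are hypothetical, and returning 0 for a null triple is a choice made only for this sketch.

```python
def add_triples(*triples):
    """Sum (p, m, c) triples component by component."""
    return tuple(sum(values) for values in zip(*triples))


def evaluate(triple, plus_weight=1.0, minus_weight=1.0):
    """Assumed form of function E: c / (plus_weight*p + minus_weight*m + c)."""
    p, m, c = triple
    denominator = plus_weight * p + minus_weight * m + c
    # A null triple (no common parts) is mapped to similarity 0 here.
    return c / denominator if denominator else 0.0


def classify(similarities, threshold):
    """Return the DTD with the highest similarity if it is above the threshold.

    `similarities` maps a DTD name to its similarity degree; None means the
    document goes to the repository of unclassified documents.
    """
    if not similarities:
        return None
    best = max(similarities, key=similarities.get)
    return best if similarities[best] > threshold else None


# Triple of the running example: result of M, plus element, common root label.
match_triple = add_triples((0, 9, 13), (6, 0, 0), (0, 0, 8))
similarity = evaluate(match_triple)

# Similarity degrees stated in Example 10.
degrees = {"T1": 0.62, "T2": 0.52}
chosen_low = classify(degrees, threshold=0.6)
chosen_high = classify(degrees, threshold=0.8)
```

With these assumptions the summed triple is (6, 9, 21), its evaluation is 21/36, and the document is bound to `T1` only under the lower threshold.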
Efficiency is a crucial issue, since similarity checks are supposed to be performed frequently and online. The execution time of the algorithm varies from a few milliseconds for simple XML documents and DTDs, to a few seconds for very large documents and DTDs (i.e. whose size is in the order of 4-5 Mbytes).

Evolution of DTD structures

After a certain number of documents have been classified, the document instances of a DTD can present some regularities that, if captured by the DTD, would restrict the divergence between the structure of documents as specified by the DTD and the actual structures of the document instances of the DTD. The goal of the evolution approach is to capture these regularities, thus adapting the set of DTDs to the set of documents. Preliminary results have been reported in [15].

Fig. 12. Classification of a document.

The data flow of the evolution approach is shown in Fig. 13, in which rectangles denote the main functional components of the approach, cylinders denote data stores, thick arrows denote the control flow, and thin arrows denote data flow. Each time a document, created outside the database, enters the database, it is initially inserted in a queue of to-be-processed documents. When it is then selected, it is associated with a DTD of the database, that is, the one best describing its structure, through the classification algorithm. If a document, matched against each DTD, does not produce a similarity value above the similarity threshold, it is inserted in the repository of unclassified documents. Otherwise, the document is handled as an instance of the DTD for which the evaluation produced the highest similarity value.

Once the classification phase is completed (i.e., the DTD of which the document is an instance has been selected), some structural information is extracted from the document. Specifically, information about frequent patterns is identified in the elements of the document that are not valid with respect to the corresponding DTD declaration. A pattern is a subset of the tags of the subelements of a non-valid element $e_{nv}$ of the document with respect to the DTD. Patterns are used for identifying groups of subelements of $e_{nv}$ frequently bound together and, thus, for extracting the new structure of the DTD declaration of $e_{nv}$. In the recording phase this information is associated with the DTD in a data structure referred to as extended DTD. The use of this information avoids analyzing the document again in the subsequent phases. Moreover, this information is structural rather than content information, and it is aggregated over the whole set of analyzed documents; thus it does not require much storage space. These activities are iterated until the evolution phase is triggered.

The evolution phase is activated after a certain number of documents have been classified. The evolution phase has a high cost in terms of rewriting the applications that are working on the database.
Therefore, it should be triggered whenever the DTD is no longer representative of its instances and such an update improves the performance of the applications that work on them. The event can depend on the access frequency to the DTD instances, on the number of non-conforming elements w.r.t. the DTD, and on the number of documents currently considered as instances of the DTD. The check component is responsible for determining whether the evolution phase should be activated.

Fig. 13. Data flow of the evolution approach (components: document queue, classification, recording, check, evolution; parameters: similarity threshold, activation threshold; stores: XML database with XML documents and DTDs, repository of unclassified documents).
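The recording and check components can be sketched in Python as follows. This is only an illustration under stated assumptions: the class name `ExtendedDTD`, its methods, the tags `r`, `x`, `y`, `z` and the activation rule based solely on the number of classified documents are hypothetical, and a pattern is simplified to the full set of subelement tags of a non-valid element. What the sketch does take from the text is that the recorded information is structural and aggregated over all analyzed documents.

```python
from collections import Counter


class ExtendedDTD:
    """Aggregated structural information attached to one DTD (sketch)."""

    def __init__(self):
        self.patterns = {}   # element tag -> Counter of tag-set patterns
        self.documents = 0   # documents classified since the last evolution

    def record(self, nonvalid_elements):
        """Record one classified document.

        `nonvalid_elements` is an iterable of (tag, child_tags) pairs, one
        for each element that is not valid for its DTD declaration.
        """
        self.documents += 1
        for tag, child_tags in nonvalid_elements:
            counter = self.patterns.setdefault(tag, Counter())
            counter[frozenset(child_tags)] += 1

    def should_evolve(self, activation_threshold):
        """Hypothetical check: evolve once enough documents were classified."""
        return self.documents >= activation_threshold


extended = ExtendedDTD()
extended.record([("r", ["x", "y"])])
extended.record([("r", ["y", "x"])])   # same pattern, different order
extended.record([("r", ["x", "z"])])
```

Storing counts per pattern rather than the documents themselves keeps the structure small, in line with the remark that the recorded information does not require much storage space.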

The evolution phase of the evolution process is responsible for generating a new set of DTDs and can work at different granularities, ranging from a very coarse granularity, regenerating the whole DTD, to a very fine granularity, regenerating the structure of a single element in the DTD. By making use of the information collected in the recording phase, some association rules are extracted that represent relationships between the presence/absence of subelements of an element. Based on such rules and on some heuristics we have identified, the new DTD is generated. Finally, after the evolution phase, the documents in the repository are classified again against the restructured set of DTDs in order to check whether the similarity is now above the threshold for a DTD of the database, so that the document can be considered an instance of such DTD.

The evolution phase is based on three key principles.

1. Use of data mining association rules [16,17] for determining the most frequent patterns in the structure of subelements of each element. For each element of the DTD, by relying on the patterns stored in the data structure, it is possible to determine elements that are always together (i.e. bound by an AND operator), elements that are never together (i.e. bound by an OR operator), elements, or groups of elements, that are repeated the same number of times (i.e. bound by a * or + operator), and elements, or groups of elements, that are optional (i.e. bound by a ? operator). The terms always, never, and same number should be considered in their statistical sense, i.e. in most cases (footnote 4). Moreover, in order to establish when the presence of an element implies the absence of another element, association rules of the kind "if one element is absent then another element is present" have been considered.

2. Incremental modification of the DTD. Approaches proposed in [18,19] for inferring the type of a set of documents consider all the documents at once. Therefore, when a new document is added to the set, in order to determine the new type, the process starts from scratch.
By contrast, in our approach we incrementally store the relevant information in the data structures and use it during the evolution process.

3. Relevance of previous instances of the DTD. Different relevance can be given to the current structure of the DTD with respect to the documents classified against it since the last DTD evolution. If the DTD was a dummy DTD generated from a training set of documents or, for the particular application area, the rule "more recent, more relevant" holds, then the DTD evolution process should forget the previous structure of the DTD and deeply modify it in order to obtain a new structure that closely represents the documents classified in the DTD since the last evolution. By contrast, if the DTD structure is consolidated, we want to minimize the DTD modifications in order to cover both the previous structure of the documents and the new structure deduced from the documents classified since the last evolution.

Example 11. Let T be the DTD in Fig. 14(a) and $D_1$ and $D_2$ be two sets of documents whose structures are reported in Fig. 14(b). The root elements of the documents in $D_1$ and $D_2$ have the same label, and all documents contain a sequence of two kinds of elements. However, in the documents in $D_1$ this sequence is followed by a sequence of elements of a third kind, whereas in the documents in $D_2$ it is followed by an e element. Documents in both sets are not valid with respect to T. Fig. 14(c) presents a sketch of the extended DTD. The root element is associated with the set of the four element tags (e among them) found in the documents classified against T. Moreover, the two elements of the initial sequence form a group, since they are repeated the same number of times, and the element that follows them in $D_1$ is marked as repeatable and optional (some documents do not contain it). Suppose that, according to the "more recent, more relevant" rule, we decide to update the DTD structure. The evolution algorithm, by means of a set of policies, determines the new structure of the DTD. We do not detail the heuristic policies developed and simply outline the behavior of the algorithm in this specific example by means of Fig. 15.
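The first principle, deriving "always together" and "never together" relations from recorded patterns, can be illustrated with a small Python sketch. It is a stand-in for the association-rule mining of [16,17], not the authors' policies: the function name `pair_relations`, the `min_support` parameter (which models the "in most cases" reading of always and never), the two ratios used as criteria, and the tags `w`, `x`, `y`, `z` are all assumptions made for this example.

```python
from itertools import combinations


def pair_relations(patterns, min_support=0.9):
    """Label pairs of tags as "AND" (together) or "OR" (complementary).

    `patterns` maps a frozenset of subelement tags to the number of
    elements in which exactly that set of tags was observed.
    """
    total = sum(patterns.values())
    tags = sorted(set().union(*patterns))
    relations = {}
    for first, second in combinations(tags, 2):
        both = sum(n for p, n in patterns.items()
                   if first in p and second in p)
        exactly_one = sum(n for p, n in patterns.items()
                          if (first in p) != (second in p))
        if both / (both + exactly_one) >= min_support:
            # Presence of one implies presence of the other, in most cases.
            relations[(first, second)] = "AND"
        elif exactly_one / total >= min_support:
            # Presence of one implies absence of the other, and vice versa.
            relations[(first, second)] = "OR"
    return relations


# Hypothetical counts shaped like Example 11: two tags always occur
# together, followed either by a third tag or by a fourth one.
observed = {
    frozenset({"x", "y", "z"}): 5,
    frozenset({"x", "y", "w"}): 5,
}
relations = pair_relations(observed)
```

On these counts the sketch binds `x` and `y` with AND and treats `w` and `z` as alternatives, which mirrors the two intermediate trees that the example builds before combining them. Repetition counts (the * and + operators) are not modeled.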

Fig. 14. (a) DTD, (b) kind of documents classified against the DTD, and (c) extended DTD.

Fig. 15. Application of the evolution algorithm.

The evolution algorithm first determines that the two elements of the group always appear together (i.e., the presence of one implies the presence of the other, and vice versa), and that they have the same number of occurrences (i.e., they form a group). Therefore, the new tree (1) in Fig. 15 is obtained. Then, the evolution algorithm determines that the remaining element and e are complementary (i.e., the presence of the former implies the absence of e, and the absence of e implies the presence of the former), and that the former is repeatable. Therefore, the new tree (2) in Fig. 15 is obtained. Trees (1) and (2) in Fig. 15 are, finally, combined together (tree (3) in Fig. 15) by means of the AND operator in order to obtain the final new DTD structure, reported as tree (4) in the figure.

Structural queries

Recent approaches to the retrieval of XML documents exploit the structure of documents for improving both accuracy and efficiency. Such queries are referred to as structural queries. Moreover, several of those approaches have the capability of returning ranked answers, in the spirit of information retrieval. Structural queries are normally expressed as labeled trees, representing either structural or content constraints on the documents which are possible answers to the query. By means of a match between the tree representation of the structural query and an XML document it is

possible to verify whether the document is an answer to the query, to compute their degree of similarity, and to extract the parts of the document that the query should return.

In our context, a structural query can be represented as a DTD in which some additional constraints on the value of data content elements have been posed (footnote 5). Therefore, a query is modeled as a labeled tree representing the structural and content constraints a document should verify in order to be considered an answer to the query. Then, the matching algorithm can be exploited for evaluating the degree of similarity between the two structures. If such degree is above a given threshold, the document is added to the set of answers for the query (query answer set). The query answer set is ranked relying on the similarity degree.

Two different interpretations can be given to a structural query expressed as a labeled tree. First, it can represent a template of the documents we are looking for. Second, it can represent the minimal constraints a document should meet in order to belong to the query answer set. According to this second interpretation, a document can contain other elements in addition to those of the query on which some conditions have been specified. With a small extension of the matching algorithm we developed (in order to handle content conditions), both interpretations are supported. The first one is obtained without any effort: it corresponds to the application of the classification approach in which some additional constraints have been added for the data content elements. By contrast, the second one is obtained by setting to 0 the parameter of function E that weights the plus component. In this way all the plus elements found in the document are not considered in the evaluation. Therefore, only elements required by the query but not present in the document are taken into account for properly evaluating the similarity degree between the document and the query.

Example 12. Consider the following query expressed through the XPath [20] notation:
/film[director = "Fellini"]
/film[date > 1974]
By exploiting the structure of the document deduced by the query formulation, the tree representation in Fig. 16 can be generated.
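The two interpretations of a structural query can be contrasted with a minimal Python sketch. It assumes that function E is the ratio of the common component to the weighted sum of the three components; the names `evaluate`, `answer_set` and `plus_weight`, the document identifiers and the (p, m, c) triples are hypothetical values chosen for illustration, not results from the paper.

```python
def evaluate(triple, plus_weight=1.0, minus_weight=1.0):
    """Assumed form of function E: c / (plus_weight*p + minus_weight*m + c)."""
    p, m, c = triple
    denominator = plus_weight * p + minus_weight * m + c
    return c / denominator if denominator else 0.0


def answer_set(triples, threshold, plus_weight=1.0):
    """Rank the documents whose similarity to the query is above the threshold.

    `triples` maps a document identifier to the (p, m, c) triple obtained by
    matching the document against the query tree.
    """
    scored = {doc: evaluate(t, plus_weight=plus_weight)
              for doc, t in triples.items()}
    ranked = sorted(scored.items(), key=lambda item: (-item[1], item[0]))
    return [(doc, score) for doc, score in ranked if score > threshold]


# Hypothetical triples: doc2 has extra (plus) elements, doc3 misses
# most of what the query requires.
triples = {"doc1": (0, 0, 10), "doc2": (6, 0, 10), "doc3": (0, 8, 2)}

as_template = answer_set(triples, threshold=0.5)
as_minimal_constraints = answer_set(triples, threshold=0.5, plus_weight=0.0)
```

Under the template reading `doc2` is penalized for its extra elements, whereas with the plus weight set to 0 it ranks as high as `doc1`; `doc3` is excluded in both cases because of its missing elements.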
Consier now the ouments in the lower sie of Fig. 16. Assuming the interprettion of query s oument templte the similrity egrees the mthing lgorithm returns re use to rnk the ouments. Some experiments hve een rrie on for testing the pproh. The otine results re similr to those for the lssifition of XML ouments ginst set of DTDs Seletive issemintion of XML ouments As the mount of XML t ville online n the numer of pervsive pplitions tht tke vntge of these t inrese, systems tht support seletive issemintion of informtion (lle SDI systems) re more n more populr [21]. A seletive issemintion system mnges user preferenes s well s strem of inoming ouments. For eh inoming oument, the system serhes for the set of user preferenes tht mth it in orer to ientify the users to whom the 5 Note tht y exploiting the DTD opertors more expressive struturl queries n e speifie thn the ones possile with urrent pprohes. The? opertor, for exmple, n e exploite for expressing optionl onitions. Fig. 16. Evlution of struturl query.


More information

Fig.25: the Role of LEX

Fig.25: the Role of LEX The Lnguge for Specifying Lexicl Anlyzer We shll now study how to uild lexicl nlyzer from specifiction of tokens in the form of list of regulr expressions The discussion centers round the design of n existing

More information

Graph theory Route problems

Graph theory Route problems Bhelors thesis Grph theory Route prolems Author: Aolphe Nikwigize Dte: 986 - -5 Sujet: Mthemtis Level: First level (Bhelor) Course oe: MAE Astrt In this thesis we will review some route prolems whih re

More information

CS453 INTRODUCTION TO DATAFLOW ANALYSIS

CS453 INTRODUCTION TO DATAFLOW ANALYSIS CS453 INTRODUCTION TO DATAFLOW ANALYSIS CS453 Leture Register llotion using liveness nlysis 1 Introdution to Dt-flow nlysis Lst Time Register llotion for expression trees nd lol nd prm vrs Tody Register

More information

Introduction to Algebra

Introduction to Algebra INTRODUCTORY ALGEBRA Mini-Leture 1.1 Introdution to Alger Evlute lgeri expressions y sustitution. Trnslte phrses to lgeri expressions. 1. Evlute the expressions when =, =, nd = 6. ) d) 5 10. Trnslte eh

More information

Lecture 8: Graph-theoretic problems (again)

Lecture 8: Graph-theoretic problems (again) COMP36111: Advned Algorithms I Leture 8: Grph-theoreti prolems (gin) In Prtt-Hrtmnn Room KB2.38: emil: iprtt@s.mn..uk 2017 18 Reding for this leture: Sipser: Chpter 7. A grph is pir G = (V, E), where V

More information

Section 2.3 Functions. Definition: Let A and B be sets. A function (mapping, map) f from A to B, denoted f :A B, is a subset of A B such that

Section 2.3 Functions. Definition: Let A and B be sets. A function (mapping, map) f from A to B, denoted f :A B, is a subset of A B such that Setion 2.3 Funtions Definition: Let n e sets. funtion (mpping, mp) f from to, enote f :, is suset of suh tht x[x y[y < x, y > f ]] n [< x, y 1 > f < x, y 2 > f ] y 1 = y 2 Note: f ssoites with eh x in

More information

Using Red-Eye to improve face detection in low quality video images

Using Red-Eye to improve face detection in low quality video images Using Re-Eye to improve fe etetion in low qulity vieo imges Rihr Youmrn Shool of Informtion Tehnology University of Ottw, Cn youmrn@site.uottw. Any Aler Shool of Informtion Tehnology University of Ottw,

More information

COMP108 Algorithmic Foundations

COMP108 Algorithmic Foundations Grph Theory Prudene Wong http://www.s.liv..uk/~pwong/tehing/omp108/201617 How to Mesure 4L? 3L 5L 3L ontiner & 5L ontiner (without mrk) infinite supply of wter You n pour wter from one ontiner to nother

More information

Asurveyofpractical algorithms for suffix tree construction in external memory

Asurveyofpractical algorithms for suffix tree construction in external memory Asurveyofprtil lgorithms for suffix tree onstrution in externl memory M. Brsky,, U. Stege n A. Thomo University of Vitori, PO Box, STN CSC Vitori, BC, VW P, Cn SUMMAY The onstrution of suffix trees in

More information

Midterm Exam CSC October 2001

Midterm Exam CSC October 2001 Midterm Exm CSC 173 23 Otoer 2001 Diretions This exm hs 8 questions, severl of whih hve suprts. Eh question indites its point vlue. The totl is 100 points. Questions 5() nd 6() re optionl; they re not

More information

Scalable Spatio-temporal Continuous Query Processing for Location-aware Services

Scalable Spatio-temporal Continuous Query Processing for Location-aware Services Slle Sptio-temporl Continuous uery Proessing for Lotion-wre Servies iopeng iong Mohme F. Mokel Wli G. Aref Susnne E. Hmrush Sunil Prhkr Deprtment of Computer Sienes, Purue University, West Lfyette, IN

More information

PROBLEM OF APOLLONIUS

PROBLEM OF APOLLONIUS PROBLEM OF APOLLONIUS In the Jnury 010 issue of Amerin Sientist D. Mkenzie isusses the Apollonin Gsket whih involves fining the rius of the lrgest irle whih just fits into the spe etween three tngent irles

More information

Cooperative Routing in Multi-Source Multi-Destination Multi-hop Wireless Networks

Cooperative Routing in Multi-Source Multi-Destination Multi-hop Wireless Networks oopertive Routing in Multi-Soure Multi-estintion Multi-hop Wireless Networks Jin Zhng Qin Zhng eprtment of omputer Siene n ngineering Hong Kong University of Siene n Tehnology, HongKong {zjzj, qinzh}@se.ust.hk

More information

Dr. D.M. Akbar Hussain

Dr. D.M. Akbar Hussain Dr. D.M. Akr Hussin Lexicl Anlysis. Bsic Ide: Red the source code nd generte tokens, it is similr wht humns will do to red in; just tking on the input nd reking it down in pieces. Ech token is sequence

More information

Lecture 13: Graphs I: Breadth First Search

Lecture 13: Graphs I: Breadth First Search Leture 13 Grphs I: BFS 6.006 Fll 2011 Leture 13: Grphs I: Bredth First Serh Leture Overview Applitions of Grph Serh Grph Representtions Bredth-First Serh Rell: Grph G = (V, E) V = set of verties (ritrry

More information

Lecture 12 : Topological Spaces

Lecture 12 : Topological Spaces Leture 12 : Topologil Spes 1 Topologil Spes Topology generlizes notion of distne nd loseness et. Definition 1.1. A topology on set X is olletion T of susets of X hving the following properties. 1. nd X

More information

6.045J/18.400J: Automata, Computability and Complexity. Quiz 2: Solutions. Please write your name in the upper corner of each page.

6.045J/18.400J: Automata, Computability and Complexity. Quiz 2: Solutions. Please write your name in the upper corner of each page. 6045J/18400J: Automt, Computbility nd Complexity Mrh 30, 2005 Quiz 2: Solutions Prof Nny Lynh Vinod Vikuntnthn Plese write your nme in the upper orner of eh pge Problem Sore 1 2 3 4 5 6 Totl Q2-1 Problem

More information

FEEDBACK: The standard error of a regression is not an unbiased estimator for the standard deviation of the error in a multiple regression model.

FEEDBACK: The standard error of a regression is not an unbiased estimator for the standard deviation of the error in a multiple regression model. Introutory Eonometris: A Moern Approh 6th Eition Woolrige Test Bnk Solutions Complete ownlo: https://testbnkre.om/ownlo/introutory-eonometris-moern-pproh-6th-eition-jeffreym-woolrige-test-bnk/ Solutions

More information

Lesson 4.4. Euler Circuits and Paths. Explore This

Lesson 4.4. Euler Circuits and Paths. Explore This Lesson 4.4 Euler Ciruits nd Pths Now tht you re fmilir with some of the onepts of grphs nd the wy grphs onvey onnetions nd reltionships, it s time to egin exploring how they n e used to model mny different

More information

Solids. Solids. Curriculum Ready.

Solids. Solids. Curriculum Ready. Curriulum Rey www.mthletis.om This ooklet is ll out ientifying, rwing n mesuring solis n prisms. SOM CUES The Som Cue ws invente y Dnish sientist who went y the nme of Piet Hein. It is simple 3 # 3 #

More information

A Tautology Checker loosely related to Stålmarck s Algorithm by Martin Richards

A Tautology Checker loosely related to Stålmarck s Algorithm by Martin Richards A Tutology Checker loosely relted to Stålmrck s Algorithm y Mrtin Richrds mr@cl.cm.c.uk http://www.cl.cm.c.uk/users/mr/ University Computer Lortory New Museum Site Pemroke Street Cmridge, CB2 3QG Mrtin

More information

CS321 Languages and Compiler Design I. Winter 2012 Lecture 5

CS321 Languages and Compiler Design I. Winter 2012 Lecture 5 CS321 Lnguges nd Compiler Design I Winter 2012 Lecture 5 1 FINITE AUTOMATA A non-deterministic finite utomton (NFA) consists of: An input lphet Σ, e.g. Σ =,. A set of sttes S, e.g. S = {1, 3, 5, 7, 11,

More information

The Network Layer: Routing in the Internet. The Network Layer: Routing & Addressing Outline

The Network Layer: Routing in the Internet. The Network Layer: Routing & Addressing Outline CPSC 852 Internetworking The Network Lyer: Routing in the Internet Mihele Weigle Deprtment of Computer Siene Clemson University mweigle@s.lemson.edu http://www.s.lemson.edu/~mweigle/ourses/ps852 1 The

More information

Rolling Back Remote Provisioning Changes. Dell Command Integration for System Center

Rolling Back Remote Provisioning Changes. Dell Command Integration for System Center Rolling Bk Remote Provisioning Chnges Dell Commn Integrtion for System Center Notes, utions, n wrnings NOTE: A NOTE inites importnt informtion tht helps you mke etter use of your prout. CAUTION: A CAUTION

More information

Structure in solution spaces: Three lessons from Jean-Claude

Structure in solution spaces: Three lessons from Jean-Claude Struture in solution spes: Three lessons from Jen-Clue Dvi Eppstein Computer Siene Deprtment, Univ. of Cliforni, Irvine Conferene on Meningfulness n Lerning Spes: A Triute to the Work of Jen-Clue Flmgne

More information

McAfee Web Gateway

McAfee Web Gateway Relese Notes Revision C MAfee We Gtewy 7.6.2.11 Contents Aout this relese Enhnement Resolved issues Instlltion instrutions Known issues Additionl informtion Find produt doumenttion Aout this relese This

More information

Width and Bounding Box of Imprecise Points

Width and Bounding Box of Imprecise Points Width nd Bounding Box of Impreise Points Vhideh Keikh Mrten Löffler Ali Mohdes Zhed Rhmti Astrt In this pper we study the following prolem: we re given set L = {l 1,..., l n } of prllel line segments,

More information

[SYLWAN., 158(6)]. ISI

[SYLWAN., 158(6)]. ISI The proposl of Improved Inext Isomorphi Grph Algorithm to Detet Design Ptterns Afnn Slem B-Brhem, M. Rizwn Jmeel Qureshi Fulty of Computing nd Informtion Tehnology, King Adulziz University, Jeddh, SAUDI

More information

Parallelization Optimization of System-Level Specification

Parallelization Optimization of System-Level Specification Prlleliztion Optimiztion of System-Level Speifition Luki i niel. Gjski enter for Emedded omputer Systems University of liforni Irvine, 92697, US {li, gjski} @es.ui.edu strt This pper introdues the prlleliztion

More information

What are suffix trees?

What are suffix trees? Suffix Trees 1 Wht re suffix trees? Allow lgorithm designers to store very lrge mount of informtion out strings while still keeping within liner spce Allow users to serch for new strings in the originl

More information

3D convex hulls. Convex Hull in 3D. convex polyhedron. convex polyhedron. The problem: Given a set P of points in 3D, compute their convex hull

3D convex hulls. Convex Hull in 3D. convex polyhedron. convex polyhedron. The problem: Given a set P of points in 3D, compute their convex hull Convex Hull in The rolem: Given set P of oints in, omute their onvex hull onvex hulls Comuttionl Geometry [si 3250] Lur Tom Bowoin College onvex olyheron 1 2 3 olygon olyheron onvex olyheron 4 5 6 Polyheron

More information

Hash-based Subgraph Query Processing Method for Graph-structured XML Documents

Hash-based Subgraph Query Processing Method for Graph-structured XML Documents Hsh-bse Subgrph Query Proessing Metho for Grph-struture XML Douments Hongzhi Wng Hrbin Institute of Teh. wngzh@hit.eu.n Jinzhong Li Hrbin Institute of Teh. lijzh@hit.eu.n Jizhou Luo Hrbin Institute of

More information

The Droplet Virtual Brush for Chinese Calligraphic Character Modeling

The Droplet Virtual Brush for Chinese Calligraphic Character Modeling The Droplet Virtul Brush for Chinese Clligrphi Chrter Moeling Xiofeng Mi Jie Xu Min Tng Jinxing Dong CAD & CG Stte Key L of Chin, Zhejing University, Hngzhou, Chin Artifiil Intelligene Institute, Zhejing

More information

Slides for Data Mining by I. H. Witten and E. Frank

Slides for Data Mining by I. H. Witten and E. Frank Slides for Dt Mining y I. H. Witten nd E. Frnk Simplicity first Simple lgorithms often work very well! There re mny kinds of simple structure, eg: One ttriute does ll the work All ttriutes contriute eqully

More information

5 ANGLES AND POLYGONS

5 ANGLES AND POLYGONS 5 GLES POLYGOS urling rige looks like onventionl rige when it is extene. However, it urls up to form n otgon to llow ots through. This Rolling rige is in Pington sin in Lonon, n urls up every Friy t miy.

More information

Towards Unifying Advances in Twig Join Algorithms

Towards Unifying Advances in Twig Join Algorithms Pro. 21st Austrlsin Dtse Conferene (ADC 2010), Brisne, Austrli Towrds Unifying Advnes in Twig Join Algorithms Nils Grimsmo Truls A. Bjørklund Deprtment of Computer nd Informtion Siene Norwegin University

More information

Robust internal multiple prediction algorithm Zhiming James Wu, Sonika, Bill Dragoset*, WesternGeco

Robust internal multiple prediction algorithm Zhiming James Wu, Sonika, Bill Dragoset*, WesternGeco Roust internl multiple preition lgorithm Zhiming Jmes Wu, Sonik, Bill Drgoset*, WesternGeo Summry Multiple ttenution is n importnt t proessing step for oth mrine n ln t. Tehniques for surfe- rpily in the

More information

COMP 423 lecture 11 Jan. 28, 2008

COMP 423 lecture 11 Jan. 28, 2008 COMP 423 lecture 11 Jn. 28, 2008 Up to now, we hve looked t how some symols in n lphet occur more frequently thn others nd how we cn sve its y using code such tht the codewords for more frequently occuring

More information

Approximate Joins for Data Centric XML

Approximate Joins for Data Centric XML Approximte Joins for Dt Centri XML Nikolus Augsten 1, Mihel Böhlen 1, Curtis Dyreson, Johnn Gmper 1 1 Fulty of Computer Siene, Free University of Bozen-Bolzno Dominiknerpltz 3, Bozen, Itly {ugsten,oehlen,gmper}@inf.uniz.it

More information

Efficient Subscription Management in Content-based Networks

Efficient Subscription Management in Content-based Networks Effiient Susription Mngement in Content-sed Networks Rphël Chnd, Psl A. Feler Institut EURECOM 06904 Sophi Antipolis, Frne {hnd feler}@eureom.fr Astrt Content-sed pulish/susrie systems offer onvenient

More information

Realising Dead Path Elimination in BPMN

Realising Dead Path Elimination in BPMN Relising De Pth Elimintion in BPMN Mtthis Weilih, Alexner Grosskopf Hsso Plttner Institute Potsm, Germny {mtthis.weilih,lexner.grosskopf}@hpi.uni-potsm.e Alistir Brros SAP Reserh Brisne, Austrli listir.rros@sp.om

More information

An Efficient Algorithm for the Physical Mapping of Clustered Task Graphs onto Multiprocessor Architectures

An Efficient Algorithm for the Physical Mapping of Clustered Task Graphs onto Multiprocessor Architectures An Effiient Algorithm for the Physil Mpping of Clustere Tsk Grphs onto Multiproessor Arhitetures Netrios Koziris Pnyiotis Tsnks Mihel Romesis George Ppkonstntinou Ntionl Tehnil University of Athens Dept.

More information

WORKSHOP 8B TENSION COUPON

WORKSHOP 8B TENSION COUPON WORKSHOP 8B TENSION COUPON WS8B-2 Workshop Ojetives Prtie reting n eiting geometry Prtie mesh seeing n iso meshing tehniques. WS8B-3 Suggeste Exerise Steps 1. Crete new tse. 2. Crete geometry moel of the

More information

Suffix trees, suffix arrays, BWT

Suffix trees, suffix arrays, BWT ALGORITHMES POUR LA BIO-INFORMATIQUE ET LA VISUALISATION COURS 3 Rluc Uricru Suffix trees, suffix rrys, BWT Bsed on: Suffix trees nd suffix rrys presenttion y Him Kpln Suffix trees course y Pco Gomez Liner-Time

More information

Declarative Routing: Extensible Routing with Declarative Queries

Declarative Routing: Extensible Routing with Declarative Queries elrtive Routing: Extensile Routing with elrtive Queries Boon Thu Loo 1 Joseph M. Hellerstein 1,2, Ion toi 1, Rghu Rmkrishnn3, 1 University of Cliforni t Berkeley, 2 Intel Reserh Berkeley, 3 University

More information

Final Exam Review F 06 M 236 Be sure to look over all of your tests, as well as over the activities you did in the activity book

Final Exam Review F 06 M 236 Be sure to look over all of your tests, as well as over the activities you did in the activity book inl xm Review 06 M 236 e sure to loo over ll of your tests, s well s over the tivities you did in the tivity oo 1 1. ind the mesures of the numered ngles nd justify your wor. Line j is prllel to line.

More information

Outline. CS38 Introduction to Algorithms. Graphs. Graphs. Graphs. Graph traversals

Outline. CS38 Introduction to Algorithms. Graphs. Graphs. Graphs. Graph traversals Outline CS38 Introution to Algorithms Leture 2 April 3, 2014 grph trversls (BFS, DFS) onnetivity topologil sort strongly onnete omponents heps n hepsort greey lgorithms April 3, 2014 CS38 Leture 2 2 Grphs

More information

SAS Event Stream Processing 5.1: Using SAS Event Stream Processing Studio

SAS Event Stream Processing 5.1: Using SAS Event Stream Processing Studio SAS Event Strem Proessing 5.1: Using SAS Event Strem Proessing Stuio Overview to SAS Event Strem Proessing Stuio Overview SAS Event Strem Proessing Stuio is we-se lient tht enles you to rete, eit, uplo,

More information

Computational geometry

Computational geometry Leture 23 Computtionl geometry Supplementl reding in CLRS: Chpter 33 exept 33.3 There re mny importnt prolems in whih the reltionships we wish to nlyze hve geometri struture. For exmple, omputtionl geometry

More information

Using SIMD Registers and Instructions to Enable Instruction-Level Parallelism in Sorting Algorithms

Using SIMD Registers and Instructions to Enable Instruction-Level Parallelism in Sorting Algorithms Using SIMD Registers n Instrutions to Enle Instrution-Level Prllelism in Sorting Algorithms Timothy Furtk furtk@s.ulert. José Nelson Amrl mrl@s.ulert. Roert Niewiomski niewio@s.ulert. Deprtment of Computing

More information

Type Checking. Roadmap (Where are we?) Last lecture Context-sensitive analysis. This lecture Type checking. Symbol tables

Type Checking. Roadmap (Where are we?) Last lecture Context-sensitive analysis. This lecture Type checking. Symbol tables Type Cheking Rodmp (Where re we?) Lst leture Contet-sensitie nlysis Motition Attriute grmmrs Ad ho Synt-direted trnsltion This leture Type heking Type systems Using synt direted trnsltion Symol tles Leil

More information

Comparison-based Choices

Comparison-based Choices Comprison-se Choies John Ugner Mngement Siene & Engineering Stnfor University Joint work with: Jon Kleinerg (Cornell) Senhil Mullinthn (Hrvr) EC 17 Boston June 28, 2017 Preiting isrete hoies Clssi prolem:

More information

Graphs with at most two trees in a forest building process

Graphs with at most two trees in a forest building process Grphs with t most two trees in forest uilding process rxiv:802.0533v [mth.co] 4 Fe 208 Steve Butler Mis Hmnk Mrie Hrdt Astrct Given grph, we cn form spnning forest y first sorting the edges in some order,

More information

Problem Final Exam Set 2 Solutions

Problem Final Exam Set 2 Solutions CSE 5 5 Algoritms nd nd Progrms Prolem Finl Exm Set Solutions Jontn Turner Exm - //05 0/8/0. (5 points) Suppose you re implementing grp lgoritm tt uses ep s one of its primry dt strutures. Te lgoritm does

More information

Kulleġġ San Ġorġ Preca Il-Liċeo tas-subien Ħamrun. Name & Surname: A) Mark the correct answer by inserting an X in the correct box. a b c d.

Kulleġġ San Ġorġ Preca Il-Liċeo tas-subien Ħamrun. Name & Surname: A) Mark the correct answer by inserting an X in the correct box. a b c d. Kulleġġ Sn Ġorġ Pre Il-Liċeo ts-suien Ħmrun Hlf Yerly Exmintion 2012 Trk 3 Form 3 INFORMATION TECHNOLOGY Time : 1hr 30 mins Nme & Surnme: Clss: A) Mrk the orret nswer y inserting n X in the orret ox. 1)

More information

A METHOD FOR CHARACTERIZATION OF THREE-PHASE UNBALANCED DIPS FROM RECORDED VOLTAGE WAVESHAPES

A METHOD FOR CHARACTERIZATION OF THREE-PHASE UNBALANCED DIPS FROM RECORDED VOLTAGE WAVESHAPES A METHOD FOR CHARACTERIZATION OF THREE-PHASE UNBALANCED DIPS FROM RECORDED OLTAGE WAESHAPES M.H.J. Bollen, L.D. Zhng Dept. Eletri Power Engineering Chlmers University of Tehnology, Gothenurg, Sweden Astrt:

More information

SOFTWARE-BUG LOCALIZATION WITH GRAPH MINING

SOFTWARE-BUG LOCALIZATION WITH GRAPH MINING Chpter 17 SOFTWARE-BUG LOCALIZATION WITH GRAPH MINING Frnk Eihinger Institute for Progrm Strutures nd Dt Orgniztion (IPD) Universit-t Krlsruhe (TH), Germny eihinger@ipd.uk.de Klemens B-ohm Institute for

More information

Lexical Analysis: Constructing a Scanner from Regular Expressions

Lexical Analysis: Constructing a Scanner from Regular Expressions Lexicl Anlysis: Constructing Scnner from Regulr Expressions Gol Show how to construct FA to recognize ny RE This Lecture Convert RE to n nondeterministic finite utomton (NFA) Use Thompson s construction

More information

LINX MATRIX SWITCHERS FIRMWARE UPDATE INSTRUCTIONS FIRMWARE VERSION

LINX MATRIX SWITCHERS FIRMWARE UPDATE INSTRUCTIONS FIRMWARE VERSION Overview LINX MATRIX SWITCHERS FIRMWARE UPDATE INSTRUCTIONS FIRMWARE VERSION 4.4.1.0 Due to the omplex nture of this updte, plese fmilirize yourself with these instrutions nd then ontt RGB Spetrum Tehnil

More information

To access your mailbox from inside your organization. For assistance, call:

To access your mailbox from inside your organization. For assistance, call: 2001 Ative Voie, In. All rights reserved. First edition 2001. Proteted y one or more of the following United Sttes ptents:,070,2;,3,90;,88,0;,33,102;,8,0;,81,0;,2,7;,1,0;,90,88;,01,11. Additionl U.S. nd

More information