Topological Queries on Graph-structured XML Data: Models and Implementations
Hongzhi Wang, Jianzhong Li, and Jizhou Luo

Abstract: In many applications, data has a graph structure, which can be naturally represented as graph-structured XML. Existing queries defined on tree-structured and graph-structured XML data mainly focus on subgraph matching, which cannot cover all the requirements of querying a graph. In this paper, a new kind of query, the topological query on graph-structured XML, is presented. This kind of query considers not only the structure of a subgraph but also the topological relationship between subgraphs. Building on existing subgraph query processing algorithms, efficient algorithms for topological query processing are designed. Experimental results show the efficiency of the implemented algorithms.

Keywords: XML, Graph Structure, Topological query.

I. INTRODUCTION

In many applications, data can be modeled as a directed graph, and graph-structured XML can be naturally used to describe graph-structured data. For example, Figure 1 shows a graph structure representing relationships between authors and academic papers.

Query processing on graph-structured data brings new challenges. Some query processing techniques have been presented, such as [6], [11], [7], [3]. However, current techniques for graph-structured data have limitations. A notable one is that current research on querying graph-structured data mainly focuses on subgraph matching without considering the topological relationship between subgraphs.

The topological relationship of subgraphs can be defined as the relationship of the subgraphs in position. For example, in Figure 2(a), subgraph (a1, b1, a2, b2) has an overlapping relationship with subgraph (a1, b1, a2, c1) because they share some nodes. In some applications of querying subgraphs on a large graph, the topological relationship can be used to describe the constraint on a subgraph more subtly. An application of topological constraints on the graph in Figure 1 is shown in Example 1.

Example 1: Retrieve the conferences sharing at least one paper with the same title and authors with a journal.
Such a query uses the topological relationship overlapping as a constraint. Processing this query on the graph in Figure 1 means finding subgraphs that contain a conference and share common parts of author information with some subgraph containing information about a journal.

Processing topological queries is also a challenging problem. Current techniques of query processing on graph databases can be classified into two kinds.

- Matching a subgraph with structural constraints on a large graph [3], [13]. Such techniques retrieve subgraphs matching a given subgraph. They can only solve a subproblem of processing a topological query. Additionally, in some cases the result of a subgraph query is very large. If such a subgraph is just a constraint, only some features of the intermediate results are useful. Therefore, these techniques are not efficient for topological query processing.
- Selecting graphs with some feature from a large set of graphs [14], [15], [8]. For a topological query, the goal is to find subgraphs of a large graph satisfying some constraint. Therefore, such techniques cannot be applied to topological query processing directly.

Current techniques thus cannot be applied directly to topological query processing. To process topological queries efficiently, we present a series of algorithms based on an interval-based labeling scheme. Even though subgraph matching cannot be avoided during the processing, we only store useful features as intermediate results and filter out as many useless nodes as possible.

Hongzhi Wang is with the School of Computer Science and Technology, Harbin Institute of Technology, Harbin, email: wangzh@hit.edu.cn
Jianzhong Li is with the School of Computer Science and Technology, Harbin Institute of Technology, Harbin, email: lijzh@hit.edu.cn
Jizhou Luo is with the School of Computer Science and Technology, Harbin Institute of Technology, Harbin, email: luojz@hit.edu.cn

The contributions of this paper can be summarized as follows:

- Topological query, a novel type of query on graphs, is presented. To our knowledge, this is the first paper to consider the topological relationship between subgraphs in query processing on graph data.
- We design a series of efficient algorithms for processing topological queries with various topological constraints. With feature extraction, massive storage of intermediate results is avoided.
- Experimental results show that our methods add only a small extra cost to subgraph query processing and have good scalability.

This paper is organized as follows. In Section II, some background knowledge is presented. In Section III, the definitions of topological query and the various topological relationships are presented. We design topological query processing algorithms in Section IV. Experimental results are shown in Section V. In Section VI, we give an overview of work related to this paper. We draw the conclusions in Section VII.

II. PRELIMINARIES

In this section, we briefly introduce the graph-structured XML model and some techniques used in this paper.

A. Data Model

With IDREF between two elements representing their reference relationship [12], XML data can be modeled as a labelled digraph: elements and attributes are mapped into nodes of the graph; directed nesting and reference relationships are mapped into edges in the graph.

Fig. 1. An Example of Graph Structure (node labels include conference, journal, name, paper, title, author, address, editor, PCmember and Chairman).

B. Reachability Labeling Scheme for Graph-structured XML Data

The goal of a labeling scheme for XML data is to represent the structural relationship so that the relationship between nodes in an XML graph can be judged quickly from their labels. With a suitable labeling scheme, structural query processing can be efficient. In this paper, we focus on the reachability labeling scheme, which is used to judge the reachability relationship between nodes in an XML graph.

We use an extended version of the reachability labeling scheme presented in [9]. In this labeling scheme, each node of a graph is assigned a postid and some intervals. For two nodes a and b in a graph, a reaches b if and only if b.postid is contained in one of the intervals associated with a, so that the reachability relationship between two nodes in the graph can be judged from their labels without other information.

The steps of encoding a graph G are:

1) A DAG D is generated from G by contracting each strongly connected component (SCC for brief) in G into a node.
2) Find an optimal tree-cover T of the DAG D generated in the first step. An id and an interval are assigned to each node in T. For a node n, its id is its postorder number. In its interval [x, y], y is the postorder number of n and x is the smallest postorder number of all of n's descendants in T. Note that during the traversal, when a node n_c contracted from an SCC is met, the postorder counter is increased by number_c, the number of nodes of this SCC in G, instead of 1. Suppose the postorder number of the node met before n_c is p_c. The id of n_c is then the series of numbers p_c + 1, p_c + 2, ..., p_c + number_c.
3) Examine all the nodes of D in reverse topological order. At each node n, copy and merge, if possible, all the intervals of its out-going nodes in D into its code.
4) All nodes in the same SCC in G are assigned the same intervals as the node n_c contracted from this strongly connected component in D. Each of the postids of n_c in D is assigned to one node in this SCC.

For example, the reachability coding of the graph in Figure 2(a) is shown in Figure 2(b). For each node in Figure 2(b), the number after the colon is its id and the series of intervals after the id is the interval set of this node. The id of c2 is contained in interval [0, 2] of a3. By such a relationship, we can judge that a3 reaches c2.

III. CONCEPTS OF TOPOLOGICAL QUERIES

In this section, we give the concept of topological query. Intuitively speaking, a topological query uses a topological relationship to describe a graph. To retrieve graphs that have some topological constraint with a graph in some schema, we present six kinds of topological relationships: connecting, connected by, disjoint, overlapping, containing and contained by. We first give the intuitions of these six relationships. Then we give formal definitions of these relationships and of the topological constraints based on them.

- A graph g1 has the connecting relationship with another graph g2 if there is a path from some node in g1 to some node in g2. For example, in Figure 2(a), subgraph (a1, b1) is connecting subgraph (a2, c2) because both a2 and b1 can reach c2.
- A graph g1 has the connected by relationship with another graph g2 if there is a path from some node in g2 to some node in g1. For example, in Figure 2(a), subgraph (a2, c2) is connected by subgraph (a1, b1).
- A graph g1 has the disjoint relationship with another graph g2 if no node is shared by g1 and g2. For example, in Figure 2(a), subgraphs (a2, c2) and (a1, b1) are disjoint since they share no node.
- A graph g1 has the overlapping relationship with another graph g2 if g1 and g2 share at least one node. For example, in Figure 2(a), subgraphs (a2, c2) and (a3, c2) are overlapping since they share node c2.
- A graph g1 has the containing relationship with another graph g2 if g2 is a subgraph of g1.
- A graph g1 has the contained by relationship with another graph g2 if g1 is a subgraph of g2.
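The six relationships described above can be sketched as small predicates over the node sets of two subgraphs, with reachability decided from interval labels as in Section II-B. This is only an illustrative sketch: the `LABELS` table is an assumption based on our reading of the coding in Figure 2(b) (including the node names a1, b1, a2, a3, c2), and all function names are hypothetical rather than taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple

Interval = Tuple[int, int]

# (id, intervals) per node. Values are assumed from the coding in Figure 2(b);
# only the nodes needed for the examples in the text are listed.
LABELS: Dict[str, Tuple[int, List[Interval]]] = {
    "a1": (8, [(0, 8)]),
    "b1": (6, [(0, 6)]),
    "a2": (7, [(0, 2), (7, 7)]),
    "a3": (9, [(1, 3), (9, 9)]),
    "c2": (2, [(2, 2)]),
}


def reaches(src: str, dst: str) -> bool:
    """src reaches dst iff dst's id lies in one of src's intervals."""
    dst_id = LABELS[dst][0]
    return any(x <= dst_id <= y for x, y in LABELS[src][1])


def connecting(g1: Sequence[str], g2: Sequence[str]) -> bool:
    # Some node of g1 reaches some node of g2.
    return any(reaches(n1, n2) for n1 in g1 for n2 in g2)


def connected_by(g1: Sequence[str], g2: Sequence[str]) -> bool:
    # Some node of g2 reaches some node of g1.
    return connecting(g2, g1)


def disjoint(g1: Sequence[str], g2: Sequence[str]) -> bool:
    return set(g1).isdisjoint(g2)


def overlapping(g1: Sequence[str], g2: Sequence[str]) -> bool:
    return not disjoint(g1, g2)


def containing(g1: Sequence[str], g2: Sequence[str]) -> bool:
    # Node-level view: every node of g2 is also a node of g1.
    return set(g2) <= set(g1)


def contained_by(g1: Sequence[str], g2: Sequence[str]) -> bool:
    return set(g1) <= set(g2)
```

With these labels, `connecting(("a1", "b1"), ("a2", "c2"))` and `overlapping(("a2", "c2"), ("a3", "c2"))` both hold, which agrees with the examples given in the text.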
Fig. 2. An Example of Reachability Labeling Scheme. (a) Example of Graph. (b) Coding of Example Graph: r1:11,[0,11]; a1:8,[0,8]; b1:6,[0,6]; a2:7,[0,2][7,7]; a3:9,[1,3][9,9]; a4:10,[4,5][10,10]; b2:0,[0,0]; c1:1,[1,1]; c2:2,[2,2]; b3:3,[3,3]; b4:4,[4,4]; c3:5,[5,5].

The graph structure of XML data can be represented as a labeled, directed graph $G = (V, E)$ with $V$ the set of nodes and $E$ the set of edges. There is a function $tag : V \to TAG$, where $TAG$ is the set of tags and $tag(v)$ is the tag of $v$.

Since a topological query is based on subgraph queries, we define the subgraph query first and then define the topological relations. Based on these concepts, the topological query is defined formally.

Definition 1 (Subgraph Query): A graph $Q = (V, E)$ is a subgraph query, where there are two functions $tag : V \to TAG$ and $rel : E \to AXIS$. $TAG$ is the set of tags and $AXIS$ is the set of node relations such as parent-child (PC for brief) or ancestor-descendant (AD for brief).

In Definition 1, the set $AXIS$ describes the relationship between two nodes. Based on the definition of XPath [4], there are 13 kinds of relationship. Here we only discuss the two axes in common use, PC and AD. In graph $G$, two nodes $n_1, n_2 \in V_G$ have the PC relationship if there is an edge $(n_1, n_2) \in E_G$. In graph $G$, two nodes $n_1, n_2 \in V_G$ have the AD relationship if there is a path from $n_1$ to $n_2$ in $G$.

Definition 2 (Result of Subgraph Query): The result of a subgraph query $Q = (V_Q, E_Q)$ over graph $G$ is

$R_{G,Q} = \{\, g \mid \exists$ an equivalent function $f_g : V_g \to V_Q$ and a total, single-valued function $p_g : V_g \to V_G$ such that $\forall n \in V_g$, $tag(n) = tag(f_g(n)) = tag(p_g(n))$; $\forall e = (n_1, n_2) \in E_g$, $p_g(n_1)$ and $p_g(n_2)$ satisfy the relationship $rel(f_g(n_1), f_g(n_2))$; and if $n_1 \neq n_2$ then $f_g(n_1) \neq f_g(n_2)$ and $p_g(n_1) \neq p_g(n_2) \,\}$.

A graph $g \in R_{G,Q}$ is said to match $Q$ in $G$. An example of a subgraph query is shown in Example 2.

Example 2: A subgraph query on the data in Figure 2(a) is shown in Figure 3(a). The direction of each edge is from top to bottom. A single line represents the PC relationship and a double line represents the AD relationship. This query represents a subgraph with an r node as the source. This node must have the PC relationship with an a node and the AD relationship with a b node. This a node and b node must both have the PC relationship with the same c node. The results of this subgraph query include (r1, a2, b1, c1), (r1, a2, b1, c2), (r1, a3, b1, c1), (r1, a3, b1, c2) and (r1, a4, b1, c3).

Fig. 3. Example Queries, panels (a) to (e).

Definition 3 (Connecting Relationship): A labeled, directed graph $g_1 \in R_{G,Q_1}$ has the Connecting relationship with $g_2 \in R_{G,Q_2}$ in graph $G$ if $\exists n_1 \in V_{g_1}, \exists n_2 \in V_{g_2}$ such that in graph $G$ there is a path from $p_{g_1}(n_1)$ to $p_{g_2}(n_2)$.

Definition 4 (Connected by Relationship): A labeled, directed graph $g_1 \in R_{G,Q_1}$ has the Connected by relationship with $g_2 \in R_{G,Q_2}$ in graph $G$ if $\exists n_1 \in V_{g_1}, \exists n_2 \in V_{g_2}$ such that in graph $G$ there is a path from $p_{g_2}(n_2)$ to $p_{g_1}(n_1)$.

Definition 5 (Disjoint Relationship): A labeled, directed graph $g_1 \in R_{G,Q_1}$ has the Disjoint relationship with $g_2 \in R_{G,Q_2}$ in graph $G$ if $\forall n_1 \in V_{g_1}, \forall n_2 \in V_{g_2}$, $p_{g_1}(n_1) \neq p_{g_2}(n_2)$.

Definition 6 (Overlapping Relationship): A labeled, directed graph $g_1 \in R_{G,Q_1}$ has the Overlapping relationship with $g_2 \in R_{G,Q_2}$ in graph $G$ if $\exists n_1 \in V_{g_1}, \exists n_2 \in V_{g_2}$, $p_{g_1}(n_1) = p_{g_2}(n_2)$.

Definition 7 (Containing Relationship): A labeled, directed graph $g_1 \in R_{G,Q_1}$ has the Containing relationship with $g_2 \in R_{G,Q_2}$ in graph $G$ if $\forall n_2 \in V_{g_2}, \exists n_1 \in V_{g_1}$, $p_{g_1}(n_1) = p_{g_2}(n_2)$.

Definition 8 (Contained by Relationship): A labeled, directed graph $g_1 \in R_{G,Q_1}$ has the Contained by relationship with $g_2 \in R_{G,Q_2}$ in graph $G$ if $\forall n_1 \in V_{g_1}, \exists n_2 \in V_{g_2}$, $p_{g_1}(n_1) = p_{g_2}(n_2)$.

Definition 9 (Topological Query): A topological query $Q$ is a triple $(G_1, G_2, relation)$, where $G_1$ and $G_2$ are subgraph queries and $relation$ is one of connecting, connected by, disjoint, overlapping, containing and contained by.

Definition 10: The result of a topological query $Q = (G_1, G_2, relationship)$ over graph $G$ is

$R_{G,Q} = \{\, g \mid g \in R_{G,G_1} \wedge \exists g_2\, (g_2 \subseteq G \wedge g_2 \in R_{G,G_2} \wedge g$ has the topological relationship $relationship$ with $g_2$ in $G) \,\}$,

where $relationship$ is some topological relationship type. Example 3 illustrates the concept of a topological query result.
Example 3: Consider a topological query Q = (G1, G2, Overlapping) with G1 in Figure 3(a) and G2 in Figure 3(b). Q retrieves the subgraphs that match G1 and share at least one node with some subgraph matching G2. The results include (r1, a2, b1, c1), (r1, a2, b1, c2), (r1, a3, b1, c1) and (r1, a3, b1, c2). (r1, a4, b1, c3) is not a result of Q, because it has no node overlapping any subgraph matching G2.

IV. EVALUATION OF TOPOLOGICAL QUERIES

In this section, we present algorithms for the evaluation of topological queries. We find that topological relationships can be classified into two classes, query-related and query-unrelated. A query-related relationship is one where the relationship between the results can be judged just from the query. Such relationships include Containing and Contained by. For the other four topological relationships, the relationship between subgraphs cannot be judged from the query directly but has to be judged from the results of the subgraph queries. A topological query with a query-related relationship is called a query-related query. Otherwise, it is called a query-unrelated query.

For a query-related query, a subgraph query G can be generated from the topological query Q, and the result of Q can be obtained directly from the result of G. To process query-unrelated queries, we design a uniform framework with different filters for different relationships.

In this section, the strategy for processing query-related queries is presented first. Then we introduce the algorithms for processing query-unrelated queries. This paper focuses on topological query processing. The processing of the subgraph queries used as sub-queries is beyond the scope of this paper. Existing subgraph query processing algorithms [3], [13] are used as a function.

A. Processing Query-Related Topological Queries

In this subsection, we present the processing strategy for query-related topological queries.

Definition 11 (Query-Related Relationship): A topological relationship r is a query-related relationship if, for a query Q = (G1, G2, r), the topological relationship r between two subgraphs matching G1 and G2 can be judged directly from the queries G1 and G2 without generating the result.

The topological relationships Containing and Contained by can be judged directly from the query. This is because, for two subgraph queries G1 and G2 on G, if G2 is a subgraph of G1, any result g1 of G1 must contain a result of G2. Otherwise, if some g1 did not contain any result of G2, there would be some node in G2 that cannot be matched by any node of g1. Then the corresponding node in G1 could not be matched by any node of g1, and g1 would not be a result of G1. Therefore, the Containing and Contained by relationships can be judged just from the query.

Based on this feature of query-related relationships, a topological query Q = (G1, G2, type) with relationship Containing can be processed with Algorithm 1.

Algorithm 1 Query_containing(G, Q)
  check whether Q.G2 is a subgraph of Q.G1
  if Q.G2 is a subgraph of Q.G1 then
    return QuerySubgraph(G, Q.G1)
  else
    return ∅

In this algorithm, if Q.G2 is a subgraph of Q.G1, any result of Q.G1 must contain a subgraph matching Q.G2. Otherwise, no result of Q.G1 contains a result matching Q.G2.

A topological query Q = (G1, G2, type) with relationship Contained by can be processed with Algorithm 2.

Algorithm 2 Query_contained(G, Q)
  check whether Q.G1 is a subgraph of Q.G2
  if Q.G1 is a subgraph of Q.G2 then
    G' = QuerySubgraph(G, Q.G2)
    for each g in G' do
      S' = subgraphs of g matching Q.G1
      S = S ∪ S'
    return S
  else
    return ∅

In Algorithm 2, checking whether Q.G1 is a subgraph of Q.G2 can use a subgraph matching algorithm; all the subgraphs matching Q.G1 in Q.G2 are found and each is given an identity. Note that one node in Q.G2 may be attached to multiple identities. The step of obtaining the subgraphs of g matching Q.G1 is different from subgraph matching. This is because, during the step of checking whether Q.G1 is a subgraph of Q.G2, the nodes of Q.G2 related to Q.G1 are identified, and during projection the nodes corresponding to the query nodes with the same identity can be extracted directly. We use Example 4 to show the processing of a Contained by topological query.

Example 4: Consider a topological query Q = (G1, G2, Contained by) on the graph in Figure 2(a), with G1 in Figure 3(d) and G2 in Figure 3(c). At first, the relationship between G1 and G2 is checked: G1 is a subgraph of G2, and there are two subgraphs matching G1 in G2. The left-top one is defined as group1 and the right-bottom one is defined as group2. After processing subgraph query G2, the result (a1, b1, a2, b2) is obtained, with (a1, b1) matching group1 and (a2, b2) matching group2. Thus, (a1, b1) and (a2, b2) are two results of Q.
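The projection step of Algorithm 2 can be illustrated with a short sketch. It assumes, purely for illustration, that each match of Q.G2 is a tuple of data nodes ordered by the query nodes of Q.G2, and that the identities found while checking "Q.G1 is a subgraph of Q.G2" are stored as a mapping from an identity name to the tuple positions it covers. The names `project_by_identity`, `query_contained_by` and `groups` are hypothetical and not from the paper.

```python
from typing import Dict, Iterable, Set, Tuple

Match = Tuple[str, ...]


def project_by_identity(match: Match, groups: Dict[str, Tuple[int, ...]]) -> Set[Match]:
    """Extract one sub-match per identity by picking the tagged positions."""
    return {tuple(match[i] for i in positions) for positions in groups.values()}


def query_contained_by(g2_matches: Iterable[Match],
                       groups: Dict[str, Tuple[int, ...]]) -> Set[Match]:
    """Union of the projections of every match of Q.G2 (loop of Algorithm 2)."""
    results: Set[Match] = set()
    for g in g2_matches:
        results |= project_by_identity(g, groups)
    return results


# Data from Example 4: one match of G2, two identities inside G2.
groups = {"group1": (0, 1), "group2": (2, 3)}
results = query_contained_by([("a1", "b1", "a2", "b2")], groups)
```

No second round of subgraph matching is needed here: the projection only reads positions that were fixed when the two queries were compared, which is the point the text makes about this step.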
Complexity Analysis: Since this paper focuses on topological query processing, during the analysis we take the subgraph query processing algorithm as an oracle and use $Cost_{M(G,q)}$ to represent the cost of processing subgraph query $q$ on graph $G$. Based on Algorithm 1, the time complexity of Containing topological query processing only depends on the time of the subgraph queries and is at most $Cost_{M(G,Q.G_1)} + Cost_{M(Q.G_1,Q.G_2)}$. Based on Algorithm 2, the time complexity of Contained by is $Cost_{M(G,Q.G_2)} + Cost_{M(Q.G_2,Q.G_1)} + O(|Q.G_1.V| \cdot result\_size)$, where $result\_size$ is the number of final results. This is because, besides the cost of graph matching, the extra cost is the generation of the subgraphs matching $Q.G_1$ from the subgraphs matching $Q.G_2$; with the identities, this cost is just the cost of generating all final results.

B. Framework of Processing Query-unrelated Topological Queries

In this subsection, we present the framework of topological query evaluation. The evaluation strategies for the four query-unrelated topological queries share the same framework. The framework is described in Algorithm 3.

Algorithm 3 Query(G, Q)
  G_filter = QuerySubgraph(G, Q.G2)
  S_filter = GenerateFilters(G_filter, Q.type)
  G_result = QuerySubgraphWithFilter(G, Q.G1, S_filter)
  return G_result

In the framework, for a graph G and a topological query Q = (G1, G2, type), G2 is processed first, and the set of filters is generated from the results of G2. Finally, subgraph query G1 is processed; during the processing of G1, the filters generated in the previous step are applied to filter out the results not satisfying the topological constraint. Note that, in order to avoid storing intermediate results, the filters can be updated as soon as one result matching G2 is generated, so that this result does not need to be stored. The benefit of sharing the same framework is that a topological query with a logical combination of multiple topological constraints can be processed with the combination of the filters.

C. Algorithms of Processing Query-unrelated Topological Queries

In this subsection, we present algorithms for processing query-unrelated topological queries based on the framework presented in Section IV-B.

Connecting. For processing a topological query with the topological relationship connecting, the reachability labeling scheme is used. As introduced in Section II-B, each node is attached to an id and a list of intervals. For a topological query Q = (G1, G2, connecting), a sorted list L is maintained as the filter. When a subgraph g2 matching G2 is generated, the id of each node of g2 is inserted into L. When a subgraph g1 matching G1 is generated, all intervals of g1's nodes are used to validate g1. If an interval [x, y] contains some id in L, some node of g1 can reach a node in some subgraph matching G2. Then g1 is a result of Q. The algorithm is sketched in Algorithm 4.

Algorithm 4 Query_connecting(G, Q)
  G_filter = QuerySubgraph(G, Q.G2)
  for each g ∈ G_filter do
    for each node n ∈ g do
      insert n.id into L
  G_pr = QuerySubgraph(G, Q.G1)
  for each graph g in G_pr do
    for each node n ∈ g do
      for each interval [x, y] in n.intervals do
        if ∃ id ∈ L, x ≤ id ≤ y then
          add g to G_result and stop the processing of the current g
  return G_result

To accelerate inserting and searching, L is implemented with a binary search tree with insert and delete cost $O(\log n)$, where $n$ is the number of nodes.

Complexity Analysis: The run time of the connecting algorithm can be estimated as follows.

$Cost = Cost_{M(G,Q.G_1)} + Cost_{M(G,Q.G_2)} + cost_{insertL} \cdot |Q.G_2.V| \cdot |result(M(G, Q.G_2))| + cost_{searchL} \cdot |result_{interval}(M(G, Q.G_1))|$,

where $|result(M(G, Q.G_2))|$ and $|result_{interval}(M(G, Q.G_1))|$ are the number of results of $M(G, Q.G_2)$ and the total number of intervals in the results of $M(G, Q.G_1)$, respectively. Since the size of L is at most $|G.V|$, $cost_{insertL}$ and $cost_{searchL}$ are both at most $O(\log |G.V|)$.

Connected by. Processing a topological query Q = (G1, G2, Connected by) uses the intervals of the nodes in all the results of G2. All the intervals are maintained in an ordered list L. When a result g of G1 is generated, L is searched with the id of each node in g to find an interval [x, y] satisfying x ≤ id ≤ y. If such an interval can be found, some node in g is connected by some node in a subgraph that is a result of G2, and g is accepted as a result. The pseudocode for processing a Connected by topological query is shown in Algorithm 5.

Algorithm 5 Query_connected(G, Q)
  G_filter = QuerySubgraph(G, Q.G2)
  for each g ∈ G_filter do
    for each node n in g do
      insert all the intervals in n.intervals into L
  G_pr = QuerySubgraph(G, Q.G1)
  for each graph g in G_pr do
    for each node n ∈ g do
      if ∃ [x, y] ∈ L, x ≤ n.id ≤ y then
        add g to G_result and stop the processing of the current g
  return G_result

For efficient inserting and searching, L is kept in order of x value and maintained with a binary search tree. In order to reduce the space cost of L, overlapping and nested intervals can be merged. Since the judgement condition is the existence of an interval containing the id, merging overlapping and nested intervals does not affect the result. The merging is performed online. When an interval [x, y] is inserted into L, the merging strategies are as follows.

- If an interval [a, b] in L is found satisfying $a \le x \wedge y \le b$, then [x, y] is not inserted.
- If an interval [a, b] in L is found satisfying $x \le a \wedge b \le y$, then [x, y] is inserted into L and [a, b] is deleted from L.
- If two intervals [a, b] and [c, d] in L are found satisfying $b \ge x - 1$ and $c \le y + 1$, then [a, b] and [c, d] are replaced by [a, d].
- If an interval [a, b] in L is found satisfying $a \le x \le b + 1 \le y$, then [a, b] is replaced by [a, y].
- If an interval [a, b] in L is found satisfying $x \le a \le y + 1 \le b$, then [a, b] is replaced by [x, b].

With these merging strategies, no intervals in L overlap or nest. When a result g of G1 is generated, L is searched with the id of each node in g to find the interval [x, y] with the largest x value among the intervals whose x value is not larger than id. If such an interval can be found and id satisfies x ≤ id ≤ y, g is considered a result. Example 5 illustrates the processing of a Connected by query.

Example 5: Consider a topological query Q = (G1, G2, Connected by), where G1 is the subgraph query in Figure 3(d) and G2 is the subgraph query in Figure 3(e). In the first step, subgraph query G2 is processed and the results (a2, c1), (a2, c2), (a3, c1), (a3, c2) and (a4, c3) are returned. Then the intervals of each node in the results are added to the filter. For a2, [0, 2] and [7, 7] are added to the filter set L. For c1 and c2, no additional interval is added to L since their intervals [1, 1] and [2, 2] are contained in an existing interval in L, [0, 2]. When a3 is considered, interval [1, 3] is merged with [0, 2] in L, and [0, 3] takes the place of [0, 2] in L. For a4, [4, 5] is merged with [0, 3], and [0, 5] takes the place of [0, 3] in L. [10, 10] also has to be merged with [9, 9]. After processing all nodes of the subgraphs matching G2, the intervals in L are [0, 5], [7, 7] and [9, 10]. During the processing of G1, when a subgraph matching G1 such as (a1, b1) is generated, L is searched for each id of its nodes. For (a1, b1), no interval in L contains any of the ids of a1 and b1, so (a1, b1) is not a result of Q. For (a2, b1), since the id of b1 is contained in the interval [0, 5], it is a result of Q.

Complexity Analysis: The run time of the Connected by algorithm can be estimated as

$Cost = Cost_{M(G,Q.G_1)} + Cost_{M(G,Q.G_2)} + cost_{insertL} \cdot |Q.G_2.V| \cdot |result_{interval}(M(G, Q.G_2))| + cost_{searchL} \cdot |Q.G_1.V| \cdot |result(M(G, Q.G_1))|$,

where $|result_{interval}(M(G, Q.G_2))|$ and $|result(M(G, Q.G_1))|$ are the total number of intervals in the results of $M(G, Q.G_2)$ and the number of results of $M(G, Q.G_1)$, respectively. With our merging strategy and the property of the labeling scheme, the number of intervals is at most $|G.V|$. Therefore, $cost_{insertL}$ and $cost_{searchL}$ are both at most $O(\log |G.V|)$.

Overlapping and Disjoint. The filter for a topological query Q = (G1, G2, type) with type overlapping or disjoint is a set S of the ids of the nodes in the subgraphs matching G2. The difference between processing these two types of topological queries is the filter condition. For a subgraph g matching G1, the filter condition of an Overlapping topological query is that, for some node n in g, an id equal to n.id can be found in S. The filter condition of a Disjoint topological query is that n.id cannot be found in S for any node n in g. The algorithm for processing overlapping is given in Algorithm 6. The algorithm for disjoint is similar, except that during the filtering step the graphs g whose node ids are all absent from S are considered final results.

Algorithm 6 Query_overlapping(G, Q)
  G_filter = QuerySubgraph(G, Q.G2)
  for each g ∈ G_filter do
    for each node n ∈ g do
      insert n.id into S
  G_pr = QuerySubgraph(G, Q.G1)
  for each graph g in G_pr do
    for each node n ∈ g do
      if n.id can be found in S then
        add g to G_result and stop the processing of the current g
  return G_result

For efficient inserting and searching, S can be implemented with a hash set.
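The three filter structures of this subsection can be sketched as follows: a sorted id list for connecting, a merged interval list for connected by, and a hash set for overlapping and disjoint. This is a minimal sketch under stated assumptions. It uses a plain Python list with `bisect` in place of the binary search tree mentioned in the text, so `insert_merged` costs linear time per insertion instead of logarithmic. Ids are assumed to be integers, which is why adjacent intervals such as [0, 3] and [4, 5] are merged. All function names are hypothetical.

```python
import bisect
from typing import Iterable, List, Set, Tuple

Interval = Tuple[int, int]


def insert_merged(intervals: List[Interval], new: Interval) -> None:
    """Insert `new` into a sorted list of disjoint intervals, merging any
    interval that overlaps it or is adjacent to it (integer ids)."""
    x, y = new
    lo = 0
    # Skip intervals that end strictly before x - 1: they cannot touch [x, y].
    while lo < len(intervals) and intervals[lo][1] < x - 1:
        lo += 1
    hi = lo
    # Absorb every interval that starts no later than y + 1.
    while hi < len(intervals) and intervals[hi][0] <= y + 1:
        x = min(x, intervals[hi][0])
        y = max(y, intervals[hi][1])
        hi += 1
    intervals[lo:hi] = [(x, y)]


def build_interval_filter(all_intervals: Iterable[Interval]) -> List[Interval]:
    merged: List[Interval] = []
    for iv in all_intervals:
        insert_merged(merged, iv)
    return merged


def covered(intervals: List[Interval], node_id: int) -> bool:
    """Connected-by test: pick the interval with the largest start <= node_id
    and check whether it contains node_id."""
    i = bisect.bisect_right(intervals, (node_id, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= node_id <= intervals[i][1]


def any_id_in(sorted_ids: List[int], interval: Interval) -> bool:
    """Connecting test: does the sorted id list hold an id inside [x, y]?"""
    x, y = interval
    i = bisect.bisect_left(sorted_ids, x)
    return i < len(sorted_ids) and sorted_ids[i] <= y


def overlapping(node_ids: Iterable[int], filter_ids: Set[int]) -> bool:
    return any(i in filter_ids for i in node_ids)


def disjoint(node_ids: Iterable[int], filter_ids: Set[int]) -> bool:
    return not overlapping(node_ids, filter_ids)


# Intervals of the nodes in the G2 results of Example 5, in insertion order.
L = build_interval_filter(
    [(0, 2), (7, 7), (1, 1), (2, 2), (1, 3), (9, 9), (4, 5), (10, 10), (5, 5)]
)
```

On the intervals of Example 5 the merged list ends up as [0, 5], [7, 7], [9, 10], the same filter the example derives by hand, and the ids 8 and 6 are not covered by it.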
Complexity Analysis: The run times of the overlapping and disjoint algorithms are similar and can be estimated as follows.

$Cost = Cost_{M(G,Q.G_1)} + Cost_{M(G,Q.G_2)} + cost_{insertS} \cdot |Q.G_2.V| \cdot |result(M(G, Q.G_2))| + cost_{searchS} \cdot |Q.G_1.V| \cdot |result(M(G, Q.G_1))|$,

where $|result(M(G, Q.G_2))|$ and $|result(M(G, Q.G_1))|$ are the number of results of $M(G, Q.G_2)$ and the number of results of $M(G, Q.G_1)$, respectively. With a hash set as the data structure, $cost_{insertS}$ and $cost_{searchS}$ are both $O(1)$.

V. EXPERIMENTAL RESULTS

In this section, we present the experimental results and the analysis of our algorithms for processing topological queries.

TABLE I
STATISTICS OF THE XMARK DATASETS

Factor   0.1     0.2     0.3     0.4     0.5
Size     11.3M   22.8M   34.0M   45.3M   56.2M
#Nodes
#Edges

A. Experimental Setup

All of our experiments were performed on a PC with a 2.4 GHz Pentium processor, 512M memory and a 40G IDE hard disk. The OS is Windows XP Professional. We implemented our system using Microsoft Visual C++. We implemented all the algorithms in this paper. We implemented the subgraph query processing algorithms presented in [13] as the subgraph query processing module and embedded the filter generation and result filtering in the result generation step of the subgraph query processing module. The subgraph query processing algorithm in [13] is a disk-based algorithm. We fix the block size at 1K and the buffer size at 32K.

The dataset we tested is the XMark benchmark dataset [10]. It can be modeled as a graph and has a fairly complicated schema, with circles as subgraphs and many nodes with multiple parents. We choose factors 0.1, 0.2, 0.3, 0.4 and 0.5 to generate XMark documents. Their parameters are shown in Table I.

For topological queries, we choose three groups of graphs, groupU1: (G_u11, G_u12), groupU2: (G_u21, G_u22) and groupU3: (G_u31, G_u32), for Connecting, Connected by, Overlapping and Disjoint, and three groups of graphs, groupR1: (G_r11, G_r12), groupR2: (G_r21, G_r22) and groupR3: (G_r31, G_r32), for Containing and Contained by. The graphs are shown in Figure 4, where a double line represents the AD relationship between nodes. G_u32 and G_r21 are the same graph.

For each group for query-unrelated queries, we design four queries on it, one for each of the four topological relationships. In each group, the former subgraph is used as G1 and the latter subgraph as G2 in the topological query Q = (G1, G2, type). For each group for query-related queries with subgraph queries G1 and G2, we design two queries, (G1, G2, Containing) and (G2, G1, Contained by). Since the case where G1 does not contain G2 can be processed just by comparing the two queries and is trivial, in each group G2 is a subgraph of G1.

Fig. 4. Query Graphs. Panels: (a) G_u11, (b) G_u12, (c) G_u21, (d) G_u22, (e) G_u31, (f) G_u32 and G_r21, (g) G_r12, (h) G_r21, (i) G_r22, (j) G_r32, (k) G_r32.

B. Comparison

In this subsection, we present the comparison between our algorithms and graph matching. For query-unrelated queries, we compare the processing time of the query-unrelated queries with the sum of the processing times of the two subgraph queries in the topological query. The results are shown in Figure 5(a). Since the three groups of subgraph queries for query-related queries all consist of one subgraph containing another, we also compare the processing time of the query-related queries with that of the larger of the two subgraph queries in each group. The results are shown in Figure 5(b). From the results, it can be seen that, compared with the time of processing the subgraph queries, the extra costs of our algorithms are very small.

Fig. 5. Experimental Results. Panels: (a) Comparison: Query-Unrelated, (b) Comparison: Query-Related, (c) Scalability: GroupU1, (d) Scalability: GroupU2, (e) Scalability: GroupU3, (f) Scalability: Query-Related.

C. Scalability

In this subsection, we present the scalability results of our algorithms. We range the XMark data from 10M to 50M. The results of query-unrelated queries are shown in Figure 5(c), Figure 5(d) and Figure 5(e). The results of query-related queries are shown in Figure 5(f). Note that the run times in Figure 5(c), Figure 5(e) and Figure 5(f) are in log scale. From the results, the run time of our algorithms increases a little faster than linearly but slower than exponentially with the data size. This is because the run time of our algorithms grows with the size of the partial results of the subgraph query in the topological constraint and with the size of the final results. The size of the partial results may increase faster than linearly because the number of nodes matching each query node in a subgraph query increases about linearly with the size of the XML document, so that in some cases the growth of the number of results may be the product of the growth factors of the nodes matching every query node.
VI. RELATED WORK

In this section, we give an overview of previous work related to this paper. Related work can be classified into two kinds. One is the model and representation of queries on graphs; the other is query processing algorithms on graphs.

Existing query languages for XML [2] are only based on the tree structure without considering graph features. Even though Lorel [1] considers IDREF, it takes the path as the basic unit instead of the graph. The disadvantage of using the path as the basic unit is that circles and topological relationships are difficult to represent. There are also several query languages designed for describing recursive relationships [5] and graph matching [6], [11]. They focus on the description of a query in the form of a labelled graph without complex structural restrictions and topological restrictions. Query languages related to RDF [7] can be used to represent a query in the form of a graph. But current query languages do not consider topological restrictions.

Currently, existing work on querying graph-structured XML mainly focuses on structural queries of subgraph/subtree matching in a graph [16], [3], [13]. None of them considers the problem of topological query processing. Another kind of work on querying graphs is to retrieve graphs satisfying some condition from a large set of graphs. Such work includes [14], [15], [8]. These works consider the structural features of graphs. However, none of them considers the topological features of graphs.

VII. CONCLUSIONS

In this paper, the topological query, a new kind of query for graph-structured XML data, is presented. This query retrieves subgraphs with some given topological relationship with some other subgraph in graph-structured data. We define six types of topological relationship between subgraphs to be used as the topological constraint. We give the definitions and evaluation algorithms of topological queries. Experimental results show that our implemented algorithms add only a small extra cost to the subgraph queries and scale with the size of the results of the subgraph queries involved.
Future work includes efficient index building and holistic algorithms to process topological queries.

REFERENCES

[1] S. Abiteboul, D. Quass, J. McHugh, J. Widom, and J. L. Wiener. The Lorel query language for semistructured data. Int. J. on Digital Libraries, 1(1):68-88.
[2] A. Bonifati and S. Ceri. Comparative analysis of five XML query languages. SIGMOD Record, 29(1):68-79.
[3] L. Chen, A. Gupta, and M. E. Kurul. Stack-based algorithms for pattern matching on DAGs. In VLDB.
[4] J. Clark and S. DeRose. XML path language (XPath). In W3C Recommendation, 16 November 1999.
[5] I. F. Cruz, A. O. Mendelzon, and P. T. Wood. A graphical query language supporting recursion. In SIGMOD Conference.
[6] R. H. Güting. GraphDB: Modeling and querying graphs in databases. In J. B. Bocca, M. Jarke, and C. Zaniolo, editors, VLDB 94, Proceedings of 20th International Conference on Very Large Data Bases, September 12-15, 1994, Santiago de Chile, Chile. Morgan Kaufmann.
[7] P. Haase, J. Broekstra, A. Eberhart, and R. Volz. A comparison of RDF query languages. In International Semantic Web Conference.
[8] H. He and A. K. Singh. Closure-tree: An index structure for graph queries. In ICDE, page 38.
[9] R. Agrawal, A. Borgida, and H. V. Jagadish. Efficient management of transitive relationships in large data and knowledge bases. In Proceedings of the 1989 ACM SIGMOD International Conference on Management of Data (SIGMOD 1989), Portland, Oregon, May.
[10] A. Schmidt, F. Waas, M. L. Kersten, M. J. Carey, I. Manolescu, and R. Busse. XMark: A benchmark for XML data management. In Proceedings of 28th International Conference on Very Large Data Bases (VLDB 2002).
[11] L. Sheng, Z. M. Özsoyoglu, and G. Özsoyoglu. A graph query language and its query processing. In Proceedings of the 15th International Conference on Data Engineering, March 1999, Sydney, Australia. IEEE Computer Society.
[12] T. Bray, J. Paoli, C. M. Sperberg-McQueen, and F. Yergeau. Extensible markup language (XML) 1.0 (third edition). In W3C Recommendation 04 February 2004.
[13] H. Wang, W. Wang, X. Lin, and J. Li. Subgraph join: Efficient processing subgraph queries on graph-structured XML document. In WAIM, pages 68-80.
[14] X. Yan, P. S. Yu, and J. Han. Graph indexing: A frequent structure-based approach. In SIGMOD Conference.
[15] X. Yan, P. S. Yu, and J. Han. Substructure similarity search in graph databases. In SIGMOD Conference.
[16] Z. Vagena, M. M. Moro, and V. J. Tsotras. Twig query processing over graph-structured XML data. In Proceedings of the Seventh International Workshop on the Web and Databases (WebDB 2004), pages 43-48.

Hongzhi Wang received his PhD in computer science from Harbin Institute of Technology. He is a lecturer in the Department of Computer Science and Technology, Harbin Institute of Technology. His research area is XML data management and information integration.

Jianzhong Li received his PhD in computer science from Harbin Institute of Technology. He is a professor in the Department of Computer Science and Technology, Harbin Institute of Technology. His research areas include parallel databases, sensor networks, data mining, data warehouses, compressed databases and XML data management.

Jizhou Luo received his PhD in computer science from Harbin Institute of Technology. He is a lecturer in the Department of Computer Science and Technology, Harbin Institute of Technology. His research area is compressed databases.