Answering Label-Constraint Reachability in Large Graphs


Kun Xu (Peking University, Beijing, China), Lei Chen (Hong Kong Univ. of Sci. & Tech., Hong Kong, China), Lei Zou (Peking University, Beijing, China), Yanghua Xiao (Fudan University, Shanghai, China), Jeffrey Xu Yu (The Chinese Univ. of Hong Kong, Hong Kong, China), Dongyan Zhao (Peking University, Beijing, China)

ABSTRACT

In this paper, we study a variant of reachability queries, called label-constraint reachability (LCR) queries. Specifically, given a label set $S$ and two vertices $u_1$ and $u_2$ in a large directed graph $G$, we verify whether there exists a path from $u_1$ to $u_2$ under label constraint $S$. Like traditional reachability queries, LCR queries are very useful, for example in pathway finding in biological networks, inferring over RDF (Resource Description Framework) graphs, and relationship finding in social networks. However, LCR queries are much more complicated than their traditional counterpart. Several techniques are proposed in this paper to minimize the search space in computing the path-label transitive closure. Furthermore, we demonstrate the superiority of our method by extensive experiments.

Categories and Subject Descriptors: H.4 [Information Systems Applications]: Miscellaneous; H.2.8 [Database Management]: Database Applications - Graph Database

General Terms: Algorithm

Corresponding author: Lei Zou, zoulei@icst.pku.edu.cn

Lei Zou and Dongyan Zhao were supported by NSFC under Grant No. and RFDP under Grant No. Jeffrey Xu Yu was supported by RGC of the Hong Kong SAR under Grant No. and Lei Chen's research is partially supported by HKUST SSRI11EG01 and NSFC No. Yanghua Xiao was supported by the NSFC under Grant No. and RFDP under Grant No.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. CIKM'11, October 24-28, 2011, Glasgow, Scotland, UK. Copyright 2011 ACM.

1. INTRODUCTION

The growing popularity of graph databases has generated many interesting data management problems. One important type of query over graphs is the reachability query [1, 2, 3, 4, 5, 6]. Specifically, given two vertices $u_1$ and $u_2$ in a directed graph $G$, we want to verify whether there exists a directed path from $u_1$ to $u_2$. Reachability queries have many applications, such as pathway finding in biological networks [7], inferring over RDF (Resource Description Framework) graphs [8], and relationship finding in social networks [9].

There are two extreme solutions for answering reachability queries. One approach is to materialize the full transitive closure of $G$, which allows reachability queries to be answered efficiently. At the other extreme, one can perform DFS (depth-first search) or BFS (breadth-first search) over graph $G$ until the target vertex is reached or the search cannot be continued. Obviously, neither method works in a large graph $G$: the former needs $O(|V|^2)$ space to store the transitive closure (large index space cost), and the latter needs $O(|V|)$ time to answer a reachability query (slow query response time). The key issue in reachability queries is how to find a good trade-off between these two basic solutions. Therefore, many algorithms have been proposed to address this issue, such as 2-hop [1, 10, 11], GRIPP [2], path-cover [], tree-cover [, ], path-tree [3] and 3-hop [4].

In many real applications, edge labels are used to denote different relationships between two vertices. For example, edge labels in RDF graphs denote different properties. We can also use edge labels to define different relationships in social networks. In this paper, we study a variant of reachability queries, called label-constraint reachability (LCR) queries, which were originally proposed in [12]. Conceptually, an LCR query verifies some specified type of relationship between two vertices. Here, we give a motivating example to demonstrate the usefulness of LCR queries. Let us consider an inference example in an RDF Schema (RDFS for short) dataset in Figure 1.
Assume that we want to verify whether freshman is a subclass of people. A traditional database system would simply answer no, since there exists no triple (freshman, rdfs:subclass, people). However, under RDFS semantics, rdfs:subclass is a transitive property. Therefore, given the two triples (freshman, rdfs:subclass, student) and (student, rdfs:subclass, people), we can infer (freshman, rdfs:subclass, people), and the correct answer should be yes. Queries of this kind are called inference queries. Obviously, it is prohibitive to materialize all inferred facts, according to RDFS reasoning rules, in a very large RDF dataset. In fact, some inference queries can be reduced to LCR queries over RDF graphs. For example, the above example can be transformed into the following LCR query: verify whether there exists a directed path from entry freshman to people in the RDF graph, where all edge labels along the path are rdfs:subclass.

[Figure 1: Inference vs. LCR Query. (a) RDF Dataset; (b) RDF Graph.]

Although LCR queries are quite useful, it is non-trivial to answer them over a large directed graph. Traditional indexing methods can only verify reachability without considering how the connection is made between two vertices [12]. Furthermore, the traditional transitive closure does not contain path labels. In order to answer LCR queries efficiently, we make the following contributions in this work:

1) For LCR queries, we propose a method to transform an edge-labeled directed graph into an augmented DAG by representing the maximal strongly connected components as bipartite graphs. Then, based on the augmented DAG, we propose a method to compute the transitive closure efficiently.

2) We redefine the distance of a path and propose a Dijkstra-like algorithm to compute the single-source path-label transitive closure. We prove the optimality of our algorithm.

3) Extensive experiments confirm that our method is faster than the existing ones by orders of magnitude. For example, given a random network satisfying the ER model with 100K vertices and 10K edges, the method in [12] consumes 77 hours for index building. However, given the same graph, our method only needs 0. hour for index building. Furthermore, our method works well on a very large RDF graph (the Yago dataset) with vertices and edges numbering in the millions and 97 edge labels.

2. BACKGROUND

2.1 Problem Definition

Definition 2.1. A directed edge-labeled graph $G$ is denoted as $G = \{V, E, \Sigma, \lambda\}$, where (1) $V$ is a set of vertices, (2) $E \subseteq V \times V$ is a set of directed edges, (3) $\Sigma$ is a set of edge labels, and (4) the labeling function $\lambda$ defines the mapping $E \to \Sigma$. Given a path $p$ from $u_1$ to $u_2$ in graph $G$, the path-label of $p$ is denoted as $L(p) = \bigcup_{e \in p} \lambda(e)$, where $\lambda(e)$ denotes $e$'s edge label.

Given graph $G$ in Figure 2, the numbers inside vertices are vertex IDs that we introduce to simplify the description of the graph, and the letters beside edges are edge labels. Considering path $p_1$ = (1, , ), the path-label of $p_1$ is $L(p_1)$ = { c}.

Definition 2.2. Given two vertices $u_1$ and $u_2$ in graph $G$ and a label constraint (set) $S = \{l_1, ..., l_n\}$, we say that $u_1$ can reach $u_2$ under label constraint $S$ (denoted as $u_1 \xrightarrow{S} u_2$) if and only if there exists a path $p$ from $u_1$ to $u_2$ and $L(p) \subseteq S$.

Definition 2.3. (Problem Definition) Given two vertices $u_1$ and $u_2$ in graph $G$ and a label set $S = \{l_1, ..., l_n\}$, a label-constraint reachability (LCR) query verifies whether $u_1$ can reach $u_2$ under the label constraint $S$, denoted as $LCR(u_1, u_2, S, G)$.

For example, given two vertices 1 and in graph $G$ in Figure 2 and label constraint $S$ = { c}, it is easy to see that 1 can reach under label constraint $S$, i.e., LCR(1, , S, G) = true, since there exists a path $p_1$ = {1, , , }, where $L(p_1) \subseteq S$. If $S$ = {c}, query LCR(1, , S, G) returns false.

[Figure 2: A Running Example. The figure annotates P(1, ), L(p_1), L(p_2), LS(1, ) and M(1, ) for the example graph.]

Definition 2.4. Given two vertices $u_1$ and $u_2$ in graph $G$, $P(u_1, u_2)$ denotes all paths from $u_1$ to $u_2$. The path-label set from $u_1$ to $u_2$ is defined as $LS(u_1, u_2) = \{L(p) \mid p \in P(u_1, u_2)\}$.

Consider two paths (1, , , , , ) and (1, , , ) in P(1, ), where L(1, , , , , ) $\subseteq$ L(1, , , ). Obviously, if path (1, , , ) can satisfy some label constraint $S$, path (1, , , , , ) can also satisfy $S$. Therefore, path (1, , , ) is redundant (Definition 2.5) for any LCR query.

Definition 2.5. Considering two paths $p$ and $p'$ from vertex $u_1$ to $u_2$, respectively, if $L(p) \subseteq L(p')$, we say $L(p)$ covers $L(p')$.
In this case, $p'$ is a redundant path, and $L(p')$ is also redundant in the path-label set $LS(u_1, u_2)$.

Definition 2.6. The minimal path-label set from $u_1$ to $u_2$ in graph $G$ is defined as $M_G(u_1, u_2)$, where 1) $M_G(u_1, u_2) \subseteq LS(u_1, u_2)$; 2) there exists no redundant path-label in $M_G(u_1, u_2)$; and 3) the path-labels in $M_G(u_1, u_2)$ cover all path-labels in $LS(u_1, u_2)$.

Definition 2.7. Given two vertices $u_1$ and $u_2$ and a label constraint (set) $S$, we say $M_G(u_1, u_2)$ covers $S$ if and only if there exists a path $p$ from $u_1$ to $u_2$, where $S \supseteq L(p)$ and $L(p) \in M_G(u_1, u_2)$.

Definition 2.8. Given a graph $G$, the path-label transitive closure is a matrix $M_G = [M_G(u_1, u_2)]_{|V(G)| \times |V(G)|}$, where $u_1, u_2 \in V(G)$, and the single-source path-label transitive closure is a vector $M_G(u, \cdot) = [M_G(u, u_i)]_{1 \times |V(G)|}$, where $u_i \in V(G)$.

When the context is clear, we also say transitive closure instead of path-label transitive closure for short. Furthermore, for ease of presentation, we borrow two operator definitions ($Prune$ and $\odot$) from reference [12]. $Prune(\cdot)$ is defined as $Prune(LS(u_1, u_2)) = M_G(u_1, u_2)$, which means removing all redundant path-labels. In Figure 2, Prune(LS(1, ) = { , c}) = $M_G$(1, ) = { }.

$\odot(\cdot)$ is defined as follows:

$\lambda(\overrightarrow{u_1 u_3}) \odot M_G(u_3, u_2) = \{\lambda(\overrightarrow{u_1 u_3}) \cup L(p_1), \lambda(\overrightarrow{u_1 u_3}) \cup L(p_2), ...\}$,

where $L(p_i) \in M_G(u_3, u_2)$ and $\lambda(\overrightarrow{u_1 u_3})$ denotes the label of edge $\overrightarrow{u_1 u_3}$. It is easy to prove that

$M_G(u_1, u_2) = Prune(\bigcup_{u_3 \in V(G)} \{\lambda(\overrightarrow{u_1 u_3}) \odot M_G(u_3, u_2)\})$.

Analogously, we also define $\lambda(\overrightarrow{u_1 u_3}) \odot M_G(u_3, \cdot)$.

An extreme approach to answering LCR queries is to materialize the transitive closure $M_G$. At run time, a query $LCR(u_1, u_2, S, G)$ can then be answered by simply checking $M_G(u_1, u_2)$. However, computing $M_G$ is much more complicated than computing the traditional transitive closure. Before the formal discussion, we introduce the following theorem, which forms the basis of our algorithm and performance analysis.

Theorem 2.1. (Apriori Property) Given one path $p$, if one of its subpaths is redundant, $p$ must be redundant.

2.2 Existing Approaches

LCR queries are proposed in [12]. Generally speaking, the method in [12] employs a spanning tree $T$ and a partial transitive closure $NT$ to compress the full transitive closure. Specifically, a spanning tree $T$ is found in the graph $G$. Based on $T$, all pairwise paths are partitioned into three categories $P_n$, $P_s$ and $P_e$. $P_n$ contains all pairwise paths whose starting and ending edges are both non-tree edges. $P_s$ (resp. $P_e$) contains all pairwise paths whose starting (resp. ending) edges are tree edges. In Figure 3, ( , , , 1) is a path in $P_n$, since , and , 1 are non-tree edges. $NT(u, v)$ contains all path-labels between $u$ and $v$ in $P_n$. We can reconstruct $M_G(u, v)$ by Equation 1. Therefore, we can reconstruct the full transitive closure from the spanning tree $T$ and the partial transitive closure $NT = \{NT(u, v), u, v \in V(G)\}$.

$M_G(u, v) = \{\{L(P_T(u, u'))\} \odot NT(u', v') \odot \{L(P_T(v', v))\} \mid u' \in Succ(u) \text{ and } v' \in Pred(v)\}$    (1)

where $u'$ is reachable from $u$ in the spanning tree $T$ and $L(P_T(u, u'))$ denotes the corresponding path-label in $T$; and $v'$ can reach $v$ in the spanning tree $T$ and $L(P_T(v', v))$ denotes the corresponding path-label in $T$.

[Figure 3: An Example of Tree-Cover.]

Obviously, different spanning trees lead to different $NT$.
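Both Equation 1 and the identity for $M_G(u_1, u_2)$ in Section 2.1 rest on the $Prune$ and $\odot$ operators. The following is a minimal Python sketch of the two operators, assuming path-labels are represented as `frozenset` objects of label strings; the function names `prune` and `odot` and the tiny example values are illustrative and do not come from the paper or from [12].

```python
from typing import FrozenSet, Iterable, Set

LabelSet = FrozenSet[str]


def prune(label_sets: Iterable[LabelSet]) -> Set[LabelSet]:
    """Keep only non-redundant path-labels.

    A path-label is redundant when another path-label is a proper subset
    of it (Definition 2.5); exact duplicates collapse because we use a set.
    """
    unique = set(label_sets)
    return {s for s in unique if not any(t < s for t in unique)}


def odot(edge_label: str, label_sets: Iterable[LabelSet]) -> Set[LabelSet]:
    """Prepend one edge: add the edge's label to every path-label."""
    return {frozenset({edge_label}) | s for s in label_sets}


# Hypothetical example: vertex u1 has an edge labeled "a" to u3, and the
# minimal path-labels from u3 to u2 are {c} and {a, b}.
via_u3 = odot("a", [frozenset({"c"}), frozenset({"a", "b"})])
# A second, direct route from u1 to u2 with path-label {a}.
minimal = prune(via_u3 | {frozenset({"a"})})
```

In this example `minimal` contains only `{a}`, because `{a}` is a subset of both `{a, c}` and `{a, b}` and therefore covers them.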
In order to minimize the size of $NT$, the authors of [12] introduce a weight $w(e)$ for each edge $e$, where $w(e)$ reflects the number of path-labels that can be removed from $NT$ if $e$ is included in the spanning tree. Therefore, they propose to use the maximal spanning tree in $G$. However, it is quite expensive to assign exact edge weights $w(e)$. Thus, they propose a sampling method. For each sampling seed (vertex), they compute the single-source transitive closure, based on which they propose some heuristic methods to define edge weights.

However, the method in [12] has two limitations. First, similar to its counterparts for traditional reachability queries [, ], a single spanning tree cannot compress the transitive closure greatly, especially in dense graphs. Consequently, $NT$ may be very large. Second, in order to find the optimal spanning tree $T$, the authors propose an algorithm to compute the single-source transitive closure for each sampling seed (vertex). However, the search space of their algorithm is not minimal, i.e., it contains a large number of redundant paths, which affects the performance greatly. These two problems (large index size and an expensive index building process) limit the scalability of the method in [12]. Computing the single-source transitive closure is also a building block in our method, and we argue that our method is optimal in terms of search space. In order to understand the superiority of our method, we analyze the algorithm of [12] in the full version of this paper [13]. Due to the space limit, the details are omitted here.

In [14], Fan et al. add regular expressions to graph reachability queries. Specifically, given two vertices $u_1$ and $u_2$, the method in [14] verifies whether there exists a directed path $P$ where all edge labels along the path satisfy the specified regular expression. Obviously, the LCR query is a special case of the problem in [14]. Fan et al. propose a bi-directional BFS algorithm at runtime. We can utilize the method in [14] for LCR queries. Given two vertices $u_1$ and $u_2$ over a large graph $G$ and a label set $S$, two sets are maintained for $u_1$ and $u_2$, respectively.
Each set records the vertices that are reachable from (resp. to) $u_1$ (resp. $u_2$) only via edges with labels in $S$. We expand the smaller set each time until either the two sets intersect or they cannot be further expanded (i.e., $u_2$ is unreachable). The problem of this method lies in its large search space. The search strategy in [14] is different from the traditional BFS algorithm, since one vertex may be visited multiple times in the bi-directional BFS algorithm [14].

3. COMPUTING TRANSITIVE CLOSURE

As mentioned earlier, compared with the traditional transitive closure, it is much more challenging to compute the path-label transitive closure (Definition 2.8). This section focuses on computing the path-label transitive closure efficiently. We first propose a Dijkstra-like algorithm to compute the single-source transitive closure efficiently (Section 3.1). Obviously, it is very expensive to iterate the single-source transitive closure computation from each vertex in $G$ to compute $M_G$. In order to address this issue, we propose an augmented DAG (DAG for short) by representing all strongly connected components as bipartite graphs. The DAG-based solution is discussed in Section 3.2.

3.1 Single-Source Transitive Closure

This subsection discusses how to compute the single-source transitive closure efficiently, since it is a building block in our DAG-based solution. As discussed in Section 2.2, the method in [12] is not optimal. The key problem is that some redundant paths are visited before their corresponding non-redundant paths. In order to address this issue, we propose a Dijkstra-like algorithm. In each step of Dijkstra's algorithm, we always access the unvisited vertex that has the minimal distance from the origin vertex. In our algorithm, we redefine distance: the distance of a path $p$ is defined as the number of distinct edge labels in $p$ instead of the sum of edge weights. Then, according to this distance definition, we adopt the Dijkstra-like algorithm to compute the single-source transitive closure. This algorithm guarantees that all redundant paths are visited after their corresponding non-redundant paths. Therefore, all redundant paths can be pruned from the search space (Theorem 2.1).

[Figure 4: Algorithm Process. For each step, the figure lists the neighbor triples held in heap H and in path set RS.]

Given graph $G$ in Figure 2, Figure 4 demonstrates how to compute $M_G(1, \cdot)$ from vertex 1 in our algorithm (i.e., Algorithm 1). Initially, we set vertex 1 as the source. All of vertex 1's neighbors are put into the heap $H$. Each neighbor is denoted as a neighbor triple $[L(p), p, d]$, where $d$ denotes the neighbor's ID, $p$ specifies one path from source $s$ to $d$, and $L(p)$ is the path-label set of $p$. All neighbor triples are ranked according to the total order defined in Definition 3.1. Since [{ }, (1, ), ] is the heap head (see Figure 4), it is moved to path set $RS$. When we move the heap head $T_1$ into path set $RS$, we check whether $T_1$ is covered (Definition 3.2) by some neighbor triple $T_2$ in $RS$ (Line 5 in Algorithm 1). If so, we ignore $T_1$ (Lines 5-6); otherwise, we insert $T_1$ into $RS$ (Lines 7-8).

Definition 3.1. Given two neighbor triples $T_1 = [L(p_1), p_1, d_1]$ and $T_2 = [L(p_2), p_2, d_2]$ in the heap $H$, $T_1 \le T_2$ if and only if 1) $|L(p_1)| < |L(p_2)|$; or 2) $|L(p_1)| = |L(p_2)|$, in which case the order of $T_1$ and $T_2$ is arbitrarily defined.

Definition 3.2. Given one neighbor triple $T_1 = [L(p_1), p_1, d_1]$, $T_1$ is redundant if and only if there exists another neighbor triple $T_2 = [L(p_2), p_2, d_2]$, where $L(p_1) \supseteq L(p_2)$ and $d_1 = d_2$. In this case, we say that $T_1$ is covered by $T_2$.

Definition 3.3. Given two neighbor triples $T_1 = [L(p_1), p_1, d_1]$ and $T_2 = [L(p_2), p_2, d_2]$, if $p_1$ is a parent path of $p_2$, we say that $T_1$ is a parent neighbor triple of $T_2$ and $T_2$ is a child neighbor triple of $T_1$.
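The Dijkstra-like search described in this subsection can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the graph is a plain adjacency dictionary, the names `single_source_closure` and `lcr` are hypothetical, ties between equally sized path-labels are broken by insertion order, and covered triples are discarded lazily when they reach the heap head rather than being deleted from the heap on insertion.

```python
import heapq
from itertools import count
from typing import Dict, FrozenSet, List, Set, Tuple

# vertex -> list of (neighbor, edge label)
Graph = Dict[int, List[Tuple[int, str]]]


def single_source_closure(graph: Graph, source: int) -> Dict[int, List[FrozenSet[str]]]:
    """For each reachable vertex d, return the minimal path-labels from source to d."""
    found: Dict[int, List[FrozenSet[str]]] = {}  # plays the role of path set RS
    heap: list = []                              # plays the role of heap H
    tie = count()                                # arbitrary order among equal sizes

    def push_children(labels: FrozenSet[str], path: Tuple[int, ...]) -> None:
        for nxt, lab in graph.get(path[-1], []):
            if nxt in path:  # the child path would be non-simple
                continue
            new_labels = labels | {lab}
            # Distance of a path = number of distinct labels on it.
            heapq.heappush(heap, (len(new_labels), next(tie), new_labels, path + (nxt,)))

    push_children(frozenset(), (source,))
    while heap:
        _, _, labels, path = heapq.heappop(heap)
        dest = path[-1]
        # Covered by an earlier triple with the same destination: redundant,
        # so neither the triple nor any of its child paths is explored.
        if any(prev <= labels for prev in found.get(dest, [])):
            continue
        found.setdefault(dest, []).append(labels)
        push_children(labels, path)
    return found


def lcr(closure: Dict[int, List[FrozenSet[str]]], target: int, constraint: Set[str]) -> bool:
    """Answer an LCR query from a single-source closure (Definition 2.7)."""
    return any(labels <= constraint for labels in closure.get(target, []))


# Hypothetical 4-vertex graph (not the graph of Figure 2).
example: Graph = {
    1: [(2, "a"), (3, "c")],
    2: [(3, "a"), (4, "b")],
    3: [(4, "a")],
    4: [],
}
closure_from_1 = single_source_closure(example, 1)
```

Because triples leave the heap in non-decreasing order of label count, any path-label that is a superset of an already accepted one for the same destination arrives later and is dropped, which is the pruning behavior the text attributes to Theorem 2.1. Note that the sketch still enumerates simple paths explicitly, so it is meant for small illustrative graphs only.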
Then, we put all child neighbor triples (Definition 3.3) of [{ }, (1, ), ] into heap $H$. Considering one neighbor of vertex , such as vertex , we put neighbor triple [{ } $\cup$ L( , ) = { }, (1, , ), ] into $H$, where (1, , ) is (1, )'s child path. Analogously, we put [{ } $\cup$ L( , ) = { c}, (1, , ), ] into $H$. When we insert some neighbor triple $T'[L(p'), p', d']$ into $H$, we first check whether $p'$ is a non-simple path. If so, we ignore $T'$ (Lines 10-11). Furthermore, we also check whether there exists another triple $T''$ that already exists in $H$ such that $T'$ is covered by $T''$, or $T'$ covers $T''$ (Lines 12-15). If $T'$ is covered by $T''$, we ignore $T'$; otherwise, $T'$ is inserted into $H$. If $T'$ covers some triple $T''$ in $H$, $T''$ is deleted from $H$.

At Step , the heap head is [{ }, (1, ), ], which is moved to path set $RS$. Iteratively, we put all child neighbor triples of [{ }, (1, ), ] into heap $H$. At Step , we find that [{ c}, (1, , ), ] is covered by [{ }, (1, , , , ), ]. Therefore, we remove [{ c}, (1, , ), ] from $H$. Figure 4 illustrates the whole process. All paths and path-labels in $RS$ are non-redundant. From $RS$, it is straightforward to obtain $M_G(1, \cdot)$. Note that our algorithm stops redundancy from propagating from a redundant path to its child paths (Theorem 2.1). For example, path (1, , , ) is pruned from the search space in our algorithm.

Algorithm 1 Single-Source Transitive Closure Computation
Require: Input: A graph $G$ and a vertex $u$ in $G$; Output: A compressed path tree C-PT(u).
1: Set $u$ as the source. Set answer set $RS = \phi$ and heap $H = \phi$.
2: Put all neighbor triples of $u$ into $H$.
3: while $H \ne \phi$ do
4:   Let $T_1 = [L(p_1), p_1, d]$ denote the head in $H$.
5:   if $T_1$ is covered by some neighbor triple $T_2$ in $RS$ then
6:     Delete $T_1$ from $H$
7:   else
8:     Move $T_1$ into $RS$
9:     for each child neighbor triple $T'[L(p'), p', d']$ of $T_1$ do
10:      if $p'$ is a non-simple path then
11:        continue
12:      if $T'$ is not covered by some neighbor triple $T''$ in $H$ then
13:        Insert $T'$ into $H$
14:      if $T'$ covers some neighbor triple $T''$ in $H$ then
15:        Delete $T''$ from $H$
16: Build C-PT(u) according to the paths in $RS$.

Theorem 3.1.
Given a vertex $u$ in graph $G$, the following statements about Algorithm 1 hold:
1. (correctness) Any non-redundant path beginning from vertex $u$ can be found in path set $RS$ in Algorithm 1.
2. (optimality) Given any redundant path $p$, if one of $p$'s parent paths is also redundant, $p$ is pruned from the search space in Algorithm 1.

3.2 Computing Transitive Closures

Given a graph $G$, a straightforward method to compute the transitive closure $M_G$ is to repeat Algorithm 1 from each vertex in $G$. Obviously, this is inefficient. Intuitively, given two adjacent vertices $u_1$ and $u_2$, most of the computations in Algorithm 1 for $u_1$ and $u_2$ are the same. Therefore, an efficient algorithm should exploit this property. Usually, a directed graph $G$ is transformed into a DAG by coalescing each strongly connected component into a single vertex in order to compute the transitive closure efficiently. However, this method cannot be used for LCR queries, since it may miss some edge labels. Instead, we propose an augmented DAG $D$ by representing all strongly connected components as bipartite graphs. Then, we can compute the single-source transitive closure $M(u, \cdot)$ according to the reverse topological order of $D$. During the computation, $M(u, \cdot)$ is always transmitted to $u$'s parent vertices in $D$. In this way, redundant computation can be avoided.

Specifically, we first identify all maximal strongly connected components in graph $G$, denoted as $C_1, ..., C_m$. For each $C_i$ ($i = 1, ..., m$), we compute the local transitive closure in $C_i$ (denoted as $M_{C_i}$) by iterating Algorithm 1 for each vertex in $C_i$. For example, given graph $G$ in Figure 5, one maximal strongly connected component $C_1$ is identified in $G$. We compute $M_{C_1}$ for $C_1$. Then, we represent each maximal strongly connected component $C_i$ as a bipartite graph $B_i = (V_{i1}, V_{i2})$, where $V_{i1}$ contains all in-portal vertices in $C_i$ and $V_{i2}$ contains all out-portal vertices in $C_i$. A vertex $u$ in $C_i$ is called an in-portal if and only if it has at least one incoming edge from vertices outside $C_i$. A vertex $u$ in $C_i$ is called an out-portal if and only if it has at least one outgoing edge to vertices outside $C_i$. If one vertex $u$ is both an in-portal and an out-portal, it has two instances $u$ and $u'$ that occur in $V_{i1}$ and $V_{i2}$, respectively. For any two vertices $u_1 \in V_{i1}$ and $u_2 \in V_{i2}$, we introduce a directed edge from $u_1$ to $u_2$, whose edge label is $M_{C_i}(u_1, u_2)$. For example, the bipartite graph $B_1$ corresponding to $C_1$ is given in Figure 5. Finally, we replace all maximal strongly connected components $C_i$ by the corresponding bipartite graphs $B_i$. In this way, we obtain graph $D$, as shown in Figure 5. We can prove that $D$ must be a DAG. In this paper, we call it the augmented DAG (DAG for short). Note that, given a vertex $u$ in $C_i$, if $u \notin D$, $u$ is called an intra vertex, such as vertices 1, and in Figure 5.

[Figure 5: Augmented DAG. The figure shows graph G with component C1, the bipartite graph B1 with its in-portals and out-portals, and the resulting DAG D.]

Algorithm 2 lists the pseudocode to compute the transitive closure for DAG $D$.

Algorithm 2 DAG-Based Compute Local Transitive Closure
Require: Input: A graph $G$ and its corresponding DAG $D$; Output: $M_G$.
1: Identify all maximal strongly connected components $C_i$.
2: Employ Algorithm 1 to compute the local transitive closure $M_{C_i}$ for each $C_i$.
3: Build DAG $D$ by replacing $C_i$ by the bipartite graphs $B_i$.
4: Perform topological sorting over $D$. Set $M_G(u, \cdot) = \{\phi, ..., \phi\}$ for each vertex $u$.
5: for each vertex $u$ according to the reverse topological order do
6:   for each child $c_i$ of $u$ do
7:     $M_G(u, \cdot) = Prune(M_G(u, \cdot) \cup Prune(\lambda(\overrightarrow{u c_i}) \odot M_G(c_i, \cdot)))$
8: for each maximal strongly connected component $C_i$ do
9:   for each intra-vertex $u \in C_i$ do
10:    $M_G(u, \cdot) = M_{C_i}(u, \cdot)$
11:    for each out-portal $u_i$ in $V_{i2}$ do
12:      $M_G(u, \cdot) = Prune(M_G(u, \cdot) \cup Prune(M_{C_i}(u, u_i) \odot M_G(u_i, \cdot)))$
13:  for each vertex $u \notin C_i$ do
14:    for each intra-vertex $u' \in C_i$ do
15:      for each in-portal $u_i$ in $V_{i1}$ do
16:        $M_G(u, u') = Prune(M_G(u, u') \cup Prune(M_G(u, u_i) \odot M_{C_i}(u_i, u')))$
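The reverse-topological propagation step of Algorithm 2 can be illustrated with a short Python sketch. It makes simplifying assumptions that go beyond the text: the input is already a DAG in which every edge carries a single label (the augmented DAG of the paper also has portal-to-portal edges labeled with whole sets $M_{C_i}(u_1, u_2)$, which are not modeled here), the direct edge from $u$ to a child is added explicitly, and the names `prune` and `dag_closure` are hypothetical. It relies on `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

LabelSet = FrozenSet[str]
# vertex -> list of (child, edge label)
Dag = Dict[int, List[Tuple[int, str]]]


def prune(label_sets: Iterable[LabelSet]) -> Set[LabelSet]:
    """Drop every path-label that has a proper subset among the others."""
    unique = set(label_sets)
    return {s for s in unique if not any(t < s for t in unique)}


def dag_closure(dag: Dag) -> Dict[int, Dict[int, Set[LabelSet]]]:
    """Compute closure[u][d] = minimal path-labels from u to d on a labeled DAG."""
    sorter = TopologicalSorter()
    for u, edges in dag.items():
        # Declaring children as "predecessors" makes static_order() emit
        # children before parents, i.e. a reverse topological order of the DAG.
        sorter.add(u, *[child for child, _ in edges])

    closure: Dict[int, Dict[int, Set[LabelSet]]] = {}
    for u in sorter.static_order():
        row: Dict[int, Set[LabelSet]] = {}
        for child, label in dag.get(u, []):
            edge = frozenset({label})
            row.setdefault(child, set()).add(edge)  # the edge u -> child itself
            # label (.) M(child, .): reuse the child's finished closure row.
            for dest, label_sets in closure[child].items():
                row.setdefault(dest, set()).update(edge | s for s in label_sets)
        closure[u] = {dest: prune(sets) for dest, sets in row.items()}
    return closure


# Hypothetical labeled DAG (not taken from the paper's figures).
example_dag: Dag = {
    1: [(2, "a"), (3, "c")],
    2: [(4, "b"), (3, "a")],
    3: [(4, "a")],
    4: [],
}
closure = dag_closure(example_dag)
```

Each vertex reuses the rows already computed for its children instead of restarting a search, which is the redundancy the DAG-based method is designed to avoid.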
First, we perform topological sorting over $D$. Initially, for all vertices $u$ in $G$, we set $M(u, \cdot) = \{\phi, ..., \phi\}$. Then, we process each vertex according to the reverse topological order. If a vertex $u$ has $n$ children $c_i$, for each $c_i$, we update $M_G(u, \cdot) = Prune(M_G(u, \cdot) \cup Prune(\lambda(\overrightarrow{u c_i}) \odot M_G(c_i, \cdot)))$ iteratively. In this way, we obtain $M_G(u, \cdot)$ for each vertex $u$ in DAG $D$.

Now, we need to consider the intra vertices in each cluster $C_i$, such as vertices and 7 in Figure 5. Given an intra vertex $u$ in $C_i$, we initialize $M_G(u, \cdot) = M_{C_i}(u, \cdot)$. Then, for each out-portal $u_i$ in $V_{i2}$, we update $M_G(u, \cdot) = Prune(M_G(u, \cdot) \cup Prune(M_{C_i}(u, u_i) \odot M_G(u_i, \cdot)))$ iteratively.

Consider any vertex $u \notin C_i$. Given an intra vertex $u'$ in $C_i$, we compute $M_G(u, u')$ as follows. Initially, we set $M_G(u, u') = \phi$. For each in-portal $u_i$ in $V_{i1}$, we update $M_G(u, u') = Prune(M_G(u, u') \cup Prune(M_G(u, u_i) \odot M_{C_i}(u_i, u')))$ iteratively.

The 2-hop labeling technique was proposed to compress the traditional transitive closure [1, 10]. We extend this labeling technique to compress the path-label transitive closure.

Definition 3.4. A 2-label-hop coding over graph $G$ assigns to each vertex $u$ ($\in V(G)$) a code $C(u) = (C_{in}(u), C_{out}(u))$, where the entries in $C_{in}(u)$ and $C_{out}(u)$ are of the form $\{(w, M_G(w, u))\}$ and $\{(w, M_G(u, w))\}$, respectively.

Definition 3.5. A 2-label-hop coding over graph $G$ is called complete if and only if Equation 2 holds.

$\forall u_1, u_2 \in V(G), \forall S \subseteq \Sigma: \; u_1 \xrightarrow{S} u_2 \Leftrightarrow \exists w, \; w \in C_{out}(u_1) \wedge w \in C_{in}(u_2) \wedge (\exists L(p_1), L(p_1) \in M_G(u_1, w) \wedge S \supseteq L(p_1)) \wedge (\exists L(p_2), L(p_2) \in M_G(w, u_2) \wedge S \supseteq L(p_2))$    (2)

where $L(p_1)$ denotes one path-label set from $u_1$ to $w$, and $L(p_2)$ denotes one path-label set from $w$ to $u_2$.

4. EXPERIMENTS

In this section, we evaluate our methods over both random networks and real datasets, and compare them with the existing solution, the sampling-tree method in [12]. Specifically, we experimentally study the performance of three approaches:
1) the sampling-tree method proposed in [12];
2) our method, which computes the path-label transitive closure by Algorithm 2 and compresses it by the 2-label-hop technique. Then, based on the 2-label-hop codes, we can answer LCR queries. This method is called the transitive closure method;
3) the bi-directional BFS proposed in [14].

Our methods are implemented using C++, and our experiments are conducted on a P .0GHz machine with G RAM running Ubuntu Linux.

4.1 Datasets

Two types of synthetic datasets are used in our experiments: the Erdos Renyi Model (ER) and the Scale-Free Model (SF). ER is a classical random graph model. It defines a random graph as $|V|$ vertices connected by $|E|$ edges, chosen randomly from the $|V|(|V| - 1)$ possible edges. In our experiments, we vary the density $|E|/|V|$ from 1. to .0, and vary $|V|$ from 1K to 10K. SF defines a random network with $|V|$ vertices satisfying a power-law distribution in vertex degrees. In our implementation, we use the graph generator gengraphwin to generate a large graph $G$ satisfying a power-law distribution. Usually, the power-law distribution parameter $\gamma$ is between 2.0 and 3.0 to simulate real complex networks [15]. Thus, the default value of parameter $\gamma$ is set to . in this work. In order to study the scalability, we also vary $|V|$ in SF networks from 1K to 10K. The number of edge labels ($|\Sigma|$) is 0. The labels are generated according to a uniform distribution.
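As an illustration of the ER dataset description above, the following Python sketch draws $|E|$ distinct directed edges uniformly from the $|V|(|V| - 1)$ possible ones and assigns each a uniformly random label. It is an assumption-laden sketch rather than the generator used in the experiments: the function name `random_labeled_er_graph`, its parameters, the integer encoding of labels, and the sizes in the usage line are all illustrative.

```python
import random
from typing import List, Optional, Tuple


def random_labeled_er_graph(
    num_vertices: int,
    num_edges: int,
    num_labels: int,
    seed: Optional[int] = None,
) -> List[Tuple[int, int, int]]:
    """Return a list of (source, target, label) triples for a random directed graph."""
    max_edges = num_vertices * (num_vertices - 1)  # no self-loops
    if num_edges > max_edges:
        raise ValueError("more edges requested than distinct vertex pairs")

    rng = random.Random(seed)
    edges = set()
    # Rejection sampling: adequate for the sparse densities discussed here,
    # slow when num_edges approaches max_edges.
    while len(edges) < num_edges:
        u = rng.randrange(num_vertices)
        v = rng.randrange(num_vertices)
        if u != v:
            edges.add((u, v))

    # Labels are drawn independently and uniformly from 0 .. num_labels - 1.
    return [(u, v, rng.randrange(num_labels)) for u, v in sorted(edges)]


# Hypothetical sizes chosen only for illustration.
sample_graph = random_labeled_er_graph(num_vertices=1000, num_edges=1500, num_labels=20, seed=7)
```

Fixing `seed` makes a generated graph reproducible, which helps when comparing index construction time across methods on the same input.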

[Table 1: Performance vs. |V| in ER Graphs (density = 1. ). Columns: |V|; 2-Label-Hop Codes (IT in sec., IS in KB, QT in ms.); Sampling-Tree method (IT in sec., IS in KB, QT in ms.); Bi-directional BFS, memory-based (QT in ms.). Table values are not recoverable.]

[Table 2: Performance vs. Graph Density in ER Graphs (|V| = 10K). Columns: density; 2-Label-Hop Codes (IT in sec., IS in MB, QT in ms.); Sampling-Tree method (IT in sec., IS in MB, QT in ms.); Bi-directional BFS, memory-based (QT in ms.). Table values are not recoverable. Note: F denotes "fail due to memory crash" or "cannot finish index building in 8 hours".]

We also employ two real graph datasets (Yeast, Small-Yago) in our experiments, which are provided by the authors of [12]. More details about the two datasets and more experiments are given in the full version of this paper [13].

4.2 Performance of Transitive Closure Method

In this section, we use Algorithm 2 to compute the transitive closure for graph $G$. Then, we use the 2-label-hop coding technique to compress the transitive closure. We report index construction time (IT), index size (IS) and average query response time (QT) for the experiments on the synthetic datasets. Note that the default query constraint size ($|S|$) is 0% = . Furthermore, we also compare our method with the sampling-tree method. In the following experiments, we always randomly generate queries to evaluate query performance. QT is reported as the average response time for one query. In these experiments, we evaluate the performance with regard to graph size, graph density and label constraint size $|S|$. Furthermore, we also test the performance of bi-directional BFS in Table 1. Since bi-directional BFS does not need offline processing, we only report its QT in the following experiments.

Exp1. Varying Graph Size ($|V|$) on ER graphs. In this experiment, we fix the density $|E|/|V|$ = 1. and label constraint size $|S|$ = , and vary $|V|$ from 1,000 to 10,000 to study the performance under varying graph sizes. Table 1 reports the detailed performance, i.e., index size (IS), index building time (IT) and average query response time (QT). From Table 1, we know that the transitive closure method is faster than the sampling-tree method in offline processing by orders of magnitude.
For example, when $|V|$ = 1K, the transitive closure method only spends 1 seconds to build the index, but the sampling-tree method needs 11 seconds. The index size of our method is much smaller than that of the sampling-tree method. Furthermore, our query performance is also better than that of the sampling-tree method. From Table 1, we know that the memory-based bi-directional BFS is also very fast for LCR queries. However, this method is not scalable with regard to graph size due to its exponential time complexity.

Exp2. Varying Density ($|E|/|V|$) on ER graphs. In this experiment, we fix $|V|$ = 10,000 and vary the density $|E|/|V|$ from to to study the performance of our method in dense graphs. From Table 2, we know that the index building time and index size increase when varying $|E|/|V|$ from to in both methods. Furthermore, the sampling-tree method cannot finish index building in 8 hours when $|E|/|V|$ . From Table 2, we know that the transitive closure method has better scalability with regard to the graph density $|E|/|V|$ than the sampling-tree method. Actually, both methods need to compute $M_G(u, \cdot)$ (i.e., the single-source transitive closure). As proven in Theorem 3.1, our method has the minimal search space, but the search space of the sampling-tree method is not minimal. Thus, the large search space limits the scalability of the sampling-tree method.

5. CONCLUSIONS

In this paper, we address label-constraint reachability (LCR) queries over large graphs. Theoretically, we propose several methods to optimize path-label transitive closure computation. We also demonstrate the superiority of our method by extensive experiments.

6. REFERENCES

[1] E. Cohen, E. Halperin, H. Kaplan, and U. Zwick, Reachability and distance queries via 2-hop labels, SIAM J. Comput., vol. 32, no. 5, 2003.
[2] S. Trißl and U. Leser, Fast and practical indexing and querying of very large graphs, in SIGMOD, 2007.
[3] R. Jin, Y. Xiang, N. Ruan, and H. Wang, Efficiently answering reachability queries on very large directed graphs, in SIGMOD, 2008.
[4] R. Jin, Y. Xiang, N. Ruan, and D. Fuhry, 3-hop: a high-compression indexing scheme for reachability query, in SIGMOD, 2009.
[5] H. V. Jagadish, A compression technique to materialize transitive closure, ACM Trans. Database Syst., vol. 15, no. 4, 1990.
[6] H. Wang, H. He, J. Yang, P. S. Yu, and J. X. Yu, Dual labeling: Answering graph reachability queries in constant time, in ICDE, 2006.
[7] S. Lu, F. Zhang, J. Chen, and S.-H. Sze, Finding pathway structures in protein interaction networks, Algorithmica, vol. 48, no. 4, 2007.
[8] J. P. McGlothlin and L. R. Khan, RDFKB: efficient support for RDF inference queries and knowledge management, in IDEAS, 2009.
[9] J. Zhu, Z. Nie, X. Liu, B. Zhang, and J.-R. Wen, StatSnowball: a statistical approach to extracting entity relationships, in WWW, 2009.
[10] J. Cheng, J. X. Yu, X. Lin, H. Wang, and P. S. Yu, Fast computing reachability labelings for large graphs with high compression rate, in EDBT, 2008.
[11] R. Bramandia, B. Choi, and W. K. Ng, Incremental maintenance of 2-hop labeling of large graphs, IEEE Trans. Knowl. Data Eng., vol. 22, no. 5, 2010.
[12] R. Jin, H. Hong, H. Wang, N. Ruan, and Y. Xiang, Computing label-constraint reachability in graph databases, in SIGMOD, 2010.
[13] L. Zou, K. Xu, J. X. Yu, L. Chen, Y. Xiao, and D. Zhao, Answering label-constraint reachability in large graphs, Peking University, Tech. Rep., 2011. [Online]. Available: lelconstrintquery.pdf
[14] W. Fan, J. Li, S. Ma, N. Tang, and Y. Wu, Adding regular expressions to graph reachability and pattern queries, in ICDE, 2011.
[15] R. Albert and A.-L. Barabási, Statistical mechanics of complex networks, Reviews of Modern Physics, vol. 74, pp. 47-97, 2002.


More information

SAPPER: Subgraph Indexing and Approximate Matching in Large Graphs

SAPPER: Subgraph Indexing and Approximate Matching in Large Graphs SAPPER: Sugrph Indexing nd Approximte Mtching in Lrge Grphs Shijie Zhng, Jiong Yng, Wei Jin EECS Dept., Cse Western Reserve University, {shijie.zhng, jiong.yng, wei.jin}@cse.edu ABSTRACT With the emergence

More information

In the last lecture, we discussed how valid tokens may be specified by regular expressions.

In the last lecture, we discussed how valid tokens may be specified by regular expressions. LECTURE 5 Scnning SYNTAX ANALYSIS We know from our previous lectures tht the process of verifying the syntx of the progrm is performed in two stges: Scnning: Identifying nd verifying tokens in progrm.

More information

Today. CS 188: Artificial Intelligence Fall Recap: Search. Example: Pancake Problem. Example: Pancake Problem. General Tree Search.

Today. CS 188: Artificial Intelligence Fall Recap: Search. Example: Pancake Problem. Example: Pancake Problem. General Tree Search. CS 88: Artificil Intelligence Fll 00 Lecture : A* Serch 9//00 A* Serch rph Serch Tody Heuristic Design Dn Klein UC Berkeley Multiple slides from Sturt Russell or Andrew Moore Recp: Serch Exmple: Pncke

More information

Premaster Course Algorithms 1 Chapter 6: Shortest Paths. Christian Scheideler SS 2018

Premaster Course Algorithms 1 Chapter 6: Shortest Paths. Christian Scheideler SS 2018 Premster Course Algorithms Chpter 6: Shortest Pths Christin Scheieler SS 8 Bsic Grph Algorithms Overview: Shortest pths in DAGs Dijkstr s lgorithm Bellmn-For lgorithm Johnson s metho SS 8 Chpter 6 Shortest

More information

Ma/CS 6b Class 1: Graph Recap

Ma/CS 6b Class 1: Graph Recap M/CS 6 Clss 1: Grph Recp By Adm Sheffer Course Detils Adm Sheffer. Office hour: Tuesdys 4pm. dmsh@cltech.edu TA: Victor Kstkin. Office hour: Tuesdys 7pm. 1:00 Mondy, Wednesdy, nd Fridy. http://www.mth.cltech.edu/~2014-15/2term/m006/

More information

Complete Coverage Path Planning of Mobile Robot Based on Dynamic Programming Algorithm Peng Zhou, Zhong-min Wang, Zhen-nan Li, Yang Li

Complete Coverage Path Planning of Mobile Robot Based on Dynamic Programming Algorithm Peng Zhou, Zhong-min Wang, Zhen-nan Li, Yang Li 2nd Interntionl Conference on Electronic & Mechnicl Engineering nd Informtion Technology (EMEIT-212) Complete Coverge Pth Plnning of Mobile Robot Bsed on Dynmic Progrmming Algorithm Peng Zhou, Zhong-min

More information

Information Retrieval and Organisation

Information Retrieval and Organisation Informtion Retrievl nd Orgnistion Suffix Trees dpted from http://www.mth.tu.c.il/~himk/seminr02/suffixtrees.ppt Dell Zhng Birkeck, University of London Trie A tree representing set of strings { } eef d

More information

What are suffix trees?

What are suffix trees? Suffix Trees 1 Wht re suffix trees? Allow lgorithm designers to store very lrge mount of informtion out strings while still keeping within liner spce Allow users to serch for new strings in the originl

More information

Solving Problems by Searching. CS 486/686: Introduction to Artificial Intelligence

Solving Problems by Searching. CS 486/686: Introduction to Artificial Intelligence Solving Prolems y Serching CS 486/686: Introduction to Artificil Intelligence 1 Introduction Serch ws one of the first topics studied in AI - Newell nd Simon (1961) Generl Prolem Solver Centrl component

More information

Solving Problems by Searching. CS 486/686: Introduction to Artificial Intelligence Winter 2016

Solving Problems by Searching. CS 486/686: Introduction to Artificial Intelligence Winter 2016 Solving Prolems y Serching CS 486/686: Introduction to Artificil Intelligence Winter 2016 1 Introduction Serch ws one of the first topics studied in AI - Newell nd Simon (1961) Generl Prolem Solver Centrl

More information

UT1553B BCRT True Dual-port Memory Interface

UT1553B BCRT True Dual-port Memory Interface UTMC APPICATION NOTE UT553B BCRT True Dul-port Memory Interfce INTRODUCTION The UTMC UT553B BCRT is monolithic CMOS integrted circuit tht provides comprehensive MI-STD- 553B Bus Controller nd Remote Terminl

More information

CS481: Bioinformatics Algorithms

CS481: Bioinformatics Algorithms CS481: Bioinformtics Algorithms Cn Alkn EA509 clkn@cs.ilkent.edu.tr http://www.cs.ilkent.edu.tr/~clkn/teching/cs481/ EXACT STRING MATCHING Fingerprint ide Assume: We cn compute fingerprint f(p) of P in

More information

Announcements. CS 188: Artificial Intelligence Fall Recap: Search. Today. General Tree Search. Uniform Cost. Lecture 3: A* Search 9/4/2007

Announcements. CS 188: Artificial Intelligence Fall Recap: Search. Today. General Tree Search. Uniform Cost. Lecture 3: A* Search 9/4/2007 CS 88: Artificil Intelligence Fll 2007 Lecture : A* Serch 9/4/2007 Dn Klein UC Berkeley Mny slides over the course dpted from either Sturt Russell or Andrew Moore Announcements Sections: New section 06:

More information

9 Graph Cutting Procedures

9 Graph Cutting Procedures 9 Grph Cutting Procedures Lst clss we begn looking t how to embed rbitrry metrics into distributions of trees, nd proved the following theorem due to Brtl (1996): Theorem 9.1 (Brtl (1996)) Given metric

More information

Topological Queries on Graph-structured XML Data: Models and Implementations

Topological Queries on Graph-structured XML Data: Models and Implementations Topologicl Queries on Grph-structured XML Dt: Models nd Implementtions Hongzhi Wng, Jinzhong Li, nd Jizhou Luo Astrct In mny pplictions, dt is in grph structure, which cn e nturlly represented s grph-structured

More information

AI Adjacent Fields. This slide deck courtesy of Dan Klein at UC Berkeley

AI Adjacent Fields. This slide deck courtesy of Dan Klein at UC Berkeley AI Adjcent Fields Philosophy: Logic, methods of resoning Mind s physicl system Foundtions of lerning, lnguge, rtionlity Mthemtics Forml representtion nd proof Algorithms, computtion, (un)decidility, (in)trctility

More information

An Algorithm for Enumerating All Maximal Tree Patterns Without Duplication Using Succinct Data Structure

An Algorithm for Enumerating All Maximal Tree Patterns Without Duplication Using Succinct Data Structure , Mrch 12-14, 2014, Hong Kong An Algorithm for Enumerting All Mximl Tree Ptterns Without Dupliction Using Succinct Dt Structure Yuko ITOKAWA, Tomoyuki UCHIDA nd Motoki SANO Astrct In order to extrct structured

More information

Fig.25: the Role of LEX

Fig.25: the Role of LEX The Lnguge for Specifying Lexicl Anlyzer We shll now study how to uild lexicl nlyzer from specifiction of tokens in the form of list of regulr expressions The discussion centers round the design of n existing

More information

A Tautology Checker loosely related to Stålmarck s Algorithm by Martin Richards

A Tautology Checker loosely related to Stålmarck s Algorithm by Martin Richards A Tutology Checker loosely relted to Stålmrck s Algorithm y Mrtin Richrds mr@cl.cm.c.uk http://www.cl.cm.c.uk/users/mr/ University Computer Lortory New Museum Site Pemroke Street Cmridge, CB2 3QG Mrtin

More information

CS321 Languages and Compiler Design I. Winter 2012 Lecture 5

CS321 Languages and Compiler Design I. Winter 2012 Lecture 5 CS321 Lnguges nd Compiler Design I Winter 2012 Lecture 5 1 FINITE AUTOMATA A non-deterministic finite utomton (NFA) consists of: An input lphet Σ, e.g. Σ =,. A set of sttes S, e.g. S = {1, 3, 5, 7, 11,

More information

Today. Search Problems. Uninformed Search Methods. Depth-First Search Breadth-First Search Uniform-Cost Search

Today. Search Problems. Uninformed Search Methods. Depth-First Search Breadth-First Search Uniform-Cost Search Uninformed Serch [These slides were creted by Dn Klein nd Pieter Abbeel for CS188 Intro to AI t UC Berkeley. All CS188 mterils re vilble t http://i.berkeley.edu.] Tody Serch Problems Uninformed Serch Methods

More information

Before We Begin. Introduction to Spatial Domain Filtering. Introduction to Digital Image Processing. Overview (1): Administrative Details (1):

Before We Begin. Introduction to Spatial Domain Filtering. Introduction to Digital Image Processing. Overview (1): Administrative Details (1): Overview (): Before We Begin Administrtive detils Review some questions to consider Winter 2006 Imge Enhncement in the Sptil Domin: Bsics of Sptil Filtering, Smoothing Sptil Filters, Order Sttistics Filters

More information

Efficient K-NN Search in Polyphonic Music Databases Using a Lower Bounding Mechanism

Efficient K-NN Search in Polyphonic Music Databases Using a Lower Bounding Mechanism Efficient K-NN Serch in Polyphonic Music Dtses Using Lower Bounding Mechnism Ning-Hn Liu Deprtment of Computer Science Ntionl Tsing Hu University Hsinchu,Tiwn 300, R.O.C 886-3-575679 nhliou@yhoo.com.tw

More information

Unit #9 : Definite Integral Properties, Fundamental Theorem of Calculus

Unit #9 : Definite Integral Properties, Fundamental Theorem of Calculus Unit #9 : Definite Integrl Properties, Fundmentl Theorem of Clculus Gols: Identify properties of definite integrls Define odd nd even functions, nd reltionship to integrl vlues Introduce the Fundmentl

More information

COMPUTER EDUCATION TECHNIQUES, INC. (MS_W2K3_SERVER ) SA:

COMPUTER EDUCATION TECHNIQUES, INC. (MS_W2K3_SERVER ) SA: In order to lern which questions hve een nswered correctly: 1. Print these pges. 2. Answer the questions. 3. Send this ssessment with the nswers vi:. FAX to (212) 967-3498. Or. Mil the nswers to the following

More information

10.5 Graphing Quadratic Functions

10.5 Graphing Quadratic Functions 0.5 Grphing Qudrtic Functions Now tht we cn solve qudrtic equtions, we wnt to lern how to grph the function ssocited with the qudrtic eqution. We cll this the qudrtic function. Grphs of Qudrtic Functions

More information

Ma/CS 6b Class 1: Graph Recap

Ma/CS 6b Class 1: Graph Recap M/CS 6 Clss 1: Grph Recp By Adm Sheffer Course Detils Instructor: Adm Sheffer. TA: Cosmin Pohot. 1pm Mondys, Wednesdys, nd Fridys. http://mth.cltech.edu/~2015-16/2term/m006/ Min ook: Introduction to Grph

More information

From Dependencies to Evaluation Strategies

From Dependencies to Evaluation Strategies From Dependencies to Evlution Strtegies Possile strtegies: 1 let the user define the evlution order 2 utomtic strtegy sed on the dependencies: use locl dependencies to determine which ttriutes to compute

More information

Network Interconnection: Bridging CS 571 Fall Kenneth L. Calvert All rights reserved

Network Interconnection: Bridging CS 571 Fall Kenneth L. Calvert All rights reserved Network Interconnection: Bridging CS 57 Fll 6 6 Kenneth L. Clvert All rights reserved The Prolem We know how to uild (rodcst) LANs Wnt to connect severl LANs together to overcome scling limits Recll: speed

More information

Announcements. CS 188: Artificial Intelligence Fall Recap: Search. Today. Example: Pancake Problem. Example: Pancake Problem

Announcements. CS 188: Artificial Intelligence Fall Recap: Search. Today. Example: Pancake Problem. Example: Pancake Problem Announcements Project : erch It s live! Due 9/. trt erly nd sk questions. It s longer thn most! Need prtner? Come up fter clss or try Pizz ections: cn go to ny, ut hve priority in your own C 88: Artificil

More information

CSEP 573 Artificial Intelligence Winter 2016

CSEP 573 Artificial Intelligence Winter 2016 CSEP 573 Artificil Intelligence Winter 2016 Luke Zettlemoyer Problem Spces nd Serch slides from Dn Klein, Sturt Russell, Andrew Moore, Dn Weld, Pieter Abbeel, Ali Frhdi Outline Agents tht Pln Ahed Serch

More information

Implementing Automata. CSc 453. Compilers and Systems Software. 4 : Lexical Analysis II. Department of Computer Science University of Arizona

Implementing Automata. CSc 453. Compilers and Systems Software. 4 : Lexical Analysis II. Department of Computer Science University of Arizona Implementing utomt Sc 5 ompilers nd Systems Softwre : Lexicl nlysis II Deprtment of omputer Science University of rizon collerg@gmil.com opyright c 009 hristin ollerg NFs nd DFs cn e hrd-coded using this

More information

CSCI 446: Artificial Intelligence

CSCI 446: Artificial Intelligence CSCI 446: Artificil Intelligence Serch Instructor: Michele Vn Dyne [These slides were creted by Dn Klein nd Pieter Abbeel for CS188 Intro to AI t UC Berkeley. All CS188 mterils re vilble t http://i.berkeley.edu.]

More information

Position Heaps: A Simple and Dynamic Text Indexing Data Structure

Position Heaps: A Simple and Dynamic Text Indexing Data Structure Position Heps: A Simple nd Dynmic Text Indexing Dt Structure Andrzej Ehrenfeucht, Ross M. McConnell, Niss Osheim, Sung-Whn Woo Dept. of Computer Science, 40 UCB, University of Colordo t Boulder, Boulder,

More information

Notes for Graph Theory

Notes for Graph Theory Notes for Grph Theory These re notes I wrote up for my grph theory clss in 06. They contin most of the topics typiclly found in grph theory course. There re proofs of lot of the results, ut not of everything.

More information

CS 221: Artificial Intelligence Fall 2011

CS 221: Artificial Intelligence Fall 2011 CS 221: Artificil Intelligence Fll 2011 Lecture 2: Serch (Slides from Dn Klein, with help from Sturt Russell, Andrew Moore, Teg Grenger, Peter Norvig) Problem types! Fully observble, deterministic! single-belief-stte

More information

Slides for Data Mining by I. H. Witten and E. Frank

Slides for Data Mining by I. H. Witten and E. Frank Slides for Dt Mining y I. H. Witten nd E. Frnk Simplicity first Simple lgorithms often work very well! There re mny kinds of simple structure, eg: One ttriute does ll the work All ttriutes contriute eqully

More information

CSc 453. Compilers and Systems Software. 4 : Lexical Analysis II. Department of Computer Science University of Arizona

CSc 453. Compilers and Systems Software. 4 : Lexical Analysis II. Department of Computer Science University of Arizona CSc 453 Compilers nd Systems Softwre 4 : Lexicl Anlysis II Deprtment of Computer Science University of Arizon collerg@gmil.com Copyright c 2009 Christin Collerg Implementing Automt NFAs nd DFAs cn e hrd-coded

More information

CS311H: Discrete Mathematics. Graph Theory IV. A Non-planar Graph. Regions of a Planar Graph. Euler s Formula. Instructor: Işıl Dillig

CS311H: Discrete Mathematics. Graph Theory IV. A Non-planar Graph. Regions of a Planar Graph. Euler s Formula. Instructor: Işıl Dillig CS311H: Discrete Mthemtics Grph Theory IV Instructor: Işıl Dillig Instructor: Işıl Dillig, CS311H: Discrete Mthemtics Grph Theory IV 1/25 A Non-plnr Grph Regions of Plnr Grph The plnr representtion of

More information

Taming Subgraph Isomorphism for RDF Query Processing

Taming Subgraph Isomorphism for RDF Query Processing Tming Sugrph Isomorphism for RDF Query Processing Jinh Kim # jinh.kim@orcle.com Hyungyu Shin hgshin@dl.postech.c.kr Sungpck Hong # Hssn Chfi # {sungpck.hong, hssn.chfi}@orcle.com POSTECH, South Kore #

More information

Dr. D.M. Akbar Hussain

Dr. D.M. Akbar Hussain Dr. D.M. Akr Hussin Lexicl Anlysis. Bsic Ide: Red the source code nd generte tokens, it is similr wht humns will do to red in; just tking on the input nd reking it down in pieces. Ech token is sequence

More information

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών ΕΠΛ323 - Θωρία και Πρακτική Μταγλωττιστών Lecture 3 Lexicl Anlysis Elis Athnsopoulos elisthn@cs.ucy.c.cy Recognition of Tokens if expressions nd reltionl opertors if è if then è then else è else relop

More information

Approximation of Two-Dimensional Rectangle Packing

Approximation of Two-Dimensional Rectangle Packing pproximtion of Two-imensionl Rectngle Pcking Pinhong hen, Yn hen, Mudit Goel, Freddy Mng S70 Project Report, Spring 1999. My 18, 1999 1 Introduction 1-d in pcking nd -d in pcking re clssic NP-complete

More information

LR Parsing, Part 2. Constructing Parse Tables. Need to Automatically Construct LR Parse Tables: Action and GOTO Table

LR Parsing, Part 2. Constructing Parse Tables. Need to Automatically Construct LR Parse Tables: Action and GOTO Table TDDD55 Compilers nd Interpreters TDDB44 Compiler Construction LR Prsing, Prt 2 Constructing Prse Tles Prse tle construction Grmmr conflict hndling Ctegories of LR Grmmrs nd Prsers Peter Fritzson, Christoph

More information

A New Learning Algorithm for the MAXQ Hierarchical Reinforcement Learning Method

A New Learning Algorithm for the MAXQ Hierarchical Reinforcement Learning Method A New Lerning Algorithm for the MAXQ Hierrchicl Reinforcement Lerning Method Frzneh Mirzzdeh 1, Bbk Behsz 2, nd Hmid Beigy 1 1 Deprtment of Computer Engineering, Shrif University of Technology, Tehrn,

More information

PARALLEL AND DISTRIBUTED COMPUTING

PARALLEL AND DISTRIBUTED COMPUTING PARALLEL AND DISTRIBUTED COMPUTING 2009/2010 1 st Semester Teste Jnury 9, 2010 Durtion: 2h00 - No extr mteril llowed. This includes notes, scrtch pper, clcultor, etc. - Give your nswers in the ville spce

More information

Nearest Keyword Set Search in Multi-dimensional Datasets

Nearest Keyword Set Search in Multi-dimensional Datasets Nerest Keyword Set Serch in Multi-dimensionl Dtsets Vishwkrm Singh Deprtment of Computer Science University of Cliforni Snt Brbr, USA Emil: vsingh014@gmil.com Ambuj K. Singh Deprtment of Computer Science

More information

CS143 Handout 07 Summer 2011 June 24 th, 2011 Written Set 1: Lexical Analysis

CS143 Handout 07 Summer 2011 June 24 th, 2011 Written Set 1: Lexical Analysis CS143 Hndout 07 Summer 2011 June 24 th, 2011 Written Set 1: Lexicl Anlysis In this first written ssignment, you'll get the chnce to ply round with the vrious constructions tht come up when doing lexicl

More information

Alignment of Long Sequences. BMI/CS Spring 2012 Colin Dewey

Alignment of Long Sequences. BMI/CS Spring 2012 Colin Dewey Alignment of Long Sequences BMI/CS 776 www.biostt.wisc.edu/bmi776/ Spring 2012 Colin Dewey cdewey@biostt.wisc.edu Gols for Lecture the key concepts to understnd re the following how lrge-scle lignment

More information

documents 1. Introduction

documents 1. Introduction www.ijcsi.org 4 Efficient structurl similrity computtion etween XML documents Ali Aïtelhdj Computer Science Deprtment, Fculty of Electricl Engineering nd Computer Science Mouloud Mmmeri University of Tizi-Ouzou

More information

CSCI 104. Rafael Ferreira da Silva. Slides adapted from: Mark Redekopp and David Kempe

CSCI 104. Rafael Ferreira da Silva. Slides adapted from: Mark Redekopp and David Kempe CSCI 0 fel Ferreir d Silv rfsilv@isi.edu Slides dpted from: Mrk edekopp nd Dvid Kempe LOG STUCTUED MEGE TEES Series Summtion eview Let n = + + + + k $ = #%& #. Wht is n? n = k+ - Wht is log () + log ()

More information

cisc1110 fall 2010 lecture VI.2 call by value function parameters another call by value example:

cisc1110 fall 2010 lecture VI.2 call by value function parameters another call by value example: cisc1110 fll 2010 lecture VI.2 cll y vlue function prmeters more on functions more on cll y vlue nd cll y reference pssing strings to functions returning strings from functions vrile scope glol vriles

More information

CSCI 3130: Formal Languages and Automata Theory Lecture 12 The Chinese University of Hong Kong, Fall 2011

CSCI 3130: Formal Languages and Automata Theory Lecture 12 The Chinese University of Hong Kong, Fall 2011 CSCI 3130: Forml Lnguges nd utomt Theory Lecture 12 The Chinese University of Hong Kong, Fll 2011 ndrej Bogdnov In progrmming lnguges, uilding prse trees is significnt tsk ecuse prse trees tell us the

More information

MTH 146 Conics Supplement

MTH 146 Conics Supplement 105- Review of Conics MTH 146 Conics Supplement In this section we review conics If ou ne more detils thn re present in the notes, r through section 105 of the ook Definition: A prol is the set of points

More information

A Heuristic Approach for Discovering Reference Models by Mining Process Model Variants

A Heuristic Approach for Discovering Reference Models by Mining Process Model Variants A Heuristic Approch for Discovering Reference Models by Mining Process Model Vrints Chen Li 1, Mnfred Reichert 2, nd Andres Wombcher 3 1 Informtion System Group, University of Twente, The Netherlnds lic@cs.utwente.nl

More information

Stack Manipulation. Other Issues. How about larger constants? Frame Pointer. PowerPC. Alternative Architectures

Stack Manipulation. Other Issues. How about larger constants? Frame Pointer. PowerPC. Alternative Architectures Other Issues Stck Mnipultion support for procedures (Refer to section 3.6), stcks, frmes, recursion mnipulting strings nd pointers linkers, loders, memory lyout Interrupts, exceptions, system clls nd conventions

More information

The Greedy Method. The Greedy Method

The Greedy Method. The Greedy Method Lists nd Itertors /8/26 Presenttion for use with the textook, Algorithm Design nd Applictions, y M. T. Goodrich nd R. Tmssi, Wiley, 25 The Greedy Method The Greedy Method The greedy method is generl lgorithm

More information

Inference of node replacement graph grammars

Inference of node replacement graph grammars Glley Proof 22/6/27; :6 File: id293.tex; BOKCTP/Hin p. Intelligent Dt Anlysis (27) 24 IOS Press Inference of node replcement grph grmmrs Jcek P. Kukluk, Lwrence B. Holder nd Dine J. Cook Deprtment of Computer

More information

4452 Mathematical Modeling Lecture 4: Lagrange Multipliers

4452 Mathematical Modeling Lecture 4: Lagrange Multipliers Mth Modeling Lecture 4: Lgrnge Multipliers Pge 4452 Mthemticl Modeling Lecture 4: Lgrnge Multipliers Lgrnge multipliers re high powered mthemticl technique to find the mximum nd minimum of multidimensionl

More information

CS201 Discussion 10 DRAWTREE + TRIES

CS201 Discussion 10 DRAWTREE + TRIES CS201 Discussion 10 DRAWTREE + TRIES DrwTree First instinct: recursion As very generic structure, we could tckle this problem s follows: drw(): Find the root drw(root) drw(root): Write the line for the

More information

Misrepresentation of Preferences

Misrepresentation of Preferences Misrepresenttion of Preferences Gicomo Bonnno Deprtment of Economics, University of Cliforni, Dvis, USA gfbonnno@ucdvis.edu Socil choice functions Arrow s theorem sys tht it is not possible to extrct from

More information

Efficient Regular Expression Grouping Algorithm Based on Label Propagation Xi Chena, Shuqiao Chenb and Ming Maoc

Efficient Regular Expression Grouping Algorithm Based on Label Propagation Xi Chena, Shuqiao Chenb and Ming Maoc 4th Ntionl Conference on Electricl, Electronics nd Computer Engineering (NCEECE 2015) Efficient Regulr Expression Grouping Algorithm Bsed on Lbel Propgtion Xi Chen, Shuqio Chenb nd Ming Moc Ntionl Digitl

More information

Engineer To Engineer Note

Engineer To Engineer Note Engineer To Engineer Note EE-186 Technicl Notes on using Anlog Devices' DSP components nd development tools Contct our technicl support by phone: (800) ANALOG-D or e-mil: dsp.support@nlog.com Or visit

More information

Algorithm Design (5) Text Search

Algorithm Design (5) Text Search Algorithm Design (5) Text Serch Tkshi Chikym School of Engineering The University of Tokyo Text Serch Find sustring tht mtches the given key string in text dt of lrge mount Key string: chr x[m] Text Dt:

More information

Available at ISSN: Vol. 4, Issue 2 (December 2009) pp (Previously Vol. 4, No.

Available at   ISSN: Vol. 4, Issue 2 (December 2009) pp (Previously Vol. 4, No. Avilble t http://pvmu.edu.edu/pges/398.sp ISSN: 93-9466 Vol. 4, Issue December 009 pp. 434 444 Previously Vol. 4, No. Applictions nd Applied Mthemtics: An Interntionl Journl AAM On -ry Subdivision for

More information

From Indexing Data Structures to de Bruijn Graphs

From Indexing Data Structures to de Bruijn Graphs From Indexing Dt Structures to de Bruijn Grphs Bstien Czux, Thierry Lecroq, Eric Rivls LIRMM & IBC, Montpellier - LITIS Rouen June 1, 201 Czux, Lecroq, Rivls (LIRMM) Generlized Suffix Tree & DBG June 1,

More information

A dual of the rectangle-segmentation problem for binary matrices

A dual of the rectangle-segmentation problem for binary matrices A dul of the rectngle-segmenttion prolem for inry mtrices Thoms Klinowski Astrct We consider the prolem to decompose inry mtrix into smll numer of inry mtrices whose -entries form rectngle. We show tht

More information

LECT-10, S-1 FP2P08, Javed I.

LECT-10, S-1 FP2P08, Javed I. A Course on Foundtions of Peer-to-Peer Systems & Applictions LECT-10, S-1 CS /799 Foundtion of Peer-to-Peer Applictions & Systems Kent Stte University Dept. of Computer Science www.cs.kent.edu/~jved/clss-p2p08

More information

EECS150 - Digital Design Lecture 23 - High-level Design and Optimization 3, Parallelism and Pipelining

EECS150 - Digital Design Lecture 23 - High-level Design and Optimization 3, Parallelism and Pipelining EECS150 - Digitl Design Lecture 23 - High-level Design nd Optimiztion 3, Prllelism nd Pipelining Nov 12, 2002 John Wwrzynek Fll 2002 EECS150 - Lec23-HL3 Pge 1 Prllelism Prllelism is the ct of doing more

More information

Efficient Algorithms For Optimizing Policy-Constrained Routing

Efficient Algorithms For Optimizing Policy-Constrained Routing Efficient Algorithms For Optimizing Policy-Constrined Routing Andrew R. Curtis curtis@cs.colostte.edu Ross M. McConnell rmm@cs.colostte.edu Dn Mssey mssey@cs.colostte.edu Astrct Routing policies ply n

More information

Efficient Techniques for Tree Similarity Queries 1

Efficient Techniques for Tree Similarity Queries 1 Efficient Techniques for Tree Similrity Queries 1 Nikolus Augsten Dtbse Reserch Group Deprtment of Computer Sciences University of Slzburg, Austri July 6, 2017 Austrin Computer Science Dy 2017 / IMAGINE

More information

Definition of Regular Expression

Definition of Regular Expression Definition of Regulr Expression After the definition of the string nd lnguges, we re redy to descrie regulr expressions, the nottion we shll use to define the clss of lnguges known s regulr sets. Recll

More information

Math 142, Exam 1 Information.

Math 142, Exam 1 Information. Mth 14, Exm 1 Informtion. 9/14/10, LC 41, 9:30-10:45. Exm 1 will be bsed on: Sections 7.1-7.5. The corresponding ssigned homework problems (see http://www.mth.sc.edu/ boyln/sccourses/14f10/14.html) At

More information

Intermediate Information Structures

Intermediate Information Structures CPSC 335 Intermedite Informtion Structures LECTURE 13 Suffix Trees Jon Rokne Computer Science University of Clgry Cnd Modified from CMSC 423 - Todd Trengen UMD upd Preprocessing Strings We will look t

More information

Languages. L((a (b)(c))*) = { ε,a,bc,aa,abc,bca,... } εw = wε = w. εabba = abbaε = abba. (a (b)(c)) *

Languages. L((a (b)(c))*) = { ε,a,bc,aa,abc,bca,... } εw = wε = w. εabba = abbaε = abba. (a (b)(c)) * Pln for Tody nd Beginning Next week Interpreter nd Compiler Structure, or Softwre Architecture Overview of Progrmming Assignments The MeggyJv compiler we will e uilding. Regulr Expressions Finite Stte

More information

Reducing a DFA to a Minimal DFA

Reducing a DFA to a Minimal DFA Lexicl Anlysis - Prt 4 Reducing DFA to Miniml DFA Input: DFA IN Assume DFA IN never gets stuck (dd ded stte if necessry) Output: DFA MIN An equivlent DFA with the minimum numer of sttes. Hrry H. Porter,

More information

F. R. K. Chung y. University ofpennsylvania. Philadelphia, Pennsylvania R. L. Graham. AT&T Labs - Research. March 2,1997.

F. R. K. Chung y. University ofpennsylvania. Philadelphia, Pennsylvania R. L. Graham. AT&T Labs - Research. March 2,1997. Forced convex n-gons in the plne F. R. K. Chung y University ofpennsylvni Phildelphi, Pennsylvni 19104 R. L. Grhm AT&T Ls - Reserch Murry Hill, New Jersey 07974 Mrch 2,1997 Astrct In seminl pper from 1935,
