I/O Efficient Dynamic Data Structures for Longest Prefix Queries

Size: px
Start display at page:

Download "I/O Efficient Dynamic Data Structures for Longest Prefix Queries"

Transcription

1 I/O Efficient Dynmic Dt Structures for Longest Prefix Queries Moshe Hershcovitch 1 nd Him Kpln 2 1 Fculty of Electricl Engineering, moshik1@gmil.com 2 School of Computer Science, himk@cs.tu.c.il, Tel Aviv University, Tel Aviv 69978, Isrel Astrct. We present n efficient dt structure for finding the longest prefix of query string p in dynmic dtse of strings. When the strings re IP-ddresses then this is the IP-lookup prolem. Our dt structure is I/O efficient. It supports query with string p using O(log B (n) + p ) I/O opertions, where B is the size of disk lock. B It lso supports n insertion nd deletion of string p with the sme numer of I/O s. The size of the dt structure is liner in the size of the dtse nd the running time of ech opertion is O(log(n) + p ). 1 Introduction We consider the longest prefix prolem which is defined s follows. The input consists of set of strings S = {p 1...p n } which we shll refer to s prefixes. We wnt to preprocess S into dt structure such tht given query string q we cn efficiently find the longest prefix in S which is prefix of q or report tht no prefix in S is prefix of q. We focus on the dynmic version of the prolem where we wnt to e le to insert nd delete prefixes to nd from S, respectively. The min ppliction of this prolem is for pcket forwrding in IP networks. In this ppliction router mintins set of prefixes of IP ddresses ccording to some routing protocol such s BGP (Border Gtewy Protocol). When pcket rrives, the router finds the longest prefix of the destintion ddress of the pcket nd sends the pcket on the outgoing link ssocited with this longest prefix. A longest prefix mtch query in this settings is often clled IP-lookup. The rpid growth of the Internet hs rought the need for routers to mintin lrge sets of prefixes, nd to perform longest prefix mtch queries t high speeds [7]. A min issue in the design of routers is the size of the expensive high speed memory used y the router for pcket forwrding. One cn reduce the size of this expensive memory y using externl memory components. Therefore the I/O efficiency of the lgorithm we use is very importnt. In softwre implementtions the entire dt structure my not fit into the cche nd we my flip prts of it ck nd forth from the min memory or the disk. In hrdwre implementtions it my e too expensive to fit the entire dt structure in the memory which is This work is prtilly supported y United sttes - Isrel Bintionl Science Foundtion, project numer

2 integrted in the chip tht implements the forwrding lgorithm itself. So prts of the dt structure my reside in externl memory devices (such s DRAM). In such designs the communiction etween the min chip nd the externl memory ecomes the performnce ottleneck of the system. In ddition to good I/O performnce n efficient dt structure for IP-lookup should e le to perform queries t the line rte (which is out 40 Gps nd more tody). The dt structure hs to e sclle since the numer of prefixes router hs to mintin is growing rpidly s well s the length of these prefixes. Finlly, we need to support fst updtes minly due to instilities in ckone routing protocols nd security issues. Our computtionl model. Our lgorithm works in the clssicl pointer mchine model[19] using only comprisons to mnipulte prefixes. This is in contrst with mny other lgorithms for IP-lookup tht use it mnipultions on IP ddresses. See Section 1. This restriction which we oey does not come t the cost of complicted dt structure. On the contrry, our dt structure is much simpler thn previously known dt structure with the sme gurntees. We develop our dt structures in two steps. First we reduce the longest prefix prolem to the prolem of finding the shortest segment contining query point. For this dt structure we ssume tht ech prefix fits into constnt numer of computer words so tht we cn compre two endpoints of segments in O(1) time. In the second step we relx this ssumption nd del with vrile length strings with no priori upper ound on their length. We ssume tht the strings re over n ordered lphet Σ nd tht we cn compre two chrcters of Σ in O(1) time. We do not require direct ccess to the chrcters of ech prefix. To nlyze I/O performnce we use the stndrd externl memory model where memory is prtitioned into locks of size B, nd we count the numer of locks tht we hve to trnsfer from slow to fst memory in order to perform the opertion. This quntity is the numer of I/O opertions performed y the opertion.[20]. Overview of our results. We first consider the prolem of mintining dynmic set of segments for point sting queries. Specificlly, we consider dynmic nested fmily of segments where ech pir of segments re either disjoint or one contins the other. We develop dt structure tht given point p cn efficiently find the shortest segment contining p. Our dt structure for this prolem which is sed on segment tree (with lrge fn-out) is prticulrly simple. In segment tree we mp segment s to every node v, such tht s contins 3 v, nd does not contin the prent of v. Typiclly one mintins t ech node v ll the segments tht mp to v in some secondry structure [18, 13, 12]. We mke the crucil oservtion tht for our point sting query it is sufficient to mintin for ech node v only the shortest segment which mps to v. Our dt structure performs query, nd n insertion or deletion of segment in O(log n) time nd O(log B (n)) I/O opertions. We mnipulte the segments only vi comprisons of their endpoints. 3 A segment contins node if it contins ll points in its sutree.

3 We use this result to solve the longest prefix prolem s follows. We ssocite segment [pl,pr] with ech string p, where L nd R re two new chrcters, smller nd lrger thn ll other chrcters, respectively. We then pply the previous dt structure to this set of segments. This gives the dt structure for the cse where we ssume tht two prefixes cn e compred in O(1) time. To hndle strings with no fixed ound on their lengths we comine this ide with the powerful string B-tree of Ferrgin nd Grossi [9]. This dt structure is B-tree crefully designed for storing strings. For efficient serches nd updtes it uses Ptrici trie [14] in ech node. We show how to mintin the informtion which we need for longest prefix queries using the string B-tree. Since our dt structure is sed on B-tree it is lso I/O efficient. If we pick the size of node so tht it fits in disk lock of size B, we otin tht query or updte with string q performs O(log B (n)+ q /B) I/O opertions. The dt structure requires O(n/B) disk locks in ddition to the locks required to store the strings themselves. The time for query or updte with string q is O(log(n) + q ). (This is s efficient s with tries implemented crefully [16], ut tries cnnot e implemented I/O efficiently [6]). Previous relted results. There hs een lot of work minly in the networking community on the IP-lookup prolem. The different dt structures cn e clssified into three fmilies: trie sed structures (See for exmple [7] nd the references there), hsh sed structures (See for exmple [11] nd the references there), nd tree sed structures. In the rest of this section we focus on dynmic tree sed solutions with worst cse gurntees tht re relted to our pproch. Shni nd Kim [15] descrie solution sed on collection of red-lck trees tht requires liner spce nd logrithmic time per opertion. Feldmnn nd Muthukrishnn [8] proposed Ft Inverted Segment tree (FIS). This dt structure supports queries in O(log log n + l) time, where l is the numer of levels in the segment tree. The spce requirement is O(n 1+1/l ), nd insert nd delete tke O(n 1/l log n) time, ut there is n upper ound on the totl numer of insertions nd deletions llowed. Suri et l. [18] proposed dt structure which is similr to ours in the sense tht it is oth segment tree nd B-tree. But they store in ech node ll the segments which re mpped to it nd therefore chieve logrithmic worst cse time ound per opertion nd liner spce only for IP-ddresses. Lu nd Shni [13] suggested n improvement of the segment tree of Suri tht stores ech prefix only in one plce. They mintin other it vectors in internl nodes nd their updte opertions re quite complicted. Our structure, which cn e extended to generl strings (nd generl segments) nd uses only comprisons, is simpler thn ll the solutions mentioned ove. In prticulr it is much clener thn the ltter two B-tree sed implementtions when pplied to IP ddresses. Kpln, Mold, nd Trjn [12] considered the prolem of point sting dynmic nested set of segments. In their setting, which is more generl thn ours, ech segment hs priority ssocited with it nd we wnt to find the segment of minimum priority contining query point. They present dt structure performing query nd updte in O(log n) time tht requires liner spce. It uses oth lnced serch tree nd dynmic tree [17] nd therey

4 more complicted thn ours (when pplied to the specil cse where the priority of n intervl is its length). In recent yers, externl memory dt structures hve een developed for wide rnge of pplictions [20]. A clssicl I/O efficient dt structure is the B-tree [2]. This is serch tree in which we choose the degree of node so tht it occupies single lock. The string B-tree of Ferrgin nd Grossi [9] is fundmentl extension of the B-tree for storing unounded strings. The min ide is to use Ptrici trie [14] in ech node to direct the serch. Unfortuntely this dt structure y itself does not solve the longest prefix prolem. Agrwl, Arge, nd Yi [1] improved more generl dt structure of Kpln, Mold, nd Trjn [12] for sting-min queries ginst generl segments (not necessrily nested). This dt structure is sed on B-tree nd cn e implemented so tht it is I/O efficient. Specificlly, dt structure for n intervls uses O(n/B) disk locks nd O(log B (n)) I/O opertions for query nd updte. This dt structure is quite complicted nd ssumes tht endpoints of intervls cn e compred in O(1) time. Therefore it is not directly pplicle for longest prefix queries in collection of unounded strings. Brodl nd Fgererg [5] otined cche olivious (see Section 4) dt structure for mnipulting strings. This dt structure, which is essentilly trie cn e used to otin n I/O efficient (though complicted) solution for the sttic version of the longest prefix prolem. The outline of the rest of the pper is s follows. In section 2 we present our sic ides using the ssumption tht strings re of constnt size. In section 3 we comine our ides with the string B-tree to otin generl I/O efficient solution. In section 4 we suggest future reserch. 2 B Tree for Longest Prefix Queries Our input is set of prefixes S = {p 1...p n } which re strings over the lphet Σ. We think of ech prefix p s segment I(p) = [pl,pr], where L nd R re two specil chrcters not in Σ, L is smller thn ll chrcters in Σ nd R is lrger thn ll chrcters in Σ. The longest prefix p of query q corresponds to the the shortest segment I(p) contining q. We descrie dynmic dt structure to mintin set of nested segments such tht we cn find the smllest segment contining query point. Although we cn pply our dt structure to ny set of nested segments we present our result using the string terminology nd the set of segments {I(p) p S}. Let P = {p i L, p i R p i S} e the set of endpoints of the prefixes in S. We store P ordered lexicogrphiclly t the leves of B + tree T. Ech internl node x of T hs n(x) children, where n(x) 2. If x is lef then it stores n(x) endpoints of P, where n(x) 2. (See Figure 1.) To simplify the presenttion we ssume tht lef x with n(x) endpoints hs n(x) + 1 dummy children. From now on when we sy lef of T, we refer to one of these dummy nodes, nd we refer to x s height-1 node. Ech endpoint in height-1 node plys the role of key seprting two consecutive

5 A 1 A 3 L A 6 R A 4 R A 2 R A 2 A 3 L A 5 L A 5 R A 6 R A 4 R A 7 R A 8 L A 8 R A 3 R A 2 R A 4 A 3 A 3 A 3 L A 4 L A 5 L A 5 R A 6 L A 6 R A 4 R A 7 L A 7 R A 8 L A 8 R A 3 R A 2 R A 3 A 4 A 5 A 6 A 7 A 8 A 2 A 5 A 6 A 7 A 8 A 4 A 3 A 2 A 1 Fig. 1. A B + tree with = 2 storing the prefixes A 1,..., A 8. Rectngles correspond to internl nodes nd squres correspond to dummy leves. In ech height-1 node we show the endpoints tht it stores. In ech internl node v of height > 1 we show the spns of its children which re lso used s the keys which direct the serch. (Note tht when we use the spns s keys, serch cn never rech the first dummy lef in ech height-1 node. Therefore we do not need to keep longest prefixes of these nodes nd we do not show them in the figure.) The spn of child u of v is the closed intervl from the point depicted to the left of the edge from v to u to the point depicted to the right of the edge from v to u. On ech edge (p(v), v) we show the longest prefix of v. dummy leves. We ssocite ech lef with the open intervl from the endpoint preceding the lef to the endpoint following the lef. We cll this intervl the spn of the lef. (Note tht the lst dummy lef in height-1 node nd the first dummy lef in the next height-1 node hve the sme spn.) We define the spn of n internl node v to e the smllest intervl contining the endpoints which re descendents of v. 4 We denote the spn of node v y spn(v). We think of T s segment tree nd mp ech segment [pl,pr] to every node v such tht pl is to the left of spn(v) nd pr is to the right of spn(v), nd either pl or pr re in spn(p(v)). We define the longest prefix (or the shorter segment) of v nd denote it LP(v) to e the shortest segment mpped to v. 5 If there isn t ny segment which is mpped to v we define LP(v) to e empty. (LP(v) is lso defined if v is dummy lef.) 4 This is the intervl tht strts t the leftmost endpoint in the sutree of v, nd ends t the rightmost endpoint in the sutree of v. 5 Note tht ll segments mpped to v form nested fmily of segments, so the shortest mong them is unique.

6 We store spn(v) nd LP(v) with the pointer to node v. Note tht when we re t node v we cn use the spn vlues of the children of v s the keys which direct the serch. We denote y B = O() the mximum size of node. We pick so tht B is the size of disk lock. 2.1 Finding the longest prefix Assume we wnt to find the longest prefix of query string q. We serch the B + tree with the string q 6 in stndrd wy nd trverse pth A to lef of T. We return LP(w), where w is the lst node on A, such tht LP(w) is not empty. The correctness of the query follows from the following oservtions. For ech prefix p of q, I(p) is mpped to some node u on A. Therefore p is LP(u) unless some longer prefix is mpped to u. Furthermore, since for every v, LP(p(v)) is prefix of LP(v), it follows tht the longest prefix of q must e LP(w), where w is the lst node on A for which LP(v) is not empty. 2.2 Inserting new prefix To insert new prefix p we hve to insert I(p) into T. We insert pl nd pr into the pproprite height-1 nodes w nd w, respectively, ccording to the lexicogrphic order of the endpoints. The endpoint pl is inserted into the spn of lef y nd the endpoint pr is inserted into the spn of lef z. Assume first tht z y. The spn of y is now split etween two new leves: y tht precedes pl nd y tht follows pl. We set the longest prefix of y to e the longest prefix of y nd the longest prefix of y to e p. The spn of z is now split etween two new leves: z tht precedes pr nd z tht follows pr. We set the longest prefixes of z to e p nd the longest prefix of z to e the longest prefix of z. If y = z then the spn of y is split etween three new leves y 1, y 2, nd y 3. We set the longest prefixes of y 1 nd y 3 to e the longest prefix of y, nd the longest prefix of y 2 to e p. There my e nodes v in T, such tht fter dding p, we hve to updte LP(v) to e p. Let y e the lef preceding pl nd let z e the lef following pr. Let u e the lowest common ncestor of y nd z. Let u e the child of u which is n ncestor of y nd let u e the child of u which is n ncestor of z. We my need to updte LP(v) if either: Cse 1: v is child of node w on the pth from u to y, which is right siling of the child w of w on this pth. Cse 2: v is child of node w on the pth from u to z, which is left siling of the child w of w on this pth. Cse 3: v is child of u which is right siling of u nd left siling of u. 6 In fct we pretend to serch with string (not in the dt structure) tht immeditely follows ql in the lexicogrphic order of the strings.

7 For ech such node v, we know tht spn(v) I(p) so we chnge LP(v) to e p, if p is longer thn the current LP(v). Since the depth of T is O(log B (n)), we updte O(B log B (n)) longest prefixes which re stored t O(log B (n)) nodes. After inserting pl nd pr, if the height-1 node contining pl nd the height-1 node contining pr hve no more thn 2 children, we finish the insert. Otherwise we hve to split t lest one of these nodes. We split node v into two nodes v 1 nd v 2. Node v 1 is the prent of the first (or +1) children of v nd node v 2 is the prent of the lst +1 children of v. Both v 1 nd v 2 replce v s consecutive children of p(v). We compute spn(v 1 ) from the spn of its first child nd the spn of its lst child, nd similrly for spn(v 2 ). A 1 v A 3 A 2 A 2 A 5 A 2 A 1 v 1 v 2 A 3 A 2 A 5 u 1 u 2 u 3 u 4 u 5 A 3 A 5 A 2 A 1 u 1 u 2 u 3 u 4 u 5 A 3 A 5 A 2 A 1 Fig. 2. A node v t the left which is split into nodes v 1 nd v 2 to the right. Since spn(v 1) I(A 2) the prefix A 2, which ws the longest prefix of u 2 efore the split, is the longest prefix of v 1 fter the split. The longest prefix of u 2 fter the split is empty. Clerly we hve to updte LP(v 1 ) nd LP(v 2 ). Furthermore, since segment tht ws mpped to child u of v my now e mpped to v 1 or v 2, we my lso hve to updte LP(u) for children u of v 1 nd v 2. Other longest prefixes do not chnge. The following simple oservtions specify how to updte the longest prefixes. In the following if u is child of v prior the split, then LP(u) refers to the longest prefix of u efore the split. Note tht since v exists only efore the split then LP(v) is the longest prefix of v efore the split. Similrly, LP(v 1 ) nd LP(v 2 ) re the longest prefix of v 1 nd v 2, respectively, fter the split. Lemm 1. Let u e child of v 1 fter the split. If spn(v 1 ) I(LP(u)) then fter split LP(u) should e empty. Proof. Let p = LP(u) since spn(v 1 ) I(LP(u)) then the segment I(p) is not mpped to u fter the split. Since I(p) ws the shortest segment tht ws mpped to u no other segment is mpped to u fter the split. Lemm 2. Let u 1 nd u 2 e children of v 1. If spn(v 1 ) I(LP(u 1 )) nd spn(v 1 ) I(LP(u 2 )) then LP(u 1 ) = LP(u 2 ). Proof. Since spn(v 1 ) I(LP(u 1 )) then spn(u 2 ) I(LP(u 1 )). So LP(u 1 ) cnnot e longer thn LP(u 2 ) since this would contrdict the fct tht I(LP(u 2 )) is the shortest segment contining spn(u 2 ). Symmetriclly, LP(u 2 ) cnnot e longer thn LP(u 1 ), so they must e equl.

8 Lemm 3. If there exist child u of v 1 such tht spn(v 1 ) I(LP(u)) then LP(v 1 ) is LP(u). Proof. Oviously LP(u) is mpped to v 1. Furthermore, LP(u) is the longest prefix with this property, since if there is longer prefix q with this property then q should hve een LP(u) efore the split. Lemm 4. If there isn t child u of v 1 such tht spn(v 1 ) I(LP(u)) then LP(v 1 ) is equl LP(v). Proof. We clim tht there exists child u of v 1 tht LP(u) is empty. From this clim the lemm follows since if there is prefix q longer thn LP(v) such tht I(q) is mpped to v 1, then q is mpped to u efore the split nd LP(u) couldn t hve een empty. We prove this clim s follows. Assume to the contrry tht LP(u) is not empty for every child u of v 1. Let w e child of v 1 such tht I(LP(w)) is not contined in I(LP(w )) for ny other child w of v (w exists since segments do not overlp). From our ssumption follows tht I(LP(w)) spn(v 1 ). Therefore t lest one of the endpoints of I(LP(w)), sy z is in the sutree of v 1. Let w e child of v 1 whose sutree contins z. It is esy to see now tht I(LP(w )) nd I(LP(w)) overlp which is contrdiction. A symmetric version of Lemms 1, 2, 3, nd 4 hold for v 2. These oservtions imply the following strightforwrd lgorithm to updte longest prefixes when we perform split. If there is child u of v 1 such tht spn(v 1 ) I(LP(u)) we set LP(v 1 ) to e LP(u), otherwise we set LP(v 1 ) to e LP(v). In ddition we set LP(u) to e empty for every child u of v 1 such tht spn(v 1 ) I(LP(u)). We updte the spn of v 2 nd its children nlogously. See Figure 2. After splitting v we recursively check if p(v 1 ) or p(v 2 ) hs more thn 2 children nd if so we continue to split them until we rech node tht hs no more thn 2 children. 2.3 Deleting prefix To delete prefix p we need to delete I(p) from T. We first find the longest prefix of p in S denoted y w. 7 Then we delete pl nd pr from the height-1 nodes contining them. We hve to chnge the longest prefix of every node v for which LP(v) = p to w. Nodes v for which LP(v) my e equl to p re of three kinds s specified in Cses (1), (2) nd (3) of Section 2.2 As result of deleting pl nd pr from the height-1 nodes contining them we my crete nodes with less thn children. To fix such node v we either orrow child from siling of v or merge v with one of its silings. We omit the detils of these relncing opertions nd their ffect on longest prefixes from this strct. The following theorem summrizes the results of this section. 7 We do tht y query with string (not in the dt structure) tht immeditely follows pr in the lexicogrphiclly order of strings.

9 Theorem 1. Assuming ech string occupies O(1) words, the B-tree dt structure which we descried supports longest prefix queries, insertions, nd deletions in O(log(n)) time. Furthermore, it performs O(log B (n)) I/Os per opertion, nd requires liner spce. 3 String B-tree For Longest Prefix Queries In B-tree, we ssume tht Θ() keys tht reside t single node fit into one disk lock of size B. However if the keys re strings of vrile sizes, which cn e ritrrily long, there my not e enough spce to store Θ() strings in single lock. Insted, we cn store Θ() pointers to strings in ech node, ut ccessing these strings during the serch requires more thn constnt numer of I/O opertions per node. To reduce the numer of I/Os, Ferrgin nd Grossi [9] developed n elegnt generliztion of B-tree clled the string B-tree or SB-tree for short. 0 P = cc 3 4 c mismtch c c c c c c c Correct Position 8 lef1 Common c prefix c Fig. 3. A Ptrici trie of node in string B-tree. The numer in node is its string depth. The chrcter on n edge is the rnching chrcter of the edge. An individul node v of n SB-tree is shown in Figure 3. Insted of storing the keys t node v we store Ptrici trie [14] of the keys, denoted y PT(v). Using this representtion we cn perform -wy rnching using only Θ() chrcters tht re stored in constnt numer of disk locks of size B. Ech internl node ξ of the Ptrici trie stores the length of the string corresponding the pth from the root to ξ. We cll this the string depth of ξ. We store with ech edge e the first chrcter of the string tht corresponds to e. This chrcter is clled the rnching chrcter of e. As n exmple Figure 3 shows Ptrici trie of node in string B-tree. The right child of the root hs string depth 4 nd it s outgoing edges hve the rnching chrcters nd, respectively. This mens tht the node s left sutrie consists of strings whose fifth chrcter is, nd its right sutrie consists of strings whose fifth chrcter is. The first four chrcters in ll

10 the strings in the right sutrie of the root re cc. Let ξ e node of the trie whose string depth is d(ξ). To mke rnching decision t ξ, we compre the d(ξ) + 1 chrcter of the string tht we serch, to the chrcters on the edges outgoing from ξ. For exmple, for the string cc, the serch in the trie in Figure 3 trverses the rightmost pth of the Ptrici trie, exmining the chrcters 1, 5, nd 7 of the string which we serch. Unfortuntely, the lef of the Ptrici trie tht we rech (in our exmple, the lef t the fr right, corresponding to cc ) is not in generl the correct rnching point, from the node of the SB-tree represented y this trie, since we did not compre ll chrcters of the string which we serch. We fix this y sequentilly compring the string which we serch with the key ssocited with the lef of the trie which we reched. If they differ, we find the position in which they first differ. In the exmple the first chrcter of the string cc tht is not equl to the corresponding chrcter of the key cc, is the fourth chrcter. Since the fourth chrcter of cc is smller we know tht the string which we serch is lexicogrphiclly smller thn ll keys in the right sutree of the root. It thus fits in etween the leves c nd cc. For more detils see [9]. Serching ech Ptrici trie requires constnt numer of I/O to lod it into memory, plus dditionl I/Os to do the sequentil scn of the key ssocited with the lef we reched. Therefore our structure s defined so fr does not gurntee tht the totl numer of I/Os is O(log B n + l/b), where l is the length of the string tht we serch. To further reduce the numer of I/Os Ferrgin nd Grossi [9] used the leftmost nd the rightmost strings in the sutree of node v s keys t p(v). Recll tht we in fct did the sme in our B-tree when we use the spns of the children of v s the keys t v. Hving the keys defined this wy, we cn use informtion from the serch in the trie PT(v) of node v to reduce the numer of I/Os in the followings serch of the trie PT(u) of child u of v. Specificlly, let s e the string which we serch, nd let l e the length of the longest common prefix of s nd the key t the lef of PT(v), where the serch ended. Then it is gurnteed tht the length of the longest common prefix of s nd the key t the lef of PT(u), where the serch of s ends is t lest l. Thus, we cn void the first l comprisons nd the I/Os ssocited with them. Ferrgin nd Grossi [9] lso showed how to insert nd delete string in O(log B n + l/b) time in the worst cse. We now descrie how to comine the SB-tree with our lgorithm for longest prefix queries so tht our input prefixes S = {p 1...p n } cn e ritrrily long. As Ferrgin nd Grossi [9], we use the endpoints of spn(v) s keys t p(v), nd represent the keys of ech node v in Ptrici trie PT(v). Ech lef of the Ptrici trie stores pointer to the first lock contining the key tht it corresponds to. We use the sme definition of the longest prefix of node v, denoted y LP(v), s in Section 2. Recll tht from these definitions follow tht if LP(v) is not empty then spn(v) I(LP(v)) nd therefore LP(v) is prefix of every key in the sutree of v. Let spn(v) = [KL(v),KR(v)]. Tht is KL(v)

11 e the leftmost string in the sutree of v nd KR(v) is the rightmost string in the sutree of v. Clerly LP(v) is prefix of KL(v) nd KR(v). The string KL(v) is key seprting v from its siling in p(v) nd therefore corresponds to lef in PT(v). So we represent LP(v) y storing its length, nd pointer to it, in the lef of PT(v), tht corresponds to KL(v). If LP(v) is empty we encode this y storing zero t the ssocited lef. Finding the longest prefix. We serch the SB-tree nd trverse pth A to lef of T. Let w e the lst node on A for which LP(w) is not empty. Together with the pointer to w in p(w), we find LP(w) nd pointer to LP(w). Inserting new prefix. Assume we wnt to insert new prefix p S to the dt structure. We insert pl nd pr into the SB-tree using the insertion lgorithm of the SB-tree. As in Section 2.2 there my e nodes v in T, such tht fter dding p, we need to updte LP(v) to e p. Nodes v for which LP(v) my e equl to p re of three kinds s specified in Cses (1), (2) nd (3) of Section 2.2. For ech such node v we chnge LP(v) to e p, if p is longer thn the current vlue LP(v). This is correct since for ech of these nodes v, we know tht spn(v) I(p). Note tht ll these chnges re locted t O(log B (n)) nodes of the SB-tree, nd therefore we cn perform them while doing O(log B (n)) I/O opertions. After inserting prefix p we my split node v into two nodes v 1 nd v 2. We split node in the SB-tree using the lgorithm of Ferrgin nd Grossi [9]. Splitting my chnge the longest prefixes. To perform these chnges we use the sme lgorithm s in Section 2.2. To implement this lgorithm we need to determine if there is child u of v 1 such tht spn(v 1 ) I(LP(u)). Let u e child of v 1. We decide if spn(v 1 ) I(LP(u)) s follows. Since LP(u) is prefix of KL(u) nd KR(u), nd KL(u) nd KR(u) re keys in PT(v 1 ) then there is pth in PT(v 1 ) tht corresponds to the string LP(u). It follows tht LP(u) is prefix of ll the keys in PT(v 1 ), nd in prticulr of KL(v 1 ) nd KR(v 2 ), if LP(u) is not lrger thn the string depth of the root of PT(v 1 ). We check if there is child u of v 2 tht spn(v 2 ) I(LP(u)) nlogously. Deletion of prefix is similr, we omit the detils from this strct. The following theorem summrizes the results of this section. Theorem 2. The dt structure which we descried in this section supports longest prefix queries, insertions, nd deletions in O(log(n) + q ) time where q is the string which we perform the opertion with. Furthermore, it performs O(log B (n) + q /B) I/Os per opertion, nd requires liner spce. 4 Future Reserch The cche olivious model [10] is generliztion of the I/O model. In this model we seek I/O efficient lgorithms which do not depend on the lock size. Among the stte of the rt in this model is cche-olivious B-tree [3], nd n lmost efficient cche-olivious string B-tree [4] whose query time is optiml

12 ut updtes re not. An ovious open question is to find cche olivious dt structure for longest prefix queries. References 1. P. K. Agrwl, L. Arge, nd K. Yi. An Optiml Dynmic Intervl Sting-Mx Dt Structure? In Proceedings of SODA, pges , R. Byer nd E. M. McCreight. Orgniztion nd Mintennce of Lrge Ordered Indexes. Act Informtic, 1(3): , M. A. Bender, E. Demine, nd M. Frch-Colton. Cche-Olivious B-Trees. SIAM Journl on Computing, 35(2): , M. A. Bender, M. Frch-Colton, nd B. C. Kusznul. Cche-Olivious String B-Trees. In Proceedings of PODS, pges , G. S. Brodl nd R. Fgererg. Cche-olivious string dictionries. In Proc. 17th Annul ACM-SIAM Symposium on Discrete Algorithms, pges , Erik D. Demine, John Icono, nd Stefn Lngermn. Worst-cse optiml tree lyout in memory hierrchy, W. Etherton, Z. Ditti, nd G. Vrghese. Tree Bitmp : Hrdwre/Softwre IP Lookups with Incrementl Updtes. ACM SIGCOMM Computer Communictions Review, 34(2):97 122, A. Feldmnn nd S. Muthukrishnn. Trdeoffs for Pcket Clssifiction. In Proceedings of INFOCOM, pges , P. Ferrgin nd R. Grossi. The String B-Tree: A New Dt Structure for String Serch in Externl Memory nd Its Applictions. Journl of the ACM, 46(2): , M. Frigo, C. E. Leiserson, H. Prokop, nd S. Rmchndrn. Cche-Olivious Algorithms. In Proceedings of FOCS, pges , J. Hsn, S. Cdmi, V. Jkkul, nd S. Chkrdhr. Chisel: A Storge-Efficient, Collision-Free Hsh- Bsed Network Processing Architecture. In Proceedings of ISCA, pges , My H. Kpln, E. Mold, nd R. E. Trjn. Dynmic Rectngulr Intersection with Priorities. In Proceedings of STOC, pges , H. Lu nd S. Shni. A B-Tree Dynmic Router-Tle Design. IEEE Trnsctions on Computers, 54(7): , D. R. Morrison. Ptrici: Prcticl Algorithm to Retrieve Informtion Coded in Alphnumeric. Journl of the ACM, 15(4): , S. Shni nd K. Kim. O(log n) Dynmic Pcket Routing. In Proceedings of ISCC, pges , D. Sletor nd R. E. Trjn. Self-djusting inry serch trees. Journl of the ACM, 32: , D. D. Sletor nd R. E. Trjn. A Dt Structure for Dynmic Trees. JCSS, 26(3): , S. Suri, G. Vrghese, nd P. Wrkhede. Multiwy Rnge Trees: Sclle IP Lookup with Fst Updtes. In Proceedings of GLOBECOM, pges , R. E. Trjn. A Clss of Algorithms which Require Nonliner Time to Mintin Disjoint Sets. Journl of Computing System Science, 18: , J. S. Vitter. Externl Memory Algorithms nd Dt Structures: Deling with Mssive Dt. ACM Computing Surveys, 33(2): , 2001.

What are suffix trees?

What are suffix trees? Suffix Trees 1 Wht re suffix trees? Allow lgorithm designers to store very lrge mount of informtion out strings while still keeping within liner spce Allow users to serch for new strings in the originl

More information

Tries. Yufei Tao KAIST. April 9, Y. Tao, April 9, 2013 Tries

Tries. Yufei Tao KAIST. April 9, Y. Tao, April 9, 2013 Tries Tries Yufei To KAIST April 9, 2013 Y. To, April 9, 2013 Tries In this lecture, we will discuss the following exct mtching prolem on strings. Prolem Let S e set of strings, ech of which hs unique integer

More information

Presentation Martin Randers

Presentation Martin Randers Presenttion Mrtin Rnders Outline Introduction Algorithms Implementtion nd experiments Memory consumption Summry Introduction Introduction Evolution of species cn e modelled in trees Trees consist of nodes

More information

COMP 423 lecture 11 Jan. 28, 2008

COMP 423 lecture 11 Jan. 28, 2008 COMP 423 lecture 11 Jn. 28, 2008 Up to now, we hve looked t how some symols in n lphet occur more frequently thn others nd how we cn sve its y using code such tht the codewords for more frequently occuring

More information

2 Computing all Intersections of a Set of Segments Line Segment Intersection

2 Computing all Intersections of a Set of Segments Line Segment Intersection 15-451/651: Design & Anlysis of Algorithms Novemer 14, 2016 Lecture #21 Sweep-Line nd Segment Intersection lst chnged: Novemer 8, 2017 1 Preliminries The sweep-line prdigm is very powerful lgorithmic design

More information

Suffix trees, suffix arrays, BWT

Suffix trees, suffix arrays, BWT ALGORITHMES POUR LA BIO-INFORMATIQUE ET LA VISUALISATION COURS 3 Rluc Uricru Suffix trees, suffix rrys, BWT Bsed on: Suffix trees nd suffix rrys presenttion y Him Kpln Suffix trees course y Pco Gomez Liner-Time

More information

Information Retrieval and Organisation

Information Retrieval and Organisation Informtion Retrievl nd Orgnistion Suffix Trees dpted from http://www.mth.tu.c.il/~himk/seminr02/suffixtrees.ppt Dell Zhng Birkeck, University of London Trie A tree representing set of strings { } eef d

More information

Algorithm Design (5) Text Search

Algorithm Design (5) Text Search Algorithm Design (5) Text Serch Tkshi Chikym School of Engineering The University of Tokyo Text Serch Find sustring tht mtches the given key string in text dt of lrge mount Key string: chr x[m] Text Dt:

More information

CS201 Discussion 10 DRAWTREE + TRIES

CS201 Discussion 10 DRAWTREE + TRIES CS201 Discussion 10 DRAWTREE + TRIES DrwTree First instinct: recursion As very generic structure, we could tckle this problem s follows: drw(): Find the root drw(root) drw(root): Write the line for the

More information

Ma/CS 6b Class 1: Graph Recap

Ma/CS 6b Class 1: Graph Recap M/CS 6 Clss 1: Grph Recp By Adm Sheffer Course Detils Adm Sheffer. Office hour: Tuesdys 4pm. dmsh@cltech.edu TA: Victor Kstkin. Office hour: Tuesdys 7pm. 1:00 Mondy, Wednesdy, nd Fridy. http://www.mth.cltech.edu/~2014-15/2term/m006/

More information

The Greedy Method. The Greedy Method

The Greedy Method. The Greedy Method Lists nd Itertors /8/26 Presenttion for use with the textook, Algorithm Design nd Applictions, y M. T. Goodrich nd R. Tmssi, Wiley, 25 The Greedy Method The Greedy Method The greedy method is generl lgorithm

More information

Intermediate Information Structures

Intermediate Information Structures CPSC 335 Intermedite Informtion Structures LECTURE 13 Suffix Trees Jon Rokne Computer Science University of Clgry Cnd Modified from CMSC 423 - Todd Trengen UMD upd Preprocessing Strings We will look t

More information

MATH 25 CLASS 5 NOTES, SEP

MATH 25 CLASS 5 NOTES, SEP MATH 25 CLASS 5 NOTES, SEP 30 2011 Contents 1. A brief diversion: reltively prime numbers 1 2. Lest common multiples 3 3. Finding ll solutions to x + by = c 4 Quick links to definitions/theorems Euclid

More information

Orthogonal line segment intersection

Orthogonal line segment intersection Computtionl Geometry [csci 3250] Line segment intersection The prolem (wht) Computtionl Geometry [csci 3250] Orthogonl line segment intersection Applictions (why) Algorithms (how) A specil cse: Orthogonl

More information

Lecture 10 Evolutionary Computation: Evolution strategies and genetic programming

Lecture 10 Evolutionary Computation: Evolution strategies and genetic programming Lecture 10 Evolutionry Computtion: Evolution strtegies nd genetic progrmming Evolution strtegies Genetic progrmming Summry Negnevitsky, Person Eduction, 2011 1 Evolution Strtegies Another pproch to simulting

More information

Fig.25: the Role of LEX

Fig.25: the Role of LEX The Lnguge for Specifying Lexicl Anlyzer We shll now study how to uild lexicl nlyzer from specifiction of tokens in the form of list of regulr expressions The discussion centers round the design of n existing

More information

From Dependencies to Evaluation Strategies

From Dependencies to Evaluation Strategies From Dependencies to Evlution Strtegies Possile strtegies: 1 let the user define the evlution order 2 utomtic strtegy sed on the dependencies: use locl dependencies to determine which ttriutes to compute

More information

Before We Begin. Introduction to Spatial Domain Filtering. Introduction to Digital Image Processing. Overview (1): Administrative Details (1):

Before We Begin. Introduction to Spatial Domain Filtering. Introduction to Digital Image Processing. Overview (1): Administrative Details (1): Overview (): Before We Begin Administrtive detils Review some questions to consider Winter 2006 Imge Enhncement in the Sptil Domin: Bsics of Sptil Filtering, Smoothing Sptil Filters, Order Sttistics Filters

More information

CS481: Bioinformatics Algorithms

CS481: Bioinformatics Algorithms CS481: Bioinformtics Algorithms Cn Alkn EA509 clkn@cs.ilkent.edu.tr http://www.cs.ilkent.edu.tr/~clkn/teching/cs481/ EXACT STRING MATCHING Fingerprint ide Assume: We cn compute fingerprint f(p) of P in

More information

Ma/CS 6b Class 1: Graph Recap

Ma/CS 6b Class 1: Graph Recap M/CS 6 Clss 1: Grph Recp By Adm Sheffer Course Detils Instructor: Adm Sheffer. TA: Cosmin Pohot. 1pm Mondys, Wednesdys, nd Fridys. http://mth.cltech.edu/~2015-16/2term/m006/ Min ook: Introduction to Grph

More information

9 Graph Cutting Procedures

9 Graph Cutting Procedures 9 Grph Cutting Procedures Lst clss we begn looking t how to embed rbitrry metrics into distributions of trees, nd proved the following theorem due to Brtl (1996): Theorem 9.1 (Brtl (1996)) Given metric

More information

Compression Outline :Algorithms in the Real World. Lempel-Ziv Algorithms. LZ77: Sliding Window Lempel-Ziv

Compression Outline :Algorithms in the Real World. Lempel-Ziv Algorithms. LZ77: Sliding Window Lempel-Ziv Compression Outline 15-853:Algorithms in the Rel World Dt Compression III Introduction: Lossy vs. Lossless, Benchmrks, Informtion Theory: Entropy, etc. Proility Coding: Huffmn + Arithmetic Coding Applictions

More information

Distributed Systems Principles and Paradigms

Distributed Systems Principles and Paradigms Distriuted Systems Principles nd Prdigms Chpter 11 (version April 7, 2008) Mrten vn Steen Vrije Universiteit Amsterdm, Fculty of Science Dept. Mthemtics nd Computer Science Room R4.20. Tel: (020) 598 7784

More information

COMBINATORIAL PATTERN MATCHING

COMBINATORIAL PATTERN MATCHING COMBINATORIAL PATTERN MATCHING Genomic Repets Exmple of repets: ATGGTCTAGGTCCTAGTGGTC Motivtion to find them: Genomic rerrngements re often ssocited with repets Trce evolutionry secrets Mny tumors re chrcterized

More information

In the last lecture, we discussed how valid tokens may be specified by regular expressions.

In the last lecture, we discussed how valid tokens may be specified by regular expressions. LECTURE 5 Scnning SYNTAX ANALYSIS We know from our previous lectures tht the process of verifying the syntx of the progrm is performed in two stges: Scnning: Identifying nd verifying tokens in progrm.

More information

Alignment of Long Sequences. BMI/CS Spring 2012 Colin Dewey

Alignment of Long Sequences. BMI/CS Spring 2012 Colin Dewey Alignment of Long Sequences BMI/CS 776 www.biostt.wisc.edu/bmi776/ Spring 2012 Colin Dewey cdewey@biostt.wisc.edu Gols for Lecture the key concepts to understnd re the following how lrge-scle lignment

More information

CSCI 104. Rafael Ferreira da Silva. Slides adapted from: Mark Redekopp and David Kempe

CSCI 104. Rafael Ferreira da Silva. Slides adapted from: Mark Redekopp and David Kempe CSCI 0 fel Ferreir d Silv rfsilv@isi.edu Slides dpted from: Mrk edekopp nd Dvid Kempe LOG STUCTUED MEGE TEES Series Summtion eview Let n = + + + + k $ = #%& #. Wht is n? n = k+ - Wht is log () + log ()

More information

Solving Problems by Searching. CS 486/686: Introduction to Artificial Intelligence Winter 2016

Solving Problems by Searching. CS 486/686: Introduction to Artificial Intelligence Winter 2016 Solving Prolems y Serching CS 486/686: Introduction to Artificil Intelligence Winter 2016 1 Introduction Serch ws one of the first topics studied in AI - Newell nd Simon (1961) Generl Prolem Solver Centrl

More information

Systems I. Logic Design I. Topics Digital logic Logic gates Simple combinational logic circuits

Systems I. Logic Design I. Topics Digital logic Logic gates Simple combinational logic circuits Systems I Logic Design I Topics Digitl logic Logic gtes Simple comintionl logic circuits Simple C sttement.. C = + ; Wht pieces of hrdwre do you think you might need? Storge - for vlues,, C Computtion

More information

P(r)dr = probability of generating a random number in the interval dr near r. For this probability idea to make sense we must have

P(r)dr = probability of generating a random number in the interval dr near r. For this probability idea to make sense we must have Rndom Numers nd Monte Crlo Methods Rndom Numer Methods The integrtion methods discussed so fr ll re sed upon mking polynomil pproximtions to the integrnd. Another clss of numericl methods relies upon using

More information

Outline. Introduction Suffix Trees (ST) Building STs in linear time: Ukkonen s algorithm Applications of ST

Outline. Introduction Suffix Trees (ST) Building STs in linear time: Ukkonen s algorithm Applications of ST Suffi Trees Outline Introduction Suffi Trees (ST) Building STs in liner time: Ukkonen s lgorithm Applictions of ST 2 3 Introduction Sustrings String is ny sequence of chrcters. Sustring of string S is

More information

Unit #9 : Definite Integral Properties, Fundamental Theorem of Calculus

Unit #9 : Definite Integral Properties, Fundamental Theorem of Calculus Unit #9 : Definite Integrl Properties, Fundmentl Theorem of Clculus Gols: Identify properties of definite integrls Define odd nd even functions, nd reltionship to integrl vlues Introduce the Fundmentl

More information

Slides for Data Mining by I. H. Witten and E. Frank

Slides for Data Mining by I. H. Witten and E. Frank Slides for Dt Mining y I. H. Witten nd E. Frnk Simplicity first Simple lgorithms often work very well! There re mny kinds of simple structure, eg: One ttriute does ll the work All ttriutes contriute eqully

More information

2-3 search trees red-black BSTs B-trees

2-3 search trees red-black BSTs B-trees 2-3 serch trees red-lck BTs B-trees 3 2-3 tree llow 1 or 2 keys per node. 2-node: one key, two children. 3-node: two keys, three children. ymmetric order. Inorder trversl yields keys in scending order.

More information

If you are at the university, either physically or via the VPN, you can download the chapters of this book as PDFs.

If you are at the university, either physically or via the VPN, you can download the chapters of this book as PDFs. Lecture 5 Wlks, Trils, Pths nd Connectedness Reding: Some of the mteril in this lecture comes from Section 1.2 of Dieter Jungnickel (2008), Grphs, Networks nd Algorithms, 3rd edition, which is ville online

More information

Looking up objects in Pastry

Looking up objects in Pastry Review: Pstry routing tbles 0 1 2 3 4 7 8 9 b c d e f 0 1 2 3 4 7 8 9 b c d e f 0 1 2 3 4 7 8 9 b c d e f 0 2 3 4 7 8 9 b c d e f Row0 Row 1 Row 2 Row 3 Routing tble of node with ID i =1fc s - For ech

More information

Position Heaps: A Simple and Dynamic Text Indexing Data Structure

Position Heaps: A Simple and Dynamic Text Indexing Data Structure Position Heps: A Simple nd Dynmic Text Indexing Dt Structure Andrzej Ehrenfeucht, Ross M. McConnell, Niss Osheim, Sung-Whn Woo Dept. of Computer Science, 40 UCB, University of Colordo t Boulder, Boulder,

More information

A dual of the rectangle-segmentation problem for binary matrices

A dual of the rectangle-segmentation problem for binary matrices A dul of the rectngle-segmenttion prolem for inry mtrices Thoms Klinowski Astrct We consider the prolem to decompose inry mtrix into smll numer of inry mtrices whose -entries form rectngle. We show tht

More information

CS321 Languages and Compiler Design I. Winter 2012 Lecture 5

CS321 Languages and Compiler Design I. Winter 2012 Lecture 5 CS321 Lnguges nd Compiler Design I Winter 2012 Lecture 5 1 FINITE AUTOMATA A non-deterministic finite utomton (NFA) consists of: An input lphet Σ, e.g. Σ =,. A set of sttes S, e.g. S = {1, 3, 5, 7, 11,

More information

A Tautology Checker loosely related to Stålmarck s Algorithm by Martin Richards

A Tautology Checker loosely related to Stålmarck s Algorithm by Martin Richards A Tutology Checker loosely relted to Stålmrck s Algorithm y Mrtin Richrds mr@cl.cm.c.uk http://www.cl.cm.c.uk/users/mr/ University Computer Lortory New Museum Site Pemroke Street Cmridge, CB2 3QG Mrtin

More information

MTH 146 Conics Supplement

MTH 146 Conics Supplement 105- Review of Conics MTH 146 Conics Supplement In this section we review conics If ou ne more detils thn re present in the notes, r through section 105 of the ook Definition: A prol is the set of points

More information

Suffix Tries. Slides adapted from the course by Ben Langmead

Suffix Tries. Slides adapted from the course by Ben Langmead Suffix Tries Slides dpted from the course y Ben Lngmed en.lngmed@gmil.com Indexing with suffixes Until now, our indexes hve een sed on extrcting sustrings from T A very different pproch is to extrct suffixes

More information

Lecture 7: Integration Techniques

Lecture 7: Integration Techniques Lecture 7: Integrtion Techniques Antiderivtives nd Indefinite Integrls. In differentil clculus, we were interested in the derivtive of given rel-vlued function, whether it ws lgeric, eponentil or logrithmic.

More information

Network Interconnection: Bridging CS 571 Fall Kenneth L. Calvert All rights reserved

Network Interconnection: Bridging CS 571 Fall Kenneth L. Calvert All rights reserved Network Interconnection: Bridging CS 57 Fll 6 6 Kenneth L. Clvert All rights reserved The Prolem We know how to uild (rodcst) LANs Wnt to connect severl LANs together to overcome scling limits Recll: speed

More information

10.5 Graphing Quadratic Functions

10.5 Graphing Quadratic Functions 0.5 Grphing Qudrtic Functions Now tht we cn solve qudrtic equtions, we wnt to lern how to grph the function ssocited with the qudrtic eqution. We cll this the qudrtic function. Grphs of Qudrtic Functions

More information

A New Learning Algorithm for the MAXQ Hierarchical Reinforcement Learning Method

A New Learning Algorithm for the MAXQ Hierarchical Reinforcement Learning Method A New Lerning Algorithm for the MAXQ Hierrchicl Reinforcement Lerning Method Frzneh Mirzzdeh 1, Bbk Behsz 2, nd Hmid Beigy 1 1 Deprtment of Computer Engineering, Shrif University of Technology, Tehrn,

More information

Today. CS 188: Artificial Intelligence Fall Recap: Search. Example: Pancake Problem. Example: Pancake Problem. General Tree Search.

Today. CS 188: Artificial Intelligence Fall Recap: Search. Example: Pancake Problem. Example: Pancake Problem. General Tree Search. CS 88: Artificil Intelligence Fll 00 Lecture : A* Serch 9//00 A* Serch rph Serch Tody Heuristic Design Dn Klein UC Berkeley Multiple slides from Sturt Russell or Andrew Moore Recp: Serch Exmple: Pncke

More information

Applied Databases. Sebastian Maneth. Lecture 13 Online Pattern Matching on Strings. University of Edinburgh - February 29th, 2016

Applied Databases. Sebastian Maneth. Lecture 13 Online Pattern Matching on Strings. University of Edinburgh - February 29th, 2016 Applied Dtses Lecture 13 Online Pttern Mtching on Strings Sestin Mneth University of Edinurgh - Ferury 29th, 2016 2 Outline 1. Nive Method 2. Automton Method 3. Knuth-Morris-Prtt Algorithm 4. Boyer-Moore

More information

Solving Problems by Searching. CS 486/686: Introduction to Artificial Intelligence

Solving Problems by Searching. CS 486/686: Introduction to Artificial Intelligence Solving Prolems y Serching CS 486/686: Introduction to Artificil Intelligence 1 Introduction Serch ws one of the first topics studied in AI - Newell nd Simon (1961) Generl Prolem Solver Centrl component

More information

Union-Find Problem. Using Arrays And Chains. A Set As A Tree. Result Of A Find Operation

Union-Find Problem. Using Arrays And Chains. A Set As A Tree. Result Of A Find Operation Union-Find Problem Given set {,,, n} of n elements. Initilly ech element is in different set. ƒ {}, {},, {n} An intermixed sequence of union nd find opertions is performed. A union opertion combines two

More information

Regular Expression Matching with Multi-Strings and Intervals. Philip Bille Mikkel Thorup

Regular Expression Matching with Multi-Strings and Intervals. Philip Bille Mikkel Thorup Regulr Expression Mtching with Multi-Strings nd Intervls Philip Bille Mikkel Thorup Outline Definition Applictions Previous work Two new problems: Multi-strings nd chrcter clss intervls Algorithms Thompson

More information

CS143 Handout 07 Summer 2011 June 24 th, 2011 Written Set 1: Lexical Analysis

CS143 Handout 07 Summer 2011 June 24 th, 2011 Written Set 1: Lexical Analysis CS143 Hndout 07 Summer 2011 June 24 th, 2011 Written Set 1: Lexicl Anlysis In this first written ssignment, you'll get the chnce to ply round with the vrious constructions tht come up when doing lexicl

More information

Section 10.4 Hyperbolas

Section 10.4 Hyperbolas 66 Section 10.4 Hyperbols Objective : Definition of hyperbol & hyperbols centered t (0, 0). The third type of conic we will study is the hyperbol. It is defined in the sme mnner tht we defined the prbol

More information

An Algorithm for Enumerating All Maximal Tree Patterns Without Duplication Using Succinct Data Structure

An Algorithm for Enumerating All Maximal Tree Patterns Without Duplication Using Succinct Data Structure , Mrch 12-14, 2014, Hong Kong An Algorithm for Enumerting All Mximl Tree Ptterns Without Dupliction Using Succinct Dt Structure Yuko ITOKAWA, Tomoyuki UCHIDA nd Motoki SANO Astrct In order to extrct structured

More information

LECT-10, S-1 FP2P08, Javed I.

LECT-10, S-1 FP2P08, Javed I. A Course on Foundtions of Peer-to-Peer Systems & Applictions LECT-10, S-1 CS /799 Foundtion of Peer-to-Peer Applictions & Systems Kent Stte University Dept. of Computer Science www.cs.kent.edu/~jved/clss-p2p08

More information

ITEC2620 Introduction to Data Structures

ITEC2620 Introduction to Data Structures ITEC0 Introduction to Dt Structures Lecture 7 Queues, Priority Queues Queues I A queue is First-In, First-Out = FIFO uffer e.g. line-ups People enter from the ck of the line People re served (exit) from

More information

Definition of Regular Expression

Definition of Regular Expression Definition of Regulr Expression After the definition of the string nd lnguges, we re redy to descrie regulr expressions, the nottion we shll use to define the clss of lnguges known s regulr sets. Recll

More information

Lily Yen and Mogens Hansen

Lily Yen and Mogens Hansen SKOLID / SKOLID No. 8 Lily Yen nd Mogens Hnsen Skolid hs joined Mthemticl Myhem which is eing reformtted s stnd-lone mthemtics journl for high school students. Solutions to prolems tht ppered in the lst

More information

CSEP 573 Artificial Intelligence Winter 2016

CSEP 573 Artificial Intelligence Winter 2016 CSEP 573 Artificil Intelligence Winter 2016 Luke Zettlemoyer Problem Spces nd Serch slides from Dn Klein, Sturt Russell, Andrew Moore, Dn Weld, Pieter Abbeel, Ali Frhdi Outline Agents tht Pln Ahed Serch

More information

1.5 Extrema and the Mean Value Theorem

1.5 Extrema and the Mean Value Theorem .5 Extrem nd the Men Vlue Theorem.5. Mximum nd Minimum Vlues Definition.5. (Glol Mximum). Let f : D! R e function with domin D. Then f hs n glol mximum vlue t point c, iff(c) f(x) for ll x D. The vlue

More information

CS311H: Discrete Mathematics. Graph Theory IV. A Non-planar Graph. Regions of a Planar Graph. Euler s Formula. Instructor: Işıl Dillig

CS311H: Discrete Mathematics. Graph Theory IV. A Non-planar Graph. Regions of a Planar Graph. Euler s Formula. Instructor: Işıl Dillig CS311H: Discrete Mthemtics Grph Theory IV Instructor: Işıl Dillig Instructor: Işıl Dillig, CS311H: Discrete Mthemtics Grph Theory IV 1/25 A Non-plnr Grph Regions of Plnr Grph The plnr representtion of

More information

Stack Manipulation. Other Issues. How about larger constants? Frame Pointer. PowerPC. Alternative Architectures

Stack Manipulation. Other Issues. How about larger constants? Frame Pointer. PowerPC. Alternative Architectures Other Issues Stck Mnipultion support for procedures (Refer to section 3.6), stcks, frmes, recursion mnipulting strings nd pointers linkers, loders, memory lyout Interrupts, exceptions, system clls nd conventions

More information

Pointwise convergence need not behave well with respect to standard properties such as continuity.

Pointwise convergence need not behave well with respect to standard properties such as continuity. Chpter 3 Uniform Convergence Lecture 9 Sequences of functions re of gret importnce in mny res of pure nd pplied mthemtics, nd their properties cn often be studied in the context of metric spces, s in Exmples

More information

Determining Single Connectivity in Directed Graphs

Determining Single Connectivity in Directed Graphs Determining Single Connectivity in Directed Grphs Adm L. Buchsbum 1 Mrtin C. Crlisle 2 Reserch Report CS-TR-390-92 September 1992 Abstrct In this pper, we consider the problem of determining whether or

More information

Tree Structured Symmetrical Systems of Linear Equations and their Graphical Solution

Tree Structured Symmetrical Systems of Linear Equations and their Graphical Solution Proceedings of the World Congress on Engineering nd Computer Science 4 Vol I WCECS 4, -4 October, 4, Sn Frncisco, USA Tree Structured Symmetricl Systems of Liner Equtions nd their Grphicl Solution Jime

More information

Engineer To Engineer Note

Engineer To Engineer Note Engineer To Engineer Note EE-169 Technicl Notes on using Anlog Devices' DSP components nd development tools Contct our technicl support by phone: (800) ANALOG-D or e-mil: dsp.support@nlog.com Or visit

More information

Announcements. CS 188: Artificial Intelligence Fall Recap: Search. Today. Example: Pancake Problem. Example: Pancake Problem

Announcements. CS 188: Artificial Intelligence Fall Recap: Search. Today. Example: Pancake Problem. Example: Pancake Problem Announcements Project : erch It s live! Due 9/. trt erly nd sk questions. It s longer thn most! Need prtner? Come up fter clss or try Pizz ections: cn go to ny, ut hve priority in your own C 88: Artificil

More information

Preserving Constraints for Aggregation Relationship Type Update in XML Document

Preserving Constraints for Aggregation Relationship Type Update in XML Document Preserving Constrints for Aggregtion Reltionship Type Updte in XML Document Eric Prdede 1, J. Wenny Rhyu 1, nd Dvid Tnir 2 1 Deprtment of Computer Science nd Computer Engineering, L Trobe University, Bundoor

More information

File Manager Quick Reference Guide. June Prepared for the Mayo Clinic Enterprise Kahua Deployment

File Manager Quick Reference Guide. June Prepared for the Mayo Clinic Enterprise Kahua Deployment File Mnger Quick Reference Guide June 2018 Prepred for the Myo Clinic Enterprise Khu Deployment NVIGTION IN FILE MNGER To nvigte in File Mnger, users will mke use of the left pne to nvigte nd further pnes

More information

F. R. K. Chung y. University ofpennsylvania. Philadelphia, Pennsylvania R. L. Graham. AT&T Labs - Research. March 2,1997.

F. R. K. Chung y. University ofpennsylvania. Philadelphia, Pennsylvania R. L. Graham. AT&T Labs - Research. March 2,1997. Forced convex n-gons in the plne F. R. K. Chung y University ofpennsylvni Phildelphi, Pennsylvni 19104 R. L. Grhm AT&T Ls - Reserch Murry Hill, New Jersey 07974 Mrch 2,1997 Astrct In seminl pper from 1935,

More information

Efficient Algorithms For Optimizing Policy-Constrained Routing

Efficient Algorithms For Optimizing Policy-Constrained Routing Efficient Algorithms For Optimizing Policy-Constrined Routing Andrew R. Curtis curtis@cs.colostte.edu Ross M. McConnell rmm@cs.colostte.edu Dn Mssey mssey@cs.colostte.edu Astrct Routing policies ply n

More information

CS 268: IP Multicast Routing

CS 268: IP Multicast Routing Motivtion CS 268: IP Multicst Routing Ion Stoic April 5, 2004 Mny pplictions requires one-to-mny communiction - E.g., video/udio conferencing, news dissemintion, file updtes, etc. Using unicst to replicte

More information

4/29/18 FIBONACCI NUMBERS GOLDEN RATIO, RECURRENCES. Fibonacci function. Fibonacci (Leonardo Pisano) ? Statue in Pisa Italy

4/29/18 FIBONACCI NUMBERS GOLDEN RATIO, RECURRENCES. Fibonacci function. Fibonacci (Leonardo Pisano) ? Statue in Pisa Italy /9/8 Fioncci (Leonrdo Pisno) -? Sttue in Pis Itly FIBONACCI NUERS GOLDEN RATIO, RECURRENCES Lecture CS Spring 8 Fioncci function fi() fi() fi(n) fi(n-) + fi(n-) for n,,,,,, 8,,, In his ook in titled Lier

More information

An Efficient Divide and Conquer Algorithm for Exact Hazard Free Logic Minimization

An Efficient Divide and Conquer Algorithm for Exact Hazard Free Logic Minimization An Efficient Divide nd Conquer Algorithm for Exct Hzrd Free Logic Minimiztion J.W.J.M. Rutten, M.R.C.M. Berkelr, C.A.J. vn Eijk, M.A.J. Kolsteren Eindhoven University of Technology Informtion nd Communiction

More information

Lecture 10: Suffix Trees

Lecture 10: Suffix Trees Computtionl Genomics Prof. Ron Shmir, Prof. Him Wolfson, Dr. Irit Gt-Viks School of Computer Science, Tel Aviv University גנומיקה חישובית פרופ' רון שמיר, פרופ' חיים וולפסון, דר' עירית גת-ויקס ביה"ס למדעי

More information

Mobile IP route optimization method for a carrier-scale IP network

Mobile IP route optimization method for a carrier-scale IP network Moile IP route optimiztion method for crrier-scle IP network Tkeshi Ihr, Hiroyuki Ohnishi, nd Ysushi Tkgi NTT Network Service Systems Lortories 3-9-11 Midori-cho, Musshino-shi, Tokyo 180-8585, Jpn Phone:

More information

arxiv: v1 [math.co] 18 Sep 2015

arxiv: v1 [math.co] 18 Sep 2015 Improvements on the density o miml -plnr grphs rxiv:509.05548v [mth.co] 8 Sep 05 János Brát MTA-ELTE Geometric nd Algeric Comintorics Reserch Group rt@cs.elte.hu nd Géz Tóth Alréd Rényi Institute o Mthemtics,

More information

CSCI1950 Z Computa4onal Methods for Biology Lecture 2. Ben Raphael January 26, hhp://cs.brown.edu/courses/csci1950 z/ Outline

CSCI1950 Z Computa4onal Methods for Biology Lecture 2. Ben Raphael January 26, hhp://cs.brown.edu/courses/csci1950 z/ Outline CSCI1950 Z Comput4onl Methods for Biology Lecture 2 Ben Rphel Jnury 26, 2009 hhp://cs.brown.edu/courses/csci1950 z/ Outline Review of trees. Coun4ng fetures. Chrcter bsed phylogeny Mximum prsimony Mximum

More information

Agilent Mass Hunter Software

Agilent Mass Hunter Software Agilent Mss Hunter Softwre Quick Strt Guide Use this guide to get strted with the Mss Hunter softwre. Wht is Mss Hunter Softwre? Mss Hunter is n integrl prt of Agilent TOF softwre (version A.02.00). Mss

More information

UT1553B BCRT True Dual-port Memory Interface

UT1553B BCRT True Dual-port Memory Interface UTMC APPICATION NOTE UT553B BCRT True Dul-port Memory Interfce INTRODUCTION The UTMC UT553B BCRT is monolithic CMOS integrted circuit tht provides comprehensive MI-STD- 553B Bus Controller nd Remote Terminl

More information

Reducing a DFA to a Minimal DFA

Reducing a DFA to a Minimal DFA Lexicl Anlysis - Prt 4 Reducing DFA to Miniml DFA Input: DFA IN Assume DFA IN never gets stuck (dd ded stte if necessry) Output: DFA MIN An equivlent DFA with the minimum numer of sttes. Hrry H. Porter,

More information

Announcements. CS 188: Artificial Intelligence Fall Recap: Search. Today. General Tree Search. Uniform Cost. Lecture 3: A* Search 9/4/2007

Announcements. CS 188: Artificial Intelligence Fall Recap: Search. Today. General Tree Search. Uniform Cost. Lecture 3: A* Search 9/4/2007 CS 88: Artificil Intelligence Fll 2007 Lecture : A* Serch 9/4/2007 Dn Klein UC Berkeley Mny slides over the course dpted from either Sturt Russell or Andrew Moore Announcements Sections: New section 06:

More information

Graphs with at most two trees in a forest building process

Graphs with at most two trees in a forest building process Grphs with t most two trees in forest uilding process rxiv:802.0533v [mth.co] 4 Fe 208 Steve Butler Mis Hmnk Mrie Hrdt Astrct Given grph, we cn form spnning forest y first sorting the edges in some order,

More information

Today. Search Problems. Uninformed Search Methods. Depth-First Search Breadth-First Search Uniform-Cost Search

Today. Search Problems. Uninformed Search Methods. Depth-First Search Breadth-First Search Uniform-Cost Search Uninformed Serch [These slides were creted by Dn Klein nd Pieter Abbeel for CS188 Intro to AI t UC Berkeley. All CS188 mterils re vilble t http://i.berkeley.edu.] Tody Serch Problems Uninformed Serch Methods

More information

MA1008. Calculus and Linear Algebra for Engineers. Course Notes for Section B. Stephen Wills. Department of Mathematics. University College Cork

MA1008. Calculus and Linear Algebra for Engineers. Course Notes for Section B. Stephen Wills. Department of Mathematics. University College Cork MA1008 Clculus nd Liner Algebr for Engineers Course Notes for Section B Stephen Wills Deprtment of Mthemtics University College Cork s.wills@ucc.ie http://euclid.ucc.ie/pges/stff/wills/teching/m1008/ma1008.html

More information

Complete Coverage Path Planning of Mobile Robot Based on Dynamic Programming Algorithm Peng Zhou, Zhong-min Wang, Zhen-nan Li, Yang Li

Complete Coverage Path Planning of Mobile Robot Based on Dynamic Programming Algorithm Peng Zhou, Zhong-min Wang, Zhen-nan Li, Yang Li 2nd Interntionl Conference on Electronic & Mechnicl Engineering nd Informtion Technology (EMEIT-212) Complete Coverge Pth Plnning of Mobile Robot Bsed on Dynmic Progrmming Algorithm Peng Zhou, Zhong-min

More information

Engineer To Engineer Note

Engineer To Engineer Note Engineer To Engineer Note EE-186 Technicl Notes on using Anlog Devices' DSP components nd development tools Contct our technicl support by phone: (800) ANALOG-D or e-mil: dsp.support@nlog.com Or visit

More information

Section 3.1: Sequences and Series

Section 3.1: Sequences and Series Section.: Sequences d Series Sequences Let s strt out with the definition of sequence: sequence: ordered list of numbers, often with definite pttern Recll tht in set, order doesn t mtter so this is one

More information

Dynamic Programming. Andreas Klappenecker. [partially based on slides by Prof. Welch] Monday, September 24, 2012

Dynamic Programming. Andreas Klappenecker. [partially based on slides by Prof. Welch] Monday, September 24, 2012 Dynmic Progrmming Andres Klppenecker [prtilly bsed on slides by Prof. Welch] 1 Dynmic Progrmming Optiml substructure An optiml solution to the problem contins within it optiml solutions to subproblems.

More information

EECS150 - Digital Design Lecture 23 - High-level Design and Optimization 3, Parallelism and Pipelining

EECS150 - Digital Design Lecture 23 - High-level Design and Optimization 3, Parallelism and Pipelining EECS150 - Digitl Design Lecture 23 - High-level Design nd Optimiztion 3, Prllelism nd Pipelining Nov 12, 2002 John Wwrzynek Fll 2002 EECS150 - Lec23-HL3 Pge 1 Prllelism Prllelism is the ct of doing more

More information

VoIP for the Small Business

VoIP for the Small Business Reducing your telecommunictions costs Reserch firm IDC 1 hs estimted tht VoIP system cn reduce telephony-relted expenses by 30%. Voice over Internet Protocol (VoIP) hs become vible solution for even the

More information

Midterm 2 Sample solution

Midterm 2 Sample solution Nme: Instructions Midterm 2 Smple solution CMSC 430 Introduction to Compilers Fll 2012 November 28, 2012 This exm contins 9 pges, including this one. Mke sure you hve ll the pges. Write your nme on the

More information

VoIP for the Small Business

VoIP for the Small Business VoIP for the Smll Business Reducing your telecommunictions costs Reserch firm IDC 1 hs estimted tht VoIP system cn reduce telephony-relted expenses by 30%. Voice over Internet Protocol (VoIP) hs become

More information

ASTs, Regex, Parsing, and Pretty Printing

ASTs, Regex, Parsing, and Pretty Printing ASTs, Regex, Prsing, nd Pretty Printing CS 2112 Fll 2016 1 Algeric Expressions To strt, consider integer rithmetic. Suppose we hve the following 1. The lphet we will use is the digits {0, 1, 2, 3, 4, 5,

More information

Allocator Basics. Dynamic Memory Allocation in the Heap (malloc and free) Allocator Goals: malloc/free. Internal Fragmentation

Allocator Basics. Dynamic Memory Allocation in the Heap (malloc and free) Allocator Goals: malloc/free. Internal Fragmentation Alloctor Bsics Dynmic Memory Alloction in the Hep (mlloc nd free) Pges too corse-grined for llocting individul objects. Insted: flexible-sized, word-ligned blocks. Allocted block (4 words) Free block (3

More information

Scalable Distributed Data Structures: A Survey Λ

Scalable Distributed Data Structures: A Survey Λ Sclble Distributed Dt Structures: A Survey Λ ADRIANO DI PASQUALE University of L Aquil, Itly ENRICO NARDELLI University of L Aquil nd Istituto di Anlisi dei Sistemi ed Informtic, Itly Abstrct This pper

More information

Small Business Networking

Small Business Networking Why network is n essentil productivity tool for ny smll business Effective technology is essentil for smll businesses looking to increse the productivity of their people nd processes. Introducing technology

More information

AI Adjacent Fields. This slide deck courtesy of Dan Klein at UC Berkeley

AI Adjacent Fields. This slide deck courtesy of Dan Klein at UC Berkeley AI Adjcent Fields Philosophy: Logic, methods of resoning Mind s physicl system Foundtions of lerning, lnguge, rtionlity Mthemtics Forml representtion nd proof Algorithms, computtion, (un)decidility, (in)trctility

More information

Some necessary and sufficient conditions for two variable orthogonal designs in order 44

Some necessary and sufficient conditions for two variable orthogonal designs in order 44 University of Wollongong Reserch Online Fculty of Informtics - Ppers (Archive) Fculty of Engineering n Informtion Sciences 1998 Some necessry n sufficient conitions for two vrile orthogonl esigns in orer

More information

MIPS I/O and Interrupt

MIPS I/O and Interrupt MIPS I/O nd Interrupt Review Floting point instructions re crried out on seprte chip clled coprocessor 1 You hve to move dt to/from coprocessor 1 to do most common opertions such s printing, clling functions,

More information