ISG: Itemset based Subgraph Mining


by Lini Thomas, Satyanarayana R Valluri, Kamalakar Karlapalem
Report No: IIIT/TR/2009/179
Centre for Data Engineering
International Institute of Information Technology
Hyderabad, INDIA
December 2009

Lini T Thomas, Satyanarayana R Valluri, and Kamalakar Karlapalem
lini@research.iiit.ac.in, satya@iiit.ac.in, kamal@iiit.ac.in
International Institute of Information Technology, Hyderabad, INDIA

Abstract. The exponential number of possible subgraphs makes frequent subgraph mining a challenge. Sometimes a frequent subgraph mining problem has a simpler solution outside the area of frequent subgraph mining. In this paper, we propose a new algorithm, ISG, that finds the maximal frequent subgraphs of a graph database whose graphs have unique edge labels. The unique edge label property enables ISG to use a maximal itemset mining algorithm to find the maximal frequent subgraphs. The experimental results compare the running time of ISG with gSpan, SPIN and MARGIN, and ISG performs significantly better than the other algorithms.

1 Introduction

It is common to model complex data with the help of graphs consisting of nodes and edges that are often labeled to store additional information. The problem of maximal frequent subgraph mining can be complex or simple depending on the kind of input dataset. If the graph database is known to satisfy some constraint, such as a constraint on the size of the graphs, on the node or edge labels of the graphs, or on the degree of the nodes, it might be possible to solve the problem of subgraph mining without using any graph techniques by exploiting the properties of the constraint. Though the class of such graphs might be small, this problem is worth investigating since it could lead to an immense reduction in run time. In this paper, we find maximal frequent subgraphs for a database of graphs having unique edge labels using itemset mining techniques. Though itemset mining itself is exponential in the number of items, the complexity of graph mining algorithms rises steeply as against their itemset counterparts. In itemset mining algorithms, an itemset can be generated exactly once by using a lexicographical order on the items.
However, it is difficult to guarantee that a subgraph is generated only once during the process of frequent subgraph mining, because a subgraph can be grown in several different ways by adding nodes or edges in different orders. Therefore, avoiding multiple explorations of the same graph adds to the overhead of graph mining algorithms. Secondly, the frequency of an itemset can be determined based on tid-list intersection of certain subitemsets. On the other hand, even the presence of all subgraphs of a graph does not guarantee the presence of the graph itself. Hence, detecting the presence of a subgraph in a graph requires additional operations. Lastly, finding whether an itemset is an improper subitemset of another is trivial as compared to determining whether a graph is a subgraph of another, since the latter involves subgraph isomorphism, which is NP-hard. In the current literature on maximal frequent subgraph mining, standard Apriori techniques and their extensions were mostly used, which either needed to perform subgraph isomorphism or do subgraph extensions.

1.1 Our Approach

[Figure 1: flowchart. Step 1 converts the graph database D into an itemset database D'. Step 2 gets the maximal frequent itemsets M of D'. For each m in M, if converting m to a graph is ambiguous, step 3 applies pre-processing and yields a set of subsets of m. Step 4 converts itemsets to graphs, giving candidate maximal subgraphs. Step 5 prunes them, giving the maximal subgraphs.]
Fig. 1. Overview of ISG Algorithm

Figure 1 shows the overview of the ISG algorithm. The entire algorithm can be broadly divided into five steps. As the aim of this paper is to use maximal frequent itemset mining techniques to find maximal frequent subgraphs, step 1 converts each of the graphs in the database D into a transaction. This conversion step involves mapping parts of the graph to items and assigning a unique item id to each item. If a graph g is converted into a transaction t, which is a set of items, then we should be able to reconstruct g using t. In order to ensure this, the edges and converse edges (pairs of adjacent edge labels) of the graph are mapped to items. In Section FILL, we show that there is a need for mapping additional substructures of the graph into items in order to ensure unambiguous conversion of itemsets into graphs. These additional substructures are called secondary structures and are explained in detail in Section FILL. Thus, the constructed transaction database D' contains the item ids of the edges, converse edges and secondary structures.

Consider a database of three graphs as in Figure 2, where G1 is a triangle, G2 is a spike and G3 is a linear chain. Step 1 converts these graphs into the transaction database in Table 1. The method of conversion will be described in detail in section??.
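Step 2 of the pipeline hands these transactions to any maximal itemset miner. As a rough illustration of what such a miner returns, the following brute-force sketch enumerates every itemset once, in lexicographic order, and keeps the frequent itemsets that have no frequent proper superset. It is not the miner used in ISG. The three transactions are an assumed reading of the example database (ids 1 to 3 and 51 to 53 for the primary triplets, 100 for the common item id, and 150, 200 and 250 for the triangle, spike and linear chain), and the function names are hypothetical.

```python
from itertools import combinations

# Assumed transactions for the triangle, spike and linear chain of the example.
transactions = [
    frozenset({1, 2, 3, 51, 52, 53, 100, 150}),  # T1: triangle
    frozenset({1, 2, 3, 51, 52, 53, 100, 200}),  # T2: spike
    frozenset({1, 2, 3, 51, 52, 100, 250}),      # T3: linear chain
]


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def maximal_frequent_itemsets(transactions, min_support):
    """Brute force: frequent itemsets with no frequent proper superset."""
    items = sorted(set().union(*transactions))
    frequent = []
    for size in range(1, len(items) + 1):
        # combinations() over sorted items yields each itemset exactly once.
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            if support(candidate, transactions) >= min_support:
                frequent.append(candidate)
    return [f for f in frequent if not any(f < g for g in frequent)]


maximal_at_2 = maximal_frequent_itemsets(transactions, 2)
maximal_at_3 = maximal_frequent_itemsets(transactions, 3)
```

The exhaustive enumeration is exponential in the number of items and is only meant for toy inputs; a real maximal itemset miner would prune the search space.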
In step 2, the transaction database D' that is constructed in step 1 is given as input to any maximal itemset mining algorithm, and the set of maximal itemsets of D', denoted as M, is generated. For the transaction database in Table 1, the only maximal frequent itemset generated for support two is the set {1,2,3,51,52,53,100}. Each itemset m in M should be converted into a graph, which is done in steps 3 and 4 in Figure 1. If m can be unambiguously converted into a graph then step 3 is omitted and step 4 is invoked, which generates the graph corresponding to the itemset m. The information on the mapping of the edges, converse edges and secondary structures into items is used in the process of converting itemsets to graphs, which is explained in section??. On the other hand, if the conversion of m into a graph is ambiguous, additional processing as in step 3 is needed to resolve the ambiguity. In our example the maximal itemset m = {1,2,3,51,52,53,100} is ambiguous and is hence preprocessed and converted into m' = {{1,2,51}, {2,3,52}, {1,3,53}}. m' is then passed to step 4, which converts each subitemset into a graph as shown in Figure 3. The conversions of graphs to itemsets and of itemsets to graphs are involved and require explanation, which will be provided in section??.

The output of step 4 is the set of candidate maximal frequent subgraphs, which might require additional pruning in order to generate the maximal frequent subgraphs of the database D. Step 5 compares the graphs generated by the end of step 4 and prunes any graphs which are subgraphs of some other graphs. Since no graph in our example is a subgraph of another, the pruning step 5 does not eliminate any graphs and the three graphs are given as the output of the ISG algorithm.

1.2 Example

[Figure 2: three graphs, each with edges labeled e1, e2 and e3. G1: Triangle. G2: Spike. G3: Linear Chain.]
Fig. 2. Example Graph Database

[Figure 3: three two-edge graphs over the edge labels e1, e2 and e3.]
Fig. 3. Maximal Frequent Subgraphs of the Example Database for Support=2

2 Related Work

The exponential size of the set of frequent subgraphs has led to an interest in maximal frequent subgraphs, which form a much smaller set as compared to the set of frequent graphs. A typical approach to the frequent subgraph mining problem has been to find frequent subgraphs incrementally in an Apriori manner. The Apriori based approach has been further modified to suit maximal subgraph mining [4] with added pruning. The maximal frequent subgraph mining algorithm SPIN [4] first mines all the frequent tree patterns from a graph database and then constructs the maximal frequent subgraphs from trees. This approach offers asymptotic advantages compared to using subgraphs as building blocks, since tree normalization is a simpler problem than graph normalization. Another maximal frequent subgraph mining algorithm, MARGIN [8], finds the set of frequent subgraphs that have an infrequent supergraph of exactly one more edge. In a post-processing step it then finds all maximally frequent subgraphs by retaining only those subgraphs all of whose immediate supergraphs are infrequent.

[6] finds maximal frequent subgraphs in a database of graphs with unique node labels. The metabolic pathways are modelled as graphs where each enzyme is denoted by a unique node independent of the number of times it appears in the pathway. The set of node labels in a graph can be denoted as an itemset. Frequency computation can now be done efficiently, avoiding the expensive subgraph isomorphism. The paper follows a depth first approach similar to gSpan wherein each subitemset is extended if the set of labels on extension forms a connected subgraph of the original graph. Connectivity is maintained by only adding edges that are connected to the current subgraph, and redundancy is avoided by keeping track of already visited edges. [5] addresses the problem in a different manner, where again frequent itemset mining is used for mining maximal frequent subgraphs. They do not look at edge extensions that maintain connectivity but instead treat each graph as an itemset to apply maximal frequent itemset mining and remove all disconnected graphs.

3 Constraint Based Maximal Frequent Subgraph Mining

In this paper, we attempt to use maximal frequent itemset mining to speed up maximal frequent subgraph mining in the case of graphs that have unique edge labels. Unique edge labeled graphs refer to the class of graphs where each edge label occurs at most once in the graph.
Formally, we find maximal frequent subgraphs from a database of graphs D = {G1, G2, ..., Gn} where each graph Gi contains no two edges with the same label. Due to multiple occurrences of a node label, it is not sufficient to check the common node label to determine whether two edges are neighbors. In this section we elaborate on the steps given in Figure 1.

3.1 Conversion of Graphs to Itemsets

This subsection explains step 1 of the framework. Each edge is mapped to a unique triplet of the form (n_i, e, n_j) where e is the edge label of the edge incident on the nodes with labels n_i and n_j. Similarly, we define a converse edge triplet as a triplet of the form (e_i, n, e_j) where e_i and e_j are the edge labels of two edges incident on a node with node label n. Converse edge triplets are essentially a triplet representation of every pair of adjacent edges. Each edge triplet and converse edge triplet, being unique, is mapped to a unique item id. A graph in the graph database is converted into an itemset transaction where the items are the item ids of the edge triplets and the item ids of the converse edge triplets of the graph. The set of transactions so formed by the graphs in the database is termed the transactional database D'.

The conversion of a graph to an itemset should be such that, after finding the maximal frequent itemsets, they can be mapped back uniquely to graphs. However, considering only edge triplets and converse edge triplets does not guarantee that each maximal itemset can be uniquely converted back to a graph. Consider Figure 2. The edge and converse edge triplets that correspond to the triangle and the spike structures are {(a, e1, a), (a, e2, a), (a, e3, a), (e1, a, e2), (e1, a, e3), (e2, a, e3)}. The edge and converse edge triplets that correspond to the linear chain are {(a, e1, a), (a, e2, a), (a, e3, a), (e1, a, e2), (e2, a, e3)}.

Case 1: The sets of triplets corresponding to the triangle and the spike are the same. Hence, they cannot be uniquely converted back to a graph, causing ambiguity.

Case 2: The maximal itemset found for support value three would be the set {(a, e1, a), (a, e2, a), (a, e3, a), (e1, a, e2), (e2, a, e3)}. This set of triplets is the same as the set of triplets for the linear chain, as we saw above. However, we can observe that the linear chain is not frequent.

Thus, it can be concluded that an itemset database D' using only edge and converse edge triplets leads to ambiguity. It can be trivially seen that no three edge structures other than those illustrated in Figure 2 can cause ambiguity, as no other three edge structures with the same labels exist. Also, no structure of more than three edges can cause ambiguity. In order to handle the above two cases, we introduce three secondary structures called triangle, spike and linear chain. The edge triplets and converse edge triplets form the primary structures. Figure 2, which we used as an example database, also shows an example of each secondary structure.
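The ambiguity of the primary structures can be reproduced with a short sketch. The graph representation (a dictionary of node labels plus a list of labeled edges), the canonical ordering inside each triplet and the name `graph_to_triplets` are assumptions made for illustration, not the data structures of the ISG implementation.

```python
from itertools import combinations


def graph_to_triplets(node_labels, edges):
    """Return (edge_triplets, converse_triplets) for one graph.

    node_labels: dict mapping node id -> node label
    edges: list of (u, v, edge_label) with unique edge labels
    """
    edge_triplets = set()
    incident = {n: [] for n in node_labels}
    for u, v, e in edges:
        # Store the two node labels in sorted order so that the triplet
        # does not depend on the direction in which the edge was listed.
        a, b = sorted((node_labels[u], node_labels[v]))
        edge_triplets.add((a, e, b))
        incident[u].append(e)
        incident[v].append(e)
    converse_triplets = set()
    for n, labels in incident.items():
        # Every pair of edges meeting at node n gives one converse triplet.
        for e_i, e_j in combinations(sorted(labels), 2):
            converse_triplets.add((e_i, node_labels[n], e_j))
    return edge_triplets, converse_triplets


all_a = {0: "a", 1: "a", 2: "a", 3: "a"}
triangle = graph_to_triplets({0: "a", 1: "a", 2: "a"},
                             [(0, 1, "e1"), (1, 2, "e2"), (0, 2, "e3")])
spike = graph_to_triplets(all_a, [(0, 1, "e1"), (0, 2, "e2"), (0, 3, "e3")])
chain = graph_to_triplets(all_a, [(0, 1, "e1"), (1, 2, "e2"), (2, 3, "e3")])
```

Here `triangle` and `spike` compare equal (Case 1), and both triplet sets of `chain` are contained in those of the triangle (Case 2), which is why additional item ids for the secondary structures are needed.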
Note that the node labels of all the nodes in each secondary structure are the same. Each secondary structure is assigned a unique item id in addition to the unique ids of the edge and converse edge triplets. The concept of a common item id is also introduced, which is a representation of a set of triplets irrespective of the structure it denotes. That is, if a spike and a triangle have the same set of edge and converse edge triplets, as in our example, then both are assigned the same common item id though their triangle and spike ids are different. Two triangles with different edge labels or different node labels will have two different triangle ids. In the case of the linear chain we make an exception. A linear chain that contains the same three edge triplets as a triangle or spike is assigned the same common item id as that triangle or spike. Hence, the three structures in Figure 2 get the same common item id though their spike, triangle and linear chain ids are different. Since an edge label occurs at most once in a graph, each common item id present in an itemset can only correspond to exactly one of the three: triangle, spike or linear chain.

Consider the case when the sets of edge and converse edge triplets of the spike and the triangle are the same, as in our example. Neither the spike structure nor the triangle structure is frequent. However, the six edge and converse edge triplets 1,2,3,51,52,53 that represent each will show up as frequent. There should be a way of indicating that neither the triangle nor the spike should be constructed, as the three edge structures are infrequent but their two edge substructures are all frequent. The common item id is used for this purpose. If the maximal itemset contains the common item id but does not contain the item id of a corresponding secondary structure, then converting the itemset into graphs needs preprocessing in order to attach all the frequent two-edge substructures (if attachable) to the remaining possible constructed graph, forming more than one possible graph out of the itemset. An itemset containing a common item id without the presence of its secondary item id is called a conflicting maximal itemset and has to undergo the preprocessing stage in step 3 of Figure 1. Such an itemset, after being preprocessed, creates a disconnected graph where each component is formed by a frequent two-edge structure whose three edge combination (forming the conflicting common item id) is infrequent.

The transaction database in Table 1 shows the corresponding transaction mapping of each graph in our example Figure 2. The three edge triplets are given item ids 1, 2 and 3 while the converse edge triplets are given item ids 51, 52 and 53. The mapping indicating which triplet each item id represents is given in Table 2. The item id 150 represents the triangle, 200 the spike and 250 the linear chain. The item id 100 is the common item id, which occurs in all three transactions T1, T2 and T3 in Table??. The maximal frequent itemset {1,2,3,51,52,53,100} for support value 2 does not contain the item id of any secondary structure but contains the common item id 100.

[Table 1: rows T1, T2 and T3; the item ids of each transaction did not survive transcription.]
Table 1. Transactional Database for Figure 2

ID    Set of Triplets
1     (a, e1, a)
2     (a, e2, a)
3     (a, e3, a)
51    (e1, a, e2)
52    (e2, a, e3)
53    (e1, a, e3)
100   (a, e1, a), (a, e2, a), (a, e3, a), (e1, a, e2), (e1, a, e3), (e2, a, e3)
101   (a, e1, a), (a, e2, a), (a, e3, a), (e1, a, e2), (e2, a, e3)
150   (a, e1, a), (a, e2, a), (a, e3, a), (e1, a, e2), (e1, a, e3), (e2, a, e3)
200   (a, e1, a), (a, e2, a), (a, e3, a), (e1, a, e2), (e1, a, e3), (e2, a, e3)
250   (a, e1, a), (a, e2, a), (a, e3, a), (e1, a, e2), (e2, a, e3)
Table 2. Mapping of edges in Figure 2 to unique item ids

3.2 Conversion of Maximal Itemsets to Graphs

Once the maximal frequent itemsets are found in the previous phase, the corresponding candidate maximal frequent subgraphs are to be built. As mentioned earlier, the conflicting maximal itemsets need to undergo a preprocessing phase in order to resolve the conflict that none of the three edge structures represented by the common item id is frequent while the common item id itself is frequent. This indicates that all three edges are frequent and occur as neighbours of each other, but they cannot co-occur in a graph to form any three edge structure, as each possible three edge structure is itself infrequent.

Preprocessing Phase: The preprocessing phase breaks the conflicting maximal itemset to form non-conflicting subsets. It makes sure that the three edges that caused a conflict are not present together in any subitemset created. The three edges that correspond to each conflicting common item are marked as Invalid Edge Combinations, since all three cannot be present in a single graph. For example, as the common item with id 100 in the example database is conflicting, the edges {e1, e2, e3} will be added to Invalid Edge Combinations. Since 100 is the only conflicting common item id for the maximal itemset {1,2,3,51,52,53,100}, Invalid Edge Combinations contains only one entry. The preprocessing of a conflicting maximal itemset begins by forming an initial set of components and recursively extending these components until they cannot be extended. The pairs of edges that belong to each converse edge triplet of the maximal itemset constitute the initial set of components. Thus, for the example database and the maximal itemset {1,2,3,51,52,53,100}, 51, 52 and 53 form the three converse edge triplets and hence the initial set of components contains three edge pairs: {C1 = (e1, e2), C2 = (e2, e3), C3 = (e1, e3)}. These components are extended recursively. At each stage of the recursion, every pair of components is merged if the following two conditions hold: (i) they differ by a single element, (ii) on merging, they do not contain any edge combination that belongs to Invalid Edge Combinations.
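A minimal sketch of this merging procedure is given below, under the assumption that a component is simply a set of edge labels and that components which take part in a merge are replaced by the merged result. The names `preprocess`, `component_to_itemset`, `EDGE_ID` and `CONVERSE_ID` are hypothetical; the two id tables follow the mapping of Table 2.

```python
from itertools import combinations

# Item ids of the example, following Table 2.
EDGE_ID = {"e1": 1, "e2": 2, "e3": 3}
CONVERSE_ID = {
    frozenset({"e1", "e2"}): 51,
    frozenset({"e2", "e3"}): 52,
    frozenset({"e1", "e3"}): 53,
}


def preprocess(components, invalid_combinations):
    """Merge components round by round and return the final components."""
    current = {frozenset(c) for c in components}
    invalid = [frozenset(c) for c in invalid_combinations]
    final = set()
    while current:
        new_components, merged = set(), set()
        for c1, c2 in combinations(current, 2):
            # Condition (i): same size and exactly one differing element each.
            if len(c1) == len(c2) and len(c1 ^ c2) == 2:
                union = c1 | c2
                # Condition (ii): no invalid edge combination inside the union.
                if not any(bad <= union for bad in invalid):
                    new_components.add(union)
                    merged.update((c1, c2))
        final |= current - merged   # components that could not be extended
        current = new_components    # recurse on the merged components
    return final


def component_to_itemset(component):
    """Item ids of the edge and converse edge triplets inside a component."""
    items = {EDGE_ID[e] for e in component}
    items |= {i for pair, i in CONVERSE_ID.items() if pair <= component}
    return frozenset(items)


initial = [("e1", "e2"), ("e2", "e3"), ("e1", "e3")]
final_components = preprocess(initial, [("e1", "e2", "e3")])
subitemsets = {component_to_itemset(c) for c in final_components}
```

Each round produces components that are one edge larger than those of the previous round, so the loop stops after at most as many rounds as there are edge labels.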
For the example maximal itemset, it can be seen that no two components can be merged, as merging would produce a new component C = (e1, e2, e3) which belongs to Invalid Edge Combinations. The components produced by merging are collected into NewComponents. Any component that could not be merged with any other component cannot be extended further and hence becomes a final component that can be added to the answer set. For the example, the NewComponents set is empty while each edge pair in the components C1, C2 and C3 becomes a final component. On the other hand, if the set NewComponents is not empty, then the above technique is applied recursively till no component can be further extended. All the final components are returned as the output of the preprocessing phase.

The maximal frequent itemsets with conflicting common item ids thus undergo a preprocessing stage, after which we get a set of components where each component is a subitemset that will be subjected to the graph construction algorithm. Maximal frequent itemsets that are not conflicting (i.e., they do not contain a conflicting common item id) can be directly subjected to the graph construction phase. In the running example, the preprocessing phase returns three components, thus forming three subitemsets. Each itemset corresponds to the edge triplets and converse edge triplets that constitute the components. Thus, C1 forms the subitemset I1 = {1, 2, 51}, which corresponds to the triplet set {(a, e1, a), (a, e2, a), (e1, a, e2)}. Similarly I2 = {2, 3, 52} and I3 = {1, 3, 53} are formed.

Converting itemsets to graphs: We next describe how a maximal itemset can be converted into one or more unique graphs. If the maximal itemset is a conflicting itemset then the preprocessing phase first converts it into a set of subitemsets. Each such subitemset can be converted into a single unique graph. On the other hand, if the maximal itemset is non-conflicting then it might correspond to one or more connected graphs, which can be seen as the components of a disconnected graph that can be constructed from it. The basic idea of converting an itemset or subitemset into a graph is:
(i) start from any edge triplet that is not visited so far,
(ii) based on the converse edge triplets that are not visited so far, extend any edge triplet,
(iii) based on the remaining unvisited edge triplets, extend any converse edge triplet,
(iv) repeat steps (ii) and (iii) recursively until no more extension is possible,
(v) mark the graph that is formed by the end of step (iv) as one of the candidate maximal subgraphs, and
(vi) repeat step (i) if there are any more unvisited edge triplets.
Thus, for a subitemset or for an itemset that corresponds to a single graph, all the edge triplets will be visited in one iteration of the algorithm. Conversely, if an itemset corresponds to a disconnected graph, one component of the disconnected graph will be generated in each iteration of the algorithm. The key to the above procedure is two steps: the extension of an edge triplet using a converse edge triplet in step (ii) and the extension of a converse edge triplet using an edge triplet in step (iii).

Extension of an edge triplet: Let (n1, e, n2) be an edge triplet in the graph constructed so far, where n1 ≠ n2. If a converse edge triplet (e, n1, e1) is present in the itemset then (n1, e, n2) can be extended by adding the converse edge triplet on the n1 node. As yet, we only have the information that an edge with label e1 can be incident on the node with label n1.
The information about the opposite node of the edge with label e1 is not yet known. Extension of (n1, e, n2) using (e, n1, e1) will introduce an edge whose opposite node details are not known and which is hence called a dangling edge. The unknown opposite node is called a dangling node. The node label of the dangling node is assigned in step (iii), in the procedure called Extension of a converse edge triplet (described below). The ConverseExtension function is called on a pair, say (e, n1), for which it finds matching converse triplets of the form (e, n1, *) and does the necessary extension in the graph.

However, extending an edge triplet of the form (n, e, n), where the node labels of both nodes are the same, is complex. If a converse edge triplet (e, n, e1) is present then the edge triplet (n, e, n) can be extended on either side of the edge, where each extension will lead to a different graph being constructed. Due to space constraints, we do not exhaustively give the cases of reconstruction. We however provide a flavour of one case and its solution. The detailed method is provided in our technical report version?????????. Consider that the edge to be extended is of the form (n, e, n). Let the edge that we want to construct adjacent to the edge with edge label e be an edge with label e1, as mentioned above. Let the graph constructed so far be the one in Figure 4. The nodes with label n on which the edge with label e is incident are the nodes marked as A and B. Suppose that the converse edge triplets (e1, n, e_x) and (e1, n, e_y) are both present in the list of converse edge triplets. If the node labels of the nodes C and D are also n, it would imply that one of the triangles (A,B,C) or (A,B,D) exists. This is resolved by referring to the triangle ids, checking for the presence of either of the edge label sets (e, e1, e_x) or (e, e1, e_y).

[Figure 4: nodes D, A, B and C, with edge e_y between D and A, edge e between A and B, and edge e_x between B and C.]
Fig. 4. Cases in Edge Extension

Extension of a converse edge triplet: In this step, the dangling edges that are introduced in the previous step are converted into regular edge triplets by the EdgeExtension function. Returning to our example, say the edge (n_i, e, n_j) has been inserted in the graph. The ConverseExtension on the pair (e, n_i) returns e_i as an edge label incident on n_i. In order to construct the edge e_i incident on the node n_i, one needs to know the opposite node id as well. Edge extension, done by the function EdgeExtension, refers to finding whether the opposite node is a node already existing in the graph constructed so far or whether a new node needs to be created. It also determines the node label of such a node by looking at the edge triplet for the edge e_i. It is trivial to find the node label of the opposite node from the set of edge triplets in the maximal itemset. Consider that the edge triplet (label(n_i), e_i, l_k) is present. We hence know that the node label of the opposite node is l_k. However, due to multiple nodes with the same label, the function needs to determine whether e_i is to be incident on a node already present in the graph with label l_k or whether a new node is to be created and assigned the label l_k. The EdgeExtension function checks each existing node n with label l_k by picking any one edge incident on n, with say label e_k, to check whether a converse edge triplet (e_i, l_k, e_k) exists. If such a triplet exists, an edge is inserted between node n and node n_i.
The dangling edge incident on node n_i with edge label e_i is dropped. If an edge cannot be formed with any existing node then the dangling node is marked with label l_k. Every edge triplet or converse edge triplet, once visited, is marked as visited and is not used for further construction.

The basic framework is straightforward. However, consider a node n_p1 with node label l_k. Let the edge e_k incident on n_p1 have its opposite node as n_p2. If the node label of n_p2 is also l_k then there is ambiguity about whether the edge with label e_i is to be connected to n_p1 or n_p2. This is because if the converse triplet (e_i, l_k, e_k) exists then it could be due to (e_i, label(n_p1), e_k) or (e_i, label(n_p2), e_k). Since the graph so far is connected, either of the nodes n_p1 or n_p2 has to have degree greater than one. Let n_p1 have degree greater than one. This would mean that there exists another edge, with label say e_p, incident on n_p1 apart from the edge with label e_k. If the edge e_i is to be incident on n_p1 and not n_p2 then the converse edge triplet (e_p, label(n_p1), e_i) should also exist apart from the converse edge triplet (e_k, label(n_p1), e_i). If such a triplet does not exist then the edge e_i should be incident on the node n_p2.
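The tie-break between n_p1 and n_p2 can be written down as a small predicate. This is only a sketch of the rule as stated above, assuming that the caller already knows the labels of the other edges incident on n_p1; the function name `choose_endpoint` and its return values are hypothetical, and further ambiguous cases covered by the full method are not handled.

```python
def choose_endpoint(e_i, l_k, other_edges_on_p1, converse_triplets):
    """Decide whether edge e_i attaches to n_p1 or n_p2 (both labelled l_k).

    other_edges_on_p1: labels of the edges on n_p1 other than e_k
    converse_triplets: iterable of (edge_label, node_label, edge_label)
    """
    if not other_edges_on_p1:
        raise ValueError("n_p1 must have degree greater than one")
    conv = set(converse_triplets)

    def adjacent(x, y):
        # A converse triplet may list its two edge labels in either order.
        return (x, l_k, y) in conv or (y, l_k, x) in conv

    # e_i can only sit on n_p1 if it neighbours every other edge on n_p1.
    if all(adjacent(e_p, e_i) for e_p in other_edges_on_p1):
        return "p1"
    return "p2"


on_p1 = choose_endpoint("ei", "a", ["ep"], [("ei", "a", "ek"), ("ep", "a", "ei")])
on_p2 = choose_endpoint("ei", "a", ["ep"], [("ei", "a", "ek")])
```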

Also, a simple check to avoid self loops is applied during EdgeExtension. The detailed pseudocode is provided in the technical report??.

3.3 Pruning Phase

Using subgraph isomorphism is one way of pruning out graphs that are subgraphs of another. However, in ISG, as we also retain the triangle, spike and linear chain information of each maximal itemset, we can do a simple subset operation to eliminate subgraphs by removing any graph whose itemset form is a subset of the itemset form of another graph. Given two maximal itemsets M_i and M_j, both M_i and M_j might yield a connected or a disconnected graph. One component of the graph generated by M_i might be a subgraph of the graph generated by M_j or of one of the components of the graph generated by M_j. Hence not all subgraphs generated in the previous phase are maximal frequent subgraphs, and an additional simple pruning phase is needed. In Appendix 7, we give the proof that the maximal frequent subgraphs so generated are correct.

4 Results

We implemented the ISG algorithm and tested it on both synthetic and real-life datasets. We ran our experiments on a 1.8GHz Intel Pentium IV PC with 1 GB of RAM, running Fedora Core 4. The code is implemented in C++ using STL, the Graph Template Library [1] and the igraph library [2].

4.1 Results on Synthetic Datasets

Synthetic Data Generation: We generated the synthetic datasets using the graph generator software provided by [7]. The graph generator generates the datasets based on six parameters: D (the number of graphs), E and V (the number of distinct edge and vertex labels respectively), T (the average size of each graph), I (the average size of frequent graphs) and L (the number of frequent patterns as frequent graphs). In a post-processing step, the edge labels of each graph are modified such that the graph satisfies the unique edge label constraint.

We compared the time taken by the ISG algorithm with two other maximal frequent subgraph mining algorithms, SPIN [4] and MARGIN [8], and one frequent subgraph mining algorithm, gSpan [10].
We used our own implementations of SPIN and MARGIN, while we used the executable provided by the authors of gSpan. While SPIN and MARGIN generate the maximal subgraphs, gSpan outputs all the frequent subgraphs. In a post-processing step, we compare the frequent subgraphs generated by gSpan and find the maximal frequent subgraphs. The time of gSpan includes the time taken for the post-processing step too. Table 3 shows the results generated when the number of graphs is taken as 100, 500 and 1000. The other parameters used for the generation of the dataset are L=20, E=20, V=20, I=20 and T=22. We can see that the ISG algorithm performs orders of magnitude better than the rest of the approaches.

[Table 3: parameters L=20, E=20, V=20, I=20, T=22. Columns: Supp, then ISG, GS, Spin and MR for each of D = 100, D = 500 and D = 1000. The numeric entries did not survive transcription.]
Table 3. Results with varying D (Running time in seconds)

Table 4 shows the comparison of running time when D, L, E and V are kept constant at 500, 20, 20 and 20 respectively while I and T are varied between 10 and 25. It can be observed that the performance of ISG improves drastically for higher values of I and T.

[Table 4: parameters D=500, L=20, E=20, V=20. Columns: Supp, then ISG, GS, Spin and MR for each of (I=10, T=15), (I=15, T=20) and (I=20, T=25). The numeric entries did not survive transcription.]
Table 4. Results with varying I and T (Running time in seconds)

4.2 Results on Real-life Dataset

We further present our results on stock market data of 20 companies collected from the source [3]. We use the correlation function below [9] to calculate the correlation between any pair of companies, A and B:

$$C_A^B = \frac{\frac{1}{|T|}\sum_{i=1}^{|T|}\left(A_i B_i - \bar{A}\,\bar{B}\right)}{\sigma_A\,\sigma_B}$$

where |T| denotes the number of days in the period T, A_i and B_i denote the average price of the stocks on day i of the companies A and B respectively, and

$$\bar{A} = \frac{1}{|T|}\sum_{i=1}^{|T|} A_i, \quad \bar{B} = \frac{1}{|T|}\sum_{i=1}^{|T|} B_i, \quad \sigma_A = \sqrt{\frac{1}{|T|}\sum_{i=1}^{|T|} A_i^2 - \bar{A}^2}, \quad \sigma_B = \sqrt{\frac{1}{|T|}\sum_{i=1}^{|T|} B_i^2 - \bar{B}^2}.$$

We construct a graph database where each graph in the database corresponds to seven successive working days (referred to as a week). For every week, we find the correlation values between every pair of companies. The companies are clustered into a specific number of groups based on their average stock value during that week. Each group is assigned a unique label which corresponds to the node label in the graph. The correlation values between every pair of companies are also ordered, the top K% values are ranked, and the value of the rank is used as the edge label. Thus, the graph corresponding to each week will have unique edge labels. We collected graphs corresponding to the top 20 companies for 295 weeks. The top 20 companies are selected for each week based on their average share price during that week. Further, the companies are classified into high and low categories, which are used as node labels. The correlation values are calculated for every pair of companies and the edge labels correspond to the rank of the value. Table 5 shows the comparison of the ISG algorithm with gSpan for this dataset. For low support values, ISG performs better than gSpan. The time values for the other two algorithms, SPIN and MARGIN, could not be generated as they did not terminate even after running for a very long time interval because of the very low support value.

[Table 5: columns Supp, ISG and GSpan. The numeric entries did not survive transcription.]
Table 5. Results on Stocks Data

5 Conclusions

The problem of maximal frequent subgraph mining can be complex or simple depending on the constraints on the input dataset. Certain forms of maximal subgraph mining might not require graph mining techniques at all. Itemset mining has much lower complexity than frequent subgraph mining. It is hence important to exploit the properties of the graph, when possible, to use itemset mining. We attempt to find maximal frequent subgraphs of graphs with unique edge labels using itemset mining. Future work involves identifying other such constraints on graphs which can enable the use of itemset based techniques for subgraph mining problems.

References

1. The graph template library.
2. The igraph library.
3. Yahoo finance.
4. J. Huan, W. Wang, J. Prins, and J. Yang. Spin: mining maximal frequent subgraphs from graph databases. KDD.
5. A. Inokuchi, T. Washio, H. Motoda, K. Kumasawa, and N. Arai. Basket analysis for graph structured data. PAKDD.
6. M. Koyuturk, A. Grama, and W. Szpankowski. An efficient algorithm for detecting frequent subgraphs in biological networks. ISMB/ECCB (Supplement of Bioinformatics), 2004.
7. M. Kuramochi and G. Karypis. Frequent subgraph discovery. ICDM.
8. L. T. Thomas, S. R. Valluri, and K. Karlapalem. Margin: Maximal frequent subgraph mining. ICDM.
9. J. Wang, Z. Zeng, and L. Zhou. Clan: An algorithm for mining closed cliques from large dense graph databases. page 73. ICDE.
10. X. Yan and J. Han. gSpan: Graph-based substructure pattern mining. ICDM, 2002.

Appendix 6 Illustration of an Example

Example: Figure 5 illustrates a simple run of the algorithm. For ease of understanding, it lists the edge and converse edge triplets of a maximal frequent itemset and the stages in which the graph is extended. The three edge triplets are (a, e1, b), (a, e3, c) and (b, e2, c), and the three converse edge triplets are (e1, b, e2), (e1, a, e3) and (e2, c, e3). In this example we assume, for ease of understanding, that there is no common item id, spike id, triangle id or linear chain id.

Figure 5(1) shows the initial edge picked. The graph is created with two nodes, node 0 with label a and node 1 with label b, and an edge with label e1. The edge triplet (a, e1, b) is marked in the adjacent table as visited in step 1. In Figure 5(2), ConverseExtension looks for the extensions of the edge created in Figure 5(1). All converse edges of the form (e1, label(1), *) = (e1, b, *) are searched for among the converse edge triplets. The triplet (e1, b, e2) matches the search. Hence a new adjacent edge is created with edge label e2. While the node label of the incident node 1 is known, the label of the other node 2 is not known. We call node 2 a dangling node and the edge with label e2 a dangling edge. EdgeExtension now looks for an edge triplet of the form (label(1), e2, *) = (b, e2, *) in order to find the label of the dangling node. Label c, which matches the search, is updated in the graph as in Figure 5(3). Next, we look for converse edge triplet extensions of the form (e2, label(2), *) = (e2, c, *) and extend the graph to contain an edge with label e3, as seen in Figure 5(4). On looking for edge triplets of the form (label(2), e3, *) = (c, e3, *), the triplet (a, e3, c) qualifies. Since the node label a already exists in the graph, the edge with e3 as label could create extensions of two forms: one as in 5(5.1), where it connects to an existing node with label a, or another as in 5(5.2), where it creates a new node with label a.
For Figure 5(5.1) to be valid, the converse triplet (e1, a, e3) should be present, because the edges with labels e1 and e3 now become neighbours. For Figure 5(5.2) to be valid, the converse triplet (e1, a, e3) should not be present. Since the converse edge triplet (e1, a, e3) exists, the dangling edge on node 2 is dropped and hence 5(5.1) qualifies as the valid graph.

We next look for extensions among converse edge triplets of the form (e3, a, *). The triplet (e1, a, e3) is already introduced in the graph. Hence no match exists. We had initially started with converse edge triplet extensions of the form (e1, label(1), *). We next look for converse edge triplet extensions of the form (e1, label(0), *). The only match, (e1, a, e3), is already introduced in the graph. As we cannot extend further, we terminate by producing the graph in 5(5.1).

Since all edge triplets have been visited, we also note that the maximal subgraph generated is connected. Otherwise, we would need to restart the process with an unvisited edge triplet to construct the other components of the disconnected graph generated by the itemset. Note that, alternating between searching for converse edge triplets and edge triplets, we extend the graph until no more extension is possible or no unseen edge triplets exist.
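The outcome of the run above can be checked with a small Python sketch. It is not the alternating ConverseExtension/EdgeExtension procedure itself: it is a simplified stand-in that merges edge endpoints with a union-find structure, assuming unique edge labels, consistent triplets, and that the two endpoints of every edge carry different node labels. The function name `build_graph` and the tuple layout are illustrative assumptions, not names from the paper.

```python
def build_graph(edge_triplets, converse_triplets):
    """Rebuild a graph from triplets, assuming unique edge labels.

    edge_triplets:     (node_label, edge_label, node_label)
    converse_triplets: (edge_label, shared_node_label, edge_label)
    Returns (labels, edges): labels maps node id -> node label and
    edges is a list of (node_id, edge_label, node_id).
    """
    # Union-find over edge endpoints, keyed by (edge_label, node_label).
    parent = {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for u, e, v in edge_triplets:
        if u == v:
            # Assumption of this sketch: endpoint labels of an edge differ.
            raise ValueError("endpoint labels of an edge must differ")
        parent[(e, u)] = (e, u)
        parent[(e, v)] = (e, v)

    # A converse triplet (e1, n, e2) says e1 and e2 meet at a node labelled n.
    for e1, n, e2 in converse_triplets:
        parent[find((e2, n))] = find((e1, n))

    node_id, labels, edges = {}, {}, []
    for u, e, v in edge_triplets:
        ends = []
        for lab in (u, v):
            root = find((e, lab))
            if root not in node_id:
                node_id[root] = len(node_id)
                labels[node_id[root]] = lab
            ends.append(node_id[root])
        edges.append((ends[0], e, ends[1]))
    return labels, edges


EDGES = [("a", "e1", "b"), ("a", "e3", "c"), ("b", "e2", "c")]
CONVERSE = [("e1", "b", "e2"), ("e1", "a", "e3"), ("e2", "c", "e3")]

labels, edges = build_graph(EDGES, CONVERSE)
print(labels)  # three nodes: the triangle of Figure 5(5.1)
```

Dropping the converse triplet (e1, a, e3) from the input leaves the two endpoints labelled a unmerged, which gives the four-node chain of Figure 5(5.2).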

Edge / converse edge triplet    Visited in
(a, e1, b)                      step 1
(a, e3, c)                      step 5
(b, e2, c)                      step 3
(e1, b, e2)                     step 2
(e1, a, e3)
(e2, c, e3)                     step 4

[Figure 5: panels (1), (2), (3), (4), (5.1) and (5.2) show the graph at each stage of the extension.]

Fig. 5. Example

7 Proof of Correctness

We next show that the algorithm gives all maximal frequent subgraphs. That is, every subgraph reported is maximal and there is no maximal frequent subgraph that is unreported.

Claim. All maximal frequent subgraphs are reported.

Proof. Consider a maximal frequent subgraph m_i. The set of edge and converse edge triplets that form m_i is thus frequent and has to be a subset of some maximal frequent itemset MI_i reported in the itemset mining phase. Let the disconnected or connected graph corresponding to the itemset MI_i be GI_i. A subgraph of the connected GI_i, or a subgraph of some component of GI_i, has to correspond to m_i. Since m_i is maximal, m_i can only be an improper subgraph of GI_i or of some component of it, which means that the connected GI_i or the component of GI_i itself corresponds to m_i. After the accumulation of all the possible candidate maximal frequent subgraphs, the pruning phase prunes this set by removing all graphs that are subgraphs of another graph in this set. A maximal frequent subgraph by definition will not get pruned. Hence, the reported set contains all maximal frequent subgraphs.

Claim. There is no graph reported that is not a maximal frequent subgraph.

Proof. A graph that is not maximal frequent gets pruned out in the pruning phase, as there has to exist some graph in the set of maximal frequent subgraphs that is a supergraph of it.

From the above two claims it can be concluded that the set of graphs reported is the set of maximal frequent subgraphs of the original dataset D.
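The pruning phase used in both proofs can be illustrated with a short Python sketch. It assumes, purely for illustration, that each candidate graph is represented by the set of its edge and converse edge triplets and that strict set containment stands in for the subgraph test; the paper's actual subgraph check may differ. The name `prune_non_maximal` is hypothetical.

```python
def prune_non_maximal(candidates):
    """Keep only candidates that are not strictly contained in another one.

    candidates: iterable of collections of hashable triplets (one per graph).
    Assumption of this sketch: strict set containment models "is a subgraph of".
    """
    unique = {frozenset(c) for c in candidates}  # also removes duplicates
    return [g for g in unique if not any(g < other for other in unique)]


if __name__ == "__main__":
    triangle = {("a", "e1", "b"), ("b", "e2", "c"), ("a", "e3", "c")}
    single_edge = {("a", "e1", "b")}        # contained in the triangle
    other = {("d", "e4", "f")}              # unrelated component
    kept = prune_non_maximal([triangle, single_edge, other, triangle])
    print(len(kept))  # the single edge and the duplicate are dropped
```

The pairwise comparison is quadratic in the number of candidates, which is acceptable for a sketch but would need indexing for large candidate sets.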


More information

Address Register Assignment for Reducing Code Size

Address Register Assignment for Reducing Code Size Address Register Assignment for Reducing Code Size M. Kndemir 1, M.J. Irwin 1, G. Chen 1, nd J. Rmnujm 2 1 CSE Deprtment Pennsylvni Stte University University Prk, PA 16802 {kndemir,mji,guilchen}@cse.psu.edu

More information

UNIVERSITY OF EDINBURGH COLLEGE OF SCIENCE AND ENGINEERING SCHOOL OF INFORMATICS INFORMATICS 1 COMPUTATION & LOGIC INSTRUCTIONS TO CANDIDATES

UNIVERSITY OF EDINBURGH COLLEGE OF SCIENCE AND ENGINEERING SCHOOL OF INFORMATICS INFORMATICS 1 COMPUTATION & LOGIC INSTRUCTIONS TO CANDIDATES UNIVERSITY OF EDINBURGH COLLEGE OF SCIENCE AND ENGINEERING SCHOOL OF INFORMATICS INFORMATICS COMPUTATION & LOGIC Sturdy st April 7 : to : INSTRUCTIONS TO CANDIDATES This is tke-home exercise. It will not

More information

Graph Exploration: Taking the User into the Loop

Graph Exploration: Taking the User into the Loop Grph Explortion: Tking the User into the Loop Dvide Mottin, Anj Jentzsch, Emmnuel Müller Hsso Plttner Institute, Potsdm, Germny 2016/10/24 CIKM2016, Indinpolis, US Who we re Dvide Mottin grph mining, novel

More information

Answer Key Lesson 6: Workshop: Angles and Lines

Answer Key Lesson 6: Workshop: Angles and Lines nswer Key esson 6: tudent Guide ngles nd ines Questions 1 3 (G p. 406) 1. 120 ; 360 2. hey re the sme. 3. 360 Here re four different ptterns tht re used to mke quilts. Work with your group. se your Power

More information

Midterm I Solutions CS164, Spring 2006

Midterm I Solutions CS164, Spring 2006 Midterm I Solutions CS164, Spring 2006 Februry 23, 2006 Plese red ll instructions (including these) crefully. Write your nme, login, SID, nd circle the section time. There re 8 pges in this exm nd 4 questions,

More information

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών

ΕΠΛ323 - Θεωρία και Πρακτική Μεταγλωττιστών ΕΠΛ323 - Θωρία και Πρακτική Μταγλωττιστών Lecture 3 Lexicl Anlysis Elis Athnsopoulos elisthn@cs.ucy.c.cy Recognition of Tokens if expressions nd reltionl opertors if è if then è then else è else relop

More information

Definition of Regular Expression

Definition of Regular Expression Definition of Regulr Expression After the definition of the string nd lnguges, we re redy to descrie regulr expressions, the nottion we shll use to define the clss of lnguges known s regulr sets. Recll

More information

1 Quad-Edge Construction Operators

1 Quad-Edge Construction Operators CS48: Computer Grphics Hndout # Geometric Modeling Originl Hndout #5 Stnford University Tuesdy, 8 December 99 Originl Lecture #5: 9 November 99 Topics: Mnipultions with Qud-Edge Dt Structures Scribe: Mike

More information

Subtracting Fractions

Subtracting Fractions Lerning Enhncement Tem Model Answers: Adding nd Subtrcting Frctions Adding nd Subtrcting Frctions study guide. When the frctions both hve the sme denomintor (bottom) you cn do them using just simple dding

More information