Scalable Diversified Ranking on Large Graphs


IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. XXX, NO. XXX, 22

Rong-Hua Li and Jeffery Xu Yu

Abstract: Enhancing diversity in ranking on graphs has been identified as an important retrieval and mining task. Nevertheless, many existing diversified ranking algorithms either cannot scale to large graphs because of their time or memory requirements, or lack an intuitive and reasonable diversified ranking measure. In this paper, we propose a new diversified ranking measure on large graphs, which captures both relevance and diversity, and formulate the diversified ranking problem as a submodular set function maximization problem. Based on the submodularity of the proposed measure, we develop an efficient greedy algorithm with linear time and space complexity w.r.t. the size of the graph to achieve near-optimal diversified ranking. In addition, we present a generalized diversified ranking measure and give a near-optimal randomized greedy algorithm with linear time and space complexity for optimizing it. We evaluate the proposed methods through extensive experiments on five real datasets. The experimental results demonstrate the effectiveness and efficiency of the proposed algorithms.

Index Terms: Diversified Ranking, Graph Algorithms, Scalability, Flajolet-Martin Sketch, Submodular Function.

1 INTRODUCTION

Ranking nodes on graphs is a fundamental task in information retrieval, data mining, and social network analysis. It has a large number of applications, such as ranking web pages [], measuring centrality in social networks [2], and enhancing personalized services for web search [3]. Most existing graph-based ranking algorithms are based on the stationary distribution of a random walk on the graph, such as the PageRank algorithm [] and its variants [3][4]. The idea behind these random-walk-based ranking algorithms is that a node should be ranked higher if more high-ranking nodes link to it.
This basic idea has become a crucial criterion for designing ranking algorithms on graphs and has been successfully applied in many applications. However, as discussed in [5][6], this design criterion causes many of the nodes in the top-K ranking list to be similar, because it considers only the relevance of the nodes. This reduces ranking effectiveness when an application needs to incorporate diversity into the top-K ranking results.

Take Flickr, a well-known photo-sharing website, as an example. Users in Flickr can make friends and join many interest groups. Consider a retrieval task of finding the top-K relevant users who are similar to a given user but come from as many interest groups as possible in the Flickr social network. In general, we can use personalized PageRank algorithms [][3][4] to rank the users, and then find the top-K users based on their personalized PageRank scores. However, the top-K users found by personalized PageRank typically include many users from the same interest group, so they cannot meet our objective of diversity. To this end, we need to take the diversity of the top-K ranking list into account when designing ranking algorithms. In other words, the ranking algorithms in this case should produce diversified ranking results that cover as many groups as possible.

(Author affiliation: The Chinese University of Hong Kong, {rhli,yu}@se.cuhk.edu.hk)

Recently, improving diversity in top-K ranking results has attracted much attention, as it has a variety of applications in information retrieval and data mining. There exists a large body of work on search result diversification for both text and graph datasets. In this paper, we focus on enhancing diversity in ranking on graph datasets. We are interested in finding the top-K nodes that are not only relevant to the query but also dissimilar to one another. Here the relevance of the nodes is measured by their personalized PageRank scores.

In the literature, there are four frameworks for diversified ranking on graphs.
The first one is based on a greedy vertex selection procedure [5][7], the second on a so-called vertex-reinforced random walk [6], the third on optimizing predefined diversified measures [8][9], and the last on resistive graph centers []. In particular, the greedy vertex selection procedure chooses a vertex with the maximum random-walk-based ranking score at a time, and then removes the selected vertex from the graph. To get the top-K ranking list, this process repeats K times. To the best of our knowledge, there are two algorithms based on this framework: the Grasshopper algorithm [5] and the manifold rank with stop points algorithm [7]. Both algorithms have been shown empirically to improve diversity in ranking on graph data. However, the major drawback of this type of algorithm is its cubic time complexity, which prevents it from scaling to large graphs. Another drawback of this type of algorithm is that it lacks a theoretical explanation

of why it can improve diversity in ranking results. Some improvements on this point have been achieved in the second framework [6]. In [6], Mei et al. propose a diversified ranking algorithm, called DivRank, based on a vertex-reinforced random walk, and present an optimization explanation of why DivRank improves diversity in ranking. However, the explanation is only suitable for undirected graphs. In addition, the convergence property of DivRank is not clear, because it resorts to approximation strategies for the original vertex-reinforced random walk. Another drawback of DivRank is that it cannot scale to large graphs, for two reasons. On the one hand, DivRank dynamically updates the transition matrix at each iteration. This procedure may result in a full transition matrix, which cannot be stored in main memory if the graph is very large. On the other hand, the full transition matrix increases the computational cost of the matrix-vector multiplication.

Tong et al. [8] propose a scalable diversified ranking algorithm that optimizes a predefined diversified ranking measure. However, the motivation of their diversified ranking measure is not explicitly clarified. Specifically, for measuring diversity, their measure is based on a multiplication of the so-called Google matrix and the personalized PageRank vector, which lacks a clear topological explanation. Hence, it does not directly reflect the diversity of a set of nodes from a graph-structural perspective. The last notable diversified ranking algorithm is based on resistive graph centers []. As with the greedy vertex selection algorithms, the time complexity of this algorithm is cubic, so it cannot scale to large graphs.

To overcome the problems in the existing algorithms, in this paper, we present a novel diversified ranking method on graphs.
The basic idea of our approach is that we first calculate the personalized PageRank vector on the basis of the query node, and then perform a carefully designed vertex selection algorithm to find the top-K diversified ranking list according to a predefined diversified ranking measure. The key challenges in our method are (1) how to define an intuitive and reasonable diversified ranking measure that captures both relevance and diversity, and (2) how to develop an efficient vertex selection algorithm to optimize the diversified ranking measure.

To this end, firstly, we propose a modified definition of expansion on graphs to capture the diversity of the nodes. The key intuition is that if a set of nodes has a large expansion, then the nodes will be dissimilar to each other, thus leading to diversity. Secondly, based on this definition, we propose a novel diversified ranking measure that combines relevance and diversity. We show that the proposed measure is a nondecreasing submodular set function. Based on the submodularity of the proposed measure, we design an efficient greedy algorithm with linear time and space complexity w.r.t. the size of the graph to find the top-K diversified ranking list. Thirdly, we further present a generalized diversified ranking measure based on the definition of k-step expansion, and propose a randomized greedy algorithm with linear time and space complexity to optimize it accurately. Finally, we compare our proposed methods with six existing algorithms on five real networks. The experimental results demonstrate the effectiveness, efficiency, and scalability of the proposed algorithms. The preliminary study of this work is reported in [9].

The rest of this paper is organized as follows. We give a brief review of the personalized PageRank algorithm and present our new diversified ranking measure as well as our problem formulation in Section 2. We show the submodularity of the proposed measure and give a near-optimal greedy algorithm for finding the top-K diversified ranking in Section 3.
We present a generalized diversified ranking measure and a randomized greedy algorithm in Section 4. Extensive experiments are reported in Section 5, and related work is discussed in Section 6. We conclude this work in Section 7.

2 PRELIMINARIES

In this section, we first briefly review the personalized PageRank algorithm, which is used as the basic measure of relevance in diversified ranking on graphs. Then, we propose a new diversified ranking measure and formulate our diversified ranking problem as a discrete optimization problem.

2.1 Personalized PageRank algorithm

Personalized PageRank [][3][4] is a well-known approach for query-dependent ranking on graphs, and it has been successfully used in various applications in the past decades. We briefly describe the personalized PageRank algorithm below. Given a query vector $r$ (also called the teleport vector in the literature []) and a graph $G$, the personalized PageRank vector $w$ can be calculated by the following iterative equation:

$$w = (1 - \alpha)\, r + \alpha A^{T} w, \tag{1}$$

where $\alpha$ is a damping factor and $A$ is the adjacency matrix of graph $G$. The iteration in Eq. (1) converges to a fixed point, which corresponds to the stationary distribution of the Markov chain. The resulting vector $w$ is used to rank the nodes of the graph.

However, personalized PageRank does not consider the diversity of the ranking results. This is because personalized PageRank uses the stationary distribution of random walks to rank the nodes of the graph. A random walk on a graph forms a Markov chain. By the fundamental theorem of Markov chains [], the stationary distribution of the walks is inversely proportional to the hitting time. If a

node is hit very frequently by random walks, then the node will have a high personalized PageRank score. Also, if a node is hit frequently, all its neighbors are likely to be hit frequently, so its neighbors also get high personalized PageRank scores. This process places many adjacent nodes in the top-K ranking results. In other words, the top-K ranking list found by personalized PageRank may contain many similar nodes, which reduces ranking effectiveness in applications that need to incorporate diversity.

2.2 Problem formulation

In the literature, there are many ranking algorithms on graphs [5][6][7] that aim at improving diversity. However, as analyzed in the introduction, the existing diversified ranking algorithms either cannot scale to large graphs or lack an intuitive and reasonable diversified ranking measure. To this end, in this paper, we propose a new diversified ranking measure on graphs and design a scalable algorithm for optimizing it accurately. Below, we first give some important notations and definitions, and then formulate our diversified ranking problem.

Notations and definitions: Consider a graph $G = (V, E)$, with a set of nodes $V$ and a set of edges $E$, where the number of nodes is $n = |V|$.

Definition 2.1: Let $S$ be a set of nodes. The expanded set of $S$ is denoted by $N(S)$ such that $N(S) = S \cup \{v \in (V \setminus S) \mid \exists u \in S, (u, v) \in E\}$. The expansion of a set of nodes $S$ is the size of the expanded set $N(S)$, denoted as $|N(S)|$. The expansion ratio is defined as $\sigma = |N(S)| / n$.

It is worth mentioning that our definition of expansion is based on the topological structure of the graph, which can be either undirected or directed. In addition, it is important to note that our definition of expansion differs from the definition of expansion used for expander graphs [2], where the expansion of a graph equals the minimum expansion ratio among all the expanded sets. With Def. 2.1, a set of nodes with a large expansion ratio implies that the nodes are dissimilar to one another.
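Definition 2.1 can be made concrete with a few lines of Python. The following is only a minimal sketch: the dict-of-sets graph representation, the helper names, and the 10-node example graph are all assumptions made for illustration and do not come from the paper.

```python
def build_undirected(edges):
    """Adjacency sets for an undirected graph given as (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def expanded_set(adj, S):
    """N(S): the nodes of S plus every node one edge away from a node in S."""
    expanded = set(S)
    for u in S:
        expanded |= adj[u]
    return expanded


def expansion_ratio(adj, S):
    """sigma = |N(S)| / n, with n the number of nodes in the graph."""
    return len(expanded_set(adj, S)) / len(adj)


# Hypothetical 10-node graph: a triangle {0, 1, 2} attached to a chain of hubs.
EDGES = [(0, 1), (1, 2), (0, 2), (0, 3), (3, 4),
         (4, 5), (4, 6), (6, 7), (7, 8), (7, 9)]
adj = build_undirected(EDGES)

clustered = {0, 1, 2}  # mutually adjacent nodes share most of their neighbors
spread = {0, 4, 7}     # pairwise non-adjacent nodes reach different regions

print(expansion_ratio(adj, clustered))  # 0.4
print(expansion_ratio(adj, spread))     # 1.0
```

On this toy graph the tightly connected triple expands to only four nodes, while the three non-adjacent nodes expand to the whole graph, which is the behavior the definition is meant to reward.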
Here, the intuition is that two nodes are dissimilar if they do not share common neighbors in the graph. The larger the expansion ratio of a set of nodes, the better the diversity the set can achieve. Consider the graph in Fig. 1(a). Assume we select three nodes (red nodes) in Fig. 1(b) and Fig. 1(c), respectively. Then the expansion ratios of the selected nodes in Fig. 1(b) and Fig. 1(c) are 0.6 and 0.9, respectively. The selected nodes in Fig. 1(b) are well connected, so they can be similar to one another. On the other hand, there is no edge between any two selected nodes in Fig. 1(c), so they can be dissimilar to each other. As a result, the selected nodes in Fig. 1(c) are more diverse than the selected nodes in Fig. 1(b). This example indicates that nodes with a larger expansion ratio result in better diversity. Our diversified ranking measure is based on this key intuition.

Fig. 1. Illustration of our idea: expansion ratio vs. diversity. (a) A graph G; (b) σ = 0.6; (c) σ = 0.9. Red square nodes denote the selected nodes and green nodes are the expanded nodes (color online).

Diversified ranking measure: The most commonly used criterion for combining relevance and diversity is the so-called maximum marginal relevance (MMR) [3], which is a linear combination of relevance and diversity and is widely used in document retrieval systems. With MMR, a document with a high marginal relevance is relevant to the query and dissimilar to the previously selected documents. Similarly, in a graph, a node with a high diversified rank should (1) have a high personalized PageRank score, and (2) be dissimilar to the other selected nodes. Our definition of the expansion ratio can be regarded as a diversity measure. We therefore aim to find a subset $S$ of nodes such that (1) the nodes in $S$ have high personalized PageRank scores and (2) the expansion ratio $|N(S)| / n$ is maximized.
Formally, our goal is to maximize the following diversified ranking measure:

$$F(S) = (1 - \lambda) \sum_{u \in S} w_u + \lambda \frac{|N(S)|}{n}, \tag{2}$$

where $w_u$ denotes the personalized PageRank score of node $u$, and $\lambda \in [0, 1]$ is a parameter that trades off relevance and diversity. The first term in Eq. (2) is the sum of the personalized PageRank scores over the ranking results, which reflects the relevance of the ranking results. The second term is the expansion ratio of the ranking results. As discussed, a larger expansion ratio implies better diversity. Hence, Eq. (2) captures both relevance and diversity.

Note that $F(S)$ does not consider the ordering of the top-K ranking list. This is because our definition is based on the mild assumption that users of a real retrieval system generally focus on all the top-K results. This assumption is typically reasonable in many practical applications [5][6][7]. However, in Section 3.3, we will show that our proposed algorithm still yields an ordered result based on both the relevance and the diversity score of each node. To summarize, our problem of finding the top-K diversified ranking on a graph is formalized as follows:

$$\arg\max_{S \subseteq V} F(S) \quad \text{s.t.} \quad |S| = K. \tag{3}$$
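Eqs. (1) to (3) can be tied together in a short Python sketch. Several choices here are assumptions made only for illustration: the graph is a dict of neighbor sets, the adjacency matrix in Eq. (1) is taken to be normalized so that each node splits its score evenly among its neighbors, every node is assumed to have at least one neighbor, and Eq. (3) is solved by exhaustive search, which is only feasible on toy graphs.

```python
from itertools import combinations


def personalized_pagerank(adj, r, alpha=0.85, iters=200):
    """Iterate w = (1 - alpha) * r + alpha * P^T w (Eq. (1)).

    Assumption: P is the adjacency matrix with rows normalized to sum to 1,
    and every node has at least one neighbor.
    """
    w = dict(r)
    for _ in range(iters):
        nxt = {v: (1 - alpha) * r[v] for v in adj}
        for u, nbrs in adj.items():
            for v in nbrs:
                nxt[v] += alpha * w[u] / len(nbrs)
        w = nxt
    return w


def F(adj, w, S, lam):
    """Eq. (2): (1 - lam) * sum of scores in S + lam * |N(S)| / n."""
    expanded = set(S)
    for u in S:
        expanded |= adj[u]
    return (1 - lam) * sum(w[u] for u in S) + lam * len(expanded) / len(adj)


def best_subset(adj, w, K, lam):
    """Eq. (3) by brute force over all K-subsets (toy graphs only)."""
    return max(combinations(sorted(adj), K), key=lambda S: F(adj, w, S, lam))


# Hypothetical 10-node undirected graph.
ADJ = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {3, 5, 6},
       5: {4}, 6: {4, 7}, 7: {6, 8, 9}, 8: {7}, 9: {7}}

r = {v: 0.0 for v in ADJ}
r[0] = 1.0  # query node 0
w = personalized_pagerank(ADJ, r)

for lam in (0.0, 0.5, 1.0):
    S = best_subset(ADJ, w, 3, lam)
    print(lam, S, round(F(ADJ, w, S, lam), 4))
```

Setting `lam` to 0 reduces the objective to plain relevance, and setting it to 1 reduces it to the expansion ratio alone, which matches the role of λ described above.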

3 DIVERSIFIED RANKING ALGORITHM

As discussed, our diversified ranking problem is to maximize the proposed diversified ranking measure subject to a cardinality constraint (Eq. (3)). The following theorem shows that the problem formulated in Eq. (3) is NP-hard on general graphs.

Theorem 3.1: For a general graph $G = (V, E)$, the optimization problem in Eq. (3) is NP-hard.

Proof Sketch: We consider a special case of the problem defined in Eq. (3) and show that it is NP-hard. Let $\lambda = 1$; then the problem is to maximize $|N(S)|$ subject to $|S| = K$. This special problem is equivalent to the maximal expansion problem defined in [4], which is known to be NP-hard. As a consequence, the problem defined in Eq. (3) is also NP-hard.

Given the hardness of our problem, there is no hope of optimally solving the top-K diversified ranking problem on general graphs in polynomial time unless P=NP. Only on trees can the diversified ranking problem (Eq. (3)) be solved optimally in polynomial time, by a dynamic programming algorithm that we describe in the following subsection.

3.1 Diversified ranking on trees

Although the diversified ranking problem on general graphs is NP-hard, we show that it can be solved optimally in polynomial time when the graph is a tree. Our polynomial-time algorithm is based on dynamic programming. The basic idea is as follows. Consider a subtree whose root has $x$ children. The optimal way of finding $K$ nodes from the subtree for the diversified ranking list must follow one of two cases. In the first case, we include the root of the subtree in the ranking list and then recurse on the children with a budget of $K - 1$. In the second case, we do not add the root of the subtree, and instead recurse on the children with a budget of $K$. A naive implementation of this recursion needs to split the budget of $K - 1$ (or $K$) among the $x$ children in all possible ways, which is extremely expensive when $x$ is large. To overcome this, we construct a transformation that converts the general tree into a binary tree without altering the optimum.
The transformation is described as follows. We start from the root of tree $T$, denoted by $root(T)$. Assume $u$ is an internal node of $T$ with children $u_1, u_2, \ldots, u_x$ and $x > 2$. Then, we replace $u$ by a binary tree with depth at most $\log_2 x$ and leaves $u_1, u_2, \ldots, u_x$. In particular, let $u_1$ be the left child of $u$. Add a new node $z$ and let it be the right child of $u$. Then, let the remaining children of $u$ be the children of $z$. Repeat these steps until every node has at most two children. We set the personalized PageRank score of the newly added nodes so that they will never be added to the top-K ranking list, while the personalized PageRank scores of $u, u_1, u_2, \ldots, u_x$ stay the same as before. The depth of the new (binary) tree is at most a factor of $\log_2 d_{max}$ larger than the depth of the original tree, where $d_{max}$ denotes the maximum out-degree of a node in the original tree. Further, the size of the binary tree is at most twice the size of the original tree. More importantly, it is not hard to verify that the optimal solution of Eq. (3) on the binary tree is the same as the optimal solution on the original tree. Similar constructions have been used for various applications [5][6].

Based on this construction, we can assume the tree is binary, and denote it by $T$. For each node $u$ in $T$, we define a cost function w.r.t. the current solution $S$ as $C(u, S) = (1 - \lambda) w_u + \lambda |N(\{u\}) \setminus N(S)| / n$. Let $F(u, S, k)$ be the optimal solution in the subtree rooted at $u$ with budget $k$, where the set $S$ maintains the current solution, and let $l(u)$ ($r(u)$) denote the left (right) child of node $u$. Then, the recursive equation of the dynamic programming (DP) algorithm is given by

$$F(u, S, k) = \max\Big\{ \max_{0 \le i \le k} \{F(l(u), S, i) + F(r(u), S, k - i)\},\; C(u, S) + \max_{0 \le i \le k-1} \{F(l(u), S \cup \{u\}, i) + F(r(u), S \cup \{u\}, k - 1 - i)\} \Big\}.$$

The first term of the recursive equation corresponds to not selecting $u$ for $S$, and the second term corresponds to adding $u$ to $S$. We analyze the time complexity of the DP algorithm as follows (here we use a budget of $K$).
First, building the binary tree takes $O(n \log_2 d_{max})$ time. Second, we need to evaluate the recursion $O(K)$ times for each node in the binary tree, and each such evaluation takes $O(K)$ time. Notice that computing $C(u, S)$ can be done in constant time in a binary tree. There are $O(n \log_2 d_{max})$ nodes in the binary tree. Putting it all together, the time complexity of the DP algorithm is $O(K^2 n \log_2 d_{max})$.

3.2 Submodularity

Since the diversified ranking problem on general graphs is NP-hard, we resort to developing approximation algorithms for solving it efficiently. Below, we prove that our proposed diversified ranking measure $F(S)$ is a nondecreasing submodular set function, which allows us to develop a near-optimal greedy algorithm for maximizing it efficiently. We give the definition of a nondecreasing submodular set function [7] as follows.

Definition 3.1: Let $V$ be a finite set. A real-valued function $f(S)$ on the subsets $S \subseteq V$ is called a nondecreasing submodular set function if the following conditions hold.
Nondecreasing: For any subsets $S$ and $T$ of $V$ such that $S \subseteq T \subseteq V$, we have $f(S) \le f(T)$.
Submodularity: Let $\rho_j(S) = f(S \cup \{j\}) - f(S)$ be the marginal gain. Then, for any subsets $S$ and $T$ of $V$ such that $S \subseteq T \subseteq V$ and $j \in V \setminus T$, we have $\rho_j(S) \ge \rho_j(T)$.
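The two conditions of Definition 3.1 can be checked exhaustively on a very small ground set. The sketch below is an illustration only: the 6-node graph, the score values, and the function names are hypothetical, and the exponential enumeration is a sanity check rather than a proof. It applies the check to the measure of Eq. (2) and, for contrast, to a function that is not submodular.

```python
from itertools import combinations


def subsets(items):
    """Yield every subset of `items` as a frozenset."""
    items = sorted(items)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def is_nondecreasing_submodular(f, V, tol=1e-12):
    """Brute-force check of Definition 3.1 for all S ⊆ T ⊆ V and j ∈ V \\ T."""
    V = frozenset(V)
    for T in subsets(V):
        for S in subsets(T):
            if f(S) > f(T) + tol:          # violates the nondecreasing condition
                return False
            for j in V - T:
                gain_S = f(S | {j}) - f(S)
                gain_T = f(T | {j}) - f(T)
                if gain_S < gain_T - tol:  # violates diminishing returns
                    return False
    return True


# Hypothetical 6-node graph and nonnegative relevance scores.
ADJ = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
W = {0: 0.30, 1: 0.25, 2: 0.20, 3: 0.10, 4: 0.10, 5: 0.05}


def measure(S, lam=0.5):
    """The measure of Eq. (2) on the toy graph above."""
    expanded = set(S)
    for u in S:
        expanded |= ADJ[u]
    return (1 - lam) * sum(W[u] for u in S) + lam * len(expanded) / len(ADJ)


print(is_nondecreasing_submodular(measure, ADJ))                  # True
print(is_nondecreasing_submodular(lambda S: len(S) ** 2, ADJ))    # False
```

The second call fails because the marginal gain of `len(S) ** 2` grows with the size of the set, which is the opposite of diminishing returns.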

We prove that Eq. (2) is a nondecreasing submodular function with $F(\emptyset) = 0$, where $\emptyset$ is the empty set. We state the theorem as follows.

Theorem 3.2: The set function $F(S)$ defined in Eq. (2) is a nondecreasing submodular function with $F(\emptyset) = 0$.

Proof: For $S \subseteq T \subseteq V$ and $j \in V \setminus T$, let $\rho_j(S) = F(S \cup \{j\}) - F(S)$ and $\rho_j(T) = F(T \cup \{j\}) - F(T)$. Then, we have

$$\rho_j(T) = (1 - \lambda) w_j + \lambda \frac{|N(T \cup \{j\})| - |N(T)|}{n} = (1 - \lambda) w_j + \lambda \frac{|N(\{j\}) \setminus N(T)|}{n}. \tag{4}$$

Note that the nondecreasing property of $F(S)$ is guaranteed by $\rho_j(T) \ge 0$. Similarly, we have $\rho_j(S) = (1 - \lambda) w_j + \lambda |N(\{j\}) \setminus N(S)| / n$. By definition, we have $F(\emptyset) = 0$, and since $N(S) \subseteq N(T)$, we have $|N(\{j\}) \setminus N(S)| \ge |N(\{j\}) \setminus N(T)|$. Hence, we conclude $\rho_j(S) \ge \rho_j(T)$. This completes the proof.

3.3 The greedy algorithm

Because our diversified ranking measure is submodular, following the result in [7], we develop an efficient greedy algorithm with a $1 - 1/e$ approximation guarantee for our top-K diversified ranking problem. Alg. 1 outlines our greedy algorithm. The algorithm first computes the personalized PageRank vector as the initial ranking (line 1), which measures the relevance of the nodes. Then, in each iteration, the algorithm chooses a node $u$ with the maximum marginal gain $\rho_u(S) = (1 - \lambda) w_u + \lambda |N(\{u\}) \setminus N(S)| / n$ and adds it to the answer set $S$ (lines 7-15). To get the top-K ranking list, this procedure repeats $K$ times (lines 4-17). The algorithm produces an ordered ranking list according to $\rho_u(S)$. Since $\rho_u(S)$ satisfies the nondecreasing property, Alg. 1 outputs a reasonable ranking in which nodes with high ranking scores appear at the top of the list. Theoretically, the following theorem shows that Alg. 1 obtains a near-optimal solution.

Theorem 3.3: Alg. 1 is a $(1 - 1/e)$-approximation algorithm for the top-K diversified ranking problem (Eq. (3)).

Proof Sketch: This can be proved by an argument similar to the one used to prove the approximation factor of the greedy algorithm for the submodular set function maximization problem [7].
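The greedy selection described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the authors' implementation: the graph is a dict of neighbor sets, the score vector `w` is supplied by the caller, `K` is assumed not to exceed the number of nodes, ties are broken by iteration order, and the marginal gain counts the candidate node itself as part of its expanded set, as in Definition 2.1.

```python
def greedy_diversified_ranking(adj, w, K, lam):
    """Pick K nodes, each time taking the largest marginal gain
    (1 - lam) * w[v] + lam * |N({v}) \\ N(S)| / n."""
    n = len(adj)
    selected = []       # ordered answer list S
    covered = set()     # N(S): plays the role of an indicator array
    for _ in range(K):
        best, best_gain = None, float("-inf")
        for v in adj:
            if v in selected:
                continue
            newly_covered = len((adj[v] | {v}) - covered)
            gain = (1 - lam) * w[v] + lam * newly_covered / n
            if gain > best_gain:
                best, best_gain = v, gain
        selected.append(best)
        covered |= adj[best] | {best}
    return selected


# Hypothetical 10-node undirected graph and two hypothetical score vectors.
ADJ = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {3, 5, 6},
       5: {4}, 6: {4, 7}, 7: {6, 8, 9}, 8: {7}, 9: {7}}
UNIFORM = {v: 0.1 for v in ADJ}
DECREASING = {v: (10 - v) / 55 for v in ADJ}

print(greedy_diversified_ranking(ADJ, UNIFORM, 3, 1.0))     # [0, 7, 4]
print(greedy_diversified_ranking(ADJ, DECREASING, 3, 0.0))  # [0, 1, 2]
```

With `lam = 1.0` the procedure behaves like greedy coverage and spreads its picks across the graph, whereas with `lam = 0.0` it simply returns the K highest-scoring nodes.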
It is worth mentioning that the $1 - 1/e$ approximation factor is tight [8]. In other words, no other polynomial-time algorithm can achieve a better approximation factor unless P=NP. Below, we analyze the time and space complexity of Alg. 1.

Algorithm 1 The Greedy Algorithm
Input: Graph G = (V, E), K, damping factor α, adjacency matrix A, teleport vector r, and parameter λ
Output: A set S with K nodes
 1: Compute the personalized PageRank vector w;
 2: Initialize the answer set S ← ∅;
 3: For each node v_i, initialize an indicator array Expan[i] ← 0;
 4: for iter = 1 to K do
 5:   max ← −1;
 6:   maxIdx ← −1;
 7:   for each node v_i ∈ (V \ S) do
 8:     counter ← 0;
 9:     for each neighbor node v_j of v_i do
10:       if Expan[j] = 0 then
11:         counter ← counter + 1;
12:     if ((1 − λ) w_i + λ · counter/|V|) > max then
13:       max ← (1 − λ) w_i + λ · counter/|V|;
14:       maxIdx ← i;
15:   S ← S ∪ {v_maxIdx};
16:   for each neighbor node v_j of v_maxIdx do
17:     Expan[j] ← 1;
18: return S;

Complexity analysis of the greedy algorithm: The time complexity of Alg. 1 is $O(K|E|)$. Specifically, in line 1, Alg. 1 takes $O(|E|)$ time to compute the personalized PageRank vector. The time complexity from line 4 to line 17 is $O(K|E|)$. This is because the algorithm needs to visit all the nodes and their corresponding neighbors, and the total number of nodes visited by the algorithm equals $2K|E|$ in the worst case. Moreover, we can use the so-called CELF framework to accelerate Alg. 1, which results in a several-fold speedup [9]. For the space complexity, Alg. 1 needs to store the input graph $G$, the personalized PageRank vector $w$, the answer set $S$, and an indicator array, which take $O(|V| + |E|)$ space in total. Putting it all together, the algorithm has linear time and space complexity w.r.t. the graph size, and thus it can scale to large graphs.

3.4 Connection to the dominating set problem

The proposed top-K diversified ranking problem (Eq. (3)) is closely connected to the dominating set problem in graph theory [2]. The minimum dominating set problem aims to find the minimum number of nodes whose expanded set covers the whole graph.
In other words, the nodes in the minimum dominating set dominate the other nodes of the graph. The domination number (DN) of a graph is the cardinality of the minimum dominating set. It is well known that the minimum dominating set problem is NP-hard. There is an efficient greedy algorithm with a $1 + \ln(|V|)$ approximation factor to compute the DN and the dominating set of a graph [2]. Specifically, the greedy algorithm chooses a node with the maximal marginal gain ($\rho_u(S) = |N(S \cup \{u\})| - |N(S)|$) at a time, and it terminates when the expanded set of the selected nodes covers the whole graph. Note that the minimum dominating set problem only considers the expansion of the nodes

and ignores the relevance of the nodes, so it cannot be directly applied to our problem. Moreover, our top-K diversified ranking problem aims to find $K$ nodes that are relevant to the query and simultaneously dissimilar to one another; it does not aim to find the minimum number of nodes whose expanded set covers the whole graph. In the case that $K$ exceeds the domination number (DN) of the graph, Alg. 1 will choose nodes according to their personalized PageRank scores. However, in many real graphs, $K$ is significantly smaller than the DN of the graph. We will address this point in our experimental studies in Section 5.

4 GENERALIZED DIVERSIFIED RANKING

In this section, we first propose a generalized diversified ranking measure and design an efficient greedy algorithm for optimizing it accurately. Then, we discuss other potential variants of our diversified ranking measures.

4.1 Generalized diversified ranking measure

The proposed diversified ranking measure $F(S)$, built on Def. 2.1, only considers the immediate neighborhood information of $S$. Naturally, we can generalize the diversified ranking measure $F(S)$ by taking the k-step nearest neighbors into account. We call such a measure a generalized diversified ranking measure and denote it by $F_k(S)$. In the following, we first give the definitions of the k-step expanded set and the k-step expansion.

Definition 4.1: Let $S$ be a set of nodes. The k-step expanded set of $S$ is denoted by $N_k(S)$ such that $N_k(S) = S \cup \{v \in (V \setminus S) \mid \exists u \in S, d(u, v) \le k\}$, where $d(u, v)$ denotes the length of the shortest path from $u$ to $v$. The k-step expansion of $S$ is the cardinality of the k-step expanded set, denoted as $|N_k(S)|$. The k-step expansion ratio is defined as $\sigma_k = |N_k(S)| / n$.

Based on the k-step expansion, we define the generalized diversified ranking measure $F_k(S)$ as follows:

$$F_k(S) = (1 - \lambda) \sum_{u \in S} w_u + \lambda \frac{|N_k(S)|}{n}. \tag{5}$$

Obviously, $F(S)$ is a special case of $F_k(S)$ when $k = 1$. Like $F(S)$, $F_k(S)$ is also a nondecreasing submodular function. We give a theorem as follows.
The proof is similar to the proof of Theorem 3.2, so we omit it for brevity.

Theorem 4.1: The set function $F_k(S)$ defined in Eq. (5) is a nondecreasing submodular function with $F_k(\emptyset) = 0$, where $\emptyset$ denotes the empty set.

Likewise, the problem of maximizing the set function $F_k(S)$ subject to a cardinality constraint is NP-hard. However, based on the submodularity of $F_k(S)$, we can develop a greedy algorithm to optimize it accurately. (Here, we use the lowercase letter k to distinguish it from K, which denotes the cardinality of our top-K ranking results.) The problem is that the greedy algorithm needs to find a node with the maximum marginal gain $\rho_u(S) = (1 - \lambda) w_u + \lambda |N_k(\{u\}) \setminus N_k(S)| / n$ in each iteration. Unlike in Alg. 1, the marginal gain $\rho_u(S)$ cannot be calculated in linear time when $k > 1$. A naive implementation of maximizing $F_k(S)$ is as follows. First, we construct a new graph in which any two nodes $u$ and $v$ have an edge $(u, v)$ if $u$ can reach $v$ within $k$ ($k > 1$) hops in the original graph. Then, we perform Alg. 1 on the new graph. The construction of the new graph can be implemented by the Floyd algorithm [2], resulting in $O(|V|^3)$ time complexity. Performing Alg. 1 on the new graph takes $O(K|E'|)$ time, where $|E'|$ denotes the number of edges in the new graph. Hence, the time complexity of this naive algorithm is $O(|V|^3)$, which is clearly not scalable. In the following, we develop a randomized greedy algorithm with linear time complexity using the Flajolet-Martin (FM) sketch [22].

4.2 The randomized greedy algorithm

Recall that the major time-consuming step in optimizing the generalized diversified ranking measure (Eq. (5)) is evaluating the marginal gain $\rho_u(S) = (1 - \lambda) w_u + \lambda (|N_k(S \cup \{u\})| - |N_k(S)|) / n$. Inspired by the idea of the approximate neighborhood function [23], we propose a randomized greedy algorithm for the generalized diversified ranking problem using the FM sketch. The FM sketch is a probabilistic counting structure that can be used to estimate the number of distinct elements (cardinality) in a multi-set [22].
Assume the cardinality of a multi-set $A$ is $C$. Then the FM sketch uses only $\log C + t$ bits to estimate $C$ with high accuracy, where $t$ is a small constant. More specifically, the FM sketch is a bitmap of size $s = \log C + t$. There is a hash function $h : A \to \{0, \ldots, s - 1\}$, which maps an element $a$ in $A$ to a bit $i \in \{0, \ldots, s - 1\}$ of the bitmap with probability $\Pr(h(a) = i) = 1 / 2^{i+1}$. Initially, all bits in the bitmap are set to 0. Then, each element $a \in A$ is inserted into the bitmap by setting the $h(a)$-th bit to 1. Finally, an asymptotically unbiased estimate of the cardinality $C$ is obtained as $2^c / 0.77351$, where $c$ denotes the position of the least-significant zero bit in the bitmap. We can use multiple hash functions to boost the estimation accuracy. For the sake of brevity, we consider only one hash function to illustrate the algorithm.

In addition, an important property of the FM sketch is that it can easily be applied to estimate the cardinality of the union of two multi-sets if the two multi-sets come from the same domain. In particular, we can construct an FM sketch of the same size for each multi-set. To estimate the cardinality of the union of the two multi-sets, we only need to take the bitwise-or of the two FM sketches, and then estimate the cardinality from the resulting FM sketch.
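The FM sketch described above can be sketched in Python as follows. This is a simplified illustration with several assumptions that are not taken from the paper: bitmaps are Python integers, the geometric hash is derived from a BLAKE2b digest by counting trailing zero bits, the bitmap size defaults to 32 bits, and accuracy is boosted by averaging the zero-bit position over several seeded hash functions.

```python
import hashlib

PHI = 0.77351  # Flajolet-Martin correction constant


def bit_index(item, seed=0, s=32):
    """Map item to bit i with probability about 2^-(i+1), capped at s - 1."""
    digest = hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest()
    h = int.from_bytes(digest, "big")
    i = 0
    while i < s - 1 and not (h >> i) & 1:
        i += 1
    return i


def fm_sketch(items, seed=0, s=32):
    """Bitmap (as an int) with the h(a)-th bit set for every element a."""
    bitmap = 0
    for item in items:
        bitmap |= 1 << bit_index(item, seed, s)
    return bitmap


def lowest_zero(bitmap):
    """Position c of the least-significant zero bit."""
    c = 0
    while (bitmap >> c) & 1:
        c += 1
    return c


def estimate(bitmaps):
    """Cardinality estimate 2^c / PHI, with c averaged over the bitmaps."""
    mean_c = sum(lowest_zero(b) for b in bitmaps) / len(bitmaps)
    return 2 ** mean_c / PHI


# Usage: sketch two overlapping sets and estimate the size of their union.
A = range(0, 600)
B = range(400, 1000)
seeds = range(32)
sk_a = [fm_sketch(A, seed) for seed in seeds]
sk_b = [fm_sketch(B, seed) for seed in seeds]
sk_union = [a | b for a, b in zip(sk_a, sk_b)]  # bitwise-or implements set union

print(round(estimate(sk_union)))  # rough estimate of |A ∪ B| = 1000
```

Because each element always sets the same bit, duplicates do not change a sketch, and the bitwise-or of two sketches is exactly the sketch of the union, which is the property the text relies on.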

Algorithm 2 The Randomized Greedy Algorithm
Input: Graph G = (V, E), K, damping factor α, adjacency matrix A, teleport vector r, parameter k of the k-step expansion, parameter λ
Output: A set S with K nodes
 1: Compute the personalized PageRank vector w;
 2: Let h : {v_1, ..., v_n} → {0, ..., s − 1} be the hash function that maps the nodes to a position of the BITMAP, where s is the size of the BITMAP;
 3: for each node v_i ∈ V do
 4:   Initialize a BITMAP FM[i] ← 0;
 5:   Set the h(v_i)-th bit of FM[i] to 1;
 6:   Initialize a temporary BITMAP TFM[i] ← 0;
 7: for iter = 1 : k do
 8:   for each node v_i ∈ V do
 9:     TFM[i] ← FM[i];
10:   for each edge (v_i, v_j) ∈ E do
11:     FM[i] = (FM[i]) BITWISE-OR (TFM[j]);
12: Initialize the answer set S ← ∅;
13: Initialize two BITMAPs NBP, OBP ← 0;
14: c ← 0;
15: for iter = 1 to K do
16:   max ← −1;
17:   maxIdx ← −1;
18:   for each node v_i ∈ (V \ S) do
19:     OBP ← (NBP) BITWISE-OR (FM[i]);
20:     Let t be the position of the rightmost 0 bit in the BITMAP OBP;
21:     counter ← 2^t / 0.77351;
22:     counter ← counter − c;
23:     if (1 − λ) w_i + λ · counter/|V| > max then
24:       max ← (1 − λ) w_i + λ · counter/|V|;
25:       maxIdx ← i;
26:   S ← S ∪ {v_maxIdx};
27:   NBP ← (NBP) BITWISE-OR (FM[maxIdx]);
28:   Let t be the position of the rightmost 0 bit in the BITMAP NBP;
29:   c ← 2^t / 0.77351;
30: return S;

It is worth mentioning that there also exist many other probabilistic counting structures, such as the LogLog sketch [24] and the HyperLogLog sketch [25], but the union of these sketches cannot be easily implemented by bitwise-or. Therefore, in our problem, we apply the FM sketch to estimate the size of the k-step expanded set, i.e., $|N_k(S)|$. The main idea of our algorithm is that we construct an FM sketch to estimate the k-step expansion ($|N_k(\{v\})|$) of each node $v$. To estimate the k-step expansion of a set $S$ ($|N_k(S)|$), we only need to perform $|S|$ bitwise-or operations over the FM sketches of the nodes in $S$. We depict our algorithm in Alg. 2. Firstly, the algorithm calculates the personalized PageRank vector $w$ (line 1).
Secondly, the algorithm builds $|V|$ FM sketches, one for each node of the graph (lines 2-11). Here we make use of the idea of the approximate neighborhood function [23]. Specifically, the idea is based on the observation that the k-step expanded set of a node $v_i$ is equivalent to the union of all the (k-1)-step expanded sets of the immediate neighbors of $v_i$. More formally, we have

$$N_k(\{v_i\}) = \bigcup_{(v_i, v_j) \in E} N_{k-1}(\{v_j\}). \tag{6}$$

Based on this observation, we build an FM sketch for each node $v_i$ in a recursive manner (lines 7-11). Note that we use the bitwise-or over the FM sketches to implement the set union operation in Eq. (6) (line 11). Finally, Alg. 2 greedily selects $K$ nodes according to their approximate marginal gain (lines 12-30). In particular, we let $S$ be the answer set, NBP be the FM sketch representing the expanded set of the answer set $S$ ($N_k(S)$), $c$ be the k-step expansion of $S$ ($|N_k(S)|$), and OBP be a temporary FM sketch representing the expanded set of $S \cup \{v_i\}$, i.e., $N_k(S \cup \{v_i\})$. Initially, Alg. 2 sets $S$ to the empty set (line 12), NBP and OBP to 0 (line 13), and $c = 0$ (line 14). Then, Alg. 2 iteratively selects $K$ nodes with the maximal approximate marginal gain (lines 15-29). At each iteration, the algorithm chooses one node from $V \setminus S$ (lines 18-25). More specifically, for each node $v_i \in (V \setminus S)$, Alg. 2 first estimates $|N_k(S \cup \{v_i\})|$ using the FM sketch OBP (lines 19-21). Then, Alg. 2 calculates the approximate marginal gain of node $v_i$, $\rho_i(S) = (1 - \lambda) w_i + \lambda (|N_k(S \cup \{v_i\})| - |N_k(S)|) / n$, and records the node with the maximal approximate marginal gain (lines 22-25). Finally, Alg. 2 adds the node with the maximal approximate marginal gain to the answer set (lines 26-27) and re-estimates $|N_k(S)|$ using the FM sketch NBP (lines 28-29).

Theoretically, Alg. 2 achieves a $1 - 1/e - \epsilon$ approximation guarantee with high probability for the generalized diversified ranking problem, because the FM sketch approximates the k-step expansion of set $S$ within an $\epsilon$ error bound with high probability [22]. In our experiments, we will show that the performance of Alg. 2 is desirable.
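The overall flow of the randomized greedy procedure can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the graph is a dict of neighbor sets, the score vector `w` is supplied by the caller, each node keeps `num_hashes` integer bitmaps whose zero-bit positions are averaged, the geometric hash is built from a BLAKE2b digest, and `K` is assumed not to exceed the number of nodes.

```python
import hashlib

PHI = 0.77351  # Flajolet-Martin correction constant


def bit_index(item, seed, s=32):
    """Map item to bit i with probability about 2^-(i+1), capped at s - 1."""
    digest = hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest()
    h = int.from_bytes(digest, "big")
    i = 0
    while i < s - 1 and not (h >> i) & 1:
        i += 1
    return i


def estimate(bitmaps):
    """2^c / PHI, with the lowest-zero-bit position c averaged over bitmaps."""
    total = 0
    for b in bitmaps:
        c = 0
        while (b >> c) & 1:
            c += 1
        total += c
    return 2 ** (total / len(bitmaps)) / PHI


def k_step_sketches(adj, k, num_hashes=16):
    """Sketch N_k({v}) for every node by k rounds of neighbor bitwise-or."""
    fm = {v: [1 << bit_index(v, h) for h in range(num_hashes)] for v in adj}
    for _ in range(k):
        prev = {v: list(bits) for v, bits in fm.items()}  # snapshot of last round
        for v in adj:
            for u in adj[v]:
                fm[v] = [a | b for a, b in zip(fm[v], prev[u])]
    return fm


def randomized_greedy(adj, w, K, lam, k=2, num_hashes=16):
    """Greedy selection using estimated k-step expansion gains."""
    n = len(adj)
    fm = k_step_sketches(adj, k, num_hashes)
    selected, chosen = [], set()
    nbp = [0] * num_hashes  # sketch of N_k(S)
    c = 0.0                 # current estimate of |N_k(S)|
    for _ in range(K):
        best, best_gain = None, float("-inf")
        for v in adj:
            if v in chosen:
                continue
            obp = [a | b for a, b in zip(nbp, fm[v])]  # sketch of N_k(S ∪ {v})
            gain = (1 - lam) * w[v] + lam * (estimate(obp) - c) / n
            if gain > best_gain:
                best, best_gain = v, gain
        selected.append(best)
        chosen.add(best)
        nbp = [a | b for a, b in zip(nbp, fm[best])]
        c = estimate(nbp)
    return selected


# Hypothetical 10-node undirected graph and hypothetical relevance scores.
ADJ = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {3, 5, 6},
       5: {4}, 6: {4, 7}, 7: {6, 8, 9}, 8: {7}, 9: {7}}
W = {v: (10 - v) / 55 for v in ADJ}

print(randomized_greedy(ADJ, W, 3, 0.7, k=2))
```

On a graph this small the sketches are far too coarse to be informative, so the output mainly demonstrates the control flow; the estimates only become meaningful when the expanded sets contain many nodes.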
In the following, we analyze the time and space complexity of Alg. 2. Complexity analysis of the randomized greedy algorithm: The time complexity of Alg. 2 is $O(k|E| + K|V|)$. Specifically, in line 1, Alg. 2 computes the personalized PageRank vector, which takes $O(|E|)$ time. In lines 2-11, Alg. 2 takes $O(k(|E| + |V|))$ time to sketch the k-step expanded set for all nodes. In lines 12-29, the algorithm takes $O(K|V|)$ time to find the answer set. Note that the bitwise-or can be done in near-constant time [23]. Thus, the time complexity of Alg. 2 is $O(k|E| + K|V|)$. For the space complexity, like Alg. 1, Alg. 2 needs to store the graph G and the personalized PageRank vector w, which consumes $O(|V| + |E|)$ space. In addition, Alg. 2 needs to maintain $O(|V|)$ FM sketches, which take $O(|V| \log |V|)$ bits. As a result, the space complexity of Alg. 2 is $O(|V| \log |V| + |E|)$. Notice that the space complexity of Alg. 2 is approximately $O(|E|)$, as $O(|V| \log |V|)$ is dominated by $O(|E|)$ in most graphs. Putting it all together, we conclude that the time and space complexity of Alg. 2 is linear w.r.t. the graph size, so it scales to large graphs.

4.3 Minimum relevance diversified measures

Besides MMR, there also exist other diversification criteria [26][27]. Here, we discuss some potential variants of the proposed diversified measures based on the minimum relevance criterion [26], where the worst-case relevance is maximized. The minimum relevance diversified measures are given as follows:

$$J(S) = (1-\lambda) \min_{u \in S} w_u + \lambda \frac{|N(S)|}{n}, \tag{7}$$

and

$$J_k(S) = (1-\lambda) \min_{u \in S} w_u + \lambda \frac{|N_k(S)|}{n}. \tag{8}$$

Unlike $F(S)$ and $F_k(S)$, the minimum relevance diversified measures defined above are not submodular. Thus, we cannot easily design an efficient greedy algorithm with an approximation guarantee. In effect, it is easy to show that the first term of the set function $J(S)$ or $J_k(S)$ is supermodular (footnote 2) [28] and the second term is submodular. Thus, the set function $J(S)$ or $J_k(S)$ is a sum of a submodular and a supermodular function, which could be approximately solved by a supermodular-submodular procedure [28]. Unfortunately, neither the convergence properties nor the approximation factor of the supermodular-submodular procedure is currently known. Developing efficient algorithms with performance guarantees to maximize $J(S)$ and $J_k(S)$ is an interesting direction for future work.

5 EXPERIMENTS

In this section, we evaluate the effectiveness and efficiency of the proposed approaches. Below, we first describe the experimental setup and then report our experimental results.

5.1 Experimental setup

Datasets: We conduct our experiments on five real networks: three collaboration networks, one citation network, and one social network. Collaboration networks. We select three collaboration networks from the Stanford network datasets [29], namely GrQc, HepTh, and CondMat.
GrQc, HepTh, and CondMat are collaboration networks collected from the e-print arXiv archive and cover all the co-authorships between authors on General Relativity and Quantum Cosmology, High Energy Physics-Theory, and Condensed Matter Physics, respectively. Notice that all the collaboration networks are undirected graphs. Citation network. We choose a citation network, namely citeHepTh, from the Stanford network datasets [29]. The citeHepTh dataset is a citation network of papers on high energy physics theory, originally collected from the e-print arXiv archive. The citation network is a directed graph. The social network. Flickr is a popular photo-sharing website. The users of Flickr can upload photos, make friends, and join various interest groups. In our experiments, we employ the Flickr dataset from the ASU social computing data repository [30]. The dataset contains an undirected social network with 8,53 nodes and 5,899,882 edges and 95 different groups that the users joined. The detailed statistics of our datasets are presented in Table 1. From Table 1, we can observe that the approximate domination numbers (DN) of our datasets, which are calculated by a greedy algorithm given in [2], are greater than ,. However, in many practical retrieval systems, users are often interested in the top-K results, where K is a small constant (e.g., K = 3) that is typically smaller than the approximate DN.

TABLE 1 Summary of the datasets
Name        Nodes     Edges       Approximate DN
GrQc        ,98,598
HepTh       9,877     5,97        2,829
CondMat     23,33     86,936      4,449
citeHepTh   27,77     352,87      3,57
Flickr      8,53      5,899,882   3,768

Footnote 2: A set function $J(S)$ is called supermodular if $-J(S)$ is submodular.

Evaluation metrics: In the literature, there are no well-accepted measures for diversity in ranking on graphs [3]. In our experiments, we employ two metrics to measure the diversity. One is proposed in [6], which makes use of the density of the subgraph induced by the top-K ranking nodes.
The density of a graph is the ratio of the number of edges existing in the graph to the maximum possible number of edges in the graph. Intuitively, the density inversely measures the diversity of the top-K ranking nodes. The second metric is the expansion ratio, which is defined in Def. 2.. The rationale is that a larger expansion ratio of the top-K ranking nodes indicates better diversity. For comparing the relevance of different algorithms, we use the relevance metric given in [8]. Specifically, the relevance Rel is calculated as

$$Rel = \frac{\sum_{v_i \in S} w_i}{\sum_{v_i \in S'} w_i}, \tag{9}$$

where $S$ denotes the top-K diversified ranking list produced by the diversified ranking algorithm, and $S'$ denotes the top-K ranking list produced by the personalized PageRank algorithm. Note that Rel defined in Eq. (9) falls into the interval [0, 1], as the personalized PageRank algorithm always gives the most relevant nodes. By definition, a higher Rel implies better relevance.
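The three evaluation metrics above can be computed with a few lines of Python. This is a minimal sketch under stated assumptions rather than the evaluation code used in the paper: the graph is an undirected edge list `edges`, `w` maps each node to its personalized PageRank score, and the expansion ratio is assumed to be the fraction of all n nodes that are in the top-K set or adjacent to it (the formal definition lies outside this section). The function names `density`, `expansion_ratio`, and `relevance` are hypothetical.

```python
def density(edges, top):
    """Edge density of the undirected subgraph induced by the nodes in `top`."""
    nodes = set(top)
    if len(nodes) < 2:
        return 0.0
    # Count each undirected edge once, ignoring self-loops.
    induced = {
        frozenset((u, v))
        for u, v in edges
        if u in nodes and v in nodes and u != v
    }
    max_edges = len(nodes) * (len(nodes) - 1) / 2
    return len(induced) / max_edges


def expansion_ratio(edges, top, n):
    """Assumed form: |top plus its neighbors| divided by the number of nodes n."""
    nodes = set(top)
    covered = set(nodes)
    for u, v in edges:
        if u in nodes:
            covered.add(v)
        if v in nodes:
            covered.add(u)
    return len(covered) / n


def relevance(w, diversified, ppr_top):
    """Rel of Eq. (9): score mass of the diversified list over that of the PPR list."""
    return sum(w[v] for v in diversified) / sum(w[v] for v in ppr_top)
```

On a five-node path graph, for instance, selecting the two end nodes gives density 0 and covers four of the five nodes, whereas selecting two adjacent nodes gives density 1. This matches the intuition that lower density and higher expansion ratio indicate a more diverse list.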

Baselines: We compare our proposed methods with six baselines under the diversity and relevance metrics defined above. For our methods, we mainly focus on the k-step expansion for k = 1 and k = 2, denoted by Expansion-1 (Ep) and Expansion-2, respectively. Ep and Expansion-2 are tested using Alg. 1 and Alg. 2, respectively. We study the effect of the parameter k in the following section. For other k-step expansions (k > 2), the performance is not significantly better than that of the 1-step and 2-step expansions. The six baselines are as follows.

Personalized PageRank (PPR): PPR is a natural competitor of our algorithm and serves as a baseline for evaluating relevance.

Grasshopper (Gra): Gra is a diversified ranking algorithm that leverages an absorbing random walk to achieve diversity [5]. Gra has been successfully used in diversified document summarization and in ranking actors in social networks.

Manifold Ranking with Stop Points (MRSP): MRSP, proposed in [7], is very similar to the Grasshopper algorithm. It can also be used on graphs.

DivRank (DivR): DivR makes use of the stationary distribution of a vertex-reinforced random walk to rank nodes [6]. It has been applied to diversify ranking in information networks. There are two implementations of DivR, namely pointwise DivR and cumulative DivR. As reported in [6], the two algorithms achieve similar ranking performance. Hence, we use the pointwise DivR in our experiments.

Dragon (Dra): Dra is a scalable diversified ranking algorithm [8]. Dra aims to optimize a predefined diversified ranking measure. Unlike our diversified ranking measure, the measure used in Dra lacks a topological explanation, so it is, to some extent, less intuitive and reasonable.

Diversified ranking via Resistive Graph Centers (RGC): RGC [10] aims to learn a diversified teleport vector to achieve diversity in ranking. However, the time complexity of RGC is cubic, so it cannot scale to large graphs.
We do not compare with the MMR algorithm [13] because [6] has shown that DivR outperforms MMR on graph datasets. Parameter settings: In our proposed algorithms (Alg. 1 and Alg. 2), there are two common parameters: the damping factor α for computing the personalized PageRank, and the parameter λ used to trade off relevance and diversity. We set α = 5 as it is widely used in web search. For the parameter λ, we set it to .5 because the results are not very sensitive to it in our experiments. We show the effect of λ in the following section. Additionally, for Alg. 2, we use 5 hashing functions to implement the FM sketch. For all parameters of the baseline methods, we use the same settings as given in the respective original papers. Experimental environment: All the experiments are conducted on a Windows Server 27 with 4xDual-Core Intel Xeon 2.66 GHz CPU and 4G memory. All algorithms are implemented in MATLAB (R2a).

5.2 Experimental results

In all of our experiments, we randomly generate … queries, and the results are the average over all the queries. We give the detailed results as follows. Results on collaboration networks: In this experiment, we compare Ep and Expansion-2 with six baselines over three collaboration networks. Fig. 2(a), Fig. 2(b), and Fig. 2(c) depict our results on the GrQc, HepTh, and CondMat datasets, respectively. From Fig. 2(a), we can observe that DivR and Gra achieve near-optimal relevance, followed by Ep, Dra, Expansion-2, MRSP, and RGC. Note that the relevance of both Ep and Expansion-2 is more than … over different K values, which indicates that our algorithms can obtain relevant results w.r.t. the queries. We can clearly see that the relevance of RGC is extremely low, less than .3 over different K values. This result implies that RGC may produce irrelevant and meaningless results. For the diversity, we find that Expansion-2 is the winner under the expansion ratio metric among all the algorithms. Besides, Ep also outperforms the other baselines under the expansion ratio metric.
The expansion ratios of DivR, Gra, and MRSP are slightly worse than that of PPR, which suggests that DivR, Gra, and MRSP do not perform well at enhancing diversity in collaboration networks under the expansion ratio metric. Under the density metric, RGC outperforms the competitors (recall that smaller density implies better diversity). Ep, Expansion-2, and MRSP achieve comparable density, and they are slightly worse than Dra. DivR and Gra also do not perform well under the density metric. Similar results can be observed on the HepTh and CondMat datasets. Based on these observations, on the collaboration networks, we conclude that DivR, Gra, and MRSP do not perform well regarding diversity. The reason may be that these algorithms lack a clear explanation for diversity. RGC exhibits excellent performance for improving diversity, but it significantly sacrifices relevance. Our Ep and Expansion-2 as well as Dra achieve a good tradeoff between relevance and diversity. The reason is that our algorithms and Dra have a clear objective, namely to optimize the predefined diversified ranking measures. Moreover, our algorithms exhibit better relevance and a better expansion ratio than Dra. Results on citation network: Unlike the collaboration networks, the citation network is a directed graph. Here, we test MRSP by ignoring the direction of the edges, as MRSP cannot be directly applied to directed graphs. Fig. 3 describes our results.

Fig. 2. Comparison of various diversified ranking algorithms on collaboration networks (color online). Panels: (a) results on the GrQc dataset, (b) results on the HepTh dataset, (c) results on the CondMat dataset. Each panel plots (i) relevance vs. K, (ii) expansion ratio vs. K, and (iii) density vs. K for PPR, Ep, Expansion-2, Dra, DivR, RGC, Gra, and MRSP.

Fig. 3. Comparison of various diversified ranking algorithms on the citeHepTh dataset. Panels: (a) relevance vs. K, (b) expansion ratio vs. K, (c) density vs. K.

From Fig. 3, we find that Gra outperforms the other algorithms under the relevance metric. RGC shows the lowest relevance, which suggests that RGC may generate completely irrelevant ranking results. The other baselines, except PPR, show comparable relevance. For our approaches, Ep shows better relevance than Expansion-2. For the diversity, Expansion-2 outperforms the other algorithms under the expansion ratio metric. The expansion ratio of Ep is better than that of the six baseline algorithms. However, under the density metric, we can observe that RGC gets the best performance. Our approaches, MRSP, and Dra achieve comparable density. Also, for our approaches, Expansion-2 is slightly better than Ep under the density metric. In general, the results on the citation network are consistent with the results on the collaboration networks. Results on Flickr social network: Here we test our proposed algorithms on the Flickr social network. Our goal is to find the top-K users who not only have higher personalized PageRank scores relative to the queries, but also cover as many interest groups as possible. Hence, in addition to the diversity measures described in Section 5.1, we introduce the group coverage as a new diversity measure in this experiment. Intuitively, the more groups that are covered by the top-K ranking list, the better its diversity. In this experiment, we only compare our Ep and Expansion-2 with PPR and Dra. The reason is twofold. First, the other baselines either cannot get answers in 2 hours or cannot be run due to their memory requirements. Second, as observed in our previous experiments, Dra outperforms the other baselines. Our results are shown in Fig. 4.

Fig. 4. Comparison of various diversified ranking algorithms on the Flickr social network. Panels: (a) relevance vs. K, (b) expansion ratio vs. K, (c) density vs. K, (d) group coverage vs. K, for PPR, Ep, Expansion-2, and Dra.

From Fig. 4, we can observe that both Ep and Expansion-2 significantly outperform Dra on the relevance, expansion ratio, and group coverage metrics. More specifically, under the relevance and expansion ratio metrics, Ep is clearly the best performer among all the diversified ranking algorithms. Also, notice that the relevance of Dra decreases as K increases. When K = …, Dra exhibits low relevance (less than …). In contrast, our algorithms show quite robust relevance w.r.t. different K values. Furthermore, the relevance of our algorithms is greater than … over various K values. Under the density metric, Dra slightly outperforms Ep and Expansion-2. However, under the group coverage metric, Expansion-2 achieves the best performance, followed by Ep, Dra, and PPR. From a practical point of view, the performance of our algorithms is better than that of Dra, because the ranking results of our algorithms cover more interest groups than those of Dra. The reason may be that our diversified ranking measures capture the topological properties of the graph, which is more intuitive and reasonable than the measure used in Dra. To summarize, over all of our experiments, we make the following observations. (1) DivR and Gra achieve near-optimal relevance, but their performance in improving diversity is quite low. (2) RGC gets near-optimal diversity under the density metric, but it exhibits extremely low relevance. (3) The performance of MRSP is very low under the expansion ratio metric (even worse than PPR). (4) Ep, Expansion-2, and Dra show a good balance between relevance and diversity. Moreover, our Ep and Expansion-2 exhibit better relevance and diversity than Dra over most datasets used.

Precision comparison: To further evaluate the effectiveness of our algorithms, we compare the precision of our approaches with the state-of-the-art Dra. Since there is no ground truth in graph-type datasets, we use the personalized PageRank as the ground-truth ranking, as is also done in [8]. The precision is defined by the following formula:

$$Pre = \frac{|S \cap S'|}{|S|}, \tag{10}$$

where, as in Eq. (9), $S$ is the top-K diversified ranking list and $S'$ is the top-K ranking list produced by the personalized PageRank algorithm. Fig. 5 depicts our results on the CondMat, citeHepTh, and Flickr datasets. Similar results can be observed on the other datasets.

Fig. 5. Comparison of precision of Ep, Expansion-2, and Dra. Panels: (a) CondMat, (b) citeHepTh, (c) Flickr.

From Fig. 5, we can clearly see that both Ep and Expansion-2 consistently outperform Dra on the CondMat and Flickr datasets over different K. On the citeHepTh dataset, all three algorithms generate comparable rankings, and the performance of Ep is slightly better than that of Dra. The performance of Dra is not very stable over our datasets. On the citeHepTh dataset, the performance of Dra is comparable to that of our algorithms, but on the Flickr dataset, Dra does not perform well (precision is lower than … given K = …). This result implies that Dra produces less meaningful rankings on the Flickr dataset. In contrast to Dra, the performance of our algorithms is very stable over different datasets. In this sense, we can conclude that our algorithms are better than Dra. Time comparison: We compare the average query processing time of various diversified ranking algorithms over five network datasets. We average the query processing time of the ranking algorithms over different K values and different queries. Table 2 shows our results.
TABLE 2 Average query time of various algorithms (in seconds). Columns: GrQc, HepTh, CondMat, citeHepTh, Flickr. Rows: PPR, Ep, Expansion-2, Dra, DivR, RGC, Gra, MRSP.

From Table 2, we can observe that PPR is the most efficient algorithm. Ep and Dra achieve efficiency competitive with PPR. Expansion-2 is slightly slower than Ep, Dra, and PPR, but is still very efficient due to its linear time and space complexity. For the other baselines, we can clearly see that their time requirements are very high. Even worse, on the Flickr dataset, RGC, Gra, and MRSP cannot get the top-K ranking results in 2 hours, and DivR cannot be run due to its memory requirement. These results confirm our time and space complexity analysis in the previous sections.

Effect of parameter λ: We study the effect of the parameter λ in Ep and Expansion-2, i.e., λ in Eq. (2) and Eq. (5), which is used to trade off relevance and diversity. Here we study the top 3 ranking results (K = 3) under different λ values on the Flickr dataset. Similar results can be observed on other datasets and for other K. We use the results of PPR and Dra as the baselines. The reasons are (1) the ranking result of PPR is a natural measure for relevance, and (2) Dra outperforms the other baselines. The results are depicted in Fig. 6.

Fig. 6. The effect of parameter λ. Panels: (a) relevance vs. λ, (b) expansion ratio vs. λ, (c) density vs. λ, (d) group coverage vs. λ, for Ep, Expansion-2, PPR, and Dra.

As can be seen in Fig. 6(a), the relevance of Expansion-2 decreases as λ increases, while the relevance of Ep is robust w.r.t. λ. For the relevance, both Ep and Expansion-2 outperform Dra. According to Fig. 6(b), Fig. 6(c), and Fig. 6(d), we can observe that the diversity of Ep, which is measured by the expansion ratio, density, and group coverage, generally increases as λ increases. This is because a larger λ assigns more weight to the diversity term in our diversified measure (Eq. (2)). We also find that Ep is very robust w.r.t. λ. In addition, we can clearly see that both Ep and Expansion-2 outperform Dra under the expansion ratio and group coverage measures, while under the density measure, our algorithms are slightly worse than Dra.

Scalability testing and memory consumption: To study the scalability of Ep and Expansion-2, we generate two sets of synthetic graphs G with nodes ranging from , to 9, and edges from 8, to 4,,, using the Erdős-Rényi random graph model, respectively. Here we set K = 3, and similar results can be observed for other K. Our results are shown in Fig. 7.

Fig. 7. Scalability of the proposed algorithms. Panels: average query time (s) vs. number of nodes (left) and vs. number of edges (right), for Ep and Expansion-2.

From Fig. 7, we can clearly see that both Ep and Expansion-2 scale linearly w.r.t. both the number of nodes (left part of Fig. 7) and the number of edges (right part of Fig. 7). Therefore, our Ep and Expansion-2 can be used for very large graphs. The results confirm our time complexity analysis in the previous sections. To validate the space complexity of our algorithms, Fig. 8 shows the memory consumption of our algorithms on the same set of synthetic graphs.

Fig. 8. Memory consumption of the proposed algorithms. Panels: memory consumption (G) vs. number of nodes (left) and vs. number of edges (right), for Ep and Expansion-2.

Specifically, in the left part of Fig. 8, we can see that the memory consumption of both Ep and Expansion-2 increases as the number of nodes increases. The curves of both Ep and Expansion-2 become linear when the number of nodes is larger than 5,. Similarly, from the right part of Fig. 8, we can observe that the memory consumption of both Ep and Expansion-2 increases as the number of edges increases, and the curves of Ep and Expansion-2 tend to be linear when the number of edges is larger than 2,4,. These results confirm the linear space complexity of our algorithms.

Performance of Alg. 2: It is worth noting that Alg. 2 gives an approximate answer instead of the exact answer given by Alg. 1. We evaluate the approximation performance of Alg. 2. To this end, we first use Alg. 2 to test the 1-step expansion (set k = 1 in Alg. 2), and we refer to it as Approx. Ep. We compare the performance of Approx. Ep with Ep, which is implemented by Alg. 1. Fig. 9 shows our results on the Flickr dataset. Similar results can be observed on other datasets.

Fig. 9. Performance of the randomized greedy algorithm. Panels: (a) relevance, (b) expansion ratio, (c) density, (d) group coverage, for Ep and Approx. Ep.

From Fig. 9(a), we can find that Approx. Ep shows better relevance than Ep. However, from Fig. 9(b), (c), and (d), Approx. Ep is slightly worse than Ep under the three diversity metrics. Overall, Approx. Ep achieves performance comparable with Ep. These results suggest that our randomized greedy algorithm (Alg. 2) can achieve a good performance guarantee, which is consistent with our analysis in Section 4.

Effect of parameter k: We investigate how the parameter k affects the performance of the k-step expansion based algorithms, which are implemented by Alg. 2. Fig. 10 shows our results on the Flickr dataset, and similar results can be observed on other datasets.

Fig. 10. The effect of parameter k in k-step expansion based algorithms. Panels: (a) relevance vs. k, (b) expansion ratio vs. k, (c) density vs. k, (d) group coverage vs. k.

From Fig. 10, we can see that the relevance and diversity are generally not sensitive to k when k ≥ 2. The 2-step expansion (k = 2) achieves the best expansion ratio and density, so we set k = 2 in our previous experiments.

6 RELATED WORK

Diversified ranking on text data: Diversity has been recognized as an important criterion in information retrieval. There is a large body of work on query or search results diversification [31][32][33][34][35][36][37]. In document retrieval, one well-known method is the maximal marginal relevance (MMR) proposed by Carbonell and Goldstein [13], which achieves diversity by maximizing a linear combination function that captures both dissimilarity among the results and relevance w.r.t. the query. After Carbonell and Goldstein's work, many approaches addressing diversification have been proposed in recent years. Zhai et al. [38] propose a subtopic retrieval approach to results diversification. Agrawal et al. [39] formulate query results diversification as a submodular function maximization problem. Gollapudi et al. [26] present several axioms for query results diversification.
All the above-mentioned methods primarily address document data. An excellent survey on query results diversification is given in [27]. Submodular set function maximization: Our diversified ranking problem is closely related to the submodular set function maximization problem, which is generally NP-hard. However, there always exists a near-optimal greedy algorithm for solving such a problem [17]. Many applications have been formulated as submodular set function maximization problems, such as the influence maximization problem in social networks [40], the observation selection and sensor placement problems [41], [42], the document summarization problem [43], [37], and the set cover problem [44]. In this paper, we formulate the diversified ranking problem on graphs as a submodular set function maximization problem. Expansion on graphs: Our work is also related to the expansion of a graph, which is a well-known concept in expander graph theory [12]. This concept has recently been used for sampling community structure [45] and facilitating decentralized search in networks [14]. However, our definition of expansion is different from that in previous work, and we leverage expansion to measure the diversity of the top-K ranking results.

7 CONCLUSIONS

In this paper, we present a study of finding top-K diversified rankings on graphs. Firstly, we propose a novel diversified ranking measure, which captures both relevance and diversity. Secondly, we prove the submodularity of this measure and design an efficient greedy algorithm to achieve near-optimal diversified ranking. The proposed method has linear time and space complexity w.r.t. the size of the graph, so it scales to large graphs. Thirdly, we present a generalized diversified ranking measure and develop an efficient randomized greedy algorithm for maximizing it accurately. Finally, extensive experiments show the effectiveness, efficiency, and scalability of the proposed methods.

ACKNOWLEDGMENTS

The work was supported by grant of the Research Grants Council of the Hong Kong SAR, China No. CUHK/499.

REFERENCES

[1] S. Brin and L. Page, Pagerank: Bringing order to the web, Stanford Digital Library Project, Tech. Rep., 997.
[2] M. E. J. Newman, Networks: An Introduction. Oxford University Press, 2.
[3] T. H. Haveliwala, Topic-sensitive pagerank, in WWW 2.
[4] G. Jeh and J. Widom, Scaling personalized web search, in WWW 3.
[5] X. Zhu, A. B. Goldberg, J. V. Gael, and D. Andrzejewski, Improving diversity in ranking using absorbing random walks, in HLT-NAACL 7.
[6] Q. Mei, J. Guo, and D. R. Radev, Divrank: the interplay of prestige and diversity in information networks, in KDD.
[7] X. Zhu, J. Guo, X. Cheng, P. Du, and H. Shen, A unified framework for recommending diverse and relevant queries, in WWW.
[8] H. Tong, J. He, Z. Wen, R. Konuru, and C.-Y. Lin, Diversified ranking on large graphs: an optimization viewpoint, in KDD, 2.
[9] R.-H. Li and J. X. Yu, Scalable diversified ranking on large graphs, in ICDM, 2, pp
[10] A. Dubey, S. Chakrabarti, and C. Bhattacharyya, Diversity in ranking via resistive graph centers, in KDD, 2, pp
[11] O. Haggstrom, Finite markov chains and algorithmic applications. Cambridge University Press, 22.
[12] S. Hoory, N. Linial, and A. Wigderson, Expander graphs and their applications, Bull. Amer. Math. Soc., vol. 43, pp , 26.
[13] J. G. Carbonell and J. Goldstein, The use of mmr, diversity-based reranking for reordering documents and producing summaries, in SIGIR 98.
[14] A. S. Maiya and T. Y. Berger-Wolf, Expansion and search in networks, in CIKM.
[15] R. Kumar, K. Punera, and A. Tomkins, Hierarchical topic segmentation of websites, in KDD 6.
[16] T. Lappas, E. Terzi, D. Gunopulos, and H. Mannila, Finding effectors in social networks, in KDD.
[17] G. L. Nemhauser, L. A. Wolsey, and M. L. Fisher, An analysis of approximations for maximizing submodular set functions I, Mathematical Programming, vol. 4, pp , 978.
[18] U. Feige, A threshold of ln n for approximating set cover, J. ACM, vol. 45, pp , 998.
[19] J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. M. VanBriesen, and N. S. Glance, Cost-effective outbreak detection in networks, in KDD, 27.
[20] T. W. Haynes, S. T. Hedetniemi, and P. J. Slater, Domination in graphs: advanced topics. Marcel Dekker, Inc., 998.
[21] T. H. Cormen, C. Leiserson, R. Rivest, and C. Stein, Introduction to Algorithms, Third Edition. MIT Press, 2.
[22] P. Flajolet and G. N. Martin, Probabilistic counting algorithms for data base applications, J. Comput. Syst. Sci., vol. 3, no. 2, pp , 985.
[23] C. R. Palmer, P. B. Gibbons, and C. Faloutsos, ANF: a fast and scalable tool for data mining in massive graphs, in KDD, 22, pp
[24] M. Durand and P. Flajolet, Loglog counting of large cardinalities (extended abstract), in ESA, 23, pp
[25] P. Flajolet, E. Fusy, O. Gandouet, and F. Meunier, Hyperloglog: the analysis of a near-optimal cardinality estimation algorithm, in ESA, 23, pp
[26] S. Gollapudi and A. Sharma, An axiomatic approach for result diversification, in WWW 9.
[27] M. Drosou and E. Pitoura, Search result diversification, SIGMOD Rec., vol. 39, pp. 4 47, 2.
[28] N. Narasimhan and J. Bilmes, A supermodular-submodular procedure with applications to discriminative structure learning, in UAI 5.
[29] J. Leskovec, Stanford network analysis project, 2. [Online]. Available:
[30] R. Zafarani and H. Liu, Social computing data repository at ASU, 29. [Online]. Available: edu
[31] F. Radlinski, P. N. Bennett, B. Carterette, and T. Joachims, Redundancy, diversity and interdependent document relevance, SIGIR Forum, vol. 43, 29.
[32] Y. Zhang, J. P. Callan, and T. P. Minka, Novelty and redundancy detection in adaptive filtering, in SIGIR 2.
[33] C.-N. Ziegler, S. M. McNee, J. A. Konstan, and G. Lausen, Improving recommendation lists through topic diversification, in WWW 5.
[34] C. L. A. Clarke, M. Kolla, G. V. Cormack, O. Vechtomova, A. Ashkan, S. Büttcher, and I. MacKinnon, Novelty and diversity in information retrieval evaluation, in SIGIR 8.
[35] H. Ma, M. R. Lyu, and I. King, Diversifying query suggestion results, in AAAI.
[36] E. Minack, W. Siberski, and W. Nejdl, Incremental diversification for very large sets: a streaming-based approach, in SIGIR, 2, pp
[37] H. Lin and J. Bilmes, A class of submodular functions for document summarization, in ACL, 2, pp
[38] C. Zhai, W. W. Cohen, and J. D. Lafferty, Beyond independent relevance: methods and evaluation metrics for subtopic retrieval, in SIGIR 3.
[39] R. Agrawal, S. Gollapudi, A. Halverson, and S. Ieong, Diversifying search results, in WSDM 9.
[40] D. Kempe, J. M. Kleinberg, and É. Tardos, Maximizing the spread of influence through a social network, in KDD, 23, pp
[41] A. Krause and C. Guestrin, Near-optimal observation selection using submodular functions, in AAAI, 27, pp
[42] A. Krause, A. P. Singh, and C. Guestrin, Near-optimal sensor placements in gaussian processes: Theory, efficient algorithms and empirical studies, Journal of Machine Learning Research, vol. 9, pp , 28.
[43] H. Lin and J. Bilmes, Multi-document summarization via budgeted maximization of submodular functions, in HLT-NAACL, 2.
[44] V. V. Vazirani, Approximation Algorithms. Springer, 24.
[45] A. S. Maiya and T. Y. Berger-Wolf, Sampling community structure, in WWW.

Rong-Hua Li is pursuing his PhD degree in the Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Hong Kong. His research interests include social network analysis and mining, complex network theory, uncertain graph mining, Monte-Carlo algorithms, and machine learning.

Jeffrey Xu Yu received the BE, ME, and PhD degrees in computer science from the University of Tsukuba, Japan, in 985, 987, and 99, respectively. He held teaching positions in the Institute of Information Sciences and Electronics, University of Tsukuba, Japan, and the Department of Computer Science, The Australian National University. Currently, he is a professor in the Department of Systems Engineering and Engineering Management, the Chinese University of Hong Kong. He is serving as a VLDB Journal editorial board member. His current main research interests include graph databases, graph mining, keyword search in relational databases, and social network analysis.


Outline and Reading. Analysis of Algorithms. Running Time. Experimental Studies. Limitations of Experiments. Theoretical Analysis Outlie ad Readig Aalysis of Algorithms Iput Algorithm Output Ruig time ( 3.) Pseudo-code ( 3.2) Coutig primitive operatios ( 3.3-3.) Asymptotic otatio ( 3.6) Asymptotic aalysis ( 3.7) Case study Aalysis

More information

Running Time ( 3.1) Analysis of Algorithms. Experimental Studies. Limitations of Experiments

Running Time ( 3.1) Analysis of Algorithms. Experimental Studies. Limitations of Experiments Ruig Time ( 3.1) Aalysis of Algorithms Iput Algorithm Output A algorithm is a step- by- step procedure for solvig a problem i a fiite amout of time. Most algorithms trasform iput objects ito output objects.

More information

Analysis of Algorithms

Analysis of Algorithms Aalysis of Algorithms Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite amout of time. Ruig Time Most algorithms trasform iput objects ito output objects. The

More information

1 Graph Sparsfication

1 Graph Sparsfication CME 305: Discrete Mathematics ad Algorithms 1 Graph Sparsficatio I this sectio we discuss the approximatio of a graph G(V, E) by a sparse graph H(V, F ) o the same vertex set. I particular, we cosider

More information

arxiv: v2 [cs.ds] 24 Mar 2018

arxiv: v2 [cs.ds] 24 Mar 2018 Similar Elemets ad Metric Labelig o Complete Graphs arxiv:1803.08037v [cs.ds] 4 Mar 018 Pedro F. Felzeszwalb Brow Uiversity Providece, RI, USA pff@brow.edu March 8, 018 We cosider a problem that ivolves

More information

Lecture 1: Introduction and Strassen s Algorithm

Lecture 1: Introduction and Strassen s Algorithm 5-750: Graduate Algorithms Jauary 7, 08 Lecture : Itroductio ad Strasse s Algorithm Lecturer: Gary Miller Scribe: Robert Parker Itroductio Machie models I this class, we will primarily use the Radom Access

More information

CIS 121 Data Structures and Algorithms with Java Spring Stacks, Queues, and Heaps Monday, February 18 / Tuesday, February 19

CIS 121 Data Structures and Algorithms with Java Spring Stacks, Queues, and Heaps Monday, February 18 / Tuesday, February 19 CIS Data Structures ad Algorithms with Java Sprig 09 Stacks, Queues, ad Heaps Moday, February 8 / Tuesday, February 9 Stacks ad Queues Recall the stack ad queue ADTs (abstract data types from lecture.

More information

What are we going to learn? CSC Data Structures Analysis of Algorithms. Overview. Algorithm, and Inputs

What are we going to learn? CSC Data Structures Analysis of Algorithms. Overview. Algorithm, and Inputs What are we goig to lear? CSC316-003 Data Structures Aalysis of Algorithms Computer Sciece North Carolia State Uiversity Need to say that some algorithms are better tha others Criteria for evaluatio Structure

More information

How do we evaluate algorithms?

How do we evaluate algorithms? F2 Readig referece: chapter 2 + slides Algorithm complexity Big O ad big Ω To calculate ruig time Aalysis of recursive Algorithms Next time: Litterature: slides mostly The first Algorithm desig methods:

More information

CIS 121 Data Structures and Algorithms with Java Fall Big-Oh Notation Tuesday, September 5 (Make-up Friday, September 8)

CIS 121 Data Structures and Algorithms with Java Fall Big-Oh Notation Tuesday, September 5 (Make-up Friday, September 8) CIS 11 Data Structures ad Algorithms with Java Fall 017 Big-Oh Notatio Tuesday, September 5 (Make-up Friday, September 8) Learig Goals Review Big-Oh ad lear big/small omega/theta otatios Practice solvig

More information

An Efficient Algorithm for Graph Bisection of Triangularizations

An Efficient Algorithm for Graph Bisection of Triangularizations A Efficiet Algorithm for Graph Bisectio of Triagularizatios Gerold Jäger Departmet of Computer Sciece Washigto Uiversity Campus Box 1045 Oe Brookigs Drive St. Louis, Missouri 63130-4899, USA jaegerg@cse.wustl.edu

More information

condition w i B i S maximum u i

condition w i B i S maximum u i ecture 10 Dyamic Programmig 10.1 Kapsack Problem November 1, 2004 ecturer: Kamal Jai Notes: Tobias Holgers We are give a set of items U = {a 1, a 2,..., a }. Each item has a weight w i Z + ad a utility

More information

New Results on Energy of Graphs of Small Order

New Results on Energy of Graphs of Small Order Global Joural of Pure ad Applied Mathematics. ISSN 0973-1768 Volume 13, Number 7 (2017), pp. 2837-2848 Research Idia Publicatios http://www.ripublicatio.com New Results o Eergy of Graphs of Small Order

More information

Ones Assignment Method for Solving Traveling Salesman Problem

Ones Assignment Method for Solving Traveling Salesman Problem Joural of mathematics ad computer sciece 0 (0), 58-65 Oes Assigmet Method for Solvig Travelig Salesma Problem Hadi Basirzadeh Departmet of Mathematics, Shahid Chamra Uiversity, Ahvaz, Ira Article history:

More information

The isoperimetric problem on the hypercube

The isoperimetric problem on the hypercube The isoperimetric problem o the hypercube Prepared by: Steve Butler November 2, 2005 1 The isoperimetric problem We will cosider the -dimesioal hypercube Q Recall that the hypercube Q is a graph whose

More information

Analysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis

Analysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis Itro to Algorithm Aalysis Aalysis Metrics Slides. Table of Cotets. Aalysis Metrics 3. Exact Aalysis Rules 4. Simple Summatio 5. Summatio Formulas 6. Order of Magitude 7. Big-O otatio 8. Big-O Theorems

More information

Random Graphs and Complex Networks T

Random Graphs and Complex Networks T Radom Graphs ad Complex Networks T-79.7003 Charalampos E. Tsourakakis Aalto Uiversity Lecture 3 7 September 013 Aoucemet Homework 1 is out, due i two weeks from ow. Exercises: Probabilistic iequalities

More information

Computational Geometry

Computational Geometry Computatioal Geometry Chapter 4 Liear programmig Duality Smallest eclosig disk O the Ageda Liear Programmig Slides courtesy of Craig Gotsma 4. 4. Liear Programmig - Example Defie: (amout amout cosumed

More information

Analysis of Algorithms

Analysis of Algorithms Aalysis of Algorithms Ruig Time of a algorithm Ruig Time Upper Bouds Lower Bouds Examples Mathematical facts Iput Algorithm Output A algorithm is a step-by-step procedure for solvig a problem i a fiite

More information

Heaps. Presentation for use with the textbook Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015

Heaps. Presentation for use with the textbook Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015 Presetatio for use with the textbook Algorithm Desig ad Applicatios, by M. T. Goodrich ad R. Tamassia, Wiley, 201 Heaps 201 Goodrich ad Tamassia xkcd. http://xkcd.com/83/. Tree. Used with permissio uder

More information

An Efficient Algorithm for Graph Bisection of Triangularizations

An Efficient Algorithm for Graph Bisection of Triangularizations Applied Mathematical Scieces, Vol. 1, 2007, o. 25, 1203-1215 A Efficiet Algorithm for Graph Bisectio of Triagularizatios Gerold Jäger Departmet of Computer Sciece Washigto Uiversity Campus Box 1045, Oe

More information

A New Morphological 3D Shape Decomposition: Grayscale Interframe Interpolation Method

A New Morphological 3D Shape Decomposition: Grayscale Interframe Interpolation Method A ew Morphological 3D Shape Decompositio: Grayscale Iterframe Iterpolatio Method D.. Vizireau Politehica Uiversity Bucharest, Romaia ae@comm.pub.ro R. M. Udrea Politehica Uiversity Bucharest, Romaia mihea@comm.pub.ro

More information

. Written in factored form it is easy to see that the roots are 2, 2, i,

. Written in factored form it is easy to see that the roots are 2, 2, i, CMPS A Itroductio to Programmig Programmig Assigmet 4 I this assigmet you will write a java program that determies the real roots of a polyomial that lie withi a specified rage. Recall that the roots (or

More information

15-859E: Advanced Algorithms CMU, Spring 2015 Lecture #2: Randomized MST and MST Verification January 14, 2015

15-859E: Advanced Algorithms CMU, Spring 2015 Lecture #2: Randomized MST and MST Verification January 14, 2015 15-859E: Advaced Algorithms CMU, Sprig 2015 Lecture #2: Radomized MST ad MST Verificatio Jauary 14, 2015 Lecturer: Aupam Gupta Scribe: Yu Zhao 1 Prelimiaries I this lecture we are talkig about two cotets:

More information

Hash Tables. Presentation for use with the textbook Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015.

Hash Tables. Presentation for use with the textbook Algorithm Design and Applications, by M. T. Goodrich and R. Tamassia, Wiley, 2015. Presetatio for use with the textbook Algorithm Desig ad Applicatios, by M. T. Goodrich ad R. Tamassia, Wiley, 2015 Hash Tables xkcd. http://xkcd.com/221/. Radom Number. Used with permissio uder Creative

More information

Fast Fourier Transform (FFT) Algorithms

Fast Fourier Transform (FFT) Algorithms Fast Fourier Trasform FFT Algorithms Relatio to the z-trasform elsewhere, ozero, z x z X x [ ] 2 ~ elsewhere,, ~ e j x X x x π j e z z X X π 2 ~ The DFS X represets evely spaced samples of the z- trasform

More information

Strong Complementary Acyclic Domination of a Graph

Strong Complementary Acyclic Domination of a Graph Aals of Pure ad Applied Mathematics Vol 8, No, 04, 83-89 ISSN: 79-087X (P), 79-0888(olie) Published o 7 December 04 wwwresearchmathsciorg Aals of Strog Complemetary Acyclic Domiatio of a Graph NSaradha

More information

Image Segmentation EEE 508

Image Segmentation EEE 508 Image Segmetatio Objective: to determie (etract) object boudaries. It is a process of partitioig a image ito distict regios by groupig together eighborig piels based o some predefied similarity criterio.

More information

Lecturers: Sanjam Garg and Prasad Raghavendra Feb 21, Midterm 1 Solutions

Lecturers: Sanjam Garg and Prasad Raghavendra Feb 21, Midterm 1 Solutions U.C. Berkeley CS170 : Algorithms Midterm 1 Solutios Lecturers: Sajam Garg ad Prasad Raghavedra Feb 1, 017 Midterm 1 Solutios 1. (4 poits) For the directed graph below, fid all the strogly coected compoets

More information

Xiaozhou (Steve) Li, Atri Rudra, Ram Swaminathan. HP Laboratories HPL Keyword(s): graph coloring; hardness of approximation

Xiaozhou (Steve) Li, Atri Rudra, Ram Swaminathan. HP Laboratories HPL Keyword(s): graph coloring; hardness of approximation Flexible Colorig Xiaozhou (Steve) Li, Atri Rudra, Ram Swamiatha HP Laboratories HPL-2010-177 Keyword(s): graph colorig; hardess of approximatio Abstract: Motivated b y reliability cosideratios i data deduplicatio

More information

Homework 1 Solutions MA 522 Fall 2017

Homework 1 Solutions MA 522 Fall 2017 Homework 1 Solutios MA 5 Fall 017 1. Cosider the searchig problem: Iput A sequece of umbers A = [a 1,..., a ] ad a value v. Output A idex i such that v = A[i] or the special value NIL if v does ot appear

More information

Exact Minimum Lower Bound Algorithm for Traveling Salesman Problem

Exact Minimum Lower Bound Algorithm for Traveling Salesman Problem Exact Miimum Lower Boud Algorithm for Travelig Salesma Problem Mohamed Eleiche GeoTiba Systems mohamed.eleiche@gmail.com Abstract The miimum-travel-cost algorithm is a dyamic programmig algorithm to compute

More information

1.2 Binomial Coefficients and Subsets

1.2 Binomial Coefficients and Subsets 1.2. BINOMIAL COEFFICIENTS AND SUBSETS 13 1.2 Biomial Coefficiets ad Subsets 1.2-1 The loop below is part of a program to determie the umber of triagles formed by poits i the plae. for i =1 to for j =

More information

Markov Chain Model of HomePlug CSMA MAC for Determining Optimal Fixed Contention Window Size

Markov Chain Model of HomePlug CSMA MAC for Determining Optimal Fixed Contention Window Size Markov Chai Model of HomePlug CSMA MAC for Determiig Optimal Fixed Cotetio Widow Size Eva Krimiger * ad Haiph Latchma Dept. of Electrical ad Computer Egieerig, Uiversity of Florida, Gaiesville, FL, USA

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe CHAPTER 19 Query Optimizatio Copyright 2016 Ramez Elmasri ad Shamkat B. Navathe Itroductio Query optimizatio Coducted by a query optimizer i a DBMS Goal:

More information

Big-O Analysis. Asymptotics

Big-O Analysis. Asymptotics Big-O Aalysis 1 Defiitio: Suppose that f() ad g() are oegative fuctios of. The we say that f() is O(g()) provided that there are costats C > 0 ad N > 0 such that for all > N, f() Cg(). Big-O expresses

More information

Data diverse software fault tolerance techniques

Data diverse software fault tolerance techniques Data diverse software fault tolerace techiques Complemets desig diversity by compesatig for desig diversity s s limitatios Ivolves obtaiig a related set of poits i the program data space, executig the

More information

Administrative UNSUPERVISED LEARNING. Unsupervised learning. Supervised learning 11/25/13. Final project. No office hours today

Administrative UNSUPERVISED LEARNING. Unsupervised learning. Supervised learning 11/25/13. Final project. No office hours today Admiistrative Fial project No office hours today UNSUPERVISED LEARNING David Kauchak CS 451 Fall 2013 Supervised learig Usupervised learig label label 1 label 3 model/ predictor label 4 label 5 Supervised

More information

( n+1 2 ) , position=(7+1)/2 =4,(median is observation #4) Median=10lb

( n+1 2 ) , position=(7+1)/2 =4,(median is observation #4) Median=10lb Chapter 3 Descriptive Measures Measures of Ceter (Cetral Tedecy) These measures will tell us where is the ceter of our data or where most typical value of a data set lies Mode the value that occurs most

More information

c-dominating Sets for Families of Graphs

c-dominating Sets for Families of Graphs c-domiatig Sets for Families of Graphs Kelsie Syder Mathematics Uiversity of Mary Washigto April 6, 011 1 Abstract The topic of domiatio i graphs has a rich history, begiig with chess ethusiasts i the

More information

6.851: Advanced Data Structures Spring Lecture 17 April 24

6.851: Advanced Data Structures Spring Lecture 17 April 24 6.851: Advaced Data Structures Sprig 2012 Prof. Erik Demaie Lecture 17 April 24 Scribes: David Bejami(2012), Li Fei(2012), Yuzhi Zheg(2012),Morteza Zadimoghaddam(2010), Aaro Berstei(2007) 1 Overview Up

More information

Analysis of Algorithms

Analysis of Algorithms Presetatio for use with the textbook, Algorithm Desig ad Applicatios, by M. T. Goodrich ad R. Tamassia, Wiley, 2015 Aalysis of Algorithms Iput 2015 Goodrich ad Tamassia Algorithm Aalysis of Algorithms

More information

Pattern Recognition Systems Lab 1 Least Mean Squares

Pattern Recognition Systems Lab 1 Least Mean Squares Patter Recogitio Systems Lab 1 Least Mea Squares 1. Objectives This laboratory work itroduces the OpeCV-based framework used throughout the course. I this assigmet a lie is fitted to a set of poits usig

More information

Consider the following population data for the state of California. Year Population

Consider the following population data for the state of California. Year Population Assigmets for Bradie Fall 2016 for Chapter 5 Assigmet sheet for Sectios 5.1, 5.3, 5.5, 5.6, 5.7, 5.8 Read Pages 341-349 Exercises for Sectio 5.1 Lagrage Iterpolatio #1, #4, #7, #13, #14 For #1 use MATLAB

More information

Lower Bounds for Sorting

Lower Bounds for Sorting Liear Sortig Topics Covered: Lower Bouds for Sortig Coutig Sort Radix Sort Bucket Sort Lower Bouds for Sortig Compariso vs. o-compariso sortig Decisio tree model Worst case lower boud Compariso Sortig

More information

A SOFTWARE MODEL FOR THE MULTILAYER PERCEPTRON

A SOFTWARE MODEL FOR THE MULTILAYER PERCEPTRON A SOFTWARE MODEL FOR THE MULTILAYER PERCEPTRON Roberto Lopez ad Eugeio Oñate Iteratioal Ceter for Numerical Methods i Egieerig (CIMNE) Edificio C1, Gra Capitá s/, 08034 Barceloa, Spai ABSTRACT I this work

More information

Improving Template Based Spike Detection

Improving Template Based Spike Detection Improvig Template Based Spike Detectio Kirk Smith, Member - IEEE Portlad State Uiversity petra@ee.pdx.edu Abstract Template matchig algorithms like SSE, Covolutio ad Maximum Likelihood are well kow for

More information

n n B. How many subsets of C are there of cardinality n. We are selecting elements for such a

n n B. How many subsets of C are there of cardinality n. We are selecting elements for such a 4. [10] Usig a combiatorial argumet, prove that for 1: = 0 = Let A ad B be disjoit sets of cardiality each ad C = A B. How may subsets of C are there of cardiality. We are selectig elemets for such a subset

More information

CIS 121 Data Structures and Algorithms with Java Spring Stacks and Queues Monday, February 12 / Tuesday, February 13

CIS 121 Data Structures and Algorithms with Java Spring Stacks and Queues Monday, February 12 / Tuesday, February 13 CIS Data Structures ad Algorithms with Java Sprig 08 Stacks ad Queues Moday, February / Tuesday, February Learig Goals Durig this lab, you will: Review stacks ad queues. Lear amortized ruig time aalysis

More information

CHAPTER IV: GRAPH THEORY. Section 1: Introduction to Graphs

CHAPTER IV: GRAPH THEORY. Section 1: Introduction to Graphs CHAPTER IV: GRAPH THEORY Sectio : Itroductio to Graphs Sice this class is called Number-Theoretic ad Discrete Structures, it would be a crime to oly focus o umber theory regardless how woderful those topics

More information

On (K t e)-saturated Graphs

On (K t e)-saturated Graphs Noame mauscript No. (will be iserted by the editor O (K t e-saturated Graphs Jessica Fuller Roald J. Gould the date of receipt ad acceptace should be iserted later Abstract Give a graph H, we say a graph

More information

Project 2.5 Improved Euler Implementation

Project 2.5 Improved Euler Implementation Project 2.5 Improved Euler Implemetatio Figure 2.5.10 i the text lists TI-85 ad BASIC programs implemetig the improved Euler method to approximate the solutio of the iitial value problem dy dx = x+ y,

More information

9.1. Sequences and Series. Sequences. What you should learn. Why you should learn it. Definition of Sequence

9.1. Sequences and Series. Sequences. What you should learn. Why you should learn it. Definition of Sequence _9.qxd // : AM Page Chapter 9 Sequeces, Series, ad Probability 9. Sequeces ad Series What you should lear Use sequece otatio to write the terms of sequeces. Use factorial otatio. Use summatio otatio to

More information

A Study on the Performance of Cholesky-Factorization using MPI

A Study on the Performance of Cholesky-Factorization using MPI A Study o the Performace of Cholesky-Factorizatio usig MPI Ha S. Kim Scott B. Bade Departmet of Computer Sciece ad Egieerig Uiversity of Califoria Sa Diego {hskim, bade}@cs.ucsd.edu Abstract Cholesky-factorizatio

More information

Adaptive Resource Allocation for Electric Environmental Pollution through the Control Network

Adaptive Resource Allocation for Electric Environmental Pollution through the Control Network Available olie at www.sciecedirect.com Eergy Procedia 6 (202) 60 64 202 Iteratioal Coferece o Future Eergy, Eviromet, ad Materials Adaptive Resource Allocatio for Electric Evirometal Pollutio through the

More information

A Note on Least-norm Solution of Global WireWarping

A Note on Least-norm Solution of Global WireWarping A Note o Least-orm Solutio of Global WireWarpig Charlie C. L. Wag Departmet of Mechaical ad Automatio Egieerig The Chiese Uiversity of Hog Kog Shati, N.T., Hog Kog E-mail: cwag@mae.cuhk.edu.hk Abstract

More information

An Improved Shuffled Frog-Leaping Algorithm for Knapsack Problem

An Improved Shuffled Frog-Leaping Algorithm for Knapsack Problem A Improved Shuffled Frog-Leapig Algorithm for Kapsack Problem Zhoufag Li, Ya Zhou, ad Peg Cheg School of Iformatio Sciece ad Egieerig Hea Uiversity of Techology ZhegZhou, Chia lzhf1978@126.com Abstract.

More information

BOOLEAN MATHEMATICS: GENERAL THEORY

BOOLEAN MATHEMATICS: GENERAL THEORY CHAPTER 3 BOOLEAN MATHEMATICS: GENERAL THEORY 3.1 ISOMORPHIC PROPERTIES The ame Boolea Arithmetic was chose because it was discovered that literal Boolea Algebra could have a isomorphic umerical aspect.

More information

Dynamic Programming and Curve Fitting Based Road Boundary Detection

Dynamic Programming and Curve Fitting Based Road Boundary Detection Dyamic Programmig ad Curve Fittig Based Road Boudary Detectio SHYAM PRASAD ADHIKARI, HYONGSUK KIM, Divisio of Electroics ad Iformatio Egieerig Chobuk Natioal Uiversity 664-4 Ga Deokji-Dog Jeoju-City Jeobuk

More information

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov Sortig i Liear Time Data Structures ad Algorithms Adrei Bulatov Algorithms Sortig i Liear Time 7-2 Compariso Sorts The oly test that all the algorithms we have cosidered so far is compariso The oly iformatio

More information

FREQUENCY ESTIMATION OF INTERNET PACKET STREAMS WITH LIMITED SPACE: UPPER AND LOWER BOUNDS

FREQUENCY ESTIMATION OF INTERNET PACKET STREAMS WITH LIMITED SPACE: UPPER AND LOWER BOUNDS FREQUENCY ESTIMATION OF INTERNET PACKET STREAMS WITH LIMITED SPACE: UPPER AND LOWER BOUNDS Prosejit Bose Evagelos Kraakis Pat Mori Yihui Tag School of Computer Sciece, Carleto Uiversity {jit,kraakis,mori,y

More information

ANN WHICH COVERS MLP AND RBF

ANN WHICH COVERS MLP AND RBF ANN WHICH COVERS MLP AND RBF Josef Boští, Jaromír Kual Faculty of Nuclear Scieces ad Physical Egieerig, CTU i Prague Departmet of Software Egieerig Abstract Two basic types of artificial eural etwors Multi

More information

Informed Search. Russell and Norvig Chap. 3

Informed Search. Russell and Norvig Chap. 3 Iformed Search Russell ad Norvig Chap. 3 Not all search directios are equally promisig Outlie Iformed: use problem-specific kowledge Add a sese of directio to search: work toward the goal Heuristic fuctios:

More information

Chapter 3 Classification of FFT Processor Algorithms

Chapter 3 Classification of FFT Processor Algorithms Chapter Classificatio of FFT Processor Algorithms The computatioal complexity of the Discrete Fourier trasform (DFT) is very high. It requires () 2 complex multiplicatios ad () complex additios [5]. As

More information

Algorithms for Disk Covering Problems with the Most Points

Algorithms for Disk Covering Problems with the Most Points Algorithms for Disk Coverig Problems with the Most Poits Bi Xiao Departmet of Computig Hog Kog Polytechic Uiversity Hug Hom, Kowloo, Hog Kog csbxiao@comp.polyu.edu.hk Qigfeg Zhuge, Yi He, Zili Shao, Edwi

More information

Σ P(i) ( depth T (K i ) + 1),

Σ P(i) ( depth T (K i ) + 1), EECS 3101 York Uiversity Istructor: Ady Mirzaia DYNAMIC PROGRAMMING: OPIMAL SAIC BINARY SEARCH REES his lecture ote describes a applicatio of the dyamic programmig paradigm o computig the optimal static

More information

Octahedral Graph Scaling

Octahedral Graph Scaling Octahedral Graph Scalig Peter Russell Jauary 1, 2015 Abstract There is presetly o strog iterpretatio for the otio of -vertex graph scalig. This paper presets a ew defiitio for the term i the cotext of

More information

Protected points in ordered trees

Protected points in ordered trees Applied Mathematics Letters 008 56 50 www.elsevier.com/locate/aml Protected poits i ordered trees Gi-Sag Cheo a, Louis W. Shapiro b, a Departmet of Mathematics, Sugkyukwa Uiversity, Suwo 440-746, Republic

More information

Load balanced Parallel Prime Number Generator with Sieve of Eratosthenes on Cluster Computers *

Load balanced Parallel Prime Number Generator with Sieve of Eratosthenes on Cluster Computers * Load balaced Parallel Prime umber Geerator with Sieve of Eratosthees o luster omputers * Soowook Hwag*, Kyusik hug**, ad Dogseug Kim* *Departmet of Electrical Egieerig Korea Uiversity Seoul, -, Rep. of

More information

Sectio 4, a prototype project of settig field weight with AHP method is developed ad the experimetal results are aalyzed. Fially, we coclude our work

Sectio 4, a prototype project of settig field weight with AHP method is developed ad the experimetal results are aalyzed. Fially, we coclude our work 200 2d Iteratioal Coferece o Iformatio ad Multimedia Techology (ICIMT 200) IPCSIT vol. 42 (202) (202) IACSIT Press, Sigapore DOI: 0.7763/IPCSIT.202.V42.0 Idex Weight Decisio Based o AHP for Iformatio Retrieval

More information

Convergence results for conditional expectations

Convergence results for conditional expectations Beroulli 11(4), 2005, 737 745 Covergece results for coditioal expectatios IRENE CRIMALDI 1 ad LUCA PRATELLI 2 1 Departmet of Mathematics, Uiversity of Bologa, Piazza di Porta Sa Doato 5, 40126 Bologa,

More information

CS 683: Advanced Design and Analysis of Algorithms

CS 683: Advanced Design and Analysis of Algorithms CS 683: Advaced Desig ad Aalysis of Algorithms Lecture 6, February 1, 2008 Lecturer: Joh Hopcroft Scribes: Shaomei Wu, Etha Feldma February 7, 2008 1 Threshold for k CNF Satisfiability I the previous lecture,

More information

Big-O Analysis. Asymptotics

Big-O Analysis. Asymptotics Big-O Aalysis 1 Defiitio: Suppose that f() ad g() are oegative fuctios of. The we say that f() is O(g()) provided that there are costats C > 0 ad N > 0 such that for all > N, f() Cg(). Big-O expresses

More information

Improved Random Graph Isomorphism

Improved Random Graph Isomorphism Improved Radom Graph Isomorphism Tomek Czajka Gopal Paduraga Abstract Caoical labelig of a graph cosists of assigig a uique label to each vertex such that the labels are ivariat uder isomorphism. Such

More information

Chapter 11. Friends, Overloaded Operators, and Arrays in Classes. Copyright 2014 Pearson Addison-Wesley. All rights reserved.

Chapter 11. Friends, Overloaded Operators, and Arrays in Classes. Copyright 2014 Pearson Addison-Wesley. All rights reserved. Chapter 11 Frieds, Overloaded Operators, ad Arrays i Classes Copyright 2014 Pearso Addiso-Wesley. All rights reserved. Overview 11.1 Fried Fuctios 11.2 Overloadig Operators 11.3 Arrays ad Classes 11.4

More information

Minimum Spanning Trees

Minimum Spanning Trees Miimum Spaig Trees Miimum Spaig Trees Spaig subgraph Subgraph of a graph G cotaiig all the vertices of G Spaig tree Spaig subgraph that is itself a (free) tree Miimum spaig tree (MST) Spaig tree of a weighted

More information

The Counterchanged Crossed Cube Interconnection Network and Its Topology Properties

The Counterchanged Crossed Cube Interconnection Network and Its Topology Properties WSEAS TRANSACTIONS o COMMUNICATIONS Wag Xiyag The Couterchaged Crossed Cube Itercoectio Network ad Its Topology Properties WANG XINYANG School of Computer Sciece ad Egieerig South Chia Uiversity of Techology

More information

Chapter 24. Sorting. Objectives. 1. To study and analyze time efficiency of various sorting algorithms

Chapter 24. Sorting. Objectives. 1. To study and analyze time efficiency of various sorting algorithms Chapter 4 Sortig 1 Objectives 1. o study ad aalyze time efficiecy of various sortig algorithms 4. 4.7.. o desig, implemet, ad aalyze bubble sort 4.. 3. o desig, implemet, ad aalyze merge sort 4.3. 4. o

More information

APPLICATION NOTE PACE1750AE BUILT-IN FUNCTIONS

APPLICATION NOTE PACE1750AE BUILT-IN FUNCTIONS APPLICATION NOTE PACE175AE BUILT-IN UNCTIONS About This Note This applicatio brief is iteded to explai ad demostrate the use of the special fuctios that are built ito the PACE175AE processor. These powerful

More information

Fundamentals of Media Processing. Shin'ichi Satoh Kazuya Kodama Hiroshi Mo Duy-Dinh Le

Fundamentals of Media Processing. Shin'ichi Satoh Kazuya Kodama Hiroshi Mo Duy-Dinh Le Fudametals of Media Processig Shi'ichi Satoh Kazuya Kodama Hiroshi Mo Duy-Dih Le Today's topics Noparametric Methods Parze Widow k-nearest Neighbor Estimatio Clusterig Techiques k-meas Agglomerative Hierarchical

More information

Solution printed. Do not start the test until instructed to do so! CS 2604 Data Structures Midterm Spring, Instructions:

Solution printed. Do not start the test until instructed to do so! CS 2604 Data Structures Midterm Spring, Instructions: CS 604 Data Structures Midterm Sprig, 00 VIRG INIA POLYTECHNIC INSTITUTE AND STATE U T PROSI M UNI VERSI TY Istructios: Prit your ame i the space provided below. This examiatio is closed book ad closed

More information

Bayesian approach to reliability modelling for a probability of failure on demand parameter

Bayesian approach to reliability modelling for a probability of failure on demand parameter Bayesia approach to reliability modellig for a probability of failure o demad parameter BÖRCSÖK J., SCHAEFER S. Departmet of Computer Architecture ad System Programmig Uiversity Kassel, Wilhelmshöher Allee

More information

A Kernel Density Based Approach for Large Scale Image Retrieval

A Kernel Density Based Approach for Large Scale Image Retrieval A Kerel Desity Based Approach for Large Scale Image Retrieval Wei Tog Departmet of Computer Sciece ad Egieerig Michiga State Uiversity East Lasig, MI, USA togwei@cse.msu.edu Rog Ji Departmet of Computer

More information

Perhaps the method will give that for every e > U f() > p - 3/+e There is o o-trivial upper boud for f() ad ot eve f() < Z - e. seems to be kow, where

Perhaps the method will give that for every e > U f() > p - 3/+e There is o o-trivial upper boud for f() ad ot eve f() < Z - e. seems to be kow, where ON MAXIMUM CHORDAL SUBGRAPH * Paul Erdos Mathematical Istitute of the Hugaria Academy of Scieces ad Reu Laskar Clemso Uiversity 1. Let G() deote a udirected graph, with vertices ad V(G) deote the vertex

More information

CS200: Hash Tables. Prichard Ch CS200 - Hash Tables 1

CS200: Hash Tables. Prichard Ch CS200 - Hash Tables 1 CS200: Hash Tables Prichard Ch. 13.2 CS200 - Hash Tables 1 Table Implemetatios: average cases Search Add Remove Sorted array-based Usorted array-based Balaced Search Trees O(log ) O() O() O() O(1) O()

More information

ECE4050 Data Structures and Algorithms. Lecture 6: Searching

ECE4050 Data Structures and Algorithms. Lecture 6: Searching ECE4050 Data Structures ad Algorithms Lecture 6: Searchig 1 Search Give: Distict keys k 1, k 2,, k ad collectio L of records of the form (k 1, I 1 ), (k 2, I 2 ),, (k, I ) where I j is the iformatio associated

More information

Improvement of the Orthogonal Code Convolution Capabilities Using FPGA Implementation

Improvement of the Orthogonal Code Convolution Capabilities Using FPGA Implementation Improvemet of the Orthogoal Code Covolutio Capabilities Usig FPGA Implemetatio Naima Kaabouch, Member, IEEE, Apara Dhirde, Member, IEEE, Saleh Faruque, Member, IEEE Departmet of Electrical Egieerig, Uiversity

More information

Algorithms Chapter 3 Growth of Functions

Algorithms Chapter 3 Growth of Functions Algorithms Chapter 3 Growth of Fuctios Istructor: Chig Chi Li 林清池助理教授 chigchi.li@gmail.com Departmet of Computer Sciece ad Egieerig Natioal Taiwa Ocea Uiversity Outlie Asymptotic otatio Stadard otatios

More information

Combination Labelings Of Graphs

Combination Labelings Of Graphs Applied Mathematics E-Notes, (0), - c ISSN 0-0 Available free at mirror sites of http://wwwmaththuedutw/ame/ Combiatio Labeligs Of Graphs Pak Chig Li y Received February 0 Abstract Suppose G = (V; E) is

More information

On Nonblocking Folded-Clos Networks in Computer Communication Environments

On Nonblocking Folded-Clos Networks in Computer Communication Environments O Noblockig Folded-Clos Networks i Computer Commuicatio Eviromets Xi Yua Departmet of Computer Sciece, Florida State Uiversity, Tallahassee, FL 3306 xyua@cs.fsu.edu Abstract Folded-Clos etworks, also referred

More information