

Exploring Communities in Large Profiled Graphs

Yankai Chen, Yixiang Fang, Reynold Cheng, Member, IEEE, Yun Li, Xiaojun Chen, Jie Zhang

Y. Chen, Y. Fang, and R. Cheng are with the Department of Computer Science, The University of Hong Kong, Hong Kong. {ykchen, yxfang, ckcheng}@cs.hku.hk
Y. Li is with the Department of Computer Science and Technology, Nanjing University, China. liycse@gmail.com
X. Chen is with the College of Computer Science and Software, Shenzhen University, China. xjchen@szu.edu.cn
J. Zhang is with the School of Computer Science and Engineering, Nanyang Technological University, Singapore. zhangj@ntu.edu.sg
Manuscript received March 20,

Abstract: Given a graph $G$ and a vertex $q \in G$, the community search (CS) problem aims to efficiently find a subgraph of $G$ whose vertices are closely related to $q$. Communities are prevalent in social and biological networks, and can be used in product advertisement and social event recommendation. In this paper, we study profiled community search (PCS), where CS is performed on a profiled graph. This is a graph in which each vertex has labels arranged in a hierarchical manner. Extensive experiments show that PCS can identify communities with themes that are common to their vertices, and is more effective than existing CS approaches. As a naive solution for PCS is highly expensive, we have also developed a tree index, which facilitates efficient and online solutions for PCS.

arXiv: v1 [cs.DB] 16 Jan 2019

Index Terms: community search, social networks, graph queries, profiled graph

1 INTRODUCTION

Due to the recent developments of gigantic social networks (e.g., Flickr, Facebook, and Twitter), topics of graph queries have attracted attention from industry and research areas [1–7]. Communities, which are often found in large graphs, can be used in various applications, such as social event setting, friend recommendation, and research collaboration analysis [8–12]. Given a graph $G$ and a query vertex $q \in G$, the goal of community search (CS) is to extract communities, or densely connected subgraphs of $G$ that contain $q$, in an online manner.

Fig. 1. A profiled graph, a subtree of CCS and meanings of terms. (a) a profiled graph; (b) a subtree of CCS; (c) abbreviations: CM = Computing Methodology, ML = Machine Learning, AI = Artificial Intelligence, IS = Information Systems, DMS = Data Management System, HW = Hardware.

In this paper, we investigate the CS problem for a profiled graph. This is essentially a kind of attributed graph, where each graph vertex is associated with a set of labels arranged in a hierarchical manner, called a P-tree. Fig. 1(a) shows a profiled graph, which is a computer science collaboration network; each vertex represents a researcher, and a link between two vertices indicates that the two corresponding researchers have worked together before. Each vertex is associated with a P-tree, which describes the expertise of the researcher. Fig. 1(c) shows the meanings of the terms in each P-tree, following the ACM Computing Classification System (CCS)¹, which is partially presented in Fig. 1(b). For instance, vertex B denotes a researcher whose research domain is computing methodology (CM), with specific interest in machine learning (ML) and artificial intelligence (AI).

Profiled graphs are informative and can be found in various graph applications (e.g., knowledge bases, social and collaboration networks). Moreover, the P-trees of profiled graphs systematically organize the labels related to a vertex (e.g., hierarchical and interrelated knowledge in knowledge bases; affiliation, expertise, and locations in social and collaboration networks), reflecting the semantic relationships among them. For example, in a P-tree, the label London can be a child node of UK, because London is a UK city.

Fig. 2. Illustrating profiled community search (PCS). (a) two PCs; (b) {B, C, D}, with common labels CM, ML, AI; (c) {A, D, E}, with common labels IS, DMS.

Prior works. Methods for retrieving communities can generally be classified into community detection (CD) methods and community search (CS) methods.
In general, the aim of CD algorithms is to retrieve all communities of a graph [13–20]. Note that these solutions are not query-based: given a user-specified query vertex, they are not customized for the query request. As a result, these algorithms normally take a long time to find all the communities of a large graph, and are therefore not suitable for quick or online retrieval of communities. To solve these problems, CS solutions have been recently proposed [8, 10, 21–24]. Compared with CD solutions, CS approaches are query-based, and thus are suitable for deriving communities in an online manner. However, to the best of our knowledge, previous CS algorithms are not designed for profiled graphs. Early solutions (e.g., [8–10]) often only consider graph topology (e.g., a k-core is a community in which each vertex is connected to $k$ or more vertices). They did not consider the use of vertex labels. As pointed out in [11], the communities returned by those solutions are often huge (e.g., a community can easily contain over 1,000 vertices). Moreover, the vertices included in the communities were not closely related. Recent works, such as ACQ [11] and ATC [12], propose to use both graph structure and vertex label information. While these works have been shown to be more effective than CS solutions that do not utilize vertex labels, they did not employ the hierarchical relationship among labels (e.g., the P-trees in Fig. 1(a)). This may lead to suboptimal results. In Fig. 1(a), suppose that a renowned expert D wants to organize a seminar whose attending researchers are closely related to each other. Based on the ACQ solution [11], with $k=2$, only one 2-core is found (Fig. 2(b)), whose vertices {B, C, D} have several labels (i.e., CM, ML, AI) in common. However, ACQ fails to return the community in Fig. 2(c), whose vertices are also highly similar. For these two communities, the shared labels as well as their relationships in the P-tree are very different. Therefore, both communities can be presented to the organizer for further selection.

1. ACM CCS:

Profiled community search. In this paper, we study profiled community search (PCS), which aims to find profiled communities, or PCs, for a profiled graph. To obtain high-quality communities, we use structure cohesiveness and profile cohesiveness to constrain PCs. We adopt the widely used minimum degree metric [8, 25–29] to measure the structure cohesiveness. Note that in the PCS problem, the minimum degree metric can be replaced by other useful metrics, e.g., k-truss [10] and k-clique [22], to fit other possible application scenarios.
In a profiled graph, each vertex is associated with a P-tree. To measure the profile cohesiveness, we fully utilize the information in P-trees. Conceptually, a PC is a group of densely connected vertices whose P-trees have the largest degree of overlap. This overlapping part is the largest common subtree shared by all the vertices. Fig. 2(a) illustrates two PCs in the profiled graph of Fig. 1, namely {B, C, D} and {A, D, E}. Fig. 2(b) and Fig. 2(c) show the two PCs together with their largest common subtrees. For example, in Fig. 2(c), vertices A, D, and E all possess the subtree with the root and the nodes IS and DMS. Notice that these three vertices also form a 2-core containing D, and the common subtree among them is the largest. The common subtree reflects the theme of the community. In the PC of Fig. 2(b), all the researchers involved share an interest in machine learning and artificial intelligence, whereas in Fig. 2(c), the researchers are all interested in information systems and data management systems.

Personalization. The PCS problem allows a query user to search for communities that exhibit both structure cohesiveness and profile cohesiveness. The parameter $k$ controls how densely the community is connected. The profile cohesiveness constrains the community to be as semantically similar as possible. For instance, PCS methods can answer questions such as "Who are my close friends, such that we have strong connections and common interests and expertise?" In contrast, existing CD methods [30–32] often use global criteria (e.g., modularity), where the graph is partitioned a priori with no reference to the particular query vertices. Thus existing CD methods are not suitable for personalized queries.

Online search. Similar to other online CS approaches, our PCS method is able to find PCs in a large-scale profiled graph effectively and efficiently. In contrast, existing CD methods are generally slower on graph query problems, mainly because they are designed to retrieve all the communities of an entire graph.

Contributions.
As we will explain, a simple solution to the PCS problem is extremely expensive. To improve the efficiency of finding PCs (so that they can be used in online applications), we first introduce an anti-monotonicity property, which allows the candidates for a PC to be pruned efficiently. We further develop the CP-tree index, which systematically organizes the graph vertices and P-trees of a profiled graph. The CP-tree index enables the development of two fast PC discovery algorithms. We experimentally evaluate our solutions on two real large profiled graphs and two synthetic profiled graphs. Our results show that PCs are better representations of communities, and that the CP-tree based algorithms are up to four orders of magnitude faster than the basic solution.

Organization. We review the related work in Section 2. Section 3 presents the PCS problem and a basic solution. Section 4 discusses the CP-tree and its related solutions. We report the experimental results in Section 5, and conclude in Section 6.

2 RELATED WORK

In the literature, there are two kinds of work related to the retrieval of communities, namely community detection (CD) and community search (CS).

Community detection (CD) aims to obtain all the communities of a given graph. Earlier works [16, 33] use link-based analysis to obtain these communities. However, they do not consider the textual information associated with graphs. Recent works focus on attributed graphs and use techniques such as clustering to identify communities. However, these studies often assume that the attribute of a vertex is a set of keywords, and do not consider the hierarchical relationship among them. For example, Zhou et al. [20] used keywords to describe vertices and further computed the pairwise similarities of vertices to cluster the graph. Qi et al. [34] studied the problem of dynamically maintaining communities of moving objects using their trajectories. Ruan et al. [35] proposed a method called CODICIL.
Based on content similarity, CODICIL augments the original graph by creating new edges, and then uses an effective graph sampling to boost the efficiency of clustering. Another widely used approach is based on topic models [18, 36]. Essentially, these methods still analyze one-dimensional content to obtain the communities. The Link-PLSA-LDA [14] and Topic-Link LDA [37] models jointly model vertex links and content based on the LDA model. In [6], the communities are clustered based on probabilistic inference. In [38], information such as topics, interaction types and social connections is considered to explore the communities. CESNA [19] detects overlapping communities by assuming that communities generate both the links and the content. As we introduced before, CD solutions are typically time-consuming, and they may not be suitable for online applications that require fast retrieval of communities. It is also interesting to examine how our PCS solutions can be extended to support CD.

Community search (CS) returns the communities of a given graph vertex in a fast and online manner. Most existing CS solutions [8–10, 25, 26] only consider graph topologies, but not the labels associated with the vertices. To define the structure cohesiveness of the community, the minimum degree is often used [8, 25, 26]. Sozio et al. [8] proposed the first algorithm, Global, to find the k-ĉore containing the query vertex. Cui et al. [25] proposed Local, which uses local expansion techniques to improve Global. We will compare these two solutions in our experiments. Other definitions, such as k-clique [9], k-truss [10] and edge connectivity [39], have been considered for searching meaningful communities. Recent CS solutions, such as ACQ [11, 40] and ATC [12], make use of both vertex labels and graph structure to find communities.

Since CS is query-based, it is much more suitable for fast and online querying of communities on large-scale profiled graphs. However, none of the above works is designed for profiled graphs, and they do not consider the hierarchical relationship among vertex labels. Thus in this paper, we propose methods to solve the community search problem on profiled graphs. We have performed detailed experiments on real datasets (Section 5). As we will show, our algorithms yield better communities than state-of-the-art CS solutions do.

3 PROBLEM DEFINITION AND BASIC SOLUTION

In this section, we first formally introduce the PCS problem, and then give a basic solution to it. Table 1 lists all notations used in this paper.

TABLE 1. Notations and meanings.
Notation        Meaning
$G(V,E)$        A profiled graph with vertex set $V$ and edge set $E$
$n$             The number of vertices in $V$
$m$             The number of edges in $E$
$\deg_G(v)$     The degree of vertex $v$ in $G$
$T(v)$          The P-tree of vertex $v$
$M(G_q)$        The maximal common subtree of $G_q$
$G[T]$          The largest connected subgraph of $G$ s.t. $q \in G[T]$ and $\forall v \in G[T]$, $T \subseteq T(v)$
$G_k[T]$        The largest connected subgraph of $G[T]$ s.t. $q \in G_k[T]$ and $\forall v \in G_k[T]$, $\deg_{G_k}(v) \ge k$

3.1 The PCS Problem

A profiled community is a subgraph of $G$ that first satisfies structure cohesiveness (i.e., the vertices in this community are connected to each other in some way).
A formal definition will be introduced later. A common notion of structure cohesiveness is that the minimum degree of all the vertices in the community has to be at least $k$ [8, 25–29]. This is used in the k-core and the PC. Let us discuss the k-core first.

Definition 1 (k-core [27, 41]). Given an integer $k$ ($k \ge 0$), the k-core of $G$ is the largest subgraph of $G$ such that every vertex $v$ in the k-core has degree at least $k$.

Notice that the k-core may not be connected [27]. Its connected components, denoted by k-ĉores, are the communities retrieved by k-core search algorithms. We use Example 1 to illustrate this.

Example 1. In Figure 2(a), each dashed circle represents a 2-core and also a 2-ĉore. Vertices {A, B, D, E} form a 3-ĉore, and vertices {A, B, C, D, E} form a 2-ĉore because C only has a degree of 2, even though the other vertices have a higher degree.

A profiled graph $G(V,E)$ is an undirected graph with vertex set $V$ and edge set $E$. Each vertex $v \in V$ is associated with a profiled tree (P-tree) that describes $v$'s hierarchical attributes.

Definition 2 (P-tree). The P-tree of vertex $q$, denoted by $T(q) = (V_{T(q)}, E_{T(q)})$, is a rooted ordered tree, where $V_{T(q)}$ is the set of attribute labels and $E_{T(q)}$ is the set of edges between labels. A P-tree satisfies the following constraints: (1) there is only one root node $r \in V_{T(q)}$; (2) every $(x,y) \in E_{T(q)}$ is directed, and $y$ is the child attribute label of $x$; and (3) for every $y \in V_{T(q)}$ with $y \ne r$, there is one and only one $x \in V_{T(q)}$ s.t. $(x,y) \in E_{T(q)}$.

In practice, labels in the upper levels of the P-tree are semantically more general than those in lower levels. All edges in $E_{T(q)}$ preserve the semantic relationships among the labels in $V_{T(q)}$.

Definition 3 (induced rooted subtree). Given two P-trees $S = (V_S, E_S)$ and $T = (V_T, E_T)$, $S$ is an induced rooted subtree of $T$, denoted by $S \subseteq T$, if $V_S \subseteq V_T$ and $E_S \subseteq E_T$.

Essentially, an induced rooted subtree defines an inclusion relationship between two P-trees. Unless otherwise specified, we use subtree to mean induced rooted subtree.
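To make Definition 1 and Example 1 concrete, the following Python sketch peels low-degree vertices and then returns the connected component that contains the query vertex, i.e., the k-ĉore of $q$. The edge list is an assumption chosen only to be consistent with Example 1 (the exact edges of Fig. 1(a) are not recoverable from the text), and the names `build_graph` and `k_core_component` are hypothetical.

```python
from collections import deque

def build_graph(edges):
    """Build an undirected adjacency map: vertex -> set of neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def k_core_component(adj, q, k):
    """Vertex set of the connected k-core component containing q (empty if none)."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    removed = set()
    queue = deque(v for v in adj if degree[v] < k)
    # Peeling: repeatedly delete vertices whose remaining degree is below k.
    while queue:
        v = queue.popleft()
        if v in removed:
            continue
        removed.add(v)
        for u in adj[v]:
            if u not in removed:
                degree[u] -= 1
                if degree[u] < k:
                    queue.append(u)
    if q not in adj or q in removed:
        return set()
    # Breadth-first search from q among the surviving vertices.
    component, frontier = {q}, deque([q])
    while frontier:
        v = frontier.popleft()
        for u in adj[v]:
            if u not in removed and u not in component:
                component.add(u)
                frontier.append(u)
    return component

# Assumed toy graph: A, B, D, E are pairwise linked, C links to B and D,
# and F, G, H form a separate triangle.
EDGES = [("A", "B"), ("A", "D"), ("A", "E"), ("B", "D"), ("B", "E"), ("D", "E"),
         ("B", "C"), ("C", "D"), ("F", "G"), ("G", "H"), ("F", "H")]
GRAPH = build_graph(EDGES)
```

On this assumed graph, `k_core_component(GRAPH, "A", 3)` keeps only the four mutually linked vertices, while lowering $k$ to 2 also admits C, mirroring the two ĉores discussed in Example 1.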
We call the union of the P-trees of all vertices the Global P-tree (GP-tree), which usually corresponds to a taxonomy system in practice.

Definition 4 (maximal common subtree). Given a profiled graph $G$, the maximal common subtree of $G$, denoted by $M(G)$, has the following properties: (1) $\forall v \in G$, $M(G) \subseteq T(v)$; (2) there exists no other common subtree $M'(G)$ such that $M(G) \subset M'(G)$.

The common subtree depicts the common hierarchical part among all P-trees in a subgraph. We use the maximal structure $M(G)$ to consider both the high-level and low-level labels, so that it fully captures the common features of this subgraph. As a result, by using the maximal common subtree, we can maximize the vertices' common profiles, including the topology and semantics of users' profiles. Next, we formally introduce the PCS problem.

Problem 1 (PCS). Given a profiled graph $G(V,E)$, a positive integer $k$, and a query node $q \in G$, find a set $\mathcal{G}$ of graphs, such that $\forall G_q \in \mathcal{G}$, the following properties hold:
Connectivity. $G_q \subseteq G$ is connected and contains $q$;
Structure cohesiveness. $\forall v \in G_q$, $\deg_{G_q}(v) \ge k$, where $\deg_{G_q}(v)$ denotes the degree of $v$ in $G_q$;
Profile cohesiveness. There exists no other $G'_q \subseteq G$ satisfying the above two constraints, such that $M(G_q) \subset M(G'_q)$;
Maximal structure. There exists no other $G'_q$ satisfying the above properties, such that $G_q \subset G'_q$ and $M(G'_q) = M(G_q)$.

Essentially, a profiled community (PC) is a subgraph of $G$ in which vertices are closely related in both structure and semantics. In Problem 1, the first two properties and the last property ensure the structure cohesiveness, as shown in the literature [26, 40]. The profile cohesiveness property, which is unique to PCS, captures the maximal shared profile among all the vertices of $G_q$. Moreover, since the shared subtree $M(G_q)$ shows the common hierarchical attributes, it can well explain the semantic theme of the community.

3.2 A Basic Solution

Since the vertices in a PC share a common subtree of the P-tree of the query vertex $q$, a straightforward method is to enumerate all the subtrees of $q$'s P-tree and find the corresponding PCs. However, as illustrated in Lemma 1, the search space may be exponentially large, and the computation overhead renders this method impractical.

Lemma 1. The maximum number of subtrees of a P-tree with $x$ nodes is $2^{x-1}+1$.

Fig. 3. A P-tree with $x$ nodes. (a) a special case (root with children $p_1, p_2, \ldots, p_{x-1}$); (b) a general case (a left part with $i$ nodes and a right part with $x-i$ nodes).

Proof. Let $f(x) = \max\{L \in \mathbb{N} \mid L \text{ is the number of subtrees of a tree with } x \text{ nodes}\}$. As shown in Fig. 3(a), $p_i$ denotes the $i$th child of the root of the P-tree. It is not hard to see that there are $2^{x-1}+1$ subtrees, including the empty tree (which contains no P-tree node). So $f(x) \ge 2^{x-1}+1$. In this case, we do not need to worry about the parent-child relationship between P-tree nodes, so $2^{x-1}+1$ is also the upper bound of $f(x)$. We can then infer that $f(x) = 2^{x-1}+1$.

More formally, we can verify the correctness of this formula. As shown in Fig. 3(b), the left triangle (including the root $r$) denotes the subtree with $i$ nodes and the right one represents the subtree with $x-i$ nodes. We present Equation 1 below. Note that the empty tree should be included and thus $f(0) = 1$. We can construct different subtrees by combining subtrees in the left and right parts, so $f(x)$ can be computed from $f(i)$ and $f(x-i)$. Note that the empty tree must not be counted in the left and right parts simultaneously. Finally, we add 1 to $f(x)$ to represent the empty tree.

$$
f(x) =
\begin{cases}
1, & x = 0 \\
\max_{i=0}^{x} \{ f(i)\,[f(x-i) - 1] \} + 1, & x \ge 1,\ x \in \mathbb{N}
\end{cases}
\tag{1}
$$

Now we can directly verify that $f(x) = 2^{x-1}+1$ satisfies the equation: for $1 \le i \le x-1$, the term $f(i)\,[f(x-i)-1] + 1$ equals $2^{x-2} + 2^{x-i-1} + 1 \le 2^{x-1}+1$, with equality at $i = 1$. This completes the proof.

To alleviate this issue, we iteratively perform the following two steps.

Step 1: candidate subtree generation. To generate the candidate subtrees, the key problem is how to avoid redundancy in the subtree enumeration. In [42], Asai et al. introduced a tree pattern enumeration strategy, which is based on the following two concepts: (1) the rightmost leaf is the last P-tree node according to the depth-first traversal order; (2) the rightmost path is the path from the root node to the rightmost leaf.
Given a tree $T'$, a new subtree $T$ can only be generated by adding a new node $t$ to $T'$ such that the following hold: (1) $t$'s parent node is on the rightmost path of $T'$; (2) $t$ is the rightmost leaf of $T$. As shown in [42], this generation strategy guarantees that all the subtrees of the P-tree will be enumerated without repetition. Thus, we follow this strategy to generate the candidate subtrees.

Step 2: community verification. After a candidate subtree $T$ has been generated, we verify the existence of the corresponding community. We use $G_k[T]$ to represent the largest connected subgraph of $G$ containing $q$ in which each vertex has at least $k$ neighbors and contains the subtree $T$. We say that $T$ is feasible if $G_k[T]$ exists (i.e., $G_k[T] \ne \emptyset$). The verification step is mainly based on the following results.

Proposition 1. Given a profiled graph $G$, two P-trees $T$, $T'$ and the query vertex $q$, if $T' \subseteq T$, then $G_k[T] \subseteq G_k[T']$.

Proof. As we defined before, $G_k[T]$ denotes the k-ĉore containing $q$ in which each vertex contains the subtree $T$. (1) If $G_k[T] = \emptyset$, $G_k[T] \subseteq G_k[T']$ always holds. (2) If $G_k[T] \ne \emptyset$, we have $\forall v \in G_k[T]$, $T \subseteq T(v)$. Then from $T' \subseteq T$, we can infer $\forall v \in G_k[T]$, $T' \subseteq T(v)$. This means that each vertex $v \in G_k[T]$ also contains the P-tree $T'$. Thus, if $G_k[T] \ne \emptyset$, $G_k[T] \subseteq G_k[T']$. In summary, Proposition 1 holds.

Lemma 2 (Anti-monotonicity). Given a subtree $T$, if $G_k[T] \ne \emptyset$, then $\forall T' \subseteq T$, $G_k[T'] \ne \emptyset$.

Proof. From Proposition 1, we know that $\forall T' \subseteq T$, $G_k[T] \subseteq G_k[T']$. Since $G_k[T] \ne \emptyset$, we have $\forall T' \subseteq T$, $G_k[T'] \ne \emptyset$.

By Lemma 2, we can conclude that if $T$ is infeasible, then we can stop generating subtrees from $T$. The basic method begins by generating a subtree from the root node. Then, it iteratively performs the two steps above to retrieve all the feasible $G_k[T]$s, until no larger subtrees can be generated. The pseudocode of basic is given in Algorithm 1.

Complexity analysis. Let $m$ be the number of edges in $G$. In the worst case, all edges are traversed to compute each $G_k[T]$, and all the subtrees are verified. As a result, basic completes in $O(2^{|T(q)|} \cdot m)$ time, where $|T(q)|$ denotes the number of nodes of $T(q)$.
In practice, the value of $2^{|T(q)|}$ could be exponentially large, and this makes basic impractical. To alleviate this issue, we propose more efficient index-based solutions in the next section.

Algorithm 1 presents basic. We first initialize the result set $\mathcal{G}$ and load $q$'s P-tree $T(q)$ (line 2). Then we compute $G_k$, the largest connected subgraph of $G$ containing $q$ in which each vertex has degree at least $k$ (line 3). In each iteration, we generate new subtrees from the current subtree $T'$. For each new subtree $T$, we verify the existence of $G_k[T]$ (lines 4-10). If $G_k[T]$ exists, we push $T$ onto $\Psi$ (lines 11-12); otherwise, if no subtree can be generated from $T'$ or all subtrees generated from $T'$ are infeasible, we add $G_k[T']$ to $\mathcal{G}$ if $T'$ is maximal (lines 13-14). Finally, all PCs are returned (line 15).

Algorithm 1 basic query algorithm
 1: function QUERY($G$, $q$, $k$)
 2:   $\mathcal{G} \leftarrow \emptyset$, load $T(q)$ from $G$;
 3:   compute $G_k$ from $G$;
 4:   if $G_k \ne \emptyset$ then
 5:     $\Psi \leftarrow$ GENERATESUBTREE($\emptyset$, $T(q)$);
 6:     while $\Psi \ne \emptyset$ do
 7:       $T' \leftarrow \Psi$.pop(); flag $\leftarrow$ true;
 8:       $\Phi \leftarrow$ GENERATESUBTREE($T'$, $T(q)$);
 9:       for each $T \in \Phi$ do
10:         compute $G_k[T]$ from $G_k$;
11:         if $G_k[T] \ne \emptyset$ then
12:           flag $\leftarrow$ false; $\Psi$.push($T$);
13:       if flag = true and $T'$ is maximal then
14:         $\mathcal{G} = \mathcal{G} \cup G_k[T']$;
15:   return $\mathcal{G}$;
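As a rough illustration of Algorithm 1, the Python sketch below enumerates subtrees of $T(q)$ by rightmost extension, verifies each candidate by computing $G_k[T]$ from scratch, and prunes with Lemma 2. It is a sketch under stated assumptions rather than the authors' implementation: the toy graph and profiles only mimic Fig. 1 and Fig. 2, all function names are hypothetical, a subtree is represented by its label set, and maximality is checked by a final filter instead of inside the loop.

```python
from collections import deque

def k_hat_core(adj, q, k, allowed):
    """k-ĉore containing q in the subgraph induced by `allowed` (simple peeling)."""
    alive = set(allowed)
    while True:
        low = {v for v in alive if len(adj[v] & alive) < k}
        if not low:
            break
        alive -= low
    if q not in alive:
        return frozenset()
    comp, frontier = {q}, deque([q])
    while frontier:
        v = frontier.popleft()
        for u in adj[v] & alive:
            if u not in comp:
                comp.add(u)
                frontier.append(u)
    return frozenset(comp)

def community(adj, profiles, q, k, subtree):
    """G_k[T]: restrict to vertices whose label set contains every label of T."""
    allowed = {v for v in adj if subtree <= profiles[v]}
    return k_hat_core(adj, q, k, allowed)

def generate_subtrees(subtree, order, parent):
    """Rightmost extension: add a node that comes later in preorder
    and whose parent already belongs to the subtree."""
    last = max((order.index(x) for x in subtree), default=-1)
    for x in order[last + 1:]:
        if parent[x] in subtree or (parent[x] is None and not subtree):
            yield subtree | {x}

def basic_query(adj, profiles, order, parent, q, k):
    """Return {maximal feasible subtree: community} for query vertex q."""
    if not k_hat_core(adj, q, k, adj.keys()):
        return {}
    feasible = {}
    stack = list(generate_subtrees(frozenset(), order, parent))
    while stack:
        t = stack.pop()
        g = community(adj, profiles, q, k, t)
        if g:  # Lemma 2: only feasible subtrees are worth extending
            feasible[t] = g
            stack.extend(generate_subtrees(t, order, parent))
    # Keep only subtrees that are not strictly contained in another feasible one.
    return {t: g for t, g in feasible.items()
            if not any(t < other for other in feasible)}

# Assumed toy data (not the paper's dataset).
EDGES = ["AB", "AD", "AE", "BD", "BE", "DE", "BC", "CD", "FG", "GH", "FH"]
ADJ = {}
for u, v in EDGES:
    ADJ.setdefault(u, set()).add(v)
    ADJ.setdefault(v, set()).add(u)
PARENT = {"CCS": None, "CM": "CCS", "ML": "CM", "AI": "CM", "IS": "CCS", "DMS": "IS"}
ORDER = ["CCS", "CM", "ML", "AI", "IS", "DMS"]  # preorder of T(D)
PROFILES = {"A": {"CCS", "IS", "DMS", "HW"}, "B": {"CCS", "CM", "ML", "AI"},
            "C": {"CCS", "CM", "ML", "AI"}, "D": set(ORDER),
            "E": {"CCS", "IS", "DMS"}, "F": {"CCS", "HW"},
            "G": {"CCS", "HW"}, "H": {"CCS", "HW"}}
```

Calling `basic_query(ADJ, PROFILES, ORDER, PARENT, "D", 2)` on this assumed input yields two maximal subtrees, one over CM, ML, AI with community {B, C, D} and one over IS, DMS with community {A, D, E}. Every candidate is verified against the whole graph, which is exactly the cost that the index-based methods aim to reduce.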

4 INDEX-BASED SOLUTIONS

We first introduce some preliminaries and the proposed CP-tree index, and then discuss the index-based query algorithms.

4.1 k-core and CL-tree

k-core. In line with existing CS work [11, 26], we use the k-core to satisfy the minimum degree and maximal structure constraints of a PC. Given an integer $k$ ($k \ge 0$), the k-core of $G$, denoted by $G_k$, is the largest subgraph of $G$ such that $\forall v \in G_k$, $\deg_{G_k}(v) \ge k$. Since $G_k$ may be disconnected, we use k-ĉore to denote one of its connected components. An important property of k-cores is that they are nested: given two integers $i$ and $j$, the $j$-core is contained in the $i$-core if $i < j$. In Fig. 4(a), the 0-core represents the whole graph, and the 3-core is nested in the 2-core. Computing all the k-cores of a graph $G$, known as core decomposition, can be completed by an $O(m)$ algorithm [27], where $m$ is the number of edges in $G$.

CL-tree. Since k-cores are nested, all the k-cores of a graph can be organized into a tree structure, called the CL-tree [11]. In this paper, we adopt it, but skip the labels on the tree. The CL-tree of the graph in Fig. 4(a) is shown in Fig. 4(b). Clearly, the vertices in each CL-tree node, together with the vertices in all its descendant nodes, represent a k-ĉore. For example, vertex C and the vertices {A, B, D, E} in its child node compose a 2-ĉore. Since each vertex appears only once, the space cost of the CL-tree is $O(n)$, where $n$ is the number of vertices in $G$. In addition, we maintain a map vertexNodeMap, where the key is a vertex and the value is the CL-tree node containing it; this allows us to locate the k-ĉore containing any query vertex efficiently.

Fig. 4. k-cores, CL-tree. (a) k-cores; (b) CL-tree, with nodes 0:#, 2:C, 3:ABDE and 2:FGH.

4.2 CP-tree Index

Index Overview. We build the Core Profiled tree (CP-tree) index by considering both the P-tree structure and k-cores. We depict an example CP-tree in Fig. 5 using the profiled graph in Fig. 1(a).

Fig. 5. CP-tree index.
Each CP-tree node corresponds to a label and stores the k-ĉores sharing this label. To summarize, each node $p$ consists of the following four elements: (1) label: the attribute label; (2) parentNode: the parent node of $p$; (3) childList: a list of child CP-tree nodes of $p$; and (4) vertexNodeMap: a map that stores the CL-tree. In addition, we maintain a map headMap, where the key is a vertex $v$, and the value is a list of CP-tree nodes, each of which corresponds to a leaf node of $v$'s P-tree. The main advantages of the CP-tree are listed below.

Restoring P-trees. By utilizing the headMap, each vertex's P-tree can be restored by traversing from its leaf nodes up to the root node.

Locating k-ĉores. Given an integer $k$, a query vertex $q$ and a CP-tree node $t$, we use vertexNodeMap to design a function get($k$, $q$, $t$) that returns, in constant time, the k-ĉore containing $q$ in which each vertex contains the label $t$.label.

Query efficiency. As discussed above, the label information of each vertex's P-tree can be efficiently accessed using the headMap.

Index Construction. We incrementally create CP-tree nodes and then link them up to build the CP-tree index. The pseudocode of CP-tree index construction is presented in Algorithm 2. For each vertex $v$, we read $T(v)$ and create new CP-tree nodes (lines 2-5). For each CP-tree node $t$, we add $v$ to $t$ for later CL-tree construction (lines 6, 9). If the P-tree node $x$ is a leaf node, we update headMap (line 7). Then we link up all CP-tree nodes according to the GP-tree structure. Note that if the GP-tree is unknown, we can build it while reading the P-trees in the previous step (line 10). Finally, $I$ is returned (line 11).
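The construction just described can be sketched in Python as follows. This is only a minimal illustration under assumptions: each P-tree is given as a hypothetical `label -> parent label` map, the per-node CL-tree is replaced by a plain vertex set (so the constant-time get function is not reproduced), and the names `CPNode`, `build_index` and `restore_ptree` are invented for the example.

```python
class CPNode:
    """One CP-tree node; the CL-tree is simplified to a set of vertices."""
    def __init__(self, label):
        self.label = label
        self.parent = None      # plays the role of parentNode
        self.children = []      # plays the role of childList
        self.vertices = set()   # vertices whose P-tree contains this label

def build_index(ptrees):
    """ptrees: vertex -> {label: parent label, or None for the root}.
    Returns (nodes, head_map): label -> CPNode, and vertex -> list of leaf CPNodes."""
    nodes, head_map = {}, {}
    for v, tree in ptrees.items():
        inner = {p for p in tree.values() if p is not None}
        for label in tree:
            node = nodes.get(label)
            if node is None:                  # create the CP-tree node on first use
                node = nodes[label] = CPNode(label)
            node.vertices.add(v)
            if label not in inner:            # label is a leaf of T(v)
                head_map.setdefault(v, []).append(node)
    # Link nodes by unifying the parent pointers seen in the individual P-trees.
    for tree in ptrees.values():
        for label, parent in tree.items():
            node = nodes[label]
            if parent is not None and node.parent is None:
                node.parent = nodes[parent]
                nodes[parent].children.append(node)
    return nodes, head_map

def restore_ptree(v, head_map):
    """Recover the label set of T(v) by walking from each leaf up to the root."""
    labels = set()
    for leaf in head_map[v]:
        node = leaf
        while node is not None and node.label not in labels:
            labels.add(node.label)
            node = node.parent
    return labels

# Assumed toy P-trees loosely modelled on Fig. 1.
PTREES = {
    "B": {"CCS": None, "CM": "CCS", "ML": "CM", "AI": "CM"},
    "D": {"CCS": None, "CM": "CCS", "ML": "CM", "AI": "CM", "IS": "CCS", "DMS": "IS"},
    "A": {"CCS": None, "IS": "CCS", "DMS": "IS", "HW": "CCS"},
}
NODES, HEAD_MAP = build_index(PTREES)
```

In this sketch `restore_ptree("A", HEAD_MAP)` walks up from the two leaves stored for A and recovers all four of its labels, which is the "restoring P-trees" use of headMap mentioned above.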
Algorithm 2 CP-tree index construction
 1: function BUILDINDEX($G(V,E)$)
 2:   for each $v \in V$ do
 3:     for each $x \in T(v)$ do
 4:       $t \leftarrow$ a CP-tree node in $I$ such that $t$.label = $x$.label;
 5:       if $t$ = null then create a CP-tree node $t$ and add it in $I$;
 6:       add $v$ in $t$;
 7:       if $x$ is the leaf node of $T(v)$ then headMap.put($v$, $t$);
 8:   for each $t \in I$ do
 9:     build CL-tree for the subgraph of $t$;
10:     link $t$ to its parent and child nodes;
11:   return $I$;

Complexity analysis. Lines 2-7 take linear time. The time complexity of building a CL-tree is $O(m \cdot \alpha(n))$ [11, 40], where $m$ is the number of edges in $G$ and $\alpha(n)$, the inverse Ackermann function, is less than 5 for large values of $n$. Thus the time complexity of building the CP-tree is $O(|P| \cdot m \cdot \alpha(n))$, which is linear in the size of $G$. The space cost of the CP-tree is $O(|P| \cdot n)$, where $|P|$ denotes the number of labels in $G$. The space cost of the headMap is $O(\hat{l} \cdot n)$, where $\hat{l}$ denotes the average number of leaf nodes in each vertex's P-tree and $\hat{l} < |P|$. Therefore, the total space complexity is $O(|P| \cdot n)$, which is linear in the size of $G$.

4.3 Index-based Query Algorithms

Now we present our index-based query solutions. The first one follows the framework of basic, and it incrementally generates and verifies the subtrees of the P-tree (from smaller subtrees to larger ones). Thus we call it incre. The advanced methods borrow some ideas from MARGIN [43], an algorithm for mining maximal frequent subgraphs. As we will explain later, the advanced methods can find all PCs by examining a small fraction of the subtrees, resulting in high efficiency. Their time complexities are still $O(2^{|T(q)|} \cdot m)$, because in the worst case all the subtrees are verified. However, as we will show in Section 5.4, in practice they are much more efficient than such worst-case time complexities suggest.

4.3.1 The Method incre

We begin with an interesting lemma, which greatly accelerates the verification step.

Lemma 3. Given a CP-tree index $I$, a subtree $T'$ and a new subtree $T$ which is generated from $T'$ by adding a new P-tree node, we have $G_k[T] \subseteq G_k[T'] \cap I.\mathrm{get}(k, q, T \setminus T')$, where $T \setminus T'$ denotes the newly added node.

Proof. $T = T' \cup t$, so we have $T' \subseteq T$. Based on Proposition 1, we know $G_k[T] \subseteq G_k[T']$. Similarly, $t \subseteq T$, so we have $G_k[T] \subseteq I.\mathrm{get}(k, q, T \setminus T')$, where $I.\mathrm{get}(k, q, T \setminus T')$ is the k-ĉore containing the query vertex $q$ in which each vertex contains the P-tree node $T \setminus T'$. Hence $G_k[T] \subseteq G_k[T'] \cap I.\mathrm{get}(k, q, T \setminus T')$.

As incre searches for communities only within the subgraphs found in the previous iteration, the query efficiency is improved. We present incre in Algorithm 3.

Algorithm 3 incre query algorithm
 1: function QUERY($I$, $q$, $k$)
 2:   restore $T(q)$ using $I$.headMap;
 3:   $\mathcal{G} \leftarrow \emptyset$, $\Psi \leftarrow$ GENERATESUBTREE($\emptyset$, $T(q)$);
 4:   while $\Psi \ne \emptyset$ do
 5:     $T' \leftarrow \Psi$.pop(); flag $\leftarrow$ true;
 6:     $\Phi \leftarrow$ GENERATESUBTREE($T'$, $T(q)$);
 7:     for each $T \in \Phi$ do
 8:       compute $G_k[T]$ from $G_k[T'] \cap I.\mathrm{get}(k, q, T \setminus T')$;
 9:       if $G_k[T] \ne \emptyset$ then
10:         flag $\leftarrow$ false; $\Psi$.push($T$);
11:     if flag = true and $T'$ is maximal then
12:       $\mathcal{G} = \mathcal{G} \cup G_k[T']$;
13:   return $\mathcal{G}$;

We first use headMap to locate the leaf nodes of $T(q)$ and then restore $T(q)$ (line 2). We initialize $\Psi$ by using $T(q)$ (line 3). In each iteration, we generate new subtrees from the current subtree $T'$. For each new subtree $T$, we verify the existence of $G_k[T]$ using the index (lines 4-8). If $G_k[T]$ exists, we push $T$ onto $\Psi$ (lines 9-10); otherwise, if no subtree can be generated from $T'$ or all subtrees generated from $T'$ are infeasible, we add $G_k[T']$ to $\mathcal{G}$ if $T'$ is maximal (lines 11-12). Finally, all PCs are returned (line 13).

4.3.2 The Advanced Methods

The method incre follows the Apriori-based approach, which explores all possible subtrees by traversing the search space from smaller subtrees to larger ones. However, as demonstrated in Section 5.1, the maximal feasible subtrees often lie in the middle of the search space, which implies that most of the exploration may be avoided. Based on this observation, we adapt MARGIN [43] to tackle PCS.
MARGIN: It does not perform a bottom-up (or top-down) traversal of the search space; instead, it narrows the search space by examining only subgraphs that lie on the border between frequent and infrequent subgraphs. It first finds an initial pair of graphs (CR, R) where R is frequent and CR is not. In addition, CR is a child subgraph of R (i.e., CR is a super-graph of R and they differ by exactly one edge). Similarly, R is a parent subgraph of CR. (CR, R) is called a cut, and from this cut, MARGIN expands and finds all other cuts by adding or deleting an edge to obtain new adjacent subgraphs. MARGIN defines this function as expandCut, and Thomas et al. [43] have proved that expandCut is able to find all maximal frequent subgraphs. Inspired by MARGIN, we design the following functions.

1. Function expandPtree. This function is adapted from expandCut [43], and the main modifications are as follows.
We dynamically obtain child subgraphs and parent subgraphs, which are called child subtrees and parent subtrees in our case, using the parentNodes and childLists of CP-tree nodes, instead of pre-computing all subtrees in the search space as MARGIN does.
We define a pair of P-trees (IF, F) as a cut, where IF is a child subtree of F, and F is feasible while IF is not.
We dynamically verify whether a feasible subtree is maximal.
We develop a function verifyPtree to verify the feasibility.

Algorithm 4 expandPtree
 1: function EXPANDPTREE(IF, F, $\mathcal{G}$)
 2:   if IF = $\emptyset$ and F $\ne \emptyset$ then update $\mathcal{G}$;
 3:   else
 4:     $Q \leftarrow \emptyset$; $Q$.push((IF, F));
 5:     while $Q \ne \emptyset$ do
 6:       (IF, F) $\leftarrow Q$.pop();
 7:       for each parent $Y_i$ of IF do
 8:         if $Y_i$ is feasible then
 9:           update $\mathcal{G}$ if $Y_i$ is maximal;
10:           for each child $K$ of $Y_i$ do
11:             if $K$ is infeasible then $Q$.push(($K$, $Y_i$));
12:             if $K$ is feasible then
13:               find common child $C$ of $K$ and IF;
14:               $Q$.push(($C$, $K$));
15:         else
16:           for each parent $K$ of $Y_i$ do
17:             if $K$ is feasible then $Q$.push(($Y_i$, $K$));
18:   return $\mathcal{G}$;

We now illustrate expandPtree in Algorithm 4. As we will introduce later, if IF = $\emptyset$ and F $\ne \emptyset$, we can directly update $\mathcal{G}$ because F is already the maximal common subtree (line 2).
Otherwise, we first use (IF, F) to initialize the queue $Q$ (line 4). Then, for each pair, we iteratively verify its adjacent pairs (lines 5-17). If the parent subtree $Y_i$ of IF is feasible, $G_k[Y_i]$ may not yet be the final result. This is because subtrees are not enumerated in a regular order, so $Y_i$ may be only temporarily maximal, and we need to verify it repeatedly. If there exist other feasible subtrees verified in previous steps that are subtrees of $Y_i$, we need to replace their corresponding subgraphs with $G_k[Y_i]$ (line 9). Finally, we return $\mathcal{G}$ (line 18).

Lemma 4. Given a P-tree pair (IF, F), expandPtree can find all feasible subtrees for a PCS query.

The proof of Lemma 4 is based on the following preliminaries.

Fig. 6. The lattice and the Upper-◇-Property [43]. (a) lattice of a graph with edges a, b, c; (b) Upper-◇-Property, relating P, $C_i$, $C_j$ and A through edges $e_1$ and $e_2$.

The lattice is essentially a pre-processed data structure in which all possible subgraphs of a given graph are enumerated. Taking the graph in Fig. 6(a) as an example, the subgraphs in each level have the same size (i.e., number of edges). The bottom level (level 0) corresponds to the empty graph, and level $i$ lists all size-$i$ subgraphs. In the lattice, each subgraph is linked to its parents (i.e., subgraphs of this graph that differ from it by exactly one edge) and its children (i.e., super-graphs of this graph that differ from it by exactly one edge). We can observe that the P-tree can directly replace the graph to construct the lattice.
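When P-trees replace graphs in the lattice, the neighbors of a subtree can be produced on demand rather than precomputed. The Python sketch below illustrates this under assumptions of its own: a subtree is a frozenset of labels, `parent` is a hypothetical map describing $T(q)$, and the function names are invented. Following the convention above, children are one node larger and parents are one node smaller.

```python
def children(subtree, parent):
    """Lattice children: subtrees obtained by adding one node whose parent
    is already present (or the root, when the subtree is empty)."""
    return [subtree | {x} for x in parent
            if x not in subtree
            and (parent[x] in subtree or (parent[x] is None and not subtree))]

def parents(subtree, parent):
    """Lattice parents: subtrees obtained by deleting one leaf of the subtree."""
    inner = {parent[x] for x in subtree}       # nodes that have a child in the subtree
    return [subtree - {x} for x in subtree if x not in inner]

# Assumed P-tree of the query vertex: label -> parent label (None for the root).
PARENT = {"CCS": None, "CM": "CCS", "ML": "CM", "AI": "CM", "IS": "CCS", "DMS": "IS"}
T = frozenset({"CCS", "CM"})
```

For the assumed tree, `children(T, PARENT)` extends {CCS, CM} by ML, AI or IS, and `parents(T, PARENT)` returns only {CCS}. The union of any two children of `T` is itself a child of both, which is the diamond-shaped relation pictured in Fig. 6(b).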

Property 1 (Upper-◇-Property [43]). Any two child subgraphs C_i, C_j of a graph P have a common child subgraph A.

In Property 1, C_i, C_j, P and A are four subgraphs. C_i and C_j are two child subgraphs of P (i.e., each differs from P by exactly one edge, e_1 and e_2 respectively). Then there must exist a subgraph A such that A is a child subgraph of both C_i and C_j. Property 1 is very intuitive in graphs. Based on Proposition 2, the Upper-◇-Property can be simply adapted to fit P-tree models.

Proposition 2. P-trees satisfy the Upper-◇-Property.

Proof. In P-trees, e_1 and e_2 can be two P-tree nodes such that the subtrees are C_i = P ∪ e_1 and C_j = P ∪ e_2. There must exist a P-tree A = P ∪ e_1 ∪ e_2 = (P ∪ e_1) ∪ e_2 = (P ∪ e_2) ∪ e_1. Thus A = C_i ∪ e_2 = C_j ∪ e_1, which means A is the common child subtree of C_i and C_j.

Now we formally give the proof of Lemma 4.

Proof. The method expandPtree is mainly adapted from MARGIN. As mentioned in MARGIN, the correctness holds when the adapted problem satisfies the following constraints [43]:
(1) The search space is a subset of the lattice.
(2) The Upper-◇-Property holds.
(3) The anti-monotone property is satisfied.
(4) A candidate set can be defined which is a boundary set such that every element in the set satisfies a given user-constraint and there exists an immediate child in the lattice that does not satisfy the constraint because of the anti-monotone property. For every element in the set, there exists an immediate parent that does not satisfy the constraint for the monotone property.
(5) Solution sets can be generated from the candidate sets.

For the PCS problem, the element in constraint (1) is the P-tree, and constraint (1) is clearly satisfied. Proposition 2 has proved that constraint (2) is satisfied. The anti-monotonicity property has been proved in Lemma 2 and thus constraint (3) is also satisfied. In MARGIN, the user-constraint of constraint (4) is whether, given a threshold, a graph is frequent or not. Here, for constraint (4), the user-constraint is whether a P-tree is feasible.
For instance, a P-tree T is feasible, which means G_k[T] exists. If T', a child of T, is not feasible (i.e., G_k[T'] does not exist), then T can be placed in this boundary set, since its immediate child T' does not satisfy the user-constraint because of the anti-monotone property. Hence constraint (4) holds. Once an element is added to the candidate set, we need to verify whether it is maximal. This means the solution set is a subset of the candidate set. Thus constraint (5) is satisfied. In conclusion, Lemma 4 holds.

2. Function verifyPtree. Given a subtree T, let T_child and T_parent denote a child subtree and a parent subtree of T. Let l denote the number of leaf nodes of T_parent and let t_{n_i} represent the i-th leaf node of T_parent. Derived from Lemma 3, we have

\[ G_k[T_{child}] \subseteq G_k[T] \cap I.get(k, q, T_{child} \setminus T), \qquad G_k[T_{parent}] \subseteq \bigcap_{i=1}^{l} I.get(k, q, t_{n_i}). \]

Since all P-trees are subtrees of the GP-tree, if a P-tree has the attribute t, then the parent attribute t' of t is also included. Thus, I.get(k, q, t) ⊆ I.get(k, q, t'). For a special subtree T_i (a path from the leaf node t_{n_i} to the root node), we finally get G_k[T_i] = I.get(k, q, t_{n_i}). Note that T_parent can be seen as several such paths, and thus we obtain the bound on G_k[T_parent] above. Based on the CP-tree, verifyPtree can efficiently verify subtrees.

Next we discuss three methods to find the initial cut.

3. Function find-i. We can adapt incre to find the initial cut. As shown in Algorithm 5, we incrementally enumerate subtrees and verify the existence of the corresponding communities. Once we find a subtree which is feasible while its child subtree is not, we can regard them as an initial cut (lines 2-15).
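The candidate bound used by verifyPtree can be illustrated with a small sketch. It rests on assumptions that are not spelled out in this section: `index[t]` stands in for the vertex set returned by I.get(k, q, t), G_k[T] is taken to be the connected k-core containing q inside the intersected candidate set, and the names `k_core_with`, `verify_ptree`, `ADJ` and `INDEX` are hypothetical. It is not the authors' CP-tree implementation.

```python
from collections import deque
from typing import Dict, Hashable, Iterable, Set

Vertex = Hashable


def k_core_with(adj: Dict[Vertex, Set[Vertex]], cand: Set[Vertex],
                q: Vertex, k: int) -> Set[Vertex]:
    """Connected k-core of the subgraph induced by `cand` that contains q
    (empty set if q does not survive the peeling)."""
    alive = set(cand)
    deg = {v: len(adj[v] & alive) for v in alive}
    queue = deque(v for v in alive if k > deg[v])
    while queue:  # repeatedly peel vertices whose degree is below k
        v = queue.popleft()
        if v not in alive:
            continue
        alive.discard(v)
        for u in adj[v] & alive:
            deg[u] -= 1
            if k > deg[u]:
                queue.append(u)
    if q not in alive:
        return set()
    comp, stack = {q}, [q]
    while stack:  # keep only the connected component of q
        v = stack.pop()
        for u in adj[v] & alive:
            if u not in comp:
                comp.add(u)
                stack.append(u)
    return comp


def verify_ptree(adj: Dict[Vertex, Set[Vertex]], index: Dict[str, Set[Vertex]],
                 leaf_labels: Iterable[str], q: Vertex, k: int) -> Set[Vertex]:
    """Intersect the indexed vertex sets of the leaf labels (at least one
    label is required), then refine the candidates to a k-core around q."""
    cand = set.intersection(*(set(index[t]) for t in leaf_labels))
    return k_core_with(adj, cand, q, k)


# Toy data: a triangle a-b-c with a pendant vertex d attached to a.
ADJ = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
INDEX = {"ML": {"a", "b", "c", "d"}, "AI": {"a", "b", "c"}, "HW": {"a", "b", "d"}}
```

With k = 2 and q = "a", the leaf labels ML and AI give a non-empty community (the triangle), so that subtree would be feasible; ML and HW leave no 2-core around q, so that subtree would be infeasible. Only leaf labels are intersected because, as noted above, the set for a label is contained in the set for its parent label.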
Algorithm 5 Find the initial cut: find-i
 1: function FIND-I(I, S, q, k)
 2:   restore T(q) using I.headMap;
 3:   IF ← ∅; F = T(q);
 4:   Ψ ← GENERATESUBTREE(∅, T(q));
 5:   while Ψ ≠ ∅ do
 6:     T' ← Ψ.pop(); flag ← true;
 7:     Φ ← GENERATESUBTREE(T', T(q));
 8:     for each T ∈ Φ do
 9:       compute G_k[T] from G_k[T'] ∩ I.get(k, q, T \ T');
10:       if G_k[T] ≠ ∅ then
11:         flag ← false; Ψ.push(T);
12:     if flag = true and T' is maximal then
13:       F = T'; IF = T;
14:       break;
15:   return (IF, F);

4. Function find-d. We can decrementally generate subtrees from larger subtrees to smaller ones. We present the pseudocode of find-d in Algorithm 6. First, if G_k[T(q)] exists, we can directly return it as a qualified community (lines 2-4). In each step, for an infeasible subtree T', we remove one of the leaf nodes of T' and verify the feasibility of the new subtrees (lines 6-11). Once there is a new feasible subtree, we treat T' and this new subtree as the initial cut (lines 12-17).

Algorithm 6 Find the initial cut: find-d
 1: function FIND-D(I, S, q, k)
 2:   IF ← ∅; F ← ∅;
 3:   restore T(q) using I.headMap;
 4:   if G_k[T(q)] ≠ ∅ then F = T(q);
 5:   else
 6:     Ψ.push(T(q));
 7:     while Ψ ≠ ∅ do
 8:       T' ← Ψ.pop(); IF = T';
 9:       Θ ← all leaf nodes of T';
10:       for each t' ∈ Θ do
11:         compute G_k[T' \ t'] from G;
12:         if G_k[T' \ t'] ≠ ∅ then
13:           F = T' \ t';
14:           break;
15:         else
16:           Ψ.push(T' \ t');
17:   return (IF, F);

5. Function find-p. We can find the initial cut by directly verifying subtrees instead of checking the nodes one by one. Intuitively, a P-tree can be divided into several paths (from leaf nodes to the root). According to Lemma 2, these paths can be further verified by checking the corresponding leaf nodes. We call this method find initial cut by path (find-p). We present the pseudocode of find-p in Algorithm 7. S denotes a set of P-tree nodes. Initially, it consists of all leaf nodes of T(q). If there does not exist a feasible node in S, we trace up to verify their parent nodes (lines 13-14). Next, we iteratively check the nodes in S. If we find a node t such that G_k[F ∪ t] exists, we update F (lines 5-6). Let t_parent denote the parent node of t. If we find

a node t such that G_k[F ∪ t] does not exist, we trace up to find the boundary where G_k[t'_parent] exists while G_k[t'] does not, and thus we find an initial pair (lines 8-11). Note that at this stage, IF and F may not be complete subtrees. Thus, for the nodes in IF and F, we need to include all their ancestor nodes and then return (IF, F) as a cut (lines 15-16).

Algorithm 7 Find the initial cut: find-p
 1: function FIND-P(I, S, q, k)
 2:   IF ← ∅; F ← find a leaf node t ∈ S s.t. I.get(k, q, t) ≠ ∅;
 3:   if F ≠ ∅ then
 4:     for each t ∈ S do
 5:       compute G_k[F ∪ t] from G_k[F] ∩ I.get(k, q, t);
 6:       if G_k[F ∪ t] ≠ ∅ then F = F ∪ t;
 7:       else
 8:         path ← trace a path from t to the root in I;
 9:         find t', t'_parent on path s.t. G_k[t'] = ∅, G_k[t'_parent] ≠ ∅;
10:         IF = F ∪ t'; F = F ∪ t'_parent;
11:         break;
12:   else
13:     for each t ∈ S do S.replace(t, t.parent);
14:     FIND-P(I, S, q, k);
15:   complete subtrees IF, F;
16:   return (IF, F);

Algorithm 8 gives the overall advanced method. Notice that there are three functions for finding the initial cut, i.e., find-i, find-d, and find-p, so we have three variants of the advanced method, denoted by adv-i, adv-d and adv-p respectively.

Algorithm 8 Advanced method
 1: function QUERY(I, q, k)
 2:   G ← ∅;
 3:   (IF, F) ← FIND(I, S, q, k);
 4:   EXPANDPTREE(IF, F, G);
 5:   return G;

5 EXPERIMENTS

5.1 Setup

We consider two real datasets (ACMDL and PubMed) and two synthetic datasets (Flickr and DBLP). ACMDL and PubMed are the co-authorship networks of researchers in computer science and biomedical areas respectively. Each vertex represents an author, and an edge is a co-authorship between two authors. For each author, her papers have been categorized by a hierarchical subject classification system (ACM CCS or Medical Subject Headings (MeSH)), so we build the P-tree by unifying the categorization information of all her papers. For Flickr [44], each vertex represents a user and each edge denotes a follow relationship between two users. For DBLP, a vertex is an author and an edge represents a co-authorship relationship.
For each user, we use a hash function to map the associated textual content to subjects of CCS and thereby synthesize a P-tree. By doing this, the same textual content is always mapped to the same nodes in P-trees. Table 2 shows the statistics of the datasets, including the numbers of vertices and edges, the average vertex degree d, the average number of labels in P-trees P, and the number of labels in the GP-tree.

To evaluate PCS queries, in line with [11], we set the default value of k to 6. For each dataset, we randomly select 100 query vertices from the 6-core. We implement all the algorithms in Java, and run experiments on a machine with an eight-core Intel 3.40GHz processor and 16GB of memory, with Ubuntu installed.

TABLE 2 Datasets used in our experiments.
Dataset   Vertices   Edges   d   P   GP-tree
ACMDL     107, , ,908
Flickr    581,099 4,972, ,908
PubMed    716,459 4,742, ,132
DBLP      977,288 6,864, ,908

We consider all four datasets and check the locations of the maximal feasible subtrees of 100 communities in the search space of each dataset. Because the search space may be very large, we divide it evenly into 5 levels according to depth. Notice that, in this case, level 3 represents the middle of the search space. The experimental results are given in Table 3. For example, 43% of the maximal feasible subtrees lie in the middle of the search space in PubMed. This demonstrates the above view and explains the motivation for the advanced methods.

TABLE 3 Locations of maximal feasible subtrees.
          ACMDL   Flickr   PubMed   DBLP
Level 1   3%      8%       11%      5%
Level 2   15%     23%      5%       13%
Level 3   18%     32%      43%      37%
Level 4   26%     25%      24%      31%
Level 5   38%     12%      17%      14%

5.2 PCS Effectiveness

As mentioned before, the existing CS methods mainly focus on non-attributed graphs. A recent work, ACQ [11, 40], investigates CS on attributed graphs. In ACQ, each vertex in the attributed graph is associated with a set of keywords.
Communities retrieved by ACQ should satisfy structure cohesiveness (the k-core constraint) and keyword cohesiveness [11, 40], i.e., the number of common keywords shared by all vertices in a community should be maximum. We compare PCS with ACQ. To run ACQ queries, we set each vertex's attribute as a set of keywords, which are the keywords in its P-tree. In the following, we first present a case study, and then show the quality and diversity of communities.

A Case Study: We perform a case study on the ACMDL dataset and consider a renowned researcher: Jim Gray. We set k = 4 here. We present two of Jim's PCs, i.e., PC1 and PC2, with different research areas in Fig. 7 and Fig. 8. Notice that ACQ only finds one community, PC1, shown in Fig. 7(a). This is because ACQ maximizes the number of shared keywords, so PC2 shown in Fig. 8(a), which has five shared keywords, cannot be returned. In addition, as shown in Fig. 7(b), all shared keywords of PC1 are organized in a tree with few branches, which implies that the semantics of the keywords overlap heavily with each other. In contrast, the shared subtree of PC2 shown in Fig. 8(b) has multiple branches, so the semantics of its keywords are very different and diversified. Hence, PCS is more effective than ACQ for extracting communities from profiled graphs.

Community Pairwise Similarity (CPS): We compare PCS with three classic CS methods using the minimum degree definition: ACQ [11], Global [8] and Local [25]. We use Tree Edit

Distance (TED) to compute the similarity between the P-trees of any pair of vertices in community G_l. Let T_i be the P-tree of the i-th vertex in G_l. The CPS is then the average similarity over all pairs of vertices of G_l, and over all communities of G:

\[ CPS(G) = \frac{1}{|G|} \sum_{l=1}^{|G|} \left[ \frac{1}{|G_l|^2} \sum_{i=1}^{|G_l|} \sum_{j=1}^{|G_l|} \frac{TED(T_i, T_j)}{|T_i \cup T_j|} \right] \qquad (2) \]

The CPS(G) value ranges from 0 to 1. The higher the value is, the more cohesive the community is. In Fig. 9(a), PCs denotes the communities that only PCS can find, and P-ACs represents those returned by both PCS and ACQ. P-ACs have the most P-tree nodes (i.e., keywords in the ACQ definition) in common, and the fewest vertices. Thus they have the highest CPS values. Note that PCs have CPS values close to those of P-ACs, which implies that these unique PCs are also of high quality.

Fig. 7. One PC of Jim Gray. (a) PC1, with members Jim Gray, A. Deshpande, M. Liebhold, A. Szalay, M. Hansen, S. Nath, V. Tao, P. B. Gibbons, M. J. Franklin and M. Balazinska; (b) the maximal common subtree of PC1, with labels Information Systems, Information Retrieval, Information Extraction, Retrieval Task & Goals and Document Filtering.

Fig. 8. Another PC of Jim Gray. (a) PC2, with members Jim Gray, J. Cogan, R. Burns, R. Musaloiu-E, S. Ozer, A. Szalay, A. Terzis and K. Szlavecz; (b) the maximal common subtree of PC2, with labels Software & Engineering, Information Systems, Hardware and Computer System Organization.

Level-diversity ratio (LDR): To further measure the quality of PCs, we define a metric, called level-diversity ratio (LDR), to measure the diversity of attributes level by level in the shared subtrees. F denotes the method that we compare with PCS. Given a query vertex q, we use T(F, q, j) to represent the maximal common P-tree of the j-th community returned by the method F. L is the number of levels in the P-tree T(q). L_i(T) is the number of unique labels in the i-th level of P-tree T. H and J denote the numbers of communities returned by the method F and by PCS respectively. A lower LDR value implies that the method F is less diverse than PCS.
\[ LDR(q, F) = \frac{1}{L} \sum_{i=1}^{L} \frac{\sum_{h=1}^{H} L_i[T(F, q, h)]}{\sum_{j=1}^{J} L_i[T(PCS, q, j)]} \qquad (3) \]

Intuitively, LDR reflects the proportion of unique labels in each level. The experimental results are depicted in Fig. 9(b), which shows that communities returned by ACQ can only cover 40% to 60% of the labels of PCs in each level. This implies that PCs found by PCS have higher diversity than those of ACQ, because PCS focuses on maximizing the common structure of P-trees, rather than the number of common keywords. As a result, all communities with the semantically maximal properties can be found, and the communities are of high diversity.

Community numbers: Fig. 10(a) reports the average number of communities returned per query by these methods. From the results, we can see that PCS finds more communities than the others. This is because only PCS considers profiled graphs and uses the hierarchical information in P-trees to retrieve communities. Compared with the other methods, PCS is able to extract communities with more semantic focuses.

Community P-tree Frequency (CPF): CPF is inspired by the document frequency measure. Let fre_{i,j} represent the number of vertices in G_i whose P-tree contains the j-th P-tree node of T(q). We use CPF to compute the occurrence frequency over all nodes in T(q) and all communities in G:

\[ CPF(q) = \frac{1}{|G| \cdot |T(q)|} \sum_{i=1}^{|G|} \sum_{j=1}^{|T(q)|} \frac{fre_{i,j}}{|G_i|} \qquad (4) \]

Note that CPF(q) ranges from 0 to 1 and a higher value implies better cohesiveness. As shown in Fig. 10(b), compared with the communities retrieved by both PCS and ACQ, the unique PCs also have a high degree of cohesiveness.

Fig. 9. Comparing PCS with CS methods: (a) CPS; (b) LDR.

Fig. 10. Comparing PCS with CS methods: (a) community number; (b) CPF.

F1-score: Here we use Facebook ego-networks to evaluate the accuracy. We use FBX to denote the X-th network. Each ego-network has several overlapping ground-truth communities, called friendship circles [45]. As listed in Table 4, each vertex has real profiles, such as political and education information.
Similar to Flickr, we build each P-tree by using a hash function to map the real profiles to CCS subjects. We randomly select 100 query vertices in these ground-truth communities and compute the F1-scores of the different methods. The F1-scores of all methods over the three networks are shown in

Fig. 11. The experimental results show that, compared with other methods, PCS can stably extract communities with high accuracy over the three real networks.

TABLE 4 Facebook datasets.
Dataset   Vertices   Edges   d   P
FB1       1,233 11,
FB2       1,447 17,
FB ,

Fig. 11. F1-scores over three networks.

Fig. 12. Evaluation on ACMDL and PubMed datasets: (a) CPS; (b) LDR; (c) community number; (d) CPF.

5.3 Comparison with Other Definition Metrics

In this section, we compare several potential metrics for defining the PCS problem. Generally, a good community should be a group of users who are cohesive in both structure and profiles. To measure structure cohesiveness, we use the minimum degree metric, which is in line with existing works [8, 11, 12, 25, 26]. To measure profile cohesiveness, we have tried a list of possible metrics, including: (a) common nodes of P-trees; (b) common paths of P-trees (from a P-tree leaf to the root); (c) common subtree of P-tree structures; (d) similarity of vertex P-trees. We compare these four metrics over two real datasets (ACMDL and PubMed). As shown in Fig. 12, compared with the other metrics, Metric (c) achieves the highest scores over the four indices. We now discuss the reasons for these differences.

In a recent work, ACQ [40], the authors define the vertex attribute as a set of keywords and use the number of shared keywords to constrain the communities. Thus, in our PCS problem, it is natural to use the number of common P-tree nodes to measure profile cohesiveness, and to require the number of common nodes to be the largest. However, as we have analyzed before, this ignores the interrelations among the nodes and violates the basic motivation for the PCS problem. Thus Metric (a) is not suitable for the PCS definition.

Metric (b) is defined by common paths (i.e., a common path from the P-tree root to a leaf node) shared by all the vertices in the returned community. Intuitively, we can require the number of common paths to be maximum.
This metric still has some inadequacies, as it amounts to maximizing the number of common leaf nodes, which will miss meaningful communities with fewer common leaves. As a result, based on the discussion above, we think Metric (b) is also not suitable for the PCS problem definition.

Metric (c) focuses on the common subtree of all P-trees. Clearly, a subtree consists of a set of nodes and their hierarchical relationships. Compared with the metrics above, the common subtree of the P-tree structure is more suitable for measuring the profile cohesiveness of a community, as it can adequately represent the commonalities of vertex P-trees.

Inspired by another recent community search work [12], we also tried to use the similarity of P-trees to define the problem (Metric (d)). This means, given a threshold, finding all vertices with a budgeted similarity score. However, it is still not suitable for the PCS problem. This is because, normally, if two P-trees are compared by some similarity method, the diversity of these P-trees will nevertheless be regarded as dissimilarity. Thus, based on the above discussion and the experimental results in Fig. 12, we adopt Metric (c) in our PCS problem definition.

5.4 Results of Efficiency Evaluation

In this section, we show the efficiency results of index construction and PCS queries.

1. Index construction. Fig. 13(a)-13(b) show the scalability of the CP-tree index construction method. To evaluate the scalability of the index construction method w.r.t. the dataset size, for each dataset, we randomly select 20%, 40%, 60% and 80% of its vertices to obtain four sub-datasets respectively. As shown in Fig. 13(a), the time cost of index construction is linear in the size of the profiled graph, which confirms our earlier analysis. Furthermore, to evaluate the scalability of the index construction method over different P-tree sizes of vertices and over different fractions of the GP-tree size, we obtain four sub-datasets in a similar way. As shown in Fig. 13(b) and Fig.
13(c), we demonstrate that the time cost of index construction is linear in the size of P-trees and GP-trees.

2. Query efficiency. We vary the value of k and show the query efficiency of different algorithms in Fig. 14(a)-14(d). The method incre is 100 times faster than the basic method, but slower than the method adv-i. Further, adv-d and adv-p are 10 times faster than incre. The reason is that, compared with incre, the advanced methods narrow the search space by verifying a smaller fraction of subtrees. Also, the efficiency gap in finding an initial cut results in the slightly different performance of the advanced methods. Thus, the index-based methods run fast and adv-p stably scales the best. Note that the three advanced methods perform similarly on Flickr. This is because the initial cut results are in the


DEADLOCK AVOIDANCE IN BATCH PROCESSES. M. Tittus K. Åkesson

DEADLOCK AVOIDANCE IN BATCH PROCESSES. M. Tittus K. Åkesson DEADLOCK AVOIDANCE IN BATCH PROCESSES M. Tittus K. Åkesson Univesity College Boås, Sweden, e-mail: Michael.Tittus@hb.se Chalmes Univesity of Technology, Gothenbug, Sweden, e-mail: ka@s2.chalmes.se Abstact:

More information

Optical Flow for Large Motion Using Gradient Technique

Optical Flow for Large Motion Using Gradient Technique SERBIAN JOURNAL OF ELECTRICAL ENGINEERING Vol. 3, No. 1, June 2006, 103-113 Optical Flow fo Lage Motion Using Gadient Technique Md. Moshaof Hossain Sake 1, Kamal Bechkoum 2, K.K. Islam 1 Abstact: In this

More information

INDEXATION OF WEB PAGES BASED ON THEIR VISUAL RENDERING

INDEXATION OF WEB PAGES BASED ON THEIR VISUAL RENDERING INDEXATION OF WEB PAGES BASED ON THEIR VISUAL RENDERING Emmanuel Buno Univesité du Sud Toulon-Va / LSIS CNRS BP 20132, F-83957 La Gade buno@univ-tln.f Nicolas Faessel LSIS CNRS Domaine Univesitaie de Saint-Jéôme

More information

Lecture # 04. Image Enhancement in Spatial Domain

Lecture # 04. Image Enhancement in Spatial Domain Digital Image Pocessing CP-7008 Lectue # 04 Image Enhancement in Spatial Domain Fall 2011 2 domains Spatial Domain : (image plane) Techniques ae based on diect manipulation of pixels in an image Fequency

More information

a Not yet implemented in current version SPARK: Research Kit Pointer Analysis Parameters Soot Pointer analysis. Objectives

a Not yet implemented in current version SPARK: Research Kit Pointer Analysis Parameters Soot Pointer analysis. Objectives SPARK: Soot Reseach Kit Ondřej Lhoták Objectives Spak is a modula toolkit fo flow-insensitive may points-to analyses fo Java, which enables expeimentation with: vaious paametes of pointe analyses which

More information

Reachable State Spaces of Distributed Deadlock Avoidance Protocols

Reachable State Spaces of Distributed Deadlock Avoidance Protocols Reachable State Spaces of Distibuted Deadlock Avoidance Potocols CÉSAR SÁNCHEZ and HENNY B. SIPMA Stanfod Univesity We pesent a family of efficient distibuted deadlock avoidance algoithms with applications

More information

Hierarchically Clustered P2P Streaming System

Hierarchically Clustered P2P Streaming System Hieachically Clusteed P2P Steaming System Chao Liang, Yang Guo, and Yong Liu Polytechnic Univesity Thomson Lab Booklyn, NY 11201 Pinceton, NJ 08540 Abstact Pee-to-pee video steaming has been gaining populaity.

More information

Frequency Domain Approach for Face Recognition Using Optical Vanderlugt Filters

Frequency Domain Approach for Face Recognition Using Optical Vanderlugt Filters Optics and Photonics Jounal, 016, 6, 94-100 Published Online August 016 in SciRes. http://www.scip.og/jounal/opj http://dx.doi.og/10.436/opj.016.68b016 Fequency Domain Appoach fo Face Recognition Using

More information

ART GALLERIES WITH INTERIOR WALLS. March 1998

ART GALLERIES WITH INTERIOR WALLS. March 1998 ART GALLERIES WITH INTERIOR WALLS Andé Kündgen Mach 1998 Abstact. Conside an at galley fomed by a polygon on n vetices with m pais of vetices joined by inteio diagonals, the inteio walls. Each inteio wall

More information

On Error Estimation in Runge-Kutta Methods

On Error Estimation in Runge-Kutta Methods Leonado Jounal of Sciences ISSN 1583-0233 Issue 18, Januay-June 2011 p. 1-10 On Eo Estimation in Runge-Kutta Methods Ochoche ABRAHAM 1,*, Gbolahan BOLARIN 2 1 Depatment of Infomation Technology, 2 Depatment

More information

A Full-mode FME VLSI Architecture Based on 8x8/4x4 Adaptive Hadamard Transform For QFHD H.264/AVC Encoder

A Full-mode FME VLSI Architecture Based on 8x8/4x4 Adaptive Hadamard Transform For QFHD H.264/AVC Encoder 20 IEEE/IFIP 9th Intenational Confeence on VLSI and System-on-Chip A Full-mode FME VLSI Achitectue Based on 8x8/ Adaptive Hadamad Tansfom Fo QFHD H264/AVC Encode Jialiang Liu, Xinhua Chen College of Infomation

More information

XFVHDL: A Tool for the Synthesis of Fuzzy Logic Controllers

XFVHDL: A Tool for the Synthesis of Fuzzy Logic Controllers XFVHDL: A Tool fo the Synthesis of Fuzzy Logic Contolles E. Lago, C. J. Jiménez, D. R. López, S. Sánchez-Solano and A. Baiga Instituto de Micoelectónica de Sevilla. Cento Nacional de Micoelectónica, Edificio

More information

Image Enhancement in the Spatial Domain. Spatial Domain

Image Enhancement in the Spatial Domain. Spatial Domain 8-- Spatial Domain Image Enhancement in the Spatial Domain What is spatial domain The space whee all pixels fom an image In spatial domain we can epesent an image by f( whee x and y ae coodinates along

More information

Color Correction Using 3D Multiview Geometry

Color Correction Using 3D Multiview Geometry Colo Coection Using 3D Multiview Geomety Dong-Won Shin and Yo-Sung Ho Gwangju Institute of Science and Technology (GIST) 13 Cheomdan-gwagio, Buk-ku, Gwangju 500-71, Republic of Koea ABSTRACT Recently,

More information

Module 6 STILL IMAGE COMPRESSION STANDARDS

Module 6 STILL IMAGE COMPRESSION STANDARDS Module 6 STILL IMAE COMPRESSION STANDARDS Lesson 17 JPE-2000 Achitectue and Featues Instuctional Objectives At the end of this lesson, the students should be able to: 1. State the shotcomings of JPE standad.

More information

LaSaS: an Aggregated Search based Graph Matching Approach

LaSaS: an Aggregated Search based Graph Matching Approach LaSaS: an ggegated Seach based Gaph Matching ppoach Ghizlane EHBRTHI Univesité de Lyon, NRS Univesité Lyon 1, LIRIS, UMR5205, F-69622, Fance Email: ghizlane.echbathi@univ-lyon1.f Hamamache KHEDDOUI Univesité

More information

Topological Characteristic of Wireless Network

Topological Characteristic of Wireless Network Topological Chaacteistic of Wieless Netwok Its Application to Node Placement Algoithm Husnu Sane Naman 1 Outline Backgound Motivation Papes and Contibutions Fist Pape Second Pape Thid Pape Futue Woks Refeences

More information

SAR: A Sentiment-Aspect-Region Model for User Preference Analysis in Geo-tagged Reviews

SAR: A Sentiment-Aspect-Region Model for User Preference Analysis in Geo-tagged Reviews : A Sentiment-Aspect-Region Model fo Use Pefeence Analysis in Geo-tagged Reviews Kaiqi Zhao, Gao Cong, Quan Yuan School of Compute Engineeing Nanyang Technological Univesity, Singapoe {kzhao2@e., gaocong@,

More information

UNION FIND. naïve linking link-by-size link-by-rank path compression link-by-rank with path compression context. An Improved Equivalence Algorithm

UNION FIND. naïve linking link-by-size link-by-rank path compression link-by-rank with path compression context. An Improved Equivalence Algorithm Disjoint-sets data type Lectue slides by Kevin Wayne Copyight 5 Peason-Addison Wesley http://www.cs.pinceton.edu/~wayne/kleinbeg-tados UNION FIND naïve linking link-by-size link-by-ank path compession

More information

Also available at ISSN (printed edn.), ISSN (electronic edn.) ARS MATHEMATICA CONTEMPORANEA 3 (2010)

Also available at  ISSN (printed edn.), ISSN (electronic edn.) ARS MATHEMATICA CONTEMPORANEA 3 (2010) Also available at http://amc.imfm.si ISSN 1855-3966 (pinted edn.), ISSN 1855-3974 (electonic edn.) ARS MATHEMATICA CONTEMPORANEA 3 (2010) 109 120 Fulleene patches I Jack E. Gave Syacuse Univesity, Depatment

More information

Bo Gu and Xiaoyan Hong*

Bo Gu and Xiaoyan Hong* Int. J. Ad Hoc and Ubiquitous Computing, Vol. 11, Nos. /3, 1 169 Tansition phase of connectivity fo wieless netwoks with gowing pocess Bo Gu and Xiaoyan Hong* Depatment of Compute Science, Univesity of

More information

Optimal Adaptive Learning for Image Retrieval

Optimal Adaptive Learning for Image Retrieval Optimal Adaptive Leaning fo Image Retieval ao Wang Dept of Compute Sci and ech singhua Univesity Beijing 00084, P. R. China Wangtao7@63.net Yong Rui Micosoft Reseach One Micosoft Way Redmond, WA 9805,

More information

Performance Optimization in Structured Wireless Sensor Networks

Performance Optimization in Structured Wireless Sensor Networks 5 The Intenational Aab Jounal of Infomation Technology, Vol. 6, o. 5, ovembe 9 Pefomance Optimization in Stuctued Wieless Senso etwoks Amine Moussa and Hoda Maalouf Compute Science Depatment, ote Dame

More information

Evaluation of Partial Path Queries on XML Data

Evaluation of Partial Path Queries on XML Data Evaluation of Patial Path Queies on XML Data Stefanos Souldatos Dept of EE & CE NTUA, Geece stef@dblab.ntua.g Theodoe Dalamagas Dept of EE & CE NTUA, Geece dalamag@dblab.ntua.g Xiaoying Wu Dept. of CS

More information

On the Forwarding Area of Contention-Based Geographic Forwarding for Ad Hoc and Sensor Networks

On the Forwarding Area of Contention-Based Geographic Forwarding for Ad Hoc and Sensor Networks On the Fowading Aea of Contention-Based Geogaphic Fowading fo Ad Hoc and Senso Netwoks Dazhi Chen Depatment of EECS Syacuse Univesity Syacuse, NY dchen@sy.edu Jing Deng Depatment of CS Univesity of New

More information

Modelling, simulation, and performance analysis of a CAN FD system with SAE benchmark based message set

Modelling, simulation, and performance analysis of a CAN FD system with SAE benchmark based message set Modelling, simulation, and pefomance analysis of a CAN FD system with SAE benchmak based message set Mahmut Tenuh, Panagiotis Oikonomidis, Peiklis Chachalakis, Elias Stipidis Mugla S. K. Univesity, TR;

More information

Positioning of a robot based on binocular vision for hand / foot fusion Long Han

Positioning of a robot based on binocular vision for hand / foot fusion Long Han 2nd Intenational Confeence on Advances in Mechanical Engineeing and Industial Infomatics (AMEII 26) Positioning of a obot based on binocula vision fo hand / foot fusion Long Han Compute Science and Technology,

More information

Evaluation of Partial Path Queries on XML data

Evaluation of Partial Path Queries on XML data Evaluation of Patial Path Queies on XML data Stefanos Souldatos Dept of EE & CE, NTUA stef@dblab.ntua.g Theodoe Dalamagas Dept of EE & CE, NTUA dalamag@dblab.ntua.g Xiaoying Wu Dept. of CS, NJIT xw43@njit.edu

More information

Title. Author(s)NOMURA, K.; MOROOKA, S. Issue Date Doc URL. Type. Note. File Information

Title. Author(s)NOMURA, K.; MOROOKA, S. Issue Date Doc URL. Type. Note. File Information Title CALCULATION FORMULA FOR A MAXIMUM BENDING MOMENT AND THE TRIANGULAR SLAB WITH CONSIDERING EFFECT OF SUPPO UNIFORM LOAD Autho(s)NOMURA, K.; MOROOKA, S. Issue Date 2013-09-11 Doc URL http://hdl.handle.net/2115/54220

More information

Hierarchical Region Mean-Based Image Segmentation

Hierarchical Region Mean-Based Image Segmentation Hieachical Region Mean-Based Image Segmentation Slawo Wesolkowski and Paul Fieguth Systems Design Engineeing Univesity of Wateloo Wateloo, Ontaio, Canada, N2L-3G1 s.wesolkowski@ieee.og, pfieguth@uwateloo.ca

More information

SCALABLE ENERGY EFFICIENT AD-HOC ON DEMAND DISTANCE VECTOR (SEE-AODV) ROUTING PROTOCOL IN WIRELESS MESH NETWORKS

SCALABLE ENERGY EFFICIENT AD-HOC ON DEMAND DISTANCE VECTOR (SEE-AODV) ROUTING PROTOCOL IN WIRELESS MESH NETWORKS SCALABL NRGY FFICINT AD-HOC ON DMAND DISTANC VCTOR (S-AODV) ROUTING PROTOCOL IN WIRLSS MSH NTWORKS Sikande Singh Reseach Schola, Depatment of Compute Science & ngineeing, Punjab ngineeing College (PC),

More information

An Improved Resource Reservation Protocol

An Improved Resource Reservation Protocol Jounal of Compute Science 3 (8: 658-665, 2007 SSN 549-3636 2007 Science Publications An mpoved Resouce Resevation Potocol Desie Oulai, Steven Chambeland and Samuel Piee Depatment of Compute Engineeing

More information

Event-based Location Dependent Data Services in Mobile WSNs

Event-based Location Dependent Data Services in Mobile WSNs Event-based Location Dependent Data Sevices in Mobile WSNs Liang Hong 1, Yafeng Wu, Sang H. Son, Yansheng Lu 3 1 College of Compute Science and Technology, Wuhan Univesity, China Depatment of Compute Science,

More information

IP Multicast Simulation in OPNET

IP Multicast Simulation in OPNET IP Multicast Simulation in OPNET Xin Wang, Chien-Ming Yu, Henning Schulzinne Paul A. Stipe Columbia Univesity Reutes Depatment of Compute Science 88 Pakway Dive South New Yok, New Yok Hauppuage, New Yok

More information

Input Layer f = 2 f = 0 f = f = 3 1,16 1,1 1,2 1,3 2, ,2 3,3 3,16. f = 1. f = Output Layer

Input Layer f = 2 f = 0 f = f = 3 1,16 1,1 1,2 1,3 2, ,2 3,3 3,16. f = 1. f = Output Layer Using the Gow-And-Pune Netwok to Solve Poblems of Lage Dimensionality B.J. Biedis and T.D. Gedeon School of Compute Science & Engineeing The Univesity of New South Wales Sydney NSW 2052 AUSTRALIA bbiedis@cse.unsw.edu.au

More information

FINITE ELEMENT MODEL UPDATING OF AN EXPERIMENTAL VEHICLE MODEL USING MEASURED MODAL CHARACTERISTICS

FINITE ELEMENT MODEL UPDATING OF AN EXPERIMENTAL VEHICLE MODEL USING MEASURED MODAL CHARACTERISTICS COMPDYN 009 ECCOMAS Thematic Confeence on Computational Methods in Stuctual Dynamics and Eathquake Engineeing M. Papadakakis, N.D. Lagaos, M. Fagiadakis (eds.) Rhodes, Geece, 4 June 009 FINITE ELEMENT

More information

Generalized Grey Target Decision Method Based on Decision Makers Indifference Attribute Value Preferences

Generalized Grey Target Decision Method Based on Decision Makers Indifference Attribute Value Preferences Ameican Jounal of ata ining and Knowledge iscovey 27; 2(4): 2-8 http://www.sciencepublishinggoup.com//admkd doi:.648/.admkd.2724.2 Genealized Gey Taget ecision ethod Based on ecision akes Indiffeence Attibute

More information

Structure discovery techniques for circuit design and process model visualization

Structure discovery techniques for circuit design and process model visualization Depatament de iències de la omputació Ph.D. in omputing Stuctue discovey techniques fo cicuit design and pocess model visualization Javie de San Pedo Matín Adviso: Jodi otadella Fotuny Bacelona, May 2017

More information

Fifth Wheel Modelling and Testing

Fifth Wheel Modelling and Testing Fifth heel Modelling and Testing en Masoy Mechanical Engineeing Depatment Floida Atlantic Univesity Boca aton, FL 4 Lois Malaptias IFMA Institut Fancais De Mechanique Advancee ampus De lemont Feand Les

More information

Simulation and Performance Evaluation of Network on Chip Architectures and Algorithms using CINSIM

Simulation and Performance Evaluation of Network on Chip Architectures and Algorithms using CINSIM J. Basic. Appl. Sci. Res., 1(10)1594-1602, 2011 2011, TextRoad Publication ISSN 2090-424X Jounal of Basic and Applied Scientific Reseach www.textoad.com Simulation and Pefomance Evaluation of Netwok on

More information

ANALYTIC PERFORMANCE MODELS FOR SINGLE CLASS AND MULTIPLE CLASS MULTITHREADED SOFTWARE SERVERS

ANALYTIC PERFORMANCE MODELS FOR SINGLE CLASS AND MULTIPLE CLASS MULTITHREADED SOFTWARE SERVERS ANALYTIC PERFORMANCE MODELS FOR SINGLE CLASS AND MULTIPLE CLASS MULTITHREADED SOFTWARE SERVERS Daniel A Menascé Mohamed N Bennani Dept of Compute Science Oacle, Inc Geoge Mason Univesity 1211 SW Fifth

More information

A Neural Network Model for Storing and Retrieving 2D Images of Rotated 3D Object Using Principal Components

A Neural Network Model for Storing and Retrieving 2D Images of Rotated 3D Object Using Principal Components A Neual Netwok Model fo Stong and Reteving 2D Images of Rotated 3D Object Using Pncipal Components Tsukasa AMANO, Shuichi KUROGI, Ayako EGUCHI, Takeshi NISHIDA, Yasuhio FUCHIKAWA Depatment of Contol Engineeng,

More information

arxiv: v2 [physics.soc-ph] 30 Nov 2016

arxiv: v2 [physics.soc-ph] 30 Nov 2016 Tanspotation dynamics on coupled netwoks with limited bandwidth Ming Li 1,*, Mao-Bin Hu 1, and Bing-Hong Wang 2, axiv:1607.05382v2 [physics.soc-ph] 30 Nov 2016 1 School of Engineeing Science, Univesity

More information

Efficient protection of many-to-one. communications

Efficient protection of many-to-one. communications Efficient potection of many-to-one communications Miklós Molná, Alexande Guitton, Benad Cousin, and Raymond Maie Iisa, Campus de Beaulieu, 35 042 Rennes Cedex, Fance Abstact. The dependability of a netwok

More information

Extract Object Boundaries in Noisy Images using Level Set. Final Report

Extract Object Boundaries in Noisy Images using Level Set. Final Report Extact Object Boundaies in Noisy Images using Level Set by: Quming Zhou Final Repot Submitted to Pofesso Bian Evans EE381K Multidimensional Digital Signal Pocessing May 10, 003 Abstact Finding object contous

More information

INFORMATION DISSEMINATION DELAY IN VEHICLE-TO-VEHICLE COMMUNICATION NETWORKS IN A TRAFFIC STREAM

INFORMATION DISSEMINATION DELAY IN VEHICLE-TO-VEHICLE COMMUNICATION NETWORKS IN A TRAFFIC STREAM INFORMATION DISSEMINATION DELAY IN VEHICLE-TO-VEHICLE COMMUNICATION NETWORKS IN A TRAFFIC STREAM LiLi Du Depatment of Civil, Achitectual, and Envionmental Engineeing Illinois Institute of Technology 3300

More information

Transmission Lines Modeling Based on Vector Fitting Algorithm and RLC Active/Passive Filter Design

Transmission Lines Modeling Based on Vector Fitting Algorithm and RLC Active/Passive Filter Design Tansmission Lines Modeling Based on Vecto Fitting Algoithm and RLC Active/Passive Filte Design Ahmed Qasim Tuki a,*, Nashien Fazilah Mailah b, Mohammad Lutfi Othman c, Ahmad H. Saby d Cente fo Advanced

More information

Slotted Random Access Protocol with Dynamic Transmission Probability Control in CDMA System

Slotted Random Access Protocol with Dynamic Transmission Probability Control in CDMA System Slotted Random Access Potocol with Dynamic Tansmission Pobability Contol in CDMA System Intaek Lim 1 1 Depatment of Embedded Softwae, Busan Univesity of Foeign Studies, itlim@bufs.ac.k Abstact In packet

More information

Conversion Functions for Symmetric Key Ciphers

Conversion Functions for Symmetric Key Ciphers Jounal of Infomation Assuance and Secuity 2 (2006) 41 50 Convesion Functions fo Symmetic Key Ciphes Deba L. Cook and Angelos D. Keomytis Depatment of Compute Science Columbia Univesity, mail code 0401

More information

Authentication of Moving Range Queries

Authentication of Moving Range Queries Authentication of Moving Range Queies Duncan Yung Eic Lo Man Lung Yiu Depatment of Computing Hong Kong Polytechnic Univesity {cskwyung, eiclo, csmlyiu}@comp.polyu.edu.hk STRACT A moving ange quey continuously

More information

High performance CUDA based CNN image processor

High performance CUDA based CNN image processor High pefomance UDA based NN image pocesso GEORGE VALENTIN STOIA, RADU DOGARU, ELENA RISTINA STOIA Depatment of Applied Electonics and Infomation Engineeing Univesity Politehnica of Buchaest -3, Iuliu Maniu

More information

All lengths in meters. E = = 7800 kg/m 3

All lengths in meters. E = = 7800 kg/m 3 Poblem desciption In this poblem, we apply the component mode synthesis (CMS) technique to a simple beam model. 2 0.02 0.02 All lengths in metes. E = 2.07 10 11 N/m 2 = 7800 kg/m 3 The beam is a fee-fee

More information

Class 21. N -body Techniques, Part 4

Class 21. N -body Techniques, Part 4 Class. N -body Techniques, Pat Tee Codes Efficiency can be inceased by gouping paticles togethe: Neaest paticles exet geatest foces diect summation. Distant paticles exet smallest foces teat in goups.

More information

ADDING REALISM TO SOURCE CHARACTERIZATION USING A GENETIC ALGORITHM

ADDING REALISM TO SOURCE CHARACTERIZATION USING A GENETIC ALGORITHM ADDING REALISM TO SOURCE CHARACTERIZATION USING A GENETIC ALGORITHM Luna M. Rodiguez*, Sue Ellen Haupt, and Geoge S. Young Depatment of Meteoology and Applied Reseach Laboatoy The Pennsylvania State Univesity,

More information

MULTI-TEMPORAL AND MULTI-SENSOR IMAGE MATCHING BASED ON LOCAL FREQUENCY INFORMATION

MULTI-TEMPORAL AND MULTI-SENSOR IMAGE MATCHING BASED ON LOCAL FREQUENCY INFORMATION Intenational Achives of the Photogammety Remote Sensing and Spatial Infomation Sciences Volume XXXIX-B3 2012 XXII ISPRS Congess 25 August 01 Septembe 2012 Melboune Austalia MULTI-TEMPORAL AND MULTI-SENSOR

More information

The EigenRumor Algorithm for Ranking Blogs

The EigenRumor Algorithm for Ranking Blogs he EigenRumo Algoithm fo Ranking Blogs Ko Fujimua N Cybe Solutions Laboatoies N Copoation akafumi Inoue N Cybe Solutions Laboatoies N Copoation Masayuki Sugisaki N Resonant Inc. ABSRAC he advent of easy

More information

AN ANALYSIS OF COORDINATED AND NON-COORDINATED MEDIUM ACCESS CONTROL PROTOCOLS UNDER CHANNEL NOISE

AN ANALYSIS OF COORDINATED AND NON-COORDINATED MEDIUM ACCESS CONTROL PROTOCOLS UNDER CHANNEL NOISE AN ANALYSIS OF COORDINATED AND NON-COORDINATED MEDIUM ACCESS CONTROL PROTOCOLS UNDER CHANNEL NOISE Tolga Numanoglu, Bulent Tavli, and Wendi Heinzelman Depatment of Electical and Compute Engineeing Univesity

More information

Efficient Execution Path Exploration for Detecting Races in Concurrent Programs

Efficient Execution Path Exploration for Detecting Races in Concurrent Programs IAENG Intenational Jounal of Compute Science, 403, IJCS_40_3_02 Efficient Execution Path Exploation fo Detecting Races in Concuent Pogams Theodous E. Setiadi, Akihiko Ohsuga, and Mamou Maekaa Abstact Concuent

More information

A Family of Distributed Deadlock Avoidance Protocols and their Reachable State Spaces

A Family of Distributed Deadlock Avoidance Protocols and their Reachable State Spaces A Family of Distibuted Deadlock Avoidance Potocols and thei Reachable State Spaces Césa Sánchez, Henny B. Sipma, and Zoha Manna Compute Science Depatment Stanfod Univesity, Stanfod, CA 94305-9025 {cesa,sipma,manna}@cs.stanfod.edu

More information