Exhaustive and Guided Algorithms for Recommendation in a Professional Social Network
Maria Malek, Dalia Sulieman
EISTI-Laris laboratory, PRES Cergy University, FRANCE
maria.malek@eisti.fr, dalia.sulieman@eisti.fr
July 31, 2010

Abstract

This paper proposes a skills recommendation algorithm for a professional social network. This network consists of a set of persons linked by weighted professional ties. To answer the request of an actor, the system recommends a list of other actors who best match the requested criteria. We propose two recommendation algorithms based on three types of knowledge. The first type is information concerning the person; it is stored at the level of the actor's vertex and constitutes the user profile description. The second type is computed from the network structure itself: starting from the initial actor, we explore the links of the maximum spanning tree rooted at that actor, which reduces the search space of target actors. The third type is based on the betweenness centrality measure associated with each actor. This measure estimates the control that an actor has over other pairs of actors. We use it to extract the best paths from the spanning tree.
1 Introduction

A social network is a set of people or groups of people with some pattern of contacts or interactions between them. Social network analysis is the study of social entities, such as people in organizations, called actors, and of their interactions and relationships. A social network is modelled by a graph in which each vertex is an actor and each edge is a relationship. We can study the structural properties of the network as well as the role and the social prestige of each actor [12, 9, 7]. We can also find different types of subgraphs, such as communities formed by groups of actors with common interests, by isolating groups of individuals with a high link density [5].

A social network can also be a source of recommendations: finding an expert in a given field, suggesting products to sell, proposing a friend, etc. Such recommendations may be based on path exploration algorithms or on degree analysis ([1, 2]).

In this paper, we propose an algorithm for computing skills recommendations in a professional social network. Our network consists of a set of persons linked by weighted professional ties. To answer the request of an actor, the system recommends a list of other actors who best match the requested criteria. An example is the search for a person whose expertise matches a given task. We work on a social network composed of authors connected by similarity links. These authors are extracted from bibliographic data. We propose two recommendation algorithms based on three types of
knowledge:

- The first type is information concerning the person. This information is stored at the level of the actor's vertex and can be represented by an ontology describing user profiles.
- The second type is computed from the network structure itself. Starting from the initial actor, we explore the links of the maximum spanning tree rooted at that actor. We can thus reduce the search space of target actors.
- The third type is based on the betweenness centrality measure associated with each actor. This measure estimates the control that an actor has over other pairs of actors. We use it to extract the best paths from the spanning tree.

The remainder of this article is organized as follows. Section 2 describes our professional social network and how we extracted it. Section 3 details our approach to expert recommendation and presents the exhaustive and the guided algorithms. Section 4 presents our experimental results. Section 5 describes related work. We then conclude.

2 Professional network

The social network is composed of authors extracted from bibliographic data. In this graph, the nodes are the authors and the weighted edges carry the degree of similarity between these authors. Each author Z has a given profile $Pro_Z$.
This profile is described by a weighted vector of keywords $T_i$; these keywords represent the topics of interest of the author:

$Pro_Z = \{(T_1, P_1), (T_2, P_2), \ldots, (T_m, P_m)\}$

The goal of the system is to recommend, in response to an author's query, a group of authors ranked according to the similarity between their profiles (terms of interest $T_m$) and the query terms. To do so, we had to extract a social network that represents the authors and the relations between them.

The social network has been extracted from the Microsoft Academic Search website libra.msra.cn. We first extracted a connected network of authors from this site. The obtained network is a weighted directed graph (see figures 1 and 2): the nodes are the authors, the edges are the citations between them, and each edge carries the number of citations between the two connected authors. This social network is represented by a matrix L, in which $L_{ij} = n$ if author i cites author j n times. Since this network represents citation counts between authors rather than their similarity, we derived from it a second social network, the similarity network, as described in the next section.

2.1 The similarity based social network

The similarity social network is represented by an undirected graph whose nodes represent authors and whose edges represent the similarity between authors. For every node, a weighted vector of keywords is extracted and stored to describe the user's profile, as mentioned above.
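As an illustration of the citation matrix L described above, the following minimal sketch builds it from a list of (citing author, cited author) pairs. The function name `citation_matrix`, the input format, and the toy data are assumptions made for this example; the paper does not describe how the extraction was implemented.

```python
def citation_matrix(authors, citations):
    """Build L where L[i][j] is the number of times author i cites author j.

    `authors` is a list of author names and `citations` a list of
    (citing, cited) pairs; both formats are assumptions for this sketch.
    """
    index = {name: i for i, name in enumerate(authors)}
    L = [[0] * len(authors) for _ in authors]
    for citing, cited in citations:
        L[index[citing]][index[cited]] += 1
    return L


# Toy data (hypothetical): A cites C twice, B cites C and A once each.
authors = ["A", "B", "C"]
citations = [("A", "C"), ("A", "C"), ("B", "C"), ("B", "A")]
L = citation_matrix(authors, citations)
print(L)
```

A dense list of lists is enough for a toy example; for a network of several thousand authors a sparse representation would probably be preferable.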
We suppose that two authors are structurally similar if they cite a certain number of authors in common, or if they are cited by a certain number of authors in common. The similarity relation in this network is based on two matrices: the co-citation matrix and the bibliographic coupling matrix.

2.1.1 Co-citation matrix

The co-citation matrix measures the similarity between authors. It is computed by:

$$C_{ij} = \sum_{k=1}^{n} L_{ki} L_{kj} \quad (1)$$

where L is the matrix representing the social network of citations, as mentioned above (see figure 1). The product $L_{ki} L_{kj}$ is non-zero only when author k cites both i and j. According to this matrix, if two authors are cited together by a certain number of other authors (they appear in the bibliography of the same authors), then we can say that these two authors have similar interests.

2.1.2 Bibliographic coupling

The bibliographic coupling matrix is another similarity measure between authors, given by:

$$B_{ij} = \sum_{k=1}^{n} L_{ik} L_{jk} \quad (2)$$

The product $L_{ik} L_{jk}$ is non-zero only when authors i and j both cite author k. According to this matrix, if two authors cite a certain number of other authors in common, then these two authors are similar.
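Equations (1) and (2) amount to the matrix products $C = L^T L$ and $B = L L^T$. The sketch below computes both on a small hypothetical citation matrix using plain Python lists; the helper names (`transpose`, `matmul`, `cocitation`, `bibliographic_coupling`) are illustrative and not taken from the paper.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def cocitation(L):
    """C[i][j] = sum_k L[k][i] * L[k][j], i.e. C = L^T L (equation 1)."""
    return matmul(transpose(L), L)


def bibliographic_coupling(L):
    """B[i][j] = sum_k L[i][k] * L[j][k], i.e. B = L L^T (equation 2)."""
    return matmul(L, transpose(L))


# Hypothetical citation matrix: authors 0 and 1 both cite authors 2 and 3.
L = [
    [0, 0, 2, 1],
    [0, 0, 1, 3],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
C = cocitation(L)
B = bibliographic_coupling(L)
print(C[2][3])  # authors 2 and 3 are cited together: 2*1 + 1*3 = 5
print(B[0][1])  # authors 0 and 1 cite the same authors: 2*1 + 1*3 = 5
```

Both matrices are symmetric by construction, which is why the similarity graph derived from them can be undirected.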
2.1.3 Structural similarity graph

The similarity graph is defined from the sum of the two previous matrices, the co-citation matrix C and the bibliographic coupling matrix B. A similarity relation between two authors i and j is created if they cite the same authors or are cited by a common author, and if the two nodes satisfy the condition [B + C][i][j] >= threshold. In this way we obtain a similarity based social network from the citation based social network (see figure 1).

Figure 1: From citations graph to similarity graph.
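The thresholding step can be sketched as follows. This is a minimal illustration, assuming that B and C are square lists of lists, that the threshold is positive, and that self-similarity (the diagonal) is ignored; the function name `similarity_graph` and the toy matrices are hypothetical.

```python
def similarity_graph(B, C, threshold):
    """Return undirected weighted edges {(i, j): weight} with i < j.

    An edge is kept when (B + C)[i][j] >= threshold. Skipping the
    diagonal is an assumption of this sketch.
    """
    n = len(B)
    edges = {}
    for i in range(n):
        for j in range(i + 1, n):
            weight = B[i][j] + C[i][j]
            if weight >= threshold:
                edges[(i, j)] = weight
    return edges


# Toy matrices (hypothetical): authors 0 and 1 are coupled, 2 and 3 co-cited.
B = [[5, 5, 0, 0], [5, 10, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
C = [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 5, 5], [0, 0, 5, 10]]
print(similarity_graph(B, C, threshold=3))
print(similarity_graph(B, C, threshold=6))
```

Raising the threshold yields a sparser graph, lowering it a denser one.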
Figure 2: Two undirected similarity graphs extracted from the global directed graph; the first one is denser.

3 Recommendation algorithm

3.1 The algorithm idea

The idea is to propose a search algorithm that combines the semantic aspect, the structure, and the social properties of the network:

- The semantic part is the information stored about the actor (the person) within each node. In other terms, it consists of the user profile.
- The structural part is the information described by the network structure. Our contribution consists of using the maximum spanning tree to improve the search performance.
- The social part consists of using the betweenness of actors to retain certain paths that are more prestigious than others.
3.1.1 The semantic part

We compute the similarity between the request $R_X$ of an author X and the profile of an author Z:

- $R_X$ is the request of X, composed of a set of terms $T_i$: $R_X = \{T_1, T_2, \ldots, T_n\}$.
- $Pro_Z$ is the profile associated with the actor Z, represented by a set of weighted terms: $Pro_Z = \{(T_1, P_1), (T_2, P_2), \ldots, (T_m, P_m)\}$.

The similarity is given by:

$$sim(R_X, Pro_Z) = \frac{\sum_{j \in inter(R_X, Pro_Z)} Pro_Z.P_j}{\sum_{i=1}^{m} Pro_Z.P_i + |R_X \setminus Pro_Z|} \quad (3)$$

with:

$$inter(R_X, Pro_Z) = \{k \in \{1, \ldots, m\} \text{ such that } Pro_Z.T_k \in R_X\}$$

3.1.2 The structural part

We extract the maximum spanning tree from the weighted similarity graph using the Kruskal algorithm ([8, 4]), taking the maximum edge values instead of the minimum values. We aim to speed up the search by navigating the spanning tree instead of exploring the whole graph, or even a part of it.

3.1.3 Node betweenness

The betweenness centrality is given by the equation:

$$C_B(i) = \sum_{j < k,\; j, k \neq i} \frac{P_{jk}(i)}{P_{jk}} \quad (4)$$
where:

- $P_{jk}(i)$ is the number of shortest paths between j and k that pass through node i;
- $P_{jk}$ is the total number of shortest paths between j and k.

Using the betweenness lets the search favour certain privileged paths for the requested recommendation.

3.2 The algorithm

To build a recommendation, we propose to navigate a spanning tree instead of considering the whole graph. This helps to follow significant navigation paths and improves the system performance. The recommendation algorithm answers the user request by searching the extracted spanning tree.

The algorithm input is a request $R_X$ submitted by an author X. This request is formed as a chain of keywords $T_i$: $R_X = \{T_1, T_2, \ldots, T_n\}$.

The algorithm output is a response to the request of author X, presented as a weighted sequence of recommended authors $\{(Z_1, P_1), (Z_2, P_2), \ldots, (Z_n, P_n)\}$, together with the semantic chain connecting the two actors X and $Z_i$ (see figure 4). The semantic chain connecting X and $Z_i$ is the list of terms extracted from the profiles of the nodes (authors) on the path from X to $Z_i$.

The algorithm is as follows:

1. Compute the maximum spanning tree (see figure 3).
2. Compute and store the betweenness of all the nodes.
3. Extract from the spanning tree a ranked list of actors to recommend by using the exhaustive algorithm or the guided one.

Figure 3: The maximum spanning tree computed for the similarity graph.

3.3 The exhaustive version

1. Search the spanning tree starting from the user X (figure 4) using the breadth first strategy. We search for the nodes $Z_i$ to recommend to X, that is, those for which $sim(R_X, Pro_{Z_i}) >= threshold$.
2. Compute the rating $P_i$ associated with each author $Z_i$. This rating depends on two values: the similarity, and the betweenness centrality of the authors on the path to the solution.

$$P_i = \begin{cases} sim(R_X, Pro_{Z_i}) \times \dfrac{\sum_{j=1}^{l-1} C_B(Y_j)}{l} & \text{if } l > 1 \\ sim(R_X, Pro_{Z_i}) & \text{otherwise} \end{cases} \quad (5)$$

where $Y_1, Y_2, \ldots, Y_{l-1}$ is the set of authors on the path connecting X to $Z_i$, so that l is the length of this path.
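The steps above can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation. It assumes that the similarity graph is given as a dictionary of weighted edges, that profiles map terms to weights, that betweenness values have already been computed and are supplied as a dictionary, and that the rating multiplies the similarity by the sum of the betweenness values of the intermediate authors divided by the path length l (one reading of equation (5)). All function names and the toy data are hypothetical.

```python
from collections import deque


def maximum_spanning_tree(n, edges):
    """Kruskal with edges sorted by decreasing weight; returns adjacency lists."""
    parent = list(range(n))

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u

    tree = {u: [] for u in range(n)}
    for (u, v), _w in sorted(edges.items(), key=lambda e: -e[1]):
        ru, rv = find(u), find(v)
        if ru != rv:  # keep the edge only if it joins two components
            parent[ru] = rv
            tree[u].append(v)
            tree[v].append(u)
    return tree


def sim(request, profile):
    """Equation (3): matched weight over total weight plus missing terms."""
    matched = sum(w for term, w in profile.items() if term in request)
    missing = len([term for term in request if term not in profile])
    denom = sum(profile.values()) + missing
    return matched / denom if denom else 0.0


def exhaustive_recommend(tree, root, request, profiles, betweenness, threshold):
    """Breadth first search of the tree; returns (author, rating) by rating."""
    ranked = []
    queue = deque([(root, [])])  # (node, intermediate authors Y_1..Y_{l-1})
    seen = {root}
    while queue:
        node, inter = queue.popleft()
        if node != root:
            s = sim(request, profiles[node])
            if s >= threshold:
                l = len(inter) + 1  # number of edges between root and node
                if l > 1:
                    rating = s * sum(betweenness[y] for y in inter) / l
                else:
                    rating = s
                ranked.append((node, rating))
        for nxt in tree[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, inter + [node] if node != root else []))
    return sorted(ranked, key=lambda r: -r[1])


# Toy data (hypothetical): 4 authors, author 0 submits the request.
edges = {(0, 1): 5, (1, 2): 4, (0, 2): 1, (2, 3): 3}
profiles = {
    0: {},
    1: {"clustering": 0.5, "ranking": 0.5},
    2: {"databases": 1.0},
    3: {"clustering": 1.0},
}
betweenness = {0: 0.0, 1: 0.5, 2: 0.5, 3: 0.0}  # assumed precomputed
tree = maximum_spanning_tree(4, edges)
result = exhaustive_recommend(tree, 0, {"clustering", "ranking"},
                              profiles, betweenness, threshold=0.4)
print(result)
```

In this toy run the weakest edge (0, 2) is left out of the tree, so author 3 is reached through authors 1 and 2, whose betweenness values enter its rating.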
Figure 4: Searching the spanning tree using the breadth first search algorithm. An example of a list of authors to recommend is $[Z_4, Z_3, Z_1, Z_2]$, ranked according to their ratings; the semantic chain between X and $Z_4$ is $[pro(X), pro(Y_1), pro(Y_2), pro(Z_4)]$.

3.4 The guided version

We propose a second, more efficient version that finds the search path in the spanning tree more quickly than the breadth first strategy. We use a heuristic to choose the next node to visit among a set of candidate nodes. We apply the A* algorithm, which chooses the node Y that maximises the following heuristic:

$$h(Y) = sim(Pro_X, Pro_Y) \times C_B(Y)$$

until we reach a node Z that satisfies $sim(X, Z) >= threshold$.
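The guided search can be approximated by a priority-queue traversal of the tree, sketched below. Several assumptions are made: this is a greedy best-first search rather than a full A* with path costs; the heuristic is taken to be the product of the similarity and the betweenness; the request terms stand in for the profile of X in the similarity; betweenness values are supplied precomputed; and the tree is given as adjacency lists. The function names and the toy data are hypothetical.

```python
import heapq


def sim(request, profile):
    """Similarity between a set of request terms and a weighted profile."""
    matched = sum(w for term, w in profile.items() if term in request)
    missing = len([term for term in request if term not in profile])
    denom = sum(profile.values()) + missing
    return matched / denom if denom else 0.0


def guided_recommend(tree, root, request, profiles, betweenness, threshold):
    """Best-first search; returns (author or None, number of visited nodes)."""
    frontier = [(0.0, root)]  # min-heap on -h, so the largest h is popped first
    seen = {root}
    visited = 0
    while frontier:
        _, node = heapq.heappop(frontier)
        visited += 1
        if node != root and sim(request, profiles[node]) >= threshold:
            return node, visited
        for nxt in tree[node]:
            if nxt not in seen:
                seen.add(nxt)
                h = sim(request, profiles[nxt]) * betweenness[nxt]
                heapq.heappush(frontier, (-h, nxt))
    return None, visited


# Toy tree (hypothetical): 0 is the requesting author.
tree = {0: [1, 2], 1: [0, 3], 2: [0, 4], 3: [1], 4: [2]}
profiles = {
    0: {},
    1: {"clustering": 0.2, "databases": 0.8},
    2: {"clustering": 0.5, "networks": 0.5},
    3: {"clustering": 1.0},
    4: {"clustering": 1.0},
}
betweenness = {0: 0.2, 1: 0.1, 2: 0.6, 3: 0.3, 4: 0.4}  # assumed precomputed
found, visited = guided_recommend(tree, 0, {"clustering"},
                                  profiles, betweenness, threshold=0.8)
print(found, visited)
```

In this toy run the search follows the branch through author 2 and stops at author 4 after visiting 3 of the 5 nodes, which is the kind of reduction in explored nodes that the guided version aims for.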
We can prove that our heuristic is monotone and that it decreases slowly along the solution path; we can also prove that it recognizes the solution. Moreover, the experiments (see next section) show that this version converges more quickly to the solution and explores only 11% to 49% of the part of the spanning tree explored by the exhaustive version.

4 Experiments

Table 1 presents some statistics about the social network that describes the similarity between authors: number of nodes, number of edges, and graph density (in social networks the graph density is small). Figure 5 presents the degree distribution of this network and shows that it follows a power law.

Number of nodes    7065
Number of edges
Graph density      4,

Table 1: Some statistics about the social network describing the structural similarity between authors.

We now present an example of an experiment. Suppose that the author Francesco Masulli submits a request composed of three terms: $T_1$ = Ranking, $T_2$ = Clustering, $T_3$ = Data mining. By applying the algorithm, we obtained table 2 as output, which shows a group of ranked authors to recommend. This table gives the names of the recommended authors, their rating values, and their distances from the requesting author.

Figure 5: Degree distribution of the extracted social network.

Author               Rating    Distance
Mikolaj Morzy
Steven Warner
Bob Garcia
Wendy Gersten
Manuel Lozano
Matthias Schonlau    9e
Lyane T Watson       7.89e
Carl Wunsch          6.78e
Yang Seok Kim        3.38e
David W Aha          2.39e

Table 2: Recommendation results: the found authors, their ratings and their distances from the root author who sent the request.

We have also evaluated the guided version against the exhaustive one. We ran ten experiments; each experiment begins with a request formulated by an author X (who becomes the root of the spanning tree). For each request, we apply both versions of the algorithm and collect the following measurements (see table 3):

- The rank of the author found (recommended) by the guided algorithm. Recall that, for the same request, the exhaustive algorithm proposes a set of recommended authors together with their ranks.
- The number of nodes visited by the guided algorithm.
- The computation time.

We notice that in 8 experiments (see table 3) the guided version finds the author of rank 1, while in the 2 other experiments it finds the author of rank 2. Only a part of the spanning tree is searched by the guided version: the search space is reduced to between 11% and 49% of its original size. The computation time is also reduced.

     Exhaustive algorithm                      A* algorithm
N    Recommended author  Rating  Time          Recommended author (rank)  Time      Explored graph
1    Andrew Emili                ,41s          Andrew Emili (1)           109,27s   39.25%
2    G V Belle                   ,35s          G V Belle (1)              17,45s    21.13%
3    Hans A Kestler              ,41s          Yuichi Asahiro (2)         11,66s    13.86%
4    Jimin Pei                   ,61s          Jimin Pei (1)              32,52s    20.02%
5    John F Canny                ,99s          John F Canny (1)           21,77s    11.77%
6    C Wang                      ,37s          C Wang (1)                 233,99s   49.13%
7    J Michael Brady             ,68s          J Michael Brady (1)        118,74s   41.14%
8    Peter G Neumann             ,72s          Elizabeth J O neil (2)     40,49s    24.88%
9    Peter Eades                 ,95s          Peter Eades (1)            54,47s    30.95%
10   Liang Chen                  s             Liang Chen (1)             14,14s    16.67%

Table 3: Comparison of the breadth first search algorithm and the A* algorithm. Each experiment corresponds to a request sent by a root author; the recommended authors are those with the highest rating. The first author found by the exhaustive algorithm is also found by A* in 8 experiments.

5 Related work

Graph algorithms have been used for expert recommendation in social networks. The main strategies are ([14]):

- Breadth First Search, which broadcasts the query to every person in a social network.
- Random Walk Search (RWS), which randomly chooses one of the current user's neighbors to whom the query is passed.
- Best Connected Search (BCS), proposed by [3], which makes use of the skewed degree distribution of many networks.
- The Weak and Strong Ties algorithms, which are based on the idea that the connections between two individuals can have different strengths. The strength of association varies and is not always symmetric.
- Hamming Distance Search (HDS), which picks the neighbor who has the most friends not shared with the current user.
- Information Scent Search (ISS), which picks the next person with the highest match score between the query and the profile.

Searching for expertise in social networks has been studied by Zhang and Ackerman since 2005 ([14]). Graph search strategies were applied and evaluated on the Enron data ([14]). The evaluation criteria are the number of people used per query and the depth of the query chain. ISS is not obviously better than the out-degree based strategies (BCS and HDS). Weak ties have been found to be important in helping people get new information, and they are expected to be critical for automated expertise finding as well. The out-degree strategies such as BCS and HDS have clear advantages over other strategies in networks like Enron's ([14]).

In ([6]), the problem of expertise identification using communications is treated. A content-based algorithm is compared with a graph based algorithm that uses the HITS algorithm and takes into consideration both text and communication. Results show that the graph based algorithm performs better. The same idea is developed in ([10]), showing that
social network analysis techniques such as the expertise propagation algorithm lead to a significant performance improvement. In ([15]), the recommendation is formalized as a ranking problem over a heterogeneous social network. Random Walk Search is used to build a recommendation when a person is doing a search or browsing the information.

On the other hand, ([13]) studies the relations between authors in three networks of scientific collaborations, and the different interactions between them. Two authors are connected if they have a paper in common. ([11]) studies the structure of the social network of mathematical papers and the relations between authors in the mathematical field. The nodes of this network are the mathematicians and the edges are the papers they have in common; the evolution of this network over time (number of authors, number of papers) is also presented.

6 Conclusion

In this paper, we propose an algorithm for computing skills recommendations in a professional social network. We study a professional network of connected authors, in which each author has a profile description. The nodes of this network are authors and the edges represent the similarity between them. Our objective is to recommend a group of ranked authors in response to a given author who submits a query. This recommendation is based, on the one hand, on the similarity between the authors' profiles and the submitted request and, on the other hand, on the betweenness centrality of the authors found on the search paths. To search the graph we first extract the
most representative spanning tree and then we explore this tree. The first proposed algorithm is exhaustive: it uses the breadth first strategy to explore the spanning tree until it finds suitable authors to recommend. The second algorithm uses the A* algorithm to search the spanning tree instead of the breadth first strategy. We specify an admissible heuristic that depends on the similarity between the submitted request and the user profile on the one hand, and on the betweenness measure on the other hand. Experimental results show that the guided version proposes the best recommendation in most cases and improves the search performance. Comparing both algorithms, we notice that the guided version explores only 11% to 49% of the original search space.

We are now working on the elaboration of the user profile using a domain ontology representation. We aim to extend our algorithm to search several connected communities. We will also try to use the spanning tree for semantic purposes, which could for example lead to the discovery of semantic matchings between different communities.

References

[1] École d'été web intelligence. In WI09. Université de Lyon.
[2] Lada A. Adamic and Eytan Adar. How to search a social network, July.
[3] Lada A. Adamic, Orkut Buyukkokten, and Eytan Adar. A social network caught in the web. First Monday, 8(6).
[4] C. Berge. Graphes. Gauthier-Villars.
[5] Vincent D. Blondel, Jean-Loup Guillaume, Renaud Lambiotte, and Etienne Lefebvre. Fast unfolding of community hierarchies in large networks. CoRR, abs/.
[6] Christopher S. Campbell, Paul P. Maglio, Alex Cozzi, and Byron Dom. Expertise identification using communications. In CIKM.
[7] M. G. Everett and S. P. Borgatti. The centrality of groups and classes. Journal of Mathematical Sociology, 23(3).
[8] J-C. Fournier. Théorie des graphes et applications. Lavoisier.
[9] Linton C. Freeman. Centrality in social networks: Conceptual clarification. Social Networks, 1(3).
[10] Yupeng Fu, Rongjing Xiang, Yiqun Liu, Min Zhang, and Shaoping Ma. Finding experts using social network analysis. In Web Intelligence, pages 77-80.
[11] J.W. Grossman. The evolution of the mathematical research collaboration graph. Congressus Numerantium.
[12] M. E. J. Newman. The structure and function of complex networks. SIAM Review, 45, Mar.
[13] M. E. J. Newman. Coauthorship networks and patterns of scientific collaboration. Proceedings of the National Academy of Sciences of the United States of America (PNAS), 101.
[14] Jun Zhang and Mark S. Ackerman. Searching for expertise in social networks: a simulation of potential strategies. In GROUP, pages 71-80.
[15] Jun Zhang, Mark S. Ackerman, and Lada Adamic. Expertise networks in online communities: structure and algorithms. In WWW '07: Proceedings of the 16th international conference on World Wide Web, New York, NY, USA. ACM.
More informationTHE KNOWLEDGE MANAGEMENT STRATEGY IN ORGANIZATIONS. Summer semester, 2016/2017
THE KNOWLEDGE MANAGEMENT STRATEGY IN ORGANIZATIONS Summer semester, 2016/2017 SOCIAL NETWORK ANALYSIS: THEORY AND APPLICATIONS 1. A FEW THINGS ABOUT NETWORKS NETWORKS IN THE REAL WORLD There are four categories
More informationMy favorite application using eigenvalues: partitioning and community detection in social networks
My favorite application using eigenvalues: partitioning and community detection in social networks Will Hobbs February 17, 2013 Abstract Social networks are often organized into families, friendship groups,
More informationCS224W: Analysis of Networks Jure Leskovec, Stanford University
CS224W: Analysis of Networks Jure Leskovec, Stanford University http://cs224w.stanford.edu 11/13/17 Jure Leskovec, Stanford CS224W: Analysis of Networks, http://cs224w.stanford.edu 2 Observations Models
More informationQuery Independent Scholarly Article Ranking
Query Independent Scholarly Article Ranking Shuai Ma, Chen Gong, Renjun Hu, Dongsheng Luo, Chunming Hu, Jinpeng Huai SKLSDE Lab, Beihang University, China Beijing Advanced Innovation Center for Big Data
More informationAn Optimal Allocation Approach to Influence Maximization Problem on Modular Social Network. Tianyu Cao, Xindong Wu, Song Wang, Xiaohua Hu
An Optimal Allocation Approach to Influence Maximization Problem on Modular Social Network Tianyu Cao, Xindong Wu, Song Wang, Xiaohua Hu ACM SAC 2010 outline Social network Definition and properties Social
More informationThe PageRank Citation Ranking
October 17, 2012 Main Idea - Page Rank web page is important if it points to by other important web pages. *Note the recursive definition IR - course web page, Brian home page, Emily home page, Steven
More informationEffective Latent Space Graph-based Re-ranking Model with Global Consistency
Effective Latent Space Graph-based Re-ranking Model with Global Consistency Feb. 12, 2009 1 Outline Introduction Related work Methodology Graph-based re-ranking model Learning a latent space graph A case
More informationGraph Sampling Approach for Reducing. Computational Complexity of. Large-Scale Social Network
Journal of Innovative Technology and Education, Vol. 3, 216, no. 1, 131-137 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/1.12988/jite.216.6828 Graph Sampling Approach for Reducing Computational Complexity
More informationTheme Identification in RDF Graphs
Theme Identification in RDF Graphs Hanane Ouksili PRiSM, Univ. Versailles St Quentin, UMR CNRS 8144, Versailles France hanane.ouksili@prism.uvsq.fr Abstract. An increasing number of RDF datasets is published
More informationUAPRIORI: AN ALGORITHM FOR FINDING SEQUENTIAL PATTERNS IN PROBABILISTIC DATA
UAPRIORI: AN ALGORITHM FOR FINDING SEQUENTIAL PATTERNS IN PROBABILISTIC DATA METANAT HOOSHSADAT, SAMANEH BAYAT, PARISA NAEIMI, MAHDIEH S. MIRIAN, OSMAR R. ZAÏANE Computing Science Department, University
More informationAn Edge-Swap Heuristic for Finding Dense Spanning Trees
Theory and Applications of Graphs Volume 3 Issue 1 Article 1 2016 An Edge-Swap Heuristic for Finding Dense Spanning Trees Mustafa Ozen Bogazici University, mustafa.ozen@boun.edu.tr Hua Wang Georgia Southern
More informationCompetitive Intelligence and Web Mining:
Competitive Intelligence and Web Mining: Domain Specific Web Spiders American University in Cairo (AUC) CSCE 590: Seminar1 Report Dr. Ahmed Rafea 2 P age Khalid Magdy Salama 3 P age Table of Contents Introduction
More informationRanking of nodes of networks taking into account the power function of its weight of connections
Ranking of nodes of networks taking into account the power function of its weight of connections Soboliev A.M. 1, Lande D.V. 2 1 Post-graduate student of the Institute for Special Communications and Information
More informationMining High Average-Utility Itemsets
Proceedings of the 2009 IEEE International Conference on Systems, Man, and Cybernetics San Antonio, TX, USA - October 2009 Mining High Itemsets Tzung-Pei Hong Dept of Computer Science and Information Engineering
More information2 Approaches to worldwide web information retrieval
The WEBFIND tool for finding scientific papers over the worldwide web. Alvaro E. Monge and Charles P. Elkan Department of Computer Science and Engineering University of California, San Diego La Jolla,
More informationBring Semantic Web to Social Communities
Bring Semantic Web to Social Communities Jie Tang Dept. of Computer Science, Tsinghua University, China jietang@tsinghua.edu.cn April 19, 2010 Abstract Recently, more and more researchers have recognized
More informationAn Exploratory Journey Into Network Analysis A Gentle Introduction to Network Science and Graph Visualization
An Exploratory Journey Into Network Analysis A Gentle Introduction to Network Science and Graph Visualization Pedro Ribeiro (DCC/FCUP & CRACS/INESC-TEC) Part 1 Motivation and emergence of Network Science
More informationCommunity Detection: Comparison of State of the Art Algorithms
Community Detection: Comparison of State of the Art Algorithms Josiane Mothe IRIT, UMR5505 CNRS & ESPE, Univ. de Toulouse Toulouse, France e-mail: josiane.mothe@irit.fr Karen Mkhitaryan Institute for Informatics
More informationOn Finding Power Method in Spreading Activation Search
On Finding Power Method in Spreading Activation Search Ján Suchal Slovak University of Technology Faculty of Informatics and Information Technologies Institute of Informatics and Software Engineering Ilkovičova
More informationEvaluating the Usefulness of Sentiment Information for Focused Crawlers
Evaluating the Usefulness of Sentiment Information for Focused Crawlers Tianjun Fu 1, Ahmed Abbasi 2, Daniel Zeng 1, Hsinchun Chen 1 University of Arizona 1, University of Wisconsin-Milwaukee 2 futj@email.arizona.edu,
More informationRemotely Sensed Image Processing Service Automatic Composition
Remotely Sensed Image Processing Service Automatic Composition Xiaoxia Yang Supervised by Qing Zhu State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University
More informationIntroduction to network metrics
Universitat Politècnica de Catalunya Version 0.5 Complex and Social Networks (2018-2019) Master in Innovation and Research in Informatics (MIRI) Instructors Argimiro Arratia, argimiro@cs.upc.edu, http://www.cs.upc.edu/~argimiro/
More informationMining Social Network Graphs
Mining Social Network Graphs Analysis of Large Graphs: Community Detection Rafael Ferreira da Silva rafsilva@isi.edu http://rafaelsilva.com Note to other teachers and users of these slides: We would be
More informationClusters and Communities
Clusters and Communities Lecture 7 CSCI 4974/6971 22 Sep 2016 1 / 14 Today s Biz 1. Reminders 2. Review 3. Communities 4. Betweenness and Graph Partitioning 5. Label Propagation 2 / 14 Today s Biz 1. Reminders
More informationSocial Network Analysis With igraph & R. Ofrit Lesser December 11 th, 2014
Social Network Analysis With igraph & R Ofrit Lesser ofrit.lesser@gmail.com December 11 th, 2014 Outline The igraph R package Basic graph concepts What can you do with igraph? Construction Attributes Centrality
More informationResearch and Analysis of Structural Hole and Matching Coefficient
J. Software Engineering & Applications, 010, 3, 1080-1087 doi:10.436/jsea.010.31117 Published Online November 010 (http://www.scirp.org/journal/jsea) Research and Analysis of Structural Hole and Matching
More informationIntroduction to Machine Learning
Introduction to Machine Learning Clustering Varun Chandola Computer Science & Engineering State University of New York at Buffalo Buffalo, NY, USA chandola@buffalo.edu Chandola@UB CSE 474/574 1 / 19 Outline
More informationThe Gene Modular Detection of Random Boolean Networks by Dynamic Characteristics Analysis
Journal of Materials, Processing and Design (2017) Vol. 1, Number 1 Clausius Scientific Press, Canada The Gene Modular Detection of Random Boolean Networks by Dynamic Characteristics Analysis Xueyi Bai1,a,
More informationLink Analysis in the Cloud
Cloud Computing Link Analysis in the Cloud Dell Zhang Birkbeck, University of London 2017/18 Graph Problems & Representations What is a Graph? G = (V,E), where V represents the set of vertices (nodes)
More informationCS 380 ALGORITHM DESIGN AND ANALYSIS
CS 380 ALGORITHM DESIGN AND ANALYSIS Lecture 1: Course Introduction and Motivation Text Reference: Chapters 1, 2 Syllabus Book Schedule Grading: Assignments/Projects/Exams/Quizzes Policies Late Policy
More informationLearning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li
Learning to Match Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li 1. Introduction The main tasks in many applications can be formalized as matching between heterogeneous objects, including search, recommendation,
More informationCitation Prediction in Heterogeneous Bibliographic Networks
Citation Prediction in Heterogeneous Bibliographic Networks Xiao Yu Quanquan Gu Mianwei Zhou Jiawei Han University of Illinois at Urbana-Champaign {xiaoyu1, qgu3, zhou18, hanj}@illinois.edu Abstract To
More informationResearch on Community Structure in Bus Transport Networks
Commun. Theor. Phys. (Beijing, China) 52 (2009) pp. 1025 1030 c Chinese Physical Society and IOP Publishing Ltd Vol. 52, No. 6, December 15, 2009 Research on Community Structure in Bus Transport Networks
More informationFSRM Feedback Algorithm based on Learning Theory
Send Orders for Reprints to reprints@benthamscience.ae The Open Cybernetics & Systemics Journal, 2015, 9, 699-703 699 FSRM Feedback Algorithm based on Learning Theory Open Access Zhang Shui-Li *, Dong
More informationResearch and implementation of search engine based on Lucene Wan Pu, Wang Lisha
2nd International Conference on Advances in Mechanical Engineering and Industrial Informatics (AMEII 2016) Research and implementation of search engine based on Lucene Wan Pu, Wang Lisha Physics Institute,
More informationIntroduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p.
Introduction p. 1 What is the World Wide Web? p. 1 A Brief History of the Web and the Internet p. 2 Web Data Mining p. 4 What is Data Mining? p. 6 What is Web Mining? p. 6 Summary of Chapters p. 8 How
More informationIncorporating Satellite Documents into Co-citation Networks for Scientific Paper Searches
Incorporating Satellite Documents into Co-citation Networks for Scientific Paper Searches Masaki Eto Gakushuin Women s College Tokyo, Japan masaki.eto@gakushuin.ac.jp Abstract. To improve the search performance
More informationOn Demand Phenotype Ranking through Subspace Clustering
On Demand Phenotype Ranking through Subspace Clustering Xiang Zhang, Wei Wang Department of Computer Science University of North Carolina at Chapel Hill Chapel Hill, NC 27599, USA {xiang, weiwang}@cs.unc.edu
More informationA New Evaluation Method of Node Importance in Directed Weighted Complex Networks
Journal of Systems Science and Information Aug., 2017, Vol. 5, No. 4, pp. 367 375 DOI: 10.21078/JSSI-2017-367-09 A New Evaluation Method of Node Importance in Directed Weighted Complex Networks Yu WANG
More informationSmallBlue: Unlock Collective Intelligence from Information Flows in Social Networks
SmallBlue: Unlock Collective Intelligence from Information Flows in Social Networks Dashun Wang Northeastern University 110 Forsyth Street, Boston, MA 02115 Zhen Wen, Ching-Yung Lin IBM T. J. Watson Research
More informationCHAPTER 5 OPTIMAL CLUSTER-BASED RETRIEVAL
85 CHAPTER 5 OPTIMAL CLUSTER-BASED RETRIEVAL 5.1 INTRODUCTION Document clustering can be applied to improve the retrieval process. Fast and high quality document clustering algorithms play an important
More informationMining Web Data. Lijun Zhang
Mining Web Data Lijun Zhang zlj@nju.edu.cn http://cs.nju.edu.cn/zlj Outline Introduction Web Crawling and Resource Discovery Search Engine Indexing and Query Processing Ranking Algorithms Recommender Systems
More informationRandomized rounding of semidefinite programs and primal-dual method for integer linear programming. Reza Moosavi Dr. Saeedeh Parsaeefard Dec.
Randomized rounding of semidefinite programs and primal-dual method for integer linear programming Dr. Saeedeh Parsaeefard 1 2 3 4 Semidefinite Programming () 1 Integer Programming integer programming
More informationIMPROVING INFORMATION RETRIEVAL BASED ON QUERY CLASSIFICATION ALGORITHM
IMPROVING INFORMATION RETRIEVAL BASED ON QUERY CLASSIFICATION ALGORITHM Myomyo Thannaing 1, Ayenandar Hlaing 2 1,2 University of Technology (Yadanarpon Cyber City), near Pyin Oo Lwin, Myanmar ABSTRACT
More informationAlgorithmic and Economic Aspects of Networks. Nicole Immorlica
Algorithmic and Economic Aspects of Networks Nicole Immorlica Syllabus 1. Jan. 8 th (today): Graph theory, network structure 2. Jan. 15 th : Random graphs, probabilistic network formation 3. Jan. 20 th
More informationVisoLink: A User-Centric Social Relationship Mining
VisoLink: A User-Centric Social Relationship Mining Lisa Fan and Botang Li Department of Computer Science, University of Regina Regina, Saskatchewan S4S 0A2 Canada {fan, li269}@cs.uregina.ca Abstract.
More informationCS246: Mining Massive Datasets Jure Leskovec, Stanford University
CS246: Mining Massive Datasets Jure Leskovec, Stanford University http://cs246.stanford.edu SPAM FARMING 2/11/2013 Jure Leskovec, Stanford C246: Mining Massive Datasets 2 2/11/2013 Jure Leskovec, Stanford
More informationCPSC 532L Project Development and Axiomatization of a Ranking System
CPSC 532L Project Development and Axiomatization of a Ranking System Catherine Gamroth cgamroth@cs.ubc.ca Hammad Ali hammada@cs.ubc.ca April 22, 2009 Abstract Ranking systems are central to many internet
More informationBug Triaging: Profile Oriented Developer Recommendation
Bug Triaging: Profile Oriented Developer Recommendation Anjali Sandeep Kumar Singh Department of Computer Science and Engineering, Jaypee Institute of Information Technology Abstract Software bugs are
More informationIntroduction to Engineering Systems, ESD.00. Networks. Lecturers: Professor Joseph Sussman Dr. Afreen Siddiqi TA: Regina Clewlow
Introduction to Engineering Systems, ESD.00 Lecture 7 Networks Lecturers: Professor Joseph Sussman Dr. Afreen Siddiqi TA: Regina Clewlow The Bridges of Königsberg The town of Konigsberg in 18 th century
More informationCSE 701: LARGE-SCALE GRAPH MINING. A. Erdem Sariyuce
CSE 701: LARGE-SCALE GRAPH MINING A. Erdem Sariyuce WHO AM I? My name is Erdem Office: 323 Davis Hall Office hours: Wednesday 2-4 pm Research on graph (network) mining & management Practical algorithms
More informationPatent Classification Using Ontology-Based Patent Network Analysis
Association for Information Systems AIS Electronic Library (AISeL) PACIS 2010 Proceedings Pacific Asia Conference on Information Systems (PACIS) 2010 Patent Classification Using Ontology-Based Patent Network
More informationBibliometrics: Citation Analysis
Bibliometrics: Citation Analysis Many standard documents include bibliographies (or references), explicit citations to other previously published documents. Now, if you consider citations as links, academic
More informationDistributed minimum spanning tree problem
Distributed minimum spanning tree problem Juho-Kustaa Kangas 24th November 2012 Abstract Given a connected weighted undirected graph, the minimum spanning tree problem asks for a spanning subtree with
More informationAcademic Recommendation using Citation Analysis with theadvisor
Academic Recommendation using Citation Analysis with theadvisor joint work with Onur Küçüktunç, Kamer Kaya, Ümit V. Çatalyürek esaule@bmi.osu.edu Department of Biomedical Informatics The Ohio State University
More informationEXTRACTION OF RELEVANT WEB PAGES USING DATA MINING
Chapter 3 EXTRACTION OF RELEVANT WEB PAGES USING DATA MINING 3.1 INTRODUCTION Generally web pages are retrieved with the help of search engines which deploy crawlers for downloading purpose. Given a query,
More informationCOMMUNITY SHELL S EFFECT ON THE DISINTEGRATION OF SOCIAL NETWORKS
Annales Univ. Sci. Budapest., Sect. Comp. 43 (2014) 57 68 COMMUNITY SHELL S EFFECT ON THE DISINTEGRATION OF SOCIAL NETWORKS Imre Szücs (Budapest, Hungary) Attila Kiss (Budapest, Hungary) Dedicated to András
More informationPredicting User Ratings Using Status Models on Amazon.com
Predicting User Ratings Using Status Models on Amazon.com Borui Wang Stanford University borui@stanford.edu Guan (Bell) Wang Stanford University guanw@stanford.edu Group 19 Zhemin Li Stanford University
More information