A New Evaluation Method of Node Importance in Directed Weighted Complex Networks

Journal of Systems Science and Information, Aug. 2017, Vol. 5, No. 4, pp. 367–375
DOI: 10.21078/JSSI-2017-367-09

Yu WANG
School of Management, University of Shanghai for Science and Technology, Shanghai 200093, China
E-mail: 1540542869@qq.com

Jinli GUO
School of Management, University of Shanghai for Science and Technology, Shanghai 200093, China
E-mail: phd5816@163.com

Han LIU
Trade and Technology Department, Xijing University, Xi'an 710123, China

Abstract  Current research on node importance evaluation focuses mainly on undirected and unweighted networks, which fail to reflect the real world comprehensively and objectively. Based on directed weighted complex network models, the paper introduces the concept of the in-weight intensity of nodes and presents a new method to identify key nodes by using an importance evaluation matrix. The method considers not only the direction and weight of edges, but also the position importance of nodes and the importance contributions of adjacent nodes. Finally, the paper applies the algorithm to a microblog-forwarding network composed of 34 users and compares the evaluation results with those of traditional methods. The experiment shows that the proposed method can effectively evaluate node importance in directed weighted networks.

Keywords  directed weighted complex network; node importance; in-weight intensity; importance contribution

1 Introduction

With the deepening study of complex networks, how to guarantee the reliability and invulnerability of networks has become an important research subject [1]. Currently, research on node importance evaluation focuses mainly on undirected and unweighted networks [2–6], which describe only qualitatively whether nodes interact, rather than how strong the interactions are.
However, most real networks are directed and weighted [3], such as web page networks, citation networks, and food chain networks. In these networks, the interactions between nodes have a clear direction and may not be reciprocal. Evaluating node importance in directed weighted networks is therefore of practical significance: it helps to find the weak links in a network, identify key nodes, and cope effectively with random and deliberate attacks, thereby improving the reliability of the whole network [7].

Received June 10, 2016, accepted December 19, 2016. Supported by the National Natural Science Foundation of China (71571119).

In recent years, researchers have proposed many valuable methods for evaluating node importance, drawing on social network analysis, systems science, and information search. Social network analysis methods rest on the idea that significance is equivalent to importance. Common indices include degree centrality [8], betweenness centrality [9], closeness centrality [10], eigenvector centrality, and the cumulative nomination. However, Jin et al. [11] argued that each of these methods considers only a single attribute of nodes and is too one-sided to reflect node importance accurately and fully. For example, degree centrality is so simple that it requires only the local structure around a node and ignores the global structure of the network. Betweenness centrality appears to measure node importance well in a global sense: the more shortest paths pass through a node, the more important the node is. Nevertheless, its high computational complexity makes it unsuitable for large-scale complex networks.

Systems science methods rest on the idea that destructiveness is equivalent to importance. They reflect node importance by measuring the damage that deleting a node causes to network connectivity. Typical methods include the node deletion method based on the number of minimum spanning trees [12], the node contraction method [5], and the shortest path method [13]. For instance, the node contraction method focuses on the geometric position of nodes in the network. However, if contracting different nodes yields the same network topology, the nodes cannot be distinguished and are assigned the same importance.
Information search methods study the importance of page nodes by analyzing the link relationships between pages. Typical methods include the PageRank algorithm [14] and the HITS algorithm [15]. These methods tend to be more effective for evaluating node importance in directed networks, but they do not apply to undirected networks.

Each of the above methods is useful in its own setting and offers valuable guidance and reference. However, they are mostly based on undirected and unweighted networks. They also fail to consider the importance contributions of adjacent nodes, ignoring the interdependence between nodes. In theory, the importance of a node depends not only on its own attributes, such as degree and betweenness, but also on the importance of its adjacent nodes.

Based on the above considerations, the paper puts forward a new method to identify key nodes in directed weighted networks by using an importance evaluation matrix. The method considers not only the positions of nodes and the importance contributions of adjacent nodes, but also the direction and weights of edges, making node importance evaluation more practically meaningful. In Section 2, the paper constructs a directed weighted network model and uses the node efficiency to characterize the role of nodes in network information circulation. In Section 3, the paper introduces the concept of the in-weight intensity of nodes; the importance evaluation matrix constructed from the in-weight intensity and the node efficiency reflects the local importance of nodes. The paper then presents a specific evaluation algorithm. In Section 4, to verify the effectiveness of the method, the paper carries out an empirical analysis and compares the results with traditional indices. Finally, a summary is given in Section 5.

2 Directed Weighted Complex Network Model

Many network models (including small-world networks and BA evolving networks) ignore the directionality of networks. In fact, many important real-world networks are directed, including telephone networks and metabolic networks. Unweighted networks reflect only the simple connection pattern between nodes, which limits the depth and completeness of the analysis. In view of this, the paper describes a directed weighted network model. The model considers the direction of interactions between nodes and also describes the strength of the interactions with edge weights, thereby presenting the network structure more objectively and completely.

Definition 2.1  Directed weighted network $G = (V, E, W)$. In this definition, $V = \{v_1, v_2, \ldots, v_n\}$ is the set of nodes and $E = \{e_1, e_2, \ldots, e_n\} \subseteq V \times V$ is the set of edges. $v_i \in V$ $(i = 1, 2, \ldots, n)$ represents a node in the network, $(v_i, v_j) \in E$ represents a directed edge from node $v_i$ to node $v_j$, $W$ is the weight matrix of the directed edges, and $w_{ij}$ is the weight (or connection strength) of the directed edge $(v_i, v_j)$; similarly, $w_{ji}$ is the weight of the directed edge $(v_j, v_i)$. If node $v_i$ and node $v_j$ are not connected, we set $w_{ij} = 0$ and $w_{ji} = 0$. Generally, for any $v_i$, we set $w_{ii} = 0$.

$$W = \begin{bmatrix} w_{11} & \cdots & w_{1n} \\ w_{21} & \cdots & w_{2n} \\ \vdots & \ddots & \vdots \\ w_{n1} & \cdots & w_{nn} \end{bmatrix}. \tag{1}$$

Since the paper studies directed graphs, the weight matrix is generally asymmetric, namely $w_{ij} \neq w_{ji}$.

Definition 2.2  The efficiency $I_k$ of node $k$ is defined as

$$I_k = \frac{1}{n-1} \sum_{i=1,\, i \neq k}^{n} \frac{1}{d_{ki}}, \tag{2}$$

where $n$ is the total number of nodes in the network and $d_{ki}$ is the distance from node $v_k$ to node $v_i$. From the definition of $I_k$, the node efficiency characterizes the average difficulty of reaching the other nodes from a given node, and reflects the contribution that the node makes to information transmission in the network.
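As a minimal sketch of Equation (2), the Python snippet below computes the node efficiency from a 0/1 adjacency matrix. Two points are assumptions rather than statements from the paper: the distance $d_{ki}$ is taken to be the hop count along directed edges, and an unreachable node contributes $1/d_{ki} = 0$. Both readings are consistent with the efficiencies reported in Table 1. The function names and the three-node chain are hypothetical and serve only as illustration.

```python
import math


def hop_distances(adj):
    """All-pairs shortest hop counts on a directed graph (Floyd-Warshall)."""
    n = len(adj)
    dist = [[0 if i == j else (1 if adj[i][j] else math.inf) for j in range(n)]
            for i in range(n)]
    for m in range(n):          # intermediate node
        for i in range(n):
            for j in range(n):
                if dist[i][m] + dist[m][j] < dist[i][j]:
                    dist[i][j] = dist[i][m] + dist[m][j]
    return dist


def node_efficiency(adj):
    """Efficiency I_k of every node, following Equation (2).

    Assumption: unreachable nodes have distance inf, so they add 1/inf = 0.
    """
    n = len(adj)
    dist = hop_distances(adj)
    return [sum(1.0 / dist[k][i] for i in range(n) if i != k) / (n - 1)
            for k in range(n)]


# Hypothetical chain 0 -> 1 -> 2 (not taken from the paper).
chain = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
efficiency = node_efficiency(chain)
print(efficiency)
```

In this chain, node 0 reaches node 1 in one hop and node 2 in two hops, so its efficiency is (1 + 1/2)/2; node 2 reaches no other node, so its efficiency is 0.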
The greater the efficiency, the more important the position the node may hold in the transmission process. Thus, the efficiency can, to a certain extent, reflect the global importance of nodes.

3 Node Importance Evaluation Method

3.1 Construction of Importance Evaluation Matrix

A complex network is a unified whole that contains a large number of nodes, with complicated interactions and interdependencies between them. For instance, although a bridge node may have quite a small degree, the network relies on it heavily; that is, other nodes make large importance contributions to it. As a result, the node has a higher importance. It is therefore inaccurate to determine node importance from a single index. Zhao et al. [16] proposed a node importance contribution matrix method in which a node contributes its initial importance, characterized by betweenness, evenly to its adjacent nodes, with its degree as the divisor. However, it is improper for the importance contributions of a node to be inversely proportional to its degree. Moreover, the method is applicable only to undirected and unweighted networks, which is not practical.

To overcome these defects, the paper puts forward a new method for ranking node importance in directed weighted networks. The method uses the node efficiency to indicate the position of nodes in the network, namely their global importance. Considering the directions and weights of edges, the paper introduces the in-weight intensity to determine the importance contribution ratio between adjacent nodes. The importance contribution matrix constructed in this way reflects the local importance of nodes.

Definition 3.1  In-weight intensity of node $v_i$. For the weighted network $G = (V, E, W)$, the in-weight intensity of node $v_i$ is the sum of the weights on the directed edges pointing to node $v_i$:

$$r_i = \sum_{x=1}^{n} w_{xi}, \tag{3}$$

where $n$ is the total number of nodes in the network.

As the definition shows, a node's in-weight intensity can, to a certain extent, characterize its local importance. To simplify the algorithm, the paper considers only the importance contributions of the adjacent nodes that point to the node being evaluated. The reason is that, compared with out-edges, the in-edges of a node may provide more valuable information about importance contributions. For a directed weighted network $G$, we build its adjacency matrix $A = [a_{ij}]$, where

$$a_{ij} = \begin{cases} 1, & (v_i, v_j) \in E, \\ 0, & (v_i, v_j) \notin E. \end{cases} \tag{4}$$

If node $v_j$ points to node $v_i$, the importance contribution ratio of the adjacent node $v_j$ to node $v_i$ is

$$c_{ij} = \begin{cases} \dfrac{r_i}{\sum_{x=1}^{n} r_x a_{jx}}, & a_{ji} = 1, \\ 0, & a_{ji} = 0. \end{cases} \tag{5}$$
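Equations (3) to (5) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation (the paper reports using Matlab): the adjacency matrix is derived from the positive entries of the weight matrix, whereas the paper treats it as a separate input, and the function names and the three-node weight matrix are hypothetical.

```python
def in_weight_intensity(W):
    """r_i = sum of the weights on edges pointing to node i (Equation 3)."""
    n = len(W)
    return [sum(W[x][i] for x in range(n)) for i in range(n)]


def contribution_ratios(W):
    """Return (r, c), where c[i][j] is the contribution ratio of v_j to v_i.

    Assumption: an edge i -> j exists exactly when W[i][j] > 0.
    """
    n = len(W)
    r = in_weight_intensity(W)
    A = [[1 if W[i][j] > 0 else 0 for j in range(n)] for i in range(n)]
    c = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Sum of in-weight intensities over all nodes that v_j points to.
        denom = sum(r[x] * A[j][x] for x in range(n))
        for i in range(n):
            if i != j and A[j][i] == 1:   # v_j points to v_i
                c[i][j] = r[i] / denom    # Equation (5)
    return r, c


# Hypothetical network: 0 -> 1 (weight 2), 1 -> 2 (weight 3),
# 2 -> 0 (weight 1), 2 -> 1 (weight 1).
W_example = [
    [0, 2, 0],
    [0, 0, 3],
    [1, 1, 0],
]
r, c = contribution_ratios(W_example)
print(r)        # in-weight intensities
print(c[1][2])  # ratio contributed by node 2 to node 1
```

Here node 2 points to nodes 0 and 1, whose in-weight intensities are 1 and 3, so it contributes 3/4 to node 1 and 1/4 to node 0. The ratios contributed by any single node sum to 1 over the nodes it points to.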
Extending this to all nodes gives the matrix

$$H_{\mathrm{NICM}} = \begin{bmatrix} 1 & c_{12} & \cdots & c_{1n} \\ c_{21} & 1 & \cdots & c_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ c_{n1} & c_{n2} & \cdots & 1 \end{bmatrix}. \tag{6}$$

Definition 3.2  The matrix $H_{\mathrm{NICM}}$ is called the node importance contribution matrix for directed weighted networks.

Zhou et al. [7] also used an importance contribution ratio to construct a node importance contribution matrix. However, the method in [7] differs essentially from the one in this paper. First, the former is applicable only to undirected and unweighted networks, whereas this paper is dedicated to directed weighted networks. Moreover, in [7] the importance contribution ratio of node $v_j$ is $d_j / k$, where $d_j$ is the degree of node $v_j$ and $k$ is the average degree of the network. In this paper, when node $v_j$ points to node $v_i$, $c_{ij}$ denotes the ratio of $r_i$ to the sum of the in-weight intensities of all adjacent nodes that node $v_j$ points to, i.e., $r_i / \sum_{x=1}^{n} r_x a_{jx}$. Unlike in [7], when our model degenerates into an undirected and unweighted network model, $c_{ij}$ is the ratio of the degree $d_i$ to the sum of the degrees of all adjacent nodes of node $v_j$, i.e., $d_i / \sum_{x=1}^{n} d_x a_{jx}$.

Equation (2) shows that the node efficiency reflects a node's position information to some extent. Therefore, the paper selects the node efficiency as the initial importance value for $H_{\mathrm{NICM}}$, and then uses Equation (6) to obtain the node importance evaluation matrix $H$:

$$H = \begin{bmatrix} I_1 & I_2 c_{12} & \cdots & I_n c_{1n} \\ I_1 c_{21} & I_2 & \cdots & I_n c_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ I_1 c_{n1} & I_2 c_{n2} & \cdots & I_n \end{bmatrix}, \tag{7}$$

where the element $H_{ij}$ is the importance contribution value of node $v_j$ to node $v_i$. As can be seen from Equations (5) and (7), when $a_{ji} = 1$, the importance contribution value of node $v_j$ to node $v_i$ is $H_{ij} = I_j \, r_i / \sum_{x=1}^{n} r_x a_{jx}$. The contribution value depends not only on the efficiency of node $v_j$, but also on the ratio of $r_i$ to the sum of the in-weight intensities of all adjacent nodes that node $v_j$ points to. The greater $I_j$ and the ratio, the higher the importance contribution value.

From Equations (2) and (7), the importance $C_i$ of node $v_i$ is defined as

$$C_i = I_i \times \sum_{j=1,\, j \neq i}^{n} I_j c_{ij}. \tag{8}$$

$C_i$ is the product of $I_i$ and the sum of the importance contributions of all adjacent nodes pointing to node $v_i$. As Equation (8) shows, the importance index integrates the global importance and the local importance of nodes, which can improve the accuracy of the evaluation.
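To make the step from the contribution matrix to Equations (7) and (8) concrete, the following Python sketch scales column $j$ of a contribution matrix by $I_j$ and then forms $C_i$ from the off-diagonal entries of row $i$. The efficiencies and contribution ratios below are hypothetical numbers chosen for easy arithmetic, and the function name `importance` is likewise an assumption made for this illustration.

```python
def importance(I, c):
    """Build the evaluation matrix H (Equation 7) and the importance C (Equation 8).

    I: list of node efficiencies.
    c: matrix of contribution ratios, c[i][j] = ratio contributed by v_j to v_i.
    """
    n = len(I)
    # Column j of the contribution matrix is scaled by I_j; the diagonal holds I_j.
    H = [[I[j] if i == j else I[j] * c[i][j] for j in range(n)]
         for i in range(n)]
    # C_i = I_i times the sum of the off-diagonal entries in row i of H.
    C = [I[i] * sum(H[i][j] for j in range(n) if j != i) for i in range(n)]
    return H, C


# Hypothetical inputs (not from the paper); diagonal entries of c are 1 as in Equation (6).
I_example = [0.5, 1.0, 0.75]
c_example = [
    [1.0, 0.4, 0.5],
    [1.0, 1.0, 0.5],
    [0.0, 0.6, 1.0],
]
H, C = importance(I_example, c_example)
print(C)
```

For node 0, for example, the contributions are 1.0 × 0.4 from node 1 and 0.75 × 0.5 from node 2, so $C_0 = 0.5 \times (0.4 + 0.375) = 0.3875$.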
3.2 Node Importance Evaluation Algorithm

Existing evaluation algorithms mainly compare network performance before and after removing nodes. However, removing nodes may split the network, a case that has to be treated separately and is hard to handle cleanly. Here we consider the importance contributions of adjacent nodes together with the node efficiency, that is, we use both a node's in-weight intensity and its position information. The specific steps of the evaluation algorithm are as follows:

Input: The weight matrix $W = [w_{ij}]$ and the adjacency matrix $A = [a_{ij}]$ of the directed weighted network;
Output: The importance $C_i$ of each node $v_i$.
Begin:
Step 0  Calculate the shortest distance between all pairs of nodes, $Dis = [d_{ij}]$ (Floyd algorithm);
Step 1  Determine the node importance contribution matrix $H_{\mathrm{NICM}}$: from the adjacency matrix $A = [a_{ij}]$, determine all adjacent nodes pointing to $v_i$ and the other adjacent nodes that these nodes point to. From the weight matrix $W = [w_{ij}]$, determine the in-weight intensity of node $v_i$ and of those other adjacent nodes (for $i = 1$ to $n$), then fill in $H_{\mathrm{NICM}}$ according to Equations (5) and (6);
Step 2  Determine the node importance evaluation matrix $H$: from $Dis$, determine the efficiency $I_i$ and multiply the elements in the $i$-th column of $H_{\mathrm{NICM}}$ by $I_i$. The products serve as the $i$-th column of $H$ (for $i = 1$ to $n$);
Step 3  Calculate the importance of each node: according to Equation (8), compute the importance $C_i$ of each node (for $i = 1$ to $n$).
End.

4 The Analysis of Experimental Results

To verify the validity of the method, the paper takes Sina Microblog as an example and uses a microblog-forwarding network between friends to evaluate the importance of users. In the microblog-forwarding network, nodes represent users, directed edges express the following relations among users, and the weights on the directed edges represent the number of microblogs forwarded. Figure 1 shows a typical microblog-forwarding network. Assume there are 4 user nodes, and A and B follow each other. If A follows C, B may also follow C; similarly, if B follows D, A may also follow D. The Arabic numerals on the directed edges represent the number of a user's microblogs that were forwarded. For example, the numeral 4 means that A forwards 4 microblogs from C. Following the algorithm steps above, this typical network is analyzed experimentally to illustrate the effectiveness of the proposed method. The evaluation results are shown in Table 1.
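The steps of Section 3.2 can be sketched end to end in Python and tried on a four-node network shaped like the one in Figure 1. Several things here are assumptions and not taken from the paper: distances are hop counts, unreachable nodes contribute zero efficiency, and the adjacency matrix is derived from positive weights. The edge set is inferred from the in-degrees and efficiencies in Table 1 together with the description above. Only the weight 4 on the edge from A to C comes from the text; the other individual weights are hypothetical, chosen solely so that each column sum equals the in-weight intensity listed in Table 1. The function name `evaluate_importance` is also hypothetical.

```python
import math


def evaluate_importance(W):
    """Return (r, I, C) for a directed weighted network given by weight matrix W."""
    n = len(W)
    # Assumption: an edge i -> j exists exactly when W[i][j] > 0.
    A = [[1 if W[i][j] > 0 else 0 for j in range(n)] for i in range(n)]

    # Step 0: all-pairs shortest hop counts (Floyd algorithm).
    dist = [[0 if i == j else (1 if A[i][j] else math.inf) for j in range(n)]
            for i in range(n)]
    for m in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][m] + dist[m][j] < dist[i][j]:
                    dist[i][j] = dist[i][m] + dist[m][j]

    # Step 1: in-weight intensities (Eq. 3) and contribution ratios (Eq. 5).
    r = [sum(W[x][i] for x in range(n)) for i in range(n)]
    c = [[0.0] * n for _ in range(n)]
    for j in range(n):
        denom = sum(r[x] * A[j][x] for x in range(n))
        for i in range(n):
            if i != j and A[j][i]:
                c[i][j] = r[i] / denom

    # Step 2: node efficiencies (Eq. 2); unreachable nodes add 1/inf = 0.
    I = [sum(1.0 / dist[k][i] for i in range(n) if i != k) / (n - 1)
         for k in range(n)]

    # Step 3: importance (Eq. 8).
    C = [I[i] * sum(I[j] * c[i][j] for j in range(n) if j != i)
         for i in range(n)]
    return r, I, C


NODES = ["A", "B", "C", "D"]
# Row = source, column = target. Hypothetical weights except A -> C = 4.
W_FIG1 = [
    [0, 8, 4, 5],  # A -> B, A -> C, A -> D
    [7, 0, 5, 4],  # B -> A, B -> C, B -> D
    [0, 0, 0, 5],  # C -> D
    [0, 0, 3, 0],  # D -> C
]
r, I, C = evaluate_importance(W_FIG1)
for name, ri, Ii, Ci in zip(NODES, r, I, C):
    print(f"{name}: r={ri}, I={Ii:.4f}, C={Ci:.4f}")
```

Under these assumptions the sketch yields in-weight intensities 7, 8, 12, 14 and importances that round to 0.2121, 0.2353, 0.3500, and 0.3898. Because Equation (8) depends on the weights only through the in-weight intensities, any other split of the weights with the same column sums gives the same result.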
Figure 1  A typical microblogging network form

Table 1  The node importance in the network shown in Figure 1

| Node | In-degree | In-weight intensity r | Efficiency I | Importance C |
|------|-----------|-----------------------|--------------|--------------|
| A    | 1         | 7                     | 1            | 0.2121       |
| B    | 1         | 8                     | 1            | 0.2353       |
| C    | 3         | 12                    | 0.3333       | 0.3500       |
| D    | 3         | 14                    | 0.3333       | 0.3898       |

Table 1 shows that the in-degree index ignores the weights of edges, so its evaluation results lack objectivity. For example, the in-degrees of nodes A and B are both 1, but their importances are clearly different. Although the efficiency reflects the roles of nodes in information transmission well, it fails to consider the importance contributions of adjacent nodes. For instance, the efficiencies of nodes C and D are both 0.3333, whereas the evaluation index $C$ proposed in the paper shows clearly that their importances differ ($C_C = 0.3500$, $C_D = 0.3898$). The index embodies not only the in-weight intensity and the node efficiency, but also the impact of adjacent nodes. Therefore, it distinguishes node importance better.

To further verify the effectiveness of the method, the paper selects a sample of 38 users from University of Shanghai for Science and Technology for investigation. After excluding those not registered on Sina Microblog, we obtain a microblog-forwarding network composed of 34 users, as shown in Figure 2. Each evaluation index is computed with the proposed algorithm, using programs written in Matlab, and shown in Table 2.

Figure 2  Microblogging forwarded network constituted by 34 users

As can be seen from Table 2, different evaluation indices give different rankings of node importance. The in-degree index considers only the directions of edges and ignores their weights, making the results rough and inaccurate. Likewise, the in-weight intensity alone cannot measure node importance well. For example, the in-degrees of nodes $v_7$, $v_{19}$, and $v_{30}$ are all 3, and their in-weight intensity values are all 13. They seem to have the same importance, but in fact the two indices ignore the global position of the nodes and the impact of adjacent nodes on them. The three nodes are in different positions, and accordingly their importances differ significantly ($C_{v_7} = 0.1613$, $C_{v_{19}} = 0.1806$, $C_{v_{30}} = 0.0468$).
As mentioned above, the efficiency also biases the results because it fails to consider the weights of edges and the importance contributions of adjacent nodes. For example, node $v_5$ has the largest efficiency value, 0.4732. However, concluding from this that $v_5$ is the most important node would be very one-sided. From Table 2, $C_{v_5} = 0.0914$ is relatively small, and $C_{v_{21}} = 0.2233$ and $C_{v_{29}} = 0.1742$ are both greater than $C_{v_5}$, which is consistent with the actual situation. Using the proposed algorithm, we find that the most important node in the microblog-forwarding network is $v_{23}$, since $C_{v_{23}} = 0.2419 = \max(C_{v_1}, C_{v_2}, C_{v_3}, \ldots, C_{v_{34}})$.

Table 2  The node importance in the network shown in Figure 2

| Node | In-degree | r  | I      | C      | Node | In-degree | r  | I      | C      |
|------|-----------|----|--------|--------|------|-----------|----|--------|--------|
| v1   | 1         | 2  | 0.3440 | 0.0085 | v18  | 3         | 27 | 0.2498 | 0.0996 |
| v2   | 1         | 2  | 0.3592 | 0.0099 | v19  | 3         | 13 | 0.3298 | 0.1806 |
| v3   | 2         | 4  | 0.2196 | 0.0516 | v20  | 3         | 7  | 0.3025 | 0.1099 |
| v4   | 4         | 28 | 0.1853 | 0.0815 | v21  | 3         | 36 | 0.3667 | 0.2233 |
| v5   | 2         | 13 | 0.4732 | 0.0914 | v22  | 2         | 14 | 0.2679 | 0.0396 |
| v6   | 3         | 15 | 0.3606 | 0.0833 | v23  | 5         | 85 | 0.2453 | 0.2419 |
| v7   | 3         | 13 | 0.3417 | 0.1613 | v24  | 2         | 9  | 0.2655 | 0.0371 |
| v8   | 3         | 20 | 0.3316 | 0.1523 | v25  | 2         | 12 | 0.2478 | 0.0430 |
| v9   | 2         | 8  | 0.2252 | 0.0166 | v26  | 3         | 8  | 0.2578 | 0.0379 |
| v10  | 2         | 14 | 0.2050 | 0.0307 | v27  | 2         | 9  | 0.4354 | 0.0544 |
| v11  | 3         | 26 | 0.2362 | 0.0880 | v28  | 4         | 27 | 0.2885 | 0.1511 |
| v12  | 1         | 9  | 0.1962 | 0.0059 | v29  | 3         | 10 | 0.3934 | 0.1742 |
| v13  | 2         | 2  | 0.2627 | 0.0152 | v30  | 3         | 13 | 0.2778 | 0.0468 |
| v14  | 3         | 9  | 0.2597 | 0.0832 | v31  | 3         | 35 | 0.3515 | 0.1327 |
| v15  | 1         | 12 | 0.2760 | 0.0216 | v32  | 2         | 11 | 0.2062 | 0.0654 |
| v16  | 2         | 13 | 0.2539 | 0.0863 | v33  | 2         | 13 | 0.2679 | 0.0627 |
| v17  | 3         | 10 | 0.3510 | 0.1074 | v34  | 4         | 27 | 0.1756 | 0.0524 |

5 Conclusion

Based on actual circumstances, the paper proposes a method for evaluating node importance in directed weighted networks. By considering the directions and weights of edges and the dependency relations among adjacent nodes, the paper improves the importance contribution matrix, so that the improved index $C$ identifies key nodes in networks better. The experimental analysis also shows that, compared with traditional methods, the proposed method makes the evaluation results more practical. To simplify the calculation, $H_{\mathrm{NICM}}$ defines only the importance contributions of in-degree adjacent nodes; that is, if node $v_j$ points to $v_i$, then $v_j$ makes an importance contribution to $v_i$, but the contribution of node $v_i$ to node $v_j$ is disregarded.
Consequently, how to extend the matrix so that it also reflects the importance contributions of out-degree adjacent nodes, or even the dependencies among all nodes, will be the authors' next study task.

References

[1] Xiong W M, Liu Y H. On the reliability of communication networks. Journal of China Institute of Communications, 1990, 11(4): 43–49.
[2] Peng X Z, Yao H, Zhang Z H, et al. Invulnerability of scale-free network against critical node failures based on a renewed cascading failure model. Systems Engineering and Electronics, 2013, 35(9): 1974–1978.
[3] Landherr A, Friedl B, Heidemann J. A critical review of centrality measures in social networks. Business & Information Systems Engineering, 2010, 2(6): 371–385.
[4] Kermarrec A M, Merrer E L, Sericola B, et al. Second order centrality: Distributed assessment of nodes criticity in complex networks. Computer Communications, 2010, 34(5): 619–628.
[5] Tan Y J, Wu J, Deng H Z. Evaluation method for node importance based on node contraction in complex networks. Systems Engineering Theory & Practice, 2006, 11(11): 79–83.
[6] Liu R R, Jia C X, Zhang J L, et al. Robustness of interdependent networks under several intentional attack strategies. Journal of University of Shanghai for Science and Technology, 2012, 34(3): 235–239.
[7] Zhou X, Zhang F M, Li K W, et al. Finding vital node by node importance evaluation matrix in complex networks. Acta Physica Sinica, 2012, 61(5): 1–7.
[8] Yang K, Zhang N, Su S Q. Node centrality on individual microblog user network. Journal of University of Shanghai for Science and Technology, 2015, 37(1): 43–48.
[9] Shen D, Li J H, Xiong J S, et al. A cascading failure model of double layer complex networks based on betweenness. Complex Systems and Complexity Science, 2014, 11(3): 12–18.
[10] Sabidussi G. The centrality index of a graph. Psychometrika, 1966, 31(4): 581–603.
[11] Jin J, Xu K, Xiong N, et al. Multi-index evaluation algorithm based on principal component analysis for node importance in complex networks. IET Networks, 2012, 1(3): 108–122.
[12] Chen Y, Hu A Q, Hu X. Evaluation method for node importance in communication networks. Journal of China Institute of Communications, 2004, 25(8): 129–134.
[13] Rao Y P, Lin J Y, Hou D T. Evaluation method for network invulnerability based on shortest route number. Journal of China Institute of Communications, 2009, 30(4): 113–117.
[14] Luo H L. Remanufacturing closed-loop supply chain model with uncertain demand. Science and Technology Management Research, 2012, 2(2): 95–98.
[15] Guo Y H, Ma J H, Wang G H. Modeling and analysis of recycling and remanufacturing systems by using repeated game model. Industrial Engineering Journal, 2011, 14(5): 66–70.
[16] Zhao Y H, Wang Z L, Zheng J, et al. Finding most vital node by node importance contribution matrix in communication networks. Journal of Beijing University of Aeronautics and Astronautics, 2009, 35(9): 1076–1079.