Distributed Construction of Connected Dominating Sets Optimized by Minimum-Weight Spanning Tree in Wireless Ad-hoc Sensor Networks

2014 IEEE 17th International Conference on Computational Science and Engineering

Sijun Ren, Ping Yi, Dapeng Hong, Yue Wu, Ting Zhu

Shanghai Jiao Tong University, China. Email: {rensijun, pingyi, wuyue}@sjtu.edu.cn
University of Michigan - Shanghai Jiao Tong University Joint Institute, China. Email: hongdp@umich.edu
University of Maryland Baltimore County, USA. Email: zt@umbc.edu

Abstract

A Connected Dominating Set (CDS) of a graph $G(V,E)$ is a subset $V' \subseteq V$ that induces a connected subgraph and such that each node in $V \setminus V'$ is adjacent to at least one node in $V'$. CDSs have been proposed as virtual backbones in wireless ad-hoc sensor networks, so that routing protocols can alleviate the serious broadcast storm problem. Constructing the Minimum Connected Dominating Set (MCDS) is difficult because the problem is NP-hard. In this paper, our algorithm first finds a prior CDS and then uses a Minimum-Weight Spanning Tree (MST) to optimize the result. Our algorithm uses effective degree, a new term introduced in this paper, combined with node ID to determine dominators. A default event is triggered to recalculate and update each node's effective degree after a predetermined amount of time. By 3-hop message relay, each node learns the paths leading to the other dominators within 3-hop distance; some of these paths, selected by a set of rules, are converted into weighted edges by counting the number of nodes on them. An MST is then found in the weighted graph induced by the prior CDS to further reduce the CDS size. Compared with existing algorithms, our algorithm performs well in terms of CDS size and Average Hop Distance (AHD). The simulation results also show that our algorithm is more energy efficient than the others.

I. INTRODUCTION

In recent years, wireless ad-hoc sensor networks, composed of a number of wireless communication devices, have been widely used in areas such as battlefield surveillance, healthcare, environmental monitoring and smart homes. However, wireless ad-hoc sensor networks differ significantly from other networks. In particular, they are multi-hop, self-organizing and self-controlling [1]. Each wireless communication device is not only a host but also a router. Cellular mobile networks and mesh networks rely on predefined communication infrastructure such as base stations and access service stations. In contrast, devices in wireless ad-hoc sensor networks forward received data packets with the help of intermediate devices. Although wireless ad-hoc sensor networks have no physical backbone infrastructure, a virtual backbone can be formed by the wireless communication devices themselves.

CDSs have been widely studied as a way to form virtual backbones for stable and highly efficient network architectures in wireless ad-hoc sensor networks. In general, a dominating set of $G=(V,E)$ is a subset $V' \subseteq V$ that satisfies $\forall v \in V \setminus V', \exists v' \in V', (v,v') \in E$. We call a node in $V'$ a dominator. A CDS is a dominating set that is also connected. A smaller CDS greatly reduces broadcast storms and thus improves network performance. However, finding the MCDS in a Unit Disk Graph (UDG) has been shown to be NP-hard [2], so many algorithms have been proposed to address this problem. We briefly survey some of them here.

Guha and Khuller first introduced two polynomial-time algorithms in [3]. Both are greedy and centralized. The main idea of Algorithm 1 is to build a spanning tree, growing it from a node of maximum degree until all nodes are added to the tree; the non-leaf nodes of the tree then form the CDS. Algorithm 2 improves the approximation ratio from $2(H(\Delta)+1)$ to $H(\Delta)+2$, where $\Delta$ is the maximum degree of $G=(V,E)$ and $H$ is the harmonic function. In [4], Ravindara et al. proposed a centralized backbone framework utilizing a reservoir algorithm for graph sampling in order to reduce the number of nodes used for backbone construction.

Among distributed algorithms, Wu and Li proposed a two-phase algorithm in [5]. In the first phase, every node v exchanges its neighbor set N(v) with all of its neighbors, and a node becomes a dominator if it has two unconnected neighbors. In the second phase, the algorithm uses two rules to reduce the size of the CDS. Wan et al. presented an algorithm with an approximation factor of 8 and running time $O(n)$ in [6]. Cardei's distributed algorithm [7] grows from a single leader and uses 1-hop connectivity information, which yields $O(\Delta n)$ time complexity. In [8], the authors presented a distributed algorithm with a performance ratio of 6.91. In [9], the algorithm first constructs a Maximal Independent Set (MIS) and then selectively connects the nodes of the MIS through a Steiner tree construction. For recent work, see [10], [11], [12], [13], [14], [15].

The rest of this paper is organized as follows. Section II presents the models, notation and lemmas used in the paper. Section III presents our algorithm for constructing a CDS in detail. Section IV gives the theoretical analysis and proofs. Section V shows the simulation results. Section VI concludes the paper.

978-1-4799-7981-3/14 $31.00 © 2014 IEEE. DOI 10.1109/CSE.2014.183

II. PRELIMINARIES FOR THE ALGORITHM

In this section, to describe and analyse our algorithm conveniently, we present the models, terminology and lemmas it relies on.

A. System model

1) Let the undirected graph $G=(V,E)$ represent a wireless ad-hoc sensor network, where $V$ is the set of nodes in the network. Two nodes $u,v \in V$ are adjacent, that is, connected by an edge $e \in E$, if and only if u's transmission range covers v and v's transmission range covers u. We assume that all nodes in V are homogeneous with the same transmission range, which means G is a unit disk graph.

2) We assume that every node has a clock synchronized to a common reference clock, so that nodes across the network can start a new routine at the same time.

B. Terminologies

ID(u): the unique identifier of node u. Each node embeds its ID into the messages it sends, so the other nodes know where a message comes from.

state(u): the status of node u, which can be white, gray or black. The initial status of every node is white.

N(u): the set of u's 1-hop neighbors.

$d'(u)$: the number of u's 1-hop neighbors in white status, defined while u itself is white. It differs from the degree of a node in graph theory because it changes dynamically: if one of u's 1-hop neighbors becomes gray, $d'(u)$ decreases by 1. We therefore call $d'(u)$ the effective degree of node u. It is recalculated and updated through periodic or event-driven status messages from 1-hop neighbors. The initial effective degree of any node equals the number of its 1-hop neighbors.

Timer: records the number of times a type-I STATUS MESSAGE has been relayed. Its initial value is 0, and it increases by 1 before a node relays the type-I STATUS MESSAGE.

LEVEL MESSAGE: initiated by the root node to identify the tree level of each node.

STATUS-COMPLETE MESSAGE: a leaf node broadcasts this message once it has determined its status as black or gray. A non-leaf node that has determined its status as black or gray forwards this message up the tree toward the root after it has received the message from each of its children.

START-UP MESSAGE: broadcast by the root node and forwarded by the other nodes to announce that the next phase starts. Each node enters the next phase when it receives this message.

STATUS MESSAGE: there are two types. A type-I STATUS MESSAGE contains a Timer and the IDs of the nodes that have relayed it; a node adds its ID to the message before relaying it. Type-I STATUS MESSAGEs are generated by the nodes that our algorithm selects as dominators, and they are relayed to the nodes within 3-hop distance of the source node. A type-II STATUS MESSAGE is broadcast by a node to its 1-hop neighbors to announce that its status has changed from white to gray. A type-II STATUS MESSAGE is never relayed after it leaves its source node.

DEGREE MESSAGE: released only by nodes in white status. A white node u broadcasts this message, which carries its updated effective degree $d'(u)$, in every constant time interval T in which its effective degree has changed.

CONNECT MESSAGE: generated by dominators to announce which nodes are selected to connect the dominators. The IDs of the selected nodes are encapsulated in the message. The IDs of the two dominators are also packaged in the message, in a special format that marks the message as establishing the connection between them. The CONNECT MESSAGE is forwarded by the nodes whose IDs it carries outside the special format. Suppose the ID of node u is packaged in the message. Node u removes its own ID and forwards the message only if the message still includes the ID of another node that is not packaged in the special format.

C. Lemmas

Lemma 1. A Maximal Independent Set (MIS) is also a dominating set.

Proof. Let I be an MIS of the graph $G=(V,E)$. Every $u \in V(G) \setminus I$ must be adjacent to a node in I; otherwise $I \cup \{u\}$ would be independent, contradicting the maximality of I. Thus I is also a dominating set.

Lemma 2. In a unit disk graph, a node is adjacent to at most five dominators.

Proof. This is shown in [16].

Lemma 3. For every node u in a unit disk graph, the number of dominators k units away from u is bounded by a constant $l_k$.

Proof. The analysis is presented in [17].

III. AN APPROXIMATE MCDS ALGORITHM

In this section, we present our algorithm for constructing a CDS in detail. The algorithm first finds a prior CDS, from which the Minimum-Weight Spanning Tree (MST) is then found.
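Lemma 1 is what allows the construction to start from an MIS. As a small, non-authoritative illustration, the sketch below builds an MIS with a simple centralized greedy scan in ID order (this is not the distributed selection used by our algorithm) and checks that the result is both independent and dominating. The graph and the function names `greedy_mis`, `is_independent` and `is_dominating` are hypothetical and chosen only for this example.

```python
from typing import Dict, Set

Graph = Dict[int, Set[int]]  # node ID -> set of 1-hop neighbor IDs


def greedy_mis(graph: Graph) -> Set[int]:
    """Scan nodes in ID order and keep every node with no neighbor kept so far."""
    mis: Set[int] = set()
    for u in sorted(graph):
        if not (graph[u] & mis):
            mis.add(u)
    return mis


def is_independent(graph: Graph, nodes: Set[int]) -> bool:
    """No two nodes of the set are adjacent."""
    return all(not (graph[u] & nodes) for u in nodes)


def is_dominating(graph: Graph, nodes: Set[int]) -> bool:
    """Every node is in the set or adjacent to a node of the set."""
    return all(u in nodes or bool(graph[u] & nodes) for u in graph)


# Hypothetical 6-node example graph (not taken from the paper).
example: Graph = {
    1: {2, 3},
    2: {1, 3, 4},
    3: {1, 2, 5},
    4: {2, 6},
    5: {3, 6},
    6: {4, 5},
}

mis = greedy_mis(example)
print(sorted(mis), is_independent(example, mis), is_dominating(example, mis))
```

Because the scan only skips a node when one of its neighbors is already in the set, every skipped node is dominated, which is exactly the argument of Lemma 1.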

Because an MIS is also a dominating set, the algorithm constructs the prior CDS by first establishing an MIS and then selecting additional nodes so that the subgraph induced by all selected nodes is connected. Nodes are selected into the MIS based on their effective degree, and the effective degree of each white node is recalculated and updated in every constant time interval T. Using effective degree rather than degree as the criterion for choosing dominators is reasonable because it guarantees that a node is not over-counted in the degree of other nodes during the execution of the algorithm. The algorithm uses 3-hop message relay to learn the topology between any pair of dominators within 3-hop distance of each other. Each node executes the algorithm asynchronously and in parallel, subject to the same rules. Using 3-hop message relay together with the rules, the paths selected to connect two dominators within 3-hop distance are treated as new edges whose weight is the number of nodes on the path. Thus we can find an MST of the prior CDS that the algorithm has found. Obviously, the MST is also a CDS. The main challenge is the synchronization point at which the prior CDS has been found and the algorithm starts to find the MST.

A. 3-hop message relay

Fig. 1. 3-hop message relay: (a) relay along nodes u, e, d, v; (b) relay along nodes u, e, d. Timer = Timer + 1 at each hop.

If a node is selected as a dominator by the algorithm, it turns black and broadcasts a type-I STATUS MESSAGE containing its ID. The type-I STATUS MESSAGE is relayed to the nodes within 3-hop distance of the dominator; hence we call this process 3-hop message relay. It splits into two cases, shown in Fig. 1 (a) and (b) respectively.

In subfigure (a), node u is determined to be a dominator, so it generates the type-I STATUS MESSAGE to advertise its status to its 1-hop neighbors. When its 1-hop neighbors receive the message and see that the value of Timer is 1, they know that they are dominated by another node, so their status turns gray and they broadcast a type-II STATUS MESSAGE. Node e, one of u's 1-hop neighbors, embeds its ID into the message and increases Timer by 1 before relaying the type-I STATUS MESSAGE to the next-hop nodes. Node d receives the type-I STATUS MESSAGE and checks the value of Timer. If the value of Timer is less than 3 and d is not black, d embeds its ID into the message, increases Timer by 1 and relays the message. However, when node v receives the type-I STATUS MESSAGE, it finds that Timer has reached 3. Regardless of whether v is black or not (although v is drawn as black in subfigure (a)), it stops relaying the message. In subfigure (b), although Timer is 2 when node d obtains the message, d stops relaying it because d is currently black.

In short, the first thing a node does is check the value of Timer in the received type-I STATUS MESSAGE. Only when Timer is less than 3 does the node decide, based on its own status, whether to relay the message to the next hop. Furthermore, if the node finds that Timer is 1, it turns gray and broadcasts a type-II STATUS MESSAGE after relaying the received type-I STATUS MESSAGE. By 3-hop message relay, each dominator learns the topology between itself and any other dominator within 3-hop distance, since each of them receives the forwarded type-I STATUS MESSAGE generated by the other.

B. Rules

Rule 1. If, for all $v \in N(u)$, $d'(u) > d'(v)$, or $d'(u) = d'(v)$ and ID(u) < ID(v), then state(u) = black and u broadcasts the type-I STATUS MESSAGE, which triggers 3-hop message relay.

Rule 2. Let state(u) = black and state(v) = black, let $K = N(u) \cap N(v)$, and let $k \in K$ be the node with $ID(k) = \min\{ID(m) : m \in K\}$. If ID(u) < ID(v), u broadcasts the CONNECT MESSAGE. Because k has the smallest ID among the nodes in K, the ID of k is packaged in the CONNECT MESSAGE, and the message is marked in the special format as establishing the connection between u and v. When k receives the message, it creates and keeps a list that records the end-point IDs of u and v.

Rule 3. If state(u) = black, state(v) = black, $N(u) \cap N(v) = \emptyset$ and ID(u) < ID(v), we assume that at least one pair of gray nodes between u and v can connect them. If m and n form such a pair and the sum of their IDs is the smallest among all such pairs, then m and n are added to the MIS to establish the connection between u and v. The CONNECT MESSAGE generated by u therefore includes the IDs of m and n and is marked in the special format as establishing the connection between u and v. Similarly, m and n each create and keep a list that records the end-point IDs of u and v when they receive the message.

C. Update mechanism for effective degree

The update of effective degree is divided into two phases, an initialization phase and an updating phase, which we explain below.

Initialization phase. We assume that the input is a connected graph $G(V,E)$. For each $u \in V$, the initial $d'(u)$ is set to 0 and the initial status is white. Each node sends a HELLO beacon at a random time to detect its 1-hop neighbors. For each $u \in V$, $d'(u)$ increases by 1 whenever u receives a HELLO beacon from a 1-hop neighbor, so that $d'(u) = |N(u)|$ when the initialization phase completes. Thus any node u can obtain its 1-hop neighbors' effective degree. After time $\tau$ has expired, each node starts the following phase.
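The initialization phase and the Rule 1 comparison can be sketched as follows. This is a minimal sketch, not the authors' implementation: it assumes fully synchronous rounds and lossless message delivery, replaces real message passing with shared dictionaries, and folds in the per-round decrement of effective degree that the updating phase performs. The names `init_effective_degree`, `wins_rule1` and `select_dominators`, and the 6-node path used as input, are hypothetical.

```python
from typing import Dict, Set, Tuple

Graph = Dict[int, Set[int]]  # node ID -> set of 1-hop neighbor IDs


def init_effective_degree(graph: Graph) -> Dict[int, int]:
    """Initialization phase: d'(u) starts at 0 and grows by 1 per HELLO beacon received."""
    d = {u: 0 for u in graph}
    for sender in graph:              # every node sends one HELLO beacon
        for receiver in graph[sender]:
            d[receiver] += 1
    return d


def wins_rule1(u: int, white_neighbors: Set[int], d: Dict[int, int]) -> bool:
    """Rule 1: u beats every white neighbor on effective degree, ties broken by lower ID."""
    return all(d[u] > d[v] or (d[u] == d[v] and u < v) for v in white_neighbors)


def select_dominators(graph: Graph) -> Tuple[Dict[int, str], int]:
    """Run rounds until no node is white; return final states and the number of rounds."""
    state = {u: "white" for u in graph}
    d = init_effective_degree(graph)
    rounds = 0
    while "white" in state.values():
        rounds += 1
        white = {u for u in graph if state[u] == "white"}
        winners = [u for u in white if wins_rule1(u, graph[u] & white, d)]
        newly_gray: Set[int] = set()
        for u in winners:             # type-I STATUS MESSAGE with Timer = 1
            state[u] = "black"
            newly_gray |= graph[u] & white
        for g in newly_gray:          # each of these would send a type-II STATUS MESSAGE
            state[g] = "gray"
        for u in graph:               # remaining white nodes subtract the type-II messages heard
            if state[u] == "white":
                d[u] -= len(graph[u] & newly_gray)
    return state, rounds


# Hypothetical input: a path 1-2-3-4-5-6.
path: Graph = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 6}, 6: {5}}
final_state, num_rounds = select_dominators(path)
print(final_state, num_rounds)
```

Two adjacent white nodes can never both win the Rule 1 comparison, so the black nodes produced by this sketch are pairwise non-adjacent.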

Updating phase. Each node in white status maintains a list of the effective degrees of its known neighbors. If a node receives a type-II STATUS MESSAGE from a 1-hop neighbor, it deletes the entry for that neighbor from the list. In the system model, we assumed that every node has a synchronized clock. During the time slot $[\tau + nT, \tau + T_1 + nT]$, $n \in \{0, 1, 2, 3, \ldots\}$, each node broadcasts the type-I and type-II STATUS MESSAGE if necessary, based on Rule 1. When $T_1$ has expired, some nodes have determined their status and informed their 1-hop neighbors of the change by broadcasting type-I and type-II STATUS MESSAGEs. Each node that is still white updates its effective degree by subtracting from it the number of type-II STATUS MESSAGEs it received during $[\tau + nT, \tau + T_1 + nT]$. Such white nodes broadcast their DEGREE MESSAGE during the time slot $[\tau + T_1 + nT, \tau + T_1 + T_2 + nT]$ only if their effective degree has changed. $T_1$ and $T_2$ satisfy the following equation:

$$T = T_1 + T_2$$

Ideally, each node can complete all of its necessary actions within $T_1$ and $T_2$ respectively. In real networks this depends on many factors; suitable values of $T_1$ and $T_2$ can be obtained by computer simulation.

D. MST construction

A new subgraph $G' = (V', E')$ is induced by the nodes of the prior CDS that our algorithm finds. By 3-hop message relay, each dominator learns the paths leading to the other dominators within 3-hop distance, and some of these paths are selected by Rules 2 and 3. We treat each selected path as a new edge whose weight is the number of nodes on the path, so the weight of every edge is either 1 or 2.

Fig. 2. Use the MST to optimize the CDS: (a) subgraph induced by the prior CDS; (b) weighted graph $G'$; (c) MST; (d) optimized CDS.
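The weighting described above can be sketched on a small example. The sketch below is only illustrative: it uses a centralized Kruskal procedure with union-find in place of the distributed MST algorithm of [19], and the node IDs are hypothetical rather than those of Fig. 2. Each selected path is given as a triple (dominator, dominator, connector nodes), and its weight is the number of connector nodes.

```python
from typing import Dict, Iterable, List, Set, Tuple

Path = Tuple[int, int, List[int]]  # (dominator u, dominator v, gray connectors between them)


def mst_cds(dominators: Iterable[int], paths: List[Path]) -> Set[int]:
    """Keep only the connectors lying on a minimum-weight spanning tree of the dominators."""
    parent: Dict[int, int] = {d: d for d in dominators}

    def find(x: int) -> int:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    cds: Set[int] = set(parent)
    # Kruskal: consider lighter paths first; endpoint IDs only break ties deterministically.
    for u, v, connectors in sorted(paths, key=lambda p: (len(p[2]), p[0], p[1])):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            cds.update(connectors)
    return cds


# Hypothetical prior CDS: dominators 1, 2, 3 and connectors 10, 11, 12, 13.
selected_paths: List[Path] = [
    (1, 2, [10]),      # weight 1 (a Rule 2 style connection)
    (2, 3, [11]),      # weight 1
    (1, 3, [12, 13]),  # weight 2 (a Rule 3 style connection)
]
prior_cds = {1, 2, 3, 10, 11, 12, 13}
optimized = mst_cds([1, 2, 3], selected_paths)
print(sorted(optimized), sorted(prior_cds - optimized))
```

In this example the weight-2 path closes a cycle and is dropped, so its two connectors leave the CDS while the dominators stay connected through the two weight-1 paths.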
The topology in Fig. 2 (a) is the subgraph induced by the prior CDS found by our algorithm. The gray nodes in Fig. 2 (a) are the nodes added to the MIS to connect the dominators according to Rules 2 and 3. The path selected between dominators a and f passes through two gray nodes, so we replace the two gray nodes with an edge of weight 2. The other two edges are obtained in the same way, giving the new graph $G'$ shown in Fig. 2 (b). The next step is to find an MST of $G'$. The two edges with weight 1 are kept. Fig. 2 (c) shows the MST we find. From Fig. 2 (d), we can see that node d is removed from the prior CDS. Thus, we can use the MST to optimize the CDS.

To apply existing MST algorithms to our case, they need a small adjustment, and the CONNECT MESSAGE is designed for this purpose. In MST algorithms, a node sends messages to its neighbors over its adjoining edges. In our case, the neighbors of a dominator are the other dominators within 3-hop distance, so each message broadcast by a dominator is forwarded by the nodes selected to connect two dominators, provided that these nodes find that the received message contains an ID in their recorded list.

E. The method for synchronization point

The algorithm starts to find an MST only after the prior CDS has been found. In other words, each dominator must first obtain the paths leading to the other dominators within 3-hop distance, so dominators need to reach this synchronization point before continuing. The method is as follows. We first apply the distributed leader election algorithm of [18], with $O(n)$ time complexity and $O(n \log_2 n)$ message complexity, to construct a spanning tree T rooted at a node r.
Once the root r has been selected, each node identifies its tree level relative to T as follows. The root r first announces its level, 0, by broadcasting its LEVEL MESSAGE. Every other node, upon receiving the LEVEL MESSAGE from its parent in T, calculates its own level by adding 1 to the level of its parent and then broadcasts its LEVEL MESSAGE to announce its level. Thus each node can identify its children from the LEVEL MESSAGEs.

The nodes that broadcast the type-I STATUS MESSAGE form an MIS; the detailed proof is given in Theorem 1. By 3-hop message relay, each dominator learns the topology between itself and the other dominators within 3-hop distance, if any. A node that receives a type-I STATUS MESSAGE turns gray when the value of Timer in the received message is 1. It then broadcasts a type-II STATUS MESSAGE to inform its 1-hop neighbors that it has become gray. Two dominators within 3-hop distance of each other establish a connection based on Rules 2 and 3. Thus a prior CDS has been found when each node has determined its status as either black or gray (subject to a slight exception in practice, because the type-I STATUS MESSAGE and the CONNECT MESSAGE are delayed by the 3-hop message relay).

When a leaf node has determined its status as black or gray, it transmits the STATUS-COMPLETE MESSAGE to its parent. Each non-leaf node waits until it has received the STATUS-COMPLETE MESSAGE from each of its children. Once it has also determined its own status as gray or black, it forwards the STATUS-COMPLETE MESSAGE up the tree toward the root r. When the root r has received the STATUS-COMPLETE MESSAGE from all its children and has itself determined its status as black or gray, every node has determined its status as black or gray. In other words, the prior CDS has been found.
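The level assignment and the upward flow of the STATUS-COMPLETE MESSAGE can be sketched as follows. This is a simplified, centralized sketch under stated assumptions: the rooted spanning tree is given as an adjacency map (the leader election of [18] is not modelled), LEVEL MESSAGE propagation is replaced by a breadth-first traversal, and the set of nodes that have already decided their status is passed in directly. The names `assign_levels` and `status_complete_at_root` are hypothetical.

```python
from collections import deque
from typing import Dict, List, Optional, Set, Tuple

Tree = Dict[int, Set[int]]  # tree adjacency: node ID -> neighbors in the spanning tree


def assign_levels(tree: Tree, root: int) -> Tuple[Dict[int, int], Dict[int, Optional[int]]]:
    """Mimic LEVEL MESSAGE propagation: each node's level is its parent's level plus 1."""
    level: Dict[int, int] = {root: 0}
    parent: Dict[int, Optional[int]] = {root: None}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in sorted(tree[u]):
            if v not in level:
                level[v] = level[u] + 1
                parent[v] = u
                queue.append(v)
    return level, parent


def status_complete_at_root(parent: Dict[int, Optional[int]], root: int, decided: Set[int]) -> bool:
    """True if the STATUS-COMPLETE MESSAGE would reach the root and the root has decided."""
    children: Dict[int, List[int]] = {u: [] for u in parent}
    for u, p in parent.items():
        if p is not None:
            children[p].append(u)

    def complete(u: int) -> bool:
        # A node reports only after deciding its own status and hearing from all children.
        return u in decided and all(complete(c) for c in children[u])

    return complete(root)


# Hypothetical spanning tree rooted at node 1.
tree: Tree = {1: {2, 3}, 2: {1, 4, 5}, 3: {1}, 4: {2}, 5: {2}}
levels, parents = assign_levels(tree, root=1)
print(levels)
print(status_complete_at_root(parents, 1, decided={1, 2, 3, 4}))     # node 5 is still white
print(status_complete_at_root(parents, 1, decided={1, 2, 3, 4, 5}))
```

A single undecided leaf is enough to hold back the message at every ancestor, which is why the root can treat its arrival as evidence that no white node remains.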
The root r then initiates the distributed MST algorithm in [19] by broadcasting the START-UP MESSAGE. It is desirable to have the algorithm run simultaneously and asynchronously across the network so that it converges faster. This is why we choose the distributed MST algorithm in [19], which can be initiated spontaneously at any node or any subset of nodes. Each dominator in the MIS starts to perform the same local algorithm of [19] upon receiving the START-UP MESSAGE.

F. The description of algorithm

Fig. 3. A node decides its status in rounds: start, exchange status, update effective degree, decide new status.

In our algorithm, each node executes the rules above to determine its status until a prior CDS is found, as shown in Fig. 3. We call each time interval T defined in the update mechanism for effective degree a round. In a round, a node exchanges status messages, updates its effective degree by broadcasting the DEGREE MESSAGE, and then decides its status based on its 1-hop neighbors' effective degrees.

The description of algorithm
Step 1. Input: a connected graph $G=(V,E)$; each node's initial status is white.
Step 2. Choose a root $r \in V$ (e.g., using the leader election in [18]).
Step 3. r initiates the LEVEL MESSAGE to identify the nodes' levels.
Step 4. Each node executes Rules 1, 2 and 3, combined with the update mechanism for effective degree, until a prior CDS is found.
Step 5. Let C be the set of black nodes together with the nodes added to the MIS to establish the connections among dominators.
Step 6. When r receives the STATUS-COMPLETE MESSAGE, it initiates the distributed MST algorithm in [19] by broadcasting the START-UP MESSAGE.
Step 7. Dominators in C start to perform the same local algorithm upon receiving the START-UP MESSAGE.
Step 8. Let $C'$ be the set of nodes belonging to the MST that the algorithm in [19] finds.
Step 9. Return $C'$.

IV. ANALYSIS OF THE ALGORITHM

Theorem 1. The nodes broadcasting the type-I STATUS MESSAGE form an MIS. We denote this MIS as I.

Proof. Independence of I follows from Rule 1: of two adjacent white nodes, at most one can win the comparison on effective degree and ID, and the other turns gray on receiving the type-I STATUS MESSAGE. For maximality, take any $u \in V \setminus I$ and let $I' = \{u\} \cup I$. Node u is dominated by some node $v \in I$, since it has received the type-I STATUS MESSAGE at least once from its 1-hop neighbors; otherwise it would be a dominator and would itself broadcast the type-I STATUS MESSAGE. So there exist $u, v \in I'$ that are adjacent, and $I'$ cannot be an independent set. In other words, I is an MIS.

Theorem 2. The distributed algorithm has message complexity $O(\Delta n + n \log_2 n)$, where n is the number of nodes.

Proof. Each white node v broadcasts its DEGREE MESSAGE to its 1-hop neighbors during the update mechanism for effective degree. If all of its 1-hop neighbors turn gray one by one in different rounds, v broadcasts its DEGREE MESSAGE at most $\Delta$ times, since $\Delta$ is the maximum degree of the graph. Thus the message complexity of this process is $O(\Delta n)$. In 3-hop message relay, some nodes forward the type-I STATUS MESSAGEs generated by dominators that are at most 2 hops away from them. From Lemma 3, the number of times any node forwards a type-I STATUS MESSAGE is bounded by a constant. Besides, each node broadcasts the type-I or type-II STATUS MESSAGE only once. Since every node sends a constant number of messages, the total number of messages in this process is $O(n)$. The message complexity of the distributed leader election in [18] is $O(n \log_2 n)$. The total number of messages required by the distributed MST algorithm in [19] for a graph of n nodes and E edges is at most $5n \log_2 n + 2E$. If the nodes do not know the identities of their neighbors, each node can send its identity over each adjoining edge, requiring a total of 2E extra messages. When the MST algorithm is applied in our case, the number of nodes is the number of dominators in the MIS, which we denote by $|I|$.
By 3-hop message relay combined with Rules 2 and 3, each dominator knows not only its adjoining edges and their weights but also the identities of the dominators at the other end of these edges, so the distributed MST algorithm in [19] does not require additional messages to identify neighbors. Note that, in our case, each message broadcast by a dominator is forwarded to the other dominators within 3-hop distance in order to find the MST. Based on Lemma 3, for each $u \in I$, we assume that there are $l_{2k}$ and $l_{3k}$ dominators in I at 2-hop and 3-hop distance from u respectively. Therefore, the number of messages required for finding the MST is at most $5|I| \log_2 |I| \, (1 + l_{2k} + l_{3k})$. Like the LEVEL MESSAGE and the START-UP MESSAGE, the STATUS-COMPLETE MESSAGE is broadcast by each node only once. To initialize the nodes' effective degree, each node broadcasts a HELLO beacon in the initialization phase. Therefore, the message complexity of the STATUS-COMPLETE MESSAGE, LEVEL MESSAGE, START-UP MESSAGE and HELLO beacons is $O(n)$ in each case. Based on the above analysis, the message complexity is dominated by $O(\Delta n)$ and $O(n \log_2 n)$. Thus the message complexity of our algorithm is $O(\Delta n + n \log_2 n)$.

Theorem 3. The convergence time of finding a prior CDS C is $(n/3)\,T$.

Fig. 4. The topology of the network.

Proof. It is useful to give a mathematical expression for the time needed to find a prior CDS C. This question reduces to the convergence time of finding an MIS I; the related analysis is presented in the section on the method for the synchronization point. In each round except the final one, if only one white node turns black to become a dominator, that node eliminates at least two other white nodes, since its 1-hop neighbors turn gray. Consequently, forming the MIS I takes the longest time when dominators are generated one by one in sequence and each of them eliminates only two white nodes in every time interval T. Such a network topology can indeed be constructed, as shown in Fig. 4. In this topology, a black node with ID 2 + 3i emerges sequentially during the time slot [τ + nt, τ + T_1 + nt], i ∈ N, where i is the sequence number of the generated dominator. Thus we have the following equation:

3k + r = n    (1)

The result follows immediately from this equation.

Theorem 4. The bound on the size of the CDS generated by the algorithm has a constant approximation ratio to the MCDS in G.

Proof. From Lemma 3, ∀u ∈ V \ I, there is a constant number of dominators that are k units away from node u. By Rules 2 and 3, there is a unique path between any pair of dominators that are at most three hops apart. That is, any pair of dominators has at most two gray nodes on the path connecting them. ∀u ∈ I, let l_2k and l_3k denote the numbers of dominators in I that are at 2-hop and 3-hop distance from u, respectively. Lemma 2 immediately implies that the size of any independent set is at most 5·opt, where opt is the size of the MCDS of G. Therefore, the number of gray nodes the algorithm adds to I is no more than 5(l_2k + 2 l_3k)·opt. Since the distributed MST algorithm may remove some edges from the new weight graph, the nodes on these edges are also removed from the prior CDS C. Thus, the size of the final CDS generated by our algorithm is no more than 5(l_2k + 2 l_3k + 1)·opt. This proves that our algorithm has a constant approximation ratio to the MCDS.

Theorem 5. If the original network is connected, the nodes selected by the algorithm form a CDS of the network.

Proof. Since the final set is the MST found from the prior CDS C, it is a CDS if C is a CDS. So we only need to prove that C is a CDS. We use mathematical induction. By Lemma 1 and Theorem 1, I is a dominating set. What remains to be proved is that the nodes in C are connected. C is a CDS if and only if there is always at least one path through the nodes in C between any two nodes in C.

∀u, v ∈ C, we define d(u,v) as the number of nodes in C on the shortest path from u to v. The shortest path P is defined as P = (u, h_1, h_2, h_3, ..., h_n, v), with h_1, h_2, h_3, ..., h_n ∈ C (n is an integer).

If d(u,v) = 0, u is adjacent to v.

If d(u,v) = 1, there is a shortest path (u, h_1, v). Assume that there is no path through nodes in C between u and v. However, Rule 2 guarantees h_1 ∈ C, and thus v can be reached from u through the node h_1 in C.

If d(u,v) = 2, there is a shortest path (u, h_1, h_2, v). Assume that v cannot be reached from u through nodes in C. However, Rule 3 guarantees h_1, h_2 ∈ C. Thus, there is a path through the nodes in C between u and v.

Now assume that such a path between u and v always exists for d(u,v) < n. When d(u,v) = n, n ≥ 3, the shortest path between u and v is (u, h_1, h_2, h_3, ..., h_n, v). Since I is a dominating set, each h_i ∈ V is either a dominator itself or dominated by a dominator. Let the dominator be z. Hence z = h_i or h_i ∈ N(z). By the induction hypothesis, d(u,z) < n and d(z,v) < n, so u can reach z and z can reach v through the nodes in C. Therefore, there is a path through the nodes in C between u and v for d(u,v) = n. This proves the conclusion.

Fig. 7. Number of rounds to find a prior CDS.

V. SIMULATION WORK

In this section, we evaluate the performance of our algorithm by comparing it with the work in [20] and [8] under different conditions. From now on, we call them Kim-CDS and Funke-CDS, respectively. We evaluate not only the CDS size but also the Average Hop Distance (AHD) of the CDS, defined as the average longest shortest-path length, namely, the average number of nodes on these paths. We evaluate AHD because it captures the expected path length for message delivery. This is an important factor because the probability of message transmission failure often increases with a longer AHD. According to the analysis in Theorem 3, the convergence time of finding C is (n/3)T. That is, the more rounds needed to generate a prior CDS, the more time is required. So we also evaluate the number of rounds needed to find a prior CDS under different conditions.

For the specific simulation, random graphs are generated in a 100 × 100 square-unit 2-D virtual space by randomly placing a certain number of nodes. The number of nodes is increased from 40 to 100. Any pair of nodes is linked if their distance is less than the transmission radius r. If the generated graph is disconnected, it is simply discarded.
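The instance generation described above can be sketched as follows. This is a minimal illustration rather than the authors' simulator: the function names, the seeded generator, and the `max_tries` guard are assumptions introduced here. Only the rules themselves (uniform placement in a square, a link when the distance is less than r, disconnected graphs discarded) come from the text.

```python
import math
import random
from collections import deque


def random_unit_disk_graph(n, radius, side=100.0, rng=random):
    """Place n nodes uniformly in a side x side square; link pairs closer than radius."""
    points = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            # Strict inequality, matching "distance is less than r" in the text.
            if math.dist(points[i], points[j]) < radius:
                adj[i].add(j)
                adj[j].add(i)
    return points, adj


def is_connected(adj):
    """Breadth-first search from an arbitrary node; True if every node is reached."""
    if not adj:
        return True
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(adj)


def connected_instances(count, n, radius, side=100.0, seed=0, max_tries=100000):
    """Collect `count` connected instances, discarding disconnected graphs.

    `seed` and `max_tries` are hypothetical conveniences, not part of the paper.
    """
    rng = random.Random(seed)
    kept = []
    tries = 0
    while len(kept) < count and tries < max_tries:
        tries += 1
        points, adj = random_unit_disk_graph(n, radius, side, rng)
        if is_connected(adj):
            kept.append((points, adj))
    return kept
```

A call such as `connected_instances(100, 60, 20.0)` would then yield the connected instances on which a CDS algorithm is run; sparse settings (few nodes, small radius) may need many attempts before a connected graph appears.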

Connected graphs are kept for the simulation. The transmission radius r is set to 15 or 20. To reduce the influence of randomness in the experimental data, we randomly create 100 connected graph instances and compute a CDS for each instance.

Fig. 5 compares the algorithms in terms of CDS size under different conditions. As the transmission radius increases, the size of the CDS computed by each algorithm decreases. As the number of nodes increases under a constant transmission radius, the size of the CDS grows. However, our algorithm always generates the smallest CDS under the same conditions compared with the other algorithms. Fig. 6 compares the AHD produced by the different algorithms under different conditions. As expected, the simulation shows that our algorithm performs well in terms of AHD. Fig. 7 shows the number of rounds needed to find a prior CDS. It indicates that our algorithm finds a prior CDS quickly, since the number of rounds is no more than 5 under all tested conditions.

Fig. 5. CDS size comparison when the transmission radius is (a) 15, (b) 20.

Fig. 6. AHD of CDS comparison when the transmission radius is (a) 15, (b) 20.

Energy saving is the key factor in prolonging the lifetime of sensor nodes, and thus we also evaluate our algorithm in terms of energy consumption. In this simulation, the number of nodes deployed in the 1000 × 1000 virtual space is increased from 40 to 100 in steps of 10. The length of each packet is 2048 bits and the data transmission rate of each node is 2048 bps. Each packet is generated randomly by a source node and sent to another node. Each node has an initial energy of 1 J. We calculate how many rounds each CDS generated by the different algorithms survives until the energy level of any node in the CDS reaches 0.
Here, a round is defined as the successful reception of a message by the destination node after it is transmitted by the source node. The number of rounds therefore indicates the network lifetime. We again generate 100 connected graph instances to derive an average value.

As for the energy consumption model, in order to focus on the energy efficiency of each CDS generated by the different algorithms, the energy spent on constructing a CDS is ignored. The amounts of energy a node consumes when it sends and receives one bit are E_t = α + β r^m and E_r = ρ, respectively. Here r is the transmission radius, α is the distance-independent constant term, β is the coefficient of the distance-dependent term, and m is the path loss index. The parameter values are set the same as in the simulation work in [20]; Table I lists them. The routing scheme used in the simulation is also the same as in [20]. That is, we assume that any packet a node sends is forwarded along a shortest path selected in the CDS. All nodes on this routing path consume energy according to the model above.
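The energy model and the round-counting procedure can be illustrated with a short sketch. It uses the Table I values and the 2048-bit packets and 1 J initial energy stated in the text. Several points are assumptions made here for illustration: that r is expressed in meters, that only CDS nodes are tracked, that a relay pays for both reception and transmission, and that counting stops as soon as a CDS node cannot afford the next delivery. The function names are hypothetical.

```python
# Parameter values from Table I, converted to joules (r assumed to be in meters).
ALPHA = 50e-9        # distance-independent term, J/bit
BETA = 0.0013e-12    # coefficient of the distance-dependent term, J/bit/m^4
RHO = 50e-9          # reception cost, J/bit
M = 4                # path loss index
RADIUS = 250.0       # transmission radius r
PACKET_BITS = 2048   # packet length stated in the text
INITIAL_ENERGY = 1.0 # initial energy per node, J


def tx_energy_per_bit(r=RADIUS, alpha=ALPHA, beta=BETA, m=M):
    """E_t = alpha + beta * r^m."""
    return alpha + beta * r ** m


def rx_energy_per_bit(rho=RHO):
    """E_r = rho."""
    return rho


def route_cost(path, packet_bits=PACKET_BITS):
    """Energy drawn from each node when one packet traverses `path`.

    The source only transmits, the destination only receives, and every
    relay in between does both (an interpretation of the text).
    """
    cost = {node: 0.0 for node in path}
    for sender, receiver in zip(path, path[1:]):
        cost[sender] += packet_bits * tx_energy_per_bit()
        cost[receiver] += packet_bits * rx_energy_per_bit()
    return cost


def lifetime_rounds(routes, cds, initial_energy=INITIAL_ENERGY):
    """Count successful deliveries until some CDS node runs out of energy."""
    energy = {node: initial_energy for node in cds}
    rounds = 0
    for path in routes:
        cost = route_cost(path)
        # Assumed stopping rule: a CDS node that cannot pay ends the lifetime.
        if any(energy[v] < cost[v] for v in path if v in energy):
            break
        for v in path:
            if v in energy:
                energy[v] -= cost[v]
        rounds += 1
    return rounds
```

With these values the distance-dependent term dominates the transmission cost, so the lifetime of a CDS is governed mainly by how many packets its busiest relay has to forward.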

TABLE I. Values of the parameters.

Parameter    Value
α            50 nJ/bit
r            250
m            4
β            0.0013 pJ/bit/m^4
ρ            50 nJ/bit

The result in Fig. 8 shows that our algorithm is more energy efficient than the others.

Fig. 8. Network lifetime comparison.

VI. CONCLUSION

In this paper, we have presented a simple and efficient distributed algorithm for constructing a CDS in wireless ad-hoc sensor networks. Each node makes its own local decision based on message exchanges with its 1-hop neighbors. Each dominator is selected by comparing effective degree among its neighbors, and the effective degree of the white nodes is updated in every time interval T. Through 3-hop message relay, each node learns the paths leading to the other dominators within 3-hop distance; paths selected by the rules are then converted into weighted edges, with the weight determined by the number of nodes on each path. We then use the MST to optimize the prior CDS the algorithm finds. We provide a detailed analysis and proofs of several properties of our algorithm. The simulation results show that our algorithm outperforms the other algorithms in terms of CDS size, AHD, and network lifetime. In general, our algorithm can be implemented in wireless ad-hoc sensor networks to design protocols such as MAC, location-based routing, energy conservation, and resource discovery protocols, and so improve network performance. In addition, in another paper we have presented another version that achieves the same results and does not need to evaluate the parameters T_1 and T_2.

ACKNOWLEDGMENT

This work was supported in part by NSF CNS-1217791, the National Key Basic Research Program of China (2013CB329603), and the National Natural Science Foundation of China (No. 61431008, No. 61271220, No. 61170164).

REFERENCES

[1] K. Sohrabi, J. Gao, V. Ailawadhi, and G. J. Pottie, "Protocols for self-organization of a wireless sensor network," IEEE Personal Communications, vol. 7, no. 5, pp. 16-27, 2000.
[2] M. R. Garey and D. S. Johnson, Computers and Intractability: A Guide to the Theory of NP-Completeness. New York, NY: W. H. Freeman and Company, 1979.
[3] S. Guha and S. Khuller, "Approximation algorithms for connected dominating sets," Algorithmica, vol. 20, no. 4, pp. 374-387, 1998.
[4] R. Bhatt and R. Datta, "Utilizing graph sampling and connected dominating set for backbone construction in wireless multimedia sensor networks," in Communications (NCC), 2014 Twentieth National Conference on. IEEE, 2014, pp. 1-6.
[5] J. Wu and H. Li, "On calculating connected dominating set for efficient routing in ad hoc wireless networks," in Proceedings of the 3rd International Workshop on Discrete Algorithms and Methods for Mobile Computing and Communications. ACM, 1999, pp. 7-14.
[6] P.-J. Wan, K. M. Alzoubi, and O. Frieder, "Distributed construction of connected dominating set in wireless ad hoc networks," in INFOCOM 2002. Twenty-First Annual Joint Conference of the IEEE Computer and Communications Societies. Proceedings. IEEE, vol. 3. IEEE, 2002, pp. 1597-1604.
[7] M. Cardei, M. X. Cheng, X. Cheng, and D.-Z. Du, "Connected domination in multihop ad hoc wireless networks," in JCIS. Citeseer, 2002, pp. 251-255.
[8] S. Funke, A. Kesselman, U. Meyer, and M. Segal, "A simple improved distributed algorithm for minimum CDS in unit disk graphs," ACM Transactions on Sensor Networks (TOSN), vol. 2, no. 3, pp. 444-453, 2006.
[9] R. Misra and C. Mandal, "Minimum connected dominating set using a collaborative cover heuristic for ad hoc sensor networks," Parallel and Distributed Systems, IEEE Transactions on, vol. 21, no. 3, pp. 292-302, 2010.
[10] C. Lenzen, Y.-A. Pignolet, and R. Wattenhofer, "Distributed minimum dominating set approximations in restricted families of graphs," Distributed Computing, vol. 26, no. 2, pp. 119-137, 2013.
[11] A. E. Abdallah, T. Fevens, and J. Opatrny, "3D local algorithm for dominating sets of unit disk graphs," Ad Hoc & Sensor Wireless Networks, vol. 19, no. 1-2, pp. 21-41, 2013.
[12] J. S. He, S. Ji, Y. Pan, and Z. Cai, "Approximation algorithms for load-balanced virtual backbone construction in wireless sensor networks," Theoretical Computer Science, vol. 507, pp. 2-16, 2013.
[13] V. C. Sharmila and A. George, "Construction of strategic connected dominating set for mobile ad hoc networks," Journal of Computer Science, vol. 10, no. 2, pp. 285-295, 2014.
[14] J. A. Torkestani, "Mobility-based backbone formation in wireless mobile ad-hoc networks," Wireless Personal Communications, vol. 71, no. 4, pp. 2563-2586, 2013.
[15] R. Jovanovic and M. Tuba, "Ant colony optimization algorithm with pheromone correction strategy for the minimum connected dominating set problem," Computer Science and Information Systems, vol. 10, no. 1, pp. 133-149, 2013.
[16] M. V. Marathe, H. Breu, H. B. Hunt, S. S. Ravi, and D. J. Rosenkrantz, "Simple heuristics for unit disk graphs," Networks, vol. 25, no. 2, pp. 59-68, 1995.
[17] Y. Wang and M. Li, "Geometric spanners for wireless ad hoc networks," in Distributed Computing Systems, 2002. Proceedings. 22nd International Conference on. IEEE, 2002, pp. 171-178.
[18] I. Cidon and O. Mokryn, "Propagation and leader election in a multihop broadcast environment," in Distributed Computing. Springer, 1998, pp. 104-118.
[19] R. G. Gallager, P. A. Humblet, and P. M. Spira, "A distributed algorithm for minimum-weight spanning trees," ACM Trans. Program. Lang. Syst., vol. 5, no. 1, pp. 66-77, Jan. 1983. [Online]. Available: http://doi.acm.org/10.1145/357195.357200
[20] D. Kim, Y. Wu, Y. Li, F. Zou, and D.-Z. Du, "Constructing minimum connected dominating sets with bounded diameters in wireless networks," Parallel and Distributed Systems, IEEE Transactions on, vol. 20, no. 2, pp. 147-157, 2009.