DEFENDING AGAINST MALICIOUS NODES IN CLOSED MANETS THROUGH PACKET AUTHENTICATION AND A HYBRID TRUST MANAGEMENT SYSTEM

DEFENDING AGAINST MALICIOUS NODES IN CLOSED MANETS THROUGH PACKET AUTHENTICATION AND A HYBRID TRUST MANAGEMENT SYSTEM

APPROVED BY SUPERVISING COMMITTEE:
Turgay Korkmaz, Ph.D., Chair
Ali S. Tosun, Ph.D.
Gregory B. White, Ph.D.
William H. Winsborough, Ph.D.
G.V.S. Raju, Ph.D.

ACCEPTED: Dean, Graduate School

DEDICATION

I dedicate this dissertation to my parents, Usman Ghani Akbani and Zarina Akbani.

DEFENDING AGAINST MALICIOUS NODES IN CLOSED MANETS THROUGH PACKET AUTHENTICATION AND A HYBRID TRUST MANAGEMENT SYSTEM

by

REHAN AKBANI, M.S.

DISSERTATION
Presented to the Graduate Faculty of The University of Texas at San Antonio in Partial Fulfillment of the Requirements for the Degree of

DOCTOR OF PHILOSOPHY IN COMPUTER SCIENCE

THE UNIVERSITY OF TEXAS AT SAN ANTONIO
College of Science
Department of Computer Science
August 2009

ACKNOWLEDGEMENTS

I would like to thank my Ph.D. advisor, Dr. Turgay Korkmaz, for his guidance and understanding support. I would also like to thank my committee members, Dr. Ali Saman Tosun, Dr. Greg White, Dr. William Winsborough, and especially Dr. G.V.S. Raju for their invaluable input and constructive criticism.

August 2009

DEFENDING AGAINST MALICIOUS NODES IN CLOSED MANETS THROUGH PACKET AUTHENTICATION AND A HYBRID TRUST MANAGEMENT SYSTEM

Rehan Akbani, Ph.D.
The University of Texas at San Antonio, 2009
Supervising Professor: Turgay Korkmaz, Ph.D.

Wireless links and lack of central administration make MANETs far more susceptible to attacks than conventional networks. MANETs must provide various levels of security guarantees to different applications for their successful deployment and usage. Their security requirements depend greatly on their architecture. In this dissertation we focus on closed MANETs, where only designated nodes are supposed to access the network (e.g., in a military or corporate setting). We define outsider nodes as those nodes that are not authorized to access the network, and insider nodes as those that are allowed to access the network. The objective of this research is to develop mechanisms that protect a closed MANET against malicious behavior from outsider nodes as well as insider nodes through packet authentication and a Hybrid Trust Management System, respectively.

To defend against outsider nodes, we present a new Hop-by-hop, Efficient Authentication Protocol, called HEAP, which is suitable for unicast or multicast applications. HEAP is independent of the routing protocol used, and it is based on a modified HMAC algorithm that uses two keys and is very efficient. We compare the performance of HEAP against other algorithms and provide proofs of its security.

To combat insider attacks, we propose a new hybrid trust management system that is based on Reputation Systems (RS) and Role Based Trust Management (RBTM). We develop a novel Machine Learning based RS, called EMLTrust, and delineate its advantages. We compare its performance against other RSs and demonstrate the improvements in performance. We also highlight the challenges associated with using RBTMs in MANETs and offer some solutions. Finally, we propose a hybrid TM system that combines EMLTrust with RBTM, and evaluate it to illustrate its efficacy in thwarting insider attacks.

Table of Contents

1 Introduction
    Overall Scope and Background
    Problem Definition - Its Significance and Challenges
    Main Contributions and Dissertation Overview

2 Defending Against Outsider Attacks
    Introduction
    Related Work
    TESLA
    LHAP
    Lu and Pooch
    Security Goals
    HEAP: Hop-by-hop Efficient Authentication Protocol
    Key Generation and Distribution
    MAC Generation and Authentication in HEAP
    Index Numbers
    Differences between NMAC, HMAC and HEAP's construct
    Security Analysis of HEAP
    Simulations and Results
    Simulation Setup
    2.6.2 Latency
    Throughput and Packet Delivery Ratio
    Memory Requirement
    Computational Cost
    Conclusions and Future Work

3 Defending Against Insider Attacks
    Introduction
    Related Work
    Monitoring Based Trust Management
    Reputation Based Trust Management
    Role Based Trust Management
    Trust Management Models for MANETs

4 A Basic Machine Learning Based Reputation System
    Introduction
    Basic Machine Learning Approach
    Building the Core SVM based Reputation System
    Factors in Building the Classifier
    Feature Selection
    Proportion of Malicious Nodes
    Size of Dataset
    Evaluation Methodology
    Evaluation Metrics
    4.4.6 Kernel Used
    Guarding Against Fake Feedbacks
    Evaluations of the Core SVM based Reputation System
    Simulation Setup
    Attack Scenarios
    Experiments and Results
    Conclusions

5 EMLTrust: An Enhanced Machine Learning Based Reputation System
    Enhancing SVM with Fading Memories (SVM-FM)
    Evaluating SVM with Fading Memories (SVM-FM)
    Fading Memories Enhanced with Digital Signatures
    Introducing Dynamic Thresholds with SVM
    Conclusions and Future Work

6 Hybrid Trust Management in MANETs based on Reputation and Role Based TM Systems
    Introduction to Role Based Trust Management
    Challenges of Using Role Based Trust Management in MANETs
    Credential Storage
    Credential Chain Distribution and Collection
    Introduction to a Hybrid Trust Management System
    Proposed Hybrid Trust Management System (HTMS)
    6.4.1 Determining the Exact Privilege Level
    Merits of HTMS
    Evaluating HTMS
    Simulation Setup
    Experiments and Results
    Estimating Percentage of Malicious Nodes
    Conclusions and Future Work

7 Dissertation Summary and Future Work
    Research Objectives
    Guarding Against Outsiders
    Guarding Against Insiders
    Different Trust Management Approaches
    Basic Machine Learning Based Reputation System
    Enhanced Machine Learning Based Reputation System
    Hybrid Trust Management System for MANETs

A Comprehensive List of the Author's Publications
    A.1 Book Chapter
    A.2 Journal Publications
    A.3 Conference Publications

Vita

List of Figures

Figure 2.1 Node A needs to transmit a packet to its neighbors A1 through A
Figure 2.2 Packet generation in the proposed HEAP
Figure 2.3 Illustration of the NMAC construct
Figure 2.4 Illustration of the HMAC construct
Figure 2.5 Illustration of the HEAP construct
Figure 2.6 Mean Latency (ms) plotted on a log axis vs. Number of hops for all algorithms. The bottom curve represents ODMRP, LHAP Min and HEAP since all three overlap almost exactly
Figure 2.7 Mean Throughput (bytes/sec) vs. Packet Rate (pkts/sec) for all five algorithms
Figure 2.8 Mean Packet Delivery Ratio (%) vs. Packet Rate (pkts/sec) for all five algorithms
Figure 2.9 Mean Throughput (bytes/sec) vs. Packet Rate (pkts/sec) for all five algorithms when the nodes are moving about randomly
Figure 2.10 Memory Requirement (bytes) on a log axis vs. Duration of Key Chain (mins) for all four algorithms
Figure 2.11 CPU Time (microseconds) vs. Number of Neighbors for HEAP
Figure 3.1 Which architecture should be paired with which TM system
Figure 4.1 General framework of a Reputation System that decides whether to transact with a given node or not
Figure 4.2 Classification Error vs. Proportion of malicious nodes for Attack
Figure 4.3 Classification Error vs. Proportion of malicious nodes for Attack
Figure 4.4 Classification Error vs. Proportion of malicious nodes for Attack
Figure 4.5 Classification Error vs. Proportion of malicious nodes for Attack
Figure 4.6 Classification Error vs. Proportion of malicious nodes for Attack
Figure 4.7 ROC Curves for Attack
Figure 4.8 ROC Curves for Attack
Figure 4.9 ROC Curves for Attack
Figure 4.10 ROC Curves for Attack
Figure 4.11 ROC Curves for Attack
Figure 4.12 Bandwidth Overhead for each of the three algorithms
Figure 5.1 (Adapted from [60]) Updating Fading Memories: F T V [i] denotes the faded values at time t and F T V [i] denotes the faded values at time t
Figure 5.2 Classification Error vs. Percentage of Malicious Nodes for small, fixed periods of oscillation for SVM with FM and SVM without FM
Figure 5.3 Classification Error vs. Percentage of Malicious Nodes for large, variable periods of oscillation for SVM with FM (FM error) and SVM without FM (Orig Error). Attack 1 has no oscillations
Figure 5.4 Classification Error vs. Percentage of Malicious Nodes for large, variable periods of oscillation in attack
Figure 5.5 Classification Error vs. Percentage of Malicious Nodes for large, variable periods of oscillation in attack
Figure 5.6 Classification Error vs. Percentage of Malicious Nodes for large, variable periods of oscillation in attack
Figure 5.7 Classification Error vs. Percentage of Malicious Nodes for large, variable periods of oscillation in attack
Figure 5.8 Classification Error vs. Malicious Nodes percentage for different algorithms with FM in attack
Figure 5.9 Classification Error vs. Malicious Nodes percentage for different algorithms with FM in attack
Figure 5.10 Classification Error vs. Malicious Nodes percentage for different algorithms with FM in attack
Figure 5.11 SVM's best threshold vs. Malicious Nodes percentage for attack
Figure 5.12 SVM error under the default threshold and under the best thresholds
Figure 5.13 Dynamic Thresholds Error vs. Number of Samples taken for different proportions of malicious nodes in the network
Figure 5.14 Dynamic Thresholds Error with 20 and 25 samples compared to the default error and the minimum possible error
Figure 6.1 Randomly generated behavior of a node vs. the corresponding ideal response curve, along with maximum and minimum privilege levels
Figure 6.2 Effect of varying proportions of malicious nodes in test sets. Training set has 0% malicious nodes
Figure 6.3 Average daily discrepancy from the ideal curve vs. percentage of malicious nodes in the network (smaller is better). Training set has 0% malicious nodes
Figure 6.4 Effect of varying proportions of malicious nodes in test sets. Training set has 30% malicious nodes
Figure 6.5 Average daily discrepancy from the ideal curve vs. percentage of malicious nodes in the network. Training set has 30% malicious nodes
Figure 6.6 Effect of varying proportions of malicious nodes in test sets. Training set has 70% malicious nodes
Figure 6.7 Average daily discrepancy from the ideal curve vs. percentage of malicious nodes in the network. Training set has 70% malicious nodes

Chapter 1: Introduction

1.1 Overall Scope and Background

Mobile Ad-hoc Networks (MANETs) have attracted the attention of researchers due to their potential use in exciting new applications. MANETs consist of wireless devices, such as laptops, PDAs, cell phones or sensors, that come together to form a wireless network without any fixed infrastructure. They lack dedicated routers and their topology changes continuously based on the movement of the wireless devices, called nodes. Nodes can communicate with their neighbors who are within wireless range, as well as with distant nodes using intermediary nodes as routers [7].

These unique properties of MANETs allow them to be used in novel situations. For example, MANETs can be rapidly deployed in emergencies and disaster-struck areas to allow firefighters, paramedics and security personnel to communicate with each other and coordinate the relief effort. This capability would be indispensable in areas afflicted by hurricanes or other natural disasters, or in areas targeted by terrorism where any pre-existing network infrastructure would have been destroyed. MANETs can also be used in a military setting where troops can exchange strategic information on the move [7]. The ability to cope with high mobility and the lack of fixed infrastructure make MANETs very attractive for military and tactical applications.

Another useful application for MANETs is mobile conferencing for business meetings and seminars involving a large group of people, where access points may be absent or inaccessible. They are also very useful for people doing fieldwork, such as geologists, cartographers or archaeologists, who may want to communicate with their colleagues in the field. MANETs can be used to establish a wide wireless coverage area, e.g., in airports, university campuses and even in city localities, where installing and maintaining access points every few hundred feet would be expensive and sometimes impractical. MIT's Media Lab has a $100 Laptop program whose aim is to develop cheap laptops for mass distribution to children in developing countries [33]. These laptops will be programmed to form a MANET out of the box so that the children can communicate with each other.

In addition, MANETs can be used as sensor networks where several hundreds or thousands of sensors can be distributed throughout a region being studied. Each sensor records some information about its environment and transmits the data back to a processing station. These sensors may be mobile and may need to coordinate with each other to gather data.

1.2 Problem Definition - Its Significance and Challenges

MANETs must provide various levels of security guarantees to different applications for their successful deployment and usage. These security requirements can range from very stringent for military and anti-terrorism operations to less stringent for business and personal use. Consider the scenario where an area has just been attacked by terrorists. We can envision a situation where policemen, paramedics, firefighters and rescue workers all need to communicate and coordinate their efforts in order to provide disaster relief. This is a life-threatening situation where efficient and secure communication could result in lives being saved. On the flip side, we assume that terrorists have hidden wireless devices on the scene that are intended to disrupt these communications and maximize the impact of the terrorist incident. Once the use of MANETs becomes widespread in disaster relief efforts, they can become a potential target for terrorists, who are rapidly increasing in computer know-how [21].

Other scenarios can also be envisioned. In a military setting, enemy forces may try to disrupt a MANET used by soldiers on the battlefield or inject false information into it. A commercial MANET might become the victim of industrial espionage. A MANET for personal use could be exploited to hack into a user's computer to commit identity theft, credit card theft or fraud.

Unfortunately, due to their wireless links and lack of central administration, MANETs are far more susceptible to attacks than conventional networks [11, 19, 20, 25, 31]. For example, it is easy for attackers to enter or leave a wireless network and eavesdrop since no physical connection is required. They can also directly attack the network to drop packets, tamper with packets or inject false packets [8]. As a result, it is possible to launch sophisticated wormhole, man-in-the-middle and Denial of Service (DoS) attacks with ease, or to impersonate another node. It is so easy to attack a MANET without security that the risk of using a MANET might actually outweigh the benefits. It is therefore paramount to have effective security solutions in place for their successful deployment.

The problem is further exacerbated by the fact that the architecture of the MANET may vary greatly based on its application. In an open MANET anyone is free to enter or leave the network (e.g., in airports and university campuses), whereas in a closed MANET only designated nodes may gain access (e.g., in a military setting). The network may be hierarchical, where nodes have different roles and privileges and provide different services (e.g., network administration, certifying authority (CA), security and access control), or it may be flat, where each node provides the same type of services (e.g., packet forwarding). Furthermore, the nodes may belong to a Single Administrative Domain (SAD), where only one administrator controls the network (e.g., the US army), or Multiple Administrative Domains (MAD), where there are multiple administrators (e.g., the US and UK armies in a joint operation) [84].

In general, it is more difficult to provide security in an open, flat, MAD network, such as a city-wide public network. This is because there is no restriction on who may access the network, there are no designated nodes to provide security, and the network may be administered differently across different localities. Fortunately, the security requirements of such networks are also not very demanding since users expect public networks to be insecure. By contrast, closed, hierarchical, SAD networks may have very strict security requirements, such as in the military or in the police department.

In this dissertation we focus on closed networks, which may be flat or hierarchical, SAD or MAD. In the context of closed networks, we define outsider nodes as those nodes that are not authorized to access the network, and insider nodes as those that are allowed to access the network. The objective of this research is to develop mechanisms that protect a closed MANET against malicious behavior from insider and outsider nodes through packet authentication and a Hybrid Trust Management system. The goal is not to protect an individual node from being compromised; rather, we expect some nodes to be compromised, and the goal is to limit the damage such compromises can cause. The network as a whole should be able to function at a satisfactory level and provide services with reasonable reliability even in the presence of such malicious intruders. We will investigate and quantify the impact of increasing the proportion of malicious nodes in the network.
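
As a rough illustration of the architectural axes and the insider/outsider distinction described above, the following sketch models them as simple Python types. All names here (Membership, Structure, Administration, ManetArchitecture, classify_node) are hypothetical and chosen only for this example; the set of authorized node identifiers stands in for whatever membership credential a real deployment would use.

```python
from dataclasses import dataclass
from enum import Enum


class Membership(Enum):
    OPEN = "open"        # anyone may enter or leave the network
    CLOSED = "closed"    # only designated nodes may gain access


class Structure(Enum):
    FLAT = "flat"                  # every node provides the same services
    HIERARCHICAL = "hierarchical"  # nodes have different roles and privileges


class Administration(Enum):
    SAD = "single administrative domain"
    MAD = "multiple administrative domains"


@dataclass(frozen=True)
class ManetArchitecture:
    """One point in the three-axis design space (illustrative only)."""
    membership: Membership
    structure: Structure
    administration: Administration


def classify_node(node_id: str, authorized: frozenset[str]) -> str:
    """Label a node as an insider or outsider of a closed MANET.

    Assumption: authorization is modeled as plain set membership.
    """
    return "insider" if node_id in authorized else "outsider"
```

For instance, a military network of the kind mentioned in the text could be described as ManetArchitecture(Membership.CLOSED, Structure.HIERARCHICAL, Administration.SAD), and any node whose identifier is missing from the authorized set would be labeled an outsider.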
We only make minor assumptions, for instance that the network layer is able to relay at least some untampered packets from the source to the destination node. Protocols such as IPSec [17] or HMAC [9] can be used to ensure that packets are not tampered with. If the bottom layers completely fail to relay any packets, for example because the wireless physical layer is being jammed, the network is effectively crippled and no system at a higher layer can overcome that problem. In a real-world setting, it may be possible to locate the jamming device through triangulation or a physical search and disable it. Dealing with complete failures at lower layers is beyond the scope of this research. In many cases, the attacker may be more interested in covert attacks, such as propagating false information through the network, rather than completely disabling it. We are therefore focusing our research on attacks at the link layer or above.

1.3 Main Contributions and Dissertation Overview

Our research is divided into two parts. The first part (Chapter 2) deals with guarding against outsider nodes, while the second part (Chapters 3-6) deals with guarding against insider nodes.

In Chapter 2, we propose a scheme that defends against outsider attacks. We use packet authentication and propose a Hop-by-hop, Efficient Authentication Protocol, called HEAP. HEAP authenticates packets at every hop by using a modified HMAC-based algorithm that utilizes two keys, and it drops any packets that originate from outsiders. HEAP can be used with multicast, unicast or broadcast applications.

The remaining chapters deal with defending against insider attacks. In Chapter 3, we present the background material related to guarding against insiders through Trust Management (TM) systems. We present the pros and cons of three traditional TM systems: Behavior based TM, Reputation based TM, and Role based TM systems. We also discuss which TM approach is appropriate for which MANET architecture. In the following chapters, we investigate how to utilize various TM systems in the context of closed MANETs, and propose a hybrid TM system.

In Chapter 4, we present a new Machine Learning (ML) based Reputation System (RS). We discuss the motivation behind developing an ML based solution and the challenges associated with using ML. Then we discuss the details of our Support Vector Machines (SVM) based RS and compare it against other RSs found in the literature. We perform detailed evaluations of our RS and illustrate its improvements over other RSs.

In Chapter 5, two enhancements are proposed to the basic RS presented in Chapter 4. The first enhancement uses an algorithm called Fading Memories with Digital Signatures in order to enable SVM to look at longer histories using only a few features. This forces adversaries to behave well for longer periods of time in order to boost their reputation scores. The second enhancement, called Dynamic Thresholds, improves the accuracy of SVM by relocating its decision boundary dynamically, based on the proportion of malicious nodes in the network. It allows SVM to keep up with varying levels of malicious behavior in the network as time progresses. We call our enhanced Reputation System EMLTrust.

In Chapter 6, we highlight the problems associated with using a traditional Role based TM system with MANETs and suggest some solutions. Next, we propose a Hybrid Trust Management System that is based on Reputation Systems (RS) and Role Based Trust Management (RBTM). Such a system can overcome some of the limitations of both approaches when they are used in isolation. We describe the algorithm in detail and evaluate it in order to demonstrate its utility. Finally, we wrap up the dissertation and discuss overall conclusions and future work in Chapter 7.

The objectives of this dissertation can be outlined as follows:

- Identify the inherent limitations of providing security in MANETs based on the MANET's architecture.
- Devise a mechanism that guards against attacks from outsider nodes by using hop-by-hop packet based authentication.
- Consider various TM models and develop an efficient hybrid TM model for guarding against insider nodes. Some important problems that need to be addressed include:
    - How to compute a meaningful Reputation given the history of past interactions of the client with third parties.
    - How to account for the level of trust we have in these third parties and how to guard against nodes that present false evidence.
    - How to deal with clients that have little or no history.
    - How to prevent clients with bad reputations from changing their identities in an attempt to appear as new nodes.
    - How to make the system efficient and robust against node failures.
    - How to guard the system against colluding nodes.
    - How to store, collect and distribute credentials for Role based TM systems.
    - How to circumvent the problems associated with using a centralized Certifying Authority (CA), or Trust Authority (TA).
- Evaluate the proposed solutions using simulations and experiments.

Chapter 2: Defending Against Outsider Attacks

2.1 Introduction

MANETs must provide various levels of security guarantees to different applications for their successful deployment and usage. However, due to their wireless links and lack of central administration, MANETs have far greater security concerns than conventional networks [8]. For example, it is easy for attackers to enter or leave a wireless network and eavesdrop since no physical connection is required. Without a security scheme in place, an outsider node can easily pretend to be an insider and participate in routing packets. Therefore, it can directly attack the network by dropping packets, tampering with packets, injecting false packets or flooding the network. As a result, it is possible to launch sophisticated wormhole, man-in-the-middle and Denial of Service (DoS) attacks with ease, or to impersonate another node.

To combat such attacks, we study hop-by-hop packet authentication as a fundamental security strategy to block attacks from outsiders at the first hop. We propose a Hop-by-hop, Efficient Authentication Protocol, called HEAP, which is suitable for multicast, unicast or broadcast applications. In essence, HEAP uses a modified HMAC [9] based algorithm that utilizes two keys. Specifically, in the initial bootstrapping phase every node (i) shares a pairwise secret hash key, called okey, with each of its neighbors, and (ii) generates one common secret hash key, called ikey, and securely distributes it to all of its one-hop neighbors. Now consider the case where a node wants to broadcast a message to all of its neighbors. The naive way would be to generate a new MAC for every individual neighbor using its pairwise key. However, we present a more efficient way of generating MACs. We reexamined the working principles of the HMAC algorithm [9] and divided the HMAC computation into two steps so that we could significantly lower the computational cost.

In our scheme, we pay special attention to efficiency, since the nodes in MANETs are often constrained in battery power, memory, bandwidth, and CPU power. We explain the details of HEAP in Section 2.4. Using simulations in Section 2.6, we show that HEAP significantly lowers memory requirements and latency compared to previous schemes. In addition, our simulations show that the extra CPU and bandwidth overhead incurred by HEAP is negligible when compared to using no security scheme at all.

The rest of the chapter is organized as follows. We present related work in Section 2.2. We explain our security goals in Section 2.3. We then describe our proposed HEAP scheme and its security analysis in Sections 2.4 and 2.5. Using simulations, we compare HEAP with other packet authentication schemes in Section 2.6. Finally, we conclude this chapter and give some directions for future research in Section 2.7.

2.2 Related Work

MANET security research includes areas such as intrusion detection, secure routing, key establishment and distribution, and authentication. Intrusion detection and response is addressed in various studies (e.g., [15, 16, 35, 48]). In [18] Dahill et al. identify several security vulnerabilities in popular routing protocols such as AODV and DSR. They propose to use asymmetric cryptography to secure these protocols. Other secure routing protocols are presented in [47, 50, 52], but they assume the pre-existence and pre-sharing of public and secret keys for all initial members. The key sharing problem is considered by researchers in [22, 23, 24, 34, 94, 104]. Works such as [46, 56] aim at designing self-securing MANETs without needing third party certifying authorities outside the network. Other researchers have proposed schemes, including SEAD [27] and Ariadne [28], for securing routing protocols [55, 91, 92, 99, 100, 101, 102].

Most of these schemes authenticate only control packets and not data packets. Extending these schemes to directly authenticate data packets would result in too much overhead. On the other hand, not authenticating data packets leads to serious vulnerabilities against attacks such as DoS, replay, man-in-the-middle, wormhole, and impersonation attacks. Since it is important to guard against these attacks, we need to authenticate data as well as control packets. We have designed our scheme, HEAP, to authenticate both types of packets.

For comparison, three other packet authentication schemes presented in the literature are also designed to authenticate data as well as control packets: TESLA [49], LHAP [57, 58] and Lu and Pooch's algorithm [45] (hereafter referred to as "Lu"). These three algorithms approach the problem of authentication in slightly different ways, but all of them use one-way hash key chains. For example, TESLA provides end-to-end authentication. Although it is not specifically designed for MANETs, it can easily be applied to them. LHAP and Lu's algorithm build on the principles of TESLA and are specifically intended for MANETs. They provide hop-by-hop authentication just like HEAP. For this reason, we found it suitable to compare the security and performance of our algorithm against theirs.
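
The one-way hash key chains mentioned above can be illustrated with a minimal sketch. This shows only the generic idea (keys derived by repeated hashing and verified against a published commitment), not the exact construction used by TESLA, LHAP or Lu. The choice of SHA-256 and the function names make_key_chain, verify_disclosed_key and mac are assumptions made for this example.

```python
import hashlib
import hmac


def make_key_chain(seed: bytes, length: int) -> list[bytes]:
    """Build K_0..K_length with K_i = H(K_{i+1}); K_0 is the public commitment."""
    chain = [seed]                       # K_length: the secret end of the chain
    for _ in range(length):
        chain.append(hashlib.sha256(chain[-1]).digest())
    chain.reverse()                      # after reversal, chain[i] is K_i
    return chain


def verify_disclosed_key(key: bytes, index: int, commitment: bytes) -> bool:
    """Check that hashing `key` exactly `index` times yields the commitment K_0."""
    digest = key
    for _ in range(index):
        digest = hashlib.sha256(digest).digest()
    return hmac.compare_digest(digest, commitment)


def mac(key: bytes, packet: bytes) -> bytes:
    """Authenticate a packet with one chain key (standard HMAC-SHA256, assumed)."""
    return hmac.new(key, packet, hashlib.sha256).digest()
```

In this sketch a sender would publish chain[0] once, authenticate packets in the i-th interval with chain[i], and later disclose chain[i]. Because the hash is one-way, a receiver holding only the commitment can verify a disclosed key but cannot compute keys that have not yet been disclosed.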
In the rest of this section, we briefly describe these schemes and discuss their major security merits and vulnerabilities.

25 2.2.1 TESLA TESLA is a very efficient multicast stream authentication protocol that can be applied towards wired as well as wireless networks. It can be used for unicast and broadcast applications as well, and its parameters can be adjusted to adapt to varying network conditions, such as congestion. However, since TESLA was not specifically designed for MANETs but for LANs in general, it has certain drawbacks which make it less suitable for MANETs. For instance, packets are not authenticated at every hop, instead they are only authenticated by the final receiver after a delay of several seconds. The packets are held in a cache at the receiving node until the hash key used to authenticate them has been disclosed by the sender. If the hash key can regenerate the MAC of the packet, the packet is considered authentic; otherwise, it is dropped. Intermediate nodes simply forward the packets without authentication. As a result, an attacker can easily launch Denial of Service (DoS) attacks by flooding the network with bogus packets. These packets can flood the packet caches of all receiving nodes causing them to overflow and drop legitimate packets. An attacker could also easily launch worms that would quickly propagate throughout the network unchecked by the forwarding nodes. Hop-byhop authentication restricts the damage of this type of attack by limiting its propagation. In addition, TESLA requires the clocks of all the nodes in the network to be loosely synchronized. This requires a secure time synchronization protocol and possibly time servers to synchronize with. Such time servers may not always be available or reachable in a MANET. Any vulnerability in the synchronization mechanism could potentially allow an attacker to compromise the scheme and successfully send forged packets. A secure time synchronization protocol suitable for MANETs needs to be developed and tested for TESLA to be applicable in 11

26 MANETs. However, as discussed in [57, 45], secure time synchronization is difficult to achieve in MANETs LHAP LHAP was specifically designed for MANETs and it introduces the idea of hop-by-hop authentication. It builds on the principles of TESLA and tries to overcome some of its drawbacks. For example, LHAP does not require loose time synchronization. It is also very efficient and it authenticates packets instantly, reducing latency and eliminating the need for a cache at every node. However, LHAP is vulnerable to wormhole and man-in-the-middle attacks. LHAP s authors themselves point out this vulnerability (see Section 3 of [57], Collaborative Outsider Attack ). Briefly speaking, this vulnerability allows an outside attacker to eavesdrop on authentic packets sent by a node, modify them and retransmit them to another node in the network that is outside the range of the original node. This can be done using a dedicated private channel between two colluding nodes (i.e., a wormhole) or by the attacker being in the middle of the sender and the receiver that are outside each other s range. To thwart this attack, LHAP s authors suggest using GPS devices in all the nodes to ascertain whether a sending node should be within transmission range of the receiving node. We do not think it is practical to equip all the nodes with GPS devices, especially if those nodes are sensors or small wireless devices. We stress that even with GPS coordinates, it is very difficult to figure out whether another node is within transmission range because the range is affected by complex factors such as terrain, weather conditions, radio noise, and transmission power level. In a MANET, all of these factors may continually change and so it is not possible to verify whether a node is within transmission range, unless it is unambiguously too far off or 12

27 very close. LHAP s authors admit that their scheme does not fully address this attack. Because of this vulnerability an attacker can successfully send forged packets which will be considered authentic by the receiver. This goes against the purpose of the scheme which is to authenticate all incoming packets and prevent unauthentic packets from being propagated, leaving LHAP potentially vulnerable to wormhole, man in the middle, DoS and replay attacks Lu and Pooch Lu and Pooch s algorithm builds on LHAP. Like LHAP, Lu s algorithm also uses hop-by-hop authentication and is efficient, but unlike LHAP it uses only one key at every node instead of two. Like TESLA, Lu also uses delayed key disclosure causing network latency to increase, but the delay is not as large as TESLA s. We show here that Lu s algorithm is also vulnerable to wormhole and man-in-the-middle attack. Delayed key disclosure schemes like TESLA and Lu have a security condition that a data packet can only be considered safe and authentic if the receiver receives the packet before the sender discloses the corresponding key used to generate the MAC for that packet. Otherwise, an adversary would retrieve the disclosed key, generate the MAC for a forged packet and send it to the receiver. The receiver would consider that packet to be authentic since it would not know that the key had already been disclosed. To guard against this possibility, the sender in TESLA informs the receiver at what time it will transmit the key disclosure packet. If a data packet is received well before this time and its MAC is later authenticated with the disclosed key, the packet is considered authentic. But if a packet is received after the key disclosure time, then it is dropped even if the receiver has not yet received the key. This scheme requires that the clocks of the sender and the receiver must be synchronized, which is difficult to achieve. To circumvent this problem of 13

synchronization, Lu introduced the following idea. Instead of stating that the next key update packet will be broadcast at time t, the sender includes a delay parameter d in the data packet, meaning that the key will be disclosed d milliseconds after the transmission of this data packet.

There is a problem with this latter approach. A clever attacker can launch a wormhole or man-in-the-middle attack in which it forwards all the packets from the sender to the receiver during the bootstrap phase, so that the receiver mistakenly believes that the sender is its neighbor. The attacker then waits until it receives a key update packet from the sender. Once it has the key, the attacker can forge many packets, send them to the receiver, and only then disclose the key. Because the receiver does not expect a key update packet from the sender at a fixed time t, it has no way of knowing that this key was transmitted by the sender some time ago and is now obsolete. The attacker can use this obsolete key and all subsequent keys to successfully transmit forged packets. In this way the attacker can defeat Lu's scheme.

2.3 Security Goals

Outsider vs. Insider Nodes: An outsider node is a node that is not an authorized member of the MANET, whereas an insider node is an authorized member. For instance, in a military setting each authorized soldier might possess a signed certificate from a trusted third party granting him membership in the MANET. Such a node is an insider node. Any node not possessing such a certificate, or possessing a revoked certificate, is considered an outsider node.

In essence, the goal of this chapter is to thwart attacks from outsider nodes. Detecting

the attacks from insiders is left to the Intrusion Detection Systems. However, we do provide a foundation on which a response system can be based, by providing the capability to effectively cut off a compromised insider from the MANET. In addition, HEAP offers some level of protection against insiders who try to forge packets and impersonate other insiders. But because insiders already have access to the MANET, it is easy for them to launch more sophisticated attacks than simply forging packets. In Section 2.5, we discuss these issues in detail.

We now list the key security goals in defending the underlying network against outsider nodes. In Section 2.5, we explain how these goals are addressed by our proposed scheme.

1. Any packet transmitted by an outsider node should be immediately dropped by the receiving insider node at the first hop with a very high probability. In other words, packets sent by outsiders should not be allowed to propagate through the MANET. By fulfilling this requirement, we can successfully guard against a myriad of attacks by the outsider, such as DoS attacks that attempt to flood the network, wormhole attacks, man-in-the-middle attacks, SYN flooding, etc. This is because we are effectively disabling the outsider's ability to route any packets to any node that is not its neighbor. Even a neighboring node will simply drop packets from the outsider. However, this requirement dictates that we authenticate every packet at every hop, which in turn means that the authentication mechanism should be extremely efficient.

2. The outsider node is assumed to have the capability to spoof its identity, such as spoofing its IP and MAC addresses to impersonate an insider node. We cannot rely on these markers to verify the origin of a packet.

3. The outsider is assumed to have access to the wireless channel, so it can eavesdrop on legitimate traffic. If the traffic is supposed to remain confidential, end-to-end encryption should be used to protect it. However, we assume that the encrypted traffic and any associated MAC tags are visible to the outsider.

4. If a third party Intrusion Detection System (IDS) discovers an insider to be compromised, we must be able to exclude that insider from propagating any more packets within the MANET. Certificate Revocation Lists (CRL) may be used to revoke the certificates of compromised insiders.

2.4 HEAP: Hop-by-hop Efficient Authentication Protocol

In this section, we describe the key steps in our proposed authentication protocol, HEAP. We then highlight the differences between HEAP, HMAC and NMAC in generating a MAC (message authentication code).

Key Generation and Distribution

A node that wants to join a network must first generate a single group key, called ikey (for inner key), and one pair-wise key for each neighbor, called okey (for outer key). Thus each new node will generate n + 1 new keys (n okeys and 1 ikey), where n is the number of nodes in its neighborhood. The keys are simply random bit strings and are cheap to generate. The ikey is secretly shared with all the neighbors, while the pair-wise okey is only shared with the

corresponding neighbor. This can be done using standard key exchange mechanisms and trusted third party certificates, such as in the X.509 standard [5]. RSA or Elliptic Curve Cryptosystems (ECC) [51] can be used as the underlying PKI. A similar procedure is used in LHAP and Lu, where keys are exchanged with each new neighbor as the node moves about. Key exchange is employed under any one of the following circumstances:

1. When a node moves to a new neighborhood, it exchanges keys with its neighbors.

2. When an existing node in the neighborhood has remained idle for too long, for example for several minutes. The keys need to expire after a certain period of inactivity. This also accommodates nodes that leave a neighborhood.

3. The keys should expire after a certain amount of time, for example after a few hours, even if they are being used continuously without any idle time. This guards against brute force and cryptanalysis attacks by the adversary.

MAC Generation and Authentication in HEAP

After the key exchange phase, packet authentication can take place. We explain the authentication process in HEAP through an example. Consider the scenario shown in the accompanying figure, in which node A needs to transmit a packet to its neighbors, A1 through A5. We assume that all the nodes have joined the network and exchanged keys as described above under Key Generation and Distribution. So, node A secretly shares an okey with each of its neighbors, $ok_{A1}$ to $ok_{A5}$. In addition, node A has generated an ikey, $ik$, and has securely distributed it to all of its five neighbors. Only node A and its one-hop neighbors have access to $ik$. All the keys are hash keys, so they are, for example, 128 or 160 bits long if MD5 or SHA1, respectively, is the underlying hash function. Our proposed

scheme is independent of the underlying hash function.

Figure 2.1. Node A needs to transmit a packet to its neighbors A1 through A5.

We consider the case where node A wants to broadcast a message to all of its neighbors. We want each recipient of the packet, A1 through A5, to be able to authenticate that the packet originated from node A. One way to do this would be for node A to generate a new HMAC for every individual neighbor using its okey. However, we present a more computationally efficient algorithm using a slightly modified HMAC.

A key insight into the working of the HMAC algorithm [9] can significantly lower the amount of computational overhead for multicast and broadcast applications. Recall that HMAC is computed in the following way:

$\mathrm{HMAC}(M, K) = H\big((K \oplus opad) \,\|\, H((K \oplus ipad) \,\|\, M)\big)$,

where $H(x)$ is the hash value of $x$ using some hash function such as MD5 or SHA1; $\oplus$ is the

XOR operation; $M$ is the message that needs to be sent; $opad$ is the byte 0x5C repeated to fill one block and $ipad$ is the byte 0x36 repeated to fill one block (i.e., 512 bits), each of which is XORed with $K$ after $K$ has been padded to one block; the symbol $\|$ represents concatenation. We divide the HMAC computation into the following two steps:

Step 1: $H((K \oplus ipad) \,\|\, M)$

Step 2: $H((K \oplus opad) \,\|\, \text{hash from Step 1})$

For long messages, Step 1 accounts for most of the computational overhead. Step 2 only needs to hash 32 bytes for MD5 and 40 bytes for SHA1. This insight led us to develop the following MAC generation algorithm for multicast and broadcast applications.

We use the previously described two keys, namely ikey and okey. We use ikey to generate the hash in Step 1. So Step 1 for node A will be:

New Step 1: $H(ik \,\|\, M)$

The ikey is padded with 0s to make it one block wide (i.e., 512 bits). Since all the one-hop neighbors share the same ikey, these neighbors can authenticate a packet received from node A by only computing this first step. Therefore, the sender A needs to compute this expensive step only once, regardless of how many neighbors it has. However, we cannot use the same key for the second step as well, because all of A's neighbors have the key and any one of them could impersonate A and send forged packets. That is why we use the pair-wise okey,

$ok_{Ai}$, padded with 0s to make one block, in Step 2:

New Step 2: $H(ok_{Ai} \,\|\, \text{hash from Step 1})$

Since only the sender and the receiver have $ok_{Ai}$, no third party can generate this step. Of course, this step needs to be executed once for each neighbor. Fortunately, this step is computationally inexpensive because the input to the hash function is small, only 32 or 40 bytes.

Using the above MAC generation scheme, node A generates a packet using the following format and sends it to all its neighbors:

$A \rightarrow * : ind_A,\ M,\ MAC_1(M \,\|\, ind_A),\ \ldots,\ MAC_n(M \,\|\, ind_A)$

where $*$ represents any recipient, $M$ is the message, $ind_A$ is the last index number used by node A (discussed in Section 2.4.3), $n$ is the number of neighbors, and $MAC_i$ represents the MAC for neighbor Ai. Pseudocode for generating a packet in HEAP is shown in Figure 2.2.

When Ai receives the packet, it computes the MAC. If the result matches any of the MAC tags in the message and the index number is greater than the last authentic index number received, the message is accepted as authentic; otherwise, it is dropped.

To save bandwidth, the whole MAC need not be transmitted. Only the last half of the MAC can be transmitted, in accordance with the recommendation made in [9]. This reduces the tag size while keeping the search space large enough that the adversary's chances of generating the correct MAC for a forged message are remote.
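The two new steps described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the dissertation's implementation: the helper names `pad_key`, `make_tags` and `verify` are hypothetical, SHA-1 is picked arbitrarily as the underlying hash, and `message` is assumed to already carry the index number.

```python
import hashlib
import hmac

BLOCK = 64  # block size of MD5 and SHA-1 in bytes (512 bits)


def pad_key(key: bytes) -> bytes:
    """Zero-pad a key to one hash block (hypothetical helper)."""
    return key.ljust(BLOCK, b"\x00")


def step1(ikey: bytes, message: bytes) -> bytes:
    """New Step 1: H(ik || M), computed once per packet."""
    return hashlib.sha1(pad_key(ikey) + message).digest()


def step2(okey: bytes, inner: bytes) -> bytes:
    """New Step 2: H(ok_Ai || Step-1 hash), computed once per neighbor."""
    return hashlib.sha1(pad_key(okey) + inner).digest()


def make_tags(ikey: bytes, okeys: list, message: bytes) -> list:
    """Sender side: one expensive hash, then one cheap hash per neighbor."""
    inner = step1(ikey, message)
    # Keep only the last half of each 20-byte SHA-1 output as the tag.
    return [step2(okey, inner)[-10:] for okey in okeys]


def verify(ikey: bytes, okey: bytes, message: bytes, tags: list) -> bool:
    """Receiver side: accept if the recomputed tag matches any tag sent."""
    expected = step2(okey, step1(ikey, message))[-10:]
    return any(hmac.compare_digest(expected, tag) for tag in tags)
```

A neighbor holding the shared ikey and its own okey can verify the packet, while a node with a different okey, or a tampered message, produces no matching tag. The index check is deliberately left out of this sketch.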

Packet generation in HEAP:
Input: Message M containing the payload and headers of the upper layer, index number ind_A, keys ok_A1 to ok_An, and ik
Output: Packet S, new index number ind_A

    if ind_A is null then
        ind_A := 1
    else
        ind_A := ind_A + 1
    end if
    mac1 := H(ik ∥ M ∥ ind_A)    // New Step 1
    comp_mac := {}
    for each neighbor Ai do
        mac2 := H(ok_Ai ∥ mac1)    // New Step 2
        comp_mac := comp_mac ∥ last half of mac2
    end for
    S := ind_A ∥ M ∥ comp_mac
    return S and ind_A

Figure 2.2. Packet generation in the proposed HEAP.

Index Numbers

Packet index numbers are included in the packet to protect against message replays. Before transmission, an index number is concatenated with the payload data to give the message M. Each time a packet is sent, its index number is incremented by one. The hash generated depends on the index number, so any tampering with the index number can be detected. Any attempt by a malicious node to store and then retransmit the packet will be thwarted, because the receiving node expects to see an index number greater than the last authentic index number and so will consider the replayed packet unauthentic.
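The receiver-side index rule can be illustrated with a small sketch. The class `ReplayGuard` and its method names are hypothetical, and the MAC check itself is abstracted into a boolean argument so that the example isolates the replay logic; it assumes the receiver keeps one counter per sending neighbor.

```python
class ReplayGuard:
    """Tracks the last authentic index number seen from each sender."""

    def __init__(self):
        self.last_index = {}  # sender id -> last authentic index number

    def accept(self, sender: str, index: int, mac_ok: bool) -> bool:
        # A packet with an invalid MAC never updates the stored index,
        # so a forged high index cannot lock out the real sender.
        if not mac_ok:
            return False
        # The index must be strictly greater than the last authentic one.
        if index <= self.last_index.get(sender, 0):
            return False
        self.last_index[sender] = index
        return True


guard = ReplayGuard()
print(guard.accept("A", 1, True))   # first authentic packet is accepted
print(guard.accept("A", 1, True))   # replay of the same packet is dropped
```

Updating the counter only after a successful MAC check is a design choice of this sketch; it follows from the text's wording that the comparison is against the last authentic index number.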

2.4.4 Differences between NMAC, HMAC and HEAP's construct

Bellare et al. presented NMAC and HMAC in [9]. They listed two major differences between NMAC and HMAC. First, NMAC (Fig. 2.3) uses two keys, $k_1$ and $k_2$, while HMAC (Fig. 2.4) uses only one key, $k$. Second, NMAC replaces the fixed Initialization Vector (IV) with the key.

Figure 2.3. Illustration of the NMAC construct.

Figure 2.4. Illustration of the HMAC construct.

Normally, hashing algorithms such as MD5 and SHA1 use fixed IVs. Replacing the IV with the key would necessitate a modification to the hashing algorithm so that it takes the IV as a parameter instead of keeping it fixed. This would not be possible in off-the-shelf hardware

implementations of MD5 or SHA1, and it would involve modifying any software library code for MD5 and SHA1. Although the change required in software would be trivial, it would still be preferable if no change were required at all, and that was one of the motivations for the authors of HMAC.

In HEAP (Fig. 2.5), we wanted the ability to use off-the-shelf hardware or software implementations of the hashing algorithm, as HMAC does, but we also wanted to use two keys instead of one, as NMAC does. Therefore, neither HMAC nor NMAC was completely suited to our purpose, and we decided to modify HMAC so that it could accept two keys.

Figure 2.5. Illustration of the HEAP construct.

In the figures above, the padding symbol represents padding by zeros to increase the size of the string from $l$ bits to $b$ bits, where $l$ is 128 bits for MD5 and 160 bits for SHA-1, while $b$ is 512 bits for both. The keys in HMAC and HEAP are also padded with zeros to make them $b$ bits long.
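The relationship between HMAC and the HEAP construct can be checked with a short Python sketch. It assumes SHA-1 and a key no longer than one block; the function names `xor_pad`, `hmac_two_step` and `heap_construct` are hypothetical and are not taken from the dissertation. The sketch recomputes HMAC as the two steps described earlier and then shows the HEAP variant, which feeds two independent zero-padded keys to an unmodified hash function.

```python
import hashlib
import hmac

BLOCK = 64  # b = 512 bits for both MD5 and SHA-1


def xor_pad(key: bytes, pad_byte: int) -> bytes:
    """Zero-pad the key to one block, then XOR every byte with pad_byte."""
    return bytes(k ^ pad_byte for k in key.ljust(BLOCK, b"\x00"))


def hmac_two_step(key: bytes, msg: bytes) -> bytes:
    """HMAC written as Step 1 (ipad = 0x36) and Step 2 (opad = 0x5C)."""
    inner = hashlib.sha1(xor_pad(key, 0x36) + msg).digest()
    return hashlib.sha1(xor_pad(key, 0x5C) + inner).digest()


def heap_construct(ikey: bytes, okey: bytes, msg: bytes) -> bytes:
    """HEAP-style MAC: two separate keys, zero-padded, fixed IV untouched."""
    inner = hashlib.sha1(ikey.ljust(BLOCK, b"\x00") + msg).digest()
    return hashlib.sha1(okey.ljust(BLOCK, b"\x00") + inner).digest()


key, msg = b"k" * 16, b"hello"
# The manual two-step version agrees with the standard library HMAC.
print(hmac_two_step(key, msg) == hmac.new(key, msg, hashlib.sha1).digest())
```

If the two HEAP keys are set to the key XORed with ipad and with opad respectively, the construct reduces to ordinary HMAC, which is one way to see why the outsider analysis of HMAC carries over.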

2.5 Security Analysis of HEAP

In this section, we present a security analysis of HEAP and evaluate HEAP against the security goals stated in Section 2.3.

Evaluation against Goal 1: The first goal is to prevent all outsider traffic from propagating through the MANET. By definition, we assume that the outside attacker does not have any control over insider nodes; otherwise it is no longer an outsider but an insider. We can see from the HEAP algorithm that the only way an outsider can successfully propagate a packet is by generating the correct MAC for the packet. This means that the overall security of HEAP depends on the security of HEAP's cryptographic construct. Therefore, we now analyze the security of that construct.

As discussed in Section 2.4.4, the primary difference between HMAC and HEAP's construct is that HMAC uses one key while HEAP uses two keys. However, as pointed out by its authors, HMAC actually uses two pseudorandom keys that are generated from one key. This is accomplished by padding the key with ipad and opad to generate two different seeds and then using the compression function on them to generate two pseudorandom keys. HEAP already has two different keys that can form the seeds for the compression function, eliminating the need to pad the keys. We still need to run the keys through the compression function, however. Otherwise we would need to replace the IV with the keys, as in NMAC, and that would prevent us from using off-the-shelf hash function implementations.

Because both HMAC and HEAP use two pseudorandom keys, we can apply the same security analysis for HMAC (or NMAC) to HEAP. For the case of an outsider that does not know either of the two keys in HEAP, the security analysis of HMAC given in [9] holds and we

can apply the same proof of security to HEAP's construct.

Adversarial Model: The adversarial model used in [9], as well as in this chapter, is that the adversary is allowed to query the function HEAP and obtain the correct MAC for up to $q$ messages of length $L$, even though the keys are unknown to him. The adversary spends a total processing time $t$ in the attack. This is a very strong adversarial model, since it is unlikely that an outsider will be able to query HEAP with several messages of its choice and get the correct MAC. The insider nodes will simply drop its requests. However, it serves to give a worst-case analysis. Based on [9], we can give the following theorem.

Theorem 1. If the keyed compression function $f$ is an $(\epsilon_f, q, t, b)$-secure MAC on messages of length $b$ bits, and the keyed iterated hash $F$ is $(\epsilon_F, q, t, L)$-weakly collision-resistant, then the HEAP construct is an $(\epsilon_f + \epsilon_F, q, t, L)$-secure MAC, assuming both the ikey and the okey are unknown to the attacker.

The theorem states that any outsider that launches an attack against HEAP has a probability of success that is no more than the sum of $\epsilon_F$, the probability of finding collisions in the secretly keyed iterated function $F$, and $\epsilon_f$, the probability of breaking the keyed compression function $f$, given $t$ amount of time. The attacker is allowed to make $q$ queries of its choice to HEAP, where each query has a size of at most $L$ bits. The size of the message that the compression function takes is $b$ bits.

Proof 1. Rather than repeat the entire proof here, we refer the reader to Section 4.2 in [9], since the proof for HEAP is the same as for HMAC when the attacker is an outsider and both keys are unknown to him.

Insider Adversarial Model: For an insider, the adversarial model stated above becomes more realistic, since an adversarial insider can send an arbitrary number of messages to a neighboring target node and observe the correct MAC for them. Furthermore, the probabilities in Theorem 1 no longer hold, because the neighbor knows the ikey and so the security is reduced. The security for an insider adversarial model can be stated in the following theorem.

Theorem 2. For a neighboring attacker that knows the ikey but not the okey, HEAP is an $(\epsilon_f + \epsilon_{kf}, q, t, L)$-secure MAC, where $\epsilon_{kf}$ is the probability of finding collisions in the underlying hash function when the IV is fixed and known to the attacker.

Proof 2. Following Theorem 1 and the proof in [9], we can see that HEAP (and also HMAC) can be broken in one of two ways. The attacker can either break the underlying keyed compression function $f$ with probability $\epsilon_f$, or find collisions in the keyed iterated hash function $F$ with probability $\epsilon_F$. Knowing the ikey makes finding collisions in $F$ much easier. In fact, as the theorem states, the probability of finding such collisions is equal to the probability of finding collisions when the IV is fixed and known, i.e., $\epsilon_F$ reduces to $\epsilon_{kf}$.

Suppose there exist two different messages, $M$ and $M'$, such that $H(M) = H(M')$ when the IV is $H(ik)$. Let us assume the probability of finding such colliding messages is $\epsilon_{kf}$. We will show that for an insider, $\epsilon_F = \epsilon_{kf}$. This can be shown by observing the two steps of HEAP:

Step 1: $H(ik \,\|\, M) = MAC_i$

Step 2: $H(ok_{Ai} \,\|\, MAC_i) = MAC_o$

Following our adversarial model, the attacker will first query the target node using message $M$ and obtain the correct MAC for it, $MAC_o$. Being an insider, the attacker knows $ik$

but not $ok_{Ai}$. From Fig. 2.4 we can see that if $H(M) = H(M')$, then $H(ik \,\|\, M) = H(ik \,\|\, M')$. This is because the output from the leftmost stage $f$ is the same in both cases, so the IV for the second stage is $H(ik)$, and we assumed that $H(M) = H(M')$ when the IV is $H(ik)$. Therefore, $MAC_i$ will be the same for both $M$ and $M'$:

$H(ik \,\|\, M) = H(ik \,\|\, M') = MAC_i$

It follows that if $MAC_i$ is the same for both messages, then $MAC_o$ will also be the same for both messages regardless of the value of $ok_{Ai}$, since Step 2 would be identical for both. Therefore:

$HEAP(M) = HEAP(M') = MAC_o$

Thus, the attacker simply needs to obtain the MAC for $M$ and then construct a new packet, replacing $M$ with $M'$ and retaining the same MAC, $MAC_o$. The message will be accepted as genuine by the recipient regardless of the value of $ok_{Ai}$. So for an insider, the probability $\epsilon_F$ of breaking $F$ reduces to the probability $\epsilon_{kf}$ of finding two colliding messages, $M$ and $M'$, when the IV is $H(ik)$, i.e., $\epsilon_F = \epsilon_{kf}$.

However, note that the security is reduced only against insiders that are neighbors and that already have access to the MANET. For outsiders that are not members of the MANET, the security of HEAP is the same as the security of HMAC. Finding collisions in the underlying hash function is still non-trivial and requires a considerable amount of resources and time. The insider is already capable of propagating packets through the MANET, so there would be no point in it trying to forge packets by finding collisions. If it wants to incriminate another node, it can simply change the source IP and MAC address headers in each packet and transmit them using its own keys, so that the receiver would think that the packet did not originate from the attacker but was merely forwarded by it. The inside attacker would be better off using its resources to launch direct attacks rather than trying to forge packets by finding collisions.

HEAP's security is further improved by placing the security layer below the network layer and including the IP headers in the message to be hashed. Because the IP checksum is included in the message, it is less likely that a message found through collisions would contain valid IP headers and a valid checksum. If the headers and checksum are not valid, the network layer would drop the forged packet even if the packet were able to slip through the security layer.

Evaluation against Goal 2: HEAP does not make any assumptions about the IP or MAC address of the packet. Even if these addresses are spoofed by an outsider, the outsider will not be able to propagate a packet unless it can generate the correct MAC for it. Therefore, HEAP is secure against IP and MAC address spoofing.

Evaluation against Goal 3: In our analysis of HEAP, we made an even stronger assumption than simply that the attacker can eavesdrop on packets and view the message and its MAC tags. We considered the case where the attacker is able to choose its own messages and obtain their correct MAC tags, known as the chosen message attack. Therefore, our security analysis holds even if the attacker were able to view a very large number of messages.

Evaluation against Goal 4: As stated, HEAP is not designed to detect insider attacks. But if a third party Intrusion Detection System (IDS) were to detect a compromised insider node and alert other nodes about it, HEAP provides a framework for an effective Response System. The compromised node would be placed on a Certificate Revocation List (CRL) that would be propagated throughout the network. Nodes neighboring the compromised node would re-initiate the bootstrapping phase and exchange new keys with their legitimate neighbors but not with the compromised node. If the compromised node moved to another locality, the other nodes would not exchange keys with it either, because it is on the CRL. In this way, the compromised node would no longer be in possession of valid keys and would be treated just like an outsider. We do not deal with the implementation details of the IDS and the CRL in this chapter.

2.6 Simulations and Results

We ran simulations to evaluate the performance of HEAP and compare it against TESLA, LHAP, and Lu's algorithm, and also against a scenario without any security.

Simulation Setup

We used the GloMoSim v2.03 [32] simulator for our experiments and used ODMRP [34] as our underlying multicast routing protocol. Other popular protocols, such as AODV or DSR, do not support multicast routing. Unfortunately, GloMoSim's ODMRP implementation had a critical bug in it and we had to fix it before we could use it. The MAC protocol used was IEEE and the transport layer protocol was UDP. The size of the data packets was 512 bytes, except in the CPU time simulation where the packet size was variable. The traffic sources were Constant Bit Rate (CBR). For all other parameters, GloMoSim's default configuration was used, under which a node's range was found to be 140 meters.

We ran simulations where the nodes were static, as well as where the nodes were mobile. For the static case, we used a grid network containing 10 x 10 nodes (100 nodes total). Each node was 100 meters away from its neighbor. For the mobile case, all of the 100 nodes were free to move about randomly. The starting positions of all the nodes were chosen randomly in a terrain measuring 1200 x 1200 meters.

To set the parameters of TESLA, LHAP and Lu, we studied the recommendations made by their respective authors in their papers. TESLA's authors recommend in Section 2.9 of their paper [49], under "Sender Tasks", that wireless networks should have a key disclosure delay of 15 to 30 seconds, so we used a delay of 15 seconds. LHAP's authors mention using a TESLA interval of 1 or 2 seconds in the example in Section 4 of their paper [57]. We used an interval of 1 second for LHAP. Lu's authors mention using a 2 or 3 second key disclosure delay in Section IV.C.2 of their paper [45], so we used a delay of 2 seconds. In each case, we used the minimum delay recommended by the authors, biasing the results in their favor so that we could conservatively assess the performance of HEAP against their algorithms. The following metrics were collected.

Latency

One source node was used to send packets to five different nodes that were 1, 2, 3, 4 and 5 hops away from the source, respectively. The source sent 10,000 packets and the mean end-to-end latency of all the packets was measured. The results of mean latency vs. number of hops between sender and receiver are plotted in Fig. 2.6 (on a logarithmic y-axis) for all four algorithms, as well as for plain ODMRP. In LHAP's case, if any node on a path from the source to the destination has just joined or rejoined the network after a short delay, it may wait as

long as one TESLA interval in order to authenticate packets. That scenario is depicted by the curve LHAP Max. Similarly, LHAP Min represents the minimum latency under the ideal case in which no node needs to wait for a key update. The curves for plain ODMRP, LHAP Min, and HEAP are almost the same and overlap.

Figure 2.6. Mean Latency (ms) plotted on a log axis vs. Number of hops for all algorithms. The bottom curve represents ODMRP, LHAP Min and HEAP since all three overlap almost exactly.

The results show that HEAP's latency is the lowest and is almost the same as the latency without any security scheme, using only ODMRP. LHAP's minimum latency is also the same, but its maximum latency is much higher. The latency of TESLA and Lu is several orders of magnitude higher than HEAP's, especially at large hop counts. This is because TESLA and Lu use delayed key disclosure and cache packets until their key is disclosed. While TESLA caches packets only at the final recipient node, Lu caches packets at every hop, so Lu's latency increases linearly with the number of hops. A latency of several seconds or tens of seconds can

become critical in real time applications or in QoS applications. HEAP would be the preferable choice for such applications.

Throughput and Packet Delivery Ratio

Two simulations were run, one where all the nodes were static and one where the nodes were moving about randomly.

Static Case

In this simulation, one source node sent packets to nine receiving nodes. Three of those nodes were one hop away from the source, three nodes were two hops away, and three nodes were three hops away. The sender's transmission rate was varied from 1 packet per second to 100 packets per second, where each packet contained 512 bytes of data. A total of 10,000 packets were transmitted by the sender in each simulation run. Five different algorithms were tested: plain ODMRP, TESLA, LHAP, Lu and HEAP. The throughput was computed for each node and the mean of all nine nodes was taken. The results are plotted in the throughput figure. The number of packets successfully received by each node was also measured and then divided by the total number of packets sent (i.e., 10,000) to compute the packet delivery ratio for each node. The mean of these ratios was then taken and the results are plotted in the delivery ratio figure.

Fig. 2.8 shows that the peak delivery ratio is about 84% and is sustained up to a packet rate of about 25 packets/sec for ODMRP, TESLA, LHAP and HEAP. For higher packet rates, the ratio drops sharply and then tapers off to about 12%. Similarly, in the throughput plot, the peak throughput is achieved at about 25 packets/sec. This is because the throughput is a product

of the packet rate as well as the delivery ratio. Beyond 25 packets/sec, the delivery ratio falls sharply, cutting throughput. Between 40 and 100 packets/sec, the effect of the decreasing delivery ratio is offset by the increasing packet rate, so the throughput is nearly constant.

Figure 2.7. Mean Throughput (bytes/sec) vs. Packet Rate (pkts/sec) for all five algorithms.

Lu's algorithm performs significantly worse than all the others in both the throughput and delivery ratio simulations. This is because Lu's algorithm caches packets at forwarding nodes. The source node sends out packets uniformly in time (due to CBR), but these packets are cached at the first forwarding node until a key update packet is received. Once the key update is received, the forwarding node authenticates all the packets at once and transmits all of the packets in the cache, one after another in rapid succession, causing a burst of transmission.
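The product relationship between packet rate, delivery ratio and throughput stated above can be made concrete with a small calculation. This is a back-of-the-envelope sketch, not a reproduction of the simulation: it assumes throughput counts payload bytes only, uses the 512-byte packet size from the setup, and reads the approximate delivery ratios (about 84% at 25 packets/sec, roughly 12% at the high end) from the text rather than from the plotted data. The function name is hypothetical.

```python
def throughput_bytes_per_sec(rate_pps: int, delivery_percent: int,
                             payload_bytes: int = 512) -> float:
    """Mean received bytes/sec = send rate x delivery ratio x payload size."""
    return rate_pps * delivery_percent * payload_bytes / 100


# Near the peak: 25 packets/sec at about 84% delivery.
print(throughput_bytes_per_sec(25, 84))    # 10752.0

# At the highest rate: 100 packets/sec at an assumed 12% delivery.
print(throughput_bytes_per_sec(100, 12))   # 6144.0
```

Under these assumed figures, a fourfold increase in packet rate is more than cancelled by the sevenfold drop in delivery ratio, which is consistent with the peak sitting near 25 packets/sec.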

Figure 2.8. Mean Packet Delivery Ratio (%) vs. Packet Rate (pkts/sec) for all five algorithms.

After that burst is over, the cache is empty again. The result is that all the intermediate nodes transmit packets in bursts and then stop transmitting completely, then transmit another burst, then stop again. This is very inefficient and wastes available bandwidth. If two or more neighboring nodes happen to transmit their bursts simultaneously, there is contention on the channel and collisions may occur, causing packets to be dropped. In addition, if a node is unable to empty the cache quickly enough due to channel congestion, the cache may overflow and cause packets to be dropped. This explanation is supported by the observation that most of the nodes that were 3 hops away received zero packets out of the 10,000 packets sent by the source.

There is no statistically significant difference between the curves for ODMRP, TESLA, LHAP and HEAP, so we deduce that the characteristics of ODMRP dominate the throughput

and delivery ratio and the effects of TESLA, LHAP and HEAP are negligible. This result shows that the byte overhead introduced by HEAP is negligible, since it has an insignificant effect on throughput and packet delivery ratio.

Mobile Case

In this simulation, the nodes were free to move about randomly, as stated at the beginning of this section. All the other parameters were the same as for the static case, where one node was sending packets to nine different receiving nodes. The results are plotted in the figure below.

Figure 2.9. Mean Throughput (bytes/sec) vs. Packet Rate (pkts/sec) for all five algorithms when the nodes are moving about randomly.

The results show that Lu and TESLA perform significantly worse than ODMRP, LHAP

and HEAP. This is probably because TESLA and Lu use delayed key disclosure, and if the keys are never received, the packets they authenticate also need to be dropped. Mobility causes the packet delivery ratio to deteriorate, resulting in more lost keys and lower throughput for TESLA and Lu. The results of ODMRP, LHAP and HEAP are not significantly different, and HEAP does not cause any noticeable loss in ODMRP's performance under mobility.

Memory Requirement

TESLA, LHAP and Lu all use precomputed hash key chains that must be generated and stored offline before transmissions can begin. When a node runs out of keys, it must stop and generate new key chains and store them before it can continue transmitting. Therefore, it is desirable to generate and store long key chains in each node to make interruptions in transmission as infrequent as possible.

In addition, TESLA and Lu use delayed packet authentication. This means that all the received packets should be stored in a cache in the node until their key is disclosed. The cache should be large enough to store all incoming packets until the key is received. In TESLA, only the receiving node needs to have a cache, but since in a MANET any node can become the recipient, all the nodes need to have a cache anyway. The necessity of having a large cache and storing long key chains in each node balloons the memory requirement of each scheme. HEAP does not have any of these requirements and only needs to store one pair-wise key per neighbor and one neighborhood key, making its memory requirements minute compared to the others.

We computed the memory requirements of each scheme based on how large we want the cache and the key chains to be. We selected a cache size that was large enough to store all the packets between key updates, using the key update intervals mentioned above and the

simulation's maximum packet rate of 100 packets/sec. The results for throughput, delivery ratio and latency all use this cache size, as smaller caches would drop packets and adversely affect the simulation results. Since LHAP does not use any caches, to compute its memory requirements we used the packet rate that maximizes its throughput, namely 25 packets/sec according to our simulations. LHAP rapidly consumes its keys (one key per packet) and so we need extra long key chains for LHAP. The results are plotted on a logarithmic y-axis in the memory requirement figure. We plotted the memory requirement vs. how long a key chain would last (shown on the x-axis). HEAP does not use key chains and its memory requirement is independent of time. We assumed an average of ten neighbors per node for HEAP. In all the algorithms, the memory requirement shown is per node, so that every node would have to have that much memory set aside. HEAP uses only 176 bytes per node. The results clearly show that HEAP is extremely economical in terms of memory compared to the other three schemes. This is especially useful for MANETs where small wireless devices or sensors may be constrained in memory.

Computational Cost

To compare the computational cost of HEAP against other schemes, we measured the CPU time needed to compute the Message Authentication Code (MAC) for different message sizes and different numbers of neighbors. We measured the CPU time taken by TESLA, Lu and HEAP on a 2.0 GHz Intel Pentium M processor. Note that LHAP does not compute MACs for the message for efficiency reasons, but LHAP's authors concede in Section 2.3 of their paper [57] that this makes their algorithm less secure than TESLA. The time taken by TESLA, Lu and HEAP is identical when there is only one neighbor and depends on the message size. However, HEAP's CPU time increases with the number of neighbors since it computes a different MAC

for each neighbor.

Figure. Memory Requirement (bytes) on a log axis vs. Duration of Key Chain (mins) for all four algorithms.

To quantify this computational cost and compare it to that of TESLA and Lu, we ran the algorithms on a 2.0 GHz Intel Pentium M processor and measured the CPU time taken by HEAP, TESLA and Lu. The CPU time figure shows the time taken by HEAP when the number of neighbors is varied for different packet sizes. Note that the times for TESLA and Lu with any number of neighbors are the same as HEAP's time with one neighbor, since they compute just one MAC for all neighbors. The graph shows that even with six neighbors, HEAP only adds an overhead of about 5 microseconds regardless of message size, over TESLA, Lu or any other scheme that computes

HMAC over the entire message.

Figure. CPU Time (microseconds) vs. Number of Neighbors for HEAP.

Had we used the unmodified version of HMAC for HEAP, the CPU time would have been six times that of TESLA for six neighbors, ten times for ten neighbors and so on. Our modifications to HMAC make the algorithm very efficient. Even on a very constrained CPU, we do not expect this overhead to exceed 100 microseconds. The mean one-hop latency that we found for ODMRP was about 6,000 microseconds, so it is clear that an overhead of 5 to 100 microseconds is negligible, especially in view of the level of security the algorithm offers.
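The comparison above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the three figures quoted in the text (5 microseconds, 100 microseconds, and a 6,000 microsecond one-hop latency), and the function name `relative_overhead` is a hypothetical helper, not part of HEAP.

```python
# Illustrative arithmetic only; the input figures are those quoted in the text.
ONE_HOP_LATENCY_US = 6000.0  # mean one-hop ODMRP latency reported above


def relative_overhead(extra_us, base_us=ONE_HOP_LATENCY_US):
    """Return the added CPU time as a fraction of the base latency."""
    return extra_us / base_us


if __name__ == "__main__":
    # 5 us is the measured overhead with six neighbors; 100 us is the
    # pessimistic bound the text gives for a very constrained CPU.
    for extra_us in (5.0, 100.0):
        share = relative_overhead(extra_us)
        print(f"{extra_us:6.1f} us is {share:.2%} of the one-hop latency")
```

Under these figures the added CPU time stays below two percent of the one-hop latency even in the pessimistic case.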

2.7 Conclusions and Future Work

In this chapter we presented a new Hop-by-hop, Efficient Authentication Protocol, called HEAP, that is intended to guard against attacks from outsiders. This protocol is suitable for use in MANETs for unicast, multicast or broadcast applications, and is independent of the routing protocol used. It is based on a modified HMAC algorithm that uses two keys and is very efficient. We stated our security goals for HEAP and presented a security analysis of HEAP to evaluate it against those goals. We showed that the security of HEAP is equivalent to the security of HMAC for attacks by outsiders. We compared the performance of HEAP with three other previously published authentication algorithms, namely TESLA, LHAP and Lu and Pooch's algorithm, using simulations.

TESLA is vulnerable to DoS attacks and requires secure time synchronization of all the nodes. It introduces very large latencies of several seconds, making it unsuitable for real time or QoS applications. It also has huge memory requirements at every node (over 1MB in our simulations) and has mediocre throughput when the nodes are mobile.

LHAP is vulnerable to wormhole and man-in-the-middle attacks, as pointed out by its authors. We do not think using GPS devices at every node, as recommended by its authors, is a practical solution. An attacker can successfully transmit forged packets using these attacks, defeating the purpose of authentication. LHAP also has very large memory requirements at every node (about 1MB for a one hour key chain) and consumes keys very rapidly. Once a node runs out of keys in its key chain, it needs to stop and regenerate the key chain, interrupting its transmissions.

Lu's scheme has low overall performance. It significantly degrades throughput and packet delivery ratio, with both static and mobile nodes. In addition, it introduces large latencies, making it unsuitable for real time applications.

HEAP is resistant to several outsider attacks, such as DoS, wormhole, replay, impersonation and man-in-the-middle attacks, because it makes it very difficult for an outsider to propagate any forged packet. It has extremely low memory requirements (only 176 bytes in our simulations) and its CPU overhead is negligible, making it suitable for constrained wireless devices. Its byte overhead is also negligible since it has an insignificant effect on overall throughput and packet delivery ratio. Its latency is almost the same as without any security scheme, making it ideal for real time and QoS applications.

In future work, we plan to reduce the bootstrapping and key exchange overhead by proposing ID-based, certificateless key exchange. The concept of ID-based PKC was introduced in [66] and it was enhanced in [65] to overcome the key escrow problem. The key escrow problem is that in ID-based PKC, the private key generating agency generates all the keys for the participants. This means that the key generator is in possession of all the private keys. In [65], an algorithm was proposed to overcome this problem by allowing the key generator to generate part of the key, while the rest of it is generated by the user secretly. We can use this algorithm to design a more robust ID-based key exchange mechanism suitable for MANETs. By relying on ID-based keys, we no longer need to verify traditional X.509 style certificates with third party signatures. This reduces the overhead in exchanging keys and makes the overall scheme more efficient.
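As a back-of-the-envelope check on the 176-byte memory figure in the summary above, the sketch below counts the keys HEAP stores (one pair-wise key per neighbor plus one neighborhood key) for the ten neighbors assumed in the simulations. The key length of 16 bytes is an assumption made here for illustration; it is not stated in this chapter, although it happens to reproduce the reported figure. The function name is hypothetical.

```python
def heap_key_memory_bytes(neighbors, key_bytes=16):
    """Memory needed for HEAP's keys at one node.

    One pair-wise key per neighbor plus one neighborhood key, as described
    in the text. key_bytes=16 (128-bit keys) is an ASSUMPTION, not a value
    taken from the dissertation.
    """
    return (neighbors + 1) * key_bytes


if __name__ == "__main__":
    # Ten neighbors per node is the average assumed in the simulations.
    print(heap_key_memory_bytes(10))
```

The point of the sketch is that the requirement grows only with the number of neighbors and does not depend on time, unlike key chains and packet caches.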

Chapter 3: Defending Against Insider Attacks

3.1 Introduction

As explained in Chapter 2, HEAP is designed to defend against attacks originating from outsiders. It is not designed to defend against attacks coming from insiders. We are now focusing our attention on designing a separate system that will work with HEAP to guard against insider attacks.

A typical MANET may have several resources, such as printers, file servers, databases, web servers etc. In addition, many nodes may provide different services as part of a larger Service Oriented Architecture (SOA) [67] approach. In SOA, large applications are modularized into smaller services which run on heterogeneous devices. It especially makes sense to use SOA in MANETs so that large, computationally expensive applications can be implemented on resource constrained devices in a distributed fashion. But from a security standpoint, we need a mechanism to regulate access to these resources and services by clients. We would like to guard against insider clients who would misuse these resources or launch attacks against them. On the flip side, we would like to ascertain that the service provider itself is reliable.

To accomplish this, we are developing a hybrid Trust Management (TM) model for MANETs, based on Role and Reputation Systems, that will control access to various resources on the network. The exact form of the model will vary depending on the architecture of the MANET. The idea is that when a client requests a resource from a server, the client sends signed credentials to the server informing the server of the role assigned to the client. This is

like a traditional role based TM. At this point, one of three rules would apply:

1. If the client has a highly privileged role that grants unrestricted access to the resource, service would be granted to it.
2. If the client has an under-privileged role below a certain minimum level, service would be denied to it.
3. If the client has a role that is above the minimum privilege level but below the unrestricted access level, then the client's reputation is queried. If it is found to be acceptable, access is granted; otherwise, it is denied.

To discover the reputation of a client, the server will query other resource providers that have had some prior interactions with this client. The other providers will inform the server how many past interactions with the client have been satisfactory and how many have been unsatisfactory. Based on this information the server will compute the reputation for the client and if it is satisfactory, access to the resource will be granted; otherwise it will be denied.

The motivation for this model is based on real world scenarios. Suppose you are the inventory manager for a large retailer. If your supervisor, who has a privileged role, asks for write access to the inventory records, you grant access to him. If the company's custodian asks for the same access, you refuse. But if another manager, who is your peer, asks for write access, you base your judgment on how well you, or other people you trust, know him. If he is new to the company or if he is suspected of trying to defraud the company, you may ask him to get written authorization from a supervisor. However, if you've known him for a long time and found him to be trustworthy, you may grant him access without further verification. In other words, whenever a role is in a borderline or gray area, we turn to the individual's reputation to

make a decision. Thus, well behaved individuals would be granted access without many hurdles while less trusted individuals will have to provide further verifications. The scheme provides an incentive to users to behave well regardless of their role so that they may enjoy more privileges in the future.

Reputation based systems have been suggested before for peer-to-peer systems and on the Internet; however, we are designing them so that they can be used in MANETs. This means we have to account for factors such as limited computational resources, no online central authority, nodes that cannot be completely trusted, and malicious nodes that may be colluding [3]. We are also combining the principles of reputation and role based systems, as mentioned above, to overcome the drawbacks of either system.

3.2 Related Work

In general, Trust Management (TM) has traditionally taken three broad approaches: Monitoring based TM, Reputation based TM, and Role based TM. In this chapter we discuss each approach and compare their advantages and disadvantages. In the following chapters, we propose a new hybrid Trust Management model for MANETs that attempts to capture the advantages of the existing schemes while alleviating most of their disadvantages.

3.2.1 Monitoring Based Trust Management

In this approach, each node's wireless traffic is monitored by its neighbors and conclusions are drawn based on its behavior [6, 31]. Many monitoring based systems have been proposed in the literature [4, 35, 36, 37, 38] and the principle behind their operation is that traffic from a

node is classified as legitimate or illegitimate by its neighbors. Examples of illegitimate traffic include known attack signatures, viruses and worms, or anomalous behavior. A node that is behaving well and communicating mostly legitimate traffic is deemed to be trustworthy. Such a node accumulates good credit points through its good behavior and slowly gains access to increasingly sensitive resources.

Advantages

1. This model has a very fast response time. A misbehaving node can quickly be detected and its traffic can rapidly be blocked. This is because all the information gathering and decision making is done within a node's one hop neighborhood. Any detected malicious traffic cannot pass beyond this one hop neighborhood.
2. The model relies solely on the observed behavior of the node and not on third party recommendations made from outside the neighborhood. This reduces the possibility of false recommendations being made in favor of or against the node. Of course, we still have to deal with the possibility of a malicious neighbor raising a false alarm against the node. But this would be easier to verify since other neighbors are also monitoring the same node and too many false alarms by a single neighbor would raise suspicions against the neighbor itself.

Disadvantages

1. The most serious disadvantage of using monitoring based trust management systems is that they have been shown to raise too many false positives due to noise. According to [85], traditional simulated noise models do not mimic the behavior of observed noise

patterns in an actual setting, leading to optimistic results in simulations. In experiments done on MANET testbeds, it was shown that monitoring based systems do not work well as they raise too many false alarms [86]. Therefore, in our research, we do not consider using monitoring based systems.
2. Another drawback of this approach is that a malicious node might pretend to be well behaved for a sufficient length of time until it acquires enough credits to gain access to its target resource. Once it has gained that access it could successfully launch its attack, which might go undetected. Examples include gaining read access to a file server that stores confidential files, or issuing well formed but malicious commands to a military mass communications device.
3. Since the monitoring and response system is based on the immediate neighborhood alone, a blocked node could simply move to a new neighborhood and start over. This problem can be mitigated by maintaining Certificate Revocation Lists (CRLs) across the network, but that is not easy to do.
4. Information about a node's past behavior is not retained except for a short time and future interactions are independent of it. A previously malicious node gets to start again with a clean slate. This is contrary to human interactions where convicted felons, for example, are unable to gain membership in authoritative government agencies. Again the problem could be reduced by maintaining CRLs, but it is not easy to do so.
5. Another drawback is that due to the absence of any prior information, the network is forced to allow a new node to join the network, albeit with limited access rights.

3.2.2 Reputation Based Trust Management

This form of trust management is based on the reputation of a node as judged by other nodes. Reputation is the opinion of one entity about another [89]. This model is inspired by PGP's web of trust model [29] in which a member node can vouch for a client node that wants access to a resource. The model is based on prior information about the client node [10, 53, 54]. A node builds its reputation by previous positive experiences with other nodes. What constitutes a positive experience is left to the discretion of the node that vouches for it. In this way this model overlaps with the monitoring based approach. A vouching node might have received a positive recommendation for the client from someone it trusts, or it could personally know and trust the owner of the client node, or the client node could have accumulated trust through prior positive interactions.

Advantages

1. This model is similar to how human beings gain each other's trust over a long period of time through experience.
2. Records for every node and every interaction are kept so a node's history is permanently tied to it. This will encourage potential adversaries to behave well so that they can maintain a good reputation.

Disadvantages

1. The drawback of this approach is that there are no fixed criteria to build a reputation. One vouching node may easily trust another node while another vouching node may not. It is difficult to assess the reliability and value of another node's subjective recommendation.

Some attempt could be made to assign standardized scores to each type of (mis)behavior but it is not always easy to classify the type of behavior being observed in a given instance.
2. Consider the case where only a few nodes could grant access to a sensitive resource and they had insufficient information to grant access to a client node. It would take too long for that client node to build a reputation strong enough to gain access to the resource.
3. Reputation based systems are very slow to react. It takes a long time to build or diminish a reputation and consequently it takes too long to respond to a threat. A node with good reputation and elevated access rights may launch a serious attack and it would go undetected until it is too late. In a real world scenario, this is referred to as an inside job, such as a security guard on a night shift robbing a bank.

3.2.3 Role Based Trust Management

This approach assigns roles to entities [39, 40]. For instance, in a commercial organization, some nodes might belong to the role manager, some to clerical staff, and some to executive. All nodes belonging to a given role have the same privileges. All executives might have complete access to a file server while all clerks might have read only access. The kinds of access rights granted to a given role are fixed beforehand. However, roles may be dynamically assigned to individual nodes, so for instance an external auditor might temporarily be assigned a manager role for a fixed period of time.

Advantages

1. The rights granted to a node are directly related to the job it needs to perform in the network. This again follows the real world example where employees have access rights based on their job title.
2. There is no need to issue and maintain different passwords for each node at a central server. Each node can validate itself to any other node simply by presenting its signed credential, which assigns it a given role.
3. Temporary role assignments are easy to implement since each credential carries a time of expiry for the role. Visitors can therefore be granted access for a preset amount of time.
4. Because only a few entities have the authority to assign roles, the evidence needed to trust an entity with a given role can be standardized. It is no longer as subjective as the reputation based case.

Disadvantages

1. This approach treats all entities within a role equally. Common observation tells us that not all managers can be equally trusted, for example. Some are more trustworthy than others based on their past experience and duration of employment with the firm. It would not be practical to assign such fine-tuned roles as managers with 10 years' experience, managers with 5 years' experience, etc. because then each role would contain very few entities, perhaps even just one entity.
2. This model also suffers from a slow response time, although not as slow as the reputation based system. This is because if a privileged node behaves maliciously, there is no way

of stopping it except to demote its role. Only a few nodes have the authority to promote or demote another node's role. These authoritative nodes need to be sent the evidence against a node, which they would then process before broadcasting a role revocation credential against the node to the entire network. This would make the response time against the malicious behavior unacceptable since by then the damage would already be done.

3.3 Trust Management Models for MANETs

The specific TM model used for a MANET depends on the MANET's architecture. Since monitoring based TM has been shown to be ineffective in detecting malicious behavior accurately due to many false positives [86], we decided not to pursue this approach in our research and instead focus on Reputation and Role Based Systems. The figure below summarizes when it makes sense to use a TM system based only on Roles, or only on Reputation, or a combination of both.

Figure 3.1. Which architecture should be paired with which TM system. Columns: Architecture; Role Only; Reputation Only; Hybrid (Role + Reputation). Rows: Hierarchical, SAD; Hierarchical, MAD; Flat, Open; Flat, Closed.

A hierarchical MANET is, by definition, a closed MANET where each node is assigned one or more roles by a Trust Authority (TA). It could be administered by a single administrator (SAD), or multiple administrators (MAD) [84]. Since roles are preassigned, we cannot opt for a reputation only based TM. On the other hand, a flat network, by definition, has no roles and

no administrators. It can be open, such as on the Internet or at an airport, or closed [84]. Since it cannot have roles, we can only use reputation based TM.

Using Reputation or Role based TM in MANETs presents a new set of challenges: there is no online central authority, many nodes are limited in their computational resources, nodes may go offline at any time (necessitating redundancy), and nodes are not guaranteed to be completely trustworthy [8]. In the next chapter we propose a new Reputation System based on Machine Learning that is suitable for MANETs. We will then combine this Reputation System with a Role based system to develop a hybrid Trust Management model that can effectively defend against insider attacks.
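The three access rules from Section 3.1 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the dissertation's implementation: roles are reduced to hypothetical integer privilege levels, reputation is a plain ratio of satisfactory to total reported interactions (standing in for the Machine Learning based Reputation System proposed later), the threshold of 0.8 is arbitrary, and a client with no reported history is denied.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    GRANT = "grant"
    DENY = "deny"


@dataclass
class Report:
    """One provider's tally of past interactions with the client."""
    satisfactory: int
    unsatisfactory: int


def reputation(reports):
    """Fraction of satisfactory interactions, or None if there is no history.

    ASSUMPTION: a simple ratio stands in for the real reputation computation.
    """
    good = sum(r.satisfactory for r in reports)
    total = good + sum(r.unsatisfactory for r in reports)
    return good / total if total else None


def decide(role_level, min_level, unrestricted_level, reports, threshold=0.8):
    """Apply the three rules: grant, deny, or fall back on reputation."""
    if role_level >= unrestricted_level:      # rule 1: highly privileged role
        return Decision.GRANT
    if role_level < min_level:                # rule 2: under-privileged role
        return Decision.DENY
    score = reputation(reports)               # rule 3: gray area
    if score is not None and score >= threshold:
        return Decision.GRANT
    return Decision.DENY


if __name__ == "__main__":
    peers = [Report(satisfactory=9, unsatisfactory=1)]
    print(decide(role_level=3, min_level=2, unrestricted_level=5, reports=peers))
```

Only the middle band of roles triggers a reputation query, so the cost of contacting other providers is paid only for borderline requests.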

Chapter 4: A Basic Machine Learning Based Reputation System

4.1 Introduction

In this chapter we will be focusing on developing a Machine Learning based Reputation System (RS) for closed networks. Earlier we presented a technique, called HEAP, for guarding against outsider nodes. We are now directing our attention to the more challenging problem of securing the network against attacks from insider nodes that have been compromised. Such compromises may occur when an adversary hacks into and gains access to a legitimate node, obtains the secret key of a legitimate node so that it can assume its identity, or in some other way convinces the network to accept its node as legitimate.

Unfortunately, there is no litmus test to enable one to verify whether an insider node is malicious or benign. We can only guess the intentions of a node by observing its behavior and trying to discriminate legitimate behavior from malicious behavior. Many researchers have utilized this approach in the context of P2P networks, large scale distributed networks, and on the Internet by utilizing Reputation Systems [59, 60, 61, 75, 88, 90, 103]. Others have used Reputation Systems for securing MANETs and their routing protocols [87, 88, 96, 97, 98], and in sensor networks [4, 93]. Reputation systems have been proposed that try to predict the future behavior of a node by analyzing its past behavior. The basic idea is that the past behavior of a node (such as the number of good vs. bad transactions) is stored in the network in a distributed

fashion. Each node that interacts with another node stores some feedback about the interaction. The interaction is either classified as legitimate or suspicious (or some value in between). If, for instance, the interaction consisted of downloading a file, the client could determine if the downloaded file was indeed the one requested, or if it was a Trojan, a virus, or spam. Based on the feedback of various nodes, a new node can decide whether to transact with a given node or not, even though they may never have interacted before. For example, eBay utilizes this form of reputation system where users leave feedback about other users [72], and Google uses PageRank where web pages are ranked for relevance by other pages [73].

In general, we can summarize existing RSs [59, 60, 61, 74, 76, 78] within the general framework shown in Fig. 4.1. According to this framework, a node that needs to decide whether to transact with another node or not must first gather historical data about that node (e.g., the proportion of good vs. bad transactions in the last x minutes). Then it applies a customized mathematical equation or statistical model to the data to produce an output score. For example, the RS in [61] is based on using eigenvalues from Linear Algebra, the one in [60] is based on using derivatives and integrals, whereas the one in [76] is based on Bayesian systems utilizing the Beta distribution. Depending on the output of the equation or model, the system then decides how to respond. In most cases, the equation or model is customized to detect specific types of malicious behavior only. For instance, the algorithm in [60] is designed to detect malicious behavior that alternates with good behavior and varies over time.
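The gather, score, respond framework described above can be made concrete with a small example. The sketch below uses the expected value of a Beta(good + 1, bad + 1) distribution as the scoring equation, which is one commonly cited form of Beta-based reputation; it is offered as an illustration of the framework and is not claimed to be the exact formula used in [76]. The function names and the 0.5 threshold are assumptions.

```python
def gather(feedbacks):
    """Step 1: reduce raw feedback (True = good, False = bad) to counts."""
    good = sum(1 for f in feedbacks if f)
    bad = len(feedbacks) - good
    return good, bad


def beta_score(good, bad):
    """Step 2: expected value of Beta(good + 1, bad + 1).

    With no history the score is 0.5, i.e. no evidence either way.
    """
    return (good + 1) / (good + bad + 2)


def should_transact(feedbacks, threshold=0.5):
    """Step 3: respond by comparing the score against a threshold.

    ASSUMPTION: the threshold value is arbitrary and chosen for illustration.
    """
    good, bad = gather(feedbacks)
    return beta_score(good, bad) >= threshold


if __name__ == "__main__":
    history = [True] * 8 + [False] * 2
    print(beta_score(*gather(history)), should_transact(history))
```

A fixed equation of this kind captures one behavioral pattern; the chapter's argument is that a trained classifier can replace step 2 to cover several patterns at once.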

Figure 4.1. General framework of a Reputation System that decides whether to transact with a given node or not.

In contrast to developing a separate module for each attack pattern, we employ Machine Learning, specifically Support Vector Machines (SVM), to build a flexible and dynamic RS that can be trained to thwart a multitude of attack patterns easily and efficiently. It can also be retrained to detect new, previously unknown attack patterns.

The rest of this chapter is organized as follows. In Section 4.2, we define the RS problem in more detail and justify why Support Vector Machine (SVM) is a suitable candidate for solving it. In Section 4.3, we describe the three principal challenges associated with designing any RS and explain how our core SVM based RS tries to solve them. In Section 4.4, we discuss the factors associated with building the SVM based classifier. Then, in Section 4.5, we evaluate our core RS and compare it against another RS mentioned in the literature, called TrustGuard. Finally, in Section 4.6 we present our conclusions.

4.2 Basic Machine Learning Approach

Using Fig. 4.1, we can redefine the problem of designing Reputation Systems (RS) into one of finding the optimal set of input features and equations (steps 1 and 2 in Fig. 4.1) that allow us to distinguish between malicious and benign nodes with high accuracy. Machine Learning (ML)

is of particular significance in this context since many ML algorithms are able to determine and approximate the optimal equation needed to classify a given set of data. We envision the problem of RS as a time series prediction problem, which states: Given the values of the dependent variable at times $(t, t-1, t-2, \ldots, t-n)$, predict the value of the variable at time $(t+1)$ [81, 82]. The dependent variable in this case is the proportion of good transactions conducted by a node in a given time slot. Predicting this variable at time $(t+1)$ gives us the probability that the node will behave well if we choose to transact with it at time $(t+1)$. Therefore, we opted to use Support Vector Machines (SVM) as our ML algorithm because it has been shown to successfully approximate mathematical functions [62] and make time series predictions [70].

In our scheme, we build SVM models against different types of malicious behaviors offline, and then upload those models to the nodes in the network. The nodes can use those models to classify new nodes and predict if a new node is malicious or not. Constructing models is computationally expensive so it is done offline, possibly by a third party. However, the classification step is not very expensive and can be done on the node in real time. When a new type of attack is discovered, a new model can be constructed against it. This is similar to how anti-virus systems work where the anti-virus is developed offline and then uploaded to clients. Similarly, in our scheme the vendor of the RS might update its subscribers with SVM models against new attacks.

An implied assumption is that after a transaction has taken place, a node can determine if the transaction was good or bad with a certain high probability. This is true in many cases, such as in commercial transactions on eBay, as well as in file downloads (where a corrupted or virus infected file would be considered bad), or in providing network services [62, 70]. Another

assumption is that the feedbacks can be reliably transmitted without being tampered with. This can be accomplished by a node digitally signing every feedback it sends. These assumptions are made by many researchers in the field [59, 60, 61] and we also make the same assumptions in our study. However, a few transactions might be incorrectly labeled good or bad. SVM can handle fair amounts of such noise in the dataset [62].

4.3 Building the Core SVM based Reputation System

If all the nodes in a network gave honest and correct feedbacks about the transactions they conducted, then it would be trivial to spot malicious nodes since all the good nodes would have 100% positive feedbacks, whereas the malicious nodes would not. But in reality, this is not the case and we have to deal with three principal challenges:

1. Dishonest feedback given by malicious nodes against other nodes they have transacted with.
2. Incorrect feedback from legitimate nodes by mistake.
3. Fake feedback given by malicious nodes about transactions that never really occurred.

Our goal is to use SVM to tackle problems 1 and 2. However, SVM cannot detect if a feedback was fake, so we propose another mechanism elsewhere in this dissertation to deal with problem 3. We assume that the proportion of dishonest to honest feedbacks given by malicious nodes is much higher than the proportion of incorrect to correct feedbacks given by legitimate nodes. This is how we can distinguish between inadvertent and deliberately false feedbacks. If malicious

nodes reduce the proportion of dishonest feedbacks to match that of incorrect feedbacks, we have still succeeded in our goal of reducing malicious behavior. Several factors govern the construction of ML classifiers and the next section discusses some of them.

4.4 Factors in Building the Classifier

There are many factors to consider in building the classifier that will be used to distinguish malicious nodes from good nodes. The following factors should be taken into account:

1. Feature Selection: The features used to train and test the classifier. They must be the same in the train set and the test set.
2. Proportion of Malicious Nodes: The proportion of good vs. bad nodes in the dataset. This relates to the degree of imbalance in the dataset. It need not be the same in the train and test sets.
3. Size of Dataset: The number of instances in the train and test sets.
4. Evaluation Methodology: The method used to evaluate the performance of the classifier, such as train/test split, n-fold cross validation, leave one out, etc.
5. Evaluation Metrics: The metrics used to evaluate the classifier, such as accuracy, precision, recall (sensitivity), etc.
6. Kernel Used: For SVM, which kernel should be used.

The last factor is only valid for Kernel Machines, such as SVM, while the other factors are valid for all types of ML classifiers.

4.4.1 Feature Selection

Feature selection is a critical step in constructing the classifier. You cannot make good wine from bad grapes and you cannot make a good classifier from bad features. Using too few features might not provide sufficient information to the classifier, while using too many might result in the Curse of Dimensionality. It is common knowledge in ML that increasing the number of features increases the accuracy up to a certain point. Thereafter, increasing the number of features results in a degradation in performance. This phenomenon is called the Curse of Dimensionality [62].

To construct our features, we divided time into regular intervals called time slots. The network administrator can choose and fix a time slot that is a few minutes to a few hours long, depending on how frequently nodes in the network transact on average. The features in our experiments consist of the proportion of positive vs. negative feedbacks assigned to a node during a given time slot by the nodes it has transacted with. To collect features for a test node, we need to query all the nodes in the network and ask them to provide us any feedbacks they have about the node for a given slot. The fraction of positive feedbacks versus total feedbacks for that slot forms a single feature. Each time slot then corresponds to one feature. This is in accordance with [60], and is also based on features used in time series prediction problems [70]. We can vary the number of features by varying the number of time slots used. We use 15 time slots for our core SVM. In the next chapter, we present an algorithm that will enable us to look further back in time using only a few features.
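The feature construction described above (one feature per time slot, equal to the fraction of positive feedbacks in that slot) can be sketched as follows. The function name, the `(timestamp, is_positive)` tuple layout, the ordering of slots from most recent to oldest, and the neutral default for slots with no feedback are all assumptions made for illustration; the text does not specify how empty slots are handled.

```python
def slot_features(feedbacks, now, slot_len, n_slots=15, default=0.5):
    """Build one feature per time slot for a single node.

    feedbacks: iterable of (timestamp, is_positive) pairs collected from the
               nodes that transacted with the node under test.
    now:       current time, in the same units as the timestamps.
    slot_len:  length of one time slot, chosen by the administrator.
    default:   value used for slots with no feedback (an ASSUMPTION).

    Returns a list where index 0 is the most recent slot.
    """
    positives = [0] * n_slots
    totals = [0] * n_slots
    for timestamp, is_positive in feedbacks:
        age = now - timestamp
        if age < 0:
            continue                      # ignore feedback dated in the future
        index = int(age // slot_len)
        if index >= n_slots:
            continue                      # older than the window we look at
        totals[index] += 1
        if is_positive:
            positives[index] += 1
    return [positives[i] / totals[i] if totals[i] else default
            for i in range(n_slots)]


if __name__ == "__main__":
    collected = [(95, True), (92, False), (80, True)]
    print(slot_features(collected, now=100, slot_len=10, n_slots=3))
```

The resulting vector has a fixed length regardless of how many feedbacks were received, which is what allows the same feature layout to be used in both the train and test sets.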

4.4.2 Proportion of Malicious Nodes

In a real world setting, we would not know the true proportion of malicious nodes vs. good nodes in the network. The ratio, or degree of imbalance, could vary from zero to very large. As a result, in our experiments we use multiple test sets with different ratios and graph the performance with respect to the ratio. We would expect the accuracy to deteriorate as the ratio of imbalance becomes larger.

Next, we need to consider the imbalance ratio for the training set. In an actual setting, we would not know the proportion of malicious nodes in the network, so the testing should be done with varying imbalance ratios. However, the training set can only have one imbalance ratio since we need to build just one SVM model. We use a malicious node proportion of about 60% since that gave us the best results.

4.4.3 Size of Dataset

If the training dataset is too small, the classifier will not have enough instances to properly train itself and the accuracy will suffer due to overfitting. A small test dataset can lead to wide variations between the test error and the true error. However, using very large datasets increases computational costs without necessarily increasing performance. Based on our past experience we decided to use 1,000 instances for the training set and another 1,000 instances for the test set, achieving a good trade-off between computational cost and accuracy.

4.4.4 Evaluation Methodology

Several traditional ways of evaluating the classifier exist in Machine Learning. In one common approach, two independent sets of data are used, one for training the classifier and another one for testing it. We also use two independently constructed datasets in our experiments. Other common techniques for testing include n-fold cross validation and leave-one-out, where a randomly chosen portion of the data is used for training and the remainder of the data is used for testing. This procedure is repeated many times. These techniques usually provide a better estimate of the true error than using simple train and test sets. However, since the proportion of good vs. bad nodes needs to be varied in our test set while keeping the training set constant, cross validation is not the appropriate choice for our experiments.

4.4.5 Evaluation Metrics

Many evaluation metrics are widely used in ML to evaluate classifiers. The most common metric is accuracy, which simply measures the fraction of correct predictions. Other common metrics include precision and recall (or sensitivity) [71]. Graphs such as ROC curves are based on these metrics. The metrics are defined below:

\[ \text{Accuracy} = \frac{\text{Number of Correct Predictions}}{\text{Total Number of Predictions}} \]

\[ \text{Precision} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Positives}} \]

\[ \text{Recall} = \frac{\text{True Positives}}{\text{True Positives} + \text{False Negatives}} \]

In our preliminary experiments, we have used accuracy as our evaluation metric as it is also used in [60]. In the future, we will also use precision and recall to evaluate the classifier.

4.4.6 Kernel Used

Unfortunately, there is no easy way to pre-compute which is the best kernel to use for a given dataset [63]. As a result, for most applications we have to use trial and error to evaluate commonly used kernels and determine the most appropriate one. Common kernels include polynomial kernels of varying degrees, exponential kernels, and the Radial Basis Function (RBF) kernel. In our experiments, the linear kernel (polynomial kernel of degree 1) has shown good results and increasing the degree of the kernel has not significantly increased performance. In general, it is desirable to use the lowest degree kernel that gives good results to save on computational costs. Exponential and RBF kernels have also been tried and they did not significantly increase performance beyond the linear kernel either. In the next section, we describe how to solve problem 3, guarding against fake feedbacks.

4.4.7 Guarding Against Fake Feedbacks

Fake feedbacks are feedbacks about transactions that never actually occurred. So, for instance, node x may provide some feedback (either good or bad) about another node y, when in reality

y never transacted with x. A malicious node may try to provide several bad feedbacks about another node in order to reduce its reputation. To guard against this, TrustGuard [60] proposed a mechanism for binding transactions to transaction proofs, such that (i) a proof cannot be forged and (ii) is always exchanged atomically. These proofs ensure that every feedback is tied to a transaction that actually took place. TrustGuard ensures atomicity by using trusted third parties that must come online frequently to resolve disputes. In our work, we also propose to use transaction proofs that cannot be forged, but we relax the restriction of atomicity. This is because we would like to eliminate the need for a trusted third party that must be frequently online. Such a third party in itself provides a window of opportunity to adversaries whenever it goes offline, as pointed out in [60] itself.

In our implementation, each node has a pair of public and private keys. A client contacts a server and requests service. The server responds and either denies the request or commits to providing the service. If it commits, the server sends the client a certificate to sign. The certificate has the following fields:

Server ID, Client ID, Time Slot ID, Transaction Number        Cert. (1)

The server and client IDs are self-explanatory (e.g., the IP addresses of each). The Time Slot ID is the time stamp of when the current time slot began. If, for instance, a time slot is 30 minutes long, the client should check that the Time Slot ID is no more than 30 minutes prior to the current time. All the nodes in the network must be aware of how long the time slots are and when a new time slot starts. The Transaction Number is the number of transactions that have occurred (including the current one) between the client and the server in the current time slot. All of these fields are verified by the client and then signed. Note that

each node only needs to keep track of the transactions that it has conducted within the current time slot. A node will never need to sign a certificate for a previous time slot. After verification, the client sends the signed certificate back to the server. The server verifies the signature, signs the same certificate with its own private key, and sends it to the client. Then it provides the requested service. In this way, both the client and the server end up with copies of the certificate signed by each other. In the future, if a node z asks either of them to provide feedback about the other, it will provide the feedback and present the signed certificate. In this way, z can verify the number of transactions that actually occurred between the two nodes in the given time slot, and the number of feedbacks expected for that time slot.

We realize that because of the lack of exchange atomicity, after receiving the signed certificate from the client a malicious server might refuse to provide the client with a certificate signed by it. In that case, the server will get only one opportunity to give bad feedback about the client. In addition, the client will know that the server is malicious and will not transact with it in the future. If the server continues to do this with other nodes, several nodes will quickly realize that the server is malicious. Furthermore, since no transactions were actually completed with those nodes, the server would not achieve its goal of conducting many malicious transactions. On the other hand, if the server were to go ahead and complete the transaction, the client would not know for sure that the server is malicious and might transact with it again in the future. This would give the server more than one opportunity to give bad feedbacks about the client and conduct many malicious transactions. Therefore, we argue that it is in the malicious server's interest to complete a transaction, so we do not need to enforce exchange atomicity and employ trusted third parties that must remain online frequently.
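The certificate exchange described above can be sketched as follows. The text does not name a signature algorithm or a certificate encoding, so everything concrete here is an assumption made for illustration: the field names, the `|`-joined encoding, the helper names, and above all the toy textbook RSA built from small known primes, which is not secure and merely stands in for whatever real public-key signature scheme a deployment would use.

```python
import hashlib

FIELDS = ("server_id", "client_id", "time_slot_id", "transaction_number")


def make_keypair(p, q, e=65537):
    """Toy textbook RSA key pair from two primes (illustration only, insecure)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse, Python 3.8+
    return (n, e), (n, d)


def digest(cert, n):
    """Hash the certificate fields (hypothetical encoding) into an integer below n."""
    data = "|".join(str(cert[f]) for f in FIELDS).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n


def sign(cert, private_key):
    n, d = private_key
    return pow(digest(cert, n), d, n)


def verify(cert, signature, public_key):
    n, e = public_key
    return pow(signature, e, n) == digest(cert, n)


def client_accepts(cert, now, slot_length, expected_count):
    """Client-side checks: the slot is current and the transaction count matches."""
    fresh = 0 <= now - cert["time_slot_id"] <= slot_length
    return fresh and cert["transaction_number"] == expected_count


# Each node holds its own key pair (Mersenne primes, chosen only for brevity).
client_pub, client_priv = make_keypair(2**61 - 1, 2**89 - 1)
server_pub, server_priv = make_keypair(2**107 - 1, 2**127 - 1)

# 1. The server commits and sends the certificate; the client checks and signs it.
cert = {"server_id": "10.0.0.1", "client_id": "10.0.0.2",
        "time_slot_id": 1800, "transaction_number": 1}
accepted = client_accepts(cert, now=2500, slot_length=1800, expected_count=1)
client_sig = sign(cert, client_priv)

# 2. The server verifies the client's signature, then counter-signs.
server_checked = verify(cert, client_sig, client_pub)
server_sig = sign(cert, server_priv)

# 3. A third node z later checks a presented certificate; an inflated
#    transaction count no longer matches the stored signature.
forged = dict(cert, transaction_number=5)
genuine_ok = verify(cert, server_sig, server_pub)
forged_ok = verify(forged, client_sig, client_pub)
print(accepted, server_checked, genuine_ok, forged_ok)
```

The point of the sketch is the order of the steps: the client signs first, the server counter-signs second, and either party can later prove the per-slot transaction count to a third node without a trusted third party being online.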

4.5 Evaluations of the Core SVM based Reputation System

4.5.1 Simulation Setup

We generated the datasets using simulations of a network consisting of 1,000 nodes. Time was divided into slots and in each time slot, several transactions were conducted between randomly chosen pairs of nodes. Each node would then label the transaction as good or bad and store that label. The label may or may not reflect the true observation of a node, i.e., a node may lie about a transaction and give dishonest feedback (problem 1).

Good Behavior: Good behavior is characterized as a node conducting a normal transaction and giving honest feedback about it.

Bad Behavior: Bad behavior is characterized as a node conducting a malicious transaction and/or giving dishonest feedback.

In addition, we introduced a random error of 5% to account for the fact that a node may incorrectly detect a transaction and mistakenly label it good or bad. This corresponds to problem 2 described above.

The simulation was allowed to run for several time slots and then data about each node was gathered. To gather data about a node x, all the other nodes in the network were queried and asked to give information about x going back a certain number of time slots. The total numbers of good and bad transactions conducted by x in a given time slot were accumulated and the proportion of positive feedback was computed. This computation was repeated for each time slot of interest. In this way a concise, aggregate historical record of x was obtained. The

correct label of malicious or benign was assigned to x by us, based on its role in the simulation, for testing purposes only. The following attack scenarios were tested.

4.5.2 Attack Scenarios

In each attack scenario all the good nodes behave well consistently throughout the simulation; however, the behavior of malicious nodes varies with the attack type as described below.

Attack 1: This is the simplest attack scenario, in which all the malicious nodes consistently behave maliciously. These nodes do not collude with each other.

Attack 2: In this scenario, the behavior of malicious nodes oscillates between good and bad at regular intervals. The aim of the malicious node is to boost its reputation by being good first, and then use its high reputation to conduct malicious transactions. As a result, its reputation would decrease again, so it will oscillate into good behavior once again to boost its reputation. Again there is no collusion between the nodes.

Attack 3: This attack is similar to attack 2, except that now the malicious nodes collude with each other. Every time they happen to transact with each other, they recognize each other and leave positive feedback to boost each other's scores. The nodes might recognize each other, for instance, if they belong to the same owner or colluding groups of owners.

Attack 4: This attack is more severe than attack 3 because this time whenever malicious nodes recognize each other, not only do they leave positive feedback about each other, but they also conduct further transactions with each other to leave even more positive feedback. Of course, there is a limit to the number of fake transactions they can conduct without being caught as obviously fake. In our simulations we conduct a random number of fake transactions, up to a maximum of ten, within one time slot.

Attack 5: In this attack scenario we combined all four types of malicious nodes described above. A quarter of all the malicious nodes behave as in attack 1, another quarter behave as in attack 2, and so on. In a real world setting we would not know which, if any, attack was being launched by any given node, so the performance of the RS in this attack scenario tells us what would happen if all the attacks were conducted simultaneously.

4.5.3 Experiments and Results

We evaluated our core SVM based RS against two other algorithms, TrustGuard Naïve and TrustGuard TVM (Trust Value based credibility Measure) [60]. Each of the five attack scenarios described above was tested. We set the same parameters for TrustGuard that its authors used in their paper. TrustGuard's authors have shown that it performs very well compared to eBay's reputation system, which is commonly used as a benchmark in the literature for RSs. Therefore, we decided to directly compare our performance with TrustGuard, instead of eBay. We collected data going back 15 time slots for each simulation run. For oscillating behavior, the period of oscillations was kept less than 15 to ensure it was distinguishable from legitimate behavior. In the next chapter, we propose a system to overcome this limitation by looking at many more time slots and compressing their data into a few features using Fading Memories.

For SVM, a separate set of training data was also generated and SVM was trained on it using the Weka Machine Learning software [64]. The training data had a fixed proportion of malicious nodes (about 60%). For each node, its transaction history for the last 15 slots was fed into each RS. Then, using the output of the RS, a determination was made about whether the node was malicious or benign. For SVM this was done by looking at the distance between the test node and the decision boundary. If this distance was greater than a threshold, the

node was considered benign. Larger thresholds result in fewer false positives, but also fewer true positives. This might be desirable in critical applications where we want to be sure that a node that is given access to a resource is indeed good, even if that means denying access to some legitimate nodes. We discuss thresholds in greater detail in the next chapter, when we introduce Dynamic Thresholds. TrustGuard also outputs a score that can be compared against a threshold, and access can be granted if the score is greater than the fixed threshold.

Classification Error: In the first set of experiments, the thresholds were fixed at their midpoint values so that the results were not artificially biased either towards increasing true positives (lower thresholds) or decreasing false positives (higher thresholds) but were halfway. Since the range of thresholds for SVM is $(-\infty, \infty)$, its threshold was set to 0. The range for TrustGuard is [0, 1], so its threshold was set to 0.5. Then the percentage of malicious nodes in the network was varied. The proportion of nodes that were misclassified, or the classification error, was measured. The results for each attack type are illustrated in Figs. 4.2 to 4.6.

The results show that SVM significantly outperforms TrustGuard's Naïve and TVM algorithms for all attack types, even if the proportion of malicious nodes is very large (i.e. 80%). The difference is especially stark in attacks 2 to 5, when the attacker's behavior oscillates between good and bad. It is also interesting to note that, with the exception of attack 1, there is not much difference between TrustGuard's TVM and Naïve algorithms, even though TVM is much more complex.
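The effect of the threshold on classification error, precision and recall can be illustrated with a small Python sketch. The scores and labels below are made up for the example and are not taken from the experiments; the function names are hypothetical. A node is treated as positive (benign) when its score exceeds the threshold, as in the text.

```python
def classify(scores, threshold=0.0):
    """Label a node benign (True) when its score exceeds the threshold."""
    return [s > threshold for s in scores]


def metrics(predicted, actual):
    """Accuracy, error, precision and recall with benign as the positive class."""
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)
    fp = sum(p and not a for p, a in pairs)
    fn = sum(not p and a for p, a in pairs)
    tn = sum(not p and not a for p, a in pairs)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "error": (fp + fn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Illustrative scores (e.g. signed distance to the SVM boundary).
scores = [0.9, 0.4, 0.1, -0.2, -0.7, 0.3]
actual = [True, True, False, True, False, False]  # True = benign

mid = metrics(classify(scores, 0.0), actual)     # midpoint threshold
strict = metrics(classify(scores, 0.5), actual)  # larger threshold
print(mid)
print(strict)
```

In this toy data the larger threshold removes both false positives (precision rises from 0.5 to 1.0) at the cost of one true positive (recall falls from 2/3 to 1/3), which is the trade-off described above.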

Figure 4.2. Classification Error vs. Proportion of malicious nodes for Attack 1.

Figure 4.3. Classification Error vs. Proportion of malicious nodes for Attack 2.

Figure 4.4. Classification Error vs. Proportion of malicious nodes for Attack 3.

Figure 4.5. Classification Error vs. Proportion of malicious nodes for Attack 4.

Figure 4.6. Classification Error vs. Proportion of malicious nodes for Attack 5.

ROC Curves: We generated ROC curves for all three RSs. ROC curves are commonly used in Machine Learning to evaluate classifiers, irrespective of the thresholds used. The curve is obtained by varying the threshold, so that we can compare how the true positive rate varies with the false positive rate. The area under the ROC curve shows how good a classifier is. Classifiers with larger areas under the curve are better. The ideal ROC curve is an upside down L-shaped curve, containing the point (0, 1) that corresponds to a 100% true positive rate and a 0% false positive rate.

Each point on the curve was obtained by running 30 simulations with different random number seeds, and then taking their mean. Confidence intervals were taken around each point to ensure that the curve of SVM did not overlap with that of TrustGuard (the confidence intervals are too small to be visible on the graphs). The results are shown in Figs. 4.7 to 4.11,

along with the diagonal random guessing line. The results show that SVM outperforms TrustGuard, regardless of the thresholds used. The area under the curve is greater for SVM than TrustGuard in all cases.

Figure 4.7. ROC Curves for Attack 1.
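The ROC construction described above (sweep the threshold, record the false positive and true positive rates, then compare areas under the curve) can be sketched as follows. This is a generic illustration on made-up scores, not the evaluation code behind the figures; the function names are hypothetical and the averaging over 30 seeded runs is omitted.

```python
def roc_points(scores, actual):
    """Sweep the threshold over every observed score; return (FPR, TPR) points."""
    positives = sum(actual)
    negatives = len(actual) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(s >= threshold and a for s, a in zip(scores, actual))
        fp = sum(s >= threshold and not a for s, a in zip(scores, actual))
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as (x, y) points ordered by x."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Illustrative classifier outputs; True marks a benign (positive) node.
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
actual = [True, True, False, True, False, False]

points = roc_points(scores, actual)
auc = area_under_curve(points)
print(points)
print(auc)
```

A random guesser would trace the diagonal with an area of 0.5, while the ideal classifier passes through (0, 1) with an area of 1; in this toy data the area is 8/9.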

Figure 4.8. ROC Curves for Attack 2.

Figure 4.9. ROC Curves for Attack 3.

Figure 4.10. ROC Curves for Attack 4.

Figure 4.11. ROC Curves for Attack 5.

Bandwidth Overhead: Next we measured the bandwidth overhead involved in using SVM vs. TrustGuard. This overhead is due to passing feedbacks between the nodes. The overhead is consistent, regardless of the attack scenario, the proportion of malicious nodes, or the thresholds used, since the number of nodes transacted with in each time slot, and hence the number of feedback messages, is the same in all cases. This overhead is illustrated in Fig. 4.12.

Figure 4.12. Bandwidth Overhead for each of the three algorithms.

The overhead shown corresponds to the average number of messages (or feedbacks) exchanged to classify one node. The results show that the overhead is the same for SVM and TrustGuard Naïve (around 30), whereas the overhead for TrustGuard TVM is much higher (around 960), since TVM traverses one more level to obtain further feedbacks about the nodes that gave the original feedbacks.

4.6 Conclusions

In this chapter, we proposed a Machine Learning based Reputation System (RS) that can be used to guard against malicious nodes in mission critical networks. The nodes can try to attack the network by conducting malicious transactions, or spreading viruses and worms, or attacking known vulnerabilities. Although there is no way of knowing whether a future transaction will be malicious or not, we try to predict the future behavior of a node by observing its past behavior.

We discussed why using Machine Learning (ML), and especially Support Vector Machines (SVM), is a good approach for designing RSs. We explained that the Reputation System problem can be viewed as a time series prediction problem, and SVM has been shown to be well suited to time series prediction problems [70] and to approximating unknown functions [62]. We then proposed our SVM based RS and specified the features, the kernel, and the other parameters used. We illustrated the three problems that any Reputation System must solve; SVM tries to solve two of them: dishonest feedback and incorrect feedback. To solve the third problem, we proposed a Digital Signature based scheme for guarding against fake transactions that does not need online trusted third parties.

We developed and evaluated our core SVM based RS using simulations. Its performance was compared against two other algorithms found in the literature, called TrustGuard Naïve and TrustGuard TVM [60]. We chose TrustGuard because it has been shown by its authors to perform very well compared to eBay's reputation system. We simulated 5 different attack scenarios and showed that our model outperforms TrustGuard in all 5 scenarios, whether there is oscillating or steady behavior, collusive or non-collusive behavior. Our scheme can achieve high accuracy and correctly predict good vs.

malicious nodes, even when the proportion of malicious nodes in the network is very high. The ROC curves show that the improvement of SVM over TrustGuard is statistically significant, as their confidence intervals do not overlap each other. We also showed that SVM has the same bandwidth overhead as TrustGuard Naïve, but much lower overhead than TrustGuard TVM.

In the next chapter, we propose enhancements to the basic RS design that will allow us to look back at longer histories using only a few features. This enhancement forces the adversary to behave well for longer periods of time in order to boost its reputation score. In addition, we introduce an algorithm called Dynamic Thresholds that further improves the accuracy of SVM by dynamically shifting the SVM decision boundary based on the proportion of malicious nodes in the network.

Chapter 5: EMLTrust: An Enhanced Machine Learning Based Reputation System

This chapter describes two enhancements that we propose to our core SVM classifier. The first enhancement, SVM with Fading Memories and Digital Signatures (Section 5.1), allows us to look back at longer histories using only a few features. The second enhancement, Dynamic Thresholds (Section 5.2), allows us to better estimate where to place the SVM boundary in order to improve classification accuracy. Finally, Section 5.3 describes conclusions and future work.

5.1 Enhancing SVM with Fading Memories (SVM-FM)

As described in Section 4.4.1, each feature used in our SVM classifier corresponds to one time slot. This means that for n features, we can only look back at the behavior of a node going back n time slots. From an adversary's point of view, the adversary only needs to behave well for n time slots since any behavior before that will be forgotten and it can have a fresh start. We would like our security system to be able to look back far into the past so that it is not easy for an adversary to behave well for a short period of time and then wipe its slate clean. However, we cannot make n, the number of SVM features, arbitrarily large for several reasons:

• Using too many features may result in the well known Machine Learning problem of the Curse of Dimensionality, which causes accuracy to decrease as n increases beyond a certain threshold [62].

• Storage requirements increase as n increases because the nodes have to store n features.
• Computational requirements increase as the SVM models become more complex. It takes longer to build new SVM models as well as to classify new nodes [62].

Ideally, we would like to use fewer features while retaining the ability to look back further into the past. In other words, a feature should be able to summarize the data for several time slots into one value. To accomplish this goal, we build on the concepts presented in the Fading Memories algorithm of TrustGuard [60] and enhance them for use with SVM and digital signatures.

Figure 5.1. (Adapted from [60]) Updating Fading Memories: FTV[i] denotes the faded values at time t and FTV'[i] denotes the faded values at time t + 1.

The basic idea in Fading Memories (as explained in [60]) is that data is aggregated over intervals of exponentially increasing length in the past, $\{k^0, k^1, \ldots, k^{m-1}\}$, into m values (for some integer k > 0) (Fig. 5.1). For more recent data, the interval size is smaller, so fewer time slots are aggregated together, resulting in greater precision. For older data, larger intervals are used, so more time slots are aggregated together, resulting in lower precision but greater compression. This enables us to retain more detailed information about the recent past, while storing less detailed information for older data. The value of k can be chosen so as to allow a tradeoff between compression and precision. We chose a value of 2 for simplicity.

More specifically, for a given node x during time slot (t + 1), feature $F_x^{t+1}[0]$ is initialized to $p_x^t$, the proportion of positive feedbacks given to x during the previous time slot t (Eq. (1)). The remaining features are computed using Eq. (2). In TrustGuard all the features are updated after each time slot.

\[ F_x^{t+1}[0] = p_x^t \qquad \text{Eq. (1)} \]

\[ F_x^{t+1}[j] = \frac{F_x^t[j]\,(2^j - 1) + F_x^t[j-1]}{2^j} \quad \text{for } j > 0 \qquad \text{Eq. (2)} \]

The advantage of Fading Memories (FM) is that it allows us to look back further in time to evaluate a given node's behavior. Without FM, an adversary would only need to behave well for a short period of time before its history is erased and then it can start behaving maliciously again. It can oscillate its behavior between good and bad indefinitely, as long as its period of good behavior is greater than the history we are looking at. Fading Memories is an attempt to elongate the history size so as to force the adversary into choosing longer periods of good behavior. We were interested in finding out if Fading Memories would work with SVM, so we compared the performance of SVM with Fading Memories (SVM-FM) versus without Fading Memories.

5.1.1 Evaluating SVM with Fading Memories (SVM-FM)

Small Fixed Periods

Fading Memories (FM) aggregates data in several time slots, resulting in lower precision as compared to one feature representing one time slot. As a result, we would expect the performance of SVM with FM to be worse than SVM without FM when dealing with small oscillation periods. We conducted experiments in order to test this. We generated datasets using simulations of Attack 5 in a network of 1,000 nodes, similar to those described in the previous chapter. Once again, all the good nodes behave well consistently, while malicious nodes that oscillate their behavior have fixed periods of size n, where n was chosen to be 8. They behave well for 4 time

slots, then maliciously for 4 time slots. The number of features used in both SVM-FM and SVM without FM is also n.

Figure 5.2. Classification Error vs. Percentage of Malicious Nodes for small, fixed periods of oscillation for SVM with FM and SVM without FM.

As described in Section 4.4, in SVM without FM each feature corresponds to one time slot, whereas in SVM-FM each feature was computed using Eq. (2) at each new time slot as the simulation progressed. At the end of the simulation the features were collected and used in building and testing SVM models. The training and testing datasets were distinct and were generated using different random number seeds. The percentage of malicious nodes in the training dataset was fixed at 50%. In the testing datasets the percentage was varied and the classification error was plotted against it. The results are summarized in Fig. 5.2. They show that the change in error with FM is insignificant compared to that without FM. The difference between them is not statistically significant, so Fading Memories is a viable option for use with SVM even when the period of oscillation is small.
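The per-slot feature update used for SVM-FM (Eqs. (1) and (2) with k = 2) can be written as a short Python function. This is a sketch of the update rule only: the function name is hypothetical, and the initial feature values of 1.0 in the example are an assumption, since the text does not state how the features are initialized before any history exists.

```python
def update_fading_memories(faded, p_latest):
    """Apply one Fading Memories step (k = 2) and return the new features.

    faded:    features F^t[0..m-1] at time slot t.
    p_latest: proportion of positive feedbacks in the slot that just ended.
    """
    updated = [p_latest]  # Eq. (1): the newest slot is stored exactly
    for j in range(1, len(faded)):
        # Eq. (2): feature j summarises 2**j slots, so the value arriving
        # from feature j-1 receives a weight of 1 / 2**j.
        updated.append((faded[j] * (2**j - 1) + faded[j - 1]) / 2**j)
    return updated


# A node with a perfect history (assumed start) that then misbehaves twice.
features = [1.0, 1.0, 1.0, 1.0]
features = update_fading_memories(features, 0.0)
features = update_fading_memories(features, 0.0)
print(features)  # [0.0, 0.5, 1.0, 1.0]
```

The example shows the fading effect: recent bad behavior dominates the low-index features immediately, while the older, coarser features change only gradually.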

Long Variable Periods

The real benefit of SVM with FM is apparent when we vary the period of oscillation. To illustrate this, another set of experiments was conducted where the datasets were generated as before, but the period of oscillation for each malicious node was randomly selected up to a maximum of 250 time slots for good behavior and 250 time slots for bad behavior. Given 8 features and k = 2, with FM we can look back at $2^8 - 1 = 255$ time slots, whereas we can only look back at the last 8 time slots without FM. As a result we would expect a significant improvement in classification error with FM. We tested this through simulations and the results are plotted for each of the five attack patterns in Figs. 5.3 to 5.7.

The results clearly show the advantage of using SVM-FM when the periods of oscillation are long. The error with SVM-FM is much lower than the error without FM, especially at smaller percentages of malicious nodes. In attack 1, there are no oscillations and the nodes have consistent behavior. By reducing the error for longer periods, we can successfully force the adversary to behave well for longer periods of time.

In general, the error keeps rising as the percentage of malicious nodes increases until at some point, almost all the good nodes are being classified as bad. This happens when malicious nodes overwhelm the network and leave too many negative feedbacks. SVM concludes that almost every node is malicious, which happens at around 70%. After 70%, virtually all the nodes are classified as bad. The error then becomes proportional to the percentage of good nodes in the network. As the percentage of good nodes decreases, so does the error, which explains the dip in error. We combat this phenomenon using Dynamic Thresholds as discussed in Section 5.2.
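The look-back window quoted in this section follows from summing the interval lengths $k^0, k^1, \ldots, k^{m-1}$ covered by the m features. As a short worked check, using only the values stated in the text (m = 8 features, k = 2):

```latex
\[
  \sum_{j=0}^{m-1} k^{j} \;=\; \frac{k^{m}-1}{k-1},
  \qquad
  \sum_{j=0}^{7} 2^{j} \;=\; 1 + 2 + 4 + \cdots + 128 \;=\; 2^{8} - 1 \;=\; 255 .
\]
```

Without FM, the same 8 features cover only 8 time slots, so a maximum oscillation period of 250 slots fits inside the FM history but lies far outside the plain one.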

Figure 5.3. Classification Error vs. Percentage of Malicious Nodes for large, variable periods of oscillation for SVM with FM (FM error) and SVM without FM (Orig Error). Attack 1 has no oscillations.

Figure 5.4. Classification Error vs. Percentage of Malicious Nodes for large, variable periods of oscillation in attack 2.

Figure 5.5. Classification Error vs. Percentage of Malicious Nodes for large, variable periods of oscillation in attack 3.

Figure 5.6. Classification Error vs. Percentage of Malicious Nodes for large, variable periods of oscillation in attack 4.

Figure 5.7. Classification Error vs. Percentage of Malicious Nodes for large, variable periods of oscillation in attack 5.

5.1.2 Fading Memories Enhanced with Digital Signatures

In Section 5.1, we showed that by using Fading Memories we can successfully compress the data in several time slots into a few features that SVM can use. This helps us in using longer histories without needing extra storage to store those histories. However, we run into a problem when we try to use Fading Memories along with the Digital Signature scheme presented in Section 4.4.7. The digital signature scheme is needed to protect against fake feedbacks, so that one can verify that a given feedback pertains to a transaction that actually took place. As mentioned in Section 4.4.7, at the beginning of every transaction the client and the server exchange digital certificates. The certificate fields are:

Server ID, Client ID, Time Slot ID, Transaction Number        Cert. (1)

Transaction Number tells us how many transactions took place in the given time slot. If a new transaction occurs in the same time slot, the transaction number is incremented by one. This in turn corresponds to the number of feedbacks that one of the nodes, let's say the client, can legitimately give about the other node (the server) for that time slot. But when we start aggregating the data in different time slots in Fading Memories, we run into the problem of trying to aggregate two or more digital certificates into one. As the client moves into a new time slot t + 1, it must aggregate the data in time slot t with the data in time slot t − 1, and continue aggregating the data in time slots further back (Fig. 5.1). However, it is unable to combine the digital certificate for t with the certificate for t − 1 to get a single certificate because the new certificate would not have a valid signature by the server. As a result, it would be forced to store the certificates for each time slot individually, defeating the purpose of using Fading Memories.

To circumvent this problem, we propose a new protocol for the digital signature scheme. Suppose n features are being used by SVM and the system is currently in time slot t. The client and the server transact for the first time and exchange certificates similar to Cert. (1). If they transact again within the same time slot, they update this certificate by incrementing the number of transactions and re-signing it. Now let's assume that at some point in the future, t + x, the client again transacts with the server. This time, the client and the server both update their certificates by moving along the time line and using the update step of Fading Memories (Eq. 2). This allows them to aggregate all the previous time slots according to the Fading Memories formula and then insert the current time slot as a new feature. Since both the client and the server use the same formula, both will end up with the same results about the aggregated number of transactions in previous time slots. Then each will present this new

certificate for the other to sign. This new certificate will have the following fields:

Server ID, Client ID, Time Slot ID, $N_1, N_2, \ldots, N_n$        Cert. (2)

where $N_i$ is the aggregate number of transactions that took place for feature slot i (not time slot), beginning at Time Slot ID. Although this increases the size of each certificate by n − 1 more fields, note that only one certificate now needs to be stored by the client for all the transactions it has conducted with the server. In the original scheme, it stored one certificate per time slot, so we have considerable savings in storage space with this new scheme. The certificate is updated whenever a new transaction occurs with the server, so the certificate can tell us the number of transactions in each previous feature slot. Of course, this certificate may be requested at any point in the future after the given time slot has expired. In that case, the receiving node will start at Time Slot ID and update the number of transactions and their feedback data till the present time slot using the update step of Fading Memories (Eq. 2). Using this new protocol, we can use digital signatures to detect fake feedbacks while still using Fading Memories to look at longer histories without increasing storage requirements.

5.2 Introducing Dynamic Thresholds with SVM

Figs. 5.8 to 5.10 show SVM with Fading Memories compared to TrustGuard Naïve and TVM with Fading Memories for some of the attack patterns. In addition, for attacks 3 and 5, another biased SVM curve was also plotted. The general equation of a linear SVM is [62]:

\[
\begin{cases}
w \cdot x + b \ge 0 & \text{for the positive class} \\
w \cdot x + b < 0 & \text{for the negative class}
\end{cases}
\]

Where x is the instance vector, w is the normal to the SVM hyperplane, and b is a bias threshold. We noticed in Figs that the default SVM model performed poorly when the percentage of malicious nodes was high in attacks 3 and 5 because it was misclassifying the positive (non-malicious) instances. Therefore, we increased the threshold, b, slightly so that those instances close to the SVM boundary are classified as positive. In other words, by increasing b we effectively trade false negatives for false positives. The results are shown in the second, biased SVM curve in Figs (attacks 3 and 5). As expected, the bias pays off at higher proportions of malicious nodes, when there are more false negatives than false positives. However, it costs us when the proportion of malicious nodes is small, since there are then more false positives than false negatives. This increases the error for small proportions.

This observation led us to the idea that if we could know the proportion of malicious nodes in the network, we could adjust the bias threshold accordingly to optimize accuracy. The threshold would be decreased for fewer malicious nodes, and increased for more malicious nodes. We propose a scheme called Dynamic Thresholds that utilizes this idea. To begin with, we discovered what the ideal thresholds were for given proportions of malicious nodes using brute force trial and error. The ideal threshold is defined as that threshold which maximizes accuracy. We expect the threshold curve to be specific to a given SVM model, so each model would have its own associated curve. For our tests, we used the model that was trained using attack 5 since it combines all the other four attack patterns. The ideal thresholds curve is given in Fig.
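The brute-force search for an ideal threshold can be sketched as follows. This is only an illustration under stated assumptions, not the simulation code used for the experiments: the function names (`classify`, `accuracy`, `best_threshold`), the candidate grid, and the sample scores are hypothetical. Each score stands for a normalized w.x value, and a label of +1 denotes a non-malicious (positive) instance.

```python
from typing import Sequence


def classify(score: float, b: float = 0.0) -> int:
    """Return +1 (positive, non-malicious) if score + b >= 0, else -1."""
    return 1 if score + b >= 0 else -1


def accuracy(scores: Sequence[float], labels: Sequence[int], b: float) -> float:
    """Fraction of instances whose predicted class matches the true label."""
    correct = sum(1 for s, y in zip(scores, labels) if classify(s, b) == y)
    return correct / len(scores)


def best_threshold(scores: Sequence[float], labels: Sequence[int],
                   candidates: Sequence[float]) -> float:
    """Brute-force search: the candidate b with the highest accuracy.

    On ties, the first candidate in the sequence is returned.
    """
    return max(candidates, key=lambda b: accuracy(scores, labels, b))


# Hypothetical data: three non-malicious nodes (two of them scored just
# below the default boundary) and one malicious node.
scores = [-0.3, -0.1, 0.2, -0.6]
labels = [1, 1, 1, -1]
print(accuracy(scores, labels, 0.0))
print(best_threshold(scores, labels, [0.0, 0.2, 0.35, 0.7]))
```

Raising b pulls the two borderline positive instances back onto the positive side, which is the trade of false negatives for false positives described above. Repeating the search for each proportion of malicious nodes would yield one point of the threshold curve per proportion.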

Figure 5.8. Classification Error vs. Malicious Nodes percentage for different algorithms with FM in attack 2.

Figure 5.9. Classification Error vs. Malicious Nodes percentage for different algorithms with FM in attack 3.

Figure Classification Error vs. Malicious Nodes percentage for different algorithms with FM in attack 5.

In our SVM simulation, all the w.x values are normalized to the range (−1, 1). The threshold value, b, is expressed with reference to this range. Fig. 5.12 shows the reduction in error achieved using the optimum thresholds versus the default threshold of zero. The results clearly show that dynamic thresholds can significantly reduce the error. However, the challenge is that in a real world setting, we do not know the proportion of malicious nodes in the network and therefore cannot decide what threshold to use. To overcome this, we propose estimating the proportion of malicious nodes through sampling.

Figure SVM's best threshold vs. Malicious Nodes percentage for attack 5.

Figure SVM error under the default threshold and under the best thresholds.

The idea is that a new node that joins the network would initially use the default threshold of zero. It would conduct transactions as usual and estimate the proportion of

malicious nodes from all the nodes it has interacted with. A node is considered malicious if either the Reputation System classifies it as malicious, or if a transaction is conducted with the node and the transaction is deemed to be malicious. Once a large enough sample is collected by the node, it can use that sample to estimate the proportion of malicious nodes in the network and then dynamically adjust its threshold to improve the RS's accuracy.

We conducted experiments to determine what a good sample size would be before adjusting the threshold. We ran simulations with different proportions of malicious nodes in the network. Then we randomly sampled the nodes and determined which of them were malicious. Based on the proportion of malicious nodes in our random sample, we estimated the actual proportion in the entire network. Next, we used the estimated proportion to decide which threshold to use. Finally, we obtained the RS error over all the nodes using that threshold. The sample size was varied and the error plotted against it. The results are plotted in Fig. The results show that the error is erratic and unstable until a sample size of about 20. After 20 samples, a fairly good estimate of the actual proportion can be obtained. We therefore recommend that nodes obtain a sample of at least 20 before adjusting their thresholds.

Fig shows the reduction in error when Dynamic Thresholds are put into practice. Sample sizes of 20 and 25 are used. These samples are randomly collected and classified, based on a node's interactions with other nodes, and used to estimate the proportion of malicious nodes in the network. The threshold is adjusted based on the estimated proportions. The results show a significant reduction in error even at high proportions. Once the threshold has been adjusted, the sample is discarded and a fresh sample is started.
In this way, the node can continuously monitor the proportion of malicious nodes and adjust its threshold as the proportion changes.
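The sampling loop described above can be sketched as follows. The class name `DynamicThreshold`, the nearest-point lookup, and the numeric curve are assumptions made for illustration; the actual ideal-threshold curve comes from the brute-force search and is specific to each SVM model. Only the default threshold of zero, the minimum sample size of 20, and the discard-and-restart behavior are taken from the text.

```python
from typing import List, Tuple


class DynamicThreshold:
    """Adjusts the SVM bias threshold from a running sample of observed nodes."""

    def __init__(self, curve: List[Tuple[float, float]], min_sample: int = 20):
        # curve: (proportion of malicious nodes, ideal threshold) pairs.
        self.curve = sorted(curve)
        self.min_sample = min_sample
        self.threshold = 0.0          # a new node starts with the default
        self._sample: List[bool] = []

    def observe(self, is_malicious: bool) -> float:
        """Record one classified node and return the threshold now in use."""
        self._sample.append(is_malicious)
        if len(self._sample) >= self.min_sample:
            proportion = sum(self._sample) / len(self._sample)
            self.threshold = self._lookup(proportion)
            self._sample = []         # discard the sample and start afresh
        return self.threshold

    def _lookup(self, proportion: float) -> float:
        # Assumption: use the threshold of the nearest tabulated proportion.
        return min(self.curve, key=lambda pt: abs(pt[0] - proportion))[1]


# Hypothetical curve: thresholds rise with the proportion of malicious nodes.
curve = [(0.0, -0.1), (0.25, 0.0), (0.5, 0.1), (0.75, 0.3)]
dt = DynamicThreshold(curve)
for i in range(20):
    dt.observe(is_malicious=(i % 2 == 0))   # half of the sample is malicious
print(dt.threshold)
```

Interpolating between curve points instead of taking the nearest one would be an equally reasonable design; the nearest-point lookup is used here only to keep the sketch short.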

Figure Dynamic Thresholds Error vs. Number of Samples taken for different proportions of malicious nodes in the network.

Figure Dynamic Thresholds Error with 20 and 25 samples compared to the default error and the minimum possible error.

5.3 Conclusions and Future Work

In this chapter, we enhanced the core SVM based RS presented in the previous chapter with Fading Memories and modified our digital signature scheme so that we could look back at longer histories without using too many features. We evaluated its performance and showed that it was much better at detecting malicious behavior that varied over longer periods. Finally, we introduced a new technique, called Dynamic Thresholds, that varies the SVM threshold based on an estimate of the proportion of malicious nodes in the network. We showed that Dynamic Thresholds can significantly reduce the classification error.

In the future, we plan to enhance our classifier by taking into account the reliability of the agent providing the feedback. This is done in TrustGuard's PSM (Personalized Similarity Measure) algorithm; however, the overhead involved in PSM is similar to that of TVM, since we have to recursively collect further feedback about the agents that gave us the original feedback. We will study and try to minimize the overhead associated with collecting these features.

Chapter 6: Hybrid Trust Management in MANETs based on Reputation and Role Based TM Systems

This chapter proposes a new Hybrid Trust Management System, which is based on merging our EMLTrust Reputation System with Role Based Trust Management (RBTM) to combine the advantages of both. We first discuss some background material about RBTMs and outline some of the problems in utilizing RBTMs in MANETs, followed by suggested solutions. Then we describe the Hybrid TM System in detail and evaluate its performance.

6.1 Introduction to Role Based Trust Management

Role Based Trust Management (RBTM) was introduced by Li et al. in [39] as the next step in the evolution of access control mechanisms. It combined the merits of some earlier works by merging the concept of Roles from RBAC [40] with Trust Management [41, 42, 43]. The general idea in RBTM is that entities are granted access to resources based on the roles they possess [79]. Roles serve to characterize entities and can represent arbitrary entity attributes, such as student, faculty, or staff [80]. All users belonging to a given role are granted certain access rights using policy statements called credentials. Each credential is issued by an issuer to grant access rights to a subject. The issuer also digitally signs the credential to avoid counterfeiting, possibly using X.509 style certificates [77].

For example, let's say the University of Texas wants to grant all enrolled students the right to check out books from its library and the right to purchase parking permits. A tedious way of doing this would be for Enrollment Services (ES) to give the librarian as well as the police department an updated list of enrolled students each semester. However, role based TM makes the job much easier. In RBTM, the librarian (issuer) issues a credential to ES (subject) and delegates to it the right to decide who gets access to the library. The police department also issues another credential to ES, delegating to it the right to decide who can purchase a parking permit. This is illustrated by the following notation.

Library.Checkout ← EnrollmentServices.Student
PoliceDept.Permit ← EnrollmentServices.Student

ES, in turn, issues a credential to all enrolled students each semester. The credential is set to expire at the end of the semester. For instance, the student Alice would receive the following signed credential.

EnrollmentServices.Student ← Alice

If Alice needs to use the library, she only needs to show this credential to the librarian. The librarian can reconstruct the following credential chain:

Library.Checkout ← EnrollmentServices.Student ← Alice

and grant access to Alice. The police department can similarly reconstruct its credential chain and allow Alice to purchase a parking permit. In this way, every enrolled student can enjoy the same access rights easily, without the need for ES to send an updated list of students every semester to every single department at the university.
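The chain reconstruction in this example can be sketched in a few lines. This is a simplified illustration, not an RBTM implementation: each credential is modeled as a hypothetical (role, member) pair, the function name `is_member` is an assumption, and signature verification and credential expiry, which a real system must check, are deliberately left out.

```python
from typing import List, Optional, Set, Tuple

# Each credential says: "member" (an entity or another role) belongs to "role".
Credential = Tuple[str, str]


def is_member(role: str, entity: str, credentials: List[Credential],
              _seen: Optional[Set[str]] = None) -> bool:
    """Return True if a chain of credentials links the entity to the role."""
    seen = _seen if _seen is not None else set()
    if role in seen:              # guard against cyclic delegations
        return False
    seen.add(role)
    for head, body in credentials:
        if head != role:
            continue
        # Either the entity is named directly, or the body is itself a role
        # that the entity can be shown to belong to.
        if body == entity or is_member(body, entity, credentials, seen):
            return True
    return False


credentials = [
    ("Library.Checkout", "EnrollmentServices.Student"),
    ("PoliceDept.Permit", "EnrollmentServices.Student"),
    ("EnrollmentServices.Student", "Alice"),
]
print(is_member("Library.Checkout", "Alice", credentials))
print(is_member("Library.Checkout", "Bob", credentials))
```

Note that adding a new student requires only one new credential issued by Enrollment Services; neither the library's nor the police department's credential changes.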

6.2 Challenges of Using Role Based Trust Management in MANETs

Many issues need to be addressed before Role based TM can be pragmatically deployed in MANETs. In this section, we discuss several issues and propose possible solutions for them. Some of the issues of using TM in MANETs, along with their solutions, have been highlighted in [1, 2, 83]. They include problems such as:

1. Attacks on the authenticity of entities, such as impersonation and Sybil attacks, which are relatively easy to mount in MANETs.
2. Ease of eavesdropping, which exposes the identity of communicating entities. This may be a problem if we would like to keep the identity of a user confidential, such as in a military, financial, or medical setting.
3. The problem of selfish nodes that refuse to cooperate, for example by not providing or storing credentials, or by not complying with rules regarding access rights.
4. Unauthorized alteration of distributed resources and of stored or exchanged data. It is difficult to police nodes that are in charge of a distributed resource.
5. Limited computational resources, especially on the client side, where devices such as PDAs and cell phones might request services.

In addition, we discuss in detail some other issues that are involved in using RBTM in MANETs.

6.2.1 Credential Storage

MANET system administrators need to decide who should store credentials and whether they are stored redundantly or not. As defined in [39], a credential is a signed certificate from an issuer, X, that grants certain rights to the subject, Y, denoted by:

X ← Y

So X could grant Y access rights to a resource. X could also delegate authority to Y so that Y could authorize other nodes. A credential chain can thus be formed:

X ← Y ← Z

In this case, X authorizes Y and Y grants access rights to Z. The chain could be arbitrarily long. It is also possible to construct a web of credentials where multiple issuers issue credentials to multiple subjects:

[Diagram: a web of credentials linking the nodes X, Y, Z, U, V, S, W, and T]

This creates the problem of who should store the credentials [44]. Should the issuer of a credential store it, or the subject, or a third party? A third party could, for instance, be a credential storage server, but this would create scalability and availability issues, since the system would be incapacitated if the server went down. It is therefore desirable to have distributed

storage of credentials. The location of the credential has significant consequences when it comes to credential distribution and collection, which is discussed in the next subsection. Another problem that we need to address is whether the credentials should be stored redundantly or not. In the chain X ← Y ← Z, consider what would happen if Y were to suddenly go offline. Regardless of whether the issuer or the subject chooses to store a credential, if Y goes offline we cannot complete the chain since Y is both an issuer and a subject. In this case Z will not be able to gain access rights even though it is entitled to them. Let us say we choose to store a credential with both its issuer and its subject for the sake of redundancy. Then the scheme still fails in cases such as W ← X ← Y ← Z if both X and Y were to go offline simultaneously, since the middle credential would become inaccessible. There is a tradeoff between redundancy and storage requirement based on the degree of robustness a network desires.

Redundancy: Ideally, a MANET must store credentials redundantly. Since the entire system is distributed, we cannot store all the credentials at a centralized server. Therefore, each issuer (or subject) stores its own credentials. However, if an issuer such as X goes offline, we would like Z to be able to contact some other node in order to obtain X's credentials. One possibility is that we designate another node, W, to store a duplicate of all of X's credentials. If X is unreachable, Z would contact W to obtain the credentials. But if a malicious node wants to target X and remove all its credentials, it only needs to attack X and W and disable them. We propose that instead of storing all the credentials in one backup node, X would randomly divide all its credentials into r equal sized groups and distribute the groups among r different nodes, one group per node. Since each credential is signed by X, it cannot be tampered with.
If X becomes unreachable, Z can contact each of the r nodes in turn to obtain the complete

set of X's credentials. Using this approach, we do not increase the storage or bandwidth requirements, but we make it harder for an attacker to disable access to all of X's credentials (the total number of nodes accessed by Z is increased, however). To disable access to all of X's credentials, the attacker must bring down all r nodes along with X. Further redundancy can be introduced by storing each group of credentials on more than one node. This would increase the storage requirement, though, so a balance needs to be reached between the storage requirement and the degree of redundancy.

Which group of nodes X chooses to store its credentials on depends on the implementation. One possibility is that X can simply choose the nodes with the next higher (or lower) IP address compared to its own. Another possibility is that X would take a hash of its own ID. It would then take the hash value modulo the number of nodes in the network, n, to decide which node should store the credentials. X can hash its ID twice, or thrice, to obtain the next node that stores the credentials. When Z needs to access X's credentials and X is offline, Z can repeat the same procedure to determine which node has X's credentials. The advantage of this approach is that it can distribute credentials in a pseudo-random fashion within the network and avoid possible clustering, where a small group of nodes ends up storing most of the credentials.

6.2.2 Credential Chain Distribution and Collection

The problem of searching for credentials is directly correlated with where the credentials are stored. In [44], the authors discuss three approaches for constructing the chain of credentials going from the subject, the client requesting a service, to the issuer, the server offering the service. The first approach is a top down approach where one would begin constructing the

chain starting from the issuer's end and spanning out until one would encounter the subject at the end of the chain. The second approach is a bottom up approach in which one would start at the subject's end and span out backwards until the issuer is reached. The third approach is a meet-in-the-middle approach where one would start at both the issuer's and the subject's ends and meet in the middle. It is important to realize that a node storing a given credential could be located anywhere within the MANET, so a credential such as X ← Y does not imply that X and Y are neighbors.

Some practical considerations need to be kept in mind when designing the system. Consider the case where Z needs to use a service provided by X. Z must find and present to X a credential chain authorizing Z to use its services. We assume that the chain X ← Y ← Z exists. Suppose we choose to store the credential with the subject. In such a case, it would make sense to use the bottom up approach. Since Z has the credential Y ← Z, it knows it must contact Y in order to obtain the next link up in the chain. Z would contact Y and ask for all the credentials it possesses to see if a chain back to X can be constructed. Once Z receives the credential X ← Y, it stops searching. If such a credential is not found, it would in turn contact each of the issuers of Y's credentials and fan out from there until it reaches X, or the maximum allowed search depth is reached. The maximum search depth should be chosen so as to allow most of the chains to be found while not being so large as to place an excessive load on the network and on the node performing the search. Conversely, if the credential is stored with the issuer, then we should use the top down approach. In that case, Z would begin at X and know that it needs to contact Y next when it sees the credential X ← Y. A meet-in-the-middle approach would be effective if both the issuer and the subject store the credential.
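The bottom up search with a maximum search depth can be sketched as a breadth-first traversal. This is an illustrative sketch only: the dictionary `store` is a hypothetical stand-in for contacting each node over the network and asking for the credentials it holds as a subject, each credential is modeled as an (issuer, subject) pair, and signature checks are omitted. The function name `find_chain_bottom_up` is likewise an assumption.

```python
from collections import deque
from typing import Dict, List, Optional, Tuple

Credential = Tuple[str, str]   # (issuer, subject)


def find_chain_bottom_up(store: Dict[str, List[Credential]], client: str,
                         server: str, max_depth: int) -> Optional[List[Credential]]:
    """Search from the client towards the server, fanning out over issuers.

    Returns the chain ordered from the server's credential down to the
    client's, or None if no chain of at most max_depth credentials exists.
    """
    queue = deque([(client, [])])
    visited = {client}
    while queue:
        node, chain = queue.popleft()
        if len(chain) >= max_depth:
            continue                      # do not fan out any further
        # "Contact" the node and ask for the credentials it stores.
        for issuer, subject in store.get(node, []):
            new_chain = [(issuer, subject)] + chain
            if issuer == server:
                return new_chain          # chain back to the server found
            if issuer not in visited:
                visited.add(issuer)
                queue.append((issuer, new_chain))
    return None


# Credentials are stored with their subjects: Z holds (Y, Z); Y holds (X, Y).
store = {"Z": [("Y", "Z")], "Y": [("X", "Y"), ("W", "Y")]}
print(find_chain_bottom_up(store, client="Z", server="X", max_depth=3))
```

A breadth-first order finds the shortest chain first, which keeps the number of contacted nodes low; a top down search would be the mirror image, with credentials indexed by issuer instead of subject.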

6.3 Introduction to a Hybrid Trust Management System

A given organization may consist of many resources and entities who want to access those resources. For example, in a military setting there may be resources such as battle plans, communication systems, surveillance equipment, and weapons systems that may need to be accessed by different personnel at different times. Not all personnel are granted full access rights to every resource, so there must be a Trust Management System (TMS) in place to enforce those access rights. However, it is cumbersome to enforce access rights based merely on an entity's username. Every resource would need to have a database of usernames that are allowed to access it, along with authentication mechanisms to verify the authenticity of usernames provided by users. Making changes to a person's access rights would be a daunting task as every resource that the person accesses would need to have its database updated. Furthermore, having separate databases and authentication mechanisms for each resource makes an attacker's task easier by providing him with more potential points of entry.

Role Based Trust Management (RBTM) was introduced to solve some of these problems as the next step in the evolution of access control mechanisms. In RBTM, entities are granted access to resources based on the roles they possess. By contrast, Reputation Systems (RS) grant access to entities based on their past behavior [75]. Good behavior results in escalation of access rights, whereas bad behavior results in the reduction of rights. There is no notion of roles, making RSs unsuitable for organizations consisting of designated roles, such as the military, or corporations. In Chapters 4 and 5, we developed a Machine Learning based Reputation System [14] and showed that it performed better than some other approaches presented in the literature [60].

In this chapter, we propose to combine our Reputation System with RBTM in order to create a new Hybrid Trust Management System (HTMS). Its motivation is derived from real life. For example, a soldier with a long history of working with the army is more reliable than a brand new soldier. HTMS takes into account the role of an entity, as well as its historical behavior and experiences, in order to decide which access rights to grant to an entity. This approach has the advantage of not granting complete access to a new role, but gradually transferring more rights as the entity becomes more experienced. The process is automatic and it grants rights in small steps, making the access control fine grained. On the other hand, if an entity is deemed to misbehave and abuse its rights, its rights are reduced even though it still belongs to the same role. In addition, the system stores a history of past transactions and their feedbacks so that administrators and supervisors can review the performance of any entity.

In Section 6.4, we describe the details of our proposed Hybrid Trust Management System (HTMS), and discuss its merits. In Section 6.5, we evaluate HTMS and present results. Section 6.6 suggests how we can estimate the proportion of malicious nodes in the network to further improve performance. Finally, conclusions and future work are given in Section

6.4 Proposed Hybrid Trust Management System (HTMS)

Our Hybrid Trust Management model utilizes a combination of Role based and Reputation based trust management in order to benefit from the advantages of both. In this model, each entity is assigned a role beforehand by an administrative authority. However, all entities belonging to the same role do not necessarily have the same access rights. Within each role, a range of access rights is defined from a minimum privilege level to a maximum privilege level.

For instance, minimum privilege level access could mean that a soldier can only receive communications. Soldiers with higher privilege levels could receive as well as send communications, whereas soldiers with maximum privilege levels could access battle plans. Privilege levels are values between 0 and 1, with 0 being the least privileged, and 1 being the most privileged. Nodes with privilege level 0 cannot access any resource on the network, whereas nodes with privilege level 1 have full access to all the resources on the network.

When the administrative authority assigns a role to a new node, it defines a minimum and a maximum privilege level that the node can possess. At any point in time, the exact privilege level of the node is determined by its reputation score, but it will always fall within this range. The administrator then issues a digitally signed certificate to the node, with the following credentials:

Node ID, Role, Maximum Privilege, Minimum Privilege, Expiration Time

where Node ID is the ID of the node, and Role is the role title, such as private, major, colonel, or general, and is only meant for human consumption. It has no bearing when machines decide access rights. Maximum and minimum privilege levels are numbers between 0 and 1, where minimum privilege ≤ maximum privilege. Expiration time is the time when the certificate becomes invalid and must be renewed. The certificate is signed by the administrator and given to the node.

Whenever the node needs to access any service on the network, it will present this certificate to the server. The server will check the minimum privilege level required to access that particular service. This required privilege level is decided beforehand by the server administrator. Then the server will proceed as follows:

1. Verify the signature to ensure that the certificate originated from the administrative authority.
2. If the required privilege level for the service is greater than the maximum privilege level on the certificate, deny service.
3. If the required privilege level for the service is less than the minimum privilege level on the certificate, grant service.
4. If the required privilege level for the service is in between the minimum and the maximum privilege on the certificate, obtain the reputation score of the node and compute its privilege level at that point in time (explained below).
5. Grant service if the node's privilege level is greater than or equal to the required privilege level. Otherwise, deny service.

6.4.1 Determining the Exact Privilege Level

As mentioned, the exact privilege level of a node at a given point in time is determined by its reputation score. Reputation Systems (RS) work by obtaining feedbacks about a node from other nodes it has transacted with in the past. The underlying principle is that a node that has behaved well in the past is likely to behave well in the future, and a node that has misbehaved in the past is likely to misbehave in the future. Such a system is utilized by eBay with its seller rating, where previous buyers leave feedback about the seller so that prospective buyers can decide whether to buy from the seller or not [72]. A similar system is also used by Google's PageRank, where web pages are assigned importance based on the number of other web pages

that link to it, effectively giving a positive recommendation to it. Pages with higher ranks are shown higher up in Google's search results [73].

In HTMS, a server that needs to obtain the reputation score of the client broadcasts a feedback request throughout the network. Any node that has transacted with the client before responds to the request and sends back encrypted and digitally signed feedbacks to the server. The server decrypts the feedback and verifies its signature, and then computes a reputation score using any RS implementation. Although any RS can be used with HTMS [60, 61, 75, 76], we chose to use our previously proposed SVM based RS [14] for our experiments. The reasons for this choice are mentioned in Section. In any case, whichever RS the network administrator decides to use, the only requirement is that the RS output a reputation score between 0 and 1, with 0 being the least reputable. The RS needs to guard against the possibility of malicious nodes giving incorrect feedback in order to malign another node. In case the node has no history, the RS outputs a preset default score, for example 0.5. Once this reputation score is obtained, the final privilege level is computed using the following equation:

$$PL = (MaxPL - MinPL) \times RScore + MinPL \qquad \text{(eq. 1)}$$

where PL is the current privilege level, MaxPL and MinPL are the maximum and minimum privilege levels respectively, according to the role certificate, and RScore is the reputation score, as output by the RS. Equation 1 simply scales the reputation score into the range between the minimum and maximum privilege levels. An RScore of 0 yields the minimum privilege level, whereas an RScore of 1 yields the maximum privilege level. A server will grant access to the requested service if PL ≥ the minimum required privilege level for that service.
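The server's decision procedure and the privilege level computation of eq. 1 can be put together in a short sketch. The names `RoleCertificate`, `privilege_level`, and `grant_service` are hypothetical, as is the `signature_valid` flag, which merely stands in for real verification of the administrator's signature. The expiry check is an assumption that follows from the definition of the certificate's expiration time; it is not one of the numbered steps.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RoleCertificate:
    node_id: str
    role: str               # for human consumption only; never used below
    max_pl: float
    min_pl: float
    expiration_time: float
    signature_valid: bool   # placeholder for real signature verification


def privilege_level(cert: RoleCertificate, rscore: float) -> float:
    """Eq. 1: scale a reputation score in [0, 1] into [min_pl, max_pl]."""
    return (cert.max_pl - cert.min_pl) * rscore + cert.min_pl


def grant_service(cert: RoleCertificate, required_pl: float,
                  get_rscore: Callable[[str], float], now: float) -> bool:
    """Decide whether the presenting node may use a service."""
    # Step 1 (plus the assumed expiry check): reject invalid certificates.
    if not cert.signature_valid or now >= cert.expiration_time:
        return False
    if required_pl > cert.max_pl:      # step 2: out of reach for this role
        return False
    if required_pl < cert.min_pl:      # step 3: guaranteed by the role
        return True
    # Steps 4 and 5: only now is the (costly) reputation query needed.
    return privilege_level(cert, get_rscore(cert.node_id)) >= required_pl


cert = RoleCertificate("node-x", "major", max_pl=0.8, min_pl=0.2,
                       expiration_time=1000.0, signature_valid=True)
print(grant_service(cert, 0.5, lambda node_id: 0.9, now=10.0))
print(grant_service(cert, 0.5, lambda node_id: 0.1, now=10.0))
```

Passing the reputation lookup as a callable reflects the ordering of the steps: the network-wide feedback request is issued only when the role certificate alone cannot settle the decision.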

6.4.2 Merits of HTMS

Thwarting Global Attacks: The rationale behind HTMS is to find a solution to the problem where a privileged node is compromised by an attacker and then used to damage the network in some way. This could be by trying to attack other nodes, or by conducting malicious transactions with servers. One transaction on its own may not seem malicious, but when a combination of potentially harmful transactions occurs globally throughout the network, the results could be disastrous. For instance, one update by a colonel to battle plans may not seem harmful, but a series of small updates over time may change the battle plans entirely. HTMS aims to thwart such attacks by collecting data from multiple parties that have transacted with the node before, and then deciding whether to grant access to this node using a global picture. For this to work, each server assigns a feedback score between −1 and 1 to every transaction that takes place on the server. Potentially harmful transactions, such as updates to battle plans or changes to passwords, are assigned a negative feedback score, whereas harmless transactions, such as reading or printing non-classified files, are assigned positive feedback scores. Similarly, transactions deemed to be malicious, such as uploading viruses or worms, or trying to initiate a buffer overflow attack, will be given negative feedback scores. All of these feedbacks are then used by the Reputation System to compute the overall reliability of the node, which in turn is used to determine access rights. A node that has conducted many potentially harmful transactions in the recent past will have a low reputation score and service will be denied to it, even if it possesses a privileged role, thus thwarting any potential global attacks. At the very least, the attacker will have to slow down the attack, giving more time to the authorities to detect and defeat the attacker.

Modeling Real World Scenario: The scheme models the real world where an employee is given access rights based on her job title. At first, she is given a certain level of privilege. If she is observed to abuse her rights, some of her privileges are revoked. On the other hand, if she behaves well and gains the trust of her colleagues, she is rewarded with progressively higher access rights. This increase in rights may also be based on the fact that she will become more experienced over time and less likely to abuse the system inadvertently. Eventually, she may gain more and more trust and be promoted to a better position. Our scheme follows this model and it is therefore intuitive.

Defense Against False Evidence: By maintaining a Reputation System, we can deter a malicious node, M, from continuously presenting false evidence against X. Because the Reputation System is based on evidence presented by several nodes, no single node can falsely incriminate X to significantly reduce its access rights. In any case, X's rights can never go below the minimum level within its role. Furthermore, if M repeatedly gives a low score to other nodes, it will hurt M's own reputation and diminish its own access rights.

Decentralized System: No single node holds all the evidence regarding any node. Therefore, if any one or a few nodes go offline the system can still function well. The Reputation System may be marginally affected since the contribution from the offline nodes will be missing, but it will usually not be critical if only a few nodes are offline.

Updating Roles: Periodically, the administrators responsible for role assignment can look at a node's past behavior and reconsider the role assigned to it. They could demote or promote a node to a different role (or job title) based on the behavior. Since each role credential carries an expiry date and time, temporary roles could also be assigned to guest entities for a predetermined amount of time.

Historical Record Keeping: Because the system stores a history of past transactions and their feedbacks, each node is encouraged to always behave well so that it may retain or upgrade its access rights. This is similar to the credit score system maintained by credit agencies for individuals in the United States.

6.5 Evaluating HTMS

6.5.1 Simulation Setup

We used simulations to evaluate the efficacy of HTMS. We simulated the scenario where a node, x, is present within a large network of 1,000 nodes. The behavior of node x varies randomly with time. Time is measured in units of days from 1 to 365, i.e. one year. The behavior of a node on any given day is quantified as the proportion of legitimate or non-risky transactions it conducts during that day, relative to the total number of transactions conducted by the node during the day. Each transaction is between node x and any other random node on the network. A randomly generated behavior curve that fluctuated slowly over time was used as the basis of the experiments. The behavior curve is illustrated in Fig. This curve simulates the behavior over time of an actual node that may have been compromised.

Privilege levels are defined on a scale of 0 to 1, where 0 represents no access rights to any resource on the network and 1 represents complete access to all available resources. In our simulation, the system administrator has assigned to node x a certain role, with a maximum privilege level of 0.8 and a minimum privilege level of 0.2. The actual privilege level at any given point in time may be anywhere within this range, but under no circumstances will the privilege level be higher than 0.8, or lower than 0.2. Other roles may have other corresponding minimum and maximum levels assigned by the administrator.

In the ideal case, we would like the privilege level at any given time to be proportional to the behavior of the node at that time. If the behavior is 1, the privilege level should be maximum, i.e., 0.8. If the behavior is 0, the privilege level should be minimum, i.e., 0.2. If the behavior is between 0 and 1, the privilege level should be between 0.2 and 0.8 correspondingly. The ideal curve for the simulated behavior is illustrated in Fig. 6.1.

Figure 6.1. Randomly generated behavior of a node vs. the corresponding ideal response curve, along with maximum and minimum privilege levels.

Our goal is to construct a system that will mimic the ideal curve as closely as possible. The ideal curve, therefore, represents the yardstick against which an implementation of the system can be compared. Of course, in an actual setting, since we would not know the exact behavior of a node, we would need to estimate it using a Reputation System as defined in Section
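One way to read the ideal response described above is as a linear interpolation between the role's minimum and maximum privilege levels. The sketch below assumes that linear form; the text fixes only the two endpoints (behavior 0 maps to 0.2, behavior 1 maps to 0.8) and states that intermediate values fall correspondingly in between. The function name is hypothetical.

```python
def ideal_privilege(behavior: float, role_min: float = 0.2, role_max: float = 0.8) -> float:
    """Map a behavior score in [0, 1] to the ideal privilege level.

    Assumption: the mapping is linear between role_min and role_max.
    """
    if not 0.0 <= behavior <= 1.0:
        raise ValueError("behavior must lie in [0, 1]")
    return role_min + behavior * (role_max - role_min)


if __name__ == "__main__":
    for b in (0.0, 0.25, 0.5, 1.0):
        print(b, round(ideal_privilege(b), 3))
```

Under this assumption, the ideal curve is a scaled and shifted copy of the behavior curve, which matches the description of the two curves in Fig. 6.1.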

6.5.2 Experiments and Results

We used an implementation of the Support Vector Machine (SVM) based Reputation System (RS) that we proposed in a previous work [14], for the following reasons:

1. This RS has been shown to perform well with varying patterns of malicious behavior and varying proportions of malicious nodes.
2. It protects against fake feedback about transactions that never really occurred.
3. Since it is based on machine learning and SVM, it is easy to construct the RS model and automatically determine the model's parameters if training data is available [62].

We generated the training data using simulations on a different behavior curve. The proportion of malicious nodes in the network was varied in different simulations to obtain different training sets. A malicious node is defined as a node that lies and gives incorrect feedback about a node in an attempt to either decrease its reputation, or to increase it if the node is another colluding malicious node. The training sets were then used to train the SVM.

The test sets were generated using the behavior curve illustrated in Fig. 6.1. Feedback was taken from the nodes in the network that had transacted with node X. Again, the proportion of malicious nodes in the network was varied between 0% and 70% so that the feedback was not always reliable. Each training and test instance consists of feedback obtained over the last seven days (i.e., one week). Accordingly, the privilege levels are automatically updated every seven days so that a given privilege level is valid for one week.

In the first set of experiments, we used a training set consisting of 0% malicious nodes to train the SVM. Then we tested this model against five different test sets consisting of 0% to 70% malicious nodes. The output of the model for each test set over time is plotted in Fig. 6.2. The figure shows that the model closely mimics the ideal curve when the proportion of malicious nodes in the test set is also 0%, the same as in the training set. For other proportions, the output deviates from the ideal curve, becoming almost a horizontal line at 50%. Above 50%, it becomes a mirror image of the ideal curve, increasing when the ideal curve decreases and vice versa. This is because above 50%, a majority of the nodes lie about the feedback, giving good feedback when the node is bad and bad feedback when the node is good. This malicious majority overwhelms the feedback from the minority of legitimate nodes, leading the SVM to reverse its output. At 50%, neither malicious nor legitimate nodes can overwhelm each other, so the SVM produces a constant output of approximately 0.5.

Figure 6.2. Effect of varying proportions of malicious nodes in test sets. Training set has 0% malicious nodes.

We quantified the deviations of the test curves from the ideal curve by measuring the average overestimation, when the system grants more privilege than it should, and the average underestimation, when the system grants less privilege than it should. The daily averages of the over- and underestimations are plotted in Fig. 6.3. The sum of the over- and underestimation gives the overall average discrepancy from the ideal curve. As expected, the smallest discrepancy occurs when the percentage of malicious nodes in the test set is 0%, the same as in the training set. At 0%, the discrepancy is only about 0.008, meaning that on average the system is off from the ideal privilege level by about that amount. However, the discrepancy can be as high as 0.27 when 70% of the nodes are malicious.

Figure 6.3. Average daily discrepancy from the ideal curve vs. percentage of malicious nodes in the network (smaller is better). Training set has 0% malicious nodes.

We hypothesized that the discrepancy for a given test set would be minimized when the percentage of malicious nodes in the test set is the same as in the training set. To test this hypothesis, we conducted further experiments, this time using different percentages of malicious nodes in the training sets. For the second set of experiments, we used 30% malicious nodes in the training set and repeated the experiments with the same test sets as before. The results are shown in Figs. 6.4 and 6.5. Then we repeated the experiments with 70% malicious nodes in the training set. The results are shown in Figs. 6.6 and 6.7. The results support our hypothesis that the discrepancy is minimized when the percentage of malicious nodes in the test set is the same as in the training set.

Figure 6.4. Effect of varying proportions of malicious nodes in test sets. Training set has 30% malicious nodes.
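The over- and underestimation measures used in these experiments can be sketched as follows. This is a minimal illustration rather than the code used to produce the figures: the function name is hypothetical, and it assumes that both averages are taken over all days in the series, so that their sum equals the mean absolute deviation from the ideal curve.

```python
from typing import Sequence, Tuple


def discrepancy(actual: Sequence[float], ideal: Sequence[float]) -> Tuple[float, float, float]:
    """Return (average overestimation, average underestimation, overall discrepancy).

    Overestimation: the system grants more privilege than the ideal level.
    Underestimation: the system grants less privilege than the ideal level.
    Assumption: both are averaged over every day in the series.
    """
    if len(actual) != len(ideal) or len(actual) == 0:
        raise ValueError("series must be non-empty and of equal length")
    days = len(actual)
    over = sum(max(a - i, 0.0) for a, i in zip(actual, ideal)) / days
    under = sum(max(i - a, 0.0) for a, i in zip(actual, ideal)) / days
    return over, under, over + under


if __name__ == "__main__":
    granted = [0.5, 0.7, 0.2]   # privilege levels produced by the system
    target = [0.4, 0.8, 0.2]    # ideal privilege levels for the same days
    print(discrepancy(granted, target))
```

Keeping the two components separate shows whether a model errs toward granting too much privilege, which is a security risk, or too little, which is a usability cost.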

Figure 6.5. Average daily discrepancy from the ideal curve vs. percentage of malicious nodes in the network. Training set has 30% malicious nodes.

Figure 6.6. Effect of varying proportions of malicious nodes in test sets. Training set has 70% malicious nodes.

Figure 6.7. Average daily discrepancy from the ideal curve vs. percentage of malicious nodes in the network. Training set has 70% malicious nodes.

6.6 Estimating Percentage of Malicious Nodes

Based on our results, we can conclude that in order to minimize the discrepancy, we need to train the SVM model using a training set that has approximately the same percentage of malicious nodes as the test set. Unfortunately, in a real-world setting, we do not know what the percentage of malicious nodes actually is, and it may vary with time. To work around this problem, we propose generating several SVM models using different percentages of malicious nodes, for example 0%, 10%, 20% and so on. Then, we need to estimate the percentage of malicious nodes in the network so that we can apply the appropriate model.

Estimation of the percentage of malicious nodes can be done by sampling. To begin with, a node uses a default SVM model, for instance, the one with 10% malicious nodes.
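The idea of keeping several pre-trained models and applying the one that matches the estimated percentage of malicious nodes can be sketched as a simple lookup. Everything below is an assumption made for illustration: the function name, the nearest-percentage rule, and the tie-break toward the lower percentage are not taken from the text, and the sketch does not cover how the estimate itself is obtained by sampling.

```python
from typing import Optional, Sequence


def select_model_percentage(
    available: Sequence[int],
    estimate: Optional[float],
    default: int = 10,
) -> int:
    """Pick which pre-trained model to apply, identified by its training percentage.

    available: percentages of malicious nodes the models were trained with.
    estimate:  current estimate of the percentage, or None if no estimate exists yet.
    default:   model used before any estimate is available (10% in the text's example).
    """
    if not available:
        raise ValueError("at least one model is required")
    if estimate is None:
        if default not in available:
            raise ValueError("default model is not available")
        return default
    # Assumption: choose the nearest training percentage; ties go to the lower one.
    return min(available, key=lambda p: (abs(p - estimate), p))


if __name__ == "__main__":
    models = [0, 10, 20, 30, 40, 50, 60, 70]
    print(select_model_percentage(models, None))   # no estimate yet: default model
    print(select_model_percentage(models, 33.0))   # nearest trained percentage
```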
