The Seed Attack: Can BitTorrent be Nipped in the Bud?

The Seed Attack: Can BitTorrent be Nipped in the Bud? Prithula Dhungel, Xiaojun Hei, Di Wu, Keith W. Ross Polytechnic Institute of NYU, Brooklyn, NY 1121 Huazhong University of Science and Technology, P. R. China Email: pdhung1@students.poly.edu, heixj@ieee.org, dwu@poly.edu, ross@poly.edu Abstract We study a natural and potentially devastating attack against BitTorrent, namely, attacking the initial seed in a torrent s early stages. The goal of this attack is to diminish the seed s ability to upload blocks. If the attacker can discover and react quickly enough to the new torrent, it can possibly nip the torrent in its bud, preventing all of the leechers from obtaining the entire file. We consider two natural seed attacks: the bandwidth attack and the connection attack. We take a three-prong approach to analyze these attacks. First, we actually launch and measure the attacks using popular BitTorrent seeds (Azureus, utorrent, and BitTornado). To this end, because we do not want to interfere with torrents in the wild, we have created our own private torrents within PlanetLab. Second, to gain insight into our empirical results, we carefully analyze the connection management and seeding algorithms in open-source BitTorrent seeds. Third, we construct a simple fluid model which provides additional insights into the empirical results. We also study how torrents can be discovered in their early stages. The observations and conclusions in this paper can help P2P developers to design highly-resilient P2P systems. I. INTRODUCTION Over the past several years, the music industry has been aggressively attempting to curtail the distribution of targeted musical recordings over P2P file sharing networks. These attempts include numerous law suits against P2P file sharing companies (against Napster, Kazaa and many others), tracking and suing users of P2P file sharing systems [1], and most remarkably, launching large-scale Internet attacks against the P2P systems themselves. The large-scale Internet attacks have been performed by specialized anti-p2p companies, also known as interdiction companies (e.g., Media Defender [2], Safenet [3] and Macrovision [4]), working on the behalf of the RIAA and specific record labels. Several studies showed that these attacks were successful at severely impeding the distribution of targeted contents over several P2P file sharing systems, including FastTrack/Kazaa, Overnet/eDonkey, and Gnutella/Limewire [5] [7]. Today, BitTorrent is one of the most popular P2P file distribution protocols, particularly for the distribution of large files such as high-definition movies, television series, record albums, and open-source software distributions [8]. Unlike Napster and Kazaa, BitTorrent uses an open protocol, which has been implemented by more than 5 different clients [9]. The BitTorrent client is thus not developed and distributed by a single company, which can be easily targeted for a lawsuit. Also included in the BitTorrent ecosystem are torrentdiscovery (e.g., PirateBay and Mininova) and peer-discovery services (trackers, DHTs, and gossiping). Not surprisingly, the music, film, and television industries have begun to hire anti-p2p companies to curtail the distribution of assets in BitTorrent [1], [11]. It is important to understand just how vulnerable the Bit- Torrent ecosystem is to large-scale, resource-intensive attacks for several reasons. First, BitTorrent has proven to be extremely effective at distributing large files, including opensource software distributions. According to Evidenzia [12], the number of files distributed on BitTorrent almost tripled from 26 to 27, and that the visits to major torrent location sites PirateBay and Mininova doubled over this same period [13]. Given BitTorrent s large footprint in the Internet, it is of interest to know whether it is vulnerable to attack and if successfully attacked whether its impact on the Internet will be consequently reduced. Second, having proven itself capable of massive-scale file distribution, BitTorrent now serves as a blueprint for other P2P systems, including live streaming [14] and on-demand applications [15]. Understanding Bit- Torrent s vulnerabilities will therefore shed insights into the vulnerabilities of a broader class of P2P systems. Third, the music and film industries are clearly interested in whether network attacks can successfully curtail file distribution over BitTorrent. If not, they can abandon financing the attacks and investigate other solutions and business models. In an earlier paper [16] we studied ongoing attacks against leechers, where interdiction companies attempt to slow leecher downloads by sending fake blocks and monopolizing leecher TCP connections. Our results showed that these attempts by the interdiction companies were unsuccessful and that BitTorrent is inherently resilient to leecher attacks. In this paper we study a natural and potentially devastating attack, namely, attacking the initial seed in the early stages of a torrent. The goal here is to diminish the seed s ability to upload blocks. If the attacker can discover and react quickly enough to the new torrent, it can possibly nip the torrent in its bud, preventing all of the torrent s peers from obtaining the entire file. We consider two natural seed attacks: the bandwidth attack and the connection attack. Our contributions are as follows: We actually launch and measure the attacks using popular BitTorrent seeds (Azureus, utorrent, and BitTornado). In designing the attack experiments, we didn t want to in-

2 terfere with existing torrents in the wild, but nevertheless wanted to attack real BitTorrent seed implementations. To this end, we created our own private torrents on PlanetLab nodes including torrent leechers, attack peers, and a tracker and used this environment to attack Azureus, utorrent, and BitTornado seeds running in our university campus network. To our knowledge, this is the first study to employ PlanetLab as part of a methodology to evaluate the effectiveness of attacks on Internet applications. As part of this environment, we have instrumented a BitTorrent attack client by modifying the code of BitTornado, and have developed a BitTorrent traffic analyzer, enabling us to track the rate at which the seed transmits to the leechers and attackers during the attack. To gain insight into our empirical results, we analyze the connection management and seeding algorithms in two open-source BitTorrent seeds (Azureus and BitTornado). This is of independent interest, as to our knowledge, this is the first detailed description of connection management algorithms for real world BitTorrent clients. To gain further insight into the empirical results, and to guide our attack scenario planning, we construct a simple fluid model for attacks against seeds. The model takes into account seed upload bandwidth, the leechers and attackers download bandwidths, the number of leechers and attackers, and the seed s limit on upload slots. We also investigate how quickly newly released torrents can be discovered, by crawling torrent-discovery web sites and from RSS feeds. As part of our findings, we have learned that some BitTorrent connection-management and seeding algorithms are remarkably resilient to attacks, whereas others are extremely vulnerable. Because BitTorrent serves as a blueprint for many P2P video streaming applications, the observations and conclusions in this paper should be instrumental for designing attackresilient BitTorrent and video streaming applications in the future. This paper is organized as follows. We end this section with a summary of the related work. In Section II we describe the two types of seed attacks: bandwidth attack and connection attack and also describe the experimental setup for our attack experiments. In Sections III and IV we provide the details of seeding and connection management algorithms used by Azureus and BitTornado respectively; we also present and analyze the results obtained for various experiments involving bandwidth and connection attacks. Since both these attacks require the attacker to be able to detect the torrent and identify the seed when the torrent is in its initial stage, in Section V we perform a measurement study of how quickly one can detect newly uploaded torrents and identify the seeds therein. Finally, we conclude the paper in Section VI. A. Related Work Recently, there has been a number of studies on pollution and poisoning attacks on second-generation P2P file sharing systems (such as Kazaa and edonkey) [5] [7], [17]. A measurement study concluded more than 5% of the files in Kazaa to have been polluted [5]. Similarly, the index poisoning attack, wherein an attacker advertises an enormous number of bogus sources for a targeted content, was highly pervasive in the FastTrack and Overnet DHT (edonkey) networks [7]. To date there has been little work on the resilience of BitTorrent to attacks. A recent discrete-event simulation study of BitTorrent examined two attacks: an attack involving tampered buffer maps, and a connection monopolization attack [18] against leechers. This simulation study assumed idealized BitTorrent clients using an idealized protocol. (In practice, developers of BitTorrent clients deviate significantly from the textbook BitTorrent descriptions.) To gain meaningful insights into BitTorrent vulnerabilities, measurement studies with realworld BitTorrent clients are required. In a previous paper [16] we investigated ongoing attacks on BitTorrent leechers. To the best of our knowledge, the current paper is the first study of attacks on BitTorrent seeds. There has been a number of studies on how P2P systems can be used as engines for DDoS attacks against arbitrary hosts in the Internet [19] [21]. There has also been a number of studies on whether BitTorrent s tit-for-tat incentive mechanism is sufficient for preventing freeriders from downloading files [22] [24]. These two classes of attacks (DDos attacks from P2P systems and free rider attacks) are orthogonal to the attacks considered in this paper. II. SEED ATTACKS AND EXPERIMENTAL ENVIRONMENT In our research we have discovered that different versions of BitTorrent clients use a variety of different seeding algorithms, including priority to highest-downloaders, round-robin across all connections, and combinations. However, most clients today use some variation of the traditional bandwidth-first algorithm. In the traditional bandwidth first algorithm, the seed uploads (unchokes) to at most n leechers where n is typically set to 4 or 5. The seed measures the rate at which it uploads to these n unchoked leechers. Periodically (typically every 3 seconds), it chokes the leecher to which it is uploading at the slowest rate, and replaces it with a randomly selected leecher (optimistically unchokes). Below we provide a high level overview of two natural seed attacks considered in this paper. We also provide a simple fluid model for the bandwidth attack. A. Bandwidth Attack In the bandwidth attack, the attacker attempts to consume the majority of the seed s upload bandwidth, leaving as little bandwidth as possible for the legitimate peers. To this end, the attackers connect to the seed and try to download at relatively high rates, so that they occupy most of the seed s unchoke slots. We now provide a simple fluid model for the bandwidth attack. At any given instant of the time, the seed will have TCP connections to both legitimate leechers and attackers. Let N A denote the number of connections to attack peers, and N L

3 denote the number of connections to legitimate peers. Suppose all the attackers have the same downstream bandwidth of b A ; and all the legitimate peers have the same downstream bandwidth of b L. For a bandwidth attack, it is reasonable to assume that b A > b L, that is, an attacker has more bandwidth resources than a typical legitimate peer. Denote the seed upstream bandwidth by u. Finally, let n denote the seed s maximum number of upload slots. Among the n upload slots, at any given time, n 1 slots are used by regular unchoked peers, and 1 slot is used by the current optimistic unchoked peer. In the context of this high-level fluid model, we now determine how much of the seed s bandwidth is obtained by the legitimate peers. First observe that each unchoked legitimate peer receives bits from the seed at a rate bounded by min{b L, u/n}; similarly, each unchoked attack peer receives bits from the seed at a rate bounded by min{b A, u/n}. There are two cases: b A > b L u/n: In this case, each unchoked peer receives at rate u/n, so the attackers have no advantage over legitimate peers in the Bandwidth First seeding algorithm. Assuming that peers are chosen at random from the N L + N A peers for optimistic unchoke, the long-run fraction of seed bandwidth obtained by the attackers is determined by the fraction of attackers among connected peers, that is, N A /(N L +N A ), and the amount bandwidth available to legitimate peers is un L /(N L + N A ). b A u/n > b L : Here an unchocked legitimate peer receives at rate b L where each unchocked attack peer receives at least at rate u/n. Thus, under the Bandwidth First seeding algorithm, all the regular unchoked slots are taken by attackers. Legitimate peers can only get optimistically unchoked. The fraction of seed upstream bandwidth obtained by attackers is thus at least n 1 n, and the amount of bandwidth available to legitimate peers is at most b L. This model will be useful in guiding the design of our attack experiments and interpreting the measurement results. B. Connection Attack A seed has about 5 connection slots (typically configurable). So if c is the number of connection slots, the seed will maintain TCP connections with up to c peers, with up to n of the c peers being unchoked for uploading. In the connection attack, the attacker attempts to consume the majority of the seed s connection slots. For it to be successful, the attacker needs to detect the torrent in its early stages and grab the majority of the connection slots before, leaving few connection slots for legitimate peers. To ensure that the seed maintains the connections, the attackers may need to occasionally download some blocks. In the context of real-world seeding and connectionmanagement algorithms, we will describe the bandwidth and connection attacks in greater detail in the subsequent sections. C. Attack Environment We seek to determine whether popular BitTorrent seed implementations, along with their seed and connectionmanagement algorithms, are vulnerable to bandwidth and connection attacks. To this end, we could attempt to detect new torrents in the wild, launch attacks against the seeds in those torrents, and measure the effectiveness of the attacks. We prefer, however, not to interfere with torrents in the wild. An alternative approach is to use discrete-event simulations to study the effectiveness of the seed attacks using idealized seed models. However, since popular seed implementations (such as Azureus and utorrent) often deviate significantly from the idealized models (such as bandwidth-first and roundrobin), simulation results would not be truly conclusive. To this end, we create our own private torrents on PlanetLab nodes including torrent leechers, attack peers, and a tracker and use this environment to attack Azureus, utorrent, and BitTornado seeds running in our university campus network. To our knowledge, this is the first study to employ PlanetLab as part of a methodology to evaluate the effectiveness of attacks on Internet applications. Our attack environment consists of a single seed sharing a file, 3 (legitimate) leechers, a single tracker, and a number of attack peers. The seed is located on our university network. The tracker, 3 leechers, and attackers are located on different PlanetLab nodes. For experiments that require a large number of attackers, we run multiple attacker instances per PlanetLab node. In order to evaluate the effectiveness of the attack experiments, for each experiment set, we also downloaded the file at the leechers without the presence of attackers and compared the average time required at the 3 leechers for downloading the file. We created the attacker by modifying the source code for BitTornado.3.17. Specifically, we commented out the portion of the code responsible for querying the tracker and added code to directly make a TCP connection with the seed. Thus, the attacker works by directly making TCP connection to the seed, downloading blocks from it, and never forwarding blocks to any other peers. The leechers and the tracker used the BitTornado.3.17 client. Furthermore, in order to simulate a real world scenario in the torrents, the leechers were capped with a maximum upload bandwidth of 48 KB/sec and a maximum download bandwidth of 192 KB/sec. For experiments involving the bandwidth attack, attackers had no bandwidth limitations (limited only by the bandwidth of the PlanetLab nodes they were running at). Since the main idea behind the connection attack is to hinder file distribution without using a lot of bandwidth, for experiments involving connection attack, attackers were capped with a maximum upload bandwidth of 1 KB/sec and a maximum download bandwidth of 2 KB/sec. We attacked three different seeds - Azureus (Vuze) 3.1.1, utorrent 1.7.7, and BitTornado.3.17. Azureus and utorrent are among the most popular seeds in the wild today [25], whereas Azureus and BitTornado are open source, which greatly facilitates analysis. Because utorrent gave results

4 similar to BitTornado, in this paper we only present results for Azureus and BitTornado. All the experiments simulated a flash crowd effect, in that, 5 leechers are started first; after all 5 are connected to the seed, all the attackers are started, followed by the remaining 25 leechers. In these experiments, we are exploring an advantageous scenario for the attackers, in which they quickly detect and join the new torrent. For bandwidth and connection attack experiments, we used file sizes 1 MB and 5 MB, respectively. For each experiment, the number of unchoke slots at the seed was set to 4. Table I summarizes the parameter settings of the experiments (AZ and BT refer to Azureus and BitTornado respectively). Also as part of the attack environment, we developed a BitTorrent analyzer, which sniffs, parses, and analyzes the packets sent by the seed. The analyzer enables to determine the amount of bandwidth the seed provides to the leechers and attackers over time. III. EXPERIMENTS AND RESULTS: AZUREUS We now investigate the vulnerability of Azureus s seeding and connection-management algorithms. We choose to study Azureus because it is one of the most popular BitTorrent seeds (as well as clients), and its code is open source. To interpret the empirical results, we carefully examined the source code of Azureus (Vuze) 3.1.1. to understand its seeding and connection-management algorithm. We now briefly describe those algorithms. We believe that this is of independent interest as, to our knowledge, this is the first description of connection-management algorithms in real-world popular BitTorrent clients. A. Seeding Algorithm By carefully examining the source code, we determined that Azureus uses a variation of bandwidth first as its seeding algorithm. Every time a new optimistic unchoke is performed, the n unchoked peers are updated. The Azureus seed creates two sorted lists from the current unchoke list: one is an ascending ordered list of peers based on their download rates and the other is a descending ordered list of peers based on the amount of data downloaded by each. The peers are again sorted based on the sum of their indices in the above two lists to form an ordered unchoke list. Thus, the peers in the current unchoke list that have higher download rates as well as fewer bytes downloaded from the seed will be near the top of the unchoke list. The peer at index n in this new list is then choked to make room for the optimistic unchoke. When selecting a peer for the optimistic unchoke, a peer is randomly chosen from the current list of connected peers. B. Connection Management Algorithm The state diagram in Fig. 1 shows how the state of a peer transits from one state to another as a result of the connection management algorithm. The connection management in Azureus is conducted for incoming connections and outgoing connections, respectively. The Azureus seed allocates a single thread to continuously listen and accept new incoming connections. As soon as a connection request is received, it is accepted immediately if the total number of connection slots currently occupied has not reached the maximum limit c. A peer is thus connected and enters the connected state. Azureus also maintains a finite sized queue called Discovered Peers. This queue contains the peers that have been discovered using methods like crawling the tracker, DHT, and so on. The peers which are only known to the seed via incoming connection requests are not considered to be discovered peers. In the management of outgoing connections, peers are removed from the current connection list to make room for other potentially better connections and new peers are chosen whenever connection slots are available. Every second, if there is a free connection slot, the seed initiates a connection to the oldest peer on the discovered list, performing an Optimistic Connect. The oldest discovered peer is the one that stays at the front of the Discovered Peers queue. Whenever the size of the queue exceeds the maximum (5), the peer from the front of the queue is dropped; hence, this peer enters the undiscovered state. When working in the seeding mode, for each connected peer, the Azureus seed keeps track of the time when data was last sent to the peer. Every 3 seconds, if all connection slots are full, the operation of Optimistic Disconnect is performed. For every peer in the connection list, the time, when data was last sent to it, is determined and the one, for which it has been the longest after data was sent last (provided that data has been sent to the node at least once), is selected. If this value is greater than 5 minutes for the peer, the connection to this peer is closed by the seed; hence, this peer enters the undiscovered state. In this way, the seed makes room to accept new connections. A connected peer just occupies a connection slot in the seed but does not receive any data. Once a connected peer is unchoked by the seed, it becomes one of the n unchoked peers. An unchoked peer returns to the connected state as soon as it receives a choke message from the seed. Fig. 1. Tracker, DHT Discovered Discovered queue full Undiscovered Optimistic Connect Optimistic Disconnect Incoming Connection Connected Unchoke Choke Unchoked State Diagram for Connection Management in Azureus C. Bandwidth Attack Results Table II shows the results for bandwidth attacks against Azureus seeds. In this table, Delay Ratio is the ratio of the

5 TABLE I PARAMETER SETTINGS OF THE SEED ATTACK EXPERIMENTS File Attacker Leecher Seed Type Size # of Bandwidth # of Bandwidth Unchoke Max Up Client Client Client (MB) Attackers (KB/sec) Leechers (KB/sec) Slots Connections Bandwidth Up Down Up Down (KB/sec) Bandwidth 1 BT 4 u/l u/l BT 3 48 192 AZ 4 5 6/1/14 Connection 5 BT 8 1 2 BT 3 48 192 AZ 4 5 6/1/14 Bandwidth 1/5 BT 4 u/l u/l BT 3 48 192 BT 4 5 5/1/15 Connection 5 BT 8 1 2 BT 3 48 192 BT 4 5 5/1/15 average time taken to download the file at the 3 leechers in the presence of attackers to that without attackers, with other experiment settings the same. We observe that even with the number of attackers as high as 4 and seed bandwidth as low as 6 KB/sec, the delay ratio achieved is less than 3. Recall that the seeding algorithm of Azureus prefers peers that have both high download rates and fewer bytes downloaded from the seed so far. Even if the attackers download from the seed at a high rate, because they will have downloaded more bytes already, they cannot always occupy the unchoke slots. The leechers still get unchoked at regular intervals thereby resulting in a low delay ratio. Furthermore, note the bandwidth settings for leechers and attackers. All the bandwidth experiments fall in the first scenario in Section II-A (b A > b L u/n), for which attackers do not achieve any special advantage over the leechers when fighting for unchoke slots. (We will discuss the other case subsequently.) Notice that increasing the seed bandwidth decreases the delay ratio. This is not surprising because it is more difficult to attack high bandwidth seeds. Fig. 2 shows the bandwidth sharing between attackers and leechers at the Azureus seed (1 KB/sec) in a typical bandwidth attack experiment. We observe that neither the attackers, nor the leechers monopolize the upload bandwidth of the seed. After the leechers finish download at around 5 minutes, all the bandwidth is occupied by the attackers. 1 8 6 4 2 1 2 3 4 5 6 7 total leecher attacker Fig. 2. Seed Bandwidth Distribution for Attackers and Leechers in an Azureus Bandwidth Attack Even in cases in the second scenario of bandwidth attacks (b A u/n > b L ), the attacks are not effective. First, the second unchoke criterion ensures that the (low bandwidth) leechers are able to grab unchoke slots from time to time. Second, to enter the second scenario, we need to increase the value of u to 8 Mbps. In this case, even if the leechers occasionally obtain optimistic unchoke slots, they achieve a download bandwidth as high as b L = 192 KB/sec when unchoked. This ensures that these leechers are able to quickly download the file. To verify this, we also performed an experiment with the BitTornado seed, which uses a pure bandwidth first algorithm as discussed in Section IV-A, with the seed upload bandwidth of 8 Mbps and other parameters the same. The delay ratio in this BitTornado case was still less than 3. TABLE II BANDWIDTH ATTACK RESULTS FOR AZUREUS (4 ATTACKERS, 1 MB FILE) Seed Uplink (KB/sec) Avg Download Time w/o Attackers (mins) Delay Ratio 6 36 2.24 6 36 2.58 1 35 1.37 1 35 1.45 14 34 1.8 14 34 1.4 1) Connection Attack Results: Table III shows the results for the connection attacks against Azureus seeds. All the attacks were very successful. Even for the cases with seed bandwidth as high as 14 KB/sec, the download at the leechers was completely stopped. Recall that for each of the experiments, 5 leechers were started at first, followed by all attackers, and then the remaining 25 leechers. In each case, the first 5 leechers obtain the connection slots to the seed but the rest of the connection slots (45) are all occupied by attackers. With the download progressing, the 5 leechers gradually became disconnected. After all 5 leechers were disconnected, none of the leechers, including the remaining 25 that were started after the attackers, could be re-connected to the seed; hence, the downloads at the leechers were stopped completely. To understand why the attack is successful, recall the connection management in the Azureus seed. Every 3 seconds, the seed checks the peers currently connected to find out the one which has not downloaded data from it for the longest time. If this time is greater than 5 minutes, the peer is disconnected. Also recall the seeding algorithm: peers get unchoked based on their download rate and the amount of data downloaded so far. Hence, even if the attackers have relatively lower bandwidth compared to the leechers, the second unchoke

6 TABLE III CONNECTION ATTACK RESULTS FOR AZUREUS (8 ATTACKERS, 5 MB FILE) Seed Uplink (KB/sec) Delay Ratio 6 All leechers were stuck at 1.2% 6 All leechers were stuck at 1.2% 1 All leechers were stuck at 22.3% 1 All leechers were stuck at 4.3% 14 All leechers were stuck at 44.9% 14 All leechers were stuck at 16.2% criterion - fewer amount of data bytes downloaded so far - helps them grab some upload slots. Since there are a large number of them (45 attackers versus 5 leechers), even if each of the attackers grabs only a few unchoke slots, these attackers can together prevent the leechers from obtaining slots for more than 5 minutes. This explains why one after another, the initial 5 leechers are disconnected from the seed as the operation result of Optimistic Disconnect. Note that the attackers may also be optimistically disconnected. However, since they have been configured to continuously attempt to connect to the seed, they quickly reconnect to the seed after the previous disconnection. This aggressive request nature of the attackers also helps them grab the connection slots previously held by the initial 5 leechers as soon as the leechers are disconnected from the seed. One may argue that for the low bandwidth seed case, the attack was more like a bandwidth attack since the relative bandwidth of attackers is high. To investigate this, we plot the bandwidth sharing of the seed bandwidth (14 KB/sec) between attackers and leechers in one typical experiment in Fig. 3. We observe that up to around 35 minutes from the start of the attack, leechers occupy most of the seed bandwidth while attackers receive only a small fraction. However, after 35 minutes, there is no seed bandwidth occupied by the leechers. At t = 35 minutes, all the initial 5 leechers have lost connections to the seed and all the 5 slots are occupied by the attackers. Even after t = 35 minutes, the attackers are still using very little bandwidth (roughly 2 kbps = 25 KB/sec). This is indeed a connection attack. 12 1 8 6 4 2 1 2 3 4 5 total leecher attacker Fig. 3. Seed Bandwidth Distribution for Leechers and Attackers in an Azureus Connection Attack Figure 4 shows, for the same experiment, the bandwidth distribution for each individual leecher and collectively for all attackers. We observe that the leechers are fighting with each other as well as the attackers for the seed bandwidth. None of the leechers can continuously occupy unchoke slots. This is a consequence of the second unchoke criterion - fewer bytes downloaded from the seed - that leechers get choked despite a high download rate. In this figure, each of the leechers reaches the state of not having received data for 5 minutes. This leads to each of leechers being Optimistically Disconnected from the seed one by one. 12 1 8 6 4 2 1 2 3 4 5 Attackers Leecher 1 Leecher 2 Leecher 3 Leecher 4 Leecher 5 Leecher others Fig. 4. Seed Bandwidth Distribution for Individual Leechers and All Attackers in an Azureus Connection Attack IV. EXPERIMENTS AND RESULTS: BITTORNADO We also performed experiments with a BitTornado seed. We choose to investigate BitTornado because it uses a pure bandwidth-first algorithm (unlike Azureus) and because its code is open source. We also performed extensive experiments with the popular (but closed-source) utorrent seed. The utorrent results were very similar to the BitTornado results reported below. A. Seeding Algorithm By carefully examining the source code, we determined that BitTornado uses a pure bandwidth first algorithm when seeding. Every time the unchoke list is to be updated, a new list called the preferred list is created. The list consists of highest-download rate interested peers in a descending order. The size of this list is then cut down to n 1 (n being the maximum number of unchoke slots), making room for optimistic unchoke. The final unchoke list contains all peers from the preferred list, one peer that is not preferred and is interested, and zero or more peers that are not preferred and are uninterested. Furthermore, every 3 seconds (default value), the current connection list is shuffled so that the top of the list contains an interested but choked peer. This shuffling of the connection list ensures selecting random peers for optimistic unchoke. Algorithm 1 provides the details of the seeding algorithm in BitTornado.

7 Algorithm 1 BitTornado Seeding Algorithm preferred = [] for c in Connections do if c.isinterested then preferred.append(c) end if end for preferred.sortdescending() n = getmaxunchokes() del preferred[n-1:] count = length(preferred) unchokelist = [] for c in Connections do if c in preferred then unchokelist.append(c) else if count n 1 or not hit then unchokelist.append(c) if c.isinterested() then count += 1 hit = True end if end if end for B. Connection Management Algorithm BitTornado is a fairly simple client without many complicated features. The connection management is also fairly simple and handles incoming and outgoing connections, respectively. For incoming connections, the client continuously listens and accepts new incoming connections. As soon as a request is received, the connection is established provided that the maximum number of connections has not been reached. For outgoing connections, BitTornado only supports querying the tracker(s) to retrieve the peer list. The client queries the tracker(s) at regular intervals (5 minutes by default) to harvest new peers in the swarm. If new peers are received and the maximum number of connection slots has not been filled out; then the local peer initiates connections to those peers. A BitTornado client also checks whether there are any dead connections; if so, it removes those dead connections from the connections list. A peer is considered dead if the data delivery to this peer fails for more than two times. 1) Bandwidth Attack Results: Table IV shows the delay results for bandwidth attacks against BitTornado seeds when the download file is of size 1 MB. We observe that even with the seed bandwidth as low as 5 KB/sec the attacks are not very effective. The seeding algorithm used by BitTornado is purely bandwidth first and favors high bandwidth peers over others. However, the bandwidth settings in all experiments are configured such that the upload seed bandwidth allocated to each unchoke slot (seed upload capacity/number of unchoke slots) is lower than the download capacities of both attackers and leechers (first scenario in Section II-A, b A > b L u/n). Hence, the attackers have no advantage over the leechers when fighting for unchoke slots at the seed. The only advantage the attackers have in these cases is that they occupy a larger number of connection slots at the seed and hence have a higher probability of being selected for an optimistic unchoke. TABLE IV BANDWIDTH ATTACK RESULTS FOR BITTORNADO (4 ATTACKERS, 1 MB FILE) Seed Uplink (KB/sec) Avg Download Time w/o Attackers (mins) Delay Ratio 5 39 1.37 5 39 3.9 1 34 1.12 1 34 1.6 15 33 1.5 15 33 1.2 Table V shows the results for bandwidth attacks against the BitTornado seed with same experimental settings except for the size of the download file. This time we use a larger file (5 MB). It is interesting to observe that in these experiments, the delay ratio values are higher than those for the smaller file size (1 MB). As discussed previously, the attackers only have an advantage over the leechers when being selected for an optimistic unchoke. However, because the file size is larger, the download duration is longer. TABLE V BANDWIDTH ATTACK RESULTS FOR BITTORNADO (4 ATTACKERS, 5 MB FILE) Seed Uplink (KB/sec) Avg Download Time w/o Attackers (mins) Delay Ratio 5 178 4.11 5 178 4.44 1 176 1.95 1 176 1.39 15 167 1.8 15 167 1.6 As further shown in Fig. 5, for the seed bandwidth of 15 KB/sec, despite their high download capacity, the attackers do not take advantage over the leechers in achieving high download rates. In the first 18 minutes in the experiment, the seed bandwidth is shared more or less equally among the attackers and leechers. After that, leechers complete their downloads and all the seed bandwidth is occupied by the attackers. C. Connection Attack Results Table VI shows the download delay results for connection attacks on BitTornado seeds. We launch 8 attackers in the experiments. We first note that for high seed bandwidths (1 KB/sec and 15 KB/sec), the connection attack is completely ineffective. This is a direct consequence of the bandwidth-first seeding algorithm used by BitTornado. For the high bandwidth seed, the bandwidth allocated per slot u/n is higher than the download rate of an attacker (2 KB/sec) but lower than that of the leechers. The 5 initial leechers that are connected to the seed always win over the attackers when fighting for unchoke

8 14 12 total leecher attacker 9 8 7 total leecher attacker 1 6 8 6 4 5 4 3 2 2 1 5 1 15 2 25 3 5 1 15 2 Fig. 5. Seed Bandwidth Distribution for Attackers and Leechers in a BitTornado Bandwidth Attack Fig. 6. Seed Bandwidth Distribution for Attackers and Leechers in a BitTornado Connection Attack slots. Out of the 4 unchoke slots, the leechers always occupy 3 of them. Hence, these 5 leechers can download the entire file quickly and redistribute it to the other 25 leechers. For the slower seed upload rate of 5 KB/sec, the attack is moderately effective with delay ratios in the vicinity of 9. This case actually corresponds to a combined bandwidth and connection attack, since the bandwidth of the attacker, although small, is now higher than the per slot bandwidth of the seed, u/n. In this situation, all peers are viewed equally by the bandwidth-first seeding algorithm. Because the ratio between the attacker number (connected to the seed) and the leecher number is high (9 : 1), the attackers are selected for optimistic unchokes most of the time. However, the leechers still get selected frequently enough to eventually obtain the entire file. TABLE VI CONNECTION ATTACK RESULTS FOR BITTORNADO (8 ATTACKERS, 5 MB FILE) Seed Uplink (KB/sec) Avg Download Time w/o Attackers (mins) Delay Ratio 5 178 9.52 5 178 8.99 1 176 1. 1 176 1. 15 167 1.3 15 167 1.3 Fig. 6 shows the results of the connection attack experiment with the seed bandwidth 1 KB/sec. Before the 5 MB file is completely downloaded by the leechers at about 18 minutes, most of the seed bandwidth is occupied by the leechers. The attackers only manage to receive a small portion of the seed bandwidth. However, they still grab the seed bandwidth of about 2 kbps together. Because there are many more attackers than leechers (9 : 1), the attackers can connect to the seed mostly via optimistic unchoke (u/n = 1/4 = 25 KB/sec = 2 kbps). In summary, both Azureus and BitTornado seeds are quite resilient against bandwidth attacks. The widely-deployed Azureus seed is, however, highly vulnerable to connection attacks. The BitTornado seed, using simpler and more conventional seeding and connection-management algorithms, is moderately vulnerable to combined connection and bandwidth attacks. V. DETECTING INITIAL SEED Both of the attacks discussed in previous sections require the attacker to join the torrent in its initial stage, identify the seed, and quickly launch the attack. In this section, we examine whether the attacker can be among the first peers to join a torrent. (Note that the attacker can determine the seed by requesting the buffer maps from each of the peers in the torrent.) In most of the popular torrent web sites (e.g., Pirate Bay and Mininova), each torrent file is indexed with an integer. For each new torrent that is uploaded, the integer is incremented. Therefore, by continuously monitoring the highest unused web page indices, one can detect new torrents being added to the web site. Most popular web sites also provide an RSS feed, allowing users to subscribe to new torrents under different categories. By subscribing to torrents falling under all categories, one can get updates of all new torrents added. We developed a monitoring program that continuously monitors a given torrent web site for new torrents being added, downloads the torrent file, parses it to get torrent details, and quickly gets the initial peer list from the tracker(s). We chose PirateBay as our test web site. We monitored the web site for 1 new torrents that were uploaded to the site. For PirateBay, we observed that the RSS web pages get updated a few minutes after the torrents are added. Hence, we used the web index crawling technique for the monitoring process. Figure 7 shows the CDF of the number of initial peers in the 1 new torrents at the time when we detected and joined the torrents. Specifically, the figure shows the CDF of the number of peers returned by the tracker(s) immediately after downloading and joining corresponding torrents. It can be seen that more than 9% of the torrents have less than 5 peers at the joining time. Our monitoring program could also detect (and join) most of the newly formed torrents in less

9 than a minute of their upload into the web site. Due to space limitations, we do not show the CDF of the detection (and joining) time in this paper. In conclusion, an attacker should be able to join most torrents in their early stages, and then attempt to take the initial seed, as described in the previous sections. CDF Fig. 7. 1.9.8.7.6.5.4.3.2.1 5 1 15 2 25 3 35 4 Swarm Size CDF of Swarm Size at Time of Torrent Join VI. CONCLUSION In this paper, we investigated two natural attacks against initial seeds in BitTorrent: the bandwidth attack and the connection attack. To study these attacks, we developed a simple fluid model, studied the source code of popular clients, and deployed an attack environment in PlanetLab. Our measurement results showed that bandwidth attacks are largely ineffective, usually not prolonging download times by more than 1% and never by more than a factor of 5. But our measurements also showed that current BitTorrent seeds are vulnerable to connection attacks. In particular, the connection attack was able to prevent the Azureus seed from distributing its file into the swarm; and the combined connection and bandwidth attack was able to significantly hinder file distribution when using a BitTornado seed with the pure bandwidth-first algorithm. We also performed a measurement study on how quickly attackers are able to detect newly uploaded torrents. By monitoring PirateBay, our torrent tracking utility is able to detect new torrents in the order of seconds and join the torrents before many other peers do. This demonstrates the feasibility on launching seed attacks against newly formed torrents. The main reason why the download at the leechers was completely stopped in Azureus connection attack is that the Optimistic Disconnect feature kicked away the leechers not being unchoked for more than 5 minutes. Had there been no such strategy, the connection attack in Azureus would have prolonged the download time; however, the leechers would eventually have finished download. Hence, one defense Azureus could use is to discard the Optimistic Disconnect strategy. However, in non-attack scenarios, this feature is desirable since it makes room for potentially better connections. In a connection attack, the attackers need to launch several attack peers that connect to the seed. For this, they need to use a number of nodes from the same subnet. One obvious defense then is not to allow more than a certain number of connections per subnet. This defense strategy also works for the combined connection and bandwidth attack, preventing a large number of high bandwidth attackers from connecting to the seed. Of course, this strategy does not work if attackers can distribute the attack nodes using botnets. However, since botnets are operated by illegal entities, it is unlikely that interdiction companies (commissioned by Fortune-5 companies) would get involved with them. In future work, we will explore various defense strategies against seed attacks. REFERENCES [1] A. Banerjee, M. Faloutsos, and L. Bhuyan, Is someone tracking P2P users? in Proc. IFIP Networking, Atlanta, GA, May 27. [2] Media defenders, http://www.mediadefender.com. [3] Safenet, http://www.safenet-inc.com. [4] Macrovision, http://www.macrovision.com. [5] J. Liang, R. Kumar, Y. Xi, and K. W. Ross, Pollution in P2P file sharing systems, in Proc. IEEE INFOCOM, Miami, FL, Mar. 25. [6] N. Christin, A. S. Weigend, and J. Chuang, Content availability, pollution and poisoning in file sharing peer-to-peer networks, in Proc. ACM EC, Vancouver, Canada, Jun. 25. [7] J. Liang, N. Naoumov, and K. W. Ross, The index poisoning attack in P2P file-sharing systems, in Proc. IEEE INFOCOM, Spain, 26. [8] J. A. Pouwelse, P. Garbacki, D. H. Epema, and H. Sips, The BitTorrent P2P file-sharing system: Measurements and analysis, in Proc. IPTPS, Ithaca, NY, Feb. 25. [9] BitTorrent client, http://www.wikipedia.org/wiki/bittorrent client. [1] Ziptorrent blacklist, http://torrentfreak.com/ziptorrent-pollutes-andslows-down-popular-torrents. [11] BitTorrent servers under attack, http://news.zdnet.com. [12] P2P monitoring systems, http://www.evidenzia.de/. [13] Bittorrent more popular than ever, releases triple in a year, http:// torrentfreak.com/bittorrent-more-popular-than-ever-719/. [14] X. Hei, C. Liang, J. Liang, Y. Liu, and K. W. Ross, A measurement study of a large-scale P2P IPTV system, IEEE Trans. Multimedia, vol. 9, no. 8, pp. 1672 1687, Dec. 27. [15] Y. Huang, T. T. Fu, D. Chiu, J. C. Lui, and C. Huang, Challenges, design and analysis of a large-scale P2P VoD systems, in Proc. ACM SIGCOMM, Aug. 28. [16] P. Dhungel, D. Wu, B. Schonhorst, and K. W. Ross, A measurement study of attacks on BitTorrent leechers, in Proc. IPTPS, FL, Feb. 28. [17] D. Dumitriu, E. W. Knightly, A. Kuzmanovic, I. Stoica, and W. Zwaenepoel, Denial-of-service resilience in peer-to-peer file sharing systems, in Proc. ACM SIGMETRICS, Alberta, Canada, Jun. 25. [18] M. A. Konrath, M. P. Barcellos, and R. B. Mansilha, Attacking a swarm with a band of liars: evaluating the impact of attacks on bittorrent, in Proc. IEEE P2P, Galway, Ireland, Sep. 27. [19] N. Naoumov and K. W. Ross, Exploiting P2P systems for DDoS attacks, in Proc. InfoScale, Hong Kong, May 26. [2] X. Sun, R. Torres, and S. Rao, DDoS attacks by subverting membership management in P2P systems, in Proc. Workshop on Secure Network Protocols (NPSec), Beijing, China, Oct. 27. [21] E. Athanasopoulos, K. G. Anagnostakis, and E. P. Markatos, Misusing unstructured P2P systems to perform DoS attacks: The network that never forgets, in Proc. 4th Int. Conference on Applied Cryptography and Network Security, Singapore, June 26. [22] S. Jun and M. Ahamad, Incentives in Bittorrent induce free riding, in ACM SIGCOMM Workshop P2PEcon, Philadelphia, PA, Aug 25. [23] M. Piatek, T. Isdal, T. Anderson, A. Krishnamurthy, and A. Venkataramani, Do incentives build robustness in BitTorrent? in Proc. USENIX NSDI, Cambridge, MA, Apr. 27. [24] M. Sirivianos, J. H. Park, R. Chen, and X. Yang, Free-riding in BitTorrent networks with the large view exploit, in Proc. IPTPS, Bellevue, WA, Feb. 27. [25] What are the good bittorrent software packages? http: //netforbeginners.about.com/od/peersharing/f/torrentclients.htm.