Adaptive throttling of Tor clients by entry guards

Roger Dingledine
arma@torproject.org
Tor Tech Report 2010-09-001
September 19, 2010

Abstract

Looking for a paper topic (or a thesis topic)? Here's a Tor research area that needs more attention. The short version is: if we prevent the really loud users from using too much of the Tor network, how much can it help? We've instrumented Tor's entry relays so they can rate-limit connections from users, and we've instrumented the directory authorities so they can change the rate-limiting parameters globally across the network. Which parameter values improve performance for the Tor network as a whole? How should relays adapt their rate-limiting parameters based on their capacity and based on the network load they see, and what rate-limiting algorithms will work best? We'd love to work with you to help answer these questions.

One of the reasons why Tor is slow is that some people use it for file-sharing or other high-volume transfers. That means if you want to get your instant message cell through, it sometimes needs to wait in line behind a pile of other cells, leading to high latency and, maybe even worse, highly variable latency.

One way to improve the situation is Tang and Goldberg's "An Improved Algorithm for Tor Circuit Scheduling" [2], which was integrated into Tor as of 0.2.2.11-alpha. The idea is to track how many cells the relay has handled for each circuit lately, and give priority to cells from quieter circuits.

But while that puts some cells in front of others, it can only work so many miracles: if many cells have been placed in front of your cell it still has to wait, and the more overall load there is in the network, the more often that will happen.

Which leads to the research problem: if we work to keep the really loud flows off the network in the first place, how much can it help?
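To make the circuit-scheduling idea above concrete, here is a minimal sketch of "give priority to cells from quieter circuits". It is not Tor's code: the class and function names are hypothetical, and weighting "lately" with an exponentially decaying cell count and a 30-second half-life is an assumption made for illustration.

```python
class CircuitActivity:
    """Decayed count of cells recently relayed on one circuit (illustrative)."""

    def __init__(self, halflife_s: float) -> None:
        self.halflife_s = halflife_s  # assumed decay half-life, in seconds
        self.score = 0.0              # decayed cell count as of last_update
        self.last_update = 0.0

    def score_at(self, now: float) -> float:
        """Decayed cell count at time `now`; halves every `halflife_s` seconds."""
        elapsed = now - self.last_update
        return self.score * 0.5 ** (elapsed / self.halflife_s)

    def record_cells(self, now: float, n_cells: int = 1) -> None:
        """Age the old count to `now`, then add the newly relayed cells."""
        self.score = self.score_at(now) + n_cells
        self.last_update = now


def pick_next_circuit(circuits: dict, now: float) -> str:
    """Return the id of the circuit that has been quietest lately."""
    return min(circuits, key=lambda cid: circuits[cid].score_at(now))


# A bulk transfer and an instant-message circuit sharing one relay.
circuits = {"bulk": CircuitActivity(30.0), "chat": CircuitActivity(30.0)}
circuits["bulk"].record_cells(now=0.0, n_cells=500)
circuits["chat"].record_cells(now=0.0, n_cells=2)
chosen = pick_next_circuit(circuits, now=1.0)  # the quiet circuit goes first
```

Note that this only reorders cells that are already queued: the bulk circuit's 500 cells still occupy the relay, which is the limitation that motivates throttling loud flows before they enter the network.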
Tor 0.2.2.15-alpha lets you set the PerConnBWRate and PerConnBWBurst config options in your relay, to use token buckets (https://en.wikipedia.org/wiki/token_bucket) to rate limit connections from non-relays. Tor 0.2.2.16-alpha added the capability for the directory authorities to broadcast network-wide token bucket parameters, so we can change how much throttling there is and then observe the results on the network.

So the first question is to model how this should work to improve performance at a single entry guard. Say we do periodic performance tests fetching files of size 50KB, 1MB, and 5MB using that relay as our first hop, and say the relay sets PerConnBWRate to 5KB/s and PerConnBWBurst to 2MB. That is, bursts of up to 2 megabytes on the connection are unthrottled, but after that it's squeezed to a long-term average of 5 kilobytes per second, until the flow lets up, at which point the allowed burst slowly builds back up to 2MB. We would expect the 5MB tests to show horrible performance, since they'll need at least 3000/5 = 600 seconds to fetch the last 3MB of the file. We would expect the 50KB and 1MB tests to look better, though: if we're squeezing out the really big flows, there's more space for what's left.

We actually performed this experiment, using Sebastian's relay fluxe3. You can see his performance graphs over time in Figure 1. The black dots are individual fetches, the y axis is how many seconds it took to fetch the file, and the x axis is time (the green-shaded areas are when the feature is turned on). Don't pay too much attention to the blue smoothing line; it's probably not matching the actual distribution well.

[Figure 1: Influence of PerConnBW* settings on user-perceived Tor performance over time]

Figure 2 has the cumulative distribution functions for the same data. The performance gets significantly better both for the 50KB and the 1MB downloads, and as expected gets significantly worse for the 5MB downloads.

So far so good; we've done the proof of concept, and now we need a research group to step in, make it rigorous, and help tackle the real research questions.

The next question is: how well does this trick work under various conditions? It would seem that if there's plenty of spare bandwidth on the relay, it should have little effect.
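The token-bucket arithmetic behind the fluxe3 example (PerConnBWRate of 5KB/s, PerConnBWBurst of 2MB) can be sketched as a small simulation. This is an illustration under stated assumptions, not Tor's implementation: the class and function names are hypothetical, time advances in one-second ticks, 1MB is taken as 1000KB to match the 3000/5 estimate, and the unthrottled link speed of 1MB/s is an arbitrary assumption.

```python
KB = 1000
MB = 1000 * KB


class TokenBucket:
    """Per-connection bucket: refills at `rate` bytes/s, capped at `burst` bytes."""

    def __init__(self, rate: int, burst: int) -> None:
        self.rate = rate
        self.burst = burst
        self.tokens = burst  # a fresh connection starts with its full burst

    def take(self, wanted: int) -> int:
        """Spend up to `wanted` tokens and return how many bytes may be sent."""
        granted = min(wanted, self.tokens)
        self.tokens -= granted
        return granted

    def refill(self, seconds: int) -> None:
        """Add `rate * seconds` tokens, never exceeding the burst size."""
        self.tokens = min(self.burst, self.tokens + self.rate * seconds)


def seconds_to_fetch(size: int, rate: int, burst: int, link_rate: int) -> int:
    """Whole seconds needed to move `size` bytes through the bucket.

    Assumes rate > 0; `link_rate` is the assumed unthrottled bytes per second.
    """
    bucket = TokenBucket(rate, burst)
    remaining, elapsed = size, 0
    while remaining > 0:
        remaining -= bucket.take(min(remaining, link_rate))
        bucket.refill(1)
        elapsed += 1
    return elapsed


for size in (50 * KB, 1 * MB, 5 * MB):
    t = seconds_to_fetch(size, rate=5 * KB, burst=2 * MB, link_rate=1 * MB)
    print(f"{size // KB} KB fetch: {t} s")
```

Under these assumptions the 50KB and 1MB fetches fit entirely inside the burst allowance, while the 5MB fetch spends roughly 600 seconds draining its last 3MB at 5KB/s, in line with the estimate above.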
If we choose parameters that are too lenient, it also would have little effect. But choosing parameters that are too low will hurt normal web browsing users too. Are there rate and burst values that would cleanly separate web browsers from bulk downloaders? Will certain relays (that is, relays with certain characteristics) provide clearer performance improvements than others? How often is it the case, for various relay capacities and various user loads, that at least one connection is being throttled? The CDFs appear to show improved performance spread out pretty evenly relative to download time. What statistics should we make relays track over time so we can get a better intuition about the load they're really seeing?

[Figure 2: Empirical cumulative distribution functions of the influence of PerConnBW* settings on user-perceived Tor performance]

Now is where it gets tricky. If we're only thinking about one relay turning on this feature in a vacuum, then if that relay has enough capacity to handle all its flows, it should do so: there's no sense slowing down anybody if you can handle all the traffic. But instead think of the entire Tor network as a system: your Tor requests might also slow down because a loud Tor flow entered at some other relay and is colliding with yours somewhere along the path. So there's a reason to squeeze incoming connections even if you have enough capacity to handle them: to reduce the effects of bottlenecks elsewhere in the system. If all relays squeeze client connections, what changes in the performance test results do we expect to see for various rate limiting parameter values?

We have the capability of doing network-wide experiments to validate your theory, by putting the rate and burst in the hourly networkstatus consensus and letting relays pick it up from there. That is, you can modify the global network-wide parameters and then observe the effects in the network. Note that the experiment will be complicated by the fact that only relays running a sufficiently recent version of Tor will honor the parameters; that fraction should increase significantly when Tor 0.2.2.x becomes the new stable release (sometime in late 2010).
And finally, once you have a good intuition about how throttling flows at point X affects flow performance at point Y, we get to the really hard research questions. It's probably the case that no fixed network-wide rate limiting parameters are going to provide optimal behavior: the parameters ought to be a function of the load patterns on the network and of the capacity of the given relay. Should relays track the flows they've seen recently and adapt their throttling parameters over time? Is there some flow distribution that relays can learn the parameters of, such that we e.g. throttle the loudest 10% of flows? Can we approximate optimal behavior when relays base their parameters on only the local behavior they see, or do we need a more global view of the system to choose good parameters? And what does optimal mean here anyway? Or said another way, how much flexibility do we have to define optimal, or do the facts here railroad us into preferring certain definitions?

What are the anonymity implications of letting the behavior of user A influence the performance of user B, in the local-view case or the global-view case? We could imagine active attacks that aim to influence the parameters; can those impact anonymity?

As a nice side effect, this feature may provide a defense against Sambuddho's bandwidth-based link congestion attack [1], which relies on a sustained high-volume flow, if the attack ever gets precise enough that we can try out our defense and compare.

Here are some other constraints to keep in mind:

- We need the Burst to be big enough to handle directory fetches (up to a megabyte or two), or things will get really ugly because clients will get slowed down while bootstrapping.

- Mike's bandwidth authority (https://blog.torproject.org/blog/torflow-node-capacity-integrity-and-reliability-measurements-hotpets) measurement scripts (https://gitweb.torproject.org/torflow.git/blob/head:/networkscanners/bwauthority/readme.BwAuthorities) send traffic over relays to discover their actual capacity relative to their advertised capacity. The larger the claimed capacity, the more they send, up to several megabytes for the fast relays. If relays throttle these bandwidth tests, the directory authorities will assign less traffic to them, making them appear to provide better performance when in fact they're just being less useful to the network. This is an example of a case where it would be easy to misinterpret your results if you don't understand the rest of the Tor system. A short-term workaround would be to turn the bandwidth authority Tors into relays so they don't get throttled.

- You'll want to do experiments on an established relay with the Guard flag, or it probably won't see many client connections except for clients fetching directory updates. Note that the longer a relay has had the Guard flag, the more users it will have attracted; but after a month this effect falls off.

- Thinking from an economic perspective, it may turn out that the performance for a given user doesn't actually get better if we throttle the high-volume users, yet we've still improved things. That is, if the available capacity of the Tor network improves and thus it is supporting more users, we've made a better Tor network even if we didn't make it faster for individual users. Fortunately, this feedback effect shouldn't happen for short-term experiments; but it's something to keep in mind for the long term.

- It sure would be nice if whatever rate limiting algorithm you recommend is efficient to calculate and update over time. Some relays are seeing tens of thousands of users, and can't afford much processing time on each.
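As a starting point for the "throttle the loudest 10% of flows" question raised above, here is a minimal local-view sketch. It is not a recommended algorithm and says nothing about the anonymity implications: the function name, the per-connection byte counts over some recent window, and the connection ids are all hypothetical.

```python
import heapq


def loudest_fraction(bytes_by_conn: dict, fraction: float = 0.10) -> set:
    """Return the ids of the loudest `fraction` of connections.

    `bytes_by_conn` maps a connection id to the bytes it moved in some
    recent window; how that window is kept is left open (an assumption).
    """
    n_throttle = int(len(bytes_by_conn) * fraction)
    # nlargest avoids a full sort, which matters with tens of thousands of users.
    return set(heapq.nlargest(n_throttle, bytes_by_conn, key=bytes_by_conn.get))


# Ten hypothetical client connections with increasing recent volume.
recent = {f"conn{i}": i * 1000 for i in range(10)}
to_throttle = loudest_fraction(recent)  # the single loudest of ten connections
```

A relay using something like this would apply the rate and burst limits only to the returned connections. Whether a purely local ranking approximates good network-wide behavior is exactly the open question.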

References

[1] Sambuddho Chakravarty, Angelos Stavrou, and Angelos D. Keromytis. Traffic analysis against low-latency anonymity networks using available bandwidth estimation. In Proceedings of the 15th European Conference on Research in Computer Security, ESORICS '10, pages 249-267, Berlin, Heidelberg, 2010. Springer-Verlag.

[2] Can Tang and Ian Goldberg. An improved algorithm for Tor circuit scheduling. In Angelos D. Keromytis and Vitaly Shmatikov, editors, Proceedings of the 2010 ACM Conference on Computer and Communications Security, CCS 2010, Chicago, Illinois, USA, October 4-8, 2010. ACM, 2010.