Rate-Limiting at Scale. SANS AppSec Las Vegas 2012 Nick

Similar documents
Approaches for application request throttling. Maarten

Business Logic Attacks BATs and BLBs

Fighting Phishing I: Get phish or die tryin.

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2017)

Bro: Actively defending so that you can do other stuff

Secure your Web Applications with AWS WAF & AWS Shield. James Chiang ( 蔣宗恩 ) AWS Solution Architect

ANALYZING THE MOST COMMON PERFORMANCE AND MEMORY PROBLEMS IN JAVA. 18 October 2017

magento_1:full_page_cache

Site Audit SpaceX

Handout 20 - Quiz 2 Solutions

Threat Modeling. Bart De Win Secure Application Development Course, Credits to

What personal data we collect and why we collect it

Advanced Techniques for DDoS Mitigation and Web Application Defense

Sucuri Webinar Q&A HOW TO IDENTIFY AND FIX A HACKED WORDPRESS WEBSITE. Ben Martin - Remediation Team Lead

Security issues. Unit 27 Web Server Scripting Extended Diploma in ICT 2016 Lecture: Phil Smith

CS 355. Computer Networking. Wei Lu, Ph.D., P.Eng.

CS244 Advanced Topics in Computer Networks Midterm Exam Monday, May 2, 2016 OPEN BOOK, OPEN NOTES, INTERNET OFF

Big Data Infrastructure CS 489/698 Big Data Infrastructure (Winter 2016)

SaaS Providers. ThousandEyes for. Summary

pgmemcache and the over reliance on RDBMSs Copyright (c) 2004 Sean Chittenden. All rights reserved.

Checklist for Testing of Web Application

WEB SECURITY WORKSHOP TEXSAW Presented by Solomon Boyd and Jiayang Wang

Site Audit Boeing

Lecture 12. Application Layer. Application Layer 1

Putting together the platform: Riak, Redis, Solr and Spark. Bryan Hunt

Copyright ECSC Group plc 2017 ECSC - UNRESTRICTED

magento_1:full_page_cache

<Insert Picture Here> Looking at Performance - What s new in MySQL Workbench 6.2

Our Narrow Focus Computer Networking Security Vulnerabilities. Outline Part II

COMP102: Introduction to Databases, 20

Crawling the Web. Web Crawling. Main Issues I. Type of crawl

Lifehack #1 - Automating Twitter Growth without Being Blocked by Twitter

FAQ: Crawling, indexing & ranking(google Webmaster Help)

Data Plane Protection. The googles they do nothing.

NET 311 INFORMATION SECURITY

NoSQL Injection SEC642. Advanced Web App Penetration Testing, Ethical Hacking, and Exploitation Techniques S

CS61C Machine Structures. Lecture 4 C Pointers and Arrays. 1/25/2006 John Wawrzynek. www-inst.eecs.berkeley.edu/~cs61c/

PUBCRAWL: Protecting Users and Businesses from CRAWLers

Summer Cart Synchronization Guide for.net

Internet Technology. 06. Exam 1 Review Paul Krzyzanowski. Rutgers University. Spring 2016

Security. 1 Introduction. Alex S. 1.1 Authentication

barge In option 127 bigdecimal variables 16 biginteger variables 16 boolean variables 15 business hours step 100

Internet Technology 3/2/2016

Importing source database objects from a database

Not-A-Bot: Improving Service Availability in the Face of Botnet Attacks

WHITE PAPER. Best Practices for Web Application Firewall Management

Monitoring GSS Global Server Load-Balancing Operation

Is Your Web Application Really Secure? Ken Graf, Watchfire

CS 161 Computer Security

REST in a Nutshell: A Mini Guide for Python Developers

Persistence. SWE 432, Fall 2017 Design and Implementation of Software for the Web

How to Create Fields in Microsoft CRM Online

Positive Security Model for Web Applications, Challenges. Ofer Shezaf OWASP IL Chapter leader CTO, Breach Security

Web 2.0 and AJAX Security. OWASP Montgomery. August 21 st, 2007

Site Audit Virgin Galactic

THE FLEXIBLE DATA-STRUCTURE SERVER THAT COULD.

Finding & Registering The Best Local Domain

Search Engine Optimization (SEO) using HTML Meta-Tags

Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY

CC PROCESAMIENTO MASIVO DE DATOS OTOÑO 2018

Today s lecture. Basic crawler operation. Crawling picture. What any crawler must do. Simple picture complications

CSE 565 Computer Security Fall 2018

Control for CloudFlare - Installation and Preparations

Screw You and the Script You Rode in On

TCP Tuning for the Web

CLOAK OF VISIBILITY : DETECTING WHEN MACHINES BROWSE A DIFFERENT WEB

Work Book. Sharkfest Presentation Material. Copyright Advance Seven Limited. All rights reserved.

JSON Home Improvement. Christophe Pettus PostgreSQL Experts, Inc. SCALE 14x, January 2016


DreamFactory Security Guide

This presentation is copyrighted by ProSites, Inc. No part of this presentation can be copied, reproduced, displayed or changed without the express

Quick Guide to Installing and Setting Up MySQL Workbench

NETWORK THREATS DEMAN

Big Data Analytics CSCI 4030

Introduction to OpenMP

Today s lecture. Information Retrieval. Basic crawler operation. Crawling picture. What any crawler must do. Simple picture complications

DRIVE YOUR CONTENT WITH SEO

Google Active View Description of Methodology

Monitoring the Device

OODA Security. Taking back the advantage!

Franzes Francisco Manila IBM Domino Server Crash and Messaging

Cloak of Visibility. -Detecting When Machines Browse A Different Web. Zhe Zhao

Improve WordPress performance with caching and deferred execution of code. Danilo Ercoli Software Engineer

How to do an On-Page SEO Analysis Table of Contents

IBM SECURITY NETWORK PROTECTION (XGS)

MemC3: MemCache with CLOCK and Concurrent Cuckoo Hashing

Beating the Final Boss: Launch your game!

BIG-IP otse vastu internetti. Kas tulemüüri polegi vaja?

War Stories from the Cloud: Rise of the Machines. Matt Mosher Director Security Sales Strategy

EasyCrypt passes an independent security audit

IMPORTANT: Circle the last two letters of your class account:

Running at 99%: Surviving an Application DoS Ryan

Enterprise Overview. Benefits and features of Cloudflare s Enterprise plan FLARE

Three interface Router without NAT Cisco IOS Firewall Configuration

ThousandEyes for. Application Delivery White Paper

Beyond Blind Defense: Gaining Insights from Proactive App Sec

Lecture 4 September Required reading materials for this class

Hash Tables. Hash Tables

EE 122: Network Security

Creating the Data Layer

Transcription:

Rate-Limiting at Scale SANS AppSec Las Vegas 2012 Nick Galbreath @ngalbreath nickg@etsy.com

Who is Etsy? Marketplace for Small Creative Businesses Alexa says #51 for USA traffic > $500MM transaction volume last year

What s a Rate Limit? Maximum number of events per (brief) period per user after which the resource is denied. e.g. no more than 2 logins per minute

Why?

Whoops Without rate limits on credit card authorizations your site becomes a card skimmer site. Using a website is much easier than going to the gas station pump or other anonymous card reader

Whoops Rate limits needed for anything that gets reviewed by humans such as customer service requests. CRMs are typically bad at dealing with spammy stuff

Whoops Robots / Crawlers Gone Wild (not always an intended DDoS) 20,000 items in shopping cart spam attack! Can crush sites very quickly, at almost no cost. Especially when crawl generates load or writes to the database

Do Rate Limits Stop all Fraud? No, but... Eliminates false positives and punks Allows you to focus on more sophisticated attacks Protects against damaging bursts of activity (malicious or not)

Rate Limits are needed on anything that depends on an external resource This is almost everything!

Implementation

Continuous Rate Limits Store user identifier, event-type, timestamp Allows easy rate-limits for multiple ranges Allows easy cross-event limits Easy to implement in SQL

Continuous RL Schema Check if your database timestamps store microseconds or not. You want em.

Continuous RL Queries

At Scale Pain At scale, this is really painful for databases to handle. Constant index churn Use in-memory database (or run off ramdisk) if trying this out

Quantized Rate Limits Stores a count in a time-window or bucket. Map current time to a bucket (int) (NOW()/period) e.g. NOW()/3600 is gives the hour bucket.

Direct Lookup Everything is primary key lookup. userid-event-period-bucketid 60min: nickg-login-3600-5589007547 10min: nickg-login-600-33534045284 Multiple time-frames require multiple buckets, which means multiple inserting/checking.

Quantized RL Accuracy Not exact. If you set N per Period, quantized rate-limits may go as high as: (n-1)x2 per Period. e.g. 10 per minute --> 18 per minute 9 9 18

Rate-Limits at Scale We traded exact accuracy and flexibility for scaling. Implementation using Memcache or Redis (and perhaps SQL) set nickg-login-60-212331231 += 1 Well known sharding techniques Auto-expiration of old buckets Each set/get takes 1/10 or less of millisecond. Almost invisible.

Usage

Please write unit tests! Easy to get wrong, and consequences can be unpleasant Edge cases and race conditions memcache doesn t have a insert or increment operation. Need to do multiple steps and check error conditions.

Rollout Once in production start with guestimates on rate limits If rate limit is triggered, take no action and only log/graph Does volume match expectations? Wash, Rinse, Repeat until tuned appropriately

So a user hit a rate limit. Now what? Do you let them know? (visible indicator) Do you start CAPTCHA-ing? Do you black hole it? (silent) Also keep logging and graphing. You ll need these to debug when things go awry

Other Topics

Anonymous Identifiers hash of (IP + appropriate HTTP headers) order of headers matters different browsers order them differently Spoofed user agents don t always get the order right Can use first 8 bytes of hash and convert to 64-bit integer (and make negative)

Laddering Use Laddering to do rate limits at different time scales for the same event. Set a short period and high rate, to prevent bursts Set a longer period with lower rate, to prevent slow crawls robots.

Ladder longer periods to have a smaller rate Negative example: 2 per Minute ( ~0.033 events per sec ) or 2x60 = 120 per Hour so laddering with 300 per Hour (~ 0.083 events per sec) does nothing, but 100 per Hour (~ 0.028) is good.

And finally Almost every action on Etsy has ladder rate-limit We learn the hard way what is not limited Virtually no performance impact at scale. Should we open source the driver?

Nick Galbreath nickg@etsy.com @ngalbreath SANS AppSec Las Vegas 2012