Starting the Avalanche: Application DoS in Microservice Architectures
Scott Behrens, Jeremy Heffner

Introductions
Scott Behrens
- Netflix senior application security engineer
- Breaking and building for 8+ years
- Contributor to a variety of open source projects (github.com/sbehrens)
Jeremy Heffner
- Senior Security Software Engineer
- Developing and securing things for 20+ years

DoS focused on application layer logic

Photo of a battering ram by Flickr user Patrick Denker; License: https://creativecommons.org/licenses/by/2.0/

http://www.interestingfacts.org/fact/first-example-of-biological-warfare

How Novel is Application DoS?
https://www.akamai.com/us/en/multimedia/documents/state-of-the-internet/q1-2017-state-of-the-internet-security-report.pdf

Microservice Primer: High Level View
- Architecture
- Client Libraries and API Gateway
- Circuit Breakers / Failover
- Cache

Microservice Primer: Architecture
GOOD
- Scale
- Service independence
- Fault isolation
- Eliminates stack debt
BAD
- Distributed system complexity
- Deployment complexity
- Cascading service failures if things aren't set up right

Simplified Microservice API Architecture
[Diagram: Internet traffic reaches Zuul proxies at the edge tier (website, core API, proxies); the edge calls middle tier services, which call backend tier services]

Microservice Primer: API Gateways and Client Libraries
- Interface for middle tier services
- Services provide client libraries to API Gateway
Diagrams provided by microservices.io

Microservice Primer: Circuit Breaker
- Helps with handling service failures
- How do you know what timeout to choose?
- How long should the breaker be triggered?
Diagrams provided by microservices.io
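To make the two open questions on this slide concrete, here is a minimal circuit breaker sketch. It is not taken from the talk or from any Netflix library; the class name, `failure_threshold`, and `open_seconds` are illustrative assumptions, and the right values for them are exactly the tuning problem the slide raises.

```python
import time


class CircuitBreaker:
    """Minimal sketch: opens after consecutive failures, retries after a cooldown."""

    def __init__(self, failure_threshold=3, open_seconds=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # assumed tuning knob
        self.open_seconds = open_seconds            # assumed tuning knob
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed

    def call(self, func, fallback):
        # While open, skip the protected call until the cooldown has elapsed.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.open_seconds:
                return fallback()
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0
        return result
```

A breaker that stays open too long turns a short burst of expensive requests into a longer fallback experience, while one that closes too quickly keeps sending traffic to a service that is still saturated.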

Microservice Primer: Cache
- Speeds up response time
- Reduces load on services fronted by cache
- Reduces the number of servers needed to handle requests
https://github.com/netflix/evcache
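As a rough illustration of how a cache reduces load on the service behind it, the sketch below is a small in-process cache with a time-to-live. It is not EVCache and does not use its API; `TTLCache`, `get_or_load`, and the hit and miss counters are hypothetical names chosen for this example.

```python
import time


class TTLCache:
    """Sketch of a cache-aside lookup with per-entry expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = {}  # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get_or_load(self, key, loader):
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self.clock():
            self.hits += 1
            return entry[1]
        # Miss or expired entry: the fronted service does the work.
        self.misses += 1
        value = loader(key)
        self._entries[key] = (self.clock() + self.ttl_seconds, value)
        return value
```

Only misses reach `loader`, so requests that repeat a key are cheap and requests that keep changing the key are not. That asymmetry matters later in the talk, where cache misses show up in both the failure analysis and the mitigations.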

Old School Application DoS
- CPU
- Mem
- Cache
- Disk
- Network

New School Application DoS
- CPU
- Mem
- Cache
- Disk
- Network
- Queueing
- Client Library Timeouts
- Healthchecks
- Connection Pool
- Hardware Operations (HSMs)

Difference Between Old School and New School App DoS
- Old School Application DoS: often 1 to 1
- New School Application DoS: often 1 to many

Simple Web Application Architecture

Old School Application DoS Attack
> perl create_many_profiles.pl
POST /create_profile HTTP/1.1
profile_name=$counter + hacker
[Diagram: 300 requests per second, HTTP timeouts]
Image credits: https://www.teachprivacy.com/the-funniest-hacker-stock-photos/ https://openclipart.org/image/2400px/svg_to_png/241842/sad_panda.png http://www.funnyordie.com/lists/f64f7beefd/brent-rambo-approves-of-these-gifs

New School Microservice API DoS
> python grizzly.py
POST /recommendations HTTP/1.1
{"recommendations": {"range": [0,10000]}}
[Diagram: the request enters through the Zuul proxies at the edge tier (website, proxies, core API), then reaches the middle tier services and the backend tier services]

New School Microservice API DoS
> python grizzly.py
POST /recommendations HTTP/1.1
{"recommendations": {"range": [0,10000]}}
[Diagram: the same edge, middle, and backend tiers, now ending in "Fallback or Site Error"]
- Core API making many client requests
- Client timeouts, circuit breakers triggered, fallback experience triggered
- Middle tier services making many calls to backend services
- Backend service queues filling up with expensive requests
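The "1 to many" effect on this slide can be put into back-of-the-envelope arithmetic. The sketch below is an illustrative model only: the per-item fan-out factors are assumptions, not measured values from the talk, and the range is treated as `end - start` items.

```python
def estimate_rpcs(range_start, range_end,
                  middle_calls_per_item=1, backend_calls_per_middle_call=1):
    """Estimate downstream RPCs caused by one edge request (assumed fan-out)."""
    items = max(0, range_end - range_start)
    middle = items * middle_calls_per_item
    backend = middle * backend_calls_per_middle_call
    return {"items": items, "middle_tier_rpcs": middle, "backend_rpcs": backend}


# One request asking for range [0, 10000], assuming one middle tier call per
# item and three backend calls per middle tier call (both hypothetical).
estimate = estimate_rpcs(0, 10000, middle_calls_per_item=1,
                         backend_calls_per_middle_call=3)
```

Under those assumed factors a single HTTP request becomes tens of thousands of internal calls, which is why counting requests at the edge says little about the work done behind it.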

Workflow for Identifying Application DoS - Part 1
1. Identify the most latent service calls
2. Investigate if latent calls allow for manipulation
3. Tune payload to fly under WAF/Rate Limiting
4. Test hypothesis
5. Scale your test using Cloudy Kraken (orchestrator) and Repulsive Grizzly (attack framework)
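The first step of the workflow, finding the most latent calls, can be sketched as a ranking over latency samples. The talk does not say how the samples were gathered; the function names and the shape of `timings_ms` (endpoint name mapped to a list of response times in milliseconds) are assumptions for illustration.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def rank_by_latency(timings_ms, pct=99):
    """Return (endpoint, percentile latency) pairs, slowest first."""
    scored = [(endpoint, percentile(samples, pct))
              for endpoint, samples in timings_ms.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Ranking on a high percentile rather than the mean surfaces calls that are usually fast but occasionally very slow, which are natural candidates for the second step of the workflow.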

Identifying Latent Service Calls

Microservice Application DoS: Attack Patterns
- Range
- Object Out per Object In
- Request Size
- All of the Above

Application DoS Technique: Range

Application DoS Technique: Object Out Per Object In

Application DoS Technique: Request Size

Application DoS Technique: All of the Above
- What about N languages?
- What about more object fields?

[Chart: number of requests (# Req) against logical work per request, with regions labeled Rate Limited, Service Impact, Service Healthy, and Service Auto-Scaling/Healthy]

New School Application DoS Attack: Case Study

Making the call more expensive

Workflow for Identifying Application DoS - Part 2
1. Identify the most latent service calls
2. Investigate if latent calls allow for range, object out/object in, request size, or other manipulation
3. Tune payload to fly under WAF/Rate Limiting while causing the most application instability
4. Test hypothesis on a smaller scale using Repulsive Grizzly
5. Scale your test using Cloudy Kraken

Repulsive Grizzly
- Skunkworks application DoS framework
- Written in Python3
- Eventlet for high concurrency
- Uses AWS SNS for logging analysis
- Easily configurable

Repulsive Grizzly: Command File

Repulsive Grizzly: Payload and Header Files
- Provide payloads in any format you want
- Headers are provided as a JSON key/value hash
- Use $$AUTH$$ placeholder to tell grizzly where to place tokens
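The placeholder idea on this slide amounts to a string substitution over a JSON hash of headers. The sketch below shows that idea in isolation; it is not Repulsive Grizzly's actual code, and `render_headers` plus the sample header names are hypothetical. It assumes every header value is a string.

```python
import json

AUTH_PLACEHOLDER = "$$AUTH$$"


def render_headers(headers_json, token):
    """Parse a JSON key/value hash and substitute the token for the placeholder."""
    headers = json.loads(headers_json)
    return {name: value.replace(AUTH_PLACEHOLDER, token)
            for name, value in headers.items()}


# Hypothetical header file contents and token, for illustration only.
sample = '{"Cookie": "session=$$AUTH$$", "Accept": "application/json"}'
rendered = render_headers(sample, "abc123")
```

Keeping the token out of the header file means the same file can be reused with different sessions.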

Repulsive Grizzly: Bypass Rate Limiter with Sessions
Image: http://nerdprint.com/wp-content/uploads/2014/09/dumle-mountain-cookies-advertising2.png

Repulsive Grizzly: Single Node

https://giphy.com/gifs/dancing-90s-computer-uwv3upfwoz088

Cloudy Kraken Overview

1. Update Config: push the latest configuration file and attack scripts to S3; reset the DynamoDB state.
2. Build Environment: configure the VPCs in each region.
3. Start up Attack Nodes: launch instances.
4. Collect data: wait for data to come through SNS.
5. Tear-Down: tear down and reset the environment in each region.

Cloudy Kraken Configuration
- S3 Bucket: Zip File and Configuration
- DynamoDB Table: Reset State

Cloudy Kraken: Key AWS Deployment Building Blocks
[Diagram: a Region containing a VPC, with an Auto-Scaling Group of nodes spread across two AZs/Subnets]
- Region => AWS Geographical Region
- VPC => VLAN
- ASG => Automatically starts identical nodes
- AZ/Subnet => Localized nodes / Subnet
- Launch Config => Initial configuration

Cloudy Kraken Deployment phase
[Diagram: a VPC in Region A and a VPC in Region B, each with a Security Group and an Auto Scaling Group]

Cloudy Kraken Workers
- Each worker node is a single EC2 instance
- Each worker runs many threads
- EC2 gives you access to Enhanced Networking Driver
- Minimal overhead with launch config and ASG

Cloudy Kraken Execution phase
- On startup, each worker node runs a cloud-init script
- Enables ssh access for monitoring and debugging
- Downloads and runs main config script
- Downloads ZIP file with attack script
- Spins up attack worker
- Waits for coordinated time to start
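The last step, waiting for a coordinated start, can be sketched as sleeping until a shared timestamp. How Cloudy Kraken distributes that timestamp is not described on the slide, so `start_epoch` here is simply an assumed value that every worker has already received, and both function names are hypothetical.

```python
import time


def seconds_until(start_epoch, now=None):
    """Seconds remaining until start_epoch, never negative."""
    now = time.time() if now is None else now
    return max(0.0, start_epoch - now)


def wait_for_start(start_epoch, sleep=time.sleep, clock=time.time):
    """Block until the agreed start time; return how long this worker waited."""
    delay = seconds_until(start_epoch, clock())
    if delay > 0:
        sleep(delay)
    return delay
```

Because each worker computes its own delay from the same absolute timestamp, workers that boot at different moments still begin together, to the extent their clocks agree.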

Cloudy Kraken Kill-Switch
- Script to set the kill switch, and bring it all down
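A kill switch is typically a shared flag that every worker checks between units of work. The sketch below shows that pattern in general terms; where Cloudy Kraken stores the flag and how workers read it are not stated on the slide, so `kill_switch_is_set` is a hypothetical callable standing in for that lookup.

```python
def run_until_killed(do_work, kill_switch_is_set, max_iterations=1000):
    """Repeat do_work until the kill switch is set or the iteration cap is hit."""
    iterations = 0
    while iterations < max_iterations and not kill_switch_is_set():
        do_work()
        iterations += 1
    return iterations
```

Checking the flag before every iteration bounds how long a worker keeps running after the switch is set, and the iteration cap gives a second stop condition if the flag can never be read.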

Cloudy Kraken Tear-Down
- Terminates all the instances
- Removes ASGs and Launch Configs
- Removes VPC, Security group, and Instance Profiles

We scaled up, time to run the test!
- Tested against prod
- Multi-region and multi-agent
- Conducted two 5 minute attacks
- Monitored for success

Results of Test 80% Error Rate

Cost: $1.71 for a 5 minute outage in a single AWS region

So What Failed?
- Expensive API calls could be invoked with non-member cookies
- Expensive traffic resulted in many RPCs per request
- WAF/Rate Limiter was unable to monitor middle tier RPCs
- Missing fallback experience when cache missed

Demo
- Test app
- Launching and scaling attack with Cloudy Kraken

Microservice Application DoS: Mitigations
- Understand which microservices impact customer experience

Microservice Application DoS: Mitigations
- Rate limiter (WAF) should monitor middle tier signals or cost of request*
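One way to read "cost of request" is a token bucket that charges each request by its estimated cost instead of counting every request as one. The sketch below illustrates that reading under stated assumptions: the class, the per-client bucket, and the notion that a `cost` number is available (for example from the requested range width or from observed middle tier calls) are all hypothetical, not a description of any Netflix system.

```python
import time


class CostAwareLimiter:
    """Token bucket per client, debited by request cost rather than request count."""

    def __init__(self, capacity, refill_per_second, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.clock = clock
        self._buckets = {}  # client_id -> (tokens, last_seen)

    def allow(self, client_id, cost):
        now = self.clock()
        tokens, last_seen = self._buckets.get(client_id, (self.capacity, now))
        # Refill for the time elapsed since this client was last seen.
        tokens = min(self.capacity,
                     tokens + (now - last_seen) * self.refill_per_second)
        if cost > tokens:
            self._buckets[client_id] = (tokens, now)
            return False
        self._buckets[client_id] = (tokens - cost, now)
        return True
```

With this shape, a client sending a few very expensive requests exhausts its budget as quickly as one sending many cheap requests, which is the gap a request-count limiter leaves open.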

Microservice Application DoS: Mitigations
- Middle tier services should provide context on abnormal behavior

Microservice Application DoS: Mitigations
- Rate limiter (WAF) should monitor volume of cache misses*
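Monitoring the volume of cache misses could look like a sliding-window counter per client. This is a minimal sketch of that idea, assuming the cache layer can report which client caused each miss; `CacheMissMonitor`, `window_seconds`, and `max_misses` are illustrative names and thresholds.

```python
import time
from collections import defaultdict, deque


class CacheMissMonitor:
    """Flags clients whose cache misses exceed a threshold within a time window."""

    def __init__(self, window_seconds, max_misses, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.max_misses = max_misses
        self.clock = clock
        self._misses = defaultdict(deque)  # client_id -> timestamps of misses

    def record_miss(self, client_id):
        """Record one miss; return True if the client is now over the threshold."""
        now = self.clock()
        events = self._misses[client_id]
        events.append(now)
        # Drop misses that have aged out of the window.
        while events and events[0] <= now - self.window_seconds:
            events.popleft()
        return len(events) > self.max_misses
```

The return value is only a signal. Whether it feeds a rate limiter, an alert, or a fallback decision is a separate design choice.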

Microservice Application DoS: Mitigations
- Prioritize authenticated traffic over unauthenticated
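One simple form of prioritization is admission control that stops accepting unauthenticated requests before the service is full, so the remaining headroom is kept for authenticated ones. The sketch below assumes the service can see its current in-flight request count; the function and its parameters are hypothetical.

```python
def admit(authenticated, in_flight, capacity, unauthenticated_capacity):
    """Decide whether to accept a request given the current in-flight load.

    Authenticated requests may use the full capacity; unauthenticated requests
    are cut off earlier, at unauthenticated_capacity.
    """
    limit = capacity if authenticated else min(capacity, unauthenticated_capacity)
    return in_flight < limit
```

For example, with `capacity=100` and `unauthenticated_capacity=60`, a flood of anonymous traffic can fill at most 60 slots and the other 40 stay available to members.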

Microservice Application DoS: Mitigations
- Configure reasonable client library timeouts
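A client-side timeout can be sketched with the standard library by running the downstream call on a worker thread and bounding how long the caller waits. This is a generic illustration, not the client libraries discussed in the talk; `call_with_timeout`, the pool size, and `slow_lookup` are assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

_pool = ThreadPoolExecutor(max_workers=4)  # assumed small shared pool


def call_with_timeout(func, timeout_seconds):
    """Run func on a worker thread; return (True, result) or (False, None) on timeout."""
    future = _pool.submit(func)
    try:
        return True, future.result(timeout=timeout_seconds)
    except FutureTimeout:
        # cancel() only prevents a call that has not started; a call that is
        # already running keeps its worker thread busy until it finishes.
        future.cancel()
        return False, None


def slow_lookup():
    time.sleep(0.3)  # stands in for a latent downstream call
    return "late"
```

The comment in the timeout branch is the important caveat: the caller is released, but the slow work continues, so a timeout that is too generous still lets expensive requests tie up the pool.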

Microservice Application DoS: Mitigations
- Trigger fallback experiences when cache or lookups fail
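A fallback experience can be as simple as returning static content when both the cache and the live lookup fail. The sketch below assumes a dict-like cache and a lookup callable that raises on failure; `recommendations_for` and `DEFAULT_ROWS` are hypothetical names, and the default content is a placeholder.

```python
DEFAULT_ROWS = ["popular-1", "popular-2"]  # hypothetical static fallback content


def recommendations_for(user_id, cache, lookup, default=DEFAULT_ROWS):
    """Return (rows, source), where source is 'cache', 'lookup', or 'fallback'."""
    cached = cache.get(user_id)
    if cached is not None:
        return cached, "cache"
    try:
        rows = lookup(user_id)
    except Exception:
        # Cache missed and the lookup failed: serve the static default
        # instead of surfacing an error to the user.
        return list(default), "fallback"
    cache[user_id] = rows
    return rows, "lookup"
```

Returning the source alongside the rows makes it easy to count how often the fallback path is taken, which is itself a useful signal of abnormal traffic.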

Thanks!
https://github.com/netflix-skunkworks/repulsive-grizzly
https://github.com/netflix-skunkworks/cloudy-kraken
@helloarbit