Beauty and the Burst

Beauty and the Burst: Remote Identification of Encrypted Video Streams
Roei Schuster (Cornell Tech, Tel Aviv University), Vitaly Shmatikov (Cornell Tech), Eran Tromer (Columbia University, Tel Aviv University)

Video traffic is interesting

Video traffic is encrypted. What can still be learned?

Traffic analysis for video identification
An observer of the traffic between the streaming service and the victim still sees metadata: packet times, sizes, ...
From the metadata alone: "Victim is watching Beauty and the Beast!"

Initial buffering, then on/off bursts [RLLTBD 11], [ARNL 12], [MFWS 13], ...
(Plot: packet size in bytes vs. time in seconds.)
Where do bursts come from?

Video representation on server
MPEG-DASH standard: widely adopted by Netflix, YouTube, others.
Titles on the streaming service: Die Hard, Armageddon, Pulp Fiction, 12 Monkeys, Die Hard II, The Fifth Element.
Each video (e.g., Die Hard) is stored in segment files: segment1.m4s, segment2.m4s, segment3.m4s, segment4.m4s.
Segment = a few seconds of playback: 0-5 sec, 5-10 sec, 10-15 sec, 15-20 sec.

DASH client-server interaction (simplified)
Client: buffer below threshold? No: keep checking. Yes: request next segment.
Server: segment1.m4s, segment2.m4s, segment3.m4s, segment4.m4s, segment5.m4s, segment6.m4s.
A segment is fetched every few seconds; fetching causes a traffic burst.
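The client loop above can be sketched as a small simulation. This is an illustration only, not the behavior of any real player: the function name, the 5-second segment duration, the 15-second buffer threshold, and the assumption that downloads are instantaneous are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass
class Fetch:
    time_s: float    # playback-clock time at which the segment was requested
    segment: int     # 1-based segment index
    size_bytes: int  # size of the fetched segment file


def simulate_dash_client(segment_sizes, segment_duration_s=5.0,
                         buffer_threshold_s=15.0, tick_s=1.0):
    """Fetch the next segment whenever the buffer is below the threshold.

    Assumption: a download completes instantly, so each fetch is one burst.
    """
    fetches = []
    buffered_s = 0.0
    now = 0.0
    next_segment = 0
    while next_segment < len(segment_sizes):
        if buffered_s < buffer_threshold_s:
            # Buffer below threshold: request the next segment (a burst).
            size = segment_sizes[next_segment]
            fetches.append(Fetch(now, next_segment + 1, size))
            buffered_s += segment_duration_s
            next_segment += 1
        else:
            # Buffer is full enough: play back for one tick.
            buffered_s -= tick_s
            now += tick_s
    return fetches


if __name__ == "__main__":
    for fetch in simulate_dash_client([100, 200, 150, 300, 250, 120]):
        print(fetch)
```

Under these assumptions the first segments are fetched back to back (initial buffering), after which one segment is fetched per segment duration, which is the on/off pattern the slides describe.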

Variable bit rate (VBR) encoding
Different seconds of video require different numbers of bytes to encode.
(Plot: bitrate vs. time in seconds for "Iguana vs. Snakes".)

Phases of "Iguana vs. Snakes" in bitrate
(Plot: bitrate in bits per second vs. time in seconds, annotated phase by phase.)
Scenery, movement, tension rising.
Tension peaking, iguana is still.
Chase.
Iguana almost captured.
Iguana safe, resting.

Variable bit rate → variable segment size
Die Hard: 0-5 sec (Segment1.m4s), 5-10 sec (Segment2.m4s), 10-15 sec (Segment3.m4s), 15-20 sec (Segment4.m4s), 20-25 sec (Segment5.m4s).
Other titles: Pulp Fiction, Armageddon, 12 Monkeys, Die Hard II, The Fifth Element.

Variable segment size → variable burst size
(Plot: burst size in bytes vs. time in seconds, showing initial buffering followed by on/off bursts.)
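One way an observer might turn a packet trace into burst sizes is to group packets separated by short gaps. The sketch below is a hypothetical illustration: the function name and the 0.5-second idle gap are assumptions, not values taken from the talk.

```python
def group_bursts(packets, max_gap_s=0.5):
    """Group packets into bursts and sum their sizes.

    packets: list of (timestamp_s, size_bytes), sorted by timestamp.
    A new burst starts when the gap since the previous packet exceeds
    max_gap_s (an assumed idle threshold).
    Returns a list of (burst_start_s, total_bytes).
    """
    bursts = []       # each entry: [start_time_s, total_bytes]
    last_time = None
    for time_s, size in packets:
        if last_time is None or time_s - last_time > max_gap_s:
            bursts.append([time_s, 0])
        bursts[-1][1] += size
        last_time = time_s
    return [(start, total) for start, total in bursts]


if __name__ == "__main__":
    trace = [(0.0, 1500), (0.1, 1500), (0.2, 300),
             (5.0, 1500), (5.1, 700), (10.0, 1500)]
    print(group_bursts(trace))
```

Each tuple in the result approximates one fetched segment, so the sequence of totals follows the sequence of segment sizes.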

MPEG-DASH leak
Content → VBR pattern → segments → burst sizes (over stream time).

From a leak to a fingerprint
(Plot: burst sizes vs. stream time.)
Does the pattern of burst (segment) sizes uniquely characterize a title? Diversity: empirically measure pairwise distances for 3500 downloaded and segmented YouTube titles.
Can we learn a title's identifying pattern? Consistency: empirically evaluate the attacker's measurement error bound.
~20% of YouTube titles have fingerprints.
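The diversity check can be illustrated with a toy pairwise comparison. The distance used here (fraction of aligned segments whose sizes differ by more than a relative tolerance) and both thresholds are assumptions made for illustration; the talk does not specify the actual metric or error bound.

```python
def differing_fraction(a, b, tolerance=0.05):
    """Fraction of aligned segments whose sizes differ by more than
    `tolerance` (relative). The tolerance stands in for an assumed
    measurement error bound."""
    n = min(len(a), len(b))
    if n == 0:
        raise ValueError("both sequences must be non-empty")
    differing = sum(1 for x, y in zip(a, b)
                    if abs(x - y) > tolerance * max(x, y))
    return differing / n


def titles_with_fingerprint(titles, min_fraction=0.5, tolerance=0.05):
    """Return titles whose segment-size sequence is far from every other title.

    titles: dict mapping title name to a list of segment sizes in bytes.
    min_fraction is a hypothetical cutoff for "far enough".
    """
    result = []
    for name, sizes in titles.items():
        others = [s for other, s in titles.items() if other != name]
        if all(differing_fraction(sizes, o, tolerance) >= min_fraction
               for o in others):
            result.append(name)
    return result


if __name__ == "__main__":
    toy = {
        "A": [100, 200, 300, 400],
        "B": [100, 205, 300, 400],   # nearly identical to A
        "C": [400, 100, 250, 900],
    }
    print(titles_with_fingerprint(toy))
```

In this toy data, A and B are within tolerance of each other and so are not distinguishable, while C differs from both on every segment.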

Attack overview
Attacker network: the attacker streams titles (Die Hard, Pulp Fiction, Armageddon, 12 Monkeys) and records the traffic metadata.
Training: the metadata is used to train detectors.
Victim network: the detectors are applied to the victim's traffic. "Victim is watching Armageddon!"

Attack details: what is the attacker's vantage point on the victim network?

Scenario I: on-path attack
An on-path vantage point observes the bursts directly: Wi-Fi access points, proxies, routers, enterprise or national network censors, ISPs.

Attack details: the detectors are trained from the metadata using machine learning.

Deep neural networks
Very good at learning high-level concepts that are hard to express formally (e.g., "traffic traces are similar").
Existing NN architectures are very accurate on classification and detection problems.

Advantages of neural networks
Robust: can operate on noisy and coarse measurements.
Agnostic to protocol-specific attributes (e.g., QUIC vs. TLS).
Can learn features other than burst patterns, e.g., arrival patterns of individual packets.
Can use multiple session representations and train on all at once.

Features
Each feature is a time series, sampled at 0.25-second intervals (example: bytes per second).
(Figure: packets of, e.g., 1500 and 300 bytes plotted as packet size vs. time in seconds, binned at 0, 0.25, 0.5, 0.75, 1.)
Features considered: downstream/upstream/total values of bytes per second, packets per second, average packet length, and burst sizes.
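The binning step can be sketched as follows. This is a minimal illustration assuming packets are given as (timestamp, size) pairs for one direction; the function name and the dictionary keys are hypothetical, and burst sizes are left out.

```python
import math


def bin_features(packets, duration_s, step_s=0.25):
    """Turn a packet list into per-bin time series.

    packets: list of (timestamp_s, size_bytes).
    Returns bytes per second, packets per second, and average packet
    length for each step_s-wide bin covering [0, duration_s).
    """
    n_bins = math.ceil(duration_s / step_s)
    byte_counts = [0] * n_bins
    packet_counts = [0] * n_bins
    for time_s, size in packets:
        if time_s < 0:
            continue  # ignore packets before the start of the session
        index = int(time_s / step_s)
        if index < n_bins:
            byte_counts[index] += size
            packet_counts[index] += 1
    return {
        "bytes_per_second": [b / step_s for b in byte_counts],
        "packets_per_second": [p / step_s for p in packet_counts],
        "average_packet_length": [b / p if p else 0.0
                                  for b, p in zip(byte_counts, packet_counts)],
    }


if __name__ == "__main__":
    trace = [(0.05, 1500), (0.10, 300), (0.30, 1500), (0.80, 300)]
    for name, series in bin_features(trace, duration_s=1.0).items():
        print(name, series)
```

Running the same function separately on downstream, upstream, and combined packet lists would give the three variants of each feature.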

Attack: on-path attacker
Metadata from titles streamed on the attacker network (Die Hard, Pulp Fiction, Armageddon, 12 Monkeys) trains a neural net; the resulting detectors are applied to the victim network.

Datasets and identification experiments
100 titles, 100 1-minute sessions: 100 classes, 98.5% accuracy.
18 titles, 100 3-minute sessions, plus 3500 sessions of different other titles (open-world identification): 18+1=19 classes, 99.5% accuracy.
10 titles, 100 1.5-minute sessions: 10 classes, 92.5% accuracy.
10 titles, 100 1-minute sessions: 10 classes, 98.6% accuracy.

Empirical results: confusion matrices
YouTube (feature: total burst size), including an unknown class of 3500 samples: exactly 2 false positives.
Netflix (feature: total burst size): no recurrent confusions (despite many same-series titles).
(Both matrices are plotted against the predicted label.)

Tuning for precision
YouTube (feature: total burst size): 0 false positives with 0.988 recall.
Netflix (feature: total burst size): 0.0005 false positive rate with 0.93 recall.
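Trading recall for fewer false positives usually comes down to raising the detector's decision threshold. The sketch below assumes each session gets a confidence score for the target title; the function name and the toy scores are hypothetical and unrelated to the reported results.

```python
def tune_threshold(scores, labels, max_false_positive_rate=0.0):
    """Pick the lowest threshold whose false positive rate is acceptable.

    scores: detector confidence per session (higher = more likely the title).
    labels: True if the session really is the target title.
    Returns (threshold, recall, false_positive_rate), or None if even the
    highest score is a false positive beyond the allowed rate.
    """
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    # Recall can only drop as the threshold rises, so the first (lowest)
    # acceptable threshold also gives the best recall.
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= threshold)
        fp = sum(1 for s, y in zip(scores, labels) if not y and s >= threshold)
        fpr = fp / negatives if negatives else 0.0
        recall = tp / positives if positives else 0.0
        if fpr <= max_false_positive_rate:
            return threshold, recall, fpr
    return None


if __name__ == "__main__":
    scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.2]
    labels = [True, True, False, True, False, False]
    print(tune_threshold(scores, labels))
    print(tune_threshold(scores, labels, max_false_positive_rate=0.34))
```

With the toy data, demanding zero false positives costs one true detection, while tolerating one false positive out of three negatives recovers full recall.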

Attack details: the detectors are trained with a neural net. Which other vantage points are possible?

Off-path attackers
On-path vantage points (Wi-Fi access points, proxies, routers, enterprise or national network censors, ISPs) see the victim network's bursts directly.
What about an attacker that is not on the path? A visited webpage? A smartphone app?
Example: checking a Facebook feed while streaming Armageddon, with a Web ad on the page.
Three-fold confinement: different device, browser process, sandboxed iframe.

Cross-device attack
A neighbor of the viewer on the same network has a browser open on the attacker Web site, which runs a JavaScript attacker client.
The client exchanges messages with the attacker Web site.
The viewer's traffic bursts cause congestion, and the congestion delays the messages.
The delays give a noisy, coarse estimate of the actual traffic bursts.

Delay-bursts
(Plot: message delays in milliseconds and traffic burst sizes, scaled down, vs. time in seconds.)
For each traffic burst, compute the aggregate delay induced. Use the resulting time series as input to the neural network.

Delay-bursts vs. traffic bursts
Delay-bursts time series: the delays induced by traffic bursts.
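A delay-burst series might be derived from the message delays roughly as follows. This is a sketch under stated assumptions: the off-path attacker cannot see the traffic bursts, so bursts are inferred here from runs of delayed messages, and the baseline delay, the minimum excess, and the grouping gap are hypothetical parameters.

```python
def delay_bursts(delays, baseline_ms, min_excess_ms=5.0, max_gap_s=0.5):
    """Aggregate excess message delay into delay-bursts.

    delays: list of (timestamp_s, delay_ms) for the attacker's own
    messages, sorted by timestamp.
    A message counts as delayed when it exceeds baseline_ms by at least
    min_excess_ms; delayed messages closer than max_gap_s are merged.
    Returns a list of (burst_start_s, aggregate_excess_delay_ms).
    """
    bursts = []       # each entry: [start_time_s, aggregate_excess_ms]
    last_time = None  # time of the last delayed message
    for time_s, delay_ms in delays:
        excess = delay_ms - baseline_ms
        if excess < min_excess_ms:
            continue  # treat as ordinary jitter
        if last_time is None or time_s - last_time > max_gap_s:
            bursts.append([time_s, 0.0])
        bursts[-1][1] += excess
        last_time = time_s
    return [(start, total) for start, total in bursts]


if __name__ == "__main__":
    samples = [(0.0, 10), (0.1, 40), (0.2, 60), (0.3, 11),
               (5.0, 30), (5.1, 12)]
    print(delay_bursts(samples, baseline_ms=10))
```

The aggregate values play the role of the burst sizes in the on-path attack, only noisier and coarser.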

1/10 cross-device attack: precision vs. recall
Accuracy: 0.965; false positive rate: 0.003; recall: 0.933.

Cross-device attack: the neighbor's browser runs JavaScript detector code from the attacker Web site, on a different device from the viewer.

Cross-site attack: on the victim PC, one browser window runs the streaming client while another browser window runs JavaScript detector code from the attacker Web site.

Mitigating the DASH leak
Modern streaming traffic characteristics:
Title bitrate pattern is unique when sampled at few-seconds granularity.
Fetching happens at segment granularity (every few seconds): buffer below threshold? If yes, fetch next segment.
This design optimizes quality of experience, server load, and network bandwidth utilization.
However, information leakage is intrinsic.

Thank you! Further information and the paper: https://beautyburst.github.io/