Beauty and the Burst: Remote Identification of Encrypted Video Streams. Roei Schuster (Cornell Tech, Tel Aviv University), Vitaly Shmatikov (Cornell Tech), Eran Tromer (Columbia University, Tel Aviv University)
Video traffic is interesting
Video traffic is encrypted. What can still be learned?
Traffic analysis for video identification: the traffic between the streaming service and the victim is encrypted, but its metadata (packet times, sizes) is still visible. Goal: conclude "Victim is watching Beauty and the Beast!"
[Plot: packet size (bytes) vs. time (seconds)] Initial buffering, then on/off bursts [RLLTBD 11], [ARNL 12], [MFWS 13]. Where do bursts come from?
Video representation on server: MPEG-DASH standard, widely adopted by Netflix, YouTube, and others. The streaming service holds many titles (Die Hard, Pulp Fiction, Armageddon, 12 Monkeys, The Fifth Element, Die Hard II). Each video is stored in segment files (segment1.m4s, segment2.m4s, segment3.m4s, segment4.m4s). Segment = a few seconds of playback (0-5 sec, 5-10 sec, 10-15 sec, 15-20 sec).
DASH client-server interaction (simplified): the client checks whether its buffer is below a threshold. If no, it keeps playing; if yes, it requests the next segment from the server (segment1.m4s through segment6.m4s in the diagram). A segment is fetched every few seconds, and fetching causes a traffic burst.
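The buffer-threshold loop above can be sketched as a small simulation. This is a minimal illustration, not the behavior of any real player: the function name, the 5-second segment length, the 15-second threshold, and the assumption that each download completes instantly are all hypothetical choices made for the example.

```python
def simulate_dash_client(segment_sizes, segment_seconds=5.0,
                         buffer_threshold=15.0, tick=1.0):
    """Return a list of (fetch_time_seconds, burst_bytes) pairs.

    Assumptions (not from the talk): downloads are instantaneous and
    playback drains the buffer in steps of `tick` seconds.
    """
    buffered = 0.0       # seconds of video waiting in the playback buffer
    now = 0.0            # simulated wall-clock time in seconds
    next_segment = 0
    bursts = []
    while next_segment < len(segment_sizes):
        if buffered < buffer_threshold:
            # Buffer below threshold: request the next segment (one burst).
            bursts.append((now, segment_sizes[next_segment]))
            buffered += segment_seconds
            next_segment += 1
        else:
            # Buffer is full enough: play for one tick without fetching.
            buffered -= tick
            now += tick
    return bursts


if __name__ == "__main__":
    for fetch_time, size in simulate_dash_client([700, 300, 900, 400, 800, 200]):
        print(f"t={fetch_time:5.1f}s  burst of {size} bytes")
```

In this sketch the first three segments are fetched back to back (initial buffering), after which one segment is fetched every five seconds, which mirrors the on/off pattern described earlier in the talk.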
Variable bit rate (VBR) encoding: different seconds of video require different numbers of bytes to encode. [Plot: bitrate vs. time (seconds) for the "Iguana vs. Snakes" clip]
Phases of "Iguana vs. Snakes" in bitrate [plot: bitrate (bits per second) vs. time (seconds)]: (1) scenery, movement, tension rising; (2) tension peaking, iguana is still; (3) chase; (4) chase, iguana almost captured; (5) iguana safe, resting.
Variable bit rate implies variable segment size. Die Hard: Segment1.m4s (0-5 sec), Segment2.m4s (5-10 sec), Segment3.m4s (10-15 sec), Segment4.m4s (15-20 sec), Segment5.m4s (20-25 sec). The same holds for the other titles (Pulp Fiction, Armageddon, 12 Monkeys, Die Hard II, The Fifth Element).
Variable segment size implies variable burst size. [Plot: burst size (bytes) vs. time (seconds), showing buffering followed by on/off bursts]
The MPEG-DASH leak: content determines the VBR pattern, the VBR pattern determines segment sizes, and segment sizes determine burst sizes over stream time.
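The chain from content to burst sizes can be illustrated with a few lines of arithmetic. This is a sketch under stated assumptions: the function name, the 5-second segment length, and the per-second byte counts are invented for illustration and do not come from any real title.

```python
def vbr_to_segment_sizes(bytes_per_second, segment_seconds=5):
    """Sum per-second encoded sizes into per-segment sizes.

    `bytes_per_second` is a hypothetical VBR profile: how many bytes the
    encoder spent on each second of video.
    """
    return [sum(bytes_per_second[i:i + segment_seconds])
            for i in range(0, len(bytes_per_second), segment_seconds)]


if __name__ == "__main__":
    calm_then_chase = [100, 100, 120, 110, 100, 400, 450, 500, 480, 420]
    chase_then_calm = [400, 450, 500, 480, 420, 100, 100, 120, 110, 100]
    # Same total bytes, different order: the segment (burst) sizes differ.
    print(vbr_to_segment_sizes(calm_then_chase))
    print(vbr_to_segment_sizes(chase_then_calm))
```

Because each segment is fetched as one burst, the list returned here is, in this simplified picture, also the sequence of burst sizes an observer would see.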
From a leak to a fingerprint (burst sizes over stream time). Question 1: does the pattern of burst (segment) sizes uniquely characterize a title? Diversity: empirically measure pairwise distances for 3500 downloaded and segmented YouTube titles. Question 2: can we learn a title's identifying pattern? Consistency: empirically evaluate the attacker's measurement error bound. Result: ~20% of YouTube titles have fingerprints.
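One way to picture the diversity and consistency checks together is sketched below. The distance function (mean absolute difference of aligned segment sizes) and the "more than twice the error bound" rule are assumptions made for this illustration; the talk does not state which metric or criterion was used, and the catalog values are invented.

```python
def distance(a, b):
    """Mean absolute difference between aligned segment sizes (assumed metric)."""
    pairs = list(zip(a, b))  # zip stops at the shorter sequence
    return sum(abs(x - y) for x, y in pairs) / len(pairs)


def has_fingerprint(title, catalog, error_bound):
    """True if `title` stays distinguishable under measurement error.

    If every other title is more than 2 * error_bound away, a measurement
    within error_bound of `title` cannot also be within error_bound of
    another title (triangle inequality).
    """
    return all(distance(catalog[title], sizes) > 2 * error_bound
               for other, sizes in catalog.items() if other != title)


if __name__ == "__main__":
    # Hypothetical segment sizes in bytes.
    catalog = {
        "title_a": [100, 200, 300],
        "title_b": [100, 205, 300],
        "title_c": [500, 50, 900],
    }
    for name in catalog:
        print(name, has_fingerprint(name, catalog, error_bound=10))
```

In this toy catalog, `title_a` and `title_b` are too close to tell apart at the assumed error bound, while `title_c` is well separated from both.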
Attack overview: on the attacker network, the attacker streams the titles of interest (Die Hard, Pulp Fiction, Armageddon, 12 Monkeys), collects traffic metadata, and uses it to train detectors. The detectors are then applied to the victim network: "Victim is watching Armageddon!"
Attack details: what is the attacker's vantage point on the victim network?
Scenario I: on-path attack. The bursts are observed from an on-path vantage point: Wi-Fi access points, proxies, routers, enterprise or national network censors, ISPs.
Attack details: the detectors are trained from metadata using machine learning.
Deep neural networks: very good at learning high-level concepts that are hard to express formally (e.g., "traffic traces are similar"). Existing NN architectures are very accurate on classification and detection problems.
Advantages of neural networks. Robust: can operate on noisy and coarse measurements. Agnostic to protocol-specific attributes (e.g., QUIC vs. TLS). Can learn features other than burst patterns, e.g., arrival patterns of individual packets. Can use multiple session representations and train on all at once.
Features: each feature is a time series, sampled at 0.25-second intervals (example: bytes per second). [Figure: packet sizes vs. time (seconds), binned at 0.25-second intervals] Features considered: downstream/upstream/total values of bytes per second, packets per second, average packet length, and burst sizes.
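The sampling step can be sketched as simple binning of a packet trace. This is a minimal illustration, not the authors' feature extractor: the function name, the `(timestamp, size)` input format, and the example packets are assumptions. It produces per-interval totals; dividing a byte or packet count by the interval length gives the corresponding per-second rate.

```python
import math


def time_series_features(packets, interval=0.25, duration=1.0):
    """Bin a packet trace into fixed-width time intervals.

    `packets` is a list of (timestamp_seconds, size_bytes) pairs, a
    hypothetical capture format. Returns one time series per feature.
    """
    n_bins = math.ceil(duration / interval)
    byte_counts = [0] * n_bins
    packet_counts = [0] * n_bins
    for timestamp, size in packets:
        # Clamp so a packet at exactly `duration` lands in the last bin.
        index = min(int(timestamp / interval), n_bins - 1)
        byte_counts[index] += size
        packet_counts[index] += 1
    avg_length = [b / c if c else 0.0
                  for b, c in zip(byte_counts, packet_counts)]
    return {
        "bytes": byte_counts,
        "packets": packet_counts,
        "avg_packet_length": avg_length,
    }


if __name__ == "__main__":
    trace = [(0.1, 1500), (0.2, 300), (0.6, 1500), (0.9, 1500), (0.95, 300)]
    for name, series in time_series_features(trace).items():
        print(name, series)
```

Running the same function separately on downstream packets, upstream packets, and their union would give the downstream/upstream/total variants listed above.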
Attack with an on-path attacker: metadata collected on the attacker network trains neural-net detectors, which are then applied to the victim network.
Datasets and identification experiments:
100 titles, 100 1-minute sessions: 100 classes, 98.5% accuracy.
18 titles, 100 3-minute sessions, + 3500 sessions of different other titles (open-world identification): 18+1=19 classes, 99.5% accuracy.
10 titles, 100 1.5-minute sessions: 10 classes, 92.5% accuracy.
10 titles, 100 1-minute sessions: 10 classes, 98.6% accuracy.
Empirical results: confusion matrices for YouTube (feature: total burst size) and Netflix (feature: total burst size). [Figures: two confusion matrices with predicted label on the horizontal axis; one includes an unknown class with 3500 samples] Annotations: exactly 2 false positives; no recurrent confusions (despite many same-series titles).
Tuning for precision. YouTube (feature: total burst size): 0 false positives with 0.988 recall. Netflix (feature: total burst size): 0.0005 false positive rate with 0.93 recall.
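The trade-off behind these numbers can be illustrated with a generic score threshold. This is a sketch of ordinary threshold tuning, not the procedure used in the talk, which the slides do not describe; the function name, scores, and labels are made up for the example.

```python
def recall_at_zero_false_positives(scores, labels):
    """Pick the lowest threshold that rejects every negative session.

    `scores` are hypothetical detector confidences; `labels` are True for
    sessions of the target title. A session is flagged when its score is
    strictly above the threshold. Returns (threshold, recall).
    """
    negatives = [s for s, is_target in zip(scores, labels) if not is_target]
    positives = [s for s, is_target in zip(scores, labels) if is_target]
    threshold = max(negatives) if negatives else float("-inf")
    detected = sum(1 for s in positives if s > threshold)
    return threshold, detected / len(positives)


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2]
    labels = [True, True, True, False, False, False]
    threshold, recall = recall_at_zero_false_positives(scores, labels)
    print(f"threshold={threshold}, recall={recall:.3f}")
```

Raising the threshold removes false positives at the cost of missing some true sessions, which is why recall drops slightly when the detector is tuned for precision.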
Attack details: with neural-net detectors chosen, back to the question of the attacker's vantage point on the victim network.
Off-path attackers: on-path vantage points (Wi-Fi access points, proxies, routers, enterprise or national network censors, ISPs) see the victim network's bursts directly. What about an attacker that is off-path? A visited webpage? A smartphone app? Example: a web ad loaded while checking a Facebook feed while Armageddon is streaming. Three-fold confinement: different device, browser process, sandboxed iframe.
Cross-device attack: the viewer streams video while a neighbor's browser runs a JavaScript attacker client that exchanges messages with the attacker's Web site. The viewer's bursts cause congestion, and the congestion delays the messages. The delays give a noisy, coarse estimate of the actual traffic bursts.
Delay-bursts. [Plot: message delays (milliseconds) and traffic burst sizes (scaled down) vs. time (seconds)] For each traffic burst, compute the aggregate delay induced. Use the resulting time series as input to the neural network.
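The aggregation step can be sketched as grouping delayed messages that arrive close together and summing their excess delay. This is an illustrative sketch only: the function name, the fixed baseline delay, the one-second grouping gap, and the sample values are assumptions, and the talk does not specify how burst boundaries are detected.

```python
def delay_bursts(samples, baseline_ms, gap_seconds=1.0):
    """Aggregate message delays into one value per inferred burst.

    `samples` is a time-sorted list of (time_seconds, delay_ms) pairs
    measured by the attacker's client. Delays above `baseline_ms` that are
    at most `gap_seconds` apart are treated as one burst (an assumption).
    Returns a list of (burst_start_seconds, aggregate_excess_delay_ms).
    """
    bursts = []
    last_delayed_time = None
    for time_s, delay_ms in samples:
        excess = delay_ms - baseline_ms
        if excess <= 0:
            continue  # no congestion observed for this message
        if last_delayed_time is not None and time_s - last_delayed_time <= gap_seconds:
            # Same burst: add this message's excess delay to the running total.
            start, total = bursts[-1]
            bursts[-1] = (start, total + excess)
        else:
            bursts.append((time_s, excess))
        last_delayed_time = time_s
    return bursts


if __name__ == "__main__":
    samples = [(0.0, 10), (0.5, 30), (1.0, 50), (1.5, 10),
               (5.0, 10), (6.0, 40), (6.5, 25), (7.0, 10)]
    print(delay_bursts(samples, baseline_ms=10))
```

The resulting list plays the role of the burst-size series from the on-path setting, only noisier and coarser.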
Delay-bursts vs. traffic bursts. The delay-bursts time series: the delays induced by traffic bursts.
1/10 cross-device attack: precision vs. recall. Accuracy: 0.965; false positive rate: 0.003; recall: 0.933.
Cross-device attack: JavaScript detector code from the attacker's Web site runs in the neighbor's browser while the viewer streams.
Cross-site attack: on the victim's PC, the streaming client runs in one browser window while JavaScript detector code from the attacker's Web site runs in another browser window.
Mitigating the DASH leak. Modern streaming traffic characteristics: the title's bitrate pattern is unique when sampled at few-seconds granularity, and fetching happens at segment granularity (= every few seconds: buffer below threshold? if yes, fetch next segment). This design optimizes quality of experience, server load, and network bandwidth utilization. However, information leakage is intrinsic.
Thank you! Further information and the paper: https://beautyburst.github.io/