SAND: A Fault-Tolerant Streaming Architecture for Network Traffic Analytics


SAND: A Fault-Tolerant Streaming Architecture for Network Traffic Analytics

Qin Liu, John C.S. Lui (The Chinese University of Hong Kong)
Cheng He, Lujia Pan, Wei Fan, Yunlong Shi (Huawei Noah's Ark Lab)

Introduction

Motivation

Network traffic arrives in a streaming fashion, and should be processed in real-time. For example:
1. Network traffic classification
2. Anomaly detection
3. Policy and charging control in cellular networks
4. Recommendations based on user behaviors

Challenges

1. A stream processing system must sustain high-speed network traffic in cellular core networks.
- Existing systems, e.g., S4 [Neumeyer 10] and Storm (http://storm.incubator.apache.org/), are implemented in Java.
- Heavy processing overheads: they cannot sustain high-speed network traffic.

2. For critical applications, it is necessary to provide correct results after failure recovery.
- Existing fault-tolerance schemes either incur high hardware cost or cannot provide correct results after failure recovery (at-least-once vs. exactly-once semantics).

Contributions

We design and implement SAND in C++:
- high performance on network traffic
- a new fault-tolerance scheme

Background

Background: the continuous operator model

- Each node runs an operator with in-memory mutable state.
- For each input event, the state is updated and new events are sent out.
- Mutable state is lost if a node fails.
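The model above can be sketched as a tiny stateful node (Python for brevity; names are illustrative, and SAND itself is implemented in C++):

```python
class CountOperator:
    """Minimal continuous operator: in-memory mutable state that is
    updated per input event, emitting new events downstream. If the
    node crashes, self.counts is lost -- the motivation for
    checkpoint-based fault tolerance."""

    def __init__(self, emit):
        self.counts = {}        # mutable state
        self.emit = emit        # downstream send function

    def on_event(self, key):
        self.counts[key] = self.counts.get(key, 0) + 1
        self.emit((key, self.counts[key]))   # push updated count

out = []
op = CountOperator(out.append)
for k in ["HTTP", "DNS", "HTTP"]:
    op.on_event(k)
print(out[-1])   # ('HTTP', 2)
```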

Example: AppTracker

AppTracker: traffic classification for cellular network traffic. Output the traffic distribution in real-time:

  Application   Distribution
  HTTP          15.60%
  Sina Weibo     4.13%
  QQ             2.56%
  DNS            2.34%
  HTTP in QQ     2.17%

Example: AppTracker

Under the continuous operator model:
- Spout: capture packets from the cellular network
- Decoder: extract IP packets from raw packets
- DPI-Engine: perform deep packet inspection on packets
- Tracker: track the distribution of application-level protocols (HTTP, P2P, Skype, ...)
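A minimal sketch of the Tracker stage's logic (Python for brevity, with illustrative names; not SAND's actual C++ code): it counts packets per application label produced by DPI-Engine and reports the distribution as percentages.

```python
class Tracker:
    """Track the real-time distribution of application protocols."""

    def __init__(self):
        self.counts = {}   # application label -> packet count
        self.total = 0

    def on_packet(self, app):
        self.counts[app] = self.counts.get(app, 0) + 1
        self.total += 1

    def distribution(self):
        # Percentage share per application, as on the AppTracker slide.
        return {a: round(100.0 * n / self.total, 2)
                for a, n in self.counts.items()}

t = Tracker()
for app in ["HTTP"] * 3 + ["DNS"]:
    t.on_packet(app)
print(t.distribution())   # {'HTTP': 75.0, 'DNS': 25.0}
```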

System Design

Architecture of SAND

One coordinator and multiple workers. Each worker can be seen as an operator.

Coordinator

The coordinator is responsible for:
- managing worker executions
- detecting worker failures
- relaying control messages among workers
- monitoring performance statistics

A ZooKeeper cluster provides fault tolerance and a reliable coordination service.

Worker

A worker contains three types of processes:
- The dispatcher decodes streams and distributes them to multiple analyzers.
- Each analyzer independently processes the assigned streams.
- The collector aggregates the intermediate results from all analyzers.

The container daemon spawns or stops these processes and communicates with the coordinator.

Communication Channels

Efficient communication channels:
- Intra-worker: a lock-free shared-memory ring buffer
- Inter-worker: ZeroMQ, a socket library optimized for clustered products
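SAND's intra-worker channel is a lock-free shared-memory ring buffer in C++; as a sketch of the idea (not SAND's actual code), here is the index arithmetic of a bounded single-producer/single-consumer queue in Python. With one producer and one consumer, the head index is written only by the consumer and the tail only by the producer, which is what lets the C++ version avoid locks using atomic loads and stores.

```python
class RingBuffer:
    """Bounded SPSC ring buffer (sketch of the index arithmetic)."""

    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.head = 0   # next slot to read  (consumer-owned)
        self.tail = 0   # next slot to write (producer-owned)

    def push(self, item):
        nxt = (self.tail + 1) % len(self.buf)
        if nxt == self.head:
            return False          # full: producer must retry
        self.buf[self.tail] = item
        self.tail = nxt
        return True

    def pop(self):
        if self.head == self.tail:
            return None           # empty
        item = self.buf[self.head]
        self.head = (self.head + 1) % len(self.buf)
        return item

rb = RingBuffer(4)                    # holds capacity - 1 = 3 items
for i in range(5):
    rb.push(i)                        # pushes of 3 and 4 fail (full)
print([rb.pop() for _ in range(3)])   # [0, 1, 2]
```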

Fault-Tolerance

Previous Fault-Tolerance Schemes

1. Replication: each operator has a replica operator [Hwang 05, Shah 04, Balazinska 08]
- Data streams are processed twice by two identical nodes.
- Synchronization protocols ensure the exact ordering of events in both nodes.
- On failure, the system switches over to the replica nodes.
- Drawback: 2x hardware cost.

Previous Fault-Tolerance Schemes

2. Upstream backup with checkpoints [Fernandez 03, Gu 09]:
- Each node maintains a backup of the events it has forwarded since the last checkpoint.
- On failure, upstream nodes replay the backup events serially to the failover node to recreate the state.
- Less hardware cost, but it is hard to provide correct results after recovery.

Why is it hard?

Stateful continuous operators tightly integrate computation with mutable state, which makes it hard to define clear boundaries at which computation and state can be moved around.

Checkpointing

We need to coordinate the checkpointing operation across workers. In 1985, Chandy and Lamport invented an asynchronous snapshot algorithm for distributed systems; a variant of this algorithm is implemented within SAND.

Checkpointing Protocol

The coordinator initiates a global checkpoint by sending markers to all source workers.

For each worker w:
- on receiving a data event E from worker u:
    if the marker from u has already arrived, w buffers E;
    else, w processes E normally
- on receiving a marker from worker u:
    if markers have arrived from all upstream workers, w starts the checkpointing operation
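The per-worker marker logic above can be sketched as follows (Python with illustrative names; SAND's implementation is C++). Events arriving on a channel whose marker has already passed are buffered so the snapshot stays consistent, then replayed once the checkpoint is taken.

```python
class Worker:
    """Marker handling for a Chandy-Lamport-style checkpoint (sketch)."""

    def __init__(self, upstream):
        self.upstream = set(upstream)   # input channels
        self.marked = set()             # channels whose marker arrived
        self.buffered = []              # events held until the snapshot
        self.state = 0                  # mutable operator state (a sum)
        self.checkpoints = []

    def on_event(self, src, event):
        if src in self.marked:
            self.buffered.append((src, event))   # freeze this channel
        else:
            self.state += event                  # process normally

    def on_marker(self, src):
        self.marked.add(src)
        if self.marked == self.upstream:         # all markers arrived
            self.checkpoints.append(self.state)  # consistent snapshot
            self.marked.clear()
            for _, e in self.buffered:           # replay held events
                self.state += e
            self.buffered.clear()

w = Worker(["u", "v"])
w.on_event("u", 1)
w.on_marker("u")
w.on_event("u", 5)     # after u's marker: buffered, not processed
w.on_event("v", 2)     # v's marker not yet arrived: processed
w.on_marker("v")       # all markers in -> checkpoint state
print(w.checkpoints)   # [3]
print(w.state)         # 8 (buffered event replayed)
```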

Checkpointing Operation

On each worker:
- When a checkpoint starts, the worker creates child processes using fork.
- The parent processes then resume normal processing.
- The child processes write the internal state to HDFS, which performs replication for data reliability.
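The fork trick can be sketched as below (Python, assuming a POSIX system; SAND writes to HDFS, while this sketch writes to a local file, and the names are illustrative). fork() gives the child a copy-on-write snapshot of the parent's memory, so the parent resumes processing immediately while the child persists the frozen state.

```python
import json
import os
import tempfile

def checkpoint(state, path):
    """Fork a child to persist state; the parent returns immediately."""
    pid = os.fork()
    if pid == 0:                       # child: sees state as of fork()
        with open(path, "w") as f:
            json.dump(state, f)
        os._exit(0)                    # skip the parent's cleanup handlers
    return pid                         # parent: continue processing

state = {"HTTP": 156, "DNS": 23}       # the operator's mutable state
path = tempfile.NamedTemporaryFile(delete=False).name
pid = checkpoint(state, path)
state["HTTP"] += 1                     # parent keeps mutating after fork
os.waitpid(pid, 0)                     # wait only to make the example deterministic
with open(path) as f:
    snap = json.load(f)
print(snap)                            # snapshot unaffected by the later update
```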

Output Buffer

Buffer output events for recovery:
- Each worker records output data events in its output buffer, so that it can replay output events during failure recovery.
- When global checkpoint c is finished, data placed in the output buffers before checkpoint c can be deleted.
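A minimal sketch of such an output buffer (Python, illustrative names): outgoing events carry sequence numbers, checkpoint completion records a cut-off, and everything before a finished checkpoint can be truncated while everything after it remains replayable.

```python
class OutputBuffer:
    """Record output events for replay; truncate at finished checkpoints."""

    def __init__(self):
        self.buf = []             # (seq, event) pairs
        self.seq = 0
        self.checkpoint_seq = {}  # checkpoint id -> first seq after it

    def emit(self, event):
        self.buf.append((self.seq, event))
        self.seq += 1

    def mark_checkpoint(self, cid):
        # Called when this worker takes part in global checkpoint cid.
        self.checkpoint_seq[cid] = self.seq

    def truncate(self, cid):
        # Checkpoint cid finished globally: drop events before it.
        cut = self.checkpoint_seq[cid]
        self.buf = [(s, e) for s, e in self.buf if s >= cut]

    def replay_after(self, cid):
        # Events a downstream failover node must be re-sent.
        cut = self.checkpoint_seq[cid]
        return [e for s, e in self.buf if s >= cut]

ob = OutputBuffer()
ob.emit("e0"); ob.emit("e1")
ob.mark_checkpoint(1)
ob.emit("e2")
print(ob.replay_after(1))       # ['e2']
ob.truncate(1)
print([e for _, e in ob.buf])   # ['e2']
```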

Failure Recovery

[Figure: example worker DAG with workers a-h illustrating the sets below]

- F: failed workers
- D_F: downstream workers of F
- F and D_F: rolled back to the most recent checkpoint c
- P_F: the upstream workers of F and D_F
- Workers in P_F replay output events after checkpoint c
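The recovery sets can be computed from the worker DAG as sketched below (Python, illustrative names and an invented example topology, not the slide's exact graph): D_F is everything reachable downstream of the failed set, and P_F is the upstream frontier that must replay its output buffers.

```python
from collections import deque

def recovery_sets(edges, failed):
    """Return (D_F, P_F) for failed worker set F in a worker DAG.

    edges: list of (upstream, downstream) worker pairs.
    """
    down, up = {}, {}
    for u, v in edges:
        down.setdefault(u, set()).add(v)
        up.setdefault(v, set()).add(u)

    # D_F: all workers reachable downstream from F (BFS).
    d_f, q = set(), deque(failed)
    while q:
        for v in down.get(q.popleft(), ()):
            if v not in d_f and v not in failed:
                d_f.add(v)
                q.append(v)

    rolled_back = set(failed) | d_f   # rolled back to checkpoint c

    # P_F: upstream neighbours of rolled-back workers that survived;
    # they replay buffered output events after checkpoint c.
    p_f = set()
    for w in rolled_back:
        p_f |= up.get(w, set())
    return d_f, p_f - rolled_back

edges = [("a", "b"), ("b", "d"), ("c", "d"), ("d", "f"), ("e", "f")]
d_f, p_f = recovery_sets(edges, {"d"})
print(sorted(d_f))   # ['f']
print(sorted(p_f))   # ['b', 'c', 'e']
```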

Evaluation

Experiment 1

Testbed: one quad-core machine with 4GB RAM
Dataset: packet header trace; 331 million packets accounting for 143GB of traffic
Application: packet counter

  System     Packets/s   Payload Rate   Header Rate
  Storm      260K        840Mb/s        81.15Mb/s
  Blockmon   2.7M        8.4Gb/s        844.9Mb/s
  SAND       9.6M        31.4Gb/s       3031.7Mb/s

SAND's throughput is 3.7x and 37.4x that of Blockmon [Simoncelli 13] and Storm, respectively.

Experiment 2

Testbed: three 16-core machines with 94GB RAM
Dataset: a 2-hour network trace (32GB) collected from a commercial GPRS core network in China in 2013
Application: AppTracker

Experiment 2

[Figure: throughput (Mb/s) vs. number of analyzers, for checkpointing intervals of 2s, 5s, and 10s, and with no fault tolerance]

- SAND scales out by running parallel workers on multiple servers.
- The fault-tolerance mechanism adds negligible overheads.

Experiment 3

[Figure: throughput (Mb/s) over time, with failures injected at times t1-t5, for checkpointing intervals of 5s and 10s]

- SAND recovers in the order of seconds.
- Recovery time is proportional to the checkpointing interval.

Conclusion

- We present a new distributed stream processing system for network analytics.
- We propose a novel checkpointing protocol that provides reliable fault tolerance for stream processing systems.
- SAND can operate at the core-router level and can recover from failures in the order of seconds.

Thank you! Q & A