An Empirical Study of High Availability in Stream Processing Systems

Similar documents
A Hybrid Approach to High Availability in Stream Processing Systems

SAND: A Fault-Tolerant Streaming Architecture for Network Traffic Analytics

NFSv4 as the Building Block for Fault Tolerant Applications

Fault-Tolerance Implementation in Typical Distributed Stream Processing Systems

Fault Tolerance for Highly Available Internet Services: Concept, Approaches, and Issues

Application-Transparent Checkpoint/Restart for MPI Programs over InfiniBand

The Google File System

GFS: The Google File System. Dr. Yingwu Zhu

Discretized Streams. An Efficient and Fault-Tolerant Model for Stream Processing on Large Clusters

The Google File System

GFS: The Google File System

Distributed System. Gang Wu. Spring,2018

Copyright 2012 EMC Corporation. All rights reserved.

CEC: Continuous Eventual Checkpointing for Data Stream Processing Operators

IBM InfoSphere Streams v4.0 Performance Best Practices

Estimate Mailbox Storage Capacity Requirements Estimate Mailbox I/O Requirements Determine Storage Type Choose Storage Solution Determine Number of

Distributed File Systems II

WHITE PAPER: BEST PRACTICES. Sizing and Scalability Recommendations for Symantec Endpoint Protection. Symantec Enterprise Security Solutions Group

Challenges in Data Stream Processing

Load Shedding in Network Monitoring Applications

The Google File System

iscsi Target Usage Guide December 15, 2017

Dell DR4000 Replication Overview

FAULT TOLERANT SYSTEMS

White Paper. A System for Archiving, Recovery, and Storage Optimization. Mimosa NearPoint for Microsoft

The Google File System

Clustering In A SAN For High Availability

VERITAS Volume Replicator. Successful Replication and Disaster Recovery

REMEM: REmote MEMory as Checkpointing Storage

VIAF: Verification-based Integrity Assurance Framework for MapReduce. YongzhiWang, JinpengWei

Data Protection for Cisco HyperFlex with Veeam Availability Suite. Solution Overview Cisco Public

vsan Remote Office Deployment January 09, 2018

An Oracle White Paper May Oracle VM 3: Overview of Disaster Recovery Solutions

Enhancing Checkpoint Performance with Staging IO & SSD

A Global Operating System for HPC Clusters

Native vsphere Storage for Remote and Branch Offices

ECE 7650 Scalable and Secure Internet Services and Architecture ---- A Systems Perspective

Blizzard: A Distributed Queue

Fault-tolerant Stream Processing using a Distributed, Replicated File System

The Google File System (GFS)

Replication Solutions with Open-E Data Storage Server (DSS) April 2009

Streaming data Model is opposite Queries are usually fixed and data are flows through the system.

Technical Note. Dell/EMC Solutions for Microsoft SQL Server 2005 Always On Technologies. Abstract

Synology Alex Wang CEO, Synology America

Team 4: Unicommerce. Team Members

LRC: Dependency-Aware Cache Management for Data Analytics Clusters. Yinghao Yu, Wei Wang, Jun Zhang, and Khaled B. Letaief IEEE INFOCOM 2017

ARTIST-Relevant Research from Linköping

April 21, 2017 Revision GridDB Reliability and Robustness

FuxiSort. Jiamang Wang, Yongjun Wu, Hua Cai, Zhipeng Tang, Zhiqiang Lv, Bin Lu, Yangyu Tao, Chao Li, Jingren Zhou, Hong Tang Alibaba Group Inc

EMC CLARiiON CX3-40. Reference Architecture. Enterprise Solutions for Microsoft Exchange 2007

Oracle E-Business Availability Options. Solution Series for Oracle: 2 of 5

Database Services at CERN with Oracle 10g RAC and ASM on Commodity HW

Current Topics in OS Research. So, what s hot?

Coordinating Parallel HSM in Object-based Cluster Filesystems

SANGFOR AD Product Series

vsan Disaster Recovery November 19, 2017

AR-SMT: A Microarchitectural Approach to Fault Tolerance in Microprocessors

Microsoft Office SharePoint Server 2007

Replication in Distributed Systems

! Design constraints. " Component failures are the norm. " Files are huge by traditional standards. ! POSIX-like

Dell PowerEdge R720xd with PERC H710P: A Balanced Configuration for Microsoft Exchange 2010 Solutions

Adaptive replica consistency policy for Kafka

Microsoft SQL Server 2012 Fast Track Reference Configuration Using PowerEdge R720 and EqualLogic PS6110XV Arrays

Storage Integration with Host-based Write-back Caching

Designing Fault-Tolerant Applications

Synology High Availability (SHA)

MOVING TOWARDS ZERO DOWNTIME FOR WINTEL Caddy Tan 21 September Leaders Have Vision visionsolutions.com 1

High Availability and Disaster Recovery features in Microsoft Exchange Server 2007 SP1

The advantages of architecting an open iscsi SAN

BUSINESS CONTINUITY: THE PROFIT SCENARIO

Business Continuity and Disaster Recovery. Ed Crowley Ch 12

Disaster Recovery Options

A Comparison of Stream-Oriented High-Availability Algorithms

The Microsoft Large Mailbox Vision

The Google File System

EMC CLARiiON CX3-40. Reference Architecture. Enterprise Solutions for Microsoft Exchange Enabled by MirrorView/S

Software-defined Storage: Fast, Safe and Efficient

Advanced Memory Management

Distributed Systems Question Bank UNIT 1 Chapter 1 1. Define distributed systems. What are the significant issues of the distributed systems?

Vendor: EMC. Exam Code: E Exam Name: Cloud Infrastructure and Services Exam. Version: Demo

CMS Failover White Paper. Surveillance Station and above

Protecting Mission-Critical Workloads with VMware Fault Tolerance W H I T E P A P E R

goals monitoring, fault tolerance, auto-recovery (thousands of low-cost machines) handle appends efficiently (no random writes & sequential reads)

High Availability Client Login Profiles

Auto Management for Apache Kafka and Distributed Stateful System in General

EMC Business Continuity for Microsoft Applications

The Google File System

White Paper. How to select a cloud disaster recovery method that meets your requirements.

Supporting Application- Tailored Grid File System Sessions with WSRF-Based Services

Cutting the Cord: A Robust Wireless Facilities Network for Data Centers

ZHT: Const Eventual Consistency Support For ZHT. Group Member: Shukun Xie Ran Xin

Distributed systems. Lecture 6: distributed transactions, elections, consensus and replication. Malte Schwarzkopf

CA485 Ray Walshe Google File System

Care and Feeding of Oracle Rdb Hot Standby

DiskReduce: Making Room for More Data on DISCs. Wittawat Tantisiriroj

CSCI 4717 Computer Architecture

The Google File System. Alexandru Costan

Introduction to the Service Availability Forum

The Fusion Distributed File System

TITLE. the IT Landscape

Transcription:

An Empirical Study of High Availability in Stream Processing Systems Yu Gu, Zhe Zhang, Fan Ye, Hao Yang, Minkyong Kim, Hui Lei, Zhen Liu

Stream Processing Model software operators (PEs) Ω Unexpected machine failures Loss of data and internal state Disruption to normal processing Challenge: how to preserve data / state and minimize disruption? subjob deployment machines 2 An Empirical Study of High Availability in DSPS

Existing approaches: Active Standby vs. Passive Standby Passive Active Standby 3 An Empirical Study of High Availability in DSPS

Basic Tradeoff between AS and PS Active Standby Overhead: double processing load; at least double message load Recovery delay: almost zero Passive Standby Overhead: checkpoint messages Recovery delay: failure detection + deploy new job + recover state 4

Motivation Tradeoffs of AS & PS not fully understood Only systematic comparison: [Hwang ICDE05] Used a variant of PS with high overhead Evaluated in simulations rather than real systems Our contributions A sweeping checkpointing method Reducing checkpoint overhead by one order of magnitude Proof of consistency A real prototype distributed stream processing system Comprehensive and empirical evaluation of AS and PS 5 An Empirical Study of High Availability in DSPS

Outline Background and Motivation Design and Implementation Sweeping Checkpointing System Architecture Performance Evaluation Related Work Conclusions 6 An Empirical Study of High Availability in DSPS

Overview of Sweeping Checkpointing What to include Input queues Internal states Output queues When to trim Checkpointing Multiple PEs Proof of consistency recoverable from upstream output queues dominating ckpt size with high data rates 7 An Empirical Study of High Availability in DSPS

When to Trim In U s output queue, only removing those packets that have been processed and checkpointed by D upstream node U downstream node D 5 4 3 2 1 1 1 1 8 An Empirical Study of High Availability in DSPS

When to Trim In U s output queue, only removing those packets that have been processed and checkpointed by D upstream node U downstream node D 5 4 3 2 1 1 1 9 An Empirical Study of High Availability in DSPS

When to Trim In U s output queue, only removing those packets that have been processed and checkpointed by D upstream node U downstream node D 5 4 3 2 1 2 1 10 An Empirical Study of High Availability in DSPS

When to Trim In U s output queue, only removing those packets that have been processed and checkpointed by D checkpoint upstream node U downstream node D 5 4 3 2 1 2 1 11 An Empirical Study of High Availability in DSPS

When to Trim In U s output queue, only removing those packets that have been processed and checkpointed by D 2 1 checkpoint upstream node U downstream node D 5 4 3 2 1 2 1 1 and 2 have been processed and checkpointed 12 An Empirical Study of High Availability in DSPS

Checkpointing Multiple PEs Synchronous snapshot of the whole sub job Freeze all PEs, then checkpoint all state, then resume all PEs checkpoint manager CM CM sub job 1 sub job 2 Site 1 Site 2 13 An Empirical Study of High Availability in DSPS

Checkpointing Multiple PEs Individual Freeze / checkpoint / resume each PE individually checkpoint manager CM CM sub job 1 sub job 2 Site 1 Site 2 14 An Empirical Study of High Availability in DSPS

Checkpointing Multiple PEs Sweeping checkpoint manager CM Checkpoint a PE immediately after receipt of acknowledgement and output queue trimming CM sub job 1 sub job 2 Site 1 Site 2 15 An Empirical Study of High Availability in DSPS

Sketch of Proof for Consistency Scenario: single node failure (N i ) Actions for recovery Recovering operator state Recovering input queue from output queues of upstream Reprocessing affected elements Scenario: multiple concurrent node failures Actions for recovery Finding and recovering most upstream failed node Recovering other nodes recursively only trimmed to reflect latest checkpoint of N i 16 An Empirical Study of High Availability in DSPS

System Architecture Remote Execution Coordinator manage HA protection for distributed jobs Node Job Management manage job deployment FM REC JMN CM Checkpoint Manager manage checkpoint tasks according to assigned checkpoint mechanism Failover Manager monitor other nodes and initiate recovery Jobs and Processing Nodes Job take data from upstream, execute processing tasks, and send results to downstream Features: Job A distributed job consists of multiple subjobs, each of which can choose its own specific HA mechanism (AS, PS) The system coordinates the deployment and protection of subjobs among all machines 17 An Empirical Study of High Availability in DSPS

Outline Background and Motivation Design and Implementation Performance Evaluation Experiment Setup Overhead and Delay Results Related Work Conclusions 18 An Empirical Study of High Availability in DSPS

Experiment Setup Testbed: a cluster environment Dual Xeon 3.06GHz CPUs, 800MHz, 512KB L2 caches, 4GB memory, 80GB disk 1Gbps LAN A distributed job containing 4 subjobs, each having 2 processing nodes running on one machine Metrics Recovery delay Message overhead 19 An Empirical Study of High Availability in DSPS

Avg. Checkpoint Queue Size Comparison 3000 elements/second Sweeping reduces checkpoint size by about 96% 20 An Empirical Study of High Availability in DSPS

Checkpoint Time Comparison checkpoint interval = 500 ms Sweeping reduces checkpoint time by about 75% 21 An Empirical Study of High Availability in DSPS

Message Overhead Comparison AS-AS incurs almost 4 times message overhead vs. PS 22 An Empirical Study of High Availability in DSPS

Recovery Delay Decomposition Detection delay becomes dominant with large heartbeat interval 23 An Empirical Study of High Availability in DSPS

Outline Background and Motivation Design and Implementation Performance Evaluation Related Work Conclusions 24 An Empirical Study of High Availability in DSPS

Related Work Borealis 1. Fault tolerance in the Borealis distributed stream processing system (SIGMOD 05) A variant of AS Achieving flexible trade-off between availability and consistency by introducing tentative data concept 2. Fast and reliable stream processing over wide area networks (ICDE 07) A variant of AS Most expensive variant; upstream sending to all downstream replicas No switch required when failure occurs 3. A cooperative, self-configuring high-availability solution for stream processing (ICDE 07) A variant of PS Novel checkpoint scheduling and backup assignment Balances recovery load over multiple servers 4. Borealis-R: a replication-transparent stream processing system for wide-area monitoring applications (SIGMOD 08) A variant of AS Same technique as in [2] Novel mechanism to allow replicas execute without coordination but still produce consistent results 25 An Empirical Study of High Availability in DSPS

Related Work System S 5. Towards automatic fault recovery in System-S (ICAC 07) Checkpoint state Recovery of JMN, not jobs 6. Failure recovery in cooperative data streaming analysis (ARES 07) How to select a backup site on demand, not recovery technique 7. Online failure forecast for fault-tolerant data stream processing (ICDE 08) Prediction of potential failures, a monitoring technique Leverages varies system metrics (system productivity, available CPU, etc.) to predict failures before they occur Comparison of AS and PS 8. High-availability algorithms for distributed stream processing (ICDE 05) Valuable summaries of basic tradeoffs PS variant has large overhead Evaluation mainly based on simulations 26 An Empirical Study of High Availability in DSPS

Conclusions Fundamental tradeoffs between AS and PS in stream processing systems not fully understood Our contributions A novel sweeping checkpoint mechanism Proof of consistency System implementation and empirical evaluation Performance results providing valuable insights Importance of queue trimming in checkpointing Decomposition of recovery delay 27 An Empirical Study of High Availability in DSPS

Questions? Zhe Zhang zhezhang@ornl.gov http://users.nccs.gov/~zzhang3/pubs/empirical-middleware09.pdf 28 An Empirical Study of High Availability in DSPS