Typhoon: An SDN Enhanced Real-Time Big Data Streaming Framework

Typhoon: An SDN Enhanced Real-Time Big Data Streaming Framework
Junguk Cho, Hyunseok Chang, Sarit Mukherjee, T.V. Lakshman, and Jacobus Van der Merwe

Big Data Era
- Big data analysis is increasingly common
  - Recommendation systems (e.g., Netflix, Amazon)
  - Targeted advertising (e.g., Google, Facebook)
  - Mobile network management (e.g., AT&T, Verizon)
- Real-time stream frameworks
  - Continuously process unbounded streams of data
  - High throughput and low latency through computation parallelism
  - Fault tolerance
  - Many open-source stream frameworks (e.g., Storm, Heron, Flink, Spark)

Limitations of Current Real-Time Stream Frameworks
- Lack of runtime flexibility
  - Scale-in/out, computation logic, and routing policy
  - Existing solutions
    - Shutdown -> modification -> restart: loses data
    - Instant swapping: requires a stable front-end queue system (e.g., Kafka) and is still resource inefficient
- Not optimal for one-to-many communication
  - No broadcast concept
  - The same computation is repeated multiple times in the application layer
- The inflexibility and performance problems stem mainly from application-layer data routing
- Lack of management for deployed stream apps
  - Limited to application monitoring

Typhoon: An SDN Enhanced Real-Time Stream Framework
- Offer flexible stream processing pipelines for dynamic reconfigurations
  - Scale-in/out, computation logic, routing policy
- Optimize one-to-many communication
  - Broadcast concept at the network layer
- Enhance the management capabilities of stream frameworks
  - SDN control plane applications
- Cross-layer design
  - Partially offload the application data routing layer into the network layer
  - Manage both layers from an SDN controller

Why SDN in a Stream Framework?
- SDN benefits
  - Centralized view and control
  - Programmability
- One-to-many communication
  - More efficient packet-level replication at the network layer
- Well-defined protocol and interface
  - OpenFlow
- Evolvable framework
  - Extensible via control plane applications

Can We Apply SDN to a Stream Framework?
- Conceptually similar to computer networks
  - Graph-based communication pattern
- [Figure labels: round-robin and key-based routing; computing and routing components]

Can We Apply SDN to a Stream Framework?
- All topology information is pre-defined in the submitted application
  - Fits well with the proactive SDN model
- [Figure: the app developer submits a new application or reconfiguration request as a logical topology, Input (1), Split (2), Count (2), Aggregator (1); the stream framework turns it into a physical topology with one Input, two Split, two Count, and one Aggregator worker]

Current Stream Frameworks
[Architecture figure. Components shown: streaming manager (topology builder & scheduler), central coordinator, and a compute cluster with a worker agent and workers]

Typhoon Architecture
[Architecture figure. Components shown: central coordinator; streaming manager (topology builder & scheduler, dynamic topology manager); SDN controller with SDN control plane applications; compute cluster with a worker agent, workers, and a software SDN switch]

Typhoon Design Challenges
- How can we integrate SDN into stream frameworks?
- How can we support dynamic reconfigurations?

Integration of SDN into Stream Framework
- [Figure: a worker agent and workers run alongside a software SDN switch; each worker consists of data computation, data routing, and an I/O layer, and attaches to a port on the switch]
- To use SDN for data forwarding, Typhoon must define the match fields of its flow rules and a packet format that those fields can match

Design Flow Rules and Packet Format in Typhoon
- Data tuple: the stream communication data model for worker-to-worker communication in existing stream frameworks
  - Metadata of data tuple: tuple length, destination worker ID, source worker ID, stream ID
  - Output from data computation
  - The IDs are unique within a stream application
- Custom transport protocol in Typhoon
  - Ethernet header: destination worker ID, source worker ID, EtherType
  - Payload (data tuple): tuple length, stream ID, output from data computation
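A minimal sketch of how such a frame might be laid out. The slides name the fields but not their widths or the EtherType value, so everything concrete here is an assumption for illustration: 6-byte worker IDs placed in the Ethernet address fields, the EtherType 0x88B5 (reserved by IEEE for local experimental use), a 4-byte tuple length that counts only the output bytes, and a 2-byte stream ID. The function names are hypothetical.

```python
import struct

# Assumed layout (not from the slides): worker IDs occupy the 6-byte
# Ethernet address fields; 0x88B5 is an EtherType reserved for local
# experimental use, chosen here only for illustration.
ETHER_TYPE = 0x88B5
ETH_HEADER = struct.Struct("!6s6sH")  # dst worker ID, src worker ID, EtherType
TUPLE_META = struct.Struct("!IH")     # tuple length, stream ID


def worker_id_to_addr(worker_id: int) -> bytes:
    """Encode an integer worker ID as a 6-byte address field."""
    return worker_id.to_bytes(6, "big")


def build_frame(dst_worker: int, src_worker: int, stream_id: int, output: bytes) -> bytes:
    """Wrap one data tuple in an Ethernet-style frame."""
    header = ETH_HEADER.pack(worker_id_to_addr(dst_worker),
                             worker_id_to_addr(src_worker), ETHER_TYPE)
    payload = TUPLE_META.pack(len(output), stream_id) + output
    return header + payload


def parse_frame(frame: bytes):
    """Recover (dst_worker, src_worker, stream_id, output) from a frame."""
    dst, src, ether_type = ETH_HEADER.unpack_from(frame, 0)
    if ether_type != ETHER_TYPE:
        raise ValueError("unexpected EtherType")
    length, stream_id = TUPLE_META.unpack_from(frame, ETH_HEADER.size)
    start = ETH_HEADER.size + TUPLE_META.size
    output = frame[start:start + length]
    return int.from_bytes(dst, "big"), int.from_bytes(src, "big"), stream_id, output
```

Moving the worker IDs out of the tuple metadata and into the header is what lets a switch forward on them without parsing the payload.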

Offloading Data Forwarding from Application Layer
- Packet format created by the I/O layer
  - Ethernet header: destination worker ID, source worker ID, EtherType
  - Payload: set of data tuples
- Pre-installed flow rule in the software SDN switch
  - Match: in_port=[src worker port], dl_dst=[dest worker ID], dl_src=[src worker ID]
  - Action: output=[dest worker port]
- [Figure: source and destination workers, each with data computation, data routing, and I/O layer, attached to ports on the software SDN switch]
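The match/action rule above can be sketched as a toy exact-match flow table. This is an illustration only, not an OpenFlow implementation: the class and method names are hypothetical, and worker IDs and ports are plain integers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FlowRule:
    in_port: int   # port of the source worker
    dl_dst: int    # destination worker ID
    dl_src: int    # source worker ID
    out_port: int  # port of the destination worker


class SoftwareSwitch:
    """Toy exact-match flow table; a stand-in for a real software SDN switch."""

    def __init__(self) -> None:
        self.rules: List[FlowRule] = []

    def install(self, rule: FlowRule) -> None:
        """Pre-install a rule, as a controller would before traffic flows."""
        self.rules.append(rule)

    def forward(self, in_port: int, dl_dst: int, dl_src: int) -> Optional[int]:
        """Return the output port for a packet, or None on a table miss."""
        for rule in self.rules:
            if (rule.in_port, rule.dl_dst, rule.dl_src) == (in_port, dl_dst, dl_src):
                return rule.out_port
        return None
```

For example, after installing `FlowRule(in_port=1, dl_dst=20, dl_src=10, out_port=2)`, a packet from worker 10 to worker 20 arriving on port 1 is sent to port 2, while a packet for an unknown destination misses the table.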

Typhoon Design Challenges
- How can we integrate SDN into stream frameworks?
- How can we support dynamic reconfigurations?
  - What are the requirements for dynamic reconfigurations?

Application Layer Data Routing in Typhoon
- Worker data routing consists of policy-specific routing states, policy-independent routing states, and routing computations
- Policy-specific routing states
  - Routing type: round-robin, key-based, one-to-many
  - States: counter, index for hash
- Policy-independent routing states
  - Destination worker IDs: Destination 1 ... Destination N
- Routing computations
    /* Round robin */
    index = (counter++) % numdestworkers;
    dstworker = nextworkers[index];
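The routing computations above can be sketched as follows, keeping the policy-independent state (the destination list) separate from the policy-specific state (the round-robin counter). The class name and the use of CRC32 as the key hash are assumptions for illustration; the slides only give the round-robin pseudocode.

```python
import zlib
from typing import List


class Router:
    """Per-worker data routing with the three routing types from the slides."""

    def __init__(self, next_workers: List[int]) -> None:
        self.next_workers = list(next_workers)  # policy-independent state
        self.counter = 0                        # policy-specific state (round-robin)

    def round_robin(self) -> int:
        """Pick destinations in turn: index = (counter++) % numdestworkers."""
        index = self.counter % len(self.next_workers)
        self.counter += 1
        return self.next_workers[index]

    def key_based(self, key: bytes) -> int:
        """Send equal keys to the same destination (CRC32 is an arbitrary stable hash)."""
        index = zlib.crc32(key) % len(self.next_workers)
        return self.next_workers[index]

    def one_to_many(self) -> List[int]:
        """Every destination receives the tuple."""
        return list(self.next_workers)
```

With destinations `[11, 12, 13]`, four successive `round_robin()` calls return 11, 12, 13, 11.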

Application Layer Data Routing in Typhoon
- Update routing states if needed
- Policy-independent routing states (destination worker IDs: Destination 1 ... Destination N)
  - Updated for scale-in/out and computation logic updates
- Policy-specific routing states (routing type; states such as counter and index for hash)
  - Updated to change the routing type and its states

Typhoon Design Challenges
- How can we integrate SDN into stream frameworks?
- How can we support dynamic reconfigurations?
  - What are the requirements for dynamic reconfigurations?
    - Updating per-worker routing states for reconfigurations
    - Framework-level coordination to avoid packet loss during reconfiguration
  - How can Typhoon manage per-worker routing states in a centralized manner?
  - What are the protocol and interface for managing application-layer routing states?
  - How can Typhoon prevent packet loss during reconfigurations?

How to Manage Per-Worker Routing States in Typhoon
- Offload routing state management from the application layer to the network layer
- SDN controller (i.e., network)
  - Manages application-layer routing states in a centralized manner
  - Policy-independent routing states (e.g., forwarding tables)
  - Policy-specific routing states (e.g., index for key-based routing)

What Is the Management Layer in Typhoon?
- What is the transport method for controlling each worker?
  - PacketOut in OpenFlow (i.e., the protocol)
  - Each worker is coupled to a port in a software SDN switch
- What is the message format for carrying routing states?
  - Control tuple (i.e., the interface): re-uses the data tuple format of stream frameworks for framework compatibility
  - Metadata: tuple length, control tuple type
  - Routing states (e.g., new destination IDs)
- [Figure: the SDN controller and its SDN control plane application reach a worker's I/O layer through the worker's port on the software SDN switch]
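A minimal sketch of a control tuple carrying new destination IDs and of a worker applying it to its routing state. The slides name the fields (tuple length, control tuple type, routing states) but not their encoding, so the field widths, the type code `SET_DESTINATIONS`, and the choice to reset the round-robin counter are all assumptions. Delivery via OpenFlow PacketOut is outside the sketch; only the message body is modeled.

```python
import struct
from typing import List

# Hypothetical control tuple type code; the slides give no concrete values.
SET_DESTINATIONS = 1
META = struct.Struct("!IB")  # tuple length, control tuple type (assumed widths)


def encode_control_tuple(tuple_type: int, worker_ids: List[int]) -> bytes:
    """Build a control tuple whose body is a list of 4-byte worker IDs."""
    body = struct.pack(f"!{len(worker_ids)}I", *worker_ids)
    return META.pack(len(body), tuple_type) + body


def apply_control_tuple(routing_state: dict, message: bytes) -> None:
    """Update a worker's routing state in place from a control tuple."""
    length, tuple_type = META.unpack_from(message, 0)
    body = message[META.size:META.size + length]
    if tuple_type == SET_DESTINATIONS:
        routing_state["next_workers"] = list(struct.unpack(f"!{length // 4}I", body))
        # Design choice of this sketch: restart round-robin on a new destination set.
        routing_state["counter"] = 0
    else:
        raise ValueError(f"unknown control tuple type {tuple_type}")
```

In a scale-out, for instance, the controller would send the source worker a tuple listing the old destinations plus the new one, and the worker would start routing to all of them.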

Stable Dynamic Reconfiguration in Typhoon
- How can Typhoon prevent packet loss during reconfigurations?
  - Update each component in a well-defined order under framework coordination
- [Figure: scale-out and scale-in between a source, a destination, and a new destination]
- No packet loss under framework coordination

Runtime Flexibility in Typhoon
- Design flexible stream processing pipelines
  - Offload routing management to an SDN controller
    - Leverage control tuples carried by OpenFlow PacketOut
  - Offload data forwarding to the network with SDN flow rules
- Support reconfigurations under framework coordination
  - Prevent packet loss during reconfigurations

Limitations of Current Real-Time Stream Frameworks
- Lack of runtime flexibility
- Not optimal for one-to-many communication
- Lack of management for deployed stream apps

Suboptimal One-to-Many Communication
- [Figure: a source worker (data computation, data routing, I/O layer) sends its output from data computation to Destination Worker 1 and Destination Worker 2 as separate data tuples, each carrying tuple length, destination worker ID, source worker ID, stream ID, and the output]
- Performance degrades because the same computation is repeated
  - Heavy repeated serialization
  - Data copies to different TCP connections
- No broadcast concept in the application layer

Typhoon for One-to-Many Communication
- Packet format
  - Ethernet header fields as shown: BROADCAST, source worker ID, 0xffff
  - Payload: set of tuples
- Flow rule in the software SDN switch
  - Match: in_port=[source worker port], dl_dst=[broadcast], dl_src=[source worker ID]
  - Action: output=[all destination workers' ports]
- [Figure: a source worker and Destination Workers 1 and 2, each with data computation, data routing, and I/O layer, attached to ports on the software SDN switch]
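To make the contrast concrete, the sketch below compares application-layer fan-out, which serializes once per destination, with a toy switch that replicates one already-serialized frame to every destination port. `pickle` stands in for the framework's serializer, and all names are hypothetical; this illustrates the idea and is not a model of the actual Typhoon data path.

```python
import pickle
from typing import Dict, List, Tuple


def app_layer_fanout(output: object, destinations: List[int]) -> Dict[int, bytes]:
    """Serialize once per destination, as in application-layer one-to-many."""
    return {dst: pickle.dumps((dst, output)) for dst in destinations}


class BroadcastSwitch:
    """Toy switch holding broadcast rules: (in_port, dl_src) -> output ports."""

    def __init__(self) -> None:
        self.broadcast_rules: Dict[Tuple[int, int], List[int]] = {}

    def install(self, in_port: int, dl_src: int, out_ports: List[int]) -> None:
        """Install a rule that outputs to all destination workers' ports."""
        self.broadcast_rules[(in_port, dl_src)] = list(out_ports)

    def replicate(self, in_port: int, dl_src: int, frame: bytes) -> Dict[int, bytes]:
        """Copy the already-serialized frame to every configured port."""
        ports = self.broadcast_rules.get((in_port, dl_src), [])
        return {port: frame for port in ports}
```

With three destinations, `app_layer_fanout` runs the serializer three times, whereas the source worker in the switch-based path serializes once and the switch hands the same bytes to each port.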

One-to-Many Communication in Typhoon
- Straightforward, but very effective
- Uses a broadcast concept in the network
  - A worker I/O layer and an SDN controller that are aware of one-to-many communication
  - Packet-level replication at the network layer

Limitations of Current Real-Time Stream Frameworks
- Lack of runtime flexibility
- Not optimal for one-to-many communication
- Lack of management for deployed stream apps
  - Only monitoring

SDN Control Plane Applications
- Unique opportunities from tightly coupling the framework and SDN
  - Rich cross-layer information from the application and network layers
- SDN control plane applications based on cross-layer information
  - Fault detector
  - Auto scaler
  - Live debugger
  - Load balancer

Fault Detector SDN Control Plane Application
- Existing stream frameworks
  - Polling approach: heartbeat based on timeout
  - Expensive at large scale
  - Not instantaneous
- Typhoon
  - Event-driven approach using network-layer information
  - OpenFlow events: unexpected port deletion events
  - Traffic steering with flow rule changes
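A rough sketch of the event-driven idea: the detector reacts to a port-deletion event instead of waiting for a heartbeat timeout. The class, the `expected` flag for planned removals, and the callback name are hypothetical; in a real deployment the event would be delivered by the SDN controller, and the follow-up traffic steering (flow rule changes) is not modeled here.

```python
from typing import Dict, List


class FaultDetector:
    """Toy event-driven detector keyed on switch port deletions."""

    def __init__(self, port_to_worker: Dict[int, int]) -> None:
        self.port_to_worker = dict(port_to_worker)
        self.failed_workers: List[int] = []

    def on_port_deleted(self, port: int, expected: bool = False) -> None:
        """Handle a port-deletion event; unexpected ones mark the worker as failed."""
        worker = self.port_to_worker.pop(port, None)
        if worker is not None and not expected:
            self.failed_workers.append(worker)
```

Because each worker is coupled to one switch port, the port event identifies the failed worker directly, with no per-worker polling.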

Auto Scaler SDN Control Plane Application
- How to decide on a scale-in/out operation
  - Application-layer information is more accurate than network information for telling whether a worker is overloaded
  - Capacity = (# of processed tuples * average processing time) / measurement time
- Auto Scaler SDN control plane application
  - Threshold-based scale-in/out operations using the capacity metric (i.e., application-layer information)
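The capacity metric and a threshold-based decision can be sketched as below. The formula is the one on the slide; the threshold values 0.8 and 0.2 and the function names are assumptions for illustration, since the slides do not state the thresholds used.

```python
def capacity(processed_tuples: int, avg_process_time_s: float, window_s: float) -> float:
    """Fraction of the measurement window a worker spent processing tuples."""
    if window_s <= 0:
        raise ValueError("measurement window must be positive")
    return processed_tuples * avg_process_time_s / window_s


def scaling_decision(cap: float, high: float = 0.8, low: float = 0.2) -> str:
    """Threshold-based decision; the default thresholds are illustrative only."""
    if cap > high:
        return "scale-out"
    if cap < low:
        return "scale-in"
    return "hold"
```

As a worked example, a worker that processed 9,000 tuples at an average of 1 ms each during a 10 s window has capacity (9000 * 0.001) / 10 = 0.9, which exceeds the assumed 0.8 threshold and triggers a scale-out.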

Typhoon Prototype: Refactor Apache Storm
- Re-define global states using Thrift
  - Update global states to include reconfigurations and detailed topology information stored in the central coordinator (i.e., ZooKeeper)
- Stream manager (i.e., Nimbus)
  - Extend the topology builder and the scheduler
  - Implement a dynamic topology manager module for updating global states on initial submissions and reconfigurations
- Worker I/O layer
  - Replace the Netty-based Storm transport with the Typhoon custom transport
  - Implemented in Java and C, bridged with JNI
    - C, using the DPDK framework: (de)packetizing and packet batching
    - Java, implementing the Storm transport interface: queueing and batching tuples, (de)multiplexing tuples

Typhoon Prototype
- Software SDN switch
  - DPDK-based user-space OVS
- Floodlight controller
  - Use Apache Curator for interaction between the SDN controller and ZooKeeper
  - Implement SDN control plane applications
  - Provide REST APIs for control plane applications
  - Re-use tuple libraries from Storm for control tuples

Evaluation
- Baseline performance
- One-to-many communication performance
- SDN control plane applications
  - Auto scaler
  - Fault detector
  - Update compute nodes
  - Live debugger
  - Load balancer

One-to-Many Communication Performance
- Increase the number of destination workers from two to six
- [Figure: one source fanning out to multiple destinations; results plotted against the # of destination workers]
- Typhoon outperforms Storm due to lightweight packet replication in the SDN switch

Auto Scaler SDN Control Plane Application
- [Figure: a topology of Input, Split, and Count workers, shown before and after an additional Split worker is added]
- Packet loss due to a faulty split worker
- Scale out the split worker
- Typhoon avoids packet loss due to worker faults by using the threshold-based auto scaler

Conclusion
- Propose an SDN-based real-time stream framework that tightly integrates SDN functionality into a real-time stream framework
- Demonstrate that the proposed system provides a highly flexible stream pipeline and high-performance application-level data broadcast using cross-layer design and SDN
- Showcase several SDN control plane applications that leverage cross-layer information