Viper: Communication-Layer Determinism and Scaling in Low-Latency Stream Processing

Size: px
Start display at page:

Download "Viper: Communication-Layer Determinism and Scaling in Low-Latency Stream Processing"

Transcription

1 Viper: Communication-Layer Determinism and Scaling in Low-Latency Stream Processing Ivan Walulya, Yiannis Nikolakopoulos, Vincenzo Gulisano Marina Papatriantafilou and Philippas Tsigas Auto-DaSP 2017 Chalmers University of Technology Göteborg, Sweden 1

2 Stream Processing Applications o Process infinite streams of data High throughput Low latency o High resource requirements (multiple cores/nodes) Input event Sink 2

3 Stream Processing Applications o Process infinite streams of data High throughput Low latency o High resource requirements (multiple cores/nodes) o Abstraction: Data-flow graphs of operators and streams Expose pipeline and task parallelism Input event Sink 2

4 Introduction Input event Sink Processing Element 3

5 Introduction o Parallel processing: Pipeline parallelism Task parallelism Input event Sink sink Processing Element 3

6 o Parallel processing: Pipeline parallelism Task parallelism Introduction Compilers Run-time Schedulers Limited by data graph Input event Sink sink Processing Element 3

7 Introduction Input event..o 4 O 3 O 2 O 1 4

8 Introduction o Data or operator parallelism: Bottlenecks at operator level Split the data and replicate the operator Input event O 4 O 1 O 3 O 2 4

9 Introduction o Data or operator parallelism: Bottlenecks at operator level Split the data and replicate the operator Not limited by data-flow graph Trade latency for high throughput Preserve sequential semantics Input event O 4 O 1 O 3 O 2 4

10 Motivation o Determinism: preserve sequential semantics (safety) Input event O 4 O 1 O 3 O 2 5

11 Motivation o Determinism: preserve sequential semantics (safety) Merge operators Enforce ordering amongst the output tuples. Input event O 4 O 1 Merger O 3 O 2 5

12 Motivation o Determinism: preserve sequential semantics (safety) Merge operators Enforce ordering amongst the output tuples. Compiler generated [Schneider 2013]. Left to Application or Library developer [ Apache Storm, Flink]. Input event O 4 O 1 Merger O 3 O 2 5

13 Problem Statement layer Dedicated merge-sorting operators F 1 A2-M 1 A2 1 M-M 1 M 1 F 2 A2-M 2 A2 2 M-M 2 M 2 Dedicated channel between each pair of operator instances Communication layer 6

14 Problem Statement o o Overheads of Merge operators: Introduce computation overhead Higher latency due to increase in operators Become processing bottleneck Considerable burden on the developers Challenge: How to apply data-parallelism transparently and safely layer Dedicated merge-sorting operators F 1 A2-M 1 A2 1 M-M 1 M 1 F 2 A2-M 2 A2 2 M-M 2 M 2 Dedicated channel between each pair of operator instances Communication layer 6

15 Our Solution o Communication-layer determinism: Leverage links to achieve both communication and determinism. Shared state and synchronization layer Dedicated merge-sorting operators F 1 A2-M 1 A2 1 M-M 1 M 1 F 2 A2-M 2 A2 2 M-M 2 M 2 Dedicated channel between each pair of operator instances Communication layer 7

16 Our Solution o Communication-layer determinism: Leverage links to achieve both communication and determinism. Shared state and synchronization layer F 1 A2 1 M 1 F 2 A2 2 M 2 Communication layer (Viper) Shared channel between each operator instance and its upstream peers 7

17 Our Solution o Communication-layer determinism: Leverage links to achieve both communication and determinism. Shared state and synchronization Extends the ScaleGate [Gulisano et. al. 2016]. Modularly take the logic of deterministic away from developer (Viper ). layer F 1 A2 1 M 1 F 2 A2 2 M 2 Communication layer (Viper) Shared channel between each operator instance and its upstream peers 7

18 Our Solution o Communication-layer determinism: Leverage links to achieve both communication and determinism. Shared state and synchronization Extends the ScaleGate [Gulisano et. al. 2016]. Modularly take the logic of deterministic away from developer (Viper ). Evaluate our implementation on Apache Storm. layer F 1 A2 1 M 1 F 2 A2 2 M 2 Communication layer (Viper) Shared channel between each operator instance and its upstream peers 7

19 Viper Module o Overall Approach Replace merge operators and channels with a Viper module layer F 1 A2 1 M 1 F 2 A2 2 M 2 Communication layer (Viper) Shared channel between each operator instance and its upstream peers 8

20 Viper Module o Overall Approach Replace merge operators and channels with a Viper module Channel maintained for any set of source operator instances S 1,..S n layer F 1 A2 1 M 1 F 2 A2 2 M 2 Communication layer (Viper) Shared channel between each operator instance and its upstream peers 8

21 Viper Module o Overall Approach Replace merge operators and channels with a Viper module Channel maintained for any set of source operator instances S 1,..S n Channel is either a thread-safe concurrent queue or a ScaleGate object layer F 1 A2 1 M 1 F 2 A2 2 M 2 Communication layer (Viper) Shared channel between each operator instance and its upstream peers 8

22 Viper Module o Overall Approach Replace merge operators and channels with a Viper module Channel maintained for any set of source operator instances S 1,..S n Channel is either a thread-safe concurrent queue or a ScaleGate object Sorting overhead shared by threads assigned to the same instance layer F 1 A2 1 M 1 F 2 A2 2 M 2 Communication layer (Viper) Shared channel between each operator instance and its upstream peers 8

23 Viper Module o Overall Approach Replace merge operators and channels with a Viper module Channel maintained for any set of source operator instances S 1,..S n Channel is either a thread-safe concurrent queue or a ScaleGate object Sorting overhead shared by threads assigned to the same instance Watermarking mechanism to handle back-pressure layer F 1 A2 1 M 1 F 2 A2 2 M 2 Communication layer (Viper) Shared channel between each operator instance and its upstream peers 8

24 ScaleGate Data Structure [Gulisano et. al. 2016]. o API: addtuple(tuple,sourceid) allows a tuple from sourceid to be merged by ScaleGate in the resulting sorted stream of ready tuples. getnextreadytuple(readerid) provides to readerid the next ready tuple that has not been yet consumed by the former. 9

25 Viper API register(channel, sources, readers) Register a new channel, specifying which sources will add tuples and which readers will get timestamp-sorted tuples from the channel. 10

26 Viper API register(channel, sources, readers) Register a new channel, specifying which sources will add tuples and which readers will get timestamp-sorted tuples from the channel. addtuple(channel, sourceid, tuple) Add tuple from a given source sourceid to the specified channel 10

27 Viper API register(channel, sources, readers) Register a new channel, specifying which sources will add tuples and which readers will get timestamp-sorted tuples from the channel. addtuple(channel, sourceid, tuple) Add tuple from a given source sourceid to the specified channel getreadytuple(channel, readerid) Retrieve the next ready tuple (if any) for the specified readerid from the channel 10

28 Evaluation o Integrated Viper module in Apache Storm Execution Thread Task Send thread s based on LMAX disruptor A Dedicated send thread per Executor process Inbound Outbound 11

29 Evaluation o Integrated Viper module in Apache Storm Execution Thread Task Send thread s based on LMAX disruptor A Dedicated send thread per Executor process Inbound Outbound Execution Thread Remove send threads Concurrent queues or ScaleGate Inbound Task 11

30 Evaluation Setup o Linear-Road Dataset Simulate vehicular traffic on a network of dynamic-toll roads Variable toll dependent on congestion and accident proximity Position reports and historical query requests Tuple<Type = 0, Time, VID, Spd, XWay, Lane, Dir, Seg, Pos> 12

31 Evaluation Setup o Linear-Road Dataset Simulate vehicular traffic on a network of dynamic-toll roads Variable toll dependent on congestion and accident proximity Position reports and historical query requests Tuple<Type = 0, Time, VID, Spd, XWay, Lane, Dir, Seg, Pos> o Both stateful and stateless operators 12

32 Evaluation Setup o Linear-Road Dataset Simulate vehicular traffic on a network of dynamic-toll roads Variable toll dependent on congestion and accident proximity Position reports and historical query requests Tuple<Type = 0, Time, VID, Spd, XWay, Lane, Dir, Seg, Pos> o Both stateful and stateless operators o Metrics: Throughput, Latency and Energy 12

33 Evaluation Setup o Linear-Road Dataset Simulate vehicular traffic on a network of dynamic-toll roads Variable toll dependent on congestion and accident proximity Position reports and historical query requests Tuple<Type = 0, Time, VID, Spd, XWay, Lane, Dir, Seg, Pos> o Both stateful and stateless operators o Metrics: Throughput, Latency and Energy o Evaluation Platform Intel Xeon E5-2687W v2 3.4 GHz server (32 threads over 2 sockets), 64 GB of RAM Likwid library to read RAPL energy counters 12

34 Evaluation Results Viper No-Viper o o o o Stateless Forward position reports Viper : Communication Layer Viper Module used No-Viper : Layer Merge-Sort operator deployed Injection rate varied from 10,000 t/s to 1,200,000 t/s 1 Spout 2 Sink Spout N 13

35 Evaluation Results Viper No-Viper o o o o Stateless Forward position reports Viper : Communication Layer Viper Module used No-Viper : Layer Merge-Sort operator deployed Injection rate varied from 10,000 t/s to 1,200,000 t/s 1 Spout 2 Sink Spout N 13

36 Evaluation Results Viper No-Viper o o o Stateful Vehicle entering new Segment Viper : Communication Layer Viper Module used No-Viper : Layer Merge-Sort operator deployed 1 Spout 2 Sink Spout N 14

37 Conclusions and Future work o Discussed limitations of operator layer determinism and proposed a solution to overcome these at the communication layer. o Developed a Viper module that can be integrated in Ss o Evaluated the performance of the proposed module on Apache Storm o Scale the per tuple workload in the evaluation 15

38 Viper: Communication-Layer Determinism and Scaling in Low-Latency Stream Processing Thank You! 16

39 References o S. Schneider, M. Hirzel, B. Gedik and K. L. Wu Schneider 2015, Safe Data Parallelism for General Streaming in IEEE Transactions on Computers, o V. Gulisano, Y. Nikolakopoulos, M. Papatriantafilou and P. Tsigas Gulisano 2016, ScaleJoin: a Deterministic, Disjoint-Parallel and Skew-Resilient Stream Join in IEEE Transactions on Big Data,

Deterministic Real-Time Analytics of Geospatial Data Streams through ScaleGate Objects

Deterministic Real-Time Analytics of Geospatial Data Streams through ScaleGate Objects Deterministic Real-Time Analytics of Geospatial Data Streams through ScaleGate Objects Vincenzo Gulisano, Yiannis Nikolakopoulos, Ivan Walulya, Marina Papatriantafilou and Philippas Tsigas Chalmers University

More information

Latency and Throughput in Center versus Edge Stream Processing

Latency and Throughput in Center versus Edge Stream Processing Latency and Throughput in Center versus Edge Stream Processing A Case Study in the Transportation Domain Master s thesis in Computer Science: Algorithms, Languages and Logic Gregor Ulm Department of Computer

More information

Challenges in Data Stream Processing

Challenges in Data Stream Processing Università degli Studi di Roma Tor Vergata Dipartimento di Ingegneria Civile e Ingegneria Informatica Challenges in Data Stream Processing Corso di Sistemi e Architetture per Big Data A.A. 2016/17 Valeria

More information

arxiv: v1 [cs.ds] 15 Jun 2016

arxiv: v1 [cs.ds] 15 Jun 2016 Efficient data streaming multiway aggregation through concurrent algorithmic designs and new abstract data types arxiv:1606.04746v1 [cs.ds] 15 Jun 2016 Vincenzo Gulisano, Yiannis Nikolakopoulos, Daniel

More information

The Stream Processor as a Database. Ufuk

The Stream Processor as a Database. Ufuk The Stream Processor as a Database Ufuk Celebi @iamuce Realtime Counts and Aggregates The (Classic) Use Case 2 (Real-)Time Series Statistics Stream of Events Real-time Statistics 3 The Architecture collect

More information

Concurrent Data Structures for Efficient Streaming Aggregation

Concurrent Data Structures for Efficient Streaming Aggregation Technical Report No. 203: Concurrent Data Structures for Efficient Streaming Aggregation DANIEL CEDERMAN VINCENZO GULISANO YIANNIS NIKOLAKOPOULOS MARINA PAPATRIANTAFILOU PHILIPPAS TSIGAS Department of

More information

Putting it together. Data-Parallel Computation. Ex: Word count using partial aggregation. Big Data Processing. COS 418: Distributed Systems Lecture 21

Putting it together. Data-Parallel Computation. Ex: Word count using partial aggregation. Big Data Processing. COS 418: Distributed Systems Lecture 21 Big Processing -Parallel Computation COS 418: Distributed Systems Lecture 21 Michael Freedman 2 Ex: Word count using partial aggregation Putting it together 1. Compute word counts from individual files

More information

Transactum Business Process Manager with High-Performance Elastic Scaling. November 2011 Ivan Klianev

Transactum Business Process Manager with High-Performance Elastic Scaling. November 2011 Ivan Klianev Transactum Business Process Manager with High-Performance Elastic Scaling November 2011 Ivan Klianev Transactum BPM serves three primary objectives: To make it possible for developers unfamiliar with distributed

More information

Cache-Aware Lock-Free Queues for Multiple Producers/Consumers and Weak Memory Consistency

Cache-Aware Lock-Free Queues for Multiple Producers/Consumers and Weak Memory Consistency Cache-Aware Lock-Free Queues for Multiple Producers/Consumers and Weak Memory Consistency Anders Gidenstam Håkan Sundell Philippas Tsigas School of business and informatics University of Borås Distributed

More information

Twitter Heron: Stream Processing at Scale

Twitter Heron: Stream Processing at Scale Twitter Heron: Stream Processing at Scale Saiyam Kohli December 8th, 2016 CIS 611 Research Paper Presentation -Sun Sunnie Chung TWITTER IS A REAL TIME ABSTRACT We process billions of events on Twitter

More information

Accelerate Applications Using EqualLogic Arrays with directcache

Accelerate Applications Using EqualLogic Arrays with directcache Accelerate Applications Using EqualLogic Arrays with directcache Abstract This paper demonstrates how combining Fusion iomemory products with directcache software in host servers significantly improves

More information

Parallel Patterns for Window-based Stateful Operators on Data Streams: an Algorithmic Skeleton Approach

Parallel Patterns for Window-based Stateful Operators on Data Streams: an Algorithmic Skeleton Approach Parallel Patterns for Window-based Stateful Operators on Data Streams: an Algorithmic Skeleton Approach Tiziano De Matteis, Gabriele Mencagli University of Pisa Italy INTRODUCTION The recent years have

More information

MULTI-THREADED QUERIES

MULTI-THREADED QUERIES 15-721 Project 3 Final Presentation MULTI-THREADED QUERIES Wendong Li (wendongl) Lu Zhang (lzhang3) Rui Wang (ruiw1) Project Objective Intra-operator parallelism Use multiple threads in a single executor

More information

Big and Fast. Anti-Caching in OLTP Systems. Justin DeBrabant

Big and Fast. Anti-Caching in OLTP Systems. Justin DeBrabant Big and Fast Anti-Caching in OLTP Systems Justin DeBrabant Online Transaction Processing transaction-oriented small footprint write-intensive 2 A bit of history 3 OLTP Through the Years relational model

More information

Scalable Streaming Analytics

Scalable Streaming Analytics Scalable Streaming Analytics KARTHIK RAMASAMY @karthikz TALK OUTLINE BEGIN I! II ( III b Overview Storm Overview Storm Internals IV Z V K Heron Operational Experiences END WHAT IS ANALYTICS? according

More information

TrafficDB: HERE s High Performance Shared-Memory Data Store Ricardo Fernandes, Piotr Zaczkowski, Bernd Göttler, Conor Ettinoffe, and Anis Moussa

TrafficDB: HERE s High Performance Shared-Memory Data Store Ricardo Fernandes, Piotr Zaczkowski, Bernd Göttler, Conor Ettinoffe, and Anis Moussa TrafficDB: HERE s High Performance Shared-Memory Data Store Ricardo Fernandes, Piotr Zaczkowski, Bernd Göttler, Conor Ettinoffe, and Anis Moussa EPL646: Advanced Topics in Databases Christos Hadjistyllis

More information

Apache Flink- A System for Batch and Realtime Stream Processing

Apache Flink- A System for Batch and Realtime Stream Processing Apache Flink- A System for Batch and Realtime Stream Processing Lecture Notes Winter semester 2016 / 2017 Ludwig-Maximilians-University Munich Prof Dr. Matthias Schubert 2016 Introduction to Apache Flink

More information

Hadoop 2.x Core: YARN, Tez, and Spark. Hortonworks Inc All Rights Reserved

Hadoop 2.x Core: YARN, Tez, and Spark. Hortonworks Inc All Rights Reserved Hadoop 2.x Core: YARN, Tez, and Spark YARN Hadoop Machine Types top-of-rack switches core switch client machines have client-side software used to access a cluster to process data master nodes run Hadoop

More information

Apache Storm. Hortonworks Inc Page 1

Apache Storm. Hortonworks Inc Page 1 Apache Storm Page 1 What is Storm? Real time stream processing framework Scalable Up to 1 million tuples per second per node Fault Tolerant Tasks reassigned on failure Guaranteed Processing At least once

More information

Apache Flink. Alessandro Margara

Apache Flink. Alessandro Margara Apache Flink Alessandro Margara alessandro.margara@polimi.it http://home.deib.polimi.it/margara Recap: scenario Big Data Volume and velocity Process large volumes of data possibly produced at high rate

More information

Distributed Stream Analysis with Java 8 Stream API Master s Thesis in Computer Systems and Networks

Distributed Stream Analysis with Java 8 Stream API Master s Thesis in Computer Systems and Networks Distributed Stream Analysis with Java 8 Stream API Master s Thesis in Computer Systems and Networks ANDY MOISE PHILOGENE BRIAN MWAMBAZI Chalmers University of Technology University of Gothenburg Department

More information

10/24/2017 Sangmi Lee Pallickara Week 10- A. CS535 Big Data Fall 2017 Colorado State University

10/24/2017 Sangmi Lee Pallickara Week 10- A. CS535 Big Data Fall 2017 Colorado State University CS535 Big Data - Fall 2017 Week 10-A-1 CS535 BIG DATA FAQs Term project proposal Feedback for the most of submissions are available PA2 has been posted (11/6) PART 2. SCALABLE FRAMEWORKS FOR REAL-TIME

More information

Processing of big data with Apache Spark

Processing of big data with Apache Spark Processing of big data with Apache Spark JavaSkop 18 Aleksandar Donevski AGENDA What is Apache Spark? Spark vs Hadoop MapReduce Application Requirements Example Architecture Application Challenges 2 WHAT

More information

SnailTrail Generalizing Critical Paths for Online Analysis of Distributed Dataflows

SnailTrail Generalizing Critical Paths for Online Analysis of Distributed Dataflows SnailTrail Generalizing Critical Paths for Online Analysis of Distributed Dataflows Moritz Hoffmann, Andrea Lattuada, John Liagouris, Vasiliki Kalavri, Desislava Dimitrova, Sebastian Wicki, Zaheer Chothia,

More information

VOLTDB + HP VERTICA. page

VOLTDB + HP VERTICA. page VOLTDB + HP VERTICA ARCHITECTURE FOR FAST AND BIG DATA ARCHITECTURE FOR FAST + BIG DATA FAST DATA Fast Serve Analytics BIG DATA BI Reporting Fast Operational Database Streaming Analytics Columnar Analytics

More information

GFS: The Google File System. Dr. Yingwu Zhu

GFS: The Google File System. Dr. Yingwu Zhu GFS: The Google File System Dr. Yingwu Zhu Motivating Application: Google Crawl the whole web Store it all on one big disk Process users searches on one big CPU More storage, CPU required than one PC can

More information

Efficient and Reliable Lock-Free Memory Reclamation Based on Reference Counting

Efficient and Reliable Lock-Free Memory Reclamation Based on Reference Counting Efficient and Reliable Lock-Free Memory Reclamation d on Reference ounting nders Gidenstam, Marina Papatriantafilou, Håkan Sundell and Philippas Tsigas Distributed omputing and Systems group, Department

More information

Lecture Notes to Big Data Management and Analytics Winter Term 2017/2018 Apache Flink

Lecture Notes to Big Data Management and Analytics Winter Term 2017/2018 Apache Flink Lecture Notes to Big Data Management and Analytics Winter Term 2017/2018 Apache Flink Matthias Schubert, Matthias Renz, Felix Borutta, Evgeniy Faerman, Christian Frey, Klaus Arthur Schmid, Daniyal Kazempour,

More information

Lecture 21 11/27/2017 Next Lecture: Quiz review & project meetings Streaming & Apache Kafka

Lecture 21 11/27/2017 Next Lecture: Quiz review & project meetings Streaming & Apache Kafka Lecture 21 11/27/2017 Next Lecture: Quiz review & project meetings Streaming & Apache Kafka What problem does Kafka solve? Provides a way to deliver updates about changes in state from one service to another

More information

Real-time data processing with Apache Flink

Real-time data processing with Apache Flink Real-time data processing with Apache Flink Gyula Fóra gyfora@apache.org Flink committer Swedish ICT Stream processing Data stream: Infinite sequence of data arriving in a continuous fashion. Stream processing:

More information

Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context

Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context 1 Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context Generality: diverse workloads, operators, job sizes

More information

On Design and Applications of Practical Concurrent Data Structures

On Design and Applications of Practical Concurrent Data Structures THESIS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY On Design and Applications of Practical Concurrent Data Structures IVAN WALULYA Department of Computer Science and Engineering CHALMERS UNIVERSITY OF TECHNOLOGY

More information

Course Content. Parallel & Distributed Databases. Objectives of Lecture 12 Parallel and Distributed Databases

Course Content. Parallel & Distributed Databases. Objectives of Lecture 12 Parallel and Distributed Databases Database Management Systems Winter 2003 CMPUT 391: Dr. Osmar R. Zaïane University of Alberta Chapter 22 of Textbook Course Content Introduction Database Design Theory Query Processing and Optimisation

More information

DIAMOND RINGS ACKNOWLEDGED EVENT PROPAGATION IN MANY-CORE PROCESSORS

DIAMOND RINGS ACKNOWLEDGED EVENT PROPAGATION IN MANY-CORE PROCESSORS th August DIAMOND RINGS ACKNOWLEDGED EVENT PROPAGATION IN MANY-CORE PROCESSORS Stefan Nürnberger, Randolf Rotta, Gabor Drescher, Daniel Danner, Jörg Nolte ACKNOWLEDGED EVENT PROPAGATION What does it do?

More information

New Techniques to Curtail the Tail Latency in Stream Processing Systems

New Techniques to Curtail the Tail Latency in Stream Processing Systems New Techniques to Curtail the Tail Latency in Stream Processing Systems Guangxiang Du University of Illinois at Urbana-Champaign Urbana, IL, 61801, USA gdu3@illinois.edu Indranil Gupta University of Illinois

More information

Securing the Frisbee Multicast Disk Loader

Securing the Frisbee Multicast Disk Loader Securing the Frisbee Multicast Disk Loader Robert Ricci, Jonathon Duerig University of Utah 1 What is Frisbee? 2 Frisbee is Emulab s tool to install whole disk images from a server to many clients using

More information

TLDK Overview. Transport Layer Development Kit Keith Wiles April Contributions from Ray Kinsella & Konstantin Ananyev

TLDK Overview. Transport Layer Development Kit Keith Wiles April Contributions from Ray Kinsella & Konstantin Ananyev TLDK Overview Transport Layer Development Kit Keith Wiles April 2017 Contributions from Ray Kinsella & Konstantin Ananyev Notices and Disclaimers Intel technologies features and benefits depend on system

More information

GFS: The Google File System

GFS: The Google File System GFS: The Google File System Brad Karp UCL Computer Science CS GZ03 / M030 24 th October 2014 Motivating Application: Google Crawl the whole web Store it all on one big disk Process users searches on one

More information

ESE532: System-on-a-Chip Architecture. Today. Programmable SoC. Message. Process. Reminder

ESE532: System-on-a-Chip Architecture. Today. Programmable SoC. Message. Process. Reminder ESE532: System-on-a-Chip Architecture Day 5: September 18, 2017 Dataflow Process Model Today Dataflow Process Model Motivation Issues Abstraction Basic Approach Dataflow variants Motivations/demands for

More information

Before proceeding with this tutorial, you must have a good understanding of Core Java and any of the Linux flavors.

Before proceeding with this tutorial, you must have a good understanding of Core Java and any of the Linux flavors. About the Tutorial Storm was originally created by Nathan Marz and team at BackType. BackType is a social analytics company. Later, Storm was acquired and open-sourced by Twitter. In a short time, Apache

More information

Huge market -- essentially all high performance databases work this way

Huge market -- essentially all high performance databases work this way 11/5/2017 Lecture 16 -- Parallel & Distributed Databases Parallel/distributed databases: goal provide exactly the same API (SQL) and abstractions (relational tables), but partition data across a bunch

More information

Linear Road: A Stream Data Management Benchmark

Linear Road: A Stream Data Management Benchmark Linear Road: A Stream Data Management Benchmark Arvind Arasu Stanford University arvinda@cs.stanford.edu Mitch Cherniack Brandeis University mfc@cs.brandeis.edu Eduardo Galvez Brandeis University eddie@cs.brandeis.edu

More information

Highly Concurrent Stream Synchronization in Many-core Embedded Systems

Highly Concurrent Stream Synchronization in Many-core Embedded Systems Highly Concurrent Stream Synchronization in Many-core Embedded Systems Yiannis Nikolakopoulos Marina Papatriantafilou Chalmers University of Technology Gothenburg, Sweden {ioaniko, ptrianta@chalmers.se

More information

Typhoon: An SDN Enhanced Real-Time Big Data Streaming Framework

Typhoon: An SDN Enhanced Real-Time Big Data Streaming Framework Typhoon: An SDN Enhanced Real-Time Big Data Streaming Framework Junguk Cho, Hyunseok Chang, Sarit Mukherjee, T.V. Lakshman, and Jacobus Van der Merwe 1 Big Data Era Big data analysis is increasingly common

More information

Distributed File Systems II

Distributed File Systems II Distributed File Systems II To do q Very-large scale: Google FS, Hadoop FS, BigTable q Next time: Naming things GFS A radically new environment NFS, etc. Independence Small Scale Variety of workloads Cooperation

More information

Recovering Disk Storage Metrics from low level Trace events

Recovering Disk Storage Metrics from low level Trace events Recovering Disk Storage Metrics from low level Trace events Progress Report Meeting May 05, 2016 Houssem Daoud Michel Dagenais École Polytechnique de Montréal Laboratoire DORSAL Agenda Introduction and

More information

Pouya Kousha Fall 2018 CSE 5194 Prof. DK Panda

Pouya Kousha Fall 2018 CSE 5194 Prof. DK Panda Pouya Kousha Fall 2018 CSE 5194 Prof. DK Panda 1 Motivation And Intro Programming Model Spark Data Transformation Model Construction Model Training Model Inference Execution Model Data Parallel Training

More information

TARDiS: A branch and merge approach to weak consistency

TARDiS: A branch and merge approach to weak consistency TARDiS: A branch and merge approach to weak consistency By: Natacha Crooks, Youer Pu, Nancy Estrada, Trinabh Gupta, Lorenzo Alvis, Allen Clement Presented by: Samodya Abeysiriwardane TARDiS Transactional

More information

Multi-threaded Queries. Intra-Query Parallelism in LLVM

Multi-threaded Queries. Intra-Query Parallelism in LLVM Multi-threaded Queries Intra-Query Parallelism in LLVM Multithreaded Queries Intra-Query Parallelism in LLVM Yang Liu Tianqi Wu Hao Li Interpreted vs Compiled (LLVM) Interpreted vs Compiled (LLVM) Interpreted

More information

HiTune. Dataflow-Based Performance Analysis for Big Data Cloud

HiTune. Dataflow-Based Performance Analysis for Big Data Cloud HiTune Dataflow-Based Performance Analysis for Big Data Cloud Jinquan (Jason) Dai, Jie Huang, Shengsheng Huang, Bo Huang, Yan Liu Intel Asia-Pacific Research and Development Ltd Shanghai, China, 200241

More information

MillWheel:Fault Tolerant Stream Processing at Internet Scale. By FAN Junbo

MillWheel:Fault Tolerant Stream Processing at Internet Scale. By FAN Junbo MillWheel:Fault Tolerant Stream Processing at Internet Scale By FAN Junbo Introduction MillWheel is a low latency data processing framework designed by Google at Internet scale. Motived by Google Zeitgeist

More information

STORM AND LOW-LATENCY PROCESSING.

STORM AND LOW-LATENCY PROCESSING. STORM AND LOW-LATENCY PROCESSING Low latency processing Similar to data stream processing, but with a twist Data is streaming into the system (from a database, or a netk stream, or an HDFS file, or ) We

More information

Datenbanksysteme II: Modern Hardware. Stefan Sprenger November 23, 2016

Datenbanksysteme II: Modern Hardware. Stefan Sprenger November 23, 2016 Datenbanksysteme II: Modern Hardware Stefan Sprenger November 23, 2016 Content of this Lecture Introduction to Modern Hardware CPUs, Cache Hierarchy Branch Prediction SIMD NUMA Cache-Sensitive Skip List

More information

Flying Faster with Heron

Flying Faster with Heron Flying Faster with Heron KARTHIK RAMASAMY @KARTHIKZ #TwitterHeron TALK OUTLINE BEGIN I! II ( III b OVERVIEW MOTIVATION HERON IV Z OPERATIONAL EXPERIENCES V K HERON PERFORMANCE END [! OVERVIEW TWITTER IS

More information

CSE 544: Principles of Database Systems

CSE 544: Principles of Database Systems CSE 544: Principles of Database Systems Anatomy of a DBMS, Parallel Databases 1 Announcements Lecture on Thursday, May 2nd: Moved to 9am-10:30am, CSE 403 Paper reviews: Anatomy paper was due yesterday;

More information

Tuning Browser-to-Browser Offloading for Heterogeneous Stream Processing Web Applications

Tuning Browser-to-Browser Offloading for Heterogeneous Stream Processing Web Applications Tuning Browser-to-Browser Offloading for Heterogeneous Stream Processing Web Applications Masiar Babazadeh Faculty of Informatics, University of Lugano (USI), Switzerland {name.surname@usi.ch} Abstract.

More information

Chapter 18: Parallel Databases Chapter 19: Distributed Databases ETC.

Chapter 18: Parallel Databases Chapter 19: Distributed Databases ETC. Chapter 18: Parallel Databases Chapter 19: Distributed Databases ETC. Introduction Parallel machines are becoming quite common and affordable Prices of microprocessors, memory and disks have dropped sharply

More information

An Empirical Study of High Availability in Stream Processing Systems

An Empirical Study of High Availability in Stream Processing Systems An Empirical Study of High Availability in Stream Processing Systems Yu Gu, Zhe Zhang, Fan Ye, Hao Yang, Minkyong Kim, Hui Lei, Zhen Liu Stream Processing Model software operators (PEs) Ω Unexpected machine

More information

DSMS Benchmarking. Morten Lindeberg University of Oslo

DSMS Benchmarking. Morten Lindeberg University of Oslo DSMS Benchmarking Morten Lindeberg University of Oslo Agenda Introduction DSMS Recap General Requirements Metrics Example: Linear Road Example: StreamBench 30. Sep. 2009 INF5100 - Morten Lindeberg 2 Introduction

More information

MINIMIZING TRANSACTION LATENCY IN GEO-REPLICATED DATA STORES

MINIMIZING TRANSACTION LATENCY IN GEO-REPLICATED DATA STORES MINIMIZING TRANSACTION LATENCY IN GEO-REPLICATED DATA STORES Divy Agrawal Department of Computer Science University of California at Santa Barbara Joint work with: Amr El Abbadi, Hatem Mahmoud, Faisal

More information

Reducing CPU and network overhead for small I/O requests in network storage protocols over raw Ethernet

Reducing CPU and network overhead for small I/O requests in network storage protocols over raw Ethernet Reducing CPU and network overhead for small I/O requests in network storage protocols over raw Ethernet Pilar González-Férez and Angelos Bilas 31 th International Conference on Massive Storage Systems

More information

StreamBox: Modern Stream Processing on a Multicore Machine

StreamBox: Modern Stream Processing on a Multicore Machine StreamBox: Modern Stream Processing on a Multicore Machine Hongyu Miao and Heejin Park, Purdue ECE; Myeongjae Jeon and Gennady Pekhimenko, Microsoft Research; Kathryn S. McKinley, Google; Felix Xiaozhu

More information

B.H.GARDI COLLEGE OF ENGINEERING & TECHNOLOGY (MCA Dept.) Parallel Database Database Management System - 2

B.H.GARDI COLLEGE OF ENGINEERING & TECHNOLOGY (MCA Dept.) Parallel Database Database Management System - 2 Introduction :- Today single CPU based architecture is not capable enough for the modern database that are required to handle more demanding and complex requirements of the users, for example, high performance,

More information

New Oracle NoSQL Database APIs that Speed Insertion and Retrieval

New Oracle NoSQL Database APIs that Speed Insertion and Retrieval New Oracle NoSQL Database APIs that Speed Insertion and Retrieval O R A C L E W H I T E P A P E R F E B R U A R Y 2 0 1 6 1 NEW ORACLE NoSQL DATABASE APIs that SPEED INSERTION AND RETRIEVAL Introduction

More information

Datacenter replication solution with quasardb

Datacenter replication solution with quasardb Datacenter replication solution with quasardb Technical positioning paper April 2017 Release v1.3 www.quasardb.net Contact: sales@quasardb.net Quasardb A datacenter survival guide quasardb INTRODUCTION

More information

Modern Processor Architectures. L25: Modern Compiler Design

Modern Processor Architectures. L25: Modern Compiler Design Modern Processor Architectures L25: Modern Compiler Design The 1960s - 1970s Instructions took multiple cycles Only one instruction in flight at once Optimisation meant minimising the number of instructions

More information

Stateful Load Balancing for Parallel Stream Processing

Stateful Load Balancing for Parallel Stream Processing Stateful Load Balancing for Parallel Stream Processing Qingsong Guo, Yongluan Zhou North University of China, University of Copenhagen Auto-DaSP 2017 - August 29, 2017 StreamProcessing in 20Years Data

More information

Over the last few years, we have seen a disruption in the data management

Over the last few years, we have seen a disruption in the data management JAYANT SHEKHAR AND AMANDEEP KHURANA Jayant is Principal Solutions Architect at Cloudera working with various large and small companies in various Verticals on their big data and data science use cases,

More information

Stream Processing on High-Bandwidth Memory

Stream Processing on High-Bandwidth Memory Stream Processing on High-Bandwidth Memory Constantin Pohl TU Ilmenau, Germany constantin.pohl@tu-ilmenau.de ABSTRACT High-Bandwidth Memory (HBM) provides lower latency on many concurrent memory accesses

More information

AGORA: A Dependable High-Performance Coordination Service for Multi-Cores

AGORA: A Dependable High-Performance Coordination Service for Multi-Cores AGORA: A Dependable High-Performance Coordination Service for Multi-Cores Rainer Schiekofer 1, Johannes Behl 2, and Tobias Distler 1 1 Friedrich-Alexander University Erlangen-Nürnberg (FAU) 2 TU Braunschweig

More information

Systems I: Programming Abstractions

Systems I: Programming Abstractions Systems I: Programming Abstractions Course Philosophy: The goal of this course is to help students become facile with foundational concepts in programming, including experience with algorithmic problem

More information

GPU Accelerated Machine Learning for Bond Price Prediction

GPU Accelerated Machine Learning for Bond Price Prediction GPU Accelerated Machine Learning for Bond Price Prediction Venkat Bala Rafael Nicolas Fermin Cota Motivation Primary Goals Demonstrate potential benefits of using GPUs over CPUs for machine learning Exploit

More information

Performance and Scalability with Griddable.io

Performance and Scalability with Griddable.io Performance and Scalability with Griddable.io Executive summary Griddable.io is an industry-leading timeline-consistent synchronized data integration grid across a range of source and target data systems.

More information

Webinar Series TMIP VISION

Webinar Series TMIP VISION Webinar Series TMIP VISION TMIP provides technical support and promotes knowledge and information exchange in the transportation planning and modeling community. Today s Goals To Consider: Parallel Processing

More information

An Evaluation of Distributed Concurrency Control. Harding, Aken, Pavlo and Stonebraker Presented by: Thamir Qadah For CS590-BDS

An Evaluation of Distributed Concurrency Control. Harding, Aken, Pavlo and Stonebraker Presented by: Thamir Qadah For CS590-BDS An Evaluation of Distributed Concurrency Control Harding, Aken, Pavlo and Stonebraker Presented by: Thamir Qadah For CS590-BDS 1 Outline Motivation System Architecture Implemented Distributed CC protocols

More information

Sub-millisecond Stateful Stream Querying over Fast-evolving Linked Data

Sub-millisecond Stateful Stream Querying over Fast-evolving Linked Data Sub-millisecond Stateful Stream Querying over Fast-evolving Linked Data Yunhao Zhang, Rong Chen, Haibo Chen Institute of Parallel and Distributed Systems (IPADS) Shanghai Jiao Tong University Stream Query

More information

Dynamic Fine Grain Scheduling of Pipeline Parallelism. Presented by: Ram Manohar Oruganti and Michael TeWinkle

Dynamic Fine Grain Scheduling of Pipeline Parallelism. Presented by: Ram Manohar Oruganti and Michael TeWinkle Dynamic Fine Grain Scheduling of Pipeline Parallelism Presented by: Ram Manohar Oruganti and Michael TeWinkle Overview Introduction Motivation Scheduling Approaches GRAMPS scheduling method Evaluation

More information

ASYNCHRONOUS SHADERS WHITE PAPER 0

ASYNCHRONOUS SHADERS WHITE PAPER 0 ASYNCHRONOUS SHADERS WHITE PAPER 0 INTRODUCTION GPU technology is constantly evolving to deliver more performance with lower cost and lower power consumption. Transistor scaling and Moore s Law have helped

More information

Hash Joins for Multi-core CPUs. Benjamin Wagner

Hash Joins for Multi-core CPUs. Benjamin Wagner Hash Joins for Multi-core CPUs Benjamin Wagner Joins fundamental operator in query processing variety of different algorithms many papers publishing different results main question: is tuning to modern

More information

What s New in VMware vsphere 4.1 Performance. VMware vsphere 4.1

What s New in VMware vsphere 4.1 Performance. VMware vsphere 4.1 What s New in VMware vsphere 4.1 Performance VMware vsphere 4.1 T E C H N I C A L W H I T E P A P E R Table of Contents Scalability enhancements....................................................................

More information

Disruptor Using High Performance, Low Latency Technology in the CERN Control System

Disruptor Using High Performance, Low Latency Technology in the CERN Control System Disruptor Using High Performance, Low Latency Technology in the CERN Control System ICALEPCS 2015 21/10/2015 2 The problem at hand 21/10/2015 WEB3O03 3 The problem at hand CESAR is used to control the

More information

Allocating memory in a lock-free manner

Allocating memory in a lock-free manner Allocating memory in a lock-free manner Anders Gidenstam, Marina Papatriantafilou and Philippas Tsigas Distributed Computing and Systems group, Department of Computer Science and Engineering, Chalmers

More information

High-Performance Event Processing Bridging the Gap between Low Latency and High Throughput Bernhard Seeger University of Marburg

High-Performance Event Processing Bridging the Gap between Low Latency and High Throughput Bernhard Seeger University of Marburg High-Performance Event Processing Bridging the Gap between Low Latency and High Throughput Bernhard Seeger University of Marburg common work with Nikolaus Glombiewski, Michael Körber, Marc Seidemann 1.

More information

Apache Spark Graph Performance with Memory1. February Page 1 of 13

Apache Spark Graph Performance with Memory1. February Page 1 of 13 Apache Spark Graph Performance with Memory1 February 2017 Page 1 of 13 Abstract Apache Spark is a powerful open source distributed computing platform focused on high speed, large scale data processing

More information

Packet-Level Network Analytics without Compromises NANOG 73, June 26th 2018, Denver, CO. Oliver Michel

Packet-Level Network Analytics without Compromises NANOG 73, June 26th 2018, Denver, CO. Oliver Michel Packet-Level Network Analytics without Compromises NANOG 73, June 26th 2018, Denver, CO Oliver Michel Network monitoring is important Security issues Performance issues Equipment failure Analytics Platform

More information

SEDA: An Architecture for Well-Conditioned, Scalable Internet Services

SEDA: An Architecture for Well-Conditioned, Scalable Internet Services SEDA: An Architecture for Well-Conditioned, Scalable Internet Services Matt Welsh, David Culler, and Eric Brewer Computer Science Division University of California, Berkeley Operating Systems Principles

More information

Deep Dive Amazon Kinesis. Ian Meyers, Principal Solution Architect - Amazon Web Services

Deep Dive Amazon Kinesis. Ian Meyers, Principal Solution Architect - Amazon Web Services Deep Dive Amazon Kinesis Ian Meyers, Principal Solution Architect - Amazon Web Services Analytics Deployment & Administration App Services Analytics Compute Storage Database Networking AWS Global Infrastructure

More information

The Google File System

The Google File System The Google File System Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung SOSP 2003 presented by Kun Suo Outline GFS Background, Concepts and Key words Example of GFS Operations Some optimizations in

More information

Curriculum 2013 Knowledge Units Pertaining to PDC

Curriculum 2013 Knowledge Units Pertaining to PDC Curriculum 2013 Knowledge Units Pertaining to C KA KU Tier Level NumC Learning Outcome Assembly level machine Describe how an instruction is executed in a classical von Neumann machine, with organization

More information

Distributed Filesystem

Distributed Filesystem Distributed Filesystem 1 How do we get data to the workers? NAS Compute Nodes SAN 2 Distributing Code! Don t move data to workers move workers to the data! - Store data on the local disks of nodes in the

More information

The Google File System

The Google File System The Google File System Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung Google* 정학수, 최주영 1 Outline Introduction Design Overview System Interactions Master Operation Fault Tolerance and Diagnosis Conclusions

More information

2/26/2017. Originally developed at the University of California - Berkeley's AMPLab

2/26/2017. Originally developed at the University of California - Berkeley's AMPLab Apache is a fast and general engine for large-scale data processing aims at achieving the following goals in the Big data context Generality: diverse workloads, operators, job sizes Low latency: sub-second

More information

Fast and Concurrent RDF Queries with RDMA-Based Distributed Graph Exploration

Fast and Concurrent RDF Queries with RDMA-Based Distributed Graph Exploration Fast and Concurrent RDF Queries with RDMA-Based Distributed Graph Exploration JIAXIN SHI, YOUYANG YAO, RONG CHEN, HAIBO CHEN, FEIFEI LI PRESENTED BY ANDREW XIA APRIL 25, 2018 Wukong Overview of Wukong

More information

TPC-E testing of Microsoft SQL Server 2016 on Dell EMC PowerEdge R830 Server and Dell EMC SC9000 Storage

TPC-E testing of Microsoft SQL Server 2016 on Dell EMC PowerEdge R830 Server and Dell EMC SC9000 Storage TPC-E testing of Microsoft SQL Server 2016 on Dell EMC PowerEdge R830 Server and Dell EMC SC9000 Storage Performance Study of Microsoft SQL Server 2016 Dell Engineering February 2017 Table of contents

More information

SCRIPT: An Architecture for IPFIX Data Distribution

SCRIPT: An Architecture for IPFIX Data Distribution SCRIPT Public Workshop January 20, 2010, Zurich, Switzerland SCRIPT: An Architecture for IPFIX Data Distribution Peter Racz Communication Systems Group CSG Department of Informatics IFI University of Zürich

More information

ASPERA HIGH-SPEED TRANSFER. Moving the world s data at maximum speed

ASPERA HIGH-SPEED TRANSFER. Moving the world s data at maximum speed ASPERA HIGH-SPEED TRANSFER Moving the world s data at maximum speed ASPERA HIGH-SPEED FILE TRANSFER 80 GBIT/S OVER IP USING DPDK Performance, Code, and Architecture Charles Shiflett Developer of next-generation

More information

Apache Storm: Hands-on Session A.A. 2016/17

Apache Storm: Hands-on Session A.A. 2016/17 Università degli Studi di Roma Tor Vergata Dipartimento di Ingegneria Civile e Ingegneria Informatica Apache Storm: Hands-on Session A.A. 2016/17 Matteo Nardelli Laurea Magistrale in Ingegneria Informatica

More information

Design, Implementation, and Evaluation of the Linear Road Benchmark on the Stream Processing Core

Design, Implementation, and Evaluation of the Linear Road Benchmark on the Stream Processing Core Design, Implementation, and Evaluation of the Linear Road Benchmark on the Stream Processing Core Navendu Jain Lisa Amini, Henrique Andrade, Richard King, Yoonho Park, Philippe Selo, Chitra Venkatramani

More information

A Skiplist-based Concurrent Priority Queue with Minimal Memory Contention

A Skiplist-based Concurrent Priority Queue with Minimal Memory Contention A Skiplist-based Concurrent Priority Queue with Minimal Memory Contention Jonatan Lindén and Bengt Jonsson Uppsala University, Sweden December 18, 2013 Jonatan Lindén 1 Contributions Motivation: Improve

More information

Paper Presented by Harsha Yeddanapudy

Paper Presented by Harsha Yeddanapudy Storm@Twitter Ankit Toshniwal, Siddarth Taneja, Amit Shukla, Karthik Ramasamy, Jignesh M. Patel*, Sanjeev Kulkarni, Jason Jackson, Krishna Gade, Maosong Fu, Jake Donham, Nikunj Bhagat, Sailesh Mittal,

More information