PREDICTIVE DATACENTER ANALYTICS WITH STRYMON
1 PREDICTIVE DATACENTER ANALYTICS WITH STRYMON Vasia Kalavri QCon San Francisco 14 November 2017 Support:
2 ABOUT ME: Postdoc at ETH Zürich Systems Group; PMC member of Apache Flink. Research: large-scale graph processing, streaming dataflow engines. Current project: predictive datacenter analytics and management.
3 DATACENTER MANAGEMENT
6 DATACENTER MANAGEMENT: configuration updates, resource scaling, network failures, service deployment, workload fluctuations, software updates. Can we predict the effect of changes? Can we prevent catastrophic faults?
9 What-if Analysis: Predicting outcomes under hypothetical conditions. How will response time change if we migrate a large service? What will happen to link utilization if we change the routing protocol costs? Will SLOs be violated if a certain switch fails? Which services will be affected if we change the load balancing strategy?
11 Test deployment? A small-scale physical cluster for trying out configuration changes and what-ifs. Drawbacks: expensive to operate and maintain; some errors only occur at large scale. [Diagram: Data Center, Test Cluster]
12 Analytical model? Infer the workload distribution from samples and analytically model system components (e.g. disk failure rate). Drawbacks: hard to develop, large design space to explore, often inaccurate. [Diagram: Data Center]
13 MODERN ENTERPRISE DATACENTERS ARE ALREADY HEAVILY INSTRUMENTED
14 Trace-driven online simulation: Use existing instrumentation to build a datacenter model. Construct DC state from real events; simulate the state we cannot directly observe. [Diagram: the Data Center feeds the current DC state, which is forked into several what-if DC states]
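The fork-and-simulate idea on this slide can be sketched in plain Rust. This is a minimal illustration under stated assumptions, not Strymon's implementation: `DcState`, `link_up`, `link_down`, `reachable` and `what_if` are hypothetical names, the "state" is only an undirected topology, and forking is a plain `clone`.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

/// Hypothetical datacenter model: an undirected topology built from link events.
#[derive(Clone, Default)]
pub struct DcState {
    links: BTreeMap<u32, BTreeSet<u32>>,
}

impl DcState {
    /// Apply an observed (or hypothetical) "link up" event.
    pub fn link_up(&mut self, a: u32, b: u32) {
        self.links.entry(a).or_default().insert(b);
        self.links.entry(b).or_default().insert(a);
    }

    /// Apply an observed (or hypothetical) "link down" event.
    pub fn link_down(&mut self, a: u32, b: u32) {
        if let Some(n) = self.links.get_mut(&a) { n.remove(&b); }
        if let Some(n) = self.links.get_mut(&b) { n.remove(&a); }
    }

    /// Breadth-first search: can `from` still reach `to`?
    pub fn reachable(&self, from: u32, to: u32) -> bool {
        let mut seen = BTreeSet::from([from]);
        let mut queue = VecDeque::from([from]);
        while let Some(cur) = queue.pop_front() {
            if cur == to { return true; }
            for &next in self.links.get(&cur).into_iter().flatten() {
                if seen.insert(next) { queue.push_back(next); }
            }
        }
        false
    }

    /// Fork the current state, apply a hypothetical change to the copy only,
    /// and evaluate a query on the fork. The live state is left untouched.
    pub fn what_if<T>(
        &self,
        change: impl FnOnce(&mut DcState),
        eval: impl FnOnce(&DcState) -> T,
    ) -> T {
        let mut fork = self.clone();
        change(&mut fork);
        eval(&fork)
    }
}
```

For example, `dc.what_if(|s| s.link_down(1, 2), |s| s.reachable(1, 2))` asks whether nodes 1 and 2 stay connected if their direct link fails, without modifying `dc`. Cloning the whole state is the simplest possible fork; a real system would presumably share unchanged state between forks.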
15 STRYMON: ONLINE DATACENTER ANALYTICS AND MANAGEMENT. Datacenter event streams (traces, configuration, topology updates) flow into Strymon, which maintains the datacenter state and supports policy enforcement, what-if scenarios, queries, complex analytics, and simulations. strymon.systems.ethz.ch
16 TIMELY DATAFLOW [Diagram labels: ingress, egress, feedback]
17 STRYMON'S OPERATIONAL REQUIREMENTS. Low latency: react quickly to network failures. High throughput: keep up with high stream rates. Iterative computation: complex graph analytics on the network topology. Incremental computation: reuse already computed results when possible, e.g. do not recompute forwarding rules after a link update.
18 TIMELY DATAFLOW: STREAM PROCESSING IN RUST. Data-parallel computations; arbitrary cyclic dataflows; logical timestamps (epochs); asynchronous execution; low latency. Reference: D. Murray, F. McSherry, M. Isard, R. Isaacs, P. Barham, M. Abadi. Naiad: A Timely Dataflow System. In SOSP,
19 WORDCOUNT IN TIMELY
    // Imports follow the timely crate's examples; exact paths may differ between versions.
    extern crate timely;
    use timely::dataflow::{InputHandle, ProbeHandle};
    use timely::dataflow::operators::{Inspect, Map, Probe};
    use timely::dataflow::operators::aggregation::Aggregate;

    fn main() {
        // initialize and run a timely job
        timely::execute_from_args(std::env::args(), |worker| {
            // create input and progress handles
            let mut input = InputHandle::new();
            let mut probe = ProbeHandle::new();
            let index = worker.index();

            // define the dataflow and its operators
            worker.dataflow(|scope| {
                input.to_stream(scope)
                    .flat_map(|text: String|
                        text.split_whitespace()
                            // a few Rust peculiarities to get used to :-)
                            .map(move |word| (word.to_owned(), 1))
                            .collect::<Vec<_>>()
                    )
                    .aggregate(
                        |_key, val, agg| { *agg += val; },
                        |key, agg: i64| (key, agg),
                        |key| hash_str(key) // hash_str is a helper that is not shown on the slide
                    )
                    // watch for progress
                    .inspect(|data| println!("seen {:?}", data))
                    .probe_with(&mut probe);
            });
            // feed data
        }).unwrap();
    }
25 PROGRESS TRACKING: A distributed protocol that allows operators to reason about the possibility of receiving data. All tuples bear a logical timestamp (think event time). To send a timestamped tuple, an operator must hold a capability for it. Workers broadcast progress changes to other workers. Each worker independently determines progress.
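To make the capability idea concrete, here is a small single-process sketch in plain Rust. It is not timely's actual protocol or API: `ProgressTracker`, `update`, `frontier` and `is_complete` are hypothetical names, timestamps are plain `u64` epochs, and the broadcast between workers is left out. It only shows how counting outstanding capabilities per timestamp lets a worker decide that no more data can arrive for an epoch.

```rust
use std::collections::BTreeMap;

/// Hypothetical tracker: counts outstanding capabilities (the right to send
/// data) per logical timestamp.
#[derive(Default)]
pub struct ProgressTracker {
    outstanding: BTreeMap<u64, i64>,
}

impl ProgressTracker {
    /// Apply a progress change: +n when capabilities for `time` are acquired,
    /// -n when they are dropped.
    pub fn update(&mut self, time: u64, delta: i64) {
        let count = self.outstanding.entry(time).or_insert(0);
        *count += delta;
        if *count == 0 {
            self.outstanding.remove(&time);
        }
    }

    /// The earliest timestamp for which data may still be sent, if any.
    pub fn frontier(&self) -> Option<u64> {
        self.outstanding.keys().next().copied()
    }

    /// An epoch is complete once no capability at or before it remains.
    pub fn is_complete(&self, epoch: u64) -> bool {
        self.frontier().map_or(true, |f| f > epoch)
    }
}
```

In this sketch, an operator that holds capabilities for epochs 0 and 1 and then drops the one for epoch 0 moves the frontier to 1, so `is_complete(0)` becomes true while `is_complete(1)` stays false.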
26 MAKING PROGRESS
    fn main() {
        timely::execute_from_args(std::env::args(), |worker| {
            // input, probe and dataflow are set up as on the WordCount slide

            // The range bounds are missing from the transcript; `rounds` is a placeholder.
            for round in 0..rounds {
                // push data to the input stream
                input.send(("round".to_owned(), 1));
                // advance the input epoch
                input.advance_to(round + 1);
                // do work while there's still data for this epoch
                while probe.less_than(input.time()) {
                    worker.step();
                }
            }
        }).unwrap();
    }
30 TIMELY ITERATIONS
    timely::example(|scope| {
        // loop 100 times at most; advance timestamps by 1 in each iteration
        let (handle, stream) = scope.loop_variable(100, 1);
        (0..10).to_stream(scope)
            .concat(&stream)
            .inspect(|x| println!("seen: {:?}", x))
            // create the feedback loop
            .connect_loop(handle);
    });
    [Diagram labels: t, (t, l1), (t, (l1, l2))]
33 TIMELY & I. Relationship status: it's complicated. Performance, fault-tolerance, debugging, expressiveness, incremental computation, APIs & libraries, deployment, ecosystem.
36 STRYMON USE-CASES
37 Strymon: real-time datacenter analytics, incremental network routing, online critical path analysis, what-if analysis. Query and analyze state online; control and enforce configuration; understand performance; simulate what-if scenarios.
39 RECONSTRUCTING USER SESSIONS [Diagram: transaction trees for Application A (A.1, A.2, A.3) and Application B (B.1, B.2, B.3, B.4). Log record fields: Time: 2015/09/01 10:03:, Session ID: XKSHSKCBA53U088FXGE7LD8, Transaction ID:]
40 PERFORMANCE RESULTS: Logs from 1263 streams and 42 servers. 1.3 million events/s at MB/s. ms per epoch vs. 2.1 s per epoch with Flink. [Diagram labels: Client, Log event, Transaction, Transaction ID, Time, Inactivity] More results: Zaheer Chothia, John Liagouris, Desislava Dimitrova, and Timothy Roscoe. Online Reconstruction of Structural Information from Datacenter Logs. (EuroSys '17).
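Session reconstruction from log events can be sketched as follows. This is an illustrative assumption rather than the method from the EuroSys '17 paper: the record layout (`LogEvent` with a time in seconds and a session ID), the function name `sessionize`, and the rule "a gap longer than `inactivity` starts a new session" are hypothetical, and events are assumed to arrive in time order.

```rust
use std::collections::BTreeMap;

/// Hypothetical log record: a timestamp in seconds and a session identifier.
pub struct LogEvent {
    pub time: u64,
    pub session_id: String,
}

/// Groups time-ordered events by session ID and splits a group whenever the
/// gap between consecutive events exceeds `inactivity` seconds.
/// Returns, per session ID, the number of events in each reconstructed session.
pub fn sessionize(events: &[LogEvent], inactivity: u64) -> BTreeMap<String, Vec<usize>> {
    let mut last_seen: BTreeMap<&str, u64> = BTreeMap::new();
    let mut sessions: BTreeMap<String, Vec<usize>> = BTreeMap::new();
    for ev in events {
        let counts = sessions.entry(ev.session_id.clone()).or_default();
        // A session starts on the first event for an ID or after a long gap.
        let starts_new = match last_seen.insert(ev.session_id.as_str(), ev.time) {
            Some(prev) => ev.time.saturating_sub(prev) > inactivity,
            None => true,
        };
        if starts_new {
            counts.push(0);
        }
        if let Some(last) = counts.last_mut() {
            *last += 1;
        }
    }
    sessions
}
```

A streaming version would emit a session as soon as its inactivity window has passed instead of collecting everything in a map; the batch form is used here only to keep the sketch short.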
42 ROUTING AS A STREAMING COMPUTATION: Compute APSP (all-pairs shortest paths) and keep forwarding rules as operator state. Flow requests translate to lookups on that state. Network updates cause re-computation of affected rules only. [Diagram: network changes enter an iterative delta-join (G, R) that emits updated rules into the forwarding rules]
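The "recompute affected rules only" idea can be illustrated with a small plain-Rust sketch. It swaps in a much simpler technique than the iterative delta-join on the slide: the topology is undirected and unweighted, rules are rebuilt with one breadth-first search per destination, and a link failure triggers a rebuild only for destinations whose rules forward over the failed link. `Router`, `next_hop` and `fail_link` are hypothetical names.

```rust
use std::collections::{BTreeMap, BTreeSet, VecDeque};

type Node = u32;

/// Hypothetical router state: topology plus next-hop forwarding rules
/// keyed by (source, destination).
pub struct Router {
    adj: BTreeMap<Node, BTreeSet<Node>>,
    rules: BTreeMap<(Node, Node), Node>,
}

impl Router {
    /// Build the topology and compute rules for every destination.
    pub fn new(links: &[(Node, Node)]) -> Self {
        let mut adj: BTreeMap<Node, BTreeSet<Node>> = BTreeMap::new();
        for &(a, b) in links {
            adj.entry(a).or_default().insert(b);
            adj.entry(b).or_default().insert(a);
        }
        let mut router = Router { adj, rules: BTreeMap::new() };
        let nodes: Vec<Node> = router.adj.keys().copied().collect();
        for dst in nodes {
            router.recompute_destination(dst);
        }
        router
    }

    /// Rebuild all rules towards `dst` with a BFS rooted at `dst`.
    fn recompute_destination(&mut self, dst: Node) {
        self.rules.retain(|&(_, d), _| d != dst);
        let mut seen = BTreeSet::from([dst]);
        let mut queue = VecDeque::from([dst]);
        while let Some(cur) = queue.pop_front() {
            for &nb in &self.adj[&cur] {
                if seen.insert(nb) {
                    // `nb` reaches `dst` by forwarding to `cur`.
                    self.rules.insert((nb, dst), cur);
                    queue.push_back(nb);
                }
            }
        }
    }

    /// A flow request is a lookup on the rule state.
    pub fn next_hop(&self, src: Node, dst: Node) -> Option<Node> {
        self.rules.get(&(src, dst)).copied()
    }

    /// Remove a link and recompute rules only for destinations whose
    /// shortest-path tree used it. Returns how many destinations were rebuilt.
    pub fn fail_link(&mut self, a: Node, b: Node) -> usize {
        if let Some(s) = self.adj.get_mut(&a) { s.remove(&b); }
        if let Some(s) = self.adj.get_mut(&b) { s.remove(&a); }
        let affected: BTreeSet<Node> = self.rules.iter()
            .filter(|&(&(src, _), &hop)| (src == a && hop == b) || (src == b && hop == a))
            .map(|(&(_, dst), _)| dst)
            .collect();
        for &dst in &affected {
            self.recompute_destination(dst);
        }
        affected.len()
    }
}
```

Skipping unaffected destinations is safe here because removing a link never shortens a path: a shortest-path tree that does not use the failed link stays valid. Handling link additions or weighted links would need a different affected-set test.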
43 REACTION TO LINK FAILURES: Fail random link; 500 individual runs; 32 threads. More results: Desislava C. Dimitrova et al. Quick Incremental Routing Logic for Dynamic Network Graphs. SIGCOMM Posters and Demos '17.
45 ONLINE CRITICAL PATH ANALYSIS [Diagram: a dataflow system with an input stream and an output stream (Apache Flink, Apache Spark, TensorFlow, Heron, Timely Dataflow) emits a trace; periodic snapshots of the trace form a snapshot stream; the analyzer turns it into performance summaries]
46 TASK SCHEDULING IN APACHE SPARK [Diagram: DRIVER, W1, W2, W3] Venkataraman, Shivaram, et al. "Drizzle: Fast and adaptable stream processing at scale." Spark Summit (2016).
47 SCHEDULING BOTTLENECK IN APACHE SPARK [Plot: Strymon profiling vs. conventional profiling; CP % weight per snapshot for processing and scheduling] Apache Spark: Yahoo! Streaming Benchmark, 16 workers, 8 s snapshots.
50 EVALUATING LOAD BALANCING STRATEGIES: 13k services; ~100K user requests/s; OSPF routing; Weighted Round-Robin load balancing. Traffic matrix simulation under topology and load balancing changes.
51 WHAT-IF DATAFLOW [Diagram. Sources: Logging Servers, RPC logs (text); Topology Discovery, topology changes (graph); Configuration DBs, device specs and application placement (relational). Stages: ETL, DC Model construction, LB View Construction, Routing View Construction, Load Balancing, Routing, Traffic Matrix Construction]
52 STRYMON DEMO
53 Strymon has been released! Try it out: Send us feedback:
54 THE STRYMON TEAM & FRIENDS: Prof. Timothy Roscoe, Desislava Dimitrova, Moritz Hoffmann, Andrea Lattuada, John Liagouris, Zaheer Chothia, Sebastian Wicki, Frank McSherry, Vasiliki Kalavri. IT COULD BE YOU! strymon.systems.ethz.ch
Advanced Continuous Delivery Strategies for Containerized Applications Using DC/OS ContainerCon @ Open Source Summit North America 2017 Elizabeth K. Joseph @pleia2 1 Elizabeth K. Joseph, Developer Advocate
More informationCS 398 ACC Streaming. Prof. Robert J. Brunner. Ben Congdon Tyler Kim
CS 398 ACC Streaming Prof. Robert J. Brunner Ben Congdon Tyler Kim MP3 How s it going? Final Autograder run: - Tonight ~9pm - Tomorrow ~3pm Due tomorrow at 11:59 pm. Latest Commit to the repo at the time
More informationA Distributed System Case Study: Apache Kafka. High throughput messaging for diverse consumers
A Distributed System Case Study: Apache Kafka High throughput messaging for diverse consumers As always, this is not a tutorial Some of the concepts may no longer be part of the current system or implemented
More informationSpark Over RDMA: Accelerate Big Data SC Asia 2018 Ido Shamay Mellanox Technologies
Spark Over RDMA: Accelerate Big Data SC Asia 2018 Ido Shamay 1 Apache Spark - Intro Spark within the Big Data ecosystem Data Sources Data Acquisition / ETL Data Storage Data Analysis / ML Serving 3 Apache
More informationHiTune. Dataflow-Based Performance Analysis for Big Data Cloud
HiTune Dataflow-Based Performance Analysis for Big Data Cloud Jinquan (Jason) Dai, Jie Huang, Shengsheng Huang, Bo Huang, Yan Liu Intel Asia-Pacific Research and Development Ltd Shanghai, China, 200241
More informationIntelligent Performance Software Testing
White Paper Intelligent Performance Software Testing The field of software functional testing is undergoing a major transformation. What used to be an onerous manual process took a big step forward with
More informationOptimizing Network Performance in Distributed Machine Learning. Luo Mai Chuntao Hong Paolo Costa
Optimizing Network Performance in Distributed Machine Learning Luo Mai Chuntao Hong Paolo Costa Machine Learning Successful in many fields Online advertisement Spam filtering Fraud detection Image recognition
More informationPortable stateful big data processing in Apache Beam
Portable stateful big data processing in Apache Beam Kenneth Knowles Apache Beam PMC Software Engineer @ Google klk@google.com / @KennKnowles https://s.apache.org/ffsf-2017-beam-state Flink Forward San
More informationClearSpeed Visual Profiler
ClearSpeed Visual Profiler Copyright 2007 ClearSpeed Technology plc. All rights reserved. 12 November 2007 www.clearspeed.com 1 Profiling Application Code Why use a profiler? Program analysis tools are
More informationGLADE: A Scalable Framework for Efficient Analytics. Florin Rusu (University of California, Merced) Alin Dobra (University of Florida)
DE: A Scalable Framework for Efficient Analytics Florin Rusu (University of California, Merced) Alin Dobra (University of Florida) Big Data Analytics Big Data Storage is cheap ($100 for 1TB disk) Everything
More informationTailwind: Fast and Atomic RDMA-based Replication. Yacine Taleb, Ryan Stutsman, Gabriel Antoniu, Toni Cortes
Tailwind: Fast and Atomic RDMA-based Replication Yacine Taleb, Ryan Stutsman, Gabriel Antoniu, Toni Cortes In-Memory Key-Value Stores General purpose in-memory key-value stores are widely used nowadays
More informationCloud Monitoring as a Service. Built On Machine Learning
Cloud Monitoring as a Service Built On Machine Learning Table of Contents 1 2 3 4 5 6 7 8 9 10 Why Machine Learning Who Cares Four Dimensions to Cloud Monitoring Data Aggregation Anomaly Detection Algorithms
More informationChapter 4: Apache Spark
Chapter 4: Apache Spark Lecture Notes Winter semester 2016 / 2017 Ludwig-Maximilians-University Munich PD Dr. Matthias Renz 2015, Based on lectures by Donald Kossmann (ETH Zürich), as well as Jure Leskovec,
More informationVarys. Efficient Coflow Scheduling. Mosharaf Chowdhury, Yuan Zhong, Ion Stoica. UC Berkeley
Varys Efficient Coflow Scheduling Mosharaf Chowdhury, Yuan Zhong, Ion Stoica UC Berkeley Communication is Crucial Performance Facebook analytics jobs spend 33% of their runtime in communication 1 As in-memory
More informationApache Bahir Writing Applications using Apache Bahir
Apache Big Data Seville 2016 Apache Bahir Writing Applications using Apache Bahir Luciano Resende About Me Luciano Resende (lresende@apache.org) Architect and community liaison at Have been contributing
More informationDesigning Hybrid Data Processing Systems for Heterogeneous Servers
Designing Hybrid Data Processing Systems for Heterogeneous Servers Peter Pietzuch Large-Scale Distributed Systems (LSDS) Group Imperial College London http://lsds.doc.ic.ac.uk University
More informationShark. Hive on Spark. Cliff Engle, Antonio Lupher, Reynold Xin, Matei Zaharia, Michael Franklin, Ion Stoica, Scott Shenker
Shark Hive on Spark Cliff Engle, Antonio Lupher, Reynold Xin, Matei Zaharia, Michael Franklin, Ion Stoica, Scott Shenker Agenda Intro to Spark Apache Hive Shark Shark s Improvements over Hive Demo Alpha
More informationBig Data Technology Incremental Processing using Distributed Transactions
Big Data Technology Incremental Processing using Distributed Transactions Eshcar Hillel Yahoo! Ronny Lempel Outbrain *Based on slides by Edward Bortnikov and Ohad Shacham Roadmap Previous classes Stream
More informationA Scalable, Commodity Data Center Network Architecture
A Scalable, Commodity Data Center Network Architecture B Y M O H A M M A D A L - F A R E S A L E X A N D E R L O U K I S S A S A M I N V A H D A T P R E S E N T E D B Y N A N X I C H E N M A Y. 5, 2 0
More informationPocket: Elastic Ephemeral Storage for Serverless Analytics
Pocket: Elastic Ephemeral Storage for Serverless Analytics Ana Klimovic*, Yawen Wang*, Patrick Stuedi +, Animesh Trivedi +, Jonas Pfefferle +, Christos Kozyrakis* *Stanford University, + IBM Research 1
More informationOverview SENTINET 3.1
Overview SENTINET 3.1 Overview 1 Contents Introduction... 2 Customer Benefits... 3 Development and Test... 3 Production and Operations... 4 Architecture... 5 Technology Stack... 7 Features Summary... 7
More informationNext-Generation Cloud Platform
Next-Generation Cloud Platform Jangwoo Kim Jun 24, 2013 E-mail: jangwoo@postech.ac.kr High Performance Computing Lab Department of Computer Science & Engineering Pohang University of Science and Technology
More information利用 Mesos 打造高延展性 Container 環境. Frank, Microsoft MTC
利用 Mesos 打造高延展性 Container 環境 Frank, Microsoft MTC About Me Developer @ Yahoo! DevOps @ HTC Technical Architect @ MSFT Agenda About Docker Manage containers Apache Mesos Mesosphere DC/OS application = application
More informationCloudSwyft Learning-as-a-Service Course Catalog 2018 (Individual LaaS Course Catalog List)
CloudSwyft Learning-as-a-Service Course Catalog 2018 (Individual LaaS Course Catalog List) Microsoft Solution Latest Sl Area Refresh No. Course ID Run ID Course Name Mapping Date 1 AZURE202x 2 Microsoft
More informationFlexible Network Analytics in the Cloud. Jon Dugan & Peter Murphy ESnet Software Engineering Group October 18, 2017 TechEx 2017, San Francisco
Flexible Network Analytics in the Cloud Jon Dugan & Peter Murphy ESnet Software Engineering Group October 18, 2017 TechEx 2017, San Francisco Introduction Harsh realities of network analytics netbeam Demo
More informationMASSACHUSETTS INSTITUTE OF TECHNOLOGY Distributed System Engineering: Spring Exam II
Department of Electrical Engineering and Computer Science MASSACHUSETTS INSTITUTE OF TECHNOLOGY 6.824 Distributed System Engineering: Spring 2018 Exam II Write your name on this cover sheet. If you tear
More informationApache Flink: Distributed Stream Data Processing
Apache Flink: Distributed Stream Data Processing K.M.J. Jacobs CERN, Geneva, Switzerland 1 Introduction The amount of data is growing significantly over the past few years. Therefore, the need for distributed
More informationData Integrity in Stateful Services. Percona Live, Santa Clara, 2017
Data Integrity in Stateful Services Percona Live, Santa Clara, 2017 Data Integrity Bringing Sexy Back Protect the Data. -Every DBA who doesn t want to be fired Breaking Integrity Down Physical Integrity
More information