Pulsar. Realtime Analytics At Scale. Wang Xinglang

Size: px
Start display at page:

Download "Pulsar. Realtime Analytics At Scale. Wang Xinglang"

Transcription

1 Pulsar Realtime Analytics At Scale Wang Xinglang

2 Agenda Pulsar : Real Time Analytics At ebay Business Use Cases Product Requirements Pulsar : Technology Deep Dive 2

3 Pulsar

4 Business Use Case: Behavioral Analytics Behavioral Analytics : Analysis Of User Behavior on ebay Inputs : User Clicks, Impression, Conversions etc Problem In Session Targeting Campaign Optimization Requirements Low Latency <1 Sec E2E latency Scalability Millions of events/sec Data Accuracy Data Sources are lossy 4

5 Business Use Case: Monitoring Monitoring : Application, Systems and Infrastructure Monitoring Inputs : Heartbeats, SNMP, Logs Problem Collection, Aggregation of Metrics Co-relation and Alerting Requirements Scalability 10s of Millions of events/sec Availability 99.99% Uptime Flexibility User Driven Rules On the fly joins with other streams, data sources 5

6 Business Use Case: Security Security : Traffic Limiting, DOS Detection, Bot Detection Inputs : Clicks, Impressions, Logs Problem Enforce Quotas in Real-time Detect and Prevent Bad Bots Requirements Scalability 10s of Millions of events/sec Latency 1 sec SLA Flexibility User Drive Custom Rules Data Accuracy 99% is acceptable 6

7 Pulsar Product Requirements Summary Scalability Scale to 10s of Millions Events/Sec Latency <1 Sec Delivery of Events Availability Highly Available System No downtime during upgrades Disaster Recovery Support across data centers Flexibility User Driven Complex Rules. Eg ( CPU/TPS > 3 AND ERRORS/TPS>0.2) Data Accuracy Should deal with missing data 99.9% Delivery Guarantee 7

8 Pulsar Technical Deep Dive

9 Pulsar CEP framework JetStream 9

10 Pulsar Deployment Architecture Zookeeper DC#1 CEP Cell Kafka Queue CEP Cell Mongo CEP Cell Producer Events Consumer CEP Application Cluster Ring Cell pipeline Inbound Channel Processor SQL Outbound Channel CEP Cell CEP Cell Mongo CEP Cell DC#2 Zookeeper Kafka Queue

11 Availability And Scalability Multi datacenter failovers Dynamic Partitioning Elastic Clusters Self Healing Shutdown Orchestration Dynamic Flow Routing Dynamic Topology Changes 11

12 Pulsar Framework Building Blocks Inbound Channel Inbound Channel Event Processor Outbound Channel Outbound Channel Event = Tuples (K,V) - Mutable Inbound Channel (Event Source) adapts to external world and sources events Outbound Channel (Event Sink) adapts to external world and consumes events Event Processor (Event Sink and Event Source) Pipelining, Batching & Flow control 12

13 Pulsar Application (CEP Cell) Inbound Channel-1 Inbound Channel-2 Processor-1 Processor-2 Outbound Channel Spring Container JVM <bean id="inboundchannel-1 class="com.ebay.jetstream.event.channel.messaging.inboundmessagingchan nel"> <property name="address" ref="inboundchanneladdress" /> <property name="eventsinks"> <list> <ref bean= Processor-1" /> </list> </property> </bean> <bean id="processor-1" class="com.ebay.jetstream.event.processor.esper.esperprocessor"> <property name="espereventlistener" ref="espereventlistener" /> <property name="configuration" ref="esperconfiguration" /> <property name="epl" ref="epl" /> <property name="eventsinks"> <list> <ref bean= outboundchannel" /> </list> </property> </bean>

14 Messaging And Clustering Producer Push Model Consumer Producer Consumer (At most once delivery semantics) Broker Queue Pull Model Producer Pause/Resume Consumer (At least once delivery semantics) Broker Replayer Queue Hybrid Model

15 Real Time Stream Processing The 8 Requirements of Real-Time Stream Processing (Ref. Stonebraker) Rule 1: Keep the Data Moving Rule 2: Query using SQL on Streams (StreamSQL) Rule 3: Handle Stream Imperfections Rule 4: Generate Predictable Outcomes Rule 5: Integrate Stored and Streaming Data Rule 6: Guarantee Data Safety and Availability Rule 7: Partition and Scale Applications Automatically Rule 8: Process and Respond Instantaneously Rule 9: Continuous Processing Rule 10: Dynamic Topology Changes Rule 11: Dynamic SQL changes & flow control 15

16 Flexibility 16

17 Flexibility : CEP Engine (Esper) SQL like language for specifying processing rules Analysis over rolling and tumbling windows of time Filtering and Joining streams Grouping and Ordering output For routing events between stages and between clusters Event Mutation Correlation Patterns 17

18 Flexibility : DATA MODEL <bean id="eventdefinition" class="com.ebay.jetstream.event.processor.esper.esperdeclaredevents"> <property name="eventtypes"> <list> key <bean class="com.ebay.jetstream.event.processor.esper.mapeventtype"> <property name="eventalias" value="rawevent"/> <property name="eventfields"> js_ev_type <map> <entry key= D1" value="java.lang.string"/> <entry key= D2" value="java.lang.string"/> <entry key= D3" value="java.lang.string"/> <entry key= D4" value="java.lang.integer"/> <entry key= D5" value="java.lang.integer"/> <entry key="createtime" value="java.lang.long"/> </map> </property> </bean> <list> </property? </bean> value RawEvent D D D D D5 1 createtime

19 Flexibility : Event Processing Language (EPL) Metrics over Rolling window INSERT INTO RateOffender SELECT D1, count(*) AS total_count FROM RawEvent(D5 = 1).win:time(10 sec) group by D1 having count(*) > 8 output last every 1 Rate.Monitor/rateOffender ) SELECT D1, block as Policy FROM RateOffender; Rate.Monitor/rateOffender CEP OMC OKC 19

20 Flexibility : Event Processing Language (EPL) Filtering, Mutation & Routing INSERT INTO STREAMROUTE SELECT D1, D2, D3, D4 FROM RawEvent(D1 in (' ',' ') and Common.isNumeric(D2) and D2!= OMC") = D2") SELECT * FROM STREAMROUTE; ChanA/aEvent CEP OMC ChanB/bEvent 20

21 Flexibility : Event Processing Language (EPL) Creating Multidimensional Metrics Over Tumbling Windows create context MCContext end after 10 seconds; context MCContext insert into MetricAggregate select count(*) as count, D1, D2, M1' as metricname from RawEvent(D1 is not null and D2 is not null) group by D1, D2 output snapshot when terminated; OMC") name="grpdim", dimensionspan= D1, D2, M1")) select * from MetricAggregate; MetricAggregate IMC2 CEP2 CEP1 IMC1 RawEvent OMC Time Series DB 21

22 Flexibility : Top N, Distinct Count & Percentiles Computation insert into TOPITEMSTREAM select count(*) as count, D1 from RawEvent() group by D1; select * from TOPITEMSTREAM order by count; Computations are heavy both in time and space. High Cardinality dimensions makes it worse Consider approximate algorithms Margin of error around1% Costs very little in space and time Trade off accuracy for performance Implemented as aggregate functions insert into TOPITEMSTREAM select TopN(1000, 10, D1) as topitems from RawEvent(); select * from TOPITEMSTREAM; 22

23 Flexibility : Hot Deployment of EPL Declare Event EPL Grammer Event Processing Engine EPL ANTLR EPL Java Code CGLI B EPL Java Byte Code Event Executes at run time Event

24 Pulsar Stream Pipeline 24

25 Pulsar Real-time Analytics In-memory compute cloud Marketing Personalization Behavioral Events Business Events Enrich Aggregate Filter Mutate Dashboards Machine Learning Security Queries Risk Complex Event Processing: SQL on stream data Custom sub-stream creation: Filtering and Mutation In Memory Aggregation: Multi Dimensional counting 25

26 Pulsar Real Time Pipeline Consumers Avg latency < 100 millis, Hundreds of thousands events/sec, Availability Producers Real-time Data Visualization Ebay App Servers Collector BOT Detection Enrichment(Geo & Device Classification) Sessionization Event Distributor (filtering/ Mutation) Metrics Calculator Reports Geo DB Session DB Metrics Store

27 Key Takeaways Creating pipelines declaratively SQL driven processing logic with hot deployment of SQL Framework for custom SQL extensions Dynamic partitioning and flow control < 100 millisecond pipeline latency Availability < 0.01% steady state data loss Cloud deployable 27

28 28

29 Q&A Thanks

IoT Sensor Analytics with Apache Kafka, KSQL and TensorFlow

IoT Sensor Analytics with Apache Kafka, KSQL and TensorFlow 1 IoT Sensor Analytics with Apache Kafka, KSQL and TensorFlow Kafka-Native End-to-End IoT Data Integration and Processing Kai Waehner - Technology Evangelist kontakt@kai-waehner.de - LinkedIn Twitter :

More information

Kafka Streams: Hands-on Session A.A. 2017/18

Kafka Streams: Hands-on Session A.A. 2017/18 Università degli Studi di Roma Tor Vergata Dipartimento di Ingegneria Civile e Ingegneria Informatica Kafka Streams: Hands-on Session A.A. 2017/18 Matteo Nardelli Laurea Magistrale in Ingegneria Informatica

More information

VOLTDB + HP VERTICA. page

VOLTDB + HP VERTICA. page VOLTDB + HP VERTICA ARCHITECTURE FOR FAST AND BIG DATA ARCHITECTURE FOR FAST + BIG DATA FAST DATA Fast Serve Analytics BIG DATA BI Reporting Fast Operational Database Streaming Analytics Columnar Analytics

More information

Esper EQC. Horizontal Scale-Out for Complex Event Processing

Esper EQC. Horizontal Scale-Out for Complex Event Processing Esper EQC Horizontal Scale-Out for Complex Event Processing Esper EQC - Introduction Esper query container (EQC) is the horizontal scale-out architecture for Complex Event Processing with Esper and EsperHA

More information

Apache BookKeeper. A High Performance and Low Latency Storage Service

Apache BookKeeper. A High Performance and Low Latency Storage Service Apache BookKeeper A High Performance and Low Latency Storage Service Hello! I am Sijie Guo - PMC Chair of Apache BookKeeper Co-creator of Apache DistributedLog Twitter Messaging/Pub-Sub Team Yahoo! R&D

More information

Event Streams using Apache Kafka

Event Streams using Apache Kafka Event Streams using Apache Kafka And how it relates to IBM MQ Andrew Schofield Chief Architect, Event Streams STSM, IBM Messaging, Hursley Park Event-driven systems deliver more engaging customer experiences

More information

WEBSCALE CONVERGED APPLICATION DELIVERY PLATFORM

WEBSCALE CONVERGED APPLICATION DELIVERY PLATFORM SECURITY ANALYTICS WEBSCALE CONVERGED APPLICATION DELIVERY PLATFORM BLAZING PERFORMANCE, HIGH AVAILABILITY AND ROBUST SECURITY FOR YOUR CRITICAL WEB APPLICATIONS OVERVIEW Webscale is a converged multi-cloud

More information

Fluentd + MongoDB + Spark = Awesome Sauce

Fluentd + MongoDB + Spark = Awesome Sauce Fluentd + MongoDB + Spark = Awesome Sauce Nishant Sahay, Sr. Architect, Wipro Limited Bhavani Ananth, Tech Manager, Wipro Limited Your company logo here Wipro Open Source Practice: Vision & Mission Vision

More information

Deep Dive into Concepts and Tools for Analyzing Streaming Data

Deep Dive into Concepts and Tools for Analyzing Streaming Data Deep Dive into Concepts and Tools for Analyzing Streaming Data Dr. Steffen Hausmann Sr. Solutions Architect, Amazon Web Services Data originates in real-time Photo by mountainamoeba https://www.flickr.com/photos/mountainamoeba/2527300028/

More information

Esper. Luca Montanari. MIDLAB. Middleware Laboratory

Esper. Luca Montanari. MIDLAB. Middleware Laboratory Esper Luca Montanari montanari@dis.uniroma1.it Esper Open Source CEP and ESP engine Available for Java as Esper, for.net as NEsper Developed by Codehaus http://esper.codehaus.org/ (write esper complex

More information

@joerg_schad Nightmares of a Container Orchestration System

@joerg_schad Nightmares of a Container Orchestration System @joerg_schad Nightmares of a Container Orchestration System 2017 Mesosphere, Inc. All Rights Reserved. 1 Jörg Schad Distributed Systems Engineer @joerg_schad Jan Repnak Support Engineer/ Solution Architect

More information

Data Analytics at Logitech Snowflake + Tableau = #Winning

Data Analytics at Logitech Snowflake + Tableau = #Winning Welcome # T C 1 8 Data Analytics at Logitech Snowflake + Tableau = #Winning Avinash Deshpande I am a futurist, scientist, engineer, designer, data evangelist at heart Find me at Avinash Deshpande Chief

More information

How we built a highly scalable Machine Learning platform using Apache Mesos

How we built a highly scalable Machine Learning platform using Apache Mesos How we built a highly scalable Machine Learning platform using Apache Mesos Daniel Sârbe Development Manager, BigData and Cloud Machine Translation @ SDL Co-founder of BigData/DataScience Meetup Cluj,

More information

Apache Flink Big Data Stream Processing

Apache Flink Big Data Stream Processing Apache Flink Big Data Stream Processing Tilmann Rabl Berlin Big Data Center www.dima.tu-berlin.de bbdc.berlin rabl@tu-berlin.de XLDB 11.10.2017 1 2013 Berlin Big Data Center All Rights Reserved DIMA 2017

More information

1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda

1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda Agenda Oracle9i Warehouse Review Dulcian, Inc. Oracle9i Server OLAP Server Analytical SQL Mining ETL Infrastructure 9i Warehouse Builder Oracle 9i Server Overview E-Business Intelligence Platform 9i Server:

More information

SQL Azure. Abhay Parekh Microsoft Corporation

SQL Azure. Abhay Parekh Microsoft Corporation SQL Azure By Abhay Parekh Microsoft Corporation Leverage this Presented by : - Abhay S. Parekh MSP & MSP Voice Program Representative, Microsoft Corporation. Before i begin Demo Let s understand SQL Azure

More information

Big Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara

Big Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara Big Data Technology Ecosystem Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara Agenda End-to-End Data Delivery Platform Ecosystem of Data Technologies Mapping an End-to-End Solution Case

More information

Increase Value from Big Data with Real-Time Data Integration and Streaming Analytics

Increase Value from Big Data with Real-Time Data Integration and Streaming Analytics Increase Value from Big Data with Real-Time Data Integration and Streaming Analytics Cy Erbay Senior Director Striim Executive Summary Striim is Uniquely Qualified to Solve the Challenges of Real-Time

More information

An Implementation of Fog Computing Attributes in an IoT Environment

An Implementation of Fog Computing Attributes in an IoT Environment An Implementation of Fog Computing Attributes in an IoT Environment Ranjit Deshpande CTO K2 Inc. Introduction Ranjit Deshpande CTO K2 Inc. K2 Inc. s end-to-end IoT platform Transforms Sensor Data into

More information

Un'introduzione a Kafka Streams e KSQL and why they matter! ITOUG Tech Day Roma 1 Febbraio 2018

Un'introduzione a Kafka Streams e KSQL and why they matter! ITOUG Tech Day Roma 1 Febbraio 2018 Un'introduzione a Kafka Streams e KSQL and why they matter! ITOUG Tech Day Roma 1 Febbraio 2018 R E T H I N K I N G Stream Processing with Apache Kafka Kafka the Streaming Data Platform 1.0 Enterprise

More information

TIBCO Complex Event Processing Evaluation Guide

TIBCO Complex Event Processing Evaluation Guide TIBCO Complex Event Processing Evaluation Guide This document provides a guide to evaluating CEP technologies. http://www.tibco.com Global Headquarters 3303 Hillview Avenue Palo Alto, CA 94304 Tel: +1

More information

High-Performance Event Processing Bridging the Gap between Low Latency and High Throughput Bernhard Seeger University of Marburg

High-Performance Event Processing Bridging the Gap between Low Latency and High Throughput Bernhard Seeger University of Marburg High-Performance Event Processing Bridging the Gap between Low Latency and High Throughput Bernhard Seeger University of Marburg common work with Nikolaus Glombiewski, Michael Körber, Marc Seidemann 1.

More information

How Microsoft Built MySQL, PostgreSQL and MariaDB for the Cloud. Santa Clara, California April 23th 25th, 2018

How Microsoft Built MySQL, PostgreSQL and MariaDB for the Cloud. Santa Clara, California April 23th 25th, 2018 How Microsoft Built MySQL, PostgreSQL and MariaDB for the Cloud Santa Clara, California April 23th 25th, 2018 Azure Data Service Architecture Share Cluster with SQL DB Azure Infrastructure Services Azure

More information

RIPE75 - Network monitoring at scale. Louis Poinsignon

RIPE75 - Network monitoring at scale. Louis Poinsignon RIPE75 - Network monitoring at scale Louis Poinsignon Why monitoring and what to monitor? Why do we monitor? Billing Reducing costs Traffic engineering Where should we peer? Where should we set-up a new

More information

EVCache: Lowering Costs for a Low Latency Cache with RocksDB. Scott Mansfield Vu Nguyen EVCache

EVCache: Lowering Costs for a Low Latency Cache with RocksDB. Scott Mansfield Vu Nguyen EVCache EVCache: Lowering Costs for a Low Latency Cache with RocksDB Scott Mansfield Vu Nguyen EVCache 90 seconds What do caches touch? Signing up* Logging in Choosing a profile Picking liked videos

More information

Architectural challenges for building a low latency, scalable multi-tenant data warehouse

Architectural challenges for building a low latency, scalable multi-tenant data warehouse Architectural challenges for building a low latency, scalable multi-tenant data warehouse Mataprasad Agrawal Solutions Architect, Services CTO 2017 Persistent Systems Ltd. All rights reserved. Our analytics

More information

DSMS Benchmarking. Morten Lindeberg University of Oslo

DSMS Benchmarking. Morten Lindeberg University of Oslo DSMS Benchmarking Morten Lindeberg University of Oslo Agenda Introduction DSMS Recap General Requirements Metrics Example: Linear Road Example: StreamBench 30. Sep. 2009 INF5100 - Morten Lindeberg 2 Introduction

More information

Monitoring and Operating a Private Cloud with System Center 2012 (70-246) Course Outline Module 1: Introduction to the Private Cloud

Monitoring and Operating a Private Cloud with System Center 2012 (70-246) Course Outline Module 1: Introduction to the Private Cloud Monitoring and Operating a Private Cloud with System Center 2012 10750 246 Configuring and Deploying a Private Cloud with System Center 2012 10751 247 Duration : 40 Hrs /module Course 10750A: Monitoring

More information

17/05/2017. What we ll cover. Who is Greg? Why PaaS and SaaS? What we re not discussing: IaaS

17/05/2017. What we ll cover. Who is Greg? Why PaaS and SaaS? What we re not discussing: IaaS What are all those Azure* and Power* services and why do I want them? Dr Greg Low SQL Down Under greg@sqldownunder.com Who is Greg? CEO and Principal Mentor at SDU Data Platform MVP Microsoft Regional

More information

Scalable Streaming Analytics

Scalable Streaming Analytics Scalable Streaming Analytics KARTHIK RAMASAMY @karthikz TALK OUTLINE BEGIN I! II ( III b Overview Storm Overview Storm Internals IV Z V K Heron Operational Experiences END WHAT IS ANALYTICS? according

More information

Cisco Tetration Analytics

Cisco Tetration Analytics Cisco Tetration Analytics Enhanced security and operations with real time analytics Christopher Say (CCIE RS SP) Consulting System Engineer csaychoh@cisco.com Challenges in operating a hybrid data center

More information

Azure Webinar. Resilient Solutions March Sander van den Hoven Principal Technical Evangelist Microsoft

Azure Webinar. Resilient Solutions March Sander van den Hoven Principal Technical Evangelist Microsoft Azure Webinar Resilient Solutions March 2017 Sander van den Hoven Principal Technical Evangelist Microsoft DX @svandenhoven 1 What is resilience? Client Client API FrontEnd Client Client Client Loadbalancer

More information

The Stream Processor as a Database. Ufuk

The Stream Processor as a Database. Ufuk The Stream Processor as a Database Ufuk Celebi @iamuce Realtime Counts and Aggregates The (Classic) Use Case 2 (Real-)Time Series Statistics Stream of Events Real-time Statistics 3 The Architecture collect

More information

Machine Learning meets Databases. Ioannis Papapanagiotou Cloud Database Engineering

Machine Learning meets Databases. Ioannis Papapanagiotou Cloud Database Engineering Machine Learning meets Databases Ioannis Papapanagiotou Cloud Database Engineering Create Personalized Recommendations for discoveries of engaging video content that maximizes member joy. Personalize Everything

More information

Apache Ignite TM - In- Memory Data Fabric Fast Data Meets Open Source

Apache Ignite TM - In- Memory Data Fabric Fast Data Meets Open Source Apache Ignite TM - In- Memory Data Fabric Fast Data Meets Open Source DMITRIY SETRAKYAN Founder, PPMC https://ignite.apache.org @apacheignite @dsetrakyan Agenda About In- Memory Computing Apache Ignite

More information

WLS Neue Optionen braucht das Land

WLS Neue Optionen braucht das Land WLS Neue Optionen braucht das Land Sören Halter Principal Sales Consultant 2016-11-16 Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information

More information

Document Sub Title. Yotpo. Technical Overview 07/18/ Yotpo

Document Sub Title. Yotpo. Technical Overview 07/18/ Yotpo Document Sub Title Yotpo Technical Overview 07/18/2016 2015 Yotpo Contents Introduction... 3 Yotpo Architecture... 4 Yotpo Back Office (or B2B)... 4 Yotpo On-Site Presence... 4 Technologies... 5 Real-Time

More information

Introduc)on to Apache Ka1a. Jun Rao Co- founder of Confluent

Introduc)on to Apache Ka1a. Jun Rao Co- founder of Confluent Introduc)on to Apache Ka1a Jun Rao Co- founder of Confluent Agenda Why people use Ka1a Technical overview of Ka1a What s coming What s Apache Ka1a Distributed, high throughput pub/sub system Ka1a Usage

More information

Upgrade Your MuleESB with Solace s Messaging Infrastructure

Upgrade Your MuleESB with Solace s Messaging Infrastructure The era of ubiquitous connectivity is upon us. The amount of data most modern enterprises must collect, process and distribute is exploding as a result of real-time process flows, big data, ubiquitous

More information

How to Troubleshoot Databases and Exadata Using Oracle Log Analytics

How to Troubleshoot Databases and Exadata Using Oracle Log Analytics How to Troubleshoot Databases and Exadata Using Oracle Log Analytics Nima Haddadkaveh Director, Product Management Oracle Management Cloud October, 2018 Copyright 2018, Oracle and/or its affiliates. All

More information

DURATION : 03 DAYS. same along with BI tools.

DURATION : 03 DAYS. same along with BI tools. AWS REDSHIFT TRAINING MILDAIN DURATION : 03 DAYS To benefit from this Amazon Redshift Training course from mildain, you will need to have basic IT application development and deployment concepts, and good

More information

Datacenter Management and The Private Cloud. Troy Sharpe Core Infrastructure Specialist Microsoft Corp, Education

Datacenter Management and The Private Cloud. Troy Sharpe Core Infrastructure Specialist Microsoft Corp, Education Datacenter Management and The Private Cloud Troy Sharpe Core Infrastructure Specialist Microsoft Corp, Education System Center Helps Deliver IT as a Service Configure App Controller Orchestrator Deploy

More information

Data 101 Which DB, When Joe Yong Sr. Program Manager Microsoft Corp.

Data 101 Which DB, When Joe Yong Sr. Program Manager Microsoft Corp. 17-18 March, 2018 Beijing Data 101 Which DB, When Joe Yong Sr. Program Manager Microsoft Corp. The world is changing AI increased by 300% in 2017 Data will grow to 44 ZB in 2020 Today, 80% of organizations

More information

Auto Management for Apache Kafka and Distributed Stateful System in General

Auto Management for Apache Kafka and Distributed Stateful System in General Auto Management for Apache Kafka and Distributed Stateful System in General Jiangjie (Becket) Qin Data Infrastructure @LinkedIn GIAC 2017, 12/23/17@Shanghai Agenda Kafka introduction and terminologies

More information

CLUSTERING HIVEMQ. Building highly available, horizontally scalable MQTT Broker Clusters

CLUSTERING HIVEMQ. Building highly available, horizontally scalable MQTT Broker Clusters CLUSTERING HIVEMQ Building highly available, horizontally scalable MQTT Broker Clusters 12/2016 About this document MQTT is based on a publish/subscribe architecture that decouples MQTT clients and uses

More information

Challenges for Data Driven Systems

Challenges for Data Driven Systems Challenges for Data Driven Systems Eiko Yoneki University of Cambridge Computer Laboratory Data Centric Systems and Networking Emergence of Big Data Shift of Communication Paradigm From end-to-end to data

More information

Data pipelines with PostgreSQL & Kafka

Data pipelines with PostgreSQL & Kafka Data pipelines with PostgreSQL & Kafka Oskari Saarenmaa PostgresConf US 2018 - Jersey City Agenda 1. Introduction 2. Data pipelines, old and new 3. Apache Kafka 4. Sample data pipeline with Kafka & PostgreSQL

More information

UMP Alert Engine. Status. Requirements

UMP Alert Engine. Status. Requirements UMP Alert Engine Status Requirements Goal Terms Proposed Design High Level Diagram Alert Engine Topology Stream Receiver Stream Router Policy Evaluator Alert Publisher Alert Topology Detail Diagram Alert

More information

Index. Raul Estrada and Isaac Ruiz 2016 R. Estrada and I. Ruiz, Big Data SMACK, DOI /

Index. Raul Estrada and Isaac Ruiz 2016 R. Estrada and I. Ruiz, Big Data SMACK, DOI / Index A ACID, 251 Actor model Akka installation, 44 Akka logos, 41 OOP vs. actors, 42 43 thread-based concurrency, 42 Agents server, 140, 251 Aggregation techniques materialized views, 216 probabilistic

More information

Hadoop JMX Monitoring and Alerting

Hadoop JMX Monitoring and Alerting Hadoop JMX Monitoring and Alerting Introduction High-Level Monitoring/Alert Flow Metrics Collector Agent Metrics Storage NameNode Metrics DataNode Metrics HBase Master Metrics RegionServer Metrics Data

More information

Lenses 2.1 Enterprise Features PRODUCT DATA SHEET

Lenses 2.1 Enterprise Features PRODUCT DATA SHEET Lenses 2.1 Enterprise Features PRODUCT DATA SHEET 1 OVERVIEW DataOps is the art of progressing from data to value in seconds. For us, its all about making data operations as easy and fast as using the

More information

Distributed Data on Distributed Infrastructure. Claudius Weinberger & Kunal Kusoorkar, ArangoDB Jörg Schad, Mesosphere

Distributed Data on Distributed Infrastructure. Claudius Weinberger & Kunal Kusoorkar, ArangoDB Jörg Schad, Mesosphere Distributed Data on Distributed Infrastructure Claudius Weinberger & Kunal Kusoorkar, ArangoDB Jörg Schad, Mesosphere Kunal Kusoorkar Director Solutions Engineering, ArangoDB @neunhoef Jörg Schad Claudius

More information

Modern Stream Processing with Apache Flink

Modern Stream Processing with Apache Flink 1 Modern Stream Processing with Apache Flink Till Rohrmann GOTO Berlin 2017 2 Original creators of Apache Flink da Platform 2 Open Source Apache Flink + da Application Manager 3 What changes faster? Data

More information

Data-Driven DevOps: Bringing Visibility to Any Cloud, Any App, & Any Device. Erik Giesa SVP of Marketing and Business Development, ExtraHop Networks

Data-Driven DevOps: Bringing Visibility to Any Cloud, Any App, & Any Device. Erik Giesa SVP of Marketing and Business Development, ExtraHop Networks Data-Driven DevOps: Bringing Visibility to Any Cloud, Any App, & Any Device Erik Giesa SVP of Marketing and Business Development, ExtraHop Networks Your Monitoring Strategy Must Change How can you maintain

More information

Nimble Storage Adaptive Flash

Nimble Storage Adaptive Flash Nimble Storage Adaptive Flash Read more Nimble solutions Contact Us 800-544-8877 solutions@microage.com MicroAge.com TECHNOLOGY OVERVIEW Nimble Storage Adaptive Flash Nimble Storage s Adaptive Flash platform

More information

Building Durable Real-time Data Pipeline

Building Durable Real-time Data Pipeline Building Durable Real-time Data Pipeline Apache BookKeeper at Twitter @sijieg Twitter Background Layered Architecture Agenda Design Details Performance Scale @Twitter Q & A Publish-Subscribe Online services

More information

Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme

Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme CNA2080BU Deep Dive: How to Deploy and Operationalize Kubernetes Cornelia Davis, Pivotal Nathan Ness Technical Product Manager, CNABU @nvpnathan #VMworld #CNA2080BU Disclaimer This presentation may contain

More information

Container 2.0. Container: check! But what about persistent data, big data or fast data?!

Container 2.0. Container: check! But what about persistent data, big data or fast data?! @unterstein @joerg_schad @dcos @jaxdevops Container 2.0 Container: check! But what about persistent data, big data or fast data?! 1 Jörg Schad Distributed Systems Engineer @joerg_schad Johannes Unterstein

More information

Complex Event Processing

Complex Event Processing Complex Event Processing Developing event driven applications with Esper Dan Pritchett Rearden Commerce Dan Pritchett Complex Event Processing: Developing event driven applications with Esper Slide 1 Why

More information

TrueSight 10 Architecture & Scalability Q&A Best Practice Webinar 8/18/2015

TrueSight 10 Architecture & Scalability Q&A Best Practice Webinar 8/18/2015 Q: Where can I find the TrueSight Operations Management Best Practice material? A: TrueSight OM Best Practice material is published on the BMC Communities web site at the following link. https://communities.bmc.com/docs/doc-37443

More information

Craig Blitz Oracle Coherence Product Management

Craig Blitz Oracle Coherence Product Management Software Architecture for Highly Available, Scalable Trading Apps: Meeting Low-Latency Requirements Intentionally Craig Blitz Oracle Coherence Product Management 1 Copyright 2011, Oracle and/or its affiliates.

More information

Data Acquisition. The reference Big Data stack

Data Acquisition. The reference Big Data stack Università degli Studi di Roma Tor Vergata Dipartimento di Ingegneria Civile e Ingegneria Informatica Data Acquisition Corso di Sistemi e Architetture per Big Data A.A. 2016/17 Valeria Cardellini The reference

More information

Availability and Performance for Tier1 applications

Availability and Performance for Tier1 applications Assaf Fraenkel Senior Architect (MCA+MCM SQL 2008) MCS Israel Availability and Performance for Tier1 applications Agenda and Takeaways Agenda: Introduce the new SQL Server High Availability and Disaster

More information

Apache Ignite and Apache Spark Where Fast Data Meets the IoT

Apache Ignite and Apache Spark Where Fast Data Meets the IoT Apache Ignite and Apache Spark Where Fast Data Meets the IoT Denis Magda GridGain Product Manager Apache Ignite PMC http://ignite.apache.org #apacheignite #denismagda Agenda IoT Demands to Software IoT

More information

Putting it together. Data-Parallel Computation. Ex: Word count using partial aggregation. Big Data Processing. COS 418: Distributed Systems Lecture 21

Putting it together. Data-Parallel Computation. Ex: Word count using partial aggregation. Big Data Processing. COS 418: Distributed Systems Lecture 21 Big Processing -Parallel Computation COS 418: Distributed Systems Lecture 21 Michael Freedman 2 Ex: Word count using partial aggregation Putting it together 1. Compute word counts from individual files

More information

<Insert Picture Here> Oracle Coherence & Extreme Transaction Processing (XTP)

<Insert Picture Here> Oracle Coherence & Extreme Transaction Processing (XTP) Oracle Coherence & Extreme Transaction Processing (XTP) Gary Hawks Oracle Coherence Solution Specialist Extreme Transaction Processing What is XTP? Introduction to Oracle Coherence

More information

High Availability for PROS Pricing Solution Suite on SQL Server

High Availability for PROS Pricing Solution Suite on SQL Server High Availability for PROS Pricing Solution Suite on SQL Server Step-by-step guidance for setting up high availability for the PROS Pricing Solution Suite running on SQL Server 2008 R2 Enterprise Technical

More information

Oracle NoSQL Database Overview Marie-Anne Neimat, VP Development

Oracle NoSQL Database Overview Marie-Anne Neimat, VP Development Oracle NoSQL Database Overview Marie-Anne Neimat, VP Development June14, 2012 1 Copyright 2012, Oracle and/or its affiliates. All rights Agenda Big Data Overview Oracle NoSQL Database Architecture Technical

More information

Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme

Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme CNA1612BU Deploying real-world workloads on Kubernetes and Pivotal Cloud Foundry VMworld 2017 Fred Melo, Director of Technology, Pivotal Merlin Glynn, Sr. Technical Product Manager, VMware Content: Not

More information

The Future of Real-Time in Spark

The Future of Real-Time in Spark The Future of Real-Time in Spark Reynold Xin @rxin Spark Summit, New York, Feb 18, 2016 Why Real-Time? Making decisions faster is valuable. Preventing credit card fraud Monitoring industrial machinery

More information

Intra-cluster Replication for Apache Kafka. Jun Rao

Intra-cluster Replication for Apache Kafka. Jun Rao Intra-cluster Replication for Apache Kafka Jun Rao About myself Engineer at LinkedIn since 2010 Worked on Apache Kafka and Cassandra Database researcher at IBM Outline Overview of Kafka Kafka architecture

More information

CloudExpo November 2017 Tomer Levi

CloudExpo November 2017 Tomer Levi CloudExpo November 2017 Tomer Levi About me Full Stack Engineer @ Intel s Advanced Analytics group. Artificial Intelligence unit at Intel. Responsible for (1) Radical improvement of critical processes

More information

Architecting Microsoft Azure Solutions (proposed exam 535)

Architecting Microsoft Azure Solutions (proposed exam 535) Architecting Microsoft Azure Solutions (proposed exam 535) IMPORTANT: Significant changes are in progress for exam 534 and its content. As a result, we are retiring this exam on December 31, 2017, and

More information

Abstract. The Challenges. ESG Lab Review InterSystems IRIS Data Platform: A Unified, Efficient Data Platform for Fast Business Insight

Abstract. The Challenges. ESG Lab Review InterSystems IRIS Data Platform: A Unified, Efficient Data Platform for Fast Business Insight ESG Lab Review InterSystems Data Platform: A Unified, Efficient Data Platform for Fast Business Insight Date: April 218 Author: Kerry Dolan, Senior IT Validation Analyst Abstract Enterprise Strategy Group

More information

F5 in AWS Part 3 Advanced Topologies and More on Highly Available Services

F5 in AWS Part 3 Advanced Topologies and More on Highly Available Services F5 in AWS Part 3 Advanced Topologies and More on Highly Available Services ChrisMutzel, 2015-17-08 Thus far in our article series about running BIG-IP in EC2, we ve talked about some VPC/EC2 routing and

More information

Kafka Connect the Dots

Kafka Connect the Dots Kafka Connect the Dots Building Oracle Change Data Capture Pipelines With Kafka Mike Donovan CTO Dbvisit Software Mike Donovan Chief Technology Officer, Dbvisit Software Multi-platform DBA, (Oracle, MSSQL..)

More information

Panoptes: A Network Telemetry Ecosystem - Part Deux

Panoptes: A Network Telemetry Ecosystem - Part Deux Panoptes: A Network Telemetry Ecosystem - Part Deux Panoptes is: Greenfield Python based network telemetry platform that provides real time telemetry and analytics @ Yahoo Implements discovery, polling,

More information

Monitoring and Operating a Private Cloud with System Center 2012

Monitoring and Operating a Private Cloud with System Center 2012 Monitoring and Operating a Private Cloud with System Center 2012 Course 10750 - Five days - Instructor-led - Hands-on Introduction This course describes how to monitor and operate a private cloud with

More information

Pasiruoškite ateičiai: modernus duomenų centras. Laurynas Dovydaitis Microsoft Azure MVP

Pasiruoškite ateičiai: modernus duomenų centras. Laurynas Dovydaitis Microsoft Azure MVP Pasiruoškite ateičiai: modernus duomenų centras Laurynas Dovydaitis Microsoft Azure MVP 2016-05-17 Tension drives change The datacenter today Traditional datacenter Tight coupling between infrastructure

More information

Stanislav Harvan Internet of Things

Stanislav Harvan Internet of Things Stanislav Harvan v-sharva@microsoft.com Internet of Things IoT v číslach Gartner: V roku 2020 bude na Internet pripojených viac ako 25mld zariadení: 1,5mld smart TV 2,5mld pc 5mld smart phone 16mld dedicated

More information

Migrating a critical high-performance platform to Azure with zero downtime

Migrating a critical high-performance platform to Azure with zero downtime Microsoft IT Showcase Migrating a critical high-performance platform to Azure with zero downtime At Microsoft IT, our strategy is to move workloads from on-premises datacenters to Azure. So, when the Microsoft

More information

MarkLogic Technology Briefing

MarkLogic Technology Briefing MarkLogic Technology Briefing Edd Patterson CTO/VP Systems Engineering, Americas Slide 1 Agenda Introductions About MarkLogic MarkLogic Server Deep Dive Slide 2 MarkLogic Overview Company Highlights Headquartered

More information

Qualys Cloud Platform

Qualys Cloud Platform 18 QUALYS SECURITY CONFERENCE 2018 Qualys Cloud Platform Looking Under the Hood: What Makes Our Cloud Platform so Scalable and Powerful Dilip Bachwani Vice President, Engineering, Qualys, Inc. Cloud Platform

More information

BIG DATA COURSE CONTENT

BIG DATA COURSE CONTENT BIG DATA COURSE CONTENT [I] Get Started with Big Data Microsoft Professional Orientation: Big Data Duration: 12 hrs Course Content: Introduction Course Introduction Data Fundamentals Introduction to Data

More information

Contact Center Assurance Dashboards

Contact Center Assurance Dashboards The Cisco Prime Collaboration Contact Center Assurance performance dashboards help you to monitor your network by providing near real-time information about the Contact Center components such as Cisco

More information

Deploying Applications on DC/OS

Deploying Applications on DC/OS Mesosphere Datacenter Operating System Deploying Applications on DC/OS Keith McClellan - Technical Lead, Federal Programs keith.mcclellan@mesosphere.com V6 THE FUTURE IS ALREADY HERE IT S JUST NOT EVENLY

More information

Agenda. AWS Database Services Traditional vs AWS Data services model Amazon RDS Redshift DynamoDB ElastiCache

Agenda. AWS Database Services Traditional vs AWS Data services model Amazon RDS Redshift DynamoDB ElastiCache Databases on AWS 2017 Amazon Web Services, Inc. and its affiliates. All rights served. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon Web Services,

More information

Hortonworks DataFlow Sam Lachterman Solutions Engineer

Hortonworks DataFlow Sam Lachterman Solutions Engineer Hortonworks DataFlow Sam Lachterman Solutions Engineer 1 Hortonworks Inc. 2011 2017. All Rights Reserved Disclaimer This document may contain product features and technology directions that are under development,

More information

Overview SENTINET 3.1

Overview SENTINET 3.1 Overview SENTINET 3.1 Overview 1 Contents Introduction... 2 Customer Benefits... 3 Development and Test... 3 Production and Operations... 4 Architecture... 5 Technology Stack... 7 Features Summary... 7

More information

Best practices for OpenShift high-availability deployment field experience

Best practices for OpenShift high-availability deployment field experience Best practices for OpenShift high-availability deployment field experience Orgad Kimchi, Red Hat, Christina Wong, NuoDB, Ariff Kassam, NuoDB Traditional disaster recovery means maintaining underutilized

More information

Diplomado Certificación

Diplomado Certificación Diplomado Certificación Duración: 250 horas. Horario: Sabatino de 8:00 a 15:00 horas. Incluye: 1. Curso presencial de 250 horas. 2.- Material oficial de Oracle University (e-kit s) de los siguientes cursos:

More information

Contact Center Assurance Dashboards

Contact Center Assurance Dashboards The Prime Collaboration Contact Center Assurance performance dashboards help you to monitor your network by providing near real-time information about the Contact Center components such as CUIC, Finesse,

More information

JStorm Based Network Analytics Platform. Alibaba Cloud Senior Technical Manager, Biao Lyu

JStorm Based Network Analytics Platform. Alibaba Cloud Senior Technical Manager, Biao Lyu JStorm Based Network Analytics Platform Alibaba Cloud Senior Technical Manager, Biao Lyu Overview of Alibaba Cloud 18 Regions 150+ Products 1Million+ Customers Comprehensive Networking Product Family 12

More information

OLAP Introduction and Overview

OLAP Introduction and Overview 1 CHAPTER 1 OLAP Introduction and Overview What Is OLAP? 1 Data Storage and Access 1 Benefits of OLAP 2 What Is a Cube? 2 Understanding the Cube Structure 3 What Is SAS OLAP Server? 3 About Cube Metadata

More information

Over the last few years, we have seen a disruption in the data management

Over the last few years, we have seen a disruption in the data management JAYANT SHEKHAR AND AMANDEEP KHURANA Jayant is Principal Solutions Architect at Cloudera working with various large and small companies in various Verticals on their big data and data science use cases,

More information

Data 101 Which DB, When. Joe Yong Azure SQL Data Warehouse, Program Management Microsoft Corp.

Data 101 Which DB, When. Joe Yong Azure SQL Data Warehouse, Program Management Microsoft Corp. Data 101 Which DB, When Joe Yong (joeyong@microsoft.com) Azure SQL Data Warehouse, Program Management Microsoft Corp. The world is changing AI increased by 300% in 2017 Data will grow to 44 ZB in 2020

More information

Performance and Scalability with Griddable.io

Performance and Scalability with Griddable.io Performance and Scalability with Griddable.io Executive summary Griddable.io is an industry-leading timeline-consistent synchronized data integration grid across a range of source and target data systems.

More information

Barry D. Lamkin Executive IT Specialist Capitalware's MQ Technical Conference v

Barry D. Lamkin Executive IT Specialist Capitalware's MQ Technical Conference v What happened to my Transaction? Barry D. Lamkin Executive IT Specialist blamkin@us.ibm.com Transaction Tracking - APM Transaction Tracking is a major part of Application Performance Monitoring To ensure

More information

Technologies for the future of Network Insight and Automation

Technologies for the future of Network Insight and Automation Technologies for the future of Network Insight and Automation Richard Wade (ricwade@cisco.com) Technical Leader, Asia-Pacific Infrastructure Programmability This Session s Context Service Creation Service

More information

Azure File Sync. Webinaari

Azure File Sync. Webinaari Azure File Sync Webinaari 12.3.2018 Agenda Why use Azure? Moving to the Cloud Azure Storage Backup and Recovery Azure File Sync Demo Q&A What is Azure? A collection of cloud services from Microsoft that

More information