Tungsten Replicator for Kafka, Elasticsearch, Cassandra

1 Tungsten Replicator for Kafka, Elasticsearch, Cassandra

2 Topics
In today's session:
- Replicator Basics
- Filtering and Glue
- Kafka and Options
- Elasticsearch and Options
- Cassandra
- Future Direction

3 Asynchronous replication decouples transaction processing on master and slave DBMS nodes
[Diagram: MySQL/Oracle DBMS-specific logging (i.e. redo or binary) to DBMS logs, to Master Replicator: Extractor, to THL; transactions are downloaded via the network to Slave Replicator: Applier (THL), which applies them using JDBC to MySQL/Oracle. THL = Events + Metadata.]
Extractor Options:
- Option 1: Local Install. Extractor reads directly from the logs, even when the DBMS service is down. This is the default.
- Option 2: Remote. Extractor gets log data via MySQL Replication Slave protocols (which requires the DBMS service to be online) or the Redo Reader feature. This is how we handle RDS and Oracle extraction tasks.

4 Parallel apply maximizes DBMS I/O bandwidth when updating replicas
[Diagram: the master replicator THL (events + metadata) feeds the slave replicator pipeline, which has three stages: remote-to-thl, thl-to-q, and q-to-dbms. Each stage consists of Extract, Filter, and Apply steps; a parallel queue feeds multiple Extract/Filter/Apply channels that apply to the slave.]
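
A minimal sketch of how a parallel queue might hand events to independent apply channels. It assumes events are sharded by schema name so that changes to one schema stay in order on one channel; the slide does not state the sharding rule, and `assign_channel`, `partition_events`, and the event fields are hypothetical names used only for illustration.

```python
import zlib
from collections import defaultdict


def assign_channel(shard_id, channels):
    """Map a shard id (assumed here: the schema name) to a channel number.

    crc32 is used because it is stable across runs, unlike Python's hash().
    """
    return zlib.crc32(shard_id.encode("utf-8")) % channels


def partition_events(events, channels):
    """Split an ordered event stream into per-channel queues.

    Events that share a schema land on the same channel in their original
    order, so each channel can apply its queue independently.
    """
    queues = defaultdict(list)
    for event in events:
        queues[assign_channel(event["schema"], channels)].append(event)
    return dict(queues)


# Hypothetical events: only a sequence number and a schema name.
events = [
    {"seqno": 1, "schema": "sales"},
    {"seqno": 2, "schema": "hr"},
    {"seqno": 3, "schema": "sales"},
    {"seqno": 4, "schema": "hr"},
]
queues = partition_events(events, channels=4)
```

Each entry in `queues` could then be drained by its own worker, which is where the extra I/O bandwidth would come from.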

5 Why Kafka
- Kafka is a high-performance message bus, NOT a database
- Great for distributing messages and firing/triggering operations on content
- Log aggregation
- Activity/security tracking
- Metrics
- Auditing
- Data ingestion for Hadoop

6 Mass Data Collection with Kafka
[Diagram labels: Tungsten Replicator, Kafka]

7 Multiple Target Distribution
[Diagram labels: Database, Tungsten Replicator, Kafka, Image Process, Metrics]

8 How Kafka Replication Works
[Diagram: MySQL/Oracle DBMS-specific logging (i.e. redo or binary) to DBMS logs, to Master Replicator: Extractor, to THL; transactions are downloaded via the network to Slave Replicator: Applier (THL), which uses the Kafka Applier (native). Zookeeper also appears in the diagram. THL = Events + Metadata.]

9 What Tungsten Replicator Does to Apply into Kafka
- Takes an incoming row and converts it to a message
- Message consists of metadata:
  - Schema name, table name
  - Sequence number
  - Commit timestamp
  - Operation type
- Embedded message content

10 Message Structure
[Diagram: the rows of a table in a schema are sent as individual messages.]
- Topic: Schema_Table
- MsgID: Schema Table PKey
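
The naming scheme on this slide can be sketched as two small helpers. The topic format `Schema_Table` comes from the slide; the separator used inside the message ID is an assumption (the slide only lists the components), and the function names are hypothetical.

```python
def topic_name(schema, table):
    """Build the topic name in the Schema_Table form shown on the slide."""
    return "{}_{}".format(schema, table)


def message_id(schema, table, pkey, sep="."):
    """Build a message ID from schema, table, and primary key.

    The slide lists only the three components; the separator is an
    assumption made for this sketch.
    """
    return sep.join([schema, table, str(pkey)])


# Values borrowed from the sample message in this deck.
topic = topic_name("sbtest", "sbtest")
msg_id = message_id("sbtest", "sbtest", 255759)
```

Including the primary key in the ID means every change to the same row carries the same key.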

11 Sample Message
{
  "_meta_committime" : " :27:18.0",
  "_meta_source_schema" : "sbtest",
  "_meta_seqno" : "10130",
  "_meta_source_table" : "sbtest",
  "_meta_optype" : "INSERT",
  "record" : {
    "c" : "Base Msg",
    "k" : "100",
    "id" : "255759",
    "pad" : "Some other submsg"
  }
}
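
A minimal sketch of assembling a message with the same shape as the sample. The field names are taken from the sample; `build_message` is a hypothetical helper, not the replicator's own code, and the commit timestamp passed in below is a made-up placeholder because the date in the sample is truncated.

```python
import json


def build_message(schema, table, seqno, commit_time, optype, row):
    """Wrap one row change in the metadata envelope shown in the sample.

    All values are emitted as strings, matching the sample message.
    """
    return {
        "_meta_committime": commit_time,
        "_meta_source_schema": schema,
        "_meta_seqno": str(seqno),
        "_meta_source_table": table,
        "_meta_optype": optype,
        "record": {column: str(value) for column, value in row.items()},
    }


message = build_message(
    schema="sbtest",
    table="sbtest",
    seqno=10130,
    commit_time="2000-01-01 00:27:18.0",  # placeholder, not the real value
    optype="INSERT",
    row={"c": "Base Msg", "k": 100, "id": 255759, "pad": "Some other submsg"},
)
payload = json.dumps(message)
```

A consumer can branch on `_meta_optype` and read the column values from `record`.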

12 Customizable Elements
- Whether acknowledgements are required from Kafka
- How much distribution/replication is required before sending the message
- Format of the message key
- Whether to embed schema and table name
- Whether the commit timestamp should be embedded

13 Demo

14 Elasticsearch
- Immediately replicate data into Elasticsearch for searching
- Contains the core text and content of the records
- Provides the original information to track back to the record
- Content is structured by schema (index type) and table name (index)
- Document ID is based on the pkey and other information, which is configurable

15 How Elasticsearch Replication Works
[Diagram: redo logging to DBMS logs, to the Redo Reader (generated PLOG), to Master Replicator: Extractor, to THL; transactions are downloaded via the network to Slave Replicator: Applier (THL), which uses the Elasticsearch Applier (REST API). THL = Events + Metadata.]

16 Sample Entry
{
  "_id" : "99999",
  "_type" : "mg",
  "found" : true,
  "_version" : 2,
  "_index" : "msg",
  "_source" : {
    "msg" : "Hello ElasticSearch",
    "id" : "99999"
  }
}
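
A sketch of how a row might map onto an Elasticsearch REST request, following the deck's description: table name as the index, schema name as the type, and primary key as the document ID. It assumes the typed `/{index}/{type}/{id}` URL layout of older Elasticsearch versions and uses the plain primary key as the ID; `document_request` is a hypothetical helper and no request is actually sent.

```python
import json
from urllib.parse import quote


def document_request(schema, table, pkey, row):
    """Return (method, path, body) for indexing one row as a document.

    Mapping assumed from the slides: index = table, type = schema,
    document ID = primary key.
    """
    path = "/{}/{}/{}".format(
        quote(table, safe=""),
        quote(schema, safe=""),
        quote(str(pkey), safe=""),
    )
    body = json.dumps(row, sort_keys=True)
    return "PUT", path, body


# Values borrowed from the sample entry above.
method, path, body = document_request(
    schema="mg",
    table="msg",
    pkey="99999",
    row={"msg": "Hello ElasticSearch", "id": "99999"},
)
```

Because the ID is derived from the primary key, a later update to the same row overwrites the same document and increments its `_version`.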

17 Replicating into Cassandra

18 Demo

19 Cassandra
- Great for fast online and CRM style deployments
- Highly fault tolerant and scalable
- Has some data and formatting changes
- Currently needs our DDL translation tool (soon built-in)
- Quasi table/document style

20 How Cassandra Replication Works
[Diagram labels: Master Replicator, Slave Replicator, JS, CSV, Ruby Connector, Cassandra, staging, merge, base]

21 Demo

22 Future Direction for these appliers and related technology
- Full transaction support for Kafka
- Support for Amazon Elasticsearch
- Kafka Extraction
- Parsing contents of Kafka message queues
- Database updates
- Large scale distribution of database changes
- Filtering and re-submission

23 General Tungsten Replicator Functionality
- Expanding the standard filter technology
  - Data translation (dates, numbers, hex)
  - Basic lookup/combination to aid ETL style deployments
  - Data munging/obfuscation (PII, credit cards) for analytics
- More appliers
  - InfluxDB
  - SQL Server
  - PostgreSQL
  - Hadoop JDBC
  - MemSQL
  - Amazon (Aurora, Elasticsearch)
  - CouchDB/Base
- THL Compression/Encryption

24 Next Steps
- If you are interested in knowing more about Tungsten Replicator and would like to try it out for yourself, please contact our sales team, who will be able to take you through the details and set up a POC: sales@continuent.com
- Read the documentation at
- Subscribe to our Tungsten University YouTube channel!

25 For more information, contact us:
MC Brown, VP Products


More information

Event Streams using Apache Kafka

Event Streams using Apache Kafka Event Streams using Apache Kafka And how it relates to IBM MQ Andrew Schofield Chief Architect, Event Streams STSM, IBM Messaging, Hursley Park Event-driven systems deliver more engaging customer experiences

More information

Apache Ignite and Apache Spark Where Fast Data Meets the IoT

Apache Ignite and Apache Spark Where Fast Data Meets the IoT Apache Ignite and Apache Spark Where Fast Data Meets the IoT Denis Magda GridGain Product Manager Apache Ignite PMC http://ignite.apache.org #apacheignite #denismagda Agenda IoT Demands to Software IoT

More information

Evolution of an Apache Spark Architecture for Processing Game Data

Evolution of an Apache Spark Architecture for Processing Game Data Evolution of an Apache Spark Architecture for Processing Game Data Nick Afshartous WB Analytics Platform May 17 th 2017 May 17 th, 2017 About Me nafshartous@wbgames.com WB Analytics Core Platform Lead

More information

WHY AND HOW TO LEVERAGE THE POWER AND SIMPLICITY OF SQL ON APACHE FLINK - FABIAN HUESKE, SOFTWARE ENGINEER

WHY AND HOW TO LEVERAGE THE POWER AND SIMPLICITY OF SQL ON APACHE FLINK - FABIAN HUESKE, SOFTWARE ENGINEER WHY AND HOW TO LEVERAGE THE POWER AND SIMPLICITY OF SQL ON APACHE FLINK - FABIAN HUESKE, SOFTWARE ENGINEER ABOUT ME Apache Flink PMC member & ASF member Contributing since day 1 at TU Berlin Focusing on

More information

FUJITSU Software ServerView Cloud Monitoring Manager V1.1. Release Notes

FUJITSU Software ServerView Cloud Monitoring Manager V1.1. Release Notes FUJITSU Software ServerView Cloud Monitoring Manager V1.1 Release Notes J2UL-2170-01ENZ0(00) July 2016 Contents Contents About this Manual... 4 1 What's New?...6 1.1 Performance Improvements... 6 1.2

More information

Creating a Recommender System. An Elasticsearch & Apache Spark approach

Creating a Recommender System. An Elasticsearch & Apache Spark approach Creating a Recommender System An Elasticsearch & Apache Spark approach My Profile SKILLS Álvaro Santos Andrés Big Data & Analytics Solution Architect in Ericsson with more than 12 years of experience focused

More information

1 Big Data Hadoop. 1. Introduction About this Course About Big Data Course Logistics Introductions

1 Big Data Hadoop. 1. Introduction About this Course About Big Data Course Logistics Introductions Big Data Hadoop Architect Online Training (Big Data Hadoop + Apache Spark & Scala+ MongoDB Developer And Administrator + Apache Cassandra + Impala Training + Apache Kafka + Apache Storm) 1 Big Data Hadoop

More information

Copyright 2013, Oracle and/or its affiliates. All rights reserved.

Copyright 2013, Oracle and/or its affiliates. All rights reserved. 1 Oracle NoSQL Database: Release 3.0 What s new and why you care Dave Segleau NoSQL Product Manager The following is intended to outline our general product direction. It is intended for information purposes

More information

sqoop Easy, parallel database import/export Aaron Kimball Cloudera Inc. June 8, 2010

sqoop Easy, parallel database import/export Aaron Kimball Cloudera Inc. June 8, 2010 sqoop Easy, parallel database import/export Aaron Kimball Cloudera Inc. June 8, 2010 Your database Holds a lot of really valuable data! Many structured tables of several hundred GB Provides fast access

More information

New Data Architectures For Netflow Analytics NANOG 74. Fangjin Yang - Imply

New Data Architectures For Netflow Analytics NANOG 74. Fangjin Yang - Imply New Data Architectures For Netflow Analytics NANOG 74 Fangjin Yang - Cofounder @ Imply The Problem Comparing technologies Overview Operational analytic databases Try this at home The Problem Netflow data

More information

Highly Available Database Architectures in AWS. Santa Clara, California April 23th 25th, 2018 Mike Benshoof, Technical Account Manager, Percona

Highly Available Database Architectures in AWS. Santa Clara, California April 23th 25th, 2018 Mike Benshoof, Technical Account Manager, Percona Highly Available Database Architectures in AWS Santa Clara, California April 23th 25th, 2018 Mike Benshoof, Technical Account Manager, Percona Hello, Percona Live Attendees! What this talk is meant to

More information

Oracle 1Z0-591 Exam Questions and Answers (PDF) Oracle 1Z0-591 Exam Questions 1Z0-591 BrainDumps

Oracle 1Z0-591 Exam Questions and Answers (PDF) Oracle 1Z0-591 Exam Questions 1Z0-591 BrainDumps Oracle 1Z0-591 Dumps with Valid 1Z0-591 Exam Questions PDF [2018] The Oracle 1Z0-591 Oracle Business Intelligence Foundation Suite 11g Essentials exam is an ultimate source for professionals to retain

More information

Let the data flow! Data Streaming & Messaging with Apache Kafka Frank Pientka. Materna GmbH

Let the data flow! Data Streaming & Messaging with Apache Kafka Frank Pientka. Materna GmbH Let the data flow! Data Streaming & Messaging with Apache Kafka Frank Pientka Wer ist Frank Pientka? Dipl.-Informatiker (TH Karlsruhe) Verheiratet, 2 Töchter Principal Software Architect in Dortmund Fast

More information

Ingesting Streaming Data for Analysis in Apache Ignite. Pat Patterson

Ingesting Streaming Data for Analysis in Apache Ignite. Pat Patterson Ingesting Streaming Data for Analysis in Apache Ignite Pat Patterson StreamSets pat@streamsets.com @metadaddy Agenda Product Support Use Case Continuous Queries in Apache Ignite Integrating StreamSets

More information

Personalizing Netflix with Streaming datasets

Personalizing Netflix with Streaming datasets Personalizing Netflix with Streaming datasets Shriya Arora Senior Data Engineer Personalization Analytics @shriyarora What is this talk about? Helping you decide if a streaming pipeline fits your ETL problem

More information

Mega-scale Postgres How to run 1,000,000 Postgres Databases

Mega-scale Postgres How to run 1,000,000 Postgres Databases Mega-scale Postgres How to run 1,000,000 Postgres Databases Program What is Heroku & Heroku Postgres? Organizing principles for mega-scale operations Heroku Postgres Code deployment is good, but what

More information

Architecture and Design of MySQL Powered Applications. Peter Zaitsev CEO, Percona Highload Moscow, Russia 31 Oct 2014

Architecture and Design of MySQL Powered Applications. Peter Zaitsev CEO, Percona Highload Moscow, Russia 31 Oct 2014 Architecture and Design of MySQL Powered Applications Peter Zaitsev CEO, Percona Highload++ 2014 Moscow, Russia 31 Oct 2014 About Percona 2 Open Source Software for MySQL Ecosystem Percona Server Percona

More information

It also performs many parallelization operations like, data loading and query processing.

It also performs many parallelization operations like, data loading and query processing. Introduction to Parallel Databases Companies need to handle huge amount of data with high data transfer rate. The client server and centralized system is not much efficient. The need to improve the efficiency

More information

Cassandra- A Distributed Database

Cassandra- A Distributed Database Cassandra- A Distributed Database Tulika Gupta Department of Information Technology Poornima Institute of Engineering and Technology Jaipur, Rajasthan, India Abstract- A relational database is a traditional

More information

DataSunrise Database Security Suite Release Notes

DataSunrise Database Security Suite Release Notes www.datasunrise.com DataSunrise Database Security Suite 4.0.4 Release Notes Contents DataSunrise Database Security Suite 4.0.4... 3 New features...3 Known limitations... 3 Version history... 5 DataSunrise

More information