Event Streams using Apache Kafka

Event Streams using Apache Kafka
And how it relates to IBM MQ

Andrew Schofield
Chief Architect, Event Streams
STSM, IBM Messaging, Hursley Park

Event-driven systems deliver more engaging customer experiences
- A phone company has existing data about customers' usage and buying preferences
- This is combined with events generated when phones connect to in-store wi-fi
- The combination enables a more engaging and personal customer experience

But being event-driven is different
- Event-driven solutions require different thinking
- Events form the nervous system of the digital business
- Application infrastructure needs to provide event stream processing capabilities and support emerging event-driven programming models
- This is an event-driven journey and will underpin the next generation of digital customer experiences
Source: Gartner - From data-centric to event-centric IT priority

How does this differ from messaging?

Message Queuing
- Transient Data Persistence
- Request / Reply
- Targeted Reliable Delivery

Event Streaming
- Stream History
- Scalable Consumption
- Immutable Data

Apache Kafka is an Open-Source Streaming Platform

Use cases
- Pub/sub messaging
- Website activity tracking
- Metrics
- Log aggregation
- Stream processing
- Event sourcing
- Commit log

[Diagram: Producers write to the Kafka Cluster and Consumers read from it. Source Connectors bring data in from External Systems, Sink Connectors send data out to External Systems, and Stream Processors work on the data in the cluster.]

Kafka is built for scale
[Diagram sequence: a topic starts with one partition, Partition 0, holding records at offsets 0 to 6, then 0 to 7 as a record is appended. The topic then has Partition 1 (offsets 0 to 3) and Partition 2 (offsets 0 to 5) as well. The cluster is shown first with Broker 1 to Broker 3, then with Broker 1 to Broker 5.]

Kafka is built for availability
[Diagram: Broker 1 to Broker 4, with three partition replicas marked Leader.]

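Topics are partitioned for scale
[Diagram: a topic with Partition 0 (offsets 0 to 7), Partition 1 (offsets 0 to 3) and Partition 2 (offsets 0 to 5). Writes are appended at the new end of each partition; offsets run from old to new.]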
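As a rough illustration of how records with the same key end up in the same partition, here is a minimal sketch. It is not Kafka's implementation: the Java client's default partitioner uses a murmur2 hash, while this sketch substitutes `zlib.crc32` to stay within the standard library. The function names and the sample key are hypothetical.

```python
import zlib


def choose_partition(key, num_partitions):
    """Map a record key to a partition; the same key always maps to the same partition."""
    # Stand-in hash (assumption): CRC32 instead of the murmur2 hash Kafka's Java client uses.
    return zlib.crc32(key) % num_partitions


def append(topic, key, value):
    """Append a value to the partition chosen for the key; return (partition, offset)."""
    partition = choose_partition(key, len(topic))
    topic[partition].append(value)
    return partition, len(topic[partition]) - 1


# A topic modelled as three independent, append-only lists.
topic = [[] for _ in range(3)]
first = append(topic, b"customer-42", b"connected to wi-fi")
second = append(topic, b"customer-42", b"opened app")
```

Because both records share a key, they land in the same partition at consecutive offsets, which is what preserves per-key ordering while other keys spread across the remaining partitions.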

Each partition is an immutable sequence of records
[Diagram: a Producer writes to the end of Partition 0 (offsets 0 to 10). Consumer A reads at offset 6 and Consumer B reads at offset 9.]
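The idea that a partition is append-only, and that each consumer simply tracks its own read position, can be sketched as follows. This is a toy in-memory model with hypothetical class and function names, not the Kafka client API; the offsets 6 and 9 are taken from the slide.

```python
class PartitionLog:
    """Toy append-only log: records are never modified, only appended and read by offset."""

    def __init__(self):
        self._records = []

    def append(self, value):
        self._records.append(value)
        return len(self._records) - 1  # offset assigned to the new record

    def read(self, offset, max_records=1):
        return self._records[offset:offset + max_records]


log = PartitionLog()
for i in range(11):  # offsets 0 to 10, as on the slide
    log.append(f"record-{i}")

# Each consumer keeps its own position; reading does not remove anything from the log.
positions = {"consumer-a": 6, "consumer-b": 9}


def poll(consumer, max_records=2):
    records = log.read(positions[consumer], max_records)
    positions[consumer] += len(records)
    return records
```

Consumer A and Consumer B advance independently, and a slower consumer never holds back a faster one because the log itself is unchanged by reads.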

Consumer groups share the partitions
[Diagram: a topic with Partition 0 (offsets 0 to 7), Partition 1 (offsets 0 to 3) and Partition 2 (offsets 0 to 5), read by two consumer groups.
Consumer Group A: p0 at offset 7, p1 at offset 3, p2 at offset 4.
Consumer Group B: p0 at offset 6, p1 at offset 1, p2 at offset 5.]
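A minimal sketch of how the partitions of a topic might be divided among the members of a group. It uses a simplified round-robin scheme for illustration only; it is not one of Kafka's actual assignors, and the consumer names are hypothetical.

```python
def assign(partitions, consumers):
    """Spread partitions over the consumers of one group; each partition gets exactly one owner."""
    ordered = sorted(consumers)
    assignment = {consumer: [] for consumer in ordered}
    for index, partition in enumerate(sorted(partitions)):
        assignment[ordered[index % len(ordered)]].append(partition)
    return assignment


# Each group is assigned independently, so every group sees every partition.
group_three = assign([0, 1, 2], ["c1", "c2", "c3"])
group_two = assign([0, 1, 2], ["c1", "c2"])
group_four = assign([0, 1, 2], ["c1", "c2", "c3", "c4"])
```

With more consumers than partitions, the extra consumer receives nothing, which is why the partition count sets the upper bound on parallelism within one group.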

Reliability
- Producer can choose acknowledgement level
  - 0: Fire-and-forget, fast but risky
  - 1: Waits for 1 broker to acknowledge
  - All: Waits for all in-sync replica brokers to acknowledge
- Producer can choose whether to retry
  - 0: Do not retry, lose messages on error
  - >0: Retry, might result in duplicates on error
- Consumer can choose how to commit offsets
  - Automatic: Commits might go faster than processing
  - Manual, asynchronous: Fairly safe, but could re-process messages
  - Manual, synchronous: Safe, but slows down processing
  - A common pattern is to commit offsets on a timer
- Producer can also choose idempotence
  - Means that retries can be used without risk of duplicates
- Exactly-once semantics
  - Can group sending messages and committing offsets into transactions
  - Primarily aimed at stream-processing applications
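To make the retry and idempotence trade-off concrete, the sketch below simulates a send whose acknowledgement is lost, so the producer retries. It is a deliberately simplified model with hypothetical names: real Kafka tracks sequence numbers per producer and per partition, together with producer epochs, rather than the single counter assumed here.

```python
class Broker:
    """Toy broker that can optionally drop retried records it has already written."""

    def __init__(self, idempotent):
        self.idempotent = idempotent
        self.log = []
        self._last_seq = {}  # highest sequence number accepted per producer (assumption)

    def produce(self, producer_id, seq, value):
        if self.idempotent and self._last_seq.get(producer_id, -1) >= seq:
            return "duplicate-ignored"
        self.log.append(value)
        self._last_seq[producer_id] = seq
        return "appended"


def send_with_lost_ack(broker, value):
    """The first attempt is written but its acknowledgement is lost, so the producer retries."""
    broker.produce("producer-1", 0, value)         # written, ack never arrives
    return broker.produce("producer-1", 0, value)  # retry with the same sequence number


plain = Broker(idempotent=False)
plain_result = send_with_lost_ack(plain, "order-created")

safe = Broker(idempotent=True)
safe_result = send_with_lost_ack(safe, "order-created")
```

Without idempotence the retry leaves two copies in the log; with it, the broker recognises the repeated sequence number and keeps one.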

Compacted topics are evolving data stores
[Diagram: a Producer reads a source change log and writes keyed records to Partition 0, from old to new.]
  offset 0   key:a   val:a1
  offset 1   key:b   val:b1
  offset 2   key:a   val:a2
  offset 3   key:c   val:c1
  offset 4   key:c   val:c2
  offset 5   key:b   val: (empty)
Consumer takes latest values and builds own data store:
  Key   Value
  a     A2
  b     <deleted>
  c     C2

Periodic compaction eliminates duplicate keys to minimize storage
Partition 0 before compaction:
  offset 0   key:a   val:a1
  offset 1   key:b   val:b1
  offset 2   key:a   val:a2
  offset 3   key:c   val:c1
  offset 4   key:c   val:c2
  offset 5   key:b   val: (empty)
Partition 0 (rewritten):
  offset 2   key:a   val:a2
  offset 4   key:c   val:c2
  offset 5   key:b   val: (empty)
Resulting data store:
  Key   Value
  a     A2
  b     <deleted>
  c     C2
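The compaction rule can be sketched in a few lines using the records from the slide. This is an illustrative model only: the function names are hypothetical, an empty value is represented as `None`, and the real broker compacts log segments in the background rather than rewriting a list.

```python
def compact(records):
    """Keep only the newest record for each key; offsets of surviving records are unchanged."""
    latest_offset = {}
    for offset, key, _value in records:
        latest_offset[key] = offset
    return [record for record in records if latest_offset[record[1]] == record[0]]


def materialize(records):
    """Build the consumer's key/value store; a None value deletes the key."""
    store = {}
    for _offset, key, value in records:
        if value is None:
            store.pop(key, None)
        else:
            store[key] = value
    return store


# (offset, key, value) tuples matching the slide; None stands for the empty value.
partition0 = [
    (0, "a", "a1"),
    (1, "b", "b1"),
    (2, "a", "a2"),
    (3, "c", "c1"),
    (4, "c", "c2"),
    (5, "b", None),
]

rewritten = compact(partition0)
```

A consumer that reads the compacted partition builds the same store as one that read the full history, which is the property that lets a compacted topic act as a data store.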

Kafka Connect
- Over 80 connectors
- HDFS, Elasticsearch, MySQL, JDBC, IBM MQ, MQTT, CoAP + many others
- https://www.confluent.io/product/connectors/
[Diagram: a Source path from an External System into Kafka and a Sink path from Kafka to an External System, each made up of a Connector, an optional Transform and a Converter.]

It's easy to connect IBM MQ to Apache Kafka
IBM has created two open-source connectors available on GitHub
- Source Connector: from MQ queue to Kafka topic
  https://github.com/ibm-messaging/kafka-connect-mq-source
- Sink Connector: from Kafka topic to MQ queue
  https://github.com/ibm-messaging/kafka-connect-mq-sink
Detailed instructions for running them:
https://github.com/ibm-messaging/kafka-connect-mq-source/blob/master/usingmqwithkafkaconnect.md

Kafka Connect source connector for IBM MQ
- Open-source: build it yourself
- Use any supported MQ release
- Uses JMS client internally
- Client connections
- Supports TLS, authentication
- Support for binary, text, JSON
- Easy to extend
- https://github.com/ibm-messaging/kafka-connect-mq-source
[Diagram: MQ queue -> Kafka topic. Queue TO.KAFKA.Q feeds the MQ Source connector (RecordBuilder, then Converter), which writes to topic FROM.MQ.TOPIC.
MQ Message (MQMD, (MQRFH2), Payload) -> SourceRecord (Schema, Value (may be complex)) -> Kafka Record (Null key, Value byte[])]

Configuration of MQ Source connector
- Configuration is provided in a properties file
- Sample file provided in GitHub

Required:
  mq.queue.manager          MQ QMgr name
  mq.connection.name.list   MQ client conname
  mq.channel.name           MQ svrconn channel name
  mq.queue                  MQ source queue
  topic                     Kafka target topic

Optional:
  mq.user.name              MQ user name for client
  mq.password               MQ password for client
  mq.message.body.jms       native MQ or JMS
  mq.ssl.cipher.suite       MQ SSL cipher suite
  mq.ssl.peer.name          MQ SSL peer name

Conversion parameters:
  mq.record.builder
  value.converter

Example:
  name=mq-source
  connector.class=com.ibm.mq.kafkaconnect.MQSourceConnector
  tasks.max=1
  mq.queue.manager=myqm
  mq.connection.name.list=localhost:1414
  mq.channel.name=mysvrconn
  mq.queue=TO.KAFKA.Q
  topic=FROM.MQ.TOPIC
  mq.user.name=alice
  mq.password=passw0rd
  mq.record.builder=com.ibm.mq.kafkaconnect.builders.DefaultRecordBuilder
  value.converter=org.apache.kafka.connect.converters.ByteArrayConverter
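Before starting the connector it can help to check that a properties file contains every required key. The sketch below is a hypothetical pre-flight check, not part of the connector or of Kafka Connect; it parses only simple `key=value` lines and uses the required keys listed on the slide.

```python
REQUIRED_KEYS = [
    "mq.queue.manager",
    "mq.connection.name.list",
    "mq.channel.name",
    "mq.queue",
    "topic",
]


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    properties = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        properties[key.strip()] = value.strip()
    return properties


def missing_required(properties):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not properties.get(key)]


sample = """
name=mq-source
tasks.max=1
mq.queue.manager=myqm
mq.connection.name.list=localhost:1414
mq.channel.name=mysvrconn
mq.queue=TO.KAFKA.Q
"""

problems = missing_required(parse_properties(sample))
```

Here the sample deliberately omits `topic`, so the check reports it before the connector would fail at start-up.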

Kafka Connect sink connector for IBM MQ
- Open-source: build it yourself
- Use any supported MQ release
- Uses JMS client internally
- Client connections
- Supports TLS, authentication
- Support for binary, text, JSON
- Easy to extend
- https://github.com/ibm-messaging/kafka-connect-mq-sink
[Diagram: Kafka topic -> MQ queue. Topic TO.MQ.TOPIC feeds the MQ Sink connector (Converter, then MessageBuilder), which writes to queue FROM.KAFKA.Q.
Kafka Record (Key, Value byte[]) -> SinkRecord (Schema, Value (may be complex)) -> MQ Message (MQMD, (MQRFH2), Payload)]

Configuration of MQ Sink connector
- Configuration is provided in a properties file
- Sample file provided in GitHub

Required:
  topics                    Kafka source topic list
  mq.queue.manager          MQ QMgr name
  mq.connection.name.list   MQ client conname
  mq.channel.name           MQ svrconn channel name
  mq.queue                  MQ target queue

Optional:
  mq.user.name              MQ user name for client
  mq.password               MQ password for client
  mq.message.body.jms       native MQ or JMS
  mq.ssl.cipher.suite       MQ SSL cipher suite
  mq.ssl.peer.name          MQ SSL peer name
  mq.time.to.live           MQ message expiration
  mq.persistent             MQ message persistence

Conversion parameters:
  mq.message.builder
  value.converter

Example:
  name=mq-sink
  connector.class=com.ibm.mq.kafkaconnect.MQSinkConnector
  tasks.max=1
  topics=TO.MQ.TOPIC
  mq.queue.manager=myqm
  mq.connection.name.list=localhost:1414
  mq.channel.name=mysvrconn
  mq.queue=FROM.KAFKA.Q
  mq.user.name=alice
  mq.password=passw0rd
  mq.message.builder=com.ibm.mq.kafkaconnect.builders.DefaultMessageBuilder
  value.converter=org.apache.kafka.connect.converters.ByteArrayConverter

DEMO
- Ubuntu 14.04 virtual machine
[Diagram: MQ queue MYQSOURCE -> MQ Source connector (RecordBuilder, then Converter) -> Kafka topic TSOURCE.
MQ Message (MQMD, (MQRFH2), Payload) -> SourceRecord (Schema, Value (may be complex)) -> Kafka Record (Null key, Value byte[])]

Introducing IBM Event Streams
- React to events in real-time to deliver more engaging experiences for your customers
- Deploy production-ready Apache Kafka onto IBM Cloud Private in minutes
- Rely on disaster recovery & security designed for mission-critical use
- Build intelligent apps on Kafka with the confidence IBM is supporting you
- Exploit existing data to become a real Event-Driven Enterprise

IBM Event Streams builds on the open standards of IBM Cloud Private
- Containers: Executable package of software that includes everything needed to run it
- Orchestration: Automate deployment, scaling, and management of containerized applications
- Management: Define, install, and upgrade Kubernetes applications
- Provisioning: Infrastructure as code to provision public cloud and on-premises environments

Benefit from the Core Services of IBM Cloud Private

Platform layers
- Enterprise Content Catalog: Open Source and IBM Middleware, Data, Analytics, and AI Software
- Core Operational Services: Log Management, Monitoring, Metering, Security, Alerting
- Kubernetes Container Orchestration Platform
- Choose your infrastructure: IBM Z

Strategic Value
- Self-service catalog
- Agility, scalability, and elasticity
- Self-healing
- Enterprise security
- No vendor lock-in

Apache Kafka Orchestrated with Kubernetes and Helm
- IBM Event Streams is packaged as a Helm chart
- A 3-node Kafka cluster, plus ZooKeeper, UI, network proxies and so on, is over 20 containers
- Kubernetes and Helm bring this all under control
- Install a Kafka cluster with a few clicks from the IBM Cloud Private catalog
- It comes highly available, secure, and ready for production

High Availability, Scaling and Configuration with Ease
- Highly available by design
  - Brokers are spread across ICP worker nodes using anti-affinity policies
  - Minimizes the risk of down-time in the event of a node outage
- Scale the Kafka cluster up with one command
  - Safely grows the stateful set, reconfigures the network interfaces and gives you more capacity
- Roll out Kafka cluster configuration changes easily
  - Make a single configuration change and Event Streams rolls it out across the brokers in the cluster
  - Broker availability is managed using health checks to ensure that availability is maintained

Safe, Planned Upgrade of Apache Kafka
- Upgrade Kafka versions safely and without hassle
- First, upgrade the Helm chart to a newer version of IBM Event Streams
  - Rolling update of the Kafka brokers minimizes disruption
- As a separate step, upgrade the broker data and protocol version to complete the upgrade
  - Until this point, you can roll back

Making Apache Kafka Intuitive and Easy
- Visualisation of your topic data
- Simple deploy with just 3 clicks
- Integrated feedback and support
- Monitor status at a glance

Security: Authentication and Access Control
- User and group information controlled centrally
  - Integrate with your corporate LDAP through IBM Cloud Private
- Control access to Event Streams resources using role-based access control policies
  - Assign roles to users: Viewer, Editor, Operator, Administrator
  - Optionally, specify access for individual resources, such as topic T
- Application access control using service IDs
  - Same concepts as controlling user access
  - Can restrict application access to exactly the resources they need
  - Prevent accidental or intentional interference between applications

Example policy: permit Bob to write to topic T
  User       bob
  Role       Editor
  Service    Event Streams instance R
  Type       topic
  Resource   T

  Service action   Roles                        Permissions
  topic.read       Viewer and above             Read messages or config
  topic.write      Editor and above             Write messages
  topic.manage     Operator and Administrator   Delete or change config
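The role and action table can be read as a simple ordering of roles with a minimum role per action. The sketch below models that reading with a hypothetical in-memory policy list; it is not the IBM Cloud Private or Event Streams access-control API, and the data layout is an assumption made for illustration.

```python
# Roles ordered from least to most privileged, as listed on the slide.
ROLE_RANK = {"Viewer": 0, "Editor": 1, "Operator": 2, "Administrator": 3}

# Minimum role needed for each service action, taken from the slide's table.
MINIMUM_ROLE = {
    "topic.read": "Viewer",
    "topic.write": "Editor",
    "topic.manage": "Operator",
}

# Hypothetical policy store holding the slide's example: Bob is an Editor on topic T.
policies = [
    {"user": "bob", "role": "Editor", "type": "topic", "resource": "T"},
]


def is_permitted(user, action, resource_type, resource):
    """Return True if any policy gives the user a high enough role on the resource."""
    needed = ROLE_RANK[MINIMUM_ROLE[action]]
    return any(
        policy["user"] == user
        and policy["type"] == resource_type
        and policy["resource"] == resource
        and ROLE_RANK[policy["role"]] >= needed
        for policy in policies
    )
```

Under this model Bob can read and write topic T but cannot delete it or change its configuration, and he has no access to any other topic.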

Enterprise-Grade Reliability
- Integrated geo-replication for disaster recovery

Geo-Replication makes Disaster Recovery simple
- Target is take-over of workload on the destination cluster by business applications within 15 minutes
- Easy configuration using the Event Streams UI from the origin cluster sets up the replicator and security credentials
- At-least-once reliability so messages are not lost
[Diagram: origin topics on the origin cluster are replicated to destination topics on the destination cluster.]

Connects to existing MQ backbone
- Kafka Connect source connector for IBM MQ
- Fully supported by IBM for customers with support entitlement for IBM Event Streams

Ready for Mission-Critical Workloads
- All with IBM 24x7 worldwide support
- IBM has years of experience running Apache Kafka across the globe