Apache Ignite TM - In- Memory Data Fabric Fast Data Meets Open Source

Similar documents
Accelerate MySQL for Demanding OLAP and OLTP Use Cases with Apache Ignite. Peter Zaitsev, Denis Magda Santa Clara, California April 25th, 2017

Agenda. Apache Ignite Project Apache Ignite Data Fabric: Data Grid HPC & Compute Streaming & CEP Hadoop & Spark Integration Use Cases Demo Q & A

Apache Ignite and Apache Spark Where Fast Data Meets the IoT

GridGain and Apache Ignite In-Memory Performance with Durability of Disk

In-Memory Performance Durability of Disk GridGain Systems, Inc.

In-Memory Computing Essentials

2017 GridGain Systems, Inc. In-Memory Performance Durability of Disk

Accelerate MySQL for Demanding OLAP and OLTP Use Case with Apache Ignite December 7, 2016

Overview. Prerequisites. Course Outline. Course Outline :: Apache Spark Development::

Processing of big data with Apache Spark

IN-MEMORY DATA FABRIC: Real-Time Streaming

2017 GridGain Systems, Inc. In-Memory Performance Durability of Disk

Blended Learning Outline: Developer Training for Apache Spark and Hadoop (180404a)

Big Data Hadoop Developer Course Content. Big Data Hadoop Developer - The Complete Course Course Duration: 45 Hours

Take Telecom to the Next Level with In- Memory Computing

Getting Started with Apache Ignite as a Distributed Database

Hadoop 2.x Core: YARN, Tez, and Spark. Hortonworks Inc All Rights Reserved

Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context

Spark Overview. Professor Sasu Tarkoma.

Innovatus Technologies

CSE 444: Database Internals. Lecture 23 Spark

Esper EQC. Horizontal Scale-Out for Complex Event Processing

利用 Mesos 打造高延展性 Container 環境. Frank, Microsoft MTC

Microsoft Big Data and Hadoop

CERTIFICATE IN SOFTWARE DEVELOPMENT LIFE CYCLE IN BIG DATA AND BUSINESS INTELLIGENCE (SDLC-BD & BI)

Spark, Shark and Spark Streaming Introduction

Chapter 4: Apache Spark

WHITEPAPER. MemSQL Enterprise Feature List

Using the SDACK Architecture to Build a Big Data Product. Yu-hsin Yeh (Evans Ye) Apache Big Data NA 2016 Vancouver

Principal Software Engineer Red Hat Emerging Technology June 24, 2015

A GridGain Systems In-Memory Computing White Paper

Oracle NoSQL Database Enterprise Edition, Version 18.1

Big Data Analytics using Apache Hadoop and Spark with Scala

Oracle Big Data. A NA LYT ICS A ND MA NAG E MENT.

Gain Insights From Unstructured Data Using Pivotal HD. Copyright 2013 EMC Corporation. All rights reserved.

Big Data Hadoop Course Content

Certified Big Data Hadoop and Spark Scala Course Curriculum

Shark: Hive (SQL) on Spark

Building a Data-Friendly Platform for a Data- Driven Future

Oracle NoSQL Database Enterprise Edition, Version 18.1

Big Data Syllabus. Understanding big data and Hadoop. Limitations and Solutions of existing Data Analytics Architecture

How Apache Hadoop Complements Existing BI Systems. Dr. Amr Awadallah Founder, CTO Cloudera,

MODERN BIG DATA DESIGN PATTERNS CASE DRIVEN DESINGS

Hadoop. Introduction / Overview

Elastify Cloud-Native Spark Application with PMEM. Junping Du --- Chief Architect, Tencent Cloud Big Data Department Yue Li --- Cofounder, MemVerge

Hadoop Development Introduction

Stages of Data Processing

@unterstein #bedcon. Operating microservices with Apache Mesos and DC/OS

Cloud Computing & Visualization

Hadoop course content

Certified Big Data and Hadoop Course Curriculum


DATA SCIENCE USING SPARK: AN INTRODUCTION

Big Data. Big Data Analyst. Big Data Engineer. Big Data Architect

Asanka Padmakumara. ETL 2.0: Data Engineering with Azure Databricks

Apache Ignite - Using a Memory Grid for Heterogeneous Computation Frameworks A Use Case Guided Explanation. Chris Herrera Hashmap

Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL. May 2015

Oracle Big Data Fundamentals Ed 1

2/26/2017. Originally developed at the University of California - Berkeley's AMPLab


An Introduction to Apache Spark

Hadoop Map Reduce 10/17/2018 1

Analytic Cloud with. Shelly Garion. IBM Research -- Haifa IBM Corporation

Big Data Architect.

In-memory data pipeline and warehouse at scale using Spark, Spark SQL, Tachyon and Parquet

IBM Data Science Experience White paper. SparkR. Transforming R into a tool for big data analytics

Specialist ICT Learning

Increase Value from Big Data with Real-Time Data Integration and Streaming Analytics

We are ready to serve Latest Testing Trends, Are you ready to learn?? New Batches Info

Accelerate Big Data Insights

Big Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara

Hadoop, Yarn and Beyond

HDInsight > Hadoop. October 12, 2017

The Datacenter Needs an Operating System

MapR Enterprise Hadoop

Turning Relational Database Tables into Spark Data Sources

Cloud, Big Data & Linear Algebra

Sunil Shah SECURE, FLEXIBLE CONTINUOUS DELIVERY PIPELINES WITH GITLAB AND DC/OS Mesosphere, Inc. All Rights Reserved.

Distributed Computation Models

An Introduction to Apache Spark Big Data Madison: 29 July William Red Hat, Inc.

CompSci 516: Database Systems

Big Data with Hadoop Ecosystem

Page 1. Oracle9i OLAP. Agenda. Mary Rehus Sales Consultant Patrick Larkin Vice President, Oracle Consulting. Oracle Corporation. Business Intelligence

High Performance in-memory computing with Apache Ignite

Page 1. Goals for Today" Background of Cloud Computing" Sources Driving Big Data" CS162 Operating Systems and Systems Programming Lecture 24

Big Data Hadoop Stack

Topics. Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples

what is cloud computing?

Index. Raul Estrada and Isaac Ruiz 2016 R. Estrada and I. Ruiz, Big Data SMACK, DOI /

Deploying Applications on DC/OS

Big Data com Hadoop. VIII Sessão - SQL Bahia. Impala, Hive e Spark. Diógenes Pires 03/03/2018

GridGain In- Memory Compu2ng Pla6orm: Empowering Insurance With In- Memory CompuBng

Container 2.0. Container: check! But what about persistent data, big data or fast data?!

Blended Learning Outline: Cloudera Data Analyst Training (171219a)

Prototyping Data Intensive Apps: TrendingTopics.org

Configuring and Deploying Hadoop Cluster Deployment Templates

Apache Flink Big Data Stream Processing

Lecture 11 Hadoop & Spark

Mesosphere and the Enterprise: Run Your Applications on Apache Mesos. Steve Wong Open Source Engineer {code} by Dell

Approaching the Petabyte Analytic Database: What I learned

Transcription:

Apache Ignite TM - In- Memory Data Fabric Fast Data Meets Open Source DMITRIY SETRAKYAN Founder, PPMC https://ignite.apache.org @apacheignite @dsetrakyan

Agenda About In- Memory Computing Apache Ignite (tm) In- Memory Data Fabric Advanced Clustering Data Grid Compute Grid Service Grid Ignite For Analytics Streaming & CEP Share State Across Spark Jobs In- Memory MapReduce Interactive SQL DevOps: Yarn and Mesos Q & A

Apache Ignite TM In- Memory Data Fabric: Strategic Approach to IMC Supports Applications of various types and languages Open Source Apache 2.0 Simple Java APIs 1 JAR Dependency High Performance & Scale Automatic Fault Tolerance Management/Monitoring Runs on Commodity Hardware Supports existing & new data sources No need to rip & replace

In- Memory Data Fabric: More Than Data Grid

Apache Ignite: Complete Cloud Support Automatic Discovery Simple Configuration AWS/EC2/S3 Google Compute Engine Other Clouds with JClouds Docker Support Automatically Build and Deploy

Data Grid: JCache (JSR 107) JCache (JSR 107) Basic Cache Operations ConcurrentMap APIs Collocated Processing (EntryProcessor) Events and Metrics Pluggable Persistence Ignite Data Grid ACID Transactions SQL Queries (ANSI 99) In- Memory Indexes Automatic RDBMS Integration

Data Grid: Partitioned Cache

Data Grid: Replicated Cache

Data Grid: Off- Heap Memory Unlimited Vertical Scale Avoid Java Garbage Collection Pauses Small On- Heap Footprint Large Off- Heap Footprint Off- Heap Indexes Full RAM Utilization Simple Configuration

Data Grid: Ad- Hoc SQL (ANSI 99) ANSI- 99 SQL Always Consistent Fault Tolerant In- Memory Indexes (On- Heap and Off- Heap) Automatic Group By, Aggregations, Sorting Cross- Cache Joins, Unions, etc. Ad- Hoc SQL Support

SQL Cross- Cache JOIN Example

SQL Cross- Cache GROUP BY Example

In- Memory Compute Grid Direct API for MapReduce Direct API for ForkJoin Zero Deployment Cron- like Task Scheduling State Checkpoints Load Balancing Automatic Failover Full Cluster Management Pluggable SPI Design

In- Memory Streaming and CEP Streaming Data Never Ends Branching Pipelines Pluggable Routing Sliding Windows for CEP/Continuous Query SQL Queries (ANSI 99) Query Across Sliding Windows Real Time Analysis

In- Memory Service Grid Singletons on the Cluster Cluster Singleton Node Singleton Key Singleton Distribute any Data Structure Available Anywhere on the Grid Access Anywhere via Proxies Guaranteed Availability Auto Redeployment in Case of Failures

Apache Ignite for BI and Analytics

DevOps: Integration with Yarn and Mesos Automatic Resource Management Easy Data Center Installation Easy Data Center Configuration On- Demand Elasticity

Share RDDs Across Spark Jobs IgniteRDD Share RDD across jobs on the host Share RDD across jobs in the application Share RDD globally Faster SQL In- Memory Indexes SQL on top of Shared RDD

Ignite In- Memory File System Ignite In- Memory File System (IGFS) Hadoop- compliant Easy to Install On- Heap and Off- Heap Caching Layer for HDFS Write- through and Read- through HDFS Performance Boost

Ignite In- Memory Map Reduce In- Memory Native Performance Zero Code Change Use existing MR code Use existing Hive queries No Name Node No Network Noise In- Process Data Colocation Eager Push Scheduling

Interactive SQL with Apache Zeppelin

GridGain Enterprise & Apache Ignite Comparison Chart Features Apache Ignite Enterprise Edition In-Memory Data Grid In-Memory Compute Grid Real-Time Streaming & CEP Hadoop Acceleration Management & Monitoring GUI Portable Objects.Net and C++ APIs Enterprise-grade Security Network Segmentation Protection Local Restartable Store Rolling Production Updates Datacenter Replication 9x5 and 24x7 Support Long Term Support & Patches GridGain Enterprise Subscriptions include the following during the term of the subscription: > Right to use GridGain Enterprise Edition > Bug fixes, patches, updates and upgrades > 9x5 or 24x7 Support > Ability to procure Training and Consulting Services from GridGain > Confidence and protection, not provided under Open Source licensing, that only a commercial vendor can provide, such as indemnification

ANY QUESTIONS? Thank you for joining us. Follow the conversation. https://ignite.apache.org @apacheignite @dsetrakyan