Do-It-Yourself 1. Oracle Big Data Appliance 2X Faster than

Size: px
Start display at page:

Download "Do-It-Yourself 1. Oracle Big Data Appliance 2X Faster than"

Transcription

1

2 Oracle Big Data Appliance 2X Faster than Do-It-Yourself 1 Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit Configurations were compared by using the Big Data Benchmark for BigBench.Oracle* Big Data Appliance configuration included 6 nodes comprised of: Intel Xeon CPU E v3 (HT enabled) with 128 GB DDR4, 12 X 4TB HDD, Infiniband network (1 connection) observed max throughput 24 Gb/sec, Oracle* Linux Enterprise 6, and CDH* with modified configuration. DIY cluster configuration included 6 nodes comprised of: Intel Xeon CPU E v3 (HT enabled) with 128 GB DDR4, 1 x 64GB SSD for OS, 12 X 4TB HDD, 10Gb network (1 connection), CentOS* 6.6, CDH* with minimal changes.

3 Big Data SQL Roadmap Jean- Pierre Dijcks Big Data Product Management Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle ConfidenMal Internal/Restricted/Highly Restricted

4 Safe Harbor Statement The following is intended to outline our general product direcmon. It is intended for informamon purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or funcmonality, and should not be relied upon in making purchasing decisions. The development, release, and Mming of any features or funcmonality described for Oracle s products remains at the sole discremon of Oracle. Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle ConfidenMal Internal/Restricted/Highly Restricted 4

5 Extending Data Management Query all your Data: Hadoop, NoSQL & Rela7onal SQL Python node.js REST {MapReduce} Oracle Big {APIs} Data SQL SQL R Graph Java NoSQL Copyright 2015, Oracle and/or its affiliates. All rights reserved. 5

6 Big Data SQL Metadata IntegraMon Hive Tables Define Read Metadata à SerDe = Columns à Inpu_ormat = Records Oracle_Hive Oracle_HDFS Define Read Metadata à Oracle Data Types à Parallelism for Database query à Source objects Oracle External Tables B JSON Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle ConfidenMal Internal/Restricted/Highly Restricted 6

7 Big Data SQL Query ExecuMon Overview Hadoop Cluster SELECT w.sess_id, c.name FROM web_logs w, customers c WHERE w.source_country = Brazil AND w.cust_id = c.customer_id; Big Data SQL: Distributed IO Smart Scan Storage Indexes Relevant SQL runs on BDA nodes 10 s of Gigabytes of Data WEB_LOGS B B B Only columns and rows needed to answer query are returned CUSTOMERS Avoid Scanning Data Oracle Database Copyright 2015, Oracle and/or its affiliates. All rights reserved. 7

8 StorageHandlers: Extensibility Beyond HDFS Oracle Big Data SQL StorageHandlers are a metadata bridge. Hive Metastore Copyright 2015, Oracle and/or its affiliates. All rights reserved.

9 Copyright 2015, Oracle and/or its affiliates. All rights reserved.

10 Big Data SQL: A New Hadoop Processing Engine MapReduce and Hive Processing Layer Spark Impala Search Big Data SQL Resource Management (YARN, cgroups) Storage Layer Filesystem (HDFS) NoSQL Databases (Oracle NoSQL DB, Hbase) Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle ConfidenMal Internal/Restricted/Highly Restricted 10

11 Smart Scan for Hadoop: OpMmizing Performance Big Data SQL Agent Smart Scan External Table Services Data Node Oracle on top Apply filter predicates Project columns Parse semi- structured data Hadoop on the bohom Work close to the data Schema- on- read with Hadoop classes TransformaMon into Oracle data stream Disk Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle ConfidenMal Internal/Restricted/Highly Restricted 11

12 Big Data SQL Dataflow Big Data SQL Agent Smart Scan External Table Services SerDe RecordReader Data Node Disks Read data from HDFS Data Node Direct- path reads C- based readers when possible Use namve Hadoop classes otherwise Translate bytes to Oracle Apply Smart Scan to Oracle bytes Apply filters Project Columns Parse JSON/XML Score models Copyright 2015, Oracle and/or its affiliates. All rights reserved.

13 Big Data SQL 1.1: Enhanced Parallelism and More Storage Handlers Big Data SQL 1.0 Hadoop DoP linked to RDBMS DoP Lead to many idle PQ processes Required explicit declaramon Big Data SQL 1.1 Unlink Hadoop and RDBMS DoP AutomaMc max Hadoop parallelism Even on serial tables An average of 40% faster Even at equivalent DoP StorageHandler support for Hbase And more Copyright 2015, Oracle and/or its affiliates. All rights reserved.

14 Big Data SQL 2.0: Storage Indexing and AcMve SQL Monitor Reports Big Data SQL 1.0 & 1.1 All blocks in a query must be read from disk Large (256MB) disk I/O for each block Big Data SQL 2.0 SQL Monitor integramon AutomaMcally create Storage Indexes in Big Data SQL agents Check index before reading blocks Skip unnecessary I/Os An average of 65% faster Up to 100x faster for highly selecmve queries Copyright 2015, Oracle and/or its affiliates. All rights reserved.

15 AcMve SQL Monitor Reports for Greater Diagnosability Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle ConfidenMal Internal/Restricted/Highly Restricted 15

16 Oracle Big Data SQL Storage Index Field1, Field2, HDFS Field3,, Fieldn HDFS Block1 (256MB) HDFS Block2 (256MB) Example: Find all ramngs from movies with a MOVIE_ID of 1109 Index B1 Movie_ID Min: 1001 Max: 1609 B2 Movie_ID Min: 1909 Max: Storage index provides query speed- up through transparent IO eliminamon of HDFS Blocks Columns in SQL are mapped to fields in the HDFS file via External Table DefiniMons Min / max value is recorded for each HDFS Block in a storage index Copyright 2015, Oracle and/or its affiliates. All rights reserved. 16

17 Oracle Big Data SQL Storage Index ORACLE_HDFS Example Map column movie_id in External Table to file fields: com.oracle.bigdata.colmap: {"col": movie_id", \ "field": Field2"} Run SQL Query on Oracle Database: Select rating from Movie_Logs where movie_id = 1109; Big Data SQL Agent receives work request Big Data SQL Agent verifies SI and constructs block list to be scanned by Data Node for this query: Block 1,4 Big Data SQL Agent requests IO for only the HDFS blocks in the list, eliminamng blocks and speeding up the query Smart Scan applies column projecmon and row eliminamon and sends result set to Oracle Database Big Data SQL Agent Storage Index Block 1: movie_id: min:1001 max:1609 Block 2: movie_id: min:1909 max:13010 Block 3: movie_id: min:2001 max:9043 Block 4: movie_id: min:909 max:2356 Block 1 Block 4 Data Scanner Data Node Disks Copyright 2015, Oracle and/or its affiliates. All rights reserved. 17

18 Big Data SQL 2.0: Selected TPC- DS Speed Ups Query Elapsed Time Times Faster without SI with SI s 12.3s s 26.3s s 44.3s s 11.0s s 63.5s s 76.1s s 8.9s s 10.2s s 11.9s s 67.1s s 13.1s 19.3 Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle ConfidenMal Internal/Restricted/Highly Restricted 18

19 Futures Where Do We Go From Here? Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle ConfidenMal Internal/Restricted/Highly Restricted 19

20 Big Data SQL Vision Data VirtualizaMon across technologies and deployments SQL Python node.js REST R Graph Java Oracle Big Data SQL Copyright 2015, Oracle and/or its affiliates. All rights reserved. 20

21 Big Data SQL Performance Enhancements Predicate pushdown to underlying Storage Engines Feature 1: Hive parmmon eliminamon support Feature 2: IO eliminamon on Parquet and ORC files Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle ConfidenMal Internal/Restricted/Highly Restricted 21

22 Big Data SQL Performance Enhancements Predicate pushdown to underlying Storage Engines Feature 1: Hive parmmon eliminamon support Big Data SQL Agent Smart Scan External Table Services Less Data + Less Scan Time IO Request + Query Predicates Data Node Hive ParMMon EliminaMon Disk Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle ConfidenMal Internal/Restricted/Highly Restricted 22

23 Big Data SQL Performance Enhancements Predicate pushdown to underlying Storage Engines Feature 2: IO eliminamon on Parquet and ORC files Also speeds- up HBase and Oracle NoSQL Database Big Data SQL Agent Smart Scan External Table Services Less IO + Less Scan Time IO Request + Query Predicates Data Node Parquet or ORC Metadata driven IO eliminamon Disk Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle ConfidenMal Internal/Restricted/Highly Restricted 23

24 Big Data SQL Performance Enhancements Predicate pushdown to underlying Storage Engines Feature 1: Hive parmmon eliminamon support Feature 2: IO eliminamon on Parquet and ORC files Feature 3: Columnar Cache Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle ConfidenMal Internal/Restricted/Highly Restricted 24

25 Big Data SQL Performance Enhancements Predicate pushdown to underlying Storage Engines Feature 1: Hive parmmon eliminamon support Feature 2: IO eliminamon on Parquet and ORC files Feature 3: Columnar Cache Modeled on the Exadata Columnar Flash Cache Implemented without Flash Otherwise equal in funcmon and goals Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle ConfidenMal Internal/Restricted/Highly Restricted 25

26 Big Data SQL Announcing Generic Database 12c Support Big Data SQL for generic Oracle 12c DB (incl. Sparc SuperCluster) B B B Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle ConfidenMal Internal/Restricted/Highly Restricted 26

27 Copyright 2015, Oracle and/or its affiliates. All rights reserved. Oracle ConfidenMal Internal/Restricted/Highly Restricted 27

28

Oracle Big Data SQL High Performance Data Virtualization Explained

Oracle Big Data SQL High Performance Data Virtualization Explained Keywords: Oracle Big Data SQL High Performance Data Virtualization Explained Jean-Pierre Dijcks Oracle Redwood City, CA, USA Big Data SQL, SQL, Big Data, Hadoop, NoSQL Databases, Relational Databases,

More information

Big Data SQL Deep Dive

Big Data SQL Deep Dive Big Data SQL Deep Dive Jean-Pierre Dijcks Big Data Product Management DOAG 2016 Copyright 2016, Oracle and/or its affiliates. All rights reserved. 2 Safe Harbor Statement The following is intended to outline

More information

Security and Performance advances with Oracle Big Data SQL

Security and Performance advances with Oracle Big Data SQL Security and Performance advances with Oracle Big Data SQL Jean-Pierre Dijcks Oracle Redwood Shores, CA, USA Key Words SQL, Oracle, Database, Analytics, Object Store, Files, Big Data, Big Data SQL, Hadoop,

More information

Oracle Big Data SQL. Release 3.2. Rich SQL Processing on All Data

Oracle Big Data SQL. Release 3.2. Rich SQL Processing on All Data Oracle Big Data SQL Release 3.2 The unprecedented explosion in data that can be made useful to enterprises from the Internet of Things, to the social streams of global customer bases has created a tremendous

More information

Eine für Alle - Oracle DB für Big Data, In-memory und Exadata Dr.-Ing. Holger Friedrich

Eine für Alle - Oracle DB für Big Data, In-memory und Exadata Dr.-Ing. Holger Friedrich Eine für Alle - Oracle DB für Big Data, In-memory und Exadata Dr.-Ing. Holger Friedrich Agenda Introduction Old Times Exadata Big Data Oracle In-Memory Headquarters Conclusions 2 sumit AG Consulting and

More information

Part 1 Configuring Oracle Big Data SQL

Part 1 Configuring Oracle Big Data SQL Oracle Big Data, Data Science, Advance Analytics & Oracle NoSQL Database Securely analyze data across the big data platform whether that data resides in Oracle Database 12c, Hadoop or a combination of

More information

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into

The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material,

More information

Recent Innovations in Data Storage Technologies Dr Roger MacNicol Software Architect

Recent Innovations in Data Storage Technologies Dr Roger MacNicol Software Architect Recent Innovations in Data Storage Technologies Dr Roger MacNicol Software Architect Copyright 2017, Oracle and/or its affiliates. All rights reserved. Safe Harbor Statement The following is intended to

More information

Oracle Big Data Connectors

Oracle Big Data Connectors Oracle Big Data Connectors Oracle Big Data Connectors is a software suite that integrates processing in Apache Hadoop distributions with operations in Oracle Database. It enables the use of Hadoop to process

More information

Hadoop Beyond Batch: Real-time Workloads, SQL-on- Hadoop, and thevirtual EDW Headline Goes Here

Hadoop Beyond Batch: Real-time Workloads, SQL-on- Hadoop, and thevirtual EDW Headline Goes Here Hadoop Beyond Batch: Real-time Workloads, SQL-on- Hadoop, and thevirtual EDW Headline Goes Here Marcel Kornacker marcel@cloudera.com Speaker Name or Subhead Goes Here 2013-11-12 Copyright 2013 Cloudera

More information

Albis: High-Performance File Format for Big Data Systems

Albis: High-Performance File Format for Big Data Systems Albis: High-Performance File Format for Big Data Systems Animesh Trivedi, Patrick Stuedi, Jonas Pfefferle, Adrian Schuepbach, Bernard Metzler, IBM Research, Zurich 2018 USENIX Annual Technical Conference

More information

4th National Conference on Electrical, Electronics and Computer Engineering (NCEECE 2015)

4th National Conference on Electrical, Electronics and Computer Engineering (NCEECE 2015) 4th National Conference on Electrical, Electronics and Computer Engineering (NCEECE 2015) Benchmark Testing for Transwarp Inceptor A big data analysis system based on in-memory computing Mingang Chen1,2,a,

More information

Oracle Database Exadata Cloud Service Exadata Performance, Cloud Simplicity DATABASE CLOUD SERVICE

Oracle Database Exadata Cloud Service Exadata Performance, Cloud Simplicity DATABASE CLOUD SERVICE Oracle Database Exadata Exadata Performance, Cloud Simplicity DATABASE CLOUD SERVICE Oracle Database Exadata combines the best database with the best cloud platform. Exadata is the culmination of more

More information

Evolving To The Big Data Warehouse

Evolving To The Big Data Warehouse Evolving To The Big Data Warehouse Kevin Lancaster 1 Copyright Director, 2012, Oracle and/or its Engineered affiliates. All rights Insert Systems, Information Protection Policy Oracle Classification from

More information

April Copyright 2013 Cloudera Inc. All rights reserved.

April Copyright 2013 Cloudera Inc. All rights reserved. Hadoop Beyond Batch: Real-time Workloads, SQL-on- Hadoop, and the Virtual EDW Headline Goes Here Marcel Kornacker marcel@cloudera.com Speaker Name or Subhead Goes Here April 2014 Analytic Workloads on

More information

Turning Relational Database Tables into Spark Data Sources

Turning Relational Database Tables into Spark Data Sources Turning Relational Database Tables into Spark Data Sources Kuassi Mensah Jean de Lavarene Director Product Mgmt Director Development Server Technologies October 04, 2017 3 Safe Harbor Statement The following

More information

Oracle Big Data Appliance X7-2

Oracle Big Data Appliance X7-2 Oracle Big Data Appliance X7-2 Oracle Big Data Appliance is a flexible, high-performance, secure platform for running diverse workloads on Hadoop, Kafka and NoSQL. With Oracle Big Data SQL, Oracle Big

More information

MapR Enterprise Hadoop

MapR Enterprise Hadoop 2014 MapR Technologies 2014 MapR Technologies 1 MapR Enterprise Hadoop Top Ranked Cloud Leaders 500+ Customers 2014 MapR Technologies 2 Key MapR Advantage Partners Business Services APPLICATIONS & OS ANALYTICS

More information

Safe Harbor Statement

Safe Harbor Statement Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment

More information

Copyright 2012, Oracle and/or its affiliates. All rights reserved.

Copyright 2012, Oracle and/or its affiliates. All rights reserved. 1 Big Data Connectors: High Performance Integration for Hadoop and Oracle Database Melli Annamalai Sue Mavris Rob Abbott 2 Program Agenda Big Data Connectors: Brief Overview Connecting Hadoop with Oracle

More information

Cloudera Kudu Introduction

Cloudera Kudu Introduction Cloudera Kudu Introduction Zbigniew Baranowski Based on: http://slideshare.net/cloudera/kudu-new-hadoop-storage-for-fast-analytics-onfast-data What is KUDU? New storage engine for structured data (tables)

More information

Oracle Big Data. A NA LYT ICS A ND MA NAG E MENT.

Oracle Big Data. A NA LYT ICS A ND MA NAG E MENT. Oracle Big Data. A NALYTICS A ND MANAG E MENT. Oracle Big Data: Redundância. Compatível com ecossistema Hadoop, HIVE, HBASE, SPARK. Integração com Cloudera Manager. Possibilidade de Utilização da Linguagem

More information

Resource and Performance Distribution Prediction for Large Scale Analytics Queries

Resource and Performance Distribution Prediction for Large Scale Analytics Queries Resource and Performance Distribution Prediction for Large Scale Analytics Queries Prof. Rajiv Ranjan, SMIEEE School of Computing Science, Newcastle University, UK Visiting Scientist, Data61, CSIRO, Australia

More information

Oracle Database 11g for Data Warehousing & Big Data: Strategy, Roadmap Jean-Pierre Dijcks, Hermann Baer Oracle Redwood City, CA, USA

Oracle Database 11g for Data Warehousing & Big Data: Strategy, Roadmap Jean-Pierre Dijcks, Hermann Baer Oracle Redwood City, CA, USA Oracle Database 11g for Data Warehousing & Big Data: Strategy, Roadmap Jean-Pierre Dijcks, Hermann Baer Oracle Redwood City, CA, USA Keywords: Big Data, Oracle Big Data Appliance, Hadoop, NoSQL, Oracle

More information

Oracle Linux, Virtualization & OEM12 Discussion Sahil Mahajan / Sundeep Dhall

Oracle Linux, Virtualization & OEM12 Discussion Sahil Mahajan / Sundeep Dhall Oracle Linux, Virtualization & OEM12 Discussion Sahil Mahajan / Sundeep Dhall 1 Copyright 2011, 2013, Oracle and/or its affiliates. All rights reserved. reserved. Insert Information Protection Policy Classification

More information

Just add Magic. Enterprise Parquet. Jean-Pierre Dijcks Product Management, Big

Just add Magic. Enterprise Parquet. Jean-Pierre Dijcks Product Management, Big Just add Magic Enterprise Parquet Jean-Pierre Dijcks Product Management, Big Data @jpdijcks Program Agenda 1 2 3 Context Enterprise Parquet Q&A 3 Context 4 Use Cases and Non-Use Cases The entre presentaton

More information

Big Data Hadoop Developer Course Content. Big Data Hadoop Developer - The Complete Course Course Duration: 45 Hours

Big Data Hadoop Developer Course Content. Big Data Hadoop Developer - The Complete Course Course Duration: 45 Hours Big Data Hadoop Developer Course Content Who is the target audience? Big Data Hadoop Developer - The Complete Course Course Duration: 45 Hours Complete beginners who want to learn Big Data Hadoop Professionals

More information

Oracle NoSQL Database Enterprise Edition, Version 18.1

Oracle NoSQL Database Enterprise Edition, Version 18.1 Oracle NoSQL Database Enterprise Edition, Version 18.1 Oracle NoSQL Database is a scalable, distributed NoSQL database, designed to provide highly reliable, flexible and available data management across

More information

Apache Hive for Oracle DBAs. Luís Marques

Apache Hive for Oracle DBAs. Luís Marques Apache Hive for Oracle DBAs Luís Marques About me Oracle ACE Alumnus Long time open source supporter Founder of Redglue (www.redglue.eu) works for @redgluept as Lead Data Architect @drune After this talk,

More information

Impala. A Modern, Open Source SQL Engine for Hadoop. Yogesh Chockalingam

Impala. A Modern, Open Source SQL Engine for Hadoop. Yogesh Chockalingam Impala A Modern, Open Source SQL Engine for Hadoop Yogesh Chockalingam Agenda Introduction Architecture Front End Back End Evaluation Comparison with Spark SQL Introduction Why not use Hive or HBase?

More information

Colin Cunningham, Intel Kumaran Siva, Intel Sandeep Mahajan, Oracle 03-Oct :45 p.m. - 5:30 p.m. Moscone West - Room 3020

Colin Cunningham, Intel Kumaran Siva, Intel Sandeep Mahajan, Oracle 03-Oct :45 p.m. - 5:30 p.m. Moscone West - Room 3020 Colin Cunningham, Intel Kumaran Siva, Intel Sandeep Mahajan, Oracle 03-Oct-2017 4:45 p.m. - 5:30 p.m. Moscone West - Room 3020 Big Data Talk Exploring New SSD Usage Models to Accelerate Cloud Performance

More information

Oracle Exadata: Strategy and Roadmap

Oracle Exadata: Strategy and Roadmap Oracle Exadata: Strategy and Roadmap - New Technologies, Cloud, and On-Premises Juan Loaiza Senior Vice President, Database Systems Technologies, Oracle Safe Harbor Statement The following is intended

More information

Oracle 1Z Oracle Big Data 2017 Implementation Essentials.

Oracle 1Z Oracle Big Data 2017 Implementation Essentials. Oracle 1Z0-449 Oracle Big Data 2017 Implementation Essentials https://killexams.com/pass4sure/exam-detail/1z0-449 QUESTION: 63 Which three pieces of hardware are present on each node of the Big Data Appliance?

More information

Scott Oaks, Oracle Sunil Raghavan, Intel Daniel Verkamp, Intel 03-Oct :45 p.m. - 4:30 p.m. Moscone West - Room 3020

Scott Oaks, Oracle Sunil Raghavan, Intel Daniel Verkamp, Intel 03-Oct :45 p.m. - 4:30 p.m. Moscone West - Room 3020 Scott Oaks, Oracle Sunil Raghavan, Intel Daniel Verkamp, Intel 03-Oct-2017 3:45 p.m. - 4:30 p.m. Moscone West - Room 3020 Big Data Talk Exploring New SSD Usage Models to Accelerate Cloud Performance 03-Oct-2017,

More information

Hadoop 2.x Core: YARN, Tez, and Spark. Hortonworks Inc All Rights Reserved

Hadoop 2.x Core: YARN, Tez, and Spark. Hortonworks Inc All Rights Reserved Hadoop 2.x Core: YARN, Tez, and Spark YARN Hadoop Machine Types top-of-rack switches core switch client machines have client-side software used to access a cluster to process data master nodes run Hadoop

More information

Elastify Cloud-Native Spark Application with PMEM. Junping Du --- Chief Architect, Tencent Cloud Big Data Department Yue Li --- Cofounder, MemVerge

Elastify Cloud-Native Spark Application with PMEM. Junping Du --- Chief Architect, Tencent Cloud Big Data Department Yue Li --- Cofounder, MemVerge Elastify Cloud-Native Spark Application with PMEM Junping Du --- Chief Architect, Tencent Cloud Big Data Department Yue Li --- Cofounder, MemVerge Table of Contents Sparkling: The Tencent Cloud Data Warehouse

More information

1 Copyright 2012, Oracle and/or its affiliates. All rights reserved.

1 Copyright 2012, Oracle and/or its affiliates. All rights reserved. 1 Copyright 2012, Oracle and/or its affiliates. All rights The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated

More information

Bring Context To Your Machine Data With Hadoop, RDBMS & Splunk

Bring Context To Your Machine Data With Hadoop, RDBMS & Splunk Bring Context To Your Machine Data With Hadoop, RDBMS & Splunk Raanan Dagan and Rohit Pujari September 25, 2017 Washington, DC Forward-Looking Statements During the course of this presentation, we may

More information

Delving Deep into Hadoop Course Contents Introduction to Hadoop and Architecture

Delving Deep into Hadoop Course Contents Introduction to Hadoop and Architecture Delving Deep into Hadoop Course Contents Introduction to Hadoop and Architecture Hadoop 1.0 Architecture Introduction to Hadoop & Big Data Hadoop Evolution Hadoop Architecture Networking Concepts Use cases

More information

Apache Kudu. Zbigniew Baranowski

Apache Kudu. Zbigniew Baranowski Apache Kudu Zbigniew Baranowski Intro What is KUDU? New storage engine for structured data (tables) does not use HDFS! Columnar store Mutable (insert, update, delete) Written in C++ Apache-licensed open

More information

MySQL & NoSQL: The Best of Both Worlds

MySQL & NoSQL: The Best of Both Worlds MySQL & NoSQL: The Best of Both Worlds Mario Beck Principal Sales Consultant MySQL mario.beck@oracle.com 1 Copyright 2012, Oracle and/or its affiliates. All rights Safe Harbour Statement The following

More information

Oracle Big Data SQL brings SQL and Performance to Hadoop

Oracle Big Data SQL brings SQL and Performance to Hadoop Oracle Big Data SQL brings SQL and Performance to Hadoop Jean-Pierre Dijcks Oracle Redwood City, CA, USA Keywords: Big Data SQL, Hadoop, Big Data Appliance, SQL, Oracle, Performance, Smart Scan Introduction

More information

Oracle R Technologies

Oracle R Technologies Oracle R Technologies R for the Enterprise Mark Hornick, Director, Oracle Advanced Analytics @MarkHornick mark.hornick@oracle.com Safe Harbor Statement The following is intended to outline our general

More information

Integrating with Apache Hadoop

Integrating with Apache Hadoop HPE Vertica Analytic Database Software Version: 7.2.x Document Release Date: 10/10/2017 Legal Notices Warranty The only warranties for Hewlett Packard Enterprise products and services are set forth in

More information

Hadoop. Course Duration: 25 days (60 hours duration). Bigdata Fundamentals. Day1: (2hours)

Hadoop. Course Duration: 25 days (60 hours duration). Bigdata Fundamentals. Day1: (2hours) Bigdata Fundamentals Day1: (2hours) 1. Understanding BigData. a. What is Big Data? b. Big-Data characteristics. c. Challenges with the traditional Data Base Systems and Distributed Systems. 2. Distributions:

More information

Copyright 2011, Oracle and/or its affiliates. All rights reserved.

Copyright 2011, Oracle and/or its affiliates. All rights reserved. The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material,

More information

Overview. Prerequisites. Course Outline. Course Outline :: Apache Spark Development::

Overview. Prerequisites. Course Outline. Course Outline :: Apache Spark Development:: Title Duration : Apache Spark Development : 4 days Overview Spark is a fast and general cluster computing system for Big Data. It provides high-level APIs in Scala, Java, Python, and R, and an optimized

More information

Big Data with Hadoop Ecosystem

Big Data with Hadoop Ecosystem Diógenes Pires Big Data with Hadoop Ecosystem Hands-on (HBase, MySql and Hive + Power BI) Internet Live http://www.internetlivestats.com/ Introduction Business Intelligence Business Intelligence Process

More information

Introducing Oracle Machine Learning

Introducing Oracle Machine Learning Introducing Oracle Machine Learning A Collaborative Zeppelin notebook for Oracle s machine learning capabilities Charlie Berger Marcos Arancibia Mark Hornick Advanced Analytics and Machine Learning Copyright

More information

cstore_fdw Columnar store for analytic workloads Hadi Moshayedi & Ben Redman

cstore_fdw Columnar store for analytic workloads Hadi Moshayedi & Ben Redman cstore_fdw Columnar store for analytic workloads Hadi Moshayedi & Ben Redman What is CitusDB? CitusDB is a scalable analytics database that extends PostgreSQL Citus shards your data and automa/cally parallelizes

More information

Oracle NoSQL Database Enterprise Edition, Version 18.1

Oracle NoSQL Database Enterprise Edition, Version 18.1 Oracle NoSQL Database Enterprise Edition, Version 18.1 Oracle NoSQL Database is a scalable, distributed NoSQL database, designed to provide highly reliable, flexible and available data management across

More information

Innovatus Technologies

Innovatus Technologies HADOOP 2.X BIGDATA ANALYTICS 1. Java Overview of Java Classes and Objects Garbage Collection and Modifiers Inheritance, Aggregation, Polymorphism Command line argument Abstract class and Interfaces String

More information

Stay Informed During and AEer OpenWorld

Stay Informed During and AEer OpenWorld Stay Informed During and AEer OpenWorld TwiIer: @OracleBigData, @OracleExadata, @Infrastructure Follow #CloudReady LinkedIn: Oracle IT Infrastructure Oracle Showcase Page Oracle Big Data Oracle Showcase

More information

Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context

Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context 1 Apache Spark is a fast and general-purpose engine for large-scale data processing Spark aims at achieving the following goals in the Big data context Generality: diverse workloads, operators, job sizes

More information

1 Copyright 2012, Oracle and/or its affiliates. All rights reserved.

1 Copyright 2012, Oracle and/or its affiliates. All rights reserved. 1 Engineered Systems - Exadata Juan Loaiza Senior Vice President Systems Technology October 4, 2012 2 Safe Harbor Statement "Safe Harbor Statement: Statements in this presentation relating to Oracle's

More information

Introduction to Hadoop. High Availability Scaling Advantages and Challenges. Introduction to Big Data

Introduction to Hadoop. High Availability Scaling Advantages and Challenges. Introduction to Big Data Introduction to Hadoop High Availability Scaling Advantages and Challenges Introduction to Big Data What is Big data Big Data opportunities Big Data Challenges Characteristics of Big data Introduction

More information

Future of Database. - Journey to the Cloud. Juan Loaiza Senior Vice President Oracle Database Systems

Future of Database. - Journey to the Cloud. Juan Loaiza Senior Vice President Oracle Database Systems Future of Database - Journey to the Cloud Juan Loaiza Senior Vice President Oracle Database Systems Copyright 2016, Oracle and/or its affiliates. All rights reserved. Safe Harbor Statement The following

More information

EsgynDB Enterprise 2.0 Platform Reference Architecture

EsgynDB Enterprise 2.0 Platform Reference Architecture EsgynDB Enterprise 2.0 Platform Reference Architecture This document outlines a Platform Reference Architecture for EsgynDB Enterprise, built on Apache Trafodion (Incubating) implementation with licensed

More information

Impala Intro. MingLi xunzhang

Impala Intro. MingLi xunzhang Impala Intro MingLi xunzhang Overview MPP SQL Query Engine for Hadoop Environment Designed for great performance BI Connected(ODBC/JDBC, Kerberos, LDAP, ANSI SQL) Hadoop Components HDFS, HBase, Metastore,

More information

Deploying Spatial Applications in Oracle Public Cloud

Deploying Spatial Applications in Oracle Public Cloud Deploying Spatial Applications in Oracle Public Cloud David Lapp, Product Manager Oracle Spatial and Graph Oracle Spatial Summit at BIWA 2017 Safe Harbor Statement The following is intended to outline

More information

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved. reserved. Insert Information Protection Policy Classification from Slide 8

1 Copyright 2011, Oracle and/or its affiliates. All rights reserved. reserved. Insert Information Protection Policy Classification from Slide 8 The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material,

More information

Next-Generation Cloud Platform

Next-Generation Cloud Platform Next-Generation Cloud Platform Jangwoo Kim Jun 24, 2013 E-mail: jangwoo@postech.ac.kr High Performance Computing Lab Department of Computer Science & Engineering Pohang University of Science and Technology

More information

Oracle Big Data Fundamentals Ed 2

Oracle Big Data Fundamentals Ed 2 Oracle University Contact Us: 1.800.529.0165 Oracle Big Data Fundamentals Ed 2 Duration: 5 Days What you will learn In the Oracle Big Data Fundamentals course, you learn about big data, the technologies

More information

Hive and Shark. Amir H. Payberah. Amirkabir University of Technology (Tehran Polytechnic)

Hive and Shark. Amir H. Payberah. Amirkabir University of Technology (Tehran Polytechnic) Hive and Shark Amir H. Payberah amir@sics.se Amirkabir University of Technology (Tehran Polytechnic) Amir H. Payberah (Tehran Polytechnic) Hive and Shark 1393/8/19 1 / 45 Motivation MapReduce is hard to

More information

DATA INTEGRATION PLATFORM CLOUD. Experience Powerful Data Integration in the Cloud

DATA INTEGRATION PLATFORM CLOUD. Experience Powerful Data Integration in the Cloud DATA INTEGRATION PLATFORM CLOUD Experience Powerful Integration in the Want a unified, powerful, data-driven solution for all your data integration needs? Oracle Integration simplifies your data integration

More information

Data Analytics using MapReduce framework for DB2's Large Scale XML Data Processing

Data Analytics using MapReduce framework for DB2's Large Scale XML Data Processing IBM Software Group Data Analytics using MapReduce framework for DB2's Large Scale XML Data Processing George Wang Lead Software Egnineer, DB2 for z/os IBM 2014 IBM Corporation Disclaimer and Trademarks

More information

Oracle Big Data Fundamentals Ed 1

Oracle Big Data Fundamentals Ed 1 Oracle University Contact Us: +0097143909050 Oracle Big Data Fundamentals Ed 1 Duration: 5 Days What you will learn In the Oracle Big Data Fundamentals course, learn to use Oracle's Integrated Big Data

More information

TPCX-BB (BigBench) Big Data Analytics Benchmark

TPCX-BB (BigBench) Big Data Analytics Benchmark TPCX-BB (BigBench) Big Data Analytics Benchmark Bhaskar D Gowda Senior Staff Engineer Analytics & AI Solutions Group Intel Corporation bhaskar.gowda@intel.com 1 Agenda Big Data Analytics & Benchmarks Industry

More information

Introduction to Oracle NoSQL Database

Introduction to Oracle NoSQL Database Introduction to Oracle NoSQL Database Anand Chandak Ashutosh Naik Agenda NoSQL Background Oracle NoSQL Database Overview Technical Features & Performance Use Cases 2 Why NoSQL? 1. The four V s of Big Data

More information

DBAs can use Oracle Application Express? Why?

DBAs can use Oracle Application Express? Why? DBAs can use Oracle Application Express? Why? 20. Jubilarna HROUG Konferencija October 15, 2015 Joel R. Kallman Director, Software Development Oracle Application Express, Server Technologies Division Copyright

More information

CIS 601 Graduate Seminar. Dr. Sunnie S. Chung Dhruv Patel ( ) Kalpesh Sharma ( )

CIS 601 Graduate Seminar. Dr. Sunnie S. Chung Dhruv Patel ( ) Kalpesh Sharma ( ) Guide: CIS 601 Graduate Seminar Presented By: Dr. Sunnie S. Chung Dhruv Patel (2652790) Kalpesh Sharma (2660576) Introduction Background Parallel Data Warehouse (PDW) Hive MongoDB Client-side Shared SQL

More information

Copyright 2017, Oracle and/or its affiliates. All rights reserved.

Copyright 2017, Oracle and/or its affiliates. All rights reserved. Using Oracle Columnar Technologies Across the Information Lifecycle Roger MacNicol Software Architect Data Storage Technology Safe Harbor Statement The following is intended to outline our general product

More information

Big Data Hadoop Course Content

Big Data Hadoop Course Content Big Data Hadoop Course Content Topics covered in the training Introduction to Linux and Big Data Virtual Machine ( VM) Introduction/ Installation of VirtualBox and the Big Data VM Introduction to Linux

More information

New Oracle NoSQL Database APIs that Speed Insertion and Retrieval

New Oracle NoSQL Database APIs that Speed Insertion and Retrieval New Oracle NoSQL Database APIs that Speed Insertion and Retrieval O R A C L E W H I T E P A P E R F E B R U A R Y 2 0 1 6 1 NEW ORACLE NoSQL DATABASE APIs that SPEED INSERTION AND RETRIEVAL Introduction

More information

IaaS Vendor Comparison

IaaS Vendor Comparison IaaS Vendor Comparison Analysis of competitor products Tobias Deml Senior Systemberater BU Cloud & Core Technologies February 01, 2018 2 Tobias Deml Senior Systemberater BU Cloud & Core Technologies Topics

More information

Big Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara

Big Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara Big Data Technology Ecosystem Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara Agenda End-to-End Data Delivery Platform Ecosystem of Data Technologies Mapping an End-to-End Solution Case

More information

MODERN BIG DATA DESIGN PATTERNS CASE DRIVEN DESINGS

MODERN BIG DATA DESIGN PATTERNS CASE DRIVEN DESINGS MODERN BIG DATA DESIGN PATTERNS CASE DRIVEN DESINGS SUJEE MANIYAM FOUNDER / PRINCIPAL @ ELEPHANT SCALE www.elephantscale.com sujee@elephantscale.com HI, I M SUJEE MANIYAM Founder / Principal @ ElephantScale

More information

Verarbeitung von Vektor- und Rasterdaten auf der Hadoop Plattform DOAG Spatial and Geodata Day 2016

Verarbeitung von Vektor- und Rasterdaten auf der Hadoop Plattform DOAG Spatial and Geodata Day 2016 Verarbeitung von Vektor- und Rasterdaten auf der Hadoop Plattform DOAG Spatial and Geodata Day 2016 Hans Viehmann Product Manager EMEA ORACLE Corporation 12. Mai 2016 Safe Harbor Statement The following

More information

Intro to Big Data on AWS Igor Roiter Big Data Cloud Solution Architect

Intro to Big Data on AWS Igor Roiter Big Data Cloud Solution Architect Intro to Big Data on AWS Igor Roiter Big Data Cloud Solution Architect Igor Roiter Big Data Cloud Solution Architect Working as a Data Specialist for the last 11 years 9 of them as a Consultant specializing

More information

Performance Innovations with Oracle Database In-Memory

Performance Innovations with Oracle Database In-Memory Performance Innovations with Oracle Database In-Memory Eric Cohen Solution Architect Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information

More information

Evolution of the Logging Service Hands-on Hadoop Proof of Concept for CALS-2.0

Evolution of the Logging Service Hands-on Hadoop Proof of Concept for CALS-2.0 Evolution of the Logging Service Hands-on Hadoop Proof of Concept for CALS-2.0 Chris Roderick Marcin Sobieszek Piotr Sowinski Nikolay Tsvetkov Jakub Wozniak Courtesy IT-DB Agenda Intro to CALS System Hadoop

More information

Oracle GoldenGate for Big Data

Oracle GoldenGate for Big Data Oracle GoldenGate for Big Data The Oracle GoldenGate for Big Data 12c product streams transactional data into big data systems in real time, without impacting the performance of source systems. It streamlines

More information

Solaris Engineered Systems

Solaris Engineered Systems Solaris Engineered Systems SPARC SuperCluster Introduction Andy Harrison andy.harrison@oracle.com Engineered Systems, Revenue Product Engineering The following is intended to outline

More information

Continuous delivery of Java applications. Marek Kratky Principal Sales Consultant Oracle Cloud Platform. May, 2016

Continuous delivery of Java applications. Marek Kratky Principal Sales Consultant Oracle Cloud Platform. May, 2016 Continuous delivery of Java applications using Oracle Cloud Platform Services Marek Kratky Principal Sales Consultant Oracle Cloud Platform May, 2016 Safe Harbor Statement The following is intended to

More information

Micron and Hortonworks Power Advanced Big Data Solutions

Micron and Hortonworks Power Advanced Big Data Solutions Micron and Hortonworks Power Advanced Big Data Solutions Flash Energizes Your Analytics Overview Competitive businesses rely on the big data analytics provided by platforms like open-source Apache Hadoop

More information

Automating Information Lifecycle Management with

Automating Information Lifecycle Management with Automating Information Lifecycle Management with Oracle Database 2c The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated

More information

Lecture 7 (03/12, 03/14): Hive and Impala Decisions, Operations & Information Technologies Robert H. Smith School of Business Spring, 2018

Lecture 7 (03/12, 03/14): Hive and Impala Decisions, Operations & Information Technologies Robert H. Smith School of Business Spring, 2018 Lecture 7 (03/12, 03/14): Hive and Impala Decisions, Operations & Information Technologies Robert H. Smith School of Business Spring, 2018 K. Zhang (pic source: mapr.com/blog) Copyright BUDT 2016 758 Where

More information

Was ist dran an einer spezialisierten Data Warehousing platform?

Was ist dran an einer spezialisierten Data Warehousing platform? Was ist dran an einer spezialisierten Data Warehousing platform? Hermann Bär Oracle USA Redwood Shores, CA Schlüsselworte Data warehousing, Exadata, specialized hardware proprietary hardware Introduction

More information

FlashGrid Software Enables Converged and Hyper-Converged Appliances for Oracle* RAC

FlashGrid Software Enables Converged and Hyper-Converged Appliances for Oracle* RAC white paper FlashGrid Software Intel SSD DC P3700/P3600/P3500 Topic: Hyper-converged Database/Storage FlashGrid Software Enables Converged and Hyper-Converged Appliances for Oracle* RAC Abstract FlashGrid

More information

NOSQL DATABASE CLOUD SERVICE. Flexible Data Models. Zero Administration. Automatic Scaling.

NOSQL DATABASE CLOUD SERVICE. Flexible Data Models. Zero Administration. Automatic Scaling. NOSQL DATABASE CLOUD SERVICE Flexible Data Models. Zero Administration. Automatic Scaling. Application development with no hassle... Oracle NoSQL Cloud Service is a fully managed NoSQL database cloud service

More information

Oracle Big Data SQL User's Guide. Release 3.2.1

Oracle Big Data SQL User's Guide. Release 3.2.1 Oracle Big Data SQL User's Guide Release 3.2.1 E87609-06 May 2018 Oracle Big Data SQL User's Guide, Release 3.2.1 E87609-06 Copyright 2012, 2018, Oracle and/or its affiliates. All rights reserved. This

More information

Cisco and Cloudera Deliver WorldClass Solutions for Powering the Enterprise Data Hub alerts, etc. Organizations need the right technology and infrastr

Cisco and Cloudera Deliver WorldClass Solutions for Powering the Enterprise Data Hub alerts, etc. Organizations need the right technology and infrastr Solution Overview Cisco UCS Integrated Infrastructure for Big Data and Analytics with Cloudera Enterprise Bring faster performance and scalability for big data analytics. Highlights Proven platform for

More information

Blended Learning Outline: Developer Training for Apache Spark and Hadoop (180404a)

Blended Learning Outline: Developer Training for Apache Spark and Hadoop (180404a) Blended Learning Outline: Developer Training for Apache Spark and Hadoop (180404a) Cloudera s Developer Training for Apache Spark and Hadoop delivers the key concepts and expertise need to develop high-performance

More information

Cloud Computing & Visualization

Cloud Computing & Visualization Cloud Computing & Visualization Workflows Distributed Computation with Spark Data Warehousing with Redshift Visualization with Tableau #FIUSCIS School of Computing & Information Sciences, Florida International

More information

Hive SQL over Hadoop

Hive SQL over Hadoop Hive SQL over Hadoop Antonino Virgillito THE CONTRACTOR IS ACTING UNDER A FRAMEWORK CONTRACT CONCLUDED WITH THE COMMISSION Introduction Apache Hive is a high-level abstraction on top of MapReduce Uses

More information

Big Data Syllabus. Understanding big data and Hadoop. Limitations and Solutions of existing Data Analytics Architecture

Big Data Syllabus. Understanding big data and Hadoop. Limitations and Solutions of existing Data Analytics Architecture Big Data Syllabus Hadoop YARN Setup Programming in YARN framework j Understanding big data and Hadoop Big Data Limitations and Solutions of existing Data Analytics Architecture Hadoop Features Hadoop Ecosystem

More information

Big Spatial Data Performance With Oracle Database 12c. Daniel Geringer Spatial Solutions Architect

Big Spatial Data Performance With Oracle Database 12c. Daniel Geringer Spatial Solutions Architect Big Spatial Data Performance With Oracle Database 12c Daniel Geringer Spatial Solutions Architect Oracle Exadata Database Machine Engineered System 2 What Is the Oracle Exadata Database Machine? Oracle

More information

CERTIFICATE IN SOFTWARE DEVELOPMENT LIFE CYCLE IN BIG DATA AND BUSINESS INTELLIGENCE (SDLC-BD & BI)

CERTIFICATE IN SOFTWARE DEVELOPMENT LIFE CYCLE IN BIG DATA AND BUSINESS INTELLIGENCE (SDLC-BD & BI) CERTIFICATE IN SOFTWARE DEVELOPMENT LIFE CYCLE IN BIG DATA AND BUSINESS INTELLIGENCE (SDLC-BD & BI) The Certificate in Software Development Life Cycle in BIGDATA, Business Intelligence and Tableau program

More information

Hadoop An Overview. - Socrates CCDH

Hadoop An Overview. - Socrates CCDH Hadoop An Overview - Socrates CCDH What is Big Data? Volume Not Gigabyte. Terabyte, Petabyte, Exabyte, Zettabyte - Due to handheld gadgets,and HD format images and videos - In total data, 90% of them collected

More information

Oracle Data Integrator 12c New Features

Oracle Data Integrator 12c New Features Oracle Data Integrator 12c New Features Joachim Jaensch Principal Sales Consultant Copyright 2014 Oracle and/or its affiliates. All rights reserved. Safe Harbor Statement The following is intended to outline

More information