Oracle Big Data SQL. Release 3.2. Rich SQL Processing on All Data


ORACLE BIG DATA SQL DATA SHEET

Oracle Big Data SQL Release 3.2

The unprecedented explosion in data that can be made useful to enterprises, from the Internet of Things to the social streams of global customer bases, has created a tremendous opportunity for businesses. However, with the enormous possibilities of Big Data, there can also be enormous complexity. Integrating Big Data systems to leverage these vast new data resources with existing information estates can be challenging. Valuable data may be stored in a system separate from where the majority of business-critical operations take place. Moreover, accessing this data may require significant investment in re-developing code for analysis and reporting, delaying access to data as well as reducing the ultimate value of the data to the business.

Oracle Big Data SQL enables organizations to immediately analyze data across Apache Hadoop, Apache Kafka, NoSQL and Oracle Database, leveraging their existing SQL skills, security policies and applications with extreme performance. From simplifying data science efforts to unlocking data lakes, Big Data SQL makes the benefits of Big Data available to the largest group of end users possible.

KEY FEATURES
- Seamlessly query data across Oracle Database, Hadoop, Kafka and NoSQL sources
- Runs all Oracle SQL queries without modification, preserving application investment
- Smart Scan on Hadoop, Kafka, NoSQL and Oracle tablespaces enhances performance by parsing and intelligently filtering data where it resides
- Enables Oracle Database 12c access to leading Hadoop distributions
- Oracle Database Security features provide single access control to sensitive data across Oracle Database, Hadoop, Kafka and NoSQL data
- Easily copy data from Oracle Database to Hadoop using Copy to Hadoop

KEY BENEFITS
- Transparently analyze data sets across Hadoop, Kafka, NoSQL and Oracle Database
- Achieve fast query performance by leveraging data local processing
- Use your existing SQL skills to analyze data across big data sources
- Current SQL-based applications can seamlessly integrate new data
- Seamlessly extend Information Lifecycle Management strategy to leverage lower cost Hadoop storage
- Use Oracle Database security policies to keep all sensitive data safe

Rich SQL Processing on All Data

Oracle Big Data SQL is a data virtualization innovation from Oracle. It is a new architecture and solution for SQL and other data APIs (such as REST and Node.js) on disparate data sets, seamlessly integrating data in Apache Hadoop, Apache Kafka and a number of NoSQL databases with data stored in Oracle Database. Using Oracle Big Data SQL, organizations can:
- Combine data from Oracle Database, Apache Hadoop, Apache Kafka and NoSQL in a single SQL query
- Query and analyze data in Apache Hadoop, Apache Kafka and NoSQL
- Maximize query performance on all data using advanced techniques like Smart Scan, Partition Pruning, Storage Indexes, Bloom Filters and Predicate Push-Down in a distributed architecture
- Integrate big data analyses into existing applications and architectures
- Extend security and access policies from Oracle Database to data in Apache Hadoop, Apache Kafka and NoSQL

Enhanced External Tables

When dealing with large data sets stored in disparate systems, it can be difficult to know where your data is, let alone understand how the data is structured. Big Data SQL adds new external table types to Oracle Database 12c, which give users a single location to catalog and secure data in Hadoop, Kafka and NoSQL systems: the Oracle Database. Big Data SQL keeps track of the metadata about external data sources (both clusters and the tables within them) without moving or copying data. External tables for Big Data SQL provide:
- Seamless metadata integration and queries which join data from Oracle Database with data from Hadoop, Kafka and NoSQL databases
- Automatic mappings from metadata stored in HCatalog (or the Hive Metastore) to Oracle tables
- Multiple cluster support to allow one Oracle Database to query multiple Hadoop clusters
- Enhanced access parameters to give database administrators the flexibility to control column mapping and data access behavior

Figure 1. Oracle Big Data SQL enables Oracle SQL queries to span Oracle Database, Apache Hadoop, Apache Kafka and selected NoSQL data stores.

Smart Scan: Data-Driven Parallel Processing

Finding insights from Big Data can mean sifting through an extraordinary amount of data. With the massive increase in data volumes that Big Data brings, analytical performance can only be achieved by moving the analytics to the data, not the other way around. Big Data SQL applies the power of Smart Scan, first introduced in Oracle's best-in-class Exadata Database Machine, to big data stores. Smart Scan enables Oracle SQL operations to be pushed down to the storage tiers of the Big Data system. Paired with the horizontal scalability of these storage systems, Smart Scan automatically provides parallel processing equal to your biggest data set, enabling:

- Locally filtered data, so that only the rows and columns relevant to your query are transmitted to Oracle Database
- Join optimization via Bloom filters, speeding up joins between data in Oracle Database and massive data in Hadoop
- Scoring for data mining models and enhanced processing for querying document data sets in, for example, JSON or XML
- Oracle-native operators, providing complete fidelity between queries run with Big Data SQL and Oracle Database alone

Storage Indexing: More Effective I/O

In addition to the set of Smart Scan features, Oracle Big Data SQL provides Storage Index technology to speed up processing before any I/O occurs. As data is accessed, Oracle Big Data SQL automatically builds local, in-memory indexes that capture where relevant data is stored. On subsequent queries of the same data, Storage Index technology ensures that data blocks that are not relevant to the query are not read. Because data blocks in Big Data systems can be very large (up to hundreds of megabytes), this I/O skipping strategy can improve performance on some queries by orders of magnitude.

Predicate Push-Down: Harness External Storage Systems

Oracle Big Data SQL not only enables easy integration of data from Hadoop and NoSQL sources; it also leverages the underlying storage mechanisms to provide the best possible performance. Big Data SQL's predicate push-down technology allows predicates in queries issued in Oracle Database to be executed by remote systems, and to be pushed into certain file formats. Using predicate push-down, Big Data SQL enables you to:
- Prune partitions from tables managed by Apache Hive
- Minimize I/O on files stored in Apache Parquet and Apache ORC formats
- Enable remote reads on data stored in Oracle NoSQL Database or Apache HBase

Query Streams: SQL Access to Kafka Topics

Apache Kafka is a distributed, scalable, fault-tolerant messaging system. Organizations use Kafka as a central hub for delivering real-time streams of data. Instead of systems communicating directly with one another, applications publish messages to a Kafka topic, and these messages are then consumed by other applications. Big Data SQL supports direct access to Kafka topics, enabling SQL queries to combine near real-time events with data from Oracle Database and big data stores.

Extend Information Lifecycle Management to Hadoop

For many years, Oracle Database has provided rich support for Information Lifecycle Management (ILM). Numerous capabilities are available for data tiering, that is, storing data in different media based on access requirements and storage cost considerations. These tiers may scale from 1) in-memory for real-time data analysis, 2) Database Flash for frequently accessed data, 3) Database Storage and Exadata Cells for queries of operational data, and 4) Hadoop for infrequently accessed raw and archive data.
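As a rough illustration of the Storage Index behavior described earlier in this data sheet, the sketch below keeps a per-block minimum and maximum for one column and uses them to skip blocks on later scans. It is a toy model only: the names `Block`, `StorageIndex` and `scan` are hypothetical, and the data sheet does not describe how Oracle Big Data SQL implements its indexes internally.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical stand-in for a large storage block holding rows."""
    block_id: int
    rows: list  # list of dicts, one per row


class StorageIndex:
    """Toy in-memory index: remembers (min, max) of one column per block."""

    def __init__(self, column):
        self.column = column
        self.ranges = {}  # block_id -> (min_value, max_value)

    def record(self, block):
        values = [row[self.column] for row in block.rows]
        if values:  # nothing to learn from an empty block
            self.ranges[block.block_id] = (min(values), max(values))

    def may_contain(self, block_id, low, high):
        if block_id not in self.ranges:
            return True  # never scanned before, so it must be read
        block_min, block_max = self.ranges[block_id]
        # Skip only when the block's range cannot overlap [low, high].
        return not (block_max < low or block_min > high)


def scan(blocks, index, low, high):
    """Return rows with low <= column <= high and the ids of blocks read."""
    matches, blocks_read = [], []
    for block in blocks:
        if not index.may_contain(block.block_id, low, high):
            continue  # I/O avoided: the index rules this block out
        blocks_read.append(block.block_id)
        index.record(block)  # the index is built as data is accessed
        matches.extend(
            row for row in block.rows if low <= row[index.column] <= high
        )
    return matches, blocks_read


blocks = [
    Block(0, [{"amount": 5}, {"amount": 12}]),
    Block(1, [{"amount": 100}, {"amount": 150}]),
    Block(2, [{"amount": 40}, {"amount": 60}]),
]
index = StorageIndex("amount")
first_rows, first_read = scan(blocks, index, 90, 200)    # cold: reads every block
second_rows, second_read = scan(blocks, index, 90, 200)  # warm: reads block 1 only
```

The first scan has no index entries and reads all three blocks; the repeated scan reads only the block whose recorded range overlaps the predicate, which is the I/O-skipping effect the section describes.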

Copy to Hadoop

Copying data from Oracle Database to Hadoop can be complicated. Oracle Big Data SQL includes the Oracle Copy to Hadoop utility, which simplifies copying Oracle data to the Hadoop Distributed File System (HDFS). Data copied to the Hadoop cluster by Copy to Hadoop is stored in Oracle Data Pump format. This format optimizes queries through Big Data SQL: 1) the data is stored as Oracle data types, eliminating data type conversions, and 2) the data is queried directly, without the overhead associated with Java SerDes. Native Hadoop tools like Hive can easily access these same Oracle Data Pump export files using optimized input format classes.

Smart Scan on Tablespaces in HDFS

Oracle Partitioning is the enabling technology that allows a single table's data partitions to be stored on the various tiers. This enables immutable archive data within a table to reside in Hadoop in highly optimized Oracle Database storage formats. Database queries seamlessly access this archive data as they would any other data, exploiting the optimized access and storage structures (like indexes) for fast query performance. In addition, Big Data SQL's Smart Scan capabilities enable compound performance benefits. Big Data SQL Smart Scan uses the massively parallel processing power of the Hadoop cluster to filter data at its source, greatly reducing data movement and network traffic between the cluster and the database.

Oracle Database Security on Big Data

Oracle Big Data SQL's unique approach to integrating data enables applications to automatically leverage underlying data authorization rules (i.e., access privileges on files in HDFS) and then layer advanced Oracle Database Security policies on top of them. This approach both simplifies secure implementations and enables the use of Oracle security features that are unavailable on the underlying stores. Using Oracle security mechanisms, you can secure Big Data using:
- Standard Oracle Database roles and privileges to govern access to data
- Data redaction, to ensure that sensitive information is obscured when accessed by unauthorized users
- Virtual Private Databases to better enforce governance policies
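The layering of policies listed above can be pictured with a small conceptual sketch: a row-level filter (in the spirit of Virtual Private Database) is applied first, then a column is redacted for users who lack a privileged role. Everything here is hypothetical and for illustration only, including the `User` type, the role names, the `region` and `card_number` columns, and the masking rule; it is not Oracle's security API, where such policies are declared in the database rather than written in application code.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    """Hypothetical user with a set of roles and a home region."""
    name: str
    roles: set = field(default_factory=set)
    region: str = ""


def redact(value):
    """Mask everything except the last four characters."""
    text = str(value)
    return "*" * max(len(text) - 4, 0) + text[-4:]


def secure_query(rows, user):
    """Apply a row-level policy, then column redaction, to a result set."""
    # Row-level policy: non-admin users only see rows for their own region.
    visible = [
        row for row in rows
        if "admin" in user.roles or row["region"] == user.region
    ]
    result = []
    for row in visible:
        out = dict(row)  # copy so the stored data is never modified
        # Redaction policy: only auditors see the full card number.
        if "auditor" not in user.roles:
            out["card_number"] = redact(out["card_number"])
        result.append(out)
    return result


rows = [
    {"customer": "A", "region": "EMEA", "card_number": "4111111111111111"},
    {"customer": "B", "region": "APAC", "card_number": "5500000000000004"},
]
analyst = User("ana", {"analyst"}, "EMEA")
auditor = User("aud", {"auditor", "admin"})

analyst_view = secure_query(rows, analyst)  # one EMEA row, card number masked
auditor_view = secure_query(rows, auditor)  # both rows, unmasked
```

The point of the sketch is that the same rules apply wherever the rows originate: because the policies are enforced at the query layer, data read from Hadoop, Kafka or NoSQL is filtered and masked in the same way as data stored in the database.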

Supports a Range of Big Data Deployments

Oracle Big Data SQL is designed to support a wide range of deployment options and platforms, both in the Oracle Cloud and on-premises. Oracle Big Data SQL Cloud Service enables Oracle Exadata Cloud Service to analyze data available in Oracle Big Data Cloud Service. For on-premises deployments, Big Data SQL requires 1) any Oracle Database 12c (version 12.1.0.2 or ) running Enterprise Linux and 2) leading Apache Hadoop distributions from Cloudera and Hortonworks.

Big Data SQL achieves highest performance when paired with Oracle Engineered Systems or in the Oracle Cloud. Big Data SQL takes full advantage of the power of Oracle Exadata and Oracle Big Data Appliance to create a best-in-class Big Data Management System, unifying the power of big data and Oracle Database.

Oracle Database Version | Database Hardware | Hadoop Cluster Hardware | Hadoop Distribution and Version
12.1.0.2 or | Oracle Exadata (Linux OL6) or any Intel x86 64-bit system (Linux OL6, RHEL6) | Any Intel x86 64-bit system (Linux OL6, OL7, RHEL6, RHEL7) | CDH* 5.5 and HDP** 2.3 and
12.1.0.2 or | Oracle Exadata (Linux OL6) or any Intel x86 64-bit system (Linux OL6, RHEL6) | Oracle Big Data Appliance (Linux OL5 and OL6) | CDH 5.5 and

* CDH: Cloudera's Distribution Including Apache Hadoop
** HDP: Hortonworks Data Platform

Getting Started

Try using Oracle Big Data SQL as well as other components of Oracle's big data platform in Oracle Big Data Lite Virtual Machine (http://www.oracle.com/technetwork/database/bigdata-appliance/oracle-bigdatalite-2104726.html). Big Data Lite allows you to test drive Oracle's big data capabilities from your laptop or desktop computer. Capabilities include CDH, Oracle Big Data Spatial and Graph, Oracle Big Data Discovery, Oracle Big Data Connectors, Oracle Data Integrator, Oracle GoldenGate and more.

CONTACT US
For more information about Oracle Big Data SQL, visit oracle.com or call +1.800.ORACLE1 to speak to an Oracle representative.

CONNECT WITH US
blogs.oracle.com/oracle
facebook.com/oracle
twitter.com/oracle
oracle.com

Copyright 2016, Oracle and/or its affiliates. All rights reserved. This document is provided for information purposes only, and the contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this document, and no contractual obligations are formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without our prior written permission.

Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners. Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or registered trademarks of Advanced Micro Devices. UNIX is a registered trademark of The Open Group. 0116