Part 1 Configuring Oracle Big Data SQL

Oracle Big Data, Data Science, Advanced Analytics & Oracle NoSQL Database

Securely analyze data across the big data platform, whether that data resides in Oracle Database 12c, in Hadoop, or in a combination of these sources. You will be able to leverage your existing Oracle skill sets and applications to gain these insights, applying Oracle's rich SQL dialect and security policies across the data platform and greatly simplifying the ability to gain insights from all your data.

There are two parts to Big Data SQL:
o Enhanced Oracle external tables
o Oracle Big Data SQL Server, which applies Smart Scan over data stored in Hadoop in order to achieve fast performance (Big Data SQL Server is available on the Oracle Big Data Appliance only, not on the VM/OVA)

Part 1 Configuring Oracle Big Data SQL

Copy/download the bigdatasql_hol_otn_setup.sql and bigdatasql_hol.sql files. Run the bigdatasql_hol_otn_setup.sql script in SQL Developer; when prompted for a connection, select the moviedemo connection and click OK. This completes the setup for this tutorial; the bigdatasql_hol.sql script contains the DEMO steps.

The virtual environment for this tutorial is mostly preconfigured for Oracle Big Data SQL. Six simple tasks are required to configure it:
1. Create the Common Directory and a Cluster Directory on the Exadata server. DONE.
2. Create and populate the bigdata.properties file in the Common Directory. DONE.
3. Copy the Hadoop configuration files into the Cluster Directory. DONE.
4. Create corresponding Oracle directory objects that reference these configuration directories.
5. Install Oracle Big Data SQL on the BDA using Mammoth, the BDA's installation and configuration utility. DONE.
6. Install a CDH client on each Exadata server. DONE.
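Tasks 1-3 amount to a handful of file-system operations. The sketch below mirrors them against a scratch directory instead of the real /u01 path; the file contents are illustrative placeholders, not the lab's actual configuration.

```python
from pathlib import Path
import tempfile

# Sketch of configuration tasks 1-3, run against a scratch directory rather
# than the real clusterwide path (/u01/bigdatasql_config). Illustrative only.
base = Path(tempfile.mkdtemp())

# Task 1: the Common Directory, with a Cluster Directory named after the
# cluster (case sensitive) as a subdirectory.
common = base / "bigdatasql_config"
cluster = common / "bigdatalite"
cluster.mkdir(parents=True)

# Task 2: create and populate bigdata.properties in the Common Directory.
# A single illustrative property is shown here.
(common / "bigdata.properties").write_text("bigdata.cluster.default=bigdatalite\n")

# Task 3: copy the Hadoop client configuration files into the Cluster
# Directory (in the lab these were downloaded via Cloudera Manager).
(cluster / "hive-site.xml").write_text("<configuration/>\n")

print(sorted(p.name for p in common.rglob("*")))
# ['bigdata.properties', 'bigdatalite', 'hive-site.xml']
```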

Common Directory

The Common Directory contains a few subdirectories and an important file named bigdata.properties. This file stores configuration information that is common to all BDA clusters; specifically, it contains property-value pairs used to configure the JVM and to identify a default cluster. For Exadata, the Common Directory must be on a clusterwide file system: it is critical that all Exadata Database nodes access exactly the same configuration information.

cd /u01/bigdatasql_config/
cat bigdata.properties

Notes: The properties, which are not specific to a Hadoop cluster, include items such as the location of the Java VM, classpaths and the LD_LIBRARY_PATH. The last line of the file specifies the default cluster property, in this case bigdatalite. As you will see later, the default cluster simplifies the definition of Oracle tables that access data in Hadoop.

Cluster Directory

The Cluster Directory contains the configuration files required to connect to a specific BDA cluster. It must be a subdirectory of the Common Directory, and the name of the directory is important: it is the name that you will use to identify the cluster. In our hands-on lab there is a single cluster, bigdatalite; the bigdatalite subdirectory contains the configuration files for that cluster. The name of the cluster must match the name of the subdirectory (and it is case sensitive!).

cd /u01/bigdatasql_config/bigdatalite
ls

Notes: These are the files required to connect Oracle Database to HDFS and to Hive. Although not required, in our example these files were retrieved by using Cloudera Manager. The screenshot below shows the home page for a Cloudera Manager cluster. In our example, we selected View Client URLs from the actions menu and then downloaded the configuration files for both YARN and Hive into the Cluster Directory.
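To make the description concrete, a bigdata.properties file along these lines could be expected. The property names and paths below are illustrative placeholders patterned on the description above (JVM location, classpaths, LD_LIBRARY_PATH, default cluster); use cat bigdata.properties to see the lab's actual file.

```properties
# Illustrative bigdata.properties (placeholder paths, not the lab's values)
java.libjvm.file=/usr/java/latest/jre/lib/amd64/server/libjvm.so
java.classpath.oracle=/u01/bigdatasql_config/jlib/*
java.classpath.hadoop=/usr/lib/hadoop/*:/usr/lib/hadoop/lib/*
java.classpath.hive=/usr/lib/hive/lib/*
LD_LIBRARY_PATH=/usr/lib/hadoop/lib/native
# Last line: the default cluster, used when a table does not name one
bigdata.cluster.default=bigdatalite
```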

Create the Corresponding Oracle Directory Objects (Task #4)

ORACLE_BIGDATA_CONFIG: the Oracle directory object that references the Common Directory.
ORA_BIGDATA_CL_bigdatalite: the Oracle directory object that references the Cluster Directory. The naming convention for this directory object is as follows:
o Begins with ORA_BIGDATA_CL_
o Followed by the cluster name (i.e. "bigdatalite"). This name is case sensitive and is limited to 15 characters.
o Must match the physical directory name in the file system (repeat: it's case sensitive!).

SQL> create or replace directory ORACLE_BIGDATA_CONFIG as '/u01/bigdatasql_config';
SQL> create or replace directory "ORA_BIGDATA_CL_bigdatalite" as '';

Notice that no location is specified for the Cluster Directory. The directory is expected to be a subdirectory of ORACLE_BIGDATA_CONFIG, named after the cluster as identified by the Oracle directory object.

Recommended practice: In addition to the Oracle directory objects, you should also create the Big Data SQL Multithreaded Agent (MTA). (Already done as pre-configuration.) This agent bridges the metadata between Oracle Database and Hadoop. Technically, the MTA allows the external process to be multithreaded instead of launching a JVM for every process (which can be quite slow).

SQL> create public database link BDSQL$_bigdatalite using 'extproc_connection_data';
SQL> create public database link BDSQL$_DEFAULT_CLUSTER using 'extproc_connection_data';
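The naming rules above can be sketched as a small checker. This is an illustrative helper, not part of the lab scripts; the prefix and the 15-character limit come from the convention just described.

```python
# Sketch: the cluster-directory naming rules described above, as a checker.
PREFIX = "ORA_BIGDATA_CL_"

def cluster_directory_object_name(cluster_name: str) -> str:
    """Build the Oracle directory object name for a Cluster Directory.

    The cluster name is case sensitive, must match the physical
    subdirectory of the Common Directory, and is limited to 15 characters.
    """
    if len(cluster_name) > 15:
        raise ValueError("cluster name is limited to 15 characters")
    return PREFIX + cluster_name

print(cluster_directory_object_name("bigdatalite"))  # ORA_BIGDATA_CL_bigdatalite
```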

Part 2 Create Oracle Table Over Application Log

The movie application streamed data into HDFS, specifically into the following directory: /user/oracle/moviework/applog_json

Execute the following commands to review the log file stored in HDFS:

hadoop fs -ls /user/oracle/moviework/applog_json
hadoop fs -tail /user/oracle/moviework/applog_json/movieapp_log_json.log

The JSON log captures every click that happened on the web site.

Create the Oracle table:

SQL> CREATE TABLE movielog
  (click VARCHAR2(4000))
  ORGANIZATION EXTERNAL
  (TYPE ORACLE_HDFS
   DEFAULT DIRECTORY DEFAULT_DIR
   LOCATION ('/user/oracle/moviework/applog_json/'))
  REJECT LIMIT UNLIMITED;

SQL> SELECT * FROM movielog WHERE rownum < 20;

SQL> CREATE TABLE movielog_plus
  (click VARCHAR2(40))
  ORGANIZATION EXTERNAL
  (TYPE ORACLE_HDFS
   DEFAULT DIRECTORY DEFAULT_DIR
   ACCESS PARAMETERS (
     com.oracle.bigdata.cluster=bigdatalite
     com.oracle.bigdata.overflow={"action":"truncate"}
   )
   LOCATION ('/user/oracle/moviework/applog_json/'))
  REJECT LIMIT UNLIMITED;

The click column has been changed to a VARCHAR2(40). Clearly, this is going to be a problem: the length of a JSON document exceeds that size. There are numerous ways to handle this situation, including:
o Generate an error and then either reject the record, set its value to null, or replace it with an alternate value.
o Simply truncate the data.

Here, we are truncating the data, and we have applied the truncate action to all columns in the table; you can also specify the individual column(s) to truncate. A cluster, bigdatalite, has been specified; this cluster will be used instead of the default (which in this case happens to be the same). Currently, a given session may only connect to a single cluster.

SQL> SELECT * FROM movielog_plus WHERE rownum < 20;

Oracle Database 12c includes native JSON support, which allows queries to easily extract attribute data from JSON documents.
Run the following query in SQL Developer:

SQL> SELECT m.click.custid, m.click.movieid, m.click.genreid, m.click.time
FROM movielog m
WHERE rownum < 20;

The column specification in the select list is a full path to the JSON attribute. The specification starts with the table alias "m" (note: the alias is required!), followed by the column name "click", and then a case-sensitive JSON path (e.g. "genreid").
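Conceptually, the dot-notation query does per row what this sketch does for one record. The field names (custid, movieid, genreid, time) come from the lab's queries; the values are made up.

```python
import json

# Each row of movielog holds one JSON document in the "click" column; the
# path after the column name selects an attribute. Hypothetical record:
click = '{"custid":1185972,"movieid":11547,"genreid":45,"time":"2012-07-01:00:10:20","rating":4}'

doc = json.loads(click)
row = (doc["custid"], doc["movieid"], doc["genreid"], doc["time"])
print(row)  # (1185972, 11547, 45, '2012-07-01:00:10:20')
```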

Combine Data from Oracle Database and Hadoop

Combine the "click" data with data sourced from the movie dimension table in Oracle Database:

SQL> SELECT f.click.custid, m.title, m.year, m.gross, f.click.rating
FROM movielog f, movie m
WHERE f.click.movieid = m.movie_id
AND f.click.rating > 4;

Create a view to simplify queries against the JSON data:

SQL> CREATE OR REPLACE VIEW movielog_v AS
SELECT CAST(m.click.custid AS NUMBER) custid,
       CAST(m.click.movieid AS NUMBER) movieid,
       CAST(m.click.activity AS NUMBER) activity,
       CAST(m.click.genreid AS NUMBER) genreid,
       CAST(m.click.recommended AS VARCHAR2(1)) recommended,
       CAST(m.click.time AS VARCHAR2(20)) time,
       CAST(m.click.rating AS NUMBER) rating,
       CAST(m.click.price AS NUMBER) price
FROM movielog m;

Oracle SQL for MoviePlex: compare average ratings for the top 10 grossing movies:

SQL> SELECT m.title, m.year, m.gross, round(avg(f.rating), 1)
FROM movielog_v f, movie m
WHERE f.movieid = m.movie_id
GROUP BY m.title, m.year, m.gross
ORDER BY m.gross desc
FETCH FIRST 10 ROWS ONLY;

Part 3 Leverage the Hive Metastore to Access Data in Hadoop

Hive enables SQL access to data stored in Hadoop and NoSQL stores. There are two parts to Hive: the Hive execution engine and the Hive Metastore.

The Hive execution engine launches MapReduce job(s) based on the SQL that has been issued. MapReduce is a batch processing framework; it is not intended for interactive query and analysis, but it is extremely useful for querying massive data sets using the well-understood SQL language. Importantly, no coding (Java, Pig, etc.) is required. The SQL supported by Hive is still limited (roughly SQL-92), but improvements are being made over time.

The Hive Metastore has become the standard metadata repository for data stored in Hadoop. It contains the definitions of tables (table name, columns and data types), the location of the data files (e.g. a directory in HDFS), and the routines required to parse that data (e.g. StorageHandlers, InputFormats and SerDes, i.e. serializer/deserializers).
The same metadata can be shared across multiple products (e.g. Hive, Oracle Big Data SQL, Impala, Pig, Stinger, etc.).

Review the tables stored in Hive from the CLI:

hive> show tables;
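The kind of information the Metastore holds for a table can be pictured as a record like the following. The entry is modeled on this lab's movieapp_log_json table; the values are illustrative, not dumped from a real metastore.

```python
# The details the Hive Metastore records for a table, per the text above:
# table definition, data location, and the parsing routine (SerDe).
metastore_entry = {
    "table": "movieapp_log_json",
    "columns": [("custid", "int"), ("movieid", "int"), ("genreid", "int"),
                ("time", "string"), ("rating", "int")],
    "location": "hdfs://bigdatalite.localdomain:8020/user/oracle/moviework/applog_json",
    "serde": "org.apache.hive.hcatalog.data.JsonSerDe",
}

# Any engine (Hive, Big Data SQL, Impala, ...) can read the same entry to
# find the files and the class that parses them.
print(metastore_entry["location"].rsplit("/", 1)[-1])  # applog_json
```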

The movielog table is equivalent to the external table that was defined in Oracle Database in the previous exercise. Review the definition of the table by executing the following commands at the hive> prompt:

hive> show create table movielog;
hive> select * from movielog limit 10;

Because there are no columns in the select list and no filters applied, the query simply scans the file and returns the results; no MapReduce job is executed.

The second table queries that same file, but this time using a SerDe that translates the attributes into columns. Review the definition of the table by executing the following command:

hive> show create table movieapp_log_json;

There are columns defined for each field in the JSON document, making it much easier to understand and query the data. A Java class, org.apache.hive.hcatalog.data.JsonSerDe, is used to deserialize the JSON file.
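A sketch of the two schemas over one and the same log line (the line itself is hypothetical, in the lab's JSON format):

```python
import json

# The two Hive tables read the same file through different schemas:
# movielog exposes each line as a single string column, while
# movieapp_log_json uses a JSON SerDe to expose one column per attribute.
line = '{"custid":1135508,"movieid":240,"genreid":9,"rating":5}'

as_movielog = {"click": line}            # one VARCHAR-like column
as_movieapp_log_json = json.loads(line)  # one column per JSON field

print(as_movielog["click"] == line)      # True
print(as_movieapp_log_json["rating"])    # 5
```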

This is an illustration of Hadoop's schema-on-read paradigm: a file is stored in HDFS, but no schema is associated with it until that file is read. Our examples use two different schemas to read the same data; these schemas are encapsulated by the Hive tables movielog and movieapp_log_json.

The Hive query execution engine converts the HiveQL query into a MapReduce job. The author of the query does not need to worry about the underlying implementation; Hive handles this automatically.

hive> select * from movieapp_log_json where rating > 4;
hive> exit;

Leverage Hive Metadata When Creating Oracle Tables

Create a table over the Hive movieapp_log_json table using the following DDL:

SQL> CREATE TABLE movieapp_log_json
  (custid INTEGER,
   movieid INTEGER,
   genreid INTEGER,
   time VARCHAR2(20),
   recommended VARCHAR2(4),
   activity NUMBER,
   rating INTEGER,
   price NUMBER)
  ORGANIZATION EXTERNAL
  (TYPE ORACLE_HIVE
   DEFAULT DIRECTORY DEFAULT_DIR)
  REJECT LIMIT UNLIMITED;

SQL> SELECT * FROM movieapp_log_json WHERE rating > 4;

The ORACLE_HIVE access driver type invokes Oracle Big Data SQL at query compilation time to retrieve the metadata details from the Hive Metastore; the defaults can be overridden using ACCESS PARAMETERS. The metadata includes the location of the data and the classes required to process it (e.g. StorageHandlers, InputFormats and SerDes). The query scanned the files found in the /user/oracle/moviework/applog_json directory and then used the Hive SerDe to parse each JSON document. In a true Oracle Big Data Appliance environment, the input splits would be processed in parallel across the nodes of the cluster by the Big Data SQL Server, the data would be filtered locally using Smart Scan, and only the filtered results (rows and columns) would be returned to Oracle Database.
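The Smart Scan behavior just described can be sketched as follows: a simplified, single-process stand-in for the real distributed scan, over hypothetical rows.

```python
# Sketch of the Smart Scan idea: each node filters its own input split
# locally and ships back only the surviving rows and requested columns,
# rather than the whole file. Data below is hypothetical.
split = [
    {"custid": 1, "movieid": 240, "rating": 5, "price": 3.99},
    {"custid": 2, "movieid": 11,  "rating": 3, "price": 0.00},
    {"custid": 3, "movieid": 502, "rating": 5, "price": 1.99},
]

def smart_scan(rows, predicate, columns):
    """Apply the WHERE clause and the projection before returning results."""
    return [{c: r[c] for c in columns} for r in rows if predicate(r)]

# SELECT custid, movieid FROM ... WHERE rating > 4
result = smart_scan(split, lambda r: r["rating"] > 4, ["custid", "movieid"])
print(result)  # [{'custid': 1, 'movieid': 240}, {'custid': 3, 'movieid': 502}]
```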

The second Hive table is over the same movie log content, except the data is in Avro format rather than JSON text. Create an Oracle table over that Avro-based Hive table using the following command. Because the Oracle table name does not match the Hive table name, an ACCESS PARAMETER is specified that references the Hive table (default.movieapp_log_avro):

SQL> CREATE TABLE mylogdata
  (custid INTEGER,
   movieid INTEGER,
   genreid INTEGER,
   time VARCHAR2(20),
   recommended VARCHAR2(4),
   activity NUMBER,
   rating INTEGER,
   price NUMBER)
  ORGANIZATION EXTERNAL
  (TYPE ORACLE_HIVE
   DEFAULT DIRECTORY DEFAULT_DIR
   ACCESS PARAMETERS (
     com.oracle.bigdata.tablename=default.movieapp_log_avro
   ))
  REJECT LIMIT UNLIMITED;

SQL> SELECT custid, movieid, time FROM mylogdata;

To illustrate how Oracle Big Data SQL uses the Hive Metastore at query compilation to determine query execution parameters, simply change the definition of the Hive table movieapp_log_json. In Hive, alter the table's LOCATION so that it points to a directory containing only two records. The Oracle query then runs without any changes to the Oracle table movieapp_log_json:

hive> ALTER TABLE movieapp_log_json SET LOCATION "hdfs://bigdatalite.localdomain:8020/user/oracle/moviework/two_recs";
hive> SELECT * FROM movieapp_log_json;

SQL> SELECT * FROM movieapp_log_json;

Reset the Hive table and then confirm that there are more than two rows by executing the following commands:

hive> ALTER TABLE movieapp_log_json SET LOCATION "hdfs://bigdatalite.localdomain:8020/user/oracle/moviework/applog_json";
hive> select * from movieapp_log_json limit 10;
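What this demo shows can be sketched in a few lines: the table definition on the Oracle side stays fixed, while the data location is resolved from a metastore lookup at query time. All names and file contents below are hypothetical stand-ins.

```python
# Sketch of the ALTER TABLE ... SET LOCATION demo: the table definition never
# changes; the location is looked up in the (mock) metastore on every query,
# as Big Data SQL resolves it from the Hive Metastore at compilation time.
files = {
    "/user/oracle/moviework/applog_json": ["rec1", "rec2", "rec3"],
    "/user/oracle/moviework/two_recs": ["rec1", "rec2"],
}
metastore = {"movieapp_log_json": "/user/oracle/moviework/applog_json"}

def query(table):
    # Location resolved fresh for each query.
    return files[metastore[table]]

print(len(query("movieapp_log_json")))   # 3
metastore["movieapp_log_json"] = "/user/oracle/moviework/two_recs"
print(len(query("movieapp_log_json")))   # 2
```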

Part 4 Applying Oracle Database Security Policies Over Data in Hadoop

Oracle Database security features, including strong authentication, row-level access, data redaction, data masking, auditing and more, can be utilized to ensure that data remains safe even when it resides in Hadoop/HDFS. For example, to protect personally identifiable information, including the customer last name and customer id, an Oracle Data Redaction policy has already been set up on the customer table that obscures these two fields. This was accomplished by using the DBMS_REDACT PL/SQL package:

SQL> BEGIN
  DBMS_REDACT.ADD_POLICY(
    object_schema       => 'MOVIEDEMO',
    object_name         => 'CUSTOMER',
    column_name         => 'CUST_ID',
    policy_name         => 'customer_redaction',
    function_type       => DBMS_REDACT.PARTIAL,
    function_parameters => '9,1,7',
    expression          => '1=1');
END;
/

This creates a policy called customer_redaction:
o It is applied to the cust_id column of the moviedemo.customer table.
o It performs a partial redaction, i.e. the redaction is not necessarily applied to all characters in the field; here it replaces the first 7 characters with the number "9".
o The policy always applies, because the expression describing when it applies is specified as "1=1".

SQL> BEGIN
  DBMS_REDACT.ALTER_POLICY(
    object_schema       => 'MOVIEDEMO',
    object_name         => 'CUSTOMER',
    action              => DBMS_REDACT.ADD_COLUMN,
    column_name         => 'LAST_NAME',
    policy_name         => 'customer_redaction',
    function_type       => DBMS_REDACT.PARTIAL,
    function_parameters => 'VVVVVVVVVVVVVVVVVVVVVVVVV,VVVVVVVVVVVVVVVVVVVVVVVVV,*,3,25',
    expression          => '1=1');
END;
/

This updates the customer_redaction policy, redacting a second column in the same table: it replaces characters 3 to 25 of the LAST_NAME column with an '*'. The fact that the data is redacted is transparent to application code:

SQL> SELECT cust_id, last_name FROM customer;
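The effect of the two partial-redaction policies can be sketched in a few lines. This is a stand-in for the observable behavior, not the DBMS_REDACT API itself; the sample values are hypothetical.

```python
def redact_partial(value: str, fill: str, start: int, end: int) -> str:
    """Mimic partial redaction: replace characters start..end (1-based,
    inclusive) with the fill character, as in the policies above."""
    s = str(value)
    end = min(end, len(s))
    if start > len(s):
        return s
    return s[:start - 1] + fill * (end - start + 1) + s[end:]

# cust_id policy: replace the first 7 characters with "9"
print(redact_partial("1287402", "9", 1, 7))     # 9999999
# last_name policy: replace characters 3 to 25 with "*"
print(redact_partial("Robertson", "*", 3, 25))  # Ro*******
```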

Apply Redaction Policies to Data Stored in Hadoop

Apply an equivalent redaction policy to two of our Oracle Big Data SQL tables, with the following effects: the first procedure redacts data sourced from JSON in HDFS; the second redacts Avro data sourced from Hive. Both policies redact the custid attribute.

SQL> BEGIN
  -- JSON file in HDFS
  DBMS_REDACT.ADD_POLICY(
    object_schema       => 'MOVIEDEMO',
    object_name         => 'MOVIELOG_V',
    column_name         => 'CUSTID',
    policy_name         => 'movielog_v_redaction',
    function_type       => DBMS_REDACT.PARTIAL,
    function_parameters => '9,1,7',
    expression          => '1=1');
  -- Avro data from Hive
  DBMS_REDACT.ADD_POLICY(
    object_schema       => 'MOVIEDEMO',
    object_name         => 'MYLOGDATA',
    column_name         => 'CUSTID',
    policy_name         => 'mylogdata_redaction',
    function_type       => DBMS_REDACT.PARTIAL,
    function_parameters => '9,1,7',
    expression          => '1=1');
END;
/

Review the redacted data from the Avro source:

SQL> SELECT * FROM mylogdata WHERE rownum < 20;

Join the redacted HDFS data to the customer table by executing the following SELECT statement:

SQL> SELECT f.custid, c.last_name, f.movieid, f.time
FROM customer c, movielog_v f
WHERE c.cust_id = f.custid;

Part 5 Using Oracle Analytic SQL Across All Your Data

Oracle Big Data SQL allows you to utilize Oracle's rich SQL dialect to query all your data, regardless of where that data resides. Deepen Oracle MoviePlex's understanding of its customers by performing an RFM analysis:
o Recency: when was the last time the customer accessed the site?
o Frequency: what is the level of activity for that customer on the site?
o Monetary: how much money has the customer spent?

SQL analytic functions will be applied to data residing in both the application logs on Hadoop and the sales data in Oracle Database tables. A combined RFM score of 551 indicates that the customer is in the highest tier in terms of recent visits (R=5) and activity on the site (F=5), but in the lowest tier in terms of spend (M=1).

Apply Oracle NTILE functions across all data. The customer_sales subquery selects from the Oracle Database fact table movie_sales to categorize customers based on sales. The click_data subquery performs a similar task for web site activity stored in the application logs, categorizing customers based on their activity and recent visits. These two subqueries are then joined to produce the complete RFM score.
SQL> WITH customer_sales AS (
  -- Sales and customer attributes
  SELECT m.cust_id, c.last_name, c.first_name, c.country, c.gender, c.age, c.income_level,
         NTILE(5) OVER (ORDER BY SUM(sales)) AS rfm_monetary
  FROM movie_sales m, customer c
  WHERE c.cust_id = m.cust_id
  GROUP BY m.cust_id, c.last_name, c.first_name, c.country, c.gender, c.age, c.income_level
),
click_data AS (
  -- Clicks from the application log
  SELECT custid,
         NTILE(5) OVER (ORDER BY MAX(time)) AS rfm_recency,
         NTILE(5) OVER (ORDER BY COUNT(1)) AS rfm_frequency
  FROM movielog_v
  GROUP BY custid
)
SELECT c.cust_id, c.last_name, c.first_name,
       cd.rfm_recency, cd.rfm_frequency, c.rfm_monetary,
       cd.rfm_recency*100 + cd.rfm_frequency*10 + c.rfm_monetary AS rfm_combined,
       c.country, c.gender, c.age, c.income_level
FROM customer_sales c, click_data cd
WHERE c.cust_id = cd.custid;

We want to target customers who we may be losing to the competition. Therefore, amend the query to find important customers (high monetary score) who have not visited the site recently (low recency score):

SQL> WITH customer_sales AS (
  -- Sales and customer attributes
  SELECT m.cust_id, c.last_name, c.first_name, c.country, c.gender, c.age, c.income_level,
         NTILE(5) OVER (ORDER BY SUM(sales)) AS rfm_monetary
  FROM movie_sales m, customer c
  WHERE c.cust_id = m.cust_id
  GROUP BY m.cust_id, c.last_name, c.first_name, c.country, c.gender, c.age, c.income_level
),
click_data AS (
  -- Clicks from the application log
  SELECT custid,
         NTILE(5) OVER (ORDER BY MAX(time)) AS rfm_recency,
         NTILE(5) OVER (ORDER BY COUNT(1)) AS rfm_frequency
  FROM movielog_v
  GROUP BY custid
)
SELECT c.cust_id, c.last_name, c.first_name,
       cd.rfm_recency, cd.rfm_frequency, c.rfm_monetary,
       cd.rfm_recency*100 + cd.rfm_frequency*10 + c.rfm_monetary AS rfm_combined,
       c.country, c.gender, c.age, c.income_level
FROM customer_sales c, click_data cd
WHERE c.cust_id = cd.custid
  AND c.rfm_monetary >= 4
  AND cd.rfm_recency <= 2
ORDER BY c.rfm_monetary DESC, cd.rfm_recency DESC;

Pattern Matching and Advanced Analytics with PIVOT Tables:
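The scoring logic of the two queries can be sketched with a simplified NTILE (ascending order, even split, ignoring ties; Oracle's NTILE distributes remainders slightly differently) over hypothetical per-customer inputs:

```python
def ntile(values, n=5):
    """Assign each value its NTILE bucket (1..n), ordering ascending."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    buckets = [0] * len(values)
    for rank, i in enumerate(order):
        buckets[i] = rank * n // len(values) + 1
    return buckets

# Hypothetical per-customer inputs (same customer at each index position).
spend      = [10.0, 250.0, 900.0, 35.0, 480.0]  # total sales      -> Monetary
last_visit = [5, 30, 2, 200, 190]               # day of last visit -> Recency
clicks     = [4, 120, 90, 300, 15]              # log activity      -> Frequency

monetary  = ntile(spend)        # [1, 3, 5, 2, 4]
recency   = ntile(last_visit)   # [2, 3, 1, 5, 4]
frequency = ntile(clicks)       # [1, 4, 3, 5, 2]

combined = [r * 100 + f * 10 + m
            for r, f, m in zip(recency, frequency, monetary)]

# High spenders we have not seen recently: rfm_monetary >= 4, rfm_recency <= 2
at_risk = [i for i in range(len(combined))
           if monetary[i] >= 4 and recency[i] <= 2]
print(combined)  # [211, 343, 135, 552, 424]; customer 2 scores 135: R=1, F=3, M=5
print(at_risk)   # [2]
```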


More information

Stages of Data Processing

Stages of Data Processing Data processing can be understood as the conversion of raw data into a meaningful and desired form. Basically, producing information that can be understood by the end user. So then, the question arises,

More information

Actual4Test. Actual4test - actual test exam dumps-pass for IT exams

Actual4Test.   Actual4test - actual test exam dumps-pass for IT exams Actual4Test http://www.actual4test.com Actual4test - actual test exam dumps-pass for IT exams Exam : 1z1-449 Title : Oracle Big Data 2017 Implementation Essentials Vendor : Oracle Version : DEMO Get Latest

More information

Big Data Hadoop Course Content

Big Data Hadoop Course Content Big Data Hadoop Course Content Topics covered in the training Introduction to Linux and Big Data Virtual Machine ( VM) Introduction/ Installation of VirtualBox and the Big Data VM Introduction to Linux

More information

April Copyright 2013 Cloudera Inc. All rights reserved.

April Copyright 2013 Cloudera Inc. All rights reserved. Hadoop Beyond Batch: Real-time Workloads, SQL-on- Hadoop, and the Virtual EDW Headline Goes Here Marcel Kornacker marcel@cloudera.com Speaker Name or Subhead Goes Here April 2014 Analytic Workloads on

More information

CERTIFICATE IN SOFTWARE DEVELOPMENT LIFE CYCLE IN BIG DATA AND BUSINESS INTELLIGENCE (SDLC-BD & BI)

CERTIFICATE IN SOFTWARE DEVELOPMENT LIFE CYCLE IN BIG DATA AND BUSINESS INTELLIGENCE (SDLC-BD & BI) CERTIFICATE IN SOFTWARE DEVELOPMENT LIFE CYCLE IN BIG DATA AND BUSINESS INTELLIGENCE (SDLC-BD & BI) The Certificate in Software Development Life Cycle in BIGDATA, Business Intelligence and Tableau program

More information

Hadoop Beyond Batch: Real-time Workloads, SQL-on- Hadoop, and thevirtual EDW Headline Goes Here

Hadoop Beyond Batch: Real-time Workloads, SQL-on- Hadoop, and thevirtual EDW Headline Goes Here Hadoop Beyond Batch: Real-time Workloads, SQL-on- Hadoop, and thevirtual EDW Headline Goes Here Marcel Kornacker marcel@cloudera.com Speaker Name or Subhead Goes Here 2013-11-12 Copyright 2013 Cloudera

More information

Big Data com Hadoop. VIII Sessão - SQL Bahia. Impala, Hive e Spark. Diógenes Pires 03/03/2018

Big Data com Hadoop. VIII Sessão - SQL Bahia. Impala, Hive e Spark. Diógenes Pires 03/03/2018 Big Data com Hadoop Impala, Hive e Spark VIII Sessão - SQL Bahia 03/03/2018 Diógenes Pires Connect with PASS Sign up for a free membership today at: pass.org #sqlpass Internet Live http://www.internetlivestats.com/

More information

Hive and Shark. Amir H. Payberah. Amirkabir University of Technology (Tehran Polytechnic)

Hive and Shark. Amir H. Payberah. Amirkabir University of Technology (Tehran Polytechnic) Hive and Shark Amir H. Payberah amir@sics.se Amirkabir University of Technology (Tehran Polytechnic) Amir H. Payberah (Tehran Polytechnic) Hive and Shark 1393/8/19 1 / 45 Motivation MapReduce is hard to

More information

Overview. Prerequisites. Course Outline. Course Outline :: Apache Spark Development::

Overview. Prerequisites. Course Outline. Course Outline :: Apache Spark Development:: Title Duration : Apache Spark Development : 4 days Overview Spark is a fast and general cluster computing system for Big Data. It provides high-level APIs in Scala, Java, Python, and R, and an optimized

More information

Securing the Oracle BDA - 1

Securing the Oracle BDA - 1 Hello and welcome to this online, self-paced course titled Administering and Managing the Oracle Big Data Appliance (BDA). This course contains several lessons. This lesson is titled Securing the Oracle

More information

Quick Deployment Step- by- step instructions to deploy Oracle Big Data Lite Virtual Machine

Quick Deployment Step- by- step instructions to deploy Oracle Big Data Lite Virtual Machine Quick Deployment Step- by- step instructions to deploy Oracle Big Data Lite Virtual Machine Version 4.1.0 Please note: This appliance is for testing and educational purposes only; it is unsupported and

More information

Quick Deployment Step-by-step instructions to deploy Oracle Big Data Lite Virtual Machine

Quick Deployment Step-by-step instructions to deploy Oracle Big Data Lite Virtual Machine Quick Deployment Step-by-step instructions to deploy Oracle Big Data Lite Virtual Machine Version 4.11 Last Updated: 1/10/2018 Please note: This appliance is for testing and educational purposes only;

More information

Oracle Big Data Appliance

Oracle Big Data Appliance Oracle Big Data Appliance Software User's Guide Release 4 (4.4) E65665-12 July 2016 Describes the Oracle Big Data Appliance software available to administrators and software developers. Oracle Big Data

More information

Data Lake Based Systems that Work

Data Lake Based Systems that Work Data Lake Based Systems that Work There are many article and blogs about what works and what does not work when trying to build out a data lake and reporting system. At DesignMind, we have developed a

More information

An Introduction to Big Data Formats

An Introduction to Big Data Formats Introduction to Big Data Formats 1 An Introduction to Big Data Formats Understanding Avro, Parquet, and ORC WHITE PAPER Introduction to Big Data Formats 2 TABLE OF TABLE OF CONTENTS CONTENTS INTRODUCTION

More information

Oracle Database: SQL and PL/SQL Fundamentals NEW

Oracle Database: SQL and PL/SQL Fundamentals NEW Oracle University Contact Us: 001-855-844-3881 & 001-800-514-06-97 Oracle Database: SQL and PL/SQL Fundamentals NEW Duration: 5 Days What you will learn This Oracle Database: SQL and PL/SQL Fundamentals

More information

Hadoop Online Training

Hadoop Online Training Hadoop Online Training IQ training facility offers Hadoop Online Training. Our Hadoop trainers come with vast work experience and teaching skills. Our Hadoop training online is regarded as the one of the

More information

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. HCatalog

About the Tutorial. Audience. Prerequisites. Copyright & Disclaimer. HCatalog About the Tutorial HCatalog is a table storage management tool for Hadoop that exposes the tabular data of Hive metastore to other Hadoop applications. It enables users with different data processing tools

More information

Oracle Big Data. A NA LYT ICS A ND MA NAG E MENT.

Oracle Big Data. A NA LYT ICS A ND MA NAG E MENT. Oracle Big Data. A NALYTICS A ND MANAG E MENT. Oracle Big Data: Redundância. Compatível com ecossistema Hadoop, HIVE, HBASE, SPARK. Integração com Cloudera Manager. Possibilidade de Utilização da Linguagem

More information

Hadoop is supplemented by an ecosystem of open source projects IBM Corporation. How to Analyze Large Data Sets in Hadoop

Hadoop is supplemented by an ecosystem of open source projects IBM Corporation. How to Analyze Large Data Sets in Hadoop Hadoop Open Source Projects Hadoop is supplemented by an ecosystem of open source projects Oozie 25 How to Analyze Large Data Sets in Hadoop Although the Hadoop framework is implemented in Java, MapReduce

More information

BIG DATA ANALYTICS USING HADOOP TOOLS APACHE HIVE VS APACHE PIG

BIG DATA ANALYTICS USING HADOOP TOOLS APACHE HIVE VS APACHE PIG BIG DATA ANALYTICS USING HADOOP TOOLS APACHE HIVE VS APACHE PIG Prof R.Angelin Preethi #1 and Prof J.Elavarasi *2 # Department of Computer Science, Kamban College of Arts and Science for Women, TamilNadu,

More information

The Hadoop Ecosystem. EECS 4415 Big Data Systems. Tilemachos Pechlivanoglou

The Hadoop Ecosystem. EECS 4415 Big Data Systems. Tilemachos Pechlivanoglou The Hadoop Ecosystem EECS 4415 Big Data Systems Tilemachos Pechlivanoglou tipech@eecs.yorku.ca A lot of tools designed to work with Hadoop 2 HDFS, MapReduce Hadoop Distributed File System Core Hadoop component

More information

Oracle Big Data Connectors & Big Data SQL

Oracle Big Data Connectors & Big Data SQL Oracle Big Data Connectors & Big Data SQL Dr. Nadine Schöne, Detlef E. Schröder, Gavin Dupré DOAG Big Data Days 20./21.09.2018, Dresden Safe Harbor Statement The following is intended to outline our general

More information

Blended Learning Outline: Developer Training for Apache Spark and Hadoop (180404a)

Blended Learning Outline: Developer Training for Apache Spark and Hadoop (180404a) Blended Learning Outline: Developer Training for Apache Spark and Hadoop (180404a) Cloudera s Developer Training for Apache Spark and Hadoop delivers the key concepts and expertise need to develop high-performance

More information

Safe Harbor Statement

Safe Harbor Statement Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment

More information

Introducing Oracle R Enterprise 1.4 -

Introducing Oracle R Enterprise 1.4 - Hello, and welcome to this online, self-paced lesson entitled Introducing Oracle R Enterprise. This session is part of an eight-lesson tutorial series on Oracle R Enterprise. My name is Brian Pottle. I

More information

This is a brief tutorial that explains how to make use of Sqoop in Hadoop ecosystem.

This is a brief tutorial that explains how to make use of Sqoop in Hadoop ecosystem. About the Tutorial Sqoop is a tool designed to transfer data between Hadoop and relational database servers. It is used to import data from relational databases such as MySQL, Oracle to Hadoop HDFS, and

More information

MySQL for Developers Ed 3

MySQL for Developers Ed 3 Oracle University Contact Us: 1.800.529.0165 MySQL for Developers Ed 3 Duration: 5 Days What you will learn This MySQL for Developers training teaches developers how to plan, design and implement applications

More information

SQT03 Big Data and Hadoop with Azure HDInsight Andrew Brust. Senior Director, Technical Product Marketing and Evangelism

SQT03 Big Data and Hadoop with Azure HDInsight Andrew Brust. Senior Director, Technical Product Marketing and Evangelism Big Data and Hadoop with Azure HDInsight Andrew Brust Senior Director, Technical Product Marketing and Evangelism Datameer Level: Intermediate Meet Andrew Senior Director, Technical Product Marketing and

More information

Introduction to the Oracle Big Data Appliance - 1

Introduction to the Oracle Big Data Appliance - 1 Hello and welcome to this online, self-paced course titled Administering and Managing the Oracle Big Data Appliance (BDA). This course contains several lessons. This lesson is titled Introduction to the

More information

Techno Expert Solutions An institute for specialized studies!

Techno Expert Solutions An institute for specialized studies! Course Content of Big Data Hadoop( Intermediate+ Advance) Pre-requistes: knowledge of Core Java/ Oracle: Basic of Unix S.no Topics Date Status Introduction to Big Data & Hadoop Importance of Data& Data

More information

Eine für Alle - Oracle DB für Big Data, In-memory und Exadata Dr.-Ing. Holger Friedrich

Eine für Alle - Oracle DB für Big Data, In-memory und Exadata Dr.-Ing. Holger Friedrich Eine für Alle - Oracle DB für Big Data, In-memory und Exadata Dr.-Ing. Holger Friedrich Agenda Introduction Old Times Exadata Big Data Oracle In-Memory Headquarters Conclusions 2 sumit AG Consulting and

More information

CIS 612 Advanced Topics in Database Big Data Project Lawrence Ni, Priya Patil, James Tench

CIS 612 Advanced Topics in Database Big Data Project Lawrence Ni, Priya Patil, James Tench CIS 612 Advanced Topics in Database Big Data Project Lawrence Ni, Priya Patil, James Tench Abstract Implementing a Hadoop-based system for processing big data and doing analytics is a topic which has been

More information

IT Certification Exams Provider! Weofferfreeupdateserviceforoneyear! h ps://

IT Certification Exams Provider! Weofferfreeupdateserviceforoneyear! h ps:// IT Certification Exams Provider! Weofferfreeupdateserviceforoneyear! h ps://www.certqueen.com Exam : 1Z1-449 Title : Oracle Big Data 2017 Implementation Essentials Version : DEMO 1 / 4 1.You need to place

More information

Hadoop & Big Data Analytics Complete Practical & Real-time Training

Hadoop & Big Data Analytics Complete Practical & Real-time Training An ISO Certified Training Institute A Unit of Sequelgate Innovative Technologies Pvt. Ltd. www.sqlschool.com Hadoop & Big Data Analytics Complete Practical & Real-time Training Mode : Instructor Led LIVE

More information

Workload Experience Manager

Workload Experience Manager Workload Experience Manager Important Notice 2010-2018 Cloudera, Inc. All rights reserved. Cloudera, the Cloudera logo, and any other product or service names or slogans contained in this document are

More information

Big Data and Hadoop. Course Curriculum: Your 10 Module Learning Plan. About Edureka

Big Data and Hadoop. Course Curriculum: Your 10 Module Learning Plan. About Edureka Course Curriculum: Your 10 Module Learning Plan Big Data and Hadoop About Edureka Edureka is a leading e-learning platform providing live instructor-led interactive online training. We cater to professionals

More information

Oracle R Advanced Analytics for Hadoop Release Notes. Oracle R Advanced Analytics for Hadoop Release Notes

Oracle R Advanced Analytics for Hadoop Release Notes. Oracle R Advanced Analytics for Hadoop Release Notes Oracle R Advanced Analytics for Hadoop 2.7.1 Release Notes i Oracle R Advanced Analytics for Hadoop 2.7.1 Release Notes Oracle R Advanced Analytics for Hadoop 2.7.1 Release Notes ii REVISION HISTORY NUMBER

More information

Hortonworks Data Platform

Hortonworks Data Platform Hortonworks Data Platform Workflow Management (August 31, 2017) docs.hortonworks.com Hortonworks Data Platform: Workflow Management Copyright 2012-2017 Hortonworks, Inc. Some rights reserved. The Hortonworks

More information

Hadoop An Overview. - Socrates CCDH

Hadoop An Overview. - Socrates CCDH Hadoop An Overview - Socrates CCDH What is Big Data? Volume Not Gigabyte. Terabyte, Petabyte, Exabyte, Zettabyte - Due to handheld gadgets,and HD format images and videos - In total data, 90% of them collected

More information

Hadoop. Introduction to BIGDATA and HADOOP

Hadoop. Introduction to BIGDATA and HADOOP Hadoop Introduction to BIGDATA and HADOOP What is Big Data? What is Hadoop? Relation between Big Data and Hadoop What is the need of going ahead with Hadoop? Scenarios to apt Hadoop Technology in REAL

More information

microsoft

microsoft 70-775.microsoft Number: 70-775 Passing Score: 800 Time Limit: 120 min Exam A QUESTION 1 Note: This question is part of a series of questions that present the same scenario. Each question in the series

More information

Spatial Analytics Built for Big Data Platforms

Spatial Analytics Built for Big Data Platforms Spatial Analytics Built for Big Platforms Roberto Infante Software Development Manager, Spatial and Graph 1 Copyright 2011, Oracle and/or its affiliates. All rights Global Digital Growth The Internet of

More information

Big Data Architect.

Big Data Architect. Big Data Architect www.austech.edu.au WHAT IS BIG DATA ARCHITECT? A big data architecture is designed to handle the ingestion, processing, and analysis of data that is too large or complex for traditional

More information

Bring Context To Your Machine Data With Hadoop, RDBMS & Splunk

Bring Context To Your Machine Data With Hadoop, RDBMS & Splunk Bring Context To Your Machine Data With Hadoop, RDBMS & Splunk Raanan Dagan and Rohit Pujari September 25, 2017 Washington, DC Forward-Looking Statements During the course of this presentation, we may

More information

Big Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara

Big Data Technology Ecosystem. Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara Big Data Technology Ecosystem Mark Burnette Pentaho Director Sales Engineering, Hitachi Vantara Agenda End-to-End Data Delivery Platform Ecosystem of Data Technologies Mapping an End-to-End Solution Case

More information

Oracle Big Data Appliance X7-2

Oracle Big Data Appliance X7-2 Oracle Big Data Appliance X7-2 Oracle Big Data Appliance is a flexible, high-performance, secure platform for running diverse workloads on Hadoop, Kafka and NoSQL. With Oracle Big Data SQL, Oracle Big

More information

Strategies for Incremental Updates on Hive

Strategies for Incremental Updates on Hive Strategies for Incremental Updates on Hive Copyright Informatica LLC 2017. Informatica, the Informatica logo, and Big Data Management are trademarks or registered trademarks of Informatica LLC in the United

More information

IBM Big SQL Partner Application Verification Quick Guide

IBM Big SQL Partner Application Verification Quick Guide IBM Big SQL Partner Application Verification Quick Guide VERSION: 1.6 DATE: Sept 13, 2017 EDITORS: R. Wozniak D. Rangarao Table of Contents 1 Overview of the Application Verification Process... 3 2 Platform

More information

Integrating Big Data with Oracle Data Integrator 12c ( )

Integrating Big Data with Oracle Data Integrator 12c ( ) [1]Oracle Fusion Middleware Integrating Big Data with Oracle Data Integrator 12c (12.2.1.1) E73982-01 May 2016 Oracle Fusion Middleware Integrating Big Data with Oracle Data Integrator, 12c (12.2.1.1)

More information

MySQL for Developers Ed 3

MySQL for Developers Ed 3 Oracle University Contact Us: 0845 777 7711 MySQL for Developers Ed 3 Duration: 5 Days What you will learn This MySQL for Developers training teaches developers how to plan, design and implement applications

More information

Introduction to BigData, Hadoop:-

Introduction to BigData, Hadoop:- Introduction to BigData, Hadoop:- Big Data Introduction: Hadoop Introduction What is Hadoop? Why Hadoop? Hadoop History. Different types of Components in Hadoop? HDFS, MapReduce, PIG, Hive, SQOOP, HBASE,

More information

Big Data Analytics using Apache Hadoop and Spark with Scala

Big Data Analytics using Apache Hadoop and Spark with Scala Big Data Analytics using Apache Hadoop and Spark with Scala Training Highlights : 80% of the training is with Practical Demo (On Custom Cloudera and Ubuntu Machines) 20% Theory Portion will be important

More information

Oracle NoSQL Database Enterprise Edition, Version 18.1

Oracle NoSQL Database Enterprise Edition, Version 18.1 Oracle NoSQL Database Enterprise Edition, Version 18.1 Oracle NoSQL Database is a scalable, distributed NoSQL database, designed to provide highly reliable, flexible and available data management across

More information

Hadoop Overview. Lars George Director EMEA Services

Hadoop Overview. Lars George Director EMEA Services Hadoop Overview Lars George Director EMEA Services 1 About Me Director EMEA Services @ Cloudera Consulting on Hadoop projects (everywhere) Apache Committer HBase and Whirr O Reilly Author HBase The Definitive

More information

Accessing Hadoop Data Using Hive

Accessing Hadoop Data Using Hive An IBM Proof of Technology Accessing Hadoop Data Using Hive Unit 3: Hive DML in action An IBM Proof of Technology Catalog Number Copyright IBM Corporation, 2015 US Government Users Restricted Rights -

More information

Oracle Database 18c and Autonomous Database

Oracle Database 18c and Autonomous Database Oracle Database 18c and Autonomous Database Maria Colgan Oracle Database Product Management March 2018 @SQLMaria Safe Harbor Statement The following is intended to outline our general product direction.

More information

Impala Intro. MingLi xunzhang

Impala Intro. MingLi xunzhang Impala Intro MingLi xunzhang Overview MPP SQL Query Engine for Hadoop Environment Designed for great performance BI Connected(ODBC/JDBC, Kerberos, LDAP, ANSI SQL) Hadoop Components HDFS, HBase, Metastore,

More information

HADOOP COURSE CONTENT (HADOOP-1.X, 2.X & 3.X) (Development, Administration & REAL TIME Projects Implementation)

HADOOP COURSE CONTENT (HADOOP-1.X, 2.X & 3.X) (Development, Administration & REAL TIME Projects Implementation) HADOOP COURSE CONTENT (HADOOP-1.X, 2.X & 3.X) (Development, Administration & REAL TIME Projects Implementation) Introduction to BIGDATA and HADOOP What is Big Data? What is Hadoop? Relation between Big

More information

sqoop Easy, parallel database import/export Aaron Kimball Cloudera Inc. June 8, 2010

sqoop Easy, parallel database import/export Aaron Kimball Cloudera Inc. June 8, 2010 sqoop Easy, parallel database import/export Aaron Kimball Cloudera Inc. June 8, 2010 Your database Holds a lot of really valuable data! Many structured tables of several hundred GB Provides fast access

More information