Cloudera Data Analyst Training: Course Outline

Module Title: Cloudera Data Analyst Training
Duration: 4 days

Overview

Take your knowledge to the next level

Cloudera University's four-day data analyst training course will teach you to apply traditional data analytics and business intelligence skills to big data tools like Apache Impala (incubating), Apache Hive, and Apache Pig. Cloudera presents the tools data professionals need to access, manipulate, transform, and analyze complex data sets using SQL and familiar scripting languages.

Learn a modern toolset

Students will have the chance to learn and work with modern tools, such as:
- Apache Impala (incubating) enables instant interactive analysis of the data stored in Hadoop via a native SQL environment.
- Apache Hive provides a SQL-like query language with HiveQL that makes data accessible to analysts, database administrators, and others without Java programming expertise.
- Apache Pig applies the fundamentals of familiar scripting languages to the Hadoop cluster.

Get hands-on experience

Through instructor-led discussion and interactive, hands-on exercises, participants will navigate the Hadoop ecosystem, learning how to:
- Acquire, store, and analyze data using features in Pig, Hive, and Impala
- Perform fundamental ETL (extract, transform, and load) tasks with Hadoop tools
- Use Pig, Hive, and Impala to improve productivity for typical analysis tasks
- Join diverse datasets to gain valuable business insight
- Perform interactive, complex queries on datasets

What to expect

This course is designed for data analysts, business intelligence specialists, developers, system architects, and database administrators. Prior knowledge of Apache Hadoop is not required.
- Knowledge of SQL is assumed.
- Basic familiarity with the Linux command line is expected.
- Knowledge of a scripting language (such as Bash scripting, Perl, Python, or Ruby) is helpful but not essential.

IVERSON ASSOCIATES SDN BHD

Get certified

Upon completion of the course, attendees are encouraged to continue their study and register for the CCA Data Analyst exam. Certification is a great differentiator. It helps establish you as a leader in the field, providing employers and customers with tangible evidence of your skills and expertise.

Course Details

Introduction

Hadoop Fundamentals
- The Motivation for Hadoop
- Hadoop Overview
- Data Storage: HDFS
- Distributed Data Processing: YARN, MapReduce, and Spark
- Data Processing and Analysis: Pig, Hive, and Impala
- Database Integration: Sqoop
- Other Hadoop Data Tools
- Exercise Scenarios

Introduction to Pig
- What is Pig?
- Pig's Features
- Pig Use Cases
- Interacting with Pig

Basic Data Analysis with Pig
- Pig Latin Syntax
- Loading Data
- Simple Data Types
- Field Definitions
- Data Output
- Viewing the Schema
- Filtering and Sorting Data
- Commonly Used Functions

Processing Complex Data with Pig
- Storage Formats
- Complex/Nested Data Types
- Grouping
- Built-In Functions for Complex Data
- Iterating Grouped Data

Multi-Dataset Operations with Pig
- Techniques for Combining Datasets
- Joining Datasets in Pig
- Set Operations
- Splitting Datasets

Pig Troubleshooting and Optimization
- Troubleshooting Pig
- Logging
- Using Hadoop's Web UI
- Data Sampling and Debugging
- Performance Overview
- Understanding the Execution Plan
- Tips for Improving the Performance of Pig Jobs

Introduction to Hive and Impala
- What is Hive?
- What is Impala?
- Why Use Hive and Impala?
- Schema and Data Storage
- Comparing Hive and Impala to Traditional Databases
- Use Cases

Querying with Hive and Impala
- Databases and Tables
- Basic Hive and Impala Query Language Syntax
- Data Types
- Using Hue to Execute Queries
- Using Beeline (Hive's Shell)
- Using the Impala Shell

Hive and Impala Data Management
- Data Storage
- Creating Databases and Tables
- Loading Data
- Altering Databases and Tables
- Simplifying Queries with Views
- Storing Query Results

Data Storage and Performance
- Partitioning Tables
- Loading Data into Partitioned Tables
- When to Use Partitioning
- Choosing a File Format
- Using Avro and Parquet File Formats

Relational Data Analysis with Hive and Impala
- Joining Datasets
- Common Built-In Functions
- Aggregation and Windowing

Complex Data with Hive and Impala
- Complex Data with Hive
- Complex Data with Impala

Analyzing Text with Hive and Impala
- Using Regular Expressions with Hive and Impala
- Processing Text Data with SerDes in Hive
- Sentiment Analysis and n-grams

Hive Optimization
- Understanding Query Performance
- Bucketing
- Indexing Data
- Hive on Spark

Impala Optimization
- How Impala Executes Queries
- Improving Impala Performance

Extending Hive and Impala
- Custom SerDes and File Formats in Hive
- Data Transformation with Custom Scripts in Hive
- User-Defined Functions
- Parameterized Queries

Choosing the Best Tool for the Job
- Comparing Pig, Hive, Impala, and Relational Databases
- Which to Choose?

Conclusion