
About Codefrux

Current trends around the world are built on the internet, mobile, and their applications, and we try to make the most of them. We are a team of well-established IT professionals based in Bangalore, constantly keeping up with rapid advances and adapting to new technology. Since the company started in 2010, our niches have been Mobile Application Development, Web Application Development, Staffing Services, and Training Programs (including Corporate Training).

What You Will Learn in This Course

- Hands-on knowledge of exploring, running, and deploying Apache Spark
- Access to a wide variety of Spark with Scala, Spark SQL, Spark Streaming, and Spark MLlib source code examples
- Creating hands-on Spark environments for experimenting with the course examples
- Participating in course discussion boards with the instructor and other students
- Knowing when and how Spark with Scala, Spark SQL, Spark Streaming, and Spark MLlib may be an appropriate solution

Who Can Learn Apache Spark and Scala

There is a huge demand for Apache Spark and Scala professionals in the IT industry. Big data professionals, analytics professionals, research professionals, IT developers and testers, data scientists, and BI and reporting professionals can take this course.

1. Introduction to Spark
   a. Overview of Big Data and Spark
   b. MapReduce limitations
   c. Spark History
   d. Spark Architecture
   e. Limitations of MapReduce in Hadoop Objectives
   f. Batch vs. Real-time analytics
   g. Application of stream processing
   h. Spark and Hadoop Advantages
   i. Benefits of Spark and Hadoop
   j. Introduction to Spark Ecosystem
   k. Spark Installation
   l. Quiz
   m. Summary
   n. Hands on

2. Introduction to Programming in Scala
   a. Features of Scala
   b. Basic data types and literals used
   c. List the operators and methods used in Scala
   d. Concepts of Scala
   e. Scala foundation
   f. Features of Scala
   g. Setup Spark and Scala on Ubuntu and Windows OS
   h. Install IDEs for Scala
   i. Run Scala Code on Scala Shell
   j. Understanding Data types in Scala
   k. Implementing Lazy Values
   l. Control Structures
   m. Looping Structures
   n. Functions
   o. Procedures
   p. Collections
   q. Arrays and Array Buffers
   r. Maps, Tuples and Lists
   s. Quiz
   t. Summary
   u. Hands on

3. OOP and Functional Programming in Scala
   a. Implementing Classes
   b. Getters and Setters
   c. Properties with only Getters
   d. Object & Object Private Fields
   e. Implementing Nested Classes
   f. Abstract Classes
   g. Constructor
   h. Auxiliary and Primary Constructor
   i. Singletons
   j. Companion Objects
   k. Extending a Class
   l. Understanding Packages
   m. Override Methods
   n. Type Checking
   o. Casting
   p. Overriding Methods
   q. Traits as Interfaces
   r. Layered Traits
   s. Functional Programming
   t. Higher Order Functions
   u. Anonymous Functions
   v. Closures and Currying
   w. Performing File Processing
   x. Quiz
   y. Summary
   z. Hands on

4. Foundation to Spark
   a. Spark Shell and PySpark
   b. Creating the Spark Context
   c. Invoking Spark Shell
   d. Loading a file in Shell
   e. Basic operations on Shell
   f. Spark Java projects
   g. Spark Context and Spark Properties
   h. Overview of SBT
   i. Building a Spark project with SBT
   j. Running Spark project with SBT
   k. Local mode
   l. Spark mode
   m. Caching overview
   n. Persistence in Spark
   o. HDFS data from Spark
   p. Implementing Server Log Analysis using Spark
   q. Quiz
   r. Summary
   s. Hands on

5. Working with RDD
   a. Understanding RDD
   b. How to create RDDs
   c. RDD operations and methods
   d. Transformations in RDD
   e. Actions in RDD
   f. Loading data into RDD
   g. Saving data through RDD
   h. Key-Value Pair RDD
   i. MapReduce and Pair RDD Operations
   j. Spark and Hadoop Integration-HDFS
   k. Key-Value Pair RDD
   l. Scala RDD, Paired RDD, Double RDD & General RDD Functions
   m. Implementing HadoopRDD, Filtered RDD, Joined RDD
   n. Transformations, Actions and Shared Variables
   o. Spark Operations on YARN
   p. Sequence File Processing
   q. Partitioner and its role in Performance improvement
   r. Quiz
   s. Summary
   t. Hands on

6. Spark Streaming & Spark SQL
   a. Introduction to Spark Streaming
   b. Introduction to Spark SQL
   c. Querying Files as Tables
   d. Text file Format
   e. JSON file Format
   f. Parquet file Format
   g. Hive and Spark SQL Architecture
   h. Integrating Spark & Apache Hive
   i. Spark SQL performance optimization
   j. Implementing Data visualization in Spark
   k. Quiz
   l. Summary
   m. Hands on

7. Spark ML Programming
   a. Explain the use cases and techniques of Machine Learning (ML)
   b. Describe the key concepts of Spark ML
   c. Explain the concept of an ML Dataset, ML algorithm, and model selection via cross validation
   d. Quiz
   e. Summary
   f. Hands on

8. Spark GraphX Programming
   a. Explain the key concepts of Spark GraphX programming
   b. Limitations of the Graph Parallel system
   c. Describe the operations with a graph
   d. Graph system optimizations
   e. Quiz
   f. Summary
   g. Hands on

Project Work

After completing the course, students are assigned a live project to polish the technical skills they have acquired with us.