Processing Big Data. with AZURE DATA LAKE ANALYTICS. Sean Forgatch - Senior Consultant. 6/23/ TALAVANT. All Rights Reserved.

1 Processing Big Data with AZURE DATA LAKE ANALYTICS. Sean Forgatch - Senior Consultant

2 SQL Saturday Iowa

3 About Me: Sean Forgatch
  Milwaukee, WI; originally West Frankfort, IL
  Consulting; Data and Analytics Management
  Healthcare, Insurance, SaaS Industry
  Speaker; President, Northeast WI PASS
  Running, Hiking, Traveling

4 Agenda: 1. Data Monetization and Big Data 2. Data Lakes 3. Azure Data Lake Analytics

5 Big Data

6 Data Monetization: Data monetization, a form of monetization, may refer to the act of generating measurable economic benefits from available data sources (analytics).

7 Data Monetization

8 Data Monetization

9 Data Monetization: Use Case. Lockheed Martin text-mines project documentation and communications for leading indicators of project issues, resulting in hundreds of millions of dollars in reduced cost overruns.

10 Big Data: Azure Tools

11 BI Maturity

12 Analytics Maturity

13 Agenda: 1. Data Monetization and Big Data 2. Data Lakes 3. Azure Data Lake Analytics

14 Data Lakes: Lake of Egypt, Southern Illinois

15 Data Lakes: Azure Data Lake Store
  RAW, STAGE, CURATED (2): Operational value has been identified
  EXPLORATORY (1): Explorational value is being discovered

16 Data Lakes: Azure Data Lake Store
  Sources: ERP, RDBMS, Devices, IoT
  RAW, STAGE, CURATED (2): Operational value has been identified
  EXPLORATORY (1): Explorational value is being discovered

17 Data Lakes: Azure Data Lake Store
  Sources: ERP, RDBMS, Devices, IoT
  Discover; Tag / Explore
  RAW, STAGE, CURATED (2): Operational value has been identified
  EXPLORATORY (1): Explorational value is being discovered

19 SUBJECT, DATA SOURCE, DATA SET, DATE, FILE, DATASET, DATE, FILE, DESTINATION, DATA SET, FILE?

20 Roles
  Zones: RAW (1), STAGE (2), CURATED (3), EXPLORATION (0)
  Roles (in zone order): Data Experts/Engineers; Data Experts/Engineers; ETL and BI Engineers / SMEs / Analysts; Data Scientist / Analysts
  TOOLS

21 Tagging
  Zones: RAW (1), STAGE (2), CURATED (3), EXPLORATION (0)
  Roles (in zone order): Data Experts/Engineers; Data Experts/Engineers; ETL and BI Engineers / SMEs / Analysts; Data Scientist / Analysts
  Tagging: AUTOMATED; SME; N/A
  TOOLS

22 Purpose
  Zones: RAW (1), STAGE (2), CURATED (3), EXPLORATION (0)
  Roles (in zone order): Data Experts/Engineers; Data Experts/Engineers; ETL and BI Engineers / SMEs / Analysts; Data Scientist / Analysts
  Tagging: AUTOMATED; SME; N/A
  Purpose: INGESTION; CLEANSING; DISTRIBUTION
  TOOLS

23 Azure Data Lake Store
  WebHDFS data store for housing data in its native raw format
  Built on Apache YARN
  Process and store petabyte-size files
  Designed and tuned specifically for analytics
  Enterprise security through Azure Active Directory
  No storage limit

24 Azure Data Lake Store
  SMALL FILE: extents E1; vertices V1

25 Azure Data Lake Store
  SMALL FILE: extents E1; vertices V1
  BIG FILE: extents E1, E2, E3, E4; vertices V1, V2, V3, V4

26 Azure Data Lake Store - Ingestion
  Azure Stream Analytics
  Azure Data Factory
  SSIS
  Apache Sqoop
  Azure Event Hubs
  Azure Data Lake Analytics (Federated)
  PowerShell
  AdlCopy
  DistCp
  Storage Explorer, Azure Portal, Visual Studio

27 Ingestion Demo: Azure Data Factory

28 Agenda: 1. Data Monetization and Big Data 2. Data Lakes 3. Azure Data Lake Analytics

29 Azure Data Lake Analytics
  Big Data Queries as a Service
  Analytics Query Federation
  Develop in U-SQL, .NET, R, and Python
  Cognitive Services
  Scale Instantly
  Pay Per Job

30 U-SQL KEY FEATURES
  Combines SQL and C#
  Patterned File Processing: Thousands of Files
  Extensions: Python, R, Cognitive
  Query Data where it Lives (Federated Querying)
  Partition and Distribution of Data for Massive Parallelism
  Manage Structure and Shared Programming through U-SQL Catalog
  U-SQL Procedures
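
To illustrate the "Combines SQL and C#" point, here is a minimal sketch rather than anything taken from the deck. The rowset names, column names, and output path are all hypothetical; the sample rows are inlined so the script does not depend on an input file.

```usql
// Inline sample rows; Ticker and Volume are hypothetical column names.
@input =
    SELECT *
    FROM (VALUES
            ("msft ", 10),
            ("aapl",  20)
         ) AS T(Ticker, Volume);

// The SELECT list is SQL-shaped, but each expression is ordinary C#.
@clean =
    SELECT Ticker.Trim().ToUpperInvariant() AS Ticker,      // C# string methods
           (Volume > 15 ? "high" : "low") AS VolumeBand     // C# conditional operator
    FROM @input;

// Hypothetical output location.
OUTPUT @clean
TO "/output/clean.csv"
USING Outputters.Csv();
```

The declarative shape (rowsets, SELECT, OUTPUT) stays SQL-like, while the per-row logic uses .NET types and methods directly.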

31 U-SQL: Scripts
  U-SQL Query

32 U-SQL: Scripts
  U-SQL Query; Expression Tree

33 U-SQL: Scripts
  Job Graph; Expression Tree; U-SQL Query

34 U-SQL: Job Execution

35 U-SQL: Query Expressions
  EXPRESSION TYPES: EXTRACT, SELECT, SAMPLE, PROCESS, REDUCE, COMBINE

36 U-SQL: Query Expressions
  EXPRESSION TYPES: EXTRACT, SELECT, SAMPLE, PROCESS, REDUCE, COMBINE
  ROWSET VARIABLES

37 U-SQL: Query Expressions
  EXPRESSION TYPES: EXTRACT, SELECT, SAMPLE, PROCESS, REDUCE, COMBINE
  QUERY EXPRESSIONS

38 U-SQL: File-Sets, Input Files
  DEFINE A FILE-PATH PATTERN
  /DataLake/A_RAW/Stocks/Stocks/{*}.txt
  /DataLake/A_RAW/Stocks/Stocks/{FileName}.txt
  /DataLake/A_RAW/Stocks/Stocks/{Date:yyyy}.txt
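
A file-set pattern such as {FileName} becomes useful once it is exposed as a virtual column in the EXTRACT schema. The sketch below assumes comma-delimited stock files with a header row and two hypothetical columns (TradeDate, ClosePrice); the "MSFT" file name and the output path are also assumptions, not values from the deck.

```usql
// Hypothetical schema: the real columns inside the stock files are not shown in the deck.
@stocks =
    EXTRACT TradeDate DateTime,
            ClosePrice decimal,
            FileName string            // virtual column, filled from the {FileName} part of the path
    FROM "/DataLake/A_RAW/Stocks/Stocks/{FileName}.txt"
    USING Extractors.Csv(skipFirstNRows : 1);

// Filtering on the virtual column with a constant can let the job read only the matching files.
@msft =
    SELECT TradeDate,
           ClosePrice,
           FileName
    FROM @stocks
    WHERE FileName == "MSFT";

OUTPUT @msft
TO "/DataLake/B_STAGE/Stocks/MSFT.csv"
USING Outputters.Csv(outputHeader : true);
```

The {Date:yyyy} pattern works the same way, except that the virtual column is declared as a DateTime named Date and can be filtered by date range.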

39 U-SQL: File-Sets, Output Files
  DEFINE A FILE-PATH PATTERN
  /DataLake/B_STAGE/Stocks/{*}.csv (current)
  /DataLake/B_STAGE/Stocks/{FileName}.csv (private preview)

40 U-SQL: Extract Query
  U-SQL:
    // @rows is a placeholder; the rowset variable name did not survive the transcription.
    (1) @rows =
            EXTRACT Field1 string,
                    Field2 int,
                    Field3 int?
            FROM "/datalake/01_raw/{*}.csv"
            USING Extractors.Csv();
  T-SQL:
    CREATE TABLE mytable ( Field1 VARCHAR(100), Field2 INT, Field3 INT NOT NULL );
    INSERT INTO mytable ( Field1, Field2, Field3 )
    SELECT CAST(Field1 AS VARCHAR(100)) AS Field1, CAST(Field2 AS INT) AS Field2, CONVERT(INT, Field3) AS Field3
    FROM mytable

41 U-SQL: Extract Query
  U-SQL:
    // @rows and @result are placeholders; the rowset variable names did not survive the transcription.
    (1) @rows =
            EXTRACT Field1 string,
                    Field2 int,
                    Field3 int?
            FROM "/datalake/01_raw/{*}.csv"
            USING Extractors.Csv();
    (2) @result =
            SELECT Field1,
                   MAX(Field2) AS Field2
            FROM @rows
            GROUP BY Field1;
  T-SQL:
    CREATE TABLE mytable ( Field1 VARCHAR(100), Field2 INT, Field3 INT NOT NULL );
    INSERT INTO mytable ( Field1, Field2, Field3 )
    SELECT CAST(Field1 AS VARCHAR(100)) AS Field1, CAST(Field2 AS INT) AS Field2, CONVERT(INT, Field3) AS Field3
    FROM mytable

42 U-SQL: Extract Query
  U-SQL:
    // @rows and @result are placeholders; the rowset variable names did not survive the transcription.
    (1) @rows =
            EXTRACT Field1 string,
                    Field2 int,
                    Field3 int?
            FROM "/datalake/01_raw/{*}.csv"
            USING Extractors.Csv();
    (2) @result =
            SELECT Field1,
                   MAX(Field2) AS Field2
            FROM @rows
            GROUP BY Field1;
    (3) OUTPUT @result
        TO "/datalake/02_stage/myoutput.csv"
        USING Outputters.Csv();
  T-SQL:
    CREATE TABLE mytable ( Field1 VARCHAR(100), Field2 INT, Field3 INT NOT NULL );
    INSERT INTO mytable ( Field1, Field2, Field3 )
    SELECT CAST(Field1 AS VARCHAR(100)) AS Field1, CAST(Field2 AS INT) AS Field2, CONVERT(INT, Field3) AS Field3
    FROM mytable

43 U-SQL: Extractors and Outputters (native)
  EXTRACTORS
    USING Extractors.Text()
    USING Extractors.Tsv()
    USING Extractors.Csv()
    USING Extractors.Parquet()
    USING Extractors.Orc()
  OUTPUTTERS
    USING Outputters.Text()
    USING Outputters.Tsv()
    USING Outputters.Csv()
    USING Outputters.Parquet()
    USING Outputters.Orc()
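
As a small illustration of pairing a native extractor with a different native outputter, the following sketch reads a tab-separated file and writes it back as CSV. The file paths and the two columns are hypothetical.

```usql
// Hypothetical input path and columns, for illustration only.
@rows =
    EXTRACT Field1 string,
            Field2 int
    FROM "/datalake/01_raw/sample.tsv"
    USING Extractors.Tsv();

// outputHeader writes the column names as the first row of the output file.
OUTPUT @rows
TO "/datalake/02_stage/sample.csv"
USING Outputters.Csv(outputHeader : true);
```

Because the schema is declared in the EXTRACT statement, changing formats is mostly a matter of swapping the extractor or outputter.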

44 U-SQL: Extractors and Outputters (community)
  XML, JSON, AVRO, FLEX

45 U-SQL: Extractor and Outputter Parameters
  EXTRACT ... FROM "/datalake/01_raw/{*}.csv" USING Extractors.Text(silent : true, delimiter : ',');
  Extractors.Csv() is the same extractor with the delimiter fixed to a comma.
  PARAMETERS: delimiter, encoding, escapeCharacter, nullEscape, quoting, rowDelimiter, silent, skipFirstNRows, charFormat
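
The sketch below shows how a few of these parameters might be combined on the generic text extractor. The file path, the pipe delimiter, and the column list are assumptions chosen for illustration, and the parameter values are examples rather than recommendations.

```usql
// Hypothetical file and columns.
@rows =
    EXTRACT Field1 string,
            Field2 int,
            Field3 int?
    FROM "/datalake/01_raw/sample.txt"
    USING Extractors.Text(
              delimiter : '|',        // single-character column delimiter
              quoting : true,         // values may be wrapped in double quotes
              skipFirstNRows : 1,     // skip a header row
              silent : true           // skip rows with an unexpected column count instead of failing the job
          );

OUTPUT @rows
TO "/datalake/02_stage/sample_clean.csv"
USING Outputters.Csv();
```

Setting silent to true trades strictness for resilience, so it is worth profiling how many rows are dropped before relying on it.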

46 DEMO: U-SQL
  1 FILE PER STOCK. FILE INCLUDES FULL HISTORY

47 DEMO: U-SQL
  Stock Reference Information

48 ADL & U-SQL: Best Practices
  U-SQL constructs before custom code
  SELECT before PROCESS, windowing functions
  Use tables and partitioning for big performance
  Understand skew for performance tuning
  Understand data profile
  Avoid many small files
  General good file size: 500 MB to 2 GB
  Use Event Hubs or Stream Analytics to pull larger file sets together
  Find your use case!
  Use virtual columns to filter file sets
  Avoid: "Store everything in the data lake and analyze it later"
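
The "use tables" advice can be sketched as follows: load the raw files once into a U-SQL catalog table with a clustered index and a hash distribution, then query the table instead of re-reading the files. The database, table, and column names are hypothetical, and the input is assumed to be comma-delimited with a header row.

```usql
// Hypothetical database, table, and column names.
CREATE DATABASE IF NOT EXISTS StocksDb;
USE DATABASE StocksDb;

CREATE TABLE IF NOT EXISTS dbo.StockPrices
(
    Ticker string,
    TradeDate DateTime,
    ClosePrice decimal,
    INDEX idx_StockPrices CLUSTERED (Ticker ASC, TradeDate ASC)
    DISTRIBUTED BY HASH (Ticker)
);

// One file per stock; the file name supplies the Ticker as a virtual column.
@stocks =
    EXTRACT TradeDate DateTime,
            ClosePrice decimal,
            Ticker string
    FROM "/DataLake/A_RAW/Stocks/Stocks/{Ticker}.txt"
    USING Extractors.Csv(skipFirstNRows : 1);

INSERT INTO dbo.StockPrices
SELECT Ticker,
       TradeDate,
       ClosePrice
FROM @stocks;
```

Distributing by a column with a few dominant values is one way skew arises, so the distribution key is worth checking against the data profile first.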

49 Azure Data Lake: Use Case
  Azure Data Lake connects supply chain data for advanced analytics
  Source:

50 Data Lake: Advice
  1. Don't ingest everything in the beginning.
  2. BI and analytics maturity determines the data lake's priority.
  3. Then, the data lake augments the EDW.

51 Data Lake: Cost

52 Azure Data Lake Whitepaper

53 Go Learn!
  Michael Rys (U-SQL Program Director): LinkedIn, SlideShare
  GitHub U-SQL Repository
  SQL Server Central: Stairway to U-SQL
  Azure Built-in Example

54 Talavant Solutions

55 Thank you!


Introduction to U-SQL & Data Lake Alex Whittles

Introduction to U-SQL & Data Lake Alex Whittles Introduction to U-SQL & Data Lake Alex Whittles Alex@PurpleFrogSystems.com PurpleFrogSystems.com PurpleFrogSystems.com/blog @PurpleFrogAlex Alex Whittles SQL Relay Committee SQLRelay.co.uk SQL Bits Committee

More information

Big Data analytics in insurance

Big Data analytics in insurance Big Data analytics in insurance Who we are Experts At Your Service > Over 50 specialists in IT infrastructure > Certified, experienced, passionate Based In Switzerland > 100% self-financed Swiss company

More information

Data Analytics at Logitech Snowflake + Tableau = #Winning

Data Analytics at Logitech Snowflake + Tableau = #Winning Welcome # T C 1 8 Data Analytics at Logitech Snowflake + Tableau = #Winning Avinash Deshpande I am a futurist, scientist, engineer, designer, data evangelist at heart Find me at Avinash Deshpande Chief

More information

A Guide to Best Practices

A Guide to Best Practices APRIL 2014 Putting the Data Lake to Work A Guide to Best Practices SPONSORED BY CONTENTS Introduction 1 What Is a Data Lake and Why Has It Become Popular? 1 The Initial Capabilities of a Data Lake 1 The

More information

Data in the Cloud and Analytics in the Lake

Data in the Cloud and Analytics in the Lake Data in the Cloud and Analytics in the Lake Introduction Working in Analytics for over 5 years Part the digital team at BNZ for 3 years Based in the Auckland office Preferred Languages SQL Python (PySpark)

More information

Implement a Data Warehouse with Microsoft SQL Server

Implement a Data Warehouse with Microsoft SQL Server Implement a Data Warehouse with Microsoft SQL Server 20463D; 5 days, Instructor-led Course Description This course describes how to implement a data warehouse platform to support a BI solution. Students

More information

20463C-Implementing a Data Warehouse with Microsoft SQL Server. Course Content. Course ID#: W 35 Hrs. Course Description: Audience Profile

20463C-Implementing a Data Warehouse with Microsoft SQL Server. Course Content. Course ID#: W 35 Hrs. Course Description: Audience Profile Course Content Course Description: This course describes how to implement a data warehouse platform to support a BI solution. Students will learn how to create a data warehouse 2014, implement ETL with

More information

What is Gluent? The Gluent Data Platform

What is Gluent? The Gluent Data Platform What is Gluent? The Gluent Data Platform The Gluent Data Platform provides a transparent data virtualization layer between traditional databases and modern data storage platforms, such as Hadoop, in the

More information

Building Next- GeneraAon Data IntegraAon Pla1orm. George Xiong ebay Data Pla1orm Architect April 21, 2013

Building Next- GeneraAon Data IntegraAon Pla1orm. George Xiong ebay Data Pla1orm Architect April 21, 2013 Building Next- GeneraAon Data IntegraAon Pla1orm George Xiong ebay Data Pla1orm Architect April 21, 2013 ebay Analytics >50 TB/day new data 100+ Subject Areas >100 PB/day Processed >100 Trillion pairs

More information

Microsoft Perform Data Engineering on Microsoft Azure HDInsight.

Microsoft Perform Data Engineering on Microsoft Azure HDInsight. Microsoft 70-775 Perform Data Engineering on Microsoft Azure HDInsight http://killexams.com/pass4sure/exam-detail/70-775 QUESTION: 30 You are building a security tracking solution in Apache Kafka to parse

More information

Implementing a Data Warehouse with Microsoft SQL Server

Implementing a Data Warehouse with Microsoft SQL Server Course 20463C: Implementing a Data Warehouse with Microsoft SQL Server Page 1 of 6 Implementing a Data Warehouse with Microsoft SQL Server Course 20463C: 4 days; Instructor-Led Introduction This course

More information

Delving Deep into Hadoop Course Contents Introduction to Hadoop and Architecture

Delving Deep into Hadoop Course Contents Introduction to Hadoop and Architecture Delving Deep into Hadoop Course Contents Introduction to Hadoop and Architecture Hadoop 1.0 Architecture Introduction to Hadoop & Big Data Hadoop Evolution Hadoop Architecture Networking Concepts Use cases

More information

Overview. : Cloudera Data Analyst Training. Course Outline :: Cloudera Data Analyst Training::

Overview. : Cloudera Data Analyst Training. Course Outline :: Cloudera Data Analyst Training:: Module Title Duration : Cloudera Data Analyst Training : 4 days Overview Take your knowledge to the next level Cloudera University s four-day data analyst training course will teach you to apply traditional

More information

20767B: IMPLEMENTING A SQL DATA WAREHOUSE

20767B: IMPLEMENTING A SQL DATA WAREHOUSE ABOUT THIS COURSE This 5-day instructor led course describes how to implement a data warehouse platform to support a BI solution. Students will learn how to create a data warehouse with Microsoft SQL Server

More information

Microsoft Big Data and Hadoop

Microsoft Big Data and Hadoop Microsoft Big Data and Hadoop Lara Rubbelke @sqlgal Cindy Gross @sqlcindy 2 The world of data is changing The 4Vs of Big Data http://nosql.mypopescu.com/post/9621746531/a-definition-of-big-data 3 Common

More information

Přehled novinek v SQL Server 2016

Přehled novinek v SQL Server 2016 Přehled novinek v SQL Server 2016 Martin Rys, BI Competency Leader martin.rys@adastragrp.com https://www.linkedin.com/in/martinrys 20.4.2016 1 BI Competency development 2 Trends, modern data warehousing

More information

WHITEPAPER. MemSQL Enterprise Feature List

WHITEPAPER. MemSQL Enterprise Feature List WHITEPAPER MemSQL Enterprise Feature List 2017 MemSQL Enterprise Feature List DEPLOYMENT Provision and deploy MemSQL anywhere according to your desired cluster configuration. On-Premises: Maximize infrastructure

More information

Big Data com Hadoop. VIII Sessão - SQL Bahia. Impala, Hive e Spark. Diógenes Pires 03/03/2018

Big Data com Hadoop. VIII Sessão - SQL Bahia. Impala, Hive e Spark. Diógenes Pires 03/03/2018 Big Data com Hadoop Impala, Hive e Spark VIII Sessão - SQL Bahia 03/03/2018 Diógenes Pires Connect with PASS Sign up for a free membership today at: pass.org #sqlpass Internet Live http://www.internetlivestats.com/

More information

SQL Server 2017 Power your entire data estate from on-premises to cloud

SQL Server 2017 Power your entire data estate from on-premises to cloud SQL Server 2017 Power your entire data estate from on-premises to cloud PREMIER SPONSOR GOLD SPONSORS SILVER SPONSORS BRONZE SPONSORS SUPPORTERS Vulnerabilities (2010-2016) Power your entire data estate

More information

MAPR DATA GOVERNANCE WITHOUT COMPROMISE

MAPR DATA GOVERNANCE WITHOUT COMPROMISE MAPR TECHNOLOGIES, INC. WHITE PAPER JANUARY 2018 MAPR DATA GOVERNANCE TABLE OF CONTENTS EXECUTIVE SUMMARY 3 BACKGROUND 4 MAPR DATA GOVERNANCE 5 CONCLUSION 7 EXECUTIVE SUMMARY The MapR DataOps Governance

More information

Activator Library. Focus on maximizing the value of your data, gain business insights, increase your team s productivity, and achieve success.

Activator Library. Focus on maximizing the value of your data, gain business insights, increase your team s productivity, and achieve success. Focus on maximizing the value of your data, gain business insights, increase your team s productivity, and achieve success. ACTIVATORS Designed to give your team assistance when you need it most without

More information

Event Sponsors. Expo Sponsors. Expo Light Sponsors

Event Sponsors. Expo Sponsors. Expo Light Sponsors Event Sponsors Expo Sponsors Expo Light Sponsors IoT for the BI professional David L. Bojsen - Principal Architect What to expect Level 200 session Which basically means PowerPoint and talking Enthusiastic

More information

Oracle Big Data Discovery

Oracle Big Data Discovery Oracle Big Data Discovery Turning Data into Business Value Harald Erb Oracle Business Analytics & Big Data 1 Safe Harbor Statement The following is intended to outline our general product direction. It

More information

SOFTWARE DEVELOPMENT: DATA SCIENCE

SOFTWARE DEVELOPMENT: DATA SCIENCE PROFESSIONAL CAREER TRAINING INSTITUTE SOFTWARE DEVELOPMENT: DATA SCIENCE www.pcti.edu/data-science applicant@pcti.edu 832-484-9100 PROGRAM OVERVIEW Prepare for a life changing career as a data scientist

More information

What is Data Warehouse like

What is Data Warehouse like What is Data Warehouse like in the Big Data Era? Sales (Asia) Data Warehouse Sales (US) ETL ETL Collects and organizes historical data from multiple sources Inventory Advertising ETL ETL So far Ø Star

More information

Azure Data Factory. Data Integration in the Cloud

Azure Data Factory. Data Integration in the Cloud Azure Data Factory Data Integration in the Cloud 2018 Microsoft Corporation. All rights reserved. This document is provided "as-is." Information and views expressed in this document, including URL and

More information

Vendor: Microsoft. Exam Code: Exam Name: Implementing a Data Warehouse with Microsoft SQL Server Version: Demo

Vendor: Microsoft. Exam Code: Exam Name: Implementing a Data Warehouse with Microsoft SQL Server Version: Demo Vendor: Microsoft Exam Code: 70-463 Exam Name: Implementing a Data Warehouse with Microsoft SQL Server 2012 Version: Demo DEMO QUESTION 1 You are developing a SQL Server Integration Services (SSIS) package

More information

Syncsort DMX-h. Simplifying Big Data Integration. Goals of the Modern Data Architecture SOLUTION SHEET

Syncsort DMX-h. Simplifying Big Data Integration. Goals of the Modern Data Architecture SOLUTION SHEET SOLUTION SHEET Syncsort DMX-h Simplifying Big Data Integration Goals of the Modern Data Architecture Data warehouses and mainframes are mainstays of traditional data architectures and still play a vital

More information

SAP HANA Leading Marketplace for IT and Certification Courses

SAP HANA Leading Marketplace for IT and Certification Courses SAP HANA Overview SAP HANA or High Performance Analytic Appliance is an In-Memory computing combines with a revolutionary platform to perform real time analytics and deploying and developing real time

More information

Implementing a SQL Data Warehouse

Implementing a SQL Data Warehouse Course 20767B: Implementing a SQL Data Warehouse Page 1 of 7 Implementing a SQL Data Warehouse Course 20767B: 4 days; Instructor-Led Introduction This 4-day instructor led course describes how to implement

More information

Hadoop Overview. Lars George Director EMEA Services

Hadoop Overview. Lars George Director EMEA Services Hadoop Overview Lars George Director EMEA Services 1 About Me Director EMEA Services @ Cloudera Consulting on Hadoop projects (everywhere) Apache Committer HBase and Whirr O Reilly Author HBase The Definitive

More information