Data Architectures in Azure for Analytics & Big Data


Data Architectures in Azure for Analytics & Big Data
October 20, 2018
Melissa Coates, Solution Architect, BlueGranite
Microsoft Data Platform MVP
Blog: www.sqlchick.com
Twitter: @sqlchick

Data Architecture
A set of rules, policies, standards, & models that govern and define the type of data collected & how it is used, stored, managed, & integrated within an organization & its database systems.
Source: Techopedia

Data Architecture Components
Source: https://docs.microsoft.com/en-us/azure/architecture/data-guide/big-data/

Azure
A public cloud computing platform and infrastructure for building, deploying, and managing MSFT-specific and third-party software and services through a global network of Microsoft-managed datacenters.
Source: http://azureplatform.azurewebsites.net/

Technologies

Data Storage: Relational Databases
IaaS (Infrastructure as a Service): Relational database of your choice in a virtual machine
PaaS (Platform as a Service), SMP (Symmetric Multi-Processing): SQL Database, SQL Database Managed Instance, Database for MySQL, Database for PostgreSQL, Database for MariaDB
PaaS (Platform as a Service), MPP (Massively Parallel Processing): SQL Data Warehouse
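The SMP versus MPP distinction on the relational storage slide can be illustrated with a toy example. The sketch below is not how SQL Data Warehouse is implemented; it only shows the general idea of hash-distributing rows so that each distribution aggregates its own slice before the partial results are combined. The helper names, the CRC32 hash, and the sample rows are assumptions made for illustration.

```python
from collections import defaultdict
from zlib import crc32


def distribute(rows, key, n_distributions):
    """Assign each row to a distribution by hashing its key (hypothetical helper)."""
    buckets = defaultdict(list)
    for row in rows:
        bucket = crc32(str(row[key]).encode("utf-8")) % n_distributions
        buckets[bucket].append(row)
    return buckets


def mpp_sum(rows, key, value, n_distributions=4):
    """Sum `value` per `key`: local aggregation per distribution, then a combine step."""
    buckets = distribute(rows, key, n_distributions)

    partials = []
    for bucket_rows in buckets.values():
        # In a real MPP engine these local aggregations would run in parallel.
        local = defaultdict(int)
        for row in bucket_rows:
            local[row[key]] += row[value]
        partials.append(local)

    combined = defaultdict(int)
    for local in partials:
        for k, v in local.items():
            combined[k] += v
    return dict(combined)


# Made-up sample data, for illustration only.
sales = [
    {"region": "East", "amount": 10},
    {"region": "West", "amount": 5},
    {"region": "East", "amount": 7},
    {"region": "North", "amount": 3},
]
totals = mpp_sum(sales, key="region", value="amount")
```

Because rows are distributed on the same column they are grouped by, every row for a given region lands in one distribution, so the combine step never has to reconcile the same key across distributions.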

Data Storage: Big Data
PaaS (Platform as a Service)
Object Storage (Flat): Blob Storage
Hierarchical Storage: Data Lake Store (Gen1)
Multi-Modal: Data Lake Storage (Gen2) (PREVIEW)

Data Storage: NoSQL
PaaS (Platform as a Service)
Services: HDInsight HBase Cluster, Table Storage, CosmosDB (Multi-Model)
Data models: Key Value, Column Family, JSON Documents, Graph
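To make the four data models named on the NoSQL slide concrete, here is a minimal sketch that expresses one made-up customer record in each shape using plain Python structures. It does not use any Azure SDK, and the record, keys, and helper function are hypothetical.

```python
import json

# Key-value: an opaque value looked up by a single key.
key_value = {"customer:42": json.dumps({"name": "Ada", "city": "Oslo"})}

# Column family: row key -> column family -> column -> value.
column_family = {
    "42": {
        "profile": {"name": "Ada"},
        "address": {"city": "Oslo"},
    }
}

# JSON document: a nested, self-describing record.
document = {
    "id": "42",
    "name": "Ada",
    "address": {"city": "Oslo"},
    "orders": [{"sku": "A1", "qty": 2}],
}

# Graph: vertices plus labeled edges between them.
vertices = {"customer:42": {"name": "Ada"}, "product:A1": {"title": "Widget"}}
edges = [("customer:42", "purchased", "product:A1")]


def purchased_by(customer_id):
    """Follow 'purchased' edges out of one customer vertex (hypothetical helper)."""
    return [dst for src, label, dst in edges if src == customer_id and label == "purchased"]


city_from_kv = json.loads(key_value["customer:42"])["city"]
city_from_columns = column_family["42"]["address"]["city"]
city_from_document = document["address"]["city"]
products = purchased_by("customer:42")
```

The same fact (the customer's city) is reachable in the first three models, but only the graph shape makes the relationship to a product a first-class thing to traverse.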

Data Storage: Analytical & OLAP
IaaS (Infrastructure as a Service): SQL Server Analysis Services
PaaS (Platform as a Service): Analysis Services, Power BI

Compute: Big Data
Service models: IaaS (Infrastructure as a Service), PaaS (Platform as a Service), SaaS (Software as a Service)
HDInsight in a VM
HDInsight Hadoop Cluster
HDInsight Spark Cluster
HDInsight Interactive Query Cluster (Hive LLAP)
Databricks
Data Lake Analytics

Compute: Streaming & Event Processing
PaaS (Platform as a Service)
HDInsight Kafka Cluster
HDInsight Storm Cluster
HDInsight Spark Streaming Cluster
IoT Hub
Event Hub
Stream Analytics

Compute: Data Integration
Service models: IaaS (Infrastructure as a Service), PaaS (Platform as a Service), Serverless
Tool of your choice in a virtual machine
Data Factory
Databricks
HDInsight: Spark, Hive, Pig, Sqoop, Oozie
Functions
Automation

Reference Architectures
The following are examples only! There are many variations & opportunities to exchange one service for another.

Small/Medium Data Warehousing
Source Data
Multi-Structured Data: Blob Storage
DW: Structured Data: SQL Database
Semantic Layer: Analysis Services
Reporting & Analysis Tools: Power BI, Excel

Enterprise Data Warehousing and BI
Multi-Structured Data: Data Lake Storage
DW: Structured Data: SQL Data Warehouse
Data Mart(s): SQL Database
Semantic Layer: Analysis Services
Reporting: Power BI, Excel

Data Science and Artificial Intelligence
Multi-Structured Data: Data Lake Storage
Data Science and AI: Databricks, Machine Learning, HDInsight, Cognitive Services
DW: Structured Data: SQL Data Warehouse
Data Mart(s): SQL Database
Reporting: Power BI, Excel

Unified Data Science & Data Engineering
Data Lake: Multi-Structured Data: Data Lake Storage (Raw Data, Curated Data, Data Science Sandbox)
Databricks: Scheduled Notebook Job, Operationalized Analytics, Exploratory Analytics
Structured Data: SQL Database
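The raw, curated, and sandbox areas of the data lake are often kept apart simply by folder convention. The sketch below shows one possible convention; the zone names, the subject/date folder layout, and both helper functions are assumptions for illustration and are not taken from the slides.

```python
from datetime import date
from pathlib import PurePosixPath

# Hypothetical zone names, loosely following the labels on the slide.
ZONES = ("raw", "curated", "sandbox")


def lake_path(zone, subject, load_date, filename):
    """Build a zone/subject/yyyy/mm/dd/filename path (assumed layout)."""
    if zone not in ZONES:
        raise ValueError(f"unknown zone: {zone}")
    return PurePosixPath(zone, subject, f"{load_date:%Y/%m/%d}", filename)


def promote_to_curated(raw_path):
    """Map a path in the raw zone to the matching path in the curated zone."""
    if raw_path.parts[0] != "raw":
        raise ValueError("only raw-zone paths can be promoted")
    return PurePosixPath("curated", *raw_path.parts[1:])


raw = lake_path("raw", "sales", date(2018, 10, 20), "orders.csv")
curated = promote_to_curated(raw)
```

Keeping the same subject and date structure in every zone means a scheduled job only has to swap the leading folder when it writes cleaned data.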

Big Data Interactive Querying (SQL on Hadoop)
Data Lake: Data Lake Storage
Hive Metastore: SQL Database
Hive Data Warehouse: HDInsight Interactive Query Cluster (Hive LLAP)
Query language: HiveQL

Big Data Batch Processing
Data Lake: Multi-Structured Data: Data Lake Storage
Big Data Job Processing: Data Lake Analytics
U-SQL Job Processing: Job 1, Job 2
U-SQL Extensions: Python, Cognitive Services
ADLA Catalog Database: Tables, Views, Schemas, Procedures, Functions, Assemblies, External Data Sources
External Data Sources: SQL Server in VM, SQL DB, SQL DW

IoT + Batch Data (Lambda Architecture)
Speed Layer: Event Hub, Stream Analytics, Streaming Dashboard (Power BI)
Batch Layer: Data Lake Storage
Serving Layer: SQL Data Warehouse, Analysis Services, Power BI, Excel
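The three layers of the Lambda architecture can be sketched in a few lines. This is a toy, in-memory illustration of the pattern only: it does not call Event Hub, Stream Analytics, or any other Azure service, and the event shape, the cutoff timestamp, and the function names are assumptions.

```python
from collections import Counter


def build_batch_view(events, cutoff):
    """Batch layer: recompute counts per device from all events up to the cutoff."""
    return Counter(e["device"] for e in events if e["ts"] <= cutoff)


def build_speed_view(events, cutoff):
    """Speed layer: count only the recent events the last batch run has not seen."""
    return Counter(e["device"] for e in events if e["ts"] > cutoff)


def serve(batch_view, speed_view):
    """Serving layer: answer queries by merging the batch and speed views."""
    return batch_view + speed_view


# Made-up device events; `ts` is an arbitrary increasing timestamp.
events = [
    {"device": "a", "ts": 1},
    {"device": "b", "ts": 2},
    {"device": "a", "ts": 3},
    {"device": "a", "ts": 5},
]
last_batch_cutoff = 3

batch = build_batch_view(events, last_batch_cutoff)
speed = build_speed_view(events, last_batch_cutoff)
merged = serve(batch, speed)
```

The batch view is complete but stale, the speed view is fresh but partial, and a query against the merged result sees both. When the next batch run moves the cutoff forward, the speed view for the covered period can be discarded.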

Operational BI (Embedded BI)
Source Data: SQL Database
Data Model + Reports: Power BI Desktop
Published Reports: Power BI Service (Premium Capacity, App Workspace)
Embedded Visuals: Custom Application (REST API calls)

Web Application
Web Page
Web App (App Service Plan)
SQL Database
Cache
Diagnostics (Storage Account)
Backups (Storage Account)

Wrap-Up

More Info Solution Architectures: https://azure.microsoft.com/en-us/solutions/architecture/

More Info Data Architecture Guide: https://docs.microsoft.com/en-us/azure/architecture/data-guide/

Thanks!
Download the latest version of the slides: SQLChick.com > Presentations & Downloads page
Creative Commons License 3.0
Attribute to me as original author if you share this material
No usage of this material for commercial purposes
No derivatives or changes to this material