marko.hotti@microsoft.com

GARTNER MAGIC QUADRANT DW & BI
» Data Warehouse Database Management Systems
» Business Intelligence and Analytics Platforms
* Disclaimer: Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner's research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

The Traditional Data Warehouse

Breaking Points of The Traditional Data Warehouse

Introducing The Modern Data Warehouse
[Figure labels: Business Intelligence, Data Sources]

Microsoft Hadoop Vision
Insights to all users by activating new types of data

Limitations: Performance and Scale today
» Diminishing performance
» Scale UP
» Rowstore
» Existing Tables (Partitions)
» Diminishing Scale as requirements grow
» Non-optimal performance for many DW queries

SQL Server 2012 Parallel Data Warehouse (PDW)
Insights on any data of any size
» Next-generation Performance At Scale
» Built For Big Data

» Manageable Costs
» Scale Out MPP versus Scale Up SMP
» Big Data Integration
» Query Performance
» Updateable xVelocity Columnstore
» Appliance Simplicity: HW + SW

What is Parallel Data Warehouse?
Shared-nothing parallel database system
» Massively parallel processing (MPP)
» A Control server that accepts user queries, generates a plan, and distributes operations in parallel to compute nodes
» Multiple Compute servers running SQL Server
» A Management server for administering the system
» A Data Movement Service that facilitates parallel SQL operations
Delivered as an appliance
» Balanced and pre-configured software and industry standard hardware from Dell or HP
» Single Call Support
» Fastest Time to Market
» Scales from 2 to 56 Nodes
HP Example
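The control/compute split described on this slide can be sketched in a few lines of Python. This is only a toy model of the idea, not PDW code: the `COMPUTE_NODES` data, `local_aggregate` and `control_node_query` are hypothetical names invented for illustration, and threads stand in for separate servers.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: each inner list is the slice of a sales table
# held by one compute node (shared nothing: no node sees another's rows).
COMPUTE_NODES = [
    [("store_a", 10), ("store_b", 5)],
    [("store_a", 7)],
    [("store_b", 1), ("store_c", 4)],
]


def local_aggregate(rows):
    """Step executed on one compute node: partial SUM(qty) GROUP BY store."""
    partial = {}
    for store, qty in rows:
        partial[store] = partial.get(store, 0) + qty
    return partial


def control_node_query(nodes):
    """Control node: send the same step to every node in parallel,
    then merge the partial results into the final answer."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        partials = list(pool.map(local_aggregate, nodes))
    merged = {}
    for partial in partials:
        for store, qty in partial.items():
            merged[store] = merged.get(store, 0) + qty
    return merged


if __name__ == "__main__":
    print(control_node_query(COMPUTE_NODES))
```

Each node only ever touches its own slice, which is why adding nodes adds both storage and processing capacity in this model.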

Key Design Elements
» Modular Design
» High Density
Leverage latest Microsoft software features
» Windows Server 2012 Storage Spaces
» Windows Server 2012 Hyper-V
» SQL Server 2012 xVelocity ColumnStore
HP Example

Ultra Shared Nothing architecture: Distribution
Larger Fact Table is Hash Distributed Across All Compute Nodes
[Figure: star schema spread over the compute nodes. Each node is shown holding TD, SD, PD and MD plus one slice of the Sales Facts table: SF 01-08, SF 09-16, SF 17-24, SF 25-32, SF 33-n.]
» Time Dim (TD): Date Dim ID, Calendar Year, Calendar Qtr, Calendar Mo, Calendar Day
» Product Dim (PD): Prod Dim ID, Prod Category, Prod Sub Cat, Prod Desc
» Store Dim (SD): Store Dim ID, Store Name, Store Mgr, Store Size
» Mktg Campaign Dim (MD): Mktg Camp ID, Camp Name, Camp Mgr, Camp Start, Camp End
» Sales Facts (SF): Date Dim ID, Store Dim ID, Prod Dim ID, Mktg Camp ID, Qty Sold, Dollars Sold
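A rough Python sketch of the layout on this slide: fact rows are assigned to a node by hashing one column, while a small dimension is copied to every node so joins can run locally. The hash function (CRC32), the count of four distributions and all function names are assumptions made for illustration; the slide does not say how PDW computes its hash.

```python
import zlib

NUM_DISTRIBUTIONS = 4  # assumed for illustration; not a PDW setting


def distribution_for(key, n=NUM_DISTRIBUTIONS):
    """Map a distribution-key value to a distribution using a stable hash."""
    return zlib.crc32(str(key).encode("utf-8")) % n


def distribute(fact_rows, key_index, n=NUM_DISTRIBUTIONS):
    """Place every fact row in exactly one distribution, chosen by its key."""
    buckets = [[] for _ in range(n)]
    for row in fact_rows:
        buckets[distribution_for(row[key_index], n)].append(row)
    return buckets


def local_join(fact_slice, store_dim):
    """Join one node's fact slice to that node's full copy of the
    store dimension; no rows need to move between nodes."""
    return [(store_dim[store_id], qty) for (_date_id, store_id, qty) in fact_slice]


if __name__ == "__main__":
    # Hypothetical rows: (date_dim_id, store_dim_id, qty_sold)
    facts = [(20130101, 1, 5), (20130101, 2, 3), (20130102, 1, 7)]
    stores = {1: "North", 2: "South"}  # replicated to every node
    for node, rows in enumerate(distribute(facts, key_index=1)):
        print(node, local_join(rows, stores))
```

Because equal key values always hash to the same distribution, all rows for one store end up on one node in this sketch.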

In-Memory Columnstore in PDW V2 & SQL Server 2014
xVelocity in-memory columnstore in PDW
» Columnstore index as primary data store in a scale-out MPP Data Warehouse (PDW V2 Appliance)
» Updateable clustered columnstore index (CCI)
» Support for bulk load and insert/update/delete
» Extended data types: decimal/numeric for all precision and scale
» Query processing enhancements for more batch mode processing (for example, Outer/Semi/Antisemi joins, union all, scalar aggregation)
Customer benefits
» Outstanding query performance from in-memory columnstore index
» 600 GB per hour for a single 12-core server
» Significant hardware cost savings due to high compression
» 4-15x compression ratio
» Improved productivity through updateable index
» Ships in PDW V2 appliance and SQL Server 2014
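To give a feel for why storing data by column tends to compress well, the sketch below pivots rows into columns and run-length encodes each column. Run-length encoding is a generic textbook technique chosen here as an assumption; the slide does not describe the encodings xVelocity actually uses, and the sample rows are made up.

```python
from itertools import groupby


def to_columns(rows):
    """Pivot a list of row tuples into one list per column."""
    return [list(col) for col in zip(*rows)]


def run_length_encode(values):
    """Collapse each run of equal adjacent values into (value, count)."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def run_length_decode(runs):
    """Expand (value, count) pairs back into the original column."""
    out = []
    for value, count in runs:
        out.extend([value] * count)
    return out


if __name__ == "__main__":
    # Hypothetical fact rows: (calendar_year, prod_category, qty_sold)
    rows = [
        (2013, "Bikes", 2),
        (2013, "Bikes", 1),
        (2013, "Bikes", 4),
        (2013, "Helmets", 1),
    ]
    for column in to_columns(rows):
        print(run_length_encode(column))
```

Columns with few distinct values (year, category) shrink to a handful of runs, while a high-cardinality column such as quantity barely shrinks, so real ratios depend heavily on the data.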

Introducing PolyBase
Fundamental breakthrough in data processing
SQL Server 2012 PDW Powered by PolyBase
» Single Query; Structured and Unstructured
» Query and join Hadoop tables with Relational Tables
» Use Standard SQL language: Select, From, Where
[Figure labels: SQL, Database, HDFS (Hadoop)]
» Existing SQL Skillset
» No IT Intervention
» Save Time and Costs
» Analyze All Data Types

External Tables
» An external table is PDW's representation of data residing in HDFS
» The table (metadata) lives in the context of a SQL Server database
» The actual table data resides in HDFS
    CREATE EXTERNAL TABLE table_name ({<column_definition>} [,...n ])
    {WITH (LOCATION = <URI>, [FORMAT_OPTIONS = (<VALUES>)])}
    [;]
» LOCATION: required to indicate location of Hadoop cluster
» FORMAT_OPTIONS: optional format options associated with parsing of data from HDFS (e.g. field delimiters & reject-related thresholds)
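The two kinds of format option mentioned here, a field delimiter and a reject threshold, can be illustrated with a small Python parser. This is a sketch of the concept only: `parse_external_rows`, `field_terminator` and `reject_value` are hypothetical names, and the exact rules PDW applies when counting rejected rows are not stated on the slide.

```python
from datetime import date


def parse_external_rows(lines, column_types, field_terminator="|", reject_value=0):
    """Split delimited text lines into typed tuples.

    A line is rejected when it has the wrong number of fields or a field
    cannot be converted. Once more than `reject_value` lines have been
    rejected, the whole read fails (assumed behaviour for this sketch).
    """
    rows, rejected = [], 0
    for line in lines:
        fields = line.rstrip("\n").split(field_terminator)
        try:
            if len(fields) != len(column_types):
                raise ValueError("wrong field count")
            rows.append(tuple(cast(f) for cast, f in zip(column_types, fields)))
        except ValueError:
            rejected += 1
            if rejected > reject_value:
                raise RuntimeError(f"reject threshold exceeded: {rejected} bad rows")
    return rows, rejected


if __name__ == "__main__":
    # Column layout borrowed from the ClickStream example: url, event_date, user_ip
    types = (str, date.fromisoformat, str)
    sample = ["a.com|2013-01-01|10.0.0.1", "garbled line", "b.com|2013-01-02|10.0.0.2"]
    print(parse_external_rows(sample, types, reject_value=1))
```

The table definition carries only this parsing metadata; the lines themselves stay in the files they came from.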

Native Query Across Hadoop and PDW
Parallel Data Import from HDFS into PDW
» Persistently storing data from HDFS in PDW tables
» Fully parallelized via CREATE TABLE AS SELECT (CTAS), with external tables as the source and PDW tables (either distributed or replicated) as the destination
    CREATE TABLE ClickStream_PDW
    WITH (DISTRIBUTION = HASH(url))
    AS SELECT url, event_date, user_ip FROM ClickStream
» Retrieval of data in HDFS on-the-fly
[Figure labels: Sensor & RFID, Web Apps, Social Apps, Mobile Apps; Hadoop (unstructured data); Parallel HDFS Reads; HDFS bridge; DMS Reader 1 to DMS Reader N; Parallel Importing; CTAS; External Table; Enhanced PDW query engine; Results; Traditional DW applications; PDW (structured data)]
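The import path on this slide (several readers pulling file splits in parallel, with rows then landing in a table hash-distributed on `url`) can be approximated as follows. It is a simplified model under stated assumptions: lists of strings stand in for HDFS splits, threads stand in for DMS readers, CRC32 stands in for the unspecified hash, and `read_split` and `import_splits` are hypothetical names.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def read_split(lines, field_terminator="|"):
    """One reader: parse the lines of a single file split into tuples."""
    return [tuple(line.split(field_terminator)) for line in lines if line]


def import_splits(splits, num_distributions, key_index=0):
    """Read all splits in parallel, then route each row to the
    distribution chosen by hashing its key column (url by default)."""
    with ThreadPoolExecutor(max_workers=max(1, len(splits))) as pool:
        parsed = list(pool.map(read_split, splits))
    distributions = [[] for _ in range(num_distributions)]
    for rows in parsed:
        for row in rows:
            target = zlib.crc32(row[key_index].encode("utf-8")) % num_distributions
            distributions[target].append(row)
    return distributions


if __name__ == "__main__":
    splits = [
        ["a.com|2013-01-01|10.0.0.1", "b.com|2013-01-01|10.0.0.2"],
        ["a.com|2013-01-02|10.0.0.3"],
    ]
    for i, rows in enumerate(import_splits(splits, num_distributions=3)):
        print(i, rows)
```

Rows for the same `url` end up together regardless of which split they were read from, which is the property the `HASH(url)` clause asks for.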

Native Query Across Hadoop and PDW
Parallel Data Export from PDW into HDFS
» Fully parallelized via CREATE EXTERNAL TABLE AS SELECT (CETAS), with external tables as the destination and PDW tables as the source
» Round trip of data is possible: first import data from HDFS, join it with relational data, and then export the results back to HDFS
    CREATE EXTERNAL TABLE ClickStream (url, event_date, user_ip)
    WITH (LOCATION = 'hdfs://myhadoop:5000/users/outputdir',
          FORMAT_OPTIONS (FIELD_TERMINATOR = ' '))
    AS SELECT url, event_date, user_ip FROM ClickStream_PDW
[Figure labels: Sensor & RFID, Web Apps, Social Apps, Mobile Apps; HDFS data nodes (unstructured data); Parallel HDFS Writes; HDFS bridge; DMS Writer 1 to DMS Writer N; Parallel Reading; CETAS; External Table; Enhanced PDW query engine; Results; Traditional DW applications; PDW (structured data)]
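The export direction can be modelled the same way: one writer per distribution, each producing its own delimited file in the output directory. The sketch below writes to a local temporary directory instead of HDFS, and the `part-00000.txt` file naming, `write_part` and `export_distributions` are assumptions for illustration rather than PDW behaviour.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def write_part(job):
    """One writer: dump a single distribution's rows as delimited text."""
    out_dir, part_no, rows, field_terminator = job
    path = os.path.join(out_dir, f"part-{part_no:05d}.txt")
    with open(path, "w", encoding="utf-8") as handle:
        for row in rows:
            handle.write(field_terminator.join(row) + "\n")
    return path


def export_distributions(distributions, out_dir, field_terminator="|"):
    """Run one writer per distribution in parallel; return the file paths."""
    jobs = [(out_dir, i, rows, field_terminator) for i, rows in enumerate(distributions)]
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        return list(pool.map(write_part, jobs))


if __name__ == "__main__":
    data = [
        [("a.com", "2013-01-01", "10.0.0.1")],
        [("b.com", "2013-01-02", "10.0.0.2")],
    ]
    with tempfile.TemporaryDirectory() as out_dir:
        for path in export_distributions(data, out_dir):
            print(os.path.basename(path))
```

Since every writer owns a separate file, no coordination between writers is needed in this model.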

PDW V2.0 Management Dashboard

Microsoft Business Intelligence Platform