Microsoft Analytics Platform System (APS)


Microsoft Analytics Platform System (APS)
The turnkey modern data warehouse appliance
Matt Usher, Senior Program Manager, Microsoft

About.me
@two_under
Senior Program Manager
9 years at Microsoft: Visual Studio, Office, Windows Server, Analytics Platform System (APS)
Amazon, Deloitte Consulting, 5 startups

"data warehousing has reached the most significant tipping point since its inception. The biggest, possibly most elaborate data management system in IT is changing."
- Gartner, The State of Data Warehousing in 2012

The traditional data warehouse
[Diagram: data sources; non-relational data]

The modern data warehouse
[Diagram: data sources; non-relational data]

Insights from all your data
Enrich and optimize your data from non-traditional sources

Microsoft Analytics Platform System
The turnkey modern data warehouse appliance
Relational and non-relational data in a single appliance
Enterprise-ready Hadoop
Integrated querying across Hadoop and PDW using T-SQL
Direct integration with Microsoft BI tools such as Microsoft Excel
Near real-time performance with In-Memory Columnstore
Ability to scale out to accommodate growing data
Removal of data warehouse bottlenecks with MPP SQL Server
Concurrency that fuels rapid adoption
Industry's lowest data warehouse appliance price per terabyte
Value through a single appliance solution
Value with flexible hardware options using commodity hardware

Hardware and software engineered together
The ease of an appliance
Analytics Platform System: pre-built hardware + software appliance, co-engineered with Dell, HP, and Quanta
Components: SQL Server Parallel Data Warehouse, PolyBase, Microsoft HDInsight
Pre-built hardware
Pre-installed software
Plug and play
Built-in best practices
Time savings
Built for Big Data


Hadoop alone is not the answer to all Big Data challenges
Steep learning curve, slow and inefficient
Hadoop ecosystem
Move HDFS into the warehouse before analysis
Learn new skills
[Diagram labels: new data sources; T-SQL; ETL; build, integrate, manage, maintain, support]

APS delivers enterprise-ready Hadoop with HDInsight
Manageable, secured, and highly available Hadoop integrated into the appliance
High performance and tuned within the appliance
End-user authentication with Active Directory
100-percent Apache Hadoop
Managed and monitored using System Center
Accessible insights for everyone with Microsoft BI tools
[Diagram labels: SQL Server Parallel Data Warehouse; PolyBase; Microsoft HDInsight]

Connecting islands of data with PolyBase
Bringing Hadoop point solutions and the data warehouse together for users and IT
Provides a single T-SQL query model for PDW and Hadoop with rich features of T-SQL, including joins without ETL
Uses the power of MPP to enhance query execution performance
Supports Windows Azure HDInsight to enable new hybrid cloud scenarios
Provides the ability to query non-Microsoft Hadoop distributions, such as Hortonworks and Cloudera
[Diagram: a Select goes in and a Result set comes back; PolyBase in SQL Server Parallel Data Warehouse connects to Microsoft HDInsight, Microsoft Azure HDInsight, Hortonworks for Windows and Linux, and Cloudera]

PolyBase simplifies using Hadoop data
Bringing islands of Hadoop data together
Running high performance queries against Hadoop data
Archiving data warehouse data to Hadoop (move)
Exporting relational data to Hadoop (copy)
Importing Hadoop data into a data warehouse (copy)

Automatic MapReduce pushdown
[Diagram: source systems feed SQL Server data marts and a Hadoop / data lake (Cloudera, Hortonworks, HDInsight); APS, containing SQL Server Parallel Data Warehouse, PolyBase, and Microsoft HDInsight, issues T-SQL that is pushed down to Hadoop as MapReduce; SQL Server Reporting Services and SQL Server Analysis Services serve analytics, ad-hoc queries, and visualization; refresh by day, hour, or minute]

PolyBase: predicate pushdown
Dynamic binding to an HDFS file or directory:
    //hdfs/social_media/twitter
    //hdfs/social_media/twitter/daily.log
Column filtering (the SELECT list) and row filtering (the WHERE clause):
    SELECT User, Product, Sentiment
    FROM Twitter_Table
    WHERE Hour = Current - 1 AND Date = Today AND Sentiment > 0;
Sample Hadoop data shown on the slide:
    User    Location  Product  Sentiment  Rtwt  Hour  Date
    Sean    CA        xbox     -1         5     2     5-25-14
    Audie   CO        excel    1          0     2     5-25-14
    Suz     WA        xbox     0          0     2     5-25-14
    Tom     IL        sqls     1          8     2     5-25-14
    Sanjay  MN        wp8      1          0     1     5-25-14
    Roger   TX        ssas     1          0     23    5-25-14
    Steve   AL        ssrs     1          0     23    5-24-14
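The row and column filtering on this slide can be illustrated with a small Python simulation. This is only a sketch of the idea, not how PolyBase is implemented: the function names are hypothetical, the sample rows are one reading of the slide's flattened table, and the values chosen for "Today" and the current hour are assumptions made so the query has something to match.

```python
from typing import NamedTuple


class Tweet(NamedTuple):
    user: str
    location: str
    product: str
    sentiment: int
    rtwt: int
    hour: int
    date: str


# Sample rows as read from the slide's table (row assignment is an assumption).
HDFS_ROWS = [
    Tweet("Sean", "CA", "xbox", -1, 5, 2, "5-25-14"),
    Tweet("Audie", "CO", "excel", 1, 0, 2, "5-25-14"),
    Tweet("Suz", "WA", "xbox", 0, 0, 2, "5-25-14"),
    Tweet("Tom", "IL", "sqls", 1, 8, 2, "5-25-14"),
    Tweet("Sanjay", "MN", "wp8", 1, 0, 1, "5-25-14"),
    Tweet("Roger", "TX", "ssas", 1, 0, 23, "5-25-14"),
    Tweet("Steve", "AL", "ssrs", 1, 0, 23, "5-24-14"),
]


def matches(row, today, current_hour):
    """WHERE Hour = Current - 1 AND Date = Today AND Sentiment > 0."""
    return row.hour == current_hour - 1 and row.date == today and row.sentiment > 0


def scan_with_pushdown(rows, today, current_hour):
    """Filter rows and columns at the source; only matching cells are transferred."""
    result = [
        (r.user, r.product, r.sentiment)  # column filtering
        for r in rows
        if matches(r, today, current_hour)  # row filtering
    ]
    cells_transferred = sum(len(t) for t in result)
    return result, cells_transferred


def scan_without_pushdown(rows, today, current_hour):
    """Transfer every row and column first, then filter on the warehouse side."""
    transferred = list(rows)
    cells_transferred = len(transferred) * len(Tweet._fields)
    result = [
        (r.user, r.product, r.sentiment)
        for r in transferred
        if matches(r, today, current_hour)
    ]
    return result, cells_transferred


# Assumed values: today is 5-25-14 and the current hour is 3, so Hour = 2 matches.
pushed, pushed_cells = scan_with_pushdown(HDFS_ROWS, "5-25-14", 3)
plain, plain_cells = scan_without_pushdown(HDFS_ROWS, "5-25-14", 3)
```

Under those assumed values both scans return the rows for Audie and Tom, but the pushdown variant moves 6 cells instead of 49, which is the point the slide is making.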

Demo


Performance and Scale Limitations in Traditional Data Warehouses
Scale up: diminishing scale as requirements grow, with a forklift upgrade at each step
Rowstore: querying data by row gives sub-optimal performance for many data warehouse queries
[Diagram: rows R1 to R6 across columns C1 to C4, stored row by row in pages 1 to 3]

Scaling out your data to petabytes
Scale-out technologies in the Analytics Platform System
Scale out: multiple nodes (PDW / HDInsight) with dedicated CPU, memory, and storage
Ability to incrementally add hardware for near-linear scale to multiple petabytes
Ability to handle query complexity and concurrency at scale
No forklift of prior warehouse to increase capacity
Ability to scale out HDInsight and PDW
[Diagram: PDW capacity scale from 0 terabytes to 6 petabytes]
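As a rough illustration of the scale-out idea (each node owning a slice of the data and computing on it independently), here is a minimal Python sketch. It is a generic model of hash distribution with parallel partial aggregation, not PDW's actual engine; the function names, the CRC32 hash, and the sample data are all assumptions chosen for the example.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def distribute(rows, node_count, key):
    """Assign each row to a node by hashing its distribution key."""
    nodes = [[] for _ in range(node_count)]
    for row in rows:
        digest = zlib.crc32(str(row[key]).encode("utf-8"))
        nodes[digest % node_count].append(row)
    return nodes


def local_sum(rows, column):
    """Work done independently on one node, against only its own rows."""
    return sum(row[column] for row in rows)


def parallel_sum(nodes, column):
    """Run the per-node aggregation concurrently, then combine the partials."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = list(pool.map(lambda part: local_sum(part, column), nodes))
    return sum(partials)


# Hypothetical fact table: 1,000 orders with small integer amounts.
sales = [{"order_id": i, "amount": i % 7} for i in range(1000)]

four_nodes = distribute(sales, node_count=4, key="order_id")
six_nodes = distribute(sales, node_count=6, key="order_id")

total_on_four = parallel_sum(four_nodes, "amount")
total_on_six = parallel_sum(six_nodes, "amount")
```

Adding nodes changes how the rows are spread but not the answer, which is the property that lets capacity grow by adding hardware rather than replacing it.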

Clustered Columnstore Index
Why is a clustered columnstore index important?
Saves space
Provides easier management by eliminating maintenance of secondary indexes
Supports all PDW data types, including high-precision decimal data types and more
In-Memory Columnstore is featured in the storage engine in PDW AU1
[Chart: space used in GB (table with 101 million rows) for six configurations, 1 to 6, on an axis from 0.0 to 20.0; space used = table space + index space; 91% savings]

Concurrency that fuels rapid adoption
Great performance with mixed workloads
[Diagram: ERP, CRM, and LOB apps load the Analytics Platform System through ETL/ELT with SSIS, DQS, and MDS, or ETL/ELT with DWLoader; intra-day CRTAS to SQL Server SMP; near real-time PDW Link Table; Hadoop / Big Data through PolyBase and HDInsight; outputs through SNAC to BI tools: real-time reporting and cubes (columnstore, ROLAP / MOLAP, DirectQuery) and fast ad hoc queries]

Blazing-fast performance
MPP and In-Memory Columnstore for next-generation performance
Up to 100x faster queries
Up to 15x more compression
Updateable clustered columnstore vs. table with customary indexing
Parallel query execution
Store data in columnar format for massive compression
Load data into or out of memory for next-generation performance with up to 60% improvement in data loading speed
Updateable and clustered for real-time trickle loading
[Diagram: columnstore index representation; query and results]
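Why a columnar layout compresses well can be shown with a toy Python example. This is a generic illustration using run-length encoding on pivoted columns; it is not the PDW storage format, and the helper names and sample rows are assumptions made for the sketch.

```python
from itertools import groupby


def to_columns(rows):
    """Pivot a list of equal-length row tuples into one list per column."""
    return [list(col) for col in zip(*rows)]


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def run_length_decode(runs):
    """Expand (value, count) pairs back into the original sequence."""
    return [value for value, count in runs for _ in range(count)]


# Hypothetical rows: (location, product, sentiment), sorted so repeats are adjacent.
rows = (
    [("WA", "xbox", 1)] * 4
    + [("WA", "excel", 1)] * 3
    + [("CA", "excel", 0)] * 3
)

columns = to_columns(rows)
encoded = [run_length_encode(col) for col in columns]

stored_values_by_row = sum(len(r) for r in rows)           # one value per cell
stored_runs_by_column = sum(len(runs) for runs in encoded)  # one entry per run
```

Stored row by row, every cell is written out; stored column by column, each column collapses to a couple of runs because similar values sit next to each other. A query that touches only one column also needs to read only that column's runs.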


Lowest $/TB for data warehouse appliance
High performance using commodity hardware
Significantly lower price per terabyte than the closest competitor
Lower storage costs with Windows Server 2012 Storage Spaces
[Chart: price per terabyte for leading vendors (Oracle, EMC, IBM, Teradata, Microsoft); y-axis in thousands of dollars, $0 to $30; price per terabyte for user-available storage (compressed). NOTE: Orange line indicates average price per terabyte.]

Demo

www.microsoft.com/aps
www.microsoft.com/bigdata

Questions?

Thank You for Attending