Survey of the Azure Data Landscape. Ike Ellis



Wintellect Core Services
- Consulting: custom software application development and architecture
- Instructor-Led Training: Microsoft's #1 training vendor for over 14 years, having trained more than 50,000 Microsoft developers
- On-Demand Training: world-class, subscription-based online training

Industry Influencers We wrote the book (over 30 of them)

Some Microsoft Related Highlights
- Gold Azure Partner
- 2016 IAMCP Gold Partner of the Year for the U.S., announced at WPC
- CEO is the Microsoft Regional Director (RD) for Atlanta
- DevOps competency partner
- Multiple ALM Rangers
- Software Development competency partner
- Xamarin Premier Consulting Partner
- Multiple Xamarin Certified Engineers
- Chosen to teach the 2-day Xamarin University pre-con at Evolve 2016
- Other: Visual Studio Integration Partner, Azure Circle Partner, ALM Inner Circle Partner, MVP of the Year, and more

Agenda
- Azure Blob Storage
- Azure Table Storage
- Azure CosmosDB
- Azure SQL Database
- Azure SQL in a VM
- Azure SQL Data Warehouse
- Azure Data Lake
- Lots of other things supported: Postgres, MySQL, MongoDB, Redis

Topic Agenda
- What is it?
- How is it used?
- What are the competitors?
- DEMO!

Azure Blob Storage

Azure Blob Storage
- Blobs are files (PDFs, JPGs, DOCs, etc.)
- Highly durable, massively scalable
- More than 40 trillion stored objects
- 3.5+ million requests/second
- Exposed via REST APIs
- Use them in .NET, C++, Java, Node.js, Android
- AzCopy, PowerShell
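Because blobs are exposed over REST, every blob has an HTTP address. The sketch below shows how such an address is composed, assuming the default public-cloud endpoint form `https://<account>.blob.core.windows.net/<container>/<blob>`. The account, container, and blob names are hypothetical, and the helper `blob_url` is illustrative rather than part of any SDK.

```python
from urllib.parse import quote


def blob_url(account: str, container: str, blob_name: str) -> str:
    """Build the REST endpoint URL for a single blob.

    Assumes the default Azure public-cloud endpoint suffix; the names
    passed in below are hypothetical placeholders.
    """
    # Percent-encode the blob name but keep "/" so virtual folders survive.
    encoded = quote(blob_name, safe="/")
    return f"https://{account}.blob.core.windows.net/{container}/{encoded}"


url = blob_url("examplestore", "invoices", "2017/invoice 001.pdf")
print(url)
```

A private container would additionally need an authorization header or a shared access signature on the request; that part is left out of this sketch.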

Blob Storage Fault Tolerance & Scalability

What kind of blobs can I have?
- Share files with clients
- Off-load static content from web servers (invoices, contracts, resumes)
- Azure Websites (Platform as a Service): no files on a web server
- SQL BAK files
- VM hard drives

Competitors? (Blob Storage)
- On-premises SANs and arrays
- Amazon S3

Azure Blob Storage Demo

Azure Table Storage
- Much of it is similar to Azure Blob Storage
- Same scalability and redundancy
- Affordable price
- Very, very fast
- NoSQL key-value pair solution
- Quick data retrieval, little configuration
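To make the key-value idea concrete: Table Storage addresses each entity by a PartitionKey and RowKey pair. The sketch below models that lookup with a plain dictionary. `ToyTable` is a hypothetical in-memory stand-in, not the Azure SDK, and the sample entity is invented for illustration.

```python
from typing import Dict, Optional, Tuple

# An entity is a bag of simple properties; the (PartitionKey, RowKey) pair is its key.
Entity = Dict[str, object]


class ToyTable:
    """In-memory stand-in for a key-value table (not the Azure SDK)."""

    def __init__(self) -> None:
        self._rows: Dict[Tuple[str, str], Entity] = {}

    def upsert(self, partition_key: str, row_key: str, properties: Entity) -> None:
        # Insert or overwrite the entity stored under this key pair.
        self._rows[(partition_key, row_key)] = dict(properties)

    def get(self, partition_key: str, row_key: str) -> Optional[Entity]:
        # Point lookup: a single dictionary access, no joins or scans.
        return self._rows.get((partition_key, row_key))


table = ToyTable()
table.upsert("customers-WA", "1", {"firstname": "Thomas", "city": "Seattle"})
print(table.get("customers-WA", "1"))
print(table.get("customers-WA", "2"))
```

The second lookup returns `None` because no entity exists under that key pair. There is no query planner involved, which is roughly why retrieval by key is quick and needs little configuration.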

Competitors (Table Storage)
- Amazon DynamoDB

Azure Table Storage Demo

JSON Document
- Standard for passing data between a server and a web application
- Replacement for XML
- Hierarchical
- Terse
- Simple data types

Modeling in CosmosDB
- Reading is one operation
- Writing is one operation
- No assembly or de-assembly

    {
      "id": "1",
      "firstname": "Thomas",
      "lastname": "Andersen",
      "addresses": [
        {
          "line1": "100 Some Street",
          "line2": "Unit 1",
          "city": "Seattle",
          "state": "WA",
          "zip": 98012
        }
      ],
      "contactdetails": [
        {"email": "thomas@andersen.com"},
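The "one read, one write" point can be illustrated with Python's standard `json` module. This is a sketch only: the slide's document is cut off after the first contact entry, so the closing brackets below are an assumption added to make it parse, and no CosmosDB client is involved.

```python
import json

# The slide's document, closed after the first contact entry so it parses.
# Where the slide cuts off is unknown; the closing brackets are an assumption.
raw = """
{
  "id": "1",
  "firstname": "Thomas",
  "lastname": "Andersen",
  "addresses": [
    {"line1": "100 Some Street", "line2": "Unit 1",
     "city": "Seattle", "state": "WA", "zip": 98012}
  ],
  "contactdetails": [
    {"email": "thomas@andersen.com"}
  ]
}
"""

person = json.loads(raw)  # one read returns the whole aggregate

# Nested data is reached by walking the hierarchy, with no joins to reassemble it.
city = person["addresses"][0]["city"]
email = person["contactdetails"][0]["email"]
print(city, email)

# Writing is likewise one operation: serialize the whole document back.
person["addresses"][0]["line2"] = "Unit 2"
updated = json.dumps(person)
```

In a relational model the same person would typically be split across person, address, and contact tables, then reassembled with joins on every read.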

Query Playground http://www.documentdb.com/sql/demo

Competitors
- MongoDB
- Amazon DynamoDB

Azure SQL Database
- Platform as a Service
- All data is backed up for you
- Point-in-time restore
- Can be geo-redundant
- Scalable both in performance and in data size
- Up to 1 TB
- Not feature complete with SQL Server in a VM

Database Replicas

Azure SQL Database Unsupported Features https://azure.microsoft.com/en-us/documentation/articles/sql-database-transactsql-information/

You can also make it scale up!

Competitors Amazon RDS

Azure SQL Database Demo

Azure SQL Server in a VM
- You manage backups
- You create fault-tolerant options
- You manage disk space
- You manage patching
- You don't manage hardware failure
- You don't manage purchasing hardware
- You don't manage networking infrastructure

Performance Considerations
- Use Premium Storage.
- Use a VM size of DS3 or higher for SQL Enterprise edition and DS2 or higher for SQL Standard edition.
- Use a minimum of 2 P30 disks (1 for log files; 1 for data files and TempDB).
- Keep the storage account and SQL Server VM in the same region.
- Disable Azure geo-redundant storage (geo-replication) on the storage account.
- Avoid using operating system or temporary disks for database storage or logging.

Backups & Fault Tolerance
- Back up to Azure Blob Storage
- Use Always On Availability Groups and Windows Server Failover Clustering (WSFC) for fault tolerance
- Can use mirroring or log shipping too
- Can also mix in on-premises servers

Competitors Amazon EC2 VMs in the cloud

Azure SQL Data Warehouse
- Elastic Massively Parallel Processing (MPP) system
- Use T-SQL to query across relational and non-relational data
- Up to petabyte volumes of data
- Scale compute separately from data
- When paused, you only pay for storage
- Deploys in seconds

Azure SQL Data Warehouse
- Supports 32 concurrent queries
- Used for fanning out queries over multiple machines for processing, aggregation, and analytics
- Performance becomes far more predictable than with plain SQL Server
- Not used in OLTP environments

What is a DWU (Data Warehouse Unit)?
- A unit of scale that determines how much hardware is allocated to deliver performance
- Done in increments of 100 (mostly)

How many DWUs?
- Start small
- Monitor
- Change as needed. It's instant.

    ALTER DATABASE MySQLDW MODIFY (SERVICE_OBJECTIVE = 'DW1000');
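If the scaling statement is issued from a script rather than typed by hand, it can be generated from the target data warehouse unit level. The helper below is a hypothetical sketch that only builds the statement text shown on the slide. Its validation rules (identifier pattern, multiples of 100) are illustrative assumptions, not documented service limits, and it does not connect to any database.

```python
import re


def scale_statement(database: str, dwu: int) -> str:
    """Return the T-SQL that changes a warehouse's service objective.

    Mirrors the statement on the slide; the validation rules here are
    illustrative assumptions, not documented service limits.
    """
    # Accept only simple identifiers so the name can be embedded safely.
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", database):
        raise ValueError(f"unexpected database name: {database!r}")
    if dwu <= 0 or dwu % 100 != 0:
        raise ValueError("this sketch only accepts positive multiples of 100")
    return f"ALTER DATABASE {database} MODIFY (SERVICE_OBJECTIVE = 'DW{dwu}');"


print(scale_statement("MySQLDW", 1000))
```

A monitoring job could call such a helper to scale up before a heavy load window and back down afterwards, following the "start small, monitor, change as needed" advice.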

Partitioning Data
Two choices:
- Distribute data based on hashing values from a single column. Good if clusters of tables will be joined and are related.
- Distribute data evenly but randomly. Fail-safe method.
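The difference between the two choices can be sketched in a few lines. This is an illustration of the general technique only: the bucket count, the SHA-256 hash, the round-robin stand-in for "evenly but randomly", and the sample keys are all assumptions, not the service's actual internals.

```python
import hashlib
from collections import Counter
from itertools import cycle

DISTRIBUTIONS = 4  # small number for readability; an assumption, not the service's value


def hash_distribution(key: str, buckets: int = DISTRIBUTIONS) -> int:
    """Map a column value to a bucket; equal values always land together."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


def round_robin(rows, buckets: int = DISTRIBUTIONS):
    """Assign rows to buckets in turn, ignoring their contents."""
    return [(row, bucket) for row, bucket in zip(rows, cycle(range(buckets)))]


orders = ["cust-1", "cust-2", "cust-1", "cust-3", "cust-1", "cust-2", "cust-4", "cust-1"]

hashed = [(key, hash_distribution(key)) for key in orders]
spread = round_robin(orders)

# Hashing co-locates every row for a given customer, which helps joins on that column.
print({key: bucket for key, bucket in hashed})
# Round-robin gives an even spread regardless of skew in the data.
print(Counter(bucket for _, bucket in spread))
```

Note the trade-off: with hashing, a skewed key such as `cust-1` piles its rows into one bucket, while the even spread never skews but may need data movement when tables are joined.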

Non-supported data types
- geometry: use a varbinary type
- geography: use a varbinary type
- hierarchyid: CLR type, not native
- image, text, ntext: when text based, use varchar/nvarchar (the smaller the better)
- nvarchar(max): use nvarchar(4000) or smaller for better performance
- numeric: use decimal
- sql_variant: split the column into several strongly typed columns
- sysname: use nvarchar(128)
- table: convert to temporary tables
- timestamp: re-work code to use datetime2 and the CURRENT_TIMESTAMP function
- varchar(max): use varchar(8000) or smaller for better performance
- uniqueidentifier: use varbinary(16)
- user-defined types: convert back to their native types where possible
- xml: use varchar(8000) or smaller for better performance; split across columns if needed
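A list like this lends itself to a pre-migration check. The sketch below encodes a partial subset of the slide's suggestions in a dictionary and flags matching columns. The helper `review_columns` and the sample table definition are hypothetical, and the lookup only matches exact type names.

```python
# Suggested replacements, transcribed from the slide's list (a partial subset).
SUGGESTED = {
    "geometry": "varbinary",
    "geography": "varbinary",
    "numeric": "decimal",
    "sysname": "nvarchar(128)",
    "varchar(max)": "varchar(8000)",
    "xml": "varchar(8000)",
}


def review_columns(columns):
    """Return (column, type, suggestion) for each column whose type is on the list."""
    findings = []
    for name, sql_type in columns.items():
        # Type names are case-insensitive in T-SQL, so normalize before the lookup.
        suggestion = SUGGESTED.get(sql_type.lower())
        if suggestion is not None:
            findings.append((name, sql_type, suggestion))
    return findings


# Hypothetical table definition used only to exercise the helper.
columns = {"OrderId": "int", "Location": "geography", "Amount": "numeric", "Notes": "XML"}
for finding in review_columns(columns):
    print(finding)
```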

Unsupported Features
- primary keys
- foreign keys
- check constraints
- unique constraints
- unique indexes
- computed columns
- user-defined types
- indexed views
- identities
- sequences
- triggers
- synonyms
- sparse columns

Competitors Amazon RedShift

Azure SQL Data Warehouse Demo

Azure Data Lake
- HDFS for the cloud
- Can use tools like Spark, Storm, Flume, Sqoop, Kafka, etc.
- No fixed limits on account size or file size

What is a generic data lake?
- An enterprise-wide repository of every type of data, collected in a single place, prior to any formal definition of requirements or schema
- Allows every type of data to be kept without discrimination
- Organizations can then use Hadoop or advanced analytics to find patterns in the data
- Serves as a repository for lower-cost data preparation prior to moving curated data into a data warehouse

Products
- Azure Data Lake Store: built on HDFS
- Azure Data Lake Analytics: built on YARN; introduces U-SQL/C#

Competitors
- A lot of Hadoop implementations, but nothing really quite like it

More data options
- MongoDB
- PostgreSQL
- Redis
- MySQL
- Oracle

Ike Ellis
- Ike Ellis, MVP
- blog.ikeellis.com
- Book: Developing Azure Solutions
- Podcast guest:
  - Talk Python to Me: Dec 2015, June 2016
  - .NET Rocks: Sept 2015, Sept 2016
- SDTIG: www.sdtig.com