Data Platform Futures

Transcription:

Data Platform Futures
Jon Jahren, Data & AI Architect, Microsoft
jon.jahren@microsoft.com

2017 Microsoft. All rights reserved. The following content contains forward looking statements including ongoing Microsoft research and may include non-committed research and product engineering activities that Microsoft may not deliver as future operations, future products or new features for existing products. This information is subject to change at any time without prior notification. Statements contained in this document concerning these matters only reflect Microsoft's expectations as of the date of this document. Changes in product strategy resulting from technological, internal corporate, market and other changes may occur. This is not a commitment to deliver any material, code or functionality and should not be relied upon in making purchasing decisions.

SQL v.next & Scale-out Azure SQL
Futures in IoT and Big Data Engineering
What's bigger than Big Data
Informational Graphs, bots and Reinforced Learning
Machine Reading and Natural Language Computing
Homomorphic Security: doing analytics on someone else's private data
Collaborative Clouds and multi-party data vaults
Quantum Machine Learning
Quantum Networks and Quantum Database Computing

SQL Server 2017: industry-leading performance and security, now on Linux and Docker
Choice of platform and language: R, T-SQL, Java, C/C++, C#/VB.NET, PHP, Node.js, Python, Ruby
Industry-leading performance: #1 TPC-H performance (1TB, 10TB, 30TB); #1 TPC-E performance; #1 price/performance
Most secure over the last 7 years [chart: Vulnerabilities (2010-2016)]
Only commercial DB with AI built-in: R and Python + in-memory at massive scale; native T-SQL scoring
End-to-end mobile BI on any device: a fraction of the cost [chart: self-service BI per user for Microsoft, Tableau and Oracle; values shown: $2,230, $480, $120]
1/10th the cost of Oracle
In-memory across all workloads
Most consistent data platform: private cloud, public cloud

SQL Server 2017: Key New Functionality
[Figure: query times under Plan 1, Plan 2, Plan 3, then Plan 2 again: revert to previously effective plan]
[Figure: sample record with labels Skill, Degree earned, Position; values shown: Statistics, Andy Smith, B.S. Science, Finance, Business Analyst]
R and Python + in-memory at massive scale
Native T-SQL scoring
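The "revert to previously effective plan" idea on this slide can be illustrated with a small sketch. This is not SQL Server's actual logic: the function name `choose_plan`, the 1.5x slowdown threshold, and the sample timings are all assumptions made up for illustration.

```python
from statistics import mean


def choose_plan(history, current, previous, slowdown=1.5):
    """Return the plan to use next (hypothetical helper, not a SQL Server API).

    history maps a plan name to the query times (in seconds) observed under it.
    If the current plan is slower on average than the previous plan by more
    than the `slowdown` factor, fall back to the previous plan.
    """
    if previous is None or not history.get(current) or not history.get(previous):
        return current  # nothing to compare against, keep the current plan
    if mean(history[current]) > slowdown * mean(history[previous]):
        return previous  # regression detected: revert
    return current


# Made-up timings: Plan 3 is much slower than Plan 2 was.
history = {
    "Plan 2": [0.8, 0.9, 1.0],
    "Plan 3": [4.1, 3.9, 4.0],
}
chosen = choose_plan(history, current="Plan 3", previous="Plan 2")
print(chosen)
```

With these sample numbers the sketch falls back from Plan 3 to Plan 2, matching the sequence drawn on the slide.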

SQL Server vNext investment themes
Reason over any data, anywhere
Choice of language and platform
Industry leading performance and security
Only commercial database with AI built-in
Continued improvements to SQL Server on Linux

It's not practical to think in terms of a Roadmap anymore

Event horizon
Now (I recommend we focus on today): Dynamic Networks; Big Data; Mobility; Knowledge Exploration (Descriptive, Diagnostic, Predictive, Prescriptive); Patterns & Insight (Machine Learning); Virtual / Augmented Reality; Security & Cyber-forensics
Next years: Homomorphic Security; What's bigger than Big Data; Algorithms / Artificial Intelligence; Natural Language Computing; Informational Graphs; Context and Learning; Relationship Graphs; Holo-portation (3D See, Hear, Interact); Dialogue Systems; Field Programmable Gate Arrays (Algorithm Hardware); Off Grid Computing; Underwater Datacenters
A few more years: Quantum Computing; Cryogenic Temperature Memory; Materials that can store Data with light; Storage of information on DNA strands (all of the data on the net could fit in a shoe box); Cryptography, Security, Applied Math
Defined; Researched

The Data Flow
[Architecture diagram. Stage labels: Devices and Data Sources; Cloud Gateway; Data Ingestion & Processing, Command and Control; Presentation & Business Connectivity]
Path labels: Hot Path Analytics; Hot Path Business Logic; Warm Path Analytics; Cold Path Analytics
Components shown: OnPrem Data Store; Other Data; OPC-UA Client; OPC-UA Proxy; Azure IoT Edge Gateway; IoT Hub; Event Hubs; Azure Functions; Azure Stream Analytics; Azure HDInsight Storm; Service Fabric & Actor Framework; Time Series Insights; Azure HDInsight; AzureML; Azure Data Lake; Data Lake Analytics; Power BI; App Service; Web Apps; Mobile Apps; Logic Apps; Notification Hubs; BizTalk Services

Solution scenarios: Big Data & Advanced Analytics
Modern Data Warehousing: "We want to integrate all our data including big data with our data warehouse"
Advanced Analytics: "We are trying to predict when our customers churn."
Real-time Analytics: "We are trying to get insights from our devices in real-time, etc."

Data Warehousing Pattern in Azure
Loading and preparing data for analysis with a data warehouse
Data sources: logs, files and media (unstructured); business / custom apps (structured)
Services shown: Data Factory; Azure Import/Export Service; Azure Data Box; APIs, CLI & GUI tools; Data Lake Store; Azure Storage; Azure Databricks; Cosmos DB; Azure SQL DW; HDInsight; Data Lake Analytics; AAS; SQL DB
Outputs: applications; dashboards

Advanced Analytics Pattern in Azure
Performing data collection/understanding, modeling and deployment
Data sources: sensors and IoT (unstructured); logs, files and media (unstructured); business / custom apps (structured)
Services shown: Azure ML; Azure ML Studio; ML Server; Azure Databricks (Spark ML); SQL Server (In-database ML); Data Science VM; Batch AI; Cosmos DB; SQL DB; Data Lake Store; Azure Storage; Data Lake Analytics; HDInsight; SQL DW; Data Factory; Azure Container Service; Azure Analysis Services
Outputs: applications; dashboards

Big Data Streaming Pattern with Azure
Data sources: sensors and IoT (unstructured); logs, files and media (unstructured); business / custom apps (structured)
Services shown: Event Hubs; IoT Hub; Kafka on HDInsight; Stream Analytics; Azure Databricks (Spark Streaming); Storm on HDInsight; Azure ML Studio; R Server; Azure Databricks (Spark ML)
Outputs: real-time applications; real-time dashboards

There is a natural balance in IoT between the cloud and the edge
[Diagram labels: Solutions; Things; Build; Connect; Manage; Insights; Actions]

IoT Pattern + Edge
[Diagram labels: Azure IoT Hub; Cloud Gateway; Things; Insights; Actions; Solutions; Build; Connect; Manage]

IoT scale time-series data store
Schema-less store, just send data
Easy IoT Hub connection
Store, query and visualize billions of events
Simple and fast navigation
[Diagram labels: Solutions; Things; Insights; Discover; Operationalize; Refine; Actions]

Multi-party data: trusted data collaborative
Accessible data; secure analytics; verified compliance
Technology innovations: processing encrypted data; tamper resistant audit logs; policy based key recovery; policy encumbered datasets; data provenance; digital chain of custody
Business model innovations: trust frameworks; data trustees; smart contracts; data markets; multi-party data vaults

UN / World Bank: A platform for National Statistics and Sustainable Development for established and developing nations
City of Seattle: Urban mobility experience and design
City of Bellevue: pedestrian safety through video analytics
Financial Fabric: data-sharing and analytics between hedge funds on pension funds to manage systemic risk
Answer ALS: Data sharing platform for the largest ALS research collaborative in the world
UC Davis: Statewide water-energy conservation
City of Bellevue NIST: End to end water insight and emergency response
San Diego County Courts: juvenile recidivism
Industry-focused; Geographic Scope; Multi-party protected data sets; Customer-connected Innovation engagements

DATA MANAGEMENT GATEWAY MANUAL CONFIG OF RBAC-BASED DATA ACCESS MANUAL PROCESS OF PUBLISHING SURVEY DATA ADMINISTRATIVE MICRO DATA OPEN DATA LANDSAT

Deep neural networks have enabled major advances in machine learning and AI: computer vision, language translation, speech recognition, question answering, and more.
Problem: DNNs are challenging to serve and deploy in large-scale online services.
[Figure: Recurrent Neural Networks, unrolled over time steps t-1, t, t+1, with inputs x_t, hidden states h_t and outputs y_t]
[Figure: Convolutional Neural Networks]
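The recurrent network drawn on this slide passes a hidden state h from one time step to the next. A minimal scalar sketch of that recurrence follows, assuming a simple tanh update; the slide shows only the diagram, so the update rule, the function name `rnn_forward` and the weight values are illustrative assumptions.

```python
import math


def rnn_forward(xs, w_xh, w_hh, w_hy, h0=0.0):
    """Run a scalar recurrent network over the input sequence xs.

    Assumed update (not taken from the slide):
        h_t = tanh(w_xh * x_t + w_hh * h_{t-1})
        y_t = w_hy * h_t
    Returns the lists of hidden states and outputs.
    """
    hs, ys = [], []
    h = h0
    for x in xs:
        h = math.tanh(w_xh * x + w_hh * h)  # new state depends on the old state
        hs.append(h)
        ys.append(w_hy * h)
    return hs, ys


# Hypothetical weights and a three-step input (t-1, t, t+1).
hs, ys = rnn_forward([1.0, 0.0, -1.0], w_xh=0.5, w_hh=0.8, w_hy=2.0)
print(hs, ys)
```

The second input is zero, yet the second hidden state is not: it carries information forward from the first step, which is the point of the h_{t-1} to h_t arrows in the figure.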

DNN Processing Units
[Figure: CPU block diagram with Registers, Control Unit (CU) and Arithmetic Logic Unit (ALU)]
Spectrum from flexibility to efficiency: CPUs; GPUs; Soft DPU (FPGA); Hard DPU; ASICs
Soft DPU (FPGA) examples: BrainWave, Baidu SDA, Deephi Tech, ESE, Teradeep, etc.
Hard DPU examples: Cerebras, Google TPU, Graphcore, Groq, Intel Nervana, Movidius, Wave Computing, etc.

Performance: Excellent inference performance at low batch sizes. Ultra-low latency serving on modern DNNs, >10X lower than CPUs and GPUs. Scale to many FPGAs in single DNN service.
Flexibility: FPGAs ideal for adapting to rapidly evolving ML: CNNs, LSTMs, MLPs, reinforcement learning, feature extraction, decision trees, etc. Inference-optimized numerical precision. Exploit sparsity, deep compression for larger, faster models.
Scale: Microsoft has the world's largest cloud investment in FPGAs. Multiple Exa-Ops of aggregate AI capacity. BrainWave runs on Microsoft's scale infrastructure.

RSA-2048 Challenge Problem
251959084756578934940271832400483985714292821262040
320277771378360436620207075955562640185258807844069
1829064124951508218929855914917618450280848912007284
4992687392807287776735971418347270261896375014971824
6911650776133798590957000973304597488084284017974291
00642458691817195118746121515172654632282216869987549
182422433637259085141865462043576798423387184774447
9207399342365848238242811981638150106748104516603773
0605620161967625613384414360383390441495263443219011
4657544454178424020924616515723350778707749817125772
467962926386356373289912154831438167899885040445364
023527381951378636564391212010397122822120720357
Classical: 1 billion years
Quantum: 100 seconds
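The challenge is to split the number above into its two prime factors. The sketch below shows why naive classical factoring scales badly; it uses trial division on a tiny semiprime chosen purely for illustration. Trial division is not the best known classical method (that is the general number field sieve), and this sketch does not reproduce the timings quoted on the slide.

```python
def factor_semiprime(n):
    """Factor n by trial division; return (p, q, steps).

    `steps` counts how many candidate divisors were tried. In the worst case
    this grows with the square root of n.
    """
    steps = 0
    d = 2
    while d * d <= n:
        steps += 1
        if n % d == 0:
            return d, n // d, steps
        d += 1
    return n, 1, steps  # n itself is prime


# A toy 12-bit semiprime (53 * 61), chosen for illustration only.
p, q, steps = factor_semiprime(3233)
print(p, q, steps)

# For a 2048-bit modulus the square root has about 1024 bits, so the number
# of candidate divisors is on the order of 2**1024.
candidates = 2 ** (2048 // 2)
print(len(str(candidates)), "decimal digits in the candidate count")
```

Each extra bit in the modulus multiplies the worst-case trial-division work by about the square root of two, which is why key sizes of this length are out of reach for brute-force approaches.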

01 000 001 010 011 100 101 110 111

[Figure: Quantum F(x) Processor. Inputs: 000, 001, 010, 011, 100, 101, 110, 111. Outputs: F(000), F(001), F(010), F(011), F(100), F(110), F(111)]
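The figure suggests a processor that acts on all eight 3-bit inputs at once. A classical simulation of that idea is sketched below, assuming a made-up function `f` (the slide does not say what F is). The "quantum" part is only a table of amplitudes for a uniform superposition; it is not a real quantum computation.

```python
import math


def f(x):
    """Hypothetical 3-bit function standing in for F(x) on the slide."""
    return (x * x + 1) % 8


n_bits = 3
inputs = range(2 ** n_bits)

# Classical evaluation: one call to f per input, eight calls in total.
classical = {format(x, "03b"): format(f(x), "03b") for x in inputs}

# Simulated register after applying F to a uniform superposition:
# each basis state |x>|F(x)> carries amplitude 1/sqrt(8).
amplitude = 1 / math.sqrt(2 ** n_bits)
state = {
    (format(x, "03b"), format(f(x), "03b")): amplitude
    for x in inputs
}

# Squared amplitudes are measurement probabilities and must sum to 1.
total_probability = sum(a * a for a in state.values())
print(classical)
print(total_probability)
```

A measurement of such a register would return only one (x, F(x)) pair, so quantum algorithms need further steps to extract a useful answer from the superposition.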

Nitrogen fixation 100-200 100-200 100s-1000s 100s-1000s