Data and AI LATAM 2018

1 Data and AI LATAM 2018

2 Streamline Productivity and Simplify Deployment
[Diagram labels: Separate service or embedded logic; Applications; Analytics server (Data Transformations, Model Training, Scoring, MODEL); SQL Server (Data Transformations, Model Training, Scoring, MODEL)]

3 IEEE Spectrum Top Programming Languages (IEEE Spectrum, July 2017); KDnuggets Top Data Science Tools, 2017

4 SQL Server 2017

6 Eliminate data movement; Operationalize scripts and models; Enterprise-grade performance and scale; Extensibility

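7 Demo: Hello World

8 EXEC sp_execute_external_script
    @language = N'R',
    @script = N'
print(paste("Hello World from:", Revo.version$version.string));
';
EXEC sp_execute_external_script
    @language = N'Python',
    @script = N'
import sys
print("Hello World from:", sys.version)
';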

9 Using SQL Server 2017 Machine Learning Services

11 Augments R & Python with parallelized, distributed algorithms
Provides in-database execution of scripts and algorithms used
Parallel algorithms overcome Python and R memory limitations
Reduces security risks by keeping data in-database
Creates and consumes portable models

12 RevoScaleR & RevoScalePy Algorithms and Functions
One call, one answer
Arbitrarily large data sets
Arbitrarily large worker task set
Mathematically the same as single-threaded
Platform independent
Most are written in C++ for speed
1. Algorithm begins initiator process
2. Initiator distributes work to nodes
3. Finalizer collects results
4. Finalizer iterates or continues
5. Finalizer evaluates final model
6. Returns single model to calling script
[Diagram labels: Load a large dataset; Run a RevoScaleR or RevoScalePy algorithm; Data larger than RAM]
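
The initiator/worker/finalizer steps on this slide can be illustrated with a small Python sketch. It is not the RevoScaleR or revoscalepy implementation: the function names (`chunk_stats`, `combine`, `finalize`, `fit_in_chunks`) are hypothetical, simple linear regression stands in for the real algorithms, and an in-memory list stands in for a dataset larger than RAM. The point is only that per-chunk sufficient statistics can be computed independently and combined into the same answer a single pass would give.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk_stats(chunk):
    """Worker task: sufficient statistics for y = a + b*x on one chunk."""
    n = len(chunk)
    sx = sum(x for x, _ in chunk)
    sy = sum(y for _, y in chunk)
    sxx = sum(x * x for x, _ in chunk)
    sxy = sum(x * y for x, y in chunk)
    return (n, sx, sy, sxx, sxy)


def combine(a, b):
    """Finalizer step: merge two partial results by element-wise addition."""
    return tuple(u + v for u, v in zip(a, b))


def finalize(stats):
    """Finalizer step: turn the combined statistics into one model."""
    n, sx, sy, sxx, sxy = stats
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope


def fit_in_chunks(rows, chunk_size, workers=4):
    """Initiator: split the data, hand chunks to workers, return one model."""
    chunks = [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(chunk_stats, chunks))
    total = (0, 0.0, 0.0, 0.0, 0.0)
    for partial in partials:
        total = combine(total, partial)
    return finalize(total)


# Synthetic data lying exactly on y = 3 + 2x (illustrative only).
rows = [(float(x), 3.0 + 2.0 * x) for x in range(1000)]
intercept, slope = fit_in_chunks(rows, chunk_size=100)
single = finalize(chunk_stats(rows))  # single-pass result for comparison
```

Because the combined statistics are plain sums, the chunked fit agrees with the single-pass fit, which is the property the slide calls "mathematically the same as single-threaded".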

13 RevoScaleR & RevoScalePy Algorithms and Functions
One call
One or many models returned
Arbitrarily large data sets
Arbitrarily large worker task set
Augments RevoScaleR
Fast learners
Deep learning algorithms
Ensemble results using rxEnsemble
[Diagram labels: Load a large dataset; Run a RevoScaleR or RevoScalePy algorithm; Data larger than RAM]

14 Executes RevoScaleR algorithms on remote data & CPUs
rxSetComputeContext redirects to remote
Algorithms in RevoScaleR library redirect as set
Results are returned to script as though local
1. Algorithm on local checks compute context
2. If set remote, packages and ships request
3. Local script blocks (by default) awaiting response
4. Remote unpacks and executes in parallel
5. Remote returns results to local interface
6. Local interface returns results to script
[Diagram labels: Load a large dataset; Run a RevoScaleR or RevoScalePy algorithm]
Compute contexts: SQL Server, Teradata*, Hadoop*, Hadoop MapReduce*, HDInsight* (* R only)
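
The redirect described on this slide can be pictured with a toy Python dispatcher. Everything here is hypothetical and is not the RevoScaleR or revoscalepy API: `LocalContext`, `FakeRemoteContext`, `set_compute_context`, and `rx_style_mean` are made-up names, and the "remote" side is simulated in-process with `pickle` standing in for packaging and shipping the request.

```python
import pickle


def column_mean(values):
    """The actual computation; the same code runs locally or 'remotely'."""
    return sum(values) / len(values)


# Functions the simulated remote side knows how to run, looked up by name.
REGISTRY = {"column_mean": column_mean}


class LocalContext:
    """Default context: run the function in the calling process."""

    def run(self, func, data):
        return func(data)


class FakeRemoteContext:
    """Simulated remote server: requests cross a serialization boundary."""

    def __init__(self):
        self.requests_served = 0

    def run(self, func, data):
        request = pickle.dumps((func.__name__, data))  # package and ship
        name, payload = pickle.loads(request)          # remote unpacks
        self.requests_served += 1
        result = REGISTRY[name](payload)               # remote executes
        return pickle.loads(pickle.dumps(result))      # result shipped back


_current_context = LocalContext()


def set_compute_context(context):
    """Switch where subsequent algorithm calls execute."""
    global _current_context
    _current_context = context


def rx_style_mean(values):
    """Algorithm entry point: checks the active context, then blocks for the result."""
    return _current_context.run(column_mean, values)


local_result = rx_style_mean([1.0, 2.0, 3.0, 4.0])

remote = FakeRemoteContext()
set_compute_context(remote)
remote_result = rx_style_mean([1.0, 2.0, 3.0, 4.0])
```

The calling code is identical before and after `set_compute_context`; only the execution location changes, which mirrors the "results are returned to script as though local" point above.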

18 Strengths: Supports custom, multi-layer network topology with filtered, convolutional, and pooling bundles. Tasks: binary classification, multi-class classification, regression. Example uses: Bing Ads click prediction ($50M per year revenue gain); image classification.
Strengths: L1, L2 regularization. Tasks: binary classification, multi-class classification. Example uses: classifying user feedback.
Strengths: Easy-to-train learner for anomaly detection. Tasks: anomaly detection. Example uses: fraud detection.
Strengths: Boosted decision tree, similar to XGBoost; supports up to ~100K features. Tasks: binary classification, regression. Example uses: one of the most popular and best-performing learners inside Microsoft.
Strengths: State-of-the-art tree ensembles (Random Forest); supports up to ~100K features. Tasks: binary classification, regression. Example uses: churn prediction.
Strengths: Speed, scalability, and L1, L2 regularization; supports up to 1B features. Tasks: binary classification, regression. Example uses: used by Outlook for spam filtering.
Transform: Performs natural language processing of free text into a numerical representation. Strengths: battle tested, large language support, performant (Bing, Office). Example uses: support ticket classification, sentiment analysis.
Transform: Converts categories into numerical data. Strengths: ease of use; 1 line of code to set. Example uses: ad click prediction.
Transform: Selects a subset of features to speed up training time. Strengths: ease of use. Example uses: sentiment analysis, ad click prediction.

22 Canonical deployment patterns

23 SQL Server 2017

24 Key Points:
Classical pattern of pulling data out of the database to a separate modeling environment
Data scientists will already be familiar with this approach, so it's something to build on
[Diagram label: SQL Server 2017]

25 Remote Execution Context
[Diagram labels: SQL Server 2016/17; Results; RevoScaleR & RevoScalePy Parallel Algorithms; Iterate/Sequence; Parallel Worker Tasks]

26 Remote Execution Context
Key Points:
Fast and friendly for existing R/Python users
Limited to what can be done through a SQL Compute Context
Requires external script execute permissions
[Diagram labels: Results; SQL Server 2016/17; RevoScaleR & RevoScalePy Parallel Algorithms; Iterate/Sequence; Parallel Worker Tasks]

27 Run R and Python from SQL environments
[Diagram labels: T-SQL Apps; T-SQL Script; SQL Server 2016/17; Run Python & R from within the Query Processor]

28 Run R and Python from SQL environments
Key Points:
Works with all of R/Python functions for maximum flexibility
Most natural for users with some SQL familiarity, but doable for all
[Diagram labels: T-SQL Apps; T-SQL Script; SQL Server 2016/17; Run Python & R from within the Query Processor]

29 Enable smart non-R apps
[Diagram labels: BI & Reporting; Web apps; T-SQL Script; SQL Server 2016/17; T-SQL Stored Procedure]

30 Enable smart non-R apps
Key Points:
Helper functions are available so you don't have to manually recode an R/Python script
Allows firing via traditional triggers
[Diagram labels: BI & Reporting; Web apps; T-SQL Script; SQL Server 2016/17; T-SQL Stored Procedure]

31 Real-time scoring engine
[Diagram labels: Production Apps; T-SQL; SQL Server 2017; Events; Models; Stored Procs and Triggers]

32 Real-time scoring engine
Key Points:
R/Python need not be installed
Works with many of the RevoScale* and MicrosoftML models
Works on SQL 2017 and Azure SQL DB
Single millisecond response times
[Diagram labels: Production Apps; T-SQL; SQL Server 2017; Events; Models; Stored Procs and Triggers]

33 Other Hints and Deployment Considerations

34 Integration with SQL query execution: Parallel query pushing data to multiple external processes/threads. Use in-memory technology and Columnstore Indexes alongside your ML scripts.
Streaming mode execution: Stream data in batches to the R/Python process to scale beyond available memory.
Train and predict using parallelism: Leverage RevoScaleR/revoscalepy and scale your R and Python scripts using multi-threading and parallel processing.
Native scoring for faster real-time predictions (new in 2017)

35 Trivial Parallelism
Requirement: no dependency between rows (e.g., scoring)
exec sp_execute_external_script
    @language = N'R',
    @script = N'
        # unserialize model
        logitobj <- unserialize(modelbin);
        # build classification model to predict tipped or not
        system.time(OutputDataSet <- data.frame(predict(logitobj, newdata = InputDataSet, type = [...]
    @input_data_1 = N'
        SELECT tipped, passenger_count, trip_time_in_secs, trip_distance, d.direct_distance
        FROM dbo.nyctaxi_sample TABLESAMPLE (50 PERCENT) REPEATABLE (98074)
        CROSS APPLY [CalculateDistance](pickup_latitude, pickup_longitude, dropoff_latitude, dropoff_longitude) as d
        OPTION(MAXDOP 2) -- Needed only to control [...]
    @parallel = 1,
    @params = N'@modelbin [...] @r_rowsPerRead [...]
    @r_rowsPerRead = 5000;
[Diagram labels: sp_execute_external_script; @input_data_1 = N'...'; @parallel = 1 (MAXDOP = 2)]

36 Requirements: No dependency between rows (e.g., scoring)
Key Benefits:
Execute script over chunks of data
Process data that doesn't fit in memory
Can be used from client (rx* function) or server
exec sp_execute_external_script
    @language = N'R',
    @script = N'
        # unserialize model
        logitobj <- unserialize(modelbin);
        # build classification model to predict tipped or not
        system.time(OutputDataSet <- data.frame(predict(logitobj, newdata = InputDataSet, type = [...]
    @input_data_1 = N'
        SELECT tipped, passenger_count, trip_time_in_secs, trip_distance, d.direct_distance
        FROM dbo.nyctaxi_sample TABLESAMPLE (50 PERCENT) REPEATABLE (98074)
        CROSS APPLY [CalculateDistance](pickup_latitude, pickup_longitude, dropoff_latitude, dropoff_longitude) as d',
    @params = N'@modelbin [...] @r_rowsPerRead [...]
    @r_rowsPerRead = 5000;
[Diagram labels: Dataset; Rows]
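
The chunked execution this slide describes can be sketched in plain Python. This is an illustration under stated assumptions, not what SQL Server does internally: `read_in_batches`, `score_batch`, and `score_stream` are hypothetical helpers, the logistic weights are invented, and a generator stands in for the input query. The batch size of 5000 echoes the `@r_rowsPerRead = 5000` setting on the slide.

```python
import math
from itertools import islice


def read_in_batches(rows, rows_per_read):
    """Yield successive lists of at most rows_per_read rows from any iterable."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, rows_per_read))
        if not batch:
            return
        yield batch


def score_batch(batch, weights, bias):
    """Score one batch with a logistic model; each row is scored independently."""
    scores = []
    for features in batch:
        z = bias + sum(w * x for w, x in zip(weights, features))
        scores.append(1.0 / (1.0 + math.exp(-z)))
    return scores


def score_stream(rows, weights, bias, rows_per_read=5000):
    """Score an arbitrarily long row stream while holding one batch in memory."""
    for batch in read_in_batches(rows, rows_per_read):
        yield from score_batch(batch, weights, bias)


# Hypothetical model parameters and synthetic input rows (illustrative only).
weights, bias = [0.8, -0.3], -0.5
rows = ((float(i % 7), float(i % 3)) for i in range(12000))
streamed = list(score_stream(rows, weights, bias, rows_per_read=5000))
```

Since no row depends on any other, scoring batch by batch gives the same scores as scoring everything at once, and the same independence is what allows batches to be processed in parallel.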

37 exec sp_execute_external_script
    @language = N'R',
    @script = N'
        # Define the connection string
        connstr <- paste("Driver=SQL Server;Server=", instance_name, ";Database=", database_name, ";Trusted_Connection=true;", sep="");
        # Set ComputeContext
        cc <- RxInSqlServer(connectionString = connstr, numTasks = 4);
        # Pull data from query
        featuredatasource = RxSqlServerData(sqlQuery = input_query, connectionString = connstr, computeContext = cc);
        # Table to write data to, using compute context
        tippredictions = RxSqlServerData(table = "nyc_taxi_tip_predictions", connectionString = connstr);
        # Unserialize model
        logitobj <- unserialize(modelbin);
        # Predict tipped or not based on model
        predictions <- rxPredict(logitobj, data = featuredatasource, outData = tippredictions, overwrite = [...]
    @params = N'@input_query [...]
    @input_query = N'SELECT * FROM nyctaxi_training_sample'
[Diagram labels: sp_execute_external_script; (MAXDOP = 2); rxcall; BxlServer; <Model Object>; m1 + m2]

38 DECLARE @model varbinary(max) = (SELECT native_model FROM models WHERE model_name = 'Fraud Detection Model');
[...] PREDICT(MODEL = @model, DATA = new_transaction [...]

40 -- Check/set External Resource Pool config
SELECT * FROM sys.resource_governor_resource_pools WHERE name = 'default';
SELECT * FROM sys.resource_governor_external_resource_pools WHERE name = 'default';
ALTER RESOURCE POOL "default" WITH (max_memory_percent = 60);
ALTER EXTERNAL RESOURCE POOL "default" WITH (max_memory_percent = 80);
ALTER RESOURCE GOVERNOR RECONFIGURE; -- enforce changes

41 DMV: Description
sys.dm_exec_requests: New column: external_script_request_id
sys.dm_external_script_requests: Returns running external scripts, DOP & assigned user account
sys.dm_external_script_execution_stats: Number of executions for rx* functions in RevoScaleR package
sys.dm_os_performance_counters: New External Scripts performance counters

44 Reduced surface area and isolation:
"external scripts enabled" configuration required
R/Python script execution outside of SQL Server process space
Script execution requires explicit permission:
sp_execute_external_script requires EXECUTE ANY EXTERNAL SCRIPT for non-admins
SQL Server login/user required and db/table access
R/Python processes have limited privileges:
R/Python processes run under local user accounts in the SQLRUserGroup
Each execution is isolated; different users run under different accounts
Windows firewall rules to block outbound traffic

45 Examples using R:
sqlpackages <- rxInstalledPackages(fields = c("Package", "Version", "Built"), computeContext = sqlserver)
pkgs <- c("ggplot2")
rxInstallPackages(pkgs = pkgs, verbose = TRUE, scope = "private", computeContext = sqlserver)
Example using T-SQL:
EXEC sp_execute_external_script
    @language = N'R',
    @script = N'
        mypackages <- rxInstalledPackages();
        OutputDataSet <- as.data.frame(mypackages);
    ';
pkgs <- c("ggplot2")
rxRemovePackages(pkgs = pkgs, verbose = TRUE, scope = "private", computeContext = sqlserver)

46 Azure SQL Database
R support
Python support
Machine Learning Services in SQL Server on Linux
Additional algorithms and pre-trained models
Native Scoring for more models

47 R Services; ML Services; AKA.MS/MLSQLDEV; SSMS Reports for ML Services; ML cheat sheet; Hospital Length of Stay demo scripts; SQL Server Machine Learning Services

48 Thank you very much!

51 Learning and Scoring Process
Learning: Labels + Images -> Featurization (using pre-trained ResNet18 neural network model) -> Features -> Classification Algorithm (Boosted Tree) -> Classifier Model
Scoring: Images -> Featurization (using pre-trained ResNet18 neural network model) -> Features -> Classification -> Predictions

52 Distributed Featurization and Training On HD-Insight SQL Server Models Table HDInsight-MRS Azure Blob Storage CT Scan Images Classifier Training Featurization Edge Distributed Featurization

53 Scoring with Deep Learning Model in SQL SQL Server Web App Stored Procedures with R Code Featurization Scoring with the classifier model Stored Procedure call Model table, Features table, New Images table Diagnosis: 35% certainty

54 Image Featurization

55 Parallel Featurization (30x speedup)

56 Training on Spark and storing in SQL

57 Scoring in SQL


More information

Evolving To The Big Data Warehouse

Evolving To The Big Data Warehouse Evolving To The Big Data Warehouse Kevin Lancaster 1 Copyright Director, 2012, Oracle and/or its Engineered affiliates. All rights Insert Systems, Information Protection Policy Oracle Classification from

More information

Azure Webinar. Resilient Solutions March Sander van den Hoven Principal Technical Evangelist Microsoft

Azure Webinar. Resilient Solutions March Sander van den Hoven Principal Technical Evangelist Microsoft Azure Webinar Resilient Solutions March 2017 Sander van den Hoven Principal Technical Evangelist Microsoft DX @svandenhoven 1 What is resilience? Client Client API FrontEnd Client Client Client Loadbalancer

More information

Vinnie Saini Cloud Solution Architect Big Data & AI

Vinnie Saini Cloud Solution Architect Big Data & AI Vinnie Saini Cloud Solution Architect Big Data & AI vasaini@microsoft.com data intelligence cloud Data + Intelligence + Cloud Extensible Applications Easy to consume Artificial Intelligence Most comprehensive

More information

S8873 GBM INFERENCING ON GPU. Shankara Rao Thejaswi Nanditale, Vinay Deshpande

S8873 GBM INFERENCING ON GPU. Shankara Rao Thejaswi Nanditale, Vinay Deshpande S8873 GBM INFERENCING ON GPU Shankara Rao Thejaswi Nanditale, Vinay Deshpande Introduction AGENDA Objective Experimental Results Implementation Details Conclusion 2 INTRODUCTION 3 BOOSTING What is it?

More information

Monitoring & Tuning Azure SQL Database

Monitoring & Tuning Azure SQL Database Monitoring & Tuning Azure SQL Database Dustin Ryan, Data Platform Solution Architect, Microsoft Moderated By: Paresh Motiwala Presenting Sponsors Thank You to Our Presenting Sponsors Empower users with

More information

MLeap: Release Spark ML Pipelines

MLeap: Release Spark ML Pipelines MLeap: Release Spark ML Pipelines Hollin Wilkins & Mikhail Semeniuk SATURDAY Web Dev @ Cornell Studied some General Biology Rails Consulting for TrueCar and other companies Implement ML model for ClearBook

More information

Modern Data Warehouse The New Approach to Azure BI

Modern Data Warehouse The New Approach to Azure BI Modern Data Warehouse The New Approach to Azure BI History On-Premise SQL Server Big Data Solutions Technical Barriers Modern Analytics Platform On-Premise SQL Server Big Data Solutions Modern Analytics

More information

Large-Scale Lasso and Elastic-Net Regularized Generalized Linear Models

Large-Scale Lasso and Elastic-Net Regularized Generalized Linear Models Large-Scale Lasso and Elastic-Net Regularized Generalized Linear Models DB Tsai Steven Hillion Outline Introduction Linear / Nonlinear Classification Feature Engineering - Polynomial Expansion Big-data

More information

Azure Data Factory VS. SSIS. Reza Rad, Consultant, RADACAD

Azure Data Factory VS. SSIS. Reza Rad, Consultant, RADACAD Azure Data Factory VS. SSIS Reza Rad, Consultant, RADACAD 2 Please silence cell phones Explore Everything PASS Has to Offer FREE ONLINE WEBINAR EVENTS FREE 1-DAY LOCAL TRAINING EVENTS VOLUNTEERING OPPORTUNITIES

More information

Dr. SubraMANI Paramasivam. Think & Work like a Data Scientist with SQL 2016 & R

Dr. SubraMANI Paramasivam. Think & Work like a Data Scientist with SQL 2016 & R Dr. SubraMANI Paramasivam Think & Work like a Data Scientist with SQL 2016 & R About the Speaker Group Leader Dr. SubraMANI Paramasivam PhD., MVP, MCT, MCSE (x2), MCITP (x2), MCP, MCTS (x3), MCSA CEO,

More information

Azure SQL Database Training. Complete Practical & Real-time Trainings. A Unit of SequelGate Innovative Technologies Pvt. Ltd.

Azure SQL Database Training. Complete Practical & Real-time Trainings. A Unit of SequelGate Innovative Technologies Pvt. Ltd. Azure SQL Database Training Complete Practical & Real-time Trainings A Unit of SequelGate Innovative Technologies Pvt. Ltd. AZURE SQL / DBA Training consists of TWO Modules: Module 1: Azure SQL Database

More information

Exam Questions

Exam Questions Exam Questions 70-775 Perform Data Engineering on Microsoft Azure HDInsight (beta) https://www.2passeasy.com/dumps/70-775/ NEW QUESTION 1 You are implementing a batch processing solution by using Azure

More information

Approaching the Petabyte Analytic Database: What I learned

Approaching the Petabyte Analytic Database: What I learned Disclaimer This document is for informational purposes only and is subject to change at any time without notice. The information in this document is proprietary to Actian and no part of this document may

More information

Transforming Transport Infrastructure with GPU- Accelerated Machine Learning Yang Lu and Shaun Howell

Transforming Transport Infrastructure with GPU- Accelerated Machine Learning Yang Lu and Shaun Howell Transforming Transport Infrastructure with GPU- Accelerated Machine Learning Yang Lu and Shaun Howell 11 th Oct 2018 2 Contents Our Vision Of Smarter Transport Company introduction and journey so far Advanced

More information

Microsoft. Exam Questions Perform Data Engineering on Microsoft Azure HDInsight (beta) Version:Demo

Microsoft. Exam Questions Perform Data Engineering on Microsoft Azure HDInsight (beta) Version:Demo Microsoft Exam Questions 70-775 Perform Data Engineering on Microsoft Azure HDInsight (beta) Version:Demo NEW QUESTION 1 HOTSPOT You install the Microsoft Hive ODBC Driver on a computer that runs Windows

More information

Computational Databases: Inspirations from Statistical Software. Linnea Passing, Technical University of Munich

Computational Databases: Inspirations from Statistical Software. Linnea Passing, Technical University of Munich Computational Databases: Inspirations from Statistical Software Linnea Passing, linnea.passing@tum.de Technical University of Munich Data Science Meets Databases Data Cleansing Pipelines Fuzzy joins Data

More information

Microsoft Exam

Microsoft Exam Volume: 42 Questions Case Study: 1 Relecloud General Overview Relecloud is a social media company that processes hundreds of millions of social media posts per day and sells advertisements to several hundred

More information

Azure SQL Database. Indika Dalugama. Data platform solution architect Microsoft datalake.lk

Azure SQL Database. Indika Dalugama. Data platform solution architect Microsoft datalake.lk Azure SQL Database Indika Dalugama Data platform solution architect Microsoft indalug@microsoft.com datalake.lk Agenda Overview Azure SQL adapts Azure SQL Instances (single,e-pool and MI) How to Migrate

More information

Columnstore Technology Improvements in SQL Server Presented by Niko Neugebauer Moderated by Nagaraj Venkatesan

Columnstore Technology Improvements in SQL Server Presented by Niko Neugebauer Moderated by Nagaraj Venkatesan Columnstore Technology Improvements in SQL Server 2016 Presented by Niko Neugebauer Moderated by Nagaraj Venkatesan Thank You microsoft.com hortonworks.com aws.amazon.com red-gate.com Empower users with

More information

Machine Learning for Large-Scale Data Analysis and Decision Making A. Distributed Machine Learning Week #9

Machine Learning for Large-Scale Data Analysis and Decision Making A. Distributed Machine Learning Week #9 Machine Learning for Large-Scale Data Analysis and Decision Making 80-629-17A Distributed Machine Learning Week #9 Today Distributed computing for machine learning Background MapReduce/Hadoop & Spark Theory

More information

The exam is closed book, closed notes except your one-page (two-sided) cheat sheet.

The exam is closed book, closed notes except your one-page (two-sided) cheat sheet. CS 189 Spring 2015 Introduction to Machine Learning Final You have 2 hours 50 minutes for the exam. The exam is closed book, closed notes except your one-page (two-sided) cheat sheet. No calculators or

More information

My name is Brian Pottle. I will be your guide for the next 45 minutes of interactive lectures and review on this lesson.

My name is Brian Pottle. I will be your guide for the next 45 minutes of interactive lectures and review on this lesson. Hello, and welcome to this online, self-paced lesson entitled ORE Embedded R Scripts: SQL Interface. This session is part of an eight-lesson tutorial series on Oracle R Enterprise. My name is Brian Pottle.

More information

Spark, Shark and Spark Streaming Introduction

Spark, Shark and Spark Streaming Introduction Spark, Shark and Spark Streaming Introduction Tushar Kale tusharkale@in.ibm.com June 2015 This Talk Introduction to Shark, Spark and Spark Streaming Architecture Deployment Methodology Performance References

More information

HDInsight > Hadoop. October 12, 2017

HDInsight > Hadoop. October 12, 2017 HDInsight > Hadoop October 12, 2017 2 Introduction Mark Hudson >20 years mixing technology with data >10 years with CapTech Microsoft Certified IT Professional Business Intelligence Member of the Richmond

More information

SQL Server 2017: Data Science with Python or R?

SQL Server 2017: Data Science with Python or R? SQL Server 2017: Data Science with Python or R? Dejan Sarka Sponsor Introduction Dejan Sarka (dsarka@solidq.com, dsarka@siol.net, @DejanSarka) 30 years of experience SQL Server MVP, MCT, 16 books 20+ courses,

More information

COPYRIGHT DATASHEET

COPYRIGHT DATASHEET Your Path to Enterprise AI To succeed in the world s rapidly evolving ecosystem, companies (no matter what their industry or size) must use data to continuously develop more innovative operations, processes,

More information

Prepare. Model. Operationalize

Prepare. Model. Operationalize Prepare Model Operationalize Model Re-Code Validate Deploy How do we operationalize R? Turn R analytics Web services in one line of code; Swagger-based REST APIs, easy to consume, with any programming

More information

NVIDIA DEEP LEARNING INSTITUTE

NVIDIA DEEP LEARNING INSTITUTE NVIDIA DEEP LEARNING INSTITUTE TRAINING CATALOG Valid Through July 31, 2018 INTRODUCTION The NVIDIA Deep Learning Institute (DLI) trains developers, data scientists, and researchers on how to use artificial

More information

Microsoft. Exam Questions Perform Data Engineering on Microsoft Azure HDInsight (beta) Version:Demo

Microsoft. Exam Questions Perform Data Engineering on Microsoft Azure HDInsight (beta) Version:Demo Microsoft Exam Questions 70-775 Perform Data Engineering on Microsoft Azure HDInsight (beta) Version:Demo NEW QUESTION 1 You have an Azure HDInsight cluster. You need to store data in a file format that

More information

Cloudera s Enterprise Data Hub on the Amazon Web Services Cloud: Quick Start Reference Deployment October 2014

Cloudera s Enterprise Data Hub on the Amazon Web Services Cloud: Quick Start Reference Deployment October 2014 Cloudera s Enterprise Data Hub on the Amazon Web Services Cloud: Quick Start Reference Deployment October 2014 Karthik Krishnan Page 1 of 20 Table of Contents Table of Contents... 2 Abstract... 3 What

More information

Microsoft certified solutions associate

Microsoft certified solutions associate Microsoft certified solutions associate MCSA: BI Reporting This certification demonstrates your expertise in analyzing data with both Power BI and Excel. Exam 70-778/Course 20778 Analyzing and Visualizing

More information

Oskari Heikkinen. New capabilities of Azure Data Factory v2

Oskari Heikkinen. New capabilities of Azure Data Factory v2 Oskari Heikkinen New capabilities of Azure Data Factory v2 Oskari Heikkinen Lead Cloud Architect at BIGDATAPUMP Microsoft P-TSP Azure Advisors Numerous projects on Azure Worked with Microsoft Data Platform

More information

Informatica Enterprise Information Catalog

Informatica Enterprise Information Catalog Data Sheet Informatica Enterprise Information Catalog Benefits Automatically catalog and classify all types of data across the enterprise using an AI-powered catalog Identify domains and entities with

More information

Big Data con MATLAB. Lucas García The MathWorks, Inc. 1

Big Data con MATLAB. Lucas García The MathWorks, Inc. 1 Big Data con MATLAB Lucas García 2015 The MathWorks, Inc. 1 Agenda Introduction Remote Arrays in MATLAB Tall Arrays for Big Data Scaling up Summary 2 Architecture of an analytics system Data from instruments

More information

Data and AI LATAM 2018

Data and AI LATAM 2018 El Internet de las Cosas para Desarrolladores Joaquin Guerrero Sr. Technical Evangelist Microsoft LATAM The Internet of Things isn t a technology revolution IoT is a business revolution, enabled by technology

More information

Index. Pranab Mazumdar, Sourabh Agarwal, Amit Banerjee 2016 P. Mazumdar et al., Pro SQL Server on Microsoft Azure, DOI /

Index. Pranab Mazumdar, Sourabh Agarwal, Amit Banerjee 2016 P. Mazumdar et al., Pro SQL Server on Microsoft Azure, DOI / Index A Azure Active Directory (AAD), 17 Azure architecture compute, 20 fault domain, 31 IaaS, 19 models classic deployment model, 32 deployment automation, 34 RBAC, 33 Resource Manager deployment model,

More information