R Language for the SQL Server DBA Beginning with R Ing. Eduardo Castro, PhD, Principal Data Analyst Architect, LP Consulting Moderated By: Jose Rolando Guay Paz
Thank You microsoft.com idera.com attunity.com Empower users with new insights through familiar tools while balancing the need for IT to monitor and manage user-created content. Deliver access to all data types across structured and unstructured sources. IDERA's award-winning SQL Server database solutions and multi-platform database, application and cloud monitoring tools ensure your business never slows down. Attunity, a leader in data integration and management software, helps move, transform and analyze data efficiently in SQL Server/Azure environments.
JOIN PASS
PASS is a not-for-profit organization that offers year-round learning opportunities to data professionals.
- Membership is free; join today at www.sqlpass.org
- Access to online training and content
- Join Local Chapters and Virtual Chapters
- Enjoy discounted event rates
- Get advance notice of member exclusives
Save on PASS Summit 2016 Registration!
- The world's largest gathering of SQL Server & BI professionals
- Learn from the world's top data experts in over 190 technical sessions
- More than 4000 attendees from all over the world
- Meet the Microsoft engineering team!
Save $200 right now using discount code 24HOP200! $2,195 until September 18, 2016. www.passsummit.com
BIO Ing. Eduardo Castro, PhD, Microsoft Data Platform MVP and PASS Board of Advisor for LATAM, is a well-known LATAM SQL Server expert who focuses on architecture, Business Intelligence and Data Analytics. Eduardo has a specialization in Data Analysis and Big Data. ecastrom @edocastro http://tinyurl.com/h35nqt4
Session objective R and Python are the new tools for data professionals. The SQL Server DBA should know how to integrate R scripts into data analytics and data warehouses. In this session, you will learn how to use the new feature in SQL Server 2016 to run R scripts.
Data Science and Data Analytics Statistics, machine learning algorithms applied to data analysis Hypotheses, experiments, facts with tools popular among statistics experts.
Data wrangling Big data Data mining & machine learning Statistics
New data sources in the Data Analysis Pipe [Diagram labels: Data Transformation, Big Data Tools, R Language, Big Data, Unstructured Data Sources, Tabular, OLAP, SQL, Power BI]
Tools Chart from "Data Science Salary Survey 2014" (ISBN 978-1-491-91842-5) 2015 O'Reilly Media, used with permission. Arrows mine. For more info, and great titles on data science, visit oreilly.com
Popular Tools
- SPSS, Matlab, SAS
- NoSQL, MongoDB, Couchbase, Cassandra
- Microsoft Excel
- Java, R, Python, Clojure, Haskell, Scala
- Hadoop, HDFS, MapReduce, Spark, Storm
- HBase, Pig, Hive, Shark
- ETL, web scrapers, Flume, Sqoop
- SQL, RDBMS, DW, OLAP
- Knime, Weka, RapidMiner
Tools by Microsoft
- Hadoop in the cloud + Storm (real-time analysis) + HBase (NoSQL) + Mahout (Machine Learning)
- Power BI: Power Query, Power View, and Dashboards
- Excel
- Azure Data Factory (ETL in the cloud)
- Analytics Platform System (SQL Server on steroids + Hadoop + hardware)
- Streaming data from the cloud
- Based on HDInsight / Hadoop
Tools by Microsoft
- Lets you run scripts inside Visual Studio
- Integrate R scripts
- Integrate R graphs
- Open Source and Enterprise editions
What is R?
- Interpreted language
- Emphasis on statistical software; 5000+ packages
- IDE: R Studio http://www.rstudio.com/
- Open source, free, multiplatform
- R Core: http://cran.r-project.org/
- Revolution Analytics: parallelism and performance: http://www.revolutionanalytics.com/
- Azure ML: built-in
First steps with R
- R is a language popular among statistics experts and data scientists
- Open source
- R is extensible: there are hundreds of packages that add new functionality to R
- How to install R: http://www.r-project.org/
- Multiplatform: Windows, Mac, Linux
- To install an IDE: R Studio, an IDE for R, http://www.rstudio.com/
- First install R, then R Studio
R Studio
The Open Source R
- R loads data in memory
- R uses only ONE thread
- It is not easy to create an R cluster
- R Open is supported by the community
- Microsoft R Server doesn't have these limitations
Microsoft R Server previously Revolution Server
Microsoft R Server Versions: Microsoft R Open, Microsoft R Enterprise
Integrating R inside SQL Server 2016
- Fraud detection, Sales forecast, Predictive Maintenance
- R Language, R Scripting, Analytical library
- T-SQL Interface, Relational data
- SQL Server 2016
- Data scientists interact directly with data
- Data Developer / DBA: data management and analytics in the same engine
- Azure Machine Learning
- Support for the R language and Python
Installing R Support in SQL 2016
R integration within SQL Server 2016
EXEC sp_configure 'external scripts enabled', 1;
RECONFIGURE;
"C:\Program Files\RRO\RRO-3.2.2-for-RRE-7.5.0\R-3.2.2\library\RevoScaleR\rxLibs\x64\registerRext.exe" /install
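Once the option is enabled, a quick check can confirm that the setting is active and that the R runtime responds. The following is a minimal sketch, not part of the original slides; it assumes R Services is installed on a SQL Server 2016 instance and that the service was restarted if the change required it. The column alias hello is only illustrative.

```sql
-- Show the configured and running values of the option.
-- run_value should be 1 once the change is active.
EXEC sp_configure 'external scripts enabled';
GO

-- Round-trip a single row through the R runtime:
-- the R script simply returns its input data frame unchanged.
EXEC sp_execute_external_script
    @language = N'R',
    @script = N'OutputDataSet <- InputDataSet;',
    @input_data_1 = N'SELECT 1 AS hello'
WITH RESULT SETS ((hello INT NOT NULL));
GO
```

If the second statement returns one row, the database engine was able to start the external R process.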
R integration within SQL Server 2016
USE <target database name>
GO
CREATE LOGIN [<login name>] WITH PASSWORD = '<password>', CHECK_EXPIRATION = OFF, CHECK_POLICY = OFF;
CREATE USER [<user name>] FOR LOGIN [<login name>] WITH DEFAULT_SCHEMA = [db_datareader];
ALTER ROLE [db_datareader] ADD MEMBER [<user name>];
R integration within SQL Server 2016
USE [master]
GO
CREATE USER [<user name>] FOR LOGIN [<login name>] WITH DEFAULT_SCHEMA = [db_rrerole];
ALTER ROLE [db_rrerole] ADD MEMBER [<user name>];
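As an addition that is not taken from the slides: in the released version of SQL Server 2016, running external scripts is also governed by a database-level permission, EXECUTE ANY EXTERNAL SCRIPT. A minimal sketch, reusing the same placeholders as the statements above and assuming the user already exists in the target database:

```sql
USE <target database name>
GO
-- Allow the user to call sp_execute_external_script in this database.
GRANT EXECUTE ANY EXTERNAL SCRIPT TO [<user name>];
GO
```

The user still needs read access to any tables referenced by the input query, which is what the db_datareader membership provides.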
Demo. Installing R Support
What tool should I use?
Using R Studio
Demo. Using R Studio
Review R inside SQL Server 2016 [Diagram repeated from "Integrating R inside SQL Server 2016"]
Demo. Running R Scripts inside SQL Server
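The demo itself is not captured in the slides. As a rough illustration of the kind of call it involves, the sketch below passes a small inline result set to R and returns summary statistics. The sales alias, the amount column, and the output column names are hypothetical and chosen only for this example; InputDataSet and OutputDataSet are the default data frame names used by sp_execute_external_script.

```sql
EXEC sp_execute_external_script
    @language = N'R',
    @script = N'
        # InputDataSet holds the rows returned by @input_data_1.
        # Whatever is assigned to OutputDataSet is returned to SQL Server.
        OutputDataSet <- data.frame(
            avg_amount = mean(InputDataSet$amount),
            max_amount = max(InputDataSet$amount)
        );',
    @input_data_1 = N'
        SELECT CAST(amount AS FLOAT) AS amount
        FROM (VALUES (10.0), (20.0), (30.0)) AS sales(amount)'
WITH RESULT SETS ((avg_amount FLOAT, max_amount FLOAT));
GO
```

In practice the inline VALUES list would be replaced by a query against a real table, and the WITH RESULT SETS clause would describe whatever columns the R script returns.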
Summary
- There are new requirements for the DBA
- Often they come from the Data Science area
- In this session we showed how to leverage the new features in SQL Server 2016 to include R scripts inside the database in an integrated way