November 18, 2017. Concepción, Chile. #sqlsatconce
1 November 18, 2017. Concepción, Chile. #sqlsatconce
2 SQL Server 2017 - Deep Learning: image classification using the Azure Data Science Virtual Machine. Speaker: Adrián J. Fernandez. Role: Data and Artificial Intelligence Technical Specialist (Microsoft TSP Data & AI). Adrian.Fernandez@microsoft.com. Blog: chile.pass.org
3 SQL Saturday sponsors. SQL Saturday #684 Concepcion, Chile
4 Agenda: classifying galaxies using neural networks in the R language; Data Science and Deep Learning Virtual Machine; demos: galaxy classification / SQL Server R Services, WW Telescope
5 Microsoft R
6 Microsoft R: Microsoft R Server family
7 Microsoft R Server family: from data to action, on premises and in the cloud. Diagram labels: Data Sources, People, Apps, Microsoft R, Apps, Windows, SQL Server, Sensors and devices, Hadoop, Teradata, Linux, Automated Systems. Stages: DATA, INTELLIGENCE, ACTION
8 Microsoft R Server scales analytics to big data
- Scales via parallelization
- Scales via in-cluster execution
- Escapes R's memory limitations
- Reduces data movement and duplication
- Deploys to multiple platforms: Windows, Linux, SQL Server (R Services), Hadoop/Spark, Teradata
Diagram labels: R Open, R Server
9 ScaleR: parallel + big data
- ScaleR algorithms run in parallel across multiple cores and nodes at high speed.
- Data is streamed into RAM in blocks, so big data can be any size: megabytes, gigabytes, or terabytes.
- The XDF file format is optimised for the ScaleR library and significantly speeds up iterative algorithm processing.
- Interim results are collected and combined analytically to produce the output for the entire data set.
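The block-wise idea on this slide can be sketched in base R. This is only an illustration of combining interim results, not how ScaleR is implemented; the function name `chunked_mean_var` and the block size are assumptions made for the example.

```r
# Compute the mean and variance of a vector block by block,
# keeping only three small interim totals between blocks.
chunked_mean_var <- function(x, block_size) {
  n <- 0
  total <- 0
  total_sq <- 0
  starts <- seq(1, length(x), by = block_size)
  for (s in starts) {
    block <- x[s:min(s + block_size - 1, length(x))]
    # Interim results for this block
    n <- n + length(block)
    total <- total + sum(block)
    total_sq <- total_sq + sum(block^2)
  }
  # Combine the interim totals analytically
  mean_x <- total / n
  var_x <- (total_sq - n * mean_x^2) / (n - 1)
  list(n = n, mean = mean_x, var = var_x)
}

set.seed(42)
x <- rnorm(10000)
res <- chunked_mean_var(x, block_size = 1000)
```

Because only the counts and sums are carried between blocks, the same pattern would work when each block is read from disk or computed on a different node.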
10 ScaleR parallelized algorithms and functions
Data Preparation: data import (delimited, fixed, SAS, SPSS, ODBC); variable creation and transformation; recode variables; factor variables; missing value handling; sort, merge, split; aggregate by category (means, sums)
Descriptive Statistics: min / max, mean, median (approx.); quantiles (approx.); standard deviation; variance; correlation; covariance; sum of squares (cross product matrix for set variables); pairwise cross tabs; risk ratio and odds ratio; cross-tabulation of data (standard tables and long form); marginal summaries of cross tabulations
Statistical Tests: chi square test; Kendall rank correlation; Fisher's exact test; Student's t-test
Sampling: subsample (observations and variables); random sampling
Predictive Models: sum of squares (cross product matrix for set variables); multiple linear regression; generalized linear models (GLM) with exponential family distributions (binomial, Gaussian, inverse Gaussian, Poisson, Tweedie), standard link functions (cauchit, identity, log, logit, probit), and user-defined distributions and link functions; covariance and correlation matrices; logistic regression; classification and regression trees; predictions/scoring for models; residuals for all models
Variable Selection: stepwise regression
Simulation: simulation (e.g. Monte Carlo); parallel random number generation
Cluster Analysis: K-means
Classification: decision trees; decision forests; gradient boosted decision trees; naïve Bayes
Custom Development: rxDataStep; rxExec; PEMA-R API custom algorithms
11 ScaleR: Dramatic Performance and Capacity
12 In-Database Acceleration. Chart: time in minutes vs. number of rows for three setups: R Open on a server pulling data via SQL; Microsoft R on a server; invoking MRS ScaleR inside the EDW
13 MRS on Spark compared to open source R: HDInsight logistic regression comparisons (preliminary measure). Chart: times faster than CRAN R vs. number of rows (millions), for rxLogit in HDInsight (Spark CC) against CRAN R glm. 5 Spark nodes is 22X faster (~25x/node) than one CRAN R node running GLM. Configuration: HDI cluster size: 5 nodes; edge node: D14 v2 (16 cores, 112 GB); worker nodes: D12 (4 cores, 28 GB). Dataset: Airlines dataset (text format). Number of columns: 44
14 MRS on Spark compared to MRS on Hadoop: rxLogit in HDInsight (preliminary measure). Chart: times faster than local CC vs. number of rows (millions); series: Spark, MapReduce, Local. R Server on Spark is generally 6x faster than MapReduce, but local is the speed champion for smaller files. Configuration: HDI cluster size: 5 nodes; edge node: D14 v2 (16 cores, 112 GB); worker nodes: D12 (4 cores, 28 GB). Dataset: Airlines dataset. Number of columns: 44
15 Scalability: rxLogit on an HDInsight cluster (preliminary measure). Chart: time (seconds) vs. number of rows. Linear scale-out to 2 billion rows. Configuration: HDI cluster; edge node: D14 v2 (16 cores, 112 GB); worker nodes: D12 (4 cores, 28 GB). Dataset: NY City taxi trip dataset. Number of columns: ??
16 Machine Learning Templates With SQL Server R Services
17
18 2 trillion galaxies in the observable universe. Galaxy shape tells us about evolution: spiral galaxies, elliptical galaxies, collisions and other events. Image labels: NGC 325, M101, M51, ESO 325-G4; Forming, Ancient. Image credit: NASA, ESA, K. Kuntz (JHU), F. Bresolin (University of Hawaii), J. Trauger (Jet Propulsion Lab), J. Mould (NOAO), Y.-H. Chu (University of Illinois, Urbana), and STScI
19 How to classify galaxies? Professional astronomers; citizen data science; computer vision. Image: The Astronomer by Johannes Vermeer (Wikipedia)
20 How do computers see? Source: Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations, Honglak Lee, Roger Grosse, Rajesh Ranganath, Andrew Y. Ng
21 What computers see: a neural network labels one image Spiral and another Elliptical
22 What computers see: convolution
- Match pieces of the image, then repeat across the entire image.
- Each kernel matches a specific shape across the entire image.
- The result is automatic feature generation.
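As a rough illustration of matching a kernel across an image, the base R sketch below slides a small kernel over a matrix and records how strongly each patch matches. The function name `convolve2d_valid`, the toy image, and the kernel values are assumptions for the example; like most deep learning libraries, it computes a cross-correlation (the kernel is not flipped) and uses no padding.

```r
# Slide a kernel over every position where it fits entirely inside the image.
convolve2d_valid <- function(image, kernel) {
  kh <- nrow(kernel)
  kw <- ncol(kernel)
  out_h <- nrow(image) - kh + 1
  out_w <- ncol(image) - kw + 1
  out <- matrix(0, nrow = out_h, ncol = out_w)
  for (i in seq_len(out_h)) {
    for (j in seq_len(out_w)) {
      patch <- image[i:(i + kh - 1), j:(j + kw - 1), drop = FALSE]
      out[i, j] <- sum(patch * kernel)  # match score for this patch
    }
  }
  out
}

# Toy 5 x 5 image with a vertical line in the middle column
image <- matrix(0, nrow = 5, ncol = 5)
image[, 3] <- 1

# Kernel that responds to vertical lines (matrix() fills column by column)
kernel <- matrix(c(-1, -1, -1,
                    2,  2,  2,
                   -1, -1, -1), nrow = 3)

feature_map <- convolve2d_valid(image, kernel)
```

The feature map is largest where the kernel is centred on the line and negative on either side, which is the sense in which a kernel "matches a specific shape".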
23 Deep Learning: Convolutional Neural Network
24 Deep stacking: layers can be repeated several (or many) times. Diagram labels: Convolution, Pooling, Convolution, Pooling; outputs: Spiral, Elliptical
25 Galaxy characterization via DNN: rapid characterization of celestial bodies
- Uses the new Microsoft ML package
- Exploits GPU acceleration
- Part of new Microsoft investments in R: ease of use; powerful capability for R users; SQL Server, Windows, HPC, Batch Services
26 R Server inside SQL Server: call to a remote SQL Server instance with R inside. Algorithms: fast linear learner (SDCA); fast trees and forests; one-class SVM; regularized logistic regression (L1 and L2); neural networks
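The pooling step that alternates with convolution can be sketched in base R as follows. This is a minimal example of non-overlapping max pooling; the function name `max_pool` and the 2 x 2 window are assumptions, not settings taken from the talk.

```r
# Keep the maximum of each non-overlapping size x size block.
max_pool <- function(feature_map, size = 2) {
  out_h <- nrow(feature_map) %/% size
  out_w <- ncol(feature_map) %/% size
  out <- matrix(0, nrow = out_h, ncol = out_w)
  for (i in seq_len(out_h)) {
    for (j in seq_len(out_w)) {
      rows <- ((i - 1) * size + 1):(i * size)
      cols <- ((j - 1) * size + 1):(j * size)
      out[i, j] <- max(feature_map[rows, cols])
    }
  }
  out
}

fm <- matrix(1:16, nrow = 4)      # toy 4 x 4 feature map
pooled <- max_pool(fm, size = 2)  # 2 x 2 result
```

Pooling shrinks each feature map while keeping its strongest responses, which is what makes it practical to repeat the convolution and pooling layers several times.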
27 R code outline: load the required R packages with library(RevoScaleR) and library(MicrosoftML).
28 R code outline: run the neural network with rxNeuralNet, which fits neural networks for regression modeling and for binary and multi-class classification. Its type argument is a character string: "binary" for the default binary classification neural network, "multiClass" for a multi-class classification neural network, "regression" for a regression neural network.
29 R code outline: use GPU acceleration (see "Getting started with GPU acceleration for MicrosoftML's rxNeuralNet").
    library(RevoScaleR)
    library(MicrosoftML)
    model <- rxNeuralNet(
      formula,
      data = galaxy_data,
      netDefinition = netDefinition,
      type = "multiClass",
      acceleration = "gpu",
      miniBatchSize = 32
    )
30 R code outline: specify hyperparameters.
    model <- rxNeuralNet(
      formula,
      data = galaxy_data,
      netDefinition = netDefinition,
      type = "multiClass",
      acceleration = "gpu",
      miniBatchSize = 32,
      initWtsDiameter = ...,   # value not legible in this transcript
      numIterations = 5
    )
31 R code outline: the same call, where netDefinition holds the Net# definition of the structure of the neural network.
32 Network definition: Net# and hidden layers
- input [3, 5, 5]: input images
- rlinear [3, 5, 5] 64 convolve: 64 maps
- [, 4, 4] response norm: normalize
- [, 3, 3] max pool: max pooling
- output [3] softmax all: fully connected output
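The softmax output layer named in the network definition turns the final layer's raw scores into class probabilities. A minimal base R sketch follows; the class names and score values are hypothetical and are not the classes used in the demo.

```r
# Convert raw output scores into probabilities that sum to 1.
softmax <- function(scores) {
  shifted <- scores - max(scores)  # subtract the maximum for numerical stability
  exp(shifted) / sum(exp(shifted))
}

# Hypothetical raw scores from a fully connected output layer
scores <- c(spiral = 2.0, elliptical = 1.0, other = 0.1)
probs <- softmax(scores)

# The predicted class is the one with the highest probability
predicted <- names(probs)[which.max(probs)]
```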
33 Demo: Galaxy Image Classification using Deep Learning. New Microsoft machine learning package with GPU-powered deep learning
34 Galaxy image classification
35 Demo Galaxies Classifier
36 Azure is the Microsoft cloud service: scalable computing, storage and services
37 R Server GPU support: train the neural network using a GPU on Azure. GPU = graphics processing unit. Gives an increase in speed (the factor is missing from this transcript)
38 Deployment on Azure
39 Results
- ~K training images
- 8 layers in deep network
- 3 hours computing time on Azure GPU
- 95% overall accuracy on training data
- 8% overall accuracy on test data
The technique works, but has scope for improvement!
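Overall accuracy, as reported on this slide, is the share of images whose predicted class equals the true class. The base R sketch below shows the calculation on a handful of made-up labels; the data and the helper name `overall_accuracy` are assumptions and do not reproduce the talk's results.

```r
# Share of predictions that match the true labels.
overall_accuracy <- function(actual, predicted) {
  mean(actual == predicted)
}

# Made-up labels for five images
actual    <- c("spiral", "elliptical", "spiral", "elliptical", "spiral")
predicted <- c("spiral", "elliptical", "elliptical", "elliptical", "spiral")

acc <- overall_accuracy(actual, predicted)

# A confusion table shows which classes get mixed up
confusion <- table(actual = actual, predicted = predicted)
```

Computing the same measure separately on training and held-out test images is what exposes a gap between the two, as on the slide.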
40 8% Overall accuracy on test data
41
42
43 Conclusion
- Convolutional neural nets can predict galaxy class.
- You can use R Server to train and deploy a model.
- Use Azure GPU machines for faster training.
- Deploy to SQL Server.
44 Demo WWT
45 A faster, more efficient, more intelligent cloud
- Data explosion. Chart: data volume in ZB, reaching 44 ZB. Source: IDC 2014
- ML, DNN, AI are driving requirements up faster: autonomous decision making; real-time insights into connected devices; interactive user experiences; cloud-scale services; searches and recommendations (indexing the Internet!)
- The need for SCALE, the need for LOW-LATENCY, the need for THROUGHPUT
46 Azure AI Supercomputer
47 Silicon alternatives. TRAINING: CPUs and GPUs, limited FPGAs, ASICs under investigation. EVALUATION: CPUs and FPGAs, ASICs under investigation. Diagram: CPUs (registers, control unit (CU), arithmetic logic unit (ALU)), GPUs, FPGAs, ASICs, arranged from FLEXIBILITY to EFFICIENCY
48 Visualization Virtual Machines, powered by NVIDIA GRID
NV6: 6 cores; 1 M60 GPU (1/2 physical card); 56 GB memory; ~380 GB SSD; Azure Network
NV12: 12 cores; 2 M60 GPUs (1 physical card); 112 GB memory; ~680 GB SSD; Azure Network
NV24: 24 cores; 4 M60 GPUs (2 physical cards); 224 GB memory; ~1.5 TB SSD; Azure Network
49 Compute Azure Virtual Machines
NC6: 6 cores; 1 K80 GPU (1/2 physical card); 56 GB memory; ~380 GB SSD; Azure Network
NC12: 12 cores; 2 K80 GPUs (1 physical card); 112 GB memory; ~680 GB SSD; Azure Network
NC24: 24 cores; 4 K80 GPUs (2 physical cards); 224 GB memory; ~1.5 TB SSD; Azure Network
NC24r: 24 cores; 4 K80 GPUs (2 physical cards); 224 GB memory; ~1.5 TB SSD; InfiniBand
50 Community site in Chile: chile.pass.org
51 Global community site
52 Whatever your data passion, there is a virtual chapter for you!
53 Questions
54 Thank you for attending!
More informationPutting it all together: Creating a Big Data Analytic Workflow with Spotfire
Putting it all together: Creating a Big Data Analytic Workflow with Spotfire Authors: David Katz and Mike Alperin, TIBCO Data Science Team In a previous blog, we showed how ultra-fast visualization of
More informationCloud Computing 2. CSCI 4850/5850 High-Performance Computing Spring 2018
Cloud Computing 2 CSCI 4850/5850 High-Performance Computing Spring 2018 Tae-Hyuk (Ted) Ahn Department of Computer Science Program of Bioinformatics and Computational Biology Saint Louis University Learning
More informationCortana Intelligence Suite; Where the Magic Happens
Cortana Intelligence Suite; Where the Magic Happens Reza Rad, Leila Etaati #509 Brisbane 2016 About Us Reza Rad Leila Etaati MVP BI Consultant and Trainer Author of Books Speaker in conferences; PASS Summit,
More informationOverview of Data Services and Streaming Data Solution with Azure
Overview of Data Services and Streaming Data Solution with Azure Tara Mason Senior Consultant tmason@impactmakers.com Platform as a Service Offerings SQL Server On Premises vs. Azure SQL Server SQL Server
More informationMachine Learning with Python
DEVNET-2163 Machine Learning with Python Dmitry Figol, SE WW Enterprise Sales @dmfigol Cisco Spark How Questions? Use Cisco Spark to communicate with the speaker after the session 1. Find this session
More informationCERN openlab & IBM Research Workshop Trip Report
CERN openlab & IBM Research Workshop Trip Report Jakob Blomer, Javier Cervantes, Pere Mato, Radu Popescu 2018-12-03 Workshop Organization 1 full day at IBM Research Zürich ~25 participants from CERN ~10
More informationData Mining: STATISTICA
Outline Data Mining: STATISTICA Prepare the data Classification and regression (C & R, ANN) Clustering Association rules Graphic user interface Prepare the Data Statistica can read from Excel,.txt and
More informationScaled Machine Learning at Matroid
Scaled Machine Learning at Matroid Reza Zadeh @Reza_Zadeh http://reza-zadeh.com Machine Learning Pipeline Learning Algorithm Replicate model Data Trained Model Serve Model Repeat entire pipeline Scaling
More informationChallenges for Data Driven Systems
Challenges for Data Driven Systems Eiko Yoneki University of Cambridge Computer Laboratory Data Centric Systems and Networking Emergence of Big Data Shift of Communication Paradigm From end-to-end to data
More information10/14/2017. Dejan Sarka. Anomaly Detection. Sponsors
Dejan Sarka Anomaly Detection Sponsors About me SQL Server MVP (17 years) and MCT (20 years) 25 years working with SQL Server Authoring 16 th book Authoring many courses, articles Agenda Introduction Simple
More informationDeep learning in MATLAB From Concept to CUDA Code
Deep learning in MATLAB From Concept to CUDA Code Roy Fahn Applications Engineer Systematics royf@systematics.co.il 03-7660111 Ram Kokku Principal Engineer MathWorks ram.kokku@mathworks.com 2017 The MathWorks,
More informationMIOVISION DEEP LEARNING TRAFFIC ANALYTICS SYSTEM FOR REAL-WORLD DEPLOYMENT. Kurtis McBride CEO, Miovision
MIOVISION DEEP LEARNING TRAFFIC ANALYTICS SYSTEM FOR REAL-WORLD DEPLOYMENT Kurtis McBride CEO, Miovision ABOUT MIOVISION COMPANY Founded in 2005 40% growth, year over year Offices in Kitchener, Canada
More informationTransforming Transport Infrastructure with GPU- Accelerated Machine Learning Yang Lu and Shaun Howell
Transforming Transport Infrastructure with GPU- Accelerated Machine Learning Yang Lu and Shaun Howell 11 th Oct 2018 2 Contents Our Vision Of Smarter Transport Company introduction and journey so far Advanced
More informationMachine Learning in WAN Research
Machine Learning in WAN Research Mariam Kiran mkiran@es.net Energy Sciences Network (ESnet) Lawrence Berkeley National Lab Oct 2017 Presented at Internet2 TechEx 2017 Outline ML in general ML in network
More informationBuilding the Most Efficient Machine Learning System
Building the Most Efficient Machine Learning System Mellanox The Artificial Intelligence Interconnect Company June 2017 Mellanox Overview Company Headquarters Yokneam, Israel Sunnyvale, California Worldwide
More informationBeyond Training The next steps of Machine Learning. Chris /in/chrisparsonsdev
Beyond Training The next steps of Machine Learning Chris Parsons chrisparsons@uk.ibm.com @chrisparsonsdev /in/chrisparsonsdev What is this talk? Part 1 What is Machine Learning? AI Infrastructure PowerAI
More information