Tomaž Kaštrun. Data Science for Beginners
Transcription
1 Tomaž Kaštrun Data Science for Beginners
2 To all sponsors, thank you!
3 Thanks to all organizers! GetLatestVersion.it
4 About (2.0.1) BI Developer and data analyst SQL Server, SAS, R, Python, C#, SAP, SPSS 15 years' experience MSSQL, DEV, BI, DM Working at GEN-I Frequent community speaker Avid coffee drinker & Bicycle junkie
5 This talk was born out of frustration! Why?
6 faking it!
7 Can you answer? 1. Explain what regularization is and why it is useful. 2. Explain what precision and recall are. How do they relate to the ROC curve? 3. What is root cause analysis? 4. Are you familiar with pricing optimization, price elasticity, inventory management, competitive intelligence? Give examples. 5. What is statistical power? 6. Explain what resampling methods are and why they are useful. Also explain their limitations. 7. Is it better to have too many false positives, or too many false negatives? Explain. 8. What is selection bias, why is it important and how can you avoid it? 9. Give an example of how you would use experimental design to answer a question about user behavior. 10. How would you screen for outliers and what should you do if you find one? 11. How would you use either the extreme value theory, Monte Carlo simulations or mathematical statistics (or anything else) to correctly estimate the chance of a very rare event? 12. What is a recommendation engine? How does it work? 13. Explain what a false positive and a false negative are. Why is it important to differentiate these from each other? Source:
8 Can you answer? Source:
9 Developers and database people Statisticians Data science People from business
10 Developers and database people Statisticians People from business
11 Developers and database people Statisticians a.k.a. Data scientist People from business
12 What is data science? It's a buzzword!
13 Terms over time statistics Data mining Data science? ~ Examples: - Regression (Stats 101, SSAS, R+MSSQL) - Decision trees (Stats 101, SSAS, R+MSSQL) - Complexity reduction (Stats 101, SSAS, R+MSSQL) - Clustering (Stats 101, SSAS, R+MSSQL) 2010 ~
14 Interfaces over time statistics Data mining Data science ~ ~
15 Who is a data scientist? MacBook Statistician San Francisco
16 So who is a data scientist? NOOooo!!!! It's a statistician!!! Source: Internet
17 Because what's next???? Internet killing doctors? It's like searching the internet for your symptoms instead of visiting a doctor!
18 Same goes for data science? Is the buzz data science killing statistics? and statisticians?
19 Why was the data scientist born? Because it was too damn hard to pronounce /ˌstætɪˈstɪʃ(ə)n/ STATISTICIAN
20 and because it's a sexy job. According to a lot of articles published in 2015 and 2016 (thanks to the decline of research journalism and to copy/paste journalists) this was the top-paid job, a highly appreciated and wanted position. Well, good morning! But we have had these positions since the '60s. They were called. f@%#! that word stasti statsi.something well, straccatella
21 and think about the movies.
22 at the end everyone would like to talk about. even though they have never seen.
23 at the very end data science is like teenage sex Everybody talks about it. Nobody really knows how to do it. Everybody thinks everyone else is doing it. So everyone claims they are doing it. In reality this looks like
24 I don't want to conclude, but. Having a copy of your business logic and data jammed into 1,048,576 rows by 16,384 columns Data Science = Point & Click adventure game in Azure Machine Learning Copy-Paste-from-web advanced skills Huh Totally forgot about that?!
25 Let's do a quick example to support this problem Euclidean distances between friends
26 Let's do a quick example to support this problem Did he just say Eucl???? distance?
27 Let's do a quick example to support this problem dist(samplerestaurants, method = "euclidean")
28 Let's do a quick example to support this problem Simple distance from the center can give us the sensitivity of the ratings
29 Let's do a quick example to support this problem And the so-called sensitivity?
30 Think about. Data scientists use: dist(samplerestaurants, method = "euclidean") Statisticians (and developers) use: scale(samplerestaurants, center = TRUE, scale = TRUE)
31 Let's do a quick example to support this problem And doing some more goofing around.
32 Let's do a quick example to support this problem but not sure if an algorithm preventing goofy pictures exists?
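The restaurant example from the slides above can be tried in plain R. This is a minimal sketch, not the speaker's demo: the ratings, the friends' names and the `dist_from_center` variable are made up for illustration, and only `dist()` and `scale()` come from the slides.

```r
# Hypothetical ratings (1-5): rows are friends, columns are restaurants.
samplerestaurants <- data.frame(
  pizzeria = c(5, 4, 1),
  bistro   = c(3, 3, 5),
  sushi    = c(1, 2, 4),
  row.names = c("Ana", "Bor", "Cene")
)

# Pairwise Euclidean distances between friends (rows of the table).
friend_dist <- dist(samplerestaurants, method = "euclidean")
print(round(friend_dist, 2))

# Standardize each restaurant column: subtract its mean, divide by its sd.
scaled <- scale(samplerestaurants, center = TRUE, scale = TRUE)

# After centring, the "center" is the origin, so each friend's distance
# from it is the length of that friend's standardized rating vector.
dist_from_center <- sqrt(rowSums(scaled^2))
print(round(dist_from_center, 2))
```

Friends with similar tastes end up close together in `friend_dist`, while a large value in `dist_from_center` marks someone whose ratings sit far from the group average.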
33 How to stop faking it? Start learning it. Understand. Test, test, test.
34 Why you're not a data scientist?! Some time series business intelligence stack doesn't make you a data scientist. Hadoop, R, Python, Octave, Matlab and Mathematica are data science tools; programming experience and tool skills alone don't give you data science cred. The 8-week course you took on Coursera or the data science boot camp you attended does not make you a data scientist. Evangelizing big data does not make you a data scientist. Having a degree in mathematics and statistics without field and applied knowledge does not make you a data scientist. Source:
35 Agenda for today 1) What we do in Data Science 2) Materials, tools and programs 3) Data science in business world
36 1) What we do in Data Science (part 1) Querying relational data Analyzing and visualizing the data Tasks for: Developers and Database people
37 1) What we do in Data Science (part 2) Understanding statistics Exploring data with R / Python / Julia Understanding core data science concepts Understand Machine Learning Programming with R / Python to manipulate and model data Apply solution Tasks for: statisticians Tasks for: Business people
38 5 Core data science concepts? 1) is this weird? 2) is A better than B, respectively? 3) how much / many of this is needed? 4) does this belong to group A? 5) what is next?
39 Think of. Algorithm as a cooking recipe
40 Think of. Your dataset as the ingredients
41 Think of. Your pans and pots as a computer
42 Think of. Statistician Data Scientist as a chef
43 Think of. Results as a finished dish
44 Think of. API as a prepared food
45 Is this weird spot the intruder? Anomaly detection
46 Is it Blue or is it Gold? Classification
47 What will be the temperature / stock? Regression
48 Belong to which group? Clustering
49 what is next? Reinforcement learning algorithms
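Three of the questions on the slides above map onto functions that ship with base R. The sketch below is only an illustration under assumptions of my own: it uses the built-in `iris` data, a 3-standard-deviation cut-off for "weird", and three clusters for k-means. None of these choices come from the talk, and classification and reinforcement learning are left out to keep it short.

```r
data(iris)

# "Is this weird?" (anomaly detection): flag sepal widths more than
# 3 standard deviations from the mean. The cut-off of 3 is an assumption.
z <- abs(scale(iris$Sepal.Width))
outliers <- which(z > 3)
print(iris[outliers, ])

# "How much / how many?" (regression): predict sepal length from petal length.
fit <- lm(Sepal.Length ~ Petal.Length, data = iris)
print(coef(fit))

# "Which group does it belong to?" (clustering): k-means on the four
# measurements, ignoring the Species label. Three clusters is an assumption.
set.seed(1)
km <- kmeans(iris[, 1:4], centers = 3, nstart = 10)
print(table(cluster = km$cluster, species = iris$Species))
```

The final table compares the unsupervised clusters with the known species, which is one quick way to see how well "which group?" was answered.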
50 2) Is your data ready for data science? 1) Is it relevant 2) Is data correlated 3) Is data distributed and accurate 4) Do I have enough data (variables, columns) 5) Unwanted correlations (multicollinearity, hyper )
51 2) Is your data ready for data science? Is it relevant
52 2) Is your data ready for data science? Is it related (or non-empty)?
53 2) Is your data ready for data science? Is it accurate? Source:
54 2) Is your data ready for data science? Do we have enough? Sampling Number of observations vs. Number of variables Type of algorithm
55 3) Ask the right question? 1) Ask SMART 2) Ask in a way that includes target/predicted data 3) Formulate question based on data and algorithm
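Some of the readiness checks listed on these slides (empty values, correlations between variables, observations versus variables) can be run in a few lines of base R. This is a rough sketch under my own assumptions: it uses the built-in `airquality` data, and the 0.8 correlation threshold is an arbitrary illustrative value, not a rule from the talk.

```r
df <- airquality

# Is it non-empty? Count missing values per column.
missing_per_column <- colSums(is.na(df))
print(missing_per_column)

# Is it correlated? Pairwise correlations, skipping missing values.
cor_mat <- cor(df, use = "pairwise.complete.obs")
print(round(cor_mat, 2))

# Unwanted correlations: list variable pairs with |r| above a threshold.
# The 0.8 threshold is an assumption chosen for illustration.
high_pairs <- which(abs(cor_mat) > 0.8 & upper.tri(cor_mat), arr.ind = TRUE)
print(high_pairs)

# Do we have enough? Number of observations per variable.
obs_per_var <- nrow(df) / ncol(df)
print(obs_per_var)
```

Relevance and accuracy still need someone who knows the business; these checks only cover what can be read off the data itself.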
56 Model data and apply solution 1) Model data 2) Predict data 3) Apply solution
57 2) Materials and tools (Part 1) R consortium Books on statistics, statistical learning and machine learning Microsoft Books Online Microsoft Virtual Academy Udemy, Packt,
58 2) Materials and tools (Part 2) R / Python / Julia / Excel Use Microsoft Azure ML Amazon Web Services EC2 SQL Server BI stack Many vendors: SAS, IBM, Tibco, SAP, Tableau, Pentaho, Qlik, MicroStrategy, Alteryx, etc.
59 3) Data Science in business world Loyalty program churn analysis Fraud detection Out-of-stock prediction Customer classification Recommendation stuff Customer behaviour
60 Sources: Microsoft MVA Microsoft Data Science program ( Stats
61 R and SQL Server (SQL Server 2017 CTP 2.x)
62 The architecture behind
63 Ecosystem RevoScaleR Package
64 R and T-SQL for predictive analytics

-- Train a naive Bayes classifier on iris_data and return the serialized model
EXECUTE sp_execute_external_script
    @language = N'R',
    @script = N'
        library(e1071);
        irismodel <- naiveBayes(iris_data[,1:4], iris_data[,5]);
        trained_model <- data.frame(payload = as.raw(serialize(irismodel, connection=NULL)));',
    @input_data_1 = N'select "Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width", "Species" from iris_data',
    @input_data_1_name = N'iris_data',
    @output_data_1_name = N'trained_model'
WITH RESULT SETS ((model VARBINARY(MAX)));

-- Train a linear model with RevoScaleR on iris_rx_data and return the serialized model
EXECUTE sp_execute_external_script
    @language = N'R',
    @script = N'
        require("RevoScaleR");
        irislinmod <- rxLinMod(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + Species, data = iris_rx_data);
        trained_model <- data.frame(payload = as.raw(serialize(irislinmod, connection=NULL)));',
    @input_data_1 = N'select "Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width", "Species" from iris_rx_data',
    @input_data_1_name = N'iris_rx_data',
    @output_data_1_name = N'trained_model'
WITH RESULT SETS ((model VARBINARY(MAX)));
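The key step in both scripts on this slide is turning a fitted model into raw bytes so it can travel back to SQL Server as VARBINARY(MAX). The sketch below replays that round trip in plain R, outside SQL Server. It is an illustration under assumptions: base R's `lm()` stands in for `naiveBayes` and the RevoScaleR function so that no extra packages are needed, and the built-in `iris` data replaces the `iris_data` table.

```r
# Fit a model (lm() is a stand-in for the models on the slide).
irislinmod <- lm(
  Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width + Species,
  data = iris
)

# Serialize the model to a raw vector and wrap it in a data frame,
# mirroring the trained_model output of the slide's R script.
trained_model <- data.frame(
  payload = as.raw(serialize(irislinmod, connection = NULL))
)

# Later (for example in a scoring script): rebuild the model from the bytes.
restored <- unserialize(trained_model$payload)

# Score a few rows with the restored model.
preds <- predict(restored, newdata = head(iris, 5))
print(round(preds, 2))
```

Storing the bytes in a table column is what lets a separate scoring call load the model without retraining it.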
65 Thanks! and learn Statistics!!
66 #sqlsat675 THANKS! Q&A
Andrea Martorana Tusa. Failure prediction for manifacturing industry
Andrea Martorana Tusa Failure prediction for manifacturing industry Event Sponsors Expo Sponsors Expo Light Sponsors Speaker Info First name: Andrea. Last name: Martorana Tusa. Italian, working by Widex
More informationSQL Server 2016 R Integration for database administrators
SQL Server 2016 R Integration for database administrators What can DBA gain by using R Integration for SQL Server 2016? Tomaž Kaštrun 20.Jänner, 2017 Our Sponsors About BI Developer and data analyst (SQL
More informationThink & Work like a Data Scientist with SQL 2016 & R DR. SUBRAMANI PARAMASIVAM (MANI)
Think & Work like a Data Scientist with SQL 2016 & R DR. SUBRAMANI PARAMASIVAM (MANI) About the Speaker Dr. SubraMANI Paramasivam PhD., MCT, MCSE, MCITP, MCP, MCTS, MCSA CEO, Principal Consultant & Trainer
More informationBEST BIG DATA CERTIFICATIONS
VALIANCE INSIGHTS BIG DATA BEST BIG DATA CERTIFICATIONS email : info@valiancesolutions.com website : www.valiancesolutions.com VALIANCE SOLUTIONS Analytics: Optimizing Certificate Engineer Engineering
More informationDr. SubraMANI Paramasivam. Think & Work like a Data Scientist with SQL 2016 & R
Dr. SubraMANI Paramasivam Think & Work like a Data Scientist with SQL 2016 & R About the Speaker Group Leader Dr. SubraMANI Paramasivam PhD., MVP, MCT, MCSE (x2), MCITP (x2), MCP, MCTS (x3), MCSA CEO,
More informationPredictive Analysis: Evaluation and Experimentation. Heejun Kim
Predictive Analysis: Evaluation and Experimentation Heejun Kim June 19, 2018 Evaluation and Experimentation Evaluation Metrics Cross-Validation Significance Tests Evaluation Predictive analysis: training
More informationModeling. Preparation. Operationalization. Profile Explore. Model Testing & Validation. Feature & Algorithm Selection. Transform Cleanse Denormalize
Preparation Modeling Ingest Transform Cleanse Denormalize Profile Explore Visualize Feature & Algorithm Selection Model Testing & Validation Operationalization Models Visualizations Deploy Apps, Services
More informationBoost your Analytics with ML for SQL Nerds
Boost your Analytics with ML for SQL Nerds SQL Saturday Spokane Mar 10, 2018 Julie Koesmarno @MsSQLGirl mssqlgirl.com jukoesma@microsoft.com Principal Program Manager in Business Analytics for SQL Products
More informationBuild a system health check for Db2 using IBM Machine Learning for z/os
Build a system health check for Db2 using IBM Machine Learning for z/os Jonathan Sloan Senior Analytics Architect, IBM Analytics Agenda A brief machine learning overview The Db2 ITOA model solutions template
More informationData Analysis Using Sql And Excel 2nd Edition
We have made it easy for you to find a PDF Ebooks without any digging. And by having access to our ebooks online or by storing it on your computer, you have convenient answers with data analysis using
More informationMACHINE LEARNING Example: Google search
MACHINE LEARNING Lauri Ilison, PhD Data Scientist 20.11.2014 Example: Google search 1 27.11.14 Facebook: 350 million photo uploads every day The dream is to build full knowledge of the world and know everything
More informationData Science Training
Data Science Training R, Predictive Modeling, Machine Learning, Python, Bigdata & Spark 9886760678 Introduction: This is a comprehensive course which builds on the knowledge and experience a business analyst
More informationThe Allure of Machine Learning, now within Reach in Microsoft Azure
A Mariner White Paper The Allure of Machine Learning, now within Reach in Microsoft Azure Or Why AzureML is Better for Data Mining than Excel By Colby Ford, Associate Data Analytics Consultant 2719 Coltsgate
More informationScalable Tools - Part I Introduction to Scalable Tools
Scalable Tools - Part I Introduction to Scalable Tools Adisak Sukul, Ph.D., Lecturer, Department of Computer Science, adisak@iastate.edu http://web.cs.iastate.edu/~adisak/mbds2018/ Scalable Tools session
More informationSOFTWARE DEVELOPMENT: DATA SCIENCE
PROFESSIONAL CAREER TRAINING INSTITUTE SOFTWARE DEVELOPMENT: DATA SCIENCE www.pcti.edu/data-science applicant@pcti.edu 832-484-9100 PROGRAM OVERVIEW Prepare for a life changing career as a data scientist
More informationClustering algorithms and autoencoders for anomaly detection
Clustering algorithms and autoencoders for anomaly detection Alessia Saggio Lunch Seminars and Journal Clubs Université catholique de Louvain, Belgium 3rd March 2017 a Outline Introduction Clustering algorithms
More informationSpecialist ICT Learning
Specialist ICT Learning APPLIED DATA SCIENCE AND BIG DATA ANALYTICS GTBD7 Course Description This intensive training course provides theoretical and technical aspects of Data Science and Business Analytics.
More informationDeploying, Managing and Reusing R Models in an Enterprise Environment
Deploying, Managing and Reusing R Models in an Enterprise Environment Making Data Science Accessible to a Wider Audience Lou Bajuk-Yorgan, Sr. Director, Product Management Streaming and Advanced Analytics
More informationData Engineering for Data Science
Engineering for Science Arup Nanda VP, Services Priceline booking.com priceline.com kayak.com agoda.com rentalcars.com opentable.com 2 Science and Machine Learning Customer Segmentation Prediction of Behavior
More informationSlice Intelligence!
Intern @ Slice Intelligence! Wei1an(Wu( September(8,(2014( Outline!! Details about the job!! Skills required and learned!! My thoughts regarding the internship! About the company!! Slice, which we call
More informationSQL Server Machine Learning Marek Chmel & Vladimir Muzny
SQL Server Machine Learning Marek Chmel & Vladimir Muzny @VladimirMuzny & @MarekChmel MCTs, MVPs, MCSEs Data Enthusiasts! vladimir@datascienceteam.cz marek@datascienceteam.cz Session Agenda Machine learning
More informationUSERS CONFERENCE Copyright 2016 OSIsoft, LLC
Bridge IT and OT with a process data warehouse Presented by Matt Ziegler, OSIsoft Complexity Problem Complexity Drives the Need for Integrators Disparate assets or interacting one-by-one Monitoring Real-time
More informationIntroduction to Data Mining and Data Analytics
1/28/2016 MIST.7060 Data Analytics 1 Introduction to Data Mining and Data Analytics What Are Data Mining and Data Analytics? Data mining is the process of discovering hidden patterns in data, where Patterns
More informationSpotfire and Tableau Positioning. Summary
Licensed for distribution Summary So how do the products compare? In a nutshell Spotfire is the more sophisticated and better performing visual analytics platform, and this would be true of comparisons
More informationACHIEVEMENTS FROM TRAINING
LEARN WELL TECHNOCRAFT DATA SCIENCE/ MACHINE LEARNING SYLLABUS 8TH YEAR OF ACCOMPLISHMENTS AUTHORIZED GLOBAL CERTIFICATION CENTER FOR MICROSOFT, ORACLE, IBM, AWS AND MANY MORE. 8411002339/7709292162 WWW.DW-LEARNWELL.COM
More informationDr. Michael Curry. Oregon. The Big Picture: SQL Overview and Getting the Most from SQL Saturday
Dr. Michael Curry michael.curry@wsu.edu Oregon The Big Picture: SQL Overview and Getting the Most from SQL Saturday Academic Data Management E-Commerce Entrepreneurship Dr. Michael Curry /michaellcurry/
More informationExecution of R Built Predictive Solutions
Execution of R Built Predictive Solutions Alex Guazzelli, PhD VP, Analytics - Zementis, Inc. user! 2010 Zementis Exporting Models from R Memory Why? Speed Transparency Freedom Interoperability Accessibility
More informationBig Data Analytics: What is Big Data? Stony Brook University CSE545, Fall 2016 the inaugural edition
Big Data Analytics: What is Big Data? Stony Brook University CSE545, Fall 2016 the inaugural edition What s the BIG deal?! 2011 2011 2008 2010 2012 What s the BIG deal?! (Gartner Hype Cycle) What s the
More informationThe Data Science Process. Polong Lin Big Data University Leader & Data Scientist IBM
The Data Science Process Polong Lin Big Data University Leader & Data Scientist IBM polong@ca.ibm.com Every day, we create 2.5 quintillion bytes of data so much that 90% of the data in the world today
More informationMastering Data Warehouse Aggregates Solutions For Star Schema Performance
Mastering Data Warehouse Aggregates Solutions For Star Schema Performance Star Schema The Complete Reference Christopher Adamson Amazon. Mastering Data Warehouse Aggregates, Solutions for Star Schema Performance
More informationCertified Data Science with Python Professional VS-1442
Certified Data Science with Python Professional VS-1442 Certified Data Science with Python Professional Certified Data Science with Python Professional Certification Code VS-1442 Data science has become
More informationPython With Data Science
Course Overview This course covers theoretical and technical aspects of using Python in Applied Data Science projects and Data Logistics use cases. Who Should Attend Data Scientists, Software Developers,
More informationChuck Cartledge, PhD. 12 October 2018
Big Data: Data Analysis Boot Camp Introduction and Overview Chuck Cartledge, PhD 12 October 2018 1/14 Table of contents (1 of 1) 1 Introduction The global view 2 Overview The world from 50,000 feet. Text
More informationIntroduction to Data Mining
Introduction to Data Mining Lecture #1: Course Introduction U Kang Seoul National University U Kang 1 In This Lecture Motivation to study data mining Administrative information for this course U Kang 2
More informationIndira Bandari. Predictive Analytics using R in SQL Server
Indira Bandari Predictive Analytics using R in SQL Server Agenda What is Predictive Analytics? Analytics vs. Predictive Analytics Benefits of using R Predictive Analytics Life Cycle Demo Indira Bandari
More informationKnowledge Discovery. URL - Spring 2018 CS - MIA 1/22
Knowledge Discovery Javier Béjar cbea URL - Spring 2018 CS - MIA 1/22 Knowledge Discovery (KDD) Knowledge Discovery in Databases (KDD) Practical application of the methodologies from machine learning/statistics
More informationMike Schulte Data Scientist at the University of Pittsburgh Professor of Economics and Philosophy at Western Michigan University
Mike Schulte mike@shrewd-owl.com Data Scientist at the University of Pittsburgh Professor of Economics and Philosophy at Western Michigan University Advanced Analytics Introduced Advanced Analytics within
More informationUNLEASHING THE VALUE OF THE TERADATA UNIFIED DATA ARCHITECTURE WITH ALTERYX
UNLEASHING THE VALUE OF THE TERADATA UNIFIED DATA ARCHITECTURE WITH ALTERYX 1 Successful companies know that analytics are key to winning customer loyalty, optimizing business processes and beating their
More informationNote: In the presentation I should have said "baby registry" instead of "bridal registry," see
Q-and-A from the Data-Mining Webinar Note: In the presentation I should have said "baby registry" instead of "bridal registry," see http://www.target.com/babyregistryportalview Q: You mentioned the 'Big
More informationIan Choy. Technology Solutions Professional
Ian Choy Technology Solutions Professional XML KPIs SQL Server 2000 Management Studio Mirroring SQL Server 2005 Compression Policy-Based Mgmt Programmability SQL Server 2008 PowerPivot SharePoint Integration
More informationData analysis using Microsoft Excel
Introduction to Statistics Statistics may be defined as the science of collection, organization presentation analysis and interpretation of numerical data from the logical analysis. 1.Collection of Data
More informationDoing the Data Science Dance
Doing the Data Science Dance Dean Abbott Abbott Analytics, SmarterHQ KNIME Fall Summit 2018 Email: dean@abbottanalytics.com Twitter: @deanabb 1 Data Science vs. Other Labels 2 Google Trends 3 Abbott Analytics,
More informationCitizen Data Scientist is the new Data Analyst
Welcome # T C 1 8 Citizen Data Scientist is the new Data Analyst Mehmet Vanli Sales Consultant Tableau Australia Citizen data scientist: A person who creates models that use advanced diagnostic analytics
More informationSpam. Time: five years from now Place: England
Spam Time: five years from now Place: England Oh no! said Joe Turner. When I go on the computer, all I get is spam email that nobody wants. It s all from people who are trying to sell you things. Email
More informationDatabase Management Systems
Database Management Systems Fall 2017 Knowledge is of two kinds: we know a subject ourselves, or we know where we can find information upon it. -- Samuel Johnson (1709-1784) Queries for Today Why? Who?
More informationBEGINNER SQL PROGRAMMING USING MICROSOFT SQL SERVER 2016
BEGINNER SQL PROGRAMMING USING PDF EBOOK3000 LEARNING SQL PROGRAMMING - LYNDA.COM 1 / 6 2 / 6 3 / 6 beginner sql programming using pdf ebook Details: Paperback: 296 pages Publisher: WOW! ebook (September
More informationAn Introduction to Big Data Formats
Introduction to Big Data Formats 1 An Introduction to Big Data Formats Understanding Avro, Parquet, and ORC WHITE PAPER Introduction to Big Data Formats 2 TABLE OF TABLE OF CONTENTS CONTENTS INTRODUCTION
More informationCollaboration at Scale: Prioritizing a Backlog. 13-Dec-2017
Collaboration at Scale: Prioritizing a Backlog 13-Dec-2017 Collaboration at Scale Designed for Scrum-centric organizations with more than 10 Scrum teams, the Collaboration at Scale webinar series provides
More informationKNIME for the life sciences Cambridge Meetup
KNIME for the life sciences Cambridge Meetup Greg Landrum, Ph.D. KNIME.com AG 12 July 2016 What is KNIME? A bit of motivation: tool blending, data blending, documentation, automation, reproducibility More
More informationData Mining Concepts & Tasks
Data Mining Concepts & Tasks Duen Horng (Polo) Chau Georgia Tech CSE6242 / CX4242 Sept 9, 2014 Partly based on materials by Professors Guy Lebanon, Jeffrey Heer, John Stasko, Christos Faloutsos Last Time
More informationDATA SCIENCE USING SPARK: AN INTRODUCTION
DATA SCIENCE USING SPARK: AN INTRODUCTION TOPICS COVERED Introduction to Spark Getting Started with Spark Programming in Spark Data Science with Spark What next? 2 DATA SCIENCE PROCESS Exploratory Data
More informationData Mining. ❷Chapter 2 Basic Statistics. Asso.Prof.Dr. Xiao-dong Zhu. Business School, University of Shanghai for Science & Technology
❷Chapter 2 Basic Statistics Business School, University of Shanghai for Science & Technology 2016-2017 2nd Semester, Spring2017 Contents of chapter 1 1 recording data using computers 2 3 4 5 6 some famous
More informationBEGINNER SQL PROGRAMMING USING MICROSOFT SQL SERVER 2012
BEGINNER SQL PROGRAMMING USING PDF EBOOK3000 LEARNING SQL PROGRAMMING - LYNDA.COM 1 / 6 2 / 6 3 / 6 beginner sql programming using pdf ebook Details: Paperback: 206 pages Publisher: WOW! ebook (September
More informationDEEP DIVE. Leave IT Alone: The Vast Value of Self-Service. #DMRadio
DEEP DIVE Leave IT Alone: The Vast Value of Self-Service #DMRadio Featured Speakers The Long-Standing Data Warehousing Models The Reliance on ETL Must Subside! Trust is the Cornerstone of Data-Driven
More informationAPACHE SPARK 2 FOR BEGINNERS BY RAJANARAYANAN THOTTUVAIKKATUMANA DOWNLOAD EBOOK : APACHE SPARK 2 FOR BEGINNERS BY RAJANARAYANAN THOTTUVAIKKATUMANA PDF
Read Online and Download Ebook APACHE SPARK 2 FOR BEGINNERS BY RAJANARAYANAN THOTTUVAIKKATUMANA DOWNLOAD EBOOK : APACHE SPARK 2 FOR BEGINNERS BY RAJANARAYANAN Click link bellow and free register to download
More informationData Analyst Nanodegree Syllabus
Data Analyst Nanodegree Syllabus Discover Insights from Data with Python, R, SQL, and Tableau Before You Start Prerequisites : In order to succeed in this program, we recommend having experience working
More informationMachine Learning - Clustering. CS102 Fall 2017
Machine Learning - Fall 2017 Big Data Tools and Techniques Basic Data Manipulation and Analysis Performing well-defined computations or asking well-defined questions ( queries ) Data Mining Looking for
More informationOPERATIONALIZING MACHINE LEARNING USING GPU ACCELERATED, IN-DATABASE ANALYTICS
OPERATIONALIZING MACHINE LEARNING USING GPU ACCELERATED, IN-DATABASE ANALYTICS 1 Why GPUs? A Tale of Numbers 100x Performance Increase Infrastructure Cost Savings Performance 100x gains over traditional
More informationLecture 19: Generative Adversarial Networks
Lecture 19: Generative Adversarial Networks Roger Grosse 1 Introduction Generative modeling is a type of machine learning where the aim is to model the distribution that a given set of data (e.g. images,
More informationCPSC 340: Machine Learning and Data Mining. Kernel Trick Fall 2017
CPSC 340: Machine Learning and Data Mining Kernel Trick Fall 2017 Admin Assignment 3: Due Friday. Midterm: Can view your exam during instructor office hours or after class this week. Digression: the other
More informationLecture 25: Review I
Lecture 25: Review I Reading: Up to chapter 5 in ISLR. STATS 202: Data mining and analysis Jonathan Taylor 1 / 18 Unsupervised learning In unsupervised learning, all the variables are on equal standing,
More informationI am a Data Nerd and so are YOU!
I am a Data Nerd and so are YOU! Not This Type of Nerd Data Nerd Coffee Talk We saw Cloudera as the lone open source champion of Hadoop and the EMC/Greenplum/MapR initiative as a more closed and
More informationWeek 1 Unit 1: Introduction to Data Science
Week 1 Unit 1: Introduction to Data Science The next 6 weeks What to expect in the next 6 weeks? 2 Curriculum flow (weeks 1-3) Business & Data Understanding 1 2 3 Data Preparation Modeling (1) Introduction
More informationSQL Server 2017: Data Science with Python or R?
SQL Server 2017: Data Science with Python or R? Dejan Sarka Sponsor Introduction Dejan Sarka (dsarka@solidq.com, dsarka@siol.net, @DejanSarka) 30 years of experience SQL Server MVP, MCT, 16 books 20+ courses,
More informationComparative analysis of data mining methods for predicting credit default probabilities in a retail bank portfolio
Comparative analysis of data mining methods for predicting credit default probabilities in a retail bank portfolio Adela Ioana Tudor, Adela Bâra, Simona Vasilica Oprea Department of Economic Informatics
More information10/14/2017. Dejan Sarka. Anomaly Detection. Sponsors
Dejan Sarka Anomaly Detection Sponsors About me SQL Server MVP (17 years) and MCT (20 years) 25 years working with SQL Server Authoring 16 th book Authoring many courses, articles Agenda Introduction Simple
More informationBusiness Club. Decision Trees
Business Club Decision Trees Business Club Analytics Team December 2017 Index 1. Motivation- A Case Study 2. The Trees a. What is a decision tree b. Representation 3. Regression v/s Classification 4. Building
More informationOutrun Your Competition With SAS In-Memory Analytics Sascha Schubert Global Technology Practice, SAS
Outrun Your Competition With SAS In-Memory Analytics Sascha Schubert Global Technology Practice, SAS Topics AGENDA Challenges with Big Data Analytics How SAS can help you to minimize time to value with
More informationOracle Big Data Science
Oracle Big Data Science Tim Vlamis and Dan Vlamis Vlamis Software Solutions 816-781-2880 www.vlamis.com @VlamisSoftware Vlamis Software Solutions Vlamis Software founded in 1992 in Kansas City, Missouri
More informationHow App Ratings and Reviews Impact Rank on Google Play and the App Store
APP STORE OPTIMIZATION MASTERCLASS How App Ratings and Reviews Impact Rank on Google Play and the App Store BIG APPS GET BIG RATINGS 13,927 AVERAGE NUMBER OF RATINGS FOR TOP-RATED IOS APPS 196,833 AVERAGE
More informationELEC6910Q Analytics and Systems for Social Media and Big Data Applications Lecture 4. Prof. James She
ELEC6910Q Analytics and Systems for Social Media and Big Data Applications Lecture 4 Prof. James She james.she@ust.hk 1 Selected Works of Activity 4 2 Selected Works of Activity 4 3 Last lecture 4 Mid-term
More informationWhy is it Difficult to Find a Good Free Web Host
From the SelectedWorks of Umakant Mishra February, 2012 Why is it Difficult to Find a Good Free Web Host Umakant Mishra Available at: https://works.bepress.com/umakant_mishra/102/ Why is it difficult to
More informationCreating a Recommender System. An Elasticsearch & Apache Spark approach
Creating a Recommender System An Elasticsearch & Apache Spark approach My Profile SKILLS Álvaro Santos Andrés Big Data & Analytics Solution Architect in Ericsson with more than 12 years of experience focused
More informationSQL SERVER INTERVIEW QUESTIONS AND ANSWERS FOR ALL DATABASE DEVELOPERS AND DEVELOPERS ADMINISTRATORS
SQL SERVER INTERVIEW QUESTIONS AND ANSWERS FOR ALL DATABASE DEVELOPERS AND DEVELOPERS ADMINISTRATORS page 1 / 5 page 2 / 5 sql server interview questions pdf SQL Server - 204 SQL Server interview questions
More informationThe OLX data theory of everything
The OLX data theory of everything Caspar Schönau Head of Global BI Jakub Orłowski Data engineering manager The biggest internet company that you have never heard of Founded 1915 South-Africa Market cap:
More informationOverview and Practical Application of Machine Learning in Pricing
Overview and Practical Application of Machine Learning in Pricing 2017 CAS Spring Meeting May 23, 2017 Duncan Anderson and Claudine Modlin (Willis Towers Watson) Mark Richards (Allstate Insurance Company)
More informationData Analytics Training Program
Data Analytics Training Program In exclusive association with 1200+ Trainings 20,000+ Participants 10,000+ Brands 45+ Countries [Since 2009] Training partner for Who Is This Course For? Programers Willing
More informationData Warehouse Tutorial For Beginners Sql Server 2008 Book
Data Warehouse Tutorial For Beginners Sql Server 2008 Book You've read some of the content of well-known Data Warehousing books now what? How do. Implementing a Data Warehouse with Microsoft SQL Server.
More informationWhat's New in MATLAB for Engineering Data Analytics?
What's New in MATLAB for Engineering Data Analytics? Will Wilson Application Engineer MathWorks, Inc. 2017 The MathWorks, Inc. 1 Agenda Data Types Tall Arrays for Big Data Machine Learning (for Everyone)
More informationClickbank Domination Presents. A case study by Devin Zander. A look into how absolutely easy internet marketing is. Money Mindset Page 1
Presents A case study by Devin Zander A look into how absolutely easy internet marketing is. Money Mindset Page 1 Hey guys! Quick into I m Devin Zander and today I ve got something everybody loves! Me
More informationCSE 446 Bias-Variance & Naïve Bayes
CSE 446 Bias-Variance & Naïve Bayes Administrative Homework 1 due next week on Friday Good to finish early Homework 2 is out on Monday Check the course calendar Start early (midterm is right before Homework
More informationDATA SCIENCE INTRODUCTION QSHORE TECHNOLOGIES. About the Course:
DATA SCIENCE About the Course: In this course you will get an introduction to the main tools and ideas which are required for Data Scientist/Business Analyst/Data Analyst/Analytics Manager/Actuarial Scientist/Business
More informationInstructor: Dr. Mehmet Aktaş. Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University
Instructor: Dr. Mehmet Aktaş Mining of Massive Datasets Jure Leskovec, Anand Rajaraman, Jeff Ullman Stanford University J. Leskovec, A. Rajaraman, J. Ullman: Mining of Massive Datasets, http://www.mmds.org
Play with Python: An intro to Data Science
Play with Python: An intro to Data Science Ignacio Larrú Instituto de Empresa Who am I? Passionate about Technology From iPhone apps to algorithmic programming I love innovative technology Former Entrepreneur:
Intro to Stata for Political Scientists
Intro to Stata for Political Scientists Andrew S. Rosenberg Junior PRISM Fellow Department of Political Science Workshop Description This is an Introduction to Stata. I will assume little/no prior knowledge
BDD and Testing. User requirements and testing are tightly coupled
BDD and Testing User requirements and testing are tightly coupled 1 New Concept: Acceptance Tests Customer criteria for accepting a milestone Get paid if pass! Black-box tests specified with the customer
Getting Started with Advanced Analytics in Finance, Marketing, and Operations
Getting Started with Advanced Analytics in Finance, Marketing, and Operations Southwest Regional Oracle Applications User Group Dan Vlamis February 24, 2017 @VlamisSoftware Vlamis Software Solutions Vlamis
Machine Learning in Action
Machine Learning in Action PETER HARRINGTON MANNING Shelter Island brief contents PART 1 Classification 1 Machine learning basics 2 Classifying with k-nearest Neighbors 3 Splitting
Introducing Microsoft SQL Server 2016 R Services. Julian Lee Advanced Analytics Lead Global Black Belt Asia Timezone
Introducing Microsoft SQL Server 2016 R Services Julian Lee Advanced Analytics Lead Global Black Belt Asia Timezone SQL Server 2016: Everything built-in $2,230
Creating publication-ready Word tables in R
Creating publication-ready Word tables in R Sara Weston and Debbie Yee 12/09/2016 Has this happened to you? You're working on a draft of a manuscript with your adviser, and one of her edits is something
Data Mining - Data. Dr. Jean-Michel RICHER
Data Mining - Data Dr. Jean-Michel RICHER 2018 jean-michel.richer@univ-angers.fr Outline 1. Introduction 2. Data preprocessing 3. CPA with R 4. Exercise
Introduction to Data Management. Lecture #1 (Course Trailer)
Introduction to Data Management Lecture #1 (Course Trailer) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke Today's Topics: Welcome to one
Data and AI LATAM 2018
Overview of Big Data
Overview of Big Data Tools and Techniques, Discoveries and Pitfalls Spring 2018 What Does Big Data Mean? (1) Collecting large amounts of data Via computers, sensors, people, events (2) Doing something
Prototyping Data Intensive Apps: TrendingTopics.org
Prototyping Data Intensive Apps: TrendingTopics.org Pete Skomoroch Research Scientist at LinkedIn Consultant at Data Wrangling @peteskomoroch 09/29/09 1 Talk Outline TrendingTopics Overview Wikipedia Page
Data Mining Concepts & Tasks
Data Mining Concepts & Tasks Duen Horng (Polo) Chau Georgia Tech CSE6242 / CX4242 Jan 16, 2014 Partly based on materials by Professors Guy Lebanon, Jeffrey Heer, John Stasko, Christos Faloutsos Last Time
Welcome! Power BI User Group (PUG) Copenhagen
Welcome! Power BI User Group (PUG) Copenhagen Connect to Data in Power BI Desktop Just Thorning Blindbæk Consultant, Trainer and Speaker Connect to Data in Power BI Desktop Basic introduction to data connectivity
Approaching the Petabyte Analytic Database: What I learned
Disclaimer This document is for informational purposes only and is subject to change at any time without notice. The information in this document is proprietary to Actian and no part of this document may
The Mathematics Behind Neural Networks
The Mathematics Behind Neural Networks Pattern Recognition and Machine Learning by Christopher M. Bishop Student: Shivam Agrawal Mentor: Nathaniel Monson Courtesy of xkcd.com The Black Box Training the
Modern Data Warehouse The New Approach to Azure BI
Modern Data Warehouse The New Approach to Azure BI History On-Premise SQL Server Big Data Solutions Technical Barriers Modern Analytics Platform On-Premise SQL Server Big Data Solutions Modern Analytics