ENTERPRISE MINER: 1 DATA EXPLORATION AND VISUALISATION

Size: px
Start display at page:

Download "ENTERPRISE MINER: 1 DATA EXPLORATION AND VISUALISATION"

Transcription

1 ENTERPRISE MINER: 1 DATA EXPLORATION AND VISUALISATION JOZEF MOFFAT, ANALYTICS & INNOVATION PRACTICE, SAS UK 10, MAY 2016

2 DATA EXPLORATION AND VISUALISATION AGENDA SAS Webinar 10th May 2016 at 10:00 AM BST Enterprise Miner: Data Exploration and Visualisation The session looks at: - Data Visualisation and Sampling - Variable Selection - Missing Value Imputation - Outlier Detection

3 ROLE OF SAS ENTERPRISE MINER

4 THE ANALYTICS LIFECYCLE PREDICTIVE ANALYTICS AND DATA MINING Domain Expertise Decision Making Process and ROI Evaluation EVALUATE / MONITOR RESULTS IDENTIFY / FORMULATE PROBLEM DATA PREPARATION Data Exploration Data Visualization Report Creation DEPLOY MODEL DATA EXPLORATION Model Validation Model Deployment Model Monitoring Data Preparation VALIDATE MODEL BUILD MODEL TRANSFORM & SELECT Exploratory Analysis Descriptive Segmentation Predictive Modeling

5 THE ANALYTICS LIFECYCLE PREDICTIVE ANALYTICS AND DATA MINING Domain Expertise Decision Making Process and ROI Evaluation EVALUATE / MONITOR RESULTS IDENTIFY / FORMULATE PROBLEM DATA PREPARATION Data Exploration Data Visualization Report Creation DEPLOY MODEL DATA EXPLORATION Model Validation Model Deployment Model Monitoring Data Preparation VALIDATE MODEL BUILD MODEL TRANSFORM & SELECT Exploratory Analysis Descriptive Segmentation Predictive Modeling

6 SAS ENTERPRISE MINER SAS ENTERPRISE MINER Modern, collaborative, easy-to-use data mining workbench Sophisticated set of data preparation and exploration tools Modern suite of modeling techniques and methods Interactive model comparison, testing and validation Automated scoring process delivers faster results Open, extensible design for ultimate flexibility

7 SAS ENTERPRISE MINER SAS ENTERPRISE MINER MODEL DEVELOPMENT PROCESS Sample Explore Modify Model Assess

8 SAS ENTERPRISE MINER Utility SAS ENTERPRISE MINER MODEL DEVELOPMENT PROCESS Apps. Time Series HPDM Credit Scoring

9 SAS ENTERPRISE MINER SAS ENTERPRISE MINER SEMMA IN ACTION REPEATABLE PROCESS

10 DATA VISUALISATION AND SAMPLING

11 DATA VISUALISATION AND SAMPLING VISUALISATION Quickly find related patterns within a set of data via interactive pictures.

12 DATA VISUALISATION AND SAMPLING SAS ENTERPRISE MINER SEMMA PROCESS Sample Explore Modify Model Assess

13 DATA VISUALISATION AND SAMPLING SAMPLE AND EXPLORE Data selection Required & excluded fields Sample balancing Data partitioning Data evaluation Statistical measures Visualization Identifying outliers Analytical segmentation Variable creation & selection

14 DATA VISUALISATION AND SAMPLING SAMPLING Sampling Sample Node: Stratified / Simple Random Sampling Used for over/under sampling input data Data Partition Node: Random sampling into Training, Validation and Test sets Prevent model over fitting Filter Node: Select time period of interest Filter based on pre-defined flag

15 DATA VISUALISATION AND SAMPLING SAMPLING Segmentation Cluster Node Unsupervised, k-means clustering algorithm Data driven Output tree based descriptions

16 DEMO

17 VARIABLE SELECTION

18 VARIABLE SELECTION VARIABLE SELECTION ROUTINES Variable Selection Variable Selection Node Relationship of independent variables to dependent target R-Square and Chi-square selection criteria Variable Clustering Node Identify correlations and covariance's between input variables Select Best variable from cluster or Cluster Component Interactive Grouping Node Computes Weights of Evidence GINI and Information Values for variable selection

19 DEMO

20 MISSING VALUE IMPUTATION

21 MISSING VALUE IMPUTATION IMPUTE Missing Value Imputation Impute Node Complete case required for models such as Regression Multiple imputation techniques e.g. Tree, Distribution, Mean, Mode Class (categorical) variables Input/Target Count Default Constant Value Distribution Tree (only for inputs) Tree Surrogate (only for inputs) Interval (numeric) variables Input/Target Mean Median Midrange Distribution Tree (only for inputs) Tree Surrogate (only for inputs) Mid-Minimum Spacing Tukey s Biweight Huber Andrew s Wave Default Constant

22 DEMO

23 OUTLIER DETECTION

24 OUTLIER DETECTION DETECT AND TREAT Outlier Detection Filter Node Automated and Interactive filtering Identify and exclude extreme outliers Replacement Node Generates score code to process unknown levels when scoring Interactively specify replacement values for class and interval levels

25 DEMO

26 SUMMARY

27 DATA EXPLORATION AND VISUALISATION SUMMARY Comprehensive data mining toolset Variety of visualisation and sampling methodologies Number of approaches to data and dimension reduction Importance of enhancing data prior to model development Garbage in = Garbage out (GIGO)

28 QUESTIONS AND ANSWERS

29

ADVANCED ANALYTICS USING SAS ENTERPRISE MINER RENS FEENSTRA

ADVANCED ANALYTICS USING SAS ENTERPRISE MINER RENS FEENSTRA INSIGHTS@SAS: ADVANCED ANALYTICS USING SAS ENTERPRISE MINER RENS FEENSTRA AGENDA 09.00 09.15 Intro 09.15 10.30 Analytics using SAS Enterprise Guide Ellen Lokollo 10.45 12.00 Advanced Analytics using SAS

More information

SAS Enterprise Miner 7.1

SAS Enterprise Miner 7.1 SAS Enterprise Miner 7.1 Data Mining using SAS IASRI Satyajit Dwivedi Transforming the World DATA MINING SEMMA Process Sample Explore Modify Model Assess Utility 2 SEMMA Process - Creating Library Select

More information

Data Mining Overview. CHAPTER 1 Introduction to SAS Enterprise Miner Software

Data Mining Overview. CHAPTER 1 Introduction to SAS Enterprise Miner Software 1 CHAPTER 1 Introduction to SAS Enterprise Miner Software Data Mining Overview 1 Layout of the SAS Enterprise Miner Window 2 Using the Application Main Menus 3 Using the Toolbox 8 Using the Pop-Up Menus

More information

Getting Started with. SAS Enterprise Miner 5.3

Getting Started with. SAS Enterprise Miner 5.3 Getting Started with SAS Enterprise Miner 5.3 The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2008. Getting Started with SAS Enterprise Miner TM 5.3. Cary, NC: SAS

More information

SAS E-MINER: AN OVERVIEW

SAS E-MINER: AN OVERVIEW SAS E-MINER: AN OVERVIEW Samir Farooqi, R.S. Tomar and R.K. Saini I.A.S.R.I., Library Avenue, Pusa, New Delhi 110 012 Samir@iasri.res.in; tomar@iasri.res.in; saini@iasri.res.in Introduction SAS Enterprise

More information

Enterprise Miner Software: Changes and Enhancements, Release 4.1

Enterprise Miner Software: Changes and Enhancements, Release 4.1 Enterprise Miner Software: Changes and Enhancements, Release 4.1 The correct bibliographic citation for this manual is as follows: SAS Institute Inc., Enterprise Miner TM Software: Changes and Enhancements,

More information

Enterprise Miner Tutorial Notes 2 1

Enterprise Miner Tutorial Notes 2 1 Enterprise Miner Tutorial Notes 2 1 ECT7110 E-Commerce Data Mining Techniques Tutorial 2 How to Join Table in Enterprise Miner e.g. we need to join the following two tables: Join1 Join 2 ID Name Gender

More information

Outrun Your Competition With SAS In-Memory Analytics Sascha Schubert Global Technology Practice, SAS

Outrun Your Competition With SAS In-Memory Analytics Sascha Schubert Global Technology Practice, SAS Outrun Your Competition With SAS In-Memory Analytics Sascha Schubert Global Technology Practice, SAS Topics AGENDA Challenges with Big Data Analytics How SAS can help you to minimize time to value with

More information

The Consequences of Poor Data Quality on Model Accuracy

The Consequences of Poor Data Quality on Model Accuracy The Consequences of Poor Data Quality on Model Accuracy Dr. Gerhard Svolba SAS Austria Cologne, June 14th, 2012 From this talk you can expect The analytical viewpoint on data quality Answers to the questions

More information

SAS Enterprise Miner TM 6.1

SAS Enterprise Miner TM 6.1 Getting Started with SAS Enterprise Miner TM 6.1 SAS Documentation The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2009. Getting Started with SAS Enterprise Miner TM

More information

Data Mining Using SAS Enterprise Miner : A Case Study Approach, Fourth Edition

Data Mining Using SAS Enterprise Miner : A Case Study Approach, Fourth Edition Data Mining Using SAS Enterprise Miner : A Case Study Approach, Fourth Edition SAS Documentation March 26, 2018 The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2018.

More information

Introducing Microsoft SQL Server 2016 R Services. Julian Lee Advanced Analytics Lead Global Black Belt Asia Timezone

Introducing Microsoft SQL Server 2016 R Services. Julian Lee Advanced Analytics Lead Global Black Belt Asia Timezone Introducing Microsoft SQL Server 2016 R Services Julian Lee Advanced Analytics Lead Global Black Belt Asia Timezone SQL Server 2016: Everything built-in built-in built-in built-in built-in built-in $2,230

More information

SAS Enterprise Miner : What does the future hold?

SAS Enterprise Miner : What does the future hold? SAS Enterprise Miner : What does the future hold? David Duling EM Development Director SAS Inc. Sascha Schubert Product Manager Data Mining SAS International Topics for Discussion: EM 4.2/SAS 9.0 AF/SCL

More information

What does SAS Enterprise Miner do? For whom is SAS Enterprise Miner designed?

What does SAS Enterprise Miner do? For whom is SAS Enterprise Miner designed? FACT SHEET SAS Enterprise Miner Create highly accurate analytical models that enable you to predict with confidence What does SAS Enterprise Miner do? It streamlines the data mining process so you can

More information

GETTING STARTED WITH DATA MINING

GETTING STARTED WITH DATA MINING GETTING STARTED WITH DATA MINING Nora Galambos, PhD Senior Data Scientist Office of Institutional Research, Planning & Effectiveness Stony Brook University AIR Forum 2017 Washington, D.C. 1 Using Data

More information

3. Data Preprocessing. 3.1 Introduction

3. Data Preprocessing. 3.1 Introduction 3. Data Preprocessing Contents of this Chapter 3.1 Introduction 3.2 Data cleaning 3.3 Data integration 3.4 Data transformation 3.5 Data reduction SFU, CMPT 740, 03-3, Martin Ester 84 3.1 Introduction Motivation

More information

2. Data Preprocessing

2. Data Preprocessing 2. Data Preprocessing Contents of this Chapter 2.1 Introduction 2.2 Data cleaning 2.3 Data integration 2.4 Data transformation 2.5 Data reduction Reference: [Han and Kamber 2006, Chapter 2] SFU, CMPT 459

More information

SAS High-Performance Analytics Products

SAS High-Performance Analytics Products Fact Sheet What do SAS High-Performance Analytics products do? With high-performance analytics products from SAS, you can develop and process models that use huge amounts of diverse data. These products

More information

Getting Started with SAS Enterprise Miner 12.1

Getting Started with SAS Enterprise Miner 12.1 Getting Started with SAS Enterprise Miner 12.1 SAS Documentation The correct bibliographic citation for this manual is as follows: SAS Institute Inc 2012. Getting Started with SAS Enterprise Miner 12.1.

More information

Getting Started with SAS Enterprise Miner 14.2

Getting Started with SAS Enterprise Miner 14.2 Getting Started with SAS Enterprise Miner 14.2 SAS Documentation The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2016. Getting Started with SAS Enterprise Miner 14.2.

More information

SAS Enterprise Miner : Tutorials and Examples

SAS Enterprise Miner : Tutorials and Examples SAS Enterprise Miner : Tutorials and Examples SAS Documentation February 13, 2018 The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2017. SAS Enterprise Miner : Tutorials

More information

Introduction to Data Science. Introduction to Data Science with Python. Python Basics: Basic Syntax, Data Structures. Python Concepts (Core)

Introduction to Data Science. Introduction to Data Science with Python. Python Basics: Basic Syntax, Data Structures. Python Concepts (Core) Introduction to Data Science What is Analytics and Data Science? Overview of Data Science and Analytics Why Analytics is is becoming popular now? Application of Analytics in business Analytics Vs Data

More information

Automation, DevOps, and the Demands of a Multicloud World in the Telecommunications Industry

Automation, DevOps, and the Demands of a Multicloud World in the Telecommunications Industry Automation, DevOps, and the Demands of a Multicloud World in the Telecommunications Industry An IDC InfoBrief, Sponsored by Red Hat March 2018 Sponsored by Red Hat Page 1 Methodology In September, 2017

More information

Doing the Data Science Dance

Doing the Data Science Dance Doing the Data Science Dance Dean Abbott Abbott Analytics, SmarterHQ KNIME Fall Summit 2018 Email: dean@abbottanalytics.com Twitter: @deanabb 1 Data Science vs. Other Labels 2 Google Trends 3 Abbott Analytics,

More information

K236: Basis of Data Science

K236: Basis of Data Science Schedule of K236 K236: Basis of Data Science Lecture 6: Data Preprocessing Lecturer: Tu Bao Ho and Hieu Chi Dam TA: Moharasan Gandhimathi and Nuttapong Sanglerdsinlapachai 1. Introduction to data science

More information

SAS Enterprise Miner Performance on IBM System p 570. Jan, Hsian-Fen Tsao Brian Porter Harry Seifert. IBM Corporation

SAS Enterprise Miner Performance on IBM System p 570. Jan, Hsian-Fen Tsao Brian Porter Harry Seifert. IBM Corporation SAS Enterprise Miner Performance on IBM System p 570 Jan, 2008 Hsian-Fen Tsao Brian Porter Harry Seifert IBM Corporation Copyright IBM Corporation, 2008. All Rights Reserved. TABLE OF CONTENTS ABSTRACT...3

More information

Predictive Analytics: Demystifying Current and Emerging Methodologies. Tom Kolde, FCAS, MAAA Linda Brobeck, FCAS, MAAA

Predictive Analytics: Demystifying Current and Emerging Methodologies. Tom Kolde, FCAS, MAAA Linda Brobeck, FCAS, MAAA Predictive Analytics: Demystifying Current and Emerging Methodologies Tom Kolde, FCAS, MAAA Linda Brobeck, FCAS, MAAA May 18, 2017 About the Presenters Tom Kolde, FCAS, MAAA Consulting Actuary Chicago,

More information

An Interactive GUI Front-End for a Credit Scoring Modeling System by Jeffrey Morrison, Futian Shi, and Timothy Lee

An Interactive GUI Front-End for a Credit Scoring Modeling System by Jeffrey Morrison, Futian Shi, and Timothy Lee An Interactive GUI Front-End for a Credit Scoring Modeling System by Jeffrey Morrison, Futian Shi, and Timothy Lee Abstract The need for statistical modeling has been on the rise in recent years. Banks,

More information

Event: PASS SQL Saturday - DC 2018 Presenter: Jon Tupitza, CTO Architect

Event: PASS SQL Saturday - DC 2018 Presenter: Jon Tupitza, CTO Architect Event: PASS SQL Saturday - DC 2018 Presenter: Jon Tupitza, CTO Architect BEOP.CTO.TP4 Owner: OCTO Revision: 0001 Approved by: JAT Effective: 08/30/2018 Buchanan & Edwards Proprietary: Printed copies of

More information

A Comparative Study of Data Mining Process Models (KDD, CRISP-DM and SEMMA)

A Comparative Study of Data Mining Process Models (KDD, CRISP-DM and SEMMA) International Journal of Innovation and Scientific Research ISSN 2351-8014 Vol. 12 No. 1 Nov. 2014, pp. 217-222 2014 Innovative Space of Scientific Research Journals http://www.ijisr.issr-journals.org/

More information

Overview. Data Mining for Business Intelligence. Shmueli, Patel & Bruce

Overview. Data Mining for Business Intelligence. Shmueli, Patel & Bruce Overview Data Mining for Business Intelligence Shmueli, Patel & Bruce Galit Shmueli and Peter Bruce 2010 Core Ideas in Data Mining Classification Prediction Association Rules Data Reduction Data Exploration

More information

Types of Data Mining

Types of Data Mining Data Mining and The Use of SAS to Deploy Scoring Rules South Central SAS Users Group Conference Neil Fleming, Ph.D., ASQ CQE November 7-9, 2004 2W Systems Co., Inc. Neil.Fleming@2WSystems.com 972 733-0588

More information

Practical Guidance for Machine Learning Applications

Practical Guidance for Machine Learning Applications Practical Guidance for Machine Learning Applications Brett Wujek About the authors Material from SGF Paper SAS2360-2016 Brett Wujek Senior Data Scientist, Advanced Analytics R&D ~20 years developing engineering

More information

DevOps Agility Demands Advanced Management and Automation

DevOps Agility Demands Advanced Management and Automation DevOps Agility Demands Advanced Management and Automation An IDC InfoBrief, Sponsored by Red Hat December 2017 Sponsored by Red Hat Page 1 Methodology In September, 2017 IDC conducted a global study to

More information

An Interactive GUI Front-End for a Credit Scoring Modeling System

An Interactive GUI Front-End for a Credit Scoring Modeling System Paper 6 An Interactive GUI Front-End for a Credit Scoring Modeling System Jeffrey Morrison, Futian Shi, and Timothy Lee Knowledge Sciences & Analytics, Equifax Credit Information Services, Inc. Abstract

More information

DATA SCIENCE INTRODUCTION QSHORE TECHNOLOGIES. About the Course:

DATA SCIENCE INTRODUCTION QSHORE TECHNOLOGIES. About the Course: DATA SCIENCE About the Course: In this course you will get an introduction to the main tools and ideas which are required for Data Scientist/Business Analyst/Data Analyst/Analytics Manager/Actuarial Scientist/Business

More information

Implementing SVM in an RDBMS: Improved Scalability and Usability. Joseph Yarmus, Boriana Milenova, Marcos M. Campos Data Mining Technologies Oracle

Implementing SVM in an RDBMS: Improved Scalability and Usability. Joseph Yarmus, Boriana Milenova, Marcos M. Campos Data Mining Technologies Oracle Implementing SVM in an RDBMS: Improved Scalability and Usability Joseph Yarmus, Boriana Milenova, Marcos M. Campos Data Mining Technologies Oracle Overview Oracle RDBMS resources leveraged by data mining

More information

Performance and Load Testing R12 With Oracle Applications Test Suite

Performance and Load Testing R12 With Oracle Applications Test Suite Performance and Load Testing R12 With Oracle Applications Test Suite Deep Ram Technical Director Oracle Corporation Daniel Gonzalez Practice Manager Oracle Corporation Safe Harbor

More information

Data Mining. Ryan Benton Center for Advanced Computer Studies University of Louisiana at Lafayette Lafayette, La., USA.

Data Mining. Ryan Benton Center for Advanced Computer Studies University of Louisiana at Lafayette Lafayette, La., USA. Data Mining Ryan Benton Center for Advanced Computer Studies University of Louisiana at Lafayette Lafayette, La., USA January 13, 2011 Important Note! This presentation was obtained from Dr. Vijay Raghavan

More information

Scoring with Analytic Stores

Scoring with Analytic Stores Scoring with Analytic Stores Merve Yasemin Tekbudak, SAS Institute Inc., Cary, NC In supervised learning, scoring is the process of applying a previously built predictive model to a new data set in order

More information

Enterprise Miner Version 4.0. Changes and Enhancements

Enterprise Miner Version 4.0. Changes and Enhancements Enterprise Miner Version 4.0 Changes and Enhancements Table of Contents General Information.................................................................. 1 Upgrading Previous Version Enterprise Miner

More information

Now, Data Mining Is Within Your Reach

Now, Data Mining Is Within Your Reach Clementine Desktop Specifications Now, Data Mining Is Within Your Reach Data mining delivers significant, measurable value. By uncovering previously unknown patterns and connections in data, data mining

More information

The Value of Migrating from Cisco Tidal Horizon to Cisco Process Orchestrator

The Value of Migrating from Cisco Tidal Horizon to Cisco Process Orchestrator White Paper The Value of Migrating from Cisco Tidal Horizon to Cisco Process Orchestrator Migrating from Cisco Tidal Horizon for SAP to Cisco Process Orchestrator can help you reduce total cost of ownership

More information

Data Mining: Data. Lecture Notes for Chapter 2. Introduction to Data Mining

Data Mining: Data. Lecture Notes for Chapter 2. Introduction to Data Mining Data Mining: Data Lecture Notes for Chapter 2 Introduction to Data Mining by Tan, Steinbach, Kumar Data Preprocessing Aggregation Sampling Dimensionality Reduction Feature subset selection Feature creation

More information

Predict Outcomes and Reveal Relationships in Categorical Data

Predict Outcomes and Reveal Relationships in Categorical Data PASW Categories 18 Specifications Predict Outcomes and Reveal Relationships in Categorical Data Unleash the full potential of your data through predictive analysis, statistical learning, perceptual mapping,

More information

Expensive to drill wells when all goes to plan. Drilling problems add costs throughout the well s life.

Expensive to drill wells when all goes to plan. Drilling problems add costs throughout the well s life. Beau Rollins Map View Cross Section View Expensive to drill wells when all goes to plan. Surface Location Drilling problems add costs throughout the well s life. The best scenario for minimizing costs

More information

Machine Learning Feature Creation and Selection

Machine Learning Feature Creation and Selection Machine Learning Feature Creation and Selection Jeff Howbert Introduction to Machine Learning Winter 2012 1 Feature creation Well-conceived new features can sometimes capture the important information

More information

Data Mining. Lab Exercises

Data Mining. Lab Exercises Data Mining Lab Exercises Predictive Modelling Purpose The purpose of this study is to learn how data mining methods / tools (SAS System/ SAS Enterprise Miner) can be used to solve predictive modeling

More information

Automated Testing of Tableau Dashboards

Automated Testing of Tableau Dashboards Kinesis Technical Whitepapers April 2018 Kinesis CI Automated Testing of Tableau Dashboards Abstract Companies make business critical decisions every day, based on data from their business intelligence

More information

BIG DATA SCIENTIST Certification. Big Data Scientist

BIG DATA SCIENTIST Certification. Big Data Scientist BIG DATA SCIENTIST Certification Big Data Scientist Big Data Science Professional (BDSCP) certifications are formal accreditations that prove proficiency in specific areas of Big Data. To obtain a certification,

More information

SAS (Statistical Analysis Software/System)

SAS (Statistical Analysis Software/System) SAS (Statistical Analysis Software/System) SAS Adv. Analytics or Predictive Modelling:- Class Room: Training Fee & Duration : 30K & 3 Months Online Training Fee & Duration : 33K & 3 Months Learning SAS:

More information

Analytical model A structure and process for analyzing a dataset. For example, a decision tree is a model for the classification of a dataset.

Analytical model A structure and process for analyzing a dataset. For example, a decision tree is a model for the classification of a dataset. Glossary of data mining terms: Accuracy Accuracy is an important factor in assessing the success of data mining. When applied to data, accuracy refers to the rate of correct values in the data. When applied

More information

RiskSense Attack Surface Validation for IoT Systems

RiskSense Attack Surface Validation for IoT Systems RiskSense Attack Surface Validation for IoT Systems 2018 RiskSense, Inc. Surfacing Double Exposure Risks Changing Times and Assessment Focus Our view of security assessments has changed. There is diminishing

More information

Build a system health check for Db2 using IBM Machine Learning for z/os

Build a system health check for Db2 using IBM Machine Learning for z/os Build a system health check for Db2 using IBM Machine Learning for z/os Jonathan Sloan Senior Analytics Architect, IBM Analytics Agenda A brief machine learning overview The Db2 ITOA model solutions template

More information

Co-creation for Success

Co-creation for Success SAP SAPPHIRE NOW 2018 Orlando, June 5-7, 2018 Human Centric Innovation Co-creation for Success 0 2018 FUJITSU Fujitsu Hybrid IT Conduit for Digital Transformation Orlando, June 5-7, 2018 Human Centric

More information

CS 229 Final Project - Using machine learning to enhance a collaborative filtering recommendation system for Yelp

CS 229 Final Project - Using machine learning to enhance a collaborative filtering recommendation system for Yelp CS 229 Final Project - Using machine learning to enhance a collaborative filtering recommendation system for Yelp Chris Guthrie Abstract In this paper I present my investigation of machine learning as

More information

Knowledge Discovery. URL - Spring 2018 CS - MIA 1/22

Knowledge Discovery. URL - Spring 2018 CS - MIA 1/22 Knowledge Discovery Javier Béjar cbea URL - Spring 2018 CS - MIA 1/22 Knowledge Discovery (KDD) Knowledge Discovery in Databases (KDD) Practical application of the methodologies from machine learning/statistics

More information

Automatic Detection of Section Membership for SAS Conference Paper Abstract Submissions: A Case Study

Automatic Detection of Section Membership for SAS Conference Paper Abstract Submissions: A Case Study 1746-2014 Automatic Detection of Section Membership for SAS Conference Paper Abstract Submissions: A Case Study Dr. Goutam Chakraborty, Professor, Department of Marketing, Spears School of Business, Oklahoma

More information

Six Sigma in the datacenter drives a zero-defects culture

Six Sigma in the datacenter drives a zero-defects culture Six Sigma in the datacenter drives a zero-defects culture Situation Like many IT organizations, Microsoft IT wants to keep its global infrastructure available at all times. Scope, scale, and an environment

More information

Define the data source for the Organic Purchase Analysis project.

Define the data source for the Organic Purchase Analysis project. Introduction to Predictive Modeling Define the data source for the Organic Purchase Analysis project. Your Task Define AAEM.ORGANICS as a data source in the elearnem project and adjust the column metadata

More information

CHAPTER 11 EXAMPLES: MISSING DATA MODELING AND BAYESIAN ANALYSIS

CHAPTER 11 EXAMPLES: MISSING DATA MODELING AND BAYESIAN ANALYSIS Examples: Missing Data Modeling And Bayesian Analysis CHAPTER 11 EXAMPLES: MISSING DATA MODELING AND BAYESIAN ANALYSIS Mplus provides estimation of models with missing data using both frequentist and Bayesian

More information

Moving Databases to Oracle Cloud: Performance Best Practices

Moving Databases to Oracle Cloud: Performance Best Practices Moving Databases to Oracle Cloud: Performance Best Practices Kurt Engeleiter Product Manager Oracle Safe Harbor Statement The following is intended to outline our general product direction. It is intended

More information

APPLICATION PROGRAMMING: DATA MINING AND DATA WAREHOUSING PRACTICAL GUIDE Data Mining and Data Warehousing

APPLICATION PROGRAMMING: DATA MINING AND DATA WAREHOUSING PRACTICAL GUIDE Data Mining and Data Warehousing ROZWÓJ POTENCJAŁU I OFERTY DYDAKTYCZNEJ POLITECHNIKI WROCŁAWSKIEJ Wrocław University of Technology Internet Engineering Henryk Maciejewski APPLICATION PROGRAMMING: DATA MINING AND DATA WAREHOUSING PRACTICAL

More information

Paper SAS Taming the Rule. Charlotte Crain, Chris Upton, SAS Institute Inc.

Paper SAS Taming the Rule. Charlotte Crain, Chris Upton, SAS Institute Inc. ABSTRACT Paper SAS2620-2016 Taming the Rule Charlotte Crain, Chris Upton, SAS Institute Inc. When business rules are deployed and executed--whether a rule is fired or not if the rule-fire outcomes are

More information

1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda

1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda Agenda Oracle9i Warehouse Review Dulcian, Inc. Oracle9i Server OLAP Server Analytical SQL Mining ETL Infrastructure 9i Warehouse Builder Oracle 9i Server Overview E-Business Intelligence Platform 9i Server:

More information

APPLICATION OF MULTIPLE RANDOM CENTROID (MRC) BASED K-MEANS CLUSTERING ALGORITHM IN INSURANCE A REVIEW ARTICLE

APPLICATION OF MULTIPLE RANDOM CENTROID (MRC) BASED K-MEANS CLUSTERING ALGORITHM IN INSURANCE A REVIEW ARTICLE APPLICATION OF MULTIPLE RANDOM CENTROID (MRC) BASED K-MEANS CLUSTERING ALGORITHM IN INSURANCE A REVIEW ARTICLE Sundari NallamReddy, Samarandra Behera, Sanjeev Karadagi, Dr. Anantha Desik ABSTRACT: Tata

More information

Week 1 Unit 1: Introduction to Data Science

Week 1 Unit 1: Introduction to Data Science Week 1 Unit 1: Introduction to Data Science The next 6 weeks What to expect in the next 6 weeks? 2 Curriculum flow (weeks 1-3) Business & Data Understanding 1 2 3 Data Preparation Modeling (1) Introduction

More information

Microsoft Office SharePoint Server 2007

Microsoft Office SharePoint Server 2007 Microsoft Office SharePoint Server 2007 Enabled by EMC Celerra Unified Storage and Microsoft Hyper-V Reference Architecture Copyright 2010 EMC Corporation. All rights reserved. Published May, 2010 EMC

More information

The Ladder to AI. Rob Thomas General Manager, IBM

The Ladder to AI. Rob Thomas General Manager, IBM The Ladder to AI Rob Thomas General Manager, IBM Analytics @robdthomas Value Clients Are Advancing Their Use of Data for Business Value Operational BI and Data Warehousing Most are here Self-Service Analytics

More information

Knowledge Discovery. Javier Béjar URL - Spring 2019 CS - MIA

Knowledge Discovery. Javier Béjar URL - Spring 2019 CS - MIA Knowledge Discovery Javier Béjar URL - Spring 2019 CS - MIA Knowledge Discovery (KDD) Knowledge Discovery in Databases (KDD) Practical application of the methodologies from machine learning/statistics

More information

Gain Insight and Improve Performance with Data Mining

Gain Insight and Improve Performance with Data Mining Clementine 11.0 Specifications Gain Insight and Improve Performance with Data Mining Data mining provides organizations with a clearer view of current conditions and deeper insight into future events.

More information

Oracle Big Data Science IOUG Collaborate 16

Oracle Big Data Science IOUG Collaborate 16 Oracle Big Data Science IOUG Collaborate 16 Session 4762 Tim and Dan Vlamis Tuesday, April 12, 2016 Vlamis Software Solutions Vlamis Software founded in 1992 in Kansas City, Missouri Developed 200+ Oracle

More information

Intrusion Detection using NASA HTTP Logs AHMAD ARIDA DA CHEN

Intrusion Detection using NASA HTTP Logs AHMAD ARIDA DA CHEN Intrusion Detection using NASA HTTP Logs AHMAD ARIDA DA CHEN Presentation Overview - Background - Preprocessing - Data Mining Methods to Determine Outliers - Finding Outliers - Outlier Validation -Summary

More information

Data Preprocessing. S1 Teknik Informatika Fakultas Teknologi Informasi Universitas Kristen Maranatha

Data Preprocessing. S1 Teknik Informatika Fakultas Teknologi Informasi Universitas Kristen Maranatha Data Preprocessing S1 Teknik Informatika Fakultas Teknologi Informasi Universitas Kristen Maranatha 1 Why Data Preprocessing? Data in the real world is dirty incomplete: lacking attribute values, lacking

More information

Data Science. Data Analyst. Data Scientist. Data Architect

Data Science. Data Analyst. Data Scientist. Data Architect Data Science Data Analyst Data Analysis in Excel Programming in R Introduction to Python/SQL/Tableau Data Visualization in R / Tableau Exploratory Data Analysis Data Scientist Inferential Statistics &

More information

Visualizing the World

Visualizing the World Visualizing the World An Introduction to Visualization 15.071x The Analytics Edge Why Visualization? The picture-examining eye is the best finder we have of the wholly unanticipated -John Tukey Visualizing

More information

3.6 Sample code: yrbs_data <- read.spss("yrbs07.sav",to.data.frame=true)

3.6 Sample code: yrbs_data <- read.spss(yrbs07.sav,to.data.frame=true) InJanuary2009,CDCproducedareportSoftwareforAnalyisofYRBSdata, describingtheuseofsas,sudaan,stata,spss,andepiinfoforanalyzingdatafrom theyouthriskbehaviorssurvey. ThisreportprovidesthesameinformationforRandthesurveypackage.Thetextof

More information

DIGGING INTO COMPETENCIES AND CREDENTIALING REQUIREMENTS SEPTEMBER 23, 2015 LISA HORWITZ

DIGGING INTO COMPETENCIES AND CREDENTIALING REQUIREMENTS SEPTEMBER 23, 2015 LISA HORWITZ DIGGING INTO COMPETENCIES AND CREDENTIALING REQUIREMENTS SEPTEMBER 23, 2015 LISA HORWITZ WHY ARE COMPETENCY PATHS AND CREDENTIALS IMPORTANT? The more you know about SAS technology, the better you will

More information

Data Mining and Analytics. Introduction

Data Mining and Analytics. Introduction Data Mining and Analytics Introduction Data Mining Data mining refers to extracting or mining knowledge from large amounts of data It is also termed as Knowledge Discovery from Data (KDD) Mostly, data

More information

Sourcefire Network Security Analytics: Finding the Needle in the Haystack

Sourcefire Network Security Analytics: Finding the Needle in the Haystack Sourcefire Network Security Analytics: Finding the Needle in the Haystack Mark Pretty Consulting Systems Engineer #clmel Agenda Introduction The Sourcefire Solution Real-time Analytics On-Demand Analytics

More information

SCHEME OF COURSE WORK. Data Warehousing and Data mining

SCHEME OF COURSE WORK. Data Warehousing and Data mining SCHEME OF COURSE WORK Course Details: Course Title Course Code Program: Specialization: Semester Prerequisites Department of Information Technology Data Warehousing and Data mining : 15CT1132 : B.TECH

More information

Lecture 7: Linear Regression (continued)

Lecture 7: Linear Regression (continued) Lecture 7: Linear Regression (continued) Reading: Chapter 3 STATS 2: Data mining and analysis Jonathan Taylor, 10/8 Slide credits: Sergio Bacallado 1 / 14 Potential issues in linear regression 1. Interactions

More information

Javaentwicklung in der Oracle Cloud

Javaentwicklung in der Oracle Cloud Javaentwicklung in der Oracle Cloud Sören Halter Principal Sales Consultant 2016-11-17 Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information

More information

Statistical Pattern Recognition

Statistical Pattern Recognition Statistical Pattern Recognition Features and Feature Selection Hamid R. Rabiee Jafar Muhammadi Spring 2013 http://ce.sharif.edu/courses/91-92/2/ce725-1/ Agenda Features and Patterns The Curse of Size and

More information

Statistics (STAT) Statistics (STAT) 1. Prerequisites: grade in C- or higher in STAT 1200 or STAT 1300 or STAT 1400

Statistics (STAT) Statistics (STAT) 1. Prerequisites: grade in C- or higher in STAT 1200 or STAT 1300 or STAT 1400 Statistics (STAT) 1 Statistics (STAT) STAT 1200: Introductory Statistical Reasoning Statistical concepts for critically evaluation quantitative information. Descriptive statistics, probability, estimation,

More information

Data-Driven Policing Summit

Data-Driven Policing Summit Reduce Crime and Manage Risk in Policing with Data Analysis Data-Driven Policing Summit Using Data Analytics and Predictive Modeling to Mitigate Risk and Reduce Crime September 18-19, 2017 Washington,

More information

An Introduction to Data Mining in Institutional Research. Dr. Thulasi Kumar Director of Institutional Research University of Northern Iowa

An Introduction to Data Mining in Institutional Research. Dr. Thulasi Kumar Director of Institutional Research University of Northern Iowa An Introduction to Data Mining in Institutional Research Dr. Thulasi Kumar Director of Institutional Research University of Northern Iowa AIR/SPSS Professional Development Series Background Covering variety

More information

Introducing Oracle Machine Learning

Introducing Oracle Machine Learning Introducing Oracle Machine Learning A Collaborative Zeppelin notebook for Oracle s machine learning capabilities Charlie Berger Marcos Arancibia Mark Hornick Advanced Analytics and Machine Learning Copyright

More information

Big Data Analytics The Data Mining process. Roger Bohn March. 2016

Big Data Analytics The Data Mining process. Roger Bohn March. 2016 1 Big Data Analytics The Data Mining process Roger Bohn March. 2016 Office hours HK thursday5 to 6 in the library 3115 If trouble, email or Slack private message. RB Wed. 2 to 3:30 in my office Some material

More information

Note: In the presentation I should have said "baby registry" instead of "bridal registry," see

Note: In the presentation I should have said baby registry instead of bridal registry, see Q-and-A from the Data-Mining Webinar Note: In the presentation I should have said "baby registry" instead of "bridal registry," see http://www.target.com/babyregistryportalview Q: You mentioned the 'Big

More information

R07. FirstRanker. 7. a) What is text mining? Describe about basic measures for text retrieval. b) Briefly describe document cluster analysis.

R07. FirstRanker. 7. a) What is text mining? Describe about basic measures for text retrieval. b) Briefly describe document cluster analysis. www..com www..com Set No.1 1. a) What is data mining? Briefly explain the Knowledge discovery process. b) Explain the three-tier data warehouse architecture. 2. a) With an example, describe any two schema

More information

Features: representation, normalization, selection. Chapter e-9

Features: representation, normalization, selection. Chapter e-9 Features: representation, normalization, selection Chapter e-9 1 Features Distinguish between instances (e.g. an image that you need to classify), and the features you create for an instance. Features

More information

Knowledge Discovery and Data Mining

Knowledge Discovery and Data Mining Knowledge Discovery and Data Mining Lecture 10 - Classification trees Tom Kelsey School of Computer Science University of St Andrews http://tom.home.cs.st-andrews.ac.uk twk@st-andrews.ac.uk Tom Kelsey

More information

Simulation of Imputation Effects Under Different Assumptions. Danny Rithy

Simulation of Imputation Effects Under Different Assumptions. Danny Rithy Simulation of Imputation Effects Under Different Assumptions Danny Rithy ABSTRACT Missing data is something that we cannot always prevent. Data can be missing due to subjects' refusing to answer a sensitive

More information

Statistical Pattern Recognition

Statistical Pattern Recognition Statistical Pattern Recognition Features and Feature Selection Hamid R. Rabiee Jafar Muhammadi Spring 2014 http://ce.sharif.edu/courses/92-93/2/ce725-2/ Agenda Features and Patterns The Curse of Size and

More information

Data Statistics Population. Census Sample Correlation... Statistical & Practical Significance. Qualitative Data Discrete Data Continuous Data

Data Statistics Population. Census Sample Correlation... Statistical & Practical Significance. Qualitative Data Discrete Data Continuous Data Data Statistics Population Census Sample Correlation... Voluntary Response Sample Statistical & Practical Significance Quantitative Data Qualitative Data Discrete Data Continuous Data Fewer vs Less Ratio

More information

Viewpoint Review & Analytics

Viewpoint Review & Analytics The Viewpoint all-in-one e-discovery platform enables law firms, corporations and service providers to manage every phase of the e-discovery lifecycle with the power of a single product. The Viewpoint

More information

Smart Manufacturing in the Food & Beverage Industry

Smart Manufacturing in the Food & Beverage Industry Smart Manufacturing in the Food & Beverage Industry PUBLIC Copyright 2016 Rockwell Automation, Inc. All Rights Reserved. 1 Rockwell Automation at a Glance $5.9B FISCAL 2016 SALES 22,000 EMPLOYEES 80+ COUNTRIES

More information

Streaming iphone sensor data to SAS Event Stream Processing

Streaming iphone sensor data to SAS Event Stream Processing SAS USER FORUM Streaming iphone sensor data to SAS Event Stream Processing Pasi Helenius Senior Advisor SAS Event Stream Processing 3 KEY CHARACTERISTICS Technology Process steams of data events, on the

More information

Machine Learning - Clustering. CS102 Fall 2017

Machine Learning - Clustering. CS102 Fall 2017 Machine Learning - Fall 2017 Big Data Tools and Techniques Basic Data Manipulation and Analysis Performing well-defined computations or asking well-defined questions ( queries ) Data Mining Looking for

More information