towards advanced HR Analytics by Arie-Jan Baan and Bram Eigenhuis

1 towards advanced HR Analytics by Arie-Jan Baan and Bram Eigenhuis

2 Content
#1 Advanced Data Analytics (?)
#2 Data Science process
#3 A case study
#4 Your case
#5 Q&A

3 Who is who, and what is your expectation?

4 There are known unknowns; that is to say, we know there are some things we do not know. But there are also unknown unknowns: the ones we don't know we don't know.

5 Some myths
- Everyone Is Ahead of Us in Adopting Big Data
- We Have So Much Data, We Don't Need to Worry About Every Little Data Flaw
- Big Data Technology Will Eliminate the Need for Data Integration
- It's Pointless Using a Data Warehouse for Advanced Analytics
- Data Lakes Will Replace the Data Warehouse

6 What we hear and read: Data Science, Artificial Intelligence, Data Mining, Statistics, Statistical Learning, Business Analytics, Machine Learning, Deep Learning.

7 New data-analytics
Statistics: teaching humans what has happened or is happening by looking at data
- Data collected to answer a given question
- Questions come first, data come second
- Data analyzed by people with the aid of computers
- More focus on inference (assuming a hypothesis before analyzing the data)
Data Science: teaching computers to predict the unknown by learning from data
- Data collected electronically for possible future use
- Data come first, questions come second
- Data processed by computer algorithms with the aid of people
- More focus on prediction (without assuming a GDP)

8 Big data - Analytics landscape: rapidly growing, diverse and fragmented. Where to start?

9 The 4 V's

10 Analytics Complexity: from reporting to prescriptive analytics
Descriptive and Diagnostic Analytics: What happened? Why did it happen? What's happening now? (decision evaluation)
Predictive Analytics: Based on historic data, what may happen? What is likely to happen? (decision support)
Prescriptive Analytics: What should I do (to make it happen)? (decision automation)
Stages along the complexity axis: reporting, analysis, monitoring, forecasting, predictive, prescriptive
Techniques: data mining, data exploration, data visualization, regression analysis, simulation, machine learning, deep learning, optimization (operations research)

11 From rule-based decision making to artificial intelligence
Rule-based decision making: Boolean data (yes/no). Examples: phone notification, time- or threshold-based alarms, simple pattern matching.
Statistical reasoning: numerical data (allowing for curve fitting). Examples: extra- and interpolation, outlier detection, predictive maintenance, classification tasks.
Machine learning: arbitrary data (that needs to be abstracted into numbers). Examples: identification of relevant features from large datasets, quality control using various metrics, dynamic adaptation.
Artificial intelligence: arbitrary data (autonomous selection of best methodology). Examples: autonomous vehicles, human-like conversational skills, intelligent digital assistant.
Other labels on the slide: the case of today; every programmer; simple regression; data science; complex systems experts.
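The difference between the first two columns of this slide can be sketched in a few lines of Python. This is only an illustration: the sensor readings, the limit of 30.0 and the z-score cut-off of 2.0 are made-up values, and the function names are hypothetical.

```python
from statistics import mean, stdev


def threshold_alarm(value, limit):
    """Rule-based: a fixed yes/no decision against a hand-set limit."""
    return value > limit


def zscore_outliers(values, z_limit=2.0):
    """Statistical: flag values that lie far from the sample mean,
    measured in sample standard deviations."""
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mu) / sigma > z_limit]


# Hypothetical sensor readings; the last one is clearly off.
readings = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 35.0]

alarms = [threshold_alarm(r, limit=30.0) for r in readings]
outliers = zscore_outliers(readings)
print(alarms)
print(outliers)
```

The rule needs someone to choose the limit up front; the statistical version derives "unusual" from the data itself, which is why it needs numerical rather than yes/no input.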

12 Data Science process
1. Ask interesting questions (understand the business)
2. Start with the data sources: Which internal and external sources are available?
3. Get and understand the data: How is the data logged? Which data is relevant? Are there privacy issues?
4. Prepare, integrate and explore the data: Are there patterns, correlations, anomalies?
5. Model the data: Build a model, fit the model, validate the model
6. Visualize the data: Create appealing graphs and let the data speak
7. Communicate insights: Build a compelling storyline: what did we learn? Do the results make sense?
... and start again

13 Step 1: Ask interesting questions (understand the business and data)
Determine business objectives.
Start with broad questions at the beginning, while gathering the data.
Focus on data-centric questions: they start with where, when, how or how often. They let you focus the search for data within a specific set of parameters, and they are more likely to yield data that lends itself to visual representation.
NB. Asking questions is a process in itself, not necessarily a task that must be completed at the beginning. As understanding of the data improves, new questions arise, and eventually we are able to ask a good question.

14 Step 2: Start with the data sources
Which internal and external sources are available? Is the data available? Do you need partners?
NB. Don't forget open data.

15 Several types of sources
1. Internal data: ERP, CRM, financial, HRM, support, product, services
2. Open data: governmental, census, demographic, federal, environmental, company, financial
3. Commercial data: profile brokers, marketing agencies, web profiles, demographic profiles, location, mobile
4. Web data: social media, websites, web docs
5. IoT data: devices, machines, sensors

16 Getting started with data: consultants, data engineers and data scientists from NS
Which data is available? How is the data logged? How can we link the various data tables to each other?
Sources: train driver data, train service as operated, rolling stock as operated, personnel as operated, Nabel information on train drivers, [log information from FLIRT rolling stock]

17 Step 3: Get and understand the data
Describe the data, select data, extract data, explore data, verify data quality.
Sources: CRM, SCM, ERP; transactional data; documents; images; audio; video; social media; mobile data; machine data; weather; GPS; open data; sensor data
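A first pass at "verify data quality" is often just counting what is missing or duplicated. The sketch below assumes the data has already been extracted into a list of records; the column names (`driver_id`, `years_experience`, `depot`) and the values are hypothetical and only serve as an example.

```python
def profile(rows, key):
    """Count missing values per column and duplicated keys
    in a list of dict records."""
    columns = sorted({c for row in rows for c in row})
    # A value counts as missing when it is absent, None or an empty string.
    missing = {
        c: sum(1 for row in rows if row.get(c) in (None, ""))
        for c in columns
    }
    seen, duplicates = set(), set()
    for row in rows:
        k = row.get(key)
        if k in seen:
            duplicates.add(k)
        seen.add(k)
    return {
        "rows": len(rows),
        "missing": missing,
        "duplicate_keys": sorted(duplicates),
    }


# Made-up example records.
records = [
    {"driver_id": 1, "years_experience": 12, "depot": "Utrecht"},
    {"driver_id": 2, "years_experience": None, "depot": "Zwolle"},
    {"driver_id": 2, "years_experience": 3, "depot": ""},
]

report = profile(records, key="driver_id")
print(report)
```

A report like this does not fix anything, but it shows where cleaning is needed before the data is explored or modelled.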

18 Two types of data
Quantitative: all about numbers that can be measured and counted (e.g. # Autodrop in a bag)
Qualitative: things you can observe but can't measure (e.g. taste, color of Autodrop in a bag)
Discrete (meristic): based on counts, and can only take certain values such as whole numbers (e.g. # Autodrop in a bag)
Continuous: can take any value in a range, measured to any level of precision (e.g. height, time, temperature)

19 Scales
Numerical measurement: discrete or continuous
Interval: when data is being grouped (e.g. the ages of visitors)
Ratio: has a meaningful zero point (e.g. weight, or temperature on the Kelvin scale)
Circular: time (e.g. annual dates, clock times and others)

20 The state of your data
The term data is very broad and general. Data comes in many forms and states of quality. Understand your data and prepare it for analysis.
States: raw data, technically correct data, tidy data, aggregated data, formatted data

21 Step 4: Prepare, integrate and explore the data
Clean data, construct data, integrate data, enrich data, aggregate data, format data.
Which keys are essential to merge the data files?
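The question about merge keys can be made concrete with a small left join. This is a minimal sketch in plain Python, assuming the key is unique in the right-hand table; the tables, the `driver_id` key and the function name `left_join` are hypothetical.

```python
def left_join(left, right, key):
    """Attach the columns of `right` to every row of `left` that shares
    the key. Assumes `key` is unique in `right`. Returns the merged rows
    and the keys that found no match."""
    index = {row[key]: row for row in right}
    merged, unmatched = [], []
    for row in left:
        match = index.get(row[key])
        if match is None:
            unmatched.append(row[key])
            merged.append(dict(row))
        else:
            # On a column clash the left-hand value wins.
            merged.append({**match, **row})
    return merged, unmatched


# Made-up tables that share the key "driver_id".
drivers = [
    {"driver_id": 1, "depot": "Utrecht"},
    {"driver_id": 2, "depot": "Zwolle"},
]
shifts = [
    {"shift_id": "A", "driver_id": 1},
    {"shift_id": "B", "driver_id": 3},
]

merged, unmatched = left_join(shifts, drivers, key="driver_id")
print(merged)
print(unmatched)
```

Keys that end up in `unmatched` are worth checking before any analysis: they usually point to a logging difference between the source systems rather than to missing people or trains.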

22 Data wrangling: split cells, handle missing values, expand tables, unite tables, make new variables, subset observations
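Four of these operations (split cells, handle missing values, make new variables, subset observations) fit in one short sketch. The records are invented, and filling a missing delay with the median of the observed delays is just one possible choice, not a recommendation from the slides.

```python
from statistics import median

# Made-up raw records; all values arrive as text, one delay is missing.
raw = [
    {"name": "Jansen, A", "start": "06:10", "end": "14:40", "delay_min": "4"},
    {"name": "de Vries, B", "start": "07:00", "end": "15:00", "delay_min": ""},
    {"name": "Bakker, C", "start": "13:30", "end": "22:00", "delay_min": "11"},
]


def to_minutes(hhmm):
    """Convert 'HH:MM' to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


# Handle missing values: fill with the median of the observed delays.
observed = [int(r["delay_min"]) for r in raw if r["delay_min"] != ""]
fill = median(observed)

tidy = []
for r in raw:
    # Split cells: one "surname, initial" cell becomes two columns.
    surname, initial = [part.strip() for part in r["name"].split(",")]
    tidy.append({
        "surname": surname,
        "initial": initial,
        # Make new variables: shift length derived from start and end.
        "shift_min": to_minutes(r["end"]) - to_minutes(r["start"]),
        "delay_min": int(r["delay_min"]) if r["delay_min"] != "" else fill,
    })

# Subset observations: keep only shifts of 500 minutes or longer.
long_shifts = [r for r in tidy if r["shift_min"] >= 500]
print(long_shifts)
```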

23 From raw data to insights
Sourcing raw data: train driver data, train service as operated, rolling stock as operated, personnel as operated, disruption data, Nabel information on train drivers, [log information from FLIRT rolling stock]
Staging and transformations: flat data file
Exploration and analytics: analytics library
Insights, visuals, dashboarding

24 NEXT STEPS Predicting performance Create customized education program

25 Three types of analytics

26 Step 5: Model the data
Select modelling technique, generate PoC (test design), build and train model, test and assess model, improve model with domain experts.
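"Build and train model" followed by "test and assess model" comes down to fitting on one part of the data and measuring the error on a part the model has not seen. The sketch below uses a least-squares line on synthetic data; the data, the 30/10 split and the function names are assumptions made for illustration only.

```python
import random


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def mean_abs_error(model, xs, ys):
    """Average absolute difference between prediction and outcome."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in zip(xs, ys)) / len(xs)


# Synthetic data: y = 2x + 1 plus uniform noise in [-1, 1].
rng = random.Random(0)
xs = [float(i) for i in range(40)]
ys = [2.0 * x + 1.0 + rng.uniform(-1, 1) for x in xs]

# Hold out a quarter of the observations for testing.
idx = list(range(40))
rng.shuffle(idx)
train, test = idx[:30], idx[30:]

model = fit_line([xs[i] for i in train], [ys[i] for i in train])
test_error = mean_abs_error(model, [xs[i] for i in test], [ys[i] for i in test])
print(model, test_error)
```

The error on the held-out part is the number to discuss with domain experts; the error on the training part tends to look better than the model deserves.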

27 Data + algorithm = machine learning

28 What is an algorithm?

29 Algorithm - requirements
In simple terms, an algorithm is a sequence of steps that solves a certain task. To be considered valid, it needs three characteristics:
It should be finite: if your algorithm never stops trying to solve the problem it was designed to solve, it is useless.
It should have well-defined instructions: each step has to be precisely defined, and the instructions should be unambiguously specified for each case.
It should be effective: the algorithm should solve the problem it was designed to solve, and it should be possible to demonstrate that it converges with just paper and pencil.
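Euclid's algorithm for the greatest common divisor is a classic example that meets all three requirements, and it is small enough to check by hand. It is not taken from the slides; it is used here only as an illustration.

```python
def gcd(a, b):
    """Greatest common divisor of two non-negative integers (Euclid)."""
    if a < 0 or b < 0:
        raise ValueError("gcd is defined here for non-negative integers only")
    # Well-defined: every step is one exact remainder operation.
    # Finite: b gets strictly smaller each round, so the loop must stop.
    while b != 0:
        a, b = b, a % b
    # Effective: when b reaches 0, a holds the greatest common divisor.
    return a


print(gcd(48, 18))
```

With paper and pencil: (48, 18) becomes (18, 12), then (12, 6), then (6, 0), so the answer is 6.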

30 Three types of learning
Supervised: use training data with the right outcomes to train the algorithm, then apply it to data without a correct answer.
Unsupervised: no training data; throw data into the algorithm and hope it makes some kind of sense out of the data.
Reinforcement: no training data; the agent learns from its environment and rewards, by trial and error.
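The first two types can be contrasted on one-dimensional toy data. The sketch below uses a nearest-neighbour rule for the supervised case and a two-group k-means for the unsupervised case; both are stand-ins chosen for brevity, the data and labels are invented, and reinforcement learning is left out because it needs an environment to interact with.

```python
def nearest_neighbour(train, x):
    """Supervised: predict the label of the closest labelled example.
    `train` is a list of (value, label) pairs."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def two_means(values, iterations=10):
    """Unsupervised: split unlabelled numbers into two groups
    (1-D k-means with k=2). Assumes at least two distinct values."""
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        # Assign each value to the nearer of the two centres.
        a = [v for v in values if abs(v - lo) <= abs(v - hi)]
        b = [v for v in values if abs(v - lo) > abs(v - hi)]
        # Move each centre to the mean of its group.
        lo, hi = sum(a) / len(a), sum(b) / len(b)
    return sorted(a), sorted(b)


# Supervised: the right outcomes ("short" / "long") are known for training.
labelled = [(1.0, "short"), (2.0, "short"), (8.0, "long"), (9.0, "long")]
prediction = nearest_neighbour(labelled, 7.2)

# Unsupervised: the same kind of numbers, but without any labels.
groups = two_means([1.0, 2.0, 1.5, 8.0, 9.0, 8.5])

print(prediction)
print(groups)
```

The supervised rule can name its answer because the names were in the training data; the unsupervised one only reports that two groups exist and leaves the interpretation to the analyst.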

31

32 Data landscape Integration of data Using the big data stack

33 Can we predict the optimal schedule for a train driver based on his characteristics and his experience? Or a customized education program for new trains?
Inputs: detailed timetable data, personal characteristics, expert knowledge, detailed data of delays

34 THANK YOU Arie-Jan Baan Senior Consultant / Data Scientist a.j.baan@intellerts.com Bram Eigenhuis Consultant HR Analytics b.eigenhuis@berenschot.nl

Learning Objectives for Data Concept and Visualization

Learning Objectives for Data Concept and Visualization Learning Objectives for Data Concept and Visualization Assignment 1: Data Quality Concept and Impact of Data Quality Summarize concepts of data quality. Understand and describe the impact of data on actuarial

More information

BIG DATA SCIENTIST Certification. Big Data Scientist

BIG DATA SCIENTIST Certification. Big Data Scientist BIG DATA SCIENTIST Certification Big Data Scientist Big Data Science Professional (BDSCP) certifications are formal accreditations that prove proficiency in specific areas of Big Data. To obtain a certification,

More information

Data Mining. Introduction. Hamid Beigy. Sharif University of Technology. Fall 1395

Data Mining. Introduction. Hamid Beigy. Sharif University of Technology. Fall 1395 Data Mining Introduction Hamid Beigy Sharif University of Technology Fall 1395 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1395 1 / 21 Table of contents 1 Introduction 2 Data mining

More information

Data Mining. Introduction. Hamid Beigy. Sharif University of Technology. Fall 1394

Data Mining. Introduction. Hamid Beigy. Sharif University of Technology. Fall 1394 Data Mining Introduction Hamid Beigy Sharif University of Technology Fall 1394 Hamid Beigy (Sharif University of Technology) Data Mining Fall 1394 1 / 20 Table of contents 1 Introduction 2 Data mining

More information

Data Analyst Nanodegree Syllabus

Data Analyst Nanodegree Syllabus Data Analyst Nanodegree Syllabus Discover Insights from Data with Python, R, SQL, and Tableau Before You Start Prerequisites : In order to succeed in this program, we recommend having experience working

More information

USERS CONFERENCE Copyright 2016 OSIsoft, LLC

USERS CONFERENCE Copyright 2016 OSIsoft, LLC Bridge IT and OT with a process data warehouse Presented by Matt Ziegler, OSIsoft Complexity Problem Complexity Drives the Need for Integrators Disparate assets or interacting one-by-one Monitoring Real-time

More information

Data Mining. Practical Machine Learning Tools and Techniques. Slides for Chapter 3 of Data Mining by I. H. Witten, E. Frank and M. A.

Data Mining. Practical Machine Learning Tools and Techniques. Slides for Chapter 3 of Data Mining by I. H. Witten, E. Frank and M. A. Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 3 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Input: Concepts, instances, attributes Terminology What s a concept?

More information

Title DC Automation: It s a MARVEL!

Title DC Automation: It s a MARVEL! Title DC Automation: It s a MARVEL! Name Nikos D. Anagnostatos Position Network Consultant, Network Solutions Division Classification ISO 27001: Public Data Center Evolution 2 Space Hellas - All Rights

More information

Overview and Practical Application of Machine Learning in Pricing

Overview and Practical Application of Machine Learning in Pricing Overview and Practical Application of Machine Learning in Pricing 2017 CAS Spring Meeting May 23, 2017 Duncan Anderson and Claudine Modlin (Willis Towers Watson) Mark Richards (Allstate Insurance Company)

More information

Basic Concepts Weka Workbench and its terminology

Basic Concepts Weka Workbench and its terminology Changelog: 14 Oct, 30 Oct Basic Concepts Weka Workbench and its terminology Lecture Part Outline Concepts, instances, attributes How to prepare the input: ARFF, attributes, missing values, getting to know

More information

DATA MINING AND WAREHOUSING

DATA MINING AND WAREHOUSING DATA MINING AND WAREHOUSING Qno Question Answer 1 Define data warehouse? Data warehouse is a subject oriented, integrated, time-variant, and nonvolatile collection of data that supports management's decision-making

More information

Introduction to Data Mining and Data Analytics

Introduction to Data Mining and Data Analytics 1/28/2016 MIST.7060 Data Analytics 1 Introduction to Data Mining and Data Analytics What Are Data Mining and Data Analytics? Data mining is the process of discovering hidden patterns in data, where Patterns

More information

Six Core Data Wrangling Activities. An introductory guide to data wrangling with Trifacta

Six Core Data Wrangling Activities. An introductory guide to data wrangling with Trifacta Six Core Data Wrangling Activities An introductory guide to data wrangling with Trifacta Today s Data Driven Culture Are you inundated with data? Today, most organizations are collecting as much data in

More information

Data Analyst Nanodegree Syllabus

Data Analyst Nanodegree Syllabus Data Analyst Nanodegree Syllabus Discover Insights from Data with Python, R, SQL, and Tableau Before You Start Prerequisites : In order to succeed in this program, we recommend having experience working

More information

SOCIAL MEDIA MINING. Data Mining Essentials

SOCIAL MEDIA MINING. Data Mining Essentials SOCIAL MEDIA MINING Data Mining Essentials Dear instructors/users of these slides: Please feel free to include these slides in your own material, or modify them as you see fit. If you decide to incorporate

More information

ISSN: (Online) Volume 3, Issue 9, September 2015 International Journal of Advance Research in Computer Science and Management Studies

ISSN: (Online) Volume 3, Issue 9, September 2015 International Journal of Advance Research in Computer Science and Management Studies ISSN: 2321-7782 (Online) Volume 3, Issue 9, September 2015 International Journal of Advance Research in Computer Science and Management Studies Research Article / Survey Paper / Case Study Available online

More information

Slides for Data Mining by I. H. Witten and E. Frank

Slides for Data Mining by I. H. Witten and E. Frank Slides for Data Mining by I. H. Witten and E. Frank 7 Engineering the input and output Attribute selection Scheme-independent, scheme-specific Attribute discretization Unsupervised, supervised, error-

More information

Cortana Intelligence Suite; Where the Magic Happens

Cortana Intelligence Suite; Where the Magic Happens Cortana Intelligence Suite; Where the Magic Happens Reza Rad, Leila Etaati #509 Brisbane 2016 About Us Reza Rad Leila Etaati MVP BI Consultant and Trainer Author of Books Speaker in conferences; PASS Summit,

More information

Data Mining Practical Machine Learning Tools and Techniques

Data Mining Practical Machine Learning Tools and Techniques Input: Concepts, instances, attributes Data ining Practical achine Learning Tools and Techniques Slides for Chapter 2 of Data ining by I. H. Witten and E. rank Terminology What s a concept z Classification,

More information

Oracle Big Data Discovery

Oracle Big Data Discovery Oracle Big Data Discovery Turning Data into Business Value Harald Erb Oracle Business Analytics & Big Data 1 Safe Harbor Statement The following is intended to outline our general product direction. It

More information

Chapter 3: Data Mining:

Chapter 3: Data Mining: Chapter 3: Data Mining: 3.1 What is Data Mining? Data Mining is the process of automatically discovering useful information in large repository. Why do we need Data mining? Conventional database systems

More information

Accelerate AI with Cisco Computing Solutions

Accelerate AI with Cisco Computing Solutions Accelerate AI with Cisco Computing Solutions Data is everywhere. Your data scientists are propelling your business into a future of data-driven intelligence. But how do you deploy and manage artificial

More information

Machine Learning Chapter 2. Input

Machine Learning Chapter 2. Input Machine Learning Chapter 2. Input 2 Input: Concepts, instances, attributes Terminology What s a concept? Classification, association, clustering, numeric prediction What s in an example? Relations, flat

More information

1 of 5 1/28/2015 12:27 PM BDA Program Program Mission/Purpose The mission of the Bachelor of Science in Business Data Analytics (BDA) program is to prepare students to understand the foundation of business

More information

Smart Data Center From Hitachi Vantara: Transform to an Agile, Learning Data Center

Smart Data Center From Hitachi Vantara: Transform to an Agile, Learning Data Center Smart Data Center From Hitachi Vantara: Transform to an Agile, Learning Data Center Leverage Analytics To Protect and Optimize Your Business Infrastructure SOLUTION PROFILE Managing a data center and the

More information

Lecture 25: Review I

Lecture 25: Review I Lecture 25: Review I Reading: Up to chapter 5 in ISLR. STATS 202: Data mining and analysis Jonathan Taylor 1 / 18 Unsupervised learning In unsupervised learning, all the variables are on equal standing,

More information

Data Preprocessing. Slides by: Shree Jaswal

Data Preprocessing. Slides by: Shree Jaswal Data Preprocessing Slides by: Shree Jaswal Topics to be covered Why Preprocessing? Data Cleaning; Data Integration; Data Reduction: Attribute subset selection, Histograms, Clustering and Sampling; Data

More information

Using Machine Learning to Optimize Storage Systems

Using Machine Learning to Optimize Storage Systems Using Machine Learning to Optimize Storage Systems Dr. Kiran Gunnam 1 Outline 1. Overview 2. Building Flash Models using Logistic Regression. 3. Storage Object classification 4. Storage Allocation recommendation

More information

ETL Best Practices and Techniques. Marc Beacom, Managing Partner, Datalere

ETL Best Practices and Techniques. Marc Beacom, Managing Partner, Datalere ETL Best Practices and Techniques Marc Beacom, Managing Partner, Datalere Thank you Sponsors Experience 10 years DW/BI Consultant 20 Years overall experience Marc Beacom Managing Partner, Datalere Current

More information

What's New in MATLAB for Engineering Data Analytics?

What's New in MATLAB for Engineering Data Analytics? What's New in MATLAB for Engineering Data Analytics? Will Wilson Application Engineer MathWorks, Inc. 2017 The MathWorks, Inc. 1 Agenda Data Types Tall Arrays for Big Data Machine Learning (for Everyone)

More information

Data Mining. Part 2. Data Understanding and Preparation. 2.4 Data Transformation. Spring Instructor: Dr. Masoud Yaghini. Data Transformation

Data Mining. Part 2. Data Understanding and Preparation. 2.4 Data Transformation. Spring Instructor: Dr. Masoud Yaghini. Data Transformation Data Mining Part 2. Data Understanding and Preparation 2.4 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Introduction Normalization Attribute Construction Aggregation Attribute Subset Selection Discretization

More information

Think & Work like a Data Scientist with SQL 2016 & R DR. SUBRAMANI PARAMASIVAM (MANI)

Think & Work like a Data Scientist with SQL 2016 & R DR. SUBRAMANI PARAMASIVAM (MANI) Think & Work like a Data Scientist with SQL 2016 & R DR. SUBRAMANI PARAMASIVAM (MANI) About the Speaker Dr. SubraMANI Paramasivam PhD., MCT, MCSE, MCITP, MCP, MCTS, MCSA CEO, Principal Consultant & Trainer

More information

Data Science Tutorial

Data Science Tutorial Eliezer Kanal Technical Manager, CERT Daniel DeCapria Data Scientist, ETC Software Engineering Institute Carnegie Mellon University Pittsburgh, PA 15213 2017 SEI SEI Data Science in in Cybersecurity Symposium

More information

This tutorial has been prepared for computer science graduates to help them understand the basic-to-advanced concepts related to data mining.

This tutorial has been prepared for computer science graduates to help them understand the basic-to-advanced concepts related to data mining. About the Tutorial Data Mining is defined as the procedure of extracting information from huge sets of data. In other words, we can say that data mining is mining knowledge from data. The tutorial starts

More information

Computational Databases: Inspirations from Statistical Software. Linnea Passing, Technical University of Munich

Computational Databases: Inspirations from Statistical Software. Linnea Passing, Technical University of Munich Computational Databases: Inspirations from Statistical Software Linnea Passing, linnea.passing@tum.de Technical University of Munich Data Science Meets Databases Data Cleansing Pipelines Fuzzy joins Data

More information

Data Collection, Preprocessing and Implementation

Data Collection, Preprocessing and Implementation Chapter 6 Data Collection, Preprocessing and Implementation 6.1 Introduction Data collection is the loosely controlled method of gathering the data. Such data are mostly out of range, impossible data combinations,

More information

From Insight to Action: Analytics from Both Sides of the Brain. Vaz Balasingham Director of Solutions Consulting

From Insight to Action: Analytics from Both Sides of the Brain. Vaz Balasingham Director of Solutions Consulting From Insight to Action: Analytics from Both Sides of the Brain Vaz Balasingham Director of Solutions Consulting vbalasin@tibco.com Insight to Action from Both Sides of the Brain Value Grow Revenue Reduce

More information

Il caso della Prescriptive Maintenance

Il caso della Prescriptive Maintenance Reimagine 2018 HPE Pointnext AI at Work Il caso della Prescriptive Maintenance nel mondo industriale Edmondo De Salvo WW AI, Data & Emerging Technology Center of Excellence 24 maggio 2018 We live in a

More information

Table Of Contents: xix Foreword to Second Edition

Table Of Contents: xix Foreword to Second Edition Data Mining : Concepts and Techniques Table Of Contents: Foreword xix Foreword to Second Edition xxi Preface xxiii Acknowledgments xxxi About the Authors xxxv Chapter 1 Introduction 1 (38) 1.1 Why Data

More information

Machine Learning Techniques for Data Mining

Machine Learning Techniques for Data Mining Machine Learning Techniques for Data Mining Eibe Frank University of Waikato New Zealand 10/25/2000 1 PART VII Moving on: Engineering the input and output 10/25/2000 2 Applying a learner is not all Already

More information

BIG DATA SCIENCE PROFESSIONAL Certification. Big Data Science Professional

BIG DATA SCIENCE PROFESSIONAL Certification. Big Data Science Professional BIG DATA SCIENCE PROFESSIONAL Certification Big Data Science Professional Big Data Science Professional (BDSCP) certifications are formal accreditations that prove proficiency in specific areas of Big

More information

EMEA USERS CONFERENCE BERLIN, GERMANY. Copyright 2016 OSIsoft, LLC

EMEA USERS CONFERENCE BERLIN, GERMANY. Copyright 2016 OSIsoft, LLC Bridge IT and OT with a process data warehouse Presented by Franco Camba, OSIsoft Matt Ziegler, OSIsoft Frank Ruland, SAP Audience Poll Have you invested or are you looking into Business Intelligence tools?

More information

ADVANCED ANALYTICS USING SAS ENTERPRISE MINER RENS FEENSTRA

ADVANCED ANALYTICS USING SAS ENTERPRISE MINER RENS FEENSTRA INSIGHTS@SAS: ADVANCED ANALYTICS USING SAS ENTERPRISE MINER RENS FEENSTRA AGENDA 09.00 09.15 Intro 09.15 10.30 Analytics using SAS Enterprise Guide Ellen Lokollo 10.45 12.00 Advanced Analytics using SAS

More information

AI: A UAE Perspective. Nawaf I. Almoosa Deputy Director - EBTIC March 7 th 2018

AI: A UAE Perspective. Nawaf I. Almoosa Deputy Director - EBTIC March 7 th 2018 AI: A UAE Perspective Nawaf I. Almoosa Deputy Director - EBTIC March 7 th 2018 2 EBTIC Overview 3 Our Model Research & Innovation Structure Partners and Target Sectors Sustainability Smart Air- Conditioning

More information

Analytics Fundamentals by Mark Peco

Analytics Fundamentals by Mark Peco Analytics Fundamentals by Mark Peco All rights reserved. Reproduction in whole or part prohibited except by written permission. Product and company names mentioned herein may be trademarks of their respective

More information

Data Preprocessing. S1 Teknik Informatika Fakultas Teknologi Informasi Universitas Kristen Maranatha

Data Preprocessing. S1 Teknik Informatika Fakultas Teknologi Informasi Universitas Kristen Maranatha Data Preprocessing S1 Teknik Informatika Fakultas Teknologi Informasi Universitas Kristen Maranatha 1 Why Data Preprocessing? Data in the real world is dirty incomplete: lacking attribute values, lacking

More information

Seven Things You Didn t Know You Could Do With Google Analytics

Seven Things You Didn t Know You Could Do With Google Analytics Seven Things You Didn t Know You Could Do With Google Analytics Introduction Google Analytics is a fantastic and powerful tool for tracking your website activity and using that data to inform and improve

More information

Citizen Data Scientist is the new Data Analyst

Citizen Data Scientist is the new Data Analyst Welcome # T C 1 8 Citizen Data Scientist is the new Data Analyst Mehmet Vanli Sales Consultant Tableau Australia Citizen data scientist: A person who creates models that use advanced diagnostic analytics

More information

A Road Map for Advancing Your Career. Distinguish yourself professionally. Get an edge over the competition. Advance your career with CBIP.

A Road Map for Advancing Your Career. Distinguish yourself professionally. Get an edge over the competition. Advance your career with CBIP. TDWI Certification A Road Map for Advancing Your Career Distinguish yourself professionally. Get an edge over the competition. Advance your career with CBIP. www.tdwi.org/cbip TDWI s Certified Business

More information

United States Personal Device History and Forecast, April 2019

United States Personal Device History and Forecast, April 2019 United States Personal Device History and Forecast, 1975-2023 April 2019 Table of Content Market Overview Forecasting Approach and Process Market Tables Next Five Years Major Causal Influences and Trends

More information

P L A Y.

P L A Y. P L A Y https://youtu.be/4pqrjadinos 1 2 3 Multiple building 145 structures 15M Sqf office systems & lab space 58,400 housed heads $55M annual utility spend 2M connection points 50-55 megawatt hour average

More information

TDWI strives to provide course books that are contentrich and that serve as useful reference documents after a class has ended.

TDWI strives to provide course books that are contentrich and that serve as useful reference documents after a class has ended. Previews of TDWI course books offer an opportunity to see the quality of our material and help you to select the courses that best fit your needs. The previews cannot be printed. TDWI strives to provide

More information

Analytical model A structure and process for analyzing a dataset. For example, a decision tree is a model for the classification of a dataset.

Analytical model A structure and process for analyzing a dataset. For example, a decision tree is a model for the classification of a dataset. Glossary of data mining terms: Accuracy Accuracy is an important factor in assessing the success of data mining. When applied to data, accuracy refers to the rate of correct values in the data. When applied

More information

Jim Green CTO, IoT Software Cisco Systems. The State of The Internet of The Things

Jim Green CTO, IoT Software Cisco Systems. The State of The Internet of The Things Jim Green CTO, IoT Software Cisco Systems The State of The Internet of The Things The Goal of IoT Leveraging Machine Generated Data for Business Benefit Product Quality Assessment at the Well Is IoT Big

More information

Overview of Data Services and Streaming Data Solution with Azure

Overview of Data Services and Streaming Data Solution with Azure Overview of Data Services and Streaming Data Solution with Azure Tara Mason Senior Consultant tmason@impactmakers.com Platform as a Service Offerings SQL Server On Premises vs. Azure SQL Server SQL Server

More information

MIT 801. Machine Learning I. [Presented by Anna Bosman] 16 February 2018

MIT 801. Machine Learning I. [Presented by Anna Bosman] 16 February 2018 MIT 801 [Presented by Anna Bosman] 16 February 2018 Machine Learning What is machine learning? Artificial Intelligence? Yes as we know it. What is intelligence? The ability to acquire and apply knowledge

More information

Intelligent Infrastructure Solutions

Intelligent Infrastructure Solutions Intelligent Infrastructure Solutions High Performance for the Life of your Buildings Construction practices, tools and procurement models are evolving New architectural trends Net-zero & Bioclimatic New

More information

The Rules of Subsurface Analytics Jane McConnell, Practice Partner Oil and Gas, Teradata DEJ KL, 4 October 2017

The Rules of Subsurface Analytics Jane McConnell, Practice Partner Oil and Gas, Teradata DEJ KL, 4 October 2017 The Rules of Subsurface Analytics Jane McConnell, Practice Partner Oil and Gas, Teradata DEJ KL, 4 October 2017 Agenda Why subsurface analytics is different The Rules Rule 1: Right People Rule 2: Right

More information

Event: PASS SQL Saturday - DC 2018 Presenter: Jon Tupitza, CTO Architect

Event: PASS SQL Saturday - DC 2018 Presenter: Jon Tupitza, CTO Architect Event: PASS SQL Saturday - DC 2018 Presenter: Jon Tupitza, CTO Architect BEOP.CTO.TP4 Owner: OCTO Revision: 0001 Approved by: JAT Effective: 08/30/2018 Buchanan & Edwards Proprietary: Printed copies of

More information

Data Preprocessing. Komate AMPHAWAN

Data Preprocessing. Komate AMPHAWAN Data Preprocessing Komate AMPHAWAN 1 Data cleaning (data cleansing) Attempt to fill in missing values, smooth out noise while identifying outliers, and correct inconsistencies in the data. 2 Missing value

More information

Contents. Foreword to Second Edition. Acknowledgments About the Authors

Contents. Foreword to Second Edition. Acknowledgments About the Authors Contents Foreword xix Foreword to Second Edition xxi Preface xxiii Acknowledgments About the Authors xxxi xxxv Chapter 1 Introduction 1 1.1 Why Data Mining? 1 1.1.1 Moving toward the Information Age 1

More information

SAP HANA Spatial Location-based business platform

SAP HANA Spatial Location-based business platform SAP HANA Spatial Location-based business platform Thomas Hammer, HANA Spatial Development April 19, 2018 SAP HANA Architecture Application development All Devices SAP, ISV and Custom Applications SAP HANA

More information

How to integrate data into Tableau

How to integrate data into Tableau 1 How to integrate data into Tableau a comparison of 3 approaches: ETL, Tableau self-service and WHITE PAPER WHITE PAPER 2 data How to integrate data into Tableau a comparison of 3 es: ETL, Tableau self-service

More information

Week 1 Unit 1: Introduction to Data Science

Week 1 Unit 1: Introduction to Data Science Week 1 Unit 1: Introduction to Data Science The next 6 weeks What to expect in the next 6 weeks? 2 Curriculum flow (weeks 1-3) Business & Data Understanding 1 2 3 Data Preparation Modeling (1) Introduction

More information
