Data Mining & Analytics Data Mining Reference Model Data Warehouse Legal and Ethical Issues. Slides by Michael Hahsler



Data Mining & Analytics
Analytics is the discovery and communication of meaningful patterns in data. Analytics relies on the simultaneous application of statistics, computer programming and operations research to quantify performance. Analytics often favors data visualization to communicate insight. [Wikipedia]

Analytics and Visualization
Information visualization (infoviz) is a field of its own. Example: Napoleon's Army in Russia by Charles Minard (1869)

Do you notice the slight flaw?

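Data Mining & Analytics
[Diagram: data mining and analytics in relation to neighboring fields: operations research (OR), statistics, machine learning, and databases / computer science (DB / CS)]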

CRISP-DM Reference Model
Cross-Industry Standard Process for Data Mining
- De facto standard for conducting data mining and knowledge discovery projects.
- Defines tasks and outputs.
- Now developed by IBM as the Analytics Solutions Unified Method for Data Mining/Predictive Analytics (ASUM-DM).
- SAS has SEMMA, and most consulting companies use their own process.

Tasks in the CRISP-DM Model

Problem: Mining Point of Sale (POS) Data

Problem: How is POS data stored?
- In a relational database? What do the tables look like?
- Does every store/region have its own database?
- What if I want to know how many units of product A were sold in the last three months in Texas?
This must be easier!
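To make the question concrete, here is a minimal sketch of the "units of product A sold in Texas" query against a single operational database. The `stores` and `sales` tables, their columns, and the sample rows are all hypothetical; a real POS system would likely spread this data over more tables and possibly over several databases, which is what makes the question hard.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical POS schema and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE stores (store_id INTEGER PRIMARY KEY, state TEXT);
        CREATE TABLE sales (store_id INTEGER, product TEXT,
                            sale_date TEXT, units INTEGER);
        """
    )
    conn.executemany(
        "INSERT INTO stores VALUES (?, ?)",
        [(1, "TX"), (2, "TX"), (3, "OK")],
    )
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?, ?, ?)",
        [
            (1, "A", "2024-01-15", 5),
            (2, "A", "2024-02-20", 7),
            (3, "A", "2024-02-21", 4),  # sold in Oklahoma, not Texas
            (1, "B", "2024-03-01", 9),  # different product
            (1, "A", "2023-11-30", 2),  # outside the date window
        ],
    )
    return conn


def units_sold(conn, product, state, start_date, end_date):
    """Sum units of one product sold in one state with start_date <= date < end_date."""
    row = conn.execute(
        """
        SELECT COALESCE(SUM(s.units), 0)
        FROM sales AS s
        JOIN stores AS st ON st.store_id = s.store_id
        WHERE s.product = ? AND st.state = ?
          AND s.sale_date >= ? AND s.sale_date < ?
        """,
        (product, state, start_date, end_date),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = build_demo_db()
    print(units_sold(conn, "A", "TX", "2024-01-01", "2024-04-01"))
```

Even in this toy setup the answer needs a join and a date filter; with one database per store or region, the same question would require running and combining many such queries.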

Data Warehouse

ETL: Extract, Transform and Load
- Extract data from outside sources.
- Transform it to fit analytical needs, e.g.:
  - Clean (missing data, wrong data)
  - Translate (1 → "female")
  - Join (from several sources)
  - Calculate and aggregate data
- Load it into the end target (data warehouse).
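The extract, transform and load steps above can be sketched in a few lines. This is only an illustration: the two in-memory sources, the field names, and the code table (in particular the 0 → "male" entry) are assumptions, and the "warehouse" is just a dictionary.

```python
from collections import defaultdict

# Hypothetical code table; only 1 -> "female" comes from the slide.
GENDER_CODES = {1: "female", 0: "male"}


def extract():
    """Extract: pull raw records from two hypothetical outside sources."""
    customers = [
        {"id": 1, "gender": 1},
        {"id": 2, "gender": 0},
        {"id": 3, "gender": None},  # missing data
    ]
    orders = [
        {"customer_id": 1, "amount": 20.0},
        {"customer_id": 1, "amount": 5.0},
        {"customer_id": 2, "amount": 12.5},
        {"customer_id": 3, "amount": 8.0},
        {"customer_id": 2, "amount": -1.0},  # wrong data
    ]
    return customers, orders


def transform(customers, orders):
    """Transform: clean, translate codes, join the sources, and aggregate."""
    # Clean: drop customers with a missing/unknown code and non-positive amounts.
    clean_customers = [c for c in customers if c["gender"] in GENDER_CODES]
    clean_orders = [o for o in orders if o["amount"] > 0]
    # Translate: replace numeric codes with readable labels.
    gender_by_id = {c["id"]: GENDER_CODES[c["gender"]] for c in clean_customers}
    # Join orders to customers, then aggregate revenue per gender.
    totals = defaultdict(float)
    for order in clean_orders:
        gender = gender_by_id.get(order["customer_id"])
        if gender is not None:
            totals[gender] += order["amount"]
    return dict(totals)


def load(warehouse, rows):
    """Load: write the transformed rows into the target store."""
    warehouse.update(rows)
    return warehouse


if __name__ == "__main__":
    print(load({}, transform(*extract())))
```

Production ETL tools add scheduling, logging, and incremental loads, but the three stages keep this shape.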

Data Warehouse
- Subject Oriented: Data warehouses are designed to help you analyze data in a certain area (e.g., sales).
- Integrated: Integrates data from disparate sources into a consistent format.
- Nonvolatile: Data in the data warehouse are never overwritten or deleted.
- Time Variant: Data warehouses maintain both historical and (nearly) current data.

OLAP: Online Analytical Processing
Operations:
- Slice
- Dice
- Drill-down
- Roll-up
- Pivot
[Figure: data cube with the dimensions Product, Region and Time; example labels: Smartphones, TX]
For fast operation, OLAP requires a special database structure (snowflake schema).
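The operations listed above can be illustrated on a tiny fact table. This is a sketch in plain Python, not how an OLAP engine is implemented: the rows, the unit counts, and the function names are made up for illustration, and a quarter stands in for the Time dimension.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, region, quarter, units)
DIMS = ("product", "region", "quarter")
FACTS = [
    ("Smartphones", "TX", "Q1", 12),
    ("Smartphones", "TX", "Q2", 8),
    ("Smartphones", "OK", "Q1", 5),
    ("Tablets", "TX", "Q1", 3),
    ("Tablets", "OK", "Q2", 7),
]


def slice_(facts, dim, value):
    """Slice: fix one dimension to a single value."""
    i = DIMS.index(dim)
    return [f for f in facts if f[i] == value]


def dice(facts, **allowed):
    """Dice: restrict several dimensions to sets of allowed values."""
    return [
        f for f in facts
        if all(f[DIMS.index(d)] in values for d, values in allowed.items())
    ]


def roll_up(facts, *keep):
    """Roll-up: sum units over every dimension not listed in `keep`.

    Drill-down is the reverse: call again with more dimensions in `keep`.
    """
    idx = [DIMS.index(d) for d in keep]
    totals = defaultdict(int)
    for f in facts:
        totals[tuple(f[i] for i in idx)] += f[3]
    return dict(totals)


def pivot(facts, row_dim, col_dim):
    """Pivot: arrange the aggregated units as rows x columns."""
    table = defaultdict(dict)
    for (row, col), units in roll_up(facts, row_dim, col_dim).items():
        table[row][col] = units
    return dict(table)


if __name__ == "__main__":
    print(roll_up(FACTS, "product"))            # coarse view
    print(roll_up(FACTS, "product", "region"))  # drill down by region
    print(pivot(FACTS, "product", "region"))
```

Here every operation rescans the fact rows; a warehouse schema and precomputed aggregates exist to avoid exactly that cost.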

Online Transaction Processing (OLTP) vs. Online Analytical Processing (OLAP)

Feature | OLTP | OLAP
users | clerk, IT professional | knowledge worker
function | day-to-day operations | decision support
DB design | application-oriented | subject-oriented
data | current, up-to-date; detailed, flat relational; isolated | historical; summarized, multidimensional; integrated, consolidated
usage | repetitive | ad-hoc
access | read/write; index/hash on primary key | lots of scans
unit of work | short, simple transaction | complex query
# records accessed | tens | millions
# users | thousands | hundreds
DB size | 100 MB to GB | 100 GB to TB
metric | transaction throughput | query throughput, response

Legal, Privacy and Security Issues?

Legal, Privacy and Security Issues
- Are we allowed to collect the data?
- Are we allowed to use the data?
- Is privacy preserved in the process?
- Is it ethical to use and act on the data?
Problem: The Internet is global but legislation is local!

Legal, Privacy and Security Issues
Data-Gathering via Apps Presents a Gray Legal Area
By Kevin J. O'Brien. Published: October 28, 2012
BERLIN - Angry Birds, the top-selling paid mobile app for the iPhone in the United States and Europe, has been downloaded more than a billion times by devoted game players around the world, who often spend hours slinging squawking fowl at groups of egg-stealing pigs. When Jason Hong, an associate professor at the Human-Computer Interaction Institute at Carnegie Mellon University, surveyed 40 users, all but two were unaware that the game was storing their locations so that they could later be the targets of ads...

Here is what the small print says...
USA Today Network. Josh Hafner, USA TODAY, 2:38 p.m. EDT, July 13, 2016
Pokémon Go's constant location tracking and camera access required for gameplay, paired with its skyrocketing popularity, could provide data like no app before it. "Their privacy policy is vague," Hong said. "I'd say deliberately vague, because of the lack of clarity on the business model." ... The agreement says Pokémon Go collects data about its users as a business asset. This includes data used to personally identify players, such as email addresses and other information pulled from the Google and Facebook accounts players use to sign up for the game. If Niantic is ever sold, the agreement states, all that data can go to another company.