Decision Support Systems 2012/2013. MEIC - TagusPark. Homework #1. Due: 11.Mar.2013
|
|
- Andrew Johns
- 5 years ago
- Views:
Transcription
1 Decision Support Systems 2012/2013 MEIC - TagusPark Homework #1 Due: 11.Mar Data Description and Pre-processing 1. A hospital is conducting a study on obesity in adult men and, as part of that study, tested the age and body fat for 18 randomly selected adults, with the following results: Age % Fat Age % Fat (a) ( 1 / 2 val.) Compute the mean, median and standard deviation for Age and % Fat. Include the expressions you used in your calculations and any additional elements you find relevant. Note: In your calculations, take into consideration that the data above concerns a sample of the population, not the whole population. The mean can be computed using the expression: X Age = 1 N N x i = i=1 XFat = 1 N N x i = i=1 The median can be computed upon sorting the data and determining the middle element. In this case, since we have an even number of data-points, median Age = x N/2 + x N/2+1 2 Finally, the (sample) variance can be computed as: = 33.5 median Fat = x N/2 + x N/2+1 2 = 17.4 and we get S 2 Age = 1 N 1 N (x i x Age ) 2 = SFat 2 = 1 N (x i x Fat ) 2 = N 1 i=1 s Age = i=1 SAge 2 = 8.71 s Fat = SFat 2 = 5.29.
2 Homework 1 Decision Support Systems Page 2 of 10 (b) ( 1 / 2 val.) Draw a scatter plot and a q-q plot based on the two variables. Include a brief explanation of the plots. The scatter plot is obtaining by plotting each pair of data-points (x Age, x Fat ) as they appear in the original table. The resulting plot is: 30 Scatter plot of Age vs. % of body fat 25 Body fat (\%) Age (years) The q-q plot, on the other hand, can be obtaining by pairing the quantiles of the two attributes. In this case, since both attributes have the same number of data-points, the q-q plot can easily be obtained by sorting the values in both attributes and plotting the resulting pairs, to yield: 30 Q Q plot of Age vs. % of body fat 25 Body fat (%) Age (years) (c) (1 val.) Normalize the two variables using min-max normalization, so that the data fits in the interval [0, 1]. Include the expressions you used in your calculations and any additional elements you find relevant.
3 Homework 1 Decision Support Systems Page 3 of 10 To normalize the two variables, we use, in each dataset, x i,norm = x i min j {x j } max j x j min j {x j } to get: Age norm % Fat norm Age norm % Fat norm (d) (1 val.) Calculate the correlation coefficient (Pearson s product moment coefficient) between the two attributes. Based on this computation, justify whether the two variables are positively or negatively correlated. The (sample) correlation coefficient betweeen Age and %Fat can be computed as corr(x, Y ) = 1 (N 1) N i=1 x i X s X yi Ȳ s Y. In our case, we have corr(age, %Fat) = 0.54 and we can conclude that, since corr(age, %Fat) > 0, Age and %Fat are positively correlated. 2. (3 val.) Suppose that you want to predict the value of some (discrete) variable Y knowing that some other variable, X, takes some given value, x. For example, returning to the setup of Question 1, you could be interested in predicting the value of % Fat for a person with Age = 35. Show that, in terms of expected squared error: E = E [ (Y c) 2 X = x ], the value c = E [Y X = x] is the best possible prediction for Y. Indicate all relevant computations. Suggestion: Compute the minimum of the above expression with respect to c. Recall that E [f(y ) X = x] denotes the expected value of f(y ) conditioned on the random variable X taking the value x, and is analytically given by E [f(y ) X = x] = f(y)p [Y = y X = x]. y In order to compute the value of c that minimizes the squared error E, we derive it with respect to
4 Homework 1 Decision Support Systems Page 4 of 10 c, to yield: de dc = d dc E [ (Y c) 2 X = x ] [ ] d = E dc (Y c)2 X = x = E [ 2(Y c) X = x] = 2E [Y X = x] + 2c. due to the linearity of E [ ] Equating the above expression to 0 and solving for c, we finally get: c = E [Y X = x]. 2 OLAP Queries 3. Suppose that a data warehouse for a technical support company is structured around three dimensions, Time, Technician and Client, and the measures Count and Charge. The measure Count keeps track of the number of times that a client/technicial was assisted/called upon. The measure Charge keeps track of the payments charged to each customer upon a visit by a technician. (a) (1 val.) Starting with the cuboid [Day, Technician, Client], what specific OLAP operations should be performed in order to list the total fee collected by each technician in 2012? Note: In your answer, you are free to assume any (reasonable) hierarchy for each of the different dimensions in the DW. You should indicate them in your answer. We assume that each of the three dimensions is organized according to the following hierarchies: Time is organized in the hierarchy Day-Month-Semester-Year-all Technician is organized in Technician-Expertise-Area-all Client is organized in Client-Neighborhood-City-State-all. Given this hierarchy, we would require the following OLAP operations: 1. A roll-up on the dimension Time from Day to Year. 2. A roll-up on the dimension Client from Client to all, to aggregate on this dimension. 3. A slice on the dimension Time to select Year = (b) (1 val.) Write a Transact-SQL query to obtain the same result, assuming that the data are stored in a relational database with the scheme Fee(Day, Month, Year, Technician, Store, Client, Charge). You should use the relevant OLAP operators you practiced in the lab session (indicate only the query). A possible T-SQL query would be:
5 Homework 1 Decision Support Systems Page 5 of 10 Technician, SUM(Charge) Fee WHERE Year = 2012 Technician WITH ROLLUP where we used ROLLUP to also include the total amount charged in 2012 (aggregated over all technicians). 4. (2 val.) Give an example of a query that uses grouping with ROLLUP that cannot be expressed by a single clause. Besides your query, you should indicate the clauses necessary to obtain the same information. In your example, you can use the JoBS database used in the lab session (indicate only the queries, not the result). Resorting to the JoBS database, as suggested, the query CASE WHEN GROUPING(E.EngineerName)=1 THEN All Engineers ELSE E.EngineerName END, CASE WHEN GROUPING(S.PartNumber)=1 THEN All Parts ELSE S.PartNumber END, SUM(S.UnitsHeld) EngineerStock S Engineers E S.EngineerId = E.EngineerId E.EngineerName, S.PartNumber WITH ROLLUP would require three clauses if the ROLLUP clause were not used, E.EngineerName, S.PartNumber, SUM(S.UnitsHeld) EngineerStock S Engineers E S.EngineerId = E.EngineerId E.EngineerName,
6 Homework 1 Decision Support Systems Page 6 of 10 S.PartNumber UNI E.EngineerName, All Parts, SUM(S.UnitsHeld) EngineerStock S Engineers E S.EngineerId = E.EngineerId E.EngineerName UNI All Engineers, All Parts, SUM(S.UnitsHeld) EngineerStock S Engineers E S.EngineerId = E.EngineerId in order to compute the total parts per Engineer and the total number of parts (overall). 2.1 Practical Questions (Using SQL Server 2008) For the following questions, you should use the AdventureWorksDW2012 database. To this purpose, at the beginning of your code you should include the following SQL statement: USE AdventureWorksDW2012; GO You can use the Object Explorer in the MS SQL Server Management Studio to explore the tables in this database as well as the attributes in each table. For completeness, Fig. 1 includes a (simplified) representation of the relevant tables and attributes for this homework, where the attributes in italic correspond to primary keys. 5. (3 val.) Write down the SQL query necessary to obtain the relation described in Fig. 2 from the table dbo.factinternetsales (see Fig. 1). This relation describes, for each order in dbo.factinternetsales, the first and last name of the corresponding customer, the postal code, state and country associated with the customer, the name of the product in the order, its shipping date, and the total amount paid by the customer. You need only to indicate the query, not the results. In terms of the relations in Fig. 1, The attribute OrderNumber in the new table corresponds to the attribute SalesOrderNumber in dbo.factinternetsales; The attribute State in the new table corresponds to the attribute StateProvinceCode in the table dbo.dimgeography;
7 Homework 1 Decision Support Systems Page 7 of 10 dbo.dimcustomer CustomerKey GeographyKey FirstName LastName BirthDate... dbo.factinternetsales SalesOrderNumber ProductKey ShipDateKey CustomerKey SalesAmount... dbo.dimgeography GeographyKey EnglishCountryRegionName City StateProvinceCode CountryRegionCode PostalCode... dbo.dimproduct ProductKey EnglishProductName ModelName ProductLine... dbo.dimdate DateKey FullDateAlternateKey CalendarYear... Figura 1: Simplified schema of the AdventureWorksDW2008 that includes only the relevant tables and attributes. (NoName) OrderNumber FirstName LastName PostalCode State Country Product ShipDate Total Figura 2: Table for question 5. The attribute Country in the new table corresponds to the attribute CountryRegionCode in the table dbo.geography; The attribute Product in the new table corresponds to the attribute EnglishProductName in dbo.dimproduct; The attribute ShipDate in the new table corresponds to the attribute FullDateAlternateKey in dbo.dimdate; The attribute Total in the new table corresponds to the attribute SalesAmount in the table dbo.factinternetsales; In your query you should use adequate JOIN operations. In particular, note that there may be orders for which not all information above may be available, but which should still be included in the results of your query. The SQL query would be: F.SalesOrderNumber AS OrderNumber, C.FirstName, C.LastName, G.PostalCode
8 Homework 1 Decision Support Systems Page 8 of 10 G.StateProvinceCode AS State, G.CountryRegionCode AS Country, P.EnglishProductName AS Product, D.FullDataAlternateKey AS ShipDate, F.SalesAmount AS Total dbo.factinternetsales F LEFT JOIN dbo.dimcustomer C F.CustomerKey = C.CustomerKey LEFT JOIN dbo.dimgeography G C.GeographyKey = G.GeographyKey LEFT JOIN dbo.dimproduct P F.ProductKey = P.ProductKey LEFT JOIN dbo.dimdate D F.ShipDateKey = D.DateKey 6. From the table dbo.factinternetsales (see Fig. 1), (a) (2 val.) Write down the SQL query necessary to determine the total sales amount per calendar year. You should include both the query and the obtained results. The SQL query is: D.CalendarYear AS Year, SUM(F.SalesAmount) AS Total dbo.factinternetsales F dbo.dimdate D F.ShipDateKey = D.DateKey D.CalendarYear The corresponding values are: Year Total ,517, ,158, ,105, ,576,978.98
9 Homework 1 Decision Support Systems Page 9 of 10 (b) (2 val.) With a single query, determine both the global and per year the total sales amount (Suggestion: Use a ROLLUP clause). You should include both the SQL query and the obtained results. The SQL query is: D.CalendarYear AS Year, SUM(F.SalesAmount) AS Total dbo.factinternetsales F dbo.dimdate D F.ShipDateKey = D.DateKey D.CalendarYear WITH ROLLUP The corresponding values are: Year Total ,105, ,576, ,517, ,158, All 29,358, (c) (3 val.) Using the CUBE clause, determine the total sales amount across the two dimensions: Year, corresponding to the CalendarYear attribute in table dbo.dimdate (associated with the shipping date), and Country, corresponding to the EnglishCountryRegionName attribute in table dbo.dimgeography. Write down the adequate SQL query and express the results as a crosstabulation. The SQL query is: D.CalendarYear as Year, G.EnglishCountryRegionName as Country, SUM(F.SalesAmount) as Total dbo.factinternetsales F dbo.dimdate D F.ShipDateKey = D.DateKey dbo.dimcustomer C F.CustomerKey = C.CustomerKey
10 Homework 1 Decision Support Systems Page 10 of 10 dbo.dimgeography G C.GeographyKey = G.GeographyKey D.CalendarYear, G.EnglishCountryRegionName WITH CUBE The corresponding cross-tabulation is: All Australia 1,251, ,166, ,002, ,641, ,061,000.6 Canada 143, , , , ,977,844.9 France 172, , , , ,644,017.7 Germany 219, , ,021, ,125, ,894,312.3 United Kingdom 280, , ,271, ,255, ,391,712.2 United States 1,038, ,172, ,721, ,457, ,389,789.5 All 3,105, ,576, ,517, ,158, ,358,677.2
--NOTE: When the task does not specify sort order, it is your responsibility to order the information -- so that is easy to interpret.
--* BUSIT 103 Assignment #8 DUE DATE: Consult course calendar Name: Christopher Singleton Class: BUSIT103 - Online Instructor: Art Lovestedt Date: 11/15/2014 --You are to develop SQL statements for each
More information/* Module 9 Subqueries
/* Module 9 Subqueries This first part of this demo uses the AdventureWorksDW2012 database which is the data warehouse that corresponds to the AdventureWorks2012 operational database. */ USE AdventureWorksDW2012;
More informationSample Dataedo documentation. Data warehouse. Documentation
Sample Dataedo documentation Data warehouse Documentation 2017-09-01 of Contents 1. Dimensions... 5 1.1. s...5 1.1.1. : dbo.dimaccount... 5 1.1.2. : dbo.dimcurrency... 6 1.1.3. : dbo.dimcustomer...6 1.1.4.
More informationAssignment 1. Question 1: Brock Wilcox CS
Assignment 1 Brock Wilcox wilcox6@uiuc.edu CS 412 2009-09-30 Question 1: In the introduction chapter, we have introduced different ways to perform data mining: (1) using a data mining language to write
More informationTemporal Data Warehouses: Logical Models and Querying
Temporal Data Warehouses: Logical Models and Querying Waqas Ahmed, Esteban Zimányi, Robert Wrembel waqas.ahmed@ulb.ac.be Université Libre de Bruxelles Poznan University of Technology April 2, 2015 ITBI
More informationData Analysis and Data Science
Data Analysis and Data Science CPS352: Database Systems Simon Miner Gordon College Last Revised: 4/29/15 Agenda Check-in Online Analytical Processing Data Science Homework 8 Check-in Online Analytical
More informationIn this task, you specify formatting properties for the currency and percentage measures in the Analysis Services Tutorial cube.
AL GHURAIR UNIVERSITY College of Computing CIS 303 Data Mining Week 3: Modifying Measures, Attributes and Hierarchies After defining, deploying, and processing your initial cube, and then reviewing dimension
More informationChapter 18: Data Analysis and Mining
Chapter 18: Data Analysis and Mining Database System Concepts See www.db-book.com for conditions on re-use Chapter 18: Data Analysis and Mining Decision Support Systems Data Analysis and OLAP 18.2 Decision
More informationSQL SERVER ASSIGNMENTS OPPORTUNITIES CREATE A SCHOOL DATABASE SIMILAR TO THIS
1 OPPORTUNITIES CREATE A SCHOOL DATABASE SIMILAR TO THIS 2 CREATE A LIBRARY MANAGEMENT SYSTEM DATABASE SIMILAR TO THIS. TABLES ASSIGNMENT CREATE A TABLE STRUCTURE SIMILAR TO THIS : 3 SELECT C.EnglishProductCategoryName,S.EnglishProductSubcategoryName,
More informationSection A. 1. a) Explain the evolution of information systems into today s complex information ecosystems and its consequences.
Section A 1. a) Explain the evolution of information systems into today s complex information ecosystems and its consequences. b) Discuss the reasons behind the phenomenon of data retention, its disadvantages,
More informationImplementing a Data Warehouse with SQL Server 2014
Training Handbook Implementing a Data Warehouse with SQL Server 2014 Some elements of this workshop are subject to change. This workshop is for informational purposes only. Module 2: Creating Multidimensional
More informationSQL Server Analysis Services
DataBase and Data Mining Group of DataBase and Data Mining Group of Database and data mining group, SQL Server 2005 Analysis Services SQL Server 2005 Analysis Services - 1 Analysis Services Database and
More informationCIS 611: ENTERPRISE DATABASE AND DATA WAREHOUSING. Project: Multidimensional OLAP Cube using Adventure Works Data Warehouse
CIS 611: ENTERPRISE DATABASE AND DATA WAREHOUSING Project: Multidimensional OLAP Cube using Adventure Works Data Warehouse Overview: A data warehouse is a centralized repository that stores data from multiple
More informationSSAS 2008 Tutorial: Understanding Analysis Services
Departamento de Engenharia Informática Sistemas de Informação e Bases de Dados Online Analytical Processing (OLAP) This tutorial has been copied from: https://www.accelebrate.com/sql_training/ssas_2008_tutorial.htm
More informationECLT 5810 Data Preprocessing. Prof. Wai Lam
ECLT 5810 Data Preprocessing Prof. Wai Lam Why Data Preprocessing? Data in the real world is imperfect incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate
More informationOLAP2 outline. Multi Dimensional Data Model. A Sample Data Cube
OLAP2 outline Multi Dimensional Data Model Need for Multi Dimensional Analysis OLAP Operators Data Cube Demonstration Using SQL Multi Dimensional Data Model Multi dimensional analysis is a popular approach
More informationDecision Support Systems aka Analytical Systems
Decision Support Systems aka Analytical Systems Decision Support Systems Systems that are used to transform data into information, to manage the organization: OLAP vs OLTP OLTP vs OLAP Transactions Analysis
More informationData Mining By IK Unit 4. Unit 4
Unit 4 Data mining can be classified into two categories 1) Descriptive mining: describes concepts or task-relevant data sets in concise, summarative, informative, discriminative forms 2) Predictive mining:
More informationETL and OLAP Systems
ETL and OLAP Systems Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Software Development Technologies Master studies, first semester
More informationData Mining: Concepts and Techniques. (3 rd ed.) Chapter 3. Chapter 3: Data Preprocessing. Major Tasks in Data Preprocessing
Data Mining: Concepts and Techniques (3 rd ed.) Chapter 3 1 Chapter 3: Data Preprocessing Data Preprocessing: An Overview Data Quality Major Tasks in Data Preprocessing Data Cleaning Data Integration Data
More informationA. Incorrect! This would be the negative of the range. B. Correct! The range is the maximum data value minus the minimum data value.
AP Statistics - Problem Drill 05: Measures of Variation No. 1 of 10 1. The range is calculated as. (A) The minimum data value minus the maximum data value. (B) The maximum data value minus the minimum
More informationUnit 7: Basics in MS Power BI for Excel 2013 M7-5: OLAP
Unit 7: Basics in MS Power BI for Excel M7-5: OLAP Outline: Introduction Learning Objectives Content Exercise What is an OLAP Table Operations: Drill Down Operations: Roll Up Operations: Slice Operations:
More information2. (a) Briefly discuss the forms of Data preprocessing with neat diagram. (b) Explain about concept hierarchy generation for categorical data.
Code No: M0502/R05 Set No. 1 1. (a) Explain data mining as a step in the process of knowledge discovery. (b) Differentiate operational database systems and data warehousing. [8+8] 2. (a) Briefly discuss
More informationData warehouses Decision support The multidimensional model OLAP queries
Data warehouses Decision support The multidimensional model OLAP queries Traditional DBMSs are used by organizations for maintaining data to record day to day operations On-line Transaction Processing
More informationGetting to Know Your Data
Chapter 2 Getting to Know Your Data 2.1 Exercises 1. Give three additional commonly used statistical measures (i.e., not illustrated in this chapter) for the characterization of data dispersion, and discuss
More informationImplementing and Maintaining Microsoft SQL Server 2008 Analysis Services
Course 6234A: Implementing and Maintaining Microsoft SQL Server 2008 Analysis Services Course Details Course Outline Module 1: Introduction to Microsoft SQL Server Analysis Services This module introduces
More informationData Warehousing and Decision Support. Introduction. Three Complementary Trends. [R&G] Chapter 23, Part A
Data Warehousing and Decision Support [R&G] Chapter 23, Part A CS 432 1 Introduction Increasingly, organizations are analyzing current and historical data to identify useful patterns and support business
More informationRegression Analysis and Linear Regression Models
Regression Analysis and Linear Regression Models University of Trento - FBK 2 March, 2015 (UNITN-FBK) Regression Analysis and Linear Regression Models 2 March, 2015 1 / 33 Relationship between numerical
More informationData Science. Data Analyst. Data Scientist. Data Architect
Data Science Data Analyst Data Analysis in Excel Programming in R Introduction to Python/SQL/Tableau Data Visualization in R / Tableau Exploratory Data Analysis Data Scientist Inferential Statistics &
More informationSQL Server 2005 Analysis Services
atabase and ata Mining Group of atabase and ata Mining Group of atabase and ata Mining Group of atabase and ata Mining Group of atabase and ata Mining Group of atabase and ata Mining Group of SQL Server
More informationSTA 570 Spring Lecture 5 Tuesday, Feb 1
STA 570 Spring 2011 Lecture 5 Tuesday, Feb 1 Descriptive Statistics Summarizing Univariate Data o Standard Deviation, Empirical Rule, IQR o Boxplots Summarizing Bivariate Data o Contingency Tables o Row
More informationData Warehousing and Decision Support
Data Warehousing and Decision Support Chapter 23, Part A Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke 1 Introduction Increasingly, organizations are analyzing current and historical
More informationData Warehousing 2. ICS 421 Spring Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa
ICS 421 Spring 2010 Data Warehousing 2 Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 3/30/2010 Lipyeow Lim -- University of Hawaii at Manoa 1 Data Warehousing
More informationData Warehousing and Decision Support
Data Warehousing and Decision Support [R&G] Chapter 23, Part A CS 4320 1 Introduction Increasingly, organizations are analyzing current and historical data to identify useful patterns and support business
More informationCHAPTER-13. Mining Class Comparisons: Discrimination between DifferentClasses: 13.4 Class Description: Presentation of Both Characterization and
CHAPTER-13 Mining Class Comparisons: Discrimination between DifferentClasses: 13.1 Introduction 13.2 Class Comparison Methods and Implementation 13.3 Presentation of Class Comparison Descriptions 13.4
More informationCHAPTER 8 DECISION SUPPORT V2 ADVANCED DATABASE SYSTEMS. Assist. Prof. Dr. Volkan TUNALI
CHAPTER 8 DECISION SUPPORT V2 ADVANCED DATABASE SYSTEMS Assist. Prof. Dr. Volkan TUNALI Topics 2 Business Intelligence (BI) Decision Support System (DSS) Data Warehouse Online Analytical Processing (OLAP)
More informationData Preprocessing. S1 Teknik Informatika Fakultas Teknologi Informasi Universitas Kristen Maranatha
Data Preprocessing S1 Teknik Informatika Fakultas Teknologi Informasi Universitas Kristen Maranatha 1 Why Data Preprocessing? Data in the real world is dirty incomplete: lacking attribute values, lacking
More informationSql Fact Constellation Schema In Data Warehouse With Example
Sql Fact Constellation Schema In Data Warehouse With Example Data Warehouse OLAP - Learn Data Warehouse in simple and easy steps using Multidimensional OLAP (MOLAP), Hybrid OLAP (HOLAP), Specialized SQL
More informationCall: SAS BI Course Content:35-40hours
SAS BI Course Content:35-40hours Course Outline SAS Data Integration Studio 4.2 Introduction * to SAS DIS Studio Features of SAS DIS Studio Tasks performed by SAS DIS Studio Navigation to SAS DIS Studio
More informationCourse Number : SEWI ZG514 Course Title : Data Warehousing Type of Exam : Open Book Weightage : 60 % Duration : 180 Minutes
Birla Institute of Technology & Science, Pilani Work Integrated Learning Programmes Division M.S. Systems Engineering at Wipro Info Tech (WIMS) First Semester 2014-2015 (October 2014 to March 2015) Comprehensive
More informationData Warehouses. Yanlei Diao. Slides Courtesy of R. Ramakrishnan and J. Gehrke
Data Warehouses Yanlei Diao Slides Courtesy of R. Ramakrishnan and J. Gehrke Introduction v In the late 80s and early 90s, companies began to use their DBMSs for complex, interactive, exploratory analysis
More informationReal-World Performance Training Dimensional Queries
Real-World Performance Training al Queries Real-World Performance Team Agenda 1 2 3 4 5 The DW/BI Death Spiral Parallel Execution Loading Data Exadata and Database In-Memory al Queries al Queries 1 2 3
More informationTesting Masters Technologies
1. What is Data warehouse ETL TESTING Q&A Ans: A Data warehouse is a subject oriented, integrated,time variant, non volatile collection of data in support of management's decision making process. Subject
More informationEXAMGOOD QUESTION & ANSWER. Accurate study guides High passing rate! Exam Good provides update free of charge in one year!
EXAMGOOD QUESTION & ANSWER Exam Good provides update free of charge in one year! Accurate study guides High passing rate! http://www.examgood.com Exam : 70-460 Title : Transition Your MCITP: Business Intelligence
More informationData Mining. 2.4 Data Integration. Fall Instructor: Dr. Masoud Yaghini. Data Integration
Data Mining 2.4 Fall 2008 Instructor: Dr. Masoud Yaghini Data integration: Combines data from multiple databases into a coherent store Denormalization tables (often done to improve performance by avoiding
More informationAdnan YAZICI Computer Engineering Department
Data Warehouse Adnan YAZICI Computer Engineering Department Middle East Technical University, A.Yazici, 2010 Definition A data warehouse is a subject-oriented integrated time-variant nonvolatile collection
More informationBasics of Dimensional Modeling
Basics of Dimensional Modeling Data warehouse and OLAP tools are based on a dimensional data model. A dimensional model is based on dimensions, facts, cubes, and schemas such as star and snowflake. Dimension
More informationBusiness Analytics in the Oracle 12.2 Database: Analytic Views. Event: BIWA 2017 Presenter: Dan Vlamis and Cathye Pendley Date: January 31, 2017
Business Analytics in the Oracle 12.2 Database: Analytic Views Event: BIWA 2017 Presenter: Dan Vlamis and Cathye Pendley Date: January 31, 2017 Vlamis Software Solutions Vlamis Software founded in 1992
More informationAcknowledgment. MTAT Data Mining. Week 7: Online Analytical Processing and Data Warehouses. Typical Data Analysis Process.
MTAT.03.183 Data Mining Week 7: Online Analytical Processing and Data Warehouses Marlon Dumas marlon.dumas ät ut. ee Acknowledgment This slide deck is a mashup of the following publicly available slide
More informationImproving the Performance of OLAP Queries Using Families of Statistics Trees
Improving the Performance of OLAP Queries Using Families of Statistics Trees Joachim Hammer Dept. of Computer and Information Science University of Florida Lixin Fu Dept. of Mathematical Sciences University
More informationData Warehousing & OLAP
CMPUT 391 Database Management Systems Data Warehousing & OLAP Textbook: 17.1 17.5 (first edition: 19.1 19.5) Based on slides by Lewis, Bernstein and Kifer and other sources University of Alberta 1 Why
More informationDta Mining and Data Warehousing
CSCI6405 Fall 2003 Dta Mining and Data Warehousing Instructor: Qigang Gao, Office: CS219, Tel:494-3356, Email: q.gao@dal.ca Teaching Assistant: Christopher Jordan, Email: cjordan@cs.dal.ca Office Hours:
More informationData Mining and Analytics. Introduction
Data Mining and Analytics Introduction Data Mining Data mining refers to extracting or mining knowledge from large amounts of data It is also termed as Knowledge Discovery from Data (KDD) Mostly, data
More informationCSE 544 Principles of Database Management Systems. Alvin Cheung Fall 2015 Lecture 8 - Data Warehousing and Column Stores
CSE 544 Principles of Database Management Systems Alvin Cheung Fall 2015 Lecture 8 - Data Warehousing and Column Stores Announcements Shumo office hours change See website for details HW2 due next Thurs
More informationCS570: Introduction to Data Mining
CS570: Introduction to Data Mining Fall 2013 Reading: Chapter 3 Han, Chapter 2 Tan Anca Doloc-Mihu, Ph.D. Some slides courtesy of Li Xiong, Ph.D. and 2011 Han, Kamber & Pei. Data Mining. Morgan Kaufmann.
More informationData Mining: Exploring Data. Lecture Notes for Chapter 3
Data Mining: Exploring Data Lecture Notes for Chapter 3 1 What is data exploration? A preliminary exploration of the data to better understand its characteristics. Key motivations of data exploration include
More informationInformation Management course
Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 07 : 06/11/2012 Data Mining: Concepts and Techniques (3 rd ed.) Chapter
More informationDeccansoft Software Services Microsoft Silver Learning Partner. SSAS Syllabus
Overview: Analysis Services enables you to analyze large quantities of data. With it, you can design, create, and manage multidimensional structures that contain detail and aggregated data from multiple
More informationResearch Methods for Business and Management. Session 8a- Analyzing Quantitative Data- using SPSS 16 Andre Samuel
Research Methods for Business and Management Session 8a- Analyzing Quantitative Data- using SPSS 16 Andre Samuel A Simple Example- Gym Purpose of Questionnaire- to determine the participants involvement
More informationAn Overview of Data Warehousing and OLAP Technology
An Overview of Data Warehousing and OLAP Technology CMPT 843 Karanjit Singh Tiwana 1 Intro and Architecture 2 What is Data Warehouse? Subject-oriented, integrated, time varying, non-volatile collection
More informationOFFICIAL MICROSOFT LEARNING PRODUCT 10778A. Implementing Data Models and Reports with Microsoft SQL Server 2012 Companion Content
OFFICIAL MICROSOFT LEARNING PRODUCT 10778A Implementing Data Models and Reports with Microsoft SQL Server 2012 Companion Content 2 Implementing Data Models and Reports with Microsoft SQL Server 2012 Information
More informationData Warehousing and Data Mining SQL OLAP Operations
Data Warehousing and Data Mining SQL OLAP Operations SQL table expression query specification query expression SQL OLAP GROUP BY extensions: rollup, cube, grouping sets Acknowledgements: I am indebted
More informationImplementing Data Models and Reports with SQL Server 2014
Course 20466D: Implementing Data Models and Reports with SQL Server 2014 Page 1 of 6 Implementing Data Models and Reports with SQL Server 2014 Course 20466D: 4 days; Instructor-Led Introduction The focus
More informationSyllabus. Syllabus. Motivation Decision Support. Syllabus
Presentation: Sophia Discussion: Tianyu Metadata Requirements and Conclusion 3 4 Decision Support Decision Making: Everyday, Everywhere Decision Support System: a class of computerized information systems
More informationBUSINESS ANALYTICS. 96 HOURS Practical Learning. DexLab Certified. Training Module. Gurgaon (Head Office)
SAS (Base & Advanced) Analytics & Predictive Modeling Tableau BI 96 HOURS Practical Learning WEEKDAY & WEEKEND BATCHES CLASSROOM & LIVE ONLINE DexLab Certified BUSINESS ANALYTICS Training Module Gurgaon
More informationData Mining: Exploring Data. Lecture Notes for Chapter 3. Introduction to Data Mining
Data Mining: Exploring Data Lecture Notes for Chapter 3 Introduction to Data Mining by Tan, Steinbach, Kumar What is data exploration? A preliminary exploration of the data to better understand its characteristics.
More informationLectures for the course: Data Warehousing and Data Mining (IT 60107)
Lectures for the course: Data Warehousing and Data Mining (IT 60107) Week 1 Lecture 1 21/07/2011 Introduction to the course Pre-requisite Expectations Evaluation Guideline Term Paper and Term Project Guideline
More informationSAS Visual Analytics 8.2: Working with Report Content
SAS Visual Analytics 8.2: Working with Report Content About Objects After selecting your data source and data items, add one or more objects to display the results. SAS Visual Analytics provides objects
More informationData Preprocessing. Why Data Preprocessing? MIT-652 Data Mining Applications. Chapter 3: Data Preprocessing. Multi-Dimensional Measure of Data Quality
Why Data Preprocessing? Data in the real world is dirty incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate data e.g., occupation = noisy: containing
More informationBig Data 13. Data Warehousing
Ghislain Fourny Big Data 13. Data Warehousing fotoreactor / 123RF Stock Photo The road to analytics Aurelio Scetta / 123RF Stock Photo Another history of data management (T. Hofmann) 1970s 2000s Age of
More informationData Mining: Exploring Data. Lecture Notes for Data Exploration Chapter. Introduction to Data Mining
Data Mining: Exploring Data Lecture Notes for Data Exploration Chapter Introduction to Data Mining by Tan, Steinbach, Karpatne, Kumar 02/03/2018 Introduction to Data Mining 1 What is data exploration?
More informationData Mining. ❷Chapter 2 Basic Statistics. Asso.Prof.Dr. Xiao-dong Zhu. Business School, University of Shanghai for Science & Technology
❷Chapter 2 Basic Statistics Business School, University of Shanghai for Science & Technology 2016-2017 2nd Semester, Spring2017 Contents of chapter 1 1 recording data using computers 2 3 4 5 6 some famous
More informationAdvanced Data Management Technologies
ADMT 2017/18 Unit 10 J. Gamper 1/37 Advanced Data Management Technologies Unit 10 SQL GROUP BY Extensions J. Gamper Free University of Bozen-Bolzano Faculty of Computer Science IDSE Acknowledgements: I
More informationData Mining. Part 2. Data Understanding and Preparation. 2.4 Data Transformation. Spring Instructor: Dr. Masoud Yaghini. Data Transformation
Data Mining Part 2. Data Understanding and Preparation 2.4 Spring 2010 Instructor: Dr. Masoud Yaghini Outline Introduction Normalization Attribute Construction Aggregation Attribute Subset Selection Discretization
More information1. Basic Steps for Data Analysis Data Editor. 2.4.To create a new SPSS file
1 SPSS Guide 2009 Content 1. Basic Steps for Data Analysis. 3 2. Data Editor. 2.4.To create a new SPSS file 3 4 3. Data Analysis/ Frequencies. 5 4. Recoding the variable into classes.. 5 5. Data Analysis/
More informationData Warehouses Chapter 12. Class 10: Data Warehouses 1
Data Warehouses Chapter 12 Class 10: Data Warehouses 1 OLTP vs OLAP Operational Database: a database designed to support the day today transactions of an organization Data Warehouse: historical data is
More information1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda
Agenda Oracle9i Warehouse Review Dulcian, Inc. Oracle9i Server OLAP Server Analytical SQL Mining ETL Infrastructure 9i Warehouse Builder Oracle 9i Server Overview E-Business Intelligence Platform 9i Server:
More informationData Warehousing Conclusion. Esteban Zimányi Slides by Toon Calders
Data Warehousing Conclusion Esteban Zimányi ezimanyi@ulb.ac.be Slides by Toon Calders Motivation for the Course Database = a piece of software to handle data: Store, maintain, and query Most ideal system
More informationcollection of data that is used primarily in organizational decision making.
Data Warehousing A data warehouse is a special purpose database. Classic databases are generally used to model some enterprise. Most often they are used to support transactions, a process that is referred
More information8: Statistics. Populations and Samples. Histograms and Frequency Polygons. Page 1 of 10
8: Statistics Statistics: Method of collecting, organizing, analyzing, and interpreting data, as well as drawing conclusions based on the data. Methodology is divided into two main areas. Descriptive Statistics:
More informationReminds on Data Warehousing
BUSINESS INTELLIGENCE Reminds on Data Warehousing (details at the Decision Support Database course) Business Informatics Degree BI Architecture 2 Business Intelligence Lab Star-schema datawarehouse 3 time
More informationYear 10 General Mathematics Unit 2
Year 11 General Maths Year 10 General Mathematics Unit 2 Bivariate Data Chapter 4 Chapter Four 1 st Edition 2 nd Edition 2013 4A 1, 2, 3, 4, 6, 7, 8, 9, 10, 11 1, 2, 3, 4, 6, 7, 8, 9, 10, 11 2F (FM) 1,
More informationData Modeling and Databases Ch 7: Schemas. Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich
Data Modeling and Databases Ch 7: Schemas Gustavo Alonso, Ce Zhang Systems Group Department of Computer Science ETH Zürich Database schema A Database Schema captures: The concepts represented Their attributes
More informationWelcome to the topic of SAP HANA modeling views.
Welcome to the topic of SAP HANA modeling views. 1 At the end of this topic, you will be able to describe the three types of SAP HANA modeling views and use the SAP HANA Studio to work with views in the
More informationFurther Maths Notes. Common Mistakes. Read the bold words in the exam! Always check data entry. Write equations in terms of variables
Further Maths Notes Common Mistakes Read the bold words in the exam! Always check data entry Remember to interpret data with the multipliers specified (e.g. in thousands) Write equations in terms of variables
More informationWriting Queries Using Microsoft SQL Server 2008 Transact- SQL
Writing Queries Using Microsoft SQL Server 2008 Transact- SQL Course 2778-08; 3 Days, Instructor-led Course Description This 3-day instructor led course provides students with the technical skills required
More informationDecision Support Systems 2012/2013. MEIC - TagusPark. Homework #5. Due: 15.Apr.2013
Decision Support Systems 2012/2013 MEIC - TagusPark Homework #5 Due: 15.Apr.2013 1 Frequent Pattern Mining 1. Consider the database D depicted in Table 1, containing five transactions, each containing
More informationPart I, Chapters 4 & 5. Data Tables and Data Analysis Statistics and Figures
Part I, Chapters 4 & 5 Data Tables and Data Analysis Statistics and Figures Descriptive Statistics 1 Are data points clumped? (order variable / exp. variable) Concentrated around one value? Concentrated
More informationINTERMEDIATE SQL GOING BEYOND THE SELECT. Created by Brian Duffey
INTERMEDIATE SQL GOING BEYOND THE SELECT Created by Brian Duffey WHO I AM Brian Duffey 3 years consultant at michaels, ross, and cole 9+ years SQL user What have I used SQL for? ROADMAP Introduction 1.
More informationRelations and Functions 2.1
Relations and Functions 2.1 4 A 2 B D -5 5 E -2 C F -4 Relation a set of ordered pairs (Domain, Range). Mapping shows how each number of the domain is paired with each member of the range. Example 1 (2,
More informationImplementing and Maintaining Microsoft SQL Server 2005 Analysis Services
Implementing and Maintaining Microsoft SQL Server 2005 Analysis Services Introduction Elements of this syllabus are subject to change. This three-day instructor-led course teaches students how to implement
More informationChoosing the Right Procedure
3 CHAPTER 1 Choosing the Right Procedure Functional Categories of Base SAS Procedures 3 Report Writing 3 Statistics 3 Utilities 4 Report-Writing Procedures 4 Statistical Procedures 6 Available Statistical
More informationHyperion Interactive Reporting Reports & Dashboards Essentials
Oracle University Contact Us: +27 (0)11 319-4111 Hyperion Interactive Reporting 11.1.1 Reports & Dashboards Essentials Duration: 5 Days What you will learn The first part of this course focuses on two
More informationHomework set 4 - Solutions
Homework set 4 - Solutions Math 3200 Renato Feres 1. (Eercise 4.12, page 153) This requires importing the data set for Eercise 4.12. You may, if you wish, type the data points into a vector. (a) Calculate
More informationWriting Analytical Queries for Business Intelligence
MOC-55232 Writing Analytical Queries for Business Intelligence 3 Days Overview About this Microsoft SQL Server 2016 Training Course This three-day instructor led Microsoft SQL Server 2016 Training Course
More informationComputational Databases: Inspirations from Statistical Software. Linnea Passing, Technical University of Munich
Computational Databases: Inspirations from Statistical Software Linnea Passing, linnea.passing@tum.de Technical University of Munich Data Science Meets Databases Data Cleansing Pipelines Fuzzy joins Data
More informationSAS Web Report Studio 3.1
SAS Web Report Studio 3.1 User s Guide SAS Documentation The correct bibliographic citation for this manual is as follows: SAS Institute Inc. 2006. SAS Web Report Studio 3.1: User s Guide. Cary, NC: SAS
More informationStatistics: Normal Distribution, Sampling, Function Fitting & Regression Analysis (Grade 12) *
OpenStax-CNX module: m39305 1 Statistics: Normal Distribution, Sampling, Function Fitting & Regression Analysis (Grade 12) * Free High School Science Texts Project This work is produced by OpenStax-CNX
More informationMultiple Regression White paper
+44 (0) 333 666 7366 Multiple Regression White paper A tool to determine the impact in analysing the effectiveness of advertising spend. Multiple Regression In order to establish if the advertising mechanisms
More informationQUALITY MONITORING AND
BUSINESS INTELLIGENCE FOR CMS DATA QUALITY MONITORING AND DATA CERTIFICATION. Author: Daina Dirmaite Supervisor: Broen van Besien CERN&Vilnius University 2016/08/16 WHAT IS BI? Business intelligence is
More information