CIS 611: ENTERPRISE DATABASE AND DATA WAREHOUSING. Project: Multidimensional OLAP Cube using Adventure Works Data Warehouse

Similar documents
Implementing a Data Warehouse with SQL Server 2014

Designing and Building a Data Mining Project with Cube and OLAP

SQL Server Analysis Services

In this task, you specify formatting properties for the currency and percentage measures in the Analysis Services Tutorial cube.

Hands-On Lab. Lab: Developing BI Applications. Lab version: Last updated: 2/23/2011

Building a Data Mining Model Using MS Data Integration Service for Business Intelligence

SSAS 2008 Tutorial: Understanding Analysis Services

Implementing and Maintaining Microsoft SQL Server 2008 Analysis Services

Deccansoft Software Services Microsoft Silver Learning Partner. SSAS Syllabus

SQL Server 2005 Analysis Services

BI4Dynamics Customization Manual

COURSE 20466D: IMPLEMENTING DATA MODELS AND REPORTS WITH MICROSOFT SQL SERVER

Hands-On Lab. Developing BI Applications. Lab version: Last updated: 2/23/2011

Getting Started enterprise 88. Oracle Warehouse Builder 11gR2: operational data warehouse. Extract, Transform, and Load data to

A Case Study Building Financial Report and Dashboard Using OBIEE Part I

Lesson 3: Building a Market Basket Scenario (Intermediate Data Mining Tutorial)

Implementing Data Models and Reports with Microsoft SQL Server Exam Summary Syllabus Questions

20466C - Version: 1. Implementing Data Models and Reports with Microsoft SQL Server

Aggregating Knowledge in a Data Warehouse and Multidimensional Analysis

Quality Gates User guide

6234A - Implementing and Maintaining Microsoft SQL Server 2008 Analysis Services

IMPLEMENTING STATISTICAL DOMAIN DATABASES IN POLAND. OPPORTUNITIES AND THREATS. Central Statistical Office in Poland

SAS Data Integration Studio 3.3. User s Guide

Unit 7: Basics in MS Power BI for Excel 2013 M7-5: OLAP

Implementing Data Models and Reports with SQL Server 2014

UNIT

Implementing and Maintaining Microsoft SQL Server 2005 Analysis Services

After completing this course, participants will be able to:

4 Introduction to Web Intelligence

An Overview of Data Warehousing and OLAP Technology

Training 24x7 DBA Support Staffing. MCSA:SQL 2016 Business Intelligence Development. Implementing an SQL Data Warehouse. (40 Hours) Exam

Getting Started Guide. ProClarity Analytics Platform 6. ProClarity Professional

Overview of Reporting in the Business Information Warehouse

Matière : System décisionnel

DATA MINING TRANSACTION

Recently Updated Dumps from PassLeader with VCE and PDF (Question 1 - Question 15)

Deltek Vision 7.1. Installation and Configuration Guide for Performance Management. (Analysis Cubes and Performance Dashboards)

Reporting With SAP Crystal Reports

Upgrade: Transition Your MCITP SQL Server 2005 BI Developer to MCITP SQL Server 2008 BI Developer

Basics of Dimensional Modeling

SAS Web Report Studio 3.1

Creating a target user and module

Cube Designer User Guide SAP BusinessObjects Financial Consolidation, Cube Designer 10.0

DEVELOPING SQL DATA MODELS

exam.105q Microsoft Implementing Data Models and Reports with Microsoft SQL Server 2014

SAMPLE. Preface xi 1 Introducting Microsoft Analysis Services 1

IDU0010 ERP,CRM ja DW süsteemid Loeng 5 DW concepts. Enn Õunapuu

1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda

CertifyMe. CertifyMe

A Journey to Power BI

Jet Data Manager 2014 Product Enhancements

Microsoft SQL Server Training Course Catalogue. Learning Solutions

PASS4TEST. IT Certification Guaranteed, The Easy Way! We offer free update service for one year

HYPERION ESSBASE INITIAL EXERCISE

Chapter 3. The Multidimensional Model: Basic Concepts. Introduction. The multidimensional model. The multidimensional model

Business Analytics in the Oracle 12.2 Database: Analytic Views. Event: BIWA 2017 Presenter: Dan Vlamis and Cathye Pendley Date: January 31, 2017

BI4Dynamics AX/NAV Integrate external data sources

OLAP Introduction and Overview

Development of an interface that allows MDX based data warehouse queries by less experienced users

BUSINESS INTELLIGENCE. SSAS - SQL Server Analysis Services. Business Informatics Degree

Venezuela: Teléfonos: / Colombia: Teléfonos:

to-end Solution Using OWB and JDeveloper to Analyze Your Data Warehouse

InfoSphere Warehouse V9.5 Exam.

Module 2: Creating Multidimensional Analysis Solutions

Designing SQL Server 2012 Analysis Services Cubes using Samsclub_Star Dataset

1. Attempt any two of the following: 10 a. State and justify the characteristics of a Data Warehouse with suitable examples.

Developing SQL Data Models

Data Warehousing. Overview

MICROSOFT EXAM QUESTIONS & ANSWERS

Accurate study guides, High passing rate! Testhorse provides update free of charge in one year!

Partner Presentation Faster and Smarter Data Warehouses with Oracle OLAP 11g

DATA WAREHOUSE EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY

OSR Administration 3.7 User Guide. Updated:

Guide Users along Information Pathways and Surf through the Data

Convert Point of Sale Enterprise database to Point of Sale Professional database

EXAMGOOD QUESTION & ANSWER. Accurate study guides High passing rate! Exam Good provides update free of charge in one year!

A case study to introduce Microsoft Data Mining in the database course

DATA MINING AND WAREHOUSING

Kyubit Business Intelligence OLAP analysis - User Manual

Pentaho Aggregation Designer User Guide

The strategic advantage of OLAP and multidimensional analysis

MICROSOFT BUSINESS INTELLIGENCE (MSBI: SSIS, SSRS and SSAS)

Data warehouses Decision support The multidimensional model OLAP queries

Business Intelligence Tutorial

CS 1655 / Spring 2013! Secure Data Management and Web Applications

Open source business analytics. William D. Back Nicholas Goodman Julian Hyde MANNING SAMPLE CHAPTER

Foundations of SQL Server 2008 R2 Business. Intelligence. Second Edition. Guy Fouche. Lynn Lang it. Apress*

Microsoft.Visualexams v by.saveq.70q

Visit our Web site at or call to learn about training classes that are added throughout the year.

Microsoft Developing SQL Data Models.

REPORTING AND QUERY TOOLS AND APPLICATIONS

Data Warehousing. Jens Teubner, TU Dortmund Winter 2015/16. Jens Teubner Data Warehousing Winter 2015/16 1

Data Warehousing and OLAP

Hyperion Interactive Reporting Reports & Dashboards Essentials

Using the Palladium Business Intelligence Functionality

SAS Information Map Studio 3.1: Tips and Techniques

Management Reports Centre. User Guide. Emmanuel Amekuedi

Project management integrated into Outlook

Page 1. Oracle9i OLAP. Agenda. Mary Rehus Sales Consultant Patrick Larkin Vice President, Oracle Consulting. Oracle Corporation. Business Intelligence

Implementing Data Models and Reports with Microsoft SQL Server 2012

Transcription:

CIS 611: ENTERPRISE DATABASE AND DATA WAREHOUSING Project: Multidimensional OLAP Cube using Adventure Works Data Warehouse

Overview: A data warehouse is a centralized repository that stores data from multiple sources and converts those data into a cube/multidimensional data, which are very efficient for querying and analysing the data. This Multidimensional cube has two uses such as OLAP and Data Mining which are two technologies that are been used for Business Intelligence. OLAP database are divided into cubes (depending on the requirement, per se: user can keep entire database in a single cube, or they can keep each record in per cube like financial data, sales, production, etc like that). Using this it is used to analyse the data and predict the future based on the existing data. The data s that are store in the cube are all pre-aggregated, so even if it retrieves millions of information from the database, it will still be efficient and fast. Likewise, Data mining will also do prediction by analysing the data and generates who all are the possible buyers. This prediction is done by using few algorithms such as Microsoft clustering, Microsoft Decision Trees, Microsoft Linear Regression, Microsoft Linear Regression, Microsoft Naïve Bayes, Microsoft Time Series, Microsoft Sequence Clustering, Microsoft Logistic Regression (These are some most commonly and popular data mining algorithms used for analysing data and predicting data). Tools required: 1. Microsoft SQL Server 2016 or latest 2. Microsoft SQL Server Management Studio 2017 or latest 3. Microsoft Visual Studio SSDT 2017 or latest The database used in this project is Adventure Works 2014 (Data warehouse). 1. Adventure Works Database: This is a fictional multinational bicycle company, with their headquarters located in Bothell, Washington and consists around 300 employees, 29 as sales representatives. The company s sales are done via two methods (i). Online sales and, (ii). Reseller sales. The company mainly focuses on United States, United Kingdom, Canada, Australia, France (Europe) and few more places. The products that are sold by this company are: o Bikes - They have many bike types but their preferences on bike products are narrowed down to 3 bikes (Mountain, Road and, Touring) based sales. o Clothing The clothing products that are sold are short, jerseys, socks, etc. o Components They sell all the components that are required for repairing and spares for their manufactured bikes only. o Accessories They sell products like bike stand, rack stand, peddles, handle bar rods, etc. The database contains all the details relating to sales, production, geography, etc. The sales database contains more information like mode of sales: internet or reseller. We can analyse the sales based on the Geography (country, city, state, region), Promotions (reseller, old products, excess products, no discount, etc), etc.

2. Build Database for Data mining, Multidimensional: Firstly, we will need DW (Data warehouse) database. To get the database that is used here: https://msftdbprodsamples.codeplex.com/downloads/get/880664 1. Extract the downloaded database to C:\Program Files\Microsoft SQL Server\MSSQL13. (username) \MSSQL\Backup\ Then Open SQL Server and connect to Database Engine. Now, right click on database engine and select restore option and in that go to the specified path where we stored/saved the extracted file. Or, you can use the query to import the database: USE [master] RESTORE DATABASE [AdventureWorksDW2014] FROM DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL13.(username) \MSSQL\Backup\AdventureWorksDW2014.bak' WITH FILE = 1, NOUNLOAD, STATS = 5 GO 3. Building Cube: 3.1) Open Visual studio SSDT, click new project -> new project. Then, in new project in installed look for Business Intelligence and in that select Analysis Services Multidimensional and Data Mining and, give any name. After clicking, it will look like this.

3.2) Data Source: Data source is used to import data from multiple sources and store it. Use the data s stored here we will build the cube. Steps to create data source: 1. Right click on data source and click new, then click next on that page. 2. In this page Click create a data source based on an existing or new creation (if you use it for the first time), now a dialog box like this pop up. In this pop up, under server name: write your server name mentioned from SQL server or you can just keep a. And keep authentication to windows then click Test Connection which will show if it was success or not.

After this is done, next important thing is to select the database that you want to use. Then click ok.

Impersonation Information will pop up in that select Use a specific Windows user name and password, in that type your windows ID and password then click next Then give data source a name and click finish. 2.3) Data Source View:

Now right click on data source view and click new Data source view, then click next on the page that that pops up. In this it will show relational data sources that are available from data source, select the appropriate database and click next. Then a new page called as select tables and views will come. Select necessary tables but I have imported all the data and click ok. In the next page give a name and finish it.

Once it is done, a star schema will be formed where it will point to the relationships between each table. Star Schema, Fact Tables and Dimension Tables: 3.4) Cube: Star Schema: This is the simplest way to see how each table to related to other tables and by which column. It is called as star schema as the table in centre points to other tables gives a look like star. Fact Table: It contains pre-aggregated data (especially with numeric data, and fact table is otherwise also known as Measures.) Dimension table: This contains all kinds of data in it, but it does not have pre-aggregated data for numeric like Fact Table. This table allows the user to create hierarchy for each table as per the requirement. OLAP Cube is used to storage MDX (Multidimensional) data, by using this we will analyse the data and predic the data patterns. Steps for Building and Deploying a Cube:

Right click on the cube and select New Cube and a wizard pops up in that click next. Select Creation Method: In this cube select Use existing table and click next. Select Measures Group Tables: Measures is not but data that are used for calculation and in this process, it will collect all the data and pre-aggregate it. -> Select data that should be selected, or you can also click suggest (by default, it will select all the tables that contains numeric data type fields.) If it missed any data that needs to be measured, we can select that or go with that. Select Measures: By default, everything will be selected. Just click Next. Select New Dimensions: By default, all the dimension tables will be select. Just click next.

Completing the Wizard: Give the cube a name and click finish. After it is completed, this is how it will look. Creating Hierarchies: In dimensions, double Dim Date.dim it will open a new tab like this: In the right most column Data Source View, select dates that are required. (I have selected everything in that except French and

Spanish related and FullDateAlternateKey) After selecting the data, drag and drop them in Attributes.

Now create hierarchies using attributes, select data from attributes and drop in hierarchy column. This will create a new table like this. In that rename the hierarchy, how it is required. Now, rename English month name to month, so it is easier to understand. Attention symbol next to calendar is seen because it does not have any relationship within the hierarchy created. To create it, go to Attribute Relationships you will see like this In this, we can see that data key is hierarchy for everything that is the reason for us to get the attention symbol. Now what we will do is,

select the first key and point it to the next. Point it like this. Once this is done Click save and a dialog box will pop us, click proceed. Follow this procedure for everything and create hierarchies. Deploy the Cube: After clicking on save all, go to Solution Explorer, right click the project name (AdventureWorksDW2014) and select build first and see if you get any error or else it will say it is succeeded. Then, right click the project name again and deploy it now.

If you are editing in Dimensions after the deployment, you should not deploy it again, without doing process (in case you forget to do so, while deploying even it will ask: if you want to process, click proceed for it process.). To process.., right click on the project name and click process. While deployment, we will get two types of error: 1. Security: Enter correct windows name and password, if it is correct then it will work. If it was not given at the time of creating follow then follow this: Double-click on data source

In this, click on Impersonation Information: In that select windows authentication and then enter the details. 2. SQL SERVER Server Agent is turned off. To Turn it on by going to the database in SQL Server Management Studio and connect it, then in the Object Explorer open click the + then, it will show up like this.

Right click on, SQL Server Agent and click start. Then a pop like this will come, click YES in it. If you deploy it now, then it will get deployed. Now open the SQL Server Management Studio and before connecting the database to database engine, click on database engine a drop-down list will come in that select Analysis service and then click connect.

4. MDX Querying: After having connected the server type with Analysis Services. Go to Object Explorer, in that click plus symbol on Databases and select the deployed database. Now right click on the deployed database and go to New Query -> MDX. This is how it will look: Now we will do the prediction using MDX Queries: 1. Show Product wise sales by both internet sales amount and Reseller Sales amount for the fiscal quarter and month/ year. Query: SELECT NON EMPTY {[Measures].[Internet Sales Amount],[Measures].[Reseller Sales Amount], [Measures].[Sales Amount]} on columns, non empty crossjoin ([Product].[Category].[Category],[Date].[Fiscal Quarter of Year].[Fiscal Quarter of Year],[Date].[Calendar].[Month]) on rows from [Adventure Works]; Output:

Pivot Chart: $60,000,000.00 $50,000,000.00 $40,000,000.00 $30,000,000.00 $20,000,000.00 Reseller Sales Amount Internet Sales Amount $10,000,000.00 $0.00 Australia Canada France Germany United Kingdom United States Prediction: We can see the highest Promotional sales were reseller sales, assuming this we can add more products for Sale in reseller sales than on internet sales. (Hint: normally we can not add more tables with different dimension, so if we can use crossjoin to add multiple tables)

2. Show the cities in United States where the total sales amount is more than 5000000 (5 Million) for all the product. (Using VisualTotals and without VisualTotals) Query: select [Geography].[City].[City] on columns, order(visualtotals (filter([product].[category].members, [Measures].[Sales Amount] > 5000000)),[Measures].[Sales Amount], basc) on rows from [Adventure Works] where [Geography].[Country].&[United States]; Output: Without VisualTotals:

Using VisualTotals:

Prediction: Houston had the highest sales and will also have the highest sales for the next year. (Hint: Without visualtotals-> When we filter the data and list hets filtered and displays list that meets our requirements and the final result/value will not match the value that because the total is not calculated as it is listed, it fetches the value from cube where all the details are pre-aggregated. Using VisualTotals-> In this It calculates the final result and displays.) 3. What is the total reseller order count and reseller amount for the products (Chains, Breaks, Crank set, Handle Bars, Road Bikes, Pumps, Locks, Helmets, Cleaners, Bottles ad Cages, Bike Stand and Racks), with promotional offers such (i). Seasonal Discount, (ii). Excess Inventory,, where the customer is Male, Year (2012,2013,2014) in all State-Province. Query: SELECT NON EMPTY {[Measures].[Reseller Order Count],[Measures].[Reseller Sales Amount]} ON COLUMNS, NON EMPTY {([Date].[Calendar Year].[Calendar Year].ALLMEMBERS,[Geography].[Geography].[State- Province].ALLMEMBERS)}ON ROWS FROM (SELECT ({[Promotion].[Promotion Type].&[Seasonal Discount],[Promotion].[Promotion Type].&[Excess Inventory]}) ON COLUMNS FROM (SELECT ({[Product].[Product Categories].[Subcategory].&[26],[Product].[Product Categories].[Subcategory].&[27],[Product].[Product Categories].[Subcategory].&[29],[Product].[Product Categories].[Subcategory].&[34],[Product].[Product Categories].[Subcategory].&[36],[Product].[Product Categories].[Subcategory].&[2],[Product].[Product Categories].[Subcategory].&[31],[Product].[Product Categories].[Subcategory].&[28],[Product].[Product Categories].[Subcategory].&[4],[Product].[Product Categories].[Subcategory].&[8],[Product].[Product Categories].[Subcategory].&[6],[Product].[Product Categories].[Subcategory].&[7]}) ON COLUMNS FROM (SELECT ({[Customer].[Gender].&[M]}) ON COLUMNS FROM (SELECT

( {[Date].[Calendar Year].&[2014],[Date].[Calendar Year].&[2013],[Date].[Calendar Year].&[2012]} ) ON COLUMNS FROM [Adventure Works]) ) ) ) Output:

Prediction: California has the highest reseller sales along with promotions, where men has bought bike along with multiple accessories and components, based on this we can that California only will have highest reseller sales, so we can provide them more stock during that offer.

References: Create tables and followed procedure from this website: https://docs.microsoft.com/en-us/sql/analysis-services/tabular-modeling-adventureworks-tutorial Code Project: http://www.codeproject.com/articles/710387/learn-to-write-custom-mdx-query- First-Time http://www.codeproject.com/articles/658912/create-first-olap-cube-in-sql- ServerAnalysis-Serv https://docs.microsoft.com/en-us/sql/analysis-services/data-mining/data-miningalgorithms-analysis-services-data-mining