COMP33111: Tutorial/lab exercise 2
- Mark Lane
- 5 years ago
Part 1: Data cleaning, profiling and warehousing

Note: use the lecture slides and additional materials (see Blackboard and the COMP33111 web page).

1. Explain why legacy data might be difficult to integrate. Provide an example.
2. What is the role of data profiling? Explain what a data profiling report should contain. Explain the steps in data cleansing.
3. Explain how approximate joins can help in data cleansing at the instance level.
4. Some DW practitioners class data errors in the ETL process into four "in-" categories: incomplete data; incorrect data; incomprehensible data; inconsistent data. Provide examples for each category. Consider
5. How could we resolve the various types of conflicts in data transformation/mapping? (see Section 3.3)
6. Explain why data warehouses are time-variant.
7. What are the main challenges in building a data warehouse? Explain and discuss them using an example. Explain the main architectural components of a data warehouse.
8. Explain the nine steps in modelling, constructing and managing a data warehouse.
9. Describe the main clues for identifying measures and dimensions when designing a data warehouse. Give an example using the star schema.
10. A videotape company has stores in several regions. They would like to track profit information across different departments (Video Sales and Video Rentals) and regions (East, West, Central) in different years (e.g. and 2012). Design an appropriate data warehouse schema using the star multi-dimensional model and discuss the fact and dimension tables you would need. Would you need/recommend a snowflake schema? Explain your views.
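To make the idea behind question 3 concrete, the following is a minimal sketch of an approximate join, not a model answer. It assumes that string similarity from Python's standard `difflib` module is an acceptable matching criterion; the function names, the threshold of 0.85 and the customer names are all hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio, ignoring case and surrounding spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def approximate_join(left, right, threshold=0.85):
    """Pair each left value with its most similar right value.

    A pair is kept only if the similarity reaches the threshold, so values
    without a plausible partner are left unmatched.
    """
    pairs = []
    for left_value in left:
        best = max(right, key=lambda r: similarity(left_value, r))
        if similarity(left_value, best) >= threshold:
            pairs.append((left_value, best))
    return pairs


# Hypothetical customer names from two sources (invented for illustration).
source_a = ["Jon Smith", "Maria Garcia", "Wei Chen"]
source_b = ["John Smith", "Maria Garcia ", "Ahmed Khan"]

matches = approximate_join(source_a, source_b)
for left_value, right_value in matches:
    print(repr(left_value), "~", repr(right_value))
```

An exact join on these names would link nothing, because of the misspelling and the trailing space. The threshold is a trade-off: lowering it finds more candidate duplicates but also more false matches, which usually have to be reviewed by hand.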
Part 2: Data exploration and integration with WEKA [1]

The first step in a data integration and/or analytics project is getting to know your data. In this tutorial, you will examine three data sets using the Weka framework, which provides a large number of machine learning algorithms and visualisations useful for exploratory data mining.

Task A: Install and check that WEKA is running

WEKA should be installed on all CS computers. If you need your own copy, you can download it. On start, you will see the Weka GUI Chooser screen appear on your desktop. Select Explorer. The main interface for WEKA will then appear.

[1] Tailored after: Lab 1: Getting To Know Your Data (MSCS 228: Data Mining) by Dr Craig A. Struble, Marquette University, and other resources.
Task B: Using WEKA to understand your data (in ARFF format)

This task provides an introduction to Weka by exploring data sets that are already in ARFF format. ARFF is the native data format for Weka, so no data transformation is necessary.

B1. ARFF limits the attribute types it supports in its data files. Using the WEKA documentation, discuss which attribute types ARFF supports.

B2. Download the two data sets (labor.arff and contact-lenses.arff) from the course web site. Examine the format used in ARFF files. Start Weka and load contact-lenses.arff. Notice that the file contains 24 instances, where each instance represents an individual who wears soft contact lenses, hard contact lenses, or no contact lenses (regular glasses). The five attributes are listed on the left: age, spectacle-prescrip, astigmatism, tear-prod-rate, contact-lenses. Notice that Weka provides summary information for each attribute - browse each of the attributes in the data file. For example, on the right, the values for the attribute age can be seen. Nominal indicates that the age attribute is not numeric. There are eight young, eight pre-presbyopic and eight presbyopic instances. The coloured chart indicates the distribution of the instances relative to the age attribute within each class. Explore the other attributes in detail.
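When examining the ARFF format in B2, it can help to see how little structure a file has. The sketch below is a deliberately simplified reader written for illustration; it is not Weka's parser. It assumes unquoted attribute names and values, and does not handle sparse data. The embedded text is a short illustrative excerpt in the style of contact-lenses.arff, not the full 24-instance file.

```python
from collections import Counter

# Illustrative excerpt in the style of contact-lenses.arff (not the full file).
SAMPLE_ARFF = """\
% Comment lines start with a percent sign
@relation contact-lenses
@attribute age {young, pre-presbyopic, presbyopic}
@attribute spectacle-prescrip {myope, hypermetrope}
@attribute astigmatism {no, yes}
@attribute tear-prod-rate {reduced, normal}
@attribute contact-lenses {soft, hard, none}
@data
young,myope,no,reduced,none
young,myope,no,normal,soft
pre-presbyopic,myope,yes,normal,hard
"""


def parse_arff(text):
    """Return (relation, attributes, rows) from simple, unquoted ARFF text."""
    relation, attributes, rows = None, [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        lower = line.lower()
        if in_data:
            rows.append([value.strip() for value in line.split(",")])
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            _, name, spec = line.split(None, 2)
            if spec.startswith("{"):
                # Nominal attribute: the allowed values are listed in braces.
                values = [v.strip() for v in spec.strip("{}").split(",")]
                attributes.append((name, "nominal", values))
            else:
                # Any other type keyword is stored as written.
                attributes.append((name, spec.lower(), None))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


relation, attributes, rows = parse_arff(SAMPLE_ARFF)
age_counts = Counter(row[0] for row in rows)
print(relation, [name for name, _, _ in attributes])
print(age_counts)
```

The per-value counts computed here for age are the same kind of summary that the Explorer shows for a selected attribute, only on the three sample rows rather than the whole file.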
B3. Select the Visualization tab in Weka. The Visualization tab provides scatter plots with two data attributes as the axes. You can change which attributes are along the axes using the drop-down menus in the top portion of the visualizer or by clicking on the plots in the right portion of the visualizer. Explore different scatter plots.

B4. Repeat the task with the labor.arff file. The data in labor.arff contains two classes, bad and good, along with other attributes. Looking at the values for the different attributes, select three attributes that might be good predictors for the class (i.e. if you knew the value of that attribute, you could guess pretty well whether the class for the same instance was good or bad). Explain why you chose these three attributes.

Task C: Exploring your CSV data towards a data warehousing model

A company has two branches that sell different types of products via the Web or through catalogue sales. You are given two datasets (download them from the course web site): Catalog_Orders.txt and Web_Orders.txt. Each row in the datasets contains seven attributes, representing: the ID of the transaction, INVOICE number, DATE, CATALOG (describing the type of product), internal code (PCODE), QTY and customer number. You also have a file Products.txt that describes the products.

C1. Using Weka, explore each of the data sets, i.e. their attribute values. Write a brief report that covers: data types, lengths and value ranges, data variance and uniqueness (provide simple statistical analyses); the distribution of key attribute values; relations between pairs or small numbers of attributes (if any); any typical string patterns (e.g. phone numbers or dates); and any specific properties of significant sub-populations within the files.

C2. Consider data quality in the three data sets: are there any illegal values or misspellings? Is the data complete (does it cover all the cases required)? Are there any missing values? Does it contain errors and, if so, how common are they, how are they represented, and where do they occur? Consider, in particular, the values in the CATALOG attribute.

C3. Use your favourite programming environment to write a program to clean the data sets. Provide a report that documents what has been done.

C4. What are the challenges in integrating these datasets into a single data warehouse (consider, for example, DATE)? How would you resolve them? Integrate the datasets into a single file. Repeat the exploration as in tasks C1 and C2.

C5. Propose a data warehouse model for this company. What could the measures be? What are the potential dimensions (consider Products.txt)? Which data model would you use (star/snowflake)? Draw a schema diagram that explains your model.
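As a starting point for the profiling in C1 and C2 and the DATE issue raised in C4, the sketch below shows one possible approach in plain Python. It is an illustration under stated assumptions, not a solution: the function names are hypothetical, the sample values are invented rather than taken from Catalog_Orders.txt or Web_Orders.txt, and the list of date formats is a guess that would need to be checked against the real files.

```python
import re
from collections import Counter
from datetime import datetime


def pattern_of(value):
    """Reduce a value to its shape: digits become 9, letters become A."""
    return re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", value))


def profile(values):
    """Summarise one column: size, missing values, uniqueness, string patterns."""
    present = [v.strip() for v in values if v is not None and v.strip() != ""]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
        "patterns": Counter(pattern_of(v) for v in present),
    }


# Assumed candidate formats; the real files may use others.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%d.%m.%Y")


def normalise_date(value):
    """Return the date in ISO form (YYYY-MM-DD), or None if no format fits."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


# Invented sample values for a DATE column, for illustration only.
dates = ["2011-03-15", "15/03/2011", "15.03.2011", "", "March 15"]

report = profile(dates)
cleaned = [normalise_date(d) for d in dates]
print(report)
print(cleaned)
```

The pattern counts make mixed representations visible at a glance, and values that come back as `None` from `normalise_date` are the ones to list in the cleaning report. Note that a format such as day/month/year cannot be told apart from month/day/year by parsing alone, so the choice has to be justified from the data.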
Task D: Transforming Data into ARFF

In this task, a raw data set is first transformed into a formatted ARFF file. Once it is transformed, the exploration steps are repeated as above. As part of this process, you need to identify the different attribute types, as these are required by the ARFF format. Note that the ARFF format is more limited in the kinds of data attributes it supports. When using a system with limited data types, it may be necessary to transform the raw data values in order for the tool to work properly. Keep this in mind as you work through the steps below.

D1. Read the information about the Teacher Pay by States data set.

D2. Identify the attributes of the data. Record the attributes and the type of each attribute.

D3. Select, download and save the raw data set in a file.

D4. Convert the raw data set into CSV (comma separated value) format. One easy way to do this is to load it into Excel and use Save As.... You can also write a program to do this. When you do this conversion, it is considered good practice to replace any nominal values represented by numbers with their textual representation. This will make the data easier to interpret as you summarize and mine it.

D5. Edit the CSV file and add the ARFF header information to the file. This involves creating the @relation line, one @attribute line per attribute, and the @data line to signify the start of data. It is also considered good practice to add comments at the top of the file describing where you obtained this data set, what its summary characteristics are, etc. A comment in the ARFF format starts with the percent character % and continues until the end of the line.

D6. Load your ARFF file into Weka and repeat the steps you performed in task B. You may run into errors as you load your ARFF file.

D7. Repeat the analysis of the data set as you did in tasks B and C. In addition, there is one data point that appears to be a clear outlier. Which point is this? (Try to use Weka to identify this point.) Do you see any natural groupings in the teacher pay vs. spending per student data? Do the natural groups correspond to the three regions in the dataset?
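Steps D4 and D5 can also be done programmatically. The sketch below is one hedged way to do so with the standard `csv` module. It assumes a header row, treats a column as numeric only if every non-missing value parses as a number, and does no quoting, so names or values containing spaces or commas would need extra handling. The function names, column names and sample rows are hypothetical and are not the real Teacher Pay data.

```python
import csv
import io


def infer_type(values):
    """Return 'numeric' if all non-missing values parse as numbers,
    otherwise a nominal specification listing the distinct values."""
    present = [v for v in values if v != "?"]
    try:
        for v in present:
            float(v)
        return "numeric"
    except ValueError:
        return "{" + ",".join(sorted(set(present))) + "}"


def csv_to_arff(csv_text, relation, comments=()):
    """Convert CSV text with a header row into ARFF text."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    # ARFF marks a missing value with a question mark.
    data = [[cell.strip() or "?" for cell in row] for row in data]
    lines = [f"% {comment}" for comment in comments]
    lines.append(f"@relation {relation}")
    for index, name in enumerate(header):
        column = [row[index] for row in data]
        lines.append(f"@attribute {name.strip()} {infer_type(column)}")
    lines.append("@data")
    lines.extend(",".join(row) for row in data)
    return "\n".join(lines) + "\n"


# Hypothetical sample rows, invented for illustration only.
SAMPLE_CSV = """\
state,region,pay,spend
A,East,30000,4000
B,West,,3500
"""

arff = csv_to_arff(
    SAMPLE_CSV,
    relation="teacher-pay",
    comments=["hypothetical sample, not the real data set"],
)
print(arff)
```

Inferring types automatically is convenient, but the result should still be checked against the attribute types recorded in D2: a numeric code that really stands for a category, for example, would be declared numeric here unless it is first replaced by its textual representation as suggested in D4.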
Homework 1 Sample Solution
Homework 1 Sample Solution 1. Iris: All attributes of iris are numeric, therefore ID3 of weka cannt be applied to this data set. Contact-lenses: tear-prod-rate = reduced: none tear-prod-rate = normal astigmatism
More informationCHAPTER 3 Implementation of Data warehouse in Data Mining
CHAPTER 3 Implementation of Data warehouse in Data Mining 3.1 Introduction to Data Warehousing A data warehouse is storage of convenient, consistent, complete and consolidated data, which is collected
More informationData Mining. Practical Machine Learning Tools and Techniques. Slides for Chapter 3 of Data Mining by I. H. Witten, E. Frank and M. A.
Data Mining Practical Machine Learning Tools and Techniques Slides for Chapter 3 of Data Mining by I. H. Witten, E. Frank and M. A. Hall Input: Concepts, instances, attributes Terminology What s a concept?
More informationWhat is KNIME? workflows nodes standard data mining, data analysis data manipulation
KNIME TUTORIAL What is KNIME? KNIME = Konstanz Information Miner Developed at University of Konstanz in Germany Desktop version available free of charge (Open Source) Modular platform for building and
More informationSql Fact Constellation Schema In Data Warehouse With Example
Sql Fact Constellation Schema In Data Warehouse With Example Data Warehouse OLAP - Learn Data Warehouse in simple and easy steps using Multidimensional OLAP (MOLAP), Hybrid OLAP (HOLAP), Specialized SQL
More informationData Preprocessing. Slides by: Shree Jaswal
Data Preprocessing Slides by: Shree Jaswal Topics to be covered Why Preprocessing? Data Cleaning; Data Integration; Data Reduction: Attribute subset selection, Histograms, Clustering and Sampling; Data
More informationData Warehouse and Data Mining
Data Warehouse and Data Mining Lecture No. 03 Architecture of DW Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology Jamshoro Basic
More informationQuestion Bank. 4) It is the source of information later delivered to data marts.
Question Bank Year: 2016-2017 Subject Dept: CS Semester: First Subject Name: Data Mining. Q1) What is data warehouse? ANS. A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile
More informationMachine Learning Chapter 2. Input
Machine Learning Chapter 2. Input 2 Input: Concepts, instances, attributes Terminology What s a concept? Classification, association, clustering, numeric prediction What s in an example? Relations, flat
More informationData Representation Information Retrieval and Data Mining. Prof. Matteo Matteucci
Data Representation Information Retrieval and Data Mining Prof. Matteo Matteucci Instances, Attributes, Concepts 2 Instances The atomic elements of information from a dataset Also known as records, prototypes,
More information20767B: IMPLEMENTING A SQL DATA WAREHOUSE
ABOUT THIS COURSE This 5-day instructor led course describes how to implement a data warehouse platform to support a BI solution. Students will learn how to create a data warehouse with Microsoft SQL Server
More informationFig 1.2: Relationship between DW, ODS and OLTP Systems
1.4 DATA WAREHOUSES Data warehousing is a process for assembling and managing data from various sources for the purpose of gaining a single detailed view of an enterprise. Although there are several definitions
More informationImplement a Data Warehouse with Microsoft SQL Server
Implement a Data Warehouse with Microsoft SQL Server 20463D; 5 days, Instructor-led Course Description This course describes how to implement a data warehouse platform to support a BI solution. Students
More information20463C-Implementing a Data Warehouse with Microsoft SQL Server. Course Content. Course ID#: W 35 Hrs. Course Description: Audience Profile
Course Content Course Description: This course describes how to implement a data warehouse platform to support a BI solution. Students will learn how to create a data warehouse 2014, implement ETL with
More informationInput: Concepts, Instances, Attributes
Input: Concepts, Instances, Attributes 1 Terminology Components of the input: Concepts: kinds of things that can be learned aim: intelligible and operational concept description Instances: the individual,
More informationImplementing a SQL Data Warehouse
Course 20767B: Implementing a SQL Data Warehouse Page 1 of 7 Implementing a SQL Data Warehouse Course 20767B: 4 days; Instructor-Led Introduction This 4-day instructor led course describes how to implement
More informationCOURSE 20466D: IMPLEMENTING DATA MODELS AND REPORTS WITH MICROSOFT SQL SERVER
ABOUT THIS COURSE The focus of this five-day instructor-led course is on creating managed enterprise BI solutions. It describes how to implement multidimensional and tabular data models, deliver reports
More informationalteryx training courses
alteryx training courses alteryx designer 2 day course This course covers Alteryx Designer for new and intermediate Alteryx users. It introduces the User Interface and works through core Alteryx capability,
More informationImplementing a Data Warehouse with Microsoft SQL Server
Course 20463C: Implementing a Data Warehouse with Microsoft SQL Server Page 1 of 6 Implementing a Data Warehouse with Microsoft SQL Server Course 20463C: 4 days; Instructor-Led Introduction This course
More informationPreprocessing Short Lecture Notes cse352. Professor Anita Wasilewska
Preprocessing Short Lecture Notes cse352 Professor Anita Wasilewska Data Preprocessing Why preprocess the data? Data cleaning Data integration and transformation Data reduction Discretization and concept
More information1Z0-526
1Z0-526 Passing Score: 800 Time Limit: 4 min Exam A QUESTION 1 ABC's Database administrator has divided its region table into several tables so that the west region is in one table and all the other regions
More informationBusiness Intelligence Tutorial
IBM DB2 Universal Database Business Intelligence Tutorial Version 7 IBM DB2 Universal Database Business Intelligence Tutorial Version 7 Before using this information and the product it supports, be sure
More informationHandout 12 Data Warehousing and Analytics.
Handout 12 CS-605 Spring 17 Page 1 of 6 Handout 12 Data Warehousing and Analytics. Operational (aka transactional) system a system that is used to run a business in real time, based on current data; also
More informationHOW TO USE THE EXPORT FEATURE IN LCL
HOW TO USE THE EXPORT FEATURE IN LCL In LCL go to the Go To menu and select Export. Select the items that you would like to have exported to the file. To select them you will click the item in the left
More informationBasic Concepts Weka Workbench and its terminology
Changelog: 14 Oct, 30 Oct Basic Concepts Weka Workbench and its terminology Lecture Part Outline Concepts, instances, attributes How to prepare the input: ARFF, attributes, missing values, getting to know
More informationData Mining Practical Machine Learning Tools and Techniques
Input: Concepts, instances, attributes Data ining Practical achine Learning Tools and Techniques Slides for Chapter 2 of Data ining by I. H. Witten and E. rank Terminology What s a concept z Classification,
More informationSummary. Machine Learning: Introduction. Marcin Sydow
Outline of this Lecture Data Motivation for Data Mining and Learning Idea of Learning Decision Table: Cases and Attributes Supervised and Unsupervised Learning Classication and Regression Examples Data:
More information2. Click File and then select Import from the menu above the toolbar. 3. From the Import window click the Create File to Import button:
Totality 4 Import How to Import data into Totality 4. Totality 4 will allow you to import data from an Excel spreadsheet or CSV (comma separated values). You must have Microsoft Excel installed in order
More informationKNIME TUTORIAL. Anna Monreale KDD-Lab, University of Pisa
KNIME TUTORIAL Anna Monreale KDD-Lab, University of Pisa Email: annam@di.unipi.it Outline Introduction on KNIME KNIME components Exercise: Data Understanding Exercise: Market Basket Analysis Exercise:
More informationUsing Weka for Classification. Preparing a data file
Using Weka for Classification Preparing a data file Prepare a data file in CSV format. It should have the names of the features, which Weka calls attributes, on the first line, with the names separated
More informationData Warehousing. Data Warehousing and Mining. Lecture 8. by Hossen Asiful Mustafa
Data Warehousing Data Warehousing and Mining Lecture 8 by Hossen Asiful Mustafa Databases Databases are developed on the IDEA that DATA is one of the critical materials of the Information Age Information,
More informationMIS2502: Data Analytics Dimensional Data Modeling. Jing Gong
MIS2502: Data Analytics Dimensional Data Modeling Jing Gong gong@temple.edu http://community.mis.temple.edu/gong Where we are Now we re here Data entry Transactional Database Data extraction Analytical
More informationData Warehouse and Data Mining
Data Warehouse and Data Mining Lecture No. 04-06 Data Warehouse Architecture Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology
More informationData Science. Data Analyst. Data Scientist. Data Architect
Data Science Data Analyst Data Analysis in Excel Programming in R Introduction to Python/SQL/Tableau Data Visualization in R / Tableau Exploratory Data Analysis Data Scientist Inferential Statistics &
More informationExam /Course 20767B: Implementing a SQL Data Warehouse
Exam 70-767/Course 20767B: Implementing a SQL Data Warehouse Course Outline Module 1: Introduction to Data Warehousing This module describes data warehouse concepts and architecture consideration. Overview
More informationK236: Basis of Data Science
Schedule of K236 K236: Basis of Data Science Lecture 6: Data Preprocessing Lecturer: Tu Bao Ho and Hieu Chi Dam TA: Moharasan Gandhimathi and Nuttapong Sanglerdsinlapachai 1. Introduction to data science
More informationData Preprocessing. S1 Teknik Informatika Fakultas Teknologi Informasi Universitas Kristen Maranatha
Data Preprocessing S1 Teknik Informatika Fakultas Teknologi Informasi Universitas Kristen Maranatha 1 Why Data Preprocessing? Data in the real world is dirty incomplete: lacking attribute values, lacking
More informationCognos also provides you an option to export the report in XML or PDF format or you can view the reports in XML format.
About the Tutorial IBM Cognos Business intelligence is a web based reporting and analytic tool. It is used to perform data aggregation and create user friendly detailed reports. IBM Cognos provides a wide
More informationData Analysis and Data Science
Data Analysis and Data Science CPS352: Database Systems Simon Miner Gordon College Last Revised: 4/29/15 Agenda Check-in Online Analytical Processing Data Science Homework 8 Check-in Online Analytical
More informationData Management Glossary
Data Management Glossary A Access path: The route through a system by which data is found, accessed and retrieved Agile methodology: An approach to software development which takes incremental, iterative
More informationTutorial 9. Review. Data Tables and Scenario Management. Data Validation. Protecting Worksheet. Range Names. Macros
Tutorial 9 Data Tables and Scenario Management Review Data Validation Protecting Worksheet Range Names Macros 1 Examine cost-volume-profit relationships Suppose you were the owner of a water store. An
More informationImplementing a SQL Data Warehouse
Implementing a SQL Data Warehouse Course 20767B 5 Days Instructor-led, Hands on Course Information This five-day instructor-led course provides students with the knowledge and skills to provision a Microsoft
More informationData Preprocessing. Why Data Preprocessing? MIT-652 Data Mining Applications. Chapter 3: Data Preprocessing. Multi-Dimensional Measure of Data Quality
Why Data Preprocessing? Data in the real world is dirty incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate data e.g., occupation = noisy: containing
More informationCS434 Notebook. April 19. Data Mining and Data Warehouse
CS434 Notebook April 19 2017 Data Mining and Data Warehouse Table of Contents The DM Process MS s view (DMX)... 3 The Basics... 3 The Three-Step Dance... 3 Few Important Concepts... 4 More on Attributes...
More informationData Mining: Concepts and Techniques. (3 rd ed.) Chapter 3. Chapter 3: Data Preprocessing. Major Tasks in Data Preprocessing
Data Mining: Concepts and Techniques (3 rd ed.) Chapter 3 1 Chapter 3: Data Preprocessing Data Preprocessing: An Overview Data Quality Major Tasks in Data Preprocessing Data Cleaning Data Integration Data
More informationDta Mining and Data Warehousing
CSCI645 Fall 23 Dta Mining and Data Warehousing Instructor: Qigang Gao, Office: CS219, Tel:494-3356, Email: qggao@cs.dal.ca Teaching Assistant: Christopher Jordan, Email: cjordan@cs.dal.ca Office Hours:
More informationImplementing a SQL Data Warehouse
Implementing a SQL Data Warehouse 20767B; 5 days, Instructor-led Course Description This 4-day instructor led course describes how to implement a data warehouse platform to support a BI solution. Students
More informationWhatsApp Group Data Analysis with R
WhatsApp Group Data Analysis with R Sanchita Patil MCA Department Vivekanand Education Society's Institute of Technology Chembur, Mumbai 400074. ABSTRACT The means of communication has changed over time
More informationWEKA homepage.
WEKA homepage http://www.cs.waikato.ac.nz/ml/weka/ Data mining software written in Java (distributed under the GNU Public License). Used for research, education, and applications. Comprehensive set of
More informationData Mining. Asso. Profe. Dr. Raed Ibraheem Hamed. University of Human Development, College of Science and Technology Department of CS (1)
Data Mining Asso. Profe. Dr. Raed Ibraheem Hamed University of Human Development, College of Science and Technology Department of CS 2016 2017 (1) Points to Cover Problem: Heterogeneous Information Sources
More informationIntroduction to Data Science
UNIT I INTRODUCTION TO DATA SCIENCE Syllabus Introduction of Data Science Basic Data Analytics using R R Graphical User Interfaces Data Import and Export Attribute and Data Types Descriptive Statistics
More informationContents. Part I Setting the Scene
Contents Part I Setting the Scene 1 Introduction... 3 1.1 About Mobility Data... 3 1.1.1 Global Positioning System (GPS)... 5 1.1.2 Format of GPS Data... 6 1.1.3 Examples of Trajectory Datasets... 8 1.2
More informationData Preprocessing Yudho Giri Sucahyo y, Ph.D , CISA
Obj ti Objectives Motivation: Why preprocess the Data? Data Preprocessing Techniques Data Cleaning Data Integration and Transformation Data Reduction Data Preprocessing Lecture 3/DMBI/IKI83403T/MTI/UI
More informationData Warehousing. Adopted from Dr. Sanjay Gunasekaran
Data Warehousing Adopted from Dr. Sanjay Gunasekaran Main Topics Overview of Data Warehouse Concept of Data Conversion Importance of Data conversion and the steps involved Common Industry Methodology Outline
More informationETL and OLAP Systems
ETL and OLAP Systems Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Software Development Technologies Master studies, first semester
More informationPractical Data Mining COMP-321B. Tutorial 1: Introduction to the WEKA Explorer
Practical Data Mining COMP-321B Tutorial 1: Introduction to the WEKA Explorer Gabi Schmidberger Mark Hall Richard Kirkby July 12, 2006 c 2006 University of Waikato 1 Setting up your Environment Before
More informationStep-by-step data transformation
Step-by-step data transformation Explanation of what BI4Dynamics does in a process of delivering business intelligence Contents 1. Introduction... 3 Before we start... 3 1 st. STEP: CREATING A STAGING
More informationECT7110. Data Preprocessing. Prof. Wai Lam. ECT7110 Data Preprocessing 1
ECT7110 Data Preprocessing Prof. Wai Lam ECT7110 Data Preprocessing 1 Why Data Preprocessing? Data in the real world is dirty incomplete: lacking attribute values, lacking certain attributes of interest,
More informationData Mining With Weka A Short Tutorial
Data Mining With Weka A Short Tutorial Dr. Wenjia Wang School of Computing Sciences University of East Anglia (UEA), Norwich, UK Content 1. Introduction to Weka 2. Data Mining Functions and Tools 3. Data
More informationMicrosoft End to End Business Intelligence Boot Camp
Microsoft End to End Business Intelligence Boot Camp 55045; 5 Days, Instructor-led Course Description This course is a complete high-level tour of the Microsoft Business Intelligence stack. It introduces
More informationUNIT 2 Data Preprocessing
UNIT 2 Data Preprocessing Lecture Topic ********************************************** Lecture 13 Why preprocess the data? Lecture 14 Lecture 15 Lecture 16 Lecture 17 Data cleaning Data integration and
More informationMKTG 460 Winter 2019 Solutions #1
MKTG 460 Winter 2019 Solutions #1 Short Answer: Data Analysis 1. What is a data table and how are the data values organized? A data table stores the data values for variables across different observations,
More informationUpload and Go! Tired of doing data entry? Save time and increase cash flow by submitting accounts in bulk upload. Upload and Go!
Tired of doing data entry? Save time and increase cash flow by submitting accounts in bulk upload. Step 1: TIP: Make sure the file, to be uploaded, does not have any blank lines above the header line or
More informationData. Notes. are required reading for the week. textbook reading and a few slides on data formats and data cleaning
CS 725/825 Information Visualization Spring 2018 Data Dr. Michele C. Weigle http://www.cs.odu.edu/~mweigle/cs725-s18/ Notes } We will not cover these slides in class, but they are required reading for
More informationDATA MINING AND WAREHOUSING
DATA MINING AND WAREHOUSING Qno Question Answer 1 Define data warehouse? Data warehouse is a subject oriented, integrated, time-variant, and nonvolatile collection of data that supports management's decision-making
More informationMS-55045: Microsoft End to End Business Intelligence Boot Camp
MS-55045: Microsoft End to End Business Intelligence Boot Camp Description This five-day instructor-led course is a complete high-level tour of the Microsoft Business Intelligence stack. It introduces
More informationImplementing a Data Warehouse with Microsoft SQL Server 2012
Implementing a Data Warehouse with Microsoft SQL Server 2012 Course 10777A 5 Days Instructor-led, Hands-on Introduction Data warehousing is a solution organizations use to centralize business data for
More informationDATA WAREHOUSE EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY
DATA WAREHOUSE EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY CHARACTERISTICS Data warehouse is a central repository for summarized and integrated data
More informationElementary Statistics
1 Elementary Statistics Introduction Statistics is the collection of methods for planning experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing
More informationKnowledge Modelling and Management. Part B (9)
Knowledge Modelling and Management Part B (9) Yun-Heh Chen-Burger http://www.aiai.ed.ac.uk/~jessicac/project/kmm 1 A Brief Introduction to Business Intelligence 2 What is Business Intelligence? Business
More informationDay 1 Agenda. Brio 101 Training. Course Presentation and Reference Material
Data Warehouse www.rpi.edu/datawarehouse Brio 101 Training Course Presentation and Reference Material Day 1 Agenda Training Overview Data Warehouse and Business Intelligence Basics The Brio Environment
More informationIJSRD - International Journal for Scientific Research & Development Vol. 4, Issue 05, 2016 ISSN (online):
IJSRD - International Journal for Scientific Research & Development Vol. 4, Issue 05, 2016 ISSN (online): 2321-0613 A Study on Handling Missing Values and Noisy Data using WEKA Tool R. Vinodhini 1 A. Rajalakshmi
More informationChapter 3. The Multidimensional Model: Basic Concepts. Introduction. The multidimensional model. The multidimensional model
Chapter 3 The Multidimensional Model: Basic Concepts Introduction Multidimensional Model Multidimensional concepts Star Schema Representation Conceptual modeling using ER, UML Conceptual modeling using
More informationMicroStrategy Academic Program
MicroStrategy Academic Program Creating a center of excellence for enterprise analytics and mobility. HOW TO DEPLOY ENTERPRISE ANALYTICS AND MOBILITY ON AWS APPROXIMATE TIME NEEDED: 1 HOUR In this workshop,
More informationExtended TDWI Data Modeling: An In-Depth Tutorial on Data Warehouse Design & Analysis Techniques
: An In-Depth Tutorial on Data Warehouse Design & Analysis Techniques Class Format: The class is an instructor led format using multiple learning techniques including: lecture to present concepts, principles,
More informationDATA WAREHOUING UNIT I
BHARATHIDASAN ENGINEERING COLLEGE NATTRAMAPALLI DEPARTMENT OF COMPUTER SCIENCE SUB CODE & NAME: IT6702/DWDM DEPT: IT Staff Name : N.RAMESH DATA WAREHOUING UNIT I 1. Define data warehouse? NOV/DEC 2009
More informationDesigning Data Warehouses. Data Warehousing Design. Designing Data Warehouses. Designing Data Warehouses
Designing Data Warehouses To begin a data warehouse project, need to find answers for questions such as: Data Warehousing Design Which user requirements are most important and which data should be considered
More informationCS513-Data Mining. Lecture 2: Understanding the Data. Waheed Noor
CS513-Data Mining Lecture 2: Understanding the Data Waheed Noor Computer Science and Information Technology, University of Balochistan, Quetta, Pakistan Waheed Noor (CS&IT, UoB, Quetta) CS513-Data Mining
More informationSTRATEGIC INFORMATION SYSTEMS IV STV401T / B BTIP05 / BTIX05 - BTECH DEPARTMENT OF INFORMATICS. By: Dr. Tendani J. Lavhengwa
STRATEGIC INFORMATION SYSTEMS IV STV401T / B BTIP05 / BTIX05 - BTECH DEPARTMENT OF INFORMATICS LECTURE: 05 (A) DATA WAREHOUSING (DW) By: Dr. Tendani J. Lavhengwa lavhengwatj@tut.ac.za 1 My personal quote:
More informationAnalysing crime data in Maps for Office and ArcGIS Online
Analysing crime data in Maps for Office and ArcGIS Online For non-commercial use only by schools and universities Esri UK GIS for School Programme www.esriuk.com/schools Introduction ArcGIS Online is a
More informationData Warehouse. Asst.Prof.Dr. Pattarachai Lalitrojwong
Data Warehouse Asst.Prof.Dr. Pattarachai Lalitrojwong Faculty of Information Technology King Mongkut s Institute of Technology Ladkrabang Bangkok 10520 pattarachai@it.kmitl.ac.th The Evolution of Data
More informationLecture 18. Business Intelligence and Data Warehousing. 1:M Normalization. M:M Normalization 11/1/2017. Topics Covered
Lecture 18 Business Intelligence and Data Warehousing BDIS 6.2 BSAD 141 Dave Novak Topics Covered Test # Review What is Business Intelligence? How can an organization be data rich and information poor?
ECLT 5810 Data Preprocessing. Prof. Wai Lam
ECLT 5810 Data Preprocessing Prof. Wai Lam Why Data Preprocessing? Data in the real world is imperfect incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate
Implementing a Data Warehouse with Microsoft SQL Server 2012
10777 - Implementing a Data Warehouse with Microsoft SQL Server 2012 Duration: 5 days Course Price: $2,695 Software Assurance Eligible Course Description 10777 - Implementing a Data Warehouse with Microsoft
Duration: 5 Days. EZY Intellect Pte. Ltd.,
Implementing a SQL Data Warehouse Duration: 5 Days Course Code: 20767A Course review About this course This 5-day instructor led course describes how to implement a data warehouse platform to support a
Time: 3 hours. Full Marks: 70. The figures in the margin indicate full marks. Answers from all the Groups as directed. Group A.
COPYRIGHT RESERVED End Sem (V) MCA (XXVIII) 2017 Time: 3 hours Full Marks: 70 Candidates are required to give their answers in their own words as far as practicable. The figures in the margin indicate
To study the application of Data Visualization and Analysis tools
To study the application of Data Visualization and Analysis tools Mrs. Shibani Kulkarni, Department of Computer Science, Dr. D. Y. Patil ACS College, Pimpri, Pune-18 Ms. Neeta Takawale, Department of Computer
Data Foundations. Topic Objectives. and list subcategories of each. its properties. before producing a visualization. subsetting
CS 725/825 Information Visualization Fall 2013 Data Foundations Dr. Michele C. Weigle http://www.cs.odu.edu/~mweigle/cs725-f13/ Topic Objectives! Distinguish between ordinal and nominal values and list
Data Mining & Data Warehouse
Data Mining & Data Warehouse Assoc. Prof. Dr. Raed Ibraheem Hamed University of Human Development, College of Science and Technology Department of Information Technology 2016 2017 (1) Points to Cover Problem:
E(xtract) T(ransform) L(oad)
Gunther Heinrich, Tobias Steimer E(xtract) T(ransform) L(oad) OLAP 20.06.08 Agenda 1 Introduction 2 Extract 3 Transform 4 Load 5 SSIS - Tutorial 2 1 Introduction 1.1 What is ETL? 1.2 Alternative Approach
Jet Data Manager 2014 SR2 Product Enhancements
Jet Data Manager 2014 SR2 Product Enhancements Table of Contents Overview of New Features... 3 New Features in Jet Data Manager 2014 SR2... 3 Improved Features in Jet Data Manager 2014 SR2... 5 New Features
CS377: Database Systems Data Warehouse and Data Mining. Li Xiong Department of Mathematics and Computer Science Emory University
CS377: Database Systems Data Warehouse and Data Mining Li Xiong Department of Mathematics and Computer Science Emory University 1 1960s: Evolution of Database Technology Data collection, database creation,
Shopkeep Cashier Transactions. 1) Sign into Shopkeep using your 4 digit manager code which should have been previously provided to you.
Shopkeep Cashier Transactions 1) Sign into Shopkeep using your 4 digit manager code which should have been previously provided to you. a) You will then need to open the shift by pressing the green Open
Data Analyst Nanodegree Syllabus
Data Analyst Nanodegree Syllabus Discover Insights from Data with Python, R, SQL, and Tableau Before You Start Prerequisites : In order to succeed in this program, we recommend having experience working
Training 24x7 DBA Support Staffing. MCSA:SQL 2016 Business Intelligence Development. Implementing an SQL Data Warehouse. (40 Hours) Exam
MCSA:SQL 2016 Business Intelligence Development Implementing an SQL Data Warehouse (40 Hours) Exam 70-767 Prerequisites At least 2 years' experience of working with relational databases, including: Designing
Data Warehousing Introduction. Toon Calders
Data Warehousing Introduction Toon Calders toon.calders@ulb.ac.be Course Organization Lectures on Tuesday 14:00 and Friday 16:00 Check http://gehol.ulb.ac.be/ for room Most exercises in computer class
Your Name: Section: INTRODUCTION TO STATISTICAL REASONING Computer Lab #4 Scatterplots and Regression
Your Name: Section: 36-201 INTRODUCTION TO STATISTICAL REASONING Computer Lab #4 Scatterplots and Regression Objectives: 1. To learn how to interpret scatterplots. Specifically you will investigate, using
20767: Implementing a SQL Data Warehouse
Let's Reach For Excellence! TAN DUC INFORMATION TECHNOLOGY SCHOOL JSC Address: 103 Pasteur, Dist.1, HCMC Tel: 08 38245819; 38239761 Email: traincert@tdt-tanduc.com Website: www.tdt-tanduc.com; www.tanducits.com
Microsoft Implementing a SQL Data Warehouse
1800 ULEARN (853 276) www.ddls.com.au Microsoft 20767 - Implementing a SQL Data Warehouse Length 5 days Price $4290.00 (inc GST) Version C Overview This five-day instructor-led course provides students
CS 8520: Artificial Intelligence. Weka Lab. Paula Matuszek Fall, CSC 8520 Fall Paula Matuszek
CS 8520: Artificial Intelligence Weka Lab Paula Matuszek Fall, 2015 Weka is Waikato Environment for Knowledge Analysis Machine Learning Software Suite from the University of Waikato Been under development