ETL Interview Question Bank


Author: Sheetal Shirke
Version: 0.1

[Diagram 1: ETL Architecture]

ETL Testing Questions

1. What is a Data Warehouse?
A data warehouse (DW or DWH), also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis. DWs are central repositories of integrated data from one or more disparate sources. They store current and historical data and are used to create analytical reports for knowledge workers throughout the enterprise (refer to Diagram 1).

2. What is the difference between a Data Warehouse and a Data Mart?
Refer to Diagram 1.
- Both data marts and data warehouses help management obtain relevant information about the organization at any point in time.
- A data mart serves a single department, whereas a data warehouse serves the entire organization.
- Data marts are easy to design and use, whereas a data warehouse is complex and harder to manage.
- A data warehouse is more broadly useful because it can provide information from any department.

3. What is ETL/Data Warehouse Testing?
- ETL stands for Extract, Transform, and Load. It collects source data from heterogeneous systems (databases) and transforms it for loading into the data warehouse (the target).
- During transformation, the data is first moved to a staging table (a temporary table).

- Based on business rules, the data is mapped into the target table. The mapping is either done manually or configured in an ETL tool.
- Duplicate data should not be carried through to the target.
- The speed of the transformation process depends on the source and on the target data warehouse.
- The OLAP (Online Analytical Processing) structure and the data warehouse model need to be considered.
- Source data can come from XML, flat files, databases, Excel reports, or another data warehouse.
- Validation rules need to be set at transformation time, for example avoiding NULL values in the table and validating data types (such as using TINYINT instead of INTEGER).
- The ETL process starts based on the user requirement.

4. What do ETL testing operations include?
- Verify that the data is transformed correctly according to business requirements.
- Verify that the projected data is loaded into the data warehouse without any truncation or data loss.
- Make sure that the ETL application reports invalid data and replaces it with default values.
- Make sure that data loads within the expected time frame, to confirm scalability and performance.

5. What are the various tools used in ETL?
- Cognos Decision Stream
- Oracle Warehouse Builder / Oracle Data Integrator
- Business Objects XI
- SAS business warehouse
- SAS Enterprise ETL server

6. What is the staging area in ETL testing, and what is its purpose?
The staging area is the place where temporary tables are held on the data warehouse server. Staging tables are connected to the work area or to fact tables. The staging area is needed to hold the data and to perform data cleansing and merging before the data is loaded into the warehouse.

7. What are Primary and Foreign keys, and what is the difference between them?
Primary Key: A primary key is a field or combination of fields that uniquely identifies a record in a table, so that an individual record can be located without confusion.
Foreign Key: A foreign key (sometimes called a referencing key) is a key used to link two tables together. Typically you take the primary key field from one table and insert it into the other table, where it becomes a foreign key (it remains a primary key in the original table).
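The primary key / foreign key relationship described in question 7 can be tried out with a small script. This is only a sketch, assuming Python's built-in sqlite3 module and two hypothetical tables, department and employee, that do not appear in the original question bank.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical tables: dept_id is the primary key of department and
# reappears in employee as a foreign key.
conn.execute(
    "CREATE TABLE department (dept_id INTEGER PRIMARY KEY, dept_name TEXT NOT NULL)"
)
conn.execute("""
    CREATE TABLE employee (
        emp_id   INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        dept_id  INTEGER NOT NULL REFERENCES department (dept_id)
    )
""")

conn.execute("INSERT INTO department VALUES (10, 'Finance')")
conn.execute("INSERT INTO employee VALUES (1, 'Asha', 10)")  # parent row exists


def try_insert(sql):
    """Run one INSERT and report whether a key constraint rejected it."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"


# Same emp_id as an existing row: violates the primary key.
duplicate_pk = try_insert("INSERT INTO employee VALUES (1, 'Ravi', 10)")
# dept_id 99 has no matching department row: violates the foreign key.
orphan_fk = try_insert("INSERT INTO employee VALUES (2, 'Ravi', 99)")

print(duplicate_pk, orphan_fk)
```

Both inserts are rejected by the database itself, which is the point of declaring the keys rather than checking them in application code.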

8. What is a Surrogate Key?
A surrogate key is a substitute for the natural primary key. It is simply a unique identifier or number for each row that can be used as the table's primary key. The only requirement for a surrogate primary key is that it is unique for each row in the table. Data warehouses typically use a surrogate key (also known as an artificial or identity key) as the primary key of dimension tables.

9. What are Facts and Dimensions?
Fact: Facts are the metrics that business users use to make business decisions. Facts are generally numeric, and they cannot be used without their dimensions.
Dimension: Dimensions are the attributes that qualify facts. They give structure to the facts and provide different views of them.
Fact and dimension tables are linked by surrogate keys: each fact table has a surrogate key column that corresponds to a key in the dimension table.

10. Types of Dimensions.
Slowly Changing Dimensions: Attributes of a dimension that undergo changes over time. Whether the history of changes to a particular attribute should be preserved in the data warehouse depends on the business requirement. Such an attribute is called a Slowly Changing Attribute, and a dimension containing such an attribute is called a Slowly Changing Dimension.
Rapidly Changing Dimensions: A dimension attribute that changes frequently is a Rapidly Changing Attribute. If you don't need to track the changes, the Rapidly Changing Attribute is no problem. If you do need to track them, using a standard Slowly Changing Dimension technique can hugely inflate the size of the dimension. One solution is to move the attribute to its own dimension, with a separate foreign key in the fact table. This new dimension is called a Rapidly Changing Dimension.
Junk Dimensions: A junk dimension is a single table holding a combination of different, unrelated attributes, so that the fact table does not need a large number of foreign keys. Junk dimensions are often created to manage the foreign keys created by Rapidly Changing Dimensions.

Inferred Dimensions: While loading fact records, a dimension record may not yet be ready. One solution is to generate a surrogate key with NULL for all the other attributes. This should technically be called an inferred member, but it is often called an inferred dimension.
Conformed Dimensions: A dimension that is used in multiple locations is called a conformed dimension. A conformed dimension may be used with multiple fact tables in a single database, or across multiple data marts or data warehouses.
Degenerate Dimensions: A degenerate dimension is a dimension attribute that is stored as part of the fact table rather than in a separate dimension table. These are essentially dimension keys for which there are no other attributes. In a data warehouse, they are often used as the result of a drill-through query to analyze the source of an aggregated number in a report. You can use these values to trace back to transactions in the OLTP system.
Role Playing Dimensions: A role-playing dimension is one where the same dimension key, along with its associated attributes, can be joined to more than one foreign key in the fact table. For example, a fact table may include foreign keys for both Ship Date and Delivery Date. The same date dimension attributes apply to each foreign key, so you can join the same dimension table to both foreign keys. Here the date dimension takes multiple roles, mapping ship date as well as delivery date, hence the name.
Shrunken Dimensions: A shrunken dimension is a subset of another dimension. For example, the Orders fact table may include a foreign key for Product, but the Target fact table may include a foreign key only for Product Category, which is in the Product table but much less granular. Creating a smaller dimension table, with Product Category as its primary key, is one way of dealing with this situation of heterogeneous grain. If the Product dimension is snowflaked, there is probably already a separate table for Product Category, which can serve as the shrunken dimension.
Static Dimensions: Static dimensions are not extracted from the original data source but are created within the context of the data warehouse. A static dimension can be loaded manually (for example, with status codes) or generated by a procedure (such as a Date or Time dimension).

11. Types of Facts.
Additive: Additive facts are facts that can be summed up through all of the dimensions in the fact table. A sales fact is a good example of an additive fact.

Semi-Additive: Semi-additive facts are facts that can be summed up for some of the dimensions in the fact table, but not the others. Eg: A daily balances fact can be summed up through the customer dimension but not through the time dimension.
Non-Additive: Non-additive facts are facts that cannot be summed up for any of the dimensions present in the fact table. Eg: Facts that hold calculated percentages or ratios.
Factless Fact Table: In the real world, it is possible to have a fact table that contains no measures or facts. These tables are called factless fact tables. Eg: A fact table that has only a product key and a date key is a factless fact table. There are no measures in this table, but you can still get the number of products sold over a period of time.

Based on the above classifications, fact tables are categorized into two types:
Cumulative: This type of fact table describes what has happened over a period of time. For example, it may describe the total sales by product by store by day. The facts in this type of fact table are mostly additive. The first example presented here is a cumulative fact table.
Snapshot: This type of fact table describes the state of things at a particular instant of time, and usually includes more semi-additive and non-additive facts. The second example presented here is a snapshot fact table.

12. What are SCDs (Slowly Changing Dimensions) and their types?
Slowly Changing Dimensions (SCD) are dimensions that change slowly over time, rather than on a regular, time-based schedule. In a data warehouse there is a need to track changes in dimension attributes in order to report historical data. In other words, implementing one of the SCD types should enable users to assign the proper dimension attribute value for a given date. Examples of such dimensions are customer, geography, and employee.
There are many approaches to dealing with SCDs. The most popular are:
Type 0 - The passive method
Type 1 - Overwriting the old value
Type 2 - Creating a new additional record
Type 3 - Adding a new column

Type 4 - Using historical table
Type 6 - Combine approaches of types 1, 2, 3 (1+2+3=6)

Type 0 - The passive method. In this method no special action is performed upon dimensional changes. Some dimension data can remain the same as when it was first inserted; other data may be overwritten.

Type 1 - Overwriting the old value. In this method no history of dimension changes is kept in the database. The old dimension value is simply overwritten by the new one. This type is easy to maintain and is often used for data whose changes are caused by processing corrections (e.g. removing special characters, correcting spelling errors).

Before the change:
Customer_ID  Customer_Name  Customer_Type
1            Cust_1         Corporate

After the change:
Customer_ID  Customer_Name  Customer_Type
1            Cust_1         Retail

Type 2 - Creating a new additional record. In this method all history of dimension changes is kept in the database. You capture an attribute change by adding a new row with a new surrogate key to the dimension table. Both the prior and the new rows contain the natural key (or other durable identifier) as an attribute. 'Effective date' and 'current indicator' columns are also used in this method. Only one record can have the current indicator set to 'Y'. For the 'effective date' columns, i.e. start_date and end_date, the end_date of the current record is usually set to a fixed far-future value. Introducing changes to the dimensional model in type 2 can be a very expensive database operation, so it is not recommended for dimensions where a new attribute could be added in the future.

Before the change:
Customer_ID  Customer_Name  Customer_Type  Start_Date  End_Date  Current_Flag
1            Cust_1         Corporate                            Y

After the change:
Customer_ID  Customer_Name  Customer_Type  Start_Date  End_Date  Current_Flag
1            Cust_1         Corporate                            N
2            Cust_1         Retail                               Y

Type 3 - Adding a new column. In this type usually only the current and previous values of the dimension are kept in the database. The new value is loaded into the 'current/new' column and the old one into the 'old/previous' column. Generally speaking, the history is limited to the number of columns created for storing historical data. This is the least commonly needed technique.

Before the change:
Customer_ID  Customer_Name  Current_Type  Previous_Type
1            Cust_1         Corporate     Corporate

After the change:
Customer_ID  Customer_Name  Current_Type  Previous_Type
1            Cust_1         Retail        Corporate

Type 4 - Using historical table. In this method a separate historical table is used to track all historical changes to a dimension's attributes, with one such table for each dimension. The 'main' dimension table keeps only the current data, e.g. customer and customer_history tables.

Current table:
Customer_ID  Customer_Name  Customer_Type
1            Cust_1         Corporate

Historical table:
Customer_ID  Customer_Name  Customer_Type  Start_Date  End_Date
1            Cust_1         Retail
1            Cust_1         Other
1            Cust_1         Corporate

Type 6 - Combine approaches of types 1, 2, 3 (1+2+3=6). In this type the dimension table has these additional columns:
current_type - keeps the current value of the attribute. All history records for a given item have the same current value.
historical_type - keeps the historical value of the attribute. History records for a given item can have different values.
start_date - keeps the start of the 'effective date' range of the attribute's history.
end_date - keeps the end of the 'effective date' range of the attribute's history.
current_flag - marks the most recent record.
In this method, to capture an attribute change we add a new record as in type 2. The current_type information is overwritten with the new value as in type 1. We store the history in the historical_type column as in type 3.

Customer_ID  Customer_Name  Current_Type  Historical_Type  Start_Date  End_Date  Current_Flag
1            Cust_1         Corporate     Retail                                 N
2            Cust_1         Corporate     Other                                  N
3            Cust_1         Corporate     Corporate                              Y

13. What are OLTP and OLAP?
OLTP (On-line Transaction Processing) is characterized by a large number of short on-line transactions (INSERT, UPDATE, and DELETE). The main emphasis for OLTP systems is on very fast query processing, on maintaining data integrity in multi-access environments, and on effectiveness measured by the number of transactions per second. An OLTP database holds detailed and current data, and the schema used to store transactional data is the entity model (usually 3NF).
OLAP (On-line Analytical Processing) is characterized by a relatively low volume of transactions. Queries are often very complex and involve aggregations. For OLAP systems, response time is the effectiveness measure. OLAP applications are widely used by data mining techniques. An OLAP database holds aggregated, historical data, stored in multi-dimensional schemas (usually a star schema).

The following table summarizes the major differences between OLTP and OLAP system design.

System
  OLTP: Online Transaction Processing (Operational System)
  OLAP: Online Analytical Processing (Data Warehouse)
Source of data
  OLTP: Operational data; OLTPs are the original source of the data.
  OLAP: Consolidation data; OLAP data comes from the various OLTP databases.
Purpose of data
  OLTP: To control and run fundamental business tasks.
  OLAP: To help with planning, problem solving, and decision support.
What the data reveals
  OLTP: A snapshot of ongoing business processes.
  OLAP: Multi-dimensional views of various kinds of business activities.
Inserts and updates
  OLTP: Short and fast inserts and updates initiated by end users.
  OLAP: Periodic long-running batch jobs refresh the data.
Queries
  OLTP: Relatively standardized and simple queries returning relatively few records.
  OLAP: Often complex queries involving aggregations.
Processing speed
  OLTP: Typically very fast.
  OLAP: Depends on the amount of data involved; batch data refreshes and complex queries may take many hours; query speed can be improved by creating indexes.
Space requirements
  OLTP: Can be relatively small if historical data is archived.
  OLAP: Larger due to the existence of aggregation structures and history data; requires more indexes than OLTP.
Database design
  OLTP: Highly normalized with many tables.
  OLAP: Typically de-normalized with fewer tables; use of star and/or snowflake schemas.
Backup and recovery
  OLTP: Backup religiously; operational data is critical to run the business, and data loss is likely to entail significant monetary loss and legal liability.
  OLAP: Instead of regular backups, some environments may consider simply reloading the OLTP data as a recovery method.

14. What is Grain of Fact or Fact Granularity?
The level of granularity is the level of detail that you put into the fact table in a data warehouse. For example, based on the design you can decide to store the sales data for each transaction. The level of granularity then determines how much detail you keep for each transactional fact: product sales recorded for each individual sale, or aggregated up to the minute before being stored. It also means that data aggregated for a year for a given product can be drilled down to a monthly, weekly, and daily basis. The lowest level is known as the grain; going down to the details is granularity.

15. STAR Schema and Snowflake Schema?
STAR Schema: A star schema has a single fact table connected to dimension tables, and it is visualized as a star. In a star schema only one link establishes the relationship between the fact table and any of the dimension tables.
SNOW-FLAKE Schema: A snowflake schema is an extension of the star schema. In this model, dimension tables are not necessarily fully flattened. Very large dimension tables are normalized into multiple sub-dimension tables, which is why the model is used when a dimension table becomes very big. A dimension table is associated with its sub-dimension tables, so there are multiple links between the fact table and the outermost tables.
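A star schema and its grain are easier to picture with a tiny example. The sketch below assumes Python's built-in sqlite3 module; the table names (fact_sales, dim_product, dim_date), the columns, and the sample figures are all hypothetical and only illustrate the shape of the model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: one row per product and one row per calendar day.
    CREATE TABLE dim_product (
        product_key  INTEGER PRIMARY KEY,
        product_name TEXT,
        category     TEXT
    );
    CREATE TABLE dim_date (
        date_key      INTEGER PRIMARY KEY,
        calendar_date TEXT,
        month         TEXT
    );
    -- Fact table. Grain: one row per product per day.
    CREATE TABLE fact_sales (
        product_key  INTEGER REFERENCES dim_product (product_key),
        date_key     INTEGER REFERENCES dim_date (date_key),
        sales_amount INTEGER
    );
    INSERT INTO dim_product VALUES
        (1, 'Pen', 'Stationery'), (2, 'Notebook', 'Stationery'), (3, 'Mug', 'Kitchen');
    INSERT INTO dim_date VALUES
        (20240101, '2024-01-01', '2024-01'), (20240102, '2024-01-02', '2024-01');
    INSERT INTO fact_sales VALUES
        (1, 20240101, 10), (2, 20240101, 25), (3, 20240101, 8),
        (1, 20240102, 5),  (3, 20240102, 12);
""")

# Each dimension is exactly one join away from the fact table (the star shape).
# Summing sales_amount rolls the daily grain up to category and month.
sales_by_category = conn.execute("""
    SELECT p.category, d.month, SUM(f.sales_amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    JOIN dim_date    AS d ON d.date_key    = f.date_key
    GROUP BY p.category, d.month
    ORDER BY p.category
""").fetchall()

print(sales_by_category)  # [('Kitchen', '2024-01', 20), ('Stationery', '2024-01', 40)]
```

In a snowflake variant, category would move out of dim_product into its own table, and the query would need one more join to reach it.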

16. What are Transformations and their types?
The transformation types (as named in Informatica) are:
- Aggregator Transformation
- Application Source Qualifier Transformation
- Custom Transformation
- Data Masking Transformation
- Expression Transformation
- External Procedure Transformation
- Filter Transformation
- HTTP Transformation
- Input Transformation
- Java Transformation
- Joiner Transformation
- Lookup Transformation
- Normalizer Transformation
- Output Transformation
- Rank Transformation
- Reusable Transformation
- Router Transformation
- Sequence Generator Transformation
- Sorter Transformation
- Source Qualifier Transformation
- SQL Transformation
- Stored Procedure Transformation
- Transaction Control Transformation
- Union Transformation
- Unstructured Data Transformation
- Update Strategy Transformation
- XML Generator Transformation
- XML Parser Transformation
- XML Source Qualifier Transformation
- Advanced External Procedure Transformation
- External Transformation

17. What is the difference between OLAP tools and ETL tools?
An ETL tool is meant for extracting data from legacy systems and loading it into a specified database, with some data cleansing along the way. Eg: Informatica, DataStage, etc.
An OLAP tool is meant for reporting. In OLAP, data is available in a multidimensional model, so you can write simple queries to extract data from the database. Eg: Business Objects, Cognos, etc.

18. Explain these terms: Session, Worklet, Mapplet and Workflow.
Mapplet: It arranges or creates sets of transformations.
Worklet: It represents a specific set of tasks.
Workflow: It is a set of instructions that tell the server how to execute tasks.
Session: It is a set of parameters that tell the server how to move data from sources to target.

19. What are Data Warehousing and Data Mining?
Data mining is the process of finding patterns in a given data set. These patterns can often provide meaningful and insightful information to whoever is interested in that data. Data mining is used today in a wide variety of contexts: in fraud detection, as an aid in marketing campaigns, and even by supermarkets to study their consumers.
Data warehousing can be described as the process of centralizing or aggregating data from multiple sources into one common repository.

20. What are the basic checks done during ETL Testing?
Reconciliation testing: This is sometimes also referred to as source-to-target count testing. The check compares the record counts of source and target. It is not the most thorough check, but it helps when time is short.

Constraint testing: Here the test engineer maps data from source to target and identifies whether the data is mapped or not. The key checks are: UNIQUE, NULL, NOT NULL, Primary Key, Foreign Key, DEFAULT, CHECK.
Validation testing (source to target data): This is generally executed in mission-critical or financial projects. Here the test engineer validates each data point and matches source data to target data.
Testing for duplicate check: This is done to ensure that there are no duplicate values in unique columns. Duplicate data can arise for many reasons, such as a missing primary key.
Testing for attribute check: To check that all attributes of the source system are present in the target table.
Logical or transformation testing: To test for any logical gaps. Depending on the scenario, the following methods can be used: boundary value analysis, equivalence partitioning, comparison testing, error guessing, or sometimes graph-based testing methods. It also covers testing of look-up conditions.
Incremental and historical data testing: To check the integrity of old and new data when new data is added. It also covers the validation of scenarios related to the purging policy.
GUI / navigation testing: To check the navigation or GUI aspects of the front-end reports.
Note: In ETL or data warehouse testing, re-testing and regression testing are also part of this effort. Their concept and definition remain the same.
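Three of the checks from question 20 (reconciliation, duplicate check, and source-to-target validation) can be expressed as short queries. The following is a sketch only, assuming Python's built-in sqlite3 module and hypothetical src_customer and tgt_customer tables; the target is deliberately loaded incorrectly to show why a count comparison alone is not enough.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE src_customer (customer_id INTEGER, customer_name TEXT);
    CREATE TABLE tgt_customer (customer_id INTEGER, customer_name TEXT);
    INSERT INTO src_customer VALUES (1, 'Cust_1'), (2, 'Cust_2'), (3, 'Cust_3');
    -- Deliberately faulty load: row 3 is missing and row 2 is loaded twice.
    INSERT INTO tgt_customer VALUES (1, 'Cust_1'), (2, 'Cust_2'), (2, 'Cust_2');
""")


def scalar(sql):
    """Return the single value produced by a one-row, one-column query."""
    return conn.execute(sql).fetchone()[0]


# Reconciliation testing: compare source and target record counts.
counts_match = (
    scalar("SELECT COUNT(*) FROM src_customer")
    == scalar("SELECT COUNT(*) FROM tgt_customer")
)

# Duplicate check: key values that occur more than once in the target.
duplicates = conn.execute("""
    SELECT customer_id, COUNT(*)
    FROM tgt_customer
    GROUP BY customer_id
    HAVING COUNT(*) > 1
""").fetchall()

# Validation testing: source rows that have no identical row in the target.
missing_in_target = conn.execute("""
    SELECT customer_id, customer_name FROM src_customer
    EXCEPT
    SELECT customer_id, customer_name FROM tgt_customer
""").fetchall()

print(counts_match)       # True, although the load is wrong
print(duplicates)         # [(2, 2)]
print(missing_in_target)  # [(3, 'Cust_3')]
```

The counts agree (3 and 3) even though one row is missing and another is duplicated, which is why the count check is best treated as a quick first pass rather than a full validation.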

21. What are Active and Passive Transformations? List various types of them.
A transformation is active if it changes something about the rows passing through it. Two kinds of change count:
- It changes the number of rows passing through it.
- It changes the order of the rows passing through it.
When a row enters a transformation, Informatica assigns it a row number. If this number changes for a row, the transformation is active. In other words, if the nth row coming in always goes out as the nth row, the transformation is passive.

Filter Transformation: The number of rows entering the transformation can differ from the number coming out, which satisfies the first criterion for an active transformation. If all the rows satisfy the filter condition, however, it behaves like a passive transformation.
Aggregator Transformation: The Aggregator transformation returns aggregate values based on the group-by ports. If there are duplicates on the group-by columns, it passes only the distinct records, so the rows coming in and going out differ and it acts as an active transformation. As with the Filter transformation, there is an exception: if there are no duplicates on the group-by ports, all the rows are passed.
Sorter Transformation: This transformation can satisfy both criteria for an active transformation. It can be configured to output only distinct rows, filtering out duplicates and sending the unique set. And because it sorts the data, the order of the rows changes, which satisfies the second criterion.
Union Transformation: This transformation is active only because of the second criterion. In the Union transformation the order of rows is not always the same as it came from the source. The Joiner transformation restricts the data flow from one source until it has all the data from the other source, so the order of the rows does not change. The Union transformation, on the other hand, does not restrict the flow of any data and keeps passing rows as it receives them. The order of the rows therefore keeps changing, which satisfies the second criterion for an active transformation.

Router Transformation: The Router transformation is active because its filter conditions change the number of rows between input and output. With multiple groups, a row that satisfies the condition of more than one group is sent to each of those output groups. For example, if 50 rows come in and there are 4 groups whose conditions are all TRUE, every group passes 50 rows, so a total of 200 rows comes out of the Router, making it an active transformation.

22. What is erroneous data in ETL testing?
Data that fails to satisfy the business rules is called erroneous data. This data is captured in an error table during transformation for further analysis.

Database related questions:

1. What are joins and its types?

SQL JOIN is a method to retrieve data from two or more database tables.
a. JOIN or INNER JOIN
b. OUTER JOIN
   i. LEFT OUTER JOIN or LEFT JOIN
   ii. RIGHT OUTER JOIN or RIGHT JOIN
   iii. FULL OUTER JOIN or FULL JOIN
c. NATURAL JOIN
d. CROSS JOIN
e. SELF JOIN

JOIN or INNER JOIN: In this kind of JOIN, we get all records that match the condition in both tables; records in either table that do not match are not reported. In other words, INNER JOIN is based on a single fact: ONLY the matching entries in BOTH tables are listed. Note that a JOIN without any other JOIN keyword (like INNER, OUTER, LEFT, etc.) is an INNER JOIN. In other words, JOIN is shorthand for INNER JOIN.

OUTER JOIN: OUTER JOIN retrieves either the matched rows from one table plus all rows in the other table, or all rows in all tables (whether or not there is a match). There are three kinds of outer join:
LEFT OUTER JOIN or LEFT JOIN: This join returns all the rows from the left table together with the matching rows from the right table. Where there is no matching row in the right table, it returns NULL values for the right table's columns.
RIGHT OUTER JOIN or RIGHT JOIN: This join returns all the rows from the right table together with the matching rows from the left table. Where there is no matching row in the left table, it returns NULL values for the left table's columns.
FULL OUTER JOIN or FULL JOIN: This join combines LEFT OUTER JOIN and RIGHT OUTER JOIN. It returns rows from both tables, matched where the condition is met, and fills in NULL values where there is no match.
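The difference between INNER JOIN and LEFT OUTER JOIN described above can be seen on a few rows of data. This is a sketch assuming Python's built-in sqlite3 module and hypothetical customer and orders tables. RIGHT and FULL OUTER JOIN are left out because older SQLite versions do not support them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER, customer_name TEXT);
    CREATE TABLE orders   (order_id INTEGER, customer_id INTEGER);
    INSERT INTO customer VALUES (1, 'Cust_1'), (2, 'Cust_2'), (3, 'Cust_3');
    -- Order 103 points at customer 9, which does not exist.
    INSERT INTO orders VALUES (100, 1), (101, 1), (102, 3), (103, 9);
""")

# INNER JOIN: only customers that have at least one matching order.
inner_rows = conn.execute("""
    SELECT c.customer_name, o.order_id
    FROM customer AS c
    INNER JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY c.customer_id, o.order_id
""").fetchall()

# LEFT OUTER JOIN: every customer; order_id is NULL (None) without a match.
left_rows = conn.execute("""
    SELECT c.customer_name, o.order_id
    FROM customer AS c
    LEFT OUTER JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY c.customer_id, o.order_id
""").fetchall()

print(inner_rows)  # [('Cust_1', 100), ('Cust_1', 101), ('Cust_3', 102)]
print(left_rows)   # [('Cust_1', 100), ('Cust_1', 101), ('Cust_2', None), ('Cust_3', 102)]
```

Cust_2 has no orders, so it appears only in the left join result. Order 103 appears in neither result, because the left table here is customer; it would show up in a RIGHT or FULL OUTER JOIN.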

In other words, OUTER JOIN is based on the fact that ALL entries from ONE of the tables (LEFT or RIGHT) or from BOTH tables (FULL) are listed, paired with matching entries where they exist. Note that OUTER JOIN is a loosened form of INNER JOIN.

NATURAL JOIN: It is based on two conditions: the JOIN is made on all columns with the same name, compared for equality, and duplicate columns are removed from the result. Several DBMSs (for example Oracle, MySQL, and PostgreSQL) support it, but SQL Server does not, and it is rarely used in practice because the join condition depends implicitly on column names.
CROSS JOIN: It is the Cartesian product of the two tables involved. The result of a CROSS JOIN does not make sense in most situations, so it is rarely needed.
SELF JOIN: It is not a different form of JOIN; rather, it is a JOIN (INNER, OUTER, etc.) of a table to itself.

JOINs based on Operators
Depending on the operator used in the JOIN clause, there are two types of JOINs: Equi JOIN and Theta JOIN.
Equi JOIN: For whatever JOIN type (INNER, OUTER, etc.), if we use ONLY the equality operator (=), then the JOIN is an EQUI JOIN.
Theta JOIN: This is the same as an EQUI JOIN, but it allows other comparison operators such as >, <, >=, etc.

2. What is the difference between the delete, truncate and drop commands?
DELETE: The DELETE command is used to remove rows from a table. A WHERE clause can be used to remove only some rows. If no WHERE condition is specified, all rows are removed. After performing a DELETE operation you need to COMMIT or ROLLBACK the transaction to make the change permanent or to undo it. Note that this operation causes all DELETE triggers on the table to fire.
TRUNCATE: TRUNCATE removes all rows from a table. In Oracle the operation cannot be rolled back, and no triggers are fired; some other databases, such as PostgreSQL and SQL Server, allow a TRUNCATE inside a transaction to be rolled back. As such, TRUNCATE is faster and doesn't use as much undo space as a DELETE.

DROP: The DROP command removes a table from the database. All of the table's rows, indexes and privileges are also removed. No DML triggers are fired. The operation cannot be rolled back (in Oracle a dropped table may still be recoverable with Flashback; see question 9).

3. What are constraints and the different types of them?
Constraints enable the RDBMS to enforce the integrity of the database automatically, without requiring you to create triggers, rules or defaults.
Types of constraints:
- PRIMARY KEY
- UNIQUE
- FOREIGN KEY
- CHECK
- NOT NULL

A PRIMARY KEY constraint is a unique identifier for a row within a database table. Every table should have a primary key constraint to uniquely identify each row, and only one primary key constraint can be created for each table. Primary key constraints are used to enforce entity integrity.
A UNIQUE constraint enforces the uniqueness of the values in a set of columns, so no duplicate values are entered. Like primary key constraints, unique constraints are used to enforce entity integrity.
A FOREIGN KEY constraint prevents any action that would destroy the link between tables with corresponding data values. A foreign key in one table points to a primary key in another table. Foreign keys prevent actions that would leave rows with foreign key values for which no primary key with that value exists. Foreign key constraints are used to enforce referential integrity.
A CHECK constraint is used to limit the values that can be placed in a column. Check constraints are used to enforce domain integrity.
A NOT NULL constraint enforces that the column will not accept null values. Like check constraints, not null constraints are used to enforce domain integrity.
You can create constraints when the table is created, as part of the table definition in the CREATE TABLE statement.
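As a rough illustration of constraints declared in a CREATE TABLE statement, the sketch below uses Python's built-in sqlite3 module and a hypothetical account table. Each INSERT is chosen to violate exactly one constraint, and the exact error text varies between databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: every constraint is part of the table definition.
conn.execute("""
    CREATE TABLE account (
        account_id INTEGER PRIMARY KEY,
        email      TEXT    NOT NULL UNIQUE,
        balance    INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.execute("INSERT INTO account VALUES (1, 'a@example.com', 100)")


def violation(sql):
    """Return the database's error text if sql breaks a constraint, else None."""
    try:
        conn.execute(sql)
        return None
    except sqlite3.IntegrityError as exc:
        return str(exc)


results = {
    # Same email as account 1: breaks UNIQUE.
    "UNIQUE": violation("INSERT INTO account VALUES (2, 'a@example.com', 50)"),
    # Missing email: breaks NOT NULL.
    "NOT NULL": violation("INSERT INTO account VALUES (3, NULL, 50)"),
    # Negative balance: breaks CHECK.
    "CHECK": violation("INSERT INTO account VALUES (4, 'b@example.com', -1)"),
    # Satisfies every constraint, so no error is returned.
    "valid row": violation("INSERT INTO account VALUES (5, 'c@example.com', 0)"),
}

for name, message in results.items():
    print(name, "->", message)
```

Only the last row is stored; the other three are refused by the database without any trigger or application-side check.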

4. What does UNION do? What is the difference between UNION and UNION ALL?
UNION merges the contents of two structurally compatible tables into a single combined table. The difference between UNION and UNION ALL is that UNION omits duplicate records, whereas UNION ALL includes them.
The performance of UNION ALL is typically better than that of UNION, since UNION requires the server to do the additional work of removing duplicates. So, in cases where it is certain that there will not be any duplicates, or where having duplicates is not a problem, UNION ALL is recommended for performance reasons.

5. Write a query to find the nth highest salary.
    SELECT MAX(salary)
    FROM (SELECT emp.salary,
                 ROW_NUMBER() OVER (ORDER BY emp.salary DESC) rno
          FROM Employee emp) a
    WHERE rno = n;
OR
    SELECT salary
    FROM (SELECT emp.salary,
                 ROW_NUMBER() OVER (ORDER BY emp.salary DESC) rno
          FROM Employee emp) a
    WHERE rno = n
    ORDER BY salary DESC;
Note: In WHERE rno = n, replace n with the position you want to find. ROW_NUMBER() numbers tied salaries separately; to find the nth distinct salary, use DENSE_RANK() in place of ROW_NUMBER() in the first query.

6. Write a query to find duplicate records in a table.
    SELECT <column_name>, COUNT(*)
    FROM <Table_Name>
    GROUP BY <column_name>
    HAVING COUNT(*) > 1;

7. Write a query to display unique records from a table.
    SELECT DISTINCT Table_Name.*
    FROM Table_Name;

8. What is the difference between ROW_NUMBER(), RANK() and DENSE_RANK()?
ROW_NUMBER(): This function assigns a unique sequential number to each row returned by the query.
RANK(): This function assigns the same number to rows with equal values, and leaves a gap in the numbering after each group of ties.
DENSE_RANK(): This function is similar to RANK(), with the only difference that it does not leave gaps between groups.
Column legend for the example output (DEPTNO, RN, R, DR): RN = ROW_NUMBER(), R = RANK(), DR = DENSE_RANK().

9. How to retrieve a dropped table from the database?
Use the Flashback command (Oracle).
Eg: FLASHBACK TABLE <TABLE_NAME> TO BEFORE DROP;

NOTE: helpful for SQL Queries.
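To make the ranking functions from question 8 concrete, and to show how ties affect the nth-highest-salary query from question 5, here is a sketch using Python's built-in sqlite3 module. It assumes a SQLite build with window-function support (3.25 or later) and a hypothetical four-row employee table; nth_highest_salary is an illustrative helper that uses DENSE_RANK() rather than the ROW_NUMBER() of the original queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_name TEXT, salary INTEGER);
    -- B and C are tied on salary.
    INSERT INTO employee VALUES ('A', 500), ('B', 400), ('C', 400), ('D', 300);
""")

# emp_name is added to the ROW_NUMBER ordering only to make the tie-break
# between B and C deterministic.
ranked = conn.execute("""
    SELECT emp_name, salary,
           ROW_NUMBER() OVER (ORDER BY salary DESC, emp_name) AS rn,
           RANK()       OVER (ORDER BY salary DESC)           AS r,
           DENSE_RANK() OVER (ORDER BY salary DESC)           AS dr
    FROM employee
    ORDER BY rn
""").fetchall()


def nth_highest_salary(n):
    """Return the nth highest distinct salary, or None if there is none."""
    row = conn.execute("""
        SELECT MAX(salary)
        FROM (SELECT salary,
                     DENSE_RANK() OVER (ORDER BY salary DESC) AS dr
              FROM employee)
        WHERE dr = ?
    """, (n,)).fetchone()
    return row[0]


for row in ranked:
    print(row)
# ('A', 500, 1, 1, 1)
# ('B', 400, 2, 2, 2)
# ('C', 400, 3, 2, 2)
# ('D', 300, 4, 4, 3)

print(nth_highest_salary(3))  # 300
```

RANK() jumps from 2 to 4 after the tie, while DENSE_RANK() continues with 3. With ROW_NUMBER(), row 3 would be the tied salary of 400, which is why DENSE_RANK() is used here to find the third highest distinct salary.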


Website: Contact: / Classroom Corporate Online Informatica Syllabus

Website:  Contact: / Classroom Corporate Online Informatica Syllabus Designer Guide: Using the Designer o Configuring Designer Options o Using Toolbars o Navigating the Workspace o Designer Tasks o Viewing Mapplet and Mapplet Reports Working with Sources o Working with

More information

1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda

1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda Agenda Oracle9i Warehouse Review Dulcian, Inc. Oracle9i Server OLAP Server Analytical SQL Mining ETL Infrastructure 9i Warehouse Builder Oracle 9i Server Overview E-Business Intelligence Platform 9i Server:

More information

ETL (Extraction Transformation & Loading) Testing Training Course Content

ETL (Extraction Transformation & Loading) Testing Training Course Content 1 P a g e ETL (Extraction Transformation & Loading) Testing Training Course Content o Data Warehousing Concepts BY Srinivas Uttaravilli What are Data and Information and difference between Data and Information?

More information

This tutorial will help computer science graduates to understand the basic-to-advanced concepts related to data warehousing.

This tutorial will help computer science graduates to understand the basic-to-advanced concepts related to data warehousing. About the Tutorial A data warehouse is constructed by integrating data from multiple heterogeneous sources. It supports analytical reporting, structured and/or ad hoc queries and decision making. This

More information

ETL TESTING TRAINING

ETL TESTING TRAINING ETL TESTING TRAINING Retrieving Data using the SQL SELECT Statement Capabilities of the SELECT statement Arithmetic expressions and NULL values in the SELECT statement Column aliases Use of concatenation

More information

Call: SAS BI Course Content:35-40hours

Call: SAS BI Course Content:35-40hours SAS BI Course Content:35-40hours Course Outline SAS Data Integration Studio 4.2 Introduction * to SAS DIS Studio Features of SAS DIS Studio Tasks performed by SAS DIS Studio Navigation to SAS DIS Studio

More information

Data Warehousing. Overview

Data Warehousing. Overview Data Warehousing Overview Basic Definitions Normalization Entity Relationship Diagrams (ERDs) Normal Forms Many to Many relationships Warehouse Considerations Dimension Tables Fact Tables Star Schema Snowflake

More information

Call: Datastage 8.5 Course Content:35-40hours Course Outline

Call: Datastage 8.5 Course Content:35-40hours Course Outline Datastage 8.5 Course Content:35-40hours Course Outline Unit -1 : Data Warehouse Fundamentals An introduction to Data Warehousing purpose of Data Warehouse Data Warehouse Architecture Operational Data Store

More information

1. Attempt any two of the following: 10 a. State and justify the characteristics of a Data Warehouse with suitable examples.

1. Attempt any two of the following: 10 a. State and justify the characteristics of a Data Warehouse with suitable examples. Instructions to the Examiners: 1. May the Examiners not look for exact words from the text book in the Answers. 2. May any valid example be accepted - example may or may not be from the text book 1. Attempt

More information

Data about data is database Select correct option: True False Partially True None of the Above

Data about data is database Select correct option: True False Partially True None of the Above Within a table, each primary key value. is a minimal super key is always the first field in each table must be numeric must be unique Foreign Key is A field in a table that matches a key field in another

More information

A Novel Approach of Data Warehouse OLTP and OLAP Technology for Supporting Management prospective

A Novel Approach of Data Warehouse OLTP and OLAP Technology for Supporting Management prospective A Novel Approach of Data Warehouse OLTP and OLAP Technology for Supporting Management prospective B.Manivannan Research Scholar, Dept. Computer Science, Dravidian University, Kuppam, Andhra Pradesh, India

More information

Module 1.Introduction to Business Objects. Vasundhara Sector 14-A, Plot No , Near Vaishali Metro Station,Ghaziabad

Module 1.Introduction to Business Objects. Vasundhara Sector 14-A, Plot No , Near Vaishali Metro Station,Ghaziabad Module 1.Introduction to Business Objects New features in SAP BO BI 4.0. Data Warehousing Architecture. Business Objects Architecture. SAP BO Data Modelling SAP BO ER Modelling SAP BO Dimensional Modelling

More information

CS614 - Data Warehousing - Midterm Papers Solved MCQ(S) (1 TO 22 Lectures)

CS614 - Data Warehousing - Midterm Papers Solved MCQ(S) (1 TO 22 Lectures) CS614- Data Warehousing Solved MCQ(S) From Midterm Papers (1 TO 22 Lectures) BY Arslan Arshad Nov 21,2016 BS110401050 BS110401050@vu.edu.pk Arslan.arshad01@gmail.com AKMP01 CS614 - Data Warehousing - Midterm

More information

This module presents the star schema, an alternative to 3NF schemas intended for analytical databases.

This module presents the star schema, an alternative to 3NF schemas intended for analytical databases. Topic 3.3: Star Schema Design This module presents the star schema, an alternative to 3NF schemas intended for analytical databases. Star Schema Overview The star schema is a simple database architecture

More information

Pro Tech protechtraining.com

Pro Tech protechtraining.com Course Summary Description This course provides students with the skills necessary to plan, design, build, and run the ETL processes which are needed to build and maintain a data warehouse. It is based

More information

CHAPTER 3 Implementation of Data warehouse in Data Mining

CHAPTER 3 Implementation of Data warehouse in Data Mining CHAPTER 3 Implementation of Data warehouse in Data Mining 3.1 Introduction to Data Warehousing A data warehouse is storage of convenient, consistent, complete and consolidated data, which is collected

More information

Cognos also provides you an option to export the report in XML or PDF format or you can view the reports in XML format.

Cognos also provides you an option to export the report in XML or PDF format or you can view the reports in XML format. About the Tutorial IBM Cognos Business intelligence is a web based reporting and analytic tool. It is used to perform data aggregation and create user friendly detailed reports. IBM Cognos provides a wide

More information

Data Mining Concepts & Techniques

Data Mining Concepts & Techniques Data Mining Concepts & Techniques Lecture No. 01 Databases, Data warehouse Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology Jamshoro

More information

Designing Data Warehouses. Data Warehousing Design. Designing Data Warehouses. Designing Data Warehouses

Designing Data Warehouses. Data Warehousing Design. Designing Data Warehouses. Designing Data Warehouses Designing Data Warehouses To begin a data warehouse project, need to find answers for questions such as: Data Warehousing Design Which user requirements are most important and which data should be considered

More information

OLAP Introduction and Overview

OLAP Introduction and Overview 1 CHAPTER 1 OLAP Introduction and Overview What Is OLAP? 1 Data Storage and Access 1 Benefits of OLAP 2 What Is a Cube? 2 Understanding the Cube Structure 3 What Is SAS OLAP Server? 3 About Cube Metadata

More information

Introduction to DWH / BI Concepts

Introduction to DWH / BI Concepts SAS INTELLIGENCE PLATFORM CURRICULUM SAS INTELLIGENCE PLATFORM BI TOOLS 4.2 VERSION SAS BUSINESS INTELLIGENCE TOOLS - COURSE OUTLINE Practical Project Based Training & Implementation on all the BI Tools

More information

Basics of Dimensional Modeling

Basics of Dimensional Modeling Basics of Dimensional Modeling Data warehouse and OLAP tools are based on a dimensional data model. A dimensional model is based on dimensions, facts, cubes, and schemas such as star and snowflake. Dimension

More information

Rocky Mountain Technology Ventures

Rocky Mountain Technology Ventures Rocky Mountain Technology Ventures Comparing and Contrasting Online Analytical Processing (OLAP) and Online Transactional Processing (OLTP) Architectures 3/19/2006 Introduction One of the most important

More information

Star Schema Design (Additonal Material; Partly Covered in Chapter 8) Class 04: Star Schema Design 1

Star Schema Design (Additonal Material; Partly Covered in Chapter 8) Class 04: Star Schema Design 1 Star Schema Design (Additonal Material; Partly Covered in Chapter 8) Class 04: Star Schema Design 1 Star Schema Overview Star Schema: A simple database architecture used extensively in analytical applications,

More information

Guide Users along Information Pathways and Surf through the Data

Guide Users along Information Pathways and Surf through the Data Guide Users along Information Pathways and Surf through the Data Stephen Overton, Overton Technologies, LLC, Raleigh, NC ABSTRACT Business information can be consumed many ways using the SAS Enterprise

More information

CT75 DATA WAREHOUSING AND DATA MINING DEC 2015

CT75 DATA WAREHOUSING AND DATA MINING DEC 2015 Q.1 a. Briefly explain data granularity with the help of example Data Granularity: The single most important aspect and issue of the design of the data warehouse is the issue of granularity. It refers

More information

Data Strategies for Efficiency and Growth

Data Strategies for Efficiency and Growth Data Strategies for Efficiency and Growth Date Dimension Date key (PK) Date Day of week Calendar month Calendar year Holiday Channel Dimension Channel ID (PK) Channel name Channel description Channel type

More information

Oral Questions and Answers (DBMS LAB) Questions & Answers- DBMS

Oral Questions and Answers (DBMS LAB) Questions & Answers- DBMS Questions & Answers- DBMS https://career.guru99.com/top-50-database-interview-questions/ 1) Define Database. A prearranged collection of figures known as data is called database. 2) What is DBMS? Database

More information

Data Warehouse Testing. By: Rakesh Kumar Sharma

Data Warehouse Testing. By: Rakesh Kumar Sharma Data Warehouse Testing By: Rakesh Kumar Sharma Index...2 Introduction...3 About Data Warehouse...3 Data Warehouse definition...3 Testing Process for Data warehouse:...3 Requirements Testing :...3 Unit

More information

Advanced Data Management Technologies Written Exam

Advanced Data Management Technologies Written Exam Advanced Data Management Technologies Written Exam 02.02.2016 First name Student number Last name Signature Instructions for Students Write your name, student number, and signature on the exam sheet. This

More information

Aggregating Knowledge in a Data Warehouse and Multidimensional Analysis

Aggregating Knowledge in a Data Warehouse and Multidimensional Analysis Aggregating Knowledge in a Data Warehouse and Multidimensional Analysis Rafal Lukawiecki Strategic Consultant, Project Botticelli Ltd rafal@projectbotticelli.com Objectives Explain the basics of: 1. Data

More information

The Data Organization

The Data Organization C V I T F E P A O TM The Data Organization 1251 Yosemite Way Hayward, CA 94545 (510) 303-8868 rschoenrank@computer.org Business Intelligence Process Architecture By Rainer Schoenrank Data Warehouse Consultant

More information

UNIT-IV (Relational Database Language, PL/SQL)

UNIT-IV (Relational Database Language, PL/SQL) UNIT-IV (Relational Database Language, PL/SQL) Section-A (2 Marks) Important questions 1. Define (i) Primary Key (ii) Foreign Key (iii) unique key. (i)primary key:a primary key can consist of one or more

More information

1. Analytical queries on the dimensionally modeled database can be significantly simpler to create than on the equivalent nondimensional database.

1. Analytical queries on the dimensionally modeled database can be significantly simpler to create than on the equivalent nondimensional database. 1. Creating a data warehouse involves using the functionalities of database management software to implement the data warehouse model as a collection of physically created and mutually connected database

More information

Question Bank. 4) It is the source of information later delivered to data marts.

Question Bank. 4) It is the source of information later delivered to data marts. Question Bank Year: 2016-2017 Subject Dept: CS Semester: First Subject Name: Data Mining. Q1) What is data warehouse? ANS. A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile

More information

Data Warehouse and Data Mining

Data Warehouse and Data Mining Data Warehouse and Data Mining Lecture No. 04-06 Data Warehouse Architecture Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology

More information

Venezuela: Teléfonos: / Colombia: Teléfonos:

Venezuela: Teléfonos: / Colombia: Teléfonos: CONTENIDO PROGRAMÁTICO Moc 20761: Querying Data with Transact SQL Module 1: Introduction to Microsoft SQL Server This module introduces SQL Server, the versions of SQL Server, including cloud versions,

More information

Fig 1.2: Relationship between DW, ODS and OLTP Systems

Fig 1.2: Relationship between DW, ODS and OLTP Systems 1.4 DATA WAREHOUSES Data warehousing is a process for assembling and managing data from various sources for the purpose of gaining a single detailed view of an enterprise. Although there are several definitions

More information

A quality product by Brainheaters education solutions Pvt. Ltd. Brainheaters Notes. Revised (A.Y )

A quality product by Brainheaters education solutions Pvt. Ltd. Brainheaters Notes. Revised (A.Y ) A quality product by Brainheaters education solutions Pvt. Ltd Brainheaters Notes ADBMS IT Semester-5 Revised - 2012 (A.Y 2014-15) 2016-18 Proudly Powered by www.brainheaters.in MRP Rs. 70 The Goal Not

More information

Techno Expert Solutions An institute for specialized studies!

Techno Expert Solutions An institute for specialized studies! Getting Started Course Content of IBM Cognos Data Manger Identify the purpose of IBM Cognos Data Manager Define data warehousing and its key underlying concepts Identify how Data Manager creates data warehouses

More information

Data Warehouse and Mining

Data Warehouse and Mining Data Warehouse and Mining 1. is a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management decisions. A. Data Mining. B. Data Warehousing. C. Web Mining. D. Text

More information

Summary of Last Chapter. Course Content. Chapter 2 Objectives. Data Warehouse and OLAP Outline. Incentive for a Data Warehouse

Summary of Last Chapter. Course Content. Chapter 2 Objectives. Data Warehouse and OLAP Outline. Incentive for a Data Warehouse Principles of Knowledge Discovery in bases Fall 1999 Chapter 2: Warehousing and Dr. Osmar R. Zaïane University of Alberta Dr. Osmar R. Zaïane, 1999 Principles of Knowledge Discovery in bases University

More information

Sql Fact Constellation Schema In Data Warehouse With Example

Sql Fact Constellation Schema In Data Warehouse With Example Sql Fact Constellation Schema In Data Warehouse With Example Data Warehouse OLAP - Learn Data Warehouse in simple and easy steps using Multidimensional OLAP (MOLAP), Hybrid OLAP (HOLAP), Specialized SQL

More information

1 DATAWAREHOUSING QUESTIONS by Mausami Sawarkar

1 DATAWAREHOUSING QUESTIONS by Mausami Sawarkar 1 DATAWAREHOUSING QUESTIONS by Mausami Sawarkar 1) What does the term 'Ad-hoc Analysis' mean? Choice 1 Business analysts use a subset of the data for analysis. Choice 2: Business analysts access the Data

More information

Comparison of File System vs Database Systems (Limitations to File System)

Comparison of File System vs Database Systems (Limitations to File System) 5IT 2017-18 Subject: DataBase Management System (1 st Midterm) Marks: 10 Attempt all four questions, all questions carry equal marks Q.1 Compare the File System with DBMS. 1. Duplicate Data As all files

More information

Proceedings of the IE 2014 International Conference AGILE DATA MODELS

Proceedings of the IE 2014 International Conference  AGILE DATA MODELS AGILE DATA MODELS Mihaela MUNTEAN Academy of Economic Studies, Bucharest mun61mih@yahoo.co.uk, Mihaela.Muntean@ie.ase.ro Abstract. In last years, one of the most popular subjects related to the field of

More information

Complete. The. Reference. Christopher Adamson. Mc Grauu. LlLIJBB. New York Chicago. San Francisco Lisbon London Madrid Mexico City

Complete. The. Reference. Christopher Adamson. Mc Grauu. LlLIJBB. New York Chicago. San Francisco Lisbon London Madrid Mexico City The Complete Reference Christopher Adamson Mc Grauu LlLIJBB New York Chicago San Francisco Lisbon London Madrid Mexico City Milan New Delhi San Juan Seoul Singapore Sydney Toronto Contents Acknowledgments

More information

Database Vs. Data Warehouse

Database Vs. Data Warehouse Database Vs. Data Warehouse Similarities and differences Databases and data warehouses are used to generate different types of information. Information generated by both are used for different purposes.

More information

Database Management System Dr. S. Srinath Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No.

Database Management System Dr. S. Srinath Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No. Database Management System Dr. S. Srinath Department of Computer Science & Engineering Indian Institute of Technology, Madras Lecture No. # 5 Structured Query Language Hello and greetings. In the ongoing

More information

Data Integration and ETL with Oracle Warehouse Builder

Data Integration and ETL with Oracle Warehouse Builder Oracle University Contact Us: 1.800.529.0165 Data Integration and ETL with Oracle Warehouse Builder Duration: 5 Days What you will learn Participants learn to load data by executing the mappings or the

More information

Data Warehousing ETL. Esteban Zimányi Slides by Toon Calders

Data Warehousing ETL. Esteban Zimányi Slides by Toon Calders Data Warehousing ETL Esteban Zimányi ezimanyi@ulb.ac.be Slides by Toon Calders 1 Overview Picture other sources Metadata Monitor & Integrator OLAP Server Analysis Operational DBs Extract Transform Load

More information

Question: 1 What are some of the data-related challenges that create difficulties in making business decisions? Choose three.

Question: 1 What are some of the data-related challenges that create difficulties in making business decisions? Choose three. Question: 1 What are some of the data-related challenges that create difficulties in making business decisions? Choose three. A. Too much irrelevant data for the job role B. A static reporting tool C.

More information

IBM WEB Sphere Datastage and Quality Stage Version 8.5. Step-3 Process of ETL (Extraction,

IBM WEB Sphere Datastage and Quality Stage Version 8.5. Step-3 Process of ETL (Extraction, IBM WEB Sphere Datastage and Quality Stage Version 8.5 Step-1 Data Warehouse Fundamentals An Introduction of Data warehousing purpose of Data warehouse Data ware Architecture OLTP Vs Data warehouse Applications

More information

Benefits of Automating Data Warehousing

Benefits of Automating Data Warehousing Benefits of Automating Data Warehousing Introduction Data warehousing can be defined as: A copy of data specifically structured for querying and reporting. In most cases, the data is transactional data

More information

T-SQL Training: T-SQL for SQL Server for Developers

T-SQL Training: T-SQL for SQL Server for Developers Duration: 3 days T-SQL Training Overview T-SQL for SQL Server for Developers training teaches developers all the Transact-SQL skills they need to develop queries and views, and manipulate data in a SQL

More information

SQL Interview Questions

SQL Interview Questions SQL Interview Questions SQL stands for Structured Query Language. It is used as a programming language for querying Relational Database Management Systems. In this tutorial, we shall go through the basic

More information

IT DATA WAREHOUSING AND DATA MINING UNIT-2 BUSINESS ANALYSIS

IT DATA WAREHOUSING AND DATA MINING UNIT-2 BUSINESS ANALYSIS PART A 1. What are production reporting tools? Give examples. (May/June 2013) Production reporting tools will let companies generate regular operational reports or support high-volume batch jobs. Such

More information

Oracle 1Z0-640 Exam Questions & Answers

Oracle 1Z0-640 Exam Questions & Answers Oracle 1Z0-640 Exam Questions & Answers Number: 1z0-640 Passing Score: 800 Time Limit: 120 min File Version: 28.8 http://www.gratisexam.com/ Oracle 1Z0-640 Exam Questions & Answers Exam Name: Siebel7.7

More information

A Star Schema Has One To Many Relationship Between A Dimension And Fact Table

A Star Schema Has One To Many Relationship Between A Dimension And Fact Table A Star Schema Has One To Many Relationship Between A Dimension And Fact Table Many organizations implement star and snowflake schema data warehouse The fact table has foreign key relationships to one or

More information

Unit 10 Databases. Computer Concepts Unit Contents. 10 Operational and Analytical Databases. 10 Section A: Database Basics

Unit 10 Databases. Computer Concepts Unit Contents. 10 Operational and Analytical Databases. 10 Section A: Database Basics Unit 10 Databases Computer Concepts 2016 ENHANCED EDITION 10 Unit Contents Section A: Database Basics Section B: Database Tools Section C: Database Design Section D: SQL Section E: Big Data Unit 10: Databases

More information

Oracle SQL & PL SQL Course

Oracle SQL & PL SQL Course Oracle SQL & PL SQL Course Complete Practical & Real-time Training Job Support Complete Practical Real-Time Scenarios Resume Preparation Lab Access Training Highlights Placement Support Support Certification

More information

Data-Driven Driven Business Intelligence Systems: Parts I. Lecture Outline. Learning Objectives

Data-Driven Driven Business Intelligence Systems: Parts I. Lecture Outline. Learning Objectives Data-Driven Driven Business Intelligence Systems: Parts I Week 5 Dr. Jocelyn San Pedro School of Information Management & Systems Monash University IMS3001 BUSINESS INTELLIGENCE SYSTEMS SEM 1, 2004 Lecture

More information

DATA WAREHOUSE EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY

DATA WAREHOUSE EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY DATA WAREHOUSE EGCO321 DATABASE SYSTEMS KANAT POOLSAWASD DEPARTMENT OF COMPUTER ENGINEERING MAHIDOL UNIVERSITY CHARACTERISTICS Data warehouse is a central repository for summarized and integrated data

More information

Database Systems: Design, Implementation, and Management Tenth Edition. Chapter 7 Introduction to Structured Query Language (SQL)

Database Systems: Design, Implementation, and Management Tenth Edition. Chapter 7 Introduction to Structured Query Language (SQL) Database Systems: Design, Implementation, and Management Tenth Edition Chapter 7 Introduction to Structured Query Language (SQL) Objectives In this chapter, students will learn: The basic commands and

More information

The Evolution of Data Warehousing. Data Warehousing Concepts. The Evolution of Data Warehousing. The Evolution of Data Warehousing

The Evolution of Data Warehousing. Data Warehousing Concepts. The Evolution of Data Warehousing. The Evolution of Data Warehousing The Evolution of Data Warehousing Data Warehousing Concepts Since 1970s, organizations gained competitive advantage through systems that automate business processes to offer more efficient and cost-effective

More information

Information Management course

Information Management course Università degli Studi di Milano Master Degree in Computer Science Information Management course Teacher: Alberto Ceselli Lecture 05(b) : 23/10/2012 Data Mining: Concepts and Techniques (3 rd ed.) Chapter

More information

CHAPTER 8: ONLINE ANALYTICAL PROCESSING(OLAP)

CHAPTER 8: ONLINE ANALYTICAL PROCESSING(OLAP) CHAPTER 8: ONLINE ANALYTICAL PROCESSING(OLAP) INTRODUCTION A dimension is an attribute within a multidimensional model consisting of a list of values (called members). A fact is defined by a combination

More information

Migrating Mappings and Mapplets from a PowerCenter Repository to a Model Repository

Migrating Mappings and Mapplets from a PowerCenter Repository to a Model Repository Migrating Mappings and Mapplets from a PowerCenter Repository to a Model Repository 2016 Informatica LLC. No part of this document may be reproduced or transmitted in any form, by any means (electronic,

More information

Data Analysis. CPS352: Database Systems. Simon Miner Gordon College Last Revised: 12/13/12

Data Analysis. CPS352: Database Systems. Simon Miner Gordon College Last Revised: 12/13/12 Data Analysis CPS352: Database Systems Simon Miner Gordon College Last Revised: 12/13/12 Agenda Check-in NoSQL Database Presentations Online Analytical Processing Data Mining Course Review Exam II Course

More information

CUBE, ROLLUP, AND MATERIALIZED VIEWS: MINING ORACLE GOLD John Jay King, King Training Resources

CUBE, ROLLUP, AND MATERIALIZED VIEWS: MINING ORACLE GOLD John Jay King, King Training Resources CUBE, ROLLUP, AND MATERIALIZED VIEWS: MINING ORACLE GOLD John Jay, Training Resources Abstract: Oracle8i provides new features that reduce the costs of summary queries and provide easier summarization.

More information

Sql Server Syllabus. Overview

Sql Server Syllabus. Overview Sql Server Syllabus Overview This SQL Server training teaches developers all the Transact-SQL skills they need to create database objects like Tables, Views, Stored procedures & Functions and triggers

More information

Data Warehouses and Deployment

Data Warehouses and Deployment Data Warehouses and Deployment This document contains the notes about data warehouses and lifecycle for data warehouse deployment project. This can be useful for students or working professionals to gain

More information

Data Analysis and Data Science

Data Analysis and Data Science Data Analysis and Data Science CPS352: Database Systems Simon Miner Gordon College Last Revised: 4/29/15 Agenda Check-in Online Analytical Processing Data Science Homework 8 Check-in Online Analytical

More information

~ Ian Hunneybell: DWDM Revision Notes (31/05/2006) ~

~ Ian Hunneybell: DWDM Revision Notes (31/05/2006) ~ Useful reference: Microsoft Developer Network Library (http://msdn.microsoft.com/library). Drill down to Servers and Enterprise Development SQL Server SQL Server 2000 SDK Documentation Creating and Using

More information

Testing Masters Technologies

Testing Masters Technologies 1. What is Data warehouse ETL TESTING Q&A Ans: A Data warehouse is a subject oriented, integrated,time variant, non volatile collection of data in support of management's decision making process. Subject

More information

COGNOS (R) 8 GUIDELINES FOR MODELING METADATA FRAMEWORK MANAGER. Cognos(R) 8 Business Intelligence Readme Guidelines for Modeling Metadata

COGNOS (R) 8 GUIDELINES FOR MODELING METADATA FRAMEWORK MANAGER. Cognos(R) 8 Business Intelligence Readme Guidelines for Modeling Metadata COGNOS (R) 8 FRAMEWORK MANAGER GUIDELINES FOR MODELING METADATA Cognos(R) 8 Business Intelligence Readme Guidelines for Modeling Metadata GUIDELINES FOR MODELING METADATA THE NEXT LEVEL OF PERFORMANCE

More information

Oracle Endeca Information Discovery

Oracle Endeca Information Discovery Oracle Endeca Information Discovery Glossary Version 2.4.0 November 2012 Copyright and disclaimer Copyright 2003, 2013, Oracle and/or its affiliates. All rights reserved. Oracle and Java are registered

More information

Evolution of Database Systems

Evolution of Database Systems Evolution of Database Systems Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Intelligent Decision Support Systems Master studies, second

More information

Syllabus. Syllabus. Motivation Decision Support. Syllabus
