Overview. DW Performance Optimization. Aggregates. Aggregate Use Example

Size: px
Start display at page:

Download "Overview. DW Performance Optimization. Aggregates. Aggregate Use Example"

Transcription

1 Overview DW Performance Optimization Choosing aggregates Maintaining views Bitmapped indices Other optimization issues Original slides were written by Torben Bach Pedersen Aalborg University 07 - DWML course 2 Aggregates Observations DW queries are simple, follow same schema Aggregate (GROUP-BY) queries Idea Compute and store query results in advance (pre-aggregation) Example: store total sales per month and product Yields large performance improvements (e.g., factor 000) No need to store everything reuse whenever possible Example: quarterly total can be quickly computed from monthly total Prerequisites Tree-structured dimensions with fixed height Many-to-one relationships from fact to dimensions Facts mapped to bottom level in all dimensions Otherwise, re-use is not possible Aggregate Use Example Imagine billion sales rows, 000 products, 00 locations CREATE VIEW TotalSales (pid,locid,total) AS SELECT s.pid,s.locid,sum(s.sales) FROM Sales s GROUP BY s.pid,s.locid The materialized view has 00,000 rows Query rewritten to use view SELECT p.category,sum(s.sales) FROM Products p, Sales s WHERE p.pid=s.pid GROUP BY p.category Rewritten to SELECT p.category,sum(t.total) FROM Products p, TotalSales t WHERE p.pid=t.pid GROUP BY p.category Query becomes 0,000 times faster! Aalborg University 07 - DWML course 3 Aalborg University 07 - DWML course 4

2 Pre-Aggregation Choices Full pre-aggregation: (all combinations of levels) Fast query response Takes a lot of space/update time (0-500 times raw data) No pre-aggregation Slow query response (for terabytes ) Practical pre-aggregation: chosen combinations A good compromise between response time and space use Most (R)OLAP tools now support practical pre-aggregation IBM DB2 UDB Oracle 0g MS Analysis Services Using Aggregates Application: aggregates used via aggregate navigator Given a query, the best aggregate is found, and the query is rewritten to use it Done by the system, not by the user Traditionally done in middleware, e.g., ODBC Performed directly by DBMS Four design goals for aggregate usage Aggregates stored separately from detail data Shrunk dimensions mapped to aggregate facts Connection between aggregates and detail data known by system All queries (SQL) refers only detail data SUM, MIN, MAX, COUNT, AVG can all be handled The last one requires little trick Why? Aalborg University 07 - DWML course 5 Aalborg University 07 - DWML course 6 Choosing Aggregates Practical pre-aggregation, decide what aggregates to store Non-trivial (NP-complete) optimization problem Space use Update speed Response time demands Actual queries Index and/or aggregates Choose an aggregate if it is considerably smaller than available, usable aggregates (factor 3-5-0) Supported (semi)-automatically by DBMS Oracle, DB2, MS SQL Server Implementing Data Cubes Efficiently SIGMOD 96 paper Greedy approach: simple but effective Above DBMS now use similar, but more advanced techniques Data Cube The data cube stores multidimensional GROUP BY relations of tables in data warehouses Red White Blue By make Group By (with total) By color Sum Red White Blue Aggregate Sum Database capable By make Data warehouse Cross Tab capable Chevy Ford By color Sum Ford Chevy By color and year By make and year Sum By make and color Aalborg University 07 - DWML course 7 Aalborg University 07 - DWML course 8

3 A Data Cube example Lattice Framework (I) 8 possible views for 3 dimensions. Each view gives the total sales as per that grouping. (II) 8 views organized into a Lattice. part, supplier, customer (6M rows) 2. part, customer (6M) 3. part, supplier (0.8M) 4. supplier, customer (6M) 5. part (0.2M) 6. supplier (0.0M) 7. customer (0.M) 8. none () psc 6M pc 6M ps 0.8M sc 6M p 0.2M s 0.0M c 0.M none 9 M rows total Scenario: A query asks for the sales of a part a) If view pc is available, need to process about 6M rows; OR b) If view p is available, need to process about 0.2M rows Questions: a) How many views to materialize to get reasonable performance? - and b) Given that we have space S, what views to materialize to minimize average query costs? (III) Picking the right views to materialize improve performance (IV) View pc and sc not needed Effective rows reduced from 9M to 7M Each lattice node represents a view / query Dependence relationship: Q Q2 Q is the descendant of Q2 Q can be answered using only the results of Q2 In other words, Q is dependent on Q2 The operator imposes a partial ordering on the queries Top view = base data (i.e., most detail) Essentially, the lattice captures dependency relationship among queries / views and can be represented by a lattice graph Aalborg University 07 - DWML course 9 Aalborg University 07 - DWML course 0 Cost Model Greedy Algorithm Cost of answering a query based on a view and it has major assumptions Model simple and realistic, Enable the design and analysis of powerful algorithms Greedy algorithm: each time choose the view with the maximum benefit Explanation: 2 3 Time to answer a query is equal to the space occupied by the query from which the query is answered All queries are identical to some queries in the given lattice The clustering of the materialized query and indexes have not been considered An illustration: To answer query Q, choose an ancestor of Q, say, Qa, that has been materialized We thus need to process the table of Qa Cost of answering Q: number of rows in the table of Qa The Greedy algorithm S = {top view} for i= to k do begin Select that view v not in S such that B(v,S) is maximized; S = S union {v}; Given a data cube lattice with space costs associated with each view Always include the top view because it cannot be generated from other views Suppose we may only select k number of views in addition to the top view The benefit of view v (relative to S), is based on how v can improve the costs of evaluating views, including itself Experimental validation of the cost model: almost linear relationship between size and running time Source Size S Time T Ratio m From cell itself From view s 0, From view ps 0.8M end; return S; The total benefit of v is the sum over all views w of the benefit of using v to evaluate w, providing that benefit is positive From view psc 6M Aalborg University 07 - DWML course Aalborg University 07 - DWML course 2

4 Greedy Algorithm an example Greedy Algorithm vs optimal choice 50 b 00 a 30 C 75 Round 2: views b, d, e, g, h start with cost 50 and the other views start with cost 00 Which queries can be improved by view f? Benefit of view f? Ans: (00-40)+(50-40)=70 Situations where the algorithm does poorly Round : Picks c whose benefit is 44 Round 2: Picks b or d with benefit of 200 each e d f 40 g h 0 Lattice with space costs Initial: each query answered by a, with cost 00 Which queries can be answered by view b? What is the benefit of view b? Ans: 5*(00-50)=250 Can we find a better view than view b? End of round : view b selected Benefits of possible choices at each round b c d e f g h Round Round 2 Round 3 50 x 5 = x 5 = x 2 = x 2 =40 60 x 2 = 99 x = x = x 2 = x 2 = 60 x 2 = = x = x = x = x 2 = 60 2 x + 0 =50 49 x = x = 30 After 3 rounds, total cost is 4 (initial cost 800) In summary, each round pick the view with the maximum benefit Nodes total b Nodes total a c 99 Nodes total 000 d 00 Nodes total 000 A Lattice where the greedy does poorly Greedy results in = 624 But, the optimal choice is to pick b and d b and d improve by 00 for itself and all 80 nodes below resulting in total benefits of 80 Ratio of greedy/optimal=624/80=76% but the benefit is at least 63% of the benefit of the optimal algorithm as reasoned by the authors Aalborg University 07 - DWML course 3 Aalborg University 07 - DWML course 4 Greedy algorithm space-time trade-off Optimal cases and anomalies Experiment with a lattice: Important to materialize only the first few (5) views, performance increases at first Greedy order of view selection example Selection Benefit Time Space cp ns nt c p cs np ct t n s none infinite 24M rows 2M 5.9M 5.8M M M 0.0M small small small small 72M rows 48M 36M 30.M 24.3M 6M rows 6M 6M 6.M 6.3M.3M 6.3M Time / Space Total Time Total Space Views Two situations where the algorithm is optimal If a is much larger than the other a s, the greedy is close to optimal If all the a s are equal then greedy is optimal but there are also two situations where the algorithm is not realistic Views in a lattice are unlikely to have the same probability of being requested in a query Instead of asking for some fixed number of views to materialize, should instead allocate a fixed amount of space to views Aalborg University 07 - DWML course 5 Aalborg University 07 - DWML course 6

5 Hypercube lattices some observations Space-optimal and time-optimal solutions The size of views grows exponentially, until it reaches the size of the raw data at rank [log r m], i.e. the cliff Assumptions and basis of reasoning Each domain size is r Top element has m cells appearing in raw data Is there sense to minimize time by materializing all views? Size of views Log r m m If group on i attributes, cube has r i cells If r i m, then each cell will have at most one data point and m of the cells with be nonempty. Use m as the size of any view. What is the average time for a query when the space is optimal? No gain past the cliff No point to do so n Number of group-by attributes Effect of the number of group-by attributes on the size of views * (p,s) does not have 6M rows because the benchmark made it so If r i <m, then almost all r i cells will have at least one data point. Space cost is r i as several data points can be collapsed into one aggregate Which explains why grouping of 2 attributes (p,c), (s,c) have the same size as (p,s,c) at 6 M rows* in Figure Space is minimized when only the top view is materialized Every query would take time m Total time cost for all 2 n queries is m2 n Nature of time-optimal solution depends on. Rank k = [log r m] at which the cliff occurs; and 2. Rank j, such that r j = ( n j) is maximized Aalborg University 07 - DWML course 7 Aalborg University 07 - DWML course 8 Hierarchies & the Lattice Framework Composite lattices Hierarchies are important as they underlie two commonly used query operations, Drilldown and Roll-up Dependencies caused by different dimensions and attribute hierarchies can be combined into a Direct Product lattice A common hierarchy and its dependency relations cp 6M Week Day none Month Year Year Month Day Week Day; but Month week; week month but, hierarchies introduce query dependencies that must be accounted for when determining which queries to materialize; and this can be complex customer 0.M nation 25 none + product 6M size type none Combining two hierarchical dimensions = s 50 p 0.2M none t 50 ns 250 np 5M n 25 cs 5M nt 3750 c 0.M A Direct Product lattice ct 5.99M assuming views can be created by independently grouping any or no member of the hierarchy for each of n dimensions Aalborg University 07 - DWML course 9 Aalborg University 07 - DWML course

6 Applicability of the Lattice Framework Summary on choosing aggregates Advantages of the lattice framework Clean framework A clean framework to work with dimensional hierarchies, since hierarchies are themselves lattices Easy to model dependencies Dependency relationship among queries captured by the lattice Order of materialization A simple descending-order topological sort on lattice nodes gives the required order of materialization Why?. Lattice framework 2. Linear cost model 3. Greedy algorithm (How it works?) 4. Space-time trade-off Aalborg University 07 - DWML course 2 Aalborg University 07 - DWML course 22 Overview Choosing aggregates Maintaining views Bitmapped indices Other optimization issues View Maintenance How and when should we refresh materialized views? Total re-computation Too expensive Incremental view maintenance Apply only changes since last refresh to view Insert new rows R i and delete existing rows R d Update = delete + insert Additional info must be stored in views To make the views self-maintainable Store number of derivations c v along with each row v in V Aalborg University 07 - DWML course 23 Aalborg University 07 - DWML course 24

7 View Maintenance Projection views (with DISTINCT in SQL) V=Π(R) (r,c): r appears c times in R If (r,c i ) Π(R i ) and (r,c r ) V then c r =c r +c i, otherwise insert (r,c i ) into V If (r,c d ) Π(R d ) and (r,c r ) V then c r =c r -c d, delete from V if c r =0 Join views V=R S Compute R i S and add to V, update counts Compute R d S and subtract from V, update counts Aggregation View Maintenance COUNT Maintain <g,count> Update count based on inserts/deletes Insert <g,> rows for new values, delete rows from V with count=0 SUM Maintain <g,sum,count> Update count and sum based on inserts/deletes Insert <g,a,> rows for new values, delete rows from V with count=0 AVG computed as sum/count MIN (similar for MAX) Maintain <g,minimum,count> Insert (g,m): update minimum,count based on m =,<,> minimum Delete (g,m): update minimum,count based on m =,<,> minimum If count=0, base table must be searched (expensive!) Aalborg University 07 - DWML course 25 Aalborg University 07 - DWML course 26 Practical View Maintenance When to synchronize? Immediate in same transaction as base change Lazy when V is used for the first time after base updates Periodic, e.g., once a day, often together with base load Forced after a certain number of changes Updating aggregates Computation outside DBMS in flat files (still relevant?) Built by loader Computation in DBMS using SQL Can be expensive: DBMS must be tuned for this Supported by tool/dbms Oracle, SQL Server, DB2 Indexing Index used in combination with aggregates Index on dimension tables and on materialized views Fact table Build primary B-tree index on dimension keys (primary key)? Build indices on each dimension key separately (index intersection) Indices on combinations of dimension keys? (many!) Sort order is important (index-organized tables) Compressing data can be possible (values not repeated) Can save aggregates due to fast sequential scan Best sort order (almost) always Time Dimension tables Build indices on many/all individual columns (think of type) Build indices on common combinations Hash indices Efficient for un-sorted data Aalborg University 07 - DWML course 27 Aalborg University 07 - DWML course 28

8 Bitmap Indices A B-tree index stores a list of RowIDs for each value A RowID takes ~8 bytes Large space use for columns with low cardinality (gender, color) Example: Index for bio. rows with gender takes 8 GB Not efficient to do index intersection for these columns Idea: make a position bitmap for each value (only two) Female: Male: Takes only (no. of values)*(no. of rows)* bit Example: bitmap index on gender (as before) takes only 256 MB Very efficient to do index intersection (AND/OR) on bitmaps Can be improved for for higher cardinality using compression Supported by some RDBMS (DB2, Oracle) Using Bitmap Indices Query example Find female customers in Aalborg with black hair and blue eyes Female: Aalborg: Black: 0000 Blue: 000 Result use AND, only one such customer Range queries can also be handled and Salary BETWEEN 0,000 AND 300, ,000: ,000: OR together: 000 Use as regular bitmap Aalborg University 07 - DWML course 29 Aalborg University 07 - DWML course 30 Compressed Bitmaps Problem: space use With m possible values and n records: n*m bits required However, probability of a is /m => very few s Solution: compressed bitmaps Run-length encoding A run is i 0 s followed by a Concatenating binary numbers won t work no unique decoding Instead, determine j number of bits in binary representation of i Encode as <j- s> <i in binary> Encode next run similarly, trailing 0 s not encoded Example: encoded as 0 j> => first bit of i is this bit can be saved encoded as 0 Decoding: scan bits to find j, scan next j- bits to find i, find next Example: j = 3, i = 7 () => bitmap = (trailing) Managing Bitmaps Compression factor Assume m=n (unique values) Each value has just one run of length i < n Each run takes at most 2 log 2 n bits (j <= log 2 n) Total space consumption: 2n log 2 n bits (compared to n 2 ) Operations on compressed bitmaps Decompress one run at a time and produce relevant s in output Finding and storing bitvectors Index with B-trees + store in blocks/block chains Finding records Use secondary or primary index (hashing or B-tree) Handling modifications Deletion: retire record number + update bitmaps with s Insertion: add new record to file + update bitmaps with s (trail 0 s) Updates: update bitmaps with old and new s (expensive) Aalborg University 07 - DWML course 3 Aalborg University 07 - DWML course 32

9 Physical Storage Partitioning Data stored in large lumps (partitions) Example: one partition per quarter Queries need only read the relevant partitions Can yield large performance improvements Operations on partitions are independent Creation, deletion, update, indexing Aggregation level can be different among partitions Column storage Data stored in columns, not in rows A different kind of partitioning Works well for typical DW queries (only few columns accessed) Supports good compression of data Physical Configuration RAID Gives (depending on level) error tolerance and improved read speed DW optimized for reads, not for writes DW well suited for, e.g., RAID5 (% redundancy) Disk type Small drives (with many controllers) are more expensive, but faster Large drives are cheaper, can store more aggregates for the same price Block size Large sequential reads faster with large blocks (32K) Scattered index reads faster with small blocks (4K) Memory RAM is cheap: buy a lot RAM caching must be per user session Monitoring user activity Can give feedback to, e.g., choice of aggregates Aalborg University 07 - DWML course 33 Aalborg University 07 - DWML course 34 DBMS Possibilities Aggregate navigation/use, Aggregate choice, Aggregate maintenance Oracle 0g, DB2 UDB, MS Analysis Services Using ordinary indices Oracle 0g, DB2 UDB, MS SQL Server can do star joins Bitmap indices Oracle 0g, DB2 UDB not yet in MS SQL Server Partitioning Oracle 0g, DB2 UDB, MS SQL Server+Analysis Services Column storage Redbrick (Informix->IBM->?) MOLAP/ROLAP/HOLAP Oracle 0g, DB2 UDB, MS SQL Server Mini Project New subtask DW performance optimization Build aggregates SQL Server Management Studio Connect to Analysis Services Find your cube and Aalborg University 07 - DWML course 35 Aalborg University 07 - DWML course 36

Advanced Data Management Technologies

Advanced Data Management Technologies ADMT 2017/18 Unit 13 J. Gamper 1/42 Advanced Data Management Technologies Unit 13 DW Pre-aggregation and View Maintenance J. Gamper Free University of Bozen-Bolzano Faculty of Computer Science IDSE Acknowledgements:

More information

Overview. Introduction to Data Warehousing and Business Intelligence. BI Is Important. What is Business Intelligence (BI)?

Overview. Introduction to Data Warehousing and Business Intelligence. BI Is Important. What is Business Intelligence (BI)? Introduction to Data Warehousing and Business Intelligence Overview Why Business Intelligence? Data analysis problems Data Warehouse (DW) introduction A tour of the coming DW lectures DW Applications Loosely

More information

Data Warehousing and Decision Support. Introduction. Three Complementary Trends. [R&G] Chapter 23, Part A

Data Warehousing and Decision Support. Introduction. Three Complementary Trends. [R&G] Chapter 23, Part A Data Warehousing and Decision Support [R&G] Chapter 23, Part A CS 432 1 Introduction Increasingly, organizations are analyzing current and historical data to identify useful patterns and support business

More information

Decision Support. Chapter 25. CS 286, UC Berkeley, Spring 2007, R. Ramakrishnan 1

Decision Support. Chapter 25. CS 286, UC Berkeley, Spring 2007, R. Ramakrishnan 1 Decision Support Chapter 25 CS 286, UC Berkeley, Spring 2007, R. Ramakrishnan 1 Introduction Increasingly, organizations are analyzing current and historical data to identify useful patterns and support

More information

Data Warehousing and Decision Support

Data Warehousing and Decision Support Data Warehousing and Decision Support [R&G] Chapter 23, Part A CS 4320 1 Introduction Increasingly, organizations are analyzing current and historical data to identify useful patterns and support business

More information

Data Warehousing and Decision Support

Data Warehousing and Decision Support Data Warehousing and Decision Support Chapter 23, Part A Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke 1 Introduction Increasingly, organizations are analyzing current and historical

More information

Improving the Performance of OLAP Queries Using Families of Statistics Trees

Improving the Performance of OLAP Queries Using Families of Statistics Trees Improving the Performance of OLAP Queries Using Families of Statistics Trees Joachim Hammer Dept. of Computer and Information Science University of Florida Lixin Fu Dept. of Mathematical Sciences University

More information

CS614 - Data Warehousing - Midterm Papers Solved MCQ(S) (1 TO 22 Lectures)

CS614 - Data Warehousing - Midterm Papers Solved MCQ(S) (1 TO 22 Lectures) CS614- Data Warehousing Solved MCQ(S) From Midterm Papers (1 TO 22 Lectures) BY Arslan Arshad Nov 21,2016 BS110401050 BS110401050@vu.edu.pk Arslan.arshad01@gmail.com AKMP01 CS614 - Data Warehousing - Midterm

More information

DW Performance Optimization (II)

DW Performance Optimization (II) DW Performance Optimization (II) Overview Data Cube in ROLAP and MOLAP ROLAP Technique(s) Efficient Data Cube Computation MOLAP Technique(s) Prefix Sum Array Multiway Augmented Tree Aalborg University

More information

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe

Copyright 2016 Ramez Elmasri and Shamkant B. Navathe CHAPTER 19 Query Optimization Introduction Query optimization Conducted by a query optimizer in a DBMS Goal: select best available strategy for executing query Based on information available Most RDBMSs

More information

Introduction to DWML. Christian Thomsen, Aalborg University. Slides adapted from Torben Bach Pedersen and Man Lung Yiu

Introduction to DWML. Christian Thomsen, Aalborg University. Slides adapted from Torben Bach Pedersen and Man Lung Yiu Introduction to DWML Christian Thomsen, Aalborg University Slides adapted from Torben Bach Pedersen and Man Lung Yiu Course Structure Business intelligence Extract knowledge from large amounts of data

More information

Advanced Data Management Technologies Written Exam

Advanced Data Management Technologies Written Exam Advanced Data Management Technologies Written Exam 02.02.2016 First name Student number Last name Signature Instructions for Students Write your name, student number, and signature on the exam sheet. This

More information

Something to think about. Problems. Purpose. Vocabulary. Query Evaluation Techniques for large DB. Part 1. Fact:

Something to think about. Problems. Purpose. Vocabulary. Query Evaluation Techniques for large DB. Part 1. Fact: Query Evaluation Techniques for large DB Part 1 Fact: While data base management systems are standard tools in business data processing they are slowly being introduced to all the other emerging data base

More information

Storage hierarchy. Textbook: chapters 11, 12, and 13

Storage hierarchy. Textbook: chapters 11, 12, and 13 Storage hierarchy Cache Main memory Disk Tape Very fast Fast Slower Slow Very small Small Bigger Very big (KB) (MB) (GB) (TB) Built-in Expensive Cheap Dirt cheap Disks: data is stored on concentric circular

More information

Evolution of Database Systems

Evolution of Database Systems Evolution of Database Systems Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Intelligent Decision Support Systems Master studies, second

More information

HANA Performance. Efficient Speed and Scale-out for Real-time BI

HANA Performance. Efficient Speed and Scale-out for Real-time BI HANA Performance Efficient Speed and Scale-out for Real-time BI 1 HANA Performance: Efficient Speed and Scale-out for Real-time BI Introduction SAP HANA enables organizations to optimize their business

More information

Column Stores vs. Row Stores How Different Are They Really?

Column Stores vs. Row Stores How Different Are They Really? Column Stores vs. Row Stores How Different Are They Really? Daniel J. Abadi (Yale) Samuel R. Madden (MIT) Nabil Hachem (AvantGarde) Presented By : Kanika Nagpal OUTLINE Introduction Motivation Background

More information

Information Systems (Informationssysteme)

Information Systems (Informationssysteme) Information Systems (Informationssysteme) Jens Teubner, TU Dortmund jens.teubner@cs.tu-dortmund.de Summer 2018 c Jens Teubner Information Systems Summer 2018 1 Part IX B-Trees c Jens Teubner Information

More information

DATA CUBE : A RELATIONAL AGGREGATION OPERATOR GENERALIZING GROUP-BY, CROSS-TAB AND SUB-TOTALS SNEHA REDDY BEZAWADA CMPT 843

DATA CUBE : A RELATIONAL AGGREGATION OPERATOR GENERALIZING GROUP-BY, CROSS-TAB AND SUB-TOTALS SNEHA REDDY BEZAWADA CMPT 843 DATA CUBE : A RELATIONAL AGGREGATION OPERATOR GENERALIZING GROUP-BY, CROSS-TAB AND SUB-TOTALS SNEHA REDDY BEZAWADA CMPT 843 WHAT IS A DATA CUBE? The Data Cube or Cube operator produces N-dimensional answers

More information

OLAP Introduction and Overview

OLAP Introduction and Overview 1 CHAPTER 1 OLAP Introduction and Overview What Is OLAP? 1 Data Storage and Access 1 Benefits of OLAP 2 What Is a Cube? 2 Understanding the Cube Structure 3 What Is SAS OLAP Server? 3 About Cube Metadata

More information

Optimizing Testing Performance With Data Validation Option

Optimizing Testing Performance With Data Validation Option Optimizing Testing Performance With Data Validation Option 1993-2016 Informatica LLC. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording

More information

Data warehouses Decision support The multidimensional model OLAP queries

Data warehouses Decision support The multidimensional model OLAP queries Data warehouses Decision support The multidimensional model OLAP queries Traditional DBMSs are used by organizations for maintaining data to record day to day operations On-line Transaction Processing

More information

Processing of Very Large Data

Processing of Very Large Data Processing of Very Large Data Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Software Development Technologies Master studies, first

More information

Physical Design. Elena Baralis, Silvia Chiusano Politecnico di Torino. Phases of database design D B M G. Database Management Systems. Pag.

Physical Design. Elena Baralis, Silvia Chiusano Politecnico di Torino. Phases of database design D B M G. Database Management Systems. Pag. Physical Design D B M G 1 Phases of database design Application requirements Conceptual design Conceptual schema Logical design ER or UML Relational tables Logical schema Physical design Physical schema

More information

CSE 544 Principles of Database Management Systems. Alvin Cheung Fall 2015 Lecture 8 - Data Warehousing and Column Stores

CSE 544 Principles of Database Management Systems. Alvin Cheung Fall 2015 Lecture 8 - Data Warehousing and Column Stores CSE 544 Principles of Database Management Systems Alvin Cheung Fall 2015 Lecture 8 - Data Warehousing and Column Stores Announcements Shumo office hours change See website for details HW2 due next Thurs

More information

Acknowledgment. MTAT Data Mining. Week 7: Online Analytical Processing and Data Warehouses. Typical Data Analysis Process.

Acknowledgment. MTAT Data Mining. Week 7: Online Analytical Processing and Data Warehouses. Typical Data Analysis Process. MTAT.03.183 Data Mining Week 7: Online Analytical Processing and Data Warehouses Marlon Dumas marlon.dumas ät ut. ee Acknowledgment This slide deck is a mashup of the following publicly available slide

More information

ETL and OLAP Systems

ETL and OLAP Systems ETL and OLAP Systems Krzysztof Dembczyński Intelligent Decision Support Systems Laboratory (IDSS) Poznań University of Technology, Poland Software Development Technologies Master studies, first semester

More information

Data Warehousing Conclusion. Esteban Zimányi Slides by Toon Calders

Data Warehousing Conclusion. Esteban Zimányi Slides by Toon Calders Data Warehousing Conclusion Esteban Zimányi ezimanyi@ulb.ac.be Slides by Toon Calders Motivation for the Course Database = a piece of software to handle data: Store, maintain, and query Most ideal system

More information

Systems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2014/15

Systems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2014/15 Systems Infrastructure for Data Science Web Science Group Uni Freiburg WS 2014/15 Lecture II: Indexing Part I of this course Indexing 3 Database File Organization and Indexing Remember: Database tables

More information

1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda

1 Dulcian, Inc., 2001 All rights reserved. Oracle9i Data Warehouse Review. Agenda Agenda Oracle9i Warehouse Review Dulcian, Inc. Oracle9i Server OLAP Server Analytical SQL Mining ETL Infrastructure 9i Warehouse Builder Oracle 9i Server Overview E-Business Intelligence Platform 9i Server:

More information

COMP 430 Intro. to Database Systems. Indexing

COMP 430 Intro. to Database Systems. Indexing COMP 430 Intro. to Database Systems Indexing How does DB find records quickly? Various forms of indexing An index is automatically created for primary key. SQL gives us some control, so we should understand

More information

Oracle 1Z0-515 Exam Questions & Answers

Oracle 1Z0-515 Exam Questions & Answers Oracle 1Z0-515 Exam Questions & Answers Number: 1Z0-515 Passing Score: 800 Time Limit: 120 min File Version: 38.7 http://www.gratisexam.com/ Oracle 1Z0-515 Exam Questions & Answers Exam Name: Data Warehousing

More information

RAID in Practice, Overview of Indexing

RAID in Practice, Overview of Indexing RAID in Practice, Overview of Indexing CS634 Lecture 4, Feb 04 2014 Slides based on Database Management Systems 3 rd ed, Ramakrishnan and Gehrke 1 Disks and Files: RAID in practice For a big enterprise

More information

Data Warehousing & Data Mining

Data Warehousing & Data Mining Data Warehousing & Data Mining Wolf-Tilo Balke Kinda El Maarry Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de Summary Last Week: Optimization - Indexes

More information

MTA Database Administrator Fundamentals Course

MTA Database Administrator Fundamentals Course MTA Database Administrator Fundamentals Course Session 1 Section A: Database Tables Tables Representing Data with Tables SQL Server Management Studio Section B: Database Relationships Flat File Databases

More information

Data Warehouse and Data Mining

Data Warehouse and Data Mining Data Warehouse and Data Mining Lecture No. 03 Architecture of DW Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology Jamshoro Basic

More information

Indexing. Week 14, Spring Edited by M. Naci Akkøk, , Contains slides from 8-9. April 2002 by Hector Garcia-Molina, Vera Goebel

Indexing. Week 14, Spring Edited by M. Naci Akkøk, , Contains slides from 8-9. April 2002 by Hector Garcia-Molina, Vera Goebel Indexing Week 14, Spring 2005 Edited by M. Naci Akkøk, 5.3.2004, 3.3.2005 Contains slides from 8-9. April 2002 by Hector Garcia-Molina, Vera Goebel Overview Conventional indexes B-trees Hashing schemes

More information

New Requirements. Advanced Query Processing. Top-N/Bottom-N queries Interactive queries. Skyline queries, Fast initial response time!

New Requirements. Advanced Query Processing. Top-N/Bottom-N queries Interactive queries. Skyline queries, Fast initial response time! Lecture 13 Advanced Query Processing CS5208 Advanced QP 1 New Requirements Top-N/Bottom-N queries Interactive queries Decision making queries Tolerant of errors approximate answers acceptable Control over

More information

PARALLEL & DISTRIBUTED DATABASES CS561-SPRING 2012 WPI, MOHAMED ELTABAKH

PARALLEL & DISTRIBUTED DATABASES CS561-SPRING 2012 WPI, MOHAMED ELTABAKH PARALLEL & DISTRIBUTED DATABASES CS561-SPRING 2012 WPI, MOHAMED ELTABAKH 1 INTRODUCTION In centralized database: Data is located in one place (one server) All DBMS functionalities are done by that server

More information

Data Warehousing 3. ICS 421 Spring Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa

Data Warehousing 3. ICS 421 Spring Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa ICS 421 Spring 2010 Data Warehousing 3 Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 4/1/2010 Lipyeow Lim -- University of Hawaii at Manoa 1 Implementation

More information

Data Warehousing and OLAP

Data Warehousing and OLAP Data Warehousing and OLAP INFO 330 Slides courtesy of Mirek Riedewald Motivation Large retailer Several databases: inventory, personnel, sales etc. High volume of updates Management requirements Efficient

More information

Lectures for the course: Data Warehousing and Data Mining (IT 60107)

Lectures for the course: Data Warehousing and Data Mining (IT 60107) Lectures for the course: Data Warehousing and Data Mining (IT 60107) Week 1 Lecture 1 21/07/2011 Introduction to the course Pre-requisite Expectations Evaluation Guideline Term Paper and Term Project Guideline

More information

Data Warehouse Performance - Selected Techniques and Data Structures

Data Warehouse Performance - Selected Techniques and Data Structures Data Warehouse Performance - Selected Techniques and Data Structures Robert Wrembel Poznań University of Technology, Institute of Computing Science, Poznań, Poland Robert.Wrembel@cs.put.poznan.pl Abstract.

More information

CSC 261/461 Database Systems Lecture 19

CSC 261/461 Database Systems Lecture 19 CSC 261/461 Database Systems Lecture 19 Fall 2017 Announcements CIRC: CIRC is down!!! MongoDB and Spark (mini) projects are at stake. L Project 1 Milestone 4 is out Due date: Last date of class We will

More information

Announcement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17

Announcement. Reading Material. Overview of Query Evaluation. Overview of Query Evaluation. Overview of Query Evaluation 9/26/17 Announcement CompSci 516 Database Systems Lecture 10 Query Evaluation and Join Algorithms Project proposal pdf due on sakai by 5 pm, tomorrow, Thursday 09/27 One per group by any member Instructor: Sudeepa

More information

Query Processing & Optimization

Query Processing & Optimization Query Processing & Optimization 1 Roadmap of This Lecture Overview of query processing Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Introduction

More information

One Size Fits All: An Idea Whose Time Has Come and Gone

One Size Fits All: An Idea Whose Time Has Come and Gone ICS 624 Spring 2013 One Size Fits All: An Idea Whose Time Has Come and Gone Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 1/9/2013 Lipyeow Lim -- University

More information

The DBMS accepts requests for data from the application program and instructs the operating system to transfer the appropriate data.

The DBMS accepts requests for data from the application program and instructs the operating system to transfer the appropriate data. Managing Data Data storage tool must provide the following features: Data definition (data structuring) Data entry (to add new data) Data editing (to change existing data) Querying (a means of extracting

More information

Databasesystemer, forår 2005 IT Universitetet i København. Forelæsning 8: Database effektivitet. 31. marts Forelæser: Rasmus Pagh

Databasesystemer, forår 2005 IT Universitetet i København. Forelæsning 8: Database effektivitet. 31. marts Forelæser: Rasmus Pagh Databasesystemer, forår 2005 IT Universitetet i København Forelæsning 8: Database effektivitet. 31. marts 2005 Forelæser: Rasmus Pagh Today s lecture Database efficiency Indexing Schema tuning 1 Database

More information

Chapter 12: Indexing and Hashing. Basic Concepts

Chapter 12: Indexing and Hashing. Basic Concepts Chapter 12: Indexing and Hashing! Basic Concepts! Ordered Indices! B+-Tree Index Files! B-Tree Index Files! Static Hashing! Dynamic Hashing! Comparison of Ordered Indexing and Hashing! Index Definition

More information

Indexing. Jan Chomicki University at Buffalo. Jan Chomicki () Indexing 1 / 25

Indexing. Jan Chomicki University at Buffalo. Jan Chomicki () Indexing 1 / 25 Indexing Jan Chomicki University at Buffalo Jan Chomicki () Indexing 1 / 25 Storage hierarchy Cache Main memory Disk Tape Very fast Fast Slower Slow (nanosec) (10 nanosec) (millisec) (sec) Very small Small

More information

7. Query Processing and Optimization

7. Query Processing and Optimization 7. Query Processing and Optimization Processing a Query 103 Indexing for Performance Simple (individual) index B + -tree index Matching index scan vs nonmatching index scan Unique index one entry and one

More information

Kathleen Durant PhD Northeastern University CS Indexes

Kathleen Durant PhD Northeastern University CS Indexes Kathleen Durant PhD Northeastern University CS 3200 Indexes Outline for the day Index definition Types of indexes B+ trees ISAM Hash index Choosing indexed fields Indexes in InnoDB 2 Indexes A typical

More information

Chapter 12: Query Processing

Chapter 12: Query Processing Chapter 12: Query Processing Overview Catalog Information for Cost Estimation $ Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Transformation

More information

1. Attempt any two of the following: 10 a. State and justify the characteristics of a Data Warehouse with suitable examples.

1. Attempt any two of the following: 10 a. State and justify the characteristics of a Data Warehouse with suitable examples. Instructions to the Examiners: 1. May the Examiners not look for exact words from the text book in the Answers. 2. May any valid example be accepted - example may or may not be from the text book 1. Attempt

More information

Chapter 12: Indexing and Hashing

Chapter 12: Indexing and Hashing Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQL

More information

Database design View Access patterns Need for separate data warehouse:- A multidimensional data model:-

Database design View Access patterns Need for separate data warehouse:- A multidimensional data model:- UNIT III: Data Warehouse and OLAP Technology: An Overview : What Is a Data Warehouse? A Multidimensional Data Model, Data Warehouse Architecture, Data Warehouse Implementation, From Data Warehousing to

More information

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe

Introduction to Query Processing and Query Optimization Techniques. Copyright 2011 Ramez Elmasri and Shamkant Navathe Introduction to Query Processing and Query Optimization Techniques Outline Translating SQL Queries into Relational Algebra Algorithms for External Sorting Algorithms for SELECT and JOIN Operations Algorithms

More information

COLUMN-STORES VS. ROW-STORES: HOW DIFFERENT ARE THEY REALLY? DANIEL J. ABADI (YALE) SAMUEL R. MADDEN (MIT) NABIL HACHEM (AVANTGARDE)

COLUMN-STORES VS. ROW-STORES: HOW DIFFERENT ARE THEY REALLY? DANIEL J. ABADI (YALE) SAMUEL R. MADDEN (MIT) NABIL HACHEM (AVANTGARDE) COLUMN-STORES VS. ROW-STORES: HOW DIFFERENT ARE THEY REALLY? DANIEL J. ABADI (YALE) SAMUEL R. MADDEN (MIT) NABIL HACHEM (AVANTGARDE) PRESENTATION BY PRANAV GOEL Introduction On analytical workloads, Column

More information

Evaluation of relational operations

Evaluation of relational operations Evaluation of relational operations Iztok Savnik, FAMNIT Slides & Textbook Textbook: Raghu Ramakrishnan, Johannes Gehrke, Database Management Systems, McGraw-Hill, 3 rd ed., 2007. Slides: From Cow Book

More information

IDU0010 ERP,CRM ja DW süsteemid Loeng 5 DW concepts. Enn Õunapuu

IDU0010 ERP,CRM ja DW süsteemid Loeng 5 DW concepts. Enn Õunapuu IDU0010 ERP,CRM ja DW süsteemid Loeng 5 DW concepts Enn Õunapuu enn.ounapuu@ttu.ee Content Oveall approach Dimensional model Tabular model Overall approach Data modeling is a discipline that has been practiced

More information

Decision Support Systems aka Analytical Systems

Decision Support Systems aka Analytical Systems Decision Support Systems aka Analytical Systems Decision Support Systems Systems that are used to transform data into information, to manage the organization: OLAP vs OLTP OLTP vs OLAP Transactions Analysis

More information

Data Warehousing ETL. Esteban Zimányi Slides by Toon Calders

Data Warehousing ETL. Esteban Zimányi Slides by Toon Calders Data Warehousing ETL Esteban Zimányi ezimanyi@ulb.ac.be Slides by Toon Calders 1 Overview Picture other sources Metadata Monitor & Integrator OLAP Server Analysis Operational DBs Extract Transform Load

More information

OS and Hardware Tuning

OS and Hardware Tuning OS and Hardware Tuning Tuning Considerations OS Threads Thread Switching Priorities Virtual Memory DB buffer size File System Disk layout and access Hardware Storage subsystem Configuring the disk array

More information

Advanced Data Management Technologies

Advanced Data Management Technologies ADMT 2017/18 Unit 10 J. Gamper 1/37 Advanced Data Management Technologies Unit 10 SQL GROUP BY Extensions J. Gamper Free University of Bozen-Bolzano Faculty of Computer Science IDSE Acknowledgements: I

More information

Query optimization. Elena Baralis, Silvia Chiusano Politecnico di Torino. DBMS Architecture D B M G. Database Management Systems. Pag.

Query optimization. Elena Baralis, Silvia Chiusano Politecnico di Torino. DBMS Architecture D B M G. Database Management Systems. Pag. Database Management Systems DBMS Architecture SQL INSTRUCTION OPTIMIZER MANAGEMENT OF ACCESS METHODS CONCURRENCY CONTROL BUFFER MANAGER RELIABILITY MANAGEMENT Index Files Data Files System Catalog DATABASE

More information

Proceedings of the IE 2014 International Conference AGILE DATA MODELS

Proceedings of the IE 2014 International Conference  AGILE DATA MODELS AGILE DATA MODELS Mihaela MUNTEAN Academy of Economic Studies, Bucharest mun61mih@yahoo.co.uk, Mihaela.Muntean@ie.ase.ro Abstract. In last years, one of the most popular subjects related to the field of

More information

collection of data that is used primarily in organizational decision making.

collection of data that is used primarily in organizational decision making. Data Warehousing A data warehouse is a special purpose database. Classic databases are generally used to model some enterprise. Most often they are used to support transactions, a process that is referred

More information

COGNOS DYNAMIC CUBES: SET TO RETIRE TRANSFORMER? Update: Pros & Cons

COGNOS DYNAMIC CUBES: SET TO RETIRE TRANSFORMER? Update: Pros & Cons COGNOS DYNAMIC CUBES: SET TO RETIRE TRANSFORMER? 10.2.2 Update: Pros & Cons GoToWebinar Control Panel Submit questions here Click arrow to restore full control panel Copyright 2015 Senturus, Inc. All Rights

More information

Data Mining Concepts & Techniques

Data Mining Concepts & Techniques Data Mining Concepts & Techniques Lecture No. 01 Databases, Data warehouse Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology Jamshoro

More information

OS and HW Tuning Considerations!

OS and HW Tuning Considerations! Administração e Optimização de Bases de Dados 2012/2013 Hardware and OS Tuning Bruno Martins DEI@Técnico e DMIR@INESC-ID OS and HW Tuning Considerations OS " Threads Thread Switching Priorities " Virtual

More information

Data Warehouse and Data Mining

Data Warehouse and Data Mining Data Warehouse and Data Mining Lecture No. 04-06 Data Warehouse Architecture Naeem Ahmed Email: naeemmahoto@gmail.com Department of Software Engineering Mehran Univeristy of Engineering and Technology

More information

BUSINESS INTELLIGENCE. SSAS - SQL Server Analysis Services. Business Informatics Degree

BUSINESS INTELLIGENCE. SSAS - SQL Server Analysis Services. Business Informatics Degree BUSINESS INTELLIGENCE SSAS - SQL Server Analysis Services Business Informatics Degree 2 BI Architecture SSAS: SQL Server Analysis Services 3 It is both an OLAP Server and a Data Mining Server Distinct

More information

Rajiv GandhiCollegeof Engineering& Technology, Kirumampakkam.Page 1 of 10

Rajiv GandhiCollegeof Engineering& Technology, Kirumampakkam.Page 1 of 10 Rajiv GandhiCollegeof Engineering& Technology, Kirumampakkam.Page 1 of 10 RAJIV GANDHI COLLEGE OF ENGINEERING & TECHNOLOGY, KIRUMAMPAKKAM-607 402 DEPARTMENT OF COMPUTER SCIENCE & ENGINEERING QUESTION BANK

More information

CHAPTER 8: ONLINE ANALYTICAL PROCESSING(OLAP)

CHAPTER 8: ONLINE ANALYTICAL PROCESSING(OLAP) CHAPTER 8: ONLINE ANALYTICAL PROCESSING(OLAP) INTRODUCTION A dimension is an attribute within a multidimensional model consisting of a list of values (called members). A fact is defined by a combination

More information

Outline. Database Management and Tuning. Index Data Structures. Outline. Index Tuning. Johann Gamper. Unit 5

Outline. Database Management and Tuning. Index Data Structures. Outline. Index Tuning. Johann Gamper. Unit 5 Outline Database Management and Tuning Johann Gamper Free University of Bozen-Bolzano Faculty of Computer Science IDSE Unit 5 1 2 Conclusion Acknowledgements: The slides are provided by Nikolaus Augsten

More information

Working with Analytical Objects. Version: 16.0

Working with Analytical Objects. Version: 16.0 Working with Analytical Objects Version: 16.0 Copyright 2017 Intellicus Technologies This document and its content is copyrighted material of Intellicus Technologies. The content may not be copied or derived

More information

DW schema and the problem of views

DW schema and the problem of views DW schema and the problem of views DW is multidimensional Schemas: Stars and constellations Typical DW queries, TPC-H and TPC-R benchmarks Views and their materialization View updates Main references [PJ01,

More information

Mobile MOUSe MTA DATABASE ADMINISTRATOR FUNDAMENTALS ONLINE COURSE OUTLINE

Mobile MOUSe MTA DATABASE ADMINISTRATOR FUNDAMENTALS ONLINE COURSE OUTLINE Mobile MOUSe MTA DATABASE ADMINISTRATOR FUNDAMENTALS ONLINE COURSE OUTLINE COURSE TITLE MTA DATABASE ADMINISTRATOR FUNDAMENTALS COURSE DURATION 10 Hour(s) of Self-Paced Interactive Training COURSE OVERVIEW

More information

Parallel DBMS. Parallel Database Systems. PDBS vs Distributed DBS. Types of Parallelism. Goals and Metrics Speedup. Types of Parallelism

Parallel DBMS. Parallel Database Systems. PDBS vs Distributed DBS. Types of Parallelism. Goals and Metrics Speedup. Types of Parallelism Parallel DBMS Parallel Database Systems CS5225 Parallel DB 1 Uniprocessor technology has reached its limit Difficult to build machines powerful enough to meet the CPU and I/O demands of DBMS serving large

More information

Data Warehouses for Decision Support

Data Warehouses for Decision Support Data Warehouses for Decision Support Vera Goebel Department of Informatics, University of Oslo 2010 1 What and Why of Data Warehousing Database Database Datastore Database System Database System Data System

More information

Data Warehousing 11g Essentials

Data Warehousing 11g Essentials Oracle 1z0-515 Data Warehousing 11g Essentials Version: 6.0 QUESTION NO: 1 Indentify the true statement about REF partitions. A. REF partitions have no impact on partition-wise joins. B. Changes to partitioning

More information

Logical Design A logical design is conceptual and abstract. It is not necessary to deal with the physical implementation details at this stage.

Logical Design A logical design is conceptual and abstract. It is not necessary to deal with the physical implementation details at this stage. Logical Design A logical design is conceptual and abstract. It is not necessary to deal with the physical implementation details at this stage. You need to only define the types of information specified

More information

Basics of Dimensional Modeling

Basics of Dimensional Modeling Basics of Dimensional Modeling Data warehouse and OLAP tools are based on a dimensional data model. A dimensional model is based on dimensions, facts, cubes, and schemas such as star and snowflake. Dimension

More information

Chapter 5. Indexing for DWH

Chapter 5. Indexing for DWH Chapter 5. Indexing for DWH D1 Facts D2 Prof. Bayer, DWH, Ch.5, SS 2000 1 dimension Time with composite key K1 according to hierarchy key K1 = (year int, month int, day int) dimension Region with composite

More information

Accelerating BI on Hadoop: Full-Scan, Cubes or Indexes?

Accelerating BI on Hadoop: Full-Scan, Cubes or Indexes? White Paper Accelerating BI on Hadoop: Full-Scan, Cubes or Indexes? How to Accelerate BI on Hadoop: Cubes or Indexes? Why not both? 1 +1(844)384-3844 INFO@JETHRO.IO Overview Organizations are storing more

More information

Datenbanksysteme II: Caching and File Structures. Ulf Leser

Datenbanksysteme II: Caching and File Structures. Ulf Leser Datenbanksysteme II: Caching and File Structures Ulf Leser Content of this Lecture Caching Overview Accessing data Cache replacement strategies Prefetching File structure Index Files Ulf Leser: Implementation

More information

Data Warehousing & Data Mining

Data Warehousing & Data Mining Data Warehousing & Data Mining Wolf-Tilo Balke Kinda El Maarry Institut für Informationssysteme Technische Universität Braunschweig http://www.ifis.cs.tu-bs.de Summary Last week: Logical Model: Cubes,

More information

Two Types Of Tables Involved In Producing A Star Schema >>>CLICK HERE<<<

Two Types Of Tables Involved In Producing A Star Schema >>>CLICK HERE<<< Two Types Of Tables Involved In Producing A Star Schema Outer Join:It joins the matching records from two table and all the records from Give the two types of tables involved in producing a star schema

More information

Algorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1)

Algorithms for Query Processing and Optimization. 0. Introduction to Query Processing (1) Chapter 19 Algorithms for Query Processing and Optimization 0. Introduction to Query Processing (1) Query optimization: The process of choosing a suitable execution strategy for processing a query. Two

More information

Data Warehousing and Decision Support, part 2

Data Warehousing and Decision Support, part 2 Data Warehousing and Decision Support, part 2 CS634 Class 23, Apr 27, 2016 Slides based on Database Management Systems 3 rd ed, Ramakrishnan and Gehrke, Chapter 25 pid 11 12 13 pid timeid locid sales Multidimensional

More information

Advanced Database Systems

Advanced Database Systems Lecture IV Query Processing Kyumars Sheykh Esmaili Basic Steps in Query Processing 2 Query Optimization Many equivalent execution plans Choosing the best one Based on Heuristics, Cost Will be discussed

More information

Data Warehousing 2. ICS 421 Spring Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa

Data Warehousing 2. ICS 421 Spring Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa ICS 421 Spring 2010 Data Warehousing 2 Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa 3/30/2010 Lipyeow Lim -- University of Hawaii at Manoa 1 Data Warehousing

More information

Chapter 12: Query Processing. Chapter 12: Query Processing

Chapter 12: Query Processing. Chapter 12: Query Processing Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 12: Query Processing Overview Measures of Query Cost Selection Operation Sorting Join

More information

Data Warehousing and Decision Support (mostly using Relational Databases) CS634 Class 20

Data Warehousing and Decision Support (mostly using Relational Databases) CS634 Class 20 Data Warehousing and Decision Support (mostly using Relational Databases) CS634 Class 20 Slides based on Database Management Systems 3 rd ed, Ramakrishnan and Gehrke, Chapter 25 Introduction Increasingly,

More information

Jet Data Manager 2014 SR2 Product Enhancements

Jet Data Manager 2014 SR2 Product Enhancements Jet Data Manager 2014 SR2 Product Enhancements Table of Contents Overview of New Features... 3 New Features in Jet Data Manager 2014 SR2... 3 Improved Features in Jet Data Manager 2014 SR2... 5 New Features

More information

Aster Data Basics Class Outline

Aster Data Basics Class Outline Aster Data Basics Class Outline CoffingDW education has been customized for every customer for the past 20 years. Our classes can be taught either on site or remotely via the internet. Education Contact:

More information

Oracle #1 RDBMS Vendor

Oracle #1 RDBMS Vendor Oracle #1 RDBMS Vendor IBM 20.7% Microsoft 18.1% Other 12.6% Oracle 48.6% Source: Gartner DataQuest July 2008, based on Total Software Revenue Oracle 2 Continuous Innovation Oracle 11g Exadata Storage

More information

Chapter 12: Indexing and Hashing

Chapter 12: Indexing and Hashing Chapter 12: Indexing and Hashing Database System Concepts, 5th Ed. See www.db-book.com for conditions on re-use Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree

More information

Managing the Database

Managing the Database Slide 1 Managing the Database Objectives of the Lecture : To consider the roles of the Database Administrator. To consider the involvmentof the DBMS in the storage and handling of physical data. To appreciate

More information