COSC-4411(M) Midterm #1
|
|
- Matilda Cain
- 6 years ago
- Views:
Transcription
1 12 February 2004 COSC-4411(M) Midterm #1 & answers p. 1 of 10 COSC-4411(M) Midterm #1 Sur / Last Name: Given / First Name: Student ID: Instructor: Parke Godfrey Exam Duration: 75 minutes Term: Winter 2004 Answer the following questions to the best of your knowledge. Be precise and be careful. The exam is open-book and open-notes. Write any assumptions you need to make along with your answers, whenever necessary. There are five major questions. Points for each question and sub-question are as indicated. In total, the exam is out of 50 points. If you need additional space for an answer, just indicate clearly where you are continuing. Regrade Policy Regrading should only be requested in writing. Write what you would like to be reconsidered. Note, however, that an exam accepted for regrading will be reviewed and regraded in entirety (all questions). Grading Box Total
2 12 February 2004 COSC-4411(M) Midterm #1 & answers p. 2 of (10 points) Buffer Pool. Okay, I ve replaced the replacement strategy. What next? [short answer / analysis] Dr. Mark Dogfurry of Very Small Databases, Inc., has devised the following replacement strategy. Within the database system, every transaction has a unique timestamp value, start, which is the time the transaction commenced. (A transaction is, for example, a query executing. It will pin, and then unpin, a number of pages.) It is always a transaction that pins a page. Associated with each buffer pool frame is an xtime and a ctime. When a page s pin count = 0 and the page is then pinned, or the page is initially fetched into the pool, its frame s xtime is set equal to the start value of the transaction that requested (pinned) the page. If the page is already pinned (pin count > 0), its frame s xtime is set to the transaction s start if start is newer than the frame s current xtime; otherwise, its xtime value is left as is. The frame s ctime is set to the current clock time whenever the page s pin count becomes 0. For replacement, the page with the oldest xtime over all pages with pin count = 0 is chosen. In the case of ties for oldest xtime, the one with the newest ctime of those is chosen. a. (3 points) What type of replacement strategy is this? (LRU, MRU, Clock, hybrid, etc.?) Briefly describe. The strategy acts most like LRU. A page is chosen for replacement based on oldest timestamp (xtime), which is what LRU does. It differs from LRU in how it handles ties for oldest (here, ties on xtime). LRU might randomly pick between ties for oldest. To be fair, on a single-processor machine, LRU would not see any ties; unpinned times would be sequential. Dogfurry s strategy, on the other hand, then chooses the page with the youngest ctime across pages tied for the oldest xtime. So this acts as MRU over pages of the same xtime. Furthermore, ties on xtime will be common, since a single transaction will pin and unpin many pages. Thus, his strategy is truly a hybrid. A note: While this is mostly LRU-like, it is different in a significant way. LRU is generally done with respect to when pages were unpinned (to pincount 0). This is not based on pages times, but on transactions times.
3 12 February 2004 COSC-4411(M) Midterm #1 & answers p. 3 of 10 b. (4 points) Identify an advantage Dr. Dogfurry s replacement strategy might have (compared with the basic replacement strategies of LRU, MRU, and Clock). This should solve the problem of sequential flooding that LRU can have, for many cases. When a given transaction must read a sequence of pages repeatedly, since these pages are marked with the same xtime, replacement over those will be done by MRU. So sequential flooding will not occur here. Most cases of (potential) sequential flooding probably do occur within the scope of a transaction, so Dogfurry s strategy may offer a good solution. However, the strategy does not solve all cases of sequential flooding. If a sequence of pages were being pinned by different transactions, then they will be replaced according to LRU. This would seem to be an unusual scenario though. c. (3 points) Identify a disadvantage Dr. Dogfurry s replacement strategy might have (compared with the basic replacement strategies of LRU, MRU, and Clock). The strategy favors newer transactions to older ones, Since older xtimes are replaced. Thus, an old transaction is not likely to have its requests in the buffer pool, so it will slow down. Longer transactions will run longer and become older in comparision to other transactions. Thus long transactions will be slowed down under this strategy. On the other hand, short transactions should speed up. Our buffer pool replacement strategy should not play favorites among transactions (unless we have designed it to on purpose); this is a side-effect. Within a transaction, MRU is used. This ignores data locality, which is the reason LRU tends to be better than MRU generally. So MRU may not be the best choice of strategy for within a transaction. This strategy has more overhead than plain LRU, MRU, or Clock. It probably could be implemented efficiently, so this is a minor complaint. But it is important that the replacement strategy is very fast, because this is at the core of buffer pool routines and is called extremely often. This still doesn t accommodate all forms of sequential flooding, namely flooding that can occur due to the interaction of multiple transactions. Whether this type of sequential flooding is common (and thus important to remedy) would need to be studied. I was not looking for all these answers, but for a reasonable observation about what the disadvantages might be.
4 12 February 2004 COSC-4411(M) Midterm #1 & answers p. 4 of (10 points) Index Logic. Take the next index to the left. [short answer / exercise] a. (5 points) You are told that the following indexes are available on the table Employee: key type clustered? A. name, address tree yes B. age, salary hash no C. name tree no D. salary, age hash no E. name, age tree yes You are suspicious that this information is not correct. Why? Identify three problems with what is reported. It is impossible that C. is unclustered if A. or E. are clustered. It is not possible to have two clustered indexes on the same table with different keys: A. & E. It makes no sense to have B. and D.. They are identical functionally....
5 12 February 2004 COSC-4411(M) Midterm #1 & answers p. 5 of 10 b. (5 points) Consider the query SELECT order#, amount, when FROM Purchases WHERE amount BETWEEN 25 AND 30 AND when > ; There are 10,000,000 purchase records. There are 25 records are on each data-record page, on average. 4,000,000 purchase records have when > ,000 purchase records have 25 amount 30. Two indexes are available: A. A clustered B+ tree index on when of type alternative 2. The index pages are three deep, with the leaf pages at depth four. B. An unclustered B+ tree index on amount of type alternative 2. The index pages are three deep, with the leaf pages at depth four. For each index, 50 data entries fit per data-entry (leaf) page. What is the I/O cost using each index to evaluate the query? So which index is best for this? For A., it will cost 3 I/O s to read the index pages from root down, one I/O to read the data entry page at the beginning of the range, and then 160,000 I/O s to read the data record pages with the matching records. Four million records match the when condition. At 25 records per page, they would occupy 160,000 pages. The index is clustered, so the matching records are clustered together. Therefore, it costs about 160,004 I/O s to fetch the records; we check the amount condition on-the-fly. Some said that we would read 80,000 data entry pages (all the entries matching), and fetch the records based on the entries, reading roughly 160,000 data-record pages. Thus, the total would be 240,003 I/O s. This is not how the textbook presents it; we can read data-record pages sequentially for a range. However in real systems, a clustered index is not fully clustered; the data-record pages are allowed to become slightly unsorted. This is a compromise for efficiency on updates. As a consequence though under this design, the data-entry pages must be read. I counted this as right too. For B., it will cost 3 I/O s to read the index pages from root down, and 1,000 I/O s to read the data-entry records that match on the amount condition. This time we read all the matching data-entries regardless, because the index is unclustered. 50,000 records match, and since 50 data-entries fit per page, that is 1,000 pages. Then, to fetch each of the 50,000 records, it will cost us an I/O each to fetch the appropriate data-record page. So 51,003 I/O s. (We might save some on the 50,000 due to hits in the buffer pool. However, the file is 40,000 pages in size, so the savings here will be negligable.) Using B. wins.
6 12 February 2004 COSC-4411(M) Midterm #1 & answers p. 6 of (10 points) General. Grab bag. [multiple choice] a. (2 points) Consider I. clustered tree indexes II. unclustered tree indexes III. clustered hash indexes IV. unclustered hash indexes Range queries can benefit from A. Just I. B. Just I & II. C. Just I, II, & III. D. Just I & III. E. Potentially any of I, II, III, & IV. b. (2 points) The buffer manager manages A. lock management for transaction processing B. query processing C. file allocation and deallocation D. disk memory E. main memory for the database system. c. (2 points) Which of the following is false? A. Locating a record by key in a sorted file by binary search and locating it via a B+ tree make practically the same number of key comparisons. B. Locating a record by key in a sorted file by binary search requires more I/O s than locating it via a B+ tree, in general. C. A bulk build of a B+ tree is faster than building it by inserting a record at a time. D. If the data records are kept in a sorted file, there is no need for a B+ tree index based on the same search / sort key. E. If there is an unclustered B+ tree index over the data records, this does not mean that the records are necessarily sorted. d. (2 points) Which of the following is false? A. The trend is that disk I/O speeds are getting faster in ratio to CPU speeds. B. Page size is dictated by the hardware. C. Generally, many records fit on a page. D. Sequential reads and writes are important to a database system s performance. E. Generally, I/O time dominates CPU time in database operations. e. (2 points) The external merge sort routine A. for its merge passes requires that the input runs all be of equal length. B. can accommodate variable length input runs in a merge pass, but may in that case need to allocate more output frames. C. must use quick-sort in its pass 0. D. may not be faster sorting a given input file given twice the buffer pool allocation. E. cannot sort a file that is already sorted on a different key.
7 12 February 2004 COSC-4411(M) Midterm #1 & answers p. 7 of (10 points) Index Mechanics. Always losing your keys? [exercise] A linear hash has just been started. The linear hashed file currently just has one bucket (primary page). The current hash function pair is h 0, h 1. Here, h 0 masks for zero (!) righthand bits from the hashed key, and so always returns bucket address 0. Hash function h 1 masks for 1 right-hand bit, h 2 for 2, and so forth. Assume that each page can hold two entries. The file currently has one entry of 21 ( ) next A split should be triggered whenever an overflow page is created. Show the linear hashed file after each of the following inserts: 14 ( ), 7 (111 2 ), 35 ( ), and 28 ( ). The insertions are cumulative, so your final hashed file should contain 21, 14, 7, 35, and next next 0 14 next next next next
8 12 February 2004 COSC-4411(M) Midterm #1 & answers p. 8 of (10 points) External Sorting. I ll pass. [analysis] Consider the following optimization. If the last merge of the current merge pass would involve a k-way merge where k < (B 1), then the routine fills up the last merge by adding (B 1) k current runs runs just created in this same pass in previous merge steps to make for a (B 1)-way merge, when possible. a. (5 points) Say that pass (i 1) produced 8 runs and that we are doing 3-way merges. In pass i, we first take 3 of the 8 runs and merge them into a new single run. We next take the next 3 of the 8 runs and merge them into a new run. Now we only have 2 of the original 8 runs left. In the basic external sort routine, we would finish pass i by merging these 2 runs in a 2-way merge. Under the revised algorithm (with the optimization ), we would perform a 3-way merge again instead, using the remaining 2 runs of the 8 (from pass (i 1)) and one of the 2 new runs just created in the first two merges of pass i. Does this help in this example? Why or why not? Let us say i = 1 here, and pass zero made 8 runs of length 4 (B). So the file is 32 pages in size. In pass one, in the first merge, we merge 3 of the 8 runs into one run of length 12; in the second merge, we merge the next 3 of the 8 runs into one run of length 12; in the third merge, we normally merge the remaining 2 of the 8 runs into one run of length 8. This cost 64 I/O s (= 2 size of file). Under the modification, in the third merge of pass one, we would borrow one of the runs we made previously in pass one so we could do a 3-way merge again instead of a 2-way merge. The run we borrow is of length 12, and we merge it with the remaining 2 runs of length 4 each, resulting in a run of length 20. The cost of the pass is now 88 I/O s, however! We had to read in the borrowed run and write it out in addtion to the other I/O s. In the pass two, in the original version, we merge the three resulting runs from pass one into a single run, and we are done. This costs 64 I/O s. In the pass two, in the new version, we merge the two resulting runs from pass one into a single run, and we are done. This costs 64 I/O s. So in this example, pass one cost more for the new version than the original version. The other passes cost the same. So the new version was more expensive!
9 12 February 2004 COSC-4411(M) Midterm #1 & answers p. 9 of 10 b. (5 points) Does this optimization actually always / ever make an external sort more efficient? That is, does it always / ever save I/O s? Argue briefly why or why not. In part a., we established that there are cases when the new algorithm costs more. Does it always cost more? No. When the new version results in fewer passes, it is less expensive since it saves all the I/O s of an additional pass. Consider that in the example above, the file was only 28 pages long. Under the original algorithm, pass zero makes 7 runs of length 4 each. Pass one would make 3 runs, 2 of length 12, and one of length 4 (a 1-way merge!). Pass two finishs with a single run. So 3 64 = 192 I/O s. Under the new algorithm, pass zero makes 7 runs of length 4 each, like before. Pass one would make 2 runs like before in the first and second merges. For the third merge, it would borrow the two just created merges to fill out the 3-way merge. This would result in a single run. So no pass two is necessary! The extra cost is reading and writing the two borrowed runs an extra time: = 48. Plus the 2 passes: 2 64 = 128. So 176 I/O s in all. Okay, not a huge savings in this example. However, we can show other cases when the savings is much more significant.
10 12 February 2004 COSC-4411(M) Midterm #1 & answers p. 10 of 10 (Scratch space.) Relax. Turn in your exam. Go home.
COSC-4411(M) Midterm #1
COSC-4411(M) Midterm #1 Sur / Last Name: Given / First Name: Student ID: Instructor: Parke Godfrey Exam Duration: 75 minutes Term: Winter 2004 Answer the following questions to the best of your knowledge.
More informationIMPORTANT: Circle the last two letters of your class account:
Spring 2011 University of California, Berkeley College of Engineering Computer Science Division EECS MIDTERM I CS 186 Introduction to Database Systems Prof. Michael J. Franklin NAME: STUDENT ID: IMPORTANT:
More informationCSCI-6421 Final Exam York University Fall Term 2004
6 December 2004 CS-6421 Final Exam p. 1 of 7 CSCI-6421 Final Exam York University Fall Term 2004 Due: 6pm Wednesday 15 December 2004 Last Name: First Name: Instructor: Parke Godfrey Exam Duration: take
More informationLassonde School of Engineering Winter 2016 Term Course No: 4411 Database Management Systems
Lassonde School of Engineering Winter 2016 Term Course No: 4411 Database Management Systems Last Name: First Name: Student ID: 1. Exam is 2 hours long 2. Closed books/notes Problem 1 (6 points) Consider
More informationPhysical Disk Structure. Physical Data Organization and Indexing. Pages and Blocks. Access Path. I/O Time to Access a Page. Disks.
Physical Disk Structure Physical Data Organization and Indexing Chapter 11 1 4 Access Path Refers to the algorithm + data structure (e.g., an index) used for retrieving and storing data in a table The
More informationUniversity of California, Berkeley. CS 186 Introduction to Databases, Spring 2014, Prof. Dan Olteanu MIDTERM
University of California, Berkeley CS 186 Introduction to Databases, Spring 2014, Prof. Dan Olteanu MIDTERM This is a closed book examination sided). but you are allowed one 8.5 x 11 sheet of notes (double
More informationVirtual Memory. Chapter 8
Virtual Memory 1 Chapter 8 Characteristics of Paging and Segmentation Memory references are dynamically translated into physical addresses at run time E.g., process may be swapped in and out of main memory
More informationDatabase Applications (15-415)
Database Applications (15-415) DBMS Internals- Part V Lecture 13, March 10, 2014 Mohammad Hammoud Today Welcome Back from Spring Break! Today Last Session: DBMS Internals- Part IV Tree-based (i.e., B+
More informationCS 245 Midterm Exam Winter 2014
CS 245 Midterm Exam Winter 2014 This exam is open book and notes. You can use a calculator and your laptop to access course notes and videos (but not to communicate with other people). You have 70 minutes
More informationChapter 12: Query Processing. Chapter 12: Query Processing
Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 12: Query Processing Overview Measures of Query Cost Selection Operation Sorting Join
More informationQuery Processing: A Systems View. Announcements (March 1) Physical (execution) plan. CPS 216 Advanced Database Systems
Query Processing: A Systems View CPS 216 Advanced Database Systems Announcements (March 1) 2 Reading assignment due Wednesday Buffer management Homework #2 due this Thursday Course project proposal due
More informationSpring 2013 CS 122C & CS 222 Midterm Exam (and Comprehensive Exam, Part I) (Max. Points: 100)
Spring 2013 CS 122C & CS 222 Midterm Exam (and Comprehensive Exam, Part I) (Max. Points: 100) Instructions: - This exam is closed book and closed notes but open cheat sheet. - The total time for the exam
More informationCS-245 Database System Principles
CS-245 Database System Principles Midterm Exam Summer 2001 SOLUIONS his exam is open book and notes. here are a total of 110 points. You have 110 minutes to complete it. Print your name: he Honor Code
More informationAdvanced Database Systems
Lecture IV Query Processing Kyumars Sheykh Esmaili Basic Steps in Query Processing 2 Query Optimization Many equivalent execution plans Choosing the best one Based on Heuristics, Cost Will be discussed
More informationAnnouncements (March 1) Query Processing: A Systems View. Physical (execution) plan. Announcements (March 3) Physical plan execution
Announcements (March 1) 2 Query Processing: A Systems View CPS 216 Advanced Database Systems Reading assignment due Wednesday Buffer management Homework #2 due this Thursday Course project proposal due
More informationChapter 12: Query Processing
Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Overview Chapter 12: Query Processing Measures of Query Cost Selection Operation Sorting Join
More informationUniversity of California, Berkeley. (2 points for each row; 1 point given if part of the change in the row was correct)
University of California, Berkeley CS 186 Intro to Database Systems, Fall 2012, Prof. Michael J. Franklin MIDTERM II - Questions This is a closed book examination but you are allowed one 8.5 x 11 sheet
More informationCS 186/286 Spring 2018 Midterm 1
CS 186/286 Spring 2018 Midterm 1 Do not turn this page until instructed to start the exam. You should receive 1 single-sided answer sheet and a 18-page exam packet. All answers should be written on the
More informationCS 222/122C Fall 2016, Midterm Exam
STUDENT NAME: STUDENT ID: Instructions: CS 222/122C Fall 2016, Midterm Exam Principles of Data Management Department of Computer Science, UC Irvine Prof. Chen Li (Max. Points: 100) This exam has six (6)
More informationMidterm 1: CS186, Spring I. Storage: Disk, Files, Buffers [11 points] SOLUTION. cs186-
Midterm 1: CS186, Spring 2016 Name: Class Login: SOLUTION cs186- You should receive 1 double-sided answer sheet and an 10-page exam. Mark your name and login on both sides of the answer sheet, and in the
More informationExternal Sorting Implementing Relational Operators
External Sorting Implementing Relational Operators 1 Readings [RG] Ch. 13 (sorting) 2 Where we are Working our way up from hardware Disks File abstraction that supports insert/delete/scan Indexing for
More informationUniversity of Waterloo Midterm Examination Sample Solution
1. (4 total marks) University of Waterloo Midterm Examination Sample Solution Winter, 2012 Suppose that a relational database contains the following large relation: Track(ReleaseID, TrackNum, Title, Length,
More informationCS 186/286 Spring 2018 Midterm 1
CS 186/286 Spring 2018 Midterm 1 Do not turn this page until instructed to start the exam. You should receive 1 single-sided answer sheet and a 13-page exam packet. All answers should be written on the
More informationCMPS 181, Database Systems II, Final Exam, Spring 2016 Instructor: Shel Finkelstein. Student ID: UCSC
CMPS 181, Database Systems II, Final Exam, Spring 2016 Instructor: Shel Finkelstein Student Name: Student ID: UCSC Email: Final Points: Part Max Points Points I 15 II 29 III 31 IV 19 V 16 Total 110 Closed
More informationIndexing. Chapter 8, 10, 11. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1
Indexing Chapter 8, 10, 11 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Tree-Based Indexing The data entries are arranged in sorted order by search key value. A hierarchical search
More informationCSC 261/461 Database Systems Lecture 17. Fall 2017
CSC 261/461 Database Systems Lecture 17 Fall 2017 Announcement Quiz 6 Due: Tonight at 11:59 pm Project 1 Milepost 3 Due: Nov 10 Project 2 Part 2 (Optional) Due: Nov 15 The IO Model & External Sorting Today
More informationQuery Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016
Query Processing Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016 Slides re-used with some modification from www.db-book.com Reference: Database System Concepts, 6 th Ed. By Silberschatz,
More informationWhy Is This Important? Overview of Storage and Indexing. Components of a Disk. Data on External Storage. Accessing a Disk Page. Records on a Disk Page
Why Is This Important? Overview of Storage and Indexing Chapter 8 DB performance depends on time it takes to get the data from storage system and time to process Choosing the right index for faster access
More informationMidterm 1: CS186, Spring I. Storage: Disk, Files, Buffers [11 points] cs186-
Midterm 1: CS186, Spring 2016 Name: Class Login: cs186- You should receive 1 double-sided answer sheet and an 11-page exam. Mark your name and login on both sides of the answer sheet, and in the blanks
More informationDatabase design and implementation CMPSCI 645. Lecture 08: Storage and Indexing
Database design and implementation CMPSCI 645 Lecture 08: Storage and Indexing 1 Where is the data and how to get to it? DB 2 DBMS architecture Query Parser Query Rewriter Query Op=mizer Query Executor
More informationDatabase Applications (15-415)
Database Applications (15-415) DBMS Internals: Part II Lecture 11, February 17, 2015 Mohammad Hammoud Last Session: DBMS Internals- Part I Today Today s Session: DBMS Internals- Part II A Brief Summary
More informationL9: Storage Manager Physical Data Organization
L9: Storage Manager Physical Data Organization Disks and files Record and file organization Indexing Tree-based index: B+-tree Hash-based index c.f. Fig 1.3 in [RG] and Fig 2.3 in [EN] Functional Components
More informationOverview of Storage and Indexing
Overview of Storage and Indexing UVic C SC 370 Dr. Daniel M. German Department of Computer Science July 2, 2003 Version: 1.1.1 7 1 Overview of Storage and Indexing (1.1.1) CSC 370 dmgerman@uvic.ca Overview
More informationData on External Storage
Advanced Topics in DBMS Ch-1: Overview of Storage and Indexing By Syed khutubddin Ahmed Assistant Professor Dept. of MCA Reva Institute of Technology & mgmt. Data on External Storage Prg1 Prg2 Prg3 DBMS
More informationLecture 8 Index (B+-Tree and Hash)
CompSci 516 Data Intensive Computing Systems Lecture 8 Index (B+-Tree and Hash) Instructor: Sudeepa Roy Duke CS, Fall 2017 CompSci 516: Database Systems 1 HW1 due tomorrow: Announcements Due on 09/21 (Thurs),
More informationFaloutsos 1. Carnegie Mellon Univ. Dept. of Computer Science Database Applications. Outline
Carnegie Mellon Univ. Dept. of Computer Science 15-415 - Database Applications Lecture #14: Implementation of Relational Operations (R&G ch. 12 and 14) 15-415 Faloutsos 1 introduction selection projection
More informationMidterm 1: CS186, Spring 2015
Midterm 1: CS186, Spring 2015 Prof. J. Hellerstein You should receive a double-sided answer sheet and a 7-page exam. Mark your name and login on both sides of the answer sheet, and in the blanks above.
More informationEXTERNAL SORTING. Sorting
EXTERNAL SORTING 1 Sorting A classic problem in computer science! Data requested in sorted order (sorted output) e.g., find students in increasing grade point average (gpa) order SELECT A, B, C FROM R
More informationCS222P Fall 2017, Final Exam
STUDENT NAME: STUDENT ID: CS222P Fall 2017, Final Exam Principles of Data Management Department of Computer Science, UC Irvine Prof. Chen Li (Max. Points: 100 + 15) Instructions: This exam has seven (7)
More informationCOMP 346 WINTER 2018 MEMORY MANAGEMENT (VIRTUAL MEMORY)
COMP 346 WINTER 2018 1 MEMORY MANAGEMENT (VIRTUAL MEMORY) VIRTUAL MEMORY A process may be broken up into pieces (pages or segments) that do not need to be located contiguously in main memory. Memory references
More informationAnnouncements. Reading Material. Recap. Today 9/17/17. Storage (contd. from Lecture 6)
CompSci 16 Intensive Computing Systems Lecture 7 Storage and Index Instructor: Sudeepa Roy Announcements HW1 deadline this week: Due on 09/21 (Thurs), 11: pm, no late days Project proposal deadline: Preliminary
More informationCSIT5300: Advanced Database Systems
CSIT5300: Advanced Database Systems L08: B + -trees and Dynamic Hashing Dr. Kenneth LEUNG Department of Computer Science and Engineering The Hong Kong University of Science and Technology Hong Kong SAR,
More informationImplementing Relational Operators: Selection, Projection, Join. Database Management Systems, R. Ramakrishnan and J. Gehrke 1
Implementing Relational Operators: Selection, Projection, Join Database Management Systems, R. Ramakrishnan and J. Gehrke 1 Readings [RG] Sec. 14.1-14.4 Database Management Systems, R. Ramakrishnan and
More informationReview: Memory, Disks, & Files. File Organizations and Indexing. Today: File Storage. Alternative File Organizations. Cost Model for Analysis
File Organizations and Indexing Review: Memory, Disks, & Files Lecture 4 R&G Chapter 8 "If you don't find it in the index, look very carefully through the entire catalogue." -- Sears, Roebuck, and Co.,
More informationPrinciples of Data Management. Lecture #9 (Query Processing Overview)
Principles of Data Management Lecture #9 (Query Processing Overview) Instructor: Mike Carey mjcarey@ics.uci.edu Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Today s Notable News v Midterm
More informationRoadmap. Handling large amount of data efficiently. Stable storage. Parallel dataflow. External memory algorithms and data structures
Roadmap Handling large amount of data efficiently Stable storage External memory algorithms and data structures Implementing relational operators Parallel dataflow Algorithms for MapReduce Implementing
More informationIntroduction to Indexing 2. Acknowledgements: Eamonn Keogh and Chotirat Ann Ratanamahatana
Introduction to Indexing 2 Acknowledgements: Eamonn Keogh and Chotirat Ann Ratanamahatana Indexed Sequential Access Method We have seen that too small or too large an index (in other words too few or too
More informationCSE 344 FEBRUARY 14 TH INDEXING
CSE 344 FEBRUARY 14 TH INDEXING EXAM Grades posted to Canvas Exams handed back in section tomorrow Regrades: Friday office hours EXAM Overall, you did well Average: 79 Remember: lowest between midterm/final
More informationCSE-1520R Test #1. The exam is closed book, closed notes, and no aids such as calculators, cellphones, etc.
9 February 2011 CSE-1520R Test #1 [7F] w/ answers p. 1 of 8 CSE-1520R Test #1 Sur / Last Name: Given / First Name: Student ID: Instructor: Parke Godfrey Exam Duration: 45 minutes Term: Winter 2011 The
More informationCSE-1520R Test #1. The exam is closed book, closed notes, and no aids such as calculators, cellphones, etc.
9 February 2011 CSE-1520R Test #1 [B4] p. 1 of 8 CSE-1520R Test #1 Sur / Last Name: Given / First Name: Student ID: Instructor: Parke Godfrey Exam Duration: 45 minutes Term: Winter 2011 The exam is closed
More informationTotalCost = 3 (1, , 000) = 6, 000
156 Chapter 12 HASH JOIN: Now both relations are the same size, so we can treat either one as the smaller relation. With 15 buffer pages the first scan of S splits it into 14 buckets, each containing about
More informationCMPUT 391 Database Management Systems. Query Processing: The Basics. Textbook: Chapter 10. (first edition: Chapter 13) University of Alberta 1
CMPUT 391 Database Management Systems Query Processing: The Basics Textbook: Chapter 10 (first edition: Chapter 13) Based on slides by Lewis, Bernstein and Kifer University of Alberta 1 External Sorting
More informationSTORING DATA: DISK AND FILES
STORING DATA: DISK AND FILES CS 564- Spring 2018 ACKs: Dan Suciu, Jignesh Patel, AnHai Doan WHAT IS THIS LECTURE ABOUT? How does a DBMS store data? disk, SSD, main memory The Buffer manager controls how
More informationChapter 17. Disk Storage, Basic File Structures, and Hashing. Records. Blocking
Chapter 17 Disk Storage, Basic File Structures, and Hashing Records Fixed and variable length records Records contain fields which have values of a particular type (e.g., amount, date, time, age) Fields
More informationExtra: B+ Trees. Motivations. Differences between BST and B+ 10/27/2017. CS1: Java Programming Colorado State University
Extra: B+ Trees CS1: Java Programming Colorado State University Slides by Wim Bohm and Russ Wakefield 1 Motivations Many times you want to minimize the disk accesses while doing a search. A binary search
More informationCSE 530A. B+ Trees. Washington University Fall 2013
CSE 530A B+ Trees Washington University Fall 2013 B Trees A B tree is an ordered (non-binary) tree where the internal nodes can have a varying number of child nodes (within some range) B Trees When a key
More informationIntro to DB CHAPTER 12 INDEXING & HASHING
Intro to DB CHAPTER 12 INDEXING & HASHING Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing
More information192 Chapter 14. TotalCost=3 (1, , 000) = 6, 000
192 Chapter 14 5. SORT-MERGE: With 52 buffer pages we have B> M so we can use the mergeon-the-fly refinement which costs 3 (M + N). TotalCost=3 (1, 000 + 1, 000) = 6, 000 HASH JOIN: Now both relations
More informationChapter 13: Query Processing
Chapter 13: Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 13.1 Basic Steps in Query Processing 1. Parsing
More informationExternal Sorting. Chapter 13. Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1
External Sorting Chapter 13 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke 1 Why Sort? A classic problem in computer science! Data requested in sorted order e.g., find students in increasing
More informationINSTITUTO SUPERIOR TÉCNICO Administração e optimização de Bases de Dados
-------------------------------------------------------------------------------------------------------------- INSTITUTO SUPERIOR TÉCNICO Administração e optimização de Bases de Dados Exam 1 - solution
More informationAdministrivia. Tree-Structured Indexes. Review. Today: B-Tree Indexes. A Note of Caution. Introduction
Administrivia Tree-Structured Indexes Lecture R & G Chapter 9 Homeworks re-arranged Midterm Exam Graded Scores on-line Key available on-line If I had eight hours to chop down a tree, I'd spend six sharpening
More informationINSTITUTO SUPERIOR TÉCNICO Administração e optimização de Bases de Dados
-------------------------------------------------------------------------------------------------------------- INSTITUTO SUPERIOR TÉCNICO Administração e optimização de Bases de Dados Exam 1 16 June 2014
More informationQueen s University Faculty of Arts and Science School of Computing CISC 432* / 836* Advanced Database Systems
HAND IN Queen s University Faculty of Arts and Science School of Computing CISC 432* / 836* Advanced Database Systems Final Examination December 14, 2002 Instructor: Pat Martin Instructions: 1. This examination
More informationDatabase Applications (15-415)
Database Applications (15-415) DBMS Internals- Part VI Lecture 14, March 12, 2014 Mohammad Hammoud Today Last Session: DBMS Internals- Part V Hash-based indexes (Cont d) and External Sorting Today s Session:
More informationM-ary Search Tree. B-Trees. Solution: B-Trees. B-Tree: Example. B-Tree Properties. B-Trees (4.7 in Weiss)
M-ary Search Tree B-Trees (4.7 in Weiss) Maximum branching factor of M Tree with N values has height = # disk accesses for find: Runtime of find: 1/21/2011 1 1/21/2011 2 Solution: B-Trees specialized M-ary
More information! A relational algebra expression may have many equivalent. ! Cost is generally measured as total elapsed time for
Chapter 13: Query Processing Basic Steps in Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 1. Parsing and
More informationChapter 13: Query Processing Basic Steps in Query Processing
Chapter 13: Query Processing Basic Steps in Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 1. Parsing and
More informationChapter 3 - Memory Management
Chapter 3 - Memory Management Luis Tarrataca luis.tarrataca@gmail.com CEFET-RJ L. Tarrataca Chapter 3 - Memory Management 1 / 222 1 A Memory Abstraction: Address Spaces The Notion of an Address Space Swapping
More informationCS 245 Midterm Exam Solution Winter 2015
CS 245 Midterm Exam Solution Winter 2015 This exam is open book and notes. You can use a calculator and your laptop to access course notes and videos (but not to communicate with other people). You have
More informationDatabase Applications (15-415)
Database Applications (15-415) DBMS Internals- Part IV Lecture 14, March 10, 015 Mohammad Hammoud Today Last Two Sessions: DBMS Internals- Part III Tree-based indexes: ISAM and B+ trees Data Warehousing/
More informationECE 5730 Memory Systems
ECE 5730 Memory Systems Spring 2009 Command Scheduling Disk Caching Lecture 23: 1 Announcements Quiz 12 I ll give credit for #4 if you answered (d) Quiz 13 (last one!) on Tuesday Make-up class #2 Thursday,
More informationCSE 190D Spring 2017 Final Exam Answers
CSE 190D Spring 2017 Final Exam Answers Q 1. [20pts] For the following questions, clearly circle True or False. 1. The hash join algorithm always has fewer page I/Os compared to the block nested loop join
More informationDisks and Files. Storage Structures Introduction Chapter 8 (3 rd edition) Why Not Store Everything in Main Memory?
Why Not Store Everything in Main Memory? Storage Structures Introduction Chapter 8 (3 rd edition) Sharma Chakravarthy UT Arlington sharma@cse.uta.edu base Management Systems: Sharma Chakravarthy Costs
More informationDatabase Applications (15-415)
Database Applications (15-415) DBMS Internals- Part V Lecture 15, March 15, 2015 Mohammad Hammoud Today Last Session: DBMS Internals- Part IV Tree-based (i.e., B+ Tree) and Hash-based (i.e., Extendible
More informationUniversity of Waterloo Midterm Examination Solution
University of Waterloo Midterm Examination Solution Winter, 2011 1. (6 total marks) The diagram below shows an extensible hash table with four hash buckets. Each number x in the buckets represents an entry
More informationTree-Structured Indexes
Tree-Structured Indexes Chapter 9 Database Management Systems, R. Ramakrishnan and J. Gehrke 1 Introduction As for any index, 3 alternatives for data entries k*: Data record with key value k
More informationDisks and Files. Jim Gray s Storage Latency Analogy: How Far Away is the Data? Components of a Disk. Disks
Review Storing : Disks and Files Lecture 3 (R&G Chapter 9) Aren t bases Great? Relational model SQL Yea, from the table of my memory I ll wipe away all trivial fond records. -- Shakespeare, Hamlet A few
More informationChapter 11: Indexing and Hashing
Chapter 11: Indexing and Hashing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 11: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree
More informationChapter 11: Indexing and Hashing
Chapter 11: Indexing and Hashing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 11: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree
More informationTree-Structured Indexes
Tree-Structured Indexes Yanlei Diao UMass Amherst Slides Courtesy of R. Ramakrishnan and J. Gehrke Access Methods v File of records: Abstraction of disk storage for query processing (1) Sequential scan;
More informationCSE-3421 Test #1 Design
2 April 2009 CSE-3421 Test #1 (corrected) w/ answers p. 1 of 10 CSE-3421 Test #1 Design Family Name: Given Name: Student#: CS Account: Instructor: Parke Godfrey Exam Duration: 75 minutes Term: winter 2009
More informationEvaluation of Relational Operations: Other Techniques
Evaluation of Relational Operations: Other Techniques [R&G] Chapter 14, Part B CS4320 1 Using an Index for Selections Cost depends on #qualifying tuples, and clustering. Cost of finding qualifying data
More informationMemory. Objectives. Introduction. 6.2 Types of Memory
Memory Objectives Master the concepts of hierarchical memory organization. Understand how each level of memory contributes to system performance, and how the performance is measured. Master the concepts
More informationTree-Structured Indexes
Tree-Structured Indexes Chapter 10 Comp 521 Files and Databases Fall 2010 1 Introduction As for any index, 3 alternatives for data entries k*: index refers to actual data record with key value k index
More informationLast Class Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications
Last Class Carnegie Mellon Univ. Dept. of Computer Science 15-415/615 - DB Applications C. Faloutsos A. Pavlo Lecture#12: External Sorting (R&G, Ch13) Static Hashing Extendible Hashing Linear Hashing Hashing
More informationCS 540: Introduction to Artificial Intelligence
CS 540: Introduction to Artificial Intelligence Midterm Exam: 7:15-9:15 pm, October, 014 Room 140 CS Building CLOSED BOOK (one sheet of notes and a calculator allowed) Write your answers on these pages
More informationContext. File Organizations and Indexing. Cost Model for Analysis. Alternative File Organizations. Some Assumptions in the Analysis.
File Organizations and Indexing Context R&G Chapter 8 "If you don't find it in the index, look very carefully through the entire catalogue." -- Sears, Roebuck, and Co., Consumer's Guide, 1897 Query Optimization
More informationIndexing: B + -Tree. CS 377: Database Systems
Indexing: B + -Tree CS 377: Database Systems Recap: Indexes Data structures that organize records via trees or hashing Speed up search for a subset of records based on values in a certain field (search
More informationSorting a File of Records
Sorting a File of Records 1 Consider the problem of sorting a large file, stored on disk, containing a large number of logical records. If the file is very large at all then it will be impossible to load
More informationa process may be swapped in and out of main memory such that it occupies different regions
Virtual Memory Characteristics of Paging and Segmentation A process may be broken up into pieces (pages or segments) that do not need to be located contiguously in main memory Memory references are dynamically
More informationIndexes. File Organizations and Indexing. First Question to Ask About Indexes. Index Breakdown. Alternatives for Data Entries (Contd.
File Organizations and Indexing Lecture 4 R&G Chapter 8 "If you don't find it in the index, look very carefully through the entire catalogue." -- Sears, Roebuck, and Co., Consumer's Guide, 1897 Indexes
More informationRAID in Practice, Overview of Indexing
RAID in Practice, Overview of Indexing CS634 Lecture 4, Feb 04 2014 Slides based on Database Management Systems 3 rd ed, Ramakrishnan and Gehrke 1 Disks and Files: RAID in practice For a big enterprise
More informationIntroduction. Choice orthogonal to indexing technique used to locate entries K.
Tree-Structured Indexes Werner Nutt Introduction to Database Systems Free University of Bozen-Bolzano 2 Introduction As for any index, three alternatives for data entries K : Data record with key value
More informationM-ary Search Tree. B-Trees. B-Trees. Solution: B-Trees. B-Tree: Example. B-Tree Properties. Maximum branching factor of M Complete tree has height =
M-ary Search Tree B-Trees Section 4.7 in Weiss Maximum branching factor of M Complete tree has height = # disk accesses for find: Runtime of find: 2 Solution: B-Trees specialized M-ary search trees Each
More informationSystems Infrastructure for Data Science. Web Science Group Uni Freiburg WS 2014/15
Systems Infrastructure for Data Science Web Science Group Uni Freiburg WS 2014/15 Lecture II: Indexing Part I of this course Indexing 3 Database File Organization and Indexing Remember: Database tables
More informationAdministriva. CS 133: Databases. General Themes. Goals for Today. Fall 2018 Lec 11 10/11 Query Evaluation Prof. Beth Trushkowsky
Administriva Lab 2 Final version due next Wednesday CS 133: Databases Fall 2018 Lec 11 10/11 Query Evaluation Prof. Beth Trushkowsky Problem sets PSet 5 due today No PSet out this week optional practice
More informationDatabase System Concepts, 6 th Ed. Silberschatz, Korth and Sudarshan See for conditions on re-use
Chapter 11: Indexing and Hashing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files Static
More informationDatenbanksysteme II: Caching and File Structures. Ulf Leser
Datenbanksysteme II: Caching and File Structures Ulf Leser Content of this Lecture Caching Overview Accessing data Cache replacement strategies Prefetching File structure Index Files Ulf Leser: Implementation
More informationEXAM 1 SOLUTIONS. Midterm Exam. ECE 741 Advanced Computer Architecture, Spring Instructor: Onur Mutlu
Midterm Exam ECE 741 Advanced Computer Architecture, Spring 2009 Instructor: Onur Mutlu TAs: Michael Papamichael, Theodoros Strigkos, Evangelos Vlachos February 25, 2009 EXAM 1 SOLUTIONS Problem Points
More information