UNIT III DATA STORAGE AND QUERY PROCESSING


Although a database system provides a high-level view of data, ultimately data have to be stored as bits on one or more storage devices. A vast majority of databases today store data on magnetic disk and fetch data into main memory for processing, or copy data onto tapes and other backup devices for archival storage. The physical characteristics of storage devices play a major role in the way data are stored, in particular because access to a random piece of data on disk is much slower than memory access: disk access takes tens of milliseconds, whereas memory access takes a tenth of a microsecond.

Many queries reference only a small proportion of the records in a file. An index is a structure that helps locate desired records of a relation quickly, without examining all records.

User queries have to be executed on the database contents, which reside on storage devices. It is usually convenient to break up queries into smaller operations, roughly corresponding to the relational algebra operations. There are many alternative ways of processing a query, which can have widely varying costs. Query optimization refers to the process of finding the lowest-cost method of evaluating a given query.

Storage Devices

Computer Storage Medium (Hierarchy)
Factors: cost, capacity, speed
Primary storage: data processed directly by the CPU; main memory, cache memory
Secondary (on-line) storage: data must first be copied into primary storage for processing; magnetic disks
Secondary (off-line) storage: optical disks (direct access), magnetic tapes (sequential)

Classification of Physical Storage Media
Speed with which data can be accessed
Cost per unit of data
Reliability
  o data loss on power failure or system crash
  o physical failure of the storage device
Can differentiate storage into:
  o volatile storage: loses contents when power is switched off
  o non-volatile storage: contents persist even when power is switched off. Includes secondary and tertiary storage, as well as battery-backed-up main memory.

Physical Storage Media
Cache: fastest and most costly form of storage; volatile; managed by the computer system hardware.
Main memory:
  o fast access (10s to 100s of nanoseconds; 1 nanosecond = 10^-9 seconds)
  o generally too small (or too expensive) to store the entire database; capacities of up to a few gigabytes widely used currently

  o capacities have gone up and per-byte costs have decreased steadily and rapidly (roughly a factor of 2 every 2 to 3 years)
  o volatile: contents of main memory are usually lost if a power failure or system crash occurs
Flash memory
  o Data survives power failure
  o Data can be written at a location only once, but the location can be erased and written to again
      Can support only a limited number of write/erase cycles.
      Erasing of memory has to be done to an entire bank of memory
  o Reads are roughly as fast as main memory
  o But writes are slow (few microseconds), erase is slower
  o Cost per unit of storage roughly similar to main memory
  o Widely used in embedded devices such as digital cameras
  o Also known as EEPROM (Electrically Erasable Programmable Read-Only Memory)
Magnetic disk
  o Data is stored on spinning disk, and read/written magnetically
  o Primary medium for the long-term storage of data; typically stores entire database.
  o Data must be moved from disk to main memory for access, and written back for storage; much slower access than main memory
  o Direct access: possible to read data on disk in any order, unlike magnetic tape
  o Hard disks vs floppy disks
  o Capacities range up to roughly 100 GB currently; much larger capacity and lower cost/byte than main memory/flash memory

  o Disk capacity is growing constantly and rapidly with technology improvements (factor of 2 to 3 every 2 years)
  o Survives power failures and system crashes; disk failure can destroy data, but is very rare
Optical storage
  o non-volatile; data is read optically from a spinning disk using a laser
  o CD-ROM (640 MB) and DVD (4.7 to 17 GB) most popular forms
  o Write-once, read-many (WORM) optical disks used for archival storage (CD-R and DVD-R)
  o Multiple-write versions also available (CD-RW, DVD-RW, and DVD-RAM)
  o Reads and writes are slower than with magnetic disk
  o Juke-box systems, with large numbers of removable disks, a few drives, and a mechanism for automatic loading/unloading of disks, are available for storing large volumes of data
Tape storage
  o non-volatile, used primarily for backup (to recover from disk failure) and for archival data
  o sequential access: much slower than disk
  o very high capacity (40 to 300 GB tapes available)
  o tape can be removed from drive; storage costs much cheaper than disk, but drives are expensive
  o Tape jukeboxes available for storing massive amounts of data: hundreds of terabytes (1 terabyte = 10^12 bytes) to even a petabyte (1 petabyte = 10^15 bytes)

Storage Hierarchy

Primary storage: fastest media but volatile (cache, main memory).
Secondary storage: next level in hierarchy, non-volatile, moderately fast access time
  o also called on-line storage
  o e.g. flash memory, magnetic disks
Tertiary storage: lowest level in hierarchy, non-volatile, slow access time
  o also called off-line storage
  o e.g. magnetic tape, optical storage

Magnetic Hard Disk Mechanism

Read-write head
  o Positioned very close to the platter surface (almost touching it)
  o Reads or writes magnetically encoded information.
Surface of platter divided into circular tracks
  o Over 16,000 tracks per platter on typical hard disks
Each track is divided into sectors.
  o A sector is the smallest unit of data that can be read or written.
  o Sector size typically 512 bytes
  o Typical sectors per track: 200 (on inner tracks) to 400 (on outer tracks)
To read/write a sector
  o disk arm swings to position head on right track
  o platter spins continually; data is read/written as sector passes under head
Head-disk assemblies
  o multiple disk platters on a single spindle (typically 2 to 4)
  o one head per platter, mounted on a common arm.
Cylinder i consists of the ith track of all the platters
Earlier generation disks were susceptible to head crashes
  o Surface of earlier generation disks had metal-oxide coatings which would disintegrate on head crash and damage all data on disk

  o Current generation disks are less susceptible to such disastrous failures, although individual sectors may get corrupted
Disk controller: interfaces between the computer system and the disk drive hardware.
  o accepts high-level commands to read or write a sector
  o initiates actions such as moving the disk arm to the right track and actually reading or writing the data
  o computes and attaches checksums to each sector to verify that data is read back correctly; if data is corrupted, with very high probability the stored checksum won't match the recomputed checksum
  o ensures successful writing by reading back sector after writing it
  o performs remapping of bad sectors

Performance Measures of Disks
Access time: the time it takes from when a read or write request is issued to when data transfer begins. Consists of:
  o Seek time: time it takes to reposition the arm over the correct track.
      Average seek time is 1/2 the worst-case seek time. It would be 1/3 if all tracks had the same number of sectors and we ignored the time to start and stop arm movement.
      4 to 10 milliseconds on typical disks
  o Rotational latency: time it takes for the sector to be accessed to appear under the head.
      Average latency is 1/2 of the worst-case latency.
      4 to 11 milliseconds on typical disks (5400 r.p.m. and faster)
Data-transfer rate: the rate at which data can be retrieved from or stored to the disk.
  o 4 to 8 MB per second is typical
  o Multiple disks may share a controller, so the rate that the controller can handle is also important

      E.g. ATA-5: 66 MB/s, SCSI-3: 40 MB/s, Fiber Channel: 256 MB/s
Mean time to failure (MTTF): the average time the disk is expected to run continuously without any failure.
  o Typically 3 to 5 years
  o Probability of failure of new disks is quite low, corresponding to a theoretical MTTF of 30,000 to 1,200,000 hours for a new disk
      E.g., an MTTF of 1,200,000 hours for a new disk means that given 1000 relatively new disks, on average one will fail every 1200 hours
  o MTTF decreases as disk ages

RAID
RAID: Redundant Arrays of Independent Disks
  o disk organization techniques that manage a large number of disks, providing a view of a single disk of
      high capacity and high speed, by using multiple disks in parallel, and
      high reliability, by storing data redundantly, so that data can be recovered even if a disk fails
The chance that some disk out of a set of N disks will fail is much higher than the chance that a specific single disk will fail.
  o E.g., a system with 100 disks, each with MTTF of 100,000 hours (approx. 11 years), will have a system MTTF of 100,000 / 100 = 1000 hours (approx. 41 days)
  o Techniques for using redundancy to avoid data loss are critical with large numbers of disks
Originally a cost-effective alternative to large, expensive disks
  o The I in RAID originally stood for "inexpensive"
  o Today RAIDs are used for their higher reliability and bandwidth. The I is interpreted as "independent"

RAID Levels

Schemes to provide redundancy at lower cost by using disk striping combined with parity bits
  o Different RAID organizations, or RAID levels, have differing cost, performance and reliability characteristics
RAID Level 0: Block striping; non-redundant.
  o Used in high-performance applications where data loss is not critical.
RAID Level 1: Mirrored disks with block striping
  o Offers best write performance.
  o Popular for applications such as storing log files in a database system.
RAID Level 2: Memory-Style Error-Correcting-Codes (ECC) with bit striping.
RAID Level 3: Bit-Interleaved Parity
  o a single parity bit is enough for error correction, not just detection, since we know which disk has failed
      When writing data, corresponding parity bits must also be computed and written to a parity bit disk
      To recover data in a damaged disk, compute XOR of bits from other disks (including parity bit disk)
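The XOR recovery rule can be illustrated with a minimal Python sketch. It is not a RAID implementation: the "disks" are short byte strings invented for the example, and `parity_block` is a hypothetical helper name. XOR on bytes works bit by bit, so the same idea applies to bit-interleaved parity.

```python
from functools import reduce

def parity_block(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data disks holding two bytes each (made-up contents).
data_disks = [b"\x0f\xa0", b"\x33\x11", b"\xf0\x5c"]
parity = parity_block(data_disks)          # contents of the parity disk

# Simulate losing disk 1: XOR the surviving disks with the parity disk.
survivors = [data_disks[0], data_disks[2], parity]
rebuilt = parity_block(survivors)          # equals the lost disk's contents
```

Recovery works because XOR-ing a value into the parity twice cancels it out, so the parity combined with all surviving disks leaves exactly the missing data.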

RAID Level 3 (Cont.)
  o Faster data transfer than with a single disk, but fewer I/Os per second since every disk has to participate in every I/O.
  o Subsumes Level 2 (provides all its benefits, at lower cost).
RAID Level 4: Block-Interleaved Parity; uses block-level striping, and keeps a parity block on a separate disk for corresponding blocks from N other disks.
  o When writing a data block, the corresponding block of parity bits must also be computed and written to the parity disk
  o To find the value of a damaged block, compute XOR of bits from corresponding blocks (including parity block) from other disks.
RAID Level 4 (Cont.)
  o Provides higher I/O rates for independent block reads than Level 3
      a block read goes to a single disk, so blocks stored on different disks can be read in parallel
  o Provides higher transfer rates for reads of multiple blocks than no striping
  o Before writing a block, parity data must be computed

      Can be done by using old parity block, old value of current block and new value of current block (2 block reads + 2 block writes)
      Or by recomputing the parity value using the new values of blocks corresponding to the parity block; more efficient for writing large amounts of data sequentially
  o Parity block becomes a bottleneck for independent block writes, since every block write also writes to the parity disk
RAID Level 5: Block-Interleaved Distributed Parity; partitions data and parity among all N + 1 disks, rather than storing data in N disks and parity in 1 disk.
  o E.g., with 5 disks, the parity block for the nth set of blocks is stored on disk (n mod 5) + 1, with the data blocks stored on the other 4 disks.

Storage of Databases
Main Memory Databases
  o entire databases are kept in main memory
  o main memory is volatile storage: requires a backup copy (on magnetic disk)
Most databases are stored permanently on magnetic disk

  o they are too large to fit entirely in main memory
  o magnetic disk is less expensive

File Records on Disk
Records
  o file as a sequence of records
  o record type = field names + data types
Fixed-Length Records
  o records with the same size in a file
Variable-Length Records (with separators)
  o records of different sizes, caused by multi-valued fields, optional fields, or variable-length fields

File Blocks on Disk
Disk Block
  o unit of data transfer between disk and memory
  o records of a file are allocated to disk blocks
  o usually 512 to 4K bytes (K = 1024)
Blocking Factor (bfr)
  o number of (fixed-length) records in a block
  o bfr = ⌊B/R⌋, where B = block size and R = record size (in bytes)
Spanned vs. Unspanned File Organization
  o Unspanned: leaves the remaining space in each block unused
  o Spanned: utilizes the unused space (a record may continue into the next block)
Contiguous vs. Linked Allocation
  o Contiguous: file blocks are allocated to consecutive disk blocks
  o Linked: each file block contains the pointer to the next block

Operations on Files
Types of Operations
Retrieval: does not change data in the file (open/close a file, find/read records)

Update: changes the file by insertion, deletion or modification of records
Record-at-a-time: operations are applied to a single record
Set-at-a-time: operations are applied to a set of records or to the whole file

File Open/Close Operations
Open: readies the file for access, allocates buffers to hold file blocks, sets the file pointer to the beginning of the file
Close: terminates access to the file

Record-at-a-time Operations
Find: searches for the first file record that satisfies a certain condition (selection condition), and makes it the current file record
FindNext: searches for the next file record (from the current record) that satisfies the condition, and makes it the current file record
Read: reads the current file record
Insert: inserts a new record into the file and makes it the current file record
Delete: removes the current file record from the file by marking the record to indicate that it is no longer valid
Modify: changes the values of some fields of the current file record

Set-at-a-time Operations
FindAll: locates all the records satisfying a search condition
FindOrdered: retrieves all the records in a specific order
Reorganize: reorganizes the records after update operations

Operation Factors
Access Type: attribute value (=) or range (>)
Access Time: to find a particular record(s)
Insertion Time: to insert a new record (find the place to insert + index structure update)
Deletion Time: to delete a record (find the record(s) to delete + index structure update)
Space Overhead: additional space occupied by an index structure

Primary vs. Secondary File Organizations
Primary File Organizations
  o Heap Files
  o Sorted Files
  o Hashing
Secondary File Organizations (Index)
  o Single-level or Multi-level Indexes
  o B-trees
  o B+-trees

Heap Files
Files of Unordered Records
  o simplest and most basic file organization
  o new records are inserted at the end of the file
Access: linear search; requires searching through the file block by block (N/2 file blocks on average if the record exists, N file blocks if not); very inefficient (it takes O(N) time)
Insertion: very efficient (random order)
Deletion: must first find the record's block; inefficient
Direct File
  o allows direct access by the position of a record in a file
  o applies only to fixed-length records, contiguous allocation, and unspanned blocks
  o file records: 0, 1, ..., r−1 (e.g., r = 120)
  o records in each block: 0, 1, ..., bfr−1 (e.g., bfr = 15)
  o ith record of a file (e.g., i = 43): block position = ⌊i/bfr⌋, record position in the block = i mod bfr (here block 2, position 13)

Files of Ordered Records
File records are kept sorted by the values of an ordering field (sequential file):
Access: binary search (on its ordering field); requires reading and searching about log2(N) of the N file blocks on average (O(log N) time), an improvement over linear search

Insertion: records must be inserted in the correct order; very inefficient
Deletion: inefficient; less expensive with a deletion marker and periodic reorganization
FindOrdered: reading the records in order of the ordering key values is extremely efficient
Overflow: temporary unordered file for new records to improve insertion efficiency, periodically merged with the main ordered file

Hashing
Hash Functions
  o records in the file are unordered
  o determine the address (B) of a record based on the value of the hash field (K) in the record: h(K) -> B
  o e.g. h(K) = K mod M, giving addresses 0, 1, ..., M−1
  o allow direct access to the target disk block
  o record search within the block is done in main memory

Internal Hashing
  o hashing for an internal file
  o hash table as an array of records
  o a noninteger hash field value such as a name can be transformed into an integer (e.g. from its ASCII codes)
Collision (of hash addresses)
  o occurs when two hash field values are mapped into the same hash address
Collision Resolution
Open Addressing
  o checks the subsequent positions in order until an empty position is found
Chaining
  o extend the array with a number of overflow positions

  o use a linked list of overflow records for each hash address
  o overflow pointer refers to the position of the next record
Multiple Hashing
  o applies a second hash function if the first hash function results in a collision
  o uses open addressing or applies a third hash function if another collision results
Good Hashing Function
  o uniform and random distribution of records
  o hash table 70-90% full, to minimize collisions without leaving too many unused locations

External Hashing
Hashing Function
  o target address space is made of buckets (one disk block or a cluster of contiguous blocks)
  o maps a hash field value into a bucket number
  o bucket number is then converted to the corresponding disk block address
  o collision is less severe with buckets, because as many records as will fit in a bucket can hash to the same bucket without causing problems
Bucket Overflow
  o occurs when a bucket is filled to capacity
  o can be solved by the chaining method: a pointer is maintained in each bucket to a linked list of overflow records for the bucket
  o record pointers include both a block address and a relative record position within the block
Static Hashing
  o very fast access to records by the hash field
  o a fixed number of buckets M is allocated
  o not suitable for dynamic files (files that grow and shrink dynamically)
  o difficult to determine the number of buckets in advance
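As a rough illustration of static external hashing with overflow chaining, the sketch below keeps everything in memory. The class name `StaticHashFile`, the bucket capacity, and the integer keys are assumptions made for the example; a real system would map bucket numbers to disk block addresses.

```python
class StaticHashFile:
    """Toy model: M fixed buckets, each with a chain of overflow records."""

    def __init__(self, num_buckets, bucket_capacity):
        self.num_buckets = num_buckets            # M, fixed when the file is created
        self.bucket_capacity = bucket_capacity    # records that fit in one bucket
        self.buckets = [[] for _ in range(num_buckets)]
        self.overflow = [[] for _ in range(num_buckets)]  # one chain per bucket

    def _bucket_number(self, key):
        return key % self.num_buckets             # h(K) = K mod M

    def insert(self, key, record):
        b = self._bucket_number(key)
        if len(self.buckets[b]) < self.bucket_capacity:
            self.buckets[b].append((key, record))
        else:
            self.overflow[b].append((key, record))  # bucket overflow: chain it

    def find(self, key):
        b = self._bucket_number(key)
        for k, record in self.buckets[b] + self.overflow[b]:
            if k == key:
                return record
        return None


f = StaticHashFile(num_buckets=4, bucket_capacity=2)
for key, record in [(1, "a"), (5, "b"), (9, "c"), (2, "d")]:
    f.insert(key, record)   # keys 1, 5 and 9 all hash to bucket 1
```

With capacity 2, the third record that hashes to bucket 1 lands in that bucket's overflow chain. Lookups on that chain get slower as the file grows, which is the weakness of a fixed M noted above.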

Dynamic Hashing
  o dynamic files require a dynamic hashing technique
Extendible Hashing
  o maintains a directory of 2^d bucket addresses
  o uses the first d bits of a hash value to determine a directory entry and then a bucket address
  o d = global depth, d' = local depth of a bucket
  o directory expands and shrinks dynamically
  o bucket doubling (split) vs. halving (merge)
  o update directory and local depth appropriately

Indexing and Hashing
Basic Concepts
Indexing mechanisms are used to speed up access to desired data.
  o E.g., author catalog in library
Search key: attribute or set of attributes used to look up records in a file.
An index file consists of records (called index entries) of the form (search-key, pointer)
Index files are typically much smaller than the original file
Two basic kinds of indices:
  o Ordered indices: search keys are stored in sorted order
  o Hash indices: search keys are distributed uniformly across buckets using a hash function.

Index Evaluation Metrics
  o Access types supported efficiently. E.g., records with a specified value in the attribute, or records with an attribute value falling in a specified range of values.
  o Access time
  o Insertion time
  o Deletion time

  o Space overhead

Ordered Indices
In an ordered index, index entries are stored sorted on the search-key value. E.g., author catalog in library.
Primary index: in a sequentially ordered file, the index whose search key specifies the sequential order of the file.
  o Also called clustering index
  o The search key of a primary index is usually but not necessarily the primary key.
Secondary index: an index whose search key specifies an order different from the sequential order of the file. Also called non-clustering index.
Index-sequential file: ordered sequential file with a primary index.

Dense Index Files
Dense index: an index record appears for every search-key value in the file.

Sparse Index Files
Sparse index: contains index records for only some search-key values.
  o Applicable when records are sequentially ordered on search-key
To locate a record with search-key value K we:
  o Find the index record with the largest search-key value ≤ K
  o Search the file sequentially starting at the record to which the index record points
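The sparse-index lookup can be sketched in a few lines of Python. This is only an illustration under assumptions: the file is modelled as an in-memory sorted list of (key, payload) pairs, the index holds every third key, and the names `sparse_lookup`, `index_keys` and `index_pos` are invented for the example.

```python
from bisect import bisect_right

def sparse_lookup(index_keys, index_pos, records, k):
    """Find the record with key k using a sparse index over a sorted file."""
    # Step 1: index entry with the largest search-key value that does not exceed k.
    i = bisect_right(index_keys, k) - 1
    if i < 0:
        return None                      # k is smaller than every indexed key
    # Step 2: scan the file sequentially from the position the entry points to.
    pos = index_pos[i]
    while pos < len(records) and records[pos][0] <= k:
        if records[pos][0] == k:
            return records[pos]
        pos += 1
    return None

# Hypothetical sorted file and a sparse index with one entry per three records.
records = [(10, "a"), (20, "b"), (30, "c"), (40, "d"), (50, "e"), (60, "f"), (70, "g")]
index_keys = [10, 40, 70]
index_pos = [0, 3, 6]

found = sparse_lookup(index_keys, index_pos, records, 50)    # scans from key 40
missing = sparse_lookup(index_keys, index_pos, records, 45)  # stops at key 50
```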

Compared to dense indices:
  o Less space and less maintenance overhead for insertions and deletions.
  o Generally slower than dense index for locating records.
Good tradeoff: sparse index with an index entry for every block in file, corresponding to the least search-key value in the block.

Multilevel Index
If the primary index does not fit in memory, access becomes expensive.
Solution: treat the primary index kept on disk as a sequential file and construct a sparse index on it.
  o outer index: a sparse index of the primary index
  o inner index: the primary index file
If even the outer index is too large to fit in main memory, yet another level of index can be created, and so on.
Indices at all levels must be updated on insertion or deletion from the file.

Index Update: Deletion
If the deleted record was the only record in the file with its particular search-key value, the search-key is deleted from the index also.
Single-level index deletion:
  o Dense indices: deletion of a search-key is similar to file record deletion.
  o Sparse indices: if an entry for the search key exists in the index, it is deleted by replacing the entry in the index with the next search-key value in the file (in search-key order). If the next search-key value already has an index entry, the entry is deleted instead of being replaced.

Index Update: Insertion
Single-level index insertion:

  o Perform a lookup using the search-key value appearing in the record to be inserted.
  o Dense indices: if the search-key value does not appear in the index, insert it.
  o Sparse indices: if the index stores an entry for each block of the file, no change needs to be made to the index unless a new block is created. If a new block is created, the first search-key value appearing in the new block is inserted into the index.
Multilevel insertion (as well as deletion) algorithms are simple extensions of the single-level algorithms

Secondary Indices
Frequently, one wants to find all the records whose values in a certain field (which is not the search-key of the primary index) satisfy some condition.
  o Example 1: In the account relation stored sequentially by account number, we may want to find all accounts in a particular branch
  o Example 2: as above, but where we want to find all accounts with a specified balance or range of balances
We can have a secondary index with an index record for each search-key value

Secondary Indices Example
Secondary index on balance field of account
Index record points to a bucket that contains pointers to all the actual records with that particular search-key value.
Secondary indices have to be dense
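A small Python sketch of a dense secondary index with buckets of pointers. The account rows are made up for illustration, list positions stand in for record pointers, and `build_secondary_index` is a hypothetical helper, not part of any library.

```python
from collections import defaultdict

# Hypothetical account file, stored sequentially by account number.
accounts = [
    ("A-101", "Downtown", 500),
    ("A-110", "Downtown", 600),
    ("A-215", "Mianus", 700),
    ("A-217", "Brighton", 500),
    ("A-305", "Round Hill", 350),
]

def build_secondary_index(records, field):
    """Map each search-key value to a bucket of record positions (pointers)."""
    index = defaultdict(list)
    for position, record in enumerate(records):
        index[record[field]].append(position)
    return dict(sorted(index.items()))   # one entry per distinct value, kept sorted

balance_index = build_secondary_index(accounts, field=2)

# Equality lookup: follow every pointer in the bucket for balance 500.
with_500 = [accounts[p][0] for p in balance_index[500]]

# Range lookup: visit the entries whose key falls in [400, 650].
in_range = [accounts[p][0]
            for key, bucket in balance_index.items() if 400 <= key <= 650
            for p in bucket]
```

The pointers in one bucket can lead to records scattered across the file, which is why a scan in secondary-key order may touch a different block for every record.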

Primary and Secondary Indices
Indices offer substantial benefits when searching for records.
BUT: updating indices imposes overhead on database modification: when a file is modified, every index on the file must be updated.
Sequential scan using primary index is efficient, but a sequential scan using a secondary index is expensive
  o Each record access may fetch a new block from disk
  o Block fetch requires about 5 to 10 milliseconds, versus about 100 nanoseconds for memory access

B+-Tree Index Files
B+-tree indices are an alternative to indexed-sequential files.
Disadvantage of indexed-sequential files
  o performance degrades as file grows, since many overflow blocks get created.
  o Periodic reorganization of entire file is required.
Advantage of B+-tree index files:
  o automatically reorganizes itself with small, local changes, in the face of insertions and deletions.
  o Reorganization of entire file is not required to maintain performance.
(Minor) disadvantage of B+-trees:
  o extra insertion and deletion overhead, space overhead.
Advantages of B+-trees outweigh disadvantages
  o B+-trees are used extensively
A B+-tree is a rooted tree satisfying the following properties:
  o All paths from root to leaf are of the same length
  o Each node that is not a root or a leaf has between ⌈n/2⌉ and n children.
  o A leaf node has between ⌈(n−1)/2⌉ and n−1 values
  o Special cases:
      If the root is not a leaf, it has at least 2 children.
      If the root is a leaf (that is, there are no other nodes in the tree), it can have between 0 and (n−1) values.
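The occupancy bounds are easy to tabulate, taking the lower bounds as ceilings, ⌈n/2⌉ and ⌈(n−1)/2⌉. The function name `bplus_bounds` is just a label chosen for this sketch.

```python
from math import ceil

def bplus_bounds(n):
    """Occupancy limits for B+-tree nodes that hold up to n pointers."""
    return {
        # non-root, non-leaf nodes: number of children
        "internal_children": (ceil(n / 2), n),
        # leaf nodes: number of search-key values
        "leaf_values": (ceil((n - 1) / 2), n - 1),
    }

bounds_n5 = bplus_bounds(5)
bounds_n3 = bplus_bounds(3)
```

For n = 5 this gives 3 to 5 children for internal nodes and 2 to 4 values for leaves. For n = 3 it gives 2 to 3 children and 1 to 2 values.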

B+-Tree Node Structure
Typical node: P1, K1, P2, K2, ..., Pn−1, Kn−1, Pn
  o Ki are the search-key values
  o Pi are pointers to children (for non-leaf nodes) or pointers to records or buckets of records (for leaf nodes).
The search-keys in a node are ordered: K1 < K2 < K3 < ... < Kn−1

Leaf Nodes in B+-Trees
Properties of a leaf node
  o For i = 1, 2, ..., n−1, pointer Pi either points to a file record with search-key value Ki, or to a bucket of pointers to file records, each record having search-key value Ki. The bucket structure is needed only if the search-key does not form a primary key.
  o If Li, Lj are leaf nodes and i < j, Li's search-key values are less than Lj's search-key values
  o Pn points to the next leaf node in search-key order

Non-Leaf Nodes in B+-Trees
Non-leaf nodes form a multi-level sparse index on the leaf nodes. For a non-leaf node with m pointers:
  o All the search-keys in the subtree to which P1 points are less than K1
  o For 2 ≤ i ≤ m−1, all the search-keys in the subtree to which Pi points have values greater than or equal to Ki−1 and less than Ki
  o All the search-keys in the subtree to which Pm points have values greater than or equal to Km−1
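The ordering rules for non-leaf nodes are what make lookup a single root-to-leaf walk. Below is a minimal Python sketch, assuming a simplified in-memory `Node` class (hypothetical) in which leaf pointers are stored as plain record values and the next-leaf pointer is omitted. The branch names in the example tree are illustrative.

```python
from bisect import bisect_right

class Node:
    def __init__(self, keys, pointers, is_leaf):
        self.keys = keys          # K1 < K2 < ... in sorted order
        self.pointers = pointers  # children (non-leaf) or records (leaf)
        self.is_leaf = is_leaf

def find(root, key):
    """Walk from the root to a leaf, then look for the key in that leaf."""
    node = root
    while not node.is_leaf:
        # bisect_right counts the keys <= key. A count of 0 selects P1,
        # a count of i selects P(i+1), whose subtree holds keys >= Ki.
        node = node.pointers[bisect_right(node.keys, key)]
    for k, record in zip(node.keys, node.pointers):
        if k == key:
            return record
    return None

# Small example tree with n = 3 (made-up records).
leaf1 = Node(["Brighton", "Downtown"], ["rec-B", "rec-D"], is_leaf=True)
leaf2 = Node(["Mianus", "Perryridge"], ["rec-M", "rec-P"], is_leaf=True)
leaf3 = Node(["Redwood", "Round Hill"], ["rec-R", "rec-RH"], is_leaf=True)
root = Node(["Mianus", "Redwood"], [leaf1, leaf2, leaf3], is_leaf=False)

hit = find(root, "Perryridge")
miss = find(root, "Clearview")
```

Since all root-to-leaf paths have the same length, every lookup visits the same number of nodes.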

Example of a B+-tree
[Figure: B+-tree for account file (n = 3)]
[Figure: B+-tree for account file (n = 5)]
  o Leaf nodes must have between 2 and 4 values (⌈(n−1)/2⌉ and n−1, with n = 5).
  o Non-leaf nodes other than root must have between 3 and 5 children (⌈n/2⌉ and n, with n = 5).
  o Root must have at least 2 children.

B-Tree Index Files
Similar to B+-tree, but B-tree allows search-key values to appear only once; eliminates redundant storage of search keys.
Search keys in nonleaf nodes appear nowhere else in the B-tree; an additional pointer field for each search key in a nonleaf node must be included.
[Figure: generalized B-tree leaf node and nonleaf node]
Nonleaf node pointers Bi are the bucket or file record pointers.

B-Tree Index File Example
[Figure: B-tree (above) and B+-tree (below) on same data]
Advantages of B-Tree indices:
  o May use fewer tree nodes than a corresponding B+-Tree.
  o Sometimes possible to find search-key value before reaching leaf node.
Disadvantages of B-Tree indices:
  o Only a small fraction of all search-key values are found early
  o Non-leaf nodes are larger, so fan-out is reduced. Thus, B-Trees typically have greater depth than the corresponding B+-Tree
  o Insertion and deletion are more complicated than in B+-Trees
  o Implementation is harder than for B+-Trees.
Typically, advantages of B-Trees do not outweigh disadvantages.

Query Processing
Overview

  o Measures of Query Cost
  o Selection Operation
  o Sorting
  o Join Operation
  o Other Operations
  o Evaluation of Expressions

Basic Steps in Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation
Parsing and translation
  o Translate the query into its internal form. This is then translated into relational algebra.
  o Parser checks syntax, verifies relations
Evaluation
  o The query-execution engine takes a query-evaluation plan, executes that plan, and returns the answers to the query.

Basic Steps in Query Processing: Optimization

A relational algebra expression may have many equivalent expressions
  o E.g., σbalance<2500(Πbalance(account)) is equivalent to Πbalance(σbalance<2500(account))
Each relational algebra operation can be evaluated using one of several different algorithms
  o Correspondingly, a relational-algebra expression can be evaluated in many ways.
An annotated expression specifying a detailed evaluation strategy is called an evaluation plan.
  o E.g., can use an index on balance to find accounts with balance < 2500,
  o or can perform a complete relation scan and discard accounts with balance ≥ 2500
Query Optimization: amongst all equivalent evaluation plans, choose the one with lowest cost.
  o Cost is estimated using statistical information from the database catalog, e.g. number of tuples in each relation, size of tuples, etc.

Measures of Query Cost
Cost is generally measured as total elapsed time for answering a query
  o Many factors contribute to time cost: disk accesses, CPU, or even network communication
Typically disk access is the predominant cost, and is also relatively easy to estimate. Measured by taking into account
  o Number of seeks * average-seek-cost
  o Number of blocks read * average-block-read-cost
  o Number of blocks written * average-block-write-cost
      Cost to write a block is greater than cost to read a block: data is read back after being written to ensure that the write was successful

For simplicity we just use the number of block transfers from disk and the number of seeks as the cost measures
  o tt: time to transfer one block
  o ts: time for one seek
  o Cost for b block transfers plus S seeks: b * tt + S * ts
We ignore CPU costs for simplicity
  o Real systems do take CPU cost into account
We do not include the cost of writing output to disk in our cost formulae
Several algorithms can reduce disk I/O by using extra buffer space
  o Amount of real memory available to buffer depends on other concurrent queries and OS processes, known only during execution
  o We often use worst-case estimates, assuming only the minimum amount of memory needed for the operation is available
Required data may be buffer resident already, avoiding disk I/O
  o But hard to take into account for cost estimation

Selection Operation
File scan: search algorithms that locate and retrieve records that fulfill a selection condition.
Algorithm A1 (linear search). Scan each file block and test all records to see whether they satisfy the selection condition.
  o Cost estimate = br block transfers + 1 seek
      br denotes number of blocks containing records from relation r
  o If selection is on a key attribute, can stop on finding record
      cost = (br / 2) block transfers + 1 seek
  o Linear search can be applied regardless of
      selection condition, or
      ordering of records in the file, or
      availability of indices
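The cost measure b * tt + S * ts and the linear-search estimates translate directly into a short Python sketch. The timing values below (100 microseconds per block transfer, 4000 microseconds per seek) and the block count of 400 are assumptions chosen only to make the arithmetic concrete, and the function names are invented for the example.

```python
def io_cost(block_transfers, seeks, t_t, t_s):
    """Cost of b block transfers plus S seeks: b * tt + S * ts."""
    return block_transfers * t_t + seeks * t_s

def linear_search_cost(b_r, t_t, t_s, key_equality=False):
    """A1: scan b_r blocks after one seek; on average half of them
    if the selection is an equality on a key attribute."""
    transfers = b_r / 2 if key_equality else b_r
    return io_cost(transfers, 1, t_t, t_s)

# Assumed parameters, in microseconds.
T_T, T_S = 100, 4000
full_scan = linear_search_cost(400, T_T, T_S)                     # 400*100 + 4000
key_scan = linear_search_cost(400, T_T, T_S, key_equality=True)   # 200*100 + 4000
```

With these assumed numbers a full scan costs 44,000 microseconds and a key lookup about 24,000. The single seek is a small share of the total because a scan reads consecutive blocks.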

A2 (binary search). Applicable if selection is an equality comparison on the attribute on which the file is ordered.
  o Assume that the blocks of a relation are stored contiguously
  o Cost estimate (number of disk blocks to be scanned):
      cost of locating the first tuple by a binary search on the blocks: ⌈log2(br)⌉ * (tt + ts)
      If there are multiple records satisfying the selection, add the transfer cost of the number of blocks containing records that satisfy the selection condition

Selections Using Indices
Index scan: search algorithms that use an index
  o selection condition must be on search-key of index
  o in the formulas below, hi denotes the height of the index
A3 (primary index on candidate key, equality). Retrieve a single record that satisfies the corresponding equality condition
  o Cost = (hi + 1) * (tt + ts)
A4 (primary index on nonkey, equality). Retrieve multiple records.
  o Records will be on consecutive blocks
  o Let b = number of blocks containing matching records
  o Cost = hi * (tt + ts) + ts + tt * b
A5 (equality on search-key of secondary index).
  o Retrieve a single record if the search-key is a candidate key
      Cost = (hi + 1) * (tt + ts)
  o Retrieve multiple records if search-key is not a candidate key
      each of n matching records may be on a different block
      Cost = (hi + n) * (tt + ts)
      Can be very expensive!

Join Operation
Several different algorithms to implement joins
  o Nested-loop join

  o Block nested-loop join
  o Indexed nested-loop join
  o Merge-join
  o Hash-join
Choice based on cost estimate
Examples use the following information
  o Number of records of customer: 10,000; depositor: 5000
  o Number of blocks of customer: 400; depositor: 100

Merge-Join
1. Sort both relations on their join attribute (if not already sorted on the join attributes).
2. Merge the sorted relations to join them
   1. Join step is similar to the merge stage of the sort-merge algorithm.
   2. Main difference is handling of duplicate values in join attribute: every pair with same value on join attribute must be matched
   3. Detailed algorithm in book
Can be used only for equi-joins and natural joins
Each block needs to be read only once (assuming all tuples for any given value of the join attributes fit in memory)
Thus the cost of merge join is: br + bs block transfers + ⌈br / bb⌉ + ⌈bs / bb⌉ seeks + the cost of sorting if relations are unsorted, where bb is the number of buffer blocks allocated to each relation.
Hybrid merge-join: if one relation is sorted, and the other has a secondary B+-tree index on the join attribute

31 Merge the sorted relation with the leaf entries of the B+-tree.
Sort the result on the addresses of the unsorted relation's tuples.
Scan the unsorted relation in physical address order and merge with the previous result, to replace addresses by the actual tuples. A sequential scan is more efficient than random lookup.

Hash-Join
Applicable for equi-joins and natural joins.
A hash function h is used to partition tuples of both relations.
h maps JoinAttrs values to {0, 1, ..., n}, where JoinAttrs denotes the common attributes of r and s used in the natural join.
r0, r1, ..., rn denote partitions of r tuples. Each tuple tr in r is put in partition ri, where i = h(tr[JoinAttrs]).
s0, s1, ..., sn denote partitions of s tuples. Each tuple ts in s is put in partition si, where i = h(ts[JoinAttrs]).
Note: In the book, ri is denoted as Hri, si is denoted as Hsi, and n is denoted as nh.
r tuples in ri need only be compared with s tuples in si. They need not be compared with s tuples in any other partition, since an r tuple and an s tuple that satisfy the join condition will have the same value for the join attributes.

32 If that value is hashed to some value i, the r tuple has to be in ri and the s tuple in si.

Hash-Join Algorithm
The hash-join of r and s is computed as follows.
1. Partition the relation s using hashing function h. When partitioning a relation, one block of memory is reserved as the output buffer for each partition.
2. Partition r similarly.
3. For each i:
(a) Load si into memory and build an in-memory hash index on it using the join attribute. This hash index uses a different hash function than the earlier one, h.
(b) Read the tuples in ri from the disk one by one. For each tuple tr, locate each matching tuple ts in si using the in-memory hash index. Output the concatenation of their attributes.
Relation s is called the build input and r is called the probe input.
The value n and the hash function h are chosen such that each si fits in memory. Typically n is chosen as ceil(bs / M) * f, where bs is the number of blocks of s, M is the number of memory blocks, and f is a "fudge factor", typically around 1.2.
The probe relation partitions ri need not fit in memory.
Recursive partitioning is required if the number of partitions n is greater than the number of pages M of memory:
Instead of partitioning n ways, use M - 1 partitions for s.
Further partition the M - 1 partitions using a different hash function.
Use the same partitioning method on r.
Rarely required: e.g., recursive partitioning is not needed for relations of 1GB or less with a memory size of 2MB and a block size of 4KB.
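The partition, build, and probe steps above can be sketched in Python. This is an in-memory illustration, not the disk-based algorithm itself: relations are lists of tuples, the function names are hypothetical, the partitioning function h is assumed to be a CRC32 of the key's text form, and a Python dict (which uses its own, different hashing) stands in for the in-memory hash index. The fudge factor is passed as a percentage to keep the arithmetic in integers.

```python
import zlib
from collections import defaultdict


def h(value, n):
    """Partitioning hash (an assumption): maps a key to {0, 1, ..., n}."""
    return zlib.crc32(repr(value).encode("utf-8")) % (n + 1)


def num_partitions(b_s, m, fudge_percent=20):
    """n = ceil(b_s / M) * f, rounded up, with f = 1 + fudge_percent / 100."""
    base = -(-b_s // m)  # integer ceiling of b_s / M
    return -((-base * (100 + fudge_percent)) // 100)


def partition(relation, key, n):
    """Split a relation into partitions 0..n using h on the join attribute."""
    parts = [[] for _ in range(n + 1)]
    for t in relation:
        parts[h(key(t), n)].append(t)
    return parts


def hash_join(r, s, key_r, key_s, n):
    """Equi-join with s as build input and r as probe input."""
    s_parts = partition(s, key_s, n)  # step 1
    r_parts = partition(r, key_r, n)  # step 2
    out = []
    for r_i, s_i in zip(r_parts, s_parts):  # step 3
        # (a) Build an in-memory index on s_i.
        index = defaultdict(list)
        for ts in s_i:
            index[key_s(ts)].append(ts)
        # (b) Probe with each tuple of r_i; only s_i needs to be checked.
        for tr in r_i:
            for ts in index.get(key_r(tr), []):
                out.append((tr, ts))
    return out


if __name__ == "__main__":
    customer = [("Johnson", "Palo Alto"), ("Hayes", "Harrison"), ("Smith", "Rye")]
    depositor = [("Johnson", "A-101"), ("Hayes", "A-102"), ("Johnson", "A-201")]
    n = num_partitions(b_s=100, m=20)  # m = 20 memory blocks is an assumption
    for pair in sorted(hash_join(customer, depositor,
                                 lambda t: t[0], lambda t: t[0], n)):
        print(pair)
```

Because equal join values always hash to the same partition index, the loop never compares tuples across different partitions, which is where the saving over a nested-loop join comes from.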

33 Objective Questions
1. The disk surface is logically divided into sectors, which are subdivided into tracks. a. True b. False
2. SCSI is: a. System Computer Small Interconnect b. System Connection Small Interface c. Small Computer System Interconnect d. Small Connection System Interface
3. The time from when a read or write request is issued to when data transfer begins is: a. Seek time b. Access time c. Transfer rate d. None of the above
4. Pick out the true statement regarding "blocks": a. Data is transferred between disk and main memory in units called blocks b. A block is a contiguous sequence of bytes from a single track of one platter c. Block sizes range from 512 bytes to several thousand d. All of the above e. None of the above
5. A commonly used disk-arm scheduling algorithm is: a. Relavator Algorithm b. Elevator Algorithm c. None of the above
6. RAID is: a. Removable Array of Inexpensive Disks b. Reliable Array of Inexpensive Disks c. Rewritable Array of Inexpensive Disks d. Redundant Array of Inexpensive Disks
7. Which of the following is correct in the case of the buffer manager? a. The supersystem responsible for the allocation of buffer space b. Handles some of the requests for blocks of the database

34 c. If the block is already in main memory, the data in main memory is given to the requester d. None of the above
8. Which symbol is used as the end of record? a. % b. # c. e. }
9. Which of the following contains the first records of a chain? a. Anchor block b. Overflow block c. Reserved space d. Pointers
10. In which type of file organization can records be placed anywhere in the file where there is space for the record? a. sequential file organization b. hashing file organization c. heap file organization d. clustering file organization
11. Indices whose search key specifies an order different from the sequential order of the file are called: a. non clustering index b. primary index c. clustering index
12. In which indices does an index record appear for every search-key value in the file? a. secondary index b. dense index c. sparse index d. multi level index
13. How many children must each nonleaf node in the tree have? a. between n and n +n / 2 children b. between 2+ n / 2 and n children c. between 2 n* n / 2 and n / 2 children d. between n / 2 and n children

35 14. In processing a query, we traverse a path from the root to a leaf node. If there are K search-key values in the file, this path is no longer than: a. log (n) K b. log (n/2) K c. log (2n) K d. log (n/2) + K e. None of the above
15. The number of pointers in a node is called the fan-in of the node. a. True b. False
16. In B+ trees file organization, if m nodes are involved in redistribution, each node can be guaranteed to contain: a. at least [ ( m - 1 ) n / m ] entries b. more than [ ( m - 1 ) n / m ] entries c. less than [ ( m - 1 ) n / m ] entries d. None of the above
17. Some hashing techniques allow the hash function to be modified dynamically to accommodate the growth or shrinking of the database. These are called: a. static hash functions b. dynamic hash functions c. Both a and b
18. Which of the following involves computing the address of a data item by computing a function on the search-key value? a. Secondary indices b. B-tree c. Hashing d. B+ -tree
19. In B -tree file organization, the leaf nodes of the tree store records instead of storing pointers to records. a. True b. False
20. In a hash function, if K1 and K2 are search-key values and K1 < K2, then: a. h ( K1 ) < h ( K2 ) b. h ( K1 ) > h ( K2 ) c. h ( K1 ) = h ( K2 ) d. h ( K1 ) / h ( K2 )

36 PART-A
1. What is a fixed-length record?
2. What is a variable-length record?
3. What are the methods used to implement variable-length records?
4. What are the various types of file organization?
5. How does clustering file organization differ from other file organizations?
6. What is an index? What are its uses?
7. List the types of indices.
8. What are sparse and dense indices?
9. What are static and dynamic hashing?
10. What are the qualities of a hash function?
11. What is the difference between a B-tree and a B+-tree?
12. What is query processing?
13. What is the use of the parser and translator?
14. What is a query evaluation plan?
15. What are the factors to be considered in estimating the cost of a query evaluation plan?
16. What are the methods used for evaluating an expression?

Two marks Questions and answers
1. What is an index?
An index is a structure that helps to locate desired records of a relation quickly, without examining all records.
2. Define query optimization.
Query optimization refers to the process of finding the lowest-cost method of evaluating a given query.
3. What are called jukebox systems?
Jukebox systems contain a few drives and numerous disks that can be loaded into one of the drives automatically.
4. What are the types of storage devices?
Primary storage
Secondary storage

37 Tertiary storage
Volatile storage
Nonvolatile storage
5. What is called remapping of bad sectors?
If the controller detects that a sector is damaged when the disk is initially formatted, or when an attempt is made to write the sector, it can logically map the sector to a different physical location.
6. Define access time.
Access time is the time from when a read or write request is issued to when data transfer begins.
7. Define seek time.
The time for repositioning the arm is called the seek time, and it increases with the distance that the arm must move.
8. Define average seek time.
The average seek time is the average of the seek times, measured over a sequence of random requests.
9. Define rotational latency time.
The time spent waiting for the sector to be accessed to appear under the head is called the rotational latency time.
10. Define average latency time.
The average latency time of the disk is one-half the time for a full rotation of the disk.
11. What is meant by data-transfer rate?
The data-transfer rate is the rate at which data can be retrieved from or stored to the disk.
12. What is meant by mean time to failure?
The mean time to failure is the amount of time that, on average, the system could run continuously without failure.
13. What is a block and a block number?
A block is a contiguous sequence of sectors from a single track of one platter. Each request specifies the address on the disk to be referenced. That address is in the form of a block number.

38 14. What are called journaling file systems?
File systems that support log disks are called journaling file systems.
15. What is the use of RAID?
A variety of disk-organization techniques, collectively called redundant arrays of independent disks (RAID), are used to improve performance and reliability.
16. What is called mirroring?
The simplest approach to introducing redundancy is to duplicate every disk. This technique is called mirroring or shadowing.
17. What is called mean time to repair?
The mean time to repair is the time it takes to replace a failed disk and to restore the data on it.
18. What is called bit-level striping?
Data striping consists of splitting the bits of each byte across multiple disks. This is called bit-level striping.
19. What is called block-level striping?
Block-level striping stripes blocks across multiple disks. It treats the array of disks as a single large disk, and gives blocks logical numbers.
20. What are the two main goals of parallelism?
Load-balance multiple small accesses, so that the throughput of such accesses increases.
Parallelize large accesses, so that the response time of large accesses is reduced.
21. What are the factors to be taken into account when choosing a RAID level?
Monetary cost of extra disk storage requirements.
Performance requirements in terms of number of I/O operations.
Performance when a disk has failed.
Performance during rebuild.
22. What is meant by software and hardware RAID systems?
RAID can be implemented with no change at the hardware level, using only software modification. Such RAID implementations are called software RAID systems, and systems with special hardware support are called hardware RAID systems.

39 23. Define hot swapping.
Hot swapping permits the removal of faulty disks and their replacement by new ones without turning the power off. Hot swapping reduces the mean time to repair.
24. What are the ways in which variable-length records arise in database systems?
Storage of multiple record types in a file.
Record types that allow variable lengths for one or more fields.
Record types that allow repeating fields.
25. What is the use of a slotted-page structure and what is the information present in the header?
The slotted-page structure is used for organizing records within a single block. The header contains the following information:
The number of record entries in the header.
The end of free space.
An array whose entries contain the location and size of each record.
26. What are the two types of blocks in the fixed-length representation? Define them.
Anchor block: contains the first record of a chain.
Overflow block: contains the records other than those that are the first record of a chain.
27. What is known as heap file organization?
In the heap file organization, any record can be placed anywhere in the file where there is space for the record. There is no ordering of records. There is a single file for each relation.
28. What is known as sequential file organization?
In the sequential file organization, the records are stored in sequential order, according to the value of a search key of each record.
29. What is hashing file organization?
In the hashing file organization, a hash function is computed on some attribute of each record. The result of the hash function specifies in which block of the file the record should be placed.
30. What is known as clustering file organization?

40 In the clustering file organization, records of several different relations are stored in the same file.
31. What are the types of indices?
Ordered indices
Hash indices
32. What are the factors on which ordered indexing and hashing techniques are evaluated?
Access types
Access time
Insertion time
Deletion time
Space overhead
33. What is known as a search key?
An attribute or set of attributes used to look up records in a file is called a search key.
34. What is a primary index?
A primary index is an index whose search key also defines the sequential order of the file.
35. What are called index-sequential files?
Files that are ordered sequentially with a primary index on the search key are called index-sequential files.
36. What are the two types of ordered indices?
Dense index
Sparse index
37. What are called multilevel indices?
Indices with two or more levels are called multilevel indices.
38. What is a B-tree?
A B-tree eliminates the redundant storage of search-key values. It allows search-key values to appear only once.
39. What is a B+-tree index?
A B+-tree index takes the form of a balanced tree in which every path from the root of the tree to a leaf of the tree is of the same length.

41 40. What is a hash index?
A hash index organizes the search keys, with their associated pointers, into a hash file structure.
41. What is called query processing?
Query processing refers to the range of activities involved in extracting data from a database.
42. What are the steps involved in query processing?
The basic steps are:
Parsing and translation
Optimization
Evaluation
43. What is called an evaluation primitive?
A relational algebra operation annotated with instructions on how to evaluate it is called an evaluation primitive.
44. What is called a query evaluation plan?
A sequence of primitive operations that can be used to evaluate a query is a query evaluation plan or a query execution plan.
45. What is called a query execution engine?
The query execution engine takes a query evaluation plan, executes that plan, and returns the answers to the query.
46. What are called index scans?
Search algorithms that use an index are referred to as index scans.
47. What is called external sorting?
Sorting of relations that do not fit into memory is called external sorting.
48. What is called recursive partitioning?
The system repeats the splitting of the input until each partition of the build input fits in memory. Such partitioning is called recursive partitioning.
49. What is called an N-way merge?
The merge operation is a generalization of the two-way merge used by the standard in-memory sort-merge algorithm. It merges N runs, so it is called an N-way merge.
50. What is known as the fudge factor?

42 The number of partitions is increased by a small value called the fudge factor, which is usually about 20 percent of the number of hash partitions computed.

PART-B
1. Explain the implementation of fixed-length and variable-length records.
2. Explain sequential, clustering, and heap file organization.
3. Explain primary index, secondary index, and multilevel indices.
4. Write short notes on B-trees and B+-trees.
5. Explain the various algorithms used to implement the selection and join operations.
6. Dense indices are faster in general, but sparse indices require less space and impose less maintenance for insertions and deletions. Why?
7. State the difference between B+-trees and B-trees.
8. Draw the structure of a B+-tree and explain.
9. Compare and describe indexing and hashing.
10. What is an ordered index? Explain the two types of ordered indices.
11. What is meant by the term hash function?
12. State the difference between dynamic and static hashing. How do these work?
13. What are the characteristics of a magnetic disk?

43 14. Why are RAIDs used? Why are they called so?
15. What is a buffer? Why are buffers used? What is the role of a buffer manager in buffer management?
16. State the buffer replacement policies.
17. What is a file?
18. What are clustering file organisation and sequential file organisation?

Book References:
1. Abraham Silberschatz, Henry F. Korth and S. Sudarshan, Database System Concepts, Fourth Edition, McGraw-Hill.
2. Ramez Elmasri and Shamkant B. Navathe, Fundamentals of Database Systems, Third Edition, Pearson Education.
3. Raghu Ramakrishnan, Database Management Systems, Tata McGraw-Hill Publishing Company.
Web Resources: /node1.html /node1.html

44 100.html

Storage and File Structure. Classification of Physical Storage Media. Physical Storage Media. Physical Storage Media

Storage and File Structure. Classification of Physical Storage Media. Physical Storage Media. Physical Storage Media Storage and File Structure Classification of Physical Storage Media Overview of Physical Storage Media Magnetic Disks RAID Tertiary Storage Storage Access File Organization Organization of Records in Files

More information

Ch 11: Storage and File Structure

Ch 11: Storage and File Structure Ch 11: Storage and File Structure Overview of Physical Storage Media Magnetic Disks RAID Tertiary Storage Storage Access File Organization Organization of Records in Files Data-Dictionary Dictionary Storage

More information

Storage and File Structure

Storage and File Structure Storage and File Structure 1 Roadmap of This Lecture Overview of Physical Storage Media Magnetic Disks RAID Tertiary Storage Storage Access File Organization Organization of Records in Files Data-Dictionary

More information

Physical Storage Media

Physical Storage Media Physical Storage Media These slides are a modified version of the slides of the book Database System Concepts, 5th Ed., McGraw-Hill, by Silberschatz, Korth and Sudarshan. Original slides are available

More information

Lecture 15 - Chapter 10 Storage and File Structure

Lecture 15 - Chapter 10 Storage and File Structure CMSC 461, Database Management Systems Spring 2018 Lecture 15 - Chapter 10 Storage and File Structure These slides are based on Database System Concepts 6th edition book (whereas some quotes and figures

More information

Chapter 10: Storage and File Structure

Chapter 10: Storage and File Structure Chapter 10: Storage and File Structure Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 10: Storage and File Structure Overview of Physical Storage Media Magnetic

More information

Administração e Optimização Bases de Dados DEI-IST 2010/2011

Administração e Optimização Bases de Dados DEI-IST 2010/2011 Administração e Optimização Bases de Dados DEI-IST 2010/2011 Overall DBMS Structure Storage and File Structure Overview of Physical Storage Media Magnetic Disks Tertiary Storage RAID Storage Access File

More information

Chapter 13: Query Processing

Chapter 13: Query Processing Chapter 13: Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 13.1 Basic Steps in Query Processing 1. Parsing

More information

Chapter 12: Query Processing. Chapter 12: Query Processing

Chapter 12: Query Processing. Chapter 12: Query Processing Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 12: Query Processing Overview Measures of Query Cost Selection Operation Sorting Join

More information

Chapter 12: Query Processing

Chapter 12: Query Processing Chapter 12: Query Processing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Overview Chapter 12: Query Processing Measures of Query Cost Selection Operation Sorting Join

More information

! A relational algebra expression may have many equivalent. ! Cost is generally measured as total elapsed time for

! A relational algebra expression may have many equivalent. ! Cost is generally measured as total elapsed time for Chapter 13: Query Processing Basic Steps in Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 1. Parsing and

More information

Chapter 13: Query Processing Basic Steps in Query Processing

Chapter 13: Query Processing Basic Steps in Query Processing Chapter 13: Query Processing Basic Steps in Query Processing! Overview! Measures of Query Cost! Selection Operation! Sorting! Join Operation! Other Operations! Evaluation of Expressions 1. Parsing and

More information

Chapter 11: Indexing and Hashing

Chapter 11: Indexing and Hashing Chapter 11: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQL

More information

Query Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016

Query Processing. Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016 Query Processing Debapriyo Majumdar Indian Sta4s4cal Ins4tute Kolkata DBMS PGDBA 2016 Slides re-used with some modification from www.db-book.com Reference: Database System Concepts, 6 th Ed. By Silberschatz,

More information

Chapter 1 Disk Storage, Basic File Structures, and Hashing.

Chapter 1 Disk Storage, Basic File Structures, and Hashing. Chapter 1 Disk Storage, Basic File Structures, and Hashing. Adapted from the slides of Fundamentals of Database Systems (Elmasri et al., 2003) 1 Chapter Outline Disk Storage Devices Files of Records Operations

More information

QUESTION BANK. SUBJECT CODE / Name: CS2255 DATABASE MANAGEMENT SYSTEM UNIT V. PART -A (2 Marks)

QUESTION BANK. SUBJECT CODE / Name: CS2255 DATABASE MANAGEMENT SYSTEM UNIT V. PART -A (2 Marks) QUESTION BANK DEPARTMENT: CSE SEMESTER: IV SUBJECT CODE / Name: CS2255 DATABASE MANAGEMENT SYSTEM UNIT V PART -A (2 Marks) 1. What is blind write? If a transaction writes a data item without reading the

More information

Chapter 12: Indexing and Hashing

Chapter 12: Indexing and Hashing Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B+-Tree Index Files B-Tree Index Files Static Hashing Dynamic Hashing Comparison of Ordered Indexing and Hashing Index Definition in SQL

More information

Chapter 12: Indexing and Hashing. Basic Concepts

Chapter 12: Indexing and Hashing. Basic Concepts Chapter 12: Indexing and Hashing! Basic Concepts! Ordered Indices! B+-Tree Index Files! B-Tree Index Files! Static Hashing! Dynamic Hashing! Comparison of Ordered Indexing and Hashing! Index Definition

More information

Chapter 13 Disk Storage, Basic File Structures, and Hashing.

Chapter 13 Disk Storage, Basic File Structures, and Hashing. Chapter 13 Disk Storage, Basic File Structures, and Hashing. Copyright 2004 Pearson Education, Inc. Chapter Outline Disk Storage Devices Files of Records Operations on Files Unordered Files Ordered Files

More information

Database System Concepts

Database System Concepts Chapter 13: Query Processing s Departamento de Engenharia Informática Instituto Superior Técnico 1 st Semester 2008/2009 Slides (fortemente) baseados nos slides oficiais do livro c Silberschatz, Korth

More information

Advanced Database Systems

Advanced Database Systems Lecture IV Query Processing Kyumars Sheykh Esmaili Basic Steps in Query Processing 2 Query Optimization Many equivalent execution plans Choosing the best one Based on Heuristics, Cost Will be discussed

More information

Chapter 12: Query Processing

Chapter 12: Query Processing Chapter 12: Query Processing Overview Catalog Information for Cost Estimation $ Measures of Query Cost Selection Operation Sorting Join Operation Other Operations Evaluation of Expressions Transformation

More information

CPSC 421 Database Management Systems. Lecture 11: Storage and File Organization

CPSC 421 Database Management Systems. Lecture 11: Storage and File Organization CPSC 421 Database Management Systems Lecture 11: Storage and File Organization * Some material adapted from R. Ramakrishnan, L. Delcambre, and B. Ludaescher Today s Agenda Start on Database Internals:

More information

File Structures and Indexing

File Structures and Indexing File Structures and Indexing CPS352: Database Systems Simon Miner Gordon College Last Revised: 10/11/12 Agenda Check-in Database File Structures Indexing Database Design Tips Check-in Database File Structures

More information

CMSC 424 Database design Lecture 12 Storage. Mihai Pop

CMSC 424 Database design Lecture 12 Storage. Mihai Pop CMSC 424 Database design Lecture 12 Storage Mihai Pop Administrative Office hours tomorrow @ 10 Midterms are in solutions for part C will be posted later this week Project partners I have an odd number

More information

Classifying Physical Storage Media. Chapter 11: Storage and File Structure. Storage Hierarchy (Cont.) Storage Hierarchy. Magnetic Hard Disk Mechanism

Classifying Physical Storage Media. Chapter 11: Storage and File Structure. Storage Hierarchy (Cont.) Storage Hierarchy. Magnetic Hard Disk Mechanism Chapter 11: Storage and File Structure Overview of Storage Media Magnetic Disks Characteristics RAID Database Buffers Structure of Records Organizing Records within Files Data-Dictionary Storage Classifying

More information

Classifying Physical Storage Media. Chapter 11: Storage and File Structure. Storage Hierarchy. Storage Hierarchy (Cont.) Speed

Classifying Physical Storage Media. Chapter 11: Storage and File Structure. Storage Hierarchy. Storage Hierarchy (Cont.) Speed Chapter 11: Storage and File Structure Overview of Storage Media Magnetic Disks Characteristics RAID Database Buffers Structure of Records Organizing Records within Files Data-Dictionary Storage Classifying

More information

Chapter 11: Storage and File Structure. Silberschatz, Korth and Sudarshan Updated by Bird and Tanin

Chapter 11: Storage and File Structure. Silberschatz, Korth and Sudarshan Updated by Bird and Tanin Chapter 11: Storage and File Structure Storage Hierarchy 11.2 Storage Hierarchy (Cont.) primary storage: Fastest media but volatile (cache, main memory). secondary storage: next level in hierarchy, non-volatile,

More information

Chapter 11: Indexing and Hashing

Chapter 11: Indexing and Hashing Chapter 11: Indexing and Hashing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 11: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree

More information

Chapter 11: Indexing and Hashing

Chapter 11: Indexing and Hashing Chapter 11: Indexing and Hashing Database System Concepts, 6 th Ed. See www.db-book.com for conditions on re-use Chapter 11: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree

More information

Chapter 12: Indexing and Hashing

Chapter 12: Indexing and Hashing Chapter 12: Indexing and Hashing Database System Concepts, 5th Ed. See www.db-book.com for conditions on re-use Chapter 12: Indexing and Hashing Basic Concepts Ordered Indices B + -Tree Index Files B-Tree

More information

Chapter 11: Storage and File Structure

Chapter 11: Storage and File Structure Chapter 11: Storage and File Structure Click to add text Database System Concepts, 5th Ed. See www.db-book.com for conditions on re-use Chapter 11: Storage and File Structure Overview of Physical Storage

More information

CMSC 424 Database design Lecture 13 Storage: Files. Mihai Pop

CMSC 424 Database design Lecture 13 Storage: Files. Mihai Pop CMSC 424 Database design Lecture 13 Storage: Files Mihai Pop Recap Databases are stored on disk cheaper than memory non-volatile (survive power loss) large capacity Operating systems are designed for general

More information

Database Technology. Topic 7: Data Structures for Databases. Olaf Hartig.

Database Technology. Topic 7: Data Structures for Databases. Olaf Hartig. Topic 7: Data Structures for Databases Olaf Hartig olaf.hartig@liu.se Database System 2 Storage Hierarchy Traditional Storage Hierarchy CPU Cache memory Main memory Primary storage Disk Tape Secondary

More information

Che-Wei Chang Department of Computer Science and Information Engineering, Chang Gung University

Che-Wei Chang Department of Computer Science and Information Engineering, Chang Gung University Che-Wei Chang chewei@mail.cgu.edu.tw Department of Computer Science and Information Engineering, Chang Gung University l Chapter 10: File System l Chapter 11: Implementing File-Systems l Chapter 12: Mass-Storage

More information

CMSC424: Database Design. Instructor: Amol Deshpande

CMSC424: Database Design. Instructor: Amol Deshpande CMSC424: Database Design Instructor: Amol Deshpande amol@cs.umd.edu Databases Data Models Conceptual representa1on of the data Data Retrieval How to ask ques1ons of the database How to answer those ques1ons

More information

Chapter 17. Disk Storage, Basic File Structures, and Hashing. Records. Blocking

Chapter 17. Disk Storage, Basic File Structures, and Hashing. Records. Blocking Chapter 17 Disk Storage, Basic File Structures, and Hashing Records Fixed and variable length records Records contain fields which have values of a particular type (e.g., amount, date, time, age) Fields

More information

V. Mass Storage Systems

V. Mass Storage Systems TDIU25: Operating Systems V. Mass Storage Systems SGG9: chapter 12 o Mass storage: Hard disks, structure, scheduling, RAID Copyright Notice: The lecture notes are mainly based on modifications of the slides

More information

Chapter 10: Mass-Storage Systems

Chapter 10: Mass-Storage Systems Chapter 10: Mass-Storage Systems Silberschatz, Galvin and Gagne 2013 Chapter 10: Mass-Storage Systems Overview of Mass Storage Structure Disk Structure Disk Attachment Disk Scheduling Disk Management Swap-Space

More information

CMSC424: Database Design. Instructor: Amol Deshpande
