QUIZ: Is either set of attributes a superkey? A candidate key?
Source: http://courses.cs.washington.edu/courses/cse444/06wi/lectures/lecture09.pdf
QUIZ: MVD. What MVDs can you spot in this table?
Source: https://en.wikipedia.org/wiki/multivalued_dependency
Chapter 10: Storage and File Structure
Database System Concepts, 6th Ed. See www.db-book.com for conditions on re-use.
Chapter 10: Storage and File Structure
1. Overview of Physical Storage Media
2. Magnetic Disks
3. RAID
4. Tertiary Storage
5. File Organization
6. Organization of Records in Files
7. Data Dictionary/Catalog Storage
8. DB Buffer
10.5 File Organization
A file is partitioned into fixed-length storage units called blocks, which are the units of both storage allocation and data transfer to and from secondary storage (HDD).
Most DBMSs use block sizes of 4 to 8 kilobytes by default. Many DBMSs allow the block size to be specified when a DB instance is created.
10.5 File Organization
The database is stored as a collection of files. Each file is a sequence of records. A record is a sequence of fields.
Two possible approaches:
- Record size is fixed
- Record size is variable
10.5.1 Fixed-Length Records
Even if the records are not really of fixed length (e.g. they contain varchar fields), we assume that each field occupies its maximum size.
Record access is simple: record i starts at byte n × (i − 1), where n is the size of each record (e.g. 53 bytes).
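The offset rule can be checked with a few lines of Python. This is only a sketch: the 53-byte record size is the example value from the slide, and the function name `record_offset` is made up for illustration.

```python
RECORD_SIZE = 53  # n, the example record size in bytes


def record_offset(i: int, n: int = RECORD_SIZE) -> int:
    """Return the byte offset at which record i (numbered from 1) starts."""
    if i < 1:
        raise ValueError("records are numbered from 1")
    return n * (i - 1)


print(record_offset(1))  # 0
print(record_offset(3))  # 106
```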
10.5.1 Fixed-Length Records
Problem 1: Unless the block size is a multiple of n, the last record in a block crosses the block boundary, so accessing it requires two block accesses!
Modification: leave the leftover space at the end of each block unused, so that no record spans two blocks.
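A rough sketch of the modified layout, in which no record crosses a block boundary. The 4 KB block size is an assumption (the slides only say 4 to 8 KB is typical), and the helper names are hypothetical.

```python
BLOCK_SIZE = 4096   # assumed block size in bytes
RECORD_SIZE = 53    # n, from the fixed-length example


def records_per_block(block_size: int = BLOCK_SIZE, n: int = RECORD_SIZE) -> int:
    """Number of whole records that fit in one block."""
    return block_size // n


def locate(i: int, block_size: int = BLOCK_SIZE, n: int = RECORD_SIZE):
    """Map record i (numbered from 1) to (block number, byte offset in block)."""
    per_block = records_per_block(block_size, n)
    block, slot = divmod(i - 1, per_block)
    return block, slot * n


wasted = BLOCK_SIZE - records_per_block() * RECORD_SIZE
print(records_per_block(), wasted)  # 77 records per block, 15 bytes unused
print(locate(78))                   # (1, 0): first record of the second block
```

The price of single-block access is a few unused bytes per block (15 out of 4096 under these assumptions).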
10.5.1 Fixed-Length Records
Problem 2: What to do when a record i is deleted? Possible solutions (here n is the number of records in the file):
- shift records i + 1, ..., n to positions i, ..., n − 1
- move record n to position i
- do not move records, but link all free records on a free list
Deleting record 3 and shifting
Deleting record 3 and moving last record
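The first two deletion strategies can be sketched on an in-memory list standing in for the file. The function names are illustrative, and range checking of i is omitted.

```python
def delete_by_shifting(records: list, i: int) -> None:
    """Delete record i (numbered from 1) by shifting all later records up one slot."""
    for k in range(i - 1, len(records) - 1):
        records[k] = records[k + 1]
    records.pop()


def delete_by_moving_last(records: list, i: int) -> None:
    """Delete record i (numbered from 1) by overwriting it with the last record."""
    last = records.pop()
    if i - 1 < len(records):      # nothing to move if record i was the last one
        records[i - 1] = last


a = ["r1", "r2", "r3", "r4", "r5"]
delete_by_shifting(a, 3)
print(a)  # ['r1', 'r2', 'r4', 'r5']

b = ["r1", "r2", "r3", "r4", "r5"]
delete_by_moving_last(b, 3)
print(b)  # ['r1', 'r2', 'r5', 'r4']
```

Shifting preserves the record order but touches every later record. Moving the last record touches only one record but changes the order.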
Free Lists (Linked)
Store the address of the first deleted record in the file header. Each deleted record in turn stores the address of the next deleted record.
We can think of these stored addresses as pointers, since they point to the location of a record.
For efficiency, reuse the space for normal attributes in the free records to store the pointers. (No pointers are stored in in-use records!)
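A minimal sketch of a free list, assuming slot indices play the role of addresses and a Python list stands in for the file. The class name and the `("FREE", next)` marker are inventions for this example.

```python
class FixedLengthFile:
    """Toy fixed-length record file with a free list rooted in the header."""

    def __init__(self):
        self.first_free = None   # file header: slot of the first deleted record
        self.slots = []          # each slot holds a record or ("FREE", next_free)

    def insert(self, record) -> int:
        if self.first_free is None:          # no free slot: grow the file
            self.slots.append(record)
            return len(self.slots) - 1
        slot = self.first_free               # reuse the first free slot
        _, next_free = self.slots[slot]
        self.first_free = next_free
        self.slots[slot] = record
        return slot

    def delete(self, slot: int) -> None:
        # The freed slot's own space stores the pointer to the next free slot.
        self.slots[slot] = ("FREE", self.first_free)
        self.first_free = slot


f = FixedLengthFile()
for name in "abcde":
    f.insert(name)
f.delete(1)
f.delete(3)
print(f.insert("x"))  # 3: the most recently freed slot is reused first
print(f.insert("y"))  # 1
print(f.insert("z"))  # 5: free list is empty, so the file grows
```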
QUIZ
10.5.2 Variable-Length Records
Variable-length records arise in several ways:
- Storage of multiple record types in a file, e.g. when the records represent tuples from different tables
- Record types that allow variable lengths for one or more fields, such as strings (varchar)
- Record types that allow repeating fields (used in some older data models)
Internal representation of variable-length records
Attributes are stored in order, but:
- Variable-length attributes are represented by a fixed-size pair (offset, length), with the actual data stored after all fixed-length attributes
- Null values are represented by a null-value bitmap
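One possible byte layout following this scheme, sketched with Python's `struct` module. The schema (three varchar fields plus a fixed-length salary), the field widths, and the bit assignment in the bitmap are all assumptions made for the example.

```python
import struct

# Assumed schema: three varchar fields and one fixed-length salary (8-byte float).
VAR_FIELDS = ("id", "name", "dept_name")
# 3 x (offset, length) as 2-byte ints, then salary, then a 1-byte null bitmap.
HEADER = struct.Struct("<HHHHHHdB")


def encode(row: dict) -> bytes:
    pairs, data, bitmap = [], b"", 0
    offset = HEADER.size                  # variable-length data starts after the header
    for bit, field in enumerate(VAR_FIELDS):
        value = row.get(field)
        if value is None:
            bitmap |= 1 << bit            # mark the attribute as null
            pairs += [0, 0]
            continue
        raw = value.encode("utf-8")
        pairs += [offset, len(raw)]
        data += raw
        offset += len(raw)
    salary = row.get("salary")
    if salary is None:
        bitmap |= 1 << len(VAR_FIELDS)
        salary = 0.0
    return HEADER.pack(*pairs, salary, bitmap) + data


def decode(buf: bytes) -> dict:
    *pairs, salary, bitmap = HEADER.unpack_from(buf)
    row = {}
    for bit, field in enumerate(VAR_FIELDS):
        off, length = pairs[2 * bit], pairs[2 * bit + 1]
        is_null = bitmap & (1 << bit)
        row[field] = None if is_null else buf[off:off + length].decode("utf-8")
    row["salary"] = None if bitmap & (1 << len(VAR_FIELDS)) else salary
    return row


row = {"id": "10101", "name": "Srinivasan", "dept_name": "Comp. Sci.", "salary": 65000.0}
buf = encode(row)
print(len(buf), decode(buf) == row)
```

The fixed part is always 21 bytes under these assumptions, so any attribute can be located without scanning the preceding ones.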
Representation of variable-length records inside a block: slotted page structure
A slotted page has a header which contains:
- The number of record entries
- The end of free space in the block
- The location and size of each record
Records are allocated contiguously in the page/block, starting from the end.
Records can be moved around within the page/block to keep them contiguous (no empty space between them). The header entry is updated on every move.
Because of this, outside pointers should not point directly to the record but to its header entry.
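A toy slotted page, to make the indirection through the header concrete. This is a sketch under simplifying assumptions: the header entries live in a Python list rather than in the page bytes, and free-space checks are omitted.

```python
class SlottedPage:
    """Toy slotted page: records are packed at the end, header entries point to them."""

    def __init__(self, size: int = 4096):
        self.data = bytearray(size)
        self.slots = []          # header entries: [offset, length], or None if deleted
        self.free_end = size     # end of free space; records occupy [free_end, size)

    def insert(self, record: bytes) -> int:
        self.free_end -= len(record)
        self.data[self.free_end:self.free_end + len(record)] = record
        self.slots.append([self.free_end, len(record)])
        return len(self.slots) - 1       # the slot number is the stable outside pointer

    def read(self, slot: int) -> bytes:
        offset, length = self.slots[slot]
        return bytes(self.data[offset:offset + length])

    def delete(self, slot: int) -> None:
        offset, length = self.slots[slot]
        # Slide the records stored before the deleted one toward the end of the page.
        self.data[self.free_end + length:offset + length] = self.data[self.free_end:offset]
        for entry in self.slots:
            if entry is not None and entry[0] < offset:
                entry[0] += length       # header entry updated for every moved record
        self.slots[slot] = None
        self.free_end += length


page = SlottedPage(64)
a = page.insert(b"aaaa")
b = page.insert(b"bbbbbb")
c = page.insert(b"cc")
page.delete(b)
print(page.read(a), page.read(c), page.free_end)
```

After the delete, record c has physically moved, yet slot number c still finds it, which is why outside pointers refer to header entries.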
What if the data is larger than the block size?
Remember the data types BLOB and CLOB.
Large objects are often stored separately from the other (short) attributes, in special file(s). In this case, the record containing the large object has only a pointer to the object.
[Figure labels: File 1, BLOB 1, File 2, BLOB 2]
10.6 Organization of Records in Files
- Heap: a record can be placed anywhere in the file (in any block) where there is space. No ordering whatsoever.
- Sequential: store records in sequential order, based on the value of the search key of each record. The search key need not be the PK, or even a superkey! See next slides.
- Hashing: a hash function is computed on some attribute of each record; the result specifies in which block of the file the record should be placed. See Ch. 11.
10.6.1 Sequential File Organization
The records in the file are ordered by a search key.
Suitable for applications (e.g. queries) that require sequential processing of the entire file.
Sequential File Organization (Cont.)
Deletion: use pointer chains.
Insertion: locate the position where the record is to be inserted.
- if there is free space, insert there
- if there is no free space, insert the record in an overflow block
- in either case, the pointer chain must be updated
Need to reorganize the file from time to time to restore sequential order.
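A simplified sketch of insertion with a pointer chain. It assumes there is never free space in place, so every new record goes to the end of the file (the overflow area). Deletion is left out, and the class and method names are made up.

```python
class SequentialFile:
    """Toy sequential file: records are [key, next] pairs chained in key order."""

    def __init__(self, sorted_keys):
        self._load(sorted_keys)

    def _load(self, sorted_keys):
        # Physical order equals search-key order; "next" is simply the following slot.
        self.records = [[key, i + 1] for i, key in enumerate(sorted_keys)]
        if self.records:
            self.records[-1][1] = None
        self.head = 0 if self.records else None

    def insert(self, key):
        prev, cur = None, self.head
        while cur is not None and self.records[cur][0] < key:
            prev, cur = cur, self.records[cur][1]
        self.records.append([key, cur])      # stored in the overflow area
        new = len(self.records) - 1
        if prev is None:
            self.head = new                  # new smallest key
        else:
            self.records[prev][1] = new      # relink the predecessor

    def scan(self):
        """Yield keys in search-key order by following the pointer chain."""
        cur = self.head
        while cur is not None:
            yield self.records[cur][0]
            cur = self.records[cur][1]

    def reorganize(self):
        """Rewrite the file so that physical order matches key order again."""
        self._load(list(self.scan()))


f = SequentialFile([10, 20, 40])
f.insert(30)
f.insert(5)
print(list(f.scan()))                 # [5, 10, 20, 30, 40]
print([r[0] for r in f.records])      # [10, 20, 40, 30, 5]
```

The chain keeps scans in key order, but the physical order drifts with every insertion, which is what periodic reorganization repairs.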
10.6.2 Multitable Clustering File Organization
Many large-scale DBMSs do not rely directly on the underlying OS for file management. Instead, the OS allocates one large file to the DBMS, and the DBMS stores all relations in this one file and manages the file itself.
Even if multiple relations are stored in this single large file, the default is to store records of only one relation in a given block. This simplifies data management.
However, in some cases it can be useful to store records of more than one relation in a single block. This is called multitable clustering.
Example: [Figure: the department and instructor relations, and the multitable clustering of department and instructor]
Multitable Clustering File Organization (Cont.)
- good for queries involving department ⋈ instructor, and for queries involving one single department and its instructors
- bad for queries involving only department
- results in variable-size records
- can add pointer chains to link records of a particular relation
10.7 Data Dictionary Storage
The data dictionary (also called system catalog) stores metadata, that is, data about data, such as:
- Information about relations:
  - names of relations
  - names, types and lengths of attributes of each relation
  - names and definitions of views
  - integrity constraints
- User and accounting information, including passwords
- Statistical and descriptive data, e.g. the number of tuples in each relation
- Physical file organization information:
  - how each relation is stored (sequential/hash/…)
  - physical location of each relation
- Information about indices (Chapter 11)
Relational Representation of Dictionary/Catalog
- On disk: relational representation
- In memory: specialized data structures designed for efficient access
If multiple indices, this can be multivalued, so the dictionary DB is not even in 1NF!
PostgreSQL example
10.8 DB Buffer
A DB file is partitioned into fixed-length storage units called blocks.
The DBMS seeks to minimize the number of block transfers between disk and main memory (MM). We can reduce the number of disk accesses by keeping as many blocks as possible in MM.
Buffer = portion of main memory available to store copies of disk blocks.
Buffer manager = subsystem responsible for allocating buffer space in MM.
Buffer Manager (BM)
A program places a call to the BM when it needs a block from disk.
1. If the block is already in the buffer, the BM returns to the program the MM address of the block.
2. If the block is not in the buffer, the BM does the following:
   2.1. Allocates space in the buffer for the block:
        - replaces (throws out) some other block, if required, to make space for the new block
        - the replaced block is written back to disk only if it was modified since the most recent time that it was written to or fetched from the disk
   2.2. Reads the block from the disk into the buffer, and returns the address of the block in MM to the program.
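The steps above can be sketched as a small class. This is an illustration, not how any particular DBMS implements its buffer manager: a dict stands in for the disk, the returned frame (a Python list) stands in for the MM address, and LRU is assumed as the replacement policy.

```python
from collections import OrderedDict


class BufferManager:
    def __init__(self, disk: dict, capacity: int):
        self.disk = disk                  # stand-in for the disk: block id -> contents
        self.capacity = capacity          # number of blocks the buffer can hold
        self.frames = OrderedDict()       # block id -> [contents, dirty flag], LRU first
        self.reads = self.writes = 0      # disk transfer counters

    def get(self, block_id):
        if block_id in self.frames:                   # step 1: already in the buffer
            self.frames.move_to_end(block_id)
            return self.frames[block_id]
        if len(self.frames) >= self.capacity:         # step 2.1: make space
            victim, (contents, dirty) = self.frames.popitem(last=False)
            if dirty:                                 # write back only if modified
                self.disk[victim] = contents
                self.writes += 1
        self.frames[block_id] = [self.disk[block_id], False]   # step 2.2: read block
        self.reads += 1
        return self.frames[block_id]

    def write(self, block_id, contents):
        frame = self.get(block_id)
        frame[0], frame[1] = contents, True           # modify in buffer, mark dirty


disk = {1: "a", 2: "b", 3: "c"}
bm = BufferManager(disk, capacity=2)
bm.get(1)
bm.write(2, "B")
bm.get(3)          # evicts block 1, which is clean: no disk write
bm.get(1)          # evicts block 2, which is dirty: written back
print(bm.reads, bm.writes, disk[2])
```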
Buffer-Replacement Policies
The least recently used (LRU) strategy is popular in OSs.
Idea behind LRU: use the past pattern of block references as a predictor of future references.
However, queries have well-defined access patterns (such as sequential scans), so a DBMS can use the information in a user's query to better predict future references.
Buffer-Replacement Policies (Cont.)
Pinned block: a memory block that is not allowed to be written back to disk.
Most recently used (MRU) strategy: the system must pin the block currently being processed. After the final tuple of that block has been processed, the block is unpinned, and it becomes the most recently used block, i.e. the first candidate for replacement.
Toss-immediate strategy: a related policy that frees the space occupied by a block as soon as the final tuple of that block has been processed.
The buffer manager can use statistical information regarding the probability that a request will reference a particular relation.
- E.g., the data dictionary is frequently accessed
- Heuristic: keep data-dictionary blocks in the MM buffer
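LRU can be a bad strategy for certain access patterns involving repeated scans of data.
Example: compute the join of two relations r and s by nested loops:
    for each tuple tr of r do
        for each tuple ts of s do
            if the tuples tr and ts match ...
A mixed strategy is preferable:
- Toss-immediate for r (each block of r is processed once and is not needed again)
- MRU for s (the block of s just processed will not be needed again until the next scan of s)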
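The claim that LRU handles repeated scans badly can be checked with a small simulation. The numbers are made up for illustration: the inner relation s has 4 blocks, it is scanned 3 times, and the buffer holds 3 blocks.

```python
def count_misses(references, capacity, policy):
    """Count buffer misses for a sequence of block references under 'LRU' or 'MRU'."""
    buffer = []          # ordered from least to most recently used
    misses = 0
    for block in references:
        if block in buffer:
            buffer.remove(block)             # hit: will be re-appended as most recent
        else:
            misses += 1
            if len(buffer) >= capacity:
                # LRU evicts the oldest block, MRU evicts the newest one.
                buffer.pop(0 if policy == "LRU" else -1)
        buffer.append(block)
    return misses


scans = [0, 1, 2, 3] * 3                     # three sequential scans of s
print(count_misses(scans, 3, "LRU"))         # 12: every reference misses
print(count_misses(scans, 3, "MRU"))         # 6
```

Under LRU each block is evicted just before it is needed again, so the buffer never helps. MRU keeps most of s resident across scans.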
Homework for Ch. 10
End-of-chapter exercises 4, 6, 7, 17, 18