File Systems Hai Tao
File System
- A file system is used to store sources, objects, libraries and executables, numeric data, text, video, audio, etc.
- The file system provides access and control functions for the storage and retrieval of files
- Organization of the file system
- Disk scheduling
File Structure
- Storage: contiguous (sequential) versus non-contiguous (non-sequential)
- Access: sequential versus random
File Structure - UNIX
- Block information is stored in mapping tables
- Each file is associated with a small table called an i-node
- For large files, multi-level indirect maps are used
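As a rough illustration of multi-level indirect maps, the sketch below finds which mapping level holds a given logical block of a file. The block size, pointer size, and number of direct pointers are assumed values chosen for the example, not the layout of any particular UNIX file system, and `locate_block` is a hypothetical helper.

```python
# Assumed, illustrative layout (not taken from a specific UNIX implementation).
BLOCK_SIZE = 4096                             # bytes per disk block
POINTER_SIZE = 4                              # bytes per block pointer
NUM_DIRECT = 12                               # direct pointers held in the i-node
PTRS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE   # pointers that fit in one indirect block


def locate_block(logical_block):
    """Return (level, index within that level) for a file's logical block number."""
    if logical_block < 0:
        raise ValueError("logical block number must be non-negative")
    if logical_block < NUM_DIRECT:
        return ("direct", logical_block)
    logical_block -= NUM_DIRECT
    capacity = PTRS_PER_BLOCK  # blocks reachable through the current indirect level
    for level in ("single indirect", "double indirect", "triple indirect"):
        if logical_block < capacity:
            return (level, logical_block)
        logical_block -= capacity
        capacity *= PTRS_PER_BLOCK
    raise ValueError("block number exceeds what this i-node layout can map")


print(locate_block(5))     # small files only need the direct pointers
print(locate_block(2000))  # larger files spill into the indirect maps
```

Under these assumptions, small files are resolved from the i-node alone, while each extra level of indirection costs one additional block read.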
File Structure - File Allocation Table (FAT)
Department of Computer Engineering
- A linked structure
- Each file is represented as a chain of linked blocks
- Blocks are linked using pointers
- One problem is random access: to reach any part of a file, the system always has to start from the beginning of the file and follow the chain
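The random-access cost of a linked structure can be sketched as follows. The table is modeled as a plain dictionary from a block number to the next block number; `END_OF_FILE` and `nth_block` are hypothetical names used only for this illustration.

```python
END_OF_FILE = -1  # assumed end-of-chain marker for this sketch


def nth_block(fat, first_block, n):
    """Return the disk block holding the n-th block of a file (n = 0 is the first block).

    The chain must be followed link by link, so the cost grows with n.
    """
    block = first_block
    for _ in range(n):
        block = fat[block]
        if block == END_OF_FILE:
            raise IndexError("file has fewer blocks than requested")
    return block


# A file stored in disk blocks 4 -> 7 -> 2 -> 10.
fat = {4: 7, 7: 2, 2: 10, 10: END_OF_FILE}
print(nth_block(fat, 4, 3))  # reaching the last block follows three links
```

Reading the last block of a long file therefore requires one table lookup per preceding block, which is the random-access problem noted on the slide.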
Disk Management
- Block cache: recently accessed blocks are kept in memory for future use. The underlying assumption is that the probability of a revisit is high
- The physical structure of the disk has to be considered for efficient access. For example, the i-node table or FAT can be stored in the middle of the disk. Assuming head positions are uniformly distributed, this roughly halves the average seek distance to these structures compared with storing them at the edge of the disk
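A block cache can be sketched in a few lines. The slide does not name a replacement policy, so least-recently-used (LRU) eviction is an assumption here; `BlockCache` and `read_from_disk` are hypothetical names for this illustration.

```python
from collections import OrderedDict


class BlockCache:
    """Keeps the most recently accessed blocks in memory (LRU eviction assumed)."""

    def __init__(self, capacity, read_from_disk):
        self.capacity = capacity
        self.read_from_disk = read_from_disk  # callable: block number -> data
        self.blocks = OrderedDict()           # oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, block_no):
        if block_no in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_no)  # mark as most recently used
            return self.blocks[block_no]
        self.misses += 1
        data = self.read_from_disk(block_no)
        self.blocks[block_no] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)    # evict the least recently used block
        return data


cache = BlockCache(capacity=2, read_from_disk=lambda n: f"data{n}")
for block in (1, 2, 1, 3, 2):
    cache.read(block)
print(cache.hits, cache.misses)
```

The cache only pays off when the revisit assumption holds; a workload that never revisits blocks would see only misses.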
Disk Scheduling
The actual time to read or write a disk block is determined by
- The seek time: the time required for the movement of the read/write head
- The latency time or rotational delay: the time during which the transfer cannot proceed until the right block or sector rotates under the read/write head
- The actual data transfer time: the time for copying data from the disk to main memory
Reducing seek time is the key. For CDs, the seek time is very long
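The three components can be added up in a short calculation. The sketch assumes the average rotational delay is half a revolution, a common simplification; the function name and the numbers in the example are illustrative only.

```python
def access_time_ms(seek_ms, rpm, block_bytes, transfer_bytes_per_s):
    """Estimate the time to access one block: seek + average rotational delay + transfer."""
    rotation_ms = 60_000 / rpm                 # time for one full revolution
    avg_latency_ms = rotation_ms / 2           # assumed: half a revolution on average
    transfer_ms = block_bytes / transfer_bytes_per_s * 1000
    return seek_ms + avg_latency_ms + transfer_ms


# Illustrative drive: 10 ms seek, 6000 rpm, 15 MB/s, reading a 15 000-byte block.
print(access_time_ms(10, 6000, 15_000, 15_000_000))
```

With these figures the seek accounts for 10 ms, the rotational delay for 5 ms, and the transfer for only 1 ms, which is why the schedulers that follow concentrate on head movement.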
Disk scheduling
- First-Come-First-Served (FCFS): easy and fair, but long average seek time
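A minimal sketch of FCFS, serving requests strictly in arrival order and totalling the head movement. The request queue and starting head position are a made-up example, and `total_movement` is a helper introduced for this illustration.

```python
def total_movement(head, order):
    """Total number of tracks the head travels when serving `order` from `head`."""
    distance = 0
    for track in order:
        distance += abs(track - head)
        head = track
    return distance


def fcfs_order(requests):
    """FCFS serves requests exactly in the order they arrived."""
    return list(requests)


queue = [98, 183, 37, 122, 14, 124, 65, 67]  # illustrative track numbers
order = fcfs_order(queue)
print(order, total_movement(53, order))      # total movement: 640 tracks
```

The head swings back and forth across the disk because arrival order ignores track position.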
Disk scheduling
- Shortest-Seek-Time-First (SSTF): access the data that can be reached most easily (shortest seek time). Low seek time, but not fair
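A minimal sketch of SSTF: at each step the pending request closest to the current head position is served next. The queue and head position are a made-up example, and the function names are hypothetical.

```python
def total_movement(head, order):
    """Total number of tracks the head travels when serving `order` from `head`."""
    distance = 0
    for track in order:
        distance += abs(track - head)
        head = track
    return distance


def sstf_order(head, requests):
    """Repeatedly pick the pending request nearest to the current head position."""
    pending = list(requests)
    order = []
    position = head
    while pending:
        nearest = min(pending, key=lambda track: abs(track - position))
        pending.remove(nearest)
        order.append(nearest)
        position = nearest
    return order


queue = [98, 183, 37, 122, 14, 124, 65, 67]  # illustrative track numbers
order = sstf_order(53, queue)
print(order, total_movement(53, order))      # total movement: 236 tracks
```

The unfairness shows in the order: the request for track 183 arrived second but is served last, and a steady stream of nearby requests could delay it indefinitely.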
Disk scheduling
- SCAN: similar to SSTF, but takes the current direction of head movement into account. It serves all requests in one direction, then all requests in the other direction
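A minimal sketch of the SCAN service order. It assumes the head reverses right after the last pending request in its direction rather than travelling to the physical edge of the disk; the queue is a made-up example and `scan_order` is a hypothetical name.

```python
def scan_order(head, requests, direction="up"):
    """Serve every request in the current direction, then reverse and serve the rest."""
    lower = sorted((r for r in requests if r < head), reverse=True)  # nearest first going down
    upper = sorted(r for r in requests if r >= head)                 # nearest first going up
    if direction == "up":
        return upper + lower
    return lower + upper


queue = [98, 183, 37, 122, 14, 124, 65, 67]  # illustrative track numbers
print(scan_order(53, queue, "up"))
print(scan_order(53, queue, "down"))
```

Unlike SSTF, a request cannot be bypassed repeatedly: each one is reached within at most one sweep up and one sweep down.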
Disk scheduling
- C-SCAN: similar to SCAN, but fairer. It serves requests in one direction only, then returns to the start and sweeps again
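A minimal sketch of the C-SCAN service order, assuming the sweep direction is toward higher track numbers. After the highest pending request, the head returns without serving anything and starts again from the lowest pending request. The queue is a made-up example and `c_scan_order` is a hypothetical name.

```python
def c_scan_order(head, requests):
    """Serve requests at or above the head in ascending order, then wrap around."""
    upper = sorted(r for r in requests if r >= head)
    lower = sorted(r for r in requests if r < head)  # served after the return sweep
    return upper + lower


queue = [98, 183, 37, 122, 14, 124, 65, 67]  # illustrative track numbers
print(c_scan_order(53, queue))
```

Because every sweep runs in the same direction, tracks at both ends of the disk wait about the same time, whereas plain SCAN visits the middle tracks twice as often as the edges.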
Multimedia File Systems
Special characteristics
- Real-time performance: data must be retrieved before the deadline. Buffering is used to remove jitter
- Large file size
- Multiple data streams: multiple streams should be organized so that they can be accessed before their deadlines
There are two issues in achieving these goals
- Organizing data on disk so that they can be retrieved efficiently
- Using special disk scheduling and buffering to achieve real-time data access during playback
Storage Device
Magnetic disks
- Fast: seek time around 8-10 ms, sustained transfer rate around 15 MB/s
- Affordable: $1-$2/GB
- Portability is relatively poor: moving parts
- Can be very small: 1 GB on a quarter-size hard drive (IBM)
Flash memory (PenDrive)
- Seek time less than 1 ms (0.25 ms)
- Sustained transfer rate 1.5 MB/s
- Expensive
- No moving parts
Storage Device
Optical disks
- Slow
  Speed   Seek time (ms)   Transfer rate
  2X      320              300 KB/s
  4X      135-180          600 KB/s
  8X      135-180          1.2 MB/s
  12X     100-150          1.8 MB/s
  16X     100-150          2.4 MB/s (maximum)
  24X     100-150          3.6 MB/s (maximum)
  32X     100-150          4.8 MB/s (maximum)
- Portable
- Inexpensive
Storage Device
DVD-ROM
- Seek time around 100 ms
- Sustained transfer rate (read) 16 MB/s
- Capacity: large, 4.7 GB
- Media: expensive
Zip Drive
- Seek time 29 ms
- Sustained transfer rate 2.4 MB/s
- Capacity 100-250 MB
Storage Device
Methods of recording are different
- Optical: constant linear speed, so the rotation speed varies. Bits are evenly distributed on the disk
- Magnetic disk: constant rotation speed. Bits are denser on the inner cylinders
- Flash memory: no moving parts
Placement of data on storage device
Fragmentation
- Internal fragmentation: a block is not entirely filled by data. This only happens at the last block of a file. The larger the block, the more waste may occur (why?)
- External fragmentation: a small number of unused blocks are left between files (in a contiguous system)
In a conventional file system, the goal of file organization is to minimize disk fragmentation and therefore increase the capacity
In a multimedia file system, the goal is to provide constant and timely access to data
- Store continuous media contiguously in large data blocks
- Files that are likely to be retrieved together are stored together to reduce access time
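On the question of why larger blocks may waste more space: only the last block of a file is partly empty, and the unused part can be anywhere from zero up to almost one full block, so the possible waste per file grows with the block size. The sketch below computes that waste for one file; the file size and block sizes are illustrative values and `internal_waste` is a hypothetical name.

```python
def internal_waste(file_size, block_size):
    """Unused bytes in the last block of a file (internal fragmentation)."""
    remainder = file_size % block_size
    return 0 if remainder == 0 else block_size - remainder


file_size = 10_000  # bytes, illustrative
for block_size in (512, 4096, 65_536):
    print(block_size, internal_waste(file_size, block_size))
```

For this 10 000-byte file the waste is 240 bytes with 512-byte blocks, 2288 bytes with 4096-byte blocks, and 55 536 bytes with 65 536-byte blocks. Large blocks still suit multimedia files because each file is very large compared with one block.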
Disk scheduling algorithm
- In a traditional file system, the goal of disk scheduling is to reduce the average seek time and so achieve high throughput when multiple files are accessed
- In a multimedia system, the goal is to meet the deadlines of time-critical tasks
- In a contiguous file system, a scheduling algorithm is needed only when multiple streams are accessed concurrently
Earliest Deadline First (EDF)
- The block of the stream with the earliest deadline is accessed first
- EDF is preemptive, but may impose unnecessary additional cost through excessive context switching
- Because track positions are ignored, it results in poor throughput and excessive seek time
Earliest Deadline First (EDF) Example
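A small worked example of EDF ordering, written as a sketch: requests are modeled as hypothetical `(deadline, track)` pairs and served purely by deadline. Preemption is left out for brevity, and all values are made up for illustration.

```python
def total_movement(head, tracks):
    """Total number of tracks the head travels when visiting `tracks` from `head`."""
    distance = 0
    for track in tracks:
        distance += abs(track - head)
        head = track
    return distance


def edf_order(requests):
    """Serve (deadline, track) requests in order of increasing deadline."""
    return sorted(requests, key=lambda request: request[0])


requests = [(3, 10), (1, 90), (2, 20), (4, 80)]  # (deadline, track), illustrative
order = edf_order(requests)
tracks = [track for _, track in order]
print(tracks, total_movement(50, tracks))        # 190 tracks of head movement
```

Starting from track 50, the deadline order visits tracks 90, 20, 10, 80 for 190 tracks of movement, while a single sweep over the same four tracks (80, 90, 20, 10) would need only 120. That difference is the seek overhead EDF accepts in order to meet deadlines.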
SCAN-EDF
- The request with the earliest deadline is always served first
- Among requests with the same deadline, the first one in the scan direction is served first
- Efficiency depends on how often requests share the same deadline. To make this occur more often, use multiples of the cycle period as deadlines (this is usually acceptable for playback of media files)
- Another strategy is to sum the access times of all tasks in a cycle and use the total as the deadline for all of them
SCAN-EDF Example
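A small worked example of SCAN-EDF ordering, written as a sketch: requests are hypothetical `(deadline, track)` pairs, the deadline decides first, and ties are broken by track position along an assumed scan direction. A real implementation would use the head's current direction; here it is passed in as a parameter.

```python
def scan_edf_order(requests, direction="up"):
    """Order (deadline, track) requests by deadline, breaking ties in scan direction."""
    sign = 1 if direction == "up" else -1
    return sorted(requests, key=lambda request: (request[0], sign * request[1]))


# Deadlines rounded to multiples of the cycle, so several requests share each one.
requests = [(2, 80), (1, 90), (1, 20), (2, 10), (1, 50)]  # illustrative values
print(scan_edf_order(requests, "up"))
print(scan_edf_order(requests, "down"))
```

With only two distinct deadlines, each group of requests is served in one sweep. If every deadline were distinct, the result would be identical to plain EDF, which is why the slide recommends making equal deadlines more common.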
Group Sweeping Scheduling (GSS)
- Streams are served in cycles, in a round-robin manner
- N streams are divided into g groups
- Groups are served in a fixed order
- Within each group, streams are accessed according to the SCAN strategy
Why? With a simple SCAN scheme, the block of an audio stream may be read first in one cycle but last in the next. Using groups reduces this jitter
Group Sweeping Scheduling (GSS) Example
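A small worked example of one GSS cycle, written as a sketch: groups are visited in their fixed order, and within a group the streams are served in ascending order of the track holding their next block, which stands in for a SCAN sweep. The stream names, grouping, and track positions are made up for illustration.

```python
def gss_cycle(groups, next_track):
    """Return the service order of streams for one cycle.

    groups: list of groups (each a list of stream names), in fixed service order.
    next_track: dict mapping a stream name to the track of its next block.
    """
    order = []
    for group in groups:  # group order never changes between cycles
        order.extend(sorted(group, key=lambda stream: next_track[stream]))
    return order


groups = [["audio1", "audio2"], ["video1", "video2", "video3"]]
next_track = {"audio1": 70, "audio2": 20, "video1": 90, "video2": 10, "video3": 50}
print(gss_cycle(groups, next_track))
```

Because the audio group is always served first, an audio stream's position in the cycle can vary only within its own group of two, no matter where the video blocks lie. Putting every stream in a single group gives plain SCAN, and putting each stream in its own group gives fixed round-robin.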