Lecture Overview
Operating Systems - June 28, 2001

- Mass storage devices
- Disk scheduling
- Disk reliability
- Tertiary storage
- Swap space management
- Linux swap space management

Disk Structure

- Disk drives are addressed as large 1-dimensional arrays of logical blocks, where the logical block is the smallest unit of transfer
- The 1-dimensional array of logical blocks is mapped into the sectors of the disk sequentially
  - Sector 0 is the first sector of the first track on the outermost cylinder
  - Mapping proceeds in order through that track, then through the rest of the tracks in that cylinder, and then through the rest of the cylinders from outermost to innermost
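To make the mapping concrete, here is a minimal C sketch under the idealized assumption of a uniform geometry (a fixed number of sectors per track and tracks per cylinder); real drives hide zone recording and defect remapping behind the logical-block interface, so the constants below are purely illustrative.

    /* Idealized logical-block-to-CHS translation; TRACKS_PER_CYL and
     * SECTORS_PER_TRACK are illustrative, not real drive parameters. */
    #include <stdio.h>

    #define TRACKS_PER_CYL    8       /* heads (tracks) per cylinder */
    #define SECTORS_PER_TRACK 64

    struct chs { long cylinder, track, sector; };

    static struct chs block_to_chs(long block)
    {
        struct chs p;
        p.sector   = block % SECTORS_PER_TRACK;  /* position in track */
        p.track    = (block / SECTORS_PER_TRACK) % TRACKS_PER_CYL;
        p.cylinder = block / (SECTORS_PER_TRACK * TRACKS_PER_CYL);
        return p;                     /* cylinder 0 is the outermost */
    }

    int main(void)
    {
        struct chs p = block_to_chs(1000);
        printf("cylinder %ld, track %ld, sector %ld\n",
               p.cylinder, p.track, p.sector);
        return 0;
    }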

Disk Scheduling

- The OS is responsible for minimizing access time and maximizing bandwidth for disks
- Access time has two major components
  - Seek time is the time for the disk arm to move the heads to the cylinder containing the desired sector
  - Rotational latency is the additional time spent waiting for the disk to rotate the desired sector under the disk head
- Disk bandwidth is the total number of bytes transferred, divided by the total time from the start of the first request to the completion of the last transfer
- Requests are serviced immediately if the disk is not busy; if the disk is busy, requests are queued
- Disk scheduling chooses the next request to service

Disk Scheduling: First-come, first-served (FCFS)

- As always, this is the simplest algorithm to implement
- This algorithm is intrinsically fair, but does not necessarily provide the fastest service
- Consider requests for blocks on the following cylinders: 98, 183, 37, 122, 14, 124, 65, 67
- Assume the disk head is initially on cylinder 53
- The next slide shows the head traversal
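A minimal sketch of the FCFS computation for this example (the request queue and starting cylinder come from the slide; the code itself is illustrative):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        int req[] = {98, 183, 37, 122, 14, 124, 65, 67};
        int head = 53, moved = 0;

        for (int i = 0; i < 8; i++) { /* service strictly in arrival order */
            moved += abs(req[i] - head);
            head = req[i];
        }
        printf("FCFS head movement: %d cylinders\n", moved); /* 640 */
        return 0;
    }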

Disk Scheduling: FCFS

- Total disk head movement is 640 cylinders

Disk Scheduling: Shortest-seek-time-first (SSTF)

- Service the request closest to the current head position (i.e., the one with the minimum seek time)
- A form of shortest-job-first scheduling
- May lead to starvation
- The next slide depicts head movement for the previous example
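A sketch of SSTF on the same example; each iteration greedily picks the pending request closest to the current head position:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        int req[] = {98, 183, 37, 122, 14, 124, 65, 67};
        int done[8] = {0};
        int head = 53, moved = 0;

        for (int served = 0; served < 8; served++) {
            int best = -1;            /* closest pending request */
            for (int i = 0; i < 8; i++)
                if (!done[i] && (best < 0 ||
                                 abs(req[i] - head) < abs(req[best] - head)))
                    best = i;
            moved += abs(req[best] - head);
            head = req[best];
            done[best] = 1;
        }
        printf("SSTF head movement: %d cylinders\n", moved); /* 236 */
        return 0;
    }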

Disk Scheduling: SSTF

- Total head movement is 236 cylinders

Disk Scheduling: Elevator algorithm

- Also called SCAN
- The disk arm starts at one end of the disk
- It moves toward the other end of the disk, servicing requests along the way
- When there are no more requests ahead of it or it reaches the end, it reverses direction and moves back toward the other end, again servicing requests along the way
- This process continues repeatedly
- The next slide depicts head movement for the previous example
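A sketch of the elevator policy on the same example, assuming the arm initially moves toward lower-numbered cylinders and reverses at the last pending request in each direction (the variant that produces the 208-cylinder figure below):

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        int req[] = {98, 183, 37, 122, 14, 124, 65, 67};
        int done[8] = {0};
        int head = 53, moved = 0, served = 0, going_up = 0;

        while (served < 8) {
            int best = -1;            /* nearest pending request ahead */
            for (int i = 0; i < 8; i++) {
                if (done[i])
                    continue;
                if ((going_up ? req[i] >= head : req[i] <= head) &&
                    (best < 0 || abs(req[i] - head) < abs(req[best] - head)))
                    best = i;
            }
            if (best < 0) {           /* nothing ahead: reverse direction */
                going_up = !going_up;
                continue;
            }
            moved += abs(req[best] - head);
            head = req[best];
            done[best] = 1;
            served++;
        }
        printf("Elevator head movement: %d cylinders\n", moved); /* 208 */
        return 0;
    }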

Disk Scheduling: Elevator algorithm

- Total head movement is 208 cylinders
- This figure reverses at the last request in each direction; a strict elevator that runs all the way to cylinder 0 before reversing would travel 53 + 183 = 236 cylinders on this example

Disk Scheduling: C-SCAN

- Circular SCAN
- Provides a more uniform wait time than the elevator algorithm
- The head moves from one end of the disk to the other, servicing requests as it goes
- When it reaches the other end, it immediately returns to the beginning of the disk without servicing any requests on the return trip
- Treats the cylinders as a circular list that wraps around from the last cylinder to the first
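A sketch of C-SCAN on the same example, assuming cylinders 0-199 (the range is an assumption inferred from the example, not stated on the slide) and counting the return sweep as head movement:

    #include <stdio.h>
    #include <stdlib.h>

    static int cmp(const void *a, const void *b)
    {
        return *(const int *)a - *(const int *)b;
    }

    int main(void)
    {
        int req[] = {98, 183, 37, 122, 14, 124, 65, 67};
        const int start = 53, last_cyl = 199;
        int head = start, moved = 0;

        qsort(req, 8, sizeof req[0], cmp);

        for (int i = 0; i < 8; i++)   /* sweep up from the head position */
            if (req[i] >= start) { moved += req[i] - head; head = req[i]; }

        moved += (last_cyl - head) + last_cyl; /* to the edge, wrap to 0 */
        head = 0;

        for (int i = 0; i < 8; i++)   /* finish the low requests going up */
            if (req[i] < start) { moved += req[i] - head; head = req[i]; }

        printf("C-SCAN head movement: %d cylinders\n", moved);
        return 0;
    }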

Disk Scheduling: C-SCAN

- Head movement for the previous example

Selecting a Disk Scheduling Algorithm

- SSTF is common and has a natural appeal
- Elevator and C-SCAN perform better for systems that place a heavy load on the disk
- Performance depends on the number and types of requests
- Requests for disk service can be influenced by the file-allocation method
- The disk-scheduling algorithm should be written as a separate module of the operating system, allowing it to be replaced with a different algorithm if necessary
- Either SSTF or Elevator is a reasonable choice for the default algorithm

Disk Reliability

- Several improvements in disk-use techniques involve the use of multiple disks working cooperatively
- Disk striping uses a group of disks as one
  - Improves performance by breaking blocks into sub-blocks stored across the disks
- RAID (redundant array of independent disks) schemes improve performance and improve the reliability of the storage system by storing redundant data
  - Mirroring or shadowing keeps a duplicate of each disk
  - Block-interleaved parity uses much less redundancy (see the sketch after this list)

RAID Schemes

- Defined in 1988 by Patterson, Gibson, and Katz at the University of California, Berkeley
- RAID 0: Striping
- RAID 1: Mirroring
- RAID 2: Bit striping with ECC (uses a Hamming code; practically not used)
- RAID 3: Bit striping with parity
- RAID 4: Striping with fixed parity
- RAID 5: Striping with striped parity
- RAID 10: Striped mirrors
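To illustrate why block-interleaved parity needs so little redundancy, here is a minimal sketch (the block size and disk contents are invented for the example): the parity block is the XOR of the data blocks, so any single lost block can be rebuilt by XOR-ing the survivors with the parity.

    #include <stdio.h>
    #include <string.h>

    #define BLOCK 16                  /* illustrative block size */
    #define DISKS 3                   /* data disks in the stripe */

    static void xor_into(unsigned char *dst, const unsigned char *src)
    {
        for (int i = 0; i < BLOCK; i++)
            dst[i] ^= src[i];
    }

    int main(void)
    {
        unsigned char data[DISKS][BLOCK] =
            {"disk0 payload", "disk1 payload", "disk2 payload"};
        unsigned char parity[BLOCK] = {0};
        unsigned char rebuilt[BLOCK];

        for (int d = 0; d < DISKS; d++) /* parity = XOR of all data blocks */
            xor_into(parity, data[d]);

        /* simulate losing disk 1: XOR parity with the surviving blocks */
        memcpy(rebuilt, parity, BLOCK);
        xor_into(rebuilt, data[0]);
        xor_into(rebuilt, data[2]);

        printf("rebuilt: %s\n", rebuilt); /* prints "disk1 payload" */
        return 0;
    }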

Tertiary Storage

- Low cost is the defining characteristic of tertiary storage
- Generally, tertiary storage is built using removable media
- Common examples of removable media are floppy disks, CD-ROMs, etc.

Tertiary Storage: Floppy disks

- A floppy disk is a thin flexible disk coated with magnetic material, enclosed in a protective plastic case
- Most floppies hold about 1 MB; similar technology is used for removable disks that hold more than 1 GB
- Removable magnetic disks can be nearly as fast as hard disks, but they are at a greater risk of damage from exposure

Tertiary Storage: Magneto-optic disks

- A magneto-optic disk records data on a rigid platter coated with magnetic material
- Laser heat is used to "amplify" a weak magnetic field to record a bit: the heated spot becomes susceptible to the field while the surrounding material is unaffected
- Laser light is also used to read data (Kerr effect)
  - The polarization of the laser is rotated depending on the orientation of the magnetic field
- The magneto-optic head flies much farther from the disk surface than a magnetic disk head, and the magnetic material is covered with a protective layer of plastic or glass, making the medium resistant to head crashes

Tertiary Storage: Optical disks

- Optical disks do not use magnetism; they employ special materials that are altered by laser light
- Phase-change disk
  - Coated with material that can freeze into either a crystalline or an amorphous state
  - The two states reflect laser light with different strengths
  - The laser is used at different power settings to melt and refreeze spots on the disk to change their state
- Dye-polymer disk
  - Coated with plastic containing a dye that absorbs laser light
  - The laser can heat a small spot to cause it to swell and form a bump
  - The laser can reheat a bump to smooth it out
  - High cost, low performance

Tertiary Storage: WORM disks

- Data on read/write disks can be modified many times
- WORM ("Write Once, Read Many Times") disks can be written only once
  - A thin aluminum film is sandwiched between two glass or plastic platters
  - To write a bit, the drive uses laser light to burn a small hole through the aluminum; information can be destroyed but not altered
  - Very durable and reliable

Swap Space Management

- Swap space: virtual memory uses disk space as an extension of main memory
- Swap space can be carved out of the normal file system or, more commonly, it can be in a separate disk partition
- Swapping serves two primary purposes
  - It expands the usable address space of a process; for example, programs that are bigger than physical RAM can still be executed
  - It expands the amount of RAM available for loading processes; for example, an entire program does not need to be in memory in order to execute

Linux Swap Space Management

- Performed at the page level
  - Requires a hardware paging unit in the CPU
- Linux page table entry
  - The Present flag indicates whether a page is swapped out
  - The remaining bits store the location of the page on disk
- Linux only swaps
  - Pages belonging to an anonymous memory region of a process (e.g., the user-mode stack)
  - Modified pages belonging to a private memory mapping of a process
  - Pages belonging to an IPC shared memory region
- Other page types are either used by the kernel or used to map files on disk, and are not swapped to the swap area

Distributing Pages in a Linux Swap Area

- A swap area is organized into slots; a slot contains exactly one page
- When swapping out, the kernel tries to store pages in contiguous slots so as to minimize seek time
- If more than one swap area is used (see the sketch below)
  - Swap areas on faster disks get higher priority
  - When looking for a free slot, the search starts with the area with the highest priority
  - If several areas have the same priority, a cyclic selection is made among them to avoid overloading any one of them
  - If no free slots are found in the highest-priority swap area, the search continues down to the next highest priority
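A sketch of the priority-driven area selection just described; the structures, counts, and names are illustrative, not the kernel's:

    #include <stdio.h>

    #define NAREAS 3

    struct swap_area { int priority; int free_slots; };

    /* Pick the swap area to allocate from: the highest-priority area
     * with a free slot wins; the rotating cursor breaks ties cyclically
     * so equal-priority areas share the load. */
    static int pick_area(struct swap_area *a, int n, int *cursor)
    {
        int best = -1;
        for (int k = 0; k < n; k++) {
            int i = (*cursor + k) % n;    /* rotate the starting point */
            if (a[i].free_slots > 0 &&
                (best < 0 || a[i].priority > a[best].priority))
                best = i;
        }
        if (best >= 0)
            *cursor = (best + 1) % n;
        return best;                      /* -1: no free slot anywhere */
    }

    int main(void)
    {
        struct swap_area areas[NAREAS] = {{10, 4}, {10, 4}, {5, 100}};
        int cursor = 0;
        for (int i = 0; i < 4; i++) {
            int a = pick_area(areas, NAREAS, &cursor);
            if (a < 0)
                break;
            areas[a].free_slots--;        /* alternates areas 0 and 1 */
            printf("allocated from area %d\n", a);
        }
        return 0;
    }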

Linux Page Swap Selection

- Generally, the rule for selecting pages to swap is to take them from the process with the largest number of pages
  - Changed in 2.4 to favor the processes with the fewest page faults
- A choice must also be made among a process's pages
  - As we know, LRU is a good selection algorithm
  - The x86 processor does not provide hardware support for LRU
  - Linux tries to approximate LRU by using the Accessed flag in each page table entry; this flag is set automatically by the hardware when the page is accessed (a sketch of this approximation follows below)

When to Swap in Linux

- Linux swaps out pages on the following occasions
  - When the number of free page frames falls below a predefined threshold; the kswapd thread is activated once every second to swap out pages
  - When a request to the buddy system cannot be satisfied because the number of free frames would fall below a predefined threshold
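A sketch of the kind of LRU approximation described above, in the style of the clock (second-chance) algorithm; the structures are illustrative, not kernel code:

    #include <stdio.h>

    #define NPAGES 4

    struct page { int accessed; };    /* set by hardware on each access */

    /* Sweep pages in a circle: a set Accessed flag buys the page one
     * more round (the flag is cleared); the first page found with the
     * flag clear is evicted. */
    static int pick_victim(struct page *pages, int *hand)
    {
        for (;;) {
            int idx = *hand;
            *hand = (*hand + 1) % NPAGES;
            if (pages[idx].accessed)
                pages[idx].accessed = 0;  /* recently used: spare it */
            else
                return idx;               /* not referenced lately */
        }
    }

    int main(void)
    {
        struct page pages[NPAGES] = {{1}, {1}, {0}, {1}};
        int hand = 0;
        printf("evict page %d\n", pick_victim(pages, &hand)); /* 2 */
        return 0;
    }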

Linux Swap Area

- Linux can support up to MAX_SWAPFILES swap areas (usually set to 8)
- Each swap area consists of a sequence of page slots, i.e., 4096-byte blocks
- The first slot of the swap area is used to store information about the swap area
  - It is a swap_header union containing two structures (sketched below)
  - magic: contains a magic number, stored at the end of the slot, that identifies the partition as swap
  - info: contains information such as the swap algorithm version, the last usable page slot, and the number and locations of bad page slots
- Each swap area also has an in-memory descriptor of type swap_info_struct, stored in the swap_info array
  - Contains the device number, the dentry of the file or device file, the priority, a pointer to an array of counters (one counter for each slot), a pointer to an array of bit locks for the swap area, the last slot scanned, the next slot to scan, etc.
  - A counter is 0 if the slot is free, a positive value if it is not free, SWAP_MAP_MAX (32767) if the slot is permanently reserved, and SWAP_MAP_BAD (32768) if the slot is defective
  - The descriptors also form a list sorted by priority
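Simplified C declarations matching the description above; the field selection and widths follow the slide rather than the actual 2.4 kernel headers, so treat them as a sketch:

    #define PAGE_SIZE     4096
    #define MAX_SWAPFILES 8
    #define SWAP_MAP_MAX  32767       /* slot is permanently reserved */
    #define SWAP_MAP_BAD  32768       /* slot is defective */

    /* On-disk contents of the swap area's first slot. */
    union swap_header {
        struct {
            char pad[PAGE_SIZE - 10];
            char magic[10];           /* identifies the partition as swap */
        } magic;
        struct {
            char bootbits[1024];      /* space left for boot blocks */
            unsigned int version;     /* swap algorithm version */
            unsigned int last_page;   /* last usable page slot */
            unsigned int nr_badpages; /* number of bad page slots */
            unsigned int badpages[1]; /* locations of the bad slots */
        } info;
    };

    /* In-memory swap area descriptor (one entry of the swap_info array). */
    struct swap_info_struct {
        int prio;                     /* areas on faster disks rank higher */
        unsigned short *swap_map;     /* per-slot counters, 0 = free */
        unsigned int cluster_next;    /* next slot to scan */
        unsigned int last_scanned;    /* last slot scanned */
    };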

Linux Swap Area

[Figure: swap area data structures. The swap_info array of swap area descriptors points to the swap_device or swap_file and to the swap_map array of per-slot counters, e.g., 0 1 0 2 1 0 32768 32768: 0 marks a free slot, a positive count an occupied slot, and 32768 a defective slot.]

Swapped-Out Page Identifier

- A swapped-out page is uniquely identified by the index of its swap area in the swap_info array and its slot index
- The first possible slot index is 1, since slot 0 is used for the swap header
- The format of a swapped-out page identifier: bits 31-8 hold the page slot index, bits 7-2 hold the area number, and bits 1-0 are zero
- This identifier is inserted into the page table when a page is swapped out; a page table entry then has three cases
  - A null entry means the page is not in the process address space
  - The page is swapped out if the least significant two bits are zero and the remaining 30 bits are not all zero
  - If the least significant bit is 1, the page is in memory
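A sketch of the identifier layout, using illustrative helper names rather than the kernel's own macros:

    /* Field positions follow the slide: bits 31-8 slot index, bits 7-2
     * swap area number, bits 1-0 zero. */
    #include <stdio.h>

    typedef unsigned int swp_entry;

    static swp_entry make_swap_entry(unsigned area, unsigned slot)
    {
        return (slot << 8) | (area << 2);   /* bits 1-0 stay zero */
    }

    static unsigned entry_area(swp_entry e) { return (e >> 2) & 0x3f; }
    static unsigned entry_slot(swp_entry e) { return e >> 8; }

    int main(void)
    {
        swp_entry e = make_swap_entry(1, 42);   /* area 1, slot 42 */
        printf("area %u, slot %u\n", entry_area(e), entry_slot(e));
        return 0;
    }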

Finding a Free Page Slot

- The goal is to store pages in contiguous slots
- Always allocating from the start of the swap area increases average swap-out time, while always allocating from the last allocated slot increases swap-in time
- The Linux approach tries to balance these issues (a sketch of this search appears after the next slide)
- Linux always searches for free slots from the last allocated slot, unless
  - The end of the swap area is reached, or
  - The number of slots allocated since the last restart is greater than or equal to SWAPFILE_CLUSTER (usually 256)

The Swap Cache

- Linux allows page frames to be shared when
  - The frame is associated with a shared memory mapping
  - The frame is used for a copy-on-write operation
  - The frame is allocated to an IPC shared memory resource
- The swap area is not used in the first case (the mapped file is used instead)
- It is possible for a page to be swapped out from one process but still be in use in another process that shares it
- The swap cache collects all shared page frames that have been copied to swap areas
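Returning to the free-slot search above, here is a minimal sketch of that policy (constants and structure are illustrative, not kernel code): allocation continues from just past the last allocated slot so pages land contiguously, and the cursor is reset at the end of the area or after SWAPFILE_CLUSTER allocations.

    #include <stdio.h>

    #define NSLOTS           1024
    #define SWAPFILE_CLUSTER 256

    static unsigned short swap_map[NSLOTS]; /* 0 = free slot */
    static unsigned int cluster_next = 1;   /* slot 0 holds the header */
    static unsigned int cluster_nr = SWAPFILE_CLUSTER;

    static int scan_swap_map(void)
    {
        if (cluster_nr == 0 || cluster_next >= NSLOTS) {
            cluster_next = 1;         /* restart from the beginning */
            cluster_nr = SWAPFILE_CLUSTER;
        }
        for (int pass = 0; pass < 2; pass++) {
            for (unsigned int i = cluster_next; i < NSLOTS; i++)
                if (swap_map[i] == 0) {
                    swap_map[i] = 1;  /* first user of this slot */
                    cluster_next = i + 1;
                    cluster_nr--;
                    return (int)i;
                }
            cluster_next = 1;         /* wrap once and rescan */
        }
        return -1;                    /* the swap area is full */
    }

    int main(void)
    {
        int a = scan_swap_map();
        int b = scan_swap_map();
        int c = scan_swap_map();
        printf("slots: %d %d %d\n", a, b, c); /* 1 2 3: contiguous */
        return 0;
    }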

The Swap Cache

- It is possible for a page to be swapped out multiple times, which is why the slot counter in the swap area descriptor may have a value greater than 1
  - The counter is incremented each time the page is swapped out
- It would be possible to swap out a shared page from all processes at once, but this is inefficient since the kernel does not know which processes share the page
- Consider page P shared between processes A and B
  - If P is selected for swapping in process A, a slot is allocated for P and the data for P is copied to the slot
  - The swapped-out identifier is put in process A's page table
  - The page's usage counter does not become zero, since process B is still using the page; thus the page was swapped, but not freed
  - Now suppose P is selected for swapping from process B
  - The kernel first checks whether P is in the swap cache (i.e., it has already been swapped out); if so, it gets the page's swapped-out identifier and puts that in process B's page table
  - The frame is freed once all processes sharing the page have swapped it out
  - A similar cache look-up is needed for bringing the page back in
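A sketch of the look-up just described (structures and names are illustrative): the second process to swap out a shared page finds the slot in the cache and reuses the identifier instead of writing the page again.

    #include <stdio.h>

    #define NSLOTS 64

    static unsigned short swap_map[NSLOTS]; /* per-slot user counter */
    static int cache[NSLOTS];               /* page id per slot, 0 = none */

    static int swap_out(int page_id)
    {
        for (int s = 1; s < NSLOTS; s++)    /* already in the swap cache? */
            if (cache[s] == page_id) {
                swap_map[s]++;              /* one more process references it */
                return s;
            }
        for (int s = 1; s < NSLOTS; s++)    /* otherwise allocate a new slot */
            if (swap_map[s] == 0) {
                swap_map[s] = 1;
                cache[s] = page_id;
                /* ...the page data would be written to the slot here... */
                return s;
            }
        return -1;                          /* no free slot */
    }

    int main(void)
    {
        int a = swap_out(42);  /* process A swaps page 42: data written */
        int b = swap_out(42);  /* process B: cache hit, counter bumped */
        printf("slot %d, users %d\n", b, swap_map[a]); /* slot 1, users 2 */
        return 0;
    }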