NSTL White Paper: System Performance and File Fragmentation in Windows NT


NSTL White Paper: File Fragmentation in Windows NT

Table of Contents

Executive Summary ... 3
I. Introduction ... 4
    File Fragmentation and Data Fragmentation are Different ... 4
    Fragmentation Can Impede Performance ... 5
    NTFS is Very Different from FAT ... 6
    NTFS Does Get Fragmented ... 6
    Performance Degradations Can Impede Productivity ... 6
    Keeping a Disk Defragmented Can Prevent These Problems ... 7
II. How NTFS Works ... 7
    NTFS Capabilities in Functional Terms ... 7
    Master File Table ... 8
    Directories ... 9
    Compression
    Software RAID
    Dynamic Bad-Cluster Remapping
    Disk Caching
    Volume Sets
    Paging Files
III. How NTFS Gets Fragmented
    Normal Creation and Deletion of Extents
    The Impact of Unusual Events
    Checkpoints
    Increased Head Movement From Disparity of Extents
    Cluster Size Issues, Trade-offs with Capacity and Performance
    System Files (Principally, but Not Exclusively, the Paging File)
    Fragmentation of Directories
    Fragmentation of the MFT Itself
    Workstation Specific Issues
    Server Specific Issues
IV. The Implications of Fragmentation
    Fragmentation is Difficult to Test
    NT Performance is Impeded by Disk Fragmentation
    Enterprise Systems are More Susceptible to These Problems
    RAID Systems are Susceptible to Fragmentation
    Disk Caching Mitigates, Doesn't Eliminate These Problems
    Some User Scenarios are Performance Limited, and Productivity is Therefore Impeded by Fragmentation
    Optimization is Not a Solution
V. Conclusions
    Regular Defragmentation Can Mitigate Performance Problems
    Both Workstations and Servers Can Benefit
Glossary

This report was prepared by NSTL under contract for Executive Software. NSTL does not guarantee the accuracy, adequacy or completeness of the services provided. NSTL MAKES NO WARRANTIES, EXPRESSED OR IMPLIED, AS TO RESULTS TO BE OBTAINED BY ANY PERSON OR ENTITY FROM USE OF THE CONTENTS OF THIS REPORT. NSTL MAKES NO EXPRESS OR IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE OF ANY PRODUCT MENTIONED IN THIS REPORT.

TESTING AND DISTRIBUTION CENTER, 625 RIDGE PIKE, CONSHOHOCKEN, PA

Executive Summary

Contrary to early conventional wisdom about Windows NT, its file systems do become fragmented. This fragmentation occurs in the normal course of using the operating system. Theoretical analysis and real-world performance testing demonstrate that fragmentation has an adverse impact on system performance. Special structures of the NTFS file system, such as the paging file, directories, and the Master File Table, are especially vulnerable to fragmentation, and allowing them to become fragmented guarantees a decrease in overall system performance. Other NTFS features, such as file system compression, inherently create fragmentation. The best way to avoid these worst-case fragmentation problems, and to keep the system running at optimal performance, is to run defragmentation software on a regularly scheduled basis. Both Windows NT Workstations and NT Servers are subject to these problems, and both can improve system performance through regular defragmentation.

I. Introduction

All computer system design involves trade-offs, and file systems are no exception. One of the major detrimental effects of these trade-offs is fragmentation of files and of the file system. Files usually become fragmented when the file system begins to run out of large contiguous stretches of free space. Rather than deny a file the ability to grow beyond the size of the largest free block on disk, file systems allow different parts of the file to exist in different, non-contiguous locations, and the file system software presents the file to programs running on the computer as one logical unit.

File systems can also become fragmented when files become scattered across the disk, even when the individual files themselves are not split into multiple sections. In the long term, this can happen for the same reasons as those that cause individual file fragmentation, and can occur in the normal course of computer use.

Normal computer use involves the creation and deletion of files, some permanent, some transient. Many typical computer tasks, such as desktop publishing or software development, involve the creation of large numbers of temporary files, of whose presence the user is normally unaware. During the task, the program reads source files and may create temporary files to store data used in a later portion of the task. At the end, the application may write result files and delete the original source files and temporary files. The net result of this process is that small runs of free space appear amidst the allocated space on the hard disk. This, in and of itself, is a form of fragmentation that decreases performance even if individual files are not internally fragmented. Over time, as the larger runs of free space on the hard disk are broken down in this way, individual files become fragmented because the file system lacks the space to allocate a file contiguously.
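The create-and-delete cycle described above can be seen in a toy simulation. The first-fit allocator, the 32-cluster disk, and the file sizes below are all invented for illustration; NTFS's actual allocation policy is more sophisticated, but the end state is the same: a new file that no longer fits in any single free run must be split into extents.

```python
# Hypothetical first-fit cluster allocator. After creating three files and
# deleting the middle one, a larger file cannot fit in one free run and is
# fragmented into two extents.

def free_runs(bitmap):
    """Return (start, length) runs of free clusters in the bitmap."""
    runs, start = [], None
    for i, used in enumerate(bitmap + [True]):  # sentinel closes last run
        if not used and start is None:
            start = i
        elif used and start is not None:
            runs.append((start, i - start))
            start = None
    return runs

def allocate(bitmap, n_clusters):
    """First-fit: fill free runs in order; return the extents used."""
    extents = []
    for start, length in free_runs(bitmap):
        take = min(length, n_clusters)
        for i in range(start, start + take):
            bitmap[i] = True
        extents.append((start, take))
        n_clusters -= take
        if n_clusters == 0:
            return extents
    raise RuntimeError("disk full")

disk = [False] * 32            # 32 free clusters
a = allocate(disk, 8)          # three files allocated back to back
b = allocate(disk, 8)
c = allocate(disk, 8)
for start, length in b:        # delete the middle file, leaving a gap
    for i in range(start, start + length):
        disk[i] = False
d = allocate(disk, 12)         # 12 clusters no longer fit in any one run
print(d)                       # [(8, 8), (24, 4)] -- two fragments
```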
The term for space such as this, which is unallocated to any file but unavailable to some degree because it is split into multiple sections, is external fragmentation. More importantly, as individual files grow, there will not be sufficient adjacent free space for them, and the file system will need to allocate a non-contiguous block of space for new data.

Windows NT also supports the FAT and HPFS file systems, which have fragmentation issues of their own. But these file systems are provided for compatibility with legacy systems, such as DOS and OS/2, and do not support the full gamut of Windows NT features, such as integrated security. Many of the issues explored in this paper apply to those file systems as well, and to fragmentation generically on any operating system, but the focus of this paper is the NTFS file system under Windows NT 4.0.

File Fragmentation and Data Fragmentation are Different

It's important to note the distinctions between fragmentation at different levels of data storage. Individual applications, such as the Microsoft Office programs and database servers like Oracle, have their own issues of fragmentation in their data storage. These issues are generic to all file systems and operating systems; such fragmentation would exist regardless of the file system or operating system. The file system, NTFS in the case of NT, is not aware of the logical organization of your data. Wherever the file may exist on the disk, and whether or not the file is fragmented, the file system presents it to the application as a single contiguous area of storage. But the application's view of the data in that file has a logical structure. To a mailing list program, a file may be a group of first names, last names, addresses, and so on. To the file system it is still just a group of clusters of data. (A cluster is the smallest unit of storage that can be allocated by the operating system on a disk; a cluster may consist of one or more sectors.)

The application may, in its own internal organization of the data in the file, create gaps in the data, i.e., it may fragment it. Much like a file system, when you delete data in an application it may not actually remove the data, but only mark it as deleted. The resulting gaps in the logical storage of data are known as internal fragmentation. Data files may also have allocated but unused space for other reasons. Programs may allocate space in a file in chunks analogous to file system clusters, for their own organizational or performance reasons. They may also use external facilities, such as Windows OLE structured storage, to manage the structure of the data in their files, and these facilities may have their own wasted space. Over time, the growth of such areas will cause the total size of the file to grow, and may slow the performance of the application as head movement on the disk increases, even if the logical amount of live data remains constant. This problem occurs even if the file itself is not fragmented at the file system level, although data fragmentation increases the likelihood of file fragmentation simply because the file itself grows. To combat internal data fragmentation, some applications, such as Microsoft Access, provide utilities to defragment (or "compact") the data in the file. Ironically, these utilities themselves run a substantial risk of increasing fragmentation at the file system level, because they usually create an entirely new copy of the file, consuming large amounts of disk space in the process. Thus, regular defragmentation of your data files may exacerbate fragmentation of your file system.
Lastly, the individual files associated with an application can, over time, become physically dispersed across a disk. This type of fragmentation, known as usage fragmentation, is an especially difficult problem for a defragmentation program, because normal methods of fragmentation analysis may not identify it. Instead, some knowledge of the application's behavior may be necessary in order to rectify the problem. In the future, this problem could, in theory, be managed either by applications providing information about their files to the defragmenter, or by sophisticated analysis of the file system journal.

Fragmentation Can Impede Performance

Almost all hard disks have the same basic design: a stack of circular platters with a series of heads that move across the disk to read concentric circular tracks. In most cases the heads move in lock step: all the heads are always physically located over the same track at once, and this group of tracks is called a cylinder. Hard disks operate at their fastest when they are reading physically sequential data, one track at a time, switching from one head to another within a single cylinder, and then moving on to the next physically adjacent cylinder. Under these circumstances the disk can read or write data and pass it back to the interface and to the computer with a minimum of head movement. If the next data to read or write is stored elsewhere on the disk, the process must wait for the heads to move to the correct cylinder and settle over the appropriate sector within that cylinder. Head movement is expensive in terms of computer performance and, to maximize performance, head movement should be minimized.
Modern hard disks usually read one track of information at a time, so keeping files and free space defragmented also takes maximum advantage of the hard disk's ability to read your data in anticipation of your using it, as well as to cache that data in hardware. The more contiguous your data is on the disk, the more likely it is to be read in a single hard disk read operation. One implication of this is that fragmentation (either internal or external) of a file that lies within a single track on a disk is irrelevant, or at least less relevant, to performance, because head movement will be constant.

All file system designers face a trade-off among several factors, including performance, efficient use of space, and tendency to fragmentation. File systems allocate disk space in units called clusters. If a file consumes less than an exact multiple of the cluster size, the remaining space, often called cluster slack, is technically wasted. But as disks and average file sizes become larger, it makes sense to use larger clusters and risk larger amounts of cluster slack. In a well-designed file system, even if cluster size increases, the overall percentage of space wasted as cluster slack remains small, and as the average size of a file increases, the waste in cluster slack loses its importance. As we will see below, NTFS has special design features that lessen the impact of cluster slack for small files. Real-world experience and research indicate that, while some files have gotten larger over time, average files remain small enough that smaller cluster sizes, 4K or less, are optimal.

NTFS is Very Different from FAT

Windows NT is much smarter than its predecessor operating systems in allocating disk space to files. As a result, it is less prone to fragment files. But as a side effect of preventing file fragmentation, NTFS creates fragmentation in the file system's free space. Still, NTFS is not immune to the forces that fragment individual files, and over time, files on an NTFS volume will become fragmented. Starting in version 4.0, Windows NT provides operating system calls designed to facilitate defragmentation, and defragmentation software for Windows NT usually uses these calls.
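The cluster-slack trade-off discussed above is easy to quantify: the waste per file is the space needed to round its size up to the next cluster multiple. A minimal sketch, in which the sample file sizes are assumptions for illustration, not measurements from NSTL's testing:

```python
# Cluster slack: bytes wasted in each file's final, partially used cluster,
# compared across cluster sizes. File sizes below are invented examples.

def slack(file_size, cluster_size):
    """Bytes wasted rounding file_size up to a multiple of cluster_size."""
    remainder = file_size % cluster_size
    return 0 if remainder == 0 else cluster_size - remainder

sizes = [700, 3_500, 52_000, 1_200_000]          # assumed file-size sample
for cluster in (512, 4096, 65536):
    wasted = sum(slack(s, cluster) for s in sizes)
    total = sum(sizes) + wasted
    print(f"{cluster:>6}-byte clusters: {wasted:>7} bytes of slack "
          f"({100 * wasted / total:.1f}% of allocated space)")
```

With mostly small files, the percentage of space lost grows sharply as the cluster size climbs toward 64K, which is consistent with the paper's conclusion that 4K or smaller clusters are optimal for typical workloads.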
But the design of NTFS, and the practical implications of how these APIs (application programming interfaces) operate, mean that it is important not only to defragment your disks, but to do so on a regular basis.

NTFS Does Get Fragmented

The Windows NTFS file system driver uses a special file called the Master File Table (MFT) to track all files on a volume. The MFT starts out with some free space to allow new files to be tracked, but on a very busy system it too can run out of space. At that point NTFS extends the MFT itself, creating new stretches of it for new allocations. This situation is precipitated most often by fragmentation in the file system itself, as file fragments consume entries in the MFT. If these new stretches are not contiguous, the MFT itself becomes fragmented. There are other files, such as the paging file used by Windows NT's virtual memory subsystem, which can also become fragmented, with unpleasant implications for performance. The solution to these problems, as we will see, is to prevent them from happening by keeping your system defragmented. Lastly, directories in NTFS are allocated similarly to files, but defragmenting them can be difficult.

Performance Degradations Can Impede Productivity

Windows NT does a good job of allowing the system to continue operation even as programs wait for disk I/O, but some inefficiency cannot be hidden forever.

Especially on a mission-critical server on which many users rely, inefficiencies in the file system can lead to performance degradation that impedes user productivity. These problems are not always apparent, and are frequently, cavalierly, blamed on other sources: perhaps the computer's just too slow, needs more memory, or some program being run needs an upgrade. Overall system performance is a complex phenomenon, and even experienced system administrators may not recognize fragmentation in a file system. After all, it can occur even with large amounts of free space on the disk. But the main reason users don't recognize fragmentation is that Windows NT comes with no tools to identify it. Heavily used systems, which are by definition mission-critical systems for an organization, will become fragmented over time under normal usage in Windows NT. As performance decreases in such systems and users are forced to wait, productivity is impeded.

Keeping a Disk Defragmented Can Prevent These Problems

Regular defragmentation of the file system improves overall system performance and, as a result, allows the rest of the system to operate at optimal speed under normal circumstances. Heavily fragmented systems can become difficult to defragment, so it is important, in order to maintain optimal performance, to defragment on a regular basis and prevent especially problematic circumstances, such as a fragmented paging file or MFT, from arising. Windows NT's scheduling service and performance monitoring tools provide an efficient solution by allowing defragmentation to be scheduled for off hours and/or when other load on the system is light.

II. How NTFS Works

NTFS Capabilities in Functional Terms

NTFS is a modern, robust file system designed to support both single-user workstations and multi-user servers.
Microsoft designed NTFS to overcome the most serious limitations of its predecessor file systems, FAT and HPFS, as well as to support planned features of Windows NT, such as integrated security and support for the POSIX standard.

NTFS has very high limits on storage capacity. It uses 64 bits to number clusters, which can each be as large as 64K, meaning that an NTFS disk volume can contain up to 2^64 (16 billion billion) clusters, or 2^80 bytes, and each file can be up to 2^64 bytes. Both FAT and HPFS had much smaller limits. While NTFS is internally capable of managing this much storage, the disk partitioning scheme or hardware addressing may limit the partition size to a smaller number.

NTFS is a recoverable file system. This means that operations in NTFS are transactions, as in a database: either the entire operation completes, or the operating system has the capability to roll back the unfinished portion, safeguarding the integrity of the existing data. NTFS also stores redundant copies of critical file system structures, in case physical damage makes one copy of them inaccessible.

Security is integrated directly into NTFS and derived from the Windows NT object model. Security objects, known as ACLs (Access Control Lists), are stored in the MFT as part of the file. These are the actual security objects used by Windows NT to restrict access to the file object.

Files in NTFS have attributes: a name, a creation date, an archive bit, and so on. In fact, the data in the file is just another attribute. This characteristic of NTFS is how Windows NT implements many of its sophisticated features, such as complex access controls and support for Apple Macintosh clients. Macintosh files, for example, have two sections, a resource fork and a data fork; NTFS manages the association between these sections by storing them in different attributes of the same file. In some ways, the attribute system combats fragmentation, because programmers might otherwise have used additional files to store attribute data. But heavy use of attributes can cause fragmentation within the MFT itself.

Because Windows NT is fully Unicode-enabled, so is NTFS. Names in NTFS file systems are stored in the 16-bit Unicode encoding scheme, where each character in a file name occupies 16 bits in the file's name attribute. Filenames can be up to 255 characters, including multiple periods and embedded spaces.

Master File Table

The heart of the NTFS file system is the Master File Table, or MFT. The MFT is itself a file: an array of records constituting a database of all files on the volume. Each record in the MFT is usually fixed at 1K, and the first 16 records are defined to contain certain volume-specific information; they are known collectively as the NTFS metadata files. Metadata is the name given to these overhead structures in the file system, which are used to track the real data. The first four records are duplicated in a file at or near the physical center of the disk, for recoverability purposes. Normally, each record in the MFT corresponds to one file or directory in the file system. The MFT record contains the file's attributes.
Other standard attribute information in a file record includes the read-only and archive flags; creation and last-accessed dates; the file name, of which there are likely at least two (a long file name and a short 8.3 DOS-compatible name); a security descriptor; and the file data, or pointers to where the file data resides on the disk. Yes, the data in a file is just another attribute in NTFS. For this reason, small files (up to about 750 bytes, depending on the number of other attributes in the file) can fit entirely within their MFT entry, giving Windows NT and NTFS excellent performance with such files. Such files also exhibit zero fragmentation.

There is at least one entry in the MFT for each file on the NTFS volume, including the MFT itself and the other metadata files. These are the files, such as the log file, the bad cluster map, and the root directory, which contain the structure of the rest of the volume as seen by NTFS. Users don't see these files, which all have names beginning with $ (for example, the MFT is in $MFT). Most of the remaining entries in the MFT are for user files and directories.

In a perfect world, that would be it for the MFT. Of course, many files are not so small that their data fits within their MFT entry, so the MFT stores their data in one or more areas of the disk. NTFS allocates files in units of clusters. The clusters within a file are referenced by NTFS in two ways: first, with Virtual Cluster Numbers (VCNs), from 0 through n-1, where there are n clusters in the file; second, with Logical Cluster Numbers (LCNs), which correspond to the number of the cluster on the NTFS volume. Because LCNs are simply an index to the clusters on a volume, NTFS uses an LCN to calculate an address on the disk by simply multiplying the LCN by the number of sectors per cluster, then reading or writing sectors starting at that address.
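The small-file optimization described above amounts to a simple size check against the space left in the MFT record. A sketch under assumptions: the 1K record size is from the paper, but the ~250-byte figure for the record header and other attributes is invented here to reproduce the paper's "about 750 bytes" threshold.

```python
# Resident-vs-non-resident decision for the data attribute (illustrative).
# ASSUMED_OVERHEAD is a made-up figure for the record header, names,
# timestamps, and security descriptor; the real value varies per file.

MFT_RECORD_SIZE = 1024
ASSUMED_OVERHEAD = 250

def data_is_resident(data_size, overhead=ASSUMED_OVERHEAD):
    """True if the file's data fits inside the MFT record itself."""
    return data_size <= MFT_RECORD_SIZE - overhead

print(data_is_resident(600))    # True: no extents, zero fragmentation
print(data_is_resident(5000))   # False: data forced out to an extent
```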

VCNs are the analog of the file offsets requested by applications running under Windows NT. The application knows the format of the data in the file and uses it to calculate a byte offset within the logical format of the file. When the application requests a read or write at that address in the file, NTFS can divide that number by the cluster size to determine a VCN to read or write. By associating VCNs with LCNs, NTFS ties a file's logical addressing to physical locations on the disk.

This mapping of VCN to LCN is what the file's data attributes do. All files have at least one data attribute, known as the unnamed data attribute. There can be other, named data attributes, which correspond to the multiple streams of data referred to above. Directories do not have unnamed data attributes, but they can have named ones.

If any attribute, most likely the file data attribute, does not fit in the MFT record, NTFS stores it in a new, separate set of clusters on the disk, called a run or an extent. In fact, attributes other than the data can become large enough to force new extents. For example, long filenames in Windows NT can be up to 255 characters which, because they are stored in Unicode, consume 2 bytes apiece. An attribute stored within the MFT entry is called a resident attribute; one forced out to an extent is called a non-resident attribute. An extent may need to grow, for instance when the user appends data to a file. In this case, NTFS will attempt to allocate physically contiguous clusters to the same extent. If no more contiguous space is available, NTFS must allocate a new extent elsewhere on the disk; in other words, it will separate the file into two fragments. The data attribute header, still stored within the MFT record, holds the LCNs and run lengths that NTFS uses to locate the extents.
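The VCN-to-LCN mapping and the sector-address arithmetic just described can be sketched as follows. The extent table, cluster size, and sectors-per-cluster values are invented for illustration; only the mechanism (byte offset → VCN → LCN → sector address) follows the paper.

```python
# Toy VCN-to-LCN translation for a file stored in two extents.
# Each extent maps a run of virtual clusters onto logical clusters.

SECTORS_PER_CLUSTER = 8
CLUSTER_SIZE = 4096

# (starting VCN, starting LCN, length in clusters) -- values made up
extents = [(0, 5000, 4), (4, 9200, 6)]

def vcn_to_lcn(vcn):
    """Find the extent containing this VCN and offset into it."""
    for start_vcn, start_lcn, length in extents:
        if start_vcn <= vcn < start_vcn + length:
            return start_lcn + (vcn - start_vcn)
    raise ValueError("VCN beyond end of file")

def byte_offset_to_sector(offset):
    """Application byte offset -> VCN -> LCN -> starting disk sector."""
    vcn = offset // CLUSTER_SIZE
    return vcn_to_lcn(vcn) * SECTORS_PER_CLUSTER

print(vcn_to_lcn(2))                 # 5002: inside the first extent
print(vcn_to_lcn(5))                 # 9201: inside the second extent
print(byte_offset_to_sector(20000))  # 73600: offset 20000 is VCN 4, LCN 9200
```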
In rare cases, usually when the number of attributes is large enough, NTFS may be forced to allocate an additional MFT entry for the file. In this case, NTFS creates an attribute called an attribute list, which acts as an index to all the attributes of the file or directory. This is an unusual situation that should occur only with files that are extremely large and fragmented, and it can greatly slow the performance of operations on that file.

Directories

Directories are very much like files in NTFS. If the directory is small enough, the index to the files to which it points can fit in the MFT record, in an attribute called the Index Root attribute. If enough entries are present, NTFS will create a new extent with a non-resident attribute called an index buffer. In such directories, the index buffers contain what is called a b+ tree, a data structure designed to minimize the number of comparisons needed to find a particular file entry. A b+ tree stores information (or indexes to that information) in sorted order. At points in the directory, NTFS stores sorted groups of entries along with pointers to entries that fall below those entries in the sort. This has many advantages over storing entries in whatever order they happen to fall. For example, if you want a sorted list of the entries in the directory, your request is satisfied quickly, because that is the order of storage in the index buffer. If you want to look up a particular entry, the lookup is quick because the trees tend to grow wide rather than deep, which minimizes the number of accesses necessary to reach a particular point in the tree.
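The benefit of keeping directory entries sorted can be illustrated with a binary search over sorted names. This flat, single-level sketch is a stand-in for NTFS's multi-level b+ tree, but it shows the same two properties the paper highlights: a sorted listing is free (it is the storage order), and lookups take roughly log2(n) comparisons instead of a linear scan.

```python
# Directory lookup against a sorted index, using the standard library's
# bisect for the binary search. Names are invented sample entries.

import bisect

names = sorted(f"file{i:04}.txt" for i in range(1000))

def lookup(name):
    """Binary search: ~10 probes for 1000 entries. Returns index or -1."""
    i = bisect.bisect_left(names, name)
    return i if i < len(names) and names[i] == name else -1

print(lookup("file0042.txt"))   # 42: found
print(lookup("missing.txt"))    # -1: not present
print(names[:3])                # sorted listing is simply the storage order
```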

Compression

NTFS supports compression of file data as a native function of the file system. One of the side effects of compression is that it can create fragmentation of files and of free space. You can instruct NTFS to compress data on an entire volume, in a specific directory, or even in a particular file. There are Win32 calls programs can use to determine the impact of compression, in particular the compressed and uncompressed file sizes. If you get a file's properties in Windows NT Explorer, you will see both sizes.

It is in this compression scheme that you begin to see the flexibility created by NTFS's use of both VCNs and LCNs, as well as the potential for problems. In a normal file that has data stored in non-resident attributes, or extents, the data attribute contains mappings of the starting VCN and starting LCN of each extent, as well as its length in clusters. NTFS plays games with these cluster numbers to achieve compression, using two basic approaches. Because some large files have large blocks of nulls (bytes of value 0), NTFS uses sparse storage for such files, meaning that it stores only the non-zero data. Imagine a 100-cluster file in which only the first 5 and last 5 clusters contain data, and the middle 90 are all zeroes. NTFS can store two extents for this file, each 5 clusters long: the first will have VCNs 0 through 4 and the second VCNs 95 through 99. NTFS can infer that VCNs 5 through 94 are null and need no physical storage. If a program requests data in this space, NTFS simply fills the requesting program's buffer with nulls. If the program writes non-zero data to this space, NTFS can create a new extent with the appropriate VCNs. This method is very fast for sparse files.

If a file is not predominantly null, NTFS uses a different compression method. Instead of trying to write the file data in one extent, NTFS divides the data into runs of 16 clusters apiece.
For any particular run, if compressing the data will save at least 1 cluster, NTFS stores the compressed data, meaning 15 or fewer clusters. If the data cannot be effectively compressed (random data, for example, is generally not compressible), NTFS simply stores the entire run as it normally would without compression. Back in the MFT record for this file, NTFS can see that there are missing VCNs in the runs for the file and can infer that the file is compressed. Because the data is stored in a compressed form, it is not possible to look up a specific byte by calculating the cluster in which it is stored. Instead, NTFS calculates which 16-cluster run the address falls in, decompresses that run back to 16 uncompressed clusters, and then calculates the offset into the file using valid virtual cluster numbers. NTFS ensures that all these runs begin with a virtual cluster number divisible by 16, so that this addressing remains possible without decompressing the entire file.

NTFS tries to write runs of this type into a single contiguous space, because the I/O system already encounters enough added processing and management burden with compressed files without having to fragment individual extents. This is part of the reason the NTFS designers chose 16 clusters as the size of a compressed run: it cannot be more than 64K, because the file system buffers are 64K each, and it is very likely to be read in a single I/O operation. NTFS also tries to keep all the separate runs of the file contiguous, but this is a harder job. Compressed files are more likely than non-compressed files to be fragmented. NTFS compresses only the file's data attribute, not the metadata, and compression works only on volumes with clusters of 4K or smaller.
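The per-run decision above can be sketched in a few lines, with zlib standing in for NTFS's own compression algorithm (which this paper does not specify): compress a 16-cluster run, keep the compressed form only if it fits in 15 or fewer clusters, otherwise store the run raw.

```python
# Per-run compression decision (illustrative; zlib is NOT NTFS's algorithm).
# A run is 16 clusters; compression is kept only if it saves >= 1 cluster.

import os
import zlib

CLUSTER = 4096
RUN = 16 * CLUSTER   # 64K, the unit NTFS compresses independently

def store_run(data):
    """Return ('compressed' | 'raw', clusters actually stored)."""
    packed = zlib.compress(data)
    clusters_needed = -(-len(packed) // CLUSTER)   # ceiling division
    if clusters_needed <= 15:                      # saves at least 1 cluster
        return "compressed", clusters_needed
    return "raw", 16

text_like = (b"the quick brown fox " * (RUN // 20 + 1))[:RUN]  # compressible
random_like = os.urandom(RUN)                                  # incompressible

print(store_run(text_like))      # stored compressed, in few clusters
print(store_run(random_like))    # ('raw', 16): no saving possible
```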

Software RAID

NTFS also supports fault tolerance in disk subsystems by dynamically mirroring or striping data across multiple disk volumes. Windows NT supports RAID levels 1 and 5. In level 1, known as mirroring, data written to a volume is written in parallel to a second volume; data read from the volume can also be read from the second volume and compared for correctness. In level 5, known as striping with parity, data streams ("stripes") are divided among three or more disks, with some of the space used to store parity information. If one of the disks registers a physical error, NT can calculate the missing data from the remaining data and the parity information, using the logical exclusive-or (XOR) operation.

Dynamic Bad-Cluster Remapping

NTFS can dynamically detect the presence of a physically bad cluster and map around it. If, on a disk formatted as an NTFS fault-tolerant volume, the NTFS driver attempts to read a cluster and the operation fails due to a physical read error, the NTFS fault tolerance driver dynamically retrieves a good copy of the data that had been stored in the bad sector, using the striped or mirrored volume. NTFS then maps a new cluster to replace the bad one, writes the data to it, and marks the bad cluster so that it is no longer used. On a non-fault-tolerant volume, NTFS can still detect bad clusters and mark them as such, but it cannot necessarily retrieve the data. Remapping a bad cluster almost certainly fragments the file, into at least three fragments. Today's hardware is usually reliable, and it is good that NT has the capability to maintain the integrity of files in this way, but the potential for sudden fragmentation in critical files is another reason to defragment file systems on a regular basis.

Disk Caching

Windows NT's I/O Manager integrates a Cache Manager that is involved in all disk I/O.
When an application attempts to read data that has not been loaded into the cache, the Cache Manager interacts with the Windows NT Virtual Memory Manager, which calls the NTFS file system driver to load the data into the cache. Similarly, the Cache Manager uses the memory manager to perform all disk writes using background threads. Unless instructed otherwise, NT's Cache Manager caches all reads and writes on all secondary media.

The Cache Manager uses a number of aggressive techniques to improve performance. For example, it will attempt to read ahead in a file in anticipation of a program requesting the data that follows. It will also delay writes to the disk, so that if reads or writes of the same data occur soon afterward, they are satisfied from the cache rather than by a physical disk operation. Aggressive disk caching can mitigate the effects of disk fragmentation to the extent that data read by applications comes from the cache rather than from the disk itself. In fact, adding memory to a heavily fragmented system can improve its performance, although this is an expensive solution to a problem that can be fixed at little cost through software and good practices.

Volume Sets

The NT fault-tolerance driver also provides some functions unrelated to fault tolerance, including volume sets. A volume set is a single logical volume composed of areas of free space on one or more disks. Using the NT Disk Administrator utility, you can combine two 100MB free areas on different disks into a single logical 200MB volume. These volume sets can be formatted with any NT-supported file system, although there are advantages to using NTFS. Volume sets are useful for combining smaller disks, or free space on larger disks, into a single, more useful area that can be treated as a logical unit. If the volume is formatted with NTFS, the administrator can add new stretches of free space to the volume set while maintaining data on the existing volume. This can be a low-impact way for network administrators to add storage to an existing network drive without changing users' view of the network.

The problem with volume sets, from a fragmentation standpoint, is that they can exacerbate normal fragmentation into even more performance-limiting fragmentation across physical volumes, or across physically separate free stretches of a single volume. Windows NT file systems do not see that they are working with multiple volumes, and therefore treat a volume set as they would any single physical device.

Paging Files

Paging files present a special fragmentation problem under Windows NT. NT supports up to 16 paging files on a system. These files are used for virtual memory: as Windows NT and its applications use memory in excess of the physical RAM, the Virtual Memory Manager writes the least recently used areas of memory to the paging files to free RAM. If a program later accesses those areas of memory, the Virtual Memory Manager reads them from the paging file back into RAM, where the program can use them. Once the system starts up, these files are always open and cannot be moved or deleted: at startup, the Windows NT System process duplicates the handles for the paging files so that they remain open, and the operating system prevents any other process from deleting or moving them. For this reason, paging files are a problem for defragmentation software.
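The way a volume set concatenates free areas can be sketched as a simple logical-to-physical translation. This is an illustrative model, not the actual FtDisk data structures; the disk names and offsets are invented:

```python
# Illustrative model of a volume set: logical byte offsets are mapped,
# in order, onto a list of (disk, start, length) member areas. The
# member areas below are hypothetical examples.

def locate(members, logical_offset):
    """Translate a logical offset into a (disk, physical_offset) pair."""
    for disk, start, length in members:
        if logical_offset < length:
            return (disk, start + logical_offset)
        logical_offset -= length
    raise ValueError("offset beyond end of volume set")

# Two 100MB free areas on different disks form one 200MB logical volume.
MB = 1024 * 1024
members = [("Disk0", 50 * MB, 100 * MB),   # free area on disk 0
           ("Disk1", 10 * MB, 100 * MB)]   # free area on disk 1

# An offset in the second half of the logical volume lands on Disk1.
assert locate(members, 150 * MB) == ("Disk1", 60 * MB)
```

Note that a file laid out contiguously in the logical address space may still straddle the member boundary, so reading it requires I/O on two physical devices: exactly the cross-volume fragmentation described above.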
To defragment the paging file safely, a defragmenter must do so at system boot time, before the Virtual Memory Manager gets a chance to lock the file down. While this capability is desirable, regularly rebooting a system just to defragment it is not, so the best approach is to keep the rest of the file system defragmented and thereby mitigate any fragmentation problems caused by the existence of paging files.

III. How NTFS Gets Fragmented

Normal Creation and Deletion of Extents

In the normal course of computing, on any operating system, files are created and deleted, both visibly and invisibly. This process creates gaps in the used portions of physical storage. As a disk fills and use of it becomes heavier, the large areas of free space present early in the system's life tend to break down into smaller free areas scattered throughout the volume.

Many programs explicitly retain the last version, or several versions, of the file the user is working on. Eventually the backup versions are deleted and their space is freed, leaving a gap in the disk's free space. Or consider downloading the latest version of Netscape Communicator. You might download a 20MB executable program and run it, creating another 20MB or more of files in the Program Files directory; then you will likely delete the 20MB file you downloaded. The result is a 20MB gap, possibly in one place, possibly split up, with the newly installed program likely stored after the gap on the disk. The operating system is left with fewer large free areas to work with.
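The download-install-delete pattern above can be simulated with a toy allocator. This is a deliberately simplified first-fit model; the cluster counts and file sizes are invented for illustration:

```python
# Toy first-fit allocator illustrating how create/delete cycles break
# large free areas into smaller ones. Sizes are arbitrary examples.

def allocate(disk, size, tag):
    """First-fit: fill free clusters (None) left to right. A file may
    end up split across several gaps, i.e. fragmented."""
    placed = []
    for i, cluster in enumerate(disk):
        if cluster is None:
            disk[i] = tag
            placed.append(i)
            if len(placed) == size:
                return placed
    raise MemoryError("disk full")

def free(disk, tag):
    """Delete a file, turning its clusters back into free space."""
    for i, cluster in enumerate(disk):
        if cluster == tag:
            disk[i] = None

disk = [None] * 20                      # 20 free clusters
allocate(disk, 5, "download")           # the downloaded archive
allocate(disk, 6, "program")            # the files it installs
free(disk, "download")                  # delete the archive: a gap opens
extents = allocate(disk, 8, "bigfile")  # the next large file is split

# "bigfile" fills the 5-cluster gap plus clusters after "program":
# two extents where a fresh disk would have given it one.
```

Even this tiny model shows the mechanism: the deleted file's gap is reused first, and anything larger than the gap spills past the surviving files, producing a fragmented layout.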

But programs and the operating system also create files on their own, without telling the user. Consider the print spooler. Ever since the early versions of DOS, when you print a file, your program and the operating system actually perform at least two steps. First, a file is created containing the data printed by the application; in the case of Windows applications, this is in an intermediate format called Windows Metafile Format (WMF). The printer driver for your printer then converts this data into a separate file in the printer's native format, and the spooler sends that file to the printer. All this data consumes disk space temporarily and is then deleted. Printing a large document consumes a correspondingly large amount of disk space.

The Impact of Unusual Events

Such normal events can cause fragmentation, but only over a long time and a lot of use. Fragmentation in NTFS is easy to create, however, using unusual but not unreasonable techniques. The example above of downloading a large file and installing it is a minor case. To date, there have been five service packs for Windows NT 4.0. Each has involved a multi-megabyte download, and each changes a large number of NT system programs likely to be stored at the front of the disk. Installing a service pack is therefore likely to push NT system programs further out on the disk, creating gaps. Large service packs may cause fragmentation within NT files themselves, and they certainly make fragmentation of other files more likely. Consider also that Microsoft SQL Server, BackOffice, Office 97, and many other common NT programs have their own service packs, and that installing them carries all the same implications as installing an NT service pack. Application upgrades have the same implications as well. You'd think that installing a new Windows NT Workstation would start the system out in a clean, unfragmented state, but even this is not necessarily true.
Even a clean install will likely end up fragmented, because the installation process creates numerous files and directories that it then deletes. The subsequent application of service packs exacerbates the situation. Nor is it unusual for a user to install NT on a system with an existing FAT-formatted drive. NT can convert the drive to NTFS, but doing so requires moving files around in ways that fragment free space.

Checkpoints

Aggravating the problem is the fact that NTFS does not immediately make deallocated clusters available to other programs. Instead, they become available after the next time NTFS checkpoints the disk. Checkpointing is part of NTFS's facility for recovering from errors. As stated above, I/O operations in NTFS are transactions. As it performs an I/O operation, such as appending data to a file, NTFS logs undo and redo data for that operation. At some point between transactions, when the disk is known to be in a good state, NTFS writes a checkpoint record to its log. If NT detects a disk error while performing an operation, it enters a recovery procedure consisting of three passes: the analysis pass, the redo pass, and the undo pass. In the analysis pass, NTFS determines which parts of the operation failed and which clusters it must update in order to undo the transaction. In the redo pass, NTFS reapplies all other operations that were logged since the last checkpoint.

Then, in the undo pass, it rolls back any uncommitted operations in the offending transaction. Because NTFS cannot be certain of the disposition of data in a cluster until a checkpoint, it cannot allow other data to be written to that cluster. Note that no errors need occur for this to happen. It is unlikely to affect a large amount of disk space, but it happens every time a cluster is freed, and in the long term it tends to push data further out on the disk and thus to diminish the average size of a free area.

Increased Head Movement from Disparity of Extents

As stated above, an I/O subsystem operates at maximum speed when the disk transfers data to or from adjacent sectors, because under those circumstances the heads have to move a minimum amount. Head movement is the enemy of I/O performance. It is a rare event indeed when the disk gets to read or write contiguously for a long time; it is normal for the heads to move around as Windows NT reads and writes different files in the normal course of its business. Consider, for example, the checkpointing system described above, which allows Windows NT to recover the file system to a correct state even after a power failure or physical disk error. The undo, redo, and checkpoint information that makes recoverability possible is stored in a log file maintained by the NT Log File Service (LFS). Periodically, in the course of writing to some other part of the disk, NTFS writes log entries about the operations it is performing to this log file.

Head movement is also inevitable when the operating system pages memory out to disk. The Virtual Memory Manager will begin to page memory out even before free memory is exhausted. This is a reasonable policy, but it may negatively affect the performance of disk-intensive applications. In a heavily trafficked system, paging to and from disk is common, and it consumes both CPU and disk time.
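The redo/undo recovery scheme described under Checkpoints can be sketched with a toy write-ahead log. The record format and the collapsing of the analysis pass into a set lookup are simplifications for illustration; NTFS's actual LFS records are far more involved:

```python
# Toy write-ahead log with redo/undo recovery. The record tuples and
# the simplified "analysis" step are illustrative, not NTFS's LFS.

def recover(log, clusters):
    """Replay a log: redo committed operations, undo uncommitted ones."""
    # "Analysis" pass: find which transactions committed.
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    # Redo pass: reapply logged writes belonging to committed transactions.
    for op, txn, *rest in log:
        if op == "write" and txn in committed:
            cluster, _old, new = rest
            clusters[cluster] = new
    # Undo pass: roll back, in reverse order, writes that never committed.
    for op, txn, *rest in reversed(log):
        if op == "write" and txn not in committed:
            cluster, old, _new = rest
            clusters[cluster] = old
    return clusters

log = [
    ("write", 1, 7, "free", "fileA"),   # txn 1 wrote cluster 7
    ("commit", 1),                      # txn 1 committed
    ("write", 2, 9, "free", "fileB"),   # txn 2 crashed before committing
]
state = recover(log, {7: "?", 9: "fileB"})
assert state == {7: "fileA", 9: "free"}  # txn 1 redone, txn 2 undone
```

Because every write is logged with both its old and new values before the clusters themselves change, the volume can be driven back to a consistent state regardless of where the failure occurred.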
Even with the normal amount of head movement that occurs in a system, an application can perform at or near full speed. But fragmentation in data or program files can significantly increase the time it takes to perform disk operations.

Cluster Size Issues, Trade-offs with Capacity and Performance

When you format a volume with NTFS, you have a choice of cluster size. Windows NT has different default cluster sizes for different volume sizes. This default is a simple association, and knowledge of how the volume will actually be used can let you choose a cluster size closer to optimal than the default. Depending on your priorities, you might want a different cluster size, but be careful: a smaller cluster size wastes less space but is more likely to lead to fragmentation, while a larger cluster size is less likely to lead to fragmentation but wastes more space. 512-byte clusters are particularly problematic, since MFT records are always 1,024 bytes; on a system with 512-byte clusters, individual MFT entries can themselves be fragmented. MFT record fragmentation of this type is not possible with larger cluster sizes, which can hold one or more complete MFT records.

If a file or directory is contiguous, the cluster size doesn't matter, except to the extent that it wastes a small amount of space. It is therefore wise to choose a cluster size large enough to discourage any more fragmentation than you are likely to encounter on NT anyway. If you know that you have a very large number of small files, or very few small files, you have information you can use to make a better cluster-size decision. Also, a very large absolute number of files (on the order of 100,000) makes fragmentation of the MFT more likely; in that case, a larger cluster size will limit the fragmentation of the MFT as it grows to accommodate them. Note that it is possible to create an NTFS volume with a cluster size greater than 4K; however, if you do, you cannot use NTFS compression, nor can you defragment the volume through the built-in, Microsoft-supported defragmentation interface.

System Files (Principally, but Not Exclusively, the Paging File)

DOS and Windows have a small number of files that are marked as system files, which makes them invisible and unmovable. Windows NT makes far greater use of such files, and they consume a non-trivial portion of disk space, especially on a boot volume. Windows NT has two kinds of system files. The first kind are the files that constitute the structure and overhead of the NTFS file system itself; call them NTFS system files. The MFT is one such file (named $Mft), with special implications that we deal with below. There is also a copy of the first four records of the MFT, named $Mftmirr, stored near the physical middle of the disk, as well as the Log File ($Logfile), the Volume file ($Volume), the Attribute Definition Table ($Attrdef), the Root Directory File ($.), the Cluster Bitmap ($Bitmap), the Partition Boot Sector ($Boot), the Bad Cluster File ($Badclus), the Quota Table ($Quota), and the Upcase Table ($Upcase). ($Quota is not used in NT 4.0, but Windows 2000 uses it to implement user storage quotas.) These files are always present on an NTFS volume.
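The capacity side of the cluster-size trade-off discussed above is easy to quantify: every file wastes, on average, half a cluster of slack at its tail. The file sizes below are hypothetical, chosen only to make the effect visible:

```python
# Slack-space calculation for different cluster sizes. The file sizes
# are hypothetical; the point is the trade-off, not the exact numbers.

def slack(file_sizes, cluster):
    """Total bytes wasted in each file's final, partially filled cluster."""
    total = 0
    for size in file_sizes:
        used = -(-size // cluster) * cluster   # round up to whole clusters
        total += used - size
    return total

files = [300, 1_500, 4_096, 70_000]            # file sizes in bytes

for cluster in (512, 4096, 65536):
    print(f"{cluster:>6}-byte clusters waste {slack(files, cluster)} bytes")
```

For these four files, 512-byte clusters waste a few hundred bytes while 64K clusters waste roughly a quarter megabyte; the compensating benefit of the larger cluster is that each file occupies far fewer clusters, so it has far fewer opportunities to fragment.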
The APIs that Windows NT provides to support defragmentation do not move these files, so they cannot be defragmented while Windows NT is running. There are many other such files, and as with DOS, they present problems for defragmenters; call them Windows NT system files. For example, NTDETECT.COM, the hardware detection program, and ntldr, the Windows NT boot loader, are Windows NT system files. Some notebooks, with proper support, have large hibernation files the size of physical memory. Most important to everyday use and performance, however, is the paging file. Disk I/O to the paging file (\pagefile.sys) is almost always heavily fragmented, because the pages being read or written for one process are not guaranteed to be adjacent to the pages accessed next. This is one of the most crucial files for Windows NT's overall performance, because access to it usually occurs at a point where performance is already constrained by memory. A large number of fragments in the paging file brings a severe performance penalty.

Fragmentation of Directories

NTFS treats directories almost exactly as it treats files. In fact, directories are just another type of file, although they have special types of attributes in their MFT records. Normally applications manage the contents of their files; in the case of directories, it is NTFS that manages the contents, which are b+ trees providing indexed access to the files in each directory.

Some directories, such as most application program file directories, aren't likely to grow or shrink much over their lifetimes. But some directories, such as the TEMP directory or user document directories, are likely to grow and shrink considerably. As the number of files in a directory grows, NTFS grows the directory storage to accommodate it. In the right circumstances, if the contents of the directory shrink, NTFS can also free the unused space in the directory, but this does not happen very often. The directories that are likely to grow and shrink are also the kind likely to have been created early in the system's life, such as My Documents and TEMP; therefore it is likely that, as they grow, their growth will be non-contiguous. These are also likely to be heavily used directories, so this fragmentation is likely to have a real impact on system usage.

Users should also be aware that deeply nested directories may be an organizational convenience, but they carry a performance penalty. When NTFS searches its b+ trees for data, it does so once for each level in the directory subtree. Performance may therefore be better with flatter trees that hold larger numbers of files in each directory than with deeper trees that hold fewer. Very deep subtrees can also create problems for applications that limit the number of characters in a complete file path; many applications limit such a path to 255 characters.

Fragmentation of the MFT Itself

Normally the MFT uses one entry per file or directory. The area on the disk reserved for the MFT begins life, at the time the volume is formatted, with about 12.5% of the total volume space reserved for it. This reserved space (the "MFT zone") and the MFT itself are not movable. If everything goes well, the MFT as pre-allocated will be more than up to the task of tracking file and directory metadata. But when a file becomes very fragmented, the amount of data NTFS needs to store in that file's MFT record in order to track the various fragments, or extents, increases.
Eventually the MFT record is not large enough to hold this data, and NTFS must allocate another record. Because of this, keeping the disk generally defragmented helps keep the MFT itself from fragmenting. Part of the problem with the MFT is that it will grow if necessary, but it never contracts. In a system with a large number of files, or one that is heavily fragmented, the MFT may run out of available entries, in which case NTFS expands the MFT in 32-record chunks. Because use of the volume after formatting creates files physically following the MFT zone, these expansions can be made contiguously only as long as no other files have encroached on the MFT zone. The new entries contain metadata describing recently created files that are likely to be in active use, and if the expansion is not contiguous, performance in using those files will suffer greatly. As noted above, if the MFT does begin to fragment, a larger cluster size on the volume is better, as it will limit the number of fragments.

Temporary files are one of the principal ways in which large numbers of files are created, and the effect is insidious. Users usually aren't aware of the number of temporary files created during operations like compiling, word processing, and even browsing the Internet; Microsoft's Internet Explorer creates a particularly large number of temporary files. Heavy use of such files, and failure to clean them up, can fragment not just files and free space but the MFT itself. Users should employ the utilities included with recent versions of Windows, or available from third parties, to clean up unused temporary files, shortcuts that point to nowhere, and the other Windows droppings that accumulate over time, and should run these utilities on a regular basis.
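The relationship between file fragmentation and MFT growth described above can be sketched numerically. The per-record capacity below is an invented round number; real MFT records store variable-length run lists, so the true figure depends on the runs themselves:

```python
# Sketch of why fragmented files consume extra MFT records. A 1,024-byte
# MFT record holds only a limited number of extent (run) entries; the
# capacity used here is a hypothetical round number, not the real format.

EXTENTS_PER_RECORD = 40  # hypothetical extent entries per MFT record

def mft_records_needed(fragments):
    """One base record, plus extension records once the extent list
    no longer fits in the base record."""
    if fragments <= EXTENTS_PER_RECORD:
        return 1
    extra = fragments - EXTENTS_PER_RECORD
    return 1 + -(-extra // EXTENTS_PER_RECORD)  # ceiling division

assert mft_records_needed(1) == 1     # a contiguous file: one record
assert mft_records_needed(40) == 1    # still fits in the base record
assert mft_records_needed(200) == 5   # heavy fragmentation: 4 extra records
```

Once extension records are needed, reading the file costs extra MFT lookups on top of the extra seeks between fragments, which is why defragmenting files also protects the MFT itself.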


More information

DISK DEFRAG Professional

DISK DEFRAG Professional auslogics DISK DEFRAG Professional Help Manual www.auslogics.com / Contents Introduction... 5 Installing the Program... 7 System Requirements... 7 Installation... 7 Registering the Program... 9 Uninstalling

More information

Typical File Extensions File Structure

Typical File Extensions File Structure CS 355 Operating Systems File Systems File Systems A file is a collection of data records grouped together for purpose of access control and modification A file system is software responsible for creating,

More information

File Systems. What do we need to know?

File Systems. What do we need to know? File Systems Chapter 4 1 What do we need to know? How are files viewed on different OS s? What is a file system from the programmer s viewpoint? You mostly know this, but we ll review the main points.

More information

Chapter 8. Operating System Support. Yonsei University

Chapter 8. Operating System Support. Yonsei University Chapter 8 Operating System Support Contents Operating System Overview Scheduling Memory Management Pentium II and PowerPC Memory Management 8-2 OS Objectives & Functions OS is a program that Manages the

More information

a process may be swapped in and out of main memory such that it occupies different regions

a process may be swapped in and out of main memory such that it occupies different regions Virtual Memory Characteristics of Paging and Segmentation A process may be broken up into pieces (pages or segments) that do not need to be located contiguously in main memory Memory references are dynamically

More information

Chapter 9 Memory Management Main Memory Operating system concepts. Sixth Edition. Silberschatz, Galvin, and Gagne 8.1

Chapter 9 Memory Management Main Memory Operating system concepts. Sixth Edition. Silberschatz, Galvin, and Gagne 8.1 Chapter 9 Memory Management Main Memory Operating system concepts. Sixth Edition. Silberschatz, Galvin, and Gagne 8.1 Chapter 9: Memory Management Background Swapping Contiguous Memory Allocation Segmentation

More information

Disks and I/O Hakan Uraz - File Organization 1

Disks and I/O Hakan Uraz - File Organization 1 Disks and I/O 2006 Hakan Uraz - File Organization 1 Disk Drive 2006 Hakan Uraz - File Organization 2 Tracks and Sectors on Disk Surface 2006 Hakan Uraz - File Organization 3 A Set of Cylinders on Disk

More information

Windows 7 Overview. Windows 7. Objectives. The History of Windows. CS140M Fall Lake 1

Windows 7 Overview. Windows 7. Objectives. The History of Windows. CS140M Fall Lake 1 Windows 7 Overview Windows 7 Overview By Al Lake History Design Principles System Components Environmental Subsystems File system Networking Programmer Interface Lake 2 Objectives To explore the principles

More information

Lecture 7. Memory Management

Lecture 7. Memory Management Lecture 7 Memory Management 1 Lecture Contents 1. Memory Management Requirements 2. Memory Partitioning 3. Paging 4. Segmentation 2 Memory Memory is an array of words or bytes, each with its own address.

More information

Operating Systems Design Exam 2 Review: Spring 2011

Operating Systems Design Exam 2 Review: Spring 2011 Operating Systems Design Exam 2 Review: Spring 2011 Paul Krzyzanowski pxk@cs.rutgers.edu 1 Question 1 CPU utilization tends to be lower when: a. There are more processes in memory. b. There are fewer processes

More information

Virtual Memory. Chapter 8

Virtual Memory. Chapter 8 Chapter 8 Virtual Memory What are common with paging and segmentation are that all memory addresses within a process are logical ones that can be dynamically translated into physical addresses at run time.

More information

File System Internals. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

File System Internals. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University File System Internals Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Today s Topics File system implementation File descriptor table, File table

More information

Chapter 11: Implementing File Systems

Chapter 11: Implementing File Systems Chapter 11: Implementing File Systems Operating System Concepts 99h Edition DM510-14 Chapter 11: Implementing File Systems File-System Structure File-System Implementation Directory Implementation Allocation

More information

RAID SEMINAR REPORT /09/2004 Asha.P.M NO: 612 S7 ECE

RAID SEMINAR REPORT /09/2004 Asha.P.M NO: 612 S7 ECE RAID SEMINAR REPORT 2004 Submitted on: Submitted by: 24/09/2004 Asha.P.M NO: 612 S7 ECE CONTENTS 1. Introduction 1 2. The array and RAID controller concept 2 2.1. Mirroring 3 2.2. Parity 5 2.3. Error correcting

More information

CS 416: Opera-ng Systems Design March 23, 2012

CS 416: Opera-ng Systems Design March 23, 2012 Question 1 Operating Systems Design Exam 2 Review: Spring 2011 Paul Krzyzanowski pxk@cs.rutgers.edu CPU utilization tends to be lower when: a. There are more processes in memory. b. There are fewer processes

More information

CS 111. Operating Systems Peter Reiher

CS 111. Operating Systems Peter Reiher Operating System Principles: File Systems Operating Systems Peter Reiher Page 1 Outline File systems: Why do we need them? Why are they challenging? Basic elements of file system design Designing file

More information

Chapter 11: Implementing File-Systems

Chapter 11: Implementing File-Systems Chapter 11: Implementing File-Systems Chapter 11 File-System Implementation 11.1 File-System Structure 11.2 File-System Implementation 11.3 Directory Implementation 11.4 Allocation Methods 11.5 Free-Space

More information

Chapter 8: Filesystem Implementation

Chapter 8: Filesystem Implementation ADRIAN PERRIG & TORSTEN HOEFLER ( 252-0062-00 ) Networks and Operating Systems Chapter 8: Filesystem Implementation source: xkcd.com Access Control 1 Protection File owner/creator should be able to control:

More information

CS 4284 Systems Capstone

CS 4284 Systems Capstone CS 4284 Systems Capstone Disks & File Systems Godmar Back Filesystems Files vs Disks File Abstraction Byte oriented Names Access protection Consistency guarantees Disk Abstraction Block oriented Block

More information

Ricardo Rocha. Department of Computer Science Faculty of Sciences University of Porto

Ricardo Rocha. Department of Computer Science Faculty of Sciences University of Porto Ricardo Rocha Department of Computer Science Faculty of Sciences University of Porto Slides based on the book Operating System Concepts, 9th Edition, Abraham Silberschatz, Peter B. Galvin and Greg Gagne,

More information

Chapter 6 Memory 11/3/2015. Chapter 6 Objectives. 6.2 Types of Memory. 6.1 Introduction

Chapter 6 Memory 11/3/2015. Chapter 6 Objectives. 6.2 Types of Memory. 6.1 Introduction Chapter 6 Objectives Chapter 6 Memory Master the concepts of hierarchical memory organization. Understand how each level of memory contributes to system performance, and how the performance is measured.

More information

22 File Structure, Disk Scheduling

22 File Structure, Disk Scheduling Operating Systems 102 22 File Structure, Disk Scheduling Readings for this topic: Silberschatz et al., Chapters 11-13; Anderson/Dahlin, Chapter 13. File: a named sequence of bytes stored on disk. From

More information

Chapter 12 File-System Implementation

Chapter 12 File-System Implementation Chapter 12 File-System Implementation 1 Outline File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Efficiency and Performance Recovery Log-Structured

More information

Memory Management. Memory Management

Memory Management. Memory Management Memory Management Chapter 7 1 Memory Management Subdividing memory to accommodate multiple processes Memory needs to be allocated efficiently to pack as many processes into memory as possible 2 1 Memory

More information

Chapter 11: Implementing File Systems. Operating System Concepts 8 th Edition,

Chapter 11: Implementing File Systems. Operating System Concepts 8 th Edition, Chapter 11: Implementing File Systems, Silberschatz, Galvin and Gagne 2009 Chapter 11: Implementing File Systems File-System Structure File-System Implementation Directory Implementation Allocation Methods

More information

WHITE PAPER. Optimizing Virtual Platform Disk Performance

WHITE PAPER. Optimizing Virtual Platform Disk Performance WHITE PAPER Optimizing Virtual Platform Disk Performance Optimizing Virtual Platform Disk Performance 1 The intensified demand for IT network efficiency and lower operating costs has been driving the phenomenal

More information

OPERATING SYSTEMS CS136

OPERATING SYSTEMS CS136 OPERATING SYSTEMS CS136 Jialiang LU Jialiang.lu@sjtu.edu.cn Based on Lecture Notes of Tanenbaum, Modern Operating Systems 3 e, 1 Chapter 4 FILE SYSTEMS 2 File Systems Many important applications need to

More information

File Systems Management and Examples

File Systems Management and Examples File Systems Management and Examples Today! Efficiency, performance, recovery! Examples Next! Distributed systems Disk space management! Once decided to store a file as sequence of blocks What s the size

More information

Process size is independent of the main memory present in the system.

Process size is independent of the main memory present in the system. Hardware control structure Two characteristics are key to paging and segmentation: 1. All memory references are logical addresses within a process which are dynamically converted into physical at run time.

More information

Chapter 11: Implementing File

Chapter 11: Implementing File Chapter 11: Implementing File Systems Chapter 11: Implementing File Systems File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Efficiency

More information

CS 537 Fall 2017 Review Session

CS 537 Fall 2017 Review Session CS 537 Fall 2017 Review Session Deadlock Conditions for deadlock: Hold and wait No preemption Circular wait Mutual exclusion QUESTION: Fix code List_insert(struct list * head, struc node * node List_move(struct

More information

CS3600 SYSTEMS AND NETWORKS

CS3600 SYSTEMS AND NETWORKS CS3600 SYSTEMS AND NETWORKS NORTHEASTERN UNIVERSITY Lecture 11: File System Implementation Prof. Alan Mislove (amislove@ccs.neu.edu) File-System Structure File structure Logical storage unit Collection

More information

EMC CLARiiON Backup Storage Solutions

EMC CLARiiON Backup Storage Solutions Engineering White Paper Backup-to-Disk Guide with Computer Associates BrightStor ARCserve Backup Abstract This white paper describes how to configure EMC CLARiiON CX series storage systems with Computer

More information

Chapter 11: Implementing File Systems. Operating System Concepts 9 9h Edition

Chapter 11: Implementing File Systems. Operating System Concepts 9 9h Edition Chapter 11: Implementing File Systems Operating System Concepts 9 9h Edition Silberschatz, Galvin and Gagne 2013 Chapter 11: Implementing File Systems File-System Structure File-System Implementation Directory

More information

Chapter 7 Memory Management

Chapter 7 Memory Management Operating Systems: Internals and Design Principles Chapter 7 Memory Management Ninth Edition William Stallings Frame Page Segment A fixed-length block of main memory. A fixed-length block of data that

More information

Chapter 12: File System Implementation

Chapter 12: File System Implementation Chapter 12: File System Implementation Chapter 12: File System Implementation File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management Efficiency

More information

What is a file system

What is a file system COSC 6397 Big Data Analytics Distributed File Systems Edgar Gabriel Spring 2017 What is a file system A clearly defined method that the OS uses to store, catalog and retrieve files Manage the bits that

More information

File System Internals. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

File System Internals. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University File System Internals Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Today s Topics File system implementation File descriptor table, File table

More information

CS 318 Principles of Operating Systems

CS 318 Principles of Operating Systems CS 318 Principles of Operating Systems Fall 2017 Lecture 16: File Systems Examples Ryan Huang File Systems Examples BSD Fast File System (FFS) - What were the problems with the original Unix FS? - How

More information

19 File Structure, Disk Scheduling

19 File Structure, Disk Scheduling 88 19 File Structure, Disk Scheduling Readings for this topic: Silberschatz et al., Chapters 10 11. File: a named collection of bytes stored on disk. From the OS standpoint, the file consists of a bunch

More information

Motivation. Operating Systems. File Systems. Outline. Files: The User s Point of View. File System Concepts. Solution? Files!

Motivation. Operating Systems. File Systems. Outline. Files: The User s Point of View. File System Concepts. Solution? Files! Motivation Operating Systems Process store, retrieve information Process capacity restricted to vmem size When process terminates, memory lost Multiple processes share information Systems (Ch 0.-0.4, Ch.-.5)

More information

Chapter 10: File System Implementation

Chapter 10: File System Implementation Chapter 10: File System Implementation Chapter 10: File System Implementation File-System Structure" File-System Implementation " Directory Implementation" Allocation Methods" Free-Space Management " Efficiency

More information

Definition of RAID Levels

Definition of RAID Levels RAID The basic idea of RAID (Redundant Array of Independent Disks) is to combine multiple inexpensive disk drives into an array of disk drives to obtain performance, capacity and reliability that exceeds

More information

OS and Hardware Tuning

OS and Hardware Tuning OS and Hardware Tuning Tuning Considerations OS Threads Thread Switching Priorities Virtual Memory DB buffer size File System Disk layout and access Hardware Storage subsystem Configuring the disk array

More information

Workloads. CS 537 Lecture 16 File Systems Internals. Goals. Allocation Strategies. Michael Swift

Workloads. CS 537 Lecture 16 File Systems Internals. Goals. Allocation Strategies. Michael Swift Workloads CS 537 Lecture 16 File Systems Internals Michael Swift Motivation: Workloads influence design of file system File characteristics (measurements of UNIX and NT) Most files are small (about 8KB)

More information

Chapter 11. I/O Management and Disk Scheduling

Chapter 11. I/O Management and Disk Scheduling Operating System Chapter 11. I/O Management and Disk Scheduling Lynn Choi School of Electrical Engineering Categories of I/O Devices I/O devices can be grouped into 3 categories Human readable devices

More information

MEMORY MANAGEMENT/1 CS 409, FALL 2013

MEMORY MANAGEMENT/1 CS 409, FALL 2013 MEMORY MANAGEMENT Requirements: Relocation (to different memory areas) Protection (run time, usually implemented together with relocation) Sharing (and also protection) Logical organization Physical organization

More information

Chapter 8: Virtual Memory. Operating System Concepts

Chapter 8: Virtual Memory. Operating System Concepts Chapter 8: Virtual Memory Silberschatz, Galvin and Gagne 2009 Chapter 8: Virtual Memory Background Demand Paging Copy-on-Write Page Replacement Allocation of Frames Thrashing Memory-Mapped Files Allocating

More information

Chapter 8 Virtual Memory

Chapter 8 Virtual Memory Operating Systems: Internals and Design Principles Chapter 8 Virtual Memory Seventh Edition William Stallings Modified by Rana Forsati for CSE 410 Outline Principle of locality Paging - Effect of page

More information

Long-term Information Storage Must store large amounts of data Information stored must survive the termination of the process using it Multiple proces

Long-term Information Storage Must store large amounts of data Information stored must survive the termination of the process using it Multiple proces File systems 1 Long-term Information Storage Must store large amounts of data Information stored must survive the termination of the process using it Multiple processes must be able to access the information

More information

CS 318 Principles of Operating Systems

CS 318 Principles of Operating Systems CS 318 Principles of Operating Systems Fall 2018 Lecture 16: Advanced File Systems Ryan Huang Slides adapted from Andrea Arpaci-Dusseau s lecture 11/6/18 CS 318 Lecture 16 Advanced File Systems 2 11/6/18

More information

Problem Overhead File containing the path must be read, and then the path must be parsed and followed to find the actual I-node. o Might require many

Problem Overhead File containing the path must be read, and then the path must be parsed and followed to find the actual I-node. o Might require many Sharing files o Introduction Users often need to share files amongst themselves It is convenient for the shared file to appear simultaneously in different directories belonging to different users One of

More information

CS370: System Architecture & Software [Fall 2014] Dept. Of Computer Science, Colorado State University

CS370: System Architecture & Software [Fall 2014] Dept. Of Computer Science, Colorado State University CS 370: SYSTEM ARCHITECTURE & SOFTWARE [MASS STORAGE] Frequently asked questions from the previous class survey Shrideep Pallickara Computer Science Colorado State University L29.1 L29.2 Topics covered

More information