IEEE TRANSACTIONS ON COMPUTERS, VOL. 60, NO. 3, MARCH 2011

High-Performance Scalable Flash File System Using Virtual Metadata Storage with Phase-Change RAM

Youngwoo Park, Member, IEEE, and Kyu Ho Park, Member, IEEE

Abstract: Several flash file systems have been developed based on the physical characteristics of NAND flash memory. However, previous flash file systems suffer from performance overhead and scalability problems caused by metadata management in NAND flash memory. In this paper, we present a flash file system called PFFS2. PFFS2 stores all metadata in virtual metadata storage, which employs Phase-change RAM (PRAM). PRAM is a next-generation nonvolatile memory that is well suited to word-level reads and writes of small data. Based on the virtual metadata storage, PFFS2 can manage metadata in a virtually fixed location and through byte-level in-place updates. As a result, PFFS2 performs 38 percent better than YAFFS2 for small-file reads and writes while matching YAFFS2 performance for large files. Virtual metadata storage is particularly effective in decreasing the computational and I/O overhead of garbage collection. In addition, PFFS2 maintains a 0.18 second mounting time and 284 KB memory usage regardless of increases in NAND flash memory size. We also propose a wear-leveling solution for PRAM in virtual metadata storage and greatly reduce the total write count of NAND flash memory. In addition, the life span of PFFS2 is longer than that of other flash file systems.

Index Terms: PRAM, flash file system, embedded system storage, NAND flash memory.

1 INTRODUCTION

NAND flash memory is mostly used for data storage of embedded systems due to its large capacity, nonvolatility, fast access time, low power consumption, and high reliability [1]. With multilevel cell (MLC) technology, the capacity of a NAND flash memory chip has grown to more than 16 GB and will continue to increase quickly. This will enable a percent cost reduction each year.
NAND flash memory will become increasingly popular for main data storage of embedded systems and consumer electronics [2], [3].

On the other hand, NAND flash memory has several constraints. It supports only page-level (512 B or 2 KB) read/write, so at least one page is consumed to write even one byte of data. Moreover, once a page is programmed, it must be erased before more data can be written to it. The erase operation is performed in units of blocks (32 or 256 KB), which are much larger than a page, and a block erase takes much longer than a page write [4]. NAND flash memory also has a limited life span for its erase blocks. Usually, MLC NAND guarantees under 10,000 erase cycles for each erase block. It is therefore important to reduce the number of block erases and prolong the life span of NAND flash memory.

To counter these difficulties, previous file systems that run on NAND flash memory adopt the out-of-place update scheme used by a traditional log-structured file system [5], [6], [7]. When data are modified, the new data are written to available free pages, and the old version of the data is invalidated and considered to be dead pages. Later, when there are not enough free pages in the NAND flash memory, the garbage collection operation erases the blocks occupied by dead pages and makes new free pages.

The authors are with the Department of Electrical Engineering, Korea Advanced Institute of Science and Technology (KAIST), 335 Gwahak-ro, Yuseong-gu, Daejeon , Korea. E-mail: ywpark@core.kaist.ac.kr, kpark@ee.kaist.ac.kr. Manuscript received 3 Sept. 2009; revised 25 Jan. 2010; accepted 9 Feb. 2010; published online 10 June. Recommended for acceptance by M. Yousif. For information on obtaining reprints of this article, please send e-mail to: tc@computer.org, and reference IEEECS Log Number TC. Digital Object Identifier no /TC
Although the previous NAND flash file systems succeeded in hiding erase operations by using out-of-place updates, they failed to provide sufficient performance for embedded systems. They have problems managing file system metadata and cannot properly exploit the locality in file accesses. First, metadata are updated much more frequently than data, and each update creates a 512 B or 2 KB page write in NAND flash memory even though the updated metadata is only several bytes or less [8], [9], [10], [11]. NAND flash file systems therefore spend a lot of time writing metadata. Second, metadata are sparsely distributed and generate many dead pages in NAND flash memory. NAND flash file systems exhibit poor garbage collection behavior because of the dead pages generated by metadata. Third, the many page writes caused by metadata and the garbage collection overhead also decrease the life span of a NAND flash file system. Fourth, using out-of-place updates in NAND flash memory means that the location of metadata changes with every update request. Because a progressive scan is the only way to find metadata, previous NAND flash file systems must scan for the latest metadata and copy them into main memory during file system mounting. Therefore, previous NAND flash file systems need a large main memory to manage metadata and a long mounting time. For example, JFFS2 needs 4 MB of memory, and YAFFS2 needs 512 KB of memory for 128 MB of NAND flash memory [12].

/11/$26.00 © 2011 IEEE. Published by the IEEE Computer Society.

On the other hand, nonvolatile memories, such as Phase-change RAM (PRAM), Ferroelectric RAM (FRAM) [13], and

Magnetic RAM (MRAM) [14], have been developed as next-generation memory devices. Among these, PRAM has begun to gain acceptance as an alternative for embedded storage because of its notable features, such as nonvolatility, high density, random access capability, in-place updating, low power usage, and long write endurance, as shown in Table 1. PRAM has been developed by key industry manufacturers such as Intel, Samsung, and IBM [15], [16], [17], [18], [19], [20], [21]. Samsung and Intel have already announced a 512 Mbit working prototype of PRAM. PRAM is expected to replace NOR flash memory for boot-up code and to be more suitable for managing small metadata updates. However, PRAM density and cost are still not sufficient to replace NAND flash memory as a main storage device [22], [23], [24]. Although PRAM shows good performance for small accesses, its performance is still not comparable to DRAM and could be worse than that of NAND flash memory for page-size reads and writes.

TABLE 1 Comparison of NAND Flash Memory and PRAM [4], [25]

Fig. 1. Storage architecture for proposed NAND flash file system (PFFS2).

In this paper, we use PRAM [25], [26] as a storage alternative and examine its benefits by focusing on metadata management for a NAND flash file system. As shown in Fig. 1, we use PRAM as boot code storage, which is conventionally the role of NOR flash memory. We then employ PRAM as the virtual metadata storage of the NAND flash file system in Fig. 1. We have implemented a new file system called PFFS2, which uses the virtual metadata storage in order to exploit the locality in file accesses. Even if only a small PRAM is available, the virtual metadata storage can effectively keep the metadata needed at a given time in PRAM while the rest is stored in NAND flash memory. PFFS2 can manage all metadata through the byte-level read/write supported by PRAM.
It avoids the unnecessary page updates and excessive garbage that metadata updates cause in NAND flash memory, and it minimizes the computing overhead of metadata management. Additionally, we redesign the directory and data indexing structure of PFFS2 based on the virtual metadata storage. PFFS2 stores metadata in a virtually fixed region and accesses metadata directly. PFFS2 can thus save main memory space and reduce the file system mounting time. We also propose a PRAM wear-leveling method to ensure that the life span of PFFS2 is longer than that of other NAND flash file systems.

2 RELATED WORKS

2.1 NAND Flash File System

JFFS2 [6] and YAFFS2 [7] are popular file systems designed for general embedded systems equipped with NAND flash memory and are currently used with Linux and WinCE. They are based on a traditional log-structured file system that uses out-of-place updates and a garbage collection scheme for data updates. Therefore, the metadata update performance is poor in both JFFS2 and YAFFS2. Furthermore, because the mounting time and the memory usage increase linearly with the NAND flash size, they cannot be used for flash memory storage of more than several gigabytes.

CFFS [27] is a scalable NAND flash file system. It allocates different flash blocks for metadata and data and stores the entire set of data index pointers in a metadata block in NAND flash memory. By separating the metadata, CFFS can improve garbage collection performance and reduce the region to be scanned during file system mounting. Also, CFFS does not need to maintain all data index entries for every file in memory, since the inode, which includes the information needed to manage files, contains the data index entries in NAND flash memory. CFFS can therefore reduce memory utilization. However, CFFS incurs additional page writes to modify data index pointers in metadata blocks every time a file is updated. This decreases the metadata update performance compared with JFFS2 and YAFFS2.
CFFS is still not scalable in terms of memory requirements because it still keeps directory entries in main memory. It also has a wear-leveling problem because a metadata block is updated more often than a data block.

2.2 Using Nonvolatile Memory for File Systems

There has been a great deal of research into the use of nonvolatile RAM as alternative storage. Douglis et al. [1] and Wu and Zwaenepoel [28] proposed the use of nonvolatile RAM to hold an entire file system. Unlike this approach, both the HeRMES [10] and Conquest [29] file systems apply nonvolatile RAM to optimize and enhance the performance of a disk storage system. HeRMES uses MRAM to provide high-speed access to relatively small units of data and metadata. The Conquest file system employs nonvolatile RAM to store all small files and metadata (e.g., directories and file attributes), and the disk holds only the data content of the remaining large files.

Conquest and HeRMES were designed for disk storage systems from the start. They use nonvolatile memory only to reduce read/write access latency and to separate where data are stored. For NAND flash memory, however, the update method as well as the location of data must be reconsidered because of its out-of-place update characteristic and garbage collection. In addition, we should consider the scalability problem of NAND flash file systems with nonvolatile memory.

Recently, HFFS [30], the MiNV file system [31], and a previous version of PFFS2 [22] have used nonvolatile memory to enhance the performance of NAND flash file systems. Using the byte-level read/write capability of NOR flash memory, HFFS synchronously stores data as a log in NOR flash memory. It reduces the write count of NAND flash memory and provides a longer life span than conventional memory with a similar level of file system durability.

MiNV modifies YAFFS to use FRAM as the single metadata space of a file system. It employs a specific model that analyzes the amount of nonvolatile RAM needed as a function of flash memory storage capacity. It increases performance and reduces mounting time. However, MiNV is not appropriate for storing a large number of small files because it does not consider the locality of metadata access and stores all file metadata in FRAM, including the mapping information between file data offsets and physical NAND flash memory pages. It also gives no consideration to the life span of a file system based on NAND flash memory. For example, if the size of all files in the file system is 2 KB, more than 4 MB of FRAM is necessary for the metadata storage of 32 MB of NAND flash memory. Although MiNV tries to use a board with an FRAM array to increase the available FRAM size, it is fundamentally necessary to design a file system that works with a limited FRAM size.

The previous version of PFFS2, called PFFS, separates all metadata in a file system and saves them in PRAM [22]. It outperforms YAFFS2 and has a constant mounting time and memory usage. It also proposes segmentation techniques for PRAM and addresses the wear-leveling of PRAM. Although PFFS reduces the size of the necessary PRAM by storing data indexes in NAND flash memory, it does not consider the locality of metadata access and still uses PRAM inefficiently.
PFFS needs to increase the necessary PRAM in proportion to the size of NAND flash memory. Also, its wear-leveling techniques always incur additional PRAM writes.

It is clear that nonvolatile RAM can enhance the performance of file systems. However, all previous work assumes a nonvolatile RAM large enough to store data and metadata. Although the cost of nonvolatile RAM is declining rapidly, the required size of data storage is growing much faster. It is not cost effective to use nonvolatile RAM as main data storage or to increase the amount of nonvolatile RAM in proportion to the data size requirements.

3 THE DESIGN OF PFFS2

In this section, we describe the design of PFFS2 in detail. Fig. 2 shows the architecture of PFFS2. The basic idea in PFFS2 is to employ PRAM as secondary storage for metadata. PRAM makes metadata management more effective because it is a nonvolatile memory that supports in-place updates and byte-level access. However, PRAM is too small, and not scalable enough, to contain all the metadata of PFFS2 for large-scale NAND flash memory. We therefore propose the virtual metadata storage as a special concept for the metadata management of PFFS2. On top of the virtual metadata storage, PFFS2 separates metadata management from file management. The file system layout and the metadata structure of PFFS2 are designed to suit the proposed virtual metadata storage. Consequently, PFFS2 can be designed to ensure better performance and scalability than typical NAND flash file systems.

Fig. 2. The architecture of PFFS2.

In the next sections, we describe the concept and benefits of the virtual metadata storage and explain the mechanisms that manage it. After that, we discuss the file system layout and metadata structure in more detail.

3.1 Virtual Metadata Storage Architecture

Fig. 3 shows the virtual metadata storage concept and its management mechanisms.
The virtual metadata storage is logically regarded as a single metadata store but is physically organized as PRAM and NAND flash memory. By using NAND flash memory as physical metadata space, the virtual metadata storage gives the image of a contiguous and unlimited metadata space. Also, the virtual metadata storage makes use of the locality of file access [32], [33], [34] and maintains all recently and frequently used metadata in PRAM. Therefore, the virtual metadata storage appears to inherit the characteristics of PRAM for most metadata operations. PFFS2 can work transparently on top of unlimited virtual metadata storage even though only a small PRAM is physically used.

The proposed virtual metadata storage is similar to the thin provisioning of storage systems [35], [36] because it provides virtual allocation of data and flexible changes of data location. However, our virtual metadata storage manages a file's metadata and data separately at the file system level. It is employed not only to allocate metadata but also to compose all the directories and the indexing structure of the file system. The virtual metadata storage is also different from a simple metadata cache, which holds duplicate metadata in memory for fast access. Although both exploit access frequency to place metadata between PRAM and NAND flash memory, the virtual metadata storage considers the type of metadata and the characteristics of the storage, and it keeps only one copy of the metadata for each file across all storage devices.

There are several benefits to adopting the virtual metadata storage for PFFS2. We can exploit the different characteristics

of metadata and data to increase the performance of PFFS2. PFFS2 composes its metadata structures in the virtual metadata storage. Among the metadata in virtual metadata storage, the frequently accessed metadata are physically stored in PRAM. Most metadata updates in PFFS2 are performed at the byte level, which increases the speed of metadata update operations in PFFS2. In addition, NAND flash memory in the virtual metadata storage is used only to hold the segments that contain metadata with low locality, so metadata updates cause few updates in NAND flash memory. The total number of page writes to NAND flash memory is greatly reduced. This not only increases the performance of the metadata update itself, but also reduces the garbage collection overhead and increases the life span of PFFS2.

Fig. 3. Virtual metadata storage and its management.

For example, Fig. 4 compares a typical NAND flash file system with PFFS2 when three files are written to a file system and then updated. In the previous NAND flash file system, a total of six pages were used for metadata, as shown in Fig. 4a. Three of them store the metadata of the files, and the others are invalidated because of the file updates. In contrast, Fig. 4b shows that PFFS2 does not use a NAND flash memory page to save a file's metadata. Instead, NAND flash memory contains only the necessary metadata segments. PFFS2 thus reduces the number of page writes and increases performance.

The virtual metadata storage also reduces the main memory usage and mounting time. In the design of PFFS2, each item of metadata contains directory structures as well as file attributes and data index entries, as shown in Fig. 4b. Also, the metadata are updated in a fixed location of the virtual metadata storage. We can therefore locate the metadata directly at runtime, and main memory is not needed to maintain the temporary metadata structures shown in Fig. 4a.
Also, no metadata other than the superblock is scanned during the mounting of PFFS2. Consequently, PFFS2 ensures a constant mounting time and a constant amount of memory usage regardless of NAND flash memory size.

3.2 Virtual Metadata Storage Management

3.2.1 Metadata Segment Mapping

In our virtual metadata management scheme, the file interface of PFFS2 assumes that the metadata are allocated contiguously in a fixed location of the virtual metadata storage. However, the virtual metadata storage is divided into segments, the minimum unit of allocation, and the segments can be flexibly allocated and moved between PRAM and NAND flash memory, as shown in Fig. 3. For simple management, the size of a segment should be the same as, or a multiple of, the size of a NAND flash memory page. Currently, we use a 2 KB segment, which is the same size as a NAND flash page.

In order to find the physical location of metadata, we need a virtual-to-physical mapping for each segment. We maintain a segment table and a segment bitmap for metadata segment mapping. The segment table holds all of the mapping information between virtual and physical metadata segments and converts a virtual metadata segment address to a physical metadata segment address.

Fig. 4. Metadata management using virtual metadata storage. (a) Typical NAND flash file system. (b) PFFS2.

We

distinguish the physical location of segments by using the Most Significant Bit (MSB) of the physical segment address (PSA). A PSA with MSB 1 indicates a segment in NAND flash memory, and a PSA with MSB 0 indicates a segment in PRAM. The other 31 bits specify the physical address of the segment within PRAM or NAND flash memory. A PSA of 0xFFFFFFFF represents a free segment that is not physically allocated yet. In the example of Table 2, the first row means that the logical segment address (LSA) 0x is a free segment and not used. The logical segment 0x000000C8 resides in the 16th page in NAND flash memory, and the logical segment 0x C is stored in the 16th segment in PRAM.

TABLE 2 Example of a Segment Table

The segment bitmap is used to find a free physical metadata segment. However, we do not use a segment bitmap for physical metadata segments in NAND flash memory. Instead, the page bitmap represents the status of a metadata segment or data page in NAND flash memory, because the segment size is currently the same as the NAND flash page size.

Using the segment table and bitmaps, all physical metadata segments can be allocated and moved anywhere in both PRAM and NAND flash memory. The segment table and bitmaps themselves are always loaded in the first region of PRAM and are never swapped out to NAND flash memory. Later, when the file system accesses the virtual metadata storage, we can find the corresponding physical segment address and handle the metadata in that physical segment.

3.2.2 Metadata Segment Swapping

Although we use NAND flash memory to store physical segments, it is preferable to select PRAM, which is better at handling frequent updates of small metadata. As mentioned in previous sections, if a segment in NAND flash memory is updated, it creates a dead page and decreases the metadata update performance of PFFS2.
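The physical segment address encoding described for the segment table can be illustrated with a short sketch. It assumes a 32-bit PSA as the text states (MSB selects NAND or PRAM, the low 31 bits give the address, 0xFFFFFFFF means free); the function and constant names are hypothetical and are not taken from the PFFS2 implementation.

```python
FREE_PSA = 0xFFFFFFFF   # PSA value for a segment that is not allocated yet
NAND_FLAG = 0x80000000  # MSB set: the segment is in NAND flash memory
ADDR_MASK = 0x7FFFFFFF  # low 31 bits: physical segment (or page) number


def decode_psa(psa):
    """Split a 32-bit PSA into (location, index); hypothetical helper."""
    if psa == FREE_PSA:
        return ("free", None)
    location = "nand" if psa & NAND_FLAG else "pram"
    return (location, psa & ADDR_MASK)


def encode_psa(location, index):
    """Build a PSA from a location name and a 31-bit index."""
    if not 0 <= index <= ADDR_MASK:
        raise ValueError("index does not fit in 31 bits")
    psa = index | (NAND_FLAG if location == "nand" else 0)
    if psa == FREE_PSA:
        # The largest NAND index would collide with the free marker.
        raise ValueError("index collides with the free-segment marker")
    return psa
```

One consequence of this encoding, visible in `encode_psa`, is that the largest NAND index cannot be used because it would be indistinguishable from the free marker.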
Therefore, if there is a free segment in PRAM, we always allocate the segment in PRAM. When there is no free space in PRAM, some metadata segments have to be stored in NAND flash memory. PFFS2 checks the number of free PRAM segments at every segment allocation. If there is no free space in PRAM during virtual metadata allocation, the metadata swapping module performs the swap-out procedure. It selects one of the segments in PRAM and swaps it out to NAND flash memory, so that a new metadata segment can be allocated in PRAM.

In order to minimize metadata management overhead, it is important to select infrequently accessed segments for swap-out. We use a simple segment replacement policy that is similar to LRFU [37]. We rank the physical segments according to how often they are accessed. We associate two-byte counters with the segments to store their access frequency. The value of a counter increases when the segment is read or updated, and this value is periodically decreased. By comparing these counter values, the least recently and least frequently accessed segment in PRAM can be found. Although it takes linear time to find the segment for swap-out, this overhead is small because writing the NAND page and the PRAM segment during segment swapping takes much longer.

On the other hand, a swap-in procedure returns segments from NAND flash memory to PRAM. This takes place whenever a segment in NAND flash memory is updated. The metadata updates are actually performed in PRAM after the swap-in procedure.

For metadata swapping, we should distinguish read accesses from update accesses to the metadata. A NAND flash page write takes a maximum of 3 msec, but a read takes only 60 μsec [4]. Moreover, write operations generate dead pages in NAND flash memory. Thus, we design PFFS2 to swap out the frequently read segments and to try to keep the frequently updated segments in PRAM.
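The counter-based replacement policy described above can be sketched as follows. The increments of 1 per read and 100 per update are the values this section gives for the current design; halving as the periodic decrease, saturation at the two-byte maximum, and every name in the code are assumptions made only for illustration.

```python
class SegmentCounters:
    """Per-segment access counters for choosing a swap-out victim (sketch)."""

    def __init__(self, read_weight=1, update_weight=100, max_count=0xFFFF):
        self.read_weight = read_weight      # added on each read
        self.update_weight = update_weight  # added on each update
        self.max_count = max_count          # two-byte counter limit
        self.counts = {}                    # PRAM segment id -> counter

    def touch(self, segment, is_update):
        """Record one read or update access to a PRAM segment."""
        step = self.update_weight if is_update else self.read_weight
        value = self.counts.get(segment, 0) + step
        self.counts[segment] = min(value, self.max_count)

    def decay(self):
        """Periodic decrease; halving is an assumed aging rule."""
        for segment in self.counts:
            self.counts[segment] //= 2

    def pick_victim(self):
        """Linear scan for the segment with the smallest counter."""
        if not self.counts:
            raise ValueError("no PRAM segments are being tracked")
        return min(self.counts, key=self.counts.get)
```

Because an update adds far more than a read, a segment that is updated even occasionally outranks a segment that is only read, which keeps update-heavy segments in PRAM.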
Currently, an updated segment has a 100 times lower chance of being swapped out, because the access counter is increased by only 1 for a segment read but by 100 for a segment update. With this read/write access classification, most metadata updates occur in PRAM.

3.2.3 Metadata Segment Reclaiming

Virtual metadata storage also needs a mechanism to reclaim segments that are no longer used. The reclamation check occurs whenever PFFS2 updates the index or deletes a file. When a file is deleted, the file metadata objects (file inode, file name, and index pages) are also invalidated, and the metadata segments that contain only unused metadata objects are reclaimed. Currently, in PFFS2, the sizes of all metadata objects and segments are powers of two, and the metadata object size is smaller than a segment. Hence, the length of every metadata region is always aligned to the segment size, and a fixed number of metadata objects is stored in each segment. In this situation, we can easily find the segment that needs reclamation by checking the inode bitmap. For example, PFFS2 has a 2 KB segment size and a 128 byte inode, so 16 inodes are stored in a segment. If we delete the file whose inode number is seven, we can determine whether that segment is empty by examining bits 0 to 15 of the inode bitmap. Similarly, the segments that contain file name fields are reclaimed when the corresponding eight bits of the inode bitmap are all 0, because the file name size is 256 bytes. Furthermore, the segments that store index pages and indirect index pages are reclaimed immediately when those pages are no longer used, because the size of index pages and indirect index pages is identical to one segment. However, once the superblock, block information, and inode bitmap regions are allocated, they are never reclaimed.
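The bitmap check used for reclamation can be sketched as below. The sizes (2 KB segment, 128 byte inode, 256 byte file name) come from the text; representing the inode bitmap as a list of 0/1 values with 1 meaning "in use", and the function name, are assumptions for illustration.

```python
SEGMENT_SIZE = 2048  # bytes per metadata segment
INODE_SIZE = 128     # bytes per inode: 16 inodes per segment
NAME_SIZE = 256      # bytes per file name: 8 names per segment


def segment_is_reclaimable(inode_bitmap, inode_no, object_size):
    """Return True if every object sharing a segment with inode_no is free.

    inode_bitmap is a sequence of 0/1 flags indexed by inode number,
    where 1 marks an inode that is in use (assumed convention).
    """
    per_segment = SEGMENT_SIZE // object_size
    first = (inode_no // per_segment) * per_segment
    return not any(inode_bitmap[first:first + per_segment])
```

For inode number seven, the inode segment is checked over bits 0 to 15 and the file name segment over bits 0 to 7, matching the example in the text.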
If we were to reclaim the segments allocated in these regions, we would have to check the data in each segment at the byte or even bit level, because the metadata objects in these regions are only a few bytes or bits in size. Also, these items are always necessary for file system and NAND flash memory management until the file system is unmounted.

Moreover, the total size of these metadata regions is under 20 KB, which is much smaller than the size of the file inode or file name region. Therefore, leaving the segments in those regions unreclaimed is not a prohibitively large overhead.

3.3 Metadata Structure of PFFS2

Fig. 5 shows the metadata and inode structure of PFFS2. The superblock holds the basic description of the file system. The inode bitmap is composed of sequences of bits that specify whether the corresponding inodes are free or not. Additionally, the block information field is necessary to manage the numbers of free and dead pages in the corresponding blocks for garbage collection.

Fig. 5. File system layout and inode structure of PFFS2.

TABLE 3 File Size Upper Limits for Data Page Indexing

In a typical NAND flash file system, file data and metadata are sequentially stored in NAND flash memory. In contrast, the metadata location of PFFS2 is fixed after file system creation, and the metadata are always updated at the same location in the virtual metadata storage, while the file data use an out-of-place update scheme. Also, the PFFS2 file system interface regards the virtual metadata storage as byte-level accessible memory. The virtual metadata storage allows PFFS2 to use a much simpler metadata structure. All directory and file data are accessed through a linked list and an index table with in-memory semantics.

3.3.1 Directory Structure

The PFFS2 inode includes all the information PFFS2 needs to handle a file. The inode stores user and group ownership, access mode (read, write, and execute permissions), access time, and file size. PFFS2 has a file name table, which includes the names of all files and directories. Moreover, the inode of a file or directory contains the inode pointers (i_parent, i_child, i_prev, i_next) that link the inodes of related files and directories.
All the subdirectories and files in a directory are connected by the i_prev and i_next pointers as a doubly linked list. The i_parent and i_child pointers in the inode of a directory represent the relation among directories. The i_parent pointer points to the inode of the parent directory, and the i_child pointer indicates the first entry of the list that includes the directory's subdirectories and files. Consequently, through these four inode pointers, we can find the inode of any file or directory from the root directory of PFFS2 and access any data in PFFS2 through the data indexing structure of that inode. In addition, PFFS2 generates a hash key (i_hash_key) from the file name and saves it in the inode of each file for fast directory lookup. PFFS2 then sorts the directory list by hash key. Hence, we replace a 256 byte file name read with a four-byte hash key comparison and reduce the total directory lookup time.

3.3.2 Indexing Data Pages

PFFS2 uses index pages that contain the extra index entries needed to index the data pages of large files. An index page represents second or third order arrays of data pages and includes p/4 direct index entries or single indirect index entries, where p is the page size of NAND flash memory. As shown in Fig. 5, the inode of PFFS2 contains 16 index entries, and all data pages in a file can be addressed from these index entries. The eight direct index entries yield the data pages corresponding to the first eight logical pages of the file, while the single indirect index entries and double indirect index entries indicate data pages indirectly by using index pages. For example, if the page size of NAND flash memory is 2 KB, the first single indirect index in a file inode indicates an index page that points to 512 data pages starting from the ninth logical page of the file. Also, the first double indirect index in a file represents 262,144 (512^2) pages starting from the 2,056th (512 × 4 + 8) page.
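The indexing arithmetic above can be made concrete with a small sketch that maps a zero-based logical page number to the index entry that covers it. The eight direct entries and the p/4 entries per index page come from the text; the split of the remaining eight inode entries into four single indirect and four double indirect entries is inferred from the "512 × 4 + 8" example, and the function name and return format are hypothetical.

```python
def locate_page(logical_page, page_size=2048, direct=8, single=4, double=4):
    """Map a zero-based logical page number to its index path (sketch)."""
    entries = page_size // 4  # p/4 four-byte entries per index page
    if logical_page < 0:
        raise ValueError("logical page must be non-negative")
    if logical_page < direct:
        return ("direct", logical_page)
    n = logical_page - direct
    if n < single * entries:
        # (which single indirect entry, slot inside its index page)
        return ("single", n // entries, n % entries)
    n -= single * entries
    if n < double * entries * entries:
        # (double indirect entry, slot in first-level page, slot in second)
        span = entries * entries
        return ("double", n // span, (n // entries) % entries, n % entries)
    raise ValueError("page is beyond the maximum file size")
```

With 2 KB pages, logical page 8 is the first page reached through a single indirect entry and logical page 2,056 is the first page reached through a double indirect entry, in line with the example above.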
Table 3 summarizes the upper limit placed on a file's size for each page size and each indexing mode.

The important point in the indexing structure of PFFS2 is where the index pages are stored. An index page can be stored in the index page pool region of the virtual metadata storage or assigned directly to NAND flash memory. The index page pool in the virtual metadata storage accommodates index pages that are indicated by a single indirect index in an inode. On the other hand, the index pages that are pointed to by double indirect indexes are always stored in NAND flash memory. Consequently, with 2 KB pages, all data pages within the first 4 MB of a file can be accessed through index entries in the virtual metadata storage, whereas data pages beyond 4 MB are indicated by index entries in NAND flash memory.

There are two reasons why PFFS2 does not keep the index pages of large files in the virtual metadata storage. First, the index pages of a large file in the virtual metadata storage cause a lot of unnecessary swap-outs. As we discussed in Section 3, all metadata updates of the virtual metadata storage are always performed in PRAM. When we try to allocate the index pages in the virtual metadata storage, segments that were previously used and stored in PRAM must be swapped out to NAND flash memory. For example, up to 1,024 segments must be swapped out for the index pages of a 1 GB file. However, most of the

index pages of large files are no longer updated after the first allocation, because it is well known that large files are usually accessed for reading after they are first created [38], [29], [39]. This means that index pages in PRAM can decrease the performance of PFFS2. Second, the write speed of the index can be decreased by using the virtual metadata storage. We note that the write speed of PRAM is currently not faster than that of NAND flash memory if we write more than several hundred bytes of data, as shown in Table 1. Therefore, it is better to store the index pages in NAND flash memory when creating or copying large files, which incur many data index updates.

Certainly, in our indexing structure, writing data pages may cause an additional NAND flash page write because the index page in NAND flash memory must be updated together with the new data page. However, updates of the index pages that are pointed to by double indirect indexes rarely occur because they are used only for files larger than 4 MB. Therefore, the overhead of indirect indexing in files is insignificant.

4 OTHER DESIGN ISSUES OF PFFS2

4.1 Garbage Collection

In this section, we discuss the garbage collection of PFFS2. In NAND flash file systems, garbage collection should be performed to recycle dead pages when there are not enough free pages. Various techniques have been proposed to reduce the overhead of garbage collection [40], [41], [28], [42]. Most of them try to minimize the number of live page copies by using hot-cold page separation. If we can reclaim blocks that have many dead pages and few hot pages, the number of live page copies is reduced, which increases the garbage collection performance. However, these mechanisms require additional data structures and algorithms to manage and determine the hot-cold page information [43].
Such overheads are not easy to ignore for an embedded system, which has limited memory and computing power. Instead of fine-grained garbage collection mechanisms, PFFS2 uses a simple garbage collection mechanism inspired by YAFFS2. Whenever PFFS2 writes NAND flash pages, the garbage collector roughly inspects the block information to find a block that has no free pages and contains only dead pages or fewer than two live pages. Then, it reclaims that block after copying the live pages in the block. If PFFS2 has very few free pages, it performs aggressive garbage collection. During aggressive garbage collection, PFFS2 inspects all the blocks of NAND flash memory and, in most cases, finds a block with a large number of dead pages. Therefore, PFFS2 has little overhead to handle garbage collection and saves computing power and memory because it does not use any complex algorithms or data structures.

Although PFFS2 does not use any special techniques for garbage collection, the virtual metadata storage makes it efficient. As we have discussed earlier, the virtual metadata storage eliminates the dead pages generated by frequent metadata updates. This directly reduces the number of dead pages in NAND flash memory. Moreover, PFFS2 naturally separates hot data from cold data because the metadata are the hot data from the viewpoint of the file system. As a result, most of the NAND flash memory pages in PFFS2 are either dead pages or cold live pages. PFFS2 can easily find and reclaim blocks that have many dead pages and can minimize the amount of live page copying and block erasing during garbage collection. Consequently, PFFS2 outperforms other conventional flash file systems in garbage collection.

4.2 Life Span

4.2.1 Life Span Analysis

Limited life span is another critical problem of NAND flash file systems. The proposed virtual metadata storage can increase the life span of PFFS2 because it removes many of the NAND flash page writes caused by metadata updates.
In this section, we analyze the life span of PFFS2 and of conventional NAND flash file systems. Let $L_{pffs}$ be the life span of PFFS2, that is, the period after which either the PRAM or the NAND flash memory is totally worn out. The life span of PFFS2 is therefore

$$L_{pffs} = \min(L_{pram}, L_{nand}), \tag{1}$$

where $L_{pram}$ and $L_{nand}$ are the life spans of the PRAM and the NAND flash memory in PFFS2. To calculate them, let us assume perfect wear-leveling of both PRAM and NAND flash memory. If $S_P$ and $S_N$ represent the sizes of PRAM and NAND flash memory, and $E_P$ and $E_N$ are their endurances (maximum write cycle limits), respectively, the numbers of bytes that must be written to wear out the PRAM and the NAND flash memory are $(S_P E_P)$ and $(S_N E_N)$. We also assume that the data in the file system are periodically updated with frequency $f$ and that the average probabilities of data and metadata requests are $P_D$ and $P_M$. Then, the update rate of data is $(f P_D)$, and the update rate of metadata is $(f P_M)$. Furthermore, we assume that metadata swapping in the virtual metadata storage occurs at a rate of $(f P_S)$. Let $S_D$ and $S_M$ denote the average sizes of file data and metadata, and let $S_S$ be the size of a segment of the virtual metadata storage. The life spans of the PRAM and the NAND flash memory of PFFS2 can then be expressed as follows:

$$L_{pram} = \frac{S_P E_P}{(S_M P_M + S_S P_S) f}, \tag{2}$$

$$L_{nand} = \frac{S_N E_N}{(S_D P_D + S_S P_S) f}. \tag{3}$$

On the other hand, if we let $S_{PG}$ be the size of a NAND flash page and $L_{conv}$ be the life span of conventional NAND flash file systems, $L_{conv}$ is expressed as follows:

$$L_{conv} = \frac{S_N E_N}{(S_D P_D + S_{PG} P_M) f}, \tag{4}$$

because at least one NAND flash memory page is consumed even for a small metadata update. From (3) and (4), we can see that $L_{nand}$ is always longer than $L_{conv}$ because the segment size is the same as the page size of NAND flash memory ($S_S = S_{PG}$) and metadata swapping occurs less frequently than metadata updates ($P_S < P_M$).

Therefore, according to (1), $L_{pram}$ must be longer than $L_{conv}$ for PFFS2 to have a longer life span than a conventional flash file system:

$$\frac{S_P E_P}{(S_M P_M + S_S P_S) f} > \frac{S_N E_N}{(S_D P_D + S_{PG} P_M) f}. \tag{5}$$
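As a rough numerical check of (2) through (5), the sketch below evaluates the three life spans in Python. All parameter values are assumptions made for illustration: the sizes follow the example given later in this section (4 KB of data and 32 bytes of metadata per transaction, 32 MB of PRAM, 40 GB of NAND flash memory), the 2 KB page and segment size matches the page size used by the benchmarks in Section 5.2, every transaction is assumed to update both data and metadata ($P_D = P_M = 1$), swapping is taken to be negligible ($P_S = 0$), and the NAND endurance `E_N` is a hypothetical placeholder. The function and variable names are illustrative and not part of PFFS2.

```python
def lifespans(S_P, S_N, E_P, E_N, S_M, S_D, S_S, S_PG, P_M, P_D, P_S, f=1.0):
    """Return (L_pram, L_nand, L_conv) following equations (2), (3), and (4)."""
    L_pram = S_P * E_P / ((S_M * P_M + S_S * P_S) * f)
    L_nand = S_N * E_N / ((S_D * P_D + S_S * P_S) * f)
    L_conv = S_N * E_N / ((S_D * P_D + S_PG * P_M) * f)
    return L_pram, L_nand, L_conv


KB, MB, GB = 1024, 1024 ** 2, 1024 ** 3
E_N = 10_000  # hypothetical NAND endurance in write cycles (assumption)

L_pram, L_nand, L_conv = lifespans(
    S_P=32 * MB, S_N=40 * GB,    # PRAM and NAND sizes from the example
    E_P=20 * E_N, E_N=E_N,       # PRAM endurance is 20 times that of NAND
    S_M=32, S_D=4 * KB,          # metadata and data bytes per transaction
    S_S=2 * KB, S_PG=2 * KB,     # segment size equals the NAND page size
    P_M=1.0, P_D=1.0, P_S=0.0,   # assumed request and swap probabilities
)

L_pffs = min(L_pram, L_nand)     # equation (1)
print("condition (5) holds:", L_pram > L_conv)
print("L_pffs / L_conv =", L_pffs / L_conv)
```

Under these assumptions the NAND flash memory is the limiting component of PFFS2, and the ratio of $L_{pffs}$ to $L_{conv}$ does not depend on the placeholder endurance value because $E_N$ cancels out.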

We can represent $P_D$ as $(1 - P_M)$, and the endurance of PRAM is 20 times that of NAND flash memory according to Table 1 ($E_P = 20 E_N$). With $S_S = S_{PG}$, (5) can be rewritten as follows:

$$\frac{20 S_P}{S_M P_M + S_{PG} P_S} > \frac{S_N}{S_D (1 - P_M) + S_{PG} P_M}. \tag{6}$$

If we divide the denominator of each side of (6) by $P_M$, it becomes

$$\frac{20 S_P}{S_M + S_{PG} \frac{P_S}{P_M}} > \frac{S_N}{S_D \left(\frac{1}{P_M} - 1\right) + S_{PG}}. \tag{7}$$

Because $P_S$ is much smaller than $P_M$, we can approximate (7) as follows:

$$\frac{20 S_P}{S_M} > \frac{S_N}{S_D \left(\frac{1}{P_M} - 1\right) + S_{PG}}. \tag{8}$$

The average probability of metadata requests $P_M$ then becomes

$$P_M < \frac{1}{1 + \frac{S_N S_M}{20 S_P S_D} - \frac{S_{PG}}{S_D}}. \tag{9}$$

Because $P_M$ cannot be greater than 1, (9) is always true when

$$\frac{S_N S_M}{20 S_P S_D} - \frac{S_{PG}}{S_D} < 0, \tag{10}$$

that is, when

$$\frac{S_N}{S_P} < \frac{20 S_{PG}}{S_M}. \tag{11}$$

In the same way, $L_{pram}$ is also longer than $L_{nand}$ if metadata swapping rarely occurs and

$$\frac{S_N}{S_P} < \frac{20 S_D P_D}{S_M P_M}. \tag{12}$$

Consequently, if the size of PRAM ($S_P$) is sufficiently large to meet the requirement of (11), $L_{pffs}$ is longer than $L_{conv}$. The amount of increased life span is $(L_{nand} - L_{conv})$ or $(L_{pram} - L_{conv})$, which is determined by (12). For example, if we assume that 4 KB of data and 32 bytes of metadata are updated for every transaction, we can satisfy (11) and (12) with a PRAM whose size is larger than 1/1,280 of the NAND flash memory size. Then, the currently developed 32 MB PRAM in Table 1 can be used to increase the life span of PFFS2 with 40 GB of NAND flash memory. With those sizes, the PRAM and NAND flash memory in PFFS2 are worn out after and 5: transactions, respectively. On the other hand, a conventional NAND flash file system totally wears out the NAND flash memory after 3: transactions. Therefore, PFFS2 has a 50 percent longer life span because of the virtual metadata storage.

4.2.2 Wear-Leveling

We should consider the wear-leveling of PRAM as well as that of NAND flash memory to prolong the life span of PFFS2.
Actually, PRAM wear-leveling is more critical for the life span of PFFS2 because metadata are updated more frequently than regular data. Also, the irregular metadata update pattern is directly reflected in PRAM because PRAM uses the in-place update scheme. We suggest forced segment swapping and word-level shifting for PRAM wear-leveling.

Fig. 6. Forced swap-out and word-level shifting for PRAM wear-leveling.

Forced segment swapping uses the segment swapping of the virtual metadata storage for the wear-leveling of PRAM. For forced segment swapping, PFFS2 temporarily keeps track of the write count for each segment of PRAM. If a segment that is stored in PRAM is updated frequently and the write count of that segment exceeds the threshold, we forcibly swap out the segment into NAND flash memory and reset the write count, as shown in Fig. 6a. This frees the PRAM segment and lets it cool down before other data are stored in it. When it is necessary to update data that were previously swapped out to NAND flash memory, PFFS2 automatically swaps the data into a different segment of PRAM, as shown in Fig. 6c. Therefore, forced segment swapping prevents a specific segment from being unevenly worn.

However, the worn-out level of each word within a segment cannot be controlled in this way. For example, if a segment holds the inodes of files, the words that store the size of a file (i_size) are updated more frequently than the other words in the segment. To moderate this problem, word-level shifting is proposed, as shown in Fig. 6b. Whenever PFFS2 swaps in a segment, it stores the segment after shifting the data by several words from the start of the segment. Word-level shifting prevents a specific usage pattern from being repeated in the segment.

5 EXPERIMENTAL RESULTS

5.1 Experimental Environment

In order to test PFFS2, we developed an evaluation board. This evaluation board has a 266 MHz ARM processor and 64 MB of SDRAM.
It also includes most of the storage devices that are currently used in embedded systems: NOR, SLC and MLC NAND, OneNAND, UtRAM, and PRAM. We were offered a 32 MB PRAM prototype and 1 GB of MLC NAND flash memory [4] by Samsung and implemented the PFFS2 in a Linux OS with a kernel version. Because PRAM exists only as a prototype, it is not freely available for testing varied and heavy workloads. We therefore used the real PRAM only to verify the operation and functionality of PFFS2, and the performance tests were carried out with an emulated PRAM. We analyzed the read/write time of PRAM and emulated it using power-backed UtRAM [44] with a software delay. Our emulation method is quite accurate because the PRAM read/write time is deterministic and the data in the power-backed UtRAM were not lost. After implementing PFFS2, we compared it on our evaluation board with JFFS2 and YAFFS2, which are popular and commonly used in embedded systems.
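The emulation idea described above (ordinary RAM plus a software delay that stands in for the deterministic PRAM access time) can be illustrated with a minimal Python sketch. It is not the emulator used in our experiments: the class name, the 2-byte word size, and the per-word latencies are hypothetical values chosen only to make the example run, and a real kernel-level emulator would calibrate the delays against the measured PRAM timings.

```python
import time


class EmulatedPRAM:
    """RAM-backed buffer that adds a fixed software delay per accessed word."""

    WORD = 2  # bytes per word (hypothetical)

    def __init__(self, size, read_ns_per_word, write_ns_per_word):
        self.mem = bytearray(size)
        self.read_ns = read_ns_per_word    # hypothetical read latency
        self.write_ns = write_ns_per_word  # hypothetical write latency

    def _check(self, addr, nbytes):
        if addr < 0 or nbytes < 0 or addr + nbytes > len(self.mem):
            raise IndexError("access outside emulated PRAM")

    def _delay(self, nbytes, ns_per_word):
        words = -(-nbytes // self.WORD)  # round up to whole words
        deadline = time.perf_counter_ns() + words * ns_per_word
        while time.perf_counter_ns() < deadline:
            pass  # busy-wait so that the delay is deterministic

    def read(self, addr, nbytes):
        self._check(addr, nbytes)
        self._delay(nbytes, self.read_ns)
        return bytes(self.mem[addr:addr + nbytes])

    def write(self, addr, data):
        self._check(addr, len(data))
        self._delay(len(data), self.write_ns)
        self.mem[addr:addr + len(data)] = data  # byte-level in-place update


pram = EmulatedPRAM(1024, read_ns_per_word=100, write_ns_per_word=1000)
pram.write(16, b"inode")
print(pram.read(16, 5))
```

Because the delay depends only on the number of words accessed, the emulated access time is deterministic, which is the property that makes this kind of emulation reasonable for PRAM.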

5.2 PFFS2 Performance

5.2.1 Microbenchmark

To evaluate the performance of PFFS2, we compared the file read/write performance of PFFS2 to that of JFFS2 and YAFFS2. First, we used three microbenchmarks to compare the metadata and data write performance of PFFS2. The first microbenchmark is a small file write test. It created 10,240 2 KB files using Linux's dd command. Each file created by the dd command requires one NAND page write and one file metadata creation. The second benchmark is a large file write test. It simply wrote a 200 MB file to the file system with a block size of 2 KB. In contrast to the Sprite small file benchmark, it causes the writes of 102,400 NAND data pages and only one metadata update during execution. The third benchmark is designed to untar a 51 MB archive that contains a clean Linux source tree. The Linux source tree contains 19,528 files in 1,191 directories, and the total size of the source tree is 266 MB. The untar command created new files that averaged about 14 KB and updated the file attributes by sequentially calling the utimes, chown, and chmod system calls.

Fig. 7. Normalized execution time for different microbenchmarks.

Fig. 7 shows the normalized execution time of each benchmark. For the small file and untar tests, PFFS2 shows better performance than the other flash file systems. This is mainly because PFFS2 reduces the file metadata creation overhead by using the virtual metadata storage. Compared with the small file write, the untar benchmark contains both metadata creation and metadata updates. PFFS2 shows a greater performance improvement for the untar benchmark because the metadata updates are more frequent. For the large file write benchmark, the performance of YAFFS2 is slightly better than that of PFFS2. During a large file write, the benefits of metadata update performance were not demonstrated because only one metadata update occurred.
Moreover, PFFS2 needed to update the page bitmap and block information table in the virtual metadata storage whenever it wrote a data page. Although these additional metadata writes lower the data write performance, the degradation is very small and can be ignored once the write speed of the current PRAM prototype is improved.

5.2.2 Macrobenchmark

We used the PostMark benchmark to analyze the small file read/write performance in detail. The PostMark benchmark is widely used and is designed to simulate an ISP workload including e-mails and Web-based transactions [45]. It is well known for measuring file system performance over a workload composed of many short-lived, relatively small files. We configured PostMark to create 200 files with a file size range of 512 bytes to 16 KB and executed the benchmark while increasing the number of transactions from 20,000 to 100,000. For each experiment, the ratio of file creation to deletion was 5:5, and the ratio of read to write operations was configured to be 9:1 for a read-intensive workload and 1:9 for a write-intensive workload.

Fig. 8. Execution time of PostMark benchmark for each flash file system. (a) Write-intensive configuration (Read:Write = 1:9). (b) Read-intensive configuration (Read:Write = 9:1).

Fig. 9. Number of written pages for each flash file system during PostMark benchmark execution. (a) Write-intensive configuration (Read:Write = 1:9). (b) Read-intensive configuration (Read:Write = 9:1).

Fig. 8 shows the execution time for each number of transactions. The total number of written pages during execution is shown in Fig. 9. From these figures, we can see that the transaction time of NAND flash file systems increases with the number of page writes. Generally, NAND flash memory has no seek latency and the page read speed is fast. On the other hand, page writes are much slower than page reads and also generate dead pages, which trigger the garbage collection of the NAND flash file system. Therefore, the performance of a NAND flash file system is highly related to the number of page writes during execution.

By comparing the performance of the three flash file systems, we notice that the execution time of JFFS2 is longer than that of PFFS2 and YAFFS2, although JFFS2 causes the fewest NAND page writes during execution. JFFS2 manages data and metadata in units of nodes and stores each node in NAND flash memory after data compression. Also, JFFS2 uses a large write cache, and most write operations are performed asynchronously in the main memory, even though data could be lost in JFFS2 when the write cache is not flushed before powering down. Therefore, JFFS2 saves many NAND flash page writes, and its number of written pages is lower than that of YAFFS2 or PFFS2. However, in order to reduce the number of page writes, JFFS2 needs data compression and CRC operations for every read/write operation. JFFS2 is seriously affected by the CPU speed, and its performance is lower than that of other file systems in embedded systems that have low computing power.

On the other hand, PFFS2 does not use any compute-intensive operations. The virtual metadata storage allows PFFS2 to choose much simpler data structures for metadata management, so PFFS2 has little computing overhead to manage metadata. Consequently, the performance of PFFS2 is more than three times better than the performance of JFFS2.

In the case of YAFFS2, the computing overhead is insignificant, but the number of written pages is much higher than that of PFFS2 and JFFS2. As we explained before, many page writes in YAFFS2 occur for small metadata updates during PostMark execution. However, the metadata updates of PFFS2 mostly occur in the virtual metadata storage. PFFS2 can therefore reduce the NAND flash page writes caused by metadata updates.
Consequently, the metadata update overhead of PFFS2 is reduced far more than that of other flash file systems. The overall performance gain of PFFS2 grows with the number of metadata updates. The performance of PFFS2 is at most 38 percent better than that of YAFFS2. The performance of flash file systems is also affected by garbage collection. Indeed, the slope of the YAFFS2 transaction time increases after 60,000 transactions. This is the effect of garbage collection, which will be discussed in Section 5.3.

5.2.3 Impact of PRAM Size

In the previous performance experiments, we assumed 32 MB of PRAM, which is the current size of the prototype PRAM and is enough to store all the metadata of the files created during all the benchmark executions. In this section, we show how the performance of PFFS2 is affected when the size of the PRAM of the virtual metadata storage is insufficient. We executed the PostMark benchmark while reducing the size of PRAM from 10 to 1 MB. We reconfigured PostMark to create 2,000 files with a file size range of 512 bytes to 32 KB in order to increase the metadata footprint to be stored in PRAM. We performed 40,000 transactions with a read/write ratio of 1:9, and the other configuration is the same as that of the previous PostMark tests. During the transactions, a total of 19,921 files are newly created and deleted, and a total of 500 MB of data are written.

We also used a modified version of the PostMark benchmark for the experiments. Because the PostMark benchmark randomly selects files to perform transactions, it does not have any specific file access pattern. In order to show the performance effect of locality, we divided the files created during PostMark execution into two groups. One group contained 20 percent of the files, and its files were selected with 80 percent probability during the transactions. The other group contained 80 percent of the files, but they were selected with only 20 percent probability.
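The two-group file selection of the modified benchmark can be sketched as follows. This is a minimal Python illustration and not the actual PostMark modification: the function name `make_selector`, its parameters, and the choice to treat the first 20 percent of the file list as the hot group are all assumptions made for the example.

```python
import random


def make_selector(files, hot_fraction=0.2, hot_probability=0.8, seed=None):
    """Return a function that picks a file with 80/20-style access locality."""
    rng = random.Random(seed)
    n_hot = max(1, int(len(files) * hot_fraction))
    hot, cold = files[:n_hot], files[n_hot:]  # hypothetical grouping by position

    def pick():
        # Choose the hot group with probability hot_probability,
        # then choose uniformly at random inside the selected group.
        use_hot = rng.random() < hot_probability or not cold
        return rng.choice(hot if use_hot else cold)

    return pick


files = list(range(2000))  # stand-ins for the 2,000 PostMark files
pick = make_selector(files, seed=1)
sample = [pick() for _ in range(100_000)]
hot_share = sum(f < 400 for f in sample) / len(sample)
print("share of accesses to the hot 20 percent of files:", hot_share)
```

Setting `hot_fraction` equal to `hot_probability` makes every file equally likely, which corresponds to the uniform random access pattern of the unmodified benchmark.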
Using the modified benchmark, we can evaluate how the performance of PFFS2 depends on metadata access locality.

Fig. 10. Execution time of benchmarks along with the size of PRAM.

Fig. 10 shows the execution time of each benchmark. First of all, almost 10 MB of space is necessary to store all the metadata during the benchmark execution. Nevertheless, PFFS2 operates without problems even when we severely reduce the size of the PRAM used in PFFS2. Actually, in the current implementation, 672 KB of PRAM is always reserved to maintain the segment table, segment bitmap, and page bitmap because we assume 128 MB of virtual metadata storage, 32 MB of PRAM, and 1 GB of NAND flash memory. Therefore, if the size of PRAM is larger than 672 KB, PFFS2 operates well. We can see that the performance of PFFS2 is higher than that of YAFFS2 before the PRAM size is reduced below 4 MB, which is 40 percent of the necessary metadata space (10 MB). However, as the size of the PRAM used in PFFS2 decreases, the transaction time of PFFS2 increases. This result is caused by metadata swapping. If the size of PRAM is insufficient to store the metadata of all files, the NAND flash memory is also used for the metadata storage, and infrequently used metadata are swapped out to the NAND flash memory. The metadata swapping causes additional NAND flash and PRAM read/write operations and reduces the performance of PFFS2. Although 10 MB is sufficient to maximize the performance in the current experiment, the required PRAM size increases with the number of files stored in PFFS2.

However, when there is higher locality in the file access patterns, the performance of PFFS2 is much better with a limited size of PRAM. For example, when we perform 40,000 transactions with uniform random access patterns, the transaction time of PFFS2 with 4 MB of PRAM is 11 percent longer than the transaction time of YAFFS2. On the other hand, if there is locality in the file access patterns, the transaction time of PFFS2 becomes six percent shorter than that of YAFFS2. This means that the swapping cost caused by the small PRAM can be reduced by locality. Because it is common to assume locality in most file systems, we can expect reasonable performance from PFFS2 through the virtual management of the metadata space, without concerning ourselves with the size limitations of PRAM.

5.3 Garbage Collection

Most of the previous works performed simulations for garbage collection performance tests and used only the erase count and the number of live page copies as performance metrics of garbage collection. However, as we mentioned in Section 4.1, the additional overhead to manage and determine the pages for garbage collection must be considered for an embedded system. We introduce the effective garbage collection time to compare the substantial garbage collection performance, including the computing and management overhead.
Because the final goal of garbage collection is to create free pages, the performance of garbage collection should be determined by the time that elapses in creating free pages. Therefore, the effective garbage collection time is defined as in (13):

$$P_{GC} = \frac{T_{GC}}{N_{free}}. \tag{13}$$

$T_{GC}$ is the total elapsed time for garbage collection. It includes the time for live page copies, block erases, block management, and block searching for garbage collection. $N_{free}$ is the number of free pages reclaimed by garbage collection. Consequently, the effective garbage collection time shows how much time is necessary to generate one free page during garbage collection.

Fig. 11. The effective garbage collection time.

Fig. 11 shows the experimental results of $P_{GC}$ (see footnote 1). From the result, the effective garbage collection time of PFFS2 is much shorter than that of YAFFS2 and JFFS2. This means that the PFFS2 garbage collector reclaims free pages more effectively than JFFS2 or YAFFS2. JFFS2 starts garbage collection when the number of remaining free pages is lower than the threshold. JFFS2 performs garbage collection less often than YAFFS2 and PFFS2 do because JFFS2 minimizes the number of page writes using data compression. However, the garbage collection of JFFS2 occurs in the context of a new kernel thread or a user process, and it creates high computing overhead because the metadata of JFFS2 have to be recompressed and written out in a new node during garbage collection.

1. In this experiment, the $P_{GC}$ of JFFS2 at 20,000 transactions is not calculated because $N_{free}$ is zero.

Fig. 12. Number of dead pages in blocks without garbage collection.

On the other hand, YAFFS2 has no significant computing overhead for garbage collection. However, in YAFFS2, the hot metadata pages are mixed up with data pages, and YAFFS2 therefore has difficulty finding blocks for garbage collection. Fig. 12 shows the dead page generation pattern in YAFFS2 and PFFS2. It shows the number of dead pages of the blocks in YAFFS2 and PFFS2 when the 40,000 transactions of the PostMark benchmark are executed without garbage collection. If all pages (128 pages) in a block are dead pages, we can easily perform garbage collection by erasing the block. In the YAFFS2 results, 3,500 blocks were used during PostMark benchmark execution, and most of the blocks did not consist entirely of dead pages. YAFFS2 took a longer time to find a block that had only dead pages. YAFFS2 even failed to find a candidate block for reclamation and just wasted time without creating free pages. Therefore, its effective garbage collection time was much longer than that of the other flash file systems. In particular, after YAFFS2 performs 60,000 transactions, there are no remaining free pages in the NAND flash memory. In this situation, the number of pages reclaimed by YAFFS2 suddenly increases because YAFFS2 performs aggressive garbage collection and is forced to reclaim free pages even though the selected block does not contain many dead pages. Although the increase in reclaimed free pages reduces the effective garbage collection time of YAFFS2, it is still much longer than that of PFFS2, and it makes the garbage collection performance of YAFFS2 fluctuate along with the benchmark transactions. Moreover, the burst reclamations by aggressive garbage collection result in the rapid increase of the transaction time of YAFFS2, as we mentioned in Section 5.2.

Fig. 13. Comparison of mounting time of each file system.

PFFS2 performs garbage collection in the same context as the file system and does not require additional computing overhead for compression and node generation. Furthermore, as we explained in Section 4.1, because of the virtual metadata storage, PFFS2 can perform garbage reclamation continuously and maintains many free pages in NAND flash memory. As shown in Fig. 12, PFFS2 uses about 2,700 blocks, and most of the blocks in PFFS2 have only dead pages before garbage collection. PFFS2 rarely failed to find a block for reclamation and never performed aggressive garbage collection in our experiments.
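The metric in (13) can be written down directly. The following Python sketch is only an illustration of the definition: the `GCRun` record, its field names, and the sample timings are hypothetical, and the four time components simply mirror the items that $T_{GC}$ is said to include. A run that reclaims no free pages has no defined $P_{GC}$, which is the situation described in footnote 1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GCRun:
    """Measured costs of garbage collection over one benchmark run (hypothetical)."""
    copy_s: float    # time spent copying live pages
    erase_s: float   # time spent erasing blocks
    manage_s: float  # time spent on block management
    search_s: float  # time spent searching for victim blocks
    free_pages: int  # N_free: free pages reclaimed

    @property
    def total_s(self) -> float:
        """T_GC: total elapsed time for garbage collection."""
        return self.copy_s + self.erase_s + self.manage_s + self.search_s


def effective_gc_time(run: GCRun) -> Optional[float]:
    """P_GC = T_GC / N_free, or None when no free page was reclaimed."""
    if run.free_pages == 0:
        return None
    return run.total_s / run.free_pages


# Illustrative numbers only, not measurements from the paper.
busy_search = GCRun(copy_s=0.5, erase_s=0.25, manage_s=0.125, search_s=0.125,
                    free_pages=100)
no_reclaim = GCRun(copy_s=0.0, erase_s=0.0, manage_s=0.0, search_s=0.5,
                   free_pages=0)
print(effective_gc_time(busy_search), effective_gc_time(no_reclaim))
```

Because search and management time are part of the numerator, a collector that spends a long time looking for victim blocks is penalized even if it copies few live pages, which is the effect that the erase count and live page copy count alone do not capture.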
Due to the virtual metadata storage, the effective garbage collection time of PFFS2 is much shorter than that of other flash file systems, and the garbage collection of PFFS2 is stable during transactions.

5.4 Scalability

We also compared the mounting time and memory usage to test the scalability of the NAND flash file systems. Figs. 13 and 14 show the results. Because the scalability of NAND flash file systems is related to file system utilization and the number of files, the entire test was performed after we had written a set of files into the file systems. The x-axis of the figures represents the file size and the number of written files of each test set.

Fig. 14. Comparison of memory usage of each file system.

For the mounting time in Fig. 13, we can easily see that the mounting time of JFFS2 and YAFFS2 is much longer than that of PFFS2 (see footnote 2). Moreover, the mounting time of JFFS2 and YAFFS2 rapidly increased with file system utilization and the number of files, whereas the mounting time of PFFS2 remained at 0.179 seconds. Because PFFS2 never scans any region of NAND flash memory during mounting, PFFS2 can be mounted in a constant time (0.179 seconds) even if 10 or 100 GB of flash memory is installed.

2. Actually, the mounting time of JFFS2 of the test set 4 KB/12,800 and 4 KB/25,600 were 45.2 and seconds. Because this is much longer than the other results, we do not represent those results in Fig. 13.

Similar to the mounting time, the memory usage of JFFS2 and YAFFS2 was not scalable, as shown in Fig. 14. This shows that JFFS2 and YAFFS2 dynamically manage a great amount of data in memory. In particular, if there are many files in JFFS2 and YAFFS2, the system needs a large amount of memory to store the directory structures. However, the memory usage of PFFS2 is fixed because PFFS2 stores whole directories and flash memory management structures in PRAM. As a result, PFFS2 is scalable in both mounting time and memory usage, and we can easily apply PFFS2 to any resource-constrained embedded system.

5.5 Wear-Leveling of PRAM

Fig. 15. Write access pattern of PRAM without wear-leveling.

Fig. 15 shows the write counts of PRAM in PFFS2 when we execute the PostMark benchmark with the configuration of executing 80,000 transactions without wear-leveling. The x-axis shows the byte address in PRAM, and the y-axis represents the write count of each byte. From this figure, we can see that the writes are concentrated in a specific region of PRAM: some regions have about 700 writes although the write counts of most other regions are under 100. What is worse, there are numerous unused regions in PRAM. This shows that PFFS2 has a PRAM wear-leveling problem.

Fig. 16. Write access pattern of PRAM with wear-leveling.

Fig. 16 shows the results after we applied the wear-leveling design to solve this problem. If we apply our segmentation design, the maximum write counts are reduced to fewer than 200 for all bytes in PRAM. The writes that were concentrated on the beginning parts of PRAM in the previous test are moved to the unused segments by changing segments. If more writes occur in PRAM, these unused segments are continuously used instead of increasing the write counts of the previously used segments. Therefore, using our segmentation technique, no byte in PRAM is worn out before the NAND flash memory is worn out.

6 CONCLUSION AND FUTURE WORK

In this paper, we presented a high-performance scalable flash memory file system, called PFFS2, that employs PRAM as a metadata storage component. The proposed virtual metadata storage increases the metadata update performance and reduces the garbage collection overhead even if the size of PRAM is limited. It also solves the scalability problem of previous NAND flash file systems. For the life span of PFFS2, it reduces the total write count of NAND flash memory. For PRAM, the proposed wear-leveling method ensures that PRAM does not wear out before NAND flash memory does.

Currently, the limitation of PFFS2 is that it needs PRAM as metadata storage. However, PRAM is expected to replace NOR flash memory for code execution, and PRAM will be used with NAND flash memory in future embedded systems. When this happens, PFFS2 will be more easily applied to various embedded systems.

As our future work, we will analyze the energy consumption of PFFS2 and design a new low-power storage system with PRAM and NAND flash memory. Because PRAM consumes less energy than NAND flash memory for word-level writing, the virtual metadata storage can reduce the power consumption of PFFS2. The current design of PFFS2 does not consider data buffering. Because PRAM is a byte-addressable storage, both PRAM and main memory can be used as a hybrid cache for PFFS2 for buffering small-size data. How to further improve the performance of PFFS2 using this hybrid cache is one of the interesting topics. We also found that the nonvolatility of PRAM can be used to guarantee the persistency of data updates. The virtual metadata storage and hybrid cache will be extended for data durability and file system consistency.

REFERENCES

[1] F. Douglis, R. Caceres, F. Kaashoek, K. Li, B. Marsh, and J.A. Tauber, Storage Alternatives for Mobile Computers, Proc. First Symp. Operating Systems Design and Implementation (OSDI),
[2] C. Park et al., A High Performance Controller for NAND Flash-Based Solid State Disk (NSSD), Proc. 21st Non-Volatile Semiconductor Memory Workshop,
[3] C.-G. Hwang, Nanotechnology Enables a New Memory Growth Model, Proc. IEEE, vol. 91, no. 11, pp , Nov
[4] K9G8G08U0M Data Sheet,
[5] A. Kawaguchi, S. Nishioka, and H. Motoda, A Flash-Memory Based File System, Proc. USENIX Ann. Technical Conf.,
[6] D. Woodhouse, JFFS: The Journaling Flash File System, Proc. Ottawa Linux Symp. (OLS),
[7] YAFFS: A Flash File System for Embedded Use,
[8] J. Piernas, T. Cortes, and J.M. Garcia, The Design of New Journaling File Systems: The DualFS Case, IEEE Trans. Computers, vol. 56, no.
2, pp , Feb
[9] G.R. Ganger and Y. Patt, Metadata Update Performance in File Systems, Proc. First USENIX Symp. Operating Systems Design and Implementation (OSDI),
[10] E.L. Miller, S.A. Brandt, and D.D.E. Long, HeRMES: High-Performance Reliable MRAM-Enabled Storage, Proc. Eighth IEEE Workshop Hot Topics in Operating Systems (HotOS-VIII),
[11] M. Baker, S. Asami, E. Deprit, J. Ousterhout, and M. Seltzer, Non-Volatile Memory for Fast, Reliable File Systems, Proc. Fifth Int'l Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS),
[12] YAFFS A NAND-Flash Filesystem, FOSDEM Presentation, ,
[13] M.-K. Choi et al., A 0.25um 3.0V 1T1C 32Mb Nonvolatile Ferroelectric RAM with Address Transition Detector and Current Forcing Latch Sense Amplifier Scheme, Proc. IEEE Int'l Conf. Integrated Circuit Design and Technology,
[14] MR2A16A MRAM Datasheet,
[15] K.-J. Lee et al., A 90nm 1.8V 512Mb Diode-Switch PRAM with 266MB/s Read Throughput, IEEE J. Solid-State Circuits, vol. 43, no. 1, pp , Jan
[16] F. Bedeschi et al., A Multi-Level-Cell Bipolar-Selected Phase-Change Memory, Proc. IEEE Int'l Solid-State Circuits Conf.,
[17] S. Raoux et al., Phase-Change Random Access Memory: A Scalable Technology, IBM J. Research and Development, vol. 52, nos. 4/5, pp , July-Sept
[18] M.K. Qureshi et al., Enhancing Lifetime and Security of PCM-Based Main Memory with Start-Gap Wear Leveling, Proc. 42nd Ann. IEEE/ACM Int'l Symp. Microarchitecture,
[19] B.C. Lee et al., Architecting Phase Change Memory as a Scalable DRAM Alternative, Proc. 36th Ann. Int'l Symp. Computer Architecture,
[20] Samsung Introduces Working Prototype of PRAM,
[21] Intel, STMicroelectronics Deliver Industry's First Phase Change Memory Prototypes
[22] Y. Park et al., PFFS: A Scalable Flash Memory File System for the Hybrid Architecture of Phase Change RAM and NAND Flash, Proc. ACM Symp. Applied Computing,
[23] J.K. Kim, H.G. Lee, S. Choi, and K.I. Bahng, A PRAM and NAND Flash Hybrid Architecture for High-Performance Embedded Storage Subsystems, Proc. Eighth ACM and IEEE Int'l Conf. Embedded Software,
[24] M. DeVoss, The Winds of Phase Change are Blowing, Market Brief, iSuppli, May
[25] KPS5615EZM Data Sheet,

14 334 IEEE TRANSACTIONS ON COMPUTERS, VOL. 60, NO. 3, MARCH 2011 [26] G.H. Koh et al., PRAM Process Technology, Proc. IEEE Int l Conf. Integrated Circuit Design and Technology, [27] S.-H. Lim and K.-H. Park, An Efficient NAND Flash File System for Flash Memory Storage, IEEE Trans. Computers, vol. 5, no. 7, pp , July [28] M. Wu and W. Zwaenepoel, envy: A Non-Volatile, Main Memory Storage System, Proc. Sixth Int l Conf. Architectural Support for Programming Languages and Operating Systems (ASPLOS), [29] A.-I.A. Wang et al., Conquest: Better Performance Through a Disk/Persistent-RAM Hybrid File System, Proc. USENIX Ann. Technical Conf., [30] C. Lee, S.H. Baek, and K.H. Park, A Hybrid Flash File System Based on NOR and NAND Flash Memories for Embedded Devices, IEEE Trans. Computers, vol. 57, no. 7, pp , July [31] I.H. Doh, J. Choi, D. Lee, and S.H. Noh, Exploiting Non-Volatile RAM to Enhance Flash File System Performance, Proc. Seventh ACM and IEEE Int l Conf. Embedded Software, [32] J.R. Douceur and W.J. Bolosky, A Large-Scale Study of File- System Contents, Proc. ACM SIGMETRICS, [33] W. Vogels, File System Usage in Windows NT 4.0, Proc. 17th ACM Symp. Operating Systems Principles, [34] M. Rosenblum and J.K. Ousterhout, The Design and Implementation of a Log-Structured File System, ACM Trans. Computer Systems, vol. 10, pp , [35] 3PARdata, Inc., Thin Provisioning, [36] S. Kang and A.L. Narasimha Reddy, An Approach to Virtual Allocation in Storage Systems, ACM Trans. Storage, vol. 2, pp , [37] D. Lee, J. Choi, J.-H. Kim, S.H. Noh, S.L. Min, Y. Cho, and C.S. Kim, LRFU: A Spectrum of Policies that Subsumes the Least Recently Used and Least Frequently Used Policies, IEEE Trans. Computers, vol. 50, no. 12, pp , Dec [38] D. Rselli, J.R. Lorch, and T.E. Anderson, A Comparison of File System Workloads, Proc. USENIX Ann. Technical Conf., [39] N. Agrawal et al., A Five-Year Study of File-System Metadata, Proc. Fifth Conf. File and Storage Technologies (FAST 07), [40] A. 
Kawaguchi, S. Nishioka, and H. Motoda, A Flash Memory Based File System, Proc. USENIX Ann. Technical Conf., [41] H.-J. Kim and S.-G. Lee, A New Flash Memory Management for Flash Storage System, Proc. 23rd Ann. Int l Computer Software and Applications Conf., [42] L.-P. Chang, T.-W. Kuo, and S.-Wu, Real-Time Garbage Collection for Flash-Memory Storage Systems of Real-Time Embedded Systems, ACM Trans. Embedded Computing Systems, vol. 3, pp , [43] M.-L. Chiang, C.-L. Cheng, and C.-H. Wu, A New FTL-Based Flash Memory Management Scheme with Fast Cleaning Mechanism, Proc. Int l Conf. Embedded Software and Systems, [44] K1S5616BCM Data Sheet, [45] J. Katcher, PostMark: A New File System Benchmark, Technical Report TR3022, Network Appliance Inc., Youngwoo Park received the BS and MS degrees in the division of electrical engineering from the Korea Advanced Institute of Science and Technology (KAIST) in 2004 and 2006, respectively. He is currently working toward the PhD degree in the division of electrical engineering at KAIST. His research interests include storage systems, flash file systems, and embedded systems. He is a member of the IEEE and the IEEE Computer Society. Kyu Ho Park received the BS degree in electronics engineering from Seoul National University, Korea, in 1973, the MS degree in electrical engineering from the Korea Advanced Institute of Science and Technology (KAIST) in 1975, and the DrIng degree in electrical engineering from the University de Paris XI, France, in He has been a professor in the Division of Electrical Engineering at KAIST since He was a president of the Korea Institute of Next Generation Computing for the period His research interests include computer architectures, file systems, storage systems, ubiquitous computing, and parallel processing. 
He is a member of the Korea Information Science Society (KISS), the Korea Institute of Telematics and Electronics (KITE), the Korea Institute of Next Generation Computing, the IEEE, the IEEE Computer Society and the ACM.. For more information on this or any other computing topic, please visit our Digital Library at

More information

Operating System Concepts

Operating System Concepts Chapter 9: Virtual-Memory Management 9.1 Silberschatz, Galvin and Gagne 2005 Chapter 9: Virtual Memory Background Demand Paging Copy-on-Write Page Replacement Allocation of Frames Thrashing Memory-Mapped

More information

File System Internals. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

File System Internals. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University File System Internals Jin-Soo Kim (jinsookim@skku.edu) Computer Systems Laboratory Sungkyunkwan University http://csl.skku.edu Today s Topics File system implementation File descriptor table, File table

More information

Main Points. File layout Directory layout

Main Points. File layout Directory layout File Systems Main Points File layout Directory layout File System Design Constraints For small files: Small blocks for storage efficiency Files used together should be stored together For large files:

More information

OPERATING SYSTEM. Chapter 12: File System Implementation

OPERATING SYSTEM. Chapter 12: File System Implementation OPERATING SYSTEM Chapter 12: File System Implementation Chapter 12: File System Implementation File-System Structure File-System Implementation Directory Implementation Allocation Methods Free-Space Management

More information

Secondary Storage (Chp. 5.4 disk hardware, Chp. 6 File Systems, Tanenbaum)

Secondary Storage (Chp. 5.4 disk hardware, Chp. 6 File Systems, Tanenbaum) Secondary Storage (Chp. 5.4 disk hardware, Chp. 6 File Systems, Tanenbaum) Secondary Stora Introduction Secondary storage is the non volatile repository for (both user and system) data and programs. As

More information

Indexing. Jan Chomicki University at Buffalo. Jan Chomicki () Indexing 1 / 25

Indexing. Jan Chomicki University at Buffalo. Jan Chomicki () Indexing 1 / 25 Indexing Jan Chomicki University at Buffalo Jan Chomicki () Indexing 1 / 25 Storage hierarchy Cache Main memory Disk Tape Very fast Fast Slower Slow (nanosec) (10 nanosec) (millisec) (sec) Very small Small

More information

FILE SYSTEM IMPLEMENTATION. Sunu Wibirama

FILE SYSTEM IMPLEMENTATION. Sunu Wibirama FILE SYSTEM IMPLEMENTATION Sunu Wibirama File-System Structure Outline File-System Implementation Directory Implementation Allocation Methods Free-Space Management Discussion File-System Structure Outline

More information

Flash Memory Based Storage System

Flash Memory Based Storage System Flash Memory Based Storage System References SmartSaver: Turning Flash Drive into a Disk Energy Saver for Mobile Computers, ISLPED 06 Energy-Aware Flash Memory Management in Virtual Memory System, islped

More information

Optimizing Translation Information Management in NAND Flash Memory Storage Systems

Optimizing Translation Information Management in NAND Flash Memory Storage Systems Optimizing Translation Information Management in NAND Flash Memory Storage Systems Qi Zhang 1, Xuandong Li 1, Linzhang Wang 1, Tian Zhang 1 Yi Wang 2 and Zili Shao 2 1 State Key Laboratory for Novel Software

More information

ECE 598 Advanced Operating Systems Lecture 18

ECE 598 Advanced Operating Systems Lecture 18 ECE 598 Advanced Operating Systems Lecture 18 Vince Weaver http://web.eece.maine.edu/~vweaver vincent.weaver@maine.edu 5 April 2016 Homework #7 was posted Project update Announcements 1 More like a 571

More information

File Systems: Fundamentals

File Systems: Fundamentals File Systems: Fundamentals 1 Files! What is a file? Ø A named collection of related information recorded on secondary storage (e.g., disks)! File attributes Ø Name, type, location, size, protection, creator,

More information

A Buffer Replacement Algorithm Exploiting Multi-Chip Parallelism in Solid State Disks

A Buffer Replacement Algorithm Exploiting Multi-Chip Parallelism in Solid State Disks A Buffer Replacement Algorithm Exploiting Multi-Chip Parallelism in Solid State Disks Jinho Seol, Hyotaek Shim, Jaegeuk Kim, and Seungryoul Maeng Division of Computer Science School of Electrical Engineering

More information

V. File System. SGG9: chapter 11. Files, directories, sharing FS layers, partitions, allocations, free space. TDIU11: Operating Systems

V. File System. SGG9: chapter 11. Files, directories, sharing FS layers, partitions, allocations, free space. TDIU11: Operating Systems V. File System SGG9: chapter 11 Files, directories, sharing FS layers, partitions, allocations, free space TDIU11: Operating Systems Ahmed Rezine, Linköping University Copyright Notice: The lecture notes

More information

Understanding SSD overprovisioning

Understanding SSD overprovisioning Understanding SSD overprovisioning Kent Smith, LSI Corporation - January 8, 2013 The over-provisioning of NAND flash memory in solid state drives (SSDs) and flash memory-based accelerator cards (cache)

More information

Che-Wei Chang Department of Computer Science and Information Engineering, Chang Gung University

Che-Wei Chang Department of Computer Science and Information Engineering, Chang Gung University Che-Wei Chang chewei@mail.cgu.edu.tw Department of Computer Science and Information Engineering, Chang Gung University Chapter 10: File System Chapter 11: Implementing File-Systems Chapter 12: Mass-Storage

More information

Compressed Swap for Embedded Linux. Alexander Belyakov, Intel Corp.

Compressed Swap for Embedded Linux. Alexander Belyakov, Intel Corp. Compressed Swap for Embedded Linux Alexander Belyakov, Intel Corp. Outline. 1. Motivation 2. Underlying media types 3. Related works 4. MTD compression layer driver place in kernel architecture swap-in/out

More information