CrashMonkey: A Framework to Systematically Test File-System Crash Consistency. Ashlie Martinez Vijay Chidambaram University of Texas at Austin


Crash Consistency File-system updates change multiple blocks on storage: data blocks, inodes, and the superblock may all need updating, and the changes need to happen atomically. We need to ensure the file system is consistent if the system crashes: data is not lost or corrupted, file data is correct, links to directories and files are unaffected, and all free data blocks are accounted for. Techniques: journaling, copy-on-write. Crash consistency is complex and hard to implement.

Testing Crash Consistency Randomly power cycling a VM or machine: random crashes are unlikely to reveal bugs, and restarting the machine or VM after each crash is slow. Killing a user-space file-system process: requires a special file-system design. Both approaches are ad hoc; despite its importance, crash consistency has no standardized or systematic tests.

What Really Needs to Be Tested? Current tests write data to disk each time, but crashing while writing data is not the goal. The true goal is to generate the disk states that a crash could cause.

CrashMonkey A framework to test crash consistency. It works by constructing crash states for a given workload, does not require rebooting the OS/VM, is file-system agnostic, and is modular and extensible. It currently tests 100,000 crash states in ~10 minutes.

Outline Overview How Consistency is Tested Today Linux Writes CrashMonkey Preliminary Results Future Plans Conclusion 6

How Consistency Is Tested Today Power cycle a machine or VM: crash the machine/VM while data is being written to disk, then reboot the machine and check the file system. Random and slow. Run the file system in user space (the ZFS test strategy): kill the file-system user process during write operations. Requires the file system to be able to run in user space.

Outline Overview How Consistency is Tested Today Linux Writes CrashMonkey Preliminary Results Future Plans Conclusion 8

Linux Storage Stack VFS: provides a consistent interface across file systems. Page Cache: holds recently used files and data. File System: ext, NTFS, etc. Generic Block Layer: interface between file systems and device drivers. Block Device Driver: device-specific driver. Disk Cache: caches data on the block device. Block Device: persistent storage device.

Linux Writes Write flags are metadata attached to operations sent to the device driver; they change how the OS and device driver order operations (both the I/O scheduler and the disk cache reorder requests). sync denotes that a process is waiting for this write, and orders writes issued with sync in that process. flush means all data in the device cache should be persisted; if the request itself carries data, that data may not yet be persisted when the request returns. Forced Unit Access (FUA) returns only when the request's data is persisted; it is often paired with flush so that all data, including the request itself, is durable.

Linux Writes Data is written to disk in epochs, each terminated by a flush and/or FUA operation; requests may be reordered within an epoch. The operating system adheres to the FUA, flush, and sync flags; the block device adheres to the FUA and flush flags. Example: A: write, B: write (meta), C: write (sync), D: flush form epoch 1; E: write (sync), F: write (sync), G: write (sync), H: FUA + flush form epoch 2.
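The epoch-splitting rule above can be sketched in a few lines. The struct and function names here are illustrative, not CrashMonkey's actual types:

```cpp
#include <string>
#include <vector>

// A recorded block-layer write and the flags that matter for epochs.
// (Illustrative types, not the framework's real ones.)
struct WriteOp {
    std::string name;
    bool flush = false;  // barrier: everything before must be durable
    bool fua   = false;  // this write returns only once persisted
};

// Split a recorded write sequence into epochs: each flush and/or FUA
// operation terminates the current epoch, as described on the slide.
std::vector<std::vector<WriteOp>> split_epochs(const std::vector<WriteOp>& ops) {
    std::vector<std::vector<WriteOp>> epochs(1);
    for (const WriteOp& op : ops) {
        epochs.back().push_back(op);
        if (op.flush || op.fua)      // epoch boundary
            epochs.emplace_back();
    }
    if (epochs.back().empty())       // drop a trailing empty epoch
        epochs.pop_back();
    return epochs;
}
```

On the slide's A-H example, the flush at D and the FUA+flush at H split the eight writes into two epochs of four operations each.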

Linux Writes Example echo Hello World! > foo.txt generates three epochs in the operating system: epoch 1 (Data 1, Data 2, flush), epoch 2 (journal metadata, flush), and epoch 3 (journal commit, flush). The block device receives epoch 1 first, where Data 2 and Data 1 may arrive reordered within the epoch, then epoch 2, then epoch 3.

Outline Overview How Consistency is Tested Today Linux Writes CrashMonkey Preliminary Results Future Plans Conclusion 16

Goals for CrashMonkey Fast. Able to intelligently and systematically direct tests toward interesting crash states. File-system agnostic. Works out of the box, without the need to recompile the kernel. Easily extendable and customizable.

CrashMonkey: Architecture In user space, user-provided file-system operations (the user workload) run under a test harness that manages the generated potential crash states (crash state 1, crash state 2, ...). In the kernel, writes pass from the file system through the generic block layer to a device wrapper, which records information about the user workload, backed by a custom RAM block device that provides fast writable-snapshot capability.

Constructing Crash States touch foo.txt; echo foo bar baz > foo.txt. The recorded writes form three epochs: epoch 1 (journal, flush), epoch 2 (Data 1, Data 2, Data 3, flush), and epoch 3 (journal, flush). CrashMonkey randomly chooses n epochs to permute (n = 2 here), copies epochs [1, n-1] unchanged, and then permutes and possibly drops operations from epoch n, e.g. turning epoch 2 into Data 3, Data 1.
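A minimal sketch of this construction, assuming epochs are just lists of operation names; the function and type names are hypothetical, not the framework's:

```cpp
#include <algorithm>
#include <random>
#include <string>
#include <vector>

using Epoch = std::vector<std::string>;  // operations, by name (illustrative)

// Build one crash state as described above: epochs [1, n-1] are copied
// intact (their flush barriers guarantee ordering); epoch n is permuted
// and may have operations dropped, since the crash interrupts it.
Epoch make_crash_state(const std::vector<Epoch>& epochs, size_t n,
                       std::mt19937& rng) {
    Epoch state;
    for (size_t i = 0; i + 1 < n; ++i)                // copy epochs [1, n-1]
        state.insert(state.end(), epochs[i].begin(), epochs[i].end());

    Epoch last = epochs[n - 1];                       // epoch n
    std::shuffle(last.begin(), last.end(), rng);      // reorder within epoch
    std::uniform_int_distribution<size_t> keep(0, last.size());
    last.resize(keep(rng));                           // possibly drop a tail
    state.insert(state.end(), last.begin(), last.end());
    return state;
}
```

With n = 2 on the slide's three-epoch example, epoch 1's journal write and flush always appear first, while the three data blocks of epoch 2 show up in any order and any number.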

CrashMonkey In Action The test harness drives the workflow: (1) workload setup, e.g. mkdir test, whose metadata reaches the base disk through the device wrapper; (2) snapshot the device, creating a writable snapshot; (3) profile the workload, e.g. echo bar baz > foo.txt, recording its metadata and data writes; (4) export the recorded data to the test harness; (5) restore the snapshot; (6) reorder the recorded data to form a crash state; (7) write the reordered data to the snapshot; (8) check file-system consistency.
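These steps amount to a simple loop: snapshot and profile once, then restore/apply/check per crash state. A hypothetical skeleton (none of these names are CrashMonkey's own; the methods just order the steps):

```cpp
#include <string>
#include <vector>

// Stand-in for the test harness + device wrapper; each method logs the
// step it represents so the control flow is visible.
struct Harness {
    std::vector<std::string> log;
    void snapshot()               { log.push_back("snapshot"); }
    void profile_workload()       { log.push_back("profile"); }  // record writes
    void restore_snapshot()       { log.push_back("restore"); }
    void apply_crash_state(int i) { log.push_back("apply " + std::to_string(i)); }
    bool check_consistency()      { log.push_back("fsck"); return true; }
};

// One full run: snapshot + profile once, then restore/apply/check per state.
bool run_test(Harness& h, int num_states) {
    h.snapshot();
    h.profile_workload();
    for (int i = 0; i < num_states; ++i) {
        h.restore_snapshot();        // cheap writable-snapshot rollback
        h.apply_crash_state(i);      // replay one reordered write sequence
        if (!h.check_consistency())  // e.g. fsck + user data checks
            return false;
    }
    return true;
}
```

The restore step is what makes testing 100,000 states cheap: each iteration rolls the device back in memory instead of rebooting or reformatting.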

Testing Consistency There are different kinds of (in)consistency: the file system is inconsistent and unfixable; the file system is consistent but holds garbage data; the file system has leaked inodes but is recoverable; the file system is consistent and the data is good. CrashMonkey currently runs fsck on all disk states; it could instead check only certain parts of the file system for consistency, and users can define checks for data consistency.

Customizing CrashMonkey Customize the algorithm that constructs crash states:

class Permuter {
 public:
  virtual void init_data(vector);
  virtual bool gen_one_state(vector);
};

Customize the workload (setup, data writes, and data consistency tests):

class BaseTestCase {
 public:
  virtual int setup();
  virtual int run();
  virtual int check_test();
};
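A user-defined workload would subclass the BaseTestCase interface shown on this slide. Everything beyond the interface itself (the test's fields, logic, and the pure-virtual declarations) is an illustrative assumption:

```cpp
// BaseTestCase re-declared for illustration, made pure virtual for the
// sketch; element types and helpers are assumptions, not CrashMonkey's
// real headers.
class BaseTestCase {
 public:
  virtual ~BaseTestCase() = default;
  virtual int setup() = 0;       // create files/dirs the workload needs
  virtual int run() = 0;         // the writes whose crash states get tested
  virtual int check_test() = 0;  // user-defined data consistency check
};

// Hypothetical test mirroring the slides' example workload.
class EchoTest : public BaseTestCase {
 public:
  int setup() override      { created = true; return 0; }  // e.g. touch foo.txt
  int run() override        { wrote = created; return 0; } // e.g. echo ... > foo.txt
  int check_test() override { return wrote ? 0 : 1; }      // 0 = data is good
 private:
  bool created = false;
  bool wrote = false;
};
```

The harness calls setup() once before snapshotting, run() during profiling, and check_test() against each recovered crash state.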

Outline Overview How Consistency is Tested Today Linux Writes CrashMonkey Preliminary Results Future Plans Conclusion 33

Results So Far Testing 100,000 unique disk states takes ~10 minutes; the test creates 10 1 KB files in a 10 MB ext4 file system, and the majority of the time is spent running fsck. Profiling the workload takes ~1 minute and happens only once per user-defined test. We want operations to reach disk naturally: sync() adds extra operations to those recorded, so we instead wait out the writeback delay, which can be decreased through a /proc file.
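The /proc file referred to is presumably Linux's writeback interval knob, /proc/sys/vm/dirty_writeback_centisecs. A sketch of setting it (writing the real path requires root, so the path is a parameter here for testing; the function name is ours, not the project's):

```cpp
#include <fstream>
#include <string>

// Shorten the kernel's periodic writeback interval so recorded writes
// reach the block layer sooner. The default path is the standard Linux
// sysctl file; pass another path when exercising the function unprivileged.
bool set_writeback_centisecs(
        int centisecs,
        const std::string& path = "/proc/sys/vm/dirty_writeback_centisecs") {
    std::ofstream f(path);
    if (!f) return false;
    f << centisecs << '\n';  // e.g. 100 => flush dirty pages every second
    return static_cast<bool>(f);
}
```

The same value can be restored after profiling so the test machine's normal writeback behavior is unchanged.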

Outline Overview How Consistency is Tested Today Linux Writes CrashMonkey Preliminary Results Future Plans Conclusion 35

The Path Ahead Identify interesting crash states: there is a huge search space to select from, so focus on states with reordered metadata. Avoid testing equivalent crash states: avoid generating equivalent write sequences, or generate write sequences and then check them for equivalence. Parallelize tests: each crash state is independent of the others. Optimize the test harness to run faster: check only parts of the file system for consistency.

Outline Overview How Consistency is Tested Today Linux Writes CrashMonkey Preliminary Results Future Plans Conclusion 37

Conclusion Crash consistency is very important, yet hard and complex to implement, and current crash-consistency code is not well tested despite its importance. CrashMonkey seeks to alleviate these problems: it is efficient, systematic, and file-system agnostic. It is a work in progress; code is available at https://github.com/utsaslab/crashmonkey

Thank You! Questions? 39

Related Work ALICE and BOB [Pillai et al., OSDI 14]: very narrow in scope; they explore how file systems crash but make no attempt to systematically explore or test crash consistency. Database replay framework [Zheng et al., OSDI 14]: specifically targets databases, works only on SCSI drives, is not open source, and does not allow user-defined tests.

Custom RAM Block Device The RAM block device supports fast writable snapshots via copy-on-write. A user process first writes a file, and the device stores its metadata and file data. Taking a snapshot creates a writable snapshot alongside the original; reads of unmodified data fall through to the original copy. When the process overwrites the file's data, only the writable snapshot is updated, and writing a second, new file likewise lands only in the snapshot. Restoring discards the snapshot's changes, returning the device to the original metadata and file data.
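The writable-snapshot behavior on these slides can be modeled as a block-level copy-on-write overlay. This toy class is illustrative, not the real driver:

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Copy-on-write sketch of the writable snapshot: reads fall through to the
// base device unless the block was overwritten; discarding the overlay
// restores the base image instantly.
class SnapshotDevice {
 public:
  explicit SnapshotDevice(std::vector<std::string> base)
      : base_(std::move(base)) {}

  void write(size_t block, const std::string& data) { overlay_[block] = data; }

  const std::string& read(size_t block) const {
    auto it = overlay_.find(block);
    return it != overlay_.end() ? it->second : base_[block];
  }

  void restore() { overlay_.clear(); }  // snap back to the base disk

 private:
  std::vector<std::string> base_;                    // original blocks
  std::unordered_map<size_t, std::string> overlay_;  // copied-on-write blocks
};
```

Because restore() only clears the overlay map, rolling back between crash-state tests costs nothing proportional to the disk size, which is what makes per-state restoration cheap.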