pnfs Update A standard for parallel file systems HPC Advisory Council Lugano, March 2011

Similar documents
The Last Bottleneck: How Parallel I/O can improve application performance

The Last Bottleneck: How Parallel I/O can attenuate Amdahl's Law

Parallel NFS (pnfs) Garth Gibson CTO & co-founder, Panasas Assoc. Professor, Carnegie Mellon University.

February 15, 2012 FAST 2012 San Jose NFSv4.1 pnfs product community

pnfs BOF SC

NFSv4.1 Using pnfs PRESENTATION TITLE GOES HERE. Presented by: Alex McDonald CTO Office, NetApp

pnfs, parallel storage for grid and enterprise computing Joshua Konkle, NetApp, Inc.

pnfs and Linux: Working Towards a Heterogeneous Future

NFSv4 extensions for performance and interoperability

The Parallel NFS Bugaboo. Andy Adamson Center For Information Technology Integration University of Michigan

Clustered and Parallel Storage System Technologies FAST09

Parallel NFS Dr. Oliver Tennert, Head of Technology. sambaxp 2008 Göttingen,

Brent Welch. Director, Architecture. Panasas Technology. HPC Advisory Council Lugano, March 2011

pnfs: Blending Performance and Manageability

Linux Support of NFS v4.1 and v4.2. Steve Dickson Mar Thu 23, 2017

pnfs, parallel storage for grid, virtualization and database computing Joshua Konkle, Chair NFS SIG

Research on Implement Snapshot of pnfs Distributed File System

HPC File Systems and Storage. Irena Johnson University of Notre Dame Center for Research Computing

System input-output, performance aspects March 2009 Guy Chesnot

Clustered and Parallel Storage System Technologies

NFS Version 4.1. Spencer Shepler, Storspeed Mike Eisler, NetApp Dave Noveck, NetApp

Object-based Storage (OSD) Architecture and Systems

NFS-Ganesha w/gpfs FSAL And Other Use Cases

pnfs & NFSv4.2; a filesystem for grid, virtualization and database David Dale, Director Industry Standards, Netapp Author: Joshua Konkle, DCIG

The State of pnfs: The Parallel File System Market in 2011

NFSv4 extensions for performance and interoperability

Crossing the Chasm: Sneaking a parallel file system into Hadoop

Crossing the Chasm: Sneaking a parallel file system into Hadoop

Coming Changes in Storage Technology. Be Ready or Be Left Behind

A Comparative Experimental Study of Parallel File Systems for Large-Scale Data Processing

pnfs and GPFS Dean Hildebrand Storage Systems IBM Almaden Research Lab 2012 IBM Corporation

Storage Aggregation for Performance & Availability:

Parallel File Systems for HPC

GridNFS: Scaling to Petabyte Grid File Systems. Andy Adamson Center For Information Technology Integration University of Michigan

pnfs and Linux: Working Towards a Heterogeneous Future

Cloud Computing CS

Object Storage: Redefining Bandwidth for Linux Clusters

Open Source Storage. Ric Wheeler Architect & Senior Manager April 30, 2012

Internet Engineering Task Force (IETF) Request for Comments: 8434 Updates: 5661 August 2018 Category: Standards Track ISSN:

The ASCI/DOD Scalable I/O History and Strategy Run Time Systems and Scalable I/O Team Gary Grider CCN-8 Los Alamos National Laboratory LAUR

NFSv4.1 Plan for a Smooth Migration

NFS: What s Next. David L. Black, Ph.D. Senior Technologist EMC Corporation NAS Industry Conference. October 12-14, 2004

Coordinating Parallel HSM in Object-based Cluster Filesystems

Parallel File Systems Compared

Lustre A Platform for Intelligent Scale-Out Storage

HPC I/O and File Systems Issues and Perspectives

dcache NFSv4.1 Tigran Mkrtchyan Zeuthen, dcache NFSv4.1 Tigran Mkrtchyan 4/13/12 Page 1

Sun N1: Storage Virtualization and Oracle

pnfs, POSIX, and MPI-IO: A Tale of Three Semantics

Shared Parallel Filesystems in Heterogeneous Linux Multi-Cluster Environments

File System and Storage Benchmarking Workshop SPECsfs Benchmark The first 10 years and beyond

System that permanently stores data Usually layered on top of a lower-level physical storage medium Divided into logical units called files

HPSS Development Update and Roadmap 2017

Internet Engineering Task Force (IETF) Request for Comments: 8000 Category: Standards Track. November 2016

Open Source for OSD. Dan Messinger

NFS Version 4 Open Source Project. William A.(Andy) Adamson Center for Information Technology Integration University of Michigan

Shared Object-Based Storage and the HPC Data Center

Brent Callaghan Sun Microsystems, Inc. Sun Microsystems, Inc

Introducing Panasas ActiveStor 14

Linux File Systems: Challenges and Futures Ric Wheeler Red Hat

Object Storage: The Future Building Block for Storage Systems

NFSv4 Open Source Project Update

The Evolution of the NFS Protocol:

CITI - ARC Joint Project Status Report

Cloud Computing CS

NFS: The Next Generation. Steve Dickson Kernel Engineer, Red Hat Wednesday, May 4, 2011

Scalable I/O, File Systems, and Storage Networks R&D at Los Alamos LA-UR /2005. Gary Grider CCN-9

Red Hat Global File System

NFS around the world Tigran Mkrtchyan for dcache Team dcache User Workshop, Umeå, Sweden

NFS/RDMA Next Steps. Chuck Lever Oracle

Xcellis Technical Overview: A deep dive into the latest hardware designed for StorNext 5

pnfs support for ONTAP Unstriped file systems (WIP) Pranoop Erasani Connectathon Feb 22, 2010

NFS on the Fast track - fine tuning and futures

Architecting Storage for Semiconductor Design: Manufacturing Preparation

IOmark- VM. HP MSA P2000 Test Report: VM a Test Report Date: 4, March

Structuring PLFS for Extensibility

Deploying Celerra MPFSi in High-Performance Computing Environments

Jeremy Allison Samba Team

iscsi Technology Brief Storage Area Network using Gbit Ethernet The iscsi Standard

Beyond Petascale. Roger Haskin Manager, Parallel File Systems IBM Almaden Research Center

An Introduction to GPFS

System Requirements. Version 8.2 May 2, For the most recent version of this document, visit our documentation website.

See what s new: Data Domain Global Deduplication Array, DD Boost and more. Copyright 2010 EMC Corporation. All rights reserved.

BeoLink.org. Design and build an inexpensive DFS. Fabrizio Manfredi Furuholmen. FrOSCon August 2008

DAFS Storage for High Performance Computing using MPI-I/O: Design and Experience

<Insert Picture Here> Oracle Storage

SMB3 and Linux Seamless POSIX file serving. Jeremy Allison Samba Team.

Lessons learned implementing a multithreaded SMB2 server in OneFS

PANASAS TIERED PARITY ARCHITECTURE

Samba in a cross protocol environment

Database over NFS. beepy s personal perspective. Brian Pawlowski Vice President and Chief Architect Network Appliance

Reasons to Migrate from NFS v3 to v4/ v4.1. Manisha Saini QE Engineer, Red Hat IRC #msaini

Storage for High-Performance Computing Gets Enterprise Ready

Data Management. Parallel Filesystems. Dr David Henty HPC Training and Support

EMC Lustre Contributions

CS 470 Spring Distributed Web and File Systems. Mike Lam, Professor. Content taken from the following:

The CephFS Gateways Samba and NFS-Ganesha. David Disseldorp Supriti Singh

OpenAFS A HPC filesystem? Rich Sudlow Center for Research Computing University of Notre Dame

SAS workload performance improvements with IBM XIV Storage System Gen3

HPC Customer Requirements for OpenFabrics Software

Transcription:

pnfs Update A standard for parallel file systems HPC Advisory Council Lugano, March 2011 Brent Welch welch@panasas.com Panasas, Inc. 1

Why a Standard for Parallel I/O? NFS is the only network file system standard Proprietary file systems have unique advantages, but can cause lock-in NFS widens the playing field Panasas, IBM, EMC want to bring their experience in large scale, high-performance file systems into the NFS community Sun/Oracle and NetApp want a standard HPC solution Broader market benefits vendors More competition benefits customers What about open source NFSv4 Linux client is very important for NFSv4 adoption, and therefore pnfs Still need vendors that are willing to do the heavy lifting required in quality assurance for mission critical storage 2

NFSv4 and pnfs NFS created in 80s to share data among engineering workstations NFSv3 widely deployed NFSv4 several years in the making, lots of new stuff Integrated Kerberos (or PKI) user authentication Integrated File Locking and Open Delegations (stateful server!) ACLs (hybrid of Windows and POSIX models) Official path to add (optional) extensions NFSv4.1 adds even more pnfs for parallel IO Directory Delegations for efficiency RPC Sessions for robustness, better RDMA support 3

Whence pnfs Gary Grider (LANL) and Lee Ward (Sandia) Spoke with Garth Gibson about the idea of parallel IO for NFS in 2003 Garth Gibson (Panasas/CMU) and Peter Honeyman (UMich/CITI) Hosted pnfs workshop at Ann Arbor in December 2003 Garth Gibson, Peter Corbett (NetApp), Brent Welch Wrote initial pnfs IETF drafts, presented to IETF in July and November 2004 Andy Adamson (CITI), David Black (EMC), Garth Goodson (NetApp), Tom Pisek (Sun), Benny Halevy (Panasas), Dave Noveck (NetApp), Spenser Shepler (Sun), Brian Pawlowski (NetApp), Marc Eshel (IBM), Dean Hildebrand (CITI) did pnfs prototype based on PVFS NFSv4 working group commented on drafts in 2005, folded pnfs into the 4.1 minorversion draft in 2006 Many others 4

pnfs: Standard Storage Clusters pnfs is an extension to the Network File System v4 protocol standard Allows for parallel and direct access From Parallel Network File System clients To Storage Devices over multiple storage protocols Moves the NFS (metadata) server out of the data path data pnfs Clients NFSv4.1 Server Block (FC) / Object (OSD) / File (NFS) Storage 5

The pnfs Standard The pnfs standard defines the NFSv4.1 protocol extensions between the server and client The I/O protocol between the client and storage is specified elsewhere, for example: SCSI Block Commands (SBC) over Fibre Channel (FC) SCSI Object-based Storage Device (OSD) over iscsi Network File System (NFS) The control protocol between the server and storage devices is also specified elsewhere, for example: SCSI Object-based Storage Device (OSD) over iscsi Client Storage MetaData Server 6

pnfs Layouts Client gets a layout from the NFS Server The layout maps the file onto storage devices and addresses The client uses the layout to perform direct I/O to storage At any time the server can recall the layout Client commits changes and returns the layout when it s done pnfs is optional, the client can always use regular NFSv4 I/O layout Clients NFSv4.1 Server Storage 7

pnfs Client Common client for different storage back ends Wider availability across operating systems Fewer support issues for storage vendors Client Apps NFSv4.1 pnfs Client Layout Driver 1. SBC (blocks) 2. OSD (objects) 3. NFS (files) 4. PVFS2 (files) 5. Future backend pnfs Server Cluster Filesystem Layout metadata grant & revoke 8

Key pnfs Participants Panasas (Objects) ORNL and ESSC/DoD funding Linux pnfs development Network Appliance (Files over NFSv4) IBM (Files, based on GPFS) BlueArc (Files over NFSv4) EMC (Blocks, HighRoad MPFSi) Sun/Oracle (Files over NFSv4) U of Michigan/CITI (Linux maint., EMC and Microsoft contracts) DESY Java-based implementation 9

pnfs Standard Status IETF approved Internet Drafts in December 2008 RFCs for NFSv4.1, pnfs-objects, and pnfs-blocks published January 2010 RFC 5661 - Network File System (NFS) Version 4 Minor Version 1 Protocol RFC 5662 - Network File System (NFS) Version 4 Minor Version 1 External Data Representation Standard (XDR) Description RFC 5663 - Parallel NFS (pnfs) Block/Volume Layout RFC 5664 - Object-Based Parallel NFS (pnfs) Operations 10

pnfs Implementation Status NFSv4.1 mandatory features have priority RPC session layer giving reliable at-most-once semantics, channel bonding, RDMA Server callback channel Server crash recovery Other details EXOFS object-based file system (file system over OSD) In kernel module since 2.6.29 (2008) Export of this file system via pnfs server protocols Simple striping (RAID-0), mirroring (RAID-1), and RAID-5 Most stable and scalable implementation Files (NFSv4 data server) implementation Open source server based on GFS Layout recall not required due to nature of underlying cluster file system Blocks implementation Server in user-level process, Ganesha/NFS support desirable Sponsored by EMC

Calibrating My Predictions 2006 TBD behind adoption of NFS 4.0 and pnfs implementations 2007 September Anticipate working group last call this October Anticipate RFC being published late Q1 2008 Expect vendor announcements after the RFC is published 2008 November (SC08) IETF working group last call complete, area director approval (Linux patch adoption process really just getting started) 2009 November (SC09) Basic NFSv4.1 features 2H2009 NFSv4.1 pnfs and layout drivers by 1H2010 Linux distributions shipping supported pnfs in 2010, 2011

Linux Release Cycle 2009 2.6.30 Merge window March 2009 RPC sessions, NVSv4.1 server, OSDv2 rev5, EXOFS 2.6.31 Merge window June 2009 NFSv4.1 client, sans pnfs 2.6.32 Merge window September 2009 130 server-side patches add back-channel 2.6.33 Merge window December 2009, released Feb 2010 43 pnfs patches

Linux Release Cycle 2010 2.6.34 Merge window February 2010, Released May 2010 21 NFS 4.1 patches 2.6.35 Merge window May 2010, release August? 2010 1 client and 1 server patch (4.1 support) 2.6.36 Merge window August 2010 16 patches accepted into the merge 2.6.37 Merge window November 2010 Includes first chunk of pnfs beyond generic 4.1 infrastructure Still disabled

Linux Release Cycle 2011 2.6.XX 250 patches of remaining pnfs functionality divided into 4 batches by the developers and maintainers Remaining files-based pnfs patches Object-based pnfs Block-based pnfs Push from RedHat for 6.1 and beyond Key IETF working group meeting in February that resolved a fine point with LAYOUT_COMMIT and files backend Patch adoption process will stretch to the end of 2011

pnfs Performance Testing Native Panasas client Testing in Panasas Labs October 2010 Benny Halevy, Boaz Harrosh Compare pnfs with DirectFlow same back end Medium sized PanFS storage cluster (4.8 GB/sec) Modest number of clients (128) A few fast clients N-to-N streaming I/O tests

System Structure Linux Compute Node(s) pnfsd Server NFSv4.1 pnfs Client DirectFlow Client DirectFlow Client ($$) Panasas Metadata iscsi/osd Panasas RPCs Read and Write iscsi/osd Panasas Storage Blades 18

Equipment 12 Shelves Pas 7 500 GB Blades 4x 10GE uplink from each shelf Force 10 E-1200 switch 128 clients (relatively old Nacona) 2 single-core sockets (2.8Gz), 8GB mem, 1GE 4 Faster clients (E5530) 4 quad-core sockets (2.4 GHz), 12GB mem, 10GE

Streaming Bandwidth Iozone benchmark 1GE files Per-file Object RAID Client writes data and parity in RAID-5 pattern Feature of object-based pnfs layout

MB/sec 5000 4500 DF Write pnfs Write 1GE Client Bandwidth Writes pay for writing parity 4000 3500 DF Read pnfs Read 3000 2500 2000 1500 1000 500 pnfs better at low client count DF still better at high client count pnfs readahead not working right 0 0 16 32 48 64 80 96 112 128 144 Number of Clients

How to use pnfs today Up-to-date GIT tree from Linux developers bhalevy@panasas.com manages the source trees Red Hat/Fedora RPMs that include pnfs steved@redhat.com builds experimental packages Linux NFS mailing list, nfs@linux-nfs.org http://open-osd.org Useful to get to OSD target, the user level program Exofs uses kernel initiator, need the target 22

How to use pnfs today Benny's git tree: git://linux-nfs.org/~bhalevy/linux-pnfs.git The kernel rpms can be found at: http://fedorapeople.org/~steved/repos/pnfs/i686 http://fedorapeople.org/~steved/repos/pnfs/x86_64 The source rpm can be found at: http://fedorapeople.org/~steved/repos/pnfs/source/ Bug database https://bugzilla.linux-nfs.org/index.cgi OSD target http://open-osd.org/ 23

Online References: pnfs NFS Version 4.1 RFC 5661 - Network File System (NFS) Version 4 Minor Version 1 Protocol RFC 5662 - Network File System (NFS) Version 4 Minor Version 1 External Data Representation Standard (XDR) Description RFC 5663 - Parallel NFS (pnfs) Block/Volume Layout RFC 5664 - Object-Based Parallel NFS (pnfs) Operations http://tools.ietf.org/html/ pnfs Problem Statement Garth Gibson (Panasas), Peter Corbett (Netapp), Internet-draft, July 2004 http://www.pdl.cmu.edu/pnfs/archive/gibson-pnfs-problem-statement.html Linux pnfs Kernel Development http://www.citi.umich.edu/projects/asci/pnfs/linux 24