Presenter: Thomas Rivera, Senior Technical Associate, Hitachi Systems
Author: Christian Bandulet, Principal Engineer, Oracle

SNIA Legal Notice
The material contained in this tutorial is copyrighted by the SNIA.
Member companies and individuals may use this material in presentations and literature under the following conditions:
- Any slide or slides used must be reproduced without modification.
- The SNIA must be acknowledged as source of any material used in the body of any document containing material from these presentations.
This presentation is a project of the SNIA Education Committee.
Neither the Author nor the Presenter is an attorney and nothing in this presentation is intended to be nor should be construed as legal advice or opinion. If you need legal advice or legal opinion please contact an attorney.
The information presented herein represents the Author's personal opinion and current understanding of the issues involved. The Author, the Presenter, and the SNIA do not assume any responsibility or liability for damages arising out of any reliance on or use of this information.
NO WARRANTIES, EXPRESS OR IMPLIED. USE AT YOUR OWN RISK.

Abstract
The File Systems Evolution
Over time, additional file systems appeared, focusing on specialized requirements such as data sharing, remote file access, distributed file access, parallel file access, HPC, archiving, security, etc.
Due to the dramatic growth of unstructured data, files as the basic units for data containers are morphing into file objects, providing more semantics and feature-rich capabilities for content processing.
This presentation will:
- Categorize and explain the basic principles of currently available file system architectures (e.g. Local, Shared, SAN, Clustered, Network, Distributed, Parallel, etc.)
- Explain technologies like Scale-Out NAS, NAS Aggregation, NAS Virtualization, NAS Clustering, Global Namespace, Parallel NFS
- Review new file system architectures being developed

Related Tutorials
- We hope you checked out SNIA Tutorial: Using Protocols for Block-based Storage Workloads
- Check out SNIA Tutorial: Understanding Enterprise NAS (Wednesday, 10:45AM)
- Check out Windows 8 (Wednesday, 11:40AM)
- Check out SNIA Tutorial: pNFS and NFS V4.2 (Wednesday, 4:10PM)

Why File Systems Have Evolved
Scale: megabytes to petabytes
Requirements: high availability, sharing, remote access, performance, archiving, others
Figure: file system types plotted over time: Local File System, Shared File System, SAN File System, Cluster File System, Network File System, ..., Distributed File System, Object File System, Parallel File System, ?
(Not a strict timeline; new capabilities are generally incremental)

Where File Systems Live
Figure: the layers between a user program and the hardware.
- User space: user programs and libraries (ls, mv, rm, cp, ...)
- System calls: open(), close(), read(), write(), ioctl(), mmap(), ...
- Kernel space: VFS, file system, cache*, segmap cache, volume manager, device drivers
- Other kernel components shown: mmap(), DMA, memory management, process management, scheduler, IPC, buffers, machine-dependent code
- Hardware
* The cache can be bypassed by using direct I/O.

What File Systems Do (UNIX example)
Figure: an inode and the blocks it points to.
- Locators (inodes): each inode holds the attributes owner, type, permissions, last access, size, and number of links
- Locators (pointers): direct 0 through direct 9, single indirect, double indirect, triple indirect
- Host (blocks): blocks 0 through 19, a subset of which hold the file's data blocks

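The pointer layout on this slide can be illustrated with a minimal sketch that reports which class of inode pointer reaches a given logical block of a file. The ten direct pointers follow the slide; the block size and pointer size are assumptions chosen for illustration and do not come from the tutorial.

```python
# Hypothetical parameters; only NUM_DIRECT is taken from the slide.
BLOCK_SIZE = 4096      # bytes per block (assumption)
POINTER_SIZE = 4       # bytes per block pointer (assumption)
NUM_DIRECT = 10        # direct 0 .. direct 9, as drawn on the slide

# Number of block pointers that fit into one indirect block.
PTRS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE


def pointer_level(logical_block: int) -> str:
    """Return the inode pointer class that reaches the given logical block."""
    if logical_block < 0:
        raise ValueError("logical block number must be non-negative")
    if logical_block < NUM_DIRECT:
        return f"direct {logical_block}"
    remaining = logical_block - NUM_DIRECT
    levels = (("single indirect", 1), ("double indirect", 2), ("triple indirect", 3))
    for name, depth in levels:
        capacity = PTRS_PER_BLOCK ** depth  # blocks reachable at this depth
        if remaining < capacity:
            return name
        remaining -= capacity
    raise ValueError("block is beyond the reach of the triple indirect pointer")


def max_file_blocks() -> int:
    """Total number of data blocks a single inode can address."""
    return NUM_DIRECT + sum(PTRS_PER_BLOCK ** depth for depth in (1, 2, 3))
```

With these assumed sizes, blocks 0 to 9 are reached directly and block 10 is the first one that needs the single indirect pointer.
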
A Taxonomy
File Systems
- Local
- Shared: SAN, Cluster
- Network: Distributed, Distributed Parallel

Local File System
The file system is co-located in the server with the application.

Local File System
- Separate islands of data
- Limitation: no data sharing

One Way to Share
Scale-Up: vertical scaling

Another Way to Share
Scale-Out: horizontal scaling across clients attached to a storage network
- Shared Device: a multi-LUN device shared among clients. Each client has exclusive access to a dedicated LUN.
- Shared File System: a physical device shared among clients. Clients access LUNs concurrently.

Access with Shared/Global File Systems
- Separate logical and physical placement
- Metadata server (MDS) access is a three-step transaction:
  Step 1: Request access (client to MDS)
  Step 2: Metadata delivery (MDS to client)
  Step 3: Data access (client to storage)

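The three-step transaction can be sketched in a few lines. This is a toy model under stated assumptions, not any product's protocol: the class names `MetadataServer` and `SharedStorage`, and the idea that the delivered metadata is simply a list of block numbers, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataServer:
    """Hypothetical MDS: maps a file name to the block numbers holding its data."""
    layouts: dict = field(default_factory=dict)

    def request_access(self, name: str) -> list:
        # Step 1 (request) arrives here; step 2 (metadata delivery) is the return value.
        return list(self.layouts[name])


@dataclass
class SharedStorage:
    """Hypothetical shared device reachable by every client over the storage network."""
    blocks: dict = field(default_factory=dict)

    def read_block(self, number: int) -> bytes:
        return self.blocks[number]


def read_file(name: str, mds: MetadataServer, storage: SharedStorage) -> bytes:
    layout = mds.request_access(name)  # steps 1 and 2: ask the MDS, receive metadata
    # Step 3: the client reads the blocks directly, without going through the MDS.
    return b"".join(storage.read_block(number) for number in layout)
```

The point of the split is that only the small metadata exchange passes through the metadata server, while bulk data moves directly between client and storage.
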
Shared/Global File System: Asymmetric (SAN File System)
Figure: nodes on the client network (e.g. Web servers), one active and one passive metadata server, all attached through a storage network to shared storage.
- One active metadata server
- Typically homogeneous (scaling limited by metadata server capacity)
- Inter-node distance limited by storage network capability

Shared/Global File System: Symmetric (Cluster File System)
Figure: nodes on the client network (e.g. Web servers), each running an active metadata server, all attached through a storage network to shared storage.
- Metadata server in each node
- Typically homogeneous (scaling limited by internal communication, e.g., distributed locking)
- Inter-node distance limited by storage network capability

Network File Systems (aka Proxy File Systems)
Figure: several clients reach a file server and its local file system over a network, using a network protocol*.
Enables sharing of files located on a file server among one or more client computers using a network protocol.
* e.g. NFS, CIFS, AFP, WebDAV, FTP, HTTP, ...

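Network File System Stack (Example: Sun's NFS)
Figure: the protocol stacks on both sides of the LAN.
- Client: NFS client, RPC/XDR, TCP/IP, Ethernet NIC
- Server: Ethernet NIC, TCP/IP, RPC/XDR, NFS server, volume manager, SCSI driver, SCSI HBA, SCSI port, SAN
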
Wide Area Network File Systems
- Consolidation eases management, administration, cost, and compliance
- Global file sharing and collaboration
- Location consolidation and optimization
Figure: the NFS client stack (NFS client, RPC/XDR, TCP/IP, Ethernet NIC) and the NFS server stack (Ethernet NIC, TCP/IP, RPC/XDR, NFS server, volume manager, SCSI driver, SCSI HBA, SCSI port, SAN) connected over a WAN.
But: WAN performance is low compared to LAN/SAN performance.

Improving Wide Area Performance
- Application-specific optimizations: email, document management, SQL, ...
- Protocol-specific optimizations: HTTP, NFS, CIFS, WebDAV, FTP, TCP/IP, ...
- Transport acceleration: TCP accelerators
- Intelligent caching: read-ahead, deferred write, coherency, ...
- Compression: algorithms, file-aware differencing, data aggregation, I/O clustering, chunk-based de-duplication, cross-protocol data reduction, ...
Figure: several NFS/CIFS clients (RPC/XDR, TCP/IP, Ethernet NIC) on a LAN, a compression engine on each side of the WAN, and the NFS server stack (volume manager, SCSI driver, SCSI HBA, SCSI port, SAN) on the remote LAN.

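Of the techniques listed, chunk-based de-duplication lends itself to a short sketch. The version below assumes fixed-size chunks and SHA-256 digests as chunk identifiers; both are illustrative choices, and WAN optimizers may instead use variable-size, content-defined chunking. The function names `encode` and `decode` are hypothetical.

```python
import hashlib

CHUNK_SIZE = 4  # unrealistically small, chosen so the example is easy to follow


def encode(data: bytes, sent_digests: set) -> list:
    """Sender side: replace chunks the peer already holds with a short reference."""
    messages = []
    for start in range(0, len(data), CHUNK_SIZE):
        chunk = data[start:start + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest in sent_digests:
            messages.append(("ref", digest, None))    # only the digest crosses the WAN
        else:
            sent_digests.add(digest)
            messages.append(("raw", digest, chunk))   # first sighting: send the bytes
    return messages


def decode(messages: list, chunk_store: dict) -> bytes:
    """Receiver side: rebuild the data from raw chunks and stored references."""
    output = bytearray()
    for kind, digest, chunk in messages:
        if kind == "raw":
            chunk_store[digest] = chunk
        output += chunk_store[digest]
    return bytes(output)
```

A second transfer of the same data then consists of references only, which is where the WAN savings come from.
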
Distributed File System (DFS)
- A network file system with files distributed among multiple file servers
- Not a parallel file system
Figure: clients use a network protocol and see one tree (/, /a, /b, /c), while /a, /b, and /c are held by separate file servers. Figure label: Single

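A minimal sketch of the client view described here: one namespace whose subtrees live on different file servers. The export table and the server names are hypothetical, and each file is assumed to live whole on exactly one server, which is what distinguishes this model from a parallel file system.

```python
import posixpath

# Hypothetical export table: namespace prefix -> file server holding that subtree.
EXPORTS = {"/a": "server1", "/b": "server2", "/c": "server3"}


def locate(path: str) -> tuple:
    """Return (server, normalized path) using the longest matching export prefix."""
    path = posixpath.normpath(path)
    best = None
    for prefix in EXPORTS:
        # Match whole path components only, so "/abc" does not match "/a".
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    if best is None:
        raise FileNotFoundError(path)
    return EXPORTS[best], path
```

Every request for a given file goes to a single server, so a large file gains nothing from the other servers' bandwidth.
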
Distributed Parallel File System
- Segments of files distributed across storage nodes
- Enables parallel I/O to individual files (aka file striping)
- Aggregation of storage servers
- RAIN + RAID (aka Network RAID)
- Global Namespace
Figure: clients reach the storage nodes through a network protocol.

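File striping can be made concrete with a small sketch that maps a byte range of one file onto storage nodes. It assumes simple round-robin placement with a fixed stripe unit; the stripe size, the node count, and the function names are illustrative assumptions rather than the layout of any particular parallel file system.

```python
STRIPE_UNIT = 64 * 1024   # bytes per stripe unit (assumption)
NODE_COUNT = 4            # number of storage nodes (assumption)


def locate_byte(offset: int, stripe_unit: int = STRIPE_UNIT,
                node_count: int = NODE_COUNT) -> tuple:
    """Return (node index, offset inside that node's segment) for a file offset."""
    stripe_index = offset // stripe_unit
    node = stripe_index % node_count            # round-robin across the nodes
    local_stripe = stripe_index // node_count   # stripes this node stored before
    return node, local_stripe * stripe_unit + offset % stripe_unit


def split_request(offset: int, length: int, stripe_unit: int = STRIPE_UNIT,
                  node_count: int = NODE_COUNT) -> list:
    """Break a byte range into (node, node_offset, length) pieces, one per stripe unit."""
    pieces = []
    end = offset + length
    while offset < end:
        node, node_offset = locate_byte(offset, stripe_unit, node_count)
        # Never cross a stripe-unit boundary within a single piece.
        take = min(stripe_unit - offset % stripe_unit, end - offset)
        pieces.append((node, node_offset, take))
        offset += take
    return pieces
```

Because consecutive pieces land on different nodes, a client can issue them concurrently, which is the intra-file parallelism the slide refers to.
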
NAS Aggregation
- In-band solution
- Sometimes called NAS Router
- Global Namespace
Figure: clients on an IP network reach a NAS router, which fronts several NAS systems, each with its own SAN.

NAS Virtualization: Out-of-Band
- Individual files / file segments pinned to file servers
- Files can be distributed and/or replicated for parallel access
- Files can be striped for intra-file parallel access
- Clients must locate the right file server
- e.g. NFSv4.1 (pNFS), Microsoft's DFS
Figure: clients on an IP network see a global namespace and consult a metadata server (MDS). The file servers hold distributed files (File_A through File_H), striped files (File_K_1 through File_K_4), and replicated files (File_A, File_B, File_C, File_B).

NAS Virtualization: NFSv4.1 pNFS
- In-Band NAS: NFSv4 clients reach a NAS appliance over IP; the appliance sits in front of the SAN.
- Out-of-Band NAS: NFSv4.1 clients with pNFS reach a NAS appliance with NFSv4.1 pNFS extensions over IP, and reach the SAN directly.
- Storage protocols: Block (FCP, iSCSI, SRP, SAS), File (NFSv4.1), Object (OSD)
- Data path decoupled from control and metadata path

Toward Storage Grids via NAS
Two variants, both presenting clustered file services (NFS, CIFS, HTTP, FTP, WebDAV) to clients over IP behind a VIP address:
- Cluster (parallel) file system: all nodes serve all files
- Classic file server cluster, each node with its own local file system: each file pinned to a single server

Cloud: The New Grid
A NAS cluster is effectively a storage cloud.
Figure: groups of clients surrounding a storage cloud.

Data Segmentation
- Dynamic, unstructured: media production, eCAD, mCAD, Office docs
- Dynamic, structured: transactional systems, ERP, CRM
- Fixed, unstructured: media archive, DAM, broadcast, medical imaging, Media-Internet
- Fixed, structured: BI, data warehousing, scientific, transaction archive

The New Reality of Data Segmentation
- Dynamic, unstructured: media production, eCAD, mCAD, Office docs
- Dynamic, structured: transactional systems, ERP, CRM
- Fixed, unstructured: media archive, DAM, broadcast, medical imaging, Media-Internet
- Fixed, structured: BI, data warehousing, scientific, transaction archive
- Semi-Structured*
* Semi-structured data contains dynamic metadata defined by users and/or applications.

Traditional Files
Metadata: owner, permissions, type, last modification, ...

Semi-Structured Data
- Object ID
- Methods: e.g., encryption
- Policies: e.g., replication
- Attributes: user/application defined
- Metadata: owner, permissions, type, last modification, ...

The Object Model
An object is stored and retrieved by its Object ID (OID) and carries:
- Methods: e.g., encryption
- Policies: e.g., replication
- Attributes: user/application defined
- Metadata: owner, permissions, type, last modification, ...
Figure: an inode pointing to blocks, next to a set of objects, each reached through a name that maps to an OID.

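The store/retrieve interface of the object model can be sketched as an in-memory store. Everything here is a hypothetical illustration: the `ObjectStore` class, the use of random UUIDs as OIDs, and the optional name-to-OID table are assumptions, not an interface defined by the tutorial or by any object storage standard.

```python
import uuid


class ObjectStore:
    """Toy flat object store: data plus metadata and attributes, addressed by OID."""

    def __init__(self):
        self._objects = {}   # OID -> object record
        self._names = {}     # optional human-readable name -> OID

    def store(self, data: bytes, metadata=None, attributes=None, name=None) -> str:
        oid = uuid.uuid4().hex  # assumption: the store assigns the identifier
        self._objects[oid] = {
            "data": bytes(data),
            "metadata": dict(metadata or {}),      # owner, permissions, type, ...
            "attributes": dict(attributes or {}),  # user/application defined
        }
        if name is not None:
            self._names[name] = oid
        return oid

    def retrieve(self, oid: str) -> bytes:
        return self._objects[oid]["data"]

    def attributes(self, oid: str) -> dict:
        return dict(self._objects[oid]["attributes"])

    def lookup(self, name: str) -> str:
        """Resolve a name to its OID, as in the Name -> OID -> Object chain."""
        return self._names[name]
```

Unlike an inode, which only locates blocks, the record behind an OID keeps user-defined attributes next to the data.
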
Managing Objects
File objects can be managed like records in a relational database, with user data as Binary Large Objects (BLOBs).
Figure: a database schema holding many object records, each consisting of Object ID, Methods, Policies, Attributes, and Metadata.

Managing Objects (Cont.)
Figure: a row of object records, each consisting of Object ID, Methods, Policies, Attributes, and Metadata.
Database-style capabilities applied to objects:
- Indexes
- Constraints/relationships
- Object search
- Full text search
- Join operations
- Virtual views
- SQL-like requests
- Cursors

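To illustrate what managing objects like database records might look like, the sketch below keeps object data as BLOBs in an in-memory SQLite database and finds objects through an indexed attribute table with a join. The schema, the table and column names, and the sample attribute keys are assumptions made for the example; only the standard-library `sqlite3` module is used.

```python
import sqlite3


def build_catalog(objects) -> sqlite3.Connection:
    """Load (oid, owner, data, attributes) tuples into a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE objects (oid TEXT PRIMARY KEY, owner TEXT, data BLOB)")
    conn.execute(
        "CREATE TABLE attributes (oid TEXT REFERENCES objects(oid), key TEXT, value TEXT)"
    )
    # Index so that attribute searches do not scan every row.
    conn.execute("CREATE INDEX attributes_by_key ON attributes (key, value)")
    for oid, owner, data, attrs in objects:
        conn.execute("INSERT INTO objects VALUES (?, ?, ?)", (oid, owner, data))
        conn.executemany(
            "INSERT INTO attributes VALUES (?, ?, ?)",
            [(oid, key, value) for key, value in attrs.items()],
        )
    return conn


def find_objects(conn: sqlite3.Connection, key: str, value: str) -> list:
    """Object search: join objects with attributes and return (oid, owner) rows."""
    cursor = conn.execute(
        "SELECT o.oid, o.owner FROM objects o "
        "JOIN attributes a ON a.oid = o.oid "
        "WHERE a.key = ? AND a.value = ? ORDER BY o.oid",
        (key, value),
    )
    return cursor.fetchall()
```

The same query would be awkward against a traditional file system, where such attributes would have to be encoded in file names or kept in a separate application database.
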
Data Serving Hierarchy: 3 Levels of Abstraction
Applications may interface with the storage subsystem at any of three layers:
- Block: highest performance and very little metadata
- File: high performance and some metadata
- Object: medium performance and rich metadata
Figure: the layers Object, File, Block, and Platform, with many-to-one mappings from Object to File and from File to Block.

Content Repositories
- Combination of application, database, data services, and pointers into an external file system
- Application-specific
Figure: applications A, B, C, and D, each with its own policies, search engine, indexes, and pointers into the file system (blocks) and into named objects identified by OID.

File System-Based Content Repositories
- Combination of file system, database, and data services
- Application-independent
Figure: applications A, B, C, and D share one set of policies, search engine, indexes, and pointers, layered over blocks and named objects identified by OID.

File Systems for Clouds
File systems are morphing into distributed object-based content repositories.
- Availability through file replication
- Sacrifice performance
- Locality of data
- No RAID protection
- Peer-to-peer
- Storage grid
- Mesh topology
- Flat namespace
- Geographically dispersed
- Heterogeneous
- Spontaneous federations
Figure: compute nodes surround a storage cloud that holds Object_A through Object_F, each alongside two replicas (for example Object_A, Object_A', and Object_A'') placed on different nodes.

Q&A / Feedback
Please send any questions or comments on this presentation to SNIA: tracktutorials@snia.org
Many thanks to the following individuals for their contributions to this tutorial.
- SNIA Education Committee
Christian Bandulet
Craig Harmer
Paul Massiglia
Thomas Rivera
Joseph White