Samba in a cross protocol environment

Samba in a cross protocol environment, aka SMB semantics vs NFS semantics
Mathias Dietz, IBM Research and Development, Mainz

Introduction
- Mathias Dietz (IBM), IBM Research and Development in Mainz, Germany
  - NAS architecture and development of the IBM SONAS and Storwize V7000 Unified products
  - Experience with Samba as part of the IBM SoFS offering since 2006 and the IBM OESV offering since 2003
- IBM SONAS (Scale Out Network Attached Storage)
  - Modular high performance storage with massive scalability and high availability
  - Supports multiple petabytes of storage for organizations that need billions of files in a single file system
  - http://www.ibm.com/systems/storage/network/sonas/
- IBM Storwize V7000 Unified
  - A virtualized storage system designed to consolidate block and file workloads into a single storage system for simplicity of management, reduced cost, highly scalable capacity and high availability
  - http://www.ibm.com/systems/storage/disk/storwize_v7000/
May 2014

Cross Protocol Use Cases
Some real world customer examples using multi-protocol access:
1) Movie rendering: the workstations of 3D designers write the 3D models via SMB, and a group of render servers reads the models via NFS
2) Research institute: measurement data is written via NFS, and scientists access the data through SMB
3) Software development: the team accesses source code paths concurrently via NFS and SMB, and the build machines use NFS to build the code
We will talk about
- Protocol interoperability
  - Accessing the same data through multiple protocols concurrently or sequentially
- Focus on NFS and SMB
- Using the example of the GPFS filesystem

Cross Protocol - System Context
- NAS Clients
- NAS Servers (protocol level)
  - NFS (Kernel, Ganesha): Locking (byte range), Share reservation, Delegations, Grace (lock reclaim), Sessions (4.1), v3/v4.x differences
  - SMB (Samba): Locking, Share Modes, Oplocks / Leases, Durable / Persistent FH, Streams, Sessions, Windows Attributes, SMB 2.x version differences
  - And other protocols...
  - Authentication, ID Mapping, Recovery, Notifications, Extended Attributes, Auditing & Tracing, I/O & resource balancing
- Storage Backend (File System)
  - GPFS Filesystem: Locks, Share Modes / Reservations, GPFS Leases (oplocks), Extended attributes, Windows attributes, ACL...

NFSv4 Concepts vs SMB Concepts
From a high level perspective, Windows semantics have a huge overlap with NFSv4.x semantics:
- Byte range locks ~ Byte range locks (NFSv3, POSIX fcntl)
- Windows Oplocks ~ NFSv4 Delegations
- Windows Sharemodes ~ NFSv4 Reservations
- Alternate Data Streams ~ NFSv4.1 Named Attributes
- SMB 2.1 Directory leases ~ NFSv4.1 Directory Leases
- SMB Notifications ~ NFSv4.1 Directory Change Notifications
- NTFS ACLs ~ NFSv4 ACL
- SMB2 Durable Open ~ Lock reclaim
- SMB3 Replay Detection ~ Duplicate reply cache
NFSv4 has many similarities with SMB, but there are many semantic differences when looking into the details!

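Samba on GPFS - Architecture
[Diagram: Samba (with vfs_gpfs) and the NFS server on top of the GPFS filesystem, connected through POSIX calls and the API]
- Samba keeps byte range locks in brlock.tdb
- Samba keeps sharemodes/oplocks in locking.tdb
- Samba keeps notifications in notify.tdb
- These databases are usually managed by CTDB...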

Byte Range Lock differences
Different byte range lock semantics between NFS & SMB
- NFS locks (based on the fcntl() call)
  - Advisory by default
  - Mandatory locking is optional in NFSv4
  - Byte range locks can be merged/split
  - An NFSv4 server might reject locks which are overlapping/split (NFS4ERR_LOCK_RANGE); clients can then emulate the behavior (race condition)
- SMB locks (SMB2 LOCK Request)
  - All mandatory (no advisory locks)
  - Support stacking of locks instead of merging/splitting
  - An unlock range MUST be identical to the lock range. Subranges cannot be unlocked.
  - A write lock can be acquired even if the file is opened for read.
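The advisory nature of fcntl() locks can be observed directly. The following is a minimal sketch, assuming a Linux or other POSIX system without mandatory locking enabled; the temporary file, the byte ranges and the function name `advisory_lock_demo` are illustrative and not taken from the talk. A child process that asks for the same range is refused, but a child that simply writes into the locked range is not stopped.

```python
import fcntl
import os
import tempfile


def advisory_lock_demo() -> dict:
    """Show that an fcntl() byte range lock does not stop a writer that ignores it."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"\0" * 200)
        tmp.flush()

        # Parent: exclusive lock on 50 bytes starting at offset 100.
        fcntl.lockf(tmp.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB, 50, 100, os.SEEK_SET)

        pid = os.fork()
        if pid == 0:  # child process: a second, independent lock owner
            status = 0
            try:
                fd = os.open(tmp.name, os.O_RDWR)
                try:
                    fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 50, 100, os.SEEK_SET)
                except OSError:
                    status |= 1  # asking for the lock reports the conflict
                if os.pwrite(fd, b"x" * 10, 120) == 10:
                    status |= 2  # a plain write into the locked range still succeeds
            finally:
                os._exit(status)  # leave the child without running parent cleanup

        _, wait_status = os.waitpid(pid, 0)
        code = os.WEXITSTATUS(wait_status)
        return {"lock_conflict": bool(code & 1), "write_succeeded": bool(code & 2)}
```

An SMB client would expect the write in the second step to fail, which is why the lock semantics have to be adapted between the two protocols.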

Byte Range Lock differences
- Windows keeps the locks separate (stacked); each lock can be unlocked independently
    Lock A: Offset 100, Len 50
    Lock B: Offset 145, Len 50
- POSIX merges overlapping locks
    Merged lock: Offset 100, Len 95
- With different lock types (F_RDLCK, F_WRLCK), the overlapping part of Lock A is overwritten
    Lock A: Offset 100, Len 45
    Lock B: Offset 145, Len 50

Byte Range Lock differences
After unlocking Lock B, the semantic difference is clear:
- The Windows lock has a length of 50
    Lock A: Offset 100, Len 50
- The POSIX lock has a length of 45 instead of 50!
    Lock A: Offset 100, Len 45
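The Lock A / Lock B example from these slides can be replayed with a small model. This is a sketch under simplifying assumptions: a single lock owner, one lock type, and two hypothetical classes (`StackedLocks`, `MergedLocks`) that only track ranges. It is not how Samba or the kernel store locks.

```python
from typing import List, Tuple

Range = Tuple[int, int]  # (offset, length)


class StackedLocks:
    """Windows-style: every lock is kept as its own entry."""

    def __init__(self) -> None:
        self.locks: List[Range] = []

    def lock(self, offset: int, length: int) -> None:
        self.locks.append((offset, length))

    def unlock(self, offset: int, length: int) -> None:
        # The unlock range must match an existing lock exactly;
        # list.remove() raises ValueError if it does not.
        self.locks.remove((offset, length))


class MergedLocks:
    """POSIX-style for one owner: only the covered bytes are remembered."""

    def __init__(self) -> None:
        self.spans: List[Tuple[int, int]] = []  # half-open (start, end)

    def lock(self, offset: int, length: int) -> None:
        start, end = offset, offset + length
        kept = []
        for s, e in self.spans:
            if e < start or s > end:
                kept.append((s, e))  # disjoint span stays as it is
            else:
                start, end = min(start, s), max(end, e)  # merge
        kept.append((start, end))
        self.spans = sorted(kept)

    def unlock(self, offset: int, length: int) -> None:
        start, end = offset, offset + length
        kept = []
        for s, e in self.spans:
            if s < start:
                kept.append((s, min(e, start)))  # part left of the hole
            if e > end:
                kept.append((max(s, end), e))  # part right of the hole
        self.spans = sorted(kept)

    def ranges(self) -> List[Range]:
        return [(s, e - s) for s, e in self.spans]


def replay_slide_example() -> Tuple[List[Range], List[Range]]:
    windows, posix = StackedLocks(), MergedLocks()
    for table in (windows, posix):
        table.lock(100, 50)    # Lock A
        table.lock(145, 50)    # Lock B, overlaps the tail of A
        table.unlock(145, 50)  # unlock Lock B
    return windows.locks, posix.ranges()
```

After the replay the stacked table still holds Lock A with its full length of 50, while the merged table is left with a length of 45, matching the slide.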

Byte Range Lock differences
To achieve cross protocol consistency
- Lock semantics must be adapted
  - Samba automatically maps between SMB lock requests and the POSIX locking of the filesystem
- Byte range locks from NFS clients must be honored by SMB clients and vice versa
  - Pass lock requests down to the filesystem (posix locking = yes)
  - NFS locks are mandatory for CIFS clients (CIFS treats all fcntl locks as mandatory)
  - CIFS locks are advisory only for NFS clients

Oplocks / Leases vs Delegations
SMB Opportunistic Locks and NFS Delegations allow a client to cache data locally
- NFSv4 Delegations
  - NFSv3 does not support delegations
  - NFSv4 delegations are READ or WRITE
  - NFSv4 allows delegations on files or Named Attributes
- SMB Oplocks
  - Level 1 (exclusive), Level 2 (read), Batch (exclusive)
  - Allow downgrade to a Level 2 oplock (SMB2 OPLOCK_BREAK)
  - SMB allows oplocks on files or streams
- SMB 2.1 Leases
  - Lease key: other processes using this key are permitted access without breaking the lease (shared leases)
  - More oplock types (R, RH, RW, RWH)
  - Allow upgrade and downgrade

Oplocks / Delegations vs Leases
- SMB Oplocks / NFSv4 Delegations*
    1. OPEN request (oplock)
    2. Response with oplock granted
    3. 2nd OPEN
    4. Oplock break
- SMB 2.1 Leases
    1. OPEN request (lease)
    2. Response with lease granted
    3. 2nd OPEN with the same lease key
    4. Response with lease granted
* NFSv4 seems to have the same limitation, but this can be client dependent
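The two message sequences above differ only in what identifies the holder of the caching right. The toy model below assumes an oplock is tied to one open (a hypothetical per-open id) while a lease is tied to a client-chosen lease key; the class and function names are made up for illustration and no real SMB state machine is implied.

```python
from typing import List, Optional


class CacheGrant:
    """Toy model of the caching right (oplock or lease) on one file."""

    def __init__(self) -> None:
        self.owner_key: Optional[str] = None

    def open(self, key: str) -> str:
        if self.owner_key is None:
            self.owner_key = key
            return "granted"
        if self.owner_key == key:
            return "granted"  # same key: no break needed
        self.owner_key = None  # a different holder forces a break
        return "break"


def replay(keys: List[str]) -> List[str]:
    grant = CacheGrant()
    return [grant.open(key) for key in keys]


# Oplocks: every open carries its own identity, so a second open breaks.
oplock_sequence = replay(["open-1", "open-2"])
# Leases: both opens present the same lease key, so the lease survives.
lease_sequence = replay(["lease-key-A", "lease-key-A"])
```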

Oplocks, Leases and Delegations
To achieve cross protocol consistency
- Oplocks/Delegations must be delegated to the filesystem
  - GPFS can manage READ and WRITE oplocks
  - Samba option gpfs:leases=yes
- A kernel limitation prevents cross-protocol Level II oplocks
  - "A read lease can be placed only on a file descriptor that is opened read-only."
  - The GPFS lease code is based on the kernel implementation, so the same restrictions apply
  - Level II oplocks should be turned off (level2 oplocks=no)
- SMB 2.1 Leases
  - Samba does not support them yet
  - Not supported by the kernel or GPFS
  - Thus they cannot be used in cross protocol environments

Share Modes / Share Reservations
SMB Share Modes and NFS Share Reservations allow the client to control which operations other applications may perform on the same file
- SMB Share Modes
  - SHARE_READ and SHARE_WRITE modes
  - SMB allows an explicit SHARE_DELETE share mode, which indicates that other opens are allowed to delete or rename the file
  - Mandatory
  - Allowed on files and directories
- NFS Share Reservations
  - NFSv4 DENY READ, WRITE, BOTH or NONE
  - Mandatory
  - NFSv3 does not support share reservations
- POSIX open() does not support share modes; Samba uses flock() when the kernel share modes option is set to yes
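The share mode check can be written as a symmetric test between an existing open and a new one. This is a simplified sketch of the commonly described Windows rule, with hypothetical names (`Access`, `Open`, `conflicts`); it ignores special cases such as attribute-only opens. NFSv4 expresses the same idea the other way round, as DENY bits.

```python
from dataclasses import dataclass
from enum import IntFlag


class Access(IntFlag):
    NONE = 0
    READ = 1
    WRITE = 2
    DELETE = 4


@dataclass(frozen=True)
class Open:
    access: Access  # what this opener wants to do
    share: Access   # what this opener allows others to do


def _permitted(wanted: Access, allowed: Access) -> bool:
    # True when every requested bit is also present in the allowed set.
    return (wanted & allowed) == wanted


def conflicts(existing: Open, new: Open) -> bool:
    """A new open conflicts if either side wants access the other does not share."""
    return not (
        _permitted(new.access, existing.share)
        and _permitted(existing.access, new.share)
    )


reader = Open(access=Access.READ, share=Access.READ)
second_reader = Open(access=Access.READ, share=Access.READ | Access.WRITE)
writer = Open(access=Access.WRITE, share=Access.READ | Access.WRITE)
deleter = Open(access=Access.DELETE,
               share=Access.READ | Access.WRITE | Access.DELETE)
```

With these values, `second_reader` can open alongside `reader`, while `writer` and `deleter` are refused because `reader` shares neither write nor delete access.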

Share Modes / Share Reservations
To achieve cross protocol consistency
- Share modes must be propagated to the filesystem
  - GPFS can manage READ and WRITE share modes
  - Samba option gpfs:sharemodes=yes
- DELETE share mode is not supported by GPFS
  - The vfs_gpfs module maps DELETE to READ+WRITE
  - The semantics are not 100% correct
- Currently the Samba sharemode handling is not atomic (multiple calls)
  - Race conditions might occur if the NFSv4 server requests share reservations as well
- GPFS 3.5 introduced a new createfile() API
  - Atomic call to get sharemode/oplock/initial ACL during open()
  - Not implemented in Samba yet
- With NFSv3, reading a file is not denied even with DENY_ALL mode
- Directory share modes are not visible to NFS

Change Notifications
Clients can register to be notified when a file's data or metadata under a given subtree changes
- SMB Change Notifications
  - SMB2 CHANGE_NOTIFY
  - Can be recursive
  - Heavily used by Windows Explorer
- NFS Directory Notifications
  - Not recursive
  - Not supported by NFSv3 and NFSv4.0
  - Support added with NFSv4.1
- Samba uses notify.tdb to register notification requests and sends out notifications when changes made through Samba are detected

Change Notifications
To achieve cross protocol consistency
- GPFS does not offer an API to register cluster wide change notifications
- Use inotify for cross protocol change notifications on a single node
  - Samba option kernel change notify=yes
  - Not recursive
- For cluster wide change notifications
  - Use vfs_notify_fam together with Gamin
  - Listen to Volker Lendecke's talk about scalable file change notify

Timestamps and Windows Attributes
Different file time stamps for Windows and NFS
- SMB
  - creation time (birth time)
  - write time (modification of data)
  - access time (last access)
- NFS
  - change time (metadata change)
  - modify time (content change)
  - access time (last access)
- Additional Windows Attributes
  - SYSTEM, HIDDEN, READ-ONLY, ARCHIVE
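The difference between change time and modify time is visible through a plain stat() call. A minimal sketch, assuming a POSIX system where `st_ctime` means "last metadata change"; the function name and the 50 ms pause are arbitrary choices for the illustration. A chmod touches only metadata, so the change time moves while the modify time does not. Whether a birth time is exposed at all depends on the platform.

```python
import os
import tempfile
import time


def metadata_change_demo() -> dict:
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"data")
        tmp.flush()
        before = os.stat(tmp.name)

        time.sleep(0.05)           # step past the timestamp granularity
        os.chmod(tmp.name, 0o640)  # metadata-only change
        after = os.stat(tmp.name)

        return {
            "mtime_changed": after.st_mtime_ns != before.st_mtime_ns,
            "ctime_changed": after.st_ctime_ns != before.st_ctime_ns,
            # Platform dependent: not provided by os.stat() on Linux.
            "has_birthtime": hasattr(after, "st_birthtime"),
        }
```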

Timestamps and Windows Attributes
- Birth time and Windows attributes can be stored in GPFS using the gpfs:winattr=yes option
- Cross protocol considerations
  - Windows attributes are not seen by NFS
  - Read-only has no effect on NFS reads
    - Map read-only can be used in addition, but not with NFSv4 ACLs
  - Hidden has no effect on NFS
  - The Archive flag will be set even if the file is modified via NFS
  - Birth time is automatically filled in by GPFS when a file is created
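The Samba options named on the slides so far (posix locking, gpfs:leases, level2 oplocks, gpfs:sharemodes, kernel change notify, gpfs:winattr) can be collected into a small check. This is only a sketch: the parser is a deliberately naive stand-in for Samba's own configuration handling, it compares literal yes/no values only, and the names `RECOMMENDED`, `parse_settings` and `deviations` are hypothetical. Note that kernel change notify covers a single node only.

```python
from typing import Dict, Optional

# Values as given on the slides for cross protocol use on GPFS.
RECOMMENDED: Dict[str, str] = {
    "posix locking": "yes",
    "gpfs:leases": "yes",
    "level2 oplocks": "no",
    "gpfs:sharemodes": "yes",
    "kernel change notify": "yes",  # single node only
    "gpfs:winattr": "yes",
}


def parse_settings(text: str) -> Dict[str, str]:
    """Collect 'name = value' lines, ignoring comments and section headers."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;[":
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[" ".join(key.split()).lower()] = value.strip().lower()
    return settings


def deviations(text: str) -> Dict[str, Optional[str]]:
    """Return every recommended option that is missing (None) or set differently."""
    found = parse_settings(text)
    return {
        name: found.get(name)
        for name, wanted in RECOMMENDED.items()
        if found.get(name) != wanted
    }


SAMPLE = """
[share]
    posix locking = yes
    gpfs:leases = yes
    level2 oplocks = yes
    gpfs:sharemodes = yes
"""
report = deviations(SAMPLE)
```

For the sample share, the report lists `level2 oplocks` as set to the wrong value and the two options that are absent.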

NFSv4 Grace Period
- NFSv4 servers can trigger a grace period to allow lock recovery. During this grace period, clients can reclaim their locks, but no new locks are granted.
- This opens a window for lock stealing across protocols
  - Samba is unaware of NFS recovery. During NFS recovery, Samba can grab a lock in a conflicting mode, resulting in denial of the NFS reclaim.
- Possible solutions (not implemented yet)
  - Make Samba aware of the grace period
    - Not granular enough; impacts CIFS/SMB I/O
  - Disallow conflicting locks during grace
    - The filesystem would need to know NFS recovery information
NOTE: the term 'lock' covers all kinds of locks/share reservations/delegations.
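The lock stealing window can be reproduced with a toy lock table. The sketch assumes exact-range matching, owner names prefixed with "nfs:" or "smb:", and a hypothetical `protect_during_grace` flag that models the coarse first option (refuse every new lock while the grace period is active); none of these names come from Samba, GPFS or an NFS server.

```python
from typing import Dict, Tuple

Range = Tuple[int, int]  # (offset, length)


class SharedLockTable:
    """Toy filesystem lock table used by both an NFS server and Samba."""

    def __init__(self, protect_during_grace: bool = False) -> None:
        self.holders: Dict[Range, str] = {}
        self.in_grace = False
        self.protect_during_grace = protect_during_grace

    def nfs_server_restart(self) -> None:
        # NFS lock state is lost; NFS clients have to reclaim it.
        self.holders = {
            rng: owner
            for rng, owner in self.holders.items()
            if not owner.startswith("nfs:")
        }
        self.in_grace = True

    def acquire(self, owner: str, rng: Range, reclaim: bool = False) -> bool:
        if rng in self.holders and self.holders[rng] != owner:
            return False  # someone else already holds this range
        if self.in_grace and not reclaim:
            # New NFS locks are never granted during grace; SMB locks are
            # refused only when the table is told to protect the grace period.
            if owner.startswith("nfs:") or self.protect_during_grace:
                return False
        self.holders[rng] = owner
        return True


def scenario(protect_during_grace: bool) -> Tuple[bool, bool]:
    table = SharedLockTable(protect_during_grace)
    table.acquire("nfs:client1", (0, 100))
    table.nfs_server_restart()
    smb_granted = table.acquire("smb:client2", (0, 100))
    nfs_reclaimed = table.acquire("nfs:client1", (0, 100), reclaim=True)
    return smb_granted, nfs_reclaimed
```

Without protection the SMB client gets the lock and the NFS reclaim is denied; with protection the order is reversed, at the cost of blocking all new SMB locks during the grace period.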

More
- NFS - Kerberos PAC decoding
  - Ganesha can forward the Kerberos PAC to winbind to decode group membership
  - The Samba machine account should be used to avoid keytab mismatch
- SMB Delete on close
  - Delete on close semantics are not known to NFS
  - Files are still visible until finally deleted
- Durable file handles
  - Samba implements durable file handles, but only if all cross-protocol functions are disabled
- Case insensitivity
  - GPFS has the gpfs_get_realfilename() API to allow case-insensitive lookup
- Alternate Data Streams
  - Streams are not supported by GPFS, but Extended Attributes can be used (64k limit)
- ID mapping
  - Complex topic that could easily fill another 1h presentation
- ACL
  - Listen to the SambaXP talk "Recent improvements in using NFS4 ACLs with Samba" (Alexander Werth, IBM)
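For the case insensitivity item, it helps to see what a lookup costs when the filesystem offers no help. The sketch below is the generic fallback on a case-sensitive POSIX filesystem: scan the directory and compare case-folded names. It is not the GPFS call named above, and `find_real_name` and `demo` are hypothetical helpers.

```python
import os
import tempfile
from typing import Optional


def find_real_name(directory: str, wanted: str) -> Optional[str]:
    """Return the on-disk spelling of `wanted`, compared case-insensitively."""
    target = wanted.casefold()
    with os.scandir(directory) as entries:
        for entry in entries:  # linear scan: cost grows with directory size
            if entry.name.casefold() == target:
                return entry.name
    return None


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as directory:
        with open(os.path.join(directory, "ReadMe.TXT"), "w") as handle:
            handle.write("hello")
        return (
            find_real_name(directory, "readme.txt"),
            find_real_name(directory, "missing.txt"),
        )
```

A filesystem-level lookup avoids this scan, which matters for directories with very many entries.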

Outlook
- To achieve better cross protocol support, more operations must move down into the file system. Therefore the file system needs more NTFS semantics.
  - The POSIX interface is not sufficient
  - Samba POSIX mappings must be bypassed (Extend VFS Layer?)
- Possible GPFS / vfs_gpfs improvements
  - Atomic create file call (bypass Samba create file)
  - SMB locking semantics in GPFS
    - SMB style byte range locks
    - FILE_SHARE_DELETE flag
    - Level II oplock support (independent from kernel)
    - SMB Leases support
    - Mandatory locking support
  - Durable/Persistent file handle support in GPFS
  - SIDs in NFSv4 ACLs
  - Delete on close support in GPFS
  - More...

Questions?

Thank you!
