The CephFS Gateways Samba and NFS-Ganesha. David Disseldorp Supriti Singh


The CephFS Gateways: Samba and NFS-Ganesha. David Disseldorp (ddiss@samba.org), Supriti Singh (supriti.singh@suse.com)

Agenda. Why: exporting CephFS over Samba and NFS-Ganesha. What: architecture and features of Samba and NFS-Ganesha. How: interoperability of Samba and NFS-Ganesha.

Ceph Architecture. Image source: http://docs.ceph.com/docs/giant/architecture/

CephFS Clients: Samba and NFS-Ganesha (diagram). Clients shown: K-Client (kernel client), Samba, NFS-Ganesha.

NFS-Ganesha

NFS-Ganesha. Open source, user-space NFS server. Supports multiple filesystem backends: CephFS, RGW (RADOS Gateway), Gluster, GPFS.


NFS-Ganesha and CephFS (diagram). Components shown: NFS mount point, kernel CephFS client, NFS-Ganesha with libcephfs and librgw, RADOS cluster.

NFS-Ganesha Architecture (diagram). Components shown: network channel, duplicate request layer, RPC dispatcher, RPCSEC_GSS, NFSv3 and NFSv4, Filesystem Abstraction Layer (FSAL), admin DBus, log, MDCACHE, CephFS.

NFS-Ganesha Modular Architecture. RPC layer: uses libntirpc. Filesystem Abstraction Layer (FSAL): provides an API for the exported namespace. Metadata cache (MDCACHE): chunked dirent cache (version 2.6). DBus interface: system management and communication. Log management: support for internal logging.

NFS-Ganesha key features. A single nfs-ganesha instance can support multiple exports, multiple filesystem backends, and multiple protocols. RPCSEC_GSS with krb5 authentication. Entries can be exported and unexported dynamically using DBus.

NFS-Ganesha CephFS features: cephx authorization, read delegations, export of subdirectories, load balancing.
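As a rough illustration of exporting a CephFS subdirectory, the sketch below renders an EXPORT block in ganesha.conf syntax. The export ID, the paths, and the helper name render_ceph_export are assumptions made for this example and do not come from the slides; option names should be checked against the ganesha.conf documentation for the version in use.

```python
def render_ceph_export(export_id: int, path: str, pseudo: str) -> str:
    """Return a ganesha.conf EXPORT block for a CephFS path (illustrative values)."""
    lines = [
        "EXPORT {",
        f"    Export_ID = {export_id};",   # unique ID per export
        f'    Path = "{path}";',           # directory inside CephFS to export
        f'    Pseudo = "{pseudo}";',       # position in the NFSv4 pseudo filesystem
        "    Access_Type = RW;",
        "    Protocols = 3, 4;",
        "    FSAL {",
        "        Name = CEPH;",            # select the CephFS backend
        "    }",
        "}",
    ]
    return "\n".join(lines)


# Hypothetical subdirectory export, printed for inspection.
export_block = render_ceph_export(1, "/projects", "/cephfs/projects")
print(export_block)
```

Each additional export would be another EXPORT block with its own Export_ID, which is how a single instance serves multiple exports.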

NFS-Ganesha High Availability. Problem: a single NFS-Ganesha server is a single point of failure and a bottleneck, and it cannot scale with the backend filesystem. Solution: clustering, for high availability and load balancing.

NFS-Ganesha HA: Active-Passive (diagram). Virtual IP managed by Pacemaker/Corosync. The NFS client mounts through the virtual IP: mount -t nfs <Virtual_IP>:<export_path> /mnt


NFS-Ganesha HA: Active-Active (diagram). Pacemaker/Corosync. The NFS client mounts through a virtual IP: mount -t nfs <Virtual_IP>:<export_path> /mnt

Samba

Samba. File and print sharing. SMB / CIFS, SMB2 and SMB3+ dialects. Authentication: NTLMv2 and Kerberos. Identity mapping: Windows SIDs to uids and gids. Active Directory domain member or domain controller.
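To make the identity-mapping bullet concrete, here is a minimal sketch of an algorithmic SID-to-ID mapping in the style of Samba's idmap_rid backend, whose documented formula is ID = RID - BASE_RID + LOW_RANGE_ID. The domain SID, the ID range, and the function name are hypothetical; other idmap backends (for example tdb or ad) work differently.

```python
def sid_to_unix_id(sid: str, domain_sid: str, range_low: int,
                   range_high: int, base_rid: int = 0) -> int:
    """Map a domain SID to a uid/gid using the idmap_rid-style formula."""
    prefix = domain_sid + "-"
    if not sid.startswith(prefix):
        raise ValueError("SID does not belong to the configured domain")
    rid = int(sid[len(prefix):])             # last sub-authority is the RID
    unix_id = rid - base_rid + range_low     # ID = RID - BASE_RID + LOW_RANGE_ID
    if not range_low <= unix_id <= range_high:
        raise ValueError("mapped ID falls outside the configured range")
    return unix_id


# Hypothetical domain SID and ID range, for illustration only.
DOMAIN_SID = "S-1-5-21-1111111111-2222222222-3333333333"
uid = sid_to_unix_id(DOMAIN_SID + "-1104", DOMAIN_SID, 100000, 199999)
print(uid)
```

Because the mapping is a pure function of the RID, every gateway computes the same uid for the same user without sharing a mapping database.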

Protocol. SMB / CIFS: legacy dialect, hundreds of commands and subcommands, UNIX extensions. SMB2: clean break from old dialects; modern, simplified protocol with improved performance.

Protocol (continued). SMB2.1 to SMB3.1.1: most recent protocol revisions. Lease improvements, RDMA extensions, Multichannel, Witness Protocol, end-to-end encryption.

Clients. Windows: reference client; protocol specification publisher. macOS: AFP replaced by SMB2 as default for Mavericks (2013). Linux: CIFS kernel module and Samba smbclient.

Clustered Trivial Database (CTDB). Samba persistent state is stored in the TDB key-value store; CTDB shares some of this state across multiple nodes. Cluster-consistent database. Reliable messaging. HA features bolted on: monitoring and failover.

CTDB. Nodes participate in the election of a recovery master. The recovery master monitors the state of the cluster and performs database recovery if necessary. A cluster-wide mutex is used to prevent split brain. Clients are tickled on IP failover.
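The role of the cluster-wide mutex can be pictured with a toy model: several nodes race for one lock, and only the node that takes it may act as recovery master. This is not CTDB's election algorithm; a threading.Lock stands in for the cluster mutex, and the node names are made up.

```python
import threading

# Stand-in for the cluster-wide mutex; in a real cluster this is a lock
# that every node can see, not an in-process object.
cluster_mutex = threading.Lock()


def try_become_recovery_master(node: str, winners: list) -> None:
    """Record the node as master only if it takes the mutex without blocking."""
    if cluster_mutex.acquire(blocking=False):
        # The mutex stays held for as long as this node remains master,
        # so a second node can never conclude that it is master too.
        winners.append(node)


winners: list = []
threads = [
    threading.Thread(target=try_become_recovery_master, args=(f"node{i}", winners))
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(winners)
```

Whichever node wins the race, the list always holds exactly one entry, which is the property that rules out split brain.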

Samba (diagram). Components shown: Ceph VFS module, TDB, libcephfs.

Samba Ceph Integration. CephFS module for Samba: vfs_ceph. Maps SMB file and directory I/O to libcephfs API calls. Static cephx credentials, regardless of the Samba authenticated user. POSIX ACLs.
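A share backed by vfs_ceph is declared in smb.conf. The sketch below renders such a share section using options described in the vfs_ceph manual page (vfs objects, ceph:config_file, ceph:user_id, kernel share modes). The share name, the cephx user "samba", and the helper name are assumptions for illustration.

```python
def render_ceph_share(name: str, cephfs_path: str, ceph_user: str) -> str:
    """Return an smb.conf share section that routes I/O through vfs_ceph."""
    options = {
        "path": cephfs_path,                        # path inside CephFS, not a local mount
        "vfs objects": "ceph",                      # load the vfs_ceph module
        "ceph:config_file": "/etc/ceph/ceph.conf",
        "ceph:user_id": ceph_user,                  # static cephx identity for all SMB users
        "kernel share modes": "no",                 # no local kernel file to lock against
        "read only": "no",
    }
    body = "\n".join(f"    {key} = {value}" for key, value in options.items())
    return f"[{name}]\n{body}"


# Hypothetical share name and cephx user.
share = render_ceph_share("cephfs", "/", "samba")
print(share)
```

The single ceph:user_id value reflects the static-credentials point above: every SMB session reaches CephFS with the same cephx identity.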

Samba Ceph Integration (continued). Ceph RADOS clustered mutex helper for CTDB. Ceph librados service integration (coming soon).
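CTDB is pointed at the RADOS mutex helper through its recovery lock setting, which takes a "!" followed by the helper command and the arguments cluster, user, pool, and object. The sketch below only builds that configuration line. The helper's install path varies by distribution, and the pool, object, and user names are hypothetical; where the setting lives (ctdb.conf or an older environment-style file) depends on the CTDB version.

```python
def render_recovery_lock(helper_path: str, cluster: str, user: str,
                         pool: str, obj: str) -> str:
    """Build the CTDB recovery lock line that delegates locking to a helper."""
    # The leading "!" tells CTDB to run a helper program instead of
    # locking a file on a shared filesystem.
    return f"recovery lock = !{helper_path} {cluster} {user} {pool} {obj}"


# All argument values below are illustrative assumptions.
reclock_line = render_recovery_lock(
    "/usr/lib64/ctdb/ctdb_mutex_ceph_rados_helper",
    "ceph", "client.samba", "ctdb_pool", "ctdb_reclock",
)
print(reclock_line)
```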

Testing. Samba: smbtorture. cifs.ko: fstests. Interoperability: macOS, Windows, Hyper-V, etc.

Performance

Benchmark Setup. Ceph setup on 8 nodes: 5 OSD nodes (24 cores, 128 GB RAM); 3 MON/MDS nodes (24 cores, 128 GB RAM); 6 OSD daemons per node; Bluestore; SSD/NVMe journals. 10 client nodes (16 cores, 16 GB RAM). Network interconnect: public network 10 Gbit/s, cluster network 100 Gbit/s.

FIO Job. Read/write data to a file of size #SIZE for #TIME. 10 client nodes; jobs are executed on each client node. A single job is of type {number_of_workers}rw_{block_size}_{op}, where: number of worker threads: 1, 4, 8, 16 (each worker thread creates a file of 1g size); block sizes: 4k, 64k, 1m, 4m, 8m; op: rw.
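The job naming scheme can be expanded mechanically. The sketch below enumerates the job names produced by the pattern {number_of_workers}rw_{block_size}_{op}, assuming the parameter lists are read as four worker counts, five block sizes, and the single op "rw"; the function name is made up, and the actual fio job files are not reproduced here.

```python
from itertools import product

WORKERS = [1, 4, 8, 16]                       # worker threads per job
BLOCK_SIZES = ["4k", "64k", "1m", "4m", "8m"]
OPS = ["rw"]                                  # assumed: the only op listed on the slide


def job_names() -> list:
    """Expand the pattern {number_of_workers}rw_{block_size}_{op} over all parameters."""
    return [
        f"{workers}rw_{block_size}_{op}"
        for workers, block_size, op in product(WORKERS, BLOCK_SIZES, OPS)
    ]


names = job_names()
print(len(names), names[0], names[-1])
```

Under these assumptions the matrix has 4 x 5 x 1 = 20 jobs per client node, from 1rw_4k_rw to 16rw_8m_rw.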

Performance: NFS-Ganesha vs CephFS. Benchmarking was performed with: NFS-Ganesha v2.5.2; Ceph version 12.2.1; a single NFS-Ganesha server; NFS version 4.0; read/write of data to a 1 GB file for 2 minutes.

NFS-Ganesha vs CephFS Kernel client: Aggregated B/W over 10 clients

NFS-Ganesha vs CephFS Kernel client: Read/Write ratio B/W comparison (in %)

Observations. For a single thread, NFS bandwidth is 80% of CephFS bandwidth. Performance degrades as the number of threads increases. The single nfs-ganesha server is the bottleneck.

NFS-Ganesha vs CephFS: latency with 16 threads (figure; panels: NFS-Ganesha, CephFS)

Performance: Samba vs CephFS. Preliminary results! Environment: Ceph version 12.2.2; Samba 4.6.9; three Samba gateways; vfs_ceph with oplocks / leases disabled; non-overlapping share paths; Linux cifs.ko client (4.4 kernel with many backports); SMB 3.0 mount.

Challenges and Future

Challenges. Cross-protocol client support. Shared (NFS, CephFS) ACL model: RichACLs or POSIX draft ACLs. Coherent client caching: map leases to CephFS FILE and AUTH capabilities. Unified authentication and user mapping: use Kerberos / AD for Samba gateway and cephx.

Challenges. libcephfs asynchronous I/O. Multichannel support: experimental in upstream Samba; not integrated with CTDB. Automated deployment.

Challenges. Witness protocol: continuous availability of SMB shares; advertise Samba cluster state to clients; transparent client failover; load balancing.

Samba: Future. Ceph-backed key-value store for Samba: replace or modify CTDB; RocksDB? The Samba database API is demanding: multiple processes and writers; record locking and transactions.

NFS-Ganesha: Future. Clustering without Linux HA. NFS v4.1 support. librados service integration.

References. Samba: https://samba.org/; CTDB: https://ctdb.samba.org/; SMB 3.1.1 encryption: https://technet.microsoft.com/en-us/library/dn551363(v=ws.11).aspx; Multichannel deployment: https://technet.microsoft.com/en-us/library/dn610980(v=ws.11).aspx; Witness Protocol: http://www.sambaxp.org/archive_data/sambaxp2015-slides/wed/track1/sambaxp2015-wed-track1-Guenther_Deschner-ImplementingTheWitnessProtocolInSamba.pdf; Samba Multichannel Blocker Bug: https://bugzilla.samba.org/show_bug.cgi?id=11897; CephFS cache flags: https://jtlayton.wordpress.com/2016/09/01/cephfs-and-the-nfsv4-change-attribute/

Join Us at www.opensuse.org

License. This slide deck is licensed under the Creative Commons Attribution-ShareAlike 4.0 International license. It can be shared and adapted for any purpose (even commercially) as long as Attribution is given and any derivative work is distributed under the same license. Details can be found at https://creativecommons.org/licenses/by-sa/4.0/

General Disclaimer. This document is not to be construed as a promise by any participating organisation to develop, deliver, or market a product. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. openSUSE makes no representations or warranties with respect to the contents of this document, and specifically disclaims any express or implied warranties of merchantability or fitness for any particular purpose. The development, release, and timing of features or functionality described for openSUSE products remains at the sole discretion of openSUSE. Further, openSUSE reserves the right to revise this document and to make changes to its content, at any time, without obligation to notify any person or entity of such revisions or changes. All openSUSE marks referenced in this presentation are trademarks or registered trademarks of SUSE LLC, in the United States and other countries. All third-party trademarks are the property of their respective owners.

Credits. Template: Richard Brown (rbrown@opensuse.org). Design & Inspiration: openSUSE Design Team, http://opensuse.github.io/brandingguidelines/