SNIA SDC 2018 Santa Clara

Similar documents
Clustered Samba Challenges and Directions. SDC 2016 Santa Clara

SDC 2015 Santa Clara

SMB2 and SMB3 in Samba: Durable File Handles and Beyond. sambaxp 2012

SerNet. Samba Status Update. SNIA SDC 2011 Santa Clara, CA. Volker Lendecke SerNet Samba Team

Clustered NAS For Everyone Clustering Samba With CTDB A Tutorial At sambaxp 2009

Clustering Samba With CTDB A Tutorial At sambaxp 2010

Clustering Samba With CTDB A Tutorial At sambaxp 2010

Implementing SMB2 in Samba. Opening Windows to a Wider. Jeremy Allison Samba Team/Google Open Source Programs Office

Implementing Persistent Handles in Samba. Ralph Böhme, Samba Team, SerNet

Samba in a cross protocol environment

The State of Samba (June 2011) Jeremy Allison Samba Team/Google Open Source Programs Office

SDC EMEA 2019 Tel Aviv

Running And Troubleshooting A Samba/CTDB Cluster. A Tutorial At sambaxp 2011

CTDB + Samba: Scalable Network Storage For The Cloud. Storage Networking World Europe 2011

The workstation account, netlogon schannel and credentials. SambaXP Volker Lendecke Samba Team / SerNet

Clustered NAS For Everyone Clustering Samba With CTDB. NLUUG Spring Conference 2009 File Systems and Storage

PLAYING NICE WITH OTHERS: Samba HA with Pacemaker

Samba. OpenLDAP Developer s Day. Volker Lendecke, Günther Deschner Samba Team

Jeremy Allison Samba Team

SerNet. Async Programming in Samba. Linuxkongress Oktober Volker Lendecke SerNet Samba Team. Network Service in a Service Network

CEPHALOPODS AND SAMBA IRA COOPER SNIA SDC

SMB3 and Linux Seamless POSIX file serving. Jeremy Allison Samba Team.

System Verification At Scale: Thousands of Users Do you need to test with them or not?

CTDB Remix - Dreaming the Fantasy

HANDLING PERSISTENT PROBLEMS: PERSISTENT HANDLES IN SAMBA. Ira Cooper Tech Lead / Red Hat Storage SMB Team May 20, 2015 SambaXP

Emulating Windows file serving on POSIX. Jeremy Allison Samba Team

SMB3 Multi-Channel in Samba

Clustered Samba Not just a hack any more

Multiprotocol Locking and Lock Failover in OneFS Aravind Srinivasan EMC, Isilon Storage Division

Go Deep: Fixing Architectural Overheads of the Go Scheduler

Distributed Systems 16. Distributed File Systems II

Native POSIX Thread Library (NPTL) CSE 506 Don Porter

File Systems Overview. Jin-Soo Kim ( Computer Systems Laboratory Sungkyunkwan University

Samba and Ceph. Release the Kraken! David Disseldorp

Experiences in Clustering CIFS for IBM Scale Out Network Attached Storage (SONAS)

Dynamic Metadata Management for Petabyte-scale File Systems

CephFS A Filesystem for the Future

Chapter 11: File System Implementation. Objectives

Samba4 Status - April Andrew Tridgell Samba Team

CIFS/9000 Server File Locking Interoperability

Samba4 Progress - March Andrew Tridgell Samba Team

The CephFS Gateways Samba and NFS-Ganesha. David Disseldorp Supriti Singh

NFS: Naming indirection, abstraction. Abstraction, abstraction, abstraction! Network File Systems: Naming, cache control, consistency

Lessons learned implementing a multithreaded SMB2 server in OneFS

MATRIX:DJLSYS EXPLORING RESOURCE ALLOCATION TECHNIQUES FOR DISTRIBUTED JOB LAUNCH UNDER HIGH SYSTEM UTILIZATION

Background: I/O Concurrency

Developing Management Strategies and Tools for Samba. Jeffrey Bianchine

Reasons to Migrate from NFS v3 to v4/ v4.1. Manisha Saini QE Engineer, Red Hat IRC #msaini

Ext4 Filesystem Scaling

Distributed Systems. Hajussüsteemid MTAT Distributed File Systems. (slides: adopted from Meelis Roos DS12 course) 1/25

BUILDING A SCALABLE MOBILE GAME BACKEND IN ELIXIR. Petri Kero CTO / Ministry of Games

Using samba base libraries SSSD as an example application

The Art and Science of Memory Allocation

Centralized configuration management using registry tdb in a CTDB cluster

Open Source Storage. Ric Wheeler Architect & Senior Manager April 30, 2012

Final Examination CS 111, Fall 2016 UCLA. Name:

From an open storage solution to a clustered NAS appliance

Virtual File System. Don Porter CSE 506

Causally Ordering Distributed File System Events Tanuj Khurana & Raeanne Marks Dell EMC Isilon

Beyond Relational Databases: MongoDB, Redis & ClickHouse. Marcos Albe - Principal Support Percona

File Systems. What do we need to know?

NFS Version 4.1. Spencer Shepler, Storspeed Mike Eisler, NetApp Dave Noveck, NetApp

What is a file system

RCU. ò Dozens of supported file systems. ò Independent layer from backing storage. ò And, of course, networked file system support

Virtual File System. Don Porter CSE 306

NFS Version 4 Open Source Project. William A.(Andy) Adamson Center for Information Technology Integration University of Michigan

Operating Systems, Assignment 2 Threads and Synchronization

Fast and Scalable Queue-Based Resource Allocation Lock on Shared-Memory Multiprocessors

Distributed Filesystem

Introduction To Gluster. Thomas Cameron RHCA, RHCSS, RHCDS, RHCVA, RHCX Chief Architect, Central US Red

Fast and Scalable Queue-Based Resource Allocation Lock on Shared-Memory Multiprocessors

This section discusses the protocols available for volumes on Nasuni Filers.

-Presented By : Rajeshwari Chatterjee Professor-Andrey Shevel Course: Computing Clusters Grid and Clouds ITMO University, St.

CS 537 Fall 2017 Review Session

PROCESS CONCEPTS. Process Concept Relationship to a Program What is a Process? Process Lifecycle Process Management Inter-Process Communication 2.

The Google File System

An Analysis of Linux Scalability to Many Cores

CSE Traditional Operating Systems deal with typical system software designed to be:

The Google File System

CS 326: Operating Systems. Process Execution. Lecture 5

Distributed file systems

Software Based Fault Injection Framework For Storage Systems Vinod Eswaraprasad Smitha Jayaram Wipro Technologies

DIVING IN: INSIDE THE DATA CENTER

The Art and Science of Memory Alloca4on

Lustre A Platform for Intelligent Scale-Out Storage

SMB2.2 Advancements for WAN Molly Brown, Mathew George Windows File Server Team Microsoft Corporation

FILE SYSTEMS. Jo, Heeseung

GOOGLE FILE SYSTEM: MASTER Sanjay Ghemawat, Howard Gobioff and Shun-Tak Leung

FILE SYSTEMS, PART 2. CS124 Operating Systems Fall , Lecture 24

OCFS2 Mark Fasheh Oracle

Input & Output 1: File systems

Kernel Scalability. Adam Belay

CLOUD-SCALE FILE SYSTEMS

CS 4284 Systems Capstone

CSE 124: Networked Services Fall 2009 Lecture-19

Solving Linux File System Pain Points. Steve French Samba Team & Linux Kernel CIFS VFS maintainer Principal Software Engineer Azure Storage

An Introduction to GPFS

Linux File Systems: Challenges and Futures Ric Wheeler Red Hat

NPTEL Course Jan K. Gopinath Indian Institute of Science

System Design for a Million TPS

Transcription:

Clustered Samba Scalability Improvements SNIA SDC 2018 Santa Clara Volker Lendecke Samba Team / SerNet 2018-09-26

Volker Lendecke Samba Status (2 / 13) Samba architecture For every client Samba forks a new process Distinct memory spaces in every process MS-SMB2 and MS-FSA suggest a lot of shared tables Lists of clients, tree connects, open files Samba can t use any of those data structures directly Samba shares data structures via shared key/value stores TDB is a memory-mapped hash table Protection via fcntl locks or shared mutexes TDB provides a clean separation layer

Volker Lendecke Samba Status (3 / 13) SMB history SMB semantics date back to DOS single-user OS Every application by definition had exclusive file access SHARE.EXE maintained illusion by blocking concurrent access Network-aware applications could explicitly permit sharing Different modes of access permitted on a per-open basis Posix opens only have to read metadata Permissions, file location etc Inherent scalability problem through share modes SMB opens need to examine all other opens

Volker Lendecke Samba Status (4 / 13) SMB share modes Every open call requests access permissions READ, WRITE or DELETE (among others) Every open call allows other permissions Concurrent READ, WRITE or DELETE permitted First come, first serve All open handles are entered in a central table struct share mode entry: uint32 t share mode uint32 t access mask Samba stores an array of those per inode in locking.tdb

Volker Lendecke Samba Status (5 / 13) Clustered TDB ctdb ctdb extends tdb files beyond a single machine ctdbd is a daemon to move records around smbd requesting a record gets a local copy ctdb maintains the most recent record location locking.tdb can be lossy Share mode state valid only for open file handles A crashed node s file handles are closed by definition ctdb record access is like NUMA with extreme node distance

Volker Lendecke Samba Status (6 / 13) ctdb Architecture Node 0 ctdb TCP ctdb Node 1 sock sock sock sock smbd smbd mmap mmap mmap mmap smbd mmap smbd mmap locking.tdb locking.tdb

Volker Lendecke Samba Status (7 / 13) locking.tdb scalability Samba s design scales nicely on homedir workloads Mostly exclusive access to many files Heavily contended files are a problem Customer use case: 15,000 users accessing the same exe file simultaneously Every open call needs to check 15.000 share mode entries Nonclustered Samba handles this nicely However, nonlinear processing time per open

Volker Lendecke Samba Status (8 / 13) Optimizations Don t walk the whole list on every open Maintain a central most restrictive share mode share mode: Store the most restrictive share mode handed out access mask: Store the superset of all current access masks Lazy update: More restrictive open request: update central record When closing, don t update: We d have to check all other entries At open time: Just check the central record Only at conflict time, walk the whole list This optimizes the massive non-conflict case

Volker Lendecke Samba Status (9 / 13) Per-Node share mode lists Bouncing 15k share mode entries per open ctdb melts down under this load One central share mode entry per node Every node maintains its own central entry Central record remains small The list of share entries is maintained separately Two databases: locking.tdb and share entries.tdb

Volker Lendecke Samba Status (10 / 13) Multiple Nodes locking.tdb Share Mode Data Super Share Mode Super Share Mode Super Share Mode share entries.tdb

Volker Lendecke Samba Status (11 / 13) But what about leases? Looking at open files.idl there s two arrays Share modes can be split into share entries.tdb Every share entry corresponds to a lease entry lease entries are shared Where to store the lease entries? Leases can be used across different TCP connections Lease information is stored in leases.tdb leases.tdb indexed by client guid and lease key All lease information from locking.tdb moved to leases.tdb leases.tdb needs serious micro-optimization

Volker Lendecke Samba Status (12 / 13) Current Status Remove leases from locking.tdb: Done Central share mode union: Work in progress Multiple share mode arrays: To be done Open problem: Cleanup of share mode data Lazy close keeps share mode unions around When a share mode array drops to 0, look at the others?

Volker Lendecke Samba Status (13 / 13) Questions? vl@samba.org / vl@sernet.de http://www.sambaxp.org/