Distributed File Storage in Multi-Tenant Clouds using CephFS
FOSDEM 2018
John Spray, Software Engineer, Ceph
Christian Schwede, Software Engineer, OpenStack Storage

In this presentation:
- Brief overview of the key components
- What is OpenStack Manila
- CephFS native driver: CephFS driver implementation (available since OpenStack Newton)
- NFS-Ganesha driver: NFS backed by the CephFS driver implementation (available since OpenStack Queens)
- Future work: OpenStack Queens and beyond

What's the challenge? We want a filesystem that is shared between multiple nodes, but that is at the same time tenant aware and self-managed by the tenant admins (e.g. separate shares for Tenant A and Tenant B).

How do we solve this?

OpenStack Manila is the OpenStack Shared Filesystems service. It provides APIs for tenants to request file system shares and supports several drivers: proprietary drivers, CephFS, and a generic driver (NFS on Cinder). The basic workflow: 1. the tenant admin asks the Manila API to create a share; 2. the selected driver creates the share on the storage cluster/controller; 3. the storage backend returns an address; 4. Manila passes the address to the tenant; 5. the guest VM mounts the share.
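As an illustration of that workflow from the tenant side, a minimal sketch using the manila CLI (share name, share type, and cephx user are examples, not from the talk):

    # Create a 10 GiB share on a CephFS-backed share type (type name is illustrative)
    manila create cephfs 10 --name myshare --share-type cephfstype

    # Grant access to a Ceph client user called "alice" (cephx-based access, used by the native driver)
    manila access-allow myshare cephx alice

    # Retrieve the export location (monitor addresses + path) to mount from the guest VM
    manila share-export-location-list myshare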

CephFS in context: Ceph provides object, block, and file interfaces on a common foundation.
- RGW: S3- and Swift-compatible object storage with object versioning, multi-site federation, and replication.
- RBD: a virtual block device with snapshots, copy-on-write clones, and multi-site replication.
- CephFS: a distributed POSIX file system with coherent caches and snapshots on any directory.
- LIBRADOS: a library allowing apps direct access to RADOS (C, C++, Java, Python, Ruby, PHP).
- RADOS: a software-based, reliable, autonomic, distributed object store comprised of self-healing, self-managing, intelligent storage nodes (OSDs) and lightweight monitors (MONs).
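For orientation, a sketch of the three access paths via the standard CLIs (pool, image, path, and monitor addresses are illustrative):

    # Object: store an object directly in a RADOS pool
    rados -p mypool put hello.txt ./hello.txt

    # Block: create a 1 GiB RBD image (size is in MiB)
    rbd create mypool/myimage --size 1024

    # File: mount CephFS with the kernel client
    mount -t ceph 192.168.0.10:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret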

Why integrate CephFS with Manila/OpenStack? Most OpenStack users are already running a Ceph cluster; it is an open source storage solution (see https://www.openstack.org/user-survey/survey-2017); and CephFS metadata scalability is ideally suited to cloud environments.

A quick break for terminology. The reference deployment consists of:
- Controller nodes (Controller 0..2), each running MariaDB and the service APIs.
- Compute nodes (Compute 0..X), each running Nova.
- Storage nodes, running the Ceph OSDs.
The private storage network carries the data plane; the public OpenStack Service API (external) network carries the control plane.

CephFS native driver*: available since the OpenStack Mitaka release and Ceph Jewel.
* For OpenStack private clouds: helps trusted Ceph clients use shares backed by a CephFS backend through the native CephFS protocol.

First approach: the CephFS native driver. Available since OpenStack Mitaka, it offers the best performance, access to all CephFS features, and simple deployment and implementation. The OpenStack client/Nova VM talks directly to the Ceph server daemons (Monitors, Metadata Servers, OSD daemons): metadata updates go to the MDS, data updates to the OSDs. See also "Manila on CephFS at CERN: The Short Way to Production" by Arne Wiebalck: https://www.openstack.org/videos/boston-2017/manila-on-cephfs-at-cern-the-short-way-to-production

CephFS native driver deployment. [Diagram: tenant VMs (Tenant A, Tenant B) with two NICs attach both to an external provider network (via routers) and to the storage provider network, which connects to the storage (Ceph public) network; controller nodes host the Ceph MON, MGR, and MDS alongside the Manila API and Manila Share services; storage nodes host the Ceph OSDs; the public OpenStack Service API (external) network is separate. Open question on Ceph MDS placement: with the MONs, with the Python services, or dedicated?]
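From inside a trusted guest VM, mounting a native share then looks roughly like the sketch below (monitor addresses, the share path, and the cephx user are illustrative and would come from Manila's export location and access rules; a ceph.conf and keyring for the client are assumed to be in place):

    # Kernel client mount of the share's export path
    mount -t ceph 192.168.0.10:6789,192.168.0.11:6789:/volumes/_nogroup/myshare /mnt/myshare \
        -o name=alice,secretfile=/root/alice.secret

    # Or with the FUSE client (required at the time for quota enforcement)
    ceph-fuse --id alice -r /volumes/_nogroup/myshare /mnt/myshare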

CephFS Native Driver, pros:
- Performance!
- Success stories, popular!
- Simple implementation.
- Makes HA relatively easy.

CephFS Native Driver, cons:
- User VMs have direct access to the storage network using Ceph protocols.
- Needs client-side cooperation.
- Share size quotas are supported only with Ceph FUSE clients.
- Assumes trusted user VMs.
- Requires special client and key distribution.
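On the quota point above: CephFS expresses a share size limit as an extended attribute on the share directory (the driver sets an equivalent limit via libcephfs), and at the time of this talk only the FUSE client enforced it. A brief sketch, with an illustrative path:

    # Limit the share directory to 10 GiB; enforcement happens in the client
    setfattr -n ceph.quota.max_bytes -v 10737418240 /volumes/_nogroup/myshare

    # Read the current quota back
    getfattr -n ceph.quota.max_bytes /volumes/_nogroup/myshare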

CephFS NFS driver*: full debut in OpenStack Queens, together with Ceph Luminous, NFS-Ganesha v2.5.4, and ceph-ansible 3.1.
* For OpenStack clouds: helps NFS clients use the CephFS backend via NFS-Ganesha gateways.

NFS-Ganesha: a user-space NFSv2, NFSv3, NFSv4, NFSv4.1, and pNFS server.
- Modular architecture: a pluggable File System Abstraction Layer (FSAL) allows for various storage backends (e.g. GlusterFS, CephFS, GPFS, Lustre, and more).
- Dynamic export/unexport/update via D-Bus.
- Can manage huge metadata caches.
- Simple access to other user-space services (e.g. KRB5, NIS, LDAP).
- Open source.
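A minimal sketch of what a CephFS-backed Ganesha export looks like (export ID, paths, and the cephx user are illustrative; the Manila driver generates blocks roughly like this for each share):

    # Illustrative export definition for one share, using the CEPH FSAL
    cat > /etc/ganesha/export.d/myshare.conf <<'EOF'
    EXPORT {
        Export_Id = 100;                      # unique ID per export
        Path = "/volumes/_nogroup/myshare";   # CephFS path backing the share
        Pseudo = "/myshare";                  # NFSv4 pseudo-filesystem path
        Access_Type = RW;
        Squash = No_Root_Squash;
        FSAL {
            Name = CEPH;                      # libcephfs-backed FSAL
            User_Id = "myshare-user";         # cephx identity with path-restricted caps
        }
    }
    EOF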

CephFS NFS driver (in the control plane). The tenant (Horizon GUI or manila-client CLI) talks to the Manila services (with the Ceph NFS driver) over HTTP: it creates shares*, share groups, and snapshots, allows/denies IP access, and gets back the share's export location. Manila then:
- over SSH, adds/updates/removes exports on the NFS-Ganesha gateway, on disk and via D-Bus;
- over native Ceph, creates directories and directory snapshots on the storage cluster (with CephFS), which returns the directory path and Ceph monitor addresses.
The NFS-Ganesha gateway itself talks native Ceph to the cluster, performing a per-directory libcephfs mount/umount with path-restricted MDS caps (better security).
* A Manila share = a CephFS directory + quota + a unique RADOS namespace.
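In tenant terms, the NFS flow differs mainly in how access is granted: by client IP instead of cephx identity. A short sketch (share name and CIDR are illustrative):

    # Allow an IP range on the StorageNFS network to access the share
    manila access-allow mynfsshare ip 10.0.0.0/24

    # The export location is now a Ganesha address plus pseudo path
    manila share-export-location-list mynfsshare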

CephFS NFS driver (in the data plane). Clients (OpenStack client/Nova VMs) connect to the NFS-Ganesha gateway rather than to the Ceph daemons directly, which gives better security. There is no single point of failure (SPOF) in the Ceph storage cluster itself (HA of MON, MDS, OSD), but NFS-Ganesha must also be HA to avoid a SPOF in the data plane; active/passive HA for NFS-Ganesha is work in progress (Pacemaker/Corosync). Behind the gateway, metadata updates flow over native Ceph to the Metadata Server and data updates to the OSD daemons.
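From the guest VM the share is then consumed as ordinary NFS (the gateway address and path are illustrative):

    # Mount the share via the Ganesha gateway's address on the StorageNFS network
    mount -t nfs -o vers=4.1 172.16.5.4:/myshare /mnt/myshare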

CephFS NFS driver deployment. [Diagram: as in the native deployment, but tenant VMs (Tenant A, Tenant B, two NICs) attach to a dedicated StorageNFS network instead of the Ceph public network, with the NFS-Ganesha gateways bridging the StorageNFS network and the storage (Ceph public) network; controller nodes host the Ceph MON and MDS and the Manila API and Share services; storage nodes host the Ceph OSDs; routers connect to the external provider network; the public OpenStack Service API (external) network is separate. The same Ceph MDS placement question applies: with the MONs, with the Python services, or dedicated?]

TripleO (OOO), Pacemaker, containers, and Ganesha. [Diagram: the three controller nodes form a Pacemaker cluster, which runs the containerized Ganesha service; storage nodes run the OSDs; the private storage network carries the data plane and the public OpenStack Service API (external) network the control plane.]

Current CephFS NFS driver, pros:
- Security: isolates user VMs from the Ceph public network and its daemons.
- Familiar NFS semantics, access control, and end-user operations.
- A large base of clients can now use Ceph storage for file shares without doing anything different; NFS is supported out of the box and doesn't need any special drivers.
- Path separation in the backend storage and network policy (enforced by Neutron security rules on a dedicated StorageNFS network) provide multi-tenancy support.

Current CephFS NFS driver, cons:
- Ganesha is a man in the middle in the data path and a potential performance bottleneck.
- HA via the controller-node Pacemaker cluster limits our ability to scale, as does the (current) inability to run Ganesha active/active.
- We'd like to be able to spawn Ganesha services on demand, per tenant, as required, rather than statically launching them at cloud deployment time.

What lies ahead...

Next step: an integrated NFS gateway in Ceph to export CephFS. Ganesha becomes an integrated NFS gateway to the Ceph file system. This targets deployments beyond OpenStack which need a gateway client to the storage network (e.g. standalone appliance, Kerberos, OpenStack, etc.), provides an alternative, stable client that avoids legacy kernels or FUSE, gateways/secures access to the storage cluster, and layers on potential Ganesha enhancements (e.g. Kerberos). [Diagram: a VM on the share network runs "mount -t nfs ipaddr:/" against one of several Ganesha instances, which sit between the share network and the Ceph network (MON, OSD, MDS).] See also John Spray's talk at the OpenStack summit in April 2016: https://www.youtube.com/watch?v=vt4xuqwetg0&t=1335

HA and scale-out.
High availability: a Kubernetes-managed Ganesha container; container life-cycle and resurrection are not managed by Ceph (Kubernetes handles this); ceph-mgr creates shares and launches the containers through Kubernetes.
Scale-out (avoiding a single point of failure): ceph-mgr creates multiple Ganesha containers for a share; (potentially) a Kubernetes load balancer allows automatic multiplexing between Ganesha containers via a single service IP (see the sketch below).
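A hypothetical sketch of the Kubernetes side of this design (resource names and the image are illustrative; in the envisioned integration ceph-mgr would drive these steps through the Kubernetes API rather than kubectl):

    # Run Ganesha for one share/tenant as a Kubernetes-managed deployment
    kubectl create deployment ganesha-tenant-a --image=example/nfs-ganesha:latest

    # Scale out to multiple Ganesha containers for the share
    kubectl scale deployment ganesha-tenant-a --replicas=2

    # Expose the NFS port behind a single service IP that multiplexes across the replicas
    kubectl expose deployment ganesha-tenant-a --port=2049 --name=ganesha-tenant-a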

[Diagram: Manila talks to the Ceph MGR, which launches the NFS gateway (Ganesha) as a Kubernetes container in front of the MDS and OSDs; HA is managed by Kubernetes.]

Future: it becomes trivial to have a Ganesha instance per tenant. [Diagram: a Kubernetes cluster, co-located on the Ceph public network with the Ceph MON, MGR, MDS, and OSDs, hosts the per-tenant Ganesha gateways; controller nodes host the Manila API and Share services; tenant VMs (Tenant A, Tenant B) on the compute nodes reach their gateway through routers and the external provider network; the public OpenStack Service API (external) network is separate.]

Challenges and lingering technical details:
- How to recover Ganesha state in the MDS during failover (open files, delegations). One solution: put all Ganesha shares into grace during failover to prevent lock/capability theft (a heavyweight approach).
- Preserve CephFS capabilities for takeover on a timeout; introduce sticky client IDs.
- Need a mechanism to indicate to CephFS that state reclamation by the client is complete.
- Need to handle a cold (re)start of the Ceph cluster where state held by the client (Ganesha) was lost by the MDS cluster (the entire Ganesha cluster must be put in grace while state is recovered).

Further future:
- Performance: exploit MDS scale-out, NFS delegations, pNFS.
- Container environments: implement Kubernetes Persistent Volume Claims, re-using the underlying NFS/networking model; perhaps even re-use Manila itself outside of OpenStack.

Thanks! John Spray jspray@redhat.com Christian Schwede cschwede@redhat.com

Links
OpenStack talks:
- Sage Weil, The State of Ceph, Manila, and Containers in OpenStack, OpenStack Tokyo Summit 2015: https://www.youtube.com/watch?v=dntcboumaau
- John Spray, CephFS as a service with OpenStack Manila, OpenStack Austin Summit 2016: https://www.youtube.com/watch?v=vt4xuqwetg0
- Ramana Raja, Tom Barron, Victoria Martinez de la Cruz, CephFS Backed NFS Share Service for Multi-Tenant Clouds, OpenStack Boston 2017: https://www.youtube.com/watch?v=bmdv-iqlv8c
- Patrick Donnelly, Large-scale Stability and Performance of the Ceph File System, Vault 2017: https://docs.google.com/presentation/d/1x13lveetquc2qrj1zuzibjeuhhg0cdzcbydimzooqLY
- Sage Weil et al., Ceph: A Scalable, High-Performance Distributed File System: https://dl.acm.org/citation.cfm?id=1298485
- Sage Weil et al., Panel: Experiences Scaling File Storage with CephFS and OpenStack: https://www.youtube.com/watch?v=iphkei3arpg
CERN:
- Storage overview: http://cern.ch/go/976x
- Cloud overview: http://cern.ch/go/6hld
- Blog: http://openstack-in-production.blogspot.fr