SDS Heterogeneous OS Access
Alejandro Bonilla, Technical Strategist
abonilla@suse.com
Agenda
- Introduction to Ceph
  - Architecture
  - RADOS Block Device (RBD)
- iSCSI overview
- Exposing Ceph RBDs via iSCSI
  - LIO iSCSI target
  - Background and implementation
- SUSE Enterprise Storage
Introduction to Ceph and RBD
SUSE Enterprise Storage & Ceph
Community Ceph key features
- Unlimited scalability
- Self repairing
- Unified file, object and block access
- Non-disruptive online scaling of capacity
- No single point of failure (SPOF)
- Rolling upgrades
- Thin provisioning for optimized utilization
- Cache tiering for performance
- Copy-on-write clones for application rollback
- Erasure coding for space-efficient resilience
Ceph Cache Tiered Pools
Ceph Multiple Replication Options
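Ceph pools can protect data with full replicas or with erasure coding, and the space trade-off between the two is simple arithmetic. The sketch below works it through. The function names and the sample figures (3 replicas, a profile with k=4 data chunks and m=2 coding chunks, 100 TB usable) are illustrative assumptions, not values taken from this deck.

```python
def raw_for_replication(usable_tb: float, copies: int) -> float:
    """Raw capacity needed when every object is stored `copies` times."""
    return usable_tb * copies


def raw_for_erasure_coding(usable_tb: float, k: int, m: int) -> float:
    """Raw capacity needed when each object is split into k data chunks
    and protected by m additional coding chunks."""
    return usable_tb * (k + m) / k


if __name__ == "__main__":
    usable = 100.0  # assumed usable capacity in TB
    print("3x replication raw TB:", raw_for_replication(usable, 3))
    print("EC k=4, m=2 raw TB:  ", raw_for_erasure_coding(usable, 4, 2))
    # Replication with n copies survives the loss of n - 1 copies;
    # an erasure-coded pool survives the loss of up to m chunks.
```

Under these assumptions both layouts survive two simultaneous failures, but the erasure-coded pool needs half the raw capacity, at the cost of extra computation on writes and recovery.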
Ceph Overview: Cluster Roles
- Monitors
  - Check and provide consensus on cluster state
  - Maintain cluster maps
- Object Storage Daemons (OSD)
  - Pool storage resources
  - Perform replication
  - Serve objects to clients
- Clients
  - Obtain cluster maps from Monitors
  - Communicate with OSDs directly
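Clients can talk to OSDs directly because object placement is computed from the cluster map rather than looked up in a central table. The sketch below illustrates that idea only. It is a toy stand-in that uses rendezvous hashing, not Ceph's actual CRUSH algorithm, and the names `place`, `pg_count` and the object name are hypothetical.

```python
import hashlib


def _score(key: str) -> int:
    """Stable 64-bit hash of a string (same result on every client)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


def place(object_name: str, osd_ids: list, replicas: int = 3, pg_count: int = 128):
    """Map an object to a placement group, then to an ordered list of OSDs.

    Every client holding the same OSD list computes the same answer,
    so no lookup service sits in the data path.
    """
    pg = _score(object_name) % pg_count
    # Rank OSDs by a per-(pg, osd) hash and keep the top `replicas`.
    ranked = sorted(osd_ids, key=lambda osd: _score(f"{pg}:{osd}"), reverse=True)
    return pg, ranked[:replicas]


if __name__ == "__main__":
    osds = list(range(6))  # assumed cluster of six OSDs
    print(place("rbd_data.example.0000000000000000", osds))
```

The first OSD in the returned list plays the role of the primary in this sketch; real CRUSH additionally accounts for weights and failure domains.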
Ceph Overview
Ceph Client Access
RADOS Block Device
Features
- Block device backed by RADOS objects
- Inherits availability and scalability characteristics of Ceph
- Thin provisioned
- Online resizable
- Supports snapshots and clones
- Linux kernel and librbd clients
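Thin provisioning and online resizing follow from the image being a set of fixed-size objects that only exist once written. The sketch below models that behaviour in memory. `ThinImage` and its methods are hypothetical names, and the 4 MiB object size is an assumed default; nothing here talks to a real cluster.

```python
OBJECT_SIZE = 4 * 1024 * 1024  # assumed object size in bytes


class ThinImage:
    """In-memory model of a block image striped over sparse objects."""

    def __init__(self, size: int, object_size: int = OBJECT_SIZE):
        self.size = size
        self.object_size = object_size
        self.objects = {}  # object index -> bytearray, created on first write

    def allocated(self) -> int:
        """Bytes actually backed by objects (not the advertised size)."""
        return len(self.objects) * self.object_size

    def write(self, offset: int, data: bytes) -> None:
        if offset < 0 or offset + len(data) > self.size:
            raise ValueError("write outside image")
        pos = 0
        while pos < len(data):
            idx, within = divmod(offset + pos, self.object_size)
            chunk = min(len(data) - pos, self.object_size - within)
            obj = self.objects.setdefault(idx, bytearray(self.object_size))
            obj[within:within + chunk] = data[pos:pos + chunk]
            pos += chunk

    def read(self, offset: int, length: int) -> bytes:
        if offset < 0 or offset + length > self.size:
            raise ValueError("read outside image")
        out = bytearray()
        pos = 0
        while pos < length:
            idx, within = divmod(offset + pos, self.object_size)
            chunk = min(length - pos, self.object_size - within)
            obj = self.objects.get(idx)
            # Unwritten regions read back as zeros without using space.
            out += obj[within:within + chunk] if obj is not None else bytes(chunk)
            pos += chunk
        return bytes(out)

    def resize(self, new_size: int) -> None:
        """Grow for free; on shrink, drop or zero data past the new end."""
        if new_size < self.size:
            for idx in [i for i in self.objects if i * self.object_size >= new_size]:
                del self.objects[idx]
            last, within = divmod(new_size, self.object_size)
            if within and last in self.objects:
                self.objects[last][within:] = bytes(self.object_size - within)
        self.size = new_size
```

Growing the image only changes the advertised size, which is why resizing can happen while the device is in use.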
Kernel RBD Client
- libceph and rbd kernel modules
- RBD image mapped as local block device
- Use like any other regular block device
- Page cache utilized for improved performance
- Stack: Application -> Filesystem -> Linux page cache -> RADOS Block Device -> libceph -> Network
librbd Client
- User space RBD client library
- Plumb RBD images directly into VMs
- QEMU / libvirt support
- Includes in-memory cache implementation
- Stack: Application (QEMU) -> librbd -> librados -> Network
CephFS
- POSIX file system
- Clustered metadata servers
  - Dynamic subtree partitioning
- Interoperability
  - Linux client (first release)
  - Samba, Ganesha, Hadoop (to follow)
iSCSI Overview
iSCSI Introduction
- Mechanism for transporting block storage traffic over a regular TCP/IP network
- iSCSI initiators (clients) communicate with iSCSI targets (servers)
- SCSI commands and responses encapsulated in iSCSI packets
- Remote storage appears on the iSCSI initiator as a local disk
  - Attach and format with XFS, NTFS, etc.
  - Boot from a remote target with an iSCSI-capable network adapter or boot loader
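To make "SCSI commands encapsulated in iSCSI packets" concrete, the sketch below packs a SCSI READ(10) command descriptor block into the 48-byte Basic Header Segment of an iSCSI SCSI Command PDU, following the field layout in RFC 3720. It is a minimal illustration only: it omits login, digests and data segments, and the helper names and sample values are assumptions.

```python
import struct


def read10_cdb(lba: int, blocks: int) -> bytes:
    """Build a 10-byte SCSI READ(10) CDB (opcode 0x28)."""
    # opcode, flags, 4-byte LBA, group number, 2-byte transfer length, control
    return struct.pack(">BBIBHB", 0x28, 0, lba, 0, blocks, 0)


def scsi_command_pdu(cdb: bytes, lun: int, task_tag: int, expected_len: int,
                     cmd_sn: int, exp_stat_sn: int, read: bool = True) -> bytes:
    """Wrap a CDB in a 48-byte iSCSI SCSI Command header (no data segment)."""
    if len(cdb) > 16:
        raise ValueError("CDBs longer than 16 bytes need an additional header segment")
    if not 0 <= lun < 256:
        raise ValueError("this sketch only encodes single-level LUNs below 256")
    opcode = 0x01                             # SCSI Command
    flags = 0x80 | (0x40 if read else 0x20)   # Final bit plus Read or Write bit
    lun_field = struct.pack(">BB6x", 0, lun)  # simple single-level LUN encoding
    return struct.pack(
        ">BBHB3s8sIIII16s",
        opcode, flags,
        0,                     # reserved
        0,                     # TotalAHSLength
        b"\x00\x00\x00",       # DataSegmentLength (no immediate data)
        lun_field,
        task_tag,              # Initiator Task Tag
        expected_len,          # Expected Data Transfer Length in bytes
        cmd_sn, exp_stat_sn,
        cdb.ljust(16, b"\x00"),
    )


if __name__ == "__main__":
    # Read 8 blocks of an assumed 512 bytes each, starting at LBA 2048.
    pdu = scsi_command_pdu(read10_cdb(2048, 8), lun=0, task_tag=1,
                           expected_len=8 * 512, cmd_sn=1, exp_stat_sn=1)
    print(len(pdu), pdu.hex())
```

The initiator sends this header over an established TCP connection; the target answers with Data-In PDUs carrying the blocks, then a SCSI Response.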
iSCSI: Linux IO Target
- LIO: Linux IO Target
- In-kernel SCSI target implementation
  - Support for a number of SCSI transports
  - Pluggable storage backend
- The current preferred iSCSI target for Linux
- Flexible configuration
  - ConfigFS and targetcli shell utility
iSCSI Gateway for Ceph RBD
SUSE Enterprise Storage 2 Interoperability - iSCSI heterogeneous OS block level access
Linux-IO Target (LIO), kernel space
- Fabric: FC, FCoE, FireWire, iSCSI, iSER, SRP, loop, vhost
- Backstore: FILEIO, IBLOCK, pSCSI, RAMDISK, TCMU, RBD
iSCSI Gateway for RBD
- Expose benefits of Ceph RBD to other systems
  - No requirement for Ceph-aware applications or operating systems
- Standardized interface
  - Mature and trusted protocol
- iSCSI initiator implementations are widespread
  - Provided with most modern operating systems
iSCSI Gateway for RBD
- LIO target configured with iSCSI transport fabric
- RBD backstore module
  - Translates SCSI IO into Ceph OSD requests
  - Special handling of operations that require exclusive device access
    - Atomic COMPARE AND WRITE, WRITE SAME and reservations
- lrbd: multi-node configuration utility
  - Applies iSCSI target configuration across multiple gateways via targetcli
SUSE Enterprise Storage 2 Interoperability - iSCSI heterogeneous OS block level access
Multipath iSCSI Gateway
- Allows for initiator access via redundant paths
- iSCSI gateway node with multiple network adapters
  - Protection from a single network adapter failure
- Multiple iSCSI gateways exporting same RBD image
  - Protection from entire gateway failure
- Initiator responsible for utilization of redundant paths
  - Available paths advertised in iSCSI discovery exchange
  - May choose to round-robin the IO, or to failover/failback
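The two initiator-side policies named above, round-robin and failover/failback, can be sketched as a small path selector. This is an illustration of the policies only, not how dm-multipath or any other initiator implements them; the `Path` and `PathSelector` names and the portal addresses are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Path:
    portal: str       # gateway portal address advertised during discovery
    up: bool = True   # health as seen by the initiator


class PathSelector:
    """Pick a path for each IO according to a multipath policy."""

    def __init__(self, paths, policy: str = "round-robin"):
        if policy not in ("round-robin", "failover"):
            raise ValueError("unknown policy")
        self.paths = list(paths)  # order doubles as failover priority
        self.policy = policy
        self._turn = 0

    def next_path(self) -> Path:
        live = [p for p in self.paths if p.up]
        if not live:
            raise RuntimeError("all paths down")
        if self.policy == "failover":
            # Highest-priority healthy path; failback happens automatically
            # once the preferred path is marked up again.
            return live[0]
        path = live[self._turn % len(live)]
        self._turn += 1
        return path


if __name__ == "__main__":
    a, b = Path("192.0.2.11:3260"), Path("192.0.2.12:3260")
    selector = PathSelector([a, b], policy="failover")
    print(selector.next_path().portal)
    a.up = False  # simulate loss of the first gateway
    print(selector.next_path().portal)
```

With two gateways exporting the same RBD image, either policy keeps IO flowing when one gateway or adapter fails; round-robin additionally spreads load while both are healthy.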
iSCSI Gateway Optimizations
- Efficient handling of certain SCSI operations
  - Offload RBD image IO to OSDs
  - Avoid locking on iSCSI gateway nodes
- COMPARE AND WRITE
  - New cmpext OSD operation to handle RBD data comparison
  - Dispatch as compound cmpext+write OSD request
- WRITE SAME
  - New writesame OSD operation to expand duplicate data at the OSD
- Reservations
  - State stored as RBD image extended attribute
  - Updated using compound cmpxattr+setxattr OSD request
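The reason these operations are pushed down to the OSD is that a compare and the write that depends on it must happen as one atomic step, even when several gateways serve the same image. The sketch below models the semantics of the three compound operations on a single in-memory object. `ToyObject` and its method names are hypothetical and only mirror the operation names on the slide; this is not the OSD implementation.

```python
import threading


class ToyObject:
    """In-memory stand-in for one RADOS object with data and xattrs."""

    def __init__(self, size: int):
        self.data = bytearray(size)
        self.xattrs = {}
        self._lock = threading.Lock()  # models per-object atomicity at the OSD

    def _check(self, offset: int, length: int) -> None:
        if offset < 0 or offset + length > len(self.data):
            raise ValueError("range outside object")

    def cmpext_write(self, offset: int, expected: bytes, new: bytes) -> bool:
        """COMPARE AND WRITE: write `new` only if the current bytes match."""
        self._check(offset, max(len(expected), len(new)))
        with self._lock:
            if self.data[offset:offset + len(expected)] != expected:
                return False  # miscompare, nothing is written
            self.data[offset:offset + len(new)] = new
            return True

    def writesame(self, offset: int, pattern: bytes, repeat: int) -> None:
        """WRITE SAME: the caller sends one pattern, the object expands it."""
        self._check(offset, len(pattern) * repeat)
        with self._lock:
            block = pattern * repeat
            self.data[offset:offset + len(block)] = block

    def cmpxattr_setxattr(self, name: str, expected, new) -> bool:
        """Reservation update: swap an attribute only if it is unchanged."""
        with self._lock:
            if self.xattrs.get(name) != expected:
                return False
            self.xattrs[name] = new
            return True


if __name__ == "__main__":
    obj = ToyObject(64)
    print(obj.cmpext_write(0, bytes(4), b"AAAA"))  # first writer
    print(obj.cmpext_write(0, bytes(4), b"BBBB"))  # second writer, stale compare
```

Because the check and the update run under one lock on the object, two gateways racing on the same range or the same reservation cannot both succeed, and the gateways themselves need no shared lock.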
Configuration with lrbd
- Apply LIO configuration across multiple iSCSI gateway nodes
- Configuration state stored in Ceph cluster
- iSCSI gateway nodes apply configuration on boot
  - RBD images mapped prior to exposure
- JSON configuration format
Configuration with lrbd
- Targets section
  - iSCSI gateway hosts
  - Target iSCSI Qualified Name (IQN)
- Portals section
  - IP addresses to utilize for iSCSI traffic
- Pools section
  - RBD images to expose
- Auth section
  - Access restrictions based on initiator name
  - CHAP credentials
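The four sections above can be pictured as one JSON document. The sketch below builds such a document in Python and checks that every gateway host refers to a defined portal. Only the section names (targets, portals, pools, auth) come from the slide; the field names inside each section, the IQN, host names, addresses and image name are illustrative assumptions and should be checked against the lrbd documentation before use.

```python
import json

# Hypothetical target name, used only for illustration.
TARGET_IQN = "iqn.2003-01.org.linux-iscsi.igw.x86:sn.demo"

config = {
    # Auth section: access restrictions / CHAP (disabled in this sketch).
    "auth": [{"target": TARGET_IQN, "authentication": "none"}],
    # Targets section: IQN and the gateway hosts that serve it.
    "targets": [{
        "target": TARGET_IQN,
        "hosts": [
            {"host": "igw1", "portal": "portal1"},
            {"host": "igw2", "portal": "portal2"},
        ],
    }],
    # Portals section: IP addresses used for iSCSI traffic.
    "portals": [
        {"name": "portal1", "addresses": ["192.0.2.11"]},
        {"name": "portal2", "addresses": ["192.0.2.12"]},
    ],
    # Pools section: RBD images to expose through the target.
    "pools": [{
        "pool": "rbd",
        "gateways": [{"target": TARGET_IQN, "tpg": [{"image": "demo-image"}]}],
    }],
}


def undefined_portals(cfg: dict) -> set:
    """Return portal names used by target hosts but never defined."""
    defined = {portal["name"] for portal in cfg["portals"]}
    used = {host["portal"] for target in cfg["targets"] for host in target["hosts"]}
    return used - defined


if __name__ == "__main__":
    assert not undefined_portals(config)
    print(json.dumps(config, indent=2))
```

Listing two hosts under one target is what gives initiators the redundant gateway paths described on the multipath slide.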
iSCSI Initiators
- open-iscsi
  - Default iSCSI initiator shipped with SLES10 and later
  - Multipath supported in combination with dm-multipath
  - Available on most Linux distributions
- Microsoft iSCSI initiator
  - Installed by default from Windows Server 2008 and later
  - Supports MPIO in recent versions
  - Not available on desktops
- VMware ESX
  - Concurrent clustered filesystem (VMFS) access from multiple initiators
SUSE Enterprise Storage
SUSE Enterprise Storage Development Focus Areas
- Manageability
  - Ease of install
  - Centralized management, monitoring, reporting
- Interoperability
  - Unified block/file/object (heterogeneous OS access)
  - Fabric interconnect
- Efficiency
  - Cache tiering
  - Deduplication/compression
  - Hierarchical storage management
- Availability
  - Backup/archive
  - Continuous data protection
  - Remote replication
SUSE Enterprise Storage Target Market
[Chart: workloads placed on two axes, performance optimized to capacity optimized and low functionality to high functionality. Workloads shown: HPC, OLTP, CRM, ERP, data analytics, email, big data, VM-aware storage, bulk storage, video/audio, object storage, archive storage, DR target, data backup, compliance archive. One region is labeled "Initial Target Market".]
SUSE Enterprise Storage: Bulk storage
Use case: Bulk Storage (Block)
- Alternative for SCSI disk array
- iSCSI SAN heterogeneous access
  - Windows, VMware, Linux, etc.
- Examples
  - Windows file store
  - General purpose block storage
- Products shown on the slide: EMC Celerra, Dot Hill AssuredSAN 3000, Dell EqualLogic
SUSE Enterprise Storage: Object storage
Use case: Object Storage
- Use RESTful APIs (Swift or Amazon S3)
- Provide on-premise cloud storage
- Similar cost to AWS (Amazon)
- Products shown on the slide: EMC Atmos, Dell DX Object Storage Platform
SUSE Enterprise Storage: Archive storage
Use case: Archive Storage (Block or Object)
- Cold storage or active archive
  - Medical records
  - Email archive
- Use industry-leading backup software
- Products shown on the slide: Oracle StorageTek, NetApp AltaVault, EMC Data Domain
SUSE Enterprise Storage: Rich media video/audio
Use case: Rich Media
- Examples
  - Video streaming
  - Audio streaming
- Tiered deployment
  - Use cache tiering
  - Active data in performance tier
  - Inactive data stored in capacity tier
For More Information
- open-iscsi
  - RFC 3720: https://www.ietf.org/rfc/rfc3720.txt
  - URL: http://www.openiscsi.org
- Ceph
  - General: http://ceph.com
  - Documentation: http://docs.ceph.com
- SUSE Enterprise Storage
  - Product: https://www.suse.com/products/suse-enterprise-storage/
  - Documentation: https://www.suse.com/documentation/
Questions? Thank you.