A SCSI Transport Layer Extension with Separate Data and Control Paths for Scalable Storage-Area-Network Architectures

Similar documents
SNIA Discussion on iscsi, FCIP, and IFCP Page 1 of 7. IP storage: A review of iscsi, FCIP, ifcp

Storage Area Network (SAN)

Module 2 Storage Network Architecture

CONTENTS. 1. Introduction. 2. How To Store Data. 3. How To Access Data. 4. Manage Data Storage. 5. Benefits Of SAN. 6. Conclusion

Introduction to iscsi

Storage Update and Storage Best Practices for Microsoft Server Applications. Dennis Martin President, Demartek January 2009 Copyright 2009 Demartek

Storage Area Network (SAN)

Storage Area Network (SAN) Training Presentation. July 2007 IBM PC CLUB Jose Medeiros Storage Systems Engineer MCP+I, MCSE, NT4 MCT

Disk Storage Systems. Module 2.5. Copyright 2006 EMC Corporation. Do not Copy - All Rights Reserved. Disk Storage Systems - 1

Storage Area Networks: Performance and Security

A GPFS Primer October 2005

Direct-Attached Storage (DAS) is an architecture

Storage Area Networks SAN. Shane Healy

InfiniBand Networked Flash Storage

access addresses/addressing advantages agents allocation analysis

IBM IBM Open Systems Storage Solutions Version 4. Download Full Version :

3.1. Storage. Direct Attached Storage (DAS)

Traditional SAN environments allow block

COSC6376 Cloud Computing Lecture 17: Storage Systems

Learn Your Alphabet - SRIOV, NPIV, RoCE, iwarp to Pump Up Virtual Infrastructure Performance

Mellanox InfiniBand Solutions Accelerate Oracle s Data Center and Cloud Solutions

SATA RAID For The Enterprise? Presented at the THIC Meeting at the Sony Auditorium, 3300 Zanker Rd, San Jose CA April 19-20,2005

STORAGE CONSOLIDATION WITH IP STORAGE. David Dale, NetApp

Multifunction Networking Adapters

I/O Considerations for Server Blades, Backplanes, and the Datacenter

iser as accelerator for Software Defined Storage Rahul Fiske, Subhojit Roy IBM (India)

STORAGE CONSOLIDATION WITH IP STORAGE. David Dale, NetApp

Performance Evaluation of iscsi based storage network with cluster mirroring. Rahul Jangid 1, Mr. Rajendra Purohit 2

DESIGN AND IMPLEMENTATION OF A SOFTWARE PROTOTYPE FOR STORAGE AREA NETWORK PROTOCOL EVALUATION

Storage Protocol Offload for Virtualized Environments Session 301-F

Exam : S Title : Snia Storage Network Management/Administration. Version : Demo

Snia S Storage Networking Management/Administration.

CSC Operating Systems Spring Lecture - XIX Storage and I/O - II. Tevfik Koşar. Louisiana State University.

RAID Structure. RAID Levels. RAID (cont) RAID (0 + 1) and (1 + 0) Tevfik Koşar. Hierarchical Storage Management (HSM)

S SNIA Storage Networking Management & Administration

Oracle RAC 10g Celerra NS Series NFS

Administrivia. CMSC 411 Computer Systems Architecture Lecture 19 Storage Systems, cont. Disks (cont.) Disks - review

Evaluating the Impact of RDMA on Storage I/O over InfiniBand

Mass-Storage Structure

Assignment No. SAN. Title. Roll No. Date. Programming Lab IV. Signature

A Crash Course In Wide Area Data Replication. Jacob Farmer, CTO, Cambridge Computer

Demystifying Storage Area Networks. Michael Wells Microsoft Application Solutions Specialist EMC Corporation

Storage Systems Market Analysis Dec 04

Cold Storage: The Road to Enterprise Ilya Kuznetsov YADRO

Introduction to TCP/IP Offload Engine (TOE)

iscsi Protocols iscsi, Naming & Discovery, Boot, MIBs John Hufferd, Sr. Technical Staff IBM SSG

EMC Symmetrix DMX Series The High End Platform. Tom Gorodecki EMC

High Performance Computing Course Notes High Performance Storage

PESIT Bangalore South Campus

iscsi Technology: A Convergence of Networking and Storage

The advantages of architecting an open iscsi SAN

Exploiting the full power of modern industry standard Linux-Systems with TSM Stephan Peinkofer

Storage Systems. Storage Systems

The File Systems Evolution. Christian Bandulet, Sun Microsystems

EMC Unified Storage for Oracle Database 11g/10g Virtualized Solution. Enabled by EMC Celerra and Linux using FCP and NFS. Reference Architecture

IBM High-End Disk Solutions, Version 3. Download Full Version :

Configuring and Managing Virtual Storage

Voltaire. Fast I/O for XEN using RDMA Technologies. The Grid Interconnect Company. April 2005 Yaron Haviv, Voltaire, CTO

S S SNIA Storage Networking Foundations

Microsoft Office SharePoint Server 2007

Chapter 12: Mass-Storage

PeerStorage Arrays Unequalled Storage Solutions

Chapter 10: Mass-Storage Systems. Operating System Concepts 9 th Edition

Virtualization And High Availability. Howard Chow Microsoft MVP

Fibre Channel Gateway Overview

OceanStor 5300F&5500F& 5600F&5800F V5 All-Flash Storage Systems

How to Speed up Database Applications with a Purpose-Built SSD Storage Solution

Performance and Innovation of Storage. Advances through SCSI Express

Veritas NetBackup on Cisco UCS S3260 Storage Server

Novel Intelligent I/O Architecture Eliminating the Bus Bottleneck

As storage networking technology

VX3000-E Unified Network Storage

USING ISCSI AND VERITAS BACKUP EXEC 9.0 FOR WINDOWS SERVERS BENEFITS AND TEST CONFIGURATION

The benefits of. Clustered storage offers advantages in both performance and scalability, but users need to evaluate three different architectures.

Dell Fluid Data solutions. Powerful self-optimized enterprise storage. Dell Compellent Storage Center: Designed for business results

IOPStor: Storage Made Easy. Key Business Features. Key Business Solutions. IOPStor IOP5BI50T Network Attached Storage (NAS) Page 1 of 5

Assessing performance in HP LeftHand SANs

Hands-On Wide Area Storage & Network Design WAN: Design - Deployment - Performance - Troubleshooting

Connectivity. Module 2.2. Copyright 2006 EMC Corporation. Do not Copy - All Rights Reserved. Connectivity - 1

HP Supporting the HP ProLiant Storage Server Product Family.

Chapter 10: Mass-Storage Systems

BUILDING A BLOCK STORAGE APPLICATION ON OFED - CHALLENGES

EMC CLARiiON CX3 Series FCP

EMC VMAX 400K SPC-2 Proven Performance. Silverton Consulting, Inc. StorInt Briefing

SNIA Developers Conference - Growth of the iscsi RDMA (iser) Ecosystem

Database over NFS. beepy s personal perspective. Brian Pawlowski Vice President and Chief Architect Network Appliance

QuickSpecs. Models. HP Smart Array 642 Controller. Overview. Retired

Vendor must indicate at what level its proposed solution will meet the College s requirements as delineated in the referenced sections of the RFP:

Motivation CPUs can not keep pace with network

iscsi Technology Brief Storage Area Network using Gbit Ethernet The iscsi Standard

OceanStor 6800F V5 Mission-Critical All-Flash Storage Systems

EMC Business Continuity for Oracle Database 11g

E-Seminar. Storage Networking. Internet Technology Solution Seminar

Storage Performance Management Overview. Brett Allison, IntelliMagic, Inc.

Specifying Storage Servers for IP security applications

Vblock Architecture. Andrew Smallridge DC Technology Solutions Architect

VX1800 Series Unified Network Storage

Big Data Processing Technologies. Chentao Wu Associate Professor Dept. of Computer Science and Engineering

IBM IBM Storage Networking Solutions Version 1.

Implementing a Digital Video Archive Based on the Sony PetaSite and XenData Software

Transcription:

Technion - Israel Institute of technology Department of Electrical Engineering SCSI-DSDC A SCSI Transport Layer Extension with Separate Data and Control Paths for Scalable Storage-Area-Network Architectures Yitzhak Birk Nafea Bishara Agenda Introduction to Storage Area Networks and Problem Statement SCSI Protocol and associated SCSI Transport Protocol Distributed and Split Data-Control extension to SCSI Transport Protocol Prototype and Performance results 2 1

Introduction to SAN and problem statement 3 Storage in the internet era Storage demand is increasing rapidly: Traditional Internet and Enterprise application Emerging of new killer applications: Local Backups, Remote Backups and disaster recovery Requirement from Storage subsystem: Seamless Scalability: In performance and storage space Multi-platform interoperability Performance Reliability and Availability Dedicated Storage Sub-system independent of the clients or Application Servers 4 2

Various Enterprise Storage Models Direct Attached Storage (DAS) Network Attached Storage (NAS) Storage Area Network (SAN) Application File System Disk Storage Application Network File System Disk Storage File Semantics Block Semantics (SCSI) Application File System Network Disk Storage 5 Storage - The heart of SAN The storage is a software/hardware entity that manages one or more storage-containing entities and provides: A simple and abstract view of the managed devices Making a large store from many small ones Data striping (RAID) For load balancing, throughput., falut-tolerance.. Manages spare disks For seamless fail-over Local and Remote mirroring Data Caching Access control (LUN masking and reservation) LUN masking hierarchy is supported as well 6 3

A Real World Storage Hierarchy Example Client Client Network File Server Application Server Copy Manager Storage Network RAID With built-in disk array Disk Array Disk Array 7 RAID Storage - Location in the SAN The SAN controller can be found: Internal to the host/server Disadvantage: Does not allow sharing of the storage system between multiple hosts/servers In the same enclosure with the disk arrays/tape drives Disadvantage: capacity limited by the capacity of the enclosure A standalone entity, connected to the SAN, and manages other disk arrays and s Advantages: Supports managing more than one enclosure. Facilitates redundant controllers and makes backups simpler Facilitates multi-vendor systems and interoperability The industry trend is going toward standalone SAN controller 8 4

Problem Statement All control and data flows through the storage controller, making it a potential bottleneck. Computational or Network bottleneck Limits the overall storage area network performance The problem becomes more severe with 10Gigabit Networks and more that 1Gigabit HDD transfer rates The challenge: open the bottleneck while retaining compatibility and simplicity Storage Network Storage Disk Array 9 Disk Array Prior Art Many studies focused on distributed File-System implementations (NAS) or Network-Attached Storage/Secure Devices (NASD) Relatively high-level semantics (files, objects) Heavy requirement on storage devices and, that need to handle files and objects, and even run part of application code Other studies focused on efficient data block transfers Remote DMA (Over VIA, InfiniBand or over TCP/IP) Parallel Transport Protocol Focus on data moving and placement, but direct peer-to-peer architectures (no ) 10 5

The Goal of Our Work Define an architecture that would scale the SAN performance, relieving the controller bottleneck under the following constrains: Stay compliant with the SCSI-3 protocol (The de-facto block I/O protocol) Keep backward compatibility and comply with all SCSI/SAN software developed in the last two decades support coexistence of devices supporting the new architecture and traditional devices in the same SAN 11 SCSI Protocol Suite 12 6

The SCSI-3 Layered Architecture SCSI Application SCSI Application Protocol SCSI Application SCSI SCSI SCSI Transport Protocol Service Interface SCSI Transport Protocol Services SCSI Transport Protocol SCSI Transport Protocol Services iscsi FCP SRP Network Interconnect Service Interface Interconnect Services Interconnect Interconnect Services TCP/IP FC-PH IB 13 Initiator The SCSI Protocol WriteCommand( ITT(I), LBA(I),Length) Target R2T ( ITT(I), TTT(T), Length) WriteData( TTT(T), Length) Status( ITT(I) ) Solicited Write ReadCommand( Tag(I), LBA(I),Length) ReadData( Tag(I), Length) Time Read Status( ITT(I) ) 14 LBA: Logical block address R2T: Ready to Transfer ITT: Initiator Task Tag TTT: Target Task Tag CID: Connection ID 7

Initiator The SCSI Protocol with : Write Transaction WriteCommand( ITT(I), LBA(I),Length) Target R2T ( ITT(I), TTT(C), Length) WriteData( TTT(C), Length) WriteCommand( ITT(C), LBA(C),Length) Time R2T ( ITT(C), TTT(T), Length) WriteData( TTT(T), Length) Status( ITT(C) ) Status( ITT(I) ) 15 LBA: Logical block address R2T: Ready to Transfer ITT: Initiator Task Tag TTT: Target Task Tag Distributed and Split Control- Data 16 8

Re-stating DSDC Objectives Define an architecture that would scale the SAN performance under the following constrains: Stay compliant with the SCSI model and application layer To keep backward compatibility and the comply with all SCSI/SAN software developed in the last two decades Limit the changes to the SCSI transport protocol in the Initiator and Target No change in the hardware and/or interconnect layer No change to the SCSI application layer Changes in the transport layer are limited to software/firmware changes. May require changes to the application layer in the controller Be generic enough to apply to (almost) any transport protocol 17 The two Principles for DSDC Splitting control and data network connections The key aspect to handle include: Ordering, Connection failure, Flow control... Direct inter-connect between Initiators and Targets for data transfer Utilizing the inherit parallelism and full connectivity in Switched storage networks Key challenges include: Authentication and Security, Abstraction of the target,... 18 9

Solicited Write Transaction in Traditional Initiator WriteCommand( ITT(I), LBA(I),Length) Target R2T ( ITT(I), TTT(C), Length) WriteData( TTT(C), Length) WriteCommand( ITT(C), LBA(C),Length,) Time R2T ( ITT(C), TTT(T), Length) WriteData( TTT(T), Length) Status( ITT(C) ) Status( ITT(I) ) 19 LBA: Logical block address R2T: Ready to Transfer ITT: Initiator Task Tag TTT: Target Task Tag Solicited Write Transaction in DSDC Initiator WriteCommand( ITT(I), LBA(I),Length) Target R2T ( ITT(I), WriteCommand( ITT(C), LBA(C),Length,,CID,ITT(I ) TTT(T), Length) Time WriteData( TTT(T), Length) Status( ITT(C) ) Status( ITT(I) ) 20 LBA: Logical block address R2T: Ready to Transfer ITT: Initiator Task Tag TTT: Target Task Tag CID: Connection ID 10

DSCD in Read Transactions Initiator ReadCommand( ITT(I), Target LBA(I),Length) ReadCommand( ITT(C), LBA(C),Length, CID, ITT(I) ) ReadData ( ITT(I), Length) Status( ITT(C) ) Time Status( ITT(I) ) May arrive out-of order 21 DSCD in READ Transactions: Complications Read Data and status run on different connections Status: Target Initiator Data: Target Initiator May arrive in different order Solution: The controller returns to the initiator the number of data transfers to expect per command (together with the status) The initiator can use this to identify the end of the transaction 22 11

Other SCSI Commands There are hundreds of other SCSI command All but one (Unsolicited Write) do not transfer data Unsolicited Write: must go through the controller, Solicited Write is the recommended command for SANs The handling of all commands except for Read and Solicited Write is unchanged by DSDC. 23 Other Issues in DSDC* Data Security and Authentication We are not introducing any new threat Simple extension to the authentication schemes showed in the Technical report Latencies Are hardly affected and depend on the target vs. controller location relative to the initiator RAID implementation in DSDC Writing a Single block: Data traffic is cut by half on the, and reduced by 25% in the network Writing a Complete parity group: The DSDC does not save data transfers in the network nor through the controller. (*) Discussed in details in the technical report in EE/Technion/Israel 24 12

Other Issues in DSDC (Cont) Caching Some controllers implement block caching in the controller itself In DSDC, the caching can be implemented in the targets themselves Normally, the targets are disk arrays with built-in cache Smart caching schemes can be implemented by the DSDC controller to proactively fetch data for caching in its own memory. 25 Prototype Description 26 13

Testing Environment All computers running Linux Kernel 2.2.20 iscsi over TCP/IP over Ethernet The controller has two modes: Traditional mode DSDC enabled mode Target Initiator Fast Ethernet Switch 27 SCSI and iscsi in Linux user space kernel space Upper Level SD disks Block device SR cdrom/dvd Block device ST tapes Char device SG generic Char device Mid Level SCSI Unifying Layer this level is responsible for the conversion of command requests into SCSI requests. Manage outstanding command and interact with the low level host bus adapter Low Level Host Bus Adapter driver for local SCSI disks iscsi TCP/IP 28 14

iscsi Implementation Between Two Peers initiator SCSI Upper Level target Mid Level Developed by us Mid-Low layers interface iscsi Initiator iscsi target TCP/IP/Ethernet TCP connections TCP/IP/Ethernet Mid-Low layers interface Host Bus Adaptation driver 29 iscsi Implementation with Initiator SCSI Upper Level Target Mid Level Application iscsi Initiator iscsi target iscsi Initiator iscsi Target TCP/IP/ Ethernet TCP connections TCP/IP/E thernet TCP/IP/E thernet TCP connections TCP/IP/E thernet Host Bus Adaptation driver 30 15

Performance Testing We worked with 1 Initiator, 1 Target and 1 Initiator and Target had 100Mbps Ethernet, to imitate a heavy load from initiators and a large group of targets had 10Mbps Ethernet We built our own SCSI testing utility that bypasses all file-system and buffers cache in Linux We test read and write sustained performance under various data lengths, in both modes: Traditional and DSDC-Enabled 31 Performance Charts for READ 1000 30 I/Os per sec (IO/s) 100 10 25 20 15Mb/s 10 5 bandwidth limitation 1 4 16 64 0 Data per command (KByte) Conventional (IO/s) Conventional (Mb/s) DSDC (IO/s) DSDC (Mb/s) 32 16

Performance Charts for WRITE 100 computational limitation I/Os per 10 sec (IO/s) 16 14 12 10 8 Mb/s 6 4 1 4 16 64 Data per command (KByte) 2 0 Conventional (IO/s) Conventional (Mb/s) DSDC (IO/s) DSDC (Mb/s) 33 Summary: Advantages of DSDC Performance Scalability: The total throughput of the SAN is not limited by the bandwidth of the connection between the controller and the SAN Centralized management Since the controller is no longer the bandwidth bottleneck, it can handle more disk arrays, and saves the need to multiple controller in the SAN, simplifying management The changes are confined to the SCSI Transport layer in the Target/Initiator The controller requires modification in the application layer The prototype demonstrate the correctness and the performance advantages 34 17

Thank You! 35 18