IOFlow. A Software-Defined Storage Architecture

Similar documents
Arrakis: The Operating System is the Control Plane

Network Support for Multimedia

Better Never than Late: Meeting Deadlines in Datacenter Networks

davidklee.net heraflux.com linkedin.com/in/davidaklee

SEDA: An Architecture for Well-Conditioned, Scalable Internet Services

IX: A Protected Dataplane Operating System for High Throughput and Low Latency

Resource Containers. A new facility for resource management in server systems. Presented by Uday Ananth. G. Banga, P. Druschel, J. C.

Problems with IntServ. EECS 122: Introduction to Computer Networks Differentiated Services (DiffServ) DiffServ (cont d)

Pricing Intra-Datacenter Networks with

Comparison of Storage Protocol Performance ESX Server 3.5

DC-DRF : Adaptive Multi- Resource Sharing at Public Cloud Scale. ACM Symposium on Cloud Computing 2018 Ian A Kash, Greg O Shea, Stavros Volos

Presented by: Fabián E. Bustamante

Pivot3 Acuity with Microsoft SQL Server Reference Architecture

Differentiated Services

Announcements. Reading. Project #1 due in 1 week at 5:00 pm Scheduling Chapter 6 (6 th ed) or Chapter 5 (8 th ed) CMSC 412 S14 (lect 5)

Next-Generation Cloud Platform

SE Memory Consumption

Mohammad Hossein Manshaei 1393

QoS support for Intelligent Storage Devices

of-service Support on the Internet

RIGHTNOW A C E

Infrastructure Tuning

Quality of Service (QoS)

Huawei FusionCloud Desktop Solution 5.1 Resource Reuse Technical White Paper HUAWEI TECHNOLOGIES CO., LTD. Issue 01.

Announcements. Quality of Service (QoS) Goals of Today s Lecture. Scheduling. Link Scheduling: FIFO. Link Scheduling: Strict Priority

Fast packet processing in the cloud. Dániel Géhberger Ericsson Research

Feature Comparison Summary

Empirical Evaluation of Latency-Sensitive Application Performance in the Cloud

Overview Computer Networking What is QoS? Queuing discipline and scheduling. Traffic Enforcement. Integrated services

Quality of Service Mechanism for MANET using Linux Semra Gulder, Mathieu Déziel

PRESENTATION TITLE GOES HERE

What s New with VMware vcloud Director 8.0

Real-Time Internet of Things

Xen Community Update. Ian Pratt, Citrix Systems and Chairman of Xen.org

Creating the Fastest Possible Backups Using VMware Consolidated Backup. A Design Blueprint

Differentiated Services

ArcGIS Enterprise Performance and Scalability Best Practices. Andrew Sakowicz

CS6453. Data-Intensive Systems: Rachit Agarwal. Technology trends, Emerging challenges & opportuni=es

VMware Virtual SAN. Technical Walkthrough. Massimiliano Moschini Brand Specialist VCI - vexpert VMware Inc. All rights reserved.

Performance & Scalability Testing in Virtual Environment Hemant Gaidhani, Senior Technical Marketing Manager, VMware

NFVnice: Dynamic Backpressure and Scheduling for NFV Service Chains

The HP 3PAR Get Virtual Guarantee Program

SE Memory Consumption

Quality of Service (QoS)

Can "scale" cloud applications "on the edge" by adding server instances. (So far, haven't considered scaling the interior of the cloud).

Operating Systems. Lecture Process Scheduling. Golestan University. Hossein Momeni

Announcements. Quality of Service (QoS) Goals of Today s Lecture. Scheduling. Link Scheduling: Strict Priority. Link Scheduling: FIFO

sroute: Treating the Storage Stack Like a Network

W H I T E P A P E R. Comparison of Storage Protocol Performance in VMware vsphere 4

Packet Scheduling and QoS

TIPS TO. ELIMINATE LATENCY in your virtualized environment

Understanding Your Virtual Workload. Irfan Ahmad CTO CloudPhysics

Topic 4b: QoS Principles. Chapter 9 Multimedia Networking. Computer Networking: A Top Down Approach

RESOURCE MANAGEMENT MICHAEL ROITZSCH

Cloudian Sizing and Architecture Guidelines

IO Priority: To The Device and Beyond

Performance Considerations of Network Functions Virtualization using Containers

Lecture 14: Performance Architecture

Windows Server 2012 Hands- On Camp. Learn What s Hot and New in Windows Server 2012!

ReFlex: Remote Flash Local Flash

Real-World Performance Training Core Database Performance

[TITLE] Virtualization 360: Microsoft Virtualization Strategy, Products, and Solutions for the New Economy

Page 1. Quality of Service. CS 268: Lecture 13. QoS: DiffServ and IntServ. Three Relevant Factors. Providing Better Service.

Advanced Database Systems

WIND RIVER TITANIUM CLOUD FOR TELECOMMUNICATIONS

Elevate the Conversation: Put IT Resilience into Practice for Cloud Service Providers

Measuring zseries System Performance. Dr. Chu J. Jong School of Information Technology Illinois State University 06/11/2012

Leveraging the power of Flash to Enable IT as a Service

TALK THUNDER SOFTWARE FOR BARE METAL HIGH-PERFORMANCE SOFTWARE FOR THE MODERN DATA CENTER WITH A10 DATASHEET YOUR CHOICE OF HARDWARE

EECS 122: Introduction to Computer Networks Resource Management and QoS. Quality of Service (QoS)

SMB Direct Update. Tom Talpey and Greg Kramer Microsoft Storage Developer Conference. Microsoft Corporation. All Rights Reserved.

davidklee.net gplus.to/kleegeek linked.com/a/davidaklee

DPDK Summit 2016 OpenContrail vrouter / DPDK Architecture. Raja Sivaramakrishnan, Distinguished Engineer Aniket Daptari, Sr.

CSE 120 Principles of Operating Systems

Identifying Performance Bottlenecks with Real- World Applications and Flash-Based Storage

Towards General-Purpose Resource Management in Shared Cloud Services

Performance Isolation in Multi- Tenant Relational Database-asa-Service. Sudipto Das (Microsoft Research)

ASN Configuration Best Practices

Lecture 5: Performance Analysis I

GFS: The Google File System. Dr. Yingwu Zhu

VM Migration, Containers (Lecture 12, cs262a)

Single-pass restore after a media failure. Caetano Sauer, Goetz Graefe, Theo Härder

Version 1.0 October Reduce CPU Utilization by 10GbE CNA with Hardware iscsi Offload

SUSE OpenStack Cloud Production Deployment Architecture. Guide. Solution Guide Cloud Computing.

Advanced Software for the Supercomputer PRIMEHPC FX10. Copyright 2011 FUJITSU LIMITED

Cisco HyperFlex Hyperconverged Infrastructure Solution for SAP HANA

Data Path acceleration techniques in a NFV world

CSCI 350 Ch. 1 Introduction to OS. Mark Redekopp Ramesh Govindan and Michael Shindler

3. Quality of Service

Baoping Wang School of software, Nanyang Normal University, Nanyang , Henan, China

Ten things hyperconvergence can do for you

Scaling Without Sharding. Baron Schwartz Percona Inc Surge 2010

Hyper-Convergence De-mystified. Francis O Haire Group Technology Director

Pocket: Elastic Ephemeral Storage for Serverless Analytics

CS519: Computer Networks. Lecture 5, Part 5: Mar 31, 2004 Queuing and QoS

Single Root I/O Virtualization (SR-IOV) and iscsi Uncompromised Performance for Virtual Server Environments Leonid Grossman Exar Corporation

Transparent Throughput Elas0city for IaaS Cloud Storage Using Guest- Side Block- Level Caching

Nutanix User Meeting 2017

VIRTUALIZING SERVER CONNECTIVITY IN THE CLOUD

Design and Implementation of Virtual TAP for Software-Defined Networks

Transcription:

IOFlow A Software-Defined Storage Architecture ACM SOSP 2013 Authors: Eno Thereska, Hitesh Ballani, Greg O'Shea, Thomas Karagiannis, Antony Rowstron, Tom Talpey, and Timothy Zhu Presented by yifan wu, CS294, UC Berkeley, Sep 21 2015 Some material borrowed from authors SOSP 2013 presentation

Problem ' ' ' Virtual' Machine' vdisk' S(NIC' NIC' S(NIC' NIC' Switch' Switch' Switch' S(NIC' ' ' ' Virtual' Machine' S(NIC' vdisk' Who: enterprise data centers (virtualization of servers and storage) What: end-to-end policies (performance and routing) are hard to enforce

Problem Challenges 1. many operations & layers (e.g. 18?!) 2. non-linear: f(io type, data locality, device type, request size) > processing time 3. deployability (cannot change existing OSs) Virtual Machine Virtual Machine App# OS# App# OS# vdisk vdisk S-NIC NIC S-NIC NIC Switch Switch Switch S-NIC S-NIC

Design Goals Enable high-level flow policies, including for multipoint Declarative interface: {[Set of s], [Set of storage shares]}! Policy

no one has done it before

hyperboleandahalf.blogspot.com

Centralized Control* Inspired from software-defined networking (SDN), thus software-defined storage architecture, but it s much harder Stages Virtual Machine Virtual Machine App# OS# App# OS# vdisk vdisk storage driver in hypervisor storage server S-NIC NIC S-NIC NIC Switch Switch Switch S-NIC S-NIC *But distributed enforcement

Components 1. logically centralized controller 2. data-plane queues 3. interface between the controller and control applications (visibility!) IO#Packets App" OS" App" OS" High2level"SLA"... Queue#1 Queue#n Controller" " IOFlow"API"

Server Message Block

Design Challenges Admission control: is the policy possible? Distributed enforcement: differentiated costs at different stages access SQL data files: from guest OS > hypervisor > network switch > storage server OS > disk array Performance: queues at stages must be fast Incremental deployability/resilience Dynamic control: e.g. min bandwidth

Data Plane Queues 1. Classifica+on [IO Header - > Queue] 2. Queue servicing [Queue - > <token rate, priority, queue size>] 3. Rou+ng [Queue - > Next- hop] Malware'' scanner'

Controller API flexible responsive accurate resilient scalable

Rate Limiting Service request: token-bucket abstraction (simple) Storage request: benchmarks the storage devices to measure the cost of IO requests as a function of their type and size, IoMeter. (RAM, SSDs, disks: read/write ratio, request size) Max-min fair sharing

Tricks Split IO requests into smaller buckets Zero-copying of requests moving from stage to stage Min-heap-based selection of which queue to service next within a stage Heuristics for deciding enforcement layer

Controller Blocking, but it doesn t really matter Hypervisor Storage Server

Performance Tenants SLAs enforced. Intra-tenant work conservation Controller notices red tenant s performance Inter-tenant work conservation Tenant& SLA& Red& {1% %30}%*>%Min%800%MB/s% Green& {31% %60}%*>%Min%800%MB/s% Yellow& {61% %90}%*>%Min%2500%MB/s% Blue& {91% %120}%*>%Min%1500%MB/s%

Data Plane Overhead Worst case CPU consumption at hypervisor is less than 5% Worst case reduction in throughput: RAM 14% SSD 9% Disk 5%

Control Plane Overhead

Checking Assumptions performance bottleneck is at the storage servers small IO requests are typically interrupt limited large requests are limited by the bandwidth of the storage back-end or the server s network link

Questions/Ideas What more does it take to scale? Controller bottleneck? Ad hoc access patterns (bad prediction) Too many nobs, hard to tune? Extend to optimize for co-requests?

IOFlow

Experiment Setup # Switch Clients:10#hypervisor#servers,#12#s#each# 4#tenants#(Red,#Green,#Yellow,#Blue)# 30#s/tenant,#3#s/tenant/server# Storage-network:# Mellanox#40Gbps#RDMA#RoCE#fullLduplex# 1-storage-server:## 16#CPUs,#2.4GHz#(Dell#R720)# SMB#3.0#file#server#protocol# 3#types#of#backend:#RAM,#SSDs,#Disks# Controller:#1#separate#server# 1#sec#control#interval#(configurable)# 29#

Resilience/Consitency When controller is unreachable, use default policy. Allow for inconsistencies