Rok: Decentralized storage for the cloud native world

Similar documents
Rok: Data Management for Data Science

Cloud Storage with AWS: EFS vs EBS vs S3 AHMAD KARAWASH

FIVE REASONS YOU SHOULD RUN CONTAINERS ON BARE METAL, NOT VMS

DEPLOY MODERN APPS WITH KUBERNETES AS A SERVICE

MODERNISE WITH ALL-FLASH. Intel Inside. Powerful Data Centre Outside.

Case Study: Aurea Software Goes Beyond the Limits of Amazon EBS to Run 200 Kubernetes Stateful Pods Per Host

DEPLOY MODERN APPS WITH KUBERNETES AS A SERVICE

Fast and Easy Persistent Storage for Docker* Containers with Storidge and Intel

How to Keep UP Through Digital Transformation with Next-Generation App Development

HARNESSING THE HYBRID CLOUD TO DRIVE GREATER BUSINESS AGILITY

Running Databases in Containers.

VEXATA FOR ORACLE. Digital Business Demands Performance and Scale. Solution Brief

Dell EMC All-Flash solutions are powered by Intel Xeon processors. Learn more at DellEMC.com/All-Flash

Data Centers and Cloud Computing

Data Centers and Cloud Computing. Slides courtesy of Tim Wood

Data Centers and Cloud Computing. Data Centers

VMworld 2018 Content: Not for publication or distribution

Hedvig as backup target for Veeam

STREAMLINING THE DELIVERY, PROTECTION AND MANAGEMENT OF VIRTUAL DESKTOPS. VMware Workstation and Fusion. A White Paper for IT Professionals

Simple Data Protection for the Cloud Era

La plateforme Cloud d Entreprise. Découvrez la vision et la stratégie de Nutanix.

IBM Spectrum NAS, IBM Spectrum Scale and IBM Cloud Object Storage

HOW TO PLAN & EXECUTE A SUCCESSFUL CLOUD MIGRATION

HYCU and ExaGrid Hyper-converged Backup for Nutanix

ebook ADVANCED LOAD BALANCING IN THE CLOUD 5 WAYS TO SIMPLIFY THE CHAOS

Veritas Backup Exec. Powerful, flexible and reliable data protection designed for cloud-ready organizations. Key Features and Benefits OVERVIEW

CONFIDENTLY INTEGRATE VMWARE CLOUD ON AWS WITH INTELLIGENT OPERATIONS

Actifio Sky DB. Actifio s Solution for Oracle, Oracle EBS with standalone, RAC, ASM, EXADATA configurations

Microsoft Operations Management Suite (OMS) Fernando Andreazi RED CLOUD

Offers easy management of all protected devices and data through a unified secure touchfriendly web-based management console.

Beyond 1001 Dedicated Data Service Instances

Hyper-Converged Infrastructure: Providing New Opportunities for Improved Availability

Introduction to the Active Everywhere Database

PERFORMANCE CHARACTERIZATION OF MICROSOFT SQL SERVER USING VMWARE CLOUD ON AWS PERFORMANCE STUDY JULY 2018

3 Ways Businesses Use Network Virtualization. A Faster Path to Improved Security, Automated IT, and App Continuity

Securing Amazon Web Services (AWS) EC2 Instances with Dome9. A Whitepaper by Dome9 Security, Ltd.

GETTING GREAT PERFORMANCE IN THE CLOUD

arcserve r16.5 Hybrid data protection

VMWARE EBOOK. Easily Deployed Software-Defined Storage: A Customer Love Story

SoftNAS Cloud Data Management Products for AWS Add Breakthrough NAS Performance, Protection, Flexibility

Introduction to Database Services

Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme

Software Defined Storage for the Evolving Data Center

Oracle IaaS, a modern felhő infrastruktúra

Storage Solutions for VMware: InfiniBox. White Paper

ECONOMICAL, STORAGE PURPOSE-BUILT FOR THE EMERGING DATA CENTERS. By George Crump

THE SUMMARY. CLUSTER SERIES - pg. 3. ULTRA SERIES - pg. 5. EXTREME SERIES - pg. 9

Choosing the Right Container Infrastructure for Your Organization

Your Complete Guide to Backup and Recovery for MongoDB

EBOOK: VMware Cloud on AWS: Optimized for the Next-Generation Hybrid Cloud

Deep Dive on SimpliVity s OmniStack A Technical Whitepaper

5 reasons why choosing Apache Cassandra is planning for a multi-cloud future

Important DevOps Technologies (3+2+3days) for Deployment

The Three Data Challenges

Enterprise Architectures The Pace Accelerates Camberley Bates Managing Partner & Analyst

EMC XTREMCACHE ACCELERATES VIRTUALIZED ORACLE

Nutanix Tech Note. Virtualizing Microsoft Applications on Web-Scale Infrastructure

ClickHouse on MemCloud Performance Benchmarks by Altinity

Lenovo Validated Designs

SOLUTION BRIEF Fulfill the promise of the cloud

Modernize Your Backup and DR Using Actifio in AWS

HPE SimpliVity 380. Simplyfying Hybrid IT with HPE Wolfgang Privas Storage Category Manager

VMware Virtual SAN Technology

WHITE PAPER. RedHat OpenShift Container Platform. Benefits: Abstract. 1.1 Introduction

NetApp Clustered ONTAP & Symantec Granite Self Service Lab Timothy Isaacs, NetApp Jon Sanchez & Jason Puig, Symantec

THE STATE OF CONTAINERS

Zadara Enterprise Storage in

SIMPLE, FLEXIBLE CONNECTIONS FOR TODAY S BUSINESS. Ethernet Services from Verizon

TCO REPORT. NAS File Tiering. Economic advantages of enterprise file management

Comprehensive Backup & Disaster Recovery with Vembu BDR Suite.

Virtuozzo Containers

WHITE PAPER PernixData FVP

BERLIN. 2015, Amazon Web Services, Inc. or its affiliates. All rights reserved

Merging Enterprise Applications with Docker* Container Technology

MODERNIZE INFRASTRUCTURE

EASILY DEPLOY AND SCALE KUBERNETES WITH RANCHER

7 Things ISVs Must Know About Virtualization

Migrating Enterprise Applications to the Cloud Session 672. Leighton L. Nelson

Welcome! IT Executive Sagamore Spirit

VMware vsphere 4. The Best Platform for Building Cloud Infrastructures

File backup and recovery programs store duplicate copies of data to protect you from data loss.

Storage Strategies for vsphere 5.5 users

Eliminate the Complexity of Multiple Infrastructure Silos

Why In-Place Copy Data Management is The Better Choice

Aurora, RDS, or On-Prem, Which is right for you

Data Protection for Cisco HyperFlex with Veeam Availability Suite. Solution Overview Cisco Public

RACKSPACE ONMETAL I/O V2 OUTPERFORMS AMAZON EC2 BY UP TO 2X IN BENCHMARK TESTING

Falling Out of the Clouds: When Your Big Data Needs a New Home

Protecting Microsoft Hyper-V 3.0 Environments with Arcserve

Cohesity Flash Protect for Pure FlashBlade: Simple, Scalable Data Protection

Nimble Storage Adaptive Flash

Microsoft Office 365 for Business. Your office-on-the-go. Get more work done virtually anytime, anywhere, on any device.

The intelligence of hyper-converged infrastructure. Your Right Mix Solution

Hyper-Convergence De-mystified. Francis O Haire Group Technology Director

Ten things hyperconvergence can do for you

Agenda. AWS Database Services Traditional vs AWS Data services model Amazon RDS Redshift DynamoDB ElastiCache

Exploring Cloud Security, Operational Visibility & Elastic Datacenters. Kiran Mohandas Consulting Engineer

2018 Cisco and/or its affiliates. All rights reserved.

White Paper. Platform9 ROI for Hybrid Clouds

Top 5 Reasons to Consider

Transcription:

Whitepaper Rok: Decentralized storage for the cloud native world Cloud native applications and containers are becoming more and more popular, as enterprises realize their benefits: containers are great at providing a portable, isolated, and lightweight execution environment for applications. However, things start to get very tricky when stateful applications enter the game, because they depend on storage to persist their state. And portability of storage is a much harder problem to solve, by far. We argue that in the cloud native world there should be a unified data management/ distribution layer that provides all the necessary data services (clones, snapshots), and is totally independent from the persistence layer, which is where the data reside. Moreover, this data management/distribution layer should not depend on the environment that the application runs. IT admins demand performance, flexibility, and high availability for their stateful apps. Their constant dilemma is whether they should deploy their workloads on shared or local storage. On the one hand, shared storage provides flexibility, and high availability, but it is expensive and lacks in performance. On the other hand, local storage delivers high IOPS with extremely low latency, but it misses the flexibility feature. In the case of cloud native applications, high availability at the storage layer is not a real issue, since the design of cloud native apps caters for high availability from the very start. However, operational flexibility is a must have! What you really need is running your application over fast, local SSDs, while keeping the flexibility of shared storage at the same time. Arrikto builds Rok: Decentralized storage for the cloud native world. Rok is the global data distribution layer that makes your apps truly performant and at the same time truly portable, in a fraction of the cost of a shared storage solution. Rok allows you to run your stateful containers over fast, local storage onprem or on the cloud, and still be able to snapshot the whole application, along with its data and distribute it efficiently: across machines of the same cluster, or across distinct locations and administrative domains over a decentralized network. 1

Local storage Shared storage Local storage & Rok Performace Cost Flexibility High Availability* Figure 1: Comparing the available storage solutions * cloud native apps are highly available by design We think performance and flexibility shouldn t be mutually exclusive anymore. One should have both. Everywhere. Rok is fully integrated with Kubernetes, whether it runs on AWS, GCP, VMware, bare metal onprem, and with Docker, so you can even run it on your laptop. Shared Storage + Rok Comparison Storage Type EBS EBS (io1) Local Storage Capacity (raw) 100 TB 100 TB Instance Type c4.4xlarge i3.44xlarge Number of instances 27 27 Total vcpus 432 432 same Total GB of RAM 810 3,294 4x better Nominal aggregate write IOPS 432 K 9,720 K 22x better Nominal aggregate read IOPS 432 K 22,275 K 51x better Cost per month $50,838 $18,016 64% cheaper Figure 2: Comparison of running a Cassandra cluster on AWS over shared storage (EBS), and local backed instances along with Rok. Read the article here Key Features & Benefits Performance Rok allows you to run your applications over local SSDs inside your data center, and backed instances on the cloud. This way, you enjoy the blazing fast storage that SSDs offer: 2

extremely high IOPS I/O latency in the order of μs massive scaleout, with excellent scalability as the cluster grows significant cost savings compared to running over shared storage Flexibility Rok gives you the flexibility of shared storage: fast node migrations local backups offsite backups Your applications may run over local SSDs, but you keep the flexibility feature, as if they were running over shared storage. Rok provides a data management layer that makes it possible for you to instantly snapshot your containers for local and offsite backups, and keep these backups in an immutable backup store, e.g. S3. Combined with Rok s instant cloning, you can recover from hardware failure in minutes, regardless of the volume s capacity. Pod 2 Pod 1 (a) Critical path Rok (b) Recovery path (c) Snapshot path Amazon S3 Figure 3: Architecture: A single that uses SSD for the critical path, and Rok on the side for the data services (a) critical path: goes directly through the Linux kernel to, without any Rok involvement. Minimal latency. (b) recovery path: Rok intervenes to produce a readytouse, thinly cloned volume from snapshot data. This is completely transparent to the cloud native application. In the background, Rok continues to fetch data from the immutable backup store to the local. When this is done, operation reverts to (a). (c) snapshot path: Rok sits on the side, tracking the set of blocks that the application has touched on the local. Periodically, Rok produces a new application snapshot by incorporating these changes into a new deduplicated snapshot, and pushes it to the snapshot store. Users set snapshot policies per application, per container, or per individual volume. 3

Collaboration at global scale Running your applications over local storage together with Rok provides you with a feature that was never possible before, collaboration at global scale. This means that you can take a snapshot of a whole application, along with its data, and share it with another user of a completely distinct administrative domain, at a distinct location. This is a perfect match for test & dev use cases, analytics, or forensics. Rok connects the whole of your global infrastructure. It runs onprem, on any cloud, and on your laptop, and turns your silos into a decentralized storage network, enabling global, end userdriven collaboration workflows. Rok cluster Rok Rok Rok Rok Amazon S3 Figure 4: The Rok Cluster: Run your applications over blazingfast SSDs, and use Rok on the side to enjoy sharedstoragelike flexibility Solving the 4 major problems stateful apps face today Data Mobility Multicloud Environments It is out of the question that enterprises desire workload portability. Stateful applications should be fully portable, so that companies can take full advantage of the new multicloud world that is already here. Containers have already solved the problem of runtime portability. However, data is not portable yet, and this is a big obstacle to adopting containers in the enterprise world. An important reason why data portability does matter is data gravity, which means that apps have to be physically close to the data they manage for reasons of efficiency. Rok solves the problem of data mobility by providing a data management and distribution layer that makes data fully portable, so that they can move freely towards the app. All this, without making compromises regarding performance. Rok enables you to share an entire application stack, among different regions of the same public cloud, or even among different public or private clouds. Spawning time and node migrations Your workloads run directly over local storage with latency in the microseconds. Rok works on the side of your local storage stack, and provides all the required data services (efficient snapshots, clones). This enables instant container recovery on any other node. You can now spawn the container on any other node of the cluster from a previous 4

snapshot and be up and running instantly regardless of volume size. Rok only has to intervene during node recovery, which is transparent to the application. It does not invoke an applicationwide data recovery and rebalancing operation, putting load on the whole cluster and impacting application responsiveness. Instead, it performs blocklevel recovery of this specific node from the Rok backup store, e.g., S3, with predictable performance. Let s illustrate this with an example: if you lose an entire Cassandra node, then your database will continue operating normally, as Rok in combination with Kubernetes will present another Cassandra node. This node will have the date of the latest snapshot that resides on the Rok snapshot store. In this case, Cassandra only has to recover the changed parts, but this is just a small fraction of the node data, and does not cause CPU load on the whole cluster. Backup and Recovery In the enterprise world, it is of vital importance to be able to back up and recover entire applications along with their data. Companies cannot afford data loss, while even the slightest outage time costs a lot of money. Rok ensures business continuity in the cloud native world in case of a failure. It enables you to back up a whole application by creating a group consistent snapshot of its containers in an appindependent way. Moreover, you can have these snapshots available in all your locations. Therefore, onsite and offsite backup becomes a single operation, defined by the desired backup policies. Rok allows you to unify your whole infrastructure, and move your data around over a decentralized network. A location can be a data center, a public cloud, a branch office, or even a laptop, anywhere in the world. Kubernetes cluster 1 Rok Registry Kubernetes cluster 3 Refs Rok Cluster 1 Rok Cluster 3 Instance Store Amazon S3 Data DAS Kubernetes cluster 2 Shared Storage (OnPrem DC) Rok Cluster 2 Rok Cluster 4 Kubernetes cluster 4 Local SSDs GCP Cloud Storage Local SSD Laptop Figure 5: The Rok Network: Rok enables you to share your snapshots among locations over a decentralized network Collaboration workflows with real data Rok allows you to take a snapshot of an entire application along with its data, and share it with another user of a completely distinct administrative domain, at a distinct location. This is a perfect match for test & dev workflows, analytics, or forensics. CI/CD 5

test cases can now run using real data, in parallel, as Rok enables developers and CI/CD infrastructure to exchange data seamlessly. Developers can share the whole application, as well as its state, with other developers, to ensure that all developers work with real data. Moreover, they can now run tests with real, productionderived data. You don t have to worry anymore, that something is going to break when deploying in production. Using Rok, the IT administrator can share a snapshot of the production database with the developers. The developers can thinclone the snapshot on development and preproduction infrastructure. Arrikto Inc. 3505 El Camino Real, Palo Alto, CA 94306 www.arrikto.com rev. 20180206 Copyright 2018 Arrikto Inc. All rights reserved. Rok and Rok Registry are trademarks of Arrikto Inc. All other marks and names mentioned herein may be trademarks of their respective companies. Rok and the Rok Network are built on patentpending technology. 6