Beyond 1001 Dedicated Data Service Instances


Introduction

The Challenge

Given: Application platform based on Cloud Foundry to serve thousands of apps

Application Runtime

Many platform users who don't know each other. Different app languages & frameworks. 100% on-demand self-service: no involvement of the platform operator necessary. Instant scalability & self-healing.

Easy Deployment $> cf push myapp

Runtime abstraction

Java code -> staging with the Java Buildpack -> Java droplet -> execution

Ruby code -> staging with the Ruby Buildpack -> Ruby droplet -> execution

To the Cloud Foundry Runtime, all droplets are equal.

A droplet, like a container image, is something you can execute in a container via its $START_CMD.

Abstraction enables further assumptions & automation

Scaling Apps: assume this status quo in the Cloud Foundry Runtime. App#1 runs Instance#1; App#2 runs Instance#1 and Instance#2.

App Scalability $> cf scale app#1 -i 3

Scaling Apps: two additional instances have been created. App#1 now runs Instance#1, Instance#2 and Instance#3; App#2 still runs Instance#1 and Instance#2.

App Self-Healing

Everything is healthy: App#1 runs Instance#1, Instance#2 and Instance#3; App#2 runs Instance#1 and Instance#2.
App#1 Instance#2 is failing.
App#1 Instance#2 is gone temporarily.
App#1 Instance#2 is re-created.
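Scaling and self-healing can both be read as the same idea: compare the desired number of instances with what is actually running and close the gap. The following is a minimal sketch of that idea only. The `reconcile` function and its data structures are hypothetical and are not how the Cloud Foundry Runtime is implemented.

```python
from collections import Counter

def reconcile(desired, running):
    """Return the start/stop actions needed to reach the desired instance counts.

    desired: dict mapping app name -> desired instance count (hypothetical model)
    running: list with one app name per running instance
    """
    actual = Counter(running)
    actions = []
    for app in sorted(set(desired) | set(actual)):
        diff = desired.get(app, 0) - actual[app]
        if diff > 0:
            actions += [f"start {app}"] * diff
        elif diff < 0:
            actions += [f"stop {app}"] * (-diff)
    return actions

# Status quo: one instance of app#1, two instances of app#2.
running = ["app#1", "app#2", "app#2"]
desired = {"app#1": 1, "app#2": 2}

# Scaling: the desired count of app#1 is raised to 3.
desired["app#1"] = 3
scale_actions = reconcile(desired, running)
running += ["app#1", "app#1"]

# Self-healing: one instance of app#1 fails and disappears.
running.remove("app#1")
heal_actions = reconcile(desired, running)
```

Because the instances carry no state, "start another one" is a complete repair action in both cases.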

What does paradise look like for backing services?

Missing: A solution to serve thousands of data services

Application Runtime Data Services

The Mission

Providing a growing number of data services (a9s PostgreSQL, a9s MongoDB, a9s RabbitMQ, a9s Elasticsearch, a9s Redis, a9s LogMe) with full lifecycle automation of thousands of data service instances across a wide range of infrastructures, in both public and on-premise clouds, and integrating well with multiple platforms.

Requirements

Portability, Security, Usability, Scalability, Performance, Maintainability, Robustness, Manageability, Flexibility, On-demand self-service, Extensibility, Multi-tenancy

Portability, Scalability, Production-Readiness, On-demand self-service

Design

How to build it?

Data Service Provisioning API | Automation Middleware | Data Service Automation

Open Service Broker, a new industry standard for data service provisioning.

Open Service Broker API Supporters: Google, Pivotal, IBM, Red Hat, Fujitsu, SAP

Supporting Platforms: Cloud Foundry, OpenShift, Kubernetes, more to come

Open Service Broker API overview (http://docs.cloudfoundry.org/services/api.html#api-overview)

Service Catalog: GET /v2/catalog. Delivers metadata about the data service.
Create Service Instance (provision): PUT /v2/service_instances/:id. Provisions the VM(s), installs and configures the data service. The VMs / cluster represent a service instance.
Create Service Binding (bind): PUT /v2/service_instances/:instance_id/service_bindings/:id. Creates a data service user and returns credentials representing a service binding.
Delete Service Binding (unbind): DELETE /v2/service_instances/:instance_id/service_bindings/:id. Removes the credentials associated with the service binding.
Delete Service Instance (unprovision): DELETE /v2/service_instances/:id. Destroys the VMs and data associated with the service instance.
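To make the five operations concrete, here is a toy in-memory broker that dispatches on HTTP verb and path. It is a sketch under assumptions: the class `ToyBroker`, the service name `toy-db`, the generated credentials and the simplified status codes and payloads are all illustrative, and nothing is actually provisioned. A real broker must follow the full Open Service Broker API specification.

```python
class ToyBroker:
    """Hypothetical in-memory broker; keeps instances and bindings in a dict."""

    def __init__(self):
        # instance_id -> {"plan_id": ..., "bindings": {binding_id: credentials}}
        self.instances = {}

    def handle(self, method, path, body=None):
        """Return (status, payload) for one of the five operations."""
        body = body or {}
        parts = path.strip("/").split("/")
        if method == "GET" and parts == ["v2", "catalog"]:
            # Simplified catalog; the real format carries more fields.
            return 200, {"services": [{"name": "toy-db", "plans": [{"name": "single-small"}]}]}
        if len(parts) == 3 and parts[:2] == ["v2", "service_instances"]:
            instance_id = parts[2]
            if method == "PUT":
                # A real broker would create VMs and install the data service here.
                self.instances[instance_id] = {"plan_id": body.get("plan_id"), "bindings": {}}
                return 201, {}
            if method == "DELETE":
                found = self.instances.pop(instance_id, None)
                return (200, {}) if found is not None else (410, {})
        if len(parts) == 5 and parts[:2] == ["v2", "service_instances"] and parts[3] == "service_bindings":
            instance = self.instances.get(parts[2])
            if instance is None:
                return 404, {}
            binding_id = parts[4]
            if method == "PUT":
                # A real broker would create a data service user here.
                credentials = {"username": f"user-{binding_id}", "password": "not-a-real-secret"}
                instance["bindings"][binding_id] = credentials
                return 201, {"credentials": credentials}
            if method == "DELETE":
                found = instance["bindings"].pop(binding_id, None)
                return (200, {}) if found is not None else (410, {})
        return 404, {}

broker = ToyBroker()
catalog_status, catalog = broker.handle("GET", "/v2/catalog")
broker.handle("PUT", "/v2/service_instances/i-1", {"plan_id": "single-small"})
bind_status, binding = broker.handle("PUT", "/v2/service_instances/i-1/service_bindings/b-1")
```

The API only fixes the verbs and paths. What happens behind each call (a database in a shared cluster, a container, a dedicated VM) is left to the broker.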

Data Service Provisioning API | Automation Middleware | Data Service

The Open Service Broker API does not define what a service instance is.

Applying the design pattern: On-Demand Provisioning of Dedicated Data Service Instances

Result

Using a Service Broker with Cloud Foundry: $> cf create-service

Easy Deployment $> cf create-service mongodb single-small my-single-mongo-1

Result: service instance my-single-mongo-1, backed by MongoDB VM#1.

Easy Deployment $> cf create-service mongodb cluster-small my-3node-mongo-cluster-2

Newly created service instance: my-3node-mongo-cluster-2, backed by MongoDB VM#1, VM#2 and VM#3, next to my-single-mongo-1 (MongoDB VM#1).

Technical Challenges

State

State is handled differently in each backing service. The operational model will be different: replication, failure detection, failover. The data service automation will be different.

Where to store state?

Recall app self-healing in the Cloud Foundry Runtime: everything is healthy, App#1 Instance#2 fails, it is gone temporarily, and it is re-created.

App self-healing is easy because there is NO STATE. How can we store state and still be able to perform self-healing?

Store state on a remotely attached block device = persistent disk.

[Diagram: Infrastructure as a Service (IaaS), e.g. OpenStack. Behind the IaaS API sits a virtual datacenter with a router and a storage tier of storage nodes, each holding several HDDs. A virtual machine with its operating system uses a storage volume provided by the storage nodes.]

A persistent disk has a file system. File systems may fail, so replication / clustering and backups are still very important.

The data lifecycle has been decoupled from the VM lifecycle. The VM becomes disposable.

Storing state

What needs to be automated?

Data Service Instance Lifecycle

Lifecycle of a Data Service Instance:
Provision a data service server
Install data service software
Configure data service software
Consume data service with apps
Debug data service issues
Update data service version
Update operating system
Backup & recover data
Scale out data service VM(s)
Destroy data service & DB VM(s)

Can you do that x * 1000 times?

You either automate it or delegate it to the app developer.

Automation

BOSH

BOSH lets you orchestrate the lifecycle of large-scale deployments of stateful distributed systems on infrastructure.

[Diagram sequence: BOSH CLI $> bosh deploy. The BOSH CLI talks to the BOSH API; BOSH runs in a virtual machine and uses its CPI to call the IaaS API (e.g. OpenStack) of a virtual datacenter with router and storage nodes. Step by step, a new virtual machine appears with operating system and BOSH Agent, gets a storage volume, and then runs PostgreSQL. The final picture shows several virtual machines, each with a BOSH Agent, running PostgreSQL, Cloud Controller and UAA.]

BOSH Automation

BOSH Releases contain the automation

A BOSH deployment depends on 1..* stemcells

A BOSH deployment is described by a release & a manifest

A release describes 1..* jobs

A release contains 1..* packages

A BOSH deployment's settings are contained in a manifest

Infrastructure settings are contained in the cloud config
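The relationships above can be written down as a small data model. This is only an illustrative sketch: the class and field names are assumptions chosen for readability and do not mirror the actual BOSH manifest or release file formats.

```python
from dataclasses import dataclass
from typing import Dict, List

def _require_one_or_more(items, what):
    """Enforce the 1..* cardinality from the slides."""
    if not items:
        raise ValueError(f"need at least one {what}")

@dataclass
class Release:
    name: str
    jobs: List[str]       # how to run processes
    packages: List[str]   # software to compile and install

    def __post_init__(self):
        _require_one_or_more(self.jobs, "job")
        _require_one_or_more(self.packages, "package")

@dataclass
class Deployment:
    release: Release
    manifest: Dict[str, object]      # deployment settings
    stemcells: List[str]             # OS image + BOSH agent
    cloud_config: Dict[str, object]  # infrastructure settings

    def __post_init__(self):
        _require_one_or_more(self.stemcells, "stemcell")

# All names and values below are made up for illustration.
release = Release(name="example-postgres", jobs=["postgres"], packages=["postgres-binaries"])
deployment = Deployment(
    release=release,
    manifest={"instances": 1},
    stemcells=["example-ubuntu-stemcell"],
    cloud_config={"iaas": "openstack"},
)
```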

BOSH makes your deployments

Infrastructure Independent

A BOSH release contains the main automation (software packages, how to run processes). BOSH releases can be re-used on every* infrastructure.

Automate once, deploy everywhere.

[Diagram sequence: one BOSH CLI and three BOSH installations, one each on VMware, AWS and OpenStack. The CLI targets one of them, e.g. $> bosh target http://bosh-on.aws.com, then runs $> bosh deploy. Each BOSH manages virtual machines that run some service / app plus a BOSH Agent.]

Switch deployment between clouds: keep the same release, use a stemcell specific to the new cloud, adapt the cloud config.

Operating System Independent

A BOSH release does not depend on the OS

The only dependency on the OS is a BOSH stemcell

OS image + BOSH agent = stemcell, e.g. an Ubuntu stemcell running in a virtual machine.

Changing the OS of a BOSH deployed system: keep the same release, change the stemcell, change the manifest.

Scalable

Horizontal Scaling

[Diagram: Horizontal Scaling. Rows of identical virtual machines, each running some service plus a BOSH Agent.]

Scaling out a BOSH deployed system: keep the same release, use the same stemcell, change the manifest.
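The three change scenarios (switching clouds, changing the OS, scaling out) differ only in which deployment inputs are touched, and the release is never one of them. A small sketch makes that visible. The `DeploymentInputs` type and all of its string values are hypothetical placeholders, not real BOSH artifacts.

```python
from dataclasses import dataclass, fields, replace

@dataclass(frozen=True)
class DeploymentInputs:
    release: str
    stemcell: str
    manifest: str
    cloud_config: str

def changed_inputs(before, after):
    """Names of the inputs that differ between two deployments, sorted."""
    return sorted(
        f.name for f in fields(before)
        if getattr(before, f.name) != getattr(after, f.name)
    )

base = DeploymentInputs(
    release="example-release",
    stemcell="ubuntu-for-openstack",
    manifest="instances: 1",
    cloud_config="openstack settings",
)
switch_cloud = replace(base, stemcell="ubuntu-for-aws", cloud_config="aws settings")
change_os = replace(base, stemcell="other-os-for-openstack", manifest="instances: 1, other OS")
scale_out = replace(base, manifest="instances: 3")

scenarios = {
    "switch cloud": changed_inputs(base, switch_cloud),
    "change OS": changed_inputs(base, change_os),
    "scale out": changed_inputs(base, scale_out),
}
```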

Vertical Scaling

[Diagram sequence: a virtual machine with 4 GB RAM and 1 vCPU runs PostgreSQL and a BOSH Agent, with its data on a 10 GB persistent disk. The disk is detached, and a virtual machine with 8 GB RAM and 2 vCPUs takes over. It mounts the 10 GB persistent disk next to a new 20 GB persistent disk, the data is copied to the 20 GB disk, and finally only the 20 GB persistent disk remains.]

BOSH Deployments are Predictable

Source code is compiled in freshly created VMs. VMs always contain exactly the software specified in the release. No left-overs of prior deployments, as new VMs are used.

BOSH Deployments are Repeatable

Executing a specific BOSH deployment always leads to the exact same deployed system.

Monitored & Self-Healing

Self-healing process failures

[Diagram: a BOSH installation (BOSH Health Monitor, BOSH Director with CPI, NATS message bus, BOSH Registry, blob store) and the infrastructure resources it manages: three virtual machines, each with a BOSH Agent, a process monitor and a deployed process.]

Self-healing process monitor failures

[Diagram sequence: BOSH installation (BOSH Health Monitor, BOSH Director with CPI, NATS message bus, BOSH Registry, blob store) and three managed virtual machines. The process monitor on one virtual machine disappears and is then restored.]

Self-healing VM failures

[Diagram: BOSH installation (BOSH Health Monitor, BOSH Director with CPI, NATS message bus, BOSH Registry, blob store) and three managed virtual machines, each with a BOSH Agent, a process monitor and a deployed process, illustrating a VM failure.]

Self-healing BOSH Agent failures

[Diagram: BOSH installation (BOSH Health Monitor, BOSH Director with CPI, NATS message bus, BOSH Registry, blob store) and three managed virtual machines, each with a BOSH Agent, a process monitor and a deployed process, illustrating a BOSH Agent failure.]

Data Service Provisioning API | Automation Middleware | Data Service Automation

Reference Architecture

[Diagram, reconstructed from its labels]
CF Client -> Cloud Controller: create service
Cloud Controller -> a9s Service Broker (Cloud Foundry Adapter): create service, create binding
a9s Service Broker (a9s PostgreSQL SPI): create service specific credentials
a9s Service Broker (Middleware Adapter) -> a9s Deployer: create deployment from template xy with attributes { }
a9s Deployer (Templates, Deployments) -> BOSH: deploy release abc & deployment manifest xyz
BOSH: execute deployments
Service instances: my-single-postgres-1 (VM#1), my-3node-postgres-cluster-2 (VM#1, VM#2, VM#3), my-3node-postgres-cluster-3 (VM#1, VM#2, VM#3)
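The step "create deployment from template xy with attributes { }" can be pictured as merging a per-plan template with request attributes into a deployment manifest. The sketch below illustrates only that step and rests on assumptions: `TEMPLATES`, `create_deployment` and the manifest keys are hypothetical and do not describe the a9s Deployer's real interface. Only the plan names `single-small` and `cluster-small` are borrowed from the `cf create-service` examples.

```python
import copy

# Hypothetical plan templates; keys and values are illustrative only.
TEMPLATES = {
    "single-small": {"instances": 1, "vm_type": "small"},
    "cluster-small": {"instances": 3, "vm_type": "small"},
}

def create_deployment(instance_name, plan, attributes=None):
    """Build a manifest-like dict from a plan template plus attributes."""
    manifest = copy.deepcopy(TEMPLATES[plan])  # never mutate the template
    manifest.update(attributes or {})
    manifest["name"] = instance_name
    return manifest

def vm_names(manifest):
    """List the VMs a deployment of this manifest would consist of."""
    return [f"{manifest['name']} VM#{i}" for i in range(1, manifest["instances"] + 1)]

single = create_deployment("my-single-postgres-1", "single-small")
cluster = create_deployment("my-3node-postgres-cluster-2", "cluster-small")
```

In this picture the resulting manifest, together with the release, is what gets handed to BOSH for execution.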

Data Service Provisioning API | Automation Middleware | Data Service Automation

Portability. BOSH: multi-infrastructure support. Open Service Broker API: multi-platform support.
Scalability. On-demand provisioning of dedicated service instances. BOSH: scale existing service instances vertically, solo & clustered instances.
Production Readiness. Dedicated data service instances / strong instance isolation. BOSH: self-healing, clustered service instances, backup & restore.
On-demand self-service. Open Service Broker API, on-demand provisioning, on-demand updates, on-demand backup & restore.

a9s PostgreSQL, a9s MongoDB, a9s RabbitMQ, a9s Elasticsearch, a9s Redis, a9s LogMe

Operations

Continuous Data Service Delivery

Delivering Data Service Patches. Building new data service releases: open source PostgreSQL upstream release -> build -> test -> a9s PostgreSQL release. Updating the data services: the a9s release is rolled out to Platform #1, Platform #2, ... Platform #n, where the data service instances are updated.

Common Maintenance Tasks to Be Performed at Scale

Create Service Instance: create VM, install and start data services.

Vertical Scale Service Instance: destroy old VM, create new VM, mount old persistent disk, create and mount new persistent disk, copy data. Optional: reintegrate into the cluster.

OS Update: destroy old VM, create new VM based on new stemcell (with new OS version), attach persistent disk.
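The vertical scaling task is a good example of the disposable-VM idea: the VM is thrown away while the data survives on the persistent disk. The simulation below walks through the listed steps with plain Python objects. `Disk`, `VM` and `scale_vertically` are hypothetical stand-ins, and dropping the old disk at the end is an assumption that the step list does not spell out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

@dataclass
class Disk:
    size_gb: int
    data: Dict[str, str] = field(default_factory=dict)

@dataclass
class VM:
    ram_gb: int
    vcpus: int
    disks: List[Disk] = field(default_factory=list)

def scale_vertically(old_vm: VM, ram_gb: int, vcpus: int, disk_gb: int) -> Tuple[VM, List[str]]:
    """Simulate the vertical scaling steps; returns the new VM and a step log."""
    steps = []
    old_disk = old_vm.disks[0]
    old_vm.disks.clear()                 # the disk outlives the VM
    steps.append("destroy old VM")
    new_vm = VM(ram_gb, vcpus)
    steps.append("create new VM")
    new_vm.disks.append(old_disk)
    steps.append("mount old persistent disk")
    new_disk = Disk(disk_gb)
    new_vm.disks.append(new_disk)
    steps.append("create and mount new persistent disk")
    new_disk.data.update(old_disk.data)
    steps.append("copy data")
    new_vm.disks = [new_disk]            # assumption: the old disk is detached afterwards
    return new_vm, steps

old = VM(ram_gb=4, vcpus=1, disks=[Disk(10, {"table": "rows"})])
new, steps = scale_vertically(old, ram_gb=8, vcpus=2, disk_gb=20)
```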

Ultimate Question

Can you handle more than 1001 Data Service Instances?

Yes.

Excerpt from our perf tests:

Provisioning 1001 instances in sequence over a longer period of time does not expose any significant bottleneck.

For large, heavily used platforms the number of simultaneous deployments may become relevant.

Data Service Instances | BOSH Queue Time | Avg. Time to Provision VM | Total Time Needed to Create Instances
250 | 14 min | 6:57 min | 21 min
500 | 29 min | 7:01 min | 36 min
750 | 46 min | 7:13 min | 53 min
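A quick arithmetic check shows how the columns relate: in every row the total is the BOSH queue time plus the average provisioning time, rounded to the minute. The sketch below recomputes this from the reported figures and also derives the queue time per instance. Reading the total as a plain sum is an interpretation of the table, not something the slides state.

```python
def to_seconds(text):
    """Parse 'M' or 'M:SS' (minutes, optional seconds) into seconds."""
    minutes, _, seconds = text.partition(":")
    return int(minutes) * 60 + int(seconds or 0)

# (instances, queue time, avg. provisioning time, reported total in minutes)
rows = [
    (250, "14", "6:57", 21),
    (500, "29", "7:01", 36),
    (750, "46", "7:13", 53),
]

computed_totals = []
queue_seconds_per_instance = []
for instances, queue, provision, reported_total in rows:
    total_seconds = to_seconds(queue) + to_seconds(provision)
    computed_totals.append(round(total_seconds / 60))
    queue_seconds_per_instance.append(round(to_seconds(queue) / instances, 2))
```

In these figures the queue time grows with the number of instances (about 3.4 to 3.7 seconds per instance), while the time to provision a single VM stays close to 7 minutes.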

Optimization Task: Manage BOSH queueing time to an acceptable level.

Scaling BOSH is key to deal with simultaneous provisioning.

Sum Up

Full lifecycle automation is feasible.
Open Service Broker API, a new standard.
Choosing the right automation technology is key, e.g. BOSH.
CI/CD based dev and ops are essential.

Questions?

Common Data Service Design Patterns
A. Shared VM cluster
B. Dedicated containers
C. Dedicated VMs / VM clusters

Scaling a shared VM cluster

[Diagram: one MongoDB cluster of 3 VMs (MongoDB VM #1, #2, #3) hosts all service instances. Service Instance #1 = database #1, Service Instance #2 = database #2, and so on up to Service Instance #n-max = database #n-max.]

Low costs per service instance.

Simple service broker logic:

create service -> create a database

create service binding -> create a database user

Weak Isolation!

Structural Limitation!

Scaling a shared VM cluster: what to do when the shared cluster is full?

[Diagram sequence: MongoDB Cluster #1 (MongoDB VM #1, #2, #3) is full with Service Instance #1 to #n-max. A second cluster, MongoDB Cluster #2 (MongoDB VM #4, #5, #6), is added and hosts Service Instance #n+1 to #2*n-max.]

The simple service broker logic turns into complex service broker logic.

Fragmentation

[Diagram sequence: MongoDB Cluster #1 and MongoDB Cluster #2, 3 VMs each, start out fully occupied. After deletions, Cluster #1 still holds service instances such as #1, #3 and #5, while Cluster #2 holds only a few such as #n+2 and #n+3, leaving gaps in both clusters.]

Caused by frequent creation and deletion of service instances.

Placement Problem

Placement Problem: both clusters are partially filled (MongoDB Cluster #1 holds Service Instances #1, #3, #5, ... #n-max; MongoDB Cluster #2 holds #n+2, #n+3, ... #2*n-max). Which cluster should host a new service instance? A strategy for placing new service instances is required and may need data-service-specific logic.
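The placement question can be made concrete with a minimal sketch. It assumes every service instance occupies one slot and every cluster has the same limit n_max; the function name choose_cluster and the prefer_fullest switch are hypothetical and not taken from the talk. Filling the fullest cluster first keeps other clusters empty, filling the emptiest one spreads load; which is better depends on the data service.

```python
def choose_cluster(load, n_max, prefer_fullest=True):
    """Return the name of the cluster that should host a new instance, or None.

    load maps cluster name -> current number of service instances.
    n_max is the assumed per-cluster instance limit.
    """
    candidates = {name: count for name, count in load.items() if count < n_max}
    if not candidates:
        return None  # every cluster is full; a new cluster is needed
    pick = max if prefer_fullest else min
    # Iterate over sorted names so ties resolve deterministically.
    return pick(sorted(candidates), key=lambda name: candidates[name])


load = {"cluster-1": 5, "cluster-2": 3}
print(choose_cluster(load, n_max=8))
print(choose_cluster(load, n_max=8, prefer_fullest=False))
```

A real broker would also have to weigh data-service-specific signals (disk usage, connection counts), which this sketch deliberately ignores.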

Cluster Rebalancing

Cluster Rebalancing: after further deletions, MongoDB Cluster #2 (3 VMs) holds only Service Instances #n+2 and #2*n-max, while MongoDB Cluster #1 (3 VMs) holds #1, #3, #5, ... #n-max. An unbalanced set of clusters wastes infrastructure resources.

Cluster Rebalancing: moving Service Instances #n+2 and #2*n-max into MongoDB Cluster #1 empties Cluster #2. A cluster rebalance that frees infrastructure resources would be desirable.
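A rough sketch of how a rebalance plan could be computed, assuming each service instance takes exactly one slot and every cluster has the same limit n_max. The names plan_rebalance and clusters_needed are hypothetical. The sketch only decides what to move where; actually migrating a database between clusters is the data-service-specific part and is not shown.

```python
import math


def clusters_needed(load, n_max):
    """Smallest number of clusters that can hold all current instances."""
    return math.ceil(sum(load.values()) / n_max)


def plan_rebalance(load, n_max):
    """Return (moves, freed): moves is a list of (source, target, count)."""
    # Keep the fullest clusters and drain the emptiest ones.
    order = sorted(load, key=lambda name: (-load[name], name))
    keep = order[:clusters_needed(load, n_max)]
    drain = order[len(keep):]
    free = {name: n_max - load[name] for name in keep}
    moves = []
    for source in drain:
        remaining = load[source]
        for target in keep:
            if remaining == 0:
                break
            count = min(remaining, free[target])
            if count:
                moves.append((source, target, count))
                free[target] -= count
                remaining -= count
    return moves, drain


moves, freed = plan_rebalance({"cluster-1": 5, "cluster-2": 2}, n_max=8)
print(moves, freed)
```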

Shared Cluster Conclusion

Scalability issues can be addressed. Isolation issues are heavily data-service specific. > A generic solution is not possible.

Scaling Dedicated Containers

Better Isolation

Scaling Dedicated Containers: a PostgreSQL Cell consists of 2 VMs across 2 AZs (Docker host VM #1 and Docker host VM #2).

Scaling Dedicated Containers: each service instance (#1, #2, ...) in the PostgreSQL Cell = 2 Docker containers + 2 PostgreSQL processes + 2 PostgreSQL databases, asynchronously replicated.

How to scale?

Scaling Dedicated Containers: the PostgreSQL Cell (2 VMs across 2 AZs: Docker host VM #1 and Docker host VM #2) hosts Service Instance #1 and Service Instance #2, each = 2 Docker containers + 2 PostgreSQL processes + 2 PostgreSQL databases, asynchronously replicated.

Structural Limitation!

Scaling a shared VM cluster What to do when the Cell/Cluster is full?

Scaling Dedicated Containers: PostgreSQL Cell #1 (2 VMs across 2 AZs: Docker host VM #1, #2) and PostgreSQL Cell #2 (2 VMs across 2 AZs: Docker host VM #3, #4). Service Instances #1 and #2 run in Cell #1; further service instances run in Cell #2. Each service instance = 2 Docker containers + 2 PostgreSQL processes.

Same Service Broker Challenge

New Challenge: How to add Cell-VMs on-demand?

On-Demand VM provisioning is unavoidable.
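Why on-demand provisioning becomes unavoidable can be seen in a small sketch of cell allocation. It assumes a fixed number of service instances per cell; the class CellPool and the provision_cell callback are hypothetical stand-ins for whatever actually creates the Docker host VMs.

```python
class CellPool:
    """Places service instances into cells and adds a cell when all are full."""

    def __init__(self, capacity_per_cell, provision_cell):
        self.capacity_per_cell = capacity_per_cell
        self.provision_cell = provision_cell  # callable returning a new cell name
        self.cells = {}  # cell name -> list of instance ids

    def place(self, instance_id):
        for name, instances in self.cells.items():
            if len(instances) < self.capacity_per_cell:
                instances.append(instance_id)
                return name
        # Every cell is full: this is the point where new VMs must be provisioned.
        name = self.provision_cell()
        self.cells[name] = [instance_id]
        return name


counter = iter(range(1, 100))
pool = CellPool(2, lambda: f"cell-{next(counter)}")
print([pool.place(i) for i in range(5)])
```

Once the fallback branch exists, the broker already depends on on-demand VM provisioning, whether it hosts containers or whole service instances.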

Why not delegate most challenges?

On-Demand Dedicated VMs and Clusters

Architecture

Architecture: CF Client > create service > Cloud Controller > create service / create binding > a9s Service Broker (Cloud Foundry Adapter, Middleware Adapter, a9s MongoDB SPI). The broker creates a deployment from template xy with attributes { } and creates service-specific credentials. The a9s Deployer (Templates, Deployments) deploys release abc and deployment manifest xyz through BOSH, which executes the deployments. Resulting service instances: my-single-mongodb-1 (MongoDB VM#1), my-3node-mongodb-cluster-2 (MongoDB VM#1, VM#2, VM#3), my-3node-mongodb-cluster-3 (MongoDB VM#1, VM#2, VM#3).
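The "create deployment from template with attributes" step can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the TEMPLATES dictionary, the Deployer class and the submit callback are hypothetical, and the dictionaries are not the BOSH manifest schema. The submit callback stands in for handing the manifest to the VM orchestrator.

```python
import copy

TEMPLATES = {
    # Hypothetical templates; instance counts follow the slide's examples.
    "single-mongodb": {"release": "mongodb", "instances": 1},
    "3node-mongodb-cluster": {"release": "mongodb", "instances": 3},
}


def render_manifest(template_name, deployment_name, attributes):
    """Copy a template and fill in the per-deployment values."""
    manifest = copy.deepcopy(TEMPLATES[template_name])
    manifest["name"] = deployment_name
    manifest.update(attributes)
    return manifest


class Deployer:
    def __init__(self, submit):
        self.submit = submit  # callable that hands a manifest to the orchestrator
        self.deployments = {}  # deployment name -> manifest

    def create_deployment(self, template_name, deployment_name, attributes=None):
        manifest = render_manifest(template_name, deployment_name, attributes or {})
        self.submit(manifest)
        self.deployments[deployment_name] = manifest
        return manifest


submitted = []
deployer = Deployer(submitted.append)
deployer.create_deployment("3node-mongodb-cluster", "my-3node-mongodb-cluster-2")
print(submitted)
```

The point of the split is that the broker only picks a template and attributes; where the VMs land is left to the orchestrator and the infrastructure.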

Let BOSH do the VM orchestration!

Let the infrastructure solve the placement and fragmentation challenge!

Shared Data Services

Shared PostgreSQL Cluster > Bad idea. Either 1x single PostgreSQL server (1 VM: VM#1 hosting Service Instances 1, 2 and 3) or 1x PostgreSQL cluster (3 VMs: VM#1, VM#2 and VM#3, each hosting Service Instances 1, 2 and 3). Single VM or single cluster of VMs. Single PostgreSQL server or single PostgreSQL cluster. Isolation limited to PostgreSQL multitenancy capabilities.

Shared PostgreSQL = SPOF

Cloud Foundry Runtime: many apps are bound to the service instances of a single shared PostgreSQL Cluster (3 VMs: VM#1, VM#2, VM#3).

If your shared PostgreSQL cluster goes down, all your PostgreSQL database instances go down.

Beware of bad neighborhood

Cloud Foundry Runtime: a shared PostgreSQL Cluster (3 VMs: VM#1, VM#2, VM#3), with every VM hosting several service instances.

Shared clusters are vulnerable to bad neighbors

Dedicated

Dedicated PostgreSQL instances > Good idea. n x Service Instance my-single-postgres-1 (VM#1) and / or m x Service Instance my-3node-postgres-cluster-2 (VM#1, VM#2, VM#3). Service instance = dedicated VM or dedicated cluster of VMs. Uses infrastructure isolation to enable multi-tenancy support.

Cloud Foundry Runtime: every service instance runs on its own VM or on its own cluster of VMs (VM#1, VM#2, VM#3).

PostgreSQL failures are contained. Only one service instance affected.

Bad neighborhood protection with dedicated service instances

Cloud Foundry Runtime: Service Instance #1, Service Instance #2 and Service Instance #3 are each a separate PostgreSQL Cluster of 3 VMs (VM#1, VM#2, VM#3). Infrastructure isolation.

Dedicated clusters isolate bad neighbors

Cloud: Automation, Robustness, Self-Healing, Scalability, On-demand self-service

Resource Type: VM vs. Container
Provisioning: Pre-provisioned vs. On-demand provisioning
Failover Strategy: Resurrection failover vs. Standby failover
Data Redundancy: Single replica vs. Data redundancy / Multiple replicas
Infrastructure Reliability: Perfect (HA VMs. No SPOFs. Never fails. VMs cost more than a ...) vs. Design to fail (Fails from time to time. Saves money.)
Automation Technology: BOSH vs. Chef vs. Puppet
Service Instances: Shared vs. Dedicated

Desired time to repair: Seconds, minutes, hours?
Availability: Service instance availability. Service broker availability.
Configurability: Adapt to local network and security policies. Integrate existing infrastructure.
Accessibility: Remote log-in to service instances.
Performance: Service broker performance (ops/s). Service instance performance. Time to provision a service instance.
Security: Network security. Encryption.
Transparency: Accessing metrics and logs.
Operability: Ease of operation and maintenance.

1. Define a service instance!
2. Define the total # of service instances!
3. Define the # of service instance CRUD ops / min!
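Once the total number of service instances is defined, the VM count follows by simple arithmetic. The sketch below assumes 3-VM clusters and an illustrative per-cluster limit of 100 instances for the shared case; both numbers and the function names are assumptions, not figures from the talk.

```python
import math


def vms_shared(total_instances, n_max, vms_per_cluster=3):
    """VMs needed when up to n_max instances share one cluster."""
    return math.ceil(total_instances / n_max) * vms_per_cluster


def vms_dedicated(total_instances, vms_per_instance=3):
    """VMs needed when every instance gets its own VM(s)."""
    return total_instances * vms_per_instance


print(vms_shared(1001, n_max=100))
print(vms_dedicated(1001))
print(vms_dedicated(1001, vms_per_instance=1))
```

The gap between the two results is the infrastructure price of isolation, which is why the definition of "a service instance" has to come first.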

Common Data Service Design Patterns: A. Shared VM cluster. B. Dedicated containers. C. Dedicated VMs / VM clusters.

Scaling a shared VM cluster

Scaling a shared VM cluster: a MongoDB Cluster of 3 VMs (MongoDB VM #1, #2, #3).

Scaling a shared VM cluster: the MongoDB Cluster (3 VMs) hosts Service Instances #1 to #n-max, each mapped to its own database (#1 to #n-max).

Scaling a shared VM cluster Low costs per service instance

Scaling a shared VM cluster Simple Service Broker Logic

create service = create a database

create service binding = create a database user
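The two mappings above are small enough to sketch end to end. This is a minimal in-memory illustration, not a real broker: SharedCluster stands in for the shared database cluster, and all class and method names are hypothetical.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class SharedCluster:
    """In-memory stand-in for a shared database cluster (hypothetical)."""
    databases: dict = field(default_factory=dict)  # db name -> {username: password}

    def create_database(self, name):
        if name in self.databases:
            raise ValueError(f"database {name!r} already exists")
        self.databases[name] = {}

    def create_user(self, database, username, password):
        self.databases[database][username] = password


class SimpleBroker:
    def __init__(self, cluster):
        self.cluster = cluster

    def create_service(self, instance_id):
        # "create service" maps to "create a database".
        db_name = f"db_{instance_id}"
        self.cluster.create_database(db_name)
        return db_name

    def create_binding(self, instance_id, binding_id):
        # "create service binding" maps to "create a database user".
        db_name = f"db_{instance_id}"
        username = f"user_{binding_id}"
        password = secrets.token_urlsafe(16)
        self.cluster.create_user(db_name, username, password)
        return {"database": db_name, "username": username, "password": password}


broker = SimpleBroker(SharedCluster())
broker.create_service("1")
print(broker.create_binding("1", "a")["username"])
```

All tenants end up inside one database server here, so isolation is only as strong as the server's own user and database separation.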

Weak Isolation!

Structural Limitation!

Scaling a shared VM cluster What to do when the shared cluster is full?

Scaling a shared VM cluster: MongoDB Cluster #1 (3 VMs: MongoDB VM #1, #2, #3) hosts Service Instances #1 to #n-max, each mapped to its own database (#1 to #n-max).

Scaling a shared VM cluster: a second cluster, MongoDB Cluster #2 (3 VMs: MongoDB VM #4, #5, #6), is added to host Service Instances #n+1 to #2*n-max.

Scaling a shared VM cluster Simple Service Broker Logic

Scaling a shared VM cluster Complex Service Broker Logic

Fragmentation

Fragmentation: MongoDB Cluster #1 (3 VMs) and MongoDB Cluster #2 (3 VMs) start out hosting Service Instances #1 to #n-max and #n+1 to #2*n-max, respectively.

Fragmentation: after deletions, Cluster #1 still holds Service Instances #1, #3, #5, ... #n-max and Cluster #2 holds #n+2, #n+3, ... #2*n-max, with gaps in between. Caused by frequent creation and deletion of service instances.

Placement Problem

Placement Problem: both clusters are partially filled (MongoDB Cluster #1 holds Service Instances #1, #3, #5, ... #n-max; MongoDB Cluster #2 holds #n+2, #n+3, ... #2*n-max). Which cluster should host a new service instance? A strategy for placing new service instances is required and may need data-service-specific logic.

Cluster Rebalancing

Cluster Rebalancing: after further deletions, MongoDB Cluster #2 (3 VMs) holds only Service Instances #n+2 and #2*n-max, while MongoDB Cluster #1 (3 VMs) holds #1, #3, #5, ... #n-max. An unbalanced set of clusters wastes infrastructure resources.

Cluster Rebalancing: moving Service Instances #n+2 and #2*n-max into MongoDB Cluster #1 empties Cluster #2. A cluster rebalance that frees infrastructure resources would be desirable.

Shared Cluster Conclusion

Scalability issues can be addressed. Isolation issues are heavily data-service specific. > A generic solution is not possible.

Scaling Dedicated Containers

Better Isolation

Scaling Dedicated Containers: a PostgreSQL Cell consists of 2 VMs across 2 AZs (Docker host VM #1 and Docker host VM #2).

Scaling Dedicated Containers: each service instance (#1, #2, ...) in the PostgreSQL Cell = 2 Docker containers + 2 PostgreSQL processes + 2 PostgreSQL databases, asynchronously replicated.

How to scale?

Scaling Dedicated Containers: the PostgreSQL Cell (2 VMs across 2 AZs: Docker host VM #1 and Docker host VM #2) hosts Service Instance #1 and Service Instance #2, each = 2 Docker containers + 2 PostgreSQL processes + 2 PostgreSQL databases, asynchronously replicated.

Structural Limitation!

Scaling a shared VM cluster What to do when the Cell/Cluster is full?

Scaling Dedicated Containers: PostgreSQL Cell #1 (2 VMs across 2 AZs: Docker host VM #1, #2) and PostgreSQL Cell #2 (2 VMs across 2 AZs: Docker host VM #3, #4). Service Instances #1 and #2 run in Cell #1; further service instances run in Cell #2. Each service instance = 2 Docker containers + 2 PostgreSQL processes.

Same Service Broker Challenge

New Challenge: How to add Cell-VMs on-demand?

On-Demand VM provisioning is unavoidable.

Why not delegate most challenges?

On-Demand Dedicated VMs and Clusters

Architecture

Architecture: CF Client > create service > Cloud Controller > create service / create binding > a9s Service Broker (Cloud Foundry Adapter, Middleware Adapter, a9s MongoDB SPI). The broker creates a deployment from template xy with attributes { } and creates service-specific credentials. The a9s Deployer (Templates, Deployments) deploys release abc and deployment manifest xyz through BOSH, which executes the deployments. Resulting service instances: my-single-mongodb-1 (MongoDB VM#1), my-3node-mongodb-cluster-2 (MongoDB VM#1, VM#2, VM#3), my-3node-mongodb-cluster-3 (MongoDB VM#1, VM#2, VM#3).

Let BOSH do the VM orchestration!

Let the infrastructure solve the placement and fragmentation challenge!

Shared Services Instances

Shared PostgreSQL Cluster > Bad idea. Either 1x single PostgreSQL server (1 VM: VM#1 hosting Service Instances 1, 2 and 3) or 1x PostgreSQL cluster (3 VMs: VM#1, VM#2 and VM#3, each hosting Service Instances 1, 2 and 3). Single VM or single cluster of VMs. Single PostgreSQL server or single PostgreSQL cluster. Isolation limited to PostgreSQL multitenancy capabilities.

Shared PostgreSQL = SPOF

Cloud Foundry Runtime: many apps are bound to the service instances of a single shared PostgreSQL Cluster (3 VMs: VM#1, VM#2, VM#3).