Managing Compute and Storage at Scale with Kubernetes. Dan Paik / Google

Similar documents
Introduction to Kubernetes Storage Primitives for Stateful Workloads

Kubernetes Storage: Current Capabilities and Future Opportunities. September 25, 2018 Saad Ali & Nikhil Kasinadhuni Google

What s New in Kubernetes 1.10

Raw Block Volume in Kubernetes Mitsuhiro Tanino, Principal Software Engineer, Hitachi Vantara

Above the clouds with container-native storage

You Have Stateful Apps - What if Kubernetes Would Also Run Your Storage?

Internals of Docking Storage with Kubernetes Workloads

Container-Native Storage

Webinar Series. Cloud Native Storage. May 17, 2017

@briandorsey #kubernetes #GOTOber

PRP Distributed Kubernetes Cluster


What s New in Red Hat OpenShift Container Platform 3.4. Torben Jäger Red Hat Solution Architect

VMWARE PIVOTAL CONTAINER SERVICE

An Introduction to Kubernetes

Kubernetes on Openstack

Learn. Connect. Explore.

Hitachi & Red Hat collaborate: Container migration guide

Kubernetes Integration with Virtuozzo Storage

Kubernetes: What s New

OpenShift + Container Native Storage (CNS)

What s New in K8s 1.3

VMWARE PKS. What is VMware PKS? VMware PKS Architecture DATASHEET

Local Ephemeral Storage Resource Management. Jing Xu, Google

VMWARE ENTERPRISE PKS

Microservices. Chaos Kontrolle mit Kubernetes. Robert Kubis - Developer Advocate,

Kubernetes, Persistent Volumes and the Pure Service Orchestrator. Simon Dodsley, Director of New Stack Technologies

What s New in K8s 1.3

Red Hat Enterprise Linux Atomic Host 7 Getting Started with Kubernetes

/ Cloud Computing. Recitation 5 September 26 th, 2017

Introduction to Kubernetes

Convergence of VM and containers orchestration using KubeVirt. Chunfu Wen

Red Hat OpenShift Roadmap Q4 CY16 and H1 CY17 Releases. Lutz Lange Solution

CONTINUOUS INTEGRATION CONTINUOUS DELIVERY

Package your Java Application using Docker and Kubernetes. Arun

Designing MQ deployments for the cloud generation

Kubernetes: Container Orchestration and Micro-Services logo

Disaster Recovery and Data Protection for Kubernetes Persistent Volumes. Xing Yang, Principal Architect, Huawei

Important DevOps Technologies (3+2+3days) for Deployment

Managing and Protecting Persistent Volumes for Kubernetes. Xing Yang, Huawei and Jay Bryant, Lenovo

Hedvig as backup target for Veeam

McAfee Cloud Workload Security Installation Guide. (McAfee epolicy Orchestrator)

RED HAT GLUSTER TECHSESSION CONTAINER NATIVE STORAGE OPENSHIFT + RHGS. MARCEL HERGAARDEN SR. SOLUTION ARCHITECT, RED HAT BENELUX April 2017

/ Cloud Computing. Recitation 5 February 14th, 2017

DEVELOPER INTRO

Backup strategies for Stateful Containers in OpenShift Using Gluster based Container-Native Storage

Kubernetes: Twelve KeyFeatures

Taming your heterogeneous cloud with Red Hat OpenShift Container Platform.

Kubernetes. An open platform for container orchestration. Johannes M. Scheuermann. Karlsruhe,

Running MarkLogic in Containers (Both Docker and Kubernetes)

Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme

Red Hat Roadmap for Containers and DevOps

Scaling Jenkins with Docker and Kubernetes Carlos

Kuberiter White Paper. Kubernetes. Cloud Provider Comparison Chart. Lawrence Manickam Kuberiter Inc

Kubernetes 1.9 Features and Future

Kubernetes Autoscaling on Azure. Pengfei Ni Microsoft Azure

Kubernetes Basics. Christoph Stoettner Meetup Docker Mannheim #kubernetes101

Kubernetes 1.8 and Beyond

Taming Distributed Pets with Kubernetes

EASILY DEPLOY AND SCALE KUBERNETES WITH RANCHER

Containers Infrastructure for Advanced Management. Federico Simoncelli Associate Manager, Red Hat October 2016

White Paper. Why Remake Storage For Modern Data Centers

Red Hat Gluster Storage 3.1

Kubernetes The Path to Cloud Native

Kubernetes objects on Microsoft Azure

INTRODUCING CONTAINER-NATIVE VIRTUALIZATION

S Implementing DevOps and Hybrid Cloud

Secure Kubernetes Container Workloads

CONTAINERS AND MICROSERVICES WITH CONTRAIL

Kuber-what?! Learn about Kubernetes

Overview of Container Management

When (and how) to move applications from VMware to Cisco Metacloud

Continuous delivery while migrating to Kubernetes

Cisco Container Platform

Disclaimer This presentation may contain product features that are currently under development. This overview of new technology represents no commitme

So, I have all these containers! Now what?

Building an on premise Kubernetes cluster DANNY TURNER

Code: Slides:

DevOps + Infrastructure TRACK SUPPORTED BY

Onto Petaflops with Kubernetes

Kubernetes 101. Doug Davis, STSM September, 2017

Cloud & container monitoring , Lars Michelsen Check_MK Conference #4

Choosing the Right Container Infrastructure for Your Organization

agenda PAE Docker Docker PAE

Red Hat Gluster Storage Roadmap Past, Present & Future

OpenShift on Public & Private Clouds: AWS, Azure, Google, OpenStack

Kubernetes. Introduction

Kubernetes Integration Guide

OpenShift Roadmap Enterprise Kubernetes for Developers. Clayton Coleman, Architect, OpenShift

Integrate Ceph and Kubernetes. on Wiwynn ST P. All-Flash Storage

Defining Security for an AWS EKS deployment

Blue elephant on-demand: PostgreSQL + Kubernetes. FOSDEM 2018, Brussels

UP! TO DOCKER PAAS. Ming

OPENSTACK Building Block for Cloud. Ng Hwee Ming Principal Technologist (Telco) APAC Office of Technology

Everything You Ever Wanted To Know About Resource Scheduling... Almost

Containers, Serverless and Functions in a nutshell. Eugene Fedorenko

Enterprise Architectures The Pace Accelerates Camberley Bates Managing Partner & Analyst

Kubernetes made easy with Docker EE. Patrick van der Bleek Sr. Solutions Engineer NEMEA

Safe Harbor Statement

MySQL In the Cloud. Migration, Best Practices, High Availability, Scaling. Peter Zaitsev CEO Los Angeles MySQL Meetup June 12 th, 2017.

Transcription:

Managing Compute and Storage at Scale with Kubernetes Dan Paik / Google

Have You Recently... played a hit mobile game? shopped at an online marketplace? followed breaking news? attended a concert? filed expenses?

Have You Recently... played a hit mobile game? shopped an online marketplace? followed breaking news? attended a concert? filed expenses?

Fast, Scalable, Open Fast: Developer productivity Minutes from commit to prod Release 20-50x/day Scalable: efficient scale out Fastest app to $1B Black Friday demand Open: use anywhere 200 warehouses on VMware Hybrid and multi-cloud

Google has been developing and using containers to manage our applications for over 12 years. Images by Connie Zhou

Everything at Google runs in containers: Gmail, Web Search, Maps,... MapReduce, batch,... GFS, Colossus,... Even Google s Cloud Platform: our VMs run in containers!

Everything at Google runs in containers: Gmail, Web Search, Maps,... MapReduce, batch,... GFS, Colossus,... Even Google s Cloud Platform: our VMs run in containers! We launch over 2 billion containers per week

But it s all so different! Deployment Management, monitoring Isolation (very complicated!) Updates Discovery Scaling, replication, sets A fundamentally different way of managing applications requires different tooling and abstractions Images by Connie Zhou

Why Containers? Number of running jobs Containers make operations easier Enabled Google to grow our fleet over 10x faster than we grew our ops team Core Ops Team 2004 2016 10

Needs to run anywhere On-Premises Hybrid Cloud

Kubernetes Open Source, Run Anywhere, Container-centric Infrastructure, for managing containerized applications across multiple hosts. - Auto-scaling, rolling upgrades, A/B testing... Commercial Enterprise Support. Service Partners Inspired by Google s experiences and internal systems (blog post, research paper) 1,050+ Contributors 43,000+ Commits 4,000+ External Projects Based on Kubernetes 200+ Meetups Around the World

Velocity Commits Since July 2014 1.5 1.4 Total Commits 1.3 1.2 1.1 1.0 1.6

Can it scale?

How Can We Scale Out Container Workloads? Node Node Node Cluster??? How to handle replication? What about node failure? What about container failure? How do we manage application upgrades?

Kubernetes Scalability Scalability - exponentially more nodes and pods Multi-cluster federation HA masters Monitoring Global Enhanced resource management and isolation Hybrid Multi-cloud

This year, our customers flourished during Black Friday and Cyber Monday with zero outages, downtime or interruptions in service thanks, in part, to Google Container Engine and Kubernetes Will Warren, Chief Technology Officer at GroupBy

Cloud Datastore Transactions Per Second 50X Actual traffic 5X Worst case estimate 1X Target traffic Original launch target Estimated worst case Actual traffic

Scaling best practices

Pod Small group of containers & volumes Tightly coupled The atom of Kubernetes Shared ip, networking, lifecycle, disk

Horizontal Pod Autoscaling Automatically add (or remove) pods as needed Based on CPU utilization (for now) Custom metrics in Alpha Efficiency now, capacity when you need it Operates within user-defined min/max bounds Set it and forget it Stats...

Cluster autoscaling Add Nodes/VMs when needed Based on unschedulable pods New VMs self-register with API server Remove Nodes/VMs when not needed e.g. CPU usage too low...

Performance & scalability 100 nodes v1.0 07/2015 250 nodes v1.1 11/2015 1000 nodes v1.2 03/2016 2000 nodes v1.3 07/2016 5000 nodes v1.6 03/2017...

Hybrid / Multi-Cloud

Build for a Hybrid/Multi-Cloud World OPEN Public Cloud Private Cloud Open Source & APIs

Hybrid Cloud HYBRID DATA & APPLICATIONS Public Cloud On-Prem An elastic computing environment Secure network connectivity On-demand self service for users Data security Network delivered services Operational management and workflow

Workload portability Goal: Avoid vendor lock-in Runs in many environments, including bare metal and your laptop The API and the implementation are 100% open The whole system is modular and replaceable

Workload portability Goal: Write once, run anywhere* Don t force apps to know about concepts that are cloud-provider specific Examples of this: *mostly Network model Ingress Service load-balancers PersistentVolumes

Workload portability Goal: Avoid coupling Don t force apps to know about concepts that are Kubernetes-specific Examples of this: Namespaces Services / DNS Downward API Secrets / ConfigMaps

Workload portability Result: Portability Build your apps on-prem, lift-and-shift into cloud when you are ready Don t get stuck with a platform that doesn t work for you Put your app on wheels and move it whenever and wherever you need

Storage

Problem Content Manager Consumers Can t share files between containers Files in containers are ephemeral Can t run stateful applications Container crashes result in loss of data Web Server File Puller? Pod

Volumes Content Manager Consumers Kubernetes Volumes Directory, possibly with some data Accessible by all containers in pod Lifetime same as the pod or longer Volume Plugins Define How directory is setup Medium that backs it Contents of the directory File Puller Volume Pod Web Server

Volumes Kubernetes has many volume plugins Persistent GCE Persistent Disk AWS Elastic Block Store Azure File Storage Azure Data Disk iscsi Flocker NFS vsphere GlusterFS Ceph File and RBD Cinder Quobyte Volume FibreChannel VMWare Photon PD Ephemeral Empty dir (and tmpfs) Expose Kubernetes API Secret ConfigMap DownwardAPI Other Flex (exec a binary) Host path Future Local Storage?

GCE PD Volume referenced directly apiversion: v1 kind: Pod metadata: name: sleepypod spec: volumes: - name: data gcepersistentdisk: pdname: panda-disk fstype: ext4 containers: - name: sleepycontainer image: gcr.io/google_containers/busybox command: - sleep - "6000" volumemounts: - name: data mountpath: /data readonly: false

GCE PD Volume referenced directly apiversion: v1 kind: Pod metadata: name: sleepypod spec: volumes: - name: data gcepersistentdisk: pdname: panda-disk fstype: ext4 containers: - name: sleepycontainer image: gcr.io/google_containers/busybox command: - sleep - "6000" volumemounts: - name: data mountpath: /data readonly: false

GCE PD Volume referenced directly apiversion: v1 kind: Pod metadata: name: sleepypod spec: volumes: - name: data gcepersistentdisk: pdname: panda-disk fstype: ext4 containers: - name: sleepycontainer image: gcr.io/google_containers/busybox command: - sleep - "6000" volumemounts: - name: data mountpath: /data readonly: false

GCE PD Example Volume referenced directly apiversion: v1 kind: Pod metadata: name: sleepypod spec: volumes: - name: data gcepersistentdisk: pdname: panda-disk fstype: ext4 containers: - name: sleepycontainer image: gcr.io/google_containers/busybox command: - sleep - "6000" volumemounts: - name: data mountpath: /data readonly: false gcepd.yaml

PersistentVolumes A higher-level storage abstraction insulation from any one cloud environment Admin provisions them, users claim them Static and dynamic provisioning (via StorageClass) Independent lifetime from consumers lives until user is done with it can be handed-off between pods Claim

Walkthrough Cluster Admin

apiversion: v1 kind: PersistentVolume metadata: name : mypv1 spec: accessmodes: - ReadWriteOnce capacity: storage: 10Gi persistentvolumereclaimpolicy: Retain gcepersistentdisk: fstype: ext4 pdname: panda-disk --apiversion: v1 kind: PersistentVolume metadata: name : mypv2 spec: accessmodes: - ReadWriteOnce capacity: storage: 100Gi persistentvolumereclaimpolicy: Retain gcepersistentdisk: fstype: ext4 pdname: panda-disk2

$ kubectl create -f pv.yaml persistentvolume "pv1" created persistentvolume "pv2" created $ kubectl get pv NAME CAPACITY REASON AGE pv1 10Gi 1m pv2 100Gi 1m ACCESSMODES STATUS RWO Available RWO Available CLAIM

Creates Cluster Admin PersistentVolumes

PersistentVolumes Cluster Admin User

apiversion: v1 kind: PersistentVolumeClaim metadata: name: mypvc namespace: testns spec: accessmodes: - ReadWriteOnce resources: requests: storage: 100Gi

PersistentVolumes Cluster Admin PVClaim Create User

$ kubectl create -f pv.yaml persistentvolume "pv1" created persistentvolume "pv2" created $ kubectl get pv NAME CAPACITY REASON AGE pv1 10Gi 1m pv2 100Gi 1m ACCESSMODES STATUS RWO Available RWO Available CLAIM $ kubectl create -f pvc.yaml persistentvolumeclaim "mypvc" created $ kubectl get pv NAME CAPACITY REASON AGE pv1 10Gi 3m pv2 100Gi ACCESSMODES STATUS RWO Available RWO Bound CLAIM testns/mypvc

PersistentVolumes Cluster Admin PVClaim User Binder

PersistentVolumes Cluster Admin PVClaim Create User Pod

Volume no longer referenced directly apiversion: v1 kind: Pod metadata: name: sleepypod spec: volumes: volumes: - name: data - name: data persistentvolumeclaim: gcepersistentdisk: claimname: mypvc pdname: panda-disk fstype: ext4 containers: - name: sleepycontainer image: gcr.io/google_containers/busybox command: - sleep - "6000" volumemounts: - name: data mountpath: /data readonly: false

PersistentVolumes Cluster Admin * PVClaim User Pod

PersistentVolumes * Cluster Admin PVClaim Delete User Pod

PersistentVolumes Cluster Admin * PVClaim User

PersistentVolumes * Cluster Admin PVClaim Create User Pod

PersistentVolumes Cluster Admin * PVClaim User Pod

PersistentVolumes * Cluster Admin PVClaim Delete User Pod

PersistentVolumes * Cluster Admin PVClaim Delete User

PersistentVolumes Cluster Admin Recycler User

Dynamic provisioning for scalability Allows storage to be created on-demand (when requested by user) Eliminates need for cluster administrators to pre-provision storage Great for public cloud - API can provision new storage Also works on-prem - trigger new NFS share on demand Storage Class Cluster Admin PVClaim User Storage Provider

Dynamic provisioning Cluster/Storage admins enable dynamic provisioning by creating StorageClass objects StorageClass objects define the parameters used during creation StorageClass parameters are opaque to Kubernetes so storage providers can expose any number of custom parameters for the cluster admin to use Cluster Admin storage_class.yaml kind: StorageClass apiversion: storage.k8s.io/v1 metadata: name: slow provisioner: kubernetes.io/gce-pd parameters: type: pd-standard -kind: StorageClass apiversion: storage.k8s.io/v1 metadata: name: fast provisioner: kubernetes.io/gce-pd parameters: type: pd-ssd Storage Class

Dynamic provisioning Users consume storage the same way: PVC Selecting a storage class in PVC triggers dynamic provisioning pvc.yaml (on k8s 1.5) apiversion: v1 kind: PersistentVolumeClaim metadata: name: mypvc namespace: testns annotations: volume.beta.kubernetes.io/storage-class: fast spec: accessmodes: - ReadWriteOnce resources: requests: storage: 100Gi User pvc.yaml (on k8s 1.6+) apiversion: v1 kind: PersistentVolumeClaim metadata: name: mypvc namespace: testns spec: accessmodes: - ReadWriteOnce resources: requests: storage: 100Gi storageclassname: fast PVClaim

Thank You! danpaik@google.com Images by Connie Zhou