Understanding and Evaluating Kubernetes Haseeb Tariq Anubhavnidhi Archie Abhashkumar

Agenda Overview of project Kubernetes background and overview Experiments Summary and Conclusion

1. Overview of Project Goals Our approach Observations

Goals Kubernetes Platform to manage containers in a cluster Understand its core functionality Mechanisms and policies Major questions Scheduling policy Admission control Autoscaling policy Effect of failures

Our Approach Monitor state changes Force system into initial state Introduce stimuli Observe the change towards the final state Requirements Small Kubernetes cluster with resource monitoring Simple workloads to drive the changes

Observations Kubernetes tries to be simple and minimal Scheduling and admission control Based on resource requirements Spreading across nodes Response to failures Timeout and restart Can push to undesirable states Autoscaling as expected Control loop with damping

2. Kubernetes Background Motivation Architecture Components

Need for Container Management Workloads have shifted from using VMs to containers Better resource utilization Faster deployment Simplifies config and portability More than just scheduling Load balancing Replication for services Application health checking Ease of use for Scaling Rolling updates

High Level Design [diagram] The user talks to the master, which runs the api-server and scheduler; each node runs a kubelet that manages the pods placed on it

Pods Small group of containers Shared namespace Share IP and localhost Volume: shared directory Scheduling unit Resource quotas Limit Min request Once scheduled, pods do not move [diagram: example pod with a file-puller container and a web-server container sharing a content volume, serving consumers]
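
A minimal sketch of how such a pod could be declared with the official Kubernetes Python client, assuming a reachable cluster and a local kubeconfig; the image name, volume, and resource figures are illustrative only:

    from kubernetes import client, config

    config.load_kube_config()  # assumes a valid local kubeconfig

    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(name="demo-pod"),
        spec=client.V1PodSpec(
            volumes=[client.V1Volume(
                name="content",
                empty_dir=client.V1EmptyDirVolumeSource())],    # shared directory
            containers=[client.V1Container(
                name="web-server",
                image="nginx",                                   # illustrative image
                volume_mounts=[client.V1VolumeMount(
                    name="content", mount_path="/usr/share/nginx/html")],
                resources=client.V1ResourceRequirements(
                    requests={"cpu": "100m", "memory": "64Mi"},   # min request
                    limits={"cpu": "200m", "memory": "128Mi"}))]))  # limit

    client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)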

General Concepts Replication Controller Maintain count of pod replicas Service A set of running pods accessible by a virtual IP Network model IP for every pod, service and node Makes all-to-all communication easy
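
The replication controller is essentially a reconcile loop: compare the observed replica count with the desired count and create or delete pods to close the gap. A minimal sketch of that idea, with hypothetical create_pod/delete_pod helpers passed in by the caller:

    def reconcile(desired_replicas, running_pods, create_pod, delete_pod):
        """Drive the observed replica count toward the desired count."""
        diff = desired_replicas - len(running_pods)
        if diff > 0:
            for _ in range(diff):              # too few replicas: create more
                create_pod()
        elif diff < 0:
            for pod in running_pods[:-diff]:   # too many: delete the excess
                delete_pod(pod)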

3. Experimental Setup

Experimental Setup Google Compute Engine cluster 1 master, 6 nodes Limited by free trial Could not perform experiments on scalability

Simplified Workloads Simple scripts running in containers that consume a specified amount of CPU and memory; we set both the request and the usage Four combinations: low request - low usage, low request - high usage, high request - low usage, high request - high usage (a sketch of such a workload follows below)
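
A minimal sketch of such a workload, assuming the CPU share is driven by a duty-cycled busy loop and the memory usage by a preallocated buffer; the default parameters are illustrative:

    import time

    def burn(cpu_fraction=0.8, memory_mb=200, period=0.1):
        """Hold roughly cpu_fraction of one core busy and memory_mb resident."""
        ballast = bytearray(memory_mb * 1024 * 1024)   # fixed memory usage
        ballast[0] = 1                                 # touch it so it stays resident
        while True:
            start = time.time()
            while time.time() - start < cpu_fraction * period:
                pass                                   # busy-wait: CPU usage
            time.sleep((1 - cpu_fraction) * period)    # idle for the rest of the period

    if __name__ == "__main__":
        burn()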

Scheduling Behavior 4. Experiments Scheduling based on min-request or actual usage?

Scheduling based on min-request or actual usage? Initial experiments showed that the scheduler tries to spread the load, but is this based on actual usage or on min-request? Set up two nodes with no background containers Node A has a high CPU usage but a low request Node B has a low CPU usage but a higher request See where a new pod gets scheduled

Scheduling based on Min-Request or Actual Usage (CPU)? - Before Node A Node B Pod1 Request: 10% Usage : 67% Pod2 Request: 10% Usage : 1% Pod3 Request: 10% Usage : 1%

Scheduling based on Min-Request or Actual Usage (CPU)? - After Node A Node B Pod1 Request: 10% Usage : 42% Pod4 Request: 10% Usage : 43% Pod2 Request: 10% Usage : 1% Pod3 Request: 10% Usage : 1%

Scheduling based on Min-Request or Actual Usage (Memory)? We saw the same results when varying memory usage and request: scheduling is based on min-request

4. Experiments Scheduling Behavior Are Memory and CPU given equal weightage for making scheduling decisions?

Are Memory and CPU given Equal Weightage? First Experiment (15 trials): Both nodes have 20% CPU request and 20% Memory request Average request 20% The new pod is equally likely to be scheduled on either node.

New Pod with 20% CPU and 20% Memory Request Node A Node B Pod1 CPU Request: 20% Memory Request : 20% Pod2 CPU Request: 20% Memory Request : 20% Pod3 CPU Request: 20% Memory Request : 20%

New Pod with 20% CPU and 20% Memory Request Node A Node B Pod1 CPU Request: 20% Memory Request : 20% Pod2 CPU Request: 20% Memory Request : 20% Pod3 CPU Request: 20% Memory Request : 20%

Are Memory and CPU given Equal Weightage? Second Experiment (15 trials): Node A has 20% CPU request and 10% Memory request Average request 15% Node B has 20% CPU request and 20% Memory request Average request 20% New pod should always be scheduled on Node A

New Pod with 20% CPU and 20% Memory Request Node A Node B Pod1 CPU Request: 20% Memory Request : 10% Pod2 CPU Request: 20% Memory Request : 20% Pod3 CPU Request: 20% Memory Request : 20%

New Pod with 20% CPU and 20% Memory Request Node A Node B Pod1 CPU Request: 20% Memory Request : 10% Pod2 CPU Request: 20% Memory Request : 20% Pod3 CPU Request: 20% Memory Request : 20%

Are Memory and CPU given Equal Weightage? Third Experiment (15 trials): Node A has 20% CPU request and 10% Memory request. Average 15% Node B has 10% CPU request and 20% Memory request Average 15% Equally likely to be scheduled on either node again

New pod with 20% CPU and 20% Memory Request Node A Node B Pod1 CPU Request: 20% Memory Request : 10% Pod2 CPU Request: 10% Memory Request : 20% Pod3 CPU Request: 20% Memory Request : 20%

New pod with 20% CPU and 20% Memory Request Node A Node B Pod1 CPU Request: 20% Memory Request : 10% Pod2 CPU Request: 10% Memory Request : 20% Pod3 CPU Request: 20% Memory Request : 20%

Are Memory and CPU given Equal Weightage? From the experiments we can see that Memory and CPU requests are given equal weightage in scheduling decisions
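
One way to reproduce this behaviour is a least-requested score that averages free CPU and free memory equally; the sketch below is in the spirit of the observed policy, not the scheduler's actual code:

    def node_score(cpu_requested, mem_requested, cpu_capacity=100, mem_capacity=100):
        """Higher score = more unrequested capacity; CPU and memory weigh equally."""
        cpu_free = (cpu_capacity - cpu_requested) / cpu_capacity
        mem_free = (mem_capacity - mem_requested) / mem_capacity
        return (cpu_free + mem_free) / 2

    # Third experiment: Node A has 20% CPU / 10% memory requested,
    # Node B has 10% CPU / 20% memory requested.
    print(node_score(20, 10))   # 0.85
    print(node_score(10, 20))   # 0.85 -> a tie, so either node may be picked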

Admission Control 4. Experiments Is admission control based on resource usage or resource request?

Is Admission Control based on Resource Usage or Request? Node A Pod1 Request: 1% Usage : 21% Pod2 Request: 1% Usage : 21% Pod3 Request: 1% Usage : 21% Pod4 Request: 1% Usage : 21%

Is Admission Control based on Actual Usage? : 70% CPU request Node A Pod1 Request: 1% Usage : 2% Pod2 Request: 1% Usage : 2% Pod3 Request: 1% Usage : 2% Pod4 Request: 1% Usage : 2% Pod5 Request: 70% Usage : 78%

Is Admission Control based on Actual Usage?: 98% CPU request Node A Pod1 Request: 1% Usage : 21% Pod2 Request: 1% Usage : 21% Pod3 Request: 1% Usage : 21% Pod4 Request: 1% Usage : 21% Pod1 Request: 98% Usage : 1

Is Admission Control based on Actual Usage?: 98% CPU request Node A Pod1 Request: 1% Usage : 21% Pod2 Request: 1% Usage : 21% Pod3 Request: 1% Usage : 21% Pod4 Request: 1% Usage : 21% Pod1 Request: 98% Usage : 1

Is Admission Control based on Actual Usage? From the previous two slides we can see that admission control is also based on min-request and not actual usage
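
A minimal sketch of that admission rule, assuming admission only compares the sum of min-requests against node capacity and ignores measured usage; the numbers mirror the experiment above:

    def admits(existing_requests, new_request, capacity=100):
        """Admit a pod iff the sum of min-requests stays within node capacity."""
        return sum(existing_requests) + new_request <= capacity

    running = [1, 1, 1, 1]        # four pods with a 1% CPU request each
    print(admits(running, 70))    # True  -> the 70%-request pod is admitted
    print(admits(running, 98))    # False -> the 98%-request pod is rejected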

4. Experiments Does Kubernetes always guarantee minimum request?

Before Background Load Node A Pod1 Request: 70% Usage : 75%

After Background Load (100 Processes) Node A Pod1 Request: 70% Usage : 27% High load background process

Does Kubernetes always guarantee Min Request? Background processes on the node are not part of any pods, so Kubernetes has no control over them This can prevent pods from getting their min-request

4. Experiments Fault Tolerance and effect of failures Container and Node crash

Response to Failure Container crash Detected via the Docker daemon on the node More sophisticated probes can detect slowdowns or deadlocks Node crash Detected via the node controller, 40 second heartbeat Pods of the failed node are rescheduled after 5 min
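
A minimal sketch of that node-failure timeline, assuming only the two timeouts quoted above (40 s without a heartbeat to mark the node unhealthy, 5 more minutes before its pods are rescheduled); the function and its states are illustrative:

    GRACE_PERIOD = 40        # seconds without a heartbeat before the node is NotReady
    EVICTION_TIMEOUT = 300   # seconds NotReady before its pods are rescheduled

    def node_phase(now, last_heartbeat, not_ready_since):
        """Classify a node given heartbeat times (all in epoch seconds)."""
        if now - last_heartbeat < GRACE_PERIOD:
            return "Ready"
        if not_ready_since is None or now - not_ready_since < EVICTION_TIMEOUT:
            return "NotReady"      # pods stay bound, waiting for recovery
        return "EvictPods"         # reschedule the node's pods elsewhere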

4. Experiments Fault Tolerance and effect of failures Interesting consequence of crash, reboot

Pod Layout before Crash Node A Node B Pod1 Request: 10% Usage : 35% Pod2 Request: 10% Usage : 45% Pod3 Request: 10% Usage : 40%

Pod Layout after Crash Node A Node B Pod1 Request: 10% Usage : 35% Pod2 Request: 10% Usage : 45% Pod3 Request: 10% Usage : 40%

Pod Layout after Crash & before Recovery Node A Node B Pod1 Request: 10% Usage : 29% Pod2 Request: 10% Usage : 27% Pod3 Request: 10% Usage : 26%

Pod Layout after Crash & after Recovery Node A Node B Pod1 Request: 10% Usage : 29% Pod2 Request: 10% Usage : 27% Pod3 Request: 10% Usage : 26%

Interesting Consequence of Crash, Reboot Can shift the container placement into an undesirable or less optimal state Multiple ways to mitigate this Have Kubernetes reschedule (increases complexity) Users set their requirements carefully so as not to get into that situation Reset the entire system to get back to the desired configuration

Autoscaling 4. Experiments How does Kubernetes do autoscaling?

Autoscaling Control Loop Set target CPU utilization for a pod Check CPU utilization of all pods Adjust number of replicas to meet target utilization Here utilization is a % of the pod's request What does normal autoscaling behavior look like for a stable load?
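
Before looking at the traces, here is a minimal sketch of the replica calculation this control loop performs (the same formula appears in the backup slides); it assumes utilization is reported per pod as a percentage of that pod's request:

    import math

    def desired_replicas(pod_utilizations, target_utilization):
        """pod_utilizations: per-pod CPU usage as a % of its request."""
        return math.ceil(sum(pod_utilizations) / target_utilization)

    print(desired_replicas([180], 50))             # one hot pod -> 4 replicas
    print(desired_replicas([45, 45, 45, 45], 50))  # 4 -> stable, no change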

Target Utilization 50% Normal Behavior of Autoscaler

Normal Behavior of Autoscaler Target Utilization 50% High load is added to the system. The CPU usage and number of pods increase

Normal Behavior of Autoscaler Target Utilization 50% The load is now spread across nodes and the measured CPU usage is now the average CPU usage of 4 nodes

Normal Behavior of Autoscaler Target Utilization 50% The load was removed and the extra pods are removed

Autoscaling Parameters The autoscaler has two important parameters Scale up: delay of 3 minutes since the last scaling event Scale down: delay of 5 minutes since the last scaling event How does the autoscaler react to a more transient load?
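
A minimal sketch of how these cooldowns could gate resizing decisions, using only the two delays quoted above; this illustrates the idea rather than the actual controller code:

    SCALE_UP_DELAY = 3 * 60     # seconds since the last scaling event
    SCALE_DOWN_DELAY = 5 * 60

    def may_scale(desired, current, last_scale_time, now):
        """Allow a resize only once the relevant cooldown has elapsed."""
        if desired > current:
            return now - last_scale_time >= SCALE_UP_DELAY
        if desired < current:
            return now - last_scale_time >= SCALE_DOWN_DELAY
        return False            # already at the desired size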

Target Utilization 50% Autoscaling Parameters

Autoscaling Parameters Target Utilization 50% The load went down

Autoscaling Parameters Target Utilization 50% The number of pods doesn't scale down as quickly

Autoscaling Parameters Target Utilization 50% The number of pods doesn't scale down as quickly This is repeated in other runs too

Autoscaling Parameters Needs to be tuned for the nature of the workload Generally conservative Scales up faster Scales down slower Tries to avoid thrashing

5. Summary

Summary Scheduling and admission control policy is based on the min-request of resources CPU and Memory are given equal weightage Crashes can drive the system towards undesirable states Autoscaler works as expected Has to be tuned for the workload

6. Conclusion

Conclusion Philosophy of control loops Observe, rectify, repeat Drive system towards desired state Kubernetes tries to do as little as possible Not a lot of policies Makes it easier to reason about But can be too simplistic in some cases

Thanks! Any questions?

References http://kubernetes.io/ http://blog.kubernetes.io/ Verma, Abhishek, et al. "Large-scale cluster management at Google with Borg." Proceedings of the Tenth European Conference on Computer Systems. ACM, 2015.

Backup slides

Scheduling Behavior 4. Experiments Is the policy based on spreading load across resources?

Is the Policy based on Spreading Load across Resources? Launch a Spark cluster on Kubernetes Increase the number of workers one at a time Expect to see them scheduled across the nodes Shows the spreading policy of the scheduler

Individual Node Memory Usage

Increase in Memory Usage across Nodes

Final Pod Layout after Scheduling Node A Node B Worker 1 Worker 2 Worker 3 Worker 4 Worker 5 Worker 6 DNS Logging Grafana Logging Master Node C Node D Worker 7 Worker 8 Worker 9 Worker 10 Worker 11 Worker 12 LB Controller Logging Kube-UI Heapster Logging KubeDash

Is the Policy based on Spreading Load across Resources? Exhibits spreading behaviour, but inconclusive whether it is based on resource usage or request Background pods add to the noise The Spark workload is hard to gauge

Autoscaling Algorithm CPU utilization of a pod = actual usage / amount requested Target num pods = ceil( sum( all pods' utilization ) / target utilization )
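
As an illustrative (not measured) example of the formula: four pods each running at 80% of their request against a 50% target give ceil(320 / 50) = 7 replicas.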

Control Plane Components Master API Server: client access to the master etcd: distributed consistent storage using Raft Scheduler Replication Controller Node Kubelet: manages pods and containers Kube-proxy: load balances among a service's pod replicas
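
Everything goes through the API server; as a small illustration, a client such as the official Kubernetes Python client can ask it for the pods in the cluster (assuming a local kubeconfig that points at the master):

    from kubernetes import client, config

    config.load_kube_config()          # credentials and address of the api-server
    v1 = client.CoreV1Api()

    # Scheduler, kubelets and controllers all read and write cluster state
    # through this same API; here we only read it.
    for pod in v1.list_pod_for_all_namespaces(watch=False).items:
        print(pod.status.pod_ip, pod.metadata.namespace, pod.metadata.name)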

Detailed Architecture

Autoscaling for Long Stable Loads (10 high, 10 low)

New Pod with 20% CPU and 20% Memory Request Node A Node B Pod1 CPU Request: 20% Memory Request : 20% Pod2 CPU Request: 20% Memory Request : 20% Pod3 (Iter 1) CPU Request: 20% Memory Request : 20%

New Pod with 20% CPU and 20% Memory Request Node A Node B Pod1 CPU Request: 20% Memory Request : 20% Pod2 CPU Request: 20% Memory Request : 20% Pod3 (Iter 2) CPU Request: 20% Memory Request : 20%

New Pod with 20% CPU and 20% Memory Request Node A Node B Pod1 CPU Request: 20% Memory Request : 20% Pod2 CPU Request: 20% Memory Request : 20% Pod3 (Iter 3) CPU Request: 20% Memory Request : 20%

New Pod with 20% CPU and 20% Memory Request Node A Node B Pod1 CPU Request: 20% Memory Request : 10% Pod2 CPU Request: 20% Memory Request : 20% Pod3 (Iter 1) CPU Request: 20% Memory Request : 20%

New Pod with 20% CPU and 20% Memory Request Node A Node B Pod1 CPU Request: 20% Memory Request : 10% Pod2 CPU Request: 20% Memory Request : 20% Pod3 (Iter 2) CPU Request: 20% Memory Request : 20%

New Pod with 20% CPU and 20% Memory Request Node A Node B Pod1 CPU Request: 20% Memory Request : 10% Pod2 CPU Request: 20% Memory Request : 20% Pod3 (Iter 3) CPU Request: 20% Memory Request : 20%

New pod with 20% CPU and 20% Memory Request Node A Node B Pod1 CPU Request: 20% Memory Request : 10% Pod2 CPU Request: 10% Memory Request : 20% Pod3 (Iter 1) CPU Request: 20% Memory Request : 20%

New pod with 20% CPU and 20% Memory Request Node A Node B Pod1 CPU Request: 20% Memory Request : 10% Pod2 CPU Request: 10% Memory Request : 20% Pod3 (Iter 2) CPU Request: 20% Memory Request : 20%

New pod with 20% CPU and 20% Memory Request Node A Node B Pod1 CPU Request: 20% Memory Request : 10% Pod2 CPU Request: 10% Memory Request : 20% Pod3 (Iter 3) CPU Request: 20% Memory Request : 20%