Lightweight Containerization at Facebook

Transcription:

Lightweight Containerization at Facebook Zoltan Puskas (zpuskas@fb.com) Production Engineer on Infrastructure

Agenda
- What is Tupperware?
- Why use Btrfs?
- Building layered images
- Launching with systemd
- Results

Terminology: Infrastructure we run on
- Data center: one or more clusters
- Cluster: multiple racks
- Rack: multiple machines
Tupperware
- Job: a collection of similar tasks
- Task: an instance of a running binary
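The terminology above describes two small hierarchies: data center, cluster, rack and machine on the infrastructure side, and job and task on the Tupperware side. A minimal sketch of that data model follows. All class names, field names and the `make_job` helper are hypothetical, chosen for illustration, and are not Tupperware's actual types.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    """One running instance of a binary."""
    task_id: int
    binary: str


@dataclass
class Job:
    """A collection of similar tasks."""
    name: str
    tasks: List[Task] = field(default_factory=list)


@dataclass
class Rack:
    """A rack holds multiple machines (identified here by hostname)."""
    machines: List[str] = field(default_factory=list)


@dataclass
class Cluster:
    """A cluster holds multiple racks."""
    racks: List[Rack] = field(default_factory=list)

    def machine_count(self) -> int:
        return sum(len(rack.machines) for rack in self.racks)


@dataclass
class DataCenter:
    """A data center holds one or more clusters."""
    clusters: List[Cluster] = field(default_factory=list)


def make_job(name: str, binary: str, replicas: int) -> Job:
    """Build a job whose tasks all run the same binary (hypothetical helper)."""
    return Job(name, [Task(i, binary) for i in range(replicas)])
```

For example, `make_job("cool_service", "/usr/bin/cool_service", 3)` yields a job with three tasks numbered 0 to 2.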

Tupperware: Our solution
- Tupperware is a massively parallel job execution framework
- Runs many of Facebook's services
- Isolated runtime environment
- Resource controlled
- Easy workflow

Tupperware: But why?
- Why not Docker/CoreOS?
  - They did not exist when Tupperware started
  - Integration with Facebook systems
- Why not VMs?
  - Performance penalty
  - Harder to debug issues

Architecture: Container Stack
Industry                             Facebook
Etcd, Consul                         Zookeeper based discovery service
Kubernetes, Docker Swarm, Chronos    Tupperware Scheduler
Docker Networking, CoreOS Flannel    Tupperware ILA
Containers                           Containers
Docker Engine, RKT                   Tupperware Agent
KVM, Hyper-V, LXC                    Facebook Hosts

Architecture: Building Blocks
Industry                              Facebook
Docker CLI, kubectl                   Tupperware CLI
Chef, Puppet, Ansible                 Job Spec, Chef
Docker registry, CoreOS registry      Distributed package manager
Dockerfile and image, CoreOS image    Buck target and Btrfs image

Architecture: Overview
(Diagram. Components shown: Server DB, job_config.tw, CLI, Scheduler (job), Resource Manager, three Hosts (tasks), Package storage system.)

Architecture: Tupperware Agent
(Diagram. Agent components: Task manager, API, Package manager, Volume manager, Resource manager, Scheduler heartbeat. One Agent-helper per task: task_1, task_2, task_3.)

Launching containers: Moving to images
Previous solution             New solution
rootfs.tar.gz                 Self contained Btrfs image
Extract a copy per task       RW snapshots on RO base
yum install during prepare    Everything is pre-installed
busybox init via LXC          Systemd init via nspawn
cgroup v1                     cgroup v2

Lightweight containers: What do we want from our containers?
- Reliability
- Determinism
- Transparency
- Low overhead
- Easy end to end workflow

Image Layering: A better way to manage runtimes
Layers, top to bottom:
- Running Task (Task)
- Application Image (App)
- Facebook Image (FB)
- Base OS Image (OS)
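The layering on this slide is a chain in which each layer names the one beneath it. As a rough sketch, assuming a simple child-to-parent mapping (the `PARENTS` table and `layer_chain` function are illustrative names, not part of Tupperware), the full stack for a running task can be resolved like this:

```python
from typing import Dict, List, Optional

# Hypothetical child -> parent mapping mirroring the slide; None marks the root.
PARENTS: Dict[str, Optional[str]] = {
    "base_os": None,
    "facebook": "base_os",
    "application": "facebook",
    "task": "application",
}


def layer_chain(layer: str, parents: Dict[str, Optional[str]]) -> List[str]:
    """Return the layers from the root up to `layer`, bottom first."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = layer
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle detected at layer {current!r}")
        if current not in parents:
            raise KeyError(f"unknown layer {current!r}")
        seen.add(current)
        chain.append(current)
        current = parents[current]
    chain.reverse()
    return chain
```

Calling `layer_chain("task", PARENTS)` walks from the task down to the base OS and returns the layers in the order they would be stacked.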

Btrfs: B-trees everywhere
- Copy-On-Write
- Subvolumes
- Snapshots (RO and RW)
- Binary diffs (send/receive)
- Quotas
- Cgroups IO control

Btrfs: Subvolumes
- vol_1: /data/vol_1
- vol_2: /data/vol_2
- vol_3: /data/vol_3

Subvolumes: Creating a Btrfs filesystem
    # truncate -s 1G image.btrfs
    # mkfs.btrfs image.btrfs
    ...
    Filesystem size: 1.00GiB
    Block group profiles:
      Data: single 8.00MiB
      Metadata: DUP 51.19MiB
      System: DUP 8.00MiB
    # mkdir data
    # mount image.btrfs data
    # mount
    ...
    /tmp/demo/image.btrfs on /tmp/demo/data type btrfs (rw,relatime,space_cache,subvolid=5,subvol=/)

Subvolumes: Basic subvolumes
    # btrfs subvolume create data/vol_1
    # btrfs subvolume create data/vol_2
    # btrfs subvolume create data/vol_3
    # ls data
    vol_1 vol_2 vol_3
    # touch data/vol_1/foo
    # touch data/vol_2/bar
    # ls data/vol_1/
    foo
    # ls data/vol_2/
    bar
    # ls data/vol_3/
    # btrfs subvol list data/
    ID 259 gen 17 top level 5 path vol_1
    ID 260 gen 18 top level 5 path vol_2
    ID 261 gen 18 top level 5 path vol_3
    # mkdir vol_1_separate && mount -o subvol=vol_1 image.btrfs vol_1_separate
    # ls vol_1_separate
    foo

Btrfs: Snapshots
- vol_1: /data/vol_1
- vol_1_2: /data/vol_1_2
- vol_1_2_3: /data/vol_1_2_3

Subvolumes: Snapshots and nesting
    # btrfs subvolume snapshot data/vol_1 data/vol_1_2
    # ls data/vol_1/
    foo
    # ls data/vol_1_2/
    foo
    # touch data/vol_1_2/baz
    # ls data/vol_1/
    foo
    # ls data/vol_1_2/
    foo baz
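The session above shows the key property of a writable snapshot: it starts with the same contents as its source, and later writes to the snapshot do not appear in the source. The toy model below imitates only that visibility behaviour in memory. The `Subvolume` class is an illustrative assumption and does not reflect how Btrfs stores extents on disk.

```python
from typing import Dict, List, Optional


class Subvolume:
    """Toy in-memory stand-in for a Btrfs subvolume (illustration only)."""

    def __init__(self, files: Optional[Dict[str, bytes]] = None,
                 read_only: bool = False) -> None:
        # Copy only the name -> content mapping. The bytes objects, which
        # stand in for shared data extents, are not duplicated.
        self._files: Dict[str, bytes] = dict(files or {})
        self.read_only = read_only

    def touch(self, name: str) -> None:
        """Create an empty file if it does not exist yet."""
        if self.read_only:
            raise PermissionError("subvolume is read-only")
        self._files.setdefault(name, b"")

    def ls(self) -> List[str]:
        """List file names in sorted order."""
        return sorted(self._files)

    def snapshot(self, read_only: bool = False) -> "Subvolume":
        """Return a snapshot that shares content but has its own namespace."""
        return Subvolume(self._files, read_only)


vol_1 = Subvolume()
vol_1.touch("foo")
vol_1_2 = vol_1.snapshot()   # RW snapshot, like `btrfs subvolume snapshot`
vol_1_2.touch("baz")         # only the snapshot sees the new file
```

After these steps `vol_1` still lists only `foo`, while `vol_1_2` lists both `baz` and `foo`. A snapshot taken with `read_only=True` rejects writes, which corresponds to the RO base that the RW task snapshots are taken from.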

Copy on write and diffs: Building a tree of image layers
- Lower disk space usage
- Lower disk IO usage
- Improved disk data caching
- We can independently version layers
- Different update schedules for layers

Copy-on-Write with Diffs

Restore layers with diffs
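The idea behind shipping diffs and restoring layers from them can be illustrated without a filesystem. The sketch below treats a layer as a plain mapping from path to content, computes what changed between a parent and a child, and rebuilds the child from the parent plus that diff. It is a conceptual model only: `make_diff` and `apply_diff` are hypothetical helpers and say nothing about the actual `btrfs send`/`receive` stream format.

```python
from typing import Dict, List, TypedDict


class LayerDiff(TypedDict):
    changed: Dict[str, bytes]   # files added or modified in the child
    deleted: List[str]          # files present in the parent but not the child


def make_diff(parent: Dict[str, bytes], child: Dict[str, bytes]) -> LayerDiff:
    """Describe how to turn `parent` into `child`."""
    changed = {
        path: data
        for path, data in child.items()
        if path not in parent or parent[path] != data
    }
    deleted = sorted(path for path in parent if path not in child)
    return {"changed": changed, "deleted": deleted}


def apply_diff(parent: Dict[str, bytes], diff: LayerDiff) -> Dict[str, bytes]:
    """Rebuild the child layer from the parent and a diff."""
    result = dict(parent)
    for path in diff["deleted"]:
        result.pop(path, None)
    result.update(diff["changed"])
    return result
```

Only the entries in `changed` need to be stored or transferred for the child layer, which is the source of the lower disk space and IO usage listed earlier.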

Btrfs: Tupperware Layering
(Diagram. Labels, top to bottom: Task, Task, Task; App1, App2, Task; SSH, Cert; Base OS.)

Building images: Buck build
- Declarative image building
- Fast parallel builds
- Reproducible builds
- Incremental builds
- Separation of build and run time

Building images: Buck build
- Fully self contained
- Provides true FS level isolation
- Build once, use everywhere
- Testable

Building images: Custom image build

Image buck TARGET: Define your image!
    image_layer(
        name='job_layer',
        parent_layer=':minimal_busybox',
        copy_deps=[
            (':hello_world.tar', '/foo/bar/'),
            {
                'source': ':hello_world.tar',
                'dest': '/foo/bar/hello_world_again.tar',
                'group': 'nobody',
            },
            (':cool_service', '/usr/bin/cool_service'),
        ],
        features=[':some_feature_dirs_and_rpms'],
    )

    java_library(
        name = 'collect',
        srcs = glob(['*.java']),
        deps = [
            '//java/com/facebook/base:base',
        ],
    )

    java_binary(
        name = 'cool_service',
        deps = [
            ':collect',
            ':base',
        ],
    )

Image buck TARGET: Define your image!
    image_layer(
        name='minimal_busybox',
        tarballs=[
            ('//tools/build/buck/container_images:busybox.tgz', '/'),
        ],
        features=[':some_feature_dirs_and_rpms'],
    )

    image_feature(
        name='some_feature_dirs_and_rpms',
        rpms={
            'jdk_8u60': 'install',
        },
        make_dirs=[
            '/foo/bar',
            ('/foo/bar', 'baz'),
            {
                'path_to_make': 'borf/beep',
                'into_dir': '/foo',
                'user': 'nobody',
            },
        ],
        features=[],
    )
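In these target definitions a layer refers to its parent by a label such as `:minimal_busybox`, so the label has to match the `name` of another target exactly. The check below sketches that rule on a plain dictionary of layer names and parent labels. `unresolved_parents` is a hypothetical helper written for this illustration and is not part of Buck, which performs its own target resolution.

```python
from typing import Dict, List, Optional


def unresolved_parents(layers: Dict[str, Optional[str]]) -> List[str]:
    """Return the names of layers whose parent label matches no defined layer.

    `layers` maps a layer name to its parent label (for example
    ':minimal_busybox'), or to None when the layer has no parent.
    """
    missing: List[str] = []
    for name, parent_label in layers.items():
        if parent_label is None:
            continue
        # Same-file labels are written with a leading colon.
        parent_name = parent_label.lstrip(":")
        if parent_name not in layers:
            missing.append(name)
    return sorted(missing)
```

With `{"minimal_busybox": None, "job_layer": ":minimal_busybox"}` the function returns an empty list. If the base layer's name were misspelled, `job_layer` would be reported as having an unresolved parent.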

Control volume: Operating on layers / images
- OS.base
- OS.base.jdk
- tasks
  - task_1
  - task_2
- tmp
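Given a control volume laid out like this, starting a task amounts to taking a writable snapshot of an image subvolume under `tasks/`. The helper below only builds the command line for that step and does not run it. The `/data` mount point in the usage note and the function name are assumptions for illustration. `btrfs subvolume snapshot <source> <dest>` is the same command used earlier in the deck.

```python
import posixpath
from typing import List


def task_snapshot_command(volume_root: str, image: str, task: str) -> List[str]:
    """Build the argv for a writable per-task snapshot of an image layer.

    The layout (images at the top level, tasks under 'tasks/') follows the
    slide; the exact paths are assumed for this sketch.
    """
    source = posixpath.join(volume_root, image)
    dest = posixpath.join(volume_root, "tasks", task)
    return ["btrfs", "subvolume", "snapshot", source, dest]
```

For example, `task_snapshot_command("/data", "OS.base.jdk", "task_1")` produces a command that snapshots `/data/OS.base.jdk` to `/data/tasks/task_1`. Running it requires root privileges and a mounted Btrfs volume.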

systemd: Running the container, running inside the container
- systemd-nspawn
- systemd-init
- Container aware
- Logging to outside of the container (too)
- Running the container at build time
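As a rough illustration of launching a task image with systemd-nspawn and booting systemd inside it, the sketch below assembles a command line without executing it. The function name and the example path are assumptions. The options used (`--directory=`, `--machine=`, `--boot`) are standard systemd-nspawn options, but the flags Tupperware actually passes are not given in the slides.

```python
from typing import List


def nspawn_command(root_dir: str, machine_name: str) -> List[str]:
    """Build an argv that boots an init system from `root_dir` in a container."""
    return [
        "systemd-nspawn",
        f"--directory={root_dir}",    # root filesystem of the container
        f"--machine={machine_name}",  # name the container is registered under
        "--boot",                     # run the image's init (systemd) as PID 1
    ]
```

`nspawn_command("/data/tasks/task_1", "task_1")` would boot the per-task snapshot as a container named `task_1`. Passing the list to `subprocess.run` would actually start it and requires root privileges.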

Performance: It's fast

Performance Low IO

Performance Prefetch magic

Performance Still low IO

Fast iteration: Develop, test and deploy
- Tools that enter, run and debug images
- Automated image build system with dependency handling
- Distribution via existing package management system
- Automated releases

Summa summarum: Building better containers
- Tupperware helps us scale Facebook
- Using a next generation file system
- Flexible images
- Standard containerization tools
- Higher efficiency with more robustness

Zoltan Puskas (zpuskas@fb.com) Questions?