Slacker: Fast Distribution with Lazy Docker Containers Tyler Harter, Brandon Salmon, Rose Liu, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
Container Popularity [figure: container adoption data; source: spoon.net]
Theory and Practice Theory: containers are lightweight just like starting a process!
Theory and Practice Theory: containers are lightweight just like starting a process! Practice: container startup is slow 25 second startup time [1] "task startup latency (the time from job submission to a task running) is an area that has received and continues to receive significant attention. It is highly variable, with the median typically about 25 s. Package installation takes about 80% of the total: one of the known bottlenecks is contention for the local disk where packages are written." [1] Large-scale cluster management at Google with Borg. http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43438.pdf
Theory and Practice Theory: containers are lightweight just like starting a process! Practice: container startup is slow 25 second startup time [1] Startup time matters: flash crowds, load balancing, interactive development [1] Large-scale cluster management at Google with Borg. http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43438.pdf
Contributions HelloBench Docker benchmark for stressing startup based on 57 container workloads Startup analysis 76% of startup time spent copying/installing images startup requires only 6% of that image data Slacker: Docker storage driver lazily pull only needed data leverage extensions to Linux kernel and NFS server 5-20x startup speedups
Slacker Outline Background Containers: lightweight isolation Docker: file-system provisioning Container Workloads Default Driver: AUFS Our Driver: Slacker Evaluation Conclusion
Why use containers?
Why use containers? (it's trendy)
Why use containers? (it's trendy) (efficient solution to classic problem)
Big Goal: Sharing and Isolation App A App B Physical Machine want: multitenancy
Big Goal: Sharing and Isolation App A App B Physical Machine don't want: crashes
Big Goal: Sharing and Isolation App A App B Physical Machine don't want: unfairness
Big Goal: Sharing and Isolation App A sensitive data App B Physical Machine don't want: leaks
Solution: Virtualization namespaces and scheduling provide illusion of private resources
Evolution of Virtualization 1st generation: process virtualization isolate within OS (e.g., virtual memory) fast, but incomplete (missing ports, file system, etc.) App A App B process process Operating System process virtualization
Evolution of Virtualization 1st generation: process virtualization isolate within OS (e.g., virtual memory) fast, but incomplete (missing ports, file system, etc.) 2nd generation: machine virtualization isolate around OS complete, but slow (redundancy, emulation) App A App B App A App B process process OS OS Operating System VM VM process virtualization machine virtualization
Evolution of Virtualization 1st generation: process virtualization isolate within OS (e.g., virtual memory) fast, but incomplete (missing ports, file system, etc.) 2nd generation: machine virtualization isolate around OS complete, but slow (redundancy, emulation) 3rd generation: container virtualization extend process virtualization: ports, file system, etc. fast and complete
Evolution of Virtualization 1st generation: process virtualization isolate within OS (e.g., virtual memory) fast, but incomplete (missing ports, file system, etc.) 2nd generation: machine virtualization isolate around OS complete, but slow (redundancy, emulation) 3rd generation: container virtualization extend process virtualization: ports, file system, etc. fast and complete??? many storage challenges
New Storage Challenges Crash isolation Physical Disentanglement in a Container-Based File System. Lanyue Lu, Yupu Zhang, Thanh Do, Samer Al-Kiswany, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau. OSDI 14. Performance isolation Split-level I/O Scheduling For Virtualized Environments. Suli Yang, Tyler Harter, Nishant Agrawal, Salini Selvaraj Kowsalya, Anand Krishnamurthy, Samer Al-Kiswany, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau. SOSP 15. File-system provisioning Slacker: Fast Distribution with Lazy Docker Containers. Tyler Harter, Brandon Salmon, Rose Liu, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau. FAST 16. today
Slacker Outline Background Containers: lightweight isolation Docker: file-system provisioning Container Workloads Default Driver: AUFS Our Driver: Slacker Evaluation Conclusion
Docker Background Deployment tool built on containers An application is defined by a file-system image application binary shared libraries etc. Version-control model extend images by committing additional files deploy applications by pushing/pulling images
Containers as Repos LAMP stack example commit 1: Linux packages (e.g., Ubuntu) commit 2: Apache commit 3: MySQL commit 4: PHP Central registries Docker HUB private registries Docker layer commit container scratch space
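The commit model above can be sketched as a union lookup over an ordered list of layers. This is a minimal illustration, not Docker's real on-disk format; the dict-based layers and file names are made up:

```python
# Sketch of Docker's layered image model (illustrative, not Docker's format).
# An image is an ordered list of layers; each layer maps path -> content.
# Resolving a path returns the topmost layer's version, mimicking a union view.

def resolve(layers, path):
    """Return the content of `path` as seen through the union of layers.

    `layers` is ordered bottom (base commit) to top (most recent commit).
    """
    for layer in reversed(layers):       # topmost commit wins
        if path in layer:
            return layer[path]
    raise FileNotFoundError(path)

# LAMP-style stack: each commit adds files on top of the previous ones.
ubuntu = {"/bin/bash": "bash-binary"}
apache = {"/usr/sbin/apache2": "apache-binary"}
mysql  = {"/usr/sbin/mysqld": "mysql-binary", "/bin/bash": "patched-bash"}

image = [ubuntu, apache, mysql]
print(resolve(image, "/bin/bash"))       # topmost layer's copy: patched-bash
```

Extending an image is just appending another layer, which is why pushing a new commit only needs to ship the delta.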
Push, Pull, Run registry worker worker worker
Push, Pull, Run registry push worker worker worker
Push, Pull, Run registry worker worker worker
Push, Pull, Run registry pull pull worker worker worker
Push, Pull, Run registry worker worker worker
Push, Pull, Run registry C run C run C run worker worker worker
Push, Pull, Run registry need a new benchmark to measure Docker push, pull, and run operations. C run C run C run worker worker worker
Slacker Outline Background Container Workloads HelloBench Analysis Default Driver: AUFS Our Driver: Slacker Evaluation Conclusion
HelloBench Goal: stress container startup including push/pull 57 container images from Docker HUB run simple hello world -like task wait until it's done/ready push pull run
HelloBench Goal: stress container startup including push/pull 57 container images from Docker HUB run simple hello world -like task wait until it's done/ready ready push pull run
HelloBench Goal: stress container startup including push/pull 57 container images from Docker HUB run simple hello world -like task wait until it's done/ready Development cycle distributed programming/testing ready push pull run development cycle
HelloBench Goal: stress container startup including push/pull 57 container images from Docker HUB run simple hello world -like task wait until it's done/ready Development cycle distributed programming/testing Deployment cycle flash crowds, rebalance ready push pull run deployment cycle
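The measurement loop behind HelloBench can be sketched as timing each startup phase in order. The stub lambdas below stand in for real `docker push`/`docker pull`/`docker run` invocations, which this sketch does not perform:

```python
import time

def bench(phases):
    """Time each startup phase and return seconds per phase.

    `phases` is a list of (name, callable) pairs run in order, e.g.
    push -> pull -> run for one HelloBench image.
    """
    timings = {}
    for name, fn in phases:
        start = time.perf_counter()
        fn()                              # e.g., invoke `docker pull <image>`
        timings[name] = time.perf_counter() - start
    return timings

# Stubs in place of real Docker operations:
timings = bench([("push", lambda: None),
                 ("pull", lambda: None),
                 ("run",  lambda: None)])
```

Deployment-cycle time is then `pull + run`, and development-cycle time is `push + pull + run`.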
Workload Categories Linux Distro alpine busybox centos cirros crux debian fedora mageia opensuse oraclelinux ubuntu ubuntu-debootstrap ubuntu-upstart Database cassandra crate elasticsearch mariadb mongo mysql percona postgres redis rethinkdb Web Framework django iojs node rails Language clojure gcc golang haskell hylang java jruby julia mono perl php pypy python r-base rakudo-star ruby thrift Web Server glassfish httpd jetty nginx php-zendserver tomcat Other drupal ghost hello-world jenkins rabbitmq registry sonarqube
Slacker Outline Background Container Workloads HelloBench Analysis Default Driver: AUFS Our Driver: Slacker Evaluation Conclusion
Questions How is data distributed across Docker layers? How much image data is needed for container startup? How similar are reads between runs?
HelloBench images [figure: image/commit graph; circle = commit, red = image]
Image Data Depth
Image Data Depth half of data is at depth 9+
Questions How is data distributed across Docker layers? half of data is at depth 9+ design implication: flatten layers at runtime How much image data is needed for container startup? How similar are reads between runs?
Container Amplification
Container Amplification only 6.4% of data needed during startup
Questions How is data distributed across Docker layers? half of data is at depth 9+ design implication: flatten layers at runtime How much image data is needed for container startup? 6.4% of data is needed design implication: lazily fetch data How similar are reads between runs?
Repeat Runs measure hits/misses for the second of two runs up to 99% of reads could be serviced by a cache
Questions How is data distributed across Docker layers? half of data is at depth 9+ design implication: flatten layers at runtime How much image data is needed for container startup? 6.4% of data is needed design implication: lazily fetch data How similar are reads between runs? containers from same image have similar read patterns design implication: share cache state between containers
Slacker Outline Background Container Workloads Default Driver: AUFS Design Performance Our Driver: Slacker Evaluation Conclusion
AUFS Storage Driver Uses AUFS file system (Another Union FS) stores data in an underlying FS (e.g., ext4) layer directory in underlying FS root FS union of layer directories
AUFS Storage Driver Uses AUFS file system (Another Union FS) stores data in an underlying FS (e.g., ext4) layer directory in underlying FS root FS union of layer directories Operations push pull run
AUFS Storage Driver Uses AUFS file system (Another Union FS) stores data in an underlying FS (e.g., ext4) each Docker layer is a directory in the underlying FS; union these directories to create a complete view of the FS AUFS Driver layers: A B C
AUFS Driver PUSH: archive the layer directories (A, B, C) into a tar.gz and upload it to the registry
AUFS Driver PULL: download a tar.gz from the registry and unpack it into new layer directories (X, Y, Z)
AUFS Driver RUN: AUFS-mount the union of the layer directories, plus a writable scratch directory, as the container's root FS
reads (e.g., read B, read X) search the layer directories top-down until the file is found
writes (e.g., append Z) trigger file-granularity copy-on-write: AUFS copies the whole file up to the scratch directory, then applies the append
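The union-read and copy-up behavior above can be sketched in a few lines. This is a toy model with dicts standing in for layer directories, not real AUFS:

```python
# Sketch of an AUFS-style union with file-granularity copy-on-write.
# Layers are read-only dicts (bottom -> top); writes go to a scratch dict.

class UnionFS:
    def __init__(self, layers):
        self.layers = layers          # read-only layer "directories"
        self.scratch = {}             # container's writable scratch layer

    def read(self, path):
        if path in self.scratch:
            return self.scratch[path]
        for layer in reversed(self.layers):   # search top-down
            if path in layer:
                return layer[path]
        raise FileNotFoundError(path)

    def append(self, path, data):
        # AUFS copies the WHOLE file up to the scratch layer before
        # modifying it, even for a small append (file-granularity COW).
        if path not in self.scratch:
            self.scratch[path] = self.read(path)
        self.scratch[path] += data

fs = UnionFS([{"Z": "zzz"}, {"B": "bbb"}])
fs.append("Z", "!")               # triggers copy-up of Z into scratch
print(fs.read("Z"))               # scratch copy: zzz!
print(fs.layers[0]["Z"])          # underlying layer unchanged: zzz
```

The whole-file copy-up is one reason deep, write-heavy workloads are slow under AUFS; Slacker's block-granularity COW avoids it.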
Slacker Outline Background Container Workloads Default Driver: AUFS Design Performance Our Driver: Slacker Evaluation Conclusion
AUFS File System
AUFS File System Deep data is slow
AUFS Storage Driver
AUFS Storage Driver 76% of deployment cycle is spent on pull
Slacker Outline Background Container Workloads Default Driver: AUFS Our Driver: Slacker Evaluation Conclusion
Slacker Driver Goals make push+pull very fast utilize powerful primitives of a modern storage server (Tintri VMstore) create drop-in replacement; don't change Docker framework itself Design lazy pull layer flattening cache sharing
Prefetch vs. Lazy Fetch AUFS Slacker registry worker registry worker images containers images and containers
Prefetch vs. Lazy Fetch AUFS Slacker registry worker registry worker images containers images and containers significant copying over network to/from disk centralized storage easy sharing
Prefetch vs. Lazy Fetch Slacker registry container ext4 loopback images and containers NFS File
Prefetch vs. Lazy Fetch Slacker registry container ext4 loopback NFS File
Prefetch vs. Lazy Fetch Slacker registry container ext4 loopback NFS File VMstore abstractions
VMstore Abstractions Copy-on-Write VMstore provides snapshot() and clone() block granularity avoids AUFS's problems with file granularity snapshot(nfs_path) create read-only copy of NFS file, return snapshot ID clone(snapshot_id) create r/w NFS file from snapshot Slacker Usage NFS files = container storage snapshots = image storage clone() provisions a container from an image snapshot() creates an image from a container
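The snapshot()/clone() semantics can be modeled in miniature. The class below is a hypothetical stand-in, not the real Tintri API; it shows how a clone shares blocks with its snapshot until a block is written (block-granularity COW):

```python
# Toy model of VMstore-style snapshot()/clone() (hypothetical class, not the
# real Tintri API). A clone copies no data; blocks diverge only when written.

class ToyVMstore:
    def __init__(self):
        self.snapshots = {}        # snapshot_id -> immutable block tuple
        self.files = {}            # nfs_path -> mutable block list
        self.next_id = 0

    def snapshot(self, nfs_path):
        sid = self.next_id
        self.next_id += 1
        self.snapshots[sid] = tuple(self.files[nfs_path])  # read-only copy
        return sid

    def clone(self, snapshot_id, nfs_path):
        # New r/w file; blocks are logically shared until first write.
        self.files[nfs_path] = list(self.snapshots[snapshot_id])

    def write_block(self, nfs_path, i, data):
        self.files[nfs_path][i] = data   # COW: only this block diverges

store = ToyVMstore()
store.files["containerA"] = ["b0", "b1"]
sid = store.snapshot("containerA")       # push: image becomes a snapshot ID
store.clone(sid, "containerB")           # pull+run: provision from snapshot
store.write_block("containerB", 1, "b1'")
print(store.files["containerB"])         # ['b0', "b1'"]
print(store.snapshots[sid])              # ('b0', 'b1') -- unchanged
```

Because push is just snapshot() and pull is just clone(), neither moves image data over the network; data transfers only lazily, on read.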
Snapshot and Clone worker A container NFS file Tintri VMstore
Snapshot and Clone worker A snapshot NFS file Tintri VMstore Worker A: push
Snapshot and Clone worker A snapshot NFS file COW snap N Tintri VMstore Worker A: push
Snapshot and Clone worker A N NFS file snap N Tintri VMstore Worker A: push
Snapshot and Clone worker A N registry NFS file snap N Tintri VMstore Worker A: push
Snapshot and Clone worker A registry img N NFS file snap N Tintri VMstore Worker A: push
Snapshot and Clone Note: registry is only a name server; it maps layer metadata to a snapshot ID worker A registry img N NFS file snap N Tintri VMstore
Snapshot and Clone worker A registry img N NFS file snap N Tintri VMstore
Snapshot and Clone worker A registry worker B img N NFS file snap N Tintri VMstore Worker B: pull and run
Snapshot and Clone worker A registry worker B img N N NFS file snap N Tintri VMstore Worker B: pull and run
Snapshot and Clone worker A registry worker B img N clone N NFS file snap N Tintri VMstore Worker B: pull and run
Snapshot and Clone worker A registry worker B img N clone N NFS file snap N COW NFS file Tintri VMstore Worker B: pull and run
Snapshot and Clone worker A registry worker B img N NFS file snap N NFS file Tintri VMstore Worker B: pull and run
Snapshot and Clone worker A registry worker B img container NFS file snap N NFS file Tintri VMstore Worker B: pull and run
Slacker Driver Goals make push+pull very fast utilize powerful primitives of a modern storage server (Tintri VMstore) create drop-in replacement; don t change Docker framework itself Design lazy pull layer flattening cache sharing
Slacker Flattening File Namespace Level flatten layers if B is child of A, then copy A to B to start. Don't make B empty Block Level do COW+dedup beneath NFS files, inside VMstore namespace copy-on-write dir dir dir dir ext4 ext4 ext4 ext4 ext4 NFS NFS NFS NFS copy-on-write block AUFS Slacker
Slacker Flattening File Namespace Level flatten layers if B is child of A, then copy A to B to start. Don't make B empty Block Level do COW+dedup beneath NFS files, inside VMstore namespace copy-on-write A B C D ext4 A ext4 ext4 ext4 AB ABC ABCD ext4 copy-on-write block AUFS Slacker
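Flattening means each Slacker layer is a complete file system, not a delta. A minimal sketch, with dicts standing in for ext4 images and a plain copy standing in for the (free, COW-backed) VMstore clone:

```python
# Sketch of Slacker's layer flattening: a child layer starts as a COPY of its
# parent (in Slacker this copy is a VMstore clone, so no blocks actually
# move), then adds its own files. The top layer alone is a complete FS.

def make_child_layer(parent, new_files):
    child = dict(parent)       # stands in for clone(snapshot(parent))
    child.update(new_files)
    return child

A   = {"/lib/libc": "c"}
AB  = make_child_layer(A,  {"/usr/sbin/apache2": "a"})
ABC = make_child_layer(AB, {"/usr/sbin/mysqld": "m"})

# No union needed at runtime -- the top layer already contains everything:
print(sorted(ABC))   # ['/lib/libc', '/usr/sbin/apache2', '/usr/sbin/mysqld']
```

Block-level COW and dedup beneath the NFS files keep this from exploding storage, since unchanged blocks are shared across A, AB, and ABC.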
Challenge: Framework Assumptions Assumed Layout Actual Layout D A B C D runnable Layers B C runnable Layers A B C A B A A
Challenge: Framework Assumptions Assumed Layout D Actual Layout A B C D Layers B C pull Layers A B C A B pull A A
Challenge: Framework Assumptions Strategy: lazy cloning. Don't clone non-top layers until Docker tries to mount them. Assumed Layout D Actual Layout A B C D Layers B C Layers A B C A B optimize A A
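Lazy cloning can be sketched as deferring each clone to first mount. The names below are hypothetical (Docker's driver interface and VMstore are only stubbed):

```python
# Sketch of lazy cloning (hypothetical names). Docker's framework assumes
# every layer is independently mountable, so a naive driver would clone an
# NFS file per layer at pull time. Slacker defers each clone until Docker
# actually mounts that layer -- usually only the topmost one.

class FakeStore:
    """Stand-in for VMstore; counts how many clones really happen."""
    def __init__(self):
        self.clones = 0

    def clone(self, snapshot_id):
        self.clones += 1
        return f"nfs-file-for-{snapshot_id}"

class LazyLayer:
    def __init__(self, snapshot_id):
        self.snapshot_id = snapshot_id
        self.nfs_file = None               # no clone yet

    def mount(self, store):
        if self.nfs_file is None:          # clone only on first mount
            self.nfs_file = store.clone(self.snapshot_id)
        return self.nfs_file

store = FakeStore()
layers = [LazyLayer(s) for s in ("A", "AB", "ABC", "ABCD")]  # a pulled image
layers[-1].mount(store)                    # Docker runs the top layer
print(store.clones)                        # 1 clone, not 4
```

This keeps Slacker a drop-in replacement: the framework still sees four mountable layers, but only one clone is ever materialized in the common case.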
Slacker Driver Goals make push+pull very fast utilize powerful primitives of a modern storage server (Tintri VMstore) create drop-in replacement; don t change Docker framework itself Design lazy pull layer flattening cache sharing
Challenge: Cache Sharing NFS Client: cache: storage for 2 containers started from same image ABC image ABX ABY
Challenge: Cache Sharing NFS Client: cache: ABX ABY
Challenge: Cache Sharing NFS Client: cache: read ABX ABY
Challenge: Cache Sharing NFS Client: cache: AABX ABY
Challenge: Cache Sharing NFS Client: cache: A ABX ABY
Challenge: Cache Sharing NFS Client: cache: A read ABX ABY
Challenge: Cache Sharing NFS Client: cache: A ABX ABY A
Challenge: Cache Sharing NFS Client: cache: A A ABX ABY
Challenge: Cache Sharing NFS Client: cache: A A Challenge: how to avoid space and I/O waste? ABX ABY
Challenge: Cache Sharing NFS Client: cache: 001 001 A A Strategy: track differences and deduplicate I/O (more in paper) ABX ABY
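The differencing idea can be sketched as follows. The structure is illustrative (hypothetical names, not Slacker's actual kernel changes): each container tracks which blocks it has modified, and reads of unmodified blocks are keyed to the shared image, so one cached copy serves every container from that image:

```python
# Sketch of cache sharing via differencing (illustrative, not the real
# kernel/NFS extensions). Each container records only the blocks it has
# modified; reads of unmodified blocks resolve to the SHARED image file,
# so the NFS client cache holds one copy instead of one per container.

class Container:
    def __init__(self, image):
        self.image = image
        self.own = {}                  # modified blocks only (the "diff")

    def cache_key(self, block):
        if block in self.own:          # diverged: private cache entry
            return (id(self.own), block)
        return (id(self.image), block) # unmodified: shared cache entry

image = {"A": "aaa"}
c1, c2 = Container(image), Container(image)
print(c1.cache_key("A") == c2.cache_key("A"))   # True: one cached copy of A

c2.own["A"] = "a'"                              # c2 writes; block diverges
print(c1.cache_key("A") == c2.cache_key("A"))   # False: keys split
```

Tracking differences this way avoids both the space waste (two cached copies of A) and the I/O waste (fetching A twice) shown in the slides above.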
Slacker Outline Background Container Workloads Default Driver: AUFS Our Driver: Slacker Evaluation Conclusion
Questions What are deployment and development speedups? How is long-term performance?
HelloBench Performance deployment: pull+run development: push+pull+run
Questions What are deployment and development speedups? 5x and 20x faster respectively (median speedup) How is long-term performance?
Server Benchmarks Databases and Web Servers PostgreSQL Redis Apache web server (static) io.js Javascript server (dynamic) Experiment measure throughput (after startup) run 5 minutes
Server Benchmarks Databases and Web Servers PostgreSQL Redis Apache web server (static) io.js Javascript server (dynamic) Experiment measure throughput (after startup) run 5 minutes Result: Slacker is always at least as fast as AUFS
Questions What are deployment and development speedups? 5x and 20x faster respectively (median speedup) How is long-term performance? there is no long-term penalty for being lazy
Slacker Outline Background Container Workloads Default Driver: AUFS Our Driver: Slacker Evaluation Conclusion
Conclusion Containers are inherently lightweight but existing frameworks are not COW between workers is necessary for fast startup use shared storage utilize VMstore snapshot and clone Slacker driver 5x deployment speedup 20x development speedup HelloBench: https://github.com/tintri/hello-bench