The Elasticity and Plasticity in Semi-Containerized Colocating Cloud Workload: a view from Alibaba Trace

Size: px

Start display at page:

Download "The Elasticity and Plasticity in Semi-Containerized Colocating Cloud Workload: a view from Alibaba Trace"

Paula King
5 years ago
Views:

1 The Elasticity and Plasticity in Semi-Containerized Colocating Cloud Workload: a view from Alibaba Trace Qixiao Liu* and Zhibin Yu Shenzhen Institute of Advanced Technology Chinese Academy of 2018, Carlsbad, CA, U.S.A 1

2 Introduction Challenges in the cloud computing Low resource utilization Tail latency IRU-QoS dilemma Task scheduling, resource management, programming diagram, etc. Traces from industrial production environment The Google trace released in m machines, 670k jobs (mixed workload), 29 days. Alibaba released in k machines, 23k jobs (also mixed workload), in 1 day. Fraction External 2

3 Google trace vs. Alibaba trace Google trace 1. Server heterogeneity 2. Priority information 3. Server failure Alibaba trace 1. All servers are equipped with 64 CPUs, >99% of servers: same memory and disk capacities. 2. No priority information 3. Negligible server failures 4. Mixed workload (production and non-production), but are equal as jobs 4. Online services and batch jobs are traced separately More elaborative views into the co-location 3

4 Elasticity resist a distorting influence and to return to its original size and shape when that influence or force is removed * Plasticity non-reversible changes of shape in response to applied forces * * Elasticity, plasticity (physics), wikipedia 4

5 Resource 100% Elastic computing Resource provision Resource need 0 5

6 Resource 100% Elastic computing Resource provision Resource need 0 6

7 100% Elastic computing in co-location Resource Task eviction Resource provision Resource need Resource used by batch tasks 0 Resource utilization Batch job performance? 7

8 Ideal elastic computing in co-location Resource 100% plus plasticity Resource provision Resource need Resource used by batch tasks 0 8

9 Outline Trace overview Shape the workload Statistics analysis of containers and batch jobs Co-location analysis Discussion Conclusion 9

10 Alibaba cluster management architecture Mixed workload Online services Batch jobs Mixed entities (semi-containerization) Container Tasks Mixed architecture Concurrent schedulers: Sigma and Fuxi Level0-controller Novelty? technical legacy It is in production. 10

11 Trace structure Static information: ID, Machine ID; Req CPU/mem/disk; Create time; Allocated CPU ID. Runtime statistics: Container ID; Sample point; Used CPU/mem/disk; CPU load, CPI, mem miss. Sampled every 5min Containers Static information: Machine ID; Config. CPU/mem/Disk; Event and time. Servers Batch jobs Runtime statistics: Machine ID; Sample point; Used CPU/mem/disk; CPU load, CPI, Static information: Jobid/taskid; Create/end time; Req CPU/memory; Status, #instance. Runtime statistics: Jobid/taskid; Instance start/end time; Status; Sequence; Used CPU/memory Sampled every 5min Execution based 11

12 Outline Trace overview Shape the workload Statistics analysis of containers and batch jobs Co-location analysis Discussion Conclusion 12

13 Container containers, each runs one online service, for 24 hours Container requests CPUs, memory and disk Req. #CPU: 1, 4, 6, 8, 16 Memory capacity (normalized): to Disk capacity (normalized): 3e-11 to <CPU, memory, disk> patterns for all containers, 19 are valid. Requested resource over server capacity (ROC): (Resource_req/Server capacity)*100% CPU Memory Disk ROC 9.5% 10.9% 4.9% ROC SD 4.4% 8.8% 2.1% 13

14 Batch job Batch job structure: job, task, instance Job->task: DAG Task->instance: same computing logic, resource request Batch job 12.9k DAG Batch task Batch task 80k Same computing logic Batch instance Batch instance 11.9m 14

15 Batch job Batch task requests CPU and memory #CPU: 0.45 to 8 (14 values in total, 0.05 basic unit) Memory: to (750 values) 989 <CPU, memory> patterns in 80k tasks CPU Memory ROC 0.8% 0.9% ROC SD 0.5% 0.7% Batch instance status: Failed, interrupted, ready, running, terminated, wait Failed/interrupted rate are 1.5% Google trace: half submissions are resubmissions 15

16 Outline Trace overview Shape the workload Runtime statistics analysis of containers and batch jobs Co-location analysis Discussion Conclusion 16

17 Container resource utilization Reserved but not used Avg. used max used Mem A B C Disk A B CPU Memory Disk C Resource overprovisioning Max vs. average resource used steady memory and disk utilization, but CPU varies significantly 17

18 Container performance L/H: low /high CPU utilization L H L H 0 L H L H 4 CPU 8 CPU 4 CPU 8 CPU CPI CPU load - Containers are guaranteed resources when load rises; - Higher load increases resource utilization, but not hurting the performance. 18

its actual used resources is greater than that it

19 Batch instance resource utilization CPU utilization Memory utilization Resource overcommit, the amount of its actual used resources is greater than that it requested at submission. Both CPU and memory overcommit. 19

20 Incremental resource allocation in Fuxi FUXI, VLDB

21 Incremental resource allocation in Fuxi Local queue in node FUXI, VLDB 2014 Resource request: Initial resource request (low) Actual (peak) request (high) Start to run a batch instance with its initial resource request, increase its allocation when more resources become available. Batch instance with lower resource request has a better chance get to run 21

22 Start Delay per task(s) Cluster wide resource allocation efficiency #servers: #instances per task Instances from the same task, get scheduled at the same time Start delay: SD = start_time i Ref_start_time T 22

23 Start Delay (s) Cluster wide resource allocation efficiency Mean SD Max SD #instances per task The latest one most likely delay the result delivery of the task/job 23

24 Cluster-wide batch instance performance 100% 80% 60% 40% 20% 0% The mean NIL Normalized instance latency: NIL = Execution_time i /Ref_time T most tasks have their avg NIL below 3. 24

25 NIL distribution Cluster-wide batch instance performance 100% 80% 60% 40% 20% 0% Mean NIL Max NIL the max NIL of few tasks deviate from the average. 25

26 Outline Trace overview Shape the workload Statistics analysis of containers and batch jobs Co-location analysis Discussion Conclusion 26

27 #Servers Container deployment CPU Memory Disk Containers reserve resources - 0~64 CPUs; - 0~150% memory; overbooking - 0~60% disk. - Containers are deployed using different policies; - CPU remains the main constraint. 27

28 Batch instance scheduling in the co-location #batch instances BO CA CB CC Instance execution time (hours) BO CA CB CC BO: Batch instance only servers CA, CB, CC: low, medium, full resources reserved by containers No obvious difference to schedule a batch instance in the cluster: - Similar accumulated instance execution time on all servers, although BO has more batch instances running. 28

29 Resource allocate to batch instances Max #CPU used by containers on each server (sampled every 5 min) Max #CPU used by batch a instance on each server (max CPUs during execution) Max #CPU allowed for batch instances to use on servers does not depend on the #CPU used by containers. 29

30 Resource used Resource utilization in the cluster 70% 60% 50% 40% 30% 20% 10% 0% CPU Memory Disk Container BO CA CB CC 30

31 Outline Trace overview Shape the workload Statistics analysis of containers and batch jobs Co-location analysis Discussion Conclusion 31

32 Discussion Elasticity Resource overprovisioning (containers). Resource overcommitment (batch instances). Resource overbooking. Plasticity Very low task eviction rate in the cluster (1.5%). Accumulated batch instance execution time on most servers is similar. SD increases radically when a task owns more than 1000 instances (there are 1313 servers). No obvious difference between the maximum allowed #CPU for batch instance to use on most servers 32

33 Outline Trace overview Shape the workload Statistics analysis of containers and batch jobs Co-location analysis Discussion Conclusion 33

34 Conclusion Alibaba presents a trace, using semi-containerized cluster management Concurrent traces for online services and batch jobs allow more elaborative characterization of the mixed workload Elasticity and plasticity in the cluster management promoted the batch job performance. Thanks for attention!! 34

Preemptive, Low Latency Datacenter Scheduling via Lightweight Virtualization

Preemptive, Low Latency Datacenter Scheduling via Lightweight Virtualization Wei Chen, Jia Rao*, and Xiaobo Zhou University of Colorado, Colorado Springs * University of Texas at Arlington Data Center