Efficiently Scheduling Jobs to Asymmetric Multi-core Processors in Context of Performance and Thermal Budget using Autonomic Techniques



1 Efficiently Scheduling Jobs to Asymmetric Multi-core Processors in Context of Performance and Thermal Budget using Autonomic Techniques
CS 788: Fall 2015
Term Paper 2 Presentation by Abhishek Roy

2 Background
Moore's law: a single extremely power-hungry, extremely hot processing core leads to multiple cores
Multiple homogeneous cores
Why asymmetric cores?
  Overall speedup
  Dark silicon problem
How to use them most efficiently?
  Mathematically model the performance and power consumption of the various asymmetric cores
  Hierarchical power management: feedback-based controller
  Job-arrival-rate-aware scheduler that minimizes average service time
  Stay within the thermal budget and the power budget


3 Asymmetric Multicore
Multiple cores are clustered
Same type of cores within a particular cluster
Different types of cores in different clusters
Our setting uses a 3-cluster chip
Due to the power budget, only one cluster can be active at any given time
The scheduler needs to choose a cluster that satisfies the power budget

4 The Problem
Data center environment
Lots of servers, each with multiple clusters and cores
Cluster c has N_c cores
Job arrival rate varies over time (example: Wikipedia)
Each job has a Degree of Parallelism (DoP)
Goal: minimize mean service time while staying within the power budget
The scheduler decides
  Which cluster to use
  How many jobs to execute in parallel (J)
  The DoP of each job (D)
At any point, J × D = N_c
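The constraint J × D = N_c restricts the scheduler to a small set of configurations per cluster. A minimal sketch that enumerates them, assuming D must divide N_c evenly (the function name `feasible_configs` is illustrative and not from the slides):

```python
def feasible_configs(n_cores: int) -> list[tuple[int, int]]:
    """Return every (J, D) pair with J * D == n_cores.

    J is the number of jobs run in parallel and D is the degree of
    parallelism of each job. Assumes D must divide the core count.
    """
    return [(n_cores // d, d) for d in range(1, n_cores + 1) if n_cores % d == 0]


if __name__ == "__main__":
    # A 16-core cluster admits five configurations:
    # [(16, 1), (8, 2), (4, 4), (2, 8), (1, 16)]
    print(feasible_configs(16))
```

Each pair trades per-job speed (large D) against the number of jobs served concurrently (large J).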


5 Queueing Theoretic Model
We want to calculate the mean total service time on a multi-core processor as a function of the job arrival rate (λ) and the DoP (D) of each job.
Job arrivals follow a Poisson process
Job sizes are exponentially distributed
N_c homogeneous cores in a single cluster
This gives an M/M/n queue, where n = J = N_c / D
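For an M/M/n queue, the mean time a job spends in the system (waiting plus service) has a closed form based on the standard Erlang C formula. The sketch below is one way to evaluate it. The per-server service rate `mu` and the function names are assumptions for illustration, since the slides do not give the exact expression used.

```python
from math import factorial


def erlang_c(n: int, a: float) -> float:
    """Probability that an arriving job must wait in an M/M/n queue.

    a = lam / mu is the offered load; the queue is stable only if a < n.
    """
    rho = a / n
    if rho >= 1:
        raise ValueError("unstable queue: offered load must be below n")
    tail = a**n / factorial(n) / (1 - rho)
    head = sum(a**k / factorial(k) for k in range(n))
    return tail / (head + tail)


def mean_response_time(lam: float, mu: float, n: int) -> float:
    """Mean waiting time plus mean service time for an M/M/n queue."""
    wait = erlang_c(n, lam / mu) / (n * mu - lam)
    return wait + 1 / mu


if __name__ == "__main__":
    # n = J = N_c / D servers, each serving one job at rate mu.
    print(mean_response_time(lam=1.0, mu=1.0, n=2))
```

With n = 1 this reduces to the familiar M/M/1 result 1 / (mu - lam).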

6 Queueing Theoretic Model
Amdahl's law (S denotes the serial fraction of a job)
  When S = 0: D* = N_c and J* = 1
  When S > 0 and λ → 0: D* = N_c and J* = 1
  When S > 0: λ_max can be sustained when D* = 1 and J* = N_c
Cluster migrations
  For high arrival rate
  For low arrival rate
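The three cases above can be checked numerically by combining Amdahl's law with the M/M/n model. The sketch below assumes a job with DoP D is served at rate mu1 × speedup(D), where mu1 is a hypothetical single-core service rate, and searches the divisors of N_c for the DoP that minimizes mean response time. All names are illustrative.

```python
from math import factorial, inf


def amdahl_speedup(s: float, d: int) -> float:
    """Amdahl's law: speedup on d cores for serial fraction s."""
    return 1.0 / (s + (1.0 - s) / d)


def mmn_response(lam: float, mu: float, n: int) -> float:
    """Mean response time of an M/M/n queue; inf if the queue is unstable."""
    a = lam / mu
    if a >= n:
        return inf
    tail = a**n / factorial(n) / (1 - a / n)
    head = sum(a**k / factorial(k) for k in range(n))
    return tail / (head + tail) / (n * mu - lam) + 1 / mu


def best_dop(lam: float, mu1: float, s: float, n_cores: int) -> tuple[int, float]:
    """Return (D*, mean response time) over all D that divide n_cores."""
    best = None
    for d in range(1, n_cores + 1):
        if n_cores % d:
            continue
        t = mmn_response(lam, mu1 * amdahl_speedup(s, d), n_cores // d)
        if best is None or t < best[1]:
            best = (d, t)
    return best


if __name__ == "__main__":
    print(best_dop(lam=8.0, mu1=1.0, s=0.0, n_cores=16))    # S = 0
    print(best_dop(lam=0.001, mu1=1.0, s=0.1, n_cores=16))  # S > 0, low load
    print(best_dop(lam=15.5, mu1=1.0, s=0.1, n_cores=16))   # S > 0, near peak load
```

Under these assumptions the first two calls select D = 16 (one job spread over all cores), while the last selects D = 1, because any higher DoP loses throughput to the serial fraction and the queue becomes unstable.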

7 Autonomic Run-time Scheduler
Monitors the job arrival rate and makes two decisions
  1) Which cluster to migrate to
  2) The optimal DoP for each job
Given a cluster type and a DoP, the service rate of the cluster is known in advance, assuming that the cluster is fully occupied
Set the number of jobs to execute in parallel: J = N_c / D
Compute the average service time for each cluster and choose the cluster with the lowest one
Problems
  Arrival time prediction
  Cluster migration mechanism
  Cluster migration overhead: use a threshold of 10%
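The decision loop above can be sketched as follows. This is not the presenter's implementation: the cluster table, the per-core service rates, and the serial fraction are hypothetical values, and the 10% threshold is interpreted here as "migrate only if the best cluster's predicted mean service time is more than 10% lower than the current cluster's".

```python
from math import factorial, inf

# Hypothetical clusters: name -> (core count, single-core service rate).
CLUSTERS = {"S": (64, 1.0), "M": (32, 1.6), "L": (16, 2.5)}
SERIAL_FRACTION = 0.05  # assumed Amdahl serial fraction


def mmn_response(lam: float, mu: float, n: int) -> float:
    """Mean response time of an M/M/n queue; inf if unstable."""
    a = lam / mu
    if a >= n:
        return inf
    tail = a**n / factorial(n) / (1 - a / n)
    head = sum(a**k / factorial(k) for k in range(n))
    return tail / (head + tail) / (n * mu - lam) + 1 / mu


def best_for_cluster(lam: float, n_cores: int, mu1: float) -> tuple[float, int]:
    """Return (lowest mean response time, DoP achieving it) for one cluster."""
    options = []
    for d in range(1, n_cores + 1):
        if n_cores % d == 0:
            speedup = 1.0 / (SERIAL_FRACTION + (1.0 - SERIAL_FRACTION) / d)
            options.append((mmn_response(lam, mu1 * speedup, n_cores // d), d))
    return min(options)


def schedule(lam: float, current: str, threshold: float = 0.10) -> tuple[str, int, int]:
    """Return (cluster, D, J) for the observed arrival rate lam."""
    plans = {name: best_for_cluster(lam, n, mu1) for name, (n, mu1) in CLUSTERS.items()}
    best = min(plans, key=lambda name: plans[name][0])
    # Migrate only when the predicted gain exceeds the overhead threshold.
    if best != current and plans[best][0] < (1 - threshold) * plans[current][0]:
        current = best
    d = plans[current][1]
    return current, d, CLUSTERS[current][0] // d


if __name__ == "__main__":
    print(schedule(lam=0.01, current="S"))  # light load
    print(schedule(lam=60.0, current="L"))  # heavy load
```

With these made-up rates, a light load favors the large-core cluster at full DoP, while a heavy load forces a move to the small-core cluster, which is the only one with enough aggregate throughput.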

8 Experimental Setup
Asymmetric multicore architecture: Small (S), Medium (M) and Large (L) clusters accommodate 64, 32 and 16 cores, respectively
Sniper multi-core simulator with the SPLASH-2, PARSEC and Phoenix benchmarks
Collect execution time data from Sniper
Use a Python-based discrete-event simulation engine built on SimPy, called Sniper+DES
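To illustrate the kind of queue simulation such an engine performs, here is a minimal event-driven sketch of an M/M/n queue. It deliberately uses only the Python standard library rather than SimPy so that it runs without extra packages, and it draws service times from an exponential distribution instead of using measured Sniper execution times. It is a stand-in under those assumptions, not a reconstruction of Sniper+DES.

```python
import heapq
import random


def simulate_mmn(lam: float, mu: float, n: int, jobs: int = 200_000, seed: int = 1) -> float:
    """Estimate the mean response time of an FCFS M/M/n queue by simulation.

    lam: Poisson arrival rate, mu: per-server service rate, n: servers.
    """
    rng = random.Random(seed)
    free_at = [0.0] * n  # min-heap of times at which each server becomes free
    clock = 0.0
    total = 0.0
    for _ in range(jobs):
        clock += rng.expovariate(lam)                  # next arrival
        start = max(clock, heapq.heappop(free_at))     # wait for earliest free server
        finish = start + rng.expovariate(mu)           # exponential job size
        heapq.heappush(free_at, finish)
        total += finish - clock                        # waiting time plus service time
    return total / jobs


if __name__ == "__main__":
    # Compare against the analytic M/M/1 value 1 / (mu - lam) = 2.0.
    print(simulate_mmn(lam=0.5, mu=1.0, n=1))
```

Comparing the simulated mean against the analytic M/M/n value is a quick sanity check before replacing the exponential service times with measured ones.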

9-12 Results


14 Conclusion
Arrival rate prediction can be done more accurately
Experimental results in this research area are mainly simulation-based
Practical systems with many clusters are still very rare
ARM's big.LITTLE is one that can be used in practice, but it has only two clusters
